Intelligent Distributed Computing XIII [1st ed. 2020] 978-3-030-32257-1, 978-3-030-32258-8

This book gathers research contributions on recent advances in intelligent and distributed computing. A major focus is placed on new models, techniques, and applications for intelligent distributed and high-performance architectures, ephemeral and unreliable computing, intelligent distributed knowledge representation and processing, networked intelligence, distributed swarm robotics systems, and nature-inspired methods for data science and machine learning.


English · Pages: XXV, 555 [566] · Year: 2020


Table of contents:
Front Matter ....Pages i-xxv
Front Matter ....Pages 1-1
Using Blockchain for Reputation-Based Cooperation in Federated IoT Domains (Giancarlo Fortino, Fabrizio Messina, Domenico Rosaci, Giuseppe M. L. Sarné)....Pages 3-12
Design of Fail-Safe Quadrocopter Configuration (Oleg V. Baranov, Nikolay V. Smirnov, Tatiana E. Smirnova, Yefim V. Zholobov)....Pages 13-22
CAAVI-RICS Model for Analyzing the Security of Fog Computing Systems (Saša Pešić, Miloš Radovanović, Mirjana Ivanović, Costin Badica, Milenko Tošić, Ognjen Iković et al.)....Pages 23-34
Conceptual Model of Digital Platform for Enterprises of Industry 5.0 (Vladimir Gorodetsky, Vladimir Larukchin, Petr Skobelev)....Pages 35-40
Conceptual Data Modeling Using Aggregates to Ensure Large-Scale Distributed Data Management Systems Security (Maria A. Poltavtseva, Maxim O. Kalinin)....Pages 41-47
Smart Topic Sharing in IoT Platform Based on a Social Inspired Broker (Vincenza Carchiolo, Alessandro Longheu, Michele Malgeri, Giuseppe Mangioni)....Pages 48-55
Easy Development of Software for IoT Systems (Ichiro Satoh)....Pages 56-61
Front Matter ....Pages 63-63
Privacy-Preserving LDA Classification over Horizontally Distributed Data (Fatemeh Khodaparast, Mina Sheikhalishahi, Hassan Haghighi, Fabio Martinelli)....Pages 65-74
Improving Parallel Data Mining for Different Data Distributions in IoT Systems (Ivan Kholod, Andrey Shorov, Sergei Gorlatch)....Pages 75-85
An Experiment on Automated Requirements Mapping Using Deep Learning Methods (Felix Petcuşin, Liana Stănescu, Costin Bădică)....Pages 86-95
Using the Doc2Vec Algorithm to Detect Semantically Similar Jira Issues in the Process of Resolving Customer Requests (Artem Kovalev, Nikita Voinov, Igor Nikiforov)....Pages 96-101
evoRF: An Evolutionary Approach to Random Forests (Diogo Ramos, Davide Carneiro, Paulo Novais)....Pages 102-107
The Method of Fuzzy Logic and Data Mining for Monitoring Troposphere Parameters Using Ground-Based Radiometric Complex (S. I. Ivanov, G. N. Ilin)....Pages 108-114
Front Matter ....Pages 115-115
Actor-Network Approach to Self-organisation in Global Logistics Networks (Yury Iskanderov, Mikhail Pautov)....Pages 117-127
Multi-agent System for Simulation of Response to Supply Chain Disruptions (Jing Tan, Rongjun Xu, Kai Chen, Lars Braubach, Kai Jander, Alexander Pokahr)....Pages 128-139
Data Warehouse Design for Security Applications Using Distributed Ontology-Based Knowledge Representation (Maria A. Butakova, Andrey V. Chernov, Ilias K. Savvas, Georgia Garani)....Pages 140-145
Front Matter ....Pages 147-147
Exploring the Space of Block Structured Scheduling Processes Using Constraint Logic Programming (Amelia Bădică, Costin Bădică, Mirjana Ivanović, Doina Logofătu)....Pages 149-159
Global and Private Job-Flow Scheduling Optimization in Grid Virtual Organizations (Victor Toporkov, Anna Toporkova, Dmitry Yemelyanov)....Pages 160-169
Type-Based Genetic Algorithms (Roman Sizov, Dan A. Simovici)....Pages 170-176
Distributed Construction of a Level Class Description in the Framework of Logic-Predicate Approach to AI Problems (Tatiana M. Kosovskaya)....Pages 177-182
On Approaches for Solving Nonlinear Optimal Control Problems (Alina V. Boiko, Nikolay V. Smirnov)....Pages 183-188
Front Matter ....Pages 189-189
Hierarchical Simulation of Onboard Networks (Valentin Olenev, Irina Lavrovskaya, Ilya Korobkov, Nikolay Sinyov, Yuriy Sheynin)....Pages 191-196
Strategies Comparison in Link Building Problem (Vincenza Carchiolo, Marco Grassia, Alessandro Longheu, Michele Malgeri, Giuseppe Mangioni)....Pages 197-202
Research of the Possibility of Hidden Embedding of a Digital Watermark Using Practical Methods of Channel Steganography (Pavel I. Sharikov, Andrey V. Krasov, Artem M. Gelfand, Nikita A. Kosov)....Pages 203-209
A Highly Scalable Index Structure for Multicore In-Memory Database Systems (Hitoshi Mitake, Hiroshi Yamada, Tatsuo Nakajima)....Pages 210-217
Applying the Split-Join Queuing System Model to Estimating the Efficiency of Detecting Contamination Content Process in Multimedia Objects Streams (Vladimir Lokhvickii, Yuri Ryzhikov, Andry Dudkin)....Pages 218-223
On the Applicability of the Modernized Method of Latent-Semantic Analysis to Identify Negative Content in Multimedia Objects (Sergey Krasnov, Vladimir Lokhvitckii, Andry Dudkin)....Pages 224-229
Front Matter ....Pages 231-231
Digital Subjects as New Power Actors: A Critical View on Political, Media-, and Digital Spaces Intersection (Dmitry Gavra, Vladislav Dekalov, Ksenia Naumenko)....Pages 233-243
An Approach to Creating an Intelligent System for Detecting and Countering Inappropriate Information on the Internet (Lidiya Vitkova, Igor Saenko, Olga Tushkanova)....Pages 244-254
A Note on Analysing the Attacker Aims Behind DDoS Attacks (Abhishta Abhishta, Marianne Junger, Reinoud Joosten, Lambert J. M. Nieuwenhuis)....Pages 255-265
Formation of the System of Signs of Potentially Harmful Multimedia Objects (Sergey Pilkevich, Konstantin Gnidko)....Pages 266-271
Soft Estimates for Social Engineering Attack Propagation Probabilities Depending on Interaction Rates Among Instagram Users (Anastasiia O. Khlobystova, Maxim V. Abramov, Alexander L. Tulupyev)....Pages 272-277
Development of the Complex Algorithm for Web Pages Classification to Detection Inappropriate Information on the Internet (Diana Gaifulina, Andrey Chechulin)....Pages 278-284
Approach to Identification and Analysis of Information Sources in Social Networks (Lidia Vitkova, Maxim Kolomeets)....Pages 285-293
The Architecture of Subsystem for Eliminating an Uncertainty in Assessment of Information Objects’ Semantic Content Based on the Methods of Incomplete, Inconsistent and Fuzzy Knowledge Processing (Igor Parashchuk, Elena Doynikova)....Pages 294-301
The Common Approach to Determination of the Destructive Information Impacts and Negative Personal Tendencies of Young Generation Using the Neural Network Methods for the Internet Content Processing (Alexander Branitskiy, Elena Doynikova, Igor Kotenko, Natalia Krasilnikova, Dmitriy Levshun, Artem Tishkov et al.)....Pages 302-310
Front Matter ....Pages 311-311
Authorize-then-Authenticate: Supporting Authorization Decisions Prior to Authentication in an Electronic Identity Infrastructure (Diana Berbecaru, Antonio Lioy, Cesare Cameroni)....Pages 313-322
Modeling and Evaluation of Battery Depletion Attacks on Unmanned Aerial Vehicles in Crisis Management Systems (Vasily Desnitsky, Nikolay Rudavin, Igor Kotenko)....Pages 323-332
The Integrated Model of Secure Cyber-Physical Systems for Their Design and Verification (Dmitry Levshun, Igor Kotenko, Andrey Chechulin)....Pages 333-343
Scalable Data Processing Approach and Anomaly Detection Method for User and Entity Behavior Analytics Platform (Alexey Lukashin, Mikhail Popov, Anatoliy Bolshakov, Yuri Nikolashin)....Pages 344-349
Approach to Detection of Denial-of-Sleep Attacks in Wireless Sensor Networks on the Base of Machine Learning (Anastasia Balueva, Vasily Desnitsky, Igor Ushakov)....Pages 350-355
Model of Smart Manufacturing System (Maria Usova, Sergey Chuprov, Ilya Viksnin, Ruslan Gataullin, Antonina Komarova, Andrey Iuganson)....Pages 356-362
Front Matter ....Pages 363-363
Technology Resolution Criterion of Uncertainty in Intelligent Distributed Decision Support Systems (Alexsander N. Pavlov, Dmitry A. Pavlov, Valerii V. Zakharov)....Pages 365-373
Satellite Constellation Control Based on Inter-Satellite Information Interaction (Oleg Karsaev, Evgeniy Minakov)....Pages 374-384
Load Balancing Cloud Computing with Web-Interface Using Multi-channel Queuing Systems with Warming up and Cooling (Maad M. Khalill, Anatoly D. Khomonenko, Sergey I. Gindin)....Pages 385-393
Application of Cyber-Physical System and Real-Time Control Construction Algorithm in Supply Chain Management Problem (Inna Trofimova, Boris Sokolov, Dmitry Nazarov, Semyon Potryasaev, Andrey Musaev, Vladimir Kalinin)....Pages 394-403
Method for Design of ‘smart’ Spacecraft Onboard Decision Making in Case of Limited Onboard Resources (Andrey Tyugashev, Sergei Orlov)....Pages 404-413
Intelligent Technologies and Systems for Spatial Industrial Strategic Planning (Elena Serova)....Pages 414-422
Conceptual and Formal Models of Information Technologies Use for Decisions Support in Technological Systems (Alexander S. Geyda)....Pages 423-429
Role and Future of Standards in Development of Intelligent and Dependable Control Software in Russian Space Industry (Andrey Tyugashev, Alexander Kovalev, Vjacheslav Pjatkov)....Pages 430-436
Improved Particle Swarm Medical Image Segmentation Algorithm for Decision Making (Samer El-Khatib, Yuri Skobtsov, Sergey Rodzin)....Pages 437-442
Collecting and Processing Distributed Data for Decision Support in Social Ecology (Dmitry Verzilin, Tatyana Maximova, Irina Sokolova)....Pages 443-448
Evaluation of the Dynamics of Phytomass in the Tundra Zone Using a Fuzzy-Opportunity Approach (V. V. Mikhailov, Alexandr V. Spesivtsev, Andrey Yu. Perevaryukha)....Pages 449-454
Front Matter ....Pages 455-455
Lower Limbs Exoskeleton Control System Based on Intelligent Human-Machine Interface (Ildar Kagirov, Alexey Karpov, Irina Kipyatkova, Konstantin Klyuzhev, Alexander Kudryavcev, Igor Kudryavcev et al.)....Pages 457-466
Central Audio-Library of the University of Novi Sad (Vlado Delić, Dragiša Mišković, Branislav Popović, Milan Sec̆ujski, Siniša Suzić, Tijana Delić et al.)....Pages 467-476
Applying Ensemble Learning Techniques and Neural Networks to Deceptive and Truthful Information Detection Task in the Flow of Speech (Alena Velichko, Viktor Budkov, Ildar Kagirov, Alexey Karpov)....Pages 477-482
Front Matter ....Pages 483-483
Model Checking to Detect the Hummingbad Malware (Fabio Martinelli, Francesco Mercaldo, Vittoria Nardone, Antonella Santone, Gigliola Vaglini)....Pages 485-494
ECU-Secure: Characteristic Functions for In-Vehicle Intrusion Detection (Yannick Chevalier, Roland Rieke, Florian Fenzl, Andrey Chechulin, Igor Kotenko)....Pages 495-504
Experimenting with Machine Learning in Automated Intrusion Response (Andre Lopes, Andrew Hutchison)....Pages 505-514
Method of Several Information Spaces for Identification of Anomalies (Alexander Grusho, Nick Grusho, Elena Timonina)....Pages 515-520
Gateway for Industrial Cyber-Physical Systems with Hardware-Based Trust Anchors (Diethelm Bienhaus, Lukas Jäger, Roland Rieke, Christoph Krauß)....Pages 521-528
Front Matter ....Pages 529-529
The Architecture of the System for Monitoring the Status in Patients with Parkinson’s Disease Using Mobile Technologies (Shichkina Yulia, Kataeva Galina, Irishina Yulia, Stanevich Elizaveta)....Pages 531-540
Approach to Association and Classification Rules Visualization (Yana Bekeneva, Vladimir Mochalov, Andrey Shorov)....Pages 541-546
The Visualization-Driven Approach to the Analysis of the HVAC Data (Evgenia Novikova, Mikhail Bestuzhev, Andrey Shorov)....Pages 547-552
Back Matter ....Pages 553-555


Studies in Computational Intelligence 868

Igor Kotenko · Costin Badica · Vasily Desnitsky · Didier El Baz · Mirjana Ivanovic (Editors)

Intelligent Distributed Computing XIII

Studies in Computational Intelligence Volume 868

Series Editor Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland

The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence—quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output. The books of this series are submitted to indexing to Web of Science, EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink.

More information about this series at http://www.springer.com/series/7092

Igor Kotenko · Costin Badica · Vasily Desnitsky · Didier El Baz · Mirjana Ivanovic

Editors

Intelligent Distributed Computing XIII

Editors

Igor Kotenko, SPIIRAS, St. Petersburg, Russia
Costin Badica, University of Craiova, Craiova, Romania
Vasily Desnitsky, SPIIRAS, St. Petersburg, Russia
Didier El Baz, LAAS - CNRS, Toulouse Cedex 4, France
Mirjana Ivanovic, University of Novi Sad, Novi Sad, Serbia

ISSN 1860-949X  ISSN 1860-9503 (electronic)
Studies in Computational Intelligence
ISBN 978-3-030-32257-1  ISBN 978-3-030-32258-8 (eBook)
https://doi.org/10.1007/978-3-030-32258-8

© Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Preface

Intelligent distributed computing appeared in the 1970s as an outcome of the exploitation of synergies between different research and industrial trends coming from the fields of Intelligent Systems and Distributed Computing. It is a stream directly derived from artificial intelligence, providing novel and significant intelligent solutions built upon the combination of models from this classical field with computational intelligence, distributed and multi-agent systems, and computer security.

This volume collects a range of research contributions presenting innovative advances in intelligent and distributed computing, comprising both architectural and algorithmic findings related to these fields. A major focus is placed on new models, techniques, and applications for intelligent distributed and high-performance architectures, ephemeral and unreliable computing, intelligent distributed knowledge representation and processing, networked intelligence, distributed swarm robotics systems, nature-inspired methods for data science and machine learning, and many others.

The book comprises the peer-reviewed proceedings of the 13th International Symposium on Intelligent Distributed Computing (IDC 2019), which was held in St. Petersburg, Russia, October 7–9, 2019. IDC 2019 continues the legacy of the prior IDC Symposium Series as an initiative of two research groups: the Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland, and the Software Engineering Department, University of Craiova, Craiova, Romania.

The IDC 2019 event comprised the following 11 sessions: (1) Internet of Things, Cloud Computing, and Big Data, (2) Data Analysis, Mining, and Machine Learning, (3) Multi-agent and Service-Based Distributed Systems, (4) Distributed Algorithms and Optimization, (5) Modeling Operational Processes for Intelligent Distributed Computing, (6) Advanced Methods for Social Network Analysis and Inappropriate Content Counteraction, (7) Intelligent Distributed Computing for Cyber-physical Security and Safety, (8) Intelligent Distributed Decision Support Systems, (9) Intelligent Human–Machine Interfaces, (10) Security for Intelligent Distributed Computing—Machine Learning vs. Chains of Trust, and (11) Visual Analytics in Distributed Environment.

The proceedings contain 28 regular and 36 short papers selected from 105 submissions received from 17 countries. Each submission was carefully reviewed by at least three members of the Program Committee. Acceptance and publication were judged based on the relevance to the conference topics, clarity of presentation, novelty, and accuracy of the contribution. The acceptance rate was 26.66% counting only regular papers, and 60.95% when also including short ones.

We would like to thank Janusz Kacprzyk, editor of the Studies in Computational Intelligence series and member of the Steering Committee, for his continuous support and encouragement of the development of the IDC Symposium Series. Also, we would like to thank the IDC 2019 Program Committee members for their work in promoting the event and refereeing submissions. Special thanks go to all colleagues who submitted their work to this event.

IDC 2019 enjoyed outstanding keynote speeches by distinguished invited speakers: Prof. Helen Karatza (Professor Emeritus, Department of Informatics, Aristotle University of Thessaloniki, Greece) and Prof. Vladimir Gorodetsky (Professor of Computer Science, InfoWings, Russia).

Finally, we acknowledge and appreciate the efforts of all the organizers from the Laboratory of Computer Security Problems of the St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences for collaborating in the organization of the conference. Special thanks also go to the people from ITMO University for hosting the event in such a beautiful place.

August 2019

Igor Kotenko Costin Badica Vasily Desnitsky Didier El Baz Mirjana Ivanovic

Organization

General Chairs

Igor Kotenko, SPIIRAS, ITMO University, Russian Federation
Costin Bădică, University of Craiova, Romania

Steering Committee

Janusz Kacprzyk, Polish Academy of Sciences, Poland
Costin Bădică, University of Craiova, Romania
David Camacho, Universidad Autonoma de Madrid, Spain
Paulo Novais, University of Minho, Portugal
Filip Zavoral, Charles University Prague, Czech Republic
Frances Brazier, Delft University of Technology, The Netherlands
George A. Papadopoulos, University of Cyprus, Cyprus
Giancarlo Fortino, University of Calabria, Italy
Kees Nieuwenhuis, Thales Research and Technology, The Netherlands
Marcin Paprzycki, Polish Academy of Sciences, Poland
Michele Malgeri, University of Catania, Italy
Mohammad Essaaidi, Abdelmalek Essaadi University in Tetuan, Morocco
Mirjana Ivanović, University of Novi Sad, Serbia
Amal El Fallah Seghrouchni, LIP6/University Pierre and Marie Curie, France
Javier Del Ser, University of the Basque Country (UPV-EHU), TECNALIA and BCAM, Spain


Technical Program Chairs

Vasily Desnitsky, SPIIRAS, ITMO University, Russian Federation
Didier El Baz, LAAS-CNRS, France
Mirjana Ivanović, University of Novi Sad, Serbia

Web and Publicity Chair

Andrey Chechulin, SPIIRAS, ITMO University, Russian Federation

Publications Chairs

David Camacho, Universidad Autonoma de Madrid, Spain
Giancarlo Fortino, University of Calabria, Italy
Ilsun You, Soonchunhyang University, Korea

Program Committee

Anatoliy Khomonenko, Petersburg Transport State University, Russian Federation
Adrian Groza, Technical University of Cluj-Napoca, Romania
Agostino Poggi, University of Parma, Italy
Albert Ali Salah, Utrecht University, The Netherlands
Alberto Fernandez, University Rey Juan Carlos, Spain
Alejandro Martín, Universidad Autonoma de Madrid, Spain
Alessandro Longheu, DIEEI - University of Catania, Italy
Alessandro Ricci, University of Bologna, Italy
Alexander Branitskiy, St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences (SPIIRAS), Russian Federation
Alexander Grusho, Lomonosov Moscow State University, Russian Federation
Alexander Ivanov, St. Petersburg University of the Ministry of Internal Affairs, Russian Federation
Alexander Pokahr, University of Hamburg, Germany
Alexander Tulupyev, SPIIRAS, Russian Federation
Alexey Bobtsov, ITMO University, Russian Federation
Alexey Karpov, SPIIRAS, Russian Federation
Alisa Vorobeva, ITMO University, Russian Federation
Amal El Fallah Seghrouchni, LIP6 - University of Pierre and Marie Curie, France
Amelia Bădică, University of Craiova, Romania
Amparo Alonso-Betanzos, University of A Coruña, Spain
Ana Garcia-Fornes, Universidad Politecnica de Valencia, Spain
Ana Madevska Bogdanova, FCSE, University Ss. Cyril and Methodius, North Macedonia
Anastasios Gounaris, Aristotle University of Thessaloniki, Greece
Andre de Carvalho, University of São Paulo, Brazil
Andre Rein, Fraunhofer, Germany
Andrea Omicini, Alma Mater Studiorum–Università di Bologna, Italy
Andrei Doncescu, LAAS, France
Andrei Sabelfeld, Chalmers University of Technology, Sweden
Andrew Hutchison, T-Systems Switzerland, Switzerland
Andrey Chechulin, SPIIRAS, ITMO University, Russian Federation
Andrey Chernov, Rostov State Transport University, Russian Federation
Andrey Fedorchenko, SPIIRAS, ITMO University, Russian Federation
Andrey Krasov, SPbSUT, Russian Federation
Andrey Privalov, St. Petersburg State University of Railways of Emperor Alexander I, Russian Federation
Andrey Shorov, Saint-Petersburg Electrotechnical University (LETI), Russian Federation
Andrey Tyugashev, Samara State Transport University, Russian Federation
Angel Panizo, Universidad Autonoma de Madrid, Spain
Ângelo Costa, University of Minho, Portugal
Anton Saveliev, SPIIRAS, Russian Federation
Antonio D. Masegosa, University of Deusto/IKERBASQUE, Spain
Apostolos N. Papadopoulos, Aristotle University of Thessaloniki, Greece
Artem Tishkov, Pavlov First Saint-Petersburg State Medical University, Russian Federation
Attila Kiss, Eötvös Loránd University, Hungary
Bastien Plazolles, GET-CNRS, France
Bela Stantic, Griffith University, Australia
Bertha Guijarro-Berdiñas, University of A Coruña, Spain
Bharat Chaudhari, Maharashtra Institute of Technology, India
Bigomokero Antoine Bagula, University of the Western Cape, South Africa
Bilal Fakih, LAAS-CNRS, France
Boris Sokolov, SPIIRAS, Russian Federation
Branislav Popović, Faculty of Technical Sciences, University of Novi Sad, Serbia
Cédric Herpson, LIP6, University Pierre and Marie Curie, France
Charalampos Bratsas, Aristotle University of Thessaloniki, Greece
Christoph Krauß, Fraunhofer, Germany
Constantin Zamfirescu, Lucian Blaga University of Sibiu, Romania
Costin Bădică, University of Craiova, Romania
Cristina Bianca Pop, Technical University of Cluj-Napoca, Romania
Cristina Onete, CASED (TU Darmstadt), Germany
Dan Selişteanu, University of Craiova, Romania
Dana Petcu, West University of Timisoara, Romania
Daniele D'Agostino, CNR-IMATI, Italy
Daniil Kocharov, Saint-Petersburg State University, Russian Federation
Dariusz Krol, Wrocław University of Science and Technology, Poland
David Bednárek, Charles University Prague, The Czech Republic
David Camacho, Universidad Autonoma de Madrid, Spain
Davide Carneiro, Polytechnic Institute of Porto, Portugal
Davide Grossi, University of Groningen, The Netherlands
Dennis Tatang, Ruhr-University Bochum, Germany
Diana Graţiela Berbecaru, Politecnico di Torino, Italy
Didier El Baz, LAAS/CNRS, France
Diethelm Bienhaus, University of Applied Sciences Mittelhessen, Germany
Dmitrii Fedotov, Ulm University, Germany
Dmitrii Gavra, St. Petersburg State University, Russian Federation
Dmitrii Verzilin, Lesgaft National State University of Physical Education, Sport and Health, St. Petersburg, Russian Federation
Dmitry Chalyy, Yaroslavl State University, Russian Federation
Dmitry Levshun, SPIIRAS, ITMO University, Russian Federation
Dmitry Novikov, Institute of Control Sciences, Russian Federation
Doina Bein, California State University, USA
Domenico Rosaci, University Mediterranea of Reggio Calabria, Italy
Dorian Cojocaru, University of Craiova, Romania
Dosam Hwang, Yeungnam University, South Korea
Drazen Brdjanin, University of Banja Luka, Bosnia and Herzegovina
Dumitru Dan Burdescu, University of Craiova, Romania
Dušan Gajić, University of Novi Sad, Serbia
Efstratios Kontopoulos, Information Technologies Institute, Greece
Eleftherios Tiakas, Aristotle University of Thessaloniki, Greece
Elena Doynikova, SPIIRAS, Russian Federation
Elena Serova, National Research University Higher School of Economics, St. Petersburg School of Economics and Management, Russian Federation
Elvira Popescu, University of Craiova, Romania
Eneko Osaba, Tecnalia Research & Innovation, Spain
Ester Martinez-Martin, Universidad de Alicante, Spain
Eugénio Oliveira, Faculdade de Engenharia Universidade do Porto, Portugal
Eva Onaindia, Universitat Politècnica de València, Spain
Evgenia Novikova, Saint-Petersburg Electrotechnical University "LETI" (ETU), Russian Federation
Fabio Martinelli, IIT-CNR, Italy
Fábio Silva, University of Minho, Portugal
Fabrizio Messina, University of Catania, Italy
Fernando Otero, University of Kent, UK
Filip Zavoral, Charles University Prague, The Czech Republic
Florin Leon, "Gheorghe Asachi" Technical University of Iasi, Romania
Florin Pop, University Politehnica of Bucharest/National Institute for Research and Development in Informatics (ICI), Bucharest, Romania
Frances Brazier, Delft University of Technology, The Netherlands
Friedhelm Schwenker, Ulm University, Germany
Galina Ilieva, The University of Plovdiv "Paisii Hilendarsky," Bulgaria
George Vouros, University of Piraeus, Greece
Georgia Koloniari, University of Macedonia, Greece
Georgia Kougka, Aristotle University of Thessaloniki, Greece
Georgios Meditskos, Aristotle University of Thessaloniki, Greece
Giancarlo Fortino, University of Calabria, Italy
Giandomenico Spezzano, CNR-ICAR and University of Calabria, Italy
Giuseppe Mangioni, University of Catania, Italy
Gleb Rogozinskiy, The Bonch-Bruevich Saint-Petersburg University of Telecommunications, Russian Federation
Goce Trajcevski, Iowa State University, USA
Gordan Jezic, University of Zagreb, Croatia
Goreti Marreiros, ISEP/IPP-GECAD, Portugal
Gregoire Danoy, University of Luxembourg, Luxembourg
Grzegorz J. Nalepa, AGH University of Science and Technology, Poland
Gustavo Gonzalez, Atos Spain, Spain
Henry Hexmoor, Southern Illinois University, USA
Hervé Debar, Télécom Sud Paris, France
Heysem Kaya, Namik Kemal University, Turkey
Hicham Lakhlef, University of Technology of Compiegne, France
Ichiro Satoh, National Institute of Informatics, Japan
Igor Khokhlov, Rochester Institute of Technology, USA
Igor Kotenko, SPIIRAS, ITMO University, Russian Federation
Igor Saenko, SPIIRAS, ITMO University, Russian Federation
Ilias Sakellariou, Department of Applied Informatics, University of Macedonia, Greece
Ilsun You, Soonchunhyang University, South Korea
Ilya Viksnin, ITMO University, Russian Federation
Ioanna Dionysiou, University of Nicosia, Greece
Irina Kipyatkova, SPIIRAS, Russian Federation
Ivan Kholod, Saint-Petersburg Electrotechnical University "LETI," Russian Federation
Ivan Lirkov, Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Bulgaria
Ivan Merelli, Institute for Biomedical Technologies, Italy
Izaskun Oregui, Tecnalia Research & Innovation, Spain
Iztok Fister Jr., University of Maribor, Slovenia
Jacek Rak, Gdansk University of Technology, Poland
Jakub Yaghob, Charles University in Prague, The Czech Republic
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Poland
Jason Jung, Chung-Ang University, South Korea
Jesús López, Tecnalia Research & Innovation, Spain
Jia Luo, LAAS DU CNRS, France
Johannes Fähndrich, Technische Universität Berlin/DAI Labor, Germany
Jørgen Villadsen, Technical University of Denmark, Denmark
Jörn Eichler, Freie Universität Berlin, Germany
Jose Carlos Castillo Montoya, Universidad Carlos III de Madrid, Spain
José Machado, University of Minho, Portugal
Juan Pavón, Universidad Complutense de Madrid, Spain
Julien Bourgeois, UBFC, FEMTO-ST Institute, CNRS, France
Kalliopi Kravari, Aristotle University of Thessaloniki, Greece
Konstantin Gnidko, Military Space Academy named after A. F. Mozhaysky, Russian Federation
Ksenia Naumenko, Reputation agency Glory Story, Russian Federation
Kuldar Taveter, Tallinn University of Technology, Estonia
Lars Braubach, University of Hamburg, Germany
Leon Reznik, Rochester Institute of Technology, USA
Leonid Gladkov, Southern Federal University, Russian Federation
Lev Stankevich, St. Petersburg Polytechnic University, Russian Federation
Liana Stănescu, University of Craiova, Romania
Lidia Vitkova, SPIIRAS, ITMO University, Russian Federation
Linara Adilova, Fraunhofer, Germany
Lucian Vinţan, "Lucian Blaga" University of Sibiu, Romania
Marcin Paprzycki, IBS PAN and WSM, Poland
Marco Danelutto, University of Pisa, Italy
Marek Hruz, University of West Bohemia, The Czech Republic
Marin Lujak, IMT Lille Douai, France
Marina De Vos, University of Bath, UK
Marius Brezovan, University of Craiova, Romania
Mark Last, Ben-Gurion University of the Negev, Israel
Marko Hölbl, University of Maribor, Slovenia
Martijn Warnier, Delft University of Technology, The Netherlands
Martin Strecker, Université de Toulouse, France
Massimo Torquati, University of Pisa, Italy
Matthias Hiller, Fraunhofer Institute for Applied and Integrated Security AISEC, Germany
Maxim Abramov, SPIIRAS, SPbSU, Russian Federation
Maxim Kolomeec, SPIIRAS, ITMO University, Russian Federation
Michael Kasper, Fraunhofer, Germany
Michael Negnevitsky, University of Tasmania, Australia
Mihaela Colhon, University of Craiova, Romania
Mihaela Oprea, University Petroleum-Gas of Ploiesti, Romania
Mikhail Y. Kovalyov, United Institute of Informatics Problems, National Academy of Sciences of Belarus, Belarus
Milan Sečujski, University of Novi Sad, Serbia
Miloš Radovanović, University of Novi Sad, Serbia
Milos Savic, University of Novi Sad, Serbia
Milos Zelezny, University of West Bohemia, The Czech Republic
Mirjana Ivanović, University of Novi Sad, Serbia
Natalia Garanina, Ershov Institute of Informatics Systems, Russian Federation
Natalja Krasilnikova, Pavlov First Saint-Petersburg State Medical University, Russian Federation
Nevena Ackovska, Saints Cyril and Methodius University of Skopje, North Macedonia
Nick Bassiliades, Aristotle University of Thessaloniki, Greece
Nicolas Sklavos, University of Patras, Greece
Nouredine Melab, Lille 1 University, France
Olga Kolesnichenko, Security Analysis Bulletin, Russian Federation
Olga Lozhkina, Institute of Transport Problems of the Russian Academy of Sciences (IPT RAS), Russian Federation
Olga Tushkanova, SPIIRAS, Russian Federation
Oliver Jokisch, Leipzig University of Telecommunications, Germany
Olivier Boissier, Ecole des Mines de Saint-Étienne, France
Oscar Sapena, Universitat Politècnica de València, Spain
Panagiotis Demestichas, University of Piraeus, Greece
Paolo Bresciani, Fondazione Bruno Kessler - FBK, Italy
Paul Davidsson, Malmö University, Sweden
Paulo Moura Oliveira, UTAD University, Portugal
Paulo Novais, University of Minho, Portugal
Pedro López, Universidad de Deusto, Spain
Petia Koprinkova-Hristova, Bulgarian Academy of Sciences, Bulgaria
Petr Skobelev, Samara Technical University, Smart Solutions, Russian Federation
Petros Kefalas, The University of Sheffield, UK
Phong Nguyen, University of London, UK
Radu-Emil Precup, Politehnica University of Timişoara, Romania
Răzvan Andonie, Central Washington University, USA
Rem Collier, University College Dublin, Ireland
Roland Rieke, Fraunhofer, Germany
Romeo Sanchez, Universidad Autonoma de Nuevo Leon, Mexico
Ronald Marx, Huawei Technologies, Germany
Saad Alqithami, Al Baha University, Saudi Arabia
Salvador Abreu, University of Evora, Portugal
Sasko Ristov, University of Innsbruck, Austria
Sergei Chernyi, Admiral Makarov State University of Maritime and Inland Shipping, Russian Federation
Sergei Gorlatch, University of Muenster, Germany
Sergey Makarenko, Intel Group Corporation ltd, Russian Federation
Sergiu Nedevschi, Technical University of Cluj-Napoca, Romania
Setsuya Kurahashi, University of Tsukuba, Japan
Stanimir Stoyanov, University of Plovdiv "Paisii Hilendarski," Bulgaria
Stefka Fidanova, Institute of Information and Communication Technologies, Bulgaria
Tatiana Maximova, ITMO University, Russian Federation
Tatyana Tulupyeva, SPbU, SPIIRAS, NWIM RANEPA, Russian Federation
Thomas Ågotnes, University of Bergen, Norway
Tihana Galinac Grbac, Juraj Dobrila University of Pula, Croatia
Turganbek Omar, Almaty University of Power Engineering and Telecommunications, Kazakhstan
Vacius Jusas, Kaunas University of Technology, Lithuania
Vadim Ermolayev, Zaporizhzhia National University, Ukraine
Vasiliy Osipov, SPIIRAS, Russian Federation
Vasily Desnitsky, SPIIRAS, ITMO University, Russian Federation
Vicente Julian, Universitat Politècnica de València, Spain
Victor V. Toporkov, National Research University "MPEI," Russian Federation
Vladimir Gorodetsky, InfoWings Ltd., Russian Federation
Vladimir Komashinskiy, Institute of Transport Problems of the Russian Academy of Sciences (IPT RAS), Russian Federation
Vladimir Kurbalija, University of Novi Sad, Serbia
Vladimir Maric, Czech Technical University in Prague, The Czech Republic
Vladimir Oleshchuk, University of Agder, Norway
Vlado Delic, University of Novi Sad, Serbia
Vyacheslav Shkodyrev, Peter the Great St. Petersburg Polytechnic University, Russian Federation
Wolfgang Minker, University of Ulm, Germany
Yannick Chevalier, Université de Toulouse, France
Yingqian Zhang, Eindhoven University of Technology, The Netherlands
Yuki Matsuda, Nara Institute of Science and Technology, Japan
Yulia Shichkina, Saint-Petersburg Electrotechnical University "LETI," Russian Federation
Yuri Matveev, ITMO University, Russian Federation
Yury Iskanderov, SPIIRAS, Russian Federation
Yury Sherstyuk, Institute of Informatics and Telecommunications, Russian Federation
Yury Zagorulko, A.P. Ershov Institute of Informatics Systems, Russian Academy of Sciences, Russian Federation
Zdeněk Krňoul, University of West Bohemia, The Czech Republic
Zhuo Wei, Huawei, Singapore
Zoran Bosnic, University of Ljubljana, Slovenia

Local Organizing Committee

Igor Kotenko, SPIIRAS, ITMO University, Russian Federation
Igor Saenko, SPIIRAS, ITMO University, Russian Federation
Evgenia Novikova, SPIIRAS, Russian Federation
Andrey Chechulin, SPIIRAS, ITMO University, Russian Federation
Vasily Desnitsky, SPIIRAS, ITMO University, Russian Federation
Alexander Branitskiy, SPIIRAS, Russian Federation
Andrey Fedorchenko, SPIIRAS, ITMO University, Russian Federation
Lidia Vitkova, SPIIRAS, ITMO University, Russian Federation
Elena Doynikova, SPIIRAS, ITMO University, Russian Federation
Maxim Kolomeec, SPIIRAS, ITMO University, Russian Federation
Dmitry Levshun, SPIIRAS, ITMO University, Russian Federation
Yurii Bakhtin, SPIIRAS, Russian Federation
Nickolay Komashinsky, SPIIRAS, Russian Federation
Alexei Kushnerevich, SPIIRAS, Russian Federation
Eugene Merkushev, SPIIRAS, Russian Federation
Anton Pronoza, SPIIRAS, Russian Federation
Nikolay Rudavin, SPIIRAS, Russian Federation
Kseniia Zhernova, SPIIRAS, Russian Federation
Diana Gaifulina, SPIIRAS, ITMO University, Russian Federation
Aleksej Meleshko, SPIIRAS, ITMO University, Russian Federation
Anastasiya Balueva, SPIIRAS, Russian Federation


Internet of Things, Cloud Computing and Big Data

Using Blockchain for Reputation-Based Cooperation in Federated IoT Domains

Giancarlo Fortino (1), Fabrizio Messina (2), Domenico Rosaci (3), and Giuseppe M. L. Sarné (4)

(1) DIMES, University of Calabria, Rende, CS, Italy, [email protected]
(2) DMI, University of Catania, Catania, Italy, [email protected]
(3) DIIES, University "Mediterranea", Reggio Calabria, Italy, [email protected]
(4) DICEAM, University "Mediterranea", Reggio Calabria, Italy, [email protected]

Abstract. The convergence of the Internet of Things and Multi-Agent Systems relies, among other things, on the association of software agents with IoT devices, so as to benefit from their social attitude to cooperate for services. In this context, selecting reliable partners for cooperation can be a very difficult task when IoT devices migrate across different domains. To this purpose, we introduce the Reputation Capital model and an algorithm that forms agent groups in each federated IoT domain, on the basis of the reputation capital of each agent, in order to realize a competitive framework. A further essential contribution consists of adopting blockchain technology to certify the reputation capital of each agent in each federated environment. To verify that the individual reputation capital of devices and, consequently, the overall reputation capital of the IoT community can benefit from the proposed approach, we performed some experiments. Their results show that, under certain conditions, almost all the misleading agents are detected. Moreover, the simulations have also shown that, by adopting our reputation model, malicious actors always pay significantly more for services than honest devices.

Keywords: Blockchain · Agents group · IoT · Reputation capital

1 Introduction

The Internet of Things (IoT) devices are equipped with increasing sensing and computational capabilities so that the environments around us are more and more pervasive, context-aware and smart [2]. This process appears as unstoppable and is changing human and virtual societies in a deep way. In this scenario, cooperation among smart objects can represent a powerful way for increasing the c Springer Nature Switzerland AG 2020  I. Kotenko et al. (Eds.): IDC 2019, SCI 868, pp. 3–12, 2020. https://doi.org/10.1007/978-3-030-32258-8_1

4

G. Fortino et al.

performance of IoT devices [9,10] and, to this purpose, several IoT architectures and standards have been proposed [5]. In this work, we focus on the cooperation among IoT devices, suitably supported by the adoption of a Multi-Agent System (MAS) approach, in a dynamic, distributed framework. In particular, each IoT device is associated with a software agent that works on its behalf [8]. The synergy deriving by joining IoT and MAS technologies allows to discovery services potentially interesting in an effective manner as well as to realize a wide, dynamic network of federated IoT domains where smart heterogeneous IoT devices can move across the different domains forming the network. In this context, any IoT device (i.e., its associated software agent) has to select few reliable partners even if it has not an adequate experience to perform a good choice. A promising solution is represented by forming social structures, like agent groups, to suitably support devices in each federated domain. Notice that the effectiveness of a group (real or virtual) is tightly related to the number of interactions occurring among the members of that group or, in other words, to its social capital [4]. In this perspective, a real and interesting problem to face is that of representing the trustworthiness inside an IoT network composed by several federated environments where a great number of agents is affiliated with. In such a scenario, a common way for an agent to select a partner relies on the (global) reputation that a device (i.e., agent) has in its community [23] given that its personal experience is not suitably to directly perform a good choice. However, in a distributed context, the task of realizing a measure of the global reputation is not trivial in absence of a centralized repository. To deal with this problem, we propose to model agent reputation by using a measure of reputation called Reputation Capital (RC) and adopting a blockchain protocol [30] to maintain and certify the reputation capital of the agents in each federated domain. Moreover, we describe a competitive IoT scenario where it is possible to form agent groups inside each federated domain and where the cooperation for services occurring inside the same group is provided for free (conversely, it is provided only for pay). Our approach profitably combines more technologies (e.g., IoT, reputation systems, blockchain and group formation) and, in this way, all the IoT devices moving across the federated domains composing our IoT network can always certify their reputation capital in order to join with groups active in their current IoT domain to maximize their advantages. We performed a few preliminary experiments of a simulated environment containing both honest and misleading agents and the obtained results have shown that almost all the misleading agents, if their percentage is lower than a (high) threshold, are detected and that honest agent significantly pay less that malicious agents. The rest of the paper is organized as follows. Section 2 presents an overview on the related literature. Section 3 introduces the proposed framework scenario. Section 4 discusses the reputation-blockchain mechanism as well as presents the


Section 5 reports the results of our experiments. Finally, Sect. 6 draws some conclusions.

2 Related Work

Trust and reputation systems have been widely studied to support users' (i.e., agents') activities in distributed environments, which are subject to more potential threats – mainly due to the activity of malicious devices – than centralized environments [22]. To provide security in a distributed environment, they are often used together with cryptography. In particular, trust and reputation systems [24] are capable of limiting the risk due to the possible unreliability of potential partners by estimating their trustworthiness, while cryptographic techniques aim to protect against outside attacks by authenticating counterparts and safeguarding privacy [29]. Most social interactions and decision processes happening within evolved societies, be they human or virtual, rely on the concept of trustworthiness [13]. Therefore, a great variety of studies based on different points of view [12], and a great number of analyses, models and architectures dealing with trust, have been proposed in the literature. In this respect, different elements can influence the accuracy of trust estimates; specifically, the number and quality of the information sources [14], the aggregation and inference rules (e.g., local or global) [17] and the nature of the context (e.g., centralized or distributed) [16]. Group formation processes taking place in real and virtual communities can also benefit from trust and reputation measures. Indeed, many proposals rely on trust or reputation measures to recommend to a member of a community the best groups for affiliation, or to a group the best candidate members [1,11]; this problem is known as group recommendation (or the affiliation problem). Moreover, groups are more stable over time when trustworthiness information is exploited in their formation, compared with different group formation strategies [6], because in the former case groups and their members usually receive more benefits. In the IoT world, trust and reputation criteria are assuming great relevance, and new trust and reputation models for IoT have recently been proposed [25]. Similarly, the introduction of social structures within IoT communities can significantly improve the effectiveness of IoT devices' activities, and this approach is therefore increasing in popularity [3,9]. Finally, recent proposals have introduced blockchain technology [19] into the IoT world [18], both to implement secure management frameworks and to enable a reliable sharing of resources and services between IoT devices in distributed scenarios. In our context, the possibility of realizing smart contracts [27] is of particular interest. Ethereum [7] was the first blockchain implementing smart contracts which, from a technical viewpoint, can be assimilated to software agents that autonomously execute programmed transactions.


To this aim, Ethereum makes available several programming languages for writing smart-contract code in a relatively easy way. Besides Ethereum, other platforms also support smart contracts, among them Hyperledger [15], Ripple [21], Stellar [26] and Tendermint [28].

3 The IoT Framework

The scheme of the proposed IoT framework is depicted in Fig. 1. It is populated by a wide number of heterogeneous, smart IoT devices, each one associated with a software agent that works on behalf of the device. Let N be a dynamic IoT network affiliating a number n (with n > 1) of existing domains, named Federated Domains (FDs), which are free to enter or leave N according to their convenience, and let us assume that a (permissioned) blockchain is associated with N (see Sect. 4). Moreover, let us denote by SD the set of devices and by SA the set of software agents, each one associated with a unique, personal device (in the following we use the terms device and agent interchangeably). Each federated domain FD ∈ N is administrated by a trusted and suitably equipped agent called Federated Domain Manager (FDM), which provides all the devices temporarily affiliated with its administrated domain with some basic services. In detail, an FDM associates an identifier (Id), unique in N, with each agent the first time it becomes active on N, and it manages and updates a registry of all the agents currently hosted in its administrated FD. For the sake of simplicity, the set of agents SA affiliated with N and the relationships occurring among them are represented by means of a graph G = ⟨NG, LG⟩, where NG is the set of nodes of G, each node being associated with a unique agent a ∈ SA (i.e., device d ∈ SD), and LG is the set of oriented links, each link being associated with a relationship occurring between two agents of N (middle layer in Fig. 1).
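To make this structure concrete, the following Python sketch models the entities just introduced: federated domains with their FDM registries and the agent graph G = ⟨NG, LG⟩. It is illustrative only; all class and attribute names are ours, not taken from the paper's implementation.

from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

@dataclass
class Agent:
    agent_id: int            # network-wide unique Id assigned by an FDM
    device_id: str           # the physical IoT device the agent works for
    reputation: float = 1.0  # Reputation Capital (see Sect. 4)

@dataclass
class FederatedDomain:
    name: str
    registry: Dict[int, Agent] = field(default_factory=dict)  # agents currently hosted

    def admit(self, agent: Agent) -> None:
        # FDM duty: register an agent that entered this domain
        self.registry[agent.agent_id] = agent

    def release(self, agent_id: int) -> None:
        # FDM duty: deregister an agent that moved to another domain
        self.registry.pop(agent_id, None)

@dataclass
class Network:
    domains: Dict[str, FederatedDomain] = field(default_factory=dict)
    # Graph G = <NG, LG>: NG = agent ids, LG = oriented (consumer, provider) links
    ng: Set[int] = field(default_factory=set)
    lg: Set[Tuple[int, int]] = field(default_factory=set)

    def record_interaction(self, consumer_id: int, provider_id: int) -> None:
        self.ng.update((consumer_id, provider_id))
        self.lg.add((consumer_id, provider_id))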

[Figure: a three-layer scheme – the network N of federated domains FD1–FD3, each administrated by its FDM and hosting device–agent pairs (d, a); the agent graph G; and the RC blockchain.]

Fig. 1. The Agent-based framework


Agent groups can be formed inside each FD, and each agent can request affiliation with one of the groups active in its current FD based on its reputation capital, as certified by the blockchain (recall that agents can freely move from one FD to another). We denote by g_k^{FD_m} the k-th group g formed inside the m-th FD ∈ N; like every group active in an FD, it is managed by the respective FDM.

4 Reputation Capital and Group Formation Algorithm

In our framework, each IoT device is a prosumer that consumes and produces services; services are offered for a fee to all the agents that do not belong to the same group as the producer. The reliability of an IoT device when it acts as a consumer is witnessed by the blockchain (see below), while the quality of its services when it acts as a provider is witnessed by its reputation capital score, described below.

The Reputation Capital is a numerical score that takes into account the past "behaviors" of a device (i.e., agent) in its past qualified interactions (see below) with other devices of N when it acted as a provider. The first time an agent becomes active on N, it receives an initial RC that must not penalize the newcomer too much [20], but must suitably discourage the whitewashing strategies of malicious agents which, having compromised their RC, could exit the framework and re-enter N to receive a fresh starting RC.

More in detail, an interaction takes place when a consumer agent a_j obtains a service s from a provider agent a_i. After s is consumed, a_j releases a feedback φ_ji ∈ [0, 1] ⊂ R which represents its appreciation of s. If the interaction is qualified (see below), the released feedback is exploited to update the RC_i of the agent a_i (and similarly for a_j when it acts as a provider). In this respect, the relevance R of an interaction is the ratio c_s/C if c_s < C, and is fixed to 1 otherwise, where c_s is the cost of s and C is the maximum cost beyond which the relevance of the interaction is assumed to saturate. An interaction is said to be qualified when its feedback φ satisfies, with respect to R,

    φ < 0.5, for any R,   or   R ≥ φ, for φ ≥ 0.5 and R ≥ 0.5.    (1)

Based on the latest h qualified interactions of the provider agent a_i with other agents (generically denoted a_j), its RC_i is updated as

    RC_i = Σ_{n=1}^{h} ω^{(n−1)} · δ_{j,n} · R_{ji,n} · φ_{ji,n},    (2)

where (i) ω weights the h qualified interactions, giving more relevance to the most recent ones, and (ii) δ is a parameter introduced to attenuate the impact of agents that systematically release negative feedback (e.g., φ < 0.5) to gain some benefit; it is set to the complement to 1 of the ratio between the number of negative feedbacks (NF) released by an agent and the overall number of its interactions (NT), i.e., δ = 1 − NF/NT.
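A minimal Python sketch of Eqs. (1) and (2) follows. It encodes our reading of the formulas – in particular, that ω^(n−1) weights the n-th most recent interaction – so it is an illustration, not the authors' implementation:

def relevance(cost, C):
    # R = c_s / C, saturated at 1 once the cost reaches the threshold C
    return min(cost / C, 1.0)

def is_qualified(phi, R):
    # Eq. (1): negative feedback always qualifies; positive feedback
    # qualifies only when the interaction is relevant enough (R >= phi)
    return phi < 0.5 or (phi >= 0.5 and R >= 0.5 and R >= phi)

def delta(negative_feedbacks, total_interactions):
    # Attenuation for systematically negative raters: delta = 1 - NF/NT
    return 1.0 - negative_feedbacks / total_interactions

def reputation_capital(interactions, omega):
    # Eq. (2): interactions = [(delta_n, R_n, phi_n), ...] for the last h
    # qualified interactions, most recent first
    return sum(omega ** (n - 1) * d * R * phi
               for n, (d, R, phi) in enumerate(interactions, start=1))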


When a service is provided in an FD, it is committed on the associated blockchain platform by means of a smart contract which carries out its contractual obligations. Currently, for the sake of simplicity, we refer to the Ethereum blockchain platform, given the availability of both well-documented APIs and its own cryptocurrency (i.e., Ether) for realizing payments inside the framework.

With the aim of realizing a competitive framework, we exploit the RC scores to form agent groups inside each FD and assume that all the service interactions occurring inside a group are provided for free. To this purpose, a distributed group formation algorithm is periodically executed in each federated domain (FD) by its manager (FDM) to group agents (i.e., devices) based on their RC scores. For each FD, its FDM decides the number of groups allowed in the domain and the RC scores (in increasing order) required by each group for affiliating an agent. Each agent can ask its FDM for affiliation with an FD group (see Algorithm 1), and the FDM assigns it to the group "best" fitting its RC. Periodically, the FDM checks whether the RC of each agent is still adequate for its group; otherwise the agent can be moved to another group based on its RC, or removed from every group if its RC is too low to join any group active in that FD. The pseudocode of this procedure, executed by the FDM of every FD, is listed in Algorithm 1 (the symbols are listed in Table 1). The simple function Assign( ) takes as input an agent, the maximum number of groups active in FD_m and the datasets DA_m and DG_m. For each agent a_i belonging to FD_m, Assign( ) assigns a_i to a group based on its RC_i score: if RC_i is lower than the threshold Γ_1 of the last group (that is, the lowest Γ in AG_m), the function Remove( ) is called to remove a_i from every group active in AG_m until its RC_i becomes adequate for a new affiliation; otherwise a_i is assigned to the group best fitting its RC_i.

Table 1. Table of the main symbols

Symbol   Description
N        Network
FD       Federated Domain
FDM      Federated Domain Manager agent
DG       Dataset of Groups
DA       Dataset of Agents
MG       Maximum number of Groups
AG       Set of Active Groups g
τ_m      Time Threshold


Algorithm 1. The procedure executed by each FDM.

Input: DA_m, DG_m, τ_m, MG_m;
 1: for all a_i ∈ FD_m do
 2:   if time_last_RC_check ≥ τ_m then
 3:     ask the blockchain for the updated RC of a_i and then update DA_m
 4:   end if
 5: end for
 6: for all g_k ∈ AG_m do
 7:   if time_last_affiliation_check ≥ τ_m then
 8:     for all a_i ∈ g_k do
 9:       if (RC_i < Γ_k) ∨ (RC_i ≥ Γ_{k+1}) then
10:         Assign(a_i, AG_m, DA_m, DG_m)
11:       end if
12:     end for
13:   end if
14: end for
15: for all a_i ∈ FD_m requiring to be affiliated with a group do
16:   if RC_i ≥ Γ_1 then
17:     Assign(a_i, AG_m, DA_m, DG_m)
18:   else
19:     reject the request of a_i and send a message to the agent
20:   end if
21: end for
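For concreteness, a minimal, self-contained Python sketch of this procedure is given below. The group thresholds Γ and all names are illustrative, and the blockchain lookup is abstracted as a plain dictionary of RC scores:

from dataclasses import dataclass, field
from typing import Dict, List, Set

@dataclass
class Group:
    threshold: float              # Gamma_k: minimum RC required for affiliation
    members: Set[int] = field(default_factory=set)

def assign(agent_id: int, rc: float, groups: List[Group]) -> None:
    # Assign(): put the agent in the group best fitting its RC, after
    # removing it from any group it currently belongs to (Remove())
    for g in groups:
        g.members.discard(agent_id)
    eligible = [g for g in groups if rc >= g.threshold]
    if eligible:
        max(eligible, key=lambda g: g.threshold).members.add(agent_id)

def fdm_periodic_check(rc_scores: Dict[int, float], groups: List[Group]) -> None:
    # Lines 6-14: move every agent whose RC no longer fits its current group
    for g in groups:
        for agent_id in list(g.members):
            rc = rc_scores[agent_id]
            higher = [h.threshold for h in groups if h.threshold > g.threshold]
            if rc < g.threshold or (higher and rc >= min(higher)):
                assign(agent_id, rc, groups)

def fdm_affiliation_request(agent_id: int, rc: float, groups: List[Group]) -> bool:
    # Lines 15-20: accept the request only if RC reaches the lowest threshold
    if rc >= min(g.threshold for g in groups):
        assign(agent_id, rc, groups)
        return True
    return False   # rejected; a message would be sent to the agent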

5 Experiments

We performed some tests to understand whether the proposed solution is capable of recognizing malicious actors (RC < 1.0) in the presence of different and concurrent types of attacks, carried out with different strategies. To this end, our simulations involved a simulated federated network and 10^3 simulated IoT devices (i.e., agents); each epoch included 10^3 interactions, so that each simulated agent acted as a provider once per epoch on average. Preliminary tests led us to set the initial RC score assigned to each device to 1.0; we randomly generated the cost c_s of a service s in the range 1–1.5 $, with a cost threshold C of 1 $. Finally, the horizon h varied from h = 4 to h = 10. The accuracy of our framework was analyzed by means of several metrics. In particular, assuming an initial RC = 1.0, we refer here to (i) the percentage of malicious and honest devices/agents correctly recognized (on the basis of their RC) and (ii) the cost paid for services, for h = 4 and h = 10, as the number of cheaters increases. In the first case (see Fig. 2), the cheater percentage varied from 0% to 100% in steps of 5% (results refer to the 25th epoch, but they become stable within the 5th epoch). The results show that with a percentage of cheaters not greater than 25%, the percentage of recognized cheaters varied from 100% (h = 4, malicious = 5%) to 96% (h = 10, malicious = 25%) within the 5th epoch.


Fig. 2. Sensitivity for h = 4 and h = 10 in identifying honest and malicious devices

Fig. 3. Cost for services paid by honest (H) and malicious (M) devices for h = 4 and h = 10 and different malicious percentages (mal)

Beyond this percentage, the reputation capital in the adopted configuration progressively loses effectiveness in recognizing the honest or malicious nature of the devices. In the second case, the simulations (see Fig. 3) confirmed that cheaters (limited to 25% of the population) always pay significantly more for services than honest devices, with a ratio ranging from about 1:1.5 to 1:4.2 (measured at the 25th epoch).

6 Conclusions and Future Work

In this paper we described a wide IoT network federating several environments in which heterogeneous, smart, cooperating IoT devices live and have the opportunity to move across the different federated domains. In the proposed distributed framework, we dealt with the problem of choosing a partner based on its reputation, assuming that this information is properly spread through the framework. To this aim, we associated a software agent with each IoT device, exploiting its social attitude to cooperate as well as its capability to form complex agent social structures such as groups. To support device cooperation, we proposed the reputation capital model, built from the feedback each device receives for its interactions and disseminated by means of a blockchain, thus avoiding any centralized component inside the proposed distributed network.


A preliminary campaign of simulations was carried out to verify the effectiveness and efficiency of the framework. The results show that the proposed framework and the reputation capital are capable of detecting almost all the misleading agents when their number stays below a sufficiently high percentage of the framework population. As future work, we plan to study the behavior of the proposed model with respect to the affiliation of devices to the groups active in each FD; in other words, the composition of device groups in the presence of malicious devices. Finally, we aim to study the values of the reputation capital as the horizon and the percentage of malicious agents vary, also in a real context.

References

1. Aikebaier, A., Enokido, T., Takizawa, M.: Trustworthy group making algorithm in distributed systems. Hum.-Centric Comput. Inf. Sci. 1(1), 6 (2011)
2. Augusto, J., Callaghan, V., Cook, D., Kameas, A., Satoh, I.: Intelligent environments: a manifesto. Hum.-Centric Comput. Inf. Sci. 3(1), 12 (2013)
3. Bao, F., Chen, I.: Dynamic trust management for internet of things applications. In: Proceedings of the 2012 International Workshop on Self-aware Internet of Things, pp. 1–6. ACM (2012)
4. Blanchard, A., Horan, T.: Virtual communities and social capital. In: Knowledge and Social Capital, pp. 159–178. Elsevier (2000)
5. Da Xu, L., He, W., Li, S.: Internet of things in industries: a survey. IEEE Trans. Industr. Inf. 10(4), 2233–2243 (2014)
6. De Meo, P., Ferrara, E., Rosaci, D., Sarnè, G.M.L.: Trust and compactness in social network groups. IEEE Trans. Cybern. 45(2), 205–216 (2015)
7. Ethereum. https://www.ethereum.org (2018)
8. Fortino, G., Gravina, R., Russo, W., Savaglio, C.: Modeling and simulating internet-of-things systems: a hybrid agent-oriented approach. Comput. Sci. Eng. 19(5), 68–76 (2017)
9. Fortino, G., Messina, F., Rosaci, D., Sarnè, G.M.L.: Using trust and local reputation for group formation in the cloud of things. Future Gener. Comput. Syst. 89, 804–815 (2018)
10. Fortino, G., Russo, W., Savaglio, C., Shen, W., Zhou, M.: Agent-oriented cooperative smart objects: from IoT system design to implementation. IEEE Trans. Syst. Man Cybern. Syst. (2018). https://doi.org/10.1109/TSMC.2017.2780618
11. Fotia, L., Messina, F., Rosaci, D., Sarnè, G.M.L.: On the impact of trust relationships on social network group formation. In: WOA, pp. 25–30 (2017)
12. Gächter, S., Herrmann, B., Thöni, C.: Trust, voluntary cooperation, and socioeconomic background: survey and experimental evidence. J. Econ. Behav. Organ. 55(4), 505–531 (2004)
13. Heidemann, J., Klier, M., Probst, F.: Online social networks: a survey of a global phenomenon. Comput. Netw. 56(18), 3866–3878 (2012)
14. Huynh, T.D., Jennings, N.R., Shadbolt, N.R.: An integrated trust and reputation model for open multi-agent systems. Auton. Agents Multi-Agent Syst. 13(2), 119–154 (2006)
15. Hyperledger. https://www.hyperledger.org (2018)


16. Jøsang, A., Ismail, R., Boyd, C.: A survey of trust and reputation systems for online service provision. Decis. Support Syst. 43(2), 618–644 (2007)
17. Kim, Y., Song, H.S.: Strategies for predicting local trust based on trust propagation in social networks. Knowl. Based Syst. 24(8), 1360–1371 (2011)
18. Kumar, N.M., Mallick, P.K.: Blockchain technology for security issues and challenges in IoT. Procedia Comput. Sci. 132, 1815–1823 (2018)
19. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008)
20. Ramchurn, S.D., Huynh, D., Jennings, N.R.: Trust in multi-agent systems. Knowl. Eng. Rev. 19(1), 1–25 (2004)
21. Ripple. https://ripple.com/ (2018)
22. Roman, R., Zhou, J., Lopez, J.: On the features and challenges of security and privacy in distributed internet of things. Comput. Netw. 57(10), 2266–2279 (2013)
23. Ruohomaa, S., Kutvonen, L., Koutrouli, E.: Reputation management survey. In: The 2nd International Conference on Availability, Reliability and Security, ARES 2007, pp. 103–111. IEEE (2007)
24. Sabater-Mir, J., Paolucci, M.: On representation and aggregation of social evaluations in computational trust and reputation models. Int. J. Approximate Reasoning 46(3), 458–483 (2007)
25. Sicari, S., Rizzardi, A., Grieco, L., Coen-Porisini, A.: Security, privacy and trust in Internet of Things: the road ahead. Comput. Netw. 76, 146–164 (2015)
26. Stellar. https://www.stellar.org (2018)
27. Szabo, N.: Smart contracts (1994). Unpublished manuscript
28. Tendermint. https://tendermint.com (2018)
29. Yang, Y., Wu, L., Yin, G., Li, L., Zhao, H.: A survey on security and privacy issues in internet-of-things. IEEE Internet of Things J. 4(5), 1250–1258 (2017)
30. Zheng, Z., Xie, S., Dai, H., Wang, H.: Blockchain challenges and opportunities: a survey. Working paper (2016)

Design of Fail-Safe Quadrocopter Configuration

Oleg V. Baranov, Nikolay V. Smirnov(B), Tatiana E. Smirnova, and Yefim V. Zholobov

St. Petersburg State University, 7/9, Universitetskaya nab., St. Petersburg 199034, Russia
{o.baranov,n.v.smirnov,t.smirnova}@spbu.ru, [email protected]

Abstract. Unmanned aerial vehicles have become widespread in various fields of activity. In this article, we propose an approach to choosing the performance characteristics of a quadrocopter that make it possible to solve emergency landing problems. The key target characteristics for such a vehicle are substantiated. The calculation methodology assumes the selection of commercially available components with the appropriate characteristics; for this purpose, the eCalc software was used. In the second part of the work, we propose a mathematical model describing the quadrocopter movement in emergency situations. This model allows us to create algorithms for solving control problems in various emergency situations; the implementation relies on the capabilities of the MATLAB package. A numerical experiment is carried out to prove the efficiency of this approach.

Keywords: UAV · Quadrocopter · Fail-safe landing · Control

1 Introduction

Unmanned aerial vehicles (UAVs) of helicopter type with an even number of rotors – quadro-, hexa- and octocopters – have become widespread in various fields of activity. The urgency of modeling UAV fail-safe configurations is due to the value of the payload installed on board. At the same time, according to statistics, accidents of unmanned vehicles happen hundreds of times more often than accidents of manned vehicles [7]. The main causes of failures are malfunctions in the control system, which includes a number of high-sensitivity sensors, as well as various kinds of mechanical damage [1,4,6]. Obviously, the most dangerous situation is the failure of one or several motors directly in flight, regardless of the nature of the failure (a fault in the control or power circuit, mechanical frame damage or propeller destruction). With respect to multirotor systems, the simultaneous failure of all electric motors is less likely, and hence the possibility of landing remains even for a scheme with four propellers – a quadrocopter.


Mathematical modeling shows [1,2,5,6] that even with the loss of two diagonal propellers out of four, landing the vehicle is possible, albeit with the loss of control over one degree of freedom (the angle of its own rotation). However, for the successful realization of such a landing in practice, it is necessary to provide, already at the design stage, solutions that ensure a power reserve of the engine–propeller unit. This necessarily reduces the overall efficiency of the system, so calculating the UAV performance characteristics for such a vehicle is especially important: in addition to reliability, any UAV must have the necessary operational characteristics. In this article, we propose an approach to choosing the performance characteristics of a quadrocopter which provide a solution to emergency landing problems. Based on a special mathematical model, we develop algorithms and software to control the UAV in emergency situations. A numerical experiment is carried out to prove the efficiency of the approach.

2 Requirements for the Fail-Safe Configuration

If two of the four motors of the quadrocopter fail, a successful landing requires that the thrust of the two remaining motors reach at least one equipped mass of the vehicle. However, taking into account:

• an unfavorable mode of movement during an emergency landing (constant rotation);
• system efficiency;
• a probably unfavorable external situation (weather, electromagnetic environment, etc.);

it makes sense for a fail-safe configuration to have a two-engine thrust-to-weight ratio comparable to the typical full thrust-to-weight ratio of a usual configuration, i.e. not less than 1.75 equipped masses of the vehicle [3]. Thus, we propose to set the target full thrust-to-weight ratio of the fail-safe quadrocopter configuration in the range of 3.5–5 equipped masses of the vehicle (a short numerical check of this arithmetic follows the component list below). To achieve this indicator, the following components are selected when designing the quadrocopter configuration:

• a power source (in this work we consider lithium-polymer batteries, the most common power source for quadrocopters);
• electronic speed controllers (ESCs), one per motor;
• power electrical wires;
• electric motors and propellers.

The increase in thrust-to-weight ratio leads to an increase in the weight of these components. Also, providing a large thrust pushes the design load index, which provides stationary hovering of the vehicle, out of the optimal range. We now define the main target performance characteristics of the fail-safe configuration that meet the specified requirements. They are presented in Table 1.
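The sizing arithmetic above can be checked with a few lines of Python; the 12 kg equipped mass is an illustrative value of ours, not one from the paper:

def per_motor_thrust(equipped_mass_kg, two_motor_ratio=1.75):
    # Thrust (kgf) each of the 4 motors must deliver so that the two
    # remaining motors alone reach two_motor_ratio * equipped mass
    return two_motor_ratio * equipped_mass_kg / 2

mass = 12.0                          # illustrative equipped mass, kg
t = per_motor_thrust(mass)           # 10.5 kgf per motor
full_ratio = 4 * t / mass            # 3.5 equipped masses with all four motors
print(t, full_ratio)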


Table 1. Main target performance characteristics of the fail-safe configuration

No  Characteristic                               Value
1   Thrust-to-weight ratio                       3.5–5 equipped masses
2   Hover flight time                            more than 15 min
3   Specific thrust                              more than 5 g/W
4   Battery load                                 less than 50 C (C – battery capacity)
5   Minimum flight time (all motors at maximum)  more than 2 min
6   Hover flight throttle                        20–40%
7   Hover flight efficiency                      more than 70%
8   Optimal horizontal speed                     more than 25 km/h

Note that requirements No. 2, 3 and 7 (see Table 1) for the fail-safe configuration are relaxed relative to the typical requirements for general-purpose quadrocopters. The critically low indicator No. 5 is explained by the high thrust-to-weight ratio; it is of no fundamental significance, because the maximum-thrust mode of all four motors, with a total thrust-to-weight ratio close to 5 equipped masses, has no typical practical application. As the key performance characteristics, we consider the time and distance of the flight depending on the motion speed. The selection of components is carried out in order to maximize the key performance characteristics. To calculate the performance characteristics, we also fix the general parameters of the vehicle configuration and the state of the external environment (see Table 2).

3 Calculation of the Performance Characteristics

Table 2. Main target performance characteristics

No  Characteristic                        Value
1   Number of blades for each propeller   2
2   Field elevation                       100 m ASL
3   Pressure (QNH)                        1013 hPa
4   Battery type                          LiPo
5   Maximum discharge                     90%
6   FCU tilt limit                        none
7   Gear ratio                            1:1
8   Additional electrical load            0 W

Calculation of the performance characteristics is carried out for a quadrocopter with a 750 mm frame size, using the parameters of Tables 1 and 2.


For the calculations, the eCalc software was used (see xCopterCalc [3]). The calculation methodology assumes the selection of commercially available components with the appropriate characteristics to guarantee the target performance characteristics. xCopterCalc contains a continuously updated database of components for multirotor systems; in the absence of a suitable part in the database, xCopterCalc allows a new part to be added by entering its characteristics. Thus, with this software it is possible to calculate almost any configuration of a multirotor system, obtaining as a result not only the vehicle performance characteristics but also a draft list of components for subsequent assembly. A quadrocopter with a 750 mm frame can be used for professional video shooting, target designation, and observation using special devices. The calculation results for the fail-safe configuration are presented in Table 3. The fail-safe configuration of a quadrocopter has further technical peculiarities. To reduce the current in the power lines, it is advisable to use a 14S1P battery assembly with a rather high rated voltage of 51.8 V, uncharacteristic for quadrocopters. This reduces the current in the power lines, but in the maximum mode it still remains high (77.25 A); therefore, heavy power wires with more than 10 mm² cross-section and powerful 80 A ESCs – heavy and expensive – are required. The high energy stored in the battery (725.2 Wh) also requires a number of safety measures in case of damage. At the same time, such a vehicle has high performance characteristics No. 9, 14 and 15 (see Table 3). The dependence of the engine–propeller combination characteristics on the motor current is shown in Fig. 1; the key performance characteristics are shown in Fig. 2. The horizontal speed range optimal in power consumption is 33–36 km/h, and the maximum flight distance is 6.1 km.

4 Modeling of Emergency Situations

A quadrocopter movement can be described by the following system of differential equations [8]:

    ẋ = V_x,  ẏ = V_y,  ż = V_z,  φ̇ = ω_φ,  θ̇ = ω_θ,  ψ̇ = ω_ψ,
    m V̇_x = (sin ψ sin φ + cos ψ sin θ cos φ) U_1,
    m V̇_y = (−cos ψ sin φ + sin ψ sin θ cos φ) U_1,
    m V̇_z = U_1 cos θ cos φ − mg,
    I_xx ω̇_φ = (I_yy − I_zz) ω_θ ω_ψ − J_TP ω_θ Ω + U_2,
    I_yy ω̇_θ = (I_zz − I_xx) ω_ψ ω_φ + J_TP ω_φ Ω + U_3,
    I_zz ω̇_ψ = (I_xx − I_yy) ω_φ ω_θ + U_4,        (1)

where x, y, z are the coordinates of the mass center; V_x, V_y, V_z are the velocity projections; φ, θ, ψ are the roll, pitch and yaw angles; ω_φ, ω_θ, ω_ψ are the corresponding angular velocities; m is the quadrocopter mass; I_xx, I_yy, I_zz are the moments of inertia around the coordinate axes; U_1, U_2, U_3, U_4 are the control channels; Ω is the total velocity of the four propellers; J_TP is the total rotational moment of inertia around the propeller axis: J_TP = J_P + ηN²J_M, where J_P is the motor inertia moment, J_M is the propeller inertia moment, N is the gear ratio and η is the gear efficiency.


Table 3. Fail-safe configuration parameters

No  Characteristic                                                             Value
1   Battery capacity – nominal/maximum discharge current (by battery design)   14 Ah – 45/60 C
2   Stored energy                                                              725.2 Wh
3   Cell configuration (S – serial, P – parallel cell connection)              14S1P
4   Battery load                                                               22.07 C
5   On-load battery voltage                                                    51.01 V
6   Battery nominal voltage                                                    51.8 V
7   Specific thrust                                                            5.52 g/W
8   Minimum flight time                                                        2.4 min
9   Hover flight time                                                          19.9 min
10  Motor model                                                                Turnigy RotoMax 50
11  Thrust-to-weight ratio                                                     3.6
12  All-up weight                                                              12.33 kg
13  Drive weight                                                               10.83 kg
14  Maximum horizontal speed                                                   69 km/h
15  Rate of climb                                                              15 m/s
16  Hover flight throttle (linear)                                             43%
17  Efficiency (hover)                                                         89.8%
18  Motor current (hover)                                                      10.28 A
19  Motor current (maximum)                                                    77.25 A
20  Max ESC current                                                            80 A
21  Efficiency (100% throttle)                                                 92.1%

Fig. 1. Dependence of the power plant characteristics on the electric motor current. Curve 1 – power (W); 2 – efficiency (%); 3 – max. rev. (×100 rpm); 4 – motor temperature (°C)


Fig. 2. Dependence of the distance (curve 1) and the flight time (curve 2) on the speed

The relations between the controls U_1, U_2, U_3, U_4 and the angular velocities Ω_1, Ω_2, Ω_3, Ω_4, where Ω = −Ω_1 + Ω_2 − Ω_3 + Ω_4, have the form

    U_1 = b(Ω_1² + Ω_2² + Ω_3² + Ω_4²),   U_2 = lb(−Ω_2² + Ω_4²),
    U_3 = lb(−Ω_1² + Ω_3²),   U_4 = d(−Ω_1² + Ω_2² − Ω_3² + Ω_4²).        (2)

In (2), l, b and d are positive parameters [8]. Within this model, an emergency situation involving damage to one or several propellers can be represented as an instantaneous change of the corresponding Ω_i, i.e. Ω̄_i = Ω_i − ε_i, where ε_i are the losses. Upon full propeller destruction, ε_i = Ω_i and Ω̄_i = 0. Substituting the emergency values Ω̄_i into the control expressions (2), and then into system (1), determines the mathematical model of an emergency situation.
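The model (1)–(2) with such an emergency substitution is easy to integrate numerically. The following Python sketch uses a simple Euler scheme; the parameter values and the 60 m starting altitude are placeholders of ours, not the exact setup of Sect. 5:

import numpy as np

m, g, l = 1.0, 9.81, 0.175
b, d = 26.5e-6, 0.6e-6
Ixx = Iyy = Izz = 0.1
Jtp = 0.005

def controls(Om):
    # Eq. (2): map rotor speeds to the control channels U1..U4 and Omega
    O1, O2, O3, O4 = Om
    U1 = b * (O1**2 + O2**2 + O3**2 + O4**2)
    U2 = l * b * (-O2**2 + O4**2)
    U3 = l * b * (-O1**2 + O3**2)
    U4 = d * (-O1**2 + O2**2 - O3**2 + O4**2)
    return U1, U2, U3, U4, -O1 + O2 - O3 + O4

def rhs(s, Om):
    # Right-hand side of system (1);
    # s = [x, y, z, Vx, Vy, Vz, phi, theta, psi, w_phi, w_theta, w_psi]
    x, y, z, vx, vy, vz, ph, th, ps, wph, wth, wps = s
    U1, U2, U3, U4, Omega = controls(Om)
    return np.array([
        vx, vy, vz,
        (np.sin(ps)*np.sin(ph) + np.cos(ps)*np.sin(th)*np.cos(ph)) * U1 / m,
        (-np.cos(ps)*np.sin(ph) + np.sin(ps)*np.sin(th)*np.cos(ph)) * U1 / m,
        U1 * np.cos(th) * np.cos(ph) / m - g,
        wph, wth, wps,
        ((Iyy - Izz)*wth*wps - Jtp*wth*Omega + U2) / Ixx,
        ((Izz - Ixx)*wps*wph + Jtp*wph*Omega + U3) / Iyy,
        ((Ixx - Iyy)*wph*wth + U4) / Izz,
    ])

Om_hover = np.sqrt(m * g / (4 * b))   # hover condition: 4*b*Om^2 = m*g
Om = np.array([Om_hover] * 4)
s = np.zeros(12); s[2] = 60.0         # start hovering at 60 m
dt = 1e-3
for k in range(int(10.0 / dt)):       # 10 s of flight
    if k * dt >= 5.0:
        Om[1] = 0.0                   # emergency: Omega_2 drops to zero
    s = s + dt * rhs(s, Om)
print('z and Vz at t = 10 s:', s[2], s[5])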

5 Numerical Example

In this section we demonstrate the possibilities of the proposed approach. Using MATLAB, we developed a software package that allows simulating emergency situations and solving control problems in order to minimize the consequences of emergencies. Consider an example that illustrates the operation of one of the algorithms for a vehicle with the parameters m = 1 kg, l = 0.175 m, b = 26.5 · 10⁻⁶ N·s², d = 0.6 · 10⁻⁶ N·m·s², I_xx = I_yy = I_zz = 0.1 N·m·s², J_TP = 0.005 N·m·s². Suppose that the quadrocopter performs a certain maneuver – a turn in a given direction. At time t = 5 s, an accident occurs and the value of Ω_2 drops to zero due to a propeller breakage (see Fig. 3). If nothing is done, the quadcopter crashes with a vertical speed of 120 m/s at Z = 0 (see Fig. 4).


Fig. 3. Emergency situation: while turning, Ω2 drops to zero at t = 5 s


Fig. 4. Emergency landing

The algorithm assumes automatic switching from the normal to the emergency control mode by analyzing the data from gyroscopes, accelerometers and the rotation speeds of the propellers. This is possible because, if a propeller breaks without failure of the electrical part, the angular velocity of the failed rotor increases significantly; conversely, if the electrical part fails, the rotor speed decreases significantly. In both cases, such an abrupt change of the propeller rotation speed is not typical of the normal flight mode, since it affects only one propeller and causes the vehicle to rotate. These signs can be detected with high accuracy within 0.5 s, after which the vehicle control switches to the emergency mode. At time t = 5.5 s, the propeller diagonal to the failing one is turned off (see the value Ω_4 in Fig. 5). At the same time, the thrust of the working motors on the second diagonal is increased, see the value Ω_1 = Ω_3 = 400 rev/s in Fig. 5. The rotation speed of the working propellers can be controlled by two PID controllers: the first provides a pre-set vertical speed on the interval t ∈ [5.5, 11.7], and the second reduces the vertical speed to zero as the quadcopter approaches the landing surface. As a result, we have a safe landing with zero vertical speed at Z = 0 (see Fig. 6).
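A minimal sketch of such a two-stage vertical-speed control is shown below; the gains, the descent rate and the switching altitude are illustrative choices of ours, not the values used in the experiment:

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_err = 0.0, 0.0

    def step(self, setpoint, measured):
        # Standard PID law on the vertical-speed error
        err = setpoint - measured
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

def vertical_speed_setpoint(altitude, descent_rate=-3.0, flare_altitude=5.0):
    # Stage 1: hold a pre-set descent rate; stage 2: taper the vertical
    # speed to zero as the vehicle approaches the landing surface
    if altitude > flare_altitude:
        return descent_rate
    return descent_rate * altitude / flare_altitude   # reaches 0 at Z = 0

At every control step, the PID output would be added to the common rotation speed of the two working propellers.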


Fig. 5. Fail-safe strategy


Fig. 6. Fail-safe landing

It should be noted that the operation of the algorithm would be impossible if the recommendations for the quadcopter configuration proposed in Sects. 2 and 3 were ignored.

6 Conclusion

According to the xCopterCalc documentation [3], the tool is not directly intended for calculating fail-safe configurations. However, thanks to its versatility and flexibility, xCopterCalc makes it possible not only to obtain the values of the achievable target characteristics but also, by selecting specific components, to determine all the performance characteristics and many auxiliary parameters of a specific configuration. This approach is constructive and allows the configuration features to be defined in detail at the initial stage of design. The target characteristics proposed in the present work for the fail-safe configuration are achievable with serially produced components. The hardware configuration proposed in the article uses components with an open architecture and can be used with popular UAV platforms, such as the ArduPilot Mega ver. 2.6 controller configured with the Mission Planner software [9]. And although the creation of a safety margin for a fail-safe configuration is accompanied by some negative consequences (weight increase, suboptimal operation of the electric motors in the hovering mode), the flight characteristics obtained as a result of the calculation allow the vehicle to be used for its main purpose.


Thus, the method of landing a quadrocopter upon failure of one or two electric motors (based on the results of the mathematical modeling of the emergency landing process [1]) can be realized on a vehicle designed in the manner proposed in the present work. The practical implementation of the algorithms for such an emergency landing requires a special study, taking into account the capabilities of the software of a particular flight controller. The proposed approach can also be applied not only to a newly created vehicle, but also to an existing heavy-class quadcopter, by integrating the proposed algorithm into its control system; the main requirement for this is to satisfy the parameters of Table 1, especially the thrust-to-weight ratio. Note also that a vehicle with a fail-safe configuration can be used as an ordinary heavy-class device, taking on extra weight and losing the fail-safe capability. In this case, it makes sense to equip such an apparatus with known rescue systems, for example, an emergency parachute. Such a rescue method does not require motors with excess power, but it has a number of disadvantages compared with the proposed approach: it requires a sufficient flight altitude, it cannot control the vertical speed, and it is less reliable, because the parachute lines can become tangled in the propellers. Future research directions are related to solving parametric optimization problems in emergency situations, as well as to a complete classification of such tasks. To solve them, the adaptive method of optimal control (Gabasov's method [10]) is of particular importance; this method has been successfully applied in various control problems [11–14]. The developed software package is the first step in this direction.

References

1. Baranov, O.V.: Quadrocopter control in emergency mode. Vestnik of Saint Petersburg University. Appl. Math. Comput. Sci. Control Process. 2, 69–79 (2016). (in Russian)
2. Baranov, O.V., Smirnov, N.V., Smirnova, T.E.: On the choosing problem of PID controller parameters for a quadrocopter. In: 2017 Constructive Nonsmooth Analysis and Related Topics (Dedicated to the Memory of V.F. Demyanov), pp. 27–29. IEEE (2017)
3. eCalc – the most reliable RC Calculator on the Web (2019). https://www.ecalc.ch/index.htm. Accessed 31 Mar 2019
4. Garcia Carrillo, L.R., Dzul, A., Lozano, R., Pegard, C.: Quad Rotorcraft Control: Vision-Based Hovering and Navigation. Springer, London (2012)
5. Popkov, A.S., Baranov, O.V., Smirnov, N.V.: Application of adaptive method of linear programming for technical objects control. In: 2014 International Conference on Computer Technologies in Physical and Engineering Applications, pp. 141–142. IEEE (2014)
6. Popkov, A.S., Smirnov, N.V., Baranov, O.V.: Real-time quadrocopter optimal stabilization. In: 2015 International Conference on "Stability and Control Processes" in Memory of V.I. Zubov, pp. 123–125. IEEE (2015)


7. Sayfeddine, D.: Mechatronic control system for flying a quadrocopter and trajectory planning using optical odometry (Ph.D. thesis abstract). Publishing house of Platov South-Russian State Polytechnic University, Novocherkassk, Russia (2015). https://dlib.rsl.ru/viewer/01005560514#?page=1. Accessed 31 Mar 2019. (in Russian)
8. Sklyarov, A.A., Sklyarov, S.A.: Synergetic approach to the control of UAV in an environment with external disturbances. News YuFU. Tech. Sci. 4(129), 159–170 (2012). (in Russian)
9. Mission Planner. Official site (2019). http://ardupilot.org/planner/. Accessed 31 Mar 2019
10. Balashevich, N.V., Gabasov, R., Kirillova, F.M.: Numerical methods of program and positional optimization of linear control systems. J. Comput. Math. Math. Phys. 40(6), 838–859 (2000). (in Russian)
11. Popkov, A.S., Smirnov, N.V., Smirnova, T.E.: On modification of the positional optimization method for a class of nonlinear systems. In: ACM International Conference Proceeding Series, pp. 46–51 (2018)
12. Popkov, A.S.: Application of the adaptive method for optimal stabilization of a nonlinear object. In: Proceedings of the 2016 International Conference "Stability and Oscillations of Nonlinear Control Systems" (Pyatnitskiy's Conference), STAB 2016 (2016). Article number 7541215
13. Girdyuk, D.V., Smirnov, N.V., Smirnova, T.E.: Optimal control of the profit tax rate based on the nonlinear dynamic input-output model. In: ACM International Conference Proceeding Series, pp. 80–84 (2018)
14. Smirnov, M.N., Smirnova, M.A., Smirnova, T.E., Smirnov, N.V.: Multi-purpose control laws in motion control systems. Information (Japan) 20(4), 2265–2272 (2017)

CAAVI-RICS Model for Analyzing the Security of Fog Computing Systems

Saša Pešić1(B), Miloš Radovanović1, Mirjana Ivanović1, Costin Badica2, Milenko Tošić3, Ognjen Iković3, and Dragan Bošković3

1 University of Novi Sad, Faculty of Sciences, Novi Sad, Serbia {sasa.pesic,radacha,mira}@dmi.uns.ac.rs
2 University of Craiova, Craiova, Romania [email protected]
3 VizLore Labs Foundation, Novi Sad, Serbia {milenko.tosic,ognjen.ikovic,dragan.boskovic}@vizlore.com

Abstract. The ubiquitous connectivity of "things" in the Internet of Things (IoT) and fog computing systems presents a stimulating setting for innovation and business opportunity, but also an immense set of security threats and challenges. Security engineering for such systems must take into consideration the peculiar conditions under which they operate: constrained resources, decentralized decision making, large device churn, etc. Thus, techniques and methodologies for building secure and robust IoT/fog systems have to support these conditions. In this paper, we present the CAAVI-RICS framework, a novel security review methodology tightly coupled with distributed IoT and fog computing systems. With CAAVI-RICS we explore credibility, authentication, authorization, verification, and integrity by explaining the rationale, influence, concerns and security solutions that accompany them. Our contribution is a thorough systematic categorization and rationalization of security issues, covering the security landscape of IoT/fog computing systems. Additionally, we contribute to the discussion on the aspects of fog computing security and state-of-the-art solutions.

Keywords: Fog computing modelling

· Cybersecurity · Distributed security

Introduction

Before IoT, the concept of cloud computing brought a revolution in how we build our applications (Buyya et al. 2009). In spite of the fact that the cloud offers numerous advantages, it does present several concerns, such as security, privacy, availability of data and services, reliability and performance (Dillon et al. 2010). IoT, and specifically fog/edge computing, aims to solve some of these challenges. Fog computing is a computational framework where data analytics and decision-making processes are moved to the edge of the network.


IoT systems require reactive decision making, and fog computing is capable of providing it in real time, with fast transformations of data streams into actionable insights. Many of the security issues that exist in the cloud also exist in the fog: privilege escalation, identity spoofing, denial-of-service attacks, etc. (Krutz and Vines 2010). However, in real-world deployments it is more challenging to handle fog security than cloud security, since fog devices are deployed in places that are normally outside the rigorously controlled environments typical of data centers. Furthermore, fog computing systems must be carefully designed to employ distributed security mechanisms, due to the way those systems are designed to operate, share data and manage processes (Stojmenovic and Wen 2014). Moreover, without a proper model for viewing the security of such systems, it is hard to determine what needs to be protected, at which levels, and how.

In this paper, we present the CAAVI-RICS model for analyzing the security foundations of fog computing systems. CAAVI stands for Credibility, Authentication, Authorization, Verification, and Integrity, while RICS stands for Rationale, Influence, Concerns, and Solutions. By explaining each of the CAAVI components through RICS, we aim to introduce a new categorization methodology for the security of IoT and fog systems. In this paper, the CAAVI-RICS methodology is presented in detail through the Credibility principle only. We provide an overview of the security landscape in fog computing systems through systematic categorization, and we contribute to the discussion on the aspects of fog computing security and state-of-the-art solutions. This paper aims to support the real-world acceptance of the fog computing paradigm by explaining its technological and resource-management advantages as well as its peculiar security challenges and operational disadvantages. We offer a scaled-down summarization of security, so that readers without a comprehensive technological background can gain a basic understanding of fog computing and its underlying security demands.

The paper is structured as follows: Sect. 2 contains the related work; Sect. 3 references measures for securing a fog computing platform; Sect. 4 discusses the Credibility principle of the CAAVI model through the RICS review methodology; finally, Sect. 5 provides conclusions.

2 Related Work

The related work section focuses on papers providing different overviews and categorizations of security in IoT and fog computing systems. Mahmoud et al. (2015) discuss countermeasures for securing IoT through authentication, trust establishment, federated architectures, and security awareness. In their taxonomy of security and threats in IoT, Babar et al. (2010) discuss solutions for identification, communication, physical threats, embedded security and storage management. An algorithmic overview of solutions is presented by Cirani et al. (2013).


Their solutions are organized into the following chapters: security protocols, lightweight cryptographic algorithms, key distribution and management, secure data aggregation, and authorization. Kumar et al. (2016) divide fog security into Network, Data, Access control, Privacy, and Attackers' Interest in Private Data, while Khan et al. (2017) identified 12 security categories to formulate a systematic review, among them Advanced Persistent Threats, Data Loss, Insecure APIs, Insufficient Due Diligence, and Abuse and Nefarious Use; at this moment, it is the most complete systematic review of security challenges and proposed solutions. Yi et al. (2015) review security through the topics of trust and authentication, network security, secure data storage, secure and private data computation, privacy, access control, and intrusion detection. While the papers mentioned in this section provide extensive overviews of the security of IoT and fog computing systems, they do not analyze the abstractions above the categories of security, nor do they form an overview methodology that can correspond not just to their own categorization but to others as well. Also, these papers often discuss IoT systems as a mixed blend of cloud and fog computing. This paper focuses on the fog, thus separating our work from the other papers mentioned in this section.

3 Securing a Fog Computing Platform

From the standpoint of which security layer is protected, the security of a fog computing system can be divided into two major areas: physical security and intrusions. Physical security is important because fog computing devices (sensors, actuators, etc.) are often deployed in areas outside rigorous surveillance. While one of the main goals of fog devices is to collect data from different sources, disseminating the data through a secure communication channel is vital at this layer. Physical security attacks can be grouped by the attacker's objective (which part of the device is attacked): device interfaces, device firmware, device network services, device local storage, device operation, etc. Securing the device at this layer can include on-boot peripheral checks, bootloader protection, secure firmware updates, dynamic data encryption, secure communication, local data protection, etc. It is also necessary to encrypt all sensitive data on a device, in case the device is accessed by an unauthorized party, discarded or stolen. Lastly, the best practice is to keep the device in a safe, controlled environment.

System-wide intrusions can be grouped into outside and inside intrusions. Outside intrusions are the outcome of devices' Internet connectivity and refer to the typical security issues of any system connected to the Internet: asymmetric routing attacks, buffer overflow attacks, protocol-specific attacks, traffic flooding, and malicious software. Intrusions should be tracked continuously, system-wide; by utilizing proper network monitoring techniques, the percentage of prevented networking attacks increases (Muda et al. 2011). This can be achieved by using different intrusion detection techniques, inspecting suspicious network packets, or using machine learning to recognize intrusions. These attacks can also appear in the form of denial-of-service (DoS) and distributed DoS attacks. If a wireless access point is not properly secured, it can result in many forms of attack on the system: Sybil attacks, illegal bandwidth usage (flood attacks), etc.


Inside intrusions refer to uncharacteristic behavior of participants inside the system (rogue nodes, man-in-the-middle attacks, etc.). They can cause an internal disturbance of trustworthiness and integrity, leading to possibly incorrect decision making, false data input/output and privilege escalation, but also to data and identity theft. Inside intrusions also cover authentication and authorization issues and the spoofing of fog nodes and sensors. The security issues and points of attack of a fog computing system do not deviate much from those of systems with centralized architectures. Although some attack types are more meaningful to launch against distributed systems (e.g., DDoS), the countermeasures taken are usually different in distributed than in centralized architectures: while centralized architectures can take advantage of more processing power concentrated at one point and of the protection of a single node, distributed systems have to take into consideration the high number of nodes and need an efficient way of handling and distributing actions when protecting themselves from an attack.
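As a toy illustration of the kind of network monitoring mentioned above, the following Python sketch flags a node whose packet rate deviates strongly from its own recent history; the window size and threshold are arbitrary choices of ours, not values from the literature discussed here:

from collections import deque

class RateMonitor:
    def __init__(self, window=60, sigmas=4.0):
        self.history = deque(maxlen=window)   # recent packets/s samples
        self.sigmas = sigmas

    def observe(self, packets_per_s):
        # Return True if the sample looks like flooding (possible DoS)
        flagged = False
        if len(self.history) >= 10:
            mean = sum(self.history) / len(self.history)
            var = sum((x - mean) ** 2 for x in self.history) / len(self.history)
            flagged = packets_per_s > mean + self.sigmas * max(var ** 0.5, 1.0)
        if not flagged:
            self.history.append(packets_per_s)  # learn only from normal traffic
        return flagged

A real deployment would combine several such features (packet rates, ports, protocol mixes) and, as the text notes, possibly a learned model.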

4 Foundations of a Secure Fog Computing Architecture – CAAVI

Credibility, privacy, trust, and authentication represent foundational elements of system security. Achieving them, and enforcing security rules, requires an approach to discover, attest, and verify all smart and connected "things." In the CAAVI-RICS model, credibility (C) refers to whether a fog node is trustworthy, i.e. not malicious, and thus credible to operate in the system. Authentication (A) refers to determining the identity of a system user or device, thus allowing him/her or it to access or change data. Authorization (A) concerns managing permissions to perform system actions, i.e. having a certain role in the system. Verification (V) is the process of establishing the truth, accuracy, or validity of a system action and its results. Integrity (I) is the assurance that data was not changed during transmission.

In the next subsection, we explore the Credibility principle in depth. We give an overview of how to model security with one of the principles, keeping in mind that the other four principles can be discussed in the same manner. The Credibility principle is discussed through the four RICS subtopics: Rationale – what is it and why is it important?; Influence – how does it affect overall system well-being if implemented (or not implemented) correctly?; Concerns – what problems does it bring?; and Solutions – a review of current state-of-the-art solutions. Solutions are discussed from two standpoints: first, solutions that bring new or improved algorithms, techniques, and schemes for security enhancement; second, solutions that propose new or improved security management frameworks, architectures, and methodologies. Solutions are then discussed in terms of their advantages and disadvantages with regard to implementation in decentralized fog computing systems.

4.1 Credibility

In the rest of this section, the Credibility principle of the CAAVI model is deliberated through the RICS aspects. A summary of the results of RICS applied to credibility is displayed in Fig. 1. Note that some sentences, phrases or words below appear in bold text; a reader skimming the chapter is advised to read the bold text, as it captures the essence of each section. Italic text marks the authors' comments based on the presented evidence, discussing the individual solution from the point of view of implementation in fog computing systems.

Rationale
Credibility is the process of establishing trustworthy relationships between devices, which results in efficient and secure communication. Credibility in a fog computing system must rest on these four pillars:

Fig. 1. Credibility summarized

1. Each and every connected device must have a unique identity, and must be able to authenticate itself to the entire network upon joining and upon system request. The identity must remain uncompromised – or, if compromised, it must immediately be blacklisted by the network.
2. All devices must have foreseeable behavior, and there must not be a device action that the system does not know and recognize (Guo et al. 2017).
3. A device should not be assumed credible upon the results of only one action, but rather upon a continuity of actions with positive intentions.


4. On top of (3), a reputation-based trustworthiness audit is necessary (Ganeriwal et al. 2008). Each device's credibility should be measured, giving it a reputation score in the system. If (3) is fulfilled over a period of time, the device receives rewards and its reputation increases; in the same manner it can decrease, leading to a device ban or blacklisting.

By answering the question of whether devices or services are credible – including their data, their services, and the physical devices themselves – we essentially establish trust in a network of nodes. Trust is the level of assurance that an object or a system will behave in a satisfactory manner. It can assist decision making and allow autonomous connections to be established between fog nodes and resource-constrained IoT devices. But to establish trust, a system is first required to establish credibility, through a credibility-assessment management framework that works in parallel with the system. Next, credibility concerns not only devices but information as well. Information can be presumed credible if its source can be established, the source has credibility, the information has not been tampered with in transit, and its originality can be established. The credibility of a node can be quantified and then computed; thus, we introduce the term credibility calculation (Nitti et al. 2012). It refers to the process of establishing a device's credibility based upon a certain set of considerations. After credibility has been calculated, the device is assigned a credibility index. Credibility information can be stored inside the network. Although a centralized approach is easier to manage, it also introduces a system dependency, a single point of failure, etc. In fog computing, a decentralized approach is preferred, where credibility can be stored and calculated on all nodes, or on a subset of them.

Influence
A well-established credibility framework influences a distributed system in a positive way, as it enables straightforward cooperation between network devices, and cooperation enables more efficient process handling. It allows autonomous communication, without human intervention. The autonomous establishment of communication between devices with no previous knowledge of each other is also desirable in fog computing, as it leads to efficient device onboarding. If a fog computing system cannot account for the credibility of its devices or their actions, the consequences of its operation can be severely negative. If a strong credibility-assessment management framework is not in place, the system cannot guarantee the well-being of its devices and users. This can lead to devices being exposed to attacks and sensitive data theft, which can be fatal in critical systems. Beyond data theft, data-tampering and physical device-tampering attacks all become possible, not just from outside the network but also from the inside. Furthermore, a low degree of system credibility not only decreases performance and the quality of decision making, but also affects the ability to attract users and customers. Credibility is important not only as an internal part of the system's security, but also when considering user experience: users must trust the security, safety, and privacy of fog computing systems.

CAAVI-RICS Model for Analyzing the Security of Fog Computing Systems

29

also when considering user experience. Users must trust the security, safety, and privacy of fog computing systems. Concerns Malicious nodes aim at breaking the basic functionality of IoT by means of breaching trust in the system and negatively affecting its credibility through attacks like self-promoting, bad-mouthing and good-mouthing (Sicari et al. 2015). The credibility of an IoT device can be compromised on the hardware and the software level. At the hardware level, by tampering with physical components, sensors, etc. the device can be led into feeding false information without realizing it (if there are no regular hardware checks). On the software level credibility can be shattered by tampering with devices’ services’ operation e.g. malicious code injection, information theft, identity stealing or tampering, etc. Another issue is the appearance of rogue nodes with malicious intents. Rogue nodes can affect the credibility of IoT systems on two levels: by malicious information spreading they can diminish the credibility levels of other devices that act on the information provided by the malicious nodes, thus having a direct impact on the credibility of the entire system; or by malicious actions aimed at exposing sensitive data such as identities, private keys, etc. Bao and Chen (2012) discuss how often IoT devices are exposed to public areas, and how uncontrolled wireless communication can make the devices vulnerable to malicious attacks resulting in sensitive data theft upon through an unknown, insecure access point. Another term we should introduce and separate from a rogue node is a misbehaving node. While rogue nodes are operated, or their actions are triggered by an external attacker, misbehaving nodes can deviate from typical behavior due to malfunction of parts, various errors in operation, etc. Lastly, due to the ability of the attacker or just the aging process of devices at some point in time, there will be misbehaving or faulty nodes. This is another security concern, at the credibility level to account for – there must always be in place a distributed, preferably autonomous and self-triggered mechanism that performs regular device health checks. Solutions Solutions that bring new or improved algorithms, techniques and schemes for security enhancement Credibility roots must be built starting from the hardware level. At this level, one could provide device credibility by resorting to hardware performance counters (HPCs) that are present in all commodity processors. HPCs are registers that can memorize and audit certain events that happen during the lifespan of a program. HPCs can be used to detect cache-based attacks in realtime (Chiappetta et al. 2016), firmware modifications (Wang et al. 2015), DDoS attacks (Jyothi et al. 2016), kernel control-flow modifying rootkits (Wang and Karri, 2016), etc. The idea is to use HPCs to detect significant deviations brought by malicious programs. Next, if a sensor is tampered with to report incorrect values, then the problem can be addressed at hardware level using Physical Unclonable Functions – PUFs (Rosenfeld et al. 2010). From a fog computing

30

S. Peˇsi´c et al.

point of view, this is a good approach for handling anti-hardware tampering and also recognize software tampering. Since all devices implement HPCs, it is a good security strategy, to begin with at this level. Yuan and Li (2018) proposed a new reliable and lightweight trust mechanism tailored specifically for low constraint IoT edge devices, based on multisource feedback information fusion. By employing the multi-source feedback mechanism for global trust calculation, lightweight trust evaluating mechanisms, and a feedback information fusion algorithm based on objective information entropy theory, their approach significantly outperforms existing approaches in both computational efficiency and reliability. This is a solution specifically tailored to meet the low requirements of fog computing devices. Misbehavior detection can happen on many levels and there are many ways to enforce such a mechanism in a fog computing system. Ahmed and Tepe (2016) such a detection mechanism has been proposed. Once the correct events have been learned using information from the most trusted sources, the information is used to identify the behavior of the nodes in a logistic trust algorithm. It is observed that logistic trust results in high accuracy of over 99% and a very low error of less than 2% even when the majority of the nodes are malicious. Moore et al. (2008) propose another scheme for misbehavior detection called suicide attacks, which relies on an honest majority. It is referred to as the STING algorithm. To summarize, if one node believes that another node may have been compromised and is misbehaving, it will issue a message that will ban them both from the network. The idea behind this principle is to make the sacrifice of future participation costly, in order to discourage false allegations. Although utilizing the logistic trust algorithms does come with a learning phase, thus increasing its deployment efforts, it proves to justify that with high accuracy of misbehaving nodes detection – it is suitable for fog computing systems if introducing a pre-deployment phase is possible (it may not be for different reasons such as operational costs or time constraints). The STING algorithm makes a positive notion towards quickly expelling the misbehaving node, but with the cost of also expelling the node that detected it. Although authors argue it may discourage false allegations and misbehavior, some fog computing systems cannot handle such constraints (consider a heart monitor and sugar-level sensors being expelled from the network of patient-care devices). Solutions that propose new or improved security management frameworks, architectures and middlewares, methods and methodologies Alongside mechanisms for actually establishing the credibility of devices, a framework for distributed credibility establishment is needed. Credibility management frameworks should give a system the means for transporting trust from a trusted party to a trustee (Jøsang et al. 2005). Distributed Trust management was first mentioned by Blaze et al. (1996). A distributed credibility/trust management framework for fog computing system must have the means to inspect, grant and revoke trust (Cho et al. 2011) in near-real time. By introducing a middleware-based architecture for securing a fog computing system, Razouk et al. (2017) designed a scheme where IoT constrained devices

CAAVI-RICS Model for Analyzing the Security of Fog Computing Systems

31

communicate through the proposed middleware agent. The agent then provides access to more computing power and enhanced capability to perform secure communications. To address the low resource constraints issue as well, when discussing trust models and the benefits they bring to a distributed network, Sun et al. (2008) underline the fact that security levels can change depending on the risk involved, thus leading to a more efficient and robust security model. If a node is highly trustworthy, less security may be put in place, and this ought to be done dynamically, throughout system operation lifetime. Middleware-based frameworks are only as trusted as the middleware included in the process of making decisions about credibility. By dynamically allocating credibility in the network as well as security strictness fog computing systems are more robust and scalable. Credibility can be confirmed and audited via remote confirmation/attestation. A remote confirmation service aims to verify the state of a possibly compromised device and is usually carried out by an authorized, secured the third party. This process is often referred to as attestation. Remote attestation techniques range from heavyweight ones (e.g. hardware-based) to more lightweight software-based ones, as well as hybrid ones that fuse the two (e.g. physical unclonable function – PUF). Chiang and Zhang (2016) cover the research done for a project in a powerful enterprise with the aim to create a scheme to authorize external devices’ access to the enterprise intranet for testing. Using a scheme with Trusted Cryptographic Module-based Remote Anonymous Attestation (TCM-RAA) they provide proof that the protocol is secure against existential forgery on adaptively chosen message under the elliptic curve discrete logarithm problem (ECDLP). Other papers argue in favor of remote attestation (Haldar et al. 2004; Sadeghi et al. 2011) across different fields of research, computing environments, and system requirements. Depending on how remote attestation is set up, it can be centralized (one node handles attestation), semi-decentralized (a subset of all nodes handle attestation) or completely decentralized (all nodes are handling attestation). Although in fog computing the last option would be preferred, the second option might also be viable because, in most of the cases, there are a number of devices or users that are unable to take part in attestation (devices that have just joined the network and their credibility is not yet established, “visiting” devices who are not part of the system but interact with it, etc.). Guo et al. (2017) surveyed existing techniques for trust computation in the service-oriented IoT. They classified trust computing methods into five dimensions: trust composition, trust propagation, trust update, trust formation, and trust aggregating. This paper is relevant to the topic addressed in our paper for three reasons: (1) it summarizes the pros and cons of each dimension’s options, and highlights the effectiveness of defense mechanisms against malicious attacks; (2) it summarizes the most, least, and little visited trust computation techniques in the literature and provide insight on the effectiveness of trust computation techniques regarding their application to IoT systems; (3) it identifies gaps in IoT trust computation research and suggest future research

32

S. Peˇsi´c et al.

directions. Thinking about the process of credibility/trust establishment, fog computing systems resort to the same resources and techniques that are generally available while keeping in mind the potential of distributing them.
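To make the credibility-calculation and reputation ideas discussed above concrete, the following is a minimal illustrative sketch, not any of the cited algorithms: a node keeps a credibility index per device, rewards sustained positive behavior, penalizes misbehavior more strongly, and blacklists a device whose score falls below a threshold. All names, weights and thresholds here are hypothetical.

```python
# Minimal illustrative reputation tracker; weights/thresholds are hypothetical.
from dataclasses import dataclass, field

@dataclass
class ReputationTracker:
    reward: float = 0.05         # small gain for each positive action
    penalty: float = 0.30        # larger loss for each negative action
    ban_threshold: float = 0.20  # below this the device is blacklisted
    scores: dict = field(default_factory=dict)
    blacklist: set = field(default_factory=set)

    def report(self, device_id: str, positive: bool) -> None:
        """Update a device's credibility index after one observed action."""
        if device_id in self.blacklist:
            return  # blacklisted devices are no longer scored
        s = self.scores.get(device_id, 0.5)  # newcomers start at a neutral score
        s = min(1.0, s + self.reward) if positive else max(0.0, s - self.penalty)
        self.scores[device_id] = s
        if s < self.ban_threshold:
            self.blacklist.add(device_id)  # expel persistently misbehaving devices

    def is_credible(self, device_id: str, min_score: float = 0.5) -> bool:
        return device_id not in self.blacklist and \
               self.scores.get(device_id, 0.5) >= min_score
```

Note how the asymmetric reward/penalty makes credibility slow to earn and quick to lose, which matches the "continuity of actions" requirement in point (3) above.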

5 Conclusion

In this paper, we have discussed key security issues in IoT and fog computing systems. Peculiar security challenges in IoT and fog have been presented, and the specific context around them explained. Further, we presented the most common security challenges by classifying them into two categories: physical security and intrusions. The main contribution of our efforts is the introduction of a novel security review methodology that we call CAAVI-RICS. This novel review taxonomy aims to explain and discuss the foundational building blocks of an IoT/fog system's security. In this paper, we concentrated on illustrating the proposed methodology on the Credibility principle of CAAVI. This is done by offering a scaled-down summarization of the security landscape in a way that readers without comprehensive technological backgrounds can also understand. At the same time, we provide an extensive overview of security in fog computing systems through a systematic categorization, resulting in a discussion for more advanced, on-topic readers. We argue that the CAAVI-RICS review methodology can also be applied to modeling the security of all cyber-physical systems. It captures the security aspects of the underlying systems well by deliberating on each of the CAAVI principles separately, and then forces a thorough understanding of each of those building blocks through RICS. Hence, although our focus in this paper was distributed systems, readers are also encouraged to apply the methodology to other real-world security problems. CAAVI-RICS can be applied when there is deep architectural knowledge about the system, its features and expected behavior. As part of our future work, we are going to discuss the other CAAVI principles in the same manner.

References
Ahmed, S., Tepe, K.: Misbehaviour detection in vehicular networks using logistic trust. In: 2016 IEEE Wireless Communications and Networking Conference, pp. 1–6. IEEE (2016)
Babar, S., Mahalle, P., Stango, A., Prasad, N., Prasad, R.: Proposed security model and threat taxonomy for the Internet of Things (IoT). In: International Conference on Network Security and Applications, pp. 420–429. Springer (2010)
Bao, F., Chen, R.: Trust management for the Internet of Things and its application to service composition. In: 2012 IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), pp. 1–6. IEEE (2012)
Blaze, M., Feigenbaum, J., Lacy, J.: Decentralized trust management. In: Proceedings 1996 IEEE Symposium on Security and Privacy, pp. 164–173. IEEE (1996)


Buyya, R., Yeo, C.S., Venugopal, S., Broberg, J., Brandic, I.: Cloud computing and emerging IT platforms: vision, hype, and reality for delivering computing as the 5th utility. Future Gener. Comput. Syst. 25(6), 599–616 (2009)
Chiang, M., Zhang, T.: Fog and IoT: an overview of research opportunities. IEEE Internet Things J. 3(6), 854–864 (2016)
Chiappetta, M., Savas, E., Yilmaz, C.: Real time detection of cache-based side-channel attacks using hardware performance counters. Appl. Soft Comput. 49, 1162–1174 (2016)
Cho, J.H., Swami, A., Chen, R.: A survey on trust management for mobile ad hoc networks. IEEE Commun. Surv. Tutorials 13(4), 562–583 (2011)
Cirani, S., Ferrari, G., Veltri, L.: Enforcing security mechanisms in the IP-based Internet of Things: an algorithmic overview. Algorithms 6(2), 197–226 (2013)
Dillon, T., Wu, C., Chang, E.: Cloud computing: issues and challenges. In: 2010 24th IEEE International Conference on Advanced Information Networking and Applications, pp. 27–33. IEEE (2010)
Ganeriwal, S., Balzano, L.K., Srivastava, M.B.: Reputation-based framework for high integrity sensor networks. ACM Trans. Sens. Netw. (TOSN) 4(3), 15 (2008)
Guo, J., Chen, R., Tsai, J.J.: A survey of trust computation models for service management in Internet of Things systems. Comput. Commun. 97, 1–14 (2017)
Haldar, V., Chandra, D., Franz, M.: Semantic remote attestation: a virtual machine directed approach to trusted computing. In: USENIX Virtual Machine Research and Technology Symposium, vol. 2004 (2004)
Jøsang, A., Keser, C., Dimitrakos, T.: Can we manage trust? In: International Conference on Trust Management, pp. 93–107. Springer (2005)
Jyothi, V., Wang, X., Addepalli, S.K., Karri, R.: BRAIN: behavior based adaptive intrusion detection in networks: using hardware performance counters to detect DDoS attacks. In: 2016 29th International Conference on VLSI Design and 2016 15th International Conference on Embedded Systems (VLSID), pp. 587–588. IEEE (2016)
Khan, S., Parkinson, S., Qin, Y.: Fog computing security: a review of current applications and security solutions. J. Cloud Comput. 6(1), 19 (2017)
Krutz, R.L., Vines, R.D.: Cloud Security: A Comprehensive Guide to Secure Cloud Computing. Wiley, Indianapolis (2010)
Kumar, P., Zaidi, N., Choudhury, T.: Fog computing: common security issues and proposed countermeasures. In: 2016 International Conference System Modeling and Advancement in Research Trends (SMART), pp. 311–315. IEEE (2016)
Mahmoud, R., Yousuf, T., Aloul, F., Zualkernan, I.: Internet of Things (IoT) security: current status, challenges and prospective measures. In: 2015 10th International Conference for Internet Technology and Secured Transactions (ICITST), pp. 336–341. IEEE (2015)
Moore, T., Raya, M., Clulow, J., Papadimitratos, P., Anderson, R., Hubaux, J.P.: Fast exclusion of errant devices from vehicular networks. In: 2008 5th Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks, pp. 135–143. IEEE (2008)
Muda, Z., Yassin, W., Sulaiman, M., Udzir, N.I., et al.: A k-means and Naive Bayes learning approach for better intrusion detection. Inform. Technol. J. 10(3), 648–655 (2011)
Nitti, M., Girau, R., Atzori, L., Iera, A., Morabito, G.: A subjective model for trustworthiness evaluation in the social Internet of Things. In: 2012 23rd International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), pp. 18–23. IEEE (2012)


Razouk, W., Sgandurra, D., Sakurai, K.: A new security middleware architecture based on fog computing and cloud to support IoT constrained devices. In: Proceedings of the 1st International Conference on Internet of Things and Machine Learning, p. 35. ACM (2017)
Rosenfeld, K., Gavas, E., Karri, R.: Sensor physical unclonable functions. In: 2010 IEEE International Symposium on Hardware-Oriented Security and Trust (HOST), pp. 112–117. IEEE (2010)
Sadeghi, A.R., Schulz, S., Wachsmann, C.: Short paper: lightweight remote attestation using physical functions. In: WiSec 2011 (2011)
Sicari, S., Rizzardi, A., Grieco, L.A., Coen-Porisini, A.: Security, privacy and trust in Internet of Things: the road ahead. Comput. Netw. 76, 146–164 (2015)
Stojmenovic, I., Wen, S.: The fog computing paradigm: scenarios and security issues. In: 2014 Federated Conference on Computer Science and Information Systems, pp. 1–8. IEEE (2014)
Sun, Y., Han, Z., Liu, K.R.: Defense of trust management vulnerabilities in distributed networks. IEEE Commun. Mag. 46(2), 112–119 (2008)
Wang, X., Karri, R.: Reusing hardware performance counters to detect and identify kernel control-flow modifying rootkits. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 35(3), 485–498 (2016)
Wang, X., Konstantinou, C., Maniatakos, M., Karri, R.: ConFirm: detecting firmware modifications in embedded systems using hardware performance counters. In: Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, pp. 544–551. IEEE Press (2015)
Yi, S., Qin, Z., Li, Q.: Security and privacy issues of fog computing: a survey. In: International Conference on Wireless Algorithms, Systems, and Applications, pp. 685–695. Springer (2015)
Yuan, J., Li, X.: A reliable and lightweight trust computing mechanism for IoT edge devices based on multi-source feedback information fusion. IEEE Access 6, 23626–23638 (2018)

Conceptual Model of Digital Platform for Enterprises of Industry 5.0
Vladimir Gorodetsky1, Vladimir Larukchin2, and Petr Skobelev3
1 InfoWings Ltd., Saint-Petersburg, Russian Federation, [email protected]
2 Samara State Technical University, Samara, Russian Federation, [email protected]
3 Institute for the Control of Complex Systems of Russian Academy of Sciences, Samara, Russian Federation, [email protected]

Abstract. The paper proposes a conceptual model of an advanced digital platform for adaptive management of enterprises within the next generation of the digital economy in the upcoming era of Industry 5.0. It analyzes existing digital platforms and their limitations due to the centralized and hierarchical management style they support. The paper considers the concept of a digital ecosystem as an open, distributed, self-organizing "system of systems" of smart services capable of coordinating decisions and automatically resolving conflicts through multi-party negotiations. It proposes a classification of the services to be provided by the introduced advanced digital platform and describes their functions. It substantiates the leading role of multi-agent systems as the basic software architecture and technology for developing applications managed by the introduced digital platform. The paper's results are applicable to many modern industrial enterprises.

Keywords: Complex adaptive systems · Self-organization · Digital platform · Digital ecosystem · Multi-agent technology · Smart services · Resource management

1 Introduction

About a decade ago, Industry 4.0 manifested the transition to a new era in manufacturing through the integration and combination of innovations and advancements of modern digital technologies like artificial intelligence, big data and analytics, robotics, smart sensors, cloud computing, the Internet of Things, cyber-physical systems and some others. Since that time, Industry 4.0 has remained a flagship determining the main global trends in the development of manufacturing systems. However, the landscape of modern IT models, paradigms and control technologies, accelerated by novel emergent applications, has changed rapidly, motivating researchers and practitioners to update some basic principles of Industry 4.0.


The recently introduced concept of the "system of systems" aims to manage the distributed processes of the subsystems in a coordinated way according to p2p principles. This actually means a transition to the network-centric production control model borrowed from the military domain, where lack of coordination and any delays can critically change the outcome of an operation [1]. According to the Industry 4.0 view, coordinated operation of a digital networking enterprise is achieved via the use of a software and communication infrastructure, also called a digital platform (DP). The main objective of this platform is to constitute, for the network components, a common information and communication space as well as a runtime environment. However, such a view of the DP looks too limited for the next generation of enterprises comprising autonomous subsystems, which should be capable of proactive behavior, adaptive planning and on-line re-planning, and of connecting the enterprise units to employees. In fact, the enterprise should operate according to the well-known Deming cycle "Plan-Do-Check-Act" [2] in all aspects of the enterprise's activities. Creating a DP with such extended capabilities is the key issue of the next step of the digital economy, formulated as Industry 5.0. More specifically, the intended capabilities of the digital enterprise DP are to unite enterprise subsystems into a common semantic information space and to provide them access to a rich set of reusable applied services intended to support distributed resource planning and control in real time. The DP of Industry 5.0 should also provide the networked enterprise units with standard access to cloud resources and services, and to data perceived by external smart sensor networks. The paper's objective is to introduce and outline the architecture, basic functions and services of an enterprise DP that fits the requirements of Industry 5.0. Accordingly, Sect. 2 outlines the current R&D regarding enterprise DP issues. Section 3 explains the recently introduced concept of a digital ecosystem as a conceptual framework for an enterprise of Industry 5.0. Section 4 presents a high-level view of the architecture of the DP in question and outlines the services that should transform a production ecosystem into an adaptive real-time resource management networked system of systems fitting the requirements of Industry 5.0. The Conclusion summarizes the results of the paper.

2 Related Research and Developments

Quite an extensive amount of information on models, architectures, and software and hardware implementations of enterprise DPs can be found in the literature. However, most of them refer to Industry 4.0 and do not go beyond classical ERP and BI systems. For example, in the current EU Horizon 2020 program, about a dozen projects deal with R&D concerning various aspects of enterprise DPs at the Industry 4.0 level. A short overview of these projects and their outcomes can be found in [3]. These projects focus on the role of the DP as an integrator of knowledge and data of the different enterprises composing a B2B manufacturing network, to ensure their information and software compatibility.


An architecture and basic functions of an advanced DP for a B2B production network compatible with the Industry 4.0 concept are proposed in [4]. In it, the DP supplies the following basic services:

• a communication service providing, for network nodes, communication channels for message exchange and routing, together with White and Yellow pages services (a minimal sketch of such a directory is given after this list);
• support for network openness;
• support for ontology-based information compatibility of production network nodes;
• support for network node interaction protocols in various use cases.

Among agent platforms, one should mention JADE (Java Agent DEvelopment Framework). Thanks to its open source concept, much reusable software has been developed for it [5]. Moreover, there is real experience of using JADE to implement a DP for self-organizing agent networks [3,4]. However, the JADE platform, despite being a product of industrial level, has several significant limitations preventing its use as a prototype of a digital enterprise DP. Our experience of developing industrial multi-agent systems [6] shows that existing platforms lack key components for providing enterprise-ready solutions. The main topic now is how to provide interaction and communication between these systems of systems, and what kind of services are required.
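As an illustration of the White/Yellow pages idea mentioned in the list above, the following is a minimal sketch of a service directory for a broker or agent network. The class and method names are ours and do not reproduce the JADE or FIPA API.

```python
# Minimal sketch of a "yellow pages" service directory for a broker/agent
# network; names and structures are illustrative, not JADE's or FIPA's API.
from collections import defaultdict

class YellowPages:
    """Maps service types to the node ids that provide them."""
    def __init__(self):
        self._providers = defaultdict(set)

    def register(self, node_id: str, service_type: str) -> None:
        self._providers[service_type].add(node_id)

    def deregister(self, node_id: str, service_type: str) -> None:
        self._providers[service_type].discard(node_id)

    def search(self, service_type: str) -> set:
        """Return all nodes currently offering the requested service."""
        return set(self._providers[service_type])

# Example: a scheduling service joins the network "on the fly".
yp = YellowPages()
yp.register("node-42", "resource-scheduling")
assert "node-42" in yp.search("resource-scheduling")
```

Such a registry is what allows new services to be discovered without restarting the network, which the openness requirement below also relies on.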

3 The Concept of Digital Ecosystem

The concept of a digital ecosystem was introduced in [7]. According to Wikipedia, a digital ecosystem is a distributed, adaptive, open socio-technical system with properties of self-organization, scalability and sustainability, inspired by natural ecosystems. Digital ecosystem models are informed by knowledge of natural ecosystems, especially for aspects related to competition and collaboration among diverse entities [8]. It is worth noting that exactly the same necessary features were claimed in Sect. 1 for an enterprise of Industry 5.0; therefore the latter can be viewed as a particular case of a digital ecosystem, and the enterprise DP as a particular case of an ecosystem DP. To be more precise, the enterprise DP should play the role of transformer of a digital enterprise into a digital ecosystem of smart services. The ecosystem of smart services should exhibit the following basic new features:

• Openness - the ability to introduce new services into the ecosystem "on the fly", without stopping or restarting it;
• Distributed architecture - the ability of all services to operate autonomously, continuously, in parallel and asynchronously. Service calls can be initiated by users, by other services, or proactively by internal events generated depending on the system state, criteria values, and decision-making policy;


• Adaptability - the ecosystem's ability to change its structure or functions dynamically in order to increase its own efficiency or, in a narrower sense, the ability of the ecosystem to respond adaptively to events by changing a previous decision or revisiting the existing plan when the situation changes;
• Self-organization - the ability of the ecosystem of services to proactively create local connections and to review them when the situation changes;
• Service competition and cooperation - the capability of an ecosystem service requester to choose among the services provided by different service suppliers.

In contrast to traditional closed, hierarchical, centralized and sequential systems for enterprise management, the considered enterprise ecosystem is built from autonomous intelligent components capable of proactive situation analysis and resource management, of interacting on joint actions, and of responding to incoming real-time events by revising the current plans on-line in order to achieve the ultimate goals. Implementing an enterprise ecosystem with such features requires the development of a fundamentally new DP.

4 Architecture and Services of Digital Platform

As a rule, an ecosystem of smart services is composed of a large or even huge number of autonomous but relatively simple entities solving their tasks through intensive interactions. It is a common opinion that, for such a class of systems, multi-agent technology, relying on distributed autonomous software agents situated within shared software, information and communication environments, is the best framework, architecture and implementation technology. Accordingly, in this paper, this approach is accepted as the basis for the ecosystem of smart services. An additional argument in favor of this decision is that the multi-agent community has developed many standard architectural and software solutions that can be productively exploited for the implementation of the digital platform of an ecosystem of smart services. For example, the FIPA abstract architecture offers a practically ready solution for the implementation of some system services constituting the software and communication environment (infrastructure) of the DP in question. It proposes a number of standard protocols supporting virtual-market self-organization techniques for resource planning and scheduling, e.g. the Contract Net Protocol. The developed architecture is illustrated in Fig. 1. In it, the services of the platform are divided into two groups. The first, called "system services", implements the standard functionalities of a FIPA-compliant agent platform, constituting what is usually called the software and communication agent infrastructure. The second group comprises application services; they play the role of interfaces connecting the software and users with the ecosystem's smart services. Let us outline both sorts of services. The FIPA abstract architecture [9] includes system services that can be exploited practically without change in the DP of the ecosystem of smart services (see Fig. 1). Let us motivate and outline the application services.


Fig. 1. Digital Platform of the ecosystem of smart services

1. A service intended for the integration of knowledge and data into a shared semantic space, ensuring unambiguous understanding of all the terminology used in it, with support for multi-aspect representation and for the integrity of data and knowledge. This service should monitor all processes of data and knowledge creation, modification and usage to guarantee unique and unambiguous understanding and use by all software entities. Graph databases are considered a primary candidate for interacting with the data and knowledge consumers through this service.
2. The virtual market service of the DP is the main service for resource planning, scheduling and deconfliction. A specific feature of an ecosystem DP, compared to a MAS platform, is that the DP usually runs in parallel a number of tasks that, at the same time, need the same resources and services, data from the same sensor network, etc. Naturally, there will always be conflicts in the competition for limited resources. However, a distinctive feature of digital production is the fact that all resources are publicly available: these resources do not have particular owners, and therefore one of the key tasks of the DP of a production ecosystem of smart services is the online planning of resource allocation over a multitude of potential consumers (a sketch of one negotiation round of such a virtual market is given after this list).
3. A service supporting access to the logged data ecosystem. A smart environment like an ecosystem of smart services is the source of a large quantity of valuable data, and one of the modern trends is creating a special data ecosystem containing data on every aspect of the environment's performance [10]. The data ecosystem is a new challenge for the DP of a data-intensive ecosystem of smart services, on the one hand, and a new opportunity to take advantage of the data to increase its intelligence, on the other. The main functionality of this service of the DP is to enable data sharing between the systems of the ecosystem in order to support their semantic interconnections.

Other application services indicated in Fig. 1 are self-explanatory and, thus, do not need additional description.
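The virtual market service in item (2) can be organized around the Contract Net Protocol mentioned in the previous section. The following is a minimal illustrative sketch of one negotiation round; the class names and the cost model are hypothetical and do not reproduce the FIPA specification.

```python
# Minimal one-round Contract Net sketch: a manager announces a task,
# contractors bid, and the cheapest bid wins. Names/cost model are hypothetical.
from dataclasses import dataclass

@dataclass
class Contractor:
    name: str
    load: float  # current utilization in [0, 1]

    def bid(self, task_size: float):
        """Propose a cost, or decline (None) if overloaded."""
        if self.load > 0.9:
            return None
        return task_size * (1.0 + self.load)  # busier nodes bid higher

def contract_net_round(task_size: float, contractors: list):
    """Call for proposals, collect bids, award to the lowest bidder."""
    bids = {c.name: c.bid(task_size) for c in contractors}
    valid = {n: b for n, b in bids.items() if b is not None}
    return min(valid, key=valid.get) if valid else None  # None: no proposals

winner = contract_net_round(5.0, [Contractor("cnc-1", 0.2), Contractor("agv-3", 0.7)])
print(winner)  # -> "cnc-1"
```

Because every round is a local auction, no central scheduler owns the resources, which matches the public-availability assumption stated in item (2).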

5 Conclusion

The paper presents the concept of a digital ecosystem of smart services for the digital enterprise, and of a DP transforming the ecosystem into a single whole. The latter should constitute a basis for managing the enterprise ecosystem in the upcoming era of Industry 5.0. The paper's contribution concerns the development of the DP architecture and the services it has to provide. Future research and development efforts will be devoted to the implementation of a proof-of-concept prototype of a DP corresponding to the Industry 5.0 concept of digital production systems.

Acknowledgements. The work was supported by the Ministry of Education and Science of Russian Federation (contract agreement No. 14.574.21.0183 – the unique identification number is RFMEFI57417X0183).

References
1. Moffat, J.: Complexity Theory and Network Centric Warfare. CCRP Publication Series: Information Age Transformation Series (2003)
2. Deming, W.E.: Out of the Crisis: New Paradigm of Managing People, Systems and Processes. Alpina Business Books, Moscow (2007). (in Russian)
3. Gorodetsky, V.: Multi-agent self-organization in B2B networks. In: XII All-Russian Meeting on Management Problems VSPU-2014, pp. 8954–8965 (2014). (in Russian)
4. Gorodetsky, V., Bukhvalov, O.: Self-organizing production B2B-networks. Part 2. Architecture and algorithmic support. Mechatron. Autom. Control 18(12), 829–839 (2017). (in Russian)
5. JADE. http://jade.tilab.com. Accessed 5 Feb 2019
6. Rzevski, G., Skobelev, P.: Managing Complexity. WIT Press (2014)
7. Briscoe, G., De Wilde, P.: Digital ecosystems: evolving service-oriented architectures. In: Proceedings of BIONETICS 2006. IEEE Press (2006)
8. Digital ecosystem. https://en.wikipedia.org/wiki/Digital_ecosystem. Accessed 15 Feb 2019
9. Abstract FIPA architecture (2019). http://www.fipa.org/specs/fipa00001/SC00001L.html. Accessed 15 Feb 2019
10. Curry, E., Sheth, A.: Next-generation smart environments: from system of systems to data ecosystems. IEEE Intell. Syst. 33(3), 69–76 (2019)

Conceptual Data Modeling Using Aggregates to Ensure Large-Scale Distributed Data Management Systems Security
Maria A. Poltavtseva and Maxim O. Kalinin
Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia
{poltavtseva,max}@ibks.spbstu.ru

Abstract. To define a common data security policy in a large-scale distributed data management system, it is necessary to create a uniform data catalog. The existing conceptual data models and their suitability for this problem are considered in the paper. The authors propose a new aggregate model for conceptual big data description in the security subsystems of large-scale distributed data management systems. The paper presents the mapping of DBMS data models to the new conceptual model.

Keywords: Data models · Conceptual modeling · Distributed systems · Big data security

1 Introduction

Large-scale distributed systems in different areas operate with large volumes of heterogeneous data, i.e. Big Data. Different types of DBMS are used as parts of the data architecture. During processing, data is granulated in different ways and distributed among these repositories. The data life cycle is complex and nonlinear [2]. To ensure the security of such systems, access control and the setting of a common security policy, it is necessary to use a common view for all information components. This allows applying uniform security management methods to them, regardless of the characteristics of the source, recipient, storage and processing.

2 Related Works

New threats in data management [4,22,27] require harmonization of security management throughout the data life cycle [2,7]. In large-scale distributed systems, data has a complex nonlinear cycle of transformations [2,23]. Using different processing tools based on different data models, and thus with different sets of operations and structuring methods, makes it difficult to harmonize security policies and set a single policy for the entire system. To overcome these challenges, a consistent approach to big data security requires a unified conceptual data model [26]. It should have a number of properties [9] defined by the scope. In this case they are: a mutual mapping with the basic industrial DBMS models [21]; correspondence to mathematical security models [1,31]; simplicity, to provide the required performance [1]; and support for data transformations along the nonlinear data lifecycle [2,27]. In conceptual modeling, the search for relevant data and process models has been going on for a long time [21,25]. Earlier, the conceptual modeling of data and of processes was clearly separated [28]. In conceptual data modeling, the need for model integration with DBMS technologies [21] and for taking into account the specifics of the subject area [24] has been noted. Requirements for the popular conceptual data models were shaped by analytical systems integrating heterogeneous information [5,8]. For analytical systems it is important to have typed sets of links, a high degree of data structuring, and semantics [30]. The same is noted in the conceptual modeling of Big Data [13], including NoSQL [29]. In information security, data has always been presented in the form of simple objects [33] characterized by a simple set of properties: level [10], role [10], type [12,19], attributes [16], and integrated directly with logical DBMS models [9,15]. To manage and control access, one needs to identify the object and to be able to determine access rights to it for a given subject. Complex relationships between objects are not taken into account in most (and all common) computer security models [11]. Therefore, conceptual models for analytics, such as multidimensional cubes [3] and ontologies [14,18], are excessive. This situation leads to the need to develop a conceptual data model for security management in large-scale (big) data management systems. The model should combine simplicity with completeness of representing data and operations. It should be based on a mathematical model that correlates with the mathematical foundations of the SQL, NoSQL and NewSQL approaches to data modeling.

3 Aggregate Data Model

Distributed large-scale Big Data management systems, in terms of information processing, are characterized by a large number of different specialized tools [15,17]. Data models in them can be relational, key-value, column family, documentary, and others [20], but the formal definition of the new models (except the relational one) is still being formed [6,20]. Relational DBMS and DBMS based on key-value pairs (including column family, documentary and so on) are based on the same set-theoretic approach and can be assigned to the same category from the mathematical point of view. The authors propose a data model based on the concept of an "aggregate" and nested aggregates.

The main data structure of the proposed model is the aggregate, based on nested sets: $A = \langle Key, Value \rangle$, where $Key$ uniquely identifies the aggregate. An aggregate can be atomic, i.e. not interpreted from the point of view of the data management system. Such a unit is described as $A_{atom} = \{a_j^{atom} \mid a_j^{atom} = \langle Key, Value_{atom} \rangle,\ j \in (1, M)\}$, where $M$ is the number of atomic aggregates in the system. An aggregate can also be composite, i.e. a union of a plurality of other aggregates. The set of aggregates $A$ is thus defined as $A = \{a_i \mid a_i \in A_{atom} \vee a_i = \langle Key, \{a_k\} \rangle,\ i \in (1, N)\}$, where $N$ is the total number of aggregates in the system. Since data modeling manipulates structures of the same type, an atomic aggregate can be described as a composite aggregate with a single input value: $a_j^{atom} \to a_j \Rightarrow a_j.Key = a_j^{atom}.Key,\ a_j.Value = \{\langle Key_{atom}, Value_{atom} \rangle\}$. When data is granulated, one type of relationship is established between aggregates: nesting. Aggregate $a_1$ is nested in aggregate $a_2$ when $a_2$ is composite and $a_1$ is an element of the set of values of $a_2$: $a_1 \in a_2.Value \Rightarrow a_1 \in a_2$. If the value of aggregate $a_i$ is generated by a function over the values of a set of aggregates $A_F$, then between $a_i$ and each $a_j \in A_F$, $j \in (0, |A_F|)$, there is a generation dependency: $a_i = F(A_F)$, or $a_j \to a_i$ for $a_j \in A_F$.

The aggregate data model has two basic constraints: first, the uniqueness of the key; second, the atomicity of simple aggregates. All operations on aggregates are divided into two classes: operations that require access to the data semantics, and operations that are independent of semantics [26]. There are three data definition (DDL) operations on aggregates. They are: the creation of an aggregate, $a = Create(Key, Value)$; the creation of an aggregate based on existing ones, $a_F = Create_F(Key, F(A_F)),\ A_F \subseteq A$, where the function $F$ involved in the aggregate definition is called the generation function and the function $F_{key}$, on the basis of which the key of the new value is formed, is the key generation function; and the deletion of an aggregate, $Delete(a)$. Different strategies can be used to generate a unique aggregate UID; the default key generation function concatenates the keys of all $a_i,\ i \in (1, |A_F|)$: $F_{key}^{default}(A_F) = a_1.Key \cdot \ldots \cdot a_{|A_F|-1}.Key \cdot a_{|A_F|}.Key \cdot UID$.

Consider the operations for manipulating aggregates. Operations that do not take the semantics of the aggregate value into account are: search (selection) by key, $a = Select(Key, A_{\cup})$, where the found aggregate $a \in A_{\cup}$ and $A_{\cup} \subset A$ is the search scope; the nesting operation, which establishes the nesting relation between existing aggregates, $a_{\Sigma} = Include(a_i, a_j) \Rightarrow a_j \in a_{\Sigma},\ a_{\Sigma}.Key = a_i.Key,\ a_{\Sigma}.Value = a_i.Value \cup \{a_j\}$; and the exclusion operation, which breaks the nesting relation between aggregates, $a_i = Extract(a_{\Sigma}, Key) \Rightarrow a_i = Select(Key, a_{\Sigma}.Value),\ a_{\Sigma}.Value = a_{\Sigma}.Value \setminus \{a_i\}$. Operations that take the semantics of the aggregate value into account are: reading the aggregate value, $Read(a)$, and modifying the aggregate value, $Update(a)$. The operation of modifying the value of an aggregate is fundamentally different from deleting the aggregate and creating a new one, because under modification the key of the aggregate remains the same, whereas after deletion and re-creation its key (identifier) changes.
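The aggregate structure and the semantics-independent operations above can be summarized in a small sketch. This is our own illustrative rendering of the model, not the authors' implementation, and all names are ours.

```python
# Illustrative sketch of the aggregate model: an aggregate is a <Key, Value>
# pair whose Value is either an opaque atom or a set of nested aggregates.
class Aggregate:
    def __init__(self, key, value, atomic=False):
        self.key = key                      # unique key (first constraint)
        self.atomic = atomic                # atoms are opaque (second constraint)
        self.value = value if atomic else dict(value)  # key -> nested Aggregate

def create(key, value, atomic=False):
    return Aggregate(key, value, atomic)

def include(a_i, a_j):
    """Nest a_j into a_i: the result keeps a_i's key, value = a_i.value U {a_j}."""
    assert not a_i.atomic and a_j.key not in a_i.value
    a_i.value[a_j.key] = a_j
    return a_i

def extract(a_sigma, key):
    """Break the nesting relation and return the extracted aggregate."""
    return a_sigma.value.pop(key)

def select(key, scope):
    """Search by key over a set of aggregates (no access to value semantics)."""
    return next((a for a in scope if a.key == key), None)

# Example: wrap an atomic reading into a composite "sensor record".
temp = create("t42", 21.5, atomic=True)
record = include(create("rec-7", {}), temp)
assert select("t42", record.value.values()) is temp
```

Note that `select`, `include` and `extract` never inspect atomic values, which is exactly the semantics-independent class of operations distinguished above.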

4 Mapping Logical DBMS Models to Aggregate Model

When mapping a key-value data model to an aggregate, the direct map takes the form $Key_l \to Key$, $Value_l \to Value$: the key of the stored value maps to the aggregate key, and the value maps to the aggregate value.


For more complex models built in the same aggregate paradigm, the mapping is more complex and requires nested aggregates. As an example, consider the "classic" column-family data model [20]. A member of the column-family set is mapped as $Name \to Key_n$, $Dataset \to Value_n$, and the primary key reference becomes the top-level aggregate: $Key_P \to Key$, $ColumnFamilies \to Value$. The final map is then $\langle Key, \{\langle Key_n, Value_n \rangle\} \rangle$. The graphic representation is shown in Fig. 1(A).

Fig. 1. Mapping key – value (A) and relational (B) model data structures in the aggregate model

Similar principles are used to construct the mappings of similar models, such as time series and documentary models, or individual implementations of column and column-family stores. The relational model is the most complex of the modern industrial data models. An instance $r(R)$ of a relation over schema $R$ is a set of $n$-tuples, written $r = \{t_1, \ldots, t_M\}$. Each tuple is a set of attribute values $t = (v_1, \ldots, v_N)$, where $v_i$ is the value of the $i$-th attribute (i.e. $v_i \in A_i$) of tuple $t$. That is, the value of the tuple is set over a set of attributes. The combination of the primary key of the tuple and the attribute name uniquely determines the value of this attribute of the tuple; the main key should therefore include the name of the relation, the primary key of the tuple, and the attribute name. An example of the relational mapping is shown in Fig. 1(B). Thus, the proposed aggregate model has mutual mappings with all logical data models of the tools used in distributed large-scale data management systems.
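As an illustration of this relational mapping, the sketch below (hypothetical names, building on the `create`/`include` functions sketched in the previous section) turns one relational tuple into a nested aggregate whose leaf keys combine relation name, primary key, and attribute name.

```python
# Illustrative relational-to-aggregate mapping (names are ours): each attribute
# becomes an atomic aggregate keyed by <relation>.<primary key>.<attribute>.
def tuple_to_aggregate(relation: str, pk: str, row: dict):
    top = create(f"{relation}.{pk}", {})            # top-level aggregate
    for attr, val in row.items():
        leaf_key = f"{relation}.{pk}.{attr}"        # globally unique key
        include(top, create(leaf_key, val, atomic=True))
    return top

agg = tuple_to_aggregate("Employee", "id=7", {"name": "Ada", "dept": "R&D"})
print(sorted(agg.value))  # -> ['Employee.id=7.dept', 'Employee.id=7.name']
```

The composite key scheme preserves the key-uniqueness constraint of the model even when many relations and tuples are loaded into one aggregate store.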

5 Conclusion

The authors found that the known conceptual data models are redundant with respect to information security problems and do not meet the stated requirements. The mathematical foundation of the majority of logical models is set theory. The proposed conceptual model preserves the basic relationships between the data of logical DBMS models, defined by the mathematical apparatus of nested sets; the length of the article does not allow considering them in more detail here. The existence of a relation between generating aggregates, together with treating aggregates with the same key as identical, allows integrating the aggregate model with the complex life cycle of a data fragment in a Big Data management system. The aggregate model thus supports data replication and transformation. The authors propose to use this aggregate data model to develop methods and tools for the conceptual design of Big Data in security management systems. The next step is to drill down into the mapping between the aggregate model and logical DBMS data models and to develop an access control approach using aggregates.

Acknowledgements. The reported study was funded by RFBR according to the research project N18-29-03102.

References
1. Ahmad, K., Alam, S., Udzir, N.I.: Security of NoSQL database against intruders. Recent Pat. Eng. (2018). https://doi.org/10.2174/1872212112666180731114714
2. Alshboul, Y., Wang, Y., Nepali, R.K.: Big data lifecycle: threats and security model. In: Twenty-First Americas Conference on Information Systems (AMCIS), pp. 1–7 (2015)
3. Bagozi, A., Bianchini, D., De Antonellis, V., Marini, A., Ragazzi, D.: Big data conceptual modelling in cyber-physical systems. Int. J. Concept. Model. (2018). https://doi.org/10.18417/emisa.si.hcm.24. Special Issue on Conceptual Modelling in Honour of Heinrich C. Mayr
4. Far, S.B., Rad, A.I.: Security analysis of Big Data on Internet of Things (2018). https://arxiv.org/abs/1808.09491. Accessed 28 Aug 2018
5. Calvanese, D., Lenzerini, M., Nardi, D.: Description logics for conceptual data modeling. In: Logics for Databases and Information Systems. The Springer International Series in Engineering and Computer Science (1998). https://doi.org/10.1007/978-1-4615-5643-5_8
6. Chebotko, A., Kashlev, A., Lu, S.: A big data modeling methodology for Apache Cassandra. In: 2015 IEEE International Congress on Big Data, New York (2015). https://doi.org/10.1109/BigDataCongress.2015.41
7. D'Acquisto, G., Domingo-Ferrer, J., Kikiras, P., Torra, V., de Montjoye, Y.D., Bourka, A.: Privacy by design in big data: an overview of privacy enhancing technologies in the era of big data analytics. CoRR, abs/1512.06000 (2015)
8. Davies, I., Green, P., Rosemann, M., Indulska, M., Gallo, S.: How do practitioners use conceptual modeling in practice? Data Knowl. Eng. 58(3) (2006). https://doi.org/10.1016/j.datak.2005.07.007
9. Embley, D.W., Thalheim, B.: Handbook of Conceptual Modeling: Theory, Practice, and Research Challenges, 1st edn. Springer Publishing Company (2011)
10. Ferraiolo, D.F., Kuhn, R.D., Chandramouli, R.: Role-Based Access Control, 2nd edn. Artech House (2007)
11. Gaidamakin, N.A.: Access to Information in Computer Systems. Ural University Publishing House, Yekaterinburg (2003)
12. Gaydamakin, N.A.: Multilevel thematic-hierarchical access control (MLTHS-system). Prikladnaya Diskretnaya Matematika 39 (2018). https://doi.org/10.17223/20710410/39/4


13. Gil, D., Song, I.-Y.: Modeling and management of big data: challenges and opportunities. Future Gener. Comput. Syst. (2015). https://doi.org/10.1016/j.future.2015.07.019
14. Jarrar, M., Demey, J., Meersman, R.: On using conceptual data modeling for ontology engineering. J. Data Semant. I (2003). https://doi.org/10.1007/978-3-540-39733-5_8
15. Jiaheng, L., Holubová, I.: Multi-model data management: what's new and what's next? In: EDBT (2017). https://doi.org/10.5441/002/edbt.2017.80
16. Kim, S., Kim, N., Chung, T.: Attribute relationship evaluation methodology for big data security. In: 2013 International Conference on IT Convergence and Security (ICITCS), Macao (2013). https://doi.org/10.1109/ICITCS.2013.6717808
17. Kleppmann, M.: Designing Data-Intensive Applications. O'Reilly (2018)
18. Kotenko, I., Fedorchenko, A., Doynikova, E., Chechulin, A.: An ontology-based hybrid storage of security information. Inf. Technol. Control (4) (2018)
19. Kotenko, I., Polubelova, O., Saenko, I.: The ontological approach for SIEM data repository implementation. In: 2012 IEEE International Conference on Green Computing and Communications, Besancon (2012). https://doi.org/10.1109/GreenCom.2012.125
20. Kuznetsov, S.D., Poskonin, A.V.: NoSQL data management systems. Program. Comput. Softw. (2014). https://doi.org/10.1134/S0361768814060152
21. Loucopoulos, P., Zicari, R.: Conceptual Modeling, Databases, and CASE: An Integrated View of Information Systems Development. John Wiley & Sons, New York (1992)
22. Mehmood, A., Natgunanathan, I., Xiang, Y., Hua, G., Guo, S.: Protection of big data privacy. IEEE Access (2016). https://doi.org/10.1109/ACCESS.2016.2558446
23. Moreno, J., Serrano, M.A., Fernández-Medina, E.: Main issues in big data security. Future Internet (2016). https://doi.org/10.3390/fi8030044
24. Palacio, L.A., Gimenez, G.A., Rodenas, C.J.C., Roman, R.J.F.: Genomic data management in big data environments: the colorectal cancer case. In: Advances in Conceptual Modeling, ER 2018 (2018). https://doi.org/10.1007/978-3-030-01391-2_36
25. Pernul, G.: Canonical security modeling for federated databases. In: IFIP Transactions A: Computer Science and Technology, Interoperable Database Systems (1993). https://doi.org/10.1016/B978-0-444-89879-1.50018-X
26. Poltavtseva, M.A.: Consistent approach to secure big data processing and storage systems development. Problems Inf. Secur. Comput. Syst. (2) (2019, in press)
27. Poltavtseva, M.A., Kalinin, M.O.: Threat model of big data processing and storage systems. Problems Inf. Secur. Comput. Syst. (2) (2019, in press)
28. Pras, A., Schoenwaelder, J.: On the difference between information models and data models. RFC 3444 (2003). https://doi.org/10.17487/RFC3444
29. Storey, V.C., Song, I.-Y.: Big data technologies and management: what conceptual modeling can do. Data Knowl. Eng. (2017). https://doi.org/10.1016/j.datak.2017.01.001
30. Storey, V.C., Trujillo, J.C., Liddle, S.W.: Research on conceptual modeling: themes, topics, and introduction to the special issue. Data Knowl. Eng. (2015). https://doi.org/10.1016/j.datak.2015.07.002
31. Thuraisingham, B.: Database and Applications Security: Integrating Information Security and Data Management. Taylor & Francis Group (2005)


32. Veniaminov, E.M.: Algebraic Methods in Database Theory and Knowledge Representation. Scientific World, Moscow (2003)
33. Zegzhda, P.D., Zegzhda, D.P.: Secure systems design technology. In: Lecture Notes in Computer Science, vol. 2052 (2001)

Smart Topic Sharing in IoT Platform Based on a Social Inspired Broker
Vincenza Carchiolo1, Alessandro Longheu2, Michele Malgeri2, and Giuseppe Mangioni2
1 Dip. Matematica e Informatica, Università degli Studi di Catania, Catania, Italy, [email protected]
2 Dip. Ingegneria Elettrica, Elettronica e Informatica, Università degli Studi di Catania, Catania, Italy

Abstract. The Social Internet of Things can be viewed as the evolution of IoT in the same way social networks can be considered an evolution of the Internet. In this paper we present a social IoT approach based on a social broker paradigm. In our proposal, the sharing of information among entities requires a common semantic model to make the information meaningful. We also introduce the concept of topic semantics and we show a solution to share topics among brokers.

Keywords: IoT · OSN · Topic sharing · Brokers

1 Introduction

Smart Topic Sharing in IoT Platform Based on a Social Inspired Broker

49

A third paradigma is known as Application-to-Application Social IoT (AASIoT), a new social approach that promotes data exchange and reuse among IoT applications, so they can use mutual social relationships to leverage their services [6]. Finally, Broker-to-Broker Social IoT (BBS-IoT) is a distributed multi-broker overlay platform proposed in [7] although no social inspiration is considered in query forwarding and answering; an improvement of BBS-IoT paradigma is introduced in [8], where a multi broker solution for M2M protocols is proposed. In this paper we focus on the BBS-IoT paradigma, in particular we consider each broker as a node of a social-based peer-to-peer network that operates according to PROSA [9,10], a semantic social inspired overlay network whose query forwarding algorithm is reliable and effective and whose small world structure guarantees fast responses and good query recall. In a social model, a cooperation among individuals requires they share the meaning of the exchanged messages, in the same way, any social paradigm requires a semantic model for the social objects. In our approach we provide a model to exchange topics among cooperating brokers; to this purpose, we use an ontology based concept semantic similarities. The main contribution of this paper lies in the formalization of the algorithm that ensures that networks evolves to a stable configuration that resemble a social-network. In Sect. 2 we briefly discuss about the meaning of social inspiration in IoT context and present the distributed social broker functionality. In Sect. 3 we present our approach in sharing topic over such a network, finally providing our concluding remarks and future works in Sect. 4.

2

A Distributed Social Broker

A typical communication pattern used in IoT is the publish-subscribe that allows asynchronous communication between processes. In this schema sender and receiver communicate through a broker and are not aware of each other. The sender usually classifies the message and publishes it to the broker, while a receiver subscribes to the same broker for any message belonging to a given class. This pattern allows implementing an effective Machine to machine (M2M) communication, thus inspirating several protocols as CoAP, MQTT, AMQP, and many others [11,12]. Most of them use topics for modeling classes of interest, and a client can subscribe to multiple topics to receive from a broker related messages. The growing use of IoT based systems implies that many systems often coexist in the same area as in a closed world, but they do not exchange messages. To allow information exchange, in [8] we proposed a social inspired networks among the existing brokers. Multiple-brokers approach is largely used within the IoT context to allow each broker to share information with other system, so a broker can answer requests leveraging information collected by others. The way brokers share information is a relevant question within a distributed network, for instance, they can share information similarly to the management


of queries in a P2P network. To the best of our knowledge, no definitive standard exists for this mechanism, and the overall performance in searching and retrieving resources heavily depends on the organization of the broker P2P network. The main problem concerning the use of a distributed broker is the significant broadcast communication overhead among brokers in the network. In fact, this solution can be efficiently implemented only with a small number of nodes; therefore, we should select the subset of brokers that guarantees the effectiveness and efficiency of the whole architecture, to avoid spreading information to brokers that are not interested in the same topic. In our approach, we build a broker network based on the PROSA model [10]. PROSA models relationships among peers (brokers in our case) by miming those among people; consequently, relations among friends are privileged, i.e. a question is first issued to a friend rather than to a stranger. In the proposed model, two peers are friends when they share information. The semantic proximity of resources is mapped onto the topological proximity of nodes, whereas the effectiveness and efficiency of query forwarding/answering are endorsed by the social nature of the PROSA network [9]. For this reason, in the following we use the term Social Broker Network, where nodes are brokers and links among brokers are managed according to the PROSA model, which provides three kinds of links: acquaintance links (AL), temporary semantic links (TSL) and full semantic links (FSL). AL models social relationships arising from everyday-life interactions, while semantic links model those acquaintances with which a stronger relation exists (note that a semantic link is not symmetric). A semantic link is split into two subcategories: temporary and full semantic links. To describe their difference, let us consider an example: if a friend asked us something about golf and we were not able to answer, we would anyway remember that he is involved with golf. This results in a stronger link than a simple acquaintance (AL), thanks to past queries, and it is called a Temporary Semantic Link (TSL). Whenever an answer to a query is provided, this leads to a stronger link named Full Semantic Link (FSL). To upgrade an acquaintance link into a semantic one, some additional semantic information is required. Once such a semantic link is established, as soon as a query concerning that field arises, we exploit that link to get assistance or collaboration.
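For concreteness, the following minimal Java sketch (our own illustration, not the PROSA implementation) shows one way a broker might record these three link types and upgrade them as queries and answers flow.

import java.util.HashMap;
import java.util.Map;

enum LinkType { AL, TSL, FSL } // acquaintance, temporary semantic, full semantic

class LinkTable {
    private final Map<String, LinkType> links = new HashMap<>(); // broker address -> link type

    // A query about a topic arrived from this broker but we could not answer:
    // remember its interest by upgrading a plain acquaintance to a TSL.
    void onQueryReceived(String address) {
        links.merge(address, LinkType.TSL,
                (old, incoming) -> old == LinkType.FSL ? LinkType.FSL : LinkType.TSL);
    }

    // An answer to a query was exchanged with this broker: strongest link.
    void onAnswerExchanged(String address) {
        links.put(address, LinkType.FSL);
    }

    // Unknown brokers are, by default, plain acquaintances.
    LinkType typeOf(String address) {
        return links.getOrDefault(address, LinkType.AL);
    }
}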

3 Sharing a Topic in a Social Broker Network

A first issue to address in the social broker network described previously is the way a new broker joining the network establishes links with other nodes. In particular, the newcomer will create links with brokers sharing topics with similar meaning, plus a few random links, similarly to a human being that creates social semantic ties with people having the same interests, culture and hobbies, as well as random ties with unknown persons. If the strategy is correctly defined and the algorithm successfully captures the dynamics of the social model, the resulting network of brokers will be a small-world network. We model the


broker according to PROSA, where a peer receiving a query forwarded by an unknown peer extracts some information about the source peer's knowledge from the query itself and uses it to establish a new link with the source peer. The model and the algorithms are described in [9] and [13]. A broker in the classical MQTT protocol has two kinds of clients, publishers and subscribers: the former publishes a message on a given topic, then the broker distributes that message to any client that has subscribed to that topic. For each broker we identify: the Local-Publisher Set, i.e. the set of publishers currently registered to the broker; the Local-Subscriber Set, i.e. the set of subscribers currently registered to the broker; and the Local List topic subscriber, i.e. the list of topics for which there exists at least one interested local subscriber. In our social-inspired approach, when a publisher publishes on a topic, the broker distributes that message both to all subscribers in the Local-Subscriber Set and to some brokers in the Social Broker Network, according to the PROSA peer-to-peer network management protocol (see Fig. 1).

Fig. 1. Social broker message distribution
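The per-broker state just described could be modeled along the following hypothetical lines; all names are ours, and pickBrokers() stands in for the selection procedure of Algorithm 1 discussed next.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

interface Subscriber { void deliver(String topic, byte[] payload); }

class PeerLink {
    String address;                               // Address_i
    String linkType;                              // "FSL", "TSL" or "AL"
    Set<String> localTopics = new HashSet<>();    // LocalListTopic_i of broker B_i
    void forward(String topic, byte[] payload) { /* network send, omitted */ }
}

class SocialBroker {
    final Map<String, Set<Subscriber>> localSubscribers = new HashMap<>();
    final List<PeerLink> brokersList = new ArrayList<>();

    void subscribe(String topic, Subscriber s) {
        localSubscribers.computeIfAbsent(topic, t -> new HashSet<>()).add(s);
    }

    // A publish is delivered to every interested local subscriber and then
    // forwarded to the peer brokers selected by Algorithm 1.
    void publish(String topic, byte[] payload) {
        for (Subscriber s : localSubscribers.getOrDefault(topic, Set.of())) {
            s.deliver(topic, payload);
        }
        for (PeerLink peer : pickBrokers(topic)) {
            peer.forward(topic, payload);
        }
    }

    List<PeerLink> pickBrokers(String topic) { return List.of(); } // placeholder for Algorithm 1
}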

Each broker B in the Social Broker Network has a BrokersList containing information about all brokers Bi linked with B. The BrokersList contains a row for each connected broker Bi, representing the link Link_i between B and Bi. Link_i is a triple (Address_i, LinkType_i, LocalListTopic_i), where Address_i is Bi's address, LinkType_i is the type of link between B and Bi (FSL, TSL or AL) and LocalListTopic_i is the Local List topic of Bi. When a new subscriber for a topic is added to B, its List topic subscriber is updated; when B receives a topic from one of its publishers, it sends the topic to all local subscribers registered for that topic and to the Social Broker Network, which manages the routing in order to deliver the message to the subset of brokers Bi having a local subscriber interested in that topic. When a broker B needs to select some other Bi in the Social Broker Network to deliver a topic T, it uses its BrokersList and a semantic similarity function SS. Algorithm 1 describes the procedure used by a broker to select the set of brokers Bi when publishing a topic T; BrokerReceiverSet denotes the set of brokers to which the topic T is sent. When the algorithm starts, this set is empty and is populated by the Picking Brokers function. The algorithm makes three attempts to select brokers, using the LinkType of each link.


Algorithm 1. Selection of brokers interested in T
1: procedure PickingBrokers(BrokersList, T)
2:   BrokerReceiverSet = ∅
3:   for all Link_i ∈ BrokersList do
4:     // Populate BrokerReceiverSet with brokers connected through a Full Semantic Link
5:     if LinkType(Link_i) == FSL ∧ SS(T, Broker(Link_i)) == True then
6:       BrokerReceiverSet = BrokerReceiverSet ∪ {Broker(Link_i)}
7:   // If no FSL matches, search for interested brokers connected through a TSL
8:   if BrokerReceiverSet == ∅ then
9:     for all Link_i ∈ BrokersList do
10:      if LinkType(Link_i) == TSL ∧ SS(T, Broker(Link_i)) == True then
11:        BrokerReceiverSet = BrokerReceiverSet ∪ {Broker(Link_i)}
12:  // If BrokerReceiverSet is still empty, some brokers are randomly selected
13:  if BrokerReceiverSet == ∅ then
14:    for all Link_i ∈ BrokersList do
15:      // random() returns True according to a given probability distribution
16:      if LinkType(Link_i) == AL ∧ random() then
17:        BrokerReceiverSet = BrokerReceiverSet ∪ {Broker(Link_i)}
18:  return BrokerReceiverSet

The first attempt concerns the selection from the BrokersList of the brokers Bi whose LinkType equals FSL. For each Bi having FSL as LinkType, the function SS is evaluated; if it is True for some Bi, that Bi is added to the BrokerReceiverSet. If no FSL-linked candidate to deliver the topic T exists, i.e. the BrokerReceiverSet is still empty, the algorithm proceeds by trying to find a Bi connected through a TSL and interested in the topic T. If this attempt also fails, it means that neither an FSL- nor a TSL-linked broker with an acceptable semantic similarity with respect to the topic T exists; Algorithm 1 then randomly selects the subset of Bi in the BrokersList linked with an AL and uses it as the BrokerReceiverSet. As a rule, when talking to a friend, in addition to answering a direct question, we usually also provide him with information about ourselves. To mimic this social behavior, the broker B sends to each broker Bi ∈ BrokerReceiverSet a triple built with the address of B, the topic T and the SubscriberList of B. Finally, B delivers the topic T to the subscribers of each Bi ∈ BrokerReceiverSet and updates the BrokersList of Bi with the received information (when a broker Bi receives a topic T from Bj, it sends its own BrokersList back to Bj). A topic T can be exchanged between a broker S (the Sender) and another R (the Receiver); the communication is modeled as a message(Address_S, T, Subscriber_S). Algorithm 2 describes the update procedure of the BrokersList of R when it receives a message(Address_S, T, Subscriber_S) from S. If R and S are unlinked, a new row (Address_S, TSL, Subscriber_S) is added to the BrokersList of the receiver R. Conversely, if a link between R and S exists, the algorithm adds Subscriber_S to the corresponding LocalListTopic and, if the link was a TSL, upgrades it to an FSL.


Algorithm 2. Topic vector update
1: procedure Update(BrokersList_R, Address_S, T, Subscriber_S)
2:   for all Link_i ∈ BrokersList_R do
3:     if Address_i == Address_S then
4:       if LinkType(Link_i) == FSL then
5:         // S and R are linked with an FSL: LocalListTopic_i is updated
6:         LocalListTopic_i = {Subscriber_S} ∪ LocalListTopic_i
7:         return
8:       if LinkType(Link_i) == TSL then
9:         // S and R are linked with a TSL: LinkType_i and LocalListTopic_i are updated
10:        LocalListTopic_i = {Subscriber_S} ∪ LocalListTopic_i
11:        LinkType(Link_i) = FSL
12:        return
13:  // S and R are not linked: a TSL link is added to BrokersList_R
14:  BrokersList_R = BrokersList_R ∪ {(Address_S, TSL, Subscriber_S)}
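A direct Java transcription of Algorithm 2 might look as follows; it reuses the PeerLink sketch above and is our own rendering, not the authors' code. For brevity it records only the topic (the paper also records the sender's subscriber list in the LocalListTopic).

import java.util.List;

class BrokersListUpdate {
    // Called by receiver R when message(Address_S, T, Subscriber_S) arrives from sender S.
    static void update(List<PeerLink> brokersListR, String addressS,
                       String topic, String subscriberS) {
        for (PeerLink link : brokersListR) {
            if (link.address.equals(addressS)) {
                link.localTopics.add(topic);       // record S's interest in T
                if ("TSL".equals(link.linkType)) {
                    link.linkType = "FSL";         // a TSL is promoted to an FSL
                }
                return;                             // an FSL simply stays an FSL
            }
        }
        // R and S were not linked: add a fresh TSL row for the sender.
        PeerLink fresh = new PeerLink();
        fresh.address = addressS;
        fresh.linkType = "TSL";
        fresh.localTopics.add(topic);
        brokersListR.add(fresh);
    }
}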

The semantic similarity function SS(Topic T, Broker Bi) computes the interest of Bi in T and B's willingness to share the topic. To this purpose, we manipulate the simple text strings, whose elements are separated by the '/' symbol used in the MQTT protocol for topics. These strings contain information about the localization and the type of measure. The localization can be absolute, descriptive, or relative, as in the examples shown in Fig. 2: absolute spatial information (case 1) is used directly, descriptive spatial information (case 2) must be replaced with absolute spatial information, whereas relative information (case 3) is turned into absolute spatial information by fetching the position of the broker the publisher is connected to.

Fig. 2. Topic splitting in T(C) and T(M)

We split the string representing the topic T into two separate strings, T(C) and T(M), where T(C) contains the localization of the sensor/publisher as absolute spatial coordinates and T(M) is the description of the measure being distributed by the publisher. The generation of T(C) depends on the nature of the localization information included in the topic. As shown in Fig. 2, in case 1 the string is simply split, in case 2 the string "Italy/Rome" is replaced with the coordinates of Rome (41.9028° N, 12.4964° E) and finally in case 3 the replacement is carried out with the broker's coordinates (52.5200° N, 13.4050° E).
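A minimal sketch of this splitting step, assuming a topic grammar of the form location/.../measure and a small hypothetical gazetteer for descriptive names (a real system would use a geocoding service):

import java.util.Map;

class TopicSplitter {
    static final Map<String, double[]> GAZETTEER =
            Map.of("Italy/Rome", new double[]{41.9028, 12.4964});

    // Returns {T(C) as "lat,lon", T(M)} for topics such as
    // "41.9028/12.4964/temperature", "Italy/Rome/temperature" or "here/humidity".
    static String[] split(String topic, double brokerLat, double brokerLon) {
        int cut = topic.lastIndexOf('/');
        String location = topic.substring(0, cut);
        String measure = topic.substring(cut + 1);   // T(M)
        double[] coords;
        if (location.equals("here")) {                // case 3: relative -> broker position
            coords = new double[]{brokerLat, brokerLon};
        } else if (GAZETTEER.containsKey(location)) { // case 2: descriptive -> lookup
            coords = GAZETTEER.get(location);
        } else {                                      // case 1: already absolute "lat/lon"
            String[] parts = location.split("/");
            coords = new double[]{Double.parseDouble(parts[0]), Double.parseDouble(parts[1])};
        }
        return new String[]{coords[0] + "," + coords[1], measure};
    }
}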


When a broker joins the Social Broker Network, it provides some information about the accepted accuracy through the pair Accuracy = (AL, AM), where AL and AM measure, respectively, the acceptable error on the localization and on the type of measure. Given two topics T1 and T2, the function SS(T1, T2, Accuracy) returns the boolean True when T1 ≈ T2, i.e.

SS(T1, T2, Accuracy) = (pointDist(T1(C), T2(C)) ≤ AL) ∧ (ontologicDist(T1(M), T2(M)) ≤ AM)

The function pointDist(T1(C), T2(C)) returns the distance between points T1(C) and T2(C); the function ontologicDist gives a measure of the semantic distance between the strings T1(M) and T2(M) according to the given ontology.
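A possible realization of SS, with a standard haversine formula playing the role of pointDist and a stubbed ontologicDist (any real implementation depends on the chosen ontology; the threshold parameters stand in for AL and AM):

class SemanticSimilarity {
    static boolean ss(double[] c1, String m1, double[] c2, String m2,
                      double maxKm, double maxOntoDist) {
        return haversineKm(c1[0], c1[1], c2[0], c2[1]) <= maxKm
                && ontologicDist(m1, m2) <= maxOntoDist;
    }

    // Great-circle distance in kilometres between two lat/lon points.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double r = 6371.0;
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * r * Math.asin(Math.sqrt(a));
    }

    // Placeholder: a real implementation would measure the path length between
    // the two concepts in the reference ontology.
    static double ontologicDist(String m1, String m2) {
        return m1.equals(m2) ? 0 : 1;
    }
}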

4 Conclusions and Future Works

In this paper we presented a novel self-organising algorithm for topic distribution in a Social Broker Network. The algorithm aims at emulating the way social relationships among people naturally arise and evolve. Several problems still remain to be addressed, such as security and privacy concerns, efficiency, the long-term evolution of the network, the clash of topics, and so on. Most security concerns can be handled by communication networks using certificates, symmetric keys, distributed password management, etc., though reputation and trust remain open challenges. In [14] a solution based on social relationships is proposed, while in [15] another solution, still inspired by PROSA, is given. Privacy is also a big concern since, although the main goal is to share information, clients would also like to protect some sensitive information, so a trade-off must be established. The efficiency of the solution can be measured according to different features (performance, number of links, traffic); the solution currently under experimentation concerns an IoT smart-office network. A smart factory management system is also being developed that, in addition to classical energy consumption management, aims to put workers in their best mental and physical condition to increase productivity. The long-term evolution of the network can lead to a large number of links; possible countermeasures are (1) discarding rarely used links through an aging algorithm, or (2) discarding links according to the quality of information, reputation and/or trustworthiness. The clash of topics, i.e. the chance that a subscriber receives several data items related to a topic that different publishers describe with the same terms, is partially solved by taking into account the localization already included in the topic. However, a smarter algorithm may take into account history (to better understand which meaning is linked to the topic), some threshold, precision, and also the kind of link from which the topic was obtained. Acknowledgements. This work has been partially supported by the Università degli Studi di Catania, "Piano della Ricerca 2016/2018 Linea di intervento 2".


References
1. Holmquist, L.E., Mattern, F., Schiele, B., Alahuhta, P., Beigl, M., Gellersen, H.W.: Smart-its friends: a technique for users to easily establish connections between smart artefacts. In: Abowd, G.D., Brumitt, B., Shafer, S. (eds.) Ubicomp 2001: Ubiquitous Computing, pp. 116–122. Springer, Heidelberg (2001)
2. Mendes, P., Mendes, P.A.: Social-driven internet of connected objects (2011)
3. Kim, J.E., Fan, X., Mosse, D.: Empowering end users for social Internet of Things. In: Proceedings of the Second International Conference on Internet-of-Things Design and Implementation, IoTDI 2017, pp. 71–82. ACM, New York (2017)
4. Kranz, M., Roalter, L., Michahelles, F.: Things that Twitter: social networks and the Internet of Things. In: What Can the Internet of Things do for the Citizen (CIoT) Workshop at the Eighth International Conference on Pervasive Computing (2010)
5. Guinard, D., Fischer, M., Trifa, V.: Sharing using social networks in a composable web of things. In: 2010 8th IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops), pp. 702–707, March 2010
6. Saleem, Y., Crespi, N., Pace, P.: SCDIoT: social cross-domain IoT enabling application-to-application communications. In: 2018 IEEE International Conference on Cloud Engineering (IC2E), pp. 346–350, April 2018
7. D'Elia, A., Viola, F., Roffia, L., Cinotti, T.S.: A multi-broker platform for the internet of things. In: Balandin, S., Andreev, S., Koucheryavy, Y. (eds.) Internet of Things, Smart Spaces, and Next Generation Networks and Systems, pp. 34–46. Springer, Cham (2015)
8. Carchiolo, V., Longheu, A., Malgeri, M., Mangioni, G.: A social inspired broker for M2M protocols. In: COMPLEXIS 2019 (2019)
9. Carchiolo, V., Malgeri, M., Mangioni, G., Nicosia, V.: Social behaviours applied to P2P systems: an efficient algorithm for resource organisation. CoRR abs/cs/0702085 (2007)
10. Carchiolo, V., Malgeri, M., Mangioni, G., Nicosia, V.: PROSA: P2P resource organisation by social acquaintances. In: Agents and Peer-to-Peer Computing (2006)
11. Hunkeler, U., Truong, H.L., Stanford-Clark, A.: MQTT-S – a publish/subscribe protocol for wireless sensor networks. In: 2008 3rd International Conference on Communication Systems Software and Middleware and Workshops (COMSWARE 2008), pp. 791–798, January 2008
12. Vinoski, S.: Advanced message queuing protocol. IEEE Internet Comput. 10(6), 87–89 (2006)
13. Carchiolo, V., Malgeri, M., Mangioni, G., Nicosia, V.: An adaptive overlay network inspired by social behaviour. J. Parallel Distrib. Comput. 70(3), 282–295 (2010)
14. Buzzanca, M., Carchiolo, V., Longheu, A., Malgeri, M., Mangioni, G.: Direct trust assignment using social reputation and aging. J. Ambient Intell. Humaniz. Comput. 8(2), 167–175 (2017)
15. Carchiolo, V., Longheu, A., Malgeri, M., Mangioni, G.: The cost of trust in the dynamics of best attachment. Comput. Inform. 34, 167–184 (2015)

Easy Development of Software for IoT Systems

Ichiro Satoh(B)

National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
[email protected]

Abstract. Software for IoT, which constitutes distributed systems, is often developed by people who have domain knowledge of the targets measured or actuated by IoT in the real world. They may not have professional knowledge of distributed systems. The difficulty of programming distributed systems tends to result from communication processing. This paper proposes a framework that enables developers to define programs for IoT without explicitly writing communication processing. The idea behind this framework is to migrate running programs themselves from computer to computer instead of communicating between programs running at different computers. We describe the design and implementation of the framework and an early evaluation. Keywords: Software development · Distributed system · IoT

1 Introduction

Internet of Things (IoT) systems have nowadays become essential infrastructures to monitor or actuate physical objects in the real world. Developing and operating IoT systems requires knowledge of the objects in the physical world that are measured and controlled by the systems. As a result, people, including engineers or researchers, who have such knowledge but not professional knowledge about distributed systems often develop such systems. For example, when deploying IoT sensing units at a volcano to monitor volcanic tremors, researchers may have professional knowledge of volcanoes but not of IoT. In particular, IoT nodes need to communicate with other nodes, but the processing for communication drastically increases the complexity and difficulty of software for IoT systems, because such software requires many exception-handling routines against the various problems that may occur during communication. Therefore, developers also need sufficient knowledge of communications. The purpose of this work is to reduce the difficulty of developing software for IoT systems. This paper proposes a framework for easily developing and deploying software for IoT systems. It enables developers to define software for running on nodes in IoT systems without explicit communications, essentially as stand-alone software. The key idea is to dynamically deploy programs that carry their data with them, reducing the difficulty of defining communication processing by moving the program itself instead of explicitly writing data communications. This has two distinct advantages: (i) First, it makes software simple, because it enables the software itself to carry data between IoT nodes. Therefore, the software does not have to define processes for


sending and receiving data with other programs running on different IoT nodes. (ii) Second, in conventional approaches, in order to send data to other nodes, developers had to define two types of programs (one for the sending side and one for the receiving side), whereas in this framework developers can achieve data exchange by creating just one type of program. The framework is based on a mobile agent platform, since mobile agents are autonomous and programmable entities that can travel with their data between computers under their own control. When an agent migrates to another computer, not only its program code but also its state, e.g., its program variables, is transferred, so that it can continue its execution at the destination. This paper presents the design and implementation of the framework together with its application.

2 Basic Approach

To develop software for IoT systems, it is necessary to have domain knowledge about processing the data on the objects monitored or actuated by the systems in the real world, in addition to knowledge of system-level issues, e.g., networking and resource management, including batteries. We aim at reducing the difficulty of developing software for IoT systems. Mobile agents can carry data to their destinations. When migrating to another node, each mobile agent explicitly invokes application programming interfaces (APIs) provided by its current runtime system (Fig. 1). Therefore, it does not need to define communication processing inside itself. The framework should not assume that nodes remain connected through networks, because networks used in IoT systems tend to be unstable and slow. After migrating to the destination-side computer, mobile agents do not have to interact with their source-side nodes. Since mobile agents can directly read data from storage and write results into storage at their current nodes, stand-alone programs for analyzing data can easily be turned into mobile agents. To aggregate and process results at multiple nodes into an overall result, the framework should allow developers to explicitly define the aggregation and processing of the results. Each mobile agent can define its itinerary inside itself so that it can aggregate data from nodes according to that itinerary.

Fig. 1. Communication-based processing and mobile agent-based processing (left: data nodes send data to a processing node through communication; right: the software for IoT itself migrates between data nodes)


The framework itself assumes that each single mobile agent is responsible for visiting nodes, gathering data, processing the data, and relaying the result to a certain node. However, we may want to use multiple mobile agents for reasons of efficiency. In this case, we need a mechanism to divide a single mobile agent into multiple agents and then merge the multiple agents into a single agent again. When dividing a mobile agent into multiple agents, the data maintained in the former may need to be divided into multiple pieces in a certain way; when merging multiple agents into a single agent, the pieces of data maintained in the latter agents may be joined by a certain function, as sketched below.
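A toy illustration of such divide and merge functions over an agent's data (our own sketch; the framework leaves both functions user-defined):

import java.util.ArrayList;
import java.util.List;

class AgentData {
    final List<int[]> samples = new ArrayList<>();

    // Divide: round-robin the samples over n child agents.
    List<AgentData> divide(int n) {
        List<AgentData> shards = new ArrayList<>();
        for (int i = 0; i < n; i++) shards.add(new AgentData());
        for (int i = 0; i < samples.size(); i++) {
            shards.get(i % n).samples.add(samples.get(i));
        }
        return shards;
    }

    // Merge: a user-defined join; here, simple concatenation.
    static AgentData merge(List<AgentData> parts) {
        AgentData out = new AgentData();
        for (AgentData p : parts) out.samples.addAll(p.samples);
        return out;
    }
}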

3 Design and Implementation

Our framework consists of two layers: software agents and runtime systems. The former are programmable entities that define user-specific, application-level processing and are constructed as Java-based mobile agents. The latter run on IoT devices and are responsible for executing and deploying the software agents on computers.

3.1 Runtime System

Each runtime system was constructed on our original Java-based mobile agent platform. It is responsible for exchanging and executing mobile agents. It establishes at most one TCP connection with each of its neighboring systems in a peer-to-peer manner, without any centralized management server, and exchanges control messages and agents through the connection. Each runtime system manages the execution of its running agents and maintains their life-cycles.

3.2 Programming Model

Conventional Java sequential programs can be embedded into methods invoked at certain life-cycle changes, e.g., the creation and arrival of agents. The programs can also explicitly call application programming interfaces (APIs). For example, when an agent program calls the move() API with a host address, the agent migrates itself to the computer specified by that address. When an agent migrates to another computer, not only the program code of the running program but also its state, e.g., data in its heap areas, is transferred to the destination. As a result, it can carry its data to the destination and continue its execution there. Let us assume a single non-mobile-agent program that reads data from multiple devices and then calculates the average of the data. The left side of Fig. 2 shows a program designed to run at the data-processing side, and the right side of Fig. 2 shows a mobile agent-based program. The framework transforms the former into the latter: the former contains program parts for communication processing with exception handling, whereas the latter contains no communication processing except for a call to an API that moves the program itself to another computer. In the latter, created() and arrived() are invoked on certain changes in the life-cycle state of the agent, i.e., after creation and after arrival, respectively. Each mobile agent maintains its state, e.g., data in heap areas, not only when it is migrated but also when it is duplicated.


Program with communication processing (left of Fig. 2):

public class Average {
    int total = 0;
    int num = 0;
    DestinationList dstList = null;

    public Average() {
        ....
        dstList = new DestinationList();
        dstList.append(device1);
        dstList.append(device2);
        ....
        num = dstList.size();
        calculate(); // program part for communication processing
    }

    public void calculate() {
        while (dstList.hasNext()) {
            try {
                NetworkAddress addr = dstList.next();
                Meter meter = new Meter(addr);
                total = total + meter.value();
            } catch (Exception ex) { ... }
        }
        System.out.println("Average consumption: " + total / num);
    }
}

Mobile agent-based program, without explicit communication processing (right of Fig. 2):

public class AverageAgent implements MobileAgent {
    int total = 0;
    int num = 0;
    DestinationList dstList = null;

    public AverageAgent() {
        ....
        dstList = new DestinationList();
        dstList.append(device1);
        dstList.append(device2);
        ....
        num = dstList.size();
    }

    public void created(AgentEvent evt, Context context) {
        ....
    }

    public void arrived(AgentEvent evt, Context context) {
        while (dstList.hasNext()) {
            try {
                context.move(dstList.next()); // program part calling the API for agent migration
            } catch (Exception e) { .... }
            Meter meter = new Meter(current_host);
            total = total + meter.value();
        }
        System.out.println("Average consumption: " + total / num);
    }
}

Fig. 2. Transformation of a program for distributed processing into a mobile agent-based program

Since duplicating a mobile agent into multiple mobile agents often aims at parallel processing, the data stored in the agent may need to be divided, with the divided pieces assigned to the resulting agents. Therefore, the framework provides data areas for mobile agents, constructed as data stores attached to the agents, which enable their data to be divided or merged by user-defined functions. The stores are constructed as tree-structured KVSs, where each KVS maps an arbitrary string key to arbitrary byte-array data and is maintained inside its agent; directory servers provide access to the KVSs in agents. The root KVS merges the KVSs of agents into itself according to user-defined functions, such as the reduce functions of MapReduce processing [1]. In the current implementation, each KVS in a data-processing agent is implemented as a hashtable that maps keys, given as arbitrary string values, to byte-array data, and it is carried with its agent between nodes. It supports a built-in hash-join to merge two or more KVSs carried by agents into a single KVS.
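The merge step could be sketched as follows, with a user-supplied reduce function applied when two KVSs hold the same key, in the spirit of the MapReduce reduce function mentioned above; this is an illustration, not the framework's actual API:

import java.util.HashMap;
import java.util.Map;

class AgentKvs {
    final Map<String, byte[]> store = new HashMap<>();

    interface Reduce { byte[] apply(byte[] mine, byte[] theirs); }

    // Hash-join another agent's KVS into this one: keys held by only one side
    // are copied; keys held by both sides are combined by the reduce function.
    void merge(AgentKvs other, Reduce reduce) {
        for (Map.Entry<String, byte[]> e : other.store.entrySet()) {
            store.merge(e.getKey(), e.getValue(), reduce::apply);
        }
    }
}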

4 Experience

We describe the basic performance of the framework and its application. The current implementation is built on Java Virtual Machine (JVM) version 8 or later. The runtime system can store the states of each agent in heap space, in addition to the code of the agent, into a bit-stream in Java's JAR file format, which can support digital signatures for authentication. The current framework basically uses the Java object serialization package for marshaling agents. The package does not support capturing the stack frames of threads; instead, when an agent is duplicated or migrated, the runtime system issues events to it to invoke its specified methods, which should be executed before the agent is duplicated or migrated, and it then suspends the agent's active threads. Although the current implementation was not built for performance, we evaluated several basic operations in a distributed system where eight computers (2.3 GHz Intel i5 with MacOS X 10.11 and Java version 8) were connected through a Gigabit Ethernet. The cost of agent duplication was measured as plotted on the left side of Fig. 3, where the agent was simple and consisted of the callback methods described in Fig. 2; the cost includes that of invoking the two callback methods. The cost of


migrating the same agent between two computers was measured as plotted on the right side of Fig. 3, where the cost includes that of opening the TCP transmission, marshaling the agent, migrating it from the source computer to the destination computer, unmarshaling it, and verifying security.

Fig. 3. Basic cost of agent migration and duplication (left: cost of agent duplication; right: cost of agent deployment; cost in ms versus agent size in KB)

5 Related Work

There have been several attempts to enable easier development of software for IoT systems. Several researchers have explored middleware systems for IoT, e.g., HYDRA [2]; these tend to focus on the elaboration and manipulation of the data gathered from devices. FIWARE is an open software platform for developing and operating application-level services for IoT systems [3]. Although it may help developers abstract away differences between computers, it still requires them to explicitly describe communication processing. Several researchers have explored visual programming frameworks for IoT systems, e.g., LabVIEW [6]; while these frameworks reduce the burden on the developer, it is still necessary to write communication processing. Foglets [5] and MCEP [4] introduce the notion of live task migration with their own APIs in a cloud-edge environment for location-aware applications, similar to ours, but they aim at optimizing resource and task allocation rather than providing support for programming IoT systems.

6 Conclusion

This paper proposed a framework to help developers define programs for IoT without explicitly describing communications, which are often one of the greatest difficulties in developing IoT systems. In our framework, data transfer that would otherwise require communication between computers is achieved by moving the executing program itself between computers through mobile agent technology. As a result, developers do not have to explicitly write communication processing into their programs.


References
1. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: Proceedings of the 6th Conference on Symposium on Operating Systems Design and Implementation (OSDI 2004) (2004)
2. Eisenhauer, P., Rosengren, M., Antolin, P.: HYDRA: a development platform for integrating wireless devices and sensors into ambient intelligence systems. In: Giusto, D., Iera, A., Morabito, G., Atzori, L. (eds.) The Internet of Things, pp. 367–373. Springer, New York (2010)
3. FIWARE: From Open Data to Open APIs. http://www.fi-ware.org
4. Ottenwalder, B., Koldehofe, B., Rothermel, K., Hong, K., Lillethun, D., Ramachandran, U.: MCEP: a mobility-aware complex event processing system. ACM Trans. Internet Technol. 14(1), 6:1–6:24 (2014)
5. Saurez, E., Hong, K., Lillethun, D., Ramachandran, U., Ottenwalder, B.: Incremental deployment and migration of geo-distributed situation awareness applications in the fog. In: Proceedings of 10th ACM International Conference on Distributed and Event-Based Systems, pp. 258–269. ACM (2016)
6. Whitley, K.N., Blackwell, A.F.: Visual programming in the wild: a survey of LabVIEW programmers. J. Vis. Lang. Comput. 12(4), 435–472 (2001)

Data Analysis, Mining and Machine Learning

Privacy-Preserving LDA Classification over Horizontally Distributed Data

Fatemeh Khodaparast1(B), Mina Sheikhalishahi2, Hassan Haghighi1, and Fabio Martinelli2

1 Department of Computer Science and Engineering, Shahid Beheshti University, Tehran, Iran
[email protected], [email protected]
2 Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche (CNR), Pisa, Italy
{mina.sheikhalishahi,fabio.martinelli}@iit.cnr.it

Abstract. This paper presents a framework for securely constructing a Linear Discriminant Analysis (LDA) classifier over distributed data. It is assumed that data is partitioned among several parties such that, to obtain higher benefits from a greater extent of data, all participants are willing to model the LDA classifier on the whole data, but for privacy concerns they refuse to share their original datasets. To this end, we propose an algorithm based on secure computation protocols, which gives the data owners the possibility of constructing the LDA classifier over all datasets without revealing any sensitive information. In our experimental analysis, applying data packing and a tree communication model has shown a reduction of computation and communication costs by factors of 35 and 15, respectively. Keywords: Privacy-preserving · Information sharing · Collaborative data analysis · Distributed classification · Linear Discriminant Analysis

1 Introduction

In today's modern society, data is becoming as valuable as oil was in the past, with quintillions of bytes of data generated every day. At the same time, data-driven problem-solving methods, like machine learning (ML), have been developed to solve problems and predict upcoming events [8]. Generally, ML algorithms produce more accurate outcomes when the model is constructed over a larger set of data. Owing to this fact, sharing information for constructing ML algorithms over distributed data is vital; examples can be found in e-health scenarios [1] or in e-markets [5]. However, data holders are usually unwilling to share their original data due to the fact that it might contain sensitive information, which either is forbidden to be shared by law or would harm the reputation of the organization. In these situations, privacy-preserving data mining approaches are proposed to protect confidential information when data is to be utilized in data mining processes. Classification, as a data mining technique, is the process of predicting the likelihood of occurrence of an event based on the history of previous observations (training data) [6]. Linear Discriminant Analysis (LDA) is an efficient classifier which has shown a


promising result in terms of accuracy and time [12]. The LDA classifier is built based on two main criteria, i.e. mean and covariance. In this study, we plan to construct the LDA classifier securely when data is distributed horizontally among several agents: all agents have their data expressed with the same set of features, but for different sets of objects. To shape the LDA classifier over all data without revealing any confidential information, we utilize the Paillier secure sum protocol to develop the proposed algorithm. To mitigate the cost of the encryption mechanisms, we employ (1) a different communication model, named the tree model, and (2) an alternative message passing technique, named data packing. The contributions of this work can be summarized as follows: • We address the problem of secure LDA classification when data is distributed among several parties. • We reduce the communication cost, through a tree communication model, from n to log2(n) among n participants. • We improve the computation and communication cost of the secure LDA construction through the application of data packing. • We prove the security of the proposed approach. • Finally, the efficiency of our algorithm, in terms of both computation and communication cost, is validated through a number of experiments. The rest of this work is organized as follows. Section 2 presents some preliminary concepts and notions exploited in this study. The problem statement and proposed framework are introduced in Sect. 3. Next, in Sect. 4, the experimental results are reported. Related work on the concepts of privacy-preserving data classification is presented in Sect. 5. Finally, Sect. 6 briefly concludes the paper and proposes some future research directions.

2 Preliminary Concepts

In this section, we briefly present some fundamental notations and concepts which are deployed in our proposed framework.

2.1 Linear Discriminant Analysis

Linear Discriminant Analysis (LDA), a statistical classifier with promising performance in biological applications [4], is based on the inner-class regression coefficient and allocates a new instance to the class with the highest regression coefficient, based on the within-class covariance and the total covariance. Formally, suppose k attributes, A = (A1, A2, ..., Ak), describe the dataset D with N samples. Each record of D is a vector of size k + 1, Xi = (xi1, xi2, ..., xik, Ci), such that xit ∈ At (1 ≤ t ≤ k), and the last component is the class label of the record, taking the value Yes or No. First, for the training phase of the LDA classification, some criteria, namely the mean, the covariance and the inverse of the covariance, are calculated separately over each class c; hence we compute the mean and covariance over each collection Dc separately. These two notions, along with the decision criteria, are defined as follows:


• Mean: The mean of class label c, denoted by µc, is defined as a vector of size k over all features, µc = (µA1, µA2, ..., µAk), where µAt(c) = (1/|Dc|) ∑_{xit ∈ Dc} xit. Hence, applying the mean equation µc = (1/|Dc|) ∑_{Xi ∈ Dc} Xi, the mean vector of each class is obtained, and the total mean vector is µ = 0.5 × µYes + 0.5 × µNo.
• Covariance: Covariance measures the joint variability of two variables, such that a greater value shows more similar behavior of the two variables. The relation c_ts = Cov_c(At, As) = (1/|Dc|) ∑_{i=1}^{|Dc|} ((xit − µAt(c)) × (xis − µAs(c))) gives the covariance of two attributes At and As in dataset Dc. Then, we aggregate the covariance of each class c to obtain the total covariance matrix, TCov = ∑_{c ∈ {Yes, No}} (nc/N) × Cov(c), where nc is the number of records labeled c.
• Decision Criteria: To decide to which class a new instance belongs, two criteria, named β and Z, are computed based on the inverse covariance matrix Cov⁻¹. The value of β for attribute At is β(At) = ∑_{s=1}^{k} c′_ts × (µAs(Yes) − µAs(No)), where c′_ts is the element in the t-th row and s-th column of the matrix Cov⁻¹. Applying β, the values of Zc and Z are obtained as Zc = ∑_{t=1}^{k} β(At) × µAt(c) and Z = ∑_{t=1}^{k} β(At) × ((µAt(Yes) + µAt(No))/2).
To classify a new instance, like X = (x1, x2, ..., xk), based on the constructed LDA, we first compute Z(X) = ∑_{t=1}^{k} β(At) × xt, and the label of the new sample is determined through the following conditional function:
  LDA(X) = Yes  if ZYes > ZNo and Z(X) > Z
  LDA(X) = No   if ZYes > ZNo and Z(X) ≤ Z
  LDA(X) = No   if ZYes ≤ ZNo and Z(X) > Z
  LDA(X) = Yes  if ZYes ≤ ZNo and Z(X) ≤ Z

2.2 Additive Homomorphic Encryption

Homomorphic Encryption (HE) is an encryption scheme capable of supporting mathematical operations, like sums, on ciphertexts, such that the decrypted result of the operations is equivalent to the situation where the computation had been executed on the plaintexts. In the current study, the Paillier additive homomorphic encryption is employed [11]. Let E_pk(·) and D_sk(·) represent the encryption function (with public key pk) and the decryption function (with secret key sk), respectively. For two messages m1, m2 and a scalar value e, the homomorphism is defined as D_sk(E_pk(m1) · E_pk(m2)) = m1 + m2 and D_sk(E_pk(m)^e) = e · m. The additive property of this system fulfills the operations required for the secure construction of the LDA classifier.
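To make the additive property tangible, here is a self-contained toy Paillier implementation in Java (insecure parameter handling, for illustration only); it uses the common simplifications g = n + 1 and λ = φ(n):

import java.math.BigInteger;
import java.security.SecureRandom;

class PaillierDemo {
    static final SecureRandom RND = new SecureRandom();
    final BigInteger n, n2, g, lambda, mu;

    PaillierDemo(int bits) {
        BigInteger p = BigInteger.probablePrime(bits / 2, RND);
        BigInteger q = BigInteger.probablePrime(bits / 2, RND);
        n = p.multiply(q);
        n2 = n.multiply(n);
        g = n.add(BigInteger.ONE);                       // standard choice g = n + 1
        lambda = p.subtract(BigInteger.ONE)
                  .multiply(q.subtract(BigInteger.ONE)); // φ(n); valid with g = n + 1
        mu = lambda.modInverse(n);
    }

    BigInteger encrypt(BigInteger m) {
        BigInteger r = new BigInteger(n.bitLength(), RND).mod(n).add(BigInteger.ONE);
        return g.modPow(m, n2).multiply(r.modPow(n, n2)).mod(n2);
    }

    BigInteger decrypt(BigInteger c) {
        BigInteger l = c.modPow(lambda, n2).subtract(BigInteger.ONE).divide(n);
        return l.multiply(mu).mod(n);
    }

    public static void main(String[] args) {
        PaillierDemo he = new PaillierDemo(512);
        BigInteger a = BigInteger.valueOf(42), b = BigInteger.valueOf(58);
        // Multiplying ciphertexts adds plaintexts: D(E(a) * E(b) mod n^2) = a + b.
        BigInteger sum = he.decrypt(he.encrypt(a).multiply(he.encrypt(b)).mod(he.n2));
        System.out.println(sum); // prints 100
    }
}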

3 Problem Statement and Framework

In general, a classification algorithm is more accurate over a richer amount of data. However, due to the sensitivity of the information, data holders are unwilling to share their original datasets. To this end, secure computation protocols are adopted as reliable tools to train a classifier over distributed data without revealing the original datasets. Assume


w ≥ 3 agents, say P1, P2, ..., Pw, are interested in sharing information to construct the LDA classifier on the whole of their data, securely, when the data is distributed horizontally. This means that each data holder has information about all the features, but for a different collection of objects. More precisely, each party holds the k attributes used to express each record. Therefore, each record is a (k + 1)-dimensional vector Xi = (xi1, xi2, ..., xik, Ci), where xij ∈ Aj (1 ≤ j ≤ k), and the last component, denoted by Ci, is the class label. To protect the confidential information in each dataset, we design a secure construction of the LDA classifier over distributed data. Figure 1 shows the architecture underlying the interactive privacy model, along with the main components and their interactions. First, the Coordinator generates the public (pk) and private (sk) keys; it sends the public key pk to the data holders P1, P2 and P3. After receiving the public key, the data holders P1, P2, P3 encrypt their data a, b, c (denoted as [a], [b], [c]), respectively. The first agent (P1) sends her encrypted value [a] to the second agent (P2). With the application of the additive homomorphic property of the Paillier cryptosystem, P2 sends [a + b] to P3. Similarly, P3 employs the Paillier encryption property and sends [a + b + c] to the Coordinator. At this step, the Coordinator, who is the only component holding the secret key, decrypts the final message and reveals the outcome to all agents. For the sake of simplicity, in the rest of this paper we represent the call of the Secure Sum Protocol on w numbers xi (1 ≤ i ≤ w) as SSP_{i=1}^{w} xi, where without loss of generality P1 is the initiator [11].

pk

a

pk

pk

c

b p1

[a]

[a+b+c]

p2

[a+b]

p3

Fig. 1. Reference architecture
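The secure-sum round of Fig. 1 can then be sketched as follows, reusing the PaillierDemo class above; in a real deployment the parties would hold only the public key, whereas here they share one object for brevity:

import java.math.BigInteger;

class SecureSumDemo {
    public static void main(String[] args) {
        PaillierDemo coordinator = new PaillierDemo(512); // holds sk; parties only need pk
        long[] privateInputs = {11, 22, 33};              // a, b, c of P1, P2, P3

        // P1 starts the round with [a].
        BigInteger running = coordinator.encrypt(BigInteger.valueOf(privateInputs[0]));
        for (int i = 1; i < privateInputs.length; i++) {
            // Each Pi multiplies in its own encryption: the running ciphertext
            // now hides the partial sum, unreadable without sk.
            running = running
                    .multiply(coordinator.encrypt(BigInteger.valueOf(privateInputs[i])))
                    .mod(coordinator.n2);
        }
        System.out.println(coordinator.decrypt(running)); // prints 66 = a + b + c
    }
}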

The two main criteria of LDA, i.e. the mean and the covariance, are computed securely over all data belonging to the same class label. That means that, first, each agent separates her own dataset into two smaller datasets based on the class labels Yes and No. Afterwards, through Algorithm 1, the agents run the secure sum protocol to obtain the mean and covariance over all equally labeled data. Obtaining the result of Algorithm 1, each party is able to calculate the values of the total and inverse covariance locally. At this stage, each party is able to construct the LDA classifier locally, applying the globally computed criteria. Upon receiving a new instance, each party is able to find the associated class label independently, since everyone knows the classifier structure.


Algorithm 1: LDACriteria()
Input: w datasets, described by attributes A1, ..., Ak, belonging to agents P1, ..., Pw; public key pk generated by the Coordinator
Output: mean, covariance, total mean, total covariance
1:  for ci ∈ {Yes, No} do
2:    for 1 ≤ j ≤ w do
3:      for 1 ≤ t ≤ k do
4:        Pj computes locally s_ijt = ∑_{x_jt ∈ D_ci} x_jt
5:        Pj computes locally n_ijt = "number of records in Dj labeled ci respecting attribute At"
6:        S_ij ← SSP_{j=1}^{w} s_ijt
7:        N_ij ← SSP_{j=1}^{w} n_ijt
8:        µAt(ci) ← S_ij / N_ij
9:      end
10:   end
11:   for 1 ≤ j ≤ w do
12:     for 1 ≤ t ≤ k do
13:       for 1 ≤ s ≤ k do
14:         c_ts = Cov_ci(At, As) ← SSP_{j=1}^{w} (x_ijt − µAt(ci))(x_ijs − µAs(ci)) / N_ij
15:       end
16:     end
17:   end
18: end
19: return µ_ci = (µA1, ..., µAk), covariance = [c_ts]

Security Analysis: The privacy model in this study is the semi-honest model, in which the participating parties follow the protocols but are curious. In this scenario, since the parties respect the protocols, collusion does not occur among them. In the presented algorithm, we applied the Paillier homomorphic encryption for the secure sum, which has been proven to be secure in [11]. In our architecture, the Coordinator is the only component who owns the secret key, and it only receives the sum of all values, without knowing the individual value of each agent. On the other hand, the parties do not own the secret key, so they are not capable of decrypting the other agents' inputs.

3.1 Optimization

To achieve higher efficiency in the secure LDA construction, we employ two approaches, named data packing (DP) and tree communication. Data packing utilizes the message space of the Paillier cryptosystem to decrease the communication cost by reducing the number of decryption operations [10]. To this end, as many messages as possible are packed into one single message space and transferred in each communication. This mechanism reduces the complexity overhead to 1/p compared to normal communication, where p is the number of messages packed into one message. Moreover, since decryption is a heavy operation, p messages are obtained through a single decryption instead of one. The communication model is the way in which parties pass messages to each other, e.g. in a ring, star or tree topology [19]. The selected communication model has a considerable effect on the computation cost of information aggregation. To clarify, in the ring


communication model, each agent sends her ciphertext to the next agent, whereas in the tree communication model the participating parties are connected through a topology similar to a binary tree. In this model, first all agents located at the first level (the leaves) simultaneously send their encrypted messages to their father agents. Then, the agents of the second level add up the received ciphertexts with their own ciphertext and send the resulting ciphertext to their father node. This process continues up to the root. Accordingly, the communication cost of the ring model, n, is reduced to log2(n) in the tree model.
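Data packing itself amounts to base-B positional encoding of p small values into one plaintext; the sketch below (ours, reusing PaillierDemo) uses 68-bit slots as in the paper's experimental setting and shows that the homomorphic product adds slot-wise, as long as no slot overflows B:

import java.math.BigInteger;

class DataPackingDemo {
    static final BigInteger B = BigInteger.ONE.shiftLeft(68); // 68-bit slots

    static BigInteger pack(long[] values) {
        BigInteger packed = BigInteger.ZERO;
        for (int i = values.length - 1; i >= 0; i--) {
            packed = packed.multiply(B).add(BigInteger.valueOf(values[i]));
        }
        return packed;
    }

    static long[] unpack(BigInteger packed, int count) {
        long[] out = new long[count];
        for (int i = 0; i < count; i++) {
            out[i] = packed.mod(B).longValueExact();
            packed = packed.divide(B);
        }
        return out;
    }

    public static void main(String[] args) {
        PaillierDemo he = new PaillierDemo(2048);    // plaintext space comfortably fits 3 slots
        BigInteger c1 = he.encrypt(pack(new long[]{1, 2, 3}));
        BigInteger c2 = he.encrypt(pack(new long[]{10, 20, 30}));
        BigInteger sum = c1.multiply(c2).mod(he.n2); // one homomorphic op adds all slots
        long[] result = unpack(he.decrypt(sum), 3);  // {11, 22, 33}
        System.out.println(java.util.Arrays.toString(result));
    }
}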

4 Experimental Analysis

Implementation Details: In our experiments, every operation is executed ten times and the average of the outputs is reported. The computer used for the experiments has an Intel Core i3-3556U CPU at 1.70 GHz, 64-bit, running Windows 8.1, with 8.0 GB DDR3 RAM and a 3-level cache of up to 2.0 MB. Our implementation code is available at "https://github.com/khodaparast/Private-LDA-". Dataset: To evaluate the efficiency of our algorithm through simulation, we created a synthetic dataset from the Pima Indians Diabetes dataset1, with eight integer attributes and one binary class label, with different numbers of instances varying from one million to 10 million. Since the number of features matters in the experimental analysis, we also consider datasets with a fixed 10 million records, but described with different numbers of features. Experiment: In the process of distributed LDA classification, we have four main operations: (1) encryption, 0.021 s; (2) decryption, 0.018 s; (3) message passing, 0.325 s; and (4) the addition of encrypted values, 0.252 s, with a 1024-bit key length. In our system, the sizes of a ciphertext and an attribute value are considered to be 1024 and 68 bits, respectively, so we are able to pack 15 attribute values into one ciphertext through the application of data packing. Figure 2 depicts the result of the secure LDA construction; the x-axis represents the number of records distributed equally among the agents (in millions). Figure 3 shows the experimental result of the LDA construction when the number of records is set to 10 million but the number of features varies from 10 to 200 (the x-axis shows the number of features). In both figures, the y-axis shows the total time elapsed to run Algorithm 1 in four different scenarios: (1) simple linear communication, (2) linear communication with data packing, (3) tree communication, (4) tree communication with data packing. As expected, in both experiments the linear model without data packing (red line) requires the maximum amount of time, while the tree model combined with data packing (green line) requires the minimum time. Comparing the results of the tree model (blue line) and data packing (yellow line), it can be inferred that data packing improves efficiency more than the tree model does. Moreover, changing the number of records only slightly affects the computation cost (Fig. 2), whereas the difference is considerably higher when the number of features changes (Fig. 3). To get a better insight, let us quantitatively report the impact of the number

https://www.kaggle.com/uciml/pima-indians-diabetes-database.


Fig. 2. Time (seconds) vs number of records (M)

Fig. 3. Time (seconds) vs number of features

of data and the number of features for six parties. In the former scenario, the average computation cost difference between the linear model and Tree-DP is 197 s, while in the latter scenario this difference reaches 31,748 s; in our extreme case, i.e. for 200 features, the difference is 109,859 s. This outcome results from the fact that the number of communications among agents depends on the number of features. Since the number of features has a considerable effect in our experiments, in Fig. 4 we report the communication cost (bandwidth usage) for constructing the LDA classifier when the number of features varies from 10 to 200. The results are reported for two scenarios, with data distributed among three and six parties. The number of bits transferred in the linear model (red line) grows exponentially as the number of features increases, while adding data packing to the linear model (blue line) leads to a logarithmic bandwidth increase. The average bandwidth-usage difference between the two models for three

Fig. 4. Transfer (bits) vs number of features


and six parties is 25,883,696 and 150,206,962 bits, respectively. This result shows a noticeable reduction of the communication cost under the application of data packing. It should be mentioned that the tree model has no effect on bandwidth usage. From the results of our experiments, it can be inferred that the application of an appropriate collaborative feature selection technique, prior to the secure classifier construction, would considerably reduce computation and communication costs in terms of runtime and bandwidth usage.

5 Related Work

In the literature, a set of works has been devoted to Privacy-Preserving Data Mining (PPDM) [2, 7]. However, to the best of our knowledge, no study has constructed the LDA classifier in private collaborative data mining tasks. Therefore, we investigate other classifiers with behavior similar to the LDA algorithm, e.g. linear and logistic regression. In a recent study [9], an efficient protocol has been proposed for three classifiers: linear regression, logistic regression, and neural networks. Among these classifiers, linear regression has the most analogous behavior to our approach, since it employs similar algebraic operations, like matrix multiplication. The authors proposed an efficient private protocol based on the stochastic gradient descent method to optimize the computation cost in the two-party data distribution model; an efficient vectorization method was proposed to improve matrix operations. Regarding the reported time, training the linear regression model shows a linear time in the sample size and dimension, whereas our model behaves logarithmically. In [3], secure logistic regression learning is addressed via homomorphic encryption. In that work, a Learning-With-Errors (LWE) encryption scheme is proposed, which deploys an optimized message passing technique similar to data packing. The results show that the time complexity is reduced from O(N) to O(N/t), where t is the size of the packed message and N the number of instances. The reported time shows a linear behavior, O(N), on datasets with fewer than 10 attributes; nevertheless, the time complexity is non-linear when the dimension grows, whereas our framework preserves its logarithmic time over any number of attributes. Several works have been presented in the privacy-preserving data mining field utilizing different approaches, like support vector machine (SVM) classification [14], Bayesian methods over multi-party data distribution [15], or tree-based classification models [13]. However, these works suffer from high communication and computation costs. In the literature, no research has worked on the LDA classifier in private settings; however, to get a better insight, we present the computation complexity of the construction of some private classifiers in Table 1. As can be observed, due to the different criteria in secure classifier construction, e.g. the number of parties, instances, features, and the cryptography model, we are not able to directly compare the efficiency of our approach with the other approaches. However, in the future direction of our research we plan to implement different classifiers with the same setting as the current study, and compare the results on various benchmark datasets.


Table 1. Summary of related work (N instances, K dimensions, C classes, W parties, F values of a feature)

Reference | Data distribution       | Parties             | Complexity
[3]       | Vertical and horizontal | Two and multi party | O((N/t)·K²)
[9]       | Vertical and horizontal | Two and multi party | O(N·K²)
[14]      | Vertical                | Multi party         | Not mentioned
[16]      | Vertical                | Two and multi party | O(C·K·F·W²)
[17]      | Horizontal              | Multi party         | O(N²·K²·C)
[18]      | Vertical and horizontal | Multi party         | Not mentioned

6 Conclusion In this study we proposed an efficient framework for the private construction of LDA classifier over encrypted input. The proposed approach can be applied for multi parties to train a classifier on whole of their data, without revealing their original dataset. To improve the efficiency, we applied two approaches based on the “model of communication” and the way that “message passes”. Through several experimental analyses, we show that the proposed approaches reduce computation and communication costs 35 and 15 times, respectively. As a direction for future research, we plan to solve the problem when data is distributed between only two parties. Moreover, we plan to compare the results of the current study with the result of application of another cryptosystem, e.g. secret sharing, in secure communication. In addition, applying proposed framework on different datasets is considered. We will also plan to perform our approach using other classifiers and verify the effect of selected data mining algorithms on time complexity. Acknowledgment. This work was supported by H2020 EU funded project C3ISP [GA #700294].

References
1. Abouelmehdi, K., Hssane, A.B., Khaloufi, H.: Big healthcare data: preserving security and privacy. J. Big Data 5, 1 (2018)
2. Aggarwal, C.C., Yu, P.S.: Privacy-preserving data mining: a survey. In: Gertz, M., Jajodia, S. (eds.) Handbook of Database Security – Applications and Trends, pp. 431–460. Springer, Boston (2008)
3. Aono, Y., Hayashi, T., Phong, L.T., Wang, L.: Scalable and secure logistic regression via homomorphic encryption. In: Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy, CODASPY 2016, New Orleans, LA, USA, 9–11 March 2016, pp. 142–144 (2016)
4. Cui, J., Xu, Y.: Three dimensional palmprint recognition using linear discriminant analysis method. In: Second International Conference on Innovations in Bio-inspired Computing and Applications, IBICA 2011, Shenzhen, China, 16–18 December 2011, pp. 107–111 (2011)

74

F. Khodaparast et al.

5. Dinh, D.T., Huynh, V.N., Le, B., Fournier-Viger, P., Huynh, U., Nguyen, Q.M.: A survey of privacy preserving utility mining. In: Fournier-Viger, P., Lin, J.W., Nkambou, R., Vo, B., Tseng, V. (eds.) High-Utility Pattern Mining, pp. 207–232. Springer, Cham (2019) 6. Han, J., Kamber, M., Pei, J.: Data Mining: Concepts and Techniques, 3rd edn. Morgan Kaufmann, Burlington (2011) 7. Khodaparast, F., Sheikhalishahi, M., Haghighi, H., Martinelli, F.: Privacy preserving random decision tree classification over horizontally and vertically partitioned data. In: 2018 IEEE 16th International Conference on Dependable, Autonomic and Secure Computing, Athens, Greece, 12–15 August 2018, pp. 600–607 (2018) 8. Martinelli, F., Saracino, A., Sheikhalishahi, M.: Modeling privacy aware information sharing systems: a formal and general approach. In: 2016 IEEE Trustcom/BigDataSE/ISPA, Tianjin, China, 23–26 August 2016, pp. 767–774 (2016) 9. Mohassel, P., Zhang, Y.: SecureML: a system for scalable privacy-preserving machine learning. In: 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, 22–26 May 2017, pp. 19–38 (2017) 10. Nateghizad, M., Erkin, Z., Lagendijk, R.L.: An efficient privacy-preserving comparison protocol in smart metering systems. EURASIP J. Inf. Secur. 2016, 11 (2016) 11. Paillier, P.: Public-key cryptosystems based on composite degree residuosity classes. In: Advances in Cryptology - EUROCRYPT 1999, International Conference on the Theory and Application of Cryptographic Techniques, Proceeding, Prague, Czech Republic, 2–6 May 1999, pp. 223–238 (1999) 12. Shao, X., Li, H., Wang, N., Zhang, Q.: Comparison of different classification methods for analyzing electronic nose data to characterize sesame oils and blends. Sensors 15(10), 26726–26742 (2015) 13. Sheikhalishahi, M., Martinelli, F.: Privacy preserving clustering over horizontal and vertical partitioned data. In: 2017 IEEE Symposium on Computers and Communications, ISCC 2017, Heraklion, Greece, 3–6 July 2017, pp. 1237–1244 (2017) 14. Sun, L., Mu, W., Qi, B., Zhou, Z.: A new privacy-preserving proximal support vector machine for classification of vertically partitioned data. Int. J. Mach. Learn. Cybern. 6(1), 109–118 (2015) 15. Vaidya, J., Clifton, C.: Privacy preserving Na¨ıve Bayes classifier for vertically partitioned data. In: Proceedings of the Fourth SIAM International Conference on Data Mining, Lake Buena Vista, Florida, USA, 22–24 April 2004, pp. 522–526 (2004) 16. Vaidya, J., Clifton, C., Kantarcioglu, M., Patterson, A.S.: Privacy-preserving decision trees over vertically partitioned data. TKDD 2(3), 14:1–14:27 (2008) 17. Xiao, M., Huang, L., Luo, Y., Shen, H.: Privacy preserving ID3 algorithm over horizontally partitioned data. In: Sixth International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT 2005), Dalian, China, 5–8 December 2005, pp. 239– 243 (2005) 18. Xu, K., Yue, H., Guo, L., Guo, Y., Fang, Y.: Privacy-preserving machine learning algorithms for big data systems. In: 35th IEEE International Conference on Distributed Computing Systems, ICDCS 2015, Columbus, OH, USA, 29 June–2 July 2015, pp. 318–327 (2015) 19. Yang, L., Xue, H., Li, F.: Privacy-preserving data sharing in smart grid systems. In: 2014 IEEE International Conference on Smart Grid Communications, SmartGridComm 2014, Venice, Italy, 3–6 November 2014, pp. 878–883 (2014)

Improving Parallel Data Mining for Different Data Distributions in IoT Systems

Ivan Kholod¹, Andrey Shorov¹, and Sergei Gorlatch²

¹ Saint Petersburg Electrotechnical University “LETI”, Saint Petersburg, Russia
[email protected], [email protected]
² University of Muenster, Muenster, Germany
[email protected]

Abstract. We aim at improving the distributed implementation of data mining algorithms in modern Internet of Things (IoT) systems. The idea of our approach is to perform as many computations as possible at local IoT nodes, rather than transferring data for processing at a central compute cluster as in current solutions based on MapReduce. We study different kinds of data distributions between the nodes of IoT and adapt the structure of the implementation correspondingly. Our formally-based approach ensures the correctness of the obtained parallel implementation. We implement our approach in the Java-based data mining library DXelopes, and we illustrate the approach with the popular Naive Bayes algorithm. Experiments confirm that our approach significantly reduces the application run time.

Keywords: Distributed algorithms · Data mining · Internet of Things (IoT)

1 Introduction

As an important trend in current information technology, the Internet of Things (IoT) [21] combines a high number of smart devices that produce large volumes of distributed data. A popular example of an IoT system is a Remote Monitoring System (RMS) for controlling objects of large and complex systems such as networks, factories, airports, and others. Figure 1 presents the typical architecture of IoT widely used in many application domains. Usually IoT has three layers of hierarchy [3,21]: the device layer, the middle layer, and the application layer. The system in Fig. 1 receives data from different devices that monitor multiple objects (hexagons in Fig. 1). The devices are connected with a large number of data storage nodes at the middle layer. There are two possible cases of how data are distributed in the system:

• Vertical distribution shown in Fig. 1(a): data are gathered at each storage node from the devices of the same type that control all objects (for example, temperature sensors, pressure sensors, etc.).


Fig. 1. Example of IoT with: (a) vertically distributed data; (b) horizontally distributed data

• Horizontal distribution shown in Fig. 1(b): data are gathered at each storage node from devices of all types that control subsets of objects (for example, objects located in different regions).

Since storage nodes are often low-cost and have low computational power, data are processed on a powerful compute cluster at the application layer. For this, all data from the data storage nodes are gathered into a central data warehouse. The compute cluster and the data warehouse are often combined in a Cloud [12]. The important drawbacks of gathering data in a central data warehouse are the increase in processing time and network traffic, and a risk of unauthorized access to the data. The Fog computing paradigm [6] is viewed as a potential solution to these problems. Fog moves data processing nearer to the sources where IoT data are produced. The idea of our approach is to optimize data mining algorithms following the paradigm of Fog computing. We demonstrate that such optimization of a data mining algorithm strongly depends on the type of data distribution. The paper extends our work on parallelizing algorithms for modern multi-core processors [17].

2 Related Work

Nowadays, there are several platforms offered by leading IT vendors that provide data mining services for IoT systems: Azure Machine Learning [11] from Microsoft, Amazon Machine Learning [4], the Cloud Machine Learning platform [9] from Google, and Watson Analytics [18] from IBM. The common feature of the current data mining frameworks for IoT is that they are based on the MapReduce programming model [7]. This model uses an abstraction inspired by the primitives map and reduce, which are popular as patterns in parallel and functional programming [10] and ensure potentially high performance. Figure 2(a) shows how a data mining algorithm is usually executed in IoT systems when using the MapReduce model. Here, the middle layer is responsible for connecting IoT devices (such as sensors or cameras) with a Cloud. These connections often negatively impact the performance of data mining, because they create intensive network traffic, increase the delay between obtaining data at devices and processing the data, and increase the risk of unauthorised access to sensitive data.

Fig. 2. Variants of data mining algorithms applied to distributed IoT data: (a) traditional approach using the MapReduce model; (b) suggested approach following the Fog paradigm.

We aim at using the paradigm of Fog computing [6] for overcoming these problems. In the Fog, data are processed closer to their sources, thus enabling low latency and context awareness. Despite the popularity of Fog computing, there are still no ready solutions for its implementation in the context of data mining. There are studies in data mining using Fog computing for particular algorithms and only one data distribution per algorithm. Examples are the k-means clustering algorithm for vertically [22] and horizontally [8,16] distributed data, as well as an optimized association rule mining algorithm [20] for horizontal data distribution. However, these studies do not address the problem of how a distributed data mining implementation can take into account different types of input data distribution. Furthermore, these individual approaches for a particular combination of algorithm and data distribution have high complexity and require a lot of effort from the developer. Figure 2(b) shows our suggested approach that executes a data mining algorithm in accordance with the principles of Fog computing: we execute significant parts of a data mining algorithm at the storage nodes, instead of transferring data to the central compute cluster and processing them there. In our approach, the cluster receives intermediate results after data processing at the storage nodes. Our approach to distributed data mining optimizes the structure of a data mining algorithm according to the type of data distribution: horizontal or vertical.

3 The Formal Functional Approach

3.1 Parallel Data Mining Algorithm on Distributed Data

In our approach, we view a data mining algorithm as a function that takes a data set d ∈ D as input and creates from it a mining model μ ∈ M as output:

dma : D → M    (1)

We represent a data set as a 2-dimensional array (data matrix), e.g., for m objects that are described by p characteristics [13]:

d = (x_{j.k})_{j=1..m, k=1..p}    (2)

where x_{j.k} is the value of the k-th characteristic of the j-th object. In the case of distributed storage, data divided among storage nodes are represented as parts of the data matrix (2):

d = d_1 ∪ · · · ∪ d_s    (3)

where data sub-matrix d_h, h = 1..s, is located at the h-th storage node. In this case, the following options of data distribution are possible:

• horizontal data distribution: d_1 = (x_{j.k})_{j=1..y, k=1..p}, …, d_s = (x_{j.k})_{j=x+1..m, k=1..p}
• vertical data distribution: d_1 = (x_{j.k})_{j=1..m, k=1..g}, …, d_s = (x_{j.k})_{j=1..m, k=r+1..p}
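The two splitting schemes can be made concrete with a short sketch (Python with NumPy is used here purely for illustration; the paper's own implementation is the Java-based DXelopes library):

```python
import numpy as np

# A toy data matrix d with m = 6 objects (rows) and p = 4 characteristics (columns)
d = np.arange(24).reshape(6, 4)

# Horizontal distribution: each storage node holds all characteristics
# for a subset of the objects (split along rows)
horizontal_parts = np.array_split(d, 2, axis=0)   # d1 = rows 0..2, d2 = rows 3..5

# Vertical distribution: each storage node holds all objects
# for a subset of the characteristics (split along columns)
vertical_parts = np.array_split(d, 2, axis=1)     # d1 = cols 0..1, d2 = cols 2..3
```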

Since algorithms are usually a sequence of steps, we formally represent a data mining algorithm as a sequential composition of functions, as follows:

dma = f_n ◦ f_{n−1} ◦ … ◦ f_1 ◦ f_0    (4)

where function f_0 : D → M takes a data set d ∈ D and returns a mining model μ_0 ∈ M; functions f_t : M → M, t = 1..n, take the mining model μ_{t−1} ∈ M created by the previous function f_{t−1} and return the changed mining model μ_t ∈ M. They are called Functional Mining Blocks (FMBs). To apply some function f_t to each element of d ∈ D, we invoke it in a loop:

• loopc applies f_t to the columns of d ∈ D, from index i_s to index i_e:

  loopc : I → I → (M → M) → D → M → M
  loopc i_s i_e f_t d μ = (f_t d[∗, i_e]) ◦ · · · ◦ (f_t d[∗, i_s]) μ    (5)

• loopr applies f_t to the rows of d ∈ D, from index i_s to index i_e:

  loopr : I → I → (M → M) → D → M → M
  loopr i_s i_e f_t d μ = (f_t d[i_e, ∗]) ◦ · · · ◦ (f_t d[i_s, ∗]) μ    (6)
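A minimal functional sketch of loopr and loopc, assuming an FMB is any function that takes a row (or column) slice and a model and returns the updated model; Python uses 0-based indices, whereas the paper's notation is 1-based:

```python
def loopr(i_s, i_e, f_t, d, mu):
    """Apply FMB f_t to rows i_s..i_e of d, threading the mining model mu,
    as in (6): the model is folded through the FMB applications."""
    for j in range(i_s, i_e + 1):
        mu = f_t(d[j], mu)                    # f_t d[j, *] applied to the model
    return mu

def loopc(i_s, i_e, f_t, d, mu):
    """Apply FMB f_t to columns i_s..i_e of d, as in (5)."""
    for k in range(i_s, i_e + 1):
        mu = f_t([row[k] for row in d], mu)   # f_t d[*, k]
    return mu

# Example FMB: add the sum of the received slice to a numeric model
fmb = lambda slice_, mu: mu + sum(slice_)
d = [[1, 2], [3, 4]]
print(loopr(0, 1, fmb, d, 0))   # 10: both rows processed
print(loopc(0, 1, fmb, d, 0))   # 10: both columns processed
```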


Our alternative to the traditional MapReduce approach is to execute FMBs on the storage nodes and transfer only the mining models created by them. In addition, the functions that are parts of the data mining algorithm are executed in parallel in our approach, i.e., we have parallel execution with distributed memory. The higher-order function paralleld expresses the parallel execution of FMBs on distributed memory:

paralleld : [(M → M)] → M → M    (7)

paralleld [f_r, …, f_s] μ = join μ (forkd [f_r, …, f_s] μ), where function forkd invokes the FMBs in parallel:

forkd : [M → M] → M → [M]
forkd [f_r, …, f_s] μ = [f_r (copy μ), …, f_s (copy μ)]    (8)

and function copy creates copies of the mining model μ in separate areas of the distributed memory for parallel processing by FMBs. Function join in (7) combines the mining models built by the parallel FMBs in separate areas of distributed memory. The implementation of join depends on the mining model's elements. To ensure the correctness of the parallel execution of FMBs on distributed memory, we employ Bernstein's conditions [5]: two FMBs can be executed in parallel if there is no dependency of any of the following types: flow dependency, output dependency, and anti-dependency. Bernstein's conditions are sufficient, but not necessary. We can weaken them for distributed memory, as follows. During parallel execution on distributed memory, FMBs f_t and f_{t+1} receive a copy of the mining model μ_{t−1} constructed by FMB f_{t−1} and copied by the copy function in (8). Hence, FMB f_t does not use a mining model that is modified by FMB f_{t+1}, i.e. there is no anti-dependency. As a result, FMBs f_t and f_{t+1} create different instances of a mining model, μ_{t−1} and μ_t, which are merged by function join. Therefore, if there exists a function join for mining models μ_{t−1} and μ_t such that the following condition holds:

(f_{t+1} ◦ f_t) μ_{t−1} = join μ_{t−1} [(f_t copy μ_{t−1}), (f_{t+1} copy μ_{t−1})]    (9)

then the result of parallel execution is correct.
For iterative processing of the rows and columns of data matrix d, the loops loopc (5) and loopr (6) are used as nested. There are two possible orders of applying an FMB f_t to the elements of data matrix d: by columns, (loopc 1 p (loopr 1 m f_t)) d, and by rows, (loopr 1 m (loopc 1 p f_t)) d. Adjusting the execution order of the nested loops loopr and loopc according to the type of data distribution can be achieved by the loop interchange operation which is often used in compilers [1]: it changes the order of iteration variables in nested loops. If the outer loop loopc invokes not only the inner loop loopr but also a composition of FMBs f_{t−1} and f_{t+1}, for example loopc 1 p (f_{t−1} ◦ (loopr 1 m f_t) ◦ f_{t+1}), then it is necessary to perform the loop fission operation before the interchange operation. This operation is also widely used in compilers [1]. It breaks down a loop into several loops, each of which has the same index boundaries but contains only a part of the initial loop's body:

loopc 1 p (f_{t−1} ◦ (loopr 1 m f_t) ◦ f_{t+1}) = (loopc 1 p f_{t−1}) ◦ (loopc 1 p (loopr 1 m f_t)) ◦ (loopc 1 p f_{t+1}).

Applying first the fission operation and then the interchange operation to the nested loops loopc and loopr allows us to adjust the data mining algorithm's structure to the type of data distribution in order to reduce the number of invocations of function paralleld and the total run time of the algorithm. The next section describes an example of such transformation.
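For intuition, a sketch of the fork/join scheme (7)–(8) for a counting-style mining model follows; this simulates distributed memory with dictionary copies on one machine and is not the DXelopes API. For such models, join is a key-wise sum of the partial increments, which satisfies condition (9):

```python
from copy import deepcopy

def fork_d(fmbs, mu):
    """Run each FMB on its own copy of the model (separate memory areas)."""
    return [f(deepcopy(mu)) for f in fmbs]

def join(mu, partial_models):
    """Merge counting models: keep the base counts from mu and add each
    partial model's increments on top of them."""
    merged = dict(mu)
    for part in partial_models:
        for key, count in part.items():
            merged[key] = merged.get(key, 0) + (count - mu.get(key, 0))
    return merged

def parallel_d(fmbs, mu):
    return join(mu, fork_d(fmbs, mu))

# Example: two FMBs counting disjoint halves of a distributed data set
count_a = lambda m: {**m, "x": m.get("x", 0) + 3}
count_b = lambda m: {**m, "x": m.get("x", 0) + 2, "y": 1}
print(parallel_d([count_a, count_b], {"x": 1}))   # {'x': 6, 'y': 1}
```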

3.2 Illustration for the Naive Bayes Algorithm

We use the Naive Bayes (NB) algorithm [14] to illustrate the described parallelization of an algorithm over data distributed as in (3). Figure 3 shows the sequential pseudocode of the NB algorithm. Function f_0 initializes the mining model (lines 1–3). The other FMBs implement the remaining lines as follows:

• f_1 is the loop over the data set's rows (line 4): f_1 = loopr 1 m (f_3 ◦ f_2);
• f_2 increments the number of rows with value d[j, p] of the p-th column (line 5): f_2 = μ[d[j, p]]++;
• f_3 is the loop over the data set's columns (line 6): f_3 = loopc 1 (p−1) f_4;
• f_4 increments the number of rows with value d[j, k] of the k-th column and value d[j, p] of the p-th column (line 7): f_4 = μ[ϕ(k−1) + (d[j, k]−1)·ϕ(0) + d[j, p]]++.

Fig. 3. The Naive Bayes algorithm: pseudocode.

The following composition of these functions represents the NB algorithm:

NB = (loopr 1 m ((loopc 1 (p−1) f_4 d) ◦ f_2) d) ◦ f_0    (10)
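For readers less used to the functional notation, a direct Python transcription of composition (10) for a small integer-coded data set may help; the class label is assumed to be in the last column, and the model is simplified to a dictionary of counters rather than the flat array indexed via ϕ in Fig. 3:

```python
def f0():
    return {"class": {}, "joint": {}}          # lines 1-3: empty counters

def f2(row, mu):
    mu["class"][row[-1]] = mu["class"].get(row[-1], 0) + 1   # line 5
    return mu

def f4(row, k, mu):
    key = (k, row[k], row[-1])                 # line 7: count (feature, value, class)
    mu["joint"][key] = mu["joint"].get(key, 0) + 1
    return mu

def naive_bayes_counts(d):
    mu = f0()
    for row in d:                              # f1 = loopr 1 m (f3 . f2)
        mu = f2(row, mu)
        for k in range(len(row) - 1):          # f3 = loopc 1 (p-1) f4
            mu = f4(row, k, mu)
    return mu

# Example: two features plus a class label in the last column
data = [[1, 2, 0], [1, 1, 1], [2, 2, 0]]
print(naive_bayes_counts(data))
```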

We verify that the outer loop and the type of data distribution correspond to each other in (10). Next, we verify condition (9) for loop loopr. The mining model of NB contains the numbers of rows with different values. Parallel function loopr calculates these numbers for rows that are located at different storage nodes, such that function join summarises them. Thus, loopr in (10) can be executed in parallel on distributed memory. In the final step, we transform the sequential form of the NB algorithm into a distributed form by inserting function paralleld:

NBHPar = (paralleld [loopr 1 m ((loopc 1 (p−1) f_4) ◦ f_2)] d) ◦ f_0    (11)

Figure 4 shows how the FMBs of composition (11) are deployed on storage nodes.

Fig. 4. Distributed execution of the Naive Bayes algorithm for horizontally distributed data.

For vertically distributed data, we transform expression (10), because the outer loop loopc does not agree with the type of data distribution:

NBVer = f_7 ◦ f_5 ◦ f_0 = (loopc 1 (p−1) (loopr 1 m f_4 d) d) ◦ (loopr 1 m f_2 d) ◦ f_0    (12)

where:

• f_5 is the loop over the data set's rows with FMB f_2: f_5 = loopr 1 m f_2;
• f_6 is the loop over the data set's rows with FMB f_4: f_6 = loopr 1 m f_4;
• f_7 is the loop over the data set's columns: f_7 = loopc 1 (p−1) f_6.

Next, we verify condition (9) for loop loopc. Parallel function loopc calculates the numbers of rows for columns located at different storage nodes; function join combines them. Thus, loopc in (12) can be executed in parallel on distributed memory:

NBVPar = (paralleld [(paralleld [loopc 1 (p−1) (loopr 1 m f_4)]), (loopr 1 m f_2)]) ◦ f_0    (13)

Figure 5 shows how the FMBs of composition (13) are deployed on the storage nodes.


Fig. 5. Distributed execution of the Naive Bayes algorithm for vertically distributed data.

4 Experiments and Results

We implement the two parallel NB variants (11) and (13) in the Java-based library DXelopes [19]. In our experiments, we use the real-world data set Predict Outcome of Pregnancy from the Kaggle Datasets [15]. Table 1 shows the characteristics of this data set and how the data are partitioned in our experiments. The data set is split into 2 and then 4 parts and distributed between two and then four storage nodes, respectively. In each case, the data set is distributed both horizontally and vertically.

Table 1. Distributed data sets

Type of distribution | Number of data sets | Number of rows | Number of columns | Size of data set (Mb)
Horizontal           | 4                   | 3 402 670      | 68                | 500
Vertical             | 4                   | 14 461 451     | 17                | 500
Horizontal           | 2                   | 6 805 350      | 68                | 1000
Vertical             | 2                   | 14 461 451     | 34                | 1000
Single data set      | 1                   | 14 461 451     | 68                | 2000

We compare our results with Apache Spark MLlib. The Apache Spark Machine Learning Library (MLlib) [2] is a scalable data mining library for the Apache Spark platform. It contains common learning algorithms, including the NB algorithm. In our experiments, Apache MLlib is executed on a computational node with 1, 2 and 4 cores. Figure 6 shows the run time for horizontally (Fig. 6(a)) and vertically (Fig. 6(b)) distributed data. We compare the two variants of the parallel NB algorithm adapted to horizontally (NBHPar) and vertically (NBVPar) distributed data. We observe that the run time of NBHPar is lower than that of NBVPar for horizontally distributed data. On the other hand, the run time of NBVPar is lower than that of NBHPar for vertically distributed data. This is explained by the mismatch between the structure of the algorithm and the type of data distribution, which increases the overhead of distributed execution (the number of invocations of function paralleld).

Fig. 6. Run time of the NB algorithm on 1, 2 and 4 storage nodes for (a) horizontally distributed data; (b) vertically distributed data.

We observe that the difference between the two variants is larger for vertically than for horizontally distributed data. This can be explained by the larger number of rows than columns, so that the number of invocations of functions copy and join in NBHPar for vertically distributed data is very large. We also observe that the difference between the two variants is reduced with a smaller number of storage nodes. This can likewise be explained by the decreased number of invocations of functions copy and join when the algorithm is executed on fewer storage nodes. Our implementation, adapted to the type of data distribution, shows a better run time than the NB implementation in Apache Spark MLlib, which confirms the advantage of our approach.

5 Conclusion

In this paper, we improve the parallel implementation of data mining algorithms in modern Internet of Things (IoT) systems. Our approach formally transforms a high-level functional representation of a data mining algorithm into a parallel implementation that performs as many computations as possible at local nodes, rather than transferring data for processing at a central compute cluster as done in current solutions based on MapReduce. We study different kinds of data distributions between the nodes of IoT and adapt the structure of the algorithm correspondingly. Thereby our parallel implementation of data mining avoids the main disadvantages of MapReduce in the context of IoT: increased total processing time, high network traffic, and a risk of unauthorized access to the data.


Our experiments on a real-world data set confirm that the flexible adaptation to the type of data distribution provided by our approach significantly reduces the network traffic and the application run time.

Acknowledgements. This work was supported by the Ministry of Education and Science of the Russian Federation in the framework of the state order “Organization of Scientific Research”, task 2.6113.2017/6.7, by the RFBR according to the research project 19-07-00784, and by the German Ministry of Education and Research (BMBF) in the framework of project HPC2SE at the University of Muenster.

References

1. Allen, R., Kennedy, K.: Optimizing Compilers for Modern Architectures. Morgan Kaufmann, Burlington (2002)
2. Apache Spark. http://spark.apache.org. Accessed 19 June 2019
3. Atzori, L., Iera, A., Morabito, G.: The Internet of Things: a survey. Comput. Netw. 54(15), 2787–2805 (2010)
4. Barr, J.: Amazon Machine Learning – Make Data-Driven Decisions at Scale. https://aws.amazon.com/blogs/aws/amazon-machine-learning-make-datadriven-decisions-at-scale. Accessed 19 June 2019
5. Bernstein, A.J.: Program analysis for parallel processing. IEEE Trans. Electron. Comput. 15, 757–762 (1966)
6. Bonomi, F., et al.: Fog computing and its role in the Internet of Things. In: MCC, pp. 13–16 (2012)
7. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: Proceedings of Operating Systems Design and Implementation, San Francisco, CA (2004)
8. Geetha, J., Pillaipakkamnatt, K., Wright, R.N.: A new privacy-preserving distributed k-clustering algorithm. SDM (2006)
9. Google Cloud Machine Learning at Scale. https://cloud.google.com/products/machine-learning. Accessed 19 June 2019
10. Gorlatch, S., Cole, M.: Parallel skeletons. In: Padua, D. (ed.) Encyclopedia of Parallel Computing, pp. 1417–1422. Springer, Boston (2011)
11. Gronlund, C.J.: Introduction to machine learning on Microsoft Azure. https://azure.microsoft.com/en-gb/documentation/articles/machine-learning-what-is-machine-learning. Accessed 19 June 2019
12. Gubbi, J., et al.: Internet of Things (IoT): a vision, architectural elements, and future directions. Future Gener. Comput. Syst. 29(7), 1645–1660 (2013)
13. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York (2001)
14. John, G.H., Langley, P.: Estimating continuous distributions in Bayesian classifiers. In: Eleventh Conference on Uncertainty in Artificial Intelligence, pp. 338–345 (1995)
15. Kaggle. Dataset: Predict Outcome of Pregnancy. https://prudsys.de/en/knowledge/technology/prudsys-xelopes/. Accessed 19 June 2019
16. Kholod, I., Kuprianov, M., Petukhov, I.: Distributed data mining based on actors for Internet of Things. In: MECO, pp. 480–484 (2016)
17. Kholod, I., Shorov, A., Titkov, E., Gorlatch, S.: A formally based parallelization of data mining algorithms for multi-core systems. J. Supercomput. (2018). https://doi.org/10.1007/s11227-018-2473-8
18. Lally, A., et al.: Question analysis: how Watson reads a clue. IBM J. Res. Dev. 56(3.4), 2–11 (2012)
19. Prudsys Xelopes. https://de.wikipedia.org/wiki/XELOPES. Accessed 19 June 2019
20. Sunil Kumar, C., Santosh Kumar, P.N., Venugopal, C.: An apriori algorithm in distributed data mining system. Global J. Comput. Sci. Technol. Softw. Data Eng. 13(12) (2013)
21. Tsai, C.-W., Lai, C.-F., Vasilakos, A.V.: Future Internet of Things: open issues and challenges. Wireless Netw. 20(8), 2201–2217 (2014)
22. Vaidya, J., Clifton, C.: Privacy-preserving k-means clustering over vertically partitioned data. In: Proceedings of 9th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2003)

An Experiment on Automated Requirements Mapping Using Deep Learning Methods

Felix Petcuşin, Liana Stănescu, and Costin Bădică

University of Craiova, Craiova, Romania
[email protected], {stanescu_liana,cbadica}@software.ucv.ro

Abstract. Requirements engineering is one of the critical activities in systems development for the Automotive Industry. Its outcome is most often represented by a set of documents capturing requirements specifications in natural language. For quality assurance and maturity support of the final products, the requirements must be verified and validated at different testing levels. To achieve this, the requirements are manually labelled to indicate the corresponding testing level. The number of requirements can vary from a few hundred in smaller projects to several thousand in larger projects. Their manual labeling is time consuming and error-prone, thus sometimes incurring an unacceptably high cost. In this paper we report our initial results on the automated classification of requirements into two classes, “Integration Test” and “Software Test”, using Machine Learning approaches. Our solution could help requirements engineers by speeding up the classification of requirements and thus reducing the time to market of final products.

Keywords: Deep Learning · Requirements engineering · Automotive Industry

1 Introduction

In the automotive industry, the functionality of electronic components is becoming more and more complex at a very fast rate – about one third of development costs is spent on electric/electronic development nowadays, and this share is still increasing. At the same time, many components are developed, integrated and tested over a sequence of prototyping phases. Furthermore, components are developed in several slightly different variants and according to different schedules. Therefore, the specification activities have reached a level of complexity that exceeds the limits of what can be reasonably supported by conventional human-driven text processing systems [12]. Most system development projects (see Fig. 1) contain a separate stage dedicated to requirements specification, as part of their standard process. The aim of the requirements specification is to define the properties that a system must fulfill. Customer Requirements are received by the project supplier and further refined into Software and Hardware requirements. All functional requirements need to be verified and validated against the implementation. Testing plays an extremely important role in the automotive industry. Depending on the content of the requirements information, software requirements are usually classified as “Integration Test” or “Software Test” requirements. Based on this classification, specific testing techniques and methods are applied. The classification of requirements is an activity that is usually performed by the requirements engineer after the requirements are formally captured in written form. This process is mainly carried out by manual labelling, thus being time consuming and error-prone.

Fig. 1. System development process.

In this paper we propose a new approach to automatically classify software requirements into “Integration Test” and “Software Test” requirements. The presented approach can be used by requirements engineers and/or testing engineers to map requirements to the corresponding testing level. Our proposed approach is based on recent achievements and research results in software technologies for Machine Learning and Text Processing. As the starting point of our analysis we have focused on the practical application of deep learning and word embedding methods to our classification problem. In this paper we present the initial results that we obtained by utilizing the Keras Deep Learning framework [3,4], as well as the GloVe algorithm for vector representations of words [11]. We have applied a combination of both methods on a dataset of 2000 requirements from the automotive area. In our experiments with Deep Learning and the baseline representation of text we obtained an accuracy of 0.84, while combining deep learning with word embeddings allowed us to improve the accuracy to 0.86.

2 Background and Related Works

2.1 Deep Learning, Neural Networks, and Word Representations

Neural networks (NNs hereafter) have a long tradition in AI. Nevertheless, it is only quite recently, during the last decade, that they started to capture the attention of mainstream AI research and applications. Modern AI places current NN research under the wider umbrella of deep learning (DL hereafter), which itself is considered a branch of machine learning (ML hereafter). Basically, DL covers not only NNs, but also the whole set of rich and deep hierarchical quantitative representations currently used in ML research and practice. Today DL has many practical applications in image recognition, computer vision, activity recognition, business intelligence, natural language processing, and text mining. A recent survey [1] mentions DL as one of the five fundamental categories for organizing computational intelligence methods for semantic text classification. DL-based approaches became popular in recent years, especially starting with 2006. This survey mentions two types of building blocks for NNs used in text classification: auto-encoders and Restricted Boltzmann Machines. Essential in DL is the use of multiple interconnected layers combined with linear and non-linear connections, aiming to provide hierarchical learning capabilities. Experimental results have highlighted the capability of DL to digest larger amounts of data and to produce better accuracy than traditional ML methods. DL is often applied in natural language processing in conjunction with word representations using vector space models, also known as word embeddings. The idea of such approaches is to capture semantics by mapping each word to a numerical vector such that similar or related words are mapped to close vectors, according to a distance measure. The mapping is produced by learning algorithms applied to large natural language corpora. Classic approaches are GloVe [11] and word2vec [8].
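The closeness of related words can be illustrated in a few lines of Python; the vectors below are made-up toy values, not actual GloVe or word2vec entries:

```python
import numpy as np

# Toy 4-dimensional "embeddings" (real GloVe vectors have 50-300 dimensions)
emb = {
    "car":   np.array([0.9, 0.1, 0.3, 0.0]),
    "truck": np.array([0.8, 0.2, 0.4, 0.1]),
    "apple": np.array([0.1, 0.9, 0.0, 0.7]),
}

def cosine(u, v):
    """Cosine similarity, the usual distance measure for word vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(emb["car"], emb["truck"]))  # high: related words
print(cosine(emb["car"], emb["apple"]))  # low: unrelated words
```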

2.2 Related Works

Within the last years, a number of machine learning (ML) algorithms have been utilised to solve a series of particularly difficult problems in the requirements engineering (RE) field. An example of such a problem is the identification and classification of non-functional requirements (NFRs) in requirements documents. The proposed ML-based solutions have shown promising results, superior to those of traditional natural language processing (NLP) approaches.


In [2], the authors present a systematic review of 24 ML-based solutions for identifying and classifying NFRs. The 24 selected research papers make use of 16 different ML algorithms, which can be split into 3 categories: supervised learning (7 algorithms), unsupervised learning (4 algorithms) and semi-supervised learning (5 algorithms). Supervised learning algorithms were used in 17 papers (71%), and SVM is the most popular algorithm, used in 11 studies (45.8%). The authors of [2] reached the following conclusions: ML-based solutions have potential in the classification and identification of NFRs; a collaboration between RE and ML researchers is necessary in order to address the open challenges facing the development of real-world ML systems; and the use of ML in RE leads to exciting opportunities to develop novel expert and intelligent systems to support RE tasks and processes. Another paper [13] presents an approach for the automatic classification of content elements of a natural language requirements specification as either requirement or information. The presented approach could be used in the following ways: classification of content elements in documents that have never been classified before; or analysis of a previously classified document to assist the user in discovering the elements that have not been correctly classified. The authors employ convolutional neural networks, a machine learning technique that has attracted increasing attention in the natural language processing field. To train the neural network, the authors used a set of over 10,000 content elements extracted from 89 requirements specifications of their industry partner. Using 90% of the content elements as training data and the remaining 10% as test data, their approach achieved a stable classification accuracy of approximately 81%. The authors applied the proposed approach in a preliminary evaluation to an unknown requirements specification consisting of 747 content elements, obtaining a precision of 0.73 and a recall of 0.89 in classifying requirements, and a precision of 0.90 and a recall of 0.75 in classifying information. In [6] the authors proposed an automated requirements classification technique for the early requirements analysis steps, which was applied to an Internet-based requirements analysis-supporting system in order to efficiently analyze requirements collected from a distributed environment. The developed system reduces the amount of manual work at the initial stage of requirements analysis, because it automatically classifies the collected requirements into several views. As a result, the proposed system enables rapid and correct requirements analysis from messy requirements. In order to evaluate the proposed technique, the authors performed a set of experiments using two real requirements data sets: a Korean data set and an English data set. The proposed technique is composed of the following steps:

1. Preprocessing: each sentence is separated from all collected requirements, and each content word is extracted from the separated sentence.


2. Constructing a set of centroid-sentences as training data for each topic category.
3. Learning the classifier: the Naive Bayes classifier is learned by using the set of centroid-sentences as training data.

The research presented in [5] is a first step towards using machine learning to support self-adaptive systems in the adaptation of contextual requirements. In the presented approach, named ACon (Adaptation of Contextual requirements), the authors take the position that, in order to respond to unexpected events or changes in the operating environment, self-adaptive systems are required to learn from the environmental data available to them at runtime and to adapt their contextual requirements accordingly. ACon integrates data mining algorithms that analyse contextual data to determine the context in which contextual requirements are valid, and adapts the requirements accordingly. In [7] the authors proposed an automatic machine-learning-based method, namely natural language text classification, for classifying user requests on software or product features into classes of software requirements. The proposed method uses the TF-IDF of words in the corpus as basic document features. The paper also proposes non-project-specific keywords of different classes of software requirements for encoding text features. For improving the classification accuracy, project-specific keywords are added, by labeling them manually or automatically from all user requests using the Word2vec method. For training classifiers, a number of different machine learning algorithms were studied, with SVM obtaining the best results. In [9] the authors present a methodology for the automatic evaluation of the quality of requirements, according to the judgment of the field experts who use that methodology. The methodology aims to predict the quality of new requirements. The expert must provide an initial set of requirements that have been previously classified according to their quality and that have been chosen as appropriate. For each of the requirements in the set, the authors extract metrics that quantify the quality of the requirement. The methodology proposes the use of a Machine Learning technique, namely rule inference, for learning the value ranges of the metrics, as well as the way they are combined to reproduce the domain expert's interpretation of requirements quality.

3 Experiments and Results

In this section we present the experimental prototype for evaluating our proposed method, the data sets used, and the experimental results that we obtained.

3.1 Data Set

We have created a data set starting from an available set of real requirements used to develop an ECU (Electronic Control Unit) in the automotive industry. Each individual requirement has the following attributes:

• Testing Level: this attribute describes the level of testing at which the content element must be validated.
• Requirement Content: this attribute contains the text body of the content element.

The Testing Level indicates the classification of each given requirement as either “Integration Test” or “Software Test”. The requirements of our data set had already been manually labelled during the ECU development process. From the requirements database, we filtered and exported only those requirements that have been assigned to one of these two testing levels. As the requirements are written by different engineers in different phases of the project, they have different writing styles. The exported requirements were preprocessed to remove unnecessary information: figures, logical expressions and statements that could not be converted into lexical tokens. The labels “Integration Test” and “Software Test” were mapped to the integers 0 and 1. The data set is balanced, i.e. it contains equal numbers of requirements in each class. Finally, the resulting data set contains 2000 labelled requirements that represent the raw input to our classification process. Requirements are small texts containing between 6 and 38 words, with a mean of 17 words per requirement.
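A minimal sketch of this label-encoding step, assuming the exported requirements are available as (testing level, content) pairs; the field values below are invented placeholders, not actual requirements:

```python
raw_requirements = [
    ("Integration Test", "The gateway shall forward CAN frames within 10 ms"),
    ("Software Test", "The parser shall reject malformed configuration records"),
]

# Map the two testing levels to integer class labels, as described above
LABELS = {"Integration Test": 0, "Software Test": 1}

texts  = [content for _, content in raw_requirements]
labels = [LABELS[level] for level, _ in raw_requirements]
```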

3.2 Experiments and Results

We have applied two computational methods on our data set. The first method uses the classic “bag of words” (BoW) representation of text. This assumes the creation of a vector counting word occurrences out of the text sentences. The second method uses the newer “word embeddings” approach. It assumes the mapping of each word to a vector of numbers such that semantics is preserved. The goal of our experiments was to obtain an initial assessment of the potential of word embeddings (WE) compared to the classic BoW representation, when both methods are used in conjunction with deep neural networks for the automated classification of requirements. Both experiments share the same processing pipeline, consisting of two steps:

1. Text vectorization. This step transforms each example (requirements document) into a vector of numbers, using either of the two methods: BoW or WE.
2. DL classification. This step assumes the definition, training, and validation of a suitable deep NN that is capable of classifying the vectorized representations produced by the previous step.


For both models we have applied the standard approach of cross-validation by splitting our available data set into a training part and a testing part. We used 75% of the examples to train the model and the remaining 25% as testing data needed for measuring the accuracy of the model. The split was performed using the train_test_split function of the model_selection module of the scikit-learn package [10]. The split gave us 1500 examples for training and 500 examples for testing.

3.2.1 First Model

Our first approach is based on the BoW representation of text. Starting from our data set, we created a vocabulary containing the list of its unique words. Each word has an associated index, so each requirement (example) was mapped to a vector with dimension equal to the size of our vocabulary – 1878 in our case. Each vector element represents the number of occurrences of the corresponding word. The resulting training and test sets were mapped to numeric vectors according to the BoW model using the CountVectorizer class of the scikit-learn package. We used Keras [3,4] for creating the deep NN models for the classification of our requirements. The NN had an input layer, one hidden layer and an output layer. The hidden layer has 10 nodes. We used the regular densely-connected NN layer (layers.Dense). The standard ReLU activation function was used for the hidden layer. As we have a binary classification problem, we used the sigmoid activation function with output dimensionality 1 for the output layer of our NN model. Binary cross-entropy was used as the loss function between the actual output and the predicted output. The optimization of our NN was done using Adam. With the created model we were able to fit our training data to the NN. To train the model we used 10 samples per gradient update and iterated for 20 epochs. We had 18790 parameters in the first layer and another 11 in the second layer. The number of parameters is calculated as follows: with 1878 dimensions per feature vector, we need a weight for each feature dimension and each node, resulting in 1878 × 10 weights plus one bias per node (10 biases); the output layer has 10 weights and one bias. All 18801 parameters were determined during the training process. Our obtained results are presented in Fig. 2. The performance of the trained network was evaluated by measuring the accuracy on the training and test sets on one hand, and the training and validation loss on the other hand. Due to the relatively small data set used, it can be observed that starting with the 12th epoch the model starts over-fitting on the training data. We obtained a performance of 84% accuracy with the first model.
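A condensed sketch of the first model as described above (CountVectorizer plus a small Keras network); the tiny placeholder texts and labels stand in for the 2000-requirement data set and are not the paper's actual data:

```python
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from keras.models import Sequential
from keras.layers import Dense

# Placeholders; in the experiment these come from the prepared data set (Sect. 3.1)
texts  = ["ecu shall log can errors", "ui shall display a warning",
          "gateway shall forward frames", "parser shall reject bad input"]
labels = [0, 1, 0, 1]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42)

# Bag-of-words vectorization: one counter per vocabulary word
vectorizer = CountVectorizer()
X_train_bow = vectorizer.fit_transform(X_train).toarray()
X_test_bow  = vectorizer.transform(X_test).toarray()

# Input layer -> 10-node ReLU hidden layer -> sigmoid output, as in the paper
model = Sequential([
    Dense(10, activation="relu", input_dim=X_train_bow.shape[1]),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train_bow, y_train, epochs=20, batch_size=10,
          validation_data=(X_test_bow, y_test))
```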


Fig. 2. Experimental results.

3.2.2 Second Model

In our second model we used the word embeddings representation of the requirements. While in the previous BoW model we mapped each requirement to a single feature vector, now we represent each word as a numeric vector. There are two possibilities for obtaining word embeddings: either to train them separately on the new corpus or to use suitable pre-trained versions. For our second experiment we opted for using GloVe pre-trained word embeddings, which are publicly available at https://nlp.stanford.edu/projects/glove/. We selected the set of pre-trained word vectors glove.6B.50d.txt, which contains 400K unique words and has a total size of 822 MB. However, we filtered it by keeping only those word embeddings corresponding to the words occurring in our data set. We used the test and training data already prepared and then tokenized it in order to make it usable by our word embeddings model. The Tokenizer utility class from Keras was used to vectorize the dataset into lists of integers. Note that each integer of such a list is mapped to a word in the dictionary that encodes the entire corpus. Due to the fact that each requirement (example) has a different number of words, we padded its sequence of words with zeros in order to obtain the same length – 100 in our experiment. For our second model we used a deep NN with an input layer, three hidden layers and an output layer. For the first layer we used the layers.Embedding type. This allowed us to take our examples represented as lists of integers and map them to a representation that can be processed by the next layer, of type GlobalMaxPool1D. The following parameters were used for the embedding layer:

– input_dim: 1720, i.e. the size of the vocabulary
– output_dim: 50, i.e. the size of the dense vector
– input_length: 100, i.e. the length of the sequence


The second hidden layer was of type GlobalMaxPool1D with default parameters, suitable for downsampling the incoming feature vectors by taking the maximum value of all features in the pool for each feature dimension. The third hidden layer was of type layers.Dense, as in the first model. As we have a binary classification problem, similarly to the first experiment, we used the sigmoid activation function with output dimensionality 1 for the output layer of our NN model. Binary cross-entropy was used as the loss function between the actual output and the predicted output. The optimization of our NN was done using Adam. With the created model we were able to fit our training data to the NN. To train the model we used 10 samples per gradient update and iterated for 50 epochs. The total number of parameters was 86521, all of them being trained. The results obtained are presented in Fig. 3. The performance of the trained network was evaluated by measuring the accuracy on the training and test sets on one hand, and the training and validation loss on the other hand. With this model we obtained an improvement over the first model, the accuracy being 86%.
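A sketch of the second model; it assumes X_train, X_test, y_train, y_test from the first-model sketch, and a random embedding_matrix stands in for the filtered GloVe vectors, which in the actual experiment are loaded from glove.6B.50d.txt:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, GlobalMaxPool1D, Dense
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

# Integer-encode the words and pad every requirement to length 100
tokenizer = Tokenizer()
tokenizer.fit_on_texts(X_train)
seq_train = pad_sequences(tokenizer.texts_to_sequences(X_train), maxlen=100)
seq_test  = pad_sequences(tokenizer.texts_to_sequences(X_test),  maxlen=100)

vocab_size = len(tokenizer.word_index) + 1        # 1720 on the paper's data

# Placeholder for the filtered 50-dimensional GloVe vectors
embedding_matrix = np.random.rand(vocab_size, 50)

model = Sequential([
    Embedding(input_dim=vocab_size, output_dim=50, input_length=100,
              weights=[embedding_matrix]),        # first hidden layer
    GlobalMaxPool1D(),                            # second hidden layer
    Dense(10, activation="relu"),                 # third hidden layer
    Dense(1, activation="sigmoid"),               # output layer
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(seq_train, y_train, epochs=50, batch_size=10,
          validation_data=(seq_test, y_test))
```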

Fig. 3. Experimental results.

4 Conclusions and Future Works

In this paper we presented our initial approach, experiments and results towards automating the classification of requirements in software engineering for the automotive industry. We have investigated the potential of combining deep NN models with word representations for improving the performance of this task. Our results are preliminary. Nevertheless, we were able to obtain a slight improvement in performance for the word embeddings approach, as compared with the baseline bag of words approach. In the near future we plan to strengthen our results by expanding our experiments to include different and larger data sets, as well as to use different and suitably trained deep NN architectures. Moreover, we would like to compare our results with other Machine Learning methods for text mining and classification by running experiments on various private and public data sets in the area of requirements engineering.

References

1. Altınel, B., Ganiz, M.C.: Semantic text classification: a survey of past and recent advances. Inf. Process. Manag. 54(6), 1129–1153 (2018). https://doi.org/10.1016/j.ipm.2018.08.001
2. Binkhonain, M., Zhao, L.: A review of machine learning algorithms for identification and classification of non-functional requirements. Expert Syst. Appl. X 1 (2019). https://doi.org/10.1016/j.eswax.2019.100001
3. Chollet, F., et al.: Keras: the Python Deep Learning library (2015). https://keras.io
4. Chollet, F.: Deep Learning with Python. Manning Publications (2018)
5. Knauss, A., Damian, D., Franch, X., Rook, A., Müller, H.A., Thomo, A.: ACon: a learning-based approach to deal with uncertainty in contextual requirements at runtime. Inf. Softw. Technol. 70, 85–99 (2016). https://doi.org/10.1016/j.infsof.2015.10.001
6. Ko, Y., Park, S., Seo, J., Choi, S.: Using classification techniques for informal requirements in the requirements analysis-supporting system. Inf. Softw. Technol. 49(11–12), 1128–1140 (2007). https://doi.org/10.1016/j.infsof.2006.11.007
7. Li, C., Huang, L., Ge, J., Luo, B., Ng, V.: Automatically classifying user requests in crowdsourcing requirements engineering. J. Syst. Softw. 138, 108–123 (2018). https://doi.org/10.1016/j.jss.2017.12.028
8. Mikolov, T., Sutskever, I., Chen, K., Corrado, G., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q. (eds.) Proceedings of the 26th International Conference on Neural Information Processing Systems (NIPS 2013), vol. 2, pp. 3111–3119. Curran Associates Inc., USA (2013)
9. Parra, E., Dimou, C., Llorens, J., Moreno, V., Fraga, A.: A methodology for the classification of quality of requirements using machine learning techniques. Inf. Softw. Technol. 67, 180–195 (2015). https://doi.org/10.1016/j.infsof.2015.07.006
10. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011). http://dl.acm.org/citation.cfm?id=2078195
11. Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543 (2014). http://www.aclweb.org/anthology/D14-1162
12. Weber, M., Weisbrod, J.: Requirements engineering in automotive development: experiences and challenges. IEEE Softw. 20(1), 16–24 (2003). https://doi.org/10.1109/MS.2003.1159025
13. Winkler, J., Vogelsang, A.: Automatic classification of requirements based on convolutional neural networks. In: IEEE 24th International Requirements Engineering Conference Workshops (REW), pp. 39–45 (2016). https://doi.org/10.1109/REW.2016.021

Using the Doc2Vec Algorithm to Detect Semantically Similar Jira Issues in the Process of Resolving Customer Requests

Artem Kovalev, Nikita Voinov, and Igor Nikiforov

Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia
[email protected], {voinov,i.nikiforov}@ics2.ecd.spbstu.ru

Abstract. The paper is devoted to research in the field of software maintenance automation. An approach to resolving customer requests based on the use of the Doc2Vec algorithm is proposed. It consists of finding semantically related resolved requests, as well as identifying qualified software engineers, in the Jira bug tracking system. The developed software tool implements the proposed approach and provides reports which help software engineers in solving unresolved customer requests. The experiment compares the automated approach to resolving customer requests with the manual one. The results show the advantages of using the software tool in the maintenance process.

Keywords: Maintenance · Automation · Doc2Vec · Machine learning

1 Introduction

The software maintenance phase is one of the most important stages of the system development life cycle. In the process of software maintenance, developers of the vendor company solve problems arising during the operation of the software product on the customer side. This stage is labor-intensive and, according to estimates [1], accounts for about 50% of all software life cycle costs. At the software maintenance phase, customer requests are received and processed by technical support engineers. Customer requests can be received by e-mail, by telephone or via a HelpDesk system [2]. A request is usually written in natural language and contains a description of the issue in the software product, as well as attached files: configurations, logs, stack traces, screenshots, and video playback of the problem. When technical support engineers cannot process a customer request due to lack of competence, they assign it to the software engineers. For convenient management of such requests, issue tracking systems are usually used [3]. In this paper, the Jira issue tracking system is considered [4]. When solving customer requests, it is usually necessary to find similar requests that have already been resolved, as well as to involve competent software engineers who specialize in the relevant problem area. Machine learning algorithms and neural networks [5], namely the Doc2Vec algorithm [6], can help in solving these problems. Through the use of software based on the Doc2Vec algorithm, it is possible to reduce the complexity and increase the efficiency of the maintenance process. Improving the maintenance quality helps to build long-term relationships between the software vendor and the customer. The goal of this work is to reduce the processing time of customer requests in the Jira issue tracking system by using the Doc2Vec algorithm to search for semantically similar requests. To achieve this goal, it is necessary to solve the following tasks:

1. Investigate existing solutions related to reducing the complexity of processing customer requests by detecting similar issue reports.
2. Propose an automated approach based on the use of the Doc2Vec algorithm to search for similar customer requests and qualified engineers in the Jira issue tracking system.
3. Develop a software tool that implements the proposed approach.
4. Show the effectiveness of the proposed approach and its implementation in the software tool on relevant data.

The following sections discuss each of the points listed above.

2 Related Work

Detecting duplicate bug reports is a well-established problem in software engineering research. One of the pioneering studies on detection of duplicate bug reports is by Hiew [7]. The approach is based on converting the stemmed text into a word vector using Term Frequency-Inverse Document Frequency (TF-IDF) [8]. After document vectors for bug reports are formed, cosine similarity is used to measure how similar two bug reports are to each other [9]. Sun et al. [10] proposed a discriminative model using a support vector machine (SVM) to train a model based on a set of labelled vectors. The model could then be used to classify other bug reports as either duplicates or not. The weights of the features are calculated using an IDF-based similarity function to represent the correlation of a pair of bug reports. Nguyen et al. [11] proposed an approach that takes advantage of both IR-based features (BM25F) and topic-based features (LDA). None of the reviewed studies applies an approach based on the Doc2Vec algorithm. Besides, none of the works uses the Jira system as the bug reports repository. Thus, a distinctive feature of our research is the use of the above algorithm to identify semantically similar customer requests in the Jira issue tracking system.
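The TF-IDF plus cosine-similarity baseline of [7] can be expressed in a few lines of scikit-learn; the bug report texts below are invented examples, and this sketch is only an illustration of the idea, not the code of the cited study:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

resolved = ["broker crashes on large persistent messages",
            "consumer hangs after network failover"]
incoming = ["crash when sending a very large persistent message"]

# Build TF-IDF document vectors over the resolved reports
vectorizer = TfidfVectorizer(stop_words="english")
resolved_vecs = vectorizer.fit_transform(resolved)
incoming_vec  = vectorizer.transform(incoming)

# Rank resolved reports by cosine similarity to the incoming one
scores = cosine_similarity(incoming_vec, resolved_vecs)[0]
print(scores.argmax(), scores.max())   # index and score of the closest report
```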

3 Proposed Approach

Figure 1 shows the proposed conceptual schema for analyzing customer requests.

Fig. 1. Conceptual schema of the proposed approach to processing customer requests

The Jira issue tracking system provides a REST API [4] for retrieving data such as descriptions, steps to reproduce, comments, and other request fields. The Doc2Vec machine-learning algorithm uses the generated set of resolved requests to create a vector model. The raw data preparation consists of tokenization [12], stemming [13] and removing stop words. After the learning process, the vector model stored in RAM is serialized into a file. To search for semantically related customer requests, it is necessary to vectorize the set of unresolved requests using the Doc2Vec algorithm. After that, the semantic proximity between each unresolved request vector and the vectors of already resolved requests is determined by the cosine similarity coefficient of their vector representations [9]. The greater the coefficient value, the more confidently it can be argued that the two requests are similar. From the resolved requests with the most similar vector representations, identifiers are extracted, as well as the names of the software engineers who solved the problem. Thus, the result of applying the Doc2Vec algorithm to unresolved requests is a generated report represented by a text file. Each line of this file corresponds to an unresolved customer request. At the beginning of the line is the identifier of the unresolved request, followed by comma-separated identifiers of similar resolved requests with the names of competent software engineers. After generating the report, the software developer must verify that the tool correctly detects similar requests and qualified engineers. The proposed approach has been implemented in a software tool, which consists of two modules: a Jira connector and a similarity processor. It was written in Java 8. Work with the tool is divided into two stages: setup and usage. At the setup stage, it is necessary to create the data sets and then the vector model, which is further applied in the usage stage.
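A minimal gensim-based sketch of the similarity step (gensim is one common Doc2Vec implementation; the paper's tool is written in Java, so this only illustrates the algorithmic idea, and the issue keys and texts are invented):

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

resolved = {"AMQ-100": "broker crashes on large persistent messages",
            "AMQ-200": "consumer hangs after network failover"}

# Each resolved request becomes a tagged document for training
corpus = [TaggedDocument(words=text.lower().split(), tags=[key])
          for key, text in resolved.items()]

model = Doc2Vec(corpus, vector_size=50, min_count=1, epochs=40)

# Vectorize an unresolved request and rank resolved ones by cosine similarity
vec = model.infer_vector("crash with large persistent message".split())
print(model.dv.most_similar([vec], topn=2))   # model.docvecs in gensim < 4
```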

4 Experiment

The developed tool was applied to the Apache ActiveMQ project (https://issues.apache.org/jira/projects/AMQ/issues). The project is freely distributable, and any Internet user can register a request containing a question or bug description in the Jira issue tracking system. For training and testing of the Doc2Vec model, a set of Jira requests was used, taken over the period from April 20, 2004 to November 15, 2018, which amounts to 7100 requests. The entire data set is divided into two parts: a training sample (90%, or 6390 requests), representing resolved requests, and a test sample (10%, or 710 requests), representing unresolved requests. The tool is deployed on a local workstation with the Windows 10 Enterprise x64 operating system, an Intel Core i7-4810MQ CPU at 2.80 GHz, 16 GB RAM and a 450 GB SSD. Before the experiment, the software tool was configured; it took 5 min to set up the configuration files. At the setup stage of the tool, 7100 requests were loaded, which took 2 h. The file with resolved requests is 512 MB, the file with unresolved requests is 49 MB, and the vector model file is 287 MB. Creating the model took 30 min. In the process of working on the Apache ActiveMQ project, one experiment was conducted. The essence of the experiment is to compare the average time for analyzing a single customer request using the automated and manual approaches, and to measure the precision of the automated approach. Only one engineer carried out the experiment. The working time of the manual and automated approaches was recorded using a timer. When the engineer began work on a request, the timer was turned on. The timer was turned off, and measurements were recorded, when the engineer found one similar resolved request and one qualified software engineer with the manual approach, or when the engineer checked the correctness of the report results with the automated approach. As part of the experiment, 100 requests from the test sample were analyzed: 50 manually and 50 automatically. The results of the analysis are presented in Fig. 2. The average time to analyze a single request using the manual and automated approaches is 36.8 and 16.3 min, respectively. During automated analysis, only 4 of the 50 requests were mismatched; a proposed similar request was marked as mismatched if it could not help to resolve the customer request. Figure 3 shows that the precision of the automated analysis is about 92%. With the manual approach, the time spent analyzing a single request depends on its complexity, the amount of useful information in the request, and the experience of the developer involved in solving the problem. In the automated approach, the setup time and the quality of the analysis directly depend on the number of already resolved requests in the project. The data presented in the experiment were obtained for a specific system and depend on the features of the project.


Fig. 2. The comparison chart of the manual and automated approach to analyze 50 requests shows the gain in time with long-term use of the tool

Fig. 3. The automated analysis of 50 customer requests shows 92% precision

Nevertheless, based on the graphs, it can be concluded that as the number of unresolved requests grows, the use of the software tool becomes more effective. The speed and accuracy of request consideration directly affect the quality of customer service and, consequently, the long-term relationship with the customer. Using the software tool created for analyzing customer requests allows the software engineer to replace the manual approach with the automated one. This increases the quality of the sustaining process for the customer and reduces the complexity of this process for the developer.

5 Conclusion

The study reviewed existing solutions for detecting similar issue reports during the maintenance process. An automated approach to reducing the complexity of customer request processing was proposed. This approach is based on the use of the Doc2Vec machine learning algorithm, which solves the problem of detecting semantically related customer requests and competent engineers. The created tool was successfully tested on the Apache ActiveMQ project: 50 requests were analyzed with the manual approach and 50 requests with the automated one. The effectiveness of the automated approach is shown, and the results present the benefits of using the software tool. The time to analyze a single request has decreased compared to the traditional manual approach.


References

1. Ogheneovo, E.E.: On the relationship between software complexity and maintenance costs. J. Comput. Commun. 2(14), 1 (2014)
2. Rababah, K., Mohd, H., Ibrahim, H.: Customer relationship management (CRM) processes from theory to practice: the pre-implementation plan of CRM system. Int. J. e-Education e-Business e-Management e-Learning 1(1), 22 (2011)
3. Bertram, D., Voida, A., Greenberg, S., Walker, R.: Communication, collaboration, and bugs: the social nature of issue tracking in small, collocated teams. In: Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, pp. 291–300, February 2010
4. Fisher, J., Koning, D., Ludwigsen, A.P.: Utilizing Atlassian JIRA for large-scale software development management. In: 14th International Conference on Accelerator and Large Experimental Physics Control Systems (ICALEPCS), October 2013
5. Pellegrini, T.: Comparing SVM, Softmax, and shallow neural networks for eating condition classification. In: 16th Annual Conference of the International Speech Communication Association, pp. 899–903 (2015)
6. Maslova, N., Potapov, V.: Neural network Doc2Vec in automated sentiment analysis for short informal texts. In: Lecture Notes in Computer Science, vol. 10458, pp. 546–554 (2017)
7. Hiew, L.: Assisted detection of duplicate bug reports. University of British Columbia (2006)
8. Ramos, J.: Using TF-IDF to determine word relevance in document queries. In: Proceedings of the First Instructional Conference on Machine Learning, vol. 242, pp. 133–142 (2003)
9. Giller, G.L.: The statistical properties of random bitstreams and the sampling distribution of cosine similarity. In: Giller Investments Research Notes, no. 20121024/1 (2012)
10. Sun, C., Lo, D., Wang, X., Jiang, J., Khoo, S.-C.: A discriminative model approach for accurate duplicate bug report retrieval. In: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering, vol. 1, pp. 45–54 (2010)
11. Nguyen, A.T., Nguyen, T.T., Nguyen, T.N., Lo, D., Sun, C.: Duplicate bug report detection with a combination of information retrieval and topic modeling. In: Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering, pp. 70–79 (2012)
12. Branco, A.H., Silva, J.R.: Contractions: breaking the tokenization-tagging circularity. In: Lecture Notes in Computer Science, vol. 2721, pp. 167–170 (2003)
13. Sharma, D.: Stemming algorithms: a comparative study and their analysis. Int. J. Appl. Inf. Syst. 4(3), 7–12 (2012)

evoRF: An Evolutionary Approach to Random Forests

Diogo Ramos1, Davide Carneiro1(B), and Paulo Novais2

1 CIICESI/ESTG - Polytechnic Institute of Porto, Felgueiras, Portugal
{8150101,dcarneiro}@estg.ipp.pt
2 Algoritmi Centre/Department of Informatics, Universidade do Minho, Braga, Portugal
[email protected]

Abstract. Machine Learning is a field in which significant steps forward have been taken in the last years, resulting in a wide variety of available algorithms, for many different problems. Nonetheless, most of these algorithms focus on the training of static models, in the sense that the model stops evolving after the training phase. This is increasingly becoming a limitation, especially in an era in which datasets are increasingly larger and may even arrive as sequential streams of data. Frequently retraining a model, in these scenarios, is not realistic. In this paper we propose evoRF: a combination of a Random Forest with an evolutionary approach. Its key innovative aspect is the evolution of the weights of the Random Forest over time, as new data arrives, thus making the forest’s voting scheme adapt to the new data. Older trees can also be replaced by newly trained ones, according to their accuracy, ensuring that the ensemble remains up to date without requiring a whole retraining.

Keywords: Random Forest · Optimization · Genetic algorithms · Online learning

1 Introduction

Decision Trees [1] are among the most popular predictive models used in Machine Learning today. On the one hand, the models that result from training a Decision Tree are generally simple to interpret and understand [2,3], as they can be represented as boolean conditions, as opposed to other approaches such as Neural Networks. On the other hand, Decision Tree algorithms tend to behave fairly well with large datasets and can be used for both classification and regression [4,5]. A Decision Tree model can be defined as an acyclic graph upon which decisions can be made by following the branches of the tree. The tree consists of nodes (in which branching takes place) and leaves, which contain the answer to the problem. When searching for an answer, a feature of the feature vector is considered in each node of the tree. If the value of the feature is below a given threshold, the path follows the left branch; otherwise, the right branch is followed. The search for an answer ends when a leaf is reached. The leaf may contain the label of a class in the case of a classification problem, or a value in the case of a regression problem [3].

In recent years, Decision Trees have gained renewed interest in the context of meta-models [6]. Meta-models, also known as Ensembles, are Machine Learning models based on the combination of a number of weak models, generally trained with a subset of the data or of the feature vector. The underlying assumption is that a large group of simple and efficient models will have higher accuracy than one large, heavy model. One of the most widely used ensemble algorithms is the Random Forest [7], in which multiple Decision Trees are used as weak learners. These trees are purposely made simple during training, namely by limiting their depth (branching is stopped early) or by limiting the amount of data or the feature vector used in each tree. Each tree is thus trained on a different set of data, a process known as Bagging.

In this paper we propose a meta-model that improves the result of the training of a Random Forest through the use of an evolutionary approach. After the training of the ensemble, an evolutionary process takes place with the goal of optimizing the weight of each tree in the voting scheme of the forest, driven by the goal of minimizing error measures such as the RMSE, or maximizing accuracy. The output is a vector of weights that is then used by the Random Forest to make a prediction. As opposed to other tree-based ensembles, the proposed meta-model is especially interesting in the domain of online learning, in which data changes gradually, such as in streams of data from social media, as well as with very large datasets whose size might be a challenge for traditional approaches. In these scenarios, the weights of the models in the ensemble can be adapted dynamically through the proposed evolutionary process, for a new set of test data. Moreover, as data flows, new trees can be trained and added to the model pool, eventually replacing older trees according to their weight in the ensemble. This ensures that the ensemble remains up-to-date without the need to re-train it with all the data.
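The node-by-node traversal described at the start of this section can be sketched as follows; the node structure is an illustrative assumption, not the paper's implementation.

```python
# A minimal sketch of threshold-based Decision Tree traversal; the Node
# layout and the toy tree are illustrative assumptions.
from dataclasses import dataclass
from typing import Optional, Sequence, Union

@dataclass
class Node:
    feature: Optional[int] = None       # index into the feature vector
    threshold: Optional[float] = None   # branching threshold
    left: Optional["Node"] = None       # followed when value < threshold
    right: Optional["Node"] = None      # followed otherwise
    answer: Optional[Union[str, float]] = None  # set only on leaves

def predict(node: Node, x: Sequence[float]):
    # Walk down until a leaf is reached, choosing a branch by threshold.
    while node.answer is None:
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.answer

tree = Node(feature=0, threshold=0.5,
            left=Node(answer="class A"), right=Node(answer="class B"))
print(predict(tree, [0.3]))  # -> "class A"
```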

2 Methodology

As put forward in Sect. 1, in this paper we propose an evolutionary approach to improve the result of an ensemble by optimizing the weights of each model in the voting scheme. The methodology followed is described in this section. The process starts with the random splitting of the dataset into three different sets: training, validation and test. A Random Forest with t trees is trained using the training and validation sets, with the sklearn 0.20.3 package. In order to make each tree in the ensemble a weak learner, each is trained with a subset of the data and of the feature vector. These hyperparameters are termed, respectively, sr (sample rate) and csr (column sample rate). As for the remaining parameters, the defaults of the package are used.
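A minimal sketch of this training setup is given below, using scikit-learn's digits dataset as a small stand-in for MNIST. Note that the paper uses sklearn 0.20.3; the max_samples parameter shown here requires a newer scikit-learn release, and all hyperparameter values are illustrative assumptions.

```python
# Sketch of the bagging setup: each tree is weakened by sampling both
# rows (sr) and columns (csr). Dataset and values are illustrative.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

t, sr, csr = 25, 0.7, 0.5  # trees, sample rate, column sample rate
forest = RandomForestClassifier(
    n_estimators=t,
    max_samples=sr,    # fraction of rows drawn for each tree (bootstrap)
    max_features=csr,  # fraction of features considered at each split
    random_state=0,
).fit(X_train, y_train)
print(f"plain RF accuracy: {forest.score(X_test, y_test):.3f}")
```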


After the training of the ensemble finishes, the evolutionary process takes place. Its main goal is to find the optimum weight of each tree in the ensemble. This evolutionary process is often time-consuming. However, this is not a drawback in this case, as the ensemble can make predictions as soon as all the trees are trained, like a traditional Random Forest. The evolutionary process then runs, and the new weights of the trees are updated in real time and used as soon as better solutions are found.

The evolutionary process starts with the creation of w random solutions for the problem, that is, w weight vectors. This first group of solutions constitutes the first generation g1 of a group of G generations that will be created throughout the process. Each weight wg,s,i, g ∈ [1..G], s ∈ [1..w], i ∈ [1..t], represents the weight of tree i in solution s of generation g, with Σ_{i=1..t} wg,s,i = 1, ∀s ∈ [1..w], ∀g ∈ [1..G].

The fitness of each solution in the current generation is then calculated. This depends on the type of the ensemble: in a classification problem the fitness is given by the accuracy of the predictions, while in a regression problem the fitness is given by the RMSE. Both are calculated using the test dataset, which acts as holdout. This results in a fitness vector f of size w, in which each fitness value fg,s, g ∈ [1..G], s ∈ [1..w], represents the fitness of a given solution s in generation g.

After the fitness is calculated, the b solutions with the highest fitness are selected to be reproduced, thus originating the first solutions of a new generation. A group of m new solutions is added to this new generation, based on the mutation operator. This operator takes as input a solution (selected randomly from the b best solutions), generates a random number r, r ∈ [0..mr] (in which mr denotes the mutation rate), and randomly selects two positions i and j in the weights vector. The solution is then mutated according to wg,s,i ← wg,s,i + mr and wg,s,j ← wg,s,j − mr. This is repeated for the same chromosome until the sum of the weights of the vector equals 1, i.e., until the solution is a valid one (Σ_{i=1..t} wg,s,i = 1). Another group of r random solutions is also added to this new generation, created as detailed before; the aim is to maintain a certain level of diversity in each generation. Another common operator is crossover. It is not used in this work, as it would potentially violate the validity of the crossed solutions, with a very low probability of obtaining valid ones.

Once the new generation is complete, including not only the b best solutions but also the m solutions generated by mutation and the r solutions generated randomly (b + m + r = w), the process returns to the fitness evaluation stage and repeats. The process stops when a limit of generations l is reached, or when the value of the best fitness found so far has not improved by δ over the last n generations. The output of the process is the weights vector wg,s with the highest fitness value across all generations (g ∈ [1..G], s ∈ [1..w]); a minimal code sketch of this loop is given after the hyperparameter list below.

The main hyperparameters that can be tuned when using this algorithm are thus:

• sr - sample rate
• csr - column sample rate
• w - number of solutions in each generation
• t - number of trees in the ensemble
• b - number of best solutions to consider in each generation
• mr - mutation rate, defining how much each solution should mutate
• m - number of new solutions to generate through mutation, in each generation
• r - number of new random solutions to generate, in each generation
• l - number of iterations (stopping criterion)
• δ and n - define the reaching of convergence (stopping criterion).
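The sketch below reuses forest, X_test, y_test and t from the previous sketch. The mutation operator here is a deliberately simplified variant of the one described above (it transfers a bounded amount of weight from one tree to another, so each child stays valid by construction); population sizes, rates and the fixed generation count are illustrative assumptions.

```python
# Sketch of the generational loop optimizing per-tree voting weights.
import numpy as np

# Pre-compute each tree's predictions once; fitness evaluation then only
# re-weights the votes. Reuses forest, X_test, y_test, t from above.
tree_preds = np.array([est.predict(X_test) for est in forest.estimators_]).astype(int)
rows = np.arange(X_test.shape[0])
rng = np.random.default_rng(0)
w_pop, b, m, r, G, mr = 8, 5, 2, 1, 100, 0.05  # w, b, m, r, l, mr

def random_solution():
    v = rng.random(t)
    return v / v.sum()  # weights sum to 1

def mutate(sol):
    # Simplified mutation: move (up to) mr of weight between two random
    # trees, so the weights keep summing to 1 and stay non-negative.
    child = sol.copy()
    i, j = rng.choice(t, size=2, replace=False)
    delta = min(mr, child[j])
    child[i] += delta
    child[j] -= delta
    return child

def fitness(sol):
    votes = np.zeros((X_test.shape[0], 10))  # 10 classes for digits
    for preds, weight in zip(tree_preds, sol):
        votes[rows, preds] += weight          # weighted vote of one tree
    return (votes.argmax(axis=1) == y_test).mean()

population = [random_solution() for _ in range(w_pop)]
for _ in range(G):
    population.sort(key=fitness, reverse=True)
    best = population[:b]
    population = (best
                  + [mutate(best[int(rng.integers(b))]) for _ in range(m)]
                  + [random_solution() for _ in range(r)])
print(f"evolved weighted-vote accuracy: {fitness(population[0]):.3f}")
```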

3 Results

In order to validate evoRF, it was implemented and tested with the well-known MNIST dataset of handwritten digits. It has a total of 70,000 examples of handwritten digits. The digits have been size-normalized and centered in a fixed-size image, in a 28 × 28 grayscale grid. Researchers using this dataset generally carry out some pre-processing tasks which result in lower error rates, such as deskewing, noise removal, blurring or pixel shifts. Since the goal of this paper is to validate evoRF and not to achieve the highest possible accuracy on this dataset, the data were not pre-processed and were used as is. This dataset was thus used to train a classification ensemble, following the methodology detailed in Sect. 2. An ensemble with 25 trees was trained using 66% of the data, with 29% used for validation and 5% used for testing and for making the solutions evolve. A population of size 8 was used. In each round, the best 5 solutions were kept, while 2 new solutions were added through mutation (by mutating one of the 5 selected solutions). Finally, 1 new random solution was also added in each generation, to maintain diversity and explore the search space. The process was run for 100 generations. Figure 1 shows the evolution of accuracy throughout the 100 generations. The solid line represents the accuracy of the best solution found so far, i.e., the best combination of weights for the models given the test data, while the dotted line represents the average accuracy of all solutions in a generation. Figure 2 compares the weights of the best solution of the first generation, when they were randomly generated, with the weights of the best solution of the 100th generation. A change in weights is evident, which results from the evolution process. If a new set of test data were to be presented, from a more recent stream of data, the result of the evolution would be different, according to how good each model is for that particular set of data. In order to assess the quality of the evolved ensemble, we compare its accuracy with that of a standard Random Forest implemented in Python using the same 25 trees and a voting scheme in which each tree has the same weight (0.04). We also compare it with H2O's implementation of a Distributed Random Forest algorithm. The standard implementation achieves an accuracy of 90.86%, while H2O's DRF achieves an accuracy of 93.45%. In comparison, evoRF shows an accuracy of 92.37%, which is better than the standard RF and close to that of H2O.


Fig. 1. Evolution of accuracy over time.

Fig. 2. Weights of each model in the best solutions of the first and last generations, and relative accuracy of each model.

4 Conclusions and Future Work

The requirements of Machine Learning algorithms have been changing in recent years for two main reasons: datasets are growing increasingly larger, and data sometimes arrive in streams rather than in batches. This calls for algorithms that can deal with the challenges that these changes entail. In this paper we proposed evoRF: a Random Forest with a voting scheme whose weights can be adapted through an evolutionary process to better adjust to new sets of data. Moreover, new trees can be trained with more recent data and included in the ensemble, eventually replacing older and/or less suited models. Models can also be trained with different subsets of the data, representing different views on the problem (e.g. long-term vs. short-term views). This approach was devised to be used in domains in which data and its patterns change frequently, such as social networks, in which changes may occur within a day and models become obsolete very fast. This approach may thus allow ensembles to remain up to date, whether by continuously adjusting the weights of existing models or by changing the pool of models. Its main characteristics are: (1) the ensemble does not need to be retrained when data changes significantly; (2) adaptation takes place by changing the weights of each model and does not require changing the trees; (3) new data/patterns can be incorporated by training new weak learners that are included in the ensemble. Moreover, as with traditional Random Forest algorithms, this approach is easy to parallelize and/or distribute (as are other evolutionary algorithms). All these factors make this a computationally simple approach, especially when compared to other methods that change the structure of existing trees according to some heuristic. We are now working on the implementation of an online service that continuously receives data and carries out the training of models and an ongoing evolution of the weights, in order for the ensemble to adapt to these new data. In the future we will study how this approach behaves with large streams of data from social media, especially when there are new classes. We will, specifically, assess the time it takes for the ensemble to adapt to significant changes in the data, such as when a social event takes place that significantly alters the way the network behaves, namely when predicting interactions.

Acknowledgments. This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2019.

References

1. Safavian, S.R., Landgrebe, D.: A survey of decision tree classifier methodology. IEEE Trans. Syst. Man Cybern. 21(3), 660–674 (1991)
2. Janikow, C.Z.: Fuzzy decision trees: issues and methods. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 28(1), 1–14 (1998)
3. Singh, A., Thakur, N., Sharma, A.: A review of supervised machine learning algorithms. In: 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), pp. 1310–1315. IEEE (2016)
4. Gehrke, J., Ramakrishnan, R., Ganti, V.: Rainforest - a framework for fast decision tree construction of large datasets. VLDB 98, 416–427 (1998)
5. Ranka, S., Singh, V.: Clouds: a decision tree classifier for large datasets. In: Proceedings of the 4th Knowledge Discovery and Data Mining Conference, vol. 2, p. 8 (1998)
6. Vilalta, R., Drissi, Y.: A perspective view and survey of meta-learning. Artif. Intell. Rev. 18(2), 77–95 (2002)
7. Biau, G., Scornet, E.: A random forest guided tour. Test 25(2), 197–227 (2016)

The Method of Fuzzy Logic and Data Mining for Monitoring Troposphere Parameters Using Ground-Based Radiometric Complex

S. I. Ivanov1(B) and G. N. Ilin2

1 Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russian Federation
ivanov [email protected]
2 Institute of Applied Astronomy of the Russian Academy of Science, St. Petersburg, Russia
[email protected]

Abstract. The paper presents an algorithm for processing data obtained with the help of a ground-based water vapor radiometer (WVR) under conditions of intense precipitation. Fuzzy logic technology and the method of singular spectrum analysis were used to realize this algorithm. Data mining is used to support the choice of the most relevant variables for the membership functions of the fuzzy logic and the inference rules. The selected variables are used as inputs to a fuzzy logic system designed to filter the signal from noise on a quasi-real-time scale and to predict data during exposure to intensive precipitation. The results of computer modelling of the algorithm, demonstrating its good accuracy, are presented. The results of comparing the integral water vapor content obtained with the WVR and with the Global Navigation Satellite System (GNSS) are discussed.

Keywords: Fuzzy logic · Singular spectrum analysis · Remote sensing · Water vapor radiometer · Data processing

1 Introduction

At present, the only device for obtaining operative data on a number of atmospheric parameters, namely the integral water vapor content Q (kg/m2), the cloud liquid water path W (kg/m2), and the tropospheric wet delay (cm), is the ground-based automated radiometric complex (GBARC) based on a two-frequency-channel water vapor radiometer [1–3]. The advantages of the GBARC include high-accuracy measurement of parameters, a large volume of measurement data and relatively low cost. The GBARC has obvious disadvantages, such as low accuracy of the parameters estimated during periods of intense precipitation in the form of rain or wet snow. As a result, some of the data related to rainfall are unreliable and should be excluded from further consideration.


Fig. 1. Functional scheme of the GBARC

In a previous paper [4], we proposed an algorithm for processing the results of measuring the integral water vapor content Q. This algorithm includes noise filtering and predicting the values of the data series in the intervals where anomalous samples are obtained. The input data of the digital processor implementing the algorithm for processing GBARC data are the time series of samples Qi and of the atmospheric precipitation sensor Vi. In this paper, we propose an algorithm for processing the results of the Q parameter measurement that takes into account additional data: the atmospheric precipitation sensor, the atmosphere brightness temperatures, and the results of calculating the content of condensed water in clouds W (kg/m2). The additional input information makes it possible to increase the efficiency of the developed algorithm, reduce the loss of observational time and maintain acceptable measurement accuracy.

2 The Ground-Based Automated Radiometric Complex for Troposphere Remote Sensing

The ground-based automated radiometric complex for remote sensing of the atmosphere has been installed and successfully operated for more than three years in the observatories of the Russian national Very Long Baseline Interferometry (VLBI) network "Quasar" [5]. The GBARC consists of the two-channel water vapor radiometer (WVR) measuring the brightness temperature of the atmosphere radiation at frequencies f1 = 20.7 GHz (channel A) and f2 = 31.4 GHz (channel B), the microwave meteorological temperature profiler MTP-5 operating at the central frequency f3 = 56.7 GHz, and the weather station MK-15 [6,7]. The functional diagram of the GBARC is shown in Fig. 1. Microwave receivers of the WVR and MTP-5 were made according to the Dicke modulation scheme. A description of the hardware and software of the GBARC is presented in [2,4].

The distributed computing units of the GBARC system are: the WVR, MTP-5, and MK-15 processors, which manage the data collection systems and transmit data to the remote control and monitoring center (RCMC); the RCMC supercomputer (80 CPUs/1680 cores, peak performance 106.91 Tflop/s); and the operator's PC processor (see Fig. 1). The distributed software applications of these units interact over local computer networks.

The input data of the remote control and monitoring center (RCMC in Fig. 1) are the atmosphere brightness temperatures Tf1 and Tf2 of the WVR A and B channels, the MTP-5 brightness temperature Tf3(Θ) depending on the elevation angle Θ, and the MK-15 meteorological parameters. The central processor (CPU) calculates the values of the integral water vapor content Q, the integral liquid water path W and the radio signal tropospheric wet delay on a quasi-real-time scale on the basis of the measurement results, in accordance with the mathematical models developed [2]. Further processing of the obtained data is carried out on the NI LabVIEW platform, including prediction of the values of the time series data in intervals of intense precipitation, when the samples take anomalous values. A processing algorithm that additionally takes into account the current values of the integral water content W and the meteorological parameters is described in the next section.

3 WVR Data Processing on the Basis of Fuzzy Logic and Method of Singular Spectrum Analysis

The block scheme of the algorithm for time series WVR data processing is presented in Fig. 2. The LabVIEW PID and Fuzzy Logic Toolkit extension is used for the software implementation of the WVR data processing algorithm [9]. The LabVIEW National Instruments (NI) platform greatly simplifies the hardware implementation of the processing algorithm and allows data processing on a quasi-real-time scale. The mathematical apparatus of fuzzy logic makes it possible to determine whether the values of the current samples of GBARC data belong to anomalous values, and to switch into the prediction mode.

Fig. 2. The block scheme of the algorithm for time series WVR data processing


The construction of a fuzzy inference system is based on an algorithm of the Mamdani type [9,10], which allows combining numerical and linguistic information and provides the possibility of switching from expert conclusions to fuzzy IF-THEN rules. The analysis shows that the WVR Tf1 and Tf2 brightness temperatures cannot be used as input variables for the fuzzy logic module. Statistical processing of the GBARC data obtained over a long time interval and for different WVR geographic locations showed that it is most effective to use as input variables for the fuzzy logic module the samples of the integral water vapor content Qi, the integral liquid water content Wi, and the atmospheric precipitation sensor Vi. The result of this statistical analysis is the calculation of histograms of the distributions of such events as the beginning of anomalous Qi samples, their termination, and the reliable determination of Qi, in the form of functionals of the input variables Wi and Vi. The resulting distributions determine the type and parameters of the input membership functions of the fuzzy logic program block. The fuzzy logic program block makes it possible to determine whether the current Qi sample values belong to anomalous values, and to switch either to the vector prediction mode or to the GBARC output high-frequency noise filtering mode.

For vector prediction, the method of Singular Spectrum Analysis "Caterpillar" is used (SSA-"Caterpillar" in Fig. 2) [11,12]. This method also allows adaptive suppression of the noise component of the data time series and the isolation of its main trend component in the mode where Qi values are not anomalous (Fig. 2). A detailed description of the SSA method for this task, including the Singular Value Decomposition (SVD) steps, the recovery of the useful signal, and vector prediction, is contained in [4,11,12]. The basic algorithm of the SSA method contains the steps of singular decomposition using a virtual SVD device from the Linear Algebra LabVIEW palette, recovery of the useful signal, and vector prediction. The developed algorithm includes an initial processing program block, used when the volume N of the sampling vector Qi is less than 360 and the size of the moving average window L is less than 100 (Fig. 2). Implementation of the processing algorithm requires the creation of memory registers (MR) and shift registers (SR) of the FIFO type, as well as the use of a causal digital median filter (Filter) at the output of the program (Fig. 2).

The architecture of the Mamdani type fuzzy logic controller for the WVR data processing algorithm is shown in Fig. 3. We used membership functions to switch from a numerical variable to linguistic information (the fuzzification process) and vice versa (the defuzzification process, Fig. 3). A feature of the membership functions for the fuzzification process is their three-dimensionality: the number of input variables is equal to three, namely the level of integral water content Wi, the atmospheric precipitation sensor Vi, and the value of the time derivative of the integral water vapor content dQi/dt. The type and parameters of the three-dimensional matrix of membership functions are obtained as a result of statistical processing of the GBARC data on the atmospheric precipitation time intervals. The Gaussian membership function type was used in the controller for the fuzzification process.
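The SSA trend-extraction step described above (embedding, SVD, reconstruction by diagonal averaging) can be sketched in plain numpy. The actual tool runs on LabVIEW; the window length, the number of retained components and the synthetic signal below are illustrative assumptions.

```python
# Minimal SSA ("Caterpillar") sketch: trajectory-matrix embedding, SVD,
# rank-k reconstruction, and Hankelization back to a 1-D series.
import numpy as np

def ssa_trend(series, L=100, k=3):
    N = len(series)
    K = N - L + 1
    # Embedding: build the L x K trajectory (Hankel) matrix.
    X = np.column_stack([series[i:i + L] for i in range(K)])
    # Singular value decomposition of the trajectory matrix.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Keep the k leading components (main trend), drop the rest (noise).
    X_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    # Diagonal averaging back to a one-dimensional series.
    trend = np.zeros(N)
    counts = np.zeros(N)
    for i in range(L):
        for j in range(K):
            trend[i + j] += X_k[i, j]
            counts[i + j] += 1
    return trend / counts

t = np.linspace(0, 10, 500)
noisy_q = np.sin(t) + 0.1 * t + 0.3 * np.random.default_rng(0).normal(size=t.size)
smooth_q = ssa_trend(noisy_q, L=100, k=3)
```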


Fig. 3. The Mamdani fuzzy logic controller

The argument of this membership function is the value of dQi/dt for fixed values of Wi and Vi. Another peculiarity of the Mamdani fuzzy logic controller architecture is the use of weight coefficients in the rule of the centre of the membership function sum, when going from the fuzzy output set obtained as a result of expert evaluations to the output logical variable [9,10]. The weight coefficients take into account the probabilistic properties of the membership functions in the given problem. The program block that finishes the processing algorithm in the forecast mode (Fig. 2) operates according to deterministic IF-THEN rules, in accordance with the values of the input variables Wi, Qi and Vi. The IF-THEN rules used are also the result of statistical processing of GBARC data in the learning process.
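A hand-rolled, deliberately reduced sketch of the Mamdani inference step follows: Gaussian memberships, rule activation, aggregation by a weighted sum of clipped output sets, and centroid defuzzification. For brevity the input here is one-dimensional (dQi/dt only), whereas the paper's fuzzification is three-dimensional over dQi/dt, Wi and Vi; all membership parameters, rule weights and test inputs are illustrative assumptions.

```python
# Toy Mamdani inference: decide between "filter" and "predict" modes.
import numpy as np

def gauss(x, mean, sigma):
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2)

# Input memberships on dQ/dt (1-D for brevity).
def mu_normal(dq): return gauss(dq, 0.0, 0.05)
def mu_anomalous(dq): return gauss(dq, 0.4, 0.1)

# Output fuzzy sets on the "mode" axis: 0 = filtering, 1 = prediction.
mode_axis = np.linspace(0, 1, 101)
mu_filter = gauss(mode_axis, 0.0, 0.2)
mu_predict = gauss(mode_axis, 1.0, 0.2)

def infer(dq, rule_weights=(1.0, 1.0)):
    # Rule 1: IF dQ/dt is normal    THEN mode is filter
    # Rule 2: IF dQ/dt is anomalous THEN mode is predict
    a1 = mu_normal(dq) * rule_weights[0]
    a2 = mu_anomalous(dq) * rule_weights[1]
    # Clip each output set by its activation, aggregate by weighted sum,
    # then defuzzify by the centroid of the aggregated set.
    agg = np.minimum(a1, mu_filter) + np.minimum(a2, mu_predict)
    return (mode_axis * agg).sum() / agg.sum()

print(infer(0.02))  # close to 0 -> filtering mode
print(infer(0.45))  # close to 1 -> prediction mode
```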

4 Results of Modeling and Processing of WVR Data

To analyze the efficiency of the processing algorithm, it is necessary to calculate the root-mean-square error in estimating the information signal of the radiometer Qi, observed against a background of non-stationary additive interference. To carry out such an analysis, we have developed a generator simulating a time series of data for the integral water vapor content Q against a background of non-stationary Gaussian noise and impulse noise of different intensity [8]. Such a model accurately reflects the information signal of a real WVR under the influence of atmospheric precipitation. With the help of the generator-simulator of WVR data, the dependence of the RMS of the error signal on the RMS of the input noise at fixed parameters of the impulse noise is calculated. Analysis of the calculation results showed that when the noise power is increased 16-fold, the root-mean-square error ε increases only 4-fold, which is typical of optimal processing algorithms. The root-mean-square error ε of the useful signal Q obtained with the developed algorithm is less than the error obtained with other filtering methods, for example a Savitzky-Golay low-pass filter.
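Such a generator-simulator can be approximated with a few lines of numpy: a smooth Q trend plus non-stationary Gaussian noise plus sparse impulses. All amplitudes, rates and the trend shape below are illustrative assumptions, not the parameters of the actual generator from [8].

```python
# Toy WVR data simulator: trend + non-stationary Gaussian + impulse noise.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
t = np.arange(n)

q_true = 1.5 + 0.5 * np.sin(2 * np.pi * t / 500)         # slow Q trend
sigma = 0.05 * (1 + 0.5 * np.sin(2 * np.pi * t / 300))   # varying noise power
gaussian_noise = rng.normal(0.0, sigma)                  # non-stationary noise

impulses = np.zeros(n)
hit = rng.random(n) < 0.01                 # ~1% of samples are impulse hits
impulses[hit] = rng.uniform(1.0, 3.0, hit.sum())  # precipitation-like spikes

q_measured = q_true + gaussian_noise + impulses
rmse = np.sqrt(np.mean((q_measured - q_true) ** 2))
print(f"input RMSE against the clean signal: {rmse:.3f}")
```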


Fig. 4. Time series of Q WVR data (black), GNSS data (green), processing results (red), rain sensor data (blue) and 10×discrepancy+7 (magenta).

To test the efficiency of the developed algorithm, the WVR data obtained in the Russian VLBI "Quasar" network from January 2016 to December 2018 were processed. The results of comparing the WVR integral water vapor content Q obtained at the radio astronomy observatory "Svetloe" in August 2018 with the same type of GNSS data are presented in Fig. 4. As shown in Fig. 4, the duration of the time sample is 750 h (31 days), and according to the rain sensor signal more than 25% of it falls on intense atmospheric precipitation. During precipitation the data is processed in the prediction mode. The RMS residual, i.e. the difference between the parameter Q values measured by the two independent methods (the WVR and the GNSS), is 0.14 g/cm2 for the entire observation time interval (Fig. 4). Note that this value is not equivalent to a measurement error; its small value indicates the proximity of the parameter Q measurement results obtained by the ground-based WVR and by the GLONASS system. In addition, the GNSS data is delayed by 24 h, whereas the WVR data is available in real time. The mathematical expectation of the residual signal over the entire observation time interval (see Fig. 4) is minus 0.07 g/cm2, which practically means that the Q parameter estimate is not biased. The use of the additional input variables of the integral water content W and the atmospheric precipitation sensor V for the Mamdani type fuzzy logic controller makes it possible to improve the accuracy characteristics of the processing algorithm in comparison with the previous version of the algorithm proposed in [4].

5 Conclusion

The use of the additional input variables of the integral water content W and the precipitation sensor V for the Mamdani type fuzzy logic controller makes it possible to develop an effective algorithm for processing GBARC data for monitoring the troposphere parameters in quasi-real time. The developed algorithm significantly reduces the loss of observational time without a significant loss of measurement accuracy during periods of intense precipitation. The results of computer simulation and full-scale tests confirm the small mean-square measurement error over a wide range of external non-stationary interference. For example, the difference between the parameter Q values measured in different ways for the 750-h sample, including periods of intense precipitation, was 0.14 g/cm2, as shown in Fig. 4.

References

1. Animesh, M., Chakraborty, R.: Prediction of rain occurrence and accumulation using multifrequency radiometric observations. IEEE Trans. Geosci. Remote Sens. PP(99), 1–9 (2018). https://doi.org/10.1109/TGRS.2017.2783848
2. Ilyin, G.N., Troitsky, A.V.: Determining the tropospheric delay of a radio signal by the radiometric method. Radiophys. Quantum Electron. 60(4), 291–299 (2017). https://doi.org/10.1007/s11141-017-9799-6
3. Arsaev, I.E., Bykov, V.Y., Ilin, G.N., Yurchuk, E.F.: Water vapor radiometer: measuring instrument of atmospheric brightness temperature. Meas. Technol. 60(5), 497–504 (2017). https://doi.org/10.1007/s11018-017-1224-1
4. Drozhzhov, K.A., Ivanov, S.I., Ilin, G.N.: Adaptive data processing of a ground-based radiometric complex for remote sensing of tropospheric parameters. In: 2017 IEEE II International Conference on Control in Technical Systems (CTS) – Proceedings, 25–27 October 2017, St. Petersburg, Russia, pp. 297–300 (2017). https://doi.org/10.1109/CTSYS.2017.8109550
5. IAA RAS (2017). http://iaaras.ru/en/. Accessed 05 Apr 2018
6. NPO ATTEKH (2017). http://attex.net/RU/index.php. Accessed 05 Apr 2018
7. NPO Typhoon (2017). http://www.rpatyphoon.ru/product/devices/meteo/mk15/. Accessed 05 Apr 2018
8. Drozhzhov, K.A., Ivanov, S.I.: The research and implementation of processing algorithm for a non-stationary signal with input sampled-data missing and intense impulse noise. In: 2018 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus) – Proceedings, 29 January–1 February 2018, Moscow, Russia, pp. 1071–1074 (2018). https://doi.org/10.1109/EIConRus.2018.8317275
9. Ponce-Cruz, P., Ramírez-Figueroa, F.D.: Intelligent Control Systems with LabVIEW, 216 p. Springer-Verlag London Limited (2010)
10. Mendel, J.M., Tan, W.-W., Hagras, H., Melek, W.W., Hao, Y.: Introduction to Type-2 Fuzzy Logic Control: Theory and Applications, 356 p. Wiley, Hoboken (2014)
11. Golyandina, N., Nekrutkin, V., Zhigljavsky, A.: Analysis of Time Series Structure, 310 p. Chapman & Hall/CRC, Boca Raton (2003)
12. Golyandina, N., Zhigljavsky, A.: Singular Spectrum Analysis for Time Series, 119 p. Springer, Heidelberg (2013)

Multi-agent and Service-Based Distributed Systems

Actor-Network Approach to Self-organisation in Global Logistics Networks

Yury Iskanderov1(B) and Mikhail Pautov2

1 The St. Petersburg Institute for Informatics and Automation of RAS, 39, 14-th Line, St. Petersburg, Russia
iskanderov y [email protected]
2 TPF Forwarding Network, Avd. Paralell 15, 2, 08004 Barcelona, Spain
[email protected]

Abstract. This paper introduces the application of some core principles of actor-network theory (ANT) to the description of self-organised multi-agent socio-techno-informational systems represented by global logistics networks. The actor-network discourse is used to reflect on certain provisions of MAS theory, the Agile style of project management, Blockchain (P2P) decentralised interactions etc. The authors make an attempt to formalise some basic concepts of ANT and discover its potential liaisons with MAS theory, applied semiotics and other related methods. The paper aims at bringing the attention of MAS theorists, logistics researchers and the broader research community to the actor-network paradigm and its applied potential in multi-agent network research.

Keywords: Actor-network theory · Global logistics networks · Self-organisation · Heterogeneity · Multiplicity · Socio-techno-informational systems · Multi-agent systems · Object-oriented semiotics

1 Introduction

Actor-network theory (ANT) may be presented as a descriptor for self-organised multi-agent socio-techno-informational systems (SOMASTIS). The concept of the actor-network was first formulated in late-20th-century texts by Latour, Callon, Law and some other protagonists of the theory, and was primarily applied in science and technology studies (STS) and social philosophy [2,3,10]. The core principle of the theory is based on the idea that the actions of any agent (or "actor" in terms of ANT) are mediated by the actions of a set of other heterogeneous agents. In other words, all actions are distributed over a set composed of humans and non-human objects. Or, more formally, all actions are distributed over a set uniting actors with human intelligence (humans), non-human actors with artificial intelligence (AI actors), and non-human actors without intelligence (other artificial/technological and/or natural objects). An actor-network (AN) can thus be presented as:


AN = ANH ∪ ANAI ∪ ANNH,

where ANH is the set of human actors (human intelligence), ANAI is the set of AI actors, and ANNH is the set of non-intelligent natural and artificial (technological) actors. Actor-network building is a continuous process. Once it starts, individual human and non-human (intelligent and non-intelligent) actors are considered as elements with equal agency (ability to act) in their interplay within the AN. Agency here is nothing but an effect of interaction between human and non-human, intelligent and non-intelligent actors [11]. As Latour outlines in [2], a network is not a thing but rather the recorded movement of a thing. No network exists independently of the very act of tracing it, and no tracing is done by an actor exterior to the network [2]. Heterogeneity (of actors virtually brought together to form an actor-network) should be perceived as a keyword to emphasize the origins of any actor-network as a "rhizome" in the Deleuze-Guattarian understanding of this term [12]: a self-organising multiplicity with a totally decentralised structure and no hierarchical relations between its heterogeneous elements, which have only an initial affinity with each other. It can hardly be identified as a system, since it lacks order and never inherits any order from its predecessors; however, the initial affinity between its heterogeneous members establishes links between them, so that any point of a rhizome can be (and must be) connected to any other, and a broken rhizome will start up again on one of its old lines or on new lines [12]. Today ANT plays a prominent role in the avant-garde of postmodern reflections on socio-techno-informational systems. Unearthing and effectively exploiting the applied potential of this theory is an actual problem for today's and future multidisciplinary research on multi-agent network structures encompassing social, techno-informational and natural elements (represented, inter alia, by global logistics networks).

2 Basic ANT Principles and Their Applied Potential

Generalised Symmetry: Equivalence of heterogeneous actors in their interplay within actor-network AN should mean that the same criteria and terms are equally applied to the technological and natural actors on the one hand, and the socio-cultural actors on the other hand [1]. This vision represents a revolutionary paradigm shift from differentiation between the agency of intelligent (human and AI) actors and that of non-intelligent actors. The latter were totally deprived of agency in older paradigms.

Actor-Network Dualism: Every individual actor of an actor-network is considered a network acting together with this actor. ANT is the theory of actors as networks. As Latour's famous maxim goes: "when one acts, others proceed to action" [1]. Network is a work done by actors, i.e. by entities who act or undergo an action [2]. Any actor A can be involved (and in the majority of cases is involved) in multiple actor-networks {ANi}, or: A ∈ ⋂i ANi.


Nebular Oppositions Revised: ANT reconsiders some fundamental relations and metrics on networks, e.g. [2]:

• Far/Close. Physically close elements (when disconnected) may appear extremely distant from each other if we analyse their connections, and vice versa (cf. the Latourian metaphor: "an Alaskan reindeer might be ten metres away from another one and they might be nevertheless cut off by a pipeline of 800 miles that make their mating for ever impossible" [2]).
• Large/Small. A network is never "larger" than any other one. It can only have a more complex topology.
• Inside/Outside. A network is its own border. A network in ANT has virtually nothing external.

Translation: The operation of translation in actor-networks can be defined as a delegation of powers of representation from a set of actors (actor-networks) to a particular [blackboxed] actor or actor-network in a particular programme of actions: A = T(A1, ..., An), where T is the operation translating actors {A1, ..., An} to A. In other words, the actions of actors {A1, ..., An} (translants) are brought into being or expressed through the representative A acting on behalf of the entire actor-network. The operation of translation equalises actor-network actions in various space-time areas and at various meta-levels of presentation (e.g. when the behaviour of an actor-network is translated through texts, graphs, diagrams, algorithms, formulae etc.). In the Latourian understanding, any association is possible if it is encoded as heterogeneous liaisons established through the operation of translation [2].

Prescription: The prescription index P(A) ∈ [0,1] of actor A can be defined as a fuzzy estimate of the actions of actor A from the viewpoint of other actors in actor-network AN. Less prescribed actors are more easily translatable in the interest of others than more rigidly prescribed ones [6]. Hence the theorem: for any actors A1 and A2, P(A1) < P(A2) => τ(A1) > τ(A2), where τ ≥ 0 is a quantitative metric measuring the ability of an actor or actor-network to be translated. For actor A the following always holds: τ(A) = k/P(A), k ∈ [0, 1].

The operation (process) of translation in complex actor-networks normally includes the following main stages [3,4]:

Problematisation: At the problematisation stage the actors determine the identities, goals and preferences of their [potential] allies. But those allies (presumably) participate in problematisations carried out by other actors, hence their identities can also be defined through other, competing methods. Problematisation may embrace what MAS theory calls the "commitments" of an actor, postulated as the necessity of a chain of actions performed by this actor towards a predetermined goal in the interests of the community of actors. The commitments of an actor are the basis of his future action plan. An actor also uses his knowledge of the commitments of other actors to take into account the "social context and limitations". One of the forms of commitment of an actor is the role accepted by or assigned to him [5].
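The prescription/translatability relation τ(A) = k/P(A) formalised above can be captured in a few lines; the actor names and index values below are illustrative assumptions.

```python
# Toy model of the prescription index P(A) and translatability tau(A).
from dataclasses import dataclass

@dataclass
class Actor:
    name: str
    prescription: float  # P(A) in (0, 1]

def translatability(actor: Actor, k: float = 1.0) -> float:
    return k / actor.prescription  # tau(A) = k / P(A)

container = Actor("standard container", prescription=0.9)  # rigidly prescribed
forwarder = Actor("freight forwarder", prescription=0.3)   # loosely prescribed

# P(forwarder) < P(container) implies tau(forwarder) > tau(container).
assert translatability(forwarder) > translatability(container)
```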


Interessement: Synonymous with interposition, this stage of the translation process represents a group of actions whereby an actor tends to impose and stabilise the identities of other actors determined at the problematisation stage. These actions are performed by various devices. To make a group of actors "interested" means to create a device which can be placed between this group and all other entities tending to re-determine the identities of this group. A1 makes A2 interested (A1 = I(A2)) when it breaks or weakens all links between A2 and the group A3, A4, A5, ..., An of n − 2 actors who may tend to liaise with A2 (Fig. 1 schematically demonstrates the process of "interessement" of actor B by actor A in actor-network AN = {A, B, C, D, E} [3]).

Fig. 1. “Interessement” of actor B by actor A in actor-network AN = {A, B, C, D, E} [3].

A successful outcome of the "interessement" stage confirms (more or less completely) the efficiency of the previous (problematisation) stage and the supposed actor alliances. Besides, this stage tries to break all competitive liaisons and build a system of alliances within the actor-network. At this stage the socio-techno-informational network structures are formed and fixed. This stage of the translation process embraces the "conventions" contemplated in MAS theory. According to MAS theory, conventions fix the conditions of fulfilment/rejection of obligations by an actor. All coordination mechanisms in MAS can be articulated in terms of the [reciprocal] obligations of actors and the pertinent conventions [5].

Enrolement: Determination and coordination of the roles of actors, aiming at the creation of a steady network of alliances. This stage also communicates with MAS theory, where one of the forms of actor obligations is the role accepted by or assigned to an actor. Based on the analysis of partial plans it can be discovered that some actors are overloaded with work, while others are not sufficiently busy. This could be a pretext for the amendment of the partial/local plans of the actors through redistribution of tasks. This coordination scenario is normally realised through "enrolement", where the actors are supposed to play roles (accepted by or assigned to them), interacting based on rigorously determined protocols [5].


Mobilisation: The verb "mobilise" literally means to give mobility to immobile entities. Through the step-by-step appointment of representatives and the establishment of a series of equivalences, heterogeneous actors are moved and then "reassembled" at a new place/time. This stage completes translation, and certain actors start acting as representatives (delegates) of other actors [3]. The mobilisation concept can contribute to the understanding of the team behaviour of actors in MAS theory (where it is considered as something more than just a set of coordinated individual actions of the actors). The basic problem of organising team work in MAS is to organise the work of actors as a team in situations where every individual actor has but limited information on his own team, environment and competition. At the same time, the actor has to materialise his own intentions through individual actions performed simultaneously or sequentially with the actions of other actors [5]. The new vision of the equivalent agency of heterogeneous actors regardless of their inherent intelligence (or the absence thereof) introduced by ANT (the generalised symmetry principle) may help reconsider and enrich the coordination scenarios contemplated in MAS theory.

3 Global Logistics Networks from ANT Perspective

From its humble beginnings in the sociology of science and technology, the ANT diaspora has spread to [13]: sociology, information systems [4,6], geography [7], management and organisation studies [8,9], economics, anthropology and philosophy. However, we did not come across any serious attempts to formalise ANT or to create cross-references between ANT and MAS theory or other methods of applied research on network structures. This paper is mostly intended to introduce this new member of the family of modeling methods applied to technosocieties. Unfortunately, the strict space limits of this paper do not allow us to go deep into detail; we therefore leave further formalisation and detailing of the method for future publications. Here we try to steal an ANT-inspired glance at the self-organised global networks of independent logistics providers and "follow the actors", as Latour has encouraged us to. During the last three decades the international logistics market has witnessed a growing global tendency of establishing and developing alliances between independent logistics providers. These alliances have come into successful competition with each other and with the hierarchically structured multinational corporations. Such associations are known as international or global logistics networks. In most cases such networks are formed by globally distributed independent SMEs. Today dozens of logistics networks operate around the globe, and their number is permanently growing. We can now talk about a new market – the market of logistics networks – with its leaders, specialised (region-oriented, niche-oriented) players, and even meta-networks (hyper-alliances of logistics networks, in which individual networks are members). Another signature tendency may start dominating the market in the near future: some logistics networks seek the integration of online marketplaces, with further conversion of such "hard-soft" networks into fully functional, globally distributed e-logistics providers – extensive users of the Internet of Things in their activity.


A typical modern global logistics network may be presented as an actor-network involving dynamic interactions of heterogeneous human and non-human actors: globally distributed independent logistics providers (network members), a coordinating board and specialised committees, globally transported material objects (goods), information (texts, knowledge) interchanged between the actors, finance, modes of transport, containers, trade lanes, transport networks, their infrastructure and environment, national and international regulations, inspections, ports, terminals, warehouses, their personnel and equipment, manufacturers and consignees of the goods, hard- and software, etc.

In the actor-network approach to the formation and self-organisation of global logistics networks, a critical role is played by ANT's fundamental refusal to view actor-networks as pre-existing (ready-made) structures. For ANT there is "no group, only group formation" [1]. It is a process of "heterogeneous engineering" in which bits and pieces from the social, the technical, the conceptual and the textual are fitted together, and so converted (or translated) into a set of equally heterogeneous products [10]. A global logistics network as an organisation may be seen as a set of strategies which operate to generate complex configurations of network durability, spatial mobility, systems of representation and calculability – configurations which have the effect of generating the centre/periphery asymmetries and hierarchies characteristic of most formal organisations. When ANT explores the character of organisation, it treats it as an effect or a consequence – the effect of interaction between materials and strategies of organisation. Looked at in this way, a self-organised global logistics network is an achievement, a process, a consequence, a set of resistances overcome, a precarious effect. Its components – the hierarchies, organisational arrangements, power relations, and flows of information – are the uncertain consequences of the ordering of heterogeneous materials [10]. The structure of global logistics networks reflects not only the common will of their members to find a workable solution, but also the relation between the forces that they can mobilise and the forces mobilised by their opponents/competitors [15,21].

We have used the TPF logistics network [16] as a case study for the purposes of this research. TPF (initially standing for Trans Pacific Forwarding) traces its roots to 1990, with the first five members representing the USA (two members), Hong Kong, China and Australia. The first move was made because these companies used to work together as business partners and wanted to create a safe environment for doing business, as well as to exchange sales leads and look after each other's customers. They then saw the potential for doing this with more countries, which would give them a wider reach. The network was incorporated in 1996, and bylaws were written to define the core principles of TPF as a self-organised and self-governed organisation (e.g. the basic policy of member exclusivity: every country, or economic region in the case of big markets, is represented by one exclusive member to avoid local interference and competition between members). Since then TPF cannot be considered a once-established entity, but rather an ongoing process of network building and self-organisation, with members permanently coming and going and P2P alliances being assembled and dismantled.

For TPF the Agile Manifesto's formula "individuals and interactions over processes and tools" [20] has turned into "quality (of members) prevails over quantity". Thus, after reaching nearly 50 members representing almost 50 countries in 2011, membership dropped to 33 members in 2019 without any adverse effect on the quality and effectiveness of the TPF organisation. Each new member contributes to the formation of the TPF network culture and simultaneously adapts to it. The strength of logistics networks of the TPF type lies in their mobility and protean changeability: incompatible elements (members, or specific country-based actors translated through them) quit the network, and the network is permanently renewed by attracting substitute actors.

Each member joining TPF brings its own actor-network of manufacturers/suppliers/shippers, cargoes, multimodal freight units (containers) located in its country, local transport, local representations of global ocean/air carriers, local ports, airports, local authorities and regulations, national/local market and logistics-related knowledge, and national business culture. The least prescribed of those actors readily translate their agency to the new TPF member representing them internationally on the TPF platform, which gives those actors additional business opportunities. Through "interessement" of its TPF peers and the creation of peer-to-peer alliances, the member acts as a mediator [19] between its country-based actors and the new international peers it allies with. That is when the competitiveness of the global logistics network emerges: the local actors represented by network members start allying with the network, weakening or even breaking their earlier unquestionable links with the multinational players they used to be allied with. These new alliances may result in the launch of international logistics projects and even in the creation of new supply chains/networks.

The following features of a model self-organised global logistics network are to a certain extent characteristic of TPF in its present state and thus contribute to making TPF an interesting case study from the ANT viewpoint. These characteristic features make TPF and similar logistics networks successful competitors of the multinational corporations (discussed here in a nutshell due to the space limits of this paper):

• Holacratic organisation;
• Agile style of project management;
• Decentralised Blockchain (P2P) contracting.

The building blocks of a holacratic organisational structure are roles (see the definition of "enrolement" above). Holacracy distinguishes between roles and the actors who fill them, as one individual actor can hold multiple roles at any given time. A role is not a job description; its definition follows a clear format including a name, a purpose, optional "domains" to control, and accountabilities, which are ongoing activities to perform [18]. Roles are defined by each circle/team (a committee or working group at a general annual meeting in the case of the TPF network) via a collective governance process, and are updated regularly in order to adapt to the ever-evolving needs of a global logistics network. The holacracy of TPF gives the network particular advantages over the hierarchically organised multinationals, since the "circles" between TPF members are self-organised and developed on a "win/win" basis. It is the TPF members who determine the policies of the network and rule it: the most important decisions are made through member voting at TPF general annual meetings (GAMs); in the interim between the GAMs, coordination functions are delegated from the members to the coordinating board (with permanent succession of members voted for at the GAMs), whose role is strictly limited to the support of self-organised interactions between the network members. All fees paid by the members are invested in the development of the network and distributed between the committees formed to create and develop new projects, support the expansion of the network and the smooth succession of members, insure the members' activities and responsibilities, resolve internal conflicts, develop IT for the network, etc.

Agile method tailoring is defined in [14] as "a process or capability in which human actors determine a system development approach for a specific project situation through responsive changes in, and dynamic interplays between, contexts, intentions, and method fragments". In terms of ANT, human actors make selected contexts, intentions and method fragments "interested", distribute roles and "mobilise" these actors to form new project situations. TPF implements a so-called distributed Agile style of project development, where project development is applied in a distributed setting with members/committees dispersed across multiple locations around the globe. The goal is to leverage the unique benefits offered by each idea brought to the network by the committees/members. Agile development provides increased transparency, continuous feedback and more flexibility when responding to changes.

A Blockchain contracting and payments system in a self-organised global logistics network entirely run by its members (like TPF) is totally decentralised and does not require relying on a central authority or centralised structure that establishes trust. To add a contract/transaction to the Blockchain ledger, the contract/transaction must be shared within the Blockchain peer-to-peer (P2P) network [17]. In [17] the authors give a remarkable example of an alliance between IBM and Maersk, temporarily established to tackle shipping document processing inefficiencies, which resulted in a permissioned Blockchain solution as a means to connect the vast global network of shippers, carriers, ports and customs, where every relevant document or approval was shadowed on the Blockchain, meaning the legacy IT systems were not replaced but augmented. In the actor-network approach, the "problematisation" and "interessement" phases of the translation macro-process explain the formation of Blockchain alliances within global logistics networks and can thus be used in advanced Blockchain research.
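To make the role format concrete, here is a minimal sketch (field names are our own, taken neither from the Holacracy Constitution [18] nor from TPF documents) of a holacratic role and a circle that defines and updates such roles:

from dataclasses import dataclass, field
from typing import List

@dataclass
class Role:
    # A role: a name, a purpose, optional domains to control, and
    # accountabilities, i.e. ongoing activities to perform.
    name: str
    purpose: str
    domains: List[str] = field(default_factory=list)
    accountabilities: List[str] = field(default_factory=list)

@dataclass
class Circle:
    # A circle/team (e.g. a TPF committee) that defines roles via a
    # collective governance process and updates them regularly.
    name: str
    roles: List[Role] = field(default_factory=list)

One individual actor can hold several such roles at any given time, which is what distinguishes a role from a job description.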

4 Other Prospective Applications

The ideological nucleus of ANT is semiotics. In principle, ANT can be viewed as the newest phase in the evolution of semiotics towards object-orientedness.


Every actor-network creates its own metrics and metalinguistic resources for its description. ANT looks to the network builders as the primary actors to follow, and through their eyes it attempts to interpret the process of network construction [13]. Actors are able to "explain" other actors. The creation of metrics means the expansion of networks (e.g. Hughes's Networks of Power grow, and by their very growth they become more and more of an explanation of themselves [2]). In future research we will try to establish links between ANT and the approach developed in the framework of applied semiotics. As an object-oriented semiotic tool, ANT can analyse the connections between the information in the form of texts (documents, contracts, messages, etc.) circulating in a logistics network, meta-texts (scripts, protocols) and the network situations caused by these texts and meta-texts. In our opinion this tool has great applied potential in information security studies: the investigation of dependencies between the degree/quality of protection of the texts moved across the network and the network situations caused by protected/unchanged texts (intermediaries) or deliberately/randomly changed texts (mediators). We will specifically discuss this approach in future publications on information security in logistics systems.

5 Conclusion

The actor-network paradigm projects well onto self-organised multi-agent socio-techno-informational systems (SOMASTIS) in general and onto global logistics networks in particular. Given the demonstrated interplay between ANT and other methods of applied research on network structures (e.g. MAS theory), it becomes clear that, in the context of the described class of problems, actor-network theory can either be used as a leading research tool or be integrated with the existing research methods applied to coordination and collective decision-making processes in self-organised multi-agent systems. At this stage actor-network theory still lacks formalisation, and its applied potential has not been fully explored. This paper is the first in a series of planned publications on the development and application of the actor-network method in SOMASTIS research; it aims at the "interessement" of the research community working in the discussed and adjacent fields, drawing researchers' attention to the actor-network paradigm and involving them in the collective process of further development and utilisation of the discussed method.

Acknowledgements. We cordially thank the Board of Directors of TPF Logistics Network and personally Mrs. Begonia Arsuaga (TPF General Manager) for collaboration and important data provided to support our research.

References

1. Bencherki, N.: Actor-network theory. In: Scott, C., Lewis, L. (eds.) The International Encyclopedia of Organizational Communication. Wiley, New York (2017)


2. Latour, B.: On actor-network theory. A few clarifications plus more than a few complications. Soziale Welt 47, 369–381 (1996)
3. Callon, M.: Some elements of a sociology of translation: domestication of the scallops and the fishermen of St. Brieuc Bay. In: Law, J. (ed.) Power, Action and Belief, pp. 196–223. Routledge, London (1986)
4. Silic, M.: Using ANT to understand dark side of computing – computer underground impact on eSecurity in the dual use context. University of St Gallen, Institute of Information Management, January 2015
5. Gorodetskii, V.I., Karsayev, O.V., Samoylov, V.V., Serebryakov, S.V.: Applied multiagent systems of group control. Sci. Tech. Inf. Process. 37(5), 301–317 (2010)
6. Cordella, A., Shaikh, M.: Actor-network theory and after: what's new for IS research. In: European Conference on Information Systems, 19–21 June 2003 (2003)
7. Müller, M., Schurr, C.: Assemblage thinking and actor-network theory: conjunctions, disjunctions, cross-fertilisations. Transactions of the Institute of British Geographers, vol. 47, pp. 217–229 (2016). https://doi.org/10.1111/tran.12117
8. Alcadipani, R., Hassard, J.: Actor network theory (and after) and critical management studies: contributions to the politics of organising. In: XXXIII Encontro da ANPAD, São Paulo, 19–23 September 2009 (2009)
9. Gherardi, S., Nicolini, D.: Actor-networks: ecology and entrepreneurs. Università di Trento, Dipartimento di Sociologia e Ricerca Sociale (2019)
10. Law, J.: Notes on the theory of the actor-network: ordering, strategy and heterogeneity. Syst. Pract. 5, 379–393 (1992). http://www.heterogeneities.net/publications/Law1992NotesOnTheTheoryOfTheActor-Network.pdf
11. Erofeeva, M.: On the possibility of actor-network theory of action. Sociol. Power 27(4), 51–71 (2015)
12. Deleuze, G., Guattari, F.: A Thousand Plateaus. University of Minnesota Press, Minneapolis (1993)
13. Cressman, D.: A brief overview of actor-network theory: punctualization, heterogeneous engineering & translation. ACT Lab/Centre for Policy Research on Science & Technology (CPROST), School of Communication, Simon Fraser University, April 2009
14. Aydin, M.N., Harmsen, F., Slooten, K., Stagwee, R.A.: An agile information systems development method in use. Turk. J. Elec. Eng. 12(2), 127–138 (2004)
15. Law, J.: Technology and heterogeneous engineering: the case of Portuguese expansion. In: Bijker, W.E., et al. (eds.) The Social Construction of Technological Systems: New Directions in the Sociology and History of Technology, pp. 105–127. MIT Press, Cambridge (2012)
16. TPF Logistics Network website (public resource). www.tpfnetwork.org
17. Hackius, N., Petersen, M.: Blockchain in logistics and supply chain: trick or treat? In: Proceedings of the Hamburg International Conference of Logistics (HICL), vol. 23 (2017)
18. Holacracy Constitution (public resource). www.holacracy.org/constitution
19. Salin, S.: How to pack the Lebenswelt into a black box: assembly instructions. Logos 28(5), 137–168 (2018)
20. Agile Manifesto (public resource). https://agilemanifesto.org


21. Iskanderov, Y., Pautov, M.: Security of information processes in supply chains. In: Abraham, A., Kovalev, S., Tarassov, V., Snasel, V., Sukhanov, A. (eds.) Proceedings of the Third International Scientific Conference Intelligent Information Technologies for Industry, IITI 2018. Advances in Intelligent Systems and Computing, vol. 875. Springer, Cham (2019)

Multi-agent System for Simulation of Response to Supply Chain Disruptions

Jing Tan¹, Rongjun Xu¹, Kai Chen¹, Lars Braubach², Kai Jander³, and Alexander Pokahr⁴

¹ Huawei Technologies Co. Ltd., Bantian, Longgang District, Shenzhen 518129, China {tanjing9,xurongjun,colin.chenkai}@huawei.com
² City University of Applied Sciences Bremen, Neustadtswall 30, 28199 Bremen, Germany [email protected]
³ Brandenburg University of Applied Sciences, Magdeburger Str. 50, 14770 Brandenburg, Germany [email protected]
⁴ Actoron GmbH, Richardstr. 49, 22081 Hamburg, Germany [email protected]

Abstract. Global supply networks of manufacturing companies face many types of disruption. Quick decision-making with only limited information is often required. We propose a novel agent-based planning and scheduling simulation system, which can make rescheduling suggestions within minutes and with limited change to the existing plan. By simulating disruptions of various nature and severity in advance, the system also serves to support preventive supply chain design changes.

Keywords: Supply chain management · Multi-agent system · Agent · Simulation

1 Introduction

Companies who have experienced severe supply chain disruptions often report substantial revenue losses and reduced market value after the event [3,14]. Hendricks et al. [6] show that such disruptions are likely to have long-term financial impact. Mitigation and contingency tactics range from passive acceptance to active customer demand steering [14]. A company's decision is often an attempt at balancing multiple objectives, when critical decision factors are unclear and causal effects are hard to explain. Vakharia et al. [15] introduce various MIP (Mixed-Integer Programming) models, analytically solving for important decision points such as lead time and safety stock [16], preventive supply chain partner selection [4], optimized material flow (order volume and source) with reliability/robustness constraints [2] and global supply chain network design [7].


Due to the growing complexity of a global supply network, traditional optimization models rely on expert knowledge regarding the critical factors in the model as well as on the fine-tuning required to overcome potentially huge result variations.

This study is based on a project of the supply chain department of Huawei Technologies Co. Ltd, an international manufacturing company. The company was subject to several supply chain disruptions in the past year. "War rooms" were used during such disruptions to share information across functional departments, agree on decisions based on the status quo, wait until the next day for decision effects to materialize, and incorporate new inputs to devise further decisions. This iterative process is manual, inefficient and cumbersome to the point of delaying critical decisions.

The goal of this study is not to design a system replacing existing planning and scheduling systems; instead it offers a mechanism which closely resembles the behavior of such systems, imitating their response at a fraction of the original time, at the expense of a slightly reduced solution quality. Typically the standard scheduling system requires hours to reach an optimal solution; if used for simulating various scenarios, it may necessitate days to collect and compare results. This study prototypes a simulation system which enables (1) playing through scenarios in minutes, eliciting responses similar to the actual system and supporting urgent management decisions in emergency situations, and (2) simulating a diverse set of disruptions of varying severity in advance, supporting preventive supply chain design changes.

The study proposes a hybrid approach, solving multiple simple MIP problems in parallel using a decentralized multi-agent system. Each agent controls its own resources and is responsible for its own individual and independent objectives. In addition, its visibility is limited to its immediate suppliers and customers (neighboring agents). During disruptions, directly affected agents start optimizing their individual objectives with simple MIP techniques based on reduced resources (capacity or material); the resulting interim results propagate along the chain to their suppliers or customers in stages, aiming to minimize the number of nodes which need to deviate from the initial state. In case of conflicts, i.e. when a proposal cannot meet an agent's internal constraints, the agent optimizes and proposes an alternative solution, then propagates it back through the chain. The system achieves stability when all agents meet their constraints. Since MIP optimization occurs independently inside each agent, the number of parameters and constraints is substantially reduced compared to modeling and optimizing the entire system. Despite increased communication complexity, the distributed system stabilizes rapidly.

The paper is structured as follows: Sect. 2 provides a brief overview of current simulation methods. Section 3 describes the system prototype developed in the project. In Sect. 4 first simulation results with sample data are provided. Finally, in Sect. 5, conclusions are drawn and further research areas are suggested.

2 Related Work

This section presents a broad overview of simulation techniques and agent-based approaches in the context of scheduling problems.

2.1 Simulation Techniques

A common dynamic modeling technique is system dynamics modeling (SD). As described by Li et al. [9], this modeling approach, adjusted for supply chain risk modeling, highlights causal inter-dependencies and interactions. Uncertainty is captured through the probability distribution of parameters; causal relations (including time delays) are expressed by mathematical equations. Using sampled parameter values and running a Monte-Carlo simulation, worst-, average- and best-case scenarios can be derived. Although suitable for capturing interactions, SD depends on expert knowledge of system structure and parameters [5,17].

Discrete event simulation (DES) is another prevailing modeling technique in logistics and supply chain management. In contrast to SD, it models state changes in discrete time steps, and entities are represented individually. In a literature review by Tako et al. [13], both simulation techniques are compared. In conclusion, despite both approaches being used extensively as decision support tools, DES is more suitable for operational/tactical-level modeling such as local production decisions, whereas SD is more appropriate for long-term and strategic modeling. The choice of modeling technique relies on balancing data volume and the complexity of the system structure: in general, a DES system structure requires less expert knowledge while SD requires less data.

2.2 Agent-Based Modeling

Compared to SD or DES, agent-based simulation relies on bottom-up modeling of supply chain roles by heterogeneous agents acting autonomously, often using simple rules, while the system as a whole exhibits behaviors that were not explicitly programmed (emergent behavior) [10]. A common way of designing agents uses the different roles and functionalities in the supply chain. For example, Ledwoch et al. [8] model supply chains with supplier agents, OEM agents and logistics provider agents; Otto et al. [11] employ a similar approach to model the response dynamics of a system under production shocks. Such models generally focus on studying complex supply network topology and material and information flow, but are deficient in modeling operation-level planning and scheduling. They simplify the product hierarchy with one or a few dummy products, and rely on general assumptions like time delays to imitate planning processes and capacity limits within each agent. Seck et al. [12] apply a more operative approach, separating agents into product, demand, production, stock, order and batch entities. This approach allows studying networks with more operational aspects such as Bill-Of-Materials (BOM), forecast and firm demands, capacity and optimal production batches, etc. Although tested with a limited number of nodes and products, this model is extensible to more intricate product hierarchies.


The level of modeling detail in existing approaches is insufficient for daily operational planning and scheduling. These approaches are inadequate for short-term supply chain disruption simulations of individual material codes and production capacity, especially when decisions involve more than 10,000 material codes and hundreds of production lines in various manufacturing locations. Our approach uses agent-based modeling and simulation techniques in which agents are modeled at an even lower level than in Seck et al. [12]. Here, each agent represents one specific type of product or production capability. Scheduling of competing products for limited material availability and production capability is done using iterative communication and compromise between agents. Since this study is focused on disruptions, communication and compromise always lead to reduced or delayed production compared to the baseline.

3 Agent-Based Disruption Scheduling Approach

The approach is based on real-world Bill-Of-Materials (BOM). A BOM is a tree or network structure representing relationships between raw materials, semi-finished and finished goods. Leaf nodes at the bottom of the tree or network are the lowest-level raw materials, and root nodes at the top are finished goods for customers. Material codes on a lower layer need to be produced or purchased before their parent (downstream customer) nodes can be produced. Scheduling complexity can be defined by the number of layers and connections in BOMs: BOMs with many layers and connections imply high levels of inter-dependence between material codes and therefore result in more complex scheduling.

Two types of agents are used, representing entities with different capabilities in the supply network. A material agent's objective is to produce a designated type of material in a specific location so as to satisfy as many of its downstream customer agents' orders as possible, on time and in full. Its resources are the in-stock and in-transit supply inventory. It controls its own production schedule and has partial visibility into its immediate upstream and downstream agents' schedules when announcing new proposals concerning its own material. It also holds information on substitute materials for its standard choice of supply. It can communicate with other agents through a protocol for sending and receiving new scheduling proposals; it can optimize its production schedule or, alternatively, delay, cancel or combine production orders. It is initialized with a predefined production schedule.

The capacity agent represents the machinery used to produce goods. Like a real-life production line, capacity agents deal with so-called "capacity packages", representing a group of semi-finished or finished goods with similar characteristics, so that they can be produced on the same production line in only slightly different configurations. These agents' objective is to fulfill the maximum number of capacity orders on time and in full. A major difference between a material agent and a capacity agent is that unused capacity left at the end of each planning time unit is deleted. In case of a resource shortage, a capacity agent can optimize its production and propose new schedules.
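As an illustration of the two agent types, the following minimal Python sketch (our own simplification; all class and field names are hypothetical and not taken from the paper) models a material agent with its inventory and schedule, and a capacity agent whose unused capacity expires at the end of each planning day:

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class MaterialAgent:
    # Produces one material code at one location; its visibility is limited
    # to its immediate upstream suppliers and downstream customers.
    material_code: str
    location: str
    in_stock: float = 0.0
    in_transit: Dict[int, float] = field(default_factory=dict)  # day -> quantity
    schedule: Dict[int, float] = field(default_factory=dict)    # day -> planned output
    substitutes: List[str] = field(default_factory=list)        # alternative supply

@dataclass
class CapacityAgent:
    # Represents a "capacity package": a production line shared by goods
    # with similar characteristics.
    package_id: str
    daily_capacity: Dict[int, float] = field(default_factory=dict)

    def end_of_day(self, day: int) -> None:
        # Unlike material inventory, leftover capacity cannot be carried
        # over to the next planning time unit.
        self.daily_capacity[day] = 0.0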


Fig. 1. An illustration of rescheduling: two example BOM trees modeled by material agents (FG: finished goods, SFG: semi-finished goods, RM: raw material) with capacity and inventory buffers, and a planning and scheduling example showing planned vs. actual deliveries of finished goods under a maximum allowed delay

Figure 1 (upper part) demonstrates two simple BOM trees modeled using 7 material agents, 4 of which impose production capacity requirements on 2 capacity packages (represented by 2 capacity agents). The initial production schedule, as shown in Fig. 1 (lower left part), has built-in inventory and capacity buffers to cope with short-notice schedule changes. Disruptions lasting no longer than the buffer length require no rescheduling. If a disruption occurs on short notice and lasts for a longer period of time, the planner attempts rescheduling so that the existing production sequence and amounts necessitate minimum change. In the example shown in Fig. 1 (lower left part), the second delivery of RM1 is delayed long enough to affect the scheduling of other material codes. Figure 1 (lower right part) shows one potential rescheduling solution where the production of the affected SFG1 and FG1 is split.

3.1 Conceptual Overview

Figure 2 illustrates the interaction between agents in a single rescheduling iteration. When a disruption occurs, the system input data changes to reflect the reduced amount of raw materials and/or semi-finished goods. As a first step (cf. Fig. 2), this only triggers the immediately affected agents. In response, in-stock and in-transit inventory is recalculated. In step 2, the triggered agents attempt to reschedule and optimize their production schedules using the new information. The optimization problem is modeled as a simple MIP problem. Due to the agent's limited visibility and control scope, the potentially complex system-wide optimization problem is reduced to individual agents resolving their own small problems. After solving their own optimization problems, agents propose changes to their downstream customer agents' schedules. In step 4, the affected customer agents consolidate all change proposals received from their upstream suppliers and optimize their own schedules.
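A compressed sketch of one such rescheduling iteration (our reading of steps 1–9; all method names are hypothetical, not the authors' code) could look like this:

def rescheduling_iteration(disrupted_agents):
    # Steps 1-2: directly affected material agents recalculate in-stock and
    # in-transit inventory and re-optimize their schedules as small MIPs.
    frontier = set(disrupted_agents)
    while frontier:
        next_frontier = set()
        for agent in frontier:
            agent.recalculate_inventory()
            proposal = agent.optimize_schedule()        # local MIP solve
            # Steps 3-4: propagate the proposal downstream; each customer
            # consolidates proposals from all its suppliers and re-optimizes.
            for customer in agent.downstream:
                if customer.consolidate_and_optimize(proposal):
                    next_frontier.add(customer)
            # Steps 5-7: affected capacity agents reconcile proposals coming
            # from all relevant material agents.
            for cap in agent.capacity_agents:
                if cap.reconcile(proposal):
                    next_frontier.add(cap)
        frontier = next_frontier  # empty once no agent changes its plan

The loop terminates when no agent has to deviate further from its current plan, matching the stability condition of the approach (all agents meet their constraints).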

Fig. 2. Interaction between agents: change proposals flow in steps 1–9 between child material agents, parent material agents and capacity agents


In general, material agent schedules can only degrade, under the assumption that supply chain disruption risks always negatively impact the plan. Proposed changes are propagated to the related capacity agents. In steps 5 to 7, communication and rescheduling are initiated by the affected capacity agents. A capacity agent receives change proposals from all relevant material agents and attempts to solve its own optimization problem. Steps 8 and 9 again describe the consolidation of new proposals and the corresponding rescheduling activities. The current simulation system only models disruption scenarios. However, it is straightforward to extend it to simulate human interventions such as order priority changes, capacity increases or expedited shipments.

3.2 MIP Problem

Step 2 of the rescheduling iteration includes an optimization problem within the supplier material agent (i.e. a child node on a lower level in the BOM) (cf. Sect. 3.3.1). There are two options used in the simulation: if partial order fulfillment is permitted, agents attempt to produce as early and as much as possible, even if it means full amounts will be produced separately or partially cancelled. If partial order fulfillment is prohibited, agents try to produce orders either in full or not at all. This behavior is modeled after real-world scenarios where customer demands need to be produced and shipped as a single batch or completely cancelled by the customer. In step 4, a customer material agent (i.e. a parent node on a higher level in the BOM) solves the optimization problem (cf. Sect. 3.3.2). It is common for multiple suppliers to independently delay their material deliveries; the plant reschedules its production accommodating all potential delays in one shot instead of dealing with each disruption separately. In step 6, a capacity agent solves its optimization problem (cf. Sect. 3.4.1). After the system stabilizes, an optional inventory reduction step strips off excessive inventory caused by order cancellations induced by local optimizations in all material agents. This step is not described in detail in this paper.

3.3 Material Agent

As shown in Fig. 3, the material agent iteratively carries out the following tasks:

1. check planned downstream consumption and update the demand-supply situation;
2. when production cannot fulfill all demands before the required date, try to find the best solution by performing optimization from the supplier material agent's perspective to allocate its available material to each customer;
3. event-driven: receive change requests from its upstream supplier agents;
4. consolidate all change requests and try to find the best solution by performing optimization from the customer material agent's perspective to reconcile the change proposals from its suppliers;
5. when all constraints are met, check for potentially excessive inventory and reduce production accordingly.

Fig. 3. Structure of a material agent: (1) pull parent agent production schedules; (2) optimize supply allocation and push change proposals to parent production schedules; (3) receive event-driven change proposals to the agent's own production schedule; (4) reconcile change proposals and update its own production schedule and supply situation

3.3.1 Supplier Material Agent’s Optimization of Schedule First optimize from a supplier’s perspective: a material agent notices its own produced material quantity cannot fulfill all downstream customer agents’ orders before their demand date. It tries to match as much demand as possible based on the given order priority and the given strategy of order fulfillment: whether it is allowed to fulfill only partial orders. If partial order fulfillment is allowed, its objective is defined as maximize (using M : number of orders; N : rescheduling horizon; dmn : demand of order m on day n; sn : maximum available supply on day n; xmn : quantity allocated to N M   order m on day n; Wm : weight, or priority of order m): obj = Wm ∗ xmn , with the demand constraints constraints

M  m=1
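As a concrete illustration, here is a small sketch of this allocation problem using the open-source PuLP modeler (our own illustration: the paper does not prescribe a solver, and the constraint set follows the reconstruction above):

import pulp

def allocate(demand, supply, weight):
    # demand[m][n]: demand of order m on day n; supply[n]: max supply on day n;
    # weight[m]: priority of order m; x[m][n]: quantity allocated to order m on day n.
    M, N = len(demand), len(supply)
    prob = pulp.LpProblem("supplier_allocation", pulp.LpMaximize)
    x = [[pulp.LpVariable(f"x_{m}_{n}", lowBound=0) for n in range(N)]
         for m in range(M)]
    # Objective: priority-weighted allocated quantity.
    prob += pulp.lpSum(weight[m] * x[m][n] for m in range(M) for n in range(N))
    # Demand constraints: never allocate more than an order asks for.
    for m in range(M):
        for n in range(N):
            prob += x[m][n] <= demand[m][n]
    # Supply constraints: daily allocations cannot exceed available supply.
    for n in range(N):
        prob += pulp.lpSum(x[m][n] for m in range(M)) <= supply[n]
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [[x[m][n].value() for n in range(N)] for m in range(M)]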

With higher importance of the private criteria, BP selects more specific resources and increasingly diverges from the backfilling finish-time procedure and the corresponding jobs execution order. The values obtained by BP with α = 1000 are close to the practical limits provided by the pure private-criteria optimizations. We may conclude from Figs. 1, 2 and 3 that by changing the mutual importance of the private and global scheduling criteria it is possible to find a trade-off solution. Even the smallest α values are able to provide a considerable resources distribution according to VO users' private preferences. At the same time, BP with α < 10 maintains adequate resource utilization efficiency comparable with BFf while providing a more efficient preference-based resource share.
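Schematically, the trade-off amounts to scoring candidate resource selections by a weighted combination of the global and the private criterion; the following sketch (our own illustration with hypothetical attribute names, not the actual SSA implementation) shows the role of the importance factor α:

def combined_score(candidate, alpha):
    # Global criterion: the job finish time is minimized, so it enters
    # negatively; the private criterion (e.g. resource performance, or
    # negated cost) is weighted by the importance factor alpha.
    return -candidate.finish_time + alpha * candidate.private_value

def select(candidates, alpha):
    # Small alpha keeps the choice close to the backfilling finish-time
    # order; large alpha approaches the pure private-criterion optimum.
    return max(candidates, key=lambda c: combined_score(c, alpha))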

Global and Private Job-Flow Scheduling Optimization in Grid Virtual Organizations


Fig. 3. Simulation results: average performance of the resources allocated

4 Conclusions and Future Work

In this paper, we study the problem of resource selection optimization for job-flow scheduling and execution in Grid virtual organizations. Fair scheduling policies in VOs usually assume configurable resource distribution according to the individual preferences of VO stakeholders. For this purpose we used the SSA algorithm as the resource selection step in a conservative backfilling procedure. SSA performs resource selection optimization for each job according to both global and private scheduling criteria. In this study we considered jobs' finish time minimization as the global scheduling criterion, and resource performance and cost optimization as different users' scheduling criteria. The simulation study demonstrated the efficiency of the proposed fair resource sharing approach. The difference in jobs' execution according to the private criteria reached 40%. At the same time, the difference from a pure global criterion optimization is less than 1% in a wide range of considered scheduling scenarios. Besides that, by configuring the importance factor between the private and integral scheduling criteria it is possible to influence the fair scheduling outcome and propose a balanced solution. Future work will focus on a more detailed study of the private and global criteria, their mutual consistency and possible scheduling strategies to improve resource usage efficiency and the quality of service.

Acknowledgements. This work was partially supported by the Council on Grants of the President of the Russian Federation for State Support of Young Scientists (grant YPhD-2979.2019.9), RFBR (grants 18-07-00456 and 18-07-00534), and by the Ministry on Education and Science of the Russian Federation (project no. 2.9606.2017/8.9).



Type-Based Genetic Algorithms

Roman Sizov and Dan A. Simovici

Computer Science Department, University of Massachusetts Boston, Boston, MA 02125, USA {rsizov,dsim}@cs.umb.edu

Abstract. This paper introduces a novel type-based genetic algorithm and its applications to two well-known problems: the N-queen problem and finding the global minimum of the Rosenbrock function. The algorithm offers a new approach to the internal structure of individuals in populations of genetic algorithms.

Keywords: Genetic algorithm · Type-based · Gender-based · N-queen problem · Rosenbrock function

1 Introduction

Genetic algorithms have been successfully applied to find approximate solutions to several NP-complete problems and to solve a variety of multi-objective optimization problems, ranging from groundwater quality management [3] to fuzzy autopilot controllers [2]. In [7] a comparative study of multi-objective genetic algorithms and simulated annealing applied to analogue filter tuning is performed. We introduce a novel type-based genetic algorithm where individuals are classified based on their internal chromosomal structure. Our algorithm partitions the population of individuals into several groups based on this structure, and allows mating only between individuals that belong to different groups. In this note we focus on two-group populations whose individuals contain one or two full chromosomes. Various genetic algorithms that split the population into groups based on gender were introduced in the past [1,6]. However, their approach only tags an individual as a male or a female and does not involve the chromosomal composition of individuals; in our algorithm, the structural differences define the type of an individual. We demonstrate that our type-based genetic algorithm clearly outperforms the classic genetic algorithm on two important and much-examined problems, the N-Queen problem and finding the global minimum of the Rosenbrock function [5], both in terms of solution quality and stability.

2 Individuals, Population and Chromosomes

The population consists of individuals of different types. Each individual consists of one or more chromosomes, where each chromosome represents a solution to the problem solved by the genetic algorithm. Thus, each individual has a type, and there are at least two types. Individuals can mate only with individuals of a different type. We focus on populations that contain two groups of individuals, Group A and Group B, whose members are designated as individuals of Type A and Type B respectively. These types of individuals are defined as follows.

Definition 2.1. A Type A individual has two chromosomes of the same size. A Type B individual is an individual with two chromosomes, the first of which is of the same size as the chromosomes in a Type A individual and the second of which is a null chromosome, i.e. a chromosome of size 0.

Mating takes place between individuals of Type A and Type B as follows. A Type A parent provides one of its chromosomes to its child in a random fashion. The second chromosome is randomly obtained from a Type B parent, and this chromosome may be null. The second chromosome of the child defines its type.

We keep the population size fixed using a certain type of selection. However, the population size increases temporarily during the mating stage. Once the mating stage is complete, the population size is reduced to the original size by keeping only the most successful individuals. Also, note that since only individuals of different types can mate, we have to keep at least one individual of each type even if these individuals are not the most successful ones in the population; however, they still have to be the most successful in their group. Moreover, the initial population is created in a random fashion, but at least one individual of each type must be present.

There are two kinds of chromosomes used in our algorithm. The first kind is a full chromosome with a fixed positive size L, where L is merely the number of genes in the chromosome. The second kind is a null chromosome with size L = 0. This kind of chromosome is merely used to simplify the mating process and to automate the determination of the type of an individual. Each full chromosome comprises L genes. Genes can have various types, e.g. an integer, a real number, an integer from a to b, a matrix or any other structure. In our algorithm the types of genes are fixed, i.e. the mutation process cannot change the type of a gene; only the value of a gene can be changed by mutation.

The algorithm assumes that larger values of a fitness function mean a better solution. Fitness functions are normally applied to chromosomes, and this is the case in our algorithm. However, we also allow fitness functions to be applied to individuals.

Definition 2.2. Let C be a chromosome space and let f : C → R be a fitness function, where f(∅) = −∞. Let I = C × C be the space of individuals. Define the fitness function F : I → R of an individual I = (C1, C2) as F(I) = max(f(C1), f(C2)).

There are two selection operators in the algorithm, since there are two different points where selection is necessary. The first selection is done to choose individuals from the population for mating. We choose the top α percent (α < 100%) of the population, with the provision that the best individuals from both Group A and Group B must be present to guarantee that the mating process occurs. The second selection happens after the mating process is complete. The best m individuals are chosen from the temporarily increased population, which includes both parents and their offspring, in order to produce a population of the same size as the original population. Again, the best individuals from both Group A and Group B must be present.

The algorithm allows mating only between individuals of Type A and Type B, who contribute the first and the second chromosome of the child, respectively. The second chromosome defines the type of the child: if this chromosome is null the child is of Type B; otherwise the type of the child is A. This type of genetic information exchange is possible in the context of the distinct chromosomal structures of individuals that we propose.

Our algorithm allows crossover to occur during the mating process. A crossover might occur with some probability in a Type A individual just before mating with an individual of Type B, since the Type A individual has two full chromosomes. The crossover does not change the Type A individual but only affects the child.

A mutation operator acts on both the chromosome and gene levels. Mutation can occur in the chromosomes and genes of both types of individuals; note that for a Type A individual both chromosomes and their genes might mutate. Mutation occurs with some small probability. On the chromosome level the genes subjected to mutation are randomly chosen, but the actual mutation happens at the gene level. Once a gene is chosen to be mutated, it is up to the gene to decide how to mutate and to what extent. Mutation occurs with some small probability in both parents just before the mating process; however, it does not affect the parents (i.e. copies of the parents mutate, and these copies are discarded later) but does affect the offspring.

The pseudocode of our algorithm is shown in Algorithm 1. Note that the selection operator gets a population and a fitness function and returns a subpopulation based on the internal criteria of the operator, e.g. the best 25% of individuals in the population. There are two selection operators in the algorithm and they may have different internal criteria. Both operators must return a population with at least one Type A individual and one Type B individual. The mating operator gets one Type A individual and one Type B individual from the population and a crossover operator, and returns a non-empty set of children produced by these individuals. This operator may or may not give some preference to either type of individuals produced, e.g. a 60% chance for a child to be a Type B individual. Also, it is up to the mating operator to decide on the probability of crossover occurrence. The crossover operator gets a pair of full chromosomes and changes the obtained pair; the operator defines both the type of crossover and its probability. Note that this operator does not return anything.
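The definitions above translate almost directly into code; the following minimal sketch (our own, not the authors' implementation) represents an individual as a chromosome pair and implements the fitness function F of Definition 2.2 together with the basic mating rule:

import math
import random

NULL = None  # the null chromosome of size 0

def individual_type(ind):
    # ind is a pair (C1, C2); a null second chromosome marks Type B.
    return "B" if ind[1] is NULL else "A"

def F(ind, f):
    # Fitness of an individual: the larger of its chromosome fitnesses,
    # with f(null) = -infinity (Definition 2.2).
    fit = lambda c: -math.inf if c is NULL else f(c)
    return max(fit(ind[0]), fit(ind[1]))

def mate(parent_a, parent_b, p_type_b=0.6):
    # The Type A parent donates one of its two full chromosomes at random;
    # the second chromosome comes from the Type B parent and may be null,
    # which determines the child's type (here with a 60% bias toward Type B).
    first = random.choice(parent_a)
    second = NULL if random.random() < p_type_b else parent_b[0]
    return (first, second)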


Input: Population size p ≥ 2, Number of Generations n > 0, Chromosome Template C, Selection Operator S1, Selection Operator S2, Mating Operator M, Crossover Operator R, Mutation Operator U, Fitness Function F
Output: Individual with highest fitness function value
begin
    P ← createInitialPopulation(p, C)   /* creates initial random population */
    for i = 1 to n do
        P ← S1(P, F)                    /* first selection operator is applied */
        while size(P) ≤ 2p do
            randomly choose a Type A individual from the population and set it to I1
            I1 ← copy of I1
            randomly choose a Type B individual from the population and set it to I2
            I2 ← copy of I2
            Offspring ← M(I1, I2, R)    /* mating operator is applied, with the crossover operator applied internally */
            apply the mutation operator U to each child in Offspring
            add Offspring to P
        end
        P ← S2(P, F)                    /* second selection operator is applied; returns a subpopulation of size p */
    end
    return an individual from the population P with the highest value of the fitness function
end

Algorithm 1. Type-Based Genetic Algorithm

Space limitations allow us to present only two applications, namely the N-Queen problem and finding the global minimum of the Rosenbrock function. Each of these problems has its own fitness function and structure of full chromosomes, but the genetic operators, population structure and all constants used in the algorithm are the same. In both cases the algorithm was run with a population size of 500 for various numbers of generations. In our implementation the first selection operator has α = 25%, i.e. the 25% of individuals with the highest fitness function values are chosen; the second selection operator keeps the original size of the population, again choosing only the individuals with the highest fitness function values. Both of these selection operators keep at least one best individual of Type A and one best individual of Type B. The implemented mating operator produces from 1 to 4 children in a random fashion, but with a probability of 60% for a Type B child to be born. This bias toward Type B individuals compensates for the higher chances of survival of Type A individuals, due to the fact that the fitness function value of an individual is the maximum of the fitness function values of its two chromosomes, and a Type A individual has two full chromosomes whereas a Type B individual has only one full chromosome.


We used the single-point crossover operator, with the probability of crossover based on the size of a full chromosome. The mutation operator was implemented with the probability of a single gene mutation equal to 1/(L1 + L2), where Li is the length of the i-th chromosome in an individual for i = 1, 2. Thus, the probability of a gene mutation occurring in a Type B individual equals 1/L, and in a Type A individual 1/(2L), where L is the length of a full chromosome. This keeps, on average, only one mutation per individual. We also implemented the classic genetic algorithm to compare the results of both algorithms when applied to the two chosen problems. The classic genetic algorithm used the same genetic operators and constants wherever applicable.

The N-Queen problem is the problem of placing N chess queens on an N×N chessboard so that no two queens attack each other. We used N = 25 and also applied the classic genetic algorithm to this problem [4]. The structure of a chromosome contains 25 genes, where each gene is an integer from 1 to 25. Each gene represents a row on the 25×25 chessboard, where the column is the position of the gene in the chromosome, i.e. a number from 1 to 25. Thus, each column has exactly one queen and hence there are no queen attacks within a column. The fitness function of a chromosome for this problem is the number of queen attacks on the chessboard, taken with the negative sign. Note that this problem has many equally good solutions and the algorithm returns only one of them.

The Rosenbrock function is a non-convex function, introduced by Rosenbrock in 1960 [5], with a unique global minimum and many local minima. The function is defined by

f(x_1, x_2, ..., x_n) = Σ_{i=1}^{n−1} [100(x_{i+1} − x_i^2)^2 + (1 − x_i)^2], where (x_1, x_2, ..., x_n) ∈ R^n.

The global minimum is at (1, 1, ..., 1) and equals 0. We used n = 10 for this problem and only looked at the range [−10, 10] for each of the variables. A full chromosome contains n = 10 genes, where the gene at position i represents the i-th variable, 1 ≤ i ≤ 10. Each gene is represented by an integer from −1000 to 1000, and the value of a variable is obtained by dividing the integer by 100.
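Both fitness functions are easy to state in code; this sketch (our own illustration) follows the chromosome encodings described above:

def nqueen_fitness(chromosome):
    # chromosome: list of 25 integers in 1..25; gene i is the row of the queen
    # in column i, so column attacks are impossible by construction.
    # Fitness = -(number of attacking pairs); 0 means a solution.
    attacks, n = 0, len(chromosome)
    for i in range(n):
        for j in range(i + 1, n):
            same_row = chromosome[i] == chromosome[j]
            same_diag = abs(chromosome[i] - chromosome[j]) == j - i
            attacks += same_row or same_diag
    return -attacks

def rosenbrock_fitness(chromosome):
    # chromosome: list of 10 integers in -1000..1000; dividing by 100 yields
    # the variable values. Fitness = -f(x), since larger fitness is better.
    x = [g / 100 for g in chromosome]
    f = sum(100 * (x[i + 1] - x[i] ** 2) ** 2 + (1 - x[i]) ** 2
            for i in range(len(x) - 1))
    return -f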

3 Experimental Results and Analysis

Figure 1 shows the results of applying our genetic algorithm and the classic genetic algorithm to the N-Queen problem. Our algorithm is shown to produce much more stable results of better quality. The stability of our algorithm is likely due to the fact that Type A individuals keep two solutions to the problem and, as a result, have a higher chance to survive and archive the already found solutions, whereas in the classic algorithm all individuals have only one solution and, as a result, are more likely to lose it due to crossovers and mutations. In our algorithm we have individuals that can save older solutions (namely, Type A individuals) and also individuals that are open to exploring the solution space in order to find possibly better new solutions to the problem (namely, Type B individuals).

Fig. 1. Average (over 5 runs) fitness function value vs. number of generations for the N-Queen problem, where N = 25

Figure 2 shows the results of applying the genetic algorithms to the problem of finding the global minimum of the Rosenbrock function. Again, we apply both genetic algorithms: our genetic algorithm and the classic one [4]. The results are very similar to the results for the previous problem, and the explanation for the difference between the algorithms is the same.

Fig. 2. Average (over 20 runs) fitness function value vs. number of generations for the Rosenbrock function with 10 variables

4 Conclusions and Future Work

We introduced a novel type-based genetic algorithm and applied it to two well-known problems. Our algorithm outperforms the classic genetic algorithm, achieving better solution quality and stability. We intend to examine type-based genetic algorithms with more than two types, where each type is defined based on the number of full chromosomes that an individual of the respective type contains.

References

1. Ansótegui, C., Sellmann, M., Tierney, K.: A gender-based genetic algorithm for the automatic configuration of algorithms. In: Gent, I.P. (ed.) Principles and Practice of Constraint Programming – CP 2009. Lecture Notes in Computer Science, vol. 5732, pp. 142–157. Springer (2009)
2. Blumel, A.L., Hughes, E.J., White, B.A.: Multi-objective evolutionary design of fuzzy autopilot controller. In: Zitzler et al. [8], pp. 668–680
3. Erickson, M., Mayer, A., Horn, J.: The niched Pareto genetic algorithm 2 applied to the design of groundwater remediation systems. In: Zitzler et al. [8], pp. 681–695
4. Goldberg, D.E.: Genetic Algorithms in Search, Optimization and Machine Learning, 1st edn. Addison-Wesley Longman Publishing Co., Inc., Boston (1989)
5. Rosenbrock, H.H.: An automatic method for finding the greatest or least value of a function. Comput. J. 3(3), 175–184 (1960)
6. Sánchez-Velazco, J., Bullinaria, J.A.: Sexual selection with competitive/co-operative operators for genetic algorithms. In: Proceedings of the IASTED International Conference on Neural Networks and Computational Intelligence, NCI 2003, Cancun, pp. 191–196. IASTED/ACTA Press (2003)
7. Thompson, M.: Application of multi objective evolutionary algorithms to analogue filter tuning. In: Zitzler et al. [8], pp. 546–559
8. Zitzler, E., Deb, K., Thiele, L., Coello, C.A.C., Corne, D. (eds.): Evolutionary Multi-Criterion Optimization, First International Conference, EMO 2001, Zurich. Lecture Notes in Computer Science, vol. 1993. Springer (2001)

Distributed Construction of a Level Class Description in the Framework of Logic-Predicate Approach to AI Problems

Tatiana M. Kosovskaya

St. Petersburg State University, University emb. 7-9, St. Petersburg 199034, Russia [email protected]

Abstract. A method for the distributed construction and use of a level description of goal conditions within the framework of the logic-predicate approach to AI problems is described in the paper. The previously proposed logic-predicate approach to solving AI problems is briefly outlined; many of these problems are NP-complete or NP-hard. For a set of goal conditions, the author has earlier proposed the construction of their level (hierarchical) descriptions, the use of which significantly decreases the computational complexity of the problems. The construction of such descriptions is performed only once, and the resulting descriptions are then reused. Both the distributed construction of level descriptions of goal conditions and their distributed use make it possible to decrease the time complexity of solving AI problems.

Keywords: Predicate formulas · Computational complexity · Level description · Distributed construction and use of level description

1 Introduction

The use of the logical approach to solving various problems of Artificial Intelligence (AI) is widespread. As a rule, the logical approach is understood as the presentation of input data by means of binary (or finite-valued) strings of a fixed length. In such a case, the processing of binary strings requires time linear (or at least polynomial) in the data length. This seems to be very effective in terms of the time spent on solving the problem. However, in many cases the length of such a binary string depends exponentially on the length of the initial parameters of the problem [9]. Besides, the original structure of the object under investigation is lost. The setting of AI problems which admit formalization by means of the predicate calculus language is described in this paper. Unfortunately, problems formulated in this way are NP-complete or NP-hard [4]. But the computational complexity of such problems, when they are solved by an exhaustive algorithm, coincides with the length of their code using a binary string [9]. Earlier, the author proposed the concept of a level description of classes of objects, which allows one to significantly reduce the computational complexity of


problems formalized by means of the predicate calculus language. Algorithms for the distributed construction and use of such a description are proposed in this paper.

2 Logic-Predicate Approach to AI Problems

A logic-predicate approach to AI problems and algorithms for their solution within this approach are described in [6]. Let an investigated object be represented as a set of its elements ω = {ω_1, ..., ω_t}. A collection of predicates p_1, ..., p_n characterizing properties of elements from ω and relations between them is given. The logical description S(ω) of the object ω is the set of all literals satisfiable on ω. The set of all objects is partitioned into classes Ω = ∪_{k=1}^{K} Ω_k. The logical description of the class Ω_k is a formula A_k(x_k) in the form of a disjunction of elementary conjunctions such that if A_k(ω) then ω ∈ Ω_k.¹ With the use of such descriptions, some AI problems may be reduced to the proof of the following sequents [4,6]²

S(ω) ⇒ ∃x_{k≠} A_k(x_k),   (1)

S(ω) ⇒ ⋁_{k=1}^{M} A_k(ω),   (2)

where A_k(x_k) is an elementary conjunction. Problem (1) is NP-complete [2,6]. Problem (2) is polynomially equivalent [7] to the Graph Isomorphism problem, a so-called "open" problem [1] for which it is known neither whether it is in the class P nor whether it is NP-complete. We will say that two elementary conjunctions of atomic formulas P and Q are isomorphic if one can rename their arguments in such a way that these formulas coincide completely. To decrease the number of algorithm run steps while solving problems (1) and (2), a level description of classes was suggested in [3]. In fact, such a description is hierarchical and is based on the extraction from the class description of sub-formulas which are isomorphic to each other and define generalized characteristics of objects from one class [5,6]. It may be done by means of extraction of formulas P^1_i(y^1_i) which are isomorphic to "frequently" appearing sub-formulas of A_k(x_k) with "small complexity". At the same time, a system of equivalences of the form p^1_i(y^1_i) ⇔ P^1_i(y^1_i) is written down. Here p^1_i are new predicates of the 1-st level and y^1_i are new 1-st level variables for lists of initial variables.

¹ Here and below x is a notation for an ordered list of the elements of a finite set x. If the elements of the list x belong to the set y, we write x ⊆ y.
² To mark that all values of the variables from the list x satisfying the formula A(x) are distinct, instead of the formula ∃x_1 ... ∃x_m (&_{i=1}^{m−1} &_{j=i+1}^{m} (x_i ≠ x_j) & A(x_1, ..., x_m)) the formula ∃x_≠ A(x) will be used.


Denote by A^1_k(x^1_k) the formulas received from A_k(x_k) by substituting p^1_i(x^1_{ij})³ for all occurrences of sub-formulas isomorphic to P^1_i(y^1_i). The 1-st level description S^1(ω) of the object ω is the union of S(ω) and the set of all constant atomic formulas p^1_i(τ^1_{ij}) such that the 1-st level constant τ^1_{ij} is the name of a list of initial objects τ^1_{ij} and p^1_i(τ^1_{ij}) ⇔ P^1_i(τ^1_{ij}). The process of extraction of sub-formulas P^l_i(y^l_i) which are isomorphic to "frequently" appearing sub-formulas of A^{l−1}_k(x^{l−1}_k) with "small complexity" may be repeated for l = 1, ..., L. A level description of classes has the form

p^1_1(x^1_1) ⇔ P^1_1(y^1_1)
...
p^l_i(x^l_i) ⇔ P^l_i(y^l_i)
...
p^L_{n_L}(x^L_{n_L}) ⇔ P^L_{n_L}(y^L_{n_L})
A^L_k(x^L_k)   (3)

While using a 2-level description, the number of steps of an exhaustive algorithm solving problem (1) decreases from O(t^{m_k}) to O(n_1 · t^r + t^{δ^1_k} + n_1), where r is the maximal number of arguments of the 1-st level predicates, n_1 is the number of such predicates, and δ^1_k is the number of initial variables which are present in A_k(x_k) and absent in A^1_k(x^1_k). The number of steps of an algorithm based on derivation in a sequent calculus or on the use of the resolution method decreases from O(s^{a_k}) to O(s_1^{a^1_k} + Σ_{j=1}^{n_1} s^{ρ^1_j}), where a_k and a^1_k are the maximal numbers of literals in A_k(x_k) and A^1_k(x^1_k), respectively, s and s_1 are the numbers of literals in S(ω) and S^1(ω), respectively, and ρ^1_j is the number of literals in P^1_j(y^1_j). An algorithm for the extraction of the maximal formula which is isomorphic to sub-formulas of two given elementary conjunctions, and for the construction of a level description of classes, is given in [8].
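To make the isomorphism test above concrete, here is a minimal brute-force sketch (ours, not the algorithm of [8]). Literals are encoded as (predicate, argument-tuple) pairs, an encoding we chose for illustration; the search over all argument renamings is exponential, consistent with the hardness results cited in the paper.

```python
from itertools import permutations

def isomorphism(P, Q):
    """Brute-force test: elementary conjunctions P and Q (sets of literals,
    a literal being a pair (predicate_name, tuple_of_arguments)) are
    isomorphic iff some renaming of arguments makes them coincide."""
    args_p = sorted({a for _, args in P for a in args})
    args_q = sorted({a for _, args in Q for a in args})
    if len(P) != len(Q) or len(args_p) != len(args_q):
        return None                              # cheap necessary conditions
    for image in permutations(args_q):
        rename = dict(zip(args_p, image))
        if {(p, tuple(rename[a] for a in args)) for p, args in P} == set(Q):
            return rename                        # a witnessing renaming
    return None

# Example: p(x1,x2) & q(x2) is isomorphic to p(y2,y1) & q(y1)
P = {("p", ("x1", "x2")), ("q", ("x2",))}
Q = {("p", ("y2", "y1")), ("q", ("y1",))}
print(isomorphism(P, Q))   # {'x1': 'y2', 'x2': 'y1'}
```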

3 Distributed Construction of Level Description

Extraction of the maximal formula isomorphic to sub-formulas of A_{k1}(x_{k1}) and A_{k2}(x_{k2}) is an NP-hard problem [8]. The number of steps of the algorithm has the order O(n_1^s · n_2^s), where n_i is the number of literals in the formula A_{ki}(x_{ki}) and s = min{n_1, n_2}. At best, when |x_{k1}| = |x_{k2}| = t and all predicates are t-ary, the number of steps of the algorithm has the order O((k_1 k_2)^2). On average, the number of steps of the algorithm has the order O((k_1 k_2)^{1/2} log(k_1 k_2)). Since this algorithm should be applied to each pair of formulas A_{k1}(x_{k1}) and A_{k2}(x_{k2}), i.e. K(K−1)/2 times, it makes sense to perform this procedure for

³ Index j is between 1 and the number of occurrences of sub-formulas isomorphic to P^1_i(y^1_i).


each pair of formulas A_{k1}(x_{k1}) and A_{k2}(x_{k2}) on a separate processor (or a separate computer on the network). In total, K(K−1)/2 processors are required. The main processor distributes jobs to the processors in the network and receives from them the maximal formulas Q_{k1k2}(x_{k1k2}) isomorphic to sub-formulas of A_{k1}(x_{k1}) and A_{k2}(x_{k2}), together with the unifiers of the formula Q_{k1k2}(x_{k1k2}) with the corresponding sub-formulas of A_{k1}(x_{k1}) and A_{k2}(x_{k2}). In addition, the main processor analyzes whether the formulas Q_{k1k2}(x_{k1k2}) and Q_{k3k4}(x_{k3k4}) can be isomorphic to each other. Obviously, if the number of literals or arguments in them is different, or if there is at least one predicate that has a different number of occurrences in these formulas, then Q_{k1k2}(x_{k1k2}) and Q_{k3k4}(x_{k3k4}) are not isomorphic to each other. Formulas Q_{k1k2}(x_{k1k2}) and Q_{k3k4}(x_{k3k4}) that are "suspicious" for isomorphism are sent to auxiliary processors for checking. In the case of their isomorphism, the names of the variables in one of the formulas are changed so that these formulas coincide up to the order of literals. In addition, the unifiers for this formula are modified. After the formulas Q^1_j(x^1_j) (j = 1, ..., K^1), isomorphic to sub-formulas of several formulas A_k(x_k), have been found and their unifiers selected, the main processor distributes their pairs among the auxiliary processors to select all formulas Q^2_j(x^2_j) (j = 1, ..., K^2) isomorphic to sub-formulas of several formulas Q^1_{j1}(x^1_{j1}) and Q^1_{j2}(x^1_{j2}), and to find their unifiers. The procedure is repeated with the formulas Q^l_j(x^l_j) (j = 1, ..., K^l) for l = 1, ..., L, for some L for which no pair of formulas Q^L_{j1}(x^L_{j1}) and Q^L_{j2}(x^L_{j2}) has any sub-formulas isomorphic to each other. Note that the number of literals in the formulas Q^l_j(x^l_j) decreases with increasing l. Consequently, the total computational complexity of the entire process has the same order (with the exponent increased by 1) as the computational complexity of the first stage of the calculations.⁴ The following procedure can be performed on a single processor, since its computational complexity is linear in the length of the initial class descriptions and the extracted sub-formulas. Every selected sub-formula Q^l_j(x^l_j) that does not have sub-formulas isomorphic to other selected sub-formulas is denoted by P^1_i(x^1_i) (different values of the index i correspond to different values of the pair l and j). Equivalences of the form p^1_i(x^1_i) ⇔ P^1_i(x^1_i), where p^1_i(x^1_i) are atomic formulas with new 1-st level predicates and variables for the lists of initial variables, are written down. Replace all sub-formulas isomorphic to P^1_i(x^1_i) in the formulas A_1(x_1), ..., A_K(x_K) by the atomic formulas p^1_i(x^1_i). The first-level description A^1_k(x^1_k) is constructed. Repeat the same procedure successively with the extracted formulas whose sub-formulas are isomorphic to P^l_i(x^l_i). New (l+1)-level predicates p^{l+1}_i(x^{l+1}_i) ⇔ P^{l+1}_i(x^{l+1}_i) and an (l+1)-level description A^{l+1}_k(x^{l+1}_k) are constructed. As a result, a multi-level description of the form (3) is constructed.

⁴ This is based on the fact that if α_1 > α_2 > ... > α_m, then Σ_{i=1}^{m} q^{α_i} = O(q^{α_1 + 1}).
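The distribution scheme of this section can be sketched as follows (the sketch is ours). One job is created per pair of class descriptions, i.e. K(K−1)/2 jobs, and dispatched to worker processes playing the role of the auxiliary processors. As a stand-in for the NP-hard extraction of the maximal common sub-formula [8], each worker here only intersects predicate-symbol multisets, which is the cheap necessary condition for isomorphism mentioned above; the class descriptions are hypothetical.

```python
from collections import Counter
from itertools import combinations
from multiprocessing import Pool

# Hypothetical class descriptions: each A_k is a set of literals
# (predicate_name, tuple_of_variables); real descriptions come from [8].
CLASSES = {
    1: {("p", ("x1", "x2")), ("q", ("x2",)), ("p", ("x2", "x3"))},
    2: {("p", ("y1", "y2")), ("q", ("y1",))},
    3: {("q", ("z1",)), ("p", ("z1", "z2"))},
}

def compare_pair(job):
    """One job per pair (k1, k2): intersect the multisets of predicate
    symbols, the necessary condition the main processor checks before
    scheduling full isomorphism tests on auxiliary processors."""
    (k1, a1), (k2, a2) = job
    common = Counter(p for p, _ in a1) & Counter(p for p, _ in a2)
    return k1, k2, dict(common)

if __name__ == "__main__":
    jobs = list(combinations(sorted(CLASSES.items()), 2))  # K(K-1)/2 pairs
    with Pool() as pool:                                   # worker processes
        for k1, k2, common in pool.map(compare_pair, jobs):
            print(f"A_{k1} vs A_{k2}: shared predicate counts {common}")
```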


4 Distributed Use of Multi-level Description

Fig. 1. Scheme of distributed use of multi-level description.

When using the constructed level description, for each l (l = 1, ..., L) logical sequents of the form S^{l−1}(ω) ⇒ ∃x^l_{j≠} P^l_j(x^l_j) are checked simultaneously for different j (j = 1, ..., n_l), and all l-level objects τ^l_{i,j} whose existence is stated on the right-hand side of the sequent are found, so that the atomic formulas p^l_j(ω^l_j) become true. In this way, the l-th level description S^l(ω) of the object is obtained. A scheme of the distributed use of a multi-level description is presented in Fig. 1.

5 Conclusion

While the presence of an already constructed level description allows us to significantly reduce the time complexity of recognition problems [3], the construction of such a description has a high exponential time complexity. The proposed distributed construction of a level description makes it possible to decrease the time complexity of constructing such a description by a factor of K(K−1)/2, where K is the number of elementary conjunctions in the class descriptions. Further research may be devoted to the program implementation of the proposed algorithm.


References
1. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, New York (1979)
2. Kosovskaya, T.M.: Proofs of the number of steps bounds for solving some pattern recognition problems with logical description. Vestn. St. Petersburg Univ.: Math. 4, 82–90 (2007). (in Russian)
3. Kosovskaya, T.M.: Level descriptions of classes for decreasing of step number of solving of a pattern recognition problem described by predicate calculus formulas. Vestn. St. Petersburg Univ.: Seria 10(1), 64–72 (2008). (in Russian)
4. Kosovskaya, T.M.: Some artificial intelligence problems permitting formalization by means of predicate calculus language and upper bounds of their solution steps. SPIIRAS Proc. 14, 58–75 (2010). (in Russian)
5. Kosovskaya, T.M.: An approach to the construction of a level description of classes by means of a predicate calculus language. SPIIRAS Proc. 3(34), 204–217 (2010). (in Russian)
6. Kosovskaya, T.: Predicate calculus as a tool for AI problems solution: algorithms and their complexity. In: Wongchoosuk, C. (ed.) Intelligent System, Chapter 3, pp. 1–20. Open access peer-reviewed edited volume, Kasetsart University (2018). https://www.intechopen.com/books/intelligent-system/predicate-calculus-as-a-tool-for-ai-problems-solution-algorithms-and-their-complexity. Accessed 29 Aug 2018
7. Kosovskaya, T.M., Kosovskii, N.N.: Polynomial equivalence of the problems "predicate formulas isomorphism and graph isomorphism". Vestn. St. Petersburg Univ.: Math. (2019, to be printed). (in Russian)
8. Kosovskaya, T.M., Petrov, D.A.: Extraction of a maximal common sub-formula of predicate formulas for the solving of some artificial intelligence problems. Vestnik of Saint-Petersburg University, Series 10: Applied Mathematics, Computer Science, Control Processes, issue 3, pp. 250–263 (2017). (in Russian)
9. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Prentice Hall Press, Upper Saddle River (2009)

On Approaches for Solving Nonlinear Optimal Control Problems

Alina V. Boiko and Nikolay V. Smirnov(B)

St. Petersburg State University, 7/9, Universitetskaya nab., St. Petersburg 199034, Russia
[email protected], [email protected]

Abstract. The paper discusses various approaches to solving nonlinear optimal control problems. Of all such approaches, we chose the two most characteristic. The first one uses sufficient conditions of optimality in the form of Hamilton–Jacobi–Bellman equations and the corresponding numerical method. The second is based on the reduction of the optimal control problem to an interval linear programming problem and finding a solution using Gabasov's adaptive method. The main goal is to compare the capabilities of these methods within a specific optimal control problem. As an application, we consider the problem of constructing the optimal control in a nonlinear model of macroeconomic growth with nonlinear dynamical constraints. A comparative analysis of these two approaches and the corresponding numerical simulation are presented.

Keywords: Optimal control · Gabasov's adaptive method · Dynamic programming method

1 Introduction

Optimal control problems are the most widely developed and topical class of problems in modern mathematical control theory. There are many approaches and methods for solving linear and nonlinear optimal control problems. Some of them are fundamental: first of all, Bellman's optimality principle [3] and Pontryagin's maximum principle [8]. Aimed at optimizing functionals that characterize various parameters of mathematical models, optimal control theory makes it possible, with the widespread use of software tools, to find numerically the most favorable control modes of objects. With the advent of software development tools, it became possible to create regulators that find the optimal control taking into account the actual operating conditions of the controlled objects. Currently, methods of synthesizing optimal controls in real time are of particular importance. One of these methods is Gabasov's adaptive method [1,2]. This method has a wide range of applications, from robotics to the optimization and implementation of macroeconomic trends [4,5,9,10].


The novelty and distinguishing features of our research are determined by the following considerations. The effectiveness of the methods differs. Different efforts are needed to collect and analyze the initial data and then to identify model parameters. In specific applications, the speed of the algorithms is important. There are many papers on solving optimal control problems; in this paper we compare the application of a modern approach to finding the optimal control with the widely known classical approach. This paper describes in detail two approaches to solving nonlinear optimal control problems. The main goal is to compare their capabilities within a specific problem. The first one is to apply Bellman's optimality principle, and the second consists in reducing the optimal control problem to an interval linear programming problem and finding a solution using Gabasov's adaptive method. As an application for testing, we consider the optimal control problem in a nonlinear macroeconomic growth model with nonlinear dynamical constraints. To implement the numerical experiments, we have developed several programs in MATLAB.

2 Statement of the Problem and Methods of Optimal Control

Consider the optimal control problem in the general case:

ẋ = f(x, u, t),  x(t_0) = x_0,  x(t_1) = x_1,  u(t) ∈ U,   (1)

J = ∫_{t_0}^{t_1} G(x, u, t) dt + I(x_1, t_1) → max_u,   (2)

where f(x, u, t) is a continuously differentiable vector function; x ∈ R^n is the state vector; u ∈ R^r is the control vector, U ⊂ R^r; x_0, x_1 are given constant vectors. Functional (2) is specified by given functions G(x, u, t), I(x, t). If the function f(x, u, t) is nonlinear, then problem (1), (2) is called a nonlinear optimal control problem (NOC problem). In most cases, NOC problems can be reduced to linear optimal control problems (LOC problems) and solved by one of the methods for LOC problems. We will use and compare two approaches to solving optimal control problems. The first one is the dynamic programming method based on Bellman's optimality principle, which determines the basic recurrence relation for the problem under consideration [3,6]. The main idea of the second approach is the reduction of a LOC problem to an interval linear programming problem (ILPP) and its solution using Gabasov's adaptive method [1,2].

Dynamic Programming Method. Consider the general scheme of the dynamic programming method. Using Bellman's optimality principle, we obtain the main recurrence relation (see [3])

J*(x, t) = max_{u(t)} [G(x, u, t)Δt + J*(x + Δx, t + Δt)],

where J* is the functional of optimal behavior. Time t is considered as a discrete value t_0, t_0 + h, t_0 + 2h, ..., t_1, h > 0. Let x(t) be the system state at a moment of


time t; then x(t + h) = f(x(t), u(t), t). In this case, the basic recurrence relation is as follows:

J*(x, t) = max_{u(t)} [G(x(t), u(t), t) + J*(f(x(t), u(t), t), t + h)],   (3)

with the corresponding boundary condition J*(x_1, t_1) = I(x_1, t_1).
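As an illustration of recurrence (3), the following sketch (ours, not the authors' MATLAB programs) performs the backward sweep for a scalar state discretized on an increasing grid. The callables f, G, I and both grids are placeholders to be supplied by the reader, and linear interpolation of J* between grid points is our own discretization choice.

```python
import numpy as np

def dp_solve(f, G, I, x_grid, u_grid, t0, t1, h):
    """Backward sweep of recurrence (3) for a scalar state; x_grid must be
    increasing so that np.interp can approximate J* between grid points."""
    times = np.arange(t0, t1 + h / 2, h)            # t0, t0+h, ..., t1
    n = len(times)
    J = np.empty((n, len(x_grid)))
    policy = np.empty((n - 1, len(x_grid)))
    J[-1] = [I(x, times[-1]) for x in x_grid]       # boundary condition
    for k in range(n - 2, -1, -1):                  # backward in time
        t = times[k]
        for i, x in enumerate(x_grid):
            # stage gain plus interpolated optimal value at t+h, per (3)
            vals = [G(x, u, t) + np.interp(f(x, u, t), x_grid, J[k + 1])
                    for u in u_grid]
            j = int(np.argmax(vals))
            J[k, i] = vals[j]
            policy[k, i] = u_grid[j]
    return times, J, policy
```

With f, G and I taken from a concrete model, dp_solve returns the value table J and a greedy grid policy.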

Adaptive Method. Consider a linear system with initial conditions

ẋ = A(t)x + b(t)u,  x(t_0) = x_0,  t ∈ [t_0, t_1],  t_0 < t_1 < +∞,   (4)

where x ∈ R^n, u ∈ R^1; A(t) and b(t) are a piecewise continuous (n × n)-matrix and an n-dimensional vector function, respectively. As admissible controls u(t) we use piecewise constant functions with a quantization period h = (t_1 − t_0)/N, where N is a natural number, i.e.

u(t) = u(t_0 + (k − 1)h) = u_k,  t ∈ [t_0 + (k − 1)h, t_0 + kh),  k = 1, ..., N.   (5)

LOC problem: on the solutions of system (4), in the class of piecewise constant controls (5), it is necessary to provide the maximum of the following function:

c^T x(t_1) → max_u,  g_1 ≤ Hx(t_1) ≤ g_2,  g_1, g_2 ∈ R^m,  m = rank H < n.   (6)

In accordance with the algorithm from paper [2], we reduce LOC problem (4)–(6) to an interval linear programming problem. Then we find the solution of the ILPP using Gabasov's adaptive method, which has several advantages over the simplex method. Since the adaptive method works with arbitrary points of the set of plans, and not just with vertices, we can choose any initial values for the problem. Also, the adaptive method does not require increasing the dimension of the problem. More details about the algorithms of the adaptive method can be found in monograph [1] and articles [2,10].

3 Numerical Implementation

Economic Growth Model. To compare the two approaches to solving optimal control problems, consider an economic growth model [1,7]. The first assumption of this model is related to the macroeconomic balance U = C + I, where U, C, I are the total output product, consumption and investment, respectively. All these values are usually measured in national currency. The next important element of the model is a production function. We will use the homogeneous Cobb–Douglas function U(t) = A L(t)^α K(t)^{1−α}, where K(t) is the capital amount at the moment of time t; L(t) is the labor input (the total number of person-hours worked per year); A is a technological coefficient; α ∈ (0, 1) is an elasticity coefficient. The second assumption is that investment provides capital growth and compensates its depreciation. Thus, taking into account the balance condition

186

A. V. Boiko and N. V. Smirnov

and the production function, we obtain the differential equation K̇(t) = I − μK(t), where μ is the norm of capital depreciation. The next assumption is that labor resources grow at a constant rate: L̇ = nL, L(t_0) = L_0. So, finally, we have

K̇(t) = A L_0^α e^{αnt} K(t)^{1−α} − μK(t) − C(t),  K(t_0) = K_0,  K(t_1) = K_1,
0 ≤ C(t) ≤ A L_0^α e^{αnt} K(t)^{1−α},  t ∈ [t_0, t_1].   (7)

The feature of this model is the nonlinear dynamic constraint on the control; for more details on how to reduce it to the classical type of ILPP constraints, see [4]. The optimal control problem is to maximize the final capital value and the total consumption:

J = e^{−δt_1} K(t_1) + ∫_{t_0}^{t_1} e^{−δt} C(t) dt → max,   (8)

where δ is a discount factor, and C(t) is considered in (7), (8) as the control.

The First Approach. Let us find the optimal control using the dynamic programming method. For (7), (8) the main recurrence relation (3) has the following form:

J*(K, t) = max_{C(t)} [e^{−δt} C(t) + J*(A L_0^α e^{αnt} K(t)^{1−α} − μK(t) − C(t), t + 1)],   (9)

with the corresponding boundary condition J*(K_1, t_1) = e^{−δt_1} K_1. Recurrence relation (9) allows us to find the optimal control C*(t) and the corresponding functional value. Let the number of steps N = (t_1 − t_0)/h be just two. Then, taking into account (9) and the control constraints in (7), we obtain the optimal control C(t_1 − h), C(t_0). It should be noted that if N > 2, then the problem of optimal control construction becomes very difficult.

The Second Approach. To apply Gabasov's adaptive method, we linearize system (7) and then reduce it to an ILPP. In (7), there is a nonlinear term K(t)^{1−α}, which can be linearized by a straight line segment pK(t) + q, t ∈ [t_0, t_1]. After linearization, system (7) takes the following form:

K̇(t) = (pξ(t) − μ)K(t) + qξ(t) − C(t),  K(t_0) = K_0,  K(t_1) = K_1,
0 ≤ C(t) ≤ ξ(t)(pK(t) + q),  t ∈ [t_0, t_1],  ξ(t) = A L_0^α e^{αnt}.   (10)

For system (10) we find a fundamental matrix Y(t) = e^{∫_{t_0}^{t} (pξ(τ) − μ) dτ}, and then all the coefficients of the ILPP

D^T C → max_C,  PC = B,  0 ≤ C ≤ R,   (11)

where C = C_{N×1} = (C_1, ..., C_N)^T is the unknown control vector (consumption). This approach allows us to construct numerically the optimal control C*(t).
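For illustration, the following sketch (ours) solves a toy instance with the shape of ILPP (11), using SciPy's LP solver as a stand-in for Gabasov's adaptive method; all numbers are placeholders rather than coefficients identified from the model, so only the problem shape is shown.

```python
import numpy as np
from scipy.optimize import linprog  # stand-in for the adaptive method

# Toy instance of (11): maximize D^T C subject to P C = B, 0 <= C <= R.
N = 5
D = np.exp(-0.09 * np.linspace(0.0, 5.0, N))   # discounted weights, cf. (8)
P = np.ones((1, N))                            # encodes K(t1) = K1 (dummy row)
B = np.array([2.0])
R = np.full(N, 3.0)

res = linprog(c=-D,                            # linprog minimizes, so negate D
              A_eq=P, b_eq=B,
              bounds=[(0.0, r) for r in R],
              method="highs")
C_opt = res.x                                  # piecewise constant C_1..C_N
print("optimal consumption per interval:", C_opt)
```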


Results. For the numerical implementation of these approaches, we developed several programs in the MATLAB environment. The initial parameters of the model are: N = 2, A = 1, L_0 = 2, K_0 = 5, K_1 = 1.1 K_0, n = 0.1, t_0 = 0, t_1 = 5, μ = 0.05, δ = 0.09. We construct the optimal control for different values of the production function parameter: α = 0.3, 0.8. These parameters were used in relation (9) and in ILPP (11). The optimal control was then constructed by the two methods (see Fig. 1). The dotted line is the control found by the adaptive method; the solid line is the control found using the dynamic programming method. Finally, we constructed the graphs of the objective functional corresponding to the obtained optimal controls. The dotted line is the objective functional for the adaptive method; the solid line is the objective functional for the dynamic programming method.

Fig. 1. (a) Optimal control for α = 0.8, (b) Objective function for α = 0.8, (c) Optimal control for α = 0.3, (d) Objective function for α = 0.3

4 Conclusion

We considered a macroeconomic growth model as an example and conducted a comparative analysis of two methods for finding the optimal control in nonlinear problems. The graphs (Fig. 1) show that the constructed controls are not equal and that the objective function for the adaptive method exceeds the value of the objective function for the dynamic programming method at the end of the interval. When applying the second approach, the problem is reduced to a simple ILPP, and an increase in the number of considered time intervals N does not complicate the system. In the dynamic programming method, each new step increases the complexity of the calculations. At the same time, the method


of dynamic programming does not require linearization of the problem and is applied to the nonlinear system directly, which reduces errors. Both methods find the optimal control, but in the case of a large number of time intervals, the use of dynamic programming generates complex calculations. A correct linearization of the system allows using the adaptive method algorithm, which is easier to implement. For some problems, the final state of the system may not be specified. For example, in the economic growth model the capital may be greater than K_1; in this case the control constraints become inequalities and the use of the dynamic programming method is difficult. Even if we use a top-down approach, we do not know the end point of the system. There are no such problems in the reduction to an ILPP and the implementation of the adaptive method algorithm. The considered example shows the relevance of this research for building systems of intelligent support for management decisions. In future research, we will generalize the solution of linear and nonlinear problems with dynamic nonlinear constraints on the control.

References
1. Alsevich, V.V., Gabasov, R., Glushenkov, V.S.: Optimization of Linear Economic Models. Publishing House of Belarusian State University, Minsk (2000). (in Russian)
2. Balashevich, N.V., Gabasov, R., Kirillova, F.M.: Numerical methods of program and positional optimization of linear control systems. J. Comput. Math. Math. Phys. 40(6), 838–859 (2000). (in Russian)
3. Bellman, R., Kalaba, R.: Dynamic Programming and Modern Control Theory. Academic Press, New York (1965)
4. Boiko, A.V., Smirnov, N.V.: Approach to optimal control in the economic growth model with a nonlinear production function. In: ACM International Conference Proceeding Series, pp. 85–89 (2018)
5. Girdyuk, D.V., Smirnov, N.V., Smirnova, T.E.: Optimal control of the profit tax rate based on the nonlinear dynamic input-output model. In: ACM International Conference Proceeding Series, pp. 80–84 (2018)
6. Intriligator, M.: Mathematical Optimization and Economic Theory. Prentice Hall, New York (1971)
7. von Neumann, J., Morgenstern, O.: Theory of Games and Economic Behavior. Princeton University Press, Princeton (1953)
8. Pontryagin, L.S., Boltyanskii, V.G., Gamkrelidze, R.V., Mishchenko, E.F.: The Mathematical Theory of Optimal Processes. Wiley, New York (1962)
9. Popkov, A.S., Baranov, O.V., Smirnov, N.V.: Application of adaptive method of linear programming for technical objects control. In: International Conference on Computer Technologies in Physical and Engineering Applications, pp. 141–142. IEEE Inc. (2014)
10. Popkov, A.S., Smirnov, N.V., Smirnova, T.E.: On modification of the positional optimization method for a class of nonlinear systems. In: ACM International Conference Proceeding Series, pp. 46–51 (2018)

Modeling Operational Processes for Intelligent Distributed Computing

Hierarchical Simulation of Onboard Networks

Valentin Olenev(B), Irina Lavrovskaya, Ilya Korobkov, Nikolay Sinyov, and Yuriy Sheynin

Saint-Petersburg State University of Aerospace Instrumentation, 67, Bolshaya Morskaia str., Saint Petersburg 190000, Russia
{valentin.olenev,irina.lavrovskaya,ilya.korobkov,nikolay.sinyov}@guap.ru, [email protected]

Abstract. The paper presents a solution for hierarchical simulation of onboard networks, which allows performing simulation at different levels of detail. This solution was integrated into a new CAD system, SpaceWire Automated Network Design and Simulation (SANDS). The paper describes the SANDS hierarchical simulation component, which is based on SystemC, and the methods underlying it. The overview is followed by a discussion of methods for parallelizing SystemC simulation. Finally, the authors present the results of simulation parallelization and performance testing on a supercomputer. Keywords: Simulation · SANDS · Onboard network · SpaceWire

1 Introduction
The need to simulate destabilizing factors and failures at different levels of spacecraft onboard networks leads to an approach of layered modeling, in order to have access to different levels of the system. However, when modeling the operation of large networks level by level (tens, hundreds, thousands of nodes), modeling several seconds of network operation can take from several hours to several days or weeks of real time, since the degree of detail is quite high. Consequently, there is a demand for a flexible simulation approach which allows simulating onboard network operation at different levels of detail. In this paper, we describe our solution for hierarchical simulation of onboard networks: the hierarchical simulator included in the SANDS CAD system. This paper continues the research presented in [1].

2 Related Studies of Onboard Network Simulators
Network simulators allow researchers to test scenarios that are difficult or expensive to imitate in the real world. It is particularly useful to test new communication protocols or to change existing protocols in a controlled and reproducible environment. Simulators can be used to design different network topologies using various types of nodes. Let us consider some tools that are specially developed for SpaceWire network modeling. Most of them are based on the OPNET [2] simulation framework. For these purposes, OPNET was adapted for SpaceWire and extended with a list of specific modules and network elements. This way was chosen by the Thales Alenia company, which


implemented MOST (Modeling of SpaceWire Traffic) [3,4]. Initially MOST was based on the OPNET toolkit dedicated to network modeling. Currently, MOST is available for the NS3 simulator [5]. A similar solution has been developed by Sandia National Laboratories (SNL) [6], but it offers fewer capabilities than MOST because it does not have an option to insert errors into the transmitted data, which limits testing and verification. The SNL team also used extension features within OPNET Modeler to create a set of general-purpose modules representing different network elements, or basic building blocks, for SpaceWire network simulation. The second capability of the tool is an in-depth analysis of the accurate distribution of system time across the SpaceWire network. However, there is a tool that is not based on OPNET: VisualSim SpaceWire modeling, developed by the Mirabilis Design company [7]. VisualSim is intended for end-to-end system-level design. VisualSim gives the ability to test real hardware SpaceWire devices, but it is not applicable for prototyping real onboard networks at early stages of a project. Consequently, there are only three tools that allow building SpaceWire networks and simulating their operation. Two of them are based on OPNET and use its capabilities.

3 Hierarchical Simulation of SpaceWire Onboard Networks
SANDS stands for SpaceWire Automated Network Design and Simulation. It was developed to support the full SpaceWire network design and simulation flow, which begins with automated generation of the network topology and finishes with obtaining simulation results and statistics. SANDS consists of different components; for more details see [1]. SpaceWire is a computer network designed to connect together high data-rate sensors, processing units, memory devices and telemetry/telecommand sub-systems onboard spacecraft. It is compact and efficient in implementation [8]. This paper is devoted to Component #4 of SANDS: the hierarchical simulator of SpaceWire networks. Hierarchical simulation is a simulation of network operation at different levels of detail (see Fig. 1). Bit level is the simulation of the full hierarchy of protocol layers: the full SpaceWire stack (from bit encoding at the Physical layer up to the Network layer) and the Application, with or without a Transport protocol: RMAP [9] or STP-ISS [10]. Packet level is the simulation of a constrained hierarchy, the upper layers only: the SpaceWire Network layer and the Application, with or without a Transport protocol: RMAP or STP-ISS. For bit-level simulation SANDS uses the SystemC clock-based simulation option. This is done in order to achieve a more accurate correspondence of the model to a hardware device. Node operation in a model closely corresponds to clock signals, which impact data transmission latencies in links, ports and switches. However, bit-level simulation leads to a significant difference between the real simulation time of the network and the model time, since the model of each network component describes in detail all the internal mechanisms of all the SpaceWire layers: Network, Channel, Physical (bit encoding only).


Fig. 1. Hierarchical simulation in SANDS: bit level and packet level simulation

Packet-level simulation is implemented using the event-based mode in SystemC. In this simulation mode, the SpaceWire Channel and Physical level operations are not considered, which greatly simplifies the model operation logic. Such modeling significantly reduces the time for simulating the SpaceWire network, which makes it possible to simulate long periods of onboard network operation. Finally, the user is provided with a detailed statistical log tracking all events of interest in the network, in HTML format.
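The gap between the two modes can be illustrated with a toy sketch (ours): a clock-based loop must visit every cycle, while an event-driven loop jumps directly between scheduled events, which is why the packet-level mode can simulate long periods so much faster. The cycle length and event times below are arbitrary.

```python
import heapq

CYCLE_NS = 10
EVENTS = [(1_000, "packet sent"), (250_000, "packet received")]

# bit-level style: advance one clock cycle at a time
t, cycles = 0, 0
while t <= EVENTS[-1][0]:
    cycles += 1                       # every cycle is simulated
    t += CYCLE_NS

# packet-level style: process only the scheduled events
queue = list(EVENTS)
heapq.heapify(queue)
steps = 0
while queue:
    t, what = heapq.heappop(queue)    # jump straight to the next event
    steps += 1

print(f"clocked loop: {cycles} iterations; event loop: {steps} iterations")
```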

4 Parallelization of the Simulation Process
The authors of [11–13] presented solutions that require particular modifications inside the SystemC kernel to achieve parallelism. The designer annotates all potentially shared resources in the code, coupled with a resource consistency monitor. Data consistency can require an effort from the designer, with specific annotations in the models. The implementation is mainly limited to a single host platform. Another solution devoted to parallel MPSoC simulation was described in [14]. Processor models are dispatched on a cloud architecture. The synchronization between processor cores is done asynchronously. The data is exchanged through a message-passing cluster mechanism. However, it does not ensure the partial order of simultaneous events. Time decoupling reduces the data exchange rate through the SystemC kernel, which improves parallelization efficiency. The authors of [15] introduce a new synchronization mechanism to bound the temporal error produced by asynchronous communication. Explicit synchronizations at regular intervals are necessary to reduce the error. Other approaches that try to remove the need for time synchronization are demonstrated in [16,17]. Unfortunately, the simulation models have to be implemented on TLM-DT. Some SystemC semantics are broken in TLM-DT, so it cannot be used with existing standard-compliant SystemC models without modifications. To speed up the hierarchical simulator, we chose an approach close to the main ideas of [18]: a large network can be divided into several areas (subnetworks), with no loss in simulation quality, that can be executed in different SystemC kernels. Each area is simulated separately and independently from the others by a separate program instance of the Component #4 software. The available processor kernels are shared equally among all software instances (see Fig. 2).


Fig. 2. Communication flows and areas for parallelism
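A hypothetical launcher for this scheme is sketched below; the executable name and flags are placeholders, not the real Component #4 command line. Each instance is a single-threaded SystemC program simulating one area, and the operating system spreads the instances over the available processor kernels.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# One simulator instance per network area, as in Fig. 2. "sands_sim" and
# its flags are illustrative placeholders.
AREA_CONFIGS = ["area0.cfg", "area1.cfg", "area2.cfg", "area3.cfg"]

def run_area(cfg):
    # launch one independent simulator process for this area
    result = subprocess.run(["sands_sim", "--area", cfg], capture_output=True)
    return cfg, result.returncode

with ThreadPoolExecutor(max_workers=len(AREA_CONFIGS)) as pool:
    outcomes = list(pool.map(run_area, AREA_CONFIGS))

# total simulation time is decided by the slowest (worst) instance
print(outcomes)
```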

5 Simulation Using a Supercomputer
Simulation performance can vary depending on the characteristics of the computer it is running on. For the purpose of performance investigation, we ran the simulation component software on one computational node of a hybrid supercomputing center [19]. Figure 2 shows the network topology and communication flows that were used for the simulations on the computational node of the supercomputer. On the basis of the conducted investigation we built diagrams, which show the worst instance operation time in the bit-level and packet-level modes, respectively. The diagrams are shown in Fig. 3. We measured the worst time because a fast instance has to wait until a slower instance completes the simulation of its network area. When we do not use any parallelism, the simulation is performed on only one of the 28 processor kernels, since the simulation core is a single-threaded SystemC program, and the operating system independently determines on which kernel the simulator will be launched.

Fig. 3. Operation time in bit and packet level modes

By dividing the network topology into areas and running the simulation of each area on separate processor kernels, we can achieve program acceleration (see Fig. 3). The acceleration is 5.4 times for 8 simulator instances in the bit-level mode. The packet-level mode provides an acceleration of only 2.3 times. Our approach has no timing error, in contrast to the time decoupling approach with its 1% error [20]. The bit-level mode provides detailed simulation with lots of network operation events. The internal pipeline of each processor kernel is efficiently loaded: previously unprocessed instructions of the running simulator instance arrive in time for processing from the processor's internal caches and RAM. As a result, the maximum acceleration of the simulator operation is achieved in the bit-level mode.


6 Conclusion
In the current paper we presented a solution for hierarchical simulation of onboard networks. These methods were implemented in the simulation component of a new CAD system for SpaceWire onboard network design and simulation, SANDS. The hierarchical simulation software provides two modes of simulation with different levels of detail: bit level and packet level. It is now in trial use. In the scope of our work we conducted an investigation of the parallelization of the SANDS simulation component and a performance analysis on the hybrid supercomputer. The paper presents the results of this investigation. We divided the network topology into several areas and parallelized their execution. As a result, we achieved a significant acceleration of simulation execution.

References
1. Olenev, V., Lavrovskaya, I., Korobkov, I., Sheynin, Yu.: Design and simulation of onboard SpaceWire networks. In: Proceedings of the 24th Conference of FRUCT Association, MTUCI, Moscow, Russia, pp. 291–299 (2019)
2. Jianru, H., Xiaomin, C., Huixian, S.: An OPNET model of SpaceWire and validation. In: Proceedings of the 2012 International Conference on Electronics, Communications and Control, pp. 792–795. IEEE Computer Society, October 2012
3. Dellandrea, B., Gouin, B., Parkes, S., Jameux, D.: MOST: modeling of SpaceWire & SpaceFiber traffic-applications and operations: on-board segment. In: Proceedings of the DASIA 2014 Conference, Warsaw (2014)
4. Thales Alenia Space: Modeling Of SpaceWire Traffic. Project Executive Summary & Final Report, 25 p. (2011)
5. NS-3 Manual: NS-3 Network Simulator, 165 p. (2017)
6. van Leeuwen, B., Eldridge, J., Leemaster, J.: SpaceWire model development technology for satellite architecture. Sandia National Laboratories, Sandia Report, 30 p. (2011)
7. Mirabilis Design: Mirabilis VisualSim data sheet, 4 p. (2003)
8. SpaceWire Standard. ECSS – Space Engineering. "SpaceWire – Links, Nodes, Routers and Networks". ECSS-E-ST-50-12C, July 2008
9. ESA: Standard ECSS-E-ST-50-52C, SpaceWire – Remote memory access protocol. Publications Division ESTEC, Noordwijk, 5 February 2010
10. Sheynin, Y., Lavrovskaya, I., Olenev, V., Korobkov, I., Dymov, D., Kochura, S.: STP-ISS transport protocol for spacecraft on-board networks. In: 2014 International SpaceWire Conference (SpaceWire), pp. 1–6. IEEE, September 2014
11. Ezudheen, P., Chandran, P., Chandra, J., Simon, B.P., Ravi, D.: Parallelizing SystemC kernel for fast hardware simulation on SMP machines. In: Proceedings of the ACM/IEEE/SCS Workshop on PADS, pp. 80–87 (2009)
12. TI: OMAP 4470 (2011). https://is.gd/VQ7ncC. Accessed 17 June 2019
13. Ventroux, N., Sassolas, T.: A new parallel SystemC kernel leveraging manycore architectures. In: Proceedings of the DATE, pp. 487–492 (2016)
14. Roth, C., Reder, S., Erdogan, G., Sander, O., Almeida, G.M., Bucher, H., Becker, J.: Asynchronous parallel MPSoC simulation on the Single-Chip Cloud Computer. In: Proceedings of the International Symposium on System-on-Chip, pp. 1–8 (2012)
15. Peeters, J., Ventroux, N., Sassolas, T., Lacassagne, L.: A SystemC TLM framework for distributed simulation of complex systems with unpredictable communication. In: Proceedings of DASIP, pp. 1–8 (2011)


16. Mello, A., Maia, I., Greiner, A., Pecheux, F.: Parallel simulation of SystemC TLM 2.0 compliant MPSoC on SMP workstations. In: Proceedings of DATE, pp. 606–609, March 2010
17. Pessoa, I.M., Mello, A., Greiner, A., Pêcheux, F.: Parallel TLM simulation of MPSoC on SMP workstations: influence of communication locality. In: Proceedings of ICM, pp. 359–362 (2010)
18. Ziyu, H., Lei, Q., Hongliang, L., Xianghui, X., Kun, Z.: A parallel SystemC environment: ArchSC. In: Proceedings of ICPADS, pp. 617–623 (2009)
19. Supercomputing center "Polytechnic". http://scc.spbstu.ru/index.php/about-scc/scc-is. Accessed 17 June 2019
20. Weinstock, J.H.: Parallel SystemC Simulation. https://ice.rwth-aachen.de/research/toolsprojects/parallel-systemc-simulation/. Accessed 17 June 2019

Strategies Comparison in Link Building Problem

Vincenza Carchiolo1, Marco Grassia2, Alessandro Longheu2, Michele Malgeri2, and Giuseppe Mangioni2(B)

1 Dip. di Matematica e Informatica, Università degli Studi di Catania, Catania, Italy
2 Dip. Ingegneria Elettrica Elettronica Informatica, Università degli Studi di Catania, Catania, Italy
[email protected]

Abstract. Choosing an effective yet efficient solution to the link building problem means selecting which nodes in a network should point to a newcomer in order to increase its rank while limiting the cost of this operation (usually related to the number of such in-links). In this paper we consider different heuristics to address the question and we apply them both to Scale-Free (SF) and Erdős–Rényi (ER) networks, showing that the best tradeoff is achieved with the long-distance link approach, i.e. a newcomer node gathering in-links from the farthest nodes succeeds in improving its position (rank) in the network at a limited cost. Keywords: PageRank · Best attachment · Link building · NP-hard · Long distance link

1 Introduction
The need to rank a set of elements, especially in the context of complex networks, encompasses several real-world applications, from recommendation networks [1,2] to data envelopment analysis [3], scientific journal prestige [4], webpage relevance for SEO [5–8] and many others. A shared principle is that rank is proportional to the relevance of a node; hence increasing the rank is also a widely recognized goal to provide a so-called weak order among the nodes of a network, whatever algorithm is adopted for rank assessment (e.g. PageRank [9], HITS [10], SALSA [11] or others). To examine the process of ranking enhancement from its very beginning, we can refer to a node that wants to join a network and to this purpose seeks to establish relationships with other nodes. This phenomenon has been studied in the recent past [12–14] and naturally leads to the so-called link building problem, which consists in searching for the set of backlinks (directed from nodes in the network to the newcomer) that maximizes the newcomer's rank. This problem has been proved to be NP-hard in [15], where both upper and lower bounds are also ascertained in specific cases. Other studies about the link building problem exist, such as [16], where asymptotic analysis is used to establish how the rank depends on creating new links, or [17], which generalizes the approach to multi-page websites; in [18] a model for link building


based on constrained Markov decision processes is discussed, whereas in [19] the authors consider the impact of out-links on a node's PageRank. To tackle the complexity, several heuristics can be considered; they should be compared with respect to the two competing factors, i.e. the gathering of backlinks (which implies a computational effort to bear) and the ranking improvement to be gained. In this work, the PageRank metric [9] is adopted as a consolidated and extensively used ranking mechanism. In Sect. 2 we outline a set of heuristics, classified as agnostic or aware, respectively without or with some assumption about the network and about the PageRank. Each strategy is described together with a complexity assessment. In Sect. 3 they are compared on Scale-Free and Erdős–Rényi networks; we illustrate our simulations and discuss the resulting ranking improvements. Our concluding remarks and future work are finally presented in Sect. 4.

2 Heuristics
As introduced in the previous section, the best attachment (or link building) problem consists in finding an in-attachment strategy that allows a newcomer node x to reach some desired rank position with as few in-links as possible, given that building an in-link comes with some cost; alternatively, we aim to reach the best possible rank position with the same number of in-links. We call S the subset of the nodes x should get an in-link from in order to achieve its goal. In this section, we overview some of the heuristics that have been proposed to tackle this NP-hard problem and discuss their computational complexity.

2.1 Problem Agnostic Strategies

This class of strategies makes no assumption on the problem itself, on PageRank, or on the properties of the network.
Random. The simplest way [20] to choose the subset S of nodes is picking them randomly until x reaches its goal. Of course, any node can only be picked once, since there is no gain in building parallel links (this is common to all the strategies described here and will be omitted in the rest of the paper). While this strategy may sound very naive, it does not require any knowledge of the network, excluding the nodes themselves, and its computational cost only depends on the number s of in-links needed, i.e. O(Random) = O(s).
Degree Based. Another simple way [20] to pick the subset S of nodes to be pointed by is choosing them in descending degree order. Of course, nodes can be ranked according to their in-degree, out-degree or the sum of the two. The complexity of this strategy is O(Degree) = O(s) ∗ O(select), where s is the number of steps (or in-links) and O(select) is the complexity of finding the node with maximum degree, which depends on the data structure used to store the nodes.
Long-Distance Based. A completely different approach was defined in [21]; it exploits the finding that being pointed to by nodes which are far away in the network provides higher PageRank gains compared to closer nodes, as shown in [22]. The set S is built


by choosing a node randomly in the network (otherwise x would be infinitely distant from all nodes, at least during the initial attachment phase) and then by choosing the node v in the network such that d(v, x) is maximal, recomputing the distances after each attachment. The computational complexity is O(Long distance) = O(s) ∗ O(Distance), where O(Distance) is the computational complexity of the single-destination shortest path solver used. For instance, in the case of sparse directed networks, Dijkstra's algorithm can be used, and it gives O(Distance) = O(|N| ∗ log|N| + |E|) using a Fibonacci heap [23], where E is the set of the edges.

2.2 Problem Aware Strategies
While the approaches in the previous section use no information about the problem itself or about PageRank to build the set S, in this section we overview some approaches that are less naive and take into account the PageRank values of the other nodes, which may be computed at different times (e.g., statically at the beginning, or during each selection step). It should be noted that the computational complexity of all the following strategies depends on PageRank's complexity O(PR), which is O(|N|^3) using the exact Gauss method, where N is the set of the nodes. However, it can be reduced to O(PR) = m ∗ |E| by using iterative approximations, where m is the number of iterations needed to get an acceptable approximation, as shown in [24].
Anticipated Value. The key idea of the Anticipated Value heuristic is to include in the set S the nodes in descending order of PageRank value, which is calculated once at the beginning [20]. The main issue with this strategy is that the more in-links are built, the more the network topology changes, making the initial PageRank values less and less accurate. The computational complexity is O(Anticipated value) = max{O(sort), O(PR)}, where O(sort) is the complexity of sorting the nodes according to their PageRank value.
Anticipated Out-Degree. This strategy is similar to the previous one [20], except that nodes are selected in descending order of the ratio between their PageRank value and their out-degree, instead of the PageRank value alone. The intuition is that nodes with a higher ratio may "transfer" the highest PageRank value to their out-neighborhood, and should be selected first. Of course, the computational complexity of this strategy is the same as that of the Anticipated Value strategy.
Current Rank. This strategy is a dynamic version of the Anticipated Value [20]: it picks the nodes with the highest PageRank value, but recomputes the values before each pick iteration, hence the name. The complexity of this algorithm is O(Current) = O(s) ∗ max{O(select), O(PR)} and, since O(PR) is higher than O(select) in the general case, it is higher than that of all previous algorithms. However, this algorithm seems to capture the network dynamics due to the creation of new links better than the previous ones.
Future Rank. During each step, this strategy simulates all the possible choices and selects the connection giving x the best rank [20]. In other words, it always chooses the best short-term option, and therefore it should be the most effective. The computational cost is O(Future) = O(s) ∗ max{O(select), O(PR) ∗ O(|N|)}, which can be rewritten as O(Future) = O(s) ∗ O(PR) ∗ O(|N|); this cost is, of course, the main drawback of this strategy, being much higher than that of all previous approaches.
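For concreteness, the sketch below (ours, based on networkx; all function and variable names are our own) implements simplified versions of several of the strategies above. One deviation: on the first long-distance pick it simply treats nodes that cannot reach x as the farthest candidates, instead of picking a random node.

```python
import networkx as nx
import random

def build_in_links(G, x, s, strategy):
    """Select and build s in-links for the newcomer x (already in G)."""
    S = []
    for _ in range(s):
        candidates = [v for v in G if v != x and not G.has_edge(v, x)]
        if strategy == "random":
            v = random.choice(candidates)
        elif strategy == "degree":
            v = max(candidates, key=G.degree)
        elif strategy == "long_distance":
            # distance towards x; unreachable nodes count as farthest
            dist = nx.shortest_path_length(G, target=x)
            v = max(candidates, key=lambda u: dist.get(u, float("inf")))
        elif strategy == "current_rank":
            pr = nx.pagerank(G)               # recomputed at every step
            v = max(candidates, key=pr.get)
        elif strategy == "future_rank":
            def rank_after(u):                # simulate each possible link
                H = G.copy()
                H.add_edge(u, x)
                return nx.pagerank(H)[x]
            v = max(candidates, key=rank_after)
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        G.add_edge(v, x)                      # build the chosen backlink
        S.append(v)
    return S

G = nx.DiGraph(nx.scale_free_graph(1000))    # small SF test network
G.add_node("x")
print(build_in_links(G, "x", 5, "long_distance"))
```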


Fig. 1. Simulations: (a) ER networks, (b) SF networks

3 Results
In this section, we compare the performance of the previously mentioned strategies on synthetic Scale-Free (SF) and Erdős–Rényi (ER) networks, generated using Pajek [25], with 100K nodes each. During each experiment we employ a different heuristic to build the in-link set S of a newcomer node x. All the results are averaged over five instances of each network type to remove any realization bias. Figures 1a and b show the heuristics comparison for ER and SF networks, respectively. The experiments show that most of the intuitions on how to improve the PageRank "transfer" do not perform as well as expected. Indeed, excluding Future, which computes and chooses the best option during each step, all other heuristics are outperformed by the Long distance one, which does not rely on any information about PageRank itself. Moreover, in SF networks this approach has performance comparable with the more expensive Future. In addition to these considerations, which apply to both ER and SF networks, there are some differences in how the heuristics perform on the two types of networks. For instance, a remarkable difference is the behavior of the Random approach, which performs very well in SF networks but poorly in ER networks, due to their different structural properties. According to these results, one could consider the Random attachment strategy as a baseline, at least in SF networks; it also comes with almost no computational cost and can be used when the network topology is unknown. Moreover,


Figs. 1(b) and (a) clearly show that SF networks exhibit fast initial dynamics, and this does not depend on the heuristic considered. In fact, with only one link added to the target node x, in the ER networks x reaches position 8000 at most (by using the Future strategy), while it can rank far better in the SF networks by reaching position 5200. On the other hand, in SF networks it is very hard to reach the top position (which takes more than 350 in-links), while it seems to be quite easy in ER networks, with fewer than 10 links using the Future strategy. These results are in accordance with those proposed in [26,27].

4 Conclusion
The goal of improving the rank with minimal effort has been considered in this paper, in particular focusing on how this can be accomplished using backlinks, i.e. facing the link building (or best attachment) problem. We considered different criteria to select the nodes backlinks are gathered from, and the long-distance heuristic proved successful: the node joining the network aims at gathering backlinks from the farthest nodes. Further work includes learning why the long-distance heuristic works, performing experiments on real-world case studies, and extending context-aware strategies to take into account different network properties. Moreover, the cost model should be enriched to take into account real network dynamics, mainly in the social networks context.
Acknowledgements. This work has been partially supported by the Università degli Studi di Catania, "Piano della Ricerca 2016/2018 Linea di intervento 2".

References
1. Weng, J., Miao, C., Goh, A., Shen, Z., Gay, R.: Trust-based agent community for collaborative recommendation. In: AAMAS 2006: Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 1260–1262. ACM, New York (2006)
2. Liu, X.: Towards context-aware social recommendation via trust networks. In: Lin, X., Manolopoulos, Y., Srivastava, D., Huang, G. (eds.) Web Information Systems Engineering - WISE 2013. Lecture Notes in Computer Science, vol. 8180, pp. 121–134. Springer, Berlin Heidelberg (2013)
3. de Blas, C.S., Martin, J.S., Gonzalez, D.G.: Combined social networks and data envelopment analysis for ranking. Eur. J. Oper. Res. 266(3), 990–999 (2018)
4. Guerrero-Bote, V.P., Moya-Anegón, F.: A further step forward in measuring journals scientific prestige: the SJR2 indicator. J. Inform. 6(4), 674–688 (2012)
5. Pan, B., Hembrooke, H., Joachims, T., Lorigo, L., Gay, G., Granka, L.: In Google we trust: users' decisions on rank, position, and relevance. J. Comput. Mediated Commun. 12(3), 801–823 (2007)
6. Chauhan, V., Jaiswal, A., Khan, J.: Web page ranking using machine learning approach. In: 2015 Fifth International Conference on Advanced Computing Communication Technologies (ACCT), pp. 575–580, February 2015
7. Su, A.J., Hu, Y.C., Kuzmanovic, A., Koh, C.K.: How to improve your search engine ranking: myths and reality. ACM Trans. Web 8(2), 8:1–8:25 (2014)


8. Jiang, J.Y., Liu, J., Lin, C.Y., Cheng, P.J.: Improving ranking consistency for web search by leveraging a knowledge base and search logs. In: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, CIKM 2015, pp. 1441–1450. ACM, New York (2015)
9. Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the web (1998)
10. Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. J. ACM 46(5), 604–632 (1999)
11. Lempel, R., Moran, S.: SALSA: the stochastic approach for link-structure analysis. ACM Trans. Inf. Syst. 19(2), 131–160 (2001)
12. Albert, R., Barabasi, A.L.: Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47 (2002)
13. Newman, M.: The structure and function of complex networks. SIAM Rev. 45, 167–256 (2003)
14. Kunegis, J., Blattner, M., Moser, C.: Preferential attachment in online networks: measurement and explanations. In: Proceedings of the 5th Annual ACM Web Science Conference, WebSci 2013, pp. 205–214. ACM, New York (2013)
15. Olsen, M., Viglas, A., Zvedeniouk, I.: An approximation algorithm for the link building problem. CoRR abs/1204.1369 (2012)
16. Avrachenkov, K., Litvak, N.: The effect of new links on Google PageRank. Stoch. Models 22(2), 319–331 (2006)
17. de Kerchove, C., Ninove, L., Dooren, P.V.: Maximizing PageRank via outlinks. CoRR abs/0711.2867 (2007)
18. Fercoq, O., Akian, M., Bouhtou, M., Gaubert, S.: Ergodic control and polyhedral approaches to PageRank optimization. IEEE Trans. Automat. Contr. 58(1), 134–148 (2013)
19. Sydow, M.: Can one out-link change your PageRank? In: Szczepaniak, P.S., Kacprzyk, J., Niewiadomski, A. (eds.) AWIC 2005. Lecture Notes in Computer Science, vol. 3528, pp. 408–414. Springer, Heidelberg (2005)
20. Buzzanca, M., Carchiolo, V., Longheu, A., Malgeri, M., Mangioni, G.: Dealing with the best attachment problem via heuristics. In: Badica, C., et al. (eds.) Intelligent Distributed Computing X, vol. 678, pp. 205–214. Springer International Publishing, Cham (2017)
21. Carchiolo, V., Grassia, M., Longheu, A., Malgeri, M., Mangioni, G.: Climbing ranking position via long-distance backlinks. In: Proceedings of the 11th International Conference, IDCS 2018, Tokyo, Japan, 11–13 October 2018, pp. 100–108. Springer International Publishing, October 2018
22. Carchiolo, V., Grassia, M., Longheu, A., Malgeri, M., Mangioni, G.: Long distance in-links for ranking enhancement. In: Del Ser, J., Osaba, E., Bilbao, M., Sanchez-Medina, J., Vecchio, M., Yang, X.S. (eds.) Intelligent Distributed Computing XII, pp. 3–10. Springer International Publishing, Cham (2018)
23. Fredman, M.L., Tarjan, R.E.: Fibonacci heaps and their uses in improved network optimization algorithms, pp. 338–346 (1984)
24. Bianchini, M., Gori, M., Scarselli, F.: Inside PageRank. ACM Trans. Internet Technol. 5(1), 92–128 (2005)
25. Batagelj, V., Mrvar, A.: Pajek - program for large network analysis (1999)
26. Carchiolo, V., Longheu, A., Malgeri, M., Mangioni, G.: Network size and topology impact on trust-based ranking. Int. J. Bio-Inspired Comput. 10(2), 119–126 (2017)
27. Carchiolo, V., Longheu, A., Malgeri, M., Mangioni, G.: The effect of topology on the attachment process in trust networks. In: Camacho, D., Braubach, L., Venticinque, S., Badica, C. (eds.) Intelligent Distributed Computing VIII, pp. 377–382. Springer, Cham (2015)

Research of the Possibility of Hidden Embedding of a Digital Watermark Using Practical Methods of Channel Steganography

Pavel I. Sharikov(B), Andrey V. Krasov, Artem M. Gelfand, and Nikita A. Kosov

The Bonch-Bruevich Saint-Petersburg State University of Telecommunications, 22 Prospekt Bolshevikov, St. Petersburg, Russia
[email protected], [email protected], [email protected], [email protected]

Abstract. This study examines the two most popular protocols of the network and transport layers, IP and TCP, for the steganographic potential of their packets. The article describes and justifies the choice of the working tools, investigates possible channel steganography techniques in the data packets of these protocols, and implements the hidden attachment of information. The practical applicability of the approach is described, software is developed, and conclusions are drawn from the study as a whole.

Keywords: Channel steganography · Steganography · Network protocols · Digital watermark · TCP · IP · Hiding information

1 Introduction

This paper covers the areas of steganography, network protocols, and security for hiding data in communication networks that use TCP/IP. The authors present a new method of hidden information attachment in packets. The novelty of this study lies in the use of channel steganography methods and in the software developed to carry out the research. The relevance of the work follows from the fact that the methods reviewed for comparison in this study have lost their effectiveness due to the rapid development of technology.

2 Related Work

Due to the growth of data transmission through these protocols, the amount of research in this area is growing rapidly. For example, in [1] the authors describe mathematical methods for steganography in JPEG files based on a new channel selection rule, but they do not discuss how the developed rules and techniques would work in TCP/IP packets.


The relevance of channel steganography is confirmed by large-scale research [2], which develops templates for detecting malicious software or other hidden information in packets. In [3], possibilities of new methods for creating hidden channels are explored, and a technique for hiding data in VoIP traffic is proposed. In [4], the authors propose building a hidden communication channel using the StegBlocks method (a steganographic block cipher). Unfortunately, due to the latest advances in the field, the limitations of the method no longer allow it to be used fully or to be called "undetectable." In [5], a new data hiding method is developed and tested with three carrier protocols (HTTP, FTP and DNS). In [6], the authors propose a steganographic method using the IP ID field of the IP packet; however, a sufficient amount of software now exists that successfully detects such embedding and reveals the encrypted data fields, especially at the beginning of the header. All of the works mentioned in this section were analyzed by the authors of this study, after which it was concluded that the most popular protocols, TCP/IP, have not been studied in sufficient depth: the software developed for the above studies is now outdated, and no information about the protocols developed there is available. In this regard, the authors assume that few approaches have been developed to date that can embed secret information in TCP/IP protocols with a high degree of security. As part of this study, a specialized protocol was developed and tested in real conditions at an operating enterprise, along with fully working software that is also used at an information security enterprise.

3 Proposed Approach

The authors propose an approach to hidden information attachment in the headers of TCP/IP protocol packets. The main requirements for the approach are:

• the hidden information must be invisible in the packet;
• the packet must not be corrupted;
• the hidden message must be difficult to detect and delete;
• the hidden message must not be distorted while the packet is being sent.

As follows from Sect. 2 of this work, the authors concluded that the existing approaches to channel steganography do not meet the current requirements for hidden message transmission. This work is intended to fill the gaps in this area.

3.1 Selection of Tools

In the proposed scenario, a method for modifying packets at the user level should be chosen, since modifying packets at the kernel level requires the installation of system drivers, which are very likely to be noticed by the administrator of the compromised working machine and would expose the attacker at the stage of configuring the software for channel steganography [7]. User-level modification tools depend on the type of OS and do not have common interfaces (unlike the pcap library). For this reason, Linux was chosen as the operating system for the practical implementation, specifically the Ubuntu 17.04 distribution, and libnetfilter_queue was chosen as the tool.

3.2 Description of the Selected Tools

Libnetfilter_queue is a library that implements the ability to modify packets at the user level in Linux. Using libnetfilter_queue requires configuring rules that transfer network packets from the system netfilter to the user level. As a utility for creating these rules, iptables can be used.

3.3 Software Implementation Language

GO was chosen as the software implementation language. A tool with flexible settings and suitable libraries for conducting the experiments was needed, and the programming language also had to avoid depending on external libraries; the Go language satisfies all these requirements. To parse network packets, the google gopacket software package was chosen; to change packets, go-netfilter-queue, the Go wrapper for the libnetfilter_queue library, was used. A minimal sketch of this setup is given below.
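For illustration only, a minimal receive loop over go-netfilter-queue might look like the following sketch. It assumes the commonly used Go wrapper API (NewNFQueue/GetPackets/SetVerdict) and an iptables rule that diverts traffic to queue 0; the import path, queue number, and rule are assumptions for the example, not values prescribed by the paper.

package main

import (
	"fmt"

	netfilter "github.com/AkihiroSuda/go-netfilter-queue" // assumed import path of the Go wrapper
)

func main() {
	// Before running, packets must be diverted to the queue, e.g. with an
	// assumed rule: iptables -A OUTPUT -p tcp -j NFQUEUE --queue-num 0
	nfq, err := netfilter.NewNFQueue(0, 100, netfilter.NF_DEFAULT_PACKET_SIZE)
	if err != nil {
		panic(err)
	}
	defer nfq.Close()

	for p := range nfq.GetPackets() {
		// p.Packet is a parsed gopacket.Packet; an embedder would modify
		// header fields here before issuing the verdict.
		fmt.Println(p.Packet)
		p.SetVerdict(netfilter.NF_ACCEPT)
	}
}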

3.4 Study of Possible Methods of Channel Steganography for the Network Protocol IP

For the practical implementation, only the two most popular protocols of the network and transport layers, IP and TCP, were chosen. For IP, the features indicated in Fig. 1 were considered.

Type of Service: the eight Type of Service (ToS) bits in the IP header specify quality of service parameters for routers along the packet path. They are rarely used for their original purpose today, although they can be used in DiffServ. Since most networks do not use these bits, there is potential to use them as a carrier of steganographic information.

IP Identification: the 16-bit IP Identification (IP ID) field in the IP header is used to determine the correct sequence of fragments when reassembling a packet. Since the IP ID distinguishes the fragments that make up one packet from the fragments that make up another, the only restrictions on the value of this field are uniqueness while the packet's fragments are on the network, and unpredictability.

The choice was made in favor of the ToS field of the IP header; a sketch of embedding a byte into this field follows Fig. 1. The IP Flags and IP Options fields were not selected due to their low efficiency, and the IP Identification and fragment-related fields were rejected because they are easily detected by specialized software.


Fig. 1. IPv4 packet header
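As an illustration, a minimal sketch of placing one covert byte into the ToS field, using the gopacket package mentioned in Sect. 3.3, might look as follows; the addresses, payload, and hidden value are assumptions for the example.

package main

import (
	"fmt"
	"net"

	"github.com/google/gopacket"
	"github.com/google/gopacket/layers"
)

func main() {
	hidden := byte(0x2A) // the covert byte to carry in the ToS field (illustrative)
	ip := &layers.IPv4{
		Version:  4,
		IHL:      5,
		TOS:      hidden, // Type of Service carries the hidden data
		TTL:      64,
		Protocol: layers.IPProtocolUDP,
		SrcIP:    net.IP{192, 168, 0, 1},
		DstIP:    net.IP{192, 168, 0, 2},
	}
	buf := gopacket.NewSerializeBuffer()
	opts := gopacket.SerializeOptions{ComputeChecksums: true, FixLengths: true}
	if err := gopacket.SerializeLayers(buf, opts, ip, gopacket.Payload([]byte("data"))); err != nil {
		panic(err)
	}
	// The second byte of the IPv4 header is the ToS field.
	fmt.Printf("ToS byte on the wire: %#x\n", buf.Bytes()[1])
}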

3.5 Study of Possible Methods of Channel Steganography for the TCP Network Protocol

The TCP packet header is shown in Fig. 2. TCP Timestamp: the TCP timestamp option allows a host to accurately measure packet transit times and also mitigates the problems associated with packet sequence numbers in networks with high bandwidth and delay.

Fig. 2. TCP packet header

The timestamp option consists of two 32-bit fields, TS Value and TS Echo Reply. The TS Value is set on the basis of the sender's time stamp, and it is in this field that hidden data can be placed.


This field has considerable potential for carrying hidden information, so the choice was made in favor of the Timestamp option. No other options were chosen, due to their ease of detection or low steganographic potential. A sketch of the option layout is given below.
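To illustrate the layout being exploited, the sketch below builds the 10-byte TCP timestamp option (kind 8, length 10) with a covert 32-bit value in TS Value; the helper name and values are assumptions for the example.

package main

import (
	"encoding/binary"
	"fmt"
)

// buildTimestampOption lays out the TCP timestamp option (kind 8, length 10)
// with a covert 32-bit value in TS Value. A real sender would also append
// two nop options (kind 1) to keep the header 32-bit aligned, as described
// in Sect. 4.1.
func buildTimestampOption(hiddenTSval, tsEcho uint32) []byte {
	opt := make([]byte, 10)
	opt[0] = 8  // option kind: timestamps
	opt[1] = 10 // option length
	binary.BigEndian.PutUint32(opt[2:6], hiddenTSval) // TS Value carries the hidden data
	binary.BigEndian.PutUint32(opt[6:10], tsEcho)     // TS Echo Reply left untouched
	return opt
}

func main() {
	fmt.Printf("% x\n", buildTimestampOption(0xDEADBEEF, 0))
}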

4 Experiment

4.1 Implementing Information Attachment Using TCP

In the practical implementation, the data was embedded in the TS Value field of the timestamp option. The data was inserted only if the timestamp option was missing from the outgoing packet and the data offset field was less than or equal to 12. To correctly set the offset of the resulting packet header, two nop options were added after the timestamp option. To check the operation of the attachment, the iperf3 utility was used to generate TCP packets. After the connection is established, packets begin to get lost. Thus, it can be concluded that attachment to the TCP timestamp option is inefficient and easily detected: improper use of this option causes incorrect network behavior.

4.2 Implementing Information Attachment Using the IP Protocol

The Differentiated Services Code Point (DSCP) is used to divide traffic into service classes, and Explicit Congestion Notification (ECN) warns of network congestion without packet loss. Together these fields provide a steganographic capacity of 8 bits in each IP packet, and the proof-of-concept verification shows that this field can be used without traffic loss or network problems. For the transfer of hidden data in this field, a data transfer protocol was developed. The first 7 bits in each packet are informational, and the last bit is a signal bit (a sketch of this bit layout follows the list). The protocol uses the following signal codes:

1. discoverCode = 11111110
2. acceptCode = 11111111
3. okCode = 11111111
4. startTransmitCode = 11111110
5. endTransmitCode = 11111110
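A minimal sketch of the 7+1 bit layout is shown below; placing the 7 informational bits in the upper positions of the DS byte is an interpretation for illustration, since the paper does not fix the exact placement.

package main

import "fmt"

// encodeDS packs 7 informational bits and the signal flag into one DS byte.
func encodeDS(data byte, signal bool) byte {
	b := (data & 0x7F) << 1 // 7 data bits in the upper positions
	if signal {
		b |= 0x01 // the lowest bit marks a valid data packet
	}
	return b
}

// decodeDS recovers the 7 informational bits and the signal flag.
func decodeDS(b byte) (byte, bool) {
	return (b >> 1) & 0x7F, b&0x01 == 1
}

func main() {
	ds := encodeDS(0x41, true) // carry 7 bits of 'A' with the signal bit set
	data, signal := decodeDS(ds)
	fmt.Printf("ds=%08b data=%#x signal=%v\n", ds, data, signal)
}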

Data transfer scenarios are shown in Fig. 3. When transmitting information, only packets with the last bit set are valid [8]; this allows the 7 bits to be used for signaling when the last bit is cleared in future implementations of this protocol. The developed protocol was implemented as an external package (library) for the GO language, using the standard net.Conn interface, which allows the developed package to be integrated into ready-made solutions for remote control, data transfer, and encryption with minimal effort. Below is the implementation of a simple server in the Go language that uses channel steganography in the IP protocol as a data transport.


Fig. 3. Data transfer scenarios

Listing 1. Server implementing channel steganography

package main

import (
	"net"
	"os"
)

// chanstego is the authors' channel steganography package; its import path
// is not given in the source.

func main() {
	// Listen on the IP DiffServ covert channel (the meaning of the
	// arguments 10 and 20 is not specified in the source).
	l, err := chanstego.Listen("IP.DS", 10, 20)
	if err != nil {
		os.Exit(1)
	}
	defer l.Close()
	for {
		conn, err := l.Accept()
		if err != nil {
			os.Exit(1)
		}
		go handleRequest(conn)
	}
}

func handleRequest(conn net.Conn) {
	buf := make([]byte, 1024)
	if _, err := conn.Read(buf); err == nil {
		// reply to the sender (assumed completion of the truncated listing)
		conn.Write([]byte("Message received."))
	}
	conn.Close()
}

4.3 Technical Discussion of the Results of the Proposed Method and Evaluation of the Approach

Testing was conducted in a virtualized environment using the Parallels Desktop hypervisor. The virtual machines were configured to transfer packets through a real network, so the transmitted packets passed through real network equipment. The Ubuntu 17.04 Linux distribution was installed on both the client and the server. In the course of the study, an experiment was conducted to transfer a sequence of bytes from the server to the client and from the client to the server; the transfer was successful. The technique can thus also be used to verify the validity of transmitted information by means of checksums; the authors did not find such capabilities described in publicly available articles. With frequent transmission of user data, this approach is most relevant to protecting user data, for example with digital watermarks [9].

5 Conclusion

The consequence of the above is the ability to use the steganographic potential of each IP packet to produce a hidden embedding of 8 bits in size, without traffic loss or network problems. This approach has wide potential use, since it satisfies the requirements outlined in Sect. 3 of this article and since user data is transferred massively over the Internet. This article examined widespread data transfer protocols, which further substantiates the authors' choice of the TCP/IP protocols for hidden data attachment.

References

1. Huang, F., Huang, J., Shi, Y.Q.: New channel selection rule for JPEG steganography. IEEE Trans. Inf. Forensics Secur. 7(4), 1181–1191 (2012)
2. Abarca, S.: An analysis of network steganographic malware. Doctoral dissertation, Utica College (2018)
3. Schmidt, S.S., Mazurczyk, W., Keller, J., Caviglione, L.: A new data-hiding approach for IP telephony applications with silence suppression. In: Proceedings of the 12th International Conference on Availability, Reliability and Security, p. 83. ACM, August 2017
4. Bąk, P., Bieniasz, J., Krzemiński, M., Szczypiorski, K.: Application of perfectly undetectable network steganography method for malware hidden communication. In: 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 34–38. IEEE, September 2018
5. Wendzel, S., Mazurczyk, W.: Poster: an educational network protocol for covert channel analysis using patterns. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 1739–1741. ACM, October 2016
6. Xue, P.F., Hu, J.S., Liu, H.L., Hu, R.G.: A new network steganographic method based on the transverse multi-protocol collaboration (2017)
7. Abdullaziz, O.I., Goh, V.T., Ling, H.C., Wong, K.: AIPISteg: an active IP identification based steganographic method. J. Netw. Comput. Appl. 63, 150–158 (2016)
8. Kostyrin, A.S., Krasov, A.V.: Implementation of the channel steganography method using the ICMP protocol. In: Regional Informatics and Information Security: Collection of Scientific Papers. St. Petersburg Society of Informatics, Computer Engineering, Communication and Control Systems, pp. 313–316 (2017)
9. Sharikov, P.I.: The method of finding the most profitable container in the format of executable files. Science-Intensive Technologies in Earth Space Research 7(5), 58–62 (2015)

A Highly Scalable Index Structure for Multicore In-Memory Database Systems

Hitoshi Mitake1, Hiroshi Yamada2, and Tatsuo Nakajima1(B)

1 Waseda University, 3-4-1 Okubo Shinjuku, Tokyo, Japan
{h.mitake,tatsuo}@dcl.cs.waseda.ac.jp
2 Tokyo University of Agriculture and Technology, 2-24-16 Nakamachi Koganei, Tokyo 184-8588, Japan
[email protected]

Abstract. In this paper, we present insights from an analysis of the drawbacks of advanced concurrency control techniques. Based on the analysis, we reveal that a technique commonly used for index structures, read-copy update (RCU), has the most significant impact on the throughput and latency of in-memory database systems. To overcome the drawbacks, we developed Glasstree, a new index structure that places a smaller load on the memory allocator by enhancing Masstree. Glasstree achieves higher throughput and more stable latency than Masstree under various workloads.

Keywords: In-memory database · Epoch-based reclamation · Multicore scalability · Index tree structure

1 Introduction

Modern commodity servers can be equipped with multiple terabytes of DRAM. This large amount of main memory has enabled modern in-memory database systems, including key-value stores (KVSes) [9] and relational database management systems (RDBMSes) [3]. Hekaton [1] and Silo [10] are pioneering in-memory RDBMSes that achieve scalable transaction throughput. Masstree, an index structure used in the above systems, is a fast key-value database designed for multicore processors [5]. However, recent studies have revealed that the advanced concurrency control techniques in Masstree introduce significant drawbacks. Our analysis revealed that this is caused by Read-Copy Update (RCU) [7], the memory reclamation scheme used in Masstree. RCU is used by Masstree for managing the lifetime of index nodes and values in a scalable manner. In contrast to naive techniques, RCU does not require shared memory updates in frequently executed code paths. We found that the drawback of RCU results in unstable and high latency spikes and throughput degradation because of its high load on the memory allocator. To overcome the drawbacks, we designed and implemented a new variant of Masstree named Glasstree, which combines both RCU and reference counting


for achieving multicore scalability and reducing the load on the memory allocator. We evaluated the performance properties of Glasstree when it is integrated into a KVS. For this purpose, we selected a Masstree-based KVS for our evaluation because its source code is publicly available [11]. We compared the performance of the Glasstree-based KVS with the original versions of the Masstree-based KVS. The most remarkable result is that reference counting, a naive technique generally considered a bottleneck to multicore scalability, contributes to multicore scalability under some workloads if it is utilized in a suitable manner.

2 Read-Copy Update for In-Memory Scalable Database Systems

RCU is used for managing the lifetime of the nodes and records of Masstree as an alternative to naive reference counting. Naive reference counting requires incrementing and decrementing a count value in the data structure. These operations incur shared cache line updates and can become another bottleneck to multicore scalability. RCU reduces these updates by detecting the end of reader-side critical sections with various mechanisms based on thread scheduling assumptions. Masstree uses epoch-based reclamation (EBR) [2] as the reclamation mechanism of its RCU.

Fig. 1. Comparison of RCU and reference counting from the perspective of memory allocation. e1 and e2 denote EBR epochs; T1 and T2 are threads

A program that uses EBR has a global epoch number e, and each thread that can read values whose lifetime is managed by RCU has its own epoch number ew (note that ew can also be read from other threads). e is periodically incremented, and ew is initialized as e by its owner thread before starting its critical section.


Deleted values must be registered to the limbo list of the deleting thread with its ew. With this rule, a thread can judge that a value with an epoch number less than the minimum ew can be reclaimed safely. Objects managed under RCU cannot be immediately reclaimed even when no threads are referring to them (the time interval required for ensuring the end of all reader-side critical sections is called the grace period [7]). This means that the peak memory usage of systems based on RCU tends to be higher than that of systems based on reference counting. In addition, the hit ratio of the allocator's thread-local cache tends to be lower because long-lived objects prevent recycling of the space. This difference between RCU and reference counting is depicted in Fig. 1. During the grace period, the memory area used by the reclaimed object cannot be recycled in the case of RCU.
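The bookkeeping described above can be summarized in the following single-threaded sketch; it only illustrates the rule "reclaim once an item's epoch precedes the minimum reader epoch," while a real EBR implementation would use atomics and per-thread limbo lists.

package main

import "fmt"

type retired struct {
	epoch int
	name  string
}

type ebr struct {
	global  int   // the global epoch e, periodically incremented
	readers []int // e_w for each reader thread
	limbo   []retired
}

// enterCS is executed by reader w before its critical section: e_w := e.
func (s *ebr) enterCS(w int) { s.readers[w] = s.global }

// retire registers a deleted value under the deleting thread's epoch.
func (s *ebr) retire(w int, name string) {
	s.limbo = append(s.limbo, retired{s.readers[w], name})
}

// reclaim frees values whose epoch is less than the minimum reader epoch.
func (s *ebr) reclaim() {
	min := s.global
	for _, e := range s.readers {
		if e < min {
			min = e
		}
	}
	kept := s.limbo[:0]
	for _, r := range s.limbo {
		if r.epoch < min {
			fmt.Println("reclaimed:", r.name) // its grace period has passed
		} else {
			kept = append(kept, r)
		}
	}
	s.limbo = kept
}

func main() {
	s := &ebr{readers: make([]int, 2)}
	s.enterCS(0)
	s.retire(0, "old-node") // retired in epoch 0
	s.global++              // the global epoch advances
	s.enterCS(0)            // both readers re-enter in epoch 1
	s.enterCS(1)
	s.reclaim() // epoch 0 < min(e_w) = 1, so old-node is reclaimed
}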

3 Glasstree: A Highly Scalable Index Structure for Multicore In-Memory Database Systems

3.1 Analysis of Ordered Indexes of Databases

The design of Glasstree exploits a simple but fundamental property of balanced trees, namely, data structures used as ordered indexes: if access to values is equally distributed, the nodes near the root are accessed frequently while the nodes near the values are accessed less frequently. From the perspective of multicore scalability, this means that updating shared cache lines while accessing the nodes near the root will likely produce scalability bottlenecks; conversely, update operations while accessing the nodes near the values will produce fewer bottlenecks. The design also respects the principle of RCU, which is suitable for protecting mostly-read and long-living objects. In general, the internodes and the leaves live longer than the values: the values can be reclaimed if they are updated or deleted, whereas internodes and leaves are only reclaimed when reshaping occurs during value deletion. This assumption can also be validated by actual experiments. Table 1 summarizes memory reclamation during Silo's TPC-C benchmark, where Silo adopts Masstree as an index structure. As shown in this table, a large portion of memory reclamation is dominated by value objects.

Table 1. Breakdown of types and numbers of reclaimed objects during Silo's TPC-C benchmark

Type      | # of reclamations | Percentage
Internode | 767505            | 2.3%
Leaf      | 7734210           | 24.0%
Value     | 23813904          | 73.7%

3.2 Overall Design of Glasstree

Based on the above observations, Glasstree employs both RCU and reference counting as its lifetime management schemes, whereas Masstree adopts only RCU for all cases. As depicted in Fig. 2, Glasstree manages the lifetimes of internodes and leaves using RCU; this design is shared with Masstree. Conversely, Glasstree manages the lifetimes of values (in the case of Silo, a value functions as a record) with reference counting. Therefore, read operations (get and scan) need to increment the reference count before passing the results to their callers and decrement it after their usage. This means that Glasstree gives up complete invisibility of the readers. In contrast to the read operations, the write operations (insert, put and remove) are not changed from Masstree. Because of this design, the reclaimed values of Glasstree are directly returned to the allocator; in Masstree, they are first registered to a limbo list of each thread and returned to the allocator only after the end of the grace period. With this design, Glasstree can reduce the overhead of memory reclamation without introducing a large scalability bottleneck (e.g., protecting a tree with a single giant lock).

Fig. 2. An overview of Glasstree

The original Masstree assumes that its value type is supplied by the user program and that a pointer to that type is used in its operations; therefore, Masstree does not require any special operations on the value type. Unlike Masstree, Glasstree requires the value type to have a reference count field and two operations for incrementing and decrementing the count. If the reference count reaches zero, the value is deallocated immediately.
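A minimal sketch of this value-type contract, using an atomic counter, might look as follows; the names are illustrative, and since Glasstree itself extends Masstree, this is only a language-neutral illustration of the contract rather than its implementation.

package main

import (
	"fmt"
	"sync/atomic"
)

type Value struct {
	refs int32
	data []byte
}

// Incref is called by readers before a pointer to the value is handed out.
func (v *Value) Incref() {
	atomic.AddInt32(&v.refs, 1)
}

// Decref releases one reference; the last holder frees the value immediately,
// without waiting for a grace period as RCU would.
func (v *Value) Decref() {
	if atomic.AddInt32(&v.refs, -1) == 0 {
		v.data = nil // stand-in for returning the memory to the allocator
	}
}

func main() {
	v := &Value{refs: 1, data: []byte("record")}
	v.Incref() // a reader takes a reference
	v.Decref() // the reader is done
	v.Decref() // the owner drops the last reference; the value is reclaimed
	fmt.Println("reclaimed:", v.data == nil)
}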

3.3 Design of Index Structure

The main difficulty of Glasstree's design is enabling the coexistence of multiple GC schemes: internodes and leaves are managed with RCU, and values are managed with reference counting. If reference counting is used for managing values in a straightforward manner, this hybrid approach can easily lead to dereferencing reclaimed objects. First, consider a get operation (obtaining the value of a required key) and assume the following schedule:

(1) A get request for a given key K that has a value V is issued, and a reader thread R reads a leaf L and obtains a pointer P to V.
(2) A remove request for K is issued, and a writer thread W obtains L for removal.
(3) W removes P from L, decrements the reference count of V, and then reclaims V.
(4) R dereferences P, increments the memory area pointed to by P as the integer-type reference count of V, and returns P as a response; however, the area pointed to by P is already freed and can be reused for other purposes.

To avoid such a situation, Glasstree needs to ensure that reader threads do not obtain obsolete pointers. We avoid the above problem by changing the behavior of a reader thread as in Fig. 3 (the symbols and terminology follow the specification of Masstree [5]).

Fig. 3. Get operation: find a value for a given key

The modified get operation of Glasstree leverages a retry process of reader threads: unlike traditional data structures protected by RCU [2], reader threads can observe inconsistent states introduced by modifications of writer threads.


Traditional RCU-based data structures are designed to hide the inconsistent state from the readers by carefully ordering updates. This constraint significantly limits the types of data structures that can be protected by RCU (for example, a methodology for updating multiple pointers was recently established by Matveev et al. [6], but reader threads never need to wait or retry during their traversal operations). Masstree addresses the constraint by simply allowing reader threads to retry. The inconsistent state can be detected through an OCC-based version number validation mechanism [3]. The version number used in Masstree doubles as a spinlock: a single bit in the version number indicates the existence of a writer thread. It is used both for writer-writer coordination and for detecting inconsistent states on the reader side. In the get operation of Glasstree, this spinlock is acquired to avoid dereferencing an obsolete pointer. Therefore, the get operation of Glasstree introduces at most three new shared cache line updates: acquiring the spinlock of a leaf, incrementing the reference count of a value (which does not occur if the key has no value in the tree), and releasing the spinlock. This means that Glasstree's scalability of read operations can be degraded compared to Masstree. As with the get operation, the scan operation also needs to be modified. Masstree takes a callback function for the scan operation; during the scan, the callback is called each time a candidate value is visited. In Glasstree, the scan operation acquires the spinlock of the leaves to which the values belong.
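The following sketch illustrates the idea of a version word that doubles as a spinlock: one bit marks an active writer, and the remaining bits form the OCC version that readers validate. The bit layout and helper names are assumptions for the illustration, not Masstree's actual encoding.

package main

import (
	"fmt"
	"sync/atomic"
)

const lockedBit = uint64(1)

type VersionLock struct{ v uint64 }

// Lock spins until the writer bit is acquired.
func (l *VersionLock) Lock() {
	for {
		old := atomic.LoadUint64(&l.v)
		if old&lockedBit == 0 && atomic.CompareAndSwapUint64(&l.v, old, old|lockedBit) {
			return
		}
	}
}

// Unlock bumps the version (bit 0 is reserved, so the counter steps by 2)
// and clears the writer bit in one store.
func (l *VersionLock) Unlock() {
	atomic.StoreUint64(&l.v, (atomic.LoadUint64(&l.v)+2)&^lockedBit)
}

// StableRead returns a version snapshot taken while no writer was active;
// an OCC reader re-reads it after the traversal and retries on a mismatch.
func (l *VersionLock) StableRead() uint64 {
	for {
		v := atomic.LoadUint64(&l.v)
		if v&lockedBit == 0 {
			return v
		}
	}
}

func main() {
	var l VersionLock
	before := l.StableRead()
	l.Lock()   // a writer modifies the node here
	l.Unlock()
	fmt.Println("version changed:", l.StableRead() != before) // a reader would retry
}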

3.4 Evaluation and Analysis

In this section, we analyze the performance contributions of the lifetime management techniques by comparing the KVS based on Masstree, called mtd [11] (the git commit ID used in our evaluation is 15edde0), and the KVS based on Glasstree, called gtd. In addition, we evaluated mtd with asynchronous GC to clarify the analysis. Each KVS server was executed on a machine equipped with dual Intel Xeon E5-2690 v4 CPUs (2.60 GHz, a total of 28 physical cores) and 128 GB of DRAM. The server process executed 28 threads. The persistence feature of mtd was disabled to simplify the analysis. We used 10 client machines equipped with a single Intel Xeon E3-1270 v3 (3.50 GHz, 4 physical cores) and 32 GB of DRAM to generate workloads. The client machines executed our customized mutilate [4], which supports the mtd protocol. The server and clients were connected by 10 Gbps Ethernet and communicated using the TCP-based mtd protocol. The workloads consist of three patterns: 100% get, 50% get + 50% put, and 100% put. Every client issues requests for 60 s at the same time. The requests target 1 million values of 200 bytes in length with 30-byte keys. Table 2 shows the results of the benchmark (latency and throughput).


Table 2. Performance comparison of mtd, mtd with asynchronous GC, and gtd under various workloads. All scores other than query/sec are in microseconds; the nth columns show the average of latencies in the top n percent. For the mixed workload, query/sec applies to the whole run (get + put)

Workload                   | avg   | std dev | min  | max     | 5th   | 95th  | 99th  | 99.9th | 99.99th | query/sec
mtd (get only)             | 241.0 | 40.2    | 80.2 | 12203.0 | 183.0 | 306.9 | 358.7 | 446.3  | 564.0   | 2653525.3
mtd w/ async GC (get only) | 241.1 | 44.1    | 88.2 | 12747.0 | 182.0 | 309.8 | 364.4 | 466.1  | 599.7   | 2654116.9
gtd (get only)             | 240.9 | 39.4    | 88.2 | 3037.0  | 182.2 | 308.6 | 362.1 | 455.2  | 568.8   | 2654731.8
mtd (get)                  | 237.6 | 300.2   | 80.2 | 57241.9 | 152.1 | 308.2 | 401.1 | 3792.7 | 8798.6  | 2700735.8
mtd (put)                  | 244.8 | 290.2   | 80.2 | 59278.0 | 169.4 | 316.1 | 404.1 | 3792.5 | 8460.0  |
mtd w/ async GC (get)      | 227.8 | 91.0    | 80.2 | 22561.1 | 153.1 | 328.1 | 428.4 | 673.0  | 3542.7  | 2810971.3
mtd w/ async GC (put)      | 235.8 | 92.5    | 72.9 | 22547.0 | 169.1 | 336.8 | 441.2 | 708.9  | 3796.7  |
gtd (get)                  | 224.1 | 56.6    | 80.2 | 15449.0 | 153.3 | 310.6 | 392.4 | 510.3  | 688.9   | 2860670.9
gtd (put)                  | 231.7 | 54.7    | 80.2 | 15417.1 | 169.6 | 319.1 | 397.8 | 521.5  | 699.2   |
mtd (put only)             | 287.7 | 428.9   | 72.9 | 96538.1 | 194.1 | 337.4 | 519.9 | 5458.8 | 14348.3 | 2223881.7
mtd w/ async GC (put only) | 260.1 | 101.4   | 72.9 | 49347.2 | 191.1 | 348.4 | 472.3 | 804.4  | 4132.2  | 2459698.1
gtd (put only)             | 257.9 | 99.2    | 80.2 | 46134.0 | 193.4 | 335.9 | 409.0 | 599.4  | 954.5   | 2480658.0

The first three rows show the results of the read-only workload; the following get/put row pairs show the results of the 50% read and 50% write workload; and the last three rows show the results of the write-only workload. As these rows show, gtd provides the lowest and most stable latency and the highest throughput for workloads that include write operations. In almost every metric, gtd provides lower latency than mtd with asynchronous GC; the explicit communication between the main threads and the GC threads in mtd with asynchronous GC can cause non-trivial jitter. Surprisingly, the performance of gtd under the read-only workload is almost equal to that of mtd, even though in this case the reference counting mechanism of gtd is pure overhead. The reason for this result is that other overheads, e.g., sending and receiving messages over TCP/IP and marshalling and unmarshalling requests and responses, lowered the rate of cache line contention. The most important result of the evaluation is that the maximum latencies of gtd are much better than those of mtd or mtd with asynchronous GC. We can therefore conclude that Glasstree reduces large latency spikes.

4 Conclusion

In this paper, we analyzed the performance properties of RCU, an advanced lifetime management technique widely used by many state-of-the-art in-memory database systems. In [8], we also report some experience with Glasstree-based Silo. Our analysis revealed the importance and necessity of studies on the relation between lifetime management techniques and memory allocators from the perspective of in-memory database systems. For establishing truly robust in-memory database systems, more detailed studies on the topic are required. We hope that our analysis will be helpful for later studies.

References

1. Diaconu, C., Freedman, C., Ismert, E., Larson, P.-A., Mittal, P., Stonecipher, R., Verma, N., Zwilling, M.: Hekaton: SQL server's memory-optimized OLTP engine. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pp. 1243–1254 (2013)
2. Fraser, K.: Practical lock-freedom. Ph.D. thesis, Cambridge University, Technical report UCAM-CL-TR-579 (2004)
3. Kung, H.T., Robinson, J.T.: On optimistic methods for concurrency control. ACM Trans. Database Syst. 6(2), 213–226 (1981)
4. Leverich, J., Kozyrakis, C.: Reconciling high server utilization and sub-millisecond quality-of-service. In: Proceedings of the Ninth European Conference on Computer Systems, pp. 4:1–4:14 (2014)
5. Mao, Y., Kohler, E., Morris, R.T.: Cache craftiness for fast multicore key-value storage. In: Proceedings of the 7th ACM European Conference on Computer Systems, pp. 183–196 (2012)
6. Matveev, A., Shavit, N., Felber, P., Marlier, P.: Read-log-update: a lightweight synchronization mechanism for concurrent programming. In: Proceedings of the 25th Symposium on Operating Systems Principles, pp. 168–183 (2015)
7. McKenney, P.E., Sarma, D., Arcangeli, A., Kleen, A., Krieger, O., Russell, R.: Read-copy update. In: Proceedings of Ottawa Linux Symposium (2002)
8. Mitake, H., Yamada, H., Nakajima, T.: Looking into the peak memory consumption of epoch-based reclamation in scalable in-memory database systems. In: Proceedings of the 30th International Conference on Database and Expert Systems Applications (2019)
9. Ousterhout, J., Gopalan, A., Gupta, A., Kejriwal, A., Lee, C., Montazeri, B., Ongaro, D., Park, S.J., Qin, H., Rosenblum, M., Rumble, S., Stutsman, R., Yang, S.: The RAMCloud storage system. ACM Trans. Comput. Syst. 33(3), 7:1–7:55 (2015)
10. Tu, S., Zheng, W., Kohler, E., Liskov, B., Madden, S.: Speedy transactions in multicore in-memory databases. In: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pp. 18–32 (2013)
11. Beta release of Masstree. https://github.com/kohler/masstree-beta. Accessed 13 June 2019

Applying the Split-Join Queuing System Model to Estimating the Efficiency of Detecting Contamination Content Process in Multimedia Objects Streams

Vladimir Lokhvickii(B), Yuri Ryzhikov, and Andry Dudkin

Mozhaysky MSA, 13 Zhdanovskaya Street, Saint-Petersburg 197198, Russia
lokhv [email protected], [email protected], [email protected]

Abstract. An approach to evaluating the efficiency of the process of identifying contamination content in streams of multimedia objects is proposed. The possibility of separating multimedia objects into components (text, video, audio) and processing (analyzing) them in parallel with subsequent aggregation of the results is considered. The task is formalized on the basis of the M/G/n multichannel queuing system (QS) model, taking the Split-Join processes into account, and its probability-time characteristics are calculated. The key stage of the calculation is representing the random processing times of the components of multimedia objects by the distribution of their maximum, whose initial moments are found by numerical integration over the half-axis with the Chebyshev–Laguerre weight. The resulting efficiency estimate is presented as a set of initial moments of the distribution of the time a request spends in the system.

Keywords: Contamination content · Split-Join service process · Maximum distribution of random variables · Chebyshev–Laguerre numerical integration

1 Introduction

The current state of information security tools, including means of protecting users of Internet resources from unwanted content, requires more efficient methods for detecting the contamination of information resources. The main ways of counteracting the distribution of unwanted content currently remain [7] methods based on blocking information resources prohibited in the Russian Federation. The disadvantages of such approaches are:

1. The difficulty of identifying and preventing the spread of contamination content in real time and, as a consequence, the low response time.
2. Blocking a banned resource by its domain name or IP address, which leaves an attacker the opportunity to move it to a new address.


To eliminate the above disadvantages, the authors propose an approach to intelligently detecting contamination content in streams of multimedia objects, which opens up the possibility of blocking unwanted (negative) information based on its semantics in real time. The general scheme of this process, with a variant of its implementation based on a cloud platform, is presented in Fig. 1.

Fig. 1. Diagram of the process of intellectual analysis of multimedia object streams

The features of this process are:

• decomposition of multimedia objects into components – video (V), audio (A), text (T) – and the use of analysis methods of varying complexity for each;
• a decision on whether a multimedia object belongs to the contamination class based on aggregating the results of the component analysis;
• the need for timely detection of contamination content under a high intensity of multimedia object streams (operation practically in real time).

These features determine the need for special models when designing platforms that can provide the required efficiency of detecting contamination content in streams of multimedia objects.

2 Analysis of Modeling Possibilities for Split-Task Service Processes

Traditionally, probabilistic models based on queuing systems and networks are widely used to simulate the processing of tasks and to evaluate its efficiency. The need to take into account the division of the original task, with subsequent assembly of the results of the parallel processing of its subtasks, makes it possible to represent this process as a Split-Join QS [4]. A significant number of papers [1–3,6,9] are devoted to the study of Split-Join processes. In [3], an exact solution was obtained for the maximum service time of independent channels with exponential service times and different intensities, as well as an approximation for distributions of a general form. In [2], the above distribution was obtained for homogeneous and heterogeneous servers, and its representation in matrix-exponential form allowed both the first and the higher-order moments to be found. It should be noted that this method has high computational complexity, a consequence of its labor-intensive matrix inversion and Kronecker product operations. Thus, the development of effective methods for analyzing multichannel QS that take Split-Join processes into account is an urgent task.

3 Formulation of the Problem

Consider the system for complex classification of streams of multimedia objects as a multichannel QS of the M/G/n type (Fig. 1), for which the following are given:

• $\lambda$ – the intensity of the incoming Poisson flow of requests (packets of media objects);
• $n$ – the number of service devices (cloud service instances classifying media content packages).

Service in each device is carried out according to the Split-Join scheme: the initial request is divided into three components (video, audio, text), each of which is processed in parallel, and the results of processing are aggregated. The processing times of subtasks on different devices are generally different and are characterized by the corresponding distributions, given by sets of initial moments $B_i^V = \{b_{i,m}^V\}$, $B_i^A = \{b_{i,m}^A\}$, $B_i^T = \{b_{i,m}^T\}$, $i = \overline{1,n}$, where $m$ is the order of the moment. It is required to find:

1. $\{p_j\}$, $j = 0, 1, 2, \ldots$ – the stationary distribution of the query count in the QS;
2. $\{v_m\}$, $m = \overline{1,s}$ – the initial moments of the m-th order of the distribution of the time a request spends in the system, where $s$ is the required number of moments.

4 Method Description

The solution of the problem is based on the following sequence of stages:

1. Calculation of the initial moments of the distribution of the service duration of a query (task) in the i-th channel, taking into account its division into N subtasks with subsequent merging of the results. The sought moments $\{b_{i,m}\}$, $i = \overline{1,n}$, $m = 1, 2, \ldots$ are expressed through the additional cumulative distribution function (ACDF) of the maximum subtask servicing time [4]:

$b_{i,m} = m \int_0^\infty t^{m-1} \bar{F}_i(t)\,dt, \quad m = 1, 2, \ldots$  (1)

where $\bar{F}_i(t)$ is the ACDF of the maximum service time of a task in the Split-Join server. Integral (1) may not have an analytic solution, so the Chebyshev–Laguerre numerical quadrature [5] is applied:

$\int_0^\infty x^s e^{-x} f(x)\,dx \approx \sum_{k=1}^{l} A_k f(x_k).$

Special tables of the nodes $\{x_k\}$ and weights $\{A_k\}$ for different $s$ and $l$ exist [5]. Substituting $x$ with $t$ and representing $f(t)$ as $\bar{F}(t)e^{t}$ (with $s = m - 1$ absorbing the factor $t^{m-1}$), Eq. (1) for $b_{i,m}$ can be written as

$b_{i,m} = m \sum_{k=1}^{l} A_k \bar{F}_i(t_k) e^{t_k}, \quad m = 1, 2, \ldots$  (2)

A numerical sketch of this stage is given after the list.

2. Calculation of the initial moments $\{b_m\}$, $m = 1, 2, \ldots$ of the service time distribution $B(t)$ in the system (in any server). Since the distributions $\{B_i(t)\}$ of query service in the i-th server may differ, the distribution $B(t)$ is a mixture of the distributions $B_i(t)$, $i = \overline{1,n}$, and its initial moments of the m-th order can be calculated as

$b_m = \sum_{i=1}^{n} u_i b_{i,m}.$  (3)

Here $u_i$ is the probability that the next query occupies the i-th server, which characterizes the work of the load balancer. It can be set manually, but in the general case it corresponds to the mean share of queries served by the i-th server and is proportional to its service intensity $\mu_i$:

$u_i = \frac{\mu_i}{\sum_{i=1}^{n} \mu_i}.$

3. Approximation of the found service time distribution $B(t)$ by the method of moments with phase-type distributions (for example, hyperexponential) and calculation of the QS characteristics by the methods of [8].

5 Numerical Experiment

The QS was calculated for the following initial data: number of servers n = 4; intensity of the arriving Poisson flow λ ∈ {0.25, 0.5, 0.75, 1.0}. The initial moments of the subtask service time distributions and the computed moments for tasks on each server are presented in Table 1. The initial moments of the task service time distribution in the QS were calculated according to (3), yielding b1 = 3.124, b2 = 30.261, b3 = 490.648. A phase approximation was then performed with a second-order hyperexponential distribution, and the system characteristics were calculated by the iterative Takahashi–Takami method [8]. The stationary distributions of the query count in the QS are shown in Fig. 2, and the computed probability-time characteristics are presented in Table 2.

Table 1. Initial moments of the subtask and task service time distributions for the i-th server

i | b^V_{i,1} | b^V_{i,2} | b^V_{i,3} | b^A_{i,1} | b^A_{i,2} | b^A_{i,3} | b^T_{i,1} | b^T_{i,2} | b^T_{i,3} | b_{i,1} | b_{i,2} | b_{i,3}
1 | 2.000 | 18.440 | 303.154 | 1.200 | 9.734  | 146.250 | 0.500 | 0.973 | 3.297 | 2.979 | 27.601 | 428.281
2 | 2.200 | 22.312 | 403.497 | 1.400 | 13.250 | 232.240 | 0.650 | 1.644 | 7.243 | 3.391 | 34.833 | 590.746
3 | 1.800 | 14.936 | 220.999 | 1.100 | 8.180  | 112.650 | 0.450 | 0.788 | 2.403 | 2.698 | 22.737 | 323.398
4 | 2.300 | 24.387 | 461.059 | 1.500 | 15.210 | 285.644 | 0.700 | 1.906 | 9.046 | 3.582 | 38.617 | 681.952

Fig. 2. Stationary distributions of the query count in the QS for different λ

Table 2. Probability-time characteristics of the QS for different λ

λ    | ρ     | q     | w1    | w2     | v1    | v2
0.25 | 0.195 | 0.003 | 0.012 | 0.033  | 3.136 | 30.366
0.50 | 0.391 | 0.077 | 0.153 | 0.585  | 3.278 | 31.804
0.75 | 0.586 | 0.552 | 0.736 | 4.138  | 3.860 | 38.994
1.00 | 0.781 | 2.360 | 2.360 | 20.594 | 5.484 | 65.597

6 Further Research

The presented approach makes it possible to estimate the efficiency of detecting contamination content both for various software and hardware platforms and under different operating conditions. Further research is planned in the following directions:

• development of an experimental stand and the accumulation of statistical data on the complexity of the operations of processing media content packages;
• development and analysis of a queuing network model with Split-Join nodes representing the process of multi-stage processing of media content packages.

Acknowledgements. The study was carried out with the financial support of the Russian Foundation for Basic Research, project No. 18-29-22064\18.

References

1. Alomari, F., Menasce, D.A.: Efficient response time approximation for multiclass fork and join queues in open and closed queueing networks. IEEE Trans. Parallel Distrib. Syst. 25, 1437–1446 (2014)
2. Fiorini, P., Lipsky, L.: Exact analysis of some split-merge queues. Perform. Eval. Rev. 43(2), 51–53 (2015)
3. Harrison, P.G., Zertal, S.: Queueing models with maxima of service times. In: Computer Performance Evaluation. Modelling Techniques and Tools, pp. 152–168 (2003)
4. Khabarov, R.S., Lokhvitckii, V.A.: Model' otsenivaniya operativnosti mnogopotochnoi obrabotki zadach v raspredelennoi vychislitel'noi srede s uchetom protsessov Split-Join [Efficiency evaluating model of multi-threaded task processing in a distributed computing environment with Split-Join processes]. Vestnik of Russian New University. Series "Complex Systems: Models, Analysis, Management", vol. 1, pp. 26–35 (2019). (in Russian)
5. Krylov, V.I., Shul'gina, L.T.: Spravochnaya kniga po chislennomu integrirovaniyu [Handbook of Numerical Integration], 372 p. Nauka, Moscow (1966). (in Russian)
6. Qiu, Z., Perez, J.G., Harrison, P.G.: Beyond the mean in fork-join queues: efficient approximation for response-time tails. Perform. Eval. 91, 99–106 (2015)
7. Roskomnadzor. http://eais.rkn.gov.ru. Accessed 08 Apr 2019
8. Ryzhikov, Yu.I.: Algoritmicheskiy podkhod k zadacham massovogo obsluzhivaniya [An Algorithmic Approach to Queuing Problems], 496 p. Mozhaysky MSA, St. Petersburg (2013). (in Russian)
9. Wang, P., Li, J., Shen, Z., Zhou, Y.: Approximations and bounds for (n, k) fork-join queues: a linear transformation approach. Computer Network Information Center, Chinese Academy of Sciences, China (2017)

On the Applicability of the Modernized Method of Latent-Semantic Analysis to Identify Negative Content in Multimedia Objects

Sergey Krasnov1,2(B), Vladimir Lokhvitckii1, and Andry Dudkin1

1 Mozhaysky MSA, 13 Zhdanovskaya Street, Saint-Petersburg 197198, Russia
lokhv [email protected], [email protected]
2 Saint Petersburg State Electrotechnical University, Saint-Petersburg, Russia
[email protected]

Abstract. The possibility of applying a modernized method of latent semantic analysis (MLSA) to identify negative content in multimedia objects of the web space is considered. A method for analyzing the dynamics of changes in the singular values of the original matrix with automatic selection of the range of used rank values is described. Positive results of applying the MLSA method are shown in various areas where semantic analysis of information is required.

Keywords: Latent-semantic analysis · Semantic analysis of information · Negative content · Contamination information resources

1 Introduction

Currently, information is considered a strategic national resource, and its volume is constantly increasing. One important and integral part of the information support of various organizations and users of Internet resources is the process of searching, collecting and analyzing information obtained from the web space. At the same time, the scale of distribution and turnover of negative (undesirable) content (information that has a destructive effect on the human psyche and (or) the public consciousness) has increased dramatically. Key authoritative works of American and Russian scholars [2] in the field of sociology identify the following types of negative content: media violence, sexually explicit media content, frightening media production, and many other types. The task of identifying negative content is quite important and common, and the basis of its solution is the theory of data mining [9]. Various applied areas and approaches have been proposed for this and similar problems of intellectual processing. In [8], methods are proposed for identifying duplicate and conflicting information on Wikipedia. In [3], an ontological approach is used to eliminate conflicts at the level of semantics. In [1], a method is


proposed for determining semantic proximity when comparing texts in Russian, together with a review of existing comparison methods; it describes a method for determining the degree of similarity between text passages within a semantic class. To achieve our goal, we need a comprehensive application of original and adapted methods of semantic information analysis.

2 Description of the Approach

This work was carried out within the framework of solving the complex scientific problem of intellectual detection of contaminating information resources of the Internet. The authors have proposed a generalized diagram of the process of intellectual analysis of the flow of multimedia objects, which includes a complex classification subsystem based on the modernized LSA method. The modernization of the LSA method consists in using the developed method of analyzing the dynamics of change of the singular values of the original "package-object" matrix with automatic selection of the range of used rank values. The initial information for LSA is the "package-object" matrix, which describes the data set used to train the system; the elements of this matrix contain the frequency of use of each package in each object. One of the most common variants of LSA is based on the singular-value decomposition (SVD) of the original matrix. Using SVD, a large source matrix is decomposed into a set of $k$ (usually from 70 to 200) orthogonal components whose linear combination is a good approximation of the original matrix. According to the singular decomposition theorem, any real rectangular matrix $X$ can be decomposed into a product of three matrices $X = U \Sigma V^T$, where the matrices $U$ and $V$ are orthogonal and $\Sigma$ is a diagonal matrix whose diagonal values are called the singular values of the matrix $X$. The peculiarity of this expansion is that if we keep in $\Sigma$ only the $k$ largest singular values, and in the matrices $U$ and $V$ only the columns corresponding to these values, then the product of the resulting matrices $U_{lsa}$, $\Sigma_{lsa}$ and $V_{lsa}$ is the best rank-$k$ approximation of the original matrix $X$. The idea of such a decomposition $X \approx \tilde{X} = U_{lsa} \Sigma_{lsa} V_{lsa}^T$, and the essence of latent semantic analysis, is that if $X$ is the matrix of multimedia objects, then the matrix $\tilde{X}$, containing only the $k$ first linearly independent components of $X$, reflects the basic structure of associative dependencies present in the original matrix and at the same time contains no noise. Thus, each package of a multimedia object is represented by a vector in a common space of dimension $k$ (the so-called hypothesis space), and the proximity between any combination of multimedia packages can easily be calculated using the scalar product of vectors.
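For illustration, the singular values needed for such a truncation can be obtained, for example, with the gonum library; the matrix below is a toy example, not data from the study.

package main

import (
	"fmt"

	"gonum.org/v1/gonum/mat"
)

func main() {
	// A toy 4x3 "package-object" frequency matrix (values are illustrative).
	x := mat.NewDense(4, 3, []float64{
		2, 0, 1,
		1, 3, 0,
		0, 1, 4,
		2, 1, 1,
	})
	var svd mat.SVD
	if ok := svd.Factorize(x, mat.SVDThin); !ok {
		panic("SVD did not converge")
	}
	// Singular values in decreasing order; keeping the k largest (with the
	// matching columns of U and V) gives the rank-k approximation used by LSA.
	fmt.Println("singular values:", svd.Values(nil))
}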

2.1 The Method of Analyzing the Dynamics of Change in the Singular Numbers of the Original Matrix with Automatic Selection of the Range of Used Rank Values

At the first step, it is necessary to create the "package-vector" matrix that describes the analyzed documents and contains the initial data for the LSA method. Its elements contain the packet weights obtained after applying the statistical measure tf-idf (the ratio of the packet frequency to the inverse object frequency) and vector normalization:

$d_{ij} = \frac{w_{ij}}{|w_{ij}|}, \quad w_{ij} = tf_{ij} \times \log \frac{|D|}{df_j}, \quad tf_{ij} = \frac{n_j}{\sum_k n_k},$

where $d_{ij}$ – the normalized tf-idf weights (packet frequency – inverse object frequency), $0 \le d_{ij} \le 1$; $|D|$ – the number of objects analyzed; $n_j$ – the number of uses of the packet in the object; $\sum_k n_k$ – the total number of packets contained in the object; $tf_{ij}$ – the frequency of occurrence of the packet in the object (the number of times the j-th packet occurred in the i-th object); $df_j$ – the object frequency (the number of objects in which the j-th packet occurred); $|w_{ij}|$ – the Euclidean norm of the vector $w_{ij}$.

In tf-idf, packets with a high frequency within an object and a low frequency of occurrence in other objects receive the greatest weight. The normalization operation is performed before calculating $\lambda$, the degree of compliance of objects. Next, a singular decomposition of the matrix $A$ into the product of three matrices $A = U W V^T$ is performed, where $U$ and $V$ are unitary matrices consisting of the left and right singular vectors, and $W$ is a matrix with non-negative elements on the diagonal, called the singular values of the matrix $A$. According to the Eckart–Young theorem, if only $\sigma$ singular values are kept in the matrix $W$, and in the matrices $U$ and $V$ only the columns corresponding to these values, then the matrices $U_\sigma$ and $V_\sigma$ are their best approximations, reflecting the associative dependencies of the representation of packages and objects in the space of dimension $\sigma$ [4,5,7].

The next step is the optimal choice of the non-zero singular values $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_n > 0$ of the matrix $A$, which affect the result of identifying negative content. Reducing the matrix $A$ to ranks whose singular values are close to zero yields nearly identical matrices; accounting for them increases the computational complexity of the LSA method and thereby reduces the efficiency of identifying negative content. To solve the problem of optimal selection of singular values, the heuristic method of selecting significant ranks described below is proposed; it is the essence of the method of analyzing the dynamics of change of the singular values of the original "package-object" matrix with automatic selection of the range of used rank values.

Define the function $f_i = \sigma_i$, $i \in N$, $i < P$, where $P$ is the number of documents. Only the ranks $r_p, r_{p+1}, \ldots, r_{m-1}, r_m$, $p \le m$, enclosed between the corresponding singular values $\sigma_p \ge \sigma_m \ge 0$ and undergoing a sharp change $\Delta\sigma_i = \sigma_i - \sigma_{i-1}$, $i \in \{p; m\}$, relative to the previous singular values ($\sigma_i \ge \sigma_p$, $i \le p$; $\sigma_i \ge \sigma_m$, $i \le m$) are significant. The boundaries of the significant ranks are determined using the derivative of the function of singular values, $f' = \sigma_i - \sigma_{i-1}$, $i \in N$, $i < n$. Next, we search for the maximum value of the derivative $f'_{max}$, reached at $\sigma_{max}$, $1 < max \le n/2$. Then the first local minimum $\sigma_p$ next to $\sigma_{max}$ is determined. The ranks $r_1, r_2, \ldots, r_{p-1}, r_p$ corresponding to singular values greater than $\sigma_p$ are recognized as insignificant. As the right border of the significant ranks, the rank $r_n$ corresponding to the last nonzero singular value $\sigma_n$ is selected.

At the final stage, it is necessary to calculate $\lambda$ for the documents using the cosine measure of proximity (KMB):

$\cos(X_j, X_i) = \sum_{l=1}^{M} x_j^{(l)} x_i^{(l)},$

where $x_j^{(l)}$, $x_i^{(l)}$ – the elements of the two vectors between which the proximity measure is calculated; $M$ – the dimension of the vector space. The KMB values are limited to the interval $[-1; 1]$ when the normalization operation is used. The degree of compliance $\lambda_{j,i}^l$ of the vectors $X_j$, $X_i$ ($i < j \le P$) is calculated for each significant rank $r_l$, $p \le l \le m$. Then the resulting value for the vectors $X_j$, $X_i$ is calculated as

$\bar{\lambda}_{j,i} = \frac{\sum_{l=p}^{m} \lambda_{j,i}^l}{m - p + 1}.$

Thus, we obtain the resulting degree of compliance for a particular pair of objects over the entire significant range of rank values. The obtained value should be compared, for example, with a threshold value for automatic or automated decision-making on the elimination or isolation of negative content on the global Internet. Note that automatically determining the range of used rank values makes it possible to guarantee with greater accuracy that the data are really negative, because the values of $\lambda$ for all pairs of vectors are evaluated at each significant rank. At the same time, random bursts of the obtained values at incorrect rank values are smoothed out, and the values for clearly negative data tend to unity, since the arithmetic mean is taken over all values of one pair of vectors obtained from the range of used rank values.

3 Numerical Experiments

The research results showed that the use of the LSA method in the tasks of searching, categorizing, and identifying duplicate information makes it possible to effectively identify semantic relationships between terms. This increases the automation of eliminating conflicts and redundancy of information and improves the accuracy of information retrieval and rubrication. Consider the distribution curve of the number of pairs of documents that exceed a certain value of the degree of proximity (Fig. 1).

Fig. 1. The distribution of the number of pairs of objects that exceed a certain value of the degree of proximity

This curve has an exponential form, which makes it possible to significantly narrow the search area for negative content automatically: from 1,730 pairs of objects to 12 pairs at a proximity threshold of ϕ = 0.85. All 12 pairs of objects found are negative. Automating and making continuous the processes of information analysis based on intelligent processing methods (lexical, syntactic, and semantic analysis), including the proposed MLSA method, will improve the timeliness and accuracy of detecting negative Internet content [4–7].

4 Conclusion

As a result of the study, it can be expected that the proposed approach based on the MLSA method will be effective for the intelligent detection of negative content on the global Internet. Applying the proposed approach will make it possible to identify contaminating Internet information resources in a timely and efficient manner, as well as to adapt to the dynamically changing requirements for multimedia content containing information that is prohibited in the Russian Federation.

Acknowledgements. The study was carried out with the financial support of the Russian Foundation for Basic Research, project No. 18-29-22064\18.

References

1. Bermudes, S.H.G., Kerimova, S.U.: O metode opredeleniy tekstovoi blizosti, osnovannom na semanticheskix klassax [About the method of determination of text closeness based on semantic classes]. Electronic Science Journal Engineering Herald of the Don 4(43) (2016). http://www.ivdon.ru/ru/magazine/archive/n4y2016/3832. (in Russian)
2. Bryant, J., Thompson, S.: Osnovi vozdeistviy SMI [Basics of the Impact of the Media]. Translated from English. Williams, Moscow, p. 432 (2004). (in Russian)
3. Jotsov, V.S., Sgurev, V.S., Yusupov, R.M., Khomonenko, A.D.: Ontologii dly razresheniy semanticheskix konfliktov [The ontology for semantic conflicts resolution]. In: Proceedings of SPIIRAS, vol. 7, pp. 26–40 (2008). (in Russian)
4. Khomonenko, A.D., Logashev, S.V., Krasnov, S.A.: Avtomaticheskay rubrikaciy dokumentov s pomoch'y LSA i algoritma nechetkogo vivoda Mamdani [Automatic rubrication of documents using latent-semantic analysis and the Mamdani fuzzy inference algorithm]. In: Proceedings of SPIIRAS, vol. 44, pp. 5–19 (2015). (in Russian)
5. Khomonenko, A.D., Krasnov, S.A.: Primenenie metodov latentno-semanticheskogo analiza dly avtomaticheskoi rubrikacii dokumentov [Application of latent-semantic analysis methods for automatic document categorization]. In: Proceedings of the Petersburg Transport University, vol. 31, pp. 124–132 (2012). (in Russian)
6. Krasnov, S.A., Ilatovsky, A.S., Khomonenko, A.D., Arsenyev, V.N.: Ocenka semanticheskoi blizosti dokumentov na osnove latentno-semanticheskogo analiza s avtomaticheskim viborom rangovix znachenii [Estimation of semantic proximity of documents based on latent-semantic analysis with automatic selection of rank values]. In: Proceedings of SPIIRAS, vol. 54, pp. 185–204 (2017). (in Russian)
7. Krasnov, S.A., Khomonenko, A.D., Dashonok, V.L.: Viyvlenie protivorechii blizkoi informacii na osnove latentno-semanticheskogo analiza [Identification of contradictions in semantically close information based on latent-semantic analysis]. In: Collected Scientific Works of the St. Petersburg State Polytechnical University, vol. 2, pp. 73–84 (2014). (in Russian)
8. Weissman, S., Ayhan, S., Bradley, J., Lin, J.: Identifying duplicate and contradictory information in Wikipedia. In: Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, pp. 57–60 (2015)
9. Witten, I.H., Frank, E., Hall, M.A.: Data Mining: Practical Machine Learning Tools and Techniques, 3rd edn., p. 664. Morgan Kaufmann (2011)

Advanced Methods for Social Network Analysis and Inappropriate Content Counteraction

Digital Subjects as New Power Actors: A Critical View on Political, Media-, and Digital Spaces Intersection

Dmitry Gavra1, Vladislav Dekalov1, and Ksenia Naumenko1,2(B)

1 School of Journalism and Mass Communications, SPbSU, VO, 1 Line, 26, St. Petersburg, Russia
[email protected], [email protected]
2 GloryStory Communication Agency, Moika embankment, 69, St. Petersburg, Russia
[email protected]

Abstract. The paper deals with the transformed power relations that have appeared as a result of the intersection of political, media, and digital spaces. The authors apply the methodology of critical digital media theory to conceptualize digital subjects in the context of the attention economy and communicative capitalization. These subjects have become influential power actors capable of competing with media for their audiences and with politicians for their publics. Conversely, media-subjects and political actors adopt new practices, trying to attract Internet users' attention to a certain digital segment and stimulate the communicative labour of the retained audience. In the paper, the authors analyze the relations between all these actors and consider the complex power configuration in the area where all three spaces intersect.

Keywords: Critical internet studies · Communicative capitalism · Digital subjects

1 Introduction

In the "digital age", various spaces (political, media, and digital) intersect, which complicates the relationships between the power actors of each space and gives them additional dimensions. Power configurations in all these spaces are intertwined, so we can problematize media-political relations in a new way, along with the politicization of the Internet and its medialization. The first part of the paper deals with the digital space, Internet users' attention as a new scarce resource, and the new power actors that concentrate this resource on their associated digital segments. In the second part, we analyze the new opportunities received by the subjects of the intersecting political, media-, and digital spaces, and how these opportunities complicate power relations. In the third part, we present a methodological tool describing four situations in a digitalized society, contingent upon the power balance between digital and media-political actors.

2 Theoretical Framework

To analyze the Internet as a space that not only liberates but also "enslaves", researchers use the critical media theory approach. It considers the relationships between the main subjects of the digital space, applying methods of political economy, sociology, cultural studies, and communication studies. The following concepts can be referred to the digital political media theories within the critical paradigm: "communication power" (late works of Castells [6]), "creative class" (Florida [16]), "precariat" (Standing [28]), "the new spirit of capitalism" (Boltanski and Chiapello [4]), "network culture" (Terranova [30]), "general intellect" (Hardt and Negri [21], Virno [34], Dyer-Witheford [13]), "digital labour" (Fuchs and Sevignani [19]), "cognitariat" and "semiocapitalism" (Berardi [2]), "cognitive capitalism" (Boutang [5]), and "communicative capitalism" (Dean [7–9], Gavra and Dekalov [11,20]).

3 Conceptual Development

3.1 Attention as a New Scarce Resource of the Digital Space

Attention is the main resource that constitutes power relations on the Internet. The space we call "digital" consists of users: individuals connected to IT networks through the digital mediation of electronic interfaces. Users consume and produce (i.e., prosume) digital products: texts, images, videos, etc. Some of these users are professional consumers (profi-sumers), or digital subjects, and there is competition among them for the attention of other Internet users [10]. Every digital subject who wants to be successful strives to attract other users' attention to his or her digital segment. The digital audience rewards the most exciting content on different digital segments with its attention, thereby increasing its symbolic use value. In turn, the totality of all these "rewards" in the same segment also has a value of another type, an exchange value. A segment that has attracted attention is valuable to business (e.g., to advertisers, investors, or sponsors) and to political subjects.

Digital subjects have not only to attract the attention of Internet users but also to attach them to some kind of rituals and convince those users to trust the digital content they prosume. In this case, the labour power of retained users can easily be "exploited" in the process of amassing communicative capital. A digital subject extracts surplus value from the difference between the costs of maintaining a high-quality digital segment and the income from selling the retained users' communicative product. The money can be reinvested by the subject in the development of the digital segment (thus increasing communicative capital) or converted into other forms of capital, e.g., social, cultural, or political.

We claim that communicative capital is the most important component of the power exercised by a digital subject. But s/he has to fulfill different functions and influence different objects to perpetuate a position in the stratification of the digital space. There is a temptation to categorize different power actors according to the functions they perform, but it is more correct to consider the different roles an actor is able to play (see Table 1).

Table 1. Different roles power actors play in the digital space

Role | Object | Description of functions
Traffic monopolist | Attention | An actor attracts and retains Internet users' attention on his or her digital segment(s), creating and maintaining a digital product that is positioned as professional and is distinguished from other digital products
Network elite | Attachment | An actor creates and distributes values, rituals, and behavioral patterns as rules to share and follow
Digital brand | Trust | An actor reinforces his or her symbolic status, consolidates "property rights" (to the digital product created on the digital segment or segments), and secures his or her recognition
Digital capitalist | Labour power | An actor exploits the communicative labour of his or her digital segment's audience, extracting surplus value and converting it into money accumulated as communicative capital (which can be further invested or reinvested)

Thereby, power in the digital space is the power to attract and retain attention, to attach users to certain symbolic structures, to inspire trust in them, and to effectively stimulate unpaid communicative labour on the digital segment. The more of these functions a subject fulfills, the more powerful in the digital space s/he is. This power also facilitates converting communicative capital into other forms of capital and using it in other spaces (e.g., the political one). It is important to note that, unlike some researchers (for example, Fuchs [18] or Faucher [14]), we claim that digital power actors are not only global advertising platforms such as Facebook, Twitter, or Instagram. We acknowledge that some micro-level actors (for example, micro-celebrities) also have digital power over their audiences, limited by territory, generation, interests, etc.

3.2 The Intersections of Political Space, Media-Space, and Digital Space

When we speak of political space, we consider the relationship between political actors and political publics. Despite the real political power of institutionalized political actors (such as parties, movements, or establishment groups), their positions are constantly jeopardized by the publics' mistrust and self-organization.


In turn, the media-subjects are the mass media organizations (PR and journalistic ones): media outlets, channels, distributors, and other media forms. All these subjects connect information sources with the audience in media-space. Both media- and political spaces are enriched not only by intrinsic dynamics and relationship reconfiguration but also by external sources. Hence, digital subjects with sufficient resources can convert them into capital and use it to compete with the institutionalized actors of the respective space. Thus, the well-known Russian blogger and oppositionist Alexei Navalny is an influential digital actor, but his position in the official Russian political space is controversial, and his party has not received registration yet [32]. A number of Telegram channels fulfill an information function and trade in "insider" information; however, they are not media and are therefore excluded from Russian media legislation [33]. No less interesting are the reverse processes, when media and political actors enter the digital space and strive to compete with digital actors for Internet audiences' attention, attachment, trust, and communicative labour power. Thereby, at the intersections of the designated spaces we distinguish:

1. The area of political and media- spaces intersection.
2. The area of digital and media- spaces intersection.
3. The area of digital and political spaces intersection.
4. The area of political, media-, and digital spaces intersection.

Political and Media- Spaces Intersection. In this intersection, we consider media-political power actors: actors that have arisen as a result of the merging of media and political institutions (i.e., mediacracy [3]). They are large media outlets affiliated with the political process and, at the same time, political figures associated with those media. Media processes overlap with processes occurring at other intersections, and as a result the complexity of interactions between different actors keeps increasing. Simultaneously, the "media spectacle" is moving to a new, digital level [23]. These processes mainly occur at the other intersections of our scheme, but media-political subjects, as influential power actors, acknowledge these new opportunities.

Digital and Political Spaces Intersection. The Internet as a social space acquires a political dimension. Internet users need not only entertainment but also information laden with social and political meanings. Politicization, on the one hand, creates a demand for a digital elite producing an immaterial product with a high use value. On the other hand, it entices non-political opinion leaders to speak out on political topics. It is worthwhile to consider the intersections of role structures, where each subject can have two dimensions, each belonging to a respective space (see Table 2).

Political Publics—Users. Social networks are politicized as a result of the undermining of trust in traditional political institutions [15]. Users simultaneously act as members of the political public and as prosumers in the entertainment market.


Table 2. Two-dimensional roles of subjects operating in political and digital spaces

Subjects | Political publics | Political actors
Users | Politicizing of the Internet users | Appropriating new practices and tools by political actors
Digital power actors | Emerging of new independent political actors | Increasing political competition

Political Actors—Users. Political actors seeking to gain power are involved in the competition for Internet users' attention. They need to convince users and build trust with them. Political actors also attempt to control the digital space, collect information, and bring into the legal field a number of actions that are de facto considered illegal, for example, the dissemination of extremist materials. Sometimes these practices contradict each other (for example: the US government vs. Apple [24], the Russian government vs. Telegram [31]).

Digital Power Actors—Political Publics. Political publics going online find alternatives to established political actors in digital opinion leaders. These actors, having accumulated communicative capital, are able to invest it in further political campaigning. An example is the self-financing of Ksenia Sobchak (journalist and celebrity), who was an opposition candidate in the 2018 Russian presidential election [29].

Digital Power Actors—Political Actors. Individual political actors (both institutionalized and not) try to become traffic monopolists and part of the digital elite. They also create digital brands. They apply their political expertise to attract additional attention, stay in touch with the potential electorate, and reinforce their reputation. Acting online can increase the chances of gaining power rent. So it was, for example, with Donald Trump, whose Twitter account played a significant role in his victory in the US presidential election, becoming a "symbolic and communicative battlefield" [17].

Digital and Media- Spaces Intersection. The media-subject (media) and the media audience can act as external factors in relation to the digital space (and vice versa). Thus, the audience of traditional media can change its media consumption under conditions of information redundancy, and users, in turn, expand the media audience. The media usually operate in the digital space, and digital subjects (bloggers, informational communities) may have journalistic intentions. Below, we consider the intersections of the digital space and media-space and the possible transformations of the subjects in both of them (see Table 3).

Users—Audience. Users simultaneously have needs for entertainment and social participation on the one hand and the need to receive information on the other [1]. Digital subjects perform both an informational and an entertainment function for young users and compete with traditional and online media.


Table 3. Two-dimensional roles of subjects operating in media- and digital spaces

Subjects | Audience | Media
Users | Emerging of new infotainment sources | Expanding of media audiences
Digital power actors | Strengthening of alternative opinion leaders | Increasing competition between media and quasi-media

Users—Media. The media are involved in the competition for the attention of Internet users. They have to apply practices of attracting and retaining users' attention, and these practices have to be technically homogeneous with digital affordances. A media-subject also has to reinforce its reputation and symbolic status. Finally, the media are able to obtain additional financing through new advertising opportunities. As an example, consider the communicative practices of some Russian cultural journalists [12] reinforcing their symbolic statuses as members of the digital elite.

Digital Power Actors—Audience. Digital subjects can serve as an alternative source of information. The media audience "migrates" to the digital environment and changes its habits of information consumption; its attention is attracted by many sources. Thereby, media consumers can become attached to the products of non-institutionalized sources, which supplement the informational space [25]. Also, an Internet user can become disappointed in an old informational source and start trusting new opinion leaders available online, who are more authentic and less official.

Digital Power Actors—Media. Digital subjects with substantial influence on their audiences are a vital alternative to the traditional media present in the digital space. These subjects can also migrate to traditional media (for example, as political commentators, columnists, and so on) in search of media institutionalization. In turn, the media go digital in order not to lose popularity "in the clickbait age" [26]. Journalists become part of the digital elite, transferring their journalistic expertise to non-specialized digital segments. A vivid example is the YouTube blog "vDud", the project of Yuri Dud, editor-in-chief of the Russian Internet media outlet sports.ru [22].

Political, Media-, and Digital Spaces Intersection. The intersection of the three spaces, each with its own power relationships, generates very complex configurations and ambiguous constellations of subjects. The media and journalists interact with the audience. Political actors apply power practices to the political public. Internet users are influenced, and moreover managed, by the winners of the competition for their attention, attachment, trust, and labour power. Media operating within the digital space should, on the one hand, take into account the political preferences of their potential audience and, on the other, compete for its attention on a par with other forms appropriating media functions (e.g., bloggers, news communities, independent and anonymous channels in various messengers). Political actors, if they want to be successful digitally, have to be not only media-persons but digital ones who are simultaneously traffic monopolists, representatives of digital elites, digital brands, and communicative capitalists.

Besides, we can talk about reverse penetration, when digital power actors with significant influence on a digitalized audience establish themselves in the media or political spaces, or in both at the same time. In the first case, we can talk about the media institutionalization of digital actors (for example, when a successful blogger becomes a journalist); in the second, about political participation and an alternative way of gaining political influence for forces that were previously excluded from the official mainstream agenda.

3.3 The Matrix of Digital and Media-Political Power Actors Balance

In the digitalized media-political area, the interests of media conglomerates and established political groups, which in this paper we call merged media-political power actors, can come into conflict with the interests of digital power actors. In fact, we can talk about the inclusion of non-institutionalized political and media-persons in the struggle for power. Being influential digital subjects with sufficient communicative capital and a loyal audience, they have enough resources to participate in the struggle for media and political rents within the mediacracy. In turn, we can also witness media-political actors appropriating practices of digital power, such as attention attraction, the dissemination of values and rituals, work with digital reputation, and communicative exploitation. To analyze this situation, we suggest a methodological tool: the matrix of power balance at the intersection of the political, media-, and digital spaces (see Fig. 1).

The Status Quo position fixes the intermediate results of the interaction of media-political and digital power actors. Both types of power actors have political subjectivity in certain fields. They have relatively equal resources to implement independent strategies. So, if the interests of media-political and digital power actors confront each other, these subjects need to cooperate or compete. But in both cases, there is a risk of weakening one or the other side and violating the power balance. The "Cooperation" scenario leads to Media-Political Power, while the "Competition" scenario precipitates Digital Power.

Meanwhile, the White Spots position describes a situation of uncertainty. Digital power actors have not concentrated enough of the audience's attention, while political actors and media conglomerates have not been digitalized. These subjects cannot mobilize the Internet audience. This quadrant describes a situation of relative freedom and perfect competition on the Internet, or a phase in which the information society is taking shape and communicative capitalism is becoming established. The positions which in the matrix we call Digital Power and Media-Political Power illustrate asymmetric situations.


Digital Power is a position in which digital actors (at the global level, owners of popular platforms; at the local level, online communities, movements, and digital elites) not only have sufficient power to participate in real politics but also influence political processes, confronting official political communication. Media-Political Power is a position in which the government, with the help of popular online platforms, frames Internet users at the discourse level and regulates the digital space at the legislative level, thereby eliminating all potentially destabilizing factors.

4 Discussion

By applying the matrix, we will be able to analyze and deconstruct macro-level power configurations that may be unclear or even deliberately obscured. Such a methodological tool will also allow us to map different countries depending on the dynamics of their power balance, as well as to cluster them. It is possible to derive several groups of empirically operationalized indicators capable of measuring the current national situation and forecasting the threats and opportunities arising from violations of the balance. Thus, we will be able to compare countries using data collected in this way.

Fig. 1. The intersections of political space, media-space, and digital space. New relationship between power actors from different spaces.

Finally, we can analyze both the potential of using communicative capital in political struggle and the efficiency of digital power practices in media-political promotion on the Internet. For a digital power actor, it is not enough to be a popular blogger with millions of subscribers; it is important to convert their attention into real political action. For a media-political actor, it is not enough to create a profile on Facebook; it is important to constantly work on attention-attraction techniques and algorithms, to elaborate rituals, behavioral patterns, and memes for the audience, to increase the trust of the digitalized public, and to use its potential to obtain additional funding.

5 Conclusion

In the paper, we presented a critical view of the intersections of political, media, and digital spaces and of the emergence of new power actors: digital ones. Traffic monopolist, digital elite, digital brand, and communicative capitalist are the potential roles a digital subject has to play to become a digital power actor. S/he gains power in the digital space by attracting the Internet audience, attaching it with values and rituals, increasing its trust, and finally converting the results of its communicative labour into resources, which can be used for further internal and external development. These digital power actors compete and cooperate with political and media actors in the respective spaces, while Internet users acquire the features of both political publics and media consumers. The intersection of the various spaces transforms and complicates power relations. As a result of their analysis, we suggested a matrix conceptualizing the new power balance between merged media-political and digital power actors in the area where all three spaces intersect. The different positions (quadrants) formed as a consequence of violations of this balance require different communicative strategies. This methodological tool has prognostic potential, and it also offers capabilities for analyzing the practices of both types of actors struggling for power in the contemporary, fragilely balanced digital-media-political environment.

Acknowledgements. The authors acknowledge Saint-Petersburg State University for the research grant 26520757.

References

1. A new generation of Internet users: a study of the habits and behavior of Russian youth online. https://www.thinkwithgoogle.com/intl/ru-ru/insights-trends/user-insights/novoe-pokolenie-internet-polzovatelei-issledovanie-privychek-i-povedeniia-rossiiskoi-molodezhi-onlain. Accessed 08 Aug 2018
2. Berardi, F.: Semiocapitalism and the Pathologies of the Post-Alpha Generation. Minor Compositions, London (2009)
3. Bodrunova, S.: Mediacracy: advanced approaches towards interpretation of the concept. Vestnik of Saint Petersburg State University. Philology, Orientalism, Journalism 3, 203–215 (2012)
4. Boltanski, L., Chiapello, E.: The New Spirit of Capitalism. New Literary Review, Moscow (2011)
5. Boutang, Y.M.: Cognitive Capitalism. Polity Press, Cambridge, Malden (2011)
6. Castells, M.: Communication Power. Publishing House of the Higher School of Economics, Moscow (2011)
7. Dean, J.: Blog Theory. Polity Press, Cambridge, Malden (2010)
8. Dean, J.: Communicative capitalism: circulation and the foreclosure of politics. Cult. Politics (2005). https://doi.org/10.2752/174321905778054845
9. Dean, J.: Crowds and Party. Verso, London, Brooklyn (2016)
10. Dekalov, V.: Attention as a basic resource for communicative capitalism. Russ. Sch. Public Relat. 10, 27–38 (2017)
11. Dekalov, V.: Communicative capital: conceptualization. Vestnik of Saint Petersburg State University. Sociology (2017). https://doi.org/10.21638/11701/spbu12.2017.402
12. Dekalov, V., Grigorieva, K., Uskova, D.: Cultural experts and communicative capitalism: transformation of communicative practices. Media Watch J. (2017). https://doi.org/10.15655/mw/2017/v8i3/49155
13. Dyer-Witheford, N.: Cyber-Marx. Cycles and Circuits of Struggle in High-technology Capitalism. University of Illinois Press, Champaign (1999)
14. Faucher, K.X.: Social Capital Online: Alienation and Accumulation (2017). https://doi.org/10.16997/book16
15. Fedorchenko, S.: Global study of social networks politicization. Sci. Anal. J. Observer 8, 57–67 (2016)
16. Florida, R.: The Rise of the Creative Class: And How It's Transforming Work, Leisure, Community, and Everyday Life. Publishing house "Classics XXI", Moscow (2007)
17. Fuchs, C.: Donald Trump: a critical theory-perspective on authoritarian capitalism. tripleC (2017). https://doi.org/10.31269/triplec.v15i1.835
18. Fuchs, C.: The Online Advertising Tax as the Foundation of a Public Sphere Internet. Communication Power (2018). https://doi.org/10.16997/book23
19. Fuchs, C., Sevignani, S.: What is digital labour? What is digital work? What's their difference? And why do these questions matter for understanding social media? tripleC (2013). https://doi.org/10.31269/triplec.v11i2.461
20. Gavra, D., Dekalov, V.: Concept of communicative capitalism: methodological premises and paradigmatic positioning. J. Sociol. Soc. Anthropol. (2018). https://doi.org/10.31119/jssa.2018.21.1.2
21. Hardt, M., Negri, A.: Empire. Praxis, Moscow (2004)
22. How Yuri Dud created one of the strongest personal brands in the media. In: RBC. https://www.rbc.ru/magazine/2017/06/592555cc9a7947d4c8585415. Accessed 08 Aug 2018
23. Kellner, D.: Preface: Guy Debord, Donald Trump, and the politics of the spectacle. In: Briziarelli, M., Armano, E. (eds.) The Spectacle 2.0. Reading Debord in the Context of Digital Capitalism, pp. 1–14. University of Westminster Press, London (2017)
24. Message to our customers. http://www.apple.com/customer-letter. Accessed 08 Aug 2018
25. Nine theses about the future of media. In: Forbes. http://www.forbes.ru/tehnologii/341887-devyat-tezisov-o-budushchem-media. Accessed 08 Aug 2018
26. Ross, A.: The fate of the critic in the clickbait age. In: New Yorker. https://www.newyorker.com/culture/cultural-comment/the-fate-of-the-critic-in-the-clickbait-age. Accessed 08 Aug 2018
27. Smith, T.G.: Politicizing Digital Space: Theory, the Internet, and Renewing Democracy (2018). https://doi.org/10.16997/book5
28. Standing, G.: The Precariat: The New Dangerous Class. Ad Marginem Press, Moscow (2014)
29. Sobchak on her campaign financing and advertising on Instagram. In: YouTube. https://www.youtube.com/watch?v=7Wvp0R269nY. Accessed 08 Aug 2018
30. Terranova, T.: Network Culture: Politics for the Information Age. Pluto Press, London (2001)
31. The court ruled to block Telegram in Russia. In: Meduza. https://meduza.io/news/2018/04/13/sud-postanovil-zablokirovat-telegram-v-rossii. Accessed 08 Aug 2018
32. The Ministry of Justice suspended the registration of Alexei Navalny's party. In: Vedomosti. https://www.vedomosti.ru/politics/articles/2018/07/18/775863-partii-navalnogo. Accessed 08 Aug 2018
33. Telegram channels: can one believe what anonymous people write online? In: New Izvestiya. https://newizv.ru/article/general/10-04-2018/telegram-kanaly-mozhno-li-verit-tomu-chto-pishut-anonimy-v-seti. Accessed 08 Aug 2018
34. Virno, P.: A Grammar of the Multitude: For an Analysis of Contemporary Forms of Life. Ad Marginem Press, Moscow (2001)

An Approach to Creating an Intelligent System for Detecting and Countering Inappropriate Information on the Internet

Lidiya Vitkova1(B), Igor Saenko1, and Olga Tushkanova1,2

1 SPIIRAS, 39, 14 Liniya, St. Petersburg 199178, Russia
{vitkova,ibsaen}@comsec.spb.ru
2 SPbPU, 29, Politekhnicheskaya St., St. Petersburg 195251, Russia
[email protected]

Abstract. Currently, the Internet is becoming one of the most dangerous threats to personal, public, and state information security. Therefore, the task of detecting and counteracting inappropriate information in digital network content is acquiring national importance. The paper offers a new approach to creating an intelligent system for detecting and counteracting inappropriate information on the Internet, based on machine learning methods and big data processing, and describes the architecture of such a system. An experimental evaluation of one of the most important system components, the component for multidimensional evaluation and categorization of information objects, in single-threaded and multi-threaded modes showed the high efficiency of various classifiers from the Python Scikit-learn and Spark MLlib libraries for solving this problem.

Keywords: Inappropriate information · Internet · Machine learning · Big data · Classifier

1 Introduction

The rapid spread of the Internet, including social networks, into the state, industrial, economic, and socio-cultural spheres of modern society is a powerful incentive for their further development. At the same time, the Internet is becoming one of the most serious threats to personal, public, and state information security. Therefore, protecting individuals, society, and the state from information that spreads through information and telecommunication networks and is capable of harming the health of citizens or motivating them to unlawful behavior is a problem of national importance, affecting the information security of the country.

Currently, this problem has few scientific and technical solutions. The known automated means of identifying and counteracting inappropriate information on the Internet do not meet the requirements for speed, completeness, accuracy, and adequacy of the decisions made. This is due to several reasons. First, web resources have a complex hierarchical structure and consist of many disparate elements. Second, it is necessary to process big data online. Third, the analysis of information on the Internet involves processing conflicting and changeable data, which requires methods for handling incomplete, contradictory, and fuzzy knowledge. These problems necessitate the development of a new class of systems for detecting and countering inappropriate information using artificial intelligence methods, in particular, methods for big data processing, classification, machine learning, and others. The paper is devoted to the description of the proposed approach to creating such a system.

The novelty of the work lies in developing an approach to creating a new technology for intelligent analytical processing of web content using big data processing, machine learning, and fuzzy computing. The theoretical contribution of the work lies in the proposed architecture of an intelligent system for detecting and countering inappropriate information on the Internet, the description of its purpose and the functions of its components, and the software implementation and experimental verification of the central component for multidimensional evaluation and categorization of web resources.

The paper has the following structure. Section 2 overviews related work. Section 3 discusses the main solutions for the architecture of the proposed system. Sections 4 and 5 are devoted to the implementation and experimental evaluation of the component for multidimensional assessment and categorization of web resources. Section 6 outlines the conclusions and further research directions.

2 Related Work

The works devoted to detecting and countering inappropriate information can be divided into two groups. The first group comprises works on the analysis of information diffusion and on the detection of fake news. Methods and techniques for assessing information diffusion in social networks are considered in [12]. This work proposes a roadmap for research on the diffusion of information, according to which the YouTube, Facebook, Flickr, and Twitter platforms became the generalized objects of research. The basic research establishing a definition of fake news is presented in [7,16]. Evaluating not only the source of fake news but also the recipient is proposed in [20]. An approach to detecting fake news through the evaluation of users' reactions in social networks is offered in [21]; it proposes that users themselves mark such news and send it to an expert. An important task in countering inappropriate information is the identification and assessment of the source, including the detection of fake likers [5,13]. With the help of "fake followers", the context of a message is distorted, popularity is artificially inflated, and comments are added. Examples of the use of such tools are election campaigns and network propaganda [3,4,6]. The analysis of news content as a way to counter inappropriate information is considered in [19], which proposes collecting data from many websites and comparing news reports.


The second group of related works is devoted to the implementation of systems for detecting and countering inappropriate information using big data processing and machine learning. The issues of creating distributed data collection and processing systems are well developed. For example, the tasks of data collection and preprocessing were studied in [11,22]. Questions of the classification and categorization of the textual content of information objects are considered in [2]. In addition to textual and graphical content, information objects have, for example, a page URL [17]. An approach that splits the source URL into separate tokens and analyzes the content and sequence of the individual tokens is offered in [9]. [14] suggests additionally using the length of the host name and of the entire URL, as well as the number of different characters in the URL string. The extracted URL tokens can be used in machine learning algorithms such as Naive Bayes, SVM, Logistic Regression, etc.

Algorithms and methods of machine learning focused on the automatic classification of textual content are partially described in [1,8,10,14,15,18]. In the classical classification problem, a training sample is a set of individual objects, each characterized by a feature vector. To solve the classification problem, it is required to construct an algorithm that "qualitatively" determines a class label or a vector of posterior probabilities of belonging to each of the classes [8]. The process of creating such a decision algorithm is called classifier training. Most often, the metrics of accuracy, precision, and recall, or their combination, the F-measure, are used to assess the quality of a trained classifier.
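To make the lexical URL features mentioned above concrete, here is a small sketch using only Python's standard library; the splitting pattern and feature names are our own illustrative choices, not those of [9] or [14].

import re
from urllib.parse import urlparse

def url_features(url: str) -> dict:
    """Lexical URL features in the spirit of [9,14]."""
    host = urlparse(url).netloc
    tokens = [t for t in re.split(r"[/.\-_?=&:]", url) if t]  # split URL into tokens
    return {
        "host_len": len(host),                       # length of the host name
        "url_len": len(url),                         # length of the entire URL
        "n_tokens": len(tokens),
        "n_digits": sum(c.isdigit() for c in url),   # character-class counts
        "n_special": sum(not c.isalnum() for c in url),
    }

print(url_features("http://casino-bonus.example.com/free?id=777"))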

The analysis of related works allows us to draw the following conclusions. The efforts of the scientific community are mainly aimed at solving methodological and algorithmic issues in detecting inappropriate information. Other popular areas are text classification, data clustering, search and distributed storage, and the optimization of computational costs in data processing. However, the task of developing an intelligent system for detecting and countering inappropriate information on the Internet has not been sufficiently solved.

3 Architecture of the System

3.1 Overall Architecture of the System

The proposed intelligent system for detecting and countering inappropriate information has the architecture shown in Fig. 1. It has three levels: (1) data collection and preprocessing; (2) evaluation of the semantic content; (3) countermeasure development. The initial data for the system are information objects of the Internet and social networks. The results obtained by the proposed system are used by the security administrators responsible for protection against inappropriate information.

At the first level of the system architecture are the distributed intelligent scanners of network information objects. At the second level are the following components: (1) the component for multidimensional assessment and categorization of information objects; (2) the component ensuring the timeliness of the analysis of information objects; (3) the component for eliminating incompleteness and inconsistency; (4) the component for adaptation and retraining of the system. At the third level are the components for the choice of countermeasures and the implementation of visual interfaces.

Fig. 1. The overall architecture of an intelligent system for detecting and countering inappropriate information on the Internet

3.2 Components of the System

To determine the functions performed by the components of the system, consider its functional structure (Fig. 2). The distributed intelligent scanners (IS), which form the collection and preprocessing component (Pt) for network information objects, perform the following functions: analysis and classification of information objects; tag cloud extraction (tags, labels, hashtags, keywords); and prioritization of information object analysis. The component for multidimensional assessment and categorization of information objects includes: basic classifiers of inappropriate information (Bc); aspect classifiers of this information (Ac); and the final classifier of inappropriate information (Fc). The functions of the component ensuring the timeliness of the analysis (Manager) include: distributed memory management; planning of parallel data processing; and implementation of parallel data processing.


The functions of the component for eliminating incompleteness and inconsistency include: eliminating the ambiguity (fuzziness) of the source data for the multidimensional assessment and categorization of the semantic content of information objects collected in the distributed repository (DB); and eliminating the unreliability (inadequacy, incompleteness) of the initial information in the interests of detecting and counteracting inappropriate information. The functions of the adaptation and retraining component include: classifier training; countermeasure component training; and incremental learning of the system.

Fig. 2. Functional structure of the system

The functions of the countermeasure choice component include: storing countermeasures in the database (DBC); and evaluating the effectiveness of possible and implemented countermeasures (DMS). The functions of the visual interface component (Visualizer) include: user interaction with the system; and displaying the results of the system components using various data visualization models.

4 Implementation of the Component for Multidimensional Assessment and Categorization of Information Objects

The component for multidimensional assessment and categorization of information objects is the most important part of the intelligent system for analyzing digital network content. This component solves the problem of automatically classifying network information objects based on the data gathered in distributed storage by the distributed intelligent scanners. Any combination of the available characteristics of information objects can be considered as the feature vector, the most important of which is the natural language text. A training sample can be a set of pre-categorized information objects, for example, web pages or websites.


The component for multidimensional assessment and categorization of information objects supports several operation modes (a minimal sketch of how these modes could be organized is given after the list):

1. Classifier batch training mode. In this mode, the input to the component is a task for batch training of classifiers. The task and the list of information object categories are formed manually using the visualization component, or automatically. After the batch training task is received, a script is launched that collects the necessary information about information objects of the specified categories from the distributed data store; the ensemble of multidimensional classifiers is then trained, and the accuracy of the trained classifiers is assessed. If the assessment is not satisfactory, the user can change the parameters of the classifiers to improve their quality. If the accuracy assessment meets the specified criteria, the component can be switched to the incremental learning mode or the analysis mode.

2. Classifier incremental learning mode. In this mode, one or several previously unseen information objects of known category arrive at the component input. The classifiers pretrained in batch mode then adjust their parameter values and weights, taking the new information objects into account. Adding new information objects is expected to improve the classification quality of the pretrained classifiers.

3. Analysis of new information objects mode. In this mode, an unknown information object can be fed to the component in order to assign it to one or several of the categories for which classifiers were previously trained in batch or incremental learning mode.
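Below is a minimal sketch, assuming the scikit-learn library, of how the three modes could be organized around a single incrementally trainable model; the category names and model choice are hypothetical, not taken from the system's actual implementation.

from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

CATEGORIES = ["Casino", "Weapon", "Other"]  # hypothetical subset of categories

# A stateless vectorizer keeps the feature space fixed across increments.
vectorizer = HashingVectorizer(n_features=2**18, alternate_sign=False)
clf = SGDClassifier()  # linear model trained by SGD; supports partial_fit

def batch_train(texts, labels):
    """Mode 1: train on a batch of pre-categorized information objects."""
    clf.partial_fit(vectorizer.transform(texts), labels, classes=CATEGORIES)

def incremental_update(texts, labels):
    """Mode 2: adjust weights with new objects of known category."""
    clf.partial_fit(vectorizer.transform(texts), labels)

def analyze(texts):
    """Mode 3: categorize previously unseen information objects."""
    return clf.predict(vectorizer.transform(texts))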

5 Experimental Evaluation of the Multidimensional Assessment and Categorization Component

5.1 Dataset

The "Fsec full" dataset was used as the experimental one; it was formed by downloading and preliminarily categorizing the content of 78,663 individual webpages. Each webpage in the sample is described by the following attributes:

• webpage URL;
• the full text of the page in natural language;
• the number of images on the page;
• the number of links to other resources on the page;
• the content of the html meta tag "Description";
• the content of the html meta tag "Keywords".

The webpages contained in the dataset belong to one of 19 categories: Adult English, Beer, Casino, Cigarette, Cigars, Cults, Dating, Religious, Marijuana, Occults, Prescription drugs, Racist groups, Religion, Spirits, Sport betting, Violence, Wine, Weapon, Other. The Other category includes webpages that do not belong to any of the listed categories. The assignment of a webpage to a category was determined on the basis of expert assessment. Obviously, some categories, such as {Cults, Occults} or {Racist groups, White supremacy, Violence}, can have similar semantic content and many intersections, which can make their classification difficult.

5.2 Experiments

During the study, two series of experiments were performed. In both series, only the full text of the page in natural language was used. For text preprocessing and feature extraction, the standard "bag-of-words" scheme [8] was used, namely, lemmatization and the removal of standard and dataset-specific (for example, "web", "website") stopwords. The "Fsec full" dataset was divided into two parts in the ratio of 80% to 20%. On the first part of the sample, 5-fold cross-validation was used to select the optimal parameters for each classifier. The best classifiers with optimal parameters were then evaluated using precision, recall, and F-measure on the second part of the sample.

The first series of experiments was performed on a single machine with the following characteristics: operating system - macOS High Sierra; processor - 2.3 GHz Intel Core i5; memory - 8 GB 2133 MHz LPDDR3; graphics card - Intel Iris Plus Graphics 640 1536 MB. In the first series of experiments, the following classifiers implemented in the Python Scikit-learn library were investigated: the decision tree implemented in the DecisionTreeClassifier class; the support vector machine implemented in the SVC class; the multinomial Naive Bayes classifier implemented in the MultinomialNB class; and the random forest implemented in the RandomForestClassifier class. The general scheme of the experiments with the algorithms implemented in the Python Scikit-learn library is presented in Fig. 3.

Fig. 3. The general scheme of the experiment with Python Scikit-learn classifiers

Table 1 presents the values of precision, recall, and F-measure for the classifiers trained on the single machine, calculated on the test sample of the "Fsec full" dataset. The table also shows the time spent on selecting the optimal parameters for these classifiers.

Table 1. Evaluation of Python Scikit-learn classifiers on "Fsec full" dataset

Classifier | Classifier parameters selection, sec. | Prediction time, sec. | Precision | Recall | F-measure
SVC | 41047 | 950 | 0.92 | 0.92 | 0.92
Decision tree | 4160 | 35 | 0.84 | 0.84 | 0.84
Multinomial NB | 1340 | 27 | 0.81 | 0.74 | 0.73
Random forest | 3715 | 29 | 0.84 | 0.84 | 0.84
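A compressed sketch of this scheme, assuming scikit-learn, is shown below; the parameter grid and the dataset loader are hypothetical placeholders, not the values actually used in the experiments (lemmatization is omitted for brevity).

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

texts, labels = load_fsec_full()  # hypothetical loader for the "Fsec full" dataset

# 80%/20% split of the dataset
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels)

pipeline = Pipeline([
    ("bow", CountVectorizer(stop_words=["web", "website"])),  # bag-of-words
    ("clf", SVC()),
])

# 5-fold cross-validation on the first part to select optimal parameters
search = GridSearchCV(
    pipeline,
    param_grid={"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]},
    cv=5)
search.fit(X_train, y_train)

# Precision, recall and F-measure on the held-out 20% part
print(classification_report(y_test, search.predict(X_test)))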

processor - Intel Xeon [email protected] x 2 (6 cores and 12 threads each); memory - RAM 131 GB RAM and 2 hard drives with a total capacity of 8 TB. As part of the hypervisor, 5 virtual machines were created, four of which run under Ubuntu Server 18.04LT and one run under Ubuntu Desktop 18.04LTS. The Desktop machine acts as the master and manages a cluster of four other Server machines that perform the functions of a worker. The host machine has 8 CPU threads of 2.0 GHz and 64 GB RAM, while the working machines have 2 threads of 2.0 GHz and 8 GB RAM. All cluster machines are combined into one virtual local area network. To combine the computing resources of all the machines, a Hadoop framework was used to organize distributed computing. It was decided to manage cluster resources using the resource manager YARN. Also, to perform more productive computing over Hadoop, another popular distributed computing system Spark was installed. In the second series of experiments, the following classifiers were investigated, which are implemented in a parallel version in the MLlib library of the Spark distributed computing system: logistic regression implemented in the LogisticRegression class, support vector machines implemented in the class LinearSVC, Naive Bayes classifier implemented in the NaiveBayes class, random

Table 2. Evaluation of Spark classifiers on "Fsec full" dataset

Classifier | Classifier parameters selection, sec. | Prediction time, sec. | Precision | Recall | F-measure
Linear SVC | 27123 | 223 | 0.89 | 0.91 | 0.90
Logistic Regression | 288 | 1 | 0.79 | 0.81 | 0.80
Naive Bayes | 1256 | 4 | 0.84 | 0.79 | 0.81
Random forest | 2934 | 4 | 0.81 | 0.81 | 0.81
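For comparison with the single-machine scheme, the fragment below is a minimal sketch of how such a parallel experiment could be organized, assuming PySpark (Spark MLlib); the column names, storage path, and pipeline stages are illustrative assumptions, not the actual setup used here.

from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.classification import LinearSVC, OneVsRest
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
from pyspark.ml.feature import HashingTF, StopWordsRemover, Tokenizer

spark = SparkSession.builder.appName("fsec-classification").getOrCreate()
# Assumed layout: a "text" column with page text and a numeric "label" column
data = spark.read.parquet("hdfs:///datasets/fsec_full")  # hypothetical path
train, test = data.randomSplit([0.8, 0.2])

pipeline = Pipeline(stages=[
    Tokenizer(inputCol="text", outputCol="tokens"),
    StopWordsRemover(inputCol="tokens", outputCol="filtered"),
    HashingTF(inputCol="filtered", outputCol="features"),
    OneVsRest(classifier=LinearSVC()),  # LinearSVC is binary; wrap it for 19 classes
])

model = pipeline.fit(train)
f1 = MulticlassClassificationEvaluator(metricName="f1").evaluate(
    model.transform(test))
print(f"F-measure on the test part: {f1:.2f}")

OneVsRest is used here because MLlib's LinearSVC is a binary classifier; the paper does not state how multi-class classification was actually handled.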

It should be noted that the two series of experiments (single-machine and parallel versions) had some differences, mainly related to the implementation of the classifiers and the text preprocessing methods in the respective libraries.

5.3 Discussion

From the experimental results, it can be concluded that in the first series of experiments the best classifier in terms of precision, recall, and F-measure on the "Fsec full" dataset is the SVC classifier, which implements the support vector machines method. However, the selection of optimal parameters for this classifier took much longer than for the other classifiers; the same is true for the prediction time. In the second series of experiments, the most efficient in terms of precision, recall, and F-measure was the LinearSVC classifier, which also implements the support vector machines method. The differences in the accuracy of similar classifiers implemented in the Python Scikit-learn and Spark MLlib libraries are related to the peculiarities of their implementations and the inability to completely reproduce the experimental conditions on the different platforms.

It should also be noted that the use of the Spark distributed computing system did not yield a significant gain in training time. First of all, this is because the Hadoop cluster is deployed on virtual machines, which entails increased consumption of memory and processor resources. In addition, the "Fsec full" dataset is not a large one in the classical sense. Therefore, in further research we plan to employ hardware and software that should significantly accelerate the learning process and improve the accuracy of the classifiers.

6 Conclusion

The paper offers a new approach to creating an intelligent system for detecting and countering inappropriate information on the Internet, based on machine learning methods and big data processing. The proposed architecture contains the following components: distributed intelligent scanners, the component for multidimensional assessment and categorization of information objects, the component ensuring timely analysis of information objects, the component eliminating incomplete and contradictory results of evaluation and categorization, the component for adaptation and retraining of the system, the component for countermeasure choice, and the components implementing the visual interfaces. Experimental evaluation of the developed component for multidimensional evaluation and categorization of information objects in single-threaded and multi-threaded modes has shown the high efficiency of various classifiers from the Python Scikit-learn and Spark MLlib libraries. Further research is associated with the development, integration, and evaluation of all system components on a much larger dataset and in a more productive computing environment.

Acknowledgements. This research is being supported by the grant of RSF #18-11-00302 in SPIIRAS.

References

1. Aggarwal, C.C.: Data Classification: Algorithms and Applications. CRC Press, Boca Raton (2014)
2. Aggarwal, C.C.: Machine Learning for Text. Springer, Cham (2018)
3. Al-Khateeb, S., Hussain, M.N., Agarwal, M.N.: Leveraging social network analysis and cyber forensics approaches to study cyber propaganda campaigns. In: Social Networks and Surveillance for Society, pp. 19–42. Springer, Cham (2019)
4. Atodiresei, C.-S., Tănăselea, A., Iftene, A.: Identifying fake news and fake users on Twitter. Procedia Comput. Sci. 126, 451–461 (2018)
5. Badri Satya, P.R., Lee, K., Lee, D., Tran, T., Zhang, J.J.: Uncovering fake likers in online social networks. In: Proceedings of the 25th ACM International Conference on Information and Knowledge Management, pp. 2365–2370. ACM (2016)
6. Benkler, Y., Faris, R., Roberts, H.: Network Propaganda: Manipulation, Disinformation, and Radicalization in American Politics. Oxford University Press, Oxford (2018)
7. Conroy, N.J., Rubin, V.L., Chen, Y.: Automatic deception detection: methods for finding fake news. Proc. Assoc. Inf. Sci. Technol. 52(1), 1–4 (2015)
8. Feldman, R., Sanger, J.: The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data. Cambridge University Press, Cambridge (2007)
9. Khonji, M., Iraqi, Y., Jones, A.: Enhancing phishing e-mail classifiers: a lexical URL analysis approach. Int. J. Inf. Secur. Res. (IJISR) 2(1/2), 40 (2012)
10. Kotenko, I., Chechulin, A., Komashinsky, D.: Categorisation of web pages for protection against inappropriate content in the internet. Int. J. Internet Protoc. Technol. 10(1), 61–71 (2017)
11. Kotenko, I.V., Saenko, I., Kushnerevich, A.: Parallel big data processing system for security monitoring in internet of things networks. JoWUA 8(4), 60–74 (2017)
12. Li, M., Wang, X., Gao, K., Zhang, S.: A survey on information diffusion in online social networks: models and methods. Information 8(4), 118 (2017)
13. Liu, Y., Liu, Y., Zhang, M., Ma, S.: Pay me and I'll follow you: detection of crowdturfing following activities in microblog environment. In: IJCAI, pp. 3789–3796 (2016)
14. Ma, J., Saul, L.K., Savage, S., Voelker, G.M.: Beyond blacklists: learning to detect malicious web sites from suspicious URLs. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1245–1254. ACM (2009)
15. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
16. Mustafaraj, E., Metaxas, P.T.: The fake news spreading plague: was it preventable? In: Proceedings of the 2017 ACM on Web Science Conference, pp. 235–239. ACM (2017)
17. Novozhilov, D., Kotenko, I., Chechulin, A.: Improving the categorization of web sites by analysis of html-tags statistics to block inappropriate content. In: Intelligent Distributed Computing IX, pp. 257–263. Springer (2016)
18. Qi, X., Davison, B.D.: Web page classification: features and algorithms. ACM Comput. Surv. (CSUR) 41(2), 12 (2009)
19. Raniere, K.A.: Data stream division to increase data transmission rates. US Patent 9,838,166, 5 December 2017
20. Shu, K., Sliva, A., Wang, S., Tang, J., Liu, H.: Fake news detection on social media: a data mining perspective. ACM SIGKDD Explor. Newsl. 19(1), 22–36 (2017)
21. Tschiatschek, S., Singla, A., Gomez Rodriguez, M., Merchant, M., Krause, A.: Fake news detection in social networks via crowd signals. In: Companion of The Web Conference 2018, pp. 517–524. International World Wide Web Conferences Steering Committee (2018)
22. Tushkanova, O.: Comparative analysis of the numerical measures for mining associative and causal relationships in big data. In: Creativity in Intelligent Technologies and Data Science, First Conference Proceedings, CIT&DS, pp. 571–582. Springer (2015)

A Note on Analysing the Attacker Aims Behind DDoS Attacks

Abhishta Abhishta, Marianne Junger, Reinoud Joosten, and Lambert J. M. Nieuwenhuis

University of Twente, Enschede, The Netherlands
{s.abhishta,m.junger,r.a.m.g.joosten,l.j.m.nieuwenhuis}@utwente.nl

Abstract. Distributed denial of service (DDoS) attacks pose a serious threat to the availability of online resources. In this paper, we analyse the attacker aims for the use of DDoS attacks. We propose a model that can be used to evaluate news articles for determining probable aims of attackers. Thereafter, we apply this model to evaluate 27 distinct attack events from 2016. We make use of a DDoS specific longitudinal news database to select these attack events. We find the proposed model useful in analysing attack aims. We also find that in some cases attackers might target a web infrastructure just because it is virtually invincible.

Keywords: DDoS attacks · Routine activity theory · Attacker aims · SPEC

1 Introduction and Background

Today, the availability of the Internet and Internet-based services is of great importance to all. Malicious actors use cyber attacks to disrupt the normal functioning of the Internet or to steal digital information. These cyber attacks lead to direct or indirect financial losses [29] for the victimised firms or attacked individuals. Distributed denial of service (DDoS) is one such cyber attack that is used to make web-based services inaccessible to the intended user. In order to protect itself, a firm needs to evaluate its vulnerabilities and threats so as to plan its defence strategy [34]. These threats can be realised by acknowledging the various reasons for which the firm's IT infrastructure might become a target. Hence, it is important to investigate the aims of attackers for the use of DDoS attacks. A conventional crime has three aspects that need to be proven before a wrongdoing is determined: means, motive and opportunity. Just like conventional crimes, DDoS attacks require a means to execute, a motive to select the target and an opportunity to attack. In this case, means refers to the attack tools or the necessary technical expertise needed to execute the attack, the aim of the attacker points towards the reason for the attacker to act, and vulnerabilities in the network provide the opportunity for the attack. Figure 1 shows the three aspects of a DDoS attack.


Fig. 1. Aspects of a DDoS attack

In this paper, we focus on analysing attacker aims for the use of DDoS attacks. The obvious way to investigate the aims of attackers is to interview them. However, it is also possible to model the probable aims based on the information reported by journalists in news articles related to the attack event. Taking into account the socio-cultural, political and economic dimensions of DDoS attacks and the postulates of routine activity theory (RAT), we propose a model to analyse the content of news articles related to an attack. We then use this model to analyse probable attacker aims in 27 different cases from 2016.

2 Previous Works

A few studies have tried to evaluate the attacker aims behind DDoS attacks. Hutchings and Clayton [35] discuss the incentives for booter owners. Paulson and Weber [42] evaluate the use of DDoS attacks for extortion against online gaming companies. Nazario [41] discusses politically motivated DDoS attacks. Later, Sauter [43] highlights the use of DDoS attacks for hacktivism purposes. Finally, Zargar et al. [46] list the probable incentives for attackers to use DDoS attacks as follows:

1. Financial/economic gain: this is the motive when an attacker gets paid for the assault.
2. Revenge: the motive of an attacker in this category is to DDoS for retribution.
3. Ideological belief: the attackers in this category usually attack as a portrayal of disagreement.
4. Intellectual challenge: the attackers in this category experiment and learn from their activities. They are usually hackers who wish to show off their capabilities.
5. Cyber warfare: the attackers in this category usually belong to a military or terrorist group.

However, Zargar et al. [46] do not provide any evidence for most of the listed motives. Some other studies have also evaluated the non-technical characteristics of


cyber attacks as a whole. Liu and Cheng [40] discuss the reasons for cyber attacks to happen. They also explain who these attackers are and how they conduct these attacks. Gandhi et al. [32] discuss the socio-cultural, political and economic (SPEC) dimensions of cyber attacks. They analyse selected security events between 1996 and 2010 on the basis of the SPEC criteria. Sharma et al. [44] proposed a social dimensional threat model using historical cyber attack events. On the basis of their model they evaluate 14 different news articles concerning cyber attacks. Geers et al. [33] analyse the nation-state motives behind cyber attacks. Kumar and Carley [38] used network analysis on data from Arbor Networks' digital attack map and the World Bank to study the aims behind DDoS attacks. They conclude that there is an increase in the probability of attacks on a country if there are negative sentiments towards the country on social media. All of the above-mentioned studies show that not all attacks are carried out for economic gains. As booters have made DDoS attacks an easy weapon for nearly everyone, a number of aims can trigger attackers to launch an attack. These studies either evaluate the aims of attackers with respect to the SPEC criteria, or assume an aim and provide evidence to show the relevance of that aim in certain attacks. We believe that in the case of DDoS attacks, attackers have to make two choices: (1) the victim (the company or individual they wish to attack), and (2) the network infrastructure of the victim they wish to target. We propose a hybrid strategy for evaluating attacker aims by analysing the victim with respect to the SPEC criteria and analysing the choice of infrastructure by considering the postulates of routine activity theory.

3 Methodology

Here, we discuss the characteristics of the dataset and the sampling strategy used by us to extract DDoS attack events. We then explain the proposed model for content analysis of news articles.

Table 1. Characteristics of the dataset.

Dates                     #Articles     #Articles/day   Standard deviation
Start       End           Web    News   Web    News     Web    News
01-01-2016  31-12-2016    9387   4458   25.6   12.18    7.55   8.67

3.1 Dataset and Sampling

The dataset is a collection of Google Alerts on DDoS attacks. The collection process and some possible uses of the dataset are described by Abhishta et al. [28]. Table 1 shows the characteristics of the dataset used in this research. Google Alerts is a content change detection and notification service; a user of this service can stay updated about a topic of their choice. The service issues two types of alerts: (1) News and (2) Web. News alerts report content posted on news websites; all other alerts are categorised as Web alerts.


In this paper, we are looking for a sample of DDoS attack events that were discussed at length in the media. Hence, the goal of sampling is to extract the most reported DDoS attacks of 2016. We divide the event sampling process into two parts: (1) we identify eventful days; (2) we evaluate the 'News' alerts of each eventful day to extract attack events.

Fig. 2. Histogram depicting selection criterion for eventful days.

The statistical criterion for the identification of 'eventful days' is based on the methodology also used by Kallus [37]. We consider the days on which the number of alerts was greater than θ as 'eventful'. In order to calculate the threshold θ, we make use of the empirical distribution of the number of alerts generated each day. Figure 2 shows the empirical distribution of the number of 'News' alerts generated daily over the year. In this paper, we set the threshold at the top 20 percentile. If we consider the top 10 percentile of the alerts, then most of the eventful days lie in the second half of 2016; this is due to an enormous increase in the reporting of DDoS attacks in the later half of the year. In this case, θ is calculated to be 31.92 alerts. Thus, if 32 or more 'News' alerts are reported on a single day, we consider it an eventful day. With this method, we are able to select 43 eventful days. We consider the alerts generated on eventful days for our study.
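A minimal sketch of this percentile-threshold step follows, assuming the per-day 'News' alert counts have already been extracted from the dataset; the Poisson-generated counts below are only placeholder data standing in for the real counts.

```python
# Eventful-day selection: keep the days whose 'News' alert count reaches
# the empirical percentile threshold theta. The generated counts are
# placeholder data, not the real per-day alert counts.
import numpy as np

rng = np.random.default_rng(0)
daily_news_counts = rng.poisson(lam=12.0, size=366)  # placeholder data

top_share = 20  # the top 20 percentile of days, as in the text
theta = np.percentile(daily_news_counts, 100 - top_share)
eventful_days = np.flatnonzero(daily_news_counts >= theta)
print(f"theta = {theta:.2f}, eventful days = {eventful_days.size}")
```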

Fig. 3. Attack time-line showing the extracted attack events for θ = 32

In order to identify the events responsible for the generation of an abnormally high number of alerts on eventful days, we evaluate the text of all alerts on an eventful day and record the reported events as DDoS-related (non-attack) events and DDoS attack events. We find that these news alerts report either an attack or an activity associated with an attack, e.g. a research report by a DDoS protection company, or steps taken by law enforcement agencies. We manually tag the content of the alerts on the selected days to identify attack-reporting alerts. The extracted attack events are shown in Fig. 3. For this research we only consider the articles reporting a DDoS attack. We identify 27 separate attack events being discussed in these news articles.

3.2 Content Analysis

The decision of the attacker to choose a target for a DDoS attack can be broken down into the following two components: (1) the choice of victim organisation to target; (2) the choice of network infrastructure to target. Figure 4 shows the model followed by us to analyse attacker aims. Gandhi et al. [32] have shown that social, political, economic and cultural circumstances influence the choice of victim for an attacker. Hence, we evaluate the attacker's choice of victim using the SPEC criteria suggested by Gandhi et al. For the choice of network infrastructure, we assume that the attackers are rational, i.e. the attackers choose to launch an attack [31]. This assumption enables us to make use of the postulates of RAT. According to Cohen and Felson's (1979) [30] routine activity theory, direct-contact predatory victimization occurs with the convergence in both space and time of three components: a motivated offender, the absence of a capable guardian, and a suitable target. According to routine activity theory, the suitability of an infrastructure for predation can be estimated using its four constituent properties: value, inertia, visibility and accessibility, usually rendered in the acronym VIVA [45]. The VIVA dimensions can be described as follows:

Value. This refers to the importance of the infrastructure to the victim. For example, depending on the online sales of a company, a website can be more or less valuable to the company.

Inertia. Inertia refers to the degree of resistance posed by the infrastructure to effective predation. High-inertia infrastructures are those employing better protection strategies against DDoS attacks or those that can sustain high-intensity network traffic (e.g. distributed servers, websites hosted in the cloud, etc.).

Visibility. Visibility refers to the visibility of the objects an offender wishes to steal [39]. High-visibility web infrastructures are mostly public facing, such as a public website.

Accessibility. This refers to the ability of an offender to get to the target and get away from the scene of the crime. An example of a high-accessibility infrastructure is a server whose IP address can be easily accessed and which is set up without intrusion detection systems or network monitoring applications.

With the help of the concepts discussed above, we develop a model for analysing the probable aims behind attack events. We analyse news articles


Fig. 4. Model for analysing attacker aims using news articles

related to 27 distinct attack events using this model to understand the attacker aims.
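One possible representation of a single coded event under this model is sketched below. The field names mirror the SPEC and VIVA dimensions defined in Sect. 3.2; the example ratings for the Nissan event follow the discussion in Sect. 4 where stated (low value, high visibility of the website) and are otherwise hypothetical.

```python
# Illustrative coding of one attack event under the model of Fig. 4.
from dataclasses import dataclass

@dataclass
class AttackEventCoding:
    victim: str
    infrastructure: str
    # SPEC criteria behind the choice of victim
    socio_cultural: bool
    political: bool
    economic: bool
    # VIVA characteristics of the chosen infrastructure ("High"/"Low")
    value: str
    inertia: str
    visibility: str
    accessibility: str

# Whale-hunting protest -> socio-cultural; inertia/accessibility values
# here are hypothetical, not taken from the paper.
nissan = AttackEventCoding(
    victim="Nissan Motors", infrastructure="Website",
    socio_cultural=True, political=False, economic=False,
    value="Low", inertia="Low", visibility="High", accessibility="High",
)
```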

4 Results and Discussion

Figure 3 shows the DDoS attack events reported on eventful days. As a result of filtering, a total of 43 dates were selected as eventful days. We evaluate all the alerts on these days and select DDoS attack events on the basis of the criteria mentioned in Sect. 3.1. The number of alerts collected on eventful days is 1929. Hence, these 11.75% of the days of the calendar year account for nearly 43% of the total 'News' alerts (the number of news alerts on eventful days divided by the number of news alerts in the whole year, multiplied by 100). This result supports the findings of Johnson [36] with respect to the concentration of traditional crimes, as traditional crime is also very much concentrated in time and space. Table 2 summarises the components of each of the selected attack events, i.e. the victim, the attacked infrastructure, the SPEC variables and the VIVA characteristics of the infrastructure. In the following paragraphs we discuss these attack reports in detail and report our findings in accordance with the criteria discussed in Sect. 3.2. In our analysis we see that the selected attack events can be broadly classified into 6 categories: (1) attacks on large manufacturing companies; (2) attacks targeting public figures and ideological groups; (3) attacks targeting governments; (4) attacks on gaming and gambling platforms; (5) attacks on internet service providers and hosting service providers; and (6) attacks on financial institutions. The first category includes the attack on Nissan Motors: all the global websites of the automotive company Nissan [2] were reported to suffer downtime.


As Nissan does not sell cars online, the website is of relatively low value to the company. However, it was reported that the attack was carried out during the Detroit auto show. During auto shows, car manufacturers expect attendants to visit their website to learn more about the vehicles. Hence, even though Nissan does not sell cars online, the website has a high visibility during this period. Later reports suggested that Anonymous (a hacker group) targeted the website to protest against whale hunting in Japan (which justifies the choice of Nissan as a victim). Hence, the high visibility of the website was the key input for the choice of target. The second category includes attacks on the Ku Klux Klan, the website of Brian Krebs, Black Lives Matter, Wikileaks, Donald Trump and Hillary Clinton [8,16,20,22,26]. The websites of this category of victims are easy targets and have high visibility. As a result of a protest against racism, 'Anonymous' attacked the website of the Ku Klux Klan [16]. According to the reports, the websites of Wikileaks, Donald Trump and Hillary Clinton were targeted on the day of the election result, showing socio-cultural reasons for the attacks. The next category comprises attacks on websites of the Irish, Italian and Australian governments [1,3,15]. These attacks could have been launched for both socio-cultural and political reasons, as government websites usually do not offer online services. Italian government websites [1] were targeted by the hacker group 'Anonymous'. The motivation behind the attack was to protest against the participation of local bodies in the Trans Adriatic Pipeline (TAP) project. However, the attack on the Australian government website was clearly targeted to interrupt census data collection. The fourth category includes online gaming platforms and gambling websites. The news sources reported an attack on the Irish lottery website [19] and vending machines that led to the disruption of the sale of tickets. According to the reports, this time the lottery jackpot was the highest in 18 months (high value). Hence, more people were expected to buy tickets (high visibility). In July 2016, when the game 'Nintendo Pokemon Go' [21] was very popular (high visibility), another hacker group, 'PoodleCorp', attacked the servers of the game. They claimed responsibility for the attack, thus gaining a lot of publicity. Just after this online assault, an attack on the servers of Blizzard was reported that made the Warcraft servers inaccessible for gamers. The fifth category comprises attacks on ISPs and web hosting providers. In September and October 2016, attacks on ISPs in India [18], OVH (a web hosting provider) [25] and Dyn (a DNS service provider) [10] were reported. ISPs usually form high-inertia targets for DDoS attacks. A new Internet of Things (IoT) based botnet, 'Mirai', whose code was released online, was used for these attacks. Each of these attacks was bigger than the previous one in intensity. The final category includes the attack on HSBC online banking services. As the attack was launched on the last Friday of the month (salary day), the reasons for the attack were clearly economic. This is another example in our sample where the routine period affected the value of the infrastructure.

Table 2. Analysis of each of the selected attack events: date, news reference, victim and attacked infrastructure; each event is additionally rated High or Low on the SPEC variables (socio-cultural, political, economic) and the VIVA characteristics of the infrastructure (value, inertia, visibility, accessibility).

Date        Ref.  Victim                Infrastructure
13/01/2016  [2]   Nissan Motors         Website
22/01/2016  [19]  Premier Lotteries     Ticket machines and Website
22/01/2016  [15]  Irish government      Website
29/01/2016  [17]  HSBC                  Online Banking Server
26/02/2016  [1]   Italian government    Website
26/04/2016  [16]  Ku Klux Klan          Website
20/07/2016  [21]  Pokemon Go            Gaming Server
03/08/2016  [6]   Blizzard              Gaming Server
11/08/2016  [3]   Australian Census     Website
01/09/2016  [4]   EA Sports             Gaming Server
23/09/2016  [20]  Brian Krebs           Website
23/09/2016  [12]  Ethereum network      Servers
29/09/2016  [25]  OVH                   Hosting Server
18/10/2016  [18]  ISPs in India         Network Devices
21/10/2016  [10]  Dyn                   Servers
27/10/2016  [7]   StarHub               Network Devices
02/11/2016  [27]  William Hill          Website
08/11/2016  [9]   Canadian migration    Website
08/11/2016  [26]  Wikileaks             Website
08/11/2016  [22]  Trump and Clinton     Website
25/11/2016  [14]  Deutsche Telekom      Network Devices
29/11/2016  [11]  Eir                   Email Server
30/11/2016  [13]  European Commission   Website
15/12/2016  [8]   Black Lives Matter    Website
15/12/2016  [5]   BTC exchange          Servers
21/12/2016  [24]  Tumblr                Website
23/12/2016  [23]  Steam                 Gaming Servers

5 Conclusions and Future Work

In this paper, we propose a model for analysing the attacker aims for using DDoS attacks. This model uses the SPEC criteria for evaluating the reasons for choosing the victim and then studies the VIVA characteristics of the choice of infrastructure. We use this model to evaluate news articles related to 27 attack events that were reported in 2016. Our main conclusions are as follows:

• News articles are able to put DDoS attacks in context. Using the proposed model it is possible to evaluate the decisions made by the attacker to choose the victim and infrastructure.
• Companies need to monitor their socio-cultural and political environment at all times; not all attackers look for personal economic gains.
• All infrastructure connected to the Internet is vulnerable to DDoS attacks. Companies must be aware of the degree of visibility and accessibility of the infrastructure. They should also consider their routine periods while analysing the VIVA characteristics of the infrastructure.
• Attacks on high-inertia targets such as Dyn [10] show that attackers sometimes target infrastructures precisely because they are virtually invincible.

In this study, we only use data from 2016; hence we cannot derive conclusions on how often attackers are motivated by a particular aim. In the future, we would like to analyse a larger and more representative sample of all reported attacks. We hope to use the proposed model as a base for automatically detecting attacker aims from news articles reporting DDoS attacks.

Acknowledgments. This work is part of the NWO: D3 project, which is funded by the Netherlands Organization for Scientific Research (628.001.018).

References

1. Anonymous attacks Italian government portals because of gas pipeline project (2016)
2. Anonymous takes down Nissan website in protest of Japanese whale killings (2016)
3. Australian 2016 census sabotage puts a question mark on private cloud (2016)
4. Battlefield 1 beta: you have lost connection to EA servers (2016)
5. Bitcoin exchange BTC-e resumes services after latest DDoS attack (2016)
6. Blizzard hit with another DDoS attack, Overwatch, WoW, Hearthstone and more down (2016)
7. DDoS attacks caused StarHub broadband outages (2016)
8. The DDoS vigilantes trying to silence Black Lives Matter (2016)
9. Donald Trump sweeping American polls, Canadian migration website down (2016)
10. Dyn statement on 10/21/2016 DDoS attack (2016)
11. Eir's webmail affected by DDoS attack (2016)
12. Ethereum's network is currently suffering from a computational DDoS attack (2016)
13. European Commission hit by DDoS attack (2016)
14. Failed Mirai botnet attack downed 900000 Germans' internet access (2016)

15. Govt websites forced offline in DDoS attack (2016)
16. Hacker group Anonymous shuts down KKK website (2016)
17. HSBC online banking is 'attacked' (2016)
18. Internet providers claim cyber attack, to meet senior cop (2016)
19. Irish lottery site and ticket machines hit by DDoS attack (2016)
20. KrebsOnSecurity hit with record DDoS (2016)
21. Pokemon Go down: hacking group claims credit for taking down servers 'with DDoS attack' (2016)
22. Presidential candidate websites targeted (2016)
23. Steam connection servers down in probable DDoS attack (2016)
24. Tumblr goes down after hacker attack (2016)
25. Web host hit by DDoS of over 1 Tbps (2016)
26. WikiLeaks comes under 'unrelenting' cyber attack that briefly prevents it from releasing more emails linked to Hillary Clinton on election day (2016)
27. William Hill website under siege from DDoS attacks (2016)
28. Abhishta, A., Joosten, R., Jonker, M., Kamerman, W., Nieuwenhuis, L.J.M.: Poster: collecting contextual information about a DDoS attack event using Google alerts (2019)
29. Anderson, R., Barton, C., Böhme, R., Clayton, R., Van Eeten, M.J.G., Levi, M., Moore, T., Savage, S.: Measuring the cost of cybercrime. In: Workshop on the Economics of Information Security (2012)
30. Cohen, L.E., Felson, M.: Social change and crime rate trends: a routine activity approach. Am. Sociol. Rev. 44(4), 588–608 (1979)
31. Cromwell, P., Olson, J.N.: The reasoning burglar: motives and decision-making strategies. In: In Their Own Words. Roxbury Publishing Company, Los Angeles (2006)
32. Gandhi, R.A., Sharma, A.C., Mahoney, W., Sousan, W., Zhu, Q.: Dimensions of cyber-attacks: cultural, social, economic, and political. IEEE Technol. Soc. Mag. 30(1), 28–38 (2011)
33. Geers, K., Kindlund, D., Moran, N., Rachwald, R.: World War C: understanding nation-state motives behind today's advanced cyber attacks (2013)
34. Gordon, L.A., Loeb, M.P.: The economics of information security investment. ACM Trans. Inf. Syst. Secur. 5, 438–457 (2002)
35. Hutchings, A., Clayton, R.: Exploring the provision of online booter services. Deviant Behav. 37(10), 1163–1178 (2016)
36. Johnson, S.D.: A brief history of the analysis of crime concentration. Eur. J. Appl. Math. 21(4–5), 349–370 (2010)
37. Kallus, N.: Predicting crowd behavior with big public data. In: Proceedings of the 23rd International Conference on World Wide Web, WWW 2014 Companion, pp. 625–630. ACM, New York (2014)
38. Kumar, S., Carley, K.M.: Understanding DDoS cyber-attacks using social media analytics. In: IEEE Conference on Intelligence and Security Informatics (ISI) (2016)
39. Leukfeldt, E.R., Yar, M.: Applying routine activity theory to cybercrime: a theoretical and empirical analysis. Deviant Behav. 37(3), 263–280 (2016)
40. Liu, S., Cheng, B.: Cyberattacks: why, what, who, and how. IT Prof. 11(3) (2009)
41. Nazario, J.: Politically motivated denial of service attacks. In: Cryptology and Information Security Series (2009)
42. Paulson, R.A., Weber, J.: Cyberextortion: an overview of distributed denial of service (2006)

43. Sauter, M.: "LOIC Will Tear Us Apart": the impact of tool design and media portrayals in the success of activist DDoS attacks. Am. Behav. Sci. 57, 983–1007 (2013)
44. Sharma, A.C., Gandhi, R.A., Mahoney, W., Sousan, W., Zhu, Q., Laplante, P.: Building a social dimensional threat model from current and historic events of cyber attacks. In: IEEE Second International Conference on Social Computing (2010)
45. Yar, M.: The novelty of cybercrime: an assessment in light of routine activity theory. Eur. J. Criminol. 2(4), 407–427 (2005)
46. Zargar, S.T., Joshi, J., Tipper, D.: A survey of defense mechanisms against distributed denial of service (DDoS) flooding attacks. IEEE Commun. Surv. Tutor. 15, 2046–2069 (2013)

Formation of the System of Signs of Potentially Harmful Multimedia Objects

Sergey Pilkevich and Konstantin Gnidko

Mozhaysky MSA, 13 Zhdanovskaya Street, Saint Petersburg 197198, Russia

Abstract. The article describes an approach to the formation of a system of signs (indicators) of potentially harmful multimedia objects. The assignment of information objects to the class of harmful ones is carried out on the basis of testing volunteer subjects. The experimental research is performed in accordance with the methodological principles of colour-stimulus associations and makes it possible to identify the relationship between multimedia content and its impact on the subjects.

1 Introduction

The uncontrolled information space has a significant destructive impact on the formation of the consciousness and morality of minors. The spread of pornography, violence and cruelty, extremism and terrorism on the Internet is a serious threat to modern society. Numerous studies [5] prove that there is a strong correlation between viewing scenes of violence and cruelty in the media, in particular on the Internet, and the level of aggression. The emergence of destructive phenomena in the information sphere leads to the need to ensure a "healthy" information environment. Psychological protection of the population from psychotraumatic information is an important component of the national security of the Russian Federation [3] and should be considered as part of the system of social protection of the population at the federal and regional levels. Underestimation of the problem of ensuring information and psychological security threatens individual, group and public consciousness, and can lead to errors in the choice of tactics and forms of social work, distrust of the authorities, the emergence of social and psychological tensions and other negative consequences. The above motivates the purpose of this publication, which is to develop a methodology for conducting an experimental study aimed at formalizing and obtaining a system of objective criteria for classifying multimedia objects as potentially harmful.

2 Alternative Approaches to Solving the Problem

Currently, the most common approach to protecting users' psyche from potentially dangerous materials mainly includes legislative measures (Germany: Act to Regulate the Dissemination of Writings and Media Contents Harmful to Young Persons; USA: The Children's Internet Protection Act; Canada: National Strategy for the Protection of Children from Sexual Exploitation on the Internet; and others), as well


as public regulation (the German Association of Voluntary Self-Monitoring of Multimedia Service Providers and the Federal Department for Media Harmful to Young Persons; the Canadian Centre for Child Protection; the British Internet Watch Foundation) and voluntary certification based on national and regional ratings and classifications of video and gaming products (CERO, USK, ESRB, PEGI, etc.). One of the main drawbacks of such measures aimed at protection against harmful multimedia content in Internet channels is the high inertia of the existing legal mechanisms with regard to the emergence of new types of threats. Software and hardware systems and tools for the detection and filtering of hidden malicious content on the Internet (e.g. the filtering features of Google, Lycos Europe, AOL Deutschland, Yahoo! and other search engines, and parental control in Windows 10 and the Kaspersky Security product line) are currently underdeveloped. The systems of technical monitoring of negative multimedia information circulating on the Internet operating in the Russian Federation are overwhelmingly based on the use of so-called "black lists" and the blocking of the Internet addresses of sources that have compromised themselves by spreading negative content (pornography, scenes of violence and cruelty, etc.). The obvious disadvantage of this approach is the inability to detect and block potentially harmful multimedia objects from trusted sources in a timely manner. Most modern research works focus on improving algorithms for recognizing malicious information of a particular type, such as pornography [2] or profanity, and suggest different modifications of user authorization procedures when accessing negative content [4]. The proposed approach is characterized by a combination of non-verbal testing of experts (users) with monitoring of their physiological reactions in the process of perception of test graphical stimuli. This allows, along with the refinement of experts' assessment ratings, obtaining objective evidence of the subjects' response to emotionally significant stimuli (negative, positive and neutral).

3 The Hypothesis of the Study

It seems that the traditional approach based on expert opinions is of limited applicability. The main hypothesis underlying the approach described below is that destructive phenomena in the information sphere are related to the psychophysiological response of users accessing Internet content. A schematic representation of this approach is presented in Fig. 1. Multimedia content exerts informational influence at both the conscious and unconscious (subliminal) levels. In the first case, it is the semantics of the information message which has an impact on the recipient; in the second case, it is the syntax. The result of the impact is manifested in a change in the psychological (cognitive, emotional and behavioural) characteristics of people, by which one can judge the degree of contagiousness of Internet content. Based on the above, for the formation of a system of signs of potentially harmful multimedia objects, it is necessary to conduct experimental studies which provide sufficient data by which one can reliably judge what characteristics (including presentation conditions: duration, sequence, language, demonstration techniques, etc.) the content should have to induce certain (in particular,


Fig. 1. Scheme for determining the degree of impact of Internet content on users of modern access devices

negative) cognitive, emotional and autonomic-behavioral effects (reactions) of people with different psychological profiles.

4 Methods of Experimental Research

This article proposes an approach to conducting experimental research, based on the methodological principles of color-stimulus associations, which allows identifying the relationship between multimedia content and its impact on the subjects. The scheme of interrelation of the main stages of the considered methodology is presented in Fig. 2.

Fig. 2. Methods of studying the impact of information content on the human psyche

The main methodological basis of the proposed technology is the concept of the color test of relationships (the Etkind-Bazhin color test of relations and the color-associative method of A.M. Parachev) [1], according to which the essential characteristics of the non-verbal components of emotional relationships to significant people, objects, phenomena of reality and to oneself are manifested in color associations to them. A development of this approach is the procedure of associating standard color stimuli with words, phrases, symbols, images, objects, concepts, etc. related to a particular subject area. As a result, it is possible to associate the "clouds of meaning" or "semantic fields" that exist in the individual psyche of the subject with the area under study.


In the described method, color-stimulus associations have a quantitative expression. The association procedure is carried out on the basis of the "inner subjective sense of correctness". The analysis and interpretation of such associations characterizes various aspects of the subject's personality (including the adequacy of understanding Russian-language concepts, the adequacy of the "subjective picture of the world", the "speed of thinking", and simple and complex visual-motor reactions). The structure of the experiment is described by the following sequence of steps.

1. On the basis of voluntary informed consent, Internet users are invited as test subjects: at least 30 young men and women (age 18–25 years); healthy; Russian-speaking students of a university with a psychological and pedagogical profile. Volunteers are informed that they will be presented with a variety of Internet resources, including potentially harmful ones, which can cause pronounced negative emotional reactions (aversion, fear, anger, depression, etc.). After familiarization with the content, the subjects record their impressions of the perceived information, their thoughts and experiences in connection with the viewing, as well as the resulting desire to do something or not. These data are collected during the survey and make it possible to compare the perceived responses and reactions with the unconscious behavior registered by means of sensor equipment.
2. Preliminarily and once, by testing, the psychological profile of each subject is determined according to the criterion of attitude towards significant symbols and phenomena of life. This is done with the help of the specialised software "Tsvetomer".
3. Each subject is presented, in a predetermined order and for a specified time, with multimedia objects belonging to predetermined emotionally significant and neutral classes. As a basis for verifying the results obtained, and possibly extending them to a non-Russian-speaking audience, it is advisable to select the multimedia objects and their classes based on the data published in [6].
4. During the presentation of content to the subject, background monitoring of oculomotor, mimic (micromimic), pantomimic (gestural) and vegetative (pulse, respiration, tremor, blood vessels of the skin) reactions is conducted. The following equipment is used: a video tracker, the computer mouse "BioMouse" and a video camera (see Table 1).
5. After the end of the session, a quick assessment of the emotionally conditioned state of the subject is performed using colour psychometry (the unconscious component from perceived stimuli); then the subject describes his/her impressions of the presented content, including what the content motivated/justified (the perceived component of the senses contained in the presented stimuli).
6. As a result of the experiment, a filled ontology knowledge base is obtained. For the inference machine, a decision rule is constructed to answer the following questions: what set of reactions occurs as a result of exposure to content of a certain class; by what response patterns can we judge that the content contains malicious multimedia objects; and how respondents with different behavioral styles react to destructive phenomena in the information sphere.

Thus the tasks of the experiment are the following:

Table 1. Characteristics of sensor equipment

Hardware and software            Basic functionality            Analyzed parameter
Mouse "BioMouse"                 Infrared pulse sensor          Photoplethysmogram; heart rate
Video tracker Gazepoint GP3                                     Map of gaze movement; fog map;
(60 Hz); software Gazepoint                                     heat map; areas of interest
Analyzer                                                        (static and dynamic)
Video cameras                    Record, sync audio and video
Software "Tsvetomer"             Graphic stimulus               Index of color-pairs; "cognitive
                                                                load" of concepts; latent
                                                                reaction time

1. On the basis of expert assessments, form incentive classes (test multimedia content): "neutral", "news", "romantic", "viral", "educational", "terrorist", "extremist", "suicidal", "pornographic", etc.
2. Develop and execute a formalized description of the selected incentives based on the following characteristics: semantic content, "density" of content contamination, display form.
3. Evaluate the sufficiency and degree of informativeness of the selected psychometric testing methods, as well as assess the cognitive, emotional and vegetative-behavioral effects.
4. Conduct experimental sessions familiarizing the subjects with the selected content along with parallel monitoring of the dynamics of cognitive, emotional and vegetative-behavioral effects.
5. Identify trends and patterns in the reactions of the subjects upon presentation of various content and forms of incentives.

5 Conclusion

The listed threats in the field of information and psychological security create prerequisites for expanding the range of interdisciplinary basic research focused on solving the problems of neurocomputer and neurocognitive human interaction with the Internet space. The results obtained in the course of these studies can be used as a fundamental basis for solving many pressing problems in the field of intelligent Internet services, such as "Internet education", "Internet medicine", "Internet economy", etc., and also for ensuring national security and countering cyber threats and the ideology of extremism and terrorism through the development and creation of automated information and psychological security systems.

Acknowledgements. The reported study was funded by RFBR according to the research project N 18-29-22064\18.


References

1. Etkind, A.M.: Tsvetovoy test otnosheniy: Metodicheskiye rekomendatsii (Relationship color test: methodical recommendations). Leningrad: Leningradskiy nauchno-issledovatel'skiy psikhonevrologicheskiy institut im. V.M. Bekhtereva (1983). (in Russian)
2. Jang, S.-W., Lee, S.-H.: Harmful content detection based on cascaded adaptive boosting. J. Sens. 2018, 1–12 (2018). https://doi.org/10.1155/2018/7497243
3. Melnitskaya, T.B.: Informatsionno-psikhologicheskaya bezopasnost' naseleniya v usloviyakh riska radiatsionnogo vozdeystviya: kontseptsiya, model', tekhnologii (Information and psychological safety of the population under the risk of radiation exposure: concept, model, technology). Extended abstract of Doctor's thesis. Saint Petersburg (2009). (in Russian)
4. Park, N., Kim, Y.: Harmful adult multimedia contents filtering method in mobile RFID service environment. In: Pan, J.S., Chen, S.M., Nguyen, N.T. (eds.) Computational Collective Intelligence. Technologies and Applications. ICCCI 2010. Lecture Notes in Computer Science, vol. 6422. Springer, Heidelberg (2010)
5. Carlsson, U. (ed.): Regulation, Awareness, Empowerment. Young People and Harmful Media Content in the Digital Age. The International Clearinghouse on Children, Youth and Media, Nordicom, Göteborg University (2006)
6. The Center for the Study of Emotion and Attention, University of Florida. https://csea.phhp.ufl.edu

Soft Estimates for Social Engineering Attack Propagation Probabilities Depending on Interaction Rates Among Instagram Users

Anastasiia O. Khlobystova, Maxim V. Abramov, and Alexander L. Tulupyev

1 Laboratory of Theoretical and Interdisciplinary Problems of Informatics, St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, 14-th Linia, VI, No. 39, St. Petersburg 199178, Russia
{aok,mva,alt}@dscs.pro
2 Mathematics and Mechanics Faculty, St. Petersburg State University, Universitetskaya Emb., 7/9, St. Petersburg 199034, Russia

Abstract. The purpose of this article is to propose an approach to defining the parameters of a model for assessing the probability of success of a multi-pass social engineering attack by an attacker on a user. These parameters characterize estimates of the probability of propagation of social engineering attacks from user to user over one type of interaction. These estimates are related to the intensity of user interaction, information about which is extracted from data obtained from social Media. The article proposes an approach, based on the Khovanov method, to converting information about episodes of interaction between users of the social Media Instagram into estimates of the probability of the spread of a social engineering attack. The obtained results support social network analysis and serve as a basis for the subsequent analysis of possible trajectories of the spread of multi-pass social engineering attacks, allowing the simulation of social engineering attacks and automated calculation of estimates of the success of an attack along different trajectories. The novelty of the research lies in the application of a quantification method to social links in the context of social engineering attacks.

Keywords: Social engineering attacks · Soft estimates · Soft social computing

1 Introduction

In recent years, there has been a tendency for the number of social engineering attacks among cyber attacks to increase. This has been observed in the reports of large companies engaged in the investigation and analysis of such criminal acts [5,7]. Within the framework of this article, we consider a social engineering attack as a set of applied psychological and analytical methods which malefactors use for


motivating users of a public or corporate network to violate the established rules and policies in the field of information security [1]. Social engineering attacks can be carried out both directly (straight-through attacks) and through a chain of users (multi-pass attacks). We describe a multi-pass attack as an attack in which the target user and the user from whom the influence begins differ from each other [1]. In the present article we discuss aspects of estimating the probability of success of the spread of multi-pass social engineering attacks, connected with the extraction and analysis of data from social Media. One of the underlying purposes in this direction is the exploration of the psychological features and the character of the mutual relations between people which are manifested in social Media and can be analyzed automatically, as well as the correlation of these features with their mathematical models and their representation in a form comprehensible to a computing system. It should be noted that the comparison of the character of mutual relations with mathematical models in most cases demands the application of the following methods: fuzzy logic and soft social computing [9]. The purpose of the present article consists in the proposition of a mathematical model for estimating the probability of success of the spread of multi-pass social engineering attacks, whose parameters are based on the intensity of user interaction retrieved from the social Media Instagram.

2 Relevant Works

The groundwork for this work was laid in [1,11], which raised the issues of building a social graph based on estimates of the intensity of user interaction in the social Media VKontakte. In addition, we relied on works that considered the problem of increasing the level of protection of hidden confidential data of users of social Media [3,6]. In [14] the factors affecting the interaction of Facebook users in the context of marketing are analyzed. In [4,12] the signs of interaction in social Media were investigated. In [2] the issues of the dissemination of information in social Media are considered on the basis of Facebook. In [15] tie strength is predicted using quantitative estimates and machine learning methods. It is also worth noting that when processing expert assessments obtained during a survey, an important aspect is the quantitative assessment of the degree of consistency of expert opinions and the further convolution of assessments taking into account the weight of the criterion or expert opinion. For these purposes, the apparatus of fuzzy set theory can serve, in particular the methods proposed in [13].

3 A Study

3.1 Experiment

The first step in the current research was the selection of the components of a profile in the social Media Instagram that can in any way characterize the degree of intensity of user interaction. To confirm the choice of these components and for their further formalization, a corresponding sociological study was conducted. It should be noted that there are several types of interaction between users on Instagram. These types can be conventionally divided into groups of actions, such as mutual actions or one-sided actions. One-sided forms of interaction include subscribing to an account, "liking" a post, commenting on a post and being tagged in a photo/post. In addition, users of this network can put geotags and hashtags under their posts and acquire the account status of a famous person subject to special conditions. The coincidence of geotags or hashtags belongs to the group of mutual forms of interaction. The users of this social Media can access other accounts and see which of their followings are followers of an account. Based on the identified features, a survey was created to determine the probability of responding to a request, depending on the intensity of interaction recorded on Instagram. As answers, respondents were asked to set a slider on a scale from 0 to 100 in increments of 1, where a value of 100 corresponds to the maximum probability of performing some action if asked by a user who has the specified characteristics, and 0 corresponds to the fact that a request received from a user with these characteristics will not be executed under any conditions.

3.2 Research of Sequences

During the analysis of the survey results, it was noted that a number of respondents set equal values for two or more categories in every subgroup. In this regard, it was decided to determine the number of respondents who arranged users from different categories in the same order (more precisely, implicitly assigned them the same rank when ranking). For the sake of brevity, the proposed categories will be denoted as Qi. As a first step, all respondents' answers were arranged in descending order in every allotted group. It should be noted that equal valuations were taken into account. To automate this process, a program in C# using the Microsoft Excel Object Library was developed. The output data contain the generated answer order, a transcript of this order, and the number of times this order was found in the structured answers for each subgroup. In addition, if two or more values for some categories coincide, they are considered a separate order and highlighted in green or blue. A fragment of the program's output is presented in Table 1.
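The ordering-and-counting step can be expressed compactly. The Python sketch below is an independent illustration of the same idea (the authors' tool, written in C# with the Microsoft Excel Object Library, is not reproduced here); the slider answers for categories Q9–Q12 are hypothetical, and equal slider values form a tie group, corresponding to the bracketed positions in Table 1.

```python
# Sketch of the ordering-and-counting step: group answers by their induced
# ranking of categories and count how often each ranking occurs.
from collections import Counter

def order_pattern(answer: dict) -> tuple:
    """Categories in descending order of slider value; ties form one group."""
    by_value: dict = {}
    for cat, val in answer.items():
        by_value.setdefault(val, []).append(cat)
    return tuple(tuple(sorted(cats))
                 for val, cats in sorted(by_value.items(), reverse=True))

answers = [  # hypothetical survey answers for categories Q9..Q12
    {"Q9": 80, "Q10": 60, "Q11": 40, "Q12": 20},
    {"Q9": 70, "Q10": 70, "Q11": 50, "Q12": 10},  # Q9 and Q10 tie
    {"Q9": 80, "Q10": 60, "Q11": 40, "Q12": 20},
]
counts = Counter(order_pattern(a) for a in answers)
for order, n in counts.most_common():
    print(order, n)
```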

4 Methods of Research

4.1 Mathematical Formalization

On the basis of the data obtained, estimates of the probability of a social engineering attack's progression are built for the different episodes which characterise the interaction intensity of Instagram users. According to [1], the formula for calculating estimates of the probabilities of propagation of a social engineering attack

Table 1. Example fragment of the program's output

Order     Count  Categories
[1234]    26     Q9 Q10 Q11 Q12
[123]4    7      Q9 Q10 Q11 Q12
[23]14    5      Q10 Q11 Q9 Q12
3214      5      Q11 Q10 Q9 Q12
[23]41    4      Q10 Q11 Q12 Q9
1324      3      Q9 Q11 Q10 Q12
[34]12    3      Q11 Q12 Q9 Q10
4321      3      Q12 Q11 Q10 Q9
32[14]    3      Q11 Q10 Q9 Q12
[34]21    3      Q11 Q12 Q10 Q9
1[23]4    2      Q9 Q10 Q11 Q12
[23][14]  2      Q10 Q11 Q9 Q12

between users has the following form:

$P_{i,i+1} = 1 - \prod_{t} (1 - p_t^{i,i+1})^{n_t}$,

where $p_t^{i,i+1}$ is the probability of a successful social engineering attack from the malefactor to the user over connection $t$, $n_t$ is the number of episodes, and $P_{i,i+1}$ is the probability of successful attack progression from user $i$ to user $i+1$. This model is applied to the considered social Media. The characteristic of the connection $t$ is understood as one of the categories proposed in the survey, i.e. $t$ represents an element of the set $\{Q_1, \ldots, Q_{12}\}$. For further calculations, each type of connection $t$ is assigned its probability value $p_t$, based on experts' estimates. The assignment is made on the basis of the quantification methods proposed by Khovanov [8,10]. For this goal, a scale of potential probability values is set as a discrete set with step $N$: $\{0, \frac{1}{N}, \ldots, \frac{N-1}{N}, 1\}$. Let us make use of the results obtained during the research of sequences, namely the sequences themselves and the information about how many respondents implicitly ranked the categories in this way. According to [8,10], the expected value is found (using the introduced probability scale), and then the expected value of the expected value (using the number of answers forming the obtained sequence divided by the total number of answers). The categories presented in the survey are treated as random variables, i.e. $Q_i$ is understood in the context of probability as a random variable taking some probability value. The following notation is introduced: assume $k$ is the index of the considered sequence (corresponding to the row number in Table 1), and $r_k$ is the number of respondents who arranged the sequence in this way. Consider an example of the application of the quantification method: $k = 4$ (Table 1), i.e. $r_4 = 5$ respondents arranged the categories as follows: $Q_{11}^{(4)}, Q_{10}^{(4)}, Q_9^{(4)}, Q_{12}^{(4)}$. Assume the step of the scale of potential probability values equals $N = 3$: $\{0, \frac{1}{3}, \frac{2}{3}, 1\}$. Then each of the random variables $Q_i^{(4)}$, $i = 9, \ldots, 12$, can accept a single value.

For example, for $Q_{11}^{(4)}$: $P(Q_{11}^{(4)} = 0) = 1$, $P(Q_{11}^{(4)} = \frac{1}{3}) = P(Q_{11}^{(4)} = \frac{2}{3}) = P(Q_{11}^{(4)} = 1) = 0$.

Then the expected values are as follows:

$E(Q_{11}^{(4)}) = 0 \cdot 1 = 0$, $E(Q_{10}^{(4)}) = \frac{1}{3} \cdot 1 = \frac{1}{3}$, $E(Q_9^{(4)}) = \frac{2}{3} \cdot 1 = \frac{2}{3}$, $E(Q_{12}^{(4)}) = 1 \cdot 1 = 1$.

The expected value was calculated for each of the represented sequences. Then we find:

$\forall i \quad EE(Q_i) = \sum_{k=1}^{m} \frac{r_k}{n} \cdot E(Q_i^{(k)})$,

where $n$ is the number of respondents (in our case 88) and $m$ is the number of different sequences (corresponding to the number of rows in Table 1).

where n is the number of respondents (in our case 88), m is the number of the different sequences (corresponds to the number of table’s rows (Table 1)). Thus probability for connection Q11 equals, according to the described 47 = 132 ≈ 0.356. I.e. if only the connection t is the “common method: pi,i+1 t hashtags” (Q11 ) binds user i to user i + 1 and the malefactor try to attack three once, the probability for successful attack progression between them will be equal: 47 3 ) ≈ 0.733. Pi,i+1 = 1 − (1 − 132

5 Conclusion

Thus, with the help of the Khovanov method, soft estimates of the probability of the spread of a multi-pass social engineering attack were obtained for the episodes selected in Instagram that characterize the intensity of user interaction. The obtained results can serve for advanced social network analysis and as a basis for the analysis of possible trajectories of the spread of multi-pass social engineering attacks, allowing the simulation of social engineering attacks and the automated calculation of estimates of the success of an attack along different trajectories. In turn, this contributes to expanding the number of factors taken into account that affect the assessment of the security of users of information systems and, as a result, to increasing the level of protection of information systems from social engineering attacks.

Acknowledgments. The work was carried out as part of the project according to the RF Government Assignment SPIIRAS No. 0073-2019-0003, with financial support from the Russian Foundation for Basic Research, projects No. 18-01-00626 and No. 18-37-00323.


References

1. Abramov, M.V., Tulupyeva, T.V., Tulupyev, A.L.: Social Engineering Attacks: Social Networks and User Security Estimates. SUAI, St. Petersburg (2018). 266 p.
2. Beam, M.A., Child, J.T., Hutchens, M.J., Hmielowski, J.D.: Context collapse and privacy management: diversity in Facebook friends increases online news reading and sharing. New Media Soc. 20(7), 2296–2314 (2017). https://doi.org/10.1177/1461444818795487
3. Cai, Z., He, Z., Guan, X., Li, Y.: Collective data-sanitization for preventing sensitive information inference attacks in social networks. IEEE Trans. Dependable Secure Comput. 15(4), 577–590 (2018). https://doi.org/10.1109/TDSC.2016.2613521
4. Chang, H.T., Li, Y.W., Mishra, N.: mCAF: a multi-dimensional clustering algorithm for friends of social network services. SpringerPlus 5(1), 757 (2016)
5. Cybersecurity threatscape: Q2 2018. Positive Technologies. https://www.ptsecurity.com/upload/corporate/ww-en/analytics/Cybersecurity-threatscape-2018-Q2-eng.pdf. Accessed 15 Dec 2018
6. Edwards, M., Larson, R., Green, B., Rashid, A., Baron, A.: Panning for gold: automatically analysing online social engineering attack surfaces. Comput. Secur. 69, 18–34 (2017). https://doi.org/10.1016/j.cose.2016.12.013
7. High-Tech Crime Trends 2018. Group-IB. https://www.group-ib.com/resources/threat-research/2018-report.html. Accessed 26 Dec 2018
8. Hovanov, N., Yudaeva, M., Hovanov, K.: Multicriteria estimation of probabilities on basis of expert non-numeric, non-exact and non-complete knowledge. Eur. J. Oper. Res. 195(3), 857–863 (2009). https://doi.org/10.1016/j.ejor.2007.11.018
9. Kharitonov, N., Zolotin, A., Tulupyev, A.: Software implementation of algebraic Bayesian networks consistency algorithms. In: 2017 XX IEEE International Conference on Soft Computing and Measurements (SCM), pp. 8–10 (2017)
10. Khovanov, N.V.: Measurement of a discrete indicator utilizing nonnumerical, inaccurate, and incomplete information. Measur. Tech. 46(9), 834–838 (2003). https://doi.org/10.1023/B:METE.0000008440.41847.c7
11. Khlobystova, A.O., Abramov, M.V., Tulupyev, A.L., Zolotin, A.A.: Search for the shortest trajectory of a social engineering attack between a pair of users in a graph with transition probabilities. Inf. Control Syst. 6, 74–81 (2018). https://doi.org/10.31799/1684-8853-2018-6-74-81
12. Krakan, S., Humski, L., Skočir, Z.: Determination of friendship intensity between online social network users based on their interaction. Tehnički vjesnik 25(3), 655–662 (2018). https://doi.org/10.17559/TV-20170124144723
13. Madrid, N., Perfilieva, I., Kreinovich, V.: How to describe measurement uncertainty and uncertainty of expert estimates? In: Zadeh, L., Yager, R., Shahbazova, S., Reformat, M., Kreinovich, V. (eds.) Recent Developments and the New Direction in Soft-Computing Foundations and Applications. Studies in Fuzziness and Soft Computing, vol. 361, pp. 247–257. Springer, Cham (2018)
14. Maiz, A., Arranz, N., Fdez de Arroyabe, J.C.: Factors affecting social interaction on social network sites: the Facebook case. J. Enterp. Inf. Manag. 29(5), 630–649 (2016). https://doi.org/10.1108/JEIM-10-2014-0105
15. Mattie, H., Engø-Monsen, K., Ling, R., Onnela, J.P.: Understanding tie strength in social networks using a local "bow tie" framework. Sci. Rep. 8(1), 9349 (2018). https://doi.org/10.1038/s41598-018-27290-8

Development of the Complex Algorithm for Web Pages Classification to Detection Inappropriate Information on the Internet

Diana Gaifulina1,2(B) and Andrey Chechulin1,2

1 SPIIRAS, 39, 14th Liniya, St. Petersburg, Russia
{gaifulina,chechulin}@comsec.spb.ru
2 ITMO University, 49 Kronverksky Pr., St. Petersburg, Russia
[email protected]

Abstract. The paper is devoted to the investigation of web page classification algorithms for protection against inappropriate information on the Internet. An approach for combining classification algorithms based on different aspects of the source data and different machine learning methods is proposed. The results of experiments applying this approach to website classification are presented.

Keywords: Inappropriate information · Prohibited information · Internet · Data mining · Machine learning · Information protection · Web page classification

1 Introduction

Inappropriate information on the Internet is an information object or a set of objects containing illegal, questionable or malicious information. Protection of users from such information includes two main areas: the protection of minors from unwanted materials (parental control systems) and the blocking of content that violates the law. The large amount of information on the Internet, the high variability of content and the complexity of the structure of web pages complicate security measures. Analysis of web pages in order to detect inappropriate information requires not only blacklists and whitelists of web resources, but also automatic systems that classify web pages by their content. This paper proposes a comprehensive approach to web page classification in order to identify inappropriate information on the Internet. We organize the work as follows. Section 2 describes the general approach to web page classification for identifying inappropriate information. Section 3 presents experiments assessing the accuracy of different classification algorithms using machine learning methods. Section 4 explains the proposed complex algorithm for web page classification and an experimental assessment of its quality. Section 5 describes the obtained results and future investigation plans.

2 The General Approach to Web Pages Classification

The approach to training and using web page classification systems relies on processing raw web page data into pre-defined categories and on applying machine learning methods. Preprocessing maps the web page text to a logical representation, such as a weight vector. Training the classifier uses the resulting logical representation. Figure 1 shows the scheme of the general approach to automated web page classification.

Fig. 1. The general approach to web pages classification

Web pages are more complex than plain text documents: they are semi-structured using HTML tag markup, linked to each other, and contain code fragments executable on both the server and the client side. Typically, researchers use the following aspects of web pages to solve classification problems: (1) text content [4,5]; (2) the address of the host on the Internet (URL) [3,9]; (3) structural features (statistics of HTML tags) [6]; (4) objects that are not associated with text (e.g., media content); (5) information about the web page (information from WHOIS servers, age, country of incorporation); (6) links to this page; (7) links to third-party resources. Web page classification based on text content is the most widely used method; it consists of two steps.
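For illustration, three of the listed aspects (text content, URL tokens and tag statistics) can be extracted as follows; this is a sketch using BeautifulSoup, not the authors' implementation, and the function name extract_aspects is hypothetical:

```python
# Illustrative extraction of three page aspects; not the authors' code.
from collections import Counter
from urllib.parse import urlparse
from bs4 import BeautifulSoup  # assumes the beautifulsoup4 package

def extract_aspects(url: str, html: str) -> dict:
    soup = BeautifulSoup(html, "html.parser")
    parsed = urlparse(url)
    return {
        # (1) full-page text content
        "text": soup.get_text(separator=" ", strip=True),
        # (2) tokens of the host address (URL)
        "url_tokens": parsed.netloc.split(".") + [p for p in parsed.path.split("/") if p],
        # (3) structural features: statistics of HTML tags
        "tag_stats": Counter(tag.name for tag in soup.find_all(True)),
    }
```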


The first step is to pre-process (index) the text content so as to represent it as a set of terms (words) with weights computed using the TF-IDF measure. TF-IDF is a statistical measure used to assess the importance of a word in the context of a web page's text. The second step is the classification of a web page or training on a set of web pages. The input parameters are the characteristics of the web pages obtained in the first step, as well as a predetermined number of categories. The output parameter is the category of the analyzed web page.
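A minimal sketch of this two-step pipeline with scikit-learn is given below; the toy pages and categories are hypothetical placeholders, and the code is an illustration rather than the authors' implementation:

```python
# Step 1: TF-IDF indexing; step 2: training a classifier on the weight vectors.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

pages = ["poker casino bets jackpot", "prayer church faith scripture"]  # toy data
categories = ["gambling", "religion"]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(pages)        # terms weighted by TF-IDF
clf = LogisticRegression(max_iter=1000).fit(X, categories)

print(clf.predict(vectorizer.transform(["roulette jackpot odds"])))
```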

3 Classification Algorithms Assessment

As part of the investigation, we conducted an experimental evaluation of machine learning algorithms for solving web page classification problems. We implemented the developed approach to classification in the Python programming language and tested it on various data sets. We conducted experiments with data sets based on URLs from Shalla's Blacklists [2] and DMOZ [1]. We analyzed a sample of 17,248 web pages from categorized lists of web sites. The source data contain the following categories: (1) drugs/alcohol, (2) religion, (3) pornography, (4) aggression, (5) gambling, (6) weapons and (7) unknown result. We selected these categories because they form the most numerous URL groups for classifier training. We use the same number of web pages for each category, equal to 2464. The contents of the web pages include the full-page text and the text enclosed in individual HTML tags (the evaluated tags include, e.g., <Description> and <H1>). We take the following machine learning methods for evaluation: decision trees, support vector machines, the naive Bayesian classifier, and logistic regression, based on studies [7,8]. We use the following parameters (see the sketch after this list):

1. the number of web pages in the correct category (True Positive, TP);
2. the number of web pages incorrectly categorized as true (False Positive, FP);
3. the number of correctly dropped web pages (True Negative, TN);
4. the number of web pages in the incorrect category (False Negative, FN).
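From these four counts the standard quality metrics used below can be computed directly; the following sketch uses hypothetical values and is not tied to the authors' code:

```python
# Quality metrics derived from the four counts above; the values are made up.
tp, fp, tn, fn = 90, 10, 85, 15

accuracy = (tp + tn) / (tp + fp + tn + fn)   # share of correct decisions
precision = tp / (tp + fp)                   # correctness of positive labels
recall = tp / (tp + fn)                      # completeness of positive labels
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```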

In the experiments, we analyzed the quality of the classification algorithms by various numerical metrics: accuracy, recall, precision and confusion matrices. In this paper we give one metric (accuracy), which allows the most illustrative visualization of the experimental results. Figure 2 shows the accuracy of web page classification for each dataset according to the tags. Figure 3 shows the accuracy of web page classification for each of the dataset categories. This metric characterizes the ratio of the number of web pages on which the classifier made the right decision to the total number of web pages of all categories. Based on the results obtained, we can note that the support vector machine gives the highest classification quality for the full-page text (91%).


Fig. 2. Accuracy of web pages classification by tags

Fig. 3. Accuracy of web pages classification by categories

At the same time, other classifiers show better results for the texts included in the tags (the naive Bayesian classifier for the <Description> tag, 82%; logistic regression for <H1>, 75%). Thus, we conclude that using a single classifier is not sufficient to determine the category of unwanted information. We propose to develop a complex algorithm for web page classification based on the bagging technique. Bagging is a classification technique where the training and operation of all elementary (base) classifiers take place in parallel and independently of each other. Elementary classifiers do not correct each other's mistakes, but compensate for them.
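Such a parallel, mutually independent combination can be sketched with scikit-learn's VotingClassifier; the base classifiers mirror the four methods evaluated above, but the snippet is an illustration, not the authors' implementation:

```python
# Independent base classifiers combined by a (hard) vote; none of them
# depends on another's output, so they compensate for each other's mistakes.
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

ensemble = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier()),
        ("svm", LinearSVC()),
        ("nb", MultinomialNB()),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="hard",  # majority vote over the independent decisions
)
# ensemble.fit(X_train, y_train); ensemble.predict(X_test)  # hypothetical data
```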

4 Development of the Complex Classification Algorithm

The proposed complex algorithm of web page classification allows us to use the technique of parallel training of base classifiers on different aspects of web pages. At the stage of the final classification, the vote selects the decision of the classifier with the highest weight for the given characteristic. In addition, the developed algorithm introduces additional processing of text content in the form of machine translation from a foreign language. Translating the text content of web pages from the original language into the language that was used for training the classifiers makes it possible to classify web pages in any language supported by the automatic translation system, using classifier models trained on English web pages. There is no need to prepare additional classifiers to categorize web pages for each specific language. This eliminates the need to prepare additional training data sets and to keep them up to date.
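The translation step can be sketched as a thin preprocessing wrapper around the trained English-language models; here translate_to_english is a hypothetical stand-in for a machine-translation service call (the authors used Yandex's services), not a real API:

```python
# Hypothetical wrapper: translate first, then reuse English-trained models.
def translate_to_english(text: str) -> str:
    """Placeholder for a call to a machine-translation service."""
    raise NotImplementedError

def classify_any_language(text: str, vectorizer, classifier) -> str:
    english_text = translate_to_english(text)
    features = vectorizer.transform([english_text])
    return classifier.predict(features)[0]
```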


Figure 4 shows the conceptual scheme of the complex algorithm for web pages classification.

Fig. 4. Conceptual scheme of the complex algorithm for web pages classification

Training of each base classifier takes place separately on a subset of the characteristics of the web page. We determine the weight for each classifier based on the results of weighted voting (Voting classifier). We perform a weighted vote over the classification results for each characteristic of the web page and determine the category of the web page (Final classification). For the developed complex algorithm, we conducted a similar experiment to assess the quality of classification on the same datasets. We used the previously described classifiers as the base ones. We used Yandex's services for machine translation of web page content. Figures 5 and 6 show the accuracy of classification by the text content of web pages. The figures show that the proposed approach is generally characterized by high accuracy for several individual tags; using the data in these tags is considered to be the best way to classify web pages. The approach shows high accuracy for the categories too. Exceptions are sites of the categories "gambling" and "drugs/alcohol". The use of machine translation shows the best results for the categories "weapons" and "religion". In future work, we plan to add classifiers based on new aspects of web pages, which should improve the quality of classification, including for the exception categories.
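The final weighted vote can be sketched as follows, under the assumption that each base classifier returns a category for its aspect of the page and carries a weight fixed at training time; the aspect names and weights below are hypothetical:

```python
# Weighted vote: each aspect's classifier adds its weight to the category
# it predicts; the category with the largest total wins.
from collections import defaultdict

def weighted_vote(decisions: dict, weights: dict) -> str:
    scores = defaultdict(float)
    for aspect, category in decisions.items():
        scores[category] += weights.get(aspect, 0.0)
    return max(scores, key=scores.get)

print(weighted_vote(
    {"full_text": "weapons", "description": "weapons", "h1": "religion"},
    {"full_text": 0.91, "description": 0.82, "h1": 0.75},
))  # -> "weapons"
```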


Fig. 5. Assessment of accuracy of web pages classification by tags

Fig. 6. Assessment of accuracy of classification by categories

5 Conclusion

As a result of the investigation, we propose an approach to web page classification aimed at identifying inappropriate information on the Internet, based on the simultaneous use of heterogeneous aspects of web pages and on machine translation of the text content of web pages in foreign languages. The approach automates the preparation of input data and trained models. The algorithm for combining base classifiers into a general scheme uses the advantages of the individual classifiers and thereby neutralizes their limitations. We define the stages of processing web page data for the classification problem and present the results of experiments on web page classification algorithms with various machine learning methods and with the proposed algorithm. The experiments showed high classification accuracy for certain categories. This confirms the feasibility of using the technique in systems for blocking websites with inappropriate content. In further work, we plan to improve the algorithm to achieve higher classification quality by using additional aspects of web pages. We plan to compare the results of the developed algorithm with well-known methods that use the bootstrap paradigm and cascade amplification of weak classifiers.


Acknowledgments. The work is performed by the grant of RSF 18-11-00302 in SPIIRAS.

References

1. Open directory project (dmoz). https://dmoztools.net/. Accessed 17 July 2019
2. Shalla's blacklists. http://www.shallalist.de/. Accessed 17 July 2019
3. Khonji, M., Iraqi, Y., Jones, A.: Enhancing phishing e-mail classifiers: a lexical URL analysis approach. Int. J. Inf. Secur. Res. (IJISR) 2(1/2), 40 (2012)
4. Kotenko, I., Chechulin, A., Komashinsky, D.: Evaluation of text classification techniques for inappropriate web content blocking. In: 2015 IEEE 8th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), vol. 1, pp. 412–417. IEEE (2015)
5. Kotenko, I., Chechulin, A., Komashinsky, D.: Categorisation of web pages for protection against inappropriate content in the internet. Int. J. Internet Protoc. Technol. (IJIPT) 1(10), 61–71 (2017)
6. Novozhilov, D., Kotenko, I., Chechulin, A.: Improving the categorization of web sites by analysis of HTML-tags statistics to block inappropriate content. In: Intelligent Distributed Computing IX, pp. 257–263. Springer (2016)
7. Patil, A.S., Pawar, B.: Automated classification of web sites using naive Bayesian algorithm. In: Proceedings of the International Multiconference of Engineers and Computer Scientists, vol. 1, pp. 14–16 (2012)
8. Qi, X., Davison, B.D.: Web page classification: features and algorithms. ACM Comput. Surv. (CSUR) 41(2), 12 (2009)
9. Sara-Meshkizadeh, D., Masoud-Rahmani, A.: Webpage classification based on compound of using HTML features & URL features and features of sibling pages. Int. J. Adv. Comput. Technol. 2(4), 36–46 (2010)

Approach to Identification and Analysis of Information Sources in Social Networks

Lidia Vitkova(B) and Maxim Kolomeets

SPIIRAS, 39, 14th Liniya, St. Petersburg, Russia
{vitkova,kolomeec}@comsec.spb.ru

Abstract. The paper investigates an approach to selecting communicative leaders in a social network. The hypothesis is that the analysis of these leaders is sufficient for evaluating a social network community. An approach to selecting the communicative leaders is proposed. Experiments with several groups in the VKontakte network are performed and presented.

Keywords: Social network analysis · Leaders selection · Community evaluation · Fake profiles identification · Influencers identification

1 Introduction

Issues of human information security in online social networks (OSNs) are relevant to most countries of the world. Such areas of science as social network analysis and network science are relatively new for researchers; they became separate disciplines only in the 21st century [3]. Protection from misinformation and fake information, by contrast, takes its origins far back in the history of civilization. Today, scientists, researchers, government and commercial structures are increasingly interested in how misinformation can be recognized and fake profiles detected in a social network. They also often face problems such as identifying a cluster of fake likers and countering fake news. For this purpose, more and more tools and solutions appear, each with its own advantages and disadvantages. A common problem of many SNA algorithms today is the heterogeneity, volume and dynamics of social networks. If researchers are looking for fake news, they need to analyze the text. If they are looking for a cluster of bots, they analyze a lot of links between pages. When looking for a distribution channel, they need to analyze a lot of links and texts. In previous studies, the team of authors proposed an approach to identifying distribution channels in social networks [12]. But a distribution channel is a collection of elementary sources. Therefore, the authors set themselves the task of finding the optimal solution. The solution should identify and analyze sources of information in social networks while processing the minimum amount of data in monitoring systems. The scientific novelty of the proposed approach is that it requires neither link analysis and clustering at the preliminary decision-making stage nor text content analysis.


According to the proposed approach, the audience of the source is evaluated and segmented at the first step. The audience is divided into parts according to the degree of user activity at the source. For the identified parts, an assessment of the possibility of data collection (open or closed profile) is carried out. Next, the top communicative leaders with open profiles are selected, and their connectedness with each other is assessed. This approach makes it possible to identify the active part of the source's audience, which is what actually requires analysis, and to cut off those who are not indicative. This way of analyzing the source reduces the time for collecting and processing data in the monitoring system. It allows one to recognize abnormal behavior with minimal resource consumption and to direct attention to in-depth study. The paper is organized as follows. The second section provides an overview of the literature on relevant research. The third section contains a description of the proposed approach to segmentation of the user audience and to the identification and evaluation of communicative leaders. The fourth section presents the results of the experiments. The fifth section concludes the article by summing up the results and determines the directions of future research.

2 Related Work

Relevant works describing methods of audience analysis and information dissemination are included in the review, as well as approaches to identifying communicative leaders. Works on the detection of fake news, fake accounts and fake likers are also included. One source of inspiration for the methodology and research direction were works in the field of marketing and communication theory [10]. A targeted advertising campaign is an act of purposeful influence on a user in a social network. The task of dividing users into groups is acute and controversial in marketing and communication theory. In [13] the author proposes the 5W method. In his work, he divides the general audience into groups with a universal set of characteristics. He introduces 5 key questions about the consumer: (1) What (product); (2) Who (buyer); (3) Why (product benefits); (4) When; (5) Where. The versatility of the method proposed in [13] makes it very flexible for the marketing industry. Moreover, today all OSNs provide data on demographics and on the gender and age attributes of users, which can be used by commercial organizations. This technique allows us to look at the problem of analysis and evaluation of a source in a social network in a new way. In [15], the influence of bots in OSNs on the formation of opinion and on voting results during a referendum is studied. Large-scale data was collected during the Catalan referendum for independence on 1 October 2017. Nearly 4 million Twitter messages created by nearly 1 million users were analyzed. As a result, the audience was segmented into two polarized groups of Independentists and Constitutionalists. The paper quantifies the structural and emotional roles played by bots in OSNs. At the same time, managed accounts operate from the periphery of


the network. They have little connection to the "live audience" and target communicative leaders. Such studies suggest using audience analysis to counter the spread of unwanted information on social networks. In [5], the authors discuss an important question about the analysis of the dissemination of information in a social network in order to find the initiators. The proposed method is aimed at assessing the involvement of the audience and the influence of communicative leaders on the discussion and on the point of view of society. This approach can also be used to evaluate the distribution channel audience. The paper [2] analyzes the role of social networks in the dissemination of information on the Internet. The researchers conducted a large-scale field experiment that randomized exposure to signals about friends' information sharing among 253 million subjects in situ (on-site). As a result, the authors of [2] came to the conclusion that weak links have a greater impact on the dissemination of information, as they are incomparably more numerous. The work [6] notes that social networks are considered the fourth most popular source of access to information about emergency situations. The authors analyze the messages in OSNs during the 2016 flood in Louisiana and transform the emergency data of social networks into knowledge. Three audience segments were identified in [6]: (1) individuals, (2) emergency agencies, (3) organizations. The aim of the work was to identify ways to improve the efficiency of emergency organizations during emergencies using data obtained from social networks. However, this technique can also be applied in the analysis and evaluation of a source of information. A separate area is the methods of assessing the impact of fake news on the opinion of the audience. Experiments have shown that repeated exposure increases the probability that information is taken for truth [11]. In [14], researchers raise the issue of detecting fake news in social networks and discuss the prospects of data mining. They build on research [4,9] that adopted a definition of fake news: fake news contains false information that can be verified as such, and it is created with the dishonest intention to mislead consumers. The authors of [14] propose to include an auxiliary function - evaluation of the source (the user in this case). However, this approach makes it necessary to analyze social interaction. An actual problem for OSNs is the search for and detection of fake likers [1,8]. Fake likers change the context of a message, its popularity and rating, and add comments. The number of works devoted to the detection, analysis and evaluation of fake likers in social networks is growing steadily. Analysis of modern research in the field of SNA shows that, first of all, scientists solve methodological and algorithmic issues of detecting fake information, its sources and artificial distribution channels. Other popular areas are text classification, data clustering, search, and distributed storage. But it is becoming increasingly difficult to collect data for analysis from social networks (API access). At the same time, many methods and algorithms are limited in their computational capabilities due to the need to analyze and process dynamic objects, while


today it is necessary to quickly make an important decision or to determine the priority for the monitoring system. It is necessary to evaluate and categorize the source parameters for decision-making under different types of uncertainty. Uncertainty has traditionally been understood as a lack of awareness among decision makers and systems. This is an incomplete or inaccurate representation of the values of various source parameters, generated by various reasons and, above all, by incomplete or inaccurate information [7].

3 Proposed Approach

In the proposed approach, the authors start from the fact that, at the time of joining the space of social networks, the user is an individual or a representative of an organization. The user is always on the border between the virtual and the real world. In the process of finding an intruder, it is important to identify the profile through which the intruder performs the action. In the social network, profiles form an information environment. The task of identifying the offender in the social network can be divided into 2 subtasks: (1) search for profiles with influence; (2) identification of the offender among them. The first subtask can be solved with the help of the proposed approach. The second subtask is solved by the operator on the basis of data collected according to the proposed approach. For formalization purposes, we introduce the following concepts: (1) Person - an individual; (2) Company - a juridical person; (3) Social network Profile (SN Profile) - an account; (4) Subject - the owner of an account; (5) User - a subject that acts on the social network through an SN Profile; (6) Information object - information presented in the form necessary to store it and transfer it to other people; (7) Page - a page of the subject through which it distributes information; (8) Group - a page created by a subject to organize a private or public community; (9) Public - a page created by a subject to organize public media; (10) Event - a page created by a party to organize a private or public event; (11) Distribution channel - the set of pages and information objects through which the diffusion of information occurs; (12) Source - a set which includes page, group, public, event and channel; (13) Individual source - a Subject, an SN Profile, or an Information object; (14) Mass audience - users who have entered into a "connectivity" relationship with a group, with another user, or with an information object. Conceptually, the scheme of communication between social network objects can be represented as 4 basic levels: (1) source; (2) content; (3) channel; (4) destination. These levels allow us to analyze and evaluate the Sources. On each of these levels particular approaches and algorithms are used. The multitude of relations and actions of an SN Profile in social networks gives rise to noise. Because of this, the process of analysis and evaluation of the SN Profile is complicated. Therefore, it is necessary to define a set of features that make it possible to obtain data at the destination level.
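For illustration, the core of this terminology can be written down as a minimal data model; the field set is a sketch, not the authors' implementation:

```python
# Minimal data model for the introduced concepts; fields are illustrative.
from dataclasses import dataclass, field
from typing import List

@dataclass
class SNProfile:              # account, owned by a Subject
    profile_id: str
    is_open: bool             # open or closed profile

@dataclass
class Source:                 # page, group, public, event or channel
    source_id: str
    kind: str                 # "page" | "group" | "public" | "event" | "channel"
    audience: List[SNProfile] = field(default_factory=list)  # mass audience
```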


The relations between the community and the user are: the User has entered into a relationship; the User 'liked'; the User posted a comment; the User added a post. Relationships between users in the Group are: the User 'liked' the comment of another User in the Source; the User 'liked' the post of another User in the Source; the User shared the Source with another User (repost); the Sources have similar communicative leaders; the Sources have similar Information objects (posts, reposts, comments). Characteristic features of an SN Profile are: geographical features; gender features; behavioral signs; discrete features. Based on the analysis of the presented categories, we propose the following hypothesis: open data and incomplete information are sufficient to analyze and evaluate a Source in a social network. Let us choose the minimum number of features that will allow one to make a decision in the face of uncertainty and incompleteness of data. We assume that the basis for Source analysis is SN Profile segmentation by the following action: the User posted a comment in the Source, i.e. the User showed communicative activity. Collecting information about the sum of these indicators for each User allows one to select the communicative core and cut off the passive ones. The analysis and evaluation of the Source is carried out on the power of the core and the cohesion of the communicative leaders. Segmentation of the Source audience is carried out by separating Users in the range between the maximum and minimum number of actions manifested in the Source. For example, one User during the existence of the Source wrote 2000 comments to posts, and another only 1. The Users who perform this action most often compared to others are the most active. Segmentation of the Source audience is carried out into the following groups: (1) E-fluentials; (2) Sub-E-fluentials; (3) Activist; (4) Sub-Activist; (5) Observer; (6) Passive, according to the following algorithm (a compact sketch of it is given after the description). The first step highlights the "Passive" segment, which includes Users who have never shown communicative activity in the Source, that is, who did not write a single message. In the second step, which forms the communicative kernel, all User actions (messages) in the Source are summed up and the Users are sorted in descending order. The average value for the number of messages is calculated. All Users with activity below average belong to group (5), Observer. In the third step, the average value is calculated for the remaining most active Users. Those who are below average in the active group are cut off into (4), Sub-Activist. Cutting off segments (6), (5) and (4) can significantly reduce the number of Users in the Source whose actions have to be analyzed. In fact, there remains the most active link that forms the field of discussion and affects the dissemination of information in the Source.


The fourth step calculates the average value for the sum over the remaining groups (1), (2), (3). A group of activists who are not communicative leaders is singled out. In the fifth step, the remaining group of Users is segmented into (1) and (2), similarly to the previous steps. The sixth step assesses, for the identified group of communicative leaders, the possibility of collecting data on the SN Profile (open or closed profile). Next, the communicative leaders with open SN Profiles are selected, and their connections with each other and their profile data are evaluated. With an additional evaluation of the E-fluentials and the Activists, for example of the time of their likes to objects, the time of writing comments, and their relationships with each other and with other Users from the E-fluentials and Activist parts of the Source audience, it is possible to obtain information about their falsification and falsity in a short time.
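A compact sketch of steps 1-5 of this segmentation is given below; it assumes that `activity` maps a user id to the number of messages in the Source and that each average is taken over the users remaining after the previous cut (the authors' description read literally, not their code):

```python
# Iterative cutting by the mean of the remaining activity counts:
# Passive -> Observer -> Sub-Activist -> Activist -> Sub-E-fluentials -> E-fluentials.
def segment_audience(activity: dict) -> dict:
    segments = {"Passive": [u for u, n in activity.items() if n == 0]}
    remaining = {u: n for u, n in activity.items() if n > 0}
    for name in ("Observer", "Sub-Activist", "Activist", "Sub-E-fluentials"):
        if not remaining:
            segments[name] = []
            continue
        mean = sum(remaining.values()) / len(remaining)
        segments[name] = [u for u, n in remaining.items() if n <= mean]
        remaining = {u: n for u, n in remaining.items() if n > mean}
    segments["E-fluentials"] = list(remaining)
    return segments
```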

4 Experiment

The experiment shows only part of the proposed approach. However, it demonstrates that in the mass audience of a large Source, most subscribers are in the Observer or Passive state. That is, they have taken an action once or twice, or have simply joined the community, do not log in and do not read the page. The 3 largest Sources of information of the city of Kudrovo in the Leningrad region, located near St. Petersburg, were chosen as the object of the study. For the experiment, the three largest groups in the social network VKontakte were selected, in which the residents of Kudrovo discuss their city and its problems (see Table 1).

Table 1. Selected sources for research

Source                 Group ID   Members
No.1 Kudrovo here      58713406   101808
No.2 Life in Kudrovo   51766355   51439
No.3 Kudrovo 24        46193251   77155

The results of the analysis showed that discussions in "Life in Kudrovo" are 1000 times more active than in the 2 other Sources, while in terms of membership it is the smallest of them. Initially, information was obtained about the number of information objects on the community wall and the author IDs. Then information was received about the number of comments to each post, the IDs of the comment authors, and the time. Data collection at the sources was carried out for 2018. In accordance with the legislative ban on storage of such data, the data were deleted after collection and analysis.


According to the proposed approach, User activity was summed up and sorted in descending order. The obtained information was processed according to the proposed approach and is presented in Table 2. The analysis shows that E-fluentials differ greatly from Activists in the number of actions performed at the Source. As the research shows, the Source with fewer subscribers turns out to be the most active and effective in terms of user engagement. The experiment also revealed that the share of E-fluentials in all sources is less than 1%, so under uncertainty and incompleteness of data it is possible to predict the time and resource costs for analyzing and evaluating a Source.

Table 2. Segmentation of groups' members

                                 No.1 Kudrovo here    No.2 Life in Kudrovo   No.3 Kudrovo 24
                                 Amount    Percent    Amount    Percent      Amount    Percent
Maximum value                    205       100.00%    5578      100.00%      351       100.00%
Total records                    4494      100.00%    9343      100.00%      8806      100.00%
Amount of Observers (1)          3548      78.95%     7922      84.79%       6921      78.59%
Amount of Sub Activists (2)      673       14.98%     1143      12.23%       1432      16.26%
Amount of Activists (3)          189       4.21%      213       2.28%        333       3.78%
Amount of Sub E-fluentials (4)   52        1.16%      46        0.49%        81        0.92%
Amount of E-fluentials (5)       32        0.71%      19        0.20%        39        0.44%
Average of Observer (1)          1.40      0.68%      6.90      0.12%        1.21      0.35%
Average of Sub Activist (2)      10.48     5.11%      61.80     1.11%        6.50      1.85%
Average of Activist (3)          27.86     13.59%     256.47    4.60%        20.99     5.98%
Average of Sub E-fluentials (4)  53.46     26.08%     818.76    14.68%       49.38     14.07%
Average of E-fluentials (5)      111.44    54.36%     2831.95   50.77%       127.05    36.20%
Threshold between (1) and (2)    5.26      2.57%      29.05     0.52%        3.82      1.09%
Threshold between (2) and (3)    19.73     9.63%      152.53    2.73%        13.40     3.82%
Threshold between (3) and (4)    42.53     20.75%     525.54    9.42%        35.20     10.03%
Threshold between (4) and (5)    75.55     36.85%     1407.23   25.23%       74.63     21.26%

According to the proposed approach, the E-fluentials were marked as communicative leaders. In a next experiment their open profiles could be collected and a network analysis of their communications could be conducted. It should be noted that some of the Users included in the list of communicative leaders are not subscribers of any of the analyzed Sources. The experiment shows that the primary analysis and evaluation of a Source in a social network do not require a large amount of information and data. However, further verification of the proposed approach in other social networks with a different architecture is necessary. For example, Twitter, Facebook, and Instagram have their own features that require additional analysis.


5 Conclusion and Discussion

In the experiment, the separation of users into segments was carried out manually, but in the future it is planned to refine the approach. Each source has its own average indicators, according to which a user falls into one or another part of the audience. Nevertheless, the proposed approach allows us to evaluate a source and to obtain quick preliminary results. The approach makes it possible to identify communicative leaders who are not associated with the sources; selecting this group for further analysis can help identify a cluster of fake likers. An analysis of a user's discrete features, such as comment time and like time, will reveal additional characteristics indicating the working time of fake profiles. The information obtained allows one not only to analyze the source but also to assess the potential of the communicative leaders, based on the number of likes they received for their messages. To increase the speed of making an intermediate decision, the architecture of distributed collection and preprocessing of data obtained from social networks can also be optimized.

Acknowledgments. The work is performed by the grant of RSF No. 18-71-10094 in SPIIRAS.

References

1. Badri Satya, P.R., Lee, K., Lee, D., Tran, T., Zhang, J.J.: Uncovering fake likers in online social networks. In: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pp. 2365–2370. ACM (2016)
2. Bakshy, E., Rosenn, I., Marlow, C., Adamic, L.: The role of social networks in information diffusion. In: Proceedings of the 21st International Conference on World Wide Web, pp. 519–528. ACM (2012)
3. Barabási, A.-L., et al.: Network Science. Cambridge University Press, Cambridge (2016)
4. Conroy, N.J., Rubin, V.L., Chen, Y.: Automatic deception detection: methods for finding fake news. Proc. Assoc. Inf. Sci. Technol. 52(1), 1–4 (2015)
5. Jackson, S.J., Foucault Welles, B.: #ferguson is everywhere: initiators in emerging counterpublic networks. Inf. Commun. Soc. 19(3), 397–418 (2016)
6. Kim, J., Hastak, M.: Social network analysis: characteristics of online social networks after a disaster. Int. J. Inf. Manage. 38(1), 86–96 (2018)
7. Kotenko, I., Parashchuk, I.: Analysis of the sensitivity of algorithms for assessing the harmful information indicators in the interests of cyber-physical security. Electronics 8(3), 284 (2019)
8. Liu, Y., Liu, Y., Zhang, M., Ma, S.: Pay me and I'll follow you: detection of crowdturfing following activities in microblog environment. In: IJCAI, pp. 3789–3796 (2016)
9. Mustafaraj, E., Metaxas, P.T.: The fake news spreading plague: was it preventable? In: Proceedings of the 2017 ACM on Web Science Conference, pp. 235–239. ACM (2017)
10. Pasti, S., Gavra, D., Anikina, M.: New news media in Russia: what is new? Afr. Journalism Stud. 36(3), 33–60 (2015)


11. Pennycook, G., Cannon, T.D., Rand, D.G.: Prior exposure increases perceived accuracy of fake news. J. Exp. Psychol. Gen. (2018)
12. Pronoza, A., Vitkova, L., Chechulin, A., Kotenko, I.: Visual analysis of information dissemination channels in social network for protection against inappropriate content. In: International Conference on Intelligent Information Technologies for Industry, pp. 95–105. Springer (2018)
13. Sherrington, M.: Added Value: The Alchemy of Brand-led Growth. Springer, Cham (2003)
14. Shu, K., Sliva, A., Wang, S., Tang, J., Liu, H.: Fake news detection on social media: a data mining perspective. ACM SIGKDD Explor. Newsl. 19(1), 22–36 (2017)
15. Stella, M., Ferrara, E., De Domenico, M.: Bots sustain and inflate striking opposition in online social systems. arXiv preprint arXiv:1802.07292 (2018)

The Architecture of Subsystem for Eliminating an Uncertainty in Assessment of Information Objects' Semantic Content Based on the Methods of Incomplete, Inconsistent and Fuzzy Knowledge Processing

Igor Parashchuk1,2 and Elena Doynikova1,2(B)

1 SPIIRAS, 39, 14th Liniya, St. Petersburg, Russia
{parashchuk,doynikova}@comsec.spb.ru
2 ITMO University, 49, Kronverksky Pr., St. Petersburg, Russia

Abstract. The study considers the different levels of the architecture of a subsystem for eliminating uncertainty in the assessment of information objects' semantic content, based on methods for processing incomplete, inconsistent and fuzzy knowledge. It analyzes the possibility of their joint operation on the basis of multilevel functional, informational and logical interaction. A variant of the architecture based on a single integration platform is proposed. The results of the analysis of relationships between the different levels of the architecture for obtaining new knowledge related to the detection of features of unwanted, doubtful and harmful information under uncertainty are discussed.

Keywords: Architecture · Information · Uncertainty · Features · Level

1 Introduction

In the modern conditions of the information society, the tasks of protecting every citizen and society as a whole from unwanted, doubtful and harmful information (UDHI) get a lot of attention. The UDHI distributed on the Internet and in social networks can cause serious harm. There is a high probability that the UDHI can and will act as both a source and an environment for the dissemination of calls for acts of terrorism and ideological extremism. In addition, the UDHI may harm the mental and physical health of certain categories of citizens and may encourage them to illegal actions aimed at changing the political system and violating the integrity of the country. Therefore, protection of the citizens, society and the state from the UDHI is a problem of national importance, and it affects the information security of the country. The basis of protection from the UDHI is the development and use of tools and systems for content analysis, and of software and hardware tools for the detection, assessment and countering of information of this type. Practical experience in obtaining reliable estimates in the context of content analysis, nomenclature ratings and digital network content [5, 12] shows that objective analysis of the semantic content of information objects


is impossible without data processing and, consequently, without conclusions and knowledge derived under conditions of uncertainty, and without the use of algorithms for analyzing inconsistent and dynamic (changing) knowledge. The traditional systems for the practical implementation of the tasks of detection, assessment and adequate countering of the UDHI are the intelligent systems of analytical processing (ISAP) of digital network content (DNC). The common architecture of the ISAP of DNC includes the following elements: the subsystem for network information objects gathering and preliminary processing; the subsystem for multidimensional assessment and categorization of the semantic content of information objects; the subsystem for ensuring the timeliness of multilevel and multi-module analysis of information objects; the subsystem for eliminating an uncertainty of information objects' semantic content; the subsystem for adaptation and retraining of the information objects' analysis system, which should also work in exploitation mode; the subsystem for development and selection of countermeasures to the UDHI; and the subsystem for implementation of the visual interfaces. The process of assessment of an information object's semantic content is characterized by uncertainty because of the character of UDHI dissemination on the Internet and in social networks. Elimination of this uncertainty is one of the key tasks for the ISAP of DNC. The subsystem for eliminating an uncertainty (SEU) has its own architecture. The SEU mechanisms are implemented on the basis of advanced information object classifiers. These classifiers are enhanced with mechanisms for processing incomplete, inconsistent and fuzzy knowledge. The architecture and implementation features of the SEU based on the methods for processing incomplete, inconsistent and fuzzy knowledge are the subject of this study. The paper is organized as follows. Section 2 considers the related works. Section 3 proposes a variant of the architecture of the SEU for the assessment of an information object's semantic content. Section 4 discusses the experiments investigating the relationships between the different levels of the SEU architecture and the case studies of application of the SEU architecture in the common architecture of the ISAP of DNC. Section 5 describes the obtained results and plans for future research.

2 Related Work

A number of works consider the issues of protection from the UDHI. Some works describe the foundations of content analysis [5, 12]. They are devoted to the statement of particular methods and mechanisms for UDHI detection and counteraction in network digital objects. In some cases, researchers use methods of content analysis based on web page classification [16]. Some studies research the questions of optimizing the process of classifier training [18]. In [9] the Naive Bayes classifier is used for web page classification system learning in the scope of separate groups of internal features of HTML documents. Such approaches use models of web page feature presentation, learning models, as well as methods for combining individual solution models (classifiers). But the listed approaches are aimed at analyzing web content and do not allow controlling the entire space of UDHI sources. The paper [2] considers the classification of web content topics with the goal of searching for the UDHI based on URLs.


But this approach narrows the range of UDHI features available for analysis. The paper [4] proposes hierarchical classification and combined classification of web content based on links. However, such methods should be used in combination to maximize the efficiency of their application. The research [6] suggests a method of categorization of a web page's text using support vector machines. However, this way of UDHI detection is laborious and requires additional computational costs. The method proposed in [7] is based on the identification of UDHI features on web pages using indirect signs. This approach allows hiding the researcher's interest in a particular web page, but it requires collecting additional data about the web page. In [3] an approach based on extracting the meaningful text from tags is proposed. The classifier is applied to the obtained samples, which takes a lot of time. Thus, the problem of analyzing web pages to protect against inappropriate content on the Internet is still relevant [8]. Another important research area is social network analysis [1, 13, 17]. But the UDHI sources are not only social networks. The stage of processing and estimation of the UDHI features under different conditions of uncertainty is essential. It requires using modern methods, models, techniques and algorithms, for example, the methods of fuzzy set theory [11], of artificial neural networks [14, 15] and of neuro-fuzzy networks [10]. But objective analysis of the semantic content of information objects for UDHI detection and counteraction is a complex process. It requires using such mechanisms in combination, with the involvement of additional expert data and of inconsistent and dynamic knowledge. All this makes relevant the modification of existing techniques of poorly formalized knowledge processing for UDHI detection and counteraction. In its turn, this makes relevant the task of searching for new variants of the architecture of the SEU for the assessment of an information object's semantic content based on methods for processing incomplete, inconsistent and fuzzy knowledge.

3 The Variant of Architecture

Development of the architecture of the SEU for assessment and categorization of an information object's semantic content is one of the key stages of building the ISAP of DNC. The uncertainty (fuzziness, incompleteness and inconsistency) is the result of UDHI unsteadiness; of the fuzziness, incompleteness and inconsistency of such information's features; of the dynamics of the ISAP of DNC operation; of the effects of destabilizing factors and of factors of the external environment; of the uncertainty of goals and the inconsistency of the tasks of UDHI detection and counteraction, etc. This defines the necessity of solving the task of optimal assessment of an information object's semantic content considering the conditions of both multiple criteria and uncertainty. We analyzed the approaches to making evaluation decisions and the types of uncertainty related to the assessment of an information object's semantic content. The analysis showed that currently the algorithms of assessment under the conditions of non-stochastic uncertainty, namely fuzziness and incompleteness of initial information, are underdeveloped. The component levels of the common architecture of the SEU are developed to reduce this uncertainty. We understand a subsystem architecture as its principal organization embodied in the elements, their relationships with each other and with the environment, and the principles that guide the design of the subsystem and its evolution. The proposed variant of


SEU common architecture includes the following basic components: the component of collecting data on the UDHI features as the result of information objects analysis; the component of preprocessing and categorization of the UDHI features' uncertainty type; the component of expert information input and processing; the component of solving the problems of uncertainty elimination (fuzzy set system solving and implementation of the algorithms of neural network identification); and the component of output of UDHI features with eliminated uncertainty (suitable for the parametric (reliable) assessment and categorization). We propose the prospective SEU for the assessment of an information object's semantic content for UDHI detection and counteraction. The common scheme of the interconnection of the edges (levels) of its architecture is provided in Fig. 1. The architecture is based on the multilevel functional approach to the analysis of the interconnection of the SEU edges.


Fig. 1. Common scheme of levels (edges) interconnection of SEU architecture

The first level of the architecture is responsible for the functional components of the SEU. It contains all five components listed above. The second level of the architecture is responsible for the specific subject areas, i.e. the UDHI types. It contains the SEU operation areas that eliminate the uncertainty of the assessment of unwanted, doubtful and harmful information. The third level of the architecture is responsible for the modern methods, models, techniques, algorithms, technologies and services for converting poorly formalized data into a formal form. It contains the sublevels for eliminating two types of uncertainty, namely fuzzy and incomplete data. In fact, the third level is an independent 2-tier architecture. It includes the sublevel for eliminating the fuzziness of assessment and categorization using fuzzy set theory, and the sublevel for eliminating the incompleteness of the attributes of the semantic content of the evaluated information objects using the theory of artificial neural networks. Obviously, within this architecture (Fig. 1), the component of expert information input and processing interacts, i.e. exchanges information and control commands, with the database of expert information.
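As an illustration of the first sublevel, a fuzzy-set treatment of a single normalized UDHI feature score might look as follows; the triangular membership functions and their breakpoints are hypothetical, not the subsystem's actual rules:

```python
# Hypothetical membership functions for a normalized UDHI feature score in [0, 1].
def mu_low(x: float) -> float:
    return max(0.0, min(1.0, (0.5 - x) / 0.5))

def mu_high(x: float) -> float:
    return max(0.0, min(1.0, (x - 0.5) / 0.5))

score = 0.7
print({"low": mu_low(score), "high": mu_high(score)})  # degrees of membership
```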



Fig. 2. The relationships between sublevels, operations, content analysis models and sources of information objects within the overall SEU architecture

4 Case Studies of the Relationships Between Different Levels of Architecture, Process Flow and Discussion

Let us consider and analyze the relationships between the different levels of the proposed SEU architecture, and provide case studies of the application of the SEU architecture in the common architecture of the ISAP of DNC. Within the common architecture, several additional sublevels should be specified. These sublevels are responsible for the technologies, networks and services that can participate in the dissemination of information objects with the UDHI. One of these sublevels contains the following services: traditional public telecommunication network services; Internet services; social network services (Fig. 2). The sublevel of traditional public telecommunication network services implements information monitoring and the collection of data on the UDHI features related to Wi-Fi, networks of mobile operators, etc. The sublevel of Internet services implements the collection of data on the existence of UDHI features in the semantic content of objects existing in TCP/IP networks and implemented in cloud technologies and the Internet of Things. The sublevel of social network services implements the collection of data on the existence of UDHI features in the information object's semantic content. Here the objects that exist in social networks of various types and are built using different network and information technologies are considered. The SEU components can be grouped into three classes of practical tasks: data analysis tasks (collecting data on the UDHI features, their generalization, normalization, preliminary correlation and categorization of the UDHI features' uncertainty type), reliability improvement tasks (analysis, input and processing of the expert information), and applied tasks (output of data with eliminated uncertainty). Thus, within the common architecture of the SEU, the data on UDHI features are generated on the level of content analysis objects. Further, these


data go to the level of data analysis tasks for preprocessing. Here they are analyzed, converted (calculated) and identified using the reliability improvement tasks. Finally, the data are forwarded to the required elements of the class of applied tasks. These elements are responsible for the effective assessment and categorization of the information object's semantic content. All levels, models and algorithms of the common SEU architecture are combined in a unified system. This system allows using the methods of processing of incomplete, inconsistent and fuzzy knowledge for UDHI detection and counteraction. While considering the common architecture of the SEU taking into account the considered services, it should be noted that the interconnections of these services and subject areas, namely the UDHI types, are not always full scale.


Fig. 3. The common scheme of the uncertainty eliminating process within the common architecture of intelligent digital network content analytical processing system

There are various approaches to constructing systems of network content analytical processing and incident management systems that are applicable in practice. However, these systems cannot ensure the stable and efficient work of a complex information security system and reliable assessment and categorization of the UDHI. We take as a basis the process of detection and prevention of incidents for IT infrastructure from the ITIL standard. In this case, the scheme of application of the SEU architecture and the processes it implements in the common architecture of the ISAP of DNC can be represented as shown in Fig. 3. This scheme characterizes the architecture as a logical interconnection of subprocesses within a single process for analyzing network content. It incorporates the following stages: registration of the UDHI features; solving the uncertainty of the UDHI features; and closing the problem of the UDHI features' uncertainty. Constructing a single architecture of the complex system on the basis of a single integration platform is considered the modern approach. The single integration platform


combines all subsystems and processes. From this point of view, we can consider the variant of the architecture based on the single integration platform that combines all the processes implemented by the ISAP of DNC with the processes aimed at eliminating the uncertainty of the assessment of an information object's semantic content. In this case the SEU can include the following components: an integration platform; the software and hardware tools for expert information processing; the software and hardware tools for eliminating uncertainty; a repository of information on the UDHI features and their characteristics; the analytical and reporting tools; and the management tools and custom user interfaces. The integration platform is supposed to be the core of both the SEU and the ISAP of DNC as a whole. It implements the functions of integration and interconnection of all the components of our system. The main goal of the integration platform is to ensure clear and timely coordination and interaction of the components responsible for the reaction to the UDHI. The next stage of research is the development of the specific algorithms and software prototypes of the tools for eliminating the uncertainty of the assessment of an information object's semantic content on the basis of methods of incomplete, inconsistent and fuzzy knowledge processing.

5 Conclusion

The paper proposed an approach to building the architecture of a subsystem for eliminating the uncertainty of an information object's semantic content using the methods of processing of incomplete, inconsistent and fuzzy knowledge. The approach is based on multilevel functional, informational and logical data interaction. We proposed a variant of the common SEU architecture and of the subsystem for eliminating the fuzziness, incompleteness and inconsistency of the analysis of the UDHI features. We also developed the common scheme of the uncertainty elimination process in the common architecture of the intelligent analytical system for network content processing. On the basis of the proposed scheme we developed an additional variant of the SEU architecture based on the single integration platform. We suppose that the proposed variants of the architecture will allow constructing a rational intelligent subsystem for eliminating an uncertainty. The paper demonstrated and discussed the examples of logical and informational relations between the different levels of the architecture for obtaining new knowledge related to the detection of UDHI features under uncertainty. Further research will be related to uncertainty elimination while detecting and processing information of this type.

Acknowledgements. The research is being supported by the grant of RSF #18-11-00302 in SPIIRAS.


The Common Approach to Determination of the Destructive Information Impacts and Negative Personal Tendencies of Young Generation Using the Neural Network Methods for the Internet Content Processing

Alexander Branitskiy1(B), Elena Doynikova1, Igor Kotenko1, Natalia Krasilnikova2, Dmitriy Levshun1, Artem Tishkov2, and Nina Vanchakova2

1 SPIIRAS, 39, 14th Liniya, St. Petersburg, Russia
[email protected], [email protected]
2 SPbSMU, 39, 14th Liniya, St. Petersburg, Russia
[email protected]

Abstract. The paper considers determination of destructive information impacts and personal tendencies of the young generation that predispose them to uncritical comprehension of content with destructive components. An application of traditional manual and semi-automatic methods seems ineffective because of the huge amount of information in the Internet space. The paper proposes an approach using the technologies of psychological examination and artificial intelligence. It incorporates a technique to determine the tendency of social network users to acquire destructive information, a technique for classification of social network communities considering the existence of destructive impacts, and a technique for hypothetically detecting changes in the tendency of users to acquire information that may contain destructive components when interacting in social networks. The paper describes the experiments on highlighting the relation between the information that users provide in social networks and some of their psychological traits and states that may cause a predisposition for non-critical acquisition and digestion of potentially destructive information.

Keywords: Social networks · Neural networks · Content processing · Destructive content · Destructive impact · Ego-structure test

1 Introduction

Nowadays the Internet space is one of the most popular communication forms for the young generation. At the same time, it is the main environment for disseminating destructive information impacts.


We understand a destructive information impact as an impact on a person or a mass audience that can have a negative effect on the psyche, the emotional and volitional sphere of the individual, and his/her value system. We also understand a destructive information impact as an impact that can provoke aggressive actions and aggressive behavior (in relation to others or oneself). The global goal of our research is the identification of such impacts and the monitoring of their possible influence on the younger generation. Achieving it demands solving certain particular problems, such as identification of the personal traits of social network users that can predispose them to uncritical comprehension of content with destructive components, and naming the types and forms of possible destructive information.

The initial hypothesis that formed the basis of our approach is that the information on users' pages in social networks can serve to hypothetically determine the user's tendency to non-critical acquisition of destructive information that may potentially cause destructive behavior. An approach based on the analysis of social network data is proposed. It includes three groups of techniques: (1) a technique to determine the social network user's tendency to acquire destructive information; (2) a technique to classify social network communities considering the presence of destructive impact; (3) a technique to detect changes in the tendency of users to acquire information that may contain destructive components when interacting with communities in social networks. We suggest using artificial neural networks and other machine learning methods to implement this approach.

In this study we conducted experiments to test the hypothesis that there may be a correlation between the information that users provide in social networks and their tendency to acquire destructive information that can potentially cause destructive behavior. According to our hypothesis, the presence of certain personal traits can raise sensitivity to destructive content. Thus, the study of these traits can be used for defining the further risks of an inclination to react to further stimuli. The experiments consist of the manual determination of some psychological traits and states using Ammon's test, and of the determination of the representatives' tendency to acquire destructive information by experts using the information that these representatives provided in social networks.

The main challenge in identifying destructive impacts in the Internet space is the huge amount of information. We argue that the proposed approach will reduce the amount of analyzed information for experts and will allow them to focus on automatically ranked objects.

The paper is organized as follows. Section 2 reviews related work. Section 3 considers the suggested approach. Section 4 describes the implemented prototypes and experiments. Section 5 contains the conclusions and the future work directions.

2 Related Work

The period of adolescence is a time when a person can be highly affected by various types of destructive impact and searches for his/her social "Self" [7].


Surrounding people, society, and informational processes can act as carriers of destructiveness and agents of destructive influence. At the same time, it is important to understand that destructive stimuli can highly affect the personality of a person who is in a state of frustration or crisis. Thus, the idea of searching for a correlation between informational streams that may cause a negative effect on the psyche and their perception by a person appears to have potential.

To analyze aggression and destructiveness the ideas of Erich Fromm are widely used. He distinguished a benign kind of aggression (that is justified from the ethical point of view) and a malignant kind of aggression (a destructive eagerness to oppress and hold under control, to frighten and to terrify, to hurt in order to get satisfaction and pleasure) [2]. This research aims at detecting malignant aggression and the informational streams that may cause it. It demands an in-depth study of various prerequisites of aggressive behavior and of the various types of social and cultural impact that can cause it [5,7].

Informational processes that can influence consciousness, identity construction, and the forming of potential destructive behavior can unfold in the informational streams that are shared between people and various social groups [18]. Hypothetically, the resulting vector of a person can be formed by constructiveness, destructiveness, and their representation in the inner world of the person, his/her activity, social choices, and preferences, including choices of informational streams. Speaking of constructiveness and destructiveness, it is worth mentioning that these phenomena are extremely complex. One can hypothesize that certain states and traits of a person can meet certain destructive influences, resulting in possible displays of destructive behavior. Of course, an in-depth study of the processes that arise when a person meets certain informational stimuli should be conducted.

The informational streams of the Internet can act as one of the forms of consciousness manipulation. They can be used to form a virtual society, a virtual space, and indirect forms of communication. This can be accompanied by a feeling of loss of responsibility, the illusion of total freedom and impunity, and emerging abilities to realize socially troubled behavior [14]. In this space one may distinguish direct destructive impacts and intermediated impacts aimed at forming destructive convictions, a destructive social position and other aspects of consciousness that may have a destructive character [16]. The targets of this impact may be the following: destruction of meanings in the value system, knowledge, and activities of a person [19].

Regular examinations aimed at detecting psychological disorders make it possible to carry out timely preventive measures and thereby preserve a person's health. At present, specialized tests and expensive equipment are used to perform such procedures, which makes this process time-consuming. To solve this problem, artificial neural networks can be applied, as they are widely used in solving object classification problems. Monitoring the psychological state of Internet users has attracted the attention of many IT specialists. Lin et al. [9] explore the usage of convolutional neural networks to identify psychological disorders. The paper [10] is devoted to solving the problem of classification of clinical diagnoses based on various medical indications.


Investigations of sentiment analysis for Internet users were presented in [3]. The authors of [3] use a multi-level scheme for constructing a feature vector: the analysis uses information extracted both from individual character sequences and from higher-level syntactic structures, namely sentences. A large amount of useful information suitable for constructing a person's psychological profile can be obtained from the account of the corresponding user, posted on his personal page on the social network. The authors of [13] consider the issue of building a system for recognizing the psychological profile and character traits of a person using his photos from the Facebook social network. An approach to finding groups in social networks is presented in [12]. One of the practical purposes of this approach is to highlight common interests in the studied group of people and to construct an average psychological profile of a participant of the detected group. The authors of [11] investigate the level of emotional impact produced on a person when viewing the content presented in digital images. In [4] it is proposed to consider the genetic algorithm as a heuristic tool for solving an NP-complete problem, namely clustering a fragment of a social network. In [6] a Bayesian approach is proposed to determine the number of clusters within a social network. In [17] the problem of predicting reposts in the social network Twitter is solved. The issues of protecting users from malicious and unwanted information were discussed in [8]. One of the targeted purposes of the approach developed within the considered subject area is to reduce the cases of access of the younger generation to illegitimate information.

3 Common Approach

We propose an approach to the determination of destructive information impacts and personal tendencies of the young generation based on the analysis of social network data. The approach includes three main stages, namely: (1) search for psychological states and traits of social network users that can affect the acquirement of destructive information; (2) classification of social network communities considering the existence of destructive impacts; (3) detection of possible changes when interacting with communities in a social network. We propose certain ways of developing the techniques that implement the corresponding stages. The elements of the techniques for the first and second stages are similar, therefore we will consider them jointly, focusing mainly on the first stage. The technique that implements it involves training a neural network to rank the web pages of social network users considering the possible destructive impact of the information. At the same time, the technique that implements the second stage involves training a neural network to rank the web pages of social network communities considering the destructive impact of the information that is shown there. Learning is based on a pre-formed sample and the determined features. Within the developed techniques two modes are distinguished: manual and automated ones.


Manual Determination of Destructive Impacts. In this study we selected the scale that was proposed in [20]. It includes three basic categories: intra-personal, inter-personal and meta-personal behavior. Each of these categories is divided into two sub-categories. Intra-personal destructiveness is destructiveness aimed at self-destruction; there are mild and strong forms of possible changes. Inter-personal destructiveness means a relation of two subjects where one tends to downgrade the other (the first sub-category) and the second acts as a victim (the second sub-category). Meta-personal destructiveness implies the interaction between a person and a social group and can be divided into opposition against the group's principles and foundations, and excessive upholding of the interests of some social group. The classification of communities can be made on the basis of the same scale that is used to classify users, but with manual marking of the corresponding categories based on information taken from the community page.

Let us note that for the further neural network training we should determine the key features that experts use to make a decision. We map these features onto the information provided in the social networks. Examples of these features are the character of the text information and multimedia, the user's activity in the network, the amount of open/closed information, participation in communities, etc. The developed technique supposes adjustment of the features according to the learning results of the neural network.

In addition to the scale proposed in [20], we use Ammon's test for the determination of the central personal functions of social network users within the developed technique. This test allows determining the constructive, destructive and deficient manifestations of certain Ego-functions highlighted by Ammon [1]. It was selected because of its orientation toward the dynamic study of personality. Besides, this test was adapted and validated for application on a Russian sample. Its foundation is the personality concept proposed by G. Ammon, which supposes that the individual has the following Ego-functions: aggression, anxiety, external restriction, internal restriction, narcissism, and sexuality. It allows one to systematically evaluate the personality structure in the complex of both healthy and pathologically changed aspects (i.e. destructive or deficient). The experimental data obtained using this test should be analyzed comprehensively for each individual. Both the ratio of indicators for the constructive, destructive and deficient components of each Ego-function and the dynamics of their interaction with the social environment ("field") should be considered [7]. The results of this test are also mapped to the information in the Internet content and are used as additional features for the neural network learning.

Automated Determination of Information that Can Cause Destructive Impact. The automated mode includes three steps. The first step is to collect input data from the social profiles of Internet users. The second step, namely feature extraction, is aimed at (1) calculating the numerical parameters, (2) creating a set of the most commonly used words and expressions for each of the selected categories of destructiveness, and (3) forming the low-level binary data streams. Among the numerical parameters we can specify the number of likes and comments, the average size of published posts, etc.


To analyze textual information, it is supposed to use information about the most common words describing each category of destructiveness and their semantic similarity. In addition, to improve the quality of person type classification, a deep convolutional network was chosen which accepts raw data from multimedia files as inputs. The third step involves training the neural network classifiers and identifying destructive personalities. As basic classifiers, we can use principal component analysis, a support vector machine, linear regression, a convolutional neural network, ImageNet and the two-layer neural network word2vec. Figure 1 shows schematically the indicated steps of the proposed technique in an automated mode.

Fig. 1. The technique for automated determination of information that can cause destructive impact: step 1 — collecting the input data from social profiles of Internet users; step 2 — extracting the features characteristic for each group of personality traits that can hypothetically lead to some forms of destructive behavior; step 3 — training the classifiers and identifying the group of personality traits that can hypothetically lead to some forms of destructive behavior.
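To illustrate steps 2 and 3 for the textual features only, the following sketch trains a per-category classifier on expert-marked posts. It assumes the scikit-learn library and hypothetical data; a linear support vector machine is used here as one of the basic classifiers listed above, and this is not the authors' actual implementation.

```python
# Hedged sketch of steps 2-3 for textual features: TF-IDF stands in for the
# "most commonly used words and expressions", and a linear SVM is one of the
# basic classifiers named in the text. Assumes scikit-learn; all names and
# data are illustrative, not the authors' dataset or code.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC


def train_category_classifier(posts, labels):
    """Train a per-category destructiveness classifier on expert markup.

    posts  -- list of wall-post texts collected from user pages
    labels -- expert markup: 1 if the destructiveness category is present
    """
    x_train, x_test, y_train, y_test = train_test_split(
        posts, labels, test_size=0.3, stratify=labels, random_state=0
    )
    model = Pipeline([
        ("tfidf", TfidfVectorizer(max_features=5000, ngram_range=(1, 2))),
        ("svm", LinearSVC()),
    ])
    model.fit(x_train, y_train)
    # Evaluate on the held-out split before using the model for ranking.
    print(classification_report(y_test, model.predict(x_test)))
    return model
```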

4 The Prototypes, Experiments and Discussion

Currently we have developed the prototypes and conducted the experiments related to the first stage of the proposed approach. The final goal of these experiments is to specify the correlation between the personality traits of persons and the information on users' web pages in social networks (its existence is shown, for example, in [15]). The experiments included manual markup of users' profiles in social networks using the provided information, considering personality traits that can hypothetically lead to some forms of destructive behavior; testing of the same users using Ammon's test; and collecting the input data (information on the web pages of users).

The first experiment is manual and consists of the analysis of open web pages in social networks by the experts. The goal of this experiment is to get the first set of features that can signal the personality traits of persons and that will be used for the neural network learning. The second experiment has the same goal but is based on another mechanism. To conduct it we developed a prototype: a social network application that implements Ammon's test to simplify the test passing procedure. Currently we have collected the test results for 588 users. The obtained average values for each type of Ego-function are presented in Table 1.


Table 1. Average values for the constructive, destructive and deficient Ego-functions

Ego-function / Type      Constructive     Destructive    Deficient
Aggression               45.08 ± 12.99    56.3 ± 9.9     54.09 ± 2.18
Anxiety                  45.08 ± 12.99    56.3 ± 9.9     54.09 ± 2.18
External I-restriction   45.08 ± 12.99    56.3 ± 9.9     54.09 ± 2.18
Internal I-restriction   45.08 ± 12.99    56.3 ± 9.9     54.09 ± 2.18
Narcissism               45.08 ± 12.99    56.3 ± 9.9     54.09 ± 2.18
Sexuality                45.08 ± 12.99    56.3 ± 9.9     54.09 ± 2.18
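The following sketch shows how the "mean ± standard deviation" cells of such a table can be computed from raw per-user scale scores; numpy is assumed, and the score values are invented solely to show the shape of the computation, not real survey data.

```python
# Hedged sketch: computing "mean ± std" cells of Table 1 from raw per-user
# Ammon test scale scores. Assumes numpy; the scores below are made up.
import numpy as np

# rows: users; columns: (constructive, destructive, deficient) scores
# for one Ego-function, e.g. aggression
scores = np.array([
    [42.0, 58.1, 55.0],
    [47.5, 54.9, 53.2],
    [45.7, 56.0, 54.1],
])

means = scores.mean(axis=0)
stds = scores.std(axis=0, ddof=1)  # sample standard deviation

for name, m, s in zip(["constructive", "destructive", "deficient"], means, stds):
    print(f"{name}: {m:.2f} ± {s:.2f}")
```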

With the help of the JavaScript SDK we collect the test results in JSON format and transfer them to a MySQL database. It should be especially noted that Table 1 represents average values for a limited group. In each case, it is necessary to consider a person in all the variety of his personal manifestations. Thus, to get the personality traits that can hypothetically lead to some forms of destructive behavior, we need to conduct a deeper analysis of the obtained data.

The third experiment consisted of collecting data from the social network VKontakte. We implemented a system for collecting the data (features' values) from the web pages of the social network's users (for the further specification of the features that can signal the personality traits that can hypothetically lead to some forms of destructive behavior, and for the neural network learning). We have collected and analyzed the following data: 714 profiles; 668 friends lists and their short profiles; 633 group lists and short information about them; 580 follower lists and their short profiles; 572 subscription lists and short information about them; 517 wall post lists, information about them and their statistics; and 465 photo lists and 410 video lists from the users' pages, information about them and statistics.

Thus, as the result of the experiment we have obtained a data set for further analysis that incorporates three slices, namely, the results of the manual markup and the considered features, the results of Ammon's test, and the features that can be collected from the social networks. We will use them to improve our conceptual model of the approach and to get closer to establishing the correlation between the personality traits that can hypothetically lead to some forms of destructive behavior and the information on users' web pages in social networks.
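As an illustration of such a collection system, the sketch below queries the public VKontakte API (the wall.get method does exist in the VK API, but the access token, API version and all other names here are placeholder assumptions) and derives some of the numerical features mentioned above. It is a sketch, not the prototype described in the paper.

```python
# Hedged sketch of collecting feature values from VKontakte user pages via
# the public VK API. The token, version and user id are placeholders.
import requests

VK_API = "https://api.vk.com/method"
ACCESS_TOKEN = "PLACEHOLDER_TOKEN"   # assumption: a valid VK access token
API_VERSION = "5.103"                # assumption: any supported API version


def fetch_wall_posts(owner_id: int, count: int = 100) -> list:
    """Download the latest wall posts of a user for feature extraction."""
    resp = requests.get(
        f"{VK_API}/wall.get",
        params={
            "owner_id": owner_id,
            "count": count,
            "access_token": ACCESS_TOKEN,
            "v": API_VERSION,
        },
        timeout=10,
    )
    data = resp.json()
    if "error" in data:
        raise RuntimeError(data["error"].get("error_msg", "VK API error"))
    return data["response"]["items"]


def extract_numeric_features(posts: list) -> dict:
    """Numerical parameters named in the text: likes, comments, post size."""
    n = max(len(posts), 1)
    return {
        "avg_likes": sum(p.get("likes", {}).get("count", 0) for p in posts) / n,
        "avg_comments": sum(p.get("comments", {}).get("count", 0) for p in posts) / n,
        "avg_post_len": sum(len(p.get("text", "")) for p in posts) / n,
    }
```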

5 Conclusion

The paper describes the developed approach to the determination of destructive information impacts and negative personal tendencies of the young generation on the basis of information in social networks. The main elements of the proposed approach are described, including the techniques that compose it.


The conducted experiments gave us a data set for further analysis and neural network learning, including the results of the manual markup of persons and the considered features, the results of Ammon's test, and the features collected from users' web pages in social networks. These results are the basis for improving our conceptual model of the approach and for establishing a possible correlation between the personality traits that can hypothetically lead to some forms of destructive behavior and the information in social networks. The main challenge in detecting destructive impacts in the Internet space is connected with the huge amount of information that should be analyzed. The proposed approach will reduce the amount of analyzed information for the experts and will allow them to focus on automatically ranked objects.

Acknowledgements. The reported study was funded by RFBR, project number 18-29-22034 mk.

References

1. Ego-structure test developed by Günter Ammon. https://www.psychol-ok.ru/statistics/ista/. Accessed 18 June 2019
2. Borisov, P.M.: Verbal characteristics of the concept of a destructive personality. Bull. Moscow Reg. State Univ. Ser. Ling. 2, 78–92 (2010)
3. Dos Santos, C., Gatti, M.: Deep convolutional neural networks for sentiment analysis of short texts. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pp. 69–78 (2014)
4. Firat, A., Chatterjee, S., Yilmaz, M.: Genetic clustering of social networks using random walks. Comput. Stat. Data Anal. 51(12), 6285–6294 (2007)
5. Fromm, E.: The Anatomy of Human Destructiveness. Random House (1975)
6. Handcock, M.S., Raftery, A.E., Tantrum, J.M.: Model-based clustering for social networks. J. Roy. Stat. Soc. Ser. A (Statistics in Society) 170(2), 301–354 (2007)
7. Karnaushenko, L.V.: Destructive informational and psychological impact on a mass audience: legal aspects of counteraction. Vestnik Krasnodarskogo universiteta MVD Rossii 2(36), 157–161 (2017)
8. Kotenko, I., Chechulin, A., Komashinsky, D.: Categorisation of web pages for protection against inappropriate content in the internet. Int. J. Internet Protoc. Technol. 10(1), 61–71 (2017)
9. Lin, H., Jia, J., Guo, Q., Xue, Y., Li, Q., Huang, J., Cai, L., Feng, L.: User-level psychological stress detection from social media using deep neural network. In: Proceedings of the 22nd ACM International Conference on Multimedia, pp. 507–516. ACM (2014)
10. Lipton, Z.C., Kale, D.C., Elkan, C., Wetzel, R.: Learning to diagnose with LSTM recurrent neural networks. arXiv preprint arXiv:1511.03677 (2015)
11. Machajdik, J., Hanbury, A.: Affective image classification using features inspired by psychology and art theory. In: Proceedings of the 18th ACM International Conference on Multimedia, pp. 83–92. ACM (2010)
12. Pizzuti, C.: GA-Net: a genetic algorithm for community detection in social networks. In: International Conference on Parallel Problem Solving from Nature, pp. 1081–1090. Springer (2008)
13. Segalin, C., Celli, F., Polonio, L., Kosinski, M., Stillwell, D., Sebe, N., Cristani, M., Lepri, B.: What your Facebook profile picture reveals about your personality. In: Proceedings of the 25th ACM International Conference on Multimedia, pp. 460–468. ACM (2017)
14. Stepanov, A.V., Kim, L.M.: Destructive behavior and the problem of the culture of perception of information flows. Eurasian Union Sci. 7–6, 154–156 (2014)
15. Tulupyeva, T., Tulupyev, A., Abramov, M., Azarov, A., Bordovskaya, N.: Character reasoning of the social network users on the basis of the content contained on their personal pages. In: Biologically Inspired Cognitive Architectures (BICA) for Young Scientists, pp. 31–38. Springer (2016)
16. Voroshilova, M.B.: Cognitive arsenal and communication strategies of contemporary nationalist discourse. Polit. Linguist. 3(49), 242–245 (2014)
17. Yang, Z., Guo, J., Cai, K., Tang, J., Li, J., Zhang, L., Su, Z.: Understanding retweeting behaviors in social networks. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1633–1636. ACM (2010)
18. Zlokazov, K.V.: Destructiveness and personal identity. Science Yearbook of the Institute of Philosophy and Law of the Ural Branch of RAS 14(1) (2014)
19. Zlokazov, K.V.: Content analysis of destructive texts. Polit. Linguist. 1(51), 244–251 (2015)
20. Zlokazov, K.V.: Destructive behavior in various contexts of its manifestation. Vestnik Udmurtskogo universiteta. Series: Philosophy, Psychology, Pedagogy 26(4) (2016)

Intelligent Distributed Computing for Cyber-Physical Security and Safety

Authorize-then-Authenticate: Supporting Authorization Decisions Prior to Authentication in an Electronic Identity Infrastructure

Diana Berbecaru(B), Antonio Lioy, and Cesare Cameroni

Dip. di Automatica e Informatica, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Turin, Italy
[email protected]

Abstract. Federated electronic identity systems are increasingly used in commercial and public services to let users share their identity across providers. We discuss authorization (prior to authentication) issues in the eIDAS federated European electronic identity infrastructure. In this scenario, each European country runs a national eIDAS node, which transfers personal attributes upon successful authentication of a person in his home country. Service Providers in foreign countries use these attributes to take (local) authorization decisions for the requested service. Our work addresses those scenarios where authorization is required prior to authentication (authorise-then-authenticate), that is, when a service provider has to implement access control decisions before the person has been authenticated. This scenario applies, for example, in a user-centric network access service. We propose two models to perform authorise-then-authenticate in eIDAS, one working at the application level and one at the transport level, and we sketch a possible implementation scenario.

Keywords: Electronic identity · eIDAS network · Authorise-then-authenticate

1 Introduction

Many (public or private) organisations nowadays exploit the federated identity management (FIM) model to provide access to their services, based on the electronic identity (eID) of a person. In this model, the user registers his identity (and authentication credentials) with one organization (called Identity Provider, IdP) and gets access to the services offered by a Service Provider (SP) without any further registration, provided the two organizations have established a trust relationship, named 'Circle of Trust' (CoT). In the FIM model, upon successful authentication, the IdP releases a security token to the user agent (UA), i.e. the browser, which forwards the token to the SP using a FIM protocol, such as SAML [1]. Based on the FIM model, the European pilot projects STORK and STORK2 [2] created a pan-European eID interoperability framework, which allowed authentication means from one country to be accepted in applications outside of the origin country's jurisdiction, such as to register a student at foreign universities [3] or to provide Wi-Fi access [4].


The results of the STORK project were used in the definition of the eIDAS Regulation [5], which laid the grounds for legal recognition of eIDs across the European Union (EU), and of its technical specification [6]. Nowadays, many European countries have already set up eIDAS nodes (running code based on [7]) that are part of a CoT, the eIDAS network. Each eIDAS node is also connected to the national (notified) IdPs, in order to authenticate persons and to provide attributes for them, and to the national SPs, to allow foreign citizens to authenticate in their home country when accessing eIDAS-enabled services.

We observe that eIDAS responds to the authentication requirements of on-line web-based applications, but additional work is needed to cover those services requiring authorisation prior to authentication. For example, in the network access scenario, the user does not have access to the network, but the service itself consists in getting access to the network, provided that the user can demonstrate to the SP that he has been authenticated by some (recognized) IdP in eIDAS. We call this service eAccess (eIDAS-based network access service). This resembles the chicken-and-egg problem: to provide access, the SP requires authentication data from the IdP, but to obtain this data the user must be given access to the remote IdP. We construct an architecture based on the eIDAS infrastructure which responds to the 'authorise-then-authenticate' requirement, and we discuss design options for its main components. Note that the eAccess service may be used on a wide scale: a citizen that has an authentication credential released by a notified IdP in one EU country (such as a username/password, a national card or a mobile one-time password) may use it to get access to the network across the EU, provided the SP is integrated with eIDAS nodes modified to support such a service. The eAccess service is foreseen to be fully implemented in the eID4U project [8], which involves 5 universities from 5 EU countries: Italy, Austria, Spain, Portugal and Slovenia.

Organisation. The paper is organized as follows: Sect. 2 presents the eIDAS architecture in brief, Sect. 3 presents our proposed architecture along with its motivations and requirements, and Sect. 4 compares some existing proposals against our proposed architecture. We then discuss design issues for the various components, our proposed design models and a possible implementation scenario (Sect. 5). Finally, Sect. 6 concludes the paper and indicates future work.

2 The eIDAS Architecture in Brief

The eIDAS network supports two models, proxy and middleware. In the proxy model, a country runs a single gateway called an eIDAS node, which is composed of two elements, an eIDAS-Proxy-Service (in short, eIDAS Proxy) and an eIDAS Connector, which are in a CoT with the national IdPs and SPs, so they share national SAML metadata, as shown in Fig. 1. The eIDAS nodes are also in a CoT; thus, the eIDAS SAML metadata of the nodes is distributed through a dedicated mechanism. Cross-border authentication is delegated from an SP to its national eIDAS Connector, which acts as a gateway and subsequently forwards the eIDAS authentication request (eIDAS auth req) to the eIDAS Proxy in the country in which the person will be authenticated. The eIDAS auth req is handled by the eIDAS Proxy according to a Member-State (MS) specific approach. Typically, a new authentication request is constructed by the eIDAS Proxy and is sent (through the user's browser) to the national IdP (part of the national eID scheme), where the citizen is asked to authenticate with a national eID. For example, in Italy the eIDAS auth req is converted to an authentication request in SPID [10] format. Upon successful authentication, the eIDAS authentication response (eIDAS auth resp), containing also the (personal) attributes that have been requested, is returned through the eIDAS infrastructure back to the requesting SP.

Fig. 1. The eIDAS architecture.

Each eIDAS node communicates with the other eIDAS nodes through the eIDAS communication protocol [11], which is based on the SAML 2.0 WebSSO Profile [12]. The eIDAS infrastructure supports only a limited number of person attributes to be exchanged through the eIDAS nodes; some of them are mandatory, e.g. family name, first name, date of birth and an eIDAS person identifier, while others are optional, e.g. the current address, the place of birth and the gender of a person. As explained in [9], these attributes are typically not sufficient to build more advanced services. On the other hand, in eAccess they might be enough to implement network access decisions, but in some cases (e.g. for users that have strict requirements on providing such extremely sensitive data to SPs) the user could forgo network access altogether rather than provide their personal data.
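For illustration, the following sketch builds a minimal SAML 2.0 AuthnRequest of the kind an SP delegates to its eIDAS Connector. It is only a schematic: a real eIDAS auth req additionally carries eIDAS-specific extensions (e.g. the requested attributes and the level of assurance) and must be signed; the URLs and entity identifiers are placeholders, not real endpoints.

```python
# Hedged sketch: a bare SAML 2.0 AuthnRequest skeleton. Real eIDAS requests
# add eIDAS extensions and an XML signature; all names here are placeholders.
import datetime
import uuid
import xml.etree.ElementTree as ET

SAMLP = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML = "urn:oasis:names:tc:SAML:2.0:assertion"


def build_authn_request(sp_entity_id: str, connector_url: str) -> str:
    ET.register_namespace("samlp", SAMLP)
    ET.register_namespace("saml", SAML)
    req = ET.Element(f"{{{SAMLP}}}AuthnRequest", {
        "ID": f"_{uuid.uuid4().hex}",
        "Version": "2.0",
        "IssueInstant": datetime.datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ"),
        "Destination": connector_url,
    })
    issuer = ET.SubElement(req, f"{{{SAML}}}Issuer")
    issuer.text = sp_entity_id
    return ET.tostring(req, encoding="unicode")


print(build_authn_request("https://sp.example.org/metadata",
                          "https://connector.example.org/ServiceProvider"))
```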

Fig. 2. The proposed eIDAS-based architecture supporting authorise-then-authenticate.


3 Motivation, Requirements, Architecture

FIM solutions do not natively provide authorise-then-authenticate, which is required in the network access service. All practical FIM architectures require the user to access his own IdP to retrieve proper authentication credentials. In the (roaming) network access service, the SP must grant the user limited network resource access before the authentication process has actually been completed. This turns out to be a challenging task: the network access provider cannot be assumed to know and trust every possible IdP (abroad), but the external interactions belonging to the authentication process should be carefully authorised to avoid covert channels.

To support authorise-then-authenticate in eIDAS, we enhance the eIDAS node functionality to issue tokens used for the controlled authorisation of user communication across the SP. Figure 2 shows the architecture we propose. The UA interacts with the SP's Control service (a Web app) to authenticate and be authorised to access external resources, both during the authentication process and afterward. The Control service defines the authorisation policy for the controlled Port, which actually allows or denies the UA the right to communicate with external resources. Before full access is granted, the UA is required to interact with the FIM infrastructure to collect the required authentication and authorisation credentials. The protocol is user-centric, in compliance with most modern FIM concepts: the UA transfers tokens among the various involved servers, including the eIDAS nodes in the SP and IdP countries, the user's IdP, and the Control Web app at the SP. No direct server-to-server interaction is assumed inline. The Control service verifies the authentication and authorisation credentials collected by the UA and takes the ultimate decisions enforced by the Port. The eIDAS Connector translates externally collected credentials into data that can be directly verified by the SP, thus masking the complexity of the FIM infrastructure from the SP. At the eIDAS node we distinguish between base identity management services (or authentication services, AeS), natively provided by eIDAS, and authorisation services (AoS), proposed in this work.

4 Related Work

A solution exploiting Shibboleth [13] for the user's network access control is proposed in [14], which works as follows. The user connects to the docking network (e.g. by activating her WLAN card) and gets an IP address via DHCP; all traffic is initially blocked except for port TCP/443 (HTTPS) on the server hosting the Shibboleth 'Where Are You From' (WAYF) service, and for all the Shibboleth origin servers (IdPs) in the federation. The initial HTTP request from the UA is intercepted by the NAC (Network Access Controller), which is a web server equipped with the Shibboleth target component (SP). The NAC constructs an authentication request in SAML format and redirects the UA to the WAYF service, where the user selects the IdP from a drop-down list. The WAYF application redirects the UA to the HTTPS URL of the IdP, where the authentication process can take place. After successful authentication, the IdP redirects the UA back to the SP with a message containing a signed SAML assertion. Based on the user's attributes in the SAML assertion, the NAC decides if the user is authorised to access the network. If access is granted, the NAC updates the firewall rules so that traffic can flow between the client and the network.


This solution assumes that the IP address of the IdP is known by the SP and can be pre-configured on the NAC. In a real scenario, the SP could obtain the IdP's IP address/DNS name during the federation setup phase, but if the IdP's name changes, the SP has to be notified in advance in a secure manner (e.g. via an out-of-band channel). On the contrary, in our proposed architecture the eIDAS node acts as a certifier for the names of the other eIDAS nodes and of the IdP servers (in its country). We also require the SP to authenticate the destination (at the transport or application level) to protect from covert channel attacks where the UA cooperates with external entities.

TLS-Federation, proposed in [15], exploits the TLS protocol [16] with client-certificate authentication in a federated setting. In practice, the IdP issues to the user (upon successful authentication) an X.509 certificate as an authorisation token, instead of a SAML assertion or another security token. This short-lived session credential is sent by the UA to the SP within the TLS handshake protocol. In this way, the SP only needs a standard web server without any extension, and this also excludes a wide class of browser-based security attacks. We use some concepts from TLS-Federation in our proposal: we extend the eIDAS node functionality to issue authorization credentials (including X.509 certificates) that have to be checked by the SP before the authentication phase has been completed. This feature is very helpful in those services that require both authentication and authorization (like the network access service). However, in TLS-Federation the user might end up with multiple X.509 certificates in his browser and might be asked to select which one to use, depending on the service being implemented.

EDUcational ROAMing (EduRoam) [17] is a RADIUS-based infrastructure that uses the 802.1x protocol to allow inter-institutional roaming. EduRoam allows educational users who are members of one institution to log on to the WLAN (Wireless LAN) when visiting another institution by using the same credentials (username and password) that the user has at his home institution. Nowadays, EduRoam is largely used in the academic environment but cannot be used if a person is not part of such an environment (student, teacher, etc.). On the other hand, our proposed solution can be used even by persons that temporarily visit a university (for a meeting, a conference and so on) without necessarily being part of the academic staff. In our approach, the person would use his national eID to get access to the network, not his university credentials (which might not exist, or may be expired).

5 Design Models and an Implementation Scenario

We consider two main design models for our architecture (see Fig. 2): one works exclusively at the application level, the other one also involves the transport level. In the first case, we assume that the applications running at the IdP and the eIDAS node can be changed to support message-based authorisation at the application level. In the second case, we assume that the IdP and eIDAS node functionality cannot be easily changed for authorisation purposes, thus we plan to re-use the existing web-based components, combined with a dedicated web application for access control. For simplicity, we consider only simple rules for the authorization policies at the SP site:

(r1) “UA X is not allowed to connect to any server”;
(r2) “UA X is allowed to connect to server (app) Y”;
(r3) “UA X is allowed to connect to any server (app)”.

The initial policy for every user contains r1 only; r2 rules are added by the SP upon obtaining correct authorization tokens from the local eIDAS node (eIDAS Connector), whereas r3 is applied upon successful authentication of the user. In general, r3 can be further extended to require specific attributes (e.g. the role Z) or to restrict the access to specific classes of external resources. For completeness, we also consider the case in which manual configuration is performed on the SP to enable r2 rules. For example, if the DNS names of the eIDAS Connector, of the eIDAS Proxy and of the IdP are static and trustworthy, they could be manually configured on the SP site, thus allowing the user to reach the IdP. In this case, the user authenticates to the IdP and, on receipt of the SAML authentication token, the SP applies the authorisation policy r3. We consider this case in a possible implementation scenario sketched briefly further below. Finally, we restrict our study to web-based scenarios, i.e. services using HTTP and HTTPS (HTTP over a TLS channel [18], and HTTP then TLS [19]).
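The sketch below illustrates how a Control service could maintain these rules for each UA: default deny (r1), per-server allows added upon valid AoS tokens (r2), and full access after authentication (r3). It is a minimal illustration in Python; all names are our own assumptions and are not part of the eIDAS code base or the described prototype.

```python
# Hedged sketch of the r1-r3 authorisation policy kept by the Control
# service for each UA. Names and the workflow hooks are illustrative.


class ControlPolicy:
    def __init__(self) -> None:
        self._allowed = {}          # UA id -> set of allowed server names
        self._authenticated = set() # UAs granted rule r3

    def allow_server(self, ua: str, server: str) -> None:
        """Apply r2 after the SP has verified an AoS token naming `server`."""
        self._allowed.setdefault(ua, set()).add(server)

    def mark_authenticated(self, ua: str) -> None:
        """Apply r3 on receipt of a valid eIDAS authentication response."""
        self._authenticated.add(ua)

    def may_connect(self, ua: str, server: str) -> bool:
        if ua in self._authenticated:                   # r3: full access
            return True
        return server in self._allowed.get(ua, set())   # r2, else r1: deny


policy = ControlPolicy()
policy.allow_server("ua-42", "eidas-proxy.example.eu")
assert policy.may_connect("ua-42", "eidas-proxy.example.eu")
assert not policy.may_connect("ua-42", "idp.example.eu")  # still denied (r1)
```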


AoS Description. With reference to Fig. 2, the eIDAS authorisation service (AoS module) is in charge of issuing authorisation tokens that SPs can directly verify. We consider basically three types of authorisation token formats: (1) X.509 certificates (e.g. the approach used in the TLS-Federation solution); (2) HTTP-based access tokens, such as the OAuthAuthorizeToken used in the OAuth authorization process for web applications (http://oauth.net/); (3) SAML assertions. We clarify the role of the AoS with a simple example involving two eIDAS nodes, using X.509 certificates as authorisation tokens and having as the final goal to grant network access to roaming users. The workflow is as follows:

1. UA → SP: the user navigates to the SP secure login page (e.g. an HTTPS URL) that requires user authentication to provide network access. For simplicity, we assume that an IP address is given to the user (as in [14]) and the Control blocks the input and output traffic for this IP, except for the traffic toward the national eIDAS Connector.

2. SP → UA: the SP creates the eIDAS auth req and sends it to the user together with an HTTP redirect pointing to the HTTPS URL of the eIDAS Connector.

3. UA ⇔ eIDAS Connector: at the eIDAS Connector, the user selects the country of origin. Next, the eIDAS Connector identifies the browser type and responds with a login page containing the browser's proprietary elements necessary for key generation and the certificate signature request. We skip the certificate issuing process since it is extensively explained in [15]. Finally, the eIDAS Connector issues a short-lived certificate containing the DNS name of the eIDAS Proxy in a private certificate extension, signs the certificate, and returns it with the according MIME type to the browser. This instructs the browser to verify whether its key store contains a matching key pair, and on success it automatically inserts the certificate as authorisation token 1 (AoS token 1) in its native certificate store. In addition, the eIDAS Connector creates (and signs) an eIDAS auth req message to be used for authentication. Next, the eIDAS Connector redirects the UA to the HTTPS URL of the SP, together with the eIDAS auth req message generated above.


4. UA ⇔ SP: the user navigates to the SP secure page, initiating a TLS handshake with client authentication. When asked for the client certificate, the UA sends to the SP the AoS token 1. Since the SP trusts the eIDAS Connector, the verification of the client certificate is successful and the TLS handshake should complete without errors. Afterward, the SP applies the authorisation policy r2, where Y is equal to the eIDAS Proxy's name. Finally, the SP redirects the UA to the HTTPS URL of the eIDAS Proxy together with the eIDAS auth req message created at step 3.

5. UA → eIDAS Proxy: the user accesses the secure web page of the eIDAS Proxy and selects the IdP to authenticate with. The eIDAS Proxy generates an authorization token 2 (AoS token 2) containing the IdP's name retrieved from the eIDAS Proxy's IdP list.

6. eIDAS Proxy ⇔ UA: the UA cannot, however, reach the IdP, because the SP will block access to it. Thus, the eIDAS Proxy functionality is modified to redirect the UA to the eIDAS Connector.

7. UA ⇔ eIDAS Connector: the UA sends the eIDAS auth req message to the eIDAS Connector together with AoS token 2. The eIDAS Connector issues (as in step 3) a new short-lived certificate, that is, the authorisation token 3 (AoS token 3), containing the IdP's DNS name, and redirects the UA to the SP. When the SP receives the AoS token 3 from the UA, the SP applies the authorisation policy r2, where Y is equal to the IdP's DNS name. Next, the SP redirects the UA to the eIDAS Connector and then to the HTTPS URL of the IdP together with the eIDAS auth req message. Here the user authenticates to the IdP, and on receipt of the eIDAS auth resp the SP applies the authorisation policy r3.

Control Web Application. This application is the internal authorisation decision point at the SP and basically performs two tasks: (1) it verifies the authentication and authorisation tokens produced by the eIDAS Connector, including the binding data preventing attackers from intercepting and using tokens in place of the real owner; (2) it configures the controlled port, i.e. it defines the authorisation policies to be applied. In a practical design, the application is a Web application supporting the specific authorisation token format and the specific control mechanism to configure the controlled port. For example, the application could use the Shibboleth SAML-aware interface to support SAML-based tokens. For HTTP-based tokens, the application could use the APIs and documentation provided by OAuth supporters to enable OAuth in web applications [20]. Finally, an ad-hoc application may be developed in case the authorisation token is an X.509v3 certificate, adopting the TLS-Federation approach.

Controlled Port. This module is in charge of enforcing the authorisation policy by allowing or denying connections from the UA to external Web applications/servers. In practice, it must verify the authentication and authorisation data associated with each UA and allow/deny its connections toward specific external servers. We have identified two design alternatives: (1) at the transport level, where the Port role is played by the HTTP server at the SP, and (2) at the application level, where we use a dedicated Web application. The application-level option is more flexible under this perspective. We can exploit TLS to protect the HTTPS connections between the UA and the Port and between the Port and external servers. We can instead rely on secure Web service protocols, e.g. WS-SecureConversation [21], to protect the end-to-end message exchange between the UA and external servers.


Unfortunately, this requires a re-design of the applications running on external servers, including the eIDAS node and the IdP. Moreover, it is difficult to grant user awareness without dedicated features in the browser for managing secure web service protocols. Though the application-level solution is appealing in the long term, we therefore designed the transport-level solution, which better integrates into the existing Web-based infrastructure.

In the transport-level solution, the Port role is played by the HTTP server at the SP. The server must be capable of proxying HTTPS channels from the UA to external servers. The Apache Web server provides the 'mod proxy', 'mod ssl' and 'mod connect' modules to proxy HTTP and HTTPS sessions, but a couple of technical issues need to be addressed. First, we must design an appropriate mechanism to let the Control Web app signal the current authentication and authorisation policy to the HTTP server. In other words, the Control Web app should translate the SAML-based credentials collected by the UA interacting with the FIM infrastructure into access credentials understood by the HTTP server. The most flexible solution is exploiting HTTP cookies. Cookies must, however, be protected against client-side tampering and verified by the HTTP server (a sketch of one way to do this is given at the end of this section). Generally, such advanced controls are not natively supported by the HTTP server. In the case of the Apache Web server, this limit is overcome by installing third-party modules: for instance, 'mod security' allows applying fairly elaborate validation rules on the HTTP requests and responses processed by the server. A more subtle obstacle stems from managing at the same time the three HTTPS conversations: UA to external servers, UA to Port, and Port to external servers.

Preliminary Implementation Scenario. Assuming that the DNS names of the eIDAS Connector, the eIDAS Proxies and the IdPs in (foreign) countries are static and trustworthy, we started to work on an implementation scenario that can be adopted by an SP (e.g. a university) to provide network access services to users based on their national credentials (e.g. username and password) issued by notified IdPs in eIDAS. In Italy, the notified eID scheme is SPID, so several IdPs, like InfoCert SpA or Poste Italiane SpA, provide such credentials to citizens. Our goal is to design and deploy a captive portal integrated with eIDAS, which to the best of our knowledge has not been proposed and implemented yet. We used Zeroshell (https://zeroshell.org/) because it implements a captive portal to authenticate users against a SAML 2.0 IdP by using Shibboleth. We modified the configuration of the Shibboleth SP so that it generates an eIDAS auth req message (instead of a plain SAML request), which is sent to the eIDAS Connector; subsequently, through the eIDAS infrastructure, the user reaches his IdP as described in Sect. 2. From the technical point of view, the Shibboleth SP must support the eIDAS protocol and the WAYF service, allowing the user to select the country in which he will be authenticated. Moreover, the Shibboleth SP needs to be configured with the metadata of the eIDAS Connector, so that the messages exchanged with it can be verified cryptographically. We have installed the eIDAS code in an experimental testbed that hosts a dedicated academic eIDAS node, which acts both as eIDAS Connector and as eIDAS-Proxy service.
This environment is deployed by using a Docker infrastructure (https://www.docker.com) running on an Ubuntu Server 16.04 virtual machine hosted at Politecnico di Torino. We have also installed Zeroshell in the experimental testbed.


The full setup and deployment of this scenario in a real service, together with its validation, is regarded as future work.
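As an illustration of the cookie-protection issue raised above for the transport-level design, the following sketch shows one standard way to make an access-credential cookie tamper-evident with an HMAC shared between the Control Web app (which mints the cookie) and the module that validates it at the Port. The format and names are our own assumptions, not those of the described prototype.

```python
# Hedged sketch: HMAC-protected access-credential cookie. The secret would be
# shared between the Control Web app and the Port; names are illustrative.
import hashlib
import hmac

SECRET = b"shared-secret-between-control-app-and-port"  # placeholder


def mint_cookie(ua_id: str, allowed_server: str) -> str:
    payload = f"{ua_id}|{allowed_server}"
    tag = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{tag}"


def verify_cookie(cookie: str):
    """Return (ua_id, allowed_server) if the cookie is intact, else None."""
    try:
        ua_id, allowed_server, tag = cookie.split("|")
    except ValueError:
        return None
    expected = hmac.new(SECRET, f"{ua_id}|{allowed_server}".encode(),
                        hashlib.sha256).hexdigest()
    return (ua_id, allowed_server) if hmac.compare_digest(tag, expected) else None


c = mint_cookie("ua-42", "idp.example.eu")
assert verify_cookie(c) == ("ua-42", "idp.example.eu")
assert verify_cookie(c.replace("ua-42", "ua-43", 1)) is None  # tampering detected
```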

6 Conclusions

Our work stems from the observation that FIM solutions do not natively provide authorise-then-authenticate functionality, which is required, for example, in a network access scenario. We have proposed an architecture based on the eIDAS framework to address this limitation, we have compared some alternative design options, and we have identified two solutions. One works at the application level; it is more flexible, but requires a redesign of server-side applications and extensions to current Web browsers. The other one involves the transport level; it integrates more easily with current Web components, but still requires a technical upgrade to commodity software. Finally, we sketched an implementation scenario for the eAccess service exploiting the proposed architecture. Our approach is extensible to other usage scenarios where access to sensitive Web resources is required before full user authentication can take place.

Acknowledgements. This work was developed in the eID4U project, co-funded by the European Union's Connecting Europe Facility, under the grant agreement no. INEA/CEF/ICT/A2017/1433625.

References 1. Cantor, S., et al.: Assertions and Protocols for the OASIS Security Assertion Markup Language (SAML) V2.0 – Errata Composite. Working Draft 07, 8 September 2015. http://www. oasis-open.org/committees/download.php/56776/sstc-saml-core-errata-2.0-wd-07.pdf 2. Secure Identity Across Borders Linked (Stork) project - Towards pan-European recognition of electronic IDs (eIDs) (2008–2011). https://ec.europa.eu/digital-single-market/en/content/ stork-take-your-e-identity-you-everywhere-eu 3. Berbecaru, D., Lioy, A., Mezzalama, M., Santiano, G., Venuto, E., Oreglia, M.: Federating e-identities across Europe, or how to build cross-border e-services. In: Proceedings of AICA2011: Smart Tech and Smart Innovation Conference, Torino, Italy, 15–17 November 2011, 10 p. http://security.polito.it/doc/public/torsec aica2011 stork.pdf 4. Berbecaru, D., Lioy, A., Aime, M.D.: Exploiting proxy-based federated identity management in wireless roaming access. In: TrustBus 2011. LNCS, vol. 6863, pp. 13–23. https://doi.org/ 10.1007/978-3-642-22890-2 2 5. European Union, Regulation (EU) No 910/2014 of the European Parliament and of the Council of 23 July 2014 on electronic identification and trust services for electronic transactions in the internal market and repealing directive 1999/93/ec, European Union (2014) 6. eIDAS Technical Specifications v1.1. https://ec.europa.eu/cefdigital/wiki/display/ CEFDIGITAL/2016/12/16/eIDAS+Technical+Specifications+v.+1.1 7. eIDAS-Node software releases. https://ec.europa.eu/cefdigital/wiki/display/CEFDIGITAL/ eIDAS-Node+Integration+Package 8. eID4U project - eID for University (2018–2019). https://ec.europa.eu/inea/en/connectingeurope-facility/cef-telecom/2017-eu-ia-0051 9. Berbecaru, D., Lioy, A., Cameroni, C.: Electronic identification for universities: building cross-border services based on the eIDAS infrastructure. Information 10, 210 (2019). https:// doi.org/10.3390/info10060210



Modeling and Evaluation of Battery Depletion Attacks on Unmanned Aerial Vehicles in Crisis Management Systems

Vasily Desnitsky1(B), Nikolay Rudavin2, and Igor Kotenko1

1 St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, 14-th Liniya, 39, St. Petersburg 199178, Russia
{desnitsky,ivkote}@comsec.spb.ru
2 ITMO University, 49, Kronverkskiy prospekt, St. Petersburg, Russia
[email protected]

Abstract. The paper addresses the modeling and evaluation of battery depletion attacks aimed at unmanned aerial vehicles (UAVs, drones) acting as elements of crisis management systems in emergencies. Being a crucial element of such a system, a UAV is exposed to cyber-physical attacks. The purpose of such attacks is to disrupt the normal functioning of the system, violate the mission objectives and inflict damage. The analysis of various types of battery depletion attacks on UAVs and their key characteristics has been performed. Besides, we obtained experimental results by simulating some types of battery depletion attacks on a fragment of a system prototype using a Parrot AR-Drone.

Keywords: Battery depletion attacks · Unmanned aerial vehicle · Modeling

1 Introduction

The increasing spread and involvement of cyber-physical devices in our lives, along with the increased intelligence of various technical solutions and the automation of technological processes, make such devices vulnerable to illegal actions of intruders. Intruders are assumed to compromise the system and the services it provides, which in turn can lead to significant damage. It is worthwhile to single out autonomously functioning devices that are located at a distance from other entities of the system and move in space, whose autonomy is expressed in an autonomous power supply or limited communication/computing resources. As a result, such devices are vulnerable to attacks explicitly or implicitly exploiting this autonomy. In addition, arranging protection against this type of attack is complicated by the fact that, for the most part, the security subsystem has to rely on assets located inside the device, which complicates the organization of effective protection.


This leads to the problem of protecting such devices from attacks exploiting their autonomy, with the battery resource being one of the most critical resources consumed during operation. Forced recharging or battery replacement causes unplanned time delays, financial losses and reduced functionality of the device, which negatively affects the tasks being performed and can lead to serious costs. This paper presents an analytical and experimental study of battery depletion attacks aimed at UAVs as one of the most typical kinds of autonomously working cyber-physical devices. The novelty of this work lies, first, in the constructed classification of battery depletion attacks in relation to UAVs and the analysis of the key characteristics of such attacks and, second, in the results of experimental studies on the built software/hardware prototype, confirming the feasibility of the analyzed attacks and ranking some of them according to their possible effectiveness. The rest of the paper is organized as follows. Section 2 provides an overview of papers in the field. Sections 3 and 4 reveal the scenario and the built prototype of the crisis management system, respectively. Section 5 analyzes types of battery depletion attacks aimed at UAVs. Section 6 presents experiments on attack modeling and discusses the results. Section 7 concludes the paper.

2 Related Work

A number of papers have been published on the relevance of UAV security problems and possible types of cyber-physical security threats, including industrial-level drones used in various areas of professional activity. Rodday et al. [1] describe vulnerabilities of professional UAVs that cost several thousand dollars and are used to monitor critical infrastructures in police operations, providing telemetry functions. The drone is targeted at collecting video information and location data of certain objects. Communication with the drone is conducted within the 2.4 GHz band commonly used by Wi-Fi, Zigbee and other wireless protocols. Rodday et al. also demonstrated the implementation of a man-in-the-middle attack on such a drone. Samland et al. [2] perform a risk assessment and an analysis of requirements to protect the main hardware and software elements of the drone. In particular, they regard requirements related to initiating the security process, creating a security concept, implementing the concept of security and maintenance, and constantly improving information security on the example of the Parrot A.R.Drone platform. In addition, in [3,4] possible threats were revealed, the implementation of attacks was shown, and countermeasures were proposed. In particular, Pleban et al. [3] describe threats and simulate attacks on the Wi-Fi protocol involved in drone control and on the Linux-based operating system used in the A.R.Drone 2.0. To protect this platform, the authors suggest the use of an IPTables policy, which filters unauthorized traffic to the drone. Nonetheless, the policy is effective against non-professional attackers only. Vasconcelos et al. [4] focused on the simulation of Denial of Service (DoS) attacks on the A.R.Drone 2.0, using the hping3, LOIC and Netwox tools. Having shown a significant increase in data transmission delay time (up to 10 times), the authors recommended the use of these attack modeling methods to assess the security of a UAV. Almost all UAVs use GPS as a navigation system, which can be very vulnerable to attacks, as it uses data transmission over an open channel [5,6]. The drone communication and control channel can be rather vulnerable as well. As a rule, the most common attacks operate on the principle of common DoS attacks [7]: either setting interference at the physical level or sending previously copied or distorted requests/control commands [8]. In [9,10] it is described how Denial-of-Service attacks can be used in wireless sensor networks in the context of energy depletion. In particular, Geethanjali and Gayathri provide facts on the implementation of battery depletion attacks, when a node captured by an intruder is subjected to a Denial-of-Sleep attack, a modification of Denial-of-Service attacks [9]. Chang et al. [10] propose their own solution to repel such attacks, using hardware-based protection and a prototype based on wireless charging that absorbs the energy of the Denial-of-Sleep attack signal. In general, such attacks on cyber-physical systems and their consequences are described in detail in [11,12]. Martin et al. [11] demonstrate the following attacks leading to increased power consumption: (i) sending a request that the device has to accept, but which, in accordance with its semantics, is either ignored by the device or does not imply any reaction (for example, sending a request that does not contain a payload or sending a wrong authentication request); (ii) sending a request forcing a device to perform certain communication or computing functions (but not changing the settings of network devices); (iii) sending a request that changes the settings of the modules, including modification of executable files (and commands such as increasing the transmitter power). Martin et al. [11] propose the application of a multi-layer authentication model. The methods are aimed at protection both at low hardware levels, limiting time and energy resources for connecting devices (protection against the first type of attack), and at higher levels. In particular, in a multi-layered network model it is proposed to restrict the privileges of lower-rank devices, which protects the network from the second and third types of attacks. In [12] an approach to optimize the distribution of network nodes is proposed. In particular, the flexibility of the approach lies in the distribution of system elements depending on the workload of each autonomous node. This allows one to save energy, using a smaller number of network nodes and without overloading the most active ones. In [13,14], the relevance of the problem of depleting UAV energy is substantiated. In particular, [13] and [14] show the dependence of the energy consumption of UAVs on various external parameters, such as environmental characteristics, network load, presence of third-party objects, density of drones in the territory, load of drones, etc. These papers disclose experimental results, where unfavorable conditions in the form of sending drones to a long-range task, the uneven use of all UAVs and the uneven distribution of UAVs in the task area led to the depletion of battery resources of some UAVs, while many other drones remained unused.


The concept of a neural network for UAVs is described in [14]. Based on the free neural network library ANN, the network allows one to optimally distribute the tasks of the drones. The essence of that work is that the network parameters are selected and the response time of each cluster of drones is calculated in order to apply a more efficient distribution of tasks to the available UAVs [14]. Mersheeva and Friedrich [15] present a solution to the energy loss problems raised by the use of non-optimal UAV routes. By redundant routes for monitoring the space they mean the following: routes flying above previously surveyed points, as well as routes over the roofs of buildings, structures and other objects that are unnecessary to monitor. As a result, the optimization of routes leads to up to a twofold increase in autonomy under the conditions of the monitoring task. It should be noted that battery depletion attacks on UAVs are currently rather poorly studied. This paper presents results of research on the modeling of this class of attacks. The distinctive features of this work include (i) a comprehensive analysis and comparison of battery depletion attacks on the UAV as a node of a wireless sensor network, and (ii) experimental assessments proving the feasibility and effectiveness of such attacks.

3 Drone-Based Scenario of a Crisis Management System

In the study we have implemented a scenario of a crisis management network based on the use of a self-organizing wireless communication mesh network. It is assumed that in the event of an emergency such as an earthquake, fire, flood, etc., personnel of emergency services arrive at the place of the incident and deploy a mobile communication infrastructure on the ground. The goal of this infrastructure is to exchange operational data among the nodes and to communicate with the command center. An important requirement imposed on such a network is to ensure the functionality and reliability of data transmission between network nodes despite poorly predictable physical movements of the nodes in space [16]. At the same time, the mesh topology principle used in the protocol allows network nodes, in addition to their business functions, to play the role of intermediate router nodes when transmitting data over the network. The protocol enables the network to be dynamically rebuilt on the fly and the most suitable data transfer routes to be selected, depending on the current network graph and the loading of its channels. The approach proposed in [17] to ensure the availability of nodes in the network of a crisis management system implies the presence of one or several UAVs. Each UAV, first, represents a communication network node and, second, is used for the physical delivery of a wireless module to points where the connectivity of the network graph is violated.
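The dynamic route selection described above can be illustrated by a small sketch: a shortest-path search over a snapshot of the network graph, with edge weights modeling channel load. This is only an illustration of the idea; the actual Zigbee stack uses its own routing procedures, and the topology below is invented.

import heapq

def best_route(graph, src, dst):
    # Dijkstra over a snapshot of the network graph; edge weights model
    # channel load. graph: {node: {neighbour: load_cost}}.
    queue, seen = [(0, src, [src])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for neighbour, load in graph.get(node, {}).items():
            if neighbour not in seen:
                heapq.heappush(queue, (cost + load, neighbour, path + [neighbour]))
    return None

# Example: node B is heavily loaded, so traffic is steered through C.
topology = {"A": {"B": 5, "C": 1}, "B": {"D": 1}, "C": {"D": 1}, "D": {}}
print(best_route(topology, "A", "D"))  # ['A', 'C', 'D']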

4 Crisis Management System Prototype

In order to study battery depletion attacks, a hardware-software prototype of a crisis management system using self-organizing communication mesh networks based on the Zigbee S2 protocol has been developed. This protocol is focused on use in networks with a possibly time-varying composition of nodes, nodes changing their position in space and changing workloads of communication channels (i.e. a spontaneous, poorly predictable nature and a dynamically changing network topology). This protocol is primarily used for organizing wireless communications within small or poorly segmented cyber-physical systems (such as smart homes, monitoring of warehouse, trade and industrial premises, etc.) that assume a relatively small signal radius. Nevertheless, the protocol's focus on the economical power consumption of network nodes allows it to be successfully used for research purposes to simulate both crisis management processes and attacks on such networks in a laboratory. First of all, the development of the prototype included assembling the hardware of the network elements and writing the software responsible for, first, the functions of controlling the network devices and, second, experimental study procedures. The key network element regarded in this paper is a device for delivering reserved router modules, designed on the basis of the UAV A.R.Drone 2.0. The UAV has the following elements:

1. Built-in microcircuit A.R.Drone 2.0 Power Edition for controlling the flight (Fig. 1, bottom center). The prototype uses a 5-pin interface and a USB port.
2. Lithium-polymer battery (bottom-left) with a capacity of 1500 mAh, i.e. the more powerful battery that comes with the A.R.Drone 2.0 Power Edition. The nominal voltage applied to the UAV board is 11.1 V.
3. Arduino PRO 5V microcontroller (top-right), into which the program component murimod is built. It converts PWM signals of the control channels to AT commands for controlling the drone.
4. 4-channel logic level converter 5 V-3.3 V (analogue of BOB-12009, bottom-right); it converts the logic levels of the RX/TX inputs/outputs of the drone board.
5. Arduino PRO microcontroller (in the upper-right corner) modified to be analogous to an Arduino Leonardo. The choice of this board is due to the fact that it allows the use of both hardware and software Serial interfaces, i.e. to communicate with the wireless communication module and the GPS module, respectively. This board performs the functions of autopilot and guided drone piloting, collecting GPS data and data on the state of the network environment.
6. The wireless communication module of the self-organizing network, XBee S2 Router (left-top), provides receipt of control commands and telemetry transmission (GPS coordinates and surrounding network status data). This module also acts as a Zigbee network repeater.
7. GPS module Neo 6m, providing orientation of the drone in space (determining position, date, time, altitude, speed and course) and working under the NMEA 0183 protocol by using the TinyGPS library; a parsing sketch is given below.
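On the prototype, NMEA parsing is done by TinyGPS on the Arduino; the following Python sketch only illustrates what a $GPGGA sentence of the NMEA 0183 protocol contains and how position and altitude are extracted from it. The sample sentence is a textbook example, not output of our GPS module.

def parse_gpgga(sentence):
    # Extract time, position and altitude from a $GPGGA NMEA 0183 sentence.
    body, _, checksum = sentence.strip().lstrip("$").partition("*")
    fields = body.split(",")
    if fields[0] != "GPGGA" or fields[6] == "0":  # field 6: fix quality
        return None

    def to_degrees(value, hemisphere):
        # NMEA encodes coordinates as ddmm.mmmm / dddmm.mmmm
        head, minutes = divmod(float(value), 100)
        degrees = head + minutes / 60.0
        return -degrees if hemisphere in ("S", "W") else degrees

    return {
        "utc_time": fields[1],
        "latitude": to_degrees(fields[2], fields[3]),
        "longitude": to_degrees(fields[4], fields[5]),
        "altitude_m": float(fields[9]),
    }

sample = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47"
print(parse_gpgga(sample))  # latitude ~48.117, longitude ~11.517, altitude 545.4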

5 Analysis of Energy Depletion Attacks on Unmanned Aerial Vehicles

Battery depletion attacks can be divided into two types. The first type is attacks on wireless channels that do not require physical contact with the attacked device. There are two such channels in the crisis management system: the GPS data transmission channel and the control/data transmission channel between XBee network nodes. The GPS channel is used to determine the drone's position: coordinates, altitude and speed. Attacks on this channel are performed with the intruder's jamming device, which can either drown out the signal of the receiving device or feed it incorrect coordinates. When active noise is applied, chaotic UAV movements are likely to occur even if the drone hangs at one point, since the drone's location is deduced precisely from the coordinates. Applying targeted interference to the drone can force it to leave the route or fly a longer and less energy-efficient way. The XBee network channel is used to send target goals and control commands, and the drone responds with its telemetry data, such as GPS data, the state of the surrounding network (data on other network modules and the signal level) and the state of the drone (including the battery charge). The use of interference by an intruder may lead to the loss of relevance of the UAV task being carried out, since new target goals will be received by the drone with interruptions. The power consumption of the network node also grows (the device stops going into periodic sleep mode and increases the power of the transmitted signal). Also, breaking into the communication channel makes it possible to give the drone incorrect target goals and send it along a non-optimal route or force it to perform extra energy-consuming actions (Fig. 1). The second type of attack implies physical contact of the intruder with the UAV. The attack is performed when the drone is in standby mode, acting as a static repeater or kept in reserve. Attacks are possible on a hardware interface (USB port, RX/TX connections on the UAV board, Arduino/XBee Serial interface), the power system, or the drone's body. In another case, the connection of a charging/discharging device to the power supply system leads to a rapid deterioration of the battery due to the acceleration of charge/discharge cycles. By acting on the body of the drone as shown below, the attacker can cause the UAV to spend more energy on the engines. This is either hooking an excess physical weight or violating the centering (including a forward centering shift, which does not significantly worsen the flight performance when moving forward, but forces the front engines to operate at maximum power). It could also be a partial violation of the aerodynamics of the rotors by installing plugs on the blades.
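The effect of the Denial-of-Sleep component of such attacks (the node stops going into periodic sleep mode) can be approximated with a simple duty-cycle battery model. The sketch below is illustrative only: the current figures are assumed, not measured on our prototype.

def battery_lifetime_h(capacity_mah, awake_ma, sleep_ma, duty_cycle):
    # Estimated lifetime of a node that is awake for duty_cycle of the time.
    avg_ma = duty_cycle * awake_ma + (1 - duty_cycle) * sleep_ma
    return capacity_mah / avg_ma

# Assumed figures: 40 mA awake, 0.05 mA asleep, 1500 mAh battery.
normal = battery_lifetime_h(1500, 40, 0.05, duty_cycle=0.02)
attacked = battery_lifetime_h(1500, 40, 0.05, duty_cycle=1.0)  # Denial-of-Sleep
print("normal: %.0f h, under attack: %.1f h" % (normal, attacked))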

6 Experiments and Discussion

Fig. 1. The structure of the built device for the delivery of spare router modules.

Within the experimental part of the study, the following attack effects were modeled on the built software/hardware prototype: (i) unauthorized turning on of the UAV and putting it into standby mode (on the ground); (ii) unauthorized weighting of the drone; (iii) UAV freeze at low altitude instead of landing and parking; (iv) making unnecessary movements; (v) removal of energy resources of the UAV via the USB interface. Attack of unauthorized turning on of the UAV and its transfer to the waiting mode (Fig. 2). Within such an attack, the attacker sends a request to turn on the UAV in ready-for-takeoff mode (by sending the corresponding data packet, gaining access to the network, or by connecting to a hardware interface [18,19]). In the mode with the Arduino disabled, the operation time lasted 5.43 h. When all modules except XBee were connected, the operation time was reduced to 4 h, but the discharge statistics remained almost unchanged for the first 3 h. In the case of the implementation of the Denial-of-Sleep attack on the XBee, the battery was discharged completely after 2.5 h; the falling of the charge began immediately, in the first hour of operation the charge fell twice as fast, and in the first 10 min the charge decreased immediately by 15%.

Fig. 2. Results of unauthorized switching on the UAV and its transfer to the standby mode.


Attack of unauthorized weighting of the drone. This attack was simulated by loading the A.R.Drone UAV with various weights, including with violation of the drone centering by shifting the load to the front of the drone. Charts of charge consumption under load conditions of 50%, 25% and 12.5% of the drone's mass, without extra mass (0%), and with a load of 12.5% of the mass with shifted alignment (12.5% db) are shown in Fig. 3. A series of experiments confirmed that the maximum operation time happened with an empty load. With a load of 12.5% and 25% of the total mass of the drone, the operating time is reduced to about the same 9.72 and 9.57 min, respectively, against 10.67 min for the drone without an extra load. There is a significant fall in the battery charge at the start (after 1.5 min the charge fell to 55–70%, keeping 10–15% below the values under normal load). This is probably due to the fact that, as the load on the motors increases, the battery voltage drop becomes more significant, and the charge level is determined by the battery voltage. Also, the time point of a rapid drop in the charge level comes much later for a loaded drone (half a minute before shutting down against a minute for an unloaded drone), and the charge falls much more rapidly, which may result in sending the drone back to the base for recharging too late.

Fig. 3. Simulation results of the unauthorized weighting attack.
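Since the reported charge level is derived from the battery voltage, which sags under motor load, a charge estimator can compensate for the sag through the battery's internal resistance. The sketch below illustrates this relation; all constants are illustrative assumptions, not values taken from the A.R.Drone firmware.

def estimated_charge(v_measured, load_current_a, r_internal_ohm=0.08,
                     v_full=12.6, v_empty=9.9):
    # Rough state-of-charge estimate for a 3S LiPo battery: recover the
    # open-circuit voltage (V_oc = V_measured + I * R_internal), then
    # interpolate linearly between the empty and full voltages.
    v_open_circuit = v_measured + load_current_a * r_internal_ohm
    fraction = (v_open_circuit - v_empty) / (v_full - v_empty)
    return max(0.0, min(1.0, fraction))

# The same measured voltage reads as a much lower charge under heavy load
# unless the sag is compensated for.
print(estimated_charge(11.1, load_current_a=0.0))   # ~0.44 (idle)
print(estimated_charge(11.1, load_current_a=10.0))  # ~0.74 (compensated)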

Attack of freezing the UAV at low altitude instead of landing and parking (see the values of 'AR-Drone' in Fig. 2 against the values of '0%' in Fig. 3). Attack of making unnecessary movements in the drone's trajectory. This attack is possible through the intruder's sending of take-off/land commands via the Arduino autopilot or sending a packet with a take-off demand via the Wi-Fi channel. Attack depriving the UAV of energy resources through a USB interface or another UAV interface with the use of a "parasitic" module, including an attack replacing a legitimate USB module with a maliciously modified one. The discharge test with a USB consumer showed an average decrease in the UAV lifetime from 4 to 2.2 h. An attack replacing the battery with a faulty one and a battery degradation attack are rather specific due to uncertainty in the degree of battery failure and wear, as well as in the intruder's motivation regarding the expected time of the drone's defeat.


According to the results of the analysis and experimental studies of battery depletion attacks on UAVs, it can be concluded that the following attacks have the greatest impact on the lifetime: unauthorized turning on of the UAV, the UAV hanging at a low altitude under the guise of stationing, and depriving the UAV of energy resources via the USB interface. Each of the attacks has its own conditions of feasibility and stealthiness. The attack of introducing unnecessary movements showed its inefficiency when working with the AR-Drone 2.0. The unauthorized weighting attack can be quite effective: it shortens the lifetime, albeit not very significantly, and as a result, under full charge depletion during the flight, the drone may not have enough time to return to the operational headquarters for the correct completion of its business task. The results of the evaluation of battery depletion attacks, in their application to UAVs, should be considered when developing effective defense mechanisms, both embedded directly into the drone and in energy-tracking network nodes within the situational center.

7 Conclusion

The paper has analyzed the possible types of battery depletion attacks on UAVs. Experimental results have been obtained by modeling battery depletion attacks on a fragment of a software/hardware prototype of a crisis management system using the Parrot A.R.Drone 2.0. As part of further work, it is planned to move towards the identification of these types of attacks on UAVs by constructing automated machine-learning-based detection tools.

Acknowledgements. The work is performed in SPIIRAS and supported by the Council for Grants of the President of Russia (project # MK5848.2018.9).

References

1. Rodday, N.M., Schmidt, R.O., Pras, A.: Exploring security vulnerabilities of unmanned aerial vehicles. In: NOMS 2016–2016 IEEE/IFIP Network Operations and Management Symposium, pp. 993–994. IEEE (2016)
2. Samland, F., Fruth, J., Hildebrandt, M., Hoppe, T., Dittmann, J.: AR.Drone: security threat analysis and exemplary attack to track persons. In: Intelligent Robots and Computer Vision XXIX: Algorithms and Techniques, vol. 8301. International Society for Optics and Photonics (2012)
3. Pleban, J.S., Band, R., Creutzburg, R.: Hacking and securing the AR.Drone 2.0 quadcopter: investigations for improving the security of a toy. In: Mobile Devices and Multimedia: Enabling Technologies, Algorithms, and Applications 2014, vol. 9030. International Society for Optics and Photonics (2014)
4. Vasconcelos, G., Carrijo, G., Miani, R., Souza, J., Guizilini, V.: The impact of DoS attacks on the AR.Drone 2.0. In: 2016 XIII Latin American Robotics Symposium and IV Brazilian Robotics Symposium (LARS/SBR), pp. 127–132. IEEE (2016)
5. Faria, L.A., Silvestre, C.A.M., Correia, M.A.F.: GPS-dependent systems: vulnerabilities to electromagnetic attacks. J. Aerosp. Technol. Manage. 8(4), 423–430 (2016)


6. Shepard, D.P., Bhatti, J.A., Humphreys, T.E., Fansler, A.A.: Evaluation of smart grid and civilian UAV vulnerability to GPS spoofing attacks (2012)
7. Muraleedharan, R., Osadciw, L.A.: Jamming attack detection and countermeasures in wireless sensor network using ant system. In: Wireless Sensing and Processing, vol. 6248. International Society for Optics and Photonics (2006)
8. Domin, K., Symeonidis, I., Marin, E.: Security analysis of the drone communication protocol: fuzzing the MAVLink protocol (2016)
9. Geethanjali, N., Gayathri, E.: A survey on energy depletion attacks in wireless sensor networks (2012)
10. Chang, S.Y., Kumar, S.L.S., Tran, B.A.N., Viswanathan, S., Park, Y., Hu, Y.-C.: Power-positive networking using wireless charging: protecting energy against battery exhaustion attacks. In: Proceedings of the 10th ACM Conference on Security and Privacy in Wireless and Mobile Networks, pp. 52–57. ACM (2017)
11. Martin, T., Hsiao, M., Ha, D., Krishnaswami, J.: Denial-of-service attacks on battery-powered mobile computers. In: Proceedings of the Second IEEE Annual Conference on Pervasive Computing and Communications, pp. 309–318 (2004)
12. Kim, Y., Kim, C.-M., Han, Y.-H., Jeong, Y.-S., Park, D.-S.: An efficient strategy of nonuniform sensor deployment in cyber physical systems. J. Supercomput. 66(1), 70–80 (2013)
13. Long, T., Ozger, M., Cetinkaya, O., Akan, O.B.: Energy neutral internet of drones. IEEE Commun. Mag. 56(1), 22–28 (2018)
14. Valentino, R., Jung, W.S., Ko, Y.B.: A design and simulation of the opportunistic computation offloading with learning-based prediction for Unmanned Aerial Vehicle (UAV) clustering networks. Sensors 18(11), 3751 (2018)
15. Mersheeva, V., Friedrich, G.: Multi-UAV monitoring with priorities and limited energy resources. In: 25th International Conference on Automated Planning and Scheduling (2015)
16. Desnitsky, V., Chechulin, A., Kotenko, I., Levshun, D., Kolomeec, M.: Combined design technique for secure embedded devices exemplified by a perimeter protection system. SPIIRAS Proc. 48, 5–31 (2016)
17. Desnitsky, V., Kotenko, I., Rudavin, N.: Ensuring availability of wireless mesh networks for crisis management. In: Del Ser, J., Osaba, E., Bilbao, M., Sanchez-Medina, J., Vecchio, M., Yang, X.S. (eds.) Intelligent Distributed Computing XII. Studies in Computational Intelligence, vol. 798, pp. 344–353. Springer, Cham (2018)
18. Desnitsky, V.A., Kotenko, I.V., Rudavin, N.N.: Protection mechanisms against energy depletion attacks in cyber-physical systems. In: 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), pp. 214–219 (2019)
19. Desnitsky, V., Kotenko, I.: Modeling and analysis of IoT energy resource exhaustion attacks. In: Ivanović, M., Bădică, C., Dix, J., Jovanović, Z., Malgeri, M., Savić, M. (eds.) Intelligent Distributed Computing XI. Studies in Computational Intelligence, vol. 737, pp. 263–270. Springer, Cham (2018)

The Integrated Model of Secure Cyber-Physical Systems for Their Design and Verification

Dmitry Levshun1,2(B), Igor Kotenko1,2, and Andrey Chechulin1,2

1 SPIIRAS, 39, 14th Liniya, St. Petersburg, Russia
[email protected]
2 ITMO University, 49 Kronverksky Pr., St. Petersburg, Russia
[email protected]

Abstract. This paper considers a new integrated model of secure cyberphysical systems for their design and verification. The suggested integrated model represents cyber-physical systems as a set of building blocks with properties and connections between them. The main challenge to build this model is in consolidating different approaches for modeling of cyber-physical systems in the general integrated approach. The main goal of the suggested general integrated approach is to ensure the transformation from one model to another without losing significant properties of building blocks as well as taking into account emergent properties arising from the interaction of system blocks. The correctness of the model is validated by its use for access control analysis. Keywords: Security by design · Cyber-physical system · Security verification · System modeling · Attacker model · Attack actions model · Integrated model

1 Introduction

Each cyber-physical system represents a complex structure which contains a lot of various elements. Cyber-physical systems (CPSs) can be distributed, decentralized and self-organized, and may also contain a variety of connected and disconnected microcontroller-based devices. As a consequence, there are many different techniques for designing and verifying such systems to be secure. Some of them are focused on software, some on hardware and some on highly specialized applications like cars, railway transport, robotics, smart houses, industrial systems, etc. An integral part of any design and verification technique is the CPS model, which allows one to display all the necessary properties of the CPS. Nowadays, there are a lot of tools and approaches that can help to model different aspects of CPSs: physical processes, software elements, hardware elements, computation platforms, networks, timings, performance, computations, communications, control, load balance, interactions, system behavior, topological relationships, interoperability, system boundaries, system hierarchy, workflows, business processes and others. Most of such approaches are focused on stability, reliability or systematicness of the system and do not take security into account. And while there are some CPS modeling approaches for the security design/verification of particular CPS elements, approaches for the design/verification of the security of the whole CPS are undeveloped now. It is important to base the modeling approach on an integrated model, because in most cases it is complex or even not possible to transform one particular model into another due to the absence of necessary data, which is not an issue for an integrated model. The main contribution of the paper is a new solution for the modeling of secure CPSs aimed at their design and verification: an integrated model that represents CPSs as a set of building blocks with properties and connections between them. The main challenge in building such a model is the integration of different approaches to modeling into a general integrated approach, taking into account that different blocks of CPSs are better represented by different models. The main goal of this general integrated approach is to ensure the transformation from one model to another without losing significant properties of building blocks, as well as taking into account emergent properties (arising from the interaction of CPS blocks). The novelty of our approach is in combining several different ways of CPS modeling for applying the unified integrated methodology of CPS design and verification: event-based modeling (data transfer environment level), digraph modeling (topology level), multi-agent modeling (network level), continuous-time modeling (hardware elements level), discrete modeling (software elements level), interaction modeling (building blocks level) and ontology modeling (CPS, attacker and attack actions levels). The proposed solution is focused on CPS security and allows one to combine the design and verification of secure CPSs in the unified integrated methodology. The focus on the security of the designed or verified system allows one to introduce into the integrated CPS model the models of the attacker and attack actions. At the same time, the impact of attacks is modeled through changes in the properties of the CPS building blocks, which allows one to evaluate the impact on the system as a whole. The paper is organized as follows. Section 2 considers the state of the art in the area of CPS modeling for design and verification, respectively. Section 3 suggests the new integrated model of CPSs for their design and verification. Section 4 presents an example of using this model to analyze the security of an access control system. Section 5 discusses the advantages and disadvantages of the proposed model, as well as the scope of its application. Section 6 contains the main conclusions and future work directions.

2 Related Work

In [5] the authors classify the CPS modeling approaches according to the aspects of the system displayed by the model and the tasks that can be solved. This classification is as follows:

1. Models based on timed actors for timings and performance [2].
2. Event-based models for computations, communications and control.


3. SCADA (Supervisory Control and Data Acquisition) models for load balance, stability and integrity [11].
4. Combination of ordinary differential equations and automata for non-complex systems [12].
5. Continuous-time models for physical processes [7].
6. MDD (Model-Driven Development), MIC (Model-Integrated Computing) and DSM (Domain Specific Modeling) for software elements.
7. Multi-agent models for interaction between system elements.

In [4] the authors use Modelica for system model representation and transfer it to a mathematical model for simulation. They model CPSs as an assembly of components and associated interfaces between them. They use a continuous model for the dynamics of physical components and a discrete model for the behaviors of computing components. The main challenge for their approach is in joining these two models to determine important functional and system parameters and future optimization. In [8] a CPS design methodology is proposed. Let us consider the seven main steps of the methodology in more detail:

1. System boundary definition is related to black box and white box analysis. The implementation is based on SYSML (The Systems Modeling Language) diagrams or Dymola/Modelica models.
2. Multi-view or multi-level modeling is focused on the MBSE (Model-Based Systems Engineering) approach. The implementation is based on SYSML and OOM (Object-Oriented Modeling) for specification, analysis, design, verification and validation of CPSs.
3. Interaction modeling uses port-based modeling and is related to physical support, control support and the multi-domain nature of CPSs. The implementation is based on SYSML and Dymola/Modelica models.
4. Topological modeling is based on the idea that existing scales and different viewpoints can be represented as a collection of topological entity sets and subsets linked together through different semantical degrees. The authors use a set of directed graphs to represent the dependencies between subsystems, components and related parameters.
5. Semantic interoperability is related to the definition of existing viewpoints with the help of an ontology of mechatronic and communication design knowledge and to the decomposition of each design viewpoint through a graph-based topological analysis. The implementation is based on graph-based mapping ontologies.
6. Multi-agent modeling is related to models of control and communication protocols (time-delayed communication, interactions, changing of topology, communication network nodes and links, packet losses). The authors' implementation is based on topological graphs and multi-agent modeling.
7. Collaboration modeling is related to solving the multi-view issues and the issue of communication between agents with different ontologies. The authors' implementation is based on OWL (Web Ontology Language), which provides interfaces.


The CPS model in [8] contains: external and internal interactions, process control, behavior simulation, representation of topological relationships, and interoperability through a multi-agent platform. In [1] an example of verifying CPS timing correctness and performance is presented. The authors use the following verification models: functional relations between inputs and outputs of the system; timing of components; communication between components; synchronization constraints of components. The authors perform validation in TrueTime (Matlab/Simulink), verification in UPPAAL (specification of verification models) and model checking for checking whether the CPS implements the requirements. With the help of model checking, they verify stability, safety (invariance) and reachability of the CPS. In [9] an object-oriented workflow language for formalizing CPS processes within heterogeneous and dynamic environments is presented (a component-based meta-model of CPSs). Workflows (or processes) are used to model the high-level behavior of the system and are divided into the following levels of abstraction: process meta-meta model (the semantic and syntactic elements and structures), process meta model (defines all elements, types, relations and their structural combinations), process model (defines an abstract description of the process) and process instance (defines a concrete process at execution time). The workflow contains the following parts: process step, transition, data, event, logic step, process and handling entities. The implementation is based on EMF (Eclipse Modeling Framework). In [10] an approach for verification of the timing performance of NAS (Network Automation Systems) is presented. The response time verification approach consists of three main phases:

1. Model building is related to the specification of component reaction times and measuring of their timing performance.
2. Modeling is related to the proposition of component-based mathematical models (network architecture and inter-connections).
3. Verification is related to the abstraction of NAS formal models as UPPAAL timed automata with their timing interfaces (based on the proposition of action patterns and their timing wrappers).

At the final step of the approach, the resulting formal model is used to verify the total response time of the NAS using a subset of timed computation tree logic (TCTL) in the UPPAAL model checker.

3 The Suggested Integrated Model

As a result of the analysis of the current state of the art, we concluded that most approaches to CPS modeling are focused on stability, reliability or systematicness of the system and do not take security into account. And while there are some modeling approaches for the design/verification of the security of particular elements, approaches for the design/verification of the security of the whole CPS are required. In our design and verification methodology for secure CPSs, the CPS model consists of the following parts: building blocks (hardware elements and software elements), connections between them (topology and data transfer environment), attacker and attack actions. The implementation of such a CPS model is presented in Fig. 1. Black rounded rectangles display the CPS model with its elements, while black directed arrows display their hierarchy and nesting. White rounded rectangles display external models, which are connected and agreed with the integrated model.

Fig. 1. An integrated model of CPS for design and validation of its security

The integrated model of CPS can be represented as follows:

cps = (CPS*, BB, nw, a, AA, Pcps)  (1)

where CPS* – set of cyber-physical subsystems cps* of cps; BB – set of building blocks of cps; nw – connections between elements of BB; a – attacker on cps; AA – set of attack actions on cps; Pcps – set of properties of cps. Note: each cps element at this level is considered as an object with a certain set of properties and connections, without taking into account the internal structure (this rule works for all model elements). The implementation is based on OWL modeling and helps one to solve multi-view issues. The attacker on cps can be represented as follows:

a = (tp, lvl, ap)  (2)

where tp – type of access of a with its properties Ptp; lvl – level of capabilities of a with its properties Plvl; ap – access point of a. The implementation is based on OWL modeling and helps one to solve multi-view issues. The set of attack actions on cps can be represented as follows:

AA = {(aa1, prcnd1, Paa1), ..., (aan, prcndn, Paan)}  (3)

where aai | i ∈ 1...n – i-th attack action from AA with its properties Paai; prcndi | i ∈ 1...n – preconditions (a link to the target element (bb, hw, dte, etc.) and a) of the i-th attack action from AA. The implementation is based on OWL modeling and helps one to solve multi-view issues.
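As an illustration of how formulas (1)–(3) could be carried into code, the sketch below represents the model elements as data classes. This is our own minimal rendering for explanatory purposes, not the OWL-based implementation used in the paper.

from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class Attacker:                    # a = (tp, lvl, ap), formula (2)
    tp: int                        # type of access
    lvl: int                       # level of capabilities
    ap: str                        # access point

@dataclass
class AttackAction:                # element of AA, formula (3)
    name: str
    preconditions: Dict[str, Any]  # link to the target element and the attacker
    properties: Dict[str, Any] = field(default_factory=dict)

@dataclass
class CPS:                         # cps = (CPS*, BB, nw, a, AA, Pcps), formula (1)
    subsystems: List["CPS"]
    building_blocks: List[Any]
    network: Any
    attacker: Attacker
    attack_actions: List[AttackAction]
    properties: Dict[str, Any]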


The building block of cps can be represented as follows:

bb = (BB*, HW, SW, Pbb) | bb ∈ BB  (4)

where BB* – set of building subblocks bb* of bb; HW – set of hardware elements of bb; SW – set of software elements of bb; Pbb – set of properties of bb. The implementation is based on SYSML interaction modeling. The hardware element of HW can be represented as follows:

hw = (HW*, Phw) | hw ∈ HW  (5)

where HW* – set of hardware subelements hw* of hw; Phw – set of properties of hw. The implementation is based on VHDL (VHSIC (Very High Speed Integrated Circuits) Hardware Description Language) continuous-time modeling. The software element of SW can be represented as follows:

sw = (SW*, Psw) | sw ∈ SW  (6)

where SW* – set of software subelements sw* of sw; Psw – set of properties of sw. The implementation is based on UML (Unified Modeling Language) discrete modeling. The connections between elements of BB can be represented as follows:

nw = (tpl, dte, Pnw)  (7)

where tpl – topology of nw with its properties Ptpl; dte – data transfer environment of nw with its properties Pdte; Pnw – set of properties of nw. The implementation of nw is based on SPARK (Simple Platform for Agent-based Representation of Knowledge) multi-agent modeling. The implementation of tpl is based on O3PRM (Open Object-Oriented Probabilistic Relational Models) digraph modeling. The implementation of dte is based on SIGMA event graph simulation modeling. All elements of cps are connected through their properties. It means that, to ensure the required level of security of a CPS, the goal of its design phase is to find the most rational combination of cps elements according to the balance between their needs (FR, NFL) and capabilities (PRF, PRR). On the other hand, the influence of each aa in the proposed model consists in reducing the capabilities of cps or its elements (DDoS) or enhancing their needs (Resource Depletion). And the influence of a consists in reducing AA according to its tp and lvl.

p = (FR, NFL, PRF, PRR)  (8)

where FR – set of functional requirements that are necessary for cps or its element(s) to work; NFL – set of non-functional limitations whose satisfaction is necessary for cps or its element(s) to work; PRF – set of provided functionality that will be available for cps or its element(s); PRR – set of provided resources that will be available for cps or its element(s). The structure of p is presented in Fig. 2; a sketch of the corresponding design-phase balance check is given after the figure.


Fig. 2. Properties in integrated model of CPS
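Treating the elements of p as sets of labels, the design-phase balance between needs and capabilities can be checked mechanically. The sketch below is our simplified reading of formula (8): set inclusion stands in for the richer matching performed in the actual methodology, and the example labels are invented.

def is_balanced(fr, nfl, prf, prr):
    # p = (FR, NFL, PRF, PRR): every functional requirement must be covered
    # by provided functionality, every non-functional limitation by
    # provided resources.
    return fr <= prf and nfl <= prr

# Example: a lock block needs 'open_door' functionality and a 5 V supply.
fr = {"open_door", "read_card"}
nfl = {"5V_supply"}
prf = {"open_door", "read_card", "sound_signal"}
prr = {"5V_supply", "flash_512MB"}
print(is_balanced(fr, nfl, prf, prr))  # True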

Another key aspect of p of cps is the origin of emergent properties in the process of interaction of cps elements. Emergent properties work as specific modifications that reduce the capabilities of cps and its elements during the interaction of their parts (this is due to the fact that collaboration requires additional resources). The work of emergent properties can be presented with the help of the following function:

fp(x) = Px, x = (y1, ..., yn) | n ∈ N; fp(x) = fp(y1)EPxy1 ∪ ... ∪ fp(yn)EPxyn  (9)

where x – element of cps, or cps itself, which consists of yi | i ∈ 1...n (also cps elements); Px – properties of x; EPxyi | i ∈ 1...n – emergent properties of x. The influence of epi ∈ EP | i ∈ 1...n can be classified by the area of the influence and by the result of the influence. By the area of the influence, epi ∈ EP | i ∈ 1...n can be divided into hardware, software, hardware and software (building blocks), topology, data transfer environment, network and system. According to the result of the influence, epi ∈ EP | i ∈ 1...n can be divided into positive, neutral and negative. Neutral emergent properties imply the absence of EP; however, they are necessary for a complete classification. The verification process in the integrated model of cps can be divided into two parts: verification of the possibility to design the security of cps (based on a comparison of Pcps with the FR and NFL defined by the stakeholder) and verification of the protection of cps against a with tp and lvl (also defined by the stakeholder).
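One possible computational reading of formula (9) is a recursive aggregation of the properties of sub-elements, with the emergent properties EP applied as reductions. In the sketch below, properties are modeled as numeric resource capabilities and EP as multiplicative penalties; this interpretation and the figures are our assumptions.

def aggregate(element):
    # fp(x): combine the capabilities of sub-elements recursively and apply
    # the emergent-property reduction EP (collaboration overhead).
    own = dict(element.get("properties", {}))
    for child in element.get("children", []):
        for name, value in aggregate(child).items():
            penalty = element.get("ep", {}).get(name, 1.0)  # emergent reduction
            own[name] = own.get(name, 0) + value * penalty
    return own

device = {
    "ep": {"cpu": 0.9},  # interaction of the parts costs 10% of CPU capability
    "children": [
        {"properties": {"cpu": 100, "ram": 64}},
        {"properties": {"cpu": 50}},
    ],
}
print(aggregate(device))  # {'cpu': 135.0, 'ram': 64.0}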

4 Experiments

For the experimental evaluation of the proposed model, we used a prototype of an access control system (ACS) [3]. This prototype contains 5 main parts: microcontroller-based devices, an access server, a syslog server, an admin client and host agents. Let us consider them in terms of the integrated model. The developed prototype of the ACS is a cps with the following PRF: physical access control (who is allowed to enter, when and to which room), including restriction of access to specified rooms and identification of users who have access to specified rooms; and information access control (allowing or denying access to information on personal computers). Physical access control is provided by BB1, where each bb1i is a microcontroller-based device, which contains: an Arduino Yun microcontroller (a building subblock with hardware (ATmega 32U4, AR9331, etc.) and software (firmware, programs and libraries) elements); a microSD 512 MB card, whose PRR extends the amount of flash memory available for Arduino Yun software; a mechanic lock TowerPro SG90, whose PRF allows bb1i to close and open the door; an RFID reader Grove 125 KHz, whose PRF allows bb1i to check user smartcards with unique identifications; an infrared motion sensor HC-SR 501, whose PRF allows bb1i to detect unauthorized access attempts; and other components, whose PRF allows bb1i to output audio and light signals as well as text information to the user. Each bb1i is connected to bb2 (access server) and bb3 (syslog server) through a TCP/IP nw. The connection between BB1 and bb2, bb3 is provided via a Wi-Fi dte with a strong password and WPA2 encryption (PRF) and has a star topology tpl (each bb1i is connected to bb2 and bb3 through its own client). In our model, the tp of a can be from 0 to 4: type 0 – no access to ACS BB and nw, only indirect action (example: social engineering methods); type 1 – indirect access to ACS BB and nw (example: access through TCP/IP to bb2 and bb3); type 2 – direct access to ACS BB and nw, while being at some distance from the protected perimeter (example: access through Wi-Fi to BB1); type 3 – physical access to ACS BB and nw (example: access through USB to bb1i, modification of its electronic components); type 4 – full access to ACS BB and nw (example: access to microcircuits and internal interfaces of bb1i, such as memory, debugging and update interfaces). And the lvl of a can be from 1 to 3: level 1 – insufficient knowledge about ACS BB and nw; a can use only widely-spread software tools and exploit only well-known vulnerabilities (examples of AA: attacks on a web server, social engineering, traffic interception); level 2 – a has information about ACS BB and nw and can use specialized attacking tools and exploit previously non-used vulnerabilities (examples of AA: man-in-the-middle, DDoS, buffer overflow); level 3 – a group of attackers of level 2 (examples of AA: cryptanalysis of encrypted messages, attacks on the authentication system, interception, modification and faking of messages). Let us consider a whose tp is 2 and lvl is 2. His or her AA, whose prcnd is linked to the Wi-Fi interface of dte, are: interception, modification and faking of messages between each bb1i and bb2, bb3 (man-in-the-middle); classic network attacks on each bb1i (DDoS, TCP SYN Flood); attacks based on exploiting specific vulnerabilities of each bb1i (sending incorrect network packets, buffer overflow); cryptographic analysis of encrypted messages between each bb1i and bb2, bb3; attacks on the update system of each bb1i. Man-in-the-middle attacks can lead to interception, modification and faking of messages between each bb1i and bb2, bb3 if the cps is using Wi-Fi whose security is out of its responsibility (in our example, the ACS was connected to an already existing Wi-Fi network used by employees of the organization). In such a situation, if a receives access to the nw, then he or she will be able to sniff all traffic between each bb1i and bb2, bb3 without any issues. It means that the cps can receive fake or modified events from the sensors. To prevent such situations, the free resources of each bb1i should be analyzed. The Arduino Yun flash memory amount is 32 KB, while its firmware (access control logic, work with sensors and communication protocols) size is 17.2 KB. That allows us to add to the firmware one more sw whose size is not more than 14.8 KB. The PRF of this sw extends the firmware functionality with encryption of sent messages. Such a dte improvement allows us to detect the man-in-the-middle attack due to the growing number of rejected messages from each bb1i to bb2 and bb3 (all events faked or modified by a will be rejected). The size limitation of this paper does not allow us to describe the whole process of ACS modeling with the proposed integrated approach. As the result of the experiment, we improved the security of communication between the ACS elements during the early design stages. And according to the results of the verification in the UPPAAL model checker, the improved ACS is protected against a whose tp is 2 and lvl is 2. Moreover, according to the results of model checking of access control policies in the SAL (Symbolic Analysis Laboratory) tool, the ACS satisfies the formal specifications.
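The detection principle just described, in which faked or modified events fail authenticated decryption and are rejected, can be sketched as follows. We use the Python cryptography library's Fernet primitive purely for illustration; the actual firmware component is a custom sw of at most 14.8 KB, and the event format below is invented.

from cryptography.fernet import Fernet, InvalidToken

key = Fernet.generate_key()  # shared between each bb1i and the servers
channel = Fernet(key)

def device_send(event: bytes) -> bytes:
    # Done by the added firmware sw: authenticated encryption of the event.
    return channel.encrypt(event)

rejected = 0

def server_receive(token: bytes):
    # bb2/bb3 side: decrypt-and-verify; faked or modified events are
    # rejected, and a growing rejection counter reveals a man-in-the-middle.
    global rejected
    try:
        return channel.decrypt(token)
    except InvalidToken:
        rejected += 1
        return None

legit = device_send(b"door_open,room=12,card=0451")
print(server_receive(legit))                # accepted
print(server_receive(legit[:-3] + b"xyz"))  # tampered -> None
print("rejected:", rejected)                # 1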

5 Discussion

In this paper we presented a new solution for the modeling of secure CPSs for their design and verification: an integrated model that represents cyber-physical systems as a set of building blocks with properties and connections between them. This model can represent static (for design) and dynamic (for verification) states of CPSs, and depending on the represented state of the system, appropriate tools and approaches should be used. The important issue in building such a model is that different elements of the CPS are better represented by different models. It means that integrating different modeling approaches into the general integrated approach is not an easy task and requires exact transitions between various tools and models without missing necessary information. Our solution is focused on the design and verification of CPS security. It means that the resources of the CPS mentioned in the properties of our model are the resources that remain free while the CPS performs its tasks and can therefore be used to ensure its security. Moreover, the amount of resources that the CPS uses at different points in time is not static. This should be taken into account, for example, when combining the building blocks of the CPS, because if two elements of the system use the same amount of resources but never do so at the same time, then from the point of view of the load on the system there is no conflict between them. On the other hand, if a well-timed attack action enhances the resource needs of a critical component of the CPS, it can lead to a complete system malfunction. The comparison with other integrated models in terms of design and verification of CPS security is presented in Table 1.


Table 1. Comparison of integrated models for design and verification of CPSs security

Attacker | Attacking actions | Hu et al. [5] | Penas et al. [8] | Suggested
type 0 | Social engineering | − | − | ±
type 1 | Man-in-the-middle, DDoS, TCP SYN Flood, sending incorrect network packets, buffer overflow, cryptanalysis, attacks on updating system for TCP/IP | + | + | +
type 2 | Same as type 1, but for IR, Wi-Fi, Bluetooth; side channel attacks based on electromagnetic radiation analysis | + | + | +
type 3 | Side channel attacks based on various characteristics (direct access), attacks on interfaces, replacing of the original CPS elements with fake ones | + | − | +
type 4 | Disassembling of CPS elements, exploitation of hardware exploits (internal interfaces, hidden ports, inter-component communication), the electronic components data modification, the encryption keys extraction | − | − | +

A "+" for [5] and [8] denotes the theoretical possibility of detection of the corresponding attack actions at the CPS design and verification stages, not the availability of a ready-made solution. Moreover, none of the approaches takes the social aspect of CPSs into consideration fully. In our future investigations we plan to include users of the CPSs as an integrated model element with their own properties and influence on the system's work and behavior.

6 Conclusion

This paper describes the integrated model of CPSs, which is a key element of our design and verification methodology for secure CPSs [6]. The correctness of the suggested model is validated by its use in the security analysis of an access control system. The results of the analysis allowed us to check the list of possible attack actions on the access control system and to improve the security of its data transfer environment by extending the firmware of the microcontroller-based devices with a message encryption algorithm. In further investigations, it is planned to study different catalogs of attacks on CPSs in order to classify them and enhance our knowledge base of models of the attacker and models of attack actions so as to improve our integrated model. Moreover, it is also planned to perform more experiments with various use cases (an access control system, a robotic system, an integrated security system and others).

The Integrated Model of Secure CPSs for Their Design and Verification

343

Acknowledgements. This research is being partially supported by the grants of the RFBR (projects No. 19-07-01246, 19-37-90082), by the budget (the project No. 00732019-0002), and by Government of the Russian Federation (Grant 08-08).

References

1. Balasubramaniyan, S., Srinivasan, S., Buonopane, F., Subathra, B., Vain, J., Ramaswamy, S.: Design and verification of cyber-physical systems using TrueTime, evolutionary optimization and UPPAAL. Microprocess. Microsyst. 42, 37–48 (2016)
2. Canedo, A., Schwarzenbach, E., Faruque, M.A.A.: Context-sensitive synthesis of executable functional models of cyber-physical systems. In: 2013 ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS), pp. 99–108. IEEE (2013)
3. Desnitsky, V., Chechulin, A., Kotenko, I., Levshun, D., Kolomeec, M.: Application of a technique for secure embedded device design based on combining security components for creation of a perimeter protection system. In: 2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), pp. 609–616. IEEE (2016)
4. Hehenberger, P., Vogel-Heuser, B., Bradley, D., Eynard, B., Tomiyama, T., Achiche, S.: Design, modelling, simulation and integration of cyber physical systems: methods and applications. Comput. Ind. 82, 273–289 (2016)
5. Hu, F., Lu, Y., Vasilakos, A.V., Hao, Q., Ma, R., Patil, Y., Zhang, T., Lu, J., Li, X., Xiong, N.N.: Robust cyber-physical systems: concept, models, and implementation. Future Gener. Comput. Syst. 56, 449–475 (2016)
6. Levshun, D., Chechulin, A., Kotenko, I.: Design lifecycle for secure cyber-physical systems based on embedded devices. In: 2017 9th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), vol. 1, pp. 277–282. IEEE (2017)
7. Nuzzo, P., Sangiovanni-Vincentelli, A.L., Bresolin, D., Geretti, L., Villa, T.: A platform-based design methodology with contracts and related tools for the design of cyber-physical systems. Proc. IEEE 103(11), 2104–2132 (2015)
8. Penas, O., Plateaux, R., Patalano, S., Hammadi, M.: Multi-scale approach from mechatronic to cyber-physical systems for the design of manufacturing systems. Comput. Ind. 86, 52–69 (2017)
9. Seiger, R., Keller, C., Niebling, F., Schlegel, T.: Modelling complex and flexible processes for smart cyber-physical environments. J. Comput. Sci. 10, 137–148 (2015)
10. Srinivasan, S., Buonopane, F., Vain, J., Ramaswamy, S.: Model checking response times in networked automation systems using jitter bounds. Comput. Ind. 74, 186–200 (2015)
11. Srivastava, A., Morris, T., Ernster, T., Vellaithurai, C., Pan, S., Adhikari, U.: Modeling cyber-physical vulnerability of the smart grid with incomplete information. IEEE Trans. Smart Grid 4(1), 235–244 (2013)
12. Xinyu, C., Huiqun, Y., Xin, X.: Verification of hybrid Chi model for cyber-physical systems using PHAVer. In: 2013 Seventh International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 122–128. IEEE (2013)

Scalable Data Processing Approach and Anomaly Detection Method for User and Entity Behavior Analytics Platform

Alexey Lukashin1(B), Mikhail Popov1, Anatoliy Bolshakov2, and Yuri Nikolashin2

1 Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia
{alexey.lukashin,popov m}@spbstu.ru
2 Gazinformservice, St. Petersburg, Russia
{Bolshakov-A,Nikolashin-Y}@gaz-is.ru

Abstract. User and entity behavior analytics (UEBA) is a popular and modern way of finding security threats in corporate infrastructure. Anomaly detection in data allows detecting incidents which cannot be detected by other methods, including rules in classical SIEM systems. However, this requires the development of scalable software and analytical methods which can handle thousands of events per second. The paper describes approaches for processing semi-structured data from different sources for further analytics using anomaly detection methods. A new method of building features from hybrid data streams from different SIEM sources is introduced. The paper also contains a study of the efficiency and scalability of the developed approach.

Keywords: UEBA · SIEM · Anomaly detection · Machine learning · CEF · Isolation Forest · Feature engineering · Security

1 Introduction

Information security in corporate infrastructures is one of the most important topics today. In the era of digitalization and Industry 4.0 everything becomes digital, and the intellectual property of enterprises can be compromised and leaked if there is no adequate protection. Today one of the most popular ways of controlling system data flows and events in corporate infrastructure is using SIEM (security information and event management) systems. These systems gather and store information from different resources such as network devices, workstations, servers, etc. SIEM systems allow defining rules which describe security incidents. In case of a violation of the rules, a certain action described by security analysts is performed. But not everything can be described by rules. Also, rules are static, while modern infrastructures, especially cloud-based ones, are dynamic. In 2015 Gartner introduced user and entity behavior analytics (UEBA) as a new way of controlling corporate infrastructures [3]. This approach allows


detecting deviations from standard behavior by using anomaly detection and other machine learning methods and registering them as security incidents. This paper presents the architecture of a UEBA platform and describes a generic way of processing different data streams from a SIEM system and detecting security problems using machine learning methods. The paper presents some of the theoretical and practical results of a project for creating a prototype of a UEBA system; it is based on the authors' previous experience in creating information security systems for dynamic cloud environments [5,6], managing complex cloud-based infrastructures [4] and applying anomaly detection methods to control data flows [9]. A search for other articles on this topic returned about 200 papers, most of them from recent years. The search for anomalies is mainly described for social networks and Internet surfing, for example [1,2,7,8].

2 Anomaly Detection Method for Semistructured Data

SIEM systems produce a lot of events with different contexts. A method of extracting feature vectors from messages with different data fields is introduced in this section. One of the standards in SIEM systems for the format of messages is CEF (common event format). This format was introduced by ArcSight, and the standard contains mandatory fields like timestamp, type, product name and version, etc. Message-specific information is not fixed in the CEF standard and is placed as a set of key-value fields in the extension section. An unsupervised method based on Isolation Forest was chosen to identify anomalies. Feature engineering for the analysis was developed drawing on statistics collection, which consists of collecting information on the frequency of occurring values for the observed keys of the analyzed fields in events. Collecting statistics is the process of obtaining weights for each value of each key from the fields of interest in the event. In the software implementation, the statistics are presented in the form of a map M containing values such as m[asa_dest_port|1000] = 0.1, where asa_dest_port|1000 is a composite key consisting of the observed event key and a specific value. Statistics are calculated periodically, by time or by a number of events. At each step, the statistics are merged with the previous statistics through a forgetting ratio. This approach allows the system to implement memory and gradual adaptation to the current situation, as well as to conduct the analysis of events on the stream close to real time. The basic steps to get feature vectors for statistics that are calculated periodically by a number of events are:

1. Getting the frequency of occurrence of a value in the chunk of events according to formula (1):

$\nu_i^k = \frac{count_i^k}{N}$,   (1)

where $count_i^k$ is the number of events with the same value for a specific key, and $N$ is the total number of events in the chunk.


2. Obtaining weights for each key value based on the previous and current statistics, using an “averaging” algorithm with a forgetting coefficient (taking into account the current and previous values using the coefficients):

$w_i^k = \omega_{i-1}^k \cdot k_f + \nu_i^k \cdot (1 - k_f), \quad w_0^k = \nu_0^k$,   (2)

where $\omega_{i-1}^k$ is the value of the normalized weight at the previous step (previous chunk), $k_f$ is the forgetting coefficient for the weight from the previous step, and $\nu_i^k$ is the occurrence frequency of the value at the current step.

3. Normalization of weights and filtering of weights by a threshold value according to formula (3):

$\omega_i^k = \frac{w_i^k}{\sum_i w_i^k}$,   (3)

where $\omega_i^k$ is the weight of the value for a particular key $k$ and $w_i^k$ is the weight before normalization. If $\omega_i^k < M_t$, then this value is removed from the list of weights, where $M_t$ is the threshold value of the statistics.

3 Scalable Data Processing Approach

SIEM systems can produce huge amounts of data, up to tens of thousands of events per second (eps). Thus, it is necessary to handle the message flow in a scalable way. UEBA systems have to be able to process data flows in near real-time mode and react to found anomalies. Another problem is related to splitting data streams into sub-streams and applying different analyzers to them. An analyzer can be represented as a black box which takes data from its input, processes the data, and sends data to its output. Consider the typical infrastructure and organization of a company, which consists of: administrative personnel; a software development department; an accounting department; a sales department; network devices; corporate services (mail, CRM, HR system, etc.). Obviously, the behaviour of different entities and users differs greatly, and the content of their events is also different. Consequently, the analytic methods have to be different. We propose the following methodology to solve this problem. The data processing approach is decomposed into the following items:

1. What. This item covers which parameters (data fields) and which types of events (network, services, etc.) are important for solving the problem.
2. How. This item covers how to extract feature vectors from the data. These feature vectors will be used for anomaly detection and classification. One feature extraction method is presented in Sect. 2.
3. By. This item describes the method which will be applied to the feature vectors. It can be, for example, LSTM, Isolation Forest, SVM, etc.

Any analyzer has to be constructed from these items. Our idea is that the user selects data fields, selects a feature extraction method from a predefined list, selects a


supporting library from a predefined list, and starts the analyzer. After that, the user constructs rules which extract a sub-stream from the event stream and configures the system to apply the analyzer to this sub-stream to perform data analytics. For the data connection, we propose to use a message-oriented approach based on queues.
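To make the construction concrete, here is a small sketch of how such an analyzer description could be wired to a queue-based sub-stream. The AnalyzerSpec structure, the routing predicate and all names are illustrative choices of ours; the paper does not prescribe a concrete API.

```python
import queue
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class AnalyzerSpec:
    fields: List[str]                  # "what": relevant event fields
    extractor: Callable[[dict], list]  # "how": event -> feature vector
    method: Callable[[list], bool]     # "by": feature vector -> is_anomaly

def route(events: queue.Queue, predicate, spec: AnalyzerSpec):
    """Pull events from a queue, keep the matching sub-stream,
    and apply the configured analyzer to each event."""
    while not events.empty():
        event = events.get()
        if not predicate(event):
            continue
        projected = {k: event.get(k) for k in spec.fields}
        if spec.method(spec.extractor(projected)):
            print("anomaly:", event)

# Example: analyze only Cisco ASA events by destination port.
q = queue.Queue()
q.put({"DeviceVendor": "CISCO", "DeviceProduct": "ASA", "asa_dest_port": 31337})
spec = AnalyzerSpec(fields=["asa_dest_port"],
                    extractor=lambda e: [e["asa_dest_port"]],
                    method=lambda v: v[0] > 10000)  # toy stand-in for a model
route(q, lambda e: e.get("DeviceVendor") == "CISCO", spec)
```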

4 The Architecture of the Software Platform for User and Entity Analytics

The UEBA platform consists of the following subsystems: (1) a data gathering system; (2) a data processing system; (3) a user interface for analytics and control of data flows. The data processing layer is a set of services which gather data from the SIEM system. Our UEBA platform prototype is integrated with the HP ArcSight platform and supports a message stream from agents in CEF format, but our platform can also be integrated with other SIEM systems. The integration between the SIEM and the UEBA platform is implemented using queues. Processed messages are saved in local storage for offline analytics and are also buffered and sent to online processors for online anomaly detection and other data processing methods. The data processing system is represented as a set of analyzer templates which is called the digital library of analysers. This library consists of prepared application modules, which implement the how and by tasks, and models, which implement the what task (described in Sect. 3). An application is a preconfigured Docker container or virtual machine with metadata. A model is represented by a file with parameters and/or a pretrained neural network or any other classifier. Each application module and model is versioned. A combination of specific versions of an application and a

Fig. 1. The architecture of UEBA prototype


model forms a workspace. A workspace is a running instance in the virtual environment of the UEBA system. Most of the workspaces in our UEBA system are implemented as Docker containers running in Kubernetes, which is part of the platform prototype. The architecture of the UEBA platform prototype is presented in Fig. 1.

5 Cases and Experiments

To check the method, a dataset was selected that includes port scanning on the network. Events with DeviceVendor = CISCO and DeviceProduct = ASA correspond to this activity in the SIEM system. The following CEF event fields were selected to look for port-scanning anomalies: source IP, destination IP, source port, destination port. These fields are sufficient to track network activity for signs of port scanning. The dataset of events was collected over 7 h; the total number of events is 139629, of which 3796 are port scanning events.

Table 1. The result for scanning network

N    | kf  | Mt   | Scan anom | Found anom | Accuracy | Recall | Precision | F1 score
100  | 0,8 | 0    | 1925      | 3845       | 0,973    | 0,501  | 0,507     | 0,504
562  | 0,2 | 0,02 | 1817      | 3496       | 0,974    | 0,520  | 0,479     | 0,498
1000 | 0   | 0    | 1811      | 3329       | 0,975    | 0,544  | 0,477     | 0,508

To find the best quality score, parameters from the following ranges were searched by brute force: N ∈ [100, 562, 1000], kf ∈ [0.2, 0.4, 0.6, 0.8], Mt ∈ [0, 0.02, 0.04, 0.06, 0.08]. From Table 1 we can draw the following conclusion: on this sample, the best result was obtained with the parameters N = 1000, kf = 0, Mt = 0. The following values were obtained for these data: 1811 anomalous events were revealed out of the 3329 found, and 1518 normal events out of the 139629 total were falsely determined as anomalous. This corresponds to the following accuracy criteria: Accuracy = 0.9749, Precision = 0.4771, Recall = 0.5440, F1 = 0.5084.
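For reference, the reported figures follow from the raw counts in the text; the short sketch below reproduces them using the standard definitions (the variable names are ours). Note that the paper lists the 0.5440 and 0.4771 values under interchanged precision/recall labels relative to these definitions.

```python
def detection_metrics(tp, fp, total_positives, total_events):
    """Quality metrics from raw detection counts (standard definitions)."""
    fn = total_positives - tp
    tn = total_events - tp - fp - fn
    precision = tp / (tp + fp)     # 1811 / 3329
    recall = tp / total_positives  # 1811 / 3796
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / total_events
    return precision, recall, f1, accuracy

# 1811 true detections among 3329 flagged events (1518 false positives),
# against 3796 actual scan events in a dataset of 139629 events.
p, r, f1, acc = detection_metrics(tp=1811, fp=1518,
                                  total_positives=3796, total_events=139629)
print(f"{p:.4f} {r:.4f} {f1:.4f} {acc:.4f}")  # 0.5440 0.4771 0.5084 0.9749
```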

6 Conclusion

In this paper a novel architecture of a UEBA platform is proposed and implemented. Experiments with network devices (Cisco ASA agent), Windows domain events, and a Checkpoint firewall demonstrated quite good results in detecting anomalous behaviour. In future work, we will extend the set of analysers and experiment with different machine learning methods, including: research of seq2seq methods to predict attacks (LSTM networks and others), the possibility of using the LSTM method


for anomaly detection, research of survival analysis methods for evaluating the probability of network attacks being performed, and extending the presented method by adding a running window of events and introducing separate statistics for long-term and current memory. The proposed architecture allows us to extend the prototype by adding new analysers to the digital library and to perform different experiments on the same data.

References

1. Brown, A., Tuor, A., Hutchinson, B., Nichols, N.: Recurrent neural network attention mechanisms for interpretable system log anomaly detection. arXiv preprint arXiv:1803.04967 (2018)
2. Buczak, A.L., Berman, D.S., Yen, S.W., Watkins, L.A., Duong, L.T., Chavis, J.S.: Using sequential pattern mining for common event format (CEF) cyber data. In: Proceedings of the 12th Annual Conference on Cyber and Information Security Research, p. 2. ACM (2017)
3. Litan, A.: Market guide for user and entity behavior analytics. Gartner (G00276088), 22 September 2015
4. Lukashin, A., Lukashin, A.: Resource scheduler based on multi-agent model and intelligent control system for OpenStack. In: International Conference on Next Generation Wired/Wireless Networking, pp. 556–566. Springer (2014)
5. Lukashin, A., Zaborovsky, V.S., Kupreenko, S.: Access isolation mechanism based on virtual connection management in cloud systems – how to secure cloud system using high perfomance virtual firewalls. In: ICEIS, vol. 3, pp. 371–375 (2011)
6. Molyakov, A., Zaborovsky, V.S., Lukashin, A.: Model of hidden IT security threats in the cloud computing environment. Autom. Control Comput. Sci. 49(8), 741–744 (2015)
7. Sun, L., Versteeg, S., Boztas, S., Rao, A.: Detecting anomalous user behavior using an extended isolation forest algorithm: an enterprise case study. arXiv preprint arXiv:1609.06676 (2016)
8. Tuor, A., Kaplan, S., Hutchinson, B., Nichols, N., Robinson, S.: Deep learning for unsupervised insider threat detection in structured cybersecurity data streams. In: Workshops at the Thirty-First AAAI Conference on Artificial Intelligence (2017)
9. Utkin, L.V., Zaborovsky, V.S., Lukashin, A.A., Popov, S.G., Podolskaja, A.V.: A siamese autoencoder preserving distances for anomaly detection in multi-robot systems. In: 2017 International Conference on Control, Artificial Intelligence, Robotics & Optimization (ICCAIRO), pp. 39–44. IEEE (2017)

Approach to Detection of Denial-of-Sleep Attacks in Wireless Sensor Networks on the Base of Machine Learning

Anastasia Balueva1,2, Vasily Desnitsky1,2(B), and Igor Ushakov1

1 The Bonch-Bruevich Saint-Petersburg State University of Telecommunications, Bolshevikov, 22, Building 1, St. Petersburg 191124, Russia
tonys @mail.ru, [email protected]
2 St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, 14-th Liniya, 39, St. Petersburg 199178, Russia
[email protected]

Abstract. The paper analyzes possible types of attacks in the area of information security of cyber-physical systems and surveys the available literature in the field. We address the problem of the vulnerability of Internet of Things devices in wireless networks, assuming attacks aimed at depleting energy resources. In the paper we consider several types of traffic and conduct an analysis of normal and attacking traffic. An approach to the detection of energy depletion attacks in cyber-physical systems based on machine learning methods is proposed. The feasibility of the proposed solution is confirmed on an example of attacking traffic by using the Python language and the Anaconda pack.

Keywords: Information security · Energy depletion attacks · Internet of Things · Cyber-physical systems · Machine learning

1 Introduction

Nowadays attacks on wireless sensor networks as a part of the Internet of Things are becoming increasingly important. Energy depletion attacks, which are able to exhaust a device's battery in the shortest time, present a significant threat, as they are quite covert, both for the security monitoring mechanisms and for the device itself. The complete exhaustion of all energy resources leads to a violation of the operability of the device. Energy depletion attacks are usually quite easy to perform; sometimes it is enough for an intruder to have minimal software and hardware facilities, programming skills and experience in working with microcontrollers or other basic telecommunications equipment. Despite the fact that, as mentioned above, such attacks are easy to perform, they can cause great damage and significantly complicate network functionality. If a network has not been protected from this type of attack at all, this can lead to the defeat of the entire network and also complicate the restoration of the workflow. Therefore, it is especially important not


only to consider protecting individual devices from typical well-known vulnerabilities, but also to protect against Denial-of-Sleep attacks and increase network security by various protection methods. The main feature of this type of attack is the difficulty of its detection, since the attacked device is usually affected via the Internet or remotely, using a series of false requests. The difficulty also lies in the fact that not all chains of false requests can be unambiguously identified as attacking, since the discharge of the battery can be associated with formally legitimate actions of the user. In addition, in order to monitor energy depletion attacks, it is advisable to record not just the changes in the process of battery discharge, but also changes in the battery charging speed. Four specific classes of energy depletion attacks are regarded in [1]. The first class is forcing the device to leave a low-power (sleep) mode. Attacks of the second class are performed by increasing the volume of incoming or outgoing traffic. The third class represents the creation of electromagnetic interference on wireless data transmission channels, so that devices are forced to generate a signal of increased power for data transmission. Finally, the last class includes abnormal use of a device's software, multiple application launches on the device, and various violations of embedded software and bypassing of hardware optimizations. In this paper we focus on attacks of the second class. We assume there is an increase of incoming traffic, which leads to a growth of energy consumption and can cause a quick shutdown of a wireless device due to the exhaustion of the battery charge. This paper analyzes existing works in the field and proposes an automated determination of attacking traffic by using machine learning methods. The practical solution and experimental study are performed using the Python language and the Anaconda pack. The novelty of this work lies in the use of machine learning for solving the traffic classification problem in wireless sensor networks by means of specific frequency-based features of the traffic.

2 Related Work

To give a complete picture of Denial-of-Sleep attacks, Kaur et al. proposed preventive measures against some types of Denial-of-Sleep attacks, describing several possible types of these attacks and details of some attack scenarios [2]. The paper also presents some existing solutions for examining Denial-of-Sleep attacks and describes their features, in order to compare their key characteristics and counter this type of attack. Finally, the paper may be used as a reference by researchers when deciding how to secure sensor nodes. Wainis et al. propose a quite simple algorithm that detects Denial-of-Sleep attacks and reduces the damage from them [3]. This study surveyed the classification of medium access control (MAC) protocols according to how nodes organize (or not) access to the shared radio channel. A simulation is also performed, and an assessment of the protection mechanism proposed by the authors is fulfilled. The results show that the enhanced protocol becomes self-immune against Denial-of-Sleep attacks, with a side effect in the form of a throughput reduction during attacks.


In [4] a more efficient solution is proposed in order to solve the problem of a failure in sleep mode by selecting the nodes that will be used in hierarchical clustering. The purpose of that study is to increase the network operation time due to the effective spending and saving of the energy resources consumed during a Denial-of-Sleep attack. A new protocol is proposed that can cope effectively with Denial-of-Sleep attacks by setting a detection mode to find the nodes sending specific packets. In essence, it is targeted at isolating the nodes with lower energy. As a result, the proposed solution shows a more effective outcome compared to the LEACH protocol. Raymond [5] reveals vulnerabilities in modern MAC protocols that are subject to Denial-of-Sleep attacks. Raymond classifies these attacks, taking into account the attacker's knowledge of the MAC layer protocol and the attacker's ability to bypass the installed authentication and encryption protocols. The attacks from each category of the proposed classification are modeled to show their effects on four current varieties of sensor network MAC protocols: S-MAC, T-MAC, B-MAC and G-MAC. The study provides a set of mechanisms designed to detect and mitigate the effects of Denial-of-Sleep attacks on sensor networks. The Clustered Anti-Sleep-Deprivation kit for sensor networks includes a platform-independent anti-replay mechanism, an adaptive speed limiter and a mechanism for detection and removal of noise. These tools are designed for selective or specific application in order to protect against Denial-of-Sleep attacks, depending on the specific vulnerabilities in the MAC protocol used in a particular sensor network. Each of the Denial-of-Sleep attack mitigation techniques was explored using analytical modeling to determine its overhead, and the results show that the techniques are effective in supporting network lifetime under a variety of Denial-of-Sleep attacks. In contrast to the considered works in the field, in this work we focus on the study of frequency characteristics allowing machine-learning-based classification of traffic in wireless sensor networks, with sufficiently high accuracy, into normal traffic and traffic resulting from Denial-of-Sleep attacks.

3 Approach to Attack Detection and Experiments

To obtain traffic samples, a real, physically implemented network consisting of ZigBee modules and microcontrollers was used. All the received traffic was used as the source data, and the issues of obtaining it are beyond the scope of this paper. The traffic analyzed in this work represents series of requests with specific frequency characteristics. The analyzed traffic comes in two modes, namely the HIGH and LOW modes. The LOW mode is characterized by an interval between packets of 5000 ms with a possible deviation of +/− 500 ms, while the HIGH mode is characterized by a 20000 ms gap with a possible deviation of 2000 ms. The message sending interval within a packet in the LOW mode is 500 ms with a possible maximum deviation of 167 ms, while in the HIGH mode it is 2000 ms with a maximum deviation of 666 ms. Also, the LOW mode uses an average of 5 messages per packet with a possible deviation


within +/− 2 messages, while the HIGH mode uses 20 messages with a deviation within +/− 8 messages. Therefore, using both modes of each packet parameter, 8 types of attacking traffic are specified as the possible combinations of the three parameters mentioned. These are denoted as follows: LLL, LLH, LHL, LHH, HLL, HLH, HHL, HHH. As a part of normal traffic, it is assumed that messages are accumulated in batches and are sent at regular intervals in order to optimize transmission between the wakefulness and sleep states of the nodes. The attacking traffic consists of an outgoing message flow with a fixed frequency (i.e., no burst allocation), whose values can be from 40 to 200 messages per minute. Thus, one can single out the key distinguishing feature of normal traffic versus abnormal traffic, i.e. the grouping character of messages in the normal traffic. The feature is expressed in the following formal parameters: the total number of messages (per minute) and the ratio of the maximum interval between messages to the minimum one. The duration of each received traffic sample is 1 min. In total, 80 traffic samples have been used. The prepared Python code for processing traffic uses normalized time (the time of the first message), calculates the number of messages per minute, finds the minimum and maximum pause time between packet transmissions and calculates the needed ratio. The output of the program contains (i) the ratio of the maximum and minimum pauses, (ii) the number of messages per minute and (iii) the type of sample, i.e. either normal traffic or attacking traffic (Fig. 1). The KNN (nearest neighbors) algorithm was chosen for the learning as a quite straightforward one. The parameter k was specified heuristically on the basis of a series of experiments, which consistently led to k = 10. The samples of normal and attacking traffic were divided into two portions, for training and testing respectively. In Fig. 2 the attacking HHL traffic obtained from the specified samples was selected as a test sample; visually it is almost the same as normal traffic. The results show that the program identified the type of traffic correctly, i.e. the traffic was declared as attacking (Fig. 2). In total, according to the experimental results, on average the produced detection mechanism correctly classified about 82% of the samples. Note that the proposed Denial-of-Sleep attack detection model forms particular elements of a combined intruder model in wireless sensor networks, which allows classifying intruders and their actions depending on the possible intruder's actions, resources, initial capabilities, type of access to sensor and wireless interfaces, etc. Further, the proposed detection mechanism can be applied to solve problems of verifying models of a wireless sensor network to check the feasibility of Denial-of-Sleep attacks in a particular wireless sensor network.
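The described processing pipeline can be illustrated with a short sketch: two features are computed per one-minute sample (messages per minute and the max/min inter-message pause ratio) and fed to a KNN classifier with k = 10. This is our reconstruction under stated assumptions; the paper's code is not published, and scikit-learn is only one possible library choice.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def sample_features(timestamps_ms):
    """Features of a 1-minute traffic sample described above: the number of
    messages and the ratio of the maximum to the minimum pause."""
    t = np.sort(np.asarray(timestamps_ms, dtype=float))
    pauses = np.diff(t - t[0])  # normalized time, as in the described code
    return [len(t), pauses.max() / max(pauses.min(), 1.0)]

# Toy training data: bursty (normal) vs fixed-rate (attacking) samples.
rng = np.random.default_rng(0)
normal = [np.concatenate([b * 5000 + np.arange(5) * 500 for b in range(12)])
          for _ in range(20)]
attack = [np.cumsum(rng.uniform(290, 310, size=120)) for _ in range(20)]
X = [sample_features(s) for s in normal + attack]
y = [0] * 20 + [1] * 20  # 0 = normal, 1 = attacking

clf = KNeighborsClassifier(n_neighbors=10).fit(X, y)
test = np.cumsum(np.full(100, 600.0))        # evenly spaced messages
print(clf.predict([sample_features(test)]))  # -> [1], i.e. attacking traffic
```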


Fig. 1. Scheme of processing traffic and experimental results

Fig. 2. Determination of the attacking traffic

4 Conclusion

The development of wireless sensor networks means that sensors are becoming increasingly important in the physical world. Considering the low power of the sensor units used, such sensors are widely used to detect temperature, pollution,


pressure, and other values in various applications. Sensor networks with limited power consumption may be subject to attacks reducing the expected lifetime of the devices and thus making the sensor networks inoperable [6]. The paper is focused on Denial-of-Sleep attacks, provides an example of normal and attacking traffic classification, and also presents a fragment of the implemented tool for detecting attacking traffic in Python with the KNN machine learning method.

Acknowledgement. The work was financially supported by the Russian Foundation of Basic Research, project No. 19-07-00953.

References

1. Desnitsky, V., Chechulin, A., Kotenko, I., Levshun, D., Kolomeec, M.: Combined design technique for secure embedded devices exemplified by a perimeter protection system. In: SPIIRAS Proceedings, no. 48, pp. 5–31 (2016)
2. Kaur, S., Ataullah, Md., Garg, M.: Security from denial of sleep attack in wireless sensor network. Int. J. Comput. Technol. 4(2), 419–425 (2013)
3. Wainis, M., Kabalan, K., Dandeh, R.: Denial of sleep detection and mitigation. Latest Trends Commun. 3(2), 195–200 (2014)
4. Kaur, S., Ataullah, Md.: Securing wireless sensor network from denial of sleep attack by isolating nodes. Int. J. Comput. Appl. Technol. 103(1), 29–33 (2014)
5. Raymond, D.R.: Denial-of-Sleep Vulnerabilities and Defenses in Wireless Sensor Network MAC Protocols. Dissertation, Virginia Polytechnic Institute and State University (2008)
6. Desnitsky, V.A., Kotenko, I.V., Rudavin, N.N.: Protection mechanisms against energy depletion attacks in cyber-physical systems. In: Proceedings of the 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (ElConRus 2019), pp. 214–219 (2019). https://doi.org/10.1109/EIConRus.2019.8656795

Model of Smart Manufacturing System

Maria Usova(B), Sergey Chuprov, Ilya Viksnin, Ruslan Gataullin, Antonina Komarova, and Andrey Iuganson

ITMO University, Saint-Petersburg, Russia
[email protected], {chuprov,avkomarova,a yougunson}@itmo.ru, [email protected], [email protected]

Abstract. In this paper, the authors propose a set-theoretic model of the functioning of a Smart Manufacturing (SM) system. As part of the work, an analysis of existing scientific works in the subject area was carried out. We focus on the description of the model of SM functioning as a cyber-physical system, as a result of which the physical and informational levels of the model functioning are considered. The conditions and tasks for SM functioning are defined, various types of representation of the information space are considered, and the system of access control to elementary messages is described. As one of the components of digital production, a description of the logistic system is proposed and the main parameters of its functioning are determined.

Keywords: Smart Manufacturing · Digital production · Cyber-physical systems · Autonomous cars · Informational space

1 Introduction

The use of information technologies helps to optimize and automate various processes, making our life more convenient and safe. The widespread use of systems combining various physical objects equipped with embedded technologies for interacting with each other and with the external environment for managing processes, eliminating the need for human participation in their actions, has been called the Internet of Things (IoT) paradigm [1]. In building such systems, two types of components are commonly used – informational and physical. The combination of these components and the connections between them are called cyber-physical systems (CPS). The development of the IT industry has had an impact on the production processes in factories. Modern factories strive to integrate their business functions and the technologies being introduced in order to create a single digital production space. The concept of digital production (or smart manufacturing) includes models, methods and tools for organizing the planning and ensuring the execution of factory functions [2]. Problems of implementing the Smart Manufacturing concept are not limited to the organization of the functioning of the system; they also include the


lack of laws regulating their activities, the standardization of systems, the creation of an integrated infrastructure for the industry, and the assessment of the efficiency of resource use and of the functioning of the logistics system [3]. To effectively solve complex problems and automate processes in most areas, groups of interacting robots, which are mobile autonomous robotic systems (MARS), are widely used.

2 Literature Review

The main principles of Industry 4.0 were defined in paper [4]. The main areas are standardization, creating the system as a complex one, creating an infrastructure, safety and security, work organization, and continuing professional development to make the Smart Manufacturing concept flexible enough. The authors also provide the basic terms of Industry 4.0. Intelligent manufacturing has the ability to self-regulate and self-control in order to manufacture the product within the design specifications. The Internet of Things (IoT) is the main part of smart factories, providing interaction between robots, and between robots and sensors. Basic research issues are the creation and modeling of the logistic system and making human-robot interaction safe. In [5] the authors give a review of Smart Factory implementations. They argue that manufacturing in the context of Industry 4.0 gives a wide range of opportunities to modern society. Such factories can rapidly respond to changes in products and customer demands. Scientists are interested in the capability, tools and infrastructure that can provide the best possible options of the product without incidents or interruption. The authors of [3] suggest a model of a Smart Factory based on the concept of CPS that consists of five organizational levels, such as the smart connection level, the cyber level, etc. They also describe interlevel interactions, but the weakness of this model is the absence of a mathematical description of the model's functioning. Another topic of discussion of Smart Manufacturing is the features of information in such a system and the actions we may perform using information. Paper [6] describes the life cycle of information in the system of factories and describes the various types of information involved in the production process. Examples of the development and research of mobile autonomous robotic systems (MARS) are such projects as “MARTHA” [7], DARPA (Defense Advanced Research Projects Agency) [8], and “AMADEUS” [9].

3 Smart Manufacturing System

Smart manufacturing is considered as a combination of two subsystems (1): the subsystem of the factory and the logistic subsystem:

$System_{Manufacturing} = Subsystem_{Factory} \cup Subsystem_{Logistic}$   (1)

3.1 Factory Subsystem

Physically, a subsystem is a set of robots that implement deterministic algorithms that follow predefined scheduling rules. They exchange informational


messages and send “feedback” about the production process. Using systems theory, we consider the factory subsystem to be a closed one, since the interactions of its elements {c_i} are limited to the internal interactions between the agents of the subsystem and the production planning system. The number of robot agents involved in production is constant. The factory subsystem is a set of mathematical objects in the form of the structure < A, I, R, Pr >. The elements of this structure are the sets of objects of this subsystem:

• the set of robot agents A = {(a_1|q_1), (a_2|q_2), ..., (a_n|q_n)} implements the assembly of the product, where q_i is a value characterizing the access level of each agent to the elementary messages of the information space, 0 ≤ q_i ≤ 1. Agents with q = 1 are the most privileged ones. Each i-th agent performs a set of command functions F_i = {f_1, f_2, ..., f_m}. The set {F_i} for each i-th agent forms a uniquely defined production algorithm, as a result of which a certain set of products will be produced;
• the information space I, with the help of which agents communicate with each other;
• the set of resources R = {r_1, r_2, ..., r_s};
• the set of final products of the factory Pr = {pr_1, pr_2, ..., pr_v}.

The determinism of the algorithms follows from the fact that the produced product is the result of a certain function pr_i = f(A_i, F_i, R_i, I, t), which indirectly depends on time and the set of elementary messages of the information space I, and directly depends on the initial set of resources. If Q is the digital production space, then f_pr : Q → Pr.
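As a reading aid, the structure < A, I, R, Pr > can be transcribed into code. The sketch below is our illustrative rendering (the class and field names are ours); the access check anticipates the rule q_i ≥ d_k given in Sect. 3.2.1.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

@dataclass
class Agent:
    name: str
    q: float                   # access level, 0 <= q <= 1
    functions: List[Callable]  # command functions f_1..f_m

@dataclass
class FactorySubsystem:
    agents: List[Agent]                  # A
    information_space: Dict[str, float]  # I: message -> access parameter d
    resources: Set[str]                  # R
    products: Set[str]                   # Pr

    def can_read(self, agent: Agent, message: str) -> bool:
        # Agent a_i gets access to message i_k when q_i >= d_k.
        return agent.q >= self.information_space[message]

factory = FactorySubsystem(
    agents=[Agent("a1", q=1.0, functions=[]), Agent("a2", q=0.4, functions=[])],
    information_space={"assembly_step_7": 0.6},
    resources={"steel"}, products={"frame"})
print(factory.can_read(factory.agents[1], "assembly_step_7"))  # False
```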

3.2 Information Space

The information space, presented in Fig. 1, is the structure < I, D >, where:

• I = {i_1, i_2, ..., i_l} is the set of elementary informational messages;
• D = {d_1, d_2, ..., d_l}, 0 ≤ d_i ≤ 1, is the set of elements that characterize the parameters of access to the i-th informational message.

Fig. 1. Schematic representation of the informational space


3.2.1 The Ways of Presenting the Informational Space

The information space includes subsets of elementary command messages, chains of algorithms for implementing product assembly, customer requests, etc., which can also be represented as separate clusters of information:

$I = I_{function} \cup I_{rules} \cup \ldots \cup I_{client}$   (2)

These clusters group information thematically (2), with $d_{cluster_i} \neq const$, i.e. messages with different access parameters can be placed in the same cluster. On the other hand, the information space can be represented as a set of subsets of information messages grouped by position in space (3):

$I = I_{main} \cup I_{interaction_{a,b}} \cup I_{agents}$   (3)

The I_main cluster is defined as the set of information messages I_main = {({i}d_k)} grouped by the access parameter d; this cluster is common for all robot agents, and agent a_i gets access to message i_k when q_i ≥ d_k. The cluster I_interaction_a,b defines the set of messages transmitted between agents a and b; they are accessible only to these agents, and the access parameter d is defined as d = min(d_a, d_b). The last cluster of information, I_agents, characterizes the set {i_k} of each k-th agent's own messages.

3.2.2 2D and 3D Informational Space

The informational space can also be shown as a two-dimensional space and as a three-dimensional space (Fig. 2). In two-dimensional space, the coordinate axes are the set of elementary messages and the corresponding values of the access parameter defined for the k-th agent. In three-dimensional space, axis a is the axis of the message senders, b is the axis of the recipient agents, and t is the point in time at which the message was transmitted; we introduce the assumption that the time of message transmission tends to zero (t_transmission → 0).

Fig. 2. Schematic representation of the 2D and 3D informational space


The functioning conditions of the production subsystem are:

1. Let t_i be the time interval over which the product assembly algorithm is implemented by one agent, t_pre the time spent on preparing for the production of a product, and t_post the time spent on additional post-production operations with the product; then the total time of production of a unit of goods is minimal (4):

$t_{pre} + \sum_{i=1}^{n} t_i + t_{post} \to t_{min}$   (4)

2. Resource consumption is effective at each stage of production (5):

$\{r_i\} \to |R|_{min}$   (5)

3.3 Logistic Subsystem

The logistic system is a set of autonomous cars running from point A (the physical production location) to point B (the product consumer location). The set of car (vehicle) agents is Cars = {car_1, car_2, ..., car_n}. Each car has a set of parameters p_i = {status_i, location_i, resources_i}, which are determined according to (6):

$status_i = \begin{cases} 0, & \text{if } car_i \text{ is free to complete task } p, \\ 1, & \text{if } car_i \text{ is performing the task} \end{cases}$   (6)

Also, location_i = (x, y) is the current location of car_i, and resources_i are the current resources of car_i.

3.3.1 System Functioning Conditions

1. Delivery time tends to a minimum: t_task → t_min.
2. Minimum resources are used to complete the task: resources_task → resources_min.
3. The agent has enough resources to complete the task: resources_task ≤ resources_i.
4. The agent is free to perform the task: status_i = 0.

To implement the planning mechanism in the proposed logistic system, the authors of this work took as a basis the approach built on the Police Office Model (POM) [10]. The POM method itself does not use hard security mechanisms; it is a soft security mechanism that allows the system to resist harmful destructive informational influences carried out by intruders and to organize the planning system. The Smart Manufacturing system, its subsystems and the interlevel interactions are presented in Fig. 3. We also defined the Factory and Logistic subsystems, which have a common global manufacturing management system and their own planning systems.
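For illustration, condition (6) and conditions 1–4 can be combined into a simple task-assignment check. The sketch below uses names of our own (Car, can_assign, pick_car) and reads conditions 1–2 greedily as choosing the cheapest feasible agent; it is one possible interpretation, not the POM planning mechanism itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Car:
    status: int                    # 0 = free, 1 = performing a task, per (6)
    location: Tuple[float, float]  # current (x, y)
    resources: float

def can_assign(car: Car, task_cost: float) -> bool:
    # Conditions 3 and 4: enough resources and the agent is free.
    return car.status == 0 and car.resources >= task_cost

def pick_car(cars: List[Car], task_cost: float) -> Optional[Car]:
    # Conditions 1 and 2, read greedily: among the feasible cars, take the
    # one with the smallest reserve, so the fleet spends resources sparingly.
    feasible = [c for c in cars if can_assign(c, task_cost)]
    return min(feasible, key=lambda c: c.resources, default=None)

fleet = [Car(0, (0, 0), 50.0), Car(1, (1, 2), 90.0), Car(0, (3, 1), 70.0)]
print(pick_car(fleet, task_cost=60.0))  # -> the free car with 70.0 resources
```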


Fig. 3. Smart Manufacturing system levels

4 Conclusion

The authors concluded that there are no universal mathematical models to describe the functioning of Smart Manufacturing. The main sets of objects of the factory and logistic systems were described, and we introduced the concept of the informational space, which is helpful in describing information features and message transmission. We also described a logistic system based on autonomous cars and Police Office Model approaches. In further studies, it is planned to expand and refine the model, add features of informational messages, describe the parameters affecting manufacturing characteristics, describe the interaction of the production and logistics systems, and develop an approach to studying the effectiveness of the proposed model.

References

1. Ashton, K., et al.: RFID J. 22(7), 97 (2009)
2. Wenzel, S., Jessen, U., Bernhard, J.: Comput. Ind. 56(4), 334 (2005)
3. Lee, J., Bagheri, B., Kao, H.A.: Manuf. Lett. 3, 18 (2015)
4. Thoben, K.D., Wiesner, S., Wuest, T.: Int. J. Autom. Technol. 11(1), 4 (2017)
5. Bryner, M.: Chem. Eng. Prog. 108(10), 4 (2012)
6. Tao, F., Cheng, J., Qi, Q., Zhang, M., Zhang, H., Sui, F.: Int. J. Adv. Manuf. Technol. 94(9–12), 3563 (2018)
7. Alami, R., Fleury, S., Herrb, M., Ingrand, F., Robert, F.: IEEE Robot. Autom. Mag. 5(1), 36 (1998)
8. Rybski, P.E., Burt, I., Dahlin, T., Gini, M., Hougen, D.F., Krantz, D.G., Nageotte, F., Papanikolopoulos, N., Stoeter, S.A.: In: Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No. 01CH37164) (2001)


9. Kamada, T., Oikawa, K.: In: Proceedings. 1998 IEEE International Conference on Robotics and Automation (Cat. No. 98CH36146), vol. 3, pp. 2229–2236. IEEE (1998)
10. Viksnin, I., Chuprov, S., Usova, M., Zakoldaev, D.: In: IOP Conference Series: Materials Science and Engineering, vol. 497, p. 012036. IOP Publishing (2019)

Intelligent Distributed Decision Support Systems

Technology Resolution Criterion of Uncertainty in Intelligent Distributed Decision Support Systems

Alexsander N. Pavlov1(B), Dmitry A. Pavlov1, and Valerii V. Zakharov2

1 Mozhaisky Military AeroSpace Academy, Zhdanovskaya st., 13, 197082 St. Petersburg, Russia
[email protected], [email protected]
2 Russian Academy of Science, Saint Petersburg Institute of Informatics and Automation (SPIIRAS), V.O. 14 line, 39, 199178 St. Petersburg, Russia
[email protected]

Abstract. This study proposes a scientific and methodical approach to the automation and intellectualization of multi-criteria decision-making (DM) processes for the management of complex objects. The substantiation of the composition and structure of the Intelligent Distributed Decision Support System (IDDSS) for the management of complex objects is based on a methodology developed by the authors and on technologies of proactive monitoring and management of the structural dynamics of systems. The essence of the proposed technology of criterion uncertainty resolution is the joint use of the ideas of verbal decision analysis (simple and complex reference situations of the survey) and procedures for reducing qualitative indicator data to quantitative ones, based on the mathematical apparatus of the theory of fuzzy sets, relations and measures, and the theory of experiment planning.

Keywords: Multi-criteria decision-making · Fuzzy-production approach · Experiment planning · Fuzzy measures

1 Introduction

At present, the central role in ensuring the necessary quality of control of Complex Objects (CO) of natural and artificial origin belongs to Intelligent Distributed Decision Support Systems (IDDSS), whose kernel is Special Mathematical Software (SMS) for decision making [1–5]. An IDDSS is required for informational, methodical and instrumental support of the decision-preparation and decision-making processes in all stages of management. The purpose of introducing an IDDSS is to improve the efficiency and quality of functional and management activities through the use of advanced intelligent information technologies and the formation of complex analytical information and knowledge, allowing us to develop and make informed decisions in a dynamically changing environment.


To achieve this goal, the following tasks have to be resolved: (1) creation of a single space of features and indicators characterizing the state of CO management, based on a centralized repository of data, information and knowledge with accumulation, storage, access, and manipulation; (2) integration of existing local databases as part of this centralized information repository; (3) collection, accumulation and application of expert knowledge in distributed bases to form conclusions and recommendations; (4) continuous monitoring (complex analysis) of the current situation; (5) forecasting of the developing situation on the basis of complex proactive modelling; (6) increasing the efficiency and quality of management decisions based on the use of analytical and forecasting tools; (7) automation of the preparation of analytical reports; (8) data visualization with the use of cognitive graphics; (9) expert instrumental and informational support of analytical activities.

During the design phase of an IDDSS it is important to quantitatively and qualitatively analyze the effectiveness of the relevant control technology. At the same time, management will be based on the concept of proactive monitoring and management of the structural dynamics of complex objects (CO). Proactive monitoring and management of the structural dynamics of a CO comprises the processes of assessment, analysis and control of the current multi-structural macro-states of those objects, as well as the formation and implementation of control actions that ensure the transition from the current to a synthesized multi-structural macro-state [1–5]. Proactive management of a CO, in contrast to the reactive control traditionally used in practice, is based upon a rapid response and subsequent prevention of incidents through the creation of innovative predictive and proactive capabilities in the formulation and implementation of control actions in the existing monitoring and management system. Proactive management of a CO also relies on the methodology and technology of complex modelling, which involves a multi-domain description of the object under study and the combined use of methods, algorithms and techniques of multi-criteria evaluation, analysis and selection of preferential solutions. In this paper we start by discussing the scientific and methodical nature of the proposed approach to solving the problems of automation and intellectualization of multi-criteria decision-making processes for CO management.

2 The Proposed Technology to Support Multi-criteria Decision Making in the Management of Complex Objects

The most important step when estimating the effectiveness of CO management is the stage of constructing the generalized indicator of quality control. At present, a wide variety of methods for multi-criteria evaluation of the efficiency of a CO have been developed [1–5]. It is known [11,19] that the basis of multi-criteria selection methods is the specification of the problem by bringing in additional qualitative and quantitative information about the properties of the criterion functions, about the alternatives, about the optimality principles, etc. The main source of additional information when searching for the best alternatives are the experts who know


a given subject area, and the decision-makers, who pursue particular objectives in order to achieve which the problem in question is solved. To date, a wide variety of methods for solving problems of multi-criteria choice have been developed [6,10–14,19]. Various principles and features can be offered for the classification of these methods. For example, in [14] it is proposed to distinguish between the classes of a priori, a posteriori, and adaptive methods and models of multi-criteria optimization. Among the a priori methods, one can single out methods of constructing a convolution of indicators (scalarization); here we differentiate between heuristic and axiomatic convolutions. Another group of a priori methods for solving multi-criteria problems is based on building the resulting preference relations. We differentiate between the Pareto, lexicographic and majority resulting preference relations; the first two are in turn divided into classic, interval and threshold ones, and the majority relation into non-interval and interval resulting preference relations. Any alternative from a finite set of non-dominated solutions can be the best under the relevant decision-maker's preferences. However, lexicographic optimization and Pareto dominance are characterized by a minimum volume of information about the decision-maker's preferences, which entails the use of incomplete selection criteria and the construction of only a partial order on the set of feasible alternatives [6]. To overcome these shortcomings, one should increase the amount of information about the decision-maker's preferences. It should be noted that the development of normative decision-making methods needs to take into account the possibilities and limitations of human information processing. Normative methods for solving multi-criteria tasks are divided, by the types of information collected and used by the decision-maker, into [15,16]:

– methods based on quantitative measurements, but using several indicators when comparing alternatives [14,19];
– methods based on qualitative measurements without any transition to quantitative variables [9,11].

When choosing specific methods of multi-criteria evaluation [11,14] it is advisable to comply with the following requirements:

Requirement 1. Completeness and acyclicity (transitivity) of the relationship on the set of multi-criterion alternatives.
Requirement 2. The methods of decision-making must provide for verifying the consistency of the information obtained from decision-makers and experts, and must have low sensitivity to human error.
Requirement 3. Any assumptions about the kind of decision rule must be mathematically and psychologically justified.
Requirement 4. Decision-making methods must use only such means of obtaining information from decision-makers and experts as correspond to the capabilities of the human information processing system.

The analysis in [11] of how the four groups of normative methods match the first two requirements has shown that these methods, in general, cannot simultaneously ensure the completeness of comparisons of alternatives, provide a linear order, be rational, and be insensitive to human measurement errors leading to non-transitive relations on the set of compared solutions. In this paper, the problem of multi-criteria decision making is considered in situations where the alternatives are not known or only partially known at the time of


decision-making, and can also appear during the decision-making process. When the alternatives are evaluated linguistically (verbally) by the specified performance indicators, such a task is characterized as unstructured. There are two ways to deal with such poorly structured tasks. The first way is to describe the qualitative indicators by quantitative indicators constructed in a special way (ratings, fuzzy numbers, linguistic variables). It is believed that the use of relatively new mathematical techniques such as the theory of fuzzy sets, relations and measures, and fuzzy integration allows one to effectively formalize and solve semi-structured tasks. The second way is to use methods of verbal (ordinal) decision analysis (ZAPROS I, II, III, and a few others) [11], which are based on a unified scale of quality changes on the set of values of all criteria and the application of so-called anchors (a utopian or ideal solution, and the opposite solution). It is believed that any procedure converting qualitative information to quantitative information is incorrect and relies on quantitative results without reason. The methodology of multi-criteria decision-making proposed here is based on a production model of the decision-maker's preferences for describing simple and complex reference situations of the survey, on processing the knowledge data by methods of the theory of fuzzy measures [15,16,20] and the theory of experiment planning [17,18], and on the verification of the decision-maker's statements for consistency.

3 The Technique of Multi-criteria Decision-Making

Let a set of alternatives be evaluated by a set of performance indicators F = {F_1, F_2, ..., F_m}, each of which represents a linguistic variable. For example, the linguistic variable F_i = “Payback Period” can take values from the set of simple and compound terms T(F_i) = {“low”, “below average”, “average”, “above average”, “high”}. For a qualitative interpretation of the resulting index we will use the linguistic variable “Effective Solutions”, which can take the values T(F_res) = {“bad”, “below average”, “average”, “above average”, “good”}. In the most general form, the knowledge of decision-makers on the relationship between the private performance indicators F = {F_1, F_2, ..., F_m} and the resulting indicator F_res can be represented by production models of the form:

Fig. 1. The terms of the linguistic variable on the scale [−1, +1]


P_j: “IF F_1 = A_1j and ... and F_m = A_mj, THEN F_res = A^j_res”, where A_ij ∈ T(F_i) and A^j_res ∈ T(F_res) are the terms of the respective linguistic variables. As a common scale for all values of the indicators, the bipolar scale [−1, 0, +1] is used, and the terms can be set using fuzzy numbers of (L-R) type (Fig. 1). In accordance with the method for solving the problem of multi-criteria evaluation proposed in [17,18], the extreme (“minimum” and “maximum”) values of the linguistic variable F_i are labeled “−1” and “+1” on the scale, and to build the resulting indicator, in accordance with the provisions of the theory of experiment planning, an orthogonal plan of the expert survey is formed, whose elements are the extreme labeled values of the private performance indicators {F_1, F_2, ..., F_m}. An example of the orthogonal plan of the expert survey for three private performance indicators is presented in Table 1. In Table 1 the values of the terms of the linguistic variable F_res of the resulting performance indicator can be represented by triangular fuzzy numbers (Fig. 2). Then, for example, the second row of the table presents the following expert judgment: “If the indicator F_1 is “high”, the indicator F_2 is “low”, and the indicator F_3 is “low”, then the resulting indicator F_res is estimated as “below average””. The production rule itself is generally seen as a reference situation during the expert survey.

Table 1. The orthogonal plan of the expert survey

−1 −1 −1 1

1

1

−1

A1res

1

1

1

−1 1

−1 −1

−1

1

1

A2res

1

−1

1

1

1

−1 1

A3res

−1

−1

−1

A4res

1

−1 −1 1

1

1

1

−1

−1

1

A5res

−1

1

−1

−1

A6res

1

−1 1

1

1

1

−1

−1

1

−1

A7res

1

1

1

1

1

A8res

µ13

µ23

µ123

−1 −1 −1 1

−1 1 1

µ0 µ1 µ2 µ3 µ12

Fig. 2. The scale of the resulting indicator


The calculation of the coefficients of the resulting index

Fres = µ0 + Σ_{i=1}^{m} µi Fi + Σ_{i=1}^{m} Σ_{j=1, j≠i}^{m} µij Fi Fj + · · · + µ12...m F1 F2 . . . Fm,

which take into account the effect of individual partial indicators and the joint impact of sets of two, three, and more indicators, is carried out according to the rules adopted in experiment planning theory. For this purpose, the averaged scalar product of the corresponding column of the orthogonal matrix (Table 1) with the vector of values of the resulting performance indicator is calculated. For example, the coefficient µ2 is calculated as follows:

µ2 = (−A1res − A2res + A3res + A4res − A5res − A6res + A7res + A8res)/8.
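A compact way to see this computation is the sketch below: it builds the 2^m orthogonal plan and recovers every coefficient of the model as the averaged scalar product of an effect column with the vector of (defuzzified) expert evaluations. The function names and the numeric values of y are illustrative; only the computation rule comes from the text.

import itertools
import numpy as np

def orthogonal_plan(m):
    """Full 2^m factorial plan with levels -1/+1; F1 varies fastest, as in Table 1."""
    rows = [levels[::-1] for levels in itertools.product((-1, 1), repeat=m)]
    return np.array(rows)

def model_coefficients(plan, y):
    """Coefficients mu as averaged scalar products of effect columns with y."""
    n_rows, m = plan.shape
    coeffs = {"mu0": float(np.mean(y))}
    for k in range(1, m + 1):
        for comb in itertools.combinations(range(m), k):
            col = np.prod(plan[:, comb], axis=1)       # effect column, e.g. F1F3
            name = "mu" + "".join(str(i + 1) for i in comb)
            coeffs[name] = float(col @ y) / n_rows
    return coeffs

# y holds the defuzzified expert evaluations A1res..A8res (hypothetical numbers):
plan = orthogonal_plan(3)
y = np.array([-0.66, -0.33, 0.0, 0.33, -0.33, 0.0, 0.33, 1.0])
print(model_coefficients(plan, y)["mu2"])  # same as the formula for mu2 above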

Thus, the proposed technique of multi-criteria decision making consists of the following steps.
Step 1. Form the linguistic scales for each of the partial indicators and for the resulting index of decision effectiveness, and transfer the individual values to the scale [−1, +1].
Step 2. Construct the orthogonal plan of the expert survey and carry out the survey itself (answers to the questions posed by the production rules).
Step 3. Build the resulting indicator of decision effectiveness.
Among these steps, the most important and demanding is step 2, which is associated with obtaining the expert's answers to the questions contained in the production rules. On the one hand, when the number of partial indicators grows beyond 4, the number of questions asked exceeds 16, which usually leads to inconsistencies in the expert's statements owing to the peculiarities of human thinking. These peculiarities are reflected in the patterns derived by Miller, the essence of which is that short-term human memory cannot retain and reproduce more than 7 ± 2 elements. On the other hand, according to the Ellsberg paradox, a person (expert) does not reason additively, which requires that his answers be evaluated using non-additive (fuzzy) measures [15,16,20]. To resolve these difficulties, it is proposed that the expert survey in step 2 proceed as follows. Suppose the partial indicators Fi (i = 1, . . . , m) that evaluate the effectiveness of decisions are given. To conduct the expert survey in step 2, it is required to draw up 2^m production rules of the type Pj: “IF F1 = A1j and F2 = A2j and . . . and Fm = Amj, THEN Fres = Ajres”, where Aij ∈ {−1Fi, +1Fi} is the “low” or “high” value of Fi, and Ajres ∈ T(Fres) is a term of the linguistic variable of the resulting performance indicator. Rules in which all performance indicators except one take “low” values we call simple rules of the expert survey, or simple reference situations. The number of such situations corresponds to the number of partial indicators of efficiency.


We assume that the rules P1, P2, . . . , Pm are the simple ones in which the corresponding indicators F1, F2, . . . , Fm take “high” values. A complex (compound) rule (an extended reference situation) can be described by means of simple reference situations in the following way. A rule Pj: “IF F1 = A1j and F2 = A2j and . . . and Fm = Amj, THEN Fres = Ajres”, in which the indicators with the indices {i1, i2, . . . , ik} ⊆ {1, 2, . . . , m} take high values, can be written as Pj = Pi1 ∪ Pi2 ∪ · · · ∪ Pik. The evaluation of the resulting indicator Aires in the simple rules we denote gi = E(ai, αi, βi) = E(Aires), i = 1, . . . , m, where E(•) is the defuzzification operation for a triangular fuzzy number (for example, E(ai, αi, βi) = ai + (βi − αi)/3). It is proposed to calculate the ratings of the resulting index in complex reference situations by building a constructive parameter, the Sugeno fuzzy measure [15,16,20], on the finite set of simple reference situations Pi, i ∈ Γ = {1, 2, . . . , m}, where the gi serve as the densities of the fuzzy measure. The Sugeno measure reflects the assessment of the resulting indicator in a complex rule Pj = Pi1 ∪ Pi2 ∪ · · · ∪ Pik and is computed as follows:

Gλ(Pj = Pi1 ∪ Pi2 ∪ · · · ∪ Pik) = [Π_{l=1}^{k} (1 + λ gil) − 1]/λ.

To construct the λ-fuzzy Sugeno measure characterizing the rating of the resulting indicator in a complex rule, it is generally required to find the root λ∗ in the interval (−1, +∞) of the following polynomial equation of order (m − 1) [10,11,15]:

[Π_{i=1}^{m} (1 + λ gi) − 1]/λ = 1,   −1 < λ < ∞.

It should be noted that a theorem is proved in [11] stating that this polynomial has exactly one root in the interval (−1, ∞). The obtained estimates of the complex rules are used to check the consistency of the DM's statements. For example, if, in reply to a complex rule Pj, the evaluation of the resulting indicator equals Ajres and the relative deviation of this value from Gλ∗(Pj) exceeds a specified error threshold 0 ≤ γ ≤ 1, i.e. |Gλ∗(Pj) − E(Ajres)|/Gλ∗(Pj) > γ, it is considered that the expert gave an inconsistent answer. The identified contradictions are presented to the DM for analysis and resolution.
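As a sketch of this procedure, the code below finds the root λ∗ by bracketed root search, evaluates the Sugeno measure of a complex rule, and applies the consistency test. It assumes densities 0 < gi < 1 and uses SciPy's brentq; all function names and the numeric inputs are illustrative.

from functools import reduce
from scipy.optimize import brentq

def defuzz(a, alpha, beta):
    """E(a, alpha, beta) = a + (beta - alpha)/3 for a triangular fuzzy number."""
    return a + (beta - alpha) / 3.0

def sugeno_lambda(g, tol=1e-12):
    """Root lambda* in (-1, +inf) of prod(1 + lam*g_i) = 1 + lam, lam != 0."""
    f = lambda lam: reduce(lambda p, gi: p * (1.0 + lam * gi), g, 1.0) - (1.0 + lam)
    s = sum(g)
    if abs(s - 1.0) < 1e-9:
        return 0.0                    # additive case
    if s > 1.0:                       # the unique root lies in (-1, 0)
        return brentq(f, -1.0 + 1e-9, -tol)
    hi = 1.0                          # sum(g) < 1: the root is positive
    while f(hi) < 0.0:
        hi *= 2.0
    return brentq(f, tol, hi)

def sugeno_measure(lam, g_subset):
    """G_lambda of the union of simple situations with densities g_subset."""
    if abs(lam) < 1e-12:
        return sum(g_subset)
    prod = 1.0
    for gi in g_subset:
        prod *= 1.0 + lam * gi
    return (prod - 1.0) / lam

def consistent(lam, g_subset, expert_value, gamma):
    """Check |G(Pj) - E(Ajres)| / G(Pj) <= gamma for the DM's answer."""
    g_val = sugeno_measure(lam, g_subset)
    return abs(g_val - expert_value) / g_val <= gamma

g = [0.3, 0.25, 0.2]                  # hypothetical densities of simple rules
lam = sugeno_lambda(g)
print(consistent(lam, [g[0], g[2]], expert_value=0.55, gamma=0.1))

Here expert_value would come from defuzz applied to the triangular term chosen by the expert for the complex rule.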

4 Conclusions

The advantages of using the proposed intelligent support technology are the following: it combines quantitative and qualitative (fuzzy) information about the management effectiveness of the complex object (CO), which significantly improves the quality of decisions and conclusions; it formalizes expert information provided in natural language by introducing linguistic variables, which allow an adequate representation of approximate verbal descriptions of objects and phenomena even in cases where no deterministic description exists; and the proposed method, combining the theory of experiment design with a fuzzy-logic linguistic description of statements, makes it possible to formalize, at a constructive level, the experience of an expert (or expert group) in the form of predictive models in a multidimensional space of linguistic performance indicators. The proposed method minimizes the number of calls to the experts and takes into account the complex non-linear nature of the influence of the efficiency indicators on the resulting CO indicator.

Acknowledgement. The research described in this paper is partially supported by the Russian Foundation for Basic Research (grants 16-29-09482-fi-i, 17-08-00797, 17-06-00108, 17-01-00139, 17-20-01214, 17-29-07073-fi-i, 18-07-01272, 18-08-01505, 19-08-00989), state order of the Ministry of Education and Science of the Russian Federation 2.3135.2017/4.6, state research 0073-2019-0004, and the International project ERASMUS+, Capacity building in higher education, 73751-EPP-1-2016-1-DE-EPPKA2-CBHE-JP, “Innovative teaching and learning strategies in open modelling and simulation environment for student-centered engineering education”.

References

1. Okhtilev, M.Y., Sokolov, B.V.: Problems of the development and using of automation monitoring systems of complex technical objects. SPIIRAS Proc. 1, 167–180 (2002). http://proceedings.spiiras.nw.ru/ojs/index.php/sp/article/view/1085
2. Okhtilev, M.Y., Sokolov, B.V., Yusupov, R.M.: Intellekualnye technologii monitoringa i upravleniya stukturnoj dinamikoj slozhnych technicheskih obektov, p. 410. Nauka (2006). (in Russian)
3. Pavlov, A.N.: Integrated modelling of the structural and functional reconfiguration of complex objects. SPIIRAS Proc. 28, 143–168 (2013)
4. Mikoni, S.V., Sokolov, B.V., Yusupov, R.M.: Kvalimetriya modelej i polimodelnych kompleksov: monografija, p. 314. RAN (2018). (in Russian)
5. Sokolov, B.V., Ivanov, D.A., Pavlov, A.N.: Optimal distribution (re)planning in a centralized multi-stage supply network in the presence of the ripple effect. Eur. J. Oper. Res. 237(2), 758–770 (2014)
6. Mikoni, S.V.: Teorija prinjatija upravlencheskih reshenij. Uchebnoe posobie [Theory of administrative decision-making. Tutorial]. SPb.: Lan', p. 448 (2015). (in Russian)
7. Mattila, V., Virtanen, K.: Ranking and selection for multiple performance measures using incomplete preference information. Eur. J. Oper. Res. 242(2), 568–579 (2015)
8. Korhonen, P.J., Silvennoinen, K., Wallenius, J., Öörni, A.: Can a linear value function explain choices? An experimental study. Eur. J. Oper. Res. 219(2), 360–367 (2012)
9. Petrovskij, A.B., Rojzenzon, G.V., Tihonov, I.P., Balyshev, A.V.: Retrospektivnyj analiz rezul'tativnosti nauchnyh proektov [A retrospective analysis of the performance of research projects]. Int. J. Inf. Models Anal. 1(4), 349–356 (2012). (in Russian)
10. Podinovski, V.V.: Decision making under uncertainty with unknown utility function and rank-ordered probabilities. Eur. J. Oper. Res. 239(2), 537–541 (2014)
11. Larichev, O.I.: Verbal'nyj analiz reshenij [Verbal decision analysis], p. 181. Nauka (2006). (in Russian)
12. Mikoni, S.V.: System analysis of multi-criteria optimization methods on a finite set of alternatives. Tr. SPIIRAN - SPIIRAS Proc. 4(41), 180–199 (2015). (in Russian)
13. Mikoni, S.V.: Axioms of multicriteria optimization methods on a finite set of alternatives. Tr. SPIIRAN - SPIIRAS Proc. 1(44), 198–214 (2016). (in Russian)


14. Sokolov, B.V., Moskvin, B.V., Pavlov, A.N., et al.: Voennaja sistemotehnika i sistemnyj analiz. Modeli i metody prinjatija reshenij v slozhnyh organizacionno-tehnicheskih kompleksah v uslovijah neopredeljonnosti i mnogokriterial'nosti: uchebnik [Military systems engineering and systems analysis. Models and methods of decision-making in complex technical-organizational systems under uncertainty and multiple criteria] / Pod red. B.V. Sokolova. SPb.: VIKKU imeni A.F. Mozhajskogo, p. 496 (1999). (in Russian)
15. Nechetkie mnozhestva v modeljah upravlenija i iskusstvennogo intellekta [Fuzzy sets in management models and artificial intelligence] / Pod red. D.A. Pospelova, p. 312. Nauka (1986). (in Russian)
16. Pavlov, A.N., Sokolov, B.V.: Prinjatie reshenij v uslovijah nechetkoj informacii: ucheb. posobie [Decision-making in conditions of fuzzy information: tutorial]. SPb.: GUAP, p. 72 (2006). (in Russian)
17. Zelentsov, V.A., Pavlov, A.N.: Multi-criteria analysis of the influence of individual elements on the performance of complex systems. Informacionno-upravljajushhie sistemy - Inf. Control Syst. 6(49), 7–12 (2010). (in Russian)
18. Pavlov, A., Sokolov, B., Pashchenko, A., Shalyto, A., Maklakov, G.: Models and methods for multicriteria situational flexible reassignment of control functions in man-machine systems. In: Proceedings of the 2016 IEEE 8th International Conference on Intelligent Systems, pp. 402–408 (2016)
19. Nogin, V.D.: Prinjatie reshenij v mnogokriterial'noj srede: kolichestvennyj podhod [Decision making in a multicriteria environment: a quantitative approach], p. 176. FIZMATLIT (2005). (in Russian)
20. Pyt'ev, Ju.P.: Vozmozhnost' kak al'ternativa verojatnosti. Matematicheskie i jempiricheskie osnovy, primenenie [Possibility as an alternative to probability. Mathematical and empirical foundations, application], p. 464. FIZMATLIT (2007). (in Russian)

Satellite Constellation Control Based on Inter-Satellite Information Interaction

Oleg Karsaev(1) and Evgeniy Minakov(2)

(1) St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences (SPIIRAS), 14 Line 39, St. Petersburg, Russia, [email protected]
(2) Military-Space Academy, Zhdanovskaya str. 13, St. Petersburg, Russia, [email protected]

Abstract. The paper proposes an approach to the development of a control system for a low-orbit satellite constellation that performs Earth remote sensing requests and uses information interactions between the satellites. The interaction is performed under the conditions of delay- and disruption-tolerant networks: message transmission is based on Delay- and Disruption-Tolerant Networking (DTN) technology, and routing is based on the Contact Graph Routing (CGR) approach. A description is given of the developed algorithm for controlling the outgoing message queue in the network nodes. The control system relies on distributed and autonomous scheduling of requests that uses an auxiliary data structure calculated on the Earth. This data structure is also used as the basis for the corresponding information interaction scheme. Simulation modeling is used to study and demonstrate the capabilities of the proposed solutions.

Keywords: Satellite constellation · Distributed autonomous scheduling · Inter-satellite information interaction

1 Introduction

The use of small satellite constellations instead of a single heavy satellite opens new possibilities for performing space missions, and there is a great deal of information about this in the popular science literature. However, turning this prospect into reality requires solving a large number of new problems, one of which is the development of a control system for the satellite constellation. The fundamentally new tasks in this case are the information interaction between satellites and the redistribution of target operations between satellites depending on their current state. In this regard, it is necessary to note the EDSN project (Edison Demonstration of Smallsat Networks), which already attempts to make practical use of inter-satellite communication. The project considers a constellation of eight satellites. Information interactions between the satellites are performed according to an a priori computed schedule in the form of a sequence of cycles.


In each cycle, one of the satellites plays the role of Captain, while the rest play the role of Lieutenant. The task of the Captain is to collect the results of observations from all satellites and transmit them to Earth. A more detailed description of this project, as well as of the set of tasks to be studied, can be found in [1,2].

The Contract Net Protocol is often used as the basis of information interaction in various subject areas. A variant of using a modified version of this protocol for organizing information interaction in a group of satellites can be found in [3]. These works assume that the transmission of messages over the communication network between any two satellites is almost always possible, because in the cases under consideration the satellites are constantly in relative proximity to each other. However, such an orbital formation of the constellation is a very special case and is not of particular interest for space missions, in particular for performing Earth remote sensing requests. In the general case, communication takes place in the conditions of Delay Tolerant Networks (DTNs). An exception is the use of very large satellite constellations, which are considered in such projects as OneWeb [4] and Sphere [5]; these projects consider constellations consisting of 640 and 700 satellites, respectively.

The organization of communication in DTNs is the subject of research and development within the Advanced Exploration Systems program carried out by NASA since 2002 [6]. One result of this program is the data transmission technology that received the corresponding name DTN (Delay- and Disruption-Tolerant Networking) [7]. Routing in this technology is based on the CGR (Contact Graph Routing) approach [8]. The objective of the program is the development of key technologies for missions in deep space. However, the same technologies were later considered as the basis of a control system for a low-orbit satellite constellation [9,10], and they are also considered in this work.

Many years' experience shows that the traditionally employed model of ground planning of target task execution uses on-board satellite resources with low efficiency. Therefore, the transition to autonomous planning has been actively studied and developed for two decades, since in this case the planning system has up-to-date information both on the state of satellite resources and on the performance of observation tasks. For many reasons, approaches are considered in which the on-board planning system uses the results of ground calculations [11–15]. The use of a satellite constellation brings fundamentally new opportunities for the execution of space missions, but at the same time raises new algorithmic tasks associated with the planning of operations and constellation control. In particular, autonomous planning can, first of all, use the possibility of redistributing observations between satellites depending on the current actual state of their resources.

The paper discusses an approach to the development of a control system for a low-orbit satellite constellation in which the information interaction between the satellites is performed in a DTN network. The problems of using the CGR approach in the case of low-orbit constellations and their solutions are considered in the first part of the article. The second part describes the proposed scheme of information interaction, which implements the possibilities of autonomous scheduling of requests. The third part describes the simulation model that is used to study the proposed solutions. The description of the experimental studies is given in the fourth part.

2 Message Routing

DTN technology is based on the following basic principle. If a node cannot transmit a data package, the package is not deleted but stored in the node. Transmission attempts continue until the node connects to the same or a different node and sends it the data. Therefore, the information in any case reaches the recipient node. In order to use the bandwidth of communication channels more efficiently, the transmitted data can be fragmented into several smaller bundles. The assembly of bundles into the original message can occur in intermediate and/or end nodes of the data transfer route.

Message routing is based on the CGR approach. This approach, in turn, is based on a contact plan: a plan for establishing satellite-to-satellite and satellite-to-Earth communication channels. Based on the contact plan, a contact graph is computed, the vertices of which are not the network nodes but the contacts between them. The arcs of the graph in this case correspond to the processes of data storage in the network nodes until subsequent contacts are established. This graph allows the use of conventional routing algorithms, for example, Dijkstra's algorithm. The plan and the contact graph are calculated on the Ground and transmitted to all network nodes for autonomous routing. A detailed description of the CGR approach, as well as the actual problems of its development, can be found in [9].

The immediate result of routing is the determination of a contact for the further transmission of each message. However, owing to the specifics of DTNs, a queue of messages to be transferred can arise in the network nodes. Therefore, along with routing, the task of controlling the outgoing message queue in the network nodes should be considered. Control refers to the distribution of messages over the scheduled contacts, taking into account their bandwidth capabilities and the volume of data transmitted in the messages. Fragmentation of messages into many smaller bundles can be considered for efficient use of the communication channels. This problem can be formalized as follows. At the current time t∗ the node has a list of messages to be transmitted, i = 1(1)N, and a list of scheduled contacts, j = 1(1)K. Each message i is described by the following basic properties:

– Node(i): the node where the message should be delivered,
– Volume(i): data volume,
– Route(i): the route of message transmission,
– Contact(i): the contact within which the transmission of message i is scheduled, that is, the first contact in the route Route(i).

Each contact j is described by the following basic properties:

– Capacity(j): the volume of data that can be transferred within the contact,
– Completion(j): the end time of the contact,


– List(j): the list of messages whose transmission is scheduled within the contact, i.e. the messages i that have Contact(i) = j, and
– Volume(j): the volume of data scheduled to be transmitted within the contact.

The result of solving the task is the calculation of the plan for the transmission of outgoing messages, i.e. the calculation of the lists List(j), taking into account the following restriction for each contact j:

Volume(j) ≤ Capacity(j)   (1)

The following requirements are the objective criteria for calculating the message transmission plan:

– the earliest possible time of delivering messages to the end nodes, and
– the most complete use of the contact bandwidth, i.e. for each j, minimizing the difference Capacity(j) − Volume(j).

The task is solved in dynamic mode. The plan for sending outgoing messages should be updated whenever one of the following events occurs: the appearance of a new message, the establishment or non-establishment of a contact, the completion of a contact, or the failure to transmit a message. The pseudocode of the algorithm for solving the task is shown in Listing 1. The algorithm refines the existing message transmission plan upon the occurrence of the listed events. This means that in the initial state there are lists List(j) calculated earlier, and messages for which the transmission paths are not yet defined. For these messages the route is calculated based on the assumption that their transmission can start at the current time t∗ (lines 1–8). This updates the message lists and the volumes of data scheduled for the relevant contacts (lines 5–7). Then, for each contact, the order of the messages in the queue is refined and restriction (1) is checked (lines 8, 9). If this restriction is met for all contacts, the task is solved. Otherwise, messages from the contacts that violate this restriction are redistributed to other contacts. The redistribution is performed iteratively. At each iteration, the contact j with the earliest end time is selected (line 11). The last messages are sequentially selected and deleted from the message queue of the selected contact until restriction (1) is met (line 12). Deleting a selected message has one of two consequences: restriction (1) is still not met, or it is met. In the first case the message is rerouted with an additional condition: it cannot be transmitted within contact j (line 17). In the second case, the message is fragmented (line 19). Fragmentation involves splitting the data of the message into two bundles and creating two new messages, respectively. It is assumed that one of these messages will be transmitted along the route of the original message; therefore, it remains scheduled for transmission within the same contact j as the original message. The volume of data in this message is determined in such a way as to make full use of the remaining bandwidth of contact j. The second message is rerouted with the additional condition that it cannot be sent within contact j.


Listing 1. Algorithm of outgoing message queue control

1  t := t∗
2  foreach i with Route(i) = ∅
3  begin
4    Routing(i, t)
5    k := Contact(i)
6    i → List(k)
7    Calculate(Volume(i))
8  end
9  foreach j Queue(List(j))
10 if forall j Volume(j) ≤ Capacity(j) Stop
11 Select j with Volume(j) > Capacity(j) and min(Completion(j))
12 While Volume(j) > Capacity(j) do
13 begin
14   Select last i in List(j)
15   if Volume(j) − Volume(i) > Capacity(j)
16   then
17     Routing(i, Completion(j))
18   else
19     Fragmentation(i)
20 end

The order of transmitting messages within each contact (line 9) is determined by their type. Service messages are sent first; messages of information and command interaction second; downloads of observation results third. Service messages are the ones used to establish scheduled contacts and to confirm the delivery of transmitted messages. The transmission order of messages of the first and second types is not significant, since the volume of data in these messages is insignificant. The order of downloading observation results is determined by a comparative analysis of the request requirements, in particular, the required time for delivering the observation results to the Earth and the importance of the requests.
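For illustration, the following Python sketch mirrors the logic of Listing 1 in a runnable form. The data layout (dictionaries with 'capacity', 'completion', 'list', 'volume' fields) and the route() callback are assumptions made for the example, not structures defined in the paper.

def fragment(msg, contact):
    """Split msg so that one part exactly fills the contact's remaining
    capacity (line 19); the remainder must be rerouted by the caller."""
    keep = contact["capacity"] - contact["volume"]
    part = dict(msg, volume=keep)
    contact["list"].append(part)
    contact["volume"] += keep
    return dict(msg, volume=msg["volume"] - keep, contact=None)

def control_queue(messages, contacts, route, now):
    """Sketch of Listing 1. contacts: id -> {'capacity', 'completion',
    'list', 'volume'}; route(msg, t, exclude=None) must place msg on some
    contact other than `exclude` and update that contact's list and volume."""
    for msg in [m for m in messages if m.get("contact") is None]:  # lines 1-8
        route(msg, now)
    while True:
        overloaded = [(cid, c) for cid, c in contacts.items()
                      if c["volume"] > c["capacity"]]              # line 10
        if not overloaded:
            return
        cid, c = min(overloaded, key=lambda x: x[1]["completion"]) # line 11
        while c["volume"] > c["capacity"]:                         # lines 12-20
            msg = c["list"].pop()                                  # line 14
            c["volume"] -= msg["volume"]
            if c["volume"] > c["capacity"]:
                route(msg, c["completion"], exclude=cid)           # line 17
            else:
                rest = fragment(msg, c)                            # line 19
                route(rest, c["completion"], exclude=cid)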

3 Autonomous Scheduling

In the case of a satellite constellation, autonomous scheduling can use information interaction and the redistribution of observations between satellites. A constructive basis for this is the data structure (2), describing the time-ordered list of opportunities to capture an image of each observation area Target:

Opportunities(Target) = {TW(Sat)}   (2)

An opportunity is understood as a time interval TW when the area Target is in the visibility field of some satellite Sat. These lists are calculated on the Ground. They become part of the requests' descriptions and are used to determine the satellites to which the corresponding requests are initially transmitted.

After the transmission, an iterative distributed process of scheduling observations is triggered. At each iteration of the process, a satellite performs autonomous scheduling of the observation of the area specified in the description of the received request. One of the following three results can be obtained:

a. the observation is scheduled,
b. the observation is not scheduled,
c. the observation is scheduled by reassigning the observation of another area, belonging to another request with a lower priority, to another satellite.

In case 'a' the scheduling process ends. In cases 'b' and 'c' the scheduling process continues and, based on the list of opportunities, the task of selecting the next satellite to schedule the request is solved. In case 'c' the scheduling process is also considered for the reassigned request. Thus, the proposed scheme of autonomous scheduling includes two main tasks: the task of local autonomous scheduling of an observation and the task of selecting a satellite to continue scheduling at the next iteration.

3.1 Local Autonomous Scheduling of an Observation

The content of this task is as follows. The satellite has a current schedule of observations, which is admissible, i.e. satisfies the following restrictions:

i. minimum time intervals between successive observations are kept, and
ii. the schedule does not entail violations of the limits of recoverable resources (battery charge and free memory) over the planning horizon.

It is necessary to evaluate the possibility of including a new observation O in the current schedule. Here one of the three estimates described above can be obtained: 'a', 'b' or 'c'. The solution of the task consists of the following steps. In the first step, the new observation is added to the plan, taking into account the corresponding time window TW(Sat) from the list of opportunities and without changing the order of the previously scheduled observations. In the second and third steps, compliance with restrictions 'i' and 'ii' is checked, respectively. The result of checking each restriction can be formally represented as follows:

R(x) = a/b/c(L),   (3)

where x is restriction 'i' or 'ii'; a means compliance with restriction x; b means violation of restriction x; and c(L) means compliance with the restriction provided that one of the observations listed in the list L is removed from the schedule. When checking restriction 'i', the list L can include the observations immediately preceding and following the new observation O that have a lower priority than O. When checking restriction 'ii', the list L can include all observations that have a lower priority than the new observation O.
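A minimal sketch of the check of restriction 'i' in the form (3) is given below. The flat representation of the schedule (parallel lists of start times and priorities) and the single-conflict simplification are assumptions of the example.

import bisect

def check_gap_constraint(times, priorities, new_start, new_prio, min_gap):
    """Return ('a', None), ('b', None) or ('c', L) for restriction 'i'.

    times: sorted start times of scheduled observations;
    priorities: priorities aligned with times."""
    pos = bisect.bisect(times, new_start)
    conflicts = []                      # indices of neighbours violating the gap
    if pos > 0 and new_start - times[pos - 1] < min_gap:
        conflicts.append(pos - 1)
    if pos < len(times) and times[pos] - new_start < min_gap:
        conflicts.append(pos)
    if not conflicts:
        return ("a", None)
    if len(conflicts) == 1 and priorities[conflicts[0]] < new_prio:
        return ("c", conflicts)         # removing this one observation restores 'i'
    return ("b", None)

print(check_gap_constraint([0.0, 10.0], [1, 1], 9.0, 2, min_gap=3.0))
# -> ('c', [1]): the observation at t=10 blocks the gap but has lower priority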


The result of autonomous scheduling of a new observation O can, by analogy with (3), be formally presented as R(O) = a/b/c(O∗), where O∗ denotes the observation reassigned to another satellite. Using the introduced notation, the logical rules for the final decision can be presented as follows:

R(O) = a  in case of  R(i) = a & R(ii) = a;
R(O) = b  in case of  R(i) = b;
R(O) = c(O∗)  in case of  R(i) = a & R(ii) = c(L), with any O∗ ∈ L, or
              R(i) = c(L1) & R(ii) = c(L2), with some O∗ such that O∗ ∈ L1 & O∗ ∈ L2.
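These rules can be combined mechanically, as in the sketch below; the ('tag', payload) encoding of the check results is an assumption of the example, and combinations not listed above default to 'b' here.

def combine(r_i, r_ii):
    """Final decision R(O) from the checks R(i) and R(ii),
    each given as ('a', None), ('b', None) or ('c', L)."""
    tag_i, L1 = r_i
    tag_ii, L2 = r_ii
    if tag_i == "b":
        return ("b", None)
    if tag_i == "a" and tag_ii == "a":
        return ("a", None)
    if tag_i == "a" and tag_ii == "c":
        return ("c", L2[0])                       # any O* from L
    if tag_i == "c" and tag_ii == "c":
        common = [o for o in L1 if o in L2]
        if common:
            return ("c", common[0])               # O* belonging to both lists
    return ("b", None)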

3.2 Selecting the Next Satellite for Observation Scheduling

The task is to select the earliest opportunity TW(Sat) from the list of opportunities, and accordingly the satellite Sat, to which the request can be delivered in time. The delivery time is computed by the routing service of the network layer. Delivery of the request to satellite Sat is timely if the following condition is met:

td(Sat) < ts(Sat),   (4)

where td(Sat) is the delivery time and ts(Sat) is the left border of the time window TW(Sat). From all satellites to which the transmission of the request is timely, the satellite with the earliest time td(Sat) is selected. During the transmission of messages, situations may occur in which some of the scheduled contacts are not established. This changes the delivery time of messages to the end recipients and can entail a violation of condition (4). If such a situation is detected in an intermediate node of the request transmission route, the task of choosing the next satellite is solved again; the result of this decision is a change of the final destination in the transmitted message.
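The selection rule of condition (4) amounts to a simple filter-and-minimize step, as the sketch below shows; the Opportunity record and the delivery_time() callback standing in for the network-layer routing service are assumptions of the example.

from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Opportunity:
    sat: str         # satellite identifier
    t_start: float   # ts(Sat): left border of the window TW(Sat)
    t_end: float

def select_next_satellite(opportunities: List[Opportunity],
                          delivery_time: Callable[[str], float]) -> Optional[str]:
    """Earliest timely delivery per condition (4): td(Sat) < ts(Sat)."""
    timely = [(delivery_time(o.sat), o.sat) for o in opportunities
              if delivery_time(o.sat) < o.t_start]
    return min(timely)[1] if timely else None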

4 Simulation Model

To perform experimental studies and demonstrate the capabilities of the proposed solutions, a simulation model is used. It allows simulating the formation of a network in the satellite constellation, the information interaction of satellites in the network, and the autonomous scheduling of observations in accordance with the described approaches. The model is developed on the basis of the agent approach. The operation of the satellites and ground stations, as well as of the physical devices of the satellites, in particular the transmitter, receiver, surveillance devices, and rechargeable batteries, is described in the form of behavior scenarios of the corresponding agents. The architecture of the model is shown in Fig. 1; it consists of two types of components, planning modules and agents.


The planning modules provide the scheduling of operations to be performed by the satellite in response to incoming requests. In particular, the Dispatcher routes messages and determines the order in which outgoing messages are sent. The Memory module and the Battery module model, respectively, the state of the memory and the state of the battery at the current time and over the planning horizon. The Scheduler, based on interaction with the schedulers of other satellites and with the Memory and Battery modules, provides distributed scheduling of observations.

Fig. 1. Satellite simulation model

The Satellite agent simulates the state of the satellite, which may be out of operation for some time. At scheduled times, the agent turns on the receiving and transmitting devices and sends contact requests to the agents of the respective satellites. A contact is established if both satellites are in working state. The Transmitter agent simulates the sending of outgoing messages according to the queue defined by the Dispatcher. Simulation of message transmission includes determining the time of the transmission, taking into account the data volume of the transmitted message and the data rate in the communication channel. A message is transmitted when the battery charge level is sufficient; the charge level is simulated by the Battery agent. The Receiver agent analyzes the possibility of receiving incoming messages based on a check of the available memory. If the recipient of the message is the satellite itself, the message is forwarded to the Scheduler; if it is another satellite, the message is forwarded to the Dispatcher. The Sensor agent simulates the execution of the observations scheduled by the Scheduler. An observation is performed if the battery level is sufficient and the required volume of memory is available. The Battery agent updates the current charge level after all operations and simulates the charging of the battery when the satellite is on the solar side of the orbit.

5 Results of Experimental Studies

The purpose of the initial experiments was to obtain preliminary estimates of the effectiveness of request execution using the proposed approach to distributed autonomous planning and information interaction between satellites. The effectiveness was measured by the time interval between the appearance of a request and the delivery of the observation results to the Earth. Constellations consisting of 16 and 8 satellites were considered in the experiments; the orbital formation of the constellations is borrowed from [9]. The results of the experiments are shown in Fig. 2 (left side: 16 satellites, right side: 8 satellites).

(Bar charts; vertical axis: request execution time, 2 to 20; categories: Ground scheduling and Autonomous scheduling variants B1, B2, B3.)

Fig. 2. Evaluation of request execution effectiveness

For comparison, the experiments also calculated effectiveness estimates for the case where the information interaction between the satellites is not used and ground scheduling is employed instead of autonomous scheduling. A list of 20 randomly selected observation areas was used as the initial data. In the process of simulation, the average, maximum, and minimum execution times of the requests were calculated. The simulation of the network in the experiments took into account the characteristics of the transmitting and receiving equipment installed on the satellites:

– the range of the transmission signal (d), and
– the number of communication channels (n) that this equipment can support simultaneously

(B1: d = 1000 km, n = 1; B2: d = 10000 km, n = 1; B3: d = 10000 km, n = 2). In the case of eight satellites in variant B1, there are pairs of nodes between which data routes never exist. In this case, the organization of information exchange is impossible, and therefore autonomous scheduling is also impossible. The results of the initial experimental studies show that, for the considered satellite constellations, the request execution time using the proposed autonomous scheduling scheme is 3–5 times shorter than with ground scheduling.

6 Conclusions

The paper proposes an approach to the development of a control system for low-orbit satellite constellations performing Earth remote sensing requests. The control system is based on communication between the satellites, the capabilities of which are determined by the limitations of the DTN network. For such conditions, a description is given of the proposed approach to the implementation of autonomous scheduling of observation operations and of the corresponding scheme of information interaction between satellites. The results of the experimental studies show that the use of the proposed solutions can provide a multiple increase in the efficiency of request execution.


The objectives of the further development of the proposed approach at the next stages are to assess the performance of the satellite constellation and to develop a technology for performing complex observation requests. Performance here refers to the number of requests performed by the satellite constellation over given periods. In this direction, a particular but crucial issue is the estimation of the bandwidth of the network for delivering the observation results to the Ground. Complex observation requests suggest a scenario in which several interrelated observations are performed by surveillance devices of various types.

Acknowledgments. The work is supported by RFBR (grant no. 18-01-00840).

References

1. Hanson, J., Sanchez, H., Oyadomari, K.: The EDSN intersatellite communications architecture. In: Proceedings of the AIAA/USU Conference on Small Satellites, SSC14-WS1 (2014)
2. Chartres, J., Sanchez, H., Hanson, J.: EDSN development lessons learned. In: Proceedings of the AIAA/USU Conference on Small Satellites, SSC14-VI-7 (2014)
3. van der Horst, J., Noble, J.: Task allocation in networks of satellites with Keplerian dynamics. Acta Futura 5, 143–151 (2012)
4. OneWeb. https://www.oneweb.world. Accessed 6 Apr 2019
5. Sfera. https://ru.wikipedia.org/wiki/%D0%A1%D1%84%D0%B5%D1%80%D0%B0_(%D1%81%D0%BF%D1%83%D1%82%D0%BD%D0%B8%D0%BA%D0%BE%D0%B2%D0%B0%D1%8F_%D1%81%D0%B8%D1%81%D1%82%D0%B5%D0%BC%D0%B0_%D1%81%D0%B2%D1%8F%D0%B7%D0%B8). Accessed 06 Apr 2019. (in Russian)
6. Advanced Exploration Systems. https://www.nasa.gov/directorates/heo/aes/index.html. Accessed 06 Apr 2019
7. Caini, C.: Delay-tolerant networks (DTNs) for satellite communications. In: Rodrigues, J. (ed.) Advances in Delay-Tolerant Networks (DTNs), pp. 25–47. Woodhead Publishing, Oxford (2015)
8. Araniti, G., et al.: Contact graph routing in DTN space networks: overview, enhancements and performance. IEEE Commun. Mag. 53, 38–46 (2015)
9. Fraire, J.A., et al.: Assessing contact graph routing performance and reliability in distributed satellite constellations. J. Comput. Netw. Commun. 2017, 18 (2017). Article ID 2830542
10. Caini, C., Firrincieli, R.: Application of contact graph routing to LEO satellite DTN communications. In: IEEE International Conference on Communications, pp. 3301–3305 (2012)
11. Maillard, A., et al.: Ground and board decision-making on data downloads. In: Proceedings of the 25th International Conference on Automated Planning and Scheduling, pp. 273–281 (2015)
12. Lenzen, C., et al.: Onboard planning and scheduling autonomy within the FireBird mission. In: Proceedings of the 14th International Conference on Space Operations, AIAA 2014, p. 1759 (2014)
13. Kennedy, A., et al.: Automated resource-constrained science planning for the MiRaTA mission. In: Proceedings of the AIAA/USU Conference on Small Satellites, SSC15-6-37 (2015)


14. Herz, E., et al.: Onboard autonomous planning system. In: Proceedings of the 14th International Conference on Space Operations, AIAA 2014, p. 1783 (2014)
15. Li, J., Chi, Y.: Planning and scheduling of an agile EOS combining on-ground and on-board decisions. In: IOP Conference Series: Materials Science and Engineering (2018). https://doi.org/10.1088/1757-899X/382/3/032023. Accessed 06 Apr 2019

Load Balancing Cloud Computing with Web-Interface Using Multi-channel Queuing Systems with Warming up and Cooling

Maad M. Khalill(1), Anatoly D. Khomonenko(2), and Sergey I. Gindin(2)

(1) College of Pure Science, University of Diyala, Baqubah, Diyala, Iraq, [email protected]
(2) Emperor Alexander I St. Petersburg State Transport University, Moskovski prospekt, 9, Saint Petersburg, Russia, [email protected], [email protected]

Abstract. A model based on queuing theory for a cloud computing system with a Web interface that uses pre-processing and post-processing is presented. The model is analyzed with different system parameters, such as the arrival rate, service rate, service time, and number of servers. The results are examined, described, and shown in a number of charts and figures, where the effect of warming-up and cooling is clearly visible, as well as how an effective model helps to balance the load on the servers.

Keywords: Queuing theory · Cloud computing · Warming up · Cooling · Web-interface · Load balancing

1 Introduction

Every day more devices of different shapes, sizes, and purposes are connecting to the internet; the world is thus moving from the internet of computers to the internet of things. Since there is a big demand for devices to get smaller while remaining capable of fast connectivity, cloud computing becomes the obvious solution for providing large storage and powerful processing as an anywhere, anytime, on-demand service. Cloud computing is basically a number of servers connected in an arrangement that is most efficient and effective at providing maximum computing power and QoS (see Fig. 1). But in some cases, when resources in the cloud are in high demand and the performance of the cloud hardware is not sufficient, bottlenecks can affect the QoS. System architecture planning and network services depend heavily on the prediction of the demand for the requested services. To find optimal solutions for serving requests in a short time, queuing theory provides a number of probabilistic solutions and measures that can help in making decisions to reach a stable and load-balanced system.


In this paper, a cloud computing model with pre-processing (warming-up) and post-processing (cooling) is investigated, calculated, and tested. The calculations are based on queuing theory and extend its applications by studying multi-channel systems with “warming-up” and “cooling” and phase-type approximation of Markovian and non-Markovian processes. Transition diagrams and matrices for the microstates of a sample application model with a Web interface are described, and a scheme for computing the stationary probability distributions of the number of requests and of the waiting time is developed.

Fig. 1. Cloud computing system with warming-up and cooling

2 Existing Models

In queuing theory, Kendall's notation is the standard system used to describe and classify a queuing node. It uses three factors written A/S/n, where A denotes the distribution of the time between arrivals to the queue, S the service time distribution, and n the number of servers at the node. It has been extended to A/S/n/K/N/D, where K and D denote the capacity of the queue and the queuing discipline, and N denotes the size of the population of jobs to be served. The best studied QS is M/M/n, with n channels, in which the service time distribution and the distribution of the time between arrivals are exponential. Because of the assumptions made, such models, known as Markov models, have limited applicability and do not fit most practical systems.


The most examined are the relatively simple M/M/n models. The simplest example is the M/M/1 queue, for which textbooks on performance evaluation usually present the results for computing the steady-state distribution of the number of requests. A well-studied class of one-channel models with specific flow characteristics, discussed e.g. by Ryzhikov in [1] or Eremin in [2], analyzes the behavior of QS with a determined delay in starting the service. The greatest interest has recently been focused on investigations of multichannel non-Markovian queues in which the flows are approximated by phase-type distributions; for example, Bubnov in [3] and Danilov in [4] use such models to forecast software reliability characteristics, such as the number of corrected errors, the required debugging time, etc. Brandwajn and Begin in [5] propose a semi-numerical approach to compute the steady-state probability distribution of the number of requests at arbitrary and at arrival time instants in Ph/M/c-like systems. Cox showed in [6] that an arbitrary distribution of a random variable can be represented by a compound of exponential stages, i.e. a phase-type distribution. The advantage of such a representation is that it ensures the convenience of approximating the random process by a Markov process and gives the power to create and solve the system of equations describing the behavior of the corresponding model.
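For reference, the steady-state result mentioned for M/M/1 is a one-liner; the sketch below computes the textbook distribution P{N = k} = (1 − ρ)ρ^k (the function name is illustrative).

def mm1_queue_length_pmf(lam, mu, k_max=20):
    """Steady-state P{N = k} = (1 - rho) * rho**k for M/M/1, rho = lam/mu < 1."""
    rho = lam / mu
    assert rho < 1.0, "stability requires lam < mu"
    return [(1.0 - rho) * rho ** k for k in range(k_max + 1)]

print(sum(k * p for k, p in enumerate(mm1_queue_length_pmf(1.0, 1.8, 200))))
# mean number in system: rho / (1 - rho) = 1.25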

2.1 Models with “warming-up”

Multi-channel non-Markovian QS with warming-up require a more complex mathematical description than the Markov models; e.g. the request flow can be recurrent or represented by an arbitrary stochastic function. Examples of earlier works addressing QS with warming-up are those by Kolahi [7] and by Kreinin [8] on the characteristics of single-channel QS. Mao and Humphrey [9] examine the influence of warming-up during virtual machine start-up in a cloud system. More examples of early works on the subject of cloud performance can be found in [2] and [10]. Fairly recent works of several authors investigate the evaluation of the performance of cloud and other systems on the basis of models of multi-channel queuing systems with warming-up [11,12].

2.2 Models with “cooling”

To study cloud systems with the described “cooling”, it is useful to introduce an enhanced notation A/S/C/n, which, compared to the original Kendall notation, contains an additional C denoting the cooling time distribution. The study in [13] demonstrates the influence of cooling patterns on performance and shows the need to collect data and examine cooling patterns to ensure that the system capabilities are appropriate for the significantly different levels and patterns of demand that might be relevant during a given time period. More examples of studying the influence of cooling can be found in [14,15].

3 Multi-channel QS with Warming-up and Cooling

The state of the system is defined by the two-dimensional vector (n, j), where n is the number of processing servers and j is the number of requests in the system. Based on the distribution functions of the random variables involved in the formation of the model, we determine that a two-dimensional Markov chain describes the system under study. Figure 2 shows the transitions of the M/E2/M/E2/n type multi-channel queuing system. Here:

J = 0: all channels are free; there are zero requests in the system.
J = 1: one channel is occupied, the remaining channels are free; one request is in the system.
J = 2: two channels are occupied, the remaining channels are free; two requests are in the system.
J = n: n channels are occupied, zero channels are free; n requests are in the system, zero requests wait in the queue.
J = n + 1: n channels are occupied, zero channels are free; n + 1 requests are in the system, one request waits in the queue.
J = n + m: n channels are occupied, zero channels are free; n + m requests are in the system, m requests wait in the queue.

The corresponding matrix of the M/E2/M/E2/n type QS with two-phase warming-up and cooling is given in Fig. 3. The calculation of the time response characteristics is based on the approach described by Khomonenko in [16]. Unlike the multi-channel non-Markovian QS considered in [17], in systems with “cooling” (and “warming-up”) it is necessary to take into account the time spent on transitions between microstates of one tier (with fixed j). We consider the technique of accounting for these time expenses when calculating the distribution of the waiting time of a request in the queue of a QS with “cooling”.

The waiting time of a newly arrived request is determined by the system microstate right after its arrival. For each tier of the diagram we introduce a row vector πj = [πj,1, πj,2, . . . , πj,h], j = 1, R, of the final distribution of the probabilities of the system microstates right after the arrival of the next request. Since the input flow is not Poisson, the conditions of the PASTA theorem (Poisson Arrivals See Time Averages) [17] are not satisfied, and the final distribution πj does not match the stationary distribution γj. The components of the vector πj represent the relative numbers of request arrivals upon which the system passed into the corresponding microstates:

πj = γj−1 Aj−1 / Σ_{i=0}^{R} γi Ai 1i+1,   (1)

where 1i is the unit column vector of size hi × 1.

Load Balancing Cloud Computing

389

Fig. 2. Diagram showing transitions of M/E2/M/E2/n type multi-channel queuing system

We will define Bn+1 (S) as a matrix of the conditional LSC of exponential distributions of lengths of intervals before transition of QS from the microstates (J + n, i) , i = 1, hn+1 on completion of service increased by system transition probability in one of (J + n − 1, l) , l = 1, hn+1 microstates. The matrix has the dimension hn+1 × hn+1 , its elements are calculated according to: bn+1,i,l , i, l = 1, hn+1 bn+1,i,l (s) = hn+1 r=1 bn+1,i,r + s

(2)

390

M. M. Khalill et al.

Fig. 3. Diagram showing transitions of M/E2/M/E2/n type multi-channel queuing system

We will similarly define a matrix of Cn+1 (s) LSC of distributions of duration of transitions between QS microstates on one tier (in case of the fixed number of requests in system), caused by the phases “cooling” (“warming-up”). The matrix of C (s) has dimension hn+1 × hn+1 , its elements are calculated according to: Cn+1,i,l Cn+1,i,l (s) = hn+1 , i, l = 1, hn+1 r=1 Cn+1,i,r + s

(3)

We will call the k-request the request right after which arrival the system appears in a microstate (k + n, i) , j = 1, 2, . . . , i = 1, hn+1 (in queue there is exactly k of requests). Waiting time in queue of the k-request represents the amount of durations of k of advances of queue on completion of service plus “cooling” time if the system is in the appropriate mode. Based on the accepted designations, LSC of waiting time of the k-request: Wk (s) = πk+n Bnk (s)

r 

i Cn+1 (s) ,

(4)

i=0

where r —number of sequential phases of “warming-up” & “cooling”. We will explain a physical sense of a formula (4). Depending on in what microstate there was a system with arrival of the k-request, time of its waiting will make i = 1, r

Load Balancing Cloud Computing

391

of the stages “cooling” plus k of completions of service. As distribution of the amounts of random variables represents the multiplication of their LSC, the formula takes the form (4). Then expression for LSC of distribution of waiting time of the request in queue taking into account “cooling” is: W (s) =

R  k=1

πk+n Bnk (s)

r  i=0

i Cn+1 (s) =

R 

πk+n Wk (s)

(5)

k=1

Notes. 1. In [18] it is proved that formula (5) also holds for multi-channel non-Markovian QS with “warming-up” and “cooling”. 2. Knowing the Laplace-Stieltjes transform of the waiting-time distribution of a request in the queue makes it possible, by numerically differentiating this transform at the point s = 0, to calculate the initial moments of the required distribution, from which an approximation of the distribution function can be constructed.
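Note 2 can be made concrete in a few lines: the k-th initial moment equals (−1)^k times the k-th derivative of the transform at s = 0, which the sketch below estimates with forward finite differences so that W is only evaluated at s ≥ 0. The closing example uses the classical M/M/1 waiting-time transform as a stand-in, since the paper's W(s) from (5) is built from matrices.

from math import comb

def lst_moments(W, k_max=2, h=1e-4):
    """Initial moments E[T^k] = (-1)^k * d^k W / ds^k at s = 0,
    via forward finite differences of order k on the transform W(s)."""
    moments = []
    for k in range(1, k_max + 1):
        deriv = sum((-1) ** (k - j) * comb(k, j) * W(j * h)
                    for j in range(k + 1)) / h ** k
        moments.append((-1) ** k * deriv)
    return moments

# Sanity check with the M/M/1 waiting-time transform (lam = 1, mu = 1.8):
W = lambda s: (1 - 1 / 1.8) * (s + 1.8) / (s + 0.8)
print(lst_moments(W, k_max=1))  # ~[0.694] = lam / (mu * (mu - lam))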

4 Results

A simulation was carried out for a system with a constant input flow to check how the waiting time in the queue Wq and the average time of a request in the system (including the service time) W change when the warming-up and cooling intensities are varied. The initial modeling parameters were as follows: γ1 = 1, γ2 = 3, μ = 1.8, with μw and μc ranging from 1 to 5. The results of the calculation are shown in Figs. 4 and 5.

Fig. 4. The effect of cooling with different intensities and different numbers of processing servers


Fig. 5. The effect of warming up and cooling with different intensities and different numbers of processing servers

5 Conclusion

The study demonstrates the effect of warming up and cooling on the overall waiting time for a response to a request in a cloud computing system and shows how different intensities give different results. The study also shows how adding processing servers leads to a more stable and balanced cloud service with better QoS under overload and intense demand. Further research is recommended on extending the analysis to the hyperexponential distribution case with complex coefficients. In [19], a review of existing load balancing algorithms in cloud computing is given, and an autonomous Agent Based Load Balancing Algorithm (A2LB), which provides dynamic load balancing for the cloud environment, is proposed.

Acknowledgements. The work was partially supported by the grant of the MES RK: project No. AP05133699 “Research and development of innovative information and telecommunication technologies using modern cyber-technical means for the city's intelligent transport system”.

References

1. Ryzhikov, Y.I.: Distribution of the number of requests in a queuing system with warm-up. Probl. Peredachi Inf. 9(1), 88–97 (1973)
2. Eremin, A.S.: A queuing system with determined delay in starting the service. Intellect. Technol. Transp. 4, 23–26 (2015)
3. Bubnov, V.P., Tyrva, A.V., Khomonenko, A.D.: Model of reliability of the software with Coxian distribution of length of intervals between the moments of detection of errors. In: Proceedings of the 34th Annual IEEE Computer Software and Applications Conference (COMPSAC 2010), Seoul, Korea, 19–23 July 2010, pp. 238–243 (2010)


4. Danilov, A.I., Khomonenko, A.D., Danilov, A.A.: Dynamic software testing models. In: 2015 XVIII International Conference on Soft Computing and Measurements (SCM 2015), pp. 72–74 (2015). https://doi.org/10.1109/SCM.2015.7190414
5. Brandwajn, A., Begin, T.: A recurrent solution of Ph/M/c/N-like and Ph/M/c-like queues. J. Appl. Probab. 49(1), 84–99 (2012)
6. Cox, D.R.: A use of complex probabilities in the theory of stochastic processes. In: Proceedings of the Cambridge Philosophical Society, vol. 51, no. 2, pp. 313–319 (1955)
7. Kolahi, S.S.: Simulation model, warm-up period, and simulation length of cellular systems. In: Second International Conference on Intelligent Systems, Modeling and Simulation (ISMS), pp. 375–379 (2011)
8. Kreinin, Y.: Single-channel queuing system with warm up. Autom. Remote Control 41(6), 771–776 (1980)
9. Mao, M., Humphrey, M.: A performance study on the VM startup time in the cloud. In: IEEE 5th International Conference on Cloud Computing (CLOUD), pp. 423–430. IEEE Press (2012)
10. Takahashi, Y.: A numerical method for the steady-state probabilities of a GI/G/c queuing system in a general class. J. Oper. Res. Soc. Jpn. 19(2), 147–157 (1976)
11. Khomonenko, A.D., Gindin, S.I.: Stochastic models for cloud computing performance evaluation. In: Proceedings of the 10th Central and Eastern European Software Engineering Conference in Russia, Article No. 20. ACM, New York, NY, USA. Moscow, Russian Federation, 23–25 October 2014. http://dl.acm.org/citation.cfm?id=2687233
12. Khomonenko, A., Gindin, S.: Performance evaluation of cloud computing accounting for expenses on information security. In: 2016 18th Conference of Open Innovations Association and Seminar on Information Security and Protection of Information Technology (FRUCT-ISPIT), 18–22 April 2016, pp. 100–105 (2016)
13. Khalil, M.M., Andruk, A.A.: Testing of software for calculating a multichannel queuing system with “cooling” and E2-approximation. Intellect. Technol. Transp. 4, 22–27 (2016)
14. Khomonenko, A.D., Khalil, M.M., Gindin, S.I.: A cloud computing model using multi-channel queuing system with control. In: 2016 XIX International Conference on Soft Computing and Measurements (SCM), 25–27 May 2016, St. Petersburg, vol. 1, sec. 2, pp. 247–251 (2016)
15. Lokhvitskii, V.A., Ulanov, A.V.: Numerical analysis of queuing systems with hyperexponential “cooling”. Vestnik Tomskogo gosudarstvennogo universiteta. Upravlenie, vychislitelnaya tekhnika i informatika - Tomsk State Univ. J. Control Comput. Sci. 4(37), 36–43 (2016)
16. Khomonenko, A.D.: The distribution of the waiting time in queuing systems of the type GIq/Hk/n/R. Autom. Remote Control 518, 91–98 (1990)
17. Ryzhikov, Y.I.: Teorija ocheredej i upravlenie zapasami [Queuing theory and inventory management]. SPb.: Piter, p. 384 (2001). (in Russian)
18. Khomonenko, A.D., Lokhvitskiy, V.A., Khalil, M.M.: Calculation of waiting time distribution in multi-channel non-Markovian queuing systems with “cooling” and “heat-up”. H&ES Res. 9(4), 88–94 (2017)
19. Singha, A., Junejab, A., Malhotraa, M.: Autonomous agent based load balancing algorithm in cloud computing. In: International Conference on Advanced Computing Technologies and Applications (ICACTA-2015). Procedia Comput. Sci. 45, 832–841 (2015)

Application of Cyber-Physical System and Real-Time Control Construction Algorithm in Supply Chain Management Problem

Inna Trofimova(1), Boris Sokolov(2,3,4), Dmitry Nazarov(2), Semyon Potryasaev(2), Andrey Musaev(5), and Vladimir Kalinin(6)

(1) St. Petersburg State University, Universitetskaya Emb., 7/9, St. Petersburg 199034, Russia, [email protected]
(2) St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, 14-th Linia, VI, 39, St. Petersburg 199178, Russia, [email protected], [email protected], [email protected]
(3) Petersburg State University of Aerospace Instrumentation, Bolshaya Morskaya Str., 67A, St. Petersburg 190000, Russia
(4) Petersburg Marine State University, Lotsmanskaya, 3, St. Petersburg 190121, Russia
(5) SPIK SZMA, 26 Line, 15, St. Petersburg 199026, Russia, [email protected]
(6) Military-Space Academy named after A.F. Mozhaisky, 13 Gdanovskaya Str., St. Petersburg 197198, Russia, [email protected]

Abstract. Various classes of cyber-physical systems are the basis of digital and computer-integrated production and of the digital economy as a whole. They include measuring, telecommunication and control subsystems with embedded software in various types of hierarchical structures. In the present paper, supply chains (SC) are considered as highly dynamic systems. To improve the efficiency of incoming information about the current state of the SC and the timeliness of the data, we suggest using cyber-physical systems. We consider the SC execution problem in the case of unforeseen events, which is one of the important problems in SC management. The advantage of this approach is that cyber-physical systems have been applied, and the classical methods of optimal program and position control have been transferred and modified for supply chain management.

Keywords: Cyber-physical system · Optimal program control · Position control · Supply chain


1 Introduction

Cyber-physical systems (CPS) can be defined as an orchestration of computers and physical systems [12], or as automatic control systems that combine various objects by means of multichannel (wired and wireless) measuring systems with embedded software in various types of hierarchical structures [11,18,20]. Applications of cyber-physical systems thus include automotive systems, manufacturing, medical devices, military systems, assisted living, traffic control, process control, and physical security (access control and monitoring). Furthermore, various classes of CPS are used in digital and computer-integrated production and in the digital economy. For example, on the basis of CPS, the technology of controlled self-organization of traffic can be implemented, the operation of production equipment can be coordinated, the supply chain scheduling can be optimized, etc.

In the present paper, supply chains (SC) are considered as highly dynamic systems, inasmuch as companies strive to work with a large variety of products, to achieve high-quality and reliable supply under environmental standards, to take into account the fast appearance of new products in order to increase competitiveness, and to react on time when unforeseen events happen. We suggest using CPS to improve the efficiency of incoming information about the current state of the SC and the timeliness of the data, and to improve SC operation depending on this information. We propose to consider the SC execution problem in the case of unforeseen events [4,16] together with the problem of planning measuring and computing operations, which allows timely obtaining the necessary information about the current state of the system (for example, inventory levels, storage temperature of products, volume of supplies, etc.).

Different mathematical models for describing SC dynamic behavior have been introduced in recent times and used for the optimization of SC operation and the application of control theory methods [8,9,13,14]. The foundation of the complex modeling approach considered in this paper is a control-theoretic description of SCs as controllable dynamic systems. This study is based on the SC scheduling model [6] and algorithm in terms of optimal program control (OPC) [6,10]. To improve the efficiency of incoming information about the current state of the system and the timeliness of the data, we suggest considering the model [6,19] together with a dynamic model of measuring and computing operations control. The advantage of this approach is that CPS have been applied, and the classical methods of optimal program and position control have been transferred and modified for supply chain management.

2 Cyber-Physical System and Multiple-Model Description of Supply Chain

In pursuance of [6,10], the control-theoretic description of an SC is as follows. The SC is considered as a complex technical object that is described through differential equations based on a dynamic interpretation of the job execution. The job execution is characterized by execution results (e.g., volume, time, etc.), capacity consumption of the resources, and complex technical object flows resulting from the delivery to the customer. Along [6], let us consider a job control model (M1) and a flow control model (M2). Together with models M1 and M2, let us consider a measuring and computing operations control model (M3). In line with [1], we propose the following scheduling procedure: M1 is used to assign jobs to suppliers, M2 is used to schedule the processing of assigned orders subject to capacity restrictions of the production and transportation resources, and M3 is used to optimize the process of receiving the necessary information about the current state of the SC. The basic interaction of these three models comprises two stages. At the first stage, the job-shop scheduling problem for the models M1, M2 and M3 is considered. After determining the control variables in the model M1, at the second stage they are used in the constraints of the models M2 and M3, and the SC schedule execution problem is solved together with the measuring and computing operations control problem.

2.1 A Dynamic Model of Job Control (Model M1)

The dynamics of the job execution are described as

$$\dot{x}_{i\mu}^{(o)} = \sum_{j=1}^{n} \varepsilon_{ij}(t)\, u_{i\mu j}^{(o)}, \qquad \dot{x}_{j}^{(o)} = \sum_{i=1}^{n} \sum_{\eta=1,\, \eta \neq i}^{n} \sum_{\mu=1}^{s_i} \sum_{\rho=1}^{p_i} u_{i\mu j}^{(o)}. \quad (1)$$

Here $x_{i\mu}^{(o)}$ is the job state variable, which indicates the relation to jobs (orders); $x_{j}^{(o)}$ characterizes the total employment time of the $j$-th supplier; and $\varepsilon_{ij}(t)$ is an element of the preset matrix time function of time-spatial constraints. The control actions are constrained as follows:

$$\sum_{i=1}^{n} \sum_{\mu=1}^{s_i} u_{i\mu j}^{(o)}(t) \leq 1 \;\; \forall j; \qquad \sum_{j=1}^{n} u_{i\mu j}^{(o)}(t) \leq 1 \;\; \forall i, \forall \mu; \quad (2)$$

$$\sum_{i=1}^{n} u_{i\mu j}^{(o)} \Big[ \sum_{\alpha \in \Gamma_{i\mu 1}} \big( a_{i\alpha}^{(o)} - x_{i\alpha}^{(o)} \big) + \prod_{\beta \in \Gamma_{i\mu 2}} \big( a_{i\beta}^{(o)} - x_{i\beta}^{(o)} \big) \Big] = 0; \quad (3)$$

$$u_{i\mu j}^{(o)}(t) \in \{0, 1\}; \quad (4)$$

where $\Gamma_{i\mu 1}$, $\Gamma_{i\mu 2}$ are the sets of numbers of the jobs that immediately precede the job $D_{\mu}^{(i)}$, subject to the accomplishment of all the predecessor jobs or of at least one of the jobs, correspondingly, and $a_{i\alpha}^{(o)}$, $a_{i\beta}^{(o)}$ are the planned lot-sizes. Constraint (2) refers to the allocation problem constraint according to the problem statement (i.e., only a single order can be processed at any time by the manufacturer). Constraint (3) determines the precedence relations; moreover, this constraint implies the blocking of the job $D_{\mu}^{(i)}$ until the previous jobs $D_{\alpha}^{(i)}$, $D_{\beta}^{(i)}$ have been executed. If $u_{i\mu j}^{(o)}(t) = 1$, all the predecessor jobs of the operation $D_{\mu}^{(i)}$ have been executed. Note that these constraints are identical to those in MP models.

According to the problem statement, let us introduce the following performance indicators:

$$J_{1}^{(o)} = \frac{1}{2} \sum_{i=1}^{n} \sum_{\mu=1}^{s_i} \big( a_{i\mu}^{(o)} - x_{i\mu}^{(o)}(T_f) \big)^2; \quad (5)$$

$$J_{2}^{(o)} = \sum_{i=1}^{n} \sum_{\mu=1}^{s_i} \sum_{j=1}^{n} \int_{T_0}^{T_f} \alpha_{i\mu}^{(o)}(\tau)\, u_{i\mu j}^{(o)}(\tau)\, d\tau; \qquad J_{3}^{(o)} = \frac{1}{2} \sum_{j=1}^{n} \big( T - x_{j}^{(o)}(T_f) \big)^2. \quad (6)$$

The indicator $J_{1}^{(o)}$ characterizes the accuracy of accomplishing the end conditions, i.e., the service level; $J_{2}^{(o)}$ refers to the estimation of a job's execution time with regard to the planned supply terms and reflects the delivery reliability, i.e., accomplishing the deliveries by the fixed due dates; $J_{3}^{(o)}$ estimates the uniformity of resource utilization. The functions $\alpha_{i\mu}^{(o)}(\tau)$ are assumed to be known. A toy numerical illustration of constraints (2), (4) and the terminal indicators is given below.
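To make the constraints concrete, here is a minimal sketch (not from the paper; the tiny instance, its sizes and all names are hypothetical, with $s_i = 1$ for brevity) that checks the assignment constraints (2) and (4) for a toy binary schedule and evaluates the terminal indicators (5)–(6):

```python
# Illustrative only: brute-force feasibility check of (2), (4) and evaluation
# of the terminal indicators (5)-(6) for model M1; the paper itself obtains
# the schedule via optimal program control, not via this enumeration.
import numpy as np

n_jobs, n_suppliers, steps = 3, 2, 4   # hypothetical sizes
a = np.array([4.0, 2.0, 3.0])          # planned lot-sizes a_{i mu}^{(o)}
x_f = np.array([4.0, 1.5, 3.0])        # job states x_{i mu}^{(o)}(T_f)
T = 10.0
x_j_f = np.array([9.0, 8.0])           # total employment times x_j^{(o)}(T_f)

# u[i, j, k]: binary assignment of job i to supplier j at time step k
u = np.zeros((n_jobs, n_suppliers, steps), dtype=int)
u[0, 0, 0] = u[1, 1, 0] = u[2, 0, 1] = 1

# Constraint (4): binary controls only.
assert set(np.unique(u)) <= {0, 1}
# Constraint (2): per time step, each supplier serves at most one job
# and each job occupies at most one supplier.
assert (u.sum(axis=0) <= 1).all() and (u.sum(axis=1) <= 1).all()

J1 = 0.5 * np.sum((a - x_f) ** 2)      # accuracy of end conditions, eq. (5)
J3 = 0.5 * np.sum((T - x_j_f) ** 2)    # uniformity of resource use, eq. (6)
print(f"J1 = {J1:.3f}, J3 = {J3:.3f}")
```

Evaluating $J_{2}^{(o)}$ would additionally require the weight functions $\alpha_{i\mu}^{(o)}(\tau)$ and a quadrature over $[T_0, T_f]$, which is omitted here.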

2.2 A Dynamic Model of Flow Control (Model M2)

The flow dynamics are described as

$$\dot{x}_{i\mu j}^{(f)} = u_{i\mu j}^{(f)}, \qquad \dot{x}_{ij\eta\rho}^{(f)} = u_{ij\eta\rho}^{(f)}. \quad (7)$$

We denote by $x_{i\mu j}^{(f)}$ the flow state variable, which indicates the relation of the variable $x$ to flows. The control actions are constrained by maximal capacities and intensities as follows:

$$\sum_{i=1}^{n} \sum_{\mu=1}^{s_i} u_{i\mu j}^{(f)}(t) \leq R_{j}^{(f)}, \qquad \sum_{\rho=1}^{p_i} u_{ij\eta\rho}^{(f)}(t) \leq R_{j\eta}^{(f)}, \quad (8)$$

$$0 \leq u_{i\mu j}^{(f)}(t) \leq c_{i\mu j}^{(f)}\, u_{i\mu j}^{(o)}, \qquad 0 \leq u_{ij\eta\rho}^{(f)}(t) \leq c_{ij\eta\rho}^{(f)}\, u_{ij\eta\rho}^{(o)}, \quad (9)$$

where $R_{j}^{(f)}$ is the total potential intensity of the resource $C^{(j)}$, $R_{j\eta}^{(f)}$ is the maximal potential channel intensity to deliver products to the customer $B^{(\eta)}$, $c_{i\mu j}^{(f)}$ is the maximal potential capacity of the resource $C^{(j)}$ for the job $D_{\mu}^{(i)}$, and $c_{ij\eta\rho}^{(f)}$ is the total potential capacity of the channel $(j,\eta)$ delivering the product flow $P^{(i)}$ of the job $D_{\mu}$ to the customer $B^{(\eta)}$. The end conditions are similar to those in (6) and subject to the units of processing time. The economic meaning of the performance indicators in this model corresponds to $J_{2}^{(o)}$ and $J_{3}^{(o)}$:

$$J_{1}^{(f)} = \frac{1}{2} \sum_{i=1}^{n} \sum_{\mu=1}^{s_i} \sum_{j=1}^{n} \Big[ \big( a_{i\mu j}^{(f)} - x_{i\mu j}^{(f)}(T_f) \big)^2 + \sum_{\eta=1,\, \eta \neq i}^{n} \sum_{\rho=1}^{p_i} \big( a_{ij\rho\eta}^{(f)} - x_{ij\rho\eta}^{(f)}(T_f) \big)^2 \Big]; \quad (10)$$

$$J_{2}^{(f)} = \frac{1}{2} \sum_{i=1}^{n} \sum_{\mu=1}^{s_i} \sum_{j=1}^{n} \int_{T_0}^{T_f} \beta_{i\mu}^{(f)}(\tau)\, u_{i\mu j}^{(f)}(\tau)\, d\tau. \quad (11)$$

2.3 A Dynamic Model of Measuring and Computing Operations Control (Model M3)

The model of measuring and computing operations control is described as

$$\dot{x}_{i}^{(g)} = F_i(t)\, x_{i}^{(g)}, \qquad y_{j}^{(i)} = d_{j}^{T}(t)\, x_{i}^{(g)} + \xi_{j}^{(e)}, \qquad \dot{Z}_i = -Z_i F_i - F_i^{T} Z_i - \sum_{j=1}^{m} \sum_{\gamma \in \Gamma_i} u_{i\gamma j}^{(e)}\, \frac{d_j d_j^{T}}{\sigma_j^{2}}, \;\; i \neq j, \quad (12)$$

$$0 \leq u_{i\gamma j}^{(e)} \leq c_{\gamma j}^{(e)}\, u_{i\gamma j}^{(o)}. \quad (13)$$

Here $x_{i}^{(g)}$ is the state vector; the matrix $F_i$ characterizes the dynamics of change of the variables describing the state of an object (for example, actual supply quantities, stock, product storage temperature, etc.); and $\xi_{j}^{(e)}$ are uncorrelated measurement errors of the object parameters. We suppose that the measurement errors follow the normal distribution law with zero expectation and variance $\sigma_j^{2}$; $u_{i\gamma j}^{(e)}$ is the control action, which specifies the intensity of the remote measurement of the parameters $y_{j}^{(i)}$; $Z_i$ is the inverse of the correlation matrix $K_i(t)$ of the measurement errors of the object parameters; and $d_j$ characterizes the technical features of the device which performs the measurement of the parameters.

Let us consider the performance indicator $J_{1}^{(e)}$ in this model:

$$J_{1}^{(e)} = \sum_{i=1}^{m} \sum_{j=1}^{m} \sum_{\gamma \in \Gamma_i} \int_{t_0}^{t_f} u_{i\gamma j}^{(e)}(\tau)\, d\tau, \quad j \neq i. \quad (14)$$

With the help of weighting the performance indicators, a general performance vector can be denoted as follows:

$$J(x(t), u(t)) = \big( J_{1}^{(o)}, J_{2}^{(o)}, J_{3}^{(o)}, J_{1}^{(f)}, J_{2}^{(f)}, J_{1}^{(e)} \big). \quad (15)$$

The partial indicators may be weighted depending on the planning goals and control strategies. Original methods [5] have been used to transform the vector $J$ to the scalar form $J_G$; a simple weighted-sum illustration of such a transformation is sketched below.
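As a toy illustration of the scalarization step (the numbers and weights are hypothetical, and the naive normalization stands in for the original methods of [5], which are not reproduced here):

```python
# A minimal sketch of folding the vector indicator (15) into a scalar J_G.
import numpy as np

J = np.array([0.8, 12.5, 3.1, 0.4, 9.7, 6.0])  # (J1o, J2o, J3o, J1f, J2f, J1e)
w = np.array([0.3, 0.2, 0.1, 0.2, 0.1, 0.1])   # planner-chosen weights, sum to 1

J_norm = J / J.max()      # naive common scaling so the indicators are comparable
J_G = float(w @ J_norm)   # scalar criterion used to rank candidate schedules
print(f"J_G = {J_G:.3f}")
```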

3 Problem Statement

Let us combine the models M1, M2, M3 (1)–(15) into one model M and sequentially consider the following problems. The model can be represented in the following form:

$$M = \begin{cases} u(t) \mid \dot{x} = f(x, u, t), \\ h_0(x(t_0)) \leq 0, \;\; h_1(x(t_f)) \leq 0, \\ q^{(1)}(x, u) \leq 0, \;\; q^{(2)}(x, u) = 0. \end{cases} \quad (16)$$

At the first stage, the job-shop scheduling problem for M is considered. We formulate it as an optimal program control (OPC) problem: it is necessary to find an allowable control $u(t)$, $t \in (T_0, T_f]$, that ensures for M the fulfilment of the vector


constraint functions $q^{(1)}$, $q^{(2)}$, and guides the dynamic system of the model (16) (i.e., the job shop schedule) from the initial state to the specified final state. If there are several allowable controls (schedules), then the best (optimal) one should be selected in order to maximize (minimize) $J_G$. In terms of OPC, the program control of job execution is at the same time the job shop schedule. This problem includes, in particular, the problem of planning measuring and computing operations. Denote by $u_{pr}(t)$ the optimal program control and by $x_{pr}(t)$ the corresponding vector obtained as the solution of the first-stage OPC problem for the model M with $J_G$. At the next stage, let us consider the model M under disturbances in the following form:

$$M_\zeta = \begin{cases} u(t) \mid \dot{x} = f(x, u, t, \zeta), \\ h_0(x(t_0)) \leq 0, \;\; h_1(x(t_f)) \leq 0, \\ q^{(1)}(x, u) \leq 0, \;\; q^{(2)}(x, u) = 0. \end{cases} \quad (17)$$

The disturbance $\zeta(t)$ is a piecewise continuous and bounded function; it describes external actions, but in explicit form $\zeta(t)$ is unknown. Assume that the information about the current state of the system $x_\sigma$ is obtained at the moments of time $t_\sigma$, where $t_\sigma \in [t_0, t_f)$, $\sigma = 1, \ldots, N$, in case unforeseen events happen. These moments depend on the solution of the problem of planning measuring and computing operations, $x^{(g)}$ and $u^{(e)}$. Suppose the company has incentives to minimize emergent deviations of system behavior under external actions in order to decrease its additional costs. At the second stage, we presume that the relation to jobs (orders) and the total employment time of the $j$-th supplier remain the same. In this context, during the system operation the SC schedule execution problem is investigated. The SC is able to realize the performance of production and transportation execution and to reject the disturbances. Thus, let us consider the following problem: it is necessary to construct a feasible control $\tilde{u}(t)$ (on the basis of $u_{pr}(t)$), defined on the time interval $[t_0, t_f)$, that satisfies the constraints and boundary conditions in (17) and minimizes the performance parameter

$$J = \int_{t_0}^{t_f} \| \tilde{x}(t) - x_{pr}(t) \| \, dt \to \min.$$

4 Generalized Real-Time Control Construction Algorithm

At the first stage, we propose to solve the job-shop scheduling problem, which includes the problem of planning measuring and computing operations, by control theory methods [2,6,10]: we design the optimal program control $u_{pr}(t)$ and the corresponding vector $x_{pr}(t)$ for the OPC problem for the model M with $J_G$ [6].

1. Based on the initial data, the technical characteristics and the capabilities of suppliers and consumers, the time intervals of their potential interaction are calculated for the planning interval. The characteristics of the attainable set are estimated [7].


2. If the boundary conditions are satisfied, then the problem of optimal program control for (16) can be solved; otherwise, additional resources are required.
3. The problem of optimal program control for (16) is solved, $u_{pr}(t)$, $x_{pr}(t)$ are obtained, and for each measuring operation in the plan, the problem of planning measuring and computing operations is solved.
4. At the moment of time $\tau = t_0$ we can use $\tilde{u}(t) = u_{pr}(t)$ and $\tilde{x}(t) = x_{pr}(t)$ for $M_\zeta$, because this control is feasible and satisfies the constraints of the model.

At the second stage, we propose to consider the model $M_\zeta$ with the assumption that the relation to jobs (orders) and the total employment time of the $j$-th supplier remain the same. Thus, the nonlinear constraints prescribing the order of operations were used only at the first stage, in the job shop scheduling problem. We propose to apply the position optimization method to solve the SC schedule execution problem at the second stage. It is based on widely known mathematical methods of optimal control and linear programming (the adaptive method) [3]. It is suitable for linear and nonlinear systems and has been applied in a number of cases [3,15,17]. We denote the set $T_\tau = \{\tau = t_\sigma \mid t_\sigma \in [t_0, t_f)\}$, $t_0 < t_f < +\infty$. We design the position control $\tilde{u}(t)$, $t \in [t_0, t_f)$, in the class of discrete piecewise constant functions $\tilde{u}(t) = \tilde{u}(t_\sigma)$ at $t \in [t_\sigma, t_{\sigma+1})$, for all $t_\sigma \in T_\tau$, $t_{\sigma+1} \in T_\tau$, using the position optimization method.

1. Current information about the SC condition $x_\sigma$ at the moments $t_\sigma$ is received. These moments depend on the solution of the problem of planning measuring and computing operations, $x^{(g)}$ and $u^{(e)}$. Let us consider a set of auxiliary OPC problems for the model $M_\zeta$ with feasible $\tilde{u}(t) \in U_p$, conditions at the initial time $\tau = t_\sigma$: $\tilde{x}(\tau) = x_\sigma$, at the moment of time $t_f$: $h_1(\tilde{x}(t_f)) \leq 0$, and performance parameter $\tilde{J}$. This set depends on the parameter $\tau \in T_\tau$, and the vector $x_\sigma$ is received. The moments of time $\tau = t_\sigma$ are defined from the model M3 in the problem of planning measuring and computing operations.
2. The auxiliary problem of optimal program control with new initial conditions $(t_\sigma, x_\sigma)$ is formulated. Denote by $\tilde{u}(t \mid \tau, x_\sigma)$ the OPC for the position $(\tau, x_\sigma)$ (positional solution), and by $X(\tau)$ the set of all start states $x_\sigma = \tilde{x}(\tau)$ for which the auxiliary OPC problem can be solved at the moment of time $\tau$, with vectors $x_\sigma = \tilde{x}(\tau)$, $\tilde{x}(\tau) \in X(\tau)$, $t \in T_\tau$.
3. The procedure of control design in the piecewise constant function class on the time interval $T_\tau$ with $\tau = t_\sigma$ is realized, and the auxiliary problem of optimal program control is reduced to a linear programming problem.
4. The linear programming problem is solved using the adaptive method.
5. The positional control is used on the time interval $[t_\sigma, t_{\sigma+1})$ until the new information about the SC condition $(t_{\sigma+1}, x_{\sigma+1})$ is received.

The set of auxiliary OPC problems at $\tau = t_\sigma$ is considered in sequence, where the current information about the system states $x_\sigma$ is taken into account. Each of them can be reduced to a linear programming (LP) problem. The solution of the auxiliary OPC problem is applied to the considered model on the time interval until further information is acquired. The adaptive method is suitable for solving the considered LP problems, because it was developed for time-dependent multidimensional systems under polyhedral constraints [3]. A minimal sketch of such a replanning loop is given below.
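The following sketch illustrates steps 1–5 for a scalar integrator state with an L1 tracking criterion. It is an illustration only: the generic scipy.optimize.linprog solver stands in for the adaptive method of [3], all numbers are hypothetical, and the horizon is kept anchored for brevity.

```python
# Receding replanning: at each measurement instant the auxiliary OPC problem
# is reduced to an LP (min sum of tracking slacks) and re-solved from the
# newly observed state; only the first control step is applied.
import numpy as np
from scipy.optimize import linprog

dt, H = 1.0, 5                                # step and horizon (hypothetical)
u_lo, u_hi = 0.0, 2.0                         # polyhedral control constraints
x_pr = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # program trajectory x_pr(t)

def replan(x_sigma):
    """min sum(e_k) s.t. |x_k - x_pr_k| <= e_k, x_k = x_sigma + dt*cumsum(u)."""
    c = np.r_[np.zeros(H), np.ones(H)]        # cost only on the slacks e
    L = dt * np.tril(np.ones((H, H)))         # maps u to x_k - x_sigma
    A_ub = np.block([[ L, -np.eye(H)],        #  x_k - p_k <= e_k
                     [-L, -np.eye(H)]])       #  p_k - x_k <= e_k
    b_ub = np.r_[x_pr - x_sigma, x_sigma - x_pr]
    bounds = [(u_lo, u_hi)] * H + [(0, None)] * H
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:H]                          # piecewise constant control

x = 0.0
for t_sigma in range(3):                      # new state information arrives
    u = replan(x)                             # positional solution u(t | t, x)
    x += dt * u[0] + np.random.normal(0, 0.1) # apply one step; noise plays zeta
    print(f"t={t_sigma + 1}, u={u[0]:.2f}, x={x:.2f}")
```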


Fig. 1. The areas of potential interaction of facilities between suppliers and between suppliers and customers

5 Numerical Example

Let us demonstrate the results of the realization of the measuring and computing operations planning algorithm on a numerical example. It was implemented as part of a software package written in C++; the SQLite database is used to store the results, and the Qt OpenGL library is used to visualize them. The result of this program is the optimal control, presented in the form of a schedule (plan) of operations.

Fig. 2. Optimal plan of measuring and computing operations

The numerical example demonstrates the simulation results for evaluating the potential interaction time between suppliers and between suppliers and customers; on this basis, Fig. 1 shows the areas of their potential interaction. Taking into account the results obtained, a heuristic plan for the programmed control of measuring and computing operations is constructed, and then the resulting plan is optimized using the Pontryagin maximum principle (Fig. 2). Figure 3 shows that reducing the coefficients in the correlation matrix of errors leads to an improvement in the quality of measurements.


Fig. 3. The quality of measurements depending on the coefficients in the correlation matrix of errors

6 Conclusion

In this paper, to improve the efficiency of incoming information about the current state of the SC and the timeliness of the data, we suggest using cyber-physical systems. Thus, the multiple-model description of the supply chain (the dynamic model of job control and the dynamic model of flow control) is considered together with the dynamic model of measuring and computing operations control, which allowed us to describe the process of receiving the necessary information about the current state of the system (for example, inventory levels, storage temperature of products, volume of supplies, etc.). The interaction of the models in the model combination and the interrelation of their parameters enable applying them to the job shop scheduling problem, the problem of planning measuring and computing operations, and the SC schedule execution problem. The advantage of this approach is that cyber-physical systems have been applied, and methods of optimal program and position control have been used for the considered supply chain management problem.

Acknowledgements. The research described in this paper is partially supported by the Russian Foundation for Basic Research (grants 16-29-09482-fi-i, 17-08-00797, 17-06-00108, 17-01-00139, 17-20-01214, 17-29-07073-fi-i, 18-07-01272, 18-08-01505, 19-08-00989), state order of the Ministry of Education and Science of the Russian Federation no. 2.3135.2017/4.6, state research 0073-2019-0004, and the International project ERASMUS+, Capacity building in higher education, no. 73751-EPP-1-2016-1-DE-EPPKA2-CBHE-JP, "Innovative teaching and learning strategies in open modelling and simulation environment for student-centered engineering education".

References

1. Chen, Z.L., Pundoor, G.: Order assignment and scheduling in a supply chain. J. Oper. Res. 54, 555–572 (2006)
2. Chernousko, F.L.: State Estimation of Dynamic Systems. CRC Press, Boca Raton (1994)
3. Gabasov, R., Dmitruk, N.M., Kirillova, F.M.: Numerical optimization of time-dependent multidimensional systems under polyhedral constraints. Comput. Math. Math. Phys. 45(4), 593–612 (2005)
4. Garcia, C.A., Ibeas, A., Herrera, J., Vilanova, R.: Inventory control for the supply chain: an adaptive control approach based on the identification of the lead-time. Omega 40, 314–327 (2012)
5. Gubarev, V.A., Zakharov, V.V., Kovalenko, A.N.: Introduction to Systems Analysis. LGU, Leningrad (1988)
6. Ivanov, D.A., Sokolov, B.V.: Adaptive Supply Chain Management. Springer, New York (2010)
7. Ivanov, D., Sokolov, B., Dolgui, A., Solovyeva, I.: Application of control theoretic tools to supply chain disruption management. IFAC Proc. Vol. (IFAC-PapersOnline) 46(9), 1926–1931 (2013)
8. Ivanov, D., Sokolov, B., Solovyeva, I., Dolgui, A., Jie, F.: Ripple effect in the time-critical food supply chains and recovery policies. IFAC-PapersOnLine 28(3), 1682–1687 (2015). INCOM 2015 - Proceedings
9. Ivanov, D., Sokolov, B., Solovyeva, I., Dolgui, A., Jie, F.: Dynamic recovery policies for time-critical supply chains under conditions of ripple effect. Int. J. Prod. Res. 54(23), 7245–7258 (2016)
10. Kalinin, V.N., Sokolov, B.V.: Optimal planning of the process of interaction of moving operating objects. Int. J. Differ. Eqn. 21(5), 502–506 (1985)
11. Kupriyanovsky, V., Namiot, D., Sinyagov, S.: Cyber-physical systems as a base for digital economy. Int. J. Open Inf. Technol. 4(2), 18–24 (2016)
12. Lee, E.A.: The past, present and future of cyber-physical systems: a focus on models. Sensors 15, 4837–4869 (2015)
13. Ortega, M.: Control theory applications to the production-inventory problem: a review. Int. J. Prod. Res. 8(2), 74–80 (2000)
14. Perea, E., Grossman, I., Ydstie, E., Tahmassebi, T.: Dynamic modeling and classical control theory for supply chain management. Comput. Chem. Eng. 24, 1143–1149 (2000)
15. Popkov, A.S., Baranov, O.V., Smirnov, N.V.: Application of the adaptive method of linear programming for technical objects control. In: ICCTPEA 2014 - Proceedings, p. 141 (2014)
16. Schwartz, D., Wang, W., Rivera, D.: Simulation-based optimization of process control policies for inventory management in supply chains. Automatica 125(2), 1311–1320 (2006)
17. Solovyeva, I., Ivanov, D., Sokolov, B.: Analysis of position optimization method applicability in supply chain management problem. In: International Conference on SCP 2015 - Proceedings, pp. 498–500 (2015)
18. Sokolov, B., Zelentsov, V., Mustafin, N., Kovalev, A., Kalinin, V.: Methods and algorithms of ship-building manufactory operation and resources scheduling. In: Proceedings of the 19th International Conference on HMS 2017, pp. 81–86 (2017)
19. Trofimova, I., Sokolov, B., Ivanov, D.: Multiple-model description and control construction algorithm of supply chain. In: Advances in Intelligent Systems and Computing, 7th CSOC - Proceedings, pp. 102–108 (2019)
20. Wolf, W.: Cyber-physical systems. Computer 3, 88–89 (2009)

Method for Design of ‘smart’ Spacecraft Onboard Decision Making in Case of Limited Onboard Resources

Andrey Tyugashev1(B) and Sergei Orlov2

1 Samara State Transport University, 2V Svobody Str., Samara, Russia [email protected]
2 Samara State Technical University, 244 Molodogvardeyskaya Str., Samara, Russia [email protected]

Abstract. This paper is devoted to a method for the design of real-time spacecraft onboard control algorithms. This kind of algorithm should provide onboard decision making in the process of organizing the synchronized cooperation of different onboard equipment. Right and valid onboard decision making is extremely important for the success of space exploration missions. A special aspect of the complexity of such algorithms is related to the limited available onboard resources: electric energy, computer power, etc. Correct decision making should take the limits of these resources into account. The paper introduces mathematical models for onboard resources, equipment and devices and their working modes, because an important issue is considering the level of consumption of onboard resources by each onboard device in a particular mode. Also, we present prototypes of a software tool that allows automated checking of whether existing algorithms violate the resource constraints, as well as the design of new algorithms from scratch.

Keywords: Real-time mode · Onboard resources · Spacecraft onboard equipment · Decision making · Flight control software

1 Introduction

Modern spacecraft are equipped with a wide spectrum of onboard systems (such as the Motion Control System, Telemetry System, Onboard Control System, Power Supply System, Thermal Control System, etc.). Each onboard system includes a set of devices, sensors, engines, etc. [1,2]. All of the named equipment should cooperate to complete the space mission. To do this, the spacecraft needs to accomplish a timed set of actions involving various onboard devices. All actions must be logically coordinated and correctly synchronized, considering the parameters of spacecraft motion and the speed of onboard physical processes. One can name this set of operations an 'active schedule'; in the aerospace industry, it has the name 'cyclogram'.


The important issue is that the operations to be executed consume onboard resources such as electric power, processor time and memory of an onboard computer, etc., and consistent control algorithms must consider the limits of the available resources. Actually, during the execution of the needed cyclogram, the onboard flight control software performs a 'system forming' role, i.e., it integrates the various onboard apparatus into a holistic entity. Furthermore, in accordance with Ashby's Law of Requisite Variety [3], the variety and complexity of a control system should reflect the complexity of the controlled object; in this case, the complexity level of the onboard control means should correspond to the complexity of the spacecraft. It is no wonder that the Onboard Control System is one of the most complex spacecraft systems. We can even note that today a modern spacecraft has several onboard computers integrated into an onboard network. These computers are controlled by a real-time operating system and run the onboard flight control software [4,5]. The growing complexity of the tasks executed onboard leads to an increase in the control logic's complexity. In fact, the complexity of onboard flight software has increased dramatically during the last decades [6]. During a real mission, a spacecraft faces device failures as well as failures of the software running on the main onboard computer or embedded in a particular onboard system. Moreover, even if all onboard devices and programs work well, some problems can be caused by a violation of the limits of the available onboard resources. Some abnormal situations cannot be carefully predicted at the design phase and are not described in operation manuals. Consequently, it is very hard to parry these situations at the operational phase. Fortunately, special redundancy of software can compensate for failures [6–8]. But if we want to avoid many dangerous situations, the limits of the available onboard resources are a very important issue to be considered at the design phase [9,10]. Usually, each onboard device has a set of modes connected with various levels of consumption of onboard resources. The minimum number of modes associated with a particular device is two: ON and OFF. Some kinds of equipment have a wide spectrum of modes with varying combinations of needed resources. Over a space flight, and depending on the current situation, the modes of functioning might be switched slowly or quickly. But each kind of onboard resource has a maximal available level, i.e., a 'red line'. For instance, a spacecraft can be equipped with solar panels providing a certain level of electric power. The control logic with 'smart' decision making must complete all required tasks but not violate these red lines during the whole flight. Switching from mode to mode is implemented by so-called 'apparatus control commands', i.e., coded sequences of electric impulses issued through onboard control interfaces. For instance, we can use the command «SWITCH ON Device1 in System1» or «CHANGE MODE OF GyroDyne1 to Mode2». So, we need an approach allowing the design of 'smart' decision making that provides a warranty of the absence of violations of the available resource levels. This approach will improve the probability of a space mission's success.

Usually, the complex control logic involving devices belonging to different onboard systems is implemented by a special sort of software - control algorithms [11]. A particular control algorithm should be implemented by a particular program module. These modules run as processes of a multi-threaded real-time onboard operating system, so many control algorithms execute in parallel at any particular time moment. As a rule, an algorithm includes many kinds of operators, but we are interested in the following: (1) a call of another program module implementing another control algorithm; (2) the issue of an apparatus control command to a particular device; (3) the setting or resetting of some logical value (flag). The values of these flags have an impact on the execution of onboard actions. For example, one flag can mean «level of charge of battery 1 is extremely low», and another means «second solar panel is fully operational». The full set of logical values forms the so-called 'spacecraft state vector', or 'logical vector'. Some subsets of this vector can be reviewed as 'particular state vectors', for example, the state vector of a certain onboard system [12]. If we fix the values of the conditions, we get a concrete scenario of execution of the control algorithm, i.e., a branch of the control logic. Various scenarios of execution lead to the execution of various timed sequences of actions. In other words, we have a particular cyclogram for each scenario. The additional level of complexity of decision making in the case of limited available resources is caused by the following. As mentioned above, several control algorithms run in a parallel manner, and each of them executes different sequences of actions. So, we must consider the chain (in fact, tree) of calls: CA1 → CA2 → CA3. In fact, one control algorithm can activate dozens of other algorithms with corresponding cyclograms of actions. It is a very difficult problem for a designer to consider all branches of the 'execution tree' with full checking of the absence of violations of the available resources' limits. In this paper, we propose an approach to resolve this problem. The second section of the paper is devoted to the formal problem statement, the analysis of the input and output data of each stage of onboard control logic design, and a discussion of ways to develop a special decision making automation tool utilizing the opportunities provided by constraint programming and SMT solvers.

2 Method and Discussion

In this section, we provide mathematical formalisms for modeling spacecraft onboard equipment and flight control program modules, and discuss the ways of implementing an automation tool. The spacecraft onboard situation can be represented by the following sets:

EA = {BA_i}, i = 1...N is a set of onboard devices;
AP = {AP_j}, j = 1...M is a set of onboard resources;
Lim(AP_j) = {(LimAP_1, LimAP_2, ..., LimAP_M)} is a set of limits of the available resources;
REA(BA_i) = {(RM_i1, RM_i2, ..., RM_iN)} is a set of modes of onboard devices;
CL(RM_ik) = {(CP_i1, CP_i2, ..., CP_iN)} is a vector of levels of consumption of each kind of resource in the corresponding mode.

The starting point in the design process of decision making is the set of tasks to be executed by the onboard equipment. The important issue is that we must implement each action at the specified time moment corresponding to the speeds of the onboard physical processes. We use the following set: Z = {PT_j, t_j} - the set of tasks to be executed by the spacecraft. The onboard complex of control algorithms can be represented as UA = {CA_m}, m = 1...U. Then the designer must accomplish the following steps:

1. Define the mapping f1: Z → EA, i.e., define which devices are to be involved in the process of execution of each task, and which modes of functioning of each device should be used.
2. Define the mapping f2: EA → UA, i.e., the mapping between the elements of onboard equipment and the program modules which control them.
3. Set the time parameters: f3: EA → T and UA → T.

Using these mappings, we can build the cyclogram of functioning of the various onboard devices and calculate the total level of consumption of a particular onboard resource as a function of time (a sketch of this calculation follows below). The last mapping allows us to form the following set of tuples: {BA_i, CA_ij, t_i, l_i}. Each tuple defines the scenario of execution of CA_i starting at moment t_i in the situation described by the logical vector l_i. The logical (state) vector can be represented, for example, as l = (α_1 = TRUE, α_2 = FALSE, ..., α_q = H), where α_i is a particular condition (flag). Some conditions could be insignificant for a given scenario of execution; in this case, we use the additional truth value 'H', so in this aspect our approach can be considered a variant of three-valued logic. The set of these tuples represents the sequence of 'sections' of the space mission and determines the different scenarios of execution of the control algorithms. On the other hand, one can apply this sequence to check overlays of devices' modes and of the functioning of particular software modules. After that, we link the execution of the program modules and the work of the onboard equipment. In other words, we form the schedule of the onboard equipment's functioning and the corresponding schedule of consumption of onboard resources. The representation of a cyclogram in the screen form of the special software tool can be found in Fig. 1.
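As an illustration of such a calculation, the following sketch (with hypothetical devices, modes, numbers and limits) sums the consumption levels CL of the modes active at each time step and reports the steps where any limit Lim(AP_j) is crossed:

```python
# A minimal 'red line' checker for a cyclogram: for every time step, total
# consumption per resource is compared with the maximal available level.
LIMITS = {"electric_power": 100.0, "cpu_time": 60.0}          # Lim(AP_j)
CONSUMPTION = {                                               # CL(RM_ik)
    ("gyrodyne1", "ON"): {"electric_power": 40.0, "cpu_time": 10.0},
    ("gyrodyne1", "OFF"): {"electric_power": 0.0, "cpu_time": 0.0},
    ("telemetry", "TX"): {"electric_power": 70.0, "cpu_time": 30.0},
}
# Cyclogram: device -> list of (start, end, mode) intervals.
CYCLOGRAM = {
    "gyrodyne1": [(0, 50, "ON"), (50, 100, "OFF")],
    "telemetry": [(30, 60, "TX")],
}

def violations(cyclogram, horizon=100):
    """Return the time steps at which any resource exceeds its limit."""
    bad = []
    for t in range(horizon):
        total = {r: 0.0 for r in LIMITS}
        for device, intervals in cyclogram.items():
            for start, end, mode in intervals:
                if start <= t < end:
                    for r, level in CONSUMPTION[(device, mode)].items():
                        total[r] += level
        if any(total[r] > LIMITS[r] for r in LIMITS):
            bad.append((t, dict(total)))
    return bad

print(violations(CYCLOGRAM)[:1])   # e.g. at t=30: power = 40 + 70 > 100
```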


We must mention the existence of an alternate approach described in the publications [11–14]. A similar but different model is based on tuples of the form $CA_m = \{\langle f_i, t_i, \tau_i, l_i \rangle\}$, where $f_i = Call(CA_k) \vee KU(BA_i) \vee Set(\alpha_l)$ is an action to be executed at time moment $t_i$ with duration $\tau_i$ and associated with the logical vector $l_i$. In this model, $f_i$ might be a call of another control algorithm, the issue of an apparatus control command, or the setting or clearing of a logical value.

2.1 Input and Output Data at Stages of Design of Decision Making

To provide a 'smart' decision making design support tool, we need to analyze the design process stages from the input and output data point of view. The process of design of 'smart' spacecraft operations implemented by onboard devices and software modules must harmonize the cooperative work of all units in the physical and logical senses [15]. In the physical sense, we need to consider the maximum available levels of onboard resources. At the first stage, the set of cyclograms of onboard apparatus and units' work should be built for the various possible situations reflected by the spacecraft state control vector. The important issue is that we need to design the decision making process both for the control of a particular onboard device (control 'in small') and for the control of the spacecraft as a whole (control 'in big'). These procedures should be implemented by control algorithms for complex operation of the spacecraft [11,14] at the next stage. We should consider the following information:

Fig. 1. Cyclogram of onboard operations (screenshot of the special software tool)


• Materials on the control logic of systems and units during the execution of particular tasks
• Requirements on the time characteristics of execution of the tasks, including restrictions on the simultaneous execution of various functional tasks due to the physical sense of the tasks
• Requirements related to the sequence of execution of different sections of the control process.

After that we get the following results, which are the input data for the next stages:

• Input data for the design of control algorithms for complex spacecraft operations
• Time diagrams showing the operation of systems and units, indicating the modes of operation of the equipment and algorithms for different scenarios of control
• Estimations of the required levels of onboard resources consumed by the spacecraft equipment during operations
• Materials on the mutual overlay of the execution of control algorithms and the modes of onboard device functioning.

2.2 Mathematical Models for Input and Output Data of Design Stages

To build formal mathematical models for the input and output data of the mentioned design stages [16], we need to consider a model structure that allows describing the entities named in the previous sections, related to the various onboard systems, devices, and actuators with their work modes. On the other hand, the model should reflect the set of control algorithms divided into 'sections' and adequately reflect the various scenarios of execution. The model of control algorithms for complex operations can be reviewed as the 'base' model at this stage. Correspondingly, we propose the following additional sets:

$ALTS = \{CA^{ij}\}$ is a set of branches of control algorithms, where $i$ is the ID of the algorithm and $j$ indexes its possible scenarios (branches of execution). As a rule, each control algorithm has several ($K_i$) scenarios of execution depending on the values of certain flags. In these terms, $CA^{ij}$ is the control algorithm with ID $i$ implementing scenario $j$.

$\Omega = \{\omega_d, \leq\}$ is a time-ordered set of sections of the spacecraft mission.

Using this approach, we can represent a particular control algorithm by the following structure:

$$CA^{ij} = \{T^{ij}_{work}, A^{ij}_{on}, A^{ij}_{off}, BA^{ij}, KU^{ij}, FP^{ij}\}, \quad i = 1, \ldots, N, \; j = 1, \ldots, K,$$

where $T^{ij}_{work} = \{(tw^{ij}_1, lw^{ij}_1), (tw^{ij}_2, lw^{ij}_2), \ldots, (tw^{ij}_K, lw^{ij}_K)\}$.

Here $tw^{ij}_1$ is the start time of execution of the control algorithm scenario in case of validity of the logical condition $lw^{ij}_1$.


Fig. 2. The screenshot of another software prototype tool

The number of pairs $(tw^{ij}, lw^{ij})$ defines the number of calls of scenario $j$ of the control algorithms depending on $lw^{ij}$. The full set of logical conditions $lw^{ij}$ defines the full 'state vector' in the sense of [11,13]. $A^{ij}_{on}$ is the set of algorithms called by scenario $j$ of $CA^{ij}$:

$$A^{ij}_{on} = \{(CA^{ij_1}, tw^{ij}_1, lw^{ij}_1), (CA^{ij_2}, tw^{ij}_2, lw^{ij}_2), \ldots, (CA^{ij_K}, tw^{ij}_K, lw^{ij}_K)\},$$

where the presence of $(CA^{ij_i}, tw^{ij}_i, lw^{ij}_i)$ means that this control algorithm calls the algorithm $CA^{ij_i}$ at time $tw^{ij}_i$ if $lw^{ij}_i$ is true. Similarly, $A^{ij}_{off}$ is the set of algorithms which are to be shut down by scenario $j$ of $CA^{ij}$:

$$A^{ij}_{off} = \{(CA^{ij_1}, tw^{ij}_1, lw^{ij}_1), (CA^{ij_2}, tw^{ij}_2, lw^{ij}_2), \ldots, (CA^{ij_K}, tw^{ij}_K, lw^{ij}_K)\},$$

where the tuple $(CA^{ij_i}, tw^{ij}_i, lw^{ij}_i)$ means that our control algorithm shuts down the algorithm $CA^{ij_i}$ at time $tw^{ij}_i$ if $lw^{ij}_i$ is true. Further, $BA^{ij}$ is the onboard apparatus controlled by the control algorithm:

$$BA^{ij} = \{(Nam^{ij}, R^{ij}, P^{ij})\},$$

where $Nam^{ij}$ is the ID of a particular onboard device, $R^{ij}$ is a mode of functioning of this device, and $P^{ij}$ is a vector whose components are the levels of consumption of the corresponding onboard resources in mode $R^{ij}$. $KU^{ij}$ is the set of commands issued by the control algorithm to onboard devices:

$$KU^{ij} = \{(NK^{ij}_1, tw^{ij}_1, lw^{ij}_1), (NK^{ij}_2, tw^{ij}_2, lw^{ij}_2), \ldots, (NK^{ij}_K, tw^{ij}_K, lw^{ij}_K)\},$$


where $NK^{ij}_i$ is the name (ID) of a control command, and $tw^{ij}_i$ is the time moment of issuing the command in case of the truth of $lw^{ij}_i$.

$FP^{ij} = \{(PI^{ij}_1, ts^{ij}_1, ls^{ij}_1), (PI^{ij}_2, ts^{ij}_2, ls^{ij}_2), \ldots, (PI^{ij}_K, ts^{ij}_K, ls^{ij}_K)\}$ is the set of flags (logical values) formed by the execution of scenario $j$ of algorithm $CA^{ij}$ at the time moments $ts^{ij}_i$ in case of the truth of $ls^{ij}_i$.

These models contain enough information to determine the sequence of sections of spacecraft control and to form schedules reflecting its functioning in different scenarios, from various points of view. Using them, we can determine overlays of algorithms and of the work of certain onboard devices. These different points of view are shown in Fig. 2. An illustrative machine-readable encoding of this model follows.
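A possible encoding of the model is sketched below. The field names merely paraphrase the formal sets $A^{ij}_{on}$ and $KU^{ij}$, the instance is hypothetical, and the recursive walk mirrors the tree of calls CA1 → CA2 → CA3 discussed in the Introduction:

```python
# One scenario of a control algorithm, with its commands KU and calls A_on,
# and a recursive collection of every apparatus command it ultimately issues.
from dataclasses import dataclass, field

@dataclass
class ControlAlgorithm:                                # scenario j of CA^{ij}
    name: str
    commands: list = field(default_factory=list)       # KU:   (cmd, t, flag)
    calls: list = field(default_factory=list)          # A_on: (child, t, flag)

def collect_commands(ca, state, t0=0.0):
    """Gather (time, command) pairs for one execution branch, honoring flags."""
    issued = [(t0 + t, cmd) for cmd, t, flag in ca.commands
              if state.get(flag, False)]
    for child, t, flag in ca.calls:
        if state.get(flag, False):                     # call only if lw is true
            issued += collect_commands(child, state, t0 + t)
    return sorted(issued)

ca3 = ControlAlgorithm("CA3", commands=[("SWITCH ON Device1", 1.0, "ok")])
ca2 = ControlAlgorithm("CA2",
                       commands=[("CHANGE MODE GyroDyne1 Mode2", 0.5, "ok")],
                       calls=[(ca3, 2.0, "ok")])
ca1 = ControlAlgorithm("CA1", calls=[(ca2, 1.0, "ok")])
print(collect_commands(ca1, {"ok": True}))
# [(1.5, 'CHANGE MODE GyroDyne1 Mode2'), (4.0, 'SWITCH ON Device1')]
```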

2.3 Application of Satisfiability Modulo Theories Solvers and Constraint Programming

Very promising modern methods are connected with the application of such approaches as constraint programming (CP), Boolean satisfiability (SAT) solvers, mixed integer programming (MIP), and dynamic programming (DP). Many of the named approaches are supported by software tools based on advanced methods and algorithms [17]. In fact, these tools can be reviewed as 'universal solvers' accessible via APIs and special problem statement languages, or even available for free and online. Some of the solvers propose efficient, highly parallel mechanisms to be applied in various domains [18]. The Boolean satisfiability problem can be reviewed as partially similar to the problem of the existence of a control algorithm with 'smart' decision making that warrants correct completion of the space mission in the case of limited onboard resources. Moreover, the approach based on the use of SMT solvers is probably even more suitable. We can consider applying ABSolver, Alt-Ergo, Barcelogic, MathSAT, CVC, OpenSMT, Simplify, STeP, Yices, Z3, and other available tools. In [19,20] we described a case study of the utilization of the opportunities provided by the Z3 solver and described a special software tool allowing checking of the satisfiability of important synchronization requirements on spacecraft control logic; a sketch in the spirit of such a check is given below. Furthermore, by its nature, constraint programming can also be reviewed as an approach corresponding well to the problem of control in the case of limited available resources, because constraints are a sort of limits. This circumstance opens the door to applying constraint programming in our problem domain. To do this, we first need to represent the mathematical models described in the previous sections by the means and input languages of a CP tool. Then we can formulate the real-time control algorithm in these terms. After that, we should specify the existing limits for onboard resources and check whether the limits are violated.
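The following hedged sketch (the concrete encoding used in [19,20] may differ; devices, power figures and the limit are hypothetical) uses the Z3 Python API to ask whether each device can be activated at least once in a four-slot plan without the summed electric power ever crossing the 'red line':

```python
# SMT check of a resource-bounded scheduling question with Z3: Boolean mode
# variables per device and time slot; the power 'red line' must hold in
# every slot, and every device must be on in at least one slot.
from z3 import Bool, If, Solver, Sum, is_true, sat

SLOTS, POWER_LIMIT = 4, 100
power = {"gyrodyne": 40, "telemetry": 70, "heater": 50}   # hypothetical

on = {(d, t): Bool(f"{d}_on_{t}") for d in power for t in range(SLOTS)}
s = Solver()
for t in range(SLOTS):   # resource limit must hold in every time slot
    s.add(Sum([If(on[d, t], power[d], 0) for d in power]) <= POWER_LIMIT)
for d in power:          # mission requirement: each device works at least once
    s.add(Sum([If(on[d, t], 1, 0) for t in range(SLOTS)]) >= 1)

if s.check() == sat:
    m = s.model()
    plan = {t: [d for d in power if is_true(m.evaluate(on[d, t]))]
            for t in range(SLOTS)}
    print("feasible plan:", plan)
else:
    print("no schedule satisfies the resource limits")
```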


However, in this way we can face the following problems [17]. A modeling language should aim to convey problem structure to the solver as explicitly as possible. However, users may not always use the appropriate abstractions or may not recognize them in their model specifications based on the input language. Thus, some solvers frequently use preprocessing steps which, among other actions, try to infer some implicit structure from the model (e.g., recognizing knapsack cuts by combining constraints). An appropriate modeling system must reach a similar goal, but for a much richer modeling language and a larger set of underlying solving technologies. A focus should be on automatically determining global substructure in the form of implied global constraints. Adding these global constraints to the model (with or without removing the constraints that imply them) can dramatically improve solving abilities. More importantly, it may make life considerably easier for other model analyses and transformations. For instance, current methods for the automatic derivation of search procedures strongly make use of global constraints [17,18]. A better understanding of the global substructure can also help to transform a model written for one solver technology into another. In this manner, automatic derivation of implied constraints is an enabling analysis for many other analyses.

3 Conclusion and Future Work

The analysis performed for the design of 'smart' spacecraft onboard decision making has shown the possibility of a formal representation of the data related to the different phases of the design process. This formal representation has been specified as mathematical models, which are to be converted into electronic models used in special software design support tools. The authors are now working on the implementation of these tools. A very promising approach is connected with constraint programming, as discussed in Subsect. 2.3. The goal of this computer-aided design system is the support of spacecraft control algorithm specification, visualization and checking in the case of restricted onboard resources. We plan to provide designers with tools allowing the automatic calculation of the estimated levels of consumption of onboard resources based on formal models of control algorithms. These estimations should be determined for the whole suggested time period of the spacecraft mission. A very important feature is the building and visualization of cyclograms showing the execution of tasks, the calls of program modules, and the functioning of certain equipment. The planned introduction of these tools into the spacecraft control logic design process at the Space Rocket Centre 'Progress' will help to support decision making, allow optimizing the cyclograms of functioning of onboard apparatus, reduce labor costs, and increase the quality of design documentation. These aspects are quite topical and prospective in modern conditions, when we face the growing importance of information models and the complexity of software used in the space industry. We hope this future work will have a significant practical impact.

References

1. Kozlov, D.I., Anshakov, G.P., Mostovoy, Ya.A.: Control of Earth Observation Satellites: Computer Technologies. Mashinostroenie, Moscow (1998). (in Russian)
2. Akhmetov, R., Makarov, V., Sollogub, A.: Principles of the earth observation satellites control in contingencies. Inf. Control Syst. 1, 16–22 (2012)


3. Ashby, W.R.: Introduction to Cybernetics. Chapman and Hall, New York (1956)
4. Tomayko, J.: Computers in Space: Journeys with NASA. Alpha Books, Indianapolis (1994)
5. Eickhoff, J.: Onboard Computers, Onboard Software and Satellite Operations. An Introduction. Springer, Heidelberg (2012)
6. Tyugashev, A.A., Ermakov, I.E., Ilyin, I.I.: Ways to get more reliable and safe software in aerospace industry. In: Program Semantics, Specification and Verification: Theory and Applications (PSSV 2012), Nizhni Novgorod, pp. 121–129 (2012)
7. Krasner, S., Bernard, D.: Integrating autonomy technologies into an embedded spacecraft system-flight software system engineering for new millennium. In: Proceedings of IEEE Aerospace Conference. IEEE Press, Snowmass (1997)
8. Khartov, V.V.: Autonomnoe upravlenie kosmicheskymi apparatami svyazi, retranslyacii i navigacii [Autonomous control of communication, relay and navigation spacecraft]. Aviakosmicheskoe priborostroenie (Aerospace Instrument-Making) 6, 12–23 (2006). (in Russian)
9. Tiugashev, A.A., Belozubov, A.V.: Toolset for construction and verification of rules for spacecraft's autonomous decision making. Procedia Comput. Sci. 96, 811–818 (2016). Proceedings of the 20th International Conference on Knowledge Based and Intelligent Information and Engineering Systems
10. Kalentyev, A.A., Sygurov, Yu.M.: Development of information support for the spacecraft control algorithm design process. Vestnik SGAU 21(1), 58–62 (2010). (in Russian)
11. Tyugashev, A.A.: Integrated environment for designing real-time control algorithms. J. Comput. Syst. Sci. Int. 45(2), 87–300 (2006)
12. Tyugashev, A.A.: Language and toolset for visual construction of programs for intelligent autonomous spacecraft control. IFAC-PapersOnLine 49(5), 120–125 (2016)
13. Kalentyev, A.A., Tyugashev, A.A.: CALS Technologies in Lifecycle of Complex Control Programs. Samara Scientific Center of the Russian Academy of Sciences, Samara (2006). (in Russian)
14. Tyugashev, A., Zheleznov, D., Nikishchenkov, S.: A technology and software toolset for design and verification of real-time control algorithms. Russ. Electr. Eng. 88(3), 154–158 (2017)
15. Filatov, A., Tyugashev, A., Sopchenko, E.: Structure and algorithms of motion control system software of the small spacecraft. In: Proceedings of International Conference on Information Technology and Nanotechnology (ITNT), Samara, pp. 246–251 (2015)
16. Bogatov, A., Tyugashev, A.: The logical calculus of control algorithms. In: Proceedings of International Symposium for Reliability and Quality, Penza, vol. 1, pp. 307–308 (2013). (in Russian)
17. Garcia, M., Stuckey, P., Pascal, V.H., Wallace, M.: The future of optimization technology. Constraints 19, 126–138 (2014)
18. Marriott, K., Nethercote, N., Rafeh, R., Stuckey, P., Garcia, M., Wallace, M.: The design of the Zinc modelling language. Constraints 13(3), 229–267 (2008)
19. Tiugashev, A.: Build and evaluation of real-time control algorithms in case of incomplete information about functional processes' parameters. In: XXth IEEE International Conference on Soft Computing and Measurements (SCM 2017), Saint Petersburg, pp. 179–185 (2017)
20. Tyugashev, A.: Application of SMT solvers for evaluation of real-time control logic of spacecraft. J. Phys. Conf. Ser. 1096(1), 012156 (2018)

Intelligent Technologies and Systems for Spatial Industrial Strategic Planning

Elena Serova(B)

National Research University Higher School of Economics, St. Petersburg Branch, 3 Kantemirovskaya St., St. Petersburg 194100, Russia [email protected]

Abstract. Today, the strategic planning of industrial spatial clusters is becoming extremely important for forecasting regional industrial economic development and making management decisions. Frequently, such challenges involve the use of contemporary methodological approaches and system models. Therefore, the role and importance of system models and intelligent information technologies are increasing, particularly in the creation of interdisciplinary databases and the forecasting of the results of interaction and interference of different elements of economic space. The main advantage of research studies based on the spatial approach is their interdisciplinarity and ability to take advantage of the systemic approach and the synergy effect in the study of issues related to the spatial organization of the economy and management systems. This is a theoretical and empirical study in equal measure. The scientific methodology of the research comprises the system approach and comparative analysis, the dynamic principle, and comprehensive consideration of the processes of spatial strategic planning and hybrid modeling. This paper deals with the issues of system modeling implementation and how system models can be used to support the process of spatial industrial strategic planning (SISP) in the context of the theory of regional economic growth and development. The main goal of this paper is the analysis of the main domains, or areas, of application of system models to support the process of spatial industrial strategic planning.

Keywords: Spatial industrial strategic planning · Regional economics · Hybrid intelligent models

1 Introduction

At the present time, the use of the latest achievements in the field of Information and Communication Technologies (ICT) in the global economy and management, including contemporary intelligent information technologies (IIT) and systems of distributed artificial intelligence, is one of the factors in improving organizational performance and increasing competitiveness. ICT can govern the ability of companies to generate sustainable business models in the global environment.


One of the advantages of the scientific direction devoted to solving problems related to the strategic management of spatial economic systems is its interdisciplinary nature and its ability to take advantage of system analysis and the synergetic effect in the study of a range of different fields of knowledge. Another distinguishing feature is the synthesis of conceptual categories and methodologies of public sciences, sociology, the humanities, and technical sciences. The concept and theory of 'economic space' was formed in compliance with geographic, geopolitical, and regional concepts. Now, economic space is considered in the framework of the concepts of globalization, industrial spatial clusters, 'cumulative causation', high information technologies, and networks. Frequently, such decisions involve using adequate system models and appropriate methodological approaches. Hereby, the role and importance of modeling are increasing, particularly in the creation of interdisciplinary databases and forecasting. The problem the author considers here is: how can we successfully use the advantages of complex system models of spatial organization to determine the resource potential and increase the competitive advantages of a territory? The main research questions the author considers here are:

– What are the main types of spatial strategic planning models?
– What kinds of complex system models are implemented in the spatial strategic planning of regional development?
– How can the advantages of intelligent models and systems be successfully used in spatial industrial strategic planning?

The original contribution of this work is the classification (typology) of models used in spatial strategic planning. This typology is determined by the models' complexity and variety of application areas. The rest of this paper is structured as follows: theoretical background and literature review; methodology; results and findings.

2 Theoretical Background and Brief Literature Review

Nowadays, intelligent information systems and technologies are evolving actively. These technologies and systems are based largely not on tangible resources, but on information and communication resources that belong to the class of synergistic resources. The class of intelligent information technologies (IIT) and systems, including multi-agent systems (MAS) [1–4,7], neural networks (NN), and fuzzy logic (FL), continues to improve [5–8]. IIT have developed rapidly over the past ten years, and they allow creating models of the interaction between different kinds of spaces. The major advantage of the spatial approach is its ability for multidimensional representations of spatially localized complex systems, in which the economic, industrial, ecological, social, geographical, political, and technological components interact. These components determine the functioning equilibrium and development of the region, as well as creating conditions to maximize the region's contribution to the development of spatial systems at a higher level. The basis of the spatiotemporal concept of strategic management is the principle of the systemic approach and the consideration of the strategic management system as a large complex system consisting of elements of different types with heterogeneous relationships between them. The spatial system of management is treated as a complex system, a set of subsystems and their relations in many dimensions: social, industrial, territorial, ecological, geographical, and political [9,10]. In accordance with the basic hypothesis of Granberg [11], spatial science is defined as an interdisciplinary scientific direction whose objects of research are the forms and processes of a modern society which are space-dependent. Three statements are offered as a conceptual basis. They relate to the spatial, regional and international aspects. The first statement affirms that every category of economic activity or vital activity has its own space (the spatial aspect). All kinds of spaces have a number of common characteristics: extension in different directions, position relative to each other in space, nodes (centers), and networks. The second statement concerns the regional aspect and supposes that spatial science is considered as a broader research area than 'regional science'. The third statement is devoted to the international aspect [11]. Speaking about the development of the methodological and methodical tools of interdisciplinary research in the field of spatial sciences, we should also mention such fundamental studies as the monograph of Minakir P.A. [12] and the textbook written by Granberg A.G. [13]. Sartorio [14] states that strategic spatial planning is creative with respect to the development of new territories and scales, to the definition of new continuities between state, market and civil society, and to the interaction with and creation of innovative local governance forms. According to Salet and Faludi [15], there are three main approaches to strategic spatial planning:

– An institutional approach, which favors two main directions: one oriented at legitimizing planning activity, the other seeing institutionalization processes mainly as an opportunity for the implementation of plans and projects.
– A communicative and discursive approach that favors framing and sense-giving activity; an interactive approach, suspended in a technocratic tension, oriented to building up connections between public and private organizations in order to improve performance in planning.
– A sociocratic tendency, focused on the inclusion of society and emergent citizenship [15].

It should be noted that the attention of famous international publishers to scientific research in the area of spatial sciences and spatial development is growing. For example, the 'Journal of Spatial Science' has been published in Australia (information available from the website: MSIA Mapping Science Institute, 2017) [16]. Springer has produced volumes in the series 'Advances in Spatial Science' (information available from the website: Springer, 2017) [17]. The U.S. National Science Foundation (NSF) has approved a strategic plan for research in 2008–2012 entitled 'Geography Spatial Sciences' (information available from the website: NSF National Science Foundation, 2017) [18].

3 Research Methodology

The scientific methodology of the research is based on the systemic approach and comparative analysis, the dynamic principle, and comprehensive consideration of the processes of spatial strategic planning and hybrid modelling [10]:

– According to the systemic approach, any object of spatial strategic planning is considered as a complex system, i.e. a set of interrelated elements, which often possess different natures and relationships. This system has inputs (resources), outputs (results of activities), and connections with the external environment. The principle of system analysis involves the analysis of all subsystems of the research object and the analysis of all elements of its external and internal environment.
– The principle of comparative analysis implies the comparison of activity results with the economic indicators of competitors or with standards.
– The dynamic principle requires consideration of the object of research in its dialectical development, taking into account cause-and-effect relationships and subordination.

The use of modern modeling methods and technologies is now an essential trend in the improvement of strategic analysis and planning processes, enabling companies to succeed in a rapidly changing environment. Based on the literature review and the results of this study, the following main types of models are used in spatial strategic planning [10]:

– Scientific cognition models, or theoretical models, which are used for the development and deepening of the theoretical and methodological aspects;
– Applied models used to solve practical problems;
– Hybrid models, which combine different approaches, styles, and modeling paradigms.

The typology of spatial strategic models is represented in Fig. 1. Scientific cognition models, or theoretical models, are used in the research of the regularities of spatial development. The concrete tasks of locating and organizing economic space are solved through applied models. The class of theoretical-analytic models includes conceptual models and mental models. Structural models reflect the organization of the research object, its elements and parameters, and the connections between them. Functional models allow us to study the object not by analyzing its internal structure, but through the cognition of its activities and the specifics of its behavior. Of course, these kinds of modeling are not mutually exclusive and can be used in the study of complex objects simultaneously or in some combination, complementing each other. For example, conducting an experiment with a structural model provides information on how the research object responds to changes in external environmental conditions, while the study and analysis of functional models may help to gain an understanding of the internal structure of the object.


Fig. 1. Spatial strategic planning models typology

In the case of solving applied problems, models of spatial interactions can be differentiated by the following distinguishing characteristics:

– Availability of the probability distribution of the values of the parameters that determine the spatial strategies: stochastic and deterministic models. In deterministic models, the parameters influencing the decision-making process and the development of the situation are known.
– Dependence on the time factor: static and dynamic models.
– Level of detail: aggregated (generic) and detailed models.

4 Empirical Results

In this research, the analysis of the qualitative parameters that influence the equilibrium of operation and development of industrial spatial clusters and the formation of conditions for maximizing their effectiveness entails the consideration of four main groups of factors: macroeconomic, industrial, social, and technological. Every group of factors includes several indicators. For example, for the analysis of the macroeconomic group of factors, the following indicators are considered: economic infrastructure, commodities and other resources, the capital market, and the global market condition; for industry: the global market condition, new entrants, stakeholders, suppliers, and substitute products and services; for society and technologies: societal and cultural trends, socioeconomic trends, technology trends, and regulatory trends. There is a particular group of models and modeling systems – intelligent models and systems – which make it possible, under conditions of uncertainty, incomplete initial data, and complex interdependence between the elements of the investigated spatial industrial regional economic system, to evaluate the implications of realizing various scenarios of strategic spatial development.


Building these types of models is, as a rule, a time-consuming process from the computational point of view. This requires involving the potential of modern information and communication technologies both in building the models and in carrying out model experiments. The class of intelligent models includes system-dynamics and agent-based models as well as the models and methods of "soft computing": fuzzy logic, neural networks, and evolutionary computation [19]. In system dynamics, real-world processes are represented in terms of stocks, flows between these stocks, and information that determines the values of the flows. The system dynamics methodology describes system behavior as a set of interacting feedback loops, including their balancing or reinforcing behavior. The main components of the soft computing concept are fuzzy logic (FL), neural networks (NN), evolutionary computation (EC), and probabilistic inference (PI). Each of these four methodologies has its strengths and weaknesses. Although they have some common characteristics, they can be considered as complementing each other, because some of the required attributes missing from one technology can appear in another [7]. Table 1 shows a comparative analysis of the capabilities of intelligent technologies against certain criteria for the major components of soft computing. Gradations of fuzzy logic are used as criterion estimates.

Table 1. Comparison of intelligent systems [20]

Evaluation criteria         Neural networks  Fuzzy logic    Evolutionary computation
Mathematical model          Slightly good    Bad            Bad
Learning ability            Bad              Good           Slightly good
Knowledge representation    Good             Bad            Slightly good
Expert knowledge            Good             Bad            Bad
Nonlinearity                Good             Good           Good
Capability of optimization  Bad              Slightly good  Good
Tolerance of uncertainty    Good             Good           Good
Operating time              Good             Slightly good  Slightly bad

This paper proposes ANFIS (Adaptive Neuro-Fuzzy Inference System) to determine the resource potential and competitiveness of the industrial region with the purpose of increasing its competitive advantages. ANFIS is a hybrid technique that combines the best features of fuzzy logic and parallel-processing neural networks. It converges quickly and is more accurate than a backpropagation neural network. Using a given input/output data set, the toolbox function anfis constructs a Fuzzy Inference System (FIS) whose membership function parameters are tuned (adjusted) using either a backpropagation algorithm alone or in combination with a least-squares method. This adjustment allows fuzzy systems to learn from the data they are modeling.


The modeling approach used by ANFIS is similar to many system identification techniques. The main steps of ANFIS creation are:

– to hypothesize a parameterized model structure (relating inputs to membership functions, to rules, to outputs, to membership functions, and so on);
– to collect input/output data in a form that will be usable by ANFIS for training;
– to use ANFIS to train the FIS model to emulate the training data presented to it by modifying the membership function parameters according to a chosen error criterion [21].

The ANFIS architecture consists of five layers, as shown in Fig. 2. Each layer contains several nodes described by the node function [22,23].

Fig. 2. Architecture of ANFIS hybrid model

A. Layer 1. Each node in the first layer of the ANFIS architecture processes the incoming inputs using node functions.
B. Layer 2. This is the rule layer. The outputs of the first layer constitute the inputs of this one.
C. Layer 3. This is the normalization layer. All outputs of the nodes in the rule layer are used as input. The ratio of the firing strength of node i in the rule layer to the sum of the firing strengths of all nodes gives the normalized firing rate of node i.
D. Layer 4. This is the defuzzification layer. Node i in this layer computes the contribution of the i-th rule to the overall output.
E. Layer 5. There is one node in this layer. It sums the output values of all nodes in layer 4; this sum is the output value of the ANFIS system.
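To make the five-layer structure concrete, the following minimal Python sketch (ours, not from the paper; all parameter values are illustrative) implements the forward pass of a first-order Sugeno-type ANFIS with two inputs and two Gaussian membership functions per input. In the approach described above, the membership-function parameters and the rule consequents would be tuned from data by backpropagation and least squares rather than fixed by hand.

```python
import numpy as np

def gauss_mf(x, c, sigma):
    """Layer 1: Gaussian membership function with center c and width sigma."""
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def anfis_forward(x1, x2, mf1, mf2, conseq):
    """Forward pass of a first-order Sugeno ANFIS with two inputs.

    mf1, mf2 : lists of (center, sigma) pairs, one per membership function;
    conseq   : one (p, q, r) triple per rule, so that f = p*x1 + q*x2 + r.
    """
    # Layer 1: fuzzification of both inputs
    mu1 = [gauss_mf(x1, c, s) for c, s in mf1]
    mu2 = [gauss_mf(x2, c, s) for c, s in mf2]
    # Layer 2 (rule layer): firing strength = product of incoming memberships
    w = np.array([m1 * m2 for m1 in mu1 for m2 in mu2])
    # Layer 3 (normalization layer): normalized firing rates
    w_bar = w / w.sum()
    # Layer 4: contribution of each rule to the overall output
    f = np.array([p * x1 + q * x2 + r for p, q, r in conseq])
    # Layer 5: single summation node -> crisp output
    return float(np.dot(w_bar, f))

# Illustrative parameters: two MFs per input give 2 x 2 = 4 rules
mf1 = [(0.0, 1.0), (1.0, 1.0)]
mf2 = [(0.0, 1.0), (1.0, 1.0)]
conseq = [(0.5, 0.2, 0.0), (1.0, -0.3, 0.1), (-0.2, 0.8, 0.0), (0.4, 0.4, 0.2)]
print(anfis_forward(0.3, 0.7, mf1, mf2, conseq))
```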

5 Conclusion

The use of modern modelling methods and technologies is now an essential means of improving strategic analysis and planning processes. The result of this research is a classification (typology) of the models used in spatial industrial strategic planning. This typology includes three main groups: scientific cognition models, or theoretical models; applied models, which are used to solve practical problems; and hybrid models, which combine different approaches, styles, and modeling paradigms. The major advantage of intelligent models and systems is their ability to represent spatially localized complex systems multidimensionally, so that the spatial interaction of macroeconomic, industrial, social, and technological factors may be taken into account. These factors can be considered determinative not only for the sustainable competitive advantage and development of the industrial region, but also for forming the conditions that maximize the region's contribution to the strategic spatial development of higher-level systems. Thus, the evaluation of the competitiveness of an industrial region can be successfully realized by implementing the ANFIS method, the method of hybrid neuro-fuzzy modeling, performed in MATLAB R2012b.

References

1. Albright, S.C., Zappe, C.J., Winston, W.L.: Data Analysis, Optimization, and Simulation Modeling. Cengage Learning, Canada (2011)
2. Borshchev, A., Filippov, A.: AnyLogic – multi-paradigm simulation for business, engineering and research. In: The 6th IIE Annual Simulation Solutions Conference, Orlando, Florida, USA (2004)
3. Karpov, Y.: System Simulation Modeling. Introduction to Modeling with AnyLogic. BHV, St. Petersburg (2005)
4. Serova, E.: The role of agent based modelling in the design of management decision processes. Electron. J. Inf. Syst. Eval. 16(1), 71–80 (2013)
5. Bojadziev, V., Bojadziev, M.: Fuzzy Logic for Business, Finance and Management. World Scientific, New Jersey (2007)
6. Kecman, V.: Learning and Soft Computing – Support Vector Machines, Neural Networks, and Fuzzy Logic Models. The MIT Press, London (2001)
7. Krichevsky, M., Serova, E.: Business Analysis and Decision Making Based on Data and Models. Theory, Practice, and Tools. Publishing House "Professional Literature", St. Petersburg (2016)
8. Ross, T.J.: Fuzzy Logic with Engineering Applications. John Wiley & Sons Ltd. (2010)
9. Serova, E., Krichevsky, M.: Intelligent models and systems in spatial marketing research. Electron. J. Inf. Syst. Eval. 18(2), 159–171 (2014)
10. Serova, E.: System models for strategic spatial planning and regional development. In: The 13th European Conference on Management, Leadership and Governance – ECMLG 2017, pp. 452–458. Academic Conferences and Publishing International Limited, London (2017)
11. Granberg, A.G.: About program fundamental researches of spatial development of Russia. Reg. Econ. Sociol. 2, 168–170 (2009)


12. Minakir, P.A.: Economics of Regions. Far East. Publishing House "Economics", Moscow (2006)
13. Granberg, A.G.: Foundation of the Regional Economy. SU Higher School of Economics, Moscow (2000)
14. Sartorio, F.S.: Strategic spatial planning: a historic review of approaches, its recent revival and an overview of the state of the art in Italy. DISP 162(3), 26–41 (2005)
15. Salet, W., Faludi, A. (eds.): The Revival of Strategic Spatial Planning. Koninklijke Nederlandse Akademie van Wetenschappen, Amsterdam (2000)
16. MSIA mapping science institute. http://www.tandfonline.com/toc/tjss20/current
17. Springer. Advances in Spatial Science. https://link.springer.com/bookseries/3302
18. NSF Geography and Spatial Science Program Strategic Plan, 2008–2012 (K1/19/09). http://www.nsf.gov/sbe/bcs/grs/GSS StrategicPlan 2008.pdf
19. Zadeh, L.A.: Fuzzy logic, neural networks, and soft computing. Commun. ACM 37(3), 77–84 (1994)
20. Krichevskiy, M.L.: Research Methods in Management. KNORUS, Moscow (2015)
21. Martínez, L.G., et al.: Big five patterns for software engineering roles using an ANFIS learning approach with RAMSET. In: Proceedings of the 10th Mexican International Conference, MICAI 2010, Part II, LNAI 6438, pp. 428–439 (2010)
22. Jang, J.-S.R., Sun, C.-T., Mizutani, E.: Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence. Prentice-Hall, Englewood Cliffs, NJ (1997)
23. Kaynak, S., Evirgen, H., Kaynak, B.: Adaptive neuro-fuzzy inference system in predicting the success of student's in a particular course. Int. J. Comput. Theory Eng. 7(1), 34–39 (2015)

Conceptual and Formal Models of Information Technologies Use for Decision Support in Technological Systems

Alexander S. Geyda(B)

St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, 14 Line, 39, St. Petersburg 199178, Russia
[email protected]

Abstract. The article outlines models of the formation of usage effects of information operations. Such models are used, for example, to estimate the usage performance, efficiency, and effectiveness indicators of information technology. In addition, such models could be used for the estimation of dynamic capability and of indicators of system potential regarding the use of information technology. The estimation is obtained by plotting the dependencies of the predicted values of the operational properties of information technology use against the variables and options of the problems to be solved. To develop models of this type, we analyzed the use of information operations during system functioning, using a technological system as an example. Basic concepts, principles, assumptions, and models are provided for such model construction.

Keywords: Information technology · Capabilities · Effectiveness · Efficiency · Indicator · Estimation · Models · Methods · Information operations

1 Introduction

Research on the use of information technologies shall be implemented based on the operational properties of such use. The operational properties of the objects of research are an extensive class of properties of various objects that characterize the results of activity with these objects. Therefore, the operational properties form the basis of the quality of the objects under research. These properties are manifested at the boundary of the object in which the activity is implemented (the boundary covers the object of activity, the means of activity, and humans) and in the environment (outside the boundary). Operational properties are characterized by the effects (main results) of activity at the boundary, and these effects are compliant with the requirements of the environment. Activity is always implemented using certain information operations (at least using the senses or speech).


Information operations are elements of activity whose objectives are to obtain information, not to exchange matter and energy. Information operations are implemented in accordance with certain information technology, whose objective is to describe the use of information operations. However, the mechanisms of the formation of activity effects and the subsequent formation of operational properties, taking into account the use of information operations, including modern (digital) information technologies (IT), have not been studied in sufficient detail to predict the effects of activity with mathematical models, depending on the selected characteristics of the information operations used, as mathematical problems of evaluation, analysis, and synthesis. This is primarily because there are no suitable models and methods for analytically describing the effects of information operations and the operational properties. This, in turn, is related to the absence of a universally accepted concept of the manifestation of the effects of information operations, particularly for non-information (material) effects that are obtained by non-information (material) operations which depend on the information operations under study. The non-information effects of such operations vary with the implementation of the dependencies between environmental changes, further information operations, and subsequent non-information operations. Since information operations lead to changes in non-information operations but do not directly lead to non-information effects, it is necessary to develop the concept of information and non-information action dependencies and the concept of effect manifestation as a result of such dependencies. It is the development of the concept of such dependencies that causes conceptual difficulties. Therefore, further research is directed to the description of the dependencies between information and non-information actions and between information and non-information effects, but first of all an analytical description using mathematical models is required. The developed mathematical models of the formation of the usage effects of information operations, including non-information (material) effects that are changed as a result of the information effects, are designed to analytically evaluate the operational properties of the use of information operations. The models are also used to evaluate other operational properties, especially the complex operational property of system potential, which are measured taking into account the necessary use of information operations. The results presented in this study are aimed at bridging the gap between the need to solve research problems of operational properties based on mathematical models and methods and the lack of the necessary concepts and methodology for solving usage problems of information operations in the sense of formalizing them as mathematical problems of estimation, analysis, and planning by operational properties indicators. The results can be used in solving problems in research on dynamic capabilities [1–3], system potential [4], operational properties [5,6], information technology use efficiency [7,8], exchange properties [9], and strategic management [10–13]. Their common feature is the need to solve these problems based on mathematical models [14–16] of the functioning of complex systems [17,18]. For this purpose, indicators of the quality of system functioning shall be measured analytically [19,20]. Such indicators shall reflect the effects of information technology [21] during system functioning [22,23].


For this purpose, the models and methods [25,26] of research on the effectiveness of system functioning [24] shall be adapted to reflect the use of information operations [27–29].

2 The Use of Information Technology: Basic Concepts, Principles, Assumptions and Models

The use of IT is illustrated in complex systems whose operation (and hence the use of IT) is technological. We say that the operation is technological if it is specified by technological operation modes, i.e. descriptions of technological operations in the technical documentation for a complex technological system (hereinafter CTS). In connection with this assumption about the technological form of functioning, not every action in the system is considered but only technological operations. This assumption further allows us to assert that the states of the beginning of CTS operations, the modes of implementation of technological operations, and the possible resulting states of such operations are described in the technical documentation. The essence of changes in non-information operations as a result of the use of information operations is that different states of the system and the environment (because of the changing environment) can be realized and different requirements can be accommodated. These states and requirements can lead to different information operations and their results, i.e. to changes in information operations and subsequently to changes in other operations. It is assumed that the number of such changes in operations is finite and can be described by modes of operations. Such modes, based on the initial states, allow specifying the possible transitions and the corresponding possible final states of the operations. The modes of technological operations are specified in the technical documentation of a CTS, so this feature can be represented as a feature of the technical documentation, consisting of an exhaustive description of the possible initial states, transitions, and final states, and of the finiteness of such states and transitions. Accordingly, knowing the possible changes in the environment and their impacts on the CTS, we can build a model of the possible CTS states as a result of the chains of environmental impacts and information operations. These chains, in turn, can lead to different states of the beginning of non-information operations and, consequently, to different modes of implementation of non-information (material) operations. Furthermore, various modes of such material operations can lead to effects that will meet the differing effect requirements of the changing environment. Information operations in the CTS can thereby allow carrying out material actions in the changing conditions of the environment with modes of operations that are better adapted to these changing environmental conditions. If we make certain assumptions about the technological nature of CTS operations (of all types) and the limited number of possible environmental states, we can further assert that the possible chains of ways of implementing information operations, and the modes of implementation of the non-information operations that depend on them, can be modeled.


To model the use of information operations, and therefore the use of the IT describing these operations, it is necessary to perform conceptual modeling of the possible sequences of environmental states and information and non-information operations. In these sequences, the states and operations are in causal relationships, and there can be alternative relationships and states. It is further assumed that such alternatives are known and that a measure of the possibility of such alternatives can be constructed. The basic concepts and the relations necessary to describe the chains of information and non-information operations are given in [29]. The concepts were linked together with an IT usage schema and have been formalized using the Mind Map format of knowledge representation. Such a representation allows us to process concept models using knowledge processing applications. Then, based on the conceptual models obtained, it is necessary to construct mathematical models of the possible sequences of environmental and CTS states. Next, it is necessary to obtain models of the effects of CTS functioning, assuming that one of the possible sequences of states and transitions is realized. It is assumed that transitions are described in the technical documentation with the use of the laws of functioning and the regularities of nature.
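As a toy illustration of such chains (ours; all states, modes, and possibility values below are hypothetical, chosen only to show the mechanics), the following Python sketch enumerates chains of the form "environment state → information-operation outcome → mode of a material operation" and accumulates a possibility-weighted effect of functioning:

```python
# Hypothetical alternatives with possibility measures (assumed known, per the text)
env_states = [("calm", 0.7), ("disturbed", 0.3)]
info_outcomes = {            # information-operation result depends on the environment
    "calm": [("mode_A", 0.9), ("mode_B", 0.1)],
    "disturbed": [("mode_B", 0.6), ("safe_mode", 0.4)],
}
mode_effect = {"mode_A": 1.0, "mode_B": 0.8, "safe_mode": 0.2}  # effect of each material mode

expected_effect = 0.0
for env, p_env in env_states:
    for mode, p_info in info_outcomes[env]:
        # one chain: environment state -> information operation -> material operation mode
        expected_effect += p_env * p_info * mode_effect[mode]
print(f"possibility-weighted effect of functioning: {expected_effect:.3f}")
```

Because the number of states and modes is assumed finite, such exhaustive enumeration terminates, which is exactly what the technological-documentation assumption above guarantees.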

3 Method for the Formalisation of the Research Problem

The method of formalization consists of assigning set-theoretic forms (sets, vectors, relations, mappings, functions) to the main concepts and relations and linking these forms (mathematical objects) with the use of relations. The result of such a set-theoretic formalization is a set of explications of the set-theoretic forms of the problem, describing the problem in such a way that from the set-theoretic forms it is possible to proceed to the parametric and functional formalization of the problem based on the explication of the mathematical objects. To implement such a set-theoretic formalization, the graph-theoretic model [30] of the explication of the main set-theoretic forms of the problem is used. For such an explication, it is proposed to describe the set-theoretic forms so that they correspond to the elements of graph-theoretic models (vertices, arcs, nested graphs) connected by the explicated relations. This then makes it possible to proceed to parametric models by parameterizing the graph-theoretic model, and to functional models by specifying and explicating the functional dependencies between the explicated parameters and variables. As a result, the main set-theoretic forms are connected by relations so that, by traversing the graph corresponding to the specified forms and performing functional transformations during the traversal, the results required for solving the problems [31–35] of evaluation, analysis, and synthesis by operational properties, mainly system potential indicators, are obtained using meta-modeling techniques.
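A minimal sketch of this explication idea (ours, with hypothetical node names and parameter values; the networkx package is used only for brevity): set-theoretic forms become parameterized graph vertices, explicated relations become arcs, and an indicator value is obtained by traversing the graph and composing the functional dependencies along the way.

```python
import networkx as nx

g = nx.DiGraph()
# Leaf vertices explicate parameterized set-theoretic forms (values are assumptions)
g.add_node("info_operation", value=0.9)
g.add_node("material_operation", value=0.8)
# An inner vertex explicates a functional dependency between the forms
g.add_node("effect", func=lambda a, b: a * b)
g.add_edge("info_operation", "effect")
g.add_edge("material_operation", "effect")

# Traverse the graph in topological order, applying the functional transformations
for v in nx.topological_sort(g):
    preds = list(g.predecessors(v))
    if preds:
        args = [g.nodes[p]["value"] for p in preds]
        g.nodes[v]["value"] = g.nodes[v]["func"](*args)

print(g.nodes["effect"]["value"])  # indicator value obtained by the traversal
```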

4 Conclusion

The results obtained enable the evaluation of the predicted values of the operational properties of systems. Corresponding IT usage indicators, dynamic capabilities, or system potential indicators can be estimated as a result.


Analytical estimation of such indicators becomes possible depending on the variables and options in the mathematical problems to be solved. This could lead to a solution of contemporary research problems using predictive analytical mathematical models and mathematical methods. Examples of such research problems are problems related to IT productivity and efficiency and to the estimation, analysis, and synthesis of the dynamic capabilities of systems. Possible applications include choosing the best information operations and choosing IT characteristics for the optimal implementation of new IT. As a result, it becomes possible to overcome the existing gap between the need to solve research problems in operational properties (especially regarding information operations) based on mathematical models and methods and the lack of the necessary concepts and methodology for solving such problems.

Acknowledgements. The reported study was funded by RFBR according to the research project No. 19-08-00989.

References

1. Teece, D., Pisano, G., Shuen, A.: Dynamic capabilities and strategic management. Strateg. Manag. J. 18(7), 509–533 (1997)
2. Wang, C., Ahmed, P.: Dynamic capabilities: a review and research agenda. Int. J. Manag. Rev. 9(1), 31–51 (2007)
3. Teece, D.: Explicating dynamic capabilities: the nature and microfoundations of (sustainable) enterprise performance. Strateg. Manag. J. 28(13), 1319–1350 (2007)
4. Geyda, A., Lysenko, I.: Tasks of study of the potential of socioeconomic systems. In: SPIIRAS Proceedings, vol. 10, no. 1, pp. 63–84 (2009)
5. Geyda, A.: Dynamic capabilities indicators estimation of information technology usage in technological systems. In: Dolinina, O., Brovko, A., Pechenkin, V., Lvov, A., Zhmud, V., Kreinovich, V. (eds.) Recent Research in Control Engineering and Decision Making. Studies in Systems, Decision and Control, vol. 199, pp. 379–395. Springer, Cham (2019)
6. Geyda, A., Lysenko, I.: Modeling of information operations effects: technological systems example. Future Internet 11(3), 62 (2019)
7. Ashimov, A., Geyda, A., Lysenko, I., Yusupov, R.: System functioning efficiency and other system operational properties: research problems, evaluation method. In: SPIIRAS Proceedings, vol. 5, no. 60, pp. 241–270 (2018)
8. Geyda, A., Lysenko, I., Yusupov, R.: Main concepts and principles for information technologies operational properties research. In: SPIIRAS Proceedings, vol. 42, no. 1, pp. 5–36 (2015)
9. Geyda, A., Ismailova, Z., Klitny, I., Lysenko, I.: Research problems in operating and exchange properties of systems. In: SPIIRAS Proceedings, vol. 35, pp. 136–160 (2014)
10. Taylor, J.: Decision Management Systems: A Practical Guide to Using Business Rules and Predictive Analytics. IBM Press, Indianapolis (2011)
11. Kendrick, T.: How to Manage Complex Programs. AMACOM, New York (2016)
12. Dinsmore, T.: Disruptive Analytics: Charting Your Strategy for Next-Generation Business Analytics. Apress, Berkeley (2016)


13. Downey, A.: Think Complexity. Complexity Science and Computational Modeling. O'Reilly Media, Sebastopol (2012)
14. Cokins, G.: Performance management: myth or reality? In: Performance Management: Integrating Strategy Execution, Methodologies, Risk, and Analytics. Wiley (2009)
15. Cokins, G.: Why is modeling foundational to performance management? Dashboard inside newsletter, 3 (2009)
16. Hood, C., Wiedemann, S., Fichtinger, S., Pautz, U.: Requirements Management. The Interface Between Requirements Development and All Other Systems Engineering Processes. Springer, Heidelberg (2008)
17. Hybertson, D.: Model-oriented Systems Engineering Science: A Unifying Framework for Traditional and Complex Systems. Auerbach, Boca Raton (2009)
18. Aslaksen, E.: The System Concept and Its Application to Engineering. Springer, Heidelberg (2013)
19. Aslaksen, E.: Designing Complex Systems. Foundations of Design in the Functional Domain. CRC Press, Boca Raton (2008)
20. Franceschini, F., Galetto, M., Maisano, D.: Management by Measurement: Designing Key Indicators and Performance Measurement Systems. Springer, Heidelberg (2007)
21. Roedler, G., Schimmoller, R., Rhodes, D., Jones, C. (eds.): Systems Engineering Leading Indicators Guide, INCOSE-TP-2005-001-03, Version 2.0. INCOSE (2010)
22. Tanaka, G.: Digital Deflation: The Productivity Revolution and How It Will Ignite the Economy. McGraw-Hill, New York (2003)
23. Guide to the Systems Engineering Body of Knowledge, SEBoK. INCOSE (2014)
24. Simpson, J., Simpson, M.: Formal systems concepts. Formal, theoretical aspects of systems engineering. Syst. Eng. 13(2), 204–207 (2010)
25. Elm, J., Goldenson, D., Emam, Kh., Donatelli, N., Neisa, A.: A survey of systems engineering effectiveness – initial results. NDIA SE Effectiveness Committee, Special Report CMU/SEI-2008-SR-034, NDIA (2008)
26. Patel, N.: Organization and Systems Design. Theory of Deferred Action. Palgrave Macmillan, London (2006)
27. Stevens, R.: Engineering Mega-Systems: The Challenge of Systems Engineering in the Information Age. CRC Press, Boca Raton (2011)
28. Mikalef, P., Pateli, A.: Information technology-enabled dynamic capabilities and their indirect effect on competitive performance: findings from PLS-SEM and fsQCA. J. Bus. Res. 70(C), 1–16 (2017)
29. Taticchi, P.: Business Performance Measurement and Management: New Contexts, Themes and Challenges. Springer, Heidelberg (2010)
30. Geyda, A.: The modeling in the course of technical systems investigation: some expansions of the graph theory usage. In: SPIIRAS Proceedings, vol. 2, no. 17, pp. 317–337 (2011)
31. Lee, E.: The past, present and future of cyber-physical systems: a focus on models. Sensors 15, 4837–4869 (2015)
32. Henderson-Sellers, B.: On the Mathematics of Modelling, Metamodelling, Ontologies and Modelling Languages. SpringerBriefs in Computer Science. Springer, Heidelberg (2012)
33. Debevoise, T., Taylor, J.: The MicroGuide to Process and Decision Modeling in BPMN/DMN: Building More Effective Processes by Integrating Process Modeling with Decision Modeling. CreateSpace Independent Publishing Platform, Scotts Valley (2014)


34. Lankhorst, M.: Enterprise Architecture at Work: Modelling, Communication and Analysis. The Enterprise Engineering Series. Springer, Heidelberg (2013)
35. Kleppe, A.: Software Language Engineering: Creating Domain-Specific Languages Using Metamodels. Addison-Wesley Professional, Boston (2008)

Role and Future of Standards in Development of Intelligent and Dependable Control Software in Russian Space Industry

Andrey Tyugashev1(B), Alexander Kovalev2, and Vjacheslav Pjatkov3

1 Samara State Transport University, 2V Svobody Str., Samara, Russia
[email protected]
2 Arsenal Design Bureau, 1-3 Komsomola Str., St. Petersburg, Russia
[email protected]
3 Military Space Academy, 13 Zhdanovskaya Str., St. Petersburg, Russia
[email protected]

Abstract. Nowadays, onboard software plays one of the key roles in the success of space missions. The flight control program is a classic example of mission critical software. The spacecraft's control logic is implemented by onboard software both for normal operations and for abnormal situations. Today, software provides the spacecraft with system integration and fault tolerance features. Unfortunately, in contrast to aviation, the Russian space industry has not had standards that could help to distribute the best practices, languages, and methods among the organizations involved in rocket and spacecraft development. The paper argues the necessity for standardization and presents the proposed list and content of these standards.

Keywords: Software standards · Real-time control · Mission critical software · Software lifecycle · Software documentation · Software verification · Formal inspection · Static analysis · Spacecraft onboard control system · Model checking

1 Introduction

Nowadays, it is impossible to imagine a space mission without the wide application of computers. Computers are used at the design stage, during the ascent to space, and during operations until the end of the spacecraft's active lifetime. In fact, even micro- and nano-satellites with a mass of about 10 kg are equipped with an onboard control computer system that frequently integrates several computers into an onboard network. During the space mission, the control logic is implemented by a special sort of software, the flight control software, which can be viewed as a classic example of mission-critical software. Onboard software might consist of hundreds of concurrently running program modules [1–4].


The control logic both for normal and abnormal situations is implemented by this software. Moreover, we can state that the onboard software plays a system-forming (integrating) role [4]. Namely, the flight control software organizes the cooperative work of various onboard devices, coordinated from the logical point of view and well synchronized in time, as the conductor of an orchestra does [5–8]. When we evaluate labor and time consumption and the total costs of implementing spacecraft control, the typical proportion between hardware and software can be characterized as 1:10 [1,9,10]. In aerospace projects (as was noticed decades ago), the processes of design, development, and verification of onboard software frequently form the 'critical path' of the network schedule of jobs related to the design and manufacturing of the rocket/space system as a whole [11]. In many cases, the expected active lifetime of modern unmanned spacecraft is 10–15 years [1,3,8,11]; thus, changes in the control logic become necessary. Changes in the spacecraft's control logic might require a complex, time-consuming, and many-staged process of software re-design, coding, and testing (including unit testing, integration testing, system testing, etc.). Thus, the total cost of the onboard software lifecycle grows dramatically because of the required software maintenance efforts [1,11]. All of the above circumstances give the spacecraft onboard flight control software the highest level of significance. The next section discusses the methods and means of providing this software with the needed level of quality and dependability.

2 Providing Intelligence and Dependability at Stages of Software Lifecycle

The quality and dependability of software depend on the efforts applied at different stages of its lifecycle. Intelligence and dependability cannot be 'added' to the product at the final stage. First, these features must be introduced during the specification process as formally defined requirements, including requirements related to fault tolerance, recoverability, and modifiability. Then these requirements must be taken into account and converted into appropriate design solutions at the design stage. Fault tolerance should be specially envisioned at the initial stage of spacecraft design. The spacecraft control logic should be specified in the corresponding obligatory documentation both for normal operations and for abnormal situations and then implemented by the flight control software. These documents include a list of possible failures with a specification of the level of danger, state diagrams, and cyclograms representing the countermeasures to compensate for the damage [3,5,7,8]. If a failure is diagnosed, the reaction must be executed. The reaction implies the reconfiguration of the equipment using 'hot' or 'cold' backup or the transition of the spacecraft into one of the special modes providing safety (these modes exclude catastrophic consequences). This documentation, together with the papers describing control in normal modes, becomes input data for the onboard software's designers and developers.
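As a toy illustration of the failure list with danger levels and countermeasures just described (all failure names, levels, and reactions here are entirely hypothetical and not taken from any real flight documentation), such documentation can be thought of as a dispatch table:

```python
from enum import Enum

class Danger(Enum):
    LOW = 1
    HIGH = 2
    CATASTROPHIC = 3

# Hypothetical failure list with danger levels and countermeasures;
# in real flight software these tables come from the obligatory documentation
REACTIONS = {
    "gyro_failure":  (Danger.HIGH, "switch to hot-backup gyro"),
    "obc_failure":   (Danger.HIGH, "reconfigure to cold-backup computer"),
    "power_failure": (Danger.CATASTROPHIC, "transition to safe mode"),
}

def react(failure: str) -> str:
    """Select the countermeasure; catastrophic failures force a safe mode."""
    danger, action = REACTIONS[failure]
    if danger is Danger.CATASTROPHIC:
        return "transition to safe mode"  # special mode excluding catastrophic consequences
    return action

print(react("gyro_failure"))
```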


'Errare humanum est' is a well-known Latin aphorism. Programming is still not as 'mature' as other engineering disciplines. In many aspects, software development remains a sort of 'art', and today programming is a 'human-defined' activity. This is not a good thing for the development of mission critical applications. The result of the engineers' efforts should be stable, measurable, dependable, and safe, independently of the qualification and expertise of a certain developer. The methods, notations, and tools define the possibilities for making errors. It was noticed thousands of years ago that language in many aspects defines the mechanism of thought; the final results of intellectual activity depend on the languages used. Thus, the methods, notations, and tools define the quality of the product. We need to use clear, human-understandable, and practically proven models, languages, and tools at the design and development stages to reduce the number of errors. Fault tolerance and safety are very important features of flight control software. These features should be supported by special methods and means at the development stage, e.g. secure programming. In particular, we can require the use of the 'program contracts' method and the predefined processing of exceptions. A small but effective measure is logging the program execution process, fixing the 'key points' of procedures and using a special sort of telemetry, the 'program telemetry' [4,7,9,10]. A very important and usually highly labor-consuming stage for critical software is testing. However, such software cannot be fully tested for all possible situations, due to the real-time mode of functioning and the aspects of concurrent execution of dozens of modules [1,9,11–13]. Verification processes should be described with all the needed details in special documentation. Well-prepared documentation for mission critical software can also include a quality assurance plan and a certification plan. There are several examples of the successful application of software model checking performed by Gerard Holzmann and his colleagues [14,15]. Another effective method of checking is static analysis [4,11,16]. In some cases, static analysis can be applied not only to the program's source code but also to other artifacts of the software lifecycle. Of course, to perform static analysis we need formal rules for the organization of the documents to be checked. Other very important features of mission critical software are recoverability and modifiability. There are many examples of the successful implementation of software changes and their re-uploading onboard, even when the distance between the Earth and deep space probes is as great as millions of kilometers. Very impressive cases happened on the way to Mars or on the Mars surface. Jim Erickson, Chief Project Manager of the Mars Science Laboratory, states that Curiosity is much more reprogrammable than previous missions. He even says that MSL is a 'software-defined spacecraft' [17]. The flexibility of the software of Curiosity has sometimes been a problem because it increases the complexity of the mission. Let us cite Erickson's statement: 'The more complicated the software, the more likely you'll not get everything perfect. You'll get surprises. Both in development, test and in operations.'


But this is a situation where flexibility helps, allowing the team to redesign the way the rover works in response to a potentially mission-ending hazard that was never anticipated. The authors know of similarly impressive examples in Russian space projects [7–9,16]. Today, uploading onboard software over the radio link has become a 'routine operation' that has already been performed hundreds of times. These practices, principles of design, and operation organization deserve widespread distribution.

3 Role of Processes, Tools and Standards

Obviously, to get a good product, we need a good process. For example, it is a well-known fact that the average number of errors depends on the programming language. There are languages with more potential opportunities to make a mistake, and there are 'safer' languages. In fact, C and C++ are widely used in real-time embedded systems now, but these languages are definitely not the best choice from the dependability point of view. To reduce the influence of potential C/C++ problems, many organizations have accepted so-called 'coding rules' [11,16]. The other important component is organization. The right organization of the cooperative work of dozens of specialists, including analysts, designers, spacecraft systems specialists, testers, managers, etc., defines the overall success. Organizational issues are an important component of the lifecycle. Sufficiently detailed descriptions of the technological aspects and the roles of the participants of the software lifecycle exist in some enterprise-level standards. The tools supporting the organizational processes include tools for job planning and project management, an electronic archive integrated with a version control system, and an electronic document management system with control of changes. For example, the ASPID system with the mentioned functionality was specially designed and introduced at JSC ISS [10]. But how can this experience and the best approaches, process organization principles, and development tools be propagated among all enterprises and firms in the Russian aerospace industry? The answer is standardization. The goal of standardization is to reach the optimal level of ordering and regulation in a certain problem domain by specifying and widely introducing formally defined requirements, rules, measures, etc. The main result one can expect from the introduction of standardization is the improvement of product quality, the breaking down of barriers in interaction, and the support of progress in the problem domain. As was shown above, in the case of mission critical software, the standards must pay attention not only to the features and characteristics of the product but also to the processes of its lifecycle. Standards can specify a 'common' language, both literally and figuratively, for the various structures involved in the creation and exploitation of onboard software. The standards should accumulate the best practices from the Russian space enterprises and design bureaus and adopt good solutions from other industries. One should also carefully examine the standards used abroad (for example, there is a set of software-related standards supported by the European Space Agency) and adopt the best solutions.

4 State-of-the-Art in Russian Space Industry and the Future

There are practically proven approaches to preparing high-quality and robust control software, reflected in corresponding standards in various industries. These standards collect the best practices and contain the minimal requirements for software development processes and the final product. Typical examples of such documents in aviation are DO-178B, ARINC 653, and AS9115 (and now there are Russian national variants of many of them). Rosatom has similar requirements, and a standard for the development of software applied in railroad automatic systems has even been officially introduced. Of course, there are general international and national standards regulating various issues related to the design, documenting, verification, and testing of software. We can name IEC/ISO 9126, 9294, IEEE 1074, and the GOSTs of the 19 (ESPD) and 34 series. But the particularities of the space industry are so important that we have the right to expect the development and introduction of industry-level standards. Roskosmos initiated the development of draft standards regulating the development of onboard software. It should be an important part of the industry's transition to the use of unified onboard apparatus and approaches in future Russian space projects. The standards will define the general requirements for onboard software as well as the basic principles of the organization and functioning of the flight control software. The supposed set of standards includes:

• A standard devoted to the software lifecycle.
• Requirements for software documentation.
• Methods and tools for ensuring the quality of software.
• Requirements for testing.
• General requirements for the onboard computer operating system.

At the first stage, the requirements of the standards will be recommendatory, and the approval of these standards will be just a first step. At the next stage, some of the requirements will become mandatory. Finally, the certification procedure for onboard software and its development processes should become obligatory, as it is in the aviation industry.

5 Conclusions

The paper argues the high level of importance of software in modern space missions and the role of 'onboard intelligence' in providing unmanned spacecraft with the required reliability. The ways of providing flight control software with the corresponding level of intelligence are discussed in the second and subsequent sections. Some of the approaches have already been developed and successfully introduced, together with organizational measures and supporting tools, at several enterprises and design bureaus in Russia. We can and must spread these best practices among all the organizations involved, and standardization is an appropriate way to do this.


Finally, the paper discusses the future of standards for the development of intelligent and dependable flight control software in the Russian space industry. This future is great, and we are sure that, as in aviation and at organizations under the jurisdiction of Rosatom, the requirements of the standards will become obligatory in the Russian space industry, as will the certification procedure for spacecraft onboard software and the processes of its development, documentation, and testing.

References

1. Eickhoff, J.: Onboard Computers, Onboard Software and Satellite Operations. An Introduction. Springer-Verlag, Heidelberg (2012)
2. Kozlov, D.I., Anshakov, G.P., Mostovoy, Y.A.: Control of Earth Observation Satellites: Computer Technologies. Mashinostroenie, Moscow (1998). (in Russian)
3. Akhmetov, R., Makarov, V., Sollogub, A.: Principles of the earth observation satellites control in contingencies. Inf. Control Syst. 1, 16–22 (2012)
4. Mostovoy, Y.A.: A Control of Complex Technical Systems: Construction of Earth Observation Satellites Software. Technosphere, Moscow (2016)
5. Tyugashev, A.A.: Visual builder of rules for spacecraft onboard real-time knowledge base. In: Czarnowski, I., Caballero, A., Howlett, R., Jain, L. (eds.) Intelligent Decision Technologies. Smart Innovation, Systems and Technologies, vol. 57, pp. 189–205. Springer, Cham (2016)
6. Holzmann, G.: Mars code. Commun. ACM 57(2), 64–73 (2014)
7. Kirilin, A.N., Akhmetov, R.N., Sollogub, A.V., Makarov, V.P.: Metody obespecheniya zhivuchesty nizkoorbitalnykh avtomaticheskykh KA zondirovaniya Zemly. Mashinostroenie, Moscow (2010). (in Russian)
8. Khartov, V.V.: Autonomnoe upravlenie kosmicheskymi apparatami svyazi, retranslyacii i navigacii. Aviakosmicheskoe priborostroenie (Aerosp. Instrum.-Making) 6, 12–23 (2006). (in Russian)
9. Tyugashev, A.A.: Language and toolset for visual construction of programs for intelligent autonomous spacecraft control. IFAC-PapersOnLine 49(5), 120–125 (2016)
10. Koltashev, A.A.: Effectivnaya technologiya upravleniya cyclom zhizni bortovogo programmnogo obespechenia sputnikov svyazi i navigacii. Aerosp. Instrum.-Making 12, 20–25 (2006). (in Russian)
11. Tyugashev, A.A., Ermakov, I.E., Ilyin, I.I.: Ways to get more reliable and safe software in aerospace industry. In: Program Semantics, Specification and Verification: Theory and Applications (PSSV 2012), pp. 121–129. Nizhni Novgorod, Russia (2012)
12. Kalentyev, A.A., Tyugashev, A.A.: CALS technologies in lifecycle of complex control programs. Samara Scientific Center of the Russian Academy of Sciences, Samara (2006). (in Russian)
13. Tyugashev, A.A.: Integrated environment for designing real-time control algorithms. J. Comput. Syst. Sci. Int. 45(2), 87–300 (2006)
14. Holzmann, G.: Reliable software development: analysis aware design. TACAS 1, 1–2 (2011)
15. Holzmann, G.: Using SPIN model checking for flight software verification. In: IEEE Aerospace Conference Proceedings, vol. 1, pp. 105–112. IEEE Press (2002)


16. Tyugashev, A., Zheleznov, D., Nikishchenkov, S.: A technology and software toolset for design and verification of real-time control algorithms. Russian Electr. Eng. 88(3), 154–158 (2017)
17. Space.com Information Portal: Mars Rover Curiosity Software Upgrade. http://www.space.com/17034-mars-rover-curiosity-software-upgrade.html. Accessed 11 Jun 2019

Improved Particle Swarm Medical Image Segmentation Algorithm for Decision Making

Samer El-Khatib1, Yuri Skobtsov2(B), and Sergey Rodzin1

1 Southern Federal University, Rostov-on-Don, Russia
samer [email protected], [email protected]
2 St. Petersburg State University of Aerospace Instrumentation, Saint Petersburg, Russia
ya [email protected]

Abstract. An improved particle swarm optimization (PSO) algorithm is proposed for medical image segmentation. The complexity of the proposed algorithm is estimated based on the drift theorem. Computer experiments have shown the linear complexity of the algorithm. Images from the OsiriX image dataset and real medical images were used for testing. The low (polynomial) time complexity allows using the proposed algorithm for rapid decision making (medical diagnosis). Population-based image segmentation methods such as PSO are well suited to implementation on distributed computing systems, which allows increasing their efficiency even further.

Keywords: MRI image segmentation · Particle Swarm Optimization · k-means · Swarm intelligence · Segmentation · Bio-inspired methods

1 Introduction

Segmentation based on MRI data is very important, but at the same time it is a time-consuming task if performed manually by medical specialists. In previous articles [1,2], a modified image segmentation algorithm based on particle swarm optimization was proposed. To evaluate the practical and theoretical time complexity of the developed algorithm for the segmentation of MRI images, a technique based on the consequences of the drift theorem was proposed [3]. This article is dedicated to the development and evaluation of a new modification of the particle swarm segmentation method [1,4] for MRI images (the Elitist Exponential Particle Swarm Optimization method).

2 Particle Swarm Optimization

The particle swarm optimization (PSO) method uses a swarm of particles, where each particle represents its own solution of the problem [1,2].


Each particle stores its best fitness value and the corresponding coordinates. Let us denote this fitness value as y_i and call it the cognitive component. Similarly, let us denote the best global optimum reached by all particles as ŷ(t) and call it the social component. Each i-th particle is characterized by its velocity v_i(t) and position x_i(t) at time moment t. The particle's position changes according to

x_i(t + 1) = x_i(t) + v_i(t + 1)   (1)

where x_i ∼ U(x_min, x_max). The velocity is updated as

v_ij(t + 1) = v_ij(t) + c_1 r_1j(t)[y_ij(t) − x_ij(t)] + c_2 r_2j(t)[ŷ_j(t) − x_ij(t)]   (2)

The personal best position (pbest) at time moment (t + 1) is obtained as

y_i(t + 1) = { y_i(t),      if f(x_i(t + 1)) ≥ f(y_i(t))
             { x_i(t + 1),  if f(x_i(t + 1)) < f(y_i(t))   (3)

where f : R^n → R is the fitness function, which determines how close the current solution is to the optimum. The global best position (gbest) ŷ(t) at time moment t is obtained as

ŷ(t) = min{f(y_0(t)), . . . , f(y_ns(t))}   (4)

where n_s is the total number of particles in the swarm.
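The updates (1)–(4) translate directly into code. The sketch below is our minimal illustration with numpy (the swarm size, search bounds, and acceleration coefficients are arbitrary); it performs iterations of the canonical gbest PSO for a minimization problem, with no inertia weight, matching Eq. (2):

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_step(x, v, y, y_hat, f, c1=1.49, c2=1.49):
    """One PSO iteration following Eqs. (1)-(4); x, v, y are (n, dim) arrays."""
    n, dim = x.shape
    r1, r2 = rng.random((n, dim)), rng.random((n, dim))
    v = v + c1 * r1 * (y - x) + c2 * r2 * (y_hat - x)   # velocity update, Eq. (2)
    x = x + v                                           # position update, Eq. (1)
    improved = f(x) < f(y)                              # personal-best update, Eq. (3)
    y = np.where(improved[:, None], x, y)
    y_hat = y[np.argmin(f(y))]                          # global best, Eq. (4)
    return x, v, y, y_hat

# Toy usage: minimize the sphere function f(x) = sum_j x_j^2
f = lambda pts: np.sum(pts ** 2, axis=1)
x = rng.uniform(-5.0, 5.0, (20, 2))
v = np.zeros_like(x)
y = x.copy()
y_hat = y[np.argmin(f(y))]
for _ in range(50):
    x, v, y, y_hat = pso_step(x, v, y, y_hat, f)
print("gbest position:", y_hat, "fitness:", f(y_hat[None])[0])
```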

3 Elitist Exponential PSO Algorithm

Note that the introduction of elitism in evolutionary methods sometimes allows reducing the computational complexity of the algorithm [3]. The Elitist Exponential PSO heuristic is preferred for the segmentation of "contrast" images. It consists in introducing a growth rate coefficient for each particle: if the value of the objective function for a particle at iteration t is greater than at iteration (t − 1), then the growth rate increases. After the pbest value has been determined for all particles, the current best pbest value is updated, and the value of gbest is replaced by the pbest with the highest coefficient value. In the segmentation process, each particle represents K clusters in such a way that x_i = (m_i1, ..., m_ij, ..., m_iK), where m_ij represents the center of cluster j for particle i, and β_i represents the growth rate. The modified fitness function is as follows:

f(x_i, Z_i, β_i) = ω_1 d_max(Z_i, x_i) + ω_2 β_max (z_max − d_min(x_i))   (5)

where z_max = 2^s − 1 for an s-bit image; Z is the connectivity matrix that represents the link between each pixel and the center of a cluster for particle i (each element of this matrix shows a pixel's attachment to a cluster for particle i); and β_max is the maximum allowed growth rate. The input data for the PSO-K-means heuristic are: the number of clusters K; the number of particles m that directly perform the segmentation; the maximum number of iterations n_t0 of the method for finding a solution; and the acceleration coefficients c_1 and c_2, the personal and global components of the final particle velocity.


Below is a step-by-step description of the Elitist Exponential PSO heuristic [4].

Step 1. Choose the number of particles in the swarm m, the personal and global acceleration coefficients c_1 and c_2, the maximum number of iterations n_t0, the number of clusters K, the maximum allowed growth rate β_max, and the parameters of the objective function f according to (5).
Step 2. Create an initial population of pixels (particles) distributed over the image space and initialize β_i.
Step 3. Calculate pixel values based on the objective function (5).
Step 4. If the current pixel (current particle position) is better than the previous one, update the personal best according to (3); also update β_i = β_i + 1 if β_i < β_max.
Step 5. Determine the best pixel (particle) among all.
Step 6. Update the particle's velocity according to (2). Calculate pbest (3) and gbest (4), the best local and global particle solutions (pixels).
Step 7. Move the particle to a new position according to (1).
Step 8. Return to Step 3 until the termination condition is reached.

PSO is an evolutionary process that is repeated until the termination condition is fulfilled and the change in velocity tends to zero. Each particle x_i represents K clusters such that x_i = (m_i1, ..., m_ij, ..., m_iK), where m_ij represents the center of cluster j for particle i. Here the objective function is calculated according to (5), where d_max is the maximum mean Euclidean distance from the particles to the associated clusters:

d_max(Z_i, x_i) = max_{j=1..K} { Σ_{Z_p ∈ C_ij} d(Z_p, m_ij) / |C_ij| }   (6)

d_min(x_i) is the minimal Euclidean distance between pairs of cluster centers [3,4]:

d_{min}(x_i) = \min_{\forall j_1, j_2,\, j_1 \ne j_2} \{ d(m_{ij_1}, m_{ij_2}) \} \qquad (7)

The peculiarity of the EEPSO heuristic is the expansion of the optimum search space and the availability of a mechanism preventing premature convergence of the method, which does not allow potentially best solutions to be missed [1,2].
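To make the fitness computation concrete, the following minimal Python sketch implements (5)–(7) under stated assumptions: the weights ω1 and ω2, the 8-bit default for z_max, and the array layout are illustrative choices, not details taken from the published implementation.

```python
import numpy as np

def eepso_fitness(centers, pixels, labels, beta_max, w1=0.5, w2=0.5, z_max=255):
    """Hedged sketch of the modified fitness (5) with d_max (6) and d_min (7).

    centers: (K, d) array of cluster centers m_ij for one particle
    pixels:  (N, d) array of pixel feature vectors Z_p
    labels:  (N,)   cluster index of each pixel (the connectivity matrix Z_i)
    """
    K = centers.shape[0]
    # d_max (6): maximum over clusters of the mean distance to the cluster center
    mean_dists = [np.linalg.norm(pixels[labels == j] - centers[j], axis=1).mean()
                  for j in range(K) if np.any(labels == j)]
    d_max = max(mean_dists)
    # d_min (7): minimal distance between pairs of cluster centers
    d_min = min(np.linalg.norm(centers[j1] - centers[j2])
                for j1 in range(K) for j2 in range(K) if j1 != j2)
    # (5): lower values correspond to compact, well-separated clusters
    return w1 * d_max + w2 * beta_max * (z_max - d_min)
```

A full EEPSO run would evaluate this function for every particle at every iteration and feed the results into the pbest/gbest updates (3) and (4).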

4 Theoretical and Experimental Evaluation of the Elitist Exponential PSO Algorithm Using Drift Analysis

Drift analysis is a modern tool for estimating the complexity of bio-inspired stochastic methods [3]. The method of estimating the time complexity using the drift theorem is as follows. In a finite state space S there is a certain function f(x), x ∈ S. We need to find

\max\{f(x);\; x \in S\} \qquad (8)


Let x* be the state with the maximum value of the function, f_max = f(x*). An abstract bio-inspired method for solving the optimization problem includes the following steps:

1. Initialization of the solution population (randomly or heuristically) ξ_0 = (x_1, ..., x_2n), which contains 2n individuals (n is an integer). Let k = 0. For each population ξ_k we need to determine ξ_k = max{f(x_i) : x_i ∈ ξ_k}.
2. Generation of the population ξ_{k+1/2} using evolutionary operators.
3. Selection and reproduction of 2n individuals from the populations ξ_{k+1/2} and ξ_k, deriving a new population ξ_{k+S}.
4. If f(ξ_{k+S}) = f_max, then stop; otherwise set ξ_{k+1} = ξ_{k+S}, k = k + 1 and go back to step 2.

Let x* be the optimum point. Let us denote by d(x, x*) the distance between the points x and x*. If there are many optima S*, then d(x, S*) = min{d(x, x*) : x* ∈ S*} is the distance between the point x and the set S*. We denote this distance by d(x). Then d(x*) = 0, and d(x) > 0 for any x ∉ S*. Considering a population X = {x_1, ..., x_2n}, we set

d(X) = \min\{d(x) : x \in X\} \qquad (9)

Formula (9) serves to measure the distance between the population and the optimal solution. The sequence {d(ξ_k); k = 0, 1, 2, ...} generated by a bio-inspired method is a random sequence, and it can be modeled using a homogeneous Markov chain. Then the drift of the random sequence at the time moment k can be defined as

\Delta(d(\xi_k)) = d(\xi_{k+1}) - d(\xi_k) \qquad (10)

The stopping time of the method can be estimated as τ = min{k : d(ξ_k) = 0}. We need to study the relationship between the time τ and the dimension of the problem n. At what drift values Δ(d(ξ_k)) can we estimate the mathematical expectation E[τ]? Can the presented method find the optimal solution in polynomial time, or will it require exponential time? The idea of drift analysis is quite simple. The key issue here is the evaluation of the relation between d and Δ. A bio-inspired method can solve an optimization problem in polynomial average time under the following drift conditions:

– There exists a polynomial h_0(n) > 0 (n is the dimension of the problem) such that

d(X) \le h_0(n) \qquad (11)

for any given population X, i.e. the distance from any population to the optimal solution is a polynomial function of the dimension of the problem;
– At any moment k ≥ 0, if d(ξ_k) > 0, then there exists a polynomial h_1(n) > 0 such that

E[d(\xi_k) - d(\xi_{k+1}) \,|\, d(\xi_k) > 0] \ge 1/h_1(n) \qquad (12)

i.e. the drift of the random sequence {d(ξ_k); k = 0, 1, 2, ...} towards the optimal solution is always positive and bounded from below by an inverse polynomial.


The consequence of the drift analysis results is that an estimate of the drift value is converted into an estimate of the running time of the method, and a local property (the drift of one step) is transformed into a global property (the running time of the method until the optimum is found). This is a new result in the estimation of the time complexity of bio-inspired methods obtained with drift analysis; the drift itself can be evaluated more easily. With the help of drift analysis, conditions were determined that guarantee the solution of some problems on average in polynomial time, as well as the conditions under which the method requires on average exponential time for solving the problem. It is necessary to estimate the drift values based on image data during the segmentation process and to check that the conditions for polynomial time complexity are satisfied. Let us estimate the fulfilment of these conditions for specific values of the objective function of the developed image segmentation method. Table 1 presents the values of the objective function of the Elitist Exponential PSO algorithm with the number of particles m = 5 and the number of clusters K = 5; the image size is 109 × 106. The objective function values at each iteration k are computed as in formula (6).

Table 1. Objective function of the PSO-k-means algorithm at m = 5

k      1       2       3       4        5        6        7        8        9        10       11
f(k)   433.66  635.52  840.27  1048.48  1321.57  1483.94  1670.52  1863.37  2066.50  2066.50  2066.50

Then the bound of formula (11) for the objective function can be verified as follows:

d(X) \le \sum_{i=1}^{width \cdot height} \; \sum_{\substack{j=1 \\ j \ne i}}^{width \cdot height} \sqrt{(X_i - X_j)^2 + (Y_i - Y_j)^2} \le width \cdot height \cdot \sqrt{width^2 + height^2}

where width and height are the width and height of the image. Formula (12) for the mathematical expectation of the drift can be represented as E[d(ξ_k) − d(ξ_{k+1}) | d(ξ_k) > 0] ≥ min(d(X)). For Table 1 above, the mathematical expectation of the drift has the form shown in Table 2.

Table 2. Drift of the objective function of the PSO-k-means segmentation algorithm at m = 5

k   2       3       4       5       6       7       8       9       10
E   201.86  204.75  208.21  273.09  162.37  186.58  192.85  203.13  0

As can be seen from the results, conditions (11) and (12) are valid when real data values are substituted; therefore, according to [4], the Elitist Exponential PSO method solves the segmentation problem in polynomial time.
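To illustrate how this check works on the data above, here is a minimal Python sketch (not the authors' code) that recovers Table 2 from the objective values in Table 1, treating the growth of the objective value as progress toward the optimum, and asserts that the one-step drift stays positive until convergence, as condition (12) requires:

```python
def drift_values(objective):
    """Return the one-step drift f(k+1) - f(k) for k = 1 .. len-1."""
    return [objective[k + 1] - objective[k] for k in range(len(objective) - 1)]

f_values = [433.66, 635.52, 840.27, 1048.48, 1321.57, 1483.94,
            1670.52, 1863.37, 2066.50, 2066.50, 2066.50]  # Table 1
drifts = drift_values(f_values)
print(drifts)  # 201.86, 204.75, 208.21, 273.09, 162.37, ... (Table 2)

# Condition (12): the drift must be strictly positive before convergence
converged_at = next(i for i, d in enumerate(drifts) if d == 0)
assert all(d > 0 for d in drifts[:converged_at])
```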

5 Conclusion

In the presented paper, the Elitist Exponential PSO segmentation algorithm has been introduced, and an investigation of its theoretical and practical time complexity using the drift theorem has been proposed. Experimental results showed that the EEPSO segmentation method can solve the segmentation task in polynomial time. This low time complexity makes it possible to use the proposed algorithm for rapid decision-making (medical diagnosis). It should be noted that population-based image segmentation methods map well onto distributed computing systems, which makes it possible to increase their efficiency even further. The proposed algorithm can be enhanced through additional research, especially on pseudo-random heuristic coefficients and their impact on convergence and the final result of processing.

Acknowledgements. The reported study was funded by the Russian Foundation for Basic Research according to the research project 19-07-00570 "Bio-inspired models of problem-oriented systems and methods of their application for clustering, classification, filtering and optimization problems, including big data".

References

1. Das, S., Abraham, A., Konar, A.: Automatic kernel clustering with a multi-elitist particle swarm optimization algorithm. Pattern Recogn. Lett. 29, 688–699 (2008)
2. El-Khatib, S., Skobtsov, Y., Rodzin, S., Zelentsov, V.: Hyper-heuristical particle swarm method for MR images segmentation. In: Silhavy, R. (ed.) Artificial Intelligence and Algorithms in Intelligent Systems, CSOC 2018. Advances in Intelligent Systems and Computing, vol. 764, pp. 256–264. Springer (2018)
3. He, J., Yao, X.: Drift analysis and average time complexity of evolutionary algorithms. Artif. Intell. 127(1), 57–85 (2001)
4. Das, S., Ajith, A., Amit, K.: Spatial information based image segmentation using a modified particle swarm optimization algorithm. Pattern Recogn. Lett. 29(5), 688–699 (2008)

Collecting and Processing Distributed Data for Decision Support in Social Ecology

Dmitry Verzilin1(B), Tatyana Maximova2, and Irina Sokolova3

1 St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, Lesgaft National State University of Physical Education, Sport and Health, St. Petersburg, Russia
[email protected]
2 ITMO University, St. Petersburg, Russia
[email protected]
3 St. Petersburg State University, St. Petersburg, Russia
i [email protected]

Abstract. The problems of environmental awareness in the population were considered. Distributed sources of information were revealed and analyzed to evaluate such awareness. The data on search queries and messages in social groups were used to estimate patterns of environment-related mass activities in the cyberspace and the real world of Russian regions. The results obtained demonstrate the opportunities of distributed data collection for real-time decision making, as compared with sociological surveys.

Keywords: Environmental monitoring · Keyword searches · Online activity · Social ecology · Socio-ecological systems

1 Introduction

Russian environmental policy mainly relies on ecological economics. Environmental programs include the costs of measures to prevent environmental damage and preserve protected areas. At present we understand that grounding environmental policy must involve interdisciplinary research putting together environmental protection, the environmental behavior of the population, and the needs of the society [1,3]. The population of Russia has become increasingly interested in environmental issues in recent years. This interest is manifested in thematic virtual communities in which people exchange information about environmental problems and coordinate actions to solve these problems. Social ecology significantly changes the thinking style and contributes to the formation of the paradigm of ecological thinking and a responsible attitude to the environment [1,2,5–7]. The idea of the interrelationship between environmental sustainability and social well-being in the formation of a socio-ecological perspective can be used for adaptive management of environmental resources and designing environmental policies [1]. The digital transformation of society and economy alters the processes of information interaction in socio-ecological systems. Communication in the cyber-environment (for example, social networks, information portals on environmental initiatives and the environment, Internet forums, etc.) contributes to the self-organization of the system's elements [8] and allows us to talk about the emergence of a new type of system, namely a cyber-socio-ecological system. We need new approaches to collecting and analyzing data in such systems [9]. Our goal was to provide techniques for collecting and processing data for evaluating the needs of the society and its response to changes in the environment when making management decisions. There is a lack of tools for analyzing feedback, that is, the reaction of the population to the actions of the authorities. Rapid evaluation of the feedback using statistical data is rather difficult: it takes a long time to collect, aggregate, and analyze such data. We discuss approaches to sharing data from various sources to overcome those problems.

2 Empirical Analysis

Online activity of the population is a data source on the attitude to the problems of the urban environment. Mass events in virtual communities and in the real world involving independent actors (individuals and legal entities) can have a significant impact on socio-economic processes and systems.

2.1 Environment-Related Public Groups

We analyzed regional groups in the Russian social network vk.com created to discuss problems and initiatives in the field of environmental protection in the region. We considered groups with at least 1000 participants for Moscow and the Moscow Region, St. Petersburg and the Leningrad Region, and the Arkhangelsk Region. A content analysis of messages in the groups over the past six months gave us an opportunity to distinguish four levels of activity and involvement of participants in the processes of forming regional environmental policy. The first level was limited to the presentation of information on local environmental issues and initiatives, related documents, photo and video materials, links, etc. At the second level, the creation and signing of petitions to government bodies was added to the information exchange. The third level differed in the organization of environmentally oriented events and actions in support of environmental initiatives. The fourth level manifested itself in the organization of rallies and protest actions. In addition, the topics most discussed in the groups were analyzed. For each region, the average number of participants in the group and the proportion of groups at each level were estimated, taking into account the number of participants. There are significant differences between the considered regions, both in the structure of the groups and in the topics of discussion. In Moscow and the Moscow Region, sources of unpleasant odor are discussed and recorded, and groups focused on the exchange of information dominate. For the groups of St. Petersburg, a characteristic topic of discussion was separate garbage collection and its legal grounding [4]. Also, residents are concerned about the possible construction of waste recycling plants which, in the absence of a separate waste collection system, may simply burn a significant proportion of municipal solid waste. Third-level groups, with the organization of environmental activities, were the most pronounced here. For the Arkhangelsk Region, the main topic was counteracting the plans to create new landfills for solid household waste (at the Shies location, see Fig. 1) and the arrival of waste from Moscow and the Moscow Region. Here the highest protest activity of group members, at the fourth level, was observed. The classification of the vk.com social network groups created to discuss problems and initiatives in the field of environmental protection in the regions makes it possible to establish the level of self-organization of group members and the degree of environmental awareness. It should be noted that environmental awareness is more developed among the participants of groups from St. Petersburg and the Leningrad Region.

2.2 Prevalence of Environment-Related Collocations in Social Networks

We examined the popularity of environmental collocations on the Russian social network vk.com. For that we used the open-source KNIME software. The visualization of the results (Fig. 1) shows more popular collocations in a larger font.
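The study performed this step in KNIME; purely as an illustration of the underlying computation, a comparable frequency count of two-word collocations could be sketched in Python as follows (the post texts are assumed to have been collected beforehand, e.g. via the vk.com API):

```python
from collections import Counter
import re

def collocation_counts(posts):
    """Count adjacent word pairs (two-word collocations) across posts."""
    counts = Counter()
    for text in posts:
        words = re.findall(r"\w+", text.lower())
        counts.update(zip(words, words[1:]))  # adjacent word pairs
    return counts

# Toy example; real input would be thousands of downloaded group messages
posts = ["Separate garbage collection starts today",
         "Sign the petition for separate garbage collection"]
print(collocation_counts(posts).most_common(3))
```

The resulting frequencies can then drive the font size of each collocation in a word-cloud visualization like Fig. 1.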

2.3 Search Queries

For the analysis of thematic online activity, the statistical data collection services of the Yandex and Google platforms were used. The prevalence of requests about environmental pollution, clean water and air, and the state of recreational resources showed the degree of public concern about the ecological situation in the regions. Seasonal variations in the number of search queries for the keywords "separate garbage collection" and "garbage landfill", with highs in spring and summer, were identified. The significant increase in the number of requests with the words "separate garbage collection" may manifest the interest of the population in the so-called "law on garbage" and the discussion of territorial schemes for the deployment of garbage processing plants and landfills. Regional data on the prevalence of requests with the words "garbage landfill" were analyzed for the areas of the Moscow Region in which there was a sharp


deterioration in the quality of atmospheric air near solid waste landfills: Kolomna, Balashikha, Volokolamsk. The peak in the number of requests coincides with the periods of increased gas emission at the landfills. In contrast to the listed territories, in the Arkhangelsk Region it is possible to expect a further increase in the number of search queries in connection with the discussion of plans to dispose of solid domestic waste from Moscow and the Moscow Region.

Fig. 1. Popular collocations on the social network vk.com (translated from Russian)

There is no direct correlation between the prevalence of requests for "separate garbage collection" in large cities of Russia with a population of over 1 million people (23% of the total population of Russia) and the real availability of separate garbage collection for the population (the proportion of the population for which separate collection of garbage is possible). As noted above, the population of St. Petersburg shows an increased interest in this problem, despite the complete impossibility of separate garbage collection up to now. In addition to the absolute number of queries, we analyzed the relative regional intensity of queries (regional popularity, or affinity index). The affinity index is equal to the ratio of a region's share in the total number of queries with definite search keywords to its share in all queries. The affinity index characterizes the regional popularity of keywords in search queries. Values greater than 1 express a higher popularity compared with the average popularity in Russia, while values less than 1 mean lower popularity.
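The affinity index computation described above is simple to express in code; the following Python sketch uses made-up query counts purely for illustration, not data from the study:

```python
def affinity_index(region_topic, total_topic, region_all, total_all):
    """Ratio of a region's share in topic queries to its share in all queries."""
    topic_share = region_topic / total_topic
    all_share = region_all / total_all
    return topic_share / all_share

# Example: a region producing 5% of "garbage landfill" queries
# but only 2% of all queries has affinity 2.5 (> 1: above-average interest)
print(affinity_index(region_topic=50_000, total_topic=1_000_000,
                     region_all=2_000_000, total_all=100_000_000))  # 2.5
```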


We distinguished four levels of environmental interest of the population (listed in order of increasing altruism): a reaction to a direct environmental threat; environmental needs associated with the search for information about the environmental situation in places of residence and recreation; a reaction to a public debate in the field of ecology; and environmental awareness. At the first level, increased online activity was observed during the exacerbation of the environmental situation at landfills for municipal solid waste. Regression models were constructed that link the intensity of thematic online activity with the duration of an unfavorable environmental situation. To illustrate the second level of altruism in online reactions, that is, the environmental need, we analyzed the seasonal interest of St. Petersburg and Kaliningrad residents in the problem of water pollution. There are obvious peaks of relative online activity associated with the purity of tap water and reservoirs in spring and autumn. The affinity index in Kaliningrad was usually greater than 1. The third level of altruism in environmental interest was manifested in the increase of online activity during and after discussions of environmental problems on federal TV channels. The fourth level of ecological altruism, environmental consciousness, was revealed as a result of the analysis of data from a sociological survey. The respondents were 260 students from St. Petersburg. Participants answered questions about their incentives to search for environmental information, their attitude to environmental protection activities, and their own understanding of the environmental effects on health. A behavioral pattern was identified, which is characterized by the desire to actively participate in environmental activities.

3 Discussions

Socio-economic relations are increasingly manifested in digital form. Information and communication technologies integrated into the economy and social relations become the basis of competitive advantages in world markets, transforming production systems into cyber-physical systems, and economic and social relations and interaction into interaction in cyber socio-economic systems. These trends provide opportunities for innovation and economic growth through the development of the cyberspace of smart cities. Measurement of the thematic online activity of the population makes it possible to identify its needs and to analyze in real time its response to changes in the urban environment. The proposed approach to identifying the needs of the population can be an addition to the platforms planned in Smart City projects. Analysis of data on the online activity of the population when making operational decisions is quicker and cheaper than sociological surveys. When making management decisions, it is necessary to take into account the needs of the population, which are reflected in its socio-economic activity in cyberspace.

Acknowledgements. The research described in this paper is partially supported by the Russian Foundation for Basic Research (grants 17-06-00108, 17-08-00797), state order of the Ministry of Education and Science of the Russian Federation № 2.3135.2017/4.6, state research № 0073-2019-0004 and International project ERASMUS+, Capacity building in higher education, # 73751-EPP-1-2016-1-DE-EPPKA2CBHE-JP.


References

1. Armitage, D., Béné, C., Charles, A.T., Johnson, D., Allison, E.H.: The interplay of well-being and resilience in applying a social-ecological perspective. Ecol. Soc. 17(4), 15 (2012)
2. Barnes, M.L., Bodin, Ö., Guerrero, A.M., McAllister, R.J., Alexander, S.M., Robins, G.: The social structural foundations of adaptation and transformation in social-ecological systems. Ecol. Soc. 22(4), 16 (2017)
3. Cumming, G.S.: Theoretical frameworks for the analysis of social-ecological systems. In: Sakai, S., Umetsu, C. (eds.) Social-Ecological Systems in Transition. Global Environmental Studies. Springer, Tokyo (2014). https://doi.org/10.1007/978-4-431-54910-9_1
4. Federal Law "On Production and Consumption Wastes" of 24.06.1998 N 89-FZ (last revised 2018). http://www.consultant.ru/document/cons_doc_LAW_19109/. Accessed 15 Mar 2019
5. Halliday, A., Glaser, M.: A management perspective on social ecological systems: a generic system model and its application to a case study from Peru. Hum. Ecol. Rev. 18(1) (2011)
6. Ostrom, E., Cox, M.: Moving beyond panaceas: a multi-tiered diagnostic approach for social-ecological analysis. Environ. Conserv. 37(4), 451–463 (2010). https://doi.org/10.1017/S0376892910000834
7. Ostrom, E.: A general framework for analyzing sustainability of social-ecological systems. Science 325, 419–422 (2009). https://doi.org/10.1126/science.1172133
8. Sokolov, B., Yusupov, R., Verzilin, Dm., Sokolova, I., Ignatjev, M.: Methodological basis of socio-cyber-physical systems structure-dynamics control and management. In: Chugunov, A.V., Bulgov, R., Kabanov, Y., Kampis, G., Wimmer, M. (eds.) Digital Transformation and Global Society: First International Conference, DTGS 2016, St. Petersburg, Russia, 22–24 June 2016, pp. 610–618 (2016)
9. Verzilin, D., Maximova, T., Sokolova, I.: Online socioeconomic activity in Russia: patterns of dynamics and regional diversity. In: Digital Transformation and Global Society: Second International Conference, DTGS 2017, St. Petersburg, Russia, 21–23 June 2017. CCIS, vol. 745, pp. 55–69. Springer (2017)

Evaluation of the Dynamics of Phytomass in the Tundra Zone Using a Fuzzy-Opportunity Approach

V. V. Mikhailov(B), Alexandr V. Spesivtsev, and Andrey Yu. Perevaryukha

St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, Saint Petersburg, Russia
[email protected], [email protected], [email protected]

Abstract. A pattern of tundra plant development has been established on the basis of a long-term field study. This pattern can be described by a mathematical model as a result of the combination of two tendencies – growth and decay of plants during their lifetime. Photos taken from space allowed us to assess phytomass volume remotely, using the NDVI index. An attempt was made to develop a technique which would combine these two data sources in order to assess phytomass dynamics in the tundra zone. We use flexible methods for ecodynamics models.

Keywords: Tundra · Patterns of plant development · Satellite images · Greenness index · Modelling of phytomass dynamics · Ecodynamics models

1 Introduction

The problem of assessing the volume, as well as the seasonal and yearly dynamics, of the on-land phytomass of plant species and communities of the tundra zone is a very important one. From the applied science perspective, tundra vegetation is a forage source for reindeer and other species of herbivorous animals and birds. From the theoretical perspective, information about phytomass dynamics is important for studying the connections between the zonal distribution of vegetation cover and climatic factors, continentality, and altitude, and for the estimation of balances and turnover of biogenic elements in the tundra biome. However, it should be pointed out that despite the importance of these data for various economic management issues in the tundra zone, the consistent patterns of growth and decay of various tundra species and plant communities have not been studied well enough. The data regarding the reindeer capacity of pastures do not include weather and climate factors, which prevents us from assessing the seasonal and interannual dynamics of fodder supplies. In this respect, of great importance is the work performed at stations and devoted to the seasonal and yearly dynamics of Arctic plant species and their communities [1]. The study of the seasonal and annual dynamics of species and communities of tundra vegetation cover using traditional on-land techniques involves costly long-term stationary research. The possibilities of performing this work now are extremely limited. An alternative approach presupposes using remote data collection methods and multi-zonal satellite images for assessing the structure and capacity of on-land phytomass. As shown in [2,3], the interdependence of NDVI data and on-land phytomass for the natural climatic zones of the Siberian tundra biome is at the level of the determination coefficient R = 0.95. In this respect, on-land phytomass assessment plays a role in the adjustment, calibration and improvement of the recognition capabilities of satellite data during the identification of plant communities. The scope of this work is decreasing quickly. From the management theory perspective, the study object itself is a complex underformalized biological system, which is difficult to study using deterministic mathematical methods. Therefore, in this work, in order to simulate phytomass dynamics, an attempt is made to use the fuzzy opportunity approach, based on experts' knowledge and experience. Taking into account the above considerations, the aim of our study can be formulated as follows: using the data of retrospective stationary geobotanical research and remote satellite data about the volume of on-land phytomass of plant communities, on the basis of the fuzzy opportunity approach we should build a mathematical model of phytomass dynamics and compare calculations made with the model against actual data.

2 Materials and Methods

We studied the regular patterns of phytomass growth using the retrospective materials obtained at the Pokhodsk station of the Institute of Biology of the Yakutsk branch of the Siberian Department of the Academy of Sciences of the USSR [1]. The station is located in the Kolyma river delta in the Southern subarctic tundra. During 1971–1975 the station conducted research on the seasonal and interannual dynamics of the phytomass of the most important species of subarctic flora. Phytomass volume was measured every 15 days, from mid-May to mid-September. The study was conducted at two stations, Pokhodsk and Rogovatka. The Pokhodsk station is located on the terrace above the flood-plain of the Kolyma River in the midst of the polygonal ridge tundra and swamp complex characteristic of the Southern subarctic tundra of the Yakut north. The Rogovatka station works in harsher climatic, soil, hydrological and cryosolic conditions, lying in the interfluve tundra covered with small hills and bushes, which is typical of the southern and middle subarctic subzones of the Indigirka-Kolyma littoral plain. Figure 1 demonstrates normalized curves of the phytomass dynamics of specific plant species. According to the graphs, the phytomass peak for the species and the community of the Pokhodsk station corresponds to early August (point 6). The growth of herbage plant phytomass begins in the second half of May, and that of bushes a month later. Moreover, herbal plants have up to 30–40% of their maximum phytomass volume covered with snow; bushes lose their leaves completely. The Rogovatka station has its phytomass volume maximum in late July – early August.


Fig. 1. Phytomass dynamics of certain species at the Pokhodsk station, average for 1971–1975. By rows: 1 – cotton grass, 2 – Viluy sedge grass, 3 – pendant grass, 4 – herbage (average), 5 – alder stand, 6 – fluffy willow, 7 – bush leaves

The phytomass changes presented in Fig. 1 demonstrate the natural processes of plant growth and withering. Actually, both these processes can be described by the formula:

Y = a_0 + a_1 \cdot \exp(b_1 \cdot t) - a_2 \cdot \exp(b_2 \cdot (t - t_z)) \qquad (1)

where a_i (i = 0, 1, 2) are the scaling coefficients of the model, b_j (j = 1, 2) are curve run parameters describing the kinetics of the processes, and t, t_z are the current time of the vegetation period and the lag time of the withering process, respectively, in days. External factors (air temperature, precipitation, etc.) are not included in the arguments of the formula; their influence is taken into account indirectly. Correspondingly, the results reflect general patterns of phytomass dynamics at certain average values of the factors, corresponding to the average long-term standard.
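As an illustration, formula (1) can be evaluated directly; in the following Python sketch the coefficient values are arbitrary placeholders chosen to produce a mid-season peak, not fitted values from the field data:

```python
import numpy as np

def phytomass(t, a0, a1, b1, a2, b2, t_z):
    """Y(t) = a0 + a1*exp(b1*t) - a2*exp(b2*(t - t_z)), t in days."""
    return a0 + a1 * np.exp(b1 * t) - a2 * np.exp(b2 * (t - t_z))

t = np.arange(0, 120)                       # vegetation period, days
y = phytomass(t, a0=0.0, a1=1.0, b1=0.04,   # growth component
              a2=0.2, b2=0.08, t_z=20)      # withering lagged by t_z days
print(int(t[np.argmax(y)]))                 # day of the simulated phytomass peak
```

The superposition of a growing and a lagged decaying exponential reproduces the rise-then-fall shape of the curves in Fig. 1.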

3 Modeling of Yearly Phytomass Dynamics

Kolguyev Island was selected as a specific object for modeling phytomass volume dynamics (the subzone of the Southern subarctic tundra). During 2003–2007 geobotanical research was conducted on the island. A large amount of material was collected regarding the mapping of plant community types, with their geobotanical descriptions, spectral characteristics and exact geographical references of the working sites. In accordance with the above results (Fig. 1), in order to assess the maximum volume of on-land phytomass (yearly volume), the satellite images and NDVI values made in late July and early August were selected. The external factors were: the sum of positive temperatures, the total amount of precipitation, cloud cover, average wind speed, the length of the vegetation period, and the date of vegetation start (May 1).


Processes in nature, as mentioned above, are underformalized; therefore, the mathematical models were built using the knowledge and experience of expert professionals. At the first stage, the technique of building fuzzy opportunity models [4] involves substantive discussions with an expert for the selection and substantiation of the factor space in which the problem is solved. When assessing phytomass dynamics in the tundra zone, the expert chose the four most informative variables from the ones mentioned in Table 1: x1 – the sum of temperatures during the vegetation period, ◦C; x2 – cloud cover, in oktas; x3 – the start of the vegetation period, in days after May 1; x4 – the duration of the vegetation period, in days; Y – the greenness index, in values of NDVI/1000. The dependent variable in the linguistic form is presented in Fig. 2, where Y is the greenness index.

Fig. 2. Y as a linguistic variable, NDVI/1000

After that, a questionnaire matrix was made up (Table 1), in which the expert filled in column Y in a verbal form, where, in accordance with the scale from Fig. 2, the assessment modes are denoted: L – a low value of the greenness index, BA – below average, A – average, AA – above average, H – high; or between the modes: L-BA – Low-Below Average, etc. The independent variables are also represented in the linguistic form, and the ends of the standardized opposition scale for each variable are denoted as «−1» – the lowest value and «+1» – the highest value of the factor, according to the table. The next step in the implementation of the technique [4] is the construction of the polynomial model in accordance with the data of the table. The resultant expression with the significant coefficients looks as follows:

Y = 6.617 + 0.68 x_1 + 0.117 x_2 + 0.258 x_3 + 0.352 x_4 + 0.211 x_1 x_2 - 0.117 x_1 x_4 - 0.117 x_2 x_3 + 0.164 x_1 x_2 x_3 \qquad (2)

where all independent variables are represented in the normalized scale.

Table 1. Fragment of the survey matrix with expert assessments and values calculated by the model. Columns: # – row number; x1 – temperature, ◦C; x2 – clouds, points; x3 – vegetation beginning, days; x4 – duration of the vegetation period, days; Y – NDVI (verbal expert value, numeric expert value, value calculated by model (2)).

#    x1   x2   x3   x4   Y (verbal)   Y (expert)   Y (model)
1    −1   −1   −1   −1   BA           5.750        5.750
2     1   −1   −1   −1   A-AA         6.875        7.063
3    −1    1   −1   −1   L-BA         5.375        5.281
...  ...  ...  ...  ...  ...          ...          ...
14    1   −1    1    1   AA-H         7.625        7.531
15   −1    1    1    1   BA           5.750        5.750
16    1    1    1    1   A-AA         6.875        7.063

The results of checking the correlation between the calculations by the model (2) and the expert assessment (Fig. 3a, correlation coefficient R = 0.98), and between the calculations and the actual NDVI data (Fig. 3b, correlation coefficient R = 0.79), demonstrate that the mathematical model (2) can describe quite well the dependence of the greenness coefficient by NDVI on the predicted values of the factors. In this case the simulation results coincide almost completely with the expert's assessment of the NDVI value. The degree of correspondence between the calculations by the model and the actual index values is lower (R = 0.79), which can be explained by the qualification of the experts, the peculiarities of selecting factors and their gradations, differences in microclimate and soil at different spots, errors in the NDVI index assessment, temporal shifts of the photo shootings, and other reasons typical of complex underformalized phenomena.

Fig. 3. Checking the hypothesis regarding the correlation between calculations, expert’s opinion (a) and actual greenness values (b) by NDVI/1000
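As an illustration, model (2) is straightforward to evaluate programmatically; the following minimal Python sketch simply transcribes the polynomial as printed and evaluates it at two corner points of the normalized factor space (it is an illustrative transcription, not the authors' software):

```python
# Transcription of the polynomial model (2); inputs are the normalized
# factor values in [-1, +1] (x1 - temperature sum, x2 - cloud cover,
# x3 - vegetation start, x4 - vegetation period duration)
def greenness_index(x1, x2, x3, x4):
    return (6.617 + 0.68 * x1 + 0.117 * x2 + 0.258 * x3 + 0.352 * x4
            + 0.211 * x1 * x2 - 0.117 * x1 * x4 - 0.117 * x2 * x3
            + 0.164 * x1 * x2 * x3)

print(greenness_index(-1, -1, -1, -1))  # all factors at their lowest level
print(greenness_index(+1, +1, +1, +1))  # all factors at their highest level
```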

4 Conclusion

The volume of phytomass during its growth period can be predicted on the basis of the synthesis of the phytomass dynamics pattern, using the data of field observations of phytomass and formalized experts' knowledge, within the framework of the fuzzy opportunity approach, regarding changes of NDVI and external natural factors. The developed approach to the creation of models of the phenomena under study is universal, and it can be recommended for use in academic research.

Acknowledgements. Work supported by RFBR project №17-07-00125 in SPIIRAS, and Budget theme 0074-2019-0009.

References

1. Andreev, V.N. (ed.): Seasonal and Weather Dynamics of Phytomass in the Subarctic Tundra, p. 190. Novosibirsk (1978)
2. Walker, D., et al.: Phytomass, LAI, and NDVI in northern Alaska: relationships to summer warmth, soil pH, plant functional types, and extrapolation to the circumpolar Arctic. J. Geophys. Res. 108(D2), 8169 (2003). https://doi.org/10.1029/2001JD000986
3. Raynolds, M., et al.: A new estimate of tundra-biome phytomass from trans-Arctic field data and AVHRR NDVI. Remote Sens. Lett. 3(5), 403–411 (2012)
4. Ignatyev, M.B., Marley, V.E., Mikhailov, V.V., Spesivtsev, A.V.: Modeling of Weakly Formalized Systems Based on Explicit and Implicit Expert Knowledge, p. 501. Politeh-Press, St. Petersburg (2018)

Intelligent Human-Machine Interfaces

Lower Limbs Exoskeleton Control System Based on Intelligent Human-Machine Interface

Ildar Kagirov1(B), Alexey Karpov1, Irina Kipyatkova1, Konstantin Klyuzhev2, Alexander Kudryavcev2, Igor Kudryavcev2, and Dmitry Ryumin1

1 St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, SPIIRAS, 14th Line, 39, 199178 St. Petersburg, Russia
{kagirov,karpov}@iias.spb.su
2 Volga State University of Technology (Volga Tech), Lenina Sq. 3, 424000 Yoshkar-Ola, Russia

Abstract. In this article, we present an intelligent human-machine interface designed to control the medical robotic exoskeleton of the lower limbs "Remotion". The intelligent bimodal interface combines tools of contactless voice control, as well as contact sensor-based control on mobile devices. The authors argue that the use of intelligent techniques of interaction between the user and the exoskeleton increases the level of ergonomics as well as the effectiveness of its use in medical rehabilitation practice, due to an intuitive and natural way of human-machine communication and control. The salient features of the proposed and developed system are automatic voice command recognition, conversion of the audio signal into text data, remote control and remote supervision of the exoskeleton through a PC, and active and informative feedback to the user, ensuring safety during rehabilitation sessions. The design of the interface is described in the paper in detail, and outlines of the engineering solutions applied in this exoskeleton are provided as well. Despite the existence of conceptually similar devices in Russia, the presented exoskeleton is distinguished by the presence of an intelligent user interface, which significantly improves the ergonomics of the system. Another feature is the total modularity of the device. Such a solution greatly facilitates the maintenance of the exoskeleton and provides a range of adjustment opportunities.

Keywords: Medical exoskeleton · Exoskeleton of the lower limbs · Intelligent control · Voice control · Bimodal interface · Assistive technologies

1 Introduction

This paper focuses on the intelligent human-machine interface designed for controlling the medical robotic lower limbs exoskeleton "Remotion" [1], elaborated in 2017–2019 as part of a joint project. The participants that took part in the project were: the Volzhsky Electromechanical Plant, Volga State University of Technology, and the St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences (SPIIRAS). By the term "intelligent human-machine interface" we understand methods of interaction with, or control of, a computer, a system or a robot which involve the use of artificial intelligence technologies. In spite of an increasing interest in intelligent interfaces, there is currently no strong consensus on what an intelligent HMI is. Based on published research, one can list the functions that are closely related to the notion of the intelligent interface [2,3]:

1. analysis of the user's actions;
2. a (speech-based) dialogue system;
3. adaptability, i.e. the system can change the pattern of its behavior according to the situation.

The intelligent user interface makes it possible to increase the autonomy level of the controlled object, as well as contributing to the naturalness and ergonomics of human-machine interaction, allowing the user to take advantage of convenient and/or natural ways of interaction in accordance with his needs. The former aspect, namely increasing the level of autonomy of the system, is important for complex robotic systems that function under conditions of information uncertainty; the latter one (improving the ergonomics and naturalness of the interface) is important when the robotic system interacts with humans (social robotics, assistive robotics, exoskeletons, etc.) [4,5]. On the whole, one of the decisive factors affecting the success of HMI is the naturalness and ease of user interaction with the robotic system [4,6]. This is especially important when the use of the device is closely associated with active interaction with a person, for example, in everyday life or in the entertainment sphere. Thus, a natural-language-based (in particular, speech-based) or multimodal interface belongs to the crucial features of intelligent control systems. One can state that, ideally, there should be no difference between 'robot-human' and 'human-human' communication acts. Both communicants should use natural, intuitive modalities: speech, facial expressions, gestures, emotions, and context [5]. In other words, the HMI should be as natural as possible. The most natural way of communication is that based on speech. It is an example of such a speech-based interface that is the main subject of the present paper.

2 Overview of Robotic Exoskeletons

Under the term "exoskeleton" are understood devices designed to increase the physical capacities of a human via an external frame (exoskeleton, from Ancient Greek éxō – 'outside' and skeletós – 'skeleton'). According to the classification proposed by the International Federation of Robotics, exoskeletons are classified into the group of service robots [7]. However, unlike most robotic systems, exoskeletons are not autonomous. A special case of exoskeletons is constituted by robotic avatars and telepresence robots, controlled via a remote interface [8].


In spite of the fact that exoskeletons are deeply integrated into various areas of human activity [9], there is no complete and widely accepted classification of exoskeletons. Among the most common features for classifying exoskeletons, the following can be listed [10–12]:

1. by the source of energy: active and passive – powered exoskeletons use external power sources (batteries, accumulators) or cable connections to run sensors and actuators, while passive exoskeletons resort to the kinetic energy of the user; there are also the so-called pseudo-passive exoskeletons, which are equipped with batteries or accumulators but cannot use them for actuation;
2. by localization: exoskeletons of the upper limbs, exoskeletons of the lower limbs, and others;
3. by tasks: military, medical, industrial, space, etc.;
4. by weight: from ultralight (up to 2 kg) to heavy (more than 30 kg);
5. by user mobility: mobile and stationary.

An informative survey of current exoskeletons is provided in [13]. One can conclude that exoskeletons are currently being developed in many countries of the world, primarily in the United States, Japan, Israel and Germany [14,15]. Examples of lower limb exoskeletons include ReWalk [16] and eLEGS [17]. Both exoskeletons are addressed to patients with complete or partial lower extremity paralysis. The basis of the ReWalk design is a set of sensors that track the inclination of the body and activate the leg-supporting devices depending on the current body position. eLEGS functions in a similar way: information provided by the sensors is processed in real time, and some of the exoskeleton's servos are activated, depending on the intentions of the patient. Despite the significance of exoskeletons for the medical domain, a wide range of exoskeletons have been created for military purposes. Well-known examples are the HULC exoskeleton by Lockheed Martin (US) [18] and the XOS exoskeleton [19]. The study [20] provides a list of research priorities in the development of exoskeletons. These include:

• studies on the kinematic and biomechanical properties of current devices in order to formulate the principles of their application;
• development of new methods to determine the parameters of exoskeletons and control their functioning, which would provide researchers with tools for a quick and systematic evaluation of exoskeletons of various types according to selected criteria;
• computer analysis of virtual topographic-anatomical environments when designing biomechanical systems;
• creation and improvement of materials and basic parts of exoskeletons, ensuring their effective operation.

A detailed review of trends in the development of exoskeletons is given in [21]. This work is especially notable due to the fact that it classifies current systems by assistive strategies of application.

3 Medical Robotic Exoskeleton "Remotion"

The developed multifunctional robotic medical exoskeleton "Remotion" (hereinafter referred to as RME) is an exoskeleton designed for the rehabilitation of patients with impaired lower limbs, recovering the motor activity of partially or completely disabled lower extremities; it can also be used as a personal assistive tool by patients with partially or completely lost functions of the lower limbs. RME can be used to mitigate or eliminate the effects of central and peripheral nervous system damage and the consequences of injuries and diseases of the musculoskeletal system accompanied by dysfunction of the lower extremities. Currently there exist four modifications of the RME:

• REM–B – the basic version with basic sensors and control system, designed for rehabilitation in medical institutions;
• REM–E – similar to REM–B, with the addition of electromyography (EMG) channels, designed for rehabilitation in medical institutions;
• REM–F – an enhanced version of REM–E with a system of functional electrical stimulation of the patient's muscles (FES), designed for rehabilitation in medical institutions;
• REM–D – the basic version with basic sensors and control system, designed for the rehabilitation of children in medical institutions.

It should be especially noted that there are quite a few such exoskeletons developed in Russia. There is a number of models and prototypes similar to RME in some ways, but their functionality is significantly different from that of RME (for example, see [22]). RME is based on an external metal frame equipped with four servos and fixed on the patient's body via soft cuffs, belts and fasteners. One of the features of the RME exoskeleton is its modular design and options to program various walking patterns, as well as the availability of electromyography and functional electrical muscle stimulation. The parts of the RME are: a lumbar module, electrically driven hip modules, electrically driven lower-leg modules, foot modules, handles, a fixation system, a power supply system (battery), a control panel, and a charger. In addition, the exoskeleton is equipped with crutches or walkers. The overall 3D design of the exoskeleton is set out in Fig. 1. The lumbar module and the modules of the hips and legs can be adjusted in width, depth and length depending on the size of the patient. For the same purpose, the foot module is equipped with an adjustable hinge, and the system of belts, cuffs and buckles is adjustable. The servos of the hip and lower-leg modules are motor reducers. The exoskeleton control system software allows various motion programs depending on medical purposes.

Fig. 1. The overall 3D design of the exoskeleton (frontal and rear views).

RME allows the rehabilitation of the user in the following modes of operation:

• get up/sit down;
• go/stop;
• turns;
• mark time;
• asymmetric user goniograms;
• diagnostic mode "synchronous electromyography";
• mode "synchronous functional electrostimulation".

In order to improve the effectiveness of rehabilitation sessions, RME is equipped with FES and EMG systems. One of two modes of operation is used: reading EMG signals from certain muscles of the patient, or muscle stimulation at time points synchronized with step periods. The activation and control in the FES and EMG modes are carried out remotely from a PC via the RemotionTool software utility. The connection between the PC and the exoskeleton is set up automatically via a wireless Wi-Fi network. The RemotionTool graphical user interface allows the physician to maintain several medical files, to create and update databases of rehabilitation techniques, and to analyze the data from each goniography session.

4 Exoskeleton Control System

At the moment, the exoskeleton control system is bimodal, and it allows interacting with the exoskeleton via a sensor interface and via voice commands. The former interaction technique assumes the use of a wired control panel equipped with a graphical user interface (GUI) based on a smartphone. The latter technique assumes automatic voice command recognition. The basic idea of such a control system is that the patient can activate the exoskeleton by himself and/or select the mode of functioning recommended to him. A profile with a recommended rehabilitation technique, once transferred from the PC, can be accessed through the touch control panel or contactless voice input/output during future sessions. Beginner patients are not always able to perform a certain movement on their own; therefore, controlling the exoskeleton is the only way to perform the movements. Voice input significantly increases the naturalness and effectiveness of control of the exoskeleton. Intelligent voice control is preferable when choosing from a relatively small number of commands available for execution, due to speed and convenience (saying a command is always easier than looking for and pushing virtual buttons on the touch screen). Moreover, during the rehabilitation course the patient makes certain efforts to perform certain moves; therefore, touch control may negatively affect his concentration. Improving ergonomics by introducing an additional modality would allow the use of similar devices in everyday life, beyond medical institutions. The use of intelligent methods of exoskeleton control, in particular voice control, has been implemented in a number of Russian and foreign projects. A good example would be the ExoAtlet exoskeleton by the Russian company Exoatlet Ltd. [23], or the exoskeleton of the lower limbs ARKE [24] by Bionic Laboratories Corp., Canada. The latter example is particularly interesting thanks to the use of the Alexa system (Amazon) for speech recognition purposes, integrated into the exoskeleton control system. A similar solution was chosen by the developers of RME when designing a voice control system for the exoskeleton. For the purposes of automatic speech recognition in the RME project, the open-source software of the Android OS was used. It is now possible to convert an audio signal into a textual representation on Android-running mobile devices [25]. The software of the voice interface module is Java-based and was created within the IDE Android Studio 3.1.3. The logical structure of the voice interface is shown in Fig. 2. Among the salient features of the developed voice interface are:

1. automatic conversion of isolated Russian voice commands into a textual representation;
2. sending the code of the recognized speech command to the electronic control system of the RME before the further actuation;
3. duplication of the progress of the voice control process using the graphical interface of the smartphone.

Fig. 2. The logical structure of the RME voice interface.

The complete list of voice commands currently available to the user is presented in Table 1. Before each voice command, the user must either press a virtual button on the screen, which turns on the voice input mode, or say an activation keyword, which reduces the probability of false alarms of the voice control module. The user can choose one of three keywords: (1) "Robot", (2) "Command", and (3) "Execute" ("Robot", "Komanda", and "Vypolni" in Russian). In order to reduce power consumption, the audio stream captured through the smartphone's microphone is checked for any speech-like audio signal, and, in case of successful detection, the command detection mode is activated. If the activation command matches the preferred keyword, the speech signal following the activation command is considered as the control command. The software delivers the code of the recognized command to the IP address specified in the settings, in accordance with the developed protocol. The recognized command is then displayed in the graphical interface of the system and is duplicated by voice via the smartphone speakers. If the input speech command is not recognized by the system correctly, the user is requested to repeat the attempt. In order to increase the safety level and foolproofness, actuations of the exoskeleton are not executed immediately after the recognition process, but upon active user confirmation: the execution of the action takes place after a certain manipulation by the user (pressing a button on the crutch).

Table 1. Voice command list and the corresponding actuations by the exoskeleton.

Voice command (in Russian)   Actuation performed by the exoskeleton
'Take a step'                One step forward
'Go'                         Continuous steps forward
'Stop'                       Keeping a standing position
'Get up'                     Getting up from a sitting position
'Sit down'                   Take a seated position
'Turn left'                  Turning left 90◦
'Turn right'                 Turning right 90◦
'Abort'                      Cessation of the last active actuation
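The activation-keyword logic described above can be summarized by the following Python sketch; it is an illustrative model of the dispatch flow, not the Java implementation of the RME software, and the numeric command codes and the send() callback are assumptions made for the example:

```python
# Hypothetical command codes; the real protocol codes are not published here
ACTIVATION_KEYWORDS = {"robot", "komanda", "vypolni"}
COMMAND_CODES = {"take a step": 1, "go": 2, "stop": 3, "get up": 4,
                 "sit down": 5, "turn left": 6, "turn right": 7, "abort": 8}

def handle_utterance(text, preferred_keyword, send):
    """Dispatch a recognized utterance of the form '<keyword> <command>'."""
    words = text.lower().split()
    if not words or words[0] != preferred_keyword:
        return False                 # no activation keyword: ignore the input
    code = COMMAND_CODES.get(" ".join(words[1:]))
    if code is None:
        return False                 # unrecognized command: ask user to repeat
    send(code)                       # deliver the code to the control system
    return True                      # actuation still awaits user confirmation

handle_utterance("robot take a step", "robot", send=print)  # prints 1
```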

5 Conclusions

The paper presents the intelligent human-machine interface for controlling the robotic lower limbs exoskeleton "Remotion" and provides specific technical details of the exoskeleton architecture. The conclusions that may be drawn are the following:

1. the key feature of the Remotion exoskeleton is the use of an intelligent user interface, which makes it possible to improve the ergonomics of the exoskeleton;
2. the application of several channels of information exchange between the user and the robotic system increases the level of naturalness of the HMI, which makes it possible to improve the ergonomics of the system;
3. a combination of intelligent control methods, such as a speech interface, sensor control, and voice and visual hints, significantly increases the safety level of the patient wearing the exoskeleton;
4. since the current projects in the area of nervous system interfaces are highly relevant for assistive exoskeleton development [26,27], a prototype of a nervous system interface design is currently being developed.

Acknowledgments. This work was performed within the joint project "Creating a high-tech production of multifunctional robotic exoskeleton for medical purposes (RME)" No. 2017-218-09-1807, and as a part of state research No. 0073-2019-0005.


References 1. Kapustin, A.V., Loskutov, Yu.V., Skvortsov, D.V., Nasybullin, A.R., Klyuzhev, K.S., Kudryavtsev, A.I.: Circuit solutions for the management of a rehabilitation exoskeleton for medical purposes. Vestnik Povolzhskogo gosudarstvennogo tekhnologicheskogo universiteta 2(38), 77–86 (2018) 2. Antsaklis, P.J., Passino, K.M.: An Introduction to Intelligent and Autonomous Control. Kluwer Academic Publishers (1993) 3. Shcherbatov, I.A.: Intellectual control of robotic systems in conditions of uncertainty. Vestnik Astrakhanskogo gosudarstvennogo tekhnicheskogo universiteta 1, 73–77 (2010) 4. Karpov, A.A., Yusupov, R.M.: Multimodal interfaces of human-computer interaction. Herald Russ. Acad. Sci. 88(1), 67–74 (2018) 5. Ronzhin, A.L., Yusupov, R.M., Li, I.V.: Speech and Multimodal Interfaces. Moscow (2006) 6. Ushakov, I.B., Karpov, A.A., Kryuchkov, B.I., Polyakov, A.V., Usov, V.M.: Promising solutions in the field of medical robotics to support crew life and reduce medical risks in space flight. Aviakosmicheskaya i ekologicheskaya meditsina 49(6), 76–83 (2015) 7. World Robotics - Service Robots 2017: Statistics, Market Analysis, Forecasts and Case Studies. VDMA Verlag, Frankfurt-am-Main (2017) 8. Ermolov, I.L., Knyaz’kov, M.M., Kryukova, A.A., Sukhanov, A.N., Kryuchkov, B.I., Usov, V.M.: Method of controlling an exoskeleton device using the system of recognition of arm movements on basis of biosignals from the skeletal muscles of a human operator’s arms. Pilotiruemye polety v kosmos 4(17), 80–93 (2015) 9. Ferris, D.: The exoskeletons are here. J. Neuroeng. Rehabil. 6(17) (2009) 10. Vorob’ev, A.A., Andryushchenko, F.A., Ponomareva, O.A., Solov’eva, I.O., Krivonozhkina, P.S.: Controversial terminology and classification of exoskeletons (Analytical review, own data, clarifications, suggestions). Volgogradskii nauchnomeditsinskii zhurnal 3(47), 14–20 (2015) 11. Herr, H.: Exoskeletons and orthoses: classification, design challenges and future directions. J. Neuroeng. Rehabil. 6(21) (2009) 12. Gorgey, A.S.: Robotic exoskeletons: the current pros and cons. World J. Orthop. 9(9), 112–119 (2018) 13. Vorob’ev, A.A., Zasypkina, O.A., Krivonozhkina, P.S., Petrukhin, A.V., Pozdnyakov, A.M.: Exoskeleton - the state of the problem and the prospects for the introduction of the system of habilitation and rehabilitation of persons with disabilities (analytical review). Vestnik Volgogradskogo gosudarstvennogo meditsinskogo universiteta 2(54), 9–17 (2015) 14. Banala, S.K., Agrawal, S.K., Kim, S.H., Scholz, J.P.: Novel gait adaptation and neuromotor training results using an active leg exoskeleton. IEEE/ASME Trans. Mechatron. 15(2), 216–225 (2010) 15. Banala, S.K., Kim, S.H., Agrawal, S.K., Scholz, J.P.: Robot assisted gait training with Active Leg Exoskeleton (ALEX). IEEE Trans. Neural Syst. Rehabil. Eng. 17(1), 2–8 (2009) 16. Talaty, M., Esquenazi, A., Briceno, J.E.: Differentiating ability in users of the ReWalk(TM) powered exoskeleton: an analysis of walking kinematics. In: IEEE International Conference on Rehabilitation Robotics (2013) 17. Exoskeleton (2019). https://bleex.me.berkeley.edu/research/exoskeleton


18. Bednyak, S.G., Eremina, O.S.: HAL robotic exoskeletons (Feel like a HALc). Sworld 2(1), 49–51 (2009)
19. USA (2019). http://www.army-technology.co
20. Ergasheva, B.I.: Lower limb exoskeletons: brief review. Sci. Tech. J. Inf. Technol. Mech. Opt. 17(6), 1153–1158 (2017)
21. Yan, T., Cempini, M., Oddo, C.M., Vitiello, N.: Review of assistive strategies in powered lower-limb orthoses and exoskeletons. Robot. Auton. Syst. 64, 120–136 (2015)
22. Robopedia (2019). http://robotrends.ru/robopedia/katalog--ekzoskeletov/
23. Exoatlet (2019). https://www.exoatlet.com/ru/node/84/
24. Exoskeleton Report (2019). https://exoskeletonreport.com/product/arke/
25. SpeechRecognizer (2019). https://developer.android.com/reference/android/speech/SpeechRecognizer
26. He, Y., Eguren, D., Azorín, J.M., Grossman, R., Luu, T.Ph., Contreras-Vidal, J.: Brain-machine interfaces for controlling lower-limb powered robotic systems. J. Neural Eng. 15 (2018)
27. Rosen, M.: Mind to motion: brain-computer interfaces promise new freedom for the paralyzed and immobile. Sci. News 184(10), 22–24 (2013)

Central Audio-Library of the University of Novi Sad

Vlado Delić1, Dragiša Mišković1, Branislav Popović1,2,3(B), Milan Sečujski1, Siniša Suzić1, Tijana Delić1, and Nikša Jakovljević1

1 Department for Power, Electronic and Telecommunication Engineering, Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovića 6, 21000 Novi Sad, Serbia
{vdelic,dragisa,bpopovic,secujski,sinisa.suzic,tijanadelic,jakovnik}@uns.ac.rs
2 Department for Music Production and Sound Design, Academy of Arts, Alfa BK University, Nemanjina 28, 11000 Belgrade, Serbia
3 Computer Programming Agency Code85 Odžaci, Železnička 51, 25250 Odžaci, Serbia

Abstract. This paper presents the project Central Audio Library of the University of Novi Sad (CABUNS), aimed at the automated creation of audio editions of textbooks, presentations and other course material using the new technology of text-to-speech synthesis in the Serbian language. The paper describes the architecture and the features of the developed system, from the points of view of both the teachers and assistants who upload course material to the CABUNS server, and the students who can download the audio editions and listen to them (and view them) on their computers and mobile phones. Examples of the first audio editions of textbooks and PowerPoint presentations related to the course Acoustics and Audio Engineering are presented. The paper also analyzes the advantages and drawbacks of this new learning technology, which has the potential to greatly contribute to the quality of higher education, and to education at other levels as well. Finally, the paper presents the most recent results in the development of text-to-speech enabling voice conversion, which means that it will soon be possible to produce an audio edition in the voice of the author of the textbook or of the person who delivers the lecture.

Keywords: Audio library · Audio editions of textbooks · New digital learning technologies · Text-to-speech synthesis · Deep neural networks · Voice conversion

1 Introduction

The ever-faster pace of life and the need for efficient utilization of time resources have led to the popularization of audio books [7]. This technology allows books to be read hands-free and eyes-free, e.g. in public transport, during a vacation,


or a walk. Consequently, many e-learning frameworks have been developed in order to improve course programs in higher education [1]. Although classical audio books typically contain only literary works, over time it has become necessary to make professional literature available in audio format as well. The process of recording high-quality audio books whose contents are read by professional speakers is time-consuming and financially demanding. Therefore, the use of text-to-speech synthesis (TTS) has emerged as a practical and efficient solution. The switch of focus of the speech technology research community from concatenation-based to parametric speech synthesis, which has occurred over the last two decades, has not only improved the general quality and naturalness of synthesized speech, but has also offered a number of solutions enabling synthesis in the voices of a wide range of speakers and in different speech styles, using relatively small speech databases for model training [4–6,12,13]. This feature proved to be very important for expressive synthetic speech, creating opportunities to suit user preferences regarding age, gender, accent and character [14].

The Central Audio-Library of the University of Novi Sad (in Serbian: Centralna audio biblioteka Univerziteta u Novom Sadu - CABUNS) is a project in which the use of audio books aims to improve and facilitate higher education. Bearing in mind the rapid development of the body of human knowledge, for many university courses it is difficult to publish a book which would contain all the relevant information, and learning is often reduced to the use of multimedia presentations or written notes which are rarely informative enough. Within this project, teachers are able to supplement their multimedia presentations with the text spoken during the lectures, which results in a form of official literature from which the students are able to learn, and which is extremely easy to modify at any time. The audio editions of both books and multimedia presentations can be automatically generated and made available on any device connected to the Internet. The architecture of this system is described in [8].

The following section describes the CABUNS system and its utilization. Afterwards, the advantages and disadvantages of the system are analyzed. At the end of this paper, an overview of the conclusions and the directions for further development and improvement of the system is presented.

2 CABUNS System Description and Usage Examples

The CABUNS system is designed as a web portal that integrates TTS-generated audio files and the accompanying visual contents. The main architectural aspects are presented in Fig. 1. The core of the system is the ASP.NET framework, the latest generation of Active Server Pages (ASP) technology. This layer is responsible for processing the requests accepted by the Internet Information Services (IIS) server and for distributing work according to the request type. In Fig. 2, the home page of the CABUNS portal is presented in Serbian, its original language. In addition to the essential project information, links in the menu offer access to lectures according to the selected faculty and course at the University of Novi Sad.

Fig. 1. CABUNS architecture and data flow diagram

Fig. 2. Home page of the CABUNS portal

Within the Lectures page (Serb. Predavanja), it is possible to add new content using two supported formats from the Microsoft Office package: PowerPoint presentations and Word documents. The layout of a page for uploading a lecture is presented in Fig. 3. Initially, this option is restricted to registered users, i.e. the teachers of the University of Novi Sad, but it will soon be extended to other universities in Serbia. Since the processing of lecture files can be time-consuming, the entire process of parsing and generating speech from text is not performed in real time, but as a background process. This process involves:

• parsing a document in order to extract the pages and text that will be sent to the speech synthesis module,
• generating visual interfaces for various devices (personal computers, tablets and mobile phones).

At the end of the conversion process, which may take from several minutes up to several hours depending on the size of the input material, the teacher who added a new teaching unit receives an email from the CABUNS server stating that the audio edition is ready for review. After that, the teacher can review the audio edition of the book or presentation (Figs. 4 and 5) and, if satisfied, allow access to authorized students. If the release does not satisfy the requirements, or if something needs to be added or changed, there is always an option to upload a revised version of the document to the server, and an updated version of the synchronized audio-visual edition will soon be generated. In [8], the applied architecture and more details of the end-to-end processing of lecture files are briefly presented.

It should be emphasized that in the case of PowerPoint presentations, the audio editions are generated from the contents of the standard slide notes below each slide (Fig. 5), where teachers can write any text: something they would otherwise say during the lecture, several paragraphs from the course handbook, or exam questions related to the slide; a sketch of this extraction step is given below. In the case of Word files, the entire text is extracted page by page and used for TTS synthesis (Fig. 4). In both cases, the visual content is formed by converting the files into PDF format, in order to be accessible on various devices; the audio content is reproduced using the standard GUI of the player below the shown page. The language identifier is one of the parameters that has to be set during the uploading process. The system currently supports text-to-speech synthesis only for Serbian, Croatian and English.
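As mentioned above, the slide notes are the source of the text sent to the synthesis module. The following is a minimal sketch of that extraction step, written in Python with the python-pptx library as a stand-in for the portal's actual server-side (.NET) parsing code; function and path names are illustrative assumptions.

```python
# Sketch: collect per-slide notes text from an uploaded .pptx file for TTS.
# python-pptx here stands in for the portal's actual .NET parsing code.
from pptx import Presentation

def extract_slide_notes(pptx_path):
    notes = []
    for slide in Presentation(pptx_path).slides:
        text = ""
        if slide.has_notes_slide:  # notes may be absent on some slides
            text = slide.notes_slide.notes_text_frame.text
        notes.append(text)
    return notes  # one string per slide, to be sent to the TTS module
```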

Fig. 3. The layout of a page for uploading lectures; the supported formats are PowerPoint and Word

The TTS synthesizer is based on the principles of parametric speech synthesis using neural networks [6]. Although simple feedforward neural networks with 4–6 hidden layers and tangent hyperbolic activations are sufficient for the production of intelligible and natural-sounding synthetic speech, the introduction of LSTM (long short-term memory) neurons additionally improved the quality of the synthesized speech [16]. The synthesizers for all speakers used by the CABUNS system are based on the same architecture. It consists of two networks: one for generating per-state phoneme durations and the other for predicting acoustic parameters. Both networks contain 4 hidden layers. The first three layers consist of neurons with tangent hyperbolic activations, while LSTM neurons are used in the last hidden layer. Acoustic parameters are extracted using the WORLD vocoder [9].

Parametric approaches to synthesis provide a great deal of flexibility in modifying the synthetic voice, both in terms of changing speaker identity and style. An analysis of different methods used for speaker adaptation in NN synthesis is given in [15]. All presented methods rely on training an average speaker model using speech databases containing the voices of multiple speakers. On the other hand, as little as 10 min of target speech can be used to produce synthetic speech resembling the target speaker's voice with satisfactory quality [5].


Fig. 4. One-page example of an uploaded book chapter; playback and navigation controls are given at the bottom of the page

Based on this idea, an approach for creating a synthetic voice from amateur speaker recordings is presented in [4]. This procedure could be applied to create audio material in a lecturer's voice. The TTS system should also support synthesis in different speaking styles [2]. For example, the presentational style is the style best suited for the audio materials generated in CABUNS. Speech in this style is clear, relatively slow, with frequent emphasis of keywords. Different methods for style modeling are compared in [12]. It is shown that only 5 min of speech per style (excluding the neutral parts of the database) is sufficient for creating speech in the supported styles.
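A minimal Keras sketch of the synthesizer architecture described in this section (three tanh hidden layers followed by an LSTM hidden layer, predicting WORLD vocoder parameters) is given below; the layer widths and feature dimensions are illustrative assumptions, not values from the paper.

```python
# Sketch of the CABUNS acoustic-parameter network: three hidden layers with
# tangent hyperbolic activations and an LSTM as the last hidden layer.
# Dimensions (linguistic input, vocoder output, layer width) are assumptions.
from tensorflow import keras
from tensorflow.keras import layers

def build_acoustic_network(n_linguistic=420, n_vocoder=187, width=512):
    return keras.Sequential([
        layers.Input(shape=(None, n_linguistic)),   # sequence of frames
        layers.Dense(width, activation="tanh"),
        layers.Dense(width, activation="tanh"),
        layers.Dense(width, activation="tanh"),
        layers.LSTM(width, return_sequences=True),  # last hidden layer
        layers.Dense(n_vocoder),                    # WORLD parameters per frame
    ])

model = build_acoustic_network()
model.compile(optimizer="adam", loss="mse")
```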

Fig. 5. An example of a slide presented on a mobile phone screen

3 Strengths and Weaknesses of the CABUNS Project

The new era requires course material to strike a balance between the available time and the ever-growing body of knowledge. This is particularly evident in engineering disciplines, for which the pace of technological development is significantly accelerating. Fortunately, new technologies also allow the development of new teaching and learning methods. Specifically, this paper deals with the application of information and communication technologies (ICT) and TTS technologies for the automated generation of audio editions of course material. Audio editions allow students to listen to the lessons while they are resting, walking or traveling. Even when they are actively learning, the audio library allows them to listen instead of reading, giving them more time to view various details in the illustrations instead of looking at the text while reading. For slow readers and people who easily lose their focus, the audio library can maintain a steady pace of reading and stabilize their level of concentration [3]. Audio books are also very important for people with disabilities (visual impairment, dyslexia or any physical disability which makes it difficult to use a book or a computer) [10,11]. Classic production of audio releases relies on speakers who read the texts, which is time-consuming and error-prone. It is much more practical and efficient to create an audio release automatically, using a speech synthesizer.

Shortcomings of the CABUNS concept involve several unresolved issues. Specifically, TTS can only convert text into speech, but course material often contains figures, formulas, or tables. This is of course an issue only in the case of


an edition which is entirely audio, intended to be played without being viewed on a screen. For such cases, teachers are advised to compose textual descriptions of figures, formulas and tables, in the way they would describe them during their lectures. For instance, nobody reads a whole table row by row; usually the teacher outlines several details and then talks about them.

The audio library provides a new way of learning that will motivate many students, and provides a wide range of people with disabilities with a practical means to achieve equality in education more easily. On the other hand, teachers are given the opportunity to make it easier for students to use appropriate course material. Students will be able to learn from course material approved by teachers, instead of using inadequate or sometimes even incorrect notes, which is often the case today. Additionally, if the synthesized speech resembles the teacher's voice, it could serve as a better reminder to the student of the contents of the lecture. Of course, this is only a hypothesis that will be examined using the results achieved in the development of the ICT and TTS technologies implemented by the authors of this paper.

4 Conclusion

In this paper, the Central Audio-Library of the University of Novi Sad has been presented. The library enables students to access audio editions of tutorials and textbooks, as well as presentations of lectures enriched with synthesized speech. Until recently, creating an audio book presumed engaging people to read a text, while in CABUNS the audio content is generated automatically from text, which greatly accelerates the process of creation and reduces its cost. For instance, the CABUNS platform allows the content of presentations used in the lectures to be enriched with appropriate explanations based on the text found in the notes of each slide. Contemporary TTS technologies provide not only more natural and more pleasant speech, but also enable synthesis using the voice of the lecturer, therefore enabling the creation of multimedia lectures without major investments in high-quality audio or video equipment. The platform interface is intuitive and transparent: teachers upload books in .docx or multimedia presentations in .pptx format to the project website and fill in the appropriate fields, while the download of the audio and multimedia versions of lectures follows a standard procedure. The main advantage of the proposed concept is its ability to automatically generate audio editions of textbooks, in contrast to standard voice recording, which is time-consuming and error-prone. The CABUNS platform provides teachers with a more efficient way of generating reliable audio or multimedia course material, which is also more easily accessible to the students. One of its major advantages is also that it provides accessibility for people with disabilities. The further steps in the research and development of the proposed concept will include broader institutional networking of stakeholders, further development of techniques for synthesizing speech in an arbitrary voice based on a short speech sample, as well as finding practical solutions for the verbal interpretation of figures, formulas and tables and their inclusion into audio editions.


Acknowledgments. The work described in this paper was supported in part by the Ministry of Education, Science and Technological Development of the Republic of Serbia, within the project “Development of Dialogue Systems for Serbian and Other South Slavic Languages”, and the Provincial Secretariat for Higher Education and Scientific Research, within the project “Central Audio-Library of the University of Novi Sad”, No. 114-451-2570/2016-02.

References

1. Aasbrenn, M., Bingen, H.: Maximizing flexibility and learning; using learning technology to improve course programs in higher education. In: ICDE 23rd World Conference, Maastricht MECC, The Netherlands (2009)
2. Abe, M.: Speaking styles: statistical analysis and synthesis by a text-to-speech system. In: Progress in Speech Synthesis, pp. 495–510. Springer, New York (1997)
3. Beer, K.: Listen while you read. School Libr. J. 44(4), 30–35 (1998)
4. Delić, T., Suzić, S., Sečujski, M., Ostojić, V.: Deep neural network speech synthesis based on adaptation to amateur speech data. In: 5th International Conference on Electrical, Electronic and Computing Engineering (IcETRAN), Subotica, Serbia, pp. 1249–1252 (2018)
5. Delić, T., Suzić, S., Sečujski, M., Pekar, D.: Rapid development of new TTS voices by neural network adaptation. In: 17th International Symposium INFOTEH-JAHORINA, Jahorina, Bosnia and Herzegovina, pp. 1–6 (2018)
6. Delić, T., Sečujski, M., Suzić, S.: A review of Serbian parametric speech synthesis based on deep neural networks. Telfor J. 9(1), 32–37 (2017). https://doi.org/10.5937/telfor1701032D
7. Have, I., Stougaard Pedersen, B.: Digital Audiobooks: New Media, Users, and Experiences. Routledge, New York (2015)
8. Mišković, D., Gnjatović, M., Jakovljević, N., Delić, V.: Development of the audio library of the University of Novi Sad. In: 11th DOGS, Digital Speech and Image Processing, Novi Sad, Serbia, pp. 53–56 (2017)
9. Morise, M., Yokomori, F., Ozawa, K.: WORLD: a vocoder-based high-quality speech synthesis system for real-time applications. IEICE Transactions on Information and Systems E99-D(7), 1877–1884 (2016)
10. Nees, M.A., Berry, L.F.: Audio assistive technology and accommodations for students with visual impairments: potentials and problems for delivering curricula and educational assessments. Perform. Enhancement Health 2(3), 101–109 (2013)
11. Ozgur, A.Z., Kiray, H.S.: Evaluating audio books as supported course materials in distance education: the experiences of the blind learners. Turkish Online J. Educ. Technol. TOJET 6(4), 30–35 (2007)
12. Suzić, S., Delić, T., Jovanović, V., Sečujski, M., Pekar, D., Delić, V.: A comparison of multi-style DNN-based TTS approaches using small datasets. In: 13th International Conference on Electromechanics and Robotics “Zavalishin's Readings”, ER(ZR)-2018, St. Petersburg, Russia, pp. 1–6 (2018)
13. Suzić, S., Delić, T., Ostrogonac, S., Đurić, S., Pekar, D.: Style-code method for multi-style parametric text-to-speech synthesis. SPIIRAS Proc. 5(60), 216–240 (2018). https://doi.org/10.15622/sp.60.8
14. Székely, É., Cabral, J.P., Abou-Zleikha, M., Cahill, P., Carson-Berndsen, J.: Evaluating expressive speech synthesis from audiobooks in conversational phrases. In: International Conference on Language Resources and Evaluation, Istanbul, Turkey, pp. 3335–3339 (2012)


15. Wu, Z., Swietojanski, P., Veaux, C., Renals, S., King, S.: A study of speaker adaptation for DNN-based speech synthesis. In: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH, Dresden, Germany (2015)
16. Zen, H., Agiomyrgiannakis, Y., Egberts, N., Henderson, F., Szczepaniak, P.: Fast, compact, and high quality LSTM-RNN based statistical parametric speech synthesizers for mobile devices. In: 17th Annual Conference of the International Speech Communication Association, INTERSPEECH, San Francisco, CA, USA, pp. 2273–2277 (2016)

Applying Ensemble Learning Techniques and Neural Networks to Deceptive and Truthful Information Detection Task in the Flow of Speech

Alena Velichko(B), Viktor Budkov, Ildar Kagirov, and Alexey Karpov

St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences (SPIIRAS), St. Petersburg, Russia
{velichko.a,budkov,kagirov,karpov}@iias.spb.su

Abstract. This paper presents the results of experiments on applying ensemble learning techniques and neural networks to the paralinguistic analysis of deceptive and truthful statements in the flow of speech. Based on an analysis and comparison of different approaches to the issue, we propose using a mixture of such methods. The Real-Life Trial Deception Detection Dataset was used for both training and testing. All the experiments were performed using 10-fold cross-validation. Using two-layer neural networks, k-nearest neighbors and random forests for classification, with principal component analysis for preprocessing, results in a UAR of 65.0% and 70.0% in the case of average and majority voting, respectively.

Keywords: Deception detection in speech · Computational paralinguistics · Speech technology · Machine learning

1 Introduction

Paralinguistics studies the non-verbal aspects of human speech. A person's emotional and psychological states are correlated. In recent years there have been many studies devoted to the development of contactless methods for the detection of deceptive information in speech, due to the fact that contact methods are not universal [1]. Non-contact methods based on the automatic analysis of non-verbal signals during conversation are the focus of paralinguistics. Systems for the detection of deceptive information can be used in practice in biometric research with polygraphs, the prevention of telephone fraud, etc. The organizers of ComParE-2016, held as part of INTERSPEECH 2016, presented a speech dataset that consisted of truthful and deceptive speech utterances, and a baseline system for the detection of deceptive information in the flow of speech. This system demonstrated a UAR (Unweighted Average Recall) of 68.3% [2]. The winners [3] elaborated a system that made use


of prosodic cues and a base feature set extracted from the dataset. This system reached a UAR of 74.9%. In 2017, a system was presented [4] that combined acoustic and lexical features with a Random Forest classifier. This system demonstrated the best F1-measure of 63.9%, with a precision of 76.1%. The aforementioned studies, as well as a wide range of other investigations, are devoted to feature extraction and feature processing, because the detection of deceptive information in the flow of speech is based on the hypothesis that telling lies causes distress affecting the psychophysiological state of a person and the parameters of his/her speech. Paper [5] presents a comparative analysis of classification methods for the detection of deceptive information in speech utterances. The Real-Life Trial Deception Detection Dataset [6] was chosen for both learning and testing. Audio data was extracted from the multimodal dataset with FFmpeg (https://www.ffmpeg.org/) and further processed with Praat (http://www.fon.hum.uva.nl/praat/), with manual annotation of the interviewers' phrases. The total number of audio recordings labeled according to the annotation accompanying the dataset was 195. Audio features were extracted using the openSMILE toolkit [7]. The ComParE'13 configuration, which includes 6373 acoustic supra-segmental features, was used with openSMILE. The WEKA software tool was applied for the classification procedure [8]. The best results using 10-fold cross-validation were achieved with the following classification methods: Random Forest (UAR = 79.3%) [9], k-Nearest Neighbors [10] and Classification via Regression [11] with k-Nearest Neighbors (UAR = 76.3%).
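For reference, the feature extraction step can be driven as in the sketch below, based on openSMILE's standard command-line interface; the exact path of the ComParE 2013 configuration file shipped with openSMILE is an assumption.

```python
# Sketch: extract the 6373 ComParE'13 features for one utterance with
# openSMILE's SMILExtract. The config path is an assumption; adjust it to
# the location of the ComParE 2013 configuration in your openSMILE install.
import subprocess

def extract_compare13(wav_path, out_arff, config="config/ComParE_2013.conf"):
    subprocess.run(
        ["SMILExtract", "-C", config, "-I", wav_path, "-O", out_arff],
        check=True,
    )

# extract_compare13("utterance_001.wav", "features.arff")
```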

2 Proposed System

This paper presents an alternative architecture of the system described above, replacing the single classifier with a mixture of classifiers (in this particular case, three). The final prediction is made using the average and majority voting methods. This suggestion is based on the observation that ensemble learning tends to produce more accurate predictions than single-classifier approaches. Moreover, such an architecture is more general and can help to reduce the variance term of the error. The proposed architecture of the system is shown in Fig. 1 and the composition is shown in Fig. 2. For training and testing, the openSMILE ComParE'13 features were extracted from the Real-Life Trial Deception Detection Dataset. In this study, min-max normalization and principal component analysis, aimed at finding the most relevant features, were used. According to the comparative analysis, the methods with the best performance were chosen: Random Forest and k-Nearest Neighbors. In addition, the composition of techniques includes a neural network.
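A compact scikit-learn sketch of the proposed composition is shown below. The hyperparameter values (k = 3, 100 trees) anticipate those reported in Sect. 3; the MLP classifier stands in for the Keras network used by the authors, so this is an approximation of, not a drop-in for, the actual system.

```python
# Sketch of the ensemble: min-max normalization, PCA, and three voting
# classifiers. "hard" voting corresponds to majority voting; "soft" voting
# averages class probabilities, as in average voting.
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

ensemble = make_pipeline(
    MinMaxScaler(),
    PCA(n_components=0.99),  # keep components explaining 99% of the variance
    VotingClassifier(
        estimators=[
            ("knn", KNeighborsClassifier(n_neighbors=3)),
            ("rf", RandomForestClassifier(n_estimators=100)),
            ("mlp", MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=500)),
        ],
        voting="hard",
    ),
)
# Usage: ensemble.fit(X_train, y_train); y_pred = ensemble.predict(X_test)
```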


Fig. 1. Proposed system with ensemble

Fig. 2. Ensemble of k-Nearest Neighbors, Random Forest and Neural Network

3 Experiments and Results

In this work, audio data from the deceptive speech database named the Real-Life Trial Deception Detection Dataset was used, as it is an open-source dataset. The total amount of audio data was 195 audio files with an overall duration of 49 min and 41 s after preprocessing. The following open-source software was used: Python (v3.6), TensorFlow (v1.12.0) [12] with Keras (v2.2.4) [13], and scikit-learn (v0.20) [14]. The results of experiments with default model parameters are shown in Table 1. The best model parameters were found through a grid search. Those parameters improved the performance of the single models as well as the overall performance of the ensemble; the corresponding results are also presented in Table 1. In this table, AV is average voting and MV is majority voting. In k-Nearest Neighbors, k is set to 3; in Random Forest, the number of estimators is 100. Moreover, we added one additional layer to the neural network, so it became a two-layer network with dropout between the input layer and the hidden layers. The architectures are shown in Fig. 3, where dense_1 is the input layer.


Table 1. Results of experiments with default model parameters and best parameters found by grid search.

Method                 Default model parameters     Best model parameters
                       Accuracy, %    UAR, %        Accuracy, %    UAR, %
k-Nearest Neighbors    55.0           55.0          60.0           60.0
Random Forest          55.0           55.0          65.0           65.0
Neural Network         60.0           60.0          60.0           60.0
Ensemble (AV, MV)      65.0, 65.0     65.0, 65.0    65.0, 70.0     65.0, 70.0

Fig. 3. Architecture of Neural Network in experiments with default parameters (upper) and architecture in experiments with grid search (lower).

Parameters of both architectures are presented in Table 2. Both architectures use kernel regularizers with a penalty value of 0.001 on the layers and a dropout of 0.6 between layers.

Table 2. Parameters of Neural Network architectures.

Parameter                      Architecture 1       Architecture 2
Number of neurons on layers    (128, 1)             (128, 64, 1)
Activation                     (ReLU, sigmoid)      (ReLU, ReLU, sigmoid)
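A hedged Keras sketch of Architecture 2 follows; the input dimension of 178 assumes the PCA-reduced feature space reported in Table 3, and the training setup (optimizer, loss) is an assumption rather than a detail given in the paper.

```python
# Sketch of Architecture 2 from Table 2: Dense(128, ReLU) -> Dense(64, ReLU)
# -> Dense(1, sigmoid), with L2 kernel regularization (0.001) and dropout
# (0.6) between layers, as described in the text above.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_architecture_2(n_features=178):
    l2 = regularizers.l2(0.001)
    return keras.Sequential([
        layers.Input(shape=(n_features,)),
        layers.Dense(128, activation="relu", kernel_regularizer=l2),  # dense_1
        layers.Dropout(0.6),
        layers.Dense(64, activation="relu", kernel_regularizer=l2),
        layers.Dropout(0.6),
        layers.Dense(1, activation="sigmoid", kernel_regularizer=l2),
    ])

model = build_architecture_2()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```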

The principal component analysis method was used to transform the feature space into a lower-dimensional, uncorrelated space in which the variance of the original space is maximally retained. The number of principal components needed to preserve the variance in the data was chosen experimentally.

Table 3. Correlation between number of components in PCA and Accuracy/UAR

Variance retained    Number of features    Accuracy, %    UAR, %
Without PCA          6373                  55.0           55.0
0.85                 81                    55.0           55.0
0.95                 138                   60.0           60.0
0.99                 178                   65.0           65.0

Results


of these experiments with the default parameters of the ensemble are presented in Table 3. The maximum variance was achieved with 178 principal components, when accuracy and UAR were 65.0%. The proposed deception detection system outperforms the baseline system presented within the challenge by 1.7% UAR, but underperforms the systems presented in [3,4] by 4.9% and 6.1%, respectively. The results of the experiments have shown that the proposed methods can conflict with one another in the ensemble. The principal component analysis used during the postprocessing of audio features decreased the noise in the data, which is important for the k-Nearest Neighbors method and the Neural Network, but it also reduced the overall number of features, affecting the Random Forest performance, so it performed worse than in the previous paper. On the other hand, one can see the potential of ensembles and neural networks applied to the deception detection in speech task. Taking into account the above described advantages and disadvantages of ensembles and neural networks, we are planning to focus on audio feature preprocessing and postprocessing in our future work. Also, we are going to add modalities such as video and text, as well as to use more complicated architectures of neural networks and of the ensemble in general.

4 Conclusion

This paper presents a system for the detection of deception based on ensemble learning techniques and neural networks, instead of the single-classifier methods used in [5]. A comparative analysis of single classification methods is provided, as well as of a mixture of those methods. The system using ensemble learning demonstrated results that outperform those of single-classifier approaches and displayed the advantages of such architectures. However, it should be pointed out that when the majority of single classifiers perform with low accuracy, they can degrade the overall ensemble performance due to wrong predictions. The advantages and disadvantages of ensemble learning were identified, and on the basis of these results we determined the direction for future work.

Acknowledgments. This research is supported by the Russian Science Foundation (project No. 18-11-00145).

References

1. Velichko, A., Budkov, V., Karpov, A.: Analytical survey of computational paralinguistic systems for automatic recognition of deception in human speech. Informatsionno-upravliaiuschie sistemy (Inf. Control Syst.) 90(5), 30–41 (2017). (in Russian)
2. Schuller, B.: The INTERSPEECH 2016 computational paralinguistics challenge: deception, sincerity & native language. In: Proceedings of INTERSPEECH-2016, San Francisco, USA, pp. 2001–2005 (2016)


3. Montacié, C., Caraty, M.-J.: Prosodic cues and answer type detection for the deception sub-challenge. In: Proceedings of INTERSPEECH-2016, San Francisco, USA, pp. 2016–2020 (2016)
4. Mendels, G., Levitan, S.I., Lee, K., Hirschberg, J.: Hybrid acoustic-lexical deep learning approach for deception detection. In: Proceedings of INTERSPEECH-2017, Stockholm, Sweden, pp. 1472–1476 (2017)
5. Velichko, A., Budkov, V., Kagirov, I., Karpov, A.: Comparative analysis of classification methods for automatic deception detection in speech. In: Proceedings of 20th International Conference on Speech and Computer SPECOM-2018, Leipzig, Germany, LNAI, vol. 11096, pp. 737–746. Springer (2018)
6. Pérez-Rosas, V., Abouelenien, M., Mihalcea, R., Burzo, M.: Deception detection using real-life trial data. In: Proceedings of the 2015 ACM International Conference on Multimodal Interaction, Seattle, USA, pp. 59–66 (2015)
7. Eyben, F., et al.: Recent developments in openSMILE, the Munich open-source multimedia feature extractor. In: Proceedings of the 2013 ACM Multimedia (MM), Barcelona, Spain, pp. 835–838 (2013). https://doi.org/10.1145/2502081.2502224. ISBN 978-1-4503-2404-5
8. Frank, E., Hall, M.A., Witten, I.H.: The WEKA Workbench. Online Appendix for “Data Mining: Practical Machine Learning Tools and Techniques”, 4th edn. Morgan Kaufmann (2016)
9. Frank, E., Wang, Y., Inglis, S., Holmes, G., Witten, I.H.: Using model trees for classification. Mach. Learn. 32, 63–76 (1998)
10. Kukreja, M., Johnson, S.A., Stafford, P.: Comparative study of classification algorithms for immunosignaturing data. BMC Bioinf. 13, 139 (2012)
11. Fix, E., Hodges, J.L.: Discriminatory Analysis, Nonparametric Discrimination: Consistency Properties. Technical report 4, USAF School of Aviation Medicine, Randolph Field, Texas (February 1951)
12. Abadi, M., et al.: TensorFlow: Large-scale machine learning on heterogeneous systems (2015)
13. Chollet, F.: Keras (2015). https://keras.io
14. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

Security for Intelligent Distributed Computing - Machine Learning vs. Chains of Trust

Model Checking to Detect the Hummingbad Malware

Fabio Martinelli1(B), Francesco Mercaldo1,2, Vittoria Nardone3, Antonella Santone2, and Gigliola Vaglini4

1 Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa, Italy
{fabio.martinelli,francesco.mercaldo}@iit.cnr.it
2 Department of Bioscience and Territory, University of Molise, Pesche, IS, Italy
{francesco.mercaldo,antonella.santone}@unimol.it
3 Department of Engineering, University of Sannio, Benevento, Italy
[email protected]
4 Department of Information Engineering, University of Pisa, Pisa, Italy
[email protected]

Abstract. Typically, when a platform is widely disseminated, malware writers focus their attention on it in order to perpetrate attacks on the widespread environment. This is the reason why nowadays there exists a series of attacks targeting the Android operating system, the most common platform available for mobile devices. In this paper we present a tool implementing a model checking based approach to identify Android malware. Furthermore, the tool is also useful to localize the malicious behaviour within the code of the application under analysis. We evaluate the effectiveness of the tool on real-world samples belonging to the HummingBad malware family, one of the most recent and aggressive Android threats.

Keywords: Model checking · Formal methods · Malware · Android · Security

1 Introduction and Related Work

The mobile threat landscape continues to grow and evolve, with several contributing factors. As a matter of fact, the increasing speed, power and storage space available on mobile devices have led to more people using their devices in more places for online shopping, managing their finances and paying their bills [6,13]. This makes mobile a much more valuable target for malware writers [4,11,12]. This trend is confirmed by Nokia analysts: during the second half of 2016, the increase in smartphone infections was 83%, following on the heels of a 96% increase during the first half of the same year (https://pages.nokia.com/8859.Threat.Intelligence.Report.html).

The most prevalent trend of 2016 was Trojans gaining super-user privileges. In order to obtain these privileges, they use a variety of vulnerabilities that are usually patched in the newer versions of Android (https://securelist.com/files/2017/02/Mobile_report_2016.pdf). Unfortunately, most user devices do not receive the latest system updates, making them vulnerable. Root privileges provide these Trojans with almost unlimited possibilities, allowing them to secretly install other advertising applications, as well as display ads on the infected device, often making it impossible to use the smartphone. In addition to aggressive advertising and the installation of third-party software, these Trojans can even buy applications on Google Play (https://www.mcafee.com/us/resources/reports/rp-mobile-threat-report-2016.pdf).

In this landscape, a new malware family called HummingBad has infected millions of devices and brings in millions of dollars of fake ad revenue. The HummingBad family was discovered by Check Point analysts in February 2016 (https://blog.checkpoint.com/2016/02/04/hummingbad-a-persistent-mobile-chainattack/); it is able to establish a persistent rootkit on Android devices, generates fraudulent ad revenue, and installs additional fraudulent apps. Currently, it is estimated to be generating $300,000 per month in fraudulent ad revenue. HummingBad installs more than 50,000 fraudulent apps every day, and displays more than 20 million ads per day in these apps. With these devices, a group can create a botnet and carry out targeted attacks on businesses or government agencies. The malicious aim of this family is to escalate root privileges in order to perform drive-by-download attacks. HummingBad samples basically consider two attack vectors: the first one is silent and tries to obtain root access; the second one starts when the initial attack fails, and its malicious goal is the same as that of the previous one. This is an interesting technique, able to repeat the attack until it succeeds. When root access is gained, the malware sample establishes communication with a Command and Control server in order to download a list of malicious applications. Afterwards, it silently installs some malicious applications belonging to the downloaded list.

In this paper we propose a model checking based tool with the aim to detect and localize the malicious payload. We evaluate the effectiveness of the proposed solution on the HummingBad family. With regard to the current state of the art, the existing malware detection methods basically make use of either static [18] or dynamic [24] features. We focus on the first category (i.e., the static one), which is the one involved in the proposed method. The analysis of smali code (i.e., the bytecode targeting the Android Dalvik Virtual Machine [20]) is considered in [18]. These papers focus on opcode analysis: the occurrences of opcode n-grams are used, by means of machine learning, to classify apps as benign or malicious. A static feature which has often been used to characterize malware is the set of app permissions. In [8], the authors consider sets of required permissions: their security rules classify applications based on sets of permissions rather than individual permissions, to reduce the number of false positives. In [9], sending SMS messages without confirmation or accessing unique phone identifiers like the IMEI are identified as promising features for malware detection, as legitimate applications ask for these permissions less often. Nevertheless, using only the requested permissions leads to high false positive rates [10]. For example, nearly one third of applications request access to the user location, but far fewer request access to the user location together with the ability to launch at boot time. The authors state that more sophisticated rules and classification features will be required in the future. Recently, the possibility of identifying the malicious payload in Android malware using a model checking based approach has been explored in [14–17]. Starting from the payload behavior definition, the authors formulate logic rules and then test them using a real-world dataset. The main difference between these works and the one we propose is represented by the temporal logic formula able to capture the HummingBad malicious payload. The remainder of the paper is organized as follows: Sect. 2 introduces the method, Sect. 3 illustrates the results of the experiment. Finally, conclusions and future works are given in Sect. 4.

2 The Tool

In this section, our model checking based tool to identify Android malware families is described. In accordance with the model checking technique [5,7,22], we need: a formal model of the system, a set of behavioural properties, and a model checker tool able to verify the properties on the model. Since the model and the properties require a precise notation, we use Milner's Calculus of Communicating Systems (CCS) [19] and the mu-calculus logic [23], respectively, to define them. In this paper, we use Caal (Concurrency Workbench, Aalborg Edition) [1] as the formal verification environment.

Our tool aims to model Android applications. To achieve this goal, we use the code of an APK file to build the formal model. We retrieve the application code, i.e., the Java Bytecode, through a reverse engineering process, and we perform the following steps (sketched below): (i) we use the dex2jar tool (https://sourceforge.net/projects/dex2jar/) to convert the Dalvik Executable file (dex) into a Java Archive file (jar); (ii) we extract the Java classes using the command jar -xvf provided by the Java Development Kit; (iii) we parse the class files using the Bytecode Engineering Library (Apache Commons BCEL, https://commons.apache.org/proper/commons-bcel/). Finally, every Java Bytecode instruction is translated into a CCS process through a Java Bytecode-to-CCS transformation function defined by the authors.
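Driven from Python, steps (i) and (ii) could look like the following sketch; the d2j-dex2jar option shown is an assumption based on common usage of the dex2jar distribution, and step (iii), the BCEL parsing, runs on the Java side and is therefore omitted.

```python
# Sketch of steps (i)-(ii) of the reverse-engineering pipeline. The
# d2j-dex2jar "-o" flag is assumed from common dex2jar usage; "jar -xvf"
# is the JDK command quoted in the text.
import os
import subprocess

def apk_to_class_files(apk_path, jar_path="classes.jar", out_dir="classes"):
    subprocess.run(["d2j-dex2jar", "-o", jar_path, apk_path], check=True)
    os.makedirs(out_dir, exist_ok=True)
    subprocess.run(["jar", "-xvf", os.path.abspath(jar_path)],
                   cwd=out_dir, check=True)
    # the extracted .class files are then parsed (step iii) with Apache
    # Commons BCEL and translated instruction-by-instruction into CCS
```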



Since in the CCS process algebra systems are represented through processes and actions, which correspond to states and transitions, respectively, our model of the system is represented as an automaton. This representation allows us to simulate the normal flow of the instructions. The automaton of an application has a set of labelled edges and a set of nodes. The nodes are the system states, while an edge represents a transition from one state to another (precisely the next state). An edge means that the system can evolve from a state s to a state s′ by performing an instruction a (the label of the edge). For example, the if statement is translated as a non-deterministic choice: the system can evolve from a state s to two different states s′ and s′′, corresponding to the two alternative paths (true/false) of the classical if statement.

In order to identify the malware family, we specify temporal logic formulae written in the mu-calculus logic. The specified formulae encode a specific malicious behaviour which typically characterizes the family. These temporal logic rules are obtained through a manual inspection of a few malware samples and by examining malware technical reports. Finally, we use the Caal tool, which takes as input the formal CCS model (built as described above) and the temporal logic rules written in the mu-calculus logic. The output of the model checker is binary: true if the property is verified on the model, and false otherwise. We assume that a sample belongs to a particular family if the properties related to that family are verified on the model. Figure 1 outlines the above described work-flow of our approach underlying the tool.
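The following toy sketch (plain Python, not the authors' Caal/mu-calculus tooling) illustrates the underlying check on such an automaton: whether a given action is reachable from the initial state, which is what a least-fixed-point diamond formula expresses.

```python
# Toy illustration of checking "an action is eventually reachable" on a
# labelled transition system; in mu-calculus this is mu X. <a>tt \/ <->X.
from collections import deque

def action_reachable(lts, start, target):
    """lts maps a state to a list of (action, next_state) pairs."""
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        for action, nxt in lts.get(state, []):
            if action == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# toy automaton for an if statement: only the "true" branch sends the intent
lts = {
    "s0": [("ifeq_true", "s1"), ("ifeq_false", "s2")],
    "s1": [("send_INSTALL_REFERRER", "s3")],
    "s2": [("return", "s3")],
}
print(action_reachable(lts, "s0", "send_INSTALL_REFERRER"))  # True
```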

3 Evaluation

In order to evaluate our tool, we selected one of the latest malicious threats in the Android environment: the HummingBad malware family. This section is organized as follows: first, we describe the logic property aimed at detecting the HummingBad family and its malicious behaviour, then we detail the dataset used during the evaluation, and finally we discuss the results obtained.

Fig. 1. The work-flow of our tool


The Logic Property

The HummingBad malicious behaviour establishes a persistent rootkit aimed at generating fraudulent ad revenue for the perpetrators. HummingBad considers a set of paid events for its malicious operation, including displaying ads, creating clicks, and installing fraudulent apps. The HummingBad malicious payload is able to perpetrate the following malicious actions aimed at generating revenue for the attackers:

• the apps display more than 20 million advertisements per day;
• the attackers achieve a high click rate of 12.5% with illegitimate methods, resulting in over 2.5 million clicks per day;
• HummingBad installs more than 50,000 fraudulent apps per day.

These behaviours translate to significant revenues:

• the average revenue per click is USD $0.00125;
• the accumulated revenue from clicks per day reaches more than $3,000;
• the rate for each fraudulent app is $0.15, accruing over $7,500 per day;
• the attackers are able to make $10,000 per day, or about $300,000 a month.

Once installed, the malware will try to connect to C&C servers to receive commands. The server can initiate several actions by the malware:

• download a mobile application from a URL provided by the server and install it; depending on whether root access was successfully established, the application is installed silently or via an install dialog containing text provided by the server;
• send referrer requests in order to create Google Play advertisement revenue (https://developers.google.com/ads/); to achieve this purpose, the malware gets a list of packages and referrer ids from the server and then scans the applications running on the device; once it has collected this information, the malware sends com.android.vending.INSTALL_REFERRER intents with the corresponding referrer ID, in order to gain revenue;
• get a list of packages from the server and try to launch them;
• send a request to a URL provided by the server: the malware will get a url from the server and open a connection with it.

It is interesting to note that all of the C&C servers contain dozens of malicious APKs. Figure 2 shows a Java code snippet related to a malicious behaviour exhibited by the HummingBad samples. In detail, the malicious behaviour is exhibited by the com.android.vending.INSTALL_REFERRER intent. This intent is broadcast when an app is installed from the Google Play Store (https://developers.google.com/android/reference/com/google/android/gms/tagmanager/InstallReferrerReceiver): it listens for that Intent, passing the install referrer data for Mobile Apps and Google Analytics.


Fig. 2. Code snippet for the HummingBad malware sample identified by the 0a4c8b5d54d860b3f97b476fd8668207a78d6179b0680d04fac87c59f5559e6c hash.

The HummingBad samples send referrer requests to create Google Play advertisement revenue. To achieve this purpose, the malware gets a list of packages and referrer ids from the server and then scans the applications running on the device. Once it has collected this information, the malware sends com.android.vending.INSTALL_REFERRER intents with the corresponding referrer ID, in order to gain revenue. We formulate the property aimed at detecting this behaviour so that it is satisfied whenever there is at least one invocation of the com.android.vending.INSTALL_REFERRER intent, as shown in Table 1.

Table 1. Temporal logic formula for the HummingBad malicious behaviour detection.
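The formula itself is rendered as an image in the original and is not reproduced in this extraction; a mu-calculus property of the shape used in this line of work, asserting that an invocation of the intent is eventually reachable, would read roughly as follows (a sketch, not necessarily the authors' exact formula):

```latex
\varphi_{\mathrm{HB}} \;=\; \mu X.\,
  \langle \mathit{invoke}(\texttt{com.android.vending.INSTALL\_REFERRER}) \rangle\, \mathit{tt}
  \;\lor\; \langle - \rangle X
```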

The Dataset

In our evaluation we use the following dataset: 50 samples belonging to the HummingBad family; 250 samples randomly selected from the 10 most populous families of the Drebin dataset [2]; and 50 trusted samples downloaded from Google Play (https://play.google.com), the official Android market. It is worth noting that our dataset is composed only of real-world samples. The Drebin dataset is a well-known collection of malware used in many scientific works, which includes the most widespread Android families. The family label is related to the malicious payload that a particular family exposes. Thus, every sample is labelled and categorized starting from its malicious behaviour. The top 10 Drebin families are: FakeInstaller,


DroidKungFu, Plankton, Opfake, GinMaster, BaseBridge, Kmin, Geinimi, Adrd and DroidDream. In our evaluation we randomly selected 25 samples from each of them. These samples are used in order to evaluate the precision of our methodology: we want to demonstrate whether our tool is able to correctly categorize and distinguish the samples belonging to the HummingBad family from the other ones. Furthermore, in order to investigate whether our tool is resilient to code obfuscation, we generated an obfuscated version of the HummingBad samples. Starting from the 50 samples and using the obfuscation tool in [3], we produced 50 obfuscated samples, randomly selecting among the code obfuscation techniques that the tool offers. The obfuscation techniques aim to change only the syntax of the code, without changing its behaviour.

Evaluation Result

In order to evaluate the completeness and correctness of our methodology we computed the following metrics: Precision (PR), Recall (RC), F-measure (Fm) and Accuracy (Acc):

PR = \frac{TP}{TP + FP}; \quad RC = \frac{TP}{TP + FN}; \quad Fm = \frac{2 \cdot PR \cdot RC}{PR + RC}; \quad Acc = \frac{TP + TN}{TP + FN + FP + TN}

In the above formulae, the values of True Positives (TP), False Positives (FP), False Negatives (FN) and True Negatives (TN) are involved. In our evaluation these values assume the following meaning: a sample counts as a TP if our tool correctly places it in the HummingBad family; as a TN if our tool correctly identifies it as not belonging to the HummingBad family; when our tool classifies a sample into the wrong family, it is counted as an FP; and when our tool does not classify a HummingBad sample into the HummingBad family, it is counted as an FN. Tables 2 and 3 show the results achieved during the HummingBad family identification. As emerges from the results, our tool is able to correctly recognize the HummingBad samples without any negative result. Furthermore, our tool is transparent to code obfuscation techniques, and it is more effective than current antimalware solutions (see Table 3). The comparison is performed in terms of the number of obfuscated samples correctly identified.

Table 2. Performance evaluation

Family       #Malware ∈ family   #Malware ∉ family   #Trusted   TP   FP   FN   TN    PR   RC   Fm   Acc
HummingBad   50                  250                 50         50   0    0    300   1    1    1    1
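As a quick sanity check, the HummingBad row of Table 2 can be reproduced directly from the definitions above (standard definitions assumed):

```python
# Reproduce the HummingBad row of Table 2 from the metric definitions.
def metrics(tp, fp, fn, tn):
    pr = tp / (tp + fp)
    rc = tp / (tp + fn)
    fm = 2 * pr * rc / (pr + rc)
    acc = (tp + tn) / (tp + fn + fp + tn)
    return pr, rc, fm, acc

print(metrics(tp=50, fp=0, fn=0, tn=300))  # (1.0, 1.0, 1.0, 1.0)
```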

Table 4 reports the results achieved with the obfuscated samples, showing that the performance remains essentially unchanged. In order to generate the obfuscated versions of the malware, we used the DroidChameleon [21] tool. Basically, the obfuscation process consists of code transformations (i.e., junk code insertion, code reordering, call indirections and renaming)

Table 3. Comparison between our methodology and antimalware (number of obfuscated samples correctly identified)

Family       #Samples   Our Method   AVG   Ad-Aware   Avast   Arcabit   Alibaba   ESET NOD32   McAfee
HummingBad   50         50           0     0          0       0         4         0            0

aimed at generating morphed versions of the applications belonging to the malware dataset. The DroidChameleon tool applies the transformation techniques not only to the malicious payload but to all of the application code. For instance, if we apply the junk code insertion technique, all the methods of the application will be obfuscated with this technique; the same happens with the other injected obfuscations. A previous work [21] demonstrated that current antimalware solutions fail to recognize the malware after these transformations. We applied our method to the morphed dataset in order to verify whether the proposed method is able to detect the HummingBad malicious payload even when the malware has been obfuscated. The analysis confirms that the proposed method is resilient to the common code obfuscation techniques.

Table 4. Resilience to the obfuscation techniques

Dataset      Original               Morphed
             #Samples    TP         #Samples    TP
HummingBad   50          50         50          50

4 Conclusion and Future Work

Conclusion and Future Work

In this paper we discuss the design and the implementation of a model checking based tool able to identify the malicious behaviour of Android applications. We evaluated the proposed tool analyzing the real-world HummingBad family obtaining an accuracy equal to 1. As future works, we plan to extend the experiments to other widespread malware threats in order to enforce the methodology proposed in this work. Acknowledgments. This work has been partially supported by H2020 EU-funded projects SPARTA contract 830892 and C3ISP and EIT-Digital Project HII and PRIN “Governing Adaptive and Unplanned Systems of Systems” and the EU project CyberSure 734815.

Model Checking to Detect the Hummingbad Malware

493

References 1. Andersen, J.R., Andersen, N., Enevoldsen, S., Hansen, M.M., Larsen, K.G., Olesen, S.R., Srba, J., Wortmann, J.K.: CAAL: concurrency workbench, aalborg edition. In: Theoretical Aspects of Computing - ICTAC 2015 - 12th International Colloquium Cali, Colombia, 29–31 October 2015, Proceedings, Lecture Notes in Computer Science, vol. 9399, pp. 573–582. Springer (2015) 2. Arp, D., Spreitzenbarth, M., Huebner, M., Gascon, H., Rieck, K.: Drebin: efficient and explainable detection of android malware in your pocket. In: Proceedings of 21th Annual Network and Distributed System Security Symposium (NDSS) (2014) 3. Canfora, G., Di Sorbo, A., Mercaldo, F., Visaggio, C.A.: Obfuscation techniques against signature-based detection: a case study. In: Proceedings of Workshop on Mobile System Technologies (2015) 4. Canfora, G., Mercaldo, F., Moriano, G., Visaggio, C.A.: Composition-malware: building android malware at run time. In: 2015 10th International Conference on Availability, Reliability and Security (ARES), pp. 318–326. IEEE (2015) 5. Ceccarelli, M., Cerulo, L., Santone, A.: De novo reconstruction of gene regulatory networks from time series data, an approach based on formal methods. Methods 69(3), 298–305 (2014) 6. Cimitile, A., Martinelli, F., Mercaldo, F.: Machine learning meets iOS malware: identifying malicious applications on apple environment. In: ICISSP, pp. 487–492 (2017) 7. Clarke, E.M., Grumberg, O., Peled, D.: Model Checking. MIT Press (2001) 8. Enck, W., Ongtang, M., McDaniel, P.: On lightweight mobile phone application certification. In: Proceedings of the 16th ACM Conference on Computer and Communications Security, pp. 235–245. ACM (2009) 9. Felt, A.P., Finifter, M., Chin, E., Hanna, S., Wagner, D.: A survey of mobile malware in the wild. In: Proceedings of the 1st ACM Workshop on Security and Privacy in Smartphones and Mobile Devices, pp. 3–14. ACM (2011) 10. Felt, A.P., Greenwood, K., Wagner, D.: The effectiveness of application permissions. In: Proceedings of the 2nd USENIX Conference on Web Application Development, p. 7 (2011) 11. Martinelli, F., Mercaldo, F., Nardone, V., Santone, A.: Car hacking identification through fuzzy logic algorithms (2017). Cited By 13 12. Martinelli, F., Mercaldo, F., Orlando, A., Nardone, V., Santone, A., Sangaiah, A.K.: Human behavior characterization for driving style recognition in vehicle system. Comput. Electr. Eng. (2018). Cited By 10; Article in Press 13. Mercaldo, F., Nardone, V., Santone, A., Visaggio, C.A.: Hey malware, i can find you! pp. 261–262 (2016). Cited By 10 14. Mercaldo, F., Nardone, V., Santone, A.: Ransomware inside out. In: 2016 11th International Conference on Availability, Reliability and Security (ARES), pp. 628– 637. IEEE (2016) 15. Mercaldo, F., Nardone, V., Santone, A., Visaggio, C.A.: Download malware? no, thanks: how formal methods can block update attacks. In: Proceedings of the 4th FME Workshop on Formal Methods in Software Engineering, pp. 22–28. ACM (2016) 16. Mercaldo, F., Nardone, V., Santone, A., Visaggio, C.A.: Hey malware, i can find you! In: 2016 IEEE 25th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE), pp. 261–262. IEEE (2016)

494

F. Martinelli et al.

17. Mercaldo, F., Nardone, V., Santone, A., Visaggio, C.A.: Ransomware steals your phone. Formal methods rescue it. In: International Conference on Formal Techniques for Distributed Objects, Components, and Systems, pp. 212–221. Springer (2016) 18. Mercaldo, F., Visaggio, C.A., Canfora, G., Cimitile, A.: Mobile malware detection in the real world. In: Proceedings of the 38th International Conference on Software Engineering Companion, pp. 744–746. ACM (2016) 19. Milner, R.: Communication and Concurrency. PHI Series in Computer Science. Prentice Hall (1989) 20. Oh, H.-S., Kim, B.-J., Choi, H.-K., Moon, S.-M.: Evaluation of android dalvik virtual machine. In: Proceedings of the 10th International Workshop on Java Technologies for Real-time and Embedded Systems, pp. 115–124. ACM (2012) 21. Rastogi, V., Chen, Y., Jiang, X.: Droidchameleon: evaluating android anti-malware against transformation attacks. In: Proceedings of the 8th ACM SIGSAC Symposium on Information, Computer and Communications Security, pp. 329–334. ACM (2013) 22. Santone, A., Vaglini, G.: Abstract reduction in directed model checking ccs processes. Acta Informatica 49(5), 313–341 (2012) 23. Stirling, C.: An introduction to modal and temporal logics for CCS. In: Yonezawa, A., Ito, T. (eds.) Concurrency: Theory, Language, and Architecture. LNCS, vol. 491, pp. 2–20. Springer (1989) 24. Xiao, X., Zhang, S., Mercaldo, F., Hu, G., Sangaiah, A.K.: Android malware detection based on system call sequences and LSTM. Multimed. Tools Appl. 78(4), 3979–3999 (2019)

ECU-Secure: Characteristic Functions for In-Vehicle Intrusion Detection

Yannick Chevalier1(B), Roland Rieke2, Florian Fenzl3, Andrey Chechulin4, and Igor Kotenko4

1 Paul Sabatier University, Toulouse, France
[email protected]
2 Fraunhofer Institute for Secure Information Technology, Darmstadt, Germany
[email protected]
3 University of Applied Sciences Mittelhessen, Giessen, Germany
[email protected]
4 SPIIRAS, St-Petersburg, Russia
{chechulin,ivkote}@comsec.spb.ru

Abstract. Growing connectivity of vehicles induces increasing attack surfaces and thus the demand for a sophisticated security strategy. One part of such a strategy is to accurately detect intrusive behavior in an in-vehicle network. Therefore, we built a log analyzer in C that focuses on payload bytes having either a small set of different values or a small set of possible changes. While being an order of magnitude faster, it achieves an accuracy that is at least comparable with results obtained using standard machine learning techniques. Thus, this approach is an interesting option for implementation within in-vehicle embedded systems. Another important aspect is that the explainability of the results is better compared to deep learning systems.

Keywords: Controller area network security · Intrusion detection · Anomaly detection · Machine learning · Automotive security · Security monitoring

1 Introduction

Information Technology (IT) security and data protection are enabling factors for newly emerging intelligent distributed computing ecosystems such as the Internet of vehicles [15]. Increasing attack surfaces in intelligent autonomous vehicles are inevitably caused by the strong interconnectedness of vehicles and the requirements for external information sources and services. Despite the complexity of modern vehicles, with more than 100 interconnected Electronic Control Units (ECUs) and more than 100 million lines of code, it is imperative that the vehicle cannot be controlled remotely by an attacker. Attacks based on remotely injecting messages into safety-critical ECUs in order to influence physical actions such as steering and braking have already been demonstrated in [14]. It is thus very important to improve the security of in-vehicle networks, and as long as there are no effective means to prevent specific attacks, there should be methods in place to automatically detect them and react to the alerts. In principle, the


detection of anomalies in the network traffic inside a vehicle that may be caused by intruders could be done remotely by sending all internal traffic to a security operations center. However, this would not only be a serious problem with respect to privacy concerns and regulations; it would also be inefficient, cause high costs and possibly not fulfill real-time reaction requirements. Thus, in this work we consider a new in-vehicle anomaly detection method which aims to satisfy four requirements: It should (1) be as accurate as possible, because false alarms may lead to unnecessary degradation of vehicle usability. It should (2) be as resource-efficient as possible. It should (3) require neither hardware changes nor additional third-party software libraries, because these may not be available for vehicle-specific embedded systems or may have licenses which would require making the in-vehicle software open source. Finally, (4) the results of message evaluation with respect to anomalies should be explainable, in order to be able to judge possible counteractions. In order to address these requirements, we propose a logical analysis method, which we compare with a simple neural network based method that could probably be applied in embedded systems inside vehicles. We aim at better accuracy, faster and more resource-efficient characterization of messages, portability to embedded systems without dependencies on libraries such as tensorflow, and rule-based reasoning, so that the results of message evaluation with respect to anomalies can be traced back to the responsible rules. We evaluate the proposed method on data sets from the Controller Area Network (CAN) bus, which is the standard solution for in-vehicle communication between ECUs. Section 2 gives an overview of the background and related work. Section 3 introduces data sets from two different vehicles that have been used to evaluate the proposed method. Section 4 describes some results from tests with neural networks in order to provide a benchmark for our work. Section 5 presents the principles of the characteristic function approach, while Sect. 6 describes its implementation and the results of various detection setups. Finally, Sect. 7 concludes this paper.

2 Background and Related Work

The most important topic in vehicle security should be the design of secure architectures, protocols, hardware and software. Hardening the system by reducing the attack surface is a common incremental approach. For example, [21, 25] list possible intrusion points together with proposals for countermeasures such as cryptography, anomaly detection and ensuring software integrity by separating critical systems. However, most of the currently discussed intrusion prevention measures require hardware changes, thus conflicting with backwards compatibility. Therefore, it has been proposed by some actors that, in the CAN context, intrusion detection should be used in a defense-in-depth approach [6]. CAN intrusion detection methods can be grouped into four categories, namely, detecting ECU impersonating attacks, detecting specification violations, detecting message insertions, and detecting sequence context anomalies. The work on detection of ECU impersonating attacks such as [3, 5] in most cases uses some kind of physical fingerprinting by voltage or timing analysis with specific hardware. This work seeks to mitigate the general problem of missing authenticity measures in CAN bus design


and thus is complementary to the work presented in this paper. The detection of specification violations assumes that a specification of normal behavior is available, which has the advantage that no false-positive alerts are generated. Specification-based intrusion detection methods can use specific checks, e.g. for formality, protocol and data range [11], a specific frequency sensor [8], a set of network-based detection sensors [16], or specifications of the state machines [20]. The detection of message insertions can be based on different technologies, such as analysis of time intervals of messages [19] or LSTM [24]. The methods for detection of sequence context anomalies comprise process mining [18], hidden Markov models [12, 17], OCSVM [23], neural networks [9], and detection of anomalous patterns in a transition matrix [13]. Comparisons of different Machine Learning (ML) algorithms are given in [2, 4, 22]. OCSVM, Self-Organizing Maps, and LSTM are used in [4], while LSTM, Gated Recurrent Units (GRU) and Markov models are used in [22]. OCSVM, SVM, sequential neural networks and LSTM are used in [2]. A detailed review of intrusion detection systems for in-vehicle networks is given in [1]. Our training sets and ML benchmark described in Sect. 3 are based on [2, 7]. We consider the method proposed in this paper, in particular with respect to payload analysis, to be more accurate than the methods based on SVM, OCSVM and similar approaches, and to be much faster and more resource-efficient than the methods based on different kinds of neural networks.

3 Simulated In-Vehicle Attacks

For the evaluation of our work we used five different data sets from two different vehicles, namely HCRL-DoS, HCRL-fuzzy, HCRL-gear, ZOE-fuzzy, and ZOE-payload. We mapped the relevant data of the CAN messages to the following tuple structure: (time, ID, dlc, p1, ..., p8, type), where time is the time when the message was received, ID comprises information about the type and the priority of the message, dlc (data length code) provides the number of bytes of valid payload data, and p1, ..., p8 is the payload. The messages are labeled by type (attack versus no attack). Three of these data sets have been published by the "Hacking and Countermeasure Research Lab" (HCRL) [7]. The HCRL-DoS data set contains DoS attacks. For this attack type, every 0.3 ms a message with the ID "0000" is injected (cf. m2 in Table 1). Conversely, in the HCRL-fuzzy data set every 0.5 ms a random message is injected (cf. m3 and m4 in Table 1). The HCRL-gear data set contains spoofing attacks, where every millisecond a message with an ID related to gear is injected, whereby the payload does not change (cf. m5 and m6 in Table 1). The ZOE data set with about 1 million messages has been collected from a 9 min drive with a Renault Zoe electric car in an urban environment. It has been used before to evaluate process mining [18] as well as ML methods [2]. The ZOE data set contains 110 different IDs, which makes it much more difficult to reach a good detection accuracy compared to the relatively simple HCRL data sets with at most 38 different IDs. The ZOE data set originally contains no attack data, but for this work we have created a merged data set


ZOE-fuzzy by injecting 32000 generated messages with random ID and payload, similar to the HCRL-fuzzy data set. We have also created another data set, ZOE-payload, where we similarly injected 10000 generated messages with randomly generated payload, but only with message IDs which have been used in the original data set. Thus, these messages are more difficult to detect.

Table 1. Exemplary messages in data sets HCRL and ZOE

Message  Time      ID    Length  p1   p2   p3   p4   p5   p6   p7   p8   Type  Comment
m1       0.851863  1264  8       0    0    0    128  0    105  209  19   1     HCRL: normal msg
m2       0.852103  0     8       0    0    0    0    0    0    0    0    -1    HCRL: DoS attack
m3       0.972222  1869  8       68   82   16   80   85   48   212  51   -1    HCRL: fuzzy attack
m4       0.982961  1139  8       148  217  62   32   201  26   23   44   -1    HCRL: fuzzy attack
m5       1.348859  1087  8       1    69   96   255  107  0    0    0    -1    HCRL: gear attack
m6       1.349963  1087  8       1    69   96   255  107  0    0    0    -1    HCRL: gear attack
m7       0.010919  504   8       248  4    255  239  254  0    10   13   1     ZOE: normal msg
m8       0.015625  1656  8       243  99   108  24   188  209  74   171  -1    ZOE: fuzzy attack

4 Baseline Benchmark: Neural Network

As a benchmark for the evaluation of our approach we used a neural network whose structure is based on a Sequential model from the keras.models package. Neural networks are the standard for deep learning and can model very complex nonlinear relationships. A fully connected neural network utilizes a number of layers, with each layer supporting an arbitrary number of neurons. Data is propagated from the input to the output layer using weighted connections between the neurons of these layers. In order to get a very high accuracy, we used two dense layers with 50 neurons each for the training data sets (trainable parameters: 3,201) and processed the learning phase in 20 epochs. Tensorflow 11 was used, with Binary Crossentropy, Adam and Accuracy as loss, optimizer and performance metric, respectively. The validation of the neural network was done with a standard train/test split of the original data. In the case of the ZOE-fuzzy and ZOE-payload data sets we used random attacks initialized with different seeds for the training and detection phases.

Table 2. Neural network results (Data set: log-file with simulated attacks; TP/FP/TN/FN: true/false positives/negatives; Precision (Positive Predictive Value) PPV = TP/(TP + FP); Recall (True Positive Rate) TPR = TP/(TP + FN); Accuracy ACC = (TP + TN)/(TP + TN + FP + FN); Train: time for training the model on i7; i7: time for running detection on i7; ARM: time for running detection on ARM; Real: elapsed time in log-file).

Data set     TP      FP   TN       FN   ACC     PPV     TPR     Train     i7         ARM         Real
HCRL-DoS     587521  0    3078250  0    100%    100%    100%    5 m 6 s   14 m 37 s  144 m 46 s  47 m
HCRL-fuzzy   491829  118  3346895  18   99.99%  99.97%  99.99%  5 m 22 s  15 m 8 s   153 m 14 s  91 m
HCRL-gear    597252  136  3845754  0    99.99%  99.97%  100%    6 m 10 s  16 m 57 s  175 m 7 s   133 m
ZOE-fuzzy    31050   135  1012856  950  99.89%  99.56%  97.03%  1 m 23 s  4 m 4 s    41 m 31 s   9 m
ZOE-payload  9284    964  1012027  716  99.83%  90.59%  92.84%  1 m 26 s  4 m 2 s    40 m 21 s   9 m


The results in Table 2 show an accuracy above 99.83% for all used data sets. Neural networks have the drawback of needing anomalous training data to be useful. Training for all known attacks is very difficult because the training set has to cover many variations of attack types. Furthermore, a data set covering all possible 'normal' behaviors of vehicles is usually not available. The training of the models has been done on a laptop with an Intel i7-8550U CPU. The detection tests have been executed on the same laptop and also on a Raspberry Pi with an ARMv7 CPU. In order to run the detection in real time within an in-vehicle network, it is a necessary precondition that the detection is faster than the real elapsed time listed in the last column of Table 2. By comparison with the detection times on a Raspberry Pi (cf. column ARM), it is evident that an IDS using these neural network models would not be fast enough on an embedded system with such processing power.
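For reference, the benchmark model described above can be sketched as follows. The paper specifies only the two dense layers of 50 neurons, the loss, optimizer, metric and 20 epochs; the input width and the activations here are assumptions (with 11 input features, e.g. time, ID, dlc and p1-p8, the sketch reproduces the reported 3,201 trainable parameters):

```python
from keras.models import Sequential
from keras.layers import Dense

def build_benchmark_model(n_features: int = 11) -> Sequential:
    # Two fully connected layers with 50 neurons each, followed by a
    # single sigmoid unit for the attack / no-attack decision.
    model = Sequential([
        Dense(50, activation="relu", input_shape=(n_features,)),
        Dense(50, activation="relu"),
        Dense(1, activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy",
                  optimizer="adam",
                  metrics=["accuracy"])
    return model

# Usage sketch: X holds one row per CAN message, y holds 0/1 labels.
# model = build_benchmark_model(X.shape[1])
# model.fit(X, y, epochs=20, validation_split=0.2)
```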

5 Principles of the Characteristic Function Approach

We aim at synthesizing a formal model of the network from a log of events. In contrast with other attempts, we exploit the fact that every legitimate message sent on the network, and thus occurring in the log, is meant to be accepted by some device in this network. Hence, and in contrast with natural systems, we know that the devices connected to the CAN bus implement tests that are employed to accept only legitimate messages. Conversely, legitimate messages sent on the bus are constructed so as to pass these tests. Under this assumption, our approach consists in first choosing a set of tests that is likely relevant to the communications on the CAN bus, and second in analyzing the log to keep only those tests that are likely to be implemented by the devices, or that are consequences on the messages' payload of the information conveyed. Third, we apply the tests computed during the training phase to messages from another log.

Considerations on the Test Space. The first step consists in choosing a set of simple tests that are likely to be relevant. The space of all possible tests will then consist of all possible conjunctions and disjunctions of these simple tests. We model packets by an ID and a sequence of bytes, i.e. 256-valued integers. This ID determines a class to which each packet belongs. We assume that all packets in a given class are similar enough that some tests exist which are valid on all messages of the class and are not vacuous. In principle the test space encompasses all boolean functions on messages or sequences of messages. However, a succinct analysis already delineates a few types of tests that may be useful for the analysis of logs:

• some tests are concerned with the syntactic content of the packet, such as the presence of a padding constant or the presence of a specific value, denoting e.g. a more precise type for the packet;
• some tests are computed on the whole packet, such as an error-correcting code;
• some tests are domain specific and relate to the possible evolutions of physical data between consecutive packets, or to the set of possible values of some data;
• some tests depend on the internal state of the devices, a packet being acceptable at some point of their execution but not at another point.


For the sake of simplicity we consider in this paper only tests performed independently on the different fields of messages, as well as on their ID and their time, the latter first translated into a 32-bit integer and then treated as four different 1-byte fields. That is, we consider only the first and third cases of the preceding list. We detail below how the interesting fields are discovered and how we construct rules from their values.

Classes of Messages. Messages are placed in classes depending on the value of their IDs. A value test or a difference test has to be valid on all messages of a given class during training to be incorporated in the monitor.

Automatic Fields. A field is automatic if the device receiving and accepting a packet tests whether the value of the field is equal to a constant in its program. It is expected that, if different packets can be sent from one device to another, at least one automatic field exists so that the receiver can derive the type of the received packet. The statistical characteristics of such fields are that they should have only a few legitimate values, and that these values should have no other detectable relations. There is obviously some arbitrariness in deciding what 'a few' means. Since the tests performed are not based on any hints from the protocol, we have arbitrarily decided to define a small set of different values to be the square root of the total number of possible different values, that is, fewer than 16 values among the 256 possible ones. In future work we plan to adapt this choice with respect to the number of messages in the class. Tests relevant to automatic fields are value tests, in which we record all the different values occurring in a field during training. If the number of different values is more than 16, we perform no value test on that field during monitoring. Otherwise we verify during monitoring that the value in that field for a message is among the ones seen during training. To sum up, value tests are a conjunction, over all fields f, of a disjunction f = v1 ∨ ... ∨ f = vk with k ≤ 16, or of the true constant ⊤ if more than 16 different values have been encountered.

Physical Values. These are values that are assumed to evolve slowly. For these values we assume a bound on the difference between the value present in the current packet and the value occurring in the last preceding similar packet. For these fields the analyzer keeps track of the value in the last accepted message and compares that value with the one in the current message. As in the case of value tests, these difference tests are performed during monitoring only if a small number (fewer than 16) of changes have been observed during the training phase. Re-using the same notation as above, but now denoting by f the value of a field in the last accepted packet and by f' its value in the packet under analysis, difference tests are a conjunction, over all fields f, of a disjunction (f' − f) = v1 ∨ ... ∨ (f' − f) = vk with k ≤ 16, or of the true constant ⊤ if more than 16 different values have been encountered for the difference between the values of that field in a message and its predecessor.

Random Values. These are fields for which no relation was found in the data set among the ones that were searched for. In the data sets considered, a post-analysis of the rules has shown that in several cases these fields are often related to the physical value fields, and that the data conveyed were actually 2-byte values.
The analyzer does not perform any test on these fields since, by the construction described, both the value and the difference tests reduce to the constant ⊤ for them.
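The value and difference tests described above can be summarized in a short sketch. The paper's analyzer is written in C; the Python below is only an illustration of the same logic, with the threshold of 16 taken from the text and all names, the byte-wise modular difference, and the data layout being assumptions:

```python
MAX_VALUES = 16  # square root of the 256 possible byte values

def train_tests(messages):
    """messages: iterable of (can_id, fields), fields being a tuple of bytes.
    Returns {can_id: (value_sets, diff_sets)}; an entry is a set of admissible
    values/changes per field, or None for the trivially true test."""
    value_sets, diff_sets, last = {}, {}, {}
    for can_id, fields in messages:
        vs = value_sets.setdefault(can_id, [set() for _ in fields])
        ds = diff_sets.setdefault(can_id, [set() for _ in fields])
        prev = last.get(can_id)
        for i, v in enumerate(fields):
            vs[i].add(v)
            if prev is not None:
                ds[i].add((v - prev[i]) % 256)  # assumed byte-wise difference
        last[can_id] = fields
    # Fields with more than 16 observed values (or changes) get no test.
    prune = lambda sets: [s if len(s) <= MAX_VALUES else None for s in sets]
    return {cid: (prune(value_sets[cid]), prune(diff_sets[cid]))
            for cid in value_sets}
```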


6 Implementation and Evaluation of Characteristic Function Approach

Our approach consists in using tests first to classify messages into classes, and second to characterize the messages in a given class by the set of tests they pass. The monitor only implements tests that are satisfied by all messages in a given class. In the experiments of Table 3 the only classification performed is on the ID field. The tool outputs, for each class, the tests that are to be performed on packets of that class. We believe this information to be very valuable for future work. First, it permits computing the probability that a random message satisfies all the tests of a class, and thus allows us to evaluate the robustness of the monitor against the injection of random messages. Assuming that in a given class there are n fields classified as automatic and m fields classified as physical, and that the tests on these fields all accept the maximum of 16 values, a random message in that class has a probability (16/256)^(n+m) = 2^(−4·(n+m)) of being accepted. This small but non-negligible probability explains the occurrences of false negatives in Table 3. Second, given that the generated rules implement simple tests, it is in theory possible for a human to better understand the system by looking at the rules produced, and possibly to devise new (and less generic) tests beyond those described in this paper. A side result is that it is also quite easy to build fake traffic that will be accepted by a monitor once its rules are known. Third, it permits focusing further classification work on classes for which only a few fields are tested. Though this is outside the scope of this paper, a manual analysis of the rules produced and of the messages in these classes strongly suggests new test functions tailored to handle these cases.

Table 3. Results of characteristic functions approach (Data set: log-file with simulated attacks; TP/FP/TN/FN: true/false positives/negatives; Precision (Positive Predictive Value) PPV = TP/(TP + FP); Recall (True Positive Rate) TPR = TP/(TP + FN); Accuracy ACC = (TP + TN)/(TP + TN + FP + FN); Train: time for training the model on i7; i7: time for running detection on i7; ARM: time for running detection on ARM; Real: elapsed time in log-file).

Data set     TP      FP  TN       FN  ACC     PPV   TPR     Train  i7        ARM        Real
HCRL-DoS     587521  0   3078250  0   100%    100%  100%    0.9 s  1 m 1 s   17 m 2 s   47 m
HCRL-fuzzy   491847  0   3347013  0   100%    100%  100%    1.2 s  1 m 2 s   17 m 48 s  91 m
HCRL-gear    597252  0   3845890  0   100%    100%  100%    1.2 s  1 m 22 s  20 m 27 s  133 m
ZOE-fuzzy    31985   0   1012991  15  99.99%  100%  99.95%  0.3 s  0 m 19 s  5 m 1 s    9 m
ZOE-payload  9910    0   1012991  90  99.99%  100%  99.10%  0.3 s  0 m 19 s  4 m 50 s   9 m
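As a quick numeric illustration of the acceptance probability derived above (the choice of n and m here is ours, not taken from the paper):

```python
def acceptance_probability(n_automatic: int, m_physical: int) -> float:
    # Each tested field accepts at most 16 of 256 byte values: 16/256 = 2**-4.
    return 2.0 ** (-4 * (n_automatic + m_physical))

# With two automatic and one physical field, p = 2**-12, i.e. about 0.024%,
# so a few of many thousands of random injections can still slip through.
print(acceptance_probability(2, 1))  # 0.000244140625
```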

A final comment on the experiments shown in Table 3 is that we have encountered no false positives, which shows that, though it is arbitrary, the heuristic threshold of 16 is not too high: it does not classify a field that contains random values as an automatic field, i.e. no overfitting has been observed. This should not, however, be interpreted as meaning that our approach cannot suffer from overfitting. In particular, a training data set


which is too short would tend to produce illegitimate value tests, e.g. for the fields recording the timestamp of the packet. Implementation. The algorithm has been implemented in C. The log is first translated if necessary into a binary file which is then mmap’d to an array of structures, each structure representing a packet. This array is then analyzed independently by different modules. Each analysis module constructs a balanced binary tree mapping an ID to the result of the analysis on this ID. The monitor uses this structure to parse a log file and test whether a packet shall be accepted. Resources. During training all the log is virtually available in memory, though we rely on the operating system to optimize speed and memory consumption. By construction the memory needed by the monitor is linear in both the number of different IDs and in the number of fields. As can be seen in Table 3, the results are very encouraging against the different attacks considered. It shall be noted that knowing the results of the analysis modules, it is also quite easy to construct attacks (i.e., add additional messages) that follow a pattern that will be accepted by the analyzer.
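Continuing the illustrative Python counterpart from Sect. 5 (the real monitor is C code over an mmap'd binary log; the structures below are the hypothetical ones from the training sketch), a monitoring step might look as follows:

```python
def accept(rules, last_seen, can_id, fields):
    """Return True if a message passes all value and difference tests learned
    for its class; how unknown IDs are handled is a choice of this sketch."""
    if can_id not in rules:
        return False
    value_sets, diff_sets = rules[can_id]
    prev = last_seen.get(can_id)
    for i, v in enumerate(fields):
        if value_sets[i] is not None and v not in value_sets[i]:
            return False
        if (prev is not None and diff_sets[i] is not None
                and (v - prev[i]) % 256 not in diff_sets[i]):
            return False
    last_seen[can_id] = fields  # only accepted messages serve as reference
    return True
```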

7 Conclusion

We have seen in previous work [2] that neural network approaches to anomaly detection deliver good results, but that it is hard to implement this kind of detection in-vehicle because of restrictions with respect to the on-board resources of typical ECUs used in vehicular systems. Thus, we started to analyze logs using a bind and branch approach that was very accurate but lacked robustness. From this experience we built a log analyzer in C that focuses on payload bytes having either a small set of different values or a small set of possible changes. The results obtained are at least comparable with results obtained using standard ML techniques. In the near future we will work on refining the analysis to guess the functions employed by the devices to test whether a packet shall be accepted. In order to evaluate our approach in a realistic context, we will further test it in a setup of several Raspberry Pis equipped with CAN-bus boards which use components such as the Microchip MCP2515 CAN controller with MCP2551 CAN transceiver that are typical for automotive ECUs. We furthermore plan to extend our approach to CAN with flexible data-rate (CAN-FD), which is an extension of the original CAN bus protocol with higher bandwidth. Another possible evaluation could improve the baseline neural network performance on micro-controllers by utilizing optimized neural network functions such as CMSIS-NN [10].

Acknowledgement. This research is partially supported by the German Federal Ministry of Education and Research in the context of the project VITAF (ID 16KIS0835).


References

1. Al-Jarrah, O.Y., Maple, C., Dianati, M., Oxtoby, D., Mouzakitis, A.: Intrusion detection systems for intra-vehicle networks: a review. IEEE Access 7, 21266–21289 (2019). https://doi.org/10.1109/ACCESS.2019.2894183
2. Berger, I., Rieke, R., Kolomeets, M., Chechulin, A., Kotenko, I.: Comparative study of machine learning methods for in-vehicle intrusion detection. In: Computer Security. ESORICS 2018 International Workshops, CyberICPS 2018 and SECPRE 2018, Barcelona, Spain, 6–7 September 2018, Revised Selected Papers. Lecture Notes in Computer Science, vol. 11387, pp. 85–101. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-12786-2_6
3. Cho, K., Shin, K.G.: Fingerprinting electronic control units for vehicle intrusion detection. In: Holz, T., Savage, S. (eds.) 25th USENIX Security Symposium, USENIX Security 16, Austin, TX, USA, 10–12 August 2016, pp. 911–927. USENIX Association (2016)
4. Chockalingam, V., Larson, I., Lin, D., Nofzinger, S.: Detecting attacks on the CAN protocol with machine learning (2016)
5. Choi, W., Joo, K., Jo, H.J., Park, M.C., Lee, D.H.: VoltageIDS: low-level communication characteristics for automotive intrusion detection system. IEEE Trans. Inf. Forensics Secur. 13(8), 2114–2129 (2018)
6. ENISA: Cyber security and resilience of smart cars. Tech. rep., ENISA (2016). https://doi.org/10.2824/87614
7. Hacking and Countermeasure Research Lab (HCRL): Car-Hacking Dataset for the intrusion detection (2018). http://ocslab.hksecurity.net/Datasets/CAN-intrusion-dataset. Accessed 28 Jun 2018
8. Hoppe, T., Kiltz, S., Dittmann, J.: Security threats to automotive CAN networks - practical examples and selected short-term countermeasures. Reliab. Eng. Syst. Saf. 96, 11–25 (2011)
9. Kang, M.J., Kang, J.W.: A novel intrusion detection method using deep neural network for in-vehicle network security. In: 2016 IEEE 83rd Vehicular Technology Conference (VTC Spring) (2016)
10. Lai, L., Suda, N., Chandra, V.: CMSIS-NN: efficient neural network kernels for Arm Cortex-M CPUs. CoRR abs/1801.06601 (2018). http://arxiv.org/abs/1801.06601
11. Larson, U.E., Nilsson, D.K., Jonsson, E.: An approach to specification-based attack detection for in-vehicle networks. In: Intelligent Vehicles Symposium 2008, pp. 220–225. IEEE (2008)
12. Levi, M., Allouche, Y., Kontorovich, A.: Advanced analytics for connected cars cyber security. CoRR abs/1711.01939 (2017)
13. Marchetti, M., Stabili, D.: Anomaly detection of CAN bus messages through analysis of ID sequences. In: 2017 IEEE Intelligent Vehicles Symposium (IV), pp. 1577–1583 (2017)
14. Miller, C., Valasek, C.: Remote exploitation of an unaltered passenger vehicle. Tech. rep., IOActive Labs (2015)
15. Müller-Quade, J., et al.: Cybersecurity research: challenges and course of action. Tech. rep., Karlsruher Institut für Technologie (KIT) (2019). https://doi.org/10.5445/IR/1000090060
16. Müter, M., Asaj, N.: Entropy-based anomaly detection for in-vehicle networks. In: 2011 IEEE Intelligent Vehicles Symposium (IV), pp. 1110–1115 (2011)
17. Narayanan, S.N., Mittal, S., Joshi, A.: OBD SecureAlert: an anomaly detection system for vehicles. In: IEEE Workshop on Smart Service Systems (SmartSys 2016) (2016)
18. Rieke, R., Seidemann, M., Talla, E.K., Zelle, D., Seeger, B.: Behavior analysis for safety and security in automotive systems. In: 25th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP) 2017, pp. 381–385. IEEE Computer Society (2017)


19. Song, H., Kim, H., Kim, H.: Intrusion detection system based on the analysis of time intervals of CAN messages for in-vehicle network, March 2016, vol. 2016, pp. 63–68. IEEE Computer Society (2016)
20. Studnia, I., Alata, E., Nicomette, V., Kaâniche, M., Laarouchi, Y.: A language-based intrusion detection approach for automotive embedded networks. In: The 21st IEEE Pacific Rim International Symposium on Dependable Computing (PRDC 2015) (2014)
21. Studnia, I., Nicomette, V., Alata, E., Deswarte, Y., Kaâniche, M., Laarouchi, Y.: Security of embedded automotive networks: state of the art and a research proposal. In: Roy, M. (ed.) SAFECOMP 2013 - Workshop CARS of the 32nd International Conference on Computer Safety, Reliability and Security (2013)
22. Taylor, A., Leblanc, S.P., Japkowicz, N.: Probing the limits of anomaly detectors for automobiles with a cyber attack framework. IEEE Intell. Syst. PP(99), 1 (2018)
23. Theissler, A.: Anomaly detection in recordings from in-vehicle networks. In: Proceedings of Big Data Applications and Principles First International Workshop, BIGDAP 2014, Madrid, Spain, 11–12 September 2014 (2014)
24. Wei, Z., Yang, Y., Rehana, Y., Wu, Y., Weng, J., Deng, R.H.: IoVShield: an efficient vehicular intrusion detection system for self-driving (short paper), pp. 638–647. Springer International Publishing, Cham (2017)
25. Wolf, M., Weimerskirch, A., Paar, C.: Security in automotive bus systems. In: Proceedings of the Workshop on Embedded Security in Cars, July 2004, pp. 1–13 (2004)

Experimenting with Machine Learning in Automated Intrusion Response

Andre Lopes(B) and Andrew Hutchison

Department of Computer Science, University of Cape Town, Rondebosch 7700, South Africa
[email protected], [email protected]

Abstract. Traditionally, Intrusion Response Systems (IRSs) have relied heavily on network administrators to perform various responses for a network. Though this is expected, particularly in networks containing sensitive data, it is not completely practical given the ever-growing demand for speed, scalability and automation in computer networks. This study presents a proof-of-concept IRS that provides for high-speed networks by using reinforcement learning to analyze and respond to trivial network attacks. This is done by creating a response system that is able to learn from the effectiveness of its own responses. All tests are conducted using an emulated network that was designed to replicate real network behavior. Simulated attacks were used to train the IRS. Results of training were evaluated at intervals of 100, 500, 1000 and 2000 attacks. The findings of this work indicate that while applying reinforcement learning to IRSs is possible, adjustments may still be required to improve its performance.

Keywords: Machine learning · Intrusion response

1 Introduction

Network security is a prerequisite for any computer network [3], especially those containing sensitive data. Intrusion Response Systems (IRSs) are a key component in defending networks, as they either prevent an attack from happening or stop an attack from succeeding. IRSs are often accompanied by Intrusion Detection Systems (IDSs), which provide alerts about suspicious network behavior or data. IRSs make use of alerts, along with response policies, to best decide how to respond to an attack. Response policies make changes to a network to counter attacks, and are enacted either manually by a network administrator or automatically by an IRS. An ideal IRS would respond dynamically to all network attacks, taking into consideration the current state of the network, the development or progress of attacks, and the potential damage that both the attack and the response could inflict on the network. Such an IRS would also adapt from previous attacks, and would avoid being manipulated by an attacker into damaging the network it


protects. Previous works that have created an IRS framework suggest using machine learning to optimize performance [10]. This work attempts to implement machine learning directly, omitting the use of a framework. The problem with machine learning is that there is no certainty about the values it outputs, and as a result it could damage the network with the decisions it makes. This is the reason why machine learning, and more specifically reinforcement learning, is applied to IDSs [6], where mistakes are often tolerated in the form of false positives, but rarely applied to IRSs, where mistakes directly affect the network. To avoid irreversible damage, the responses generated by the IRS created in this study will only be applied to Denial of Service (DoS) attacks. For testing simplicity, the response will only predict the length of the attack, and consequently the timeout or isolation time of the server. This work, which exists as part of a broader research investigation, aims to expand on the automated capabilities of IRSs by experimenting with an IRS that is capable of learning from the effectiveness of its own responses. More background on IRSs, reinforcement learning and network emulation is provided in the following section, after which the particular experimental setup is introduced, the experiment described, and the results presented and discussed.

2 Background

The information discussed in this section describes the basic structure of intrusion response (Sect. 2.1), how different learning problems can be approached in reinforcement learning (Sect. 2.2), and the technologies associated with network emulation (Sect. 2.3).

2.1 Intrusion Response

Intrusion response is the field concerned with responding to network attacks when given an alert. These systems work alongside IDSs, which are responsible for identifying network attacks and subsequently generating alerts. IRSs can be placed into three categories: they can exist as notification systems, manual response systems or automated response systems. These categories create what is known as the levels of automation [9,13–15]. Each level can be seen in Fig. 1. Notification systems act to alert a network administrator about an attack; administrators are required to respond to the attack themselves. Manual response systems, in addition to alerting the network administrator about an attack, also provide a set of preconfigured responses, requiring less human intervention. Automated response systems do not require any human interaction, as they respond to attacks independently. The first two categories (notification and manual) can be considered passive response systems, and can therefore be handled by an IDS. The last category (automated) is an active response system, and can only be handled by an IRS [1–3].


Fig. 1. Our assessment of an IRS taxonomy

2.2 Reinforcement Learning

Reinforcement learning, often regarded as a machine learning method alongside supervised and unsupervised learning, is a method that learns from experience, or trial and error. It is suited, and structured, for problems that involve a goal, some action to be taken, and some form of sensing to be done [8,16]. Reinforcement learning can be adapted to a variety of problems, though not all make full use of its capabilities. For example, reinforcement learning can be used to select an action when given nothing but a reward as guidance, and it will still learn; however, learning is much more efficient when given an input, or state information. With some form of state information, correlations between the optimal actions and the state can be made. This is better known as associative search. Another capability of reinforcement learning is that of delayed rewards. Some problems may require a sense of foresight, whereby a number of specific actions need to be performed in order to get the maximum reward. It is possible for reinforcement learning algorithms to select the best immediate reward, which would result in a greedy approach and a less than optimal total reward. In order to obtain the best total reward, some reinforcement learning algorithms will purposely select poor immediate rewards in order to get better rewards later on.

2.3 Network Emulation Technologies

Virtualization technologies allow a single physical computer system to host multiple isolated systems, each of which runs in its own virtual environment. Methods of virtualization include hypervisor-based and container-based. Container-based methods run at the operating system level, that is, they require a host operating system to run. Each virtualized system created with this method is referred to as a container and runs as a normal process on the host machine, thus not requiring its own kernel. Container-based methods are usually easier to set up and often are lightweight [5]. Hypervisor-based methods run at the hardware level, meaning that they do not necessarily need a host operating system.


Each virtualized system created with this method contains a guest operating system and an associated kernel, which significantly increases the hard drive and memory space required to store and run these systems. Furthermore, hypervisor-based methods have been shown to have poorer performance in comparison to container-based methods [7,17]. Despite their disadvantages, hypervisor-based methods have proved to be more versatile, as they can support multiple operating system environments (e.g. Windows and Linux) on the same system [5]. Docker and dynamips are examples of virtualization tools that are most similar to container-based and hypervisor-based approaches, respectively [4,12].

3 Implementation

The information presented in this section discusses the test-bed in terms of hardware, software and the emulated network. Section 3.1 and Fig. 2 discuss and represent this diagrammatically, while Sect. 3.2 lists the associated technical information.

3.1 Test-Bed Overview

The hardware layer represents the physical computer that the test-bed ran on. This layer allows for the execution of the software layer, using resources which will be discussed in the test-bed technologies Subsect. 3.2. The software layer allows for the execution of Graphical Network Simulator 3 (GNS3) and the test-bed scripts. GNS3 is the software used to create and run the emulated network, seen in the next layer. The test-bed scripts, which were created for this work and contain the functionality required by the IDS and IRS, are used to create simulated alerts and learn from attacks. These scripts exist at the software layer; however, they do communicate with the emulated network, i.e. with GNS3. The emulated network layer represents the components of the emulated network. This network was designed to replicate the behavior of a real network. The components of the emulated network consist of a user/attacker (debian docker image), which pings the server to check if any responses are active, routers (Cisco c3745 layer 3 switches), which enable communication, and a server (another debian docker image), which acts as a device to be pinged. No actual attacks are performed in this emulated network; they are instead simulated by the IDS, which generates its own alerts.

3.2 Test-Bed Technologies: Hardware and Software

The physical system, as depicted in Fig. 2, executed the emulated environment using an Intel i7-8700 and 16 GB of RAM. No Graphical Processing Unit (GPU) acceleration was used for learning, since the learning problem was simple, as will be seen in the next section (Sect. 4). The following lists provide a summary of all software used in the test-bed and the debian docker images, along with version numbers:


Fig. 2. Overview of test-bed structure

Test-bed software:
OS: arch linux
Kernel: 5.0.7
GNS3: 2.1.15
Docker: 18.09.5
Dynamips: 0.2.20
Openssh: 7.9p1
Inetutils: 1.9.4
Python: 3.7.4
Lynis: 2.7.3
Tensorforce: 0.4.3
Tensorflow: 1.12.0

Debian docker images:
OS: debian
Openssh: 7.4p1
Ufw: 0.3.5
Iputils: 3:20161105

4 Experiment Test-Bed Components

The operations of the experiment are divided into four components: alert, response, emulated network (covered in Sect. 3) and timeout models. The alert component performs the IDS-related tasks of the experiment, while the response component performs the IRS-related tasks. The remaining components are supplementary and serve to aid the IRS in its tasks.

4.1 Alert

The alert component uses simulated DoS alerts to train and test automated responses that can learn through the use of reinforcement learning. DoS alerts are used since it is assumed that damage to the network with this attack will

510

A. Lopes and A. Hutchison

not be irreversible and will only affect its availability. The alerts themselves are structured to occur multiple times throughout an attack, and each attack is structured to last a specific amount of time. There are multiple attacks, and the amount of time they last follows a sorted poisson distribution. The intention is to have the reinforcement learning algorithms learn the randomized poisson distribution, and thus be able to better predict how long attacks will last. Having the learning algorithms learn a specific distribution will also make it easier to discern whether they are learning or not in the results section (Sect. 5).
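A minimal sketch of how such attack durations could be generated (the mean of 60 s is taken from Sect. 5; numpy, the seed and all names are our assumptions):

```python
import numpy as np

def generate_attack_durations(n_attacks, mean_seconds=60, seed=0):
    # Attack lengths follow a poisson distribution; sorting them yields
    # the "sorted poisson distribution" described above.
    rng = np.random.default_rng(seed)
    return np.sort(rng.poisson(mean_seconds, n_attacks))

durations = generate_attack_durations(2000)  # one duration per attack
```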

4.2 Response

The response component prevents communication between a client and a server for a specific amount of time. To do so, the response system first notes the alert time, which is used for training the learning algorithm at a later stage. The response system then either ignores the alert or generates what is called the timeout time, depending on whether or not a response is already active. The timeout time is generated using one of the learning algorithms (see Sect. 4.3). Once done, the gateway router which connects the client to the server will block the IP address of the client (note that this is done statically for testing purposes), thus preventing further communication. A process, called the timeout process, is then immediately started, and will activate when the timeout occurs (or when the system clock reaches the current time plus the timeout time). The timeout process is multi-threaded and therefore allows newer alerts to be received; however, only the arrival times of these alerts will be recorded, and the alerts themselves will be ignored, as a response is already active. Once the timeout occurs, the client's IP address is unblocked, and the accuracy of the timeout time is measured by subtracting the time of the last alert from the time that the IP address was unblocked (or when the timeout occurred). This value is then sent to the reinforcement learning algorithm for training.
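The bookkeeping just described reduces to a few lines; in this sketch the router interaction is abstracted into hypothetical block/unblock callbacks, and everything else is likewise illustrative:

```python
import threading, time

def respond(client_ip, timeout_seconds, alert_times, block, unblock, train):
    """alert_times: shared list of alert arrival times appended to by the IDS.
    train: receives the measured offset used to train the learning algorithm."""
    block(client_ip)
    def expire():
        unblock(client_ip)
        # Accuracy of the timeout = unblock time minus time of the last alert.
        train(time.time() - alert_times[-1])
    threading.Timer(timeout_seconds, expire).start()
```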

4.3 Timeout Models

This component makes use of reinforcement learning algorithms to generate an estimate between 0 and 120 s of how long an attack will last. Once generated, the estimate is then used in the response (see Subsect. 4.2). The learning algorithms were implemented using a python library called tensorforce [11]. Three timeout models have been incorporated in the experiment, though only two use reinforcement learning, as one merely performs random guessing for comparison. The reinforcement learning algorithms used were Proximal Policy Optimization (PPO) and Trust Region Policy Optimization (TRPO).
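A sketch of how such a timeout model could be wired up with tensorforce is given below. The paper does not show its code, so the agent configuration, the dummy state, the reward shaping and the helper run_attack_and_measure are all our assumptions, written against the 0.4-era tensorforce PPOAgent interface (which may differ in other versions):

```python
from tensorforce.agents import PPOAgent

agent = PPOAgent(
    states=dict(type='float', shape=(1,)),         # minimal dummy state
    actions=dict(type='float', shape=(),           # timeout estimate...
                 min_value=0.0, max_value=120.0),  # ...bounded to 0-120 s
    network=[dict(type='dense', size=32), dict(type='dense', size=32)],
)

for _ in range(2000):                  # one iteration per simulated attack
    timeout = agent.act(states=[0.0])  # generate the next timeout estimate
    offset = run_attack_and_measure(timeout)  # hypothetical test-bed call
    # Assumed reward shaping: smaller offsets (better guesses) score higher.
    agent.observe(reward=-abs(offset), terminal=True)
```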

5 Results

Tests were conducted and summarized according to the number of attacks that occurred on the emulated network. The intervals for the number of attacks were


100, 500, 1000 and 2000. The length of the attacks, according to a poisson distribution, remained at 60 s for each interval. The IRS algorithms would make estimates as to how long an attack would last, and consequently how long a response policy would be enacted for. These estimates were recorded, grouped together, and graphed according to the frequency of their occurrences (see Fig. 3). These graphs also contain the frequency of the actual attack times that each attack lasted for, as well as the attack times for an algorithm that guessed randomly. The data on the actual and random attack times are used for comparison in the remaining subsections, along with the table below (Table 1), which summarizes the average inaccuracy of each guess made by the learning algorithms and the random algorithm.

Table 1. Summary of average offsets

Algorithm/instances  100  500  1000  2000
PPO                  30   24   9     9
TRPO                 29   28   27    21
Random               30   30   29    31

5.1 100 Instances

The results of running the test-bed with 100 attack instances can be seen in the top left graph in Fig. 3 and the second column of Table 1. Both learning algorithms at 100 attack instances perform poorly, with an average offset, or inaccuracy estimate, of 30 s for PPO and 29 s for TRPO. The algorithms perform similarly to random guessing, which can be seen in the graph, as their estimates form a relatively uniform distribution in comparison to the poisson distribution of the actual attack times.

5.2 500 Instances

At 500 instances, improvements start to become noticeable, as can be seen in the top right graph in Fig. 3 and the third column of Table 1. PPO improves by 6 s, with a new average offset, or inaccuracy estimate, of 24 s. TRPO, however, has only improved by 1 s, and remains similar to random guessing in terms of performance. The improvement of PPO can be seen in the graph, as a poisson-distribution-like shape starts to form between roughly 45 and 80 s. Outside of this range, PPO, again, guesses relatively uniformly. TRPO, as indicated by the table, has not improved, and still maintains a relatively uniform distribution.

5.3 1000 Instances

At 1000 instances, further improvements are made, as can be seen in the bottom left graph in Fig. 3 and the fourth column of Table 1. PPO improves drastically by 15 s


from its previous best, with a new average offset of 9 s. TRPO, however, only improves by 1 s again, and though still similar to random guessing in terms of performance, it can be seen that the algorithm is learning. This is particularly evident in the graph around attack times of 60 s, where TRPO makes most of its guesses. PPO's performance is also at its best at attack times of 60 s, and proves to obtain near identical results to the actual. In addition to this, PPO has also made far fewer guesses at attack times where the actual attack frequency is 0. Unfortunately, performance has become poor at attack times around 40 and 70 s. This discrepancy is likely caused by the number of training examples seen at 60 s (around 350) versus 40 s (less than 10). Furthermore, PPO displays qualities of being too greedy, as it overestimates the number of attacks at 40 and 60 s. This greedy behavior prevents the algorithm from adjusting to different attack times, which can be seen in the low number of guesses at 70 s.

Fig. 3. Results of testing 100, 500, 1000 and 2000 attack instances

5.4 2000 Instances

At 2000 instances, both algorithms have learned to some extent, as can be seen in the bottom right graph in Fig. 3 and the fifth column in Table 1. PPO did not improve despite training on an extra 1000 instances and still obtains an average


offset of 9 s. TRPO, though still not performing as well as PPO, did improve by 6 s from its previous best, and is now distinctly different from random guessing. The graph shows this, as a trend has formed between the attack times of 45 and 60 s. PPO has also improved outside the attack range of 60 s and is still making very few guesses at times where the attack frequency of the actual is 0; however, the algorithm still displays qualities of being greedy, as it overestimates guesses at 45 s and underestimates them at 60 s as a result.

6 Conclusion and Future Work

It can be seen that both algorithms, PPO and TRPO, are capable of learning from their own responses, as each of them performs better than random guessing, with an average inaccuracy of 9 s for PPO and 21 s for TRPO. This can also be verified by the frequency of attack times seen in Fig. 3, with the trained algorithms more closely representing the distribution of the actual attack time frequencies. Improvements could, however, still be made, as a 9 s inaccuracy is not necessarily optimal. Both algorithms only perform well at attack times close to 60 s, and do not achieve optimal response times outside of this range. Furthermore, the results from PPO indicate that the algorithm's performance will not increase, meaning that a 9 s inaccuracy is the best it will achieve. TRPO may perform better if given more training instances, as it was still improving, but unfortunately, training on more than 2000 attack instances may prove to be too slow for any practical IRS. This work only experimented with simple reinforcement learning problems, and has shown some success. It is possible for future works to increase the complexity of the reinforcement learning problem by adding state information and delayed rewards, to further increase the performance and usefulness of an IRS that uses reinforcement learning. While the results remain promising, we acknowledge that responses generated by reinforcement learning should be limited to those that cannot irreversibly damage the network they protect.

References 1. Anuar, N.B., Papadaki, M., Furnell, S., Clarke, N.: An investigation and survey of response options for intrusion response systems (IRSs). In: Information Security for South Africa (ISSA), 2010, pp. 1–8. IEEE (2010). https://doi.org/10.1109/ISSA. 2010.5588654 2. Anwar, S., Mohamad Zain, J., Zolkipli, M.F., Inayat, Z., Khan, S., Anthony, B., Chang, V.: From intrusion detection to an intrusion response system: fundamentals, requirements, and future directions. Algorithms 10(2), 39 (2017). https://doi.org/ 10.3390/a10020039 3. Bace, R., Mell, P.: Nist special publication on intrusion detection systems. Technical report, Booz-Allen and Hamilton Inc., McLean, VA (2001) 4. Boettiger, C.: An introduction to docker for reproducible research. ACM SIGOPS Operating Syst. Rev. 49(1), 71–79 (2015). https://doi.org/10.1145/2723872. 2723882 5. Bui, T.: Analysis of docker security. arXiv preprint arXiv:1501.02967 (2015)


6. Cannady, J.: Next generation intrusion detection: autonomous reinforcement learning of network attacks. In: Proceedings of the 23rd National Information Systems Security Conference, pp. 1–12 (2000) 7. Felter, W., Ferreira, A., Rajamony, R., Rubio, J.: An updated performance comparison of virtual machines and linux containers. In: 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 171–172. IEEE (2015). https://doi.org/10.1109/ISPASS.2015.7095802 8. Hessel, M., Modayil, J., Van Hasselt, H., Schaul, T., Ostrovski, G., Dabney, W., Horgan, D., Piot, B., Azar, M., Silver, D.: Rainbow: combining improvements in deep reinforcement learning. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018) 9. Inayat, Z., Gani, A., Anuar, N.B., Khan, M.K., Anwar, S.: Intrusion response systems: foundations, design, and challenges. J. Netw. Comput. Appl. 62, 53–74 (2016). https://doi.org/10.1016/j.jnca.2015.12.006 10. Kanoun, W., Cuppens-Boulahia, N., Cuppens, F., Dubus, S.: Risk-aware framework for activating and deactivating policy-based response. In: 2010 4th International Conference on Network and System Security (NSS), pp. 207–215. IEEE (2010). https://doi.org/10.1109/NSS.2010.80 11. Kuhnle, A., Schaarschmidt, M., Fricke, K.: Tensorforce: a tensorflow library for applied reinforcement learning (2017). https://github.com/tensorforce/tensorforce 12. Muniz, S., Ortega, A.: Fuzzing and debugging cisco IOS. BlackHat Europe (2011) 13. Shameli-Sendi, A., Cheriet, M., Hamou-Lhadj, A.: Taxonomy of intrusion risk assessment and response system. Comput. Secur. 45, 1–16 (2014). https://doi. org/10.1016/j.cose.2014.04.009 14. Shameli-Sendi, A., Ezzati-Jivan, N., Jabbarifar, M., Dagenais, M.: Intrusion response systems: survey and taxonomy. Int. J. Comput. Sci. Netw. Secur. (IJCSNS) 12(1), 1–14 (2012) 15. Stakhanova, N., Basu, S., Wong, J.: A taxonomy of intrusion response systems. Int. J. Inf. Comput. Secur. 1(1–2), 169–184 (2007) 16. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018) 17. Xavier, M.G., Neves, M.V., Rossi, F.D., Ferreto, T.C., Lange, T., De Rose, C.A.: Performance evaluation of container-based virtualization for high performance computing environments. In: 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, pp. 233–240. IEEE (2013). https://doi.org/10.1109/PDP.2013.41

Method of Several Information Spaces for Identification of Anomalies

Alexander Grusho(B), Nick Grusho, and Elena Timonina

Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, Vavilova 44-2, 119333 Moscow, Russia
[email protected], [email protected], [email protected]

Abstract. The problem of identifying weakly expressed anomalies in a model of generalized systems is considered. A system is described by a set of processes. Processes can have a different nature, i.e. each is described in its own language, and therefore an anomaly in each process is expressed in its own way. Hence the identification of anomalies belongs to the class of problems connected with heterogeneous systems, which cannot be investigated by homogeneous means. For the analysis and identification of anomalies, several information spaces are used. The independent analysis of the behavior of the system in each information space simplifies the procedure of identifying weakly expressed anomalies. The combination of the results of identifying weakly expressed anomalies in several information spaces can be constructed on the basis of binary functions. Such an approach makes it possible to unify the results of anomaly investigations in various information spaces.

Keywords: Information space · Information security · Search of anomaly

1 Introduction

We’ll define deviations of observed processes from normal steady behavior of these processes as anomalies. Processes can have different nature, i.e. description of each of them has own language and therefore an anomaly in each process is expressed in own way. Therefore an identification of anomalies belongs to the problems connected with heterogeneous systems which cannot be investigated by homogeneous means. Let’s give examples of anomalies. At failure in the computer the system administrator (SA) considers different variants of a cause of the failure, and uses the different tools allowing to check working capacity of different components of the computer and the software. Identification of the cause of the occurred failure is result of his search. For searching of an insider breaking information security it is necessary to consider various groups of signs of activity of various users. These signs can be described in languages of professional activity of users of the computer system, c Springer Nature Switzerland AG 2020  I. Kotenko et al. (Eds.): IDC 2019, SCI 868, pp. 515–520, 2020. https://doi.org/10.1007/978-3-030-32258-8_60


In this example, anomalies can be violations of access rights to information, or symptoms of depression connected with awareness of the illegality of one's actions. The results of identifying anomalies in each of these feature sets can only raise suspicions of a violation of information security. But by considering together the weakly supported indications obtained from the partial anomalies, these indications can reinforce one another.

Many articles are devoted to research on anomalies in computer systems and networks (see, for example, [1–4]). On the basis of this research, software products for collecting and analyzing data for the purpose of identifying anomalies have already been created. However, the problem of jointly processing heterogeneous data is far from solved.

In this paper, heterogeneous data is processed by the approach of allocating and separately analyzing several information spaces [5]. Connections between information spaces are used to choose the next space in which to search for traces of anomalies. This research is an attempt to define the databases an SA needs in order to solve complex problems of fault cause search. We support the idea of constructing a database with descriptions of information spaces (IS), the languages for analyzing data in them, and the connections between IS. A minimal sketch of such a database is given below.
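To make this idea concrete, here is one possible sketch of such a database in Python. Everything in it — the names InfoSpace and ISDatabase, their fields, and the representation of connections as an undirected link set — is an illustrative assumption of ours, not a structure defined in the paper.

```python
from dataclasses import dataclass, field

@dataclass
class InfoSpace:
    """Description of one information space (IS)."""
    name: str             # e.g. "filesystem", "os_log", "hardware" (hypothetical)
    parameters: set[str]  # the subset of system parameters observed in this IS
    language: str         # how the data in this IS is described and parsed

@dataclass
class ISDatabase:
    """Registry of information spaces and the connections between them."""
    spaces: dict[str, InfoSpace] = field(default_factory=dict)
    links: set[frozenset[str]] = field(default_factory=set)

    def add(self, space: InfoSpace) -> None:
        self.spaces[space.name] = space

    def connect(self, a: str, b: str) -> None:
        """Record that two IS are connected (undirected link)."""
        self.links.add(frozenset((a, b)))

    def neighbors(self, name: str) -> set[str]:
        """All IS directly connected to the given one."""
        return {other for link in self.links if name in link
                for other in link if other != name}
```

An SA tool built on such a registry could walk neighbors() outward from the IS in which a feature of an anomaly was first observed.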

2 Mathematical Models: Identification of Anomalies Using the Connections of Several Information Spaces

In developing general methods of anomaly identification, it is impossible to concentrate on a single type of system. Therefore we use the general notion of a system [6], which is described by the following model. The system is described by a set of initial parameters (variables) X. For each x ∈ X its own domain of values D(x) is defined. Quite often the parameters are ordered by decreasing importance, and the system is modeled by the set of the most essential parameters. However, determining the importance of variables is a rather complex problem, and therefore the aggregation method is used below.

Let f be a sequence of functions, each of which depends on a finite vector x of arguments from X. The functions are defined on the Cartesian products of the domains of the variables entering them. The sequence of variables {y = f(x)} defines the aggregated description Y of the initial system. A parametrization can allow different superpositions of aggregations of parameters; for simplicity, however, we assume that one initial parametrization X and one system of aggregation Y are defined. The system of aggregation generates new domains for the aggregated parameters. A toy illustration of this aggregation is sketched below.

We consider that the system changes over time under the influence of external or internal causes, i.e. all parameters depend on time. Thus, all x and y are functions of time.
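As a hedged illustration of the aggregated description {y = f(x)}, the following sketch pairs each aggregate with the tuple of initial parameters it depends on. The parameter names and the two aggregation functions are purely hypothetical.

```python
from statistics import mean

# current values of the initial parameters x in X (illustrative names)
x_values = {"cpu_load": 0.42, "disk_io": 118.0, "net_in": 3.1, "net_out": 2.7}

# each aggregate y is (argument names, function over those arguments)
aggregations = {
    "load_avg":  (("cpu_load", "disk_io"), lambda c, d: mean((c, d / 100.0))),
    "net_total": (("net_in", "net_out"),   lambda i, o: i + o),
}

# compute the aggregated description Y from X
y_values = {name: f(*(x_values[a] for a in args))
            for name, (args, f) in aggregations.items()}
print(y_values)  # roughly {'load_avg': 0.8, 'net_total': 5.8}
```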


Let us assume that the initial parameters may depend on each other. From this dependence it follows that not all combinations from the Cartesian product ∏_{x∈X} D(x) are achievable. Thus there exists a set D_X ⊆ ∏_{x∈X} D(x) in which the possible combinations of parameter values lie as the system state changes. In the same way, the admissible combinations of values of the parameters y are defined; we denote this set by D_Y, D_Y ⊆ ∏_{y∈Y} D(y).

Let us assume that the domain D(x) of each initial variable x is known. It follows that the domains of the aggregated variables are also known. However, the domains D_X and D_Y are unknown, since their descriptions are tied to the description of all the dependencies. At the same time, we make the assumption that the dependencies between parameters hardly change over time, and this can be exploited in the search for anomalies.

Definition 1. Any finite set of parameters from X is called an information space IS_X over the initial parameters X. Similarly, IS_Y is a finite set of parameters from Y. When it is clear from the context, we omit the index and write IS.

Let us assume that one of the parameters x ∈ X has failed, i.e. its value is still in D(x), but with this value the current state of the system goes beyond the domain D_X. Consider the influence of this event on the aggregated description Y. The aggregated states change for those y ∈ Y that are defined by functions containing the parameter x. The impact of the failure of x thus affects several values y, although the change in a value y can be much less noticeable than the failure of the parameter x itself. This means that information about the failure can be distributed among several IS, and the search for the anomaly and for the cause of the fault needs to look through several IS (a sketch of this dependency inversion is given after Example 1).

Example 1. Let us demonstrate the SA's work on failure identification using the connections between IS. Assume there is a server running the Microsoft Windows Server 2008 operating system (OS), on which the role of file storage is configured and shared network access to the data is set up. After an unknown change on the server, users cannot access the file storage through the local area network.

The SA connects to the server (locally or remotely) and begins the diagnostic process. First, the SA checks for the existence of the necessary files and folders on the server (IS1). Suppose the SA finds in IS1 that a disk, say D, on which all the data should be stored, is missing from the system. The SA then opens the OS system log (IS2) and searches for events marked as "error" or "warning", from the earliest event to the latest. In the course of the search the SA finds an event marked as "error"; its detailed description contains the message that disk D has stopped working. The messages following it in the log, marked as "attention", show that the process of data exchange with the disk is broken.
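Returning to the propagation of a failure into the aggregated description discussed before Example 1: the aggregates affected by a failed parameter x are exactly those whose argument lists contain x, so the dependency map can be inverted mechanically. A minimal sketch under that assumption, with a dependency map of the same shape as the previous sketch (here reduced to just the argument tuples; all names are illustrative):

```python
# Illustrative dependency map: each aggregate y -> the initial
# parameters x it is computed from.
aggregations = {
    "load_avg":  ("cpu_load", "disk_io"),
    "net_total": ("net_in", "net_out"),
}

def affected_aggregates(failed_param: str,
                        aggregations: dict[str, tuple[str, ...]]) -> set[str]:
    """Return the aggregates whose defining functions take the failed parameter."""
    return {name for name, args in aggregations.items() if failed_param in args}

print(affected_aggregates("disk_io", aggregations))  # {'load_avg'}
```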


In this case, changes in the state of the equipment (the cause) led to the appearance of OS log entries and also to the error in the system providing network access to the file storage. After the problems with the equipment are eliminated (IS3), the data recovered, and access to the file storage restored, users can continue their work.

As shown earlier, failures at the level of the initial parametrization should be expected to be more weakly visible at the level of aggregation.

Let us describe the main idea of addressing the analysis of an IS. Each information space restricts the investigation to only a part of the parameters of the system. Restricting the studied characteristics reduces the "noise" that arises when many processes are considered simultaneously and that prevents an analysis of the behavior of the selected parameters. The choice of information space is determined by an interval of time around the timepoint of an already found feature of an anomaly in some process. In addition, it is necessary to select IS that have connections with the processes in which the features of anomalies were found (a sketch of this selection step follows the list below). To exploit these advantages of an isolated analysis of separate IS, one needs a database of the interconnections of the different components of the computer system and its external environment.

Assume that several IS with indications of anomalies, coordinated in time and in their interconnections, have been detected. Then a unifying procedure is needed to decide on the existence of an anomaly during this period and in this region of the interconnected components of the computer system and the external environment. For this purpose, one needs a criterion establishing that the identified ill-defined anomalies are not a consequence of a random combination of circumstances, i.e. that the probability of anomalies arising at random as a result of the procedure is sufficiently small.

Two important questions are considered further:
• methods of detection of anomalies in an IS;
• unification of information on poorly expressed anomalies in various IS.
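A minimal sketch of the addressing step, under the assumption that the IS connections are stored in an ISDatabase like the one sketched in the introduction; the window width is an arbitrary illustrative choice, not a value from the paper:

```python
def candidate_spaces(db: "ISDatabase", start_space: str,
                     t0: float, half_width: float = 30.0) -> dict[str, tuple[float, float]]:
    """IS connected to start_space, each paired with the time window to inspect.

    t0 is the timepoint of the already found feature of an anomaly;
    the window around it bounds the search in each connected IS.
    """
    window = (t0 - half_width, t0 + half_width)
    return {name: window for name in db.neighbors(start_space)}
```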

3 The Analysis of Data for Identification of Poorly Expressed Anomalies in Information Spaces

Recall that all parameters x and y have domains of values {D(x)} and {D(y)}, which we consider to be known, whereas the domains D_X and D_Y are unknown.

The method of identifying anomalies in an IS is as follows. For each IS it is necessary to take its features into account when defining a measure of dissimilarity d_{T1,T} between the sections of a process in this IS during the period of time T1 and the period of time T. This means that we suggest using machine learning to describe the normal behavior of the processes in an IS and to define the measure of dissimilarity and its variation under normal conditions, while the parameter T should capture the influence of a fault in the IS. A sketch of one possible such measure follows.
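As one hedged example of what d_{T1,T} might look like for discrete data, the sketch below compares the relative word frequencies observed in a long reference period T1 against those in a short inspected period T, using the total variation distance. The paper does not fix a particular measure; this choice, and all names in the code, are illustrative assumptions.

```python
from collections import Counter

def rel_freq(words: list[str]) -> dict[str, float]:
    """Relative frequencies eta(l) of words observed in one time section.

    Assumes the section is non-empty.
    """
    counts = Counter(words)
    total = len(words)
    return {w: c / total for w, c in counts.items()}

def dissimilarity(ref_words: list[str], test_words: list[str]) -> float:
    """Total variation distance between the two frequency diagrams."""
    p, q = rel_freq(ref_words), rel_freq(test_words)
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in set(p) | set(q))

def poorly_expressed(ref_words: list[str], test_words: list[str], c: float) -> int:
    """Binary flag z_j for one IS: 1 if d_{T1,T} exceeds the threshold C."""
    return 1 if dissimilarity(ref_words, test_words) > c else 0
```

The flags z_j produced this way for the m information spaces can then be fed into the unifying conjunction α defined below.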


Definition 2. A poorly expressed anomaly of a process y(t) in the period of time T is determined by the rule d_{T1,T}(y(t)) > C, where C is some threshold value.

Using this definition requires a deep preliminary analysis of the IS. Let us consider an example of a measure of dissimilarity.

Example 2. Let y(t) be a sequence of words in some language L (for example, a log of events in a computer system or network). Assume that T1 is a period of time without anomalies, wide enough that steady statistical regularities of the occurrence of words of the process y(t) appear on the interval T1. Let us mark the words and expressions of the process y(t) on T1. As a result, we obtain the diagram of relative frequencies η(l) of the occurrence of each word l of the language L in the process y(t). Then the interval T considered earlier contains a poorly expressed anomaly if it contains a word l such that η(l) < C, where C is a threshold value. This means that the probability of appearance of the word l is low and can be estimated empirically.

This simplest way of identifying a poorly expressed anomaly in an IS can be generalized in various directions. Nevertheless, Example 2 conveys the main idea of defining a measure of similarity for discrete words.

Let us define the simplest approach to the construction of the unifying procedure. Consider the variable z_j, which is equal to 1 if a poorly expressed anomaly is revealed in the j-th IS, j = 1, 2, ..., m, and is equal to 0 otherwise. If all the anomalies revealed in the m IS are of insertion type, then an anomaly of the system is defined by the conjunction α = ∧_{j=1}^{m} z_j. Taking into account the poorly expressed nature of the anomalies, it is possible to logically cut off a part of the IS in which anomalies are not revealed. For example, for k