Advances in Intelligent Networking and Collaborative Systems: The 12th International Conference on Intelligent Networking and Collaborative Systems (INCoS-2020) [1st ed.] 9783030577957, 9783030577964

This book aims to provide the latest research findings, innovative research results, methods and development techniques


Language: English · Pages: XXIX, 517 [543] · Year: 2021


Table of contents:
Front Matter ....Pages i-xxix
Performance Evaluation of WMNs by WMN-PSODGA Simulation System Considering Exponential Distribution of Mesh Clients and Different Router Replacement Methods (Seiji Ohara, Admir Barolli, Phudit Ampririt, Keita Matsuo, Leonard Barolli, Makoto Takizawa)....Pages 1-14
An Integrated Fuzzy-Based Simulation System for Driving Risk Management in VANETs Considering Road Condition as a New Parameter (Kevin Bylykbashi, Ermioni Qafzezi, Phudit Ampririt, Keita Matsuo, Leonard Barolli, Makoto Takizawa)....Pages 15-25
Performance Evaluation of RIWM and RDVM Router Replacement Methods for WMNs by WMN-PSOHC Hybrid Intelligent System (Shinji Sakamoto, Admir Barolli, Phudit Ampririt, Leonard Barolli, Shusuke Okamoto)....Pages 26-35
A Decision-Making System Based on Fuzzy Logic for IoT Node Selection in Opportunistic Networks Considering Node Betweenness Centrality as a New Parameter (Miralda Cuka, Donald Elmazi, Makoto Ikeda, Keita Matsuo, Leonard Barolli, Makoto Takizawa)....Pages 36-43
Applying a Consensus Building Approach to Communication Projects in the Health Sector: The Momento Medico Case Study (Ilaria Avino, Giuseppe Fenza, Graziano Fuccio, Alessia Genovese, Vincenzo Loia, Francesco Orciuoli)....Pages 44-55
Gesture-Based Human-Machine Interface System by Using Omnidirectional Camera (Liao Sichao, Yasuto Nakamura, Hiroyoshi Miwa)....Pages 56-66
A Secure Group Communication (SGC) Protocol for a P2P Group of Peers Using Blockchain (Rui Iizumi, Takumi Saito, Shigenari Nakamura, Makoto Takizawa)....Pages 67-77
Method for Detecting Onset Times of Sounds of String Instrument (Kenta Kimoto, Hiroyoshi Miwa)....Pages 78-88
PageRank for Billion-Scale Networks in RDBMS (Aly Ahmed, Alex Thomo)....Pages 89-100
An Algorithm to Select an Energy-Efficient Server for an Application Process in a Cluster of Servers (Kaiya Noguchi, Takumi Saito, Dilawaer Duolikun, Tomoya Enokido, Makoto Takizawa)....Pages 101-111
Suggesting Cultural Heritage Points of Interest Through a Specialized Chatbot (Roberto Canonico, Giovanni Cozzolino, Giancarlo Sperlì)....Pages 112-120
Considering a Method for Generating Human Mobility Model by Reinforcement Learning (Yuutaro Iwai, Akihiro Fujihara)....Pages 121-132
Data Mining on Open Public Transit Data for Transportation Analytics During Pre-COVID-19 Era and COVID-19 Era (Carson K. Leung, Yubo Chen, Siyuan Shang, Yan Wen, Connor C. J. Hryhoruk, Denis L. Levesque, Nicholas A. Braun, Nitya Seth, Prakhar Jain)....Pages 133-144
Eye Movement Patterns as a Cryptographic Lock (Marek R. Ogiela, Lidia Ogiela)....Pages 145-148
Evolutionary Fuzzy Rules for Intrusion Detection in Wireless Sensor Networks (Tarek Batiha, Pavel Krömer)....Pages 149-160
Blockchain Architecture for Secured Inter-healthcare Electronic Health Records Exchange (Oluwaseyi Ajayi, Meryem Abouali, Tarel Saadawi)....Pages 161-172
Semi-automatic Knowledge Base Expansion for Question Answering (Alessandro Maisto, Giandomenico Martorelli, Antonietta Paone, Serena Pelosi)....Pages 173-182
Mina: SeMantic vIrtual Assistant for Domain oNtology Based Question-Answering (Nicola Fiore, Gaetano Parente, Michele Stingo, Massimiliano Polito)....Pages 183-193
SafeEat: Extraction of Information About the Presence of Food Allergens in Recipes (Alessandra Amato, Giovanni Cozzolino)....Pages 194-203
Accelerated Neural Intrusion Detection for Wireless Sensor Networks (Tarek Batiha, Pavel Krömer)....Pages 204-215
End-to-End Security for Connected Vehicles (Kazi J. Ahmed, Marco Hernandez, Myung Lee, Kazuya Tsukamoto)....Pages 216-225
Triangle Enumeration on Massive Graphs Using AWS Lambda Functions (Tengkai Yu, Venkatesh Srinivasan, Alex Thomo)....Pages 226-237
C’Meal! the ChatBot for Food Information (Alessandra Amato, Giovanni Cozzolino)....Pages 238-244
A Study on the Effect of Triads on the Wigner’s Semicircle Law of Networks (Toyoaki Taniguchi, Yusuke Sakumoto)....Pages 245-255
COVID-19-FAKES: A Twitter (Arabic/English) Dataset for Detecting Misleading Information on COVID-19 (Mohamed K. Elhadad, Kin Fun Li, Fayez Gebali)....Pages 256-268
Precision Dosing Management with Intelligent Computing in Digital Health (Hong Lu, Sara Rosenbaum, Wei Lu)....Pages 269-280
Optimal Number of MOAP Robots for WMNs Using Silhouette Theory (Atushi Toyama, Kenshiro Mitsugi, Keita Matsuo, Leonard Barolli)....Pages 281-290
Performance Evaluation of a Recovery Method for Vehicular DTN Considering Different Reset Thresholds (Yoshiki Tada, Makoto Ikeda, Leonard Barolli)....Pages 291-299
Survey of UAV Autonomous Landing Based on Vision Processing (Liu Yubo, Bei Haohan, Li Wenhao, Huang Ying)....Pages 300-311
Improved Sentiment Urgency Emotion Detection for Business Intelligence (Tariq Soussan, Marcello Trovati)....Pages 312-318
Knowledge-Based Networks for Artificial Intuition (Olayinka Johnny, Marcello Trovati)....Pages 319-325
Research on Foreign Anti-terrorism Intelligence Early Warning Based on Visual Measurement (Xiang Pan, Zhiting Xiao, Xuan Guo, Yuan Chen)....Pages 326-337
Credit Rating Based on Hybrid Sampling and Dynamic Ensemble (Shudong Liu, Jiamin Wei, Xu Chen, Chuang Wang, Xu An Wang)....Pages 338-347
Low Infrared Emission Hybrid Frequency Selective Surface with Low-Frequency Transmission and High-Frequency Low Reflection in Microwave Region (Yiming Xu, Yu Yang, Xiao Li)....Pages 348-360
Rapid Detection of Crowd Abnormal Behavior Based on the Hierarchical Thinking (Xiao Li, Yu Yang, Yiming Xu, Linyang Li, Chao Wang)....Pages 361-371
Deep and Shallow Feature Fusion and Recognition of Recording Devices Based on Attention Mechanism (Chunyan Zeng, Dongliang Zhu, Zhifeng Wang, Yao Yang)....Pages 372-381
Teaching Evaluation Index Based on Analytic Hierarchy Process (Qiong Li, Yanyan Zhao, Jiangtao Li, Lili Su)....Pages 382-390
High-Dimensional Data Clustering Algorithm Based on Stacked-Random Projection (Yujia Sun, Jan Platoš)....Pages 391-401
Adaptive Monitoring in Multiservice Systems (Lukáš Révay, Sanja Tomić)....Pages 402-412
Towards Faster Matching Algorithm Using Ternary Tree in the Area of Genome Mapping (Rostislav Hřivňák, Petr Gajdoš, Václav Snášel)....Pages 413-424
Preprocessing COVID-19 Radiographic Images by Evolutionary Column Subset Selection (Jana Nowaková, Pavel Krömer, Jan Platoš, Václav Snášel)....Pages 425-436
Transmission Scheduling for Tandemly-Connected Sensor Networks with Heterogeneous Packet Generation Rates (Ryosuke Yoshida, Masahiro Shibata, Masato Tsuru)....Pages 437-446
Smart Watering System Based on Framework of Low-Bandwidth Distributed Applications (LBDA) in Cloud Computing (Nurdiansyah Sirimorok, Mansur As, Kaori Yoshida, Mario Köppen)....Pages 447-459
P4-Based Implementation and Evaluation of Adaptive Early Packet Discarding Scheme (Kazumi Kumazoe, Masato Tsuru)....Pages 460-469
Matching Based Content Discovery Method on Geo-Centric Information Platform (Kaoru Nagashima, Yuzo Taenaka, Akira Nagata, Hitomi Tamura, Kazuya Tsukamoto, Myung Lee)....Pages 470-479
SDN-Based In-network Early QoE Prediction for Stable Route Selection on Multi-path Network (Shumpei Shimokawa, Yuzo Taenaka, Kazuya Tsukamoto, Myung Lee)....Pages 480-492
Reliable Network Design Considering Cost to Decrease Failure Probability of Simultaneous Failure (Yuma Morino, Hiroyoshi Miwa)....Pages 493-502
Beacon-Less Autonomous Transmission Control Method for Spatio-Temporal Data Retention (Ichiro Goto, Daiki Nobayashi, Kazuya Tsukamoto, Takeshi Ikenaga, Myung Lee)....Pages 503-513
Back Matter ....Pages 515-517


Advances in Intelligent Systems and Computing 1263

Leonard Barolli · Kin Fun Li · Hiroyoshi Miwa, Editors

Advances in Intelligent Networking and Collaborative Systems
The 12th International Conference on Intelligent Networking and Collaborative Systems (INCoS-2020)

Advances in Intelligent Systems and Computing Volume 1263

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong

The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. ** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink **

More information about this series at http://www.springer.com/series/11156

Leonard Barolli · Kin Fun Li · Hiroyoshi Miwa
Editors

Advances in Intelligent Networking and Collaborative Systems
The 12th International Conference on Intelligent Networking and Collaborative Systems (INCoS-2020)

Editors

Leonard Barolli
Department of Information and Communication Engineering, Faculty of Information Engineering, Fukuoka Institute of Technology, Fukuoka, Japan

Kin Fun Li
Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC, Canada

Hiroyoshi Miwa
School of Science and Technology, Kwansei Gakuin University, Sanda, Japan

ISSN 2194-5357  ISSN 2194-5365 (electronic)
Advances in Intelligent Systems and Computing
ISBN 978-3-030-57795-7  ISBN 978-3-030-57796-4 (eBook)
https://doi.org/10.1007/978-3-030-57796-4

© Springer Nature Switzerland AG 2021

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Welcome Message from the INCoS-2020 Organizing Committee

Welcome to the 12th International Conference on Intelligent Networking and Collaborative Systems (INCoS-2020), held at the University of Victoria, Victoria, Canada, from 31 August to 2 September 2020.

INCoS is a multidisciplinary conference that covers the latest advances in intelligent social networks and collaborative systems, intelligent networking systems, mobile collaborative systems, secure intelligent cloud systems, etc. Additionally, the conference addresses security, authentication, privacy, data trust and user trustworthiness behaviour, which have become crosscutting features of intelligent collaborative systems.

With the fast development of the Internet, we are experiencing a shift from the traditional sharing of information and applications as the main purpose of networking systems towards an emergent paradigm that places people at the very centre of networks and exploits the value of people's connections, relations and collaboration. Social networks are playing a major role as one of the drivers of the dynamics and structure of intelligent networking and collaborative systems. Virtual campuses, virtual communities and organizations strongly leverage intelligent networking and collaborative systems through a great variety of formal and informal electronic relations, such as business-to-business, peer-to-peer and many types of online collaborative learning interactions, including virtual campuses, eLearning and MOOC systems. Altogether, this has resulted in entangled systems that need to be managed efficiently and in an autonomous way. In addition, the conjunction of the latest powerful technologies based on cloud, mobile and wireless infrastructures is currently bringing new dimensions to collaborative and networking applications while raising new issues and challenges.

INCoS-2020 paid special attention to cloud computing services, storage, security and privacy; data mining, machine learning and collective intelligence; cooperative communication and cognitive systems; big data analytics; and eLearning, virtual campuses and MOOCs, among others. The aim of this conference is to stimulate research that will lead to the creation of responsive environments for networking and, in the longer term, to the development of adaptive, secure, mobile and intuitive intelligent systems for collaborative work and learning.

As in all previous editions, INCoS-2020 counted on the support and collaboration of a large and internationally recognized TPC covering all the main themes of the conference. The successful organization of the conference was achieved thanks to the great collaboration and hard work of many people and conference supporters. First, we would like to thank all the authors for their continued support of the conference by submitting their research work, and for their presentations and discussions during the conference days. We would like to thank the PC co-chairs, TPC members and external reviewers for carefully evaluating the submissions and providing constructive feedback to the authors. We would like to thank the track chairs for setting up the tracks and their respective TPCs, and for actively promoting the conference and their tracks. We would like to acknowledge the excellent work and support of the international advisory committee. Our gratitude goes to the conference keynote speakers for their interesting and inspiring speeches. We greatly appreciate the support of the web administrator co-chairs. We are very grateful to Springer as well as several academic institutions for their endorsement and assistance. Finally, we hope that you will find these proceedings to be a valuable resource in your professional, research and educational activities.

Leonard Barolli
Steering Committee Chair

Kin Fun Li
Hiroyoshi Miwa
General Co-chairs

Alex Thomo
Flora Amato
Omar Hussain
Program Co-chairs

INCoS-2020 Organizing Committee

Honorary Chair
Makoto Takizawa, Hosei University, Japan

General Co-chairs
Kin Fun Li, University of Victoria, Canada
Hiroyoshi Miwa, Kwansei Gakuin University, Japan

Program Co-chairs
Alex Thomo, University of Victoria, Canada
Flora Amato, University of Naples “Federico II”, Italy
Omar Hussain, UNSW Canberra, Australia

Workshops Co-chairs
Issa Traore, University of Victoria, Canada
Santi Caballé, Open University of Catalonia, Spain
Natalia Kryvinska, Comenius University in Bratislava, Slovakia

International Advisory Committee
Vincenzo Loia, University of Salerno, Italy
Fang-Yie Leu, Tunghai University, Taiwan
Albert Zomaya, University of Sydney, Australia


International Liaison Co-chairs
Riham AlTawry, University of Victoria, Canada
Aneta Poniszewska-Maranda, Lodz University of Technology, Poland
Xu An Wang, Engineering University of CAPF, China

Award Co-chairs
Tomoya Enokido, Rissho University, Japan
Marek Ogiela, AGH University of Science and Technology, Poland
Masato Tsuru, Kyushu Institute of Technology, Japan
Vaclav Snasel, Technical University of Ostrava, Czech Republic

Web Administrator Co-chairs
Kevin Bylykbashi, Fukuoka Institute of Technology, Japan
Donald Elmazi, Fukuoka Institute of Technology, Japan
Miralda Cuka, Fukuoka Institute of Technology, Japan

Local Arrangement Co-chairs
Parastoo Soleimani, University of Victoria, Canada
Sina Ghaffari, University of Victoria, Canada

Finance Chair
Makoto Ikeda, Fukuoka Institute of Technology, Japan

Steering Committee Chair
Leonard Barolli, Fukuoka Institute of Technology, Japan

Track Areas and PC Members

Track 1: Data Mining, Machine Learning and Collective Intelligence

Track Co-chairs
Carson K. Leung, University of Manitoba, Canada
Alfredo Cuzzocrea, University of Calabria, Italy

TPC Members
Fan Jiang, University of Northern British Columbia, Canada
Wookey Lee, Inha University, Korea
Oluwafemi A. Sarumi, Federal University of Technology, Akure, Nigeria
Syed K. Tanbeer, University of Manitoba, Canada
Tomas Vinar, Comenius University in Bratislava, Slovakia
Kin Fun Li, University of Victoria, Canada

Track 2: Fuzzy Systems and Knowledge Management

Track Co-chairs
Marek Ogiela, AGH University of Science and Technology, Poland
Morteza Saberi, University of Technology Sydney, Australia
Chang Choi, Gachon University, Republic of Korea

TPC Members
Hsing-Chung (Jack) Chen, Asia University, Taiwan
Been-Chian Chien, National University, Taiwan
Junho Choi, Chosun University, Korea
Farookh Khadeer Hussain, University of Technology Sydney, Australia
Hae-Duck Joshua Jeong, Korean Bible University, Korea
Hoon Ko, Sungkyunkwan University, Korea
Natalia Krzyworzeka, AGH University of Science and Technology, Poland
Libor Mesicek, J. E. Purkinje University, Czech Republic
Lidia Ogiela, Pedagogical University of Cracow, Poland
Su Xi, Hohai University, China
Ali Azadeh, Tehran University, Iran
Jin Hee Yoon, Sejong University, South Korea
Hamed Shakouri, Tehran University, Iran
Jee-Hyong Lee, Sungkyunkwan University, South Korea
Jung Sik Jeon, Mokpo National Maritime University, South Korea

Track 3: Grid and P2P Distributed Infrastructure for Intelligent Networking and Collaborative Systems

Track Co-chairs
Aneta Poniszewska-Maranda, Lodz University of Technology, Poland
Takuya Asaka, Tokyo Metropolitan University, Japan

TPC Members
Jordi Mongay Batalla, National Institute of Telecommunications, Poland
Nik Bessis, Edge Hill University, UK
Aniello Castiglione, University of Naples Parthenope, Italy
Naveen Chilamkurti, La Trobe University, Australia
Radu-Ioan Ciobanu, University Politehnica of Bucharest, Romania
Alexandru Costan, IRISA/INSA Rennes, France
Vladimir-Ioan Cretu, University Politehnica of Timisoara, Romania
Marc Frincu, West University of Timisoara, Romania
Rossitza Ivanova Goleva, Technical University of Sofia, Bulgaria
Dorian Gorgan, Technical University of Cluj-Napoca, Romania
Mauro Iacono, University of Campania “Luigi Vanvitelli”, Italy
George Mastorakis, Technological Educational Institute of Crete, Greece
Constandinos X. Mavromoustakis, University of Nicosia, Cyprus
Gabriel Neagu, National Institute for Research and Development in Informatics, Romania
Rodica Potolea, Technical University of Cluj-Napoca, Romania
Radu Prodan, University of Innsbruck, Austria
Ioan Salomie, Technical University of Cluj-Napoca, Romania
George Suciu, BEIA International, Romania
Nicolae Tapus, University Politehnica of Bucharest, Romania
Sergio L. Toral Marín, University of Seville, Spain
Radu Tudoran, European Research Center, Germany
Lucian Vintan, Lucian Blaga University, Romania
Mohammad Younas, Oxford Brookes University, UK

Track 4: Nature’s Inspired Parallel Collaborative Systems

Track Co-chairs
Mohammad Shojafar, University of Rome, Italy
Zahra Pooranian, University of Padua, Italy
Daichi Kominami, Osaka University, Japan

TPC Members
Francisco Luna, University of Málaga, Spain
Sergio Nesmachnow, Universidad de la República, Uruguay
Nouredine Melab, University of Lille 1, France
Julio Ortega, University of Granada, Spain
Domingo Giménez, University of Murcia, Spain
Gregoire Danoy, University of Luxembourg, Luxembourg
Carolina Salto, University of La Pampa, Argentina
Stefka Fidanova, IICT-BAS, Bulgaria
Michael Affenzeller, Upper Austria University, Austria
Hernan Aguirre, Shinshu University, Japan
Francisco Chicano, University of Málaga, Spain
Javid Taheri, Karlstad University, Sweden
Enrique Domínguez, University of Málaga, Spain
Guillermo Leguizamón, Universidad Nacional de San Luis, Argentina
Konstantinos Parsopoulos, University of Ioannina, Greece
Carlos Segura, CIMAT, Mexico
Eduardo Segredo, Edinburgh Napier University, UK
Javier Arellano, University of Málaga, Spain

Track 5: Security, Organization, Management and Autonomic Computing for Intelligent Networking and Collaborative Systems

Track Co-chairs
Jungwoo Ryoo, Pennsylvania State University, USA
Simon Tjoa, St. Pölten University of Applied Sciences, Austria

TPC Members
Nikolaj Goranin, Vilnius Gediminas Technical University, Lithuania
Kenneth Karlsson, Lapland University of Applied Sciences, Finland
Peter Kieseberg, SBA Research, Austria
Hyoungshick Kim, Sungkyunkwan University, Korea
Hae Young Lee, DuDu IT, Korea
Moussa Ouedraogo, Wavestone, Luxembourg
Sebastian Schrittwieser, St. Pölten University of Applied Sciences, Austria
Syed Rizvi, Pennsylvania State University, USA

Track 6: Software Engineering, Semantics and Ontologies for Intelligent Networking and Collaborative Systems

Track Co-chairs
Kai Jander, University of Hamburg, Germany
Flora Amato, University of Naples “Federico II”, Italy

TPC Members
Tsutomu Kinoshita, Fukui University of Technology, Japan
Kouji Kozaki, Osaka University, Japan
Hiroyoshi Miwa, Kwansei Gakuin University, Japan
Burin Rujjanapan, Nation University, Thailand
Hiroshi Kanasugi, Tokyo University, Japan
Takayuki Shimotomai, Advanced Simulation Technology of Mechanics R&D, Japan
Jinattaporn Khumsri, Fukui University of Technology, Japan
Rene Witte, Concordia University, Canada
Amal Zouaq, University of Ottawa, Canada
Jelena Jovanovic, University of Belgrade, Serbia
Zeinab Noorian, Ryerson University, Canada
Faezeh Ensan, Ferdowsi University of Mashhad, Iran
Alireza Vazifedoost, Sun Life Financial, Canada
Morteza Mashayekhi, Royal Bank of Canada, Canada
Giovanni Cozzolino, University of Naples “Federico II”, Italy

Track 7: Wireless and Sensor Systems for Intelligent Networking and Collaborative Systems

Track Co-chairs
Do van Thanh, Telenor & Oslo Metropolitan University, Norway
Shigeru Kashihara, Nara Institute of Science and Technology, Japan

TPC Members
Dhananjay Singh, HUFS, Korea
Shirshu Varma, IIIT-Allahabad, India
B. Balaji Naik, NIT-Sikkim, India
Sayed Chhattan Shah, HUFS, Korea, USA
Madhusudan Singh, Yonsei University, Korea
Irish Singh, Ajou University, Korea
Gaurav Tripathi, Bharat Electronics Limited, India
Jun Kawahara, Kyoto University, Japan
Muhammad Niswar, Hasanuddin University, Indonesia
Vasaka Visoottiviseth, Mahidol University, Thailand
Jane Louie F. Zamora, Weathernews Inc., Japan

Track 8: Service-Based Systems for Enterprise Activities Planning and Management

Track Co-chairs
Corinna Engelhardt-Nowitzki, University of Applied Sciences, Austria
Natalia Kryvinska, Comenius University in Bratislava, Slovakia

TPC Members
Maria Bohdalova, Comenius University in Bratislava, Slovakia
Ivan Demydov, Lviv Polytechnic National University, Ukraine
Jozef Juhar, Technical University of Košice, Slovakia
Nor Shahniza Kamal Bashah, Universiti Teknologi MARA, Malaysia
Eric Pardede, La Trobe University, Australia
Francesco Moscato, University of Campania, Italy
Tomoya Enokido, Rissho University, Japan
Olha Fedevych, Lviv Polytechnic National University, Ukraine

Track 9: Next Generation Secure Network Protocols and Components

Track Co-chairs
Xu An Wang, Engineering University of CAPF, China
Mingwu Zhang, Hubei University of Technology, China

TPC Members
Fushan Wei, The PLA Information Engineering University, China
He Xu, Nanjing University of Posts and Telecommunications, China
Yining Liu, Guilin University of Electronic Technology, China
Yuechuan Wei, Engineering University of CAPF, China
Weiwei Kong, Xi’an University of Posts & Telecommunications, China
Dianhua Tang, CETC 30, China
Hui Tian, Huaqiao University, China
Urszula Ogiela, Pedagogical University of Krakow, Poland

Track 10: Big Data Analytics for Learning, Networking and Collaborative Systems

Track Co-chairs
Santi Caballé, Open University of Catalonia, Spain
Francesco Orciuoli, University of Salerno, Italy
Shigeo Matsubara, Kyoto University, Japan

TPC Members
Jordi Conesa, Open University of Catalonia, Spain
Soumya Barnejee, Institut National des Sciences Appliquées, France
David Bañeres, Open University of Catalonia, Spain
Nicola Capuano, University of Salerno, Italy
Nestor Mora, Open University of Catalonia, Spain
Jorge Moneo, University of San Jorge, Spain
David Gañán, Open University of Catalonia, Spain
Isabel Guitart, Open University of Catalonia, Spain
Elis Kulla, Okayama University of Science, Japan
Evjola Spaho, Polytechnic University of Tirana, Albania
Florin Pop, University Politehnica of Bucharest, Romania
Kin Fun Li, University of Victoria, Canada
Miguel Bote, University of Valladolid, Spain
Pedro Muñoz, University of Carlos III, Spain

Track 11: Cloud Computing: Services, Storage, Security and Privacy

Track Co-chairs
Javid Taheri, Karlstad University, Sweden
Shuiguang Deng, Zhejiang University, China

TPC Members
Ejaz Ahmed, National Institute of Standards and Technology, USA
Asad Malik, National University of Science and Technology, Pakistan
Usman Shahid, Comsats Institute of Information Technology, Pakistan
Assad Abbas, North Dakota State University, USA
Nikolaos Tziritas, Chinese Academy of Sciences, China
Osman Khalid, Comsats Institute of Information Technology, Pakistan
Kashif Bilal, Qatar University, Qatar
Javid Taheri, Karlstad University, Sweden
Saif Rehman, COMSATS University Islamabad, Pakistan
Inayat Babar, University of Engineering and Technology, Pakistan
Thanasis Loukopoulos, Technological Educational Institute of Athens, Greece
Mazhar Ali, COMSATS University Islamabad, Pakistan
Tariq Umer, COMSATS University Islamabad, Pakistan

Track 12: Intelligent Collaborative Systems for Work and Learning, Virtual Organization and Campuses

Track Co-chairs
Nikolay Kazantsev, National Research University Higher School of Economics, Russia
Monika Davidekova, Comenius University in Bratislava, Slovakia

TPC Members
Luis Alberto Casillas, University of Guadalajara, Mexico
Nestor Mora, University of Cadiz, Spain
Michalis Feidakis, University of Aegean, Greece
Sandra Isabel Enciso, Fundación Universitaria Juan N. Corpas, Colombia
Nicola Capuano, University of Salerno, Italy
Rafael Del Hoyo, Technological Center of Aragon, Spain
George Caridakis, University of Aegean, Greece
Kazunori Mizuno, Takushoku University, Japan
Satoshi Ono, Kagoshima University, Japan
Yoshiro Imai, Kagawa University, Japan
Takashi Mitsuishi, Tohoku University, Japan
Hiroyuki Mitsuhara, Tokushima University, Japan

Track 13: Social Networking and Collaborative Systems

Track Co-chairs
Nicola Capuano, University of Salerno, Italy
Dusan Soltes, Comenius University in Bratislava, Slovakia
Yusuke Sakumoto, Kwansei Gakuin University, Japan

TPC Members
Santi Caballé, Open University of Catalonia, Spain
Thanasis Daradoumis, University of the Aegean, Greece
Angelo Gaeta, University of Salerno, Italy
Christian Guetl, Graz University of Technology, Austria
Miltiadis Lytras, American College of Greece, Greece
Agathe Merceron, Beuth University of Applied Sciences Berlin, Germany
Francis Palma, Screaming Power, Canada
Krassen Stefanov, Sofia University “St. Kliment Ohridski”, Bulgaria
Daniele Toti, Roma Tre University, Italy
Jian Wang, Wuhan University, China
Jing Xiao, South China Normal University, China
Jian Yu, Auckland University of Technology, New Zealand
Aida Masaki, Tokyo Metropolitan University, Japan
Takano Chisa, Hiroshima City University, Japan
Sho Tsugawa, Tsukuba University, Japan

Track 14: Intelligent and Collaborative Systems for e-Health

Track Co-chairs
Massimo Esposito, Institute for High Performance Computing and Networking - National Research Council of Italy, Italy
Mario Ciampi, Institute for High Performance Computing and Networking - National Research Council of Italy, Italy
Giovanni Luca Masala, University of Plymouth, UK

TPC Members
Tim Brown, Australian National University, Australia
Mario Marcos do Espirito Santo, Universidade Estadual de Montes Claros, Brazil
Jana Heckenbergerova, University of Pardubice, Czech Republic
Zdenek Matej, Masaryk University, Czech Republic
Michal Musilek, University of Hradec Kralove, Czech Republic
Michal Prauzek, VSB-TU Ostrava, Czech Republic
Vaclav Prenosil, Masaryk University, Czech Republic
Alvin C. Valera, Singapore Management University, Singapore
Nasem Badr El Din, University of Manitoba, Canada
Emil Pelikan, Academy of Sciences, Czech Republic
Joanne Nightingale, National Physical Laboratory, UK
Tomas Barton, University of Alberta, Canada

Track 15: Mobile Networking and Applications

Track Co-chairs
Miroslav Voznak, VSB-Technical University of Ostrava, Czech Republic
Akihiro Fujihara, Chiba Institute of Technology, Japan
Lukas Vojtech, Czech Technical University in Prague, Czech Republic

TPC Members
Nobuyuki Tsuchimura, Kwansei Gakuin University, Japan
Masanori Nakamichi, Fukui University of Technology, Japan
Masahiro Shibata, Kyushu Institute of Technology, Japan
Yusuke Ide, Kanazawa Institute of Technology, Japan
Takayuki Shimotomai, Advanced Simulation Technology of Mechanics R&D, Japan
Dinh-Thuan Do, Ton Duc Thang University, Vietnam
Floriano De Rango, University of Calabria, Italy
Homero Toral-Cruz, University of Quintana Roo, Mexico
Remigiusz Baran, Kielce University of Technology, Poland
Mindaugas Kurmis, Klaipeda State University of Applied Sciences, Lithuania
Radek Martinek, VSB-Technical University of Ostrava, Czech Republic
Mauro Tropea, University of Calabria, Italy
Gokhan Ilk, Ankara University, Turkey
Shino Iwami, Microsoft, Japan

INCoS-2020 Reviewers

Amato Flora, Barolli Admir, Barolli Leonard, Bylykbashi Kevin, Caballé Santi, Capuano Nicola, Cuka Miralda, Cui Baojiang, Elmazi Donald, Enokido Tomoya, Esposito Christian, Fenza Giuseppe, Ficco Massimo, Fiore Ugo, Fujihara Akihiro, Fun Li Kin, Funabiki Nobuo, Gañán David, Hsing-Chung Chen, Hussain Farookh, Hussain Omar, Ikeda Makoto, Ishida Tomoyuki, Javaid Nadeem, Joshua Hae-Duck, Kohana Masaki, Kolici Vladi, Koyama Akio, Kromer Pavel, Kryvinska Natalia, Kulla Elis, Leu Fang-Yie, Leung Carson, Li Yiu, Maeda Hiroshi, Mangione Giuseppina Rita, Matsuo Keita, Messina Fabrizio, Miguel Jorge, Miwa Hiroyoshi, Natwichai Juggapong, Nalepa Jakub, Nowaková Jana, Ogiela Lidia, Ogiela Marek, Orciuoli Francesco, Palmieri Francesco, Pardede Eric, Poniszewska-Maranda Aneta, Rahayu Wenny, Rawat Danda, Sakaji Hiroki, Sakumoto Yusuke, Shibata Masahiro, Shibata Yoshitaka, Snasel Vaclav, Spaho Evjola, Taniar David, Takizawa Makoto, Terzo Olivier, Thomo Alex, Tsukamoto Kazuya, Tsuru Masato, Uchida Masato, Uehara Minoru, Venticinque Salvatore, Wang Xu An, Woungang Isaac

INCoS-2020 Keynote Talks

Trustworthy Decision-Making and Artificial Intelligence

Arjan Durresi, Indiana University Purdue University Indianapolis, Indianapolis, Indiana, USA

Abstract. Algorithms and computers have been used for a long time to support decision-making in various fields of human endeavor. Examples include optimization techniques in engineering, statistics in experiment design, the modeling of different natural phenomena, and so on. In all such uses of algorithms and computers, an essential question has been how much we can trust them, what the potential errors of such models are, and what the range of their applicability is. With time, the algorithms and computers we use have become more powerful and more complex, and today we call them artificial intelligence, which includes various machine learning and other algorithmic techniques. But with the increase of the power and complexity of algorithms and computers, and with their extended use, the question of how much we should trust them becomes more crucial. Their complexity might hide more potential errors, and especially interdependencies; their solutions might be difficult to explain; and so on. To deal with these problems, we have developed an evidence- and measurement-based trust management system; our system can be used to measure trust in human-to-human, human-to-machine and machine-to-machine interactions. In this talk, we will introduce our trust system and its validation on real stock market data. Furthermore, we will discuss the use of our trust system to build more secure computer systems, to filter fake news on social networks and to develop better collective decision-making support systems for managing natural resources, as well as future potential uses.


Mining and Modeling Regime Shifts for Multiple Time Series Forecasting

Shengrui Wang, University of Sherbrooke, Sherbrooke, Quebec, Canada

Abstract. Time series are a type of time-dependent data found in many fields such as finance, medicine, meteorology, ecology and the utility industry. In such fields, time series forecasting is an issue of great importance, as it can help predict future index/equity values and behavioral changes in a stock market, health trajectories of a patient, probabilities of failure events such as death and (re-)hospitalization, and the varying electricity consumption of individual households. It also poses significant theoretical and methodological challenges. One such challenge is the identification and prediction of regimes and regime shifts, since most time series models, whether linear or nonlinear, work well only within one regime. In this talk, I will introduce our recent work on building a novel framework that models time series interaction and evolution as an ecosystem. The focus of the talk is to show how to make use of graph or network structures and community analysis approaches to account for interactions or interdependencies between time series. For this purpose, we build a time-evolving network graph from moving-window segments of time series, where nodes represent time series profiles and links correspond to the similarity between the time series. By using a community detection algorithm, we discover communities displaying similar behaviours, or regimes, and trace the discovered behaviours in the form of trajectories of structural changes in these communities. Such changes are considered behaviour events that may appear, persist or disappear w.r.t. a community, reflecting the survival of regimes and abrupt transitions between regimes. Using such network structures for modelling the interactions also allows discovering “relational” features explaining why certain behaviours may persist longer than others. These relational features, together with behaviour profiles, constitute the input to machine learning models for regime analysis and trajectory forecasting. In our work, we tackle the problem of learning regime shifts by modeling a time-dependent probability of transition between regimes, using a full time-dependent Cox regression model. We evaluate the whole approach by testing it on both synthetic and real data sets and compare its performance with that of state-of-the-art learning algorithms.


Performance Evaluation of WMNs by WMN-PSODGA Simulation System Considering Exponential Distribution of Mesh Clients and Different Router Replacement Methods

Seiji Ohara[1], Admir Barolli[2], Phudit Ampririt[1], Keita Matsuo[3], Leonard Barolli[3], and Makoto Takizawa[4]

[1] Graduate School of Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan ([email protected], [email protected])
[2] Department of Information Technology, Aleksander Moisiu University of Durres, L.1, Rruga e Currilave, Durres, Albania ([email protected])
[3] Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan ({kt-matsuo,barolli}@fit.ac.jp)
[4] Department of Advanced Sciences, Faculty of Science and Engineering, Hosei University, Kajino-Machi, Koganei-Shi, Tokyo 184-8584, Japan ([email protected])

Abstract. Wireless Mesh Networks (WMNs) are an important networking infrastructure and have many advantages, such as low cost and high-speed wireless Internet connectivity. However, they have some problems, such as router placement, coverage of mesh clients and load balancing. To deal with these problems, in our previous work we implemented a Particle Swarm Optimization (PSO) based simulation system, called WMN-PSO, and a simulation system based on Genetic Algorithm (GA), called WMN-GA. Then, we implemented a hybrid simulation system based on PSO and distributed GA (DGA), called WMN-PSODGA. Moreover, we added to the fitness function a new parameter for the load balancing of the mesh routers, called NCMCpR (Number of Covered Mesh Clients per Router). In this paper, we consider Exponential distribution of mesh clients and five router replacement methods and carry out simulations using the WMN-PSODGA system. The simulation results show that the RIWM and LDIWM router replacement methods have better performance than the other methods. Comparing RIWM and LDIWM, we see that RIWM has better behavior.

1 Introduction

Wireless networks and devices can provide users access to information and communication anytime and anywhere [3,8–11,14,20,26,27,29,33]. Wireless Mesh Networks (WMNs) are gaining a lot of attention because of their low-cost nature, which makes them attractive for providing wireless Internet connectivity. A WMN is dynamically self-organized and self-configured, with the nodes in the network automatically establishing and maintaining mesh connectivity among themselves (creating, in effect, an ad hoc network). This feature brings many advantages to WMNs, such as low up-front cost, easy network maintenance, robustness and reliable service coverage [1]. Moreover, such infrastructure can be used to deploy community networks, metropolitan area networks, municipal and corporate networks, and to support applications for urban areas, medical, transport and surveillance systems.

Mesh node placement in WMNs can be seen as a family of problems, which are shown (through graph theoretic approaches or placement problems, e.g. [6,15]) to be computationally hard to solve for most formulations [37]. We consider the version of the mesh router node placement problem in which we are given a grid area where a number of mesh router nodes are to be deployed, and a number of mesh client nodes at fixed positions (of an arbitrary distribution) in the grid area. The objective is to find a location assignment for the mesh routers to the cells of the grid area that maximizes network connectivity and client coverage while considering the load balancing of each router. Network connectivity is measured by the Size of Giant Component (SGC) of the resulting WMN graph, while user coverage is the number of mesh client nodes that fall within the radio coverage of at least one mesh router node, measured by the Number of Covered Mesh Clients (NCMC). For load balancing, we added to the fitness function a new parameter called NCMCpR (Number of Covered Mesh Clients per Router).

Node placement problems are known to be computationally hard to solve [12,13,38], and some intelligent algorithms have recently been investigated for this problem [4,7,16,18,21–23,31,32]. In [24], we implemented a Particle Swarm Optimization (PSO) based simulation system, called WMN-PSO. We also implemented another simulation system based on Genetic Algorithm (GA), called WMN-GA [19], for solving the node placement problem in WMNs. Then, we designed and implemented a hybrid simulation system based on PSO and distributed GA (DGA), which we call WMN-PSODGA.

In this paper, we present the performance analysis of WMNs using the WMN-PSODGA system considering Exponential distribution of mesh clients and different router replacement methods. The rest of the paper is organized as follows. We present our designed and implemented hybrid simulation system in Sect. 2. The simulation results are given in Sect. 3. Finally, we give conclusions and future work in Sect. 4.
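To make the three metrics above concrete, the following Python sketch evaluates SGC, NCMC and NCMCpR for one candidate placement of mesh routers. It is only an illustration of the definitions, not the WMN-PSODGA code: the function name evaluate, the list-of-coordinates representation and the single radio range radius are assumptions introduced here.

# Illustrative sketch (not the WMN-PSODGA implementation): evaluate SGC,
# NCMC and NCMCpR for one candidate placement of mesh routers.
import math
from collections import defaultdict

def evaluate(routers, clients, radius):
    """routers, clients: lists of (x, y) points; radius: radio coverage range."""
    n = len(routers)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Routers within radio range of each other are linked (union-find).
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(routers[i], routers[j]) <= radius:
                parent[find(i)] = find(j)

    # SGC: size of the largest connected component of the router graph.
    component_size = defaultdict(int)
    for i in range(n):
        component_size[find(i)] += 1
    sgc = max(component_size.values())

    # NCMC: number of clients covered by at least one router.
    # NCMCpR: per-router client counts, used to judge load balancing.
    covered = set()
    ncmc_per_router = [0] * n
    for c_idx, c in enumerate(clients):
        for r_idx, r in enumerate(routers):
            if math.dist(c, r) <= radius:
                covered.add(c_idx)
                ncmc_per_router[r_idx] += 1
    return sgc, len(covered), ncmc_per_router

A fitness function can then combine the three values with weights, rewarding a large SGC and NCMC while penalizing strongly unbalanced per-router counts.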

2 Proposed and Implemented Simulation System

2.1 Particle Swarm Optimization

In PSO, a number of simple entities (the particles) are placed in the search space of some problem or function, and each evaluates the objective function at its current location. The objective function is often minimized, and the exploration of the search space is not through evolution [17]. Each particle then determines its movement through the search space by combining some aspect of the history of its own current and best (best-fitness) locations with those of one or more members of the swarm, with some random perturbations. The next iteration takes place after all particles have been moved. Eventually the swarm as a whole, like a flock of birds collectively foraging for food, is likely to move close to an optimum of the fitness function.

Each individual in the particle swarm is composed of three D-dimensional vectors, where D is the dimensionality of the search space. These are the current position xi, the previous best position pi and the velocity vi. The particle swarm is more than just a collection of particles: a particle by itself has almost no power to solve any problem; progress occurs only when the particles interact. Problem solving is a population-wide phenomenon, emerging from the individual behaviors of the particles through their interactions. In any case, populations are organized according to some sort of communication structure or topology, often thought of as a social network. The topology typically consists of bidirectional edges connecting pairs of particles, so that if j is in i's neighborhood, i is also in j's. Each particle communicates with some other particles and is affected by the best point found by any member of its topological neighborhood. This is just the vector pi of that best neighbor, which we denote by pg. The potential kinds of population "social networks" are hugely varied, but in practice certain types have been used more frequently. We show the pseudo code of PSO in Algorithm 1. In the PSO process, the velocity of each particle is iteratively adjusted so that the particle stochastically oscillates around the pi and pg locations.

2.2 Distributed Genetic Algorithm

Distributed Genetic Algorithm (DGA) has been used in various fields of science and has shown its usefulness for the resolution of many computationally hard combinatorial optimization problems. We show the pseudo code of DGA in Algorithm 2.

Population of individuals: Unlike local search techniques, which construct a path in the solution space by jumping from one solution to another through local perturbations, DGA uses a population of individuals, giving the search a larger scope and better chances of finding good solutions. This feature is also known as the "exploration" process, in contrast to the "exploitation" process of local search methods.

Algorithm 1. Pseudo code of PSO.

/* Initialize all parameters for PSO */
Computation maxtime := Tp_max, t := 0;
Number of particle-patterns := m, 2 ≤ m ∈ N;
Particle-pattern initial solutions := P_i^0;
Particle-pattern initial positions := x_ij^0;
Particle initial velocities := v_ij^0;
PSO parameter := ω, 0 < ω ∈ R;
PSO parameter := C1, 0 < C1 ∈ R;
PSO parameter := C2, 0 < C2 ∈ R;
/* Start PSO */
Evaluate(G^0, P^0);
while t < Tp_max do
    /* Update velocities and positions */
    v_ij^(t+1) = ω · v_ij^t + C1 · rand() · (best(P_ij^t) − x_ij^t) + C2 · rand() · (best(G^t) − x_ij^t);
    x_ij^(t+1) = x_ij^t + v_ij^(t+1);
    /* If the fitness value is increased, a new solution will be accepted. */
    Update_Solutions(G^t, P^t);
    t = t + 1;
end while
Update_Solutions(G^t, P^t);
return best found pattern of particles as solution;
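As a concrete counterpart to the update step in Algorithm 1, here is a minimal Python sketch of one PSO iteration. The parameter values w, c1, c2 and the maximizing fitness callback are illustrative assumptions, not the settings used by WMN-PSODGA.

# Minimal sketch of one PSO iteration (velocity and position update of
# Algorithm 1). Parameters w, c1, c2 and the fitness callback are assumptions.
import random

def pso_step(positions, velocities, pbest, gbest, fitness,
             w=0.7, c1=1.5, c2=1.5):
    for i in range(len(positions)):
        for d in range(len(positions[i])):
            # v = w*v + c1*rand()*(pbest - x) + c2*rand()*(gbest - x)
            velocities[i][d] = (w * velocities[i][d]
                                + c1 * random.random() * (pbest[i][d] - positions[i][d])
                                + c2 * random.random() * (gbest[d] - positions[i][d]))
            positions[i][d] += velocities[i][d]
        # Accept the moved particle as a new best only if fitness increases.
        if fitness(positions[i]) > fitness(pbest[i]):
            pbest[i] = positions[i][:]
            if fitness(pbest[i]) > fitness(gbest):
                gbest[:] = pbest[i]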

Fitness: The determination of an appropriate fitness function, together with the chromosome encoding, is crucial to the performance of DGA. Ideally, we would construct objective functions with "certain regularities", i.e., objective functions that verify that, for any two individuals which are close in the search space, their respective values of the objective function are similar.

Selection: The selection of individuals to be crossed is another important aspect of DGA, as it impacts the convergence of the algorithm. Several selection schemes have been proposed in the literature for selection operators trying to cope with the premature convergence of DGA. There are many selection methods in GA. In our system, we implement two selection methods: the Random method and the Roulette wheel method.

Crossover operators: The use of crossover operators is one of the most important characteristics of DGA. The crossover operator is the means of DGA to transmit the best genetic features of parents to offspring during the generations of the evolution process. Many crossover operators have been proposed, such as Blend Crossover (BLX-α), Unimodal Normal Distribution Crossover (UNDX) and Simplex Crossover (SPX).

Mutation operators: These operators intend to improve the individuals of a population by small local perturbations. They aim to provide a component of randomness in the neighborhood of the individuals of the population. In our system, we implemented two mutation methods: uniformly random mutation and boundary mutation.
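Since only the operator names are given above, here is a hedged Python sketch of the roulette-wheel selection and the uniformly random mutation; the function names and the assumption of non-negative fitness values are ours (the Random method is simply random.choice over the population).

import random

def roulette_wheel_select(population, fitness_values):
    """Pick one individual with probability proportional to its
    (assumed non-negative) fitness."""
    total = sum(fitness_values)
    threshold = random.uniform(0.0, total)
    cumulative = 0.0
    for individual, fit in zip(population, fitness_values):
        cumulative += fit
        if cumulative >= threshold:
            return individual
    return population[-1]  # numerical safety net

def uniformly_random_mutation(position, area_w, area_h, rate=0.2):
    """Re-draw a router coordinate uniformly inside the W x H area
    with probability `rate` per coordinate."""
    x, y = position
    if random.random() < rate:
        x = random.uniform(0.0, area_w)
    if random.random() < rate:
        y = random.uniform(0.0, area_h)
    return (x, y)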


Escaping from local optima: GA itself has the ability to avoid falling prematurely into local optima and can eventually escape from them during the search process. DGA has one more mechanism for escaping from local optima: it considers several islands. Each island runs a GA for optimization, and the islands migrate genes among each other, which provides the ability to escape from local optima (see Fig. 1).

Convergence: Convergence is the mechanism by which DGA reaches good solutions. A premature convergence of the algorithm would cause all individuals of the population to become similar in their genetic features; the search would then become ineffective and the algorithm would get stuck in local optima. Maintaining the diversity of the population is therefore very important for this family of evolutionary algorithms.

Algorithm 2. Pseudo code of DGA.

/* Initialize all parameters for DGA */
Computation maxtime := T_gmax, t := 0;
Number of islands := n, 1 ≤ n ∈ N^1;
Initial solutions := P_i^0;
/* Start DGA */
Evaluate(G^0, P^0);
while t < T_gmax do
    for all islands do
        Selection();
        Crossover();
        Mutation();
    end for
    t = t + 1;
end while
Update_Solutions(G^t, P^t);
return Best found pattern of particles as solution;

Fig. 1. Model of migration in DGA.
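The migration of Fig. 1 can be sketched as follows, with individuals held as (fitness, genome) tuples. The policy shown (a copy of each island's best individual replaces the worst individual of a random other island) is our assumption for illustration; the paper does not spell out the exact migration rule.

import random

def migrate(islands):
    """Exchange genes among islands; requires at least two islands."""
    for i, island in enumerate(islands):
        best = max(island, key=lambda ind: ind[0])        # island's best
        j = random.choice([k for k in range(len(islands)) if k != i])
        worst_idx = min(range(len(islands[j])),
                        key=lambda k: islands[j][k][0])   # target's worst
        islands[j][worst_idx] = best                      # migrant replaces it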

2.3 WMN-PSODGA Hybrid Simulation System

In this subsection, we present the initialization, particle-pattern, fitness function, and replacement methods. The pseudo code of our implemented system is shown in Algorithm 3. Also, our implemented simulation system uses the Migration function as shown in Fig. 2. The Migration function swaps solutions among the islands, including the PSO part.

Algorithm 3. Pseudo code of WMN-PSODGA system.

Computation maxtime := T_max, t := 0;
Initial solutions := P;
Initial global solutions := G;
/* Start PSODGA */
while t < T_max do
    Subprocess(PSO);
    Subprocess(DGA);
    WaitSubprocesses();
    Evaluate(G^t, P^t);
    /* Migration() swaps solutions (see Fig. 2). */
    Migration();
    t = t + 1;
end while
Update_Solutions(G^t, P^t);
return Best found pattern of particles as solution;
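The control flow of Algorithm 3 can be summarized by the following Python sketch. The callable-based decomposition is our illustration of the loop structure only; in the actual system the PSO and DGA parts run as concurrent subprocesses that are waited on before evaluation.

def wmn_psodga_loop(pso_step, dga_step, evaluate, migrate, state, t_max):
    """One possible rendering of Algorithm 3: run the two subprocesses,
    evaluate, then let Migration() swap solutions (see Fig. 2)."""
    for _ in range(t_max):
        pso_step(state)    # Subprocess(PSO); sequential call stands in for
        dga_step(state)    # Subprocess(DGA) plus WaitSubprocesses()
        evaluate(state)    # Evaluate(G^t, P^t)
        migrate(state)     # Migration()
    return state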

Fig. 2. Model of WMN-PSODGA migration.

Initialization: We decide the velocity of particles by a random process considering the area size. For instance, when the area size is $W \times H$, the velocity is decided randomly in the range from $-\sqrt{W^2 + H^2}$ to $\sqrt{W^2 + H^2}$.

Particle-Pattern: A particle is a mesh router. A fitness value of a particle-pattern is computed by the combination of mesh router and mesh client positions. In other words, each particle-pattern is a solution, as shown in Fig. 3.


Fig. 3. Relationship among global solution, particle-patterns, and mesh routers in PSO part.

Gene Coding: A gene describes a WMN. Each individual has its own combination of mesh nodes; in other words, each individual has a fitness value. Therefore, the combination of mesh nodes is a solution.

Fitness Function: WMN-PSODGA has a fitness function to evaluate the temporary solution of the routers' placement. The fitness function is defined as:

$$\text{Fitness} = \alpha \times NCMC(x_{ij}, y_{ij}) + \beta \times SGC(x_{ij}, y_{ij}) + \gamma \times NCMCpR(x_{ij}, y_{ij}).$$

This function uses the following indicators:

• NCMC (Number of Covered Mesh Clients): the number of clients covered by the SGC's routers.
• SGC (Size of Giant Component): the maximum number of connected routers.
• NCMCpR (Number of Covered Mesh Clients per Router): the number of clients covered by each router. The NCMCpR indicator is used for load balancing.

WMN-PSODGA aims to maximize the value of the fitness function in order to optimize the placement of the routers using the above three indicators. The weight-coefficients of the fitness function are α, β, and γ for NCMC, SGC, and NCMCpR, respectively, and they are implemented such that α + β + γ = 1.

Router Replacement Methods: A mesh router has x, y positions and a velocity. Mesh routers are moved based on their velocities. There are many router replacement methods, such as:

Constriction Method (CM): CM is a method in which the PSO parameters are set to a weak stable region (ω = 0.729, C1 = C2 = 1.4955) based on the analysis of PSO by M. Clerc et al. [2,5,35].

Random Inertia Weight Method (RIWM): In RIWM, the ω parameter changes randomly from 0.5 to 1.0. C1 and C2 are kept at 2.0. The ω can be estimated by the weak stable region; the average of ω is 0.75 [28,35].
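A hedged Python sketch of the fitness computation is given below. The connectivity and coverage rules (routers link when their radio disks overlap; a client is covered when it lies within a router's radius) and the scalarization of NCMCpR (here, a penalty on the standard deviation of per-router load) are our assumptions for illustration; the paper defines only the weighted form of the function.

import math
from statistics import pstdev

def giant_component(routers, radius):
    """Largest set of router indices connected through overlapping disks."""
    n, seen, best = len(routers), set(), []
    for s in range(n):
        if s in seen:
            continue
        comp, stack = [], [s]
        while stack:
            i = stack.pop()
            if i in seen:
                continue
            seen.add(i)
            comp.append(i)
            for j in range(n):
                if j not in seen and math.dist(routers[i], routers[j]) <= 2 * radius:
                    stack.append(j)
        if len(comp) > len(best):
            best = comp
    return best

def fitness(routers, clients, radius, alpha=0.4, beta=0.4, gamma=0.2):
    """Weighted fitness; alpha + beta + gamma = 1 as in the text
    (the concrete weights here are illustrative)."""
    gc = giant_component(routers, radius)
    sgc = len(gc)
    per_router = [sum(math.dist(routers[i], c) <= radius for c in clients)
                  for i in gc]
    ncmc = sum(1 for c in clients
               if any(math.dist(routers[i], c) <= radius for i in gc))
    # Stand-in for NCMCpR: penalize uneven per-router load.
    ncmcpr = -pstdev(per_router) if len(per_router) > 1 else 0.0
    return alpha * ncmc + beta * sgc + gamma * ncmcpr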


Linearly Decreasing Inertia Weight Method (LDIWM): In LDIWM, C1 and C2 are set to 2.0, constantly. On the other hand, the ω parameter changes linearly from the unstable region (ω = 0.9) to the stable region (ω = 0.4) as the iterations of computation increase [35,36].

Linearly Decreasing Vmax Method (LDVM): In LDVM, the PSO parameters are set to the unstable region (ω = 0.9, C1 = C2 = 2.0). A value Vmax, the maximum velocity of the particles, is considered; Vmax decreases linearly as the iterations of computation increase [30,34].

Rational Decrement of Vmax Method (RDVM): In RDVM, the PSO parameters are set to the unstable region (ω = 0.9, C1 = C2 = 2.0). The Vmax keeps decreasing with the increase of iterations as

$$V_{max}(x) = \sqrt{W^2 + H^2} \times \frac{T - x}{x},$$

where W and H are the width and the height of the considered area, respectively, while T and x are the total number of iterations and the current iteration, respectively [25].
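The parameter schedules behind these methods can be sketched in a few lines of Python. The constants come from the descriptions above; the linear interpolation used for LDIWM and LDVM is our reading of "decreases linearly", since the text does not give a closed form for those two.

import math
import random

def riwm_omega():
    return random.uniform(0.5, 1.0)          # RIWM: random inertia weight

def ldiwm_omega(x, T):
    return 0.9 - (0.9 - 0.4) * x / T         # LDIWM: 0.9 -> 0.4 over T steps

def ldvm_vmax(x, T, W, H):
    return math.hypot(W, H) * (T - x) / T    # LDVM: linear decrease of Vmax

def rdvm_vmax(x, T, W, H):
    x = max(x, 1)                            # current iteration x >= 1
    return math.hypot(W, H) * (T - x) / x    # RDVM: rational decrement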

3 Simulation Results

In this section, we present the simulation results. Table 1 shows the common parameters for each simulation. Figure 4 shows the visualization results after the optimization, while Fig. 5 shows the number of covered clients by each router. Figure 6 shows the transition of the standard deviations. The standard deviation is related to load balancing.

Table 1. The common parameters for each simulation.

Parameters                      Values
Distribution of mesh clients    Exponential distribution
Number of mesh clients          48
Number of mesh routers          16
Radius of a mesh router         2.0–3.5
Number of GA islands            16
Number of migrations            200
Evolution steps                 9
Selection method                Random method
Crossover method                UNDX
Mutation method                 Uniform mutation
Crossover rate                  0.8
Mutation rate                   0.2
Area size                       32.0 × 32.0

Fig. 4. Visualization results after the optimization: (a) CM, (b) RIWM, (c) LDIWM, (d) LDVM, (e) RDVM.

Fig. 5. Number of covered clients by each router after the optimization: (a) CM, (b) RIWM, (c) LDIWM, (d) LDVM, (e) RDVM.

When the standard deviation increases, the number of mesh clients for each router tends to differ. On the other hand, when the standard deviation decreases, the number of mesh clients for each router tends to become similar. The value of r in Fig. 6 is the correlation coefficient. In Fig. 6(a), Fig. 6(d) and Fig. 6(e), the standard deviation increases as the number of updates increases. On the other hand, the standard deviation in Fig. 6(b) and Fig. 6(c) decreases as the number of updates increases. In particular, Fig. 6(b) shows better behavior than Fig. 6(c). Thus, we conclude that RIWM has better behavior than the other methods.

Fig. 6. Transition of the standard deviations (standard deviation vs. number of updates, with regression lines): (a) CM, r = 0.533856; (b) RIWM, r = -0.886207; (c) LDIWM, r = -0.361949; (d) LDVM, r = 0.115279; (e) RDVM, r = 0.54791.

4 Conclusions

In this work, we evaluated the performance of WMNs using a hybrid simulation system based on PSO and DGA (called WMN-PSODGA). We considered an Exponential distribution of mesh clients and five router replacement methods for WMN-PSODGA. The simulation results show that the RIWM and LDIWM router replacement methods have better performance than the other methods. Comparing RIWM and LDIWM, we see that RIWM has better behavior. In future work, we will consider other distributions of mesh clients.


References

1. Akyildiz, I.F., Wang, X., Wang, W.: Wireless mesh networks: a survey. Comput. Netw. 47(4), 445–487 (2005)
2. Barolli, A., Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L., Takizawa, M.: Performance evaluation of WMNs by WMN-PSOSA simulation system considering constriction and linearly decreasing Vmax methods. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 111–121. Springer (2017)
3. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: Performance analysis of simulation system based on particle swarm optimization and distributed genetic algorithm for WMNs considering different distributions of mesh clients. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 32–45. Springer (2018)
4. Barolli, A., Sakamoto, S., Ozera, K., Barolli, L., Kulla, E., Takizawa, M.: Design and implementation of a hybrid intelligent system based on particle swarm optimization and distributed genetic algorithm. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 79–93. Springer (2018)
5. Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002)
6. Franklin, A.A., Murthy, C.S.R.: Node placement algorithm for deployment of two-tier wireless mesh networks. In: Proceedings of Global Telecommunications Conference, pp. 4823–4827 (2007)
7. Girgis, M.R., Mahmoud, T.M., Abdullatif, B.A., Rabie, A.M.: Solving the wireless mesh network design problem using genetic algorithm and simulated annealing optimization methods. Int. J. Comput. Appl. 96(11), 1–10 (2014)
8. Goto, K., Sasaki, Y., Hara, T., Nishio, S.: Data gathering using mobile agents for reducing traffic in dense mobile wireless sensor networks. Mob. Inf. Syst. 9(4), 295–314 (2013)
9. Inaba, T., Elmazi, D., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: A secure-aware call admission control scheme for wireless cellular networks using fuzzy logic and its performance evaluation. J. Mob. Multimed. 11(3&4), 213–222 (2015)
10. Inaba, T., Obukata, R., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of a QoS-aware fuzzy-based CAC for LAN access. Int. J. Space-Based Situat. Comput. 6(4), 228–238 (2016)
11. Inaba, T., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: A testbed for admission control in WLAN: a fuzzy approach and its performance evaluation. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 559–571. Springer (2016)
12. Lim, A., Rodrigues, B., Wang, F., Xu, Z.: k-Center problems with minimum coverage. Theoret. Comput. Sci. 332(1–3), 1–17 (2005)
13. Maolin, T., et al.: Gateways placement in backbone wireless mesh networks. Int. J. Commun. Netw. Syst. Sci. 2(1), 44–50 (2009)
14. Matsuo, K., Sakamoto, S., Oda, T., Barolli, A., Ikeda, M., Barolli, L.: Performance analysis of WMNs by WMN-GA simulation system for two WMN architectures and different TCP congestion-avoidance algorithms and client distributions. Int. J. Commun. Netw. Distrib. Syst. 20(3), 335–351 (2018)
15. Muthaiah, S.N., Rosenberg, C.P.: Single gateway placement in wireless mesh networks. In: Proceedings of 8th International IEEE Symposium on Computer Networks, pp. 4754–4759 (2008)


16. Naka, S., Genji, T., Yura, T., Fukuyama, Y.: A hybrid particle swarm optimization for distribution state estimation. IEEE Trans. Power Syst. 18(1), 60–68 (2003)
17. Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization. Swarm Intell. 1(1), 33–57 (2007)
18. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: A comparison study of simulated annealing and genetic algorithm for node placement problem in wireless mesh networks. J. Mob. Multimed. 9(1–2), 101–110 (2013)
19. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: A comparison study of hill climbing, simulated annealing and genetic algorithm for node placement problem in WMNs. J. High Speed Netw. 20(1), 55–66 (2014)
20. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: A simulation system for WMN based on SA: performance evaluation for different instances and starting temperature values. Int. J. Space-Based Situat. Comput. 4(3–4), 209–216 (2014)
21. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Performance evaluation considering iterations per phase and SA temperature in WMN-SA system. Mob. Inf. Syst. 10(3), 321–330 (2014)
22. Sakamoto, S., Lala, A., Oda, T., Kolici, V., Barolli, L., Xhafa, F.: Application of WMN-SA simulation system for node placement in wireless mesh networks: a case study for a realistic scenario. Int. J. Mob. Comput. Multimed. Commun. (IJMCMC) 6(2), 13–21 (2014)
23. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: An integrated simulation system considering WMN-PSO simulation system and network simulator 3. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 187–198. Springer (2016)
24. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation and evaluation of a simulation system based on particle swarm optimisation for node placement problem in wireless mesh networks. Int. J. Commun. Netw. Distrib. Syst. 17(1), 1–13 (2016)
25. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation of a new replacement method in WMN-PSO simulation system and its performance evaluation. In: The 30th IEEE International Conference on Advanced Information Networking and Applications (AINA 2016), pp. 206–211 (2016)
26. Sakamoto, S., Obukata, R., Oda, T., Barolli, L., Ikeda, M., Barolli, A.: Performance analysis of two wireless mesh network architectures by WMN-SA and WMN-TS simulation systems. J. High Speed Netw. 23(4), 311–322 (2017)
27. Sakamoto, S., Ozera, K., Barolli, A., Ikeda, M., Barolli, L., Takizawa, M.: Implementation of an intelligent hybrid simulation systems for WMNs based on particle swarm optimization and simulated annealing: performance evaluation for different replacement methods. Soft. Comput. 23(9), 3029–3035 (2017)
28. Sakamoto, S., Ozera, K., Barolli, A., Ikeda, M., Barolli, L., Takizawa, M.: Performance evaluation of WMNs by WMN-PSOSA simulation system considering random inertia weight method and linearly decreasing Vmax method. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 114–124. Springer (2017)
29. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Implementation of intelligent hybrid systems for node placement problem in WMNs considering particle swarm optimization, hill climbing and simulated annealing. Mob. Netw. Appl. 23(1), 27–33 (2017)


30. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Performance evaluation of WMNs by WMN-PSOSA simulation system considering constriction and linearly decreasing inertia weight methods. In: International Conference on Network-Based Information Systems, pp. 3–13. Springer (2017)
31. Sakamoto, S., Ozera, K., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of intelligent hybrid systems for node placement in wireless mesh networks: a comparison study of WMN-PSOHC and WMN-PSOSA. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 16–26. Springer (2017)
32. Sakamoto, S., Ozera, K., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of WMN-PSOHC and WMN-PSO simulation systems for node placement in wireless mesh networks: a comparison study. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 64–74. Springer (2017)
33. Sakamoto, S., Ozera, K., Barolli, A., Barolli, L., Kolici, V., Takizawa, M.: Performance evaluation of WMN-PSOSA considering four different replacement methods. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 51–64. Springer (2018)
34. Schutte, J.F., Groenwold, A.A.: A study of global optimization using particle swarms. J. Global Optim. 31(1), 93–108 (2005)
35. Shi, Y.: Particle swarm optimization. IEEE Connections 2(1), 8–13 (2004)
36. Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. In: Evolutionary Programming VII, pp. 591–600 (1998)
37. Vanhatupa, T., Hannikainen, M., Hamalainen, T.: Genetic algorithm to optimize node placement and configuration for WLAN planning. In: Proceedings of the 4th IEEE International Symposium on Wireless Communication Systems, pp. 612–616 (2007)
38. Wang, J., Xie, B., Cai, K., Agrawal, D.P.: Efficient mesh router placement in wireless mesh networks. In: Proceedings of IEEE International Conference on Mobile Adhoc and Sensor Systems (MASS 2007), pp. 1–9 (2007)

An Integrated Fuzzy-Based Simulation System for Driving Risk Management in VANETs Considering Road Condition as a New Parameter

Kevin Bylykbashi1(B), Ermioni Qafzezi1, Phudit Ampririt1, Keita Matsuo2, Leonard Barolli2, and Makoto Takizawa3

1 Graduate School of Engineering, Fukuoka Institute of Technology (FIT), 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
[email protected], [email protected], [email protected]
2 Department of Information and Communication Engineering, Fukuoka Institute of Technology (FIT), 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
{kt-matsuo,barolli}@fit.ac.jp
3 Department of Advanced Sciences, Faculty of Science and Engineering, Hosei University, 3-7-2, Kajino-machi, Koganei-shi, Tokyo 184-8584, Japan
[email protected]

Abstract. In this paper, we propose a new Fuzzy-based Simulation System for Driving Risk Management in Vehicular Ad hoc Networks (VANETs). The proposed system considers Driver's Health Condition (DHC), Vehicle's Environment Condition (VEC), Weather Condition (WC), Road Condition (RC) and Vehicle Speed (VS) to assess the risk level. The proposed system is composed of two Fuzzy Logic Controllers (FLCs): FLC1 and FLC2. FLC1 has WC, RC and VS as inputs, and its output, together with VEC and DHC, serves as the input of FLC2. The input parameters' data can come from different sources, such as on-board and on-road sensors and cameras, sensors and cameras in the infrastructure, and the communications between vehicles. Based on the system's output, i.e., the driving risk level, a smart box informs the driver of a potential risk/danger and provides assistance. We show through simulations the effect of the considered parameters on the determination of the driving risk and demonstrate a few actions that can be performed accordingly.

1 Introduction

Traffic accidents, road congestion and environmental pollution are persistent problems faced by both developed and developing countries, and they have made people live in difficult situations. Among these, traffic incidents are the most serious ones because they result in huge losses of life and property. For decades, we have seen governments and car manufacturers struggle for safer roads and car accident prevention. The development in wireless communications has allowed companies, researchers and institutions to design communication systems that provide new solutions for these issues. Therefore, new types of networks, such as Vehicular Ad hoc Networks (VANETs), have been created. A VANET consists of a network of vehicles, in which vehicles are capable of communicating among themselves in order to deliver valuable information such as safety warnings and traffic information.

Nowadays, every car is likely to be equipped with various forms of smart sensors, wireless communication modules, storage and computational resources. The sensors gather information about the road and environment conditions and share it with neighboring vehicles and adjacent roadside units (RSUs) via vehicle-to-vehicle (V2V) or vehicle-to-infrastructure (V2I) communication. However, the difficulty lies in how to understand the sensed data and how to make intelligent decisions based on the provided information. As a result, various intelligent computational technologies and systems, such as fuzzy logic, machine learning, neural networks, adaptive computing and others, are being or have already been deployed by many car manufacturers [6]. They focus on these auxiliary technologies to launch and fully support driverless vehicles.

Fully autonomous vehicles still have a long way to go, but driving support technologies are becoming widespread, even in everyday cars. The goal is to improve both driving safety and performance by relying on the measurement and recognition of the outside environment and their reflection on the driving operation. On the other hand, we focus not only on the outside information but also on the in-car information and the driver's health information to detect a potential accident or a risky situation, and alert the driver about the danger, or take over the steering if necessary. We aim to realize a new intelligent driver support system which can provide an output in real-time by combining information from many sources.

In this work, we implement a fuzzy-based simulation system for driving risk management considering different types of parameters that have an impact on driving safety and performance. The considered parameters are the driver's health condition, the vehicle's environment condition, the weather condition and the vehicle speed, in addition to the road condition, which is a new parameter we have included in this work. The model of our proposed system is given in Fig. 1. Based on the output parameter value, it can be decided whether an action is needed, and if so, which is the appropriate task to be performed in order to provide better driving support.

The structure of the paper is as follows. Section 2 presents a brief overview of VANETs. Section 3 describes the proposed fuzzy-based simulation system and its implementation. Section 4 discusses the simulation results. Finally, conclusions and future work are given in Sect. 5.

Fig. 1. Proposed system architecture (inputs: Vehicle Speed, Weather Condition, Road Condition, Vehicle's Environment Condition and Driver's Health Condition; output connected to an actor device).

2 Overview of VANETs

VANETs are a special case of Mobile Ad hoc Networks (MANETs) in which the mobile nodes are vehicles. In VANETs, nodes (vehicles) have high mobility and tend to follow organized routes instead of moving at random. Moreover, vehicles offer attractive features such as higher computational capability and localization through GPS. VANETs have huge potential to enable applications ranging from road safety, traffic optimization, infotainment and commercial applications to rural and disaster scenario connectivity. Among these, road safety and traffic optimization are considered the most important ones, as they have the goal to reduce the dramatically high number of accidents, guarantee road safety, improve traffic management and create new forms of inter-vehicle communication in Intelligent Transportation Systems (ITS). The ITS manages the vehicle traffic, supports drivers with safety and other information, and provides services such as automated toll collection and driver assist systems [7].

Despite the attractive features, VANETs are characterized by very large and dynamic topologies, variable-capacity wireless links, bandwidth and hard delay constraints, and by short contact durations which are caused by the high mobility, high speed and low density of vehicles. In addition, limited transmission ranges, physical obstacles and interference make these networks characterized by disruptive and intermittent connectivity. To make VANET applications possible, it is necessary to design proper networking mechanisms that can overcome the relevant problems that arise from vehicular environments.

3 Proposed Fuzzy-Based Simulation System

Although the developments in autonomous vehicle design indicate that this type of technology is not that far away from deployment, the current advances fall only into Level 2 of the Society of Automotive Engineers (SAE) levels [13]. However, the automotive industry is very competitive and there might be many other new advances in autonomous vehicle design that are not launched yet. Thus, it is only a matter of time before driverless cars are on the road. On the other hand, many people will still be driving even in the era of autonomous cars. The high cost of driverless cars, lack of trust and not wanting to give up driving might be among the reasons why those people will continue to drive their cars. Hence, many researchers and automotive engineers keep working on Advanced Driver Assistance Systems (ADASs) as a primary safety feature, required in order to achieve full marks in safety.

ADASs are intelligent systems that reside inside the vehicle and help the driver in a variety of ways. These systems rely on a comprehensive sensing network and artificial intelligence techniques, and have made it possible to commence the era of connected cars. They can invoke action to maintain the driver's attention in both manual and autonomous driving. While the sensors are used to gather data regarding the inside/outside environment, the vehicle's technical status, the driving performance and the driver's condition, the intelligent system's task is to make decisions based on these data. If the vehicle measurements are combined with those of the surrounding vehicles and infrastructure, a better environment perception can be achieved. In addition, with different intelligent systems located at these vehicles as well as at geographically distributed servers, more efficient decisions can be attained.

Our research work focuses on developing an intelligent, non-complex driving support system which determines the driving risk level in real-time by considering different types of parameters. In previous works, we have considered different parameters, including in-car environment parameters such as the ambient temperature and noise, and the driver's vital signs, i.e., heart and respiratory rate, for which we implemented a testbed and conducted experiments in a real scenario [2,4]. The considered parameters include environmental factors and the driver's health condition, which can affect the driver's capability and the vehicle's performance. In [3], we included the vehicle speed in our intelligent system for its crucial impact on the determination of the risk level, and in [5] the weather condition was added. In this work, we consider the road condition as a new parameter. Although the weather affects the roads, it is essential to consider the road condition as a separate parameter because a change in the weather condition does not necessarily mean the road condition will change too. For example, after heavy snow or rain, even if the weather gets better, the roads might still be slippery for a while, or flooding may occur, which, in turn, could damage the roads. Moreover, the roads may have potholes or be bumpy, and such poor conditions are not related to the current weather condition.


We use fuzzy logic to implement the proposed system as it can make a real-time decision based on the uncertainty and vagueness of the provided information [1,8–12,14,15].

Fig. 2. Proposed system structure.

The proposed system, called Fuzzy-based Simulation System for Driving Risk Management (FSSDRM), is shown in Fig. 2. For the implementation of FSSDRM, we consider the following parameters: Weather Condition (WC), Road Condition (RC), Vehicle Speed (VS), Vehicle's Environment Condition (VEC) and Driver's Health Condition (DHC) to determine the Driving Risk Management (DRM). We implement FSSDRM using two Fuzzy Logic Controllers (FLCs) because having five input parameters in a single FLC results in a complex Fuzzy Rule Base (FRB) composed of many rules, which affects the overall complexity of the system. FLC1 makes use of WC, RC and VS to produce an output parameter, which then serves as one of the three input parameters of FLC2. We call this parameter Weather-Road-Speed (WRS) only for illustration purposes, which allows us to better explain the results. The other two input parameters of FLC2 are VEC and DHC, and the output is DRM, which is the final output of our proposed system.

The term sets of the linguistic parameters of FSSDRM are defined respectively as:

T(WC) = {Very Bad (VB), Bad (B), Good (G)};
T(RC) = {Very Bad (VBa), Bad (Ba), Good (Go)};
T(VS) = {Slow (Sl), Moderate (Mo), Fast (Fa)};
T(WRS) = {No/Minor Danger (N/MD), Moderate Danger (MD), Considerable Danger (CD), High Danger (HD), Very High Danger (VHD)};
T(VEC) = {Very Uncomfortable (VUnC), Uncomfortable (UnC), Comfortable (C)};
T(DHC) = {Very Bad (VB), Bad (B), Good (G)};
T(DRM) = {Safe (Sf), Low (Lw), Moderate (Md), High (Hg), Very High (VH), Severe (Sv), Danger (D)}.


Fig. 3. Membership functions.

Based on the linguistic description of the input and output parameters, we make the Fuzzy Rule Bases (FRBs) of the two FLCs. The FRB forms a fuzzy set of dimensions |T(x1)| × |T(x2)| × · · · × |T(xn)|, where |T(xi)| is the number of terms in T(xi) and n is the number of FLC input parameters. FLC1 has three input parameters with three linguistic terms each; therefore, there are 27 rules in FRB1, which is shown in Table 1. The FRB of FLC2 is shown in Table 2. Since FLC2 has three input parameters, two of which have three linguistic terms each and one of which has five, it consists of 45 rules. The control rules of the FRB have the form: IF "conditions" THEN "control action". The membership functions used for fuzzification and defuzzification are given in Fig. 3.


Table 1. FRB of FLC1.

No  WC  RC   VS  WRS
1   VB  VBa  Sl  CD
2   VB  VBa  Mo  VHD
3   VB  VBa  Fa  VHD
4   VB  Ba   Sl  CD
5   VB  Ba   Mo  CD
6   VB  Ba   Fa  VHD
7   VB  Go   Sl  MD
8   VB  Go   Mo  CD
9   VB  Go   Fa  HD
10  B   VBa  Sl  MD
11  B   VBa  Mo  HD
12  B   VBa  Fa  VHD
13  B   Ba   Sl  MD
14  B   Ba   Mo  CD
15  B   Ba   Fa  HD
16  B   Go   Sl  N/MD
17  B   Go   Mo  MD
18  B   Go   Fa  CD
19  G   VBa  Sl  N/MD
20  G   VBa  Mo  MD
21  G   VBa  Fa  HD
22  G   Ba   Sl  N/MD
23  G   Ba   Mo  MD
24  G   Ba   Fa  CD
25  G   Go   Sl  N/MD
26  G   Go   Mo  N/MD
27  G   Go   Fa  MD

Table 2. FRB of FLC2.

No  WRS   VEC   DHC  DRM
1   N/MD  VUnC  VB   VH
2   N/MD  VUnC  B    Hg
3   N/MD  VUnC  G    Lw
4   N/MD  UnC   VB   Hg
5   N/MD  UnC   B    Md
6   N/MD  UnC   G    Sf
7   N/MD  C     VB   Md
8   N/MD  C     B    Lw
9   N/MD  C     G    Sf
10  MD    VUnC  VB   Sv
11  MD    VUnC  B    Hg
12  MD    VUnC  G    Md
13  MD    UnC   VB   VH
14  MD    UnC   B    Md
15  MD    UnC   G    Lw
16  MD    C     VB   Hg
17  MD    C     B    Lw
18  MD    C     G    Sf
19  CD    VUnC  VB   D
20  CD    VUnC  B    Sv
21  CD    VUnC  G    VH
22  CD    UnC   VB   Sv
23  CD    UnC   B    VH
24  CD    UnC   G    Hg
25  CD    C     VB   VH
26  CD    C     B    Hg
27  CD    C     G    Md
28  HD    VUnC  VB   D
29  HD    VUnC  B    D
30  HD    VUnC  G    VH
31  HD    UnC   VB   D
32  HD    UnC   B    Sv
33  HD    UnC   G    VH
34  HD    C     VB   Sv
35  HD    C     B    VH
36  HD    C     G    Hg
37  VHD   VUnC  VB   D
38  VHD   VUnC  B    D
39  VHD   VUnC  G    D
40  VHD   UnC   VB   D
41  VHD   UnC   B    D
42  VHD   UnC   G    Sv
43  VHD   C     VB   D
44  VHD   C     B    Sv
45  VHD   C     G    VH

We use triangular and trapezoidal membership functions because they are suitable for real-time operation.
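To make the two-stage inference concrete, here is a toy Python sketch of min-rule firing and a simplified weighted-average defuzzification. The membership breakpoints, the tiny three-rule subset and the output term centers are all assumptions for illustration, not the paper's tuned values; the actual FRBs are in Tables 1 and 2 and the actual membership functions in Fig. 3.

def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzify3(x):
    """Fuzzify a [0, 1] input into three terms (e.g. Very Bad / Bad / Good)."""
    return {"low": tri(x, -0.5, 0.0, 0.5),
            "mid": tri(x, 0.0, 0.5, 1.0),
            "high": tri(x, 0.5, 1.0, 1.5)}

def flc(inputs, rules, centers):
    """inputs: list of term->degree dicts; rules: tuple of terms -> output term."""
    strength = {}
    for terms, out in rules.items():
        w = min(m[t] for m, t in zip(inputs, terms))   # rule AND = min
        strength[out] = max(strength.get(out, 0.0), w) # aggregate = max
    num = sum(w * centers[o] for o, w in strength.items())
    den = sum(strength.values())
    return num / den if den else 0.0

# Illustrative three-rule subset in the spirit of FRB1 (WC, RC, VS -> WRS).
rules1 = {("low", "low", "high"): "VHD",
          ("mid", "mid", "mid"): "CD",
          ("high", "high", "low"): "N/MD"}
centers1 = {"N/MD": 0.1, "MD": 0.3, "CD": 0.5, "HD": 0.7, "VHD": 0.9}

wrs = flc([fuzzify3(0.2), fuzzify3(0.4), fuzzify3(0.8)], rules1, centers1)

The same flc() function is then reapplied with the FRB2 rules, taking the crisp wrs value as its first input together with VEC and DHC, which is exactly the cascade of Fig. 2.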

4 Simulation Results

In this section, we present the simulation results for our proposed system. The simulation results are presented in Fig. 4 and Fig. 5.

Figure 4 shows the relation between WRS and VS for different WC and RC values. For both WC and RC, we consider the values 0.1, 0.5 and 0.9, which simulate a very bad, a bad and a good weather/road condition, respectively. From Fig. 4(a), we can see that most of the WRS values indicate a potential danger. If both the weather and the road condition are very bad, the scenario with the lowest degree of danger is the one in which the vehicle is moving slowly, and yet it is decided as a situation with a Considerable Danger. When RC is increased (implying better road conditions), we can see that the WRS values decrease. The WRS values decrease even more when the weather gets better, and if both the road and the weather are good, many situations are decided with No or Minor Danger. Driving at a high speed is dangerous in itself, regardless of the weather or road condition; thus, these situations are never decided with a "No or Minor Danger" on any occasion.

Fig. 4. Simulation results for FLC1 (WRS vs. VS for RC = 0.1, 0.5 and 0.9): (a) WC = 0.1, (b) WC = 0.5, (c) WC = 0.9.

Figure 5 shows the relation between DRM and DHC for different WRS and VEC values. We consider all the possible danger levels of WRS induced by the different WC, RC and VS values. To be specific, we consider the values 0.1, 0.3, 0.5, 0.7 and 0.9, which indicate scenarios with No or Minor, Moderate, Considerable, High and Very High Danger, respectively. For VEC, we consider a very uncomfortable, an uncomfortable and a comfortable environment, represented by the values 0.1, 0.5 and 0.9.

In Fig. 5(a), we consider the WRS value 0.1 and change VEC from 0.1 to 0.9. The DRM is decided as "Safe" when the driver's health condition parameter indicates a very good status of his/her health and he/she is not driving in a very uncomfortable vehicle. Regarding the outside environment, he/she is driving under a good weather condition, on roads with good conditions, and slowly or at a moderate speed. If DHC shows a poor health condition and the inside environment is not comfortable, we can see that the driving risk values are increased compared with the aforementioned scenario.

Fig. 5. Simulation results for FLC2.

By comparing the DRM values for all the considered scenarios, we can see the significant impact of WRS on the determination of the driving risk. When WRS is more than 0.5, the driver cannot manage to drive without a high risk even if he/she is in a good health condition and feels comfortable inside the vehicle. These situations are determined as High, Very High, Severe or Danger regardless of the DHC and VEC values. This is due to the fact that there are other external or internal factors which can affect the driving and vehicle performance, such as a very low visibility caused by bad weather, dangerous roads on which the driver cannot fully control the vehicle, or a high speed combined with these factors, which makes the effect even worse.

In the cases when the risk level is above the moderate level for a number of consecutive decided DRM values, the system can perform a certain action. For example, when the DRM value is slightly above the moderate level, the system may take an action to lift the driver's mood, and when the DRM value is very high, the system could even decide to limit the vehicle's operating speed to a speed at which the risk level decreases significantly.

5 Conclusions

In this paper, we proposed a fuzzy-based system for driving risk management. We considered five parameters: road condition, weather condition, vehicle speed, vehicle's environment condition and driver's health condition. We showed through simulations the effect of the considered parameters on the determination of the driving risk level. In addition, we demonstrated a few actions that can be performed based on the output of our system. However, it may occur that the system provides an output which indicates a low risk when the chances for an accident are actually high, or the opposite scenario, in which the system's output implies a false alarm. Therefore, we intend to implement the system in a testbed and estimate the system performance by looking into correct detections and false positives/negatives to determine its accuracy.

References

1. Bylykbashi, K., Elmazi, D., Matsuo, K., Ikeda, M., Barolli, L.: Effect of security and trustworthiness for a fuzzy cluster management system in VANETs. Cogn. Syst. Res. 55, 153–163 (2019). https://doi.org/10.1016/j.cogsys.2019.01.008
2. Bylykbashi, K., Elmazi, D., Matsuo, K., Ikeda, M., Barolli, L.: Implementation of a fuzzy-based simulation system and a testbed for improving driving conditions in VANETs. In: International Conference on Complex, Intelligent, and Software Intensive Systems, pp. 3–12. Springer (2019). https://doi.org/10.1007/978-3-030-22354-0_1
3. Bylykbashi, K., Qafzezi, E., Ikeda, M., Matsuo, K., Barolli, L.: A fuzzy-based system for driving risk measurement (FSDRM) in VANETs: a comparison study of simulation and experimental results. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 14–25. Springer (2019)
4. Bylykbashi, K., Qafzezi, E., Ikeda, M., Matsuo, K., Barolli, L.: Fuzzy-based driver monitoring system (FDMS): implementation of two intelligent FDMSs and a testbed for safe driving in VANETs. Future Gener. Comput. Syst. 105, 665–674 (2020). https://doi.org/10.1016/j.future.2019.12.030


5. Bylykbashi, K., Qafzezi, E., Ikeda, M., Matsuo, K., Barolli, L., Takizawa, M.: A fuzzy-based simulation system for driving risk management in VANETs considering weather condition as a new parameter. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 23–32. Springer (2020)
6. Gusikhin, O., Filev, D., Rychtyckyj, N.: Intelligent vehicle systems: applications and new trends. In: Informatics in Control Automation and Robotics, pp. 3–14. Springer (2008). https://doi.org/10.1007/978-3-540-79142-3_1
7. Hartenstein, H., Laberteaux, L.: A tutorial survey on vehicular ad hoc networks. IEEE Commun. Mag. 46(6), 164–171 (2008)
8. Kandel, A.: Fuzzy Expert Systems. CRC Press, Boca Raton (1991)
9. Klir, G.J., Folger, T.A.: Fuzzy Sets, Uncertainty, and Information. Prentice Hall Inc., Upper Saddle River (1987)
10. McNeill, F.M., Thro, E.: Fuzzy Logic: A Practical Approach. Academic Press, Cambridge (1994)
11. Munakata, T., Jani, Y.: Fuzzy systems: an overview. Commun. ACM 37(3), 69–77 (1994). https://doi.org/10.1145/175247.175254
12. Qafzezi, E., Bylykbashi, K., Ikeda, M., Matsuo, K., Barolli, L.: Coordination and management of cloud, fog and edge resources in SDN-VANETs using fuzzy logic: a comparison study for two fuzzy-based systems. Internet Things 11, 100169 (2020)
13. SAE On-Road Automated Driving (ORAD) Committee: Taxonomy and definitions for terms related to driving automation systems for on-road motor vehicles. Technical report, Society of Automotive Engineers (SAE) (2018). https://doi.org/10.4271/J3016_201806
14. Zadeh, L.A., Kacprzyk, J.: Fuzzy Logic for the Management of Uncertainty. Wiley, New York (1992)
15. Zimmermann, H.J.: Fuzzy Set Theory and its Applications. Springer Science & Business Media, New York (1996). https://doi.org/10.1007/978-94-015-8702-0

Performance Evaluation of RIWM and RDVM Router Replacement Methods for WMNs by WMN-PSOHC Hybrid Intelligent System

Shinji Sakamoto1(B), Admir Barolli2, Phudit Ampririt3, Leonard Barolli4, and Shusuke Okamoto1

1 Department of Computer and Information Science, Seikei University, 3-3-1 Kichijoji-Kitamachi, Musashino-shi, Tokyo 180-8633, Japan
[email protected], [email protected]
2 Department of Information Technology, Aleksander Moisiu University of Durres, L.1, Rruga e Currilave, Durres, Albania
[email protected]
3 Graduate School of Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-ku, Fukuoka 811-0295, Japan
[email protected]
4 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-ku, Fukuoka 811-0295, Japan
[email protected]

Abstract. With the rapid development of wireless technologies, Wireless Mesh Networks (WMNs) are becoming an important networking infrastructure due to their low cost and high-speed wireless Internet connectivity. However, WMNs have some problems, such as the node placement problem, security, transmission power and so on. In this work, we deal with the node placement problem. In our previous work, we implemented a hybrid simulation system based on Particle Swarm Optimization (PSO) and Hill Climbing (HC), called WMN-PSOHC, for solving the node placement problem in WMNs. In this paper, we evaluate the performance of two mesh router replacement methods, the Random Inertia Weight Method (RIWM) and the Rational Decrement of Vmax Method (RDVM), by the WMN-PSOHC hybrid intelligent simulation system. Simulation results show that a better performance is achieved for RIWM compared with RDVM.

1 Introduction

The wireless networks and devices are becoming increasingly popular and they provide users access to information and communication anytime and anywhere [1,3–5,8–11,13,14,16,18,19,24,29]. Wireless Mesh Networks (WMNs) are gaining a lot of attention because of their low-cost nature that makes them attractive for providing wireless Internet connectivity. A WMN is dynamically self-organized and self-configured, with the nodes in the network automatically establishing and maintaining mesh connectivity among themselves (creating, in effect, an ad hoc network). This feature brings many advantages to WMNs, such as low up-front cost, easy network maintenance, robustness and reliable service coverage [2]. Moreover, such an infrastructure can be used to deploy community networks, metropolitan area networks, municipal and corporative networks, and to support applications for urban areas, medical, transport and surveillance systems.

In this work, we deal with the node placement problem in WMNs. We consider the version of the mesh router nodes placement problem in which we are given a grid area where to deploy a number of mesh router nodes and a number of mesh client nodes of fixed positions (of an arbitrary distribution) in the grid area. The objective is to find a location assignment for the mesh routers to the cells of the grid area that maximizes the network connectivity and client coverage. Network connectivity is measured by the Size of Giant Component (SGC) of the resulting WMN graph, while the user coverage is simply the number of mesh client nodes that fall within the radio coverage of at least one mesh router node and is measured by the Number of Covered Mesh Clients (NCMC). Node placement problems are known to be computationally hard to solve [12,33]. In some previous works, intelligent algorithms have been investigated [7,15,17,26,27,35].

We already implemented a Particle Swarm Optimization (PSO) based simulation system, called WMN-PSO [22]. Also, we implemented a simulation system based on Hill Climbing (HC) for solving the node placement problem in WMNs, called WMN-HC [21]. In our previous work [22,25], we presented a hybrid intelligent simulation system based on PSO and HC, which we call WMN-PSOHC. In this paper, we analyze the performance of the Random Inertia Weight Method (RIWM) and the Rational Decrement of Vmax Method (RDVM) by the WMN-PSOHC simulation system.

The rest of the paper is organized as follows. We present our designed and implemented hybrid simulation system in Sect. 2. In Sect. 3, we introduce the WMN-PSOHC Web GUI tool. The simulation results are given in Sect. 4. Finally, we give conclusions and future work in Sect. 5.

2 Proposed and Implemented Simulation System

2.1 Particle Swarm Optimization

In the Particle Swarm Optimization (PSO) algorithm, a number of simple entities (the particles) are placed in the search space of some problem or function, and each evaluates the objective function at its current location. The objective function is often minimized and the exploration of the search space is not through evolution [20]. However, following a widespread practice of borrowing from the evolutionary computation field, in this work we consider the bi-objective function and the fitness function interchangeably. Each particle then determines its movement through the search space by combining some aspect of the history of its own current and best (best-fitness) locations with those of one or more members of the swarm, with some random perturbations. The next iteration takes place after all particles have been moved. Eventually the swarm as a whole, like a flock of birds collectively foraging for food, is likely to move close to an optimum of the fitness function.

Each individual in the particle swarm is composed of three D-dimensional vectors, where D is the dimensionality of the search space. These are the current position xi, the previous best position pi and the velocity vi. The particle swarm is more than just a collection of particles. A particle by itself has almost no power to solve any problem; progress occurs only when the particles interact. Problem solving is a population-wide phenomenon, emerging from the individual behaviors of the particles through their interactions. In any case, populations are organized according to some sort of communication structure or topology, often thought of as a social network. The topology typically consists of bidirectional edges connecting pairs of particles, so that if j is in i's neighborhood, i is also in j's. Each particle communicates with some other particles and is affected by the best point found by any member of its topological neighborhood. This is just the vector pi for that best neighbor, which we will denote with pg. The potential kinds of population "social networks" are hugely varied, but in practice certain types have been used more frequently. In the PSO process, the velocity of each particle is iteratively adjusted so that the particle stochastically oscillates around the pi and pg locations.

2.2 Hill Climbing

Hill Climbing (HC) is a heuristic algorithm, and its idea is simple. In HC, the solution s′ is accepted as the new current solution if δ ≤ 0 holds, where δ = f(s′) − f(s). Here, the function f is called the fitness function. The fitness function gives points to a solution so that the system can evaluate the next solution s′ and the current solution s. The most important factor in HC is to define the neighbor solution effectively, because the definition of the neighbor solution directly affects HC performance. In our WMN-PSOHC system, we use the next step of the particle-pattern positions as the neighbor solutions for the HC part.
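The acceptance rule can be sketched in a few lines of Python. This is a minimal sketch under the text's convention that δ ≤ 0 means improvement; the neighbor() callable stands in for the "next particle-pattern position" used by the WMN-PSOHC system.

def hill_climbing(s, f, neighbor, iterations):
    """Keep the current solution s, accepting a neighbor s' when
    delta = f(s') - f(s) <= 0 (i.e., it is at least as good)."""
    for _ in range(iterations):
        s_new = neighbor(s)
        delta = f(s_new) - f(s)
        if delta <= 0:          # accept s' as the new current solution
            s = s_new
    return s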

2.3 WMN-PSOHC System Description

In the following, we present the initialization, particle-pattern, fitness function and router replacement methods.


Initialization: Our proposed system starts by generating an initial solution randomly, by ad hoc methods [34]. We decide the velocity of particles by a random process considering the area size. For instance, when the area size is $W \times H$, the velocity is decided randomly in the range from $-\sqrt{W^2 + H^2}$ to $\sqrt{W^2 + H^2}$.

Particle-Pattern: A particle is a mesh router. A fitness value of a particle-pattern is computed by the combination of mesh router and mesh client positions. In other words, each particle-pattern is a solution, as shown in Fig. 1. Therefore, the number of particle-patterns is the number of solutions.

Fitness Function: One of the most important things is the determination of an appropriate objective function and its encoding. In our case, each particle-pattern has its own fitness value and compares it with the fitness values of the other particle-patterns in order to share information about the global solution. The fitness function follows a hierarchical approach in which the main objective is to maximize the SGC in the WMN. Thus, we use the α and β weight-coefficients for the fitness function, and the fitness function of this scenario is defined as:

$$\text{Fitness} = \alpha \times SGC(x_{ij}, y_{ij}) + \beta \times NCMC(x_{ij}, y_{ij}).$$

Fig. 1. Relationship among global solution, particle-patterns and mesh routers.
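As a one-function sketch, the bi-objective fitness above can be written as follows, with the α = 0.7, β = 0.3 weighting used later in Table 1; sgc() and ncmc() stand for the giant-component and client-coverage computations and are placeholders of ours.

def fitness(routers, clients, sgc, ncmc, alpha=0.7, beta=0.3):
    """Weighted fitness of one particle-pattern (higher is better);
    the SGC term dominates, reflecting the hierarchical approach."""
    return alpha * sgc(routers) + beta * ncmc(routers, clients)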

Router Replacement Methods: A mesh router has x, y positions and a velocity. Mesh routers are moved based on their velocities. There are many router replacement methods in the PSO field [6,30–32]. In this paper, we consider RIWM and RDVM.

Random Inertia Weight Method (RIWM): In RIWM, the ω parameter changes randomly from 0.5 to 1.0. C1 and C2 are kept at 2.0. The ω can be estimated by the weak stable region; the average of ω is 0.75 [28,31].

Rational Decrement of Vmax Method (RDVM): In RDVM, the PSO parameters are set to the unstable region (ω = 0.9, C1 = C2 = 2.0). The Vmax keeps decreasing with the increase of iterations as

$$V_{max}(x) = \sqrt{W^2 + H^2} \times \frac{T - x}{x},$$

where W and H are the width and the height of the considered area, respectively, while T and x are the total number of iterations and the current iteration, respectively [23].

WMN-PSOHC Web GUI Tool

The Web application follows a standard Client-Server architecture and is implemented using LAMP (Linux + Apache + MySQL + PHP) technology (see Fig. 2). We show the WMN-PSOHC Web GUI tool in Fig. 3. Remote users (clients) submit their requests by completing first the parameter setting. The parameter values to be provided by the user are classified into three groups, as follows.

Fig. 2. System structure for web interface.

• Parameters related to the problem instance: These include parameter values that determine a problem instance to be solved and consist of number of router nodes, number of mesh client nodes, client mesh distribution, radio coverage interval and size of the deployment area. • Parameters of the resolution method: Each method has its own parameters. • Execution parameters: These parameters are used for stopping condition of the resolution methods and include number of iterations and number of independent runs. The former is provided as a total number of iterations and depending on the method is also divided per phase (e.g., number of iterations in a exploration). The later is used to run the same configuration for the same problem instance and parameter configuration a certain number of times.

Performance Evaluation of Mesh Router Replacement Methods

31

Fig. 3. WMN-PSOHC Web GUI tool.

4

Simulation Results

In this section, we show simulation results using WMN-PSOHC system. In this work, we consider Normal distribution of mesh clients. The number of mesh routers is considered 16 and the number of mesh clients 48. We consider the number of particle-patterns 9. We conducted simulations 100 times, in order to avoid the effect of randomness and create a general view of results. The total number of iterations is considered 800 and the iterations per phase is considered 4. We show the parameter setting for WMN-PSOHC in Table 1. Table 1. Parameter settings. Parameters

Values

Clients distribution

Normal distribution

Area size

32.0 × 32.0

Number of mesh routers

16

Number of mesh clients

48

Total iterations

800

Iteration per phase

4

Number of particle-patterns

9

Radius of a mesh router

2.0

Fitness function weight-coefficients (α, β) 0.7, 0.3 Replacement method

RIWM, RDVM

We show the simulation results in Fig. 4 and Fig. 5. For SGC, both replacement methods reach the maximum (100%). This means that all mesh routers

32

S. Sakamoto et al.

Fig. 4. Simulation results of WMN-PSOHC for SGC.

Fig. 5. Simulation results of WMN-PSOHC for NCMC.

are connected to each other. We see that RIWM converges faster than RDVM for SGC. Also, for the NCMC, RIWM performs better than RDVM. Therefore, we conclude that the performance of RIWM is better compared with RDVM.

5

Conclusions

In this work, we evaluated the performance of RIWM and RDVM router replacement methods for WMNs by WMN-PSOHC hybrid intelligent simulation system. Simulation results show that the performance of RIWM is better compared with RDVM. In our future work, we would like to evaluate the performance of the proposed system for different parameters and scenarios.

References 1. Ahmed, S., Khan, M.A., Ishtiaq, A., Khan, Z.A., Ali, M.T.: Energy harvesting techniques for routing issues in wireless sensor networks. Int. J. Grid Util. Comput. 10(1), 10–21 (2019) 2. Akyildiz, I.F., Wang, X., Wang, W.: Wireless mesh networks: a survey. Comput. Netw. 47(4), 445–487 (2005)

Performance Evaluation of Mesh Router Replacement Methods

33

3. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: A hybrid simulation system based on particle swarm optimization and distributed genetic algorithm for WMNs: performance evaluation considering normal and uniform distribution of mesh clients. In: International Conference on Network-Based Information Systems, pp. 42–55. Springer (2018) 4. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: Performance analysis of simulation system based on particle swarm optimization and distributed genetic algorithm for WMNs considering different distributions of mesh clients. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 32–45. Springer (2018) 5. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: Performance evaluation of WMN-PSODGA system for node placement problem in WMNs considering four different crossover methods. In: The 32nd IEEE International Conference on Advanced Information Networking and Applications. (AINA-2018), pp 850–857. IEEE (2018) 6. Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002) 7. Girgis, M.R., Mahmoud, T.M., Abdullatif, B.A., Rabie, A.M.: Solving the wireless mesh network design problem using genetic algorithm and simulated annealing optimization methods. Int. J. Comput. Appl. 96(11), 1–10 (2014) 8. Gorrepotu, R., Korivi, N.S., Chandu, K., Deb, S.: Sub-1 GHz miniature wireless sensor node for IoT applications. Internet Things 1, 27–39 (2018) 9. Inaba, T., Obukata, R., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of a QoS-aware fuzzy-based CAC for LAN access. Int. J. Space-Based Situated Comput. 6(4), 228–238 (2016) 10. Inaba, T., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: A testbed for admission control in WLAN: a fuzzy approach and its performance evaluation. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 559–571. Springer (2016) 11. Islam, M.M., Funabiki, N., Sudibyo, R.W., Munene, K.I., Kao, W.C.: A dynamic access-point transmission power minimization method using PI feedback control in elastic WLAN system for IoT applications. Internet Things 8(100), 089 (2019) 12. Maolin, T., et al.: Gateways placement in backbone wireless mesh networks. Int. J. Commun. Netw. Syst. Sci. 2(1), 44 (2009) 13. Marques, B., Coelho, I.M., Sena, A.D.C., Castro, M.C.: A network coding protocol for wireless sensor fog computing. Int. J. Grid Util. Comput. 10(3), 224–234 (2019) 14. Matsuo, K., Sakamoto, S., Oda, T., Barolli, A., Ikeda, M., Barolli, L.: Performance analysis of WMNs by WMN-GA simulation system for two WMN architectures and different TCP congestion-avoidance algorithms and client distributions. Int. J. Commun. Netw. Distrib. Syst. 20(3), 335–351 (2018) 15. Naka, S., Genji, T., Yura, T., Fukuyama, Y.: A hybrid particle swarm optimization for distribution state estimation. IEEE Trans. Power Syst. 18(1), 60–68 (2003) 16. Ohara, S., Barolli, A., Sakamoto, S., Barolli, L.: Performance analysis of WMNs by WMN-PSODGA simulation system considering load balancing and client uniform distribution. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp 25–38. Springer (2019) 17. Ozera, K., Bylykbashi, K., Liu, Y., Barolli, L.: A fuzzy-based approach for cluster management in VANETs: performance evaluation for two fuzzy-based systems. Internet Things 3, 120–133 (2018)

34

S. Sakamoto et al.

18. Ozera, K., Inaba, T., Bylykbashi, K., Sakamoto, S., Ikeda, M., Barolli, L.: A WLAN triage testbed based on fuzzy logic and its performance evaluation for different number of clients and throughput parameter. Int. J. Grid Util. Comput. 10(2), 168–178 (2019) 19. Petrakis, E.G., Sotiriadis, S., Soultanopoulos, T., Renta, P.T., Buyya, R., Bessis, N.: Internet of things as a service (iTaaS): challenges and solutions for management of sensor data on the cloud and the fog. Internet Things 3, 156–174 (2018) 20. Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization. Swarm Intell. 1(1), 33–57 (2007) 21. Sakamoto, S., Lala, A., Oda, T., Kolici, V., Barolli, L., Xhafa, F.: Analysis of WMN-HC simulation system data using friedman test. In: The Ninth International Conference on Complex, Intelligent, and Software Intensive Systems. (CISIS-2015), pp 254–259. IEEE (2015) 22. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation and evaluation of a simulation system based on particle swarm optimisation for node placement problem in wireless mesh networks. Int. J. Commun. Netw. Distrib. Syst. 17(1), 1–13 (2016) 23. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation of a new replacement method in WMN-PSO simulation system and its performance evaluation. The 30th IEEE International Conference on Advanced Information Networking and Applications. (AINA-2016), pp. 206–211 (2016). https://doi.org/ 10.1109/AINA.2016.42 24. Sakamoto, S., Obukata, R., Oda, T., Barolli, L., Ikeda, M., Barolli, A.: Performance analysis of two wireless mesh network architectures by WMN-SA and WMN-TS simulation systems. J. High Speed Netw. 23(4), 311–322 (2017) 25. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Implementation of intelligent hybrid systems for node placement problem in WMNs considering particle swarm optimization, hill climbing and simulated annealing. Mob. Netw. Appl. 23(1), 27–33 (2018) 26. Sakamoto, S., Barolli, A., Barolli, L., Okamoto, S.: Implementation of a web interface for hybrid intelligent systems. Int. J. Web Inf. Syst. 15(4), 420–431 (2019) 27. Sakamoto, S., Barolli, L., Okamoto, S.: WMN-PSOSA: an intelligent hybrid simulation system for WMNs and its performance evaluations. Int. J. Web Grid Serv. 15(4), 353–366 (2019) 28. Sakamoto, S., Ohara, S., Barolli, L., Okamoto, S.: Performance evaluation of WMNs by WMN-PSOHC system considering random inertia weight and linearly decreasing inertia weight replacement methods. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp 39–48. Springer (2019) 29. Sakamoto, S., Ozera, K., Barolli, A., Ikeda, M., Barolli, L., Takizawa, M.: Implementation of an intelligent hybrid simulation systems for WMNs based on particle swarm optimization and simulated annealing: performance evaluation for different replacement methods. Soft. Comput. 23(9), 3029–3035 (2019) 30. Schutte, J.F., Groenwold, A.A.: A study of global optimization using particle swarms. J. Global Optim. 31(1), 93–108 (2005) 31. Shi, Y.: Particle swarm optimization. IEEE Connections 2(1), 8–13 (2004) 32. Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. In: Evolutionary Programming VII, pp. 591–600 (1998) 33. Wang, J., Xie, B., Cai, K.: Agrawal DP efficient mesh router placement in wireless mesh networks. In: Proceeding of IEEE International Conference on Mobile Adhoc and Sensor Systems. (MASS-2007), pp. 1–9 (2007)

Performance Evaluation of Mesh Router Replacement Methods

35

34. Xhafa, F., Sanchez, C., Barolli, L.: Ad hoc and neighborhood search methods for placement of mesh routers in wireless mesh networks. In: Proceeding of 29th IEEE International Conference on Distributed Computing Systems Workshops. (ICDCS2009), pp. 400–405 (2009) 35. Yaghoobirafi, K., Nazemi, E.: An autonomic mechanism based on ant colony pattern for detecting the source of incidents in complex enterprise systems. Int. J. Grid Util. Comput. 10(5), 497–511 (2019)

A Decision-Making System Based on Fuzzy Logic for IoT Node Selection in Opportunistic Networks Considering Node Betweenness Centrality as a New Parameter

Miralda Cuka1(B), Donald Elmazi1, Makoto Ikeda1, Keita Matsuo1, Leonard Barolli1, and Makoto Takizawa2

1 Department of Information and Communication Engineering, Fukuoka Institute of Technology (FIT), 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
[email protected], [email protected], [email protected], {kt-matsuo,barolli}@fit.ac.jp
2 Department of Advanced Sciences, Hosei University, 2-17-1 Fujimi, Chiyoda, Tokyo 102-8160, Japan
[email protected]

Abstract. Designed as specialized ad hoc networks suitable for applications such as emergency response, Opportunistic Networks (OppNets) are considered a sub-class of DTN where communication opportunities are intermittent and an end-to-end path between the source and destination may never exist. Existing networks have already brought connectivity to a broad range of devices, such as handheld devices, laptops, tablets, PCs, etc. The Internet of Things (IoT) will extend this connectivity beyond just mobile phones and laptops, to buildings, wearable devices, cars, and different things and objects. One of the issues for these networks is identifying the most important IoT nodes and then selecting them to carry out tasks in OppNets. In this work, we implement an IoT Node Selection System (IoNSS) in OppNets based on Fuzzy Logic. We use three input parameters for IoNSS: Node's Buffer Occupancy (NBO), Node's Residual Energy (NRE) and Node Betweenness Centrality (NBC). The output parameter is IoT Node Selection Decision (NSD). The results show that the proposed system makes a proper selection decision for IoT nodes in OppNets.

1 Introduction

Communication systems are becoming increasingly complex, involving thousands of heterogeneous nodes with diverse capabilities and various networking technologies, interconnected with the aim to provide users with ubiquitous access to information and advanced services at a high quality level, in a cost-efficient manner, any time, any place, and in line with the always best connectivity principle.


Opportunistic Networks (OppNets) can provide an alternative way to support the diffusion of information in special locations within a city, particularly in crowded spaces where current wireless technologies can exhibit congestion issues. Sparse connectivity, lack of infrastructure and limited resources further complicate the situation [1,2]. Routing methods for such sparse mobile networks use a different paradigm for message delivery: these schemes utilize node mobility by having nodes carry messages and wait for an opportunity to transfer them to the destination or the next relay, rather than transmitting them over a fixed path [3]. Hence, the challenges for routing in OppNets are very different from those of traditional wireless networks, and their utility and potential for scalability make them a huge success.

The Internet of Things (IoT) seamlessly connects the real world and cyberspace via physical objects embedded with various types of intelligent sensors. A large number of Internet-connected machines will generate and exchange an enormous amount of data that makes daily life more convenient, helps to make tough decisions and provides beneficial services. The IoT is probably one of the most popular networking concepts with the potential to bring out many benefits [4,5].

Fuzzy Logic (FL) is a unique approach that is able to simultaneously handle numerical data and linguistic knowledge. Fuzzy logic works on the levels of possibilities of the input to achieve a definite output.

In this paper, we propose and implement a Fuzzy-based system for node selection in OppNets, called IoNSS. We use three input parameters: Node's Buffer Occupancy (NBO), Node's Residual Energy (NRE) and Node Betweenness Centrality (NBC). The output parameter Node Selection Decision (NSD) represents the possibility that an IoT node will be selected to complete a certain task. We evaluate the system and present the simulation results for different values of the input parameters.

The rest of the paper is organized as follows. In Sect. 2, we present IoT and OppNets. Section 3 describes the proposed system. The simulation results are shown in Sect. 4. Finally, conclusions and future work are given in Sect. 5.

2 IoT and OppNets

The IoT network is a combination of IoT nodes which are connected through different mediums and use an IoT gateway to connect to the Internet. The data transmitted through the gateway is stored and processed securely within a cloud server. These new connected things will trigger increasing demand for new IoT applications that are not limited to individual users only. The current solutions for IoT application development generally rely on integrated service-oriented programming platforms. In particular, resources (e.g., sensory data, computing resources, and control information) are modeled as services and deployed in the cloud or at the edge. It is difficult to achieve rapid deployment and flexible resource management at network edges; in addition, an IoT system's scalability will be restricted by the capability of the edge nodes [6].


Fig. 1. OppNets scenario.

The data might be relayed by any node and forwarded through other nodes (e.g., via smartphones) even in the absence of a predefined end-to-end path, by exploiting opportunities for communication as soon as they become available. Evidently, such an OppNet paradigm plays an important role as an enabler for communication in IoT, because without it disconnected networks of nodes could not be connected to the Internet world [7]. An OppNet comprises a network where nodes can be anything from pedestrians, vehicles, fixed nodes and so on (see Fig. 1). The data is sent from sender to receiver by using any communication opportunity, which can be Wi-Fi, Bluetooth, cellular technologies or satellite links. In such a scenario, IoT nodes might roam and opportunistically encounter several different statically deployed networks and perform either data collection or dissemination as well as relaying data between these networks, thus introducing further connectivity for disconnected networks. For example, as seen in Fig. 1, a car could opportunistically encounter other IoT nodes, collect information from them and relay it until it finds an available access point where it can upload the information. Similarly, a person might collect information from home-based weather stations and relay it through several other people, cars and buses until it reaches its intended destination [7]. OppNets are not limited to only such applications, as they can introduce further connectivity and benefits to IoT scenarios.

3 Description of the Proposed System

The main goal of our decision-making system is to select the best IoT nodes among a set of available ones. To achieve this, our system based on FL takes three


inputs into consideration. Fuzzy sets and FL have been developed to manage vagueness and uncertainty in the reasoning process of an intelligent system such as a knowledge-based system, an expert system or a logic control system [8–20].

Our system implementation consists of three main stages. In the first stage, the input and output parameters used for our system are selected based on the challenges OppNets face. The parameters we have used are:

• Node's Buffer Occupancy (NBO);
• Node's Residual Energy (NRE);
• Node Betweenness Centrality (NBC).

The NBO and NRE parameters indicate the amount of data stored in the node's buffer at any given time and the energy remaining in the node's battery. We have considered energy and buffer occupancy as resources which vary over time and whose careful management extends network longevity. The NBC parameter measures the extent to which a node lies on the paths between other nodes. Such nodes have a significant influence in the network, as the information passes through them.
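As a rough illustration (not part of the original paper, which does not detail how NBC is obtained), the betweenness centrality of nodes in a contact graph could be computed with the networkx library; the contact list below is a hypothetical example.

```python
# A minimal sketch, assuming a contact graph of IoT nodes built from observed
# opportunistic encounters (hypothetical data; the paper gives no such topology).
import networkx as nx

# Nodes are IoT devices; edges are contacts observed between them.
contacts = [("a", "b"), ("b", "c"), ("b", "d"), ("d", "e")]
G = nx.Graph(contacts)

# Betweenness centrality: the fraction of shortest paths between all node
# pairs that pass through each node, normalized to [0, 1].
nbc = nx.betweenness_centrality(G, normalized=True)
print(nbc)  # node "b" lies on most shortest paths, so it gets the highest value
```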

In the second stage, we design the proposed system based on FL (see Fig. 2). It consists of one Fuzzy Logic Controller (FLC), which is the main part of our system, its basic elements and the input and output parameters.

Fig. 2. IoNSS proposed system and its components.

Table 1. Parameters and their term sets for FLC.

System  Parameters                               Term sets
IoNSS   I: Node's Buffer Occupancy (NBO)         Empty (Em), Medium (Med), Full (Fu)
        I: Node's Residual Energy (NRE)          Low (Lw), Medium (Mdm), High (Hi)
        I: Node Betweenness Centrality (NBC)     Small (Sm), Medium (Md), High (Hg)
        O: Node Selection Decision (NSD)         Extremely Low Selection Possibility (ELSP), Very Low Selection Possibility (VLSP), Low Selection Possibility (LSP), Medium Selection Possibility (MSP), High Selection Possibility (HSP), Very High Selection Possibility (VHSP), Extremely High Selection Possibility (EHSP)

Table 2. FRB.

No. NBO NRE NBC NSD    No. NBO NRE NBC NSD    No. NBO NRE NBC NSD
1   Em  Lw  Sm  LSP    10  Med Lw  Sm  ELSP   19  Fu  Lw  Sm  ELSP
2   Em  Lw  Md  MSP    11  Med Lw  Md  VLSP   20  Fu  Lw  Md  ELSP
3   Em  Lw  Hg  VHSP   12  Med Lw  Hg  MSP    21  Fu  Lw  Hg  VLSP
4   Em  Mdm Sm  HSP    13  Med Mdm Sm  VLSP   22  Fu  Mdm Sm  ELSP
5   Em  Mdm Md  VHSP   14  Med Mdm Md  LSP    23  Fu  Mdm Md  VLSP
6   Em  Mdm Hg  EHSP   15  Med Mdm Hg  HSP    24  Fu  Mdm Hg  MSP
7   Em  Hi  Sm  VHSP   16  Med Hi  Sm  MSP    25  Fu  Hi  Sm  LSP
8   Em  Hi  Md  EHSP   17  Med Hi  Md  HSP    26  Fu  Hi  Md  LSP
9   Em  Hi  Hg  EHSP   18  Med Hi  Hg  VHSP   27  Fu  Hi  Hg  HSP

Fig. 3. Input and output MFs for IoNSS.

We first define all the linguistic variables and their term sets for each parameter, as shown in Table 1. Next, the appropriate Fuzzy Membership Functions (FMFs) are chosen based on the inputs, the output and the problem specifics (see Fig. 3). We have decided to use triangular and trapezoidal FMFs due to their simplicity and computational efficiency [21]. However, the overlapping triangle-to-triangle and trapezoid-to-triangle fuzzy regions cannot be addressed by any general rule; they depend on the parameters and the specifics of their applications. Lastly, the Fuzzy Rule Base (FRB) is built as a set of fuzzy parameters and decision rules, as shown in Table 2. Decision rules are fuzzy conditional statements expressed in the "if X then Y" form, where X and Y are term sets characterized by appropriate FMFs. In the third stage, simulations are performed to analyze the behavior of the system.
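To make the inference step concrete, the following is a minimal Mamdani-style sketch of how such an FLC can be evaluated. It is not the authors' implementation: the membership-function breakpoints and the NSD term centers are illustrative guesses (the actual shapes are those plotted in Fig. 3), and only four of the 27 rules of Table 2 are shown.

```python
# A minimal sketch of the kind of inference the IoNSS FLC performs.
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Three terms per input over the unit universe (assumed layout).
TERMS = {"lo": (-0.5, 0.0, 0.5), "mid": (0.0, 0.5, 1.0), "hi": (0.5, 1.0, 1.5)}

# Excerpt of the FRB of Table 2: (NBO, NRE, NBC) -> NSD term.
RULES = [
    (("lo", "lo", "lo"), "LSP"),   # rule 1:  Em, Lw, Sm -> LSP
    (("lo", "hi", "hi"), "EHSP"),  # rule 9:  Em, Hi, Hg -> EHSP
    (("hi", "lo", "lo"), "ELSP"),  # rule 19: Fu, Lw, Sm -> ELSP
    (("hi", "hi", "hi"), "HSP"),   # rule 27: Fu, Hi, Hg -> HSP
]
# Assumed output centers for the seven NSD terms, evenly spaced in [0, 1].
NSD = {"ELSP": 0.0, "VLSP": 1/6, "LSP": 2/6, "MSP": 0.5,
       "HSP": 4/6, "VHSP": 5/6, "EHSP": 1.0}

def nsd(nbo, nre, nbc):
    """Fire every rule with min-AND and defuzzify by weighted average."""
    num = den = 0.0
    for (t1, t2, t3), out in RULES:
        w = min(tri(nbo, *TERMS[t1]), tri(nre, *TERMS[t2]), tri(nbc, *TERMS[t3]))
        num += w * NSD[out]
        den += w
    return num / den if den else 0.0

print(nsd(0.1, 0.9, 0.9))  # empty buffer, high energy, central node -> high NSD
```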

4 System Evaluation

Figure 4 shows the simulation results for our proposed IoNSS system. We estimate the possibility of an IoT node being selected based on the three input parameters. The simulation results show that an increase in NBC increases the possibility of that node being selected. In Fig. 4(a), for NRE = 0.1, when NBC increases from 0.1 to 0.5 and from 0.5 to 0.9, NSD increases 15% and 23%, respectively. We

Fig. 4. Simulation results for IoNSS.


see this increase because nodes with a higher betweenness centrality have more control over the network, since more information will pass through them. NBO is another parameter we considered in this work. When we compare Fig. 4(a) with Fig. 4(b) and Fig. 4(a) with Fig. 4(c), for NRE = 0.1 and NBC = 0.5, NSD decreases 15% and 42%, respectively. An increase in NBO causes a decrease in NSD, because the buffer of the node is more likely to overflow or drop messages. The NRE parameter is important because most of the nodes are battery-powered. In Fig. 4(a), for NBC = 0.3, NSD increases 29% when NRE increases from 0.1 to 0.5, and 14% when NRE increases from 0.5 to 0.9. Different tasks have different battery demands, so a high NRE means more energy to accomplish them.

5 Conclusions and Future Work

In this paper, we proposed a fuzzy-based system for IoT node selection in OppNets called IoNSS. The IoNSS makes the selection decision and chooses which IoT nodes from a set of nodes are better suited for a task. We used three input parameters for the selection decision. The NSD increases with the increase of NRE and decreases with the increase of NBO. A high NBC increases NSD, as the possibility of the message reaching the destination is increased. In future work, we will also consider other parameters for IoT node selection and perform extensive simulations and experiments to evaluate the proposed system.

References

1. Dhurandher, S.K., Sharma, D.K., Woungang, I., Bhati, S.: HBPR: history based prediction for routing in infrastructure-less opportunistic networks. In: 27th International Conference on Advanced Information Networking and Applications (AINA), pp. 931–936. IEEE (2013)
2. Spaho, E., Mino, G., Barolli, L., Xhafa, F.: Goodput and PDR analysis of AODV, OLSR and DYMO protocols for vehicular networks using CAVENET. Int. J. Grid Util. Comput. 2(2), 130–138 (2011)
3. Abdulla, M., Simon, R.: The impact of intercontact time within opportunistic networks: protocol implications and mobility models. TechRepublic White Paper (2009)
4. Kraijak, S., Tuwanut, P.: A survey on internet of things architecture, protocols, possible applications, security, privacy, real-world implementation and future trends. In: 16th International Conference on Communication Technology (ICCT), pp. 26–31. IEEE (2015)
5. Arridha, R., Sukaridhoto, S., Pramadihanto, D., Funabiki, N.: Classification extension based on IoT-big data analytic for smart environment monitoring and analytic in real-time system. Int. J. Space-Based Situated Comput. 7(2), 82–93 (2017)
6. Chen, N., Yang, Y., Li, J., Zhang, T.: A fog-based service enablement architecture for cross-domain IoT applications. In: 2017 IEEE Fog World Congress (FWC), pp. 1–6. IEEE (2017)


7. Pozza, R., Nati, M., Georgoulas, S., Moessner, K., Gluhak, A.: Neighbor discovery for opportunistic networking in internet of things scenarios: a survey. IEEE Access 3, 1101–1131 (2015)
8. Inaba, T., Sakamoto, S., Kolici, V., Mino, G., Barolli, L.: A CAC scheme based on fuzzy logic for cellular networks considering security and priority parameters. In: The 9th International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA-2014), pp. 340–346 (2014)
9. Spaho, E., Sakamoto, S., Barolli, L., Xhafa, F., Barolli, V., Iwashige, J.: A fuzzy-based system for peer reliability in JXTA-overlay P2P considering number of interactions. In: The 16th International Conference on Network-Based Information Systems (NBiS-2013), pp. 156–161 (2013)
10. Matsuo, K., Elmazi, D., Liu, Y., Sakamoto, S., Mino, G., Barolli, L.: FACS-MP: a fuzzy admission control system with many priorities for wireless cellular networks and its performance evaluation. J. High Speed Netw. 21(1), 1–14 (2015)
11. Grabisch, M.: The application of fuzzy integrals in multicriteria decision making. Eur. J. Oper. Res. 89(3), 445–456 (1996)
12. Inaba, T., Elmazi, D., Liu, Y., Sakamoto, S., Barolli, L., Uchida, K.: Integrating wireless cellular and ad-hoc networks using fuzzy logic considering node mobility and security. In: The 29th IEEE International Conference on Advanced Information Networking and Applications Workshops (WAINA-2015), pp. 54–60 (2015)
13. Kulla, E., Mino, G., Sakamoto, S., Ikeda, M., Caballé, S., Barolli, L.: FBMIS: a fuzzy-based multi-interface system for cellular and ad-hoc networks. In: International Conference on Advanced Information Networking and Applications (AINA-2014), pp. 180–185 (2014)
14. Elmazi, D., Kulla, E., Oda, T., Spaho, E., Sakamoto, S., Barolli, L.: A comparison study of two fuzzy-based systems for selection of actor node in wireless sensor actor networks. J. Ambient Intell. Humaniz. Comput. 6(5), 635–645 (2015)
15. Zadeh, L.: Fuzzy logic, neural networks, and soft computing. ACM Commun. 37(3), 77–84 (1994)
16. Spaho, E., Sakamoto, S., Barolli, L., Xhafa, F., Ikeda, M.: Trustworthiness in P2P: performance behaviour of two fuzzy-based systems for JXTA-overlay platform. Soft. Comput. 18(9), 1783–1793 (2014)
17. Inaba, T., Sakamoto, S., Kulla, E., Caballe, S., Ikeda, M., Barolli, L.: An integrated system for wireless cellular and ad-hoc networks using fuzzy logic. In: International Conference on Intelligent Networking and Collaborative Systems (INCoS-2014), pp. 157–162 (2014)
18. Matsuo, K., Elmazi, D., Liu, Y., Sakamoto, S., Barolli, L.: A multi-modal simulation system for wireless sensor networks: a comparison study considering stationary and mobile sink and event. J. Ambient Intell. Humaniz. Comput. 6(4), 519–529 (2015)
19. Kolici, V., Inaba, T., Lala, A., Mino, G., Sakamoto, S., Barolli, L.: A fuzzy-based CAC scheme for cellular networks considering security. In: International Conference on Network-Based Information Systems (NBiS-2014), pp. 368–373 (2014)
20. Liu, Y., Sakamoto, S., Matsuo, K., Ikeda, M., Barolli, L., Xhafa, F.: A comparison study for two fuzzy-based systems: improving reliability and security of JXTA-overlay P2P platform. Soft. Comput. 20(7), 2677–2687 (2015)
21. Mendel, J.M.: Fuzzy logic systems for engineering: a tutorial. Proc. IEEE 83(3), 345–377 (1995)

Applying a Consensus Building Approach to Communication Projects in the Health Sector: The Momento Medico Case Study

Ilaria Avino, Giuseppe Fenza, Graziano Fuccio, Alessia Genovese, Vincenzo Loia, and Francesco Orciuoli(B)

DISA-MIS, Università degli Studi di Salerno, Via Giovanni Paolo II 132, 84084 Fisciano, SA, Italy
{gfenza,gfuccio,loia,forciuoli}@unisa.it, {i.avino,a.genovese28}@studenti.unisa.it
https://www.disa.unisa.it/

Abstract. This work describes the decision-making solution provided by the CONSENSUS project. Such solution is based on contextualising, applying and experimenting with both the Fuzzy Consensus Model and the Average Rating Values algorithm in the Consensus Conference, i.e., a decision-making method widely used to achieve an agreement among different opinions on controversial and complex medical issues. The Consensus Conference has been applied so far without the use of technological enablers, but the need for automatically executing some of its steps, tracing all the phases of the process, improving the transparency of the whole procedure, eliminating some biases due to the physical presence of the decision-makers, and allowing the remote interaction of all the involved actors has fed the idea to create a tool based on the above computational models.

Keywords: Consensus conference · Fuzzy consensus model · Average rating values

1 Introduction

The Consensus Conference [2] is a method used to achieve an agreement among several actors on particularly controversial and complex medical issues, and it develops through a structured process. The aim is to produce evidence-based recommendations supporting operators and patients in the management of specific clinical situations. Such recommendations are typically produced starting from the evaluation of the best scientific evidence. This is not always sufficient to determine with certainty what is the best choice (possibly, among a set of alternatives) in the different situations that could be faced in clinical practice. The existence of these uncertainty areas is the starting point for organizing a structured discussion and involving a group of experts.


In general, the Consensus Conference method foresees the consultation of a group of researchers with experience on topics related to a specific issue. These researchers are charged by the Conference Organizing Committee (COC) to prepare and present a summary of the available scientific knowledge related to the aforementioned issue, in front of a jury composed of specialists and non-specialists in the field. Moreover, the COC has the possibility to create working groups to analyze different aspects related to the conference issue, and from their work the jury draws information to compare the available evidence with the experts' opinions. At the end of the consultation, the jury draws up a final report, which summarizes the answers obtained by aggregating the experts' opinions, and the related recommendations for the clinical application. The stronger the agreement among the experts' opinions, the greater the value of the final report produced along the phases of the conference. Thus, the phase in which the COC looks for the agreement among the experts is crucial for the quality of the result and very important for companies which offer the Consensus Conference as a service.

The Consensus Conference method is also used by Momento Medico, namely MM (https://www.momentomedico.it/), a company that promotes scientific culture by carrying out scientific communication projects. Typically, the customers of MM are medical or pharmaceutical companies interested in realizing new communication projects on drugs and therapies related, for instance, to some disease. Momento Medico adopts a Consensus Conference procedure that replaces the activity of the COC with the combined work of both the customer and the MM representative. The MM representative also plays the role of moderator in the conference and replaces the activity of the jury.

During the execution of the Consensus Conference many critical points could arise: i) experts may be influenced by their own experiences or prejudices, commonly known as bias, in making a decision; ii) the need to protect the experts' identity, guaranteeing anonymity so that they do not feel limited in expressing their opinion due to the fear of other experts' judgments; iii) the impossibility of being physically present in the same place for geographical reasons, timing or exceptional events, such as the recent "Covid-19" health emergency. Furthermore, technical needs arise, like, for instance, the request for tracing all the phases of the process in order to analyse data and improve the next conferences. Therefore, it is also possible to improve the method of expert selection, group building and so on.

In this context, the employment of a computational model to support the Consensus Conference phases is proposed. In particular, this work proposes an approach based on the integrated use of the Fuzzy Consensus Model and the Average Rating Values approaches to provide a comprehensive formal framework on which a Group Decision Support System tool can be implemented and deployed, offering a solution to the aforementioned criticalities of the traditional Consensus Conference.

The remaining part of the work is structured as follows. Section 2 describes the contextualisation of the Fuzzy Consensus Model and Average Rating Values to the Consensus Conference. Section 3 describes the software prototype implementing the Group Decision Support System to sustain the phases of the Consensus Conference. Section 4 describes several experimentations carried out by the


research team of the University of Salerno. Lastly, Sect. 5 provides conclusions and final remarks.

2 A Computational Approach for the Consensus Conference

The idea underlying this work is to propose a computational approach for supporting the Consensus Conference in order to solve the criticalities discussed in Sect. 1. Such approach develops along a process leveraging two main sub-processes: consensus-building and selection. The former is enabled by the Fuzzy Consensus Model [8]. The latter is enabled by the Average Rating Values algorithm [3].

Fig. 1. Consensus Conference process

In the following subsections, the computational approach supporting the Consensus Conference is described along the conference phases (as depicted in Fig. 1) in order to emphasize its benefits. It is important to underline that phase #4 and phase #5 implement the two main sub-processes.

2.1 Phase #1: Conference Setting

In this phase, MM arranges a Consensus Conference upon request by a customer who needs to promote a medical product and/or a specific therapy. Thus, the customer representative meets the MM representative and asks for arranging the conference. This is the first contact and the starting point of the Consensus Conference. Firstly, they define the main topic to deal with. Given the conference main topic, MM is in charge of looking for and selecting experts with the most appropriate background, and the customer is in charge of preparing the product documentation to allow experts to read, study, analyse and discuss it. Experts are classified according to their knowledge level in: i) experts with the highest level of knowledge on the defined topic (e.g. university professors or nationally


recognized doctors), called Very-High Level experts; ii) experts with a high level of knowledge in the field of the defined topic (e.g. specialists), called High Level experts; and iii) experts with a medium level of knowledge (e.g. general practitioners), called Medium Level experts. The number and the class of the experts to be selected are parameters of the whole process. Typically, MM selects 50% of Very-High Level experts and the remaining part of High Level or Medium Level experts. At the end of this phase, a subset of experts E has been selected from the whole expert set.

2.2 Phase #2: Brainstorming for Decision Problem Definition

In this phase, the customer representative explains the conference main topic and her issues Q to the selected experts E:

$Q = \{q_1, q_2, \ldots, q_l\}$  (1)

Thereafter, MM takes the customer away from the meeting and the experts can start their work. Experts discuss in the brainstorming session [1] in order to elicit all reasonable and supported viewpoints for each issue proposed by the customer. During the brainstorming, similar viewpoints are aggregated or further processed to obtain a set of alternatives for each different issue. This task is coordinated by a moderator committed by MM. Thus, each issue $q_j$ has its set of mutually exclusive alternatives (viewpoints):

$A^{q_j} = \{a_1^{q_j}, \ldots, a_{n_j}^{q_j}\}, \quad j = 1, \ldots, l$  (2)

Alternatives in the same set are different viewpoints on the same issue. In other terms, the formal definition of a decision problem has been constructed during the current phase. More in detail, for each issue $q_j$, one of its associated alternatives in $A^{q_j}$ must be selected as the most suitable choice. This is a matter of the next phases.

2.3 Phase #3: Configuring Decision-Making Groups

Once the issues and the corresponding sets of alternatives have been formalized, it is possible to divide the selected experts into one or more decision-making groups. The number of groups is established during the conference setting [4]. Typically, if there is more than one group, the groups work independently in making their decisions with respect to all the proposed issues. At the end of the process, each group provides a choice for each issue (by selecting it among all the alternatives). Given the independence of the groups, it is possible to have conflicting results (decisions). In this case, MM must synthesize and harmonize the results. Generally, each decision-making group is composed of more than 50% Very-High Level experts, with the remaining part being High Level or Medium Level experts.


The simplest scenario provides for the construction of only one decision-making group, in order to minimize the harmonization phase conducted by MM. The i-th decision-making group is:

$G_i = \{e_{i1}, e_{i2}, \ldots, e_{im}\}$  (3)

Lastly, a function L is defined:

$L : G \to \{VH, H, M\}$  (4)

Function L is used to retrieve the level of the individual experts belonging to a given decision-making group and is fundamental to support the Feedback Mechanism during the phase described in Sect. 2.4.

2.4 Phase #4: Consensus Building in Decision-Making Groups

Once the decision-making groups have been created, MM provides each group with $Q = \{q_1, q_2, \ldots, q_l\}$ and $A^{q_j}$ (for all $j = 1, \ldots, l$), i.e., the l decision-making problems to solve. For the sake of simplicity, it is better to consider only one decision-making group, namely $G = \{e_1, e_2, \ldots, e_m\}$, to which the Fuzzy Consensus Model approach [7] is applied.

Firstly, each expert in G must provide her preferences on the set of alternatives $A^{q_j}$, with $q_j \in Q$, $\forall j = 1, \ldots, l$. Preferences are provided by using Fuzzy Preference Relations [6]. More in detail, the expert $e_i$ must provide a matrix $P_j^i$ for each issue $q_j$ (thus, each expert provides l Fuzzy Preference Relations, one per issue). The value $p_{hk}^{ij}$, i.e., the value corresponding to the cell (h, k) of the matrix, indicates the degree to which the alternative $a_h^{q_j}$ is preferred to the alternative $a_k^{q_j}$, for $h, k = 1, \ldots, n_j$.

Secondly, the Fuzzy Consensus Model - which takes as input the Fuzzy Preference Relations of all experts - is employed to compute a consensus measure [8], within the decision-making group, for each issue $q_j$, for all $j = 1, \ldots, l$. Let $cr_j$ be the consensus measure achieved for the issue $q_j$. Such measure represents the agreement degree of the group of experts with respect to the alternatives for the issue $q_j$. For each $j = 1, \ldots, l$, the consensus measure $cr_j$ must be compared to the minimum required consensus level cl: if $cr_j \geq cl$, then an acceptable level of consensus is achieved and it is possible to rank the alternatives in order to retrieve the most suitable one (the final choice of the group). The ranking of the alternatives is provided by the selection process described in Sect. 2.5. Otherwise, if $cr_j < cl$, then it is required to employ the Feedback Mechanism.

The previous operations can be executed without the support of a moderator. In the Feedback Mechanism, instead, since there is the need of trying to increase the consensus measure, the moderator plays a fundamental role. She is in charge of identifying specific experts and providing them with personalized advice to convince them to change their opinions to strengthen the agreement in the group. The moderator's tasks, in the Feedback Mechanism, can be fully automatized [8].
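To fix ideas, the following is a deliberately simplified sketch of a similarity-based consensus measure over Fuzzy Preference Relations. It is not the model of [8], which aggregates similarities at the level of preference pairs, alternatives and the whole relation and takes the expert weights into account; here everything is flattened into a single unweighted average for illustration.

```python
# A minimal sketch of a similarity-based consensus measure (simplified; see text).
from itertools import combinations

def consensus(fprs):
    """fprs: list of n x n preference matrices (one per expert); diagonal None."""
    n = len(fprs[0])
    sims = []
    for pa, pb in combinations(fprs, 2):          # every pair of experts
        for h in range(n):
            for k in range(n):
                if h != k:                        # skip the undefined diagonal
                    sims.append(1.0 - abs(pa[h][k] - pb[h][k]))
    return sum(sims) / len(sims)                  # agreement degree in [0, 1]
```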


The personalization process takes care of the classification of the experts obtained by using the function in Eq. (4) and two thresholds, namely λ1 and λ2. The idea is to provide a greater amount of advice for less-experienced decision-makers and less for more-experienced ones. Once the advice is received, each expert can change her Fuzzy Preference Relation. This operation is needed for all the issues in Q. Once all the new Fuzzy Preference Relations are ready, the new consensus measures must be computed to execute a new iteration. It is evident that the consensus process might not converge. Thus, a maximum number of iterations must be pre-defined and used as an exit condition. Hence, it is possible to have consensus-unachieved issues. In this case, MM knows that a risk to continue with the communication project arises. MM should assess the risk and define a mitigation strategy for it. It could continue with the phase described in Sect. 2.5 for selecting the most suitable alternative also for the consensus-unachieved issues.

2.5 Phase #5: Selection of the Most Suitable Alternative

Once all the issues in Q (both consensus-achieved and consensus-unachieved) have a final consensus measure, it is possible to choose the most suitable alternative for each of them. This operation can be executed by using a ranking algorithm that accepts as input the Fuzzy Preference Relations provided by the experts. Also in this case, the computation is repeated for each issue. In the CONSENSUS Project, the Average Rating Values algorithm has been employed to execute the ranking task [3]. At the end of the algorithm execution, a rank vector $R_j$ is obtained for each issue $q_j \in Q$. For each issue, the alternative with the highest rank represents the choice of the group for that issue. Once the algorithm has been executed for all the issues, MM can write a report for the customer, providing the final result of the whole Consensus Conference. The report will contain, for each issue, the alternatives, the final choice and the rank vector, the last computed consensus measure, the consensus-achieved/consensus-unachieved flag, the consensus level (threshold) and all the other conference parameters. The report will also include qualitative information related to the risk analysis.
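As a hedged sketch of the ranking step, one plausible reading of [3] is that the rating of an alternative is the mean of its preference degrees over all other alternatives, averaged across the experts; the exact algorithm used by the CONSENSUS project may differ in its aggregation details.

```python
# A sketch of an Average-Rating-Values-style ranking (one reading of [3]).
def average_rating_values(fprs):
    n = len(fprs[0])
    ranks = []
    for h in range(n):
        # Collect every degree to which alternative h is preferred to another.
        vals = [p[h][k] for p in fprs for k in range(n) if k != h]
        ranks.append(sum(vals) / len(vals))
    return ranks  # R_j: the alternative with the highest value is the choice

# best = max(range(len(R)), key=R.__getitem__)  # index of the chosen alternative
```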

3 Prototype Implementation

This section shows the main functionalities of the software prototype concerning the creation and management of the Consensus Conference. Such prototype implements the phases described in Sect. 2 and has been developed by using the following tools: Python 3.6 (https://www.python.org/), for implementing the Fuzzy Consensus Model and the Average Rating Values algorithm (both functionalities have been implemented as external services invoked by the main application); MySQL (https://www.mysql.com/), for implementing the data layer of the main application; and PHP 7.4 (https://www.php.net/), with CakePHP 3.8 (https://cakephp.org/), for implementing the Web and the Business logic layers of the main application.


Fig. 2. Understanding the issue and providing fuzzy preference relations

Figure 2 shows the Web UI used by an expert to read the current issue and set the pairwise preferences to fill in her Fuzzy Preference Relation matrix (such filling is automatically executed by the application), needed to start the consensus-building sub-process. Figure 3 shows the Web UI used by an expert to receive advice from the Feedback Mechanism and change her pairwise preferences (if she decides so). Figure 4 reports a snapshot of the Web UI accessed by the Momento Medico representative. This screen shows the consensus measures obtained at each round (iteration) for tracing and monitoring the process.

4 Experiments and Results

In this section, one of the first experiments executed in the context of Project CONSENSUS is described. Such experiment has been carried out through the engagement of internal stakeholders (involved in the healthcare domain). The experiment considers a single issue with five different alternatives and only one decision-making group composed of five experts. As the first step of the Consensus Conference, all required parameters are fixed:

– Minimum consensus level, cl: 0.8
– Maximum number of rounds, roundmax: 4
– Experts and their weights: five experts are involved in this experiment and a weight is assigned to each expert:
  • Expert1: 0.45 (Very-high level)
  • Expert2: 0.45 (Very-high level)
  • Expert3: 0.25 (High level)
  • Expert4: 0.25 (High level)
  • Expert5: 0.25 (High level)


Fig. 3. Receiving advice

– λ1 = 0.20 and λ2 = 0.40 are two thresholds used to create the sub-groups to which personalized advice is sent in the case of the Feedback Mechanism.

The issue of the Consensus Conference concerns a patient with asthma. In particular, the customer is interested in knowing the optimal approach to the management of an 18-year-old patient with episodes of wheezing, nocturnal coughing, rhinitis, and appearance of respiratory symptoms in the presence of pets. Five different alternatives about the issue are proposed after the execution of the first three phases of the conference (see Sect. 2):

– Alternative 1: Triggering factor removal, spirometry, expiratory flow peak monitoring, pneumological examination;
– Alternative 2: Triggering factor removal and thorax X-ray, spirometry after bronchoconstrictors or bronchodilators, ENT examination, gastroenterological investigations;
– Alternative 3: Triggering factor removal and thorax X-ray, CT thorax scan, expiratory flow peak monitoring, bronchial provocation test with methacholine;

Fig. 4. Summarizing consensus measures


– Alternative 4: Triggering factor removal and thorax X-ray, empirical treatment with ICS and short-acting bronchodilators if necessary;
– Alternative 5: Triggering factor removal, spirometry, empirical treatment with ICS and short-acting bronchodilators if necessary.

Experts provide their fuzzy preference relations as follows:

$P_1^1 = \begin{pmatrix} - & 0.6 & 0.6 & 0.6 & 0.6 \\ 0.4 & - & 0.6 & 0.3 & 0.3 \\ 0.4 & 0.4 & - & 0.3 & 0.3 \\ 0.4 & 0.7 & 0.7 & - & 0.3 \\ 0.4 & 0.7 & 0.7 & 0.7 & - \end{pmatrix}$

$P_1^2 = \begin{pmatrix} - & 0.6 & 0.5 & 1 & 1 \\ 0.4 & - & 0.5 & 0.6 & 0.6 \\ 0.5 & 0.5 & - & 1 & 1 \\ 0 & 0.4 & 0 & - & 1 \\ 0 & 0.4 & 0 & 0 & - \end{pmatrix}$

$P_1^3 = \begin{pmatrix} - & 1 & 0.6 & 1 & 1 \\ 0 & - & 1 & 0.6 & 0.6 \\ 0.4 & 0 & - & 0.6 & 0.6 \\ 0 & 0.4 & 0.4 & - & 0.6 \\ 0 & 0.4 & 0.4 & 0.4 & - \end{pmatrix}$

$P_1^4 = \begin{pmatrix} - & 1 & 1 & 0.3 & 0.3 \\ 0 & - & 0.6 & 0.5 & 0.3 \\ 0 & 0.4 & - & 0.6 & 0.3 \\ 0.7 & 0.5 & 0.4 & - & 0.3 \\ 0.7 & 0.7 & 0.7 & 0.7 & - \end{pmatrix}$

$P_1^5 = \begin{pmatrix} - & 0.6 & 0.5 & 0.6 & 0.6 \\ 0.4 & - & 0.5 & 1 & 0.6 \\ 0.5 & 0.5 & - & 0.5 & 0.5 \\ 0.4 & 0 & 0.5 & - & 0.5 \\ 0.4 & 0.4 & 0.5 & 0.5 & - \end{pmatrix}$

The consensus measure at the end of the first round is cr = 71.9%. Figure 5 shows the consensus measures of all alternatives and the final consensus measure. Since the minimum consensus level cl = 80% has not been achieved, the Feedback Mechanism is activated and a new round can start.

Fig. 5. Consensus measures at the end of the first round


The Feedback Mechanism generates customized advice for the experts, who should modify their own preferences. In particular, the experts are divided into three sub-groups (by using λ1 and λ2 and the approach proposed in [8]). In the case of the experiment, due to the original weights of the experts, only two groups are created: the first one including the most experienced experts and the second one including the highly experienced experts. Moreover, taking into account the expert e3, the advice generated for her by the Feedback Mechanism is reported in Fig. 6.

Fig. 6. Generated advices for expert e3

Hence, the second round starts by submitting to the group of experts the same issue with the advice generated by the Feedback Mechanism. Each expert must provide her preferences another time and can decide whether or not to accept the recommendations. Considering the case of the expert e3, she decides to accept the advice, so that her new pairwise comparison matrix is the following:

$\bar{P}_1^3 = \begin{pmatrix} - & 0.6 & 0.6 & 0.3 & 0.3 \\ 0.4 & - & 0.5 & 0.5 & 0.5 \\ 0.4 & 0.5 & - & 0.3 & 0.3 \\ 0.7 & 0.5 & 0.7 & - & 0.3 \\ 0.7 & 0.5 & 0.7 & 0.7 & - \end{pmatrix}$

The consensus measure computed for the second round is cr = 82.99%. Figure 7(a) shows the consensus measure for each alternative and the final consensus level reached. This time, the minimum required consensus level is achieved. The last phase of the conference can start, and the most suitable alternative can now be selected by means of the Average Rating Values algorithm. In this experiment, the best alternative is Alternative 1, which has been chosen by the majority of the expert group. The consensus has been achieved in two rounds, since all experts have accepted the advice suggested by the Feedback Mechanism. The results are reported in Fig. 7(b).


Fig. 7. Results of round 2: (a) Consensus measures; (b) Ranks

5 Final Remarks

In this paper, a Group Decision Support System for the management of the Consensus Conference has been proposed. The system has been developed by automating the Fuzzy Consensus Model and the Average Rating Values algorithm. Observing the experimentation results, it is possible to underline that both time and costs have been greatly optimized, the human moderator's tasks have been partially automated, the privacy of the experts has been guaranteed in order to avoid the risk of bias, and the experts' individual judgement has been facilitated and encouraged. In future work, the authors would like to organize a new experimentation with a larger number of experts to assess the suitability of the Fuzzy Consensus Model in such a context and the adoption of a further approach [5].

Acknowledgements. This paper is partially supported by MISE (Italian Government) in the context of the research initiative CONSENSUS. Moreover, the authors thank the company Momento Medico (https://www.momentomedico.it/) for having supported this work with its knowledge of the Consensus Conference practice. Finally, we thank Lucia Pascarella for her contribution in the early study phases of the CONSENSUS project, and Luca Rizzuti for his precious support to the Web App implementation phase.

References

1. Al-Samarraie, H., Hurmuzan, S.: A review of brainstorming techniques in higher education. Thinking Skills Creativity 27, 78–91 (2018). https://doi.org/10.1016/j.tsc.2017.12.002
2. Candiani, G., Colombo, C., Daghini, R., Magrini, N., Mosconi, P., Nonino, F., Satolli, R.: Come organizzare una conferenza di consenso. Manuale metodologico. In: Sistema Nazionale per le Linee Guida (2019)


3. Das, S., Ghosh, P.K., Kar, S.: Ranking of alternatives in multiple attribute group decision making: a fuzzy preference relation based approach. In: 2013 International Symposium on Computational and Business Intelligence, pp. 127–130 (2013)
4. Dong, Y., Zhao, S., Zhang, H., Chiclana, F., Herrera-Viedma, E.: A self-management mechanism for noncooperative behaviors in large-scale group consensus reaching processes. IEEE Trans. Fuzzy Syst. 26(6), 3276–3288 (2018)
5. Liao, H.: From conventional group decision making to large-scale group decision making: what are the challenges and how to meet them in big data era? A state-of-the-art survey. Omega 102141 (2019). https://doi.org/10.1016/j.omega.2019.102141
6. Morente-Molinera, J., Pérez, I., Ureña, M., Herrera-Viedma, E.: On multi-granular fuzzy linguistic modeling in group decision making problems: a systematic review and future trends. Knowl.-Based Syst. 74, 49–60 (2015). https://doi.org/10.1016/j.knosys.2014.11.001
7. Pérez, I., Wikström, R., Mezei, J., Carlsson, C., Herrera-Viedma, E.: A new consensus model for group decision making using fuzzy ontology. Soft Comput. 7, 1617–1627 (2013). https://doi.org/10.1007/s00500-012-0975-5
8. Pérez, I.J., Cabrerizo, F.J., Alonso, S., Herrera-Viedma, E.: A new consensus model for group decision making problems with non-homogeneous experts. IEEE Trans. Syst. Man Cybern. Syst. 44(4), 494–498 (2014)

Gesture-Based Human-Machine Interface System by Using Omnidirectional Camera

Liao Sichao(B), Yasuto Nakamura, and Hiroyoshi Miwa

Graduate School of Science and Technology, Kwansei Gakuin University, 2-1 Gakuen, Sanda-shi, Hyogo, Japan
{liaosichao,miwa}@kwansei.ac.jp, [email protected]

Abstract. There has been a huge amount of research on gesture-based human-machine interfaces so far. They are used not only for computer operations but also for virtual reality operations and computer game operations, applications in which gesture input is effective. In general, a dedicated device is developed for such a system; however, a gesture-based human-machine interface using a generic device is more convenient and scalable than one using a dedicated device. In this paper, we propose a gesture-based human-machine interface system by using a generic device, an omnidirectional camera. We can input some simple operations for computer games by gestures, and the system can recognize a series of successive gestures. In addition, the system can simultaneously recognize gestures of several persons who stand along the surrounding of an omnidirectional camera.

1 Introduction

There has been a huge amount of research on gesture-based human-machine interfaces so far. The pointing motion [1] in 1980 is an early example. Nowadays, gesture operations moving both hands toward a wide-field display are practical thanks to inexpensive and easy-to-use motion-capture devices such as "Kinect" and "Leap Motion." Such interfaces are used not only for computer operations but also for virtual reality (VR) operations and computer game operations, applications in which gesture input is effective.

In general, a dedicated device must be prepared for gesture operations. Therefore, when several persons try to participate in a game at the same time, it is necessary to prepare as many devices as there are people. However, the dedicated device has few uses except the game and is inconvenient to carry in daily life. On the other hand, omnidirectional cameras have been developed and have become inexpensive and popular. Such a camera makes it possible to easily capture a 360-degree area in a single image. A gesture input system using the omnidirectional camera was proposed [2] by one of the authors. This system can recognize a gesture of circular movement by using both hands along the surrounding


of the omnidirectional camera; therefore, we can easily rotate an object by the intuitive gesture. A gesture-based human-machine interface system using a generic device is more convenient and scalable than one using a dedicated device.

A practical gesture-based human-machine interface must be able to recognize a series of successive gestures. However, when several successive gestures are input, it is difficult to recognize each gesture, because, in general, there is no way to delimit the individual gestures. This is also the reason why it is difficult to recognize a sign language. Although it is almost impossible to re-design the gestures of a sign language, in an application such as VR operations and computer game operations, we can design a new set of appropriate gestures. It is necessary to design a set of gestures dedicated to an application.

In this paper, we propose a gesture-based human-machine interface system by using a generic device, an omnidirectional camera. We can input simple operations for computer games by gestures, and the system can recognize a series of successive gestures. In addition, the system can simultaneously recognize gestures of several persons who stand along the surrounding of an omnidirectional camera.

2 Related Works

There are many studies on gesture-based human-machine interfaces using dedicated devices, as follows. The reference [3] proposes a system using a glove-type wearable device that is equipped with sensors for measuring the angles of the fingers and the wrist, with sensors for detecting the opening or closing of a hand, and with an acceleration sensor. The reference [4] proposes a system that recognizes finger characters, a part of a sign language, by using a neural network that analyzes data recorded by myopotential sensors attached to a forearm.

There are studies that use the commercially available wearable device Myo [5–7]. The reference [5] analyzes myopotential patterns of the Brazilian sign language by using SVM (Support Vector Machine). A motion that is significantly different from the others can be effectively recognized; however, it is difficult to recognize small motions such as finger motions. The reference [6] studies the recognition of a sign language by analyzing data from a myopotential sensor, an acceleration sensor, and a magnetic sensor by DTW (Dynamic Time Warping). The reference [7] studies the recognition of the American sign language by analyzing data from a myopotential sensor, a gyro sensor, an acceleration sensor, and a magnetic sensor, and it shows that the k-nearest-neighbor method using DTW is effective.

"Leap Motion" is a device that accurately recognizes hand gestures [8]. The motions of hands and fingers above the device can be accurately recognized by using an infrared sensor. This device is very effective when hands and fingers are near


on the device. However, if the gestures of several persons must be recognized simultaneously, the device cannot be used.

"Kinect" is a motion-capture device [9]. It detects the skeleton of a person and its movement by adding distance information obtained from an infrared camera to the data from an RGB camera. This device is also very effective when only the motion of a skeleton is needed. However, if a gesture including fingers must be recognized, the device cannot be used.

Many systems and methods using no dedicated device have also been studied. As studies on gesture input systems, the references [10,11] have been proposed. These studies assume the use of RGB cameras. The method of the reference [10] detects a hand by extracting a skin color area from the input image and judging whether the area has the characteristic shape of a hand and fingers. This method assumes that a hand is near the camera and that the largest skin color area is a hand. If a camera must simultaneously recognize the gestures of several persons, this method cannot be used, because the largest skin color areas are not always hands but faces. The reference [11] proposes a method for tracking hand areas by using the mean-shift method, which detects the maximal value of a probability distribution function. It is difficult to detect fingers by this method; therefore, it cannot be used for recognizing a gesture using fingers.

The reference [12] proposes a system to recognize gestures from still images. It is difficult to track the movement of a hand, because it is necessary to cut out still images from a video in real time. The reference [13] proposes a system that simultaneously recognizes gestures of several persons by using an omnidirectional visual sensor. This system focuses on the recognition of the trajectories of motions, and input images are converted into low-resolution images. Therefore, it is difficult to recognize the motions of fingers; if there are many gestures using fingers, it cannot be used.

3 Gesture-Based Human-Machine Interface System by Using Omnidirectional Camera

In this section, we propose a gesture-based human-machine interface system by using a generic device, an omnidirectional camera. The system can simultaneously recognize gestures of several persons (players) who stand along the surrounding of an omnidirectional camera. In this paper, we used the Theta S of RICOH as the omnidirectional camera. The input of the system is RGB images that are made from the spherical images of two fish-eye lenses. First, the system detects the hands and the face of each player, then it recognizes gestures of the hands. We describe the method in detail in Sect. 3.1.

Gesture Recognition Method from Image

The method to recognize a gesture proceeds the following procedures.

Gesture-Based Human-Machine Interface System

1. 2. 3. 4. 5.

59

Background difference Skin color detection Facial recognition Fingertip detection Motion tracking

We explain the above procedures using an example. The background image is shown in Fig. 1 and the input image is shown in Fig. 2.

Fig. 1. Background image

Fig. 2. Input image

Background difference is one of the methods used for detection and tracking of moving objects from video. When the background image is Ib (x, y) and an image from video is Im (x, y), the difference image Id (x, y) is given by the following expression (1). (1) Id (x, y) = |Ib (x, y) − Im (x, y)| We can extract moving objects from Id (x, y) and Im (x, y). An example of the background difference of Fig. 1 and Fig. 2 is shown in Fig. 3.

Fig. 3. Example of background difference


Next, we describe the skin color detection procedure. Different people have different skin colors, and a skin color greatly depends on the brightness of the environment. Therefore, when a skin color is detected using the RGB color system, many colors other than the skin color are extracted. To avoid this, the HSV, YIQ or YCbCr color system is used for skin color detection. The HSV color system is composed of the three elements H (hue), S (saturation) and V (luminosity), and it is regarded as a model comparatively close to human color perception. The conversion from the RGB color system to the HSV color system is given in Eq. (2). In this paper, we use the HSV color system.

$V = \max(R, G, B)$
$S = \dfrac{\max(R, G, B) - \min(R, G, B)}{V}$
$H = \cos^{-1}\left( \dfrac{(G - B) + (G - R)}{2\sqrt{(G - B)^2 + (G - R)(B - R)}} \right)$  (2)

Lightness, V, represents the brightness of a color, and the highest value among the RGB elements is used as its value. Saturation, S, is defined as the ratio of the maximum value of the RGB elements minus the minimum value to the lightness. The hue, H, indicates the type of color and takes its value in the range of 0° to 360°. Since the image processing in this paper uses 8-bit maps, the hue is halved to the range 0° to 180°. The threshold values set in this paper are limited to the specific indoor environment: (H, S, V) is in the range from the lower bound (97, 30, 13) to the upper bound (125, 255, 255). The values depend on the environment. A skin color region detected from Fig. 2 by this procedure is shown in Fig. 4.

Fig. 4. Skin detection
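A minimal sketch of this thresholding with OpenCV is shown below; note that OpenCV's 8-bit BGR-to-HSV conversion already stores H in the halved range of 0 to 180, and the bounds are the environment-specific values quoted above.

```python
import cv2
import numpy as np

frame = cv2.imread("input.png")  # placeholder input frame

# OpenCV's 8-bit HSV conversion stores H in [0, 180], matching the text.
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

# Environment-specific (H, S, V) bounds quoted above; they would need
# re-tuning in a different environment.
lower = np.array([97, 30, 13])
upper = np.array([125, 255, 255])

skin = cv2.inRange(hsv, lower, upper)  # binary skin-color mask
cv2.imwrite("skin.png", skin)
```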


Candidate regions are extracted by combining the background difference procedure with the skin color detection procedure (Fig. 5 and Fig. 6).

Fig. 5. Binarized regions extracted by background difference procedure and skin color detection procedure

Fig. 6. Regions extracted by background difference procedure and skin color detection procedure

Next, we explain the facial detection procedure, which uses the Haar-like feature method [14]. There is also a method using the LBP feature, which is based on the distribution of luminance. The LBP feature can be computed faster than the Haar-like feature. However, we use the Haar-like feature method, because accurate face recognition is necessary to avoid unstable recognition. The player who has been stationary for a fixed time is set as the first player, and the other players are recognized sequentially. The area to be processed is limited by estimating the possible range of the hands, based on the basic human body ratio, from the detected face size. Thus, the area candidates of the hands of each player are narrowed down. Next, we explain the fingertip detection procedure. The hands are extracted by the method in [10] from Fig. 6. Regions are determined by labeling a binary image such as Fig. 5 (Fig. 7), and the hand regions are extracted by using the method in [10] (Fig. 8).

Fig. 7. Binarized labeled region

Fig. 8. Labeled region
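As a hedged sketch of the Haar-like face detection step, OpenCV's pre-trained frontal-face cascade can be used as follows; the exact detector and parameters used in the paper are not specified, so the values here are assumptions.

```python
import cv2

# OpenCV ships pre-trained Haar cascades; scaleFactor and minNeighbors
# below are common defaults, not values taken from the paper.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

frame = cv2.imread("input.png")  # placeholder input frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Each detected face gives a bounding box; the hand search area can then
# be limited around it using the basic human body ratio, as in the text.
for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1,
                                             minNeighbors=5):
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("faces.png", frame)
```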


The boundaries are extracted from Fig. 8 by using the eight-neighborhood method (Fig. 9), and the fingers are detected by piecewise linear approximation (Fig. 10).

Fig. 9. Boundary of hand

Fig. 10. Detection of fingers by piecewise linear approximation
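A minimal sketch of boundary extraction and piecewise linear approximation using OpenCV contours; the approximation tolerance is an assumption, since the paper does not give one.

```python
import cv2

# Placeholder binary image of a hand region (e.g., the output of Fig. 8).
mask = cv2.imread("hand_mask.png", cv2.IMREAD_GRAYSCALE)

# Boundary following on the binary image yields the hand contour;
# CHAIN_APPROX_NONE keeps every boundary pixel.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_NONE)
hand = max(contours, key=cv2.contourArea)

# Piecewise linear approximation of the boundary; sharp convex vertices
# of the polygon are fingertip candidates. The 1% tolerance is an assumption.
epsilon = 0.01 * cv2.arcLength(hand, True)
polygon = cv2.approxPolyDP(hand, epsilon, True)
print("polygon vertices:", len(polygon))
```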

We show an example of the detection of the face, the hands, and the fingers of a player in Fig. 11. Next, we explain the motion tracking procedure. The rectangles including a hand in Fig. 11 are tracked. A motion is tracked by repeatedly applying the fingertip detection procedure within the rectangles.

Fig. 11. Detection of face, hands, and fingers

From the above procedures, the system can recognize each player’s gestures.

3.2 Design of Gestures

Since a gesture-based human-machine interface system must be able to recognize a series of successive gestures, it is necessary not only to recognize each gesture but also to easily delimit each gesture. Each gesture must be designed to intuitively recall the corresponding operation. In this section, we design a set of gestures for VR operations and computer game operations. The design of a set of gestures means that each gesture is defined and an operation is allocated to the gesture. We assume that the system deals with the cursor operations; the object operations such as selection, movement, and rotation; and the meta-operations such as start, halt, and reset of the system. A set of gestures must be designed so that, even if a series of successive gestures is input, each gesture is delimited and correctly recognized. We define a gesture so that the number of fingers of each hand represents the intent of an operation, and the motion of the hands while keeping the state of the fingers represents direction, degree, or quantity. We allocate each operation to a gesture in Table 1. Each operation is defined as a pair of a target and an action to the target.

Table 1. Gesture and operation

| Target | Action | Hand | Number of fingers | Operation |
| cursor | select, waiting | one hand | zero | Select an object by the cursor, if the object and the cursor are at the same location |
| object | move | one hand | one finger | Move the selected object in the direction of the motion of the hand |
| camera | move, rotate | one hand | two fingers | If the number of fingers of the left hand is two, move the camera; if the number of fingers of the right hand is two, rotate the camera |
| object | rotate | one hand | three fingers | Rotate objects |
| object | expansion and contraction | one hand | four fingers | Expand an object if the hand moves upward; shrink the object if the hand moves downward |
| cursor | move | one hand | five fingers | Release the object and move only the cursor |
| camera | expansion, contraction | two hands | two fingers | Change the magnification of the viewpoint based on the distance between the hands |
| meta | halt | two hands | five fingers | When the positions of the left and right cursors are reversed, the system halts |
| meta | restart | unknown | unknown | The system is reset if no hands and no fingers are recognized |

The waiting action means the reset of the operations. The transition from one gesture to another gesture must go through the waiting action. A gesture is accepted as a valid gesture if it is the majority of the gestures recognized by the system in the preceding 30 frames, as sketched below.
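A minimal sketch of this 30-frame majority filter, assuming one recognized gesture label per frame; the function and variable names are illustrative only.

```python
from collections import Counter, deque

WINDOW = 30  # number of preceding frames considered, as in the text
recent = deque(maxlen=WINDOW)

def accept(frame_gesture):
    """Feed one per-frame recognition result; return the valid gesture,
    or None while no gesture holds a majority of the window."""
    recent.append(frame_gesture)
    if len(recent) < WINDOW:
        return None
    gesture, count = Counter(recent).most_common(1)[0]
    return gesture if count > WINDOW // 2 else None
```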

4 Performance Evaluation

In this section, we evaluate the performance of the proposed system. First, we investigate the recognition ratio of the gestures by the system. Since the gestures are distinguished by the number of fingers, we examine the ratio at which the number of fingers is correctly recognized. Figure 12 shows the recognition ratio in one frame in the case of one player.

Fig. 12. Recognition ratio in the case of one player

Fig. 13. Recognition ratio in the case of three players

The average recognition ratio is 60%. Since the system accepts a gesture as valid when it is the majority of the gestures recognized in the preceding 30 frames, this recognition ratio is feasible. The recognition ratios are more than 70% when the number of fingers is 0, 1, or 5; however, they are less than 50% when the number of fingers is 3 or 4. Since the system makes one image from two images taken by two fish-eye lenses, the image is distorted. In addition, when the number of fingers is 3 or 4, the distances among the fingers tend to be small. Therefore, the slender fingers easily get buried in the background, and it is difficult to extract the fingers correctly from an image. Figure 13 shows the recognition ratio in one frame in the case of three players. The average recognition ratio is 56.6%, about 3% less than that of one player. When the number of players increases, the distances from the omnidirectional camera to the players increase a little, because the players, who stand around the omnidirectional camera, keep their distance from each other. Consequently, the fingers get buried in the background more easily than in the case of one player, and it is difficult to extract the fingers correctly from an image.


The recognizable distance of the gestures is in the range from 0.5 m to 1.5 m. The smaller the distance between a player and the camera, the higher the recognition ratio and the faster the recognition speed. The average recognition speed of the system is about 10 fps for one player and 6 fps for three players, because the system we developed this time is a prototype on a low-spec laptop PC. This speed is too slow for use in an actual computer game. It is necessary to develop a system that is fast enough to use in an actual computer game.

5 Conclusion

In this paper, we proposed a gesture-based human-machine interface system using a generic device, an omnidirectional camera. The prototype system that we developed can recognize a series of successive gestures of several persons who stand around an omnidirectional camera, and the persons can execute the VR operations if the gestures are input slowly. Furthermore, we evaluated the recognition ratio of the system. The average recognition ratio is 60% in the case of one person and 56.6% in the case of three persons. Since the system accepts a gesture as valid when it is the majority of the gestures recognized in the preceding 30 frames, this recognition ratio is feasible. The improvement of the recognition ratio by deep learning technology and an increase in the recognition speed sufficient for use in an actual computer game remain as future work.

References

1. Bolt, R.: Put-that-there: voice and gesture at the graphics interface. In: Proceedings of SIGGRAPH, pp. 262–270 (1980)
2. Irie, D., Kamada, Y., Tuno, Y., Yamasaki, T., Takano, A., Miwa, H.: Kinephy (2016). http://ist.ksc.kwansei.ac.jp/miwa/miwaLab/?page_id=991
3. Voona, A.K., Chouhan, T., Panse, A., Sameer, S.M.: Smart glove with gesture recognition ability for the hearing and speech impaired. In: Proceedings of the IEEE Global Humanitarian Technology Conference, pp. 101–105 (2014)
4. Koike, Y., Hirayama, R., Yoneyama, K.: Finger character recognition method from surface myoelectric signals. In: Proceedings of the 73rd Japan Congress of the Information Processing Society, pp. 41–42 (2011)
5. Figueiredo, L.S., Abreu, J.G., Teixeira, J.M., Teichrieb, V.: Evaluating sign language recognition using the Myo Armband. In: Proceedings of the 2016 XVIII Symposium on Virtual and Augmented Reality (SVR), Gramado, Brazil, June 21–24 (2016)
6. Banerjee, A., Paudyal, P., Gupta, S.K.S.: SCEPTRE: a pervasive, non-invasive, and programmable gesture recognition technology. In: Proceedings of the 21st International Conference on Intelligent User Interfaces, Sonoma, California, USA, March 7–10, pp. 282–293 (2016)


7. Taylor, J.: Real-time translation of American Sign Language using wearable technology. Honors Theses 928 (2016). http://scholarship.richmond.edu/honorstheses/928/
8. Leap Motion Inc.: Leap Motion. https://www.leapmotion.com/
9. Microsoft Corporation: Kinect for Windows. https://developer.microsoft.com/en-us/windows/kinect
10. Kathura, P., Yoshitaka, A.: Hand gesture recognition by using logical heuristics. Inf. Process. Soc. Jpn. Res. Rep. 2012-HCI-147(25), 1–7 (2012)
11. Sinomiya, Y., Okada, H., Hosino, H.: Development of a hand detection system for hand gestures. In: Proceedings of the 30th Symposium on Fuzzy Systems, Kochi, vol. 30, pp. 730–733, September 2014
12. Hunayama, M., Hirayama, R.: Finger character recognition from still images by neural networks. In: Proceedings of the 72nd Japan Congress of the Information Processing Society, Tokyo, Japan, March 8–12 (2010)
13. Nisimura, T., Mukai, T., Oka, R.: Spotting recognition of multi-person gestures from a single moving image using low-resolution features. J. Inst. Electr. Inf. Commun. Eng. D J80-D2(6), 1563–1570 (1997)
14. Viola, P., Jones, M.J.: Rapid object detection using a boosted cascade of simple features. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Kauai, HI, USA, December 8–14 (2001)

A Secure Group Communication (SGC) Protocol for a P2P Group of Peers Using Blockchain

Rui Iizumi¹, Takumi Saito¹, Shigenari Nakamura², and Makoto Takizawa¹

¹ Hosei University, Tokyo, Japan
{rui.iizumi.8k,takumi.saito.3j}@stu.hosei.ac.jp, [email protected]
² Tokyo Metropolitan Industrial Technology Research Institute, Tokyo, Japan
[email protected]

Abstract. In distributed applications, a group of multiple peer processes (peers) cooperate with one another by exchanging messages in underlying networks. Here, every message sent by a member peer has to be causally delivered to every member peer. In addition to being causally delivered, each message has to be securely exchanged among peers in a group. That is, each message is required to be sent by only member peers and received by only and every member peer in a group. In one approach to realizing secure group communication, only and all the member peers share a group key and exchange messages encrypted with the group key. A group key is obtained among member peers by using the PKI (Public Key Infrastructure). However, a CA (Certification Authority) has to provide each peer with the public key of another peer. The blockchain is now used to securely share data among peers by fully replicating append-only data in a distributed manner, especially in a scalable system. In this paper, we propose an SGCB (Secure Group Communication using Blockchain) protocol for member peers in a group to share public keys and to create and share a group key by using the blockchain in a distributed manner without using a CA.

Keywords: Secure Group Communication · Blockchain · PKI · P2P model · SGCB protocol

1 Introduction

In distributed applications, a group of multiple processes cooperate with one another by exchanging messages in underlying networks. Cloud computing systems [8] are widely used in various applications, where every server and client is managed in a centralized manner. On the other hand, distributed applications like Skype [5] are now realized in a distributed model, i.e. the peer-to-peer (P2P) model [16], to increase scalability, performance, reliability, and availability. A P2P model is composed of peer processes (peers) which are autonomous


and cooperate with one another in a distributed manner, i.e. there is no centralized coordinator. In a group of multiple peers of the P2P model, messages are required to be causally delivered to every member peer [14]. A message m1 causally precedes a message m2 iff (if and only if) the transmission event of the message m1 happened before that of the message m2 [14]. In order to causally deliver messages, logical clocks like the linear clock [14] and the vector clock [15] and physical clocks [18] are used. In addition to being causally delivered, messages have to be authentically and secretly [9] delivered to only and every peer in a group in the presence of malicious peers. In order to realize the authenticity and secrecy of group communication, all and only the member peers share a group key and exchange messages encrypted with the group key with one another in a group. The blockchain [17] is a peer-to-peer (P2P) model for multiple peers to securely share an append-only database named a ledger by fully replicating the database on every peer. A secure one-to-one communication protocol between a pair of peers using the blockchain is proposed in [13]. Here, the public key of each peer is stored in the blockchain and shared by every peer. In the traditional PKI (Public Key Infrastructure) [7], the CA (Certification Authority) manages the public keys of every process. Every process has to communicate with the CA to obtain the public key of another process. In scalable systems, centralized authorities like the CA easily become a performance bottleneck and a single point of failure. In this paper, we take a distributed P2P model by taking advantage of the blockchain. The public key of each peer and the group key of each group of peers are fully replicated to all the member peers of the group and cannot be changed in the blockchain once the keys are validated. In this paper, we newly propose a secure group communication (SGCB) protocol so that a group of multiple peers can share a group key with one another using the blockchain. The SGCB protocol is composed of the BR (Blockchain registration) and SCV (smart-contract-based verification) protocols. Each peer first generates a pair of public and secret keys. Then, each peer registers the public key with its identifier in the blockchain by the BR protocol. Every peer can share its public key with the other peers through the blockchain. By using the public keys and private keys of member peers, only and every member peer obtains a group key in the blockchain by the SCV protocol. We implement the SGCB protocol on the private blockchain of Ethereum [1]. In Sect. 2, we present the system model, i.e. PKI and blockchain. In Sect. 3, we propose the SGCB protocol. In Sect. 4, we implement the SGCB protocol.

2 System Model

2.1 Group Communication

A group G is a collection of member peer processes (peers) p1 , . . ., pn (n ≥ 1), which are cooperating with one another by exchanging messages in an underlying network N [Fig. 1]. We assume the underlying network N supports every pair of member peers with a reliable one-to-one communication like TCP [6]. That


is, messages sent by a peer can be delivered to a destination peer in the sending order with neither loss nor duplication of messages. In a group G, messages sent by each member peer have to be causally delivered to every member peer [14]. A message m1 causally precedes a message m2 iff (if and only if) the transmission event of the message m1 happened before that of the message m2 [14]. Messages are causally delivered to peers by using logical clocks like the linear clock [14] and the vector clock [15], or the hybrid clock [18], which uses both physical and logical clocks. In this paper, we assume every message is causally delivered to every member peer in a group by taking advantage of logical clocks.

Fig. 1. Group communication.

2.2 Blockchain

The blockchain [17] is a peer-to-peer (P2P) model [16] which is composed of autonomous peers interconnected in networks. Here, every member peer shares a ledger, which is an append-only database and is fully replicated on every member peer in a distributed manner. A transaction stands for a record, like a trading record, among peers. A ledger is a collection of transactions. Transactions are validated in a distributed manner by a consensus algorithm among the member peers, such as Proof-of-Work (PoW) [17], Proof-of-Stake (PoS) [4], and Proof-of-Importance (PoI) [3]. Then, only validated transactions are appended to the ledger. Each transaction is secured by PKI (Public Key Infrastructure) [7] cryptographic technologies [17] and hash functions like SHA-256 [11] and RIPEMD [10]. Each transaction in the ledger can be viewed by anyone. Tracking the identification of addresses and transactions can be applied to audit systems. In addition, the ledger is distributed and fully replicated to every peer


in a P2P network. That is, there is no centralized server to store and manipulate transactions in the ledger. In order to make every replica of the ledger mutually consistent, tamper-proof mechanisms like PoW and PoS are provided. The blockchain is tolerant of peer faults since mutually consistent full replicas of the ledger are held by such a large number of peers that malicious peers cannot form a majority. All the proper peers can make an agreement on transactions in the presence of untrusted malicious peers [Fig. 2].

Fig. 2. Blockchain.

2.3 Public Key Infrastructure (PKI)

A group G is composed of member peers p1, . . ., pn (n ≥ 1) which securely communicate with one another. That is, the secrecy and authenticity of communication [9] have to be supported for every member peer in the group. If a member peer pi sends a message m in a group, only and every member peer pj has to receive the message m; peers outside the group must not receive the message. In addition, if a member peer pi receives a message m, the message m has to be one sent by a member peer in the group G. The PKI [17] is used to realize the authenticity and secrecy of secure communication. The PKI is an authentication infrastructure which uses a pair of keys for encryption and decryption, differently from the traditional common-key encryption method. There is a pair of a public key PKi and a private key SKi for each peer pi [7]. Not only a member peer but also a non-member peer can know the public key PKj of every other peer pj. The public key PKi of each peer pi is derived from the private key SKi using a one-way hash function. Therefore, it is impossible to obtain the private key SKi from the public key PKi due to computational complexity. In addition, the RSA algorithm [19] is widely used for generating a pair of keys, and the ECDSA (Elliptic Curve Digital Signature Algorithm) [20] is used in the blockchain. The RSA (R.L. Rivest, A. Shamir, L. Adleman) algorithm [19] takes advantage of the prime factorization problem. In the ECDSA algorithm [20], the discrete logarithm problem on the elliptic curve


is used to make it impossible to recover the secret key. The public key infrastructure (PKI) is mainly used for digital signatures and public key cryptography.

• Digital signature: This is a method to prove that a message sent by a member peer pi does not involve "spoofing" or "falsification". A message m encrypted with the private key SKi of a member peer pi can only be decrypted with the public key PKi of the peer pi. In other words, if the message m can be decrypted with the public key PKi of the peer pi, it is certain that the message m was sent by the peer pi.
• Public key cryptography: A pair of peers pi and pj securely exchange messages. The peer pj sends a message m′ obtained by encrypting a message m with the public key PKi of the destination peer pi. On receipt of the message m′ from the peer pj, the peer pi obtains the message m by decrypting the encrypted message m′ with its private key SKi. That is, a message encrypted with the public key PKi of the member peer pi can only be decrypted with the private key SKi.

In the traditional CA (Certification Authority) model [12], there is a centralized controller, the CA, where the public key of every peer is maintained [Fig. 3]. If a peer pi would like to get the public key PKj of another peer pj, the peer pi asks the CA for the public key PKj. In scalable systems, the CA becomes a performance bottleneck and a single point of failure. In this paper, we consider a distributed model where there is no centralized CA. Every peer pi registers its public key PKi in the ledger of the blockchain, i.e. the public key of each peer is replicated on every peer [Fig. 4].
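As a hedged illustration of the digital signature usage above, the following sketch signs and verifies a message with ECDSA using the Python cryptography package; the curve choice and message text are assumptions.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# SK_i / PK_i pair for a peer p_i; secp256k1 is the curve used by many
# blockchains, an assumption here.
private_key = ec.generate_private_key(ec.SECP256K1())
public_key = private_key.public_key()

message = b"message m from peer p_i"
signature = private_key.sign(message, ec.ECDSA(hashes.SHA256()))

# Anyone holding PK_i can verify; a tampered message would raise
# InvalidSignature instead of passing silently.
public_key.verify(signature, message, ec.ECDSA(hashes.SHA256()))
print("signature verified")
```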

Fig. 3. CA model.

Fig. 4. Blockchain.

3 SGCB (Secure Group Communication Using Blockchain) Protocol

3.1 Distributed Protocol

A group G is composed of peers p1, . . ., pn (n ≥ 1) which are interconnected in reliable networks. In this paper, we newly propose an SGCB (Secure Group Communication using Blockchain) protocol for a group G of peers p1, . . ., pn so that only and every member peer obtains a group key. In the traditional CA (Certification Authority) model [12], the public key PKi and private key SKi of each peer pi and the group key GK are managed in a centralized manner. The SGCB protocol is a P2P model which takes advantage of the blockchain, where a group key GK is generated and replicated on every peer in a distributed model without relying on the CA (Certification Authority). The SGCB protocol generates a secure group key GK by strictly adhering to two algorithms: the blockchain registration (BR) algorithm and the smart contract-based verification (SCV) algorithm. After a secure group key GK of a group G is registered in the blockchain using the SGCB protocol, every member peer pi encrypts a message m with the group key GK, and the message m′ obtained by encrypting the message m is sent to every other member peer in the group G. Here, each message is assumed to be reliably delivered to every destination peer by a reliable communication protocol such as TCP [6]. On receipt of the message m′, only a member peer can obtain the message m by decrypting the message m′ with the group key GK.

3.2 Blockchain Registration (BR)

First, each peer pi registers the peer identifier pidi and the public key PKi in the ledger of the blockchain B according to the following BR protocol. A minimal sketch of these steps follows the list.

[BR protocol]

1. A peer pi first decides a private key SKi and then generates a public key PKi from the private key SKi by the ECDSA algorithm [20]. The identifier pidi of the peer pi is derived from the public key PKi using a hash function like SHA-256 [11]. The peer pi generates a transaction Ti = ⟨pidi, PKi, ti⟩ in the blockchain B, where ti is the current time of the peer pi.
2. The peer pi signs the transaction Ti with the private key SKi, i.e. SKi(⟨pidi, PKi, ti⟩), and broadcasts the signed transaction Ti.
3. The miner checks whether the following conditions hold for the transaction Ti:
   a. The contents pidi, PKi, and ti of the peer pi are obtained from the transaction Ti by PKi(SKi(⟨pidi, PKi, ti⟩)).
   b. The identifier pidi has never previously been registered in the blockchain B.
4. If the transaction Ti is taken into the blockchain B and is sufficiently approved by member peers with a consensus algorithm like PoW [17], the transaction Ti in the ledger becomes the certificate of the peer pi. Here, the identifier pidi and public key PKi of each member peer pi, and the associated timestamp ti, are only used to register the transaction Ti.
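A minimal sketch of steps 1 and 2 of the BR protocol, under the assumption that the transaction payload is a simple JSON record; the encoding details are illustrative, not the paper's exact format.

```python
import hashlib
import json
import time
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec

# Step 1: decide SK_i, derive PK_i (ECDSA), and derive pid_i by hashing
# PK_i with SHA-256.
sk = ec.generate_private_key(ec.SECP256K1())
pk_bytes = sk.public_key().public_bytes(
    serialization.Encoding.X962, serialization.PublicFormat.CompressedPoint)
pid = hashlib.sha256(pk_bytes).hexdigest()

# Step 2: build and sign the registration transaction <pid_i, PK_i, t_i>.
# The JSON encoding is an assumption for illustration.
tx = json.dumps({"pid": pid, "pk": pk_bytes.hex(), "t": time.time()}).encode()
signature = sk.sign(tx, ec.ECDSA(hashes.SHA256()))
# tx and signature would now be broadcast for the miner to check.
```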

3.3 Smart Contract-Based Verification (SCV)

We propose the SCV (smart-contract-based verification) protocol to establish a secure group G of peers p1, . . ., pn (n ≥ 1) in the P2P model. The secure certificate of each peer is stored in the ledger and is automatically and securely validated by taking advantage of the smart contract [1]. The smart contract is created on accounts of member peers and stored in the blockchain B. In this paper, the smart contract is written in Solidity [2], as used in Ethereum [1].

[SCV protocol] [Fig. 5]

1. One of the member peers, say pi, presents a group G as a leader peer and creates a transaction T for doing the smart contract-based verification. The transaction T contains the identifiers pid1, . . ., pidn (n ≥ 1) and public keys PK1, . . ., PKn (n ≥ 1) of the member peers p1, . . ., pn of the group G, respectively. In addition, the transaction T is broadcast with the verified contract address which was previously stored in the blockchain B. The data included in the transaction T is also signed by the creator.
2. The miner receives the transaction T and verifies the signature of the transaction T according to the ECDSA algorithm [20]. If it is successfully verified, the smart contract is executed by the miner. At this time, a sufficient transaction fee [Gas] [1] is required to be paid to verify the transaction T.
3. The smart contract behaves as follows.
   a. The smart contract receives the identifiers pid1, . . ., pidn (n ≥ 1) of all the member peers p1, . . ., pn, respectively.
   b. The smart contract associates the public key PK′i included in the transaction T with the identifier pidi of each peer pi.
   c. For each member peer pj, the public key PK′j included in the transaction T is compared with the public key PKj associated with the identifier pidj stored in the blockchain B. In particular, the certificate with the signature stored in the blockchain B is verified by referring to the public key PK′j included in the transaction, i.e. it is checked that PK′j = PKj.
   d. If the certificates are successfully verified to be secure, the smart contract generates a group key GK using the public keys PK1, . . ., PKn (n ≥ 1) collected from the member peers p1, . . ., pn as a seed. The group key GK is created by randomly hashing the public keys PK1, . . ., PKn (n ≥ 1). (A minimal sketch of this step follows the list.)
   e. The smart contract returns the created group key GK to the leader peer pi issuing the transaction T using public key cryptography.
4. The leader peer pi obtains the group key GK from the smart contract. Then, the leader peer pi broadcasts the group key GK to all the member peers p1, . . ., pn by encrypting the group key GK with their public keys PK1, . . ., PKn, i.e. PKj(GK) for each peer pj.

By performing the SGCB protocol, which is composed of the BR and SCV protocols, a secure group G can be established for the member peers p1, . . ., pn. In addition, this proves that no non-member peer impersonates any member peer.
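A hedged sketch of step 3d: one plausible reading of "randomly hashing" the member public keys is to hash them together with fresh randomness, as below; the exact derivation used by the contract is not specified in the paper.

```python
import hashlib
import os

def make_group_key(public_keys):
    """Derive a group key GK from member public keys PK_1..PK_n plus a
    random salt; one possible interpretation of step 3d, not the paper's
    exact construction."""
    digest = hashlib.sha256()
    digest.update(os.urandom(32))   # fresh randomness (assumption)
    for pk in sorted(public_keys):  # sort for order independence
        digest.update(pk)
    return digest.digest()

# Example: gk = make_group_key([pk_bytes_1, pk_bytes_2, pk_bytes_3])
```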


Fig. 5. Secure group communication (SGC) protocol.

4 Implementation

4.1 System Model

We implement the SGC protocol by using an Ethereum private blockchain [1], as shown in Table 1. Here, Mac PCs with macOS, interconnected in a local area network (LAN), are used. The consensus algorithm in Ethereum uses Proof-of-Work (PoW). In this implementation, by using a virtual group, the SGC protocol is implemented on one Mac PC. A group G is composed of four virtual accounts of peers p1, . . ., p4 [Fig. 6].

Table 1. Implementation

| Operating System | macOS Catalina v.10.15.1 |
| Technical Spec | MacBook Air (Early 2015, 1.6 GHz Dual-Core Intel Core i5) |
| Blockchain | Ethereum (private network) |
| Client | Geth v.1.8.27-stable |

Fig. 6. Implementation.

4.2 BR Protocol

The SGCB protocol is composed of the BR (Blockchain registration) and SCV (Smart contract-based verification) protocols. In the BR protocol, the public key PKi of each peer pi is registered as a certificate in the blockchain. Each member peer pi creates a transaction Ti for registering the public key PKi [Fig. 7]. In the blockchain, an electronic signature is made with the private key SKi to prove that the peer pi is the creator of the transaction Ti. In addition, the creator peer pi inserts the digitally signed data into the optional data of the transaction, which is the variable-length field of an Ethereum transaction. The data inserted here is the certificate ⟨pidi, PKi, ti⟩ registered in the blockchain. The certificate data ⟨pidi, PKi, ti⟩ to be inserted into the optional data is fixed-length data hashed in advance with soliditySha3 (32 bytes) [2]. The created transaction is broadcast to the private network. If the transaction is successfully verified by a blockchain miner, the certificate ⟨pidi, PKi, ti⟩ is registered on the private blockchain.

4.3 SCV Protocol

We present how the SCV (smart-contract-based verification) protocol is implemented and used. The smart contract is implemented as a program written primarily in Solidity [2], a high-level language. We create a program by which the certificate on the blockchain is compared with the public key stored in a transaction. The leader peer creates a verification transaction with the contract address and, as arguments of the smart contract, the public key PKi and identifier pidi of each member peer pi (i = 1, . . ., n), and broadcasts it to the private network of Ethereum. The Solidity program is compiled by the solc compiler into an ABI (Application Binary Interface) [2]. The smart contract is executed by the miner. Then, if successful, the group key GK is returned to the leader peer. The result of the successful verification of the group member peers p1, . . ., pn and the created group key GK are recorded in the blockchain and can be referenced at any time. In other words, if the program is executed under the same conditions and returns the same result, the member peers p1, . . ., pn of the group G judge the group G to be secure.
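For illustration, invoking such a verification contract from a client could look like the following web3.py sketch against a local Geth node; the contract address, ABI, and function name are hypothetical placeholders, not taken from the paper.

```python
from web3 import Web3

# Connect to the local Geth node of the private network (default RPC port).
w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))

contract_address = "0x0000000000000000000000000000000000000000"  # placeholder
contract_abi = []  # placeholder; produced by compiling the Solidity source with solc

contract = w3.eth.contract(address=contract_address, abi=contract_abi)

# The leader peer would pass member identifiers and public keys as
# arguments of the (hypothetical) verification function, e.g.:
# group_key = contract.functions.verify(pids, pubkeys).call()
```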


Fig. 7. Registration transaction.

5 Concluding Remarks

In this paper, we newly proposed the SGCB (Secure Group Communication using Blockchain) protocol to realize secure group communication among peers by taking advantage of the blockchain. The SGCB protocol is composed of the BR (Blockchain registration) and SCV (smart contract-based verification) protocols and does not use a CA. First, the public key PKi of each peer pi is registered in the ledger of the blockchain by the BR protocol. Then, the group key GK is created and appended to the ledger by the SCV protocol. The public key of each peer and the group key of a group of peers are stored in the blockchain, i.e. replicated on every peer. Only and all the member peers can securely communicate with one another by using the blockchain. We implemented the SGCB protocol by using an Ethereum private blockchain. As an on-going study, we are now evaluating the SGCB protocol, especially in a scalable environment.

References

1. Ethereum white paper. https://github.com/ethereum/wiki/wiki/White-Paper
2. Solidity, the contract-oriented programming language. https://github.com/ethereum/solidity
3. Symbol from NEM. https://nemtech.github.io/concepts/consensus-algorithm.html
4. Ethereum wiki. https://github.com/ethereum/wiki/wiki/Proof-of-Stake-FAQ. Accessed 2 Aug 2019
5. Baset, S.A., Schulzrinne, H.: An analysis of the Skype peer-to-peer internet telephony protocol. In: Proceedings of IEEE INFOCOM (2006)
6. Comer, D.E.: Internetworking With TCP/IP, vol. 1. Prentice Hall, Upper Saddle River (1991)
7. Conti, M., Kumar, E.S., Lal, C., Ruj, S.: A survey on security and privacy issues of bitcoin. IEEE Commun. Surv. Tutor. 20(4), 3416–3452 (2018)
8. Creeger, M.: Cloud computing: an overview. Queue 7(5), 3–4 (2009)
9. Denning, P.J.: Fault tolerant operating systems. ACM Comput. Surv. (CSUR) 8(4), 359–389 (1976)


10. Dobbertin, H., Bosselaers, A., Preneel, B.: RIPEMD-160: a strengthened version of RIPEMD. In: International Workshop on Fast Software Encryption, pp. 71–82 (1996)
11. Gilbert, H., Handschuh, H.: Security analysis of SHA-256 and sisters. In: International Workshop on Selected Areas in Cryptography, pp. 175–193 (2003)
12. Hallam-Baker, P., Stradling, R.: DNS certification authority authorization (CAA) resource record. Internet Engineering Task Force (IETF) (2013)
13. Khacef, K., Pujolle, G.: Secure peer-to-peer communication based on blockchain. In: Workshops of the International Conference on Advanced Information Networking and Applications (WAINA 2018), pp. 662–672 (2019)
14. Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21(7), 558–565 (1978)
15. Mattern, F.: Virtual time and global states of distributed systems. In: Parallel and Distributed Algorithms, pp. 215–226 (1988)
16. Mudliar, K., Parekh, H., Bhavathankar, P.: A comprehensive integration of national identity with blockchain technology. In: 2018 International Conference on Communication, Information & Computing Technology (ICCICT) (2018)
17. Nakamoto, S.: Bitcoin: A Peer-to-Peer Electronic Cash System (2009). https://bitcoin.org/bitcoin.pdf
18. Nakayama, H., Nakamura, S., Enokido, T., Takizawa, M.: Topic-based causally ordered delivery of event messages in a peer-to-peer (P2P) model of publish/subscribe systems. In: Proceedings of the 7th International Workshop on Heterogeneous Networking Environments and Technologies (W-HETNET 2016), pp. 348–354 (2016)
19. Rivest, R.L., Shamir, A., Adleman, L.M.: A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 21(2), 120–126 (1978). https://doi.org/10.1145/359340.359342
20. Sankar, L.S., Sindhu, M., Sethumadhavan, M.: Survey of consensus protocols on blockchain applications. In: 2017 4th International Conference on Advanced Computing and Communication Systems (ICACCS) (2017)

Method for Detecting Onset Times of Sounds of String Instrument

Kenta Kimoto and Hiroyoshi Miwa

Graduate School of Science and Technology, Kwansei Gakuin University, 2-1 Gakuen, Sanda-shi, Hyogo, Japan
{kmt,miwa}@kwansei.ac.jp

Abstract. The technology of CG (Computer Graphics) is indispensable in the production of animation, especially for playing musical instruments. Generally, in CG animation production of a musical instrument performance, the recording of performance data by a motion capture system and the recording of sound source data are performed separately. Therefore, it is inevitable that there are gaps between the sound source and the video. Since high-quality sound is required for sound source data, electronic musical instruments are not used. Therefore, sound source data is recorded in WAV format, which does not include onset time and frequency information. Consequently, it is necessary to detect the onset times of musical instruments and to stretch the intervals between the onset times in order to synchronize sound source data and video data. There is still no effective method for detecting onset times for the sound source of a stringed instrument such as a violin. In this paper, we focus on a unique property that occurs during performance of a stringed instrument. We propose a method for detecting onset times in the sound source of a stringed instrument based on this property. Furthermore, we evaluate its effectiveness by experiments using real sound sources.

1 Introduction

The technology of CG (Computer Graphics) is indispensable in the production of animation, especially for playing musical instruments. It is extremely difficult for animators to manually make a video image of a musical instrument performance, because a musical instrument performance is a complicated operation and even a slight difference from reality makes viewers feel uncomfortable. Therefore, the technology of CG animation for the performance of some musical instruments, such as the piano, has been researched and developed so far [1]. In the recent animation industry, the number of animation works that have performance scenes of musical instruments is increasing (Fig. 1). This fact suggests that demand for the technology of CG animation will continue to increase. Generally, in CG animation production of a musical instrument performance, the recording of performance data by a motion capture system and the recording of sound source data are performed separately. Therefore, it is inevitable that there are gaps between the sound source and the video.


Fig. 1. The number of animation works related to music [2]

Since high-quality sound is required for sound source data, electronic musical instruments are not used for recording the sound source. Therefore, sound source data cannot be obtained in MIDI format but only in WAV format. Onset times and frequency information are not explicitly recorded in WAV format. Consequently, it is necessary to detect the onset times of musical instruments and to stretch the intervals between the onset times in order to synchronize sound source data and video data. Although a method of detecting the onset times for the sound source of a piano has been developed [1], there is still no effective method for detecting the onset times for the sound source of a stringed instrument such as a violin. Compared to instruments such as a piano, a violin does not have a fixed intensity envelope, and its frequency range and fluctuations are large; therefore, it is difficult to detect onset times by naively applying a conventional method. In this paper, we focus on a unique property that occurs during performance of a stringed instrument: a momentary noise that occurs when a bow is applied to the strings. We propose a method for detecting onset times in the sound source of a stringed instrument based on this property. Furthermore, we evaluate its effectiveness by experiments using real sound sources.

2 Related Works

We describe the related works in this section. There is a research area on automatic transcription of a music score from WAV format data (e.g., [3–6]). Since WAV format data stores values obtained by synthesizing all frequencies at each time, it is impossible to directly detect individual sounds from music with many superimposed sounds recorded as WAV format data. Therefore, the spectrogram, the representation of the spectrum of frequencies of a signal as it varies with time, is used for automatic transcription of


a music score. A spectrogram of a music piece is obtained from WAV format data by frequency decomposition using the short-time Fourier transform. In general, the onset time of a sound is estimated as the peak time of the intensity of the frequency corresponding to each sound. However, since harmonics, reverb, and noise are included in real sound data, the accuracy of detecting onset times by this method is low for actual sound data. There is a method for automatic transcription from WAV format data using deep learning [7]; however, this method also does not sufficiently solve the problem of harmonics, reverb, and noise, and its accuracy is not so high yet. Consequently, no generally effective method for automatic transcription of a music score, including a method determining onset times, is yet known. As for the study on synchronization of sound and motion, the reference [8] proposes a method using feature points of music and movement. However, the feature points are not always the onset times. Therefore, it is difficult to reduce the gap between the onset time of a sound and the starting time of the motion creating the sound. To synchronize sound source and motion, it is necessary to change the speed of the sound and the motion. As for the study on variable speed of sound, there are some methods based on time-stretch processing [9–11]. These methods change only the playback time without changing the pitch of sounds. Time intervals between successive onset times must be stretched or compressed so that the onset time of a sound is equal to the starting time of the motion creating the sound, and the pitch of sounds must not be changed. Therefore, the above methods can be applied to synchronize sound source and motion. As for the studies on recording the performance motion, there is a study on recording three-dimensional hand movement by a motion capture system [12]. It is difficult to capture the motion accurately, because the markers attached to the fingers cause misrecognition due to the short distance to other markers. The reference [12] proposes a method to reduce the markers by attaching markers only to the joints of the fingers. However, it is not sufficient for capturing a musical instrument performance, because the fingers move very rapidly. The reference [13] proposes a method of capturing accurate motion by correcting the captured original motion of a piano performance.

3 Method for Detecting Onset Times of Sounds from Sound Source of String Instrument

In this section, we propose a method for detecting onset times of sounds from the sound source of a stringed instrument. We found, for the first time, that a momentary noise occurs when a bow hits the strings or when the direction in which the bow moves changes. This is referred to as a bow-string noise in this paper.


We describe the bow-string noise in Sect. 3.1 and the proposed method using the bow-string noise in Sect. 3.2. In the rest of the paper, we assume that we have a musical score for a musical instrument and that the musical performance is coincident with the musical score. The sound source data is recorded in WAV format.

3.1 Bow-String Noise

A sound of a stringed instrument is made by moving a bow in one direction while it touches the strings on the instrument's body. When the touching point of the bow on the strings reaches the end of the bow, the bow is moved in the opposite direction. When the direction changes, the bow gives an opposite vector of force to a vibrating string, so a momentary noise occurs. When a bow hits the strings and a sound is made, the momentary noise also occurs. Bow-string noises frequently occur in a musical performance of a stringed instrument. We show a spectrogram and momentary noises (bow-string noises) in Fig. 2.

Fig. 2. Spectrogram and bow-string noise


It is not difficult to detect bow-string noises from the spectrogram of a musical performance, because a bow-string noise has intensity over all frequencies in a moment. Thus, we can detect the onset time of a sound by detecting a time at which the intensity exceeds a constant over all frequencies (Fig. 3).

Fig. 3. Onset times of sounds and bow-string noise
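A minimal sketch of this broadband-noise test using a short-time Fourier transform; the file name, window size, and both thresholds are assumptions, since the paper fixes only the idea of counting loud frequency bins.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import stft

rate, samples = wavfile.read("performance.wav")  # placeholder WAV file
if samples.ndim > 1:
    samples = samples.mean(axis=1)  # mix stereo down to mono

# Spectrogram via short-time Fourier transform; window size is an assumption.
freqs, times, Z = stft(samples, fs=rate, nperseg=1100)
magnitude = np.abs(Z)

INTENSITY_THRESHOLD = 4 * magnitude.mean()  # per-bin loudness (assumption)
BIN_FRACTION = 0.8                          # share of bins that must be loud

# A bow-string noise shows intensity over (nearly) all frequencies at once.
loud_bins = (magnitude > INTENSITY_THRESHOLD).sum(axis=0)
onsets = times[loud_bins > BIN_FRACTION * len(freqs)]
print(onsets)  # candidate onset times in seconds
```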

In general, not all sounds have bow-string noises. Sounds made during a slur have no bow-string noises (Fig. 4), because these sounds are made only by fingering, without changing the direction of the bow and without the bow hitting the strings.

3.2 Proposed Method

We propose a method for detecting onset times of sounds from the sound source of a stringed instrument.


Fig. 4. No bow-string noise during a slur

A musical score for a stringed instrument and the sound source data recorded in WAV format, which is coincident with the musical score, are given. First, the proposed method determines the spectrogram of the sound source data. In this paper, a sound is decomposed into 550 frequencies, and the sound intensity at each frequency is a value within the range from 0 to 1677215. We show an example of a spectrogram in Fig. 5.


Fig. 5. Spectrogram

Next, the method detects the times when the bow-string noises occur by choosing the times at which the number of frequencies whose intensity exceeds a threshold is itself larger than a threshold (Fig. 6). Thus, the method detects the onset times of the sounds made when a bow hits the strings or when the direction in which the bow moves changes.

Fig. 6. Onset times detected by bow-string noises in Fig. 5

Next, the method detects the onset times of the sounds between the onset times detected by the bow-string noises. These sounds are made during a slur. The method applies the method proposed in the reference [1]. The algorithm of the reference [1] determines the onset times based on dynamic programming so that the occurrence times of frequencies with strong sound intensity match the musical score. Consequently, the method detects the onset times of all sounds contained in the sound source data of a musical instrument performance. We show an example of the onset times detected by the proposed method in Fig. 7.


Fig. 7. Detected onset times by proposed method in Fig. 5

4 Performance Evaluation

In this section, we evaluate the performance of the method proposed in the previous section. We used two pieces: the Violin Sonata No. 5 in F major by Ludwig van Beethoven and the Musette in D major, BWV Anh. 126, by Johann Sebastian Bach. The former piece has no slurs, and the latter has some slurs. The sound source data of musical performances by two players are recorded in WAV format with a sampling frequency of 44.1 kHz, stereo, 16-bit linear PCM. We show the result in Table 1. The parts that do not include slurs are used for the Musette.

Table 1. The number of correctly detected onset times

| Musical piece | All onset times | Player A | Player B |
| Violin sonata, 1st movement | 166 | 156 | 159 |
| Musette | 72 | 71 | 69 |

The result indicates that more than 90% of the onset times are correctly detected by the proposed method.


The sounds that the method failed to detect are the first sound of the musical piece, very short sounds made in succession over a short period of time, and a sound made on the same string after a long extended sound. Parts where no sound exists were not detected as onset times by the method. We show an example in Fig. 8. The white lines denote the correctly detected onset times, and the arrows denote the onset times that the method failed to detect.

Fig. 8. Correctly detected onset times and non-detected onset times in case of no slur

We show the result for the parts including slurs in Table 2.

Table 2. The number of correctly detected onset times for parts including slurs

| Musical piece | All onset times | Player A | Player B |
| Musette | 18 | 17 | 17 |

The result indicates that more than 90% of the onset times are correctly detected by the proposed method. We show an example in Fig. 9. A slur begins at the violet line, and the yellow lines denote the correctly detected onset times. The blue line denotes a mistakenly detected time; that is, no sound actually occurs at that time, but the method mistakenly outputs the time as an onset time.


Fig. 9. Correctly detected onset times and non-detected onset times for part of slurs

5 Conclusion

In this paper, we proposed a method for detecting onset times in the sound source of a stringed instrument. The production of CG animation for playing musical instruments needs a method of synchronizing video data and sound source data, since the recording of musical instrument performance data by a motion capture system and the recording of sound source data are performed separately. For the synchronization, it is necessary to detect onset times of sounds from sound source data in WAV format. A method of detecting the onset times for sounds of a piano has been developed; however, there was no method for detecting the onset times for sounds of a stringed instrument such as a violin. Our proposed method gives a solution to this problem. First, we found, for the first time, that a momentary noise, a bow-string noise, occurs when a bow hits the strings or when the direction in which the bow moves changes. A bow-string noise denotes the onset time of a sound. In addition, we proposed a method detecting all onset times using a combination of detecting the occurrence times of bow-string noises and our previous method for detecting onset times during a slur. The performance evaluation showed that the proposed method can detect more than 90% of onset times correctly. For future work, it is necessary to improve the correct ratio for detecting onset times.


References

1. Takano, A., Hirata, J., Miwa, H.: Method of generating computer graphics animation synchronizing motion and sound of multiple musical instruments. In: Advances in Intelligent Networking and Collaborative Systems, pp. 124–133. Springer (2018)
2. http://uzurainfo.han-be.com/
3. Ochiai, K., Kameoka, H., Sagayama, S.: Explicit beat structure modeling for non-negative matrix factorization-based multipitch analysis. In: Proceedings of ICASSP, March 2012
4. Patel, J.K., Gopi, E.S.: Musical notes identification using digital signal processing. Procedia Comput. Sci. 57, 876–884 (2015)
5. Bello, J.P., Daudet, L., Abdallah, S., Duxbury, C., Davies, M., Sandler, M.B.: A tutorial on onset detection in music signals. IEEE Trans. Speech Audio Process. 13(5), 1035–1047 (2005)
6. Marolt, M., Kavcic, A., Privosnik, M., Divjak, S.: On detecting note onsets in piano music. In: Proceedings of the 11th IEEE Mediterranean Electrotechnical Conference, 7–9 May, Cairo, Egypt (2002)
7. Thickstun, J., Harchaoui, Z., Kakade, S.: Learning features of music from scratch. In: Proceedings of the International Conference on Learning Representations (ICLR), 24–26 April, Toulon, France, pp. 2–7 (2017)
8. Lee, H.C., Lee, I.K.: Automatic synchronization of background music and motion in computer animation. Comput. Graph. Forum 24(3), 353–361 (2005)
9. Arfib, D., Verfaille, V.: Driving pitch-shifting and time-scaling algorithms with adaptive and gestural techniques. In: Proceedings of the 6th International Conference on Digital Audio Effects (DAFx 2003), 8–11 September, London, UK (2003)
10. Kawai, T., Kitaoka, N., Takeda, K.: Acoustic model training using feature vectors generated by manipulating speech parameters of real speakers. In: Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2012), 3–6 December, Hollywood, CA, USA (2012)
11. Mousa, A.: Voice conversion using pitch shifting algorithm by time stretching with PSOLA and resampling. J. Electr. Eng. 61(1), 57–61 (2010)
12. Miyata, N., Kouchi, M., Kurihara, T., Mochimaru, M.: Modeling of human hand link structure from optical motion capture data. In: Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 28 September–2 October, Sendai, Japan (2004)
13. Kugimoto, N., Miwa, H., et al.: CG animation for piano performance. In: Proceedings of ACM SIGGRAPH 2009, 3–7 August, New Orleans, Louisiana, USA (2009)

PageRank for Billion-Scale Networks in RDBMS

Aly Ahmed and Alex Thomo

University of Victoria, Victoria, BC, Canada
{alyahmed,thomo}@uvic.ca

Abstract. Data processing for Big Data plays a vital role for decision-makers in organizations and government, enhances the user experience, and provides quality results in prediction analysis. However, many modern data processing solutions, such as Hadoop and Spark, require a significant investment in hardware and maintenance costs, often neglecting the well-established and widely used relational database management systems (RDBMS's). PageRank is vital in Google Search and social networks to determine how to sort search results and how influential a person is in a social group. PageRank is an iterative algorithm, which imposes challenges when implementing it over large graphs, which are becoming the norm with the current volume of data processed every day from social networks, IoT, and web content. In this paper we study computing PageRank using an RDBMS for very large graphs on a consumer-grade server and compare the results to a dedicated graph database.

Keywords: PageRank · One billion graph · RDBMS · Graph database · Big data · Matrix partitioning

1 Introduction

The amount of data collected from different sources has reached unprecedented levels [10]; the vast majority of the world's data has been created in the last few years. To put this in numbers, data is now measured in zettabytes, $10^{21}$ bytes, and petabytes, $10^{15}$ bytes. For example, Walmart is estimated to create 2.5 petabytes of consumer data every hour [13], Facebook processes tens of billions of likes and messages every day, Google receives 1.2 trillion search requests every year, and Internet of Things (IoT) data is expected to exceed 1.6 zettabytes by 2020 [11]. With the amount of data produced daily, one of the main challenges facing big data is filtering out data, separating the wheat from the chaff. As search results tend to be in millions of pages, ranking search results becomes very crucial; however, ranking pages tends to be one of the most difficult problems, as the search engine is required to present a very small subset of results and order them by relevance. A variety of ranking features, such as page content or the hyperlink structure of the web, are used by Internet search engines to come up


with a good ranking. Many algorithms have been proposed to sort query results and return the most relevant pages first. Among them are the PageRank [15], HillTop [4], and Hypertext Induced Topic Selection (HITS) [6] algorithms. PageRank [5,15], developed by the Google founders, is based on the hyperlink structure and on the assumption that highly ranked pages usually contain links to useful pages, therefore giving more weight to pages that have more inbound links from highly weighted pages. PageRank is an iterative algorithm and is executed on the whole graph, which, in Google's case, is very large. Algorithms such as PageRank for big data sets require extensive hardware setups and, in many cases, distributed computing, such as Hadoop [17] and Spark [16], because the whole data set cannot fit in one machine's memory. Building such a setup comes with significant investment and continuous running costs. Nevertheless, existing systems such as Relational Database Management Systems (RDBMS's) are still widely used and will not be deprecated in the future, as the amount of investment in them keeps growing over the years. More specifically, RDBMS's have been around for over half a century [7] and are proven to provide consistent performance, stability, and concurrency control. RDBMS's are currently the backbone of the IT industry and have been evolving over the past few decades for better performance. On the other hand, using dedicated graph databases for graph processing is presumed to provide better performance and scalability over relational databases (c.f. [3]); however, graph databases still have a long way to go to reach the level of maturity of RDBMS's. From this perspective, using an RDBMS to implement graph algorithms seems logical and in fact more efficient. However, computing graph algorithms using SQL queries is challenging and requires novel thinking. As such, there is active research on the use of novel methods to compute graph analytics on RDBMS's (c.f. [1,2,9,14]). These works have shown that RDBMS's often provide higher efficiency than graph databases for specific analytics tasks. This paper presents a PageRank algorithm implementation using an RDBMS with table partitioning and compares it with the implementation provided by a dedicated graph database. The rest of this paper is organized as follows. A brief background review of the PageRank algorithm is given in Sect. 2. Our PageRank implementation using an RDBMS with table partitioning is given in Sect. 3. Section 4 shows the results of the experiments. Section 5 concludes the paper.

2

Preliminaries

We denote by G = (V, E) a graph with V as a set of vertices, and E as the set of edges. For each vertex v there will be a non-negative initial PageRank value P R(v) and for each edge e = (u, v) there is a weight of 1/n assigned to it, where n is the number of outgoing links from u. We can use the following relational tables to store a graph. The TE table contains ∀e = (u, v) ∈ E along with their weights, where u is denoted by fid, and v by tid and w(u, v) by cost. We can construct a unique index on (fid,tid).

PageRank for Billion-Scale Networks in RDBMS

91

The TV table contains ∀u ∈ V in the graph, denoted by id along with its rank P R(u), denoted by pagerank. We can also construct a unique index on id. The PageRank algorithm assigns a weight value to each page in the web or vertex in a graph; the higher the weight of a page or vertex, the more important it is. Web pages are represented as a directed graph where pages are vertices and links are edges. Below is an example of how we calculate PageRank for a small graph.

Fig. 1. Simple directed graph

The graph in Fig. 1 has four vertices representing four web pages. Page 1 has links to each of the other three pages; page 2 has links to 1 and 3 only; page 0 has a link only to 1, and page 3 has links to 2 and 0 only. Let us assume a random user is visiting page 1; this user will have a probability of 1/3 for each link (0, 2, 3) to follow and visit a next page. If the user is visiting page 0, then he will have a probability of 1 to visit page 1 as this is the only link available. If we follow the same logic, we will have a probability of 1/2 for each link on page 2 and page 3. The probability value for each link is the weight for each link, and based on this, we could build an adjacency matrix for the graph as a square matrix M with a number n of columns and rows. The PageRank algorithm proceeds in the following steps. • Set an initial PageRank value for each page • Repeat until convergence: compute PageRank using Eq. 1 P R(A) =

n  P R(i) i=1

C(i)

(1)

where P R(A) is the PageRank value of vertex A, P R(i) is the PageRank value of vertex i, and C(i) is the number of outbound links (edges) of vertex i. Vertices i for i ∈ [1, n] are all the vertices of the graph that contain links pointing to A. Usually, there is also a damping factor present in the computation of PageRank values but we ignore it in this paper for simplicity and because all techniques we present can be extended easily to that case.

92

A. Ahmed and A. Thomo

The link probabilities (1/C(i)), as described above, could be represented as a matrix M . For the graph in Fig. 1, the matrix will be as follows. ⎡ ⎤ 0 1/3 0 1/2 ⎢1 0 1/2 0 ⎥ ⎥ M =⎢ ⎣0 1/3 0 1/2⎦ 0 1/3 1/2 0 Regarding the P R(i) values, we can represent them all by a PageRank vector V . Then the computation given by Eq. 1 can be written as M ·V , which captures the computation of PR values for all the vertices of the graph at the same time. We denote by Vt the version of V at iteration t. Then, PageRank is iteratively computed using Eq. 2, by multiplying matrix M and vector Vt and repeating until convergence. (2) Vt+1 = M · Vt where Vt+1 is the new vector holding the newly computed PageRanks for all the vertices. In each iteration, the newly computed PageRank values will get closer to the final PageRank values. We stop when PageRank values do not change much. Observe that the PageRank value of a vertex A is dependent on the value of PageRank of vertices pointing to it. However, we do not know the Pagerank value of inbound vertices till we calculate the ones pointing to them and we will not know the Pagerank values of them till we calculate the PageRank values of vertices pointing to them too and this keeps on. So, to overcome this starting problem, we initially set an estimated PageRank value for each vertex. This can be represented as a vector. V0 = [1/n, 1/n, · · · , 1/n] where n is the number of vertices in the graph.

3

PageRank in RDBMS

Representing the graph in a square matrix, M requires quadratic size. Computing Pagerank in its matrix representation requires the matrix to be fully loaded in memory; however, loading the graph into memory might not be possible for large graphs like the Google web or Facebook. However, since the matrix is very sparse, all the implementations exploit sparseness and do not materialize the matrix as is. Instead only the non-zero entries are stored in the format (i, j, mij ). Using RDBMS is quite efficient in this regard. First, matrix M could be saved as tuples (i, j, mij ) of only connected vertices. Second, when computing PageRank for a vertex A, the edges that need to be considered are only those pointing to vertex A. This is a tiny subset of the matrix. Figure 2 shows the SQL statement used to compute Pagerank using Eq. 2, where T E stores graph edges and T V stores vertices’s Pagerank estimates. If we

PageRank for Billion-Scale Networks in RDBMS

93

run this SQL query, it will produce the result of multiplying the matrix with the vector Vi . The multiplication is very efficient as we only do the calculation for existing edges in the matrix.

Fig. 2. Compute pagerank for one iteration

Figure 3 shows the full SQL statement using the new Merge SQL [8] operation, which is very efficient in saving SQL results. This way, we save the new Pagerank estimate so that it can be used in the next iteration. The query will do a full table scan or index scan based on the table setup. RDBMS will need to load parts of the table into memory to compute Pagerank. This process is acceptable when the loaded parts could be loaded into memory but cumbersome when graph size is hugely larger than available memory, which inevitably will lead to use data swap and, as a result, diminish the performance dramatically. In the following section, we solve the graph size problem by using table partitioning based on partitioning the matrix M and vector Vi into parts that can be loaded into memory.

Fig. 3. Compute pagerank and update vector V

3.1

Table Partitioning

To overcome the matrix size problem, we partitioned both the matrix and the vector into k parts and saved each part in a separate table, T Vi , and T Ei , where

94

A. Ahmed and A. Thomo

i ∈ [1, k]. We divide the matrix into stripes of almost equal size, and we create vectors to have only the vertices that are needed to compute Pagerank for each matrix stripe. Figure 4 shows how the matrix and vector are partitioned. Each matrix stripe will have a full set of inbound edges for a set of vertices and matched with a vector containing all the fid’s that exist in the partitioned matrix. This way, we will be able to compute Pagerank for the set of vertices of interest. A similar matrix partitioning scheme is also described in the Map-Reduce chapter of [12].

Fig. 4. Matrix and vector partitioning into k stripes.

The main goal is to create as many stripes as needed so that the portions of the matrix in one partition can fit conveniently into memory. We used the SQL statements in Fig. 5 to build the partitioned tables based on matrix partitions. Each T Ei table will have a subset of vertices along with all inbound edges, and each table T Vi will have all fid’s that exist in T Ei .

Fig. 5. Creating partition tables T Vi and T Ei for i ∈ [1, k].

4 4.1

Experimental Results Setup Configurations

We executed the experiments on a consumer-grade server with Intel Core i7-2600 CPU @3.4 GHz 64 bit Processor, 12 G of RAM and running Windows 7 Home Premium, using Java JDK SE 1.8.

PageRank for Billion-Scale Networks in RDBMS

95

As RDBMS’s we used the latest versions of a commercial database (which we anonymously call CD) and an open-source database (which we anonymously call OD). As graph database, we used the latest version of a graph database (which we anonymously call GD). We refrain from using the real names of these databases for obvious reasons. We used four real datasets from Stanford’s Data collection and a one-billionedge graph from The Laboratory for Web Algorithmics. By default, we used three table partitions in the case of table partitioning experiments except stated otherwise. All the results shown are based on computing one Pagerank iteration. The real datasets are Web-Google, Pokec, Live-Journal and Orkut (from http:// snap.stanford.edu), and UK 2005 (from http://law.di.unimi.it/webdata). Table 1 shows statistics about the datasets used. Table 1. Graph datasets Data set Web-Google

4.2

Nodes#

Edges#

875,713

5,105,039

Pokec

1,632,803

30,622,564

Live journal

4,847,571

68,993,773

Orkut

3,072,441

117,185,083

UK 2005

39,459,921

936,364,282

IT 2004

41,291,594 1,150,725,436

Results

We observed that in all the datasets we used, OD and CD clearly out-perform GD significantly. Figure 6 shows how GD performs poorly with large datasets, such as Live Journal (LJ) or Orkut. Orkut was the largest data set that GD could manage to process without crashing out. Using table partitioning gives significant enhancement in managing memory load which in turn boosts PageRank processing time especially with large data sets such as LJ, Orkut and the large graph UK-2005. Figure 7 shows big performance differences between the GD processing time and both RDBMS approaches using table scan and table partitioning. The impact of table partitioning starts to appear once the datasets become larger, as shown in the chart. Table partitioning significantly improved over the approach of table scan especially for LJ and Orkut. In our experiments we also wanted to decouple the processing time of computing PageRank from the time to save the results, hence we ran two separate experiments; one with saving the outcome and the other without saving the outcome. Figure 8 compares the results of the experiments. We noticed that CD did a better job than OD in both operations and the time taken for saving data was noticeably shorter. We relate this to the Merge operation which exists in CD

96

A. Ahmed and A. Thomo

Fig. 6. Results of running pageRank using GD, CD, and OD.

Fig. 7. Results of pageRank in RDBMS CD using table scan, table partitioning and GD. We show here only the dataset sizes as opposed to their names. The names are as in Fig. 6.

but does not in OD. The Merge operation showed to have superior performance over regular insert/update operations. In addition to the above, OD performs poorly in computing PageRank using a non-clustered index scan. Figure 9 shows a big jump in time when we used non-clustered index scan in large data sets, in contrast to a clustered index scan or table scan. We relate this to the OD optimizer not being good enough in planning and executing the queries. Also the I/O cost was high which indicates the data retrieval process included high random access. Such random access was reduced significantly when the table was reordered as part of building the clustered index, hence the processing time was also reduced significantly.

PageRank for Billion-Scale Networks in RDBMS

97

Fig. 8. Show the difference between the time taken to only calculate pageRank without saving the results and the time taken to do the same with saving the results.

Fig. 9. OD performs poorly in the case of non-clustered index vs table scan or clustered index.

Figure 10 shows the processing time in the case of using table scan and clustered index scan. Using an index did not help that much in reducing processing time and the results were very comparable to just table scan and differences were not noticeable. We relate this to the fact that the query used to calculate PageRank requires a full table retrieval hence using an index will not make that big of a difference.

98

A. Ahmed and A. Thomo

Fig. 10. Results of using table scan vs index scan in OD and CD

4.3

Experiments on Billion-Scale Networks

Here we show our experiments on two very large datasets, namely UK-2005 and IT-2004, the latter with more than a billion edges. They represent the web network of UK and Italy in 2005 and 2004. The precise number of nodes and edges is given in Table 2. We ran the PageRank algorithm using table partitioning. Both data sets were partitioned into 12 partitions and we sum up all the processing time to compute PageRank for each partition. Figure 11 shows the runtime for each of the datasets. We used CD as it showed superiority over OD in I/O and memory management. GD could not be a part of the experiment as it failed to process any graph bigger than Orkut in our test environment setup. As Fig. 11 shows, even with over one billion graph size, we managed to get a good processing time. More specifically, for IT-2004, we were able to complete the computation of PageRank for all the partitions in about 10 min (600 s). Table 2. Billion-size datasets Data set Nodes# IT 2004

Edges#

41,291,594 1,150,725,436

UK 2005 39,459,921

936,364,282

PageRank for Billion-Scale Networks in RDBMS

99

Fig. 11. Results of calculating pageRank on very large data sets, IT 2004: 1.15 billion edge graph and UK 2005:0.93 billion edge

5

Conclusion

We presented the implementation of the PageRank algorithm over RDBMS using different options, such as table-scan, non-clustered-index, clustered-index, and table-partitioning. We showed that RDBMS’s could perform better than GD and could process very big datasets in a consumer-grade server. The experiments showed that the OD optimizer was not good enough for our task. For instance, it was not able to determine that using a non-clustered index is not a good choice for our queries. The OD clustered index behaved better but still there was no improvement compared to simple table-scan without any indexing at all. The CD commercial query optimizer is more intelligent than its open-source counterpart. We observed that manually partitioning tables gives a significant improvement in the execution time of our queries. This tells us that RDBMS optimizers of today, even after many decades of development, still can be improved further in order to handle heavy analytical queries such as those computing PageRank. CD did not consume significant processing time in order to merge the data but OD in large datasets consumed significant processing time, in some cases double the query time. We clearly observed that both RDBMS’s we use, without using any indexing or partitioning still dramatically outperform graph database GD. This comes as a surprise because the latter was designed for handling graphs from the ground up. Therefore we conclude that specialized graph databases still have a lot of ground to cover in order to be good competitors to RDBMS engines for large datasets.

References 1. Agrawal, R., Imielinski, T., Swami, A.: Database mining: a performance perspective. IEEE Trans. Knowl. Data Eng. 5(6), 914–925 (1993)

100

A. Ahmed and A. Thomo

2. Ahmed, A., Thomo, A.: Computing source-to-target shortest paths for complex networks in RDBMS. J. Comput. Syst. Sci. 89, 114–129 (2017) 3. Angles, R., Gutierrez, C.: Survey of graph database models. ACM Comput. Surv. (CSUR) 40(1), 1–39 (2008) 4. Bharat, K., Mihaila, G.A.: When experts agree: using non-affiliated experts to rank popular topics. In: Proceedings of the 10th International Conference on World Wide Web, pp. 597–602 (2001) 5. Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst. 30, 107–117 (1998) 6. Chakrabarti, S., Dom, B.E., Kumar, S.R., Raghavan, P., Rajagopalan, S., Tomkins, A., Gibson, D., Kleinberg, J.: Mining the web’s link structure. Computer 32(8), 60–67 (1999) 7. Codd, E.F.: A relational model of data for large shared data banks. In: Broy, M., Denert, E. (eds.) Software Pioneers, pp. 263–294. Springer, Heidelberg (2002) 8. Eisenberg, A., Melton, J., Kulkarni, K., Michels, J.-E., Zemke, F.: SQL: 2003 has been published. ACM SIGMoD Rec. 33(1), 119–126 (2004) 9. Gao, J., Zhou, J., Yu, J.X., Wang, T.: Shortest path computing in relational DBMSs. IEEE Trans. Knowl. Data Eng. 26(4), 997–1011 (2014) 10. Jagadish, H., Gehrke, J., Labrinidis, A., Papakonstantinou, Y., Patel, J.M., Ramakrishnan, R., Shahabi, C.: Big data and its technical challenges. Commun. ACM 57(7), 86–94 (2014) 11. Kelly, R.: Internet of things data to top 1.6 zettabytes by 2020. Campus Technology (2015) 12. Leskovec, J., Rajaraman, A., Ullman, J.D.: Mining of Massive Data Sets. Cambridge University Press, Cambridge (2020) 13. McAfee, A., Brynjolfsson, E., Davenport, T.H., Patil, D., Barton, D.: Big data: the management revolution. Harv. Bus. Rev. 90(10), 60–68 (2012) 14. Ordonez, C., Omiecinski, E.: Efficient disk-based k-means clustering for relational databases. IEEE Trans. Knowl. Data Eng. 16(8), 909–921 (2004) 15. Page, L., Brin, S., Motwani, R., Winograd, T.: The pagerank citation ranking: bringing order to the web. Technical report, Stanford InfoLab (1999) 16. Zaharia, M., Xin, R.S., Wendell, P., Das, T., Armbrust, M., Dave, A., Meng, X., Rosen, J., Venkataraman, S., Franklin, M.J., et al.: Apache spark: a unified engine for big data processing. Commun. ACM 59(11), 56–65 (2016) 17. Zikopoulos, P., Eaton, C., et al.: Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data. McGraw-Hill Osborne Media (2011)

An Algorithm to Select an Energy-Efficient Sever for an Application Process in a Cluster of Servers Kaiya Noguchi1(B) , Takumi Saito1 , Dilawaer Duolikun1 , Tomoya Enokido2 , and Makoto Takizawa1 1

Hosei University, Tokyo, Japan {kaiya.noguchi.9w,takumi.saito.3j}@stu.hosei.ac.jp, [email protected], [email protected] 2 Rissho University, Tokyo, Japan [email protected]

Abstract. We have to decrease electric energy consumption of information systems, especially servers to reduce carbon dioxide emission. In information systems, a client issues application processes to servers in clusters. Application processes have to be performed on servers so that the total energy consumption of servers in the cluster can be reduced. In this paper, we discuss how to select a server to energy-efficiently perform an application process issued by a client. In order to find an energyefficient server, we have to estimate the execution time of application processes and the energy consumption of the server to perform but new application processes issued and current application processes. In this paper, we newly propose an algorithm to estimate the execution time of application processes and the energy consumption of a server by considering not only current active application processes but also possible application processes to be issued after the current time. By using the estimation model, we also propose an MES (Minimum-Energy Server selection) algorithm to select a server to perform an application process. We design and implement an EDS (Eco Distributed System) simulator to evaluate selection algorithms in terms of energy consumption of servers and execution time of each process. Keywords: Energy-efficient cluster · Server selection · Power consumption model · MES algorithm · Energy estimation model simulator

1

· EDS

Introduction

It is critical to decrease the energy consumption of servers in clusters to reduce carbon dioxide emission on the earth. In information systems, a client issues a request to a load balancer of a cluster of servers. The load balancer selects one host server in the cluster and then an application process to handle the request is c Springer Nature Switzerland AG 2021  L. Barolli et al. (Eds.): INCoS 2020, AISC 1263, pp. 101–111, 2021. https://doi.org/10.1007/978-3-030-57796-4_10

102

K. Noguchi et al.

created and performed on the server. In our previous studies [4,8–10], algorithms to select a host server for each application process are proposed so as to reduce energy consumption of servers. Here, we have to estimate the execution time of application processes on servers. The simple power consumption (SPC) and computation (SC) models [5,6] and the multi-level power consumption (MLPC) and computation (MLC) models [9,10] are proposed to give the power consumption [W] of a server and the execution time of application processes on the server. The models are macrolevel ones [5] where the total power consumption of a whole server is considered without being conscious of each hardware component like CPU and memory. By using the power consumption and computation models, the execution time of an application process on a server and the energy consumption of the server can be estimated. Algorithms to select energy-efficient servers in a cluster are so far proposed [4–7,9]. Here, only active application processes being performed are considered to do the estimation. In this paper, we consider not only current active application processes but also possible application processes to be issued on a server after current time. We newly propose an algorithm to obtain the expected execution time of application processes on a server by estimating the number of possible application processes to be issued to the server. Here, the expected number of possible application processes to be issued is derived from the average number of application processes performed before the current time. By using the estimation model, we also propose an MES (Minimum-Energy Server selection) algorithm to select a server to perform a new application process issued by a client where the total energy consumption of the servers can be reduced. Here, a server is selected to perform a new application process, where estimated energy consumption is smallest to perform the new application process. We design an EDS (Eco Distributed System) simulator to estimate the MES algorithm in terms of the total energy consumption of servers and the average execution time of application processes compared with the previous estimation algorithm. The EDS simulator is implemented by SQL on a database. In Sect. 2, we present a system model and the power consumption and computation models. In Sect. 3, we propose an algorithm to estimate the execution time of application processes on a server. In Sect. 4, we propose the MES algorithm to select servers to energy-efficiently perform application processes. In Sect. 5, we present the EDS simulator and show some experiment.

2 2.1

System Model Clusters of Servers

A cluster S is composed of servers s1 , · · · , sm (m ≥ 1) (Fig. 1). A client ci issues a request qi to a load balancer L. The load balancer L selects a host server st in the cluster S and forwards the request qi to the server st . An application process pi is created to handle the request qi and performed on the sever st . On termination of the application process pi , the server st sends a reply ri to

An Algorithm to Select an Energy-Efficient Sever for an Application Process

103

the client ci . Here, the load balancer L selects such host server st that the total energy consumption of the servers s1 , · · · , sm can be reduced and the average execution time of application processes on the servers can be shortened. In this paper, a term process stands for an application process to be performed on a server, which only consumes CPU resources of a server, i.e. computation process [5]. A process pi is active on a server st if and only if (iff) the process pi is performed on the server st . Otherwise, the process pi is idle.

Fig. 1. A cluster and load balancer.

2.2

Power Consumption and Computation Models

A server st is composed of npt (≥1) homogeneous CPUs cpt0 , · · · , cptnpt −1 . Each CPU cptk [1] is composed of nct (≥1) homogeneous cores ctk0 , · · · , ctknct −1 . Each core ctkh supports the same number ctt of threads. A server st supports processes with totally ntt (= npt · nct · ctt ) threads. Each process is at a time performed on one thread. A thread is active iff at at least one process is performed, otherwise idle. In this paper, a term process means an application process which consumes CPU resources on a server. CPt (τ ) is a set of active processes performed on a server st at time τ . Here, the electric power N Et (n) [W] of a server st to concurrently perform n(≥0) processes is given in the MLPC (Multi-level Power Consumption) model [9,10] as follows [8,10]:

104

K. Noguchi et al.

[Power consumption of a server st to perform for n processes]

⎧ ⎪ minEt if n = 0. ⎪ ⎪ ⎪ ⎪ ⎪ ⎨minEt + n · (bEt + cEt + tEt ) if 1 ≤ n ≤ npt . N Et (n) = minEt + npt · bEt + n · (cEt + tEt ) if npt < n ≤ nct · npt . ⎪ ⎪ ⎪ minEt + npt · (bEt + nct · cEt ) + ntt · tEt if nct · npt < n < ntt . ⎪ ⎪ ⎪ ⎩ maxEt if n ≥ ntt .

(1)

The electric power Et (τ ) [W] consumed by a server st at time τ is assumed to be N Et (|CPt (τ )|) in this paper. That is, the electric power consumption of a number n of active processes. A server st is defined to server st depends on the et consume electric energy τ =st Et (τ ) [W · time unit] from time st to et. Let minTti show the minimum execution time [time unit] of a process pi , i.e. only the process pi is performed on a thread of a server st without any other process. Let minTi be a minimum one of minE1i , · · · , minTmi . That is, minTi is minTf i of a server sf which supports the fastest thread. A server sf with the fastest thread is referred to as f astest in a cluster S. In order to give the computation metrics of each process, the virtual computation amount V Ci of a process pi is introduced, which is defined to be minTi . The thread computation rate T CRt of a server st is minTti /V Ci = minTti /minTi (≤1). The computation rate SCRt of a server st is ntt · T CRt where ntt is the total number of thread of the server st . The maximum computation rate maxP CRti of a process pi on a server st is T CRt . Here, the process pi is only performed on the server st without any other process. It is noted, for every pair of processes pi and pj on a server st , maxP CRti = maxP CRtj = T CRt . The process computation rate N P Rti (n)(≤SCRt ) of a process pi on a server st where n processes are currently performed at time τ is defined as follows [3,4,9]: [MCML (Multi-level Consumption with Multiple CPUs) model]  ntt · T CRt /n if n > ntt . N P Rti (n) = T CRt if n ≤ ntt .

(2)

Since N P Rti (n) = N P Rtj (n) for every pair of processes pi and pj on a server st , N P Rt (n) stands for N P Rti (n) for each process pi . The computation rate P Rti (τ ) of each process pi at time τ is assumed to be N P Rt (|CPt (τ )|). If a process pi on a server st starts at time st and terminates at time et,  et τ =st N P Rt (|CPt (τ )|) = minTi . The server computation rate N SRt (n) of a server st to perform n processes is n · N P Rt (n), i.e. ntt · T CRt (= maxSCRt ) for n > ntt and T CRt for n ≤ ntt . A process pi is modeled to be performed on a server st as follows. [Computation model of a process pi ] 1. At time τ a process pi starts, the computation residue Ri of a process pi is V Ci , i.e. Ri = minTi ; 2. At each time τ , Ri = Ri − N P Rt (|CPt (τ )|); 3. If Ri ≤ 0, the process pi terminates at time τ .

An Algorithm to Select an Energy-Efficient Sever for an Application Process

3

105

Estimation Model

In our previous studies [8,10], we estimate at current time τ by what time every current active process in a set CPt (τ ) to terminate on a server st under the assumption that no new process is to be issued after time τ by a client. Here, CPt (τ ) is a set of current active processes of a server st at time τ . Our previous estimation model [8,10] is shown in Algorithm 1. At current time τ , the variable Pt denotes a set CPt (τ ) of current active processes on a server st . The variable n shows the number |CPt (τ )| of current active processes. The variable EEt denotes the energy consumption of a server st , initially EEt = 0. The variable Ri denotes the computation residue of each active process pi (∈ Pt ) on the server st . The time variable tm stands for each time unit. The time variable tm initially denotes the current time τ . The energy consumption EEt of the server st is incremented by the power consumption N Et (n). The computation residue Ri is decremented by the process computation rate N P Rt (n) for each active process pi in the set Pt . If Ri ≤ 0, the process pi terminates and is removed from the set Pt . Then, the time variable tm is incremented by one, i.e. tm = tm + 1. Until the set Pt gets empty, this procedure is iterated. If Pt = φ, the estimation procedure terminates. The execution time ETt is tm − 1. EEt shows the total energy to be consumed by the server st and ETt indicates the execution time of the server st to perform every current active processes. As shown here, no new process is issued to the server st after the current time τ . In order to make the estimation of the energy consumption and execution time of a server st more accurate, we have to take into consideration possible processes to be issued after current time τ . A possible process is a process to be issued after current time τ . Let P be a set of all processes to be issued to a cluster. Let minT be the average minimum execution time of all the processes, i.e.  minT = pi ∈P minTi /|P |. At current time τ , we consider processes performed for δ time units from τ − δ2 to one time units τ − δ1 before the current time τ (δ = δ1 − δ2 + 1 ≤ τ ) (Fig. 2). We assume current active processes start after time τ − δ1 . In this paper, δ1 is the half of the average minimum execution time minT of all the processes in the set P , i.e. δ1 = minT /2. For each time t (τ − δ2 ≤ t ≤ τ − δ1 ), CPt (t) shows a set of active processes performed on a server st . The average number ant of active processes per one time unit from τ −δ1 |CPt (t)|/(δ2 − δ1 + 1). Let nt be the number time τ − δ2 to time τ − δ1 is t=τ −δ 2 |CPt (τ )| of active processes performed on a server st at current time τ . Let ent be the expected number of possible processes to be issued after the current time τ . In this paper, the expected number ent (nt ) of possible processes to be issued for the number nt of current active processes after the current time τ is given as follows;  ent (nt ) =

nt · (1 − (ant − nt )/(δ2 − δ1 + 1)) 0 otherwise.

if (ant − nt )/(δ2 − δ1 + 1) < 1.

(3)

Here, δ1 = minT /2. Figure 3 shows the expected number of possible processes for the number nt of current active processes. If ant ≥ nt , ent (nt ) ≤ nt . If ant < nt , ent (nt ) > nt .

106

K. Noguchi et al.

The total  computation residue of current active processes and possible processes is pj ∈CPt (τ ) Rj + ent (nt ) · minT . The total number of processes to be performed at time τ is expected to be nt + ent (nt ). Hence, the execution time N Tt (nt ) of nt active processes on a server st is defined as follows;  pj ∈CPt (τ ) Rj + ent (nt ) · minT . (4) N ETt (nt ) = (nt + ent (nt )) · N P Rt (nt + ent (nt )) A server st consumes electric power N Et (nt + ent (nt )) [W] for N ETt (nt ) time units. Hence, the energy consumption N EEt (nt ) of a server st to perform active processes and new processes is given as follows; N EEt (nt ) = N ETt (nt ) · N Et (nt + ent (nt )).

(5)

Algorithm 1: Estimation algorithm 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

4

input st = server; Pt = set CPt (τ ) of current active processes on a server st ; τ = current time; output EEt = energy consumption of st ; ETt = execution time of st ; EEt = 0; tm = τ ; while Pt = φ do n = |Pt |; EEt = EEt + N Et (n); for each process pi in Pt do Ri = Ri − N P Rt (n); if Ri ≤ 0 then Pt = Pt − {pi }; tm = tm + 1; ETt = tm − 1;

Server Selection Algorithm

A client issues a request to a load balancer L of a cluster S of servers s1 , · · · , sm (m ≥ 1). The load balancer L selects a server st in the cluster S. Then, a process pi to handle the request is created and performed on the server st as shown in Fig. 1. Let S be a set {s1 , · · · , sm }(m ≥ 1) of servers in a cluster and P be a set {p1 , · · · , pn }(n ≥ 1) of processes to be performed on the servers. In this paper, we propose an MES (Minimum-Energy Server selection) algorithm to select a server to perform a new process by taking advantage of the proposed estimation algorithm. Suppose a process pi is issued to a server st at time τ . At time τ , not only current active processes CPt (τ ) but also the new process pi are performed on the server st . Let nt be the number of active processes on the server st at time

An Algorithm to Select an Energy-Efficient Sever for an Application Process

107

Fig. 2. Active processes performed on a server.

Fig. 3. Expected number of new processes.

τ , i.e. nt = |CPt (τ )|. Here, the total computation residue St of the current  active processes and the new process pi on a server st is pj ∈CPt (τ ) Rj + minTi . Totally (nt + 1) processes are performed on the server st at time τ . As discussed in the preceding section, ent (nt ) possible processes are estimated to be newly performed after time τ . Hence, the execution time T ETt (nt , pi ) of the server

108

K. Noguchi et al.

st to perform current active processes, a new process pi , and ent (nt ) possible processes to be performed is given as follows;  pj ∈CPt (τ ) (Rj + minTi + ent (nt ) · minT ) T ETt (nt , pi ) = . (6) (nt + ent (nt ) + 1) · N P Rt (nt + ent (nt ) + 1) Here, the server st consumes the electric power N Et (nt + ent (nt ) + 1) [W] for T ETt (nt , pi ) time units. Hence, the server st consumes electric energy T EEt (nt , pi ) to perform the new process pi and the active processes as follows; T EEt (nt , pi ) = T ETt (nt , pi ) · N Et (nt + ent (nt ) + 1).

(7)

A load balancer L selects a server st whose expected energy consumption T EEt (nt , pi ) is the smallest to perform a new process pi by the MES (MinimumEnergy Server selection) algorithm. [MES (Minimum-Energy Server selection) algorithm] 1. A client issues a process pi to a load balancer L. 2. The load balancer L selects a server st whose T EEt (nt , pi ) is minimum. 3. The process pi is performed on the server st .

5

EDS Simulator

We design and implement the EDS (Eco Distributed System) simulator to evaluate algorithms like the MES algorithm to select a server to perform a process issued by a client in terms of the total energy consumption of servers and the average execution time of processes. The EDS simulator is performed on tables in a database, which hold configurations of servers and processes. For example, a cluster is composed of four servers s1 , · · · , s4 (m = 4). The thread computation rate T CR1 of the server s1 is one, i.e. T CR1 = 1. T CR2 = 0.8, T CR3 = 0.6, and T CR4 = 0.4 for the servers s2 , s3 , and s4 , respectively. The performance and energy parameters of the servers are shown in Table 1. For example, the server s1 supports the server computation rate SCR1 = 16 by sixteen threads where the maximum power consumption maxE1 is 230 [W ] and the minimum power consumption minE1 is 150 [W ]. The server s4 supports SCR4 = 3.2 by eight threads where maxE4 = 77 [W ] and minE4 = 40 [W ]. The servers s2 and s3 supports the same number twelve threads while maxE2 > maxE1 and minE2 > minE1 . SCR2 = 9.6 and SCR3 = 7.2. The server s3 is more energyefficient than the server s2 . The server s4 shows a desktop PC. The server s1 stands for a server computer.

An Algorithm to Select an Energy-Efficient Sever for an Application Process

109

Let P be a set of processes p1 , · · · , pn (n ≥ 1) to be issued to the clusters. For each process pi in the set P , the starting time stimei [time unit] and the minimum execution time minTi [time unit] are randomly given as 0 < stimei ≤ xtime and 5 ≤ minTi ≤ 25. Here, xtime = 1, 000 [time unit]. At each time τ , if there is a process pi whose start time (stimei ) is τ , one server st is selected by a selection algorithm. The process pi is added to the set Pt , i.e. Pt = Pt ∪ {pi }. For each server st , active processes in the set Pt are performed. The energy variable Et is incremented by N Et (|Pt |). If |Pt | = φ, Et is incremented by minEt . If |Pt | > 0, the variable Tt [time unit] is incremented by one [time unit]. The variable Tt [time unit] shows how long the server st is active, i.e. some process is performed. For each process pi in the set Pt , the computation residue Ri of the process pi is decremented by the process computation rate N P Rt (nt ). If Rt ≤ 0, the process pi terminates, i.e. Pt = Pt − {pi } and P = P − {pi }. Until the set P gets empty, the steps are iterated. The variables Et and Tt give the total energy consumption and execution time of each server.

Table 1. Parameters of servers. npt nct ntt T CRt SCRt minE maxE

pE

cE tE

s1

1

8

16

1.0

16.0

150.0

230.0 40.0 8.0 1.0

s2

1

6

12

0.8

9.6

128.0

176.0 30.0 5.0 1.0

s3

1

6

12

0.6

7.2

80.0

130.0 20.0 3.0 1.0

s4

1

4

8

0.4

3.2

40.0

77.0

15.0 2.0 0.5

The EDS simulator is implemented in SQL on a Sybase database [2]. Information on servers and processes are stored in tables of the database. Algorithm 2 shows the procedure of the simulator. Given the process set P and the server set S, the EDS simulator gives the total energy consumption Et and total active time Tt of each server st in the cluster S and the execution time P Ti of each process pi in the set P . Figure 4 shows the total energy consumption of the servers obtained by the EDS simulator in the ME algorithm which uses the previous estimation model [Algorithm 1]. We are now evaluating the MES algorithm by the EDS simulator.

110

K. Noguchi et al.

Algorithm 2: EDS Simulator 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

input P = set of processes p1 , · · · , pn ; S = set of servers s1 , · · · , sm ; output Et = energy consumption of each server st ; Tt = active time of each server st ; P Ti = execution time of each process pi ; τ = 0; for each server st do Pt = φ; Et = 0; Tt = 0; while P = φ do for each process pi where stimei = τ do select a server st in a selection algorithm; Pt = Pt ∪ {st }; /* pi is performed on st */ for each server st do Et = Et + N Et (|Pt |); /* energy consumption */ if Pt = φ then nt = |Pt |; for each process pi in Pt do Ri = Ri − N P Rt (nt ); if Ri ≤ 0 then Pt = Pt − {pi }; P = P − {pi }; /*pi terminates */ P Ti = τ − stimei + 1; /* execution time */ Tt = Tt + 1; τ = τ + 1;

Fig. 4. Energy consumption.

An Algorithm to Select an Energy-Efficient Sever for an Application Process

6

111

Concluding Remark

In this paper, we newly proposed the model to estimate the execution time of processes on a server and the energy consumption of a server where not only current active processes but also possible processes to be issued after current time are considered. By using the estimation model, we also proposed the MES algorithm to select a server to energy-efficiently perform a process issued by a client. We developed the EDS simulator to obtain the total energy consumption of the servers and the execution time of the processes. We showed the total energy consumption of servers of the ME algorithm obtained by the EDS simulator. We are now evaluating the MES algorithm by the EDS simulator.

References 1. Intel xeon processor 5600 series: The next generation of intelligent server processors, white paper (2010). http://www.intel.com/content/www/us/en/processors/ xeon/xeon-5600-brief.html 2. Sybase. https://www.sap.com/products/sybase-ase.html 3. Duolikun, D., Aikebaier, A., Enokido, T., Takizawa, M.: Energy-aware passive replication of processes. Int. J. Mob. Multimed. 9(1,2), 53–65 (2013) 4. Duolikun, D., Kataoka, H., Enokido, T., Takizawa, M.: Simple algorithms for selecting an energy-efficient server in a cluster servers. Proc. Int. J. Comm. Netw. Distrib. Syst. 21(1), 1–25 (2018) 5. Enokido, T., Ailixier, A., Takizawa, M.: A model for reducing power consumption in peer-to-peer systems. IEEE Syst. J. 4(2), 221–229 (2010) 6. Enokido, T., Ailixier, A., Takizawa, M.: Process allocation algorithms for saving power consumption in peer-to-peer systems. IEEE Trans. Ind. Electron. 58(6), 2097–2105 (2011) 7. Enokido, T., Ailixier, A., Takizawa, M.: An extended simple power consumption model for selecting a server to perform computation type processes in digital ecosystems. IEEE Trans. Ind. Inform. 10(2), 1627–1636 (2014) 8. Kataoka, H., Duolikun, D., Enokido, T., Takizawa, M.: Evaluation of energy-aware server selection algorithm. In: Proceedings of the 9th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS-2015), pp. 318–325 (2015) 9. Kataoka, H., Nakamura, S., Duolikun, D., Enokido, T., Takizawa, M.: Multi-level power consumption model and energy-aware server selection algorithm. Int. J. Grid Util. Comput. (IJGUC) 8(3), 201–210 (2017) 10. Kataoka, H., Sawada, A., Duolikun, D., Enokido, T., Takizawa, M.: Energy-aware server selection algorithm in a scalable cluster. In: Proceedings of IEEE the 30th International Conference on Advanced Information Networking and Applications (AINA-2016), pp. 565–572 (2016)

Suggesting Cultural Heritage Points of Interest Through a Specialized Chatbot Roberto Canonico, Giovanni Cozzolino(B) , and Giancarlo Sperl`ı DIETI - University of Napoli Federico II, via Claudio 21, Naples, Italy {roberto.canonico,giovanni.cozzolino,giancarlo.sperli}@unina.it

Abstract. Researchers and companies are making great efforts to create new ways of interaction with a increasing number and types of electronic devices, with particular attention to the conversational interface, whether spoken or written. The cultural heritage domain can bring many benefits from the great effort in this field, as a smart conversational system can provide both room/hotel/apartment information and information related to the area, natural or cultural points of interest, tourist routes, social events, history etc. The paper’s main goal is to present a quick and effective service to the tourists visiting a city for the first time. Learning the cultural preferences of the users can enhance customer satisfaction, since a smart system can propose them customised tours in order to reach point of interests according to the expressed preferences. The interaction with the user is simplified by a conversational interface (chatbot).

1 Introduction The Chatbot4Heritage (CB4H) project was born with the aim of integrating and centralising, on a single platform, information related to the tourism sector in Campania, generally from heterogeneous sources. In fact, the information a tourist may need during his or her stay ranges from details on local cultural sites, to upcoming cultural and entertainment events, the location of various types of services (e.g. in the catering sector), updates on the status of mobility, making it necessary to carry out numerous searches on different websites and worsening the overall user experience. The project aims, through a service-oriented architecture, to extrapolate through a web crawler the information coming from multiple event aggregator websites, transform it into a common format and, through the mediation of an Enterprise Service Bus, store it in special databases. This information can then be made available to client applications, including a Chatbot [1–3] with which any users will be able to interact easily in natural language. Researchers and companies are making great efforts to develop new ways of communicating with a increasing number and types of electronic devices, with particular attention to the conversational interface, whether spoken or in writing. The cultural heritage domain can bring many benefits from the great effort in this field, as an intelligent conversational framework can include both room/hotel/apartment information [4] and area specific information, natural or cultural points of interest, tourist routes, social events, history etc. The main aim of the paper is to provide the tourists visiting a city for the first time with a fast and efficient service. c Springer Nature Switzerland AG 2021  L. Barolli et al. (Eds.): INCoS 2020, AISC 1263, pp. 112–120, 2021. https://doi.org/10.1007/978-3-030-57796-4_11

Suggesting Cultural Heritage Points of Interest Through a Specialized Chatbot

113

The system can suggest to the users most appreciated cultural sites, by analysing opinion taken from social networks [5], and specifically taking in consideration cultural heritage communities [6] or the most authoritative persons [7] in the field of cultural heritage domain. Therefore, understanding the users’ cultural preferences can improve customer loyalty, as a smart system can give them personalised tours to reach the point of interest according to the preferences expressed [8]. A conversational interface [9] (chatbot) simplifies communications with the user. Moreover, since of Covid-19 disease spreading, many cities started to improve their mobility services, like underground, by implementing a routing service for tourists, that offers: • • • •

Route planning by taking as input user’s first and last stop. number of stops from starting stop to last one. Number of line changes during the route. Detection of the nearest stop to a given tourist’s attraction. TravelBot [10] provides the aforementioned basic functions and other services like:

• • • • •

Informing the user about any faults on the lines. Showing the metro’s map. Informing and guiding the user to the purchase of tickets. Showing the order of the stops for the line selected. Informing the user about the working timetables of metro’s services.

2 System Architecture In order to retrieve information about locations, events or other relevant information, we have to chose a set of sources considered reliable. Then, the obtained data-set must undergo a process of processing and cleaning, removing formatting errors and non machine readable elements, so that the information contained in it can be inserted in a natural language sentence. We adopted a Service Oriented Architecture because of its intrinsic modularity. In this type of architecture, in fact, each module is developed independently and can be connected and removed from the main system without compromising the others. If, conceptually, the chatbot is considered a front end module, then it is possible to delegate data extraction and cleaning operations to other back end modules, using the Service Bus as an intermediary. The chatbot, at this point, can use the received data immediately, without having to process it further at runtime. Some example components were chosen and described in Fig. 1: a scraper, whose implementation in Python is described here through the Scrapy framework, for the Napolitoday.it website; an external API, that of the Yelp website; a MySQL database, which in this field is identified as Data Service; a web application in Django for direct insertion. For the implementation of the Service Bus, the choice fell on the open-source framework WSO2. WSO2 Enterprise Integrator has a graphical drag-and-drop interface for the configuration of the message flows between one component and another. One of

114

R. Canonico et al.

Fig. 1. High level architecture of a bot. Data extraction occurs only in scraping bots [11].

the main features of a Service Oriented Architecture is the possibility to configure the interactions between components, instead of defining them entirely by code. The application in Django, finally, was born as an interface for the management of CRUD (Crud-Retrieve-Update-Delete) operations by an employee or operator of a cultural entity. However, due to restrictions imposed by the framework itself, it was decided to use it for insertion only. In the section dedicated to future developments, this issue will also be dealt with.

3 Knowledge Base Population Information extracted from web pages sources are processed and stored in the system’ Knowledge Base. The Knowledge Base was implemented in Prolog language, taking advantage of its ease of use. The code will be divided into: • Facts: they report certain data known by the machine. • Rules: they manipulate the facts to obtain data useful to the end user. The “machine learning” process is used to obtain useful information for the user (e.g.) and this will help the programmer who does not need to provide too many facts to the machine. 3.1

Facts

The machine knows two types of facts: 1. Connections (conn), composed by 3 arguments: a. First stop;

Suggesting Cultural Heritage Points of Interest Through a Specialized Chatbot

115

b. Second stop; c. Median time between that two stops, it also serves to identify which line is it. It establishes that there is a connection between the two stops and how long the subway takes to travel from one to another. conn(battistini, cornella, 2).

1 2

2. Locations (locate), composed by 2 arguments: a. List of one or more tourist attraction b. Nearest stop to the attraction list. It links a list of tourist attraction to their nearest stop. locate([s_pietro,musei_vaticani],ottaviano).

1 2

3.2 Rules The rules can enhance the inferential process that retrieve customised information according to user preferences. Further enhancement can be taken with the exploitation of semantic techniques [12, 13]. Conn1 The machine can determine only a one way connection with his known fact “conn” . To be able to establish a bidirectional connection between two stops, the machine uses this mathematical expression: ∃conn(X,Y,C)i f ∃conn(Y, X,C). The OR statement is needed to get rid off infinite recursive call that can be possible in the next set of rules. Append The “append” rule, like his name suggest, will take care of appending one element to an existing list and it will give a new resulting list as an output. • [Head1|Tail1] = List where you want to append an element. • List2 = Element or list of elements that you wish to append. • [Head1|TailResult] = Result. First part of the code, by using a simple fact, establishes the result of appending an element to an empty list. In the second part then we will use a rule that with a recursive call, taking advantage of backtracking, delete the head of [Head1—Tail1] and continues until the list is empty, after that it starts to rebuilt a new list starting from tail to head, using as a tail List2. This rule will come in handy when we will need to find out how many line change we have to do during end user’s journey. 1 2 3

append([],X,X). append([Head1|Tail1],List2,[Head1|TailResult]):append(Tail1,List2,TailResult).

Change We use average time to go from one stop to the next, to check how many lines the tourist must change.

116

R. Canonico et al.

• T = Average time between two stops. • Lc = List of changes, initially empty and used as a temporary. • [T |Ris] = A second list of changes, needed to make append rule to work and to get a result. The rule will initially check if given T value is inside Lc by using: not(member(T, Lc)). After that if the result is true, append rule will add the given T value to [T—Ris]. Otherwise we use append with an empty list, so it will not add nothing to [T—Ris]. Result will be a list containing a maximum of three elements, that it will be then used to determine how many lines the tourist must change. change(T,Lc,[T|Ris]):not(member(T,Lc)), append([T],Lc,[T|Ris]). change(T,Lc,[T|Ris]):member(T,Lc), append([],Lc,[T|Ris]).

1 2 3 4 5 6

Len Len works pretty much like append with a recursive call (backtracking) of itself and using a given fact: len([],0). goal is different, we need to evaluate the length of a given list, in our case that list will be Lc. • [ HEAD|TAIL] = List. • N = Number of elements. Decr This simple rule will decrement P by 1 and it will store the result in Q. decr will work with change and len to work on Lc, we will see that our main goal is to get the number of elements in Lc and decrement that number by one to get our result. Connection Connection is a rule that takes two stops as input and calculates the list of stops, line changes made and duration time of the journey. The algorithm takes care of solving the“travel salesman” problem. Rule’s arguments are: • • • • • • • • •

Fs = First stop. Ls = Last stop. Ms = Middle stop. Lc = List of changes. Pl = Prohibited list. T = Total time of journey. N1 = Number of changes. Result = Non decremented list of changes. L1c = Temporary list with last stops to do.

In the first part of code, connection will check if there is a direct connection between first stop (Fs) and last stop (Ls) by using rule conn1 and checking after if these two stops are member of Pl. Otherwise if this first part is false, the machine will search for

Suggesting Cultural Heritage Points of Interest Through a Specialized Chatbot

117

a middle stop Ms and check if there is a direct connection to first stop (Fs). When this stop is found, the machine will check if both stops are not in the prohibited list using not(member) after that change rule will be called checking if there was a line change from one stop to another. Change rule as we saw makes a list made of average times taken only once (in this specific case), so the maximum length of this list will be three, decreasing this value by 1 will give us N1. Please note: Both Pl and Lc must be empty list while calling connection in a query, because when the tourist starts his journey he have not travelled to anywhere still. Also neither first stop and last stop can be prohibited stops. Connection is going to use a recursive call that will find a connection between Ms and Ls, using updated arguments value: Fs → s

Pl → [Fs, Pl]

Lc → Result

(1)

The recursive call has to be called until it find a direct connection between middle stop (Ms) (which is going to be updated everytime) and final stop (Fs). After the rule will find final route it will determine length of Result by using change and len, lastly to get N1 it uses decr (N, N1). Then connection will calculate T by adding each time between all the stops. 1 2 3 4 5 6 7 8 9 10 11 12 13 14

connection(Fs,Ls,Pl,Lc,[Fs,Ls],T,N1):conn1(Fs,Ls,T), not(member(Fs,Pl)), not(member(Ls,Pl)), change(T,Lc,Ris), len(Ris,N), decr(N,N1). connection(Fs,Ls,Pl,Lc,[Fs|L1c],T,M):conn1(Fs,Ms,T1), not(member(Fs,Pl)), not(member(Ms,Pl)), change(T1,Lc,Result), connection(Ms,Ls,[Fs,Pl],Result,L1c,T2,M), T is T1+T2.

Visit Rule that calculate the path for the nearest stop, based on the selected attraction, it also shows the travel time, the number of changes and the stops list. Or, choose the final stop you want to reach and define the initial stop where the user is located, check if there are tourist attractions nearby. Compared to ‘Connection’ rule, there is an additional argument: the attraction that the user wants to visit (Att). The IM, as we already said, shows facts that associate a list of attractions with their nearest stop. Arguments • • • • • • •

Att = Attraction. Fs = First stop. Ls = Last stop. Ms =Middle stop. Lc = List of changes. Pl = Prohibited list. T = Total time of travel.

118

R. Canonico et al.

• N1 = Number of changes. • Result = List containing the number of changes not decreased. • L1c =List that represent the remaining path. In the first case the IM checks if there is already an attraction or a list of them X, in the Knowledge Base, that can be associated with the last stop (Ls). After the IM finds it, checks if the desired attraction belongs to the list X, if it’s true the IM searches a connection between Fs and Ls. If, at the first try, the IM doesn’t find a direct connection, then it recalls the OR in visit and it will repeat the initial step, but this time searching, In the facts, a connection between Fs and a middle stop Ms, with T1 as travel time, recalling the rule: conn1. For the remaining part of the code, can be checked the rule: connection. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

visit(Att,Fs,Ls,Pl,Lc,[Fs,Ls],T,N1):locate(X,Ls), member(Att,X), conn1(Fs,Ls,T), not(member(Fs,Pl)), not(member(Ls,Pl)), change(T,Lc,Ris), len(Ris,N), decr(N,N1). visit(Att,Fs,Ls,Pl,Lc,[Fs|L1c],T,M):locate(X,Ls), member(Att,X), conn1(Fs,Ms,T1), not(member(Fs,Pl)), not(member(Ms,Pl)), change(T1,Lc,Result), visit(Att,Ms,Ls,[Fs,Pl],Result,L1c,T2,M), T is T1+T2.

4 TravelBot In this section we will show the main functions of the chatbot component and guidelines for correct execution. The main program uses two doc.json, these two document has been made with two different approach to test the chat bot: 1. ‘doc-v1.json’ had pattern made of extended question containing when/how/what/ which pronouns; 2. ‘doc.json’ is the current document in use, and is made of keywords for each tag with no pronouns at all. If you want to test the two different documents please make sure to delete model and data files or uncomment line 46 and 123, and change the following constant to the corresponding file.

Suggesting Cultural Heritage Points of Interest Through a Specialized Chatbot

119

In Listing 1 we open the document ‘doc.json’ and download the content. With the Boolean cycle ‘for’ we are taking the words taken from the document and inserting them in the lists through the functions ‘append’ and ‘extend’. 1

with open(DOC_FILE) as file:  # Opens "doc.json" and loads it with the json library.
    data = json.load(file)
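The listing continues with the word-extraction loop described above; since only its first two lines survive here, the following is a plausible reconstruction under the intent structure assumed earlier, with variable names that are ours rather than necessarily the authors'.

import json
import nltk

with open("doc.json") as file:
    data = json.load(file)

words, labels, docs_x, docs_y = [], [], [], []
for intent in data["intents"]:
    for pattern in intent["patterns"]:
        tokens = nltk.word_tokenize(pattern)  # split the pattern into words
        words.extend(tokens)                  # 'extend': add all words to the vocabulary
        docs_x.append(tokens)                 # 'append': keep the tokenized pattern
        docs_y.append(intent["tag"])          # remember which tag it belongs to
    if intent["tag"] not in labels:
        labels.append(intent["tag"])          # collect the distinct tags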


3.2 Reward Modeling

The taxi fares in San Francisco are modeled with reference to Yellow Cab SF Cab Fares [18]. We used the following elemental fares to calculate the cost of a taxi ride.

• Base fare (first 0.2 mile): $3.50,
• Additional fare (every 0.2 mile): $0.55,
• Waiting fare (every minute): $0.55.

The taxi fare R_f for one ride is calculated by the following equation:

    R_f = B + A(d − 0.2)/0.2 + W T,    (5)

where B is the base fare, A is the additional fare per 0.2 mile, d is the distance traveled by taxi, W is the waiting fare, and T is the waiting time (the total time traveled at 10 km/h or less).
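For concreteness, Eq. (5) can be transcribed directly into Python; the function name and keyword defaults are ours, with the fare constants taken from the list above.

def taxi_fare(d, t_wait, base=3.50, additional=0.55, waiting=0.55):
    """Taxi fare R_f for one ride, following Eq. (5).

    d      -- distance traveled by taxi in miles
    t_wait -- waiting time T in minutes (total time traveled at 10 km/h or less)
    """
    return base + additional * (d - 0.2) / 0.2 + waiting * t_wait

# Example: a 2.2-mile ride with 3 minutes of waiting.
print(taxi_fare(2.2, 3))  # 3.50 + 0.55 * 10 + 0.55 * 3 = 10.65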


Next we consider the calculation of the gasoline cost for taxi transfer. We assume that the taxi agent takes the shortest route to the destination on the road network. In the process of reinforcement learning, this route is derived by Dijkstra's algorithm. We also assume that the gasoline cost is $1/L and the fuel consumption is 10 miles/L. Then, the gasoline cost is given by

    R_g = −d/10,    (6)

where d miles is the distance to the destination for taxi transfer. Also, the speed of taxi movement basically follows the maximum speed of each road along the shortest route. The deceleration due to a close distance to the vehicle ahead during traffic congestion, or when turning right or left around an intersection, is assumed to obey the Optimal Velocity Model (OVM) [19]. The agent proceeds with its learning by sensing its own state and the reward given by the environment to accumulate its reward experience. When the agent arrives at the destination, it records the cumulative reward AR, the cumulative time required to arrive at the destination S, the cumulative reward per unit time RPS, and the number of visits to the destination area C. Here, the cumulative reward AR is defined by the following equations:

    AR(t) = AR(t − 1) + R_f + R_g    (if the agent (a taxi) is carrying passengers),    (7)
    AR(t) = AR(t − 1) + R_g          (otherwise).    (8)
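The gasoline cost of Eq. (6) and the cumulative reward updates of Eqs. (7)-(8) can be sketched in the same style; taxi_fare is the function from the previous sketch, and the remaining names are ours.

def gasoline_cost(d):
    """Negative reward R_g of Eq. (6): gasoline at $1/L, consumption 10 miles/L."""
    return -d / 10

def update_cumulative_reward(ar_prev, d, t_wait, has_passenger):
    """One update of AR following Eqs. (7)-(8): the fare R_f is added only
    when the taxi is carrying passengers; R_g is always incurred."""
    reward = gasoline_cost(d)
    if has_passenger:
        reward += taxi_fare(d, t_wait)
    return ar_prev + reward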

Fig. 3. Visualization of a typical result of learning the passenger probability distribution by the ε-greedy method with the expected reward (ε = 0). The darker the blue, the higher the passenger probability; the lighter the blue, the lower the passenger probability.


The cumulative reward per unit time RPS is defined as follows:

    RPS = AR / S.    (9)

The action of the agent selecting a destination is based on comparing the expected reward it would obtain on arriving there. In the ε-greedy method, therefore, the agent exploits its past exploration by comparing the expected reward ⟨R_t⟩ to find the next destination where the reward is maximum:

    ⟨R_t⟩ = RPS + R_g / (S/C).    (10)

As shown in the above equation, the expected reward is the positive reward per unit time RPS plus the negative reward of the gasoline cost to the destination R_g divided by the expected number of time steps S/C to the destination.
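A sketch of this destination choice: Eq. (10) scores each candidate area, and the ε-greedy policy either explores at random or exploits the best score. The bookkeeping structure stats is our own assumption, not the authors' data layout.

import random

def expected_reward(rps, r_g, s, c):
    """Expected reward <R_t> of Eq. (10): RPS plus the (negative) gasoline
    cost to the destination divided by the expected time steps S/C."""
    return rps + r_g / (s / c)

def choose_destination(areas, stats, epsilon):
    """epsilon-greedy destination selection over candidate areas.
    stats[a] = (RPS, R_g, S, C) recorded for area a."""
    if random.random() < epsilon:
        return random.choice(areas)  # exploration: pick a random area
    # exploitation: pick the area with the maximum expected reward
    return max(areas, key=lambda a: expected_reward(*stats[a]))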

3.3 Simulator for Generating Taxi's Mobility Model by Reinforcement Learning

This section describes the rough processing flow of the simulator program that performs the reinforcement learning.

1. Setting of the simulation parameters (the number of taxi agents, the number of areas to divide, ε, the duration time of the simulation).
2. Acquisition of the map image by using the Smopy module.

Fig. 4. The pattern of changes in rewards earned every 5,000 steps.


3. Extraction of the passenger probability distribution and passenger entry/exit data from the taxi's mobility data.
4. After loading the road network data, the program builds a network (directed graph) whose coordinates fit the map image.
5. Generation of the taxi agents.
6. Execution of reinforcement learning using the ε-greedy method.
7. Output of the learning process as images, videos, and graphs as needed.
8. When the reward per unit time converges to a certain value, the learning ends.
9. Generation of the final mobility model by using the learning parameters at the time of convergence as input parameters.
10. By changing ε, the mobility model with the ε for which the simulation obtained the best cumulative reward is selected as the final generated model.

Fig. 5. The relation between the search parameter ε and the cumulative reward.

First, a taxi agent is generated, and the starting point and destination are chosen randomly. The movement is repeated for the duration of the simulation, and basically, one step corresponds to one second. When the agent reaches the destination, it stochastically determines whether or not it can pick up a passenger in the destination area based on the passenger probability distribution. When a passenger can be picked up, one person is randomly selected from all the passengers in the area. Then, the destination of the passenger is set as the next destination and the reward is obtained according to the taxi fare. When the taxi agent cannot pick up any passenger, it chooses the next destination for exploration or exploitation based on the ε-greedy method. If it chooses exploration,


the destination is randomly selected from among all areas. If it chooses exploitation, the destination with the highest expected reward is chosen. After choosing the destination, the agent moves there, and while moving, the gasoline cost is subtracted from the reward at each time step. By repeating the movement and the choice of destination in this way, reinforcement learning is performed. Then, when the number of steps in the simulation reaches the duration time, the program ends and generates the learned mobility model.

4 Simulation Results

The settings of the simulation parameters are as follows.

• The number of taxi agents is 5.
• The simulation area is divided into 20 × 20 = 400 tiles of small areas.
• The exploration parameter is varied: ε = 0, 0.01, 0.05, 0.1, 0.2, 0.5, 0.8, 1.0.
• The duration time of the simulation is 100,000 steps.

Figure 3 shows a visualization of the expected reward for each tile obtained by reinforcement learning as a heat map. It is observed that the pattern in Fig. 3 is similar to that in Fig. 2. This result indicates that the agent can learn the passenger probability distribution correctly by reinforcement learning.

Fig. 6. The temporal pattern of the cumulative reward.

Figure 4 shows the change over time in the reward earned by the agent every 5,000 steps.


The reward gradually increases for the first 40,000 (= 8 × 5,000) steps, but thereafter it stagnates between 70 and 100. This means that by learning up to 40,000 steps, the agent learned to earn the reward efficiently, and then the learning converged. The relation between the exploration parameter ε and the final total earned reward is shown in Fig. 5. It can be observed that the smaller the exploration parameter ε, the larger the final total earned reward at the end of the simulation. In the usual ε-greedy method, the maximum reward is obtained when the value of ε is as small as around 0.1, so by comparison, this result seems strange because the agent with ε = 0 has the highest reward. The reason comes from the fact that the taxi agent performs implicit exploration when it picks up a passenger after taking the previous passenger to the destination. Therefore, it is considered that the reinforcement learning could be performed sufficiently well without the explicit exploration of choosing random destinations. Next, Fig. 6 shows the change over time steps in the total earned reward of the agent. In the early steps, the agents with ε = 0.01 and 0.05 earned larger rewards than the others. But the agent with ε = 0 becomes the largest of all in the total amount of reward after around 10,000 steps. This result indicates that the agent with ε = 0, which has only the implicit exploration, is at a disadvantage in the beginning, but after sufficient time has passed, it does well without the explicit exploration. This time, we adopted the conditions that the agents did not forget what they learned and that the passenger probability distribution did not change over time. Under these conditions, the agent with ε = 0 is found to be advantageous in the long-time limit. From this result, it can be seen that the learning model outputs an efficient mobility model that does not perform any exploration. From the viewpoint of the dichotomy of returners and explorers in human mobility patterns, our method generates a mobility model of the returner.

5 Summary and Discussions

We investigated a method to generate a mobility model for taxi mobility patterns by reinforcement learning using the actual travel data of taxis in San Francisco. In order to perform the reinforcement learning, it is necessary to properly model the environment, the state of the agent, the reward, the expected reward, and the actions for the agent to choose. For the taxi mobility model, we successfully constructed these by using the passenger probability distribution, the taxi fare, and the gasoline cost for modeling the reward. We used the ε-greedy method as a well-known algorithm for reinforcement learning. We compared the cumulative reward as the degree of learning while changing the exploration parameter ε. As a result, it was found that the cumulative reward was the highest when ε = 0 in the long-time limit because the implicit exploration was dominant in the taxi mobility patterns. We concluded that, from the viewpoint of the dichotomy of returners and explorers in human mobility patterns, our method successfully generates a mobility model of the returner. For future work, it is interesting to consider how to generate a mobility model of the explorer by using the reinforcement learning framework. In this study, we


modeled the passenger probability distribution as not changing over time. However, it does change in general, since there are times when many people move frequently, such as the morning and evening commuting hours. In such a time-varying situation, if the chance of explicit exploration is small, the agent cannot adapt to changes of the passenger probability distribution, which would be disadvantageous. To handle this, it is necessary to introduce an appropriate forgetting rate into the learning. It is also necessary to compare the results of reinforcement learning in various areas, such as a wide area including downtown San Francisco, where the passenger probability distribution becomes more complicated. The reason that we first investigated this problem using taxi mobility data is that it is easy to model rewards based on the taxi fare and gasoline cost. However, it may be possible to generate a mobility model of a person from other mobility data by properly estimating appropriate rewards for each movement, even if there are no explicitly given rewards. Actually, it seems that people often decide to move on the assumption of short-term or long-term rewards. For example, going to school or going to work would be expected to have long-term rewards, such as earning a degree or earning a salary. Also, returning home would have short-term rewards, such as comfort with the family and recovery of physical strength by sleep. We could also consider the reward obtained by moving to a certain place, such as meeting a specific person. If these various kinds of rewards could also be modeled based on datasets of mobility flows of people and their social interactions, wider human mobility models could be generated, which could help to perform more realistic social and economic simulations, and also those of disaster evacuation guidance [20–24].

Acknowledgements. This work was partially supported by the Japan Society for the Promotion of Science (JSPS) through KAKENHI (Grants-in-Aid for Scientific Research) Grant Numbers 17K00141, 17H01742, and 20K11797.

References

1. Song, C., Qu, Z., Blumm, N., Barabási, A.-L.: Limits of predictability in human mobility. Science 327(5968), 1018–1021 (2010)
2. Pappalardo, L., Simini, F., Rinzivillo, S., Pedreschi, D., Giannotti, F., Barabási, A.-L.: Returners and explorers dichotomy in human mobility. Nat. Commun. 6, 8166 (2015)
3. González, M.C., Hidalgo, C.A., Barabási, A.-L.: Understanding individual human mobility patterns. Nature 453(7196), 779–782 (2008)
4. Rhee, I., Shin, M., Hong, S., Lee, K., Kim, S.J., Chong, S.: On the Lévy-walk nature of human mobility. IEEE/ACM Trans. Netw. 19(3), 630–643 (2011)
5. Zaburdaev, V., Denisov, S., Klafter, J.: Lévy walks. Rev. Mod. Phys. 87(483), 483–530 (2015)
6. Song, C., Koren, T., Wang, P., Barabási, A.-L.: Modeling the scaling properties of human mobility. Nat. Phys. 6(10), 818–823 (2010)
7. Evans, M.R., Majumdar, S.N.: Diffusion with stochastic resetting. Phys. Rev. Lett. 106, 160601 (2011)


8. Fujihara, A., Miwa, H.: Homesick Lévy walk and optimal forwarding criterion of utility-based routing under sequential encounters. Internet of Things and Inter-cooperative Computational Technologies for Collective Intelligence, vol. 460, pp. 207–231. Springer (2013)
9. Fujihara, A., Miwa, H.: Homesick Lévy walk: a mobility model having Ichi-Go Ichi-e and scale-free properties of human encounters. In: 2014 IEEE 38th Annual International Computers, Software and Applications Conference, pp. 576–583. Springer (2014)
10. Fujihara, A.: Analyzing scale-free property on human serendipitous encounters using mobile phone data. In: MoMM 2015: Proceedings of the 13th International Conference on Advances in Mobile Computing and Multimedia, pp. 122–125. ACM (2015)
11. Sudo, A., et al.: Particle filter for real-time human mobility prediction following unprecedented disaster. In: Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, vol. 5, pp. 1–10 (2016)
12. Pappalardo, L., Simini, F.: Data-driven generation of spatio-temporal routines in human mobility. Data Min. Knowl. Disc. 32, 787–829 (2018). https://github.com/jonpappalord/DITRAS
13. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. The MIT Press, Cambridge (2018)
14. OpenStreetMap. https://www.openstreetmap.org/; Geofabrik Download Server. https://download.geofabrik.de/north-america/us/california/norcal.html
15. SUMO: Simulation of Urban MObility. http://sumo.sourceforge.net/
16. Smopy. https://github.com/rossant/smopy
17. Piorkowski, M., Sarafijanovic-Djukic, N., Grossglauser, M.: CRAWDAD dataset epfl/mobility (v.2009-02-24), traceset: cab (2009). https://crawdad.org/epfl/mobility/20090224/cab
18. Yellow Cab SF Cab Fares. https://yellowcabsf.com/service/cab-fares/
19. Bando, M., Hasebe, K., Nakayama, A., Shibata, A., Sugiyama, Y.: Structure stability of congestion in traffic dynamics. Jpn. J. Ind. Appl. Math. 11, 203 (1994). http://traffic.phys.cs.is.nagoya-u.ac.jp/mstf/sample/ov.html
20. Fujihara, A., Miwa, H.: Real-time disaster evacuation guidance using opportunistic communication. In: IEEE/IPSJ-SAINT 2012, pp. 326–331 (2012)
21. Fujihara, A., Miwa, H.: Effect of traffic volume in real-time disaster evacuation guidance using opportunistic communications. In: IEEE-INCoS-2012, pp. 457–462 (2012)
22. Fujihara, A., Miwa, H.: On the use of congestion information for rerouting in the disaster evacuation guidance using opportunistic communication. In: ADMNET 2013, pp. 563–568 (2013)
23. Fujihara, A., Miwa, H.: Disaster evacuation guidance using opportunistic communication: the potential for opportunity-based service. In: Big Data and Internet of Things: A Roadmap for Smart Environments, Studies in Computational Intelligence, vol. 546, pp. 425–446 (2014)
24. Fujihara, A., Miwa, H.: Necessary condition for self-organized follow-me evacuation guidance using opportunistic networking. In: INCoS 2014, pp. 213–220 (2014)

Data Mining on Open Public Transit Data for Transportation Analytics During Pre-COVID-19 Era and COVID-19 Era

Carson K. Leung, Yubo Chen, Siyuan Shang, Yan Wen, Connor C. J. Hryhoruk, Denis L. Levesque, Nicholas A. Braun, Nitya Seth, and Prakhar Jain

University of Manitoba, Winnipeg, MB, Canada
[email protected]

Abstract. As the urbanization of the world continues and the population of cities rises, the issue of how to effectively move all these people around the city becomes much more important. In order to use the limited space in a city most efficiently, many cities and their residents are increasingly looking towards public transportation as the solution. In this paper, we focus on the public bus system as the primary form of public transit. In particular, we examine open public transit data for the Canadian city of Winnipeg. We mine and conduct transportation analytics on data prior to the coronavirus disease 2019 (COVID-19) situation and during the COVID-19 situation. By discovering how often and when buses were reported to be too full to take on new passengers at bus stops, analysts can get an insight into which routes and destinations are the busiest. This information helps decision makers take appropriate actions (e.g., add extra buses for the busiest routes). This results in a better and more convenient transit system towards a smart city. Moreover, during the COVID-19 era, it leads to the additional benefits of safer bus services and bus waiting experiences while maintaining social distancing.

Keywords: Data mining · Open data · Data analytics · Transportation analytics · Public transit analytics · COVID-19

1 Introduction

In the modern era, with increasing urbanization and population density, the issue of how to accommodate larger and larger groups of people in cities is becoming more important. As the population of metropolitan areas rises, so does the amount of traffic generated by people commuting within cities. By looking at major North American cities like Los Angeles or New York, it is clear that private individual transportation does not scale well with millions of inhabitants. In response, there has been a growing movement of people who demand better public transportation options in their cities. To meet this demand, cities have been looking into how to improve their public transportation systems to better fit the needs of their inhabitants.


1.1 Prior to the COVID-19 Situation

In the Canadian metropolitan area (MA) of Winnipeg, the lack of sufficient public transit is a major issue. According to the 2016 census [1], 13.6% of working people there used public transit as their main mode of commuting. Although this percentage is slightly higher than the national average of 12.4%, it is much lower than those in the three largest Canadian MAs (with 24.3% in Toronto, 22.3% in Montreal, and 20.4% in Vancouver). This low percentage of public transit users in Winnipeg means more cars on the road, which results in more traffic congestion and wear on the roads. The question is then: "What can be done to increase the number of transit users?" In a city like Winnipeg, where the primary means of public transportation comes in the form of buses operated by Winnipeg Transit (a public transit agency in the City of Winnipeg government), an obvious direction for improvements can be found by examining the bus network. One might suggest that expanding the types of public transit could be beneficial for Winnipeg, but we have chosen to focus only on the system as it currently stands, ignoring the unknown future where sudden increases in funding could allow for new large-scale infrastructure projects. By analyzing collected data about bus operation, one can discover where the current system is lacking. For instance, predicting more accurate arrival times of a bus at a particular stop can be very useful for potential bus riders. Accomplishing this would require more than just examining the bus operation data; external factors like weather conditions and lane closures would also need to be taken into account. Weather is an especially important factor when seeking improvements to the transit services of an MA where winters can be very harsh. These harsh winters often cause slower traffic, and thus delays in the arrival times of buses during certain times of the year. In this paper, we focus on the "pass-up"¹, which is a situation that occurs when a bus is full to its capacity and can no longer take on additional passengers. If a full bus arrives at a particular stop and cannot allow anyone waiting for the bus to board, the bus is forced to pass by that stop and continue driving. Pass-ups can be a major problem for commuters: they can easily add an extra 10–20 min to someone's commute, which could cause them to become late. If pass-ups are frequent, public transit becomes a less reliable transportation option, which could cause many people under tight time constraints to stop using the buses altogether.

¹ https://winnipegtransit.com/en/open-data/pass-ups/

1.2 The Current COVID-19 Situation

The aforementioned "pass-up" problem can lead to riders' inconvenience (e.g., being late for school or work, missing important meetings or flights). However, this problem can become more serious under the coronavirus disease 2019 (COVID-19) situation. To elaborate, COVID-19 is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). To combat COVID-19 (e.g., limit the spread of the virus), citizens are recommended to practice social distancing (aka physical distancing), quarantine, or isolation², depending on their conditions. For instance, an infected person (i.e., a confirmed COVID-19 case) should be isolated (say, for at least 10 days, until symptoms have improved and the fever has been gone for at least three days without medications), with the purpose of keeping infectious people separate from healthy people. A person who has been exposed to the virus (i.e., could possibly become infectious) should be quarantined (say, for 14 days from the most recent date that the contact was exposed), with the purpose of separating currently healthy but potentially infectious people from other healthy people who have not been exposed to the virus. A healthy person should practice social distancing, with the purpose of keeping physical space (e.g., at least 2 arms' lengths—which are approximately 2 m—as much as possible) between himself and other people outside of his home. Suggestions include: (a) maintain social distance when waiting at a bus station; (b) maintain social distance when selecting seats (e.g., by skipping a row of seats); (c) travel when there are fewer riders. Hence, when a "pass-up" occurs, a bus has reached its "safe" capacity for passengers to practice social distancing. Moreover, when commuters cannot get on the bus due to a "pass-up", they have to wait for subsequent buses at the bus stop. This increases the number of people at the bus stop, making it more challenging to practice social distancing.

1.3 Our Contributions

During either the pre-COVID-19 era or the COVID-19 era, it is important to answer questions like: (a) How often, and when, are buses reported to be too full to take on new passengers at bus stops? (b) Which routes and destinations are the busiest? To answer these questions, we conduct data mining on open public transit data for transportation analytics. Specifically, we use open data from Winnipeg Transit³. We analyze and mine these data to discover useful knowledge related to pass-ups. The discovered knowledge helps analysts, officials, and decision makers in the city to find the routes that might need additional bus service so as to reduce the frequency of pass-ups. In this paper, our key contribution is our data mining on open public transit data (especially pass-ups for buses in Winnipeg) for transportation analytics during the pre-COVID-19 era and COVID-19 era. The mining results help reveal improvement areas for bus services. The remainder of this paper is organized as follows. The next section presents related work. In Sect. 3, we describe our data mining on open public transit data for transportation analytics. In Sect. 4, we present our evaluation for the pre-COVID-19 era and the COVID-19 era. Finally, we draw our conclusions in Sect. 5.

² https://www.cdc.gov/coronavirus/
³ https://winnipegtransit.com/en/open-data/


2 Related Work

Nowadays, big data [2–5] are everywhere. Examples include big sensor data [6, 7], social network data [8–10], and transportation data [11, 12]. Embedded in these big data are valuable information and knowledge that can be discovered by data science and/or data mining techniques [13–17]. In recent years, open data [11, 18] have become more popular, as they allow governments to provide transparency for citizens to monitor government activities and allow citizens to gain some insights about services provided by the government. As a preview, in this paper, we examine open data and conduct data mining on transportation data (especially public transit data). Analyzing transportation data has become popular. For instance, Zhao et al. [19] predicted the demand for private vehicle services like taxis and ride-sharing services like Uber. They utilized temporal-correlated entropy to measure the demand and incorporated it into several predictor algorithms—such as a Markov chain stochastic model and the long short-term memory (LSTM) algorithm—to obtain accurate predictions. Unlike Zhao et al.'s work on predicting the demand for private vehicle services, we focus on analyzing the demand for public transportation services—especially, bus service. Among related works on analyzing public transportation data, many focused on the problem of accurately predicting bus arrival or delay times. For instance, Audu et al. [11] used open data from Toronto, Canada, to accurately predict delays in bus arrival time. Yamaguchi et al. [20] applied various sophisticated machine learning models like artificial neural networks (ANN) and gradient boosting decision trees (GBDT) to a month of bus probe data from Fukuoka, Japan, to predict bus delay time. Liu et al. [21] adapted a modified k-nearest neighbor method to predict bus arrival time in Beijing, China. Unlike these related works on predicting bus delays or arrival times, we focus on a different aspect—namely, pass-ups. We examine the demand for bus services, especially on the busiest routes. Rather than predicting arrival times, we perform frequent pattern mining to determine whether extra buses are required based on the pass-up frequency at any particular stop. The negative impacts of bus pass-ups can be more serious than a delayed or late bus because, when a bus passes up, passengers need to wait at the bus stop for the next scheduled bus. Such a waiting time is usually much longer than the time delay caused by the current late bus. Moreover, bus pass-ups can become more serious during the COVID-19 era because they may negatively affect the ability to practice social distancing, both on the bus and at the bus stops.

3 Our Mining on Open Public Transit Data

In this section, we describe our data mining on open public transit data—namely, the pass-up data—for transportation analytics. We aim to determine the routes and/or times when the buses are too full to allow any additional passengers to board a bus. Passengers are "passed up" and must wait at the bus stop for the next scheduled bus.


The pass-up data are recorded by the bus's on-board computer whenever a bus operator pushes a button to log that a bus is full and passengers have been passed up at a bus stop. These data are updated on a daily basis. The number of pass-ups has been recorded within each calendar month since January 2011. Each record consists of the following attributes (an illustrative record follows the list):

1. Pass-up ID, which is a unique identifier in the dataset;
2. Pass-up type;
3. Date and time, when the button was pressed by the bus operator to record a pass-up;
4. Route number;
5. Route name, which captures the common name that corresponds to a route number;
6. Route destination, which indicates the terminus of the route and helps determine the direction (e.g., northbound vs. southbound) of the bus; and
7. Location, which is a point—i.e., global positioning system (GPS) coordinates in the form of POINT (latitude, longitude)—representing the location at which the button was pressed to record a pass-up.
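To make the record layout concrete, here is one pass-up record written as a Python dictionary; all field values are invented placeholders, not rows from the real dataset.

# Illustrative pass-up record (values are invented placeholders).
record = {
    "pass_up_id": 101234,
    "pass_up_type": "Full Bus Pass-Up",
    "time": "2019-09-05 08:42:17",           # when the operator pressed the button
    "route_number": 162,
    "route_name": "Example Express",         # common name for the route number
    "route_destination": "University (Northbound)",
    "location": "POINT (49.8075, -97.1366)", # GPS point logged by the on-board computer
}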

3.1 Data Preprocessing

Note that a pass-up can be either a "full bus pass-up" or a "wheelchair user pass-up". To elaborate, a "full bus pass-up" occurs when a bus is too full to fit any additional passenger. In contrast, a "wheelchair user pass-up" occurs when a wheelchair user cannot be accommodated. This happens due to (a) the bus being too full or (b) the wheelchair positions on the bus being occupied by other wheelchair users. In this paper, we focus on "full bus pass-ups" first, and will transfer the knowledge learned from analyzing and mining "full bus pass-ups" to analyzing and mining "wheelchair user pass-ups" as future work. As such, the value of the attribute "pass-up type" in our current study is uniquely "full bus pass-up", and can thus be ignored. The attribute "date and time" captures both the date and time when the button was pressed by the bus operator to record a pass-up. To get more interesting patterns from our data mining, we split this attribute into two attributes, "date" and "time". From the resulting attribute "date", we also derive:

• the attribute "day of the week" (DOW), which helps reveal relationships between pass-ups and the day of the week (or at least weekday, Saturday, and Sunday);
• the attribute "month", which helps reveal relationships between pass-ups and the month of the year.

From the resulting attribute "time", we also derive and bin the values into the following time intervals (a code sketch of this binning follows the list):

• 05:00–08:59, which refers to morning peak hours;
• 09:00–15:59, which refers to off-peak hours;
• 16:00–18:29, which refers to afternoon peak hours;
• 18:30–22:29, which refers to evening hours; and
• 22:30–04:59, which refers to night hours.
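A minimal sketch of this binning, assuming interval boundaries are inclusive at the start and end minutes listed above.

def time_interval(hour, minute):
    """Bin a clock time into the five intervals used for mining."""
    hm = hour * 100 + minute  # e.g., 16:30 -> 1630
    if 500 <= hm <= 859:
        return "morning peak"      # 05:00-08:59
    if 900 <= hm <= 1559:
        return "off-peak"          # 09:00-15:59
    if 1600 <= hm <= 1829:
        return "afternoon peak"    # 16:00-18:29
    if 1830 <= hm <= 2229:
        return "evening"           # 18:30-22:29
    return "night"                 # 22:30-04:59 (wraps past midnight)

print(time_interval(8, 30))   # morning peak
print(time_interval(23, 10))  # night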


Moreover, the attribute "route name" is just a user-friendly attribute (with a meaningful name rather than a "route number") to identify a route. In other words, one attribute is dependent on the other. As such, only one (e.g., the attribute "route number") needs to be kept. Note that the attribute "location" records the location of the bus when the bus operator pressed the button to log the pass-up. Consequently, the recorded location may not always match precisely the bus stop where the actual pass-up occurred. Hence, to get a more meaningful representation of pass-up locations, we perform data preprocessing and data transformation to identify the nearest bus stop (as opposed to GPS coordinates) where a bus was full. It is tempting to simply find the bus stop closest to the recorded (latitude, longitude)-values of the attribute "location". However, careful consideration reveals that the nearest bus stop may serve (a) some other routes or (b) the same route but running in the opposite direction towards the opposite destination, rather than the one that passed up. Hence, we add a constraint when finding the nearest bus stop: it must belong to the bus route and route destination reported at the time of the pass-up. To do this, we integrate an additional dataset—namely, the list of bus stops with their GPS coordinates in the form of (latitude, longitude)-values. Then, we search among the GPS coordinates of all the bus stops for the specific route and route destination to find the nearest stop. Pseudocode is shown in Fig. 1.

Algorithm 1. Find nearest bus stop location for bus route R and destination D

for each passupLocation busLoc for route R and destination D
    minDist ← ∞
    currStop ← NULL
    passupStop ← NULL
    for each stopLocationOnRouteAndDestination stopLoc
        if (minDist > dist(stopLoc, busLoc))
            minDist ← dist(stopLoc, busLoc)
            currStop ← stopLoc
    passupStop ← currStop

where the distance between the stop location stopLoc = (stopLatitude, stopLongitude) and the recorded location busLoc = (busLatitude, busLongitude), at which the bus operator pressed the button to log the pass-up, is calculated using the Manhattan distance: |busLatitude – stopLatitude| + |busLongitude – stopLongitude|

Fig. 1. Pseudocode for finding nearest bus stop location for bus route R and destination D
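A Python transcription of Algorithm 1 (Fig. 1), assuming the candidate stops have already been filtered to route R and destination D; the coordinates in the example are invented.

def nearest_stop(bus_loc, stops):
    """Return the stop (lat, lon) closest to bus_loc among the stops of the
    passed-up route and destination, using the Manhattan distance of Fig. 1."""
    def dist(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])
    return min(stops, key=lambda stop: dist(stop, bus_loc))

stops_on_route = [(49.8075, -97.1366), (49.8102, -97.1478)]  # invented coordinates
print(nearest_stop((49.8080, -97.1370), stops_on_route))     # -> (49.8075, -97.1366)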

With the integration of the list of bus stops into the bus pass-up data, we preprocess the data and capture the following attributes to conduct data mining for transportation analytics:

1. Pass-up date (or DOW, day, month, and year), when the button was pressed by the bus operator to record a pass-up;
2. Pass-up time (or time intervals), when the button was pressed by the bus operator to record a pass-up;


3. Route number;
4. Route destination, which indicates the terminus of the route and helps determine the direction (e.g., northbound vs. southbound) of the bus; and
5. Bus stop location of the pass-up.

3.2 Data Mining

Once the open public transit data are preprocessed, we conduct data mining on the resulting preprocessed data. Specifically, we perform frequent pattern mining to find implicit, previously unknown and potentially useful information and knowledge in the form of sets or collections of frequently co-occurring attribute values. The singleton frequent patterns help reveal the frequently occurring values for each attribute related to bus pass-ups. For example, they help reveal information like:

• In which month did most of the pass-ups occur?
• On which day of the week did most of the pass-ups occur?
• At which time interval did most of the pass-ups occur?
• On which routes did most of the pass-ups occur?

Moreover, non-singleton frequent patterns help reveal combinations of frequently co-occurring attribute values related to bus pass-ups. For example, they help reveal information like:

• On which routes and to which destinations did most of the pass-ups occur?
• Which combination of month, day, time interval, route number, and/or route destination led to high numbers of pass-ups?

With the dataset containing temporal information on pass-ups, in addition to frequent pattern mining, we also perform frequent sequence mining (aka sequential mining) and periodic pattern mining to find interesting frequent sequences or frequent periodic patterns. From all these discovered frequent patterns and sequences, we can get insight about when, and on which routes and destinations, pass-ups occurred frequently. This information reveals that additional buses are in demand for those routes and destinations during the busiest times in order to improve the rider experience. To a further extent, this revealed information also helps riders safely practice social distancing during the COVID-19 era. A sketch of the frequent pattern mining step is given below.

3.3 Collaborative Reasoning

In addition to conducting data mining, we also examine the negative impacts caused by frequent pass-ups. We determine the severity of the pass-ups by measuring the frequency of pass-ups and the amount of extra time the passengers need to wait for the next scheduled bus. To do so, we integrate a third and a fourth dataset—namely, the bus schedules and the on-time performance dataset. The former gives the expected arrival time of the next scheduled bus, and the latter gives the actual arrival time of the next scheduled bus.


With the quantifiable measures (i.e., frequency, time), we also determine the perceived negative impacts by crowd-voting. Specifically, the crowd collaboratively votes for the acceptable and unacceptable pass-up frequency and waiting time for the next scheduled bus due to pass-ups.

4 Evaluation

We conducted our evaluation on real-life open public transit data from Winnipeg Transit (see footnote 3). During the pre-COVID-19 era, Winnipeg Transit operated a fleet of 640 buses over 93 bus routes departing from approximately 5,170 bus stops. These buses operated long hours for many passengers (e.g., over 1.55M hours to carry 48,409,060 passengers in the year 2018). However, due to the COVID-19 pandemic, it experienced a 72% reduction in ridership. Hence, to maintain essential services while balancing the health, safety, and well-being of its bus operators and riders, it reduced its service to an "enhanced Saturday schedule" and cut some routes (operating only 84 routes) effective May 4, 2020. In response to the COVID-19 pandemic, on March 20, 2020, the Canadian province of Manitoba (where the city of Winnipeg is located) declared a state of emergency, which ordered public transit to enforce social distancing. Many non-critical businesses allowed their employees to work remotely from home, and schools were closed. In our evaluation, we analyzed and mined (1) bus pass-up data (available from January 1, 2011 to now). In addition, we also integrated other datasets, including (2) the list of bus stops, which helps locate the nearest bus stops for the passed-up bus route and destination; (3) bus schedules, which help find the expected arrival time of the next scheduled bus; and (4) the on-time performance dataset, which helps find the actual arrival time of the next scheduled bus. To measure the impact of COVID-19, we divide these data into three intervals: (1) the pre-COVID-19 era (i.e., up to March 20, 2020), (2) the COVID-19 era before the bus service reduction (i.e., March 21 to May 3), and (3) the COVID-19 era after the bus service reduction (i.e., May 4 to now). For the pre-COVID-19 era data, we focus on those from January 1, 2019 to March 20, 2020.

4.1 Sequential and Periodic Pattern Mining

Based on the bus pass-up data from 2011 to now, we observed a periodic (or recurring) pattern: pass-ups occurred most frequently in September of each year, as shown in Fig. 2. This is probably because, when students begin classes—following new schedules, often at new schools and universities—passenger loads are at their highest in the first few weeks of classes. The pass-ups then become less frequent once everyone learns their new routines and figures out the best way to get to class. We also observed that the next most frequent month for pass-ups was January. This is probably because students begin classes for a new year after coming back from the December holiday, and it again takes time for everyone to learn their new routines and figure out the best way to get to class.


In addition to these periodic patterns, we also note a sharp drop in the number of pass-ups from an average of around 1,100 monthly pass-ups during the pre-COVID-19 era to an average of 19 pass-ups (specifically, 13, 33 and 12 in April, May and up to June 11) during the COVID-19 era. In other words, the number of monthly pass-ups during the COVID-19 era is about 1.8% of that in the pre-COVID-19 era.


Fig. 2. Number of pass-ups from January 2011 to June 2020

4.2 Frequent Pattern Mining

Based on the bus pass-up data from January 2019 to now, we observed that, among the five time intervals in a day, the second most frequent time interval for the pass-ups was the afternoon peak hours (16:00–18:29), which is not too surprising. However, it is surprising that the most frequent time interval for the pass-ups was not the morning peak hours, but the off-peak hours (09:00–15:59). This may be an indication that the Transit is better prepared for the peak hours than off-peak hours.

Table 1. Comparisons on the top-10 routes having pass-ups during (a) January 1, 2019 to March 20, 2020; (b) March 21 to May 3, 2020; and (c) May 4 to June 11, 2020

(a) Jan 1, 2019–Mar 20, 2020    (b) Mar 21–May 3, 2020    (c) May 4–Jun 11, 2020
Route   Pass-ups                Route   Pass-ups          Route   Pass-ups
        Total   Weekly                  Total   Weekly            Total   Weekly
162     2368    37.25           11      9       1.43      11      11      1.97
75      2132    33.54           18      8       1.27      18      5       0.90
11      1417    22.29           162     6       0.95      47      5       0.90
160     1029    16.19           17      5       0.80      14      4       0.72
21      858     13.50           47      4       0.64      21      3       0.54
18      769     12.10           16      4       0.64      15      2       0.36
170     741     11.66           19      4       0.64      24      2       0.36
19      733     11.53           47      3       0.48      44      1       0.18
72      664     10.44           55      3       0.48      60      1       0.18
36      623     9.80            45      2       0.32      46      1       0.18


We observed and compared the top-10 routes having pass-ups during the pre-COVID-19 era and the two COVID-19 periods; see Table 1. Due to the uneven number of days in these three periods, we add columns showing the weekly pass-ups for easy comparison. We observed that Route 162 had the most frequent pass-ups in the pre-COVID-19 era, and dropped to third during the first COVID-19 period (March 21 to May 3). A possible explanation is that this route mainly serves university students. However, starting March 21, the university closed. Lower demand leads to fewer passengers, and thus fewer pass-ups. Route 162 is not even on the top-10 list for the second COVID-19 period (May 4 to June 11). A careful investigation reveals that, following the reduction in transit services, Route 162 ceased operations from May 4. Routes 11 and 18 had the third and sixth most frequent pass-ups in the pre-COVID-19 era, and climbed to the top 2 in the two COVID-19 periods. This is an indication that these two routes were in high demand prior to COVID-19 and remained in high demand during the COVID-19 era. Moreover, by examining the bus schedule and on-time performance data, it is disappointing to note the long waiting times for the next scheduled bus, especially during the second COVID-19 period. Specifically:

• For Route 11, the average time to wait for the next scheduled bus is about 35 min ± 19 min. Moreover, the maximum recorded waiting time during the second COVID-19 period was 1 h and 22 min. Over such a long duration, many passengers would be staying at the bus stops, making it challenging to maintain social distancing. Moreover, once the next scheduled bus arrived, it is likely that the bus could easily fill up, making it challenging to maintain social distancing on the bus.
• For Route 18, the average time to wait for the next scheduled bus is about 26 min ± 5 min. Moreover, the maximum recorded waiting time during the second COVID-19 period was 34 min.

With crowd-voting, we collaboratively get an idea that a waiting time of longer than 20 min is undesirable. This is an indication that these two bus routes are in demand.

5 Conclusions

In this paper, we presented our data mining on open public transit data (specifically, bus pass-up data) for transportation analytics. By integrating other related data (e.g., the list of bus stops, bus schedules, and the on-time performance dataset), we managed to analyze and mine interesting patterns related to bus pass-ups. We also divided the data into the pre-COVID-19 era and the COVID-19 era to examine the impact of the COVID-19 pandemic. By discovering how often and when buses were reported to be too full to take on new passengers at bus stops, analysts can get an insight into which routes and destinations are the busiest. This information helps decision makers take appropriate actions (e.g., add extra buses for the busiest routes). This results in a better and more convenient transit system towards a smart city. Moreover, during the


COVID-19 era, it leads to the additional benefits of safer bus services and bus waiting experiences while maintaining social distancing. As ongoing and future work, we will conduct further analysis to find more interesting patterns. To a further extent, we plan to transfer the knowledge learned from mining this dataset to other similar datasets via transfer learning.

Acknowledgements. This work is partially supported by NSERC (Canada) and the University of Manitoba.

References

1. Statistics Canada: Commuters using sustainable transportation in census metropolitan areas. 2016 census in brief (2017). https://www12.statcan.gc.ca/census-recensement/2016/as-sa/98-200-x/2016029/98-200-x2016029-eng.cfm
2. Consolo, S., Petrillo, U.F.: A framework for the efficient collection of big data from online social networks. In: INCoS 2014, pp. 34–41 (2014)
3. Leung, C.K.: Big data analysis and mining. In: Advanced Methodologies and Technologies in Network Architecture, Mobile Computing, and Data Analytics, pp. 15–27 (2019)
4. Mai, M., Leung, C.K., Choi, J.M.C., Kwan, L.K.R.: Big data analytics of Twitter data and its application for physician assistants: who is talking about your profession in Twitter? In: Data Management and Analysis, pp. 17–32 (2020)
5. Souza, J., Leung, C.K., Cuzzocrea, A.: An innovative big data predictive analytics framework over hybrid big data sources with an application for disease analytics. In: AINA 2020. AISC, vol. 1151, pp. 669–680 (2020)
6. Leung, C.K., Braun, P., Cuzzocrea, A.: AI-based sensor information fusion for supporting deep supervised learning. Sensors 19(6), 1345:1–1345:12 (2019)
7. Lin, J., Liao, S., Leu, F.: A novel bounded-error piecewise linear approximation algorithm for streaming sensor data in edge computing. In: INCoS 2019. AISC, vol. 1035, pp. 123–132 (2019)
8. Amato, F., Cozzolino, G., Moscato, F., Xhafa, F.: Semantic analysis of social data streams. In: INCoS 2018. LNDECT, vol. 23, pp. 59–70 (2018)
9. Leung, C.K., Tanbeer, S.K., Cameron, J.J.: Interactive discovery of influential friends from social networks. Soc. Netw. Anal. Min. 4(1), 154:1–154:13 (2014)
10. Tanbeer, S.K., Leung, C.K., Cameron, J.J.: Interactive mining of strong friends from social networks and its applications in e-commerce. JOCEC 24(2–3), 157–173 (2014)
11. Audu, A.A., Cuzzocrea, A., Leung, C.K., MacLeod, K.A., Ohin, N.I., Pulgar-Vidal, N.C.: An intelligent predictive analytics system for transportation analytics on open data towards the development of a smart city. In: CISIS 2019. AISC, vol. 993, pp. 224–236 (2019)
12. Leung, C.K., Braun, P., Hoi, C.S.H., Souza, J., Cuzzocrea, A.: Urban analytics of big transportation data for supporting smart cities. In: DaWaK 2019. LNCS, vol. 11708, pp. 24–33 (2019)
13. Kolici, V., Xhafa, F., Barolli, L., Lala, A.: Scalability, memory issues and challenges in mining large data sets. In: INCoS 2014, pp. 268–273 (2014)
14. Lakshmanan, L.V.S., Leung, C.K., Ng, R.T.: The segment support map: scalable mining of frequent itemsets. ACM SIGKDD Explor. 2(2), 21–27 (2000)
15. Leung, C.K.: Frequent itemset mining with constraints. In: Encyclopedia of Database Systems, 2nd edn., pp. 1531–1536 (2018)


16. Perner, P.: The study of the internal mitochondrial movement of the cells by data mining with prototype-based classification. In: INCoS 2014, pp. 262–267 (2014)
17. Zhang, J., Li, J.: Retail commodity sale forecast model based on data mining. In: INCoS 2016, pp. 307–310 (2016)
18. Fujihara, A.: Proposing a blockchain-based open data platform and its decentralized oracle. In: INCoS 2019. AISC, vol. 1035, pp. 190–201 (2019)
19. Zhao, K., Khryashchev, D., Vo, H.: Predicting taxi and Uber demand in cities: approaching the limit of predictability. IEEE TKDE (2019). https://doi.org/10.1109/TKDE.2019.2955686
20. Yamaguchi, T., Mansur, A.S., Mine, T.: Prediction of bus delay over intervals on various kinds of routes using bus probe data. In: IEEE/ACM BDCAT 2018, pp. 97–106 (2018)
21. Liu, T., Ma, J., Guan, W., Song, Y., Niu, H.: Bus arrival time prediction based on the k-nearest neighbor method. In: CSO 2012, pp. 480–483 (2012)

Eye Movement Patterns as a Cryptographic Lock

Marek R. Ogiela¹ and Lidia Ogiela²

¹ Cryptography and Cognitive Informatics Research Group, AGH University of Science and Technology, 30 Mickiewicza Ave, 30-059 Kraków, Poland
[email protected]
² Department of Cryptography and Cognitive Informatics, Pedagogical University of Krakow, Podchorążych 2 St., 30-084 Kraków, Poland
[email protected]

Abstract. Eye tracking technologies play an increasing role in advanced vision systems and multimedia applications. Such technologies can also be applied for security purposes, as contactless devices and user interfaces. The high precision of registered eye movements also allows them to be used in cryptographic security systems, in which individual and unique eye movement patterns can serve as a personal key or security lock. In this paper we define such solutions based on these technologies, focused on the creation of personalized security protocols for user authentication.

Keywords: Security protocols · Eye tracking technologies · Cryptographic procedures

1 Introduction

In advanced cryptographic systems it is possible to use different procedures for user authentication. The most common solutions are based on personal keys or CAPTCHA codes, which make it possible to check whether an authenticated person may gain access to data or services, or simply whether he or she is a human being or a computer bot. Additionally, when creating personal authentication protocols, we can consider many specific features or personal parameters like biometric patterns or behavioral features [1, 2]. Registering such parameters requires sensor or vision devices, which can analyze the registered patterns, or simply traditional interfaces for the acquisition of user responses. This paper describes a different approach to user authentication and the creation of a personal security lock, based on the analysis of special eye movements registered with eye tracking glasses and connected with the selection of patterns according to specific questions or semantic meaning [3]. In this solution, eye movements are performed by authorized persons and registered using eye tracking devices. The movement patterns can be user defined, and specific only to particular persons, or can be semantically oriented, where the eye movements should follow a special semantic sequence. These new solutions are described in the next sections.


2 User Specific Eye Movements

Using eye tracking glasses we can register areas and points of user attention. Looking at a test page, it is possible to draw out, using eye movements, any lines or patterns according to user preferences. Authentication protocols can be constructed on such user-defined patterns, entered sequentially during the authentication steps and compared with the stored, hashed patterns in a database. Such authentication protocols have several advantages:

• Authentication is touchless and secure
• It can be performed in real time
• It is mobile, and not dependent on environment or context
• It is fuzzy and robust

The authentication protocol requires only the advance registration of personal patterns characteristic of a particular user, or the registration of distinctive parameters fully describing such a personal pattern [4]. During authentication using eye movement glasses, the particular user should again draw his own pattern by eye, and this should be quite similar to the registered one. Pattern classification methods then allow the unique features of both patterns (one stored in the database, and a second, new one) to be compared and classified as similar or different. In the case of proper recognition, the authentication procedure is successfully completed for the authenticated user. The only limitation of such a procedure is the requirement to create a database containing the unique patterns registered for each user. Figure 1 presents an example of user-specific eye movement patterns that can be applied for authentication purposes.

Fig. 1. Eye tracking device as a drawing tool for a user-specific authentication procedure
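The comparison step can be sketched as a simple fuzzy match between an enrolled gaze trajectory and a freshly drawn one; the paper does not specify the classifier, so the distance measure and tolerance below are illustrative assumptions.

def trajectories_match(enrolled, fresh, tolerance=0.05):
    """Fuzzy comparison of two gaze trajectories, given as equal-length
    lists of normalized (x, y) screen coordinates."""
    if len(enrolled) != len(fresh):
        return False
    # mean Euclidean distance between corresponding gaze points
    total = sum(((ex - fx) ** 2 + (ey - fy) ** 2) ** 0.5
                for (ex, ey), (fx, fy) in zip(enrolled, fresh))
    return total / len(enrolled) <= tolerance

enrolled = [(0.1, 0.1), (0.5, 0.5), (0.9, 0.1)]     # stored pattern
fresh = [(0.12, 0.11), (0.49, 0.52), (0.88, 0.12)]  # newly drawn pattern
print(trajectories_match(enrolled, fresh))          # True: within tolerance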

3 Semantic-Based Eye Movements

The second approach to user authentication in which we can use an eye tracking device is a CAPTCHA-based protocol [5, 6]. In this approach, the authenticated user observes several different visual patterns and has to select only a few of them having a particular meaning. Because such a protocol can be semantically driven, the proper selection may also require choosing patterns in a particular order.


When we consider such a protocol in relation to transformative computing technologies, it can be extended towards considering context parameters or environmental features. Such solutions, similarly to AR systems, which present augmented information, will depend on the observed objects. The most important features of this solution are the following:

• Touchless and secure authentication
• Performed in real time
• Mobile and fully dependent on environment or context
• Dependent on user knowledge and perception abilities

This type of eye tracking authentication does not require registering any user-specific biometric or behavioral patterns, but is instead oriented towards the user's expert knowledge and perception abilities. It also considers environmental parameters.

4 Security of Authentication Based on Eye Movements

Authentication protocols based on the application of eye tracking systems allow secure personal authorization using personal or knowledge-based patterns. Such techniques have security features characteristic of traditional visual CAPTCHA codes. The presented procedures require from the user the ability to understand complex visual patterns, or knowledge from a selected area of expertise [7, 8]. The application of eye tracking systems during the authentication stages may also require cognitive skills, and the ability to move attention and focus it on different objects or environmental elements. It is also possible to apply a specific eye movement pattern for authentication, performed on demand by the user. Eye movement patterns can be used for personal authentication, especially when connected with demands for a proper understanding of the content of visual elements. Computer systems are not capable of quickly evaluating patterns in a semantic way, or of following moving attention along a semantic path [9].

5 Conclusions

In this paper we have presented new applications of eye tracking technologies in the creation of personal authentication protocols. Such protocols allow the definition of user-oriented visual patterns, or can be based on the user's knowledge and expertise in some areas. The application of eye tracking devices makes them fully mobile and allows their use in different situations, considering external features and factors. The security of these procedures depends on specific user abilities, and on the difficulty of semantically evaluating the observed patterns. In most cases it requires specific knowledge about the observed objects. Such contactless and mobile authentication systems can be applied in different situations, e.g. in travel, smart cities, etc. The presented solutions considerably extend standard authentication protocols towards the application of perception features and eye tracking systems, and allow the creation of user-oriented security protocols [10–13].


Acknowledgments. This work has been supported by the AGH University of Science and Technology research Grant No 16.16.120.773. This work has been supported by Pedagogical University of Krakow research Grant No BN.610-29/PBU/2020.

References

1. Ogiela, M.R., Ogiela, L.: On using cognitive models in cryptography. In: IEEE AINA 2016 - The IEEE 30th International Conference on Advanced Information Networking and Applications, Crans-Montana, Switzerland, 23–25 March, pp. 1055–1058 (2016)
2. Ogiela, M.R., Ogiela, L.: Cognitive keys in personalized cryptography. In: IEEE AINA 2017 - The 31st IEEE International Conference on Advanced Information Networking and Applications, Taipei, Taiwan, 27–29 March, pp. 1050–1054 (2017)
3. Ogiela, M.R., Ogiela, L., Ogiela, U.: Biometric methods for advanced strategic data sharing protocols. In: 2015 9th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing IMIS 2015, pp. 179–183 (2015). https://doi.org/10.1109/imis.2015.29
4. Ogiela, L., Ogiela, M.R.: Bio-inspired cryptographic techniques in information management applications. In: IEEE AINA 2016 - The IEEE 30th International Conference on Advanced Information Networking and Applications (AINA 2016), Crans-Montana, Switzerland, 23–25 March, pp. 1059–1063 (2016)
5. Alsuhibany, S.: Evaluating the usability of optimizing text-based CAPTCHA generation. Int. J. Adv. Comput. Sci. Appl. 7(8), 164–169 (2016)
6. Osadchy, M., Hernandez-Castro, J., Gibson, S., Dunkelman, O., Perez-Cabo, D.: No bot expects the DeepCAPTCHA! Introducing immutable adversarial examples, with applications to CAPTCHA generation. IEEE Trans. Inf. Forensics Secur. 12(11), 2640–2653 (2017)
7. Ogiela, U., Ogiela, L.: Linguistic techniques for cryptographic data sharing algorithms. Concurrency Comput. Pract. Exp. 30(3), e4275 (2018). https://doi.org/10.1002/cpe.4275
8. Ogiela, M.R., Ogiela, U.: Secure information splitting using grammar schemes. In: New Challenges in Computational Collective Intelligence, Studies in Computational Intelligence, vol. 244, pp. 327–336 (2009)
9. Ogiela, M.R., Ogiela, U.: Secure information management in hierarchical structures. In: Kim, T.-H., et al. (eds.) AST 2011, CCIS, vol. 195, pp. 31–35 (2011)
10. Ogiela, L., Ogiela, M.R., Ogiela, U.: Efficiency of strategic data sharing and management protocols. In: The 10th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2016), 6–8 July 2016, Fukuoka, Japan, pp. 198–201 (2016). https://doi.org/10.1109/imis.2016.119
11. Menezes, A., van Oorschot, P., Vanstone, S.: Handbook of Applied Cryptography. CRC Press, Waterloo (2001)
12. Schneier, B.: Applied Cryptography: Protocols, Algorithms, and Source Code in C. Wiley (1996)
13. Ancheta, R.A., Reyes Jr., F.C., Caliwag, J.A., Castillo, R.E.: FEDSecurity: implementation of computer vision thru face and eye detection. Int. J. Mach. Learn. Comput. 8, 619–624 (2018)

Evolutionary Fuzzy Rules for Intrusion Detection in Wireless Sensor Networks

Tarek Batiha and Pavel Krömer

VSB - Technical University of Ostrava, 17. listopadu 2172/15, Ostrava, Czech Republic
{tarek.batiha,pavel.kromer}@vsb.cz

Abstract. Next-generation digital services and applications often rely on large numbers of devices connected to a common communication backbone. The security of such massively distributed systems is a major issue, and advanced methods to improve their ability to detect and counter cybernetic attacks are needed. Evolutionary algorithms can automatically evolve and optimize sophisticated intrusion detection models suitable for different applications. In this work, a hybrid evolutionary-fuzzy classification and regression algorithm is used to evolve detectors for several types of intrusions in a wireless sensor network. The ability of genetic programming and differential evolution to construct and optimize intrusion detectors for wireless sensor networks is evaluated on a recent intrusion detection data set capturing malicious activity in a wireless sensor network.

1 Introduction

Genetic and evolutionary methods play an important role in many application domains, including computer security. Genetic fuzzy systems form a wide family of fuzzy systems evolved (constructed and/or optimized) by genetic and evolutionary algorithms with a specific optimization objective or objectives in mind [7]. The symbolic nature of evolutionary methods (e.g., genetic programming) makes them suitable candidates for the search for interpretable fuzzy models and rules. The security of massively distributed infrastructures, including wireless sensor networks, is becoming a crucial aspect of their development, deployment, and operation [3,6]. Wireless sensor networks comprise many individual devices that are connected to a wireless network and communicate to accomplish a common goal. Intrusion detection is an important field of computer security that aims at the identification (detection) and prevention of security intrusions, i.e., malicious activities whose main goal is the unauthorized use of the target devices [5,23]. Typical security intrusions include, for example, unauthorized access, identity and data theft, denial of service, etc. In this work, an evolutionary fuzzy model, called evolutionary fuzzy rule [14], is used as a machine-learned single-attack classifier for intrusion detection in


wireless sensor networks. An individual instance of the model is evolved for each attack type and used to classify normal and malicious traffic in a simulated wireless sensor network.

2 Intrusion Detection

In computer security, intrusion detection is a wide field of activity that deals with security intrusions, i.e., security and privacy violations [5] originating from the unauthorized use of computer networks and connected devices [23]. There are several types of malicious activities (attacks) that can be classified as security intrusions. They include information gathering (packet sniffing, keylogging), system exploits, privilege escalation, access to restricted resources, and so on [23]. Due to the variety and novelty of attacks and the flaws in the development and implementation of complex systems, it is usually not possible to completely secure computer systems (networks) for their entire lifetime by design. In reality, such systems are prone to design flaws, and no prevention measures can suppress user errors and system abuse. Intrusion detection systems (IDSs) are software and/or hardware tools developed to mitigate malicious activities and in particular security intrusions. They continuously monitor the target systems and identify system states associated with security intrusions [23]. An IDS has to monitor the status of the target system, collect the data that describes its properties in time, and analyze its behaviour to detect anomalies that might be related to security breaches. The monitoring is achieved by a set of sensors that collect the data from available data sources and forward it to analyzers that compare it with models of normal behaviour and models of known types of security intrusions [5,23]. The analysis of system behaviour usually exploits patterns (signatures) of malicious data and heuristic characteristics (attack rules) of known types of security intrusions. Alternatively, it attempts to detect discrepancies between the current behaviour and the known behaviour of legitimate users by various anomaly (outlier) detection algorithms [23]. Most often, real-world IDSs use a combination of intrusion modelling and anomaly detection. Fuzzy rule-based systems [8,11], fuzzy clustering [21], and evolutionary fuzzy systems [8] have all been used to select suspicious traffic [11] or classify attacks [8] in security applications. Evolutionary methods were used to evolve the structure [8,22,27] or parameters [24] of classifiers, evolve strategies of cybernetic attackers [17], or select the input features best suited for the intrusion classification exercise [28]. The security of WSNs is linked to the trade-off between the complexity (resource consumption) and the detection accuracy of the models [11,22]. Single and multi-attack models [22], as well as models able to discover unknown (zero-day) intrusions [15], have been developed on the basis of nature-inspired and fuzzy methods. The models are usually evaluated on publicly available datasets (not always specific to WSNs) [3,4,11,28], through computer simulations [17,24,27], and only rarely by real-world experiments [21].
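As a rough illustration of this hybrid analysis, the minimal Python sketch below combines a signature check against known attack patterns with a simple anomaly (outlier) score; the signatures, feature names, baseline data, and threshold are purely illustrative assumptions, not part of any system described in this paper.

import statistics

SIGNATURES = [lambda r: r["adv_sent"] > 50,      # hypothetical flooding-like pattern
              lambda r: r["fwd_ratio"] == 0.0]   # hypothetical blackhole-like pattern

def anomaly_score(record, baseline):
    # z-score of the record's traffic volume against normal behaviour
    mu, sigma = statistics.mean(baseline), statistics.stdev(baseline)
    return abs(record["traffic"] - mu) / sigma

def analyze(record, baseline, threshold=3.0):
    # signature matching first, then anomaly (outlier) detection
    if any(sig(record) for sig in SIGNATURES):
        return "intrusion (signature match)"
    if anomaly_score(record, baseline) > threshold:
        return "intrusion (anomaly)"
    return "normal"

baseline = [100, 98, 103, 97, 101, 99]
print(analyze({"adv_sent": 3, "fwd_ratio": 0.9, "traffic": 250}, baseline))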


This work contributes to the research on intrusion detection for WSNs by the design of a collective evolutionary-fuzzy intrusion detector. It combines the modelling power, lightweight nature, and transparency of fuzzy systems with the ability of genetic programming to adapt and learn in order to establish a useful intrusion detection system.

3 Wireless Sensor Networks

Wireless sensor networks (WSNs) are distributed cyberphysical systems that are usually spread over large areas and comprise many cooperating and communicating devices (nodes). The main goal of a typical WSN is environmental monitoring, sensing of the variables of interest, and data transmission [6]. Individual WSN nodes observe target environmental properties such as temperature, humidity, and solar radiation. At specified time steps, they measure the values of the properties in their vicinity, monitor their change in time, and transmit the gathered data to the outside world (e.g., the Internet). WSN nodes are usually embedded devices with significant constraints such as low computing power, small operating memory, and a limited source of energy [2]. One of the main concerns of all algorithms and applications for WSNs is their computational complexity and the associated energy budget. WSN nodes usually carry limited and often irreplaceable sources of energy (e.g., batteries). Because of that, one of the main aims of WSN operations is to balance the quality of service and energy consumption [9]. WSNs often have built-in mechanisms that allow the users to choose between increased lifetime (lower sampling frequency, limited data throughput, longer transmission delays) and more intense operations (frequent sampling, instant data transmissions). Security is a major aspect of all types of WSNs regardless of their primary application [6,16,18]. The limited computing power and memory and the energy constraints of typical sensor nodes make it hard or even impossible to implement traditional computer security mechanisms [18]. On the other hand, the security requirements of mission-critical WSN applications such as surveillance [6,9], industrial control and management, environmental monitoring, and, e.g., healthcare [9], are exceptionally high. The data processed and transmitted by WSNs needs to be confidential (not disclosed to unauthorized parties), have integrity (be reliable, authentic, and not tampered with), and be available as soon as possible [18]. Attackers aim at compromising different aspects of the services and data provided by the networks. In response, cybersecurity measures including cryptography, secure routing, and intrusion detection are developed specifically for WSNs [6]. However, traditional intrusion detection techniques are difficult to implement in WSNs due to the constraints of the network nodes. The main goal of intrusion detection in WSNs is the identification of malicious behaviour that violates the security rules of the system. There are several usual architectures of IDSs for WSNs: standalone IDSs (each WSN node has its independent IDS), distributed and cooperative IDSs (all nodes collaborate on a global intrusion detection process), and hierarchical IDSs (each


WSN layer in a multilayer network has its own IDS). In general, they focus on the detection of abnormal behaviour of users, misuse of network resources, and irregularities in the operations of systems and protocols that could indicate an intrusion [6]. One important aspect of intrusion detection in WSNs is the knowledge of the routing algorithm. Attackers exploit routing algorithms, e.g., the Low-Energy Adaptive Clustering Hierarchy (LEACH) algorithm. They compromise its individual steps by, e.g., false CH advertisements, disturbing the TDMA schedule, improper data forwarding (discarding), and so on. Intrusion detection systems, on the other hand, can take advantage of the known behaviour associated with LEACH to build models of legitimate operations and compare them with the real operations of the nodes and the networks.

4 Evolutionary Fuzzy Rules

Evolutionary fuzzy rules (EFRs) [12] were developed as a hybrid soft computing classification and regression model based on the principles of fuzzy information retrieval and evolutionary computation. Fuzzy set theory provides the theoretical background behind fuzzy information retrieval and, therefore, behind evolutionary fuzzy rules, too. In EFRs, the general classification and/or regression task is associated with the evaluation of a fuzzy search expression (the EFR model, called a fuzzy rule) over the input data. A fuzzy rule is formulated as an extended Boolean search expression [19]. It consists of weighted input terms and fuzzy operators that aggregate them into a special kind of hierarchical fuzzy tree. The input terms are mapped to the attributes (features) of the processed data. The resulting fuzzy tree consists of inner nodes (operator nodes) and leaves (terminal nodes). Three types of terminal nodes are used in a fuzzy rule: 1) a feature node represents the name of an input feature (variable); 2) a past feature node defines a requirement on a certain feature in a previous data record; 3) a past output node puts a requirement on a previous output of the predictor. Different node types are useful for different tasks. The feature node is the basic element of the model. It represents an entry point of the model at which the weighted value of an attribute (feature) of the current data record is read and evaluated. The past feature node introduces the notion of sequences to EFRs and is required for the analysis of sequential (stream) data. The past output node brings feedback to EFRs. The operator nodes are used to combine other nodes into sophisticated classification and regression trees by means of binary fuzzy operators. The operators currently supported by EFRs are and, or, not, prod, and sum. They are associated with fuzzy set operations and evaluated using the standard t-norm and t-conorm, the fuzzy complement, and the product t-norm and t-conorm [29]. Genetic programming (GP) [1] and differential evolution (DE) [20] are used in EFRs as problem-independent machine learning strategies for structure learning and fine-tuning (parameter optimization) of EFR models. The EFRs
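To make the node types and operators concrete, the following is a minimal Python sketch of evaluating one such fuzzy tree over a single data record. The feature names, weights, the clipping-based weighting scheme, and the example rule are illustrative assumptions rather than the exact model of the paper.

def feature(name, weight):
    # Weighted feature node: reads an attribute of the current record
    # (here, one simple weighting scheme: clipping the value by the weight).
    return lambda record: min(weight, record[name])

def op_and(a, b):
    # Standard t-norm (minimum) for the 'and' operator.
    return lambda record: min(a(record), b(record))

def op_or(a, b):
    # Standard t-conorm (maximum) for the 'or' operator.
    return lambda record: max(a(record), b(record))

def op_not(a):
    # Standard fuzzy complement.
    return lambda record: 1.0 - a(record)

# Hypothetical rule over normalized WSN traffic features:
# (adv_messages:0.9 and not data_sent:0.4) or rank:0.7
rule = op_or(op_and(feature("adv_messages", 0.9),
                    op_not(feature("data_sent", 0.4))),
             feature("rank", 0.7))

record = {"adv_messages": 0.8, "data_sent": 0.1, "rank": 0.3}
score = rule(record)          # fuzzy degree of membership in 'attack'
print("attack" if score >= 0.5 else "normal", score)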


are first evolved by the GP. For the GP, an EFR is encoded into a linear chromosome composed of a series of instructions in reverse Polish notation. The instructions represent the operations the model performs on the input data and are interpreted by a virtual stack machine. The instruction set contains instructions corresponding to the operations of EFR nodes. Using this approach, an entire EFR, composed of n instructions, can be stored in a single linear chromosome. The length of the chromosomes depends on the structure and size of the encoded EFRs and may vary between the individuals in the population. The GP operators were implemented with respect to the semantics of the EFR and are applied directly to the chromosomes. Mutation is implemented by a stochastic application of the following operations: 1) removal of a randomly selected subtree, 2) replacement of a randomly selected node by a new randomly generated subtree, 3) replacement of a randomly selected node by another compatible node, and 4) a combination of the above. The linear structure of EFR chromosomes also allows the use of traditional recombination strategies with only minor modifications. A modified one-point crossover, implemented in a way that maintains the validity of the encoded EFRs, is used. The crossover operator selects a random gene, x, from an EFR encoded by the first parent chromosome. Then, a random gene compatible with x is selected from the EFR encoded by the second parent chromosome. Finally, the marked parts of the parent chromosomes (single genes or complete subtrees) are exchanged to form two new offspring chromosomes. The fitness function employed by EFRs can be an arbitrary similarity, error, or goodness-of-fit measure. Although the GP is able to evolve good EFRs [12,13], the nature of the models allows further fine-tuning and optimization. The GP learns the structure and the parameters of the models at the same time. This is, however, a very complex task. When the GP terminates and the structure of an EFR is fixed, it can be seen as a model with a number of real-valued parameters, and arbitrary real-parameter optimization methods can be employed to optimize the EFR discovered by the GP. In this work, differential evolution is used to optimize the learned EFRs and to constitute a 2-stage nature-inspired learning and optimization pipeline that produces accurate intrusion detection models.
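As an illustration of this encoding, the following is a minimal Python sketch that decodes a linear chromosome written in reverse Polish notation with a virtual stack machine. The concrete instruction format, weighting scheme, and feature names are assumptions made for the sake of the example, not the exact representation used by the authors.

def evaluate_chromosome(chromosome, record):
    stack = []
    for instr in chromosome:
        if instr[0] == "feat":                  # ("feat", name, weight)
            _, name, w = instr
            stack.append(min(w, record[name]))  # weighted feature node
        elif instr[0] == "and":
            b, a = stack.pop(), stack.pop()
            stack.append(min(a, b))             # standard t-norm
        elif instr[0] == "or":
            b, a = stack.pop(), stack.pop()
            stack.append(max(a, b))             # standard t-conorm
        elif instr[0] == "not":
            stack.append(1.0 - stack.pop())     # fuzzy complement
        elif instr[0] == "prod":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)                 # product t-norm
    return stack.pop()

# Chromosome for the rule (x:0.9 and y:0.4) or z:0.7 in postfix order:
chromosome = [("feat", "x", 0.9), ("feat", "y", 0.4), ("and",),
              ("feat", "z", 0.7), ("or",)]
print(evaluate_chromosome(chromosome, {"x": 0.8, "y": 0.6, "z": 0.2}))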

5 Evolutionary Fuzzy Rules for Intrusion Detection in Wireless Sensor Networks

Intrusion detection models for WSNs ought to have specific properties, including high accuracy, low computational (and energy) costs, flexibility, and transparency. EFRs satisfy these requirements well and are proposed in this work as single-attack intrusion detection models for use in wireless sensor networks. Their ability to learn specific types of attacks is experimentally evaluated with the help of an intrusion detection data set developed recently for intrusion detection in WSNs.


WSN-DS is a data set describing several types of attacks in a simulated wireless sensor network [3]. It contains 374,661 records with 23 nominal and numerical features divided into 5 categories (4 types of simulated security intrusions and normal network traffic). The data set consists of recorded network traffic in a simulated WSN composed of 100 sensor nodes and one base station (BS). The network uses the LEACH routing protocol [10] with the number of clusters set to 5. All simulated attacks exploit the properties of the routing protocol, and the attributes that characterise the status of each network node are based specifically on LEACH. WSN-DS contains four types of simulated security intrusions [3]. In the course of a blackhole attack, the attacker pretends to be a cluster head (CH). The false CH captures, stores, and discards all received data and does not forward it to the base station. During a grayhole attack, the attacker assumes a similar strategy of false CH advertisement but drops only several randomly selected packets and forwards the rest to the base station. The flooding attack exploits several weak spots of the LEACH protocol. The attacker pretends to be the CH by sending a huge number of LEACH advertisement messages. This alone increases the power consumption of the nodes, but the rogue node intercepts network traffic, too. The scheduling attack destroys the time-division multiple-access (TDMA) medium access schedule and makes multiple nodes transmit data at the same time. The packet collisions caused by this behaviour result in data losses. In total, WSN-DS is composed of 5 classes of network traffic. However, as is usual for intrusion detection data sets [25,26], the sizes of the classes are highly imbalanced. A detailed description of the structure of the data set is summarized in Table 1. It shows that the vast majority of the records correspond to normal traffic and the individual attack classes represent only a very small portion of the whole data set.

Table 1. Traffic classes in WSN-DS

Traffic class       Num. of records   Percentage
Normal              340066            90.77%
Blackhole attack    10049             2.68%
Grayhole attack     14596             3.9%
Flooding attack     3312              0.88%
Scheduling attack   6638              1.77%

To overcome the different sizes of the classes, the data set was split into smaller sets, each representing one type of WSN attack and examples of the remaining network traffic (including other attacks). They were created using a randomized but stratified sampling strategy, i.e., the sizes of the classes (attack/no attack) were the same in each set. The attack-specific data sets were further split


into training (60%) and test (40%) subsets and used to train and evaluate the models as specialized single-attack classifiers. Finally, the train and test data sets were complemented by a rest data set, which was the union of the test set with all the remaining records from WSN-DS.
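A minimal sketch of this balanced sampling and splitting procedure is shown below; it assumes pandas and scikit-learn, a hypothetical file name, and hypothetical column names ('class' and the derived 'label'), so it illustrates the procedure rather than reproducing the exact preprocessing of the paper.

import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("wsn_ds.csv")          # hypothetical path to WSN-DS
attack = "Blackhole"

positives = df[df["class"] == attack]
# Sample an equal number of remaining records (normal + other attacks).
negatives = df[df["class"] != attack].sample(n=len(positives), random_state=1)

balanced = pd.concat([positives, negatives])
balanced["label"] = (balanced["class"] == attack).astype(int)

# 60/40 stratified split of the balanced attack-specific data set.
train, test = train_test_split(balanced, train_size=0.6, random_state=1,
                               stratify=balanced["label"])

# 'rest' = test set plus every WSN-DS record not used in the balanced set.
rest = pd.concat([test, df.drop(balanced.index)])
rest["label"] = (rest["class"] == attack).astype(int)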

5.1 Evolution of EFRs for Intrusion Detection

The attack-specific data sets were used to evolve EFRs as single-attack intrusion detection models. The evolution was carried out for each attack class independently. The genetic programming, used at the first stage of the process to learn the EFR structure and parameters, was a steady-state GP with a population of 100 candidate solutions and a generation gap of 2 individuals (i.e., offspring individuals competed with their parents for survival in the population). It used a semi-elitist selection scheme, under which one parent was selected using the traditional roulette wheel selection mechanism and one from an elite set of best-ranked solutions. The crossover probability was set to 0.8 and the mutation probability to 0.02. The maximum number of generations was set to 10,000. The GP also used a simple restarting mechanism to avoid premature convergence. When the absolute difference between the fitness of the worst and the best EFR in the population was lower than a tolerance, eps = 1e-6 (i.e., convergence was detected), 50% of the rules in the population were replaced by new randomly generated ones.

At the second stage of EFR learning, differential evolution was used to fine-tune the parameters (feature and operator weights) of the best EFR found by the GP. The DE was the /DE/rand/1 version of the algorithm, with the crossover probability, C, and the scaling factor, F, both set to 0.9. The optimization algorithm used a population of 100 candidate parameter vectors and was executed for 1,000 generations. The dimension of the optimization problem associated with EFR tuning depended on the results of the GP and, in the case of the single-attack intrusion detectors evolved in this work, ranged from 13 to 99 parameters. The combination of the GP and DE facilitates a 2-stage evolutionary learning and optimization pipeline that achieves accurate intrusion detection models.

The fitness function, used by both the GP and the DE, was the accuracy, A, defined as the ratio of correctly classified records to all records in the data set,

A = (TP + TN) / (TP + TN + FP + FN),    (1)

where TP, TN, FP, and FN are true positives (attacks classified as attacks), true negatives (non-attacks classified as non-attacks), false positives (non-attacks classified as attacks), and false negatives (attacks classified as non-attacks), respectively. The goal of both algorithms was to evolve models with maximum accuracy. Albeit prone to a bias towards the class with the majority of records, such a fitness is sufficient for the proposed evolution of EFRs for intrusion detection because of the stratified sampling of the training data described in detail above.
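The fine-tuning stage can be sketched as follows: a minimal /DE/rand/1 loop with the stated population size, generation count, and C = F = 0.9, where the fitness callable wraps the accuracy of Eq. (1) for a fixed EFR structure. The model evaluator and all names are assumptions standing in for the actual implementation (for instance, classic DE also excludes the current individual from the random triple, which this sketch omits for brevity).

import numpy as np

def accuracy(weights, X, y, evaluate_efr):
    # Fitness: (TP + TN) / all records, i.e., Eq. (1), for a fixed structure.
    preds = np.array([evaluate_efr(weights, x) >= 0.5 for x in X])
    return np.mean(preds == y)

def de_rand_1(fitness, dim, pop_size=100, gens=1000, F=0.9, C=0.9, seed=1):
    rng = np.random.default_rng(seed)
    pop = rng.random((pop_size, dim))        # candidate parameter vectors
    fit = np.array([fitness(ind) for ind in pop])
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = pop[rng.choice(pop_size, 3, replace=False)]
            mutant = a + F * (b - c)          # rand/1 mutation
            cross = rng.random(dim) < C       # binomial crossover mask
            cross[rng.integers(dim)] = True   # keep at least one mutant gene
            trial = np.where(cross, mutant, pop[i])
            f = fitness(trial)
            if f >= fit[i]:                   # greedy one-to-one selection
                pop[i], fit[i] = trial, f
    return pop[np.argmax(fit)], fit.max()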

6 Experiments

The ability of the GP and DE to find good EFRs for intrusion detection in WSNs was studied in a series of computational experiments. A version of WSN-DS by Almomani et al. [3] was used as the source of data. It contains 18 features (2 of them identifying the source node) and 374,661 records of network traffic. For the experiments, the data features were encoded as numerical values, normalized, and split into four attack-specific data sets using the principles outlined in Sect. 5. The data sets were used to learn, optimize, and validate intrusion detection models for the different types of attacks. Because of the stochastic nature of the used algorithms, the evolution was executed 31 times for each attack class, independently. The accuracy of the evolved models is summarized in Table 2. The table shows the accuracy of the worst, the average (mean), and the best intrusion detection model found for each attack class during the 31 independent optimization runs. It can be immediately seen that the accuracies of the mean and the best models were between 0.8998 and 1.00 (i.e., between 89.98% and 100%) on the training data set. The table also shows that the models reached a high classification accuracy of 88.49% to 99.96% on the test data set. The results on the rest data set are good, too, but biased by the unequal size of the classes (selected attack vs. other traffic). A statistical analysis of the results in the form of boxplots is shown in Fig. 1. The plots show that models with an accuracy lower than 0.90 (for the grayhole attack, 0.85) are outliers rather than common results.

Fig. 1. Accuracy of intrusion detection models evolved for each attack type in the 31 independent runs.


Table 2. Accuracy of the worst, the average, and the best evolved intrusion detection models.

Data   Attack     Worst    Average   Best
Train  Blackhole  0.9502   0.9540    0.9907
       Flooding   0.9995   0.9997    1.0000
       Grayhole   0.7269   0.8998    0.9779
       TDMA       0.9328   0.9601    0.9656
Test   Blackhole  0.9470   0.9511    0.9907
       Flooding   0.9989   0.9994    0.9996
       Grayhole   0.5954   0.8849    0.9776
       TDMA       0.9083   0.9596    0.9682
Rest   Blackhole  0.9834   0.9853    0.9985
       Flooding   0.9934   0.9981    0.9991
       Grayhole   0.9488   0.9656    0.9782
       TDMA       0.9733   0.9960    0.9980

A more detailed insight into the evolution of EFRs for intrusion detection in terms of classification accuracy is provided in Table 3. The table shows that, for the majority of attacks, the average and the best evolved models provided very good classification with a high number of true positives and true negatives and a low number of false positives and false negatives. Such behaviour was observed not only on the balanced training and test data sets, but also on the highly imbalanced rest data set. The only exception was the grayhole attack, for which even the best evolved model yielded a relatively high number of false positives (7,853 false positives compared to 14,277 true positives).

Table 3. Classification of attacks provided by the worst, the average, and the best evolved intrusion detection models.

Data   Attack     Worst model (TP/TN/FP/FN)       Average model (TP/TN/FP/FN)                      Best model (TP/TN/FP/FN)
Train  Blackhole  6035/5423/594/6                 6038.387/5464.516/552.484/2.613                  6005/5941/76/36
       Flooding   1988/1984/0/2                   1988.645/1984/0/1.355                            1990/1984/0/0
       Grayhole   5349/7382/1434/3350             8495.677/7264.613/1551.387/203.323               8664/8464/352/35
       TDMA       3442/3987/1/534                 3678.323/3967.581/20.419/297.677                 3708/3982/6/268
Test   Blackhole  3993/3620/411/15                4002.548/3643.516/387.484/5.452                  3976/3988/43/32
       Flooding   1321/1326/2/1                   1320.968/1327.484/0.516/1.032                    1321/1328/0/1
       Grayhole   1192/5760/20/4705               5271.097/5062.194/717.806/625.903                5767/5649/131/130
       TDMA       2175/2648/0/487                 2455.839/2639.452/8.548/206.161                  2494/2647/1/168
Rest   Blackhole  10014/358429/6183/35            10038/359119.516/5492.484/11                     9980/364121/491/69
       Flooding   3311/368869/2480/1              3309.516/370637.387/711.613/2.484                3309/371000/349/3
       Grayhole   13614/341870/18195/982          13061.613/348719.290/11345.710/1534.387          14277/352212/7853/319
       TDMA       6213/358456/9567/425            6099.806/367048.387/974.613/538.194              6186/367734/289/452


Nevertheless, the results confirm the ability of the EFRs to serve as single–attack intrusion detection models and the ability of evolutionary methods to learn them from data.

7 Conclusions

This work studied a flexible evolutionary-fuzzy classification algorithm, evolutionary fuzzy rules, in the role of single-attack intrusion detection models for wireless sensor networks. The soft computing method was used as a lightweight and transparent yet accurate classifier that was learned and optimized from WSN traffic data. The experiments used a realistic intrusion detection data set, WSN-DS, describing attacks typical for the environment of wireless sensor networks running the low-energy adaptive clustering hierarchy routing protocol. The computational experiments showed that accurate EFRs for intrusion classification can be obtained by a 2-stage learning and optimization process based on genetic programming and differential evolution. The evolved intrusion detection models achieve a high classification accuracy, between 88.49% and 99.96% on the test and rest data sets, and produce only a small number of false positive and false negative intrusion classifications. In the case of the grayhole attack, the only attack for which the number of misclassifications (false positives) was higher, the EFR can at least be used as a detector that selects suspected network traffic for further analysis. Altogether, the results of the computational experiments show that EFRs have a good ability to detect intrusions in wireless sensor networks.

Acknowledgements. This work was supported by the Technology Agency of the Czech Republic in the frame of the project no. TN01000024 "National Competence Center - Cybernetics and Artificial Intelligence", and by the projects of the Student Grant System no. SP2020/108 and SP2020/161, VSB - Technical University of Ostrava, Czech Republic.

References
1. Affenzeller, M., Winkler, S., Wagner, S., Beham, A.: Genetic Algorithms and Genetic Programming: Modern Concepts and Practical Application. Chapman & Hall/CRC, Boca Raton (2009)
2. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor networks: a survey. Comput. Netw. 38(4), 393–422 (2002)
3. Almomani, I., Al-Kasasbeh, B., AL-Akhras, M.: WSN-DS: a dataset for intrusion detection systems in wireless sensor networks. J. Sensors 2016 (2016). https://doi.org/10.1155/2016/4731953
4. Batiha, T., Prauzek, M., Krömer, P.: Intrusion detection in wireless sensor networks by an ensemble of artificial neural networks. In: Czarnowski, I., Howlett, R.J., Jain, L.C. (eds.) Intelligent Decision Technologies 2019, pp. 323–333. Springer, Singapore (2020). https://doi.org/10.1007/978-981-13-8311-3_28
5. Bishop, M.: Computer Security: Art and Science. Addison-Wesley, Boston (2003)
6. Cayirci, E., Rong, C.: Security in Wireless Ad Hoc and Sensor Networks. Wiley, Hoboken (2008)
7. Cordón, O., Gomide, F., Herrera, F., Hoffmann, F., Magdalena, L.: Ten years of genetic fuzzy systems: current framework and new trends. Fuzzy Sets Syst. 141(1), 5–31 (2004). https://doi.org/10.1016/S0165-0114(03)00111-8
8. Elhag, S., Fernández, A., Alshomrani, S., Herrera, F.: Evolutionary fuzzy systems: a case study for intrusion detection systems, pp. 169–190. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-91341-4_9
9. Fahmy, H.: Wireless Sensor Networks: Concepts, Applications, Experimentation and Analysis. Signals and Communication Technology. Springer, Singapore (2016)
10. Heinzelman, W.R., Chandrakasan, A., Balakrishnan, H.: Energy-efficient communication protocol for wireless microsensor networks. In: Proceedings of the 33rd Annual Hawaii International Conference on System Sciences, vol. 2, p. 10 (2000). https://doi.org/10.1109/HICSS.2000.926982
11. Islabudeen, M., Kavitha Devi, M.K.: A smart approach for intrusion detection and prevention system in mobile ad hoc networks against security attacks. Wireless Pers. Commun. (2020). https://doi.org/10.1007/s11277-019-07022-5
12. Krömer, P., Owais, S.S.J., Platos, J., Snášel, V.: Towards new directions of data mining by evolutionary fuzzy rules and symbolic regression. Comput. Math. Appl. 66(2), 190–200 (2013). https://doi.org/10.1016/j.camwa.2013.02.017
13. Krömer, P., Platos, J.: Simultaneous prediction of wind speed and direction by evolutionary fuzzy rule forest. In: International Conference on Computational Science, ICCS 2017, Zurich, Switzerland, 12–14 June 2017, pp. 295–304 (2017)
14. Kromer, P., Platos, J., Snasel, V., Abraham, A.: Fuzzy classification by evolutionary algorithms. In: 2011 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 313–318 (2011). https://doi.org/10.1109/ICSMC.2011.6083684
15. Kumar, S., Dutta, K.: Intrusion detection in mobile ad hoc networks: techniques, systems, and future challenges. Secur. Commun. Netw. 9(14), 2484–2556 (2016). https://doi.org/10.1002/sec.1484
16. Liu, D., Ning, P.: Security for Wireless Sensor Networks. Advances in Information Security. Springer, New York (2010)
17. Mrugala, K., Tuptuk, N., Hailes, S.: Evolving attackers against wireless sensor networks using genetic programming. IET Wirel. Sensor Syst. 7(4), 113–122 (2017). https://doi.org/10.1049/iet-wss.2016.0090
18. Oreku, G., Pazynyuk, T.: Security in Wireless Sensor Networks. Risk Engineering. Springer, Cham (2016)
19. Pasi, G.: Fuzzy sets in information retrieval: state of the art and research trends. In: Bustince, H., Herrera, F., Montero, J. (eds.) Fuzzy Sets and Their Extensions: Representation, Aggregation and Models. Studies in Fuzziness and Soft Computing, vol. 220, pp. 517–535. Springer, Heidelberg (2008)
20. Price, K.V., Storn, R.M., Lampinen, J.A.: Differential Evolution: A Practical Approach to Global Optimization. Natural Computing Series. Springer, Heidelberg (2005)
21. Qu, H., Lei, L., Tang, X., Wang, P.: A lightweight intrusion detection method based on fuzzy clustering algorithm for wireless sensor networks. Adv. Fuzzy Syst. 2018 (2018). https://doi.org/10.1155/2018/4071851
22. Sen, S., Clark, J.A.: Evolutionary computation techniques for intrusion detection in mobile ad hoc networks. Comput. Netw. 55(15), 3441–3457 (2011). https://doi.org/10.1016/j.comnet.2011.07.001
23. Stallings, W., Brown, L.: Computer Security: Principles and Practice, 4th edn. Pearson, London (2018)
24. Stehlik, M., Matyas, V., Stetsko, A.: Attack detection using evolutionary computation, pp. 99–129. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-47715-2_5
25. Tan, X., Su, S., Huang, Z., Guo, X., Zuo, Z., Sun, X., Li, L.: Wireless sensor networks intrusion detection based on SMOTE and the random forest algorithm. Sensors (Basel, Switzerland) 19(1), 203 (2019). https://doi.org/10.3390/s19010203
26. Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the KDD Cup 99 data set. In: 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–6 (2009). https://doi.org/10.1109/CISDA.2009.5356528
27. Bapuji, V., Manjula, B., Srinivas Reddy, D.: Soft computing technique for intrusion detection system in mobile ad hoc networks. In: Soft Computing in Wireless Sensor Networks, pp. 95–113. Chapman and Hall/CRC, New York (2018). https://doi.org/10.1201/9780429438639
28. Xue, Y., Jia, W., Zhao, X., Pang, W.: An evolutionary computation based feature selection method for intrusion detection. Secur. Commun. Netw. 2018 (2018). https://doi.org/10.1155/2018/2492956
29. Zadeh, L.A.: Fuzzy sets. Inf. Control 8, 338–353 (1965)

Blockchain Architecture for Secured Inter-healthcare Electronic Health Records Exchange

Oluwaseyi Ajayi, Meryem Abouali, and Tarel Saadawi

City University of New York, City College, New York, NY 10031, USA
{Oajayi000,maboual000}@citymail.cuny.edu, [email protected]

Abstract. In this on-going research, we propose a blockchain-based solution that facilitates a scalable and secured inter-healthcare EHR exchange. The healthcare systems maintain their records on separate blockchain networks that are independent of each other. The proposed architecture can detect and prevent malicious activities on both stored and shared EHRs, from either outsider or insider threats. It can also verify the integrity and consistency of EHR requests and replies from other healthcare systems and present them in a standard format that can be easily understood by different healthcare nodes. In the preliminary results, we evaluate the security of the architecture against frequently encountered outsider and insider threats within a healthcare system. The results show that the architecture detects and prevents outsider threats from uploading compromised EHRs into the blockchain and also prevents unauthorized retrieval of patients' information.

1 Introduction

Recently, the rapid increase in cyberattacks launched at healthcare systems has become a significant concern. In 2019, over 572 recorded data breaches in the U.S. health care industry exposed over 41 million patient records, and this number is estimated to jump by 60% in 2020 [1]. The effects of these cyberattacks are estimated to cost the industry about $1.4 billion a year. Although ransomware attacks account for about 58% of the total breaches, staff members inside healthcare organizations were responsible for about 9.2% of the data breaches in 2019 [2]. Due to the prevalence of attacks on patient records, there is an urgent need to protect and secure both stored data and data exchanged among different healthcare systems, especially now that healthcare systems are proposing more robust interoperability. One of the significant techniques for protecting Electronic Healthcare Records is the use of a firewall [3–7]. [3] implements the firewall to serve as an anomaly-based intrusion detection system (IDS). In the implementation, the firewall is either configured as a packet filtering firewall or a stateful inspection firewall. The authors in [8] put forward encryption as a way of ensuring the security of EHRs during the exchange process. This approach was designed under the Health Insurance Portability and Accountability Act (HIPAA) to secure EHRs when viewed by patients or when creating, receiving, maintaining, or transmitting Patient Health


Information (PHI) by mobile devices. Despite the success of these approaches, malicious intruders still find ways to subvert these protection systems and gain unauthorized access to EHRs. Healthcare providers believe that their data is secure as long as it is encrypted. Although encryption guarantees the confidentiality of such data, its consistency and integrity are not guaranteed. [9] proposed a message authentication code (MAC) algorithm for detecting any changes in stored data. Although this approach detects changes in the stored data, it is not practical for extensive data because downloading and calculating the MAC of large files is overwhelming and time-consuming. Another method described in [9] secures cloud data integrity by computing the hash values of every data item in the cloud. This solution is lighter than the first approach in [9]; however, it requires more computational power, especially for massive data; hence, it is not practical. The authors in [10] employ a third party to coordinate the activities of the database. The problem with this approach is that the data is vulnerable to man-in-the-middle or single-point-of-failure attacks. Further research has put forward the application of blockchain technology in handling and protecting personal EHRs and in the interaction of IoT devices with them [10–17]. The approaches described in those works prove effective in handling and protecting stored personal EHRs. However, the proposed solutions cannot be applied to the EHRs exchanged between two or more healthcare systems, as they are primarily focused on securing and protecting personal EHRs; hence the motivation for this work. In this ongoing research, we propose a solution that leverages the tamper-proof ability, data immutability, and distributed ledger ability of blockchain technology to exchange secured EHRs among different healthcare systems without security concerns. The new dimension in the healthcare industry is the interoperability of different healthcare systems. Interoperability is important because a patient's diagnosis and treatment journey can take them from a physician's office to an imaging center, to the operating room of a hospital. Each stop generates a record, such as doctors' notes, test results, medical device data, discharge summaries, or information pertinent to the social determinants of health, which becomes part of a patient's electronic health record in each setting. For the best outcome, this health information needs to be accessible and securely exchanged among all sources that accompany a patient's treatment every step of the way. This will not only strengthen care coordination but also improve safety, quality, and efficiency, and encourage robust health registries. Most of the available solutions use fax messages for EHR exchange between healthcare systems and a cloud database for storing EHRs. The significant problems facing the currently available solutions are: (i) the medium of exchange can be hacked, thereby compromising the integrity and consistency of the shared data; (ii) the database housing the EHRs can be hacked, and data can be manipulated or deleted; (iii) the lack of a universal format for EHR exchange makes it difficult to detect and prevent malicious activities by both insider and outsider attackers. We propose a solution that ensures the privacy and consistency of shared data, presents a standard format for exchanging EHRs, and detects any malicious activities on stored and shared EHRs by either insider or outsider threats.
Hence, the contributions of our work can be summarized as follows:


• We propose a blockchain-based architecture that facilitates a scalable and secured inter-healthcare Electronic Healthcare Records (EHRs) exchange among different healthcare systems.
• The proposed architecture detects and prevents malicious activities on both stored and shared EHRs from either outsider or insider threats.
• The architecture verifies the integrity and consistency of EHR requests and replies from other healthcare systems and presents them in a standard format easily understood by the different healthcare systems.
• The architecture permanently stores the verified EHRs in a distributed fashion and shares them securely with other healthcare systems when requested.
• The proposed architecture is robust to healthcare systems joining and leaving the network in real time.

The remainder of this paper is organized as follows: Sect. 2 discusses the background and related works on the application of blockchain technology in healthcare. Section 3 describes the proposed architecture. Section 4 presents the results, while Sect. 5 presents the conclusions of this paper and possible future works.

2 Background and Related Works

First introduced as the technology behind Bitcoin in 2008 [18], blockchain was implemented to solve the double-spending problem in the Bitcoin cryptocurrency. Since its inception, diverse areas have seen the application of blockchain technology, e.g., health systems [10–17], data integrity security [19], and intrusion detection systems [20–22]. A blockchain is an append-only public ledger that records all transactions that have occurred in the network. Every participant in a blockchain network is called a node. The data in a blockchain is known as a transaction, and transactions are grouped into blocks. Each block depends on the previous one (its parent block), so every block has a pointer to its parent block (a minimal sketch of this structure is given at the end of this section). Each transaction in the public ledger is verified by the consensus of the majority of the system's participants. Once a transaction is verified, it is impossible to mutate or erase the record [18]. Blockchains are broadly divided into two types: public and private [23]. A public blockchain is a permissionless blockchain in which all nodes perform the verification and validation of transactions, e.g., Bitcoin and Ethereum, while private blockchains are permissioned blockchains where only nodes given permission can join and participate in the network, e.g., Hyperledger. Blockchain application in EHRs is still in its inception. However, the potential it offers and the deficiencies and gaps it fills, especially in ensuring the security and confidentiality of health data, make it a prime candidate for adoption in today's healthcare industry. Different kinds of research have been carried out on the application of blockchain technology in securing personal data. The authors in [24] propose a platform that enables a secure and private health record system by separating sensitive and non-sensitive data. The platform serves as a way of sharing a patient's healthcare data with researchers without violating the patient's privacy. The model successfully uses proxy re-encryption techniques to share a patient's sensitive data without revealing the patient's private key and adopts an asymmetric cryptography technique to encrypt


these data while storing them on the cloud. Another similar work in [25] proposes i-Blockchain, which uses a permissioned blockchain to preserve the privacy of the Patient's Health Data (PHD) and improve the individual's experience in data exchange. It allows only qualified individuals and Healthcare Service Providers (HSP) to join the network in order to prevent malicious attacks. It uses cold storage as off-blockchain storage and hot storage as the store where users temporarily put requested data, in addition to a private key and a public key for secure data exchange. Further research in [26] proposes a conceptual design for sharing personal continuous dynamic health data using blockchain technology supplemented by cloud storage. The authors propose using hash pointers to the storage location to solve the problem of sharing large continuous-dynamic data while integrating blockchain and cloud storage. Large data can be stored in an encrypted format on the cloud, and only the transactional data and metadata are saved and shared on the blockchain. The authors in [27] propose a decentralized record management system (MedRec) to manage authentication, confidentiality, accountability, and data sharing of EHRs using blockchain technology. It is a modular design that integrates with patients' local data storage and encourages medical stakeholders to participate as miners. The result shows that the system enables the emergence of big data to empower researchers while engaging patients and providers in the choice of released metadata. [28] proposes a new approach which joins blockchain and cloud computing networks. In their work, the authors employ Amazon Web Services and the Ethereum blockchain to facilitate the semantic-level interoperability of EHR systems without standardized data forms and formatting. The model proposes an interoperable data-sharing framework that includes security through multilayer encryption, data storage through Amazon Web Services, and transfer using the Ethereum blockchain. Despite several lines of blockchain application research in healthcare, most of the available solutions focus on securing and sharing personal EHRs, failing to address the interoperability of different healthcare systems in securely exchanging patient EHRs; hence the motivation for this work. The novelty of our proposed solution is that it facilitates a scalable and secured inter-healthcare EHR exchange while detecting and preventing malicious activities on the data. This novelty distinguishes our work from previous works.
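Before moving to the proposed architecture, the following minimal Python sketch illustrates the append-only, parent-linked ledger structure described at the beginning of this section: each block stores the hash of its parent, so changing any earlier transaction invalidates every later block. The block fields and hashing details are illustrative assumptions, not the Ethereum implementation used later in this paper.

import hashlib, json, time

def make_block(transactions, parent_hash):
    # Block body holds the transactions and a pointer to its parent block.
    block = {"time": time.time(), "tx": transactions, "parent": parent_hash}
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

genesis = make_block(["genesis"], parent_hash="0" * 64)
block1 = make_block(["EHR request tx"], parent_hash=genesis["hash"])

def chain_is_valid(chain):
    # Recompute each hash and check the parent pointers.
    for prev, cur in zip(chain, chain[1:]):
        body = {k: v for k, v in cur.items() if k != "hash"}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if cur["parent"] != prev["hash"] or cur["hash"] != recomputed:
            return False
    return True

print(chain_is_valid([genesis, block1]))   # True until any block is altered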

3 The Proposed Architecture

The proposed architecture, which focuses on the interoperability of different blockchain networks, is implemented on the Ethereum blockchain platform. The Ethereum blockchain features smart contracts; a smart contract is stored on the chain, keeps the agreement among consortium members, and is run by all participants. Figure 1 shows a pictorial representation of the proposed architecture, and Fig. 2 shows the main steps of the architecture.


Fig. 1. The proposed architecture

The architecture comprises different healthcare systems running separate private blockchain networks. In each private network, the computers used for EHRs form the nodes of the blockchain network. Each blockchain network is independent of the others and features a unique smart contract that is written according to the healthcare system's policies and the Health Insurance Portability and Accountability Act (HIPAA). The computers (also known as miners) in each network prepare, submit, and verify all transactions (patients' EHRs). The miners also run the consensus algorithm and thus validate transactions/blocks. In our previous works [29, 30], we described how miners prepare, verify, validate, and retrieve stored transactions from a consortium blockchain network. In those past works, we focused on how cyberattack features are securely distributed among different nodes through the blockchain network. In this paper, miners in a healthcare network prepare, submit, verify, and validate transactions similarly to our previous works; however, unlike the previous work, which uses public-private blockchain networks, we set up a fully private blockchain network for each healthcare system. In the current work, we focus on investigating a secured inter-healthcare EHR exchange. In this implementation, the healthcare systems are assumed to keep and maintain patient health information on separate blockchain networks while we evaluate the security of their interoperability. Each healthcare network contains miners and a smart contract already running on the chain. The miners prepare, submit, and validate transactions, while the smart contract handles the transaction verifications. A transaction can be a patient's health information about to be stored into the blockchain network, a request for patient information from another healthcare system, or a reply that carries the requested patient's information. We described how information can be stored and retrieved within a blockchain network in our previous works; in this paper,


we describe how our architecture carries out the formation of a patient’s information request and reply across different blockchain platforms. The proposed architecture is divided into three main steps, as shown below.

Fig. 2. Steps in the proposed architecture

3.1 Request

The request stage is subdivided into three categories: request formation, request verification, and transaction formation. The request is formed based on the information and permission from the patient. Request formation is necessary when the past medical history of a patient is needed for treatment. For example, suppose a person who lives in New York, USA, travels to London, UK. If the person has to visit a hospital for treatment, the past medical history must be retrieved from the New York hospital. The past medical history can be retrieved by preparing a request with information that is unique to the patient. During the process, a requester (a doctor or nurse in the visiting hospital) supplies the required information to a developed script running on the miners. This script captures the patient's information, such as the name, date of birth, Social Security Number (SSN), name of the former healthcare system, and the requester's unique code. The script verifies the information and also verifies the identity of the requester. The request is developed into an agreed-upon format and submitted as a transaction to the hospital's blockchain network. Apart from the submitted transaction, the miner (node) submits its information, which involves the requester's unique code, the MAC address, and the transaction address of the miner. The smart contract verifies the format of the transaction, the requester, and the miner's identities. The purpose of the verification is to detect and prevent all malicious activities on the transaction by either insider or outsider threats. The algorithm below describes the significant steps in the verification process. Since the blockchains communicate via smart contracts, each smart contract running on the blockchain networks contains the processes described in the algorithm. For a request to be successful, it must agree with the standard format, the unique code of the requester must be in the authorized code set, the former healthcare system must be in the look-up table, the miner information must be correct, and the public key of the miner must verify its


private key. If any of these verification steps fail, the transaction is dropped, and the smart contract returns a failed request. A successful request is validated and attached to the request blockchain. For more about validation, see our previous works [29, 30].

program Verification (Request/Reply)
var formatted request/reply; miner information;
begin
  If Request;
    If (request agrees with standard format) and
       (requester code in authorized code set) and
       (destination in look-up table) and
       (miner information is verified) and
       (public key verifies private key);
      Validate transaction; Return success;
    else;
      Return fail; Drop transaction;
    end;
  else;
    If (reply agrees with standard format) and
       (reply source matches request destination) and
       (verification information in respective sets);
      Validate reply; Return success;
    else;
      Return fail; Drop transaction;
    end;
  end;
end.
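For illustration, the same request-verification checks can be sketched in Python as follows. The paper implements these checks as a Solidity smart contract; the field names, code sets, look-up table entries, and keys below are hypothetical stand-ins used only to make the control flow concrete.

AUTHORIZED_CODES = {"R7f2k91qzx"}          # hypothetical requester codes
LOOKUP_TABLE = {"NY-Hospital"}             # known destination networks
REGISTERED_MINERS = {("aa:bb:cc:dd:ee:ff", "0xabc123")}  # (MAC, tx address)

def verify_request(request, miner, signature_ok):
    checks = (
        # request agrees with the standard (agreed-upon) format
        set(request) == {"name", "dob", "ssn", "destination", "requester_code"},
        request["requester_code"] in AUTHORIZED_CODES,
        request["destination"] in LOOKUP_TABLE,
        (miner["mac"], miner["tx_address"]) in REGISTERED_MINERS,
        signature_ok,       # public key verifies the miner's private key
    )
    if all(checks):
        return "success"    # transaction is validated and chained
    return "fail"           # transaction is dropped

request = {"name": "J. Doe", "dob": "1980-01-01", "ssn": "***-**-1234",
           "destination": "NY-Hospital", "requester_code": "R7f2k91qzx"}
miner = {"mac": "aa:bb:cc:dd:ee:ff", "tx_address": "0xabc123"}
print(verify_request(request, miner, signature_ok=True))   # success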

3.2 Reply

After a successful validation process, the smart contract routes the request to the designated healthcare network based on the look-up table. The receiving smart contract verifies the format of the received request and the requesting network. The algorithm below describes the verification of requests received by the healthcare network. For a request to be successful, the format of the request must agree with the standard, the source information must pass the verification step, the requested EHRs must be available in the healthcare network, and the source public key must verify its private key. If any verification fails, the request is dropped, and the smart contract issues a failed request to the sending network. A successful request is validated and attached to the blockchain. Based on the required information, the miners compete to prepare a reply by retrieving the patient's EHRs from the blockchain network and preparing them as a transaction submitted to the blockchain for verification. The transaction (reply) and the respective sending node are


verified. A successfully verified reply is validated and attached to the blockchain while being routed back to the requester's network.

program Reply formation (Reply)
var formatted request; source information;
begin
  If (incoming request agrees with standard format) and
     (source information in look-up table) and
     (patient's EHRs in destination healthcare) and
     (source public key verifies source private key);
    Validate transaction; Return success;
  else;
    Return fail; Drop transaction;
  end;
end.
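A minimal sketch of the reply-preparation step performed by a winning miner might look as follows; the local record store, field names, and digest scheme below are assumptions made for illustration, not the actual data layout of the architecture.

import hashlib, json

LOCAL_EHRS = {"ssn-1234": ["2018-05-02 lab results", "2019-03-01 summary"]}

def prepare_reply(request, source_network):
    # Retrieve the patient's EHRs and package them as a reply transaction
    # routed back to the requesting network.
    records = LOCAL_EHRS.get(request["patient_id"])
    if records is None:
        return None                       # requested EHRs not held here
    payload = json.dumps({"patient_id": request["patient_id"],
                          "records": records,
                          "destination": source_network})
    return {"payload": payload,
            "digest": hashlib.sha256(payload.encode()).hexdigest()}

reply = prepare_reply({"patient_id": "ssn-1234"}, "visiting-hospital-net")
print(reply["digest"] if reply else "failed request")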

3.3 Authentication

The requesting network verifies the sending network and the format of the received reply, as shown in the verification algorithm above. When the verification process is successful, the transaction (reply) is validated and attached to the blockchain. The newly added block is reflected in the ledger of every node in the network; every blockchain node possesses a copy of this ledger. All blockchain nodes receive the notification of the newly added block but do not have access to the block's content. The requesting node retrieves the information in the block, and a developed script converts it to a format that can easily be understood by the requester.
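As a sketch of this final retrieval step, assuming a hypothetical block layout and a JSON payload, the conversion could look as follows; the access check and field names are illustrative, not the actual script.

import json

def retrieve_reply(block, requesting_node):
    # Only the node that prepared the matching request may read the content.
    if block["requester"] != requesting_node:
        return None
    return json.loads(block["payload"])

block = {"requester": "node-1",
         "payload": json.dumps({"patient": "J. Doe",
                                "records": ["2019-03-01 discharge summary"]})}

ehr = retrieve_reply(block, "node-1")
if ehr is not None:
    for rec in ehr["records"]:
        print(f"{ehr['patient']}: {rec}")   # presented in readable form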

4 Results

We carried out the implementation of the proposed architecture in the lab. We set up two different blockchain networks (I and II), each comprising three nodes. For each blockchain network, we used Solidity v0.6.2 for the smart contract implementation and geth v1.9.0 for Ethereum. The smart contract was written as described above and mined into the blockchain network. A transaction (request) was prepared as explained in Sect. 3 and submitted to blockchain network I. We randomly generated ten-character codes to serve as the unique requester codes. The MAC and transaction addresses of each miner, the format of a request and reply, and the requesters' unique codes are written in the smart contract. In addition, a look-up table that stores information about blockchain II is written in the smart contract. This smart


contract is mined into blockchain I. We write a similar smart contract (with blockchain I's information) into blockchain II's smart contract. We present a preliminary result on the security analysis to demonstrate how the architecture detects and prevents threats from outsider and insider intruders within the blockchain network. We implement how the architecture detects an unauthorized node's attempt to submit a transaction to the blockchain network.

4.1 Security Analysis

4.1.1 Outsider Threat Detection
We analyze the security of the architecture against malicious transaction injection. We added a node (malicious node) that was not part of the blockchain to network I. Here, we assume that an attacker may find a way to join the blockchain. The malicious node prepared a request transaction and submitted it to the blockchain network for verification. Although the transaction agreed with the standard format, we observed a failed-transaction notification instead of the expected transaction address. The transaction failed because the sender is not privileged to submit the transaction; hence, it fails the verification step. We further checked whether the transaction was validated and joined to the chain by querying the blockchain using the manually created transaction address. The result shows that the transaction is not chained to the network.

4.1.2 Insider Threat Detection
Here, we tested the security of the architecture against two typical ways a malicious insider can breach a patient's health record.

Multiple Requests. We implement a case where a malicious insider compromises an authorized node and begins to send a large number of what appear to be legitimate, standard-formatted requests in an attempt to mount a DoS attack on the blockchain network. Although other authorized nodes work to validate the transactions, we observed that the transactions are not mined because the frequency of receiving the same or similar transactions from the same node exceeds the threshold set in the smart contract. We persistently submitted the same request from the same authorized node, and we observed that the miners stopped mining after the sender was flagged as compromised. The smart contract automatically drops all subsequent transactions from the same authorized node.

Unauthorized Retrieval of Patient Information. We implement a case where a malicious insider attempts to retrieve patient information. It is assumed that an attacker is not likely to hold an authorized node in a compromised state for too long due to frequent security checks. Based on this assumption, an attacker makes every effort to access the information in the shortest time. The result showed that no information was returned because the node used is not privileged to retrieve the information. In the smart contract, an information-retrieval privilege is set for each node (i.e., a node can only retrieve information for which it prepared the request). The architecture drops the query because the node has no retrieval privilege for that patient's EHRs, which makes it suspicious of having been compromised.
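The multiple-request defence can be sketched as a simple rate-limiting rule; the threshold value and identifiers below are illustrative assumptions, not the values used in the actual smart contract.

from collections import Counter

THRESHOLD = 5                 # hypothetical per-request repeat limit
counts = Counter()
flagged = set()

def accept(sender, request_id):
    # A node that submits the same request more often than the threshold is
    # flagged as compromised; all its later transactions are dropped.
    if sender in flagged:
        return False
    counts[(sender, request_id)] += 1
    if counts[(sender, request_id)] > THRESHOLD:
        flagged.add(sender)
        return False
    return True

for _ in range(8):
    accept("node-3", "req-42")
print("node-3 flagged:", "node-3" in flagged)   # True after repeated requests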


5 Conclusion

In this ongoing research work, we propose a blockchain-based architecture that facilitates and secures inter-healthcare EHR exchange. The proposed solution focuses on preparing secured patient EHR requests and replies to and from another healthcare system. In this implementation, each healthcare system is assumed to keep and maintain patient health information on a separate blockchain network, while we evaluate the security of the interoperability between them. We evaluated the security of the architecture with respect to the detection and prevention of malicious transactions within a healthcare system. The preliminary results show that the architecture is promising in detecting and preventing malicious activities from either insider or outsider threats. As a continuation of this work, we plan to expand it to accommodate the following:
• Detect malicious replies or requests coming from another healthcare system.
• Investigate how the architecture protects against more insider threat scenarios.
• Evaluate the response time.


Semi-automatic Knowledge Base Expansion for Question Answering

Alessandro Maisto(B), Giandomenico Martorelli, Antonietta Paone, and Serena Pelosi

University of Salerno, Via Giovanni Paolo II, 132, 84084 Fisciano, SA, Italy
{amaisto,gmartorelli,spelosi}@unisa.it, [email protected]

Abstract. In this paper we present a hybrid semi-automatic methodology for the construction of a Lexical Knowledge Base. The purpose of the work is to face the challenges related to synonymy in Question Answering systems. We talk about a "hybrid" method for two main reasons: firstly, it includes both a manual annotation process and an automatic expansion phase; secondly, the Knowledge Base data refers, at the same time, to both the syntactic and the semantic properties borrowed from the Lexicon-Grammar theoretical framework. The resulting Knowledge Base allows the automatic recognition of those nouns and adjectives which are typically not related in synonym databases. In detail, we refer to nouns and adjectives that enter into a morpho-phonological relation with verbs, in addition to the classic matching between words based on synsets from the MultiWordNet synonym database.

1 Introduction

One of the main problems of Question Answering (QA) is the strong presence of ambiguity in Natural Language. [11] reports ambiguity as one of the main criticalities of QA, together with the lexical gap between the Knowledge Base and the queries, the syntactic complexity of queries, and procedural questions. In terms of QA, ambiguity concerns the property of a single lexical unit to represent more than one meaning (polysemy), or the possibility that the same meaning may be expressed by more than one lexical unit (synonymy). Synonyms can be a critical resource for facing the problem of ambiguity, but they are also useful for the resolution of the lexical gap problem. In this work we describe the semi-automatic construction of a large lexical Knowledge Base for the treatment of synonymy in an Italian closed-domain Question Answering system. There are three ways to expose the relations between different words that refer to the same meaning: first, semantic similarity methods, which use automatic algorithms to calculate the similarity between words and highlight synonymy relations; second, open-domain synonymy dictionaries such as WordNet [17]; third, dictionaries of morpho-syntactic relations, which connect predicates with their related nouns and adjectives, linking the syntactic structures where they appear.


This work is part of a more general project for the creation of a closed-domain Question Answering agent called B.4.M.A.S.S. [19], developed by Network Contacts in collaboration with the University of Salerno. The project focuses on the creation of a platform of hybrid agents (human and virtual agents) which collaborate in order to provide the best Customer Operations services. In Sect. 2 we briefly present the theoretical background: first we describe the core of the Lexicon-Grammar Theory, then we analyze the importance of semantic Knowledge Bases in QA projects. In Sect. 3 we present the adopted methodology. In Sect. 4 we present our conclusions.

2 Theoretical Background

Since we propose a hybrid model for Question Answering [14], the construction of the KB can also be considered the result of a hybrid process: it is based both on classic instruments such as WordNet and ontologies [1–4] and on a linguistic theoretical background called the Lexicon-Grammar Theory.

2.1 The Lexicon-Grammar Theory

The Lexicon-Grammar (LG) is the method and the practice of formal description of natural language [9,15]. Lexicon-Grammar is based on the study and the collection of a great number of syntactic and semantic data about predicates, stored in Lexicon-Grammar Tables. These binary tables describe, for each entry, a unique predicate by specifying the presence/absence of a set of definitional properties. The Italian LG Tables have been developed over the last 40 years by the Department of Political, Social and Communication Sciences of the University of Salerno, by manually exploring texts [7].

2.2 Knowledge Base for Question Answering

For about a decade, a great part of human knowledge on digital support has been stored in knowledge bases: special databanks which administer information for didactic, cultural and business purposes. Information in a KB can be stored in a structured or non-structured way, depending on the objective. In an extended sense, the net itself may be considered an immense KB, due to the functions that the two entities share, such as the necessity to be created and used by more than one user; the capacity to preserve integrity following the ACID properties (Atomicity, Consistency, Isolation, and Durability); the presence of hypertexts; and the possibility to create wide datasets which can be expanded over time [10]. Despite the size some knowledge bases have reached over the years, we can always observe that they are quite incomplete. Such incompleteness is mainly qualitative, not quantitative: KBs can grow enormous in size, but not very deep in structure or in the possibility of applying high-level information retrieval. Some examples are the datasets of KBs such as EuroWordNet, Wikidata, YAGO, BabelNet, UMBEL or Mindpixel. "Despite their seemingly huge size, these knowledge bases are greatly incomplete. For example, over 70% of people included in Freebase have no known place of birth, and 99% have no known ethnicity" [22]. This happens because many KBs contain manually inserted data, and completing all instances by associating them with every possible piece of structured information would be a tremendously long work. Those which do not have much structured data, or whose data are not manually inserted, seem unable to offer satisfying answers.

2.3 Question Answering Applied to Knowledge Bases

Since the net has been defined as "a semantic web" [5], information on the web is associated with metadata which specify its semantic context. In this way, the web has been made compatible with the reception of interrogations via search engines and automatic processing. Knowledge bases can also be defined as wide databases which allow the information retrieval task [22]. A KB predisposed for Question Answering can contain typical structured data (HTML, JSON, XML, RDF), semi-structured information, or completely non-structured information, such as a collection of texts. Depending on the quantity and typology of data in the KB, it is possible to divide QA systems into two types: closed-domain QA and open-domain QA. The first type provides automated responses within closed or limited knowledge fields; the second, on the other hand, provides solutions to questions posed in generic knowledge fields, with fewer boundaries.

2.4 The Role of Semantics in Question Answering

Transforming KBs into semantic databases is the key to obtaining more relevant answers and filling the gap we discussed previously [22]. Ideally, an answer should be displayed as a triple, that is, the sequence subject-predicate-object; at the current state, this only occurs in certain cases. The way to extract the best-quality information from KBs is via question answering, and it depends on the criteria by which the interrogations are formulated. Generally, questions fall into two types: factoid and complex/narrative. Factoid questions are related to closed-domain QA, while complex/narrative questions are connected to open-domain QA. In turn, QA paradigms are divided into two types:
• IR-based approaches: TREC; IBM Watson; Google
• Knowledge-based and hybrid approaches: IBM Watson; Apple Siri; Wolfram Alpha; True Knowledge Evi
And they can consider three kinds of approaches:
• IR-based factoid QA
• Knowledge-based approaches (Siri)
• Hybrid approaches (IBM Watson)


According to the studies formalized by Dan Jurafsky [12], the best way to take advantage of semantic databases is the KB-based one. IR-based factoid QA does not need a semantic database, because that approach requires less information and should be able to provide more straightforward data. Hybrid approaches, on the other hand, generate a partial semantic representation of queries and rely on other information to provide answers. In order to obtain better-structured KBs and fill the gap they present, a semantic approach is more advisable. In this way, question answering will extract better results, because it will permit a more extended use of language during the asking phase, allowing databases to be accessible to a wider audience. Another issue QA can encounter when interacting with KBs is ambiguity, the presence of multiple meanings in the same lexical unit. Each KB is structured to work and to have an identity independently from the others, and this sometimes enhances the risk of ambiguity: the same lexical unit can be associated with multiple meanings in various KBs. There are multiple ways this has been dealt with over the years, even if the issue is still not completely solved. One way is through neural networks, which associate a series of meanings with a lexical unit, or express it as a triple. In this case, "the answer to the question can thus be derived by finding the relation–entity triple in the KB and returning the entity not mentioned in the question" [23]. Another way to deal with ambiguity is through the idea of generality. Assuming we analyze a series of contexts with the same domains, generality identifies the property of the same context to indicate similar meanings. Assigning a context to different KBs that were not originally structured to work together enhances the possibility of reducing generality, and ambiguity as a consequence. "A logic dealing with KB contexts enables a reasoning system to use such seemingly inconsistent knowledge bases without deriving a contradiction. Every knowledge base will be given in a separate context and lifting axioms will be used to relate the two contexts" [6,16].
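As a toy illustration of the triple-based derivation quoted above (our own example, not taken from [23]), answering reduces to matching a relation–entity pair and returning the missing element of the triple:

# Toy KB of subject-predicate-object triples (invented example data).
triples = [
    ("Rome", "capital_of", "Italy"),
    ("Paris", "capital_of", "France"),
]

def answer(relation, entity):
    """Return the entity of the matching triple not mentioned in the question."""
    for s, p, o in triples:
        if p == relation and entity in (s, o):
            return o if entity == s else s
    return None

print(answer("capital_of", "Italy"))  # -> "Rome"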

3 Methodology

3.1 Creation of a Semantic Domain Database

For the purposes of a closed-domain QA system, the creation of a Domain Knowledge Base is necessary, in the form of a lexical database or electronic dictionary, which includes terms of a specialized language belonging to a specialized area of speech [13]. In our case study, the domain of the system concerns Customer Operations and, in particular, ICT services. A great number of technical terms of the ICT domain derive from other languages (e.g. bufferizzare, from the English word buffer), but the domain lexicon also includes generic terms that are used with a specific meaning. Our dictionary includes terms that refer to promos (e.g. All Inclusive Unlimited) or to hardware (e.g. Samsung Galaxy S7). Moreover, the dictionary can be used to disambiguate generic terms used in a specific way in the telecommunication language, such as ricarica (recharge), attivazione (activation), traffico (traffic), etc. For the creation of the lexical resource we used a semi-automatic process starting from a corpus of forum posts. The corpus is composed of 168,734 tokens and includes users' queries and operators' responses related to the specific domain of ICT. We extracted the recurring N-grams (in this case we considered bi-grams and tri-grams), which are compound words generally not included in dictionaries; a conceptual sketch of this step is given below. For the extraction we used the software NooJ [20]. A Syntactic Finite State Automaton (FSA) (Fig. 1) was built with NooJ to extract from the corpus specialized terms not yet included in the dictionary. In order to recognize acronyms (e.g. GPRS) or special terms composed of mixed cases (e.g. MyWind), present in large numbers in the corpus and not recognized by NooJ, we took advantage of a Morphological Finite State Automaton (Fig. 2).
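Conceptually, the N-gram extraction step works as in the following Python sketch; the actual experiments used NooJ, so this is only an illustration on a toy corpus.

from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token sequence."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

corpus = "vorrei attivare la ricarica automatica della ricarica automatica".split()
counts = Counter(ngrams(corpus, 2) + ngrams(corpus, 3))
# Recurring bi-/tri-grams are candidate compound terms for the dictionary:
print([g for g, c in counts.items() if c > 1])  # -> ['ricarica automatica']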

Fig. 1. Syntactic FSA for the extraction of elements from the corpus, processed by NooJ

Finite State Automata built with NooJ include nodes, which represent words or morphemes, and edges. Syntactic automata include yellow nodes, called metanodes because they contain other grammars. Red parentheses are used to create variables that can be used to write outputs or new dictionary entries. Empty nodes can be used to write instructions such as restrictions or constraints on a variable (〈$Nome=:N〉 means that the word included in the variable Nome must be a Noun), or outputs ($1L and 〈Verbo〉 compose an output in which the lemma in the variable 1, Verbo, is followed by the tag 〈Verbo〉). Morphological automata include some commands for the treatment of single characters, such as 〈U〉, which indicates an uppercase character, or 〈D〉, which indicates digits. Nodes with circular edges represent loops.

Fig. 2. Morphological FSA for the extraction of specific types of words from the corpus, processed by NooJ

The result of the application of the FSAs to the corpus is a list of 138,265 word forms. The logic of this extraction is to reach the best recall, in order to collect all possible domain terms. Consequently, the precision, calculated on a sample of 500 words, is about 0.482, due to the presence of a great number of false positives. In the sample we found only 5 false negatives; they were words composed of mixed cases unrecognized by the morphological grammar, or English words not contained in the NooJ dictionary. Therefore, the recall reaches 0.979. We manually checked the list of results and deleted all the words not related to the ICT domain; after that, we reduced all the words to lemmas, obtaining a list of 1024 elements.

3.2 Automatic Expansion of Knowledge Base

For the extraction of synonyms we use the WordNet and MultiWordNet Java APIs. Terms which did not match these two databases were manually added. MultiWordNet [18] is an Italian version of WordNet which includes about 40,000 word entries and 31 synsets. Due to the strong presence of English terms, the extraction includes two steps: first we extract synonyms, hypernyms, hyponyms and antonyms from the MultiWordNet noun, adjective and verb databases; then we repeat the same operation with WordNet. All the results have been included in a JSON structure like the one in Fig. 3. We used the JSON1 format because, even if it had never been used for linguistic annotations, it is particularly suitable for this purpose: the JSON format is synthetic, easy to write for both humans and machines, and it also makes the data readily usable on the web.

1 JavaScript Object Notation, http://www.json.org/.


In this structure, the item "lemma" refers to the word concerned. The item "id" contains a specific id composed of a letter that represents the domain (T in this case) and a sequence number. The item "POS" refers to the part of speech and can be "NOUN" for nouns, "ADV" for adverbs, "VERB" for verbs and "ADJ" for adjectives. The item "lingua" refers to the language of the word, which can be English or Italian. The item "class" refers to the category of the word (e.g. Telephony, Electronics, etc.). The other items refer to the synonyms, hypernyms, hyponyms, holonyms and meronyms of the word. The last two items contain, respectively, an example with the term concerned and the definition of the word. We found the example useful because it allows one to check whether the terms in the list of results were indeed related to the telecommunications domain. Each term is contained in the structure as many times as it has meanings related to the domain. The automatic expansion of the lexical database generates 1070 entries, organized as shown in Table 1.

Fig. 3. JSON structure of the semantic database
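Since Fig. 3 is reproduced here only as an image, a hypothetical entry following the structure just described might look like the Python dictionary below; all values, and the key names of the last two items, are our own illustrative assumptions.

import json

# Hypothetical entry mirroring the structure of Fig. 3 (values invented).
entry = {
    "lemma": "ricarica",
    "id": "T042",                 # domain letter (T) + sequence number
    "POS": "NOUN",
    "lingua": "italian",
    "class": "Telephony",
    "synonyms": ["ricarico"],
    "hypernyms": ["pagamento"],
    "hyponyms": [],
    "holonyms": [],
    "meronyms": [],
    "example": "Vorrei fare una ricarica da 10 euro.",      # assumed key name
    "definition": "Credito aggiunto a una SIM prepagata.",  # assumed key name
}
print(json.dumps(entry, ensure_ascii=False, indent=2))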

Table 1. Annotated Lemmas

POS         Italian   English
Nouns       499       151
Verbs       299       17
Adjectives  70        28
Adverbs     3         3
Total       871       199

3.3 Nominalizations and Adjectivalizations

The semantic database described in the previous section includes specific domain terms and proper nouns of devices or promotions. Moreover, there are morpho-syntactic relations that concern verbs and their connected nouns or adjectives [8,21]: the QA system must be able to create a relation between the Italian verb "costare" (to cost) and the Italian noun "costo" (the cost). In order to realize these connections we rely on the Lexicon-Grammar Theory. Thanks to the LG Tables we realize two tasks:
• Nominalization: identification of the nouns related to each annotated verb (e.g. costare [to cost], costo [cost]);
• Adjectivalization: identification of the adjectives related to each annotated verb (e.g. costoso [expensive]).
Information about nominalization and adjectivalization has been stored in the JSON dictionary. For each semantic predicate, a JSON object has been created with information about its distributional, syntactic and transformational properties, with the intention of uniting the electronic dictionaries and the information contained in the Lexicon-Grammar tables.

Fig. 4. Nominalizations and adjectivalizations JSON structure

We associated an id to each predicate (Fig. 4) so that we can work on subsets of lexical entries and complete the associated information without changing the main resource. In this way we simplified the original tables. The id is composed of the Lexicon-Grammar class number and a sequence number. Other items of the structure refer to the lemma, the part of speech and the Lexicon-Grammar class. The item "omografi" indicates whether the lemma has a homograph, using a boolean value (true/false). The item "struttura definizionale" refers to the definitional structure of the class. The item "esempio" contains an example. The last two items refer to nominalizations (Vn) and adjectivalizations (V-a).
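A hypothetical predicate object following the Fig. 4 structure could look as follows; the field values and the exact key spellings not quoted in the text are our own assumptions.

# Hypothetical predicate entry mirroring Fig. 4 (values invented).
predicate = {
    "id": "47-001",                         # LG class number + sequence number
    "lemma": "costare",
    "POS": "VERB",
    "classe": "47",                         # assumed key for the LG class
    "omografi": False,
    "struttura definizionale": "N0 V a N1", # assumed definitional structure
    "esempio": "Questa promozione costa dieci euro.",
    "Vn": ["costo"],                        # nominalizations
    "V-a": ["costoso"],                     # adjectivalizations
}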

4 Conclusion

In this work we presented a methodology for the creation and the expansion of lexical Knowledge Bases for the treatment and solution of two of the main problems of Question Answering systems: the linguistic gap between queries and the KB, and ambiguity. The methodology relies on two steps: the first concerns the creation of a large database of verbs connected to the nouns and adjectives which, in certain syntactic structures, express the same meaning as the predicate; the second consists of the automatic expansion of a list of nouns, adjectives and verbs extracted from a domain corpus, based on MultiWordNet synsets. The methodology has been implemented in a Question Answering system tested on the ICT domain and, in particular, on Customer Operations for telephone companies. The advantage of a semi-automatic approach in a methodology such as the one presented in this paper is that rebuilding the resource for another domain is faster, and adapting the entire system to other domains is easier. In fact, the manual compilation of the dictionary is a small part of the work, because the extraction of terms from the corpus, the expansion of the lexical entries and also the calculation of nominalizations and adjectivalizations can be performed in a semi-automatic way.

References

1. Amato, F., Cozzolino, G., Maisto, A., Mazzeo, A., Moscato, V., Pelosi, S., Picariello, A., Romano, S., Sansone, C.: ABC: a knowledge based collaborative framework for e-health. In: 2015 IEEE 1st International Forum on Research and Technologies for Society and Industry Leveraging a Better Tomorrow (RTSI), pp. 258–263. IEEE (2015)
2. Amato, F., Cozzolino, G., Mazzeo, A., Mazzocca, N.: Correlation of digital evidences in forensic investigation through semantic technologies. In: 2017 31st International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 668–673. IEEE (2017)
3. Amato, F., Cozzolino, G., Mazzeo, A., Moscato, F.: An application of semantic techniques for forensic analysis. In: 2018 32nd International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 380–385. IEEE (2018)


4. Amato, F., Cozzolino, G., Moscato, V., Moscato, F.: Analyse digital forensic evidences through a semantic-based methodology and NLP techniques. Future Gener. Comput. Syst. 98, 297–307 (2019)
5. Berners-Lee, T.: Realising the full potential of the web. Tech. Commun. 46(1), 79 (1999)
6. Buvac, S.: Resolving lexical ambiguity using a formal theory of context. In: Semantic Ambiguity and Underspecification. Citeseer (1996)
7. D'Agostino, E., Elia, A., Vietri, S.: Lexicon-grammar, electronic dictionaries and local grammars of Italian. Lingvisticae Investigationes Supplementa 24, 125–136 (2004)
8. Elia, A.: Le verbe italien. Les complétives dans les phrases à un complément (1984)
9. Gross, M.: Transformational analysis of French verbal constructions. University of Pennsylvania (1971)
10. Haerder, T., Reuter, A.: Principles of transaction-oriented database recovery. ACM Comput. Surv. 15, 287–317 (1983)
11. Höffner, K., Walter, S., Marx, E., Usbeck, R., Lehmann, J., Ngonga Ngomo, A.C.: Survey on challenges of question answering in the semantic web. Semant. Web 8(6), 895–920 (2017)
12. Jurafsky, D., Martin, J.H.: Dialog systems and chatbots. In: Speech and Language Processing, p. 3 (2017)
13. Litkowski, K.C.: Syntactic clues and lexical resources in question-answering, vol. 249, pp. 157–166. NIST Special Publication SP (2001)
14. Maisto, A., Pelosi, S., Polito, M., Stingo, M.: Automatic text preprocessing for intelligent dialog agents. In: Workshops of the International Conference on Advanced Information Networking and Applications, pp. 805–814. Springer (2019)
15. Gross, M.: Méthodes en syntaxe. Hermann, Paris (1975)
16. McCarthy, J.: Notes on formalizing context (1993)
17. Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
18. Pianta, E., Bentivogli, L., Girardi, C.: MultiWordNet: developing an aligned multilingual database. In: First International Conference on Global WordNet, pp. 293–302 (2002)
19. Shashaj, A., Mastrorilli, F., Stingo, M., Polito, M.: An industrial multi-agent system (MAS) platform. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 221–233. Springer (2019)
20. Silberztein, M.: NooJ: a linguistic annotation system for corpus processing. In: Proceedings of HLT/EMNLP 2005 Interactive Demonstrations, pp. 10–11 (2005)
21. Vietri, S.: Lessico-grammatica dell'italiano: metodi, descrizioni e applicazioni. UTET Libreria (2004)
22. West, R., Gabrilovich, E., Murphy, K., Sun, S., Gupta, R., Lin, D.: Knowledge base completion via search-based question answering. In: Proceedings of the 23rd International Conference on World Wide Web, pp. 515–526 (2014)
23. Yih, W., He, X., Meek, C.: Semantic parsing for single-relation question answering. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 643–648 (2014)

Mina: SeMantic vIrtual Assistant for Domain oNtology Based Question-Answering

Nicola Fiore1(B), Gaetano Parente1, Michele Stingo1, and Massimiliano Polito1,2

1 Network Contacts, Molfetta, Italy
{nicola.fiore,gaetano.parente,michele.stingo,massimiliano.polito}@network-contacts.it
2 DSPC - Università degli Studi di Salerno, Salerno, Italy
[email protected]

Abstract. The aim of this article is to present the methodological nucleus of a custom Question-Answering system named Mina: a software solution with the ability to perform automated search over an ontological representation of localized knowledge domains, in order to extract fine-grained information so as to respond to user queries expressed as natural language questions. First, the pipeline adopted by Mina is presented, focusing on the multi-agent nature of the system Mina exists within, consisting of a distributed and dynamic environment in which several autonomous software components (the so-called agents) coexist and cooperate to perform specific tasks. Second, we focus on the strategies Mina exploits for the extraction of the correct answer that needs to be provided, starting with the introduction of the NC Common Lang library, a collection of linguistic analysis tools adopted for the extraction of the intents and entities contained within user queries. Following up, we show the ontological representation of a specific knowledge sphere - the telecommunication domain in our case - describing the structured scheme of concepts and relationships according to which the answer retrieval operation is performed, querying the ontology through Description Logic inferences. Finally, three different test scenarios are presented with the intent of providing solid evidence about the correct functioning of Mina.

1 Introduction

Performing automated information retrieval processes has become a matter of crucial importance for both private and public institutions, particularly considering the vast amount of data on which several digital services are based, from healthcare to customer care, passing through the whole tertiary economic sector. Taking into further account the extremely mixed and complex nature of the informative flow that runs within such services, increasingly centered around semi-structured and unstructured data, it follows that designing and implementing software solutions able to handle this kind of input becomes essential in order to retrieve punctual information. To this purpose, in this article we present the base methodology of a custom QA (Question Answering) system, that is, a software solution exploited when the automated retrieval of answers becomes necessary within a specific knowledge area. We present Mina, a virtual assistant capable of answering human input questions from localized knowledge domains, exploiting several natural language processing (NLP) tools so as to extract the intents and entities of a sentence, the latter used to interrogate an ontological conceptualization of the working domain area, which in our prototype corresponds to the Telecommunication (TELCO) field. The resulting output is a single reply, either obtained by retrieving a punctual piece of information or via a further filtering mechanism over a set of possible replies. The remainder of the paper is organized as follows. Section 2 reviews a selection of related works concerning QA methodologies of particular interest. Section 3 presents the complete pipeline of Mina, first illustrating the NC Common Lang library - composed of PoS tagging, lemmatization and parsing modules - then displaying the ontological scheme of our working domain, focusing on the structuring principles and explaining the set of classes and relationships composing the TELCO ontology instance; further information is provided both on how ontological queries are translated from natural language questions into Description Logic expressions and on the filter algorithm step, which serves its purpose when a submitted ontological query returns a set of formally correct responses amongst which a single reply has to be found. Section 4 presents three different test cases so as to give solid proof of the functioning of the Mina pipeline, namely a generic case, a synonymy case and a filtered-answer case. Lastly, in Sect. 5 we discuss future works.

2 Related Works

Throughout the entire design phase of the project, we focused on some of the most interesting literature within the field of QA systems, paying particular attention to ontology-based methodologies. In AQUA [1], for instance, the ontological structures are compared against relationships found within a formal query via a similarity algorithm. The input question is first analyzed by a syntactic parser which outputs a parsed sentence in the form of subject, predicate, object and, whenever present, adjectives. In case of convergence between the query features and the ontological structures, AQUA can return the answer to be provided; otherwise, the similarity algorithm is called in order to find similarities between the relationships used within the parsed query and the relationships within the ontological structures. When similarities are found, AQUA returns the linked answer; otherwise the system tries to reformulate the query, still exploiting the ontologies, so as to repeat the whole pipeline until a proper response is found. MOQA [2] implements an ontology representing things or events in the real world. As a first step, the TMR (Text Meaning Representation) is extracted from the question, thanks to which a formal query is launched over the ontology, this way obtaining all the hierarchical connections related to the initial question (usually on historical topics or about known personalities). The questions and the instances are then stored within another structure called the Fact Repository. MKQA [3] is a QA system for the musical domain. The structure of its ontology has classes such as singers, songs, etc. Once a question has been analyzed and the query constructed, the ontology returns all the pieces of information, which are then processed so that a specific answer may be constructed based on the input question. Fu et al. (2010) [4] use a different approach. Dealing with questions from real users, they derive the predicate from the initial answer through textual preprocessing analysis. When a new query containing a predicate matches one of the questions previously processed (usually FAQs), they retrieve the proper response from an indexed repository of replies; the ontology is structured in such a way that there is only one possible match and, therefore, the index linked to the correct answer is taken. In case there are more matching FAQs - which happens when a sentence partially corresponds with more than one pre-processed answer - an algorithm steps in, calculating a ranking for each of the possible matches and taking the index of the FAQ with the highest matching ranking.

3 Mina Pipeline

The genesis of this work comes from the need to create a QA (Question Answering) solution - namely a process with the automatic ability to answer questions expressed in natural language [5] - for industrial purposes. To this purpose we designed a mixed methodology where the retrieval of the correct answer relies on multiple pre/post-processing steps in conjunction with the use of an ontological conceptualization of a specific knowledge area. The pipeline which exploits the ontological knowledge involves two main software routines: Mina, a semantic agent which returns unambiguous responses within a specific domain, and LandS, a speech-to-text and text-to-speech agent which - via Google APIs - takes a question in audio format and creates a proper string transcription. The same applies in the opposite direction: after receiving the answer from Mina, the agent vocally converts the response. These two agents coexist and collaborate within the MAS (Multi Agent System) environment [6, 7], a set of cooperative software components with the aim of performing tasks within a distributed and dynamic ecosystem. The entire pipeline is outlined in Fig. 1.

Fig. 1. Flowchart of the Mina pipeline

In the following we address the solution with which Mina is able to find the correct answer within a localized domain. In the first stage, once a text-converted request has been received, Mina submits the natural language query to the NC Common Lang library, a collection of several Natural Language Processing tools. This library helps us obtain meaningful semantic intents and entities from the uttered questions.

3.1 NC Common Lang

The use of the NC Common Lang library is preparatory for the construction of Description Logic queries, thanks to which the ontology is inspected, thus returning the correct answer to be provided. This library contains and refers to the majority of the best-known methods adopted for the processing of natural language. In the first place, a Part-of-Speech tagger - a piece of software attributing to each word of a given text the correct grammatical category - based on TINT [8] and refined on [9] is invoked. The second module used by NC Common Lang produces text lemmatization: using the tags found by the PoS tagger, our lemmatizer performs a comparison against the entries of custom electronic dictionaries based on [10–12]. The last module used at this stage is the syntactic parser, which gives the syntactic tree representation of a written text. The type of syntactic information depends on the reference grammar used for the purpose; the term grammar here refers to a descriptive language and its binding set of rules. In this sense, a parser does nothing but accept a string of symbols, check whether the sequence falls within the possible structural relationships established by the working grammar, in order to recognize its correctness (also defined as grammaticality), and provide one or more structural representations of the input sequence.
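The three modules can be thought of as a simple function composition. The Python sketch below uses hypothetical stand-ins (pos_tag, lemmatize, parse) rather than the actual TINT-based implementations, purely to show how the stages chain together.

def pos_tag(tokens):
    """Stand-in for the TINT-based PoS tagger (dummy tags)."""
    return [(t, "NOUN") for t in tokens]

def lemmatize(tagged):
    """Stand-in for the dictionary-based lemmatizer."""
    return [(t, tag, t.lower()) for t, tag in tagged]

def parse(lemmas):
    """Stand-in for the syntactic parser (trivial tree)."""
    return {"root": lemmas[0], "children": lemmas[1:]}

def nc_common_lang(sentence):
    """Pipeline sketch: tagging -> lemmatization -> parsing.
    Assumes a non-empty, whitespace-tokenizable sentence."""
    return parse(lemmatize(pos_tag(sentence.split())))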

3.2 Ontologic Knowledge Base of Domain Answers

Borrowing the definition proposed by Studer et al. (1998) [13], an ontology, within the IT field, corresponds to a formal and explicit specification of a shared conceptualization related to a knowledge area. By explicit we mean that the constraints, concepts and properties of an ontology must be explicitly defined and linked by precise causal relationships. Formality refers to the ability of an ontology to be accessible for simple computations (therefore machine-readable) but also for complex computational operations (machine-understandable). Sharing, lastly, assumes that the knowledge underlying the ontological representation is not the result of the speculation of a single individual, but is precisely shared by a group that expresses its consent on it. According to the above-mentioned principles, the ontological representation of the TELCO domain adopted in our project is structured as follows:


• 3 superclasses: Intenti, Entità and Risposte (Intents, Entities, Answers);
• the subclasses of the class Entità correspond to the entities of the working domain. In this case we implemented only subclasses of the TELCO domain (e.g. recharge, activate, etc.), but the adopted methodology is readily replicable over different business domains;
• the subclass Risposte TELCO of the class Risposte contains as instances (individuals) the single answers to be provided;
• the Intenti class has instances corresponding to the uttered intents;
• finally, there are links between the mentioned classes in the form of object properties, and the related descriptions in the form of data properties.

The complete ontology scheme is outlined in Fig. 2.

Fig. 2. Ontological conceptualization of the TELCO domain with central focus on the Risposte TELCO class.

A further relationship between ontology objects is constituted so as to represent lexical synonymy between words. This way we are not only stating meaning equivalence between instances of the domain - as in the case of the words SIM and scheda (Italian slang for a SIM card) - but we are also allowing the transitive sharing of object and data property sets. As stated before, our ontology relies on three main superclasses: Intenti, which contains instances of all the terms (mainly predicates) that are to be identified within a phrase, and Entità, which contains within its subclasses all the entities semantically linked to one or more intents. Finally, there is the main class - Risposte TELCO - which lists instances corresponding to specific codes that uniquely lead us to retrieve the correlated answers stored in an external database. Summarizing, once the system receives a question, the latter is examined by the NC Common Lang library, which extracts the related intent and entities; the system then interrogates the ontology via Description Logic queries so as to obtain the single instance of the Risposte TELCO class that is linked with both the identified intent and entities. Therefore we can say that we use our ontology as a means of answer collection.

3.3 QnA System: Query Creation and Filter Algorithm

As previously stated, a fundamental feature of our work is the automatic investigation of the knowledge contained within the ontology through queries in the form of inferences made via Description Logic (DL) [14], a family of knowledge representation languages that are widely used in ontological modelling and that is fully supported as a World Wide Web Consortium standard. The way in which we build the DL queries works on top of the OWL API Java distribution [15] - which in turn exploits the HermiT reasoner [16] - and has been algorithmically instantiated as outlined below.

Algorithm 1: createExpression
  Result: expression
  expression = "Risposte TELCO THAT ha intento VALUE";
  expression = expression + NC Common Lang's intent value;
  while entity has next do
    expression = expression + " AND " + entity's category + " VALUE " + entity's value;

While using DL queries as a means for interrogating the ontology, we faced a limit: in front of a relatively low number of found entities, the query submitted to the ontology returns a set of instances (individuals) instead of a single item. Since we need to retrieve a unique instance as a valid response, a filter algorithm has been implemented to address this issue, as shown below.

Algorithm 2: getUniqueIndividual
  Result: response
  execute DL query on ontology and get individuals list;
  create elementsList;
  add to elementsList intents and entities taken distinctly from NC Common Lang;
  while individuals has next do
    get individual's object and data properties labels;
    create propertiesList;
    add to propertiesList object and data properties labels taken distinctly;
    if propertiesList size is equal to the elementsList size then
      individual is the response;

As can be seen, the algorithm already holds a list containing the labels of both the intent and the entities and, for each of the instances returned by the query, creates a list of labels corresponding to those that identify the object properties of the intent, together with the related entity data properties that have been found. At the end of the expanded instance-listing creation, the algorithm takes the only individual whose set of linked labels has the same size as that of the input question. In practice, we rule out instances that have labels other than those of the original question, because these are probably associated with questions that should carry more pieces of information and that therefore refer to more associated entities. Thanks to this algorithm, we univocally find the correct answer to be provided, as shown in the example in Sect. 4.3.
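For concreteness, the two algorithms can be rendered in Python as in the sketch below. This is only an illustration: the paper's implementation is in Java on the OWL API, and labels_of stands in for the reasoner calls that collect an individual's property labels.

def create_expression(intent, entities):
    """Algorithm 1: build the DL query string from intent and entities."""
    expression = "Risposte TELCO THAT ha intento VALUE " + intent
    for category, value in entities:
        expression += " AND " + category + " VALUE " + value
    return expression

def get_unique_individual(individuals, entities, labels_of):
    """Algorithm 2: keep only the individual whose property labels match
    the labels extracted from the question (same number, no extras)."""
    elements = {"ha intento"} | {category for category, _ in entities}
    for individual in individuals:
        # labels_of(individual): assumed helper returning the set of the
        # individual's object/data property labels via the reasoner.
        if labels_of(individual) == elements:
            return individual
    return None

print(create_expression("attivare", [("ha contratto", "numero")]))
# -> "Risposte TELCO THAT ha intento VALUE attivare AND ha contratto VALUE numero"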

4 Testing the System

So as to prove the effectiveness of the proposed methodology, we carried out three different test cases, whose results are displayed in this section.

4.1 Generic QnA Test

This test addresses the easiest case where, from a sentence received as input, we retrieve a unique result firsthand from the DL query investigation. For this test we took the following sentence:

Voglio effettuare una ricarica homebanking1

Sending this sentence to the NC Common Lang processing step, and passing its result to the createExpression method, we obtain the following DL query:

Risposte TELCO THAT ha intento VALUE caricare AND ha mezzo VALUE homebanking

Executing the DL query on the ontology, we obtain the answer to be provided to the user, as can be seen in Fig. 3.

Fig. 3. Generic QnA test query

1 I want to make a homebanking top-up.

4.2 Synonym-Based Test

This test addresses the case in which the system correctly retrieves the same answer when responding to different questions containing words that are similar in terms of their meaning. In this example we rely on the synonymic relationship between the words sim and scheda, as shown in Fig. 4.

Fig. 4. Description of the individual sim. The synonymy with the term scheda is declared thanks to the correlation Same individual as.

Starting with this sentence as the input question:

Voglio riattivare la mia sim scaduta2

and sending it to the NC Common Lang processing step, passing the result to the createExpression method, we obtain the following DL query:

Risposte TELCO THAT ha intento VALUE attivare AND ha contratto VALUE sim AND ha stato VALUE scaduto

Executing the DL query on the ontology, we correctly obtain the answer that has to be provided, as can be seen in Fig. 5.

Fig. 5. Synonym-based test query with instance sim

In the case of the following sentence as a starting point:

Voglio riattivare la mia scheda scaduta3

passing the NC Common Lang result to the createExpression algorithm, we obtain the following query:

Risposte TELCO THAT ha intento VALUE attivare AND ha contratto VALUE scheda AND ha stato VALUE scaduto

Executing the DL query on the ontology, we obtain the same answer as in the previous example, as can be seen in Fig. 6.

I want to reactivate my expired sim. I want to reactivate my expired card.

Mina: SeMantic vIrtual Assistant for Domain oNtology Based Question-Answering

191

Fig. 6. Synonym-based test query with instance scheda

4.3 Filter Mechanism Test This case shows the use of the Filter mechanism in order to obtain the instance that adheres completely to the conditions of the input query in case of overlapping of different replies. For this test we take the following sentence: Voglio attivare un numero4 Sending this sentence to the NC Common Lang processing step and further passing its result to the createExpression method, we obtain the following query: Risposte TELCO THAT ha intento VALUE attivare AND ha contratto VALUE numero Executing the DL query on the ontology we get all the instances that meet at least these conditions, as it is shown in Fig. 7.

Fig. 7. Filter mechanism test query

In order to solve this issue it becomes necessary to obtain the instance that only adheres to the conditions expressed by the labels of the input query. To do so we use the Filter mechanism as displayed in the getOnlyIndividual method. In Fig. 8 it is shown a step by step portion of the test filter algorithm execution. As it is possible to notice, the Filter mechanism identifies the newLinea instance as the unique item that only respects the ha intento and “ha contratto” conditions found in the original query, ruling out the remaining instances holding these two abovementioned conditions and more. 4

I want to activate a number.

192

N. Fiore et al.

Fig. 8. getOnlyIndividual method in execution

5 Future Works As stated at the beginning, this project currently represents a baseline methodology and our future goals will mainly focus on the complete implementation of the knowledge contained in the ontology regarding the TELCO domain, as well as for different domain such as ENERGY, INSURANCE, AUTOMOTIVE, LAW and HEALTH, exploiting original methodologies [17, 18]. As further step we also intend to statistically compare the robustness of our system in comparison against methodologies representing QnA state of art solution.

References 1. Vargas-Vera, M., Motta, E.: AQUA–ontology-based question answering system. In: Mexican International Conference on Artificial Intelligence, pp. 468-477. Springer, Heidelberg, April 2004 2. Beale, S., Lavoie, B., McShane, M., Nirenburg, S., Korelsky, T.: Question answering using ontological semantics. In: Proceedings of the 2nd Workshop on Text Meaning and Interpretation, pp. 41-48, July 2004 3. Fu, J., Xu, J., Jia, K.: Domain ontology based automatic question answering. In: 2009 International Conference on Computer Engineering and Technology, vol. 2, pp. 346–349. IEEE, January 2009 4. Fu, J., Li, S., Qu, Y.:. Question matching based on domain ontology and description logic. In: 2010 Second International Conference on Computer Research and Development, pp. 833– 838. IEEE, May 2010 ¨ ur, A., Kartal, G.: Question 5. Derici, C., C¸elik, K., Kutbay, E., Aydın, Y., G¨ung¨or, T., Ozg¨ analysis for a closed domain question answering system. In: International Conference on Intelligent Text Processing and Computational Linguistics, pp. 468–482. Springer, Cham, April 2015


6. Shashaj, A., Mastrorilli, F., Stingo, M., Polito, M.: An industrial multi-agent system (MAS) platform. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 221–233. Springer, Cham, November 2019
7. Shashaj, A., Mastrorilli, F., Morrelli, M., Pansini, G., Iannucci, E., Polito, M.: A distributed multi-agent system (MAS) application for continuous and integrated big data processing. In: Chatzigiannakis, I., De Ruyter, B., Mavrommati, I. (eds.) Ambient Intelligence. AmI 2019. Lecture Notes in Computer Science, vol. 11912. Springer, Cham (2019)
8. Aprosio, A.P., Moretti, G.: Tint 2.0: an all-inclusive suite for NLP in Italian. In: CLiC-it (2018)
9. Marulli, F., Pota, M., Esposito, M., Maisto, A., Guarasci, R.: Tuning SyntaxNet for POS tagging Italian sentences. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 314–324. Springer, Cham, November 2017
10. De Bueris, G., Elia, A.: Lessici elettronici e descrizioni lessicali, sintattiche, morfologiche ed ortografiche. Plectica, Salerno (2008)
11. Vietri, S., Elia, A.: Analisi automatica dei testi e dizionari elettronici. In: Burattini, E., Cordeschi (2001)
12. Elia, A.: Metodi statistici e dizionari elettronici: il trattamento dei sintagmi complessi. In: Leoni, et al. (eds.) Dati empirici e teorie linguistiche, pp. 505–526. Bulzoni, Roma. ISBN: 9788883196096
13. Studer, R., Benjamins, R., Fensel, D.: Knowledge engineering: principles and methods. Data Knowl. Eng. 25(1–2), 161–198 (1998)
14. Krötzsch, M.: Description Logic Rules, vol. 8. IOS Press (2010)
15. Horridge, M., Bechhofer, S.: The OWL API: a Java API for OWL ontologies. Semant. Web 2(1), 11–21 (2011)
16. Glimm, B., Horrocks, I., Motik, B., Stoilos, G., Wang, Z.: HermiT: an OWL 2 reasoner. J. Autom. Reasoning 53(3), 245–269 (2014)
17. Amato, F.: ABC: a knowledge based collaborative framework for e-health. In: 2015 IEEE 1st International Forum on Research and Technologies for Society and Industry Leveraging a Better Tomorrow (RTSI), pp. 258–263. IEEE (2015)
18. Amato, F., Cozzolino, G., Mazzeo, A., Moscato, F.: An application of semantic techniques for forensic analysis. In: 2018 32nd International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 380–385. IEEE, May 2018

SafeEat: Extraction of Information About the Presence of Food Allergens in Recipes

Alessandra Amato1 and Giovanni Cozzolino2(B)

1 University of Napoli Federico II, Naples, Italy
[email protected]
2 DIETI - University of Napoli Federico II, via Claudio 21, Naples, Italy
[email protected]

Abstract. The application of artificial intelligence is becoming increasingly complex and sophisticated in order to shorten the gap between users and digital systems. AI techniques can bring a lot of advantages in facing the issue of allergic reactions related to food allergies. Recognising and avoiding the foods on which an allergy is based is an effective way to avoid an allergic reaction; to identify possible allergens, it is important to read food labels carefully or to know the ingredients from which a food is made. In this paper we present a system that exploits the techniques and tools of Artificial Intelligence to extract and analyse the ingredients of a recipe, and alert the user to the presence of possible allergens. The performed experimentation shows that the system can alert the user to allergens in the recipes of the Wikipedia Cookbook dataset.

1 Introduction Allergy indicates a condition in which a subject’s immune system reacts abnormally by producing antibodies to certain substances considered harmful, which for most people are completely harmless. Food allergy, in particular, is a reaction of the immune system to a certain food, perceived by the body as harmful. When you suffer from a food allergy, the immune system mistakenly identifies a specific food or a substance present in it as something harmful (allergen): to neutralise it then releases antibodies (immunoglobulin E, also known as IgE). Allergy symptoms are due to the body’s release of chemical mediators (e.g. histamine) in response to the immune reaction triggered by the encounter of allergens with antibodies. Food allergens are all foods, or ingredients of which they are composed, that trigger immune-mediated reactions, even serious ones, if ingested by specific individuals. The best way to prevent an allergic reaction is to know - and avoid - the foods on which it is based. It is good to read the food labels carefully and if you have already had a severe allergic reaction, wear an identification mark (bracelet or tag) that allows others to know what you are suffering from if you are unable to communicate. Also social network are becoming a media that collect users behaviours [1] and influence the tastes and the preferences of individuals regarding food consumption [2, 3] by providing personalised offers [4]. A fundamental rule introduced in the European Regulation 1169/2011 is the obligation for all companies in the food sector to inform consumers about the presence of c Springer Nature Switzerland AG 2021  L. Barolli et al. (Eds.): INCoS 2020, AISC 1263, pp. 194–203, 2021. https://doi.org/10.1007/978-3-030-57796-4_19


allergens in the foods they are going to eat. For example, in the catering sector it is necessary to indicate the possible allergenic content of each dish when drawing up the menu. In fact, in accordance with the Regulation, information on food allergens must be properly documented and, above all, it must be displayed well in view, so that it is easily available both for the person responsible for any inspection and for the consumer. The European Regulation 1169/2011 indicates a detailed list of the 14 main food allergens, with their respective derivatives: cereals containing gluten, crustaceans, eggs, fish, peanuts, soy, milk, nuts, celery, mustard, sesame, sulphur dioxide and sulphites, lupins, and molluscs. In this paper, we present the SafeEat [5] project, which aims to assist companies in the process of recognition and identification of allergens inside food. The project exploits Artificial Intelligence techniques [6] and uses an inferential engine [7, 8] to unify the foods tested with any allergens.

2 System Implementation

The system's core is a Knowledge Base populated through a set of rules that encode the results of research performed according to the European Regulation 1169/2011. In the following we describe the specifications of both the Knowledge Base and the rules, with some additional examples.

2.1 Knowledge Base

The structure of the code is divided into three main functions aiming to indicate the 'facts' on which the rules will be based. The functions are:

1. allergy(ALLERGEN, LIST) - combines an ALLERGEN with a LIST of ingredients containing it;
2. derivatives(ORIGIN, LIST) - associates to an ORIGIN element all its derivatives in a LIST;
3. substitutes(LIST, ADMITTED) - associates to a LIST of elements containing an allergen the elements that can replace them, taken from an ADMITTED list.

We noted that the category of elements related to sulphites is directly dependent on elements belonging to other categories of allergens. Therefore, a rule was provided for the list of foods containing sulphites, by means of an append of the lists related to fish, molluscs, crustaceans and nuts with a list of basic sulphites:

1. sulfites(LIST, X) is based on a recursive structure that uses append, a built-in Prolog predicate that provides the concatenation of lists.

In Listing 1 we report an extract of the facts constituting the Knowledge Base.

%%%--------------------------FACTS-------------------------
% "allergy" associates each allergen with a list of ingredients containing that particular allergen
allergy(ALLERGEN, LIST)
% "derivatives" associates certain ingredients with a list of their derivatives, e.g. wheat -> wheat_flour
derivatives(INGREDIENT, LIST)
% "substitutes" associates a list of possibly dangerous foods with substitute ingredients
substitutes(DANGER_LIST, POSSIBLE_SUBSTITUTES)
% ---------------------------------------------------------
allergy(fish,[tuna_fish,anchovies,sardines,dolphinfish,cheppie,herring,mackerel,salmon,cod,halibut]).
allergy(clams,[squid,tattler,cuttlefish,dormouse,octopus,limpets,snails,scallops,clams,oyster,cockler,mussels,fasolara,sea_dates,octopus,canestrelli]).
allergy(gluten,[pasta,bread,quinoa,kamut,sorghum,teffi,oats,barley,wheat,rye,spelled,wheat_bud,starch,seitan,tapioca,beer,tofu,gelatine]).
allergy(lactose,[milk,provola,milk_cream]).
allergy(celery,[celery,celeriac,ribbed_celery]).
allergy(sesame,[sesame_oil,sesame_seeds,sesame_flour]).
allergy(shellfish,[lobster,crab,granseola,crab,crayfish,gamberone,sea_cicadas,scampi,shrimp]).
allergy(lupine,[lupine]).
allergy(soy,[soy]).
allergy(mustard,[mustard]).
allergy(egg,[eggs,egg]).
allergy(peanut,[peanuts]).
allergy(nuts,[almond,cashew_nuts,hazelnut,wot,cashew_nut,pecans,brazilian_nuts,pistachio,macadamia_nut,queensland_walnut,pine_seed,peanut,chestnut]).

derivatives(oats,[oat_grains,oat_bread,oat_biscuits,oat_flour,oat_milk]).
derivatives(celery,[celery_sauce]).
derivatives(egg,[yolk,egg_white]).
derivatives(crab,[surimi,crabmeat,crab_claws]).
derivatives(lupine,[lupine_grain,lupine_flour]).
derivatives(soy,[soy_oil,soy_milk,soy_flour,tofu,soy_sauce]).
derivatives(mustard,[mustard,mustard_oil,karashi]).
derivatives(almond,[almond_milk,almond_flour,almond_oil,almond_butter]).
derivatives(hazelnut,[nutella,hazelnut_cream,hazelnut_milk,hazelnut_flour,hazelnut_paste,nocino,hazelnut_butter]).
derivatives(cashew_nuts,[cashew_oil,cashew_juice,cashew_butter]).
derivatives(milk,[cow_milk,sheep_milk,goat_milk,buffalo_milk,skimmed_milk,cooking_cream,cheddar,ice_cream,milk_chocolate,cow_milk_ricotta,butter,yogurt,fresh_cream,milk_flakes,robiola,ricotta_cheese,emmenthal,cheese,mozzarella_cheese,milk_cream,crescenza,goat_cheese]).
derivatives(wheat,[durum_wheat,wheat,wheat_flour,wheat_flakes,bread_crumbs,semolina]).
derivatives(sorghum,[sorghum_paste,sorghum_molasses,fermented_sorghum,sorghum_flakes,puffed_sorghum]).
derivatives(peanuts,[peanut_butter,peanut_flour,peanuts_oil]).
derivatives(spelled,[spelled_flour,spelled_pasta,spelled_flakes,puffed_spelled,einkorn,spelled_milk]).
derivatives(barley,[barley_malt,barley_coffee,barley_yest]).
derivatives(rye,[rye_bread]).
derivatives(tuna_fish,[tuna_pate,tuna_fillets,bottarga]).
derivatives(kamut,[kamut_bread,kamut_flour]).
derivatives(pistachio,[pistachio_grain,pistachio_cream,pistachio_pesto]).
derivatives(spelled,[spelled_flour,spelled_bread,fermented_spelled]).

substitutes([spelled_flour,wheat_flour,oat_flour],[amaranth_flour,buckwheat_flour,corn_flour,millet_flour,quinoa_flour,rice_flour,sorghum_flour,teffi_flour]).
substitutes([sesame_oil,soy_oil,cashew_oil,mustard_oil,almond_oil,peanuts_oil],[soy_oil,cashew_oil,mustard_oil,almond_oil,peanuts_oil,seed_oil]).
substitutes([cow_milk,sheep_milk,goat_milk,buffalo_milk,soy_milk,almond_milk,oat_milk,hazelnut_milk],[cow_milk,sheep_milk,goat_milk,buffalo_milk,soy_milk,rice_milk,almond_milk,oat_milk,amaranth_milk,hazelnut_milk]).

Listing 1. Facts of the Knowledge Base

2.2 Rules

The rules on which the code is based are different and provide for the complete exploration of the KB. Their definitions and examples follow.

• allergies(ALLERGEN, LIST) - allergies is true if each indicated ALLERGEN corresponds to the required LIST; the rule also adds the list of sulphites to the KB.
• derivative(DERIVATE, ORIGIN) - derivative is true if ORIGIN corresponds to a list Z, within the derivatives 'facts', of which DERIVATE is a part (the presence of DERIVATE in the list is verified through the built-in Prolog predicate member).
• allergen(ELEMENT, ALLERGEN) - allergen is true if ALLERGEN is associated in allergies with a list Z of which ELEMENT is part, or if ELEMENT is derived from P, a member of the list Z.
• recipe(INGREDIENTS_LIST, ALLERGEN, ELEMENT) - recipe is true if ELEMENT is present within INGREDIENTS_LIST and ELEMENT unifies with the required allergen.


• substitute(ELEMENT, SUBSTITUTE) - substitute is true if ELEMENT contains an allergen Z, if SUBSTITUTE belongs to the list of substitutes S associated with the list of which ELEMENT is part, and if SUBSTITUTE does not itself contain the allergen Z.

In Listing 2 we report the set of implemented rules.

%%%--------------------------RULES-------------------------
% "sulfites" is a recursive rule that determines the list of ingredients where sulphites may be present, i.e. a primary list plus all the elements related to fish, shellfish, clams and nuts in the KB
sulfites([T],X):-allergy(T,Y),append([wine,beer,sausages,vinegar,fruit_juices,dried_mushrooms,canned_food,sugar,cider,jam],Y,X).
sulfites([Head|Tail],X):-sulfites(Tail,Z),allergy(Head,Y),append(Z,Y,X).

% "allergies" is an auxiliary rule that associates the corresponding list with each allergen (including sulphites)
% allergies(allergen,ingredients_list).
allergies(sulfites,X):-sulfites([fish,shellfish,clams,nuts],X).
allergies(Y,Z):-allergy(Y,Z).

% X is a derivative of Y if Y corresponds to a list Z of which X is a part
% derivative(DERIVATIVE,ORIGIN)
derivative(X,Y):-derivatives(Y,Z),member(X,Z).

% X is an element with allergen Y if X belongs to allergy Y
% allergen(INGREDIENT,ALLERGEN)
allergen(X,Y):-allergies(Y,Z),member(X,Z).
% X is an element with allergen Y if X is derived from P, and P is a member of the list Z associated with Y
allergen(X,Y):-allergies(Y,Z),member(P,Z),derivative(X,P).

% In list L there is an element X with allergen Y
% recipe(ingredients_list,allergen,risk_ingredient).
recipe(L,Y,X):-member(X,L),allergen(X,Y).

% An element X is replaceable by Y if X contains an allergen Z, Y belongs to the list S of the substitutes linked to X, and Y is not a type Z allergen
% replaceable(risk_element,substitute).
replaceable(X,Y):-allergen(X,Z),substitutes(R,S),member(X,R),member(Y,S),not(allergen(Y,Z)).

Listing 2. Rules of the Knowledge Base

3 System Implementation

By exploiting the functionalities of the Python programming language and NLP libraries [9], we aim to obtain information about the presence of food allergens within specific recipes. In particular, we considered recipes extracted from the Wikipedia Cookbook. After obtaining the recipe, the system is able to recognise the main allergens contained in it by using the facts and the rules of the Knowledge Base. The extraction of the text requires several steps, due to the exclusive interest in the ingredients used: specific functions are implemented to eliminate from the text everything that is not included in the list of ingredients. Generally, the Cookbook is structured as shown in Fig. 1: there is an introductory section of the recipe, a list of ingredients with the respective quantities, and the procedure.

Fig. 1. Wikipedia Cookbook recipe

We have to import the Cookbook web page through the urllib.request library, through which the HTML code is decoded and imported. The inspection of the web page allows us to understand where the ingredients are placed in the HTML of the recipe page. Afterwards, it is necessary to implement a method to "clean" the text of the page from all the strings, numbers, units of measure and symbols we are not interested in. We first have to perform the tokenization of the text, i.e. separate all the words. We continue by deleting all the numbers present among the tokens, precisely because a number will in no case indicate an allergen. Then, we continue with the lemmatization process. A lemma is the canonical form corresponding to a dictionary entry. We carry out this process in order to collect the terms in clusters, as they are derived from the same lemma.

Finally, we proceed with the process of POS tagging, or grammatical tagging. This step consists in associating to each token the correct grammatical category. We proceed in this way with the aim of identifying, among all the tokens, only those corresponding to nouns, and of searching among them for the food allergens. Once we have extracted the nouns in the correct lemmatized format, we adapt the list of ingredients for the request in Prolog, as a string with the characters '[' and ']' to open and close the list and each identified ingredient separated by a comma. In Listing 3 we report the implementation of the NLP pipeline.

import nltk
import string

# list1 holds the raw text of the ingredients section of the recipe page;
# unit_delete is a project helper (defined elsewhere) that removes numbers
# and units of measure from the token list
#TOKENIZES THE TEXT INTO A LIST OF WORDS
token=nltk.word_tokenize(list1)

#DELETES NUMBERS
token=unit_delete(token)

#REMOVES PUNCTUATION AND LOWERCASES THE LIST OF TOKENS
token=[x.lower() for x in token if not(x in string.punctuation)]

#LIBRARY FOR LEMMATIZATION
from nltk.stem import WordNetLemmatizer

#BUILDS THE LEMMATIZER
lemmatizer = WordNetLemmatizer()
#APPLIES THE LEMMATIZER TO THE TOKENS' LIST AND GETS A LIST OF LEMMATIZED WORDS
lemmatized=[lemmatizer.lemmatize(x) for x in token]
#POS TAGGING OF THE LEMMATIZED WORDS
pos_tagged=nltk.pos_tag(lemmatized)
#SELECTS ONLY THE INGREDIENTS FROM THE POS-TAGGED LIST, KEEPING ONLY THE NOUNS
only_ingredients=[x[0] for x in pos_tagged if x[1]=='NN']

#In order to get a right request for our Prolog program
#we should translate the list into a string format enclosed in '[' and ']'
def stringatize(list1):
    result='['
    for a in list1:
        result+=a
        if not(a==list1[len(list1)-1]):
            result+=','
    result+=']'
    return result

#list of ingredients in a SWISH-compatible format
swi_ingredients=stringatize(only_ingredients)
#PRINTS THE INGREDIENTS LIST IN SWISH FORMAT
print(f"Ingredients in your recipe are: {swi_ingredients[1:-1]}")

Listing 3. NLP pipeline for recipe analysis
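The paper drives the Knowledge Base through SWI-Prolog's SWISH console. As an illustration of how the two sides could also be glued together programmatically, the following minimal sketch feeds an extracted ingredient list to the rules of Listing 2 using pyswip, a Python bridge to SWI-Prolog. The file name safeeat.pl, the pyswip bridge and the hard-coded ingredient list are our assumptions, not part of the original implementation.

# Minimal sketch, assuming Listings 1-2 are stored in "safeeat.pl"
# and that pyswip (a Python bridge to SWI-Prolog) is installed
from pyswip import Prolog

prolog = Prolog()
prolog.consult("safeeat.pl")  # load the facts and rules of the KB

# hypothetical output of the NLP pipeline of Listing 3
only_ingredients = ["water", "yeast", "wheat_flour", "tomato_sauce",
                    "olive_oil", "mozzarella_cheese"]

# recipe(L, Allergen, Element): Element in L carries Allergen
ingredients = "[" + ",".join(only_ingredients) + "]"
for sol in prolog.query(f"recipe({ingredients}, Allergen, Element)"):
    print(f"{sol['Element']} contains {sol['Allergen']}")

With the facts of Listing 1, such a query would flag, for instance, wheat_flour (a derivative of wheat, hence gluten) and mozzarella_cheese (a derivative of milk, hence lactose).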

4 Experimentation

We performed an evaluation of the potentialities of our approach by considering a pizzeria as a case study, simulating the approach through a series of queries explained in the following. For example, to prepare a Margherita, composed of water, yeast, wheat flour, tomato sauce, olive oil and mozzarella cheese, we can start by analysing the mozzarella and yeast elements, verifying their possible unification with an allergen. As we can see in Fig. 2, by querying the Knowledge Base through the allergen rule, we can easily verify that lactose is present in mozzarella, while no allergens are present in yeast.

Fig. 2. Querying the Knowledge Base we can identify possible allergens.
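Fig. 2 shows the corresponding SWISH session; the same checks can be reproduced programmatically under the assumptions of the pyswip sketch above (atom names as in Listing 1):

# The same checks as in Fig. 2, run through the pyswip bridge introduced above
print(list(prolog.query("allergen(mozzarella_cheese, A)")))  # expected: lactose
print(list(prolog.query("allergen(yeast, A)")))              # expected: no solutions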

We can also specify a query that verifies the presence of an allergen in a certain recipe. In the case of Fig. 3, the presence of gluten is checked. The query returns the Boolean value true, which confirms the presence of gluten within the recipe.

Fig. 3. The query returns true if an allergen is present within the ingredients of a recipe.

We can also request a replacement for an allergy-generating element. The case under consideration, reported in Fig. 4, is the replacement of cashew nut oil: the substitutes will certainly be elements that do not contain cashew nuts, but it is not guaranteed that they are free from other allergens.


With the compound query submitted, all the substitute elements are verified: both those containing an allergen, which are reported together with it, and those in which no allergen is present, such as olive oil or seed oil.

Fig. 4. The query returns all the substitutes of a specific element, with their possible allergens.
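The compound query of Fig. 4 can likewise be sketched as follows, again under the assumptions of the pyswip sketch above; for every substitute of cashew oil the residual allergens, if any, are listed:

# Substitutes of cashew_oil and their possible residual allergens (cf. Fig. 4)
substitutes = [sol["S"] for sol in prolog.query("replaceable(cashew_oil, S)")]
for s in substitutes:
    allergens = [a["A"] for a in prolog.query(f"allergen({s}, A)")]
    print(s, allergens if allergens else "no known allergens")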

5 Conclusions

In a food allergy, the immune system erroneously identifies a particular food, or an ingredient found in it, as something unhealthy (an allergen). An effective way to avoid an allergic reaction is to recognise and avoid the foods on which it is based. It is important to carefully read food labels, or to know the ingredients a food is made of, in order to identify possible allergens. In this paper we presented a system that exploits Artificial Intelligence techniques and tools in order to extract and analyse the ingredients of a recipe and alert the user about the presence of possible allergens. The experimentation shows that the system is able to alert the user about the presence of allergens in recipes taken from the Wikipedia Cookbook dataset. Future work foresees enhancing the NLP pipeline and the Knowledge Base implementation by taking into account semantic tools, such as ontologies, that can improve the correlation and reasoning process [10-12].

Acknowledgement. This paper has been produced with the financial support of the project financed by the Campania Region of Italy 'REMIAM - Rete Musei intelligenti ad avanzata Multimedialità'. CUP B63D18000360007.

References

1. Amato, F., Cozzolino, G., Moscato, F., Moscato, V., Picariello, A., Sperlì, G.: Data mining in social network. In: International Conference on Intelligent Interactive Multimedia Systems and Services, pp. 53-63. Springer (2018)
2. Amato, A., Cozzolino, G., Giacalone, M.: Opinion mining in consumers food choice and quality perception. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 310-317. Springer (2019)
3. Amato, A., Balzano, W., Cozzolino, G., Moscato, F.: Analysis of consumers perceptions of food safety risk in social networks. In: International Conference on Advanced Information Networking and Applications, pp. 1217-1227. Springer (2019)
4. Amato, F., Cozzolino, G., Moscato, V., Picariello, A., Sperlì, G.: Automatic personalization of visiting path based on users behaviour. In: 2017 31st International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 692-697. IEEE (2017)
5. Maimone, A., Russo, F., Piccolo, G.: SafeEat: extraction of information about the presence of food allergens in recipes
6. Nadarzynski, T., Miles, O., Cowie, A., Ridge, D.: Acceptability of artificial intelligence (AI)-led chatbot services in healthcare: a mixed-methods study. Digital Health 5, 2055207619871808 (2019)
7. Dube, K., McLachlan, S., Zanamwe, N., Kyrimi, E., Thomson, J., Fenton, N.: Managing knowledge in computational models for global food, nutrition and health technologies. medRxiv (2020)
8. Lin, C.-C., Liou, C.-H.: A SWI-Prolog-based expert system for nutrition supplementary recommendation, pp. 3G5ES101-3G5ES101 (2020)
9. Hudaa, S., Setiyadi, D.B.P., Lydia, E.L., Shankar, K., Nguyen, P.T., Hashim, W., Maseleno, A.: Natural language processing utilization in healthcare. Int. J. Eng. Adv. Technol. 8, Special Issue 2(6), 1117-1120 (2019)
10. Cozzolino, G.: Using semantic tools to represent data extracted from mobile devices. In: 2018 IEEE International Conference on Information Reuse and Integration (IRI), pp. 530-536. IEEE (2018)
11. Amato, F., Cozzolino, G., Moscato, V., Moscato, F.: Analyse digital forensic evidences through a semantic-based methodology and NLP techniques. Future Gener. Comput. Syst. 98, 297-307 (2019)
12. Amato, F., Cozzolino, G., Mazzeo, A., Moscato, F.: An application of semantic techniques for forensic analysis. In: 2018 32nd International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 380-385. IEEE (2018)

Accelerated Neural Intrusion Detection for Wireless Sensor Networks

Tarek Batiha and Pavel Krömer

VSB - Technical University of Ostrava, 17. listopadu 2172/15, Ostrava, Czech Republic
{tarek.batiha,pavel.kromer}@vsb.cz

Abstract. Wireless sensor networks (WSNs) form an important layer of technology used in smart cities, intelligent transportation systems, Industry, Energy, Agriculture 4.0, the Internet of Things, and, for example, fog and edge computing. Cybernetic security of such systems is a major issue and efficient methods to improve their security and reliability are sought. Intrusion detection systems (IDSs) automatically detect malicious network traffic, classify cybernetic attacks, and protect systems and their users. Neural networks are used by a variety of intrusion detection systems. Their efficient use in WSNs requires both learning and optimization and very efficient implementation of the detection. In this work, the acceleration of a neural intrusion detection model, developed specifically for wireless sensor networks, is proposed, studied, and evaluated.

1 Introduction

Wireless sensor networks are at the heart of today's massively distributed infrastructures, including the Internet of Things, smart and cognitive environments, vehicular and mobile ad-hoc networks, and environmental monitoring networks. Their security is becoming a crucial aspect of their development, deployment, and operations [3,8]. They are usually composed of large numbers of individual devices connected via different kinds of communication networks. The wireless nature of the communication and the accessibility of their installations make them especially prone to various cybernetic attacks [24]. Intrusion detection is an important area of cybernetic security focused on the identification (detection) and prevention of security intrusions. Intrusions are malicious activities which are in favour of an intruder, like unauthorized use of the target devices, unauthorized access, identity and data theft, denial of service, etc. Artificial neural networks are often used by knowledge-based and machine learning-based intrusion detection strategies. Together with other algorithms, they constitute efficient IDSs [27]. They can be used to learn models of different actors and to predict their behaviour [10], or to model intrusion classes in computer networks [6] and wireless sensor networks [3]. The learning and adaptation of artificial neural networks is very resource and time consuming. There have been many strategies for improving learning


efficiency, such as improving the learning algorithm [11] or putting maximum effort into the use of the available hardware resources [26]. Massively parallel and energy-efficient floating-point accelerators and general purpose graphical processing units (GPGPUs) are nowadays becoming available for wireless sensor networks, too [4]. In this work, a GPGPU-based acceleration of a neural intrusion detection system, proposed recently for wireless sensor networks [5], is designed, developed, and evaluated.

2 Intrusion Detection

Intrusion detection is a field that deals with security intrusions, i.e., security and privacy violations [6] originating from unauthorized uses of computer networks and connected devices [23]. Security intrusions also often target data and communication confidentiality and integrity and decrease (even disable) the level of service provided by the attacked devices [21]. Security intrusions are attributed to intruders, i.e., hostile actors that aim at the misuse of computer and network resources. There are several types of malicious activities (attacks) that can be classified as security intrusions. They include information gathering (packet sniffing, keylogging), system exploits, privilege escalation, access to restricted resources and so on [23]. Due to the variety and novelty of attacks and the flaws in the development and implementation of complex systems, it is usually not possible to completely secure computer systems (networks) for their entire lifetime by design. In reality, such systems are prone to design flaws and no prevention measures can suppress user errors and system abuse [21]. Intrusion detection systems (IDSs) are software and/or hardware tools developed to mitigate malicious activities and in particular security intrusions. They continuously monitor the target systems and identify system states associated with security intrusions [23]. The operations of IDSs are based on the assumption that regular (authorized) actions of system users can be described by predictable patterns, do not include commands or sequences of commands that can weaken system security policies, and follow the specifications that describe which actions are allowed to which users [6]. An IDS has to monitor the status of the target system, collect the data that describes its properties in time, and analyze its behaviour to detect anomalies that might be related to security breaches. The monitoring is achieved by a set of sensors that collect the data from available data sources and forward it to analyzers that confront it with models of normal behaviour and models of known types of security intrusions [6,23]. The analysis of system behaviour usually exploits patterns (signatures) of malicious data and heuristic characteristics (attack rules) of known types of security intrusions. Alternatively, it attempts to detect the discrepancies between current behaviour and known behaviour of legitimate users by various anomaly (outlier) detection algorithms [23]. Most often, real–world IDSs use a combination of intrusion modelling and anomaly detection.


The majority of security intrusion models are based on statistical and probabilistic methods, cluster analysis, knowledge-based approaches, and machine learning-based algorithms [6,23]. Statistical methods build statistical profiles of the captured data in the form of univariate, multivariate, and time-series-based models [23]. Other types of probabilistic security intrusion models use belief (Bayesian) networks and, e.g., Markov models. Cluster analysis can be used to find groups (clusters) of similar events and to find their typical representatives (centroids, medoids). Under this approach, both complete clusters and their representatives can be used to classify user behaviour as normal, malicious, or anomalous [6]. Knowledge-based approaches take advantage of expert knowledge in the form of crisp and fuzzy (rule-based) expert systems that are designed either manually or with the help of machine learning [23]. Intrusion models based on machine learning are nowadays used in a growing number of IDSs. In general, they use some form of supervised or unsupervised learning to find classification models that classify user behaviour into several groups [23]. Machine learning-based intrusion models include artificial neural networks (multilayer feedforward networks, self-organizing maps), kernel-based methods (support vector machines) [6], evolutionary methods [23], and other nature-inspired algorithms.

3 Wireless Sensor Networks

Wireless sensor networks are distributed cyberphysical systems that are usually spread over large areas and comprise many cooperating and communicating devices (nodes). WSN nodes are usually embedded devices with significant constraints, such as low computing power, small operating memory, and a limited source of energy [1]. They carry limited and often irreplaceable sources of energy (e.g., batteries). Because of that, one of the main aims of WSN operations is to balance the quality of service and the energy consumption [12]. Security is one of the major concerns of WSNs [8,22]. The limited computing power, memory, and energy constraints of typical WSN nodes make it hard or even impossible to implement traditional computer security mechanisms [22]. On the other hand, the security requirements of mission critical WSN applications, such as surveillance [8,12], industrial control and management, environmental monitoring, and, e.g., healthcare [12], are exceptionally high. The data processed and transmitted by WSNs needs to be confidential (not disclosed to unauthorized parties), have integrity (be reliable, authentic, and not tampered with), and be available as soon as possible [22]. Attackers aim at compromising different aspects of the services and data provided by the networks. In response, cybersecurity measures including cryptography, secure routing, and intrusion detection are developed specifically for WSNs [8]. However, traditional intrusion detection techniques are difficult to implement in WSNs due to the constraints of the network nodes. The main goal of intrusion detection in WSNs is the identification of malicious behaviour that violates the security rules of the system. Intrusion detection systems for WSNs focus on


the detection of abnormal behaviour of users, the misuse of network resources, and irregularities in the operations of systems and protocols that could indicate an intrusion [8]. The knowledge of the routing algorithm is important for intruders as well as for intrusion detection. Low-energy adaptive clustering hierarchy (LEACH) is a hierarchical routing algorithm based on clustering [14]. It is one of the most popular routing strategies for WSNs [3,8]. LEACH divides nodes into regular (normal) nodes and cluster heads (CH) that collect and forward data to the base stations. Attackers exploit the protocol by compromising its individual steps. Intrusion detection systems, on the other hand, take advantage of the known behaviour associated with LEACH to build models of legitimate operations and confront them with the real operations of the nodes and the networks.

3.1 Neural Models for Intrusion Detection

Artificial neural networks (ANNs) are often used by knowledge-based and machine learning-based intrusion detection strategies. Together with other soft computing algorithms (fuzzy methods, cluster analysis), they constitute efficient IDSs [27]. They can be used to learn models of different actors (system components and users) and to predict their behaviour [10], to detect anomalies and to identify misuse in computer networks [13,28], and to model intrusion classes in computer networks [6] and wireless sensor networks [3]. ANNs are popular in this field due to their ability to learn complex non-linear relationships between inputs and outputs, but they lack the interpretability (explainability) that is in some cases required [10]. Nevertheless, their good detection capability makes them useful also together with other machine learning methods to establish hybrid intrusion detection pipelines. There is a number of recent intrusion detection models taking advantage of deep learning and deep neural networks. Deep neural models are used to pre-train neural models [15,25], process high-dimensional data [15], extract useful features from network traffic payload [19], and enable efficient anomaly detection for self-learning IDSs [2,20]. Their applications span several domains, including backbone Internet connections [2], local area (in-vehicle) networks [15], wireless sensor networks [5], and IoT devices [25]. The deep neural network models used in the context of intrusion detection include autoencoders [20], recurrent neural networks [2], deep belief networks [15,25], and, e.g., convolutional neural networks [19]. Despite the undoubted usefulness of neural network-based models for intrusion detection, the computational complexity associated with their training, inference, and adaptation makes their use in certain types of devices complicated [17,20], and the design of practical models has been identified as one of the challenges of modern IDSs [17]. It is also well known that neural networks with different architectures and topologies perform differently on the intrusion detection task [5].


In this work, we study the acceleration of a collective neural model for intrusion detection in WSNs [5] and its ability to train and detect on a variety of different floating-point accelerators (GPGPUs).

3.2 Acceleration of Neural Models

The learning and adaptation of ANNs on data-intensive tasks is very resource and time consuming. It often involves large amounts of data that need to be processed to (re)train the neural models. One way of improving the efficiency of learning and evaluation is through modifications of the learning algorithm [11]. Others put emphasis on the use of the full potential of the available (parallel) hardware [7,26]. This includes the advanced features of modern CPUs, such as the extended instruction sets of modern processors (SSE2, SSE3, SSE4) [26], and, in particular, the massive parallelism available in modern GPGPUs [7]. Even though modern software frameworks, such as Keras and TensorFlow, are well optimized for the use of parallel hardware [9], there are still ways to further improve accelerated neural models by the selection of optimum parameters for specific networks. This has been demonstrated, for example, on the optimization of batch sizes with regard to training stability and generalization performance [18].

4 Accelerated Neural Intrusion Detection for WSNs

This work studies the acceleration of a neural intrusion detection strategy for WSNs introduced in [5]. It is based on a collection of compact neural networks, each optimized for the detection of a specific type of intrusion. This approach allows flexibility and a variety of possible deployments of the IDS. The modularity of the collective system and the loose ties between the individual models allow, e.g., specific detectors to be used only by nodes that are prone to the associated attacks. The ANN architectures identified as best against different types of attacks [5] are in this work implemented in the Keras framework and studied on three different GPGPUs, representing different families of accelerators with different high-level properties and low-level hardware architectures. The speed and accuracy of GPGPU-accelerated network learning is measured and compared. All experiments were performed in the context of an intrusion detection data set created specifically for WSN security research [3]. WSN-DS is a data set describing several types of attacks in a simulated wireless sensor network [3]. It contains 374,661 records with 23 nominal and numerical features, divided into 5 categories (4 types of simulated security intrusions and normal network traffic). The data set consists of recorded network traffic in a simulated WSN composed of 100 sensor nodes and one base station (BS). The network uses the LEACH routing protocol [14] with the number of clusters set to 5. All simulated attacks exploit the properties of the routing protocol, and the attributes that characterise the status of each network node are based specifically on LEACH.


WSN-DS contains four types of simulated security intrusions [3]. In the course of a blackhole attack, the attacker pretends to be a cluster head (CH). The false CH captures, stores, and discards all received data and does not forward it to the base station. During a grayhole attack, the attacker assumes a similar strategy of false CH advertisement but drops only several randomly selected packets and forwards the rest to the base station. The flooding attack exploits several weak spots of the LEACH protocol. The attacker pretends to be the CH by sending a huge number of LEACH advertisement messages. This alone increases the power consumption of the nodes, but the rogue node intercepts network traffic, too. The scheduling attack destroys the time-division multiple-access (TDMA) medium access schedule and makes multiple nodes transmit data at the same time. The packet collisions caused by this behaviour result in data losses. In total, WSN-DS is composed of 5 classes of network traffic. However, the size of the classes is highly imbalanced. A detailed description of the structure of the data set is summarized in Table 1. It shows that the vast majority of the records corresponds to normal traffic and the individual attack classes represent only a very small portion of the whole data set.

Table 1. Network traffic classes in WSN-DS

Traffic class     | Num. of records | Percentage
Normal            | 340066          | 90.77%
Blackhole attack  | 10049           | 2.68%
Grayhole attack   | 14596           | 3.9%
Flooding attack   | 3312            | 0.88%
Scheduling attack | 6638            | 1.77%

5 Experiments and Results

The ability of different accelerators to train and execute neural intrusion detection models for WSNs was tested in a series of computational experiments. A version of WSN-DS by Almomani et al. [3] was used in the experiments as the source of data. It contains 18 features (2 of them identifying the source node) and 374,661 records of network traffic. The data features were encoded as numerical features and normalized for the experiments. To overcome the different sizes of the classes, the data set was split into smaller sets, each representing one type of WSN attack together with examples of the remaining network traffic (including other attacks). They were created using a randomized but stratified sampling strategy, which means that the sizes of the classes (attack/no attack) were the same in each set. The attack-specific data sets were further split into training (60%) and test (40%) subsets and used to train and evaluate different neural architectures as specialized single attack classifiers.
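The paper does not publish its preprocessing code; a minimal sketch of the described per-attack, balanced, stratified splitting, under the assumption that WSN-DS is available as a pandas DataFrame with a label column (here called attack_type, a hypothetical name), could look as follows.

# Minimal sketch of the described data preparation (column names are
# illustrative; the original WSN-DS feature names are not reproduced here)
import pandas as pd
from sklearn.model_selection import train_test_split

def single_attack_set(df: pd.DataFrame, attack: str, seed: int = 42):
    """Build a balanced binary data set for one attack type (attack vs. rest)."""
    positives = df[df["attack_type"] == attack]
    # sample as many non-attack records as there are attack records
    negatives = df[df["attack_type"] != attack].sample(len(positives),
                                                       random_state=seed)
    data = pd.concat([positives, negatives])
    X = data.drop(columns=["attack_type"]).to_numpy()
    y = (data["attack_type"] == attack).astype(int).to_numpy()
    # 60/40 train/test split, stratified over the binary label
    return train_test_split(X, y, train_size=0.6, stratify=y, random_state=seed)

# X_train, X_test, y_train, y_test = single_attack_set(wsn_ds, "Blackhole")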


Finally, the train and test data sets were complemented by a rest data set, which was the union of the test set with all the remaining records from WSN-DS. The best ANN-based intrusion detection models, found for each attack class in an earlier work [5], were trained on the train data by three different GPGPU-based floating-point accelerators, and the speed of the training and the accuracy of the trained models were evaluated. The accuracy of intrusion classification was measured in terms of the true positive rate, TPR, the false positive rate, FPR, the true negative rate, TNR, the false negative rate, FNR, and the accuracy, A, defined as [3]:

TPR = \frac{TP}{TP + FN}, \quad TNR = \frac{TN}{TN + FP}, \qquad (1)

FPR = \frac{FP}{FP + TN}, \quad FNR = \frac{FN}{FN + TP}, \qquad (2)

A = \frac{TP + TN}{TP + TN + FP + FN}. \qquad (3)
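For reference, Eqs. (1)-(3) translate directly into the following helper; this function is our own illustration and was not part of the original experimental code.

def classification_rates(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute the rates of Eqs. (1)-(3) from confusion-matrix counts."""
    return {
        "TPR": tp / (tp + fn),
        "TNR": tn / (tn + fp),
        "FPR": fp / (fp + tn),
        "FNR": fn / (fn + tp),
        "A":   (tp + tn) / (tp + tn + fp + fn),
    }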

Three Nvidia GPGPUs, common as of early 2020, were used in this work to accelerate the ANNs for attack classification:

• Nvidia Jetson TX2 is a mini computer with an integrated GPU. It is considered to be one of the most effective platforms for AI computing. The architecture of the integrated GPU is Pascal and its maximum power input is only 15 W.
• Nvidia GTX 1070 Ti is a widely used GPU that is mostly recognized as a gaming GPU. The architecture of the GTX 1070 Ti is Pascal, too, and its maximum power input is 180 W.
• Nvidia RTX 2080 has a newer architecture, called Turing, than the Jetson TX2 and the GTX 1070 Ti. It is considered to be a high-end GPU with a maximum power input of 225 W.

The main features of the GPUs are summarized in Table 2.

Table 2. Accelerators' properties

                      | RTX2080 | GTX1070Ti | TX2
GPU architecture      | Turing  | Pascal    | Pascal
CUDA cores            | 3072    | 2432      | 256
Base clock (MHz)      | 1650    | 1607      | 1300
Memory speed (Gbps)   | 15.5    | 8         | 59.7
Memory config (GB)    | 8       | 8         | 8
Mem. bandwidth (GB/s) | 496     | 256       | 59.7

The use of GPUs allows efficient utilization of their highly concurrent architecture for training and inference of neural networks over huge amounts of data.


The ability of fast learning is especially useful in situations when the environment changes and the models need to adapt. The GPGPUs used in this study are based on the Compute Unified Device Architecture (CUDA) [16]. CUDA is a hardware and software platform which enables the execution of general purpose programs on Nvidia GPUs. The CUDA runtime takes care of the scheduling and execution of kernels (CUDA routines) by many concurrent tasks (threads and thread groups) executed on the available hardware in parallel. A CUDA application is split between the CPU (host) and one or more GPUs (device). The host code can contain arbitrary operations, data types, and functions, while the device code can contain only a subset of operations and functions implemented on the device. The ANNs were in this work implemented in the Keras framework with TensorFlow 2.0 as a GPU-accelerated backend. The adoption of a high-level framework enabled the use of a single portable code base across all testing platforms. The experiments involved the training of an ANN architecture found successful in the single attack detection task in previous research [5]. The network had 16 neurons in the input layer, 2 hidden layers with 7 and 5 neurons, respectively, the rectified linear unit (ReLU) activation function in the neurons of the hidden layers, and one output neuron with the sigmoid activation function. Separate ANNs were trained for the individual attacks for 300 epochs, using input batches with sizes of 32, 64, 128, and 256 records, respectively. Due to the stochastic nature of some training steps, and in order to obtain representative time profiles, all experiments were repeated 31 times, independently. The results of the computational experiments are shown in Table 3. The table shows, for each type of attack and every batch size, the quality of learning (train), the quality of testing (test), and the overall quality of detection (rest) on every tested GPU architecture. It also shows the average times needed to train the networks on every accelerator. The table clearly confirms the expectation: the more powerful the GPGPU, the faster the training. Next, the comparison reveals that the quality, in terms of the TPR, FPR, TNR, FNR, and A measures, is comparable on different hardware. This indicates a robust and stable implementation of the training algorithms across different GPGPUs with different low-level properties. The high impact of the batch size parameter on the training time is clearly shown in the Time column of Table 3. It can be seen that the average reduction in training time between the minimum batch size of 32 and the maximum batch size of 256 is roughly sevenfold. The table also shows that this has no negative impact on the quality of the trained models. On the contrary, large batch sizes not only reduce the training time but also increase the ability of the trained models to detect single attacks for certain types of intrusions. Finally, the table shows the good ability of the low-power Jetson TX2 GPGPU to train the ANNs. At the largest batch size of 256, it was able to train the networks to classify the different types of intrusions in, on average, 1.5-6.7 s, at only 15 W of power.
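The exact training script is not part of the paper; a minimal Keras/TensorFlow 2 sketch of the described ANN-7-5 detector (16 inputs, hidden layers of 7 and 5 ReLU neurons, one sigmoid output) and of the batch-size parameter under study could look as follows. The optimizer and loss are our assumptions, as the paper does not state them.

# Minimal sketch of the described ANN-7-5 single-attack detector
# (layer sizes follow the text; optimizer/loss are assumptions)
import tensorflow as tf
from tensorflow.keras import layers, models

def build_detector() -> tf.keras.Model:
    model = models.Sequential([
        layers.Dense(7, activation="relu", input_shape=(16,)),  # hidden layer 1
        layers.Dense(5, activation="relu"),                     # hidden layer 2
        layers.Dense(1, activation="sigmoid"),                  # attack / no attack
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# one cell of the experiment grid: 300 epochs, batch size in {32, 64, 128, 256}
# model = build_detector()
# model.fit(X_train, y_train, epochs=300, batch_size=256, verbose=0)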

Table 3. Average accuracy and training times (in seconds) of the ANN-7-5 intrusion detector on different hardware and with different input batch sizes.

TX2
Attack    | Batch | train: TPR FPR TNR FNR A      | test: TPR FPR TNR FNR A       | rest: TPR FPR TNR FNR A       | Time [s]
Grayhole  | 32    | 0.982 0.160 0.840 0.018 0.911 | 0.979 0.157 0.843 0.021 0.912 | 0.981 0.038 0.962 0.019 0.962 | 52.068
Grayhole  | 64    | 0.995 0.162 0.838 0.005 0.916 | 0.993 0.159 0.841 0.007 0.918 | 0.994 0.039 0.961 0.006 0.962 | 25.323
Grayhole  | 128   | 0.996 0.162 0.838 0.004 0.916 | 0.994 0.159 0.841 0.006 0.918 | 0.995 0.040 0.960 0.005 0.962 | 12.610
Grayhole  | 256   | 0.996 0.161 0.839 0.004 0.917 | 0.995 0.158 0.842 0.005 0.919 | 0.996 0.040 0.960 0.004 0.962 | 6.657
TDMA      | 32    | 0.928 0.028 0.972 0.072 0.950 | 0.931 0.024 0.976 0.069 0.953 | 0.929 0.011 0.989 0.071 0.988 | 24.262
TDMA      | 64    | 0.928 0.015 0.985 0.072 0.956 | 0.932 0.013 0.987 0.068 0.959 | 0.929 0.007 0.993 0.071 0.992 | 11.277
TDMA      | 128   | 0.926 0.010 0.990 0.074 0.958 | 0.931 0.009 0.991 0.069 0.961 | 0.928 0.006 0.994 0.072 0.993 | 5.662
TDMA      | 256   | 0.930 0.012 0.988 0.070 0.959 | 0.934 0.012 0.988 0.066 0.961 | 0.931 0.008 0.992 0.069 0.991 | 3.231
Blackhole | 32    | 1.000 0.100 0.900 0.000 0.950 | 1.000 0.104 0.896 0.000 0.948 | 1.000 0.016 0.984 0.000 0.984 | 36.335
Blackhole | 64    | 1.000 0.098 0.902 0.000 0.951 | 1.000 0.103 0.897 0.000 0.948 | 1.000 0.016 0.984 0.000 0.984 | 17.400
Blackhole | 128   | 1.000 0.098 0.902 0.000 0.951 | 1.000 0.102 0.898 0.000 0.949 | 1.000 0.016 0.984 0.000 0.985 | 8.950
Blackhole | 256   | 1.000 0.098 0.902 0.000 0.951 | 0.999 0.102 0.898 0.001 0.948 | 1.000 0.016 0.984 0.000 0.985 | 4.646
Flooding  | 32    | 0.994 0.045 0.955 0.006 0.975 | 0.995 0.053 0.947 0.005 0.971 | 0.995 0.015 0.985 0.005 0.985 | 12.843
Flooding  | 64    | 0.992 0.014 0.986 0.008 0.989 | 0.994 0.013 0.987 0.006 0.991 | 0.993 0.005 0.995 0.007 0.995 | 5.792
Flooding  | 128   | 0.991 0.005 0.995 0.009 0.993 | 0.993 0.007 0.993 0.007 0.993 | 0.992 0.003 0.997 0.008 0.997 | 2.936
Flooding  | 256   | 0.992 0.005 0.995 0.008 0.994 | 0.994 0.007 0.993 0.006 0.994 | 0.993 0.003 0.997 0.007 0.997 | 1.512

GTX1070Ti
Attack    | Batch | train: TPR FPR TNR FNR A      | test: TPR FPR TNR FNR A       | rest: TPR FPR TNR FNR A       | Time [s]
Grayhole  | 32    | 0.993 0.178 0.822 0.007 0.907 | 0.991 0.172 0.828 0.009 0.910 | 0.992 0.041 0.959 0.008 0.961 | 9.644
Grayhole  | 64    | 0.994 0.172 0.828 0.006 0.910 | 0.993 0.169 0.831 0.007 0.912 | 0.994 0.040 0.960 0.006 0.961 | 4.492
Grayhole  | 128   | 0.993 0.169 0.831 0.007 0.911 | 0.992 0.166 0.834 0.008 0.913 | 0.992 0.040 0.960 0.008 0.961 | 2.323
Grayhole  | 256   | 0.992 0.168 0.832 0.008 0.911 | 0.990 0.166 0.834 0.010 0.913 | 0.991 0.040 0.960 0.009 0.961 | 1.236
TDMA      | 32    | 0.928 0.035 0.965 0.072 0.946 | 0.932 0.029 0.971 0.068 0.951 | 0.929 0.029 0.971 0.071 0.970 | 4.456
TDMA      | 64    | 0.927 0.024 0.976 0.073 0.952 | 0.933 0.020 0.980 0.067 0.956 | 0.930 0.018 0.982 0.070 0.981 | 2.095
TDMA      | 128   | 0.927 0.019 0.981 0.073 0.954 | 0.933 0.016 0.984 0.067 0.959 | 0.929 0.017 0.983 0.071 0.983 | 1.072
TDMA      | 256   | 0.928 0.018 0.982 0.072 0.955 | 0.933 0.016 0.984 0.067 0.958 | 0.930 0.018 0.982 0.070 0.981 | 0.564
Blackhole | 32    | 0.992 0.097 0.903 0.008 0.948 | 0.991 0.102 0.898 0.009 0.944 | 0.992 0.016 0.984 0.008 0.984 | 6.547
Blackhole | 64    | 1.000 0.098 0.902 0.000 0.951 | 1.000 0.103 0.897 0.000 0.948 | 1.000 0.016 0.984 0.000 0.984 | 3.117
Blackhole | 128   | 1.000 0.098 0.902 0.000 0.951 | 1.000 0.103 0.897 0.000 0.948 | 1.000 0.016 0.984 0.000 0.984 | 1.636
Blackhole | 256   | 1.000 0.098 0.902 0.000 0.951 | 0.999 0.103 0.897 0.001 0.948 | 1.000 0.016 0.984 0.000 0.984 | 0.860
Flooding  | 32    | 0.984 0.046 0.954 0.016 0.969 | 0.980 0.053 0.947 0.020 0.963 | 0.982 0.015 0.985 0.018 0.985 | 2.385
Flooding  | 64    | 0.984 0.012 0.988 0.016 0.986 | 0.983 0.009 0.991 0.017 0.987 | 0.984 0.004 0.996 0.016 0.996 | 1.073
Flooding  | 128   | 0.988 0.011 0.989 0.012 0.988 | 0.989 0.009 0.991 0.011 0.990 | 0.988 0.004 0.996 0.012 0.996 | 0.556
Flooding  | 256   | 0.988 0.007 0.993 0.012 0.991 | 0.988 0.008 0.992 0.012 0.990 | 0.988 0.003 0.997 0.012 0.997 | 0.294

RTX 2080
Attack    | Batch | train: TPR FPR TNR FNR A      | test: TPR FPR TNR FNR A       | rest: TPR FPR TNR FNR A       | Time [s]
Grayhole  | 32    | 0.996 0.166 0.834 0.004 0.914 | 0.994 0.165 0.835 0.006 0.915 | 0.995 0.040 0.960 0.005 0.961 | 6.538
Grayhole  | 64    | 0.996 0.162 0.838 0.004 0.916 | 0.995 0.159 0.841 0.005 0.918 | 0.995 0.040 0.960 0.005 0.962 | 2.876
Grayhole  | 128   | 0.994 0.160 0.840 0.006 0.917 | 0.994 0.157 0.843 0.006 0.919 | 0.994 0.039 0.961 0.006 0.962 | 1.475
Grayhole  | 256   | 0.996 0.162 0.838 0.004 0.916 | 0.995 0.160 0.840 0.005 0.918 | 0.996 0.040 0.960 0.004 0.962 | 0.787
TDMA      | 32    | 0.927 0.030 0.970 0.073 0.948 | 0.931 0.025 0.975 0.069 0.953 | 0.928 0.014 0.986 0.072 0.985 | 2.882
TDMA      | 64    | 0.928 0.017 0.983 0.072 0.956 | 0.931 0.014 0.986 0.069 0.959 | 0.929 0.011 0.989 0.071 0.988 | 1.384
TDMA      | 128   | 0.928 0.012 0.988 0.072 0.958 | 0.932 0.011 0.989 0.068 0.960 | 0.930 0.007 0.993 0.070 0.991 | 0.687
TDMA      | 256   | 0.929 0.012 0.988 0.071 0.959 | 0.934 0.012 0.988 0.066 0.961 | 0.931 0.007 0.993 0.069 0.992 | 0.368
Blackhole | 32    | 0.996 0.098 0.902 0.004 0.949 | 0.995 0.102 0.898 0.005 0.946 | 0.995 0.016 0.984 0.005 0.984 | 4.411
Blackhole | 64    | 1.000 0.098 0.902 0.000 0.951 | 0.999 0.103 0.897 0.001 0.948 | 1.000 0.016 0.984 0.000 0.984 | 2.031
Blackhole | 128   | 1.000 0.098 0.902 0.000 0.951 | 1.000 0.103 0.897 0.000 0.948 | 1.000 0.016 0.984 0.000 0.984 | 1.006
Blackhole | 256   | 1.000 0.098 0.902 0.000 0.951 | 1.000 0.103 0.897 0.000 0.948 | 1.000 0.016 0.984 0.000 0.984 | 0.551
Flooding  | 32    | 0.980 0.038 0.962 0.020 0.971 | 0.977 0.044 0.956 0.023 0.966 | 0.979 0.013 0.987 0.021 0.987 | 1.590
Flooding  | 64    | 0.986 0.014 0.986 0.014 0.986 | 0.985 0.013 0.987 0.015 0.986 | 0.986 0.005 0.995 0.014 0.995 | 0.683
Flooding  | 128   | 0.987 0.010 0.990 0.013 0.989 | 0.988 0.008 0.992 0.012 0.990 | 0.988 0.004 0.996 0.012 0.996 | 0.371
Flooding  | 256   | 0.988 0.008 0.992 0.012 0.990 | 0.990 0.008 0.992 0.010 0.991 | 0.989 0.003 0.997 0.011 0.997 | 0.195

The mobile platform is 7.5-8.4 times slower, but consumes 15 times less energy than the high-end GPGPU.

6 Conclusions

This work investigated the acceleration of ANN-based intrusion detection models for wireless sensor networks. Three different GPGPUs, representing families of energy-efficient, mid-range, and high-end floating-point accelerators, were used to train multilayer artificial neural networks that served as single attack intrusion detectors for wireless sensor networks. The experiments used a realistic intrusion data set, WSN-DS, describing attacks typical for the environment of wireless sensor networks running the low-energy adaptive clustering hierarchy routing protocol. The computational experiments showed that GPGPUs can efficiently accelerate the ANN training process and that different GPU architectures can train networks with similar classification accuracy. The level of training speedup depends on the architecture and capability of the hardware, and on other parameters of the training algorithm. In particular, the effect of the input batch size on the training time and model accuracy was studied. The computational experiments showed that an increase in the batch size reduced the average training time and, at the same time, maintained or even slightly improved model accuracy. It is, however, also known that small batch sizes are preferred for some other ANN architectures, for example, deep convolutional neural networks [18]. Special attention was paid in this research to the hardware acceleration of ANN-based intrusion detectors by energy-efficient accelerators, represented by the Nvidia Jetson TX2. The experiments showed that despite its highly constrained architecture, not comparable to medium and top-range GPGPUs, the mobile device is able to train ANN intrusion detection models in a reasonable time without compromising their accuracy. This makes such devices useful for the efficient execution of local intrusion detection models operating in the changing conditions of real-world wireless sensor networks, which require occasional adaptation (retraining) of the models but usually have only limited resources and a small energy budget.

Acknowledgements. This work was supported from ERDF in project "A Research Platform focused on Industry 4.0 and Robotics in Ostrava", reg. no. CZ.02.1.01/0.0/0.0/17 049/0008425 and by the grants of the Student Grant System no. SP2020/108 and SP2020/161, VSB - Technical University of Ostrava, Czech Republic.

References

1. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor networks: a survey. Comput. Netw. 38(4), 393-422 (2002)
2. Al Jallad, K., Aljnidi, M., Desouki, M.S.: Big data analysis and distributed deep learning for next-generation intrusion detection system optimization. J. Big Data 6(1), 88 (2019). https://doi.org/10.1186/s40537-019-0248-6
3. Almomani, I., Al-Kasasbeh, B., AL-Akhras, M.: WSN-DS: a dataset for intrusion detection systems in wireless sensor networks. J. Sens. 2016 (2016). https://doi.org/10.1155/2016/4731953
4. Barthélemy, J., Verstaevel, N., Forehead, H., Perez, P.: Edge-computing video analytics for real-time traffic monitoring in a smart city. Sensors 19(9), 2048 (2019). https://doi.org/10.3390/s19092048
5. Batiha, T., Prauzek, M., Krömer, P.: Intrusion detection in wireless sensor networks by an ensemble of artificial neural networks. In: Czarnowski, I., Howlett, R.J., Jain, L.C. (eds.) Intelligent Decision Technologies 2019, pp. 323-333. Springer, Singapore (2020)
6. Bishop, M.: Computer Security: Art and Science. Addison-Wesley, Boston (2003)
7. Carlson, K., Nageswaran, J., Dutt, N., Krichmar, J.: An efficient automated parameter tuning framework for spiking neural networks. Front. Neurosci. 8, 10 (2014). https://doi.org/10.3389/fnins.2014.00010
8. Cayirci, E., Rong, C.: Security in Wireless Ad Hoc and Sensor Networks. Wiley, Hoboken (2008)
9. Chollet, F.: Deep Learning with Python, 1st edn. Manning Publ. Co., USA (2017)
10. Debar, H., Dacier, M., Wespi, A.: A revised taxonomy for intrusion-detection systems. Annales des Télécommunications 55(7), 361-378 (2000). https://doi.org/10.1007/BF02994844
11. Ergezinger, S., Thomsen, E.: An accelerated learning algorithm for multilayer perceptrons: optimization layer by layer. IEEE Trans. Neural Netw. 6(1), 31-42 (1995). https://doi.org/10.1109/72.363452
12. Fahmy, H.: Wireless Sensor Networks: Concepts, Applications, Experimentation and Analysis. Signals and Communication Technology. Springer, Singapore (2016)
13. Ghosh, A.K., Schwartzbard, A.: A study in using neural networks for anomaly and misuse detection. In: Proceedings of the 8th Conference on USENIX Security Symposium - Volume 8, SSYM 1999, p. 12. USENIX Association, Berkeley (1999)
14. Heinzelman, W.R., Chandrakasan, A., Balakrishnan, H.: Energy-efficient communication protocol for wireless microsensor networks. In: Proceedings of the 33rd Annual Hawaii International Conference on System Sciences, vol. 2, p. 10 (2000). https://doi.org/10.1109/HICSS.2000.926982
15. Kang, M.J., Kang, J.W.: Intrusion detection system using deep neural network for in-vehicle network security. PLoS ONE 11(6), 1-17 (2016). https://doi.org/10.1371/journal.pone.0155781
16. Kirk, D.: Nvidia CUDA software and GPU parallel computing architecture. In: Proceedings of the 6th International Symposium on Memory Management, ISMM 2007, pp. 103-104. Association for Computing Machinery, New York (2007). https://doi.org/10.1145/1296907.1296909
17. Liu, H., Lang, B.: Machine learning and deep learning methods for intrusion detection systems: a survey. Appl. Sci. 9(20) (2019). https://doi.org/10.3390/app9204396
18. Masters, D., Luschi, C.: Revisiting small batch training for deep neural networks (2018)
19. Min, E., Long, J., Liu, Q., Cui, J., Chen, W.: TR-IDS: anomaly-based intrusion detection through text-convolutional neural network and random forest (2018). https://doi.org/10.1155/2018/4943509
20. Mirsky, Y., Doitshman, T., Elovici, Y., Shabtai, A.: Kitsune: an ensemble of autoencoders for online network intrusion detection. CoRR abs/1802.09089 (2018)
21. Mukherjee, B., Heberlein, L.T., Levitt, K.N.: Network intrusion detection. IEEE Netw. 8(3), 26-41 (1994). https://doi.org/10.1109/65.283931
22. Oreku, G., Pazynyuk, T.: Security in Wireless Sensor Networks. Risk Engineering. Springer, Cham (2016)
23. Stallings, W., Brown, L.: Computer Security: Principles and Practice, 4th edn. Pearson, New York (2018)
24. Stehlik, M., Matyas, V., Stetsko, A.: Attack detection using evolutionary computation, pp. 99-129. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-47715-2_5
25. Thamilarasu, G., Chawla, S.: Towards deep-learning-driven intrusion detection for the internet of things. Sensors 19(9), 1977 (2019). https://doi.org/10.3390/s19091977
26. Vanhoucke, V., Senior, A., Mao, M.Z.: Improving the speed of neural networks on CPUs. In: Deep Learning and Unsupervised Feature Learning Workshop, NIPS (2011)
27. Varghese, J., Muniyal, B.: A comparative analysis of different soft computing techniques for intrusion detection system. In: Thampi, S., Rawat, D., Alcaraz Calero, J., Madria, S., Wang, G. (eds.) Security in Computing and Communications - 6th International Symposium, SSCC 2018, Revised Selected Papers, Communications in Computer and Information Science, pp. 563-577. Springer, Germany (2019). https://doi.org/10.1007/978-981-13-5826-5_44
28. Yu, Y., Ge, Y., Fu-xiang, G.: A neural network approach for misuse and anomaly intrusion detection. Wuhan Univ. J. Nat. Sci. 10(1), 115-118 (2005). https://doi.org/10.1007/BF02828630

End-to-End Security for Connected Vehicles

Kazi J. Ahmed, Marco Hernandez, Myung Lee, and Kazuya Tsukamoto

City University of New York (CUNY), New York, USA
[email protected], [email protected]
Mexico Autonomous Institute of Technology (ITAM), Mexico City, Mexico
[email protected]
Kyushu Institute of Technology, Kitakyushu, Japan
[email protected]

Abstract. Recently, Mode 4 operation of Cellular Vehicle-to-Everything (C-V2X) has specified the operation of vehicle-to-vehicle, vehicle-to-pedestrian and vehicle-to-UE-stationary communication over the PC5 interface. However, security is delegated to the application layer, which is out of the scope of the 3GPP-layer specification. Hence, we propose a transparent and independent distributed security protocol for C-V2X over the PC5 interface at the RLC layer, based on cryptographic ratchets. Our newly proposed security protocol provides authenticated encryption, integrity, and forward and backward secrecy. The security procedure can start on the fly, as soon as vehicles enter a C-V2X group over the PC5 interface, using the cryptographic credentials of the digital certificate issued for ITS applications. The distributed security protocol supports strong encryption, authentication and privacy regardless of the use case in 5G applications for C-V2X over the PC5 interface.

1 Introduction

A major focus of the incoming 5G cellular networks is to secure them from the ground up, protecting the confidentiality and integrity of data and control frames with strong encryption, as well as authenticating users. However, the 5G security protocols are not applicable to the PC5 interface, termed Mode 4, as the 5G security mechanisms are embedded in the network, leaving the security of C-V2X over PC5 out of the scope of the 3GPP layer [1]. Of course, security may be applied at the application layer. Initially, 3GPP Release 12 mainly specified the Proximity Services (ProSe) for Device-to-Device (D2D) communication, including unicast, multicast and broadcast. Later, the support for Vehicle-to-Everything (V2X) over the sidelink PC5 interface was introduced by Releases 14 and 15 [2, 3], which was in fact a legacy of the early ITS wireless standards. However, IEEE WAVE [4] and ITS-5G [5] prioritized emergency and safety communication (the vehicle's position, speed, technical data, user identification, etc.), where latency was a concern for safety. Thus, early ITS applications do not include association, handshakes, etc., which essentially means that all vehicles that want to transmit broadcast, while the rest listen under a certain channel-access schedule. The security coordination was delegated to the layers above the MAC layer. To provide

End-to-End Security for Connected Vehicles

217

multicast and unicast among other services, 3GPP is currently in development for Release 16, C-V2X for 5G. This new V2X systems in 5G can bring new threats, which necessitates to revisit the security assessment for such applications in 5G. Conventionally, the secure communication is modeled to achieve confidentiality in unicast or multicast sessions, while assuming the communication interfaces are ideal, from the physical layer to the application layer. Moreover, it is thought that an adversary only observes and interacts with the communication channel. However, from the recent security breaches, it is clear that the system vulnerabilities due to malware or implementation bugs in hardware and software are critical and an immediate threat. For instance, an adversary does not need to break a cryptographic key or cipher, but simply extracts it using a system exploit. Several solutions have already been proposed to mitigate this vulnerability, most notably the family of protocols named Off the Record (OTR) [6]. The concept is to mitigate the damage of a compromised key by regularly refreshing keys, while making computationally hard to derive future or past keys from a compromised key. These OTR based protocols for end-to-end encryption attempt to increase users’ privacy as the encrypted traffic is not controlled by intermediary service providers. The author in [7] introduced the idea of using a one-way function in the process of updating a message key with the aim of establishing forward and backward secrecy, later named ratchet. However, the assumption that ciphers like AES and Elliptic-Curve (EC) cryptography are robust, and as such if there is a security breach via a system exploit, that would be more likely in the key management. In this proposed work, we employ Radio Link Control (RLC) layer based security solution without interacting with the service provider. Thus it can run security procedure on the fly as soon as one wants to communicate other. Moreover, unlike others, key refreshing is controlled only by the initiator, thus simplifies RLC procedure and reduces the key refreshment latency. Hence, we propose a distributed security protocol for C-V2X over the PC5 interface at the RLC layer based on cryptographic ratchets, which is transparent and independent to the application layer. The security protocol introduces cryptographic ratchets and an ephemeral version of the Diffie-Hellman (DH) algorithm where keys are created without central control, securely protecting the out-of-coverage use case. The security procedure will start on the fly as soon as vehicles enter a C-V2X group over the PC5 interface, using the cryptographic credentials of the digital certificate issued for ITS applications. The proposed distributed security protocol supports strong encryption, authentication, integrity and privacy regardless of the use cases in 5G applications for C-V2X over the PC5 interface.

2 Background

The proposed security protocol introduces the use of cryptographic ratchets to secure the PC5 interface of C-V2X sessions, and an ephemeral version of the Diffie-Hellman (DH) algorithm for the initial cryptographic handshake, protecting the out-of-coverage use case. Keys are created without central control, and a simplified cryptographic ratchet algorithm streamlines implementation, allowing fast data transmission. While the proposal runs at the RLC layer, in conjunction with the randomization of MAC addresses [1] it supports strong privacy of user identities and sensitive vehicle data without the need to use pseudonyms. Moreover, our security protocol encrypts RLC-PDUs that include Basic Safety Messages (BSM) or similar ITS messages, the vehicle user/owner identity, group identity, IP address, etc. The combination of MAC-address randomization and the proposed security protocol thus enables strong privacy as well. This protocol is also geared toward C-V2X Release 16 over the PC5 interface regardless of the use case, including use cases such as V2UAV. To the authors' knowledge, there is no security protocol that addresses such a case in C-V2X applications.

2.1 Protocol Stack for Connected Vehicles

C-V2X specifies Vehicle-to-Vehicle (V2V), Vehicle-to-Infrastructure (V2I) and Vehicle-to-Pedestrian (V2P) communication via the PC5 interface, assuming out-of-coverage scenarios. The operation of such ad-hoc networks without access to the 5G cellular network is supported with two protocol stacks: 1) in the user data plane (UPL) and 2) in the user control plane (CPL) [8]. The communication protocol stack specified for the CPL differs from the UPL in the radio resource control (RRC) layer. Cellular V2X (C-V2X) communications are supported with two logical channels: 1) the Sidelink Broadcast Control Channel (SBCCH) to carry Control Plane (CPL) messages, and 2) the Sidelink Traffic Channel (STCH) to carry User Plane (UPL) data [8]. Sidelink Control Information (SCI) is used to transmit control information for processing time and frequency. In out-of-coverage scenarios of a 5G network, the PC5 interface of C-V2X comes into play. However, 3GPP did not lay out any security specification for this interface. We propose a distributed security procedure to secure the data flow at the RLC layer in order to produce secure RLC Protocol Data Units (RLC-PDUs), as shown in Fig. 1a.

Fig. 1. Secure RLC-PDUs: (a) secure RLC-PDU format (MAC header; PN + encrypted MAC SDU + MIC; PHY transport blocks); (b) key-exchange message sequence between UE k and UE l (Get.Key.Request, Get.Key.Response, Send.DH.Request, Send.DH.Response).

In order for this to work, the MAC header, the SCI, as well as the SL-DCH transport blocks need to be transmitted in the clear. After MAC packet filtering, the authenticated decryption and integrity check of secure RLC-PDUs can take place. As shown in Fig. 1a, the Packet Number (PN) is used for protection against replay packets, and the Message Integrity Code (MIC) is part of the authenticated encryption for integrity and authentication protection of secure RLC-PDUs.

2.2 Protection from Compromised Keys

A cryptographic ratchet is a procedure for updating an encryption key combined with a one-way function, so that past and future keys are computationally hard to derive from a compromised key. The proposed security protocol is based on conjectured one-way functions: a cryptographic hash function and the discrete logarithm problem, as in the Diffie-Hellman key exchange [9]. Specifically, we use the Elliptic Curve DH (ECDH) algorithm [10] and the cryptographic hash embedded in the Hash-based Key Derivation Function (HKDF) [11] to update encryption keys. Initially, the parties establish a shared secret key, which is then used as the initialization of the cryptographic ratchet algorithm. An ephemeral version of the ECDH algorithm is employed to derive such a shared secret key, or root key, covering out-of-coverage use cases (Mode 4 of the PC5 interface). Thus, some added mechanisms for authentication are required in order to ensure that the cryptographic credentials are from the intended party. These mechanisms build on a deniable authentication protocol based on the Diffie-Hellman algorithm [7, 13]. The procedure is performed by the concatenation of ECDH operations applied to the public key of the vehicle's digital certificate, an ephemeral key generated per session, and a signed key for additional authentication. We assume the vehicle's digital certificate, which includes the vehicle's public key, has been validated by the corresponding Certificate Authority (CA) repository. These multiple assurances prevent attacks, like man-in-the-middle (MitM) attacks, even if the vehicle is out-of-coverage.

2.3 Security Keys and Functions

Our proposed security procedure works with several security keys and functions. The general form of an asymmetric key pair is (d_K, K), where d_K represents the private key and K the public key. Each vehicle has an identity key pair (d_IK, IK), where IK is the public key in the vehicle's digital certificate. Vehicles also have an ephemeral key pair (d_EK, EK), which is generated only once per invocation and is disposable afterwards. In addition, each vehicle has a signed key pair (d_SK, SK), where SK is transformed to octets and digitally signed by d_IK using EdDSA [12]; the signature is denoted SigSK. In order for this procedure to work, every vehicle has to post its public keys and digital signature to the corresponding ITS-PKI repository, as well as the refreshed versions when available. Moreover, each vehicle has to store previous keys locally in a secure location, in case of delayed packets. Further, every vehicle has to securely delete every disposable key pair and signature, both locally and in the corresponding repository. We call the set of public keys and signature {IK, SK, SigSK} of a given vehicle its bundle-keys; they need to be posted to the corresponding repository for online validation.

Besides key pairs, we also employ several functions to run the security procedure:

• DS.Test(PK, Signature): performs a verification of Signature using the public key PK.
• DH.PK(d_PK1, PK2): obtains the shared secret key from the ECDH algorithm with the private key d_PK1 and the public key PK2.
• RSC(): obtains the value of a receiver sequence counter used to protect against replay packets.
• HKDF(): obtains the message key from the Hash-based Key Derivation Function defined in [11] as MK = HKDF(Z, Salt, Hash, L, OtherInfo). Here MK is the message key for encryption and decryption; Z is the shared secret, as an octet sequence, generated during the execution of a key-establishment scheme of either the EDH algorithm (during the cryptographic handshake) or the conventional ECDH algorithm (during the cryptographic ratchet); Salt is a message authentication code (MAC) used as the key for the randomness-extraction step during two-step key derivation; Hash indicates the hash function employed (SHA-2 or SHA-3) in the HMAC procedure for randomness extraction and key expansion; L is the length of MK in bits; and OtherInfo is an octet sequence of context-specific data whose value does not change during the execution of the HKDF.

Finally, we briefly summarize the ECDH parameters in order to introduce the notation used across the paper. Given the domain parameters for EC cryptography (p, a, b, G, n, h), there exists a curve over a finite field F_p and a base point G such that, for a random number d in [1, 2, ..., n - 1], Q = dG. The generated key pair (d, Q) defines Q as the public key and d as the private key.
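To make the key material concrete, the following minimal sketch derives a message key from an ECDH exchange followed by the HKDF, using the pyca/cryptography library. The curve (P-256), the key length and the Salt/OtherInfo values are illustrative assumptions, not values taken from the paper.

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Two key pairs (d, Q), standing in for any of the identity/ephemeral/signed pairs.
d_k = ec.generate_private_key(ec.SECP256R1())
d_l = ec.generate_private_key(ec.SECP256R1())

# DH.PK(d_PK1, PK2): ECDH shared secret; both sides compute the same octets.
z = d_k.exchange(ec.ECDH(), d_l.public_key())

# MK = HKDF(Z, Salt, Hash, L, OtherInfo): derive a 256-bit message key.
mk = HKDF(
    algorithm=hashes.SHA256(),   # Hash
    length=32,                   # L = 256 bits
    salt=b"mac-as-salt",         # Salt (illustrative)
    info=b"context-specific",    # OtherInfo (illustrative)
).derive(z)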

3 Proposed Scheme

3.1 Cryptographic Handshake and Initial Shared Key

The Extended Diffie-Hellman (EDH) algorithm [7, 13, 14] is used as the key agreement protocol that establishes a shared secret key between the parties. Besides cryptographic deniability at the RLC layer, it allows authenticated encryption (AE) and an integrity check during the cryptographic handshake. This step creates an initial shared key that is used as the root key in the subsequent security procedure. The EDH procedure is depicted below in Algorithm 1. To follow the procedure, let Peer k and Peer l be two parties that need to exchange encrypted information. In case of an offline or out-of-coverage situation, one party, Peer k, can start a secure session by requesting Peer l's public keys either directly via a command frame, or from a CA repository. As mentioned before, the public key in the digital certificate is defined as the Identity Key (IK). The corresponding private key is located in a secure location in every peer device.
Now, Peer k starts a secure communication session with Peer l over PC5 by requesting the bundle-keys of Peer l via the command frame Get.Key.Request(), and validates these bundle-keys with the corresponding CA repository. If Peer k is offline from the 5G network, such validation is postponed until Peer k is back online. However, the cryptographic handshake can continue in the meantime. In order to get the bundle-keys of Peer k, Peer l can follow a similar procedure. Peer k only proceeds with the EDH algorithm once a full validation of the cryptographic credentials has passed. The EDH handshake consists of two parts: 1) deriving the SSK in Peer k, and 2) deriving the SSK in Peer l. Once both SSKs match, the derived SSK is used as the Root Key (RK) for the cryptographic ratchet algorithm in order to provide forward and backward secrecy. The EDH procedure 1 is executed by Peer k, while the EDH procedure 2 is executed by Peer l.

Procedure EDH.2 Input: Keys of Peer l; Payload of the received from Peer k. Output : Shared Secret Key (SSK) Steps:

Peer receives a request to send its bundle-keys to Peer ; Peer sends its keys to Peer ; its keys Peer receives from Peer ; and Signature Check: ; Return Status: FAIL and ;

Encrypt with a known message Concatenate Send over to Peer l for response.

: 0, if fails. ; ;

;

Decrypt the sent message with to see whether it have the know message. If fail

After a successful EDH procedure, the vehicles share a secret key (SSK), which can be used to initialize the cryptographic ratchet algorithm. Moreover, after the SSK is successfully generated, the pair exchange a DH completion message to end the procedure, as shown in Fig. 1b. The shared secret input to the HKDF is formed as D = (d_IK_k · d_SK_l · G) || (d_EK_k · d_IK_l · G) || (d_EK_k · d_SK_l · G) to avoid MitM attacks.
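The triple concatenation above can be sketched with the same library as before; the key names mirror the identity, ephemeral and signed pairs of Sect. 2.3, and the curve choice remains an illustrative assumption.

from cryptography.hazmat.primitives.asymmetric import ec

def shared_secret_peer_k(d_ik_k, d_ek_k, ik_l, sk_l):
    # D = DH(IK_k, SK_l) || DH(EK_k, IK_l) || DH(EK_k, SK_l), as formed by
    # Peer k from its private keys and Peer l's public keys; Peer l derives
    # the same octets from the mirrored operations on its side.
    return (d_ik_k.exchange(ec.ECDH(), sk_l)
            + d_ek_k.exchange(ec.ECDH(), ik_l)
            + d_ek_k.exchange(ec.ECDH(), sk_l))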

3.2 Message Key Creation and Management

After successful completion of the EDH procedure, the cryptographic handshake is complete and the resulting shared secret key (SSK) becomes the root key (RK) of the cryptographic ratchet algorithm. Initialized with RK: 1) the HKDF ratchet provides a symmetric key for a block or stream cipher (backward secrecy); 2) a conventional ECDH algorithm acts as the DH ratchet to provide a shared secret key used as the HMAC key to the HKDF (forward secrecy); 3) public keys for the DH ratchet are exchanged via RLC-PDUs over a secure channel. Fig. 1b illustrates this exchange with the commands Send.DH.Request() and Send.DH.Response().
Our proposed security procedure supports backward secrecy by utilizing the HKDF ratchet. Let us assume the initiator, Peer k, controls the refreshing of the cryptographic ratchet keys. The HKDF ratchet is then implemented as an iterative one-way function given by MK_{n+1} = HKDF(MK_n, SK_{n+1}, Hash, L, OtherInfo), where n = 0, 1, ...; MK_n is the message key for encryption and decryption at stage n; MK_0 = RK is the root key from the EDH algorithm; the Salt, SK, is the shared secret key from the DH ratchet generated by the k-th user; Hash is SHA-3 [15]; L is the length of the message key in bits; and OtherInfo = IK_k || IK_l || PN, where PN (Packet Number) is an unsigned integer rollover counter initialized to 0 at the start of a secure communication session and incremented by 1 per transmitted RLC-PDU, used for protection against replay frames. Notice that the same message key MK is derived in both Peer k and Peer l at stage n, for encryption and decryption respectively. While the HKDF ratchet provides backward secrecy, the DH ratchet supports forward secrecy: if a given message key is compromised, future keys cannot be derived, as the DH ratchet resets the HKDF ratchet. The exchange of public keys and management information is performed over a secured channel with integrity and authentication checks.
The main motivation for cryptographic ratchets is to mitigate the damage of a compromised key by regularly refreshing keys, while making it computationally hard to derive future or past keys from a compromised key, based on one-way functions. Several protocols have been proposed to increase users' privacy by end-to-end encryption in which the encrypted traffic is not controlled by intermediary service providers. For instance, the protocol in [7] introduced the concept of the double ratchet, which handles 3 chains, a root, a sending and a receiving chain, in which a message key is refreshed by swapping the sending and receiving chains between two end-to-end participants. These participants take turns refreshing the ratchet keys, like a table tennis game. Our proposal, in contrast, uses a generalized iterative HKDF and ECDH as cryptographic ratchets without the need to define such chains (root, sending, receiving) and without the need for participants to take turns refreshing the ratchet keys as the algorithm goes on. As such, the initiator controls the refreshing of keys during a communication session, simplifying the protocol implementation.
Algorithm 2 illustrates the proposed cryptographic ratchet protocol for C-V2X over the PC5 interface, performed by Peer k and by Peer l. The DH ratchet, however, requires the exchange of public keys between the peers. The commands Get.Keys.Request() and Get.Keys.Response() are used for this purpose over a secure channel, as shown in Fig. 1b. These public keys can be inserted in RLC-PDUs. After the public keys of the DH ratchet are exchanged, the derived SK_{n+1} at stage n+1 is used as the Salt input of the iterative HKDF ratchet, and consequently a new message key is computed in Peer k for encryption and in Peer l for decryption.
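One HKDF-ratchet step can be sketched as follows with pyca/cryptography; SHA3-256 and a 32-byte key length are one reasonable reading of "SHA-3" and "L bits", and the argument values are placeholders.

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def hkdf_ratchet_step(mk_n, sk_next, other_info):
    # MK_{n+1} = HKDF(MK_n, SK_{n+1}, Hash, L, OtherInfo)
    return HKDF(
        algorithm=hashes.SHA3_256(),  # the paper specifies SHA-3 [15]
        length=32,                    # L = 256 bits, an illustrative choice
        salt=sk_next,                 # Salt = SK from the DH ratchet
        info=other_info,              # OtherInfo = IK_k || IK_l || PN
    ).derive(mk_n)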

Algorithm 2. Cryptographic Ratchet

Procedure DoubleRatchet.1 (executed by Peer k)
Input: Root key RK from the EDH handshake; domain parameters of the EC; parameters of the HKDF (Hash, L, OtherInfo)
Output: Message key MK_{n+1} for Peer k
Steps:
1. Initialize the state for Peer k: MK_0 = RK.
2. Generate a fresh DH-ratchet key pair; send its public key to Peer l with Send.DH.Request() over a secure channel and wait for a response.
3. Peer k receives the Send.DH.Response() command with Peer l's ratchet public key; SK_{n+1} = DH.PK(·) over the exchanged keys.
4. MK_{n+1} = HKDF(MK_n, SK_{n+1}, Hash, L, OtherInfo).
5. Peer k encrypts RLC-PDUs with the key MK_{n+1} at stage n+1.

Procedure DoubleRatchet.2 (executed by Peer l)
Input: Root key RK from the EDH handshake; domain parameters of the EC; parameters of the HKDF; payload of the Send.DH.Request() command
Output: Message key MK_{n+1} for Peer l
Steps:
1. Initialize the state for Peer l: MK_0 = RK.
2. Peer l receives the Send.DH.Request() command with Peer k's ratchet public key.
3. Generate a fresh DH-ratchet key pair; send its public key to Peer k with Send.DH.Response() over a secure channel.
4. SK_{n+1} = DH.PK(·) over the exchanged keys; MK_{n+1} = HKDF(MK_n, SK_{n+1}, Hash, L, OtherInfo).
5. Peer l encrypts (or decrypts) RLC-PDUs with the key MK_{n+1} at stage n+1.

The refreshing of ratchet keys may be preset by vendors or indicated by applications. Note that if the CA bans a vehicle's certificate, the proposed security protocol cannot proceed. The vehicle then has to go through re-registration with the CA, with an appropriate penalty, to resolve the dispute.

3.3 Security Analysis

Here we present an analysis of the proposed security protocol.
An attacker may try to use a software or system exploit to extract the security keys. However, such keys cannot be acquired this way, since the security procedure is controlled by the RLC layer; if one tries to introduce any discrepancy into the RLC layer, the procedure will not commence. All the device keys are stored in a tamper-resistant module of the device. If one tries to break into it in order to obtain the keys, the device automatically sends a message to the CA and the device keys are registered as banned. Thus a compromised device loses its certificate from the CA.
An attacker can use a compromised device B (Peer k) to try to acquire the private messages of device A (Peer l). However, the security procedure on device A first checks the identity key IK_B of device B against its digital certificate and the CA repository, and will find that it is banned. Thus, such a compromised device B is not able to complete the handshake procedure and hence cannot procure private messages.
An attacker can try a Man-in-the-Middle (MitM) attack. A rogue device R may download all the (public) keys of device B and send them to device A in order to commence the security handshake. While device A generates the SSK as indicated in Algorithm 1 and sends an encrypted message in Send.DH.Request(), device R cannot recreate the SSK counterpart without having device B's private keys. Thus the MitM attack will not be successful.
An attacker may retrieve a session key from a communicated message. However, it can only obtain the current message; the previous message keys cannot be derived, because of the HKDF ratchet. As such, previous encrypted messages remain secure.

4 Conclusion

We propose a distributed security protocol for C-V2X over the PC5 interface at the RLC layer based on cryptographic ratchets, which provides authenticated encryption, integrity, privacy, and forward and backward secrecy. The proposal can be introduced on the fly as soon as vehicles enter a C-V2X group over the PC5 interface, regardless of the use case in 5G applications for C-V2X. All Elliptic-Curve (EC) keys are created without central control, and a simplified cryptographic ratchet algorithm streamlines implementation, allowing fast data transmission. As the security protocol runs at the RLC layer, the proposal, in combination with the randomization of MAC addresses [1], supports strong privacy of user identities and sensitive vehicle data without the need to use pseudonyms. The proposed security protocol supports a strong level of protection due to the authentication, integrity, confidentiality, privacy and protection against packet replay at the RLC layer of C-V2X applications over the PC5 interface. Furthermore, if an attacker compromises an encryption key, such an attack would not be able to derive previous or future keys as the algorithm goes on, because of the forward and backward secrecy properties.

Acknowledgments. This research is supported in part by NSF grant # 1827923 and the Juno2 project, NICT grant # 19304.

References

1. 3GPP TS 33.185 V15.0.0: Security aspect for LTE support of Vehicle-to-Everything (V2X) services (2018)
2. 3GPP TS 136 300 V15.8.0: Evolved Universal Terrestrial Radio Access (E-UTRA) and Evolved Universal Terrestrial Radio Access Network (E-UTRAN); Overall description (2020)
3. 3GPP TS 123 303 V15.1.0: Universal Mobile Telecommunications System (UMTS); LTE; Proximity-based services (ProSe); Stage 2 (2018)
4. IEEE 1609.0-2019: IEEE Guide for Wireless Access in Vehicular Environments (WAVE) Architecture (2019)
5. ETSI 303 613 V1.1.1: Intelligent Transport Systems (ITS); LTE-V2X Access layer specification for Intelligent Transport Systems operating in the 5 GHz frequency band (2020)
6. Borisov, N., Goldberg, I., Brewer, E.: Off-the-record communication, or, why not to use PGP. In: Workshop on Privacy in the Electronic Society (2004)

7. Perrin, T., Marlinspike, M.: Signal Protocol (2016). https://signal.org. Accessed 12 Feb 2019
8. 3GPP TS 23.285 V15.4.0: Architecture enhancements for V2X services (2020)
9. Luby, M.: Pseudorandomness and Cryptographic Applications. Princeton University Press, Princeton (1996)
10. Goldreich, O.: Foundations of Cryptography, vol. 1. Cambridge University Press. ISBN 0-521-79172-3
11. NIST Special Publication 800-56C: Recommendation for Key-Derivation Methods in Key-Establishment Schemes (2018)
12. Internet Research Task Force, RFC 8032: Edwards-Curve Digital Signature Algorithm (EdDSA) (2017)
13. Fan, L., et al.: Deniable authentication protocol based on Diffie-Hellman algorithm. Electron. Lett. 38, 705–706 (2002)
14. Kar, J., Majhi, B.: A novel deniable authentication protocol based on Diffie-Hellman algorithm using pairing technique. In: Proceedings of the ICCCS, pp. 493–498 (2011)
15. NIST SP 800-185: SHA-3 Derived Functions (2016)

Triangle Enumeration on Massive Graphs Using AWS Lambda Functions

Tengkai Yu, Venkatesh Srinivasan, and Alex Thomo

University of Victoria, Victoria, BC, Canada
{yutengkai,srinivas,thomo}@uvic.ca

Abstract. Triangle enumeration is a fundamental task in graph data analysis with many applications. Recently, Park et al. proposed a distributed algorithm, PTE (Pre-partitioned Triangle Enumeration), that, unlike previous works, scales well using multiple high-end machines and can handle very large real-world networks. This work presents a serverless implementation of the PTE algorithm using the AWS Lambda platform. Our experiments take advantage of the high concurrency of the Lambda instances to compete with the expensive server-based experiments of Park et al. Our analysis shows the trade-off between the time and cost of triangle enumeration and the number of tasks generated by the distributed algorithm. Our results reveal the importance of using a higher number of tasks in order to improve the efficiency of PTE. Such an analysis can only be performed using a large number of workers, which is indeed possible using AWS Lambda but not easy to achieve using a few servers as in the case of Park et al.

1 Introduction

Triangle enumeration is an essential task when analyzing large graphs. Due to the explosion of graph sizes in modern data science, it has become almost impossible to fit an entire graph into the main memory of a single machine. Although several methods have been developed to use a single machine's resources efficiently (EMNI [1], EM-CF [2], and MGT [3]), distributed algorithms show higher scalability and speed in experiments. The PTE algorithm of Park et al. [4] has stood out among the distributed algorithms due to its minimized shuffled data size. The algorithm separates the graph into subgraphs based on edge types; a higher number of tasks leads to smaller subgraph sizes. Park et al. implemented the PTE algorithm over 41 top-tier servers and showed promising experimental results. However, a server-based implementation limits the maximum number of tasks.
In this paper, we implement the PTE algorithm on the AWS Lambda platform. This highly scalable, on-demand computation service allows us to use 1000 concurrent instances at the same time, thus significantly increasing the maximum number of tasks allowed. This implementation turns out to be not only cost-effective but also achieves the same speed as the original PTE implementation. The experimental results show the sweet spot for the number of tasks when enumerating smaller graphs, which can guide the choice of the number of tasks for similarly sized data. More importantly, for more massive graphs like the Twitter network, our charts show how increasing the number of tasks can increase the speed significantly.

1.1 Our Contribution

1. We present a serverless implementation of the PTE triangle algorithm on the AWS Lambda platform, thus showing the possibility of efficiently enumerating triangles on the inexpensive AWS Lambda framework. Namely, we are able to enumerate more than 34 billion triangles for twitter-2010 in less than 2 min, spending less than $4. This is in contrast to the original paper, which achieved a similar run time but used an expensive, high-end, multi-server environment.
2. We present a detailed trade-off of the time and cost of enumerating triangles versus the number of tasks used. This analysis was not available in the original PTE paper. We show that the running time decreases at a quadratic rate while the monetary cost increases at a lesser rate. Also, we determine the sweet spot for the number of tasks, after which an increase in this number is not beneficial.
3. We also present a more detailed investigation of the scalability of the PTE algorithm, enabled by the large number of Lambda instances we can inexpensively spawn in AWS Lambda. This analysis was not available in the original PTE paper. Furthermore, our results reflect a careful design in terms of the language and frameworks chosen in order to be able to scale to large datasets.
4. We show the benefit of using a higher number of tasks for boosting the enumeration speed. This is hard to evaluate in a server-based system. However, in a serverless system like the AWS Lambda platform, we always have the flexibility to use more tasks when we are not satisfied with the current enumeration speed.

2 Related Work

Before the PTE algorithm, there were a couple of approaches that tried to enumerate triangles in massive graphs. Their implementations run on either a single machine or distributed clusters. However, due to the considerable memory or disk cost of this problem, these implementations failed under certain extreme conditions. On the other hand, even when they are capable of completing, these approaches are more costly complexity-wise compared to the PTE algorithm.

2.1 Distributed-Memory Triangle Algorithms

Many distributed algorithms copy parts of the graph and send them to distributed machines so that they can achieve parallel and exact enumeration. The PATRIC algorithm [5] separates the original set of vertices into p disjoint subsets of core nodes, where p is the number of distributed machines. This algorithm then sends subgraphs, each one containing a set of core vertices along with all the neighbors of these core vertices, to the distributed machines, and enumerates the triangles in these subgraphs at the same time. On the other hand, the PDTL algorithm [6] implements a multi-core version of MGT. We recall that the MGT algorithm [3] is an I/O-efficient algorithm using external memory. So, in the PDTL algorithm, every machine receives a full copy of the graph in its external memory.
The disadvantages of these two approaches are undeniable. PATRIC requires the separated graphs to fit into the main memory of every machine, while the division can be redundant: nodes with high degrees are replicated in too many machines, thus challenging the total memory size. The PDTL algorithm suffers from the same problem as the MGT algorithm: the entire graph can be much larger than the capacity of the external memory of every distributed machine.

2.2 Map-Reduce Algorithms and Shuffled Data

The critical difference between Map-Reduce algorithms and the distributed algorithms from the previous subsection is that Map-Reduce uses keys to group the data when shuffling. This helps the machines retrieve and combine data when stable storage is available, thus fitting cloud platforms best. In this case, the challenging factor is no longer the original graph, but the total amount of shuffled data.
Cohen [7] uses a pair of nodes (u, v) as the key, and the shuffled data contains all 1-step or 2-step paths between u and v [7]. As a result, every edge can be replicated up to n − 1 times, which leads to O(|E|^{3/2}) of shuffled data. The Graph Partition (GP) algorithm [8] uses the nodes' hashed value triples as the key, which reduces the quantity of shuffled data to O(|E|^{3/2}/√M), where M is the total data space of all machines. However, every triangle is still reported more than once, and a weight function is needed to count the correct value. The TTP algorithm [9] sorts the edge list, so every triangle with more than one hash value is enumerated only once. The CTTP algorithm [10] requires R = O(|E|^{3/2}/(M√m)) rounds to ensure enough space for processing every round, where m is the memory size of a single reducer. However, TTP and CTTP need O(|E|^{3/2}/√M) of shuffled data. This drawback leads us to the PTE algorithms [4], which generate only O(|E|) shuffled data and will be discussed in more detail in Sect. 4.

3 Background

3.1 Preprocessing

There are two reasons for us to preprocess the data before the triangle enumeration. The first one is to ensure counting accuracy. We have to remove any self-loop over a node u, since it can form a triangle (u, u, u). Also, for any triangle with nodes (u, v, w), we wish to count it exactly once. For example, given directed edges e1 = (u, v), e2 = (v, u), e3 = (v, w) and e4 = (u, w), they can form two triangles over the nodes (u, v, w), with edge sets (e1, e3, e4) and (e2, e3, e4). This fact requires us to remove all self-loops and keep only one edge between any pair of connected nodes. As a solution, we remove all self-loops, symmetrize all edges between connected node pairs, and then remove one edge from every symmetric edge pair.
Now, for any symmetric edge pair, which edge are we supposed to remove? This question leads to our second reason: we wish to increase our enumeration speed when using set intersections. In the PTE algorithm, for every edge (u, v), we intersect the outgoing edge sets of nodes u and v and add the length of the intersection set to the counting result. If there exists a node with a very high out-degree, this node is involved in a high volume of intersections, and each intersection computation takes longer than intersecting the out-neighbours of two low out-degree nodes. Please note that, although the original PTE paper loops over the intersection and applies if-conditions on every node, which we modify in our work by using separate type-1 workers, the time cost is affected by the out-degree of all nodes in the same way. As the solution, for any two nodes u and v, we denote u ≺ v if d(u) < d(v), or d(u) = d(v) and id(u) < id(v), and we only keep the edge (u, v) if u ≺ v. This prevents the existence of nodes whose out-degrees are much higher than others, thus reducing the enumeration time significantly; a minimal sketch of this step is given below.
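The following sketch illustrates this preprocessing on an edge list of integer node ids; the representation (a plain list of pairs and total degrees d(·)) is our illustrative assumption, not the paper's actual data layout.

from collections import defaultdict

def preprocess(edges):
    # Drop self-loops and symmetric duplicates: one undirected edge per node pair.
    undirected = {(min(u, v), max(u, v)) for u, v in edges if u != v}
    degree = defaultdict(int)
    for u, v in undirected:
        degree[u] += 1
        degree[v] += 1
    # u precedes v if d(u) < d(v), ties broken by node id.
    def precedes(u, v):
        return (degree[u], u) < (degree[v], v)
    # Orient every kept edge from the "smaller" to the "larger" endpoint.
    return [(u, v) if precedes(u, v) else (v, u) for u, v in undirected]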

3.2 Amazon Web Services

3.2.1 AWS Lambda
AWS Lambda is a serverless computing platform provided by Amazon Web Services [11]. Unlike EC2 servers, another computing product from AWS, Lambda provides higher scalability and convenience of implementation. Scientists do not need to calculate the proper server size to rent, which they have to do when using EC2; they only need to judge the number of function instances required by the job. Due to the on-demand nature of Lambda function instances, the Lambda platform makes it more convenient to achieve the most resource-efficient and economical experiments. Each function can configure the memory space (from 128 MB to 3008 MB) and the maximum running time (3 s to 900 s) of its instances [12]. The temporary disk size for any function instance is 512 MB. However, the AWS team does not reveal the exact computation power allocated to the functions, only informing customers that the CPU power increases linearly with the configured memory. Although the documentation shows that one full vCPU is granted when the configured memory is 1792 MB, we could not find the exact CPU power for our experiment [12].

Why are Lambda functions more economical than renting EC2 servers? EC2 servers, under the AWS service terms [13], are rented by the hour. If we wished to rent 41 machines as described in the original PTE paper, we could choose the machine type t3a.2xlarge. This machine costs $0.3008 per hour, and thus 41 of these machines would cost more than 12 dollars per hour. This is more than our experiments cost for four graphs and five different numbers of colors. Also, Lambda's low cost per function makes early implementation and debugging inexpensive.

3.2.2 S3 Buckets
S3 (Simple Storage Service) buckets are where we store the graphs and the subgraphs' enumeration results. With a valid IAM role, a Lambda function instance can search, get, and put a target file in an S3 bucket efficiently. This feature overcomes the difficulty of emitting data on a shared-disk machine. Another benefit of S3 buckets is that they can provide a massive amount of data storage space at low cost. A local machine's disk can fill up quickly when partitioning the graphs into many subgraphs, while we do not need to worry about this when using S3 buckets.

4 PTE Algorithm

The key idea of the PTE (Pre-partitioned Triangle Enumeration) algorithm is to partition vertices by assigning ρ colors to them before the enumeration starts. There are three versions of the PTE algorithm in [4]: PTE_BASE, PTE_CD and PTE_SC. PTE_BASE is the foundation of all of them, and it is the one we focus on in this paper to illustrate the benefits of using AWS Lambda over a set of high-end servers.
The algorithm first assigns these ρ colors to all vertices uniformly at random using a function ξ. Since each edge has two endpoint nodes, the edges are separated into ρ + C(ρ, 2) = ρ(ρ + 1)/2 subsets by the algorithm. This coloring process uses a simple hash function to achieve a uniform distribution, and its time cost is linear in the size of the edge set. Thus, there are 3 types of triangles: type-1 triangles have all three nodes of the same color, type-2 triangles contain 2 different colors, and type-3 triangles have three nodes of distinct colors. The type-1 triangles are the easiest to count: a single edge-subset file E_ii is enough to count all type-1 triangles of color i. On the other hand, all type-2 and type-3 triangles require combining 3 color subsets to enumerate. We can use the same procedure to implement the function enumerating these two types of triangles, and the total number of function instances is C(ρ, 2) + C(ρ, 3).
The PTE algorithm achieves a significant decrease in the amount of shuffled data. Any edge set E_ij or E_ii is copied only ρ times, since there are only ρ choices for the third node. This helps the PTE algorithm to stand out against other distributed algorithms.
Note that, when we enumerate type-2 triangles of colors i and j, we also enumerate two kinds of type-1 triangles: type-1 triangles of color i or j.

This overlapping implies that all type-1 triangles are counted (ρ − 1) times by enumerating type-2 triangles. Thus the final count of triangles should be

Σ_{i,j∈[ρ]} enum(i, j) + Σ_{i,j,k∈[ρ]} enum(i, j, k) − (ρ − 2) · Σ_{i∈[ρ]} enum(i)

This formula is an additional contribution of this work. In [4], type-1 triangles are enumerated as part of type-2 triangles, which increases the workload of the type-2 workers. Our formula above allows the type-1 workers to be separated from the type-2 and type-3 workers and hence reduces the total running time of the algorithm. A sketch of the coloring step and of this combination formula follows.
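The sketch below shows the hash-based coloring, the partition of the edge list into the ρ(ρ + 1)/2 subsets, and the final combination formula. Integer node ids and the modular hash ξ are illustrative assumptions.

def xi(u, rho):
    # Simple hash-based coloring of a node id (illustrative choice).
    return u % rho

def partition(edges, rho):
    # Map each edge to the subset E_{c1,c2} keyed by its (sorted) endpoint
    # colors; this yields rho + C(rho, 2) = rho * (rho + 1) / 2 subsets.
    subsets = {}
    for u, v in edges:
        key = tuple(sorted((xi(u, rho), xi(v, rho))))
        subsets.setdefault(key, []).append((u, v))
    return subsets

def total_triangles(rho, enum1, enum2, enum3):
    # enum1[i], enum2[(i, j)], enum3[(i, j, k)]: per-subproblem triangle counts.
    return (sum(enum2.values()) + sum(enum3.values())
            - (rho - 2) * sum(enum1.values()))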

5 Experiments

5.1 Data Collection

We use four datasets in our experiments, downloaded from the WebGraph site and based on real-world internet data [14, 15]. uk-2005 and indochina-2004 are internet crawls of two regions; their nodes represent URLs, and edges are directed links from one URL to another. On the other hand, ljournal-2008 and twitter-2010 are from the social network applications LiveJournal and Twitter; their nodes are users, while every edge indicates that one user is following another. The sizes of the graphs are presented in Table 1. As described previously, we pre-processed and partitioned these graphs, then uploaded them to the S3 buckets. Using the compression framework of WebGraph facilitated this process significantly; furthermore, being able to run the algorithm on compressed partitions of the graphs in the Lambda functions made their footprint small enough to easily satisfy the AWS Lambda requirements. Notably, some other works that have successfully used WebGraph for scaling various algorithms to big graphs are [16–21].

Table 1. The summary of datasets

Dataset          Vertices    Edges          Triangles
ljournal-2008    5,363,260   79,023,142     411,155,444
indochina-2004   7,414,866   194,109,311    60,115,561,372
uk-2005          39,459,925  936,364,282    21,779,366,056
twitter-2010     41,652,230  1,468,365,182  34,824,916,864

5.2 Setup

We used two types of Lambda functions in our implementation. The Type-23 function is responsible for enumerating all the type-2 and type-3 triangles; its job includes getting 3 subgraphs from the S3 bucket, combining them, and enumerating the combination. On the other hand, the Type-1 function only gets one file from the S3 bucket and enumerates this relatively smaller subgraph. This difference in workloads leads to a significant running-time difference. As we mentioned in the previous section, the original PTE paper counted the type-1 triangles as part of the type-2 enumeration, so they set up a condition to ensure all type-1 triangles are counted only once. We take advantage of the fact that type-1 enumerations cost much less time than the other two types, according to Figs. 1 and 2; thus, we can call more tasks in parallel to reduce the total time cost. Every instance of either function can have 3,008 MB of memory, 512 MB of temporary disk space, and a maximum of 15 min of running time. Unfortunately, we cannot provide information about the CPU since it is hidden from customers, as we explained before. AWS Lambda allows a maximum of 1000 parallel instances across both functions.

Fig. 1. Average running time (type 2 and 3)

Fig. 2. Average running time (type 1)

We created a folder "graphs" inside an S3 bucket containing all the graphs. All subgraphs partitioned from the same original graph belong to the same subfolder under the "graphs" folder ("graphs/twitter-2010", for example). This subfolder then contains sub-subfolders for the different numbers of color partitions. This structure makes sure that all subgraphs of a single experiment share the same prefix, and S3 buckets limit the number of requests to all files under the same prefix to 3,500. (Please note that this version of the "enumerateTriangles" function is much simpler than the one in the original PTE paper. Since we are fixing the number of type-1 triangles in the final result by subtraction, we no longer need the if-conditions enforcing non-duplicate type-1 triangle enumeration.)

Algorithm 1: Triangle Enumeration on AWS Lambda

Data: problem = (i) or (i, j) or (i, j, k)
initialize E′
if problem is type (i) then
    retrieve E_{i,i} from the S3 bucket
    E′ = E_{i,i}
else if problem is type (i, j) then
    retrieve E_{i,i}, E_{i,j}, E_{j,j} from the S3 bucket
    E′ = E_{i,i} ∪ E_{i,j} ∪ E_{j,j}
else  // problem is type (i, j, k)
    retrieve E_{i,k}, E_{i,j}, E_{j,k} from the S3 bucket
    E′ = E_{i,k} ∪ E_{i,j} ∪ E_{j,k}
end
enumerateTriangles(E′)
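The paper does not show the driver that spawns these subproblems; the following is a minimal sketch of how the fan-out could look with boto3, assuming the algorithm above is deployed as a Lambda function. The function name "pte-worker" and the payload format are hypothetical.

import json
import boto3

lam = boto3.client("lambda")

def dispatch(rho):
    # One payload per subproblem: (i), (i, j) and (i, j, k) color combinations.
    problems = [{"colors": [i]} for i in range(rho)]
    problems += [{"colors": [i, j]}
                 for i in range(rho) for j in range(i + 1, rho)]
    problems += [{"colors": [i, j, k]}
                 for i in range(rho) for j in range(i + 1, rho)
                 for k in range(j + 1, rho)]
    for p in problems:
        lam.invoke(FunctionName="pte-worker",  # hypothetical function name
                   InvocationType="Event",     # asynchronous invocation
                   Payload=json.dumps(p))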

5.3 Experimental Results

5.3.1 Number of Colors
We started at 3 colors (1 or 2 colors are not sufficient for separating the graph) and chose steps of size 3 for increasing the colors. However, it turns out that only ljournal-2008 is enumerable with 3 colors; indochina-2004 and uk-2005 exceeded the Lambda function time limit (15 min) when using 3 colors. Thus our visualization starts at 6 colors. Also, twitter-2010 is not enumerable with only 6 colors, so its line starts at 9 colors. Keep in mind that AWS Lambda only allows 1000 instances running at the same time, and the type-2 and type-3 function instances sum up to C(ρ, 2) + C(ρ, 3) instances. Evaluating this formula shows that, in order not to exceed the 1000 Lambda instances we can start, the largest number of colors we can use is 18 (969 instances).

Fig. 3. Total running period for subgraph enumeration

Fig. 4. Summation of all workers running time

5.3.2 Total Running Time
The total running time is the time from the start to the end of the enumeration. Figure 3 shows a significant decrease in running time when the number of colors increases. We make an interesting observation: when the number of colors is 18, the time required to enumerate twitter-2010 is only 117.865 s, which is very close to the Twitter enumeration time in the PTE paper. We note that these two Twitter graphs are slightly different: the twitter-2010 dataset has 200M more edges than the Twitter graph used by the PTE paper. This means that we can achieve the same running time with a lower budget.
We can also locate the sweet spot quickly for every graph. All graphs except twitter-2010 have their sweet spot at 9 colors. On the other hand, the running time on twitter-2010 decreases well beyond 9 colors. This implies that, in an ideal implementation, a greater number of colors could achieve an even shorter time for its enumeration.

5.3.3 Time Summation from All Workers
Figure 4 shows the sum of the running times of all workers, which is directly related to the money charged by the AWS Lambda platform. As the number of colors grows, this sum increases quadratically. However, its rate of increase is slower than the rate at which the total running time decreases. The most expensive experiment is enumerating twitter-2010 with 18 colors. It contains 969 Type-23 function instances and 18 Type-1 instances, and the summation of time is 79,620 s in total. Given the charge rate of AWS Lambda, this experiment costs only $3.898 per run.
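This figure can be reproduced from the configured memory and the summed running time; the sketch below assumes the standard Lambda compute rate of $0.0000166667 per GB-second and ignores per-request charges.

# 969 + 18 instances at 3,008 MB, 79,620 s of total compute time.
total_seconds = 79_620
memory_gb = 3008 / 1024                 # configured memory in GB
gb_seconds = total_seconds * memory_gb  # ≈ 233,884 GB-s
cost = gb_seconds * 0.0000166667        # assumed rate per GB-second
print(f"${cost:.2f}")                   # ≈ $3.90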

5.4 Insights

When the graph fits into the AWS Lambda platform, the speed of enumeration is very close to, or maybe even faster than, the one from [4]. We achieve this at a much lower cost: they used 41 machines fully equipped with Intel Xeon E3-1230v3 CPUs and 32 GB RAM [4]. Our result highlights the usefulness of on-demand services. A data scientist can turn to AWS Lambda instead of purchasing or renting an expensive set of machines, to either enumerate graphs fitting inside the platform or run other algorithms that can separate the graph correctly into different instances.
Now, assuming scientists choose to enumerate triangles using AWS Lambda, what is the relationship between time and money? If you want your program to run faster, you need to pay more money. This insight is straightforward from Fig. 4. As the number of colors grows, the real-world time cost decreases quadratically. However, the sum of the workers' time also increases quadratically, albeit at a lower rate. Since AWS Lambda charges mainly for the time cost of functions in this application, higher speed implies higher cost. Space in S3 buckets, on the other hand, does not cost that much.
This leads us to the following question: is a higher number of colors always a better choice? In other words, given the 1000-concurrent-instances limit, should an experiment always choose the number of colors to be 18? Fortunately, the answer is no. Remember, the total-time charts contain the sweet spots for three graphs. These sweet spots imply that, for any graph with a size smaller than or equal to uk-2005, 9 colors are enough efficiency-wise. A higher number of colors cannot increase the speed of enumeration significantly for these graphs but can cost much more. More experiments are needed to list sweet spots for different graph sizes, but this insight can help us choose the workload for this on-demand service wisely.
Is it disappointing that we failed to find the sweet spot for twitter-2010? On the contrary, this is our most important insight overall. Our chart shows that the total time decreases even when the number of colors is 18. This fact reveals the contribution of a higher number of colors, which the original PTE paper did not consider. Park et al. only had 41 machines, and only 3 cores per machine were used in their experiments. If they use one machine as the master, there are only 120 workers for the enumeration job, where the number 120 equals C(9, 2) + C(9, 3). This calculation implies that their concurrent number of colors cannot be higher than 9. However, since the total time cost still decreases fast with 18 colors, it motivates the use of a higher number of colors in future experiments.

6 Conclusions and Future Work

In this paper, we implemented the PTE algorithm using the AWS Lambda platform and ran experiments over large graphs like twitter-2010. The PTE algorithm partitions massive graphs efficiently and provides high scalability when enumerating triangles; it minimizes the shuffled data to increase the capability of the distributed system. We used the AWS Lambda platform to offer an economical implementation of the PTE algorithm and achieved accurate, high-speed enumeration.
Our experiments proved the strength of the AWS Lambda platform for running distributed enumeration algorithms. None of the Lambda instances can hold the entire graph inside its space (memory or the tmp disk), but they can achieve the correct count if we properly separate the graph. High concurrency helped us to achieve scalability when running the experiments. We could locate the sweet spot for the graphs shown in the previous section and choose the appropriate number of colors according to the size of any new graph. Our experiments showed that for any graph bigger than twitter-2010, it is helpful to use the highest number of colors possible.
What's Next? In [4], the authors used 41 high-tier machines to enumerate triangles in much larger graphs like ClueWeb12, which has a size of 56G and contains more than 3 trillion triangles. This workload certainly exceeds the AWS Lambda limit for personal usage. We are looking at ways to fit this monster-size graph into AWS Lambda with more sophisticated algorithmic engineering, if possible.
Future Work. We would like to extend distributed triangle enumeration to directed graphs. In those graphs we do not talk about triangles but "triads" (cf. [22]). There are 7 types of triads in directed graphs, and extending distributed enumeration to them might prove to be challenging.
We would also like to extend our work to the distributed enumeration of four-node graphlets [23]. This is a more challenging problem, but it is based on forming triangles and wedges (open triangles) first and then extending them to four-node graphlets. As such, we believe the techniques outlined in this paper could prove useful for the enumeration of four-node graphlets as well.
One of the main applications of triangle enumeration is computing the truss decomposition. In this problem we need to compute the number of triangles supporting each edge of the graph (cf. [24, 25]). Extending distributed triangle enumeration to an algorithm for distributed truss decomposition is another avenue for our future research.

References

1. Dementiev, R.: Algorithm engineering for large data sets. Ph.D. dissertation (2006)
2. Menegola, B.: An external memory algorithm for listing triangles (2010)
3. Hu, X., Tao, Y., Chung, C.-W.: Massive graph triangulation. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pp. 325–336 (2013)
4. Park, H.-M., Myaeng, S.-H., Kang, U.: PTE: enumerating trillion triangles on distributed systems. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1115–1124 (2016)
5. Arifuzzaman, S., Khan, M., Marathe, M.: PATRIC: a parallel algorithm for counting triangles in massive networks. In: Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, pp. 529–538 (2013)
6. Giechaskiel, I., Panagopoulos, G., Yoneki, E.: PDTL: parallel and distributed triangle listing for massive graphs. In: 2015 44th International Conference on Parallel Processing, pp. 370–379. IEEE (2015)
7. Cohen, J.: Graph twiddling in a mapreduce world. Comput. Sci. Eng. 11(4), 29–41 (2009)
8. Suri, S., Vassilvitskii, S.: Counting triangles and the curse of the last reducer. In: Proceedings of the 20th International Conference on World Wide Web, pp. 607–614 (2011)
9. Park, H.-M., Chung, C.-W.: An efficient mapreduce algorithm for counting triangles in a very large graph. In: Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, pp. 539–548 (2013)
10. Park, H.-M., Silvestri, F., Kang, U., Pagh, R.: Mapreduce triangle enumeration with guarantees. In: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, pp. 1739–1748 (2014)
11. Wikipedia contributors: AWS Lambda—Wikipedia, the free encyclopedia (2020). https://en.wikipedia.org/w/index.php?title=AWS_Lambda. Accessed 10 Apr 2020
12. Amazon Web Services: Configuring functions in the AWS Lambda console (2020). https://docs.aws.amazon.com/lambda/latest/dg/configuration-console.html
13. Amazon Web Services: Amazon EC2 pricing (2020). https://aws.amazon.com/ec2/pricing/on-demand/

14. Boldi, P., Vigna, S.: The WebGraph framework I: compression techniques. In: Proceedings of the Thirteenth International World Wide Web Conference (WWW 2004), Manhattan, USA, pp. 595–601. ACM Press (2004)
15. Boldi, P., Rosa, M., Santini, M., Vigna, S.: Layered label propagation: a multiresolution coordinate-free ordering for compressing social networks. In: Srinivasan, S., Ramamritham, K., Kumar, A., Ravindra, M.P., Bertino, E., Kumar, R. (eds.) Proceedings of the 20th International Conference on World Wide Web, pp. 587–596. ACM Press (2011)
16. Chen, S., Wei, R., Popova, D., Thomo, A.: Efficient computation of importance based communities in web-scale networks using a single machine. In: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pp. 1553–1562. ACM (2016)
17. Esfahani, F., Srinivasan, V., Thomo, A., Wu, K.: Efficient computation of probabilistic core decomposition at web-scale. In: Advances in Database Technology - EDBT 2019, 22nd International Conference on Extending Database Technology, pp. 325–336 (2019)
18. Khaouid, W., Barsky, M., Srinivasan, V., Thomo, A.: K-core decomposition of large networks on a single PC. Proc. VLDB Endow. 9(1), 13–23 (2015)
19. Popova, D., Ohsaka, N., Kawarabayashi, K., Thomo, A.: NoSingles: a space-efficient algorithm for influence maximization. In: Proceedings of the 30th International Conference on Scientific and Statistical Database Management, p. 18. ACM (2018)
20. Simpson, M., Srinivasan, V., Thomo, A.: Clearing contamination in large networks. IEEE Trans. Knowl. Data Eng. 28(6), 1435–1448 (2016)
21. Simpson, M., Srinivasan, V., Thomo, A.: Efficient computation of feedback arc set at web-scale. Proc. VLDB Endow. 10(3), 133–144 (2016)
22. Santoso, Y., Thomo, A., Srinivasan, V., Chester, S.: Triad enumeration at trillion-scale using a single commodity machine. In: Advances in Database Technology - EDBT 2019, 22nd International Conference on Extending Database Technology. OpenProceedings.org (2019)
23. Santoso, Y., Srinivasan, V., Thomo, A.: Efficient enumeration of four node graphlets at trillion-scale. In: Advances in Database Technology - EDBT 2020, 23rd International Conference on Extending Database Technology, pp. 439–442 (2020)
24. Esfahani, F., Wu, J., Srinivasan, V., Thomo, A., Wu, K.: Fast truss decomposition in large-scale probabilistic graphs. In: Advances in Database Technology - EDBT 2019, 22nd International Conference on Extending Database Technology, pp. 722–725 (2019)
25. Wu, J., Goshulak, A., Srinivasan, V., Thomo, A.: K-truss decomposition of large networks on a single consumer-grade machine. In: 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 873–880. IEEE (2018)

C'Meal! the ChatBot for Food Information

Alessandra Amato1 and Giovanni Cozzolino2

1 University of Napoli Federico II, Naples, Italy
[email protected]
2 DIETI - University of Napoli Federico II, via Claudio 21, Naples, Italy
[email protected]

Abstract. Conversational systems are enjoying growing success among users thanks to their ability to collect and rank users' preferences and provide them with relevant information in a simple way. In this paper we present C'Meal, a chatbot-based conversational framework which, given some ingredients and other requests entered by the user as input, retrieves the most appropriate recipe. The program addresses two categories of people: those who want to discover new recipes, and those who want to create a recipe with the few ingredients left in the fridge, to reduce food waste. In both cases, the user must formulate the request by entering the desired ingredients and other specific requests, such as a meal type or method of cooking.

1 Introduction

In this paper we present C'Meal [1], a conversational system [2,3] based on a chatbot, which retrieves the most relevant recipe given some ingredients and other requests entered by the user as input. The program addresses two categories of people: those who want to discover new recipes, and those who want to create a recipe with the few ingredients left in the fridge, in order to reduce food waste. In both cases the user has to formulate the request by entering the desired ingredients and other particular requests, such as a meal type or cooking method. The program was entirely designed in Python 3. It performs text manipulation in two different phases: the first phase manages the text belonging to the knowledge base with the aid of some libraries for semantic text processing; the second phase concerns the interaction with a chatbot [4,5] which receives user requests, performs the search, and outputs the recipe.


2 Creation of the Knowledge Base

The first action was to create a recipes' knowledge base. To this end, a folder named Recipes was created, which contains many text files, each of which represents a recipe. All the recipes included in the knowledge base come from the site AllRecipes1. All the recipes have the following layout:

Apple Cheesecake (Dessert)
INGREDIENTS:
1 cup graham cracker crumbs
1/2 cup finely chopped pecans
3 tablespoons white sugar
1/2 teaspoon ground cinnamon
1/4 cup unsalted butter, melted
2 (8 ounce) packages cream cheese, softened
1/2 cup white sugar
2 eggs
1/2 teaspoon vanilla extract
4 cups apples - peeled, cored and thinly sliced
1/5 cup white sugar
1/2 teaspoon ground cinnamon
1/4 cup chopped pecans
DIRECTIONS:
Preheat oven to 350 degrees F (175 degrees C). Grease a 9x13 inch baking pan. Set aside.
In a large bowl, mix together sugar, oil, eggs, vanilla, and buttermilk. Stir in carrots, coconut, vanilla, and pineapple.
In a separate bowl, combine flour, baking soda, cinnamon, and salt; gently stir into carrot mixture. Stir in chopped nuts. Spread batter into prepared pan.
Bake for 55 minutes or until toothpick inserted into cake comes out clean. Remove from oven, and set aside to cool.
In a medium mixing bowl, combine butter or margarine, cream cheese, vanilla, and confectioners sugar. Blend until creamy. Frost cake while still in the pan.

1 https://www.allrecipes.com/.

3 Semantic Text Processing

Semantic text processing represents the fulcrum of the whole program, inasmuch as it is crucial for the creation of a search engine. This processing makes it possible for the computer to extract information and knowledge from a text and simulate human behavior. To this end, the following libraries have been used:

• nltk, for tokenization, lemmatization and normalization of the text2;
• string, for manipulating strings3;
• os, for file handling.

The first step was to create a communication between the knowledge base and the Python IDE; to do this, a list was created in which each element contains a single recipe:
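The paper's original listing is not reproduced here; the following is a minimal sketch, assuming the folder name Recipes described above and a hypothetical variable recipes:

import os

# Read every recipe file in the Recipes folder into a list,
# one element per recipe.
recipes = []
for filename in os.listdir('Recipes'):
    with open(os.path.join('Recipes', filename), encoding='utf-8') as f:
        recipes.append(f.read())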

Subsequently, the semantic text processing begins. The first manipulation is the creation of the tokens, removing all the stopwords from the text; in this way the program sees the knowledge base as a list of words:
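A minimal sketch of this step, continuing the previous one and assuming nltk's tokenizer and English stopword list:

import nltk
from nltk.corpus import stopwords

nltk.download('punkt')
nltk.download('stopwords')

stop_words = set(stopwords.words('english'))
# Tokenize each recipe and drop the stopwords.
tokens = [w for recipe in recipes
          for w in nltk.word_tokenize(recipe)
          if w.lower() not in stop_words]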

The second step was the lemmatization of the words; in this way all the words that belong to the WordNet categories have been classified and reduced to their base form, while the remaining ones have been left unchanged:
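A sketch of the lemmatization step with nltk's WordNet lemmatizer (the variable names continue the assumed sketch above):

import nltk
from nltk.stem import WordNetLemmatizer

nltk.download('wordnet')
lemmatizer = WordNetLemmatizer()
# Words known to WordNet are carried to their base form;
# unknown words are returned unchanged by the lemmatizer.
lemmas = [lemmatizer.lemmatize(token) for token in tokens]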

2 Nltk Library: https://www.nltk.org/ - https://pypi.org/project/nltk/.
3 Sklearn Library: https://scikit-learn.org/ - https://pypi.org/project/sklearn/.


The last step of the manipulation of the knowledge base was the normalization: it consists in removing the punctuation from the text, converting uppercase letters to lowercase, eliminating non-meaningful words, and finally carrying out the POS tagging of the remaining words:
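A sketch of the normalization step, assuming the string library for punctuation handling and nltk for the POS tagging (the length filter is an assumption):

import string
import nltk

nltk.download('averaged_perceptron_tagger')

# Remove punctuation, lowercase, and drop very short tokens.
table = str.maketrans('', '', string.punctuation)
normalized = [w.translate(table).lower() for w in lemmas]
normalized = [w for w in normalized if len(w) > 2]

# Part-of-speech tagging of the normalized words.
tagged = nltk.pos_tag(normalized)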

4 Chatbot Algorithm Implementation

The next step was the implementation of the chatbot algorithm, which communicates with the user, giving him as an answer the recipe that comes closest to his requests. Personalised responses exploit data gathered from the user's interaction with the system or social networks [6–8] to improve customer satisfaction [9]. First, the user's input is taken and added to the knowledge base of the chatbot; this new knowledge base is structured into a matrix, in which each row represents a phrase and each column represents a lemma. The user's response is placed as the last row of this matrix.

Subsequently, the cosine similarity coefficient between the entire knowledge base and the user's response is calculated; the sentences are then sorted by increasing similarity, so that the sentence which corresponds to the highest coefficient is placed in position [0][−2]. The matrix is then flattened into an array, and the sorting by cosine similarity is repeated. At this point it is possible to compare the given response to the knowledge base.
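A minimal sketch of this retrieval step, using scikit-learn's TF-IDF vectorizer and cosine similarity (the paper does not name the exact vectorizer, so that choice is an assumption):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def best_recipe(user_request, recipes):
    # The user's request is appended as the last row of the matrix.
    corpus = recipes + [user_request]
    tfidf = TfidfVectorizer().fit_transform(corpus)
    # Similarity of the last row (the request) to every recipe.
    sims = cosine_similarity(tfidf[-1], tfidf[:-1]).flatten()
    return recipes[sims.argmax()]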

At the end of this process, a simple interface was implemented, which begins to communicate with the user. It asks the user which ingredients he would like to use and whether he has other requests, searches for the most relevant recipe in its database, and outputs it to the user; after this first search, the user can make another one or leave the program by giving a negative response or by inserting a word contained in the exit codes.

5 Conclusions

Thanks to their ability to collect and rank user preferences and provide them with relevant information in a simple way, conversational systems are increasing their success among users. In this paper we presented C'Meal, a chatbot-based conversational system that looks for the most suitable recipe, given some ingredients and other requests entered by the user as input. The program addresses the desire to discover new recipes and the necessity to reduce food waste. The presented framework addresses the aforementioned issue by exploiting semantic and Natural Language Processing techniques. These techniques can be enhanced with the exploitation of semantic technologies [10–12], which can bring the correlation process to a higher level, thanks to the inferential engine processing.

Acknowledgement. This paper has been produced with the financial support of the Project financed by Campania Region of Italy 'REMIAM - Rete Musei intelligenti ad avanzata Multimedialità'. CUP B63D18000360007.


References

1. Maresca, A., Salatiello, M., Schettino, P.: C'Meal! the chatbot for food information (2020)
2. Montenegro, J.L.Z., da Costa, C.A., da Rosa Righi, R.: Survey of conversational agents in health. Expert Syst. Appl. 129, 56–67 (2019)
3. Nadarzynski, T., Miles, O., Cowie, A., Ridge, D.: Acceptability of artificial intelligence (AI)-led chatbot services in healthcare: a mixed-methods study. Digit. Health 5, 2055207619871808 (2019)
4. Pereira, J., Díaz, Ó.: Using health chatbots for behavior change: a mapping study. J. Med. Syst. 43(5), 135 (2019)
5. Denecke, K., Hochreutener, S.L., Pöpel, A., May, R.: Talking to Ana: a mobile self-anamnesis application with conversational user interface. In: Proceedings of the 2018 International Conference on Digital Health, pp. 85–89 (2018)
6. Amato, F., Cozzolino, G., Mazzeo, A., Romano, S.: Detecting anomalies in Twitter stream for public security issues. In: 2016 IEEE 2nd International Forum on Research and Technologies for Society and Industry Leveraging a better tomorrow (RTSI), pp. 1–4. IEEE (2016)
7. Amato, A., Balzano, W., Cozzolino, G., Moscato, F.: Analysis of consumers perceptions of food safety risk in social networks. In: International Conference on Advanced Information Networking and Applications, pp. 1217–1227. Springer (2019)
8. Amato, A., Cozzolino, G.: Trust analysis for information concerning food-related risks. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 344–354. Springer (2019)
9. Amato, F., Cozzolino, G., Moscato, V., Picariello, A., Sperlì, G.: Automatic personalization of visiting path based on users behaviour. In: 2017 31st International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 692–697. IEEE (2017)
10. Cozzolino, G.: Using semantic tools to represent data extracted from mobile devices. In: 2018 IEEE International Conference on Information Reuse and Integration (IRI), pp. 530–536. IEEE (2018)
11. Amato, F., Cozzolino, G., Moscato, V., Moscato, F.: Analyse digital forensic evidences through a semantic-based methodology and NLP techniques. Future Gener. Comput. Syst. 98, 297–307 (2019)
12. Amato, F., Cozzolino, G., Mazzeo, A., Moscato, F.: An application of semantic techniques for forensic analysis. In: 2018 32nd International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 380–385. IEEE (2018)

A Study on the Effect of Triads on the Wigner's Semicircle Law of Networks

Toyoaki Taniguchi(B) and Yusuke Sakumoto

School of Science and Technology, Kwansei Gakuin University, 2-1 Gakuen, Sanda, Hyogo 669-1337, Japan
{toyoaki-taniguchi,sakumoto}@kwansei.ac.jp

Abstract. Spectral graph theory is widely used to analyze network characteristics. In spectral graph theory, the network structure is represented with a matrix, and its eigenvalues and eigenvectors are used to clarify the characteristics of the network. However, it is difficult to accurately represent the structure of a social network with a matrix. We previously derived the Wigner's semicircle law, a universality that appears in the eigenvalue distribution of the normalized Laplacian matrix representing the structure of social networks, and proposed an analysis method that applies spectral graph theory to social networks using the Wigner's semicircle law. In previous works, we assumed that nodes in a network are connected independently. However, in actual social networks, there are dependent structures called triads where link connections cannot be independent. For example, a triad is generated when a person makes a new friend via an introduction by a friend. In this paper, we experimentally investigate the effect of triads on the Wigner's semicircle law, and clarify how effectively the Wigner's semicircle law can be used for the analysis of networks with triads.

1 Introduction

With the spread of social media (e.g., Twitter and Facebook), people take a growing interest in the phenomena of social networks. Social media has been used as a tool not only to distribute information from individual to individual, but also to encourage participation in collective activities such as political demonstrations. For a deep understanding of these phenomena on social media, it is important to clarify the characteristics of social networks. Spectral graph theory is widely used to analyze the characteristics of network structure [1–3]. In spectral graph theory, a network structure is represented as a matrix, and the eigenvalues and eigenvectors of the matrix are used to analyze the characteristics of the network. In order to apply this theory to analyze a social network, it is necessary to represent the social network structure with a matrix. To perform such a representation, all the elements of the matrix must be estimated so that they accurately reflect the structure of the real social network. Since the social network is large, the number of matrix elements is huge. Hence, it is impractical from a computational point of view to estimate the


matrix elements to reflect the reality. Also, since social network data raise privacy issues, it is difficult to collect the data needed to estimate the matrix elements. Therefore, even if the computational problem can be solved, the matrix elements cannot be accurately estimated due to privacy issues. Hence, before applying spectral graph theory to social network analysis, the difficulty of representing the social network structure should be solved. Random matrix theory studies the universality related to eigenvalues and eigenvectors of a random matrix whose elements are given by random variables. The universalities of random matrices are used in various fields such as atomic physics, financial engineering, and ecology [4–6]. A random matrix was first introduced to analyze the excited states of giant atoms such as uranium [4]. In order to analyze the excited states of an atom, in general, the matrix that represents the structure of the atom is used. However, since uranium has many electrons, it is difficult to represent its structure with a matrix. To avoid this difficulty, the universality of the eigenvalues of the random matrix was utilized, and the characteristics of the excited states were successfully clarified. In [7], we clarified the universality (i.e., the Wigner's semicircle law) related to the eigenvalues of the normalized Laplacian matrix that represents the structure of a social network. The Wigner's semicircle law is a limit theorem stating that the eigenvalue distribution of the normalized Laplacian matrix converges to a semicircle distribution. In [8], we showed that the clarified universality is useful to avoid representing the social network structure with a matrix, and therefore it is needed to accomplish social network analysis based on spectral graph theory. However, the universality clarified in [7] assumes that nodes in a network are connected independently. In actual social networks, there are dependent structures called triads (e.g., Fig. 1) where link connections cannot be independent. For example, a triad is generated when a person makes a new friend via an introduction by a friend. Therefore, it is necessary to clarify how network triads affect the Wigner's semicircle law clarified in [7]. In this paper, we experimentally investigate the effect of network triads on the Wigner's semicircle law. In the experiment, we generate a network with triads using the network model proposed in [9]. While we change the number of triads in the network, we evaluate the error between the eigenvalue distribution of the normalized Laplacian matrix and the semicircle distribution in the Wigner's

Fig. 1. An example of a triad in a social network


semicircle law. As a result, we show that the Wigner's semicircle law holds for networks with a small number of triads. This paper is organized as follows. Section 2 describes spectral graph theory and the Wigner's semicircle law. In Sect. 3, we explain the network generation model and the experiment method. In Sect. 4, we show the experimental results. Finally, in Sect. 5, we conclude this paper and discuss future works.

2 Preliminary

2.1 Network

We denote a network with n nodes by G(V, E), where V and E are the sets of nodes and links, respectively. Link (i, j) ∈ E has weight w(i, j), where w(i, j) = w(j, i) and w(i, j) > 0. Let ∂i be the set of adjacent nodes of node i. The degree and the weighted degree of node i are denoted by ki and di, respectively. Degree ki is given by |∂i|, which is the number of adjacent nodes of node i. Weighted degree di is defined by

$$d_i := \sum_{j \in \partial i} w(i, j). \qquad (1)$$

Adjacency matrix A and degree matrix D are used to represent the link structure and node structure of network G. They are defined by

$$A_{ij} := \begin{cases} w(i, j) & \text{if } (i, j) \in E \\ 0 & \text{otherwise} \end{cases}, \qquad (2)$$

$$D := \mathrm{diag}(d_i)_{i \in V}, \qquad (3)$$

respectively. Using adjacency matrix A and degree matrix D, normalized Laplacian matrix N is defined by

$$N := I - D^{-1/2} A D^{-1/2}, \qquad (4)$$

where I is the identity matrix. In spectral graph theory, N is a popular matrix, and has been utilized to represent the structure of various networks [1,3]. In [8], we showed that normalized Laplacian matrix N can also be utilized for social networks. Since normalized Laplacian matrix N is a symmetric matrix (i.e., N = N^T), its eigenvalues are real numbers. We define the l-th smallest eigenvalue of N as λl, where λ1 = 0 and 0 < λl < 2 for 2 ≤ l ≤ n, since we assume that G is not a bipartite graph and that there is a path between every pair of nodes in G. Let ql be the eigenvector of λl. Due to the symmetry of N, N can always be diagonalized using orthogonal matrix Q = (ql)_{1 ≤ l ≤ n}. In particular, the smallest eigenvector q1 is given by

$$q_1 = \frac{1}{\sqrt{\mathrm{Vol}(G)}} \left( \sqrt{d_1}, \sqrt{d_2}, \ldots, \sqrt{d_n} \right)^T, \qquad (5)$$

where Vol(G) is the sum of the weighted degrees di of all nodes, and is defined by

$$\mathrm{Vol}(G) := \sum_{i \in V} d_i. \qquad (6)$$

Using Λ = diag(λl)_{1 ≤ l ≤ n} and orthogonal matrix Q, normalized Laplacian matrix N is given by

$$N = Q \Lambda Q^T = \sum_{l=1}^{n} \lambda_l q_l q_l^T. \qquad (7)$$

According to Eq. (7), N can be obtained from its eigenvalues and eigenvectors. Hence, an analysis using spectral graph theory with the eigenvalues and eigenvectors of N is equivalent to an analysis directly using N. However, it is difficult to analyze a social network using spectral graph theory because the structure of social networks cannot be accurately represented by a matrix.
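As a concrete illustration of Eq. (4) (this code is not from the paper; a minimal NumPy sketch with a toy adjacency matrix):

import numpy as np

def normalized_laplacian(A):
    """N = I - D^{-1/2} A D^{-1/2} for a weighted adjacency matrix A."""
    d = A.sum(axis=1)                      # weighted degrees d_i
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt

# Since N is symmetric, eigh returns real eigenvalues in ascending
# order: lambda_1 = 0, and 0 < lambda_l < 2 for a connected,
# non-bipartite network.
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
eigenvalues = np.linalg.eigh(normalized_laplacian(A))[0]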

2.2 The Wigner's Semicircle Law

Random matrix theory discusses the universality of eigenvalues and eigenvectors of a random matrix whose elements are given by random variables. A notable universality in random matrix theory is the Wigner semicircle law, which states that the eigenvalues of a random matrix follow the semicircle distribution [4]. In [3], Chung et al. introduced the pioneering work in random matrix theory for unweighted networks, where w(i, j) = 1 for all links (i, j) ∈ E, and found the Wigner's semicircle law for N. In [7], we generalized the Wigner's semicircle law for N in [3] to weighted networks. We describe the Wigner's semicircle law [7] as follows. The eigenvalue distribution fn(λ) of normalized Laplacian matrix N for weighted network G with n nodes is defined by

$$f_n(\lambda) := \frac{1}{n-1} \sum_{l=2}^{n} \delta(\lambda_l - \lambda), \qquad (8)$$

where δ(x) is the Dirac delta function. Note that fn(λ) is the distribution of the eigenvalues from λ2 to λn. According to the Wigner's semicircle law [7], in the limit n → ∞, eigenvalue distribution fn(λ) converges to semicircle distribution f*(λ), which is defined by

$$f^*(\lambda) := \begin{cases} \dfrac{2}{\pi r^2} \sqrt{r^2 - (1 - \lambda)^2} & 1 - r < \lambda < 1 + r \\ 0 & \text{otherwise} \end{cases}, \qquad (9)$$

where r is called the spectral radius, and is defined by

$$r = \frac{2}{\sqrt{k_{avg}}} \frac{\sqrt{E[W^2]}}{E[W]}, \qquad (10)$$

where kavg is the average degree, and E[W^m] is the m-th moment of the link weights. However, in order to satisfy the Wigner's semicircle law [7], G should fulfill the following conditions:

Independence condition: Links (i, j) ∈ E are independently connected according to probability $p_{ij} = \rho\, k_i k_j$, where $\rho = 1/\sum_{i=1}^{n} k_i$ and $1/\rho \geq (\max_{i \in V} k_i)^2$ so that $p_{ij} < 1$.

Degree condition: Minimum degree kmin and average degree kavg fulfill $k_{min}^2 \gg k_{avg}\, w_{max}^2 / E[W^2]$, where wmax is the maximum of the link weights.

With the Wigner's semicircle law [7], we can analyze social networks on the basis of spectral graph theory using eigenvalues λl without accurately estimating all the elements of normalized Laplacian matrix N. However, before analyzing social networks using the Wigner's semicircle law, we should confirm the acceptance of the independence condition and the degree condition. In [7], we thoroughly investigated the acceptance of the degree condition, but have not discussed that of the independence condition. In actual social networks, there are dependent structures called triads (see Fig. 1) where link connections cannot be independent. For example, a triad is generated when a person makes a new friend via an introduction by a friend. Hence, such social networks may not satisfy the independence condition, and we should confirm that the Wigner's semicircle law holds in network G with triads.
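As an illustration only (not from the paper), Eq. (9) and Eq. (10) can be evaluated numerically; a minimal sketch:

import numpy as np

def spectral_radius(k_avg, weights):
    # r = 2 sqrt(E[W^2]) / (sqrt(k_avg) E[W]), Eq. (10).
    w = np.asarray(weights, dtype=float)
    return 2.0 * np.sqrt((w**2).mean()) / (np.sqrt(k_avg) * w.mean())

def semicircle(lam, r):
    # Semicircle distribution f*(lambda) of Eq. (9), centered at 1.
    lam = np.asarray(lam, dtype=float)
    inside = np.abs(1.0 - lam) < r
    out = np.zeros_like(lam)
    out[inside] = 2.0 / (np.pi * r**2) * np.sqrt(r**2 - (1.0 - lam[inside])**2)
    return out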

3 Experiment Method

In this paper, we experimentally investigate the effect of triads on the Wigner's semicircle law in network G. In this section, we describe the method used in our experiment.

3.1 Network Generation Method

In our experiment, we use the network generation model proposed in [9]. The model [9] is a variant of the BA (Barabási–Albert) model, and can generate scale-free networks with triads. Since social networks are scale-free networks with triads, the model [9] is suitable to investigate the acceptance of the conditions for the Wigner's semicircle law [7]. In this paper, we refer to the model [9] as the CBA model. Similarly to the BA model, the CBA model generates a network by adding one node at each time step. When a new node is added to the network, the BA model makes m new links using preferential attachment. On the other hand, the CBA model uses not only preferential attachment but also triad formation to make triads in network G. In the preferential attachment performed at time t, node i is randomly selected as the adjacent node of the added node by using selection probability pPA(t, i), which is defined by

$$p_{PA}(t, i) = \frac{k_i(t)}{\sum_{j=1}^{t} k_j(t)}, \qquad (11)$$


where ki(t) is the degree of node i at time t. In the preferential attachment, higher-degree nodes are more likely to be selected as adjacent nodes. Note that triads are rarely made with the preferential attachment alone. The triad formation is useful to make triads in network G. In the CBA model, the triad formation is probabilistically performed instead of the preferential attachment with probability

$$p_{TF} = \frac{m_{TF}}{m - 1}, \qquad (12)$$

where mTF is the expected number of triad formations performed (0 ≤ mTF ≤ m − 1). In the triad formation at time t, as the adjacent node of the added node, a node is randomly selected among the adjacent nodes of a node that has already been selected by the preferential attachment at time t. Hence, the triad formation makes at least one triad. In the CBA model, the number of triads formed in the network can be changed by adjusting mTF. If all the adjacent nodes selected by the preferential attachment have already been selected by the triad formation, then the preferential attachment is certainly performed. Specifically, we generate network G according to the following procedure of the CBA model.

Fig. 2. Example execution of step 3 of the CBA model with m = 2 and mTF = 1: (a) preferential attachment, (b) triad formation

1. Add one node to G, and add m nodes that connect to it.
2. Set time t to m + 2.
3. Add one node to G, and select its adjacent nodes by the following procedure:
a. Select one adjacent node by using the preferential attachment.
b. Select the remaining m − 1 adjacent nodes by using either the preferential attachment or the triad formation.
4. Increment t.
5. Repeat steps 3 and 4 until time t = n.

Figure 2 shows an example of the execution of step 3 when m = 2 and mTF = 1. In Fig. 2, node 5 is to be added to network G. In step 3a, as shown in Fig. 2(a), we first select one adjacent node by using the preferential attachment; in this example, we assume that node 4 is selected as the adjacent node of node 5. Next, the triad formation is performed, and node 1 or node 3, which are adjacent nodes of node 4, is randomly selected as the adjacent node of node 5. A sketch of this generation procedure is given below.
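The model of [9] is the Holme–Kim model, which networkx ships as powerlaw_cluster_graph; a minimal sketch, assuming the triad-formation probability p = mTF/(m − 1) from Eq. (12):

import networkx as nx

n, m, m_tf = 1000, 20, 5
# powerlaw_cluster_graph grows a scale-free network and, after each
# preferential-attachment edge, adds a triad-closing edge with
# probability p, corresponding to Eq. (12).
p = m_tf / (m - 1)
G = nx.powerlaw_cluster_graph(n, m, p)
print(nx.average_clustering(G))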


Network G generated with the CBA model has the same number of links as the network generated with the BA model, and so average degree kavg is approximated by

$$k_{avg} \approx 2m. \qquad (13)$$

Minimum degree kmin is given by m in network G generated with the CBA model. Therefore, if average degree kavg is sufficiently large, it is easier to satisfy the degree condition for the Wigner's semicircle law [7]. In this paper, average degree kavg is set to a sufficiently large value, and so the acceptance of the independence condition corresponds to that of the Wigner's semicircle law. In network G generated by the CBA model, degree distribution Pr[k] follows

$$\Pr[k] \propto k^{-3}, \qquad (14)$$

the same as in the BA network. In this paper, the link weights w(i, j) for all links (i, j) ∈ E are set to the same value, because we are in the initial stage of investigating the effect of the triads on the Wigner's semicircle law.

Fig. 3. Clustering coefficient in G for different settings of parameter mTF

The number of triads in network G is highly related to the clustering coefficient. Before the experiment, we confirm how parameter mTF affects the clustering coefficient in G. Figure 3 shows the clustering coefficient in G when varying parameter mTF. According to Fig. 3, the clustering coefficient increases linearly when mTF is small, but increases sharply as mTF approaches m, regardless of n. In the Twitter network that has 41.65 million users, the clustering coefficient is 0.000846 [10]. Hence, the number of triads in real social networks is not zero, but there are not many triads.

3.2 Evaluation Method for the Effect of the Triads on the Wigner's Semicircle Law

To evaluate the effect of the triads on the Wigner semicircle law, we discuss the relative error of distribution fn (λ) of the normalized Laplacian matrix N .


Let εn(mTF) be the relative error of distribution fn(λ) in network G with n nodes and parameter mTF. To obtain relative error εn(mTF) of the eigenvalue distribution, we first divide the interval [λ2, λn] into nh sub-intervals [θi − hb/2, θi + hb/2], where hb = (λn − λ2)/nh, and then average the relative errors in the sub-intervals. Namely, relative error εn(mTF) is defined by

$$\varepsilon_n(m_{TF}) := \frac{1}{n_h} \sum_{i=1}^{n_h} \frac{|F_n(\theta_i) - F^*(\theta_i)|}{F^*(\theta_i)}, \qquad (15)$$

where θi is (i − 1/2)hb + λ2. In the above equation, Fn(θi) is the eigenvalue frequency in the i-th sub-interval, and is calculated by dividing the number of eigenvalues of N in the i-th sub-interval by n − 1. F*(θi) is the integral of the semicircle distribution f*(λ) over the i-th sub-interval. If distribution fn(λ) matches semicircle distribution f*(λ), then Fn(θi) = F*(θi). The Wigner semicircle law discusses the limit distribution for n → ∞. Therefore, in order to clarify exactly how the triads affect the acceptance of the Wigner's semicircle law, we would need to perform the experiment with n → ∞. However, there is a limitation on the n that can be used in the experiment. Therefore, we confirm the convergence to the limit distribution by examining how relative error εn(mTF) decreases when n increases.
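A minimal sketch of Eq. (15), assuming the eigenvalues are available and that f_star(λ) is a scalar function implementing Eq. (9):

import numpy as np
from scipy.integrate import quad

def relative_error(eigvals, f_star, n_h=50):
    lam = np.sort(eigvals)[1:]          # drop lambda_1 = 0
    edges = np.linspace(lam[0], lam[-1], n_h + 1)
    # F_n: fraction of the n-1 eigenvalues in each sub-interval.
    F_n, _ = np.histogram(lam, bins=edges)
    F_n = F_n / len(lam)
    # F*: integral of the semicircle distribution over each sub-interval.
    F_star = np.array([quad(f_star, a, b)[0]
                       for a, b in zip(edges[:-1], edges[1:])])
    return np.mean(np.abs(F_n - F_star) / F_star)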

1,000

Number of links for added nodes used in the CBA model, m Parameter to adjust the number of triads, mTF Average degree kavg Link weights wij

Fig. 4. Relative error εn (mTF ) for different settings of n

20 5 40 1

Fig. 5. Relative error ratio εn(mTF)/ε1,000(mTF) for different settings of mTF




4 Experiment Result

This section describes the results of an experiment with the method described in Sect. 3. In the experiment, average degree kavg is set to a sufficiently large value, and so the acceptance of the independence condition corresponds to that of the Wigner's semicircle law. We use the parameter configuration shown in Table 1 as the default parameter configuration.

4.1 Do the Triads Affect the Acceptance of the Wigner's Semicircle Law?

We investigate how the triads affect the acceptance of the Wigner semicircle law. Figure 4 shows relative error εn (mTF ) of eigenvalue distribution fn (λ) when varying the number of nodes n and parameter mTF . According to Fig. 4, when mTF is relatively small (mTF = 0, 5), relative error εn (mTF ) of the eigenvalue distribution decreases as n increases.

Fig. 6. Eigenvalue distribution fn (λ) and semicircle distributions f ∗ (λ) for different settings of mTF

To examine the acceptance of the Wigner's semicircle law in more detail, we investigate the ratio εn(mTF)/εn1(mTF) of the relative errors of the eigenvalue distribution, where we set n1 to 1,000. Figure 5 shows the relative error ratio εn(mTF)/ε1,000(mTF) when parameter mTF is changed. According to Fig. 5, the relative error ratio is not a monotonically increasing function. The relative error ratio with mTF = 1 is smaller than that with mTF = 0. Also, the relative error



ratio with mTF ≤ 2 is almost the same as that with mTF = 0. Hence, eigenvalue distribution fn(λ) for mTF ≤ 2 should converge to the semicircle distribution f*(λ) in the limit n → ∞. Since the relative error ratio with mTF ≤ 5 is less than or equal to 1, the Wigner semicircle law holds. According to the above results, the Wigner's semicircle law holds even if network G has a small number of triads.

4.2 How Do the Triads Affect the Relative Error εn(mTF)?

First, we visually investigate the effect of the triads on relative error εn (mTF ). Figure 6 shows eigenvalue distribution fn (λ) and semicircle distribution f ∗ (λ) of the normalized Laplacian matrix N with parameters mTF = 0, 5 and 10. According to Fig. 6, we can visually see that eigenvalue distributions fn (λ) with mTF = 0 and mTF = 5 are close to semicircle distribution f ∗ (λ).

Fig. 7. Relative error εn (mTF ) for different settings of mTF

Fig. 8. Relative error ratio εn (mTF )/ εn (0) for different settings of mTF

Next, we quantitatively investigate the effect of the triads on relative error εn(mTF). Figure 7 shows relative error εn(mTF) of eigenvalue distribution fn(λ) when varying parameter mTF. According to Fig. 7, relative error εn(mTF) hardly changes when parameter mTF is small. On the other hand, relative error εn(mTF) increases significantly when mTF is large. For a more detailed discussion, we examine relative error ratio εn(mTF)/εn(0). Figure 8 shows relative error ratio εn(mTF)/εn(0) of eigenvalue distribution fn(λ) when varying parameter mTF. According to Fig. 8, the relative error is minimized with mTF = 2. According to the above results, a small number of triads in network G should affect the acceptance of the Wigner's semicircle law in a good direction. Hence, if a social network has some triads, we can still analyze it using the Wigner's semicircle law.


5 Conclusion and Future Work

In this paper, we experimentally investigated the effect of network triads on the Wigner’s semicircle law. In the experiment, we generated a network with triads



using the model [9], and investigated how the triads affect the acceptance of the Wigner's semicircle law. According to the experimental results, a small number of triads in a network should affect the acceptance of the Wigner's semicircle law in a good direction. Hence, if a social network has some triads, we can still analyze it using the Wigner's semicircle law. As future work, we are planning to investigate the acceptance of the Wigner's semicircle law in a more realistic social network model, and to analyze social networks based on real social network data.

Acknowledgement. This work was supported by JSPS KAKENHI Grant Number 19K11927.

References

1. Spielman, D.: Spectral graph theory, pp. 495–524 (2012)
2. Lovász, L.: Random walks on graphs: a survey. Comb. Paul Erdős Eighty 2, 353–398 (1996)
3. Chung, F., Lu, L., Vu, V.: Spectra of random graphs with given expected degrees. Nat. Acad. Sci. 100, 6313–6318 (2003)
4. Wigner, E.P.: Characteristic vectors of bordered matrices with infinite dimensions. Ann. Math. 62, 548–564 (1955)
5. Plerou, V., Gopikrishnan, P., Rosenow, B., Amaral, L.A.N., Guhr, T., Stanley, H.E.: Random matrix approach to cross correlations in financial data. Phys. Rev. E 65, 066126 (2002)
6. Tokita, K., Yasutomi, A.: Emergence of a complex and stable network in a model ecosystem with extinction and mutation. Theoret. Popul. Biol. 63, 131–146 (2003)
7. Sakumoto, Y., Aida, M.: The Wigner's semicircle law of weighted random networks. arXiv preprint arXiv:2004.00125, April 2020
8. Sakumoto, Y., Kameyama, T., Takano, C., Aida, M.: Information propagation analysis of social network using the universality of random matrix. IEICE Trans. Commun. E102.B, 391–399 (2019)
9. Holme, P., Kim, B.J.: Growing scale-free networks with tunable clustering. Phys. Rev. E 65(2), 026107 (2002)
10. Kunegis, J.: The Koblenz network collection. http://konect.uni-koblenz.de/. Accessed 19 Dec 2019

COVID-19-FAKES: A Twitter (Arabic/English) Dataset for Detecting Misleading Information on COVID-19

Mohamed K. Elhadad, Kin Fun Li(&), and Fayez Gebali

Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC, Canada
{melhaddad,kinli,fayez}@uvic.ca

Abstract. This paper aims to aid the ongoing research efforts for combating the Infodemic related to COVID-19. We provide an automatically annotated, bilingual (Arabic/English) COVID-19 Twitter dataset (COVID-19-FAKES). This dataset has been continuously collected from February 04, 2020, to March 10, 2020. For annotating the collected dataset, we utilized the shared information on the official websites and the official Twitter accounts of the WHO, UNICEF, and UN as a source of reliable information, together with the collected COVID-19 pre-checked facts from different fact-checking websites, to build a ground-truth database. Then, the Tweets in the COVID-19-FAKES dataset are annotated using 13 different machine learning algorithms, employing 7 different feature extraction techniques. We are making our dataset publicly available to the research community (https://github.com/mohaddad/COVID-FAKES). This work will help researchers in understanding the dynamics behind the COVID-19 outbreak on Twitter. Furthermore, it could help in studies related to sentiment analysis, the analysis of the propagation of misleading information related to this outbreak, the analysis of users' behavior during the crisis, the detection of botnets, and the analysis of the performance of different classification algorithms with various feature extraction techniques used in text mining. It is worth noting that, in this paper, we use the terms misleading information, misinformation, and fake news interchangeably.

1 Introduction

The coronavirus disease (COVID-19) has considerably impacted our lives. Although the significant role of various digital technologies and social network platforms in fighting against COVID-19 is apparent, they have also offered a ground for the exploitation of many social behavior vulnerabilities (e.g., the spread of different kinds of misinformation (fake news, propaganda, hoaxes, etc.) [1], stigma, hatred, racism [2], and cybercrimes [3]). Some organizations may profit from the spread of such misleading information [4]. Once misleading information is published, it becomes a rumor that attracts many users. Later, when this information becomes a trending topic, it is used by advertising organizations and companies to promote products or ideas and gain huge financial profits [5].


Therefore, the fight against the spread of any kind of misleading information, and the need for a system that assists in verifying the integrity of the shared information surrounding COVID-19, arises. The WHO is making a great effort in facing not only the medical consequences related to COVID-19 but also the spread of the Infodemic around it [6]. The WHO is working closely with different technology companies and social network platforms, such as Twitter, Facebook, YouTube, Google, and Microsoft, to endorse critical updates from reliable sources, and to point out the shared misleading information on their platforms [7]. Developing automated misleading-information detection systems is very challenging, especially with the lack of available standards and information related to the virus, which makes the consequences of wrong decisions dire [8]. Misleading-information detection systems utilize Artificial Intelligence (AI) [9], Machine Learning (ML), Deep Learning (DL), and Natural Language Processing (NLP) techniques [10]. These techniques are used to assist users in filtering the information they are viewing [11]. Moreover, they help in classifying whether a piece of information is misleading or not. This is done by comparing a given piece of information with some pre-known dataset that contains both misleading and truthful information [12]. In this paper, to aid all the work related to the detection of misinformation surrounding COVID-19, we introduce an automatically annotated, bilingual (Arabic/English) COVID-19 Twitter dataset (COVID-19-FAKES). This dataset has been continuously collected since February 04, 2020, four days after the outbreak was declared a Public Health Emergency of International Concern by the WHO on January 30, 2020, until March 10, 2020, one day before the outbreak was declared a pandemic. For performing the automated annotation task, we collect a set of ground-truth data related to COVID-19 by scraping the shared information, in both the Arabic and English languages, from the official websites and the official Twitter accounts of the WHO, UNICEF, and UN as a source of reliable information. Besides, we collect COVID-19 pre-checked facts from different fact-checking websites (e.g., "poynter.org", "snopes.com", "factcheck.org", etc.). We use these ground-truth data in building detection models with 13 different machine learning algorithms, employing 7 different feature extraction techniques with each. Then, we use these models to automatically annotate our dataset as either Real or Misleading. Section 2 reviews related work on the available COVID-19 datasets. Section 3 introduces the dataset collection and annotation process. The exploratory data analysis on the collected dataset is discussed in Sect. 4, while Sect. 5 concludes and gives directions for future work.

2 Related Work

Recently, several studies have emerged that aim at studying the shared COVID-19-related information on different social network platforms and at collecting datasets for enabling research in different domains. To the best of our knowledge, all these works depend on a list of hashtags related to COVID-19 and focus on a given period. Moreover, all the available datasets are general-purpose ones, with no clear annotation assigned to the data. Also, all the available datasets aim to study and analyze human and social behavior and information consumption surrounding COVID-19.


Chen et al. [13] collected a multilingual coronavirus dataset of 67M English Tweets and 101M non-English Tweets, intending to study online conversation dynamics. They used Twitter's streaming API [14] for collecting Tweets from January 22 to April 23, 2020. Lopez et al. [15] collected a multilingual dataset of around 6.5M Tweets to identify public responses to the pandemic and analyze the information related to it. They used the Twitter API for collecting their data from January 22 to March 13, 2020. Singh et al. [16] collected around 2.8M Tweets in multiple languages to investigate the amount of shared information and discussions on social network platforms, specifically Twitter, related to COVID-19, myths shared about the virus, and how much of it is connected to other high- and low-quality information on the Internet through shared URL links. They used the Twitter API for collecting their data from January 16 to March 15, 2020. Sharma et al. [17] collected 30.8M Tweets in multiple languages to design a dashboard for visualizing discussions around the coronavirus and identifying the quality of its related information shared on Twitter. They used the Twitter API for collecting their data from March 1 to March 30, 2020. Alqurashi et al. [18] collected nearly 4M Arabic-language Tweets on COVID-19 to study the pandemic from a social perspective and to analyze human behavior and information spread, with special consideration for Arabic-speaking countries. They used the Hydrator [19] and TWARC [20] tools to retrieve the full Tweet objects, covering the period March 1, 2020–March 30, 2020. Haouari et al. [21] collected 748k Arabic-language Tweets, in addition to the propagation networks of a subset of 65k Tweets, to enable research related to natural language processing, information retrieval, and social network analysis. They used the Twitter search API to retrieve the data daily, covering the period January 27, 2020–March 31, 2020. Zarei et al. [22] collected social media content from Instagram using hashtags related to COVID-19. They collected 5.3K posts, 18.5K comments, and 329K likes with the aim of identifying and analyzing social behavior in their collected data. They used the official Instagram API [23] for collecting their data from January 5 to March 30, 2020. Cui et al. [24] collected 1,896 news articles, 183,564 related user engagements, and 516 social platform posts about COVID-19 to call out for public attention to the spread of misinformation related to COVID-19 and to assist ongoing research to develop misinformation-detection systems.

3 Data-Collection

The latest statistics and facts on Twitter in 2019 showed that Twitter has almost 23 percent of Internet users on it, with 336 million monthly active users who publish around 6,000 Tweets every second, with a total of almost 500 million daily posted Tweets [25]. In 2014, Twitter introduced a project called Twitter Data Grants (TDG), through which it allows researchers to access Twitter's public and historical data, with some limitations. This access allows researchers to get insights from its massive set of data [26]. Twitter is one of the most popular and commonly used social network platforms. On Twitter, users can easily communicate with each other or share emotions, stories, and concerns, and get quick responses and feedback on different global issues, in the form of short posts of at most 280 characters [27, 28].

3.1 Twitter Data

The default configuration of Twitter accounts keeps all the posted Tweets public, although any post owner has the authority to make them accessible only to his approved followers or to certain group members. However, more than 90% of all Twitter accounts are public [29, 30]. These publicly available Tweets, including the user information, retweets, replies, and mentions, are available in JSON format through Twitter's provided API. The Twitter API allows access to both historic and real-time feeds. The search API allows querying Twitter for recent Tweets containing specific keywords, and requires an authorized application before obtaining results from the API. The streaming API allows the filtering of live-streaming Tweets by many identifiers, such as user ID, keyword, geographic location, or random sampling [26]. To use this API, users must register with their research project information; Twitter will then provide the project with a unique application ID and produce a set of credentials to access the API. The user can then use these credentials to set up a connection and query Twitter's Tweet database, either its historic data or its real-time feeds.

3.2 Tweets Collection

After establishing the connection, we used the following search keywords, which are the trending terms related to the coronavirus disease (COVID-19), to collect the corresponding shared Tweets: "Coronavirus", "Corona_virus", "Corona-virus", "Novel_Coronavirus", "2019-nCoV", "Novel-Coronavirus", "NovelCoronavirus", "2019_nCoV", "nCoV", "COVID-19", "SARS-CoV-2", "covid19". We used the streaming options of the API to collect real-time data. For the current release of the COVID-19-FAKES dataset, we ran our streaming process from February 04 until March 10, 2020. We collected 5,224,912 Tweets in 66 different languages, in addition to all the metadata associated with these Tweets. We stored the collected data in real time in our MySQL database. For the current work, we only consider the collected Tweets in the Arabic and English languages, with a total of 3,263,464 Tweets. Figure 1 shows the distribution of Tweets over the top-10 collected Tweet languages.
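A minimal sketch of such a keyword-filtered collection, assuming the tweepy 3.x streaming interface and placeholder credentials (the paper does not name the client library):

import tweepy

KEYWORDS = ["Coronavirus", "2019-nCoV", "nCoV", "COVID-19",
            "SARS-CoV-2", "covid19"]  # subset of the full list above

class CovidListener(tweepy.StreamListener):
    def on_status(self, status):
        # Store the full Tweet object (e.g., insert into MySQL).
        print(status.id_str, status.lang, status.text[:80])

# CONSUMER_KEY etc. are placeholders for the registered app credentials.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
stream = tweepy.Stream(auth, CovidListener())
stream.filter(track=KEYWORDS)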

Fig. 1. Distribution of top-10 collected Tweets’ languages


It should be noted that the Twitter API allows the return of at most 3,200 Tweets per query. Moreover, due to some technical issues (connection errors, Internet problems, power issues, etc.), we missed collecting some Tweets during parts of some days or for a few days.

3.3 Annotation Process

For assigning labels to the collected Tweets and building different annotation models, we collected a set of ground-truth information related to the COVID-19 disease. We rely on the published information on the official websites and official Twitter accounts of the WHO, UNICEF, and UN, which we perceive as trusted information sources. Additionally, we enriched the collected ground truth with the pre-checked facts from various fact-checking websites. The annotation process is divided into two phases, as shown in Fig. 2(a, b). The first is the model-building phase, which is designed to train a binary classification model for 13 different machine learning algorithms, using 7 different feature extraction techniques with each of the algorithms. The second phase is the annotation phase. In both phases, the data are passed through the same preparation, preprocessing, and feature engineering steps (feature selection and feature extraction). For generalization, we report the obtained class of each Tweet with each of the 7 feature extraction techniques (Term Frequency (TF), Term Frequency-Inverse Document Frequency (TF-IDF) (unigram, bigram, trigram, N-gram, character level), and Word Embedding), for each of the 13 machine learning algorithms (Decision Tree (DT), k-Nearest Neighbor (kNN), Logistic Regression (LR), Linear Support Vector Machines (LSVM), Multinomial Naive Bayes (MNB), Bernoulli Naive Bayes (BNB), Perceptron, Neural Network (NN), Ensemble Random Forest (ERF), Extreme Gradient Boosting (XGBoost), Bagging Meta-Estimator (BME), AdaBoost, and Gradient Boosting (GB)).
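As an illustration of one such model/feature combination (a sketch, not the authors' exact pipeline; the toy ground-truth samples are invented placeholders), TF-IDF unigram features with a linear SVM:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder ground truth: reliable statements (label 0 = Real) and
# pre-checked false claims (label 1 = Misleading).
texts = ["Wash your hands regularly", "5G spreads the virus"]
labels = [0, 1]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 1)), LinearSVC())
model.fit(texts, labels)
print(model.predict(["masks reduce transmission"]))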

Fig. 2. Annotation system


4 Exploratory Data Analysis (EDA)

EDA provides an in-depth understanding of the data. Moreover, the visual representation of text documents is considered one of the most important tasks in the field of text mining [31]. Its aim is not only to explore the content of documents from different aspects and at different levels of detail, but also to summarize a single document, show the words and topics, detect events, and create storylines [32]. However, there are still some gaps between visualizing unstructured textual data and structured data. For example, many text visualizations do not represent the text directly; they represent an output of a language model (word count, character length, word sequences, etc.). In this section, we give insights into the collected Tweets, not only exploring the Tweets but also visualizing their numeric and categorical features. We explore and visualize as much as we can, using Plotly's Python graphing library [33] and the Bokeh visualization library [34]. After a brief inspection of the collected Tweet data, we found there is a series of steps that we must perform first, as follows (a sketch of some of these steps is given after the list):

• Remove the Tweet's String Id column, as it is the same as the Tweet's Id.
• Remove rows where the Tweet's text is empty (if any exist).
• Perform cleaning on the textual data of all the Tweets. For English-written data, regular expressions are used to remove non-English words and words that contain symbolic characters, by using a mechanism that takes only the words that match the expression and discards those which do not match [11].
• Use the TextBlob Python library [35] for English Tweets, and the TextBlob-ar Python library [36] for Arabic Tweets, to create a new feature for the Tweet's sentiment by calculating the sentiment polarity, which lies in the range [−1, 1], where 1 means positive sentiment and −1 means negative sentiment.
• Create a new feature TweetCountry for the country that the stored Tweet's location belongs to, with standard country names for each Tweet.
• Create a new feature UserCountry for the country that the stored user's location belongs to, with standard country names for each user.
• Create a new feature for the length of the Tweet's text.
• Create a new feature for the word count of the Tweet's text.
• Create two new features, one for the Tweet date and one for the hour of the day the Tweet was posted, obtained by processing the timestamp associated with each Tweet.

For the collected English Tweets, we found that we have 3,047,026 Tweets, published by 993,320 users. Only around 2.2% (21,867 users) of these users are verified users on Twitter. Almost 32.523% (323,058 users) of these users have incomplete profile information. The complete data indicate they are from 285 countries and 118,247 locations, in addition to 64,259 undefined locations. For the collected Arabic Tweets, we found that we have 276,774 Tweets, published by 112,340 users. Only around 20.7% (23,241 users) of these users are verified users on Twitter. Almost 53.604% (60,219 users) of these users have incomplete profile information. The complete data indicate they are from 307 countries and 20,225 different locations, in addition to 9,787 undefined locations. After calculating the sentiment polarity score of the collected Tweets, we found that they are mostly neutral, leaning to the positive, for both the Arabic and the English Tweets, as shown in Fig. 3(a, b), respectively. Most of the sentiment polarities are greater than or equal to 0, which means most of the Tweets are positive. To investigate the rate of Tweeting every day over the collection period, Fig. 4(a, b) shows the distribution of Tweet publishing over the collection period.
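A minimal sketch of the sentiment-polarity feature, assuming the TextBlob library named above:

from textblob import TextBlob

def polarity(text):
    # Returns a score in [-1, 1]; -1 negative, 1 positive.
    return TextBlob(text).sentiment.polarity

print(polarity("Great news about the new treatment"))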

Fig. 3. Tweet's polarity distribution: (a) Arabic Tweets, (b) English Tweets

Fig. 4. Tweet's daily rate: (a) Arabic Tweets, (b) English Tweets


Figure 5(a, b) shows the distribution of Tweets over the top-10 countries that engaged in publishing Tweets in both languages.

Fig. 5. Tweets' countries distribution: (a) Arabic Tweets, (b) English Tweets

It could be noticed that the most active users are located in the United States of America and Saudi Arabia, while the next most active group for both Arabic and English Tweets comes from undefined countries (e.g., "The Kingdom of God", "???", "the planet of Kashyyyk", "Somewhere in this world", "Nowhere", etc.). Figure 6(a, b) shows the Tweet text length distribution for both Arabic and English Tweets.

Fig. 6. Tweets' length distribution: (a) Arabic Tweets, (b) English Tweets

In Fig. 6, although the character limit for a Tweet is 280 characters as imposed by Twitter, the calculated Tweet length shows that many Tweets exceed this limit. After further investigation of the collected Tweets, we found that URLs and HTML encodings (e.g., &amp;, &lt;, &gt;, etc.) affect the character count, although they are excluded from the Tweet's length limit. These data need cleaning to remove all this noise from the collected Tweets. Figure 7(a, b) shows the Tweet text length distribution for both Arabic and English Tweets after noise removal, while Fig. 8(a, b) shows the distribution of the Tweet's word count.
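A minimal sketch of this noise removal, assuming standard-library regular expressions and HTML unescaping:

import html
import re

URL_RE = re.compile(r"https?://\S+")

def clean_tweet(text):
    text = html.unescape(text)        # &amp; -> &, &lt; -> <, ...
    text = URL_RE.sub("", text)       # drop URLs from the length count
    return text.strip()

print(clean_tweet("Stay safe &amp; wash hands https://t.co/abc123"))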


Fig. 7. Cleaned Tweets' length distribution: (a) Arabic Tweets, (b) English Tweets

Fig. 8. Tweets' word count distribution: (a) Arabic Tweets, (b) English Tweets

From Fig. 7(a) and Fig. 8(a), we notice that most Arabic Tweets are 100 to 180 characters in length with 20–42 words, while from Fig. 7(b) and Fig. 8(b), we notice that most English Tweets are 90 to 120 characters in length with 22 to 44 words. This means that some users like to leave long Tweets, while most of the Tweets are short. We then investigated the relationship between the Tweet's sentiment polarity and its text length, as shown in Fig. 9(a, b).

Fig. 9. Distribution of sentiment polarity score by Tweets' text length: (a) Arabic Tweets, (b) English Tweets

Figure 10(a, b) shows the relation between the Tweets' sentiment polarity and their word count.

Fig. 10. Distribution of sentiment polarity score by Tweets' word count: (a) Arabic Tweets, (b) English Tweets

There are relatively few documents that are very positive or very negative. Tweets that have neutral to positive scores are more likely to have a text length greater than 50 and a word count of more than 26 words. Probably with this number of words users can give a good impression. For further analysis of the Tweets, we investigated the top-10 unigrams, bigrams, and trigrams before and after removing the stop words for both the Arabic and English data. Figure 11(a, b) and Fig. 12(a, b) show the unigram analysis, as a sample of our syntactic analysis of the collected data.

Fig. 11. Top-10 Arabic unigrams: (a) before removing the stop words, (b) after removing the stop words

Fig. 12. Top-10 English unigrams: (a) before removing the stop words, (b) after removing the stop words


It is clear from Fig. 11 and Fig. 12 that many stop words have high frequencies, which could badly affect the detection results and lead to an increase in the size of the extracted feature vector. Hence, the stop word removal and data cleaning tasks are mandatory for that reason. This syntactic analysis gives a good indication of the most frequent words and of the effect of removing unnecessary words from the used feature vector, serving the goal of dimensionality reduction.
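A minimal sketch of such an n-gram frequency analysis with the standard library (tokenization deliberately simplified):

from collections import Counter

def top_ngrams(texts, n=1, k=10):
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(k)

tweets = ["corona virus update", "virus update today"]
print(top_ngrams(tweets, n=1))   # top unigrams
print(top_ngrams(tweets, n=2))   # top bigrams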

5 Conclusion and Future Work

In this paper, we present COVID-19-FAKES, an automatically annotated misleading-information dataset about COVID-19 from Twitter. The dataset has been collected over a period of 36 days, from February 4 to March 10, 2020. It consists of 3.263M Arabic and English Tweets. We described in detail our data collection steps and how we conducted our annotation process. Our EDA showed the main features of the COVID-19-FAKES dataset. This work could help researchers in understanding the dynamics behind the COVID-19 outbreak on Twitter. Furthermore, it could help in studies related to sentiment analysis, the analysis of the propagation of misleading information related to this outbreak, the analysis of users' behavior during the crisis, the detection of botnets, and the analysis of the performance of different classification algorithms with various feature extraction techniques used in text mining. As future work, we could extend our proposed dataset to cover data from other languages, such as French, Spanish, Chinese, etc., and continuously update the dataset to cover dates after March 10, 2020.

References

1. Kefalaki, M., Karanicolas, S.: Communication's rough navigations: 'fake' news in a time of a global crisis. J. Appl. Learn. Teach. 3(1), 1–13 (2020)
2. Ziems, C., He, B., Soni, S., Kumar, S.: Racism is a virus: anti-Asian hate and counterhate in social media during the COVID-19 crisis. arXiv preprint arXiv:2005.12423 (2020)
3. Fontanilla, M.V.: Cybercrime pandemic. Eubios J. Asian Int. Bioethics 30(4), 161–165 (2020)
4. Taylor, C.R.: Advertising and COVID-19. Int. J. Advertising 39(5), 587–589 (2020)
5. Ansari, B., Ganjoo, M.: Impact of Covid-19 on advertising: a perception study on the effects on print and broadcast media and consumer behavior. Purakala ISSN 0971-2143 UGC CARE J. 31(28), 52–62 (2020)
6. Richtel, M.: W.H.O. fights a pandemic besides coronavirus: an 'infodemic' (2020). https://www.nytimes.com/2020/02/06/health/coronavirus-misinformation-social-media.html?searchResultPosition=1. Accessed 21 Mar 2020
7. Binti Hamzah, F.A., Lau, C.H., Nazri, H., Ligot, D.V., et al.: CoronaTracker: worldwide COVID-19 outbreak data analysis and prediction. Bull. World Health Organ. 1, 32 (2020)
8. Zarocostas, J.: How to fight an infodemic. Lancet 395(10225), 676 (2020)
9. Cybenko, A.K., Cybenko, G.: AI and fake news. IEEE Intell. Syst. 33(5), 1–5 (2018)


10. Oshikawa, R., Qian, J., Wang, W.Y.: A survey on natural language processing for fake news detection. arXiv preprint arXiv:1811.00770 (2018)
11. Elhadad, M.K., Li, K.F., Gebali, F.: A novel approach for selecting hybrid features from online news textual metadata for fake news detection. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 914–925 (2019)
12. Elhadad, M.K., Li, K.F., Gebali, F.: Fake news detection on social media: a systematic survey. In: 2019 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, Victoria, B.C., Canada (2019)
13. Chen, E., Lerman, K., Ferrara, E.: Covid-19: the first public coronavirus Twitter dataset. arXiv preprint arXiv:2003.07372 (2020)
14. Twitter Streaming API (2017). https://github.com/spatie/twitter-streaming-api. Accessed 21 Mar 2020
15. Lopez, C.E., Vasu, M., Gallemore, C.: Understanding the perception of COVID-19 policies by mining a multilanguage Twitter dataset. arXiv preprint arXiv:2003.10359 (2020)
16. Singh, L., Bansal, S., Bode, L., Budak, C., et al.: A first look at COVID-19 information and misinformation sharing on Twitter. arXiv preprint arXiv:2003.13907 (2020)
17. Sharma, K., Seo, S., Meng, C., Rambhatla, S., et al.: COVID-19 on social media: analyzing misinformation in Twitter conversations. arXiv preprint arXiv:2003.12309 (2020)
18. Alqurashi, S., Alhindi, A., Alanazi, E.: Large Arabic Twitter dataset on COVID-19. arXiv preprint arXiv:2004.04315 (2020)
19. Hydrator: Turn Tweet IDs into Twitter JSON & CSV from your desktop (2019). https://github.com/DocNow/hydrator. Accessed 21 Mar 2020
20. TWARC: A command line tool (and Python library) for archiving Twitter JSON (2019). https://github.com/DocNow/twarc. Accessed 21 Mar 2020
21. Haouari, F., Hasanain, M., Suwaileh, R., Elsayed, T.: ArCOV-19: the first Arabic COVID-19 Twitter dataset with propagation networks. arXiv preprint arXiv:2004.05861 (2020)
22. Zarei, K., Farahbakhsh, R., Crespi, N., Tyson, G.: A first Instagram dataset on COVID-19. arXiv preprint arXiv:2004.12226 (2020)
23. Instagram: Official API Graph Instagram (2020). https://developers.facebook.com/docs/instagram-api. Accessed 21 Mar 2020
24. Cui, L., Lee, D.: CoAID: COVID-19 healthcare misinformation dataset. arXiv preprint arXiv:2006.00885 (2020)
25. Mohamed Sikandar, G.: 100 social media statistics for 2019. Statusbrew Blog (2019). https://blog.statusbrew.com/social-media-statistics-2018-for-business/. Accessed 18 Nov 2019
26. Krikorian, R.: Introducing Twitter Data Grants. Twitter (2014). https://blog.twitter.com/engineering/en_us/a/2014/introducing-twitter-data-grants.html. Accessed 18 Nov 2019
27. Gligorić, K., Anderson, A., West, R.: How constraints affect content: the case of Twitter's switch from 140 to 280 characters. In: Proceedings of the Twelfth International AAAI Conference on Web and Social Media (2018)
28. Zubiaga, A., Aker, A., Bontcheva, K., Liakata, M., et al.: Detection and resolution of rumours in social media: a survey. ACM Comput. Surv. (CSUR) 52(2), 32 (2018)
29. Batrinca, B., Treleaven, P.C.: Social media analytics: a survey of techniques, tools, and platforms. AI Soc. 30(1), 89–116 (2015)
30. De Maio, C., Fenza, G., Loia, V., Orciuoli, F.: Unfolding social content evolution along with time and semantics. Future Gener. Comput. Syst. 66, 146–159 (2017)
31. Sahoo, K., Samal, A.K., Pramanik, J., Pani, S.K.: Exploratory data analysis using Python. Int. J. Innov. Technol. Exploring Eng. (IJITEE) 8(12), 4727–4735 (2019)
32. Kulkarni, A., Shivananda, A.: Exploring and processing text data. In: Natural Language Processing Recipes, pp. 37–65 (2019)

268

M. K. Elhadad et al.

33. Plotly Python Open Source Graphing Library (2020). https://plot.ly/python/. Accessed 21 Mar 2020 34. Bokeh Visualization Library (2019). https://docs.bokeh.org/en/latest/. Accessed 21 Mar 2020 35. TextBlob: Simplified Text Processing (2020). https://textblob.readthedocs.io/en/dev/. Accessed 21 Mar 2020 36. TextBlob-ar: Arabic Support for Textblob (2020). https://github.com/adhaamehab/textblob-ar. Accessed 21 Mar 2020

Precision Dosing Management with Intelligent Computing in Digital Health

Hong Lu1, Sara Rosenbaum2, and Wei Lu3

1 Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, Canada
2 College of Pharmacy, University of Rhode Island, Kingston, RI, USA
3 Department of Computer Science, Keene State College, USNH, Keene, NH, USA
[email protected]

Abstract. Pediatric dosing is not only critical for successful pediatric trials in drug development but also paramount to safe and effective treatment at the bedside. Because the pharmacokinetics of children are more complex than those of adults, clinicians face several challenges in managing dosing precisely, both during drug development and after drug approval. In particular, given real-world practice, understanding the impact of development on the dose-exposure-response relationship is essential for optimizing dosing for children of different ages. In this paper we propose a novel intelligent computing framework to examine how growth and maturation create size- and age-dependent variability in pharmacokinetics and pharmacodynamics, and we summarize the use of modeling-based approaches for dose finding in pediatric drug development, allowing clinicians to anticipate probable treatment effects and to have a higher likelihood of achieving optimal dose regimens early, as well as reducing drug development cycle time and cost.

1 Introduction

From birth onward, neonates, young infants and children develop with important age-dependent changes in body composition, in size (weight and height) and in maturation of hepatic and renal function [1]. These processes all have a major impact on the pharmacokinetic (PK) profile of a drug, from its absorption and distribution properties to its metabolism and elimination. As a result, the developmental changes in pharmacokinetics require an age-dependent adjustment of dosing regimens in children to achieve the target systemic exposure of a drug [2], which is measured by the area under the plasma concentration–time curve, AUC, or the steady-state plasma concentration, Css. The total exposure of a drug is determined by the efficiency of the elimination processes (Eq. 1). The apparent drug clearance (CL/F) is the principal PK process determining age-dependent differences in drug dosage regimens. Simple allometric approaches can be applied to the estimation of pediatric clearance based on the adult clearance and a power function of body weight (Eq. 2).


$$\mathrm{AUC_{SS}} = \frac{\mathrm{Dose}}{(CL/F)} \quad \text{or} \quad C_{SS} = \frac{(\mathrm{Dose}/\tau)}{(CL/F)} \tag{1}$$

$$CL_{child} = CL_{adult} \times \left(\frac{BW_{child}}{BW_{adult}}\right)^{b} \tag{2}$$

The allometric exponent, b, typically assumes a value of 1 (the per-kg model), 3/4 (the allometric ¾-power model), or 2/3 (the body surface area model) [3]. These models derived from body size are simple to use in clinical practice. However, they often fail to predict clearance in neonates and young infants, because the drug elimination pathways are not yet mature in the first year of life, even after size adjustment [4, 5]. Hence, a mechanism-based approach considering the underlying physiological and biochemical processes that govern drug elimination has been proposed. The advantage of this approach over size-only models is the ability to incorporate ontogeny information for the various anatomical, physiological and biochemical processes involved in drug elimination, although a tremendous input of physiological data is required [6]. Such a model has been applied to predict the clearance of model drugs for different pediatric age groups using commercial software such as Simcyp or PKSim [7, 8]. In this paper we develop an intelligent mechanistic model in R to estimate the "population mean clearance value" at any childhood age for selected model drugs, on the basis of known compound-specific information in the literature and published studies on developmental physiology and enzyme ontogeny in children. In particular, the rest of this paper is organized as follows. In Sect. 2, we discuss the selection of model drugs. Section 3 describes the compilation of the child clearance database. In Sects. 4 and 5 we present the physiologically based hepatic clearance scaling and physiologically based renal clearance scaling, respectively, and then in Sect. 6 we formalize the drug- and system-specific input for building up our computing framework, which is proposed in Sect. 7. We conclude the paper and describe future work in Sect. 8.
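As a concrete illustration of Eq. (2), the following minimal Python sketch computes a pediatric clearance from an adult value under each of the three allometric exponents. This is our illustrative transcription (the authors' implementation is in R), and the numeric inputs are placeholders, not values from this study.

```python
def allometric_clearance(cl_adult, bw_adult, bw_child, b):
    """Scale adult clearance to a child by body weight (Eq. 2)."""
    return cl_adult * (bw_child / bw_adult) ** b

# Illustration: a 7.7 mL/min/kg adult clearance scaled to a 10 kg child
cl_adult_total = 7.7 * 70           # per-kg clearance converted to mL/min for a 70 kg adult
for b in (1.0, 0.75, 2.0 / 3.0):    # per-kg, 3/4-power, body-surface-area exponents
    cl_child = allometric_clearance(cl_adult_total, 70.0, 10.0, b)
    print(f"b = {b:.2f}: CL_child = {cl_child:.1f} mL/min")
```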

2 Selection of Model Drugs

Probe substrates for cytochrome P450 phase I metabolism and renal excretion were selected according to the following criteria: (1) the primary pathway of elimination due to one process in healthy adults accounts for >80% of an oral dose; (2) complete absorption (or >90%) from the gastrointestinal tract after oral administration (po); for compounds showing incomplete absorption, only IV data were used; (3) the probe is routinely administered for clinical indications in ill neonates, infants and children. Model drugs also need to have established clinical use in adults and pediatric patients of all ages, the availability of published data on in vivo clearance for different age groups, and adequate published data on their in vivo absorption, distribution, metabolism and excretion (ADME), namely the contribution of each clearance


pathway to total clearance. The list of drugs and major clearance mechanisms are shown in Table 1. Clearance mechanisms were identified from a pharmacology textbook [9], a key review article [10], drug labels, as well as the primary literature for individual compounds.

Table 1. Elimination pathways for probe substrates in healthy adult volunteers.

| Drug | Route | % Excreted | Metabolic pathways |
|---|---|---|---|
| Alfentanil | IV | >90% metabolized | CYP3A4 (>80%) [11] |
| Midazolam | IV | >90% metabolized | CYP3A4 (>80%) [11] |
| Caffeine | IV | >90% metabolized | CYP1A2 (>90%) [12] |
| Theophylline | IV | >90% metabolized | CYP1A2 (>90%) [12] |
| Gentamicin | IV | 82 ± 10% in urine | Glomerular filtration [13] |
| Vancomycin | IV | >90% in urine | Glomerular filtration [10] |

Table 1 provides an estimate of the percentage of parent compound processed by the major elimination pathways. These compounds are grouped according to the primary process of clearance, namely renal, CYP3A4 and CYP1A2 elimination. Alfentanil and midazolam are metabolized by CYP3A as the primary route of disposition. Alfentanil is extensively oxidized via two major N-dealkylation pathways, both of which are mediated by CYP3A4 in human liver microsomes. Human in vivo studies have shown that more than 80% of an iv dose is recovered in the urine of healthy adult volunteers as CYP3A4 metabolites [11]. Midazolam metabolism is mediated by CYP3A4 to 1-hydroxy and 4-hydroxy derivatives; the former corresponds to the main metabolite, and a minimum of 70% of an oral dose and 77% of an iv dose is recovered in the urine within 24 h as this metabolite [11]. The predominant biotransformations of theophylline and caffeine are catalyzed by CYP1A2. Caffeine is primarily transformed via three major N-demethylation reactions (1-, 3- or 7-N-demethylation), producing theobromine, paraxanthine or theophylline. The N3-demethylation pathway is catalyzed by CYP1A2 with high affinity and accounts for 80% of the metabolism of caffeine in humans [12]. Theophylline undergoes C-8 oxidation as the major metabolic route, accounting for 49.1% of the total urinary excretion, together with oxidative 1- and 3-N-demethylation (17.5% and 24.5%, respectively). These reactions are mediated by the CYP1A2 isoform at pharmacological concentrations. Over 90% of gentamicin is excreted unchanged in urine through glomerular filtration [13]. Similarly, over 80% of vancomycin is eliminated into the urine unchanged [9].


3 Compilation of Child Clearance Database

After model drug selection, computerized literature searches (PubMed, 1970–present) were conducted to find references or publications describing the pharmacokinetics of the probe substrates in children, using words such as newborn, neonate, infant and children, and crossing these with terms such as the probe drug names and pharmacokinetics. Additionally, a variety of pediatric pharmacology reviews [14–19] were examined to identify drugs for which PK datasets exist for children. Next, the primary PK studies in the published literature were evaluated to extract key data including weight, gender, age, drug administration route, the number of doses (single or multiple), the number of subjects, and PK findings such as total clearance or apparent volume of distribution. Through these sources, a database of age-dependent observed clearances for 6 therapeutic probes was compiled, based upon the availability of data for pertinent age groups (especially very early life stages), being able to obtain the primary data sources (CL and body weight), and having a reasonable number of subjects (at least 3 per age group). A scan of the database shows that for these model compounds the weight-normalized clearance in neonates and young infants appears different from that in adults. Table 2 was extracted from the database to illustrate this pattern and to show how the data have been compiled and organized. The observed adult clearance value was the weighted mean, and only the mean adult clearance value was regarded in this study.

Table 2. Some instances of the children's clearance database (n = number of subjects in study; GA = gestational age; PNA = postnatal age; BW = body weight; CL = plasma clearance)

| Pathway | Drug | Subj n | GA (wk) | PNA | PNA unit | BW (kg) | CL (mL/min/kg) | CL sd |
|---|---|---|---|---|---|---|---|---|
| CYP3A4 [26] | Midazolam | 6 | 39 | 5.2 | y | 18.4 | 11.98 | 6.68 |
| CYP3A4 [26] | Midazolam | 6 | 39 | 4.7 | y | 15.9 | 8.53 | 1.8 |
| CYP3A4 [26] | Midazolam | 5 | 39 | 1.3 | y | 8.8 | 9.07 | 3.35 |
| CYP3A4 [11] | Midazolam | 198 | 39 | 18 | y | 70 | 7 | 1.5 |

4 Physiologically Based Hepatic Clearance Scaling

The physiologically based hepatic clearance scaling approach involves an in vitro–in vivo extrapolation of enzyme activity data determined in hepatic microsomal preparations from different pediatric age groups as well as from adults. Briefly, the adult intrinsic clearance (CLint,adult) is back-calculated from the in vivo hepatic drug clearance (CLH), the free fraction in plasma (fu), the blood-to-plasma drug concentration ratio (CB/CP), and the hepatic blood flow (QH), using the well-stirred model (Eq. 3) [20, 21].

$$CL_{int,adult} = \frac{Q_{H,adult} \times CL_{H,adult}}{f_{u,adult} \times \left(Q_{H,adult} - CL_{H,adult} / (C_B/C_P)\right)} \tag{3}$$

The generated adult intrinsic clearance value is then multiplied by a scaling factor that represents the activity of the specific enzyme in relation to the age of the child (Eq. 4). This new child-scaled intrinsic clearance (CLint,child) is used to generate an age-specific hepatic clearance, calculated from the re-arranged equation (Eq. 5) using the age-specific body weight, liver weight, liver blood flow and predicted fraction unbound (scaled from adults based on binding protein concentrations in blood).

$$CL_{int,child} = CL_{int,adult} \times SF \tag{4}$$

Fig. 1. Scaling clearances from adult to children for enzymatic hepatic clearance pathway


$$CL_{H,child} = \frac{Q_{H,child} \times f_{u,child} \times CL_{int,child}}{Q_{H,child} + f_{u,child} \times CL_{int,child} / (C_B/C_P)} \tag{5}$$

Figure 1 gives an overview of the process involved in scaling clearances from adults to children. Because the elimination of midazolam and alfentanil is primarily due to CYP3A4 metabolism, their hepatic clearance is assumed to be close to their plasma clearance. The same assumption applies to theophylline and caffeine, whose elimination is primarily due to CYP1A2 metabolism.
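To make the three-step workflow of Eqs. (3)–(5) concrete, the following Python sketch chains the back-calculation, the ontogeny scaling and the forward calculation. This is our illustrative transcription (the authors implement their model in R), and the child parameters below are placeholders, not study values.

```python
def cl_int_adult(cl_h, fu, qh, b_p):
    """Back-calculate adult intrinsic clearance (Eq. 3, well-stirred model)."""
    return (qh * cl_h) / (fu * (qh - cl_h / b_p))

def cl_h_child(qh_child, fu_child, cl_int_adult_value, osf, b_p):
    """Scale intrinsic clearance by the ontogeny factor (Eq. 4) and
    recompute hepatic clearance for the child (Eq. 5)."""
    cl_int_child = cl_int_adult_value * osf
    return (qh_child * fu_child * cl_int_child) / (
        qh_child + fu_child * cl_int_child / b_p)

# Midazolam-like adult inputs (Table 3, Table 5); child values are placeholders
cl_int = cl_int_adult(cl_h=7.7 * 70, fu=0.02, qh=1657.5, b_p=0.55)
print(cl_h_child(qh_child=600.0, fu_child=0.025,
                 cl_int_adult_value=cl_int, osf=0.8, b_p=0.55))
```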

5 Physiologically Based Renal Clearance Scaling

For renally eliminated drugs, it is well accepted that the renal clearance is proportional to the glomerular filtration rate (GFR). For example, surrogate measures of GFR are often used to adjust the dosing rate in adults with impaired renal function [22]. Extending these concepts to the adaptation of the adult regimen for the child leads to the proposition that the renal clearance in the child, expressed as a fraction of the adult value, is adjusted proportionally by GFR and the free fractions in plasma (Eq. 6) [8].

$$\frac{CL_{GFR,child}}{CL_{GFR,adult}} = \frac{GFR_{child}}{GFR_{adult}} \times \frac{f_{u,child}}{f_{u,adult}} \tag{6}$$

where CLGFR is the compound-specific renal clearance (mL/min) and GFR is the glomerular filtration rate (mL/min). Gentamicin and vancomycin are used to evaluate this renal clearance model because they are excreted exclusively via filtration in the kidney, and CLGFR is close to the plasma clearance. Figure 2 gives an overview of the process involved in scaling clearances from adults to children.
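A corresponding one-function Python rendering of Eq. (6); again, a sketch with placeholder GFR and unbound-fraction inputs rather than study data.

```python
def cl_renal_child(cl_adult, gfr_child, gfr_adult, fu_child, fu_adult):
    """Scale adult renal clearance to a child by GFR and unbound fraction (Eq. 6)."""
    return cl_adult * (gfr_child / gfr_adult) * (fu_child / fu_adult)

# Gentamicin-like adult CL (Table 3); GFR and child fu values are placeholders
print(cl_renal_child(cl_adult=1.3 * 70, gfr_child=30.0, gfr_adult=120.0,
                     fu_child=0.97, fu_adult=0.95))
```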

Fig. 2. Scaling clearances from adult to children for renal elimination pathway


6 Drug and System Specific Input

Table 3 lists the drug-specific parameters obtained from the literature [9, 11–13, 23], such as CLadult, fu and CB/CP. The adult plasma clearance values are geometric means from different PK studies in which the drugs were administered by i.v. injection. The CL of alfentanil in healthy adult volunteers was obtained from a total of 241 healthy adult volunteers in 9 studies after intravenous administration, and the geometric mean CL was 4.7 ml/min/kg (SD: 2.3) [11]. The typical clearance of midazolam in adults was estimated from a total of 198 healthy adult volunteers in 4 studies, and was 7.7 ml/min/kg (SD: 3.7) [11].

Table 3. Summary of drug-specific input in adults

| Drug | Major binding plasma protein | Unbound fraction in plasma, fu | Blood:plasma partition ratio (Cb/Cp) | Adult drug CL (mL/min/kg) |
|---|---|---|---|---|
| Midazolam | Albumin | 0.02 | 0.55 | 7.7 |
| Alfentanil | alpha1-acid glycoprotein | 0.1 | 0.63 | 4.7 |
| Theophylline | Albumin | 0.44 | 0.82 | 1.0 |
| Caffeine | Albumin | 0.68 | 1 | 1.97 |
| Gentamicin | Albumin | 0.95 | n.a. | 1.3 |
| Vancomycin | Albumin | 0.7 | n.a. | 1.22 |

The mean clearance of caffeine in adults after iv administration was estimated from 20 subjects in 2 studies, and the value was 1.97 mL/min/kg (SD: 0.92) [12]. The theophylline clearance in healthy adults after iv administration was estimated from 100 subjects in 12 studies and was 1.0 mL/min/kg (SD: 0.29) [12]. The total clearance of gentamicin in healthy adults after iv administration was estimated from 219 subjects in 6 studies, and was 1.3 mL/min/kg (SD: 0.5) [13]. The total clearance of vancomycin in healthy adults after iv administration was estimated from 121 subjects in 6 studies, and the mean value was 1.22 mL/min/kg (SD: 0.5) [24]. Physiological parameters such as plasma protein binding level, hepatic blood flow, liver volume and enzyme activity vary with age. The empirical regression functions that generate age-appropriate parameters and account for the developmental differences between infants and adults are shown in Table 4. The mean physiological inputs are listed in Table 5 for a typical male adult, taken from ICRP [25].


Table 4. A summary of regression equations to calculate age-specific physiological and biochemical input (a is the postnatal age in years; BW is body weight in kg; age(day) is the postnatal age in days; "–" is a unitless fraction)

| Parameter | Unit | Age range of observation | Regression equation |
|---|---|---|---|
| Body weight (BW)* | kg | day 1–18 yr | BW = 4.2986 + 5.4396a − 0.9175a^2 + 0.091a^3 − 0.0026a^4 |
| Height (HT)* | cm | day 1–18 yr | HT = 53.674 + 22.304a − 3.759a^2 + 0.386a^3 − 0.0179a^4 + 0.0003a^5 |
| CYP3A4 ontogeny scaling factor (OSF) | – | – | OSF_CYP3A4 = 0.0835 + 0.894·age(day) / (139 + age(day)) |
| CYP1A2 ontogeny scaling factor (OSF) | – | – | OSF_CYP1A2 = 0.0078 + 0.531·age(day) / (100 + age(day)) |
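For readers who want to reproduce the ontogeny inputs, the following Python sketch encodes the Table 4 regressions. It is our illustrative transcription, not part of the authors' R package.

```python
def body_weight_kg(a):
    """Body weight regression from Table 4 (a = postnatal age in years)."""
    return 4.2986 + 5.4396*a - 0.9175*a**2 + 0.091*a**3 - 0.0026*a**4

def osf_cyp3a4(age_days):
    """CYP3A4 ontogeny scaling factor from Table 4."""
    return 0.0835 + 0.894 * age_days / (139 + age_days)

def osf_cyp1a2(age_days):
    """CYP1A2 ontogeny scaling factor from Table 4."""
    return 0.0078 + 0.531 * age_days / (100 + age_days)

for years in (0.1, 1, 5, 18):
    days = years * 365
    print(f"{years:>4} y: BW = {body_weight_kg(years):5.1f} kg, "
          f"OSF_3A4 = {osf_cyp3a4(days):.2f}, OSF_1A2 = {osf_cyp1a2(days):.2f}")
```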

Table 5. Physiological input used in the physiologically based clearance scaling model for a normal male adult

| Parameter | Unit | Input |
|---|---|---|
| Body weight (BW) | kg | 70 |
| Body surface area (BSA) | m² | 1.9 |
| Cardiac output (CO) | mL/min | 6500 |
| Liver blood flow rate (% cardiac output) | fraction | 25.5% |
| Liver blood flow (QH) | mL/min | 1657.5 |
| Microsomal protein per gram liver (MPPGL) | mg/gram | 45 |
| Liver weight (LW) | gram | 1800 |

7 Computing Framework with Model Evaluation and Simulation

The model was compiled as a function package in R. Figure 3 illustrates our intelligent computing framework, called iDose, including the GUI, dose data visualization, the children's clearance database, the regression modules and their correlation components. Clearance predictions were compared against literature values. To determine the ability of the ontogeny models to predict the observed clearances, the correlation between observed and predicted clearances for the model compounds was determined, as well as a measure of precision (the percentage of predicted values within 2-fold of the observed values). Pearson's correlation coefficients between observations and predictions were calculated with R. Simulations were performed using 500 virtual pediatric subjects with ages ranging from birth to 18 years. Cubic spline curves of predicted clearance versus age were generated and evaluated against observed in vivo clearance values for the probe substrates in children.

[Figure 3 depicts the iDose architecture: a GUI and SQL interface over an RDBMS holding the children's clearance data, connected through visualization logic and data access/dose correlation components to the BW/HT modules, OSF modules and an importer, each driven by the adult input.]

Fig. 3. Framework of dose management system, iDose

[Figure 4 plots predicted versus observed clearance (mL/min) on log–log axes with a unity line, for observed term and preterm subjects, in six panels: A. caffeine, B. theophylline, C. alfentanil, D. midazolam, E. gentamicin, F. vancomycin.]

Fig. 4. Comparative studies of model compounds


Table 6. Percentage of clearance predictions within 2-fold of the observed values (success rate) and correlation coefficient between observations and predictions

| Drug | Success rate (overall) | Success rate (preterm neonate) | Pearson's r (overall) | Pearson's r (preterm neonate) |
|---|---|---|---|---|
| Alfentanil | 82% (89/108) | 67% (4/6) | 0.85 | 0.73 |
| Midazolam | 78% (26/38) | 43% (3/7) | 0.96 | −0.183 |
| Caffeine | 77% (41/53) | 78% (28/36) | 0.93 | 0.99 |
| Theophylline | 68% (125/183) | 71% (34/48) | 0.88 | 0.948 |
| Gentamicin | 62% (58/94) | 40% (21/53) | 0.96 | 0.757 |
| Vancomycin | 66% (47/71) | 40% (16/40) | 0.97 | 0.933 |

Clearance predictions for each of the model compounds were plotted against the observations in Fig. 4. For the CYP3A4 substrates, 82% (89/108) of predicted values were within 2-fold of the observed values for alfentanil, while 78% (26/38) were within 2-fold of the observed values for midazolam. For the CYP1A2 probe substrates, 77% (41/53) of predicted values were within 2-fold of the observed values for caffeine, and 68% (125/183) were within 2-fold of the observed values for theophylline. About 62% (58/94) of predicted values were within 2-fold of the observed values for gentamicin, and 66% (47/71) for vancomycin. There was good correlation between the observed and predicted values for each of the model drugs, as shown in Table 6. The overall coefficients of correlation were 0.883, 0.909 and 0.962, respectively, for CYP1A2-metabolized elimination, CYP3A4-metabolized elimination and GFR-mediated renal excretion.
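As an illustration of how these evaluation metrics are computed, the Python sketch below (using NumPy, our choice; the authors used R) evaluates the 2-fold success rate and Pearson correlation on placeholder data.

```python
import numpy as np

def two_fold_success_rate(observed, predicted):
    """Fraction of predictions within 2-fold of the observed clearances."""
    ratio = np.asarray(predicted) / np.asarray(observed)
    return float(np.mean((ratio >= 0.5) & (ratio <= 2.0)))

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observations and predictions."""
    return float(np.corrcoef(observed, predicted)[0, 1])

obs = [5.1, 12.0, 30.5, 80.0]    # placeholder clearance values (mL/min)
pred = [4.2, 15.0, 70.0, 75.0]
print(two_fold_success_rate(obs, pred), pearson_r(obs, pred))
```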

8 Conclusions and Future Work

The physiology-based scaling model predicts an age- and elimination-pathway-specific clearance for the ontogeny of renal clearance and of the metabolic clearances of CYP1A2 and CYP3A4. As it is developed with the open program R, the model provides a valuable resource for informed individuals to understand the physiologically based clearance scaling approach and to use it, without having to rely on "black-box" simulation software. In the future we will deploy the R-based computing models on a back-end server and develop a front-end client app on a mobile platform, allowing clinicians to anticipate probable treatment effects and to have a higher likelihood of achieving optimal dose regimens early, as well as reducing drug development cycle time and cost.

References

1. Allegaert, K., et al.: Developmental pharmacology: neonates are not just small adults. Acta Clin. Belg. 63(1), 16–24 (2008)


2. Johnson, T.N.: Modelling approaches to dose estimation in children. Br. J. Clin. Pharmacol. 59(6), 663–669 (2005)
3. Anderson, B.J., Meakin, G.H.: Scaling for size: some implications for paediatric anaesthesia dosing. Paediatr. Anaesth. 12(3), 205–219 (2002)
4. Alcorn, J., McNamara, P.J.: Using ontogeny information to build predictive models for drug elimination. Drug Discov. Today 13(11–12), 507–512 (2008)
5. Allegaert, K., et al.: Determinants of drug metabolism in early neonatal life. Curr. Clin. Pharmacol. 2(1), 23–29 (2007)
6. Edginton, A.N.: Knowledge-driven approaches for the guidance of first-in-children dosing. Paediatr. Anaesth. 21(3), 206–213 (2011)
7. Johnson, T.N., Rostami-Hodjegan, A., Tucker, G.T.: Prediction of the clearance of eleven drugs and associated variability in neonates, infants and children. Clin. Pharmacokinet. 45(9), 931–956 (2006)
8. Edginton, A.N., et al.: A mechanistic approach for the scaling of clearance in children. Clin. Pharmacokinet. 45(7), 683–704 (2006)
9. Brunton, L.L., Lazo, J., Parker, K.L. (eds.): Goodman & Gilman's The Pharmacological Basis of Therapeutics, 11th edn. McGraw-Hill Professional, New York (2005)
10. Bertz, R.J., Granneman, G.R.: Use of in vitro and in vivo data to estimate the likelihood of metabolic pharmacokinetic interactions. Clin. Pharmacokinet. 32(3), 210–258 (1997)
11. Dorne, J.L., Walton, K., Renwick, A.G.: Human variability in CYP3A4 metabolism and CYP3A4-related uncertainty factors for risk assessment. Food Chem. Toxicol. 41(2), 201–224 (2003)
12. Dorne, J.L., Walton, K., Renwick, A.G.: Uncertainty factors for chemical risk assessment. Human variability in the pharmacokinetics of CYP1A2 probe substrates. Food Chem. Toxicol. 39(7), 681–696 (2001)
13. Dorne, J.L., Walton, K., Renwick, A.G.: Human variability in the renal elimination of foreign compounds and renal excretion-related uncertainty factors for risk assessment. Food Chem. Toxicol. 42(2), 275–298 (2004)
14. de Wildt, S.N., Johnson, T.N., Choonara, I.: The effect of age on drug metabolism. Paediatr. Perinat. Drug Ther. 5(3), 101–106 (2003)
15. Bjorkman, S.: Prediction of cytochrome P450-mediated hepatic drug clearance in neonates, infants and children: how accurate are available scaling methods? Clin. Pharmacokinet. 45(1), 1–11 (2006)
16. Anderson, B.J., Larsson, P.: A maturation model for midazolam clearance. Paediatr. Anaesth. 21(3), 302–308 (2011)
17. Ginsberg, G., et al.: Evaluation of child/adult pharmacokinetic differences from a database derived from the therapeutic drug literature. Toxicol. Sci. 66(2), 185–200 (2002)
18. Alcorn, J., McNamara, P.J.: Ontogeny of hepatic and renal systemic clearance pathways in infants: part II. Clin. Pharmacokinet. 41(13), 1077–1094 (2002)
19. Suzuki, S., et al.: Estimating pediatric doses of drugs metabolized by cytochrome P450 (CYP) isozymes, based on physiological liver development and serum protein levels. Yakugaku Zasshi 130(4), 613–620 (2010)
20. Yang, J., et al.: Misuse of the well-stirred model of hepatic drug clearance. Drug Metab. Dispos. 35(3), 501–502 (2007)
21. Schmidt, S., Gonzalez, D., Derendorf, H.: Significance of protein binding in pharmacokinetics and pharmacodynamics. J. Pharm. Sci. 99(3), 1107–1122 (2010)
22. Verbeeck, R.K., Musuamba, F.T.: Pharmacokinetics and dosage adjustment in patients with renal dysfunction. Eur. J. Clin. Pharmacol. 65(8), 757–773 (2009)
23. Uchimura, T., et al.: Prediction of human blood-to-plasma drug concentration ratio. Biopharm. Drug Dispos. 31(5–6), 286–297 (2010)


24. Guay, D.R., et al.: Comparison of vancomycin pharmacokinetics in hospitalized elderly and young patients using a Bayesian forecaster. J. Clin. Pharmacol. 33(10), 918–922 (1993)
25. Basic anatomical and physiological data for use in radiological protection: reference values. ICRP Publication 89. Ann. ICRP 32(3–4), 5–265 (2002)
26. Mathews, H.M., et al.: A pharmacokinetic study of midazolam in paediatric patients undergoing cardiac surgery. Br. J. Anaesth. 61(3), 302–307 (1988)

Optimal Number of MOAP Robots for WMNs Using Silhouette Theory

Atushi Toyama1, Kenshiro Mitsugi1, Keita Matsuo2, and Leonard Barolli2

1 Graduate School of Engineering, Fukuoka Institute of Technology (FIT), 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan {mgm20105,mgm20108}@bene.fit.ac.jp
2 Department of Information and Communication Engineering, Fukuoka Institute of Technology (FIT), 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan {kt-matsuo,barolli}@fit.ac.jp

Abstract. Recently, various communication technologies have been developed in order to satisfy the requirements of many users. In particular, mobile communication technology continues to develop rapidly, and Wireless Mesh Networks (WMNs) are attracting attention from many researchers as a way to provide cost-efficient broadband wireless connectivity. The main issue in WMNs is to improve network connectivity and stability in terms of user coverage. In our previous work, we presented the Moving Omnidirectional Access Point (MOAP) robot. The MOAP robot should move omnidirectionally in real space to provide good communication and stability for WMNs. For this reason, we need to find the optimal number of MOAP robots. In this paper, we use silhouette theory to decide the optimal number of MOAP robots for WMNs in order to achieve a good communication environment.

1 Introduction

Recently, communication technologies have been developed in order to satisfy the requirements of many users. In particular, mobile communication technologies continue to develop rapidly and have facilitated the use of laptops, tablets and smartphones in public spaces [4]. In addition, Wireless Mesh Networks (WMNs) [1] are becoming an important network infrastructure. These networks are made up of wireless nodes organized in a mesh topology, where mesh routers are interconnected by wireless links and provide Internet connectivity to mesh clients. WMNs are attracting attention from many researchers as a way to provide cost-efficient broadband wireless connectivity. The main issue in WMNs is to improve network connectivity and stability in terms of user coverage. This problem is very closely related to the family of node placement problems in WMNs [5,8,10]. These papers assume that routers move by themselves or by using network-simulator mobility models. In this paper, we consider a moving robot as a network device. In order to realize a moving access point, we implemented a moving omnidirectional access


point robot (called MOAP robot). It is important that the MOAP robot moves to an accurate position in order to maintain good connectivity. Thus, the MOAP robot can provide good communication and stability for WMNs. In this work, we consider silhouette theory to decide the optimal number of MOAP robots. The rest of this paper is structured as follows. In Sect. 2, we introduce the related work. In Sect. 3, we present our implemented moving omnidirectional access point robot. In Sect. 4, we use silhouette theory to decide the optimal number of MOAP robots. In Sect. 5, we show the simulation results. Finally, conclusions and future work are given in Sect. 6.

2 Related Work

Many different techniques have been developed to solve the problem of moving robot positioning. One important research area is indoor position detection, because an outdoor position can be detected easily by using GPS (Global Positioning System). However, in an indoor environment we cannot use GPS, so it is difficult to find the target position. Asahara et al. [2] proposed a method to improve the accuracy of the self-position estimation of a mobile robot. The robot measures the distance to an object in its environment by using a range sensor. Then, the self-position estimation unit estimates the position of the mobile robot based on the selected map data and the range data obtained by the sensor. Wang et al. [11] proposed a WiFi indoor initial-positioning system based on a triangulation algorithm, built on the ROS (Robot Operating System) platform. The test results show that the WiFi indoor initial-positioning system combined with the AMCL (Adaptive Monte Carlo Localization) algorithm can be accurately positioned and has high commercial value. Nguyen et al. [9] proposed low-speed vehicle localization using WiFi fingerprinting. In general, such approaches rely on GPS in fusion with other sensors to track vehicles in outdoor environments. However, as indoor environments such as car parks are also important scenarios for vehicle navigation, the lack of GPS poses a serious problem. They used an ensemble classification method together with a motion model to deal with this issue. Experiments show that the proposed method is capable of imitating GPS behavior for vehicle tracking. Ban et al. [3] proposed an indoor positioning method integrating pedestrian dead reckoning with magnetic field and WiFi fingerprints. Their method needs WiFi and magnetic-field fingerprints, which are created by measuring in advance the WiFi radio waves and the magnetic field in the target map. The method estimates positions by comparing the pedestrian sensor readings with the fingerprint values using particle filters. Matsuo et al. [6,7] implemented and evaluated a small-size omnidirectional wheelchair.


Fig. 1. Implemented MOAP robot.

3 Implemented Moving Omnidirectional Access Point Robot

In this section, we describe the implemented MOAP (Moving Omnidirectional Access Point) robot, shown in Fig. 1. The MOAP robot can move omnidirectionally while keeping the same orientation, and can provide an access point for network devices. In order to realize our proposed MOAP robot, we used omniwheels, which allow the robot to move in any direction: forward, backward, left and right. The movement of the MOAP robot is shown in Fig. 2. We would like to control the MOAP robot to move accurately in order to offer a good environment for communication.

3.1 Overview of MOAP Robot

Our implemented MOAP robot has 3 omniwheels, 3 brushless motors, 3 motor drivers and a controller. The MOAP robot requires a 24 V battery to move and a 5 V battery for the controller. We show the specification of the MOAP robot in Table 1.

Fig. 2. Movement of our implemented MOAP robot.


Table 1. Specification of MOAP robot.

| Item | Specification |
|---|---|
| Length | 490.0 [mm] |
| Width | 530.0 [mm] |
| Height | 125.0 [mm] |
| Brushless motor | BLHM015K-50 (Orientalmotor corporation) |
| Motor driver | BLH2D15-KD (Orientalmotor corporation) |
| Controller | Raspberry Pi 3 Model B+ |
| Power supply | DC24V Battery |
| PWM driver | Pigpio (the driver can generate PWM signals on 32 lines) |

3.2 Control System

We designed the control system for the operation of the MOAP robot, which is shown in Fig. 3. We use brushless motors as the main motors to move the robot, because they can be controlled by PWM (Pulse Width Modulation). We used a Raspberry Pi as the controller. However, the controller has only 2 hardware PWM generators, while we need 3, so we decided to use a software generator to produce the square wave for the PWM. As the software generator, we use Pigpio, which can generate a better signal than other software generators and can produce PWM signals on 32 lines. Figure 4 shows the square signal generated by Pigpio.
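The paper does not include code, but pigpio exposes a simple Python API for software PWM. The sketch below is a minimal illustration of how the three motor-driver inputs could be driven; the GPIO pin numbers and the carrier frequency are our assumptions, and direction signaling (which is driver-specific) is omitted.

```python
import pigpio

MOTOR_PINS = (17, 27, 22)  # assumed GPIO pins, one per motor driver input

pi = pigpio.pi()           # connect to the local pigpio daemon
for pin in MOTOR_PINS:
    pi.set_PWM_frequency(pin, 8000)  # assumed PWM carrier frequency in Hz

def set_motor_speeds(ratios):
    """Apply speed ratios in [-1, 1]; the duty cycle encodes the magnitude (0-255)."""
    for pin, r in zip(MOTOR_PINS, ratios):
        pi.set_PWM_dutycycle(pin, int(abs(r) * 255))
        # rotation direction would be set via a separate driver input (omitted)

set_motor_speeds((0.00, -0.87, 0.87))  # e.g., move toward 0 degrees (Table 2)
```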

Fig. 3. Control system for MOAP robot.


Fig. 4. Square signal generated by using Pigpio.

3.3 Kinematics

Controlling the MOAP robot requires its rotation angle, movement speed and movement direction. Let us consider the movement of the robot in 2-dimensional space, shown in Fig. 5. In this figure, there are 3 omniwheels placed 120° apart from each other. The omniwheels can rotate in the clockwise and counter-clockwise directions; we define clockwise as the positive rotation, as shown in the figure. We denote the speeds of the omniwheels by M1, M2 and M3, respectively. As shown in Fig. 5, the axes of the MOAP robot are x and y, the translational speed is v = (ẋ, ẏ) and the rotational speed is θ̇. In this case, the moving speed of the MOAP robot can be expressed by Eq. (1).

$$V = (\dot{x}, \dot{y}, \dot{\theta}) \tag{1}$$

Based on Eq. (1), the speed of each omniwheel can be decided. By considering the control value of the motor speed ratio of each omniwheel as linear and synthesizing the vector speeds of the 3 omniwheels, we obtain Eq. (2) by inverse kinematics, where d is the distance between the center of the robot and each omniwheel. Then, from the rotating speed of each omniwheel, forward kinematics gives the MOAP robot's moving speed: calculating the inverse matrix of Eq. (2) yields Eq. (3). Thus, when the MOAP robot moves in any direction (omnidirectional movement), the theoretical speed for each motor is calculated as shown in Table 2.

$$\begin{pmatrix} M_1 \\ M_2 \\ M_3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & d \\ -\frac{1}{2} & -\frac{\sqrt{3}}{2} & d \\ -\frac{1}{2} & \frac{\sqrt{3}}{2} & d \end{pmatrix} \begin{pmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \end{pmatrix} \tag{2}$$

$$\begin{pmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \end{pmatrix} = \begin{pmatrix} \frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\ 0 & -\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \\ \frac{1}{3d} & \frac{1}{3d} & \frac{1}{3d} \end{pmatrix} \begin{pmatrix} M_1 \\ M_2 \\ M_3 \end{pmatrix} \tag{3}$$

Silhouette Theory to Decide Optimal Number of MOAP Robots

Silhouette theory builds on K-means clustering. The K-means objective function is shown in Eq. (4), where C_k denotes the k-th cluster, x_ij is the j-th feature of the i-th data point, and K is the number of clusters. Ideal clustering is achieved when the value of Eq. (4) is minimized.

$$\min_{C_1,\ldots,C_K} \left\{ \sum_{k=1}^{K} \frac{1}{|C_k|} \sum_{i,i' \in C_k} \sum_{j=1}^{p} (x_{ij} - x_{i'j})^2 \right\} \tag{4}$$

We show the K-means clustering in Fig. 6, where the dots represent clients. In Fig. 6(a), we deployed 150 clients randomly on a 2D space (100 m × 100 m). After that, we applied K-means clustering, as shown in Fig. 6(b). We assume that the centroids can communicate with each other in this scenario. In order to decide the optimal number of clusters, we used silhouette theory, which is based on the value of the silhouette coefficient. We show the equation of the silhouette coefficient in Eq. (5), where a(i) is the average distance between the i-th client and the other clients in the same cluster, and b(i) is the average distance

Fig. 5. The movement of MOAP robot.


between the i-th client and the other clients in the nearest cluster. Then, s(i) shows the degree of success or failure of the clustering. The value range of s(i) is −1 to 1. If the s(i) value is near 1, the clients in the same cluster are very close to each other. When the s(i) value is 0, the clients are located on the border between clusters.

Table 2. Motor speed ratio.

| Direction (Degrees) | Motor1 | Motor2 | Motor3 |
|---|---|---|---|
| 0 | 0.00 | −0.87 | 0.87 |
| 10 | 0.17 | −0.94 | 0.77 |
| 20 | 0.34 | −0.98 | 0.64 |
| 30 | 0.50 | −1.00 | 0.50 |
| 40 | 0.64 | −0.98 | 0.34 |
| 50 | 0.77 | −0.94 | 0.17 |
| 60 | 0.87 | −0.87 | 0.00 |
| 70 | 0.94 | −0.77 | −0.17 |
| 80 | 0.98 | −0.64 | −0.34 |
| 90 | 1.00 | −0.50 | −0.50 |
| 100 | 0.98 | −0.34 | −0.64 |
| 110 | 0.94 | −0.17 | −0.77 |
| 120 | 0.87 | 0.00 | −0.87 |
| 130 | 0.77 | 0.17 | −0.94 |
| 140 | 0.64 | 0.34 | −0.98 |
| 150 | 0.50 | 0.50 | −1.00 |
| 160 | 0.34 | 0.64 | −0.98 |
| 170 | 0.17 | 0.77 | −0.94 |
| 180 | 0.00 | 0.87 | −0.87 |
| 190 | −0.17 | 0.94 | −0.77 |
| 200 | −0.34 | 0.98 | −0.64 |
| 210 | −0.50 | 1.00 | −0.50 |
| 220 | −0.64 | 0.98 | −0.34 |
| 230 | −0.77 | 0.94 | −0.17 |
| 240 | −0.87 | 0.87 | 0.00 |
| 250 | −0.94 | 0.77 | 0.17 |
| 260 | −0.98 | 0.64 | 0.34 |
| 270 | −1.00 | 0.50 | 0.50 |
| 280 | −0.98 | 0.34 | 0.64 |
| 290 | −0.94 | 0.17 | 0.77 |
| 300 | −0.87 | 0.00 | 0.87 |
| 310 | −0.77 | −0.17 | 0.94 |
| 320 | −0.64 | −0.34 | 0.98 |
| 330 | −0.50 | −0.50 | 1.00 |
| 340 | −0.34 | −0.64 | 0.98 |
| 350 | −0.17 | −0.77 | 0.94 |
| 360 | 0.00 | −0.87 | 0.87 |


Also, when the value of s(i) is negative, the client does not belong to an appropriate cluster.

$$s(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}} \tag{5}$$

We show the K-means clustering and the silhouette coefficients with 8 clusters in Figs. 7 and 8. In Fig. 7(a), we deployed 800 clients randomly on a 2D space (150 m × 120 m). After that, we grouped the clients into 8 clusters using K-means clustering in order to analyze them with silhouette theory (see Fig. 7(b)). The circles in Fig. 7(b) indicate the communication range, whose radius is 30 m. Figure 8 shows the silhouette coefficients of the 8 clusters, where the vertical broken line indicates the average value. The thickness of each cluster shows the number of clients in the cluster. The ideal case is when the thicknesses are almost the same; in this case, the silhouette coefficient is high.
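The clustering-and-scoring loop can be sketched in a few lines of Python with scikit-learn; the library choice is our assumption (the paper only states the computation), and the synthetic deployment below mimics the conditions of Table 3.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic clients mimicking Table 3: 800 points, 5 centers, std 1.1, seed 11
# (make_blobs does not enforce the paper's 150 m x 120 m range)
clients, _ = make_blobs(n_samples=800, centers=5, cluster_std=1.1, random_state=11)

for k in (2, 3, 4):
    labels = KMeans(n_clusters=k, random_state=11, n_init=10).fit_predict(clients)
    print(k, round(silhouette_score(clients, labels), 3))
```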

5 Simulation Results

We show the simulation results in Fig. 9. We compared the silhouette coefficients for 2, 3 and 4 clusters obtained by K-means clustering on the randomly deployed clients of Fig. 7(a). The simulation conditions are shown in Table 3, and the results in Figs. 9(a), (b) and (c). For 2 clusters (Fig. 9(a)) the silhouette coefficient is 0.529, for 3 clusters (Fig. 9(b)) it is 0.650, and for 4 clusters (Fig. 9(c)) it is 0.614. From these results, we find that the optimal number of MOAP robots is 3.

Fig. 6. K-means clustering.


Fig. 7. Simulation with 8 clusters.

Fig. 8. Silhouette coefficient with 8 clusters.

Table 3. Simulation conditions.

| Item | Description |
|---|---|
| Number of generated clients | 800 |
| Number of centers | 5 |
| Standard deviation | 1.1 |
| Random state | 11 (seed) |
| Range | 150 [m] × 120 [m] |

Fig. 9. Silhouette coefficient.


6 Conclusions and Future Work

In this paper, we introduced our implemented MOAP robot. We presented some previous works and discussed the related problems and issues. Then, we described in detail the kinematics and the control methodology of the MOAP robot. We applied silhouette theory to determine the number of MOAP robots needed to provide a good communication environment in WMNs. The simulation results show that silhouette theory can decide the optimal number of MOAP robots. In future work, we would like to propose other efficient methods for deciding the optimal number of MOAP robots.

References

1. Akyildiz, I.F., Wang, X., Wang, W.: Wireless mesh networks: a survey. Comput. Netw. 47(4), 445–487 (2005)
2. Asahara, Y., Mima, K., Yabushita, H.: Autonomous mobile robot, self position estimation method, environmental map generation method, environmental map generation apparatus, and data structure for environmental map. US Patent 9,239,580 (19 Jan 2016)
3. Ban, R., Kaji, K., Hiroi, K., Kawaguchi, N.: Indoor positioning method integrating pedestrian dead reckoning with magnetic field and WiFi fingerprints. In: 2015 Eighth International Conference on Mobile Computing and Ubiquitous Networking (ICMU), pp. 167–172, January 2015
4. Hamamoto, R., Takano, C., Obata, H., Ishida, K., Murase, T.: An access point selection mechanism based on cooperation of access points and users movement. In: 2015 IFIP/IEEE International Symposium on Integrated Network Management (IM), pp. 926–929, May 2015
5. Maolin, T.: Gateways placement in backbone wireless mesh networks. Int. J. Commun. Netw. Syst. Sci. 2(01), 44–50 (2009)
6. Matsuo, K., Barolli, L.: Design and implementation of an omnidirectional wheelchair: control system and its applications. In: Proceedings of the 9th International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA-2014), pp. 532–535 (2014)
7. Matsuo, K., Liu, Y., Elmazi, D., Barolli, L., Uchida, K.: Implementation and evaluation of a small size omnidirectional wheelchair. In: Proceedings of the IEEE 29th International Conference on Advanced Information Networking and Applications Workshops (WAINA-2015), pp. 49–53 (2015)
8. Muthaiah, S.N., Rosenberg, C.: Single gateway placement in wireless mesh networks. Proc. ISCN 8, 4754–4759 (2008)
9. Nguyen, D., Recalde, M.E.V., Nashashibi, F.: Low speed vehicle localization using WiFi fingerprinting. In: 2016 14th International Conference on Control, Automation, Robotics and Vision (ICARCV), pp. 1–5, November 2016
10. Oda, T., Barolli, A., Spaho, E., Xhafa, F., Barolli, L., Takizawa, M.: Performance evaluation of WMN using WMN-GA system for different mutation operators. In: 2011 14th International Conference on Network-Based Information Systems, pp. 400–406, September 2011
11. Wang, T., Zhao, L., Jia, Y., Wang, J.: WiFi initial position estimate methods for autonomous robots. In: 2018 WRC Symposium on Advanced Robotics and Automation (WRC SARA), pp. 165–171, August 2018

Performance Evaluation of a Recovery Method for Vehicular DTN Considering Different Reset Thresholds

Yoshiki Tada1, Makoto Ikeda2, and Leonard Barolli2

1 Graduate School of Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-higashi, Higashi-ku, Fukuoka 811-0295, Japan [email protected]
2 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-higashi, Higashi-ku, Fukuoka 811-0295, Japan [email protected], [email protected]

Abstract. In this work, we focus on vehicular message delivery methods that are effective in disaster situations. In our previous work, we proposed a message relaying method with an Enhanced Dynamic Timer (EDT) for Vehicular Delay Tolerant Networking (DTN). In this paper, we evaluate the network performance of the proposed EDT considering different reset thresholds for Vehicular DTN. From the simulation results, we found that setting the RT to less than 0.5 reduced storage usage regardless of the number of vehicles.

Keywords: Message relaying method · Vehicular DTN · Enhanced dynamic timer

1 Introduction

The application of Vehicle-to-Infrastructure (V2I) and Vehicle-to-Vehicle (V2V) communications has attracted attention for fostering innovative city-wide services and will become one of the common communication platforms of the future Internet [3,6,9,10,12,14,15]. We believe that robustness is important for communication platforms with high latency and frequent disconnections. Delay/Disruption/Disconnection Tolerant Networking (DTN) is effective as a communication platform for V2I and V2V communications [8,11,21]. With these new technologies, we can expect new safe-driving assistance and logistics services in the future. However, DTN protocols have some problems, such as storage usage and overhead, because the nodes send duplicated bundle messages to their neighbors. In [5], the authors proposed a hybrid DTN routing method which selects between the Epidemic-based protocol with many replications and the SpW-based protocol with few replications. They consider the storage state of the nodes to select the routing protocol.


In our previous works [7,13], we proposed a message relaying method with an Enhanced Dynamic Timer (EDT), considering both grid-road and real-map scenarios in Vehicular DTN. In these works, we could not reduce storage usage when the vehicle density was low, because the method used a fixed threshold. In this paper, we evaluate the network performance of the message relaying method with the proposed EDT considering different Reset Thresholds (RTs) for Vehicular DTN. The structure of the paper is as follows. In Sect. 2, we give the related work. In Sect. 3, the message relaying method considering the EDT is described. In Sect. 4, we provide the description of the evaluation system and the results. Finally, conclusions and future work are given in Sect. 5.

2 Related Work

DTN can provide reliable internetworking for space tasks [4,17,20]. Space networks may have long delays, frequent link disconnections and frequent disruptions. In Vehicular DTN, intermediate vehicles store messages in their storage before sending them to other vehicles. The network architecture is specified in RFC 4838 [2]. Epidemic routing is a well-known routing protocol for DTN [16,19]. It uses two control messages to duplicate messages: nodes periodically broadcast a Summary Vector (SV) message in the network, which contains a list of the messages stored at each node. When nodes receive an SV, they compare it with their own SV, and send a REQUEST message if the received SV contains unknown messages. In Epidemic routing, the consumption of network resources and storage usage become critical problems, because the nodes duplicate messages to all neighbors in their communication range. Moreover, received messages remain in storage and continue to be duplicated even after the destination has received them. However, recovery schemes such as timers or anti-packets may delete the duplicate messages in the network. In the case of a timer, messages have a lifetime and are deleted punctually when the lifetime expires; however, setting a suitable lifetime is difficult. In the case of an anti-packet, the destination broadcasts an anti-packet containing the list of messages it has received; nodes delete the listed messages according to the anti-packet and then propagate it to other nodes, which consumes network resources. In this paper, we evaluate a message relaying method with EDT considering different RTs to improve storage usage.
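To make the SV/REQUEST exchange concrete, here is a toy Python sketch of the set comparison involved; the message IDs and data structures are illustrative only, and a real implementation would piggyback this on periodic beacons.

```python
def sv_exchange(my_messages, received_sv):
    """Epidemic-style summary-vector comparison: return the bundle IDs that
    are listed in the neighbor's SV but missing locally (the REQUEST list)."""
    unknown = set(received_sv) - set(my_messages)
    return sorted(unknown)

print(sv_exchange(my_messages={"m1", "m3"}, received_sv={"m1", "m2", "m4"}))
# -> ['m2', 'm4']
```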

3 Enhanced Dynamic Timer Method

In this section, we explain in detail the proposed message relaying method considering the EDT.

3.1 Overview of EDT

In the conventional method, a fixed lifetime is set at the time of message generation: when the lifetime has expired, a vehicle deletes the bundle message from its storage. The conventional method thus does not consider network conditions such as non-signal time, density and so on. The EDT instead controls the bundle drop by taking such network conditions into account.

3.2 Timer Setting of EDT

In Fig. 1, we show a flowchart of the proposed EDT method. In our approach, each vehicle periodically checks the number of received SVs and measures the non-signal time (NT) from neighboring vehicles. If the current NT is greater than the maximum NT (NTmax), NTmax is updated. The formula of the EDT is:

$$EDT = NT_{max} + Interval, \tag{1}$$

where Interval indicates the check interval, which is used for checking the number of received SVs. The proposed method considers the number of neighboring vehicles measured last time (Nprev) and calculates the rate of change of neighboring vehicles by comparing this value with the RT. The timer is reset under the condition of Eq. (2):

$$\frac{N_{now}}{N_{prev}} \le RT. \tag{2}$$

For example, with RT = 0.1 the timer is reset when the number of vehicles has decreased by 90% or more compared to the previous measurement, while with RT = 0.9 it is reset when the number of vehicles has decreased by 10% or more. Thus, the proposed EDT method improves network performance by changing the timer reset condition. In general, once a timer is set in a message, the lifetime is not reset; in our proposed method, however, an EDT reset is allowed even if the lifetime has already been reset, which means that there is no reset limit. We use the Interval to keep the message in storage until the next SV check.
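The following minimal Python sketch is our illustrative reading of the flowchart in Fig. 1, combining Eq. (1) and Eq. (2) into a single periodic check; the function and variable names are assumptions.

```python
def edt_step(nt_max, nt_now, n_prev, n_now, interval, rt):
    """One periodic check of the EDT method.
    Returns the updated NT_max, the timer value and whether it was reset."""
    nt_max = max(nt_max, nt_now)                    # track the longest non-signal time
    edt = nt_max + interval                         # Eq. (1)
    reset = n_prev > 0 and (n_now / n_prev) <= rt   # Eq. (2)
    return nt_max, edt, reset

# Example: neighbors dropped from 10 to 4 (a 60% decrease) with RT = 0.5
print(edt_step(nt_max=3.0, nt_now=1.2, n_prev=10, n_now=4, interval=2.0, rt=0.5))
```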


Fig. 1. Flowchart of proposed EDT method.

4 Evaluation

In this section, we evaluate the EDT method considering different RTs. We implemented the proposed method on the Scenargie network simulator [18].

4.1 Evaluation Setting

We consider a grid road scenario (see Fig. 2) with two vehicular densities: 100 and 200 vehicles/km². Table 1 shows the simulation parameters used for the network simulator. The start-point is the message generator and the end-point is the destination; both are static. The other vehicles move on the road based on the map-based random waypoint mobility model. The start-point sends bundle messages to the end-point according to the ITU-R P.1411 propagation model [1]. When the vehicles receive bundle messages, they store them in their storage. Each vehicle periodically broadcasts an SV that contains its own bundle list, and the vehicles then duplicate the bundles to other vehicles. We considered the interference from obstacles on the 5.9 GHz radio channel.


Fig. 2. Road model.

Table 1. Simulation parameters.

| Parameter | Value |
|---|---|
| Simulation time (Tmax) | 600 [s] |
| Area dimensions | 1,000 [m] × 1,000 [m] |
| Density of vehicles | 100, 200 [vehicles/km²] |
| Minimum speed | 8.333 [m/s] |
| Maximum speed | 16.666 [m/s] |
| Message start and end time | 1–400 [s] |
| Message generation interval | 10 [s] |
| Message size | 1,000 [bytes] |
| PHY model | IEEE 802.11p |
| Propagation model | ITU-R P.1411 |
| Antenna model | Omni-directional |
| EDT: Activated | 60 to 600 [s] |
| EDT: Reset threshold (RT) | 0.1–0.9 |
| EDT: Check interval (Interval) | 2 [s] |

296

Y. Tada et al.

We evaluate the performance of delay, overhead and storage usage by changing the RT from 0.1 to 0.9. The delay indicates the transmission delay of the bundle to reach the end-point. The overhead indicates the number of times for sending duplicate bundles. Timer function will be activated after the simulation time is 60 s. The storage usage indicates the average of the storage state of each vehicle. 4.2

Evaluation Results

6.34 6.32 6.3 6.28 6.26 6.24 6.22 6.2

100 vehicles Delay [sec]

Delay [sec]

We evaluate the proposed message relaying method with EDT by changing the timer reset condition. For all cases, all messages reached the end-point by 600 s. We present the simulation results of delay for different vehicles in Fig. 3. When the number of vehicles is 100, the RT = 0.3 has the best result for delay time and RT = 0.5 is the second. On the other hand, we found that when the number of vehicles increased to 200, there is almost the same results due to the effect of increasing vehicle density. We present the simulation results of overhead for different vehicles in Fig. 4. The overhead for 100 vehicles increases by increasing the RT. With the exception of RT = 0.5, the overhead is minimized. In the case of 200 vehicles, the overhead results are larger due to the increased number of vehicles. We show the simulation results of the storage usage for different vehicles in Fig. 5. The horizontal axis shows the simulation time and the vertical axis shows the storage usage. In order to compare the results of storage usage, we split the results into two parts. The split conditions are RT = 0.1 to 0.5 and RT = 0.6 to 0.9. When the RT is less than 0.5, the results of storage usage are the same regardless of the number of vehicles. On the other hand, the storage usage increases with the increase of RT when the RT is 0.6 or higher. We observed that the oscillation was greater when the number of vehicles is 100. 2.94 2.92 2.9 2.88 2.86 2.84 2.82 2.8

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Reset Threshold (RT)

200 vehicles

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Reset Threshold (RT)

(a) 100 vehicles/km2

(b) 200 vehicles/km2

Fig. 3. Delay.

36500 36450 36400 36350 36300 36250 36200 36150 36100

100 vehicles Overhead

Overhead

Performance Evaluation of a Recovery Method for Vehicular DTN 70660 70640 70620 70600 70580 70560 70540 70520 70500

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

297

200 vehicles

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Reset Threshold (RT)

Reset Threshold (RT)

(a) 100 avehicles/km 2

(b) 200 vehicles/km 2

3500 3000 2500 2000 1500 1000 500 0

Storage Usage [bytes]

Storage Usage [bytes]

Fig. 4. Overhead. RT=0.1 RT=0.2 RT=0.3 RT=0.4 RT=0.5

0

100

200

300

400

500

3500 3000 2500 2000 1500 1000 500 0

600

RT=0.6 RT=0.7 RT=0.8 RT=0.9

0

100

Times [sec]

100

200 300 400 Times [sec]

Storage Usage [bytes]

Storage Usage [bytes]

400

500

600

500

600

(b) 100 vehicles/km2

RT=0.1 RT=0.2 RT=0.3 RT=0.4 RT=0.5

0

300

Times [sec]

(a) 100 vehicles/km2 3500 3000 2500 2000 1500 1000 500 0

200

500

600

3500 3000 2500 2000 1500 1000 500 0

RT=0.6 RT=0.7 RT=0.8 RT=0.9

0

(c) 200 vehicles/km2

100

200 300 400 Times [sec]

(d) 200 vehicles/km2

Fig. 5. Storage usage.

5

Conclusions

In this paper, we evaluated the proposed message relaying method with EDT considering different RTs for Vehicular-DTN. We evaluated the proposed EDT method considering delay, overhead and storage usage as evaluation metrics. From the simulation results, we found that setting of RT less than 0.5 reduced storage usage regardless the number of vehicles. In future work, we would like to investigate the impact of the activated time and adapted RT parameters.

298

Y. Tada et al.

References 1. Rec. ITU-R P.1411-7: Propagation data and prediction methods for the planning of short-range outdoor radiocommunication systems and radio local area networks in the frequency range 300 MHz to 100 GHz. ITU (2013) 2. Cerf, V., Burleigh, S., Hooke, A., Torgerson, L., Durst, R., Scott, K., Fall, K., Weiss, H.: Delay-tolerant networking architecture. IETF RFC 4838 (Informational), April 2007 3. Cuka, M., Elmazi, D., Ikeda, M., Matsuo, K., Barolli, L.: IoT node selection in opportunistic networks: implementation of fuzzy-based simulation systems and testbed. Internet Things 8, 100105 (2019) 4. Fall, K.: A delay-tolerant network architecture for challenged Internets. In: Proceedings of the International Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, SIGCOMM 2003, pp. 27–34 (2003) 5. Henmi, K., Koyama, A.: Hybrid type DTN routing protocol considering storage capacity. In: Proceedings of the 8th International Conference on Emerging Internet, Data and Web Technologies (EIDWT 2020), pp. 491–502, February 2020 6. Hou, X., Li, Y., Chen, M., Wu, D., Jin, D., Chen, S.: Vehicular fog computing: a viewpoint of vehicles as the infrastructures. IEEE Trans. Veh. Technol. 65(6), 3860–3873 (2016) 7. Ikeda, M., Nakasaki, S., Tada, Y., Barolli, L.: Performance evaluation of a message relaying method with enhanced dynamic timer in vehicular DTN. In: Proceedings of the Workshops of the 34th International Conference on Advanced Information Networking and Applications (WAINA-2020), pp. 332–340, April 2020 8. Kawabata, N., Yamasaki, Y., Ohsaki, H.: Hybrid cellular-DTN for vehicle volume data collection in rural areas. In: Proceedings of the IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC-2019), vol. 2, pp. 276–284, July 2019 9. Ku, I., Lu, Y., Gerla, M., Gomes, R.L., Ongaro, F., Cerqueira, E.: Towards software-defined VANET: architecture and services. In: Proceedings of the 13th Annual Mediterranean Ad Hoc Networking Workshop (MED-HOC-NET-2014), pp. 103–110, June 2014 10. Lin, D., Kang, J., Squicciarini, A., Wu, Y., Gurung, S., Tonguz, O.: MoZo: a moving zone based routing protocol using pure V2V communication in VANETs. IEEE Trans. Mob. Comput. 16(5), 1357–1370 (2017) 11. Mahmoud, A., Noureldin, A., Hassanein, H.S.: VANETs positioning in urban environments: a novel cooperative approach. In: Proceedings of the IEEE 82nd Vehicular Technology Conference (VTC-2015 Fall), pp. 1–7, September 2015 12. Marques, B., Coelho, I.M., Sena, A.D.C., Castro, M.C.: A network coding protocol for wireless sensor fog computing. Int. J. Grid Util. Comput. 10(3), 224–234 (2019) 13. Nakasaki, S., Ikeda, M., Barolli, L.: A message relaying method with enhanced dynamic timer considering decrease rate of neighboring nodes for Vehicular-DTN. In: Proceedings of the 14th International Conference on Broad-Band Wireless Computing, Communication and Applications (BWCCA-2019), pp. 711–720, November 2019 14. Ning, Z., Hu, X., Chen, Z., Zhou, M., Hu, B., Cheng, J., Obaidat, M.S.: A cooperative quality-aware service access system for social internet of vehicles. IEEE Internet Things J. 5(4), 2506–2517 (2018)

Performance Evaluation of a Recovery Method for Vehicular DTN

299

15. Ohn-Bar, E., Trivedi, M.M.: Learning to detect vehicles by clustering appearance patterns. IEEE Trans. Intell. Transp. Syst. 16(5), 2511–2521 (2015) 16. Ramanathan, R., Hansen, R., Basu, P., Hain, R.R., Krishnan, R.: Prioritized epidemic routing for opportunistic networks. In: Proceedings of the 1st International MobiSys Workshop on Mobile Opportunistic Networking (MobiOpp 2007), pp. 62–66 (2007) 17. R¨ usch, S., Sch¨ urmann, D., Kapitza, R., Wolf, L.: Forward secure delay-tolerant networking. In: Proceedings of the 12th Workshop on Challenged Networks (CHANTS-2017), pp. 7–12, October 2017 18. Scenargie: Space-time engineering, LLC. http://www.spacetime-eng.com/ 19. Vahdat, A., Becker, D.: Epidemic routing for partially-connected ad hoc networks. Duke University, Technical report (2000) 20. Wyatt, J., Burleigh, S., Jones, R., Torgerson, L., Wissler, S.: Disruption tolerant networking flight validation experiment on NASA’s EPOXI mission. In: Proceedings of the 1st International Conference on Advances in Satellite and Space Communications (SPACOMM-2009), pp. 187–196, July 2009 21. Zguira, Y., Rivano, H., Meddeb, A.: IoB-DTN: a lightweight DTN protocol for mobile IoT applications to smart bike sharing systems. In: Proceedings of the Wireless Days (WD-2018), pp. 131–136, April 2018

Survey of UAV Autonomous Landing Based on Vision Processing

Liu Yubo1, Bei Haohan1, Li Wenhao2, and Huang Ying2

1 Engineering University of PAP, Xi'an 710086, China
2 Information and Communication of Engineering, University of PAP, Xi'an 710086, China
[email protected], [email protected]

Abstract. With the rapid development of UAV technology, UAVs have begun to play an important role in military and civilian applications and have attracted more and more attention. In the context of automation, increasing attention is being paid to how a UAV achieves autonomous landing. This paper first describes the state of research on vision-based autonomous landing of UAVs in China and abroad, and then introduces the relevant technologies of image processing, target tracking, and position estimation and autonomous control in the order of the landing process. Based on these three process components, the existing shortcomings are pointed out. Finally, combined with the actual situation, research ideas are proposed.

1 Introduction

In recent years, with the continuous development of society, UAVs have been used more frequently and in more fields, including scientific exploration and data collection, commercial services, military reconnaissance and law enforcement, search and rescue, patrols, and entertainment. At present, the landing modes of UAVs can be divided into two types: manual landing and autonomous landing. In military aviation, new UAVs with autonomous landing capabilities have received more and more attention in China and abroad. In civil aviation, the autonomous landing technology of UAVs has also received extensive attention; especially for vehicle-mounted UAVs, autonomous landing is a very important technical advantage. To achieve efficient and repeated use of UAVs, it is necessary to realize autonomous landing technology [1].

Navigation technology has formed many mature and complete systems over its long development. Common navigation technologies include GPS navigation [2, 3], inertial navigation [4], and later DGPS navigation and SNIS/GPS navigation [5], which are the technologies that traditional UAV autonomous landing and positioning rely on. However, due to the high requirements on external signal conditions, as well as the error of the technology itself and some human factors, traditional navigation technology has difficulty meeting the accuracy requirements of UAV autonomous landing.

This paper first analyzes and summarizes the state of research in China and abroad, then introduces the relevant technologies in accordance with the three stages of image processing, target tracking, and position estimation and autonomous control, and summarizes them. Finally, according to the actual state of development, the deficiencies of the three stages are pointed out, and corresponding research ideas are put forward.

2 Related Work

Since the 1990s, many universities and research institutions have conducted systematic research on UAV autonomous landing, and great progress has been made in both theory and practice. Specific research examples are as follows.

Research on vision-based autonomous landing of UAVs started earlier in Europe and the US. Tsui et al. [6] use a vision system to estimate the three-dimensional motion state of the target, and use a BP neural network fitting function to estimate the parameters in the position estimation stage, ensuring the accuracy of the estimated parameters while remaining compatible with the real-time nature of the system. In reference [7], Saripalli and colleagues at the University of Southern California in 2003 use visual technology to land on a designated target, conduct reference target analysis, and use a check against the pre-placed mark to finalize the automatic landing. To enhance the accuracy of position estimation, Olivares-Mendez et al. [8] use the aruco Eyes augmented reality software package together with four fuzzy controllers to achieve autonomous landing of the UAV; they use V-REP and ROS to control the vertical controller, keeping the UAV at a predetermined distance in the tracking experiment and landing on the platform during the landing mission. In reference [9], in order to achieve target tracking, Kim et al. developed a set of algorithms that detect targets based on the colors appearing in the image; to obtain a good position estimate, they connect a nonlinear observation model and a constant-velocity model in the NED coordinate system in series to form a nonlinear estimation model, and an Unscented Kalman filter is used to evaluate the model and estimate the state vector. Farhad et al. proposed an extended Kalman filter (EKF) based on the special Euclidean group SE(3) for geometric control of a quadrotor UAV [10]; it estimates the state of the quadcopter from noisy measurements by linearizing SE(3) in an intrinsic form. In reference [11], on the basis of an integrated visual inspection system, Carvalho Souza et al. propose a set of AR trajectory tracking, landing point, and sensor processing methods for aircraft markers of different sizes in image-processed ground signs, while in terms of control, they propose a method to supervise the training of artificial neural networks. Scholars at the Massachusetts Institute of Technology developed a miniature quadrotor UAV by designing a micro-processing chip [12]: the chip is restructured for image processing, the data held in memory is minimized, and the power consumption of the chip is greatly reduced, while the image processing capability of the processor meets the requirements of most situations.

Although research on UAVs in China started later, many achievements have been made. In reference [13], Huang Jun et al. of Beijing Institute of Technology, in order to ensure the effect of target tracking, adopt overlapping icons of different sizes and hierarchical recognition to prevent loss during the recognition process. In reference [14], Jin Shaogang and others from the National University of Defense Technology use a target tracking algorithm to select the two-dimensional code region, which avoids traversing a large number of pixels during image processing and improves the real-time performance of the cooperative two-dimensional code algorithm; they use the PnP feature and complete missing vertices to improve the robustness of the QR code. Jia Peiyang and others [15] of the Space Research Center of the Chinese Academy of Sciences proposed a tracking and landing algorithm based on AprilTags, which uses a combination of sizes and codes to improve recognition performance. In reference [16], Ma Xiaodong et al. of the China Flight Test Research Institute complete target recognition based on contour geometric features and an SVM classifier: the target image is grid-sampled; the sampling points are tracked with the pyramid L-K optical flow method, improved by a forward-backward bidirectional tracking error and a similarity constraint on local image regions of adjacent frames; a target re-search algorithm is designed; and finally sub-pixel-level feature corner points are extracted, relative position estimation is completed based on perspective projection theory, and binocular mean fusion is performed. In reference [17], Wang Zhaozhe and others of Harbin Institute of Technology analyze the target recognition algorithm to optimize its ROI; on the control side, they design the membership function and fuzzy control rate, use fuzzy rules to modify the PID parameters, and carry out simulation verification. Xu Xiaobing and others [18] from Beijing University of Aeronautics and Astronautics apply convex hull transformation to select feature markers and remove redundant markers by interference elimination; for incomplete markings, they adopt feature prediction. Yuan Suzhe et al. of the 20th Research Institute of CLP Technology [19] use cooperative AprilTag tags for joint positioning, addressing the large positioning errors and frequent missed recognitions at a single marker point under outdoor conditions; they obtain the relative position of the UAV with respect to each edge label and then synthesize the results. Chen Feiyu and others from Shandong University, based on the TLD framework, propose an algorithm for autonomously determining landing targets based on target shape features [20], which improves the autonomy of the landing process; they use a kernel correlation filter to implement the tracker in the TLD framework, which improves the real-time performance, accuracy and robustness of the target tracking algorithm.

3 Analysis of the Implementation Process and Common Methods of Autonomous Landing

The vision-based autonomous landing process of a UAV is mainly divided into image processing, target tracking, and position estimation and autonomous control. Following this process, the common methods are introduced step by step below.

3.1 Image Processing

After the UAV collects images through the camera, it needs to carry out edge detection, grayscale processing, threshold segmentation and thresholding. The following first introduces the common methods of each stage and then analyzes their advantages and disadvantages.


3.1.1 Common Edge Detection Algorithms

At present, there are many mature edge detection methods. Classified by order, common detection methods can be divided into the first-order Roberts, Sobel and Prewitt operators and the second-order Laplacian operator. Among them, the second-order operator is susceptible to noise interference, so it is not considered here.

The Roberts operator, also called the cross-differential algorithm, is a gradient algorithm based on cross differences, which detects edge lines by local difference calculation. This algorithm is commonly used to process images with steep edges and low noise, and its operators are as follows:

$$d_x = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad d_y = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

It can be seen that the Roberts operator enhances edges at ±45° particularly well. Choosing a suitable threshold T is very important for the Roberts operator: if the gradient value of a processed point is not less than T, it is taken as an edge point of the image. Although this algorithm has the advantage of high positioning accuracy, it is easily disturbed by noise.

The Prewitt operator is a differential operator for image edge detection. Its principle is to use the differences produced by pixel gray values in a specific area to achieve edge detection. Its operators are 3 × 3 matrices, as follows:

$$d_x = \begin{pmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{pmatrix}, \qquad d_y = \begin{pmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}$$
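To make the two operators concrete, the following minimal sketch (our illustration, assuming OpenCV and NumPy; the input file name and the thresholds are hypothetical) convolves a grayscale image with the kernels defined above and thresholds the gradient magnitude:

```python
import cv2
import numpy as np

# Roberts cross kernels, as defined above
roberts_dx = np.array([[1, 0], [0, -1]], dtype=np.float32)
roberts_dy = np.array([[0, -1], [1, 0]], dtype=np.float32)

# Prewitt 3x3 kernels, as defined above
prewitt_dx = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=np.float32)
prewitt_dy = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], dtype=np.float32)

def edge_map(gray, kx, ky, T):
    """Filter with a kernel pair, then mark pixels whose gradient
    magnitude is not less than the threshold T as edge points."""
    gx = cv2.filter2D(gray.astype(np.float32), -1, kx)
    gy = cv2.filter2D(gray.astype(np.float32), -1, ky)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    return (mag >= T).astype(np.uint8) * 255

gray = cv2.imread("landing_mark.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
edges_roberts = edge_map(gray, roberts_dx, roberts_dy, T=50)
edges_prewitt = edge_map(gray, prewitt_dx, prewitt_dy, T=100)
```

The Prewitt variant typically tolerates a higher T because its 3 × 3 support averages out noise, in line with the discussion that follows.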

The Prewitt operator can effectively suppress edge noise, and it is suitable for images with more noise and gradual gray-level changes. Its weakness is that the classic Prewitt operator defines as an edge any point whose gray value is not less than the threshold; its edge detection results are therefore more pronounced in both the horizontal and vertical directions than those of the Roberts operator. Consequently, many noise points with gray levels greater than the selected threshold are mistaken for edge points, while some edge points with small gray-scale amplitude are easily overlooked.

After edge detection is completed, the inside of the icon is identified and the gray scale of each equally divided small square is judged; the identified array is then compared with the preset data. If they match, it is the target image; otherwise the search is repeated until image recognition is completed.

3.1.2 Image Grayscale Processing, Threshold Segmentation and Thresholding

(1) Grayscale processing is the process of converting a color image into grayscale; today most color images are in RGB color mode. A color image can be understood as the overlay of three monochrome layers, and grayscale means that all three layers are set to the same value. For example, RGB (60, 60, 60) represents a gray level of 60. According to the processing method, grayscale conversion can be divided into the component method, the maximum value method, the average value method and the weighted average method. The component method uses the brightness of one of the three components as the gray value, selected according to the application's needs. The maximum value method uses the maximum of the three component brightnesses as the gray value. The average method averages the three component brightnesses to obtain the gray value. The weighted average method weights the three components according to their importance and other indicators, following the sensitivity of the human eye to different colors, and thereby obtains a more reasonable image. The calculation formula is as follows:

$$Gray(i, j) = 0.299 \times R(i, j) + 0.587 \times G(i, j) + 0.114 \times B(i, j)$$

The grayscale image reflects the distribution and characteristics of the overall and local chroma and brightness levels of the entire image just as the color image does, but it reduces the amount of computation for subsequent operations, so grayscale processing is generally used as a preprocessing step for image processing.

(2) Thresholding image segmentation is a relatively common method. The principle is to first convert the image into a mode with only two gray levels, then extract the required elements as the foreground and separate them from the background. Thresholding is very significant for extracting physical characteristics such as the geometry and texture of the target area, and it serves as preprocessing for the subsequent image discrimination. According to the threshold selection method, thresholding can be divided into the simple threshold method, the adaptive threshold method and Otsu's thresholding method. The simple method chooses one global threshold and divides the whole image into a binary image that is either black or white. The adaptive threshold method treats the threshold as a local value: a region size is set, and each pixel is classified by comparing it with the average value (or another statistic) of the pixels in that region. The basic principles of the two are the same, and the thresholding segmentation formula is as follows:

$$p(x, y) = \begin{cases} 1, & f(x, y) < T \\ 0, & f(x, y) \geq T \end{cases}, \qquad T \in (0, 255)$$

where f(x, y) is the original image, p(x, y) is the target image extracted after thresholding, and T is a fixed threshold selected between 0 and 255. Unlike the two artificially specified algorithms above, Otsu's thresholding algorithm can find a suitable threshold by itself, avoiding the problem of incorrectly extracting the foreground caused by an artificially specified threshold. Otsu's method is therefore very suitable when the image grayscale histogram has double peaks; conversely, when the histogram has no double peaks, this method is not suitable.
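As a minimal sketch of the preprocessing chain just described (our illustration, assuming OpenCV; the file name, the fixed threshold and the neighborhood size are hypothetical choices):

```python
import cv2

img = cv2.imread("frame.png")                     # hypothetical BGR input
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)      # weighted-average grayscale

# Simple (global) threshold: one T for the whole image
_, simple = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# Adaptive threshold: T computed from each pixel's local neighborhood mean
adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY, 11, 2)

# Otsu's method: the threshold is chosen automatically from the histogram,
# which works best when the grayscale histogram is bimodal
otsu_T, otsu = cv2.threshold(gray, 0, 255,
                             cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```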

3.2 Target Tracking

After image processing, we can extract the feature points contained in the designed ground sign and track them with an algorithm, to prevent the UAV from losing the identified target during flight and thereby causing navigation failure. Most current target tracking techniques are based on the optical flow method. Optical flow is the instantaneous velocity, on the pixels of the observed imaging plane, of a moving object in space. The optical flow method finds the correspondence between the previous frame and the current frame by using the changes of pixels in the time domain of an image sequence and the correlation between adjacent frames, so as to calculate the motion information of objects between adjacent frames. Generally speaking, optical flow is caused by the movement of the object itself, the movement of the camera, or both. The following figure shows the result of mapping object motion in three-dimensional space onto the two-dimensional imaging plane (Fig. 1).

Fig. 1. Schematic diagram of mapping

What we obtain is a two-dimensional vector describing the change in position. When the motion interval is small enough, we can regard it as a two-dimensional vector u = (u, v) describing the instantaneous velocity at that point, that is, the optical flow vector. The two-dimensional vector field of the optical flow field reflects the velocity vector information of the instantaneous motion of each image point through the change trend of the gray level at each point. There are, however, two basic conditions for the optical flow method: (1) the brightness is constant; and (2) the time is continuous or the movement is a "small movement". When using the optical flow method, it is necessary to pay attention to these two conditions.

3.2.1 Target Tracking Based on LK Optical Flow Method

The Lucas-Kanade (LK) [21] optical flow method was proposed in 1981. It added the assumption of "spatial consistency" to the two basic assumptions of the original optical flow method, so there are three conditions for its validity. The assumption that all adjacent pixels have similar motion matches exactly the characteristics of the feature points extracted after image processing. The assumptions of the LK optical flow algorithm supply the number of constraints required to solve for the two variables u and v (each pixel contributes one brightness-constancy constraint $I_x u + I_y v + I_t = 0$, so a neighborhood of pixels over-determines the two unknowns), which enables us to find the velocity vector for each set of conditions. However, the constraints of the LK algorithm are not easily satisfied: when the object moves too fast, the assumption is violated, resulting in a large error. Therefore, we introduce a pyramid sampling space and approximate layer by layer so that the motion conforms to the "small motion" assumption; the LK algorithm can then be used to calculate the optical flow of the target while avoiding large errors.

To obtain a good tracking effect, it is necessary to use the forward-backward tracking error and the local regions of adjacent frames to improve the method and avoid the effects of blur and lighting changes [22]. We track a certain image point in both directions and then compare the two resulting trajectories: if they agree, tracking succeeded; otherwise it failed. This can be quantified by the following formula:

$$FB = \| x_t - \hat{x}_t \|$$

This is the Euclidean distance between the two points: the smaller the value, the more similar the trajectories. In addition, because the time interval is short and local images are similar, we can use the normalized cross-correlation coefficient for matching, defined as follows:

$$NCC = \frac{\sum_{x', y'} T(x', y') \cdot I(x', y')}{\sqrt{\sum_{x', y'} T(x', y')^2 \cdot \sum_{x', y'} I(x', y')^2}}$$

This formula relates the region image captured at the tracking point x_t at time t to the corresponding point x_{t+1} at time t + 1, where T(x, y) and I(x, y) are the gray levels of the points (x, y) in the images T and I, respectively. Finally, we set thresholds for FB and NCC and confirm tracking success through threshold comparison.

3.2.2 Target Tracking Based on TLD Algorithm

The TLD algorithm consists of three modules built around a tracking module that uses an optical-flow tracker: the tracking module, the detection module and the learning module [23]. To obtain stable tracking points, the TLD algorithm introduces a failure detection mechanism based on NCC (Normalized Cross Correlation) [24] similarity calculation and the forward-backward tracking method, and finally takes the median of the displacement and scale changes between stable tracking points as the output of the tracking module (Fig. 2).

Fig. 2. Block diagram of the TLD algorithm

The detection module conducts a global multi-scale scan of each frame, sequentially passes the obtained detection windows through a cascade classifier, and uses the windows that finally pass as the output of the detection module. The tracking module and detection module process each frame independently and in parallel, and their results are fused according to a certain fusion strategy to obtain the final tracking position. The learning module samples the positive and negative samples of the current frame according to the tracking results and adopts the P-N learning strategy [25] to learn and update the target model.
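A hedged sketch of the pyramidal LK tracker with the forward-backward check described in Sect. 3.2.1 (our illustration, assuming OpenCV; the window size, pyramid depth and FB threshold are hypothetical choices, not values from the surveyed papers):

```python
import cv2
import numpy as np

lk_params = dict(winSize=(21, 21), maxLevel=3)  # maxLevel > 0 enables the pyramid

def track_with_fb_check(prev_gray, next_gray, pts, fb_thresh=1.0):
    """Pyramidal LK forward pass, backward pass, and FB-error filtering.
    pts must be float32 with shape (N, 1, 2), e.g. from cv2.goodFeaturesToTrack."""
    p1, st1, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, pts, None, **lk_params)
    p0r, st2, _ = cv2.calcOpticalFlowPyrLK(next_gray, prev_gray, p1, None, **lk_params)
    # FB error: Euclidean distance between original and back-tracked points
    fb = np.linalg.norm(pts - p0r, axis=2).ravel()
    good = (st1.ravel() == 1) & (st2.ravel() == 1) & (fb < fb_thresh)
    return p1[good], pts[good]
```

Points whose trajectories disagree between the forward and backward passes are discarded, which is exactly the failure-detection idea TLD reuses inside its tracking module.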

3.3 Position Estimation and Autonomous Control Technology

3.3.1 Kalman Filter

After obtaining the information of the tracked target, we can predict and analyze the motion trajectory and state of the UAV body accordingly. Kalman filtering is an algorithm that uses the linear system state equation to optimally estimate the state of the system from the system's inputs and outputs. It can estimate the state of a dynamic system from a series of measurements with noise when the measurement variance is known. Because it is convenient for computer programming and for real-time processing of data collected in the field, it is widely used for UAVs. The principle of the Kalman filter can be understood as updating the state value from the predicted value and the observed value, and its process can be divided into two steps: prediction and update. The predicted value is our prediction of the future state quantity based on the previous state quantity, and its covariance can be expressed as:

$$P_t^- = F P_{t-1} F^T + Q$$

Then, we use the Kalman gain to determine the influence of the predicted value and the observed value on the update, with the following formula:

$$K_t = P_t^- H^T \left( H P_t^- H^T + R \right)^{-1}$$

Finally, combined with the Kalman gain, we update the state:

$$\hat{x}_t = \hat{x}_t^- + K_t \left( z_t - H \hat{x}_t^- \right), \qquad P_t = \left( I - K_t H \right) P_t^-$$

Although Kalman filtering has many advantages, the nature of its linear equations makes it prone to errors when dealing with nonlinear problems, which is the problem that needs to be overcome.
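A minimal NumPy sketch of the prediction/update cycle above; F, H, Q, R and the state layout are placeholders that would have to be chosen for the actual UAV motion and measurement models:

```python
import numpy as np

def kf_predict(x, P, F, Q):
    """Prediction step: propagate state and covariance."""
    x_pred = F @ x                      # predicted state
    P_pred = F @ P @ F.T + Q            # P_t^- = F P_{t-1} F^T + Q
    return x_pred, P_pred

def kf_update(x_pred, P_pred, z, H, R):
    """Update step: blend prediction and measurement via the Kalman gain."""
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)            # Kalman gain K_t
    x = x_pred + K @ (z - H @ x_pred)              # state update
    P = (np.eye(P_pred.shape[0]) - K @ H) @ P_pred # covariance update
    return x, P
```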

3.3.2 PID Control Technology

Through position estimation, we obtain the parameters needed to complete the autonomous landing. At this point, classic PID control theory can be used to complete the landing on the platform. Because of its simple algorithm, good robustness and high reliability, PID control is widely used in industrial production, unmanned vehicles and UAVs. A conventional PID control system consists of the PID controller and the controlled object. As a linear controller, the PID controller forms a deviation from the given value r(t) and the actual output value c(t): e(t) = r(t) − c(t). It uses a linear combination of the proportional (P), integral (I) and differential (D) terms of the deviation to form the control variable that drives the controlled object. The control law is as follows:

$$u(t) = K_p \left[ e(t) + \frac{1}{T_i} \int_0^t e(t)\,dt + T_d \frac{de(t)}{dt} \right] = K_p e(t) + K_i \int_0^t e(t)\,dt + K_d \frac{de(t)}{dt}$$

The transfer function is:

$$G(s) = \frac{U(s)}{E(s)} = K_p \left( 1 + \frac{1}{T_i s} + T_d s \right) = K_p + \frac{K_i}{s} + K_d s$$

where K_p is the proportional coefficient, T_i is the integral time constant and T_d is the differential time constant; K_i = K_p / T_i is the integral coefficient and K_d = K_p · T_d is the differential coefficient. The PID controller is divided into three control parts: the proportional term adjusts based on the deviation until the deviation is eliminated; the integral term memorizes the error and is mainly used to eliminate static error and improve the system's steady-state accuracy, with the degree of integral action governed by the integration time constant T_i.
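The control law above can be discretized in a few lines. The sketch below is a generic positional PID in Python, not the controller of any specific system surveyed here; the gains and time step are illustrative:

```python
class PID:
    """Discrete PID controller: u = Kp*e + Ki*integral(e) + Kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measured, dt):
        error = setpoint - measured          # e(t) = r(t) - c(t)
        self.integral += error * dt          # accumulate the integral term
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# e.g. an altitude loop during descent, with hypothetical gains
altitude_pid = PID(kp=1.2, ki=0.05, kd=0.3)
u = altitude_pid.step(setpoint=0.0, measured=2.5, dt=0.02)
```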

4 Challenges and Research Ideas of Visual Navigation in the Field of Autonomous Landing of UAV

Through the analysis of the research status in China and abroad and of the common methods involved in the process, we present the current deficiencies and research ideas in three aspects:

(1) For image processing, given the limited image processing hardware that the UAV platform can carry, traditional image processing technology suffers from a general problem: the overhead is too large and real-time requirements cannot be met. To reduce this problem, on the one hand, we can start with the ground signs and reduce the edge contours needed for positioning, or adopt improved algorithms to increase the processing rate. At the same time, when extracting image features with threshold segmentation, the extracted feature points or edge contours are often disturbed by image noise, which makes the extracted data inaccurate. Finally, we find that there are few methods to deal with rapid changes of light during target image recognition; for this reason, ground marks with auxiliary light sources such as fluorescence or infrared can be considered, to enhance the robustness of the recognition system.

(2) In the process of target tracking, we need to consider the influence of occlusion of the identified target and of lighting changes on the algorithm's adaptation parameters, to prevent losing the tracking target. In view of the fast moving speed of the UAV, image tracking must remain real-time; since most current algorithms rely on iterative calculation, a suitable regression algorithm is necessary. At the same time, GPS navigation and SNIS navigation can be combined to provide continuous navigation signals when the target is lost. Finally, we can also introduce artificial intelligence to transform a large-scale traversal search into intelligent recognition, greatly improving the rate at which targets are located.

(3) For position prediction, although the traditional Kalman filter can estimate the future motion state of the system in the presence of multiple noise sources, the nature of its linear system makes it produce errors when dealing with nonlinear problems. In this regard, on the one hand, we can use better parameter tuning methods to obtain accurate parameters; existing methods include the critical proportion method and the attenuation curve method. With the continuous development of computer technology and the introduction of new control methods, we can also adopt a compound control method that combines PID control with fuzzy control, deep neural network control, etc., to improve the accuracy and robustness of the control method.

5 Conclusion

The ultimate purpose of vision-based autonomous landing of UAVs is to achieve autonomous and accurate landing in order to complete the recovery of the UAV platform. Compared with traditional navigation methods, visual navigation has the advantages of low communication environment requirements and no dependence on external devices during navigation. Therefore, it is very meaningful to study vision-based autonomous landing technology for UAVs. This paper first summarized several traditional navigation methods and pointed out their shortcomings, then introduced the research status in China and abroad, introduced several algorithms according to the three stages of the autonomous landing process together with their advantages and disadvantages, and finally pointed out the challenges and research ideas of autonomous landing of UAVs.


References

1. Kim, S.J., Jeong, Y., Park, S.: A survey of drone use for entertainment and AVR (Augmented and Virtual Reality). In: Augmented Reality and Virtual Reality, pp. 339–352. Springer, Heidelberg (2018)
2. Heredia, G., Caballero, F., Maza, I., et al.: Multi-unmanned aerial vehicle (UAV) cooperative fault detection employing differential global positioning (DGPS), inertial and vision sensors. Sensors 9, 7566–7579 (2009)
3. Pestana, J., Mellado-Bataller, I., Sanchez-Lopez, J.L., et al.: A general purpose configurable controller for indoors and outdoors GPS-denied navigation for multirotor unmanned aerial vehicles. J. Intell. Robot. Syst. 73, 387–400 (2014)
4. Barczyk, M., Lynch, A.F.: Invariant observer design for a helicopter UAV aided inertial navigation system. IEEE Trans. Contr. Syst. Technol. 21, 791–806 (2013)
5. Kai, J.: Vision based autonomous landing technology for unmanned aerial vehicles, pp. 1–3. The PLA Information Engineering University (2016)
6. Jun, H.: Research on autonomous landing control technology of vehicular UAV. Beijing Institute of Technology (2016)
7. Tsui, P., Basir, O.A.: A neural network based vision system for 3D motion estimations. In: Proceedings of the 1999 IEEE International Symposium on Intelligent Control/Intelligent Systems and Semiotics (1999)
8. Saripalli, S., Montgomery, J.F., Sukhatme, G.S.: Visually guided landing of an unmanned aerial vehicle. IEEE Trans. Robot. Autom. 19, 371–380 (2003)
9. Olivares-Mendez, M.A., Kannan, S., Voos, H.: Vision based fuzzy control autonomous landing with UAVs: from V-REP to real experiments. In: 2015 23rd Mediterranean Conference on Control and Automation (MED), pp. 14–21. IEEE (2015)
10. Kim, J., Jung, Y., Lee, D., Shim, D.H.: Landing control on a mobile platform for multicopters using an omnidirectional image sensor. J. Intell. Rob. Syst. 84(1–4), 529–541 (2016)
11. Goodarzi, F.A., Lee, T.: Global formulation of an extended Kalman filter on SE(3) for geometric control of a quadrotor UAV. J. Intell. Robot. Syst. 88(2–4), 395–413 (2017)
12. Suleiman, A., Karaman, S., et al.: Navion: a 2mW fully integrated real-time visual-inertial odometry accelerator for autonomous navigation of nano drones. IEEE J. Solid State Circ. (JSSC) 2019, 1–14 (2019)
13. De Souza, J.P.C., Marcato, A.L.M., de Aguiar, E.P., Jucá, M.A., Teixeira, A.M.: Autonomous landing of UAV based on artificial neural network supervised by fuzzy logic. J. Control Autom. Electr. Syst. (2019). https://doi.org/10.1007/s40313-019-00465-y
14. Gang, J.S.: Research on autonomous tracking and landing technology of vehicle mounted UAV based on nested two-dimensional code. National University of Defense Technology (2016)
15. Yang, J.P., Dong, P.X., Gen, Z.W.: Research on autonomous landing of four rotor UAV. Comput. Sci. 44(S2), 520–523 (2017)
16. Dong, M.X., Hao, L., Jie, Z., Hua, G.S.: Autonomous landing technology of fixed wing UAV based on binocular vision. J. Ordnance Equip. Eng. 40(11), 193–198 (2019)
17. Zhe, W.Z.: Research on autonomous landing of four rotor UAV mobile platform based on vision localization. Harbin Institute of Technology (2019)
18. Xu, X.B., Wang, Z., Deng, Y.M.: A software platform for vision-based UAV autonomous landing guidance based on markers estimation. Sci. China Technol. Sci. 62(10), 1825–1836 (2019)
19. Zhe, Y.S., Yu, G.J., Xin, J., Yang, L.: Research on autonomous visual landing technology based on multi label joint positioning. Mod. Navig. 11(02), 109–113 (2020)


20. Yu, C.F., Bin, Y.W., Lu, R.Y., Hao, X.J., Jing, M.X.: Engineering and application of UAV self precision landing. Comput. Improved TLD Algorithm 56(07), 247–254 (2020)
21. Tomasi, S.J.: Good features to track. In: 1994 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Proceedings CVPR 1994, pp. 593–600. IEEE (2002)
22. Jie, Z.J., Li, S., Lin, H.B., et al.: Moving target detection based on pyramid LK algorithm. Industr. Control Comput. 9, 13–15 (2015)
23. Sun, C., Zhu, S., Liu, J.: Fusing Kalman filter with TLD algorithm for target tracking. In: Proceedings of Control Conference on Technical Committee on Control Theory (2015)
24. Barnich, O., Van Droogenbroeck, M.: ViBe: a powerful random technique to estimate the background in video sequences. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 945–948 (2009)
25. Barnich, O., Droogenbroeck, M.V.: ViBe: a universal background subtraction algorithm for video sequences. IEEE Trans. Image Process. 20(6), 1709–1724 (2011)

Improved Sentiment Urgency Emotion Detection for Business Intelligence

Tariq Soussan and Marcello Trovati

School of Computing, Edge Hill University, Ormskirk, UK
[email protected], [email protected]

Abstract. The impact of social media on people's lives has grown significantly over the last decade. Individuals use it to promote discussions and to acquire data. Industries use social media to market their goods and facilities, to advise and inform clients about future offers, and to follow up on their direct market. It also offers vital information concerning the general emotions and sentiments directly connected to welfare and security. In this work, an improved model called Improved Sentiment Urgency Emotion Detection (ISUED) has been created, based on previous work on opinion and social media mining, implemented with the Multinomial Naive Bayes algorithm and built on three classifiers: sentiment analysis, urgency detection, and emotion classification. The model is trained to improve its accuracy and F1 score so that the precision and the percentage of correctly predicted texts are raised. It is applied to the same data set as previous work, acquired from the general business Twitter account of one of the largest supermarket chains in the United Kingdom, to see which sentiments and emotions can be detected and how urgent they are.

1 Introduction

Social networking has opened up innovative capacities for businesses to connect with their customers and employees by giving them the option to send messages out quickly and to obtain real-time responses [11]. Social media has opened many new prospects for the B2B sector due to features that can enhance communication, interaction, education and partnership, which can convey substantial profits to institutes [8]. While social media was initially released for online networking and the flow of content across the Web, it then became a supreme arena for societies to provide feedback on goods and services [9]. Social media offers prospects for institutions to participate, to grow relationships with their clients, and to promote an environment that pushes sales and awareness. From the institute's perspective, social media offers the means whereby businesses can directly interact with their customers [15]. Clients' own feedback on goods, amenities, and brands is highly valued among online audiences [2]. Also, clients have been eager and open to express their dissatisfaction with goods and services through social networks [17].

In this work, an improved learning model called Improved Sentiment Urgency Emotion Detection (ISUED) will be deployed as a combination of sentiment analysis, urgency detection, and emotion classification, in order to enhance the model from previous work [13]. As before, this model will be trained on a training dataset of text samples from multiple Twitter accounts and then run on the same general business Twitter account. The purpose is to classify tweets by sentiment, urgency and emotion. Thus, the model will discover whether tweets carry a sentiment, recognize whether there are emotions in them, and detect whether they are urgent. This will assist in determining the gratification or dissatisfaction of online communities around the brand being examined. The knowledge mined when turning data into decisions is crucial, since it will aid the stakeholders handling the institute to evaluate the topics and issues that were most emphasized.

In the coming sections, Sect. 2 gives a literature review about social media, sentiment analysis, urgency detection, and emotion classification. Section 3 discusses the model used along with its training. Section 4 discusses the results and their analysis. Lastly, Sect. 5 draws conclusions from the results.

2 Literature Review

One of the most prevalent and most currently used social media platforms is Twitter, which was developed by Obvious Corporation in 2006 [4]. Users can communicate with each other, ask questions and seek guidance, as well as take part in open-ended debates [4]. By combining individual publishing and communication, Twitter was able to create a new kind of instantaneous publishing [4].

Twitter's effect on businesses has been widely investigated. Some work focused on word-of-mouth discussion on Twitter, and the results showed that about one fifth of all tweets hold the name of a brand, good, or facility [6]. In addition, one fifth of these word-of-mouth tweets expressed some sentiment. The study also showed that positive tweets make up more than half of the branded tweets, while negative tweets are only one third of them [6]. It concluded that the linguistic assembly of tweets is related to the linguistic patterns of natural language expressions, and it showed that Twitter is a rich word-of-mouth site for institutes to explore as part of their global branding policy [6].

Twitter is a good platform from which to extract data for text categorization. Automatic text classification is hard for a computer due to the huge number of words, and the categorization of tweets is even more difficult due to the restriction on the number of words in a tweet [7]. Classifying emotions in text is very hard for two reasons: first, feelings can be implied and activated by precise actions or circumstances; second, gathering different emotions based on keywords can be very difficult [7, 16]. Classifying urgency in text is not easy either. Previous work presents a social network-based solution that can observe multiple social networks to detect keywords, urgency ratings, the request owner's identity, date, and time [3]; it also governs which posts or chats are crucial and prioritizes or ranks pending urgent concerns [3].


3 Methodology

Throughout this work, a custom model is developed using the Monkey Learn platform [12] based on several classifiers. The three classifiers joined into a single model are sentiment analysis, urgency detection, and emotion classification. The model is based on the Multinomial Naive Bayes algorithm. This algorithm has been broadly utilized in text classification and uses a parameter learning method called Frequency Estimate (FE), which approximates word probabilities by computing suitable frequencies from data [14]. It assumes that a document is a bag of words and takes word frequency information into consideration [1]. The main benefit of FE is that it is easy to use, often delivers reasonable forecast performance, and is efficient [14]. Thus, Multinomial Naive Bayes is considered a supervised learning method using a probabilistic technique [5]. The n-gram range sets the type of features used to characterize texts [12]; the n-grams used for this model are unigrams, i.e. single words (n-gram size = 1), and bigrams, i.e. terms compounded of two words (n-gram size = 2) [12].

The next step after creating the model is to create the categories associated with the three classifiers. Some categories were gathered from existing classifiers and others were created, and they have all been put together into one custom model as shown in Table 1.

Table 1. Categories for the ISUED model

Anger, Features, Complaint, Positive, Negative, Pricing, Feedback, Request, Happiness, Sadness, Hate, Surprise, Love, Worry

This model will be trained on many test tweets extracted from many Twitter accounts; thus, the model will learn to link the input to the matching output (category) based on the training of the test tweet data [12], which will help improve accuracy and F1 score. The model will then be run on the same dataset as previous work [13] for verification, to check the results of the learning mechanisms along with the categories' confidence numbers. The feedback categories can come from customers, in the form of customer service reviews, product reviews, or proposals to enhance their service. At the same time, the model can categorize the institute's reply feedback, issued either by the customer service or by other individuals.

The same experimental environment as in previous work [13] is used: a Windows 10 Enterprise laptop with an Intel® Core™ i5-8250U CPU @ 1.60 GHz 1.80 GHz processor, 8.00 GB of installed memory (RAM), and a 64-bit operating system on an x64-based processor. The data set comprises 2795 recent tweets in which the store chain was mentioned, up until 17th of February 2020.
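The ISUED model itself is built on the Monkey Learn platform, but an equivalent Multinomial Naive Bayes pipeline with unigram and bigram features can be sketched with scikit-learn; the tweets and labels below are purely illustrative, not samples from the actual training set:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Unigrams and bigrams, matching the model's n-gram configuration
model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),
    MultinomialNB(),
)

# Hypothetical training data; the real model uses 735 labelled tweets
texts = ["Great service, thank you!",
         "My order never arrived, please help urgently"]
labels = ["Feedback: Positive", "Complaint"]
model.fit(texts, labels)

print(model.predict(["the checkout queue was far too long"]))
```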

3.1 Training the Improved Sentiment Urgency Emotion Detection Model

A training data set of 735 tweets from different business Twitter accounts is fed to the model. Each tweet must be labelled with the suitable category or categories, as this is crucial for training the model [12]. By classifying and retagging the test tweet data, the machine learning algorithm gains knowledge such that, for each input containing precise keywords, an explicit output category or categories is expected [12]. The metrics used to enhance the model are accuracy and F1 score (precision and recall). Accuracy is defined as the percentage of test tweets that were matched with the right category; it is the quotient of the correctly classified tweets and the overall number of tweets in the test data set. The F1 score combines both precision and recall [10, 12]: recall is the proportion of positive sentiments which are correctly acknowledged, while precision is the ratio between the correct sentiments predicted and the total number of matches predicted [10]. The result of training the model is seen in Fig. 1: the accuracy was raised to 73% and the F1 score to 77%. Training the model also produced a keyword list, shown in Fig. 2, which provides an outline of how the training data is being evaluated by displaying the most correlated keywords [12].
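For reference, these metrics follow the standard definitions (a generic formulation, not specific to the platform used here):

$$\text{accuracy} = \frac{\text{correctly classified tweets}}{\text{total tweets in the test set}}, \qquad F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$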

Fig. 1. Accuracy and F1 Score metrics for training the ISUED Model

Fig. 2. Overall Keyword List from training the ISUED Model


4 Statement of Results and Analysis

After the ISUED model's training was finished, it was run on the same data set from previous work for verification [13]. 2795 tweets were imported into the model as a batch. Processing the dataset shows the category or categories for each tweet and their respective confidence value(s), such that 219 different single or combined categories were produced. Figure 3 shows the top 20 categories containing the largest groups of tweets, and Fig. 4 displays the percentages of the top 10 categories out of the overall total. Most of the tweets were classified into the "Feedback" category. Customer service for this institute can benefit from this model, since it provides knowledge about the classifications of the tweets, the urgent complaints that might need assistance, the requests coming from customers, and the feedback. The model also provides the sentiment of the customer service's replies to any complaints.

Some difficulties were noted. In Fig. 3 and Fig. 4, 17 tweets, which constitute 0.61% of the entire dataset, could not be assigned to any of the categories of the model, which might be due to the model not being able to match the keywords in the tweets to any of the groups; still, this is fewer than the number of tweets that could not be categorized in previous work [13]. Other challenges were tweets that may not have been properly grouped due to false positives or false negatives. Another problem was that 23 tweets, around 1% of the dataset, showed irrelevant confidence numbers; however, this is still fewer than in previous work [13].

For every tweet, the ISUED model computes the confidence value of each single category that matches it, and the average of these confidence values for each tweet is then calculated. Based on this, the average confidence value of each of the top 10 categories from Fig. 3 can be obtained, defined as the average of the average confidences of all the tweets belonging to the same category. Comparing the average confidence of the top 10 categories in Table 2 with that of the top 10 categories from previous work [13], we notice that two of the top 3 categories had their average confidence improved.

Table 2. Average confidence of the top 10 categories

Category                          Average confidence
Feedback                          0.760
Feedback: Positive                0.714
Feedback: Complaint               0.756
Features                          0.584
Features: Not Urgent: Love        0.526
Positive: Feedback                0.700
Positive: Feedback: Request       0.720
Feedback: Complaint: Sadness      0.750
Feedback: Positive: Request       0.778
Complaint: Feedback               0.688


Fig. 3. Top 20 categories containing the largest group of tweets

Fig. 4. Percentages of the top 10 categories out of the overall total.

5 Conclusion

In this work, an improved learning model for sentiment, urgency, and emotion detection has been built for social media mining and opinion mining. The model was trained on text samples from many Twitter accounts to improve its accuracy and F1 score. It was then run on the same general business Twitter account of a supermarket chain as in previous work, to measure the confidence of its categorization. Tweets were successfully classified into one or multiple categories. Future work can involve training the model with further metric tuning to improve accuracy and F1 score, and can also take time as a metric to monitor how sentiments vary over time.


References

1. Abbas, M., Memon, K.A., Jamali, A.A., Memon, S., Ahmed, A.: Multinomial Naive Bayes classification model for sentiment analysis. IJCSNS 19(3), 62 (2019)
2. Burton, J., Khammash, M.: Why do people read reviews posted on consumer-opinion portals? J. Mark. Manag. 26(3–4), 230–255 (2010)
3. Chavez, D.L., Mohler, D.S., Shockley, B.A.: U.S. Patent No. 8,515,049. U.S. Patent and Trademark Office, Washington, DC (2013)
4. Grosseck, G., Holotescu, C.: Can we use Twitter for educational activities. In: 4th International Scientific Conference, eLearning and Software for Education, Bucharest, Romania, April 2008
5. Isabelle, G., Maharani, W., Asror, I.: Analysis on opinion mining using combining lexicon-based method and multinomial Naïve Bayes. In: 2018 International Conference on Industrial Enterprise and System Engineering, ICoIESE 2018. Atlantis Press, March 2019
6. Jansen, B.J., Zhang, M., Sobel, K., Chowdury, A.: Twitter power: tweets as electronic word of mouth. J. Am. Soc. Inform. Sci. Technol. 60(11), 2169–2188 (2009)
7. Janssens, O., Slembrouck, M., Verstockt, S., Van Hoecke, S., Van de Walle, R.: Real-time emotion classification of tweets. In: 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2013, pp. 1430–1431. IEEE, August 2013
8. Jussila, J.J., Kärkkäinen, H., Aramo-Immonen, H.: Social media utilization in business-to-business relationships of technology industry firms. Comput. Hum. Behav. 30, 606–613 (2014)
9. Kho, N.D.: Customer experience and sentiment analysis. KM World 19(2), 10–20 (2010)
10. Kim, Y., Jeong, S.R., Ghani, I.: Text opinion mining to analyze news for stock market prediction. Int. J. Adv. Soft Comput. Appl. 6(1), 2074–8523 (2014)
11. Lovejoy, K., Waters, R.D., Saxton, G.D.: Engaging stakeholders through Twitter: how nonprofit organizations are getting more out of 140 characters or less. Public Relat. Rev. 38(2), 313–318 (2012)
12. Monkey Learn (2013). http://www.monkeylearn.com
13. Soussan, T., Trovati, M.: Sentiment urgency emotion detection for business intelligence. In: Research Perspectives in Data Science and Smart Technology for Shipping Industries (2020)
14. Su, J., Shirab, J.S., Matwin, S.: Large scale text classification using semi-supervised multinomial Naive Bayes. In: Proceedings of the 28th International Conference on Machine Learning, ICML 2011, pp. 97–104 (2011)
15. Taneja, S., Toombs, L.: Putting a face on small businesses: visibility, viability, and sustainability the impact of social media on small business marketing. Acad. Mark. Stud. J. 18(1), 249 (2014)
16. Wang, W., Chen, L., Thirunarayan, K., Sheth, A.P.: Harnessing twitter "big data" for automatic emotion identification. In: 2012 International Conference on Privacy, Security, Risk and Trust and 2012 International Conference on Social Computing, pp. 587–592. IEEE, September 2012
17. Wei, C.P., Chen, Y.M., Yang, C.S., Yang, C.C.: Understanding what concerns consumers: a semantic approach to product feature extraction from consumer reviews. IseB 8(2), 149–167 (2010)

Knowledge-Based Networks for Artificial Intuition

Olayinka Johnny and Marcello Trovati

Department of Computer Science, Edge Hill University, Ormskirk, UK
{24023868,trovatim}@edgehill.ac.uk

Abstract. The ability to carry out automated decision making from large data sets with potentially large numbers of parameters presents a major challenge for decision-making systems. Moreover, it requires deeper analysis of the corresponding scenarios in order to perform knowledge discovery, which is usually associated with high computational complexity. In this article, we discuss a knowledge-based network approach to facilitate the identification and extraction of information for a novel Artificial Intuition approach which is currently being developed by the authors.

1 Introduction

Artificial intuition plays an important role in the process of intelligence extraction and decision-making. Some of the earlier studies took a philosophical and cognitive approach [1]. However, the effects of artificial intuition on decision-making have not been addressed. To the best of our knowledge, only the few studies discussed in [2] attempt to study artificial intuition from the computational point of view. In [3], the authors considered a computational model of artificial intuition; however, they do not present the algorithms in detail or show the use of intuitive entities in the respective processes that they modelled. In [2], the authors reviewed the research on artificial intuition and recognise that knowledge and past experience are very important for intuition to be accurate. Furthermore, the authors argue that the early studies of artificial intuition focus largely on the concept itself, rather than on the representation and use of entities in the process.

The aim of this article is to provide an initial discussion and implementation which models an Artificial Intuition approach based on semantic networks to improve a decision system. Specifically, this research hypothesises that a computational model that correctly implements the requirements defined above can potentially obtain accurate and optimal results and improve the overall performance of human decision-making systems.

The article is structured as follows: in Sect. 2, the main concepts of network theory and semantic networks are discussed in the context of the proposed model. In Sect. 3, the essential components that are crucial to the development of the model are discussed. In Sect. 4, the knowledge-based network model is introduced. Finally, Sect. 5 provides a conclusion and discusses future research directions.

2 Background: Network Theory and Semantic Networks

This section introduces the theory and models of networks as the approaches used in the definition and implementation of the model in this paper. It also discusses semantic networks, since these constitute the representation model we have chosen for our implementation.

2.1 Network Theory

A network is a set of objects, called nodes or vertices, that are connected together. Many systems in nature can be described by models of such complex networks, which are structures consisting of connected nodes; the connections between the nodes are called edges or links. More specifically, any pattern of interactions in a given system can be represented as a network, with the individual parts of the system denoted by nodes and their interactions by edges [4]. Numerous examples of networks exist: the Internet is a network of routers or domains, the World Wide Web (WWW) is a network of Web pages connected by hyperlinks, and the human brain is a network of neurons.

Network theory is the study of graphs as a representation of symmetric or asymmetric relations between discrete objects. In computer science and network science, network theory is a part of graph theory, where networks are defined as graphs in which nodes and/or edges have attributes. Formally, networks consist of a collection of nodes, the node set $V = \{v_i\}_{i=1}^{n}$, which are connected as specified by the edge set $E = \{e(v_i, v_j)\}_{v_i \neq v_j \in V}$ [4], excluding self-loops, that is, single edges starting and ending at the same node. We say that there is a path $P(v_i, v_j)$ between the nodes $v_i$ and $v_j$ if there is a sequence of edges connecting a sequence of distinct nodes that starts at $v_i$ and ends at $v_j$.

The topology of different networks has been extensively investigated to identify crucial information on the corresponding system, which can provide a set of predictive tools to investigate its properties [5]. In particular, stochastic topological features can successfully model systems based on unknown parameters. Network features have been categorised into small-world, random and scale-free networks [4]; the small-world and scale-free features are common to many real-world complex networks.
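As a small illustration of these definitions, the sketch below builds a network from a node set and an edge set and checks for a path P(v_i, v_j) by breadth-first search; the node names are arbitrary:

```python
from collections import deque

nodes = {"a", "b", "c", "d"}
edges = {("a", "b"), ("b", "c"), ("c", "d")}   # no self-loops allowed

# build an adjacency map from the edge set (undirected here)
adj = {v: set() for v in nodes}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

def has_path(start, goal):
    """Breadth-first search for a path P(start, goal) over distinct nodes."""
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            return True
        for w in adj[u] - seen:
            seen.add(w)
            queue.append(w)
    return False

print(has_path("a", "d"))  # True
```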

2.2 Semantic Network

Since our approach utilises semantic networks as its knowledge representation, this section presents a discussion of semantic networks, provides some insights, and shows the main properties that characterise them. Moreover, we make the case for choosing semantic networks as a suitable method to represent knowledge concerning the real world in our study.


Semantic networks are specific types of graph data structure. They are graphical knowledge representations of concepts and their mutual connections within the context of the domain described by the concepts [2]. In a semantic network, knowledge is expressed in the form of directed binary relations, represented by edges, and concepts, represented by nodes. Fundamentally, the process of building a semantic network requires that documents are all restricted to a domain that can be characterised by well-defined, inter-related concepts. These concepts then form the basis for the scientific terminology of the domain [2]. Each concept in the network is represented by a node, and the hierarchical relationship between concepts is depicted by connecting the appropriate concept nodes via is-a or instance-of links. Nodes at the lowest level in the is-a hierarchy denote tokens, while nodes at higher levels denote classes or categories of types. Properties in the network are also represented by nodes, and the fact that a property applies to a concept is represented by connecting the concept and property nodes via an appropriately labelled link. Typically, a property is attached to the highest concept in the conceptual hierarchy to which it applies, and if a property is attached to a node, it is assumed to apply to all of that node's descendants. Succinctly put, subtypes inherit properties from supertypes. By suitable definition of a set of binary relations on a set of nodes, the network corresponds to a predicate logic with binary relations. An important issue with semantic networks is that they do not distinguish between links that constitute relations and links that are structural in nature. For example, a link may have two meanings in a network, thereby causing ambiguity. Moreover, a semantic network can have nodes pointing to themselves, a situation commonly known as self-loops. Therefore, the meanings of links in the network are constrained by the user or experts. In both cases, additional knowledge and work are required to understand and make a meaningful distinction between the links. However, we can argue that this limitation can be seen as a strength, as it allows experts to have different perspectives on the same dataset modelled by a real-world semantic network, and to derive different meanings from the resulting networks. In fact, the ability to reason over the concepts in the network provides useful modelling frameworks and an entry point into how the mind works to make intuitive decisions. In particular, our implementation will not allow cycles or self-loops, that is, single edges starting and ending at the same node. The above discussion demonstrates that semantic networks have a high representational and expressive power to represent knowledge as networks of concepts, allowing a problem space to be explored by using efficient graph algorithms. In fact, semantic networks have been key in modelling diverse phenomena, from reasoning and creativity to human cognition and decision-making. In this study, we are particularly interested in using an efficient model of intuition that incorporates scenarios captured by semantic networks to improve a decision system.
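To make the inheritance mechanism concrete, the following is a minimal sketch of an is-a hierarchy with property inheritance; the concepts and the property are invented for illustration:

```python
import networkx as nx

# Toy semantic network: is-a edges point from subtype to supertype.
sn = nx.DiGraph()
sn.add_edge("palm", "tree", relation="is_a")
sn.add_edge("tree", "plant", relation="is_a")
# Attach a property at the highest concept to which it applies.
sn.nodes["plant"]["properties"] = {"performs_photosynthesis"}

def inherited_properties(g, concept):
    """Properties of a concept plus everything inherited from its supertypes.
    Following is-a edges upward means taking graph descendants of the node."""
    props = set(g.nodes[concept].get("properties", set()))
    for supertype in nx.descendants(g, concept):
        props |= g.nodes[supertype].get("properties", set())
    return props

print(inherited_properties(sn, "palm"))  # {'performs_photosynthesis'}
```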

2.3 Measurements of Semantic Similarities and Relatedness

Another important aspect to consider in our approach is the measurement of similarity and relatedness between concepts: more specifically, how the concepts compare with respect to their semantic attributes, properties or features. There are fundamental differences between similarity and relatedness [7], and some concepts in a network can be related without being similar. For example, palm and tree are similar concepts and can be substituted for each other in a context; this is not necessarily the case for concepts that are merely semantically related [7].

3 Essential Components of Artificial Intuition

We define artificial intuition as the ability of a system to assess a problem context and use pattern recognition or properties from a dataset to choose a course of action or aid the decision process in an automatic manner. More specifically, by using intuitive models, a system is able to take subsets from networks and pass them through a process to determine relationships that can be used to predict future decisions without a deep understanding of a scenario and its corresponding parameters. The proposed approach is based on the following essential components:

1. Semantic network representation of existing and common-sense knowledge. Some initial and existing knowledge regarding the relationships between the corresponding concepts within a specific setting is present and captured in a semantic network. These relationships are the significant patterns, properties and variables which have been acquired over time from experience. This is the central data structure that has been created and held in the mental model about the subject of interest. In other words, this is the associative-semantic network that forms its long-term memory, featuring the conceptual nodes and the associated links. Human memory can be represented as a semantic network of associations, where a node represents a semantic concept and these concepts are connected by directed links. The strength of these influence links varies; they allow for association and serve as cues to recognise and act rapidly in intuitive decision-making scenarios [2].
2. Semantic network associated with the contextualised knowledge.
3. Analysis of the network dynamics. The dynamical properties captured by the networks are investigated to provide an assessment of the reliability of the initial knowledge on a specific scenario. Additional information is then obtained iteratively via appropriate network analysis of subsets of the networks.
4. Assessment of the concepts' relatedness in their respective categories, and similarity measurements.
5. Assessment of the intuitive decision-making process. If the overall knowledge is consistent with the initial assessment, then we assume it is accurate and no further analysis is suggested.

4 Knowledge-Based Networks for Artificial Intuition: ConceptNet and Wikipedia

In this section, the knowledge bases will be introduced and discussed. More specifically, these are ConceptNet and Wikipedia, which are described in Sects. 4.1 and 4.2. The main motivation for a knowledge-based approach is the simple observation that intuition is informed by general knowledge, as well as by more contextualised and 'intuitive' knowledge. The creation of a large network defined by these types of knowledge, namely Wikipedia and ConceptNet respectively, is essential in designing an Artificial Intuition framework. As described in Sect. 2, we define an undirected network G = G(V, E), where V = {v_i}_{i=1}^n is the node set and E = {e_{w_{i,j}}(v_i, v_j)}_{v_i ≠ v_j ∈ V} is the edge set. Typically, each edge e_{w_{i,j}}(v_i, v_j) is associated with a weight w_{i,j}, which is related to the relationship between v_i and v_j. In this article, we shall consider the network generated by the union of three (usually overlapping) networks

G = G_k ∪ G_i ∪ G_c    (1)

where

• G_k is the (semantic) network associated with the existing knowledge within a specific setting,
• G_i is the (semantic) network associated with the intuitive knowledge, and
• G_c is the (semantic) network associated with the contextualised knowledge.

Using this formulation, we assume that ConceptNet is associated with G_i ∪ G_c, and Wikipedia with G_k.
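A sketch of Eq. (1) in NetworkX terms, assuming the three component networks have already been built (the toy edges below are purely illustrative):

```python
import networkx as nx

# Hypothetical fragments: G_k from Wikipedia, G_i and G_c from ConceptNet.
G_k = nx.Graph([("rain", "weather")])
G_i = nx.Graph([("weather", "umbrella")])
G_c = nx.Graph([("weather", "forecast")])

# Eq. (1): G = G_k ∪ G_i ∪ G_c; compose_all merges overlapping nodes and edges.
G = nx.compose_all([G_k, G_i, G_c])
print(G.number_of_nodes(), G.number_of_edges())  # 4 3
```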

4.1 ConceptNet

ConceptNet is a large semantic network of common sense knowledge, which consists of assertions related to several aspects of everyday life [8]. Its overall topology is defined by connected words and phrases of natural language with labelled edges. Version 5.7 of ConceptNet has recently been published; it is derived from several sources, including the Open Mind Common Sense corpus, a crowd-sourced knowledge project. The network is designed to represent the general knowledge involved in understanding language, improving natural language applications by allowing them to better understand the meanings behind the words people use. ConceptNet contains over 8 million nodes and over 21 million edges, each with an associated positive or negative weight. The more positive the weight, the more likely that the assertion is true; a negative weight indicates that the corresponding assertion is not true [8]. Relations between nodes are drawn from a core set of 36 asymmetric and symmetric relations, such as is_a, Used_For, Part_Of, Capable_Of, Similar_To, Located_Near and Related_To, which are language independent.
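For readers who want to inspect such assertions directly, ConceptNet also exposes a public web API; a brief sketch (the field names follow the public API's documented JSON layout, so treat them as an assumption if the API changes):

```python
import requests

# Query the public ConceptNet web API for edges around a concept
# (the authors worked from a local SQLite copy; this online call is illustrative).
obj = requests.get("https://api.conceptnet.io/c/en/weather").json()
for edge in obj["edges"][:5]:
    print(edge["rel"]["label"], "|", edge["start"]["label"],
          "->", edge["end"]["label"], "| weight:", edge["weight"])
```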


To test the ConceptNet network, the offline copy was downloaded, as it provides the complete dataset in a locally accessible and highly efficient SQLite database format. Furthermore, this ensured substantially faster access to the relevant data compared with the online version. In order to investigate the topology of this network, the topic of weather modelling and prediction was chosen. Therefore, ConceptNet was queried based on the following concepts:

• weather
• rain
• rainfall
• wind
• temperature
• hot weather
• weather forecast
• weather prediction

This identified a total of 5495 relations from the ConceptNet database; the lowest weight among the relations retrieved was 0.1, while the highest was 10.472. Suitable data cleansing was carried out by removing any assertion with a weight less than 1 and any non-English concept or assertion. The resulting dataset was saved as a CSV file to allow easy access via the Python Pandas and NumPy libraries and to perform some analysis of the network. Since ConceptNet contains millions of edges and nodes, an important decision to make when building a knowledge graph from ConceptNet is which concepts to use and which relations involving these concepts to include. Note that an important characteristic of ConceptNet is that a given pair of concepts can be linked by multiple relation types, and relations can have multi-word arguments of diverse semantic types. This places relations in close vicinity in semantic space, making relation prediction a hard task; on average this applies to 5.37% of instances per relation [2]. The relations in ConceptNet can be as deep as the representation requires within the semantic network. Because the size of the network can grow exponentially, we used parameters to limit its size; in other words, we prune the resultant semantic network. Limiting the sizes of the relation structures reduces the noise and improves the performance and efficiency of the resultant network. Moreover, the reduced semantic network is easier and simpler to work with. Given the relation structures of each of the concepts in the dataset, we can define the depth of the relation structure: the greater the depth, the higher the number of related concepts retrieved. In our experiment, we used relation grouping and weight importance, and found that a weight importance between 1 and 4 was a reasonable choice in our implementation using the ConceptNet dataset. The query runs in approximately 30 s on the reduced ConceptNet and in 60 min on the full ConceptNet.
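A minimal sketch of the data cleansing step described above, assuming the retrieved relations were exported to a CSV with hypothetical column names start, rel, end, weight and lang:

```python
import pandas as pd

# Hypothetical export of the 5495 retrieved assertions, one row per relation.
df = pd.read_csv("weather_relations.csv")  # columns: start, rel, end, weight, lang

# Cleansing as described: drop assertions with weight < 1 and non-English entries.
df = df[(df["weight"] >= 1) & (df["lang"] == "en")]
df.to_csv("weather_relations_clean.csv", index=False)

print(len(df), "assertions kept; weight range:",
      df["weight"].min(), "to", df["weight"].max())
```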

4.2 Wikipedia

Wikipedia is a multilingual online encyclopedia created and maintained as an open collaboration project by a community of volunteer editors using a wiki-based editing system. It is the largest and most popular general reference work on the World Wide Web, and it contains unstructured textual data. An important requirement of our model is to represent knowledge as a network. This was achieved by using spaCy to build an appropriate graph representation of knowledge from Wikipedia pages. The resulting network was integrated and combined with ConceptNet to provide the appropriate knowledge discovery for the given scenarios modelled by our approach. By using the spaCy libraries, we were able to perform named entity recognition (NER). We applied natural language processing (NLP) techniques, such as sentence segmentation and part-of-speech (POS) tagging, to extract pairs of entities and their relations from Wikipedia pages in order to build the knowledge network. The same concepts as above were used to identify the suitable Wikipedia sub-network, which resulted in a combined network of over 8,000 edges and approximately 3,500 nodes.
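A condensed sketch of this extraction pipeline with spaCy; the sentence is invented, and the subject-verb-object pass is deliberately crude compared with a full relation extractor:

```python
import spacy

# Requires: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("The World Meteorological Organization measures rainfall across Europe.")

# Named entity recognition, as applied to the Wikipedia pages.
print([(ent.text, ent.label_) for ent in doc.ents])

# A crude subject-verb-object pass to extract candidate (entity, relation, entity) pairs.
for token in doc:
    if token.dep_ == "ROOT":
        subjects = [w for w in token.lefts if w.dep_ in ("nsubj", "nsubjpass")]
        objects = [w for w in token.rights if w.dep_ in ("dobj", "attr", "pobj")]
        if subjects and objects:
            print(subjects[0].text, token.lemma_, objects[0].text)
```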

5 Conclusion and Future Works

In this article, a knowledge-based network approach to Artificial Intuition was introduced. To achieve this, a large semantic network was designed based on a suitable textual analysis of ConceptNet and Wikipedia. This is part of ongoing research in this field, and the resulting large knowledge network will be used to initiate a comprehensive analysis and implementation of Artificial Intuition.

References

1. Kahneman, D., Frederick, S.: Representativeness revisited: attribute substitution in intuitive judgment. In: Heuristics & Biases: The Psychology of Intuitive Judgment, pp. 49–81. Cambridge University Press, New York (2002)
2. Johnny, O., Trovati, M., Ray, J.: Towards a computational model of artificial intuition and decision making. In: Advances in Intelligent Networking and Collaborative Systems, pp. 463–472. Springer International Publishing (2020)
3. Dundas, J., Chik, D.: Implementing human-like intuition mechanism in artificial intelligence (2011)
4. Newman, M.E.J.: Networks: An Introduction. Oxford University Press, Oxford (2010)
5. Trovati, M., Zhang, H., Ray, J.S., Xu, X.-L.: An entropy-based approach to real-time information extraction for Industry 4.0. IEEE Trans. Ind. Inform. 16, 6033–6041 (2019)
6. Budanitsky, A., Hirst, G.: Evaluating WordNet-based measures of lexical semantic relatedness. Comput. Linguist. 32, 13–47 (2006)
7. Turney, P.: Expressing implicit semantic relations without supervision. The Association for Computational Linguistics (2006)
8. Speer, R., Havasi, C.: Representing general relational knowledge in ConceptNet 5. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation, pp. 3679–3686 (2012)

Research on Foreign Anti-terrorism Intelligence Early Warning Based on Visual Measurement

Xiang Pan1(&), Zhiting Xiao1, Xuan Guo2, and Yuan Chen1

1 Information and Communication College, National University of Defense Technology, Wuhan, China
[email protected], [email protected], [email protected]
2 Officers' College of PAP, Chengdu, China
[email protected]

Abstract. Based on the core database of Web of Science, 146 valid foreign research articles on anti-terrorism intelligence early warning are extracted, and the research status, hotspots and frontiers of foreign anti-terrorism intelligence early warning research are objectively analysed by means of the topic analysis tool SATI, the social network analysis software Ucinet, the citation visualization analysis tool Citespace and the visualization software NetDraw.

1 Introduction

In recent years, the reshaping of the global order has continued: major forces have fluctuated and interacted in complex ways; changes in international mechanisms and rules have accelerated; geopolitical competition among the three major powers has intensified; and multiple challenges have highlighted the urgency of global governance reform. Terrorism has become entrenched in areas marked by war, instability, geopolitical rivalry and complex ethnic tensions, and its organizational structure has evolved towards multi-centred, flat and decentralized forms. This presents a new situation in which terrorist groups are smaller, faster and more information-driven, with a stronger degree of specialization, increasingly violent means of attack, broader influence across countries, and greater difficulty of prevention. The successful implementation of violent terrorist activities is, to a large extent, due to loopholes in prevention. In the process of summing up the lessons of failure, countries around the world have gradually shifted the focus of anti-terrorism work from emergency response after attacks to anti-terrorism intelligence and early warning, resulting in a large body of academic research literature. In view of this, this paper conducts a visual econometric analysis of domestic and foreign anti-terrorism intelligence early warning research literature to grasp the research status, research hotspots and progress of domestic and foreign anti-terrorism intelligence early warning, so as to clarify future research directions.


2 Research Data and Methods

Only through econometric analysis of representative literature with a sufficient sample size can valuable statistical conclusions be obtained. This paper adopts a bibliometric method to conduct quantitative analysis of domestic and foreign anti-terrorism intelligence early warning research literature, and objectively analyses the research status and hot areas of this research with the help of the journal analysis tool SATI, the social network analysis software Ucinet, the citation visualization analysis tool Citespace and the visualization software NetDraw. The data source for the English literature is the core database of Web of Science. The retrieval conditions are TS = (anti-terrorism AND intelligence) OR TS = (anti-terrorism AND early-warning) OR TS = (terrorism AND early-warning); the retrieval strategy is advanced retrieval and the language is English. The literature type was Article and the time span was 2000–2020, retrieved on April 8, 2020. After eliminating duplicates and irrelevant literature, 146 valid English articles were obtained.

3 Analysis of Domestic and Foreign Anti-terrorism Intelligence Early Warning Research

3.1 Analysis of the Number of Published Documents

The number of published documents reflects, to a certain extent, the research level and development trend of a field at a given time. As can be seen from Fig. 1, the amount of foreign research literature shows a slow growth trend on the whole and is closely related to the global terrorism governance situation. The "9/11" incident in 2001 caused academic circles in various countries to reflect on the shortcomings of intelligence work, and the amount of research literature increased rapidly, reaching its first peak in 2004. From 2005 to 2008, as attention to the "9/11" incident declined, the amount of research literature decreased year by year. From 2008 to 2011, as Osama bin Laden and other important al-Qaeda leaders were successively killed by U.S. troops, al-Qaeda and the Taliban launched many major retaliatory terrorist attacks, and the amount of research literature showed an increasing trend. Relevant research from 2012 to 2014 attracted comparatively little attention. The rapid rise of the "Islamic State" from 2015 to 2017 once again triggered an upsurge in anti-terrorism intelligence early warning research. Since 2018, terrorism has continued to spread into the social and political fields, resulting in increasingly prominent and deadly extreme right-wing terrorism; Western Europe, North America and Oceania have been attacked successively. At the same time, the rapid development of the Internet has brought new changes to the spread of terrorism, sustaining interest in foreign anti-terrorism intelligence early warning research to this day.


Fig. 1. Annual number of articles published in foreign anti-terrorism intelligence early warning

3.2 Analysis of Countries

From the perspective of the publication volume on anti-terrorism intelligence and early warning in different countries, among the top ten countries only China is a developing country. The United States published 78 articles, accounting for 53.4% of the total, indicating its important influence in the field of anti-terrorism intelligence and early warning research. The number of articles published by China is 17, accounting for only 11.6% of the total, indicating that there is still much room for improvement in China's research on anti-terrorism intelligence and early warning (Fig. 2).

Fig. 2. Distribution of foreign countries in the field of anti-terrorism intelligence early warning research

3.3 Analysis of Core Journals

Journals generally have a fixed theme. Through the collation of high-frequency journals, we can focus on a certain research field, quickly grasp the research hotspots in that field, and accurately grasp the academic frontier. Visualization and centrality analysis of the periodicals cited in foreign anti-terrorism early warning research are presented in Fig. 3 and Table 1. Through the analysis of the periodicals' intermediary (betweenness) centrality, we can see that RISK ANALYSIS, COMMUNICATIONS OF THE ACM and LECTURE NOTES IN COMPUTER SCIENCE have higher intermediary centrality (0.47, 0.47 and 0.39 respectively), covering the computer, criminology, international relations and other fields. RISK ANALYSIS is a research journal in the field of interdisciplinary applications of mathematics published by Wiley-Blackwell in the United States. COMMUNICATIONS OF THE ACM is a research journal in the field of computer science and information systems published by the Association for Computing Machinery; the journal focuses on the practical impact of information technology progress and related management issues. LECTURE NOTES IN COMPUTER SCIENCE is a research-level professional publication on computer science published by SPRINGER in Germany; it collects the proceedings of major international conferences on computer science, mainly involving the fields of computers, information technology, automatic control, artificial intelligence, etc. The high intermediary centrality indicates that the research papers published in these three journals have been highly recognized by scholars in the field of anti-terrorism intelligence and early warning, and also shows the important influence of the three journals in this research field.
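Intermediary centrality here corresponds to betweenness centrality in network analysis; a minimal Python sketch on an invented toy co-citation network:

```python
import networkx as nx

# Toy co-citation network of journals; an edge means "cited together".
G = nx.Graph()
G.add_edges_from([
    ("RISK ANALYSIS", "COMMUNICATIONS OF THE ACM"),
    ("RISK ANALYSIS", "LECTURE NOTES IN COMPUTER SCIENCE"),
    ("COMMUNICATIONS OF THE ACM", "SCIENCE"),
    ("LECTURE NOTES IN COMPUTER SCIENCE", "NATURE"),
])

# Betweenness centrality: how often a journal lies on shortest paths
# between other journals in the network.
for journal, c in nx.betweenness_centrality(G).items():
    print(f"{journal}: {c:.2f}")
```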

Fig. 3. The core journals of foreign anti-terrorism intelligence early warning research


Table 1. The number of citations and centrality of core foreign periodicals in anti-terrorism intelligence early warning research

No | Journal | Number of references | Degree
1 | LECTURE NOTES IN COMPUTER SCIENCE | 19 | 0.39
2 | RISK ANALYSIS | 19 | 0.47
3 | COMMUNICATIONS OF THE ACM | 19 | 0.47
4 | SCIENCE | 15 | 0.35
5 | EUROPEAN JOURNAL OF OPERATIONAL RESEARCH | 14 | 0.12
6 | NATURE | 12 | 0.26
7 | OPERATIONS RESEARCH | 12 | 0.06
8 | LECT NOTES ARTIF INT | 10 | 0.21
9 | RELIABILITY ENGINEERING & SYSTEM SAFETY | 10 | 0.20
10 | STUD CONFL TERROR | 10 | 0.06

3.4 Analysis of Core Authors

Garfield, an American information scientist, pointed out that the amount of literature published by an author is positively correlated with his or her influence in the research field. According to Price's law, the number of articles published by a core author should satisfy:

m = 0.749 √n_max    (1)

where n_max is the number of articles published by the most prolific author. Foreign anti-terrorism intelligence early warning research has 444 author signatures. The most published author is Chen, HC (Dr. Chen Xinjun), Professor in the Department of Management Information Systems at the University of Arizona. Dr. Chen led the team of the Artificial Intelligence Laboratory of the University of Arizona in cooperation with U.S. government departments such as the Department of Defense, the Department of Homeland Security and the Central Intelligence Agency. He is mainly engaged in intelligence analysis, terrorist data mining and related research, and is one of the most authoritative authors in the field of anti-terrorism intelligence research in the world. The number of documents published by Professor Chen is 6, so n_max = 6 and m ≈ 1.834. Therefore, authors with 2 or more documents qualify as core authors. The sample data contains a total of 25 core authors, accounting for 5.63% of the total number of signed authors, as shown in Table 2.
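The threshold can be reproduced in a few lines (a trivial sketch of Formula (1)):

```python
import math

# Price's law threshold (Formula (1)) with n_max = 6 (Chen, HC).
n_max = 6
m = 0.749 * math.sqrt(n_max)
print(round(m, 3))  # 1.834, so authors with >= 2 papers count as core authors
```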


Table 2. Core authors in the foreign anti-terrorism intelligence early warning field

No | Author | Number of published documents
1 | Chen, HC | 6
2 | Choo, KKR | 4
3 | Xu, JJ | 4
4 | Hausken, K | 3
5 | Kaplan, EH | 3
6 | Quick, D | 3
7 | Sageman, M | 3
8 | Bagchi, A | 2
9 | Caulkins, JP | 2
10 | Chau, M | 2
11 | Chung, WY | 2
12 | Ding, Y | 2
13 | Feichtinger, G | 2
14 | Fujita, H | 2
15 | Leistedt, SJ | 2
16 | Osrfeld, A | 2
17 | Paul, JA | 2
18 | Qin, JL | 2
19 | Salomons, E | 2
20 | Seidl, A | 2
21 | Sun, DY | 2
22 | Wang, FY | 2
23 | Wrzaczek, S | 2
24 | Yang, CC | 2
25 | Zeng, D | 2

3.5 Analysis of Co-authors

Network density shows how close the relationships between authors are. In an undirected network with n nodes, the maximum possible number of relationships is C(n, 2) = n(n - 1)/2; if the actual number of relationships is m, the density of the network is 2m / (n(n - 1)). After calculation, the density of the co-author network of foreign anti-terrorism intelligence early warning research is 0.0431. This index is small, and the network nodes as a whole are in a loose state.
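A short sketch of the density computation; the edge count below is chosen so that a 444-node network reproduces the reported density of 0.0431, which is an assumption about the network's size:

```python
import networkx as nx

# n = 444 authors; m chosen so the reported density of 0.0431 is reproduced.
G = nx.gnm_random_graph(n=444, m=4239, seed=0)
n, m = G.number_of_nodes(), G.number_of_edges()
print(2 * m / (n * (n - 1)))  # density by the formula above, ~0.0431
print(nx.density(G))          # identical value via NetworkX
```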

Fig. 4. Analysis of co-authors of foreign anti-terrorism early warning research


The research team at the University of Arizona in the United States with Dr. Chen Xinjun at its core and the research team at the University of Wisconsin-Madison with Dr. V.M. Bier at its core cooperate closely within their teams, while researchers outside these teams mainly work independently, and there are few research teams with close cooperation (Fig. 4).

3.6 Analysis of Core Institutions

According to Price’s Law, the number of articles issued by core institutions should be satisfied formulas 1, of which nmax is the research institution with the largest number of articles issued. The institution that published the most literature on anti-terrorism intelligence and early warning research abroad is the University of Arizona in the United States. nmax ¼ 7, according to which m  1:982 can be calculated. Therefore, research institutions with 2 or more articles can become core institutions, with a total of 54 core institutions, accounting for 21.4% of the total number of research institutions. Table 3 lists the core institutions with 3 or more articles in the field of anti-terrorism intelligence early warning research abroad. China’s Chinese Academy of Sciences and National Defense University of Science and Technology are in this category.

Table 3. Core institutions of foreign anti-terrorism intelligence early warning research

No | Research institution | Number of published documents
1 | University of Arizona | 7
2 | University of Texas System | 6
3 | Chinese Academy of Sciences | 5
4 | United States Department of Defense | 5
5 | University System of Georgia | 5
6 | U.S. Department of Energy | 4
7 | United States Navy | 4
8 | University of London | 4
9 | National University of Defense Technology, China | 3
10 | Pennsylvania Commonwealth System of Higher Education (PCSHE) | 3
11 | Stavanger University | 3
12 | University of Massachusetts System | 3
13 | University of Pennsylvania | 3
14 | University of South Australia | 3
15 | University of Southern California | 3
16 | University of San Antonio, Texas | 3
17 | Yale University | 3

3.7 Analysis of Co-institutions

The cooperation among foreign anti-terrorism intelligence and early warning research institutions is shown in Fig. 5. The academic cooperation between the University of Arizona and the Chinese Academy of Sciences is the most evident among the five leading institutions, which also include Yale University. Although there is some academic cooperation among other research institutions, on the whole, the academic exchange system among foreign research institutions in the field of anti-terrorism intelligence and early warning research is not well developed.

Fig. 5. Analysis of co-research institutions of foreign anti-terrorism early warning research

4 Analysis of Hotspots and Frontiers of Foreign Anti-terrorism Intelligence Early Warning Research

4.1 Analysis of Hotspots of Foreign Anti-terrorism Intelligence Early Warning Research

Key words are the refinement and concentration of the core content of an article, and are strongly correlated with the research purpose, objects, methods and results of the article. Word frequency and intermediary centrality are the main parameters for evaluating the importance of keywords; the research topics represented by keywords with high word frequency and high intermediary centrality are the research hotspots in the field. According to the word frequency statistics of the keywords in the foreign anti-terrorism intelligence early warning research literature, the top ten are: terrorism, intelligence, model, security, counter-terrorism, defense, attack, data mining, system and bioterrorism. By intermediary centrality, the top ten keywords are: terrorism, model, security, data mining, system, big data, counter-terrorism, crime, decision making and capability. The keywords with both high frequency and high intermediary centrality are terrorism, model, security, counter-terrorism, data mining and system, which are the hotspots in foreign anti-terrorism intelligence early warning research, as shown in Table 4.

Table 4. Statistics on keyword frequency and intermediary centrality in foreign anti-terrorism intelligence early warning research (top 10)

No | Key words | Frequency | Key words | Intermediary centrality
1 | Terrorism | 19 | Terrorism | 0.19
2 | Intelligence | 15 | Model | 0.15
3 | Model | 10 | Security | 0.13
4 | Security | 8 | Data mining | 0.10
5 | Anti-terrorism | 5 | System | 0.10
6 | Defense | 5 | Big data | 0.10
7 | Attack | 5 | Anti-terrorism | 0.08
8 | Data mining | 4 | Crime | 0.08
9 | System | 4 | Decision making | 0.07
10 | Bioterrorism | 4 | Capability | 0.07

The keywords of the foreign anti-terrorism intelligence early warning research literature are clustered into 10 categories: privacy right, specific treatment, individual protection, dark web, complexity science, intelligent agent, terrorist organization, terrorist network, general corpus and social system. These 10 categories are the specific research hotspots in the field, as shown in Fig. 6.

Fig. 6. Research hotspots of foreign anti-terrorism intelligence early warning research

4.2 Analysis of Frontiers of Foreign Anti-terrorism Intelligence Early Warning Research

According to the word frequency statistics of keywords in the field of anti-terrorism intelligence and early warning from 2018 to 2020, the top ten are: terrorism, impact, counter-terrorism, big data, decision making, artificial intelligence, threat assessment, intelligence, system and model. By intermediary centrality, the top ten are: impact, terrorism, big data, decision making, artificial intelligence, threat assessment, critical success factor, construction, intelligence and capability. The keywords with both high frequency and high intermediary centrality are terrorism, impact, big data, decision making, artificial intelligence, threat assessment and intelligence, which are the frontiers of foreign anti-terrorism intelligence early warning research, as shown in Table 5.

Table 5. Statistics on keyword frequency and intermediary centrality in foreign anti-terrorism intelligence early warning research, 2018–2020 (top 10)

No | Key words | Frequency | Key words | Intermediary centrality
1 | Terrorism | 5 | Terrorism | 0.47
2 | Impact | 3 | Model | 0.37
3 | Anti-terrorism | 3 | Security | 0.20
4 | Big data | 2 | Data mining | 0.17
5 | Decision making | 2 | System | 0.15
6 | Artificial intelligence | 2 | Big data | 0.15
7 | Threat assessment | 2 | Anti-terrorism | 0.15
8 | Intelligence | 2 | Crime | 0.15
9 | System | 2 | Decision making | 0.10
10 | Model | 2 | Capability | 0.10

5 Conclusions

Attention to foreign anti-terrorism intelligence early warning research is closely related to the global terrorism governance situation, showing a steady upward trend. The core research journals cover the computer, information technology, automatic control, artificial intelligence, criminology, mathematics, international relations and other fields. The proportion of core authors is high; however, there is little cooperation between research authors, knowledge flow is insufficient, and a research system has not yet fully formed. Although the core research institutions have no specific anti-terrorism tasks, they play an important role as "think tanks" in close cooperation with the relevant anti-terrorism departments of the state and the army; research institutions are closely guarded, and the academic exchange system is not well developed. The field focuses on research into terrorism models, data mining, terrorism threat assessment and related content, with relatively more research on technology applications. The advantage of this focus is that the research goes deeper and can directly increase intelligence benefits; the disadvantage is that the research scope is narrow and its generality is limited. Frontier issues in foreign anti-terrorism intelligence early warning research, such as big data, terrorism, counter-terrorism and the derived topics of artificial intelligence, threat assessment, impact, intelligence and decision-making, point out the development direction for China's anti-terrorism intelligence early warning research.

References

1. Kim, J.: Breaking the privacy kill chain: protecting individual and group privacy online. Inf. Syst. Front. 22, 171–185 (2020). https://doi.org/10.1007/s10796-018-9856-5
2. Haimes, Y.Y.: On the complex quantification of risk: systems-based perspective on terrorism. Risk Anal. 31(8), 1175–1186 (2011)
3. Hausken, K.: Combined series and parallel systems subject to individual versus overarching defense and attack. Asia-Pac. J. Oper. Res. 30(13), 155–160 (2013)
4. Nasir, M., Shahbaz, M.: War on terror: do military measures matter? Empirical analysis of post 9/11 in Pakistan. Qual. Quant. 49(5), 1969–1984 (2015)
5. Kim, W.: On US homeland security and database technology. J. Database Manag. 16(1), 17 (2005)
6. Helbing, D., Brockmann, D., Chadefaux, T., Donnay, K., Blanke, U., Woolley-Meza, O., Moussaid, M., Johansson, A., Krause, J.: Saving human lives: what complexity science and information systems can contribute. J. Stat. Phys. 158, 735–781 (2015)


7. Gao, S., Xu, D.: Conceptual modeling and development of an intelligent agent-assisted decision support system for anti-money laundering. Expert Syst. Appl. 36, 1493–1504 (2009)
8. Jaspersen, J.G., Montibeller, G.: On the learning patterns and adaptive behavior of terrorist organizations. Eur. J. Oper. Res. 282, 221–234 (2020)
9. Singh, S., Verma, S.K., Tiwari, A.: A novel approach for finding crucial node using ELECTRE method. Int. J. Mod. Phys. B 34, 58–61 (2020)

Credit Rating Based on Hybrid Sampling and Dynamic Ensemble

Shudong Liu1, Jiamin Wei1, Xu Chen1, Chuang Wang2(&), and Xu An Wang3

1 School of Information and Security Engineering, Zhongnan University of Economics and Law, Wuhan, China
2 China Software Testing Center, Beijing, China
[email protected]
3 Engineering University of People's Armed Police, Xi'an, China

Abstract. The core problem of credit rating is how to build an efficient and accurate classifier on imbalanced datasets. Ensemble learning and resampling techniques have produced rich results in this field, but the efficiency of such classifiers is limited when dealing with highly imbalanced credit data. In this paper, we propose a credit rating model based on hybrid sampling and a dynamic ensemble technique. Hybrid sampling helps to build a rich base classifier pool and improve the accuracy of the ensemble learning model. The combination of hybrid sampling and dynamic ensemble can be applied to various imbalanced data and obtains better classification results. In the resampling phase, the synthetic minority over-sampling technique (SMOTE) and a borderline-sensitive under-sampling technique are used to process the training data set, and a clustering technique is used to improve the under-sampling and make it more adaptable to highly imbalanced credit data, generating more samples and more representative training subsets to enhance the diversity of the base classifiers. A dynamic selection method is used to select one or more classifiers from the base classifier pool for each test sample. Experiments on three credit data sets prove that the combination of hybrid sampling and dynamic ensemble can effectively improve classification performance.

1 Introduction

Credit risk management is a perennial topic in the financial field. Due to the subprime crisis that appeared in the US in 2007 and several international agreements released since 2010 (such as Basel III), credit rating has attracted more researchers' attention from academia and industry. Recently, with the development of network techniques and artificial intelligence, digital payment and peer-to-peer lending are widely used in our daily lives. Credit risk is often hidden in such financial platforms; therefore, credit risk assessment is a crucial issue, and credit scoring based on machine learning algorithms is becoming researchers' first-choice solution. In the era of big data, vast cross-domain data, ever more data types and high velocity bring new challenges for credit scoring in the financial field. Manual credit risk management is obviously impossible; it requires automatic credit scoring by using machine learning


algorithms. Moreover, high-performance credit scoring can help reduce the cost of credit risk assessment, support optimal loan decisions, and help financial institutions pursue larger profit margins. As Thomas et al. [1] said, credit scoring is "a set of decision models and their underlying techniques that aid credit lenders in the granting of credit"; consequently, a credit scoring model often relies on statistical analysis. Through analysing users' historical loan records, we extract important features which have a significant impact on risk assessment, such as gender, age, occupation and income, and construct an appropriate model to predict users' credit scores. A credit scoring model is often a classifier which divides users into two groups, namely trustworthy users and non-trustworthy users. Financial institutions decide whether to make loans to users in terms of their credit scores. In an increasingly complex financial market, the performance of a credit scoring model usually affects the market position of a financial institution; therefore, constructing a high-performance credit scoring model is crucial. With the great progress of artificial intelligence and deep learning, exploiting machine learning techniques to construct credit scoring models has gradually become the mainstream, including k-nearest neighbor (KNN), artificial neural networks (ANN), decision trees (DT) and support vector machines (SVM). Recently, many studies [2–4] demonstrate that ensemble learning techniques have obtained good performance on credit scoring. Thus, in this paper, we introduce dynamic ensemble learning and a hybrid resampling technique to construct a credit scoring model.

2 Related Works

As a useful tool in financial risk management, credit scoring has attracted many researchers' attention over the past decades, and some interesting research results have been achieved. We divide previous works into two groups in terms of the classifier used in the credit scoring model: statistical methods and machine learning methods. Statistical methods are the earliest solutions used in credit scoring models; despite the breakthroughs of artificial intelligence nowadays, due to their easy implementation and accuracy, many statistical methods are still popular, including discriminant analysis (DA) [5–7], logistic regression (LR) [8, 9], Naïve Bayes (NB) [10] and rough set theory [3, 11]. For example, Eisenbeis et al. [5] review the early methodological approaches which exploit discriminant analysis for credit scoring in the 1960s, and point out that statistical scoring models tend to focus primarily on the minimization of default rates, ignoring other optimization objectives, such as lender profit maximization and cost minimization. William et al. [6] propose a descriptive example and empirical analysis to show how linear programming might be used to solve discriminant-type problems. Mircea et al. [7] propose a specific credit score model based on discriminant analysis, which can make financial diagnoses on particular predefined classes. Patra et al. [8] propose a maximum margin logistic regression with a modified logistic loss function, which has the classification capability of SVM with low computational cost. Sohn et al. [9] propose a fuzzy credit scoring model with fuzzy input and output, which can be used to predict the default possibility of a loan for a firm.


Vedala et al. [10] propose a multi-relational Bayesian classification method to predict the default probabilities of users on an online P2P platform. Capotorti et al. [11] propose a credit scoring model based on conditional probability assessments, which incorporates fuzzy rough set theory with uncertain reasoning. Maldonado et al. [3] propose a two-step credit scoring approach based on three-way decision theory. Machine learning methods are another popular technique used in credit scoring over the past years, and many achievements have demonstrated that machine learning methods are an alternative and effective solution for credit scoring. According to the integration of the algorithms used in credit scoring, machine learning methods can be roughly categorized into two groups: single classification methods and ensemble methods. The single classification algorithms used in credit scoring models include KNN [12, 13], DT [14, 15], SVM [16, 17], ANN [18] and generative models [19]. For example, Mukid et al. [13] review weighted k-nearest neighbor algorithms for credit scoring, and the experimental results show that the Gaussian kernel and rectangular kernel outperform others. Nie et al. [14] propose a misclassification cost measurement considering the two error types and their economic sense, which is applied to logistic regression and decision trees. Sohn et al. [15] propose a rather simple credit scoring model based on decision trees, which can serve as a replacement for the complicated models currently used for start-up firms. Li et al. [16] propose reject inference based on semi-supervised SVM for credit scoring, which can deal with labelled and unlabelled data. Luo et al. [17] design a golden-section algorithm based on an unsupervised kernel-free quadratic surface SVM model, which can generate an appropriate classifier for balanced and imbalanced data. Zhou et al. [18] propose a credit scoring model based on multi-layer perceptron neural networks, and prove that optimization of the dataset structure can improve a model's performance. Mancisidor et al. [19] present two Bayesian models for reject inference in credit scoring, which combine Gaussian mixtures and auxiliary variables in a semi-supervised framework with generative models. The above-mentioned methods almost all rely on a single classification algorithm; they may perform well on small datasets. However, current credit risk assessment tends to deal with complex cross-domain datasets, and a single classification algorithm is no longer a suitable solution to credit scoring in the big data scenario. Consequently, researchers around the world increasingly exploit ensemble methods for credit scoring. Ensemble methods can be divided into two groups according to their structure: parallel methods [20, 21] and sequential methods. The representatives of the former are bagging [20] and random forest [21]; the representative of the latter is boosting [22, 23]. For example, Sun et al. [20] design a decision tree ensemble model based on the synthetic minority over-sampling technique (SMOTE) and the Bagging algorithm, in which new positive samples are given different weights and different numbers of negative samples are drawn with differentiated sampling rates in iterations. Arora et al. [21] propose a Bolasso-enabled random forest algorithm to classify a borrower as defaulter or legitimate. Xia et al. [22] propose a sequential ensemble credit scoring model based on extreme gradient boosting (XGBoost), which tunes the hyperparameters of XGBoost with Bayesian hyper-parameter optimization. Chang et al. [23] construct a credit risk assessment model based on XGBoost, which achieves better experimental results than single-stage classification algorithms such as LR and SVM.


Recently, a few researchers [24, 25] have put forward dynamic ensemble methods for credit scoring. In this paper, we incorporate hybrid sampling into a dynamic ensemble method and propose a novel credit scoring framework which can be applied to various imbalanced datasets.

3 A Credit Scoring Model Based on Hybrid Sampling and Dynamic Ensemble

3.1 Overview of the Credit Scoring Model

Our proposed credit scoring model is composed of two main parts: a hybrid sampling module and a dynamic selection module. The model is shown in Fig. 1.

Fig. 1. The overview of our proposed model

3.2 Hybrid Sampling

In line with the research results in [26], the borderline instances of the two classes play an important part in determining the decision boundary and the performance of the classifier; consequently, we regard the borderline instances as necessary input to the base classifiers. In the hybrid sampling module, we first use the SMOTE technique to generate new synthetic instances of the minority class, and identify borderline samples of the majority class; we add these borderline samples to the training subsets, and call this sample selection method borderline-sensitive undersampling (BSU). To avoid the loss of important information in random undersampling, we utilize the k-means algorithm to divide all majority class samples (excluding borderline instances) into numerous clusters and select representative instances from each cluster to construct the training subsets of the base classifiers. The hybrid sampling module is composed of the following three steps.


(1) Oversampling the minority class by using the SMOTE technique. Let T_o denote the original skewed dataset, T_N the set of majority class samples with |T_N| = Num_N, and T_P the set of minority class samples with |T_P| = Num_P. We use the SMOTE algorithm to generate new minority class samples. For a randomly selected sample x_i ∈ T_P, i ∈ [1, Num_P], we find its k nearest neighbors {x_{i,j} | j ∈ [1, k]}, where the distance measurement is the Euclidean distance. We then randomly select a sample x_{i,j'} from {x_{i,j} | j ∈ [1, k]}, and the new synthetic sample x_{i,n} is:

x_{i,n} = x_i + rand(0, 1) · (x_{i,j'} - x_i)

We repeat the above steps until we obtain a balanced dataset T_B = T_o ∪ {x_{i,n}}; the set of all new synthetic samples is denoted by T_syn.

(2) Identifying the borderline samples of the majority class and removing the noisy samples. For a sample x_i ∈ T_B whose k nearest neighbors are {x_{i,j} | j ∈ [1, k]}, if the class of x_i differs from the class of every one of its k nearest neighbors, we regard x_i as a noisy sample and remove it from T_B. For a sample x_i ∈ T_N whose k nearest neighbors are B_i = {x_{i,j} | j ∈ [1, k]}, consider the subset SB_i = {x_{i,j} | x_{i,j} ∈ B_i, j ∈ [1, k], class(x_{i,j}) ≠ majority} with k/2 < |SB_i| < k. In other words, if the majority of the samples in B_i belong to the minority class, we regard x_i as a borderline sample. The set of selected borderline samples is denoted by T_L.

(3) Undersampling the majority class based on clustering (CBU). To avoid the loss of important information in random undersampling, we utilize the k-means algorithm to divide all majority class samples (excluding borderline instances) into numerous clusters and select representative instances from each cluster. For the majority class set T_N - T_L, the number of clusters is m = |T_N - T_L| / Num_P. Let t = 0 and randomly select samples o_t = {x_1^t, x_2^t, ..., x_m^t} from T_N - T_L as the cluster centers. Compute the distance d_{ij}^t between each sample x_i, i = 1, 2, ..., |T_N - T_L|, and each cluster center x_j^t, j = 1, 2, ..., m, and add the sample x_i to the cluster whose center x_{j'}^t satisfies d_{ij'} = min_j d_{ij}. Once every x_i ∈ T_N - T_L has been assigned, we obtain m clusters C_1^t, C_2^t, ..., C_m^t and update the cluster centers as x_i^{t+1} = (1 / |C_i^t|) Σ_{x_k ∈ C_i^t} x_k. If Σ_{i,j} d_{ij}^{t+1} < Σ_{i,j} d_{ij}^t, set t = t + 1 and proceed to the next iteration; otherwise, stop. We obtain the final clustering results C_1, C_2, ..., C_m, and construct minority class sets B_1, B_2, ..., B_m, where the samples of B_i are randomly selected from T_P ∪ T_syn through bootstrap sampling. The m training subsets are S_1, S_2, ..., S_m, where S_i = {T_L, C_i, B_i}, i = 1, 2, ..., m.
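A condensed sketch of the three steps, assuming scikit-learn and imbalanced-learn are available; the size of each B_i (chosen here to balance the subset) and the exact noise filter are simplifying assumptions rather than the authors' exact procedure:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors
from imblearn.over_sampling import SMOTE  # assumes imbalanced-learn is installed

def hybrid_sampling(X, y, k=10, seed=0):
    """Condensed sketch of the three steps: SMOTE + BSU + CBU.
    y: 1 = minority (positive) class, 0 = majority (negative) class."""
    rng = np.random.default_rng(seed)

    # (1) SMOTE oversampling to a balanced set T_B.
    X_bal, y_bal = SMOTE(k_neighbors=k, random_state=seed).fit_resample(X, y)

    # Shared k-NN structure over T_B (the first neighbour is the sample itself).
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X_bal).kneighbors(X_bal)
    neigh_labels = y_bal[idx[:, 1:]]

    # Noise filter: all k neighbours disagree with the sample's own label.
    noisy = (neigh_labels != y_bal[:, None]).all(axis=1)

    # (2) BSU: majority samples whose neighbourhood is mostly minority.
    minority_share = (neigh_labels == 1).mean(axis=1)
    majority = (y_bal == 0) & ~noisy
    borderline = majority & (minority_share > 0.5)
    T_L = X_bal[borderline]

    # (3) CBU: k-means over the remaining majority samples, one cluster per subset.
    rest = X_bal[majority & ~borderline]
    m = max(1, len(rest) // int((y == 1).sum()))
    labels = KMeans(n_clusters=m, n_init=10, random_state=seed).fit_predict(rest)

    # Training subsets S_i = {T_L, C_i, B_i}; B_i is bootstrapped from the
    # minority pool, sized here to balance the subset (an assumption).
    minority_pool = X_bal[(y_bal == 1) & ~noisy]
    subsets = []
    for i in range(m):
        C_i = rest[labels == i]
        take = rng.integers(0, len(minority_pool), size=len(C_i) + len(T_L))
        B_i = minority_pool[take]
        X_i = np.vstack([T_L, C_i, B_i])
        y_i = np.concatenate([np.zeros(len(T_L) + len(C_i)), np.ones(len(B_i))])
        subsets.append((X_i, y_i))
    return subsets
```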

3.3 Dynamic Ensemble

Dynamic selection is the core issue of dynamic ensemble, which dynamically selects appropriate classifiers from the base classifier pool for each test sample. A base classifier in the pool is often seen as a domain expert, which can correctly classify the samples in a local region of the feature space rather than all samples. Thus, we need to evaluate the local competence of each base classifier in the pool; the evaluation methods include the KNN algorithm, clustering algorithms and potential functions, and only the competent classifiers are used to predict test samples. We take advantage of the KNOE (k-nearest oracles eliminate) algorithm [27] to implement our dynamic ensemble solution, and the dynamic selection dataset (DSEL) is the original balanced training set, i.e. the training set supplemented with the new synthetic samples generated by SMOTE.
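A minimal sketch of KNOE-style selection for a single test sample (library implementations also exist, e.g. in the DESlib package, but the logic is simple enough to show directly):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knoe_predict(x, pool, X_dsel, y_dsel, k=10):
    """KNORA-Eliminate-style selection for a single test sample x.
    pool: list of fitted base classifiers; (X_dsel, y_dsel): selection set."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_dsel)
    _, idx = nn.kneighbors(x.reshape(1, -1))
    region = idx[0]
    competent = []
    # Keep classifiers that classify ALL neighbours correctly; if none does,
    # shrink the region until at least one classifier survives.
    while len(region) > 0:
        competent = [clf for clf in pool
                     if (clf.predict(X_dsel[region]) == y_dsel[region]).all()]
        if competent:
            break
        region = region[:-1]
    competent = competent or pool  # fall back to the whole pool
    votes = [int(clf.predict(x.reshape(1, -1))[0]) for clf in competent]
    return np.bincount(votes).argmax()  # majority vote
```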

4 Experiments and Results Analysis

4.1 Experimental Data and Settings

The datasets used in our experiments are from the UCI machine learning repository, Kaggle, and Lending Club, respectively. The statistics of the three datasets are shown in Table 1. In our experiments, the number of nearest neighbors is 10. We use 5-fold cross-validation to compare the performance of all methods on the training datasets. The base classifier is a decision tree, and the size of the base classifier pool is 30.
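Continuing the hybrid_sampling sketch above, the base classifier pool can be built as follows (a sketch; `subsets` is the hypothetical output of that function):

```python
from sklearn.tree import DecisionTreeClassifier

# One decision tree per hybrid-sampled training subset, capped at a pool of 30.
pool = [DecisionTreeClassifier(random_state=0).fit(X_i, y_i)
        for X_i, y_i in subsets[:30]]
```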

Table 1. The statistics of the three datasets

Dataset | # samples | # positive samples | # negative samples | Imbalance Ratio (IR)
Taiwan (UCI) | 30,000 | 6,636 | 23,364 | 3.52
Give Me Some Credit (Kaggle) | 150,000 | 10,026 | 139,974 | 13.96
Lending Club | 266,919 | 45,248 | 221,671 | 4.90

To compare our proposed model, we select several comparison methods, including random forest (RF), which is a static ensemble method, and RANK [28], LCA [29] and META [30], which are three dynamic ensemble methods. To demonstrate the practicability of the hybrid sampling proposed in this paper, we introduce random sampling (RS) as a comparison method.

4.2 Evaluation Metrics

We evaluate our proposed algorithm by the F-measure and G-mean, which are commonly used to evaluate imbalanced learning algorithms. The confusion matrix of the classification results is shown in Table 2.

Table 2. Confusion matrix of classification

 | Predicted Positives | Predicted Negatives
Actual Positives | True Positives (TP) | False Negatives (FN)
Actual Negatives | False Positives (FP) | True Negatives (TN)

Positive predictive value (precision) is defined as:

Precision = TP / (TP + FP)

True positive rate (recall or sensitivity) is defined as:

Recall = TPR = TP / (TP + FN)

True negative rate (specificity) is defined as:

TNR = TN / (TN + FP)

F1 is defined as:

F1 = (2 × Precision × Recall) / (Precision + Recall)

G-mean is defined as:

G-mean = √(TPR × TNR)
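Both metrics can be computed directly from the confusion matrix; a minimal sketch using scikit-learn:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return np.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))

y_true = [1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 1]
print(f1_score(y_true, y_pred), g_mean(y_true, y_pred))  # 0.667 0.667
```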

4.3 Performance Comparisons

We compare our proposed model with four state-of-the-art ensemble methods on three datasets; the experimental results are shown in Table 3. We can see from Table 3 that the F1 and G-mean of the four dynamic ensemble methods are better than those of random forest, and that they perform much better on high-IR datasets. Notably, our proposed model does not have an evident competitive advantage on low-IR datasets; however, it always outperforms the other three comparison methods on high-IR datasets.


Table 3. Comparison of experimental results

Method | Taiwan G-means | Taiwan F1 | Give Me Some Credit G-means | Give Me Some Credit F1 | Lending Club G-means | Lending Club F1
RF | 0.79 | 0.83 | 0.76 | 0.79 | 0.75 | 0.69
RS-LCA | 0.82 | 0.85 | 0.80 | 0.80 | 0.79 | 0.72
RS-RANK | 0.86 | 0.89 | 0.77 | 0.76 | 0.75 | 0.70
RS-META | 0.88 | 0.90 | 0.81 | 0.82 | 0.79 | 0.72
RS-KNOE | 0.91 | 0.94 | 0.85 | 0.88 | 0.80 | 0.80
HS-LCA | 0.86 | 0.90 | 0.79 | 0.71 | 0.80 | 0.80
HS-RANK | 0.81 | 0.88 | 0.81 | 0.83 | 0.83 | 0.81
HS-META | 0.90 | 0.90 | 0.83 | 0.85 | 0.82 | 0.71
HS-KNOE | 0.92 | 0.93 | 0.86 | 0.89 | 0.84 | 0.83

5 Conclusion

Credit risk is often hidden in modern financial platforms; therefore, credit risk assessment is a crucial issue, and credit scoring based on machine learning algorithms is becoming researchers' first-choice solution. In this paper, we propose a credit rating model based on hybrid sampling and a dynamic ensemble technique. Hybrid sampling helps to build a rich base classifier pool and enhances the performance of the ensemble learning model. The combination of hybrid sampling and dynamic ensemble can be applied to various imbalanced data and obtains better classification results. Experiments on three credit data sets prove that the combination of hybrid sampling and dynamic ensemble can effectively improve classification performance.

Acknowledgments. This study was supported by the National Natural Science Foundation of China (61602518, 71872180, 71974202) and the Fundamental Research Funds for the Central Universities, Zhongnan University of Economics and Law (2722020JCT033).

References

1. Thomas, L.C., Edelman, D., Crook, J.: Credit Scoring and Its Application. SIAM, Philadelphia (2002)
2. Xiao, J., Zhou, X., Zhong, Y., Xie, L.: Cost-sensitive semi-supervised selective ensemble model for customer credit scoring. Knowl.-Based Syst. 189 (2020). Article No. 113351
3. Maldonado, S., Peters, G., Weber, R.: Credit scoring using three-way decisions with probabilistic rough sets. Inf. Sci. 507, 700–714 (2020)
4. Classifier selection and clustering with fuzzy assignment in ensemble model for credit scoring. Neurocomputing 316, 210–221 (2018)
5. Eisenbeis, R.A.: Problems in applying discriminant analysis in credit scoring models. J. Bank. Finance 2(3), 205–219 (1978)


6. William, E., Hardy, J.R., John, L., Adrian, J.R.: A linear programming alternative to discriminant analysis in credit scoring. Agribusiness 1(4), 285–292 (1985)
7. Mircea, G., Pirtea, M., Neamtu, M., Bazavan, S.: Discriminant analysis in a credit scoring model. In: AIASABEBI 2011, pp. 257–262, August 2011
8. Patra, S., Shanker, K., Kundu, D.: Sparse maximum margin logistic regression for credit scoring. In: ICDM 2008, pp. 977–982, December 2008
9. Sohn, S.Y., Kim, D.H., Yoon, J.H.: Technology credit scoring model with fuzzy logistic regression. Appl. Soft Comput. 43, 150–158 (2016)
10. Vedala, R., Kumar, B.R.: An application of Naive Bayes classification for credit scoring in e-lending platform. In: ICDSE 2012, pp. 21–29, July 2012
11. Capotorti, A., Barbanera, E.: Credit scoring analysis using a fuzzy probabilistic rough set model. Comput. Stat. Data Anal. 56(4), 981–994 (2012)
12. Henley, W.E., Hand, D.J.: A k-NN classifier for assessing consumer credit risk. Statistician 65, 77–95 (1996)
13. Mukid, M.A., Widiharih, T., Rusgiyono, A.: Credit scoring analysis using weighted k nearest neighbor. In: The 7th International Seminar on New Paradigm and Innovation on Natural Science and Its Application, October 2017
14. Nie, G., Rowe, W., Zhang, L., Tian, Y., Shi, Y.: Credit card churn forecasting by logistic regression and decision tree. Expert Syst. Appl. 38, 15273–15285 (2011)
15. Sohn, S.Y., Kim, J.W.: Decision tree-based technology credit scoring for start-up firms: Korean case. Expert Syst. Appl. 39(4), 4007–4012 (2012)
16. Li, Z., Tian, Y., Li, K., Zhou, F.: Reject inference in credit scoring using semi-supervised support vector machines. Expert Syst. Appl. 74, 105–114 (2017)
17. Luo, J., Yan, X., Tian, Y.: Unsupervised quadratic surface support vector machine with application to credit risk assessment. Eur. J. Oper. Res. 280(3), 1008–1017 (2020)
18. Zhou, Z., Xu, S., Kang, B.H.: Investigation and improvement of multi-layer perceptron neural networks for credit scoring. Expert Syst. Appl. 42(7), 3508–3516 (2015)
19. Mancisidor, R.A., Kampffmeyer, M., Aas, K., Jenssen, R.: Deep generative models for reject inference in credit scoring. Knowl.-Based Syst. 196 (2020). Article No. 105758
20. Sun, J., Lang, J., Fujita, H., Li, H.: Imbalanced enterprise credit evaluation with DTE-SBD: decision tree ensemble based on SMOTE and bagging with differentiated sampling rates. Inf. Sci. 425, 76–91 (2018)
21. Arora, N., Kaur, P.D.: A Bolasso based consistent feature selection enabled random forest classification algorithm: an application to credit risk assessment. Appl. Soft Comput. 86 (2020). Article No. 105936
22. Xia, Y., Liu, C., Li, Y., Liu, N.: A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring. Expert Syst. Appl. 78, 225–241 (2017)
23. Chang, Y., Chang, K., Wu, G.: Application of eXtreme gradient boosting trees in the construction of credit risk assessment models for financial institutions. Appl. Soft Comput. 73, 914–920 (2018)
24. Feng, X., Xiao, Z., Zhong, B., Qiu, J., Dong, Y.: Dynamic ensemble classification for credit scoring using soft probability. Appl. Soft Comput. 65, 139–151 (2018)
25. Junior, L.M., Nardini, F.M., Renso, C., Trani, R.: A novel approach to define the local region of dynamic selection techniques in imbalanced credit scoring problems. Expert Syst. Appl. 152 (2020). Article No. 113351
26. Salunkhe, U.R., Mali, S.N.: A hybrid approach for class imbalance problem in customer churn prediction: a novel extension to undersampling. Intell. Syst. Appl. 5, 71–81 (2018)
27. Ko, A.H.R., Sabourin, R., Britto, A.S., et al.: From dynamic classifier selection to dynamic ensemble selection. Pattern Recogn. 41(5), 1718–1731 (2008)

Credit Rating Based on Hybrid Sampling and Dynamic Ensemble

347

28. Sabourin, M., Mitiche, A., Thomas, D.: Classifier combination for hand-printed digit recognition. In: ICDAR, pp. 163–166 (1993) 29. Woods, K., Kegelmeyer, W.P., Bowyer, K.: Combination of multiple classifiers using local accuracy estimates. IEEE Trans. Pattern Anal. Mach. Intell. 19(4), 405–410 (1997) 30. Cruz, R.M.O., Sabourin, R., Cavalcanti, G.D.C.: META-DES: a dynamic ensemble selection framework using meta-learning. Pattern Recogn. 48(5), 1925–1935 (2015)

Low Infrared Emission Hybrid Frequency Selective Surface with Low-Frequency Transmission and High-Frequency Low Reflection in Microwave Region

Yiming Xu(&), Yu Yang, and Xiao Li

Brigade Engineering, University of PAP, Xi'an 710086, China
[email protected]

Abstract. In this paper, we put forward a mixed structure with the properties of low emission in the infrared region and integrated scattering-transmission behaviour in the microwave region. The hybrid structure consists mainly of three parts: a frequency selective surface (FSS), a checkerboard metasurface and a metal patches layer. The FSS displays a low-pass, high-stop characteristic: it reflects microwaves at higher frequencies and lets microwaves pass at lower frequencies. The checkerboard metasurface, backed by a reflective substrate, can minimize reflection at higher frequencies through scattering cancellation. Owing to the high-stop property of the FSS, which lets it serve as the reflective substrate, the combination of FSS and metasurface realizes broadband reflection reduction at high frequencies while the transmission window of the FSS at low frequencies is maintained. In the infrared region, the metal patches layer on the top provides high reflectivity, which enables the whole structure to achieve low infrared emission. Meanwhile, the metal patches layer used as the superstrate is transparent to microwaves, which provides the conditions for a radar-infrared compatible design. Simulation results indicate that a reflection reduction of more than 10 dB is obtained in the frequency range of 6.8–11.6 GHz and that a transmission band appears around 4 GHz. At the same time, the emissivity of the whole hybrid structure is less than 0.3 in the infrared region. Finally, experimental measurements are carried out and provide a good demonstration. We expect the proposed structure to find potential applications in multispectral stealth radomes.

1 Introduction

With the continuous development of detection techniques, the threat to aircraft in electronic warfare is becoming more and more serious. Radar and infrared detection systems have greatly increased the probability of a target being detected. In view of this, it is desirable that radar cross section (RCS) reduction and low infrared emission be achieved simultaneously. However, the implementations of these two properties are opposite. RCS reduction can mainly be summed up in three approaches: shaping of the target [1], loading absorbing materials [2–6] and cancellation [7–13]. At the same time, according to Kirchhoff's law, a good absorber is a good emitter, so an object with


high reflectivity can meet the low emission requirement. To be more specific, radar stealth needs high absorption and low reflection, while infrared stealth needs low absorption and high reflection. Therefore, the implementation of a material or structure achieving compatible stealth in the microwave and infrared regions simultaneously needs careful consideration. In earlier research, radar absorbing materials (RAM) were widely used for radar stealth. Because of the high emission of RAM, researchers have added an additional structure with low infrared emission as a superstrate [14, 15]. However, the infrared layer inevitably causes some deterioration of the performance of the radar absorber. The metasurface, a two-dimensional metamaterial with a thickness much smaller than the wavelength, has emerged as the times require. Because of its unique physical characteristics, the metasurface has great potential in stealth [16, 17], antennas [18, 19] and microwave devices [20, 21]. The appearance of the metasurface makes radar-infrared stealth possible. For instance, radar-infrared stealth compatible composite structures have been achieved by loading a radar-transparent metasurface with low infrared radiation on top of a broadband radar absorber [14, 15]. Furthermore, scattering cancellation has replaced absorption to achieve reflection reduction in the microwave region, while infrared emission reduction is still obtained owing to the high-reflectivity metasurface [22]. Considering that transmission properties are strongly required in practical applications, we would like to introduce them on the basis of the radar-infrared stealth compatible composite structure. In doing so, we propose a mixed structure comprising a checkerboard metasurface, a bandpass FSS and a high-reflectivity metal patches layer. The high duty-cycle metal patches layer on the top is transparent to microwaves, while its high reflectivity achieves low emission in the infrared region. In the microwave band, the bandpass FSS is used as the reflective substrate for the checkerboard metasurface, and the mixed structure reduces broadband reflection at higher frequencies. At lower frequencies, the checkerboard metal has almost no unfavorable effect on the FSS transmission, so a transmission window like that of the FSS can be formed. The simulation results show that a reflection reduction of more than 10 dB is achieved in the frequency range of 6.8–11.6 GHz and that a transmission band appears around 4 GHz. Finally, a sample was manufactured for experimental verification. According to the experimental tests, the emissivity of the proposed structure is less than 0.3 from 5 to 14 μm. The measurements agree with the simulation results, and the rationality of the design is thereby verified.

2 Results

Design and Analysis. A schematic diagram of the proposed mixed structure is illustrated in Fig. 1. The high duty-cycle metal patches layer on the top is transparent to radar, while it reflects electromagnetic waves in the infrared region to achieve low infrared emission, as shown in Fig. 1(a). For the microwave band, the main idea is based on the combination of a bandpass FSS working at low frequencies and a checkerboard metasurface achieving reflection reduction at high frequencies.


At lower frequencies, the checkerboard metasurface has only a modest impact on the forward-traveling waves, so transmission with a low insertion loss can be realized, as illustrated in Fig. 1(b). At higher frequencies, the metallic-like FSS can be regarded as a metal plate; integrating it with the checkerboard metasurface realizes the broadband reflection reduction, as illustrated in Fig. 1(c).


Fig. 1. The schematic diagram of the proposed multifunction structure with (a) low emission in the infrared region, (b) high efficient transmission at lower frequencies in the microwave region and (c) reflection reduction at higher frequencies in the microwave region.

Note that this is just a qualitative description of the mixed structure. Coupling among the multilayer structures has to be considered in the actual design process, so each layer cannot be designed independently. Considering this factor, the bandpass FSS working at low frequencies was designed first, then the checkerboard metasurface, and finally the metal patches layer. By optimizing the whole arrangement, the desired performance of the mixed structure can be obtained.

We now design the bandpass FSS. In this design, we use a metal/dielectric/metal/dielectric/metal structure to achieve the low-pass, high-stop property. The structural unit cell is illustrated in Fig. 2(a). The yellow portion is copper, which is 0.017 mm thick with a conductivity of 5.8 × 10^7 S/m. The gray portion is the dielectric layer, which consists of F4B with a dielectric constant of 2.65(1−j0.001).


The outermost metal layers use square patches of the same size; a top-down view is shown in Fig. 2(b). The in-between metal layer uses an orthogonal grid structure, as illustrated in Fig. 2(c). The final parameters of the FSS are provided after merging with the checkerboard metasurface and the metal patches layer.


Fig. 2. (a) Schematic diagram of the designed bandpass FSS including three layers of metal separated by two layers of dielectric. (b) Top down view of the top and bottom metal unit cell. (c) Top down view of the intermediate metal unit cell.

In this paper, we use an anisotropic resonant cell with a Jerusalem cruciform structure to steer the reflected electromagnetic waves away from the incident direction by means of the scattering cancellation principle. The anisotropic resonator produces different phase responses under different polarization modes, and therefore the needed phase difference of 180° ± 37° can be obtained. The proposed resonator is illustrated in Fig. 3. The yellow portion is copper, which is 0.017 mm thick with a conductivity of 5.8 × 10^7 S/m. The blue portion stands for the dielectric layer, which is composed of FR4 with a dielectric constant of 4.3(1−j0.025).


Fig. 3. Schematic view of the resonant cell of Jerusalem cruciform structure.


The 8 × 8 metal patches etched on the F4B dielectric layer are illustrated in Fig. 4(a). The yellow portion is copper and the grey portion symbolizes the dielectric layer composed of F4B; the dielectric constants of the materials are consistent with those mentioned above. The metal patches layer, the Jerusalem cross resonator and the bandpass FSS are then combined in the hybrid design: each 8 × 8 patch array etched on the dielectric corresponds to one Jerusalem cross resonator and one bandpass FSS. To reduce the coupling effect between the multilayer structures, a polymethacrylimide (PMI) foam spacer, 5 mm thick with a permittivity of 1.1, is loaded between the Jerusalem cross resonator and the bandpass FSS. The hybrid resonant cell is illustrated in Fig. 4(b).


Fig. 4. (a) Schematic view of metal patches layer. (b) Schematic view of hybrid resonant cell.

We optimized the mixed resonant cell by applying the genetic algorithm in CST Microwave Studio. In the end, the geometry parameters are: l = 8.49 mm, m = 8.32 mm, n = 2.55 mm, a = 3.06 mm, b = 1.61 mm, c = 3.14 mm, d = 1.57 mm, w = 0.34 mm, j = 0.96 mm, k = 0.11 mm, t1 = t2 = 2.06 mm, t3 = 0.1 mm, t4 = 0.70 mm, t5 = 5 mm.

Simulation and Discussion. The simulated reflection and transmission spectra of the hybrid resonator for TE and TM polarizations are illustrated in Fig. 5. It can be seen that the reflection coefficients are all higher than −1 dB in 6.5–16 GHz, while the transmission coefficients are higher than −3 dB from 3.3 GHz to 5 GHz for the two polarizations. The proposed hybrid resonator thus has a broad transmission band with a sharp falling edge, and its highly efficient reflection at high frequencies facilitates the realization of out-of-band reflection reduction.


Fig. 5. S-parameters under normal incidence for (a) TE and (b) TM polarizations.

The reflection phases of the hybrid resonator under a normally incident wave for TE and TM polarizations are illustrated in Fig. 6(a). The phase difference of 180° ± 37° between the reflected waves is realized in the frequency range of 6.5–12.5 GHz, as illustrated in Fig. 6(b). The transmission phases of the mixed resonator for TE and TM polarizations are illustrated in Fig. 6(c), and the corresponding phase difference is illustrated in Fig. 6(d).
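As a quick numerical check (a standard chessboard scattering-cancellation estimate, not code from the paper), the 180° ± 37° phase window indeed corresponds to roughly 10 dB of specular reduction for two equal-amplitude cells:

```python
# For two equal-amplitude reflecting cells, the specular field is proportional
# to (1 + exp(j*dphi))/2; the band edges 180 +/- 37 deg give about -10 dB.
import numpy as np

for dphi_deg in (143.0, 170.0, 217.0):
    dphi = np.radians(dphi_deg)
    reduction_db = 20 * np.log10(abs(1 + np.exp(1j * dphi)) / 2)
    print(f"{dphi_deg:5.1f} deg -> {reduction_db:6.1f} dB")
# 143.0 and 217.0 deg give about -10 dB; 170.0 deg gives about -21 dB
```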


Fig. 6. (a) The phases of the reflected EM waves for TE and TM polarizations. (b) Phase difference of the reflected EM waves. (c) The phases of the transmitted EM waves for TE and TM polarizations. (d) Phase difference of the transmitted EM waves. Insets represent the polarization direction of the hybrid resonant cell.


We can clearly see that the phase difference of the transmitted waves is small, so highly efficient transmission can be realized in the low-frequency band. We define the hybrid resonators with the two different orientations as "0" and "1", respectively. By arranging the two resonators in order, destructive interference and an effective reflection reduction can be realized for incident waves in the specular direction. In this paper, we used the 0101…/1010… coding sequence to arrange the resonators and form a chessboard-like configuration within the whole hybrid structure. The checkerboard metasurface serving as the intermediate layer is illustrated in Fig. 7. The whole mixed structure consists of 8 × 8 super cells, and each super cell consists of 5 × 5 hybrid resonators. The transmission and reflection spectra of the overall hybrid structure have been simulated using CST Microwave Studio; the outcomes are displayed in Fig. 8. The red full line stands for the simulated reflection coefficient under normal incidence, as shown in Fig. 8(a). We can see that a reflection reduction of more than 10 dB is obtained in the frequency band of 7.3–11.6 GHz, which roughly conforms to the bandwidth of 6.5–12.5 GHz predicted from the 180° ± 37° phase difference between the mixed resonators calculated in the previous section, as displayed in Fig. 6(b). Additionally, the transmission peak appears around 4 GHz, as illustrated by the red full line in Fig. 8(b).


Fig. 7. Schematic view of the checkerboard metasurface within the whole hybrid structure.
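As a rough illustration (not the authors' code), the 0101…/1010… super-cell coding of Fig. 7 can be generated as follows; the 8 × 8 super cells of 5 × 5 resonators expand to the 40 × 40 basic units of the fabricated prototype.

```python
# 0101.../1010... chessboard coding: an 8 x 8 grid of super cells, each
# expanded into 5 x 5 identical hybrid resonators.
import numpy as np

SUPER, SUB = 8, 5
coding = np.indices((SUPER, SUPER)).sum(axis=0) % 2        # 0/1 chessboard
layout = np.kron(coding, np.ones((SUB, SUB), dtype=int))   # expand super cells
print(layout.shape)  # (40, 40) resonators in total
```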


Fig. 8. (a) Simulated reflection and (b) transmission spectra of the designed hybrid structure under normal incidence.


To further validate the reflection reduction performance, the far field of the overall structure was simulated using the CST software. The reflected lobes of the overall mixed structure under a normally incident wave in the frequency band of 7–12 GHz are illustrated in Fig. 9. As expected, one can see that the incident waves are reflected into four main directions. Under this condition, the energy reflected in the specular direction (along the normal of the overall structure) is decreased, and therefore the specular reflection is reduced. We further validate the low-frequency transmission feature by employing a 2 × 2 patch array antenna working in the passband. The entire mixed structure is placed 50 mm from the patch array antenna. The transmitted lobes of the mixed structure illuminated by the patch array antenna at 4 GHz are shown in Fig. 10(a). As illustrated in Fig. 10(b), the two-dimensional (2D) radiation pattern of the antenna with the mixed structure is similar to that of the single patch array at 4 GHz. The gain boost may be due to the fact that the aperture of the mixed structure is larger than that of the patch antenna array. In the low frequency range, because the phases of the transmitted waves are almost constant, the near-specular transmission direction remains essentially unchanged.


Fig. 9. Reflected lobes from mixed structure under normal incidence at 7, 8, 9, 10, 11 and 12 GHz.



Fig. 10. (a) Transmitted lobes of the mixed structure illuminated by a 4 GHz patch array antenna. (b) 4 GHz radiation pattern of the patch array antenna with and without the hybrid structure on the xoz plane.

To confirm the above design and simulation results, a prototype of the whole hybrid structure was fabricated using printed circuit board technology. A specimen containing 40 × 40 basic units with a total size of 393.6 mm × 393.6 mm is shown in Fig. 11.

Fig. 11. Optical image of assembled mixed structure with multi-layer cascade structure


The measurement system is based on a vector network analyzer with horn antennas operating in the 2–8, 8–12 and 12–14 GHz bands. The performance of the mixed structure was assessed through the reflection and transmission coefficients measured with the horn antennas. As displayed in Fig. 12(a), the blue dotted line represents the measured results, where a bandwidth with 10 dB reflection reduction is achieved from 9.5 to 12 GHz. As presented in Fig. 12(b), an apparent passband, shown by the blue dotted line, appears around 4 GHz. The difference between the measured and simulated results may be due to the following factors:


Fig. 12. Comparison of simulation and experiment of (a) reflection and (b) transmission.

(1) discrepancies between the dielectric constants of the materials used in fabricating the prototype and those used in the simulations; and (2) air gaps between the multilayer structures.

The low infrared emission of the overall structure is mainly due to the high duty-cycle metal patches layer on the top. The infrared emissivity of the superstrate can be calculated by the formula

$$e = e_m f_m + e_d (1 - f_m) \qquad (1)$$

where e is the emissivity of the metal patches layer, the subscripts m and d indicate the metal and the F4B dielectric layer, respectively, and f_m represents the area ratio of the metal in the superstrate. The emissivities of copper and F4B are about 0.09 and 0.955, respectively [23, 24]. In view of these values, the estimated emissivity of the proposed structure is around 0.25. The emissivity of the overall hybrid structure is difficult to obtain by simulation on account of the high hardware requirements, so we only give experimental results.
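As a minimal sketch of the estimate from Eq. (1) (the metal area ratio f_m below is an assumed value, chosen to reproduce the stated estimate of about 0.25):

```python
# Emissivity estimate per Eq. (1); f_m is an assumed duty-cycle value.
e_m, e_d = 0.09, 0.955   # emissivities of copper and F4B [23, 24]
f_m = 0.82               # assumed metal area ratio of the superstrate

e = e_m * f_m + e_d * (1 - f_m)   # Eq. (1)
print(round(e, 2))  # ~0.25, consistent with the measured < 0.3 in 5-14 um
```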


Based on Kirchhoff's law, the infrared emissivity can be obtained by measuring the reflection coefficient of the experimental sample. The measuring system is based on a Vertex 70 spectrometer. The measured results in the infrared region are presented in Fig. 13. As expected, the emissivity values of the three randomly chosen regions are all less than 0.3 in the 5–14 μm band, which satisfies our original expectations. The measured values are higher than the theoretical calculation mainly because oxide forms on the surface of the copper, which raises its emissivity.


Fig. 13. Emissivity spectra in the infrared region for the whole hybrid structure.

3 Conclusion

In this work, a mixed structure was put forward to realize multifunction integration. The whole structure consists of three parts: a bandpass FSS, a checkerboard metasurface and a metal patches layer. In the infrared region, the proposed structure achieves low emission owing to the intrinsic high reflection of the metal patches layer on the top. In the microwave region, the integration of a checkerboard metasurface and a bandpass FSS achieves low-frequency transmission and reflection reduction at high frequencies. The mixed structure realizes a broadband reflection reduction of more than 10 dB in 7.3–11.6 GHz, together with high transmission around 4 GHz. The emissivity is less than 0.3 in the band of 5–14 μm. We believe these results have application prospects in practical scenarios such as multispectral stealth radomes.

Acknowledgements. The authors would like to thank the National Natural Science Foundation of China (Grant Nos. 61501497, 61471388, 61501502, 61331005) and the China Postdoctoral Science Foundation (Grant No. 2015M572561) for their support.


References

1. Knott, E.F.: Radar Cross Section. SciTech Publishing, Raleigh (2004)
2. Costa, F., Monorchio, A.: A frequency selective radome with wideband absorbing properties. IEEE Trans. Antennas Propag. 60(6), 2740–2747 (2012)
3. Shang, Y., Shen, Z., Xiao, S.: Frequency-selective rasorber based on square-loop and cross-dipole arrays. IEEE Trans. Antennas Propag. 62(11), 5581–5589 (2014)
4. Wang, B.X., Wang, L.L., Wang, G.Z., et al.: Tunable bandwidth of the terahertz metamaterial absorber. Opt. Commun. 325(5), 78–83 (2014)
5. Li, B., Shen, Z.: Three-dimensional dual-polarized frequency selective structure with wide out-of-band rejection. IEEE Trans. Antennas Propag. 62(1), 130–137 (2013)
6. Shen, Y., Zhang, J., Pang, Y., Li, Y., Zheng, Q., Wang, J., Ma, H., Qu, S.: Broadband reflectionless metamaterials with customizable absorption–transmission-integrated performance. Appl. Phys. A 123(8), 1–8 (2017). https://doi.org/10.1007/s00339-017-1141-9
7. Paquay, M., Iriarte, J.C., Ederra, I., et al.: Thin AMC structure for radar cross-section reduction. IEEE Trans. Antennas Propag. 55(12), 3630–3638 (2007)
8. Cos, M.E.D., Alvarezopez, Y., Andres, F.L.H.: A novel approach for RCS reduction using a combination of artificial magnetic conductors. Progr. Electromagn. Res. 107(4), 147–159 (2010)
9. Iriarte, J.C., Paquay, M., Ederra, I., et al.: RCS reduction in a chessboard-like structure using AMC cells. Proc. EUCAP 11, 1–4 (2007)
10. Zhang, Y., Mittra, R., Wang, B.Z., et al.: AMCs for ultra-thin and broadband RAM design. Electron. Lett. 45(10), 484–485 (2009)
11. Galarregui, J.C.I., Pereda, A.T., Falcon, J.L.M.D., et al.: Broadband radar cross-section reduction using AMC technology. IEEE Trans. Antennas Propag. 61(12), 6136–6143 (2013)
12. Edalati, A., Sarabandi, K.: Wideband, wide angle, polarization independent RCS reduction using nonabsorptive miniaturized-element frequency selective surfaces. IEEE Trans. Antennas Propag. 62(2), 747–754 (2014)
13. Simms, S., Fusco, V.: Chessboard reflector for RCS reduction. Electron. Lett. 44(4), 316–317 (2008)
14. Liu, L., Gong, R., Cheng, Y., et al.: Emittance of a radar absorber coated with an infrared layer in the 3–5 micron window. Opt. Express 13(25), 10382–10391 (2005)
15. Wang, Z., Cheng, Y., Nie, Y., et al.: Design and realization of one-dimensional double heterostructure photonic crystals for infrared-radar stealth-compatible materials applications. J. Appl. Phys. 116(5), 054905 (2014)
16. Huang, Y., Pu, M., Zhao, Z., et al.: Broadband metamaterials as an "invisible" radiative cooling coat. Opt. Commun. 407, 204–207 (2018)
17. Zhang, J., Mei, Z.L., Zhang, W.R., et al.: An ultrathin directional carpet cloak based on generalized Snell's law. Appl. Phys. Lett. 103(15), 151115 (2013)
18. Rahmati, E., Ahmadi-Boroujeni, M.: Improving the efficiency and directivity of THz photoconductive antennas by using a defective photonic crystal substrate. Opt. Commun. 412, 74–79 (2018)
19. Nguyen, T.K., Kim, W.T., Kang, B.J., et al.: Photoconductive dipole antennas for efficient terahertz receiver. Opt. Commun. 383, 50–56 (2017)
20. Esfandyarpour, M., Garnett, E.C., Cui, Y., et al.: Metamaterial mirrors in optoelectronic devices. Nat. Nanotech. 9(7), 542–547 (2014)
21. Ni, X., Ishii, S., Kildishev, A.V., et al.: Ultra-thin, planar, Babinet-inverted plasmonic metalenses. Light: Sci. Appl. 2(4), e72 (2013)


22. Pang, Y., Li, Y., Yan, M., et al.: Hybrid metasurfaces for microwave reflection and infrared emission reduction. Opt. Express 26(9), 11950–11958 (2018)
23. Tian, H., Liu, H., Cheng, H.: A thin radar-infrared stealth-compatible structure: design, fabrication, and characterization. Chin. Phys. B 23(2), 333–338 (2014)
24. Zhong, S., Jiang, W., Xu, P., et al.: A radar-infrared bi-stealth structure based on metasurfaces. Appl. Phys. Lett. 110(6), 063502 (2017)

Rapid Detection of Crowd Abnormal Behavior Based on the Hierarchical Thinking

Xiao Li1, Yu Yang1(&), Yiming Xu1, Linyang Li2, and Chao Wang3

1 College of Information Engineering, Engineering University of PAP, Xi'an 710086, China
[email protected], [email protected]
2 Department of Basic Course, Engineering University of PAP, Xi'an 710086, China
[email protected]
3 College of Password Engineering, Engineering University of PAP, Xi'an 710086, China
[email protected]

Abstract. Since current algorithms for crowd abnormal behavior detection are not suitable across different scenarios, a hierarchical detection algorithm is proposed. First, the target area is extracted using a Gaussian mixture model. According to the pixel ratio, the crowd is classified into one of three types: individualism behaviors, social interaction behaviors or leadership-led behaviors. The video is then divided according to the category of the crowd; for crowds with individualism, HOG-LBP features are calculated to judge abnormal appearance. For the other categories, the trajectory and entropy are calculated in the divided image to obtain the speed, the deviation from the trajectory and the variance of the trajectory, which are then compared with the corresponding thresholds to determine whether an abnormality occurs. The value of the entropy and its first-order function are used to judge the extent of the abnormality. When the entropy does not exceed 3/2 of the threshold, the optical flow is extracted to calculate the CMI, and its peak value is used to detect anomalies. Experiments verify that our algorithm is rapid and accurate in different scenarios.

1 Introduction

Abnormal crowd behavior has been a hotspot in video and image processing research in recent years. The detection of crowd abnormal behavior faces problems such as inconsistent definitions of abnormality, difficulty of real-time detection, and serious occlusion. Feature extraction is the key point, and clustering is used to separate abnormal features from normal features; commonly used classifiers include K-means clustering [1], SVM [2] and neural networks [3]. Among methods using only traditional models, Wei [4] introduces local optical flow, proposes a crowd abnormal behavior detection algorithm based on the magnification of average kinetic energy change, and improves the crowd density detection method based on pixel statistics and texture features to achieve crowd density rating. Liu [5] uses SVM to detect the abnormal behavior of the group as a binary classification


problem, while Bertini [6] uses a clustering method and finally separates normal and abnormal events by a threshold. Zhou [7] combines the gray value and the distribution of the optical flow field to extract the motion area, and then extracts local histogram-of-oriented-gradients (HOG) and local histogram-of-optical-flow (HOF) features. At present, there are many algorithms that combine traditional models with deep learning models. Luo [8] proposed a crowd abnormal behavior recognition algorithm based on YOLO_v3 and sparse optical flow, detecting small group anomalies to provide sufficient time for group anomaly warning and corresponding emergency measures. Chen [9] proposed a Bayesian fusion-based spatio-temporal stream abnormal behavior detection model: for the two reconstruction errors, Bayesian criterion fusion of the reconstruction errors is used to detect anomalies. Wilbert [10] proposed a low computational cost approach based on moved-pixel density modelling. Hamidreza [11] proposed two descriptors, one based on the direction and size of short trajectories, and the second used to locate the abnormal behavior area sequence in the video.

2 Proposed Method

This article defines anomalies as "abrupt changes in speed", "abrupt changes in entropy value" and "abnormalities in related parameter values". Our proposed algorithm consists of three main parts: determining the target area, feature extraction, and anomaly determination. The overall flow chart of the algorithm is shown in Fig. 1.

Fig. 1. Algorithm overall flow chart

2.1 Determine the Target Area

The first step is to initially extract the target area. In this paper, we choose a modified GMM to improve the accuracy and effectiveness of target area detection. The probability density function of the current pixel is expressed as

$$P(f_t) = \sum_{i=1}^{k} \omega_{i,t} \, \eta\left(X_t, \mu_{i,t}, \Sigma_{i,t}\right) \cdot \frac{1}{c + 10\, s_t(x, y)} \qquad (1)$$

where $\omega_{i,t}$ is the weight, $\mu_{i,t}$ is the mean, the covariance matrix is $\Sigma_{i,t} = \sigma_{i,t}^2 I$ with $\sum_{i=1}^{k} \omega_{i,t} = 1$, and $s_t(x, y)$ is the cumulative value of a counter that measures the change of pixels. We introduce a sensitivity parameter S to reduce the dependence of the traditional GMM on the learning rate. Specifically, a matching parameter L is used to determine whether the j-th pixel $x_{jn}$ of the n-th frame matches an existing Gaussian component, using $\max(3\sigma_{ij}, S)$ instead of the threshold $3\sigma_{ij}$:

$$L = \mathop{\arg\min}_{\forall i:\; \lVert x_{jn} - \mu_{ij} \rVert \le \max(3\sigma_{ij},\, S)} \omega_{ij}\, \eta\left(x_{jn}, \mu_{ij}, \sigma_{ij}\right) \qquad (2)$$
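A minimal single-pixel sketch of the matching rule in Eq. (2) (not the authors' implementation; the parameter values are illustrative):

```python
# Modified-GMM matching for one grayscale pixel: component i matches only if
# the distance to its mean is within max(3*sigma_i, S), per Eq. (2).
import numpy as np

S, ALPHA = 20.0, 0.01                  # sensitivity parameter, learning rate
weights = np.array([0.6, 0.3, 0.1])    # per-pixel mixture state (illustrative)
means = np.array([100.0, 150.0, 30.0])
sigmas = np.array([8.0, 12.0, 15.0])

def is_background(x):
    global weights
    matched = np.abs(x - means) <= np.maximum(3.0 * sigmas, S)
    if not matched.any():
        # No match: replace the least probable component with one centred at x.
        i = int(np.argmin(weights))
        means[i], sigmas[i], weights[i] = x, 30.0, 0.05
        weights /= weights.sum()
        return False
    i = int(np.argmax(matched))         # first matching component
    means[i] += ALPHA * (x - means[i])  # simplified online update
    weights[:] = (1 - ALPHA) * weights
    weights[i] += ALPHA
    return weights[i] > 0.25            # background if the component dominates

print(is_background(103.0))  # True: matches the dominant first component
```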

2.2 Feature Extraction

HOG-LBP Feature Extraction. When the occlusion of the crowd is not obvious, appearance features should be extracted; in this paper, the HOG-LBP feature is selected. We create a region matrix in the moving target area to determine the size of the block. The region matrix counts how many times each window size fits into the moving target area, that is, the entry in the i-th row and j-th column indicates the number of regions of width i and length j. The maximum entry of the matrix gives the size of the divided cells, called a unit block. In each unit block, the gradient value of each pixel is calculated and a weighted vote is cast into one of the 9 orientation histograms around the pixel. LBP is used as the texture descriptor. The HOG and LBP vectors are then combined to describe the appearance of the crowd. Figure 2 shows the region matrix (a) and the proper cell size (b); the red circle marks the maximum entry.
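A minimal sketch, assuming scikit-image, of combining the two descriptors (the fixed cell size below stands in for the region-matrix choice described above):

```python
# Concatenate a 9-bin HOG descriptor with a uniform-LBP histogram.
import numpy as np
from skimage.feature import hog, local_binary_pattern

def hog_lbp_descriptor(gray_region, cell=(8, 8)):
    hog_vec = hog(gray_region, orientations=9, pixels_per_cell=cell,
                  cells_per_block=(2, 2), feature_vector=True)
    lbp = local_binary_pattern(gray_region, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.concatenate([hog_vec, lbp_hist])

region = np.random.randint(0, 256, (64, 64)).astype(float)
print(hog_lbp_descriptor(region).shape)
```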


Fig. 2. (a) Region-size matrix, (b) proper cell size

Extract Trajectory. The trajectory is selected in this paper to reduce the detection trouble caused by individual occlusion. The trajectory of the j-th object is expressed as

$$C_j = \left\{ \left( x_j^i, y_j^i \right) \right\}_{i=1}^{M} \qquad (3)$$

where $(x_j^i, y_j^i)$ is the predicted position of the j-th object in the i-th frame of the coordinate system, and M is the length of the trajectory, i.e. the number of trajectory points. In the training set, the normal movement of the crowd is taken as a reference to obtain the speed standard of normal people. The speed characteristic is calculated as

$$V_j = \frac{1}{M} \sum_{k=2}^{M} \left\lVert \left( x_j^k - x_j^{k-1},\; y_j^k - y_j^{k-1} \right) \right\rVert \qquad (4)$$
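A minimal sketch of the speed feature of Eq. (4) (illustrative, not the authors' code):

```python
# Mean step length along a trajectory of M points.
import numpy as np

def speed_feature(traj):
    """traj: (M, 2) array of (x, y) trajectory points."""
    steps = np.diff(traj, axis=0)    # (x^k - x^{k-1}, y^k - y^{k-1})
    return np.linalg.norm(steps, axis=1).sum() / len(traj)

traj = np.array([[0, 0], [1, 1], [2, 2], [4, 2]], dtype=float)
print(speed_feature(traj))  # compared against a threshold learned on normal crowds
```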

The trajectory deviation of the j-th object is taken as the second extracted feature, to avoid trajectories generated by the zigzag motion of the crowd being regarded as abnormal motion. The trajectory deviation $S_j$ is calculated from the start point $(x_j^0, y_j^0)$ and the end point $(x_j^M, y_j^M)$ of the trajectory.

$$d(x_i, x_j) = \begin{cases} \sqrt{\sum_{v=1}^{m} \left( x_i[v] - x_j[v] \right)^2} & i \neq j \\ 0 & i = j \end{cases} \qquad (5)$$

D = {d_ij | d_ij = d(x_i, x_j), 1 ≤ i ≤ n, 1 ≤ j ≤ n, i ≠ j} is the set of Euclidean distances between arbitrary samples in the high-dimensional space. The maximum distance between samples in the high-dimensional space is denoted d_max, the nearest distance d_min, and the average distance d_mean. The normalization of the Euclidean distance between samples in the high-dimensional space is expressed as:

$$\tilde{d}_{ij} = \frac{d_{ij} - d_{mean}}{d_{max} - d_{min}} \qquad (6)$$

In order to compare distances under the same standard, we normalize the Euclidean distance matrix EDMH between data points in the high-dimensional space and the Euclidean distance matrix EDML between data points in the low-dimensional space, according to Eqs. 5 and 6. Then we define a dimensionality reduction evaluation index, named Distance Preserving Index (DPI):

$$DPI = \frac{\sum_i \sum_j \left( EDMH(i,j) - EDML(i,j) \right)^2}{\sum_i \sum_j EDMH(i,j)^2} \qquad (7)$$

The value of DPI lies between 0 and 1. The smaller the value, the better the distance information is preserved in the dimensionality reduction result, and so the better the dimensionality reduction effect.
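A minimal sketch of the DPI computation from Eqs. 5–7; the exact normalizing denominator of Eq. 7 is an assumption of this sketch:

```python
# Compare normalized pairwise-distance matrices before/after reduction;
# a smaller DPI means distances are better preserved.
import numpy as np
from scipy.spatial.distance import pdist, squareform

def normalized_edm(X):
    d = pdist(X)                                 # pairwise distances, Eq. (5)
    d = (d - d.mean()) / (d.max() - d.min())     # normalization, Eq. (6)
    return squareform(d)

def dpi(X_high, X_low):
    H, L = normalized_edm(X_high), normalized_edm(X_low)
    return ((H - L) ** 2).sum() / (H ** 2).sum() # Eq. (7); denominator assumed

X = np.random.rand(100, 1000)
Y = X @ np.random.randn(1000, 50) / np.sqrt(50)  # a plain random projection
print(dpi(X, Y))
```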

4 Experimental Results and Analysis

Two datasets were used in the experiment. One is the BBC news dataset: a total of 2,225 text files on five topical areas published on the BBC news website. The text documents were arranged into folders containing five labels: business, entertainment, politics, sports, and technology. The sample sizes were 510, 386, 417, 511 and 401, respectively. The other dataset consists of data sourced from 20 newsgroups; it is a collection of approximately 20,000 newsgroup documents, partitioned evenly across the 20 different newsgroups. We selected 3000 of them as the experimental dataset, whereby the sample sizes of the 20 groups were 140, 167, 159, 148, 152, 157, 152, 161, 133, 166, 166, 166, 132, 165, 162, 168, 134, 156, 122 and 94, respectively. Following tokenization, stop-word removal and TF-IDF, the feature dimension of the BBC news dataset is 11,227 and that of the 20-newsgroups dataset is 130,107. In a comparative study of Laplacian Eigenmaps, Random Projection and SRP, we used these algorithms to reduce the dimensionality of the high-dimensional text feature vectors, and then applied spectral clustering and CFDP in order to compare their low-dimensional data clustering. In this experiment, we used three clustering indices, ARI (Adjusted Rand Index), NMI (Normalized Mutual Information) and FMI (Fowlkes-Mallows Index), to evaluate the performance of the clustering algorithms.

It should be noted that spectral clustering selects the eigenvectors corresponding to the K maximum eigenvalues by decomposing the Laplace matrix to form a low-dimensional space corresponding to the original data. In principle, spectral clustering can process high-dimensional data; however, our dataset dimensions are greater than 10,000, and if the dimensions of the BBC news (11,227) and 20-newsgroups (130,107) datasets are reduced directly to about 100 dimensions, the resulting clustering effect is poor. Therefore, before applying spectral clustering, the Laplacian Eigenmaps, Random Projection and SRP dimensionality reduction methods are also used to reduce the


dimensionality of high-dimensional data. Table 1 shows the clustering effects of each method.

Table 1. ARI, NMI and FMI of each clustering algorithm.

                                 BBC      20-newsgroups
ARI   LE+spectral clustering     0.0037   0.0260
      RP+spectral clustering     0.2351   0.1016
      LE+CFDP                    0.1093   0.0115
      RP+CFDP                    0.2293   0.0917
      SRP+spectral clustering    0.7319   0.8465
      SRP+CFDP                   0.7061   0.9351
NMI   LE+spectral clustering     0.0588   0.1550
      RP+spectral clustering     0.2827   0.2905
      LE+CFDP                    0.2658   0.0672
      RP+CFDP                    0.3004   0.2639
      SRP+spectral clustering    0.7843   0.9585
      SRP+CFDP                   0.7719   0.9762
FMI   LE+spectral clustering     0.4350   0.1478
      RP+spectral clustering     0.3907   0.1503
      LE+CFDP                    0.4241   0.1687
      RP+CFDP                    0.3965   0.1502
      SRP+spectral clustering    0.7862   0.8630
      SRP+CFDP                   0.7766   0.9395

It can be seen from Table 1 that high-dimensional text data clusters better following SRP dimensionality reduction: the evaluation indices of spectral clustering and CFDP after SRP are higher than those after Random Projection and Laplacian Eigenmaps. Figure 4 shows 2D and 3D spectral clustering of the BBC dataset following dimensionality reduction by SRP. Figure 5 shows 2D and 3D spectral clustering of the 20-newsgroups dataset following dimensionality reduction by SRP. Figure 6 shows 2D, 3D and decision graphs of CFDP of the BBC dataset following dimensionality reduction by SRP. Figure 7 shows 2D, 3D and decision graphs of CFDP of the 20-newsgroups dataset following dimensionality reduction by SRP.

Fig. 4. The spectral clustering of the BBC dataset following dimensionality reduction by SRP.


Fig. 5. The spectral clustering of the 20-newsgroups dataset following dimensionality reduction by SRP.

Fig. 6. The CFDP of the BBC dataset following dimensionality reduction by SRP.

Fig. 7. The CFDP of 20-newsgroups dataset following dimensionality reduction by SRP.

The DPI of Random Projection, Laplacian Eigenmaps and SRP were calculated in accordance with the proposed dimensionality reduction evaluation index. A smaller DPI value indicates that the dimensionality reduction method performs better in terms of distance retention. The results of our comparative study are shown in Table 2. For both datasets, the indices calculated for the SRP method are smaller than those calculated for Random Projection and Laplacian Eigenmaps, an indication that SRP retains distance information after dimension reduction better than the Random Projection and Laplacian Eigenmaps methods.

Table 2. DPI of RP, LE and SRP.

                              BBC      20-newsgroups
Random Projection             0.0031   0.0062
Laplacian Eigenmaps           0.0412   0.0437
Stacked-Random Projection     0.0028   0.0025

5 Conclusions

This study proposes a Stacked-Random Projection dimensionality reduction framework based on deep networks. In our experiment we used Random Projection, Stacked-Random Projection and Laplacian Eigenmaps to reduce the dimensionality of high-dimensional data, and applied spectral clustering and fast search and find of density peaks clustering to compare the clustering results. It was found that Stacked-Random Projection was superior to the other dimensionality reduction methods in terms of ARI, NMI and FMI, and enables spectral clustering and fast search and find of density peaks clustering to be used to process high-dimensional data. Finally, we proposed a quantitative dimensionality reduction evaluation index (DPI) to evaluate the dimensionality reduction effects of Random Projection, Stacked-Random Projection and Laplacian Eigenmaps. Our results show that the performance of Stacked-Random Projection is better than that of the other two dimensionality reduction methods. The experimental results show that the Stacked-Random Projection framework is an effective dimensionality reduction method which can significantly improve clustering performance.


Adaptive Monitoring in Multiservice Systems

Lukáš Révay1(B) and Sanja Tomić2

1 Department of Computer Science, FEECS, VŠB - Technical University of Ostrava, 17. listopadu 15, Poruba, 708 33 Ostrava, Czech Republic
[email protected]
2 Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
[email protected]

Abstract. Understanding complex systems is a key problem in the IT industry. Such systems can be imagined as networks of micro-services communicating among themselves. After the initial setup, the configuration of such a system changes with time. It is hard to follow the changes and always know the configuration of the system, it is hard to detect errors in its processes, and, more importantly, it is difficult to evaluate in real time whether the system still works as designed. The aim of this paper is to provide a solution for evaluating the quality of the processes conducted by the system in real time. We offer a complete, fully automated solution which can be easily implemented. The paper also briefly covers the problems of data normalisation and of streaming and batch processing, which are important for model creation and further validations; this kind of topology is known as the lambda architecture. Furthermore, we provide a solution for graphical representation of the system. The system can be envisioned as a directed graph with micro-services as its nodes and communication between micro-services as its edges. We strive to graphically represent the system as a directed graph and also to represent the errors our solution detected. Therefore our work covers back-propagation together with model persistence and the tracking of changes over time.

1 Introduction

The prevalence of web services used for (sub)system integration has become a big topic nowadays [16]. All kinds of services are responsible for specific domains, and data is sent to these services in different formats and over different protocols. To avoid the direct-integration problems related to the facts mentioned above, an additional layer for integration has been introduced [2]. It is clear that we think about system functionalities from a service point of view. For our purposes, the services are also tightly connected with the processes the monitored system covers. This additional integration layer adds complexity and also increases the likelihood of errors. System errors are the concern of this paper.


It is hard to monitor a whole information system when an error occurs. This is caused by the complexity of systems which are highly available: with all the replicated services, it is hard to find the root cause of an error. Because we monitor a lot of services, it is necessary to define precisely whether the processes finished correctly. These problems put us on the path described in this paper. The most important thing is to gather the data from all services; when services are replicated, it is hard to find on which server, or even in which service, the process is failing. Monitoring is thus necessary for system definition and fault analysis: "While building a fault-free system of such complexity is unrealistic, it is very desirable to have effective fault detection and isolation methods in operational system management" [16]. For better understanding and modelling of the complex system it is necessary to mine data. In this work we utilize a Markov Chain model [10] on top of which we do predictions and validations: we compare process traces with predictions computed from the model. During model creation we know the input of each process and the expected result activity when the whole chain ends, so we are able to label events. Because we know the inputs, the labels and the expected outputs exactly, this machine learning is clearly supervised [11].

2 State of the Art

Nowadays, when systems become more complex and the complexity is still increasing (Fig. 1), it is important to monitor and check their functionality. Complexity decreases system understanding, especially in error scenarios which are not covered during analysis, implementation and testing. The most important question is how to find out that an erroneous scenario has happened. To be able to make such decisions it is necessary to understand the systems. It has also been shown that users (managers, analysts, ...) of systems overestimate their knowledge of the processes [14], and this overestimation tends to lead to bad conclusions. It is not possible to avoid all of these problems, but at least it is possible to understand what the processes are doing, and a variety of techniques can make this understandable for users. There are already techniques like BPMN [7], Petri Nets [3] or Neural Networks [12] which are used for process discovery. Unfortunately, it is hard to visualise a neural network in a way that fully describes the system; business processes are visualised mainly with BPMN or Petri Nets, and both methods are usable for process visualisation. Visualisation is the last step of process discovery utilisation. Each visualisation is of course persisted, for example as graph data structures, data tables or neural networks. These data structures are then used for further investigation, normalisation, comparison with new versions of the model, and so on.


What is also important nowadays is the tight coupling between process mining and process discovery [7, 14]. To improve system functionality and effectiveness, mining brings additional insight; it utilizes the models created in the discovery phase. Use cases like system monitoring and reverse engineering are strongly connected with human interaction and work. All the events gathered via monitoring and displayed to a human via graphs still make up only a part of the picture; to improve decision making (how to react to some system behaviour) and the understanding of the system, discovery and mining of processes are crucial.

3 Complexity of System

It is hard to fully understand the processes a system takes care of. These are composed of services which together fulfill the needs the system was created for, and it is hard to remember all the service interactions. For this purpose we utilize modelling of complex systems [9]. All the interconnected parts of a system define its behaviour, and with an increasing number of services, system complexity rises [6]. Complexity shows itself in cases of system failure, when everyone is searching for the root cause of the error. This negative outcome can sometimes be foreseen before it happens: when we look at the period before a system failed, it can be seen in many cases that certain patterns related to the failure repeat.

Fig. 1. Representation of 100% of call traces in a complex system

Our purpose in this paper is to search for such patterns based on the business processes a system covers. Common symptoms like a slowly increasing number of erroneous processes can suggest further problems, which can be caused, for example, by a decreasing amount of resources (such as connection pools). All necessary resources are important for the services which form the backbone of today's complex systems.

3.1 Processes, Services, Activities and Events

Processes are composed of many steps, and the information systems we use support these processes with the services these systems are composed of. Processes mirrored into the information systems that obey and supplement them create a big interconnected network of service calls. Complete service calls can be understood as traces through the whole system, and each trace corresponds to the appropriate process the information system covers. An important point is that each invocation of a service is initiated by another service in the information system; this kind of bond is what interests us in this work. We mostly focus on request-response calls, which are the base elements for further investigation. A high-level view of service interactions provides introspection into the processes, which is the critical part of process analysis. For our work it is important that each process is composed of activities, and from the monitoring point of view activities are attached to processes. Because we monitor services for the events they invoke (Fig. 2), it can be stated that an activity refers to its specific event, which is caught and grouped as request and response. This makes the sending and monitoring of events important for further investigation.

Fig. 2. Services relations

4 Process Discovery

Process-aware information systems need to be configured based on process models specifying the order in which process steps are to be executed. Creating such models is a complex and time-consuming task for which different approaches exist. The most traditional one is to analyze and design the processes explicitly, making use of a business process modeling tool. However, this approach has often resulted in discrepancies between the actual business processes and the ones perceived by designers [13]; therefore, very often, the initial design of a process model is incomplete, subjective, and at too high a level. Instead of starting with an explicit process design, process discovery aims at extracting process knowledge from "process execution logs". When we think about process discovery in this context, we need to clarify that our aim is the preparation of the whole process chain based on service calls. Each service call can be tracked and used for further process definition. As mentioned in the previous chapter, services, which can be understood as process building blocks, are used for the process definition; from our perspective it does not matter whether the service is a micro-service or, in some cases we can also cover, a method call (monolithic systems). It is hard to understand a whole system and all the processes it covers, and this work primarily brings insight into the problems linked with the process overview. This is our primary aim, because it is necessary to see the current system status and all the services which define each process the system covers. Process discovery is possible when all the necessary services log appropriate data; based on the data from the logs we are able to connect services and fully construct the call pipeline which defines the process.

4.1 Terminology

While a lot of process discovery concepts are taken from graph theory, process discovery uses slightly different terminology. An event is a single occurrence within a process, and it has the following attributes1 (see the sketch after this list):

Process ID - It is the main ID which connects Activities into one process. Activity - Name of a single activity in the process. Activity instance ID - ID of a single instance of a activity. Timestamp - Timestamp of the event.

In process discovery there are several ways to represent the activity network, such as Petri nets, BPMNs (Business Process Model and Notation), etc. We opted for a process map. The process map (Fig. 3) is simply a network of activities, with addition of ghost activities, start and end. The edges of the network are directed and are connecting activities which happen after each other. When we order activities of a single process, we construct a trace of the said process. 1

There are more attributes, only the relevant ones are mentioned.

Adaptive Monitoring in Multiservice Systems

407

Fig. 3. Traces terminology

4.2

Events Gathering and Distributing

Our first technical problem is to get data. Because we focus mostly on already existing systems the only think which is important are log data with appropriate format and required information. Important think in monitored system is logging of service calls. When this condition is successfully accomplished then it is possible stream the logged data for further processing. Raw data from logs are not suitable for processing. Before model creation or predictions it is necessary preprocess data. This is better described in Sect. 4.3 about preprocessing. What is more important that the normalised logged events flowing through monitoring needs to be splitted for batches computation and for streamed real time normalised events. This kind of architecture is know as lambda architecture [8]. 4.3

Data Preprocessing and Cleaning

It is important to understand that the quality data is a key issue when we are going to mining from it. Nearly 80% of mining efforts often spend to improve the quality of data [1]. The data which is obtained from the logs may be incomplete, noisy and inconsistent. The attributes that we can look for in quality data includes accuracy, completeness, consistency, timeliness, believability, interpretability and accessibility. There is a need to preprocess data to make it have the above mentioned attributes and to make it easier to mine for knowledge. In the following subsections we discuss about the data cleaning and data reduction algorithms [13]. In this work we preprocess data from it’s raw logged form and transform them. During preprocessing all raw logged data are normalized to the form which is suitable for further processing and model creation. We prepared minimal requirements definition for data which are necessary for processing part as described in Sect. 4.1. 4.4

4.4 The Model

Adaptive monitoring is at its core a binary clustering problem. The processes are sorted into two categories, successful and failed. Because of this, it makes sense to build the model entirely from good patterns. Our model of the system processes consists of three components: a Markov chain, an evaluation of transition times, and the absolute number of observed transitions. After the bad patterns are filtered out of each component, we merge the components and create the model. The constraints we use for filtering are learned through an iterative process. After the model is created, it is stored in a database as a structured table. The table contains four columns: the first and second represent transitions (From-To), and the third and fourth represent the time interval (Minimum Time and Maximum Time) (Table 1).

Table 1. An example of the model

From   To   Minimum time  Maximum time
A      C    1             4
START  A    -1            -1
START  B    -1            -1
B      C    2             5
C      A    2             60
C      END  -1            -1
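To make the use of the model concrete, the following minimal sketch (ours, not the authors' implementation) checks one observed transition against such a table; the in-memory representation and the meaning of -1 as "no time constraint" follow Table 1, everything else is an assumption:

```python
# Model rows as (from, to) -> (min_time, max_time); -1 means "no constraint",
# as in the START/END rows of Table 1.
MODEL = {
    ("A", "C"): (1, 4),
    ("START", "A"): (-1, -1),
    ("START", "B"): (-1, -1),
    ("B", "C"): (2, 5),
    ("C", "A"): (2, 60),
    ("C", "END"): (-1, -1),
}

def transition_ok(src, dst, elapsed):
    """A transition is good if it exists in the model and its duration
    falls into the learned [min, max] interval (when one is defined)."""
    if (src, dst) not in MODEL:
        return False            # unknown transition -> failed pattern
    lo, hi = MODEL[(src, dst)]
    if lo == -1:                # no time constraint learned for this edge
        return True
    return lo <= elapsed <= hi

print(transition_ok("A", "C", 3))   # True: within [1, 4]
print(transition_ok("C", "A", 90))  # False: exceeds the 60-unit maximum
```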

5 Realisation

We prepared an environment which is fully virtualised in Docker. This gives us the possibility to test and run the solution locally. It is also possible to utilize docker-swarm and spread the environment over multiple computers. We decided on this environment because it makes redeployment, and repairs in case of infrastructure errors, easier. It was also important to test whether the environment can compute data in a parallel manner; this kind of scalability and robustness is provided by the virtualised environment mentioned earlier. The whole environment is also composed so as to fully support distributed computing [15], following the divide-and-conquer paradigm [5]: formally speaking, divide the data into multiple chunks, compute them separately on workers in a cluster, and combine the partial results. The whole environment is built from components for streaming, batch processing, and data persistence, including a database; all of this is described in Sect. 5.1. Because of the distributed computing requirement, the whole architecture is designed to fully support this approach. The realisation of the virtualised environment is composed of artifacts; the most important one is the model (see Sect. 4.4), which is persisted as a table structure in the database. This model is further utilised for graphical output: based on the model, a graphical representation is constructed as described in Sect. 5.3.

5.1 Architecture

As mentioned in the previous section, all components (Fig. 4) chosen for this solution fully support distributed computing and together make the environment scalable and highly available.

Fig. 4. Components architecture

Components:

• Spark - analytical engine for stream and batch processing
• Flink - streaming platform used for preprocessing in this work
• Kafka - message broker supporting publish-subscribe behaviour
• PostgreSQL - database used for persisting the model and batch data

As mentioned in Sect. 4.2, the environment fully follows the lambda architecture. Flink is responsible only for the normalisation of raw data, which it then streams to the Kafka message broker; a schematic sketch of this step follows below. Real-time processing is composed of the preprocessing part (see Sect. 4.3) and of predictions, which are based on real-time normalised (preprocessed) log data. As is visible from Fig. 4, the model is created from batched normalised (preprocessed) log data in the Spark engine. Predictions are computed from the real-time data, with the model used for this computation; the model is read from the persistent database store.
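In the described architecture this normalise-and-forward step is performed by Flink; purely as a schematic illustration of the idea, a Python sketch using the kafka-python client (the topic name and the raw log format are placeholders, and a running broker is assumed):

```python
import json
from kafka import KafkaProducer  # assumption: kafka-python client is installed

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"))

def normalise(raw_line):
    """Toy normalisation: raw log line -> event with the fields of Sect. 4.1."""
    pid, activity, inst, ts = raw_line.split(";")
    return {"process_id": pid, "activity": activity,
            "activity_instance_id": inst, "timestamp": ts}

# Normalised events are published for both the batch and the stream path.
producer.send("normalised-events",
              normalise("order-42;validate-payment;inst-7;1591000205"))
producer.flush()
```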

5.2 Use of Model and Predictions

There are two basic scenarios in which this solution covers model creation and trace prediction. They consist of these steps:

• Model creation - batch part
1. Events gathering, preprocessing, normalisation and labeling
2. Model construction based on the preprocessed events
3. Model persistence
• Trace prediction - stream part
1. Normalise and label streamed events
2. Make predictions for streamed events based on the model
3. Compare reality with the prediction
4. Persist traces marked as false

Based on these events it is possible to create different statistical graphs, views, or trends that can be used for system error prediction. The related visualisation representing the model (Fig. 5b) is crucial for system understanding. It can also be used for reverse engineering and for checking whether the services needed for the business process events are present. An added value of this model is easy visualisation and a clear understanding of which processes are important during the run time of the system. Trace prediction and comparison with real-time traces make it possible to see whether any trace deviates from the prediction. Visualisation (Fig. 6) of such negative traces brings another added value of this solution.

5.3 Visualisation

It is not possible to understand the modeled system from a database table alone. To see all calls and present them in an understandable way, we graphically represent the model (Table 1) as a directed graph (Fig. 5b) [4].

(a) Model representation in grafana

(b) Detail of graphical model representation

Fig. 5. Graphical representations

The whole rendering, where each row of the model represents one edge in the graph, is constructed with the JavaScript vis.js framework inside the Grafana application. Vertices are laid out in columns and are sized proportionally to the maximum transition times. False traces, which are marked as negative based on the model prediction, are also important and need to be visualised. This visualisation (Fig. 6) gives insight into the monitored processes from a high-level perspective, which is a crucial part of system understanding and of monitoring in cases of failures or malfunctions at the business level.


Fig. 6. Error visualisation

6 Summary

In our research we developed a machine learning approach to process mining and discovery. The solution is based on a Markov chain algorithm that can be replaced by, or combined with, other algorithms; to make such a decision, we need to do extensive testing and validation on real data. The current work has so far been tested only on small chunks of data, which also do not expose the performance problems that can arise with a large number of events. We also consider persisting model changes in a time-series manner, which could provide another point of view over time. Further, we also want to predict each step in long-term processes, which is crucial. We still lack the possibility for a human supervisor to update the model, so that in case of a prediction error the model can be corrected. These are directions for future research.

Acknowledgment. The following grants are acknowledged for the financial support provided for this research: Grant of SGS No. SP2020/78, VSB Technical University of Ostrava.

References

1. BigMine 2012: Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications. Association for Computing Machinery, New York (2012)
2. Al-Ghamdi, A., Saleem, F.: Enterprise application integration as a middleware: modification in data & process layer, pp. 698-701 (2014). https://doi.org/10.1109/SAI.2014.6918263
3. Busi, N., Pinna, G.M.: Process discovery and Petri nets. Math. Struct. Comput. Sci. 19(6), 1091-1124 (2009). https://doi.org/10.1017/S0960129509990132
4. Eades, P., Lin, X.: How to draw a directed graph. In: Proceedings of 1989 IEEE Workshop on Visual Languages, pp. 13-17 (1989). https://doi.org/10.1109/WVL.1989.77035
5. Horowitz, Z.: Divide-and-conquer for parallel processing. IEEE Trans. Comput. C-32(6), 582-585 (1983). https://doi.org/10.1109/TC.1983.1676280
6. Janošek, M., Kocian, V., Volná, E.: Complex system simulation parameters settings methodology. In: Zelinka, I., Chen, G., Rössler, O.E., Snasel, V., Abraham, A. (eds.) Nostradamus 2013: Prediction, Modeling and Analysis of Complex Systems, pp. 413-422. Springer, Heidelberg (2013)
7. Kalenkova, A.A., De Leoni, M., van der Aalst, W.M.: Discovering, analyzing and enhancing BPMN models using ProM. In: BPM (Demos), p. 36 (2014)
8. Kiran, M., Murphy, P., Monga, I., Dugan, J., Baveja, S.: Lambda architecture for cost-effective batch and speed big data processing, pp. 2785-2792 (2015). https://doi.org/10.1109/BigData.2015.7364082
9. Pelánek, R.: Modelování a simulace komplexních systémů. Jak lépe porozumět světu (2011)
10. Rabiner, L., Juang, B.: An introduction to hidden Markov models. IEEE ASSP Mag. 3(1), 4-16 (1986). https://doi.org/10.1109/MASSP.1986.1165342
11. Tax, N., Sidorova, N., Haakma, R., Aalst, W.: Event abstraction for process mining using supervised learning techniques, pp. 251-269 (2018). https://doi.org/10.1007/978-3-319-56994-9_18
12. Tax, N., Verenich, I., La Rosa, M., Dumas, M.: Predictive business process monitoring with LSTM neural networks. In: Dubois, E., Pohl, K. (eds.) Advanced Information Systems Engineering, pp. 477-492. Springer, Cham (2017)
13. Tyagi, N.K., Solanki, A., Tyagi, S.: An algorithmic approach to data preprocessing in web usage mining. Int. J. Inf. Technol. Knowl. Manag. 2(2), 279-283 (2010)
14. Van Der Aalst, W.: Process mining. Commun. ACM 55(8), 76-83 (2012). https://doi.org/10.1145/2240236.2240257
15. Zaharia, M., Chowdhury, M., Franklin, M.J., Shenker, S., Stoica, I., et al.: Spark: cluster computing with working sets. HotCloud 10(10-10), 95 (2010)
16. Guo, Z., Jiang, G., Chen, H., Yoshihira, K.: Tracking probabilistic correlation of monitoring data for fault detection in complex systems. In: International Conference on Dependable Systems and Networks (DSN 2006), pp. 259-268 (2006). https://doi.org/10.1109/DSN.2006.70

Towards Faster Matching Algorithm Using Ternary Tree in the Area of Genome Mapping

Rostislav Hřivňák(B), Petr Gajdoš, and Václav Snášel

VSB - Technical University of Ostrava, Ostrava, Czech Republic {rostislav.hrivnak.st,petr.gajdos,vaclav.snasel}@vsb.cz

Abstract. In the area of precision medicine there is a need to map long sequences of DNA, which are represented as strings of characters or numbers. Most computer programs used for genome mapping use suffix-based data structures, but those are much more suitable for mapping short DNA sequences represented as strings over small alphabets. The most crucial parameters of a data structure used for DNA mapping are the time to fill the data structure, the search time, and the system resources needed, especially memory, as the amount of data from the scanning process can be very large. This article describes an implementation of a memory-optimized Ternary Search Tree (TST) for indexing the positions of labels obtained by a Bionano Genomics DNA imaging device. A BNX file parser with alphabet encoding functions is described, and performance results from experiments with the presented software solution on real data from a Bionano Genomics Saphyr device are also included.

1 Motivation

Precision medicine requires the human genome to be sequenced and then mapped to a reference genome to find differences between them. This enables pinpointing of disease with greater accuracy. The majority of techniques and methods used today were developed to sequence the human genome by short-read sequencing. For clinical genomics, long-read sequencing methods are needed [1] and are being developed, for example, by the company Bionano Genomics [2, 3]. The availability and possibility of using new methods in clinical practice are directly connected to the performance of the computation systems used to analyze scanned genetic data. From the bioinformatics standpoint this means that the majority of tools in use were developed and optimized for short-read genome assembly, which differs significantly from long-read genome mapping. Figure 1 shows a schema of the DNA scanning process using the Bionano Genomics Saphyr Genome Imaging Instrument, which produces such long reads. The process can be divided into three phases. The first phase is the scanning itself - a prepared solution containing DNA is injected into a Saphyr chip (module), which consists of up to 3 flowcells with 2 channels each (inlet and outlet channels). Large fragments of DNA molecules flow from one channel to the other through the banks on each side of the channel while a laser scans them. Sites on the DNA molecules which were labeled with fluorescent markers during the preparation phase reflect the laser beam more


Fig. 1. Bionano Genomics process schematics [17, 18]

than the parts of the DNA fragments where no fluorescent markers are present. This results in an output image of one column in the bank. Each sample is scanned ∼50 times. There are 4 banks in each flowcell, and each bank is further divided into 137 columns. In phase two, the resulting image of the whole scanning process must be assembled. This is done by the Saphyr software. The raw scan input for image assembly consists of 137 columns × 4 banks × 50 scans × 2 channels = 54 800 images. Each input image has ∼16 MB. Multiplied by the total number of scans, columns, banks and chips, this gives ∼877 GB of raw images per flowcell. This means that a Saphyr device in the 3-flowcell configuration is able to produce a ∼3 TB resulting image, in which all source images are assembled together. During phase three, the software analyses the resulting image based on the color and intensity of points in the image. The resulting model, which contains the positions of the labeled sites on the DNA molecules relative to the beginnings of the molecules, is computed and saved into a raw text data file named BNX [2, 17]. Positions of label sites and distances between them are measured in base pairs. Data from BNX files are the source for mapping onto a reference genome. First, however, the label positions stored in the BNX file have to be read and indexed in computer memory, together with metadata about the part of the DNA molecule from which they originate. The indexing data structure must enable fast searching for a given string and return all occurrences of matched strings held in the index. Most genome assemblers use suffix-based data structures, which are much more suitable for storing short strings over small alphabets [4, 5]. This article describes a novel approach to processing, encoding and storing label position data from a Bionano Genomics BNX file in a memory-optimized ternary search tree (TST). The main emphasis was put on memory optimization by modifying the TST structure, with the possibility to parameterize some of the data structure's properties. The described BNX parser and data structure will be used as one of the building blocks for a new genome mapper which is being developed.

2 Data Preprocessing

As was stated in Sect. 1, a BNX file is a raw text data file and its content has to be preprocessed for use in genome mapping. For simplicity, in this section the term molecule denotes the set of label positions or of distances between them (depending on the phase of the parsing process), not the DNA double helix molecule itself (or its fragments). Preprocessing takes place in the so-called BNX parser, which has the following functions:

1. Read and parse the BNX file according to the documentation [16]
2. Filter out molecules by label count
3. Calculate the distances between the labels
4. Encode the distances by a selected function to a new (numerical) alphabet

Reading the data from the file according to the specification is a necessary, purely technical step. Each set of label positions represents one scanned part of DNA and varies hugely in length - there can be sets several hundred labels long, but there can also be molecules covered with only five or ten labels. This can be caused by error, but it can also mean that the molecule does not have any matching sites to which a fluorescent mark can adhere, i.e., it is correct. Experience shows that it is good to leave molecules with too few values out of further processing - they do not contain enough information for assembly and mapping, and filtering such molecules out also saves some system resources. So the second step of BNX file parsing is to filter out molecules which do not surpass a given label count threshold (a parameter of the BNX parser). Because label positions in BNX files are given as absolute positions of labels from the beginning of the scanned DNA molecule fragment, it is very convenient to calculate the distances between the labels. This transformation not only loses no information and saves one number per molecule, but also allows comparison of DNA fragments (aligning them to form larger molecules called contigs and then mapping them onto a reference genome), as distances between the labels are the key information. Absolute label positions, and thus also the distances between them, are integer values (measured in base pairs, as stated before). From an information theory point of view, the resulting set of all distinct distances between labels forms the alphabet, and each molecule is a string over this alphabet. The last step is to compress the alphabet by encoding the label distances with a selected conversion function. From the perspective of data indexing it is good to have the smallest possible alphabet, as this leads to smaller data structures. Some information contained in the label distance strings is obviously lost during alphabet compression. This is not necessarily a bad thing, as the data processed by the parser are not, and cannot be, absolutely correct; label distance encoding can even be beneficial and bring error tolerance into the solution. DNA fragments being scanned can, for example, be tangled, which leads to a position shift of the labels. If the shift is small, then the encoding function can map the label distance to the correct one. Careful selection and parameterization of this function is necessary and requires further research, testing, and fine-tuning. Our parser currently supports linear, logarithmic, and polynomial functions.
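As a concrete illustration of steps 3 and 4, a minimal sketch of distance calculation followed by a base-10 logarithmic encoder (the function shapes and example positions are ours; the actual parser is more involved):

```python
import math

def label_distances(positions):
    """Step 3: absolute label positions (in base pairs) -> distances
    between consecutive labels."""
    return [b - a for a, b in zip(positions, positions[1:])]

def encode_log10(distances):
    """Step 4: compress the alphabet with a logarithmic encoder; nearby
    distances map to the same symbol, which gives some error tolerance."""
    return [int(math.log10(d)) if d > 0 else 0 for d in distances]

positions = [100, 1100, 1250, 11250]   # hypothetical label positions
dists = label_distances(positions)     # [1000, 150, 10000]
print(encode_log10(dists))             # [3, 2, 4]
```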

3 Data Structures

Suffix-based data structures are a family of data structures optimized for storing and searching among the suffixes of the stored string(s). The main representatives of suffix-based data structures are:

• Suffix Trees (ST)
• Suffix Arrays (SA)
• Directed Acyclic Word Graphs (DAWG)

A Suffix Tree is a commonly used data structure for string indexing. The ST is a trie used to find all possible suffixes of the stored string [8, 9]. The variant used to store multiple strings in one data structure is called a Generalized Suffix Tree [10]. The time complexity of ST construction is O(n) and the number of ST nodes is at most 2n, where n is the length of the string for which the ST is constructed. A Suffix Array [11] is a sorted array of all string suffixes. The SA is more space efficient than suffix trees; it occupies approximately 4n bytes in memory for n characters stored. The FM-index [12] is an even more space-efficient data structure based on suffix arrays; it uses the Burrows-Wheeler transform to compress the data. An SA can be constructed in O(n log n) time, and the search time complexity is O(p), where p is the length of the string we are searching for. A Directed Acyclic Word Graph [14] is an automaton which stores all sub-strings in a graph whose vertices are reused by allowing multiple edges to connect them. A DAWG can be constructed in O(n) time, and the search time for a string of length p is O(p). A more space-efficient, and very often used, variation of the DAWG data structure is the compressed DAWG (CDAWG) [13].

A Ternary Search Tree (TST) [6, 7] is a memory optimization of the R-way trie data structure. Each node does not contain a list of all possible values from the alphabet of R characters, as in an R-way trie, but only three pointers to its child nodes. This is similar to a binary search tree, where each node has only two child nodes. The TST is a data structure used to store sets of strings. A TST can also be used to store the suffixes of a given string, but those suffixes have to be generated and inserted into the TST as ordinary strings. The pointers of a TST node will be denoted as left, equal, and right, a node of the TST as N, and the child nodes of N as N_left, N_equal, and N_right. The TST is then constructed based on the following rules: if node N[S[i]] contains the i-th character of string S, then node N_equal[S[i]] contains the character S[i+1] of the string S. Node N_left[S[i]] (and N_right[S[i]]) contains a character c which satisfies the relation c < S[i] (c > S[i], respectively). Each node in the TST must contain one more piece of information: whether the node represents the end of a string. The average time complexity of searching in a TST is O(log n), where n is the number of nodes in the TST; the height of the tree is proportional to log_3 N, and so the average search time complexity is O(log N). Construction of a TST has worst-case time complexity O(n), where n is the length of the inserted string.


The worst-case space complexity of a TST is O(p), where p is the sum of the lengths of all strings stored in the TST. This can happen only when each string contains each character from the alphabet at most once and each character is used in only one string (a super-linear TST). Real-life strings do not have such characteristics, so some nodes of the TST can be reused, thus saving the space required to store the TST. This especially applies to strings which have common prefixes. Rigorously, the space complexity (number of nodes) can be described as a function of the entropy of the input set, 3n/H, where n is the number of strings in the input set and H is the entropy of the input set [6]. A big benefit of the TST is the ease of implementing a partial-match search algorithm, which allows searching the TST for strings by not fully defined keys (i.e., some characters in the search key are replaced by wild cards) or searching against keys with a given tolerance function (i.e., a search for key abc with diff factor ±1 will match strings like bcd, bbc, aab, ... in the TST).
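The construction rules above can be made concrete with a short sketch of a classic (non-optimized) TST; this is a textbook version in Python, not the authors' C++ code:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.eq = self.right = None
        self.is_end = False     # marks that a stored string ends here

def insert(node, s, i=0):
    """Insert s[i:] below node, following the left/equal/right rules."""
    if node is None:
        node = Node(s[i])
    if s[i] < node.key:
        node.left = insert(node.left, s, i)
    elif s[i] > node.key:
        node.right = insert(node.right, s, i)
    elif i + 1 < len(s):
        node.eq = insert(node.eq, s, i + 1)
    else:
        node.is_end = True
    return node

def contains(node, s, i=0):
    """Search for s; descend left/right without consuming a character,
    descend via eq when the character matches."""
    if node is None:
        return False
    if s[i] < node.key:
        return contains(node.left, s, i)
    if s[i] > node.key:
        return contains(node.right, s, i)
    if i + 1 == len(s):
        return node.is_end
    return contains(node.eq, s, i + 1)

root = None
for word in ("ABAC", "ACC", "ACCD"):
    root = insert(root, word)
print(contains(root, "ACC"), contains(root, "AB"))  # True False
```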

4 Ternary Search Tree Improvements

The main idea behind the presented optimization of the TST is to save memory on unnecessary "technical" parts of the tree [15]. A classic, non-optimized implementation of a TST needs the following data fields per tree node:

• Key data (char/integer/pointer; for the calculations in this section a 4-byte integer will be used)
• 3 pointers to child nodes (size depends on the system architecture; for the calculations in this section 8-byte pointers will be used)
• A boolean flag indicating whether the node is the end of a word (for the calculations in this section 1 byte will be used)
• 1 pointer to value data if the TST serves as an index (not needed if the TST only stores keys and is queried for the presence of a given key string)

This gives a total of 37 bytes of memory per node. Note that this is only a theoretical value; in a specific implementation this number varies quite a lot depending on the platform, programming language, concrete data types used, computer architecture, and compiler. The second idea is not to create tree nodes below a certain depth of the tree at all, but rather to store the rest of the key string from that depth simply as a sequence of keys linked to the last tree node. This is based on the observation that branching in the tree happens mostly at the beginning of the indexed string. This is highly empirical and differs a lot depending on the alphabet size and the entropy of the indexed strings; when this parameter is too low, the optimized TST degenerates into an array of arrays, as there will be only one node, the root, and all data will be stored in the arrays. The node classes used to represent the optimized TST can be categorized as follows (a schematic sketch of these classes follows below):

• Base nodes (4 + 8 = 12 bytes)
• Inner nodes (12 + 16 = 28 bytes)
• End nodes (28 + 17 = 45 bytes)

All node types contain a pointer to their center child and a key value; this minimal form is the Base node. Inner nodes have two more pointers, one for the left and one for the right child node. The End node is an extension of the Inner node and contains data fields for storing the end of a string: a collection of pointers to metadata linked to the stored word, a pointer to a Residual Data object (described later), and a boolean flag signaling the end of the string. Base and Inner nodes are used to build the TST and are unable to hold any data apart from the key value; End nodes are the data storage nodes. This optimization of course comes at some cost - the required node type has to be evaluated whenever a new node is linked into the optimized TST. Nodes are only "upgraded" during insertion and can be "downgraded" during node deletion, according to the relation BaseNode < InnerNode < EndNode - the End node is the most universal kind of node but requires the most memory to represent. A further optimization is the introduction of the Residual Data object. It consists of two arrays - the first stores pointers to collections of keys and the second contains pointers to the metadata linked to the respective key sequence. The optimized TST has a parameter which specifies the key sequence depth from which keys are no longer stored as nodes but are put into the Residual Data collection. The metadata connected to such a key are stored at the same array index in the Residual Data object.
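A schematic sketch of the three node categories (Python for illustration, while the paper's implementation is C++; the exact field layout is an assumption based on the description above):

```python
class BaseNode:
    """Minimal node: key value and the center (equal) child only."""
    __slots__ = ("key", "eq")
    def __init__(self, key):
        self.key, self.eq = key, None

class InnerNode(BaseNode):
    """Adds left/right pointers; used where branching is needed."""
    __slots__ = ("left", "right")
    def __init__(self, key):
        super().__init__(key)
        self.left = self.right = None

class EndNode(InnerNode):
    """Data-carrying node: end-of-string flag, metadata pointers, and a
    link to the Residual Data object holding the stored key tails."""
    __slots__ = ("is_end", "metadata", "residual")
    def __init__(self, key):
        super().__init__(key)
        self.is_end = False
        self.metadata = []      # pointers to metadata of stored words
        self.residual = None    # (key tails, metadata) arrays, see text
```

The upgrade relation BaseNode < InnerNode < EndNode maps naturally onto this inheritance chain: a node is promoted by replacing it with an instance of a more capable class when insertion requires it.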

Fig. 2. Optimized TST example

Figure 2 illustrates the structure of an optimized TST which stores the words "ABAC", "ACC", "ACCD" and "ABACDDD". MP denotes the Metadata Pointer for the respective word (for simplicity only one metadata pointer is shown, but as mentioned above there is a collection of pointers). The parameter for the use of the Residual Data object is set to 4, i.e., the substring "DDD" from "ABACDDD" will not be stored as TST nodes but is placed into the Residual Data structure. The optimized TST is constructed like a regular TST. All nodes start as Base nodes. Inner nodes are created only when branching is needed but there is no need to store data in the node (e.g., the node with key value "B" in the example). When a key sequence (word) ends in a node, the type of the node is changed to an End node. This also happens when a newly added key sequence ends in the middle of the TST at a point where there is only a Base node (e.g., if the word "ABA" were added to the example).

5 Experiments and Achieved Results

This section describes experiments with the current implementation of the BNX parser and the optimized TST data structure and presents the achieved results. All experiments were done on a computer with an Intel Core i7 CPU, 32 GB RAM, an NVMe SSD drive, and the Windows 10 operating system. The implementation was done in C++ and compiled with the g++ compiler. The source data file contains 4 546 748 molecules and has a size of 2 496 MB. Tables 1, 2 and 3 show the measured performance data; the legend is as follows:

• Mcnt - count of molecules loaded from the source BNX file
• t1 - time to load and parse the source BNX file
• Ccnt - count of characters in the molecules
• Ntot - total count of tree nodes
• Ne - count of end nodes
• Ni - count of internal nodes
• Nb - count of base nodes
• t2 - time to fill the TST with data from the parser
• S - memory size of the TST

Table 1. TST with nodes created for all characters in the strings

Mcnt       t1 [ms]  Ccnt         Ntot        Ne         Ni         Nb          t2 [ms]  S [MB]
9 822      452      412 526      197 580     4 911      3 717      188 952     27       6.7
94 552     3 120    2 419 128    1 105 977   44 963     29 708     1 031 306   141      39
1 042 140  38 974   34 400 064   15 676 239  488 797    328 092    14 859 350  2 195    548
4 546 748  182 102  168 969 608  76 949 451  2 107 871  1 436 842  73 404 738  10 473   2 655

Table 2. TST with node stop creation depth = 20

Mcnt       t1 [ms]  Ccnt         Ntot        Ne         Ni         Nb          t2 [ms]  S [MB]
9 822      448      412 526      87 081      4 911      3 717      78 453      20       5.7
94 552     3 087    2 419 128    560 661     44 963     29 708     485 990     162      36
1 042 140  39 201   34 400 064   6 329 741   488 797    328 092    5 512 852   1 757    451
4 546 748  181 765  168 969 608  27 942 811  2 107 868  1 436 842  24 398 105  9 095    1 520

Table 3. TST with node stop creation depth = 10

Mcnt       t1 [ms]  Ccnt         Ntot        Ne         Ni         Nb          t2 [ms]  S [MB]
9 822      450      412 526      40 427      4 911      3 717      31 799      14       2.9
94 552     3 093    2 419 128    290 930     44 963     29 708     216 259     113      20
1 042 140  38 988   34 400 064   2 972 072   488 747    328 002    2 155 323   1 545    254
4 546 748  182 013  168 969 608  12 329 094  2 106 508  1 434 578  8 788 008   8 860    1 145

The measured data show how the size of the TST decreases with the count of tree nodes. Comparison with a non-optimized TST was done on a theoretical level. In our implementation of the TST, the node sizes are 24 B for a Base node, 40 B for an Inner node, and 64 B for an End node. A non-optimized TST would have all nodes represented as End nodes. This leads to the estimate, also accounting for some implementation-specific constants, that a non-optimized TST would be ten times larger than the optimized TST with the node stop creation depth parameter set to 10. The differences between the real and theoretical node sizes are caused, for example, by memory alignment of data fields in structures and objects when they are allocated. The data also show that the presented optimization improved the TST build time: a 15.5% improvement for 4.5M molecules between no node creation stop depth and node creation stop depth = 10. This is caused by skipping the decisions needed for node creation when finding the correct place in the TST where a new node should be inserted. Figure 3 shows the count of nodes in the TST in relation to the count of molecules parsed from the BNX file, for three values of the node stop creation depth parameter.

Fig. 3. Count of nodes in TST - parameter node stop creation depth is abbreviated as NSCD


Figure 4 shows the size of the TST in relation to the count of molecules parsed from the BNX file, for three values of the node stop creation depth parameter. The chart uses a logarithmic axis scale to better display the relation.

Fig. 4. Size of TST - parameter node stop creation depth is abbreviated as NSCD

Table 4. TST search times

Node stop creation depth  ts [ns]
OFF                       ∼5 100
20                        ∼3 900
10                        ∼3 650

Table 4 shows the average measured search time for a value in the TST. Again, it can be seen that searching in the TST is faster when there are fewer nodes in the TST. This is caused by saving unnecessary comparisons when going down the tree branches. Finding the optimal value for the node stop creation depth parameter will be the subject of further research.

Table 5. Confusion matrices - measured data

                 Df = 0%  Df = 1%  Df = 2%  Df = 5%
True-Positives   190 362  190 362  190 362  190 362
False-Negatives  0        0        0        0
True-Negatives   0        158 756  170 472  174 440
False-Positives  190 362  31 606   19 890   15 922


Table 5 shows the confusion matrices for our parser with the optimized TST. It was tested by parsing 190 362 molecules from a BNX file. Distances between labels were encoded with a logarithmic (base 10) encoder and then inserted into the TST together with all their suffixes. This resulted in a total of 5 371 720 strings stored in the TST. The optimized TST was configured with the node stop creation depth parameter set to 10. The test engine tried to search for all molecules, resulting in a total of 190 362 searches; performing this number of searches took approximately 673 ms. For true-positives and false-negatives, the original (source) molecules were used for searching, as we know that those molecules were added into the TST: the distances between the labels were encoded and then passed to the search function. For testing true-negatives and false-positives a new set of molecules was needed, so these were generated by "deforming" the original ones: all source label distances were multiplied by a deformation factor Df, encoded, and then passed to the search function. The search function returns all strings from the TST which match the searched string. Table 6 shows the quality scores calculated from the confusion matrices for different values of Df. For the set of molecules which were not deformed (Df = 0), all molecules were found in the TST. This was expected, as the distance encoding function always encodes to the same resulting values. When molecules are very similar (low Df values) to the molecules inserted into the TST, more matches occur; this is again expected.

Table 6. Confusion matrices - results

                           Df = 0%  Df = 1%  Df = 2%  Df = 5%
Precision                  1        0.858    0.905    0.923
Negative predictive value  N/A      1        1        1
Sensitivity                1        1        1        1
Specificity                0        0.834    0.896    0.916
Accuracy                   1        0.917    0.948    0.958

Our implementation of the TST also supports a partial-match search algorithm, which is a derivation of classic TST searching; the same results can be obtained by passing all possible combinations of the search string to the classic search function. Partial-match searching was tested only for single molecules where one distance was replaced with a wild card, and it works as expected - it returns all molecules which match the given pattern, in times similar to those presented in Table 4 for regular searches.

6 Conclusion

In this article, a novel implementation of a BNX file parser with a memory-optimized TST indexing data structure was described. The main goal is to shorten the time needed to align DNA molecules to a reference genome and to find the differences between them. This is much needed in precision medicine, as the whole process now takes up to several days and a large part of the required time is spent just on data processing; there, the quality and speed of the DNA mapping software play a significant role. Hopefully, improvements in the efficiency and speed of DNA mapping software will lead to a greater expansion of precision medicine into clinical practice. Thorough performance and quality testing of the presented solution was done. The results show that the TST can be built in reasonable time even for quite large amounts of data on a mediocre computer, so much better performance can be expected on more powerful machines. The memory optimization shows significant savings in memory consumption on real-life data. Search performance was also improved by saving unnecessary comparisons when searching for data in the TST. Further work is definitely needed and can be divided into three parts - parameter optimization, technical improvements, and further testing. The presented solution has several parameters which should be optimized and tested on real-life data, especially the node stop creation depth parameter of the TST and the choice of the alphabet encoding function for the molecule parser. The whole solution can be further improved with technical improvements, such as the use of low-level programming in certain areas or better memory allocation of the arrays needed to store residual data in TST End nodes. BNX parsing, TST building, and searching can also be parallelized for better utilization of system resources on modern computers. Further testing should be done on more powerful server-class computers with much larger input data sets. The impact of different alphabet sizes on performance should also be analyzed.

Acknowledgements. This work is supported by SGS project, VSB-Technical University of Ostrava, under the grant no. SP2020/161 and Celgene Research Grant-CZ-102.

References

1. Ashley, E.A.: Towards precision medicine. Nat. Rev. Genet. 17(9), 507-522 (2016)
2. Shelton, J.M., et al.: Tools and pipelines for BioNano data: molecule assembly pipeline and FASTA super scaffolding tool. BMC Genom. 16, 734 (2015)
3. Chan, S., et al.: Structural variation detection and analysis using Bionano optical mapping, pp. 193-203. Springer (2018)
4. Edwards, D., Stajich, J., Hansen, D.: Bioinformatics: Tools and Applications. Springer, New York (2009)
5. Rosenberg, M.S.: Sequence Alignment: Methods, Models, Concepts, and Strategies. University of California Press, Berkeley (2009)
6. Clement, J., Flajolet, P., Vallee, B.: The analysis of hybrid trie structures. In: Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 531-539 (1998)
7. Sedgewick, R., Wayne, K.: Algorithms. Addison-Wesley Professional, Upper Saddle River (2011)
8. Blumer, A., Ehrenfeucht, A., Haussler, D.: Average sizes of suffix trees and dawgs. Discrete Appl. Math. 24, 37-45 (1989)
9. Farach, M.: Optimal suffix tree construction with large alphabets. In: Proceedings 38th Annual Symposium on Foundations of Computer Science, pp. 137-143 (1997)
10. Na, J.Ch., et al.: Suffix tree of alignment: an efficient index for similar data. In: International Workshop on Combinatorial Algorithms, pp. 337-348 (2013)
11. Manber, U., Myers, G.S.: Suffix arrays: a new method for on-line string searches. SIAM J. Comput. 22(5), 935-948 (1993)
12. Ferragina, P., Manzini, G.: Opportunistic data structures with applications, pp. 390-398. IEEE (2000)
13. Blumer, A., et al.: The smallest automaton recognizing the subwords of a text. Theor. Comput. Sci. 40, 31-55 (1985)
14. Ehrenfeucht, A., McConnell, R.M.: String searching. In: Handbook of Data Structures and Applications, pp. 477-494. Chapman and Hall (2018)
15. Robenek, D., Platos, J., Snasel, V.: Ternary tree optimalization for n-gram indexing. In: DATESO, pp. 47-58 (2014)
16. Bionano Genomics: BNX File Format Specification Sheet (2018). https://bionanogenomics.com/wp-content/uploads/2018/04/30038-BNX-File-Format-Specification-Sheet.pdf. Accessed 21 May 2020
17. Pang, A.W.C., et al.: Efficient structural variation detection and annotation using bionano genome mapping. Bionano Genomics 131, 1345-1362 (2018)
18. Bionano Genomics: Saphyr (2018). https://bionanogenomics.com/products/saphyr/. Accessed 29 May 2020

Preprocessing COVID-19 Radiographic Images by Evolutionary Column Subset Selection

Jana Nowaková, Pavel Krömer(B), Jan Platoš, and Václav Snášel

VSB-Technical University of Ostrava, 17. listopadu 2172/15, Ostrava, Czech Republic {jana.nowakova,pavel.kromer,jan.platos,vaclav.snasel}@vsb.cz

Abstract. Column subset selection is a hard combinatorial optimization problem with applications in operations research, data analysis, and machine learning. It involves the search for fixed-length subsets of columns from large data matrices and can be used for low-rank approximation of high-dimensional data. It can also be used to preprocess data for image classification. In this work, we study column subset selection in the context of radiography image analysis and concentrate on the detection of COVID-19 from chest X-ray imagery.

1 Introduction

Radiographic image analysis is an important part of diagnostic imaging. It is a field within modern medicine that deals with the representation, communication, and analysis of visual (image) information [23]. The analysis is nowadays typically done by human experts and aided by computers (e.g., decision support systems), but the need for accurate, robust, and efficient methods to automate this task is obvious. The rise of artificial intelligence (AI) and machine learning in medical image analysis allows not only the use of intelligent classification and prediction methods, but also the application of the advanced data pre- and post-processing techniques common in this field. Column subset selection (CSS) [4] is a combinatorial optimization problem well known for its empirically demonstrated hardness and a wide range of real-world applications. One of its potential uses is unsupervised feature and variable selection [31], which has previously been used as preprocessing for image classification [21,31]. Coronavirus disease (COVID-19) is a severe disease caused by the coronavirus SARS-COV2 [28]. By June 2020, COVID-19 is a global pandemic with several epicenters around the world and a death toll of nearly 500,000. Accurate and efficient diagnostic methods are needed to detect [8] and/or assess [3] COVID-19 in patients. In this work, we use differential evolution (DE) to find column subsets useful for the classification of COVID-19 radiographic images and evaluate the effect of this reduction on machine learning-based classification of COVID-19 positive and negative patients.

2 Column Subset Selection

CSS is a combinatorial optimization problem that consists in the search for k columns out of the total of n columns of a (potentially large) matrix, A ∈ R^{m×n}, k < n. From a top-level point of view, it is a form of knowledge extraction that aims at an automated identification of useful features from large sets of variables. In image classification [21,31], CSS has been used as a dimensionality reduction method. It was applied to collections of images to extract a subset of pixels for further processing by different machine learning and classification models. CSS for image classification has in the past been tackled by kernel-based [31] and genetic [21] algorithms. Without regard to a particular application, CSS is defined as [4]:

Definition 1 (CSS). For a matrix, A ∈ R^{m×n}, and a positive integer, k < n, select k columns of A to form another matrix, C ∈ R^{m×k}, so that the residual

‖A − P_C A‖_ξ    (1)

is minimized over all possible column subsets of A of size k.

In Eq. 1, P_C = CC† denotes a projection of A onto the k-dimensional space spanned by the columns of C, ξ = {2, F} denotes the spectral or Frobenius norm, respectively [4], and C† is the Moore-Penrose pseudoinverse of C [12]. The goal of CSS is to find the combination of k columns (out of all n columns of A) that minimizes (1); a small numerical sketch of this residual follows below. It is a hard problem that can be solved by deterministic methods only in a limited number of cases. Although a formal proof has not been provided yet, CSS is assumed to be NP-hard. So far, it has been shown to be unique games (UG) hard; a UG-hard problem is a problem that is NP-hard in case the unique games conjecture holds [6]. This is strong support for the expected NP-hardness of CSS. The problem can be solved by deterministic, approximate (stochastic, randomized), and mixed-type (hybrid) algorithms. Small and medium-sized CSS instances have been solved by deterministic methods based on a sparse approximation of singular value decomposition [7], a randomized 2-stage algebraic selection algorithm [4], pseudoinverse-based methods [1], greedy algorithms [11], and others. Recently, an increased amount of research has been devoted to the application of nature-inspired metaheuristics to CSS [21] and other hard subset selection (combination) problems [29].
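For the Frobenius norm, the residual of Eq. (1) can be evaluated directly; a minimal numpy sketch, under the definitions above (the random matrix and the index list are only illustrative):

```python
import numpy as np

def css_residual(A, cols):
    """||A - C C^+ A||_F for the column subset given by the index list cols."""
    C = A[:, cols]                      # m x k matrix of selected columns
    P = C @ np.linalg.pinv(C)           # projection P_C = C C^+ (Moore-Penrose)
    return np.linalg.norm(A - P @ A, ord="fro")

rng = np.random.default_rng(0)
A = rng.random((20, 10))
print(css_residual(A, [0, 3, 7]))       # residual for one 3-column subset
```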

3 Radiographic Image Analysis

Radiology is an area of medicine that relies heavily on image analysis. X-ray images of different parts of the body are routinely taken and analyzed either by human experts (medical doctors), with the help of digital image analysis, or, in recent times, using image retrieval, artificial intelligence, and machine learning [13]. The primary task of medical image analysis is pattern recognition, such as finding anomalies, marking nodules, detecting bone fractures, etc. [18].


The current abilities and future use of AI and machine learning in radiology are still often discussed [26]. Radiology is at the forefront of the medical areas that have the potential to be transformed by AI. However, the adoption of AI in medicine, including radiology, is not straightforward. Medical experts, and even medical students, who are young and use AI in their daily lives, are often skeptical about its role in diagnostics and treatment [14]. It has been pointed out that radiologists will at first have to extend their knowledge of statistics and data science to be able to efficiently use, validate, and interpret the results obtained by machine learning [19], even when it is used in decision support systems [13]. On the other hand, the adoption of modern, AI-driven data analytic methods in radiology can significantly decrease the time needed to accomplish individual tasks. This can lead to cost reductions and make healthcare available to more patients [32]. The applications of machine learning in radiology can be divided into six basic groups [35]: medical image segmentation, registration, computer-aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis, content-based image retrieval systems, and text analysis of radiology reports using natural language processing and natural language understanding. In the last couple of years, interesting machine learning applications can be found in, e.g., the classification of radiology reports [25], which involved active learning for optimization and classification performance improvement. Another application is the annotation of radiology reports, i.e., the automated identification of findings in the reports [36]. A special challenge for machine learning in medical image analysis is the question of big data. The availability of big medical data can compensate for the current limits of AI; the combination of the advantages of big data approaches and machine learning methods has great potential for medical image analysis and can contribute to the advent of personalized medicine at a high level [10]. In light of current events, many ongoing research efforts focus on the analysis of chest radiographs in conjunction with COVID-19.

3.1 COVID-19 and Medical Imaging

In relation to the 2020 COVID-19 pandemic, great attention is nowadays paid to chest radiography as a substitute for or complement of RT-PCR (Reverse Transcription - Polymerase Chain Reaction) tests in COVID-19 early screening [5] and diagnostics, or as a treatment support method. A radiograph (X-ray) is the fundamental non-invasive method in the treatment of many lung diseases. In COVID-19 radiographic image analysis, it is necessary to differentiate the radiographs of COVID-19 positive subjects from subjects suffering from other respiratory diseases, such as Influenza-A viral pneumonia, and from healthy (test) subjects. By June 2020, a large number of new methods to accomplish this task have been introduced. In this section, we provide only a short overview of selected recent ones. An overview of AI and non-AI methods addressing COVID-19 in various ways is provided, e.g., in [33]. Analysis of computer tomography (CT) images by


support vector machines (SVM) is proposed in [2]. The CT images were divided into 32 × 32 blocks (patches). Each patch was transformed by a Grey Level Size Zone Matrix into a 1 × 13 feature vector, processed by a Discrete Wavelet Transform, and classified by an SVM; essentially, positive and negative samples were separated by a high-margin hyperplane. Another system, combining 2D and 3D image analysis for COVID-19 detection, was introduced in [15]. It is a combination of commercial off-the-shelf software for the 3D subsystem and a pre-trained neural network for segmentation, i.e., the search for the region of interest, in the 2D subsystem. A rather small data set of 50 chest X-ray images was used to develop deep convolutional neural classifiers [17] based on a modified Visual Geometry Group Network and on Google MobileNet. A side-by-side comparison of several deep neural networks addressing the COVID-19 screening problem was provided in [9]. The authors compared the performance of 8 models (SqueezeNet, MobileNetv2, ResNet18, InceptionV3, ResNet101, CheXNet, DenseNet201, VGG19) on two variants of radiographic images - with and without additional augmentation. In another work, a COVNet-based deep learning framework (ResNet50) with the ability to detect COVID-19, pneumonia, and non-pneumonia patients was proposed for the classification of 2D and 3D CT scans [22]. The availability of reliable, good-quality radiography data is essential for the development of AI models for COVID-19 detection. The authors of [24] combined several public data sets to create a collection of 5,000 chest X-ray images. The collection was split (2,000 for training, 3,000 for testing) and used for the training of several convolutional neural networks, in particular ResNet18, ResNet50, SqueezeNet, and DenseNet-121. A new network called VB-Net was introduced in [30] and trained using a human-in-the-loop strategy. Another system to tackle the disease, named COVID-Net [34], was based on a custom deep convolutional neural design. Networks with weakly-supervised deep learning were proposed in [37]. Apparently, major effort is devoted to the design of deep learning and neural methods to analyze COVID-19 image data. In this work, we put the emphasis on image preprocessing, important feature selection, and its effects on classification by different traditional machine learning classifiers.

4 Differential Evolution for CSS

DE [27] is an evolutionary optimization algorithm that evolves a population of real-valued vectors representing solutions to the solved problem. It facilitates an artificial evolution of a population of candidate solutions by the iterative application of the differential mutation and crossover operators. The DE starts with an initial population of M randomly generated (or carefully initialized) real-valued vectors. During the optimization, the algorithm perturbs selected base vectors with the scaled difference of two (or more) other population vectors to produce so-called trial vectors. The trial vectors compete with the members of the current population with the same index, called the target


vectors. If a trial vector represents a better solution than the corresponding target vector, it takes its place in the population [27]. The two most important DE parameters are the scaling factor and the crossover probability [27]. The scaling factor, F ∈ [0, ∞), controls the rate at which the population evolves, and the crossover probability, C ∈ [0, 1], determines the ratio of elements that are transferred to the trial vector from its opponent. The standard DE is a continuous metaheuristic optimization method; in this work, however, it is used to solve a combinatorial optimization problem. In order to represent column subsets by real-valued vectors, the DE uses a decoding algorithm first introduced in [20]. Each column subset is represented by a real-valued vector, c, of size k. The vector is decoded into a set of column indices, K, |K| = k, using the following decoding algorithm: every coordinate of the candidate vector, c_i, is truncated and added to K; if j = trunc(c_i) already exists in K, the next available column index not yet included in K is added to the set. The fitness function minimized by the DE is

f_css = ‖A − CC†A‖_F,    (2)
C = {c_ij}, i ∈ {0, . . . , m}, j ∈ {c_1, . . . , c_k}.    (3)

It is based on the residual defined by Eq. 1, with ‖·‖_F evaluated as the Frobenius norm. This is the only part of the DE specific to CSS; otherwise, it can be used to solve arbitrary subset selection (combination) problems.
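A sketch of the decoding step and the fitness evaluation as we read Eqs. (2)-(3); the variable names are ours, and the wrap-around at n when searching for the next free index is our assumption:

```python
import numpy as np

def decode(c, n):
    """Decode a real-valued DE vector c into a list K of k distinct
    column indices of an m x n matrix (decoding rule from the text)."""
    K = []
    for ci in c:
        j = int(ci) % n           # truncate the coordinate
        while j in K:             # index already used: take the next
            j = (j + 1) % n       # available column (wrap-around assumed)
        K.append(j)
    return K

def fitness(c, A):
    """f_css = ||A - C C^+ A||_F for the decoded column subset."""
    C = A[:, decode(c, A.shape[1])]
    return np.linalg.norm(A - C @ np.linalg.pinv(C) @ A, ord="fro")
```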

5 Experiments

In order to evaluate the suitability of CSS as an unsupervised preprocessing step in the analysis of COVID-19 radiographs, a series of computational experiments was performed on the Covid Radiography Database (CRD) [9]. The CRD data set was downloaded from Kaggle (https://www.kaggle.com/tawsifurrahman/covid19-radiography-database) and contained 219 chest radiographs of patients with COVID-19, 1345 chest radiographs of patients with viral pneumonia, and 1341 radiographs of patients with neither disease (normal). All images had a resolution of 1 MP (i.e., 1024 × 1024 pixels). In this work, only COVID-19 positive and normal images were used and a binary classification problem was considered. An example of images from the database is shown in Fig. 1. For the purpose of the computational experiments, the images were converted to grayscale and scaled down by a factor of 4 (i.e., to 256 × 256 pixels) by Lanczos resampling. Then, three test data sets were created: COV4-50 was assembled from 50 randomly selected COVID-19 and 50 randomly selected normal images, COV4-100 was composed of 100 COVID-19 and 100 normal images, and COV4-219 was composed of all 219 COVID-19 images and 219 randomly selected normal images. Each data set was represented by a single matrix with images as rows and individual image pixels as columns.
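The preprocessing described above could be reproduced roughly as follows (a Pillow-based sketch; the file paths are placeholders, and the exact pipeline used by the authors may differ):

```python
import numpy as np
from PIL import Image

def load_image(path, factor=4):
    """Grayscale conversion + Lanczos downscaling by the given factor,
    returned as one flat row of pixel values."""
    img = Image.open(path).convert("L")
    w, h = img.size
    img = img.resize((w // factor, h // factor), Image.LANCZOS)
    return np.asarray(img, dtype=np.float64).ravel()

# Data matrix: images as rows, pixels as columns (paths are placeholders)
paths = ["covid_001.png", "normal_001.png"]
A = np.vstack([load_image(p) for p in paths])
print(A.shape)   # (2, 65536) for 1024x1024 inputs scaled to 256x256
```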


A battery of traditional machine learning classifiers was used to distinguish COVID-19 chest radiographs from radiographs of healthy patients. The models included the k-Nearest Neighbour classifier with k = 3 (kNN3), a linear support vector machine (SVM), a classification and regression tree (CART) [16], and a shallow neural network (i.e., a multilayer perceptron) with a single hidden layer of 100 neurons (MLP). These methods were selected because of their frequent use in data mining and machine learning and because similar classifiers were recently used for CSS-aided image classification [21,31]. All classifiers were set up with parameters corresponding to best practices and initial trial-and-error runs.

Fig. 1. Examples of images from the Covid Radiography Database. First row: COVID19 positive subjects, second row: healthy subjects.

To evaluate their ability to detect COVID-19 from chest radiographs, the classifiers were first evaluated on the full COV4-50, COV4-100, and COV4-219 data sets. The data sets were split into training (60%) and test (40%) subsets and then processed by the classification algorithms: each classifier was first trained on the training subset and then evaluated on the test subset. The results of the initial evaluation are summarized in Table 1.
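A minimal sketch of this evaluation with scikit-learn; the parameter choices mirror the text (k = 3, linear SVM, one hidden layer of 100 neurons), while the stand-in data and everything else are assumptions, so this is not the authors' exact setup:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.random((100, 64))            # stand-in for the image matrix
y = rng.integers(0, 2, size=100)     # stand-in labels: COVID-19 vs. normal

classifiers = {
    "kNN3": KNeighborsClassifier(n_neighbors=3),
    "SVM":  LinearSVC(),
    "CART": DecisionTreeClassifier(),
    "MLP":  MLPClassifier(hidden_layer_sizes=(100,)),
}

# 60/40 train/test split, then training and test accuracy per classifier
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.6, random_state=0)
for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)
    print(name, clf.score(X_tr, y_tr), clf.score(X_te, y_te))
```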


Table 1. Training and test accuracy on full data sets.

           Training accuracy        Test accuracy
Data set   kNN3  SVM   CART  MLP    kNN3  SVM   CART  MLP
COV4-50    0.98  1.0   1.0   1.0    0.93  0.93  0.93  0.95
COV4-100   0.98  1.0   1.0   1.0    0.94  0.99  0.90  0.98
COV4-219   0.99  1.0   1.0   1.0    0.98  0.98  0.90  0.98

The table shows a very good ability of the classifiers to detect positive COVID-19 patients even from downscaled radiographs. Next, the effect of (random) column sampling on the accuracy of the classification was investigated. From each data set, 51 random column subsets of size k, k ∈ {6, 12, 24, 48, 96, 192, 384}, were taken and the ability of the classifiers to detect COVID-19 positive patients from such reduced data was evaluated. The average classification accuracy on data with randomly sampled columns (i.e., image pixels) is shown in Table 2. The table clearly illustrates how column sampling reduces classification accuracy. It also shows that the reduction in training accuracy differs among classifiers (the training of CART and MLP is less affected than the training of kNN3 and SVM). The drop in test accuracy also differs between classifiers; this points out the different abilities of the investigated classifiers to learn and generalize. Another interesting observation is the different sensitivity of the classification algorithms to the amount of dimensionality reduction (i.e., the value of k). kNN3 is from this point of view in many cases less sensitive than SVM and MLP when the dimensionality is reduced to the lowest values, and achieves better accuracy for small ks. This trend is, however, more pronounced in training (training precision) than in generalization (test precision).


Table 2. Average training and test accuracy on data sets with randomly selected column subsets.

Data set   k    Training accuracy        Test accuracy
                kNN3  SVM   CART  MLP    kNN3  SVM   CART  MLP
COV4-50    6    0.77  0.58  0.87  0.62   0.65  0.60  0.53  0.65
           12   0.75  0.57  0.82  0.74   0.55  0.60  0.54  0.77
           24   0.78  0.58  0.92  0.87   0.65  0.60  0.79  0.80
           48   0.83  0.77  0.95  0.95   0.77  0.80  0.66  0.85
           96   0.92  0.87  0.99  1.00   0.80  0.83  0.73  0.86
           192  0.97  0.97  1.00  1.00   0.90  0.95  0.87  0.92
           384  0.93  0.98  1.00  1.00   0.90  0.88  0.87  0.92
COV4-100   6    0.83  0.65  0.78  0.69   0.62  0.59  0.61  0.59
           12   0.87  0.66  0.78  0.79   0.56  0.57  0.65  0.64
           24   0.85  0.65  0.83  0.85   0.65  0.59  0.70  0.73
           48   0.80  0.67  0.85  0.93   0.74  0.59  0.71  0.82
           96   0.89  0.77  1.00  1.00   0.79  0.69  0.76  0.82
           192  0.96  0.88  1.00  1.00   0.94  0.79  0.82  0.94
           384  0.94  0.93  1.00  1.00   0.92  0.83  0.82  0.88
COV4-219   6    0.77  0.63  0.79  0.64   0.70  0.64  0.65  0.65
           12   0.81  0.62  0.85  0.75   0.74  0.62  0.70  0.68
           24   0.85  0.74  0.85  0.83   0.77  0.76  0.72  0.74
           48   0.89  0.74  0.87  0.87   0.82  0.73  0.69  0.78
           96   0.91  0.80  0.93  0.95   0.81  0.75  0.72  0.84
           192  0.93  0.86  0.97  1.00   0.82  0.82  0.80  0.86
           384  0.94  0.91  0.98  1.00   0.87  0.81  0.83  0.86

The only exception was SVM, which achieved a training accuracy of 0.85 on the smallest data set, COV4-50, for the smallest column subset size, k = 6. The reduction of the data to column subsets evolved by the DE also clearly improved the average generalization ability of the classifiers on the reduced data sets. The average accuracy of the worst classifier increased from 0.53 (CART on COV4-50 with k = 6, with random sampling) to 0.83 (the same classifier and setting, with evolved columns). The best average training accuracy was in this case obtained by CART and MLP, while the best average test accuracy was achieved by MLP. However, kNN3 and SVM showed a good ability to detect COVID-19 on previously unseen radiographic images, too.


Table 3. Average training and test accuracy on data sets with columns learned by evolutionary CSS. (N/A: not computed due to the numerical precision of the pseudoinverse computation.)

Data set   k    Training accuracy        Test accuracy
                kNN3  SVM   CART  MLP    kNN3  SVM   CART  MLP
COV4-50    6    0.94  0.85  1.00  0.97   0.88  0.85  0.83  0.90
           12   0.96  0.94  1.00  1.00   0.89  0.91  0.83  0.92
           24   0.96  0.98  1.00  1.00   0.90  0.93  0.85  0.93
           48   0.96  0.99  1.00  1.00   0.92  0.93  0.85  0.93
           96   0.97  1.00  1.00  1.00   0.91  0.93  0.84  0.94
           192  N/A   N/A   N/A   N/A    N/A   N/A   N/A   N/A
           384  N/A   N/A   N/A   N/A    N/A   N/A   N/A   N/A
COV4-100   6    0.96  0.91  0.99  0.97   0.89  0.87  0.86  0.92
           12   0.97  0.93  0.99  1.00   0.91  0.89  0.87  0.93
           24   0.97  0.97  1.00  1.00   0.93  0.93  0.88  0.95
           48   0.97  0.99  1.00  1.00   0.94  0.96  0.88  0.96
           96   0.97  1.00  1.00  1.00   0.95  0.97  0.88  0.97
           192  0.97  1.00  1.00  1.00   0.95  0.97  0.87  0.98
           384  N/A   N/A   N/A   N/A    N/A   N/A   N/A   N/A
COV4-219   6    0.96  0.92  0.98  0.94   0.92  0.91  0.88  0.92
           12   0.96  0.94  0.99  0.98   0.93  0.92  0.88  0.93
           24   0.98  0.97  0.99  1.00   0.94  0.94  0.89  0.95
           48   0.99  0.99  1.00  1.00   0.95  0.95  0.90  0.96
           96   0.99  1.00  1.00  1.00   0.96  0.96  0.90  0.97
           192  0.99  1.00  1.00  1.00   0.97  0.97  0.90  0.97
           384  0.99  1.00  1.00  1.00   0.97  0.97  0.90  0.98

6 Conclusions

In this work, differential evolution was used to solve the column subset selection problem. The column subsets found by the DE were used to obtain a compressed (low-rank) representation of a collection of chest radiographs collected during the medical treatment of 219 COVID-19 positive patients. The compressed representation of the radiographs was used to train and evaluate several machine learning-based classifiers.

The computational experiments clearly illustrated that the column subsets evolved by the proposed evolutionary method for CSS allowed, on average, for more accurate classification by the majority of the classification algorithms. The classification results summarized in Table 3 are indeed impressive. However, they in fact raise questions about the data in the Covid Radiography Database. The results suggest that only 6 pixels out of the total 65536 pixels in the downscaled images (24 pixels out of the 1048576 pixels in the original radiographs) are needed to achieve fully automated COVID-19 detection with an accuracy of 92% and more. That is an interesting result that demands further scientific investigation.

Future work on this topic will include the evaluation of more machine learning-based classifiers (e.g., deep neural networks) and an efficient (parallel) implementation of the proposed algorithm.

Acknowledgements. This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic in project "Metaheuristics Framework for Multi-objective Combinatorial Optimization Problems (META MO-COP)", reg. no. LTAIN19176, and in part by the grants of the Student Grant System no. SP2020/108 and SP2020/161, VSB - Technical University of Ostrava, Czech Republic.


Transmission Scheduling for Tandemly-Connected Sensor Networks with Heterogeneous Packet Generation Rates

Ryosuke Yoshida(B), Masahiro Shibata, and Masato Tsuru

Computer Science and System Engineering, Kyushu Institute of Technology, Fukuoka, Japan
[email protected], {shibata,tsuru}@cse.kyutech.ac.jp

Abstract. A tandemly-connected multi-hop wireless sensor network model is studied. Each node periodically generates packets in every cycle and relays the packets in a store-and-forward manner over a lossy wireless link between two adjacent nodes. To cope with a considerable number of packet losses, we previously proposed a packet transmission scheduling framework in which each node transmits the packets it possesses multiple times, according to a static time-slot allocation, to recover from or avoid packet losses caused either by physical conditions on links or by interference of simultaneous transmissions among near-by nodes. However, we assumed that the packet generation rate is identical over all nodes, which is not always realistic. Therefore, in this paper, we extend our work to the case of heterogeneous packet generation rates. We derive a static time-slot allocation maximizing the probability of delivering all packets within one cycle period. By using an advanced wireless network simulator, we show its effectiveness and the issues to be solved.

1 Introduction

We focus on a multi-hop wireless network model in which network nodes are tandemly arranged and serially connected by unreliable lossy links. Each node generates a certain number of packets in every cycle period and also relays packets in a store-and-forward manner over a low-cost wireless link between two adjacent nodes. Generated packets are finally forwarded to one of the gateways located at both ends of the network and sent to a central server via the Internet. This type of configuration is often seen in sensor networks monitoring facilities (e.g., a power transmission tower network) in a geographically elongated field without a wide-area infrastructural network. A problem in such networks is frequent packet losses, caused either by attenuation and fading under environmental conditions (e.g., restricted placement of nodes) or by interference of simultaneous data transmissions among near-by nodes, especially with omnidirectional antennas.


This problem is common in multi-hop wireless networks, and a wide range of studies have been devoted to it over decades. In this paper, to mitigate interference of simultaneous transmissions, we adopt Time Division Multiple Access (TDMA)-based scheduling, because it is efficient and suitable in centrally-managed stationary network configurations. There are a number of studies on TDMA scheduling. A conflict-free TDMA scheduling for multi-hop wireless networks that minimizes the end-to-end delay was discussed in [1]. Two centralized algorithms for the shortest schedules for sensor networks with a few central data collectors were proposed in [2]. Wireless mesh networks with multiple gateway nodes were studied in [3] to maximize the traffic volume transferred between the mesh network and the central server via gateways. An online and distributed scheduling for lossy networks was studied in [4] to provide hard end-to-end delay guarantees. Most of these works focused on avoiding interference of simultaneous data transmissions on general network topologies by using the conflict graph or some heuristics. However, they do not explicitly consider retransmissions. On the other hand, our previous work [5,6] proposed a packet transmission scheduling framework to derive an optimal number of retransmissions for each packet on each link, while being restricted to tandemly-connected topologies with two gateways at the edges, which have not been well studied. We analytically derived a static time-slot allocation that maximizes the probability of delivering all packets in one cycle period, but only in the case that each node generates one packet every cycle. Therefore, in this paper, we deal with a transmission scheduling with heterogeneous packet generation rates.

2 Proposed Method

2.1 System Model

Fig. 1. Tandemly-connected multi-hop wireless network model (3-5 model).

In this study, we assume a tandemly-connected multi-hop wireless network model in which nodes are tandemly arranged and serially connected by unreliable lossy links. Each node is a sensor node that periodically generates packets and also relays those packets in a store-and-forward manner between two adjacent nodes. Packets generated by each sensor node are finally forwarded to one of the gateways located at both ends of the network and are sent to a central server via the Internet. The model consists of n nodes and two gateways X, Y at both ends, as illustrated in Fig. 1. The sensor nodes and wireless links are separately numbered from left to right. Each link is lossy and half-duplex.


A transmission of a packet by a node affects both links connected to that node. Therefore, to avoid transmission interference, two nodes within a two-hop distance should not transmit packets in the same direction simultaneously (i.e., in the same time-slot). The data-link layer does not provide any advanced packet loss recovery scheme, such as Automatic Repeat-reQuest (ARQ), or any advanced transmission power adaptation, so a proactive redundant packet transmission by each node is required to cope with packet losses. Therefore, each packet from an upstream node is stored and redundantly transmitted to a downstream node in scheduled slots. The packet loss rate of link j is defined as q_j (0 < q_j < 1), and the packet generation rate of node i is defined as an integer r_i. Node i generates r_i packets at (or before) the beginning of a cycle period D, and those packets are forwarded to gateway X or Y and then sent to the central server S. The central server S knows the values of q_j and r_i for any j and i, and designs a global static time-slot allocation to schedule the packet transmissions at each node based on these values. Thus, it can consider not only the loss rates of links but also the packet generation rates of nodes, while avoiding interference among near-by nodes based on a static interference avoidance policy considering the hop distance between nodes. The goal of the scheduling is to deliver all generated packets to either one of the gateways along the lossy links during a cycle period D, with a high success probability and fairness among packets from different nodes.

To design a time-slot allocation, we need to decide a routing direction of packets (i.e., to which direction the packets are transmitted on each link). We call it a path model. In this paper, we adopt the "dual separated (DS) path model" to determine the gateway to which each node forwards packets. In the DS path model, two independent paths are separated at a separation (unused) link. Any node at the left of the separation link forwards packets to the left gateway X, and any node at the right of the separation link forwards to the right gateway Y. The resulting model is called the l-r model, where l and r are the numbers of nodes located at the left and the right of the separation link, respectively. In Fig. 1, there are 7 candidates for the separation link of the DS path model. For example, if the link between nodes 3 and 4 is the separation link, the path model is the 3-5 model (this paper focuses on the 3-5 and 4-4 models). On a path model, based on the method explained in the next subsection, an optimal static time-slot allocation is derived by computing the theoretical probability that all packets are successfully delivered to either one of the gateways. To find the optimal static time-slot allocation, including the path model selection, we examine all reasonable path models one by one in this manner (see the sketch below), and finally choose the best combination of a path model and a slot allocation in terms of maximizing the above success delivery probability.
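The outer search over DS path models can be summarized as follows. This is a minimal sketch; solve_allocation is a hypothetical placeholder for the per-model optimization of Sect. 2.2 and is not part of the paper:

```python
def best_path_model(n, q, r, T, solve_allocation):
    """Try every DS path model (separation link after node l) and keep
    the one whose optimal slot allocation maximizes the probability of
    delivering all packets. `solve_allocation(q, r, T, l)` is assumed to
    return (allocation, probability) for the l-(n-l) model."""
    best = None
    for l in range(1, n):  # l nodes on the left, n - l on the right
        alloc, prob = solve_allocation(q, r, T, l)
        if best is None or prob > best[0]:
            best = (prob, l, alloc)
    return best  # (probability, l of the chosen l-(n-l) model, allocation)
```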

2.2 Static Slot Allocation for Redundant Transmission Scheduling

On a given path model, our method can derive a global time-slot allocation for redundant transmission scheduling. Let T be the total number of slots in one cycle period D, i.e., D/T is the time duration of one time-slot.


A global time-slot allocation allocates s_ij slots out of the T total time-slots for node i to send its packets via link j. According to the allocation, each node redundantly transmits each possessed packet (originally generated by node i) on downstream link j s_ij times, in the following order: the packets generated by the node itself are sent first in the allocated slots, and then the packets generated by other nodes located closer upstream are forwarded earlier. If a packet is lost upstream and does not reach the node, the slots allocated to that packet are used for the next packet. The derived time-slot allocation is optimal in the sense that it theoretically maximizes the success probability of delivering all packets within one cycle period, coping with packet losses on links by redundantly transmitting each packet the allocated number of times. To be more exact, let M_k be the success probability of delivery for packets from node k, i.e., the probability that all r_k packets generated by node k are successfully delivered to either one of the two gateways in T time-slots. Then the derived time-slot allocation aims to maximize the product ∏_{k=1}^{n} M_k.

This is the maximization of the logarithmic sum of the success probabilities M_k for packets generated by each node k, which aims at proportional fairness in terms of the utility M_k for node k over all nodes. It also amounts to the maximization of the success probability of delivery for all the packets, if the packet losses occur independently. In the following, we explain how the proposed method derives a slot allocation in the case of a network topology with n = 8 nodes. Figure 2a shows an example of a slot allocation on the left side of the separation link of the 3-5 model. In this example, the packet generation rates of nodes A, B, C are 1, 3, 2, respectively. The expressions a_j, b_j, and c_j are the numbers of slots allocated to transmit a packet generated by nodes A, B, and C on link j, respectively. The success probabilities of delivery for packets generated by nodes A, B, and C are denoted by M_a, M_b, and M_c, respectively, and can be calculated as follows.

M_a = (1 − q_1^{a_1})^{r_1},   M_b = (1 − q_1^{b_1})^{r_2} (1 − q_2^{b_2})^{r_2}   (1)
M_c = (1 − q_1^{c_1})^{r_3} (1 − q_2^{c_2})^{r_3} (1 − q_3^{c_3})^{r_3}   (2)

Here, q_j is the packet loss rate of link j and r_i is the packet generation rate of node i. Then the maximization problem is defined as follows.

max M_a M_b M_c   subject to   r_1 a_1 + r_2 b_1 + r_2 b_2 + r_3 c_1 + r_3 c_2 + r_3 c_3 = T   (3)

The Lagrangian multiplier is applied to a relaxed version of Eq. (3) to derive Eqs. (4)–(6), where a_i, b_i, c_i are not restricted to natural numbers.

a_1 = b_1 = c_1 = −log(1 − α log q_1) / log q_1   (4)
b_2 = c_2 = −log(1 − α log q_2) / log q_2,   c_3 = −log(1 − α log q_3) / log q_3   (5)
(r_1 + r_2 + r_3) a_1 + (r_2 + r_3) b_2 + r_3 c_3 = T   (6)


(a) Transmission scheduling on 3 nodes.


(b) Transmission scheduling on 5 nodes.

Fig. 2. Transmission scheduling on 3 nodes (left) and 5 nodes (right).

α can be determined from Eqs. (4)–(6). Hence, the real-value solution of (4)–(5) can be derived from α. However, the real-value solution of the relaxed problem cannot be used directly for the static slot allocation. Therefore, we examine integer-value solutions near the derived real-value solution to seek the optimal integer-value solution that maximizes the original problem. Note that the real-value solution of the relaxed problem provides an upper bound of the objective function M_a M_b M_c.

In a similar but more complex manner, the equations can also be derived for the 5-node part on the right side of the 3-5 model. In the case of 5 nodes, simultaneous packet transmissions by two distant nodes in the same time-slot are possible when the interference avoidance condition is satisfied. Figure 2b shows a slot allocation on the right side of the separation link. In this example, the packet generation rates of nodes H, G, F, E, D are 1, 2, 3, 2, 1, respectively. The formulation of the optimal slot allocation problem is as follows.

M_d = (1 − q_5^{d_5})^{r_4} (1 − q_6^{d_6})^{r_4} (1 − q_7^{d_7})^{r_4} (1 − q_8^{d_8})^{r_4} (1 − q_9^{d_9})^{r_4}   (7)
M_e = (1 − q_6^{e_6})^{r_5} (1 − q_7^{e_7})^{r_5} (1 − q_8^{e_8})^{r_5} (1 − q_9^{e_9})^{r_5}   (8)
M_f = (1 − q_7^{f_7})^{r_6} (1 − q_8^{f_8})^{r_6} (1 − q_9^{f_9})^{r_6}   (9)
M_g = ((1 − q_8^{ḡ_8})(1 − q_9^{g_9 + ḡ_9}) + (1 − q_8^{g_8}) q_8^{ḡ_8} (1 − q_9^{g_9}))^{r_7}   (10)
M_h = (1 − q_9^{h_9 + h̄_9})^{r_8}   (11)


Here, ḡ_8, ḡ_9, and h̄_9 represent the numbers of slots allocated to transmit a packet in the simultaneous-transmission region, in which the interference avoidance condition is satisfied. Because of the nature of the model, simultaneous transmissions are possible on the pair of links j = (5, 8) or on the pair of links j = (6, 9).

max M_d M_e M_f M_g M_h   (12)
subject to
r_4 d_5 + r_4 d_6 + r_4 d_7 + r_4 d_8 + r_4 d_9 + r_5 e_6 + r_5 e_7 + r_5 e_8 + r_5 e_9 + r_6 f_7 + r_6 f_8 + r_6 f_9 + r_7 g_8 + r_7 g_9 + r_8 h_9 = T   (13)
r_4 d_5 = r_7 ḡ_8,   r_5 e_6 + r_4 d_6 = r_8 h̄_9 + r_7 ḡ_9   (14)

In a similar way to the previous 3-node case, the real-value solution of the relaxed problem is derived, and then the optimal integer-value solution of the original problem can be found.
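As a concrete illustration for the 3-node side, the following sketch finds α by bisection on the slot-budget constraint (6), evaluates the objective of Eqs. (1)–(3), and searches a small integer neighborhood around the rounded real-value solution. It is a minimal sketch under the tied form a_1 = b_1 = c_1, b_2 = c_2 of Eqs. (4)–(5); the actual neighbor search may be broader:

```python
import math
from itertools import product

def real_allocation_3node(q, r, T):
    """Solve Eqs. (4)-(6): bisection on alpha so that the slot budget (6)
    is met; returns the real-valued (a1, b2, c3)."""
    def slots(alpha):
        return tuple(-math.log(1 - alpha * math.log(qj)) / math.log(qj)
                     for qj in q)
    def used(alpha):
        a1, b2, c3 = slots(alpha)
        return (r[0] + r[1] + r[2]) * a1 + (r[1] + r[2]) * b2 + r[2] * c3
    lo, hi = 0.0, 1.0
    while used(hi) < T:          # grow the bracket until the budget is exceeded
        hi *= 2.0
    for _ in range(100):         # bisection on constraint (6)
        mid = (lo + hi) / 2.0
        lo, hi = (mid, hi) if used(mid) < T else (lo, mid)
    return slots(lo)

def objective_3node(q, r, a1, b2, c3):
    """Ma * Mb * Mc of Eqs. (1)-(3) with b1 = c1 = a1 and c2 = b2."""
    Ma = (1 - q[0] ** a1) ** r[0]
    Mb = ((1 - q[0] ** a1) * (1 - q[1] ** b2)) ** r[1]
    Mc = ((1 - q[0] ** a1) * (1 - q[1] ** b2) * (1 - q[2] ** c3)) ** r[2]
    return Ma * Mb * Mc

def best_integer_3node(q, r, T):
    """Search integer allocations near the rounded real-value solution."""
    a1, b2, c3 = (int(x) for x in real_allocation_3node(q, r, T))
    best = None
    for da, db, dc in product(range(3), repeat=3):  # small neighborhood
        a, b, c = a1 + da, b2 + db, c3 + dc
        budget = (r[0] + r[1] + r[2]) * a + (r[1] + r[2]) * b + r[2] * c
        if budget <= T and a * b * c > 0:
            val = objective_3node(q, r, a, b, c)
            if best is None or val > best[0]:
                best = (val, (a, b, c))
    return best
```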

3 Simulation

To validate the effectiveness and the issues of the scheduling proposed in the previous section, we conduct synthetic simulations using an advanced packet-level network simulator, Scenargie, which can reflect various wireless configurations and realistic environments. The probability of delivering all packets (from all nodes) within one cycle period is considered as the performance metric, presented by three different values. The theoretical upper-bound (TUB) value is the theoretical maximum value of the objective function, obtained by solving the relaxed version of the maximization problem. The model-based computed (COM) value is the computed probability of delivering all packets according to a slot allocation using an optimal integer-value solution of the original integer-constrained maximization problem. The simulation-based measured (MES) value is the measured ratio of the number of successfully delivered packets to all packets in the simulation. Similarly, the success delivery probability for the packets generated by each specific node is examined by the same three values.

3.1 Simulation Settings

Simulations are executed in three cases with different settings for each link and node, assuming the number of nodes n is 8.

• Case 1: High packet loss rate near each GW.
• Case 2: High packet loss rate and high packet generation rate at the left side of the sensor array.
• Case 3: High packet loss rate at the right side of the sensor array and high packet generation rate at the left side of the sensor array.


The information of the links and nodes is shown in Tables 1 and 2.

Table 1. Packet loss rate of each link

Case  q1   q2   q3   q4   q5   q6   q7   q8   q9
1     0.4  0.1  0.2  0.2  0.4  0.2  0.2  0.1  0.4
2     0.5  0.3  0.4  0.4  0.2  0.3  0.2  0.1  0.3
3     0.3  0.1  0.2  0.3  0.3  0.4  0.4  0.2  0.5

Table 2. Packet generation rate of each node

Case  r1  r2  r3  r4  r5  r6  r7  r8
1     3   2   1   2   2   1   3   1
2     2   3   3   1   1   2   1   2
3     2   3   3   1   1   2   1   2

Table 3. The wireless link settings of Scenargie

Wireless standard    802.11g
Transmission power   20 [dBm]
Received power       −100 [dBm]
Modulation type      BPSK 0.5

Table 4. The relationship between the node distance and the packet loss rate

Loss rate     0.1     0.2     0.3     0.4     0.5
Distance [m]  938.00  954.75  964.55  974.19  983.45

The wireless link settings of Scenargie are shown in Table 3. Each node does not use automatic repeat-request (ARQ) or carrier sensing with ACK, but adopts broadcast transmission. Three different values (T = 50, T = 75, T = 100) are used as the number of slots in one cycle. The relationship between the node distance and the packet loss rate on a link is obtained in the following manner by Scenargie: a CBR application transmits packets 10000 times between the two nodes without any interference from other communications, and the packet loss rate is then decided by averaging the measured values over 10000 simulation instances. Table 4 shows the relationship between the node distance and the packet loss rate on a link as measured in Scenargie.

3.2 Simulation Results

Figure 3 shows the slot allocation of the right part of the 3-5 model with T = 75 slots in Case 2. The number beside each arrow indicates how many times the packet is sent on the corresponding link. Nodes G and H transmit simultaneously. Figure 4 shows the total packet delivery ratios of all cases (T = 75). I, II, and III in the figure denote the TUB, COM, and MES values, respectively. The blue bars show the total packet delivery ratios of the 4-4 model, and the red bars show those of the 3-5 model.


Fig. 3. The slot allocation of the right part of 3-5 model with slots T = 75 in Case 2

Fig. 4. The total packet delivery ratios

In Case 1, the 4-4 model achieves a higher total packet delivery ratio than the 3-5 model in all of the TUB, COM, and MES values. The difference in packet loss rates between the left and right sides separated by link j = 5 is small, and the minimum number of transmissions when considering the packet generation rate of each node is similar between the two sides. As a result, the total packet delivery ratio of the 4-4 model becomes high. In Case 2, in contrast to Case 1, the 3-5 model achieves a higher total packet delivery ratio than the 4-4 model. The packet loss rates and the packet generation rates are high on the left side, with link j = 4 as the boundary. However, since the 3-5 model can reduce the number of hops of packets exchanged on the left side, it achieves a higher total packet delivery ratio than the 4-4 model.


The fact that no node with a high packet generation rate is arranged near the upstream of the right-side portion also contributes to this delivery ratio. When a large number of nodes with high generation rates are arranged upstream of a flow, all the different packets of those nodes must be transmitted on each hop, and as a result, the finite number of slots is squeezed. It is therefore considered appropriate not to arrange a large number of nodes with high packet generation rates near the separation point. In Case 3, the 4-4 model achieves a higher total packet delivery ratio than the 3-5 model. This result follows from the fact that neither the overall packet generation rates nor the packet loss rates are biased to one side. The success delivery probability for all packets degrades in the order of the TUB value (I), the COM value (II), and the MES value (III) in all cases. It is natural that the COM value is always lower than the TUB value, because the TUB value is the theoretical maximum of the objective function allowing a real-number solution, while the COM value is the value of the same objective function with an optimal integer-number solution, which generally differs from the real-number optimal solution. Furthermore, the MES value in simulation is always lower than the COM value, since interference of simultaneous transmissions actually happens around the separation link between the most upstream nodes of the left-side and right-side flows. Figure 5 shows the relationship between the total packet delivery ratio and the packet generation rate of each node. The packet generation rate of each node is indicated in parentheses. As shown in the figures, the total packet delivery ratios of the upstream nodes are generally lower than those of the downstream nodes. This is because the packets generated by an upstream node have to traverse a larger number of lossy links. However, node G in Fig. 5a is an exception: the total packet delivery ratio of node G, whose packet generation rate is higher than that of upstream node F, is lower than that of node F. In addition, nodes B and C in Fig. 5b suffer from significantly lower total packet delivery ratios compared with nodes D to H on the right-hand side, even though the link loss rates are not very different between the two sides. This comes from our slot allocation policy. It aims to maximize the probability of delivering all packets from all nodes, without distinguishing between different packets generated by the same node and different packets generated by different nodes.

(a) Case 1, 3-5 model, T = 75

(b) Case 2, 3-5 model, T = 50

Fig. 5. Total packet delivery ratios for each node


Assuming that each generated packet has a delivery probability of around x, the total packet delivery ratio of a node with a packet generation rate of m is around x^m, which decreases as m increases. For example, with x = 0.95 and m = 3, the ratio is about 0.95^3 ≈ 0.86.

4 Concluding Remarks

In this paper, we considered a TDMA-based packet transmission scheduling for tandemly-connected sensor networks with lossy links. We have enhanced our previous work to derive a static time-slot allocation maximizing the success delivery probability for packets, even if the packet generation rates are heterogeneous over nodes. As future work, we will introduce an inter-packet XOR coding when transmitting multiple packets multiple times, which was shown to be beneficial [6] but is not trivial in our case of heterogeneous packet generation rates.

Acknowledgement. The research results have been achieved by the "Resilient Edge Cloud Designed Network (19304)," NICT, and by JSPS KAKENHI JP20K11770, Japan.

References

1. Djukic, P., Valaee, S.: Delay aware link scheduling for multi-hop TDMA wireless networks. IEEE Trans. Network. 17(3), 870–883 (2009)
2. Ergen, S., Varaiya, P.: TDMA scheduling algorithms for wireless sensor networks. Wireless Netw. 16(4), 985–997 (2010)
3. Tokito, H., Sasabe, M., Hasegawa, G., Nakano, H.: Load-balanced and interference-aware spanning tree construction algorithm for TDMA-based wireless mesh networks. IEICE Trans. Commun. E93B(1), 99–110 (2010)
4. Ho, I.: Packet scheduling for real-time surveillance in multihop wireless sensor networks with lossy channels. IEEE Trans. Wireless Commun. 14(2), 1071–1079 (2015)
5. Agussalim, Tsuru, M.: Message transmission scheduling on tandem multi-hop lossy wireless links. In: Proceedings of the 14th IFIP International Conference on Wired/Wireless Internet Communications (WWIC 2016), pp. 28–39 (2016)
6. Kimura, R., Shibata, M., Tsuru, M.: Scheduling for tandemly-connected sensor networks with heterogeneous link transmission rates. In: Proceedings of the 34th International Conference on Information Networking (ICOIN 2020), pp. 590–595 (2020)

Smart Watering System Based on Framework of Low-Bandwidth Distributed Applications (LBDA) in Cloud Computing

Nurdiansyah Sirimorok1(B), Mansur As2, Kaori Yoshida3, and Mario Köppen4

1 Graduate School of Computer Science and Systems Engineering, Kyushu Institute of Technology, Fukuoka, Japan [email protected]
2 Graduate School of Life Science and Systems Engineering, Kyushu Institute of Technology, Kitakyushu, Japan [email protected]
3 Department of Life Science and Systems Engineering, Kyushu Institute of Technology, Kitakyushu, Japan [email protected]
4 Department of Creative Informatics, Kyushu Institute of Technology, Fukuoka, Japan [email protected]

Abstract. The estimation of soil moisture content is required in agriculture, mainly to build irrigation scheduling models. In this study, we present a smart watering system that deals with various factors derived from stochastic information in agricultural operations, i.e., air temperature, air humidity, soil moisture, soil temperature, and light intensity. The methodology exploits the Internet of Things (IoT), using Low-Bandwidth Distributed Applications (LBDA) in cloud computing to integrate the real data sets collected by several sensor technologies. We conducted experiments for the watering system using two types of soil and different plants. The Long Short Term Memory Networks (LSTMs) approach from deep learning techniques is used to make smart decisions concerning watering requirements and to deal with the heterogeneous information coming from agricultural environments. Results show that our models can effectively improve the prediction accuracy of the watering system over various soils and plants.

1 Introduction

The Food and Agriculture Organization (FAO) recommends that all agricultural sectors be managed using innovative technology. The concept of agricultural development being pursued at this time is intelligent farming, also commonly called smart agriculture [25].


This concept refers to the Internet of Things (IoT) technologies in the modern farming system. This technology makes it possible to obtain data from the agricultural environment, process the data, and manipulate the state of the farm environment. The primary purpose of applying the technology is optimization in the form of increased yield (quality and quantity) and efficient use of resources [24,25]. Meanwhile, one of the crucial factors for producing high-quality agricultural products is careful management in every production stage, including field management practices in the watering system. A watering system is an automated watering schedule that uses information about environmental conditions, such as soil moisture, to ensure the plants get the optimum amount of water [17]. Water is an important production factor, and delaying irrigation may result in loss of water and crop yield [8,24]. Various sensor devices have been developed and are commercially available to measure soil moisture rapidly. In recent years, several studies using sensor-based smart irrigation controller technologies to measure soil moisture content have shown significant progress in modern agriculture [7]. In these studies, sensors are installed in the soil (land) and root area of plants or trees; the sensors diagnose the moisture level in the soil and send the readings to a server. The studies have shown that each type of soil moisture measurement device has its own benefits in terms of precision and reliability for determining soil water content [3,23]. Their models were also able to diagnose periodically when and how much water the plants in the soil need [14,18]. Thus, soil moisture information is the primary focus for assessing plant water requirements, water absorption, the water storage capacity of the soil, and the quantity of water movement.

Munir, M.S., and Bajwa, I.S. [18] researched the amount of water with respect to the level of water consumption in a medium-scale area, known as a Smart Watering System (SWS), for controlling and watering a garden automatically. A server-side application was built with a web-based interface to operate the proposed model. A decision support system based on fuzzy logic was then used to make intelligent decisions on the data obtained from the sensors. The study focused on the relationship between plants and environmental conditions such as soil moisture content, light intensity, air humidity, and temperature. It succeeded in providing a watering system with the flexibility and efficiency to deal with the environmental conditions; however, the authors suggested using more types of plants and implanting more sensors to generate more precise predictions. In another study, a smart system for monitoring and managing irrigation was designed and implemented [14], focusing on real data collected from sensors and processed using a two-class boosted decision tree algorithm (to train and process the data). The model successfully produces a weather forecast system and provides a decision to the farmer when watering is necessary.


Moreover, a smart conceptual architecture with a control system for farming irrigation using Arduino devices [1] is proposed in [20]; the study employed a pervasive technique in the Internet of Things (IoT) as the gateway for communication between the sensors and other devices to develop smart farming. Furthermore, the study applied the Kalman filtering algorithm to remove inaccuracies in the physical data from one sensor to another in order to obtain more accurate values [20], and then used a decision tree model to decide when it is suitable to start watering according to specific standards [14].

In general, most of the approaches presented above were intended to build a smart watering system mainly using environmental conditions, sensor devices, and IoT technology. These models provide better control of the timing of the farmer's watering system, save water, and serve as decision support. However, in those prototypes, different soil samples and various plants had not been tested for calibration at various moisture levels [18]. Besides, Susha Lekshmi et al. [23] reported that for fast, reliable, and spatially distributed soil moisture measurement, soil-specific parameters, i.e., sunlight, rainfall, ambient temperature, presence of organic matter, type of soil, etc., have to be considered [18,21]. Likewise, the characteristics of plants with respect to water consumption, such as the type of plant, plant age, leaf width, etc., are not covered by the currently available techniques.

On the other hand, Deep Learning (DL) techniques have shown progress in examining stochastic ecosystem processes, such as evaporation, soil moisture, and temperature, in modern agriculture [2,11]. An essential advantage of DL techniques applied in agriculture is feature learning, i.e., the automatic feature extraction from real data, with features from higher levels of the hierarchy being formed by the composition of lower-level components [24]. DL techniques are able to deal with more complex problems adequately and quickly, due to the more complex models used, which allow extensive parallelization [17]. These advanced models employed in DL techniques can improve the classification/prediction performance or decrease error values on stochastic data problems in heterogeneous agricultural environments [15].

Therefore, the main objective of this study is to propose a smart watering system based on the framework of Low-Bandwidth Distributed Applications (LBDA) in cloud computing. We present a distributed application in Internet of Things (IoT) technology, designed around application inter-dependencies to make it a low-bandwidth system and enable its use in real-life agriculture. In our proposed model, the data is collected through a collection of sensors that accurately diagnose the soil moisture level, light intensity, air humidity, and temperature. The information is then transformed using the Long Short Term Memory Networks (LSTMs) method from deep learning techniques to build the watering system. The remainder of the paper is organized as follows. Section 2 presents the materials and methods, the problem addressed, and the algorithm used; Sect. 3 presents the results and discussion, including data visualization and prediction results; Sect. 4 summarizes the solution and the limitations, followed by future work.


2 Materials and Methods

2.1 Materials

The first challenge to be addressed is the varying soil moisture for different types of plants. Each variety of plant typically requires a different amount of water, temperature, and soil content to optimize growth; thus, plants need different amounts of watering time and regularity. Soil water content is an extensive variable: it changes with the size, situation, and type of plants [18,21]. Therefore, to find the characteristics of the soil content, two typical plants with different growing environments were selected, i.e., lettuce and chili, cultivated in peat and loam soil in 95 cm pots with a slightly acidic to neutral soil pH of 6.5 to 7.0 and a plant spacing of 18 to 25 cm, without fertilizer, with the aim of keeping the soil content constant. The plants were exposed to direct sunlight, and both plants were watered using the same method (Fig. 1).

Fig. 1. Framework of smart watering system using Low-Bandwidth Distributed Applications (LBDA) in cloud computing

As shown in Table 1, we did not attempt to choose plants of similar variety, since that was important for testing the soil moisture characteristics of different plants. In this experiment, the chili plants had a height of approximately 45 cm, with more than 30 leaves and six branches. The lettuce plants reached up to 20 cm tall at an age of 15 days, with a leaf width of approximately 7 cm. While the moisture sensors were installed in the soil, the environmental conditions in the experiment area were monitored using a temperature/relative humidity sensor (light sensor tag) and an ultraviolet light sensor (UVB sensor). We then tracked the relative soil moisture [6] to define a smart management rule for the moisture condition: soil moisture >= 21%: Wet; soil moisture >= 18%: Medium wet; soil moisture >= 12%: Optimal; soil moisture >= 9%: Medium stress; and soil moisture < 9%: Extreme stress. The goal is to keep the moisture in the optimal range; the soil moisture level is then used to predict the next watering cycle.
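The management rule above maps directly onto a small threshold classifier. The following is a minimal sketch of the stated thresholds (relative soil moisture in %):

```python
def moisture_state(moisture_percent: float) -> str:
    """Map relative soil moisture (%) to the management rule above."""
    if moisture_percent >= 21:
        return "Wet"
    if moisture_percent >= 18:
        return "Medium wet"
    if moisture_percent >= 12:
        return "Optimal"
    if moisture_percent >= 9:
        return "Medium stress"
    return "Extreme stress"
```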


Light intensity and soil temperature are two environmental factors that have major effects on plant growth [21]. Therefore, in this study, we also investigate the light intensity using an ultraviolet light sensor (UVB sensor) and use that information to predict the soil moisture of each plant. The average light intensity obtained in the morning peak (09:00–12:00), midday (12:00–14:00), and afternoon peak (14:00–17:00) was 19.8%, 40.6%, and 20.5%, respectively.

Type soil Age

Pot 1 (lettuce/Mizuna) Peat soil Pot 2 (Chili/Piri-piri)

15 Days

Deal growing temperature 61–64 F (6–18 C)

Loam soil 3 Months 58–60 F (14–16 C)

Meanwhile, almost the soil sensors commercial has its default factory calibration. The existence of the calibration model usually was taken from the average of many different soil types or of a specific soil type [7]. So, to avoid the possibility if the calibrated value in the sensor will not match the soil type [12], we used two kinds of soil, i.e., first, the loam soil, this soil is a combination of sand, silt, and clay such that the beneficial properties from each are included. For instance, it can retain moisture and nutrients; hence, it is more suitable for farming. Second, the peat soil, this soil is high in organic matter and maintains a large amount of moisture [4,5]. Table 1 shows the types of plants and materials that we used in this experiment. 2.2

Methods

Firstly in the standalone scenario, the data collection consists of three types of sensors devices was installed. The output of these sensors has obtained the information of soil moisture, humidity, and temperature in https://my.wirelesstag.net and output of UV light sensor is the information of light intensity are provided by Vernier Data Share in which is synched with web service. Then we built a web service an automated line for deploying the Low-Bandwidth Distributed Applications (LBDA) in the cloud computing, which includes a decentralized protocol for self-configuring the virtual application machines [13]. The web service is written in PHP with a light weighted REST APIs to communicate the data between the Wireless Tag data, the cloud computing and OpenSimulator based virtual worlds. Next, to handle the best options in automatic learning systems for the soil moisture prediction that tends to be heterogeneous, as discussed in Sect. 2.1, the Long Short Term Memory Networks (LSTMs) is a part of artificial Recurrent Neural Networks (RNNs) architecture was implemented [9]. Here, we briefly introduce the Long Short Term Memory Networks (LSTMs) methods that were used to build the soil moisture content prediction. The LSTMs

452

N. Sirimorok et al.

is a transformed variant of Recurrent Neural Networks (RNNs) in Deep Learning (DL) techniques, which makes it more accessible to remember past data in memory [9]. The vanishing gradient problem of RNNs is determined here. So, LSTMs is well-suited to classify process and estimate time series given time lags of unknown duration. In LSTMs method contains three gates, i.e., input gate, forget gate, and output gate. So that, in each iteration process, between the three gates, prevents vanishing/exploding gradient problem of the original RNNs and allows the network to retain state information over longer periods of time [17,22]. The first process carried out by LSTMs is to determine the value in which is not used (forget gate) with the following equation: ft = σ(Wf .[ht−1 , xt ] + bf )

(1)

Second, the process for determining input gates with the following equation: it = σ(Wi .[ht−1 , xt ] + bi )

(2)

Ct = tanh(Wc .[ht−1 , xt ] + bc )

(3)

Then equation for input data as follow: Ct = ft ∗ Ct−1 + it ∗ Ct

(4)

The last is the process for determining the output data (output gates) with the following equation: ot = σ(Wo .[ht−1 , xt ] + bo ) (5) ht = ot ∗ tanh(Ct ),

(6)

where it , ft and ot are the input gate, forget gate and output gate, respectively. σ represents sigmoid function, Wx and bx are weight for the respective gate x and biases at neurons, ht−1 is output of the previous LSTMs block for input at current timestamp xt . While, Ct is memory at timestamp t for the cell state, candidate cell state and the final output. Meanwhile, the process for calculating the error value we using Mean Square Error (MSE) with the following equation: n

M SE =

1 (yx − yi )2 n i=1

(7)

The error value is calculated as the difference between the actual value and the predicted value, where y_i is the actual value and y_x is the predicted value. The prediction model was built with the Weka software (version 3.9.3) and the WekaDeeplearning4j package. The package supports the Weka Graphical User Interface (GUI) and brings advanced Deep Learning (DL) techniques to Weka [16].
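For illustration, Eqs. (1)–(6) can be written as a single forward step in NumPy. This is a minimal sketch; the weight and bias names are ours and do not reflect the WekaDeeplearning4j internals:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTMs forward step implementing Eqs. (1)-(6). W and b map the
    concatenated [h_{t-1}, x_t] to each gate's pre-activation."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate, Eq. (1)
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate, Eq. (2)
    c_tilde = np.tanh(W["C"] @ z + b["C"])   # candidate state, Eq. (3)
    c_t = f_t * c_prev + i_t * c_tilde       # cell state update, Eq. (4)
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate, Eq. (5)
    h_t = o_t * np.tanh(c_t)                 # hidden output, Eq. (6)
    return h_t, c_t
```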


In order to verify the prediction performance of our model, the daily averages of air temperature, air humidity, soil moisture, soil temperature, and light intensity are used as input data. The soil moisture prediction target is classified as follows: Extreme stress, Medium stress, Optimal, Medium wet, and Wet. The trained network is evaluated on a separate test data set, usually referred to as a generalization test. In this work, the network was trained using the full data set (i.e., 1498 records from 64 days, each day containing 24 hourly average data points), with a proportion of 60% for training and 40% for testing. Next, we implemented the prediction model based on the framework of Low-Bandwidth Distributed Applications (LBDA) in cloud computing. The framework is a part of cloud computing that enables fog communication between the application program interfaces (APIs) and data services to reduce the traffic bandwidth in IoT technologies [19]. Therefore, for communication between the client and the server, we built an LBDA application server on the IoT devices, where the LBDA becomes a flexible proxying service that minimizes the communication load by efficiently encoding binary message content and adapting the application protocol operation to reduce the number of exchanged messages; it further reduces the traffic for the end device.
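A minimal sketch of assembling the hourly-averaged input records and the 60/40 split described above is given below; the DataFrame column names are hypothetical, and moisture_state refers to the rule sketched in Sect. 2.1:

```python
import pandas as pd

def build_dataset(df: pd.DataFrame):
    """Hourly-averaged features and moisture-state labels from raw sensor
    logs. Assumes a DatetimeIndex and (hypothetical) column names."""
    hourly = df.resample("1h").mean().dropna()
    features = ["air_temp", "air_humidity", "soil_moisture",
                "soil_temp", "light_intensity"]
    X = hourly[features]
    y = hourly["soil_moisture"].map(moisture_state)  # rule from Sect. 2.1
    n_train = int(len(X) * 0.6)                      # 60/40 split as in the text
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])
```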

3 Results and Discussion

3.1 Data Visualization

Since we considered the characteristics of soil moisture based on air temperature, air humidity, soil temperature, and light intensity, the first stage is to analyze the soil moisture information taken from the moisture sensor in the soil of each plant (soil pot). These parameters are considered for analyzing the soil moisture in each pot based on the soil moisture pattern of Sect. 2.1, and the Long Short Term Memory Networks (LSTMs) learning method is then applied to predict the watering system. Here, we investigated the water consumption of each plant type based on the moisture (%) and temperature (C) of the soil from day to day over 24 h. Figures 2 and 3 show that the soil temperature and soil moisture level tend to vary in each pot throughout the 64 days. Although the soil temperature tends to vary from day to day, there is no significant difference between the two pots (lettuce and chili). The interesting observation concerns the soil moisture (water content), where there is a significant difference between the two pots. As can be seen in Fig. 2, the soil moisture of the lettuce plants in pot 1 steadily dries from day to day, while for the chili plants in pot 2 the soil moisture tends to stay wet; this shows that the water consumption of the lettuce plants tends to be higher than that of the chili plants. Similar results on the time-varying water consumption patterns of different plants have been reported previously: the growth rate and leaf number significantly impact the amount of water consumed over a day by each plant type [5].

454

N. Sirimorok et al.

Fig. 2. Soil moisture in the Pot 1 and Pot 2

could then be determined. Additionally, different variety of plants need a different quantity of water, so that a smart system is required for efficient utilization of the watering system for more efficient plants growth.

Fig. 3. Soil temperature in the Pot 1 and Pot 2

On the other hand, several studies have noted that variations in air temperature and humidity are highly correlated with soil moisture (water content) [21]. In Japan, March through May is the spring season, in which the temperature increases with large short-term fluctuations. The sunlight duration is long in the second half of spring due to the predominance of anticyclones, and the stormy season begins in early May [10]. Therefore, in this study, we also investigate the variations of air temperature and humidity. Figure 4 shows that air temperature and humidity vary significantly from day to day. It is also observed that when the air temperature decreases, the soil moisture also decreases; this can be seen in Figs. 2 and 3. Particularly in May, when the variation of air temperature significantly increases, the average soil temperature also changes, and the soil moisture in each pot tends to vary.


Fig. 4. Variation of temperature and air humidity for each day in March, April, and May 2020

3.2 Results Prediction

Table 2 shows the classification performance for each pot: pot 1 (lettuce plants) obtained an accuracy of 91.71%, while pot 2 (chili plants) obtained 87.32%. It has also been observed that the lettuce plants yield better classification performance than the chili plants, even though, as discussed in Sect. 2.1, lettuce has a relatively high water requirement and is very sensitive to changes in temperature and air humidity.

Table 2. Accuracy classification of smart watering system

Plants type              Soil type  Accuracy prediction
Pot 1 (lettuce/Mizuna)   Peat soil  91.71%
Pot 2 (chili/Piri-piri)  Loam soil  87.32%

Table 3. Classification of soil moisture (water) for Pot 1 lettuce with LSTMs performance

Categories      TP rate  FP rate  Precision  Recall
Extreme stress  0.875    0.031    0.778      0.875
Medium stress   0.805    0.022    0.826      0.805
Optimal         0.96     0.005    0.992      0.96
Medium wet      0.856    0.007    0.967      0.856
Wet             0.989    0.033    0.868      0.989
Mean per-type   0.917    0.015    0.922      0.917


Table 4. Classification of soil moisture (water) for Pot 2 chili with LSTMs performance

Categories      TP rate  FP rate  Precision  Recall
Extreme stress  0.662    0.014    0.768      0.662
Medium stress   0.922    0.031    0.922      0.922
Optimal         0.957    0.017    0.963      0.957
Medium wet      0.986    0.1      0.725      0.986
Wet             0.447    0.001    0.982      0.447
Mean per-type   0.873    0.037    0.891      0.873

The analysis of the data acquired for each type of plant (moisture in pot 1 and pot 2) is shown in Tables 3 and 4. The different levels of soil moisture content were classified using the Long Short Term Memory Networks (LSTMs). Sixty-four days of data were used to train and test all categories of soil moisture content. The results were measured in terms of the true positive rate (TPR), false positive rate (FPR), precision, and recall. In this experiment, for the classification of soil moisture of the pot 1 lettuce plants over all categories, the mean per-type true positive rate was 0.917, the false positive rate was 0.015, and the precision and recall were 0.922 and 0.917, respectively. For the pot 2 chili plants, the mean per-type true positive rate over all categories was 0.873, the false positive rate was 0.037, and the precision and recall were 0.891 and 0.873. From the results obtained, it can be said that determining all categories of soil moisture content for the smart watering system provided promising prediction results. It can thus be concluded that using a modern system, such as soil moisture sensors and IoT technology with the Long Short Term Memory Networks (LSTMs) approach, has the potential to improve the smart watering system, even though the soil is a heterogeneous natural environment with complex processes and uncertain mechanisms [17].
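The per-category metrics reported in Tables 3 and 4 follow from a confusion matrix in the standard way; a minimal sketch:

```python
import numpy as np

def per_class_metrics(cm):
    """TP rate, FP rate, precision, and recall per class from a confusion
    matrix cm, where cm[i, j] counts class-i samples predicted as class j."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp
    fp = cm.sum(axis=0) - tp
    tn = cm.sum() - tp - fn - fp
    return {"tp_rate": tp / (tp + fn), "fp_rate": fp / (fp + tn),
            "precision": tp / (tp + fp), "recall": tp / (tp + fn)}
```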

3.3 Discussion

In order to obtain the most accurate configuration for the plant watering system, a smart system has been developed that automatically detects the type of plant, based on deep learning techniques, in particular the Long Short-Term Memory network (LSTM) approach. The implementation relies on the following sensors and environments: air temperature and humidity sensors, soil temperature and moisture sensors, and light intensity. Moreover, we employed the Internet of Things (IoT) and Low-Bandwidth Distributed Applications (LBDA) in cloud computing for collecting and processing data. Two typical plants with different growing environments were then selected, i.e., lettuce plants using peat


soil and chili plants using loam soil. The results provide excellent accuracy in terms of soil moisture classification and show potential for improving the smart watering system. Moreover, our study obtained good performance on the two different testing datasets: as shown in Table 2, the prediction accuracy was 91.71% for the lettuce plants and a lower 87.32% for the chili plants. These results show that the LSTM method was able to generalize well to different datasets. However, more comparisons between DL techniques are necessary, since a considerable drawback and barrier is the need for large datasets to serve as input during the training procedure. As more complex architectures and tasks appear, comparing various DL techniques and classifiers, or combining them with automatic methods using various models, will improve the overall results [11]. In addition, since the experiment lasted three months, it should be noted that the weather in any given year is unpredictable. We also note that the land area has to be considered, because particle and water absorption by the soil are highly correlated with the depth and width of the land [4].

4 Conclusions and Future Work

Soil moisture is a crucial parameter for building a smart watering system. It is significantly affected by several environmental variables, e.g., air temperature, air humidity, light intensity, soil temperature, and the types of soil and plants. With advances in sensor and IoT technologies, the accuracy of smart watering systems has improved significantly, and the collected data can be used to predict changes in soil moisture. For future work, more complex experimental studies may compare more plant varieties, as well as metabolism, photosynthesis, and gene expression.

References

1. Arvindan, A., Keerthika, D.: Experimental investigation of remote control via android smart phone of Arduino-based automated irrigation system using moisture sensor. In: 2016 3rd International Conference on Electrical Energy Systems (ICEES), pp. 168–175. IEEE (2016)
2. Balducci, F., Impedovo, D., Pirlo, G.: Machine learning applications on agricultural datasets for smart farm enhancement. Machines 6(3), 38 (2018). https://doi.org/10.3390/machines6030038
3. Bittelli, M.: Measuring soil water potential for water management in agriculture: a review. Sustainability 2(5), 1226–1251 (2010). https://doi.org/10.3390/su2051226
4. Buschmann, C., Röder, N., Berglund, K., Berglund, Ö., Lærke, P.E., Maddison, M., Mander, Ü., Myllys, M., Osterburg, B., van den Akker, J.J.: Perspectives on agriculturally used drained peat soils: comparison of the socioeconomic and ecological business environments of six European regions. Land Use Policy 90, 104181 (2020)


5. Chen, T., Wang, X.: A correlation model on plant water consumption and vegetation index in Mu Us desert, in China. Procedia Environ. Sci. 13, 1517–1526 (2012). https://doi.org/10.1016/j.proenv.2012.01.144
6. Douville, H.: Relative contribution of soil moisture and snow mass to seasonal climate predictability: a pilot study. Climate Dynamics 34(6), 797–818 (2010). https://doi.org/10.1007/s00382-008-0508-1
7. Francesca, V., Osvaldo, F., Stefano, P., Paola, R.P.: Soil moisture measurements: comparison of instrumentation performances. J. Irrig. Drainage Eng. 136(2), 81–89 (2010). https://doi.org/10.1061/ASCE0733-94372010136:281
8. Galioto, F., Chatzinikolaou, P., Raggi, M., Viaggi, D.: The value of information for the management of water resources in agriculture: assessing the economic viability of new methods to schedule irrigation. Agricult. Water Manag. 227, 105848 (2020). https://doi.org/10.1016/j.agwat.2019.105848
9. Gao, Y., Glowacka, D.: Deep gate recurrent neural network. In: Asian Conference on Machine Learning, pp. 350–365 (2016)
10. JMA: Overview of Japan's climate. https://www.jma.go.jp/jma/indexe.html. Accessed 30 Sept 2010
11. Kamilaris, A., Prenafeta-Boldú, F.X.: Deep learning in agriculture: a survey. Comput. Electron. Agricult. 147, 70–90 (2018). https://doi.org/10.1016/j.compag.2018.02.016
12. Karim, N.B.A., Ismail, I.B.: Soil moisture detection using electrical capacitance tomography (ECT) sensor. In: 2011 IEEE International Conference on Imaging Systems and Techniques, pp. 83–88. IEEE (2011). https://doi.org/10.1109/IST.2011.5962195
13. Köppen, M., Yoshida, K.: The price of unfairness. In: 2016 International Conference on Intelligent Networking and Collaborative Systems (INCoS), pp. 463–468 (2016)
14. Kissoon, D., Deerpaul, H., Mungur, A.: A smart irrigation and monitoring system. Int. J. Comput. Appl. 163(8), 39–45 (2017). https://doi.org/10.5120/ijca2017913688
15. Kwok, J., Sun, Y.: A smart IoT-based irrigation system with automated plant recognition using deep learning. In: Proceedings of the 10th International Conference on Computer Modeling and Simulation, pp. 87–91. Association for Computing Machinery (2018). https://doi.org/10.1145/3177457.3177506
16. Lang, S., Bravo-Marquez, F., Beckham, C., Hall, M., Frank, E.: WekaDeeplearning4j: a deep learning package for Weka based on Deeplearning4j. Knowl.-Based Syst. 178, 48–50 (2019)
17. Lu, Y., Salem, F.M.: Simplified gating in long short-term memory (LSTM) recurrent neural networks. In: 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), pp. 1601–1604. IEEE (2017)
18. Munir, M.S., Bajwa, I.S., Cheema, S.M.: An intelligent and secure smart watering system using fuzzy logic and blockchain. Comput. Electr. Eng. 77, 109–119 (2019). https://doi.org/10.1016/j.compeleceng.2019.05.006
19. Pustišek, M., Dolenc, D., Kos, A.: LDAF: low-bandwidth distributed applications framework in a use case of blockchain-enabled IoT devices. Sensors 19(10), 2337 (2019)
20. Putjaika, N., Phusae, S., Chen-Im, A., Phunchongharn, P., Akkarajitsakul, K.: A control system in an intelligent farming by using Arduino technology. In: 2016 Fifth ICT International Student Project Conference (ICT-ISPC), pp. 53–56. IEEE (2016)


21. Quanqi, L., Xunbo, Z., Yuhai, C., Songlie, Y.: Water consumption characteristics of winter wheat grown using different planting patterns and deficit irrigation regime. Agricult. Water Manage. 105, 8–12 (2012). https://doi.org/10.1016/j.agwat.2011.12.015
22. Sainath, T.N., Vinyals, O., Senior, A., Sak, H.: Convolutional, long short-term memory, fully connected deep neural networks. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4580–4584. IEEE (2015)
23. Su, S.L., Singh, D., Baghini, M.S.: A critical review of soil moisture measurement. Measurement 54, 92–105 (2014). https://doi.org/10.1016/j.measurement.2014.04.007
24. Torres-Sanchez, R., Navarro-Hellin, H., Guillamon-Frutos, A., San-Segundo, R., Ruiz-Abellón, M.C., Domingo-Miguel, R.: A decision support system for irrigation management: analysis and implementation of different learning techniques. Water 12(2), 548 (2020). https://doi.org/10.3390/w12020548
25. Trendov, N.M., Varas, S., Zeng, M.: Digital technologies in agriculture and rural areas. FAO, Rome, Italy (2019)

P4-Based Implementation and Evaluation of Adaptive Early Packet Discarding Scheme

Kazumi Kumazoe1(B) and Masato Tsuru2

1 Kyushu Institute of Technology, 1-1 Sensui-cho, Tobata-ku, Kitakyushu-shi, Fukuoka, Japan [email protected]
2 Kyushu Institute of Technology, 680-4 Kawazu, Iizuka-shi, Fukuoka, Japan [email protected]

Abstract. Software Defined Networking (SDN) has attracted widespread attention due to its architectural potential to flexibly and dynamically control network switches and the packets traversing them. Recently, programmability on the data plane has become an active area of research. For example, Programming Protocol-independent Packet Processors (P4) was introduced as a language that enables packet-level processing on the data plane, and it is supported not only by software switches but also by a variety of hardware switches and devices. Motivated by this shift, we provide a P4-based implementation of MTQ/QTL, a dynamic advanced Active Queue Management (AQM) scheme previously proposed by the authors. In this paper, we report the P4-based MTQ/QTL implementation and its evaluation on software switches with the Mininet emulator. Through the emulation, we verify that the benefits of MTQ/QTL for delay-sensitive application flows are similar to our previous simulation results, which suggests that data plane programming using P4 for advanced AQM is feasible and promising as a next-generation SDN enabler.

1 Introduction

Software Defined Networking (SDN) has a long history of study and has attracted widespread attention due to its architectural challenges, including the separation of the data plane and the control plane [1]. In particular, programmability on the control plane has been actively developed for a decade with open APIs (e.g., OpenFlow), which enable a controller to manage network switches/devices in a centralized manner. Then Programming Protocol-independent Packet Processors (P4) [2], a domain-specific language, appeared in 2014; it enables the data plane (e.g., packets on a flow) to be programmed directly. Conventional switches (including OpenFlow switches) that do not support data plane programming cannot instantly change their packet-processing functionality beyond the rules written in ASIC datasheets, whereas the P4 architecture allows an administrator or controller to install and update the rules in a top-down approach [3].


Recently, the P4 framework has been supported by a variety of software-based and hardware-based switches and devices, and thus a number of studies on data plane programming using P4 have appeared. For example, in [4], an on-switch mechanism named SQR (Shared Queue Ring), which caches a small number of recently transmitted packets and retransmits them on an appropriate backup network path, is proposed to reduce the FCT (Flow Completion Time) in the presence of link failures in datacenter networks. A performance evaluation on a hardware testbed using a Barefoot Tofino switch shows that this approach is effective and practical. In [5], a congestion avoidance scheme, in which congestion is detected by monitoring both the packet processing delay and the queueing delay and affected flows are re-routed, is implemented on both software and hardware switches. The work in [6] implements CoDel, a kind of Active Queue Management (AQM) scheme targeting the bufferbloat problem, on a software switch and confirms its effectiveness. In [7], we proposed a dynamic user-oriented AQM scheme named MTQ/QTL and evaluated its performance via simulation. The MTQ/QTL scheme requires non-standard functions for processing the queues and the packet header to be implemented on each switch. Therefore, in this paper, we report a P4-based MTQ/QTL implementation and its evaluation on a software switch (BMv2) [8] using the Mininet emulator [9].

2 Adaptive Early Packet Discarding Scheme (MTQ/QTL)

The MTQ (Maximum Transmission Queue)/QTL (Queue To Live) is an adaptive early packet discarding scheme for delay-sensitive application flows that was previously proposed by the authors [7]. There are four types of delays experienced by each packet in the network: the propagation delay, the transmission delay, the queueing delay, and the processing delay. MTQ and QTL work based on the queueing delay (the time a packet will spend in one queue, or the total time it will spend in all the queues it will pass through) to proactively discard packets that are likely to become useless for the application. The MTQ and/or QTL parameters are set in the header of each packet at the sender node of an application flow, and the packets are forwarded to intermediate nodes. Every time a packet arrives at an intermediate node, it is queued or discarded according to the MTQ and/or QTL mechanisms, as shown in Fig. 1. Packets with the MTQ parameter are managed based on a local queuing delay limit at each intermediate node. Packets with the QTL parameter, on the other hand, are managed based on a global (total) queuing delay limit over the end-to-end network path. In the MTQ/QTL scheme, both the MTQ and QTL parameters can be set in the header jointly, and both values are examined sequentially at each intermediate node (i.e., the MTQ value is examined first, and then the QTL value is checked and updated if necessary). Note that, as a flow performance metric to be improved, we consider the Effective Packet Loss Rate (EPLR), an application-aware packet loss rate counting both the packets discarded at a queue due to buffer overflow or MTQ/QTL and the packets


Fig. 1. MTQ/QTL algorithms.

discarded by the receiver because the experienced end-to-end delay exceeds the maximum allowable end-to-end delay. Since an end-to-end delay consists of a fixed part (depending on the packet size and the network path), which is assumed to be known, and a total queuing delay, we adopt a maximum allowable total queuing delay, called "the allowable queuing delay", instead of a maximum allowable end-to-end delay. As demonstrated by simulation in [7], MTQ or QTL alone can improve the EPLR of flows in different ways, but a proper combination is more beneficial.
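To make the per-packet behavior of Fig. 1 concrete, the following is a minimal sketch of the MTQ/QTL check as we read it from the description above. It is not the authors' P4 code; the field and variable names (`flag`, `mtq`, `qtl`, `expected_delay_us`) are our own illustrative choices.

```python
MTQ_FLAG, QTL_FLAG = 0x1, 0x2  # hypothetical flag bits (cf. Table 1 in Sect. 3)

def on_arrival(pkt, queue_len_pkts, rate_pps):
    """Decide whether to enqueue or discard an arriving packet.

    The expected queuing delay at this node is estimated from the
    current queue length and the output rate of the queue.
    """
    expected_delay_us = queue_len_pkts * 1_000_000 // rate_pps

    # MTQ: local per-queue delay limit, examined first.
    if pkt["flag"] & MTQ_FLAG and expected_delay_us > pkt["mtq"]:
        return "discard"

    # QTL: remaining budget for the total queuing delay over the path,
    # decremented at each node so downstream nodes see what is left.
    if pkt["flag"] & QTL_FLAG:
        if expected_delay_us > pkt["qtl"]:
            return "discard"
        pkt["qtl"] -= expected_delay_us

    return "enqueue"
```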

3 P4-Based Implementation of MTQ/QTL Scheme

We implemented the MTQ/QTL algorithms by referring to the P4 programming samples available in the VM [10] in order to evaluate their feasibility using the SDN emulator Mininet [9]. We used the simple_switch target of the software switch BMv2, as shown in Fig. 2. In the parser, packets are parsed based on the protocols defined in the header; each packet is then controlled in the ingress and egress parts (e.g., deciding the destination, modifying the packet, including adding or removing headers, and discarding packets). In the BMv2 simple_switch model, the queue sits between the ingress and egress parts. Various parameters, such as the queue length and the output rate of the queue (e.g., packets per second), can be set via the simple_switch CLI provided by BMv2. The MTQ/QTL parameters, which we implemented in option fields of the IP header, are shown in Table 1, and their implementation in P4 code, together with the class definitions needed in Scapy [13] (used as the generator/receiver of packets), is shown in Fig. 3. Statistics at the switches, including the number of arriving packets, the number of discarded packets, and the queue length, can be measured using the counters and/or registers implemented in BMv2. As explained in Sect. 2, the MTQ/QTL scheme refers to the queue length and estimates the queueing delay when each packet arrives at the ingress part of the queue. In the original BMv2 implementation, however, the queue length cannot be read at the ingress part of the queue; namely, the queue length


Fig. 2. Configuration of simple switch target on BMv2 from [11, 12].

Table 1. Fields in the Option Field implemented in MTQ/QTL algorithms

Field name               Length  Value
Flag                     16 bit  1 (MTQ), 2 (QTL), 3 (MTQ+QTL)
MTQ                      32 bit  MTQ value in microseconds
QTL                      32 bit  QTL value in microseconds
Allowable Queuing Delay  32 bit  Allowed queuing delay in microseconds

Fig. 3. Example of IP header option implementation ((a) P4, (b) Scapy).
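Figure 3(b) is not reproduced here; the following is a small Scapy sketch of how such an option layer could be declared, assuming the field layout of Table 1. The class name, field names, and default values are our own and only illustrate the structure, not the authors' exact code.

```python
from scapy.packet import Packet
from scapy.fields import ShortField, IntField

class MTQQTLOption(Packet):
    """Hypothetical Scapy layer mirroring the option fields in Table 1."""
    name = "MTQQTL"
    fields_desc = [
        ShortField("flag", 3),       # 1: MTQ, 2: QTL, 3: MTQ+QTL
        IntField("mtq", 1_500_000),  # MTQ value in microseconds
        IntField("qtl", 2_500_000),  # QTL value in microseconds
        IntField("aqd", 3_000_000),  # allowable queuing delay in microseconds
    ]
```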

is available only when the packet arrives at the egress part [14]. Therefore, we modified the BMv2 implementation so that the queue length can be monitored when each packet arrives at the ingress of the queue. Note that, in the BMv2 implementation on Mininet, the processing time to transmit one packet is determined by the packet-per-second setting of the queue output rate, regardless of the packet size. Based on this architecture, we implemented the MTQ/QTL algorithms in the BMv2 architecture.

4 Emulation-Based Evaluation

4.1 Evaluation Settings

The network model used to evaluate our P4-based MTQ/QTL is illustrated in Fig. 4. Three constant bit rate (CBR) UDP flows, f1, f2, and f3, are generated by Scapy; each flow consists of 1000 packets with a size of 100 [Byte]. On each (output) link, the length of the queuing buffer is 64 packets and the packet transmission rate is 30 [pps]; thus the transmission time of one packet on a link is 1/30 = 0.033 [s]. As a congested network scenario, flow f1 competes with other flows at two queues (on s1 and s2) and may experience a longer end-to-end delay in the network, while flows f2 and f3 compete with another flow at only a single queue, on s1 and s2, respectively. As a performance metric for delay-sensitive application flows with an application-aware maximum allowable total queuing delay (we call it "the allowable queuing delay"), we adopt the EPLR introduced in Sect. 2. The allowable queuing delay is set to 3 [s] in the following evaluation.

Fig. 4. Emulation network model.

Please note that, in P4, a program can run on a variety of software/hardware-based platforms with little or no change. Thus, in this section we verify the functionality of a basic P4 implementation on software switches. A more performance-oriented refinement on P4 with hardware devices remains future work. In our evaluation, a low processing speed and a long per-packet transmission time in the queues are emulated; although these settings may not reflect a realistic application and network, the purpose is to avoid unnecessary effects from performance issues caused by the resources of the emulation environment.

4.2 Results Without MTQ/QTL

Figures 5 and 6 provide experimental queue length histograms of s1 and s2 switches without MTQ/QTL on the network shown in Fig. 4, respectively. The histogram indicates how many arriving packets see a range of queue lengths, e.g., [0, 10], [11, 20], . . . , [51, 63], [64]. Since the size of queuing buffer is 64 packets, any arriving packet that sees a queue length of 64 is discarded. Most often, the queue length appears between 51 and 63. Since the transmission time of one packet on output link is 1/30 [s], an


arriving packet will experience a delay of n/30 [s] at the switch when the queue length is n packets. Depending on the number of congested queues on its path, a packet on flow f1 will experience at most a delay of 1/30 × 64 × 2 = 4.26 [s], while a packet on f2 or f3 will experience at most a delay of 1/30 × 64 × 1 = 2.13 [s]. In this network configuration, we consider an application that requires an allowable queuing delay of 3 [s] (3000 [ms]) and examine the effects of MTQ/QTL on flows of this application.
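These worst-case figures follow directly from the buffer size and output rate; a small check using the scenario's constants:

```python
BUF_PKTS, RATE_PPS = 64, 30        # per-link queuing buffer and output rate

def max_queuing_delay_s(congested_queues):
    """Worst-case total queuing delay over the given number of full queues."""
    return BUF_PKTS / RATE_PPS * congested_queues

print(max_queuing_delay_s(2))  # flow f1 (two congested queues): ~4.27 s
print(max_queuing_delay_s(1))  # flows f2/f3 (one congested queue): ~2.13 s
```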

Fig. 5. Queue length distribution at s1 .

Fig. 6. Queue length distribution at s2 .

Figure 7 compares the packet loss rates of flows f1, f2, and f3 at the queues, at the application, and in total (i.e., EPLR). Packets on flow f1 often experience a total queuing delay exceeding the allowable queuing delay and thus are discarded by the application even if they are received at h3. On the other hand, packets on flows f2 and f3 are never discarded by the application.

4.3 Results with MTQ and/or QTL

As explained above, since the size of the queuing buffer is 64 packets and the transmission time of one packet is 1/30 [s], the delay experienced at a single queue is at most 2.133 [s] (2133 [ms]). Therefore, we examine MTQ values ranging from 200 to 2000 [ms].

Fig. 7. Packet loss rates at queues, at the application, and in total (i.e., EPLR).


Fig. 8. Queue length distribution at s1 .

Fig. 9. Queue length distribution at s2 .

Figures 8 and 9 provide experimental queue length histograms of the s1 and s2 switches with an MTQ value of 1500 [ms]. With this MTQ value, the queue length is effectively limited to 45 packets, which is around 70% of the original buffer length of 64 packets. In other words, when the queue length is 45, a newly arriving packet is discarded by MTQ even though the queuing buffer has not overflowed. Figure 10 compares the packet loss rates of flows f1, f2, and f3 with MTQ values of 1200 and 1500 [ms] at the queues, at the application, and in total (EPLR). When the MTQ value is set to 1200 [ms], there is no packet loss at the application. This is because, even on flow f1 (h1 to h3; competing at two switches), the queuing delay is limited to 2400 (1200 × 2) [ms] by MTQ, which is shorter than the allowable queuing delay (3000 [ms]). However, there are considerable packet losses at the queues due to this strong limitation on the queue length.
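The 45-packet limit can be read off directly from the MTQ budget and the queue's output rate; a one-line sketch under the scenario's settings:

```python
def mtq_queue_limit_pkts(mtq_ms, rate_pps=30):
    """Queue length beyond which a new arrival would exceed the MTQ budget."""
    return int(mtq_ms / 1000 * rate_pps)

print(mtq_queue_limit_pkts(1500))  # -> 45 packets (cf. Figs. 8 and 9)
print(mtq_queue_limit_pkts(1200))  # -> 36 packets
```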

Fig. 10. Packet loss rates at queues, at the application, and in total (i.e., EPLR) (MTQ adopted).

In contrast, with the MTQ value of 1500 [ms], the packet losses at the queues are mitigated due to a weaker limitation on the queue length. However, there are considerable packet losses at the application on flow f1, although the queuing delay is expected to be at most 3000 (1500 × 2) [ms], which does not exceed the allowable queuing delay. We therefore investigated the cause of this behavior. Figures 11 and 12 provide experimental histograms of the queuing delay experienced by each packet (measured when a packet departs the queue) at s1 and s2 with an MTQ value of 1500 [ms]. Contrary to our expectation, the experienced delay in the queue


sometimes exceeds the 1500 [ms] limit imposed by MTQ. In our P4-based MTQ implementation, when a new packet arrives, the expected queuing delay is computed based on the current queue length. If the expected delay is less than the MTQ value (1500 [ms] in this case), the packet is queued; otherwise it is discarded. These results suggest that a packet experiences not only the queuing delay but also an additional processing delay of up to 40 [ms] while queued. In the following experiment scenarios, based on this observation, we set the MTQ and QTL values conservatively, i.e., smaller than the allowable queuing delay of the application.

Fig. 11. Queuing delay time distribution at s1 .

Fig. 12. Queuing delay time distribution at s2 .

Figure 13(a) shows the effect of the early packet discarding of MTQ on the effective packet loss rate (EPLR) of each of the three flows. A too-small MTQ value causes unnecessary early packet discarding, while a too-large MTQ value provides very little effect. With an MTQ value of 1400 [ms], the EPLR of flow f1 is minimized, while the EPLRs of flows f2 and f3 are a little larger than those with an MTQ value of 1000 or 1200 [ms]. In other words, with an appropriate setting, MTQ significantly benefits the flow in the most severe condition in terms of EPLR while not damaging the other flows. Similarly, Fig. 13(b) shows the effect of QTL on the EPLR of each of the three flows. The EPLR of flow f1 is significantly improved by QTL, especially with a QTL value of 2500 [ms]. On switch s2, a packet on flow f1 is discarded before being queued if it is expected to experience a total queuing delay (on s1 and s2) exceeding the QTL value, and this early discarding helps other packets on s2. For flow f2, QTL has very little effect on the EPLR. This is because f2 competes only with f1 at s1, and the maximum queuing delay at s1 is 2133 [ms]. Therefore, the packets on flow f2 always experience a queuing delay of less than 2133 [ms] and thus are not discarded by QTL with a value larger than 2133 [ms]. In contrast, for flow f3, QTL achieves an EPLR of 0. This is because f3 competes only with f1 at s2, and some of the competing packets on f1 are discarded by QTL; thus, the congestion on s2 is mitigated and the packet losses on f3 can be avoided. Finally, Fig. 13(c) shows the combined effect of MTQ+QTL on the EPLR of each of the three flows. The results suggest that effective combinations of (MTQ, QTL) are (1200, 2800), (1400, 2800), and (1500, 2500) in this scenario. Compared with


Fig. 13. Effective Packet Loss Rate (a) MTQ, (b) QTL, (c) MTQ+QTL.

the results in Fig. 13(a) (MTQ only) and Fig. 13(b) (QTL only), an appropriate combination can achieve a better and more balanced performance improvement of the three flows in terms of EPLR.

5 Concluding Remarks

MTQ allows the sender of an application flow to set a time in its packets, depending on the application, to limit the queuing delay in each queue they will pass through. QTL allows the sender to set a time in its packets to limit the total queuing delay over all the queues they will pass through. In this sense, MTQ/QTL is a good example of user-oriented dynamic AQM, and this motivated us to attempt a P4-based implementation. All experimental results are consistent with our simulation-based evaluation of MTQ/QTL in the previous paper. This consistency suggests the validity of our P4-based implementation of MTQ/QTL and the feasibility of data plane programming using P4 for advanced AQM.

Acknowledgements. The research results have been achieved by the "Resilient Edge Cloud Designed Network (19304)," NICT, and by JSPS KAKENHI JP20K11770, Japan.


References

1. Feamster, N., Rexford, J., Zegura, E.: The road to SDN: an intellectual history of programmable networks. ACM SIGCOMM Comput. Commun. Rev. 44(2) (2014)
2. Bosshart, P., Daly, D., Gibb, G., Izzard, M., et al.: P4: programming protocol-independent packet processors. ACM SIGCOMM Comput. Commun. Rev. 44(3), 88–94 (2014)
3. P4 Language Tutorial. https://p4.org/assets/P4_D2_East_2018_01_basics.pdf. Accessed 17 June 2020
4. Qu, T., Joshi, R., Chan, M.C., et al.: SQR: in-network packet loss recovery from link failures for highly reliable datacenter networks. In: IEEE 27th International Conference on Network Protocols (ICNP), Chicago, IL, USA, pp. 1–12 (2019)
5. Turkovic, B., Kuipers, F., van Adrichem, N., et al.: Fast network congestion detection and avoidance using P4. In: ACM SIGCOMM 2018 Workshop on Networking for Emerging Applications and Technologies (NEAT 2018), Budapest, Hungary, pp. 45–51 (2018)
6. Kundel, R., Blendin, J., Viernickel, T., et al.: P4-CoDel: active queue management in programmable data planes. In: IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN), Verona, Italy, pp. 1–4 (2018)
7. Kumazoe, K., Tsuru, M., Oie, Y.: Adaptive early packet discarding scheme to improve network delay characteristics of real-time flows. IEICE Trans. Commun. 90(9), 2481–2493 (2007)
8. BEHAVIORAL MODEL (bmv2). https://github.com/p4lang/behavioral-model. Accessed 17 June 2020
9. Mininet: An Instant Virtual Network on your Laptop. http://mininet.org. Accessed 17 June 2020
10. P4 Developer Day 2019. https://p4.org/events/2019-04-30-p4-developer-day/. Accessed 17 June 2020
11. The BMv2 Simple Switch target. https://github.com/p4lang/behavioral-model/blob/master/docs/simple_switch.md. Accessed 17 June 2020
12. P4 Language Specification ver. 1.0.5. https://p4.org/p4-spec/p4-14/v1.0.5/tex/p4.pdf. Accessed 17 June 2020
13. Scapy: Packet crafting for Python2 and Python3. https://scapy.net/. Accessed 17 June 2020
14. https://github.com/p4lang/behavioral-model/issues/310. Accessed 17 June 2020

Matching Based Content Discovery Method on Geo-Centric Information Platform

Kaoru Nagashima1(B), Yuzo Taenaka2, Akira Nagata3, Hitomi Tamura4, Kazuya Tsukamoto1, and Myung Lee5

1 Kyushu Institute of Technology, Iizuka, Japan [email protected], [email protected]
2 Nara Institute of Science and Technology, Ikoma, Japan [email protected]
3 iD Corporation, Fukuoka, Japan [email protected]
4 Fukuoka Institute of Technology, Fukuoka, Japan [email protected]
5 CUNY, City College, New York, USA [email protected]

Abstract. We have proposed the concept of a new information platform, the Geo-Centric Information Platform (GCIP), that enables IoT data fusion based on geolocation. GCIP produces new and dynamic contents by combining cross-domain data in each geographic area and provides them to users. In this environment, it is difficult to find the contents a user requests, because the user cannot know beforehand what contents are created in each area. In this paper, we propose a content discovery method for the GCIP. This method evaluates the relevancy between the topics specified in user requests and the topics representing the IoT data used for creating contents (we call this matching), and presents candidates for the desired contents based on the relevancy. Simulation results show that appropriate contents can reliably be discovered in response to a user's request.

1 Introduction

Cross-domain (horizontal-domain) IoT data fusion has attracted much attention. To create contents based on local cross-domain IoT data, linking geolocation to IoT data is essential. However, since each network is managed without consideration of physical location, the feasibility of cross-domain data fusion is quite low in practical environments. We have therefore proposed the Geo-Centric Information Platform (GCIP) [1], which performs geolocation-oriented IoT data collection and processing. Although each IoT device is usually dedicated to a particular service, GCIP collects data created in physical proximity and produces contents from them, as a secondary use, on nearby edge servers. GCIP introduces two kinds of edge servers,


a data store (DS) server and data fusion (DF) servers, for this processing. A DS server, which is assumed to be deployed for a particular area and managed by a local organization such as a mall owner or a municipality, stores the collected IoT data. DF servers, deployed by contents service providers, produce new and dynamic contents using the cross-domain IoT data stored in the DS server. In this paper, we define the contents created by DF servers as spatio-temporal contents (STCs). Since IoT data such as sensor data dynamically change in number, sensing preciseness, existence, etc., for various reasons, such as the movement and sleep cycles of IoT devices, a DF server may produce different STCs depending on the IoT data observable at that point. Therefore, users can hardly keep track of the variable STCs created in a timely manner. Moreover, as the number of DF servers (contents providers) may differ between geographical areas, it is quite difficult to identify every DF server, let alone to know which STC is created on which DF server. Therefore, GCIP needs to provide users with a mechanism that finds the STCs they are interested in. In this paper, we propose a content discovery method that finds candidates for the STCs a user is requesting. Specifically, in this method, a DS server evaluates the relevancy between the topics specified in user requests and the topics representing the IoT data used for creating STCs, and then selects the DF server with the highest potential to have appropriate STCs. The designated DF server searches for appropriate STCs in its own STC database and presents a list of STC candidates to the user. We evaluate the effectiveness of our proposed discovery method on the GCIP through simulation. The rest of this paper is organized as follows. We first review the existing studies in Sect. 2. Then, we describe the requirements for STC management on the GCIP in Sect. 3. The proposed method is described in Sect. 4, and the simulation environment and results are explained in Sect. 5. Finally, the conclusion is provided in Sect. 6.

2 Related Work

In this section, we review the existing studies focusing on geolocation-based cross-domain data fusion and IoT content discovery. Reference [2] surveys the existing studies on smart city applications, which use IoT data for users in a city. Various sensors are deployed on such platforms, but the IoT data obtained from these sensors are used for a specific application only. On the other hand, since our proposed GCIP is designed for cross-domain IoT data fusion, it aims to support a variety of purposes even in a smart city built by multiple service providers. Therefore, our study flexibly produces contents without any limitation on the purposes of the sensors and networks. References [3] and [4] show examples of smart city applications using a cloud platform. In these papers, all IoT data are sent to the cloud, processed, and then delivered to the user from the cloud. In contrast, GCIP processes the collected IoT data at edge servers; there is no need to carry them to the cloud,


so that people can benefit from local use, in terms of not only a reduced load on the network but also low latency. Reference [5] surveys the existing studies on content search methods from various perspectives (e.g., event-based [6], metadata-based [7], and location-based [8]). None of the existing studies tries to find contents dynamically created from collected IoT data, whereas the proposed method can achieve this. Information Centric Networking (ICN) is a promising concept that brings efficient content search and distribution. As ICN operates on a content basis, not an IP basis [9], users can directly search for a content by its name without knowing the location of the content. However, since our contents are created from the cross-domain IoT data observable at a given time, the user cannot know their names at retrieval time in ICN, which makes it extremely difficult to search for the contents. In this sense, the proposed content discovery method is novel.

3 STC Management on the GCIP

This section briefly describes the concept of STC management on the GCIP [1], which comprises three steps: IoT data collection, STC production, and STC discovery, as shown in Fig. 1.

Fig. 1. Conceptual design of GCIP

3.1 Step 1: IoT Data Collection

As shown in Fig. 2, GCIP divides the geographical area into hierarchical meshes based on latitude and longitude lines, each of which has a unique mesh code. A router, called a mesh router, is responsible for the routing function in the corresponding mesh. Since we embedded the mesh codes, which uniquely identify every mesh, into the IPv6 address format [10], every packet following the GCIP rules can be processed (e.g., routed) with consideration of the geolocation specified in its IPv6 address, even while traversing the traditional Internet.

3.2 Step 2: STC Production

Since a DS server is responsible for storing the data in its mesh and DF servers are responsible for creating STCs, the deployment of a DS server and DF server(s) is necessary for creating STCs. In our method, the Publish/Subscribe model is applied for the communication between these two kinds of servers. The mesh router publishes the collected IoT data to the DS server. Each DF server sends subscribe requests to the DS server for the data represented by the desired topics, and produces an STC using those data. In this paper, we assume that one STC is created from the data collected by a single subscribe request that includes multiple topics.

Fig. 2. STC generation image in GCIP

3.3 Step 3: STC Discovery

The data collected from diverse IoT devices will change spatio-temporally in terms of their amount and generation interval. As a result, STCs created by


the DF servers have spatio-temporal characteristics. In such a case, users who want to get an STC on the GCIP can hardly discover which DF server has which STCs. Therefore, we need a new STC discovery method that satisfies the following requirements: (1) support for geolocation-aware discovery, (2) support for fuzzy search, and (3) support for the exploration of new findings. Note that, as we have already proposed geolocation-oriented communication, requirement (1) is already satisfied.

4 Proposed Method

Since we assume that each DF server sends a subscribe message including the desired topics to the DS server to create an STC, the DS server can gather statistics on the topics included in all subscription messages. On the other hand, users need to send a request message with the desired topics to the DS server, because they cannot directly identify which DF server has which STCs. Therefore, we evaluate the relevance (matching) between the user request and the statistical information on subscriptions at the DS server. Through this evaluation, the DS server identifies the DF server most likely to have STCs created from topics relevant to the user's request, and forwards the user request message so that the DF server can search for appropriate STCs. In the proposed method, a user specifies a mesh code, topics, and the priority of each topic in a request message. A request message contains up to three topics, each of which is assigned a weight. The weight of each topic depends on the user's preferences, but the total must be 100. For example, if the three topics are T1, T2, and T3, a user may specify a weight of 70 for T1, 20 for T2, and 10 for T3. In this way, even if a user has no knowledge about the STCs, the user can send a request including topics with their weights (priorities), rather than a particular identifier of an STC, thereby achieving requirement (2). As the DS server obtains statistics on the subscription messages transmitted from every DF server, we use these statistics for STC discovery. The DS server calculates a subscription probability P_{i,j} of topic j for each DF server i as follows:

P_{i,j} = \frac{S_{i,j}}{S_{i,all}} \quad (1)

where S_{i,j} is the number of subscriptions to topic j by DF server i and S_{i,all} is the total number of topics subscribed to by DF server i. Under the assumption of one STC per subscription (with multiple topics), the DF server with the highest P_{i,j} has the largest number of STCs partly created from topic j. To find appropriate STCs based on both the user request and the subscription statistics, we evaluate the relevance between them. Although a user request includes up to three topics, there may be no STCs created from exactly the same topics. We therefore search not only for STCs created from exactly the same topics as the user request but also for relevant STCs created from different combinations of topics,


thereby addressing users' unconscious interests. Specifically, since three topics yield 3C1 + 3C2 + 3C3 = 7 possible non-empty combinations in total, we choose several of these combinations, a set of partial combinations C, for searching. That is, once the DS server receives a user request, it calculates the total weight of every possible combination of topics based on the weights specified in the request, and then selects only the combinations whose total weight exceeds the predetermined threshold α; these combinations form the set of partial combinations C. To identify an appropriate DF server, the DS server calculates the expected number E_i of STCs satisfying the threshold for each DF server i:

E_i = \sum_{c \in C} \left( G_i(c) \times N_i \right) \quad (2)

where C is the set of partial combinations of user-specified topics whose total weight exceeds the threshold, G_i(c) is defined as \prod_{j \in c} P_{i,j} over all topics j contained in c, and N_i is the total number of subscription messages received by DF server i, which essentially corresponds to the number of STCs created by that server. Since the DF server with the highest E_i is most likely to have a large number of the STCs requested by the user, the DS server forwards the request to that DF server. The DF server then searches its own STC database for STCs created from combinations of topics whose total weight exceeds α. In this way, several STCs may be discovered based not only on the topics specified in the user request but also on other topics, and a list of such candidate STCs can be presented to the user as a response to the request. From these considerations, we can state that our proposed method satisfies search requirement (3). Finally, we describe an example in which a user retrieves an STC; a minimal sketch of the matching procedure follows below. A user may want to know the temperature and humidity in order to choose clothes before visiting somewhere, so the user sends a request with the topics of temperature and humidity for the location the user will visit. With the proposed method, the user can get not only the requested STC, i.e., one based on temperature and humidity, but also several STCs related to those topics. For instance, the user may also obtain the discomfort index [1], made from temperature and humidity, and the sensible temperature, made from temperature, humidity, and wind speed. In this way, the proposed method addresses the user's unconscious interests.
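To make Eqs. (1) and (2) concrete, the following is a minimal sketch of the DS-server side of the matching under the assumptions stated above (weights summing to 100, threshold α, one STC per subscription). All function and variable names are our own illustrative choices.

```python
from itertools import combinations

def partial_combinations(weights, alpha=80):
    """Set C: non-empty topic subsets whose total weight reaches alpha."""
    topics = list(weights)
    return [frozenset(c)
            for r in range(1, len(topics) + 1)
            for c in combinations(topics, r)
            if sum(weights[t] for t in c) >= alpha]

def expected_stcs(topic_counts, s_all, n_subs, combos):
    """E_i of Eq. (2): G_i(c) is the product of P_{i,j} (Eq. 1) over c."""
    e = 0.0
    for c in combos:
        g = 1.0
        for t in c:
            g *= topic_counts.get(t, 0) / s_all  # P_{i,j} = S_{i,j} / S_{i,all}
        e += g * n_subs                          # G_i(c) * N_i
    return e

# Example: pick the DF server maximizing E_i for a request (T1:70, T2:20, T3:10).
combos = partial_combinations({"T1": 70, "T2": 20, "T3": 10})
servers = {"DF1": ({"T1": 40, "T2": 25}, 120, 50),   # (S_{i,j}, S_{i,all}, N_i)
           "DF2": ({"T1": 10, "T3": 30}, 100, 40)}
best = max(servers, key=lambda i: expected_stcs(*servers[i], combos))
```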

5 Evaluation for STC Discovery

5.1 Simulation Environment

We conducted simulations to evaluate our proposed method. Figure 3 shows the network topology employed in the simulation. Each DF server individually produces STCs from various combinations of randomly selected topics (2 to 5 topics), as shown in Table 1. In order to ensure the feasibility of the STC production process, we additionally employ a subscription possibility parameter, called the subscription bias, which is


the popularity of a topic used for STC production on a DF server. That is, if the subscription bias of a certain topic is set to 1, every DF server must include that topic in all of the subscriptions it sends to the DS server; this case is called subscription bias 1. If it is set to 0.5, every DF server uses the particular topic with 50% probability to produce STCs; we call this condition subscription bias 0.5. Note that each DF server randomly sets the number of topics, up to a maximum of 5, in addition to the topic determined by the subscription bias. We assume that up to three topics characterizing the desired STCs are included in a user request. Since the topics have different weights, we introduce a user request bias, in addition to the subscription bias, as shown in Table 2. We use two patterns of user request bias: request bias 1 (imbalanced) and request bias 2 (relatively balanced). In each simulation, we send 10 user requests per round and conduct the experiment for 10 rounds.

Fig. 3. Simulation environment for performance evaluation

Table 1. Simulation parameters

Parameter                                     Value
The number of DF servers                      10
The number of STCs created on each DF server  1000
The number of topics                          10
The number of topics used for each STC        2–5
Threshold of the total weight α               80


Table 2. User request bias settings

                                      Weight of T1  Weight of T2  Weight of T3
Request bias 1 (imbalanced)           80            15            5
Request bias 2 (relatively balanced)  35            35            30

5.2 Comparative Methods and Performance Measures

We use the following two comparative methods to show the effectiveness of the proposed method.

Comparative method 1: This method searches for STCs whose constituent topics completely match the topics included in the user request.

Comparative method 2: This method searches for STCs on the DF server that is expected to hold the largest number of STCs composed of topic T1.

We introduce a new performance measure, called the unconscious contents ratio (UCR), to evaluate the proportion of retrieved STCs that offer more topics than the user requested. Since the proposed method can provide STCs made from relevant topics, the user has the potential to retrieve unconscious but interesting STCs; this is why the evaluation of the UCR is also important. The UCR is calculated by Eq. (3):

UCR = \frac{C_{total} - C_{complete\ match}}{C_{total}} \times 100 \quad (3)

where C_{total} is the total number of retrieved STCs and C_{complete match} is the number of STCs created from the specified topics only.
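As a quick sanity check of Eq. (3), plugging in the proposed method's request bias 1 numbers reported later in Table 3 reproduces the corresponding UCR in Table 5:

```python
def ucr(c_total, c_complete_match):
    """Unconscious contents ratio of Eq. (3), in percent."""
    return (c_total - c_complete_match) / c_total * 100

# Proposed method, request bias 1, subscription bias = 1 (Tables 3 and 5):
print(round(ucr(7.24 + 20.63 + 41.67, 7.24), 1))  # -> 89.6
```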

5.3 Simulation Results

We now describe the number of retrieved STCs and the UCR obtained through the simulation. Tables 3 and 4 show the results for subscription biases of 1 and 0.5, respectively. Since the comparative methods do not consider the weights of the specified topics, their results are similar irrespective of the change in request bias. In contrast, the proposed method retrieves a relatively larger number of STCs, including STCs that involve one or two additional topics beyond those specified by the user; these are referred to as complete match+1 and complete match+2, respectively. The overall number of STCs in the case of subscription bias = 0.5 is lower than that for subscription bias = 1. However, regardless of the subscription bias, there is little difference in the number of complete-match STCs between comparative methods 1 and 2 and the proposed method. Since all methods, including the proposed one, are able to estimate a DF server holding the appropriate STCs, there is no difference in the selected DF server among them. In contrast, the proposed method retrieves 9.4 to 9.9 times more STCs regardless of the subscription bias, because it also obtains the STCs involving complete match+1 and complete match+2.

Table 3. The retrieved number of STCs (Subscription bias = 1)

                Comparison method 1  Comparison method 2  Proposed method
                Complete match       Complete match       Complete match  Complete match+1  Complete match+2
Request bias 1  7.65                 7.24                 7.24            20.63             41.67
Request bias 2  7.65                 7.24                 7.65            22.37             43.94

Table 4. The retrieved number of STCs (Subscription bias = 0.5)

                Comparison method 1  Comparison method 2  Proposed method
                Complete match       Complete match       Complete match  Complete match+1  Complete match+2
Request bias 1  3.71                 3.58                 3.58            10.04             21.5
Request bias 2  3.71                 3.58                 3.71            11.04             22.2

Table 5. UCR in case of subscription bias = 1

                UCR   95% confidence interval
Request bias 1  89.6  0.454
Request bias 2  89.6  0.700

Table 6. UCR in case of subscription bias = 0.5

                UCR   95% confidence interval
Request bias 1  90.0  1.358
Request bias 2  89.9  0.756

Finally, we assess the UCR. Tables 5 and 6 show how the UCR varies with the user request bias for subscription biases of 1 and 0.5, respectively. These results demonstrate that the UCR barely changes, irrespective of changes in both the subscription bias and the request bias. Therefore, the proposed method provides not only the desired STCs but also unconscious STCs, under any environment of topic subscriptions and user requests.

6 Conclusion

We have proposed the Geo-Centric Information Platform (GCIP), which can provide spatio-temporal contents (STCs) for a specific area. However, from the user's viewpoint, since STCs are dynamically created on the GCIP, a method of STC discovery is necessary. In this paper, we proposed a method that matches the topics specified in user requests with the topics representing the IoT data used for creating STCs. Through simulation, we showed that the user can retrieve a relatively large number of STCs covering both the requested topics and unconscious (but beneficial) ones.


We showed that the proposed method can flexibly search for STCs in accordance with user requirements. A user can retrieve appropriate STCs even without knowing the STC information beforehand. Although the current method tries to discover appropriate STCs only within a single mesh, we have to take into account cases where there is no appropriate STC in the specified mesh. Therefore, we will extend the proposed STC discovery method to adaptively change the size of the geolocation area to achieve reliable STC discovery.

Acknowledgements. This work was supported by JSPS KAKENHI Grant Number JP18H03234, NICT Grant Number 19304, and USA Grant Numbers 1818884 and 1827923.

References 1. Nagashima, K., Taenaka, Y., Nagata, A., Nakamura, K., Tamura, H., Tsukamoto, K.: Experimental evaluation of publish/subscribe-based spatio-temporal contents management on geo-centric information platform. Adv. Networked-Based Inf. Syst. 1036, 396–405 (2019) 2. Lau, B.P.L., et al.: A survey of data fusion in smart city applications. Inf. Fusion 52, 357–374 (2019) 3. Consoli, S., Reforgiato, D., Mongiovi, M., Presutti, V., Cataldi, G., Patatu, W.: An urban fault reporting and management platform for smart cities. In: WWW 2015 Companion: Proceedings of the 24th International Conference on World Wide Web, pp. 535–540, May 2015 4. Ahmed, F., Hawas, Y.E.: An integrated real-time traffic signal system for transit signal priority, incident detection and congestion management. Transp. Res. Part C: Emerg. Technol. 60, 52–76 (2015) 5. Pattar, S., Buyya, R., Venugopal, K.R., Iyengar, S.S., Patnaik, L.M.: Searching for the IoT resources: fundamentals, requirements, comprehensive review, and future directions. IEEE Commun. Surv. Tutorials 20, 2101–2132 (2018) 6. Pintus, A., Carboni, D., Piras, A.: Paraimpu: a platform for a social Web of Things. In: Proceedings 21st International Conference on Companion World Wide Web (WWW Companion), pp. 401–404, April 2012 7. Mayer, S., Guinard, D.: An extensible discovery service for smart things. In: WoT 2011: Second International Workshop on the Web of Things, June, pp. 1-6 (2011) 8. Mayer, S., Guinard, D., Trifa, V.: Searching in a web-based infrastructure for smart things. In: 2012 3rd IEEE International Conference on the Internet of Things, pp. 119–126, October 2012 9. Xylomenos, G., et al.: A survey of information-centric networking research. IEEE Commun. Surv. Tutorials 16, 1024–1049 (2013) 10. Tamura, H.: Program for determining IP address on the basis of positional information, device and method. JP Patent 6074829, 20 January 2017

SDN-Based In-network Early QoE Prediction for Stable Route Selection on Multi-path Network

Shumpei Shimokawa1(B), Yuzo Taenaka2, Kazuya Tsukamoto1, and Myung Lee3

1 Kyushu Institute of Technology, Iizuka, Japan [email protected], [email protected]
2 Nara Institute of Science and Technology, Ikoma, Japan [email protected]
3 CUNY, City College, New York, USA [email protected]

Abstract. As QoE is useful for uniformly handling many kinds of application flows, we have been tackling QoE-oriented network resource management based on SDN technology. Toward this goal, our previous study proposed a QoE measurement method for on-going streaming flows. However, the standard QoE calculation model requires at least 8 s of collected flow information. In this study, we tackle early QoE prediction on an SDN-enabled multi-path network. To predict video QoE as soon as possible, we exploit not only the packet loss rate measured regularly but also the number of packet transmissions obtained by a short-period measurement at flow arrival. Finally, through experiments, we demonstrate that the QoE of all flows can be maximized by selecting an appropriate route based on the predicted QoE.

1 Introduction

Network resource management, such as route selection, is one of the key technologies for making multiple applications coexist in a network with reliable performance. Our previous study tackled efficient resource utilization to keep the required throughput for every flow [1]. However, as many kinds of applications coexist in a realistic network, throughput is not the best performance metric for some of them. To maintain the application performance of any kind of application flow, Quality of Experience (QoE) can serve as a common performance metric. To use QoE as a metric for route selection, we need to track the QoE of every flow by exploiting in-network information. Our previous study proposed an SDN-based in-network QoE measurement method for video streaming flows [2]. However, the standard QoE calculation model requires at least 8 s of flow information collected from the network, and thus QoE-based route selection is quite difficult at the time of a new flow's arrival. Although there is very little information


we can obtain for an arriving flow, predicting its QoE and handling the flow accordingly are essential to keeping good QoE. Therefore, we propose an early QoE prediction method on an SDN-enabled multi-path network. Specifically, we predict the video QoE metric by exploiting not only the packet loss rate measured regularly but also the number of bytes arriving during a very short period, as observed at the OFC, and then select an appropriate route for the video flow.

2 Related Work

Network management aimed at improving video QoE has been studied extensively [3–5]. To achieve network-wide QoE fairness, paper [4] dynamically allocated network resources to heterogeneous applications by using SDN. In order to improve the quality of video streaming and file download, they took buffering time and throughput into consideration. However, since they do not use a common metric for these applications, the method cannot improve the performance of applications other than video and file download. By using QoE as a metric for network management, the same management policy can be applied to control any application, even when various applications coexist. Therefore, measuring QoE from in-network information and performing QoE-based control make network management elastic. Paper [5] proposed QoE-based route optimization for multimedia services to maximize end users' QoE. To identify the best route for each flow in the SDN controller, they made a Session Initiation Protocol (SIP) server mediate between the client and the media server, and obtained media parameters, such as the video codec, from the SIP server. However, not all multimedia services use an intermediate node such as a SIP server in the real Internet. Therefore, in this paper, we consider a way of predicting not only the network parameters but also the media parameters based on available in-network information.

3 QoE Calculation Model

We explain the QoE calculation for two applications, that is, not only for video streaming, which our method focuses on, but also for file transfer, because many kinds of applications coexist in the real Internet.

3.1 QoE Calculation for Video Streaming Services

QoE calculation for video streaming services is standardized in ITU-T G.1071 [6]. G.1071 requires the network conditions measured at end hosts and the video settings. Although the QoE value is calculated from a combination of video and audio metrics, we focus only on the video in this study because it is the primary factor of video QoE. Note that QoE ranges from 1 to 5.

Table 1. The target video settings in G.1071

Category             Lower resolution                                     Higher resolution
Protocol             RTSP over RTP                                        MPEG2-TS over RTP
Video codec          H.264, MPEG-4                                        H.264
Resolution           QCIF (176 × 144), SD (720 × 480), HVGA (480 × 320)   HD (1280 × 720, 1920 × 1080)
Video bitrate (bps)  QCIF: 32–1000 k, SD: 0.5–9 M, HVGA: 192–6000 k       HD: 0.5–30 M
Video framerate      5–30 fps                                             25–60 fps

QcodV = A × eB×bp + C + (D × eE×bp + F ) + G,

(3)

br × 10 , r × fr QtraV = H × log(I × plc + 1),

bp =

6

plr ] − J. plc = J × exp[K × (L − M ) × M × (N × plb + O) + plr

(4) (5) (6)

The range of QV is from 1 to 100, and QV is converted to the QoE value by the Eq. (1). Parameters of A ∼ O are fixed and defined in G.1071, and take positive except B and E. Besides, parameters of video bitrate br [bps], resolution r [pixel], frame rate fr [fps], and packet loss concealment (PLC) are pre-determined as the video settings. On the other hand, parameters of packet loss rate plr [%] and average number of consecutive packet losses plb need to be measured at end hosts. The values of fixed parameters in the Eq. (6) (i.e., J ∼ O) have different values in accordance with PLC which is an application function to correct a damaged video frame happened at packet losses. PLC consists of Freezing, which only ignores losses, and Slicing, which tries to correct the losses. 3.2

3.2 QoE Calculation for File Transfer

As paper [7] proposed a QoE calculation for the file transfer application (FT), we here explain how to calculate its QoE:

QoE_FT = 1                  (R ≤ R−)
       = a · log10(b · R)   (R− < R < R+)    (7)
       = 4.5                (R+ ≤ R).


QoE_FT ranges from 1 to 4.5. R is the goodput of the flow. R+ is the maximum transmission speed in the network without packet losses. In this study, we use 100 Mbps, the maximum transmission speed in our experimental network, as R+. On the other hand, as the lowest bandwidth provided by AT&T's DSL service is 0.8 Mbps [8], we set the minimum transmission speed, referred to as R−, to 0.8 Mbps. a and b are determined by fitting the approximate curve of Eq. (7); as a result, a = 1.67 and b = 4.97.
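As a sanity check of these constants, a · log10(b · 0.8) ≈ 1.00 and a · log10(b · 100) ≈ 4.50, so the fit takes R in Mbps. A minimal sketch, assuming exactly that unit convention (the names are ours):

```python
import math

A, B = 1.67, 4.97      # fitted constants of Eq. (7)
R_MINUS_MBPS = 0.8     # R-: minimum transmission speed [Mbps]
R_PLUS_MBPS = 100.0    # R+: maximum loss-free speed of the experimental network [Mbps]

def qoe_ft(goodput_mbps):
    """Piecewise file-transfer QoE of Eq. (7), with R expressed in Mbps."""
    if goodput_mbps <= R_MINUS_MBPS:
        return 1.0
    if goodput_mbps >= R_PLUS_MBPS:
        return 4.5
    return A * math.log10(B * goodput_mbps)
```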

4 SDN-Based Early QoE Prediction and Route Selection

In this section, we propose an early QoE prediction method and a route selection method. To predict the video QoE of an arriving flow, we transmit the new flow on a temporary route for a very short period and calculate QoE from the network information measured during that period. After the QoE prediction, we switch the flow to an appropriate route based on the predicted QoE.

4.1 Early QoE Prediction at the Flow Arrival Timing

As explained in Sect. 3.1, six parameters are required for the QoE calculation of a video flow. However, the average number of consecutive packet losses and the PLC cannot be obtained inside the network, because OpenFlow cannot count the number of consecutive packet losses and the PLC is an application parameter invisible on the network. Thus, in this study, we set these parameters to fixed values corresponding to the worst-case QoE, as in the previous study [2]. The other four parameters also have to be obtained, but obtaining precise information at the time of flow arrival is extremely difficult. Therefore, we predict these parameters by combining the information on network conditions collected before the flow arrival with a short-period measurement after the flow arrival. Our QoE prediction method consists of two functions: (1) packet loss rate prediction and (2) video settings prediction.

4.1.1 Packet Loss Rate Prediction

As packet losses are basically measured on flows that are already being transmitted, it is essentially difficult to obtain the loss rate before a new flow starts. However, we can assume that the packet loss ratio (PLR) is the same for any flow on the same network. Hence, we regularly measure the PLR of existing (ongoing) traffic and use this value as the PLR for the QoE prediction of a newly arriving flow. Here, we make one more assumption: the arriving flow will not overload the network when it is forwarded. To measure the PLR of existing traffic, we use the statistical information of OpenFlow called PortStats, which can be collected from OpenFlow Switches (OFSs) by the OpenFlow Controller (OFC). Since PortStats includes the numbers of transmitted and received packets on each network interface (not each flow), the OFC can calculate the PLR of existing traffic from the difference between the numbers of transmitted and received packets at two neighboring OFSs.
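As a minimal sketch of this in-network computation (our own illustration; the controller-side bookkeeping and the surplus-packet error correction described below are omitted), the per-link PLR can be derived from PortStats counter deltas of two neighboring OFSs:

```python
def link_plr(tx_delta, rx_delta):
    """PLR [%] of existing traffic on one link over one collection interval.

    tx_delta: packets transmitted by the sender-side OFS on the port
    rx_delta: packets received by the receiver-side OFS on the facing port
    """
    if tx_delta == 0:
        return 0.0
    lost = max(0, tx_delta - rx_delta)  # a negative difference is a measurement error
    return 100.0 * lost / tx_delta
```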


Fig. 1. The number of video flow’s transmitted packets every 0.1 s.

However, the PLR may contain measurement errors because PortStats cannot be collected at exactly the same time at the two OFSs. We handle this measurement error in the same way as our previous study [2]. Briefly, if the number of received packets at the receiver-side OFS is larger than the number of transmitted packets at the sender-side OFS, we can clearly treat the difference as a measurement error. We then hold the difference as an accumulated number of surplus packets for correcting subsequent errors. After that, when the number of transmitted packets at the sender side is larger than the number of received packets at the receiver side, we can expect that the difference is a measurement error if surplus packets still remain at that time. In such a case, we subtract the accumulated number of surplus packets from the difference and treat the result as the number of packet losses. To conduct the PLR measurement with error correction, we collect PortStats every 1 s in this study.

4.1.2 Video Settings Prediction

Although it is possible to identify the video bitrate based on the measured throughput [2], we cannot measure the throughput before the flow transmission starts. In order to obtain the video bitrate, we need to temporarily forward a new flow, for a very short period, on the route with the largest residual bandwidth, which has the lowest possibility of packet losses, and measure the number of packets or bytes transmitted during that period. Figure 1 shows the number of packets measured every 0.1 s after the flow transmission starts. Note that we transmit only a single video flow on a stable network in this experiment. We vary the video bitrate from 3,250 kbps to 10,000 kbps. Besides, "Theoretical" means the theoretical number of packets that can be transmitted in 0.1 s. Since the video buffering function works at the beginning of the flow, the number of packets differs significantly from the theoretical value, irrespective of the video bitrate. This causes an overestimation of the video bitrate, and thus it is hard to use this value for identifying the video bitrate. Due to this throughput fluctuation, the QoE measurement method for ongoing flows proposed in our previous study used the measurement results for 8 s, from the first second to the ninth second.

Fig. 2. The number of transmitted bytes for 0.9 s after a flow starts.

That is why we have to complete the QoE prediction and the initial route selection within 1 s, so as to enable the QoE measurement on the selected route. As there are delays such as the transmission and processing of PortStats, we use the number of bytes measured for a little less than 1 s, i.e., 0.9 s, after the flow arrival. For this, we investigate the characteristics of video packet transmission for 0.9 s immediately after the flow transmission starts. Figure 2 shows the number of transmitted bytes for 0.9 s when the video bitrate is varied. From Fig. 2, we can expect that the number of bytes is proportional to the video bitrate even with the effect of video buffering. Based on this assumption, we perform a linear approximation as follows:

Rv = 8.7903 × Bv − 7.7067 × 10^2   (8)

where Rv [kbps] is the video bitrate and Bv [kbyte] is the number of bytes measured for 0.9 s from the start of the new video flow. As a result, we predict the video bitrate from the approximation line of Eq. (8). In OpenFlow, the OFC can count Bv by exploiting packet-in messages, which an OFS uses to ask the OFC how to handle a new flow. Specifically, when a new flow arrives at an OFS, the OFS transmits a packet-in message with the header of that packet to the OFC. The OFC generally returns a flow control rule, called a flow entry, in a flow mod message, and the OFS does not send further packet-in messages after receiving the flow mod message. In this study, however, we make the OFC return only a packet-out message, which instructs the OFS to send the particular packet to the next hop. After 0.9 s has passed, the OFC returns a flow mod that matches every packet of the flow. In this way, an OFS sends a packet-in whenever new packets arrive during the first 0.9 s, and the OFC counts the total bytes transmitted during this period. Note that, if there are multiple OFSs on the selected route, we conduct this process only at the first OFS (nearest to the sender) and send a flow mod controlling every packet of the flow to the other OFSs. We can predict the video bitrate by substituting Bv obtained in this way into Eq. (8). The frame rate and resolution are then predicted from this predicted video bitrate, using Table 2, which is made from YouTube's recommended video settings in terms of resolution, frame rate, and video bitrate [9], as in the previous study.

Table 2. Estimation of resolution and frame rate based on video bitrate.

Video bitrate [kbps]   Resolution         Frame rate [fps]
∼3,250                 SD (720 × 480)     30
3,250∼4,500            SD (720 × 480)     60
4,500∼6,250            HD (1280 × 720)    30
6,250∼7,750            HD (1280 × 720)    60
7,750∼10,000           HD (1920 × 1080)   30
10,000∼                HD (1920 × 1080)   60
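A minimal sketch combining Eq. (8) with the Table 2 lookup (the naming is ours, and the handling of the exact boundary values, e.g. exactly 3,250 kbps, is our assumption since the table leaves it open):

```python
def predict_video_settings(bv_kbytes):
    """Predict bitrate via Eq. (8), then resolution/frame rate via Table 2.

    bv_kbytes: number of kbytes counted through packet-in during the first 0.9 s.
    """
    rv = 8.7903 * bv_kbytes - 7.7067e2  # Eq. (8): predicted video bitrate [kbps]
    table2 = [  # (exclusive upper bitrate bound [kbps], resolution, frame rate [fps])
        (3250, "SD (720 x 480)", 30),
        (4500, "SD (720 x 480)", 60),
        (6250, "HD (1280 x 720)", 30),
        (7750, "HD (1280 x 720)", 60),
        (10000, "HD (1920 x 1080)", 30),
        (float("inf"), "HD (1920 x 1080)", 60),
    ]
    for bound, resolution, fps in table2:
        if rv < bound:
            return rv, resolution, fps
```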

4.2 Route Selection Based on the Predicted QoE

After the video settings prediction, the OFC calculates QoE based on both the PLR of existing traffic and the predicted video settings. Here, as network conditions may differ among routes, the OFC measures the PLR of existing traffic and predicts the QoE for each route. After the QoEs of all available routes are predicted, the OFC chooses an appropriate route for the newly arriving flow to maximize its QoE. Specifically, we select the route with the highest predicted QoE. By doing so, we can forward a video flow to the route with the largest QoE within 1 s after the new flow arrival.
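The route selection itself then reduces to an argmax over the per-route predictions, as in this sketch (predict_qoe stands for the video QoE computation above with the predicted settings and worst-case plb/PLC fixed; the names are ours):

```python
def select_route(routes_plr, predict_qoe):
    """Pick the route with the highest predicted QoE (Sect. 4.2).

    routes_plr: dict route_id -> PLR [%] measured on that route's existing traffic.
    predict_qoe: callable mapping a PLR to the predicted QoE of the arriving flow.
    """
    predicted = {route: predict_qoe(plr) for route, plr in routes_plr.items()}
    best = max(predicted, key=predicted.get)
    return best, predicted
```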

5 Experimental Evaluation

We conduct experiments to evaluate the proposed method. The goal of these experiments is to show the effectiveness of the QoE prediction. First, we compare the predicted QoE with the actual QoE obtained at end hosts. Then, we show that the QoE-based route selection method successfully improves the QoE of all flows transmitted on the SDN-enabled network.

5.1 Comparative Method

We use a throughput-oriented route selection [1] as the conventional method. Briefly, to avoid packet losses for a newly arriving flow, the method temporarily forwards the flow on the route with the largest residual bandwidth, because the flow's required throughput is unknown at that time. After the OFC measures the throughput based on FlowStats, which is the per-flow statistical information, the OFC forwards the flow to the route with the smallest but still sufficient residual bandwidth. Through this throughput-oriented mechanism, the route with the largest residual bandwidth is kept available for the next flow, so packet losses at the next flow arrival can be avoided as much as possible.


Fig. 3. Experimental environment.

Table 3. The true value of QoE after the route selection in scenario 1.

                 Proposed method            Conventional method
Video bitrate    Median   Max.    Min.      Median   Max.    Min.
3,250 kbps       4.395    4.395   1.317     1.311    1.328   1.285
10,000 kbps      4.391    4.391   1.302     1.321    1.345   1.307

5.2 Experimental Settings

The experimental topology is shown in Fig. 3. Regarding OpenFlow, we use Trema as the OFC and install Open vSwitch on every OFS. All devices are connected by 100 Mbps Ethernet. We put a Linux PC between OFS 1-2 and OFS 2-2 to generate 1% random packet loss. We define the route from OFS 1-1 to OFS 2-1 as Route 1 and the route from OFS 1-2 to OFS 2-2 as Route 2. In addition, we prepare two background traffic flows, a 2 Mbps flow through Route 1 and a 4 Mbps flow through Route 2, to make the residual bandwidth imbalanced between the two routes. We then evaluate how the QoE of each flow varies when several flows, including video streaming, are transmitted from PC 1.

5.3 Evaluation of the Early QoE Prediction

To show the effectiveness of our proposed QoE prediction method, we conduct two scenarios with different evaluation purposes.

• Scenario 1: We show how the proposed method improves the QoE of an incoming video streaming flow. We also evaluate the accuracy of the predicted PLR, video bitrate, and QoE.
• Scenario 2: We show how effective the proposed method is when the predicted QoE is used for route selection in the case of multiple new flow arrivals.

5.3.1 Scenario 1: Evaluation in the Case of Only a Video Flow

In this scenario, we show the performance of the route selection. Specifically, we transmit a video flow from PC 1 to PC 2 and evaluate the result of the route selection.

In this experiment, we transmit two kinds of video, 3,250 kbps and 10,000 kbps, and repeat the experiment for 9 rounds. Table 3 shows the true value of QoE after the route selection. In the conventional method, although a new flow is first transmitted on Route 1, which has the largest residual bandwidth, the video flow is finally switched to Route 2 due to the throughput-oriented mechanism. As a result, since packet losses occur on Route 2, the QoE of the video flow drastically drops to around 1.3. On the other hand, in the proposed method, the video flow is continuously transmitted on Route 1 as a result of the route selection based on the predicted QoE. Thus, the median and maximum values of QoE are around 4.4, which indicates good quality. However, the minimum value of QoE is around 1.3. This is because measurement errors of the PLR that cannot be corrected in the way described in Sect. 4.1.1 degrade the predicted QoE, thereby switching the video flow to Route 2.

Table 4. The predicted and true value of PLR in Route 1.

                 Predicted value             True value
Video bitrate    Median   Max.      Min.
3,250 kbps       0%       1.622%    0%       0%
10,000 kbps      0%       3.125%    0%       0%

Table 5. The predicted and true value of video bitrate in Route 1.

                 Predicted value                         True value
Video bitrate    Median       Max.         Min.
3,250 kbps       3,128 kbps   3,317 kbps   3,057 kbps    3,500 kbps
10,000 kbps      5,339 kbps   5,872 kbps   4,086 kbps    10,000 kbps

Table 6. The predicted and true value of QoE in Route 1.

                 Predicted value            True value
Video bitrate    Median   Max.     Min.
3,250 kbps       4.720    4.726    1.286    4.395
10,000 kbps      4.732    4.747    1.393    4.395

To investigate the measurement error, Tables 4, 5, and 6 show the predicted and true values of PLR, video bitrate, and QoE in Route 1, respectively. Although no packet loss actually happens on Route 1, the maximum value of the predicted PLR is more than 1% larger than the true value. The gaps are caused by mis-correction of the errors introduced in Sect. 4.1.1; specifically, some measurement errors cannot be corrected because of a mistaken judgement between packet losses and measurement errors. Thus, the predicted QoE is degraded by this measurement error. Next, focusing on the video bitrate, the predicted video bitrate for the 3,250 kbps video is almost the same as the true value, while that for the 10,000 kbps video is consistently lower than the true value. This is because there is a limitation on the number of bytes the OFC can count due to its processing delay. However, the difference between the true QoE and the median/maximum predicted QoE is only around 0.3, because the video bitrate has less impact on QoE [2]. From these results, we demonstrate that the proposed method can improve video QoE by switching to a route based on the predicted QoE. In addition, since the predicted QoE is almost the same regardless of the video bitrate, we can say that the proposed method is tolerant to changes in the video bitrate.

5.3.2 Scenario 2: Evaluation in the Case of Multiple Flows

In this scenario, we demonstrate the effectiveness of the proposed method, which uses the predicted QoE for route selection, in the case of multiple new flow arrivals. We transmit three flows, in the order of video flow 1, video flow 2, and a file transfer (FT) flow, every 5 s. The video bitrate of both video 1 and video 2 is set to 2,500 kbps, and the FT flow is a simple TCP file transfer. Here, we assume that the OFC can precisely predict the QoE of FT, and we use the true value of QoE for the route selection of FT. We conduct this experiment 9 times, and show the result with the median value.

Fig. 4. QoE transition of the conventional method in scenario 2.


Figure 4 shows the true value of QoE over time for the conventional method. Although a new flow is temporarily transmitted on Route 1, which has the largest residual bandwidth, the video flow is then switched to Route 2 by the throughput-oriented mechanism. As a result, the QoE of video 1 is high at the flow arrival but drops drastically after the route selection due to packet losses on Route 2. For video 2, the conventional method performs the same flow management as for video 1, so the QoE of video 2 also drops. Lastly, for the FT flow, the conventional method temporarily selects Route 1 and does not change the route, because the residual bandwidth of Route 2 is less than that of Route 1. As a result, in the conventional method, FT can maintain high QoE but both videos cannot.

Fig. 5. QoE transition of the proposed method in scenario 2.

Figure 5 shows the true value of QoE over time for the proposed method. The video 1 flow is temporarily transmitted on Route 1 as in the conventional method, but the proposed method decides to keep the flow on the same route based on the QoE prediction. Video 2 is temporarily transmitted on Route 2 at its arrival, because Route 2 has the largest residual bandwidth at that time. Thus, the QoE of video 2 is low at the flow arrival. However, since the proposed method switches the video 2 flow to Route 1 in accordance with its predicted QoE, the QoE of the video 2 flow clearly improves after 1 s. Finally, FT is first temporarily transmitted on Route 2, which has the largest residual bandwidth, and then does not change its transmission route. Hence, the QoE of FT is kept around 3.5, because the QoE of FT strongly depends on throughput performance. As a result, the proposed method can keep the QoE of every flow at an excellent level. Therefore, we can remark that the predicted QoE-based route selection successfully improves QoE for multiple new flow arrivals. However, the proposed method still has the following limitations. It does not consider that the QoE of existing flows may drop due to the transmission of a new flow; if the route selected for a new flow exceeds the available bandwidth, the QoE of existing video flows inherently drops. Moreover, since the proposed method does not prioritize keeping a route with the highest residual bandwidth available, the possibility of bandwidth scarcity at flow arrivals is relatively higher than in the conventional method.

6 Conclusion

As G.1071 requires network measurements of at least 8 s to calculate QoE, QoE-based route selection is quite difficult at the arrival timing of a new flow. To resolve this problem, we proposed an early QoE prediction method for newly arriving video flows. As both the PLR and the video settings are required for the QoE calculation, we predicted the PLR by exploiting existing traffic, and the video settings by exploiting the measurement results of a very short period (0.9 s) immediately after the video starts, while alleviating the effects of video buffering. The proposed method then selected an appropriate route for every flow in accordance with the predicted QoE. In our experiments, we showed that the proposed method can predict QoE precisely irrespective of changes in the video bitrate. We also demonstrated that the route selection for arriving video and file transfer flows based on the predicted QoE successfully improved their QoE. As a next step, we are going to combine this QoE prediction method, which initially selects an appropriate route, with our previously proposed QoE estimation method for ongoing flows [2], in order to follow changes in network conditions in a timely manner.

Acknowledgements. This paper was partly supported by JSPS KAKENHI Grant Number 18H03234, NICT Grant Number 19304, and USA NSF Grant Numbers 1818884 and 1827923.

References

1. Tagawa, M., Taenaka, Y., Tsukamoto, K., Yamaguchi, S.: A channel utilization method for flow admission control with maximum network capacity toward loss-free software defined WMNs. In: The Fourteenth International Conference on Networks (ICN 2015), pp. 118–123 (February 2016)
2. Shimokawa, S., Kanaoka, T., Taenaka, Y., Tsukamoto, K., Lee, M.: SDN-based time-domain error correction for in-network video QoE estimation in wireless networks. In: The 11th International Conference on Intelligent Networking and Collaborative Systems (INCoS 2019), pp. 331–341 (September 2019)
3. Erfanian, A., Tashtarian, F., Yaghmaee, M.H.: On maximizing QoE in AVC-based HTTP adaptive streaming: an SDN approach. In: 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), pp. 1–10 (June 2018)
4. Zinner, T., Jarschel, M., Blenk, A., Wamser, F., Kellerer, W.: Dynamic application-aware resource management using software-defined networking: implementation prospects and challenges. In: 2014 IEEE Network Operations and Management Symposium (NOMS), pp. 1–6 (May 2014)
5. Dobrijevic, O., Kassler, A., Skorin-Kapov, L., Matijasevic, M.: Q-POINT: QoE-driven path optimization model for multimedia services. In: Wired/Wireless Internet Communications (WWIC 2014), pp. 134–147 (May 2014)


6. Recommendation ITU-T G.1071: Opinion model for network planning of video and audio streaming applications (November 2016)
7. Thakolsri, S., Khan, S., Steinbach, E., Kellerer, W.: QoE-driven cross-layer optimization for high speed downlink packet access. J. Commun. 4(9), 669–680 (2009)
8. AT&T Speed Tiers. https://att.net/speedtiers
9. Recommended Upload Encoding Settings, YouTube Help. https://support.google.com/youtube/answer/1722171?hl=en

Reliable Network Design Considering Cost to Decrease Failure Probability of Simultaneous Failure

Yuma Morino and Hiroyoshi Miwa(B)

Graduate School of Science and Technology, Kwansei Gakuin University, 2-1 Gakuen, Sanda-shi, Hyogo 669-1337, Japan
{morino-m,miwa}@kwansei.ac.jp

Abstract. It is important to design a reliable information network that is resistant to network failures. Protection, which decreases failure probability by both a backup mechanism and a fast recovery mechanism, is one approach to designing such an information network. However, enormous cost is required if the failure probability of all network elements must be decreased. Therefore, it is practical to preferentially protect only some highly important network elements so that the reliability of the entire information network is improved. In this paper, we assume that the failure probability of a failure set, a set of network elements that fail simultaneously in a single disaster, can be decreased according to the cost spent on protection. Since the failure probability of each failure set is decreased according to the cost assigned to it, the network failure probability, defined as the probability that the entire information network becomes disconnected, is also decreased. We define a network design problem that determines the cost assigned to each failure set so that the sum of the assigned costs is minimized under the constraint that the network failure probability must be within a threshold. First, we formulate the network design problem as a 0–1 integer programming problem, when the relationship between cost and the probability decrease is a step function. In addition, we evaluate the performance by using the topology of some actual information networks.

1 Introduction

There are many cases in which a single disaster triggers simultaneous failures of many network elements, such as nodes and links, in an information network. It is not realistic to completely prevent failures of nodes and links in the event of a disaster, and we must assume that failures will occur. Let a failure set be a set of network elements broken simultaneously by a single disaster. Even if all the nodes and links in a failure set are broken, it is desirable that all remaining nodes stay connected to each other in the surviving information network. It is important to design such a reliable information network that is resistant to network failures.


Protection, which decreases failure probability by both backup resources reserved in advance and a fast recovery mechanism, is one approach to designing such an information network. However, since protection needs considerable network resources, enormous cost is required if the failure probability of all network elements must be decreased. Therefore, it is practical to preferentially protect only some highly important network elements so that the reliability of the entire information network is improved. The failure probability of a network element can be decreased according to the cost spent on protecting it. Since the failure probability of each network element is decreased according to the cost assigned to it, the network failure probability, defined as the probability that the entire information network becomes disconnected, is also decreased. In this paper, we assume that the initial failure probability of a failure set is given for all failure sets. The initial failure probability of a failure set corresponds to the probability that a single disaster occurs and all the network elements included in the failure set are broken by the disaster. In addition, the relationship between cost and the resulting probability decrease is given. We define a network design problem that determines the cost assigned to each failure set so that the sum of the assigned costs is minimized under the constraint that the network failure probability must be within a threshold. In the information network improved by using the solution of this network design problem, even if all the nodes and links in any failure set are broken, the probability that all remaining nodes stay connected to each other in the surviving network is sufficiently large. We formulate the network design problem as a 0–1 integer programming problem, when the relationship between cost and the probability decrease is a step function. In addition, we evaluate the performance by using the topology of some actual information networks.

2 Related Research

The connectivity of a graph is often used as a measure of network reliability. When the connectivity of a network is too small, algorithms that increase the connectivity by adding edges to the graph are applied; this corresponds to the addition of links to an information network. This is known as the connectivity augmentation problem, which determines augmented edges so that the graph does not become disconnected by any removal of edges. Some theoretical results for this problem are known: a set of the minimum number of edges for increasing the edge-connectivity of a graph to an arbitrary value can be determined in polynomial time; on the other hand, for vertex-connectivity, the problem is generally NP-hard [1–3]. Another measure for evaluating the reliability of a network is the number of connected components resulting from disconnection by failures [4]. The reference [4] addresses an optimization problem for maximizing the number of connected components created by the failure of nodes. When each edge has a removal probability, the network reliability is defined as the probability that the entire network is connected. Network design problems that maximize the reliability under a cost constraint, or that minimize cost under a reliability constraint, have been investigated (e.g., [5,6]).

3 Reliable Network Design Problem Considering Cost to Decrease Failure Probability of Simultaneous Failure Set

Let G = (V, E), where V is a set of vertices and E is a set of edges, be a connected undirected graph representing an information network. Each vertex corresponds to a node such as a router, and each edge corresponds to a link between nodes. Let Fi (i = 1, 2, ..., t) be a set of vertices and edges; each Fi is called a removal set. Let K = {F1, F2, ..., Ft} be the set of the removal sets. A removal set corresponds to a set of nodes and links broken simultaneously in a single disaster, and the set of the removal sets corresponds to the set of possible disasters. For G = (V, E) and F ∈ K, let G_F be the graph resulting from removing all edges in F, all vertices included in F, and all edges incident to those vertices. For a removal set F (∈ K), let p(F) be the probability that F is removed, in other words, the probability that graph G is converted to graph G_F. For graph G = (V, E) and a set of the removal sets K, let an improved failure probability function be h : K × R+ → R+, and let an assigned cost function be c : K → R+. When cost c(F) is assigned to F (∈ K), the probability that F is removed decreases from h(F, 0) (= p(F)) to h(F, c(F)). We call p(F) and h(F, c(F)) failure probabilities for the sake of simplicity. In the rest of this paper, we assume that the improved failure probability function h is a step function. For graph G = (V, E), a set of the removal sets K, an improved failure probability function h, and an assigned cost function c, let P(G, K, h, c) be the probability that the graph is not connected. We call P(G, K, h, c) the graph failure probability.

Problem CADP
Input: Graph G = (V, E), a set of the removal sets K, improved failure probability function h, a positive real number ε.
Constraint: P(G, K, h, c) ≤ ε
Objective: Σ_{F∈K} c(F) (minimization)
Output: Assigned cost function c

For the sake of simplicity, we define the sum of the costs assigned to removal sets as the objective function, although, in general, the total cost is not necessarily a linear function of the costs assigned to removal sets. An example of this problem is shown in Figs. 1, 2, 3, and 4. We assume that, for graph G = (V, E), a removal set R(e) is {e} (e ∈ E) and that the set of the removal sets is K = {R(e) | e ∈ E}. The improved failure probability function h is the step function in Fig. 4. When the value of the horizontal axis is x and the value of the vertical axis is y, the failure probability is decreased by y after assigning cost x.


In Fig. 1, the graph failure probability is 0.832. In Fig. 2, the failure probabilities of R(e1), R(e2), and R(e3) are each decreased by 0.1; consequently, the graph failure probability becomes 0.72. On the other hand, in Fig. 3, the failure probability of R(e2) is decreased by 0.1 and that of R(e3) is decreased by 0.2; as a result, the graph failure probability becomes 0.706, which is smaller than the result in Fig. 2. Although the total cost is 3 both in Fig. 2 and in Fig. 3, the graph failure probability in Fig. 3 is less than that in Fig. 2. Thus, by appropriately assigning cost to the removal sets, the graph failure probability can be decreased. The purpose of the problem is to find the optimal assigned cost function.

Fig. 1. Failure probabilities of e1, e2, e3.

Fig. 2. Example 1 of improved probabilities (Total Cost = 3).

Fig. 3. Example 2 of improved probabilities (Total Cost = 3).

We formulate the problem CADP as a 0–1 integer programming problem, when the improved failure probability function h is a step function. Let F (∈ K) be a removal set. For the j-th interval of the step function (j = 1, 2, ..., s), when the cost c_j^F is assigned to F, let the failure probability be decreased by a_j^F from the initial failure probability p(F). We assume that c_1^F = 0 and a_1^F = 0 and that, when j ≥ 2, c_j^F > 0 and a_j^F > 0. For F (∈ K) and j (j = 1, 2, ..., s), if c_j^F is assigned to F and the failure probability becomes p(F) − a_j^F, then x_j^F = 1; otherwise, x_j^F = 0. Let H be a set of removal sets; we use the same notation H for the union of the removal sets in H. If H satisfies that G_H is not connected, H is called a separator. For graph G and the set of removal sets K, let S_{G,K} be the set of all separators. The problem CADP is formulated as follows.

Minimize     Σ_{F∈K} Σ_{j=1}^{s} c_j^F · x_j^F

subject to   Σ_{j=1}^{s} x_j^F = 1   (F ∈ K),

             Σ_{H∈S_{G,K}} [ Π_{F∈H} ( p(F) − Σ_{j=1}^{s} a_j^F · x_j^F ) · Π_{F'∈K\H} ( 1 − ( p(F') − Σ_{j=1}^{s} a_j^{F'} · x_j^{F'} ) ) ] ≤ ε.
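Since both K and the step-function levels are finite, the formulation can be checked directly on small instances. The following Python sketch (our own illustration; the authors solve the integer program with Gurobi, see Sect. 4) evaluates the graph failure probability and solves CADP by exhaustive search:

```python
from itertools import product

def graph_failure_probability(levels, p, a, separators):
    """P(G, K, h, c): sum over separators H of the probability that exactly
    the removal sets in H fail while all the others survive.

    levels: dict F -> chosen step index j, p: dict F -> initial probability,
    a: dict F -> list of probability decreases a_j^F, separators: iterable of
    frozensets of removal-set ids (S_{G,K}).
    """
    q = {F: p[F] - a[F][levels[F]] for F in p}  # improved failure probabilities
    total = 0.0
    for H in separators:
        pr = 1.0
        for F in q:
            pr *= q[F] if F in H else (1.0 - q[F])
        total += pr
    return total

def solve_cadp(p, a, c, separators, eps):
    """Exhaustive search over all 0-1 assignments x_j^F (one level per F)."""
    best = None
    keys = sorted(p)
    for choice in product(*[range(len(a[F])) for F in keys]):
        levels = dict(zip(keys, choice))
        if graph_failure_probability(levels, p, a, separators) <= eps:
            cost = sum(c[F][levels[F]] for F in keys)
            if best is None or cost < best[0]:
                best = (cost, levels)
    return best
```

For the two removal sets of Sect. 4 this search space is tiny; in general, both the number of separators and the number of level assignments grow exponentially, which is why the 0–1 integer programming formulation and a solver are used instead.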


Fig. 4. Step function for e1, e2, and e3.

4 Performance Evaluation

We evaluate the performance of CADP described in Sect. 3 using actual network topologies presented by CAIDA (Cooperative Association for Internet Data Analysis) [8]. In the following numerical experiments, for graph G = (V, E), we define the removal set Fi for vertex vi (∈ V) as the union of the set of edges incident to vi and the set of edges between vertices adjacent to vi. We show an example graph in Fig. 5, where the number of vertices is 19 and the number of edges is 33. We show the removal set F9 (the set of edges denoted by the dotted lines) of the network of Fig. 5 in Fig. 6. We investigate later the relationship between the size of the intersection of different removal sets and the cost. For example, the intersection of the removal sets F4 and F7 is the set of edges denoted by the bold lines in Fig. 7, and the size of this intersection is five.

Fig. 5. Network 1.

Fig. 6. Removal set F9.

Fig. 7. Edges in the intersection of F4 and F7.


For the sake of simplicity, we assume that the number of removal sets is two. We give the initial failure probability of each removal set randomly and, when the size of the intersection of the two removal sets is zero, we define the step function in Table 1 and Fig. 8 for one removal set and that in Table 2 and Fig. 9 for the other removal set. Table 1 and Fig. 8 show that the initial failure probability is 0.583 and that a cost of 4.357 (resp. 8.949, 21.031, 48.424, 79.392, 92.67) is required to decrease the failure probability by 0.1 (resp. 0.2, 0.3, 0.4, 0.5, 0.583). Table 2 and Fig. 9 show the corresponding contents for the other removal set.

Table 2. Step function for the other removal set

Probability Cost

Probability Cost

0.1

4.357

0.1

15.284

0.2

8.949

0.2

17.887

0.3

21.031

0.3

43.244

0.4

48.424

0.4

47.957

0.5

79.392

0.5

56.887

0.583

92.67

0.6

60.515

0.7

80.442

0.716

99.676

We show the results in Table 5. The columns "Cost A" and "Cost B" indicate the assigned cost and the improvement of the failure probability for one removal set (A) and the other removal set (B), respectively. As the threshold ε of the graph failure probability decreases, the minimum cost ("Total") to achieve the threshold, the costs assigned to the removal sets, and the amounts of reduction of the failure probabilities increase. The computational time is much less than one second on a MacBook Air with an Intel Core i7 2.2 GHz, 8 GB RAM, and the optimization solver Gurobi Ver. 9.0.1; we can thus obtain the optimum solution in sufficiently short computational time.

Next, we examine the relationship between the size of the intersection of the removal sets and the cost. We assume that the assigned cost decreases in proportion to the number of common edges in the removal sets. We show the results for the case where the number of common edges in the removal sets is five. We show the step functions in Table 3 and Fig. 10 for one removal set and in Table 4 and Fig. 11 for the other removal set. We show the results in Table 6. As in the case where the size of the intersection is zero, the costs assigned to the removal sets and the amounts of reduction of the failure probabilities change according to the threshold of the graph failure probability. As might be expected, the cost is less than in the case where the size of the intersection is zero. The computational time is also much less than one second.


Fig. 8. Step function for a removal set

Fig. 9. Step function for the other removal set

Table 3. Step function for a removal set (Size of intersection is 5)

Probability to decrease   Cost
0.1                       0.732
0.2                       1.504
0.3                       3.535
0.4                       8.139
0.5                       13.343
0.583                     15.575

Table 4. Step function for the other removal set (Size of intersection is 5)

Probability to decrease   Cost
0.1                       9.025
0.2                       10.562
0.3                       25.535
0.4                       28.318
0.5                       33.591
0.6                       35.734
0.7                       47.5
0.716                     58.858

Fig. 10. Step function for a removal set (Size of intersection is 5)


Fig. 11. Step function for the other removal set (Size of intersection is 5)


Table 5. Cost (Size of intersection: 0).

Threshold   Total     Cost A        Cost B
0.1         159.834   79.392(0.5)   80.442(0.7)
0.2         128.866   48.424(0.4)   80.442(0.7)
0.3         101.473   21.031(0.3)   80.442(0.7)
0.4         81.546    21.031(0.3)   60.515(0.6)
0.5         69.464    8.949(0.2)    60.515(0.6)
0.6         56.906    8.949(0.2)    47.957(0.4)
0.7         38.918    21.031(0.3)   17.887(0.2)
0.8         17.887    -             17.887(0.2)
0.9         0         -             -
1.0         0         -             -

Table 6. Cost (Size of intersection: 5).

Threshold   Total    Cost A        Cost B
0.1         60.844   13.343(0.5)   47.5(0.7)
0.2         49.077   13.343(0.5)   35.734(0.6)
0.3         43.872   8.139(0.4)    35.734(0.6)
0.4         39.268   3.535(0.3)    35.734(0.6)
0.5         36.457   8.139(0.4)    28.318(0.4)
0.6         23.906   13.343(0.5)   10.562(0.2)
0.7         14.097   3.535(0.3)    10.562(0.2)
0.8         3.535    3.535(0.3)    -
0.9         0        -             -
1.0         0        -             -

We show both results in Fig. 12. We can observe that the cost increases as the threshold ε decreases.

Fig. 12. Relationship between threshold and cost.

5 Conclusion

In this paper, we addressed a network design problem to make a network robust by improvement. We used the approach of protection, which decreases failure probability by both backup resources reserved in advance and a fast recovery mechanism. Since protection requires considerable cost, it is practical to preferentially protect only some highly important network elements so that the reliability of the entire information network is improved. We assumed that the initial failure probability of a failure set is given for all failure sets and that the relationship between cost and the probability decrease is given as a step function. Thus, we defined a network design problem that determines the cost assigned to each failure set so that the sum of the assigned costs is minimized under the constraint that the network failure probability must be within a threshold. In the information network improved by using the solution of this network design problem, even if all the nodes and links in any failure set are broken, the probability that all remaining nodes stay connected to each other in the surviving network is sufficiently large. First, we formulated the network design problem as a 0–1 integer programming problem, when the relationship between cost and the probability decrease is a step function. In addition, we evaluated the performance by using the topology of some actual information networks. As a result, we can obtain the optimum solution for the actual information networks within a practically short computational time. In this paper, for the sake of simplicity, we defined the sum of the costs assigned to failure sets as the objective function. In general, the total cost is not necessarily a linear function of the costs assigned to failure sets but a submodular function. In future work, we will extend the definition of the total cost and design an efficient algorithm.

Acknowledgements. This work was partially supported by the Japan Society for the Promotion of Science through Grants-in-Aid for Scientific Research (B) (17H01742) and JST CREST JPMJCR1402.

References

1. Frank, A.: Augmenting graphs to meet edge-connectivity requirements. In: Proceedings 31st Annual Symposium on Foundations of Computer Science, St. Louis, 22–24 October 1990
2. Ishii, T., Hagiwara, M.: Minimum augmentation of local edge-connectivity between vertices and vertex subsets in undirected graphs. Discrete Appl. Math. 154(16), 2307–2329 (2006)
3. Kortsarz, G., Krauthgamer, R., Lee, J.R.: Hardness of approximation for vertex-connectivity network design problems. Lect. Notes Comput. Sci. 2462, 185–199 (2002)
4. Arulselvan, A., Commander, C.W., Elefteriadou, L., Pardalos, P.M.: Detecting critical nodes in sparse graphs. Comput. Oper. Res. 36(7), 2193–2200 (2009)
5. Aggarwal, K.K., Chopra, Y.C., Bajwa, J.S.: Topological layout of links for optimising the overall reliability in a computer communication system. Microelectron. Reliab. 22, 347–351 (1982)
6. Jan, R.-H., Hwang, F.-J., Chen, S.-T.: Topological optimization of a communication network subject to a reliability constraint. IEEE Trans. Reliab. 42, 63–70 (1993)


7. Uji, K., Miwa, H.: Reliable network design problem considering cost to improve link reliability. In: Proceedings International Conference on Intelligent Networking and Collaborative Systems (INCoS 2018), Bratislava, Slovakia, 5–7 September 2018
8. CAIDA. http://www.caida.org/

Beacon-Less Autonomous Transmission Control Method for Spatio-Temporal Data Retention

Ichiro Goto1(B), Daiki Nobayashi1, Kazuya Tsukamoto1, Takeshi Ikenaga1, and Myung Lee2

1 Kyushu Institute of Technology, Kitakyushu, Japan
[email protected], {nova,ike}@ecs.kyutech.ac.jp, [email protected]
2 City College of New York, New York, USA
[email protected]

Abstract. With the development and spread of Internet of Things (IoT) technology, the number of devices connected to the Internet is increasing, and various kinds of data are now being generated from IoT devices. Some data generated from IoT devices depend on geographical location and time. We refer to such data as spatio-temporal data (STD). Since the "local production and consumption" of STD is effective for location-dependent applications, we have proposed an STD retention system using vehicles equipped with storage modules, computing resources, and short-range wireless communication equipment. In this previous system, each vehicle controls the data transmission probability based on the neighboring vehicle density in order to achieve effective data retention. However, the overhead of the beacon messages required to estimate the neighboring vehicle density becomes a critical problem as the number of vehicles increases, thereby preventing effective data retention. In this paper, we propose a new data transmission control method to realize effective and reliable STD retention without beacons. Simulation results showed that our proposed scheme can achieve effective data retention.

1 Introduction

With the development and spread of Internet of Things (IoT) technologies, the number of devices connected to the Internet is increasing. According to Cisco Systems, Inc., the number of IoT devices is growing year by year and is expected to reach approximately 28 billion by 2022 [1]. Therefore, various kinds of data are now being generated from IoT devices and are mostly collected, analyzed, and distributed by Internet cloud servers. When we pay attention to the data content generated from IoT devices, some content, such as traffic information, weather information, and disaster information, is highly dependent on location and time. We define such data content as "spatio-temporal data (STD)." STD can be effectively utilized by distributing it directly to users at its generation place. For example, when a traffic accident occurs, the driver distributes the information about the accident directly to other neighboring drivers. As a result, these drivers can passively acquire the accident information without any active action, so that they can act to avoid a route where a traffic jam occurs. In other words, we suppose that the "local production and consumption of data" brings us effective utilization of STD. Therefore, we have proposed a novel Geo-Centric Information Platform (GCIP) for collecting, analyzing, and delivering STD based on physical space [2]. As a fundamental element of GCIP, we have also proposed an STD retention system using vehicles equipped with storage modules, computing resources, and short-range wireless communication equipment [3–5]. In this system, vehicles capable of wireless communication are defined as regional information hubs (InfoHubs), and the purpose of the system is to retain STD within a target area. However, in a data retention method using vehicular ad hoc networks, since all vehicles use the same radio band, radio interference occurs frequently when the number of vehicles increases. Therefore, in our previous studies, we proposed transmission control methods that adapt to the data transmission situation of neighboring vehicles [3–5]. Those methods control the transmission probability based on the density of neighboring vehicles. However, since the beacon messages periodically transmitted from each vehicle are used to estimate the vehicle density, the overhead of beacon transmission can become a crucial problem, especially in a high vehicle density environment, thereby causing frequent radio interference. In this paper, we propose a new beacon-less transmission control method. In the proposed method, we first introduce transmission zones in which vehicles can transmit data for data retention. Next, we introduce a time division mechanism of the transmission timing among neighboring transmission zones to avoid radio interference. Finally, we control the transmission based on the signal strength of packets received from neighboring vehicles in order to distribute the data to the entire retention area with the minimum number of data transmissions. We evaluate the performance of the proposed method by simulation and verify its effectiveness. The rest of this paper is organized as follows. In Sect. 2, we briefly review studies related to data retention, while we outline our previous STD retention system and discuss its problems in Sect. 3. In Sect. 4, we describe our new transmission control method for STD retention, while simulation models and evaluation results are provided in Sect. 5. Finally, we give our conclusions in Sect. 6.

2 Related Works

Maihofer et al. proposed abiding geocast, which distributes and retains data to all vehicles within the geocast target area for a certain period of time [6]. They also proposed three approaches for distributing and retaining data to vehicles in the target area. The first is a server approach, where a particular server retains the data and periodically distributes it based on the geocast routing protocol. In this approach, not only does the specific server distribute the data, but the location information of all vehicles in the target area also needs to be exchanged in order to distribute the data, so the load on the specific server increases. The second is an election approach, in which selected vehicles in the target area retain the data and distribute it periodically. These two approaches increase the processing load on a particular server or vehicle, which results in more frequent failures; moreover, when these devices fail, the data cannot be distributed. The third is a neighbor approach, in which each vehicle within the target area holds the data and vehicle location information, and delivers the data when it senses vehicles entering the target area. This approach requires no infrastructure because it is a vehicle-only system; therefore, its range of practical use is wide, and many studies have been conducted on it. Rizzo et al. proposed a method to exchange information between vehicles on the assumption that the infrastructure cannot be used due to a disaster [7]. Leontiadis et al. proposed a method in which a vehicle exchanges navigation information with neighboring vehicles and distributes data to vehicles heading to the target area [8]. In Floating Content [9] and Locus [10], vehicles have a data list and exchange the list with neighboring vehicles. When a vehicle does not retain certain data, it makes a transmission request, and a vehicle retaining the data transmits it. At this time, the vehicle determines whether to transmit the data based on a data transmission probability set according to the distance from the center of the target area. Therefore, the further a vehicle is from the center, the lower its data acquisition probability. On the other hand, when many vehicles are located near the center, channel competition occurs because the transmission probability of all such vehicles is high, and the communication quality deteriorates. Furthermore, since the user has to send a query to acquire these data, there is an overhead in acquiring the data, which is inefficient for acquiring real-time information (for example, traffic information). Therefore, in this study, we propose a new network infrastructure that retains STD at its generation place and effectively distributes it to users in order to promote local production and consumption of data.

3 STD Retention System

In this section, we describe the assumptions, requirements, and outline of the retention system [3], and then discuss the related problems.

3.1 Assumption

InfoHub vehicles have a wireless interface that meets the IEEE 802.11p specification and obtain location information using a Global Positioning System (GPS) receiver. STD includes not only data for an application but also transmission control information such as the center coordinates and radius R of the retention area, the length r of an auxiliary area, and the data transmission interval d. Vehicles in the retention area transmit the data once every data transmission cycle (interval d). In addition, all vehicles are equipped with the same antenna and have the same transmission power.

3.2 System Requirements

To achieve data retention, the entire retention area must be covered by the union of the communication ranges of the InfoHub vehicles. Further, in order to realize quick data distribution to users, the data must be transmitted by InfoHub vehicles over the entire retention area at regular intervals so that users can receive the data in a short time (at least once every transmission interval d). In this paper, we define the coverage rate as an indicator of the data retention performance over a certain period. The coverage rate formula is as follows:

Coverage Rate = S_DT / S_TA   (1)

where S_TA is the size of the retention area and S_DT is the size of the total area where a user can obtain the data transmitted from InfoHub vehicles within the data transmission interval. Figure 1 shows an example of the coverage. The black dots indicate vehicles, and the pink circles indicate the communication range. As shown in Fig. 1(a), a high coverage rate means that users can passively acquire the data anywhere within the retention area. On the other hand, as shown in Fig. 1(b), a low coverage rate means that users are highly likely to be unable to passively acquire the data in the retention area. Therefore, maintaining a high coverage rate is important for the data retention system. In addition, in an environment with high vehicle density, since multiple vehicles frequently transmit data simultaneously, a lot of radio interference occurs, thereby degrading the data retention performance. Therefore, the requirement of this system is to maintain a high coverage rate while limiting the number of data transmissions as much as possible.

Fig. 1. Coverage rate
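Equation (1) can be estimated numerically, for instance by Monte Carlo sampling over the circular retention area; the following sketch is our own illustration, under the assumption that the area center is at the origin:

```python
import math
import random

def coverage_rate(tx_positions, comm_range, area_radius, samples=100_000):
    """Monte Carlo estimate of Eq. (1): S_DT / S_TA.

    tx_positions: (x, y) coordinates of vehicles that transmitted the data
    within one transmission interval d.
    """
    covered = 0
    for _ in range(samples):
        # uniform point inside the retention area (sqrt gives uniform density)
        ang = random.uniform(0.0, 2.0 * math.pi)
        rad = area_radius * math.sqrt(random.random())
        x, y = rad * math.cos(ang), rad * math.sin(ang)
        if any(math.hypot(x - tx, y - ty) <= comm_range for tx, ty in tx_positions):
            covered += 1
    return covered / samples
```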

3.3 Previous Method

The previous method controls the data transmission probability using simple information, such as the number of neighboring vehicles and the number of received data. Each vehicle periodically transmits a beacon in order to estimate the number of neighboring vehicles based on the number of beacons received from them, and sets its data transmission probability based on the numbers of beacons and data received in each data transmission interval d. As a result, the previous method achieved a high coverage rate while reducing the number of data transmissions.

3.4 Problems of the Previous Method

In the previous method, since each vehicle estimates the number of neighboring vehicles from the number of received beacons, all vehicles in the retention area must transmit beacons. Therefore, the number of beacon transmissions inherently increases in proportion to the number of vehicles. That is, although the previous method can reduce the number of data transmissions effectively even in a highly dense environment, the overhead of the beacon messages and the collisions between data and beacons are not taken into consideration at all. Therefore, in order to realize effective data retention, a beacon-less data transmission control is necessary.

Fig. 2. Optimal transmission point

Fig. 3. Transmission zone

4 Proposed Method

In this section, we propose a method to realize effective data retention without beacon transmissions. First, we introduce transmission zones in which vehicles can transmit data within the retention area. Next, in order to suppress radio interference between neighboring transmission zones, we assign one transmission interval to each transmission zone within one data transmission cycle. Finally, each vehicle controls its transmission based on its own location and the signal strength of packets received from neighboring vehicles.

4.1 Introduction of Transmission Zones

In this study, we first defined the minimum required set of transmission positions (hereinafter called "optimal transmission points") that covers the entire retention area, based on the communication range of one vehicle. Since not only the origin point of the STD but also the communication coverage are predetermined, we can know the optimal transmission points in advance. Therefore, as shown in Fig. 2, we set the optimal transmission points determined from this information. If only vehicles located at the optimal transmission points transmitted data, the number of data transmissions would be minimized. However, in a real environment, since vehicles move freely with high mobility, reliable transmission from exactly the optimal points is impossible. Therefore, we introduce a transmission zone whose center is an optimal transmission point, as shown in Fig. 3, and only vehicles within a transmission zone transmit data, aiming to maintain a high coverage rate.

4.2 Decision of Transmission Timing in the Transmission Zone

We set the transmission timing according to the following procedure so that a vehicle closer to the optimal transmission point transmits data preferentially within the transmission zone. Initially, we assume that all vehicles synchronize their time and data transmission cycles using GPS. First, each vehicle acquires its own current position at every data transmission cycle and confirms whether it is located within a transmission zone. When the vehicle is in a transmission zone, the vehicle calculates the distance to the nearest optimal transmission point and sets its transmission timing by the following formula:

Next Transmission Timing = (l / (s/2)) × d + Current Time   (2)

where l is the distance from the optimal transmission point, s is the radius of the communication range, and d is the transmission interval. With this formula, the transmission timing is set earlier for a vehicle closer to the optimal transmission point. Thus, the data can be transmitted by the vehicle closest to the optimal transmission point. On the other hand, when a vehicle is outside the transmission zones, it does not set a transmission timing (i.e., does not send a data packet) during this data transmission cycle.
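A minimal sketch of this per-cycle scheduling decision (the function names and the 2-D distance computation are ours):

```python
import math

def next_transmission_timing(l, s, d, current_time):
    """Eq. (2): the closer a vehicle is to the optimal point, the earlier it sends.

    l: distance to the nearest optimal transmission point [m]
    s: radius of the communication range [m], d: transmission interval [s]
    """
    return (l / (s / 2.0)) * d + current_time

def schedule(position, optimal_point, in_zone, s, d, current_time):
    """Send time for this cycle, or None when the vehicle is outside every zone."""
    if not in_zone:
        return None
    l = math.hypot(position[0] - optimal_point[0], position[1] - optimal_point[1])
    return next_transmission_timing(l, s, d, current_time)
```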

4.3 Time Division Scheduling Among Transmission Zones

In order to determine the transmission timing according to the distance from the optimal transmission point, all vehicles must synchronize their data transmission cycles; consequently, vehicles located in two neighboring transmission zones whose distances to their optimal transmission points are the same would transmit data simultaneously, causing radio interference. Therefore, we introduce a method to avoid radio interference between neighboring transmission zones. To avoid radio interference, we treat 9 neighboring transmission zones as one group, and one transmission interval is allocated to each transmission zone within one data transmission cycle, as shown in Fig. 4. Then, the vehicles in each transmission zone set the transmission timing described in Sect. 4.2 within the transmission interval assigned to their zone. Thus, since the transmission timings of neighboring transmission zones differ from each other, radio interference can be prevented.
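One concrete way to realize this grouping is a 3 × 3 tiling with modulo-based slot indices, as in the sketch below; the paper only states that the 9 neighboring zones each receive one interval per cycle, so this particular mapping is our assumption:

```python
def zone_slot(zone_x, zone_y):
    """Slot index 0..8 of a zone inside its 3 x 3 group (Sect. 4.3)."""
    return (zone_x % 3) * 3 + (zone_y % 3)

def zone_interval(zone_x, zone_y, cycle_start, d):
    """Sub-interval of the cycle [cycle_start, cycle_start + d) assigned to the
    zone; the Eq. (2) timings are then mapped into this window."""
    slot = zone_slot(zone_x, zone_y)
    return (cycle_start + slot * d / 9.0, cycle_start + (slot + 1) * d / 9.0)
```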

4.4 Transmission Control Based on the Received Signal Strength

In the previous section, each vehicle decided its transmission timing based on the distance from the optimal transmission point. Next, each vehicle determines whether to transmit data at all, based on the surrounding transmission situation. The ideal environment for this system is one in which data is transmitted from the optimal transmission point in free space. Thus, by comparing the received signal strength expected in this ideal environment with the actually measured received signal strength, a vehicle can determine whether a transmission is required at its location. When the measured received signal strength is higher than the ideal one, the vehicle does not need to transmit the data, because the space around it already has sufficient signal strength. On the other hand, when the measured received signal strength is lower than the ideal one, the propagation environment around the vehicle is poor and areas unreached by radio waves may appear, so the vehicle needs to transmit the data. Therefore, a vehicle in a transmission zone controls its transmissions according to the following procedure.

Fig. 4. Grouping of transmission zones


Fig. 5. Simulation model

1. When a vehicle receives data, it acquires its current position information.
2. The vehicle checks whether the received data was transmitted from the same transmission zone as its own.
   a. When the transmission zones are the same, the vehicle calculates its distance to the optimal transmission point and computes the ideal received signal strength at the current position using the Friis transmission equation (a code sketch follows this list):

   \[ P_r = G_t G_r P_t \left( \frac{\lambda}{4\pi r} \right)^2 \tag{3} \]

   where P_r is the received power, G_t is the transmission gain, G_r is the reception gain, P_t is the transmission power, λ is the wavelength, and r is the distance from the optimal transmission point. Next, the vehicle compares the calculated value with the measured value. When the measured value is larger, the vehicle does not transmit the data in the current data transmission cycle and waits until the next data transmission cycle.
   b. When the transmission zones are different, the vehicle waits until its transmission timing.
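The comparison in step 2a can be sketched directly from Eq. (3). A minimal example follows; the function names are hypothetical, and the antenna gains, transmission power, and wavelength would come from the radio model rather than from the paper.

```cpp
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Ideal free-space received power at distance r per the Friis equation (3).
double friisReceivedPower(double pt, double gt, double gr,
                          double lambda, double r) {
    const double f = lambda / (4.0 * kPi * r);
    return gt * gr * pt * f * f;
}

// Step 2a: skip this cycle's transmission when the measured power already
// exceeds the free-space ideal, i.e., the neighborhood is well covered.
bool skipTransmission(double measuredPower, double pt, double gt, double gr,
                      double lambda, double r) {
    return measuredPower > friisReceivedPower(pt, gt, gr, lambda, r);
}
```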

5 Simulation Results

In this section, we evaluate the performance of our proposed method by simulation.

5.1 Simulation Model

We evaluated our proposed method using the Veins [14] simulation framework, which implements both the IEEE 802.11p specification and a mobility model for vehicular ad hoc networks (VANETs). Veins combines the Objective Modular Network Testbed in C++ (OMNeT++) [12] network simulator with the Simulation of Urban MObility (SUMO) road traffic simulator [13]. To show the effectiveness of our proposed method, we used a random topology (Fig. 5) in which vehicles with randomly generated starting and ending points drive on a road. A traffic signal was installed at each intersection, and the distance between intersections w was 50 m. In our simulations, we set each parameter based on the previous method [3]. The retention area radius R was set to 750 m, the auxiliary area length r to 250 m, the communication range of a vehicle to 300 m, the vehicle speed to 40 km/h, and the transmission interval d to 5 s. We created 10 mobility models for vehicles with randomly generated starting and ending points and evaluated them while changing the number of vehicles in each model from 250 to 1000. As comparison methods, we used a naive method in which a vehicle always sends data at least once during the transmission interval d, the previous method [3], the proposed method without the time division scheduling (TDS) of Sect. 4.3, and the proposed method with the scheduling. In the previous method, in addition to the above parameter settings, the beacon transmission interval was set to 5 s, the moving average coefficient α to 0.5, and the target value of the number of received data β to 4.

5.2 Performance Evaluation

First, we evaluate the coverage rate. Figure 6 shows the coverage rate of the four comparison methods. From this graph, the proposed method achieves a coverage rate of approximately 100%, as does the previous method, regardless of the number of vehicles. This result shows that the proposed method can distribute the STD to the entire retention area without beacon transmissions, regardless of the number of vehicles.

Next, we evaluate the number of retained STD transmissions, shown in Fig. 7. From this graph, the proposed method suppresses the number of retained STD transmissions to around 100, the same as the previous method, even in an environment with high vehicle density. This result shows that the proposed method achieves the same reduction in the number of data transmissions as the previous method by controlling transmission according to the signal strength of the received data.

On the other hand, Fig. 8 shows the total number of data transmissions to the retention area, i.e., the number of transmissions of all data, including beacons, not just the retained STD. Since all vehicles in the previous method transmit beacons, its total number of data transmissions to the retention area is larger than that of the naive method. As a result, it is difficult for the previous method to achieve effective data retention, because the frequency of radio interference may increase. In contrast, since the proposed method does not transmit beacons, its total number of data transmissions to the retention area is significantly smaller than those of the other methods. These results indicate that the proposed method can achieve data retention with a minimum of data transmissions.

Finally, we evaluate the radio interference. Figure 9 shows the number of occurrences of radio interference. From this graph, the performance of the proposed method without TDS is equivalent to that of the previous method. Furthermore, the proposed method with TDS suppresses the total number of occurrences of radio interference to approximately one tenth of that of the previous method. This result shows that, thanks to the time division scheduling, the proposed method rarely causes radio interference even in an environment with high vehicle density.

These results show that the proposed method can significantly reduce the number of data transmissions and the radio interference without beacon transmissions, while achieving a coverage rate of approximately 100% regardless of the number of vehicles.

Fig. 6. Coverage rate

Fig. 7. The number of retained STD transmissions

Fig. 8. The total number of data transmissions

Fig. 9. The number of occurrences of radio interference

6 Conclusions

In this paper, we proposed an STD retention system that enables passive data reception in a specific area using a VANET composed of InfoHub vehicles. In addition, we proposed a transmission control method that realizes effective data retention without beacon transmissions. The proposed method controls data transmission based on the transmission position and the received signal strength of data. Simulation evaluation revealed that our proposed method can significantly reduce the number of data transmissions and the radio interference while achieving a coverage rate of approximately 100% regardless of the number of vehicles. In future work, we will evaluate the method using a vehicle traffic model that simulates a real environment, such as LuST [11].

Acknowledgements. This work was supported in part by JSPS KAKENHI Grant Number 18H03234, NICT Grant Number 19304, and NSF award numbers 1818884 (JUNO2) and 1827923 (COSMOS).

References

1. Cisco: Cisco Visual Networking Index: Forecast and Trends, 2017–2022. Cisco White Paper (2019). https://davidellis.ca/wp-content/uploads/2019/12/cisco-vni-mobile-data-traffic-feb-2019.pdf
2. Nagashima, K., Taenaka, Y., Nagata, A., Nakamura, K., Tamura, H., Tsukamoto, K.: Experimental evaluation of publish/subscribe-based spatio-temporal contents management on geo-centric information platform. In: NBiS 2019: Advances in Networked-Based Information Systems, pp. 396–405, August 2019
3. Teshiba, H., Nobayashi, D., Tsukamoto, K., Ikenaga, T.: Adaptive data transmission control for reliable and efficient spatio-temporal data retention by vehicles. In: Proceedings of ICN 2017, pp. 46–52, Italy, April 2017
4. Yamasaki, S., Nobayashi, D., Tsukamoto, K., Ikenaga, T., Lee, M.: On-demand transmission interval control method for spatio-temporal data retention. In: INCoS 2019: Advances in Intelligent Networking and Collaborative Systems, pp. 319–330, August 2019
5. Goto, I., Nobayashi, D., Tsukamoto, K., Ikenaga, T., Lee, M.: Transmission control method to realize efficient data retention in low vehicle density environments. In: INCoS 2019: Advances in Intelligent Networking and Collaborative Systems, pp. 390–401, August 2019
6. Maihofer, C., Leinmuller, T., Schoch, E.: Abiding geocast: time-stable geocast for ad hoc networks. In: Proceedings of ACM VANET, pp. 20–29 (2005)
7. Rizzo, G., Neukirchen, H.: Geo-based content sharing for disaster relief applications. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, Advances in Intelligent Systems and Computing, vol. 612, pp. 894–903 (2017)
8. Leontiadis, I., Costa, P., Mascolo, C.: Persistent content-based information dissemination in hybrid vehicular networks. In: Proceedings of IEEE PerCom, pp. 1–10 (2009)
9. Ott, J., Hyytiä, E., Lassila, P., Vaegs, T., Kangasharju, J.: Floating content: information sharing in urban areas. In: Proceedings of IEEE PerCom, pp. 136–146 (2011)
10. Thompson, N., Crepaldi, R., Kravets, R.: Locus: a location-based data overlay for disruption-tolerant networks. In: Proceedings of ACM CHANTS, pp. 47–54 (2010)
11. Codeca, L., Frank, R., Engel, T.: Luxembourg SUMO Traffic (LuST) scenario: 24 hours of mobility for vehicular networking research. In: Proceedings of the 7th IEEE Vehicular Networking Conference (VNC 2015) (2015)
12. OMNeT++. https://omnetpp.org/
13. SUMO. http://www.dlr.de/ts/en/desktopdefault.aspx/tabid-9883/16931_read-41000/
14. Veins. http://veins.car2x.org/

Author Index

A: Abouali, Meryem, 161; Ahmed, Aly, 89; Ahmed, Kazi J., 216; Ajayi, Oluwaseyi, 161; Amato, Alessandra, 194, 238; Ampririt, Phudit, 1, 15, 26; As, Mansur, 447; Avino, Ilaria, 44
B: Barolli, Admir, 1, 26; Barolli, Leonard, 1, 15, 26, 36, 281, 291; Batiha, Tarek, 149, 204; Braun, Nicholas A., 133; Bylykbashi, Kevin, 15
C: Canonico, Roberto, 112; Chen, Xu, 338; Chen, Yuan, 326; Chen, Yubo, 133; Cozzolino, Giovanni, 112, 194, 238; Cuka, Miralda, 36
D: Duolikun, Dilawaer, 101
E: Elhadad, Mohamed K., 256; Elmazi, Donald, 36; Enokido, Tomoya, 101
F: Fenza, Giuseppe, 44; Fiore, Nicola, 183; Fuccio, Graziano, 44; Fujihara, Akihiro, 121
G: Gajdoš, Petr, 413; Gebali, Fayez, 256; Genovese, Alessia, 44; Goto, Ichiro, 503; Guo, Xuan, 326
H: Haohan, Bei, 300; Hernandez, Marco, 216; Hřivňák, Rostislav, 413; Hryhoruk, Connor C. J., 133
I: Iizumi, Rui, 67; Ikeda, Makoto, 36, 291; Ikenaga, Takeshi, 503; Iwai, Yuutaro, 121
J: Jain, Prakhar, 133; Johnny, Olayinka, 319
K: Kimoto, Kenta, 78; Köppen, Mario, 447; Krömer, Pavel, 149, 204, 425; Kumazoe, Kazumi, 460
L: Lee, Myung, 216, 470, 480, 503; Leung, Carson K., 133; Levesque, Denis L., 133; Li, Jiangtao, 382; Li, Kin Fun, 256; Li, Linyang, 361; Li, Qiong, 382; Li, Xiao, 348, 361; Liu, Shudong, 338; Loia, Vincenzo, 44; Lu, Hong, 269; Lu, Wei, 269
M: Maisto, Alessandro, 173; Martorelli, Giandomenico, 173; Matsuo, Keita, 1, 15, 36, 281; Mitsugi, Kenshiro, 281; Miwa, Hiroyoshi, 56, 78, 493; Morino, Yuma, 493
N: Nagashima, Kaoru, 470; Nagata, Akira, 470; Nakamura, Shigenari, 67; Nakamura, Yasuto, 56; Nobayashi, Daiki, 503; Noguchi, Kaiya, 101; Nowaková, Jana, 425
O: Ogiela, Lidia, 145; Ogiela, Marek R., 145; Ohara, Seiji, 1; Okamoto, Shusuke, 26; Orciuoli, Francesco, 44
P: Pan, Xiang, 326; Paone, Antonietta, 173; Parente, Gaetano, 183; Pelosi, Serena, 173; Platoš, Jan, 391, 425; Polito, Massimiliano, 183
Q: Qafzezi, Ermioni, 15
R: Révay, Lukáš, 402; Rosenbaum, Sara, 269
S: Saadawi, Tarel, 161; Saito, Takumi, 67, 101; Sakamoto, Shinji, 26; Sakumoto, Yusuke, 245; Seth, Nitya, 133; Shang, Siyuan, 133; Shibata, Masahiro, 437; Shimokawa, Shumpei, 480; Sichao, Liao, 56; Sirimorok, Nurdiansyah, 447; Snášel, Václav, 413, 425; Soussan, Tariq, 312; Sperlì, Giancarlo, 112; Srinivasan, Venkatesh, 226; Stingo, Michele, 183; Su, Lili, 382; Sun, Yujia, 391
T: Tada, Yoshiki, 291; Taenaka, Yuzo, 470, 480; Takizawa, Makoto, 1, 15, 36, 67, 101; Tamura, Hitomi, 470; Taniguchi, Toyoaki, 245; Thomo, Alex, 89, 226; Tomić, Sanja, 402; Toyama, Atushi, 281; Trovati, Marcello, 312, 319; Tsukamoto, Kazuya, 216, 470, 480, 503; Tsuru, Masato, 437, 460
W: Wang, Chao, 361; Wang, Chuang, 338; Wang, Xu An, 338; Wang, Zhifeng, 372; Wei, Jiamin, 338; Wen, Yan, 133; Wenhao, Li, 300
X: Xiao, Zhiting, 326; Xu, Yiming, 348, 361
Y: Yang, Yao, 372; Yang, Yu, 348, 361; Ying, Huang, 300; Yoshida, Kaori, 447; Yoshida, Ryosuke, 437; Yu, Tengkai, 226; Yubo, Liu, 300
Z: Zeng, Chunyan, 372; Zhao, Yanyan, 382; Zhu, Dongliang, 372