Advances on P2P, Parallel, Grid, Cloud and Internet Computing: Proceedings of the 15th International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC-2020) [1st ed.] 9783030611040, 9783030611057

This book aims to provide the latest research findings, innovative research results, methods and development techniques, from both theoretical and practical perspectives, related to P2P, parallel, grid, cloud and Internet computing.


English | Pages XXV, 442 [464] | Year 2021


Table of contents:
Front Matter ....Pages i-xxv
An Intelligent VegeCare Tool for Corn Disease Classification (Natwadee Ruedeeniraman, Makoto Ikeda, Leonard Barolli)....Pages 1-8
Performance Comparison of CM and RDVM Router Replacement Methods for WMNs by WMN-PSOHC Hybrid Simulation System Considering Normal Distribution of Mesh Clients (Shinji Sakamoto, Leonard Barolli, Shusuke Okamoto)....Pages 9-17
An Algorithm to Select a Server to Minimize the Total Energy Consumption of a Cluster (Kaiya Noguchi, Takumi Saito, Dilawaer Duolikun, Tomoya Enokido, Makoto Takizawa)....Pages 18-28
An Approach to Support the Design and the Dependability Analysis of High Performance I/O Intensive Distributed Systems (Lucas Bressan, Laércio Pioli, Mario A. R. Dantas, Fernanda Campos, André L. de Oliveira)....Pages 29-40
A Waiting Time Determination Method to Merge Data on Distributed Sensor Data Stream Collection (Tomoya Kawakami, Tomoki Yoshihisa, Yuuichi Teranishi)....Pages 41-50
Possible Energy Consumption of Messages in an Opportunistic Network (Nanami Kitahara, Shigenari Nakamura, Takumi Saito, Tomoya Enokido, Makoto Takizawa)....Pages 51-61
Aggregating and Sharing Contents for Reducing Redundant Caches on NDN (Yuya Nakata, Tetsuya Shigeyasu)....Pages 62-73
A Balanced Dissemination of Time Constraint Tasks in Mobile Crowdsourcing: A Double Auction Perspective (Jaya Mukhopadhyay, Vikash Kumar Singh, Sajal Mukhopadhyay, Anita Pal)....Pages 74-85
A Scheduling Method of Division-Based Broadcasting Considering Delivery Cycle (Yusuke Gotoh, Keisuke Kuroda)....Pages 86-94
A Simply Implementable Architecture for Broadcast Communication Environments (Tomoki Yoshihisa)....Pages 95-101
Assessment of Available Edge Computing Resources in SDN-VANETs by a Fuzzy-Based System Considering Trustworthiness as a New Parameter (Ermioni Qafzezi, Kevin Bylykbashi, Phudit Ampririt, Makoto Ikeda, Leonard Barolli, Makoto Takizawa)....Pages 102-112
eWound-PRIOR: An Ensemble Framework for Cases Prioritization After Orthopedic Surgeries (Felipe Neves, Morgan Jennings, Miriam Capretz, Dianne Bryant, Fernanda Campos, Victor Ströele)....Pages 113-125
Challenges of Crowdsourcing Platform: Thai Healthcare Information Case Study (Krit Khwanngern, Juggapong Natwichai, Vivatchai Kaveeta, Panutda Nantawad, Sineenuch Changkai, Supaksiri Suwiwattana)....Pages 126-135
An Implementation Science Effort in a Heterogenous Edge Computing Platform to Support a Case Study of a Virtual Scenario Application (Marceau Decamps, Jean-Francois Meháut, Vinicius Vidal, Leonardo Honorio, Laércio Pioli, Mario A. R. Dantas)....Pages 136-147
Detection and Analysis of Meal Sequence and Time Based on Internet of Things (Liyang Zhang, Hiroyuki Suzuki, Akio Koyama)....Pages 148-157
An Approach of Time Constraint of Data Intensive Scalable in e-Health Environment (Eliza Gomes, Rubens Zanatta, Patricia Plentz, Carlos De Rolt, Mario Dantas)....Pages 158-169
A Tool to Manage Educational Activities on a University Campus (Antonio Sarasa-Cabezuelo, Santi Caballé)....Pages 170-178
Towards the Use of Personal Robots to Improve the Online Learning Experience (Jordi Conesa, Beni Gómez-Zúñiga, Eulàlia Hernández i Encuentra, Modesta Pousada Fernández, Manuel Armayones Ruiz, Santi Caballé Llobet et al.)....Pages 179-187
Towards the Design of Ethically-Aware Pedagogical Conversational Agents (Joan Casas-Roma, Jordi Conesa)....Pages 188-198
Evaluation on Using Conversational Pedagogical Agents to Support Collaborative Learning in MOOCs (Santi Caballé, Jordi Conesa, David Gañán)....Pages 199-210
Detection of Student Engagement in e-Learning Systems Based on Semantic Analysis and Machine Learning (Daniele Toti, Nicola Capuano, Fernanda Campos, Mario Dantas, Felipe Neves, Santi Caballé)....Pages 211-223
Monitoring Airplanes Faults Through Business Intelligence Tools (Alessandra Amato, Giovanni Cozzolino, Alessandro Maisto, Serena Pelosi)....Pages 224-234
Artificial Intelligent ChatBot for Food Related Question (Alessandra Amato, Giovanni Cozzolino, Antonino Ferraro)....Pages 235-240
A Smart Interface for Provisioning of Food and Health Advices (Alessandra Amato, Giovanni Cozzolino, Antonino Ferraro)....Pages 241-250
Analysis of COVID-19 Data (Alessandra Amato, Giovanni Cozzolino, Alessandro Maisto, Serena Pelosi)....Pages 251-260
Towards the Generalization of Distributed Software Communication (Reinout Eyckerman, Thomas Huybrechts, Raf Van den Langenbergh, Wim Casteels, Siegfried Mercelis, Peter Hellinckx)....Pages 261-270
A Survey on the Software and Hardware-Based Influences on the Worst-Case Execution Time (Thomas Huybrechts, Siegfried Mercelis, Peter Hellinckx)....Pages 271-281
Intelligent Data Sharing in Digital Twins: Positioning Paper (Thomas Cassimon, Jens de Hoog, Ali Anwar, Siegfried Mercelis, Peter Hellinckx)....Pages 282-290
Towards Hybrid Camera Sensor Simulation for Autonomous Vehicles (Dieter Balemans, Yves De Boeck, Jens de Hoog, Ali Anwar, Siegfried Mercelis, Peter Hellinckx)....Pages 291-300
Lane Marking Detection Using LiDAR Sensor (Ahmed N. Ahmed, Sven Eckelmann, Ali Anwar, Toralf Trautmann, Peter Hellinckx)....Pages 301-310
Applying Artificial Intelligence for the Detection and Analysis of Weather Phenomena in Vehicle Sensor Data (Wouter Van den Bogaert, Toon Bogaerts, Wim Casteels, Siegfried Mercelis, Peter Hellinckx)....Pages 311-320
Proposal of a Traditional Craft Simulation System Using Mixed Reality (Rihito Fuchigami, Tomoyuki Ishida)....Pages 321-329
Development and Evaluation of an Inbound Tourism Support System Using Augmented Reality (Yusuke Kosaka, Tomoyuki Ishida)....Pages 330-338
A Study on the Relationship Between Refresh-Rate of Display and Reaction Time of eSports (Koshiro Murakami, Kazuya Miyashita, Hideo Miyachi)....Pages 339-347
Basic Consideration of Video Applications System for Tourists Based on Autonomous Driving Road Information Platform in Snow Country (Yoshitaka Shibata, Akira Sakuraba, Yoshiya Saito, Yoshikazu Arai, Jun Hakura)....Pages 348-355
Design of In-depth Security Protection System of Integrated Intelligent Police Cloud (Fahua Qian, Jian Cheng, Xinmeng Wang, Yitao Yang, Chanchan Li)....Pages 356-365
Design and Implementation of Secure File Transfer System Based on Java (Tu Zheng, Su Yunxuan, Wang Xu An, Li Ruifeng)....Pages 366-375
Secure Outsourcing Protocol Based on Paillier Algorithm for Cloud Computing (Su Yunxuan, Tu Zheng, Wang Xu An, Li Ruifeng)....Pages 376-384
Energy Consumption and Computation Models of Storage Systems (Wenlun Tong, Takumi Saito, Makoto Takizawa)....Pages 385-396
Performance Analysis of WMNs by WMN-PSODGA Simulation System Considering Uniform Distribution of Mesh Clients and Different Router Replacement Methods (Seiji Ohara, Admir Barolli, Phudit Ampririt, Keita Matsuo, Leonard Barolli, Makoto Takizawa)....Pages 397-409
Forecasting Electricity Consumption Using Weather Data in an Edge-Fog-Cloud Data Analytics Architecture (Juan C. Olivares-Rojas, Enrique Reyes-Archundia, José A. Gutiérrez-Gnecchi, Ismael Molina-Moreno, Arturo Méndez-Patiño, Jaime Cerda-Jacobo)....Pages 410-419
Vision-Referential Speech Enhancement with Binary Mask and Spectral Subtraction (Mitsuharu Matsumoto)....Pages 420-428
Detection of the QRS Complexity in Real Time with Bluetooth Communication (Ricardo Rodríguez-Jorge, I. De León-Damas, Jiri Bila)....Pages 429-439
Back Matter ....Pages 441-442


Lecture Notes in Networks and Systems 158

Leonard Barolli · Makoto Takizawa · Tomoki Yoshihisa · Flora Amato · Makoto Ikeda, Editors

Advances on P2P, Parallel, Grid, Cloud and Internet Computing Proceedings of the 15th International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC-2020)

Lecture Notes in Networks and Systems Volume 158

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong

The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and others. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. ** Indexing: The books of this series are submitted to ISI Proceedings, SCOPUS, Google Scholar and Springerlink **

More information about this series at http://www.springer.com/series/15179


Editors

Leonard Barolli, Fukuoka Institute of Technology, Fukuoka, Japan
Makoto Takizawa, Hosei University, Tokyo, Japan
Tomoki Yoshihisa, Osaka University, Osaka, Japan
Flora Amato, University of Naples “Federico II”, Napoli, Italy
Makoto Ikeda, Fukuoka Institute of Technology, Fukuoka, Japan

ISSN 2367-3370  ISSN 2367-3389 (electronic)
Lecture Notes in Networks and Systems
ISBN 978-3-030-61104-0  ISBN 978-3-030-61105-7 (eBook)
https://doi.org/10.1007/978-3-030-61105-7

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Welcome Message from 3PGCIC-2020 Organizing Committee

Welcome to the 15th International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC-2020), which will be held in conjunction with the BWCCA-2020 International Conference from October 28 to October 30, 2020 in Yonago City, Tottori Prefecture, Japan.

P2P, grid, cloud and Internet computing technologies have been established as breakthrough paradigms for solving complex problems by enabling large-scale aggregation and sharing of computational data and other geographically distributed computational resources.

Grid computing originated as a paradigm for high-performance computing, as an alternative to expensive supercomputers. The grid computing domain has been extended to embrace different forms of computing, including semantic and service-oriented grid, pervasive grid, data grid, enterprise grid, autonomic grid, and knowledge and economy grid.

P2P computing appeared as the new paradigm after client–server and Web-based computing. These systems are evolving beyond file sharing toward a platform for large-scale distributed applications. P2P systems have also inspired the emergence and development of social networking, business to business (B2B), business to consumer (B2C), business to government (B2G), business to employee (B2E) and so on.

Cloud computing has been defined as a “computing paradigm where the boundaries of computing are determined by economic rationale rather than technical limits.” Cloud computing is a multipurpose paradigm that enables efficient management of data centers, timesharing and virtualization of resources, with a special emphasis on the business model. Cloud computing has quickly become the dominant computing paradigm, with applications in all domains, providing utility computing at large scale.

Finally, Internet computing is the basis of any large-scale distributed computing paradigm; it has rapidly developed into a vast and flourishing field with an enormous impact on today’s information societies. Internet-based computing thus serves as a universal platform comprising a large variety of computing forms.


The aim of the 3PGCIC conference is to provide a research forum for presenting innovative research results, methods and development techniques, from both theoretical and practical perspectives, related to P2P, grid, cloud and Internet computing.

Many people have helped and worked hard to produce a successful 3PGCIC-2020 technical program and conference proceedings. First, we would like to thank all the authors for submitting their papers, as well as the PC members and the reviewers, who carried out the most difficult work of carefully evaluating the submitted papers. We thank the Web administrators for their excellent work and support with the Web submission and management system of the conference. We are grateful to Prof. Makoto Takizawa, Hosei University, Japan, Honorary Chair of the conference, for his support and encouragement. Our special thanks also go to the keynote speakers. We hope you will enjoy the conference and have a great time in Yonago City, Japan.

Leonard Barolli
3PGCIC-2020 Steering Committee Chair

Tomoki Yoshihisa, Flora Amato, Chuan-Yu Chang
3PGCIC-2020 General Co-chairs

Yusuke Gotoh, Omar Hussain, Juggapong Natwichai
3PGCIC-2020 Program Committee Co-chairs

3PGCIC-2020 Organizing Committee

Honorary Chair
Makoto Takizawa, Hosei University, Japan

General Co-chairs
Tomoki Yoshihisa, Osaka University, Japan
Flora Amato, University of Naples Federico II, Italy
Chuan-Yu Chang, National Yunlin University of Science and Technology, Taiwan

Program Committee Co-chairs
Yusuke Gotoh, Okayama University, Japan
Omar Hussain, University of New South Wales, Australia
Juggapong Natwichai, Chiang Mai University, Thailand

Workshops Co-chairs
Peter Hellinckx, University of Antwerp, Belgium
Tomoyuki Ishida, Fukuoka Institute of Technology, Japan
Santi Caballe, Open University of Catalonia, Spain

Finance Chair
Makoto Ikeda, Fukuoka Institute of Technology, Japan

Web Administrator Chairs
Kevin Bylykbashi, Fukuoka Institute of Technology, Japan
Phudit Ampririt, Fukuoka Institute of Technology, Japan
Seiji Ohara, Fukuoka Institute of Technology, Japan
Ermioni Qafzezi, Fukuoka Institute of Technology, Japan

Local Organizing Co-chairs
Elis Kulla, Okayama University of Science, Japan
Akimitsu Kanzaki, Shimane University, Japan

Steering Committee Chair
Leonard Barolli, Fukuoka Institute of Technology, Japan

Track Areas

1. Data Mining, Semantic Web and Information Retrieval

Co-chairs
Bowonsak Srisungsittisunti, University of Phayao, Thailand
Francesco Piccialli, University of Naples “Federico II”, Italy
Agnes Haryanto, Monash University, Australia

PC Members
De-Nian Yang, Academia Sinica, Taiwan
Nicola Cuomo, ESET, Slovakia
Marco Cesarano, Marvell Semiconductor, Santa Clara, California, USA
Giuseppe Cicotti, Definiens, The Tissue Phenomics Company, Munich, Germany
Marco Giacalone, Vrije Universiteit Brussel, Belgium
Seyedeh Sajedeh Saleh, Vrije Universiteit Brussel, Belgium
Luca Sorrentino, Brightstep AB, Stockholm, Sweden
Antonino Vespoli, Centre for Intelligent Power at Eaton, Dublin, Ireland
Wenny Rahayu, La Trobe University, Australia
David Taniar, Monash University, Australia
Eric Pardede, La Trobe University, Australia
Kiki Adhinugraha, La Trobe University, Australia

2. Cloud and Service-Oriented Computing

Co-chairs
Mario Dantas, Federal University of Juiz de Fora (UFJF), Brazil
Francesco Orciuoli, University of Salerno, Italy
Wang Xu An, Engineering University of CAPF, China

PC Members
Douglas D. J. de Macedo, University of Santa Catarina, Brazil
Edelberto Franco Silva, University of Juiz de Fora, Brazil
Massimo Villari, University of Messina, Italy
Stefano Chessa, University of Pisa, Italy
Miriam Capretz, University of Western Ontario, Canada
Jean-Francois Mehaut, University of Grenoble Alpes, France
Giuseppe Fenza, University of Salerno, Italy
Carmen De Maio, University of Salerno, Italy
Angelo Gaeta, University of Salerno, Italy
Sergio Miranda, University of Salerno, Italy

3. Security and Privacy for Distributed Systems

Co-chairs
Aniello Castiglione, University of Naples Parthenope, Italy
Michal Choras, University of Bydgoszcz, Poland
Giovanni Mazzeo, University of Naples Parthenope, Italy

PC Members
Silvio Barra, University of Cagliari, Italy
Carmen Bisogni, University of Salerno, Italy
Javier Garcia Blas, Charles III University of Madrid, Spain
Han Jinguang, University of Surrey, UK
Sokol Kosta, University of Aalborg, Denmark
Gloria Ortega López, University of Malaga, Spain
Raffaele Montella, University of Naples Parthenope, Italy
Fabio Narducci, University of Naples Parthenope, Italy
Rafal Kozik, UTP Bydgoszcz, Poland
Joerg Keller, FUH Hagen, Germany
Rafal Renk, UAM Poznan, Poland
Salvatore D’Antonio, University of Naples Parthenope, Italy
Lukasz Apiecionek, UKW Bydgoszcz, Poland
Joao Campos, University of Coimbra, Portugal
Gerhard Habiger, Ulm University, Germany
Luigi Sgaglione, University of Naples Parthenope, Italy
Valerio Formicola, University of Naples Parthenope, Italy

4. P2P, Grid and Scalable Computing

Co-chairs
Nadeem Javaid, COMSATS University Islamabad, Pakistan
Keita Matsuo, Fukuoka Institute of Technology, Japan

PC Members
Joan Arnedo Moreno, Open University of Catalonia, Spain
Santi Caballe, Open University of Catalonia, Spain
Vladi Kolici, Polytechnic University of Tirana, Albania
Evjola Spaho, Polytechnic University of Tirana, Albania
Yi Liu, Oita National College of Technology, Japan
Yusuke Gotoh, Okayama University, Japan
Akihiro Fujimoto, Wakayama University, Japan
Kamran Munir, University of the West England, UK
Safdar Hussain Bouk, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Korea
Muhammad Imran, King Saud University, Saudi Arabia
Syed Hassan Ahmed, Georgia Southern University, USA
Hina Nasir, Air University Islamabad, Pakistan
Sakeena Javaid, COMSATS University Islamabad, Pakistan
Rasool Bakhsh, COMSATS University Islamabad, Pakistan
Asif Khan, COMSATS University Islamabad, Pakistan
Adia Khalid, COMSATS University Islamabad, Pakistan
Sana Mujeeb, COMSATS University Islamabad, Pakistan

5. Bio-inspired Computing and Pattern Recognition

Co-chairs
Francesco Mercaldo, Institute of Informatics and Telematics (IIT), CNR, Italy
Salvatore Vitabile, University of Palermo, Italy

PC Members
Andrea Saracino, Institute of Informatics and Telematics (IIT), CNR, Italy
Andrea De Lorenzo, University of Trieste, Italy
Fabio Di Troia, San Jose State University, USA
Jelena Milosevic, TU Wien, Austria
Martina Lindorfer, University of California, Santa Barbara, USA
Mauro Migliardi, University of Padua, Italy
Vincenzo Conti, University of Enna Kore, Italy
Minoru Uehara, Toyo University, Japan
Philip Moore, Lanzhou University, China

6. Intelligent and Cognitive Systems

Co-chairs
Serena Pelosi, University of Salerno, Italy
Alessandro Maisto, University of Salerno, Italy
Nico Surantha, Bina Nusantara University, Indonesia

PC Members
Lorenza Melillo, University of Salerno, Italy
Francesca Esposito, University of Salerno, Italy
Pierluigi Vitale, University of Salerno, Italy
Chiara Galdi, EURECOM, Sophia Antipolis, France
Marica Catone, University of Salerno, Italy
Annibale Elia, University of Salerno, Italy
Raffaele Guarasci, Institute for High Performance Computing and Networking (ICAR), CNR, Italy
Mario Monteleone, University of Salerno, Italy
Azzurra Mancuso, University of Salerno, Italy
Daniela Trotta, University of Salerno, Italy

7. Web Application, Multimedia and Internet Computing

Co-chairs
Giovanni Cozzolino, University of Naples “Federico II”, Italy
Yasuo Ebara, Osaka Electro-Communication University, Japan

PC Members
Flora Amato, University of Naples “Federico II”, Italy
Vincenzo Moscato, University of Naples “Federico II”, Italy
Walter Balzano, University of Naples “Federico II”, Italy
Francesco Moscato, University of Campania “Luigi Vanvitelli”, Italy
Francesco Mercaldo, National Research Council of Italy (CNR), Italy
Alessandra Amato, University of Naples “Federico II”, Italy
Francesco Piccialli, University of Naples “Federico II”, Italy
Tetsuro Ogi, Keio University, Japan
Hideo Miyachi, Tokyo City University, Japan
Kaoru Sugita, Fukuoka Institute of Technology, Japan
Akio Doi, Iwate Prefectural University, Japan
Tomoyuki Ishida, Fukuoka Institute of Technology, Japan

8. Distributed Systems and Social Networks

Co-chairs
Masaki Kohana, Chuo University, Japan
Jana Nowakova, VSB-Technical University of Ostrava, Czech Republic

PC Members
Jun Iio, Chuo University, Japan
Shusuke Okamoto, Seikei University, Japan
Hiroki Sakaji, The University of Tokyo, Japan
Shinji Sakamoto, Seikei University, Japan
Masaru Kamada, Ibaraki University, Japan
Martin Hasal, VSB-Technical University of Ostrava, Czech Republic
Jakub Safarik, VSB-Technical University of Ostrava, Czech Republic
Michal Pluhacek, Tomas Bata University in Zlin, Czech Republic

9. IoT Computing Systems

Co-chairs
Paskorn Champrasert, Chiang Mai University, Thailand
Lei Shu, Nanjing Agricultural University, China

PC Members
Chonho Lee, Cybermedia Center, Osaka University, Japan
Yuthapong Somchit, Chiang Mai University, Thailand
Pruet Boonma, Chiang Mai University, Thailand
Somrawee Aramkul, Chiang Mai Rajabhat University, Thailand
Roselin Petagon, Chiang Mai Rajabhat University, Thailand
Guisong Yang, University of Shanghai for Science and Technology, P.R. China
Baohua Zhang, College of Engineering, Nanjing Agricultural University, China
Ye Liu, College of Engineering, Nanjing Agricultural University, China
Kai Huang, College of Engineering, Nanjing Agricultural University, China
Jun Liu, School of Automation, Guangdong Polytechnic Normal University, China
Feng Wang, Hubei University of Arts and Science, China
Alba Amato, National Research Council of Italy (CNR), Italy
Salvatore Venticinque, University of Campania Luigi Vanvitelli, Italy
Flora Amato, University of Naples Federico II, Italy

10. Wireless Networks and Mobile Computing

Co-chairs
Akimitsu Kanzaki, Shimane University, Japan
Shinji Sakamoto, Seikei University, Japan

PC Members
Teruaki Kitasuka, Hiroshima University, Japan
Hiroyasu Obata, Hiroshima City University, Japan
Tetsuya Shigeyasu, Prefectural University of Hiroshima, Japan
Chisa Takano, Hiroshima City University, Japan
Shigeru Tomisato, Okayama University, Japan
Makoto Ikeda, Fukuoka Institute of Technology, Japan
Keita Matsuo, Fukuoka Institute of Technology, Japan
Donald Elmazi, Fukuoka Institute of Technology, Japan
Admir Barolli, Aleksander Moisiu University of Durres, Albania
Evjola Spaho, Polytechnic University of Tirana, Albania
Elis Kulla, Okayama University of Science, Japan
Tetsuya Oda, Okayama University of Science, Japan

3PGCIC-2020 Reviewers

Amato Flora, Barolli Admir, Barolli Leonard, Barra Silvio, Boonma Pruet, Caballé Santi, Capretz Miriam, Capuano Nicola, Champrasert Paskorn, Choras Michal, Conesa Jordi, Cozzolino Giovanni, Cui Baojiang, Dantas Mario, D’Antonio Salvatore, Di Martino Beniamino, Enokido Tomoya, Fenza Giuseppe, Ficco Massimo, Fiore Ugo, Fortino Giancarlo, Fun Li Kin, Funabiki Nobuo, Giacalone Marco, Gotoh Yusuke, Hasal Martin, Hayashibara Naohiro, Hellinckx Peter, Hussain Farookh, Hussain Omar, Iio Jun, Ikeda Makoto, Ishida Tomoyuki, Kamada Masaru, Kanzaki Akimitsu, Kohana Masaki, Kolici Vladi, Koyama Akio, Kryvinska Natalia, Kulla Elis, Liu Yi, Loia Vincenzo, Ma Kun, Macedo Douglas, Maisto Alessandro, Marreiros Goreti, Matsuo Keita, Mazzeo Giovanni, Messina Fabrizio, Mizera-Pietraszko Jolanta, Moore Philip, Moreno Edward, Natwichai Juggapong, Nishino Hiroaki, Nowakova Jana, Oda Tetsuya, Ogiela Lidia, Ogiela Marek, Ogiela Ursula, Okada Yoshihiro, Orciuoli Francesco, Pace Pasquale, Palmieri Francesco, Pardede Eric, Rahayu Wenny, Rawat Danda, Ritrovato Pierluigi, Rodríguez Jorge Ricardo, Sakaji Hiroki, Shibata Yoshitaka, Shu Lei, Somchit Yuthapong, Spaho Evjola, Sugita Kaoru, Surantha Nico, Takizawa Makoto, Taniar David, Uchiya Takahiro, Uehara Minoru, Venticinque Salvatore, Villari Massimo, Wang Xu An, Yoshihisa Tomoki

3PGCIC-2020 Keynote Talks

Fairness and Efficiency in Network Resource Sharing
Masato Tsuru, Kyushu Institute of Technology, Japan

Abstract. With the expansion of network users and applications, network traffic continues to grow, and better sharing of limited network resources among multiple users and applications is required. In particular, the recent strong demand on the Internet of Things (IoT) for smart and connected communities, along with architectural advances such as software-defined networking (SDN) and multi-access edge computing (MEC), has posed new challenges in fair and efficient resource sharing by multiplexing with complex and heterogeneous settings. In this talk, after briefly reviewing recent trends in communication networks, we discuss the concept of fairness in terms of the achieved performance of each user, through simple examples in wireless and wired networks. Then, we go into more detail in a few examples (multipath-multicast file transfer on an OpenFlow network; wireless shared-channel scheduling) and see how fair and efficient resource sharing can be realized by time-division, space-division and information-coding multiplexing.


Road Status Sensing and V2X Technologies toward Autonomous Driving on Challenged Network Environment
Yoshitaka Shibata, Iwate Prefectural University, Morioka, Japan

Abstract. Autonomous driving systems are expected to provide safe and effective future vehicles. They have been investigated and developed in industrialized countries and are already driving on exclusive roads and highways with flat surfaces, clear driving lanes and centerlines separated from the opposite direction, in good weather conditions. Future autonomous driving systems must consider more general road and weather conditions, such as those of heavy-snow countries, in addition to challenged network environments where no public communication network is available, in order to realize a safer and more reliable mobility infrastructure. In this talk, to address these problems, an IoT-based crowd-sensing technology that uses various environmental sensors to precisely identify qualitative and quantitative road status using AI technology is discussed. The next-generation V2X communication technology to exchange and share this road status and GIS information among surrounding vehicles and roadside base stations is also explained. Finally, a wide-area road status information-sharing platform for challenged weather and network environments, based on 5G and next-generation high-speed LAN, is introduced.



An Intelligent VegeCare Tool for Corn Disease Classification

Natwadee Ruedeeniraman (1), Makoto Ikeda (2), and Leonard Barolli (2)

(1) Graduate School of Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-higashi, Higashi-ku, Fukuoka 811-0295, Japan
[email protected]
(2) Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-higashi, Higashi-ku, Fukuoka 811-0295, Japan
[email protected], [email protected]

Abstract. Due to the decrease of the agricultural population, machine learning and deep learning have been widely applied in agriculture. In this paper, we present the classification performance of the proposed VegeCare tool for corn disease classification. We classify the major leaf diseases of the corn crop. The dataset includes four classes: gray leaf spot, common rust, healthy and northern leaf blight. From this evaluation, we found that our proposed VegeCare tool has a good performance.

Keywords: Deep learning · VegeCare · Corn · Disease classification · Agriculture

1 Introduction

In recent years, Artificial Intelligence (AI)-based advanced agriculture systems have attracted attention due to the increase in the average age of farmers and the decrease of cultivated area. Also, people are more interested in safer food due to the impact of the COVID-19 epidemic. AI-based agriculture systems are expected to deliver safe and high-quality food [15,19]. These systems focus on edge AI, where daily learning is done in the cloud and real-time prediction is done at the edge. The application of Deep Neural Networks (DNNs) is expected to solve problems that are complicated for humans in various fields [16,20,21]. There is also a competition community [1], which provides many datasets on the Internet. To improve the accuracy of DNNs, both the layered models and the algorithms are essential. For the next generation of farmers, an intelligent growth management system is important for increasing crop productivity [2,6]. In [17,18], we proposed a classification system considering potato and tomato diseases. We analyzed the classification performance for two of the four major crop types: tubers, fruits, cereals and pulses. Potatoes are categorized as a tuber crop that grows in the ground. Tomatoes are categorized as a fruit crop.

In this paper, we present the performance evaluation of the VegeCare tool for corn disease classification, considering four kinds of corn diseases. The VegeCare tool supports growth management for farmers. Corn is categorized as a cereal crop: a grass cultivated for its edible seeds. With this work, we provide insight into the classification performance for three of the major crop types: tuber, fruit and cereal crops.

The structure of the paper is as follows. In Sect. 2, we describe the related work. In Sect. 3, we describe the proposed system. In Sect. 4, we present the evaluation results. Finally, conclusions and future work are given in Sect. 5.

2 Related Work

In recent years, new systems have been used for rice cultivation, which can automatically measure the water level and water temperature of the paddy field. Also, a cultivation management system has been proposed to collect the knowledge and skills of farmers. However, there are many problems, such as sensing the growing conditions and physiological conditions of crops in the field. Autonomous agricultural machines (e.g., rice planters and tractors) have been developed for remote monitoring by farmers [10]. For rice harvesting, the system applies object detection and straight-line assistance to autonomously move over the target area by coordinating GPS and GIS modules. Most areas of crop cultivation are near mountains and suburbs. Therefore, animals often come down from the mountains and bite the cultivated crops in the fields, so bird and animal removal systems have been developed. Wireless communication using Unmanned Aerial Vehicles (UAVs) and LoRa has attracted attention for transmitting the sensed agricultural data [3–5].

In [7], the authors presented a framework for classifier fusion, which can support the automatic recognition of fruits and vegetables in a supermarket environment. The authors show that the proposed framework yields better results than several related works found in the literature. In [8], the authors proposed a nine-layer Convolutional Neural Network (CNN) for leaf disease classification. They reported that the accuracy of their model is better than that of traditional approaches.

A DNN has a deep hierarchy that connects multiple internal layers for feature detection and representation learning. Representation learning is used to extract the essential information from observation data in the real world. Conventional feature extraction relies on trial and error with hand-crafted operations. However, a DNN uses the pixel level of the image as an input value and acquires the most suitable features to identify it [9,13]. The CNN uses the backpropagation model like a conventional multi-layer perceptron. To update the weighting filters and coupling coefficients, the CNN uses stochastic gradient descent. In this way, the CNN learns optimized features using convolutional and pooling operations [12,14].

3 Proposed System

3.1 Overview of the Proposed System

The structure of our proposed system is shown in Fig. 1. The proposed system consists of the VegeCare tool used at the edge and the VegeCare system on the cloud. The VegeCare tool is a mobile application for Android terminals that predicts the object and manages plant growth for farmers. The VegeCare tool’s functions include plant disease classification, vegetable classification and insect pest classification. The VegeCare system on the cloud has three functions: a computing module, data management and a job scheduler. For corn disease classification, we consider the accuracy and loss, which are computed by a high-level API of TensorFlow (tf.keras).

Fig. 1. Model of the proposed system.

3.2 Corn Disease Classification

Corn (maize) prefers hot and sunny conditions. During the rainy season, corn is often affected by fungal diseases and by infectious diseases carried by pests. The VegeCare tool considers four classes of typical corn leaf conditions: gray leaf spot, common rust, healthy and northern leaf blight. Samples of each class are shown in Fig. 2.


Fig. 2. Images from each corn disease [11].

Gray Leaf Spot. Corn gray leaf spot is an infection caused by Cercospora zeae-maydis. The disease is usually first noticed on the lower leaves. The lesions initially appear as small dots with yellow halos. Then, the lesions become pale brown to gray and rectangular in shape, parallel to the veins of the leaves. Finally, the lesions merge and kill the leaves.

Common Rust. Corn common rust is an infection caused by Puccinia sorghi. The disease creates multiple pustules on both surfaces of the leaves. The pustules become orange to brown, slightly slender, 2–5 mm long and 1–2 mm wide, and then turn slightly black. Finally, the surface skin breaks and the spores spread by flying around.

Northern Leaf Blight. Corn northern leaf blight is an infection caused by Setosphaeria turcica. The disease increases in cold and humid conditions, and a massive outbreak will spread to the whole field. The disease forms large lesions on the surface of the leaves, yellow-brown to gray, spindle-shaped and 3–10 cm long. When the lesions age, their centers turn black and moldy, and the leaves can easily split vertically from the center.

4 Evaluation Results

4.1 Evaluation Setting

In this paper, we use the PlantVillage dataset [11] to collect the corn disease classification images. The dataset contains 9,245 images belonging to 4 classes. We show the list of hyperparameters of the proposed CNN in Table 1. To create a training model, we used a maximum of 400 epochs and a batch size of 32. The image size is 256 × 256. The network is based on the sequential model, which consists of four convolutional layers, four pooling layers and four fully connected layers. We use dropout layers to prevent the model from over-fitting, and the Rectified Linear Unit (ReLU) as the activation function to improve the representation power of the model. In the output layer, softmax is used as the activation function to split the final result into multiple diseases.


Table 1. List of hyperparameters.

Function                              Values
Epoch                                 100, 200, 400
Batch size                            32
Filter sizes for convolution layer    3 × 3
Activation function                   ReLU
Loss function                         Categorical cross-entropy
Optimizer                             RMSprop
Dropout                               0.5
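The paper does not give the layer widths, so the following tf.keras sketch fills them in with assumed values: the number of filters per convolutional layer and the widths of the dense layers are illustrative assumptions, while the four convolution/pooling blocks, 3 × 3 filters, ReLU, 0.5 dropout, softmax output, RMSprop optimizer and categorical cross-entropy loss follow Table 1 and the text above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_vegecare_cnn(num_classes=4, input_shape=(256, 256, 3)):
    """Sketch of the described sequential CNN; filter/dense widths are assumed."""
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    # Four convolution + pooling blocks, as described in Sect. 4.1
    for filters in (32, 64, 128, 128):  # assumed filter counts
        model.add(layers.Conv2D(filters, (3, 3), activation="relu"))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dropout(0.5))  # dropout against over-fitting
    # Fully connected layers (widths assumed), ending in a 4-class softmax
    model.add(layers.Dense(512, activation="relu"))
    model.add(layers.Dense(256, activation="relu"))
    model.add(layers.Dense(64, activation="relu"))
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```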

4.2 Training and Validation Results

The dataset contains 9,245 images belonging to the 4 classes of corn diseases. Because of the imbalanced data distribution, we split the dataset into 7,316 training, 1,829 validation and 100 testing images. The training, validation and testing sets are split at the subclass level. The dataset should be diverse to prevent over-fitting and improve the classification results. We therefore use various images of corn leaves, which are generated from the original images by rotation, re-scaling, random zooming and shear transformations. Figures 3, 4 and 5 show the results of accuracy and loss for different numbers of epochs. For 100 epochs (see Fig. 3), the validation accuracy increases with the number of epochs, and the training and validation loss is quite low. For 200 epochs (see Fig. 4), the accuracy starts to decrease after 100 epochs; due to this effect, the loss also increases. For 400 epochs (see Fig. 5), the training accuracy is less than 90%, and after 360 epochs there are some oscillations.
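A possible implementation of the augmentation named above uses the Keras ImageDataGenerator; the specific parameter values (rotation range, zoom and shear factors) and the directory layout are assumptions, not taken from the paper.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rotation, re-scaling, random zooming and shear, as described above
train_gen = ImageDataGenerator(rescale=1.0 / 255,
                               rotation_range=20,   # assumed value
                               zoom_range=0.2,      # assumed value
                               shear_range=0.2)     # assumed value

# flow_from_directory expects one sub-directory per disease class
train_flow = train_gen.flow_from_directory("corn/train",  # assumed path
                                           target_size=(256, 256),
                                           batch_size=32,
                                           class_mode="categorical")
```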

Fig. 3. Training and validation results for 100 epochs: (a) accuracy, (b) loss.

Fig. 4. Training and validation results for 200 epochs: (a) accuracy, (b) loss.

Fig. 5. Training and validation results for 400 epochs: (a) accuracy, (b) loss.

4.3 Classification Results

We used a total of 100 images (25 images for each class) to evaluate the proposed VegeCare tool and investigate the accuracy of corn disease classification. The results of corn disease classification for different epochs are shown in Table 2. For both common rust and healthy leaves, the classification results show that the proposed tool has a good accuracy regardless of the number of epochs. For gray leaf spot, the accuracy of the proposed tool is 96% for 100 and 200 epochs, while the accuracy for 400 epochs drops to 72%. For northern leaf blight, the proposed tool has a good accuracy for 100 epochs, but the accuracy decreases as the number of epochs increases. From these results, we found that our proposed tool classified common rust correctly, considering that its pustules are smaller and look different from the other diseases. For 400 epochs, the proposed tool sometimes classified gray leaf spot and northern leaf blight leaves as healthy leaves. This is because of the over-fitting problem.
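The per-class counts in Table 2 can be reproduced by running the trained model over the test images and tallying the predictions; a minimal sketch follows, assuming a `model` built as in Sect. 4.1 and a `test_flow` generator built like `train_flow` but with shuffling disabled.

```python
import numpy as np

probs = model.predict(test_flow)        # (100, 4) class probabilities
pred = np.argmax(probs, axis=1)         # predicted class per image
true = test_flow.classes                # ground-truth labels (shuffle=False)

counts = np.zeros((4, 4), dtype=int)    # rows: true class, cols: predicted class
for t, p in zip(true, pred):
    counts[t, p] += 1

# Per-class accuracy in percent, as reported in the last column of Table 2
accuracy = counts.diagonal() / counts.sum(axis=1) * 100
```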

Table 2. Testing results for classification.

Class                      Epoch   Class #1   Class #2   Class #3   Class #4   Accuracy (%)
1: Gray leaf spot          100     24         0          0          1          96
                           200     24         0          0          1          96
                           400     18         0          3          4          72
2: Common rust             100     0          25         0          0          100
                           200     0          25         0          0          100
                           400     0          25         0          0          100
3: Healthy                 100     0          0          25         0          100
                           200     0          0          25         0          100
                           400     0          0          25         0          100
4: Northern leaf blight    100     0          0          0          25         100
                           200     2          0          1          22         88
                           400     1          0          6          18         72

5 Conclusions

In this paper, we proposed a corn disease classification tool called VegeCare. We evaluated its performance for corn diseases considering accuracy and loss. From the evaluation results, we found that the model trained for 100 epochs has a good performance. In future work, we will implement an all-in-one tool with multiple-crop classification to improve the flexibility of our proposed tool.

References

1. Kaggle: Data science community. https://www.kaggle.com/
2. Ahmed, N., De, D., Hussain, I.: Internet of things (IoT) for smart precision agriculture and farming in rural areas. IEEE Internet Things J. 5(6), 4890–4899 (2018)
3. Bacco, M., Berton, A., Gotta, A., Caviglione, L.: IEEE 802.15.4 air-ground UAV communications in smart farming scenarios. IEEE Commun. Lett. 22(9), 1910–1913 (2018)
4. Castellanos, G., Deruyck, M., Martens, L., Joseph, W.: System assessment of WUSN using NB-IoT UAV-aided networks in potato crops. IEEE Access 8, 56823–56836 (2020)
5. Daskalakis, S.N., Goussetis, G., Assimonis, S.D., Tentzeris, M.M., Georgiadis, A.: A uW backscatter-morse-leaf sensor for low-power agricultural wireless sensor networks. IEEE Sens. J. 18(19), 7889–7898 (2018)
6. Elijah, O., Rahman, T.A., Orikumhi, I., Leow, C.Y., Hindia, M.N.: An overview of internet of things (IoT) and data analytics in agriculture: benefits and challenges. IEEE Internet Things J. 5(5), 3758–3773 (2018)
7. Faria, F.A., dos Santos, J.A., Rocha, A., da Torres, R.S.: Automatic classifier fusion for produce recognition. In: Proceedings of the 25th International Conference on Graphics, Patterns and Images (SIBGRAPI-2012), pp. 252–259 (2012)
8. Geetharamani, G., Pandian, A.J.: Identification of plant leaf diseases using a nine-layer deep convolutional neural network. Comput. Electr. Eng. 76, 323–338 (2019)
9. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
10. Hokkaido Agricultural Research Center, NARO: HARC brochure (2017). http://www.naro.affrc.go.jp/publicity_report/publication/files/2017NARO_english_1.pdf
11. Hughes, D.P., Salathé, M.: An open access repository of images on plant health to enable the development of mobile disease diagnostics through machine learning and crowdsourcing. Computing Research Repository (CoRR) (2015)
12. Kang, L., Kumar, J., Ye, P., Li, Y., Doermann, D.: Convolutional neural networks for document image classification. In: Proceedings of the 22nd International Conference on Pattern Recognition (ICPR-2014), pp. 3168–3172, August 2014
13. Le, Q.V.: Building high-level features using large scale unsupervised learning. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP-2013), pp. 8595–8598, May 2013
14. Lee, H., Grosse, R., Ranganath, R., Ng, A.Y.: Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 609–616, June 2009
15. Mattihalli, C., Gedefaye, E., Endalamaw, F., Necho, A.: Plant leaf diseases detection and auto-medicine. Internet Things 1–2, 67–73 (2018)
16. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., Hassabis, D.: Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015)
17. Ruedeeniraman, N., Ikeda, M., Barolli, L.: Performance evaluation of VegeCare tool for tomato disease classification. In: Proceedings of the 22nd International Conference on Network-Based Information Systems (NBiS-2019), pp. 595–603, September 2019
18. Ruedeeniraman, N., Ikeda, M., Barolli, L.: Performance evaluation of VegeCare tool for potato disease classification. In: Proceedings of the 23rd International Conference on Network-Based Information Systems (NBiS-2020), August 2020
19. Sardogan, M., Tuncer, A., Ozen, Y.: Plant leaf disease detection and classification based on CNN with LVQ algorithm. In: Proceedings of the 3rd International Conference on Computer Science and Engineering (UBMK-2018), pp. 382–385, September 2018
20. Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T., Hassabis, D.: Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016)
21. Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., Hubert, T., Baker, L., Lai, M., Bolton, A., Chen, Y., Lillicrap, T., Hui, F., Sifre, L., van den Driessche, G., Graepel, T., Hassabis, D.: Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017)

Performance Comparison of CM and RDVM Router Replacement Methods for WMNs by WMN-PSOHC Hybrid Simulation System Considering Normal Distribution of Mesh Clients

Shinji Sakamoto (1), Leonard Barolli (2), and Shusuke Okamoto (1)

(1) Department of Computer and Information Science, Seikei University, 3-3-1 Kichijoji-Kitamachi, Musashino-shi, Tokyo 180-8633, Japan
[email protected], [email protected]
(2) Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
[email protected]

Abstract. Wireless Mesh Networks (WMNs) have many advantages, such as easy maintenance, low upfront cost and high robustness. However, WMNs also have some problems, such as the node placement problem, security, transmission power and so on. In this work, we deal with the node placement problem. In our previous work, we implemented a hybrid simulation system based on Particle Swarm Optimization (PSO) and Hill Climbing (HC), called WMN-PSOHC, for solving the node placement problem in WMNs. In this paper, we evaluate the performance of two mesh router replacement methods, the Constriction Method (CM) and the Rational Decrement of Vmax Method (RDVM), by the WMN-PSOHC hybrid intelligent simulation system. Simulation results show that CM achieves a better performance than RDVM.

1 Introduction

Wireless networks and devices are becoming increasingly popular and they provide users access to information and communication anytime and anywhere [1,3–5,9–12,14,15,17,19,20,25,29]. Wireless Mesh Networks (WMNs) are gaining a lot of attention because of their low-cost nature, which makes them attractive for providing wireless Internet connectivity. A WMN is dynamically self-organized and self-configured, with the nodes in the network automatically establishing and maintaining mesh connectivity among themselves (creating, in effect, an ad hoc network). This feature brings many advantages to WMNs such as low up-front cost, easy network maintenance, robustness and reliable service coverage [2]. Moreover, such an infrastructure can be used to deploy community networks, metropolitan area networks, municipal and corporate networks, and to support applications for urban areas, medical, transport and surveillance systems.


In this work, we deal with the node placement problem in WMNs. We consider the version of the mesh router nodes placement problem in which we are given a grid area where a number of mesh router nodes are to be deployed, together with a number of mesh client nodes of fixed positions (of an arbitrary distribution) in the grid area. The objective is to find a location assignment for the mesh routers to the cells of the grid area that maximizes the network connectivity and client coverage. Network connectivity is measured by the Size of Giant Component (SGC) of the resulting WMN graph, while the user coverage is simply the number of mesh client nodes that fall within the radio coverage of at least one mesh router node, measured by the Number of Covered Mesh Clients (NCMC).

Node placement problems are known to be computationally hard to solve [13,33], and intelligent algorithms have recently been investigated for this purpose [8,16,18,27,28,35]. We already implemented a Particle Swarm Optimization (PSO) based simulation system, called WMN-PSO [23]. We also implemented a simulation system based on Hill Climbing (HC) for solving the node placement problem in WMNs, called WMN-HC [22]. In our previous work [23,26], we presented a hybrid intelligent simulation system based on PSO and HC, which we call WMN-PSOHC. In this paper, we analyze the performance of the Constriction Method (CM) and the Rational Decrement of Vmax Method (RDVM) by the WMN-PSOHC simulation system considering a Normal distribution of mesh clients.

The rest of the paper is organized as follows. We present our designed and implemented hybrid simulation system in Sect. 2. In Sect. 3, we introduce the WMN-PSOHC Web GUI tool. The simulation results are given in Sect. 4. Finally, we give conclusions and future work in Sect. 5.

2 Proposed and Implemented Simulation System

2.1 Particle Swarm Optimization

In the Particle Swarm Optimization (PSO) algorithm, a number of simple entities (the particles) are placed in the search space of some problem or function, and each evaluates the objective function at its current location. The objective function is often minimized and the exploration of the search space is not through evolution [21]. However, following a widespread practice of borrowing from the evolutionary computation field, in this work we use the terms objective function and fitness function interchangeably. Each particle then determines its movement through the search space by combining some aspect of the history of its own current and best (best-fitness) locations with those of one or more members of the swarm, with some random perturbations. The next iteration takes place after all particles have been moved. Eventually the swarm as a whole, like a flock of birds collectively foraging for food, is likely to move close to an optimum of the fitness function. Each individual in the particle swarm is composed of three D-dimensional vectors, where D is the dimensionality of the search space. These are the current position xi, the previous best position pi and the velocity vi.


The particle swarm is more than just a collection of particles. A particle by itself has almost no power to solve any problem; progress occurs only when the particles interact. Problem solving is a population-wide phenomenon, emerging from the individual behaviors of the particles through their interactions. In any case, populations are organized according to some sort of communication structure or topology, often thought of as a social network. The topology typically consists of bidirectional edges connecting pairs of particles, so that if j is in i's neighborhood, i is also in j's. Each particle communicates with some other particles and is affected by the best point found by any member of its topological neighborhood. This is just the vector pi for that best neighbor, which we will denote with pg. The potential kinds of population "social networks" are hugely varied, but in practice certain types have been used more frequently. In the PSO process, the velocity of each particle is iteratively adjusted so that the particle stochastically oscillates around the pi and pg locations.
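To make the movement rule concrete, the following minimal Python sketch shows one velocity/position update for a single particle; the function and variable names are ours, not part of the WMN-PSOHC implementation.

    import random

    def pso_step(x, v, p_best, p_g, omega, c1, c2, v_max):
        """One PSO update: inertia plus the pulls toward the particle's own
        best position p_i (p_best) and the neighborhood best p_g."""
        new_x, new_v = [], []
        for d in range(len(x)):
            r1, r2 = random.random(), random.random()
            vd = (omega * v[d]
                  + c1 * r1 * (p_best[d] - x[d])   # cognitive term
                  + c2 * r2 * (p_g[d] - x[d]))     # social term
            vd = max(-v_max, min(v_max, vd))       # clamp velocity to [-Vmax, Vmax]
            new_v.append(vd)
            new_x.append(x[d] + vd)
        return new_x, new_v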

2.2 Hill Climbing

The Hill Climbing (HC) algorithm is a heuristic algorithm. The idea of HC is simple. In HC, the solution s′ is accepted as the new current solution if δ ≤ 0 holds, where δ = f(s′) − f(s). Here, the function f is called the fitness function. The fitness function gives points to a solution so that the system can evaluate the next solution s′ and the current solution s. The most important factor in HC is to define the neighbor solution effectively. The definition of the neighbor solution affects HC performance directly. In our WMN-PSOHC system, we use the next step of particle-pattern positions as the neighbor solutions for the HC part.
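As an illustration only, the acceptance rule stated above can be written in a few lines of Python (names are ours; f is treated as a score for which δ ≤ 0 means the candidate is no worse, mirroring the condition in the text):

    def hill_climb(s, f, neighbor, iterations):
        """Accept a neighbor s2 whenever delta = f(s2) - f(s) <= 0."""
        for _ in range(iterations):
            s2 = neighbor(s)                 # e.g. next particle-pattern positions
            if f(s2) - f(s) <= 0:            # delta <= 0: accept s2
                s = s2
        return s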

2.3 WMN-PSOHC System Description

In the following, we present the initialization, particle-pattern, fitness function and router replacement methods.

Initialization
Our proposed system starts by generating an initial solution randomly, by ad hoc methods [34]. We decide the velocity of particles by a random process considering the area size. For instance, when the area size is W × H, the velocity is decided randomly from −√(W² + H²) to √(W² + H²).

Particle-Pattern
A particle is a mesh router. A fitness value of a particle-pattern is computed by the combination of mesh router and mesh client positions. In other words, each particle-pattern is a solution, as shown in Fig. 1. Therefore, the number of particle-patterns is the number of solutions.

Fitness Function
One of the most important things is the determination of an appropriate objective function and its encoding. In our case, each particle-pattern has its


own fitness value and compares it with the fitness values of other particle-patterns in order to share information on the global solution. The fitness function follows a hierarchical approach in which the main objective is to maximize the SGC in the WMN. Thus, we use α and β weight-coefficients for the fitness function, and the fitness function of this scenario is defined as:

    Fitness = α × SGC(xij, yij) + β × NCMC(xij, yij).

Router Replacement Methods
A mesh router has x, y positions and a velocity. Mesh routers are moved based on velocities. There are many router replacement methods in the PSO field [7,30–32]. In this paper, we consider CM and RDVM.

Constriction Method (CM)
CM is a method in which the PSO parameters are set to a weak stable region (ω = 0.729, C1 = C2 = 1.4955), based on the analysis of PSO by M. Clerc et al. [5–7].

Rational Decrement of Vmax Method (RDVM)
In RDVM, the PSO parameters are set to an unstable region (ω = 0.9, C1 = C2 = 2.0). Vmax is kept decreasing with the increase of iterations as

    Vmax(x) = √(W² + H²) × (T − x) / x,

where W and H are the width and the height of the considered area, respectively. Also, T and x are the total number of iterations and the current iteration number, respectively [24].

Fig. 1. Relationship among global solution, particle-patterns and mesh routers.

3

WMN-PSOHC Web GUI Tool

The Web application follows a standard Client-Server architecture and is implemented using LAMP (Linux + Apache + MySQL + PHP) technology (see Fig. 2). We show the WMN-PSOHC Web GUI tool in Fig. 3. Remote users (clients) submit their requests by first completing the parameter settings. The parameter values to be provided by the user are classified into three groups, as follows.

Performance Comparison of Router Replacement Methods by WMN-PSOHC

13

• Parameters related to the problem instance: These include parameter values that determine the problem instance to be solved and consist of the number of router nodes, the number of mesh client nodes, the client mesh distribution, the radio coverage interval and the size of the deployment area.
• Parameters of the resolution method: Each method has its own parameters.
• Execution parameters: These parameters are used as the stopping condition of the resolution methods and include the number of iterations and the number of independent runs. The former is provided as a total number of iterations and, depending on the method, is also divided per phase (e.g., the number of iterations in an exploration). The latter is used to run the same problem instance and parameter configuration a certain number of times. A hypothetical request built from these groups is sketched below.
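For illustration, a request covering the three parameter groups might look like the following Python dictionary; the field names are hypothetical (the actual form fields of the Web GUI are not listed here), while the values mirror Table 1 of Sect. 4:

    request = {
        "problem_instance": {
            "number_of_mesh_routers": 16,
            "number_of_mesh_clients": 48,
            "client_distribution": "Normal",
            "radio_coverage": 2.0,
            "area_size": (32.0, 32.0),
        },
        "method": {"name": "WMN-PSOHC", "replacement_method": "CM"},  # or "RDVM"
        "execution": {
            "total_iterations": 800,
            "iterations_per_phase": 4,
            "independent_runs": 100,
        },
    }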

4

Simulation Results

In this section, we show the simulation results obtained by the WMN-PSOHC system. In this work, we consider a Normal distribution of mesh clients. The number of mesh routers is 16 and the number of mesh clients is 48. We consider 9 particle-patterns. We conducted simulations 100 times in order to avoid the effect of randomness and create a general view of the results. The total number of iterations is 800 and the number of iterations per phase is 4. We show the parameter settings for WMN-PSOHC in Table 1. We show the simulation results in Fig. 4 and Fig. 5. For the SGC, both replacement methods reach the maximum (100%). This means that all mesh routers are connected to each other. We see that CM converges faster than RDVM for the SGC. Also, for the NCMC, CM performs better than RDVM. Therefore, we conclude that the performance of CM is better compared with RDVM.

Fig. 2. System structure for web interface.


Fig. 3. WMN-PSOHC Web GUI Tool.

Fig. 4. Simulation results of WMN-PSOHC for SGC.

Table 1. Parameter settings.

Parameters                                     Values
Clients distribution                           Normal distribution
Area size                                      32.0 × 32.0
Number of mesh routers                         16
Number of mesh clients                         48
Total iterations                               800
Iterations per phase                           4
Number of particle-patterns                    9
Radius of a mesh router                        2.0
Fitness function weight-coefficients (α, β)    0.7, 0.3
Replacement method                             CM, RDVM


Fig. 5. Simulation results of WMN-PSOHC for NCMC.

5

Conclusions

In this work, we evaluated the performance of the CM and RDVM router replacement methods for WMNs by the WMN-PSOHC hybrid intelligent simulation system. Simulation results show that the performance of CM is better compared with RDVM. In our future work, we would like to evaluate the performance of the proposed system for different parameters and scenarios.

References

1. Ahmed, S., Khan, M.A., Ishtiaq, A., Khan, Z.A., Ali, M.T.: Energy harvesting techniques for routing issues in wireless sensor networks. Int. J. Grid Util. Comput. 10(1), 10–21 (2019)
2. Akyildiz, I.F., Wang, X., Wang, W.: Wireless mesh networks: a survey. Comput. Netw. 47(4), 445–487 (2005)
3. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: A hybrid simulation system based on particle swarm optimization and distributed genetic algorithm for WMNs: performance evaluation considering normal and uniform distribution of mesh clients. In: International Conference on Network-Based Information Systems, pp. 42–55. Springer (2018)
4. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: Performance analysis of simulation system based on particle swarm optimization and distributed genetic algorithm for WMNs considering different distributions of mesh clients. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 32–45. Springer (2018)
5. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: Performance evaluation of WMN-PSODGA system for node placement problem in WMNs considering four different crossover methods. In: The 32nd IEEE International Conference on Advanced Information Networking and Applications (AINA-2018), pp. 850–857. IEEE (2018)
6. Barolli, A., Sakamoto, S., Durresi, H., Ohara, S., Barolli, L., Takizawa, M.: A comparison study of constriction and linearly decreasing Vmax replacement methods for wireless mesh networks by WMN-PSOHC-DGA simulation system. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 26–34. Springer (2019)


7. Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002)
8. Girgis, M.R., Mahmoud, T.M., Abdullatif, B.A., Rabie, A.M.: Solving the wireless mesh network design problem using genetic algorithm and simulated annealing optimization methods. Int. J. Comput. Appl. 96(11), 1–10 (2014)
9. Gorrepotu, R., Korivi, N.S., Chandu, K., Deb, S.: Sub-1GHz miniature wireless sensor node for IoT applications. Internet Things 1, 27–39 (2018)
10. Inaba, T., Obukata, R., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of a QoS-aware fuzzy-based CAC for LAN access. Int. J. Space Based Situated Comput. 6(4), 228–238 (2016)
11. Inaba, T., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: A testbed for admission control in WLAN: a fuzzy approach and its performance evaluation. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 559–571. Springer (2016)
12. Islam, M.M., Funabiki, N., Sudibyo, R.W., Munene, K.I., Kao, W.C.: A dynamic access-point transmission power minimization method using PI feedback control in elastic WLAN system for IoT applications. Internet Things 8, 100089 (2019)
13. Maolin, T., et al.: Gateways placement in backbone wireless mesh networks. Int. J. Commun. Netw. Syst. Sci. 2(1), 44 (2009)
14. Marques, B., Coelho, I.M., Sena, A.D.C., Castro, M.C.: A network coding protocol for wireless sensor fog computing. Int. J. Grid Util. Comput. 10(3), 224–234 (2019)
15. Matsuo, K., Sakamoto, S., Oda, T., Barolli, A., Ikeda, M., Barolli, L.: Performance analysis of WMNs by WMN-GA simulation system for two WMN architectures and different TCP congestion-avoidance algorithms and client distributions. Int. J. Commun. Netw. Distrib. Syst. 20(3), 335–351 (2018)
16. Naka, S., Genji, T., Yura, T., Fukuyama, Y.: A hybrid particle swarm optimization for distribution state estimation. IEEE Trans. Power Syst. 18(1), 60–68 (2003)
17. Ohara, S., Barolli, A., Sakamoto, S., Barolli, L.: Performance analysis of WMNs by WMN-PSODGA simulation system considering load balancing and client uniform distribution. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 25–38. Springer (2019)
18. Ozera, K., Bylykbashi, K., Liu, Y., Barolli, L.: A fuzzy-based approach for cluster management in VANETs: performance evaluation for two fuzzy-based systems. Internet Things 3, 120–133 (2018)
19. Ozera, K., Inaba, T., Bylykbashi, K., Sakamoto, S., Ikeda, M., Barolli, L.: A WLAN triage testbed based on fuzzy logic and its performance evaluation for different number of clients and throughput parameter. Int. J. Grid Util. Comput. 10(2), 168–178 (2019)
20. Petrakis, E.G., Sotiriadis, S., Soultanopoulos, T., Renta, P.T., Buyya, R., Bessis, N.: Internet of Things as a Service (iTaaS): challenges and solutions for management of sensor data on the cloud and the fog. Internet Things 3, 156–174 (2018)
21. Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization. Swarm Intell. 1(1), 33–57 (2007)
22. Sakamoto, S., Lala, A., Oda, T., Kolici, V., Barolli, L., Xhafa, F.: Analysis of WMN-HC simulation system data using Friedman test. In: The Ninth International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS-2015), pp. 254–259. IEEE (2015)
23. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation and evaluation of a simulation system based on particle swarm optimisation for node placement problem in wireless mesh networks. Int. J. Commun. Netw. Distrib. Syst. 17(1), 1–13 (2016)


24. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation of a new replacement method in WMN-PSO simulation system and its performance evaluation. In: The 30th IEEE International Conference on Advanced Information Networking and Applications (AINA-2016), pp. 206–211 (2016). https://doi.org/10.1109/AINA.2016.42
25. Sakamoto, S., Obukata, R., Oda, T., Barolli, L., Ikeda, M., Barolli, A.: Performance analysis of two wireless mesh network architectures by WMN-SA and WMN-TS simulation systems. J. High Speed Netw. 23(4), 311–322 (2017)
26. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Implementation of intelligent hybrid systems for node placement problem in WMNs considering particle swarm optimization, hill climbing and simulated annealing. Mob. Netw. Appl. 23(1), 27–33 (2018)
27. Sakamoto, S., Barolli, A., Barolli, L., Okamoto, S.: Implementation of a web interface for hybrid intelligent systems. Int. J. Web Inf. Syst. 15(4), 420–431 (2019)
28. Sakamoto, S., Barolli, L., Okamoto, S.: WMN-PSOSA: an intelligent hybrid simulation system for WMNs and its performance evaluations. Int. J. Web Grid Serv. 15(4), 353–366 (2019)
29. Sakamoto, S., Ozera, K., Barolli, A., Ikeda, M., Barolli, L., Takizawa, M.: Implementation of an intelligent hybrid simulation systems for WMNs based on particle swarm optimization and simulated annealing: performance evaluation for different replacement methods. Soft. Comput. 23(9), 3029–3035 (2019)
30. Schutte, J.F., Groenwold, A.A.: A study of global optimization using particle swarms. J. Global Optim. 31(1), 93–108 (2005)
31. Shi, Y.: Particle swarm optimization. IEEE Connect. 2(1), 8–13 (2004)
32. Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. In: Evolutionary Programming VII, pp. 591–600 (1998)
33. Wang, J., Xie, B., Cai, K., Agrawal, D.P.: Efficient mesh router placement in wireless mesh networks. In: Proceedings of IEEE International Conference on Mobile Adhoc and Sensor Systems (MASS-2007), pp. 1–9 (2007)
34. Xhafa, F., Sanchez, C., Barolli, L.: Ad hoc and neighborhood search methods for placement of mesh routers in wireless mesh networks. In: Proceedings of 29th IEEE International Conference on Distributed Computing Systems Workshops (ICDCS-2009), pp. 400–405 (2009)
35. Yaghoobirafi, K., Nazemi, E.: An autonomic mechanism based on ant colony pattern for detecting the source of incidents in complex enterprise systems. Int. J. Grid Util. Comput. 10(5), 497–511 (2019)

An Algorithm to Select a Server to Minimize the Total Energy Consumption of a Cluster

Kaiya Noguchi1(B), Takumi Saito1, Dilawaer Duolikun1, Tomoya Enokido2, and Makoto Takizawa1

1 Hosei University, Tokyo, Japan
[email protected], [email protected], [email protected], [email protected]
2 Rissho University, Tokyo, Japan
[email protected]

Abstract. We have to decrease the electric energy consumption of information systems, especially servers, in order to reduce carbon dioxide emission. In this paper, we discuss how to select a server to perform a new application process issued by a client so as to reduce the total energy consumption of the servers in a cluster. Here, we have to estimate the execution time of application processes on a server and the energy consumption of the server to perform the new process and the current active processes. In this paper, we newly propose an ESEC (EStimation of Energy Consumption) algorithm to estimate the execution time of the current active processes and the energy consumption of a server, by considering not only the current active processes but also possible processes to be issued after the current time. By using the ESEC model, we also propose an ESECS (ESEC Selection) algorithm to select a server to perform an application process. In the evaluation, we show that the total energy consumption of servers and the average execution time of processes can be reduced by the ESEC algorithm compared with the previous estimation algorithm.

Keywords: Energy-efficient cluster · Server selection · Power consumption model · Computation model · ESEC algorithm · ESECS algorithm

1

Introduction

It is critical to reduce the energy consumption of information systems, especially servers in clusters, to realize green societies by reducing carbon dioxide emission on the earth. For each new application process issued by a client, one host server is selected in the cluster where the application process is to be performed. In our previous studies [4,9–11], algorithms to select a host server for each new application process are proposed so as to reduce the total energy consumption of


servers. Here, we have to estimate the execution time of application processes on a server. In order to do the estimation, we need a model to give the power consumption of each server to perform application processes. A pair of the simple power consumption (SPC) and simple computation (SC) models [5,6] and a pair of the multi-level power consumption (MLPC) and multi-level computation (MLC) models [10,11] are proposed to give the power consumption [W] of a server and the execution time of each application process on the server. The models are macro-level ones [5], where the total power consumption of a whole server on which application processes are performed is considered, without being conscious of each hardware component like CPU and memory. The power consumption Et(τ) [W] of a server st and the computation rate PRti(τ) of a process pi on a server st at time τ are given as functions NEt(nt) and NPRti(nt) of the number nt of active application processes, respectively [10,11]. By using the models, the execution time of an application process on a server and the energy consumption of the server can be estimated. Algorithms to select energy-efficient servers in a cluster have so far been proposed [4–8,10]. Here, only the active application processes being performed are considered in the estimation.

In this paper, we consider not only the current active application processes but also possible application processes to be issued after the current time. We newly propose an ESEC (EStimation of Energy Consumption) algorithm to obtain the expected energy consumption to perform application processes on a server, by estimating the number of possible application processes to be issued to the server. Here, the number of possible application processes to be issued is estimated from the average number of application processes performed before the current time. By using the ESEC estimation algorithm, we also propose an ESECS (ESEC Selection) algorithm to select a server to perform a new application process issued by a client, so that the total energy consumption of the servers can be reduced. Here, the server whose estimated energy consumption to perform the new application process is smallest is selected. By using the EDS (Eco Distributed System) simulator [12], the ESECS algorithm is evaluated. We show that the total energy consumption of servers and the average execution time of processes can be more reduced in the ESECS algorithm than in the previous EA (Energy-aware) server selection algorithm [11].

In Sect. 2, we present a system model and the power consumption and computation models. In Sect. 3, we propose the ESEC algorithm to estimate the execution time of application processes on a server. In Sect. 4, we propose the ESECS algorithm to select servers to energy-efficiently perform application processes. In Sect. 5, we evaluate the ESECS algorithm.

2 System Model

2.1 Clusters of Servers

A cluster S is composed of servers s1 , · · · , sm (m ≥ 1) (Fig. 1). A client ci issues a request qi to a load balancer L. The load balancer L selects a host server st in the cluster S and forwards the request qi to the server st . An application

20

K. Noguchi et al.

process pi is created to handle the request qi and is performed on the server st. On termination of the application process pi, the server st sends a reply ri to the client ci. Here, the load balancer L selects a host server st such that the total energy consumption of the servers s1, · · · , sm can be reduced and the average execution time of application processes on the servers can be shortened. In this paper, the term process stands for an application process to be performed on a server which only consumes CPU resources, i.e. a computation process [5]. A process pi is active on a server st if and only if (iff) the process pi is performed on the server st. Otherwise, the process pi is idle. CPt(τ) is the set of active processes on a server st at time τ.

Fig. 1. A server cluster.

2.2

Power Consumption and Computation Models

A server st is composed of npt (≥1) homogeneous CPUs [1], each of which includes nct (≥1) homogeneous cores. Each core supports the same number ctt of threads. A server st thus supports processes with totally ntt (= npt · nct · ctt) threads. Each process is performed on one thread at a time. A thread is active iff at least one process is performed on it; otherwise it is idle. Here, the electric power NEt(n) [W] of a server st to concurrently perform n (≥0) processes is given in the MLPC (Multi-Level Power Consumption) model [9–11] as follows:

[Power consumption of a server st to perform n processes]

    NEt(n) = minEt                                        if n = 0,
             minEt + n · (bEt + cEt + tEt)                if 1 ≤ n ≤ npt,
             minEt + npt · bEt + n · (cEt + tEt)          if npt < n ≤ nct · npt,
             minEt + npt · (bEt + nct · cEt) + n · tEt    if nct · npt < n < ntt,
             maxEt                                        if n ≥ ntt.    (1)


The electric power Et(τ) [W] consumed by a server st at time τ is assumed to be NEt(|CPt(τ)|) in our approach. That is, the electric power consumption of a server st depends on the number n of active processes. A server st consumes the electric energy Σ_{τ=st}^{et} Et(τ) [W · time unit] from time st to time et.

Let minTti show the minimum execution time [time unit] of a process pi on a server st, i.e. when only the process pi is performed on a thread of the server st without any other process. Let minTi be the minimum of minT1i, · · · , minTmi. That is, minTi is minTfi of a server sf which supports the fastest thread. A server sf supporting the fastest thread is referred to as fastest in a cluster S. In most server applications, well-known processes, i.e. transactions, are performed. Hence, we can get the minimum execution time minTti of each process pi on each server st. In order to give the computation metrics of each process pi, the concept of the virtual computation amount VCi of a process pi is introduced, which is defined to be minTi. The thread computation rate TCRt of a server st is VCi/minTti = minTi/minTti (≤ 1), which shows that each thread of st is TCRt (≤ 1) times slower than a thread of sf. The maximum computation rate XSCRt of a server st is ntt · TCRt, where ntt is the total number of threads of the server st. The maximum computation rate maxPCRti of a process pi on a server st is TCRt; here, the process pi is performed alone on the server st without any other process. It is noted that, for every pair of processes pi and pj on a server st, maxPCRti = maxPCRtj = TCRt. The process computation rate NPRti(n) of a process pi on a server st where n active processes are performed at time τ is defined in the MLC (Multi-Level Computation) model [3,4,10]:

[MLC (Multi-Level Computation) model]

    NPRti(n) = TCRt             if n ≤ ntt,
               ntt · TCRt / n   if n > ntt.    (2)

Since NPRti(n) = NPRtj(n) for every pair of processes pi and pj on a server st, NPRt(n) stands for NPRti(n) for each process pi. The computation rate PRti(τ) of each process pi at time τ is assumed to be NPRt(|CPt(τ)|). The computation rate SCRt(n) of a server st is n · NPRt(n). If a process pi starts on a server st at time st and terminates at time et, Σ_{τ=st}^{et} NPRt(|CPt(τ)|) = minTi. Thus, minTi shows the amount of computation of a process pi. The server computation rate NSRt(n) of a server st to perform n processes is n · NPRt(n), i.e. ntt · TCRt (= maxSCRt) for n > ntt and n · TCRt for n ≤ ntt. A process pi is modeled to be performed on a server st as follows.

[Computation model of a process pi]
1. At time τ when a process pi starts, the computation residue Ri of the process pi is VCi, i.e. Ri = minTi;
2. At each time τ, Ri = Ri − NPRt(|CPt(τ)|);
3. If Ri ≤ 0, the process pi terminates at time τ.
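The MLPC model (1), the MLC model (2) and the computation model above translate directly into code. The following Python sketch is ours (the dictionary layout and function names are illustrative assumptions, and the thread-level case follows the reconstructed n · tEt term):

    def power(n, s):
        """MLPC model (1): electric power NE_t(n) [W] with n active processes."""
        if n == 0:
            return s["minE"]
        if n >= s["nt"]:                       # all nt_t threads saturated
            return s["maxE"]
        if n <= s["np"]:
            return s["minE"] + n * (s["bE"] + s["cE"] + s["tE"])
        if n <= s["nc"] * s["np"]:
            return s["minE"] + s["np"] * s["bE"] + n * (s["cE"] + s["tE"])
        return s["minE"] + s["np"] * (s["bE"] + s["nc"] * s["cE"]) + n * s["tE"]

    def proc_rate(n, s):
        """MLC model (2): computation rate NPR_t(n) of one of n active processes."""
        return s["TCR"] if n <= s["nt"] else s["nt"] * s["TCR"] / n

    def run_processes(min_times, s):
        """Computation model: each residue R_i drops by NPR_t per time unit.
        Returns (execution time, consumed energy) for a fixed set of processes."""
        residues = list(min_times)             # R_i = VC_i = minT_i at start
        time, energy = 0, 0.0
        while residues:
            n = len(residues)
            energy += power(n, s)
            residues = [r - proc_rate(n, s) for r in residues]
            residues = [r for r in residues if r > 0]   # terminated when R_i <= 0
            time += 1
        return time, energy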


3


An ESEC (EStimation of Energy Consumption) Algorithm

In our previous studies [9,11], we estimate, at the current time τ, by what time every current active process will terminate on a server st, under the assumption that no new process is to be issued by a client after the time τ, as shown in Algorithm 1. Here, CPt(τ) is the set of current active processes of a server st at time τ. The variable Pt denotes the set CPt(τ) of processes active on a server st at time τ. The variable n shows the number |CPt(τ)| of active processes. The variable EEt denotes the energy consumption of a server st; initially EEt = 0. The variable Ri denotes the computation residue of each active process pi (∈ Pt); Ri = minTi when a process pi starts. The time variable tm stands for each time unit and initially denotes the current time τ. The energy consumption EEt of the server st is incremented by the power consumption NEt(n). The computation residue Ri is decremented by the process computation rate NPRt(n) for each active process pi ∈ Pt. If Ri ≤ 0, the process pi terminates and is removed from the set Pt. Then, the time variable tm is incremented by one, i.e. tm = tm + 1. This procedure is iterated until the set Pt gets empty. If Pt = φ, the estimation procedure terminates and the execution time ETt is tm − 1. EEt shows the total energy to be consumed by the server st and ETt indicates the execution time of the server st to perform every current active process. As shown here, no new process is assumed to be issued to the server st after the current time τ.

In order to make the estimation of the energy consumption and execution time of a server st more accurate, we have to take into consideration possible processes to be issued after the current time τ. A possible process is a process to be issued after the current time τ. Let P be a set {p1, · · · , pn} (n ≥ 1) of all processes to be issued to a cluster S. Let minT be the average minimum execution time of all the processes, i.e. minT = Σ_{pi∈P} minTi / |P|. At the current time τ, we consider the processes performed for δ time units, from time τ − δ2 to time τ − δ1, i.e. from δ2 to δ1 time units before the current time τ, where δ = δ2 − δ1 + 1 ≤ τ. We assume the current active processes start after time τ − δ1. That is, no current process is performed before time τ − δ1 and some current process is performed after time τ − δ1. In this paper, δ1 is half of the average minimum execution time minT of all the processes in the set P, i.e. δ1 = minT/2. The average number ant of active processes per time unit from time τ − δ2 to time τ − δ1 is Σ_{t=τ−δ2}^{τ−δ1} |CPt(t)| / (δ2 − δ1 + 1). Let nt be the number |CPt(τ)| of active processes and ent be the expected number of possible processes to be issued after the current time τ. In this paper, the expected number ent(nt) of possible processes to be issued after the current time τ, for the number nt of current active processes, is given as follows:

    ent(nt) = nt · (1 + (nt − ant)/nt)    if ant ≤ 2nt,
              0                           otherwise.    (3)

Here, δ1 = minT/2. Intuitively, ent(nt) means that the more processes are performed before time τ − δ1, the fewer possible processes are


performed at time τ. That is, if ant ≥ nt, then ent(nt) ≤ nt, and if ant < nt, then ent(nt) > nt. The total computation residue of the active processes and the possible processes on a server st is Σ_{pj∈CPt(τ)} Rj + ent(nt) · minT. The total number of processes to be performed at time τ is nt + ent(nt), where ent(nt) ≥ 0. In the previous algorithm, ent(nt) = 0. Hence, the execution time NETt(nt) of nt active processes on a server st is defined as follows:

    NETt(nt) = (Σ_{pj∈CPt(τ)} Rj + ent(nt) · minT) / SCRt(nt + ent(nt)).    (4)

A server st consumes the electric power NEt(nt + ent(nt)) [W] for NETt(nt) time units. Hence, the energy consumption NEEt(nt) of a server st to perform the active processes and the possible processes is given as follows:

    NEEt(nt) = NETt(nt) · NEt(nt + ent(nt)).    (5)
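A possible rendering of Eqs. (3)–(5) in Python, reusing power() and proc_rate() from the sketch in Sect. 2; the rounding of the fractional process count before feeding it to the models is our own assumption:

    def expected_new(n_active, an_avg):
        """en_t(n_t) of Eq. (3); note n_t*(1 + (n_t - an_t)/n_t) = 2*n_t - an_t."""
        if n_active == 0 or an_avg > 2 * n_active:
            return 0.0
        return 2.0 * n_active - an_avg

    def esec(residues, an_avg, mean_min_t, s):
        """ESEC estimate of a server: NET_t (Eq. 4) and NEE_t (Eq. 5)."""
        n = len(residues)
        en = expected_new(n, an_avg)
        total = max(1, int(round(n + en)))   # process count fed to MLPC/MLC models
        scr = total * proc_rate(total, s)    # server computation rate SCR_t
        net = (sum(residues) + en * mean_min_t) / scr
        return net, net * power(total, s)    # (NET_t, NEE_t)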

Algorithm 1: Previous estimation algorithm [7,9]

input: st = server; Pt = set CPt(τ) of current active processes on a server st; τ = current time
output: EEt = energy consumption of the server st; ETt = execution time of the server st; etimei = termination time of each process pi

EEt = 0; tm = τ;
while Pt ≠ φ do
    nt = |Pt|;
    EEt = EEt + NEt(nt);
    for each process pi in Pt do
        Ri = Ri − NPRt(nt);
        if Ri ≤ 0 then
            Pt = Pt − {pi}; etimei = tm;
    tm = tm + 1;
ETt = tm − 1;

4

An ESECS (ESEC Selection) Algorithm to Select a Server

A client issues a request to a load balancer L of a cluster S of servers s1 , · · · , sm (m ≥ 1). The load balancer L selects a server st in the cluster S. Then, a process pi to handle the request is created and performed on the server st as shown in Fig. 1. Let P be a set {p1 , · · · , pn }(n ≥ 1) of processes to be performed on the servers in the cluster S. In this paper, we propose an


ESECS (ESEC Selection) algorithm to select a server to perform a new process by taking advantage of the proposed ESEC estimation algorithm. Suppose a new process pi is issued to a server st at time τ. Then, not only the current active processes CPt(τ) but also the new process pi are performed on the server st. Let nt be the number of active processes on the server st at time τ, i.e. nt = |CPt(τ)|. Here, the total computation residue RSt of the current active processes and the new process pi on a server st is Σ_{pj∈CPt(τ)} Rj + minTi. In total, (nt + 1) processes are performed on the server st at time τ. As discussed in the preceding section, ent(nt) possible processes are estimated to be newly performed after time τ. Hence, the execution time TETt(nt, pi) of the server st to perform the current active processes, the new process pi and the ent(nt) possible processes is given as follows:

    TETt(nt, pi) = (Σ_{pj∈CPt(τ)} Rj + minTi + ent(nt) · minT) / SCRt(nt + ent(nt) + 1).    (6)

The server st consumes the electric power NEt(nt + ent(nt) + 1) [W] for TETt(nt, pi) time units. Hence, the server st consumes the energy TEEt(nt, pi) [W · time unit] to perform the new process pi and the nt active processes as follows:

    TEEt(nt, pi) = TETt(nt, pi) · NEt(nt + ent(nt) + 1).    (7)

A load balancer L selects a server st whose expected energy consumption TEEt(nt, pi) is the smallest to perform a new process pi, by the ESECS (ESEC Selection) algorithm:

[ESECS (ESEC Selection) algorithm]
1. A client issues a process pi to a load balancer L.
2. The load balancer L selects a server st whose TEEt(nt, pi) is minimum.
3. The process pi is performed on the server st.
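A sketch of the selection step, again reusing the helper functions above (the per-server data layout is an assumption): the expected energy TEEt(nt, pi) of Eq. (7) is computed for each candidate and the minimum is taken.

    def esecs_select(servers, min_t_new, mean_min_t):
        """ESECS: return the server with the smallest TEE_t(n_t, p_i)."""
        def tee(s):
            n = len(s["residues"])                   # current active processes
            en = expected_new(n, s["an_avg"])
            total = int(round(n + en + 1))           # + the new process p_i
            scr = total * proc_rate(total, s)        # SCR_t(n_t + en_t + 1)
            tet = (sum(s["residues"]) + min_t_new + en * mean_min_t) / scr  # Eq. (6)
            return tet * power(total, s)             # Eq. (7)
        return min(servers, key=tee)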

5

Evaluation

By using the EDS (Eco Distributed System) simulator [12], selection algorithms like the ESECS algorithm, which select a server to perform a new process issued by a client, are evaluated in terms of the total energy consumption of the servers and the average execution time of the processes. The EDS simulator is performed on tables in a Sybase relational database [2], which holds the configurations of servers and processes and evaluation data like the energy consumption of servers and the execution time of processes. In the evaluation, we consider a cluster S of four servers s1, · · · , s4 (m = 4). The performance and energy parameters of the servers are shown in Table 1. The thread computation rate TCR1 of the fastest server s1 is one, i.e. TCR1 = 1; TCR2 = 0.8, TCR3 = 0.6 and TCR4 = 0.4 for the other servers s2, s3 and s4, respectively. For example, the server s1 supports the maximum server


computation rate XSCR1 = 16 by sixteen threads, nt1 = 16, whose thread computation rate TCRt is one. The maximum power consumption maxE1 is 230 [W] and the minimum power consumption minE1 is 150 [W]. The server s4 supports XSCR4 = 3.2 by eight threads, nt4 = 8, where maxE4 = 77 [W] and minE4 = 40 [W]. The servers s2 and s3 support the same number of threads, twelve, i.e. nt2 = nt3 = 12, while maxE2 > maxE3 and minE2 > minE3; XSCR2 = 9.6 and XSCR3 = 7.2. The server s3 is more energy-efficient than the server s2. The server s4 corresponds to a desktop PC, while the server s1 corresponds to a server computer.

Let P be a set of processes p1, · · · , pn (n ≥ 1) to be issued to the cluster S. For each process pi in the set P, the starting time stimei [time unit] and the minimum execution time minTi [time unit] are randomly given as 0 < stimei ≤ xtime and 25 ≤ minTi ≤ 50, where xtime = 2,000 [time unit]. At each time τ, if there is a process pi whose start time stimei is τ, one server st is selected by a selection algorithm and the process pi is added to the set Pt, i.e. Pt = Pt ∪ {pi}. For each server st, the active processes in the set Pt are performed. The variable nt denotes the number |CPt(τ)| of active processes on each server st, nt = |Pt|. The energy variable Et is incremented by the power consumption NEt(|Pt|); if Pt = φ, Et is incremented by minEt. If |Pt| > 0, the active time variable Tt [time unit] is incremented by one [time unit]. The variable Tt [time unit] shows how long the server st is active, i.e. some active process is performed. For each process pi in the set Pt, the computation residue Ri of the process pi is decremented by the process computation rate NPRt(nt). If Ri ≤ 0, the process pi terminates, i.e. Pt = Pt − {pi} and P = P − {pi}. Here, the termination time etimei is τ and the execution time PTi is etimei − stimei + 1 for the process pi. These steps are iterated until the set P gets empty. The variables Et and Tt give the total energy consumption and execution time of each server st, respectively.

Table 1. Parameters of servers.

st   npt  nct  ntt  TCRt  XSCRt  minEt  maxEt  pEt   cEt  tEt
s1   1    8    16   1.0   16.0   150.0  270.0  40.0  8.0  1.0
s2   1    6    12   0.8   9.6    128.0  200.0  30.0  5.0  1.0
s3   1    6    12   0.6   7.2    80.0   130.0  20.0  3.0  1.0
s4   1    4    8    0.4   3.2    40.0   67.0   15.0  2.0  0.5

Algorithm 2 shows the procedure of the EDS simulator. Given the process set P and the server set S, the EDS simulator gives the total energy consumption Et and the total active time Tt of each server st in the cluster S and the execution time PTi of each process pi in the set P. In the evaluation, the total energy consumption E is E1 + · · · + E4 and the average execution time APT of the n (= |P|) processes is Σ_{pi∈P} PTi / n. We consider the EA (Energy-aware) server selection algorithm [11] to compare with the ESECS algorithm. In the EA algorithm, the energy consumption of each server is estimated by the previous algorithm [Algorithm 1].


Algorithm 2: EDS Simulator

input: P = set of processes p1, · · · , pn; S = set of servers s1, · · · , sm
output: Et = energy consumption of each server st; Tt = active time of each server st; PTi = execution time of each process pi

τ = 0;
for each server st do
    Pt = φ; Et = 0; Tt = 0;
while P ≠ φ do
    for each process pi where stimei = τ do
        select a server st by a selection algorithm;
        Pt = Pt ∪ {pi}; /* pi is performed on st */
    for each server st do
        nt = |Pt|;
        Et = Et + NEt(nt); /* energy consumption */
        if Pt ≠ φ then
            for each process pi in Pt do
                Ri = Ri − NPRt(nt);
                if Ri ≤ 0 then /* pi terminates */
                    Pt = Pt − {pi}; P = P − {pi};
                    etimei = τ;
                    PTi = etimei − stimei + 1; /* execution time */
            Tt = Tt + 1;
    τ = τ + 1;

Figure 2 shows the total energy consumption E of the servers obtained by the EDS simulator for the EA algorithm, which uses the previous estimation algorithm [Algorithm 1], and for the proposed ESECS algorithm, for the number n of processes. The total energy consumption E of the servers in the ESECS algorithm is smaller than in the EA algorithm, as shown in Fig. 2. Figure 3 shows the average execution time APT of the n processes. The average execution time APT of the processes in the ESECS algorithm is shorter than in the EA algorithm.


Fig. 2. Total energy consumption of servers.

Fig. 3. Average execution time of processes.

6

Concluding Remarks

In this paper, we newly proposed the ESEC algorithm to estimate the execution time of processes on a server and the energy consumption of the server, where not only the current active processes but also possible processes to be issued after the current time are considered. By using the ESEC algorithm, we also proposed the ESECS algorithm to select a server to energy-efficiently perform a process issued by a client. In the evaluation, we showed that the total energy consumption of servers is smaller and the average execution time of processes is shorter in the ESECS algorithm than in the EA algorithm.

References

1. Intel Xeon processor 5600 series: the next generation of intelligent server processors, white paper (2010). http://www.intel.com/content/www/us/en/processors/xeon/xeon-5600-brief.html
2. Sybase. https://www.sap.com/products/sybase-ase.html


3. Duolikun, D., Aikebaier, A., Enokido, T., Takizawa, M.: Energy-aware passive replication of processes. Int. J. Mob. Multimed. 9(1–2), 53–65 (2013)
4. Duolikun, D., Kataoka, H., Enokido, T., Takizawa, M.: Simple algorithms for selecting an energy-efficient server in a cluster of servers. Int. J. Commun. Netw. Distrib. Syst., 1–25 (2018)
5. Enokido, T., Ailixier, A., Takizawa, M.: A model for reducing power consumption in peer-to-peer systems. IEEE Syst. J. 4(2), 221–229 (2010)
6. Enokido, T., Ailixier, A., Takizawa, M.: Process allocation algorithms for saving power consumption in peer-to-peer systems. IEEE Trans. Industr. Electron. 58(6), 2097–2105 (2011)
7. Enokido, T., Ailixier, A., Takizawa, M.: An extended simple power consumption model for selecting a server to perform computation type processes in digital ecosystems. IEEE Trans. Industr. Inf. 10(2), 1627–1636 (2014)
8. Enokido, T., Takizawa, M.: An integrated power consumption model for distributed systems. IEEE Trans. Industr. Electron. 60(2), 824–836 (2013)
9. Kataoka, H., Duolikun, D., Enokido, T., Takizawa, M.: Energy-efficient virtualisation of threads in a server cluster. In: Proceedings of the 10th International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA-2015), pp. 288–295 (2015)
10. Kataoka, H., Nakamura, S., Duolikun, D., Enokido, T., Takizawa, M.: Multi-level power consumption model and energy-aware server selection algorithm. Int. J. Grid Util. Comput. (IJGUC) 8(3), 201–210 (2017)
11. Kataoka, H., Sawada, A., Duolikun, D., Enokido, T., Takizawa, M.: Energy-aware server selection algorithm in a scalable cluster. In: Proceedings of the IEEE 30th International Conference on Advanced Information Networking and Applications (AINA-2016), pp. 565–572 (2016)
12. Noguchi, K., Saito, T., Duolikun, D., Enokido, T., Takizawa, M.: An algorithm to select an energy-efficient server for an application process in a cluster of servers. In: Proceedings of the 12th International Conference on Intelligent Networking and Collaborative Systems (INCoS-2020)

An Approach to Support the Design and the Dependability Analysis of High Performance I/O Intensive Distributed Systems

Lucas Bressan(B), Laércio Pioli, Mario A. R. Dantas, Fernanda Campos, and André L. de Oliveira

Programa de Pós-Graduação em Ciência da Computação, UFJF, Juiz de Fora, Brazil
[email protected]

Abstract. Frequent service downtimes and poor system performance can affect aspects such as availability and quality of experience, and generate millions of dollars in lost revenue. High Performance Computing (HPC) environments are often required to comply with performance and dependability requirements. The CHESS methodology provides support for the design and the evaluation of dependability and performance system attributes. In this paper we extend the CHESS methodology to support the design and the dependability analysis of HPC environments. The proposed approach was employed in the Grid'5000, a highly distributed and I/O intensive HPC environment. The application of the proposed approach provided key information for demonstrating dependability, deriving project decisions, and agreeing on new design choices and resource allocation strategies.

1

Introduction

Dependability is the ability of a system to operate as intended and to deliver its services when required and in a trusted manner [17]. It is broken down into availability, reliability, safety, security and resilience [21]. Fault tolerance relates to the capability of a system to continue operating as intended after encountering a failure [13]. Availability is directly related to fault tolerance and refers to the ability of a system to operate continuously by either protecting itself against or quickly recovering from failures [19]. Distributed architectures such as High Performance Computing (HPC) environments are often required to attend to performance and dependability requirements. In certain domains (e.g. industrial, military, banking and e-health), long service response times, failures and momentary service downtimes can affect the provided Quality of Experience (QoE) and generate undesirable or even catastrophic consequences. Thus, HPC environments must ensure their dependability and performance, and are sometimes required to implement redundancy, error detection and fault recovery capabilities [6] and to provide low I/O times and data exchange latency [22].


Analyzing and demonstrating these requirements can prove to be very challenging when dealing with big and complex distributed HPC environments. Thus, compositional analysis and simulation techniques are applied when designing these kinds of environments [9]. CHESS is a methodology and toolset that enables the high-level specification (i.e. using UML and SysML constructs) and analysis of system models [7]. Unlike other model-based design (e.g. Matlab & Simulink) and compositional analysis (e.g. HiP-HOPS [3]) techniques, the CHESS Toolset is completely free and partially open-source, thus allowing developers to easily extend and adapt it to their specific needs. Due to its maturity and technological readiness, the methodology is used in industry and applied in domains such as IoT [11], automotive [7] and petroleum [15].

The CHESS Toolset and its State-Based Analysis extension provide extensive support for the analysis of dependability-related attributes such as availability and reliability [5] via discrete-event simulations of Stochastic Petri Nets (SPN). These results can be further attached to dependability requirements as evidence for their compliance. Furthermore, the results can also be used as a basis for agreeing on project decisions and achieving satisfactory levels of performance and dependability. Although SPN-based availability and reliability estimation techniques are constantly applied in the context of distributed architectures [4], most of them do not provide a general purpose, high-level system architecture and error behavior specification interface like CHESS does.

In previous work [7], we have extended and applied CHESS towards the generation of safety evidence for the certification of aerospace and automotive systems. Similarly, in this paper, we extend the CHESS methodology even further to support the design and dependability evaluation of HPC environments. The approach is intended to support engineers on deriving new design and additional project decisions by considering dependability analysis results and performance attributes. The feasibility of the approach was evaluated considering the Grid'5000 [2], a highly distributed and I/O intensive HPC environment. The approach successfully supports the demonstration and evaluation of the impacts different environment configurations have on dependability and performance. Furthermore, it also effectively contributes to agreeing on design choices and resource allocation strategies based on dependability, performance and cost attributes.

The rest of this paper is organized as follows: In Sect. 2 we present the related work. Section 3 contains the background. A description of the proposed approach is presented in Sect. 4. Section 5 illustrates the evaluation of the application of the proposed approach in the Grid'5000. Finally, Sect. 6 presents the concluding remarks and future work.

2

Related Work

A few authors have in the past applied or extended the CHESS Methodology to support the design of systems belonging to various different domains [11]. They have also analyzed the feasibility of the Toolset and of some of its analysis


capabilities for evaluating and demonstrating compliance with standards and project-specific dependability requirements [7].

Mazzini et al. [11] performed a feasibility study of the CHESS Toolset regarding its support for the development, analysis, verification, operation, management and monitoring of mission-critical IoT systems. The authors provide an in-depth analysis of the CHESS Toolset capabilities and how they comply with some of the main development life-cycle activities of mission-critical IoT architectures. The authors have concluded that the Toolset offers a holistic solution for the development of IoT environments and provides a meaningful view of the system and its architecture as a whole through its models and the traceability among their artifacts.

In previous work [7], we have analyzed the applicability of the CHESS Methodology for generating certifiable evidence for safety-critical embedded systems. We present a systematic process to support the use of the CHESS Methodology and Toolset. The process covers the production of dependability evidence necessary for the certification of systems of the automotive and aerospace domains. The approach considers the analysis techniques implemented in the CHESS Toolset for the production of such evidence and is evaluated through a realistic automotive Hybrid Braking system. Moreover, we have also provided a mapping between the requirements within the ISO 26262, DO-331 and SAE ARP 4754A standards and the activities supported by both the approach and CHESS.

3 Background

3.1 High Performance Computing (HPC)

High Performance Computing (HPC) systems are designed to provide high processing power and low communication and data access latency. HPC environments comprise complex combinations of hardware, software and large-scale distributed and parallel applications, e.g. computing clusters, grids and supercomputers [10]. Due to the need to meet requirements such as low communication and data access latency, I/O performance is very important in HPC systems. Environments such as these are usually divided into computing and storage resources. The constant and massive movement of data between these layers and the different configuration scenarios implemented by computing and storage devices can affect the I/O performance of these systems [18]. Furthermore, different configurations can also affect dependability attributes such as availability and reliability. Thus, it is very important to evaluate the impact of design choices on dependability, performance and costs before implementing them.

3.2 CHESS and the CHESS Dependability Analysis Plugin

The CHESS Toolset provides support for the specification of system models and the evaluation of their dependability attributes. The Toolset is completely


free of charge and is available as an Eclipse IDE extension [5]. CHESS is built upon an open and highly extensible platform, thus allowing developers to easily extend and adapt it according to their specific needs. Initiatives such as the AMASS Project [1], for example, have recently integrated the CHESS Toolset with additional design, validation, certification and dependability assessment solutions.

The Toolset is built upon the CHESS Methodology [12]. The CHESS Methodology enables system design at different levels of abstraction and supports system development phases such as the definition of requirements, system and component level architectural specification, and error modeling. CHESS models are specified using the CHESS Modeling Language (CHESS-ML). CHESS-ML extends the traditional UML and SysML modeling languages and provides profiles to enable the specification of high-level architectural models, error models and dependability analysis scenarios. CHESS supports the analysis of quantitative and qualitative dependability properties. Quantitative properties include metrics such as availability and reliability. Qualitative properties concern aspects such as fault tolerance. The CHESS-ML language alongside the CHESS Methodology ensures the traceability between the model elements, thus providing a meaningful overview of the system as a whole: model elements such as components, requirements, error models, evidence and analysis results can all be traced to one another.

The State-Based Analysis Plugin (CHESS-SBA) [14,16] gathers the information within CHESS models and converts it into Stochastic Petri Nets (SPN). The analysis considers the architectural model and the error information within UML State Machines. The semantics of these State Machines are based on the Fault-Error-Failure model and enable the specification of random faults, error states, error mode propagation and repair times. Additional failure annotation strategies such as stochastic and Fault Propagation and Transformation Calculus (FPTC) annotations are also supported by the plugin. CHESS-SBA calculates dependability metrics such as availability and reliability by performing discrete-event simulations [15]. Reliability describes the probability that a system remains healthy continuously from t = 0 to t = x. Availability is divided into instantaneous and averaged. Instantaneous availability gives the probability that a system is healthy at t = x. Averaged availability denotes the fraction of time within an interval during which a system remains healthy. CHESS-SBA does not require a complete model of the overall system to perform the analysis. Thus, models can be specified at different levels of detail according to project needs.
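To give a feel for what such a state-based simulation computes, the following self-contained Python sketch estimates reliability and averaged availability for a single component with exponential failure and repair times; it is not CHESS-SBA code, and the rates used are placeholders only:

    import random

    def estimate(mtbf, mttr, horizon, runs=10000):
        """Monte Carlo estimate of reliability R(horizon) and averaged
        availability over [0, horizon], with exponential fail/repair times."""
        no_failure, up_fraction = 0, 0.0
        for _ in range(runs):
            t, up, failed = 0.0, 0.0, False
            while t < horizon:
                ttf = random.expovariate(1.0 / mtbf)   # time to next failure
                up += min(ttf, horizon - t)            # healthy until the failure
                t += ttf
                if t >= horizon:
                    break
                failed = True
                t += random.expovariate(1.0 / mttr)    # down while being repaired
            no_failure += 0 if failed else 1
            up_fraction += up / horizon
        return no_failure / runs, up_fraction / runs   # (reliability, avg avail.)

    # Illustrative only: MTBF = 50,000 h, MTTR = 24 h, one-year mission (8,760 h)
    # reliability, availability = estimate(50000, 24, 8760)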

4

The Proposed Approach

This section describes the approach to support the definition and evaluation of the dependability properties of HPC environments. The proposed approach is incremental and comprises activities covering the definition of requirements, architectural design, error modeling and dependability analysis. Figure 1 shows the approach steps.


Fig. 1. The proposed approach.

The first two steps (1, 2) of the proposed approach cover the collection of data regarding system architecture, dependability and performance properties. Additional data such as component costs and maintenance requirements may also be recovered during step 2. Such information does not necessarily have to cover the entire system, since different parts of the architecture can be modeled and analyzed separately if desired. The system architecture data includes information such as the devices used in the architecture and how they communicate with each other, e.g. the devices used in the computation and storage nodes.

System dependability information covers the error information considered during the analysis, and it can be gathered in a couple of different ways. One way to do it is by analyzing the data within system error logs. Such data requires the system to be up and running and, therefore, must be collected from setups that are similar to the intended one. Information such as system up/down times and the number of times a certain piece of software/hardware has failed throughout a time span can be attached to model components as dependability information. Dependability information can also be gathered from device specifications, e.g. Mean Time Between Failures (MTBF), or from experiments. Additional information such as performance, maintenance and implementation costs can also be gathered and used alongside dependability information later, when agreeing on how to configure the system and allocate its resources.

Once all the necessary information has been collected, the system model can be specified using UML, SysML and CHESS-ML constructs. The next couple of steps cover the specification of the system architecture according to the CHESS Methodology and the attachment of failure information to model elements (steps 3 and 4). System specification can be performed in many different ways and abstraction levels. As previously mentioned in Sect. 3.2, the CHESS Methodology provides means for the definition of architectures on system, component and deployment levels. The CHESS-SBA plugin works with models belonging to all those levels and thus, it is completely up to the system designer to decide the complexity and the level of detail of their to-be-analyzed system model.


The next two steps (5, 6) include the analysis of the dependability attributes, their evaluation and the estimation of the model tweaks necessary to achieve the desired system requirements. Such requirements can be defined at any stage of the process and may relate to fault tolerance, availability, reliability, performance and costs. System designers can use the dependability information of different system configurations alongside performance and cost data to estimate the necessary model optimizations. Increasing dependability is usually costly and does not always translate into better system performance or a significant gain in availability. Therefore, system engineers must analyze all these variables together and agree upon the tweaks necessary so that the considered system satisfies its requirements. Furthermore, dependability analysis results can be attached to the model as dependability evidence and used to argue over and agree on project decisions, e.g., design choices, resource allocation and planned maintenance.

5 Evaluation

In order to analyze how the proposed approach can support the derivation of project decisions based on the information it provides, we have performed an evaluation aimed at answering the following research question: RQ: How can the information provided by the proposed approach support project managers in identifying and agreeing upon project decisions based on dependability/performance requirements and analysis results? The evaluation was conducted considering the architectural, performance and dependability information gathered from the Grid’5000.

5.1 Experimental Environment

The Grid’5000 [2] is a highly distributed High Performance Computing environment. It is spread across 8 different locations in France and comprises a total of 800 nodes grouped into clusters. These clusters are variant-intensive and may implement different sets of solutions from each other, e.g., CPUs, GPUs, storage and network communication devices. In this study we have considered the characteristics of the computation and storage nodes within the Dahu cluster, located in Grenoble. The considered group of nodes includes Dell PowerEdge C6420 servers, interconnected via a Gigabit Ethernet network environment. These nodes implement a pair of Intel Xeon Gold 6130 2.10 GHz/16-core CPUs and include a total of 192 GB of RAM each. As for storage solutions, the nodes may contain 240 GB/480 GB SATA SSDs and 4.0 TB SATA HDDs. The nodes run CentOS 7 (kernel v3.10.0-957.21.2.el7.x86_64) and use the ext4 file system [18].

5.2 Execution

It is well known that Solid State Drives (SSDs) generally provide better availability, reliability and I/O performance than mechanical Hard Disk Drives (HDDs). This is, however, only the tip of the iceberg.


Different data scheduler choices and storage configurations may also impact these attributes. That being said, is it always worth adopting an SSD-exclusive architecture for both data and metadata storage? Do the dependability and performance gains related to the adoption of an SSD-exclusive architecture always justify the extra costs compared to adopting a hybrid storage approach? When using a hybrid architecture, will certain services still require their data to be stored exclusively on SSDs due to their specific performance, availability and reliability needs? For this evaluation, we have considered a total of 9 different Grid’5000 scenario configurations, each implementing a different combination of storage devices and I/O schedulers. Three different approaches to storing data and metadata were considered during the evaluation. In the first scenario, both data and metadata are stored on HDDs. In a second scenario, data is kept on HDDs while metadata is stored on SSDs. In the last scenario, both data and metadata are stored on SSDs. For the selection of Linux I/O schedulers, experiments were performed considering the Completely Fair Queueing (CFQ), Deadline and Noop schedulers. Dependability properties such as availability, reliability and fault tolerance can all be evaluated through the State-Based Analysis (CHESS-SBA) CHESS plugin. As previously mentioned in Sect. 3.2, the analysis considers error information specified in UML State Machines and other types of failure annotations. Before annotating components with quantitative failure data, we must first analyze the different scenarios and the failure probabilities for each component depicted in the model. When it comes to storage devices, for example, failures within them may also affect overall system dependability. Therefore, information regarding their failure rate distributions is necessary to provide an appropriate description of their failure behavior. Such information can be gathered through previous experiments, system logs, or from the device manufacturers themselves. Since we do not have data regarding the failure rates or component up/down times of the storage device arrays within the storage nodes, we rely on the failure rates provided by previously published experiments. Even though device manufacturers provide metrics such as the Mean Time Between Failures (MTBF), claiming that their hardware offers an average of five years of continuous operation, experiments have shown that storage device failure rates go well beyond that. According to previously published studies [8,20], hard disk failure rates follow a Weibull distribution and can present an annual failure rate of 5% after 2 years of continuous operation. Such a value can go as high as 10% after 5 and 18% after 10 years of use. When it comes to SSDs, however, the failure probabilities tend to be constant and lie around 0.10% a year, therefore following an exponential failure distribution. Figure 2 shows the failure information model describing the failure behavior of mechanical hard disks. The state machine attached to the HDD component describes its behavior upon encountering both internal and external faults. As mentioned earlier, the internal failure rates of mechanical disks follow a Weibull distribution. Therefore, the distribution wei(0.54, 2E6) describes the probability of the component failing randomly over time.


Fig. 2. Error model state machine describing the failure behavior of the HDD component.

Once failing, there is a 95% chance that the hard drive failure will be detected by the system and that a redundant backup drive will kick right into its place. Such automatic detection and device substitution may take a fraction of time and may cause delays in I/O requests. Furthermore, there is also a 5% probability that the failure goes undetected. In that case, the hard disk will stop responding to any requests until replaced or fixed manually. When it comes to the performance of the Grid’5000, different storage device/scheduler configurations may generate different average I/O latency values. Long response times in I/O operations can affect not only attributes related to system performance but also those related to dependability. Depending on the application domain, a system might well be considered unfeasible if it is not capable of providing its requested services with short response delays. Previous experiments on the Grid’5000 have analyzed and made public the average read and write times per storage device/scheduler configuration [18]. Table 1 lists the average read and write times considering the different scenarios within the Grid’5000.

Table 1. Average I/O times per storage node and scheduler configurations.

Conf   | CFQ(R) | CFQ(W) | Deadline(R) | Deadline(W) | Noop(R) | Noop(W)
HDDHDD | 0.3543 | 2.0151 | 0.3468      | 2.0183      | 0.3637  | 2.2576
HDDSSD | 0.2560 | 2.0871 | 0.2707      | 1.9146      | 0.2493  | 1.8570
SSDSSD | 0.2815 | 0.6827 | 0.2841      | 0.7274      | 0.3087  | 0.6839
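As a rough, self-contained cross-check of the failure distributions cited above (a sketch under our own assumptions, not the CHESS-SBA analysis itself), one can sample HDD lifetimes from the annotated Weibull distribution and SSD lifetimes from an exponential one, and estimate the fraction of devices failed after a given number of years:

import random

def frac_failed(sample_life, years, n=100000):
    """Estimate the fraction of devices that have failed by `years`."""
    hours = years * 8760.0
    return sum(sample_life() <= hours for _ in range(n)) / n

# Weibull with scale 2e6 h and shape 0.54, as annotated on the HDD component.
hdd = lambda: random.weibullvariate(2e6, 0.54)
# Exponential with roughly a 0.10%/year failure probability, as assumed for SSDs.
ssd = lambda: random.expovariate(0.001 / 8760.0)

for y in (2, 5, 10):
    print(y, "years:", "HDD", frac_failed(hdd, y), "SSD", frac_failed(ssd, y))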

After obtaining the availability of the clusters containing the different storage node configurations (i.e., HDDs only, SSDs only, and a mix of SSDs and HDDs), and analyzing and comparing the results obtained through the State-Based Analysis, we were able to observe


that the cluster that achieved the highest overall availability was the one storing both data and metadata on SSDs. Figure 3 shows the analysis results for each considered scenario. The graph shows the fraction of time each configuration has not failed within a year.

Fig. 3. State-based analysis results.

The obtained availability results, however, can be improved by planning recurrent system checks and maintenance. Figure 4 shows the impact that planned recurrent maintenance has on the availability of each configuration. As an answer to the RQ, we can say that, by considering the dependability data shown in Figs. 3 and 4 and the I/O performance information listed in Table 1, project managers can evaluate and analyze their options for efficiently setting up the environment in many ways. If they are looking for high availability, very short I/O times and a reduced need for recurrent maintenance, then adopting an SSD-exclusive architecture, using the storage nodes with SSDs for both data and metadata together with the CFQ scheduler, could be a good choice. If they still need to provide a reasonable amount of availability and performance but are limited by costs, then they may be better off with the hybrid SSD/HDD approach. Even though SSDs are known to generally provide better I/O response times and dependability, investing in 100% SSD storage architectures may not always be the most efficient option, especially when considering costs. As the results have shown, it is still possible to obtain good I/O times and dependability through the adoption of a hybrid HDD/SSD storage architecture.


Fig. 4. Impact of recurrent maintenance on system availability.

Furthermore, a hybrid architecture may also allow smarter resource allocation according to each service’s specific needs. If a certain job or service requires a high-availability environment and short I/O times, then that job, along with its data and metadata, can be allocated to SSD storage nodes only. If not, it can have its data and metadata allocated to HDD and SSD storage nodes, respectively.

6 Conclusions and Future Work

This paper has presented an approach to support the design and the dependability evaluation of HPC environments. The approach comprises an extension of the CHESS methodology and was successfully applied and evaluated considering a highly distributed and I/O intensive HPC environment, the Grid’5000. As a result, we observed that the methodology successfully supports the dependability evaluation of distributed systems. It supports not only the estimation and demonstration of attributes such as failure behavior, availability and reliability, but also of the impact project choices may have upon them. Furthermore, we have also demonstrated how the analysis results can be combined with additional information, such as performance data, to assist in deriving and agreeing upon new project decisions. These combined results can be used by engineers to derive new project decisions and agree on new design choices and resource allocation strategies. As future work, we will gather further information from the Grid’5000, e.g., its up/down times and failure detection and mitigation mechanisms. This information will be used to enrich the current Grid’5000 model with more detailed data about its dependability properties. A more accurate model will provide us with more precise dependability metrics and help determine new strategies and requirements regarding risk mitigation and resource allocation.


Experiments presented in this paper were carried out using the Grid’5000 experimental testbed, developed under the INRIA ALADDIN development action with the support of several Universities and funding bodies (see https://www.grid5000.fr for more info). We would also like to thank the Federal University of Juiz de Fora (UFJF), CNPq and CAPES for providing financial support to this study.

References

1. AMASS: architecture-driven, multi-concern and seamless assurance and certification of cyber-physical systems. https://www.amass-ecsel.eu
2. Grid’5000. https://www.grid5000.fr/w/Grid5000:Home
3. Adachi, M., Papadopoulos, Y., Sharvia, S., Parker, D., Tohdo, T.: An approach to optimization of fault tolerant architectures using HiP-HOPS. Softw. Pract. Experience 41(11), 1303–1327 (2011). https://doi.org/10.1002/spe.1044
4. Chattaraj, D., Sarma, M., Samanta, D.: Stochastic Petri net based modeling for analyzing dependability of big data storage system. In: Abraham, A., Dutta, P., Mandal, J.K., Bhattacharya, A., Dutta, S. (eds.) Emerging Technologies in Data Mining and Information Security, pp. 473–484. Springer Singapore, Singapore (2019)
5. Cicchetti, A., Ciccozzi, F., Mazzini, S., Puri, S., Panunzio, M., Zovi, A., Vardanega, T.: CHESS: a model-driven engineering tool environment for aiding the development of complex industrial systems. In: Proceedings - 2012 27th IEEE/ACM International Conference on Automated Software Engineering, ASE 2012, pp. 362–365 (2012). https://doi.org/10.1145/2351676.2351748
6. Dave, S.: Fault-tolerance techniques in distributed systems. J. Soc. Instrum. Control Eng. 31(7), 775–780 (1992). https://doi.org/10.11499/sicejl1962.31.775
7. De Oliveira, A.L., Bressan, L., Montecchi, L., Gallina, B.: A systematic process for applying the CHESS methodology in the creation of certifiable evidence. In: Proceedings - 2018 14th European Dependable Computing Conference, EDCC 2018, pp. 49–56 (2018). https://doi.org/10.1109/EDCC.2018.00019
8. Gupta, V., Kaur, B.P., Jangra, S.: An efficient method for fault tolerance in cloud environment using encryption and classification. Soft Comput. 23(24), 13591–13602 (2019). https://doi.org/10.1007/s00500-019-03896-6
9. Khanghahi, N., Ravanmehr, R.: Cloud computing performance evaluation: issues and challenges. Int. J. Cloud Comput. Serv. Architect. 3(5), 29–41 (2013). https://doi.org/10.5121/ijccsa.2013.3503
10. Lucas, R., Ang, J., Bergman, K.: DOE Advanced Scientific Computing Advisory Subcommittee (ASCAC) Report: Top Ten Exascale Research Challenges. Technical report, United States (2014). https://doi.org/10.2172/1222713. https://www.osti.gov/servlets/purl/1222713
11. Mazzini, S., Favaro, J., Baracchi, L.: A model-based approach across the IoT lifecycle for scalable and distributed smart applications. In: Proceedings IEEE Conference on Intelligent Transportation Systems, ITSC, October 2015, pp. 149–154 (2015). https://doi.org/10.1109/ITSC.2015.33
12. Mazzini, S., Favaro, J., Puri, S., Baracchi, L.: CHESS: an open source methodology and toolset for the development of critical systems. CEUR Workshop Proc. 1835, 59–66 (2016)


13. Menychtas, A., Konstanteli, K.G.: Fault detection and recovery mechanisms and techniques for service oriented infrastructures, pp. 259–274 (2011). https://doi.org/10.4018/978-1-60960-827-9.ch014
14. Montecchi, L.: CHESS-SBA: state-based analysis. https://github.com/montex/CHESS-SBA
15. Montecchi, L., Gallina, B.: SafeConcert: a metamodel for a concerted safety modeling of socio-technical systems. In: 5th International Symposium on Model-Based Safety and Assessment (IMBSA 2017), LNCS, vol. 10437, pp. 129–144. Trento, Italy (2017)
16. Montecchi, L., Gallina, B.: SafeConcert: a metamodel for a concerted safety modeling of socio-technical systems. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), LNCS, vol. 10437, pp. 129–144 (2017). https://doi.org/10.1007/978-3-319-64119-5_9
17. Pan, Y., Hu, N.: Research on dependability of cloud computing systems. In: 2014 10th International Conference on Reliability, Maintainability and Safety (ICRMS), pp. 435–439 (2014). https://doi.org/10.1109/ICRMS.2014.7107234
18. Pioli, L., de Andrade Menezes, V.S., Dantas, M.A.R.: Research characterization on I/O improvements of storage environments. In: Lecture Notes in Networks and Systems, vol. 96, pp. 287–298 (2020). https://doi.org/10.1007/978-3-030-33509-0_26
19. Schmidt, K.: High Availability and Disaster Recovery: Concepts, Design, Implementation, 1st edn. Springer Publishing Company, Berlin (2010)
20. Schroeder, B., Gibson, G.A.: Disk failures in the real world: what does an MTTF of 1,000,000 hours mean to you. In: FAST 2007 - 5th USENIX Conference on File and Storage Technologies (2007)
21. Sommerville, I.: Software Engineering, 10th edn. Pearson (2015)
22. Wu, J., Liang, Q., Bertino, E.: Improving scalability of software cloud for composite web services. In: CLOUD 2009 - 2009 IEEE International Conference on Cloud Computing, pp. 143–146 (2009). https://doi.org/10.1109/CLOUD.2009.75

A Waiting Time Determination Method to Merge Data on Distributed Sensor Data Stream Collection

Tomoya Kawakami1(B), Tomoki Yoshihisa2, and Yuuichi Teranishi2,3

1 University of Fukui, Fukui, Japan
[email protected]
2 Osaka University, Ibaraki, Osaka, Japan
3 National Institute of Information and Communications Technology, Koganei, Tokyo, Japan

Abstract. We define continuous sensor data with different cycles as “sensor data streams” and have proposed methods to collect distributed sensor data streams. However, it is necessary to determine an appropriate waiting time at each processing computer (node) to collect and merge data efficiently. Therefore, this paper presents a method to determine a specific waiting time at each node. The simulation results show that the waiting time affects the loads and the processing time to collect the data.

1 Introduction

In the Internet of Things (IoT), various devices (things), including sensors, generate data and publish them via the Internet. We define continuous sensor data with different cycles as a sensor data stream and have proposed methods to collect distributed sensor data streams as a topic-based pub/sub (TBPS) system [8]. In addition, we have also proposed a collection system considering phase differences to avoid concentrating the data collection at specific times determined by the combination of collection cycles [4,5]. These previous methods are based on skip graphs [1], one of the construction techniques for overlay networks [3,6,7]. In our skip graph-based method considering phase differences, the collection times are balanced within each collection cycle by the phase differences, and the probability of load concentration at a specific time or processing computer (node) is decreased. However, it is necessary to determine an appropriate waiting time at each node to collect and merge data efficiently. Therefore, this paper presents a method to determine a specific waiting time at each node. The simulation results show that the waiting time affects the loads and the processing time to collect the data.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. L. Barolli et al. (Eds.): 3PGCIC 2020, LNNS 158, pp. 41–50, 2021. https://doi.org/10.1007/978-3-030-61105-7_5


Fig. 1. An example of the input setting.

2 Problems Addressed

2.1 Assumed Environment

The purpose of this study is to disperse the communication load in sensor data stream collections that have different collection cycles. The source nodes have sensors to obtain sensor data periodically. The source nodes and the collection node (sink node) of those sensor data construct P2P networks. The sink node searches for source nodes and requests sensor data streams with their collection cycles in the P2P network. Upon reception of the query from the sink node, a source node starts to deliver the sensor data stream via other nodes in the P2P network. The intermediate nodes relay the sensor data stream to the sink node based on their routing tables.

2.2 Input Setting

The source nodes are denoted as Ni (i = 1, ..., n), and the sink node of the sensor data is denoted as S. In addition, the collection cycle of Ni is denoted as Ci. In Fig. 1, each node indicates a source node or the sink node, and the branches indicate collection paths for the sensor data streams. Concretely, they indicate communication links in an application layer. The branches are indicated by dotted lines because a branch may not carry a sensor data stream, depending on the collection method. The sink node S is at the top and the four source nodes N1, ..., N4 (n = 4) are at the bottom. The number in the vicinity of each source node indicates its collection cycle, with C1 = 1, C2 = 2, C3 = 2, and C4 = 3. This corresponds, for example, to the case where a live camera acquires an image once every second, and N1 records the image once every second, N2 and N3 record the image once every two seconds, and N4 records the image once every three seconds. Table 1 shows the collection cycle of each source node and the sensor data to be received in the example in Fig. 1.


Table 1. An example of the sensor data collection.

Time | N1 (Cycle: 1) | N2 (Cycle: 2) | N3 (Cycle: 2) | N4 (Cycle: 3)
0    | ✓             | ✓             | ✓             | ✓
1    | ✓             |               |               |
2    | ✓             | ✓             | ✓             |
3    | ✓             |               |               | ✓
4    | ✓             | ✓             | ✓             |
5    | ✓             |               |               |
6    | ✓             | ✓             | ✓             | ✓
7    | ✓             |               |               |
...  | ...           | ...           | ...           | ...

2.3 Definition of a Load

The communication load of the source nodes and the sink node is given as the total of the load due to the reception of the sensor data stream and the load due to its transmission. The communication load due to reception is referred to as the reception load; the reception load of Ni is Ii and the reception load of S is I0. The communication load due to transmission is referred to as the transmission load; the transmission load of Ni is Oi and the transmission load of S is O0. In many cases, the reception load and the transmission load are proportional to the number of sensor data pieces per unit hour of the sensor data streams to be sent and received. The number of pieces of sensor data per unit hour of the sensor data stream that is to be delivered by Np to Nq (q ≠ p; p, q = 1, ..., n) is R(p, q), and the number delivered by S to Nq is R(0, q).
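A small sketch (ours, for illustration only) of these definitions: given the rate matrix R, the transmission and reception loads follow by summing over destinations and sources, respectively (index 0 denotes the sink node S):

def loads(R):
    """R[p][q] = sensor data pieces per unit hour delivered from node p to q.

    Returns the transmission loads O and reception loads I per node."""
    n = len(R)
    O = [sum(R[p][q] for q in range(n) if q != p) for p in range(n)]
    I = [sum(R[p][q] for p in range(n) if p != q) for q in range(n)]
    return O, I

# Three nodes: S (index 0), N1, N2; N2 sends via N1, and N1 sends to S.
R = [[0, 0, 0],
     [1, 0, 0],
     [0, 1, 0]]
print(loads(R))  # O = [0, 1, 1], I = [1, 1, 0]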

Fig. 2. A structure of a skip graph (peers sorted by key at level 0; links at levels 1 and 2 are determined by the membership vectors).


3 Proposed Method

3.1 Skip Graph-Based Collection Considering Phase Differences

We have previously proposed a large-scale data collection scheme for distributed TBPS [8]. The scheme in [8] assumes an overlay network for skip graph-based TBPS such as that of Banno et al. [2]. Skip graphs are overlay networks in which skip lists are applied to the P2P model [1]. Figure 2 shows the structure of a skip graph. In Fig. 2, squares show entries of routing tables on peers (nodes), and the number inside each square shows the key of the peer. The peers are sorted in ascending order by their keys, and bidirectional links are created among the peers. The numbers below the entries are called “membership vectors.” A membership vector is an integer value assigned to each peer when the peer joins. Each peer creates links to other peers at multiple levels based on the membership vector. In [8], we employ “Collective Store and Forwarding,” which stores and merges multiple small messages into one large message along a multi-hop tree structure on the structured overlay for TBPS, taking the delivery time constraints into account. This makes it possible to reduce the network processing overhead even when a large number of sensor data are published asynchronously. In addition, we have proposed a collection system considering phase differences [4,5]. In the proposed method, the phase difference of the source node Ni is denoted as di (0 ≤ di < Ci). In this case, the collection times are represented as Ci·p + di (p = 0, 1, 2, ...). Table 2 shows the times to collect data in the case of Fig. 1, where the collection cycle of each source node is 1, 2, or 3. By considering phase differences as in Table 2, the collection times are balanced within each collection cycle, and the probability of load concentration at a specific time or node is decreased. Each node sends sensor data at the times based on its collection cycle and phase difference, and other nodes relay the sensor data to the sink node. In this paper, we call considering phase differences “phase shifting (PS).” Figure 3 shows an example of the data forwarding paths on skip graphs with phase shifting (PS).

Fig. 3. Sensor data stream collection considering phase differences.


Table 2. An example of the collection time considering phase differences.

Cycle | Phase Diff. | Collection Time
1     | 0           | 0, 1, 2, 3, 4, ...
2     | 0           | 0, 2, 4, 6, 8, ...
2     | 1           | 1, 3, 5, 7, 9, ...
3     | 0           | 0, 3, 6, 9, 12, ...
3     | 1           | 1, 4, 7, 10, 13, ...
3     | 2           | 2, 5, 8, 11, 14, ...
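A minimal sketch (ours) that reproduces the collection times of Table 2 from the expression Ci·p + di:

# Collection times t = C*p + d for every phase difference d of each cycle C.
for C in (1, 2, 3):
    for d in range(C):
        times = [C * p + d for p in range(5)]
        print(f"cycle {C}, phase diff. {d}:", times, "...")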

3.2 Determination of the Waiting Time

In the collection scheme shown in [4,5] and Fig. 3, more data are efficiently aggregated on the relay nodes toward the destination node if the left-side nodes, which have longer cycles, send data earlier to the next right-side nodes. Hence, the possibility of data aggregation can be enhanced if the waiting time to send data is configured to be longer on the shorter-cycle nodes. On the other hand, nodes incur specific costs to learn about indirectly linked nodes on autonomous decentralized overlay networks such as skip graphs. These costs depend on the scale of the overlay network. Therefore, nodes in the proposed method configure their own waiting times based on their positions in the key space. The position in the key space is estimated from each node’s own collection cycle and phase difference. In the process of determining the waiting time, this paper assumes that all nodes know the maximum waiting time, denoted by wmax. The maximum waiting time is assigned to the shortest-cycle node located at the right edge of the key space. Each node configures its own waiting time based on its estimated position and the maximum waiting time so that it sends data earlier than the nodes located to its right. Algorithm 1 shows the flow to determine the waiting time in the proposed method. From lines 1 to 6, the distance to the maximum-waiting-time node is calculated based on the node’s collection cycle and phase difference. From lines 6 to 9, the maximum distance in the key space is calculated from the values of the selectable collection cycles. At line 10, the relative position in the key space is calculated, and the node’s waiting time is determined so that the waiting time is shorter for more distant nodes.


Algorithm 1: Determination of the waiting time on each node

Input: C: list of selectable collection cycles, c: node’s collection cycle,
       d: node’s phase difference, wmax: max. waiting time
Output: Node’s waiting time
 1: p = 0        // node’s location by its collection cycle and phase difference
 2: for i in C do
 3:     if c = Ci then
 4:         break
 5:     p = p + Ci
 6: p = p + d
 7: pmax = 0     // max. value by the selectable collection cycles and phase differences
 8: for i in C do
 9:     pmax = pmax + Ci
10: return wmax · (1 − p/pmax)
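A direct Python transcription of Algorithm 1 (a sketch; names follow the pseudocode, and c is assumed to appear in C, which is ordered as on the key space):

def waiting_time(C, c, d, w_max):
    """Waiting time of a node with collection cycle c and phase difference d.

    C is the list of selectable collection cycles and w_max the maximum
    waiting time, which belongs to the shortest-cycle node at the right edge."""
    p = 0                     # node's estimated position in the key space
    for Ci in C:
        if c == Ci:
            break
        p += Ci
    p += d
    p_max = sum(C)            # maximum position over the selectable cycles
    return w_max * (1 - p / p_max)

# e.g., cycles 1..10 as in the simulation set-up of Sect. 4:
print(waiting_time(list(range(1, 11)), c=3, d=1, w_max=1.0))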

4 Evaluation

In this paper, we evaluate the proposed method in simulation.

4.1 Simulation Environments

Table 3 shows the simulation environments. The collection cycle Ci of each source node is determined at random between 1 and 10. The simulation time t runs from 0 to 2519, a length which is the least common multiple of the selectable collection cycles. The number of source nodes is 250, 500, 750, or 1000. The data from the source nodes are forwarded to one destination node and aggregated on the relay nodes based on the configured waiting time. The average communication delay among nodes is 0.005 × 2^0, 0.005 × 2^1, 0.005 × 2^2, or 0.005 × 2^3. The communication delay on each node is determined under a normal distribution whose variance σ² is 0.001. The maximum number of aggregated streams on each node is 10 per time unit. We execute the simulation for each environment and compare the results where the maximum waiting time is 0 (no waiting time), 0.5, 0.75, and 1.0. The default values of the number of nodes and the average communication delay are 500 and 0.01, respectively. The simulation is executed ten times for each environment, and the average values of the evaluation indices are calculated as simulation results. The evaluation indices are the maximum instantaneous load, the total loads, the average arrival delay from the source nodes to the destination node, and the maximum arrival delay.
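For instance, per-link delays matching this set-up can be sampled as below (a sketch, ours; samples are clipped at zero since a normal variate can be negative):

import random

def link_delay(mean=0.01, variance=0.001):
    """One communication delay drawn from N(mean, variance), clipped at 0."""
    return max(0.0, random.gauss(mean, variance ** 0.5))

print([round(link_delay(), 4) for _ in range(5)])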


Table 3. Simulation environments.

Item                                  | Value
Collection cycles                     | 1, 2, ..., 10 (determined at random)
The number of destination nodes       | 1
The number of source nodes            | 250, 500, 750, 1000
Avg. communication delay              | 0.005, 0.01, 0.02, 0.04
Max. waiting time                     | 0 (no waiting time), 0.5, 0.75, 1.0
Max. number of the aggregated streams | 10
Simulation count                      | 10
Evaluation indices                    | Max. instantaneous load, total loads, avg. arrival delay, max. arrival delay

4.2 Results by the Number of Nodes

Figure 4 shows the maximum instantaneous load and the total loads of nodes when the number of nodes on the horizontal axis is varied from 250 to 1000. The average communication delay is 0.01. In Fig. 4a, the maximum instantaneous load decreases with the maximum waiting time because a longer waiting time increases the possibility of data aggregation. However, the reduction is only 9% compared to the case of no waiting time (maximum waiting time 0 and no data aggregation), even when the number of nodes is 1000 and the maximum waiting time is 1.0. In Fig. 4b, the total loads also decrease with the maximum waiting time, and the reduction rate is higher than that of the maximum instantaneous load. The maximum reduction is nearly 30% compared to the case of no waiting time. Figure 5 shows the average arrival delay and the maximum arrival delay when the number of nodes on the horizontal axis is varied from 250 to 1000. The average communication delay is again 0.01. In Fig. 5a, the loads become lowest when the maximum waiting time is 1.0; however, the average arrival delay then exceeds 1.0 and spills over into the next time unit. On the other hand, the maximum increase is 0.1 in the case of no waiting time. The influence is not serious if the application allows a small increase in the arrival delay. In addition, the number of nodes does not have a large influence on the average arrival delay in this simulation environment because the proposed method uses skip graphs, which can keep the number of hops near log n. On the other hand, Fig. 5b shows that the maximum arrival delay is more affected by the number of nodes than the average arrival delay. The growth rate is decreased by the maximum waiting time. With 1000 nodes and no waiting time, the maximum arrival delay increases by 34% compared to the case of 250 nodes.

Fig. 4. Loads by the number of nodes: (a) the maximum instantaneous load; (b) the total loads.

Fig. 5. Arrival delays by the number of nodes: (a) the average delay; (b) the maximum delay.

4.3 Results by the Average Communication Delay

Figure 6 shows the maximum instantaneous load and the total loads of nodes when the average communication delay on the horizontal axis is varied from 0.005 to 0.04. The number of nodes is 500. In Fig. 6a, the maximum instantaneous load increases with the average communication delay because a longer communication delay decreases the possibility of data aggregation. In Fig. 6b, the total loads also increase with the average communication delay, and the growth rate is higher than that of the maximum instantaneous load. When the average communication delay is 0.04 and the maximum waiting time is 1.0, the total loads increase by 43% compared to the case where the average communication delay is 0.005. Figure 7 shows the average arrival delay and the maximum arrival delay when the average communication delay on the horizontal axis is varied from 0.005 to 0.04. The number of nodes is again 500. Similar to the results by the number of nodes, the average arrival delay exceeds 1.0 and spills over into the next time unit when the maximum waiting time is 1.0. In addition, the average communication delay affects both the average arrival delay and the maximum arrival delay.

Fig. 6. Loads by the average communication delay: (a) the maximum instantaneous load; (b) the total loads.

Fig. 7. Arrival delays by the average communication delay: (a) the average delay; (b) the maximum delay.

When the average communication delay is 0.04 and the maximum waiting time is 0 (no waiting time), the maximum arrival delay increases by nearly four times compared to the case where the average communication delay is 0.005.

5 Conclusion

We have proposed a skip graph-based collection system for sensor data streams considering phase differences. In this paper, we proposed a method to determine a specific waiting time at each node. The simulation results show that the waiting time affects the loads and the processing time to collect the data. In future work, we will evaluate the proposed method in various environments, e.g., with other distributions for the communication delays among nodes. Acknowledgements. This work was supported by JSPS KAKENHI Grant Number 18K11316, the G-7 Scholarship Foundation, and Research Grants from the University of Fukui.


References

1. Aspnes, J., Shah, G.: Skip graphs. ACM Trans. Algorithms 3(4), 1–25 (2007)
2. Banno, R., Takeuchi, S., Takemoto, M., Kawano, T., Kambayashi, T., Matsuo, M.: Designing overlay networks for handling exhaust data in a distributed topic-based pub/sub architecture. J. Inform. Process. 23(2), 105–116 (2015)
3. Duan, Z., Tian, C., Zhou, M., Wang, X., Zhang, N., Du, H., Wang, L.: Two-layer hybrid peer-to-peer networks. Peer-to-Peer Netw. Appl. 10, 1304–1322 (2017)
4. Kawakami, T., Yoshihisa, T., Teranishi, Y.: A load distribution method for sensor data stream collection considering phase differences. In: Proceedings of the 9th International Workshop on Streaming Media Delivery and Management Systems (SMDMS 2018), pp. 357–367 (2018)
5. Kawakami, T., Yoshihisa, T., Teranishi, Y.: Evaluation of a distributed sensor data stream collection method considering phase differences. In: Proceedings of the 10th International Workshop on Streaming Media Delivery and Management Systems (SMDMS 2019), pp. 444–453 (2019)
6. Legtchenko, S., Monnet, S., Sens, P., Muller, G.: RelaxDHT: a churn-resilient replication strategy for peer-to-peer distributed hash-tables. ACM Trans. Auton. Adapt. Syst. 7(2), 1–18 (2012)
7. Shao, X., Jibiki, M., Teranishi, Y., Nishinaga, N.: A virtual replica node-based flash crowds alleviation method for sensor overlay networks. J. Netw. Comput. Appl. 75, 374–384 (2016)
8. Teranishi, Y., Kawakami, T., Ishi, Y., Yoshihisa, T.: A large-scale data collection scheme for distributed topic-based pub/sub. In: Proceedings of the 2017 International Conference on Computing, Networking and Communications (ICNC 2017) (2017)

Possible Energy Consumption of Messages in an Opportunistic Network

Nanami Kitahara1(B), Shigenari Nakamura2, Takumi Saito1, Tomoya Enokido3, and Makoto Takizawa1

1 Hosei University, Tokyo, Japan
{nanami.kitahara.3y,takumi.saito.3j}@stu.hosei.ac.jp, [email protected]
2 Tokyo Metropolitan Industrial Technology Research Institute, Tokyo, Japan
[email protected]
3 Rissho University, Tokyo, Japan
[email protected]

Abstract. In disaster-tolerant (DTN) and vehicle-to-vehicle (V2V) networks, each node communicates with other nodes over infrastructure-less wireless networks. Here, a node has to wait for an opportunity to communicate with another node. Even if a message is successfully forwarded to a neighboring node, that node might be unable to forward the message to any other node. Hence, each node may have to retransmit a message many times. In our previous studies, the possible energy consumption (PEC) of a message stored in a node was proposed, which shows how much energy the node is expected to consume to retransmit the message. The PEC of a message is defined to depend on the delivery ratio to the destination node. In this paper, we newly propose an algorithm to retransmit messages where the interval between retransmissions of each message is decided based on the PEC. In the evaluation, we evaluate the retransmission algorithm in terms of the number of retransmissions and the delivery time. Keywords: Energy-efficient opportunistic networks · Possible energy consumption (PEC) · Optimistic node · Pessimistic node

1 Introduction

Opportunistic networks [4,7], which use wireless communication, are becoming more important in various applications like V2V (vehicle-to-vehicle) networks [3] and DTNs (Delay/disaster Tolerant Networks) [5]. Here, each node has to wait for some node to come into its communication range in order to deliver messages to the destination nodes. On receipt and transmission of a message, a node keeps the message in its buffer. Once a node pi finds some node pj in its communication range, the node pi retransmits a message m in the buffer to the node pj.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. L. Barolli et al. (Eds.): 3PGCIC 2020, LNNS 158, pp. 51–61, 2021. https://doi.org/10.1007/978-3-030-61105-7_6


In many opportunistic routing protocols like Epidemic [10], PRoPHET [1], Spray and Wait [8], MAC [9], and DOMAC [2], the number of messages transmitted in the network and kept in the memory buffers is intended to be decreased, while the delivery ratio of messages is intended to be increased. A node consumes electric energy to transmit and receive messages. The longer a message is kept in the buffer of a node, the more energy the node consumes by retransmitting the message. The concept of the possible energy consumption (PEC) of each message kept in the buffer was proposed in our previous paper [6]. The PEC of a message in a node shows how much energy the node is expected to consume to retransmit the message. The PEC of a message depends on the delivery ratio of the message to the destination node: the smaller the delivery ratio, the larger the PEC. The effective energy residue (ER) of each node is the difference between the energy residue and the total PEC of the messages in the buffer. The larger the ER of a node is, the more times the node can retransmit each message in the buffer. A node whose ER is larger is optimistic, and a node whose ER is smaller is pessimistic. In this paper, we propose a retransmission algorithm that retransmits messages while changing the inter-retransmission interval. The inter-retransmission interval is decided based on the types of the source and destination nodes; we propose four types of retransmission algorithms. In the evaluation, we show the expected number of retransmissions and the expected delivery time of each message under the retransmission algorithm. In Sect. 2, we present the system model and the PEC of a message. In Sect. 3, we propose the retransmission algorithm. In Sect. 4, we evaluate the retransmission algorithm.

2 System Model

2.1 Wireless Networks

A system S is composed of mobile nodes p1, ..., pn (n ≥ 1) which are interconnected in wireless networks. Each node pi communicates with other nodes over wireless links. A node pi can communicate with another node pj (pi ↔ pj) only if the node pj is in the communication range of the node pi. A node pi supports a buffer BFi to store the messages which the node pi receives and sends. Let sbi be the size of the buffer BFi, i.e., the maximum number of messages which the node pi can store in the buffer BFi. Let mmi be the number of messages in the buffer BFi. On receipt of a message m, a node pi stores the message m in the buffer BFi if mmi < sbi; otherwise, the node pi cannot receive the message m. For each message m, a TTL (time-to-live) variable m.c is maintained. On receipt of a message m, the variable m.c is set to 0 and the message m is stored in the buffer BFi. A message m in the buffer BFi is eventually retransmitted to another node, e.g., each time some node comes into the communication range. Each time a node pi retransmits a message m, the variable m.c


is incremented by one in the node pi. If the variable m.c gets larger than the maximum number xri of retransmissions, the message m is removed from the buffer BFi.

2.2 Energy Consumption

A node pi consumes electric energy to transmit and receive a message. Let SEi and REi be the electric energy consumed by a node pi to transmit and receive a message, respectively. A message m in the buffer BFi of a node pi is retransmitted if some node pj is in the communication range of pi. We consider how much energy a node pi is expected to consume to retransmit a message m until the message m is delivered to the destination node. Let m.dst stand for the destination node of a message m. The energy to be consumed by a node pi depends on the loss probability of each message m to the destination node pj. Let fij be the probability that a node pi cannot deliver a message to a destination node pj. Let xri be the maximum number of retransmissions of each message in a node pi. If a node pi could retransmit a message to a node pj infinitely many times, the expected number of transmissions would be 1/(1 − fij). The probability that a message is delivered to the destination node pj within k transmissions is 1 − fij^k. Let cij and dij be k/xri (< 1), where k is the number of transmissions for which 1 − fij^k ≤ 0.8 and 1 − fij^k ≥ 0, respectively. In this paper, the expected power PRij(k) [W] consumed by a node pi to retransmit a message m where m.dst = pj and m.c = k (≤ xri), i.e., which has already been retransmitted k times, is defined as follows (Fig. 1):

PRij(k) = 1                                                     for k ≤ dij · xri,
PRij(k) = 1 − (k − dij · xri)² / ((cij − dij)² · xri²)          for dij · xri < k ≤ cij · xri,
PRij(k) = 0                                                     for cij · xri < k.   (1)

PRij(dij · xri) = 1 and PRij(cij · xri) = 0. For a message m where m.dst = pj and m.c is k, the possible energy consumption (PEC) NPEij(k) is defined as follows:

NPEij(k) = SEi · PRij(k).   (2)

Figure 1 shows NPEij(k) for 0 ≤ k ≤ xri. For k ≤ dij · xri, NPEij(k) = SEi. For dij · xri < k ≤ cij · xri, NPEij(k) decreases, since there is a higher possibility that the message m has been delivered to the destination node. For k > cij · xri, NPEij(k) = 0. The possible energy consumption (PEC) PEij(m) of a node pi for each message m in the buffer BFi whose destination is pj is as follows:

PEij(m) = SEi                                                       for k ≤ dij · xri,
PEij(m) = SEi · (1 − (k − dij · xri)² / ((cij − dij)² · xri²))      for dij · xri < k ≤ cij · xri,
PEij(m) = 0                                                         for cij · xri < k.   (3)


Fig. 1. Power consumption of a node pi to retransmit a message to pj.

PEij(m) = SEi if a message m has been retransmitted fewer times than dij · xri. The PEC PEij(m) decreases with the square of the number m.c of retransmissions of a message m. Each message m is retransmitted at most cij · xri times. Let xPEij stand for the maximum PEC of each message m where m.dst = pj, i.e., xPEij = PEij(m) with m.c = 0. A node pi consumes the energy SEi to retransmit a message m. Since a message m in the buffer BFi has already been retransmitted m.c times, the node pi has already consumed the transmission energy TEi(m) for the message m:

TEi(m) = SEi · m.c.   (4)
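A small sketch (ours) of Eqs. (1)–(4), computing the expected power factor PRij(k) and the PEC PEij(m) for a message already retransmitted k times (the parameter values in the example are illustrative assumptions):

def pr(k, xr, c, d):
    """Expected power factor PR_ij(k) of the k-th retransmission (Eq. (1))."""
    if k <= d * xr:
        return 1.0
    if k <= c * xr:
        return 1.0 - (k - d * xr) ** 2 / ((c - d) ** 2 * xr ** 2)
    return 0.0

def pe(k, xr, c, d, se):
    """Possible energy consumption PE_ij(m) of a message with m.c = k (Eq. (3))."""
    return se * pr(k, xr, c, d)

def te(k, se):
    """Energy already consumed by k retransmissions (Eq. (4))."""
    return se * k

# e.g., xr_i = 10 retransmissions, d_ij = 0.3, c_ij = 0.8, SE_i = 1 energy unit:
for k in range(11):
    print(k, round(pe(k, 10, 0.8, 0.3, 1.0), 3))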

Let xEi be the maximum energy residue which the battery of a node pi can supply. A variable Ri (≤ xEi) denotes the energy residue of a node pi. Initially, Ri is xEi, i.e., the battery is fully charged. Each time a node pi receives or transmits a message, the energy residue Ri is decremented by the energy consumption REi or SEi, respectively. TPEi is the total PEC of a node pi to transmit every message in the buffer BFi, defined as the summation of the PEC of all the messages in the buffer BFi:

TPEi = Σ_{m∈BFi} PEi,m.dst(m).   (5)

The effective energy residue (ER) ERi of a node pi is as follows:

ERi = Ri − TPEi.   (6)

Initially, Ri = ERi = xEi and TPEi = 0 for each node pi. ERi denotes the electric energy which a node pi can consume to newly transmit and receive messages. The effective energy residue ERi and the energy residue Ri are updated each time a node pi transmits and receives a message m, as shown in Algorithm 1.


Algorithm 1: [pi transmits a message m]

input: m = message to be transmitted, where m.dst = pj;
if m.c ≤ cij · xri then
    TPEi = TPEi − NPEij(m.c) + NPEij(m.c + 1);   // update the total PEC
    Ri = Ri − SEi;                               // transmission energy
    ERi = Ri − TPEi;
    m.c = m.c + 1;
    transmit m;
else
    remove m;                                    // retransmission limit exceeded

A node pi receives a message m, where m.dst = pj, as shown in Algorithm 2.

Algorithm 2: [pi receives a message m]

m = receive();
if BFi is not full then
    m.c = 0;                        // newly received message (Sect. 2.1)
    TPEi = TPEi + PEij(m);          // the stored message adds its PEC
    Ri = Ri − REi;                  // reception energy
    ERi = Ri − TPEi;
    store m in BFi;
else /* buffer full */
    select a message m′ in BFi where PEij(m′) is the smallest;
    remove m′;

If Ri ≤ 0, a node pi can neither transmit nor receive any message. On the other hand, even if ERi ≤ 0, a node pi can still send and receive messages but may not be able to retransmit every message in the buffer BFi. Note that Ri ≥ ERi. The larger the effective energy residue ERi of a node pi, the more optimistic the node pi is, i.e., the more often the node pi can retransmit messages. A node pi is optimistic if its effective energy residue ERi is larger. For example, once an optimistic node pi finds another node pj in its communication range, the node pi transmits messages to the node pj. On the other hand, if the ERi of a node pi is smaller, the node pi is pessimistic, i.e., the node pi does not often retransmit messages. For example, even if another node pj is in the communication range, a pessimistic node pi may not send messages to the node pj. A pessimistic node pi only sends messages to a node pj which is more optimistic, i.e., whose effective energy residue ERj is larger.

3 Retransmission Algorithm

Suppose a node pj is in the communication range of a node pi (pi ↔ pj). The node pi retransmits a message m in the buffer BFi whose destination is the node pj, i.e., m.dst = pj. The node pi changes the inter-retransmission interval of a message m depending on the type of the destination node pj and its own type. The fewer the retransmissions, the less energy the node pi consumes. Let xri be the maximum number of retransmissions of each message at a node pi. Let rtij(k) denote the time at which a node pi retransmits a message m where m.dst = pj for the kth time (k ≤ xri), with rtij(k + 1) ≥ rtij(k) for k ≥ 1. That is, a node pi retransmits a message m to a node pj rtij(k) − rtij(k − 1) time units after the (k − 1)th retransmission. The difference rtij(k) − rtij(k − 1) gives the inter-retransmission time between the (k − 1)th and kth retransmissions. An optimistic node pi assumes a message m can be delivered to the destination node once the node pi transmits the message m; hence, its inter-retransmission time is longer. On the other hand, a pessimistic node pi assumes a message m may not be delivered to the destination node; hence, a pessimistic node pi retransmits the message m more often, i.e., its inter-retransmission time is shorter. The retransmission time rtij(k) of the kth retransmission of a node pi is decided depending on the types of the node pi and the destination node pj as follows:

1. [Optimistic-Optimistic (OO)]. If an optimistic node pi transmits a message m to an optimistic node pj, rtij(k) = (xri/2) · k. The inter-retransmission time rtij(k) − rtij(k − 1) is a constant xri/2.
2. [Pessimistic-Pessimistic (PP)]. If a pessimistic node pi transmits a message m to a pessimistic node pj, rtij(k) = 2k. Here, the inter-retransmission time rtij(k) − rtij(k − 1) is a constant 2, i.e., the node pi retransmits the message m every two time units.
3. [Optimistic-Pessimistic (OP)]. If an optimistic node pi retransmits a message m to a pessimistic node pj, the larger the number k of retransmissions, the longer the inter-retransmission interval. In this paper, rtij(k) = k(k + 1)/2 for k ≤ xri, so rtij(k) − rtij(k − 1) (= k) > rtij(k − 1) − rtij(k − 2) (= k − 1).
4. [Pessimistic-Optimistic (PO)]. If a pessimistic node pi retransmits a message m to an optimistic node pj, the inter-retransmission interval decreases as k gets larger. In this paper, rtij(k) = k(2xri − k + 1)/2 − 1 for k ≤ xri, so rtij(k) − rtij(k − 1) (= xri − k + 1) < rtij(k − 1) − rtij(k − 2) (= xri − k + 2).

Figure 2 shows the inter-retransmission interval rtij(k) for the OO, SO, SP, and PP types.
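As a sketch (ours), the four schedules can be generated directly from these formulas:

def rt(k, xr, kind):
    """Time of the k-th retransmission (k = 1..xr) for each node-type pair."""
    if kind == "OO":    # constant interval xr/2
        return (xr / 2) * k
    if kind == "PP":    # constant interval 2
        return 2 * k
    if kind == "OP":    # interval k, growing with k
        return k * (k + 1) / 2
    if kind == "PO":    # interval xr - k + 1, shrinking with k
        return k * (2 * xr - k + 1) / 2 - 1
    raise ValueError(kind)

xr = 10
for kind in ("OO", "PP", "OP", "PO"):
    print(kind, [rt(k, xr, kind) for k in range(1, xr + 1)])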


Table 1 summarizes the actions of a source node pi toward a destination node pj.

Table 1. Optimistic and pessimistic actions.

Fig. 2. Inter-retransmission interval (xri = 10).

4 Evaluation

We consider a pair of a source node pi and a destination node pj. Let f(x) be the loss probability (LP) that the node pj does not receive a message which the node pi retransmits x time units later. In fact, the variable x reflects the distance between the nodes pi and pj. Let xr be the maximum number of retransmissions of a message at the node pi. The loss probability f(x) changes as x changes. Let xrt be the maximum value of rt(xr) over the OO, SO, SP, and PP types. In the evaluation, xr = 10 and xrt = 55. We consider two types of loss probability (LP) functions, f(x) = x/xrt and f(x) = 1 − x/xrt. The first loss


probability (LP) function f(x) = x/xrt means the node pj is leaving the node pi; this loss probability function f(x) monotonically increases. The second loss probability function f(x) = 1 − x/xrt means the node pj is approaching the node pi, i.e., f(x) monotonically decreases. A function rt(k) gives the time at which the node pi performs the kth retransmission of a message, as discussed in the preceding section. Here, the expected number RN of retransmissions to deliver a message to the node pj is given as follows:

RN = (1 − f(rt(1))) + 2 · (1 − f(rt(2))) · f(rt(1)) + ... + xr · (1 − f(rt(xr))) · f(rt(1)) · ... · f(rt(xr − 1)).   (7)

The expected time RT to deliver a message to the destination node pj is as follows:

RT = (1 − f(rt(1))) · rt(1) + (1 − f(rt(2))) · f(rt(1)) · rt(2) + ... + (1 − f(rt(xr))) · f(rt(1)) · ... · f(rt(xr − 1)) · rt(xr).   (8)
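Both expectations can be computed directly, as in the following sketch (ours; f is a loss-probability function and rt one of the schedules from Sect. 3):

def rn_rt(f, rt, xr):
    """Expected retransmission count RN (Eq. (7)) and delivery time RT (Eq. (8))."""
    rn, rtime = 0.0, 0.0
    lost_so_far = 1.0                          # probability all earlier tries were lost
    for k in range(1, xr + 1):
        p_k = (1.0 - f(rt(k))) * lost_so_far   # delivered exactly at the k-th try
        rn += k * p_k
        rtime += rt(k) * p_k
        lost_so_far *= f(rt(k))
    return rn, rtime

xr, xrt = 10, 55
leaving = lambda x: x / xrt          # LP type 1: destination node leaving
approaching = lambda x: 1 - x / xrt  # LP type 2: destination node approaching
pp = lambda k: 2 * k                 # PP schedule as an example
print(rn_rt(leaving, pp, xr), rn_rt(approaching, pp, xr))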

Figures 3 and 4 show the expected number RN of retransmissions of the OO, SO, SP, and PP types against the number k of retransmissions for the first and second loss probability functions. For the first loss probability function f, where the node pj is leaving the node pi, the SO type implies the fewest retransmissions RN. For the second loss probability function, where the node pj is approaching the node pi, RN is the smallest for k ≤ 6 in the PP type and for k > 6 in the SP type.

Fig. 3. Expected number RN of retransmissions (xr = 10, LP type 1).


Figures 5 and 6 show the expected delivery time RT of the OO, SO, SP, and PP types against the number k of retransmissions for the first and second loss probability functions. For the first and second loss probability functions f, the delivery time RT is the shortest in the SO type and in the PP type, respectively.

Fig. 4. Expected number RN of retransmissions (xr = 10, LP type 2).

Fig. 5. Expected delivery time RT (xr = 10, LP type 1).


Fig. 6. Expected delivery time the RT (xr = 10, LP type 2).

5 Concluding Remarks

Mobile nodes are interconnected in wireless networks, and a node communicates with other nodes in its communication range. A node consumes energy to transmit the messages in its buffer. In this paper, we proposed an algorithm to retransmit messages in which the inter-retransmission intervals of the messages are changed depending on the types of nodes. In the evaluation, we showed the expected number of retransmissions and the expected delivery time of messages.

References

1. Probabilistic routing protocol for intermittently connected networks. http://tools.ietf.org/html/draft-irtf-dtnrg-prophet-09
2. Bazan, O., Jessemudin, M.: An opportunistic directional MAC protocol for multihop wireless networks with switched beam directional antennas. In: Proceedings of the IEEE International Conference on Communications, pp. 2775–2779 (2008)
3. Bilstrup, K., Uhlemann, E., Strom, E.G., Bilstrup, U.: Evaluation of the IEEE 802.11p MAC method for vehicle-to-vehicle communication. In: Proceedings of the IEEE 68th Vehicular Technology Conference (VTC 2008-Fall), pp. 1–5 (2008)
4. Spaho, E., Barolli, L., Kolici, V., Lala, A.: Evaluation of single-copy and multiple-copy routing protocols in a realistic VDTN scenario. In: Proceedings of the 10th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS-2016), pp. 285–289 (2016)
5. Farrell, S., Cahill, V.: Delay and Disruption Tolerant Networking. Artech House (2006)
6. Kitahara, N., Nakamura, S., Saito, T., Enokido, T., Takizawa, M.: Opportunistic communication protocol to reduce energy consumption of nodes. In: Proceedings of the IEEE 23rd International Conference on Network-Based Information Systems (NBiS-2020) (2020)
7. Dhurandher, S.K., Sharma, D.K., Woungang, I., Saini, A.: An energy-efficient history-based routing scheme for opportunistic networks. Int. J. Commun. Syst. 30(7), e2989 (2015)
8. Spyropoulos, T., Psounis, K., Raghavendra, C.S.: An efficient routing scheme for intermittently connected mobile networks. In: Proceedings of the ACM SIGCOMM 2005 Workshop on Delay Tolerant Networking and Related Networks (WDTN-05), pp. 252–259 (2005)
9. Tzamaloukas, A., Garcia-Luna-Aceves, J.J.: Channel-hopping multiple access. In: Proceedings of IEEE ICC 2000, pp. 415–419 (2000)
10. Vahdat, A., Becker, D.: Epidemic routing for partially connected ad hoc networks. Technical report CS-200006 (2000)

Aggregating and Sharing Contents for Reducing Redundant Caches on NDN

Yuya Nakata1(B) and Tetsuya Shigeyasu2

1 Graduate School of Comprehensive Scientific Research, Prefectural University of Hiroshima, Hiroshima, Japan
[email protected]
2 Department of Management and Information Systems, Prefectural University of Hiroshima, Hiroshima, Japan
[email protected]

Abstract. NDN (Named Data Networking) returns requested contents to users by utilizing network caches on routers. Routers cache the contents that have been delivered through them. Hence, popular contents requested frequently by multiple users may be cached on multiple routers, which leads to a high redundancy of cached contents and consumes network buffer capacity. When the multiplicity of cached contents becomes low, the performance of NDN also degrades due to a low cache hit ratio. In this paper, we propose a new method to aggregate network-cached contents according to a survey of user requests at the edge routers. The proposal relocates aggregated contents to improve the cache hit ratio by sharing them among multiple users. The results of performance evaluations confirm that the proposal draws out the effects of network caching better than conventional NDN.

1 Introduction

With the changing forms of network utilization, NDN (Named Data Networking) [1–3] has started to attract attention from network researchers. NDN enables rapid delivery of the contents requested by users by utilizing network caches on routers. For this purpose, NDN caches contents in relay routers when it forwards the contents towards the requesting users. Obviously, NDN shortens the content delivery time if those caches are stored on relay routers closer to users. However, NDN cannot shorten the content delivery time effectively if the same content is stored on multiple relay routers, because the cache utility is degraded by low cache multiplicity. Typically, however, the LCE (Leave Copy Everywhere) [2] algorithm is selected as the NDN caching algorithm. In the LCE algorithm, every relay router stores a content in its buffer as a cache once it forwards the content. So, popular contents that are frequently requested will be cached all over the network more than necessary. This phenomenon induces low network buffer utilization and low content multiplicity. Hence, it is strongly required to solve this problem and increase the multiplicity of the contents cached in the network buffer.


In this paper, we propose a new method to reduce cache redundancy and relocate popular contents to routers that are likely to be able to respond to requests from more users. The proposal aggregates and relocates contents according to a survey of users' requests at the ER (Edge Router) close to the users. Through performance evaluations, this paper clarifies that our proposal improves the performance of NDN in terms of content delivery.

2 Related Works

Several methods for reducing the redundancy of cached contents in the network buffer have been proposed. LCD (Leave Copy Down) [4] and MCD (Move Copy Down) [5] are two of them; these methods copy or move the content one hop away from the node returning the requested content. RAC (Random Autonomous Caching) [6], HPC (Hop-based Probabilistic Caching) [7], and ProbCache [8] are methods that probabilistically cache forwarded contents in the buffer at each relay router. In RAC, all relay routers store the forwarded content with the same probability, while HPC varies the probability according to both the cache holding time and the distance between the relay router and the server that published the original content. ProbCache is an advanced version of HPC: it varies the caching probability at relay routers according to the amount of remaining buffer in addition to the distance between the relay router and the server. On the other hand, several methods that vary the caching probability based on the network centrality calculated by a topology survey in advance have also been proposed [9]. These methods can reduce the redundancy locally but cannot effectively reduce the cache redundancy over the entire network. Hence, several methods aimed at reducing the redundancy over the entire network have been proposed. The method in [10] avoids duplicate caching by caching popular contents on routers inside the local domain. In [11], a method has been proposed that reduces redundancy by having routers exchange their cache status with neighboring routers. A similar procedure has been proposed in [12]; this method avoids redundant caching by exchanging the cache status of relay routers with the ISP (Internet Service Provider). These types of methods require periodic status exchange. Owing to the exchanged status, entries in the FIB (Forwarding Information Base) used for forwarding may be changed frequently. This procedure induces control overhead and lowers the scalability of the methods.

3 Proposal

This section proposes a new, more scalable method for achieving a high cache hit ratio while reducing the redundancy of cached contents. In the proposal, contents belonging to popular categories are aggregated and relocated to the relay router that is best suited for delivering contents toward users located over wide areas. The content relocation is enforced according to the content request survey at the ER. The relay router selected as the router to store the relocated contents is called the AR (Aggregation Router). By adequate survey and relocation, our proposal can achieve low redundancy and a high hit ratio of the cache buffer.

Estimation of Popular Content
To estimate the popularity of requested contents, the ER investigates the content request arrival interval for each category of contents. For a category A, the ER calculates the content request arrival interval T_A by the following Eq. (1):

T_A = α T_oldA + (1 − α)(t_A − t_preA)    (1)

Here, T_oldA, α, t_A, and t_preA are the current estimated content request arrival interval for category A, a smoothing coefficient, the latest Interest receiving time, and the previous Interest receiving time, respectively.

Aggregation of Popular Contents
To inform the AR of the estimated request interval of each content category, our proposal registers those values in the header of the Interest as a field RI (Request Interval). In the RI, the estimated content request intervals are registered after being normalized by the number of faces. When the AR receives an Interest with the estimated values, the AR decides to aggregate the content category having the highest value in the RI. For this decision, if the AR selects a category requested by only a few user groups, the aggregation effect cannot be obtained. Hence, a category requested by many user groups is selected as the aggregation category in the proposal, and the AR eliminates categories from the aggregation candidates when they are requested by only a few user groups (see Fig. 1(i)). As Fig. 1(i) shows, category C is not selected for aggregation. After the decision of the aggregation category, the AR does not cache contents other than those of the selected category. In addition, to avoid duplicate caching, the AR adds the header field "cached", which is set to true when the forwarded content is stored at the AR. A relay router that receives a content with the cached flag never caches it, even if it does not have the content in its buffer (see Fig. 1(ii)).
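As an illustration, the following Python sketch shows how an ER could maintain the per-category estimate of Eq. (1); the class and attribute names are assumptions made for illustration, not the authors' implementation.

class EdgeRouter:
    def __init__(self, alpha=0.9):
        self.alpha = alpha        # coefficient alpha (0.9 in the evaluation)
        self.interval = {}        # T_A: estimated request arrival interval per category
        self.last_time = {}       # t_preA: previous Interest receiving time per category

    def on_interest(self, category, t_now):
        # Update T_A = alpha * T_oldA + (1 - alpha) * (t_A - t_preA).
        if category in self.last_time:
            sample = t_now - self.last_time[category]
            old = self.interval.get(category, sample)
            self.interval[category] = self.alpha * old + (1 - self.alpha) * sample
        self.last_time[category] = t_now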


Fig. 1. Procedure of our proposal

Refresh Sequence for Buffered Caches
As described above, our proposal relocates the aggregated contents to the AR. The other relay routers belonging to the same content delivery path, however, still store contents that were selected as aggregated contents for the AR. To reduce the duplicated caches of aggregated contents, our proposal further implements a refresh sequence for detecting duplicate caches in the buffers of relay routers. The refresh sequence starts from the AR. For the investigation of duplicate caches, the AR adds an INV (Investigation) list to the Data header. Here, INV is used for gathering information about the categories that relay routers store in their buffers as caches. A relay router that receives a Data with an INV list registers the content categories it stores and then forwards the Data to its downstream relay router. If the relay router stores a category that the INV list of the received Data already contains, the relay router eliminates all contents of that category to reduce the redundancy of the cache.
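The following is a small illustrative Python sketch of this refresh sequence at a relay router, under the assumption that the cache is keyed by content category; all names are hypothetical.

def refresh_on_data(router_cache, inv_list):
    """router_cache: dict mapping category -> cached chunks of that category.
    inv_list: categories already registered by upstream routers (from the AR)."""
    # Eliminate every category that an upstream router already caches.
    for category in list(router_cache):
        if category in inv_list:
            del router_cache[category]      # remove all contents of the category
    # Register the categories this router still caches before forwarding
    # the Data to the downstream relay router.
    inv_list.extend(c for c in router_cache if c not in inv_list)
    return inv_list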

Table 1. Simulation parameters

Parameter                      Value
Number of nodes                62
Number of contents             700
Interest generation interval   0.1 [pkt/sec]
Data rate                      1 [Gbps]
Interest packet                100 [Byte]
Data packet                    1,000 [Byte]
Link delay                     1 [msec]
Simulation period              1,000 [sec]
α                              0.9

4 Performance Evaluation

4.1 Evaluation Condition

Table 1 shows the parameters used for the simulation, and Fig. 2 shows the evaluation network topology. In the figure, the uppermost network, which includes the original server, is NW1; NW1 is produced according to the BA (Barabási-Albert) model [13] to imitate a core network operated by an ISP/AS. NW2 is a network that consists of users requesting contents. These users start to request contents 20 s after the simulation starts. The content names requested by users are based on Zipf's law [14], a well-known and typical law used to imitate content requests on the Internet. The full name of a content is constructed as "/host0/x/y". Here, the first layer, host0, denotes the original server that publishes the content, the second layer x denotes the content category, and the third layer y denotes the chunk number. x and y are selected by Zipf's law and uniformly at random, respectively.
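As an illustration of this request model, the following Python sketch draws a content category by Zipf's law and a chunk number uniformly at random, and builds the full content name. The number of categories, the number of chunks, and the Zipf exponent s are assumed values for illustration, not parameters reported in the paper.

import random
import numpy as np

def request_name(num_categories=4, num_chunks=100, s=1.0):
    # Zipf's law: category rank x is requested with probability proportional to 1/x^s.
    ranks = np.arange(1, num_categories + 1)
    probs = (1.0 / ranks**s) / np.sum(1.0 / ranks**s)
    x = np.random.choice(ranks, p=probs)       # content category (second layer)
    y = random.randint(0, num_chunks - 1)      # chunk number (third layer), uniform
    return f"/host0/{x}/{y}"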

4.2 Relationship Between Cache Size and Multiplicity of Cached Contents

Figure 3 shows the characteristics of the multiplicity of cached contents under varying cache sizes of the relay routers. The vertical axis shows the multiplicity of contents cached in NW2, normalized by the total number of original contents. As the figure indicates, the multiplicity of cached contents is improved by the aggregation of the proposal.


Fig. 2. Simulation topology

4.3 Transition of Cache Hit Ratio

Figure 4 shows the transition of the cache hit ratios. In this figure, the horizontal axis and the vertical axis show the elapsed time of the simulation and the cache hit ratio, respectively. The cache hit ratio is measured only inside NW2. As the figure shows, the results confirm that our proposal achieves a higher cache hit ratio than the conventional NDN after 20 s, when the aggregation process of the proposal starts.

4.4 Differences in Cache Hit Ratio Among Content Categories

The characteristics of the cache hit ratios of each content rank are shown in Figs. 5, 6, 7 and 8. The content rank corresponds to the request probability based on Zipf's law, as mentioned before (a smaller rank has a higher request probability). These four figures show the results of categories A, B, C, and D, respectively. As the figures show, by introducing our proposal, most contents ranked higher than 40 achieve a higher cache hit ratio regardless of their content category. In addition, the category most improved in terms of cache hit ratio is B.


Fig. 3. Normalized cache multiplicity

The reason is that category A is requested by all users, as the network topology shows. In the conventional NDN, this request pattern makes almost all relay routers cache the contents of category A more than those of the other categories. By applying our proposal, however, A is selected as the aggregation category and its cached contents are relocated to the uppermost AR in NW2. Therefore, the contents of categories B and C, the next most frequently requested categories, can use the buffer space freed by the migration of the cached contents of category A. On the contrary, the cache hit ratio of the contents of category D is not increased by our proposal, because only one user requests them.

4.5 Characteristics of Hop Count

Figure 9 shows the characteristics of the hop count needed to obtain requested contents under varying CS sizes of each relay router. As the figure shows, by applying our proposal, the hop count is reduced compared with the conventional NDN. This implies that our proposal delivers a large number of contents from nodes closer than the original server by increasing the multiplicity of cached contents on relay routers. The results therefore confirm that our proposal can effectively reduce the network traffic for delivering contents.


Fig. 4. Cache hit ratio (CS size = 100)

Fig. 5. Cache hit ratio (category A, CS size = 100)


Fig. 6. Cache hit ratio (category B, CS size = 100)

Fig. 7. Cache hit ratio (category C, CS size = 100)


Fig. 8. Cache hit ratio (category D, CS size = 100)

Fig. 9. Hop count

5 Conclusion

In this paper, we proposed a content aggregation and relocation method for increasing the multiplicity of cached contents on NDN. Our proposal mainly consists of two parts: 1) detection of popular contents and decision of the content category that should be relocated, and 2) detection of redundant caches on relay routers on the same content delivery path and removal of the overlapping contents. By computer simulations, we clarified that our proposal improves the cache hit ratio and reduces the hop count for delivering contents. In the future, we are going to make further discussions for improving our proposal.

References

1. Soniya, M., Kumar, K.: A survey on named data networking. In: Proceedings of the 2015 2nd International Conference on Electronics and Communication Systems (ICECS 2015), pp. 1515–1519 (2015)
2. Jacobson, V., Smetters, D., Thornton, J., Plass, M., Briggs, N., Braynard, R.: Networking named content. In: Proceedings of ACM CoNEXT 2009, pp. 1–12 (2009)
3. Chen, Q., Xie, R., Yu, F., Liu, J., Huang, T., Liu, Y.: Transport control strategies in named data networking: a survey. IEEE Commun. Surv. Tutorials 18, 2052–2083 (2016)
4. Laoutaris, N., Syntila, S., Stavrakakis, I.: Meta algorithms for hierarchical web caches. In: Proceedings of the IEEE International Performance Computing and Communications Conference (IEEE IPCCC), Phoenix, Arizona (2004)
5. Ramaswamy, L., Liu, L.: An expiration age-based document scheme for cooperative web caching. IEEE Trans. Knowl. Data Eng. 16, 585–600 (2004)
6. Arianfar, S., Nikander, P., Ott, J.: On content-centric router design and implications. In: Proceedings of the Re-Architecting the Internet Workshop (ReARCH), New York, pp. 1–6 (2010)
7. Wang, Y., Xu, M., Feng, Z.: Hop-based probabilistic caching for information-centric networks. In: IEEE Global Communications Conference (GLOBECOM 2013), Atlanta, pp. 2102–2107 (2013)
8. Psaras, I., Chai, W.K., Pavlou, G.: Probabilistic in-network caching for information-centric networks. In: Proceedings of the Second Edition of the ICN Workshop on Information-Centric Networking (ICN 2012), pp. 55–60 (2012)
9. Chai, W.K., He, D., Psaras, I., Pavlou, G.: Cache "less for more" in information-centric networks. In: Proceedings of the 11th International IFIP TC6 Conference on Networking (IFIP 2012), pp. 27–40 (2012)
10. Ji, J., Xu, M., Yang, Y.: Content-hierarchical intra-domain cooperative caching for information-centric networks. In: Proceedings of the Ninth International Conference on Future Internet Technologies (CFI), pp. 1–6 (2014)
11. Wang, J.M., Zhang, J., Bensaou, B.: Intra-AS cooperative caching for content-centric networks. In: Proceedings of the 3rd ACM SIGCOMM Workshop on Information-Centric Networking, pp. 61–66 (2013)
12. Matsuda, K., Hasegawa, G., Murata, M.: Multi-ISP cooperative cache sharing for saving inter-ISP transit cost in content centric networking. IEICE Trans. Commun. 98, 621–629 (2015)
13. Dorogovtsev, S., Mendes, J., Samukhin, A.: Structure of growing networks with preferential linking. Phys. Rev. Lett. 85, 4633–4636 (2000)
14. Breslau, L., Cao, P., Fan, L., Phillips, G., Shenker, S.: Web caching and Zipf-like distributions: evidence and implications. In: Proceedings of IEEE INFOCOM 1999, New York, pp. 126–134 (1999)

A Balanced Dissemination of Time Constraint Tasks in Mobile Crowdsourcing: A Double Auction Perspective

Jaya Mukhopadhyay1(B), Vikash Kumar Singh2, Sajal Mukhopadhyay3, and Anita Pal1

1 Department of Mathematics, National Institute of Technology, Durgapur 713209, West Bengal, India
[email protected], [email protected]
2 School of Computer Science and Engineering, Vellore Institute of Technology, Amaravati 522237, Andhra Pradesh, India
[email protected]
3 Department of Computer Science and Engineering, National Institute of Technology, Durgapur 713209, West Bengal, India
[email protected]

Abstract. Mobile crowdsourcing (MCS) has been gaining real attention in recent years, as it has found widespread applications such as traffic monitoring, pollution control surveillance, locating endangered species, and many others. This paradigm of research shows the interesting power of smart devices held by intelligent agents (such as human beings). In MCS, the outsourced tasks are executed by the task executors (intelligent agents carrying smart devices). In this paper, how overlapping tasks (with a deadline) can be disseminated in slots and leveraged as evenly as possible to the stakeholders (task executors or sellers) is addressed through scalable scheduling (interval partitioning) and an economic mechanism (double auction). It is proved that our mechanism is truthful, and it is also shown via simulation that our proposed mechanism performs better when the agents are manipulative in nature.

1 Introduction

With the unprecedented growth of smartphone users, the agents carrying smartphones can potentially serve many purposes [1, 2]. The applications range from the environment (for example, collecting data on pollution conditions in some areas), ecosystem restoration (collecting information on soon-to-be-extinct species), transportation (acquiring information on road conditions), and many more [3, 4]. In all cases, the agents collect and send the information to the task requester(s) (or buyer(s)). A system in which agents with smartphones solve such tasks is called mobile crowdsourcing (MCS)1. In this model, task requester(s) publish tasks to a platform and then the task executor(s) or seller(s) serve the tasks. The architecture of the MCS system is given in Fig. 1.

1 In the literature, mobile crowdsourcing is also termed participatory sensing.


Fig. 1. System model

MCS relies on the fact that the agents provide the data. But the fundamental questions are: why should the agents provide the data, and how can they be motivated? One view is that, in some applications, agents provide the data out of their own social urge. In many other applications, agents may be motivated only when they are given some incentives (e.g., money). There are several works devoted to incentivizing agents in MCS [5–8]. In this paper, we address how time constrained jobs are to be separated and distributed in a balanced way to the participating sellers through a double auction. The main contributions of our paper are:
• First, the submitted jobs of the task providers (buyers) are distributed into |d| slots, so that they become non-overlapping.
• Second, the jobs of each slot are assigned as evenly as possible, so that sellers are not overburdened.
• Third, a truthful mechanism is designed so that the social welfare (in the usual economic sense) is maximized.
The rest of the paper is organized as follows. In Sect. 2, related works are presented. The system model is discussed in Sect. 3. In Sect. 4, the details of the proposed algorithms are presented. In Sect. 5, the simulations are carried out. Conclusions and future works are given in Sect. 6.

2 Related Works

In this section, our emphasis is to give a brief overview of the prior works done in the field of MCS. Our discussion mainly revolves around the incentive aspects in MCS and the quality of the data provided by the participating agents. The readers can go through [3, 4, 9] in order to get an overview of the field. The whole MCS field relies mainly on the idea of collecting data from a large group of interested users (possibly common people) having some sensing devices (say, smartphones), geographically distributed around the globe. Following the above discussion, the natural question that arises in one's mind is: why should the agents provide the data by investing so much effort (CPU utilization, power consumption, etc.) and, more importantly, exposing their locations? Answering the above question, in [10] work has been done in the direction of how to influence a large number of participants to take part in the sensing process, along with the evaluation of their provided data. Several incentive schemes have been introduced in [5, 6, 8]. Unlike the incentive schemes discussed above, in [11–13] incentive compatible mechanisms are introduced for the MCS environment. Some works in the MCS environment are devoted to the quality of the data collected by the participating agents [11, 14]. However, the drawback identified in these schemes is that the quality of the data reported by the agents to the system is not taken into consideration. In [14, 15], efforts have been made in this direction by combining the quality of the data provided by the agents with their respective bids. Some works [16–18] model the MCS market as a double auction, where task requesters play the role of buyers to buy the sensed data from the crowd. In [16], a truthful double auction mechanism is proposed for sensing task allocation in an MCS system; it takes into consideration the relationship between the number of users assigned to the tasks and the utility of the task requesters. In [17], a general framework for designing truthful double auction mechanisms for dynamic mobile crowdsourcing is proposed. In [18], an approach based on max-min fairness task allocation is discussed; the utilities of the participating agents are maximized, and trusted data are gathered using incentive mechanisms. From the above discussed works, it can be seen that no work has considered the balanced dissemination of time constrained tasks to the task executors. In this paper, we have investigated this scenario and propose a double auction based mechanism coupled with a scheduling algorithm.

3 System Model

In this MCS model, we have a set of participating buyers (task providers) B = {B1, B2, ..., Bn} and a set of participating sellers (task executors) D = {D1, D2, ..., Dm}. Usually, the number of sellers is far greater than the number of buyers. The buyers submit a set of jobs J = {J1, J2, ..., Jn}, where Ji = {g_j^i} denotes the set of all jobs submitted by the ith buyer. Each job g_j^i has a start time s_j^i and a finish time f_j^i, where f_j^i ≥ s_j^i. However, two jobs j and j′ may overlap. For the overlapping case of two jobs (j and j′, with j being the first job without loss of generality), we have s_j^i ≤ s_{j′}^i ≤ f_j^i (if the two jobs are from the same buyer) or s_j^i ≤ s_{j′}^k ≤ f_j^i (if the two jobs are from different buyers). The first non-trivial goal is to determine the minimum number of slots needed to distribute all the jobs so that no two jobs in a slot overlap. To understand the meaning of a slot, consider Fig. 2: in Fig. 2a the submitted jobs (possibly overlapping) are shown, and in Fig. 2b the non-overlapping jobs are separated into three slots (d1, d2, and d3). Slots are the containers of non-overlapping jobs. The second goal is to distribute the jobs to the sellers as evenly as possible. For this, we use a double auction framework. In this problem, the task providers (buyers) and the task executors (sellers) are considered strategic in nature. The buyers are constrained by budgets. In this model, the buyers submit the jobs along with their valuations; the valuation of a job submitted by an arbitrary buyer i is denoted by β_j^i (ith buyer, jth job). β_j^i is the private information of the buyer and denotes the maximum amount the buyer can pay for his task to be executed. We will further refer to this as the bid valuation of the buyers. The sellers (task executors) also submit their bid valuations, denoted by δ_j^i. So, the bid vectors for the buyers and the sellers are denoted by β = {β_j^i} and δ = {δ_j^i}. The third goal is to ensure that the (strategic) agents do not misreport their valuations, so that the social welfare (in the usual economic sense) is maximized.


Fig. 2. Jobs configurations: (a) overlapping jobs; (b) non-overlapping jobs separated into slots

The utility of any seller is defined as the difference between the payment received by the seller and the true valuation of the seller. More formally, the utility of D_i is:

u_i^s = Σ_j p̂_j^i − Σ_j δ_j^i, if D_i wins; 0, otherwise.    (1)

Similarly, the utility of any buyer is defined as the difference between the true valuation of the buyer and the payment he pays. More formally, the utility of B_i is:

u_i^b = Σ_j β_j^i − Σ_j p_j^i, if B_i wins; 0, otherwise.    (2)

Further, in Lemma 2 it is shown that the proposed mechanism is truthful (see the definition below).

Definition 1 (Truthful). The mechanism is truthful if, for the ith buyer, the utility relation u_i^b ≥ u′_i^b holds, where u_i^b is the utility of buyer i when he reports his true bid profile vector β_j^i and u′_i^b is his utility when he reports any other bid profile vector β′_j^i ≠ β_j^i. For each seller i, truthfulness means u_i^s ≥ u′_i^s.
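For clarity, Eqs. (1) and (2) translate into the following small Python sketch; the function names are ours, not part of the paper.

def seller_utility(wins, payments_received, true_costs):
    """u_i^s: payments received minus true valuations, if the seller wins."""
    return sum(payments_received) - sum(true_costs) if wins else 0.0

def buyer_utility(wins, true_values, payments_made):
    """u_i^b: true valuations minus payments made, if the buyer wins."""
    return sum(true_values) - sum(payments_made) if wins else 0.0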


4 Proposed Mechanism

The proposed mechanism, namely Balanced distribution of time bound tasks using double auction (BDoTTuDA), is a two step process: (a) separating the jobs into d non-overlapping slots, and (b) auctioning off the set of tasks of each d_i ∈ d.

4.1 Sketch of BDoTTuDA

The BDoTTuDA can further be studied in two parts: partitioning and scheduling, and the double auction mechanism. First, the partitioning and scheduling phase of the proposed mechanism, motivated by [19], is presented. The double auction mechanism, motivated by [20], is presented next.

Algorithm 1. Partitioning and Scheduling (J)
Output: d ← φ
1:  begin
2:    x ← 0, Heap ← φ, j = 1, ℓ = 0, count = 0
3:    Ŝ = Sort(J)  /* Sort jobs based on start time */
4:    Q ← Ŝ  /* Maintain a queue for the sorted jobs */
5:    x ← Delete(Q)
6:    slot(d_j, x)  /* Place x in the jth slot */
7:    d_j ← d_j ∪ {x}
8:    d ← d ∪ d_j
9:    Heap_insert(Heap, ℓ, (x.f, j))  /* Insert element into heap */
10:   while Q ≠ φ do
11:     x ← Delete(Q)
12:     (f̂, j′) ← minimum(Heap)  /* Returns the minimum element from the heap */
13:     if x.s ≥ f̂ then
14:       slot(d_j′, x)
15:       d_j′ ← d_j′ ∪ {x}
16:       Heap_update(Heap, 0, (x.f, j′))  /* Update heap */
17:     else
18:       j = j + 1
19:       slot(d_j, x)  /* Place x in the jth slot */
20:       d_j ← d_j ∪ {x}
21:       d ← d ∪ d_j
22:       Heap_insert(Heap, ℓ, (x.f, j))  /* Insert element into heap */
23:     end if
24:   end while
25:   return d
26: end

4.1.1 Partitioning and Scheduling
The input to the partitioning and scheduling phase is the set of available jobs J. The output is the set of slots d containing the non-overlapping jobs. In Algorithm 1, Line 3 sorts the jobs based on the given start time. In Line 4, the sorted jobs are maintained


in a Queue data structure. Lines 5–8 assign a slot d1 to the first job in the queue Q. Line 9 initializes the Heap containing the finish time of the currently selected job from Q. The while loop in Lines 10–24 performs the rest of the process of scheduling jobs to particular slots by utilizing several sub-routines such as Heap_insert, Heap_update, and Min_heapify. The while loop terminates once all the jobs are assigned to their respective non-overlapping slots.
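As a reading aid, the following Python sketch condenses Algorithm 1 into the standard heap-based interval partitioning; jobs are (start, finish) tuples and the helper names are ours, not the paper's. For example, partition([(1, 3), (2, 5), (4, 7)]) returns two slots, with (1, 3) and (4, 7) sharing one of them.

import heapq

def partition(jobs):
    slots = []                                # slots[j] holds the jobs of slot j
    heap = []                                 # entries: (finish_time, slot_index)
    for s, f in sorted(jobs):                 # sort jobs by start time (Line 3)
        if heap and s >= heap[0][0]:          # earliest-finishing slot is already free
            _, j = heapq.heappop(heap)
            slots[j].append((s, f))
            heapq.heappush(heap, (f, j))      # update that slot's finish time
        else:                                 # overlap: open a new slot
            slots.append([(s, f)])
            heapq.heappush(heap, (f, len(slots) - 1))
    return slots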

Algorithm 2. Double auction mechanism (D, d_i, β, δ)
Output: A ← φ, p̂^s ← φ, p^b ← φ
1:  begin
2:    for each d_i ∈ d do
3:      B′ ← φ, D′ ← φ
4:      D ← Sort_ascending(D, D.δ_j^i)  /* Sorting based on δ_j^i ∈ δ for all D_i ∈ D */
5:      d_i ← Sort_descending(d_i, d_i.β_j^i)  /* Sorting based on β_j^i ∈ β for all d_i^j ∈ d_i */
6:      κ ← argmax_k {d_i.β_j^i − D.δ_j^i ≥ 0}
7:      for i = 1 to κ do
8:        B′ = B′ ∪ {d_i^j}
9:        D′ = D′ ∪ {D_i}
10:       A_i ← (B′, D′)
11:     end for
12:     for i = 1 to κ do
13:       if (β_i^{κ+1} + δ_i^{κ+1})/2 ≤ δ_i^κ and (β_i^{κ+1} + δ_i^{κ+1})/2 ≤ β_i^κ then
14:         p_j^i ← β_i^{κ+1}; p_i^b ← p_i^b ∪ p_j^i
15:         p̂_j^i ← δ_i^{κ+1}; p_i^s ← p_i^s ∪ p̂_j^i
16:       else
17:         p_j^i ← β_i^κ; p_i^b ← p_i^b ∪ p_j^i
18:         p̂_j^i ← δ_i^κ; p_i^s ← p_i^s ∪ p̂_j^i
19:       end if
20:     end for
21:     A ← A ∪ A_i
22:     p̂^s ← p̂^s ∪ p̂_j^i
23:     p^b ← p^b ∪ p_j^i
24:   end for
25:   return A, p̂^s, p^b
26: end

4.1.2 Double Auction Mechanism
The inputs to the double auction mechanism are the set of sellers D and the set of buyers in a slot d_i ∈ d. The output is the set of winning buyer-seller pairs held in the A_i data structure. Line 4 sorts the sellers in ascending order based on the elements of δ_j^i. Line 5 sorts the set of buyers in slot d_i in descending order based on the elements of β_j^i. Line 6 determines the largest index κ that satisfies the condition d_i.β_j^i − D.δ_j^i ≥ 0. The for loop in Lines 7–11 iterates over the κ winning buyer-seller pairs. In Line 8, the B′ data structure keeps track of all the winning buyers, and the D′ data structure keeps track of all the winning sellers. The A_i data structure in Line 10 keeps track of all the winning seller-buyer pairs. Lines 12–20 determine the payments of the winning seller-buyer pairs. Line 25 returns the allocation set A, the sellers' payments p̂^s, and the buyers' payments p^b.
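The winner determination step (Lines 4–11) can be sketched in Python as follows; this shows only the computation of the κ winning pairs, with the payment rule of Lines 12–20 omitted, and the function name is ours.

def winner_determination(buyer_bids, seller_asks):
    # Buyers sorted by descending bid, sellers by ascending ask; kappa is
    # the last index at which the buyer's bid still covers the seller's ask.
    buyers = sorted(enumerate(buyer_bids), key=lambda p: -p[1])
    sellers = sorted(enumerate(seller_asks), key=lambda p: p[1])
    pairs = []
    for (b_id, beta), (s_id, delta) in zip(buyers, sellers):
        if beta - delta >= 0:          # this pair can still trade profitably
            pairs.append((b_id, s_id))
        else:
            break                      # kappa reached; remaining pairs lose
    return pairs                       # the kappa winning buyer-seller pairs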

Lemma 1. The running time of BDoTTuDA is O(m lg m).

Proof. The running time of BDoTTuDA equals the sum of the running times of the partitioning and scheduling phase and the double auction mechanism. The running time of the partitioning and scheduling phase is O(n lg n), and the double auction mechanism takes O(m lg m) time. So, the running time of BDoTTuDA is O(n lg n) + O(m lg m) = O(m lg m), as the number of sellers is far greater than the number of buyers.

Lemma 2. BDoTTuDA is truthful.

Proof. Let us consider the case of the sellers. Fix a slot d_i ∈ d.
Case 1. Let us say that the ith winning seller misreports a bid value as δ′_j^i > δ_j^i. As the seller was winning with δ_j^i, with δ′_j^i he would keep on winning and his utility u′_i^s = u_i^s. If, instead, he reports δ′_j^i < δ_j^i, this gives rise to two cases. He can still be in the winning set; if he is in the winning set, his utility from the definition will be u′_i^s = u_i^s. If he is in the losing set, then his utility will be u′_i^s = 0 < u_i^s.
Case 2. If the ith seller was in the losing set and he reports δ′_j^i < δ_j^i, he would still belong to the losing set and his utility u′_i^s = 0 = u_i^s. If instead he reports δ′_j^i > δ_j^i, this gives rise to two cases. If he still belongs to the losing set, his utility u′_i^s = 0 = u_i^s. But if he is in the winning set, then he had to beat some valuation δ_l^k, and his payment would be p̂_j^i = δ_l^k < δ_j^i; hence his utility u′_i^s = p̂_j^i − δ_j^i = δ_l^k − δ_j^i < 0, so he would have obtained a negative utility. Hence no gain is achieved.
From the above two cases, i.e., Case 1 and Case 2, it can be concluded that no seller i can gain by misreporting his bid value. The proof considers the sellers' case; a similar road map can be followed for the buyers. This completes the proof.

Lemma 3. The number of tasks assigned to any jth slot in expectation is n/k, where k is the number of available slots and n is the number of tasks. In other words, we can say

E[Z_j] = n/k,

where Z_j is the random variable measuring the number of tasks assigned to any jth slot out of n tasks.

Proof. Fix a slot j. Our goal is to compute the expected number of tasks assigned to any given slot. The indicator random variable Z_j is used to determine the total number of tasks assigned to the jth slot, so the expected number of tasks assigned to the jth slot is E[Z_j]. Let us say we have k different slots. Now, when a task is picked up for


allocating a slot, it can be placed in any of these k different slots. So, any slot ℓ (1 ≤ ℓ ≤ k) could be the outcome of the experiment (the allocation of a slot to the task). It is to be noted that the selection of any such ℓ is equally likely. Therefore, each task t_i can be assigned to any jth slot with probability 1/k. We define an indicator random variable Z_j^i associated with the event in which task t_i is assigned to the jth slot. Thus,

Z_j^i = I{task t_i is assigned to the jth slot} = 1 if task t_i is assigned to the jth slot, and 0 otherwise.

Taking the expectation on both sides, we get E[Z_j^i] = E[I{task t_i is assigned to the jth slot}]. As always with an indicator random variable, the expectation is just the probability of the corresponding event [21]:

E[Z_j^i] = 1 · Pr{Z_j^i = 1} + 0 · Pr{Z_j^i = 0} = Pr{Z_j^i = 1} = 1/k.    (3)

Now, let us consider the random variable of interest, Z_j = Σ_{i=1}^{n} Z_j^i. The expected number of tasks assigned to any jth slot is just the expected value of this random variable, E[Z_j]. Taking the expectation on both sides, we get

E[Z_j] = E[ Σ_{i=1}^{n} Z_j^i ].    (4)

By linearity of expectation, we get

E[Z_j] = Σ_{i=1}^{n} E[Z_j^i].    (5)

On substituting the value of E[Z_j^i] from Eq. (3) into Eq. (5), we get

E[Z_j] = Σ_{i=1}^{n} 1/k = n/k.

So, one can conclude that, on average, n/k tasks will be assigned to any jth slot. Hence proved.

Observation 1. If the value of n is 100 and k is 5, then E[Z_j] = n/k = 100/5 = 20. It means that, on average, each slot will carry 20 tasks.
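Observation 1 can also be checked numerically. The following Python sketch assumes, as in the proof, that each task is assigned to one of the k slots uniformly at random, and estimates the average load of each slot.

import random
from collections import Counter

def average_load(n=100, k=5, trials=1000):
    totals = Counter()
    for _ in range(trials):
        for _task in range(n):
            totals[random.randrange(k)] += 1
    return {slot: totals[slot] / trials for slot in range(k)}

print(average_load())   # every slot averages close to n/k = 20 tasks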


5 Experiments and Results

In this section, the proposed mechanism, BDoTTuDA, is compared with a proposed benchmark mechanism (BM) that is vulnerable to manipulation. The manipulative nature of the task executors in the case of BM can be seen easily in the simulation results. It is to be noted that our BM differs only in terms of the payment rule; the allocation rule is the same as that of BDoTTuDA. As the payment rule of BM, each winning task executor is paid his reported bid value and each winning task requester pays his reported bid value. The unit of the bid values reported by the task executors and task requesters is taken as $. The simulations are done using Python.

5.1 Simulation Setup

In our setup, the experiment is run 100 times and the obtained values are plotted by taking the average over these 100 runs. The simulations are done considering the uniform distribution (UD). In the case of UD, for all the agents the bid value ranges from 100 to 200. The performance metric considered in order to compare the two mechanisms is utility. The purpose of considering the utility parameter is to verify the two mechanisms based on truthfulness.

5.2 Result Analysis

In this section, BDoTTuDA is simulated against the benchmark mechanism (depicted as BM in the figures of the simulation results). BDoTTuDA is claimed to be truthful in our setting. To present the manipulative nature of the benchmark mechanism, the bid values of a subset of the agents participating in the system are varied. It is considered that 20% of the agents (in our case, this is said to be a small deviation) manipulate their bid values by 30% of their true values (task executors increase their bid values by 30% of their true values and task requesters decrease their bid values by 30% of their true values). Similarly for the medium deviation (30% of the agents manipulate their bid values by 30% of their true values) and the large deviation (45% of the agents manipulate their bid values by 30% of their true values). In the simulation results, BM-S-Dev, BM-M-Dev, and BM-L-Dev represent the benchmark mechanism with small, medium, and large deviation, respectively. It is shown in Figs. 3 and 4 that the utility of the agents in the case of BDoTTuDA is higher than the utility of the agents in the case of the benchmark mechanism (the utility of the agents is 0 in the case of BM). This is because, in the case of BDoTTuDA, the task executors are paid more than their true valuations and the task requesters pay less than their true valuations, whereas in the case of BM the task executors are paid exactly their reported bid values and the task requesters pay exactly their reported bid values. Considering the manipulative nature of the agents, it can be seen in Figs. 3 and 4 that if the agents manipulate their bid values in the case of BM, then they gain. More formally, for the BM case, the utility of the agents is higher in the case of large



Fig. 3. Comparison of utility of task executors

deviation than in the case of medium deviation, than in the case of small deviation, than in the case of no deviation. This is because the task executors are paid their reported bid values, which are higher than their true valuations (in the case of manipulation), and the task requesters pay their reported bid values, which are lower than their true valuations (in the case of manipulation). So, it can be concluded that the more agents manipulate their bids by some fixed amount, the higher the utility of the agents will be. As the agents increase their utility by manipulation in the case of BM, we can say that BM is vulnerable to manipulation; it is not a truthful mechanism.


Fig. 4. Comparison of utility of task requesters

Another point is that, in the case of BM, if the task executors deviate (increase their bid values) by a large amount from their true values, then the task requesters will have to pay a very high value. In that case, the task requesters will not be willing to pay such a huge amount. Also, if the task requesters are constrained by some budget, then many task executors cannot be served. So, if manipulation is there in the system, even if the task executors are gaining, it is not good for the system.

6 Conclusion and Future Works

In this paper, a double auction framework is developed to distribute time bound tasks to the task executors, thereby achieving a balanced distribution. In our future work, the balanced distribution can be enforced by thresholding the number of tasks allocated to the sellers. The other two directions could be to distribute the tasks by considering the location information and the quality of the agents in our settings.

References

1. Hasenfratz, D., Saukh, O., Sturzenegger, S., Thiele, L.: Participatory air pollution monitoring using smartphones. In: Mobile Sensing: From Smartphones and Wearables to Big Data, Beijing, China, April 2012. ACM (2012)
2. Wang, J., et al.: Fine-grained multitask allocation for participatory sensing with a shared budget. IEEE Internet Things J. 3(6), 1395–1405 (2016)
3. Phuttharak, J., Loke, S.W.: A review of mobile crowdsourcing architectures and challenges: toward crowd-empowered Internet-of-Things. IEEE Access 7, 304–324 (2019)
4. Yu, R., Cao, J., Liu, R., Gao, W., Wang, X., Liang, J.: Participant incentive mechanism toward quality-oriented sensing: understanding and application. ACM Trans. Sen. Netw. 15(2), 21:1–21:25 (2019)
5. Duan, Z., Tian, L., Yan, M., Cai, Z., Han, Q., Yin, G.: Practical incentive mechanisms for IoT-based mobile crowdsensing systems. IEEE Access 5, 20383–20392 (2017)
6. Singh, V.K., Mukhopadhyay, S., Xhafa, F., Krause, P.: A quality-assuring, combinatorial auction based mechanism for IoT-based crowdsourcing. In: Advances in Edge Computing: Massive Parallel Processing and Applications, vol. 35, pp. 148–177. IOS Press (2020)
7. Mukhopadhyay, J., Singh, V.K., Mukhopadhyay, S., Pal, A.: Clustering and auction in sequence: a two fold mechanism for participatory sensing. In: Yadav, N., Yadav, A., Bansal, J.C., Deep, K., Kim, J.H. (eds.) Harmony Search and Nature Inspired Optimization Algorithms, pp. 347–356. Springer, Singapore (2019)
8. Singh, V.K., Mukhopadhyay, S., Xhafa, F., Sharma, A.: A budget feasible peer graded mechanism for IoT-based crowdsourcing. J. Ambient Intell. Humanized Comput. 11(4), 1531–1551 (2020)
9. Restuccia, F., Das, S.K., Payton, J.: Incentive mechanisms for participatory sensing: survey and research challenges. Trans. Sensor Netw. 12(2), 13:1–13:40 (2016)
10. Lee, J.S., Hoh, B.: Dynamic pricing incentive for participatory sensing. Elsevier J. Pervasive Mob. Comput. 6(6), 693–708 (2010)
11. Zhao, D., Li, X.Y., Ma, H.: How to crowdsource tasks truthfully without sacrificing utility: online incentive mechanisms with budget constraint. In: Annual IEEE International Conference on Computer Communications (INFOCOM), pp. 1213–1221 (2014)
12. Feng, Z., Zhu, Y., Zhang, Q., Ni, L.M., Vasilakos, A.V.: TRAC: truthful auction for location-aware collaborative sensing in mobile crowdsourcing, pp. 1231–1239 (2014)
13. Gao, L., Hou, F., Huang, J.: Providing long-term participation incentive in participatory sensing. arXiv preprint arXiv:1501.02480 (2015)
14. Yu, R., Liu, R., Wang, X., Cao, J.: Improving data quality with an accumulated reputation model in participatory sensing systems. Sensors 14(3), 5573–5594 (2014)
15. Bhattacharjee, J., Pal, A., Mukhopadhyay, S., Bharasa, A.: Incentive and quality aware participatory sensing system. In: 12th International Conference on Dependable, Autonomic and Secure Computing (DASC), pp. 382–387. IEEE Computer Society (2014)
16. Xu, W., Huang, H., Sun, Y.E., Li, F., Zhu, Y.: DATA: a double auction based task assignment mechanism in crowdsourcing systems. In: 2013 8th International Conference on Communications and Networking in China (CHINACOM), pp. 172–177 (2013)
17. Wei, Y., Zhu, Y., Zhu, H., Zhang, Q., Xue, G.: Truthful online double auctions for dynamic mobile crowdsourcing. In: 2015 IEEE Conference on Computer Communications (INFOCOM), pp. 2074–2082 (2015)
18. Huang, H., Xin, Y., Sun, Y., Yang, W.: A truthful double auction mechanism for crowdsensing systems with max-min fairness. In: 2017 IEEE Wireless Communications and Networking Conference (WCNC), pp. 1–6 (2017)
19. Kleinberg, J., Tardos, É.: Algorithm Design. Addison-Wesley, Boston (2006)
20. Bredin, J., Parkes, D.C.: Models for truthful online double auctions. In: 21st International Conference on Uncertainty in Artificial Intelligence, pp. 50–59. AUAI Press (2005)
21. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms. MIT Press, Cambridge (2009)

A Scheduling Method of Division-Based Broadcasting Considering Delivery Cycle

Yusuke Gotoh(B) and Keisuke Kuroda

Graduate School of Natural Science and Technology, Okayama University, Okayama, Japan
[email protected]

Abstract. Due to the recent popularization of multimedia broadcasting, such as terrestrial digital TV broadcasting and one-segment broadcasting, the technology for broadcasting continuous media data such as audio and video has become important to users. Although broadcast delivery allows many clients to receive data within a certain bandwidth, clients have to wait between the request for data and the start of delivery. In order to reduce the waiting time, division-based broadcasting, in which the server divides data into several segments and delivers them on multiple channels, has been proposed. In addition, many scheduling methods set the timing of delivering each segment in division-based broadcasting. The Wrapped Harmonic Broadcasting (WHB) method, which is a conventional scheduling method, reduces the waiting time by setting a delivery schedule that shortens the delivery cycle of data using multiple channels. However, the WHB method increases the waiting time due to the long period during which segments are not scheduled on the channel. In this paper, we propose a scheduling method that considers the delivery cycle of video data in division-based broadcasting. Our method schedules more segments to be delivered than the conventional WHB method based on the available bandwidth and consumption rate. In our simulation evaluation assuming an actual network environment, the waiting time with the proposed method is reduced by about 46.5% compared to the WHB method when the number of segments is 30, the playback rate is 5.0 Mbps, the available bandwidth is 15 Mbps, and the playing time is 180 s.

1 Introduction

Due to the recent popularization of online video streaming on the Internet, users are demanding environments in which they can watch multiple videos simultaneously [1]. There are two types of delivery methods for video data: on-demand delivery and broadcast delivery. In on-demand delivery, the server allocates the bandwidth and delivers the video based on the client's request. Therefore, as the number of clients requesting video at the same time increases, the required bandwidth of the server increases proportionally. On the other hand, in broadcasting, although the server can deliver the same video repeatedly to clients within a certain bandwidth, the clients must wait until the requested data are delivered.


In order to reduce the waiting time, we proposed division-based broadcasting, which divides video data into multiple segments and delivers them using multiple channels. In division-based broadcasting, many scheduling methods have been proposed to reduce the waiting time for receiving data [2,3]. Conventional scheduling methods reduce the waiting time by setting a delivery schedule that shortens the delivery cycle of data using multiple channels. However, these methods increase the waiting time due to the long period in which segments are not scheduled on the channel. In this paper, we propose a scheduling method that considers the delivery cycle of video data in division-based broadcasting. Our method schedules more segments to be delivered than the conventional method based on the available bandwidth and consumption rate. The remainder of this paper is organized as follows. In Sect. 2, we explain division-based broadcasting. We introduce related works in Sect. 3. We explain the conventional scheduling method considering the delivery cycle in Sect. 4. We explain our proposed method in Sect. 5 and evaluate it in Sect. 6. Finally, we conclude our paper in Sect. 7.

2 Division-Based Broadcasting

IP networks have two main types of delivery systems: Video on Demand (VoD) and broadcasting. In broadcasting systems such as multicast and broadcast, the server delivers the same content data to many clients using a constant bandwidth. Although the server can reduce the network load and the required bandwidth, clients have to wait until their desired data are broadcast. VoD systems are used for delivering many kinds of movies. Clients can watch movies from on-demand services such as Netflix [4] and Amazon Prime Video [5]. In VoD systems, the server requires adequate bandwidth and starts delivering data sequentially based on client requests. Although clients can get their desired data immediately, the server's load increases as the number of clients increases. In broadcasting systems, the server concurrently delivers data to many clients. In general broadcasting systems, since the server broadcasts data repetitively, clients must wait until their desired data are broadcast. Accordingly, various types of methods for broadcasting content data have been studied [6–9]. In content data broadcasting, clients must play the data without interruption until its end. By dividing the data into several segments and scheduling them so that clients receive one segment before playing the next, many methods can reduce the waiting time. In division-based broadcasting systems, since the waiting time is proportional to the data size of the preceding segment, we can reduce the waiting time by shortening the data size of the preceding segments. However, when the data size of the preceding segments is too small, the client cannot start the segment to be played next until it finishes playing the segment that it has already received. In this case, an interruption occurs while playing the data and the waiting time increases. Therefore, we need to consider the data size of the preceding segment.


Several methods employ division-based broadcasting, which reduces the waiting time by dividing the data into several segments and frequently broadcasting the preceding segments.

Fig. 1. Broadcast schedule under FB method

In the conventional Fast Broadcasting (FB) method [10], the broadcast bandwidth is divided into several channels. The broadcast schedule under the FB method is shown in Fig. 1. The bandwidth for each channel is equivalent to the consumption rate. In this case, the server uses three channels. In addition, the data are divided into three segments: S1 , S2 , and S3 . When the total playing time is seven min., the playing time of S1 is calculated to be one min., S2 is two min., and S3 is four min. In Fig. 1, the server broadcasts Si (i = 1, 2, 3) by broadcast channel Ci repetitively as shown. Clients can store the broadcasted segments in their buffers while playing the data and play all segments after receiving them. When clients finish playing S1 , they have also finished receiving S2 and can play S2 continuously. In addition, when they have finished playing S2 , they have also finished receiving S3 and can play S3 continuously. Since clients can receive broadcasted segments midstream, the waiting time is the same as the time needed to receive only S1 and the average waiting time is one min.
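A small Python sketch of this geometric segment sizing is given below; it assumes, as in the FB example, that with n channels segment S_i occupies a 2^(i−1)/(2^n − 1) fraction of the playing time, so the waiting time equals the playing time of S1. The function name is ours.

def fb_segments(total_playing_time, n_channels):
    # Geometric ratios 1 : 2 : 4 : ... normalized by 2^n - 1.
    denom = 2**n_channels - 1
    return [total_playing_time * 2**i / denom for i in range(n_channels)]

print(fb_segments(7 * 60, 3))   # [60.0, 120.0, 240.0] s, i.e., 1, 2, and 4 min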

3 Related Works

In Optimized Periodic Broadcast (OPB) [11], each piece of data is separated into two parts. The server uses several broadcast channels and distributes each segment on each channel. When clients finish receiving the preceding parts of the content, they begin receiving the remaining portions. Since clients can get the subsegment data in advance, waiting times can be reduced. However, the required bandwidth increases as the amount of content increases.
In Heterogeneous Receiver-Oriented Broadcasting (HeRO) [12], the server divides the data into K segments. Let J be the data size of the first segment. The data sizes of the segments are J, 2J, 2^2 J, ..., 2^(K−1) J. The server broadcasts them using K channels. However, since the data size of the Kth channel becomes half of the entire data, clients may experience waiting times with interruptions.
The Generalized Fibonacci Broadcasting (GFB) method [13] creates a broadcast schedule using the Fibonacci sequence. When the data size ratios of the first and second segments are 1 and 2, the GFB method sets the ratio of the data size of the nth segment (n ≥ 3) to the sum of the ratios of the (n−2)th and (n−1)th segments, and makes schedules for clients with several types of available bandwidth. Clients with small available bandwidths can reduce their waiting time.
The Catch and Rest (CAR) method [14] combines the scheduling using the replicated channels in HeRO with the scheduling that divides segments in the GFB method. Clients who have different bandwidths and buffer sizes use the CAR method. Clients with small buffer sizes and available bandwidths can reduce their waiting times by considering cases without receiving segments.
In Layered Internet Video Engineering (LIVE) [15], clients feed back virtual congestion information to the server to support both bandwidth sharing and transient loss protection. LIVE can minimize the total distortion of all participating video streams and maximize their overall quality.
Several researches studied reliable broadcasting in wireless networks with packet loss probability [16,17]. Other scheduling methods have investigated packet loss [18], the receiving performance of clients [19,20], and the variable bit rate (VBR) [21,22].
When a server delivers data to clients using a bandwidth that is sufficiently larger than the consumption rate and the clients play the data with buffering, the waiting time can be reduced. However, clients must have high-performance memory and processors. We need to evaluate a system in a situation where clients with various computer capabilities can simultaneously use division-based broadcasting with a scheduling method.

4 Conventional Scheduling Method Considering Delivery Cycle

The Wrapped Harmonic Broadcasting (WHB) method [23], which is a conventional scheduling method, reduces the waiting time by setting a delivery schedule that shortens the delivery cycle using multiple channels. In this method, when the video data is divided into segments S_i (i = 1, 2, ..., n), the ratio of the data sizes of the segments is 1 : 1/2 : ... : 1/n. The server determines the delivery cycle based on the data size of S1 and schedules the segments in the order S1, S2, ..., Sn. The available bandwidth of each channel is equal to the consumption rate.


When the remaining schedulable data size on a channel is less than that of the segment S_i to be scheduled, the server sets up a new channel and schedules S_i there. An example of a WHB delivery schedule is shown in Fig. 2. The number of segments is 6, the playing time is 60 s, and the consumption rate is 5.0 Mbps. First, the server divides the data so that the ratio of the data sizes of S1, S2, ..., S6 is 1 : 1/2 : ... : 1/6. Since the server has no space for scheduling S2 on C1, it sets up a new channel C2 and schedules S2. Similarly, after scheduling S3 on C2, it schedules S4 and S5 on a new channel C3 and S6 on C2, respectively. In this case, the necessary bandwidth is 5.0 × 3 = 15 Mbps. The waiting time is the same as the broadcast cycle of S1, i.e., 60 × 120/294 ≈ 24.5 s.
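This waiting time can be reproduced with the following Python sketch, which only encodes the harmonic segment sizes 1 : 1/2 : ... : 1/n of the WHB method described above; the function name is ours.

def whb_waiting_time(playing_time, n_segments):
    harmonic = sum(1.0 / i for i in range(1, n_segments + 1))
    return playing_time / harmonic      # playing time (and broadcast cycle) of S1

print(round(whb_waiting_time(60, 6), 1))   # about 24.5 s, as in Fig. 2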

Fig. 2. Example of WHB method

In the WHB method, the server does not schedule anything for 13.5 s on C3. The proposed method can further reduce the broadcast cycle and the waiting time by scheduling segments during this time period.

5 Proposed Method

In this paper, we propose a scheduling method that considers the delivery cycle of video data in division-based broadcasting. The proposed method schedules more segments to be delivered than the conventional WHB method based on the available bandwidth and consumption rate.

5.1 Assumed Environment

The environment for division-based broadcasting assumed in this paper is as follows:
• The available bandwidth of each channel is equal to the consumption rate.
• The server can schedule multiple segments of video data on each channel.
• The server can broadcast video data from multiple channels simultaneously.
• The client can receive from all channels simultaneously.
• The client has a sufficient buffer to store video data.

5.2 Scheduling Procedure

The scheduling procedure of our proposed method is:
1. Based on the available bandwidth B and consumption rate r, the number of available channels m is calculated by m = ⌊B / r⌋.
2. The number of segments n of the video data S_i is calculated using the following conditions: Σ_{i=1}^{n} (1/i) ≤ m and Σ_{i=1}^{n+1} (1/i) > m.

Fig. 3. Example of broadcast schedule under proposed method

92

5.3

Y. Gotoh and K. Kuroda

Implementation

Figure 3 shows an example of scheduling under the proposed method. The available bandwidth is 15 Mbps, the consumption rate is 5.0 Mbps, and the playing time is 30 s. In step 1, the server calculates the number of channels 15  = 3. In step 2, the server calculates 10 as n, which is N with as m =  5.0 n 1 n+1 1 i=1 i ≤ m and N with i=1 i > m. In step 3, the server divides the segment 1 . In step so that the ratio of the data size of S1 , S2 , · · · , S10 is 1 : 12 : · · · : 10 4, the server calculates the delivery cycle T = 20.5 of S1 and schedules S1 to C1 . In step 5, for the scheduling of Si (i = 2, · · · , 10), the server schedules Si to C3 if the data size that can be scheduled in C2 is less than Si . In Fig. 3, after scheduling S2 , · · · , and S5 to C3 , the server schedules S6 to C2 because the data size for scheduling in C2 exceeds S6 . Finally, the server schedules S7 , · · · , and S10 to C3 . In Fig. 3, the waiting time under the proposed method is 20.5 s and 24.5 s under the WHB method. Therefore, the waiting time with our proposed method is reduced by about 16.3% compared to the WHB method.

6 Evaluation

We evaluated the performance of the proposed method using computer simulations. In the evaluation, we compared the waiting time of the proposed method and the WHB method based on the change in the playing time of the video data.

Fig. 4. Average waiting time and playing time

6.1 Waiting Time and Playing Time

We evaluate the waiting time as the playing time changes. The evaluation result is shown in Fig. 4; the horizontal axis is the playing time and the vertical axis is the waiting time. The number of segments is 30, the consumption rate is 5.0 Mbps, and the available bandwidth is 15 Mbps. In Fig. 4, as the playing time lengthens, the broadcast cycle increases and the waiting time grows accordingly. In addition, the waiting time under the proposed method is shorter than under the WHB method: the proposed method reduces the broadcast cycle by scheduling more segments and thus shortens the waiting time. Since the proposed method schedules many segments in the time periods that the WHB method leaves idle, the number of channels can also be reduced. For example, when the playing time is 180 s, the waiting time under the proposed method is 13.0 s and that under the WHB method is 24.3 s, i.e., the proposed method reduces the waiting time by 46.5% compared with the WHB method.

7 Conclusion

In this paper, we proposed a scheduling method considering the delivery cycle of video data in division-based broadcasting. The proposed method schedules more segments to be delivered than the conventional WHB method based on the available bandwidth and consumption rate. In our simulation evaluation assuming an actual network environment, the waiting time of the proposed method was reduced by about 46.5% compared with the WHB method when the number of segments was 30, the playback rate was 5.0 Mbps, the available bandwidth was 15 Mbps, and the playing time was 180 s. In the future, we will propose and evaluate a scheduling method that considers the division-based broadcasting of multiple videos.

Acknowledgement. This work was supported by JSPS KAKENHI Grant Number 18K11265. In addition, this work was commissioned by the Initiative for Life Design Innovation and the Telecommunications Advancement Foundation.

References
1. WHITE PAPER Information and Communications in Japan (2019). https://www.soumu.go.jp/johotsusintokei/whitepaper/eng/WP2019/2019-index.html
2. Gotoh, Y., Kimura, A.: Implementation and evaluation of division-based broadcasting system for webcast. J. Digital Inf. Manag. (JDIM) 13(4), 234–246 (2015)
3. Ozaki, T., Gotoh, Y.: Implementation and evaluation of hybrid broadcasting system for webcasts. Int. J. Web Grid Serv. (IJWGS) 14(3), 288–304 (2018)
4. Netflix. https://www.netflix.com/
5. Amazon Prime Video. https://www.amazon.com/gp/prime/
6. Gotoh, Y., Yoshihisa, T., Kanazawa, M., Takahashi, Y.: A broadcasting scheme for selective contents considering available bandwidth. IEEE Trans. Broadcast. 55(2), 460–467 (2009)


7. Jinsuk, B., Jehan, F.P.: A tree-based reliable multicast scheme exploiting the temporal locality of transmission errors. In: Proceedings of IEEE International Performance, Computing, and Communications Conference (IPCCC 2005), pp. 275–282 (2005)
8. Viswanathan, S., Imielinski, T.: Pyramid broadcasting for video on demand service. In: Proceedings of Multimedia Computing and Networking Conference (MMCN 1995), vol. 2417, pp. 66–77 (1995)
9. Zhao, Y., Eager, D.L., Vernon, M.K.: Scalable on-demand streaming of non-linear media. In: Proceedings of IEEE INFOCOM, vol. 3, pp. 1522–1533 (2004)
10. Juhn, L., Tseng, L.: Fast data broadcasting and receiving scheme for popular video service. IEEE Trans. Broadcast. 44(1), 100–105 (1998)
11. Juhn, L.-S., Tseng, L.M.: Harmonic broadcasting for video-on-demand service. IEEE Trans. Broadcast. 43(3), 268–271 (1997)
12. Bagouet, O., Hua, K.A., Oger, D.: A periodic broadcast protocol for heterogeneous receivers. In: Proceedings of SPIE, vol. 5019, pp. 220–231 (2003)
13. Yan, E.M., Kameda, T.: An efficient VOD broadcasting scheme with user bandwidth limit. In: Proceedings of ACM Multimedia Computing and Networking, vol. 5019, pp. 200–208 (2003)
14. Lin, C.-T., Ding, J.-W.: CAR: a low latency video-on-demand broadcasting scheme for heterogeneous receivers. IEEE Trans. Broadcast. 52, 336–349 (2006)
15. Zhu, X., Pan, R., Dukkipati, N., Subramanian, V., Bonomi, F.: Layered internet video engineering (LIVE): network-assisted bandwidth sharing and transient loss protection for scalable video streaming. In: Proceedings of IEEE INFOCOM, pp. 226–230 (2010)
16. Xiao, W., Agarwal, S., Starobinski, D., Trachtenberg, A.: Reliable wireless broadcasting with near-zero feedback. In: Proceedings of IEEE INFOCOM, pp. 2543–2551 (2010)
17. Fountoulakis, N., Huber, A., Panagiotou, K.: Reliable broadcasting in random networks and the effect of density. In: Proceedings of IEEE INFOCOM, pp. 2552–2560 (2010)
18. Mahanti, A., Eager, D.L., Vernon, M.K., Sundaram-Stukel, D.: Scalable on-demand media streaming with packet loss recovery. IEEE/ACM Trans. Netw. 11(2), 195–209 (2003)
19. Tantaoui, M., Hua, K., Do, T.: BroadCatch: a periodic broadcast technique for heterogeneous video-on-demand. IEEE Trans. Broadcast. 50(3), 289–301 (2004)
20. Shi, L., Sessini, P., Mahanti, A., Li, Z., Eager, D.L.: Scalable streaming for heterogeneous clients. In: Proceedings of ACM Multimedia, pp. 22–27 (2006)
21. Saparilla, D., Ross, K.W., Reisslein, M.: Periodic broadcasting with VBR-encoded video. In: Proceedings of IEEE INFOCOM 1999, vol. 2, pp. 464–471 (1999)
22. Janakiraman, R., Waldvogel, M., Xu, L.: Fuzzycast: efficient video-on-demand over multicast. In: Proceedings of IEEE INFOCOM 2002, pp. 920–929 (2002)
23. Wang, X., Cai, G., Men, J.: Wrap harmonic broadcasting and receiving scheme for popular video service. IEEE Trans. Broadcast. (Early Access), 1–10 (2019)

A Simply Implementable Architecture for Broadcast Communication Environments
Tomoki Yoshihisa(B)
Cybermedia Center, Osaka University, Ibaraki, Osaka 567-0047, Japan
[email protected]

Abstract. With the recent increasing attention to Internet of Things (IoT) devices and cyber-physical systems, highly useful information is being extracted from large amounts of data. There are various situations in which such information is shared by a large number of user clients, such as distributing traffic information to smart cars or disaster information to smartphones in disaster-affected areas. In such situations, by distributing the shared data to a large number of user clients at once, the distribution server can deliver the data faster than by delivering it to each client individually. However, in conventional computing environments, delivering the data to each client individually is the primary communication method, and the communication overhead associated with broadcast delivery increases. Therefore, in this study, we focus on a computing environment based on broadcast distribution. In particular, we propose a computing environment based on broadcast distribution via radio broadcasting and discuss its effectiveness.

1 Introduction
Attention to the super-smart society is increasing, and the number of user clients connected to the Internet, such as smartphones and smart cars, is growing rapidly. At the same time, with the recent increasing attention to Internet of Things (IoT) devices and cyber-physical systems, a large amount of data is generated continuously from various things. By analyzing these large amounts of data, we can extract highly useful information. For example, the following applications can be considered.

• A super-smart society provides the locations (traffic information) of other cars and pedestrians. While driving, a smart car displays real-time surrounding traffic information. Drivers who learn that there are many people along the scheduled route change to a route with fewer people to avoid unexpected accidents.
• In the future, rescue robots will play an active role. Immediately after an earthquake, a robot requests relevant data to confirm the disaster information. Even if many robots check frequently, they can obtain the data immediately.



When the above applications are implemented with existing technology, notification takes time when a large amount of data is generated, and responses take time when there are many data requests. As mentioned above, data shared by a large number of user clients can be delivered at high speed by delivering it to all of them at once. However, in conventional computing environments, individual distribution is the primary communication method, and the communication overhead associated with simultaneous delivery increases. Although broadcast-type distribution over the Internet exists, such as IP multicast and IP broadcast, broadcast storms in which packets loop back and forth between communication devices are likely to occur, and such use is mostly prohibited. Therefore, in this research, we focus on a computing environment based on simultaneous distribution, and we model and propose a computing environment based on simultaneous (broadcast-type) distribution via radio broadcasting. In radio broadcasting, the processing load associated with simultaneous distribution does not depend on the number of destinations, and data can be transmitted at once to all user clients within the radio wave reach range. In this paper, we discuss the effectiveness of the proposed computing environment.

2 Related Work
Some research and development on broadcast-type distribution has been carried out. The Broadcast Computing Research Group of the Information Processing Society of Japan advocates computing that uses broadcast-type distribution itself, but it has not yet proposed a concrete computing environment [1]. Hironaka et al. are researching and developing a Hybridcast system that distributes access destinations via radio broadcasts, then downloads and displays the web page of the destination over the Internet [2]. However, it cannot broadcast data according to the user's situation. The authors have proposed how to create a broadcast schedule when users use multiple data items [3]. However, only predetermined data with a fixed broadcast schedule can be broadcast. Performance measurements, such as the time it takes to complete delivery when distributing data used by multiple computers at once, are reported in [4]. That work assumes broadcast-type distribution via the Internet rather than radio broadcasting, but it serves as a reference in this study for confirming the effectiveness of broadcast-type distribution.

3 Modeling

3.1 Assumed System
Figure 1 shows a model of a computing environment based on broadcast-type distribution. The user client is connected to the Internet and can receive the data distributed (broadcast) all at once by radio wave broadcasting from the broadcasting facility. The


Fig. 1. A broadcast and communication environment

data received by the broadcast is automatically stored in the receiving buffer of the user client. If data it needs is missing, the user client sends a reception request to the data holding terminal. The data holding terminal transmits the data it holds over the Internet or broadcasts it via radio wave broadcast; not all the data are broadcast all the time. When broadcasting, the data holding terminal transmits the data and the broadcast range (the user clients using the broadcast data) to the broadcast server. The broadcast server manages several broadcasting facilities, and data can be broadcast using each of them. Data holding terminals and user clients can obtain the radio wave reach range and the broadcast bandwidth by querying the broadcast server. For example, a smart car is a user client, a traffic information data server is a data holding terminal, the server of a telecommunications company is a broadcast server, and a 5G base station is a broadcasting facility.

3.2 Application Example
We show an example in which broadcast-type distribution can deliver data faster than sending it over the Internet. In the heavy rain disaster in the Kanto region on June 24, 2014, there were 11,904 accesses to the relevant website. The amount of data on the home page is 860 KB, so a total of about 10 GB of data is transmitted. If the effective communication speed of the server hosting the home page is 100 Mbps, it will take about 13 min and 20 s to


Fig. 2. A data flow diagram for our proposed broadcast communication environments

send the data of the home page to all clients. On the other hand, the communication speed of terrestrial digital television broadcasting by the ISDB-T method is 23 Mbps (all 13 segments). Although the broadcast range varies depending on the radio tower, Tokyo Skytree covers 14,000 households in the Kanto region. The time it takes to broadcast the home page data to the user clients in the Kanto region is about 0.3 s. Even comparing only the time it takes to send the data, the delivery time can be shortened significantly.
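This comparison can be reproduced with back-of-envelope arithmetic. The following Python sketch uses only the figures quoted above; the small difference from the rounded numbers in the text comes from using the exact access count rather than the 10 GB total.

accesses   = 11904
page_bytes = 860e3          # 860 KB per home page
server_bps = 100e6          # 100 Mbps effective server speed
bcast_bps  = 23e6           # 23 Mbps, ISDB-T (all 13 segments)

unicast_s   = accesses * page_bytes * 8 / server_bps  # one copy per access
broadcast_s = page_bytes * 8 / bcast_bps              # one copy for everyone

print(round(unicast_s))       # ~819 s; with the 10 GB rounding, 800 s = 13 min 20 s
print(round(broadcast_s, 1))  # ~0.3 s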

4 Evaluation
Many users recently enjoy videos provided by video-on-demand services such as YouTube or Hulu. This means that one of the applications for distributing data to many clients is video distribution. In this section, we evaluate our proposed environment under the situation in which the server distributes videos.

5 Evaluation Setting
For video distribution, the video data are generally divided into segments that are broadcast according to a created broadcast schedule, and there are several methods to create such a schedule. The main drawbacks of video distribution are the waiting time and the interruption time. The waiting time runs from the time a client requests to receive the broadcast data to the time the client starts playing the data. The interruption time is the time elapsed while video playback is interrupted. To reduce both, the broadcast schedule is created so that the preceding segments are broadcast frequently: because the chance to receive the preceding segments increases, the possibility that clients can receive the segments before playing them increases, and thus the waiting time and the interruption time are reduced. In this evaluation, we compare three methods under the broadcast communication environments. The first one is the sequential method, in which the server broadcasts segments from the beginning to the end sequentially. When the server finishes broadcasting


Fig. 3. Average waiting time and the number of segments

Fig. 4. Average interruption time and the number of segments

the final segment, it starts broadcasting the first segment again and keeps broadcasting segments cyclically. This is the simplest and most conventional method. The next one is the binary method. In this method, the first half of the video data is broadcast several times, and after that, the last half of the video data is broadcast once. Since


the segments included in the first half are broadcast more frequently than those in the last half, the waiting time and the interruption time can be reduced. The last one is the parallel method, in which the server creates several logical broadcast channels and broadcasts segments cyclically on each of them. By dividing the video data so that the broadcasting time of the preceding segments becomes shorter than that of the others, the frequency with which the preceding segments are broadcast increases, and both times can be reduced.
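To illustrate why the waiting time falls as the number of segments grows, the following Monte Carlo sketch estimates the average waiting time of the sequential method under a simplified model we assume here: a client arrives at a uniformly random phase of the broadcast cycle, buffers everything it receives, and starts playback once S1 has been received in full. This is our illustration, not the evaluation code of the paper.

import random

def sequential_wait(n_segments, bcast_time_s=5.0, trials=100_000):
    # Average waiting time of the sequential method under the
    # simplified model described above.
    seg = bcast_time_s / n_segments            # broadcast time of one segment
    total = 0.0
    for _ in range(trials):
        t = random.uniform(0.0, bcast_time_s)  # arrival phase; S1 airs in [0, seg)
        total += (bcast_time_s - t) + seg      # wait for the next S1, then receive it
    return total / trials

for n in (1, 5, 10, 20):
    print(n, round(sequential_wait(n), 2))
# The expected value is b/2 + b/n, which falls toward b/2 = 2.5 s
# as the number of segments grows.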

6 Evaluation Results
Figure 3 shows the average waiting time under our proposed broadcast communication environments. The duration of the video data is 1 min, and the time needed to broadcast the data is 5 s. The horizontal axis is the number of segments; for the binary method, it indicates the number of times the first half of the video data is broadcast, and for the parallel method, the number of logical channels equals the number of segments. The vertical axis is the average waiting time from the time the data are requested until the video data start playing. From this result, we can see that the average waiting time decreases as the number of segments increases, because the chance to receive the first segments increases. However, when the number of segments is excessively large, interruption time arises, as shown in Fig. 4, whose horizontal axis is the same as in Fig. 3 and whose vertical axis is the average interruption time. Under the sequential method, no interruption occurs because the server broadcasts the segments sequentially. Under the binary method, interruption occurs when the number of segments is larger than 5; under the parallel method, when it is larger than 10. The parallel method tolerates more segments than the binary method because the server broadcasts segments in parallel: the interval at which each segment is broadcast is shorter than under the binary method, so the number of segments at which interruption first occurs becomes larger.

7 Discussion

7.1 Designation of Broadcast Area
In the proposed computing environment, data can be broadcast according to a designated broadcast range. Data can be delivered at high speed by broadcasting only from the broadcasting facilities whose radio wave reach range covers the geographic range in which the requesting user clients and the notification-destination user clients exist. As one way to realize this, the data holding terminal first identifies the broadcasting equipment whose radio wave reach range includes the geographic target range, and transmits the identifier of the data to be broadcast, together with the broadcasting equipment to be used, to the broadcast server that manages that equipment. Next, the broadcast server broadcasts the data and the geographic target range using the specified broadcasting facility. When broadcasting, the server determines


the timing (the broadcasting equipment schedule) in cooperation with other broadcast servers, taking into account broadcasting facilities with overlapping radio wave reach ranges and other data waiting to be broadcast. Finally, the user client stores the received data when its position is within the geographic target range (Fig. 2).
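A minimal sketch of this final filtering step, under the simplifying assumption that the geographic target range is a circle given by a center and a radius (the paper does not fix the shape of the range); all names and the example coordinates are ours:

import math
from dataclasses import dataclass

KM_PER_DEG = 111.32  # approximate km per degree of latitude

@dataclass
class GeoRange:
    lat: float        # center latitude  (deg)
    lon: float        # center longitude (deg)
    radius_km: float  # radius of the target range

def should_store(client_lat: float, client_lon: float, rng: GeoRange) -> bool:
    # Keep received broadcast data only if the client lies inside the
    # geographic target range (equirectangular distance approximation).
    dx = (client_lon - rng.lon) * KM_PER_DEG * math.cos(math.radians(rng.lat))
    dy = (client_lat - rng.lat) * KM_PER_DEG
    return math.hypot(dx, dy) <= rng.radius_km

# Hypothetical example: a range roughly covering central Osaka.
osaka = GeoRange(lat=34.69, lon=135.50, radius_km=30.0)
print(should_store(34.82, 135.52, osaka))  # True: the client is ~15 km away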

8 How to Decide Broadcast Data
When the Internet's communication bandwidth is large, it may be faster to send data over the Internet than to broadcast it; notification and response times can be shortened by broadcasting only when broadcasting is the faster option. Therefore, the data holding terminal determines the delivery method that can respond fastest by comparing transmission via the Internet with delivery via radio wave broadcast, taking into account the user clients' locations, the communication speed, and the broadcast bandwidth. When sending data in response to user client requests, the number of requests is also taken into account: basically, when the number of requests is large and the communication bandwidth is small, a larger broadcast bandwidth makes delivery via radio wave broadcast respond faster. However, the data to be broadcast must first be transmitted to the broadcast server, so if these conditions fluctuate significantly over time, it may still be faster to send over the Internet.
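This decision rule can be written as a simple time comparison; in the sketch below, the overhead term stands for the time needed to first transmit the data to the broadcast server, and all names are ours, not a fixed interface of the paper:

def faster_delivery(data_bits, n_requests, comm_bps, bcast_bps,
                    upload_overhead_s=0.0):
    # Pick the delivery method with the shorter estimated response time
    # (a simplified reading of the rule in Sect. 8).
    internet_s  = data_bits * n_requests / comm_bps          # one copy per request
    broadcast_s = data_bits / bcast_bps + upload_overhead_s  # one copy for all
    return "broadcast" if broadcast_s < internet_s else "internet"

print(faster_delivery(6.9e6, 10000, 100e6, 23e6, upload_overhead_s=1.0))
# -> "broadcast": many requests amortize the single transmission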

9 Conclusion
In this study, we proposed and modeled a computing environment based on broadcast-type distribution and discussed its effectiveness. In the future, we plan to devise the details of effective processing and communication methods.

Acknowledgments. This work was partially supported by JSPS KAKENHI Grant Numbers JP18K11316 and JP20H00584, and by the G-7 Scholarship Foundation.

References
1. IPSJ Broadcast and Communication Computing Group. https://www.ipsj.or.jp/sig/bccgr/
2. Hironaka, Y., Majima, K., Kai, K., Sunasaki, S.: Broadcast metadata providing system for hybridcast. In: Proceedings of IEEE International Conference on Consumer Electronics, pp. 328–329 (2015)
3. Yoshihisa, T., Nishio, S.: A division-based broadcasting method considering channel bandwidths for NVoD services. IEEE Trans. Broadcast. 59(1), 62–71 (2013)
4. Sindoori, K.B.A., Kathikeyan, L., Sivakumar, S., Abirami, G., Durai, R.B.: Multiservice product comparison system with improved reliability in big data broadcasting. In: Proceedings of IEEE International Conference on Science Technology Engineering & Management (ICONSTEM), pp. 48–53 (2017)

Assessment of Available Edge Computing Resources in SDN-VANETs by a Fuzzy-Based System Considering Trustworthiness as a New Parameter
Ermioni Qafzezi1(B), Kevin Bylykbashi1, Phudit Ampririt1, Makoto Ikeda2, Leonard Barolli2, and Makoto Takizawa3
1 Graduate School of Engineering, Fukuoka Institute of Technology (FIT), 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
[email protected], [email protected], [email protected]
2 Department of Information and Communication Engineering, Fukuoka Institute of Technology (FIT), 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
[email protected], [email protected]
3 Department of Advanced Sciences, Faculty of Science and Engineering, Hosei University, 3-7-2, Kajino-machi, Koganei-shi, Tokyo 184-8584, Japan
[email protected]

Abstract. In this paper, we propose a fuzzy-based system to determine the processing capability of neighboring vehicles in Software Defined Vehicular Ad hoc Networks (SDN-VANETs), considering vehicle trustworthiness as a new parameter. The computational, networking and storage resources of vehicles comprise the Edge Computing resources in a layered Cloud-Fog-Edge architecture. A vehicle which needs additional resources to complete certain tasks and process various data can use the resources of the neighboring vehicles if the requirements to realize such operations are fulfilled. We propose a new fuzzy-based system to assess the processing capability of each neighbor and, based on the final value, we can determine whether the edge layer can be used by the vehicles in need. The proposed system takes into consideration the available resources of the neighbors, their trustworthiness value and the predicted contact duration between them and the vehicle. Our system also takes into account the neighbors' willingness to share their resources and determines the processing capability of each neighbor. We evaluate the system by computer simulations. Helpful neighbors are the trustworthy ones that are predicted to be within the vehicle's communication range for a while and have a medium/large amount of available resources.

1 Introduction

The long distances separating homes and workplaces/facilities/schools, as well as the traffic present along these routes, make people spend a significant amount of time in vehicles. Thus, it is important to offer drivers and passengers ease of


driving, convenience, efficiency and safety. This has led to the emergence of Vehicular Ad hoc Networks (VANETs), where vehicles are able to communicate and share important information among themselves. VANETs are a relevant component of Intelligent Transportation Systems (ITS), which offer more safety and better transportation. VANETs are capable of offering numerous services, such as road safety, enhanced traffic management, and travel convenience and comfort. To achieve road safety, emergency messages must be transmitted in real time, as must the actions that should accordingly be taken to avoid potential accidents. Thus, it is important for the vehicles to always have available connections to the infrastructure and to other vehicles on the road. On the other hand, traffic efficiency is achieved by managing traffic dynamically according to the situation and by avoiding congested roads, whereas comfort is attained by providing in-car infotainment services. Advances in vehicle technology have made it possible for vehicles to be equipped with various forms of smart cameras and sensors, wireless communication modules, and storage and computational resources. As more and more of these smart cameras and sensors are incorporated in vehicles, massive amounts of data are generated from monitoring the on-road and in-board status. This exponential growth of generated vehicular data, together with the boost in the number of vehicles and the increasing data demands from in-vehicle users, has led to a tremendous amount of data in VANETs [14]. Moreover, applications like autonomous driving require even more storage capacity and complex computational capability. As a result, traditional VANETs face huge challenges in meeting such essential demands of the ever-increasing advancement of VANETs. The integration of Cloud-Fog-Edge Computing in VANETs is the solution to handle complex computation and provide mobility support, low latency and high bandwidth. Each of them serves different functions, but they also complement each other in order to enhance the performance of VANETs. Even though the integration of Cloud, Fog and Edge Computing in VANETs solves significant challenges, this architecture lacks the mechanisms needed for resource and connectivity management because the network is controlled in a decentralized manner. The prospective solution to these problems is the augmentation of Software Defined Networking (SDN) in this architecture. SDN is a promising choice for managing complex networks with minimal cost and providing optimal resource utilization. SDN offers global knowledge of the network with a programmable architecture, which simplifies network management in such extremely complicated and dynamic environments as VANETs [13]. In addition, it increases flexibility and programmability in the network by simplifying the development and deployment of new protocols and by bringing awareness into the system, so that it can adapt to changing conditions and requirements, i.e., emergency services [5]. This awareness allows an SDN-VANET to make better decisions based on the combined information from multiple sources, not just the individual perception of each node.


In previous works, we have proposed an intelligent approach to manage the cloud-fog-edge resources in SDN-VANETs using fuzzy logic. We presented a cloud-fog-edge layered architecture coordinated by an intelligent system that decides the appropriate resources to be used by a particular vehicle in need of additional computing resources. The proposed system was implemented in the SDN Controller (SDNC) and in the vehicles equipped with an SDN module [9–12]. The main objective was to achieve better management of these resources. The appropriate resources to be used by the vehicle were decided by considering the vehicle's relative speed with its neighbors, the number of neighbors, the time-sensitivity and the complexity of the task to be accomplished. In another recent work [8], we proposed a fuzzy-based system that could assess the edge computing resources by considering the available resources of the neighboring vehicles. We determined the processing capability for each neighbor separately, so helpful neighbors could be discovered. In this work, we include the neighbors' trustworthiness to better assess their processing capability, or in other words, to attain a trustworthy processing capability value. If the neighbors are not trustworthy and do not have sufficient resources to process the data and complete the tasks, the resources to be used by the vehicle are those of the fog or the cloud. The remainder of the paper is organized as follows. In Sect. 2, we present an overview of Cloud-Fog-Edge SDN-VANETs. In Sect. 3, we describe the proposed fuzzy-based system. In Sect. 4, we discuss the simulation results. Finally, conclusions and future work are given in Sect. 5.

2 Cloud-Fog-Edge SDN-VANETs

While cloud, fog and edge computing in VANETs offer scalable access to storage, networking and computing resources, SDN provides higher flexibility, programmability, scalability and global knowledge. In Fig. 1, we give a detailed structure of this novel VANET architecture. It includes the topology structure, its logical structure and the content distribution on the network. As it is shown, it consists of Cloud Computing data centers, fog servers with SDNCs, roadside units (RSUs), RSU Controllers (RSUCs), Base Stations and vehicles. We also illustrate the infrastructure-to-infrastructure (I2I), vehicle-to-infrastructure (V2I), and vehicle-to-vehicle (V2V) communication links. The fog devices (such as fog servers and RSUs) are located between vehicles and the data centers of the main cloud environments. The safety applications data generated through in-board and on-road sensors are processed first in the vehicles as they require real-time processing. If more storing and computing resources are needed, the vehicle can request to use those of the other adjacent vehicles, assuming a connection can be established and maintained between them for a while. With the vehicles having created multiple virtual machines on other vehicles, the virtual machine migration must be achievable in order to provide continuity as one/some vehicle may move out of the communication range. However, to set-up virtual machines on the nearby


Fig. 1. Logical architecture of cloud-fog-edge SDN-VANET with content distribution.

vehicles, multiple requirements must be met and when these demands are not satisfied, the fog servers are used. Cloud servers are used as a repository for software updates, control policies and for the data that need long-term analytics and are not delay-sensitive. On the other side, SDN modules which offer flexibility and programmability, are used to simplify the network management by offering mechanisms that improve the network traffic control and coordination of resources. The implementation of this architecture promises to enable and improve the VANET applications such as road and vehicle safety services, traffic optimization, video surveillance, telematics, commercial and entertainment applications.

3 Proposed Fuzzy-Based System

In this section, we present our proposed fuzzy-based system. A vehicle that needs storage and computing resources for a particular application can use those of neighboring vehicles, fog servers or cloud data centers based on the application requirements. For instance, for a temporary application that needs real-time processing, the vehicle can use the resources of adjacent vehicles if the requirements to realize such operations are fulfilled. Otherwise, it will use the resources of fog servers, which offer low latency as well. Whereas real-time applications require the usage of edge and fog layer resources, for delay-tolerant applications vehicles can use the cloud resources, as these applications do not require low latency. The proposed system is implemented in the SDNC and in the vehicles which are equipped with SDN modules. If a vehicle does not have an SDN module, it


Fig. 2. Proposed system structure.

sends the information to the SDNC, which sends back its decision. The system uses the beacon messages received from the adjacent vehicles to extract information such as their current position, velocity, direction, available computing power, available storage and trustworthiness, and based on the received data, the processing capability of each adjacent vehicle is decided. The structure of the proposed system is shown in Fig. 2. For the implementation of our system, we consider four input parameters: Predicted Contact Duration (PCD), Available Computing Power (ACP), Available Storage (AS) and Vehicle Trustworthiness (VT) to determine the Neighbor i Processing Capability (NiPC).

PCD: In a V2V communication, the duration of the communication session is important, since it determines the amount of data to be exchanged and the services that can be performed. A vehicle which needs additional resources will create virtual machines on the neighbors that are willing to lend their resources; therefore, the contact duration becomes even more important, since much more time is needed to accomplish these tasks than to perform a simple data exchange.

ACP: Vehicles might be using their computing power for their own applications, but a reserved amount can be allocated to help other vehicles in need to complete certain tasks. Vehicles let their neighbors know that they are willing to share their resources and how much they want to share. In other words, they decide the number of physical processor cores and the amount of memory that a particular vehicle can use.

VT: It is important to consider trustworthiness, as it can help to make better assessments of the neighbors' helpfulness. Trust is defined as the ratio of successfully accomplished tasks to the number of tasks a vehicle is asked to help with. A trustworthy vehicle is one that has been given a high trust value by other vehicles and has been helpful to other vehicles in the network.


Fig. 3. Membership functions.

AS: The neighbors should have a specific amount of storage so that the vehicle can run the virtual machines. This storage is also used to store data after completing specific tasks out of all the tasks these neighbors are asked to accomplish.

NiPC: The output parameter takes values between 0 and 1, with the value 0.5 working as a border to determine whether a neighbor is capable of helping the vehicle. NiPC ≥ 0.5 means that neighbor i meets the required conditions to help the vehicle complete its tasks.

We consider fuzzy logic to implement the proposed system because our system parameters are not correlated with each other. Having three or more parameters which are not correlated with each other results in a non-deterministic polynomial-time hard (NP-hard) problem, and fuzzy logic can deal with such problems. Moreover, we want our system to make decisions in real time, and fuzzy systems give very good results in decision making and control problems [1–4,6,7,15,16]. The input parameters are fuzzified using the membership functions shown in Fig. 3(a), (b), (c) and (d), and Fig. 3(e) shows the membership functions used for the output parameter. We use triangular and trapezoidal membership


functions because they are suitable for real-time operation. The term sets for each linguistic parameter are shown in Table 1; we decided the number of term sets by carrying out many simulations. In Table 2, we show the Fuzzy Rule Base (FRB) of our proposed system, which consists of 81 rules. The control rules have the form: IF "conditions" THEN "control action". For instance, Rule 1 reads "IF PCD is Sh, ACP is Sm, VT is Lw and AS is S, THEN NiPC is ELPC", and Rule 50 reads "IF PCD is Md, ACP is La, VT is Mo and AS is M, THEN NiPC is HPC".

Table 1. Parameters and their term sets for our proposed system.

Predicted Contact Duration (PCD): Short (Sh), Medium (Md), Long (Lo)
Available Computing Power (ACP): Small (Sm), Medium (Me), Large (La)
Vehicle Trustworthiness (VT): Low (Lw), Moderate (Mo), High (Hg)
Available Storage (AS): Small (S), Medium (M), Big (B)
Neighbor i Processing Capability (NiPC): Extremely Low Processing Capability (ELPC), Very Low Processing Capability (VLPC), Low Processing Capability (LPC), Moderate Processing Capability (MPC), High Processing Capability (HPC), Very High Processing Capability (VHPC), Extremely High Processing Capability (EHPC)

Fig. 4. Simulation results for PCD = 0.1: (a) PCD = 0.1, ACP = 0.1; (b) PCD = 0.1, ACP = 0.9. [Plots of NiPC versus AS for VT = 0.1, 0.5 and 0.9.]


Table 2. The FRB of the proposed system.

No PCD ACP VT AS NiPC | No PCD ACP VT AS NiPC | No PCD ACP VT AS NiPC
1  Sh Sm Lw S ELPC | 28 Md Sm Lw S ELPC | 55 Lo Sm Lw S VLPC
2  Sh Sm Lw M ELPC | 29 Md Sm Lw M ELPC | 56 Lo Sm Lw M LPC
3  Sh Sm Lw B ELPC | 30 Md Sm Lw B VLPC | 57 Lo Sm Lw B LPC
4  Sh Sm Mo S ELPC | 31 Md Sm Mo S VLPC | 58 Lo Sm Mo S LPC
5  Sh Sm Mo M ELPC | 32 Md Sm Mo M VLPC | 59 Lo Sm Mo M MPC
6  Sh Sm Mo B VLPC | 33 Md Sm Mo B LPC | 60 Lo Sm Mo B MPC
7  Sh Sm Hg S ELPC | 34 Md Sm Hg S LPC | 61 Lo Sm Hg S MPC
8  Sh Sm Hg M VLPC | 35 Md Sm Hg M LPC | 62 Lo Sm Hg M HPC
9  Sh Sm Hg B LPC | 36 Md Sm Hg B MPC | 63 Lo Sm Hg B HPC
10 Sh Me Lw S ELPC | 37 Md Me Lw S VLPC | 64 Lo Me Lw S LPC
11 Sh Me Lw M ELPC | 38 Md Me Lw M LPC | 65 Lo Me Lw M MPC
12 Sh Me Lw B VLPC | 39 Md Me Lw B LPC | 66 Lo Me Lw B MPC
13 Sh Me Mo S ELPC | 40 Md Me Mo S LPC | 67 Lo Me Mo S MPC
14 Sh Me Mo M VLPC | 41 Md Me Mo M MPC | 68 Lo Me Mo M HPC
15 Sh Me Mo B LPC | 42 Md Me Mo B MPC | 69 Lo Me Mo B HPC
16 Sh Me Hg S VLPC | 43 Md Me Hg S MPC | 70 Lo Me Hg S HPC
17 Sh Me Hg M LPC | 44 Md Me Hg M HPC | 71 Lo Me Hg M VHPC
18 Sh Me Hg B MPC | 45 Md Me Hg B HPC | 72 Lo Me Hg B VHPC
19 Sh La Lw S ELPC | 46 Md La Lw S LPC | 73 Lo La Lw S MPC
20 Sh La Lw M VLPC | 47 Md La Lw M MPC | 74 Lo La Lw M HPC
21 Sh La Lw B VLPC | 48 Md La Lw B MPC | 75 Lo La Lw B HPC
22 Sh La Mo S VLPC | 49 Md La Mo S MPC | 76 Lo La Mo S HPC
23 Sh La Mo M LPC | 50 Md La Mo M HPC | 77 Lo La Mo M VHPC
24 Sh La Mo B LPC | 51 Md La Mo B HPC | 78 Lo La Mo B VHPC
25 Sh La Hg S LPC | 52 Md La Hg S HPC | 79 Lo La Hg S VHPC
26 Sh La Hg M MPC | 53 Md La Hg M VHPC | 80 Lo La Hg M EHPC
27 Sh La Hg B MPC | 54 Md La Hg B VHPC | 81 Lo La Hg B EHPC

Fig. 5. Simulation results for PCD = 0.5: (a) PCD = 0.5, ACP = 0.1; (b) PCD = 0.5, ACP = 0.9. [Plots of NiPC versus AS for VT = 0.1, 0.5 and 0.9.]

4 Simulation Results
We used FuzzyC to carry out the simulations, and the results are shown in Fig. 4, Fig. 5 and Fig. 6. We show the relation between NiPC and AS for different PCD, ACP and VT values, with PCD and ACP treated as constant parameters. The


considered VT values are 0.1, 0.5 and 0.9, which represent a low, moderate and high value of trustworthiness, respectively. We take into consideration three scenarios: when PCD is short (PCD = 0.1), medium (PCD = 0.5) and long (PCD = 0.9). Figure 4 shows the scenarios when PCD is short. We consider two cases for each scenario: when the available computing power is small (ACP = 0.1) and when it is large (ACP = 0.9). As we can see from Fig. 4(a), a neighbor with a small computing power cannot help out the vehicle even if it has a high trustworthiness value. However, when ACP is large, we see that neighbors with a high VT can be considered helpful for processing the data, given that they have at least a moderate amount of storage (see Fig. 4(b)). This is due to the large ACP and especially to the high VT, which indicates that this vehicle has been really helpful to other vehicles in the past; it is worth taking this neighbor into consideration, as it will become a very promising neighbor if PCD increases (see Fig. 5(b)).

Fig. 6. Simulation results for PCD = 0.9: (a) PCD = 0.9, ACP = 0.1; (b) PCD = 0.9, ACP = 0.9. [Plots of NiPC versus AS for VT = 0.1, 0.5 and 0.9.]

Figure 6 shows the scenario when PCD = 0.9. This means that these neighbors will be within the vehicle communication range for a long time. We can see that, if these neighboring vehicles are willing to lend a large amount of their resources to the vehicle, they can be considered as helpful neighbors even if their trustworthiness value is low. Since it is predicted that they will be within the communication range for a long time and are willing to lend a large amount of resources, it is worth presuming that they will process the data and accomplish the tasks successfully.

5 Conclusions

In this paper, we proposed a new fuzzy-based system to assess the available edge computing resources in a layered Cloud-Fog-Edge architecture for SDN-VANETs. Our proposed system determines whether a neighboring vehicle is capable of helping a vehicle that lacks the appropriate resources to accomplish certain tasks


based on PCD, ACP, AS and VT. After calculating the processing capability of each available neighbor, our previously proposed Fuzzy System for Resource Management can select the appropriate layer for data processing. We evaluated our proposed system by computer simulations. From the simulation results, we conclude as follows.

• For short PCD, the neighboring vehicles which do not have high ACP are not capable of helping other vehicles in need, regardless of their high AS value, even if their VT value is high.
• For medium PCD, once a trustworthy vehicle has a certain ACP, it can be considered a potential neighbor regardless of its AS value.
• The highest value of NiPC is achieved when the neighboring vehicle has a long PCD, large ACP, high VT and a medium/big AS.

In the future, we would like to make extensive simulations to evaluate the proposed system and compare its performance with other systems.

References
1. Bylykbashi, K., Elmazi, D., Matsuo, K., Ikeda, M., Barolli, L.: Effect of security and trustworthiness for a fuzzy cluster management system in VANETs. Cogn. Syst. Res. 55, 153–163 (2019). https://doi.org/10.1016/j.cogsys.2019.01.008
2. Bylykbashi, K., Qafzezi, E., Ikeda, M., Matsuo, K., Barolli, L.: Fuzzy-based Driver Monitoring System (FDMS): implementation of two intelligent FDMSs and a testbed for safe driving in VANETs. Future Gener. Comput. Syst. 105, 665–674 (2020). https://doi.org/10.1016/j.future.2019.12.030
3. Kandel, A.: Fuzzy Expert Systems. CRC Press Inc., Boca Raton (1992)
4. Klir, G.J., Folger, T.A.: Fuzzy Sets, Uncertainty, and Information. Prentice Hall, Upper Saddle River (1988)
5. Ku, I., Lu, Y., Gerla, M., Gomes, R.L., Ongaro, F., Cerqueira, E.: Towards software-defined VANET: architecture and services. In: 13th Annual Mediterranean Ad Hoc Networking Workshop (MED-HOC-NET), pp. 103–110 (2014)
6. McNeill, F.M., Thro, E.: Fuzzy Logic: A Practical Approach. Academic Press Professional Inc., San Diego (1994)
7. Munakata, T., Jani, Y.: Fuzzy systems: an overview. Commun. ACM 37(3), 69–77 (1994)
8. Qafzezi, E., Bylykbashi, K., Ampririt, P., Ikeda, M., Barolli, L., Takizawa, M.: A fuzzy-based system for assessment of available edge computing resources in a cloud-fog-edge SDN-VANETs architecture. In: International Conference on Network-Based Information Systems, pp. 10–19. Springer (2020)
9. Qafzezi, E., Bylykbashi, K., Ikeda, M., Matsuo, K., Barolli, L.: Coordination and management of cloud, fog and edge resources in SDN-VANETs using fuzzy logic: a comparison study for two fuzzy-based systems. Internet of Things 11, 100169 (2020)
10. Qafzezi, E., Bylykbashi, K., Ishida, T., Matsuo, K., Barolli, L., Takizawa, M.: Resource management in SDN-VANETs: coordination of cloud-fog-edge resources using fuzzy logic. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 114–126. Springer (2020)


11. Qafzezi, E., Bylykbashi, K., Spaho, E., Barolli, L.: An intelligent approach for resource management in SDN-VANETs using fuzzy logic. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 747–756. Springer (2019)
12. Qafzezi, E., Bylykbashi, K., Spaho, E., Barolli, L.: A new fuzzy-based resource management system for SDN-VANETs. Int. J. Mob. Comput. Multimedia Commun. (IJMCMC) 10(4), 1–12 (2019)
13. Truong, N.B., Lee, G.M., Ghamri-Doudane, Y.: Software defined networking-based vehicular adhoc network with fog computing. In: 2015 IFIP/IEEE International Symposium on Integrated Network Management (IM), pp. 1202–1207 (2015)
14. Xu, W., Zhou, H., Cheng, N., Lyu, F., Shi, W., Chen, J., Shen, X.: Internet of vehicles in big data era. IEEE/CAA J. Automatica Sinica 5(1), 19–35 (2018)
15. Zadeh, L.A., Kacprzyk, J.: Fuzzy Logic for the Management of Uncertainty. Wiley, New York (1992)
16. Zimmermann, H.J.: Fuzzy control. In: Fuzzy Set Theory and Its Applications, pp. 203–240. Springer (1996)

eWound-PRIOR: An Ensemble Framework for Cases Prioritization After Orthopedic Surgeries
Felipe Neves1, Morgan Jennings2, Miriam Capretz3, Dianne Bryant2, Fernanda Campos1, and Victor Ströele1(B)
1 Department of Computer Science, Federal University of Juiz de Fora, Juiz de Fora, Brazil

{felipe.neves.braz,victor.stroele}@ice.ufjf.br, [email protected] 2 Department of Physical Therapy, The University of Western Ontario, London, Canada {mjenni3,dianne.bryant}@uwo.ca 3 Department of Electrical and Computer Engineering, The University of Western Ontario, London, Canada [email protected]

Abstract. Patient follow-up appointments are an imperative part of the healthcare model to ensure safe patient recovery and a proper course of treatment. The use of mobile devices can help patient monitoring, and predictive approaches can provide computational support to identify deteriorating cases. Aiming to combine the data produced by those devices with the power of predictive approaches, this paper proposes the eWound-PRIOR framework to provide a remote assessment of postoperative orthopedic wounds. Our approach uses Artificial Intelligence (AI) techniques to process patients' data related to postoperative wound healing and makes predictions as to whether the patient requires an in-person assessment or not. The experiment results showed that the predictions are promising and adherent to the application context, even though the on-line questionnaire impaired the training model and its performance.

1 Introduction
Patients are becoming great consumers of health information through the use of the Internet. This favors an innovative healthcare model, providing the opportunity to support services like on-line consultation, patient and physician education, appointment booking, patient assessment, and monitoring. In large hospitals or clinics, the amount of daily information is usually substantial, as different types of medical data are available for treatments and diagnosis. These data help physicians in their daily work, allowing them to understand the factors related to patients' health. Based on the hospital dataset, predictions can be made to help physicians understand the factors of a specific disease. Several works adopted predictive approaches in diagnosis and forms of treatment [5], detection and risk of chronic diseases [6, 11], and the use of medical notes to make predictions in health centers [12]. In this context, this


research deals with postoperative patients of orthopedic surgeries, identifying worsening in treatments and informing a case prioritization. Patient follow-up appointments are an imperative part of the healthcare model to ensure safe patient recovery and a proper course of treatment. The current standard of care following elective orthopedic surgery includes routine in-person follow-up. However, technology now exists to enable remote patient assessment, so the patient may not need to come into the clinic physically. Marsh et al. [1] compared a web-based follow-up for patients at least 12 months after total hip/knee arthroplasty to the standard in-person assessment. The web-based follow-up was a feasible, cost-effective alternative and was as successful as in-person assessment at identifying adverse events (i.e., no adverse event was missed). The incidence of early adverse events (within three months) following elective orthopedic surgery is low. Early minor complications may include infection, thromboembolic events, stiffness and instability, and are more common than major complications or death. Surgical site infection rates for common elective orthopedic surgeries such as hip/knee replacement, high tibial osteotomy and anterior cruciate ligament reconstruction (ACLR) are 1.4%, 5% and 0.48%, respectively [2, 3]. Thus, follow-up appointments within the first three months are often unremarkable, with no change in clinical management [4]. An online follow-up assessment may allow surgeons to maximize their efficiency while maintaining the effectiveness of identifying adverse events, such as surgical site infection, to manage patient treatment. While in-person follow-up involves many patient questions and assessments, the surgical site assessment is one component that, if done remotely, may ease patient concerns during recovery and assist with the prioritization of patients in a clinical setting. Previous literature has examined predictive approaches in patient diagnosis and treatment [5] and in the detection and risk of chronic diseases [6]. Common concerns about using predictive approaches focus on patient care: a predictive approach requires special attention to patient monitoring and computational support for changes in patient health, to ensure patient safety and that no worsening is missed. One promising approach is the adoption of Artificial Intelligence predictive models in healthcare environments, which favors the classification of patients by context. These models, based on Machine Learning (ML), aim to understand the factors around a context and convert a scenario into a class (classification models) or a number (regression models) [7]. Machine Learning models can predict, for example, whether a patient presents a risk of a worsening case, and learn from this prediction. However, depending on the application context and the data used, the accuracy can be impaired; problems such as noise and missing values directly influence the performance of a model. For instance, some approaches try to minimize possible error and over-fitting of the model by using ensemble learning models [8]. Ensemble learning is a Machine Learning approach that solves a problem by training multiple models. Unlike ordinary approaches, in which a single hypothesis is learned from training data, ensemble approaches attempt to build a set of hypotheses and combine them to build a new one [9]. These approaches combine various learning algorithms to produce a result that is more adherent and precise [9]. Previous studies show that an ensemble model often performs better than the individual learners used as its base learners [10].
Previous studies show that an ensemble model often performs better than the individual learners, as base learners [10].

eWound-PRIOR: An Ensemble Framework for Cases Prioritization

115

The primary research question of the study is as follows: Is it possible to use an ensemble Machine Learning strategy to identify postoperative patients of orthopedic surgeries that need to go to the hospital/clinic to receive in-person care, after evaluating their reported data from the on-line questionnaire? The proposed solution is a prediction system to provide a remote assessment of postoperative orthopedic wounds. It uses Artificial Intelligent models to process patients’ data related to postoperative wound healing and make predictions if the patient requires an in-person assessment. The eWound-PRIOR combines multiple models through an ensemble approach to achieve the most accurate prediction. The prediction results allow us to identify patients who need prioritized attention and notify both patients and physicians of the decision. The eWound-PRIOR framework was implemented through a mobile application to collect and transmit data. The use of such approach can bring benefits to patients and physicians through early diagnosis of disease and identification of at-risk patients, thereby reducing the flow of unnecessary patient to clinics or hospitals. The main goals of this paper are to: develop the eWound-PRIOR that uses Machine Learning models combined in an ensemble approach to classify and predict postoperative cases requiring follow-up, and evaluate the solution with real patients’ data from the orthopedic sector of a Hospital. This paper is organized as follows: Section 2 introduces the foundation of technologies used in this research. Section 3 presents the related work applied to ensemble Machine Learning models for healthcare. Section 4 describes the proposed framework and how we combined the ML models. Section 5 describes the evaluation process: planning, execution, results and, discussion analysis. Finally, Section 6 presents the conclusion and future research directions.

2 Background This section presents an overview of the main concepts and definitions of the ensemble learning approach used in this research, as the Machine Learning algorithms, the autoencoder, and the metrics used to compare the accuracy of the ensemble classifications with individual learners. The use of Machine Learning ensemble approaches is widely explored in specific applications. Confidence estimation, feature selection, addressing missing features, incremental learning from sequential data, and imbalanced data problems, are examples of the applicability of ensemble systems. The original goal of using ensemble-based decision systems is to improve the assertiveness of a decision. By weighing various opinions and combining them through some smart process to reach a final decision can reduce the variance of results [19]. The eWound-PRIOR framework is composed of the most adherent set of models for the classification problem. We identified the classic models from the literature and the empirical tests. The ones we will use in this research are: (i) Decision Tree (DT): is an abstract structure that is characterized by a tree, where each node denotes a test on an attribute value, each branch represents an outcome of the test, and tree leaves represent classes or class distributions [16]. (ii) K-Nearest Neighbors (KNN): is a classical and lazy learner model, based on learning by analogy, which measures the distance between a given test tuple with training tuples and compares if they are similar [16]. (iii) Random Forest (RF): is a classifier consisting of a collection of decision trees that classify

116

F. Neves et al.

sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting [17]. (i) Multi-Layer Perceptron (MLP): is represented by a neural network that learns the pattern of each classification through the data by calibrating the weights in each layer. The model can learn a nonlinear function approximator for either classification or regression [18]. An autoencoder algorithm [15] is part of a special family of dimensionality reduction methods, implemented using artificial neural networks. It aims to learn a compressed representation for input through minimizing its reconstruction error [8]. During the training step, the input dataset is compressed by the encoder and then, the decoder reconstructs the data by minimizing the reconstruction error. The ability to learn from the “intrinsic data structure” is useful when the available data have noise and/or too many missing values. In this study, the ReLU was used as the autoencoder. The evaluation metrics analyze the accuracy and highlight the sensitivity or true positive rate (TPR) and the specificity or true negative rate (TNR). TPR measures a model capacity to identify cases that need in-person attention. The TNR measures a model capacity to identify cases that do not need in-person attention. The sensitivity and specificity were evaluated using Eqs. (1) and (2), respectively. Where true positive (TP) is the number of cases that need attention that is correctly identified as in-person consultation is necessary. True negative (TN) is the number of cases that do not need in-person care that is correctly identified as it does not need special care. P is the total number of positive instances and N is the total number of negative instances. Sensitivity =

TP P

(1)

Specificity =

TN N

(2)
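To make the two metrics concrete, here is a minimal sketch using scikit-learn's confusion_matrix; the study uses the Sklearn framework, but this particular helper is our illustration, assuming binary labels where 1 encodes "Wound requires care" and 0 encodes "Wound healing well".

import numpy as np
from sklearn.metrics import confusion_matrix

def sensitivity_specificity(y_true, y_pred):
    # Confusion matrix layout for binary labels {0, 1}:
    # [[TN, FP],
    #  [FN, TP]]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    sensitivity = tp / (tp + fn)  # TPR = TP / P, Eq. (1)
    specificity = tn / (tn + fp)  # TNR = TN / N, Eq. (2)
    return sensitivity, specificity

# Example: 1 = "Wound requires care", 0 = "Wound healing well"
y_true = np.array([1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0])
print(sensitivity_specificity(y_true, y_pred))  # (0.666..., 0.666...)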

3 Related Work

This paper highlights the use of predictive models based on ensemble Machine Learning techniques in the context of healthcare. In the literature, researchers have applied such predictive models in recommender systems aiming to detect chronic diseases [11], detect the possibility of heart disease [6, 12, 13], and estimate the likelihood that an adverse event was present in postoperative cases [4, 14]. To the best of our knowledge, no previous study has explored ensemble approaches to predict the likelihood of postoperative wound infection and to notify patients and physicians whether an in-person assessment is required, using a mobile device. Mustaqeem et al. [6] proposed a hybrid prediction model for a recommender system, which detects and classifies subtypes of heart disease and makes recommendations from these classifications. The work performs a probabilistic analysis and generates the recommendation from the case confirmation. Rallapalli and Gondkar [11] proposed the usage of an ensemble model, which includes a deep learning model, to predict the exact attributes required to assess a patient's risk of having a chronic illness. For the authors, the ensemble approach results are superior to individual learners. Another study that
addresses the prediction of heart disease is presented in [12]. The proposed work tries to maximize the accuracy of the predictions through the application of an ensemble model. The approach combines three different learning algorithms through majority voting and shows that the ensemble outperforms the individual learners on their dataset. Tuli et al. [13] proposed the HealthFog framework for integrating ensemble deep learning in edge computing devices, applied to a real-life application of automatic heart disease analysis. The framework delivers healthcare as a fog service using IoT devices for efficiently managing the data of heart patients, and the proposed architecture can provide recommendations based on the extracted data. The results pointed out that using deep learning on the continuous flow of data, combined in an ensemble form, yields a significant improvement on prediction problems. Predictions in postoperative cases are addressed by some works. We highlight Zhang et al. [14], who focused on predicting complications in postoperative pediatric cataract patients by using data mining. The relationship between attributes that can contribute to complications was identified. The results point out that complications can be predicted and that, apart from age and gender, attributes such as the position, density, and area of the cataract are related to complications. Also, Jeffery [4] developed a patient-reported e-Visit questionnaire for two- and six-week cases following orthopedic surgery. The author used the collected data to build a statistical model, using logistic regression, to estimate the likelihood that an adverse event was present. A notification is made as to whether the patient should be seen by the surgeon in person. In the study, among the two-week patients, only 24.3% needed the appointment; among patients who returned for an in-person follow-up six weeks postoperative, only 31.6% needed the appointment. This questionnaire was the basis for our mobile app. A comparison is presented to identify the similarities and differences between our work and the selected ones. The criteria are: studies about healthcare and prediction (C1); the use of ensemble predictive models or Machine Learning models in e-health (C2); evaluation with real patients' data (C3); and predicting the likelihood of an adverse event in postoperative cases (C4). Table 1 presents the comparative analysis. Our proposal meets the comparative criteria as it focuses on early diagnosis of worsening cases in postoperative orthopedic recovery and predicts and notifies patients to receive prioritized care. The framework is based on ensemble Machine Learning models, and our case study is with patients from the orthopedic sector of a hospital.

Table 1. Comparative analysis.

4 eWound-PRIOR Framework

The main goal of this research is to develop an ensemble framework to predict the likelihood of an adverse wound-healing event following elective orthopedic surgery on the knee, and to identify patients who require an in-person assessment. The application aims to reduce the flow of patients requiring wound checks in hospitals and clinics and to reassure patients throughout their wound healing. eWound-PRIOR is illustrated in Fig. 1 and its main steps are described next.

Fig. 1. eWound-PRIOR framework.

Data Extraction - The first step refers to the data extraction process. It considers all available data and extracts the information considered important for the predictions, e.g. symptoms and medication. The information can be accessed and captured from medical notes and electronic forms. This process extracts the historical medical data (HMD) provided by the hospital database for training purposes and collects the real-time patient data from a mobile application for prediction.

Pre-processing - This step pre-processes the patients' data and the historical medical data. The process removes noise values and replaces the missing ones. It also normalizes the data to prepare the dataset for training the ensemble model. To synthesize the amount of information, an autoencoder is trained with the normalized data to perform data dimensionality reduction. The autoencoder is tuned until the Mean Square Error (MSE) reaches its minimum.

Ensemble Model - The ensemble model represents the eWound-PRIOR core by processing the pre-processed data captured in the first step. It is responsible for predicting whether the patient needs to go to the hospital/clinic for an in-person appointment. The eWound-PRIOR framework was designed to be composed of ensemble Machine Learning models. To synchronize and combine each model with the Voting Method [19], we use a coordinator service to manage each prediction result. The coordinator combines several classic ML models, implemented as autonomous software services, with
reactivity, intelligence, and social features. As autonomous services, each model is able to handle requests and predict cases requiring follow-up. In this research, we use the KNN, DT, MLP, and RF ML models. The models are supervised because this is a classification problem, which consists of indicating whether or not a patient's condition has worsened during treatment. Dealing with postoperative patients, the eWound-PRIOR ensemble seeks accurate results with a higher certainty for the predictions. The ensemble strategy is based on the voting method [19]: we use the weighted average of the models' results (the probability of a certain classification) to define the final classification for a patient's context. We set the weight of the "Wound healing well" classification to one and the "Wound requires care" classification to two. This allows focusing on patients that need special care/attention. The return value is an object with the final classification and its intensity. In the case of a "Wound requires care" classification, the patient is notified to go to the hospital/clinic, where the doctor can assist him/her. The hospital/clinic administration is also notified to follow up with this patient via a patient dashboard. The strategy is described in Algorithm 1.
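Algorithm 1 itself is not reproduced in this excerpt, so the following is a minimal sketch of the weighted voting just described, assuming the four fitted models expose a scikit-learn-style predict_proba() interface; the function name, class-weight layout, and intensity normalization are illustrative choices, not the authors' exact code.

import numpy as np

# 0 = "Wound healing well" (weight 1), 1 = "Wound requires care" (weight 2).
CLASS_WEIGHTS = np.array([1.0, 2.0])

def ensemble_vote(models, x):
    """Weighted-average voting over the KNN, DT, MLP and RF services."""
    # Each row holds one model's probability estimate for the two classes.
    probas = np.vstack([m.predict_proba(x.reshape(1, -1))[0] for m in models])
    weighted = probas.mean(axis=0) * CLASS_WEIGHTS  # favor "requires care"
    label = int(np.argmax(weighted))
    intensity = float(weighted[label] / weighted.sum())
    # The coordinator returns the final classification and its intensity.
    return {"classification": label, "intensity": intensity}

The doubled weight on class 1 means a borderline case tips toward "Wound requires care", matching the framework's goal of prioritizing patients who may need attention.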

Notification - The last step is responsible for notifying the patients about their health. If the patient is deemed a priority case requiring attention, the patient and the hospital/clinic administration are notified to arrange an in-person follow-up appointment. This allows medical facilities to better manage patient flow by prioritizing urgent cases.

5 Evaluation

To evaluate the eWound-PRIOR framework, we established a partnership with Western University, which allowed access to the study proposed by Jeffery et al. [4] and to part of the patients' dataset. In this context, the research was evaluated using real-world data. Considering our context, the main steps of the case study are: planning, execution, and results.

Planning - We first designed the eWound-PRIOR workflow with the main activities to illustrate the research (Fig. 2). The study by Jeffery et al. [4] proposed the e-Visit questionnaire,
which must be answered on-line by the patient two and six weeks after the surgery. The questions are about the patient's health after the surgery, the prescribed medication, and the symptoms in their daily life. It comprises Demographic Information and eWound forms to be completed by the patient, a Risk Form, and a Surgeon's Data Form.

Fig. 2. eWound-PRIOR Workflow.

Fig. 3. eWound APP.

Based on the eWound questionnaire, we developed a mobile app (Fig. 3). After the patient completes the questionnaire, the answers are stored in a database, which is available to the eWound-PRIOR framework process.

Execution - The eWound questionnaire captures data about the patient's clinical signs, symptoms, pain, prescribed medications, and how they self-assess their wound healing at two and six weeks postoperative. The dataset had a total of 352 patients who completed the questionnaire. The answers related to wound healing were captured. The dataset is divided into two subsets, one with responses from patients two weeks after surgery and one with responses six weeks after surgery. Consequently, two separately trained ensemble models are needed, one for each follow-up period. After the extraction process, the data were pre-processed by removing duplicated answers and inconsistent cases. The missing values were filled with the average feature values. Pre-processing the datasets is important because they have too many missing and noisy values. Figure 4 shows the main pre-processing activities. As the e-Visit questionnaire has a skip logic, we treated the sub-questions by setting their values to -1 wherever the main question was answered no. The questionnaire skip logic implies that if the sub-questions were answered, the patient usually answered yes to the main question, whereas if they did not answer them, it is because they answered no to the main question. In this case, -1 indicates that the sub-questions were not answered, meaning that the patient did not feel anything related to the main question. We normalized the cleaned data using the Normalizer provided by the Sklearn framework, and we trained an autoencoder structured with 10 hidden layers. ReLU was used as the activation function for all layers. After the autoencoder training, the Mean Square Error found was 0.003 for both datasets.
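As an illustration of the skip-logic treatment and the mean imputation described above, here is a minimal pandas sketch; the question and sub-question column names are hypothetical, since the actual questionnaire fields are not listed in the paper.

import pandas as pd

# Hypothetical mapping of a main yes/no question to its sub-questions.
SKIP_LOGIC = {"fever": ["fever_days", "fever_max_temp"]}

def treat_answers(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    for main, subs in SKIP_LOGIC.items():
        skipped = df[main].eq("no")  # sub-questions intentionally skipped
        for col in subs:
            df[col] = pd.to_numeric(df[col], errors="coerce")
            df.loc[skipped, col] = -1  # -1 marks "nothing to report"
            # True gaps (main question answered "yes") get the feature average.
            mean = df.loc[~skipped, col].mean()
            df.loc[~skipped, col] = df.loc[~skipped, col].fillna(mean)
    return df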

This value may be considered a good result for an autoencoder [20]. We performed empirical tests to set the resulting number of columns (questions) of the dataset using the autoencoder, ranging from 10 to 46. In our experiments, 20 columns showed the best performance for the available datasets. Therefore, the cleaned data were encoded, reducing the number of columns from 46 to 20. The classifications remained untouched after the process, with values 1 and 0 for the "Wound requires care" and "Wound healing well" classifications. All questions were used in this process, which transformed the 46 columns of the CSV into 20 columns. By encoding the original columns to extract the intrinsic information in the dataset, it was possible to improve the accuracy of the classifications. Aiming to obtain the average of predictions, instead of selecting the best set of training instances, we applied SMOTEENN, a combination of the Synthetic Minority Over-sampling Technique (SMOTE) with Edited Nearest Neighbors (ENN), which over-samples the minority instances to balance the classes for training and then cleans the result. We randomized the selection of cases from the pre-processed datasets, splitting the cases into train and test sets: we randomly selected 80% of each dataset for training and 20% for testing, and the process was executed 100 times. We selected the most adherent parameters based on the pre-processed dataset using the GridSearchCV function from the Sklearn framework [21]. The parameterization process was carried out with 10-fold stratified cross-validation (StratifiedKFold) [21]. We also used the Sklearn framework as the provider for each model. Table 2 shows the models' parametrization. After the parametrization and pre-processing, the two sets of models were sent to the coordinator service. As in the previous study, the results were combined with the voting ensemble method. The experiments were done on a server with Ubuntu 18.04 LTS, 94 GB RAM, and an Intel Xeon E5-2630 CPU. The results are discussed next.

Fig. 4. Pre-processing step
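The text above fixes only a few architectural facts about the pre-processing autoencoder: 10 hidden layers, ReLU activations, Sklearn's Normalizer, and a 46-to-20 column reduction. The individual layer widths, optimizer, epoch count, and batch size in this Keras sketch are therefore our assumptions, not the authors' published configuration.

import numpy as np
from sklearn.preprocessing import Normalizer
from tensorflow.keras import layers, models

N_FEATURES, N_ENCODED = 46, 20

def build_autoencoder():
    # Encoder: 46 questionnaire columns funneled down to a 20-value code.
    inputs = layers.Input(shape=(N_FEATURES,))
    x = inputs
    for width in (40, 35, 30, 25, N_ENCODED):   # 5 encoding layers (widths illustrative)
        x = layers.Dense(width, activation="relu")(x)
    code = x
    for width in (25, 30, 35, 40, N_FEATURES):  # 5 decoding layers mirror the encoder
        x = layers.Dense(width, activation="relu")(x)
    autoencoder = models.Model(inputs, x)
    encoder = models.Model(inputs, code)
    autoencoder.compile(optimizer="adam", loss="mse")  # tuned until MSE is minimal
    return autoencoder, encoder

X = Normalizer().fit_transform(np.random.rand(352, N_FEATURES))  # placeholder data
autoencoder, encoder = build_autoencoder()
autoencoder.fit(X, X, epochs=200, batch_size=16, verbose=0)
X_encoded = encoder.predict(X)  # 352 x 20 matrix fed to the ensemble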

Table 2. Models' parametrization

Model         | 2 weeks dataset                                                                                                                          | 6 weeks dataset
KNN           | Algorithm: auto; Metric: Euclidean; K: 2                                                                                                 | Algorithm: auto; Metric: Euclidean; K: 3
Decision tree | Criterion: entropy; Max features: 6; Splitter: best; Max depth: 8                                                                        | Criterion: gini; Max features: 10; Splitter: random; Max depth: none
MLP           | # Hidden layers: 50; Activation func: relu; # Input: 20; # Output: 1; Batch size: 29; Learning rate: constant; Warm start: true; Solver: lbfgs | # Hidden layers: 50; Activation func: identity; # Input: 20; # Output: 1; Batch size: 12; Learning rate: adaptive; Warm start: true; Solver: lbfgs
Random forest | Criterion: gini; # Estimators: 11; Warm start: true; Max features: 6                                                                     | Criterion: entropy; # Estimators: 76; Warm start: true; Max features: 9
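As a hedged sketch of how the two-week models in Table 2 could be instantiated with scikit-learn and trained on the balanced data: we read "# Hidden layers: 50" as a single 50-unit hidden layer, the placeholder arrays stand in for the real encoded dataset, and SMOTEENN comes from the imbalanced-learn package. In the study these parameters were selected by GridSearchCV rather than typed by hand.

import numpy as np
from imblearn.combine import SMOTEENN
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Placeholder data standing in for the 352 x 20 encoded dataset.
X_enc = np.random.rand(352, 20)
y = np.random.randint(0, 2, size=352)  # 1 = "Wound requires care"

# SMOTE over-samples the minority class; ENN then removes noisy samples.
X_bal, y_bal = SMOTEENN(random_state=0).fit_resample(X_enc, y)

# One of the 100 randomized 80%/20% train/test splits.
X_tr, X_te, y_tr, y_te = train_test_split(X_bal, y_bal, test_size=0.2)

# Two-week parameters from Table 2 (unlisted settings left at defaults).
models = [
    KNeighborsClassifier(n_neighbors=2, algorithm="auto", metric="euclidean"),
    DecisionTreeClassifier(criterion="entropy", max_features=6,
                           splitter="best", max_depth=8),
    MLPClassifier(hidden_layer_sizes=(50,), activation="relu", batch_size=29,
                  learning_rate="constant", warm_start=True, solver="lbfgs"),
    RandomForestClassifier(criterion="gini", n_estimators=11,
                           warm_start=True, max_features=6),
]
for m in models:
    m.fit(X_tr, y_tr)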

Experimental Results and Discussion - We applied the same process for training and testing the two- and six-week datasets, which generated two sets of models. Each set was used for predicting the necessity of care for patients according to the number of days that had passed since surgery. Table 3 shows the results for the sensitivity (sens.) and specificity (spec.) metrics of each model on the two- and six-week datasets. To assess how dispersed the results were, we calculated the standard deviation (STD) for each metric.

Table 3. Experiment results

Model         | Two-week dataset            | Six-week dataset
              | Sens. (STD)  | Spec. (STD)  | Sens. (STD)  | Spec. (STD)
eWound-PRIOR  | 0.730 (0.12) | 0.428 (0.08) | 0.728 (0.11) | 0.339 (0.08)
KNN           | 0.497 (0.14) | 0.650 (0.08) | 0.561 (0.14) | 0.465 (0.12)
DT            | 0.529 (0.17) | 0.582 (0.09) | 0.524 (0.13) | 0.527 (0.11)
MLP           | 0.741 (0.20) | 0.344 (0.23) | 0.666 (0.14) | 0.507 (0.15)
RF            | 0.526 (0.16) | 0.591 (0.09) | 0.521 (0.13) | 0.533 (0.09)

Sensitivity and specificity exist in a state of equilibrium [23]. The ability to correctly identify people who need special attention (sensitivity) usually causes a reduction in specificity (meaning more false positives). Likewise, high specificity generally implies a lower sensitivity (more false negatives). Still, high sensitivity is clearly important where the test is used to identify a severe but treatable disease. Although the ensemble presents the lowest standard deviation compared to the individual models, there is a significant variation of the results for each model. As the datasets were composed of patients' answers, many missing values were present in each case. The models had difficulty classifying the cases correctly because the data treatment replaced the missing answers: the replacement did not differentiate the cases well, since many cases had no value in the sub-answers and a single replacement value was used for all of them. Several classifications were only 60% certain for all models, even using the autoencoder, which impairs the results of the ensemble. As we used different weights for each classification, the sensitivity achieved a good result for both datasets. However, the specificity was impaired by the many uncertain classifications of each model. For the missed cases requiring care, incorrectly classified as "Wound healing well", we
found that most patients answered the questions related to body temperature and the wound as not presenting any problem. This caused several columns (sub-answers) in the dataset to receive a replacement, which generates uniform values in some columns of the encoded dataset after using the autoencoder. Analysis of each set of questions revealed problems in the model classification. For example, cases with "red streaks" or an incision that is "hot to the touch" showed signs that the patient may be at risk of infection and need physician care. However, our model incorrectly classified these patients as "Wound healing well" due to the noise generated from the responses to other questions. Although we pre-processed the data, this study is limited by the noise and missing data in the datasets. As we can see from the results, having too many replacements of missing and implicit values in the dataset is not ideal and may generate incorrect classifications. Without the skip logic, we would capture more information related to the patients' health and reduce the number of replacements, consequently reducing the pre-processing needed to prepare the data for training. From this case study, we highlight two main lessons learned. The first concerns the health perspective. The skip-logic questionnaire is easier for patients, as it is not necessary to answer the questions that do not apply. Originally, the questionnaire was created with no obligation to answer all the sub-questions. As a result, patients often skipped questions that could be important for the classifications. Machine Learning models consider all columns in a dataset to pursue predictions. In our case, as many cases got a negative classification (wound healing well - does not need to be seen), the pre-processing step had to replace the implicit and missing values, impairing the models' training and performance. The second lesson comes from our previous experience in using ensemble strategies in healthcare and other domains. In general, ensembles are not harmed by combining different Machine Learning models, provided each individually shows good accuracy. The final ensemble classifications are, in general, better than those provided by the individual models [22]. The eWound-PRIOR framework, with multiple classifiers as a predictive basis, together with human expertise and other resources added to the questionnaire, such as images, should provide greater autonomy and assertiveness.


6 Conclusion and Future Work

Our research proposes the eWound-PRIOR framework for the prioritization of postoperative cases, with an ensemble model as the predictive core. The framework applies autonomous services and Machine Learning models capable of cooperating and aggregating results to maximize the accuracy and certainty of the predictions. We developed a mobile application to collect postoperative patients' health data via an on-line questionnaire. We can answer our research question: it is possible to identify postoperative patients of orthopedic surgeries who need to go to the hospital/clinic to receive in-person care, after evaluating their reported data from an on-line questionnaire using an ensemble Machine Learning strategy. The eWound-PRIOR ensemble made correct predictions for 50% (general accuracy) of cases for the two- and six-week datasets. It demonstrates good sensitivity and poor specificity. We compared the results of each Machine Learning model separately and combined in an ensemble strategy. The prediction was impaired by the number of missing values in the dataset. The constraints came from the questionnaire's skip logic, which gave patients the option of not answering most of the sub-questions. Future research needs to improve the models' parameters with new cases and a larger dataset, and to improve the framework's effectiveness in correctly identifying patients requiring in-person follow-up. Furthermore, the eWound application interface should be evaluated to ensure patient satisfaction. We also recommend that the application be tested using a diagnostic validity study design, to compare its predictions to the current gold standard of in-person patient assessment.

Acknowledgments. ELAP from the University of Western Ontario, Canada, Federal University of Juiz de Fora (UFJF), CAPES, CNPq and FAPEMIG.

References

1. Marsh, J., Hoch, J.S., Bryant, D., MacDonald, S.J., Naudie, D., McCalden, R., Howard, J., Bourne, R., McAuley, J.: Economic evaluation of web-based compared with in-person follow-up after total joint arthroplasty. JBJS 96(22), 1910–1916 (2014)
2. Salvati, E., Robinson, R., Zeno, S., Koslin, B., Brause, B., Wilson, J.P.: Infection rates after 3175 total hip and total knee replacements performed with and without a horizontal unidirectional filtered air-flow system. J. Bone Joint Surg. Am. 64(4), 525–535 (1982)
3. Wildner, M., Peters, A., Hellich, J., Reichelt, A.: Complications of high tibial osteotomy and internal fixation with staples. Arch. Orthop. Trauma Surg. 111(4), 210–212 (1992)
4. Jeffery, W.G.: e-Visits for early post-operative visits following orthopaedic surgery: can they add efficiency without sacrificing effectiveness? Electronic Thesis and Dissertation Repository, vol. 5053 (2017). https://ir.lib.uwo.ca/etd/5053
5. Ali, F., Islam, S.R., Kwak, D., Khan, P., Ullah, N., Yoo, S.J., Kwak, K.: Type-2 fuzzy ontology-aided recommendation systems for IoT-based healthcare. Comput. Commun. 119, 138–155 (2018)
6. Mustaqeem, A., Anwar, S.M., Khan, A.R., Majid, M.: A statistical analysis based recommender model for heart disease patients. Int. J. Med. Inf. 108, 134–145 (2017)
7. Dreiseitl, S., Ohno-Machado, L.: Logistic regression and artificial neural network classification models: a methodology review. J. Biomed. Inform. 35(5–6), 352–359 (2002)
8. Araya, D.B., Grolinger, K., ElYamany, H.F., Capretz, M.A., Bitsuamlak, G.: An ensemble learning framework for anomaly detection in building energy consumption. Energy Build. 144, 191–206 (2017)
9. Zhou, Z.H.: Ensemble Learning, pp. 411–416. Springer, Heidelberg (2015)
10. Opitz, D., Maclin, R.: Popular ensemble methods: an empirical study. J. Artif. Intell. Res. 11, 169–198 (1999)
11. Rallapalli, S., Gondkar, R.: Big data ensemble clinical prediction for healthcare data by using deep learning model. Int. J. Big Data Intell. 5(4), 258–269 (2018)
12. Kurian, R.A., Lakshmi, K.: An ensemble classifier for the prediction of heart disease. Int. J. Sci. Res. Comput. Sci. 3(6), 25–31 (2018)
13. Tuli, S., Basumatary, N., Gill, S.S., Kahani, M., Arya, R.C., Wander, G.S., Buyya, R.: HealthFog: an ensemble deep learning based smart healthcare system for automatic diagnosis of heart diseases in integrated IoT and fog computing environments. Fut. Gener. Comput. Syst. 104, 187–200 (2020)
14. Zhang, K., Liu, X., Jiang, J., Li, W., Wang, S., Liu, L., Zhou, X., Wang, L.: Prediction of postoperative complications of pediatric cataract patients using data mining. J. Transl. Med. 17(1), 2 (2019)
15. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning internal representations by error propagation. Technical Report, University of California San Diego, La Jolla, Institute for Cognitive Science (1985)
16. Han, J., Pei, J., Kamber, M.: Data Mining: Concepts and Techniques. Elsevier, Amsterdam (2011)
17. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
18. McClelland, J.L., Rumelhart, D.E., PDP Research Group, et al.: Parallel Distributed Processing, vol. 2. MIT Press, Cambridge (1987)
19. Polikar, R.: Ensemble Learning, pp. 1–34. Springer, Heidelberg (2012)
20. Tan, C.C., Eswaran, C.: Using autoencoders for mammogram compression. J. Med. Syst. 35(1), 49–58 (2011)
21. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12(Oct), 2825–2830 (2011)
22. Braz, F., Campos, F., Stroele, V., Dantas, M.: An early warning model for school dropout: a case study in e-learning class. In: Brazilian Symposium on Computers in Education (Simpósio Brasileiro de Informática na Educação - SBIE), vol. 30, no. 1, p. 1441 (2019)
23. Lalkhen, A.G., McCluskey, A.: Clinical tests: sensitivity and specificity. Contin. Educ. Anaesth. Crit. Care Pain 8(6), 221–223 (2008). https://doi.org/10.1093/bjaceaccp/mkn041

Challenges of Crowdsourcing Platform: Thai Healthcare Information Case Study

Krit Khwanngern1(B), Juggapong Natwichai2, Vivatchai Kaveeta1, Panutda Nantawad3, Sineenuch Changkai1, and Supaksiri Suwiwattana1

1 Princess Sirindhorn IT Foundation Craniofacial Center, Chiang Mai University, Chiang Mai, Thailand
{krit.khwanngern,vivatchai.k,sineenuch.c,supaksiri.s}@cmu.ac.th
2 Department of Computer Engineering, Faculty of Engineering, Chiang Mai University, Chiang Mai, Thailand
[email protected]
3 Craniofacial Center, Faculty of Medicine, Chiang Mai University, Chiang Mai, Thailand
[email protected]

Abstract. Crowdsourcing platforms usually rely on user contributions. To maintain continuous user engagement, many aspects need to be carefully considered. In this work, we examine online healthcare information platforms. We start with details of the Thai healthcare structure and organizations. We point out the information storage and transfer problems between healthcare units belonging to different levels. Two online healthcare platforms are presented as a use case. The first system is a mobile application for communication between local hospitals and village health volunteers. The second is an online medical information platform for craniofacial deformities. We explain their structure, functionalities, benefits, and limitations. The platforms also face the information gap, and data cannot be transferred beyond their user bases. We show platform module extensions that act as a bridge for information and processes between the two platforms. The results of the platform deployment are illustrated.

1 Introduction

First, we give an overview of the Thai healthcare structure. The definition of healthcare levels and their roles is discussed. We point out an information exchange gap between the local healthcare providers and central hospitals. Then, two online healthcare platforms designed to address some of these difficulties for the treatment of craniofacial deformities are introduced.

1.1 Healthcare Organization in Thailand

Similar to most countries, healthcare organizations in Thailand are separated into multiple levels, as shown in Fig. 1. Each level represents healthcare providers
with higher capability in terms of bed count, specialized crews, and equipment. The closest healthcare unit to the patient is the village health volunteer; numbering almost 800,000 in total [3], they are the medical first line of defense. The first facility a patient usually visits is a local hospital near their hometown. Local hospitals can provide general treatments, vaccination, and consultation. When faced with a more specialized illness, however, they may refer patients to bigger hospitals (Table 1).

Table 1. Healthcare organizations

Healthcare level                 | Responsible area       | Organization
Primary care                     | Village                | Health volunteers
Primary care                     | Sub district (Tambon)  | Public health centers
Secondary care                   | District (Amphoe)      | Local hospitals
Tertiary and super tertiary care | Province               | Provincial and university hospitals

A few problems arise from this process. First, the patient may need to visit regional hospitals located far from their home. Moreover, medical specialists are very few and far between. Conditions which require multidisciplinary team cooperation mean the patients will have appointments at multiple hospitals. Therefore, hospital visits can become a financial burden for the patients and their guardians, which prevents some patients from completing the entire treatment plan. The second problem is the information exchange process. Thai hospitals work with various data storage systems. Many local hospitals rely on either physical documents or rudimentary digitized document storage. On the other hand, large hospitals usually adopt electronic medical record (EMR) systems. Unfortunately, these are usually non-standardized proprietary systems which are unable to exchange data with external systems. This leads to an information exchange gap between the hospitals. Transferred patients physically carry their medical records between hospitals themselves. This process is prone to human error and thus causes missing information that can affect the quality of treatment.

1.2 Craniofacial Deformities

Craniofacial deformities are birth defects that affect the head and facial bones. Many variations of the anomalies exist; the most common are cleft lip and cleft palate. Besides the patient's appearance, they can also affect other bodily functions such as breathing, feeding, hearing, and speech. Some syndromic cases may have brain conditions and delayed development. The incidence rate of cleft in Thailand is approximately 1.5 in 1000 infants. The treatment duration of craniofacial deformities can be substantial. Varying by type and severity, many patients are in the treatment process from birth
until 20 years of age. The treatment requires the cooperation of a multidisciplinary team with many specialized medical fields, such as dentists, surgeons, otolaryngologists, speech therapists, pediatric nurses, and pediatric anesthesiologists. The lack of specialists is a real concern. For surgical operations, a limited number of cases means many surgeons never get hands-on experience with the operation. Many operations are only done in regional and university hospitals with sophisticated medical equipment. Patient-specific implants and instruments are required and are expensive to manufacture. Post-operative cleft patients are required to attend speech therapy, as studies show the link between speech ability and delayed development. On the other hand, patients and families face a challenge to maintain a regular schedule for continuing hospital visits. These problems are the main driver behind the development of our healthcare information platform. ThaiCleftLink aims to be an online platform for medical professionals to securely store, search, export, and transfer craniofacial patient information. The most important benefit of this platform is that it reduces unnecessary hospital visits by having the electronic data "move" instead of the patients. The other online health platform in this case study is AorSorMor Online. It is a mobile application developed as a communication platform between primary and secondary care units. Users can receive announcements, report work results, and chat with other volunteers. The detailed structure and functions of these platforms are discussed in Sect. 3.

2 Related Work

Many crowdsourcing platforms have been developed to address collaboration problems. In this section, we examine four groups of crowdsourcing projects, each with different motivations and methodologies. The specific requirements and methodologies are discussed.

2.1 Knowledge Platform

The platforms in this group use crowdsourcing as the source of knowledge: they act as its caretaker and provide an easy-to-access platform to share, search, and collect that information. Wikipedia is a pioneer and the biggest online knowledge platform; it is one of the most successful crowdsourcing projects. Another type of knowledge platform is crowdsourced question-and-answer services such as Quora and the Stack Exchange Network. Their success comes from a rapidly increasing user base [1] and appropriate coordination techniques [10]. The internal self-concept motivation [13] is directly related to the design of our platform: how to encourage contributors who receive no financial compensation to spend their time and knowledge.

2.2 Marketplace

This group uses crowdsourcing to allow employers to connect with large groups of workers. There are usually financial incentives on this type of platform.
Amazon Mechanical Turk is a clear example. Although the platform gained initial popularity, it suffers from work quality issues, which likely come from the unregulated nature of the platform [5].

2.3 Grid Computing

By networking computers in various locations and distributing the workload across these clusters, such systems can achieve high computational power, sometimes even rivaling a supercomputer. When the workload is a scientific goal, they may be called crowd science, citizen science, or community science [6, 11]. Examples are BOINC and Folding@home. Some rely not on distributed computation but on distributed human intelligence, such as Foldit [4, 12].

2.4 Geospatial Temporal

This group comprises platforms that collect and combine geographic information from individuals to create a global-scale view [7]. Examples are OpenStreetMap [8] and Waze. Budhathoki and Haythornthwaite [2] found that casual and serious mappers form distinct groups of volunteers.

3 Online Healthcare Platforms

In this case study, we have two existing healthcare information platforms. We take a look at the similarities and differences between them. Additionally, we give a brief introduction to the previously proposed extensions that extend their functionality by employing crowdsourcing methods.

Fig. 1. Online healthcare platforms. AorSorMor Online targets village health volunteers; ThaiCleftLink targets multidisciplinary teams.

3.1 Primary and Secondary Care: AorSorMor Online

AorSorMor Online is a mobile application developed by Advanced Info Service, intended for village health volunteers and health-promoting hospitals. The application provides a platform for learning, receiving news, messaging, reporting volunteer work, reporting infectious disease occurrences, and making appointments. Users can share many types of information, such as images, audio, video, text, and location.

3.2 Tertiary and Super Tertiary Care: ThaiCleftLink

ThaiCleftLink, by Chiang Mai University's Craniofacial Center, is a web-based platform used by medical professionals. As previously mentioned, its main focus is being a sharing platform between specialists in multiple hospitals for the treatment of cleft patients. The main users are specialists in the area of cleft treatment, including surgeons, nurses, speech therapists, etc. In the initial phase, we deployed the platform to users in the northern region of Thailand, currently with more than 2,200 in-treatment patients and more than 150 active users from multiple hospitals. Figure 1b shows the patient index page. Users can search for patients; the patient index view shows the profile photo and basic information of the patients. As the system runs as a web-based service, the user can access it anywhere with an internet connection. A responsive interface design allows a wide range of web browsers, on both stationary and mobile devices, to fit the pages to the screen. The system modules are geared toward managing medical information related to cleft treatment, including patient basic information, diagnosis, medical history, gallery, appointments, and data visualization. Next, we give some statistics of ThaiCleftLink: the demographics of multidisciplinary users and patients, the growing number of users and patient profiles, the distribution of users among the multidisciplinary groups, and, finally, the number of data-moving actions performed in contrast with walk-in patients.

3.2.1 Demographics

The maps in Fig. 2 show the demographics of users (Fig. 2a) and patients (Fig. 2b) in ThaiCleftLink. Figure 2a shows user locations, which are unsurprisingly clustered in urban areas. On the other hand, patients come from all regions. Figure 2c visualizes patient visits: lines are drawn from the patient's home address to the visited hospitals. It clearly shows that patients regularly travel to regional hospitals. As the capability of local hospitals increases, we hope to see these lines become shorter over time.

3.2.2 Users

Figure 3a shows the number of accumulated users since 2019. We currently serve more than 150 active users.

Table 2. ThaiCleftLink's user disciplines

Disciplinary           | Count
Nurse                  | 51
Dentist                | 19
Dental nurse           | 18
Plastic surgeon        | 17
Orthodontist           | 13
Speech therapist       | 7
Pedodontist            | 6
Occupational therapist | 4
Prosthodontist         | 3
Maxillofacial surgeon  | 2
Audiologist            | 1
Otolaryngologist       | 1
Social worker          | 1
Medical resident       | 1

Fig. 2. ThaiCleftLink's user and patient geolocation in the northern region of Thailand: (a) ThaiCleftLink's user demographic; (b) ThaiCleftLink's patient demographic; (c) patient visits, with lines drawn between the patient's home address and the visited hospital. Retrieved from ThaiCleftLink on 19th August 2020.

The sharp jumps in the number of users are the result
of very successful hospital networking. At the same time, Fig. 3b shows the growth in the number of patient profiles in the ThaiCleftLink database. There, the sharp jumps come from data imports: some freshly joined hospitals prefer to start with some of their patients' basic information in place, so we import the approved patient information directly into the ThaiCleftLink database.

Fig. 3. ThaiCleftLink's user and patient profile growth: (a) number of users over time; (b) number of patient profiles over time.

Table 2 shows the discipline distribution of the users. The majority are nurses, who are case managers inside our network hospitals, followed by a group of dentists and surgeons. The very small number of speech therapists again emphasizes the shortage of specialists in Thailand.

3.2.3 Walk-in and Data Moving

Figure 4 shows the number of patient walk-in visits and data-moving actions. This graph represents the ultimate goal of ThaiCleftLink, which is to reduce the number of hospital visits by transferring data digitally instead. The number of data-moving actions does not increase easily, however, as there are still many treatment processes that patients need to attend in person.

3.3 Platform Cooperation

As we can see, the two platforms target distinct user groups. They provide a much-needed standardized open platform for information exchange. However, they also fall into the gap between healthcare levels: ThaiCleftLink focuses on tertiary and quaternary care units (regional and provincial hospitals), while AorSorMor Online focuses on primary and secondary care units (public health centers and health volunteers). To initiate the information flow between them, in our previous work [9] we proposed two extension modules to make a connection between the two platforms. These modules provide mutual benefits for all user groups on both platforms. Next, we give brief details about these modules.
Fig. 4. Number of patient histories, separated into walk-in and data moving

3.3.1 Searching Module

Fig. 5. Search process

The searching module is used for locating new patients who are not currently under an active treatment plan. By crowdsourcing this action to local health volunteers, we expect more patients to enter our care earlier. As a result, they will receive time-sensitive procedures on time, improving the final treatment result (Figs. 5 and 6).


3.3.2 Following Module

Fig. 6. Follow-up process

Following up is the process of contacting the patient to check on their condition or to get an update of their information. There are four situations where the process is required (a sketch of how such a task might be dispatched follows the list):

1. Appointment confirmation
2. Pre-surgery visit
3. Post-surgery visit
4. Contact information update
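The paper does not define the interface through which a ThaiCleftLink follow-up task reaches AorSorMor Online volunteers, so the following Python sketch is purely illustrative: the endpoint URL, JSON field names, and task-type strings are all hypothetical assumptions, not the platforms' actual API.

import json
import urllib.request

FOLLOW_UP_TYPES = ("appointment_confirmation", "pre_surgery_visit",
                   "post_surgery_visit", "contact_update")

def dispatch_follow_up(patient_id: str, follow_up_type: str, village_code: str):
    """Post a follow-up task for the volunteers covering the patient's village."""
    if follow_up_type not in FOLLOW_UP_TYPES:
        raise ValueError(f"unknown follow-up type: {follow_up_type}")
    task = {"patient": patient_id, "type": follow_up_type, "village": village_code}
    req = urllib.request.Request(
        "https://aorsormor.example/api/tasks",  # hypothetical endpoint
        data=json.dumps(task).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)  # volunteers would see the task in the app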

4 Conclusion

In this work, we looked at a use case of crowdsourcing platforms for a healthcare information system. Starting with the structure of the Thai healthcare system, we pointed out the difficulties in information organization and transfer. To address some of them, we developed an online platform for medical professionals called ThaiCleftLink. It is an online platform for storing, searching, and transferring the medical histories of craniofacial patients. Next, another existing mobile platform, called AorSorMor Online, was introduced. It has been used as the platform for health volunteers and health-promoting hospitals for news announcements, work submission, and communication. We identified the gap between these two platforms: they are intended for different user groups on separate levels. Therefore, extension modules were developed to act as an information and process bridge between them. These modules also employ crowdsourcing techniques to mutually benefit all users on both platforms. The results of the platform deployment were shown. In future work, we wish to address some limitations of ThaiCleftLink, especially the user access control. Also, we plan to upscale it into a national health
database for craniofacial deformities. We would also like to expand the cooperation between AorSorMor Online and ThaiCleftLink, for example with new scenarios besides search and follow-up.

Acknowledgements. This research was partially supported by Chiang Mai University and Advanced Wireless Network Company Limited.

References

1. Almeida, R.B., Mozafari, B., Cho, J.: On the evolution of Wikipedia. In: ICWSM (2007)
2. Budhathoki, N.R., Haythornthwaite, C.: Motivation for open collaboration: crowd and community models and the case of OpenStreetMap. Am. Behav. Sci. 57(5), 548–575 (2013)
3. Chuengsatiansup, K., Suksut, P.: Health volunteers in the context of change: potential and developmental strategies. J. Health Syst. Res. 1(3), 268–279 (2007)
4. Cooper, S., Khatib, F., Treuille, A., Barbero, J., Lee, J., Beenen, M., Leaver-Fay, A., Baker, D., Popović, Z., et al.: Predicting protein structures with a multiplayer online game. Nature 466(7307), 756–760 (2010)
5. Fort, K., Adda, G., Cohen, K.B.: Amazon Mechanical Turk: gold mine or coal mine? Comput. Linguist. 37(2), 413–420 (2011)
6. Franzoni, C., Sauermann, H.: Crowd science: the organization of scientific research in open collaborative projects. Res. Policy 43(1), 1–20 (2014)
7. Goodchild, M.F.: Citizens as sensors: the world of volunteered geography. GeoJournal 69(4), 211–221 (2007)
8. Haklay, M., Weber, P.: OpenStreetMap: user-generated street maps. IEEE Pervasive Comput. 7(4), 12–18 (2008)
9. Khwanngern, K., Natwichai, J., Sitthikham, S., Sitthikamtiub, W., Kaveeta, V., Rakchittapoke, A., Martkamjan, S.: Crowdsourcing platform for healthcare: cleft lip and cleft palate case studies. In: International Conference on Network-Based Information Systems, pp. 465–474. Springer (2019)
10. Kittur, A., Kraut, R.E.: Harnessing the wisdom of crowds in Wikipedia: quality through coordination. In: Proceedings of the 2008 ACM Conference on Computer Supported Cooperative Work, pp. 37–46 (2008)
11. Sauermann, H., Franzoni, C.: Crowd science user contribution patterns and their implications. Proc. Natl. Acad. Sci. 112(3), 679–684 (2015)
12. Schrope, M.: Solving tough problems with games. Proc. Natl. Acad. Sci. 110(18), 7104–7106 (2013)
13. Yang, H.L., Lai, C.Y.: Motivations of Wikipedia content contributors. Comput. Hum. Behav. 26(6), 1377–1383 (2010)

An Implementation Science Effort in a Heterogenous Edge Computing Platform to Support a Case Study of a Virtual Scenario Application

Marceau Decamps1(B), Jean-Francois Meháut1(B), Vinicius Vidal2, Leonardo Honorio2,3, Laércio Pioli2, and Mario A. R. Dantas2,3

1 Université Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, Grenoble, France
[email protected], [email protected]
2 Federal University of Juiz de Fora, Juiz de Fora, Brazil
[email protected], [email protected], {laerciopioli,mario.dantas}@ice.ufjf.br
3 INESC P&D, Santos, Brazil

Abstract. IoT devices are pillars of Industry 4.0 software applications. However, clustering these edge nodes raises interesting open challenges in several dimensions, because of the mandatory integration of diverse hardware and software packages. Integrating different types of industrial cameras with a supercomputer node to support 3D reconstruction is not a trivial approach, especially considering aspects of IoT software engineering. In this paper, we present research which could be classified as an implementation science effort. The target is a heterogenous edge computing platform, utilized to support a real case study of an electrical engineering field application. This application is characterized by a 3D virtual reconstruction paradigm for the hydropower project. Our results indicate interesting aspects related to implementation science, as well as challenges found in the composition and operationalization of this heterogenous edge platform.

1 Introduction

As reported in [1], a record of 4,185 terawatt hours (TWh) of electricity was recently generated from hydropower, avoiding up to 4 billion tonnes of greenhouse gases as well as harmful pollutants. In addition, this document also mentions that Brazil produced 3.4 GW. These numbers illustrate the challenge of working with this Industry 4.0 segment, which also represents 64% of the energy produced in Brazil. Furthermore, the distributed geographical area of the scenario is around 8.5 million square kilometers. Nowadays, it is a common approach to consider virtual and/or augmented reality systems as a basis to enhance project efforts. The goal of reproducing three-dimensional objects and scenarios in virtual environments is ordinarily adopted in several applications. Requirements, particularly from several disciplines, are becoming more complex, especially for IoT software application developers, as observed in the work [2].

One interesting mechanism to reproduce three-dimensional objects and scenarios in virtual environments is to develop a reconstruction software application. As observed in the work presented in [3], this option is a faster and less costly alternative in terms of specialized labor. On the other hand, the number of devices required in these reconstruction processes can be exceptionally large, with different types of architectures. Therefore, orchestration is necessary to gather and produce a synchronized, differentiated result from these nodes of different architectures. Heterogenous edge nodes [4] are interesting architectures to tackle issues where the configuration requires different kinds of IoT devices to provide successful results. This is the case, for example, where devices such as cameras are utilized to gather data to process and provide accurate information to an application. Our work represents one experimental element of collaboration in a large real project in the electricity industry segment, called the Virtual Project. The project is a real effort in the Brazilian electricity industry that targets excellence in assisted maintenance and people training, considering the paradigm of Industry 4.0. Therefore, in this paper we present an experimental research contribution, characterized by an implementation effort in heterogeneous edge nodes, as a computational environment, to provide data to a 3D reconstruction application. Our experiments indicate an interesting set of aspects and issues concerning this heterogenous edge configuration. The paper is organized as follows. In Sect. 2, we present some technical points related to implementation science (IS), virtualization, and edge architectures, since those are the three main technologies adopted in this work. The Virtual Project, a 3D reconstruction software application, is described in Sect. 3. Related works are illustrated in Sect. 4. Our proposed contribution and experimental results are pointed out in Sect. 5. Finally, in Sect. 6, we present our conclusions and future works.

2 IS, Virtualization and Edge Architectures

2.1 Implementation Science (IS)

Implementation science is an approach from the health field which is commonly stated as: "Implementation science is the scientific study of methods and strategies that facilitate the uptake of evidence-based practice and research into regular use by practitioners and policymakers" [5]. This paradigm can be considered in the present research field of Edge-IoT, where heterogeneous IoT device hardware with different software packages is utilized by application programmers without a major standards organization providing methods or strategies for edge configurations. Application programmers usually search environments, such as the one found in [25], for the uptake of evidence-based practice. This is evidence from the theory of implementation science, which is mainly characterized as a field that seeks to systematically close the gap between what we know and what we do, commonly referred to as the know-do gap, by identifying and addressing the barriers that slow or halt the uptake of proven health interventions and evidence-based practices [5, 6].

2.2 Virtualization

As stated in [7], in technical terms virtual reality is the term utilized to describe a process that generates a three-dimensional computer environment which can be explored and interacted with by a person. That person becomes part of this virtual world, is immersed within this environment, and whilst there is able to manipulate objects or perform a series of actions. In this paper, the technique adopted for virtualization is 3D reconstruction. This process is characterized by capturing the shape and appearance of real objects. The literature shows several approaches to implement 3D reconstruction (e.g. [8-10]).

Stereo Vision: as presented in [11], this technique is an affordable and easy approach to implement 3D sensing applications. The distance extraction in stereo vision is based on triangulation between two sensors whose baseline and focal plane are known. The disparity - a key parameter in all triangulation methods - is calculated by processing the images of both sensors (rectification and matching algorithms) and extracting the correspondences. The maximum detectable depth range is proportional to the baseline and the sensor resolution. The final depth resolution is mainly limited by matching and calibration errors [11].

Simultaneous Localization and Mapping (SLAM) is one technique utilized for map generation. As observed in [12], in SLAM an agent generates a map of an unknown environment while estimating its location in it. Distributed cameras lead to monocular visual SLAM, where a camera is the only sensing device for the SLAM process. Nowadays, these edge devices are commonly deployed for the generation of maps, leading to the necessary use of a distributed system. In other words, different distributed edge devices are employed to generate their specific regions; as a result, these devices together cover a wide area. This approach brings some challenges, such as identifying overlapping maps and orchestrating the devices.

RGB-D: as mentioned in [13], camera systems that provide both color and dense depth images, such as the Microsoft Kinect or the Asus Xtion, have become readily available. It is expected that these systems will push development to the level of new 3D perception-based applications. RGB-D sensors are interesting for applications such as 3D mapping, localization, object recognition, and other issues related to color and dense depth images.

Robot Operating System (ROS) [14] is a set of software libraries and tools that help developers build robot applications. The environment provides everything from drivers to state-of-the-art algorithms, as well as powerful developer tools. ROS is an open-source effort, which has several facilities for robotics project development.
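To make the stereo triangulation described above concrete: for rectified cameras, the standard pinhole relation is Z = f * B / d, where f is the focal length in pixels, B the baseline, and d the disparity. The following minimal sketch is ours, not the paper's, and the numbers are illustrative.

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Standard pinhole stereo triangulation: Z = f * B / d.

    disparity_px: horizontal pixel offset of a point between the two
    rectified images; focal_length_px: focal length in pixels;
    baseline_m: distance between the two camera centers in meters.
    """
    if disparity_px <= 0:
        raise ValueError("point not matched or at infinity")
    return focal_length_px * baseline_m / disparity_px

# Example: 700 px focal length, 12 cm baseline, 35 px disparity -> 2.4 m
print(depth_from_disparity(35, 700, 0.12))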


2.3 Edge Architecture

Several studies have proposed classification architectures for edge computing platforms. In this paper, we consider the edge architecture classification found in the research work presented in [4]. Premsankar, Di Francesco and Taleb [4] observe that a review of architectures reveals that the edge of the network is not clearly defined, and that the nodes expected to participate at the edge can vary. It is also mentioned that the terminology used to describe the edge differs greatly, with the same term being used to define different architectures and functionality. Therefore, based on common features of deployments, they classify these architectures into three categories; in practice, features from one category can be used in combination with others. The three categories are as follows [4].

Resource-rich servers deployed close to the end-devices - One option to realize an edge computing platform is to deploy resource-rich servers in the network to which end-users connect. Some examples of these scenarios are: virtual machine (VM)-based cloudlets deployed on WiFi access points, one hop away from end-devices; a multi-tiered system using cloudlets to provide cognitive assistance for users, where video and sensor data collected from users through Google Glass are processed on the cloudlet to provide real-time assistance; and a scalable three-tier system using cloudlets for analytics and automated tagging of crowd-sourced video from user devices. Since the introduction of cloudlets, further research has proposed integrating cloudlets with femtocells, LTE base stations, or even cars.

Heterogeneous nodes at the edge, including the end-devices themselves - Different from the previous scenario, this classification covers a diverse set of computing resources, such as a fog platform, which has the characteristics of a highly virtualized system of heterogeneous nodes. This heterogeneity can also be considered for the wireless connectivity aspects. In other words, different devices, in a fog platform fashion, are orchestrated to process data for applications.

Federation of resources at the edge and centralized data centers - This is another form of composing a platform. In this configuration, also called edge-cloud, edge devices are, for example, employed to provide services locally and/or to cloud environments, and services can be deployed on a cloud infrastructure distributed throughout the Internet.

In this paper, we consider the heterogenous edge platform, because it is an interesting architecture to tackle issues where devices, such as cameras and a computer node, are utilized to gather data and to compute and process it accurately for a virtual application.

3 The Virtual Project The primary goal of the Virtual Project [15, 16, 17] is to support a supervision of electricity substations equipment, i.e. to design an approach which can provide a solution for equipment maintenance and crew training considering in an Industry 4.0 scenario. The Virtual Project is based upon the technology and advances in the computer vision area, providing a reconstruction of an equipment and/or an environment. Several techniques and equipment are experimented and tested to fulfill this purpose. Examples of equipment comprises image and laser data processing, as much as onboard and parallel programming for a low and high level of information processing. The project involves professors and students from undergraduate to postgraduate level, from engineering and computer science. Foreigners students visiting our labs also contribute with their effort, which is the case stated in this paper with the collaboration with Grenoble Alpes University. The technological paradigm of the project effort can be understanding as the utilization of virtual reality glasses and smartphones/tablets after the reconstruction of an


equipment or environment. Therefore, professionals can be trained for dangerous situations with no need for the equipment itself or for real-life simulations. The energy generation and transmission industries can greatly benefit from this type of technology. Large equipment makes it hard to keep track of maintenance techniques, and transmission fault situations must be dealt with in the least amount of time, due to high penalty values imposed by the government.

In the Virtual Project, the first moment is characterized by scanning the environment and the equipment. Both are then placed and worked on in a virtual and augmented reality application. A 3D point cloud is gathered with field scanners built and programmed in a laboratory with the latest sensor technology on the market. The sensor calibration must be accurate and adaptive in order to work in the widest variety of environments. Once the point cloud is acquired, it must be manipulated to create a mesh and a textured model, so that the whole environment or a dedicated piece of equipment can be inserted in the application. At this point, images are gathered from the environment, and computer graphics applications are used to create a 360° view of the scene from that point.

Virtual reality applications can place the employee in front of a piece of equipment that could be miles away, while its data is monitored from anywhere as in a real-world situation. At the end of the day, remote maintenance is facilitated due to a clearer contact with the faulty equipment; a specialist is not required to travel to solve an issue. Interactive maintenance procedures are recorded in a practical manner. Thus, the employee can be trained in the virtual world, and a 3D virtual map of the electricity compound (inside and outside environment) is created with the use of an optimized 3D scanner. Several benefits can be highlighted with this project, such as financial and security aspects for the company, because, as mentioned before, 64% of Brazilian energy comes from this industry, over an area of 8.5 million square kilometers.
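As a hedged sketch of the point-cloud-to-mesh step mentioned above, the fragment below uses the open-source Open3D library as a stand-in; the project's actual meshing toolchain is not named in this paper, so the library, the file names and the parameters are assumptions for illustration.

import open3d as o3d

pcd = o3d.io.read_point_cloud("scan.ply")   # registered point cloud (hypothetical file)
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))
# Poisson surface reconstruction turns the oriented points into a mesh
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=9)
o3d.io.write_triangle_mesh("mesh.ply", mesh)  # ready for texture alignment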

4 Related Works

In this section, we present a view of some related works in the field of the present contribution. Our initial search for similar contributions focused on works targeting the utilization of reconstruction in their applications. In a second moment, we sought to find which methods were being adopted to address assisted maintenance and 3D reconstruction issues. In the literature we found interesting research referring to applications that utilize reconstruction, and other works describing their methods to support reconstruction. All those references can be considered related to the Virtual Project, so we present some of their main ideas below.

The work presented in [18] describes an effort in utilizing virtual reality for a prior analysis of mechanical equipment maintenance, targeting the reduction of maintenance time and cost. It also aimed to reduce the number of equipment failures. These are interesting aspects for a software application. Spacecraft reconstruction and 3D models that store relevant historical information are described in [19]. The research of von Stumberg et al. [20] was oriented to monocular SLAM for autonomous drone exploration, targeting the mapping and localization of obstacles and reconstruction without texture.


Schonberger and Frahm [8] describe an approach based on structure from motion, relying on feature matching and SIFT descriptors. The contribution of Whelan et al. [9] is characterized by tackling dense SLAM in real time, with a focus on the map, based on surface elements and a deformation graph. In the research presented in [10], the focus was on an optimization similar to that of ElasticFusion for camera localization and the inclusion of features.

5 Proposal and Experimental Results

In this section, we present our experimental research proposal and experimental results, which represent one contribution to the Virtual Project.

The contribution of the present paper is an experimental research effort based on previous works from our group presented in [3] and [21], from which we followed one proposed future work direction. The direction selected was the composition of a new heterogeneous environment formed by cameras and an edge computing node. Therefore, we first conducted a search of possible computer hardware, observing its suitability for embedded applications. The NVIDIA Jetson TX-2 [22] was selected to provide differentiated performance for the 3D reconstruction. The choice was based on characteristics found in the development kit reports, which state advantages of its processing capacity, its GPU and an Ubuntu operating system compatible with ROS [14]. In other words, our goal in this step was to choose a device which could enhance the process of scanning an environment, helping to promote a better model of it. It is important to remember that our efforts involve dimensions where hardware, software and integration represent the pillars of a 3D reconstruction application.

After incrementing the environment configuration with the Jetson TX-2, it could be characterized, as categorized in [4], as a heterogeneous edge computing platform, where different types of cameras and a supercomputer on a module were integrated. The devices used in our experiments in the heterogeneous edge computing environment were one RGB-D camera, one stereo camera with two RGB sensors, and a supercomputer on a module. These edge devices are described below.

ORBBEC Astra [23]: an original depth camera based on stereo vision and structured light technology. It can acquire high-accuracy RGB-D images with full-color texture at a working distance ranging from 0.2 to 2 m. The camera is small and compact, and easy to integrate with robots for diverse applications such as machine vision, security systems and manufacturing.

DUO MC [24]: an ultra-compact, configurable stereo camera with a standard USB interface, intended for use in consumer and industrial systems. The camera's high speed and small size make it ideal for existing and new vision-based applications, delivering configurable and precise stereo imaging for remote vision, robotics, microscopy, medical applications, human-computer interaction and beyond.

NVIDIA Jetson TX-2 [22]: a module which brings AI computing features to the edge. It offers a variety of standard hardware interfaces that make it easy to integrate into a wide range of products and form factors. The features of this computer are a quad-core ARM64 processor,


8 GB of DDR4 RAM, a 256-core Pascal GPU, a 5.5–19.6 VDC input power supply (7.5 W under typical load), and Ubuntu 16.04 for Tegra.

Figure 1 illustrates the interaction of the Astra camera and the DUO camera with the compute node. The first camera acts as the calibrated point cloud node, while the other camera performs as an ORB-SLAM2 node. Finally, the compute node executes the registering and filtering function and then stores the output registered point cloud.

Fig. 1. Functions and relations of the heterogeneous edge devices.

Figure 2 shows the real heterogeneous edge platform used to support our 3D reconstruction experiments. The object of the reconstruction, shown in Fig. 3, was the laboratory environment, because of non-disclosure agreements signed by us in relation to the Virtual Project. In other words, since our goal was experimental research, we selected a well-known landscape which could provide clearer highlights of our experiments and results.

Fig. 2. The heterogeneous edge environment supporting 3D reconstruction.

In Fig. 4, it is possible to see two pictures gathered by the Astra camera after calibration, considering two points of view. Figure 5 illustrates the methodology applied in this work. The goal is to reconstruct the environment as a point cloud [27] for further processing. To achieve this, the stereo camera was used to obtain Visual Odometry (VO) data. This camera was synchronized with the 3D information gathered by the well-calibrated RGB-D camera. Results from both cameras are transmitted to a final processing node, which registers the


Fig. 3. The laboratory environment utilized as a case study.

point clouds and provides the final data in an online fashion. Every camera acquisition and processing step is performed by a ROS edge node, and the nodes are synchronized in software. Therefore, the data gathered at the same instant at the beginning of the process meets up again in the final registration node.
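A minimal sketch of such a software-synchronized registration node under ROS is shown below, assuming rospy and hypothetical topic names; the project's actual node and topic layout is not published here, and the registration itself is only indicated by a comment.

#!/usr/bin/env python
import rospy
import message_filters
from sensor_msgs.msg import PointCloud2
from nav_msgs.msg import Odometry

def register(cloud, vo):
    # The RGB-D cloud and the stereo VO gathered at (approximately) the same
    # instant meet up again here; registering and filtering would follow.
    rospy.loginfo("pair: cloud %s / vo %s", cloud.header.stamp, vo.header.stamp)

rospy.init_node("registration_node")
cloud_sub = message_filters.Subscriber("/astra/points", PointCloud2)  # hypothetical topic
vo_sub = message_filters.Subscriber("/orb_slam2/odom", Odometry)      # hypothetical topic
sync = message_filters.ApproximateTimeSynchronizer([cloud_sub, vo_sub],
                                                   queue_size=10, slop=0.1)
sync.registerCallback(register)
rospy.spin()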

Fig. 4. Scene gathered by the Astra camera after calibration.

Fig. 5. The DUO stereo camera tested under the ROS framework.

Summarizing our experiments at this point, the primary results for both cameras can be seen in Figs. 2, 3 and 4. Figure 2 presents the Jetson TX-2 working with the Astra


camera set up. Figures 3 and 4 present a view of the laboratory environment in both 2D and 3D, gathered by the Astra camera after calibration. Finally, the DUO camera is tested under the ROS environment in Fig. 5.

Aiming to enhance the performance of the environment with the compute node, our next experiment considered an open-source SLAM method named ORB-SLAM2, developed by [26], which is used in this work, with results presented in Fig. 6. The goal was to obtain the stereo VO by means of the DUO camera, which performs well due to its combination of a low-resolution, well-calibrated pair of mono cameras, leading in the end to online processing speed. Moreover, the method is publicly available, so it can be used as a standard, reference VO algorithm in higher-level applications. It was also well placed in terms of accuracy when tested on well-known datasets, such as the KITTI stereo outdoor scenarios [29].

Figure 7 shows the environment reconstruction resulting from the entire method. The full working station was manually scanned by the camera group. The final point clouds are saved and filtered at millimeter-level resolution by the final registering node and can move on to mesh generation and texture alignment.

Fig. 6. ORB-SLAM2 tested under DUO camera imaging in real time.

The challenges found during our work can be classified along two dimensions. The first issue was found in the edge camera nodes, where each piece of software involved in this development had several required dependencies; examples were ORB-SLAM2, with OpenCV, Pangolin and other software packages. The second dimension was the Jetson TX-2 edge node, which requires a specific installation, tied to its processor, for interoperability with the other nodes of the edge configuration. In reference [25], it is possible to obtain scripts and tutorials for many of the subjects and challenges found during the integration development. It is not straightforward to make pieces of those heterogeneous edge nodes from these two dimensions interoperate. Therefore, in our empirical experiments we verified, throughout the heterogeneous edge node environment, issues related to programming IoT software systems in comparison to conventional system architectures.


Fig. 7. The environment reconstruction as a result of the entire method.

6 Conclusions and Future Works

In this paper we presented an experimental research effort, which can be classified as an implementation science approach. The work was conceived to support a reconstruction paradigm for 3D environments through the utilization of a heterogeneous edge computing platform.

The primary edge nodes of the configuration were composed of IoT devices, characterized by special cameras for industrial environments. Those cameras had characteristics such as ease of integration with robots for diverse applications (e.g. machine vision) and precise stereo imaging for remote vision. Those characteristics require a high-performance computing approach, which was supported by the supercomputer edge node, the NVIDIA Jetson TX-2.

The work developed consisted first in the selection of appropriate industrial cameras with specific features to compose the heterogeneous edge configuration supporting the 3D reconstruction software. Afterwards, we selected and integrated a computing edge node to provide the performance required by the reconstruction function. During the development we found several challenges related to the cameras' software dependencies and the specific installation of the compute node for interoperability with the other edge nodes.

The aspects commented on above about the communication of the heterogeneous edge nodes indicate an interesting future research direction. More appropriate support for IoT software packages could be developed to help application programmers, as stated in the implementation science paradigm and as illustrated by the efforts


in [2, 5, 6] and [25]. Another interesting future work is to characterize I/O and storage activity, similar to the research found in [28].

Acknowledgments. The authors acknowledge the financial funding and support of the following companies: TBE and EDP, under the supervision of ANEEL, the Brazilian Regulatory Agency of Electricity (project number PD-02651-0013/2017). In addition, the authors thank the support from INESC Brazil, University of Grenoble Alpes, Federal University of Juiz de Fora (UFJF), and the Brazilian National Research Council (CNPq).

References

1. Hydropower. https://www.hydropower.org/country-profiles/brazil. Accessed 24 June 2020
2. Motta, R.C., de Oliveira, K.M., Travassos, G.H.: On challenges in engineering IoT software systems. In: SBES 2018: Proceedings of the XXXII Brazilian Symposium on Software Engineering, pp. 42–51 (2018)
3. Silva, L.A.Z.: Reconstruction with multiple cameras and distributed system under the fog paradigm. MSc dissertation, Electrical Engineering Department, UFJF (2019). https://www.ufjf.br/ppee/2019/02/21/defesa-de-dissertacao-de-mestrado-luiz-augusto-zillmann-da-silva/
4. Premsankar, G., Di Francesco, M., Taleb, T.: Edge computing for the Internet of Things: a case study. IEEE Internet Things J. 5, 1275–1284 (2018)
5. Implementation Science. https://impsciuw.org/implementation-science/learn/implementation-science-overview/. Accessed 24 June 2020
6. Handley, M.A., Gorukanti, A., Cattamanchi, A.: Strategies for implementing implementation science: a methodological overview. Emerg. Med. J. 33, 660–664 (2016)
7. Virtual Reality Society. https://www.vrs.org.uk/virtual-reality/what-is-virtual-reality.html. Accessed 24 June 2020
8. Schonberger, J.L., Frahm, J.-M.: Structure-from-motion revisited. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4104–4113 (2016)
9. Whelan, T., Kaess, M., Johannsson, H., Fallon, M., Leonard, J.J., McDonald, J.: Real-time large-scale dense RGB-D SLAM with volumetric fusion. Int. J. Robot. Res. 34, 598–626 (2015)
10. Dai, A., Nießner, M., Zollhöfer, M., Izadi, S., Theobalt, C.: BundleFusion: real-time globally consistent 3D reconstruction using on-the-fly surface reintegration. ACM Trans. Graph. (TOG) 36, 76a (2017)
11. AMS. https://ams.com/stereovision. Accessed 24 June 2020
12. Egodagamage, R., Tuceryan, M.: Distributed monocular SLAM for indoor map building. J. Sens. 2017, Article ID 6842173, 11 pages (2017)
13. RGB-D. https://vision.in.tum.de/research/rgb-d_sensors_kinect. Accessed 24 June 2020
14. ROS. https://www.ros.org/. Accessed 24 June 2020
15. Inerge. https://www.ufjf.br/inerge/. Accessed 24 June 2020
16. Grin. https://www.ufjf.br/inerge/institucional/laboratorios/grin/. Accessed 24 June 2020
17. NenC. https://www.ufjf.br/nenc/. Accessed 24 June 2020
18. Qing, H.: Research and application of virtual reality technology in mechanical maintenance. In: International Conference on Advanced Technology of Design and Manufacture (ATDM 2010), pp. 256–258. IET (2010)
19. Shcherbinin, D.: Virtual reconstruction and 3D visualization of Vostok spacecraft equipment. In: 2017 International Workshop on Engineering Technologies and Computer Science (EnT), pp. 56–58. IEEE (2017)


20. von Stumberg, L., Usenko, V., Engel, J., Stückler, J., Cremers, D.: From monocular SLAM to autonomous drone exploration. In: 2017 European Conference on Mobile Robots (ECMR), pp. 1–8. IEEE (2017)
21. Silva, L., Vidal, V., Silva, M., Santos, M., Carvalho, A., Cerqueira, A., Honório, L., Rezende, H., Ribeiro, J., Pancoti, A., et al.: Automatic recognition of electrical grid elements using convolutional neural networks. In: 2018 22nd International Conference on System Theory, Control and Computing (ICSTCC), pp. 822–826. IEEE (2018)
22. Jetson. https://developer.nvidia.com/embedded/jetson-tx2. Accessed 24 June 2020
23. Astra. https://orbbec3d.com/. Accessed 24 June 2020
24. DuoMC. https://duo3d.com/product/duo-mc-lv1. Accessed 24 June 2020
25. Jetson Hacks. JetsonHacks.com. Accessed 24 June 2020
26. Mur-Artal, R., Tardós, J.D.: ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 33(5), 1255–1262 (2017)
27. Pointclouds. https://pointclouds.org. Accessed 24 June 2020
28. Inácio, E.C., Nonaka, J., Ono, K., Dantas, M.A.R., Shoji, F.: Characterizing I/O and storage activity on the K computer for post-processing purposes. In: ISCC, pp. 730–735 (2018)
29. Geiger, A., et al.: Vision meets robotics: the KITTI dataset. Int. J. Robot. Res. 32(11), 1231–1237 (2013)

Detection and Analysis of Meal Sequence and Time Based on Internet of Things

Liyang Zhang, Hiroyuki Suzuki, and Akio Koyama

Graduate School of Science and Engineering, Yamagata University, 4-3-16 Jonan, Yonezawa, Yamagata, Japan
[email protected], {shiroyuki,akoyama}@yz.yamagata-u.ac.jp

Abstract. With the prevalence of lifestyle diseases, systems for monitoring meal information through the Internet of Things have become widespread. Studies have shown that the sequence and time of meals are among the factors that affect lifestyle diseases such as diabetes and obesity. An improved smart tableware is proposed which can detect meal information more effectively. Since meal information is detected sequentially, a method using recurrent neural networks to detect the meal sequence is introduced, and its feasibility is demonstrated by experiments. After the information obtained from the tableware is analyzed, the meal information is fed back to users to help them improve their eating habits. In meal experiments, we used the improved smart tableware and proved the feasibility of the proposed method.

1 Introduction

Lifestyle diseases are diseases related to the lifestyle of a person or a group of people. Diseases such as heart disease, obesity and type 2 diabetes belong to this category [1]. In recent years, lifestyle diseases have become more and more common. Compared with 108 million in 1980, the number of people with diabetes had increased to 422 million by 2014 [2]. The prevalence of overweight or obesity among adolescents and children aged 5–19 more than quadrupled worldwide from 1975 to 2016 [3]. There is a strong link between food intake and obesity, and lifestyle factors such as dietary habits are also behind the rapid rise in the incidence of diabetes [4]. Therefore, the prevention of lifestyle diseases has become a topic of increasing concern.

People have realized the importance of preventing lifestyle diseases and are using various possible methods to do so. Some traditional methods monitor food intake through self-reports, such as questionnaires [5] or information entered into mobile devices [6]. However, entering this information takes time, which makes it difficult for users to keep using such tools. And because the reports rely on memory, there may be underreporting or overreporting, so food intake may not be reported accurately.

In recent years, wearable sensor-based devices have become popular, and food intake is detected through the processes of biting, chewing, and swallowing. Dong et al. [7]


detected bites by studying the movement of the wrist: they found that a wrist roll occurs during a bite and designed a watch-like device to detect this motion. Experimental results show that the accuracy reached more than 80%. A method of detecting chewing with a piezoelectric film sensor has been proposed [8]; in the experiment, the chewing process is detected by sticking the sensor below the outer ear, but this may cause discomfort to users. A new glasses-like wearable device has also been proposed [9], which can detect food intake during user activities, and the average f-score in the experiment reached more than 90%. A method of detecting swallowing by embedding a sensor in a wearable necklace has been proposed [10]; in the experiment, the necklace was placed on the throat to detect its movement, and for the detection of the intake of liquids and solids the f-score reached more than 80%. However, the tightness of the necklace affects the experimental results, and a tight necklace causes user discomfort.

On the other hand, camera-based food information detection systems have entered our lives. FoodLog [11] manages food information by analyzing the food in photos, which users can view through the system. It has also been proposed to count chews by analyzing face and hand information during food intake with a Kinect Xbox One camera sensor [12]. Since there is movement of the hands and the jaw during a meal, the bite count is detected by tracking them and performing gesture analysis. Although the accuracy in the experiment reached more than 90%, the position of the Kinect sensor affects the tracking and recognition of body parts.

The research presented above focused on meal intake, meal composition, bite count, and so on. However, existing research has found that the sequence of a meal has an impact on glycemic control. For example, for patients with type 2 diabetes, eating vegetables before carbohydrates can achieve better glycemic control [13, 14]. In other words, a reasonable meal sequence can help prevent lifestyle diseases to a certain extent. Therefore, this study focuses not only on the composition and total intake of meals, but also on the sequence and time of meals.

In our previous research, an acceleration sensor was attached to the smart tableware [15]. To collect meal information comprehensively, multiple sensors were used [16, 17], and different machine learning algorithms were used for experiments. In this study, we introduce the improved smart tableware, which can detect the meal process accurately. Since the recurrent neural network (RNN) is a type of neural network that processes sequence data, a method for detecting the meal sequence using an RNN on the data obtained from the smart tableware is introduced, and the feasibility of the method is proved through experiments. After that, the obtained data is analyzed and fed back to users to help them manage meals.

The rest of the paper is organized as follows. In Sect. 2, we introduce the improved smart tableware, feature extraction, and how meal information is processed and browsed. The evaluation is given in Sect. 3. Finally, the conclusion of the paper is given in Sect. 4.

2 Methodology

In this section, we first introduce the improved smart tableware, through which information on the meal process is obtained. Then, the information obtained from the smart


tableware is analyzed, and feature extraction and machine learning are carried out to determine meal information such as time and sequence. Finally, users can browse meal information, such as the meal sequence and time, nutrition intake and so on, through the system.

2.1 Smart Tableware

During a meal, the weight of the food in the tableware decreases, and there may also be moments when the tableware moves toward the mouth, and so on. Therefore, we consider detecting the acceleration and pressure information of the tableware during meals. The acceleration information can be used to determine whether the tableware is moving or stationary, and the pressure information can be used to determine the change in food weight. Here, we use two sensors. One is a circular flexible PR-C18.3-ST thin-film pressure sensor with a diameter of 18.3 mm, which provides highly sensitive pressure detection. The other is the MPU-9250, which is used to detect whether the tableware is moving and can output acceleration in the 3-axis directions, among other data. At the same time, we used the nRF24L01 for wireless data transmission and an STM32F103C8T6 as the microcontroller. The smart tableware we use and the positions where the sensors are attached are shown in Fig. 1. In the experiment, the tableware is placed on the pressure sensor, and the acceleration sensor is attached to the tableware.

Fig. 1. Smart tableware and the position where the sensors are attached.

The improved hardware can transmit information wirelessly, and a record is output about every 0.25 s, which allows us to grasp the state of the tableware more effectively and helps to improve the accuracy of meal information prediction.

2.2 Data Processing and Feature Extraction

At present, the data obtained from the smart tableware contains three types of information: the ID, the pressure information, and the acceleration information of the tableware.


An example of the log output from the smart tableware is shown in Fig. 2. It contains the ID of the tableware, the current output time in milliseconds, the acceleration in the 3-axis directions, and the pressure information. The output here consists of records with tableware IDs A, B and C, respectively; of course, more records with different IDs can be output at the same time. The records are output as sequential data, and each record reflects the acceleration and pressure information of the corresponding tableware at the current time.

Fig. 2. The content and corresponding meaning of the log.
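To illustrate how one such record could be consumed downstream, the sketch below parses a single line into a typed structure; the comma-separated layout and the sample values are assumptions for illustration, the actual log syntax being the one shown in Fig. 2.

from dataclasses import dataclass

@dataclass
class TablewareRecord:
    tableware_id: str  # e.g. "A", "B", "C"
    time_ms: int       # output time in milliseconds
    ax: float          # 3-axis acceleration
    ay: float
    az: float
    pressure: float    # thin-film pressure sensor reading

def parse_record(line: str) -> TablewareRecord:
    tid, t, ax, ay, az, p = line.strip().split(",")
    return TablewareRecord(tid, int(t), float(ax), float(ay), float(az), float(p))

rec = parse_record("A,1250,0.02,-0.01,0.98,1.73")  # values are made up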

Because the sensors are attached to the tableware for the experiment, each piece of tableware corresponds directly to an ID. We distribute the food to the tableware directly, so that we can determine the food information corresponding to each ID. The system also has a page to map foods to the different IDs in advance. In this way, we can know the name of the food, its nutrients, and other information corresponding to each piece of tableware. In addition, during the meal, the change in the weight of the food is reflected by the pressure information, and whether the tableware is moving is described by the acceleration information. By combining the acceleration and pressure information, the sequence and time of the meal can be determined, as shown in Fig. 3.

Fig. 3. Meal information obtained from smart tableware.

To reflect the changing process of acceleration and pressure information during the meal, feature extraction and machine learning are performed. According to the information obtained from the tableware, 6 features are extracted, comprising 3 acceleration features and 3 pressure features. The 3 acceleration features are "resultant acceleration", "resultant acceleration difference", and "moving average deviation rate of resultant acceleration". The resultant of the 3-axis acceleration Ax, Ay and Az is calculated using Eq. (1).

Resultant acceleration = (Ax^2 + Ay^2 + Az^2)^(1/2)    (1)
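A minimal sketch of the three acceleration features follows; the sliding-window length and the exact definition of the moving average deviation rate are our assumptions for illustration, since they are not stated here.

import math

def resultant(ax, ay, az):
    # Eq. (1): resultant of the 3-axis acceleration
    return math.sqrt(ax ** 2 + ay ** 2 + az ** 2)

def acceleration_features(samples, window=8):
    # samples: chronological list of (ax, ay, az) tuples, one per record
    res = [resultant(*s) for s in samples]
    feats = []
    for i, r in enumerate(res):
        diff = r - res[i - 1] if i > 0 else 0.0   # resultant acceleration difference
        hist = res[max(0, i - window + 1):i + 1]  # sliding window (length assumed)
        mavg = sum(hist) / len(hist)
        dev = (r - mavg) / mavg if mavg else 0.0  # deviation rate from the moving average
        feats.append((r, diff, dev))
    return feats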


The 3 pressure features extracted are "pressure (voltage) value", "pressure difference", and "moving average deviation rate of pressure". The extracted acceleration features are expected to reflect whether the tableware is in motion and whether that motion is continuous, and the extracted pressure features are expected to reflect the process and trend of food weight changes. For each record output by the smart tableware, the features are extracted sequentially.

2.3 Detection of Meal Sequence and Time

The meal information obtained from the smart tableware is sequential, as shown in Fig. 4, which depicts records captured by the smart tableware being output every 250 ms. Of course, this is just an example; in practice, the output occurs about once every 250 ms. We distinguish the output records according to their tableware IDs and handle them separately. We extract a feature vector from each output record of the smart tableware, and these feature vectors are used for machine learning, which means that we determine whether eating is occurring every 250 ms.

Fig. 4. The judgment of sequential data.

Figure 5 shows an example assuming a 15-s meal process. In this example, since the 15-s meal process is relatively long, it is difficult to draw the judgment process for each period in detail. Therefore, we draw a prediction every 3 s here, while in the actual system a prediction is made every 250 ms. In this example, it is judged that nothing was eaten in the third and fifth periods, rice was eaten in the first and second periods, and miso soup was drunk in the fourth period. In this way, the sequence and time of the meal can be determined: the time spent eating rice is 6 s, the time spent drinking miso soup is 3 s, and the meal sequence can be interpreted as eating rice and then drinking soup.

What is obtained from the tableware is sequential data, and RNNs are considered suitable for processing the relationships within sequential data and have been widely used. To learn from past data, the output of the layer is fed back into the network at each step of the sequence. The gated recurrent unit (GRU) [18] is an RNN variant that can retain information over long periods. Therefore, RNNs are used to predict meal information in this paper, and we used a standard RNN and a GRU for our experiments.


Fig. 5. The judgment method of sequence and time of meal.
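A minimal sketch of such a GRU-based detector, written with PyTorch [19], is given below; the hidden size and other hyperparameters are illustrative assumptions, not necessarily the settings used in the experiments reported in Sect. 3.

import torch
import torch.nn as nn

class MealGRU(nn.Module):
    def __init__(self, n_features=6, hidden=32, n_classes=2):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):        # x: (batch, seq_len, 6 features)
        out, _ = self.gru(x)     # out: (batch, seq_len, hidden)
        return self.fc(out)      # one eating / not-eating logit pair per record

model = MealGRU()
logits = model(torch.randn(1, 40, 6))  # 40 records, i.e. about 10 s of output
pred = logits.argmax(dim=-1)           # per-record eating decision

Because the GRU keeps a hidden state across records, the decision for each 250-ms step can take the preceding records into account, which is the motivation for using recurrent networks on this data.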

2.4 Browsing of Meal Information

After feature extraction and machine learning, the processed data is saved to the server for users to browse. The system contains modules such as a calendar, personal information, and food assignment, as shown in Fig. 6; users can click a link to enter the corresponding page to perform more detailed operations or read the corresponding information.

Fig. 6. The homepage of the system. Users can click on the calendar, food assignment or personal information to enter different pages.

Users need to register simple personal information such as age and gender, so that standard intake data can be matched to them. Before eating, the food must be matched with the tableware. As shown in Fig. 7, the corresponding food is selected for each piece of tableware, so that the nutrients, calories, and other information of the food intake can be determined.

Fig. 7. Match the food with the tableware.

For meals that have already taken place, the user can click on the calendar to enter the date selection page and view the meal information for the corresponding date, such as calorie intake, meal time, and meal sequence, as shown in Figs. 8 and 9. Based on


the meal information, some suggestions are also given, concerning, for example, eating too fast or chewing frequency.

Fig. 8. Display salt and calories intake information.

Fig. 9. Display sequence and time of meal information.

3 Evaluation

We used the smart tableware to conduct 10 meal experiments, eating a variety of foods in each experiment. The sensors were attached to the tableware, and the food and the tableware were matched in advance to distinguish the food information associated with each piece of tableware. In the experiments, we used a camera to record the meal process, which facilitates the labeling of instances. In addition, we used sensitivity, specificity and accuracy to evaluate the performance of the different algorithms, see Eqs. (2), (3) and (4).

Sensitivity = TP / (TP + FN)    (2)

Specificity = TN / (TN + FP)    (3)

Accuracy = (TN + TP) / (TN + TP + FN + FP)    (4)

where TP, FN, TN and FP denote true positives, false negatives, true negatives and false positives, respectively. Sensitivity represents the proportion of true positive assessments among all positive assessments, specificity represents the proportion of true negative assessments among all negative assessments, and accuracy represents the proportion of correct assessments among all assessments.
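As a small worked illustration of Eqs. (2), (3) and (4), the following functions compute the three measures from raw confusion counts; the example numbers are made up.

def sensitivity(tp, fn):
    return tp / (tp + fn)                   # Eq. (2)

def specificity(tn, fp):
    return tn / (tn + fp)                   # Eq. (3)

def accuracy(tp, tn, fp, fn):
    return (tn + tp) / (tn + tp + fn + fp)  # Eq. (4)

# Made-up counts for illustration only
print(sensitivity(900, 100), specificity(7200, 375), accuracy(900, 7200, 375, 100))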


A total of 22872 instances were extracted from the 10 experiments, of which 14297 instances from 6 experiments were used for training and 8575 instances from 4 experiments were used for testing; each instance is represented by the 6 features. In the experiments, we evaluated a standard RNN and a GRU, implemented with the PyTorch [19] machine learning library. The experimental results are shown in Figs. 10 and 11, which report the sensitivity, specificity and accuracy of the two algorithms. At the same time, we compared the results of using only the features extracted from the acceleration, only the features extracted from the pressure, and the combination of acceleration and pressure features, referred to as acceleration information, pressure information and combination information in Figs. 10 and 11.

Fig. 10. The results of the sensitivity, specificity and accuracy of RNN.

Fig. 11. The results of the sensitivity, specificity and accuracy of GRU.

In general, the accuracy results of the two algorithms are comparable. For the combination information, the accuracy of the RNN and the GRU reached 94.39% and 94.67%, respectively. Although the GRU has higher specificity and accuracy than the RNN on the combination


information, the sensitivity of the RNN is slightly higher. For the acceleration information, the sensitivity and accuracy of the RNN are better than those of the GRU; although the specificity of the RNN is above 90% for the acceleration information, it is slightly inferior to the GRU's. The GRU has the advantage on the pressure information, where its sensitivity, specificity and accuracy are all higher than the RNN's. We can see that combining acceleration and pressure information achieves higher accuracy than using only acceleration information or only pressure information. It can also be seen from the experimental results that the specificity is generally better than the sensitivity, which may be related to the relatively large number of negative instances.

4 Conclusion

This research aims to help people prevent lifestyle diseases. Meal information is collected through the use of smart tableware; after machine learning and meal information analysis, the system provides users with reasonable meal suggestions. This paper shows that the improved smart tableware can reflect the state of the tableware more accurately. Using recurrent neural networks to predict meal information and comparing the experimental results, the accuracy of the two algorithms reached more than 90%.

In the future, we hope to embed the sensors to optimize the tableware. In addition, we need to solve the inconvenience caused by the need to match food and tableware in advance in the system. Moreover, we need to evaluate the effectiveness of the features. We currently use 6 features, including 3 acceleration features and 3 pressure features. If we can achieve higher experimental results with more effective features, we can improve efficiency and reduce running time.

References

1. MedicineNet.com: Medical definition of lifestyle disease. https://www.medicinenet.com/script/main/art.asp?articlekey=38316. Accessed 10 July 2020
2. World Health Organization: Diabetes. https://www.who.int/news-room/fact-sheets/detail/diabetes. Accessed 10 July 2020
3. World Health Organization: Obesity. https://www.who.int/health-topics/obesity#tab=tab_1. Accessed 10 July 2020
4. Sami, W., Ansari, T., Butt, N.S., Ab Hamid, M.R.: Effect of diet on type 2 diabetes mellitus: a review. Int. J. Health Sci. 11(2), 65–71 (2017)
5. Day, N.E., McKeown, N., Wong, M.Y., Welch, A., Bingham, S.: Epidemiological assessment of diet: a comparison of a 7-day diary with a food frequency questionnaire using urinary markers of nitrogen, potassium and sodium. Int. J. Epidemiol. 30(2), 309–317 (2001)
6. Wohlers, E.M., Sirard, J.R., Barden, C.M., Moon, J.K.: Smart phones are useful for food intake and physical activity surveys. In: Proceedings of the 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 5183–5186 (2009)
7. Dong, Y., Scisco, J., Wilson, M., Muth, E., Hoover, A.: Detecting periods of eating during free-living by tracking wrist motion. IEEE J. Biomed. Health Inform. 18(4), 1253–1260 (2014)
8. Sazonov, E.S., Fontana, J.M.: A sensor system for automatic detection of food intake through non-invasive monitoring of chewing. IEEE Sens. J. 12(5), 1340–1348 (2012)
9. Farooq, M., Sazonov, E.: A novel wearable device for food intake and physical activity recognition. Sensors (Basel) 16(7), 1067 (2016)
10. Kalantarian, H., Alshurafa, N., Le, T., Sarrafzadeh, M.: Monitoring eating habits using a piezoelectric sensor-based necklace. Comput. Biol. Med. 58, 46–55 (2015)
11. Aizawa, K., Ogawa, M.: FoodLog: multimedia tool for healthcare applications. IEEE Multimed. 22(2), 4–8 (2015)
12. Bin Kassim, M.F., Mohd, M.N.H.: Food intake gesture monitoring system based on depth sensor. Bull. Electr. Eng. Inform. 8(2), 470–476 (2019)
13. Kuwata, H., Iwasaki, M., et al.: Meal sequence and glucose excursion, gastric emptying and incretin secretion in type 2 diabetes: a randomized, controlled crossover, exploratory trial. Diabetologia 59(3), 453–461 (2016)
14. Imai, S., Matsuda, M., Hasegawa, G., et al.: A simple meal plan of 'eating vegetables before carbohydrate' was more effective for achieving glycemic control than an exchange-based meal plan in Japanese patients with type 2 diabetes. Asia Pac. J. Clin. Nutr. 20(2), 161–168 (2011)
15. Kaiya, K., Koyama, A.: Design and implementation of meal information collection system using IoT wireless tags. In: Proceedings of the 10th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS 2016), vol. 66, pp. 503–508 (2016)
16. Zhang, L., Kaiya, K., Suzuki, H., Koyama, A.: A smart tableware-based meal information collection system using machine learning. Int. J. Web Grid Serv. 15(2), 206–218 (2019)
17. Zhang, L., Kaiya, K., Suzuki, H., Koyama, A.: Meal information recognition based on smart tableware using multiple instance learning. In: Proceedings of the 22nd International Conference on Network-Based Information Systems (NBiS-2019), AISC, vol. 1036, pp. 189–199 (2019)
18. Chung, J., Gülçehre, C., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling. CoRR, abs/1412.3555 (2014)
19. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Proceedings of the Advances in Neural Information Processing Systems, pp. 8024–8035 (2019)

An Approach of Time Constraint of Data Intensive Scalable in e-Health Environment

Eliza Gomes, Rubens Zanatta, Patricia Plentz, Carlos De Rolt, and Mario Dantas

1 Federal University of Santa Catarina (UFSC), Florianopolis, Brazil
[email protected], [email protected], [email protected]
2 State University of Santa Catarina, Florianopolis, Brazil
[email protected]
3 Federal University of Juiz de Fora (UFJF), Juiz de Fora, Brazil
[email protected]

Abstract. The increasing use of smart devices connected to the Internet has driven the technological industry and academia to propose applications that allow us to live in a more secure and autonomous way. However, technological advancement has been accompanied by increasingly demanding time requirements, such as fast processing, low latency, and the presentation of data within acceptable times. Therefore, we propose in this article a model and a computational architecture for a distributed IoT environment with a fog computing configuration for a healthcare application. The goal of our proposal is to provide the correct use of specialized tools so that it is possible to indicate a time constraint and thus process and present the data near real-time. We implemented the proposed architecture and generated preliminary results that present reports, through graphs and tables, of the current situation of the assisted user, as well as generating alerts for abnormal situations.

1 Introduction

The large-scale use of smart devices connected to the Internet has transformed several aspects of the way we live. In a smart home, for example, the deployment of IoT devices can offer more security and energy efficiency. Health monitoring devices can provide greater independence for the elderly and people with disabilities. Smart IoT traffic systems, vehicular networks, and sensors embedded in roads minimize congestion and accidents in a smart city.

However, IoT applications require an environment with support for mobility and geographic distribution, in addition to location awareness and low latency, characteristics often not present in cloud computing. To meet these requirements, a platform called fog computing was proposed by [3]. Fog computing is a virtualized platform that provides computing, storage, and networking services between edge devices and cloud computing data centers.


Most of the data generated by IoT applications requires time-sensitive processing and, therefore, proposals that aim to solve the challenge of generating responses within an acceptable time or delay are increasingly common. Requirements present in IoT applications, such as low-latency communication, data stream processing, and fast execution and response, have been presented as requirements of applications with a temporal nature, which generates multiple concepts applied to real-time systems. A data stream, in turn, can be conceptualized as continuous, infinite, fast input data with a variable time sequence [23]. In addition, solutions that use this type of data tend to have variations in execution time and, therefore, unpredictability in response time estimates.

Therefore, in this article, we propose a model and a computational architecture for an IoT environment with a fog computing configuration. The goal of our proposal is to implement a health monitoring environment in order to monitor and analyze data processing and presentation times and delays. This allows us to indicate time constraints so that the system is able to execute near real-time, that is, within an acceptable time and/or delay according to the characteristics and requirements of the environment and the application.

The paper is organized as follows. In Sect. 2, some concepts related to the research are presented. In Sect. 3 we list some related works. Our proposal is presented and described in Sect. 4. The initial results of the implementation of our proposal are presented in Sect. 5. Finally, the final considerations and indications of future work are presented in Sect. 6.

2 Overview

In this section, we present an overview of healthcare, fog computing, and real-time processing, in order to introduce the concepts covered in this article.

2.1 Healthcare Environment

Healthcare is a smart environment where a health and context monitoring system is set up. It provides e-health services to monitor and evaluate the health of assisted users, who can be elderly people, people with disabilities, children or patients. The health monitoring of these users is carried out by specialty users such as doctors, nurses or caregivers. The healthcare environment configuration is composed of three main components: sensors, communication, and a processing system [18]. Sensors are deployed in environments or in user accessories such as belts, clothes and glasses, and are responsible for data acquisition. The data acquired by the sensors is transmitted through an access point or base station to a server or portable devices via network communication technologies. The data is stored and processed on the server and presented to specialty users so that they can act in case of abnormality or emergency. In the next subsections, we present some concepts and characteristics related to the sensors implemented in a healthcare environment.


2.1.1 Sensors

Physical sensors are the most common type of sensor in a healthcare environment and are responsible for collecting data about the user's physiology and environment [15]. There are three main classes for monitoring the assisted user and the environment: Personal Sensor Networks, Body Sensor Networks, and Multimedia Devices [18].

• Personal Sensor Networks (PSN): usually sensors deployed in the environment whose goal is to detect daily human activities and to measure environmental conditions.
• Body Sensor Networks (BSN): composed of sensors embedded in personal accessories such as clothes, belts or glasses. These sensors have the role of monitoring the vital signs and health conditions of the assisted user [30].
• Multimedia Devices (MD): audio and video devices responsible for monitoring movements and promoting greater interaction between the assisted user and the healthcare application.

The data collected by the sensors can be classified, according to the frequency of its receipt, into three types of events:

• Constant: the data is transmitted continuously.
• Interval: the data is transmitted periodically, following a uniform time interval.
• Instant: the data is transmitted instantaneously when an event occurs.

2.2 Edge Computing and Fog Computing

According to [25], edge computing is a paradigm in which communication, computational, control and storage resources are placed at the edge of the Internet, close to mobile devices, sensors, actuators, connected things and end users. An edge device is neither a datacenter nor a simple sensor that converts analog to digital and collects and sends data. An edge device can be conceptualized as any computational or network resource residing between data sources and cloud data centers.

Fog computing, on the other hand, can be conceptualized as intermediate computational elements, located between edge devices and the cloud, which typically provide some form of data management and communication service between edge devices and the cloud [12]. The main goal of this intermediate layer is to reduce latency and response time, since the data does not have to reach the cloud to be processed. Bonomi et al. [3] present temporal requirements of fog computing environments. They argue that some of the data generated by the sensor and device grid requires real-time processing (from milliseconds to sub-seconds), while all interactions and processes that occur throughout the fog computing environment range from seconds to minutes (real-time analysis) up to days (transactional analysis).

Despite its increasing use, fog computing is often called edge computing. However, these approaches have key differences [12]:


• Fog computing has hierarchical layers, while the edge tends to be limited to a small number of layers;
• Unlike the edge, fog works with the cloud;
• Beyond computing, fog also covers networking, storage, control and data processing.

2.3 Real-Time Processing

With the advent of big data and the use of data streams, the concept of real-time presented in most current research has distanced itself from the one proposed in the classical literature. The survey of Gomes et al. [9] presents a classification of articles that propose the use of the real-time approach in big data environments that use data streams. It can be noted that most articles use the term real-time to mean fast response and low latency.

In this article, we consider the concept presented by [4,24,26], which defines that a real-time system depends not only on the logical result of the computation but also on the time at which the results are produced. For these authors, it is a common misconception to equate a real-time system with mere fast computation, since the purpose of these systems is to meet the temporal requirements of each task. The network is considered a component directly related to the quality of service offered by such systems [29]. For this reason, it is considered one of the factors that most affect the performance of operations [5], which makes meeting temporal requirements a challenge for distributed systems such as healthcare.

3 Related Works

In this section, we present some studies that address the use of fog computing for healthcare monitoring.

Nandyala and Kim [19] propose an IoT-based real-time u-healthcare monitoring architecture. The proposed architecture makes use of the advantages of fog computing, exploiting the proximity between processing and the devices to provide real-time monitoring for smart homes and hospitals.

The research of Bhargava et al. [2] presents a low-cost Wireless Sensor Network (WSN)-based system for real-time mobility monitoring and outdoor positioning of older adults with Alzheimer's. The purpose of the monitoring is to detect anomalous behaviors and decrease the risk of wandering. Real-time data analysis is performed on the wearable device itself, following the fog computing approach.

Verma and Sood [27] propose remote health monitoring of smart home patients through the use of the fog computing concept. The model uses advanced techniques and services such as embedded data mining, distributed storage, and notification services for network devices. In addition, it uses an event-based data transmission methodology to process patient data in real-time at the fog layer.


Nguyen Gia et al. [20] propose a system for continuous remote health monitoring based on IoT and fog computing. The objective of the proposed system is to improve disease analysis and diagnosis accuracy with health data (blood glucose, ECG, temperature) and context data (ambient temperature, humidity, and air quality). An encryption algorithm is applied in the system to protect the collected data: the data is encrypted on the sensor nodes before being transmitted and decrypted on smart gateways.

Vilela et al. [28] propose a health monitoring, evaluation, and performance demonstration system based on the fog computing approach. The goal of the system is to minimize data traffic at the core of the network, improve information security, and provide quality information about the patient's health status.

The aforementioned articles propose architectures for remote health monitoring with a fog computing configuration to provide fast data processing and presentation. They therefore differ from our proposal, since our research aims to present the processing and data presentation times in order to indicate time constraints under which the system runs near real-time.

4 Data Processing Model for a Healthcare Environment

In this article, we propose a computational model and architecture for a healthcare environment based on a fog computing configuration, so that it is possible to process and present the results near real-time. We understand that a system processes near real-time when it presents acceptable times and/or delays, according to the characteristics and requirements of the underlying application. We can conceptualize systems with near real-time processing as those that feature high response sensitivity [21], or that process the data immediately after its entry into the system [14].

As we previously proposed in [6], the basic infrastructure of the health monitoring environment is composed of the layers of edge computing (data providers), fog computing (processing and presentation of data), and cloud computing (persistent storage of data). A health monitoring environment is composed of context sensors (inserted in the environment), personal sensors, and wearable sensors (which remain constantly with the user). Our proposal consists of inserting a data storage cloud service in each residence in order to build the fog computing environment, that is, a data processing environment close to the data source.

Therefore, as can be seen in Fig. 1, assisted users' homes are configured as separate fog computing environments that transform, process and present the data received from the sensors and send it to the cloud, represented by the hospital. Once the data is stored in the cloud, it can be accessed by medical doctors and/or nurses to obtain the patient's history. Information on the patient's current situation is obtained remotely through tools contained in the fog computing environment.

Based on the health monitoring environment presented, we propose a layered computational model that covers all the necessary steps for data transformation, processing, and analysis.


Fig. 1. Fog computing structure for a healthcare environment

As shown in Fig. 2, the proposed model consists of six layers. The first layer consists of the data sources, i.e., the IoT devices (Data Source). The data is sent to the ETL layer, which is responsible for extracting, transforming and loading it. After the data transformation has been carried out, the data is sent to the third layer, which represents the data processing step (Data Processing), and is then sent to the storage layer (Data Warehouse). After the data storage stage is finished, it is possible to analyze the system performance using metrics obtained by specialized software (Analytics). Finally, the information about the assisted user and the system is presented to the responsible user through graphs and tables (User Interface).

Fig. 2. Computational model for processing IoT healthcare environment


Based on the proposed model, Fig. 3 presents the architecture proposed for the IoT health monitoring environment. The objective of the architecture is to indicate specialized tools that provide efficient results, because, according to [7], the correct and structured use of tools positively determines the correct use of the data.

Fig. 3. Computational architecture for processing IoT healthcare environment

The proposed architecture consists of the layers proposed in the model and the tools used to implement the health monitoring platform in fog computing.

• Data Source: layer responsible for receiving data from sources such as sensors, vehicles, smartphones, smart homes, and smart watches.
• ETL: responsible for data extraction, transformation, and loading. For this step, Apache Avro [1], a data serialization system, was used.
• Processing: layer responsible for processing the data stream. Apache Kafka [13], a distributed messaging platform based on the publish/subscribe model, was used (a sketch of this flow is given after this list).
• Data Warehouse: responsible for local data stream storage. InfluxDB [11] is used on the platform because it is a time series database optimized for speed, with high availability for storage and data recovery.
• Analytics: layer responsible for obtaining system performance metrics, in order to enable the analysis of end-to-end processing times and delays. The tools used were Kafka-Monitor [16] and Prometheus [22]. The first is an open-source tool deployed by a group of Kafka experts at LinkedIn and is responsible for obtaining end-to-end metrics of Apache Kafka. The second is used to store the received metrics and generate the visualizations, allowing the monitoring of the system and the generation of event alerts.


• User Interface: layer responsible for presenting the data to the end-user. The tool used was Grafana [10], a web application that allows data to be viewed remotely through tables and graphs.

Based on the proposed model and architecture, we implemented the health monitoring environment and generated the first results, which are presented in the next section.
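To make the hand-off between the Processing and Data Warehouse layers concrete, the sketch below shows how a Kafka consumer could persist readings into InfluxDB. It is a minimal illustration under our own assumptions, not the authors' implementation: the topic, database, and field names are hypothetical, and the kafka-python and influxdb client libraries are used for brevity.

```python
# Minimal sketch of the Processing -> Data Warehouse hand-off; the topic,
# database, and field names are illustrative assumptions, and the
# kafka-python and influxdb client libraries are used for brevity.
import json

from influxdb import InfluxDBClient  # pip install influxdb
from kafka import KafkaConsumer      # pip install kafka-python

consumer = KafkaConsumer(
    "sensor-readings",  # hypothetical topic fed by the ETL layer
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
db = InfluxDBClient(host="localhost", port=8086, database="healthcare")

for message in consumer:
    reading = message.value
    # Persist each reading as a time-series point in InfluxDB.
    db.write_points([{
        "measurement": reading["sensor"],         # e.g. "body_temperature"
        "tags": {"patient": reading["patient"]},
        "fields": {"value": float(reading["value"])},
    }])
```

In such an arrangement, Kafka decouples ingestion from storage, so the fog node can keep receiving readings even if the database is momentarily slow.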

5 Environment and Experimental Results

This section presents the results obtained by implementing the proposed model and architecture. First, we describe the computational environment used to implement the architecture. Then, we present, as a first result, reports that show the user's current health situation and generate warnings for abnormal situations.

5.1 Environment

The environment used to carry out the experiments can be divided into three parts: hardware, software, and sensing environment. To compose the hardware environment, we use a desktop with Ubuntu 18.04 LTS, a Core i7 processor, and 16 GB of RAM. For the software and sensing environments, we use the ones proposed by Gomes et al. [8]. The software environment runs on Docker and contains the tools present in the architecture proposed in this article, in addition to the MQTT broker, which is responsible for receiving publish/subscribe messages that use the MQTT protocol [17] and sending them to the next level. The sensing environment is composed of 9 sensors, divided into 3 categories: environment, personal, and health.

• Environment Sensors: comprise the temperature and humidity sensors. With the data from these two sensors, it is also possible to obtain the thermal discomfort index (one common formulation is sketched at the end of this subsection).
• Personal Sensors: the location, gyroscope, and accelerometer sensors. They are used to obtain the user's location, as well as his/her position (lying, sitting, or standing) and his/her movement (standing, running, or walking). These data make it possible to know whether the user is active or has suffered a fall.
• Health Sensors: the wearable sensors ECG, pulse oximeter, blood pressure (systolic and diastolic), and body temperature. The data obtained by this set of sensors, associated with data from the personal sensors, can, for example, indicate that the user's body temperature is high because he/she is performing physical activity.

After implementing the environments, we created a dashboard for generating alerts regarding the current health of the assisted user. The graphs are presented in the next subsection.
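The paper does not state which formulation of the thermal discomfort index is computed from temperature and humidity; one widely used choice is Thom's index, sketched below as an assumption of ours.

```python
# One common formulation of the thermal discomfort index (Thom's index);
# which index the platform actually computes is not stated in the paper.
def discomfort_index(temp_c: float, rel_humidity: float) -> float:
    """Thom's index from temperature (deg C) and relative humidity (%)."""
    return temp_c - 0.55 * (1 - 0.01 * rel_humidity) * (temp_c - 14.5)

# Example: 30 deg C at 70% relative humidity gives an index of about 27.4,
# a range in which most people report discomfort.
print(round(discomfort_index(30.0, 70.0), 1))  # 27.4
```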

5.2 Experimental Results

To generate the reports, we use synthetic data produced randomly by a JSON script. For this experiment, the data are produced every 10 s, so that the system receives data continuously, simulating a data stream. Figures 4 and 5 show the reports of the environment and wearable sensors, respectively.
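The authors' generation script is not shown; the sketch below illustrates one plausible implementation that publishes a synthetic reading to the MQTT broker every 10 s. The topic name, field names, and value range are our own assumptions, and the paho-mqtt client library is used for brevity.

```python
# Hedged sketch of a synthetic stream generator like the one described above;
# the topic, field names, and value range are illustrative assumptions,
# not the authors' JSON script.
import json
import random
import time

import paho.mqtt.client as mqtt  # pip install paho-mqtt

client = mqtt.Client()
client.connect("localhost", 1883)  # MQTT broker from the software environment
client.loop_start()

while True:
    reading = {
        "patient": "user-01",                           # hypothetical id
        "sensor": "body_temperature",
        "value": round(random.uniform(35.5, 39.5), 1),  # degrees Celsius
    }
    client.publish("healthcare/sensors", json.dumps(reading))
    time.sleep(10)  # one reading every 10 s, as in the experiment
```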

Fig. 4. Environment sensors report and warnings

Fig. 5. Wearable sensors report and warnings

As can be seen, reports are generated with the data captured by the sensors (left side of the figures). In addition, alarm reports are generated to notify that the data obtained show abnormalities (right side of the figures). In Fig. 5, the electrocardiogram alert report has no data, which means that the user is well, that is, he/she has neither bradycardia nor tachycardia.

6 Conclusions

In this article, we proposed a computational model and architecture based on a distributed environment, such as IoT, with a fog computing configuration. We use the concept of a health monitoring environment as the base application. The goal of our proposal was to structure a healthcare environment so as to allow the utilization of tools that enable the correct and efficient use of data. Additionally, with our proposal, we intend to monitor the environment so that it is possible to indicate time constraints, in view of the time-sensitivity of data processing in healthcare environments.

Considering the unpredictable characteristics of a distributed environment such as IoT, we presented in this article the approach of processing data in near real-time. In other words, we proposed to meet deadlines on a best-effort basis, that is, with data being processed immediately after entering the system. We implemented the proposed architecture and obtained preliminary results, which consisted of generating a dashboard with constant visualization of the health reports of the assisted user. Through graphs and tables, it is possible to monitor the current situation of the user, as well as to display a warning when the data indicate an abnormality.

As future work, we intend to improve the visualization of reports and the generation of alerts. In addition, we intend to install the monitoring tool (Kafka-Monitor) so that we can obtain information regarding latency, delays, and system performance. With this information, we will then be able to indicate time constraints and verify the system's performance against the time requirements.

Acknowledgement. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

References

1. Avro: Apache Avro (2020). https://avro.apache.org. Accessed July 2020
2. Bhargava, K., McManus, G., Ivanov, S.: Fog-centric localization for ambient assisted living. In: International Conference on Engineering, Technology and Innovation, pp. 1424–1430. IEEE (2017)
3. Bonomi, F., Milito, R., Zhu, J., Addepalli, S.: Fog computing and its role in the internet of things. In: Proceedings of the First Edition of the MCC Workshop on Mobile Cloud Computing, p. 13 (2012)
4. Buttazzo, G.C.: Hard Real-Time Computing Systems: Predictable Scheduling Algorithms and Applications, vol. 24. Springer, Boston (2011)
5. Dai, D., Li, X., Wang, C., Sun, M., Zhou, X.: Sedna: a memory based key-value storage system for realtime processing in cloud. In: IEEE International Conference on Cluster Computing Workshops, pp. 48–56 (2012)


6. Gomes, E., Dantas, M., Plentz, P.: A real-time fog computing approach for healthcare environment. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 85–95. Springer (2018)
7. Gomes, E., Dantas, M.A., de Macedo, D.D., De Rolt, C., Brocardo, M.L., Foschini, L.: Towards an infrastructure to support big data for a smart city project. In: International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises, pp. 107–112. IEEE (2016)
8. Gomes, E.H., Dantas, M.A., Plentz, P.D.: A proposal for a healthcare environment with a real-time approach. Int. J. Grid Util. Comput. 11(3), 398–408 (2020)
9. Gomes, E.H., Plentz, P.D., Rolt, C.R.D., Dantas, M.A.: A survey on data stream, big data and real-time. Int. J. Networking Virtual Organ. 20(2), 143–167 (2019)
10. Grafana: Grafana Labs (2020). https://grafana.com. Accessed July 2020
11. InfluxDB: Influx Data (2020). https://www.influxdata.com. Accessed July 2020
12. Iorga, M., Feldman, L., Barton, R., Martin, M.J., Goren, N., Mahmoudi, C.: Draft SP 800-191, The NIST Definition of Fog Computing. NIST Special Publication 800, March 2017
13. Kafka: Apache Kafka (2020). http://kafka.apache.org. Accessed July 2020
14. Kononenko, O., Baysal, O., Holmes, R., Godfrey, M.W.: Mining modern repositories with elasticsearch. In: Proceedings of the 11th Working Conference on Mining Software Repositories, pp. 328–331 (2014)
15. Lai, X., Liu, Q., Wei, X., Wang, W., Zhou, G., Han, G.: A survey of body sensor networks. Sensors 13(5), 5406–5447 (2013)
16. LinkedIn: Kafka Monitor (2020). https://github.com/linkedin/kafka-monitor. Accessed July 2020
17. MQTT: MQTT (2020). http://mqtt.org/. Accessed July 2020
18. Mshali, H., Lemlouma, T., Moloney, M., Magoni, D.: A survey on health monitoring systems for health smart homes. Int. J. Ind. Ergon. 66, 26–56 (2018)
19. Nandyala, C.S., Kim, H.K.: From cloud to fog and IoT-based real-time U-healthcare monitoring for smart homes and hospitals. Int. J. Smart Home 10(2), 187–196 (2016)
20. Nguyen Gia, T., et al.: Energy efficient fog-assisted IoT system for monitoring diabetic patients with cardiovascular disease. Future Gener. Comput. Syst. 93, 198–211 (2019)
21. Perera, C., Qin, Y., Estrella, J.C., Reiff-Marganiec, S., Vasilakos, A.V.: Fog computing for sustainable smart cities: a survey. ACM Comput. Surv. 50(3), 1–43 (2017)
22. Prometheus: Prometheus (2020). https://prometheus.io. Accessed July 2020
23. Safaei, A.A.: Real-time processing of streaming big data. Real-Time Systems (2016)
24. Safaei, A.A.: Real-time processing of streaming big data. Real Time Syst. 53(1), 1–44 (2017)
25. National Science Foundation: NSF Workshop Report on Grand Challenges in Edge Computing (2016)
26. Stankovic, J.A.: Misconceptions about real-time computing: a serious problem for next-generation systems. Computer 21(10), 10–19 (1988)
27. Verma, P., Sood, S.K.: Fog assisted-IoT enabled patient health monitoring in smart homes. IEEE Internet Things J. 5(3), 1789–1796 (2018)
28. Vilela, P.H., Rodrigues, J.J., Solic, P., Saleem, K., Furtado, V.: Performance evaluation of a fog-assisted IoT solution for e-Health applications. Future Gener. Comput. Syst. 97, 379–386 (2019)


29. Volpato, F., Da Silva, M.P., Gonçalves, A.L., Dantas, M.A.R.: An autonomic QoS management architecture for software-defined networking environments. In: IEEE Symposium on Computers and Communications, pp. 418–423. IEEE (2017)
30. Wang, X.: The architecture design of the wearable health monitoring system based on internet of things technology. Int. J. Grid Util. Comput. 6(3–4), 207–212 (2015)

A Tool to Manage Educational Activities on a University Campus

Antonio Sarasa-Cabezuelo1(B) and Santi Caballé2

1 Universidad Complutense de Madrid, Madrid, Spain
[email protected]
2 Universitat Oberta de Catalunya, Barcelona, Spain
[email protected]

Abstract. Seminars and workshops are a very common type of extracurricular activity in universities. Instructors use these events to introduce advanced content of the subjects they teach, to carry out practical applications, to invite speakers or experts in the field to talk about leading research topics, etc. The development of these activities requires a tool that facilitates their organization and management. Thus, it is necessary to manage relevant aspects such as the control of the participants in the events, the carrying out of satisfaction surveys, or the sending of notifications about unexpected changes in the event (time change, classroom change, etc.). This article presents a tool that has been developed with the aim of managing this type of event in an agile and simple way for both participating instructors and students.

1 Introduction

Workshops and seminars are widespread extracurricular activities at the university level. Normally, instructors use this type of activity to introduce advanced content [1], to hold practical sessions on some topic related to general content, to invite experts in the field to give talks on their lines of research, etc. In general, these activities are held outside official class hours, attendance is usually free, and they are intended to complement the basic training of students. To organize them, the instructor informs the students of the event, indicating its place, date, and time. Another element related to this type of event is that some feedback is expected from the students, in order to know whether the training received has been useful to them, as well as other data about the experience, with the aim of improving or making changes in future activities. In addition, on certain occasions, the participation of the students in this type of activity is rewarded by the instructor with an increase in the final course grade, either directly or indirectly after completing an assignment or exercise on the covered contents. Therefore, to organize these activities, the instructor requires a communication mechanism with the students that allows him/her to announce when the event will take place, carry out satisfaction surveys about the event, or communicate changes in the realization of the event. In addition, the communication means used should be simple enough to be used by both students and instructors, and also easily accessible.


With the aim of offering a solution suited to the above-described needs, a system was developed for managing extracurricular events that offers the most common services needed to organize these activities. In addition, the design took into account the requirement that the organization process should be agile and that the use of the system should be as simple and intuitive as possible for both instructors and students. In this sense, the system developed consists of a web application and an Android app. The first application is oriented towards the instructor and offers him/her all the necessary services to create events associated with subjects, create alerts, develop surveys associated with each event, or visualize information about the surveys answered by the students. On the other hand, the mobile application is oriented towards the students, so they can register for an event, receive information about the event, and answer surveys that have been proposed by the instructor organizing the event.

The structure of the article is as follows. Section 2 presents some existing proposals to solve the problem described in this section. Then, Sect. 3 describes the architecture of the application and briefly explains the data model used. Section 4 discusses in detail the functionality of the application. Finally, Sect. 5 presents a set of conclusions and future lines of work.

2 Background

In order to meet the needs described above, there are several alternatives, which can be grouped basically into official and non-official tools. The official tools refer [5] essentially to the possibilities offered by the virtual campuses that universities normally have, which are based on some type of LMS (Learning Management System), such as Moodle, Blackboard, and others. This type of system offers a large number of functionalities aimed at facilitating the digitalisation of teaching [3]. In this sense, it would be feasible for an instructor to have specific virtual spaces for workshops or seminars, where students who can attend the event would be registered, the documentation associated with the activity would be stored, or the completion of assignments or feedback surveys would be planned. The advantage of this option is the integration of these activities with official teaching, so that the instructor has all the information about the students in a single system.

However, there are some disadvantages due to the unofficial nature of these activities. In general, virtual campuses are designed to support official teaching, and many universities do not consider supporting unofficial activities. Thus, the instructor cannot create new virtual spaces directly, but must make a request to the virtual campus managers, which makes the organization of these activities less agile and simple than would be expected. In addition, these systems are organized so that access to the virtual spaces is only possible for a group of students previously registered by the instructor or by the virtual campus managers, but not in the opposite sense (i.e., a student cannot register for the event on his/her own). Likewise, the system for creating surveys is not simple, since surveys are designed to cover a very wide spectrum of test types (normally, it is first necessary to create a bank of questions, from which several types of question tests can be created). Finally, instructors usually do not have a system of alerts or notifications oriented towards mobile devices [2]; notifications are sent to the registered email address. Therefore, this option mainly poses problems of agility in the organization process. It should also be noted that a large number of universities have created mobile apps [4] that offer services similar to those offered by the universities' web applications, as well as a set of value-added services, such as management of academic records, management of the calendar or class schedules in which students are enrolled, and management of events held at the university or faculty. However, these applications are generalist in nature and are not designed to adapt to the organization of this type of event and the information management it requires [6]. For example, with this type of application, it would be possible to manage the advertisement of the event, but not the students who participate, the carrying out of satisfaction surveys, or their subsequent exploitation. This is why they do not adequately cover the needs raised.

The second option consists of the use of non-formal tools [7, 8]. This option involves the combination of a communication application (usually a mobile app, such as WhatsApp, Twitter, and others) and a survey application (e.g., Google Forms). Thus, in this option, the instructor uses the first application to communicate to the students all the aspects related to the organization of the event and notifications, such as changes in the schedule or venue, while using the second type of application for the creation of surveys related to the event. This option introduces flexibility and agility into the process, as no third-party permissions are required to use these applications. However, it has some disadvantages. Firstly, the lack of an information integration system that manages the applications used: there is no persistence system that unifies and integrates the information from each application, so this becomes a task for the instructor. This issue introduces other types of problems, such as the exploitation of the individual information of each student who has participated in these events. Likewise, there are other additional problems, such as the persistence over time of the information generated from an event, the organization of the events by subjects, the complexity that can arise if several events have to be managed simultaneously, and, finally, the dependence on third parties (the applications can be modified, changing their way of functioning, or even shut down, so that the proposed organization system would fail). In sum, this option offers flexibility and agility in the organization of the process but has the disadvantages of lack of integration and dependence on third parties. In addition, as unofficial solutions, agenda tools that allow for the management and organization of events could be considered. Their main problem is precisely that event organization is the only functionality they offer: they do not allow for the development of surveys, user management, or the exploitation of information.

3 Application Architecture and Data Model

The tool has been implemented as a system consisting of a web application, an Android app, and a relational database that is used to provide data persistence. In this way, the information that each application manages is synchronized by sharing the same database, which serves as a means of communication between both applications. The web application is used by the instructor to manage the extracurricular events of the subjects taught, while the Android app is used by the students to participate in the events generated by the instructors. In addition, the communication of the Android app with the database is carried out through a web service that acts as a proxy between the two. Figure 1 shows the architecture schematic.

Fig. 1. Application architecture

The following technologies have been used to implement each of the components of the architecture. First, the mobile app has been implemented using Android, and the web service and web application have been implemented using PHP. In addition, the responses to the requests made by the Android app to the web service are returned as JSON files containing the encoded information, which the Android app later uses to display it on the mobile device. Finally, the relational database has been implemented using MySQL. Although a MongoDB database could have been used, since the information is exchanged with the web application in JSON format, it was decided to use a relational database because the stored information is regular (i.e., all data always have the same structure). In order to implement the persistence of information, a number of tables have been defined containing the information that the application needs to manage: data referring to users; data referring to the activities and their relationship with the students who participate in them; data about the surveys associated with the activities and about the students' answers to the surveys; data referring to the alerts and their relationship with the activities to which they are associated; and data about the subjects defined by the instructor, which allow relating activities, students, and surveys (a hypothetical sketch of part of such a schema is shown below).
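As an illustration of this data model, the sketch below creates two of the tables just described through mysql-connector-python. All table and column names are our own assumptions; the paper does not publish its schema.

```python
# Hypothetical fragment of the schema described above; table and column
# names are our own assumptions, since the paper does not publish its DDL.
import mysql.connector  # pip install mysql-connector-python

DDL = [
    """CREATE TABLE IF NOT EXISTS activity (
        id INT AUTO_INCREMENT PRIMARY KEY,
        subject_id INT NOT NULL,
        title VARCHAR(200) NOT NULL,
        is_private BOOLEAN NOT NULL DEFAULT FALSE
    )""",
    """CREATE TABLE IF NOT EXISTS enrolment (
        student_id INT NOT NULL,
        activity_id INT NOT NULL,
        PRIMARY KEY (student_id, activity_id)  -- one enrolment per pair
    )""",
]

conn = mysql.connector.connect(user="app", password="secret", database="events")
cur = conn.cursor()
for statement in DDL:
    cur.execute(statement)
conn.commit()
conn.close()
```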

4 Implementation

In this section, the functionality of the developed tool is described. Since it consists of two different applications, the web application will be presented first and the Android app second.

4.1 Web Application

Functions have been defined in the web application for two different roles: administrator and instructor. The administrator performs three management functions:


• Instructor Management: add, modify, or remove instructors from the system.
• Survey Management: create, modify, or delete system surveys that will later be associated with a system activity. It is also possible to consult the statistics of the surveys associated with the activities.
• User Management: consult the data of the users of the application, modify some of their data, or delete a user from the system in case of error.

Fig. 2. Administrator functionality

Figure 2 shows the main administrator user interfaces. All of them are based on simple and intuitive forms that guide the administrator through the different functionalities implemented. The screen in Fig. 2a represents the main administrator menu where the functions are listed. Figures 2b and 2c show the forms for registering an instructor and for modifying or removing an instructor from the application. Figures 2d and 2e show the forms for generating a survey and for modifying or deleting a survey. Next, Fig. 2f shows the screen that allows managing the users of the application. Finally, Fig. 2g shows an example of the statistics that the tool generates from the data collected in the surveys.

On the other hand, the functions of an instructor are as follows:

• Private Activity Management: a private activity is one in which it is necessary to be registered in the application in order to access the information. In addition, the student must have been validated by the instructor in the subject to which the private activity belongs. Once validated, the student will have access to all activities and surveys related to that subject. An instructor can create private activities, modify them, or delete them; however, instructors can only manage those that they own.
• Management of Public Activities: an instructor can create public activities, as well as modify the information of an activity created by him/her or delete it. A public activity is one in which it is not necessary to be registered in the application in order to access the information. Instructors will only manage and view public activities of which they are owners.
• Alert Management: it is possible to create alerts about activities, which will be shown to the user in the mobile application. The alerts can also be modified or deleted by the instructor who created them.
• Consult Surveys: an instructor can consult all the surveys in the system in order to know the questions contained in a survey and associate it with an activity. Instructors can also consult the results of the surveys associated with the activities.
• Student Management: an instructor must validate the students who request access to a subject created by the instructor. In addition, instructors can remove any student who is registered in subjects created by them.

Figure 3 shows the main instructor user interfaces. The screen in Fig. 3a represents the instructor's main menu where the functions are listed. Figures 3b and 3c show the forms for registering, modifying, or deleting a subject. Figures 3d and 3e show the forms to register, modify, or delete an activity. Figures 3f and 3g show the forms to validate or remove a student. Figures 3h and 3i show the forms for displaying a survey and the statistics associated with it. Finally, Fig. 3j shows the form to manage an alert associated with an activity.

4.2 Android App

The Android application implements the functionalities for the students. It is necessary to differentiate between registered and unregistered students:

1. Registered student:

• Enrolment in Private Activities: a registered student can consult all the information about private activities that a specific instructor has created for a subject, and can also enroll in any of them. In order to access these activities, the student must have previously been validated by the instructor.


Fig. 3. Instructor functionality


• Enrolment in Public Activities: a registered student can enroll in public activities, which have an associated number of credits that are awarded to the students who carry them out. Likewise, they may have an associated satisfaction survey.
• Credit Counter: a registered student can consult the credits accumulated through the public activities in which he/she has enrolled, and can establish a credit goal to fulfill. When the user consults the credits, if the maximum established by the student has been reached, the app communicates this to the student through a pop-up. However, the student can continue to carry out activities and accumulate more credits.
• Alert Notification: a registered student will receive, by means of a pop-up notification, the alerts generated by the instructors about the activities in which he/she is enrolled.

2. Unregistered student: any student who installs the Android app can consult all the information about public activities and the alerts that have been generated about them.

Fig. 4. Student functionality

Figure 4 shows some of the functions associated with a student. The screen in Fig. 4a represents the main menu of the app. Figure 4b shows an example of a list of public activities, and Fig. 4c shows an example of access to a public activity. Figures 4d and 4e show information about the credits associated with a public activity and the notification generated when the user reaches the maximum number of credits that they have established. Finally, Fig. 4f shows the interface that allows configuring the maximum number of credits that a user wants to obtain.


5 Conclusions and Future Work

This article has presented a tool designed to manage extracurricular training activities such as seminars or training workshops. The system includes a web application oriented to instructor tasks, where it is possible to register public and private activities associated with the subjects taught, as well as design surveys associated with the activities or consult the statistics generated. In addition, an Android app has been implemented that allows students to register for activities generated by instructors, with the restriction that access to private activities requires validation by the instructor. Likewise, students can answer surveys associated with these activities and keep track of the credits they obtain for participating in them. The system is currently not active and is not used in any higher education environment; funding has been requested with the aim of integrating it into several universities that have shown interest.

As future lines of work, we propose to: 1) create a version for iOS; 2) extend the functionality of the application to allow documentation to be associated with activities; 3) allow the system to be configured in several languages; 4) allow students to send the instructor documents associated with an activity or survey; and 5) address data privacy and protection, which the application has not yet taken into account; in this sense, a future line of work is to implement the necessary mechanisms to comply with the GDPR and integrate privacy management into the mechanisms of the VLEs of the universities.

Acknowledgments. This work has been partially supported by the European Commission through the project "colMOOC: Integrating Conversational Agents and Learning Analytics in MOOCs" (588438-EPP-1-2017-1-EL-EPPKA2-KA).

References

1. Claudia, C.: The role of extracurricular activities and their impact on learning process. In: The Annals of the University of Oradea, 1117 (2014)
2. Lovászová, G., Cápay, M., Michalicková, V.: Learning activities mediated by mobile technology: best practices for informatics education. In: CSEDU (2), pp. 394–401 (2016)
3. Lubis, M., Fauzi, R., Lubis, A.: Enterprise application integration for high school students using blended learning system. In: MATEC Web of Conferences, vol. 218, p. 04016. EDP Sciences (2018)
4. Molnár, G., Szűts, Z.: Advanced mobile communication and media devices and applications in the base of higher education. In: 2014 IEEE 12th International Symposium on Intelligent Systems and Informatics (SISY), pp. 169–174. IEEE (2014)
5. Sarasa-Cabezuelo, A., Sierra-Rodríguez, J.L.: An app for managing unregulated teaching activities. In: 2014 International Symposium on Computers in Education (SIIE), pp. 217–222. IEEE (2014)
6. Sarasa-Cabezuelo, A., Sierra-Rodríguez, J.L.: A system to manage non-formal higher education activities. IEEE Revista Iberoamericana de Tecnologias del Aprendizaje 11(3), 205–212 (2016)
7. Sarasa-Cabezuelo, A., Caballe, S.: A tool for creating educational resources through content aggregation. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 515–524. Springer, Cham, November 2019
8. Turani, A., Calvo, R.A.: Beehive: a software application for synchronous collaborative learning. Campus-Wide Information Systems (2006)

Towards the Use of Personal Robots to Improve the Online Learning Experience

Jordi Conesa(B), Beni Gómez-Zúñiga, Eulàlia Hernández i Encuentra, Modesta Pousada Fernández, Manuel Armayones Ruiz, Santi Caballé Llobet, Xavi Aracil Díaz, and Francesc Santanach Delisau

Universitat Oberta de Catalunya, Barcelona, Spain
{jconesac,bgomezz,ehernandez,mpousada,marmayones,scaballe,xaracil,fsantanach}@uoc.edu

Abstract. All changes are difficult, and moving from face-to-face to online learning is not an exception. Nowadays, online students have many supports to ease their learning process, due to the evolution of Virtual Learning Environments (VLE), the maturity of the pedagogical models used, and the vast experience of online teachers who design, create, and deploy successful learning activities and accompany students through these activities. However, these supports are mainly centralized within the contexts of the VLE or the virtual classrooms. Therefore, new online learners need to acquire the habit of entering the VLE and the classrooms frequently. In this research we present an ongoing study in which robots are used as personalized companions of new students. The robots provide personal feedback to each student with the aim of promoting behavioral changes that facilitate the learning experience of new students and potentially reduce their dropout.

Keywords: Assistive robot · Persuasive technology · Motivation · Learning experience

1 Introduction

Until now, the role of robots as personal assistants for learning has been investigated fundamentally with children, less with adolescents, and very little with university students. In fact, the research with university students is mainly limited to learning programming or robotics content [1]. However, robots can also be used for other, non-content-related purposes, such as the promotion of student motivation and the acquisition of the habits necessary to perform learning activities successfully.

Online learning takes place in a virtual learning environment (VLE) where learning processes occur and many mechanisms to help students are provided. Online learning is usually provided together with a pedagogical model that promotes the work on academic and competence aspects in the classroom, but also on other aspects of an emotional nature, thanks to the effort of teachers and technicians who accompany, help, guide, and support the student, both from an academic and a motivational point of view. However, all this guidance and support is useless when students do not access the classroom, do so infrequently, or do not work continuously. Hence, it is important to help new students acquire the right habits to promote frequent access to the classroom and regular, continuous work.

In this work we propose to use robots for motivating novel students, promoting behavioral change, and improving their user experience in an e-learning university context. In particular, the paper presents the experience of the Universitat Oberta de Catalunya (UOC) in the use of robots for promoting behavioral change in novel students. The problem faced and the lessons learned may be useful and applicable to other distance learning organizations.

The paper is organized as follows. Section 2 presents the background of the work, describing the environment where the experience has been performed and the different psychological theories considered. Section 3 briefly presents the proposed system, the robots used, the way they have been used and integrated within both the learning process and the UOC's digital learning environment, and, finally, the pilot study being conducted. Section 4 outlines the main conclusions and provides on-going and future directions of research.

2 Background

With the aim of individualizing and personalizing the accompaniment of newly incorporated students and streamlining their training process, our research is based on different psychological theories that aim to ensure that a robot, called Botter in our case, establishes persuasive and motivating communication with the student. This section introduces these theories and presents the context of the UOC, which has been the motivational context and the place where Botter will be used.

2.1 The UOC Context

The Universitat Oberta de Catalunya (UOC) was created in 1995 as a completely online university [2], with an educational model adapted to the needs of students that promotes ubiquitous and self-directed learning [3]. Students can personalize the virtual campus according to their needs and preferences. In addition, a personal tutor assists students in their educational experience with personalized feedback and answers. Communication nowadays involves several channels, such as Twitter, email, the typical support service, and a community of interest for the whole university. Formative feedback may be provided to students by teachers and tutors, but also by their classmates; basically, students learn with and from others. The UOC campus (the Campus from now on) and its virtual classrooms provide many mechanisms to help students; furthermore, the pedagogical model used promotes students' work from an academic and a motivational point of view. It also provides specific guidance and support to new students. However, the dropout of novel students in their first year is still high [4]. Therefore, it would be beneficial to provide personalized support to new students in acquiring the habits and dynamics necessary to be successful virtual learners.


2.2 Habit Theory

Since the last century, psychology has been pointing out that people's lives are influenced by something non-reflective, such as habit. Furthermore, its conceptualization has gone from a purely neuronal level to a macro construct of a cultural nature [5]. Whatever dimension we adopt for its analysis, Andrews's classic definition [6] can help us understand how we work with habit through Botter: a habit is a way of thinking, a custom, which is established through the repetition of a behavior based on a previous mental scheme. These are behaviors that we emit without too much conscious control, and that is what we want to achieve with our students: that they acquire the habit of entering the Campus. If this habit is not established, the chances of motivating them are slim. The classic scheme to establish habits is to emit a trigger (an acoustic signal from the robot, or a light signal, for example) to generate a behavior (in our case, it could be entering the classroom to read a message from the teacher) and then to obtain a reward (a green light signal, or a movement of joy, for example). As this scheme repeats itself, the behavior will end up becoming a habit.

2.3 Self-determination Theory

Motivation has been a constantly present theme in psychology, and for this theory it is also a fundamental concept. Based on the distinction between intrinsic and extrinsic motivation, this theory [7] postulates the existence of three innate psychological needs: competence, autonomy, and relatedness. Intrinsic motivation is the intention to act out of spontaneous interest, without external rewards, doing an activity for the satisfaction inherent in the activity itself. Extrinsic motivation is more determined by social pressure; for example, a child does homework because of the control that her parents exercise over her schoolwork. The value of the behavior does not reside in itself, but in its instrumental value, and regulation is no longer internal, but external. The aim is to promote the autonomous regulation of behavior, beyond extrinsic motivation. Initially, people behave in a certain way because such behavior is modeled or valued by other people who are significant to us and with whom we feel or want to relate. This is how the basic need for relatedness, the need to feel belonging or connected with others, is very important for the internalization of motivation. Furthermore, context can promote intrinsic motivation by supporting the basic needs mentioned, since intrinsic motivation is intimately related to satisfying the needs of autonomy and competence (clearly), and also to that of relationships, although to a lesser degree. This theory can be very significant for the field of education, since what we want, ultimately, is to motivate students to commit, strive, and have the best possible performance [7]. If the social context in which our students are immersed is responsive to their basic psychological needs, we will achieve an optimal development of their abilities, while they take an active, responsible role and the initiative in their learning process. From this point of view, it is vital to design the robot in a way that adequately displays responsive behavior that supports the psychological needs of our students [8].


2.4 Persuasive Design

Systems and technologies have been developed in recent years to change people's attitudes or behaviors, and that is where persuasive design and evaluation systems play a very important role. Nowadays, technologies create opportunities for persuasive interaction because users can be reached quickly [9], which fosters students' motivation and thus provides a better learning experience. Persuasive systems can be defined as computerized software or information systems designed to reinforce, change, or shape attitudes or behaviors without using coercion or deception [9]. In our case, with Botter, it is us, the humans, and not the robots, who have the objective of influencing the attitudes or behaviors of our students, and we will design Botter for this purpose. In this context, the most developed frame of reference is the one provided by Fogg [10] and the work of the Stanford Persuasive Tech Lab1. In his model, the author points out that behavior is the product of three factors: motivation, ability, and triggers, each of which has subcomponents. The Fogg Behavior Model (FBM) states that for a person to perform a behavior, the person must (1) be motivated enough, (2) have the ability to perform the behavior, and (3) be activated to perform the behavior. These three factors must occur at the same time; if this is not the case, the behavior will not take place (a toy illustration of this condition is sketched below). As this model is useful for the analysis and design of persuasive technologies, we have adopted it to design Botter. Specifically, we use gaze to make Botter more persuasive, at the same time that it performs gestures such as walking, shaking its head, or clapping [11].

1 https://captology.stanford.edu.
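As a toy illustration (our own reading, not Fogg's formalization), the following function captures the FBM condition that a trigger only leads to behavior when motivation and ability are simultaneously high enough; the multiplicative form and the threshold value are assumptions made purely for illustration.

```python
# Toy reading of the Fogg Behavior Model: a trigger succeeds only when
# motivation and ability together place the person above an "action line".
# The multiplicative form and the threshold value are our own assumptions.
def trigger_succeeds(motivation: float, ability: float,
                     action_line: float = 0.5) -> bool:
    """Both factors are in [0, 1]; behavior occurs above the action line."""
    return motivation * ability >= action_line

# A highly motivated student with low ability (e.g., unfamiliar with the
# Campus) may still not act on a prompt: 0.9 * 0.3 = 0.27 < 0.5.
print(trigger_succeeds(0.9, 0.3))  # False
```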

3 Proposed Approach

The proposed solution is innovative not only for designing a personal assistant that accompanies students in their first contact with online learning, but also for doing so within a university context. Our robot, named Botter, acts like a co-pilot for new students' training itinerary during their first semester. The robot has been designed and programmed for this purpose, staying synchronized with the UOC VLE in real-time and providing personalized and updated information on different aspects of the Campus and, above all, about the student's subjects and classrooms, from a persuasive perspective.

3.1 Goals

Based on the theoretical foundation that we have briefly described, we have specified the general objectives as follows:

– Dynamize the training process, broadening the interaction with the new UOC student and the information we offer her about her integration into the Campus and the classrooms.



– Achieve a more continuous, more enriched accompaniment, beyond the virtual limits of the Campus, complementing, in the initial stages, the work that the tutors do throughout the student's academic life.
– Promote personalization and individualization in the accompaniment, adapting it to the needs and characteristics of each student.
– Increase adherence to training at the UOC, reducing dropout.

To do this, Botter should be able to:

– Get the student's attention and keep it.
– Present the student with significant information about their learning process, such as deadlines, whether to download documentation for study, whether they have read a message posted by the teacher, etc. These bits of information will allow students to establish tiny goals, which will stimulate their motivation more than a single, more far-reaching, longer-term goal.
– Propose a good reinforcement system, so that students receive a reward (gamification) for each action that is appropriate for their learning.
– Offer trustworthy information, which allows students to stay on their learning objective and increase their perception of self-efficacy. To do this, Botter must be able to offer updated information in real time, such as inviting the student to enter the Campus when the student has not done so in days. The bond between Botter and the student is based precisely on this trust.
– Show different expressions, either with "body" movements or with "facial" expressions.
– Value the satisfaction of the student's own learning process. Learning is very satisfying, but the most important thing is not that it is, but that the student realizes that it is. That is why Botter can, for example, give information about the grades of the student's continuous assessment with a certain degree of gamification.

3.2 The Casting of Botter: Characteristics of the Chosen Robots

After establishing the general and specific objectives for Botter, we carried out a benchmark to decide which robots could meet the required conditions. The first selection criterion was to achieve a balance between cost, usability (dimensions, skeleton), personalization (system, programming, aesthetics), accessibility (proximity detection, voice recognition, visual/facial recognition), connectivity (WiFi, Bluetooth), autonomy (high, medium, low), and emotional expression. We evaluated this last aspect considering three components: facial expression (light, mouth, eyes, text, graphics), movement (arms, legs, wheels), and audio (sound, voice). Based on these criteria, we analyzed the following commercial robots: Lego MindStorm EV3, Mbot 2.4G, Aisoy 1, Zenbo, Cozmo, Otto, and Zowi; we ended up choosing Vector, which is the evolution of Cozmo, and Zowi. The personalized versions of the chosen robots can be seen in Fig. 1.

Fig. 1. Robots used for promoting a behavioral change in the novel students of the UOC: Zowi at the left and Vector at the right

The Cozmo/Vector robot has the following characteristics: small and compact, with customization of the system only by the owner, limited programming (SDK), low autonomy, average cost, wheels, voice, arms, eyes, text, voice recognition, facial recognition, proximity detection, and WiFi. The Zowi robot has the following characteristics: medium size, compact, open-source programming, 3D printing, customization of the system only by the owner, high autonomy, low cost, legs, sound, mouth, proximity detection, and Bluetooth.

3.3 Proposal

In this first stage of the research, a set of indicators potentially useful for the behavioral change of novel students has been proposed. This process was conducted by defining a first set of over 30 potential indicators that cover the different activities students may perform within the Campus. From these indicators, only 18 were selected, after a collaborative prioritization process involving six members of the team. Each of the selected indicators (frequency of connection, for example) was analyzed to find out the expressions it may trigger (if the frequency indicates obsessive conduct, then a signal should be emitted to let the student know, for example), the habit we would like to promote (self-regulation in the example), and the action to be done by the robot (some humoristic action to make the student realize that he/she is connecting too much, for example). Note that the possible actions of the selected robots may be acoustic, luminous, or movement-based. The indicators mainly collect data about the student's connection to the Campus and classrooms, the student's access to the different resources within the classroom, and the student's marks in the different assessment activities. The details about the indicators are out of the scope of this paper.


Fig. 2. Architecture of the Botter System. The top of the figure shows the students' environment and the bottom the UOC environment.

Unfortunately, the personalization and programming environment of the robots, as well as their performance, may make it hard to execute complex processes within them and to connect them to the Campus to get the necessary data about the indicators. In order to make the system as generalizable and robot-agnostic as possible, and to reduce the workload on the robots, all the necessary computation is done on a Raspberry Pi (Raspberry from now on). Each robot pairs with a Raspberry that has to be installed at the student's home in order to use the robot.

The architecture of the system (see Fig. 2) is split into two environments: the university and the students' environment. In the university environment, the fingerprint data of students (data about their interactions within the Campus) are stored in a data lake [12]. A Botter Engine continuously checks the data lake for new data about the selected indicators and recalculates the indicators (and the corresponding robot actions) as soon as new data arrive. Thereafter, the new actions are sent to the robots. At the student's home, the Raspberry receives the actions to be performed and sends them to the robot; depending on the student, the robot may be a Vector or a Zowi. The robots perform the actions and inform the Raspberry about the actions performed and their results (if any). The collected data are then sent back to the Botter Engine at the university, where they are potentially used to prioritize and filter new actions (a hedged sketch of this relay loop is given at the end of this section).

3.4 Pilot Design

In order to evaluate the system and to analyze the usefulness of the approach, we have designed a pilot study that is currently being performed.


The pilot began in February 2020 and will take one semester to finish. During the semester, 10 new students from the Psychology and Computer Science degrees of the UOC were chosen to use the robot as a companion in their first online learning experience. The inclusion criteria were to belong to the Psychology or Computer Science degree, to be a novel student, and to have no experience in online learning. Each of the students received one robot and one configured Raspberry (5 Vector and 5 Zowi). A training session was held for each student when the robots were delivered, and an infographic showing how the robot communicates was also provided. Data about the experience will be gathered using pre-post questionnaires and semi-structured interviews after the experiment.
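To make the architecture of Sect. 3.3 concrete, the following is a hedged sketch of the Raspberry-side relay loop described above. The endpoint URL, payload shape, and polling interval are our own assumptions made for illustration; the actual Botter protocol and the robot SDK calls are not specified in the paper.

```python
# Hedged sketch of the Raspberry-side relay loop of the Botter system;
# the endpoint URL, payload shape, and polling interval are illustrative
# assumptions, since the actual protocol is not specified in the paper.
import time

import requests  # pip install requests

ENGINE_URL = "https://botter.example.org/api"  # hypothetical Botter Engine API


def perform(action: dict) -> dict:
    """Forward one action to the paired robot (Vector or Zowi) and report
    its outcome; the real call would go through the robot's own SDK."""
    print(f"robot <- {action['type']}")  # placeholder for the SDK call
    return {"action": action["id"], "status": "done"}


while True:
    # Pull the pending actions the Botter Engine computed for this student.
    actions = requests.get(f"{ENGINE_URL}/actions", params={"student": "s-01"}).json()
    results = [perform(a) for a in actions]
    if results:
        # Report outcomes back so the engine can prioritize and filter
        # future actions, as described above.
        requests.post(f"{ENGINE_URL}/results", json=results)
    time.sleep(60)  # polling interval is an assumption
```

Keeping this logic on the Raspberry rather than on the robot matches the paper's goal of staying robot-agnostic: swapping a Vector for a Zowi would only change the body of perform().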

4 Conclusions and Future Work

Robotics is very likely to occupy a privileged place in people's training and education [13], although this does not necessarily imply the dehumanization of education. Training with robots does not mean delegating the functions of educators to relatively autonomous, automatic, or depersonalized processes. As long as e-learning professionals lead these changes, robotics can contribute not to dehumanizing education, but to broadening its scope as a tool for social change.

In this work we took on the challenge of studying how to design robots to be persuasive and cause behavioral change in novel students in online learning environments. From the work done so far with Botter, it seems advisable to keep the cost of the robots low to ensure that their use as a support tool can be generalized to as many students as possible. Likewise, we consider that research is necessary on the characteristics of robots, which can range from "bots" (operating systems) to anthropomorphic robots, and on combining these physical characteristics with other "psychological" ones, such as the type of triggers, the type of messages, their tone and style, etc. It is also important to consider that a robot is just a tool to serve educational purposes and, therefore, its incorporation should be driven from an instructional design perspective and a given educational model.

As further work, we plan to finish the experiment and share its findings. From there, the findings obtained will be used to decide whether to continue in this research line and, if so, in what direction.

Acknowledgments. This work has been partially supported by the eLearn Center of the UOC through the project titled "Botter: a personal robot for novel UOC students" and by the European Commission through the project "colMOOC: Integrating Conversational Agents and Learning Analytics in MOOCs" (588438-EPP-1-2017-1-EL-EPPKA2-KA). This research has also been supported by the Seidor Labs department of the Seidor company, which, as a UOC technology provider, implemented the robots and the interaction between them and the UOC Campus.

References

1. Spolaôr, N., Benitti, F.B.V.: Robotics applications grounded in learning theories on tertiary education: a systematic review. Comput. Educ. 112, 97–107 (2017)


2. Sangrà, A.: A new learning model for the information and knowledge society: the case of the UOC. Int. Rev. Res. Open Distance Learn. 2(2), 152–167 (2002)
3. Hiemstra, R.: Self-Directed Learning. IACE Hall of Fame Repository (1994)
4. Grau-Valldosera, J., Minguillón, J.: Redefining dropping out in online higher education: a case study from the UOC. In: Proceedings of the 1st International Conference on Learning Analytics and Knowledge, pp. 75–80 (2011)
5. Clark, F., Sanders, K., Carlson, M., Blanche, E., Jackson, J.: Synthesis of habit theory. OTJR Occup. Particip. Heal. 27(1_suppl), 7S–23S (2007)
6. Andrews, B.R.: Habit. Am. J. Psychol. 14(2), 121–149 (1903)
7. Ryan, R.M., Deci, E.L.: Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. Am. Psychol. 55(1), 68 (2000)
8. Birnbaum, G.E., Mizrahi, M., Hoffman, G., Reis, H.T., Finkel, E.J., Sass, O.: What robots can teach us about intimacy: the reassuring effects of robot responsiveness to human disclosure. Comput. Human Behav. 63, 416–423 (2016)
9. Oinas-Kukkonen, H., Harjumaa, M.: Persuasive systems design: key issues, process model, and system features. Commun. Assoc. Inf. Syst. 24(1), 28 (2009)
10. Fogg, B.: A behavior model for persuasive design. In: ACM International Conference Proceeding Series, vol. 350 (2009)
11. Ham, J., Cuijpers, R.H., Cabibihan, J.-J.: Combining robotic persuasive strategies: the persuasive power of a storytelling robot that uses gazing and gestures. Int. J. Soc. Robot. 7(4), 479–487 (2015)
12. Minguillón, J., Conesa, J., Rodríguez, M.E., Santanach, F.: Learning analytics in practice: providing e-learning researchers and practitioners with activity data. In: Frontiers of Cyberlearning, pp. 145–167. Springer, Singapore (2018)
13. Chidambaram, V., Chiang, Y.-H., Mutlu, B.: Designing persuasive robots: how robots might persuade people using vocal and nonverbal cues. In: Proceedings of the Seventh Annual ACM/IEEE International Conference on Human-Robot Interaction, pp. 293–300 (2012)

Towards the Design of Ethically-Aware Pedagogical Conversational Agents

Joan Casas-Roma(B) and Jordi Conesa

Department of Computer Science, Multimedia and Telecommunication, Universitat Oberta de Catalunya, Barcelona, Spain
{jcasasrom,jconesac}@uoc.edu

Abstract. Pedagogical conversational agents (PCAs) aimed at offering personalized support could greatly enhance the student experience by adapting their support to the student's learning needs and habits. But, as they become more autonomous, there is a greater need to consider the potential moral consequences of their choices and to take into consideration a holistic view of students. In face-to-face settings, the individual and holistic understanding that human teachers can have of their students and classrooms is key in ensuring that the teachers' decisions strive for the best learning experience. However, furnishing PCAs with such a holistic understanding of complex dimensions that often involve relations between multiple elements poses a challenge. Furthermore, ethical aspects should also be considered in order to avoid decisions that can lead to discriminatory and unfair collateral results and end up favouring certain students over others; therefore, the PCA should be aware of how decisions aimed at satisfying certain needs could negatively affect other dimensions. The current position paper states the need to furnish PCAs with ethical awareness and discusses the nature of ethical dilemmas in the context of PCAs in online learning environments. Through this discussion, the paper proposes some first steps towards the design of ethically-aware PCAs.

1 Introduction and Motivations

The new interdisciplinary approach of Learning Engineering as the merge of breakthrough educational methodologies and technologies based on Internet, Data Science and Artificial Intelligence have completely changed the landscape of online education over the last years by creating accessible, reliable and affordable data-rich powerful learning environments [1]. Particularly, Artificial Intelligence (AI) driven technologies have managed to automate pedagogical behaviours that we would deem as “intelligent” within an online education setting in order to provide management, motivational and adaptive support to large cohorts of online students with minimum intervention of human instructors who can leverage their value time to pedagogical critical tasks. However, as reported in more mature sectors where AI-driven technologies have already been developed and deployed, automatic decision-making processes c The Editor(s) (if applicable) and The Author(s), under exclusive license  to Springer Nature Switzerland AG 2021 L. Barolli et al. (Eds.): 3PGCIC 2020, LNNS 158, pp. 188–198, 2021. https://doi.org/10.1007/978-3-030-61105-7_19


many times bear unexpected outcomes. For instance, ML-based systems have been reported to discriminate against certain social communities in contexts such as law courts, job applications [2,3] or bank loans, due to the use of biased datasets to feed the ML models [4]. Indeed, different studies [5,6] have shown how, in some cases, AI-driven systems have been making unfair and biased decisions that have been detrimental to society, harmful to certain social groups, and contrary to the future we may expect. These studies conclude that, in order to avoid unforeseen outcomes in their integration, the ethical dimension of deploying AI in different settings must be taken into account [7–9]. In addition, as AI systems become more and more autonomous, their decisions have a greater effect on our society and pave the way for our future. As pointed out by different authors like [10–13], or [14], the more autonomous AI systems get, the more important it is to ensure that those systems consider the potential moral consequences of their choices. This position paper explores the need to integrate ethical awareness and moral reasoning into Pedagogical Conversational Agents (PCAs) offering personalized support to students in an online learning environment. As such, this work first discusses the nature of ethical dilemmas in online education and the use of PCAs in order to understand what tools would be more suitable for the task at hand. Then, the work briefly introduces Morality Systems as a way of encoding complex contextual notions in a computational setting. Furthermore, it identifies the need to acknowledge certain relevant dimensions of a learning environment as a prerequisite to furnish PCAs with ethical awareness and moral reasoning capabilities, and suggests taking inspiration from Morality Systems for that task. Finally, the paper points towards further lines of work.

2 Ethical Dilemmas in Online Education and PCAs

Since online learning environments that integrate AI agents to offer personalized support to students are a relatively new setting, the ethical concerns behind them still need to be thoroughly explored. In this section we provide some first steps into exploring the nature and shape of ethical dilemmas1 in this context. In particular, we explore two structures that ethical dilemmas often take in other fields, and discuss how they relate to online education and PCAs.

2.1 Ethical Dilemmas as Resource Allocation

Some ethical dilemmas occur when having to decide how to distribute a relatively small amount of resources among a larger population. Clear examples would involve systems that decide whether to grant a bank loan to an applicant, or whether a prospective student can get access to a certain university degree. Because both money for loans and places in physical learning centers are usually limited, and the number of people asking for them is greater than the available resources, a selection process is needed to determine who gets the resource and who does not. The ethical dimension comes in when the potential advantages and detrimental effects of making one choice over another can result in, for instance, disfavoring a certain social group, or reinforcing an existing situation that is considered unfair. Similarly, concerns about discriminatory practices arise when the factors taken into account to make such a decision are not relevant –for instance, when gender or ethnicity play a role in deciding who gets a bank loan. These kinds of ethical dilemmas can be seen as a resource allocation problem.

1 We understand an ethical dilemma as a scenario involving a decision in which each possible choice results in both beneficial and detrimental consequences. Namely, in ethical dilemmas there are no solutions that are only (or clearly more) beneficial for the actors involved; instead, they always involve a choice between options that have a certain amount of "bad" consequences.

We can easily locate the counterpart of resource allocation-based ethical dilemmas in face-to-face education. In this setting, human teachers have a limited number of hours that need to be distributed amongst a certain number of students, and the way those hours are assigned will likely have consequences for their learning experience. Deciding how to allocate such office hours could take into account different features like the students' performance (should the teacher devote more time to those who are performing more poorly?), their motivation (should the teacher support those who are more engaged in the learning process?), as well as many other features. Needless to say, taking into account traits such as the students' gender or ethnicity when deciding how to distribute office hours would likely be seen as a discriminatory practice, as those features should not be relevant for this matter.

When considering the case of an online learning environment integrating PCAs for personalized support, we need to understand what the equivalent of those cases would be. However, when looking at the interactions between the PCA and the student, resource allocation does not seem to be an issue here. Whereas the support given by a human teacher has clear time and spatial limitations, PCAs that offer personalized support to each student aim to bridge precisely this gap. In this sense, there is no scarcity of resources (in terms of the PCA's availability of time and effort), considering that each student will be supported by a personalized instance of a PCA that has full-time availability for the student, as well as access to the same learning materials. Due to this, PCAs do not need to decide which student to support, or give more time to, and thus no decision process is needed in this sense. Furthermore, concerns regarding how the PCA might prioritize its time, or about which features are taken into account to do so, vanish when we have a 1-to-1 relation between an instance of the PCA and a single student. What, then, is the shape of ethical dilemmas in this setting?

2.2 Ethical Dilemmas as Clashing Principles

Other ethical dilemmas are about how to make a decision in the presence of conflicting rules, or principles. For instance, in other areas such as healthcare, ethical


dilemmas come when different paths of a decision may be in conflict with different principles concerning the patient. As an example, Biomedical Ethics aims to leverage principles such as Beneficence (about ensuring that actions are aimed towards improving the patient's condition), Non-maleficence (about ensuring that those actions will not knowingly cause harm to the patient) and Autonomy (about granting the patient the right to decide whether they want to receive a treatment), among others. As explained in [15], ethical dilemmas in the context of healthcare may involve understanding how a choice can affect those principles differently (and potentially violate some of them); would it be ethically correct to disregard the patient's choice not to receive a treatment that will guarantee their improvement, thus disregarding their Autonomy in favor of Beneficence? Or would it be ethically correct to accept a patient's request for a certain treatment, knowing that this treatment can cause serious harm to the patient, thus favoring Autonomy but potentially violating Non-maleficence? Under this view, these ethical dilemmas are not about the allocation of limited resources, but rather about coming up with a decision mechanism for choices that will be in conflict with principles that should not be violated.

The ethical dimension behind a personalized PCA in an online learning environment is not about how to distribute scarce resources among a larger population, but rather about understanding and evaluating how the outcomes of different decisions will affect certain dimensions of the students differently. Drawing a parallel with the healthcare scenario, just as a treatment for a patient should always try to respect the principles of Beneficence, Non-maleficence and Autonomy, learning activities proposed to a student should always aim at benefiting different dimensions of the student and their learning. Among others, these dimensions may include their conceptual understanding of the course contents and the development of the course skills (Learning), as well as their marks (Evaluation-marks), time dedication (Dedication) and emotional well-being and motivational state (Engagement). However, the question behind the design of ethically-aware PCAs encapsulates a technological challenge: if ethical dilemmas in this setting take the shape of potential conflicts between different dimensions as a result of the PCA's interactions, we first need to ensure that those dimensions can be successfully represented in a computational way and, furthermore, understood and taken into account by the PCAs. If we have access to these different representations, we can be aware of and understand how a certain action made by the PCA can simultaneously affect some of them. Say, for instance, that a student is really struggling to follow a module and is doing poorly in their exercises; furthermore, the student feels overwhelmed and demotivated, and is considering dropping out entirely. How should the educational bot react in this setting? By focusing on the Learning dimension, the bot could decide to dump a set of new exercises on the student that, although rightfully aimed at improving their understanding of the module's content, might further overwhelm the student (Dedication) and cause them to completely disengage from the module. Conversely, if the bot only focuses on the emotional dimension of the student, and aims to provide reassuring support


to keep the student motivated (Engagement), but without ensuring that they will get the chance to improve their understanding of the topic (Learning), the student might fail to overcome their limitations. We could also give more guidance to the students by narrowing the activities of the course down to ones similar to those students will face in the final exam; that would probably increase performance (Evaluation-marks), but it would negatively affect the students' competence on the topic (Learning) and probably hinder their understanding in further courses. Just as in the healthcare context, an action that would be beneficial for one dimension might be detrimental for another. In this sense, we believe that deploying multiple layers to represent the relevant dimensions of a student in an online learning environment would allow the PCA to be aware of, understand and evaluate the best course of action according to the overall state of the student. In order to achieve that, we propose to take inspiration from and re-use some of the techniques that can be found in the computational representation of one of the most complex types of situations: morally-qualifiable actions.
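As a very rough sketch of what such a representation could look like (our illustration only: the dimension names come from the discussion above, while the 0–1 scale, the impact values and all identifiers are hypothetical assumptions), consider:

```python
from dataclasses import dataclass

# The four dimensions discussed above, tracked on an illustrative 0..1 scale.
@dataclass
class StudentState:
    learning: float = 0.5
    evaluation_marks: float = 0.5
    dedication: float = 0.5
    engagement: float = 0.5

# Hypothetical impact vectors: how each PCA action is expected to shift
# each dimension (positive = beneficial, negative = detrimental).
ACTION_IMPACTS = {
    "dump_new_exercises":   {"learning": +0.2, "dedication": -0.2, "engagement": -0.2},
    "reassuring_support":   {"engagement": +0.2},
    "exam_like_activities": {"evaluation_marks": +0.2, "learning": -0.1},
}

def foresee(state: StudentState, action: str) -> StudentState:
    """Project the student's state after an action, clamped to [0, 1]."""
    values = vars(state).copy()
    for dim, delta in ACTION_IMPACTS[action].items():
        values[dim] = min(1.0, max(0.0, values[dim] + delta))
    return StudentState(**values)

# An overwhelmed, demotivated student: dumping exercises improves Learning
# but worsens Dedication and Engagement at the same time.
student = StudentState(learning=0.3, dedication=0.2, engagement=0.2)
print(foresee(student, "dump_new_exercises"))
```

Even this toy projection makes the trade-off explicit: every candidate action simultaneously touches several dimensions, which is precisely what the PCA needs to be aware of.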

3 Encoding Moral Reasoning

Moral reasoning is often considered one of the most complex types of reasoning [16], as it often involves taking into account a complex network of relationships between an agent performing an action, which has a set of outcomes that affect a patient. Therefore, reasoning about the moral outcomes of a situation often requires taking into account plenty of contextual information that can be challenging to represent in a computational way. In this regard, the so-called Morality Systems provide different approaches that can help represent complex notions involved in moral reasoning. Whereas some ethical and moral dilemmas are sometimes explored in a sort of vacuum (that is, enclosed in a particular situation with fairly limited scope), most dilemmas in real scenarios must take into account potential short- and long-term effects in a complex network of different agents. When considering moral dilemmas involving multiple actors (either agents or patients), works in Multi-Agent Systems (MAS) need to incorporate a way of representing and assessing moral values related to other agents and their environment. For example, in [17] the authors define a system that encodes desires, morals and abilities, independently taking into account the "goodness" of an action (i.e., how acceptable or good an action would usually be: for instance, "stealing" would be considered bad), as well as its "rightness" (i.e., whether a particular event would be deemed acceptable, or fair, even if it involves an action that would not usually be: for instance, a starving orphan stealing an apple might be seen as acceptable, even if stealing usually is not). In this work, the authors encode a system of agents who independently perceive and judge situations differently, according to the encoding of their beliefs, desires and values.

A quite different setting, but one which can provide useful insights as well in this regard, is that of simulating behaviors of agents in virtual worlds –specifically, in


ludic virtual worlds such as those featured in certain digital games. Even though the primary goal in that field is usually to maximize the ludic engagement of their users (the players), achieving high levels of realistic social behavior is something many digital games have chased to improve their virtual worlds' believability, and it can be seen as closely related to certain MAS research settings [18]. Among other traits, this social behavior includes how to appropriately identify and react to morally-qualifiable events. In [19], for example, different titles of a series are analyzed in order to show how each one manages to capture one or another relevant trait in the representation and evaluation of morally-qualifiable actions, and further guidelines are given as to how to combine those existing mechanics to achieve a greater level of detail. Similarly, in [20] it is shown how different morality systems capture a consequentialist, an intentionalist, and a virtue ethics approach, respectively; furthermore, the paper proposes a theory-independent model to computationally represent the relevant parts of morally-qualifiable actions, as shown in Fig. 1. The way those titles highlight different parts of their virtual world and the virtual agents inhabiting it points towards interesting ways of enhancing the functionality of an ethically-aware artificial agent that is not kept in a vacuum, but rather interacts with different elements in a virtual environment.

Fig. 1. A general model for morally-qualifiable actions.
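The following Python sketch is our own loose transcription of the agent–action–outcome–patient structure sketched in Fig. 1, combined with the "goodness"/"rightness" distinction discussed above; the class names, scales and numbers are illustrative assumptions, not the actual models of [17] or [20]:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Outcome:
    patient: str     # who is affected by the action
    effect: float    # signed magnitude of the effect on that patient

@dataclass
class MoralAction:
    agent: str
    name: str
    goodness: float  # how acceptable the action type usually is
    outcomes: List[Outcome] = field(default_factory=list)

    def rightness(self) -> float:
        """Judge this particular occurrence: the base goodness of the
        action type, adjusted by its actual effects on the patients."""
        return self.goodness + sum(o.effect for o in self.outcomes)

# "Stealing" is usually bad (negative goodness), but this particular event
# has a strongly positive outcome for the starving orphan, so it may still
# be judged as acceptable overall.
theft = MoralAction(
    agent="orphan", name="steal_apple", goodness=-0.5,
    outcomes=[Outcome("orphan", +0.8), Outcome("vendor", -0.1)],
)
print(round(theft.rightness(), 2))  # 0.2 > 0: acceptable despite a bad action type
```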

Works such as [21], [22] and [23] describe different approaches to modeling morality in the context of digital games. Although, as mentioned before, the desired level of detail represented in those systems is constrained by the expected enjoyment of the player, the techniques highlighted in them can easily be translated and applied to represent certain dimensions of digital agents, avatars, or virtual personas. In the case that a virtual agent, such as a PCA, is deployed in an online medium where it can continuously interact with different agents (be they other artificial agents, or digital representations


of human agents), being able to accurately represent and update those agents' traits and relations can likely improve the PCA's understanding of the situation in order to act accordingly. As we have argued in the previous section, identifying and representing the relevant dimensions of the students in an online learning environment paves the way towards furnishing PCAs with the tools needed to ensure that their actions are aimed at benefiting (or, at least, not being detrimental to) those dimensions. Furthermore, in case the PCA identifies conflicts between potential courses of action that either benefit, or are detrimental to, some of the students' dimensions, the PCA needs to be equipped with a decision procedure to take the best course of action, while ensuring that such a course of action is ethically and morally acceptable. Considering this, we argue that using insights taken from Morality Systems to model similar complex notions and long-term effects would not only be beneficial regarding the ethical and moral awareness of the PCA, but would also help in the computational modeling of challenging, long-term goals related to the overall learning experience of the students.

4 Representing Complex Dimensions Using Morality Systems

In order to get a comprehensive picture of a classroom and its students, a human teacher has to be aware of multiple dimensions that go beyond the traditional methods used to evaluate the students' progress in a module, such as exercises and assignments. In this sense, a human teacher has to be able to identify whether the students are actually learning, understanding and connecting the different topics within a module, to understand why the students might be struggling to solve certain exercises, to recognize the emotional climate of the classroom and of the individual students (whether they react positively or negatively to different inputs), and to identify their motivational state to prevent students from feeling overwhelmed and disengaging from the module. The task of the human teacher, therefore, often goes way beyond simply delivering lectures and dumping exercises, and requires a holistic, thorough understanding of complex dimensions that usually involves leveraging multiple assets, preferences and goals, as well as foreseeing how different interactions will affect those dimensions. Following this, we argue that as PCAs become more autonomous and more aimed at offering personalized, quality support to the students, they too need to have an awareness of these relevant traits. This will not only make the PCAs' support better in terms of helping the students, but will also allow the PCAs to foresee and evaluate the potential benefits and detrimental consequences that their interactions might have on those multiple dimensions, both at the individual level and at the level of the classroom. These dimensions are not disjoint from the ethical dimensions discussed previously, but overlapping. We propose to capture these relevant dimensions in a computational way by defining different layers, taking insights and inspiration from Morality Systems. Similarly to what happens with moral reasoning, representing the aforementioned dimensions of students within an online learning environment poses


a conceptual challenge, as one needs to take into account multiple elements with complex interconnected parts. Our proposed approach involves defining independent systems for each of the relevant dimensions to take into account; for instance, a Learning Tracking System, an Evaluation-Marks Tracker System, an Engagement Tracker System, etc. Each of these systems would need to be integrated with the PCA's potential interactions that result from a student input in order to understand how those interactions might affect that particular dimension. Let's consider an example situation to illustrate the idea behind this: a student struggling to solve a set of exercises asks the PCA for support. The input of the student will trigger a set of available interactions for the PCA that might result, in the end, in explaining something to the student, or in proposing a different exercise to strengthen their technical skills, to name a few. In this case, a Learning Tracking System integrated with the PCA's potential interactions could foresee how they will likely affect different relevant parts of the student's learning process (like their conceptual understanding of the module, their progress within graded assignments, their technical skills, etc.) by representing (through weighted relations) "how much" a certain interaction relates to those parts of the student's learning process, and then choosing the one that will likely lead to the most desirable result. Figure 2 shows a simple schema capturing this idea.

Fig. 2. General schema of the Learning Tracking System.
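To make the weighted-relation idea concrete, the following minimal Python sketch (our illustration; the interaction names, tracked parts and weights are all hypothetical and not part of any existing tool) scores each candidate interaction against the student's current needs:

```python
# Hypothetical weights: how much each candidate interaction relates to
# each tracked part of the student's learning process (0 = unrelated).
INTERACTION_WEIGHTS = {
    "explain_concept":      {"conceptual_understanding": 0.8, "technical_skills": 0.2},
    "propose_new_exercise": {"technical_skills": 0.7, "assignment_progress": 0.3},
    "review_assignment":    {"assignment_progress": 0.8, "conceptual_understanding": 0.2},
}

def best_interaction(needs: dict) -> str:
    """Pick the interaction whose weighted relations best match the
    student's current needs (e.g. derived from their latest input)."""
    def score(interaction: str) -> float:
        weights = INTERACTION_WEIGHTS[interaction]
        return sum(weights.get(part, 0.0) * strength
                   for part, strength in needs.items())
    return max(INTERACTION_WEIGHTS, key=score)

# A student struggling with exercises: technical skills are the most
# pressing need, conceptual understanding a secondary one.
print(best_interaction({"technical_skills": 0.9,
                        "conceptual_understanding": 0.4}))
# -> "propose_new_exercise"
```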

Similarly, a different system should be defined to account for the same kind of information, but with respect to the other relevant dimensions of the student. For example, going back to the example near the end of Sect. 2.2, let's consider again a student who is struggling to follow the course content, but who is at the same time feeling overwhelmed and demotivated. By considering only the Learning dimension of the student, we may fail to consider other issues that need to be addressed for the student's well-being and effective learning; in this example case, a Dedication Tracker System, acknowledging the student's working habits and time spent on different activities, as well as an Engagement Tracker System, monitoring the student's interest and involvement in the course content, would help take all three relevant dimensions into account. Only by bringing these different dimensions into the picture could we furnish a system


with ways of understanding how to balance these different parts, as well as deciding which issue is more pressing and should be addressed first –for instance, in this example it might be worth providing the student with a more step-by-step explanation of previous concepts which, although it wouldn't enhance the Learning dimension, could make the student regain self-confidence and interest (Engagement) before throwing in a new exercise, and prevent the student from feeling further pressured and frustrated. Nevertheless, if these systems are kept isolated, each one in a different bubble, the PCA will still lack a holistic perspective of how its interactions affect the student and the classroom overall. Interactions that might be desirable from the Learning perspective might be detrimental to the Engagement of the student, and the PCA needs to be aware of these potential tensions. This is where our approach meets the interpretation of ethical dilemmas as clashing principles. When the PCA's interactions cannot be beneficial for each and every dimension of the student, the PCA must be furnished with mechanisms to decide what to prioritize in that case –or, conversely, to decide which dimension should be disregarded for a "greater good". Our struggling, overwhelmed student might benefit more from emotional support and regaining confidence to begin with, before being ready to tackle more challenging learning activities: in this scenario, an interaction strengthening the Engagement dimension, but disregarding the Learning one, might be preferred. In order to achieve this, we would need to define an additional layer connecting the PCA's interactions to all the available sub-systems and identifying how much a given interaction supports or disregards each one of them. In this sense, identifying the most pressing dimensions in the student's interaction, and evaluating which of the available interactions support those in a better way, can help the PCA choose how to interact with the student while being aware not only of the benefits of that interaction, but also of the dimensions being disregarded in that case. This can be important afterwards if, for example, one of those dimensions is systematically disregarded due to the nature of the student's interactions, as the PCA might then try to balance out that difference in future interactions.
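A sketch of that additional layer could look as follows (again our own illustration, with hypothetical names and values): each dimension is weighted by how pressing it currently is, so an interaction that disregards a low-priority dimension can still be chosen.

```python
# Hypothetical cross-dimension impact table: how much each interaction
# supports (+) or disregards (-) each tracked dimension.
IMPACTS = {
    "step_by_step_explanation": {"learning": 0.0, "engagement": +0.6, "dedication": +0.2},
    "new_challenging_exercise": {"learning": +0.7, "engagement": -0.4, "dedication": -0.3},
}

def arbitrate(priorities: dict) -> str:
    """Additional layer: weight each dimension by how pressing it is now
    and pick the interaction with the best overall balance."""
    def overall(interaction: str) -> float:
        return sum(IMPACTS[interaction].get(dim, 0.0) * weight
                   for dim, weight in priorities.items())
    return max(IMPACTS, key=overall)

# Overwhelmed, demotivated student: Engagement is the most pressing
# dimension, so the explanation wins even though it does not enhance
# Learning at this point.
print(arbitrate({"engagement": 0.8, "learning": 0.3, "dedication": 0.4}))
# -> "step_by_step_explanation"
```

Note that the same structure also records which dimensions each choice disregards, which is what would later allow the PCA to compensate for a systematically neglected dimension.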

5 Lines of Future Work

Ethical dilemmas in online contexts that use PCAs to offer personalized support can be seen as a problem of conflicting dimensions to support. Choosing one interaction over another as a result of a student query can strengthen certain dimensions, while disregarding others. In order to ensure that the PCA's decisions are fair and aimed at providing the best support, the PCA needs to be furnished with an awareness of both individual and holistic dimensions within the online classroom. Defining such complex notions in a computational way poses a conceptual challenge, but we claim that insights gained from Morality Systems can help pave the way towards that. Furthermore, defining several of those dimensions and integrating them through an additional layer can bring ethical dilemmas in online


contexts closer to the ones found in healthcare scenarios, for instance. Works such as [15] already provide some interesting and successful first approaches to artificial moral reasoning in the context of healthcare, and those insights could be applied in online learning environments once ethical dilemmas in both scenarios are seen in the same light. A clear line of future work involves the formal definition and prototyping of the relevant dimensions identified in a learning context, such as the Learning Tracking System. The suitability of the resulting model would then need to be evaluated using sets of cases taken from online learning environments. Furthermore, if the model effectively captures the ethical dimension of such decisions, a metric to evaluate the ethical outcomes of any given decision (be it made by a PCA or by a human teacher) could be implemented to further refine and improve our approach.

Acknowledgements. This work has been partially supported by the European Commission through the project "colMOOC: Integrating Conversational Agents and Learning Analytics in MOOCs" (588438-EPP-1-2017-1-EL-EPPKA2-KA) and by a UOC postdoctoral stay.

References
1. Dede, C., Richards, J., Saxberg, B.: Learning Engineering for Online Education: Theoretical Contexts and Design-based Examples. Routledge, New York (2018)
2. Sullivan, C.A.: Vill. L. Rev. 63, 395 (2018)
3. King, A.G., Mrkonich, M.J.: Okla. L. Rev. 68, 555 (2015)
4. Gumbus, A., Grodzinsky, F.: ACM SIGCAS Comput. Soc. 45(3), 118 (2016)
5. Yapo, A., Weiss, J.: In: Proceedings of the 51st Hawaii International Conference on System Sciences (2018)
6. Favaretto, M., De Clercq, E., Elger, B.S.: J. Big Data 6(1), 12 (2019)
7. Angwin, J., Larson, J., Mattu, S., Kirchner, L.: ProPublica, 23 May 2016 (2016)
8. Taylor, L.: Big Data Soc. 4(2), 2053951717736335 (2017)
9. Veale, M., Binns, R.: Big Data Soc. 4(2), 2053951717743530 (2017)
10. Gunkel, D.J.: The Machine Question: Critical Perspectives on AI, Robots, and Ethics. MIT Press, Cambridge (2012)
11. Floridi, L., Sanders, J.W.: Minds Mach. 14(3), 349 (2004)
12. Sullins, J.P.: Ethics Inf. Technol. 7, 139 (2005)
13. Wallach, W., Franklin, S., Allen, C.: Topics Cogn. Sci. 2(3), 454 (2010)
14. Muntean, I., Howard, D.: In: Seibt, J., Hakli, R., Norskov, M. (eds.) Sociable Robots and the Future of Social Relations: Proceedings of Robo-Philosophy 2014, pp. 217–230. IOS Press (2014)
15. Anderson, M., Anderson, S.L., Armen, C.: IEEE Intell. Syst. 21, 56 (2006)
16. Wallach, W., Allen, C., Smit, I.: AI Soc. 22(4), 565 (2008)
17. Cointe, N., Bonnet, G., Boissier, O.: In: AAMAS, pp. 1106–1114 (2016)
18. Casas-Roma, J., Nelson, M.J., Arnedo-Moreno, J., Gaudl, S.E., Saunders, R.: In: Proceedings of the 11th International Conference on Agents and Artificial Intelligence, vol. 1, pp. 244–251 (2019)
19. Casas-Roma, J., Arnedo-Moreno, J.: In: Proceedings of the Digital Games Research Association Conference, DiGRA 2019 (2019)


20. Casas-Roma, J., Arnedo-Moreno, J.: Categorizing morality systems through the lens of Fallout. In: Proceedings of the Catalan Conference of Artificial Intelligence (CCIA), pp. 19–28 (2019)
21. Fitzpatrick, R., Walsh, M., Nitsche, M.: In: Aesthetics of Play, Bergen, Norway, 14–15 October 2005 (2005)
22. Russell, A.: In: AI Game Programming Wisdom, vol. 3. Charles River Media, Rockland (2006)
23. Heron, M., Belford, P.: Comput. Games J. 3(1), 34 (2014)

Evaluation on Using Conversational Pedagogical Agents to Support Collaborative Learning in MOOCs

Santi Caballé(B), Jordi Conesa, and David Gañán

Faculty of Computer Science, Multimedia, and Telecommunications, Universitat Oberta de Catalunya, Barcelona, Spain
{scaballe,jconesac,dganan}@uoc.edu

Abstract. Massive Open Online Courses (MOOCs) introduce a way of transcending formal education by realizing technology-enhanced formats of learning and instruction and by granting access to an audience way beyond higher education. However, although MOOCs have been reported as an efficient and important educational tool, there are a number of issues and problems related to their educational impact. More specifically, there is an important number of dropouts during a course, little participation, and a lack of student motivation and engagement overall. This paper updates the progress of the European project called "colMOOC" that aims to enhance the MOOC experience by integrating collaborative settings based on conversational pedagogical agents to support both students and teachers during a MOOC course. Conversational pedagogical agents guide and support student dialogue using natural language in both individual and collaborative settings. Integrating this type of conversational agent into MOOCs to trigger peer interaction in discussion groups can considerably increase the engagement and the commitment of online students and, consequently, reduce the MOOC dropout rate. The paper reports on the first evaluation experience of incorporating synchronous collaborative activities mediated by conversational pedagogical agents into a real MOOC with massive international participation. Evaluation results are statistically described in terms of participation, performance and satisfaction. The research reported in this paper is currently undertaken within the project colMOOC funded by the European Commission.

1 Introduction
Massive Open Online Courses (MOOCs) [1] arose as a way of transcending formal higher education by realizing technology-enhanced formats of learning and instruction and by granting access to an audience way beyond higher education. However, although MOOCs have been reported as an efficient and important educational tool, there are a number of issues and problems related to their educational impact. More specifically, there is an important number of dropouts during a course, little participation, and a lack of student motivation and engagement overall. This may be due to one-size-fits-all instructional approaches and very limited commitment to student-student and teacher-student collaboration [2].


This paper reports on the current progress of a European project called colMOOC (https://colmooc.eu/) [3] that aims to enhance the MOOC experience by integrating collaborative settings based on Conversational Agents (CAs) in synchronous collaboration conditions. CAs guide and support student dialogue using natural language in both individual and collaborative settings [3, 4]; they have been produced for a wide variety of applications, and studies exploring the usage of such agents have reported positive results. Integrating this type of CA into MOOCs is expected to trigger productive peer interaction in discussion groups and, therefore, to considerably increase the engagement and the commitment of online students, consequently reducing the overall MOOC dropout rate. To achieve this, the project employs collaborative activities supported by CAs [4–6]. Considering the above, this paper reports on the evaluation results of the first experience of using CAs to support collaborative learning activities in a real MOOC with massive participation, where CAs automatically mediated about 500 chat discussions in dyads (i.e., 2 learners). A thorough methodology is proposed to conduct the evaluation process, and the results are shown in terms of participation, performance and satisfaction with both the CA-mediated activities and the MOOC overall. The remainder of the paper is structured as follows: Sect. 2 reviews the goals and expectations forming the context of the colMOOC project and in particular the issues and challenges of MOOCs. Section 3 and Sect. 4 are the core of the paper, reporting a detailed evaluation methodology along with the evaluation results from a descriptive statistical perspective. Finally, Sect. 5 provides final remarks on the evaluation and outlines the following steps in the evaluation plan of the colMOOC project.

2 Background
Supporting MOOCs implies identifying the drawbacks, problems and several user requirements that they have not yet addressed. Since the top-ranked academic community joined the MOOC hype, academic sectors have held controversial discussions on the many MOOC challenges that must be faced before moving on [7, 8], including a high learner dropout rate, with only 5% to 15% of participants finishing the courses on average, as well as limited teaching involvement during delivery stages [8]. The colMOOC project aims to provide innovative ways of promoting learners' interaction by enabling the teacher to configure a CA software component which attempts to trigger learners' discussion through appropriate interventions in dialogue-based collaborative activities [9]. The development of the agent will consider all current research evidence on the value of both the 'Academic Productive Talk' [5] and the 'Transactive dialogue' [10] frameworks as forms of productive peer dialogue. The main functional features of the agent component include: (a) a user-friendly interface for the instructor to easily model the knowledge domain necessary for the agent intervention and also configure the agent behaviour during students' discourse; (b) an appropriately designed MOOC interface for students' synchronous and asynchronous online discussions, where the agent also appears and intervenes in the discussion aiming to trigger productive students' social interaction, domain-focused cognitive activity and deeper learning [11–13]. To achieve the above aims, the work methodology of the colMOOC project follows a four-step loop approach where education and training policies are used as input in


Fig. 1. colMOOC project main activities.

order to produce the project's objectives and then provide new policy actions as outputs based on validated evidence (Fig. 1 and also [3]). This paper focuses on the third step of this work methodology (see "MOOCs Evaluation" in Fig. 1). This step includes the evaluation plan of the project, aiming to develop several pilot MOOCs of different pedagogical disciplines, namely Programming for Non-Programmers, Computational and Design Thinking, and Educational Technologies in the Classroom. The latter is the discipline of the MOOC used for the evaluation purposes reported in this paper.

3 Evaluation Methodology
In this section, we present the methodology for evaluating the colMOOC platform and in particular the CA component. Following standard methodologies to report online education research [14], information about the participants, the apparatus used for the evaluation and the procedure of the pilots is provided in this section. The results of the evaluation are reported in the next section.

3.1 Evaluation Goals
To evaluate the effects of the CA-mediated activities on participation, performance and satisfaction behavior, as well as their connection with potential learning benefits at the level of the MOOC and of the CA-mediated activities.

3.2 Participants
The trial was run as a MOOC course on the MiriadaX platform1 between January and March of 2020. The course was delivered in Spanish, mainly to Spaniards and Latin-Americans. About 2,000 people registered for the MOOC with the following demographics (the data refer to the roughly 30% of registrants who answered a survey during the registration process):

1 Miríadax is the world's leading non-English-speaking MOOC platform. It currently has more than 6.3 million registered students, more than 800 courses from 113 universities and institutions, and more than 3,000 instructors in its teaching community. https://miriadax.net/.


• Gender of participants was balanced (52% men and 48% women).
• Age of participants ranged on average between 24 and 35 years old; a great majority of them were between 25 and 54 (83%), while a marginal part were under 25 or older than 54.
• The academic profile of participants related to higher education included mostly teachers and researchers (46%) as well as people who had a university degree (31%), the former group being the target profile of this MOOC.

Finally, MOOC participants were geographically distributed widely among 40 countries. The vast majority of them (92%) were located in Spanish-speaking countries (Spain and Latin America), with Spain being the country with the most participants (25%).

3.3 Apparatus and Stimuli
This evaluation was carried out on the MOOC "Educational technologies in support for collaboration and evaluation in virtual learning environments" through the MiriadaX platform during five weeks between January 27 and March 2, 2020. The course was delivered in Spanish, mainly to Spaniards and Latin-Americans. The main objective of this MOOC was to endow course participants with competencies and practical knowledge for designing and implementing ICT-mediated learning activities that include collaboration and assessment in blended and online educational contexts. In particular, the participants learnt about collaborative learning and assessment methods with educational technologies within the framework of Digital Competence for Instructors. They learnt how to engage students in learning activities supported by educational tools for collaborative learning and assessment purposes. Additionally, the course provided participants with opportunities to reflect on the challenges and benefits of introducing educational technologies into the classroom. To this end, this MOOC targeted pre-service and in-service school and higher education instructors and lecturers who were interested in incorporating collaborative learning and assessment methods supported by ICT tools into their everyday teaching practice2.

The five-week course followed a traditional MOOC instructional design with 5 teaching modules based on video-lectures. Each module was scheduled to last one week, with intermediate and final self-assessments as well as discussion forums to share questions and comments among the participants and with instructors (Fig. 2). Several key activities were included in which learners were required to participate actively, such as synchronous collaborative activities mediated by a conversational agent, asynchronous collaborative activities supported by discussion forums, as well as a peer review activity (Fig. 2). While asynchronous activities had a flexible schedule in order for the learners to participate anytime, other schedules were strict in order to enable synchronous collaborative activities to be performed by dyads at the same time. MOOC learners were evaluated by their participation in self-assessment tests of different types (diagnostic, formative and summative), which were set up requiring a minimum number of correct answers and a maximum number of attempts to pass.

2 The MOOC is found following this URL: https://miriadax.net/web/tecnologia-educativa-para-apoyar-la-colaboracion-y-evaluacion-en-entornos-de-aprendizaje-virtual_.


In addition, through the formative self-assessment tests, participants received immediate feedback on their understanding of key course concepts in order for the learners to regulate their own study habits and methods. A summary of the syllabus of the course with the above-mentioned 5 modules and all the course activities is depicted in Fig. 2, highlighting the CA activities conducted during the course.

Fig. 2. Course syllabus of the MOOC with the 5 CA activities.

Finally, intermediate and final research surveys were conducted in the MOOC to collect information from the participants about each module, with emphasis on the CA activities, and about the overall MOOC through an end-course survey. The surveys included test-based questionnaires to evaluate the MOOC and, in particular, the CA activities of each module.

Questionnaires at the end of the MOOC's modules asked about the following:

• Difficulty and effort invested to complete the module,
• Benefits of CA activities for learning the module's contents,
• Appropriateness of the CA interventions in terms of time and content,
• Satisfaction with the CA activities of the module,
• Overall satisfaction with the module.

The final questionnaire at the end of the MOOC asked about the following:

• Experience and most valuable aspects of the CA activities,
• Overall satisfaction with the MOOC and CA activities,
• Potential participation in future CA activities.

Finally, all sections of the questionnaire had a final field to express suggestions and further comments about aspects not represented in the questions, as well as final hints for potential improvement of the CA activities and the MOOC overall. In order to participate


in the survey, participants were required to fill out and accept a consent form for private data collection and treatment prior to starting the MOOC.

3.4 Procedure
The MOOC course was run entirely online and participants could access it from anywhere and anytime with a computer or mobile device connected to the Internet. In this context, as collaboration and interaction with peers is not only motivating but often improves understanding, a series of collaborative chat activities based on conversational agents (CAs) was designed to facilitate discussion among learners, with the aim of promoting transactivity and collaborative knowledge building [11]. To this end, CAs triggered interventions based on transaction patterns to guide learners during the discussion [4]. However, these collaborative sessions were carried out synchronously by dyads, in which one peer needed to interact with another at the same time, thus requiring learners to schedule their meetings and perform the activity at a certain time. Overall, 5 CA activities were designed and implemented by the colMOOC platform throughout the course in order to support chat discussions as follows (see the MOOC syllabus in Fig. 2). Prior to their start, all the CA-mediated chat activities were designed by the course design team through the Editor tool of the colMOOC platform, where the discussion topic of each CA activity was essentially an open-ended domain question, which encouraged learners to discuss with each other during their respective course activity (see full description details of the Editor tool and how to create and configure a CA activity from the instructor perspective in [15]).

Fig. 3. Learner’s view of the colMOOC Player showing a chat discussion between two learners mediated by an agent (see https://colmooc.eu/presentation-video).

Each designed CA activity was ready to be implemented by the colMOOC Player tool (see Fig. 3) at the time when it was scheduled according to the course syllabus in the


form of a chat where dyads mediated by an agent performed the proposed activity. Upon joining the chat from the course platform, the two participants could find the description of the activity (see upper left side of Fig. 3), which was also duly provided to the learners in the corresponding course module, along with specific instructions on how to use the Player tool and specifically how to address the agent interventions. The response to the proposed activity was also submitted through the colMOOC Player (see lower left side of Fig. 3). This submission finalized the procedure of the activity. Finally, to complete the evaluation methodology, the next section reports the results of the evaluation.
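As a loose illustration of how transaction-pattern-triggered interventions of this kind can work (our own simplification only: the colMOOC agent's actual patterns and interventions are configured through the Editor tool and are not reproduced here), a minimal Python sketch could be:

```python
import re
from typing import Optional

# Hypothetical transaction patterns: when a learner's message matches,
# the agent intervenes with a prompt meant to deepen the peer dialogue.
PATTERNS = [
    (re.compile(r"\bi (dis)?agree\b", re.IGNORECASE),
     "Can you explain why? Try to build on your partner's argument."),
    (re.compile(r"\b(assessment|evaluation)\b", re.IGNORECASE),
     "How would you apply this method in your own classroom?"),
]

def agent_intervention(message: str) -> Optional[str]:
    """Return an intervention prompt if the message matches a pattern,
    or None to stay silent and let the peer dialogue flow."""
    for pattern, prompt in PATTERNS:
        if pattern.search(message):
            return prompt
    return None

print(agent_intervention("I agree that rubrics help with peer review"))
```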

4 Evaluation Results
In this section, following the methodology presented in the previous section, the results of the evaluation are presented at the level of the MOOC and of the CA-mediated activities, along with a thorough discussion and interpretation of them.

4.1 MOOC Evaluation Results
The MOOC evaluation results are reported here in terms of participation and completion rate as well as general satisfaction and performance.

Participation and Completion Rate
From about 2,000 learners registered in the MOOC, 1,160 started the course (about a 40% initial dropout rate) and 322 finished it, managing a respectable 28% completion rate, considering that the usual completion rate of participants who start a MOOC is between 5% and 15% according to ample studies on MOOCs [2, 8, 16]. Participation gradually decreased through each of the weekly modules, which is considered normal behavior.

Performance
The course performance of the MOOC was measured by comparing the results between a diagnostic evaluation test conducted right before starting the course and a summative evaluation test at the end of the course (5 weeks after the diagnostic test). Both tests consisted of 6 multiple-choice questions, each with 4 answers (only one correct), on the same main topics of the course. The questions and answers of both tests, though covering the same content, were formulated differently and listed in a different order to deter cheating. Results show that the diagnostic test was taken by 979 learners, of whom 67.9% passed, while the summative test was taken by 322 learners, of whom 83.9% passed. This means that the great majority of learners who completed the entire MOOC improved their knowledge of the course contents and understood the main concepts, thus achieving the concept-understanding objectives declared for the MOOC (see Sect. 3.3).
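The reported figures can be checked directly from the raw counts; the short Python snippet below (ours, for illustration only) reproduces the completion rate and the approximate numbers of learners passing each test:

```python
registered, started, finished = 2000, 1160, 322

initial_dropout = 1 - started / registered  # 0.42, "about 40%"
completion_rate = finished / started        # 0.2776..., the reported 28%

passed_diagnostic = round(0.679 * 979)      # ~665 of 979 test takers
passed_summative = round(0.839 * 322)       # ~270 of 322 test takers

print(f"completion rate: {completion_rate:.1%}")  # 27.8%
```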


Satisfaction
The MOOC contents in terms of course modules (i.e., video lectures and activities of each module) were generally found difficult, as most learners reported in the corresponding surveys at the end of each of the 5 modules that the level of difficulty was between medium and high (see left graph of Fig. 4). These results show that the difficulty of the course contents increased from the start till the end of the MOOC. Considering that the workload of all the modules was similar, it can be inferred that learners accumulated a certain level of tiredness as the course progressed. This reflection is confirmed by the results on the effort invested to complete the course modules, which is in line with the level of difficulty and also gradually increased through the course (see right graph of Fig. 4).

[Fig. 4 charts: "What was the difficulty of the module for you?" (left) and "How much effort have you invested to complete this module?" (right), with response counts per module M1–M5.]

Fig. 4. Survey results on MOOC satisfaction in terms of course contents difficulty (left) and effort invested (right) for each course module.

The overall satisfaction with the MOOC was measured by sentiment analysis on the learners' open comments in the survey conducted at the end of the course, where learners were asked to explain their overall experience with the course by focusing on the following key aspects: planning and schedule, overall workload, study materials, individual and collaborative activities, surveys, teaching support, and the MiriadaX technological platform supporting the MOOC. Of the 321 learners who submitted the survey, 191 (60%) expressed a positive feeling about the course and 30 (9%) expressed a negative feeling, while the remaining 100 learners (31%) decided not to express any opinion with respect to the MOOC. Therefore, the overall feeling about the MOOC was mostly positive. The results are even more striking when discarding the DK/DA responses and considering the overall satisfaction with the MOOC in terms of responses with positive feelings (86%) versus negative feelings (14%).
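The paper does not specify the sentiment analysis technique used on the open comments; as a rough illustration of the general approach, a minimal keyword-lexicon classifier over Spanish comments might look like this (the lexicon and example comments are our own, purely hypothetical):

```python
# Rough keyword-lexicon illustration only (comments were in Spanish).
POSITIVE = {"excelente", "útil", "interesante", "bueno", "gracias"}
NEGATIVE = {"problema", "difícil", "malo", "error", "desconexión"}

def classify(comment: str) -> str:
    words = set(comment.lower().split())
    pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "no opinion"  # counted as DK/DA

tally = {"positive": 0, "negative": 0, "no opinion": 0}
for comment in ["curso muy interesante y útil",
                "tuve un problema de desconexión"]:
    tally[classify(comment)] += 1
print(tally)
```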


4.2 CA Evaluation Results
The evaluation results of the CA activities are reported here in terms of participation in, and satisfaction with, this type of activity. Overall, about 500 chat discussions by dyads were automatically mediated by CAs. When asking the learners who participated in the CA activities (80% of the respondents to the final survey) about which aspect of these CA-mediated activities they liked most, most of them (65%) liked exchanging ideas and then managing to reach a common solution to the activity with their partner, while the rest of the participating learners liked the interaction with the agent, either receiving the agent's interventions or answering the agent's questions (Fig. 5). These results can be interpreted as showing that, while the direct interaction with the agent was perceived as interesting by the learners, the main benefit of the agent mediation was to foster the discussion and collaboration between the peers in a transparent way.

[Fig. 5 chart: "What aspect did you like most of the CA activities?" — Exchange ideas with my partner: 115; Manage to reach a common solution of the activity with my partner: 88; I did not perform any CA activity: 58; Respond to the agent questions: 23; The agent interventions: 21; Others: 7.]

Fig. 5. Survey results showing the most interesting aspects of the CA activities according to the learners.

Most of the MOOC learners (61%) who participated in the research survey at the end of the course found the CA activities an interesting experience, while 14% of learners felt indifferent to these activities. Only a marginal 9% of learners felt negative, as they found the CA activities a barrier to completing the course (Fig. 6). Finally, 51 learners (16%) did not participate in any CA activity because of technical and organizational reasons. The previous results are supported by 2 questions addressed to the learners after each module with CA activities. The first question was about whether the CA activities had been beneficial for their learning of the module. Most learners (between 50% and 60%) somewhat or totally agreed, while between 25% and 35% of them were indifferent. Only a marginal part of learners (between 10% and 20%) somewhat or totally disagreed (see left side of Fig. 7). The second question asked learners about their satisfaction with the CA activities in terms of the discussions with their partners


[Fig. 6 chart: "My experience with performing CA activities was..." — ... an interesting experience: 190; ... not an impediment, but not an interesting experience: 42; ... rather an impediment to complete the course: 29; I did not perform any CA activity: 51.]

Fig. 6. Survey results showing the learners' experience with the CA activities.

mediated by an agent. Similar to the previous question, between 50% and 60% of learners somewhat or totally agreed (with more learners totally agreeing than somewhat agreeing), while between 25% and 35% of them were indifferent to this question. The rest of the learners (about 15% to 20%) somewhat or totally disagreed (see right side of Fig. 7).

[Fig. 7 charts: "Do you agree that the CA activity has been beneficial for your learning?" (left) and "Are you satisfied with the CA-mediated discussion with your partner mediated by the agent?" (right), with response counts per activity CA1–CA5.]

Fig. 7. Survey results on learning benefits (left) and discussion satisfaction (right) of each CA activity.

Finally, related to satisfaction with the CA activities, learners were asked about whether they would like to participate in CA-mediated collaborative activities in future MOOCs. The majority of them (62%) answered positively to this question while 19% felt indifferent and only 18% answered negatively. Some of the negative answers may


be partially justified by certain technical and organizational problems experienced during the CA activities, which impeded a number of learners from participating in and/or completing these activities. In particular, the difficulty of forming dyads at the same time to perform the chat-based activities, as well as technical network disconnections, were perceived as the most negative aspects by the learners. These issues can be alleviated by arranging fixed times to conduct the chat activities, as well as by improving the server-side throughput for better technical performance. Further solutions will be proposed in future experiences with CA activities in the same context of MOOCs.

5 Final Remarks and Conclusions
As part of the evaluation process of the CA activities performed during the MOOC, the feedback collected through open comments in the research surveys was also carefully analyzed. In general, the result of the analysis showed high levels of satisfaction regarding the design of the CA activities, the synchronous discussions between the learners in dyads and the agent interventions, while lower levels of satisfaction were observed when learners tried to find a partner to start a CA activity or suffered technical network disconnections, which overall turned out to be an important barrier to making progress and completing the CA activities. In addition, participants' satisfaction with the CA activities was evident in their comments, which highlighted key aspects, such as that agent interventions were useful for reflecting on the domain concepts and created opportunities for active learning and dynamic discussions. Another important aspect was the establishment of social connections between MOOC participants after the chat activity and the feeling of belonging to the learning community. This was evident in learners' responses in the surveys and in the MOOC forums, where posts with a social purpose appeared, such as greetings to the chat partner and feedback regarding the chat activity.

This paper has presented detailed knowledge from a real experience of using CAs to support synchronous collaborative learning activities in the context of a MOOC with massive participation. The main focus of the first part of the paper has been on the expectations of the CAs to promote transactivity and collaborative knowledge building while helping learners sustain a productive peer dialogue, which is the cornerstone of the colMOOC project. Then, methodological decisions on the CA activities of the MOOC have been made for a sound and effective evaluation, whose results have been described and discussed in the second part of the paper. Finally, following the evaluation agenda of the colMOOC project, the valuable feedback and data received from learners, along with the evaluation data analysis presented in this paper, will serve as a primary data source for preparing the next pilot experiences with improved versions of the CA integrated in real MOOCs and other forms of online education in order to support peer discussions and sustain productive dialogues.

Acknowledgements. This research was funded by the European Commission through the project "colMOOC: Integrating Conversational Agents and Learning Analytics in MOOCs" (588438-EPP-1-2017-1-EL-EPPKA2-KA).


References
1. Siemens, G.: Massive open online courses: innovation in education. In: Open Educational Resources: Innovation, Research and Practice (2013)
2. Daradoumis, T., Bassi, R., Xhafa, F., Caballé, S.: A review on massive e-learning (MOOC) design, delivery and assessment. In: Proceedings of the Eighth International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 208–213. IEEE Computer Society (2013)
3. Demetriadis, S., Karakostas, A., Tsiatsos, T., Caballé, S., Dimitriadis, Y., Weinberger, A., Papadopoulos, P.M., Palaigeorgiou, G., Tsimpanis, C., Hodges, M.: Towards integrating conversational agents and learning analytics in MOOCs. In: Proceedings of the 6th International Conference on Emerging Intelligent Data and Web Technologies. Lecture Notes on Data Engineering and Communications Technologies, vol. 17, pp. 1061–1072 (2018)
4. Demetriadis, S., Tegos, S., Psathas, G., Tsiatsos, T., Weinberger, A., Caballé, S., Dimitriadis, Y., Sánchez, G.E., Papadopoulos, M., Karakostas, A.: Conversational agents as group-teacher interaction mediators in MOOCs. In: Proceedings of 2018 Learning With MOOCS (LWMOOCS), pp. 43–46. IEEE Education Society (2018)
5. Tegos, S., Demetriadis, S., Karakostas, A.: Promoting academically productive talk with conversational agent interventions in collaborative learning settings. Comput. Educ. 87, 309–325 (2015)
6. Bassi, R., Daradoumis, T., Xhafa, F., Caballé, S., Sula, A.: Software agents in large scale open e-learning: a critical component for the future of massive online courses (MOOCs). In: Proceedings of the Sixth IEEE International Conference on Intelligent Networking and Collaborative Systems, pp. 184–188. IEEE Computer Society (2014)
7. Barak, M., Watted, A., Haick, H.: Motivation to learn in massive open online courses: examining aspects of language and social engagement. Comput. Educ. 94, 49–60 (2016)
8. Schuwer, R., Jaurena, I.G., Aydin, C.H., Costello, E., Dalsgaard, C., Brown, M., Teixeira, A.: Opportunities and threats of the MOOC movement for higher education: the European perspective. Int. Rev. Res. Open Distrib. Learn. 16(6), 20–38 (2015)
9. Kumar, R., Rose, C.P.: Architecture for building conversational agents that support collaborative learning. IEEE Trans. Learn. Technol. 4(1), 21–34 (2011)
10. Tegos, S., Demetriadis, S., Tsiatsos, T.: A configurable conversational agent to trigger students' productive dialogue: a pilot study in the CALL domain. Int. J. Artif. Intell. Educ. 24(1), 62–91 (2013). https://doi.org/10.1007/s40593-013-0007-3
11. Tegos, S., Demetriadis, S.: Conversational agents improve peer learning through building on prior knowledge. Educ. Technol. Soc. 20(1), 99–111 (2017)
12. Michaels, S., O'Connor, M.C., Hall, M.W., Resnick, L.B.: Accountable talk sourcebook: for classroom conversation that works (v.3.1). University of Pittsburgh Institute for Learning (2010). Retrieved March 1, 2017, from https://www.ortingschools.org/cms/lib/WA01919463/Centricity/domain/326/purpose/research/accountable%20sourcebook.pdf
13. Noroozi, O., Weinberger, A., Biemans, H.J.A., Mulder, M., Chizari, M.: Facilitating argumentative knowledge construction through a transactive discussion script in CSCL. Comput. Educ. 61(2), 59–76 (2013)
14. Caballé, S.: A computer science methodology for online education research. Int. J. Eng. Educ. 35(2), 548–562 (2019)
15. Tegos, S., Demetriadis, S., Psathas, G., Tsiatsos, T.: A configurable agent to advance peers' productive dialogue in MOOCs. In: Proceedings of CONVERSATIONS 2019. Lecture Notes in Computer Science, vol. 11970, pp. 245–259 (2019)
16. Reich, J., Ruipérez-Valiente, J.A.: The MOOC pivot. Science 363(6423), 130–131 (2019)

Detection of Student Engagement in e-Learning Systems Based on Semantic Analysis and Machine Learning

Daniele Toti (Catholic University of the Sacred Heart, Brescia, Italy, [email protected]), Nicola Capuano (University of Basilicata, Potenza, Italy, [email protected]), Fernanda Campos, Mario Dantas, Felipe Neves (Federal University of Juiz de Fora, Juiz de Fora, Brazil, [email protected], {mario.dantas,felipe.neves.braz}@ice.ufjf.br), and Santi Caballé (Open University of Catalonia, Barcelona, Spain, [email protected])

Abstract. This research presents a comprehensive methodological approach to detect and analyze student engagement within the context of online education. It is supported by e-learning systems, and is based on a combination of semantic analysis, applied to the students' posts and comments, with a machine learning-based classification, performed upon a range of data derived from the students' usage of the online courses themselves. This is meant to provide teachers and students with information related to the relevant aspects making up the students' engagement, such as sentiment, urgency and confusion within a given course, as well as the probability for students to keep their involvement in or to drop out from the courses altogether.

1 Introduction

Education has changed from a knowledge-transfer model to an active, collaborative, self-directed model under the disruptive influence of technology [1]. Learning and Social Media Technologies have influenced many aspects of education, from the teacher's role to student engagement, from innovation to student assessment, from personalized and unique interaction to security and privacy concerns. The behavioral characteristics of students, related to their experience during the educational process, are an important feature for predicting their performance. In the context of assisted education, the greatest challenge is not only sending students recommendations and academic topics, but also predicting learning issues and sending notifications to teachers, administrators, students, and families [2]. Dropping out of school comes from a long-term process of disengagement from school and classes, and it has profound social and economic consequences for the students, their families, and communities [3]. Behavioral, cognitive, and demographic


factors may be associated with early school drop-out. According to [4], the "student drop-out is a genuine concern in private and public institutions in higher education because of its negative impact on the well-being of students and the community in general". Being able to predict this behavior earlier could improve the students' performance, as well as minimize their failures and disengagement.

The lack of academic engagement is especially problematic in massive online education, such as the popular Massive Open Online Courses (MOOCs) currently provided by a wide number of organizations, including top-level institutions all around the world, with an extremely varied offer and millions of students enrolled [5]. The scale they have reached is nonetheless a double-edged sword: compared with the huge number of students they involve, the corresponding number of teachers is much more limited. As a consequence, the supervision and involvement of the latter in the students' progress and engagement is physiologically lacking, often ending up as a waterfall, mono-directional process of knowledge transfer from teachers to students [6].

To somehow alleviate these problems, advanced Learning Management Systems (LMS) and e-learning systems in general register and monitor the students' progress and interactions with the courses, as well as provide specific discussion forums where students may talk about their experience and share opinions with their peers. However, left by themselves and without effective ways to harness the invaluable information behind them, these tools may prove insufficient to bridge the gap between teachers and students in MOOCs and in online higher education in general [7]. Indeed, each student has their own way of receiving and analyzing information, and it is also important to know their learning styles to understand the teaching methods and techniques that work best in each context. Learning styles are mechanisms used to comply with the students' preferences, as they refer to tastes or choices regarding the reception and processing of information [8].

As a matter of fact, given the sheer amount of potential information scattered across those platforms, the employment of information extraction and Knowledge Discovery (KD) techniques and solutions could nowadays be the key to automatically derive such information and provide teachers with a collected and cohesive representation of the students' engagement as a whole in their courses, as well as the probability for them to either keep being involved or drop out from the courses altogether (REF). In addition, for all available educational data related to courses, classes, students, resource usage, and interaction, different data mining and Machine Learning (ML) techniques can be used to extract useful knowledge that helps improve e-learning systems [9]. Finally, Recommender Systems (RSs) are a solution for automatically integrating data with predictive models and intelligent techniques. These systems can be defined as any system that produces individualized recommendations or that has the effect of guiding the user in a personalized way to relevant or useful objects from among the several possible options [10]. Extracting the student's profile and context is a fundamental step in educational RSs, as is the representation of a student's knowledge, progress in a class, preferred media, and additional information.
The learning goals, motivation, beliefs, and personal characteristics are often associated with the educational context, as well as with technological resources. For this purpose, RSs perform the filtering of information, analyzing the user’s profile and interests, in order to later recommend content and actions [2].


In this regard, this work proposes a methodology to carry out a comprehensive detection and analysis of the students' engagement within the context of online higher education and the e-learning platforms revolving around it. This methodology combines Semantic Analysis (SA) and Natural Language Processing (NLP) approaches, applied to the Text Categorization (TC) of the students' forum posts and assignments, with ML-based classification techniques performed on a range of data derived from the students' performance and behavior, as well as their usage of and interaction with the courses themselves. This builds on the approaches earlier defined in [7] and [2] to bring about a comprehensive and integrated methodology for such a data analysis and for the presentation of the knowledge extracted.

This paper is structured as follows. In the next section, the related work is summarized and the paper is contextualized in the relevant literature. In Sect. 3, the proposed approach to SA and NLP, as well as the CONCEPTUM framework [9] for the automatic TC of student forum posts, is presented, while Sect. 4 details the use of RSs and of classification models and tools based on ML for drop-out prediction. Eventually, Sect. 5 presents a methodological approach for the effective integration of information from the previous approaches for engagement detection. Section 6 concludes the paper by summarizing the main ideas and outlining potential applications of the proposed methodological approach as a future direction of research.

2 Related Work

The application of information extraction and NLP techniques upon unstructured, user-driven texts to derive knowledge, sentiment and opinion is indeed not novel and has been carried out, per se, both in the literature and for commercial applications over the course of at least the last ten years. Different approaches to the automatic categorization of student forum posts have been proposed by different researchers. Such techniques generally work in three phases. In the first phase, they learn word embeddings from the textual corpus. In the second phase, word embeddings are combined to produce document (post) representations. In the third phase, document representations are classified according to specific dimensions like urgency, sentiment, confusion, intents, topics, etc.

In [7], a TC tool able to analyze and classify online forum posts over several dimensions has been proposed. The method relies on an attention-based hierarchical recurrent neural network, where a bidirectional recurrent network is used for word encoding; then, a local attention mechanism is applied to extract the post's words that are important to the meaning of each sentence and aggregate them in a single vector. The same process is applied at a global level to aggregate the sentence vectors. Finally, a softmax function is used for normalizing the classifier output into a probability distribution. The developed tool reaches high accuracy.

With respect to drop-out prevention, several approaches have focused on identifying students with a high risk of drop-out. The authors of [11] and [2] propose an educational recommender system for e-learning assistance aimed at preventing student drop-out through a predictive model. They use classic ML techniques in ensemble form and evaluate the proposal, aiming to ensure a higher level of certainty and highlighting the importance of notifying teachers and students of the prediction results.


In order to solve a complex and hierarchical text classification problem, novel architectures and solutions like [12, 13] can be considered. Alternatively, Deep Learning (DL) approaches can be used to capture the semantics of the text and to understand the sentiment expressed by the students through e-mails stating their will to give up their studies or through feedback on a course. Recurrent Neural Networks and LSTMs [14] are the neural networks most used for this purpose, relying on embedding layers and sequence models. In sum, ML and RSs can work together to solve these problems in the most cohesive way possible: in a first step, critical students are identified and selected using ML and DL methods, and then RSs can be used to address their academic problems. Based on these approaches, this paper proposes a comprehensive and integrated architecture as a methodology for the detection of student engagement in e-learning systems based on semantics and machine learning.

3 Analysis of Texts from Student Forums

Given a post from a student forum, the purpose is ideally to categorize it according to its content, along a number of expressive dimensions or attributes that could help teachers understand and measure the overall student engagement in their corresponding courses. The major attributes deemed relevant for this work are the following [9]:

• Type: the macro-category of the post itself among a set of two, namely question and statement. Additional information could be related, for instance, to the specific type of question asked (yes/no question, quantitative, qualitative, temporal, etc.).
• Concepts: the core (educational) concepts mentioned in the post. These could be narrowed down to a number of high-level topics (e.g. humanities, natural and applied sciences, education) or kept at a finer grain by considering all of the semantically-relevant concepts mentioned in the post.
• Sentiment: the affective polarity of the post, which basically falls into three possible values, namely positive, negative and neutral.

Two additional, somewhat trickier attributes can be taken into account as well [7]:

• Urgency: i.e. how urgently a reply to the post is requested (high, normal, low).
• Confusion: the level of confusion expressed within the post.

As a whole, the approach proposed to categorize the students' posts according to these attributes is heavily based on the use and verticalization of an API framework, CONCEPTUM [9], which combines an NLP pipeline relying upon state-of-the-art ML classifiers trained on language-specific models with custom algorithms and rules dependent on the language family (Neo-Latin, Anglo-Saxon, etc.). The following subsections detail how the above-mentioned attributes are extracted and detected.


3.1 Type

The type of the post is derived by the application of one of CONCEPTUM's secondary modules, AQUEOS [15], which is able to understand whether a given sentence contains a question or not and, in the former case, what type of question is asked from a range of question categories (who, what/which, how much/how many, when). In the event that a question is found but does not fall into any of such categories, the question is deemed to be a yes/no question, whereas if no question is found within the post, the post is considered a statement.

3.2 Concepts

The physically-present, semantically-relevant concepts, i.e. nouns and their corresponding aggregations with adjectives and specifiers, are extracted from the post via CONCEPTUM's main pipeline. In order to classify the post according to three established macro-categories (humanities, natural and applied sciences, education), three corresponding sub-branches of the Eurovoc taxonomy [16] are used as a reference, that is to say "3206 education" (under "32 EDUCATION AND COMMUNICATIONS"), "3606 natural and applied sciences" and "3611 humanities" (both under "36 SCIENCE"). The elements of these sub-branches, down to their leaves, are matched (with a certain degree of fuzziness given by the application of syntactical distances like Jaccard/Tanimoto, Levenshtein and Jaro/Winkler) with a WordNet-based synonym expansion of the concepts extracted from the text. Each match is assigned a normalized score between 0 and 1; the sum of all the scores of the matches between concepts and elements from a given sub-branch makes up the total score for the sub-branch as a whole, normalized accordingly based on the number of matches. Therefore, the total score for a sub-branch represents how likely it is that the given post falls into the root category of the branch itself, and can be classified as such [17–19].

3.3 Sentiment

In order to detect the overall sentiment of a post, it is possible to consider the following elements:

• the concepts extracted from the text as described in the previous subsection,
• the core verbs (excluding auxiliaries and the like) of the sentences,
• the modifiers (adjectives and adverbs),

with the last two detected by CONCEPTUM's main pipeline and stored separately. A polarized lexicon for the English language like SentiWordNet [20] is then used to check the polarization of the considered elements in terms of a score ranging from −1 (negative) to 1 (positive), with values around 0 deemed neutral. The weighted sum of the polarization of the textual elements (where the weight is assigned in decreasing value to modifiers, concepts and verbs, respectively), normalized accordingly based on the number of elements, produces an overall polarization score for the entire post, and thus allows for establishing the most likely sentiment associated with it.
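To make the aggregation concrete, the following is a minimal sketch of the weighted polarity computation described above. The weight values and the neutrality threshold are illustrative assumptions, not the actual parameters used by CONCEPTUM.

```python
# Minimal sketch of the weighted sentiment aggregation described above.
# Weights and neutrality band are illustrative assumptions.
WEIGHTS = {"modifier": 3.0, "concept": 2.0, "verb": 1.0}
NEUTRAL_BAND = 0.1  # scores in (-0.1, 0.1) are treated as neutral

def post_sentiment(elements):
    """elements: list of (kind, polarity) pairs, where kind is 'modifier',
    'concept' or 'verb' and polarity is a lexicon score in [-1, 1]
    (e.g. derived from SentiWordNet)."""
    if not elements:
        return "neutral", 0.0
    weighted = sum(WEIGHTS[kind] * score for kind, score in elements)
    total_weight = sum(WEIGHTS[kind] for kind, _ in elements)
    overall = weighted / total_weight  # normalised back to [-1, 1]
    if overall > NEUTRAL_BAND:
        return "positive", overall
    if overall < -NEUTRAL_BAND:
        return "negative", overall
    return "neutral", overall

# Example: a strongly positive modifier outweighs a mildly negative verb.
print(post_sentiment([("modifier", 0.75), ("concept", 0.1), ("verb", -0.2)]))
```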

3.4 Urgency and Confusion

Instructors are often unable to effectively moderate forums, especially in massive environments where the student-to-teacher ratio is very high. This often prevents students from receiving the support they need, thus promoting low academic satisfaction and high drop-out. Detecting the urgency of a post on the forum means deducing, from the analysis of the text of the post, how urgent a reply to it is. The extracted information is important for instructors and allows them to plan their interventions carefully.

With respect to confusion, some studies have pointed out that it has a complex influence on learning and engagement. The confusion experienced during learning processes is not always associated with negative results. Under certain circumstances, a prompt response to confusion (for example, the support of an instructor) can lead to a beneficial effect. On the other hand, students who experience confusion may find it difficult to get involved in a course and may eventually drop out.

To detect such important parameters from the text of the posts, rather than using lexicons (the development of annotated lexicons for these tasks is still in an embryonic state), it is more useful to resort to approaches based on machine learning. The approach proposed in this work, in continuity with [7], treats the detection of confusion and urgency as classification tasks. In an initial pre-processing phase, the text composing a forum post is divided into a sequence of tokens (words), each transformed into a vector that projects it into a continuous space. Then, the vectors representing the post's words are combined to obtain a single vector representing the whole post, and a class is associated with that vector. For the execution of this last step, the Hierarchical Attention Network proposed in [21] is adopted.

To sum up, by following both the CONCEPTUM framework [9] and the multi-attribute TC tool [7] based on state-of-the-art natural language understanding methods and tools, several attributes of online forum posts can be detected, including intent (the aim of the post), topics (the main learning-domain concepts the post is about), sentiment (the affective polarity of the post), confusion (the level of confusion expressed by the post) and urgency (how urgently a reply to the post is requested). This automatic post categorization can be used to generate more targeted and timely interventions for students, thus improving their overall learning effectiveness and performance.
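The classification step can be sketched as follows. This is a deliberately simplified stand-in that replaces the word embeddings and the Hierarchical Attention Network of [21] with TF-IDF vectors and a linear classifier; the tiny training set and its urgency labels are invented placeholders.

```python
# Simplified stand-in for the urgency classifier: TF-IDF vectors and a
# linear model replace word embeddings and the Hierarchical Attention
# Network used in the actual approach. Training data is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

posts = [
    "I cannot submit the assignment and the deadline is tonight, please help!",
    "The lecture on graph theory was really clear, thank you.",
    "Could someone eventually share extra reading material on recursion?",
    "My grade is missing and exams close tomorrow, this is urgent.",
]
urgency = ["high", "low", "normal", "high"]  # placeholder annotations

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(posts, urgency)  # learn a mapping from post text to urgency label

print(clf.predict(["The platform crashed during my timed quiz, I need help now"]))
```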

4 Drop-out Prediction

Dropping out of school is a widespread phenomenon in all universities. Drop-out occurs when a student enrolled at an educational institution decides to voluntarily abandon the course. Student drop-out arises from academic and non-academic factors. In [4], the authors highlight some of these factors: academic achievement, institutional habitus, social interaction, financial constraints, motivation, and personality. School drop-out is a serious problem for students, society, and policymakers, as early desertion causes monetary losses as well as social costs, while solving this problem means giving students the chance to obtain a degree and successfully finish the program they have started [2]. However, finding a solution to this issue is a complex challenge, not only for academic institutions, but also for machine learning researchers. Within the e-learning


domain, when considering two students from two different programs, the same features may have different weights. For example, Student A may prefer a written exam for "mathematics", whereas Student B may prefer an oral test for "modern history". In this case, the preferred type of exam is strongly linked to the specific academic program. In machine learning approaches, this kind of problem is generally tackled and solved through a binary classification task where the two labels may be, for this domain, "drop-out" and "graduation" [22]. To solve this problem, commonly used predictive models include Random Forest, Logistic Regression, Neural Networks such as the Multi-Layer Perceptron, Support Vector Machines and Decision Trees.

This study focuses on combining the benefits of the distance education context with the power of RSs, aiming to predict early student drop-out. This problem can be seen from two different perspectives. According to the first (and most common) view, RSs are used to solve problems for single students; thus, for a given student, this approach is applied to offer single, specific solutions. From another perspective, it is possible to use a corpus of student profiles to build a classification model able to provide teachers with suggestions related to potential improvements, in order to satisfy the greatest number of students. Therefore, this challenge is faced as a two-fold problem:

1. A recommender system problem.
2. A classification model problem.

Theoretically, the union of these two approaches could bring benefits for both parties involved. For this purpose, this section presents an overview of the main concepts and definitions involved in the learning approach used in this research to avoid school drop-out in distance education.

4.1 Recommender Systems

An RS can be any system that produces individualized recommendations as an output or has the effect of guiding the user in a personalized way to interesting or useful objects in a large space of possible options [11]. To this end, a student profile is required, using statistical and behavioral features to understand the best solution to recommend in order to avoid drop-out. The student's profile also characterizes the learning style, as it identifies how the student interacts with the e-learning system as well as his/her preferences. It makes it possible for the adaptive system to provide the student with relevant content [10]. Therefore, RSs are intended to recommend content to users based on their profiles and context, considering their preferences, needs, and interests. Recommendation techniques are used to help characterize the user's profile and context, allocate users into groups with similar needs, locate resources that meet the user's needs and design a strategy to recommend most effectively [8]. Within these approaches, three steps are usually required:

1. Extraction and data preprocessing, generally used to transform unstructured and semi-structured data into a structured form.
2. Data filtering, useful to represent items in a typically multidimensional space.


3. Model and recommendation, where the vector profiles are used to find the best solution for the input items.

This process is continuous: when a recommendation succeeds, i.e. it is correct for the user, the positive result is used by the system to refine the user's preferences and carry out further recommendations. There are different perspectives or filtering types to recommend a resource in RSs, such as content-based filtering, collaborative filtering, and hybrid approaches. According to [23], there are two main model categories: memory-based and model-based. Memory-based methods rely on similarity measures such as K-nearest neighbors, Euclidean distance, Pearson correlation, and cosine similarity (as sketched below), while model-based methods include Bayesian classifiers, neural networks, fuzzy systems, genetic algorithms, decision trees and deep learning algorithms, among others.
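As an illustration of the memory-based flavor, the following hedged sketch computes cosine similarity between student profile vectors to find the most similar peers, whose well-rated resources could then be recommended. The profile features and values are invented for the example.

```python
# Hedged sketch of a memory-based approach: cosine similarity between
# student profile vectors ranks the most similar peers. Profiles and
# feature meanings are invented placeholders.
import numpy as np

# Rows: students; columns: e.g. forum activity, quiz score, video hours.
profiles = np.array([
    [0.9, 0.7, 0.2],   # student 0
    [0.1, 0.4, 0.8],   # student 1
    [0.8, 0.6, 0.3],   # student 2
])

def rank_peers(target, profiles):
    """Indices of students ranked by cosine similarity to the target
    (the target itself is pushed to the end of the ranking)."""
    norms = np.linalg.norm(profiles, axis=1) * np.linalg.norm(profiles[target])
    sims = profiles @ profiles[target] / norms  # cosine similarities
    sims[target] = -1.0                         # exclude the student themselves
    return np.argsort(sims)[::-1]

print(rank_peers(0, profiles))  # student 2 is the closest peer to student 0
```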

4.2 Classification Models

The adoption of artificial intelligence predictive models favors the classification and grouping contexts that may be useful to model-based methods, as the use of ML approaches for classification problems is widely explored in the literature. The following are the classic ML models found in the literature that are most suited to the supervised classification problem [2]:

• Decision Tree (DT): an abstract structure characterized by a tree, where each node denotes a test on an attribute value, each branch represents an outcome of the test, and tree leaves represent classes or class distributions.
• K-Nearest Neighbors (KNN): a classical, lazy-learner model. KNN classifiers are based on learning by analogy: they measure the distance between a given test tuple and the training tuples to assess how similar they are.
• Support Vector Machine (SVM): by using kernels, this model transforms the original data into a higher dimension, where it can search for and find an optimal linear separating hyperplane using training tuples called support vectors.
• Random Forest (RF): a collection of tree-structured classifiers (decision trees) that fits many classifying decision trees to various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.
• Logistic Regression (LR): a generalized linear approach that models the probability of some event occurring as a linear function of a set of predictor variables.
• Multi-Layer Perceptron (MLP): a neural network that passes the data through layers of neurons and can learn a nonlinear function approximator for either classification or regression.

However, there are many challenges in predictive approaches based on classification models. Identifying the semantic context when predicting an individual's preferences and needs is one of them, as it is very difficult to assertively identify the semantic meaning of such needs and preferences. Achieving greater precision through increased accuracy is another challenge related to prediction models. Results can be influenced by the volume of data: old data generates less accurate results and, when


the amount of data is too large, distortions and missing values can influence the predictions [2]. In addition, the data generated by any LMS for online courses, such as MOOCs, contain rich information about the student's learning style, such as the sequence of interactions between the students and the LMS, as well as about the student's behavior and the choices made during the learning process [7]. Finally, the LMS can automatically notify teachers and students about what is going on during the online course. Students can receive personalized notifications from the system (e.g. messages and video) and also from the teachers, based on their learning style, which overall may alleviate their disengagement and motivate them to stay involved in the course [2].
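As a hedged illustration of the binary drop-out/graduation task described above, the sketch below trains one of the listed models (a Random Forest) on synthetic data; the feature set and the labeling rule are invented placeholders, not the features actually used in this work.

```python
# Hedged sketch of the binary drop-out/graduation classification task
# with a Random Forest. Features, labels and thresholds are synthetic
# placeholders for illustration only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Columns: completed-task ratio, average grade, forum posts, days inactive.
X = rng.random((200, 4))
y = (X[:, 0] + X[:, 1] - 0.05 * X[:, 3] < 0.8).astype(int)  # 1 = drop-out

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_tr, y_tr)

print("held-out accuracy:", model.score(X_te, y_te))
print("drop-out probability:", model.predict_proba(X_te[:1])[0, 1])
```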

5 A Methodology to Detect Student Engagement

By leveraging the considerations and the background previously discussed, a novel methodological approach is proposed in this section. The proposed methodology is based on the DPE-PRIOR architecture to predict student drop-out [2], which includes ML-based classification models as the core of an RS, as explained in Sect. 4. An extension of the DPE-PRIOR architecture is provided via the inclusion of the TC approach based on SA, NLP techniques and the CONCEPTUM framework [9] described in Sect. 3 for the effective automatic categorization of text, such as student forum posts and course assignments [7].

5.1 Goals and Overview

The aim of the proposed methodology is to detect student engagement in LMSs based on SA, NLP and ML approaches. The objective is to provide both students and instructors with timely notifications about the learning process, in order to improve it as a whole. The methodology depends on the dataset of the learning context, and thus can be used in different learning domains, requiring only a configuration compatible with the class data model, such as student and class information like schedule, total number of tasks, assessment, grades, and so forth. Following the DPE-PRIOR architecture, the proposed methodology is made up of the following sequential phases (see Fig. 1 and [2]):

• Extraction: this is the first and fundamental phase of the proposed methodology, meant to collect all of the information about students and courses by using internal and external data. As such, this layer is responsible for extracting the information needed to compose a student's activity profile, performance and behavior.
• Filter: based on the informative goals set, this phase filters all the information collected for a specific purpose. For instance, in order to detect student engagement, it is necessary to derive the student's performance throughout the course by filtering his/her activities by a limit date pre-established by the teacher.
• Model: this phase combines different ML models, giving a more accurate result with higher confidence. Each model is intended as an autonomous service capable of dealing with requests and predicting the student's drop-out probability.


Fig. 1. DPE-PRIOR architecture [2].

• Notification: as a result of the previous phases, a notification is sent in this phase to the teacher responsible for the course, who can then intervene to prevent the student from dropping out. Messages are also sent to the students as a motivation to avoid their disengagement.
• Training: this final phase trains the ML models using data from past courses that share the same structure for measuring student performance.

5.2 Discussion

In the educational context, textual information such as student forum posts, e-mails and written assignments is important for understanding the students' activity and their performance during a course. Similarly to the information collected for the students' profile, performance and behavior, this textual information is also periodically extracted during the Extraction phase of the proposed approach and passed on to the Filter phase. In the Model phase, this information is analyzed by the SA and NLP techniques explained in Sect. 3 in order to detect the relevant aspects making up the student's engagement, such as sentiment, confusion, etc.

As the Extraction phase can be implicit or explicit, it is possible to capture the student's learning style as follows. Explicit data can be captured through a form in the LMS. For the implicit profile, data provenance allows capturing the historical documentation of learning objects, both through their metadata and through the student's trail, capturing the sequence followed in a study session and using ontological rules to infer the features. As educational metadata, among others, the level and type of interactivity and the type of learning object can be used (e.g. Simulation, Exercise, Problem Solving, Text, Video, Audio, Image, etc.). Eventually, the student's preferences define the learning style, which drives the personalized media messages sent in the Notification phase.

In the Model phase, different ML models are defined to compose the main autonomous services. All the models are supervised, since the issue is a classification problem, which consists of indicating whether a student will drop out of the class or not.


The averaging process takes into account only the class predictions related to the notification trigger, i.e. the classification that is intended to be notified when, for a given set of student characteristics, the result indicates student failure.
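A minimal sketch of this averaging step follows; the decision threshold is an illustrative assumption, not a value prescribed by the DPE-PRIOR architecture.

```python
# Sketch of the averaging step: each autonomous model service returns a
# drop-out probability, the ensemble averages them, and a notification
# fires when the average crosses a threshold (illustrative value).
THRESHOLD = 0.5

def should_notify(dropout_probs):
    """dropout_probs: drop-out probabilities from the individual models."""
    avg = sum(dropout_probs) / len(dropout_probs)
    return avg >= THRESHOLD, avg

# e.g. predictions from a decision tree, a random forest and an MLP
notify, avg = should_notify([0.62, 0.55, 0.48])
if notify:
    print(f"notify teacher: predicted drop-out risk {avg:.0%}")
```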

6 Conclusions and Future Work

In this paper, a joint view from different research efforts in Learning Engineering has been presented, with the aim of alleviating the current issues and challenges found in online education in terms of student disengagement, lack of performance and early drop-out. To this end, the approach proposes a comprehensive methodology based on models, techniques and tools from the field of AI in Education, including SA and NLP applied to the TC of student forum posts, comments and class assignments, along with ML-based classification models and RSs to predict student abandonment.

The combination of these data approaches provides a comprehensive view of student engagement and commitment, especially in the context of massive online education, such as MOOCs, with relevant potential applications. Indeed, the knowledge extracted by applying the proposed data-intensive methodological approach can be displayed graphically on an online course dashboard to support teacher decision making, making it easy to monitor online classes and find out in real time whether there may be a range of difficulties for students in a given topic, or negative student opinions regarding a certain aspect of the course, thus providing a way to detect and correct situations of risk. The same information can be used as input for conversational pedagogical agents in order to automatically involve students in guided and constructive interactions in natural language within agent-mediated discussions. The goal of these agents would be to promote helpful and constructive discussions between peers, including argumentation and mutual clarifications and explanations [24]. These discussions, based on the extracted information, could be initiated and encouraged by the actions of informed agents, improving the effectiveness and timeliness of the interventions (see [7] for a full application case scenario). Future work will be guided by this direction of applied research of the methodological approach presented in this paper.

Acknowledgements. This work has been supported by both the project colMOOC "Integrating Conversational Agents and Learning Analytics in MOOCs", co-funded by the European Commission within the Erasmus+ program (ref. 588438-EPP-1-2017-1-EL-EPPKA2-KA), and the CNPq (National Center for Scientific and Technological Development) of the Brazilian Government.

References

1. Bagheri, M., Movahed, S.H.: The effect of the Internet of Things (IoT) on education business model. In: Proceedings - SITIS 2016, pp. 435–441 (2016)
2. Neves, F., Campos, F., Ströele, V., Dantas, M., David, J.M., Braga, R.: Assisted education: using predictive model to avoid school dropout in e-learning systems. In: Intelligent Systems and Learning Data Analytics in Online Education (2020, in press)
3. Márquez-Vera, C., et al.: Early dropout prediction using data mining: a case study with high school students. Expert Syst. 33, 107–124 (2016)
4. Olaya, D., Vásquez, J., Maldonado, S., Miranda, J., Verbeke, W.: Uplift modeling for preventing student dropout in higher education. Decis. Support Syst. 134, 113320 (2020)
5. Siemens, G.: Massive open online courses: innovation in education. In: Open Educational Resources: Innovation, Research and Practice, pp. 5–15 (2013)
6. Daradoumis, T., Bassi, R., Xhafa, F., Caballé, S.: A review on massive e-learning (MOOC) design, delivery and assessment. In: 2013 Eighth International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 208–213 (2013)
7. Capuano, N., Caballé, S.: Multi-attribute categorization of MOOC forum posts and applications to conversational agents. In: Advances on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 505–514 (2020)
8. Ezzat Labib, A., et al.: On the way to learning style models integration: a learners' characteristics ontology. Comput. Hum. Behav. 73, 433–445 (2017)
9. Toti, D., Rinelli, M.: On the road to speed-reading and fast learning with CONCEPTUM. In: Proceedings - 2016 International Conference on Intelligent Networking and Collaborative Systems, IEEE INCoS 2016, pp. 357–361 (2016)
10. Burke, R.: Hybrid recommender systems: survey and experiments. User Model. User-Adap. Inter. 12(4), 331–370 (2002)
11. Pereira, C.K., Campos, F., Ströele, V., David, J.M.N., Braga, R.: BROAD-RSI - educational recommender system using social networks interactions and linked data. J. Internet Serv. Appl. 9(1), 7 (2018)
12. Ciapetti, A., Di Florio, R., Lomasto, L., Miscione, G., Ruggiero, G., Toti, D.: NETHIC: a system for automatic text classification using neural networks and hierarchical taxonomies. In: ICEIS 2019 - Proceedings of the 21st International Conference on Enterprise Information Systems, pp. 284–294 (2019)
13. Lomasto, L., Di Florio, R., Ciapetti, A., Miscione, G., Ruggiero, G., Toti, D.: An automatic text classification method based on hierarchical taxonomies, neural networks and document embedding: the NETHIC tool. In: Lecture Notes in Business Information Processing, vol. 378, pp. 57–77 (2020)
14. Murthy, G.S.N., Allu, S., Andhavarapu, B., Bagadi, M.: Text based sentiment analysis using LSTM. Int. J. Eng. Res. Technol. 9(5) (2020)
15. Toti, D.: AQUEOS: a system for question answering over semantic data. In: Proceedings - 2014 International Conference on Intelligent Networking and Collaborative Systems, IEEE INCoS 2014, pp. 716–719 (2014)
16. Eurovoc: EU's multilingual thesaurus. http://eurovoc.europa.eu/
17. Capuano, N., et al.: Ontology extraction from existing educational content to improve personalized e-learning experiences. In: ICSC 2009 - 2009 IEEE International Conference on Semantic Computing, pp. 577–582 (2009)
18. Arosio, G., Bagnara, G., Capuano, N., Fersini, E., Toti, D.: Ontology-driven data acquisition: intelligent support to legal ODR systems. Front. Artif. Intell. Appl. 259, 25–28 (2013)
19. Capuano, N., Longhi, A., Salerno, S., Toti, D.: Ontology-driven generation of training paths in the legal domain. Int. J. Emerg. Technol. Learn. 10(7), 14–22 (2015)
20. Baccianella, S., Esuli, A., Sebastiani, F.: SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In: Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010) (2010)
21. Yang, Z., Yang, D., et al.: Hierarchical attention networks for document classification. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1480–1489 (2016)
22. Lee, S., Chung, J.Y.: The machine learning-based dropout early warning system for improving the performance of dropout prediction. Appl. Sci. 9, 3093 (2019)
23. Bobadilla, J., Ortega, F., Hernando, A., Gutiérrez, A.: Recommender systems survey. Knowl. Based Syst. 46, 109–132 (2013)
24. Demetriadis, S., et al.: Conversational agents as group-teacher interaction mediators in MOOCs. In: Proceedings - Learning With MOOCS, LWMOOCS 2018, pp. 43–46 (2018)

Monitoring Airplanes Faults Through Business Intelligence Tools

Alessandra Amato (University of Napoli Federico II, Naples, Italy, [email protected]), Giovanni Cozzolino (DIETI, University of Napoli Federico II, via Claudio 21, Naples, Italy, [email protected]), Alessandro Maisto and Serena Pelosi (DISPC, University of Salerno, Via Giovanni Paolo II, 132, Fisciano, SA, Italy, {amaisto,spelosi}@unisa.it)

Abstract. In this work we propose a tool able to monitor aeroplane crashes, providing reports and views to managers in order to improve services for the safety of passengers on board. We exploited the functionalities of Business Intelligence data visualisation tools to design specific data processing phases, from pre-processing to cleaning and filtering and, finally, the visualisation of the reports.

1 Introduction

The abbreviation BI stands for Business Intelligence, one of the most requested skills for a Data Scientist in a company. Business Intelligence is a set of strategies, technologies and tools for analysing business information, widely used in corporations, big industries and financial entities to provide many insights within the company, such as benchmarking, predicting sales and so on. It is a compelling branch of Data Analysis. As is well known, within a company there is one (or more) manager: the person who makes the decisions [1] that guide the company in a given sector of the market or, for instance, makes choices that require investment in advertising [2]. All the choices made by managers are then evaluated against the goals they have set for the company: increasing profits, minimising losses and so on. To be able to make these choices, the manager needs to discover useful information. For this reason, he/she relies on Data Scientists, Data Engineers and Data Analysts, who have to work together to develop a method for offering him/her the best insights on the available data. Therefore, a BI process includes: (1) business performance management, (2) data mining, (3) managing data with Data Warehouses, (4) text mining, (5) predictive analysis, and many more. With the advent of big data (the new oil), business intelligence processes have become fundamental within the most important companies.

2 Data Visualization

One of the most important aspects of BI is the visualisation of data and insights. This stems from a very important principle: the manager is not required to understand the possibly complicated code the data scientist has written to present and analyse data. He/she may also have no idea about Python's Pandas library, nor about NLTK or TensorFlow. So a data scientist must be able to present data and results in a clear, fully understandable and precise way. The best way to do that is to use a Business Intelligence tool. There are literally plenty of BI tools available online. Some of them are free; many are licensed. Some offer free trials and others are entirely open-source. So it is quite easy to find the tool that best meets our requirements. Some of the most famous BI tools are listed below:

• Cognos Analytics (https://www.ibm.com/it-it/products/cognos-analytics)
• Microstrategy (https://www.microstrategy.com/it)
• SpagoBI (https://www.spagobi.org/)
• Power BI (https://powerbi.microsoft.com/it-it/)
• Business Intelligence from Google Cloud (https://cloud.google.com/looker)
• Qlik (https://www.qlik.com/it-it)

The choice fell on Microsoft's Power BI. A leader among BI tools, Power BI is licensed software that offers a 30-day free trial that can only be used with a company or educational account. It is composed of two different tools:

• Power BI site: it offers the possibility to share insights within the same company.
• Power BI Desktop: an application straightforward to use on local machines, which offers the possibility to publish works on the cloud.

So far, we haven't talked about how data should be presented. Beyond specific visual objects such as graphs, maps and pies, we now focus on the two ways of presenting data: reports and dashboards. A report is a powerful tool for presenting data. It appears as a huge blank page to which one can add charts, slicers, images, text boxes, titles and so on. Power BI's process of creating a report is divided into three main parts:

1. Connect the data source to Power BI
2. Work on relationships in case of an RDBMS
3. Create the model



Relationships are probably the most crucial part, since we want our data to be "queried in real-time", making our reports interactive. Reports can have multiple pages and represent the core of the whole BI process. Dashboards, instead, are a type of visualisation optimised for mobile devices. They can be generated easily from a report and represent the best way to present data in "non-static" conditions, unlike business meetings, conferences and conventions. One of the main advantages of Power BI dashboards is the ability to perform data operations, which are, as mentioned, real-time queries: among others, drill-down, slice and dice, and so on. This is done by simply clicking interactively on the dashboard. By clicking on a specific type of product, for example, we can get the sales data for that particular product. What it does is a conditional select, highlighting the lines in the dataset that match that specific product. The beauty is that all this happens in real-time, and it allows you to gain some first insights on the data [3,4].
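To illustrate the "conditional select" behind this interaction, here is a hedged pandas equivalent; the DataFrame and column names are invented placeholders, not part of Power BI.

```python
# Pandas equivalent of the dashboard's conditional select: clicking a
# product highlights the matching rows. Data and names are placeholders.
import pandas as pd

sales = pd.DataFrame({
    "product": ["A", "B", "A", "C"],
    "amount": [120, 80, 95, 40],
})

selected = sales[sales["product"] == "A"]  # rows "highlighted" by the click
print(selected)
```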

3 Case Study: Airplane Crashes Report

We imagine the situation in which we are hired by a certain airline. Our company has asked us to find out more about aeroplane crashes, intending to improve services for the safety of passengers on board. At this moment, we are not that concerned about what the manager is asking: we want to deploy a system through which he/she can ask for insights. The best way to do this is implementing a report by using charts, stats and data accurately. As we already said, we use Power BI. The first step is deciding where to get data. There are plenty of repositories online dealing with air accident data. We decide to choose a highly recommended and highly rated dataset from Kaggle (https://www.kaggle.com/), one of the most famous sites for Data Science: we pick the dataset at the link https://www.kaggle.com/saurograndi/airplane-crashes-since-1908 and we download it. It is a CSV (Comma Separated Values) file, but that is not a big deal: Power BI is compatible with an incredibly high number of data sources, such as Excel, CSV files, SQL databases and so on. When the source is correctly connected to Power BI, we are ready for the next step [5].

3.1 Preprocessing and Cleaning Data

After we get the data correctly, we should look deep into it. A full understanding of the data is essential to be able to give the best interpretation [6,7]. In our situation, the dataset was already really clean and "machine-friendly" [8,9]. Headers were placed in the correct positions and were properly recognised by Power BI. We had to do just two little things:

• An "ID"-like column was needed. In fact, for obtaining the total number of crashes, we had to create a so-called measure. A measure is a dynamic way to


create a column without storing data in the Power BI database. This is instrumental when the source is huge. Measures are designed through DAX (Data Analysis Expressions), i.e. formulas that take columns and operations as input and output the measure. So we begin by creating a new measure called Crashes, counting the number of rows:

Crashes = COUNT('Airplane Crashes and Fatalities Since 1908'[Date])   (1)

At this point, we have a measure that can be used in every single chart of our report.

• The Date column's type was read as "text". We had to switch it to the date/time type.

After that, all is set for loading data and creating the report.
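For readers who prefer scripting, the same preprocessing can be sketched outside Power BI with pandas. The column names (Date, Aboard, Fatalities) come from the Kaggle CSV, while the file name below is a placeholder for the downloaded file.

```python
# Hedged pandas equivalent of the Power BI preprocessing described above.
# Column names (Date, Aboard, Fatalities) match the Kaggle dataset; the
# file name is a placeholder for the downloaded CSV.
import pandas as pd

df = pd.read_csv("airplane_crashes_since_1908.csv")

df["Date"] = pd.to_datetime(df["Date"])    # switch from text to date/time
df.insert(0, "ID", range(1, len(df) + 1))  # the "ID"-like column

crashes = len(df)  # same count as the Crashes measure in (1)
death_pct = df["Fatalities"].sum() / df["Aboard"].sum()
print(f"Crashes: {crashes}, death percentage: {death_pct:.1%}")
```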

3.2 Creating the Report

To create the report, we use a technique known as Power Pattern. Basically, it states a set of rules for creating the best BI report. The main idea behind Power Pattern is that not all pixels are the same. In fact, the rules can be summarised as follows:

• At the top of the page, the attention of the reader is at the highest possible level, so it is important not to waste space with pictures, logos or too complex graphics and visuals. The idea is to put there the so-called High-Level Numbers: simple numbers which synthesise a quantity and are often of great impact. For instance, quantities such as Sales, Budget and Profit would fit perfectly.
• Going down through the page, we can easily add two kinds of visualisations:
  – Breakdowns → created with charts such as bars and pies.
  – Trends → made by using line charts; they often describe a feature of interest related to time or space quantities.
• Tables should be put at the right of the page because they sometimes need a deeper explanation.
• At the very end (e.g. at the bottom right), logos can be added.
• It is highly recommended to add a title to the report.

When writing a multiple-page report, a good habit is the concept of a repeating pattern: pages should be easy to read in the same key. So we start by adding the title of the report, which should mainly describe the dataset we are illustrating, in a clear and concise manner. Since we are dealing with airplane crashes in a precise period of time, we choose:

• Airplane Crashes as the title.
• From 1908 to 2019 as the subtitle.

Since our attention is focused on the phenomenon of airplane crashes, as well as the number of deaths and the percentage of survivors, we use the following high-level numbers:


• Number of Crashes (with the Crashes measure previously created)
• Number of Fatalities
• Number of people Aboard
• Percentage of Death in an airplane crash (created as a new measure)

To make the view sweeter to look at, we add a coloured box to accentuate the numbers. The result is shown in Fig. 1.

Fig. 1. High-level numbers of our report

3.3 Breakdown Visuals

We create two different clustered bar charts:

• One describing the Number of Crashes and Fatalities as a function of the Operator. This is, of course, useful to get to know the companies that have done better in terms of safety, and those that have had some major difficulties in containing accidents and deaths. In our report, the chart is shown in Fig. 2a.
• The other chart, in Fig. 2b, describes the same features as a function of the Type of the airplane. This graph is, if possible, even more interesting than the previous one. In this case, in fact, comparing different types of aircraft can allow our company to discover which details, technologies and implementations make an aircraft safer.


Fig. 2. (a) Breakdown chart for Operators, (b) Breakdown chart for Types

3.4 Table

Continuing in the middle of our report, we find a table. It has Route and Accidents as columns, i.e. how many accidents there have been on certain routes. In the first line of this table, we find a missing value for Route. This is because (as you can see by querying the report) until around 1930 the routes affected by accidents were not reported. Moreover, before starting to observe actual routes, we notice values showing that many accidents occurred in test phases (Fig. 3).

Fig. 3. The table of route and accidents

3.5 Trend Visuals

At the bottom of the report, we find views showing trends. In particular, we have:

• A line chart showing Fatalities as a function of time, using the Date variable (Fig. 4a). It can be very useful if you want to analyse how the number of deaths has changed over the years, perhaps correlating this information with external factors such as new legislation, the introduction of new technologies and so on.
• A map chart (Fig. 4b), showing the Location of each accident, where we have chosen to scale the size of the dot that identifies the point of the accident by the number of deaths in the Fatalities variable. This map does not seemingly carry a huge amount of information, but it allows you to analyse entire countries, regions and cities where most accidents occur. It could be useful for an airline to choose which airports to depart from or land at.

230

A. Amato et al.

Fatalities over time

The map of accidents’ locations sized by fatalities

Fig. 4. (a) Fatalities over time, (b) The map of accidents’ locations sized by fatalities

3.6

Summaries Section

Of course, one might be interested to know a small summary of what caused the accident. Specifically, our dataset has a column called Summary. This column actually represents a brief summary of what happened. In this column, you can extract very interesting information, such as whether there were errors of the machines or drivers, what weather conditions caused accidents and so on. Analysing the dataset we notice that these short “expert reports” have not been collected since ever, but only from around the 1930s. To do this, we create a second page where we add views to get the description of a number of incidents. Please note that all the operations you do on this second page affect and filter the data on the first page as well. This is not automatic, but it is done using a Power BI function that makes the filters sync between report pages. 3.7

Filters

In the report, we add four filters:

• Date → The date can be interesting because the company manager may want to know the incidents that occurred in certain years, comparing them with existing and new implementations, as well as with current regulations and much more. Using the visualisation called Date Hierarchy, it is also possible to identify the quarters, called qrt.
• Operator → The second filter concerns the operator. This allows our report to target a certain company with high safety standards and imitate its best features or, on the contrary, to analyse the most fragile companies in terms of accidents and deaths.
• Type → Again, the type of aircraft involved in an accident can be extremely interesting, perhaps more for deciding where to direct investments than for company policy. In fact, the report makes it possible to assess which types of aircraft suffer from safety problems, both in terms of the number of accidents and in terms of survival rates after an accident.
• Route → The route is the last filter we use in the report. It can be useful to distinguish between summaries and different incidents related, for example, to the same airline or the same type of aircraft.

3.8 Summaries

For summaries, we use a multiple-line card. This is nothing more than a card on which more than one item can be displayed; in other words, there is no need for aggregate operations like COUNT, SUM, MAX, MIN, AVG and so on. Initially, all summaries are shown. If one clicks on a filter, the query in the

Fig. 5. (a) Date filter. The level of detail of the filter can reach the day, (b) Operator filter, (c) Type filter, (d) Route filter


dataset in the Summary column returns all records that satisfy a certain condition, expressed by the filter itself (Figs. 5 and 6).

Fig. 6. Summaries filtered for Operator Aerosucre Colombia

4 Real-Time Sensors Data Stream

The application seen in the previous section concerns the display of data present in a dataset or database. These are in some ways "static" data, i.e. they remain the same (unless the source is refreshed). However, this is not the only case where Power BI dashboards and reports are an essential tool to evaluate and display data. Another major branch of BI applications is real-time data streaming, since Power BI has powerful tools for viewing data in real-time. It even offers the possibility to set a dashboard refresh time, to provide the user with a viewing experience that perfectly fits the context. There are many situations in which streaming data is essential. There are applications where you want, for example, to study user trends on social networks [10] and their reactions to certain political and non-political events [11,12]; or you need to continually monitor data streams from the motion of a spacecraft in orbit to ensure that operations are conducted in complete safety; or (as in the case of the airline in the previous section) you want to take advantage of sensors to measure quantities fundamental to the flight of an aircraft, such as temperature, pressure, altitude, humidity and so on. In this section, you will see one of the many ways to create a real-time data stream with Power BI, applied to a simulated dataset that reproduces the sensors in a room.

PubNub is one of the primary methods used to connect Power BI to a streaming data source. Of course, it is not the only method: there are APIs that take JSON code and connect real-time data sources to Microsoft's tool. PubNub is a paid service that offers a network for information and data exchange. Through this product, Power BI can receive real-time data from a source. The site can be visited at the address https://www.pubnub.com/ and offers APIs that allow realising various applications such as (1) chat, (2) geolocalisation and (3) IoT device control. As said, we will use a simulated dataset that reproduces the measurement values of (1) temperature, (2) humidity and (3) light radiation. These sensors are placed inside a room. In the case of real-time detection, we choose to make the dashboard simpler and less articulated than the previous


one. The reason is straightforward. When streaming data to Power BI, there are some rules similar to those concerning the distinction between measures and calculated columns. In fact, there are techniques that lead Power BI to use caches, which are filled with data and then emptied from time to time. In the case of PubNub, instead, the data flow is not stored in Power BI. This brings a considerable advantage in terms of size when the data is too big, but it also has the disadvantage that real-time querying, as seen in the previous section, is not possible. So we choose to build our dashboard using only two types of views, the Card and the Line Chart. The former allows us to know the values of the quantities measured at the current instant of time. We put the cards next to graphs that show the trend of these quantities as a function of time, thanks to which we can spot peaks and valleys and study the stability of these measurements according to our specific requests. Moreover, to improve the readability of the dashboard (which, we remind you, is one of the main aspects in the construction of a data display), we add the units of measurement to which the sensors are calibrated and specify that the maximum time window that the graphs can describe is equal to one minute. So, ultimately, we get a dashboard that updates in real-time, as shown in Fig. 7.

Fig. 7. Dashboard of sensors in a room
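To give a concrete idea of how such a stream can be fed, the sketch below publishes simulated room-sensor readings with the official PubNub Python SDK. It is a minimal, illustrative example: the keys ("demo_pub_key", "demo_sub_key") and the channel name "room-sensors" are hypothetical placeholders, not credentials used in this work.

import time
import random
from pubnub.pnconfiguration import PNConfiguration
from pubnub.pubnub import PubNub

config = PNConfiguration()
config.publish_key = "demo_pub_key"        # hypothetical key
config.subscribe_key = "demo_sub_key"      # hypothetical key
config.uuid = "room-sensor-simulator"
pubnub = PubNub(config)

while True:
    # Simulated readings for the three sensors placed in the room.
    reading = {
        "temperature": round(random.uniform(18.0, 26.0), 1),  # degrees Celsius
        "humidity": round(random.uniform(30.0, 60.0), 1),     # percent
        "light": round(random.uniform(200.0, 800.0), 1),      # lux
    }
    pubnub.publish().channel("room-sensors").message(reading).sync()
    time.sleep(1)  # one reading per second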

5 Conclusion

This research suggests a method capable of tracking airplane accidents, supplying supervisors with information and recommendations to enhance the aviation health systems on board. We used Business Intelligence data visualisation techniques to plan the different stages of data analysis, including pre-processing, cleaning


and filtering, and finally report visualisation. Future work foresees applying more sophisticated data filtering techniques to provide further details to the management team.

References

1. Amato, F., Cozzolino, G., Mazzeo, A., Romano, S.: Intelligent medical record management: a diagnosis support system. Int. J. High Perform. Comput. Networking 12(4), 391–399 (2018)
2. Amato, F., Cozzolino, G., Moscato, F., Moscato, V., Picariello, A., Sperlì, G.: Data mining in social network. In: International Conference on Intelligent Interactive Multimedia Systems and Services, pp. 53–63. Springer (2018)
3. Amato, A., Cozzolino, G., Moscato, V.: Big data analytics for traceability in food supply chain. In: Workshops of the International Conference on Advanced Information Networking and Applications, pp. 880–884. Springer (2019)
4. Amato, A., Balzano, W., Cozzolino, G., Moscato, F.: Analysis of consumers perceptions of food safety risk in social networks. In: International Conference on Advanced Information Networking and Applications, pp. 1217–1227. Springer (2019)
5. Huber, E.: Monitoring airplanes faults through business intelligence tools
6. Amato, F., Cozzolino, G., Mazzeo, A., Mazzocca, N.: Correlation of digital evidences in forensic investigation through semantic technologies. In: 2017 31st International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 668–673. IEEE (2017)
7. Cozzolino, G.: Using semantic tools to represent data extracted from mobile devices. In: 2018 IEEE International Conference on Information Reuse and Integration (IRI), pp. 530–536. IEEE (2018)
8. Canonico, R., Cozzolino, G., Ferraro, A., Moscato, V., Picariello, A., Sorrentino, F.R., Sperlì, G.: A smart chatbot for specialist domains. In: Workshops of the International Conference on Advanced Information Networking and Applications, pp. 1003–1010. Springer (2020)
9. Amato, A., Cozzolino, G.: Trust analysis for information concerning food-related risks. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 344–354. Springer (2019)
10. Castiglione, A., Cozzolino, G., Moscato, V., Sperlì, G.: Analysis of community in social networks based on game theory. In: 2019 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), pp. 619–626. IEEE (2019)
11. Amato, F., Cozzolino, G., Moscato, F., Xhafa, F.: Semantic analysis of social data streams. In: International Conference on Intelligent Networking and Collaborative Systems, pp. 59–70. Springer (2018)
12. Amato, A., Cozzolino, G., Giacalone, M.: Opinion mining in consumers food choice and quality perception. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 310–317. Springer (2019)

Artificial Intelligent ChatBot for Food Related Question

Alessandra Amato1, Giovanni Cozzolino2(B), and Antonino Ferraro2

1 University of Napoli Federico II, Naples, Italy
[email protected]
2 DIETI, University of Napoli Federico II, via Claudio 21, Naples, Italy
[email protected], [email protected]

Abstract. This work simulates a dialogue with an artificially intelligent ChatBot which, by reading the clients’ questions, analysing each word and searching for a keyword, tries to build the correct answer. The chosen application domain is a culinary context, in which our ChatBot plays the waiter, ready to take orders and give further information about the restaurant. The user will be able to: order one or more courses from the menu, get the restaurant’s information about daily opening and closing times, get the restaurant’s information about take-away orders, and get the full list containing his/her orders.

1 Introduction

The idea of applying chatbots to the hospitality sector suggests a new way to help hotels and restaurants dialogue with customers simply and immediately. Every month 1.8 billion people are active on Facebook, and many users use the social network to search for opinions and information about a venue, data that can be analysed with big-data techniques [1]. A simple Facebook chatbot can build a database of users in a quick, easy and cheap way, e.g. for promoting events and special offers, analysing social networks [2] and understanding users’ preferences [3,4] in order to give them a customised experience [5]. We applied our chatbot in a restaurant context. Using the chatbot, the user will be able to:

• Order one or more courses from the menu
• Get the restaurant’s information about daily opening and closing times
• Get the restaurant’s information about take-away orders
• Get the full list containing his/her orders

2 ChatBot

In this project [6] we exploited a lexical analysis approach for the understanding of users’ needs [7].


In order to answer the client’s request, our ChatBot [8] searches for keywords in the user’s input. A keyword is a term of one or more words that synthesises the client’s request; the AI uses it to consult the restaurant’s Knowledge Base and optimise the answer according to its meaning [9–11]. We therefore created a list of keywords, the only ones helpful to reach the ChatBot’s goal. Then we added all the possible synonyms for each keyword, defining new lists of terms. To achieve this we used the wordnet.synsets function from nltk.corpus.
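A minimal sketch of this synonym-expansion step is shown below; the keyword list is a hypothetical example, and the WordNet corpus is assumed to have been downloaded beforehand with nltk.download('wordnet').

from nltk.corpus import wordnet

keywords = ["menu", "order", "hours"]  # hypothetical keyword list

synonyms = {}
for kw in keywords:
    terms = set()
    for syn in wordnet.synsets(kw):
        for lemma in syn.lemmas():
            # Lemma names use underscores for multi-word terms.
            terms.add(lemma.name().replace("_", " ").lower())
    synonyms[kw] = sorted(terms)

print(synonyms["menu"])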

Once we obtained all those synonyms, we collected them in a dictionary whose lists are joined at the end using re.compile. A list’s name can be equal to or different from its keyword.
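For instance, the per-keyword lists might be joined into regular-expression patterns as in the following sketch (the list names and synonym sets are illustrative):

import re

patterns = {
    "menu": re.compile("|".join(["menu", "card", "bill of fare"])),
    "opening_hours": re.compile("|".join(["hours", "opening", "closing", "time"])),
}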


The ChatBot uses a txt file as Knowledge Base; it is written by the restaurant’s manager and contains all the courses, with their prices, that clients are able to order.

All the possible responses are grouped in a responses dictionary, in which each is linked to the relative list’s name. In this way, when the client uses a word contained in the dictionary, the algorithm connects it to the respective list’s name and the AI knows how to respond.

The last cell implements the main section of the ChatBot: an iterative operation in which the user inserts a request, which is elaborated by the AI to obtain the correct output, until the user ends the conversation. In the first part, we used the search function from the re library to find a possible match between the user’s input and one of the patterns in the dictionary.


The iterative operation is implemented using while(cont==0). The key fallback_intent is selected by default in every iteration and is used when the user does not insert a keyword (or one of its synonyms) in the request. When a keyword matches, key is overwritten and determines the AI’s response. A set of input controls follows: the AI makes sure the client asks for an existing course from the menu and asks the user whether he/she wants to order something else. During the process, the ChatBot also saves all the requests and at the end prints the full order.
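Putting these pieces together, the main loop described above might look like the following sketch; the patterns and responses are illustrative stand-ins for the ones built earlier, while fallback_intent plays the role described in the text.

import re

patterns = {
    "menu": re.compile(r"menu|card"),
    "opening_hours": re.compile(r"hours|opening|closing"),
}
responses = {
    "menu": "Today we serve pasta (8 EUR) and pizza (7 EUR).",
    "opening_hours": "We are open every day from 12:00 to 23:00.",
    "fallback_intent": "Sorry, I did not understand. Could you rephrase?",
}

cont = 0
while cont == 0:
    user_input = input("You: ").lower()
    if user_input in ("bye", "quit"):
        cont = 1  # end the conversation
        continue
    key = "fallback_intent"  # default when no keyword (or synonym) matches
    for intent, pattern in patterns.items():
        if re.search(pattern, user_input):
            key = intent  # a keyword matched: overwrite the default
            break
    print("Bot:", responses[key])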

3 Running Example

The following figure shows an example of dialogue:

4 Conclusion

This work aims to simulate a dialogue with an artificially intelligent ChatBot, which attempts to construct the correct answer by reading the clients’ questions, analysing each word and searching for a keyword. The chosen application domain is a culinary setting in which the waiter is played by our ChatBot, ready to take orders and provide further details about the restaurant. The user is able to: order one or more courses from the menu, obtain the restaurant’s information about daily opening and closing times, obtain the restaurant’s information about take-away orders, and get a


complete list of his/her orders. Future work will exploit a more sophisticated Knowledge Base in order to enhance the reasoning capabilities of our chatbot.

References

1. Amato, A., Cozzolino, G., Moscato, V.: Big data analytics for traceability in food supply chain. In: Workshops of the International Conference on Advanced Information Networking and Applications, pp. 880–884. Springer (2019)
2. Castiglione, A., Cozzolino, G., Moscato, V., Sperlì, G.: Analysis of community in social networks based on game theory. In: 2019 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), pp. 619–626. IEEE (2019)
3. Amato, F., Cozzolino, G., Moscato, F., Moscato, V., Picariello, A., Sperlì, G.: Data mining in social network. In: International Conference on Intelligent Interactive Multimedia Systems and Services, pp. 53–63. Springer (2018)
4. Amato, A., Balzano, W., Cozzolino, G., Moscato, F.: Analysis of consumers perceptions of food safety risk in social networks. In: International Conference on Advanced Information Networking and Applications, pp. 1217–1227. Springer (2019)
5. Amato, F., Cozzolino, G., Moscato, V., Picariello, A., Sperlì, G.: Automatic personalization of visiting path based on users behaviour. In: 2017 31st International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 692–697. IEEE (2017)
6. Ciro, L., Francesco, I., Davide, R.: Restaurant chatbot
7. Alicante, A., Amato, F., Cozzolino, G., Gargiulo, F., Improda, N., Mazzeo, A.: A study on textual features for medical records classification. In: Innovation in Medicine and Healthcare 2014, vol. 207, p. 370 (2015)
8. Canonico, R., Cozzolino, G., Ferraro, A., Moscato, V., Picariello, A., Sorrentino, F.R., Sperlì, G.: A smart chatbot for specialist domains. In: Workshops of the International Conference on Advanced Information Networking and Applications, pp. 1003–1010. Springer (2020)
9. Cozzolino, G.: Using semantic tools to represent data extracted from mobile devices. In: 2018 IEEE International Conference on Information Reuse and Integration (IRI), pp. 530–536. IEEE (2018)
10. Amato, A., Cozzolino, G., Giacalone, M.: Opinion mining in consumers food choice and quality perception. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 310–317. Springer (2019)
11. Amato, A., Cozzolino, G.: Trust analysis for information concerning food-related risks. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 344–354. Springer (2019)

A Smart Interface for Provisioning of Food and Health Advices

Alessandra Amato1, Giovanni Cozzolino2(B), and Antonino Ferraro2

1 University of Napoli Federico II, Naples, Italy
[email protected]
2 DIETI, University of Napoli Federico II, via Claudio 21, Naples, Italy
[email protected], [email protected]

Abstract. The purpose of this work is to create an interactive chatbot that can help specialists get information about food and health-related questions. The project uses core information-processing techniques such as grammar tagging. The program gets its information from a digital encyclopedia, medlineplus.gov, one of the most complete encyclopedias on the web. First, we imported the necessary libraries: nltk and numpy for information processing, re for regular expressions, string for string handling, bs4 to parse HTML, and urllib.request to access websites and retrieve information from them. We also imported pos_tag from nltk to perform grammar parsing and stopwords from nltk.corpus to identify meaningless words.

1 Introduction

This project is aimed at the design and implementation of an intelligent interface: HEARTBOT, a software capable of conversing with a user in natural language [1], simulating the behaviour of a human being. This result is obtained by processing the information received as input: a sequence of operations, called a pipeline, implements the automatic Natural Language Processing (NLP) of the text [2,3]. NLP is an interdisciplinary field of research that embraces computer science, artificial intelligence and linguistics, whose purpose is to develop algorithms capable of analysing [4–6], representing and therefore understanding natural language, written or spoken, similarly to a human being. To develop the program we used Python, an interpreted programming language characterised by a simple interface that exploits libraries written in compiled languages, and which supports different programming paradigms: object-oriented, imperative and functional. HEARTBOT [7] is a software capable of giving information about food and health-related questions, through which the user can choose between four options. The system was designed as a framework [8], so that functionality and design fit together. The graphical interface features elements that manage colours, icons, sounds and voices.

2 HeartBot

The following are the libraries used to implement the code: sets of functions and data structures designed to be connected to a software program.
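The import cell itself was shown as a figure in the original; based on the libraries named in the abstract, it plausibly resembled the following sketch:

import re
import string
import urllib.request

import nltk
import numpy as np
from bs4 import BeautifulSoup
from nltk import pos_tag
from nltk.corpus import stopwords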

We begin to feed the knowledge base with a function that requests the text of a website, analyses it and stores the content in a string-type text variable [9].
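A sketch of such a function is given below; the URL is purely illustrative. It downloads the page with urllib.request and keeps only the visible paragraph text via BeautifulSoup.

import urllib.request
from bs4 import BeautifulSoup

def fetch_text(url):
    # Download the raw HTML and strip the markup, keeping only paragraph text.
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, "html.parser")
    return " ".join(p.get_text() for p in soup.find_all("p"))

text = fetch_text("https://medlineplus.gov/heartdiseases.html")  # illustrative URL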

Subsequently, all the sequential operations of text processing are carried out, i.e. the pipeline we spoke of previously. The first real phase of text manipulation is to divide the document into smaller parts that are easier to process. The technique used is tokenization, whose purpose is to divide the text into tokens: depending on the application, these can be single words or entire sentences. We initially identify tokens at the level of individual words by importing the word_tokenize function from the nltk.tokenize library, saving the output in a tokens variable so that it can be used in subsequent pipeline operations.


As previously mentioned, tokenization can also be carried out at sentence level: given a text, it is possible to divide it, at full stops, into sentences. The sent_tokenize function is imported from the nltk.tokenize library and its output is saved in a sentences variable. For the purpose of our project we will model the knowledge base, that is, we will transform the input text into a matrix whose rows are the sentences.
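Both tokenization levels reduce to two calls, as in this sketch (text is the variable filled by the fetching function above):

from nltk.tokenize import word_tokenize, sent_tokenize

tokens = word_tokenize(text)      # word-level tokens for the later pipeline stages
sentences = sent_tokenize(text)   # sentence-level tokens: the rows of the matrix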

A further pipeline phase of text processing is POS tagging (part-of-speech tagging), which provides fundamental information for determining the role of words within the text. POS tagging consists of assigning a grammatical tag to each token of a corpus, by importing the pos_tag function from the nltk library. In these lines of code an additional package is imported, RegexpParser, which allows us to manage regular expressions, i.e. grammar-based chunking rules.
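A sketch of this stage, with an illustrative noun-phrase chunking rule:

from nltk import pos_tag, RegexpParser

tagged = pos_tag(tokens)              # e.g. [('heart', 'NN'), ('beats', 'VBZ'), ...]
grammar = "NP: {<DT>?<JJ>*<NN>}"      # illustrative rule: optional determiner, adjectives, noun
chunks = RegexpParser(grammar).parse(tagged)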

Subsequently we transform the graphic forms into lemmas, producing a standardisation of the text. The lemma is the canonical form under which a given entry appears in the dictionary, e.g. the infinitive


for verbs, the singular for nouns, the masculine singular for adjectives, and so on. From the stem module of the nltk library we therefore take advantage of the lemmatize function of WordNetLemmatizer, which associates each term with its basic form using the WordNet lexical resource. For the purposes of document modelling, the identified lemmas will represent the columns of the matrix.

We import the stopwords from the corpus of the nltk library — words, such as articles, that do not bring particular information to the analysis of the text — and, together with punctuation, we create a function for their removal.

With these basics prepared, we apply lemmatisation to the so-called ‘full’ words, i.e. those not belonging to the stopword list, and we carry out normalisation — the unification of the different graphic forms — on a text whose capitalisation has been reduced to lowercase and whose punctuation has been removed.
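These steps can be condensed into the LemNormalize function referenced later as the CountVectorizer tokenizer; the following is a sketch of one plausible implementation:

from string import punctuation
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

lemmer = WordNetLemmatizer()
stop_words = set(stopwords.words("english"))

def LemNormalize(text):
    # Lowercase, strip punctuation, tokenize, drop stopwords, then lemmatise.
    clean = text.lower().translate(str.maketrans("", "", punctuation))
    return [lemmer.lemmatize(tok) for tok in word_tokenize(clean)
            if tok not in stop_words]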

In the final phase of creation and implementation of our chatbot, we import the sklearn (scikit-learn) library, which provides machine learning tools for modelling the text. Modelling the text, as already mentioned, means transforming the input document into a matrix that has as many rows as the sentences obtained with sent_tokenize, plus one — the sentence introduced by the user — and columns represented by the lemmas identified through the NLP text-processing pipeline. Through CountVectorizer we transform the text into a matrix and through cosine_similarity we calculate the similarity coefficient between a sentence of the text and the one introduced by the user [10–12].

We define a function that receives as input the sentence entered by the user, which is added to the knowledge base, allowing our agent to compare it with


the others already present and to return a fair answer. Inside it, functions implementing the text-modelling steps have been defined:
1. cv = CountVectorizer(max_features=50, tokenizer=LemNormalize, analyzer='word'): this first operation transforms the input into vectors, operating directly on the lemmas.
2. X = cv.fit_transform(sentences): the text is transformed into a matrix which, as said previously, has as rows the sentences of the text plus one, represented by the user’s question, and the lemmas as columns.
3. vals_cv = cosine_similarity(X[-1], X): the similarity coefficient is calculated between the row introduced by the user, which is obviously the last one (X[-1]), and the rows of the matrix X.
4. flat_vals_cv = vals_cv.flatten()
5. flat_vals_cv.sort()
6. index_frase_piu_simile = flat_vals_cv[-2]
Steps 4–6 respectively flatten the array into a vector, sort the cosine-similarity values in ascending order, and memorise the highest value without considering the phrase stored at index −1, since that is the question just entered by the user.
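The six steps above correspond to a response function along the following lines; this is a sketch, with max_features=50 taken from the text and the reply message invented for illustration.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def response(user_sentence, sentences):
    sentences.append(user_sentence)  # add the question to the knowledge base
    cv = CountVectorizer(max_features=50, tokenizer=LemNormalize, analyzer="word")
    X = cv.fit_transform(sentences)              # rows: KB sentences + user question
    vals_cv = cosine_similarity(X[-1], X)        # question vs. every row
    best_idx = vals_cv.argsort()[0][-2]          # best match, skipping the question itself
    flat_vals_cv = vals_cv.flatten()
    flat_vals_cv.sort()
    if flat_vals_cv[-2] == 0:                    # no lemma in common with the KB
        return "Sorry, I cannot process your query."
    return sentences[best_idx]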

We note that within the function there is also a check on the cosine-similarity value: if the latter is equal to 0, it means that there


has been no match, and our chatbot will output a message saying that it cannot process the query. If, on the other hand, the value is greater than 0, it proposes to the user the phrase with maximum similarity.

We included a class that manages the colours of print() using ANSI escape sequences. We use these colours for a better user experience during the interaction with the interface.
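Such a class is typically a plain container of escape codes, as in this sketch:

class Colors:
    # ANSI escape sequences used to colour terminal output.
    HEADER = "\033[95m"
    OKBLUE = "\033[94m"
    OKGREEN = "\033[92m"
    WARNING = "\033[93m"
    FAIL = "\033[91m"
    ENDC = "\033[0m"  # reset to the default colour

print(Colors.OKGREEN + "HEARTBOT ready." + Colors.ENDC)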

From IPython.display we use YouTubeVideo to stream HTML video inside a cell. We define a method that takes as arguments the URL and the dimension settings.

We could have used gTTS, developed by Google, for text-to-speech, but its voice was not natural and, since we want to simulate a phone call, we decided to introduce another library, pyttsx3. Thanks to it we can tune many settings, such as volume, speed, gender and language.
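A sketch of the pyttsx3 setup (the property values are illustrative):

import pyttsx3

engine = pyttsx3.init()
engine.setProperty("rate", 150)     # speaking speed, in words per minute
engine.setProperty("volume", 0.9)   # volume, from 0.0 to 1.0
voices = engine.getProperty("voices")
engine.setProperty("voice", voices[0].id)  # voice choice drives gender/language
engine.say("How can I help you?")
engine.runAndWait()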

We use the speech_recognition library (sr) to enable voice recognition through the microphone, converting speech to text. This powerful function can also reduce ambient noise: it waits for a second to let the recogniser adjust its energy threshold based on the surrounding noise level.
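The microphone helper can be sketched as follows, using the speech_recognition package; recognize_google performs the speech-to-text conversion through Google's free web API.

import speech_recognition as sr

def mic():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        # Wait about a second so the recogniser can adapt its energy
        # threshold to the surrounding noise level.
        r.adjust_for_ambient_noise(source, duration=1)
        audio = r.listen(source)
    return r.recognize_google(audio)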

3 User Interface

Let us now discuss the user interface. The first thing we do is set the title, implementing some rules for centring the text and adjusting the colours. We then create a method that manages the menu interface, which also relies on many print() calls. Finally, we have main(): this function sets all the rules for using the software from start to end. We created three Boolean variables to manage the while loops that bring the user back to the menu. In the first loop we have two variables, menu and repeat: one is used to choose the category and the other to come back to the list. For a better user experience we also used some icons converted from images to ASCII art. In the “CHAT LIVE” category the user inserts the question; then the bot, with print(':', response(user_response)), uses the information implemented before and evaluates the best answer from its KB. Afterwards we clean the KB with sentences.remove(user_response). The user can choose to ask another question or leave the category. The same happens in the “CALL 911” category, but the text created by the bot and received from our voice is hidden, because we want to simulate a call; considering the design, we still created items that show the user what is happening. The microphone is enabled by user_response=mic(), and the bot talks thanks to engine.say(response(user_response)) and speaks with engine.runAndWait(). After this we clean the KB with sentences.remove(user_response) and reset user_response="". The “VIDEO AND IMAGE” category executes the next cell, which contains the video and image information. Finally, “EXIT” closes the program: sys.exit() forces a SystemExit before opening the final cells.
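The control flow just described can be sketched as follows; the category names come from the text, while response(), mic(), engine and sentences refer to the pieces introduced earlier, so the structure is illustrative rather than a verbatim transcription of the original main().

import sys

def main():
    menu = True
    while menu:
        print("1) CHAT LIVE   2) CALL 911   3) VIDEO AND IMAGE   4) EXIT")
        choice = input("Choose a category: ")
        if choice == "1":                      # CHAT LIVE
            repeat = True
            while repeat:
                user_response = input("You: ")
                print("Bot:", response(user_response, sentences))
                sentences.remove(user_response)   # keep the KB clean
                repeat = input("Another question? (y/n) ") == "y"
        elif choice == "2":                    # CALL 911: spoken interaction
            user_response = mic()
            engine.say(response(user_response, sentences))
            engine.runAndWait()
            sentences.remove(user_response)
        elif choice == "4":                    # EXIT
            sys.exit()                         # forces SystemExit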


4 Running Example

5 Conclusion

The aim of this work is to build an intelligent chatbot that helps specialists get information about food and health-related questions. The project utilises core information retrieval methods such as grammar tagging, and retrieves its information from a digital encyclopedia, medlineplus.gov. Future work will exploit a more sophisticated Knowledge Base in order to enhance the reasoning capabilities of our chatbot.


References

1. Canonico, R., Cozzolino, G., Ferraro, A., Moscato, V., Picariello, A., Sorrentino, F.R., Sperlì, G.: A smart chatbot for specialist domains. In: Workshops of the International Conference on Advanced Information Networking and Applications, pp. 1003–1010. Springer (2020)
2. Amato, F., Cozzolino, G., Mazzeo, A., Romano, S.: Intelligent medical record management: a diagnosis support system. Int. J. High Perform. Comput. Networking 12(4), 391–399 (2018)
3. Alicante, A., Amato, F., Cozzolino, G., Gargiulo, F., Improda, N., Mazzeo, A.: A study on textual features for medical records classification. In: Innovation in Medicine and Healthcare 2014, vol. 207, p. 370 (2015)
4. Amato, A., Balzano, W., Cozzolino, G., Moscato, F.: Analysis of consumers perceptions of food safety risk in social networks. In: International Conference on Advanced Information Networking and Applications, pp. 1217–1227. Springer (2019)
5. Amato, A., Cozzolino, G.: Trust analysis for information concerning food-related risks. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 344–354. Springer (2019)
6. Amato, A., Cozzolino, G., Moscato, V.: Big data analytics for traceability in food supply chain. In: Workshops of the International Conference on Advanced Information Networking and Applications, pp. 880–884. Springer (2019)
7. Lello, M., Francesca, F.: Heartbot
8. Amato, F., Cozzolino, G., Maisto, A., Mazzeo, A., Moscato, V., Pelosi, S., Picariello, A., Romano, S., Sansone, C.: ABC: a knowledge based collaborative framework for e-health. In: 2015 IEEE 1st International Forum on Research and Technologies for Society and Industry Leveraging a Better Tomorrow (RTSI), pp. 258–263. IEEE (2015)
9. Castiglione, A., Cozzolino, G., Moscato, V., Sperlì, G.: Analysis of community in social networks based on game theory. In: 2019 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), pp. 619–626. IEEE (2019)
10. Amato, F., Cozzolino, G., Moscato, V., Moscato, F.: Analyse digital forensic evidences through a semantic-based methodology and NLP techniques. Future Gener. Comput. Syst. 98, 297–307 (2019)
11. Cozzolino, G.: Using semantic tools to represent data extracted from mobile devices. In: 2018 IEEE International Conference on Information Reuse and Integration (IRI), pp. 530–536. IEEE (2018)
12. Amato, A., Cozzolino, G., Giacalone, M.: Opinion mining in consumers food choice and quality perception. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 310–317. Springer (2019)

Analysis of COVID-19 Data

Alessandra Amato1, Giovanni Cozzolino2(B), Alessandro Maisto3, and Serena Pelosi3

1 University of Napoli Federico II, Naples, Italy
[email protected]
2 DIETI, University of Napoli Federico II, Via Claudio 21, Naples, Italy
[email protected]
3 DISPC, University of Salerno, Via Giovanni Paolo II, 132, Fisciano, SA, Italy
{amaisto,spelosi}@unisa.it

Abstract. A lot of research was done during the first months of 2020 regarding Covid-19. Researchers from different fields worked and cooperated to better understand the virus, in order to manage the pandemic and to model its spread. A series of tools have been developed in this sense, but there is a lack of work summarising what the scientific community has produced. We would like to, at least partially, summarise the results obtained so far by analysing some of the published papers on the matter. To achieve such a result, we use different Python libraries for text analysis. The entire work has been done in Python on the Google Colaboratory platform.

1 Introduction

Analysing the results of thousands of researchers around the world about Covid-19 is not an easy task for anyone, and it would be far easier if there were a summary of the main points reached by the scientific community [1]. We propose an analysis of COVID-related literature, aiming at extracting information and classifying papers by their significant topics and results. Not all the features that we aimed to include in the analysis are considered, since some hiccups were met given the amount of data and the hardware/software limitations under which we operated [2]. The obtained results can constitute a good starting point for a future, more comprehensive analysis [3–6] and further exploitation, such as suggestion systems [7,8].

2 Data and Tools

The data was retrieved from the website of the European Bioinformatics Institute (EBI), part of the European Molecular Biology Laboratory (EMBL), Europe’s flagship laboratory for the life sciences. From this website we retrieved about 73000 articles of various types, described by 60 different attributes. We used the attributes to select a subset of the data, considering only:

• published journal articles;
• in English;
• open access;
• with PDF available.

Out of this subset of the data, we decided to concentrate our analysis only on the articles cited more than 100 times [9–11]. The retrieval was conducted using the requests and pandas libraries of Python. The data, in JSON format, was first converted to CSV using pandas and then to strings using PyPDF2. This last operation was possible by using the DOI of each open-access article to download its PDF, a task achieved with the scidownl library, which lets users download papers from the Sci-Hub repository; the PDFs were finally converted to plain text using PyPDF2. Since PyPDF2 often makes mistakes in reading PDFs, some articles were corrupted; the resulting articles used in the analysis are 326. The core analysis tasks required a variety of Python libraries. The nltk library was used both to clean the data and to analyse it. We also used regular expressions to clean the data further, implemented in Python with the re library. Lastly, we used the enchant library to check the validity of the extracted words, and its suggest method to try to correct the corrupted ones [12]. To plot the data, the libraries matplotlib, seaborn and networkx were used; to facilitate the representation of the results, the collections library was used.

2.1 Preparing the Data

Before going into the core of the analysis, we show how we cleaned the data. We used regular expressions to remove from the papers URLs, all digits, and words that were not helpful for the review, such as all the words containing the terms research, publish, scholar and some other compound words belonging to the header or preamble of the papers. After applying these expressions, we removed the stopwords provided by the nltk library. In addition, we decided to remove other words that occurred frequently but were not considered useful for the goal of the analysis. The list of the words we decided to exclude is as follows: ‘covid19’, ‘covid’, ‘coronavirus’, ‘covid2019’, ‘virus’, ‘covid-19’, ‘human’, ‘may’, ‘also’, ‘version’, ‘article’, ‘acad’, ‘material’, ‘available’, ‘online’, ‘usa’, ‘received’, ‘proc’, ‘natl’, ‘sci’, ‘limited’, ‘right’, ‘reserved’, ‘national’, ‘institute’, ‘health’, ‘university’, ‘department’, ‘nature’, ‘volume’, ‘issue’, ‘fig’, ‘figure’. After removing the words cited above, we checked whether the remaining words were meaningful, that is to say, whether they were in the English vocabulary. This step was necessary for two reasons: we did not want words with no interpretable meaning, and the PyPDF2 library could have mixed up some words. This step was pursued using the words corpus of the nltk library. To compare the results obtained from the use of these words with a better


representation of the content of the articles, we decided to lemmatise them. The lemmas, i.e. the dictionary or citation forms of a word or a set of words, were found using the WordNetLemmatizer of the nltk library. With these data we proceeded with our analysis.
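A condensed sketch of this cleaning pipeline is given below; papers is assumed to be the list of article strings, and the exclusion list is abbreviated to a few entries.

import re
import enchant
from nltk.corpus import stopwords, words
from nltk.stem import WordNetLemmatizer

english_vocab = set(words.words())
excluded = {"covid19", "covid", "coronavirus", "virus", "figure"}  # abbreviated list
stop_words = set(stopwords.words("english")) | excluded
checker = enchant.Dict("en_US")
lemmer = WordNetLemmatizer()

def clean(paper):
    paper = re.sub(r"https?://\S+|\d+", " ", paper)   # drop URLs and all digits
    tokens = [t for t in paper.lower().split() if t and t not in stop_words]
    tokens = [t for t in tokens if t in english_vocab or checker.check(t)]
    return [lemmer.lemmatize(t) for t in tokens]

cleaned_papers = [clean(p) for p in papers]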

3 Absolute Frequencies and N-Grams

From the frequencies of the words and, in particular, of the lemmas, we would like to see whether it is possible to extract some information about the coronavirus and the state of the research on the matter. We start by looking only at the absolute frequencies of words and lemmas across all the papers considered; the 30 most common lemmas and words are displayed below. Figures 1 and 2 are bar plots made to better visualise these results. From the plots it can be seen that the most frequent lemma, and word, is cell, followed by protein, infection and viral. The most indicative results are the ones obtained from the lemmas: looking at the results given by the raw words, we can see repetitions of words used in different forms. For instance, the first two words, cell and cells, are better treated as a unique word rather than separated, as happens without a proper lemmatisation. The 20 most common lemmas all describe pretty well Covid-19 and, at least partially, the research on the matter. Nonetheless, the result shows that some information from the papers can be captured this way, even though it seems somewhat superficial.
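The frequency counts themselves reduce to a call to collections.Counter, e.g.:

from collections import Counter

# The concatenated lemmas of the 326 cleaned papers.
lemma_counts = Counter(lemma for paper in cleaned_papers for lemma in paper)
print(lemma_counts.most_common(30))   # the 30 most frequent lemmas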

Fig. 1. Histogram of the frequency of the 30 most common words in the 326 papers considered.


Fig. 2. Histogram of the frequency of the 30 most common lemmas in the 326 papers considered.

These results were obtained without looking for specific parts of speech. Likely, more desirable results would be found by looking only for nouns and adjectives. With the use of regular expressions we could look for these specific tags and compare the results with those obtained above. To find more helpful information, we proceed by analysing bigrams and trigrams. Since it is evident from this first glance at the papers that the lemmas are a better representation of the content of the documents than the raw words, in the following sections we are going to focus on the results obtained using the lemmas, while still providing the ones obtained with the words before lemmatisation.

3.1 Bigrams and Trigrams

In this part of the analysis, instead of considering the frequency of single words, we consider the frequency of sequences of words. In particular, we focus on sequences of two and three words, respectively called bigrams and trigrams. To create these n-grams, we use the two modules of the nltk library called, precisely as one would expect, bigrams and trigrams. The results give us more information about the content of the papers. The differences between raw words and lemmas are almost non-existent, a sign that n-grams are far more robust than single words, also called unigrams. The results are also represented in the histograms of Figs. 3 and 4. These same results can be seen using graphs as well: in this way it is possible to see the connections between the different words used in the papers considered, at least for the ones shown in the plots, that is, the 20 most common ones.
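The n-gram counting reduces to a few lines with the nltk modules named above, e.g.:

from collections import Counter
from nltk import bigrams, trigrams

all_lemmas = [lemma for paper in cleaned_papers for lemma in paper]
bigram_counts = Counter(bigrams(all_lemmas))
trigram_counts = Counter(trigrams(all_lemmas))
print(bigram_counts.most_common(30))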


Fig. 3. Histogram of the frequency of the 30 most common bigrams before lemmatising in the 326 papers considered.

Fig. 4. Histogram of the frequency of the 30 most common bigrams after lemmatising in the 326 papers considered.

The graphs in Figs. 5 and 6 show as nodes the single words of the bigrams, while the edges connect words that belong to the same bigram at least once. The results of the trigrams are even more useful for characterising the studies on Covid-19 and on the virus itself, showing an even deeper understanding of the content of the papers. What we observed so far can also be verified in Figs. 7 and 8. Figures 9 and 10 show the graphs for the corresponding histograms, in the same fashion as for the bigrams.


Fig. 5. Graph of the frequency of the 30 most common bigrams before lemmatising in the 326 papers considered.

Fig. 6. Graph of the frequency of the 30 most common bigrams after lemmatising in the 326 papers considered.

The graphs show which words fall into an n-gram, that is to say, which words occur in sequence among the most frequent n-grams. This shows how useful n-grams are in summarising, and eventually evaluating, the content of a collection of papers or texts in general.


Fig. 7. Histogram of the frequency of the 30 most common trigrams before lemmatising in the 326 papers considered.

Fig. 8. Histogram of the frequency of the 30 most common trigrams after lemmatising in the 326 papers considered.

4 IDF Analysis

The Inverse Document Frequency (IDF) measures how rare a word is in a collection of documents. This translates into a measure of how much information a word carries — in other words, how rare or common it is across all the papers considered.


Fig. 9. Graph of the frequency of the 30 most common trigrams before lemmatising in the 326 papers considered.

Fig. 10. Graph of the frequency of the 30 most common trigrams after lemmatising in the 326 papers considered.

The IDF has been computed using the TextCollection module of the nltk library. The formula for the IDF is as follows:

idf(t, D) = log( N / |{d ∈ D : t ∈ d}| )   (1)

In the numerator there is N, the number of documents considered, while in the denominator there is the number of documents in which the term t appears. A 1 can be added to the denominator, since it can happen that a term does not occur in the collection of documents considered.
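With nltk this computation reduces to a few lines, as in the following sketch (cleaned_papers is the list of token lists produced by the cleaning step):

from nltk.text import TextCollection

collection = TextCollection(cleaned_papers)  # one token list per paper
print(collection.idf("skin"))                # approximately 0.983 on the corpus used here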


Below are shown the top 30 words on the basis of the IDF. Even more interesting is to search for specific words in order to assess their informational importance as a whole. For this reason, we wrote a function that searches for a particular word in the documents and computes its IDF, so that it is possible to look up the importance of a certain word of interest, as we have done for the word skin, whose IDF is equal to 0.983.

5 Conclusion

In this work, we applied a TF-IDF transformation to text extracted from scientific papers dealing with COVID analysis. After having identified some categories of interest, one of the objectives of the analysis was the implementation of a recommendation system exploiting the TF-IDF values and the citations. Another focus of the analysis was the performance of the program. At the moment it takes a long time to clean the whole set of papers, mainly because of the enchant library methods. The use of Spark with MapReduce methods would have helped improve the performance, but some difficulties came up in the implementation on the Google Colaboratory platform. Lastly, it would have been interesting to add the IDF values for the n-grams, but the performance of the program would have made it challenging to obtain the results in a reasonable time. The main focus in the future will therefore be on improving the performance of the program in order to add these more sophisticated features.

References

1. Amato, F., Cozzolino, G., Maisto, A., Mazzeo, A., Moscato, V., Pelosi, S., Picariello, A., Romano, S., Sansone, C.: ABC: a knowledge based collaborative framework for e-health. In: 2015 IEEE 1st International Forum on Research and Technologies for Society and Industry Leveraging a Better Tomorrow (RTSI), pp. 258–263. IEEE (2015)
2. Alicante, A., Amato, F., Cozzolino, G., Gargiulo, F., Improda, N., Mazzeo, A.: A study on textual features for medical records classification. In: Innovation in Medicine and Healthcare 2014, vol. 207, p. 370 (2015)
3. Amato, F., Cozzolino, G., Giacalone, M., Moscato, F., Romeo, F., Xhafa, F.: A hybrid approach for document analysis in digital forensic domain. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 170–179. Springer (2019)
4. Amato, A., Cozzolino, G.: Trust analysis for information concerning food-related risks. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 344–354. Springer (2019)
5. Amato, A., Cozzolino, G., Moscato, V.: Big data analytics for traceability in food supply chain. In: Workshops of the International Conference on Advanced Information Networking and Applications, pp. 880–884. Springer (2019)
6. Castiglione, A., Cozzolino, G., Moscato, V., Sperlì, G.: Analysis of community in social networks based on game theory. In: 2019 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), pp. 619–626. IEEE (2019)
7. Amato, F., Cozzolino, G., Mazzeo, A., Romano, S.: Intelligent medical record management: a diagnosis support system. Int. J. High Perform. Comput. Networking 12(4), 391–399 (2018)
8. Canonico, R., Cozzolino, G., Ferraro, A., Moscato, V., Picariello, A., Sorrentino, F.R., Sperlì, G.: A smart chatbot for specialist domains. In: Workshops of the International Conference on Advanced Information Networking and Applications, pp. 1003–1010. Springer (2020)
9. Cozzolino, G.: Using semantic tools to represent data extracted from mobile devices. In: 2018 IEEE International Conference on Information Reuse and Integration (IRI), pp. 530–536. IEEE (2018)
10. Amato, A., Cozzolino, G., Giacalone, M.: Opinion mining in consumers food choice and quality perception. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 310–317. Springer (2019)
11. Amato, A., Balzano, W., Cozzolino, G., Moscato, F.: Analysis of consumers perceptions of food safety risk in social networks. In: International Conference on Advanced Information Networking and Applications, pp. 1217–1227. Springer (2019)
12. Di Cicco, M., Fonisto, F.: Covid-19 papers analysis

Towards the Generalization of Distributed Software Communication

Reinout Eyckerman(B), Thomas Huybrechts, Raf Van den Langenbergh, Wim Casteels, Siegfried Mercelis, and Peter Hellinckx

IDLab, Faculty of Applied Engineering, University of Antwerp - imec, Sint-Pietersvliet 7, 2000 Antwerp, Belgium
[email protected]

Abstract. The widespread adoption of the Internet of Things requires a robust and transparent communication middleware. Such a middleware was provided by Vanneste et al. in the Distributed Uniform STreaming (DUST) framework, increasing the transparency and reducing the complexity of writing distributed software. We expand on the DUST concept, improving the transparency further and adding message pre- and postprocessing. This allows for easy configuration and, using a coordination mechanism, can even enable automatic software communication optimization. We first define the architecture of the proposed improvements and validate it against the requirements of distributed systems. We then validate this architecture by building compression and encryption message processors and define their models for a specific architecture.

1 Introduction

The advent of the Internet of Things (IoT) has revitalized the research interest in distributed systems. A traditional IoT application consists of three unit types: sensing, processing and actuation. The number of units is variable and can change dynamically, which increases the difficulty of working with IoT systems. Although the data these IoT devices generate needs to be processed, the devices often lack the computational capabilities to do so, shifting the computations to dedicated devices or servers. Moving this logic to cloud servers can induce large latencies, which might become excessive depending on the use case. As a solution, the Fog and Edge computing paradigms were proposed, which move software closer to the end devices. A considerable amount of effort has gone into researching and providing for these distributed systems, to grasp and reduce their complexity. Although the creation of a distributed system has been simplified by, for example, MQTT communication, issues such as concurrency and transparency remain. With increasing network requirements and software demands, developers often use communication channels optimized for their specific needs. If, however, such a communication channel does not fit within the framework they are using, support has to be created from scratch, adding overhead and reducing the framework’s effectiveness.


To solve these issues, Vanneste et al. proposed the Distributed Uniform STreaming (DUST) framework [1]. This framework enables the usage of multiple communication middlewares using a single communication interface, allowing developers to use their transport protocol of choice without having to implement the technicalities. This increases the flexibility and transparency of communication in distributed systems. Moreover, the framework enables the use of a coordination mechanism, or coordinator, which ensures an optimal usage of the available network resources. This coordinator optimizes the software placement to minimize several metrics, such as application latency and energy usage, considering device and network constraints, as defined in previous research [2,3]. Since the original paper, DUST has been released under the Mozilla Public License v2 [4]. In this research we further expand on the DUST concept, improving the reusability of transport protocols by further generalizing the communication interface. In addition, we propose adding pre- and postprocessing stages to the middleware. This allows developers to compose and configure the communication stack using a configuration file, selecting the required features without having to manually implement them. We then define what the requirements are of distributed systems, and display how DUST fills several of these requirements. This is then validated by constructing compression and encryption models, which could be used in coordination mechanisms.

2 Related Work

One feature inherent to software design is that every problem tends to have multiple solutions, each with its own pros and cons. The large set of complexities in designing distributed systems generates a large number of possible solutions with their own trade-offs, allowing developers to optimize their distributed system for specific scenarios. Several such scenarios are defined by Garcia-Valls et al. [5], who describe the challenges of constructing middleware for specific systems such as Cyber-Physical Systems (CPS), mobile environments and cloud computing scenarios. Their focus is mainly on optimal resource management, considering the requirements of a coordinating entity as well. Cheng et al. propose a REST-based service middleware [6]. Their research defines a middleware that supports mashups, a way to compose a custom service from manually selected services. This is built on top of a publish/subscribe system, where services can subscribe to data providers using a RESTful Application Programming Interface (API). Such research would greatly benefit from using DUST, as device and communication heterogeneity is considered in the research: by using DUST, they could abstract away this complexity while keeping the focus on the mashup methodology. Zheng et al. defined a communication controller with multiple communication stacks integrated [7]. This is controlled by a multi-agent overlay, where multiple types of agents cooperate to ensure that the sensor information gets to the correct data-processing agent, which then targets the control devices. Their proposed platform is, however, precisely tuned to their needs and will not fit outside of its designed context.

3 DUST Architecture

The original DUST Core was developed from the need of running distributed software in resource-constrained environments, focusing on improving transparency and reducing the fragmentation and development cost of distributed software. This has been achieved by abstracting communication middleware away behind a shared API, resulting in a uniform way of communicating across multiple middlewares. This design is shown on the left of Fig. 1. Moreover, a coordination mechanism was provided to distribute DUST components for reduced resource usage. We now propose design changes to further improve the transparency, fragmentation and uniformity of distributed systems, building upon the original DUST concept and adhering to its base requirements, such as supporting resource-constrained environments.

Fig. 1. Architectures for the original DUST Core and the improved DUST Core

3.1 Requirements

We will define these improvements by starting at the placement coordinator. Given a computer network and a DUST application, it is the requirement of this coordinator to find the DUST components’ optimal placement, to reduce resource usage and optimize metrics such as latency and bandwidth. When relocating these DUST components, the network capabilities over which they communicate might change. This could include a decrease of available bandwidth, or a decrease of communication latency. The effect of these changes can be profound, and should therefore be reflected in the running application if it is based on a resource-aware framework. As different transport protocols have different requirements and capabilities, we should be able to interchange them at runtime. This defines our first new requirement: The transport protocols should be interchangeable at runtime.


However, this change should not only affect which transport protocol is used. Taking the approach one step further, we can also prepare the data for transmission dynamically. Given, for example, a network link with reduced bandwidth, the coordination mechanism should be able to decide whether it would be beneficial to compress the data before transmitting it. Although we shift the resource-aware complexity to the coordinating mechanism, DUST should be able to execute the decisions this mechanism makes. This defines our second new requirement: it must be possible to prepare the communicated data before transmission.

3.2 DUST Advancements

To realize the previously defined requirements, we propose to change DUST to the new architecture described in Fig. 1. Ensuring the first requirement, changing the transport protocols at run-time, poses few complexities, as most of this design was already provided by the original DUST version. However, to ensure dynamic behaviour with a minimal memory footprint, these protocols were encapsulated in shared libraries. This allows loading a communication library only when it is in use, keeping the memory free of any communication logic that is not. Given models of these transport protocols, the coordinator can decide which protocol should be used in which scenario. The second requirement, being able to process data before transmission, is provided by introducing the concept of channels, shown in Fig. 2. These are structures in which transformers are chained together and whose tail represents a transport protocol. A transformer transforms messages going towards the communication medium and reverse-transforms messages coming from the medium, as the changes made at the transmitting end must be reversed at the receiving end. Since the changes to the message must be reversed in the exact opposite order in which they were applied, there is an added requirement that the add-on stacks on both the transmitting and receiving end be identical. Due to the complexity this can generate for developers, the DUST Initializr tool has been developed by Huybrechts et al. [8]. This tool allows automatic DUST configuration for static applications and networks through a graphical interface. This degree of freedom lets developers select and configure their communication without losing time on development and debugging. In networks with a coordinator, the coordinator can be tasked with ensuring both stacks are identical. Network knowledge can be used to see which links require large amounts of bandwidth, allowing it to automatically configure a compression transformer. These channels and their transformer and transport segments can be loaded in and out of the DUST channel at run-time, allowing rapid channel reconfiguration and communication optimization with minimal overhead. We continue the usage of publish/subscribe-based communication, as originally defined for DUST. Using publish/subscribe communication in the channels allows API simplification, as the developer knows which components require and which provide data, reducing loss of information.
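To make the channel idea concrete, the following plain-Python sketch mimics a transformer chain; it is not DUST's actual API, only an illustration of the transform/reverse-transform symmetry between the two endpoints.

import zlib

class CompressionTransformer:
    def transform(self, data: bytes) -> bytes:
        return zlib.compress(data, 1)        # applied at the transmitting end

    def reverse(self, data: bytes) -> bytes:
        return zlib.decompress(data)         # applied at the receiving end

class Channel:
    def __init__(self, transformers):
        self.transformers = transformers     # the chain; the tail would be a transport

    def send(self, data: bytes) -> bytes:
        for t in self.transformers:          # forward order at the sender
            data = t.transform(data)
        return data

    def receive(self, data: bytes) -> bytes:
        for t in reversed(self.transformers):  # exact opposite order at the receiver
            data = t.reverse(data)
        return data

channel = Channel([CompressionTransformer()])
assert channel.receive(channel.send(b"CAN frame " * 100)) == b"CAN frame " * 100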

4 Distributed System Characteristics

Our defined architecture will be validated against the requirements of a distributed system. Based on the book by Coulouris et al., we can define a set of key challenges for distributed systems [9]:

Fig. 2. Visualisation of channel communication

Openness: The ability to easily exchange one application component for another, such as replacing a certain database with another type of database. Although not explicitly facilitated by DUST, it can work in tandem with it: given that two blocks have the same communication interfaces, they can easily be interchanged. The coordinator could make use of this characteristic by determining the network capabilities and selecting from a set of interchangeable algorithms based on resource consumption and required accuracy.
Scalability: The ability to scale the running applications on demand. DUST partially aids scalability by moving the communication-channel configuration to a configuration file. If a developer decides a component’s load becomes too high and changes, for example, from a simple socket-based messaging mechanism to a publish/subscribe pattern to facilitate concurrency, the developer only has to change the DUST transporter: any new component simply subscribes to the defined publisher to be ready for usage.
Transparency: The ability to maintain operation without knowledge of failing components or of the current system scale, among others. DUST excels in this characteristic. Using the pub/sub methodology with transparent communication channels allows applications to be oblivious of who is connected to their communication channels: all that is known is the type of information that comes in from subscribing channels and that is required to go into publishing channels. Scalability transparency is, as defined before, partially supported by DUST but remains mainly a developer responsibility.


Fault Tolerance: The ability to maintain system operations when components are failing. This is not facilitated by DUST, but can be provided by the respective coordinating entity: if component or device failures are detected, the coordinator can spin up a new component, enabling rapid recovery. For stateful component recovery, we suggest adding per-device agents which monitor and store the running components’ state, in order to recover it on component failure. Such recovery is, however, considered out of scope for this research.
Concurrency: The ability to run multiple application components concurrently. This is a requirement for the developer and is not facilitated by DUST.
Heterogeneity: The ability to work on multiple software/hardware platforms and networks. This is supported by DUST: currently both the Windows and Linux OS are officially supported, and further research will target systems running without an operating system.
Security: The ability to prevent malicious entities from altering the system operations. Currently, DUST shifts the responsibility for link security to the application developer. With the proposed add-on system, this can be shifted to channel-specific security approaches, giving the developer the freedom and responsibility of selecting the preferred type of security.
Quality of Service: The ability to provide a certain service quality by considering the network capabilities and application requirements. This is not directly provided by DUST, but can partially be provided by the coordinator.

5 Architecture Evaluation

Several transformers were constructed to validate the architecture. The results were measured in a Docker container with access to an Intel(R) Core(TM) i7-6800K CPU @ 3.40 GHz processor and 8 GB RAM (dynamic usage).

Compression

Due to the varying network link capabilities, compression was chosen as the first use case. The compression transformer used zlib as its compression algorithm, with a variable compression level. We validated the transformer on three data sets: the first of purely random data, the second of random strings, and the third of CAN data. The string-based data was constructed from a set of 96 characters, encompassing the readable characters in the ASCII charset. The CAN data set was extracted from a BMW vehicle and was used because of its relevance to both high-speed communication and storage. It has been used for compression research by the Celtic SARWS project, where car sensor information is extracted to determine the road quality based on weather information [10]. A single Controller Area Network (CAN) message was 32 bytes; from these messages, batches of up to 16 MiB were manually created. The compression library was configured to use level-one compression, as higher compression levels did not meaningfully improve the compression ratio; those results have been omitted for brevity.
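As an illustration, a compression transformer of this kind can be sketched in a few lines of Python. The class and method names below are ours, not part of DUST's public API, and the batch is synthetic; the sketch only mirrors the behaviour described above.

```python
import zlib

class CompressionTransformer:
    """Illustrative sketch only: the real DUST transformer interface is
    not shown in the paper, so these method names are our own."""

    def __init__(self, level: int = 1):
        # Level 1 matches the experiments; higher levels did not
        # meaningfully improve the ratio on these data sets.
        self.level = level

    def outgoing(self, payload: bytes) -> bytes:
        # Applied to a message before it enters the communication channel.
        return zlib.compress(payload, self.level)

    def incoming(self, payload: bytes) -> bytes:
        # Applied to a message after it leaves the channel.
        return zlib.decompress(payload)

# A synthetic batch of 32-byte CAN-like messages, as used in the experiments.
transformer = CompressionTransformer(level=1)
batch = b"".join(bytes([i % 256]) * 32 for i in range(512))
compressed = transformer.outgoing(batch)
assert transformer.incoming(compressed) == batch
print(f"{len(batch)} B -> {len(compressed)} B, "
      f"ratio {len(compressed) / len(batch):.3f}")
```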


We defined the compression stack behaviour in Fig. 3. The data points on this figure were calculated with the following formula:

\[ P = \frac{B_i - B_i \cdot R_c}{t} \tag{1} \]

Here, P represents the profit from compressing, B_i is the number of input bytes, R_c is the ratio of compressed bytes to input bytes, and t is the induced latency: the time in ms needed to both compress at the sender and decompress at the receiver. Figure 3 clearly shows that random unstructured data does not compress well. This can be attributed to the Kolmogorov complexity [11], which defines the compressibility of strings based on a description language. As expected, string-based data does considerably better: its complexity is lower, which increases the odds of duplicate substrings, which zlib's DEFLATE algorithm greedily exploits [12]. This suggests that JSON-based messages could benefit greatly from the compression transformer. Figure 3 also shows that string-based messages yield the most compression for the least time at a message size of about 8 KiB - 16 KiB, and that compression should not be used at all for random data. Although comparable to string-based messages, the optimal trade-off point for CAN data lies at 32 KiB. What makes this graph interesting is that this behaviour can be captured in functions, which would represent a transformer model. If this model is then given to the coordinator, it can determine the optimal compression scenario for a given message. This could result in the coordinator adding a batching or message-splitting transformer before the compression transformer to obtain the best bandwidth-to-latency ratio. The results are further clarified in Fig. 4, which shows the compression ratio of each message. Although the optimal ratios for string-based and CAN-based messages are roughly the same, the CAN messages are compressed considerably more. This can be attributed to the large amount of repetition in CAN messages, which allows for improved compression.
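A minimal sketch of how such data points can be produced follows, under the assumption that compression and decompression run back to back on the same machine; absolute numbers will differ per CPU.

```python
import random
import string
import time
import zlib

def compression_profit(payload: bytes, level: int = 1) -> float:
    """Profit P from Eq. (1): bytes saved per millisecond of
    compress-at-sender plus decompress-at-receiver latency."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    zlib.decompress(compressed)
    t = (time.perf_counter() - start) * 1000.0  # induced latency in ms
    r_c = len(compressed) / len(payload)        # compression ratio R_c
    return (len(payload) - len(payload) * r_c) / t

# Sweep message sizes as in Fig. 3, using string-based data drawn from
# 96 readable ASCII characters.
random.seed(42)
charset = string.printable[:96]
for exp in range(10, 21):  # 1 KiB up to 1 MiB
    size = 2 ** exp
    payload = "".join(random.choices(charset, k=size)).encode()
    print(f"{size:>8} B: P = {compression_profit(payload):10.1f} B/ms")
```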

Fig. 3. Profit P per message size for several message types (Message Size (Bytes), 10^2-10^7, vs. Bytes Compressed per ms, for String, Random Bytes and CAN Batch Messages)

Fig. 4. Message compression ratio (Message Size (Bytes), 10^2-10^7, vs. Ratio Compressed Message/Original Message, for String, Random Data and CAN Batch Messages)

Fig. 5. Encryption transformer induced latency (Message Size, 10^0-10^7, vs. Encrypt/Decrypt Speed (B/ms) and Decryption Time (ms))

5.2 Encryption

Due to the often lacking security in IoT systems, we selected an encryption add-on as our second test scenario. Because of the connection-less nature of publish-subscribe communication, many encryption methodologies become infeasible, as return communication is impossible. We therefore chose symmetric-key 256-bit AES with Cipher Block Chaining (CBC): symmetric-key encryption allows the use of pre-shared keys, removing the need for backward communication. The encrypted block size can be calculated using the following formula:

\[ B_o = \left( \left\lfloor \frac{B_i}{16} \right\rfloor + 2 \right) \cdot 16 \tag{2} \]

Here, B_o is the number of output bytes generated by B_i input bytes. The size increases due to the addition of an Initialisation Vector (IV) and the padding to the block size used when encrypting. Fig. 5 shows the transformer's speed over randomised messages of multiple sizes. This speed includes the time required to both encrypt and decrypt, as both will have a significant impact on the network. The measured speeds were averaged over 128 runs. The optimal speed was obtained at 16 KiB messages, reaching 360 B/ms. Because the message format has little impact on encryption, it is considerably easier to create a transformer model for the coordinator.
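A minimal sketch of such a transformer using the pyca/cryptography package is given below; the key-distribution mechanism and the DUST integration are assumed and out of scope here.

```python
import os
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

KEY = os.urandom(32)  # 256-bit pre-shared key; key distribution is out of scope

def encrypt(plaintext: bytes) -> bytes:
    iv = os.urandom(16)                       # fresh IV for every message
    padder = padding.PKCS7(128).padder()      # pad up to the 16-byte block size
    padded = padder.update(plaintext) + padder.finalize()
    enc = Cipher(algorithms.AES(KEY), modes.CBC(iv)).encryptor()
    return iv + enc.update(padded) + enc.finalize()   # IV is prepended

def decrypt(message: bytes) -> bytes:
    iv, body = message[:16], message[16:]
    dec = Cipher(algorithms.AES(KEY), modes.CBC(iv)).decryptor()
    padded = dec.update(body) + dec.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    return unpadder.update(padded) + unpadder.finalize()

msg = os.urandom(1000)
ct = encrypt(msg)
assert decrypt(ct) == msg
# Output size matches Eq. (2): B_o = (floor(B_i / 16) + 2) * 16
assert len(ct) == (len(msg) // 16 + 2) * 16
```

The final assertion shows that prepending a 16-byte IV to block-padded ciphertext reproduces Eq. (2) exactly.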

6 Conclusion

We have defined an architecture, building on top of the original DUST, that successfully improves transparency and code reusability. This improvement moves protocols into channels, where they can be combined with transformers that adapt the message for transmission. With such transformers defined, distributed-system developers no longer need to manually implement message pre- and post-processing and can simply select the necessary transformers.


The transformers in the use cases show that the defined architecture works and has an added benefit for the coordinator: if, for every transformer, enough measurements are taken to build models for different message types, a coordinator can automatically optimise the handling of a message based on its content and size.

7 Future Work

Further research should be done on automatic transformer-model generation. As IoT environments are highly heterogeneous, this should be taken into account when modelling the transformers. With the support for dynamic behaviour in DUST, the coordinator can use these models for context-aware communication optimisation. Given the model shown in Fig. 3, a coordinator could automatically optimise the transmission of CAN messages without any effort from the developer. As this might considerably reduce the number of messages, one could remark that this is not always in the application's interest, for example when data must be transmitted as quickly as possible. Such a requirement should, however, be modelled in the application chain that is provided to the coordinator; for this we refer the interested reader to [3].

Acknowledgements. This work was performed within the SARWS (Real-Time Location-Aware Road Weather Services composed from Multi-Modal Data [10]) Celtic-Next project. The Flemish project was realised with the financial support of Flanders Innovation & Entrepreneurship (VLAIO, project no. HBC.2017.0999). This research received funding from the Flemish Government (AI Research Program).

References

1. Vanneste, S., de Hoog, J., Huybrechts, T., Bosmans, S., Eyckerman, R., Sharif, M., Mercelis, S., Hellinckx, P.: Distributed uniform streaming framework: an elastic fog computing platform for event stream processing and platform transparency. Future Internet 11(7), 158 (2019)
2. Eyckerman, R., Sharif, M., Mercelis, S., Hellinckx, P.: Context-aware distribution in constrained IoT environments. In: Xhafa, F., Leu, F.-Y., Ficco, M., Yang, C.-T. (eds.) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. Lecture Notes on Data Engineering and Communications Technologies, vol. 24, pp. 437–446. Springer, Cham (2019)
3. Eyckerman, R., Mercelis, S., Marquez-Barja, J., Hellinckx, P.: Requirements for distributed task placement in the fog. Internet Things 100237 (2020, in press)
4. Distributed uniform streaming library. https://imec.flintbox.com/#technologies/2ac57b69-253f-4480-a406-3cbd4050a3f7. Accessed 25 June 2020
5. Garcia-Valls, M., Bellavista, P., Gokhale, A.: Reliable software technologies and communication middleware: a perspective and evolution directions for cyber-physical systems, mobility, and cloud computing. Future Gener. Comput. Syst. 71, 171–176 (2017)


6. Cheng, B., Zhao, S., Qian, J., Zhai, Z., Chen, J.: Lightweight service mashup middleware with REST style architecture for IoT applications. IEEE Trans. Netw. Serv. Manage. 15(3), 1063–1075 (2018)
7. Zheng, S., Zhang, Q., Zheng, R., Huang, B.-Q., Song, Y.-L., Chen, X.-C.: Combining a multi-agent system and communication middleware for smart home control: a universal control platform architecture. Sensors 17(9), 2135 (2017)
8. Huybrechts, T., Eyckerman, R., Van den Langenbergh, R., Vanneste, S., Mercelis, S., Hellinckx, P.: DUST Initializr – graph-based platform for designing modules and applications in the revised DUST framework. Internet Things 11, 100229 (2020)
9. Dollimore, J., Coulouris, G.: Distributed Systems: Concepts and Design, 5th edn., p. 1067. Pearson (2011)
10. Celtic-Next SARWS, Real-time location-aware road weather services composed from multi-modal data. https://www.celticnext.eu/project-sarws/
11. Li, M., Vitányi, P.: An Introduction to Kolmogorov Complexity and Its Applications. Texts in Computer Science. Springer, New York (2019)
12. Deutsch, P.: DEFLATE compressed data format specification version 1.3. RFC Editor, Technical report RFC 1951, May 1996

A Survey on the Software and Hardware-Based Influences on the Worst-Case Execution Time

Thomas Huybrechts(B), Siegfried Mercelis, and Peter Hellinckx

IDLab - Faculty of Applied Engineering, University of Antwerp - imec, Sint-Pietersvliet 7, 2000 Antwerp, Belgium
{thomas.huybrechts,siegfried.mercelis,peter.hellinckx}@uantwerpen.be

Abstract. The Worst-Case Execution Time (WCET) of software is an important indicator in time-critical systems and is influenced by soft- and hardware characteristics of the system. In this survey, we discuss seven prominent system parameters that influence the WCET of software running on a processing unit. For each component, we describe its inner workings and interactions with the program logic, and how it influences the execution time, performance and predictability of the WCET.

1 Introduction

The execution time of software is determined by soft- and hardware characteristics of the system. Different inputs trigger different program traces with different execution times. However, consecutive runs with equal input sets can still produce distinct results, as the state of the entire system (software and hardware) differs. Therefore, a unique timing distribution is obtained for each configuration. The maximum value in the distribution determines the Worst-Case Execution Time (WCET). This value is an important indicator in time-critical systems on which real-time constraints are imposed [13]. In this short survey, we discuss the most prominent system parameters that influence the WCET of software running on a processing unit.
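To make the notion of a timing distribution concrete, the following Python sketch performs a naive measurement-based estimate over a set of inputs. Note that the maximum of observed times only underapproximates the true WCET: the worst input and worst hardware state may never be triggered.

```python
import random
import time

def observed_wcet(func, inputs, runs_per_input: int = 100) -> float:
    """Return the maximum observed execution time in ms over all inputs
    and repetitions; repeated runs sample different hardware states."""
    worst = 0.0
    for x in inputs:
        for _ in range(runs_per_input):
            start = time.perf_counter()
            func(x)
            elapsed = (time.perf_counter() - start) * 1000.0
            worst = max(worst, elapsed)
    return worst

# Toy example: different inputs trigger different traces of the same code.
random.seed(0)
inputs = [list(range(1000))] + \
         [random.sample(range(1000), 1000) for _ in range(20)]
print(f"Observed WCET of sorted(): {observed_wcet(sorted, inputs):.3f} ms")
```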

2 Compilers and Instruction Set Architectures

The compiler is a computer program that translates software code from a higher-level programming language to a lower-level language. In this context, we are interested in the translation to binary code that the processing unit on an embedded controller is able to interpret. This step is essential, as computers only understand a limited set of predefined instructions, such as reading/writing a register, adding two values, etc. Based on this instruction set, the compiler has to translate each line of code in the higher-level language into a combination of basic low-level instructions. Depending on the available instructions in the set of the


CPU, one instruction in the source code could require dozens or hundreds of basic instructions. In the end, the final execution time is (partially) determined by the number of actions (basic instructions) that the CPU performs. Each machine instruction takes one or more clock cycles to execute; the number of cycles required depends on the hardware implementation of the instruction set. Some specialised hardware has optimised arithmetic logic units (ALUs) to perform certain calculations in just one clock cycle, for example vector operations, which are commonly used in computer graphics. Reducing the number of cycles needed for an operation increases the throughput, resulting in less stalling of the processor.

The implementation of these instruction sets depends on the architecture of the processor. The most well-known instruction set architecture classifications are the 'Complex Instruction Set Computer' (CISC) and 'Reduced Instruction Set Computer' (RISC) processors. A CISC architecture has an extensive list of specialised instructions. The main concept behind this approach is to provide one instruction to perform a certain (complex) operation. As the instructions more closely match high-level programming constructs, the code density of the binaries increases, since fewer instructions need to be encoded [6]. Therefore, more instructions can be stored in the limited high-performance caches. The hardware can be optimised to execute complex operations in few clock cycles with specialised circuitry; however, this increases the architecture's complexity dramatically. An example of a CISC architecture is the popular x86 processor family.

The instruction set of a RISC architecture focusses on simplifying the overall design, implementing instructions that are optimised in instruction cycle time rather than in the number of instructions. The average cycle time of a RISC instruction is near one cycle. To accomplish this, the 'complex' CISC instructions are split into separate fetch, compute and store operations. As each instruction is optimised in cycle time, it becomes feasible to introduce a pipeline that chains the different stages of instructions to maximise throughput. Many architectures are based on the RISC approach, such as ARM, PowerPC, Atmel AVR and MIPS processors.

Each architecture requires individual binaries, as the instruction syntax, the available instructions (opcodes), the registers, etc. differ between them. As a result, a different compiler is used per architecture to generate binaries from source code. The execution time highly depends on how the code instructions are translated into machine operations. Therefore, different compiler toolchains (from different vendors) for the same architecture can produce varying final binaries due to their implementation and applied optimisations. Next to implementation differences, most compilers have extensive configuration parameters. These configurations mostly impact the optimisations the compiler performs on the code to increase the overall performance (throughput, code density, etc.). However, while these optimisations improve the average case, they may induce adverse consequences for the worst-case


scenarios. The most prominent optimisation configuration is the 'optimisation level' option (-O). Each optimisation level enables a list of optimisation techniques, and switching between these levels will have a major impact on the execution-time distribution of the program!

3 Clock Speed

In a synchronous CPU, all operations are synchronised with each other by a central clock indicating the pace. Each operation has a deterministic timing scheme, determined by the state of the system (e.g. a cache hit requires X cycles, a cache miss X + Y cycles). A clock cycle is the period between two pulses of the clock signal. At the start of a clock signal, all components start their operations; before the next period starts, all components should have finished their operation and be in a stable state, ready for the next cycle. As a result, the maximum clock speed is limited by the slowest component in the chain. However, delays and/or pipeline stalls can be implemented for slower non-continuous operations, e.g. fetching data from slow main memory.

It is important to note that the clock frequency in some processors is adjustable, even dynamically (CPU throttling/boosting). By increasing the clock speed (overclocking), the execution time can be reduced. However, overclocking is limited by the physical characteristics of the electrical components (transistors, etc.) of the processor and results in higher power consumption, increased heat production and a decreased lifespan of the components [1]. Additionally, overclocking will not always improve performance when the application heavily depends on data from slower peripherals that are not influenced by the changed clock speed, such as main memory, IO devices, etc., as the pipeline needs to stall its operation. To determine the WCET, it is safest to assume no speed boost (i.e. CPU boost disabled during benchmarking), as throttling heavily depends on the CPU temperature, unless there are guarantees that the environment and CPU temperatures stay within certain ranges and the thermal design of the system has enough margin to spike. On the other hand, throttling (lowering) the clock speed is a commonly used technique in modern processors to improve the power consumption of the processor. In low-power applications, energy-efficient hardware has embedded sleep capabilities in which certain features of the embedded controller are deactivated, up to a deep-sleep mode where the CPU clock is turned off.

Fig. 1. Memory hierarchy

4 Memory Access Speed

The structure of a program consists of a sequence of instructions and the associated data on which operations are performed by following those instructions. This information is stored in the main memory of the controller. This memory is slow relative to the high clock frequency of the CPU, so valuable clock cycles are lost while the CPU waits for requested data. Therefore, multiple memory layers are integrated into the hardware to improve the overall performance. The layers closest to the processor are the registers and the instruction/data caches. These memory elements are extremely fast, resulting in minimal latency when retrieving instructions and data. However, their size is very limited, due to the high cost of fast memory and the physical-space limitations of embedding it in the proximity of the processing unit to guarantee low latency [21]. The subsequent layers of memory become larger and cheaper to implement; nevertheless, each additional layer is slower to access and thus comes with higher latency. Figure 1 illustrates the memory hierarchy in a modern computer [21]. This multi-level hierarchy provides a good trade-off between speed and storage capacity. At first start-up, as all caches and main memory are clear, data and instructions need to be retrieved from the persistent storage at the outermost layer. The stored content of each layer is managed by the compiler, hardware logic or the operating system (OS). The register utilisation in the processing unit is determined at compile time by the compiler, which decides which registers are used to store variables. The outer storage, such as main memory and disk memory, is managed at runtime by the operating system. For both the registers and the OS-managed memory, the software is able to acquire insight into the memory hierarchy. The different cache-memory layers, however, are managed by the hardware itself, and their existence/functionality is transparent to the software. For determining the WCET, the latency of retrieving data plays a major role, as it can stall the execution pipeline of the CPU.
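The latency gap between the layers can be made visible even from high-level code. The following sketch, assuming NumPy is available, performs the same amount of work with two access patterns; the random pattern defeats the caches and typically runs measurably slower.

```python
import time
import numpy as np

# 64 MiB of int64 values: far larger than typical L1/L2/L3 caches.
data = np.arange(8 * 1024 * 1024, dtype=np.int64)
seq_idx = np.arange(len(data))
rand_idx = np.random.default_rng(0).permutation(len(data))

for name, idx in [("sequential", seq_idx), ("random", rand_idx)]:
    start = time.perf_counter()
    data[idx].sum()   # identical work; only the access pattern differs
    print(f"{name:>10} access: {time.perf_counter() - start:.3f} s")
```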

5 Scheduling and Pre-emption

Systems that run multiple tasks or applications (with or without an OS) need a controlling mechanism to avoid chaotic execution of those tasks. This mechanism is called the scheduler [9]. The scheduler determines which tasks are executed on the CPU and when. Each task is placed on a queue when it is ready for execution; the order in which tasks are executed is based on their assigned priority. Scheduling algorithms can be categorised as pre-emptive or non-pre-emptive. Pre-emptive schedulers are able to switch between tasks


even when the current one is not finished [9], for example round-robin scheduling and shortest-remaining-time-first. A non-pre-emptive scheduler only switches tasks when the previous one has finished, e.g. shortest-job-first and first-come-first-served scheduling. A pre-emptive scheduling approach can increase the throughput of the processor: if a program is waiting for I/O operations, the processor can execute another task to avoid idling during the wait. However, pre-empting a task can lead to misbehaviour when it happens during a critical section of the code. A critical section is a code segment containing operations on shared resources; during pre-emption, another task could manipulate the state of the shared resource, leading to unexpected false results, i.e. race conditions.

In cyber-physical systems, where real-time behaviour is an important constraint, a real-time scheduler is used. Each task in these systems has a timing constraint and must respond within the deadline it imposes. Therefore, the functional as well as the temporal behaviour needs to be guaranteed [25]. To schedule tasks in a real-time environment, the scheduler needs to know the deadline and the WCET of each task, as it must try to find a schedule that runs each task to completion before its respective deadline. The scheduler itself, however, adds overhead to the system, because it needs computational resources to evaluate and schedule tasks. Additionally, pre-empting a running task requires copying the context (i.e. registers, program counter) onto the stack and restoring another task's context; this is referred to as context switching [7]. The scheduler should account for this phenomenon when scheduling tasks.

Tasks can be scheduled periodically, sporadically or aperiodically. Periodic tasks are repeated with a fixed time interval T. Sporadic tasks are recurring tasks with a minimal interval T between two executions. Aperiodic tasks have no minimal recurrence time, but can be released at any given time by an event or system interrupt. When an interrupt is accepted, the context of the running task is switched by the interrupt handler to trigger the appropriate Interrupt Service Routine (ISR); after interrupt handling, the scheduler is again responsible for scheduling the next task. Event-driven scheduling, for example, reacts on events such as task releases, completions and priority changes, allowing the scheduler to directly select a new task to perform. For quantum-driven scheduling, the scheduling process is triggered per time quantum (Q) [9]; these quanta are time slices in which tasks are scheduled. As a result, the latency of scheduling new tasks is higher compared to event-driven scheduling, but the approach is easier to implement. A minimal round-robin simulation is sketched below.
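The following toy simulation, with illustrative names of our own choosing, shows quantum-driven round-robin pre-emption, including an optional context-switch cost as discussed above.

```python
from collections import deque

def round_robin(tasks, quantum, switch_cost=0):
    """Quantum-driven pre-emptive round-robin: each task runs for at most
    `quantum` time units, is then pre-empted and re-queued. `switch_cost`
    models the context-switching overhead."""
    queue = deque(tasks.items())          # (name, remaining execution time)
    timeline, now = [], 0
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)
        timeline.append((name, now, now + run))
        now += run + switch_cost          # pay the context-switch overhead
        if remaining > run:
            queue.append((name, remaining - run))  # pre-empted: re-queue
    return timeline

for entry in round_robin({"T1": 5, "T2": 3, "T3": 8}, quantum=2):
    print(entry)
```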

6 Shared Resources

When introducing multi-core processors into real-time systems, all available resources need to be shared among multiple processes running on different cores. Therefore, the longest trace in the software does not implicitly imply the worst-case path in the code [22,23]. To get access to different resources, the cores


are connected to those resources through shared busses. As these busses are shared, only one data stream can be sent at any time. To give every resource access to the bus, an arbitration policy is put on the bus; this policy allocates time slots in which each resource may communicate on the bus. Examples of arbitration policies are Round Robin (TDM), lottery-based arbitration and fuzzy-logic arbitration [2].

The complexity of calculating the WCET for multi-core systems increases tremendously compared to single-core systems, and the synchronisation mechanism used has a significant impact on the performance [16]. The different cores can fetch data simultaneously from memory. A standard L2 cache shares its memory between the different cores of the processor, and each core has access to the entire cache. As a result, the content needs to be protected against concurrent access in critical code sections to avoid race conditions. Locking access to and from a resource is achieved by applying a mutex to it: the mutex guards access so that only one thread can read from or write to the shared resource (a minimal illustration is sketched at the end of this section). Locking any resource requires multiple steps on different levels, such as hardware context switching, the ring (permission) levels on the CPU and the locking instruction itself [16]. The overhead of locking resources can go up to a factor of 50 when inter-process sharing is used that requires switching to kernel level, compared to having only one process for all threads, which is the fastest as no context switching is needed [16].

To calculate the WCET, each possible state of the cache must be known at each point in time for all possible program traces. Since multiple applications with different program contexts may run in parallel on different cores, all influences (e.g. pre-emption, resource sharing) have to be taken into account to obtain sound approximations of the execution-time bounds. However, this approach becomes infeasible, as an immense number of states needs to be evaluated. Hahn et al. [12] propose timing compositionality, which decomposes the architecture into minor hardware and software systems whose timing contributions are calculated independently of the other systems; the WCET is then obtained from the timing contributions of each subsystem. Two approaches are proposed to achieve compositionality [12]. The first is finding a compositional analysis for existing systems. Resource dependencies make it difficult or even impossible to apply a compositional decomposition to every architecture, but modifications can be applied to existing hardware to support decomposition; Chisholm et al. [8], for instance, propose partitioning the shared cache so that each core has access to an assigned part of it. The second approach is to develop new architectures that are decomposable into subsystems whose timing distributions can be easily calculated [24,26]. However, this limits the benefits of having multiple cores, i.e. shared context between cores [3]. The challenge is to design compositional systems with performance similar to that of current ones; no method has yet been found to perform such a compositional analysis.
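As promised above, a minimal illustration of mutex-guarded access to a shared resource follows (in Python, purely for illustration; the kernel-level overheads discussed above are not visible at this abstraction level).

```python
import threading

counter = 0
lock = threading.Lock()

def worker(iterations: int) -> None:
    global counter
    for _ in range(iterations):
        with lock:          # critical section: only one thread at a time
            counter += 1    # an unprotected read-modify-write could race

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # deterministically 400000; without the lock, updates may be lost
```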

7 Caches

Caches are small memory elements integrated close to the processor unit. The fast cache memory fetches data and instructions from main memory. As a result, it provides a tremendous improvement in the average execution time compared to the slower RAM or storage devices. However, fast cache memory has a high cost per bit, and the space close to the processor is limited; therefore, the maximum cache size is limited to a few KBytes or MBytes to keep the cost affordable. Each cache line contains a small block of multiple data elements (data or instructions). These cached blocks are very efficient due to the sequential order of instructions in the program flow, a concept referred to as the principle of locality [15,21]. When the processor requests an instruction, it first checks whether the instruction is present in the cache. A cache hit occurs when the instruction is found; it is then directly accessible to the processor, in contrast to the slow main memory, as shown in Fig. 1. In the opposite case, the cache misses and the instruction needs to be retrieved from a higher level of the memory hierarchy. The total time to fetch a missed instruction equals the cache lookup time plus the retrieval time from the higher level. Each cache miss adds overhead to the execution time, which results in a higher variance in the program's execution-time distribution [12,21].

Multi-core processors use multiple layers of cache memory. A first level of cache is placed close to each core and has functionality similar to single-core caches. A second layer is shared between cores to improve the performance of multi-threaded applications working on the same data, or of communication between cores through these caches. To calculate the WCET, each possible cache state must be known at each given point in time for all possible program traces. As multiple programs can execute concurrently on different cores, all influences have to be taken into account to obtain sound approximations of the WCET.

When data is retrieved from main memory, a fixed-size block containing the requested data is copied into cache memory. Using data blocks improves performance due to the temporal and spatial locality of references. Temporal locality means that recently referenced resources are likely to be referenced again soon; therefore, we want to cache resources that have been accessed recently. Spatial locality means that the chance of referencing a resource located close to a recently referenced resource is higher. Cache blocks exploit these properties by copying the referenced resource together with its immediate neighbours into the cache.

When a cache miss occurs and the required data must be retrieved from main memory into cache memory, a previously loaded block has to be evicted, since the size of the cache memory is limited. The selection of this block is done according to a replacement strategy, which has an impact on the performance of the system. According to Paun et al. [18], the replacement strategies most often used in modern architectures are Least-Recently Used (LRU), Pseudo-LRU, First-In First-Out and Most-Recently Used. To determine the WCET, we need


to know whether a cache access will result in a hit or a miss. Cache analysis determines the cache behaviour of a program on a set of inputs with an unknown initial cache state [21]; the goal is to determine the probability that certain resources are located within the cache. As these replacement policies are implemented in hardware, it becomes difficult or even impossible to model the state of the cache at any given point in time. Since the timing behaviour of the code depends on the state of the cache, it is important to note that starting from an empty cache will not result in the worst-case behaviour [19]! To incorporate the influence of the initial cache state, a technique called 'cache pollution' can be applied [10]: arbitrary code is run before each measurement to acquire a non-empty cache state. As a result, more realistic execution states of the benchmarked software are achieved during measurement. The sketch after this paragraph illustrates the LRU policy and the effect of the initial cache state.

An alternative to 'classic' cache memory is scratchpad memory. This memory array maps objects in the last stage of compilation and is thus determined by the user or compiler [4,20]. As the contents of the scratchpad are known upfront, there is no need to check the availability of data or instructions in its memory, which requires less circuitry than determining a cache hit or miss [4]. The predictability of scratchpad memory makes WCET analysis more feasible, as the behaviour of this memory is under software control instead of being hardware-controlled.
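The following sketch simulates a fully associative LRU cache over an access trace (a simplification: real caches are set-associative) and shows how 'polluting' the cache before the measured trace changes the observed miss count; the block addresses are arbitrary examples.

```python
from collections import OrderedDict

def simulate_lru(trace, capacity):
    """Count hits and misses of a fully associative LRU cache over a
    sequence of block addresses."""
    cache = OrderedDict()
    hits = misses = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)        # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)   # evict the least recently used
            cache[block] = None
    return hits, misses

trace = [0, 1, 2, 3, 0, 1, 2, 3] * 4
print(simulate_lru(trace, capacity=4))              # cold start: 4 misses
print(simulate_lru([7, 8, 9, 6] + trace, capacity=4))  # polluted: more misses
```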

8 Pipelines

Each program instruction is a combination of different stages that are performed: fetching the instruction, decoding it, executing it, accessing memory and writing the results back to memory [17]. As each instruction involves the same stages, performed on dedicated parts of the processor logic, it is possible to run these stages for different instructions in parallel. This results in instruction-level parallelism and provides higher processor efficiency, as each stage of the processor is utilised as constantly as possible. The pipeline is comparable to a factory belt, where each step of the process is performed at one dedicated stage on the belt and each unit passes the stages consecutively. By splitting instruction execution into several stages and executing these stages in parallel, the overall (average) throughput increases tremendously. Additionally, some processors can start multiple instructions per cycle (e.g. superscalar processors), and some may even execute instructions out of order to further improve throughput based on the availability of data [17]. However, these optimisations introduce pipeline hazards that carry large penalties for the WCET, namely data, resource, control and instruction-cache hazards. When hazards occur, the result can be stalling or flushing the pipeline: when data or resources are not available, because of a dependency or a cache miss, the execution has to halt temporarily until they become available.


During the pause of a stage, a delay or 'bubble' is introduced in the pipeline [17]. This bubble is an empty slot that travels along the line and produces no value for the stage in which it is located; each pipeline stall introduces new bubbles that delay the entire chain. To further improve the performance of instruction pipelines, techniques have been introduced that lower the chance of pipeline stalls or minimise their impact. A first technique is branch prediction for conditional operations [17]. The pipeline needs to wait for the result of the condition to determine which trace to continue, so the processor tries to predict which branch will be taken when a conditional statement is encountered. If the guess was correct, the pipeline benefits from a speed gain, as it could continue ahead; if the guess was incorrect, the pipeline has to 'go back' and continue with the other branch, and it is flushed (i.e. cleared), as all work performed for the wrong branch was irrelevant. Another technique to minimise the impact of pipeline stalls is out-of-order execution [17]. The pipeline decides in which order to execute instructions by taking their dependencies into account: if data is not available, due to a cache miss, etc., the pipeline can use another instruction to fill in the 'bubble' and maximise throughput. Each instruction is queued until the target ALU is free for execution. As each logical CPU unit has multiple execution units, the instruction pipeline is able to process two instruction streams in parallel that eventually merge into one set. A first-order timing model of pipelining and stalls is sketched below.

When performing WCET analysis, all the previously mentioned techniques add additional layers of interaction and complexity. Isolating all these subsystems first and taking their most pessimistic assumptions in a static analysis results in a solution that is too pessimistic and infeasible to create [5]. As for hybrid approaches, the sizes of the hybrid blocks play an important role in acquiring sound results [13,14]. The interactions and states of each stage in the pipeline are determined by the previous and next instructions during execution, so just profiling atomic hybrid blocks in isolation does not guarantee the worst-case performance, as the state of the pipeline could be worse at runtime [5,11]. This principle is referred to as 'timing anomalies', where local WCETs are not necessarily part of the actual global worst-case scenario [22].
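The promised first-order model can be written down directly; the numbers below are illustrative, not taken from a real processor.

```python
def pipeline_cycles(n_instructions: int, stages: int, stalls=()) -> int:
    """First-order in-order pipeline model: the first instruction needs
    `stages` cycles, every following one retires one cycle later, and
    each stall inserts bubbles that delay the entire chain."""
    return stages + (n_instructions - 1) + sum(stalls)

print(pipeline_cycles(100, 5))            # ideal: 104 cycles
print(pipeline_cycles(100, 5, [20, 20]))  # two 20-cycle cache-miss stalls: 144
print(100 * 5)                            # unpipelined upper bound: 500 cycles
```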

9 Conclusion

In this survey paper, we discussed seven prominent components within processing units that have repercussions on code behaviour. For each part, we looked into its inner workings and how it interacts with the software during execution. We then explained the influence each component has on the execution time of code and how it can negatively impact the performance and predictability of the WCET.

Acknowledgements. This research was supported by Flanders Make, the strategic research centre for the manufacturing industry. This research received funding from the


Flemish Government under the "Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen" programme.

References

1. Ahn, Y.: Real-time task scheduling under thermal constraints. Ph.D. thesis, Texas A&M University (2010)
2. Bajaj, P., Padole, D.: Arbitration schemes for multiprocessor shared bus. In: Chiaberge, M. (ed.) New Trends and Developments in Automotive System Engineering. IntechOpen, Rijeka (2011). https://doi.org/10.5772/16197
3. Balasubramonian, R., Jouppi, N.P., Muralimanohar, N.: Multi-core cache hierarchies. Synth. Lect. Comput. Archit. 6(3), 1–153 (2011)
4. Banakar, R., Steinke, S., Lee, B.S., Balakrishnan, M., Marwedel, P.: Scratchpad memory: a design alternative for cache on-chip memory in embedded systems. In: Proceedings of the Tenth International Symposium on Hardware/Software Codesign, CODES 2002 (IEEE Cat. No. 02TH8627), pp. 73–78. IEEE (2002)
5. Betts, A., et al.: WCET coverage for pipelines. Technical report, Real-Time Systems Research Group, University of York and Institute of Computer Engineering, Vienna University of Technology (2006)
6. Bhandarkar, D., Clark, D.W.: Performance from architecture: comparing a RISC and a CISC with similar hardware organization. In: Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS IV, pp. 310–319. Association for Computing Machinery, New York (1991)
7. Carbone, J.A.: Reduce preemption overhead in real-time embedded systems (2016). https://www.microcontrollertips.com/1581-2/
8. Chisholm, M., Ward, B.C., Kim, N., Anderson, J.H.: Cache sharing and isolation tradeoffs in multicore mixed-criticality systems. In: 2015 IEEE Real-Time Systems Symposium, pp. 305–316. IEEE (2015)
9. De Bock, Y.: Hard real-time scheduling on virtualized embedded multi-core systems. Ph.D. thesis, Universiteit Antwerpen (2018)
10. Deverge, J.F., Puaut, I.: Safe measurement-based WCET estimation. In: 5th International Workshop on Worst-Case Execution Time Analysis (WCET 2005). Schloss Dagstuhl-Leibniz-Zentrum für Informatik (2007)
11. Engblom, J., Jonsson, B.: Processor pipelines and their properties for static WCET analysis. In: International Workshop on Embedded Software, pp. 334–348. Springer (2002)
12. Hahn, S., Reineke, J., Wilhelm, R.: Towards compositionality in execution time analysis: definition and challenges. ACM SIGBED Rev. 12(1), 28–36 (2015)
13. Huybrechts, T., De Bock, Y., Haoxuan, L., Hellinckx, P.: COBRA-HPA: a block generating tool to perform hybrid program analysis. Int. J. Grid Util. Comput. 105–118 (2019). https://doi.org/10.1504/IJGUC.2019.098211
14. Huybrechts, T., Mercelis, S., Hellinckx, P.: A new hybrid approach on WCET analysis for real-time systems using machine learning. In: 18th International Workshop on Worst-Case Execution Time Analysis (WCET 2018). Schloss Dagstuhl-Leibniz-Zentrum für Informatik (2018)
15. Lv, M., et al.: A survey on static cache analysis for real-time systems. Leibniz Trans. Embed. Syst. 3(1), 48 (2016). https://doi.org/10.4230/LITES-v003-i001-a005


16. Mercelis, S.: A systematic multi-layered approach for optimizing and parallelizing real-time media and audio applications. Ph.D. thesis, Universiteit Antwerpen (2016)
17. Patterson, J.R.C.: Modern microprocessors: a 90-minute guide! (2016). http://www.lighterra.com/papers/modernmicroprocessors/
18. Paun, V.A., Monsuez, B., Baufreton, P.: On the determinism of multi-core processors. OASIcs, pp. 32–46. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2013)
19. Petters, S.M.: Worst case execution time estimation for advanced processor architectures. Ph.D. thesis, Technische Universität München (2002)
20. Puaut, I., Pais, C.: Scratchpad memories vs locked caches in hard real-time systems: a quantitative comparison. In: 2007 Design, Automation & Test in Europe Conference & Exhibition, pp. 1–6. IEEE (2007)
21. Reineke, J.: Caches in WCET analysis. Ph.D. thesis, Saarland University (2008)
22. Reineke, J., et al.: A definition and classification of timing anomalies. In: 6th International Workshop on Worst-Case Execution Time Analysis (WCET 2006). Schloss Dagstuhl-Leibniz-Zentrum für Informatik (2006)
23. Rihani, H., Moy, M., Maiza, C., Altmeyer, S.: WCET analysis in shared resources real-time systems with TDMA buses. In: Proceedings of the 23rd International Conference on Real Time and Networks Systems, pp. 183–192 (2015)
24. Schoeberl, M., et al.: T-CREST: time-predictable multi-core architecture for embedded systems. J. Syst. Architect. 61(9), 449–471 (2015)
25. Sha, L., et al.: Real time scheduling theory: a historical perspective. Real-Time Syst. 28(2–3), 101–155 (2004)
26. Ungerer, T., et al.: parMERASA – multi-core execution of parallelised hard real-time applications supporting analysability. In: 2013 Euromicro Conference on Digital System Design, pp. 363–370. IEEE (2013)

Intelligent Data Sharing in Digital Twins: Positioning Paper

Thomas Cassimon(B), Jens de Hoog, Ali Anwar, Siegfried Mercelis, and Peter Hellinckx

IDLab - Faculty of Applied Engineering, University of Antwerp - imec, Sint-Pietersvliet 7, 2000 Antwerp, Belgium
{thomas.cassimon,jens.dehoog,ali.anwar,siegfried.mercelis,peter.hellinckx}@uantwerpen.be

Abstract. Digital Twins are an important innovation that will power the next generation of smart devices in many different contexts. Currently, the scientific literature does not agree on what exactly constitutes a Digital Twin. Rather than adding another definition to an already long list, we analyze the current literature on Digital Twins and attempt to relate the many different definitions of Digital Twins to each other, identifying not only the similarities but also the differences in what different companies and researchers view as a Digital Twin. We also discuss some of the key technologies that enable the creation of Digital Twins (DTs). Using the insights from these sections, we propose a new line of research and development, focusing on areas that are not currently represented in DT literature. Finally, we discuss the merits as well as the flaws of our classification of DTs.

1 Introduction

In recent years, the concept of Digital Twins has seen a massive increase in popularity [1]. This increase has been caused by growth in the number and quality of various data-driven techniques, systems and frameworks, as well as in the general availability of data. In this paper, we take a closer look at DTs and attempt to identify the differences and similarities between different views on DTs (Sect. 2), as well as the most important tasks a Digital Twin needs to fulfill (Sect. 3). Next, we look at some of the most important technologies that have enabled this Digital Twin revolution in Sect. 4, and consider some current use-cases for DTs (Sect. 5). Using these insights, we outline a direction for future research in Sect. 6. Finally, we conclude the paper with a discussion of the proposed classification of DTs and a brief overview of the proposed research.

2 Characteristics

There are many definitions of what a Digital Twin is. This is demonstrated by [2], which lists 16 different definitions of the Digital Twin concept, and by [3], which tabulates the meaning of Digital Twins according to eight large companies. Because of this large


variety of definitions, we conclude that a "Digital Twin" is not a singular concept with an exact definition, but rather a paradigm that can be used to achieve improvements in all phases of the lifecycle of an object. When analyzing existing uses of DTs, we notice three main dimensions along which DTs differ: (i) whether a DT is a collection of data – sometimes referred to as a Digital Shadow [4] – or a collection of models [5]; (ii) where the DT lives – at the edge [6] or in the cloud [7]; and (iii) whether the DT focuses on the "Use" phase of the object's lifecycle [8] or on its "Development" phase (such as those used by Dassault Systèmes [9]). This results in three dimensions along which we can classify DTs, shown in Fig. 1. We also notice a significant amount of interest in DTs that live through the entire lifecycle of a product, from its inception to its destruction. These Digital Twins are often called "Cradle-to-Cradle" or "Cradle-to-Grave" models of an object [2], and can be seen as moving along the design-product axis throughout their lifecycle. In Sect. 5, we discuss this space in further detail, using three different use-cases as examples.

Fig. 1. Three dimensions along which we can classify Digital Twins.

3 Role of a Digital Twin

In [7], the three core tasks of each Digital Twin are identified as "Communication, Computation and Control". We believe these three tasks are missing two vital components of modern DTs: Sensing and Sharing. We also introduce "Storage" as an auxiliary task. Figure 2 shows the different roles of a DT in a flow chart. The first task a DT must execute is "Sense", where the DT queries its sensors and databases for any available data. The next step is "Understand", where the DT combines all the information it retrieved from its sensors with any known and previous state it has, using sensor-fusion algorithms. After creating a local state, using only its own data, the twin shares this local state with any nearby Digital Twins in the "Share" step. Upon


receiving additional information from nearby DTs in the "Share" step, the DT executes another batch of sensor-fusion algorithms, this time fusing its own local state with that of the other, nearby DTs; this is done in the second "Understand" phase. When the DT finally has a complete model of itself and its surroundings, it is ready to begin planning its next action in the "Plan" phase. After completing the planning phase, the DT executes the planned action(s) during the "Act" phase. The final step in the loop is "Store", where the DT stores any information it acquired or generated during this loop, allowing it to incrementally build a model of itself and the world around it.

Fig. 2. Main responsibilities of a Digital Twin.
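This loop can be summarised in code. The sketch below is a minimal, runnable rendering of Fig. 2 with placeholder method bodies; all names are ours, and a real twin would plug in sensor drivers, fusion algorithms and planners.

```python
class DigitalTwin:
    """Minimal sketch of the Sense-Understand-Share-Understand-Plan-Act-
    Store loop from Fig. 2; every method body is a placeholder."""

    def __init__(self, name: str):
        self.name = name
        self.history: list[dict] = []       # incrementally built world model

    def sense(self) -> dict:
        return {self.name: 42.0}            # placeholder sensor reading

    def understand(self, *states: dict) -> dict:
        fused: dict = {}
        for state in states:                # trivial "fusion": merge states
            fused.update(state)
        return fused

    def share(self, neighbours: list["DigitalTwin"]) -> list[dict]:
        # Exchange local states with nearby twins (pull-based here).
        return [n.understand(n.sense()) for n in neighbours]

    def plan(self, world: dict) -> str:
        return f"act on {sorted(world)}"

    def act(self, action: str) -> None:
        print(f"{self.name}: {action}")

    def store(self, world: dict) -> None:
        self.history.append(world)

    def step(self, neighbours: list["DigitalTwin"]) -> None:
        local = self.understand(self.sense())    # Sense + local Understand
        remote = self.share(neighbours)          # Share
        world = self.understand(local, *remote)  # global Understand
        self.act(self.plan(world))               # Plan + Act
        self.store(world)                        # Store

robot, belt = DigitalTwin("robot"), DigitalTwin("belt")
robot.step(neighbours=[belt])
```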

4 Key Enabling Technologies

In this section, we discuss some of the key technologies necessary for the successful development and operation of Digital Twins.

4.1 Data Fusion

In many Digital Twin applications, data fusion is a process of great importance. According to [10], an application framework for DTs consists of three main layers: (i) a physical space, (ii) a virtual space, and (iii) an information-processing layer in between. The latter forms a bidirectional mapping in terms of data sharing between the two spaces. The information-processing layer contains three main parts: (i) data storage, (ii) data processing, and (iii) data mapping. While these parts are all equally important in the overall architecture, we focus only on the data-processing part here. In this part, the acquired raw data is processed into clean, workable data that is ready to be used by dependent applications. Part of this processing involves data fusion, in which different data elements are fused with each other; this can happen at different levels: data level, feature level or decision level. This corresponds with the distinct levels mentioned in [11,12] and [13]. Data fusion is a technology that impacts both the model-data axis and the edge-cloud axis: more complex data-fusion strategies require more compute power and push DTs away from the edge towards the cloud, while on the other hand these strategies facilitate more complex models, moving DTs towards the middle of the model-data axis.


4.2 Planning

In this section, we touch on two different areas that are important for the planning aspect of DTs: knowledge graphs (Sect. 4.2.1) and scheduling algorithms (Sect. 4.2.2). In DTs, knowledge graphs can be used to make decisions at higher abstraction levels, which in turn allows DTs to control their actuators in better ways [6]. Scheduling algorithms, on the other hand, are very important to make optimal use of the available actuators and to allow DTs to perform simple load-balancing among themselves where possible [6].

4.2.1 Knowledge Graphs

Ehrlinger et al. [14] define a knowledge-based system, or a knowledge graph, as a system that "acquires and integrates information into an ontology and applies a reasoner to derive new knowledge". The knowledge acquired by a knowledge graph is stored in a knowledge base, which can then be used by a reasoning engine to query the existing knowledge or generate new knowledge. An ontology is a formal description of a set of related concepts and the relations between them; using an ontology, we can reason about these concepts on a semantic level. An example of how ontologies can be used to monitor data quality is demonstrated in [15].

4.2.2 Scheduling

Knowledge graphs are an important tool for representing the set of "skills" that a machine possesses, but it is equally important to be able to utilize these skills to their fullest extent. This is demonstrated in the example given by Rosen et al. [6], where different machines possess the same skills, allowing for intelligent scheduling of operations on a product. In this example, some skills are known by only a single machine, creating bottlenecks. This clearly shows a need for intelligent scheduling of tasks across Digital Twins.

4.3 Modelling and Simulation

Another important aspect of Digital Twins, not explicitly covered by any of their responsibilities but present in many of them, is modelling and simulation. DTs in different domains, with different use-cases, require different simulations and models. For example, Schleich et al. [9] showcase models and representations aimed at the aerospace industry, while a different set of models can be seen in [6], whose models are more suited to the production and manufacturing of goods. The differences between these models underline the importance of good metadata and knowledge description, since these are requirements if DTs (and thus models) from different industries want to collaborate and share information. The presence of models in most DT applications implies that DTs will typically never be at either extreme of the model-data axis, but always somewhere in between, since models also need data to run and update.


5 Use-Cases

In this section, we discuss a number of use-cases for Digital Twins, highlighting some of the areas where DTs have been deployed successfully.

5.1 Product Designs

An important use-case of Digital Twins is assisting the designers of a product in their design cycle. By incrementally adding information to a DT of a product under design (such as electrical schematics, mechanical models, thermal models, ...), potential design issues can be identified earlier and prevented from causing problems later on. For instance, Tao et al. [16] analyzed an example of this in their paper, studying the potential of DTs in the context of the design and manufacturing of a bicycle. They identify three stages in DT-driven development: Conceptual Design, Detailed Design and Virtual Verification. These stages are all executed in parallel with the production of the first prototype, allowing for rapid feedback and iteration, which results in a more agile design process [17].

5.2 Asset Management

In 2018, Microsoft announced their Azure Digital Twins [18] platform, an application of Digital Twins for asset management. To ensure the re-usability of their framework, Microsoft uses an ontology¹ to model the domain-specific concepts, making their DT framework domain-agnostic. In Microsoft's framework, users employ an ontology to define a hierarchy of their assets, and the framework then places these assets in their correct positions in the hierarchy. Microsoft uses the Digital Twin Definition Language (DTDL) [19] to define ontologies; DTDL is a variant of JSON-LD [20] with a defined meta-model. By default, Azure Digital Twins creates a data twin and does not include any models; it does, however, allow user-defined models to run whenever telemetry data enters the system, updating the spatial intelligence graph. The use of models in this way makes Azure Digital Twins a data-model hybrid DT.

5.3 Manufacturing

The final use-case for Digital Twins we discuss is the optimization of a manufacturing process. In their paper, Rosen et al. provide an example of how DTs can be used in this context [6]. They describe a production line consisting of four elements: a loading and unloading robot, a belt-based transport system, a milling machine and a drilling machine. Each of these systems maintains a DT of itself, and each product also maintains a Digital Twin of itself. The paper demonstrates how DTs can be kept in sync as they move through the manufacturing process, how DTs allow the manufacturing equipment to determine the most optimal manufacturing process, and how DTs can be used to work around issues with the manufacturing equipment.

¹ https://docs.microsoft.com/en-us/previous-versions/azure/digital-twins/concepts-objectmodel-spatialgraph


5.4 Classification of Use-Cases

In this section, we position the use-cases discussed earlier in the 3D space defined in Fig. 1. Table 1 gives a brief overview of the classification of each use-case.

Table 1. Overview of the classification of use cases

                 | Data-model | Design-product | Edge-cloud | Reference
Product designs  | Model      | Design         | Cloud      | [16]
Asset management | Data       | Product        | Cloud      | [18]
Manufacturing    | Data       | Product        | Edge       | [6]

We note that there is no requirement for a Digital Twin to be a singular point in this space; it is possible for DTs to occupy a line, a plane or even a volume. It is also possible for DTs to move from one position to another during their lifecycle; a good example is a design twin that is shipped along with the final product it represents in order to become a product twin. The DTs proposed by Tao et al. [16] occupy part of the cloud-twin plane. While Tao et al. do not consider this dimension, based on their proposed use-cases it is reasonable to assume that their twins would be placed in the cloud. Their applications consider both the collection of data and the use of models, but focus more on the use of models; because of this, we placed them along the top of the model-data axis, but still let them occupy a section of this axis. Because of the proposed approach to product development, with its focus on iteration and early production of prototypes, the designed DTs are both design and product twins. When placing Azure Digital Twins in this space, we notice a clear focus on placing Digital Twins in the cloud. We also notice that Azure Digital Twins is made with product twins in mind, with much less focus on design twins. The Azure Digital Twins platform itself is centered around the collection of data, but it also allows users to include arbitrary models that use the incoming data; for this reason, we placed Azure Digital Twins as a line along the model-data axis, closer to the data-twin side. Finally, we consider the example from Rosen et al. [6]. In their example, Rosen et al. mention that the data is stored on the manufacturing machines and product palettes; this is an important distinction between this work and the others, since it means that their DTs are considered edge twins. The example only considers product twins, the twins being designed for both manufacturing machines and products. Rosen et al. also focus mostly on the collection and updating of data; their twins do include models of the capabilities of each machine to allow for intelligent planning and orchestration, so we have placed them close to the data-twin side, with a small amount of modelling.


Fig. 3. Digital Twins discussed in Sect. 5 classified along the three axes demonstrated in Fig. 1. Labels around the origin were omitted for clarity.

6 Positioning

After analyzing the use-cases mentioned in Sect. 5, we noticed that the "sharing" task of Digital Twins is only present in a very limited capacity [2, 6, 16]. The sharing of data is especially important in applications where DTs are situated at the edge, since moving all the data generated by edge twins from the edge to the cloud and back would require a tremendous amount of networking resources. This can be remedied by using direct inter-twin communication (e.g. by making use of the DUST framework [21]), but this is currently quite rare, as outlined in [1]. While expanding the amount of data shared between DTs is one thing, it is important to remember that on edge devices we do not always have the resources to process many different, and often complex, data streams. In order to prevent this data explosion, we will need to devise intelligent sharing strategies that share specific pieces of data with specific DTs. These intelligent sharing strategies can be facilitated by the "understand" tasks that are executed before and after the "share" task. We can use the local "understand" task to attach metadata to our data streams, such as data quality as described by [15]. Using this metadata, we can determine which pieces of information are relevant for which other DTs, and share only the relevant information with the relevant DTs. The DTs receiving this data can also make use of the attached metadata to make more intelligent decisions regarding their global "understand" task, such as choosing to ignore low-quality data streams. In their work, Lu et al. highlight the importance of data quality for smart manufacturing applications [1], mentioning that low-quality data generates unusable results in industrial data analysis. A minimal sketch of such a metadata-driven sharing policy is given below.
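As an illustration, the following Python sketch shows what a quality-annotated sharing policy could look like. It is a minimal sketch under our own assumptions: the StreamSample and SharingPolicy classes, the topic names and the quality threshold are all illustrative and not part of any existing DT framework.

```python
from dataclasses import dataclass

@dataclass
class StreamSample:
    """A data sample annotated by the local 'understand' task."""
    topic: str        # e.g. "wheel_speed", "position" (illustrative names)
    value: float
    quality: float    # data-quality score in [0, 1], cf. [15]

class SharingPolicy:
    """Decides which samples are shared with which peer twins."""
    def __init__(self, interests, min_quality=0.5):
        # interests maps a peer twin id to the topics it cares about
        self.interests = interests
        self.min_quality = min_quality

    def recipients(self, sample: StreamSample):
        # Share only relevant, sufficiently high-quality data with specific
        # twins, instead of moving every stream from the edge to the cloud.
        return [peer for peer, topics in self.interests.items()
                if sample.topic in topics and sample.quality >= self.min_quality]

policy = SharingPolicy({"milling_dt": {"wheel_speed"}, "planner_dt": {"position"}})
print(policy.recipients(StreamSample("position", 12.3, quality=0.9)))  # ['planner_dt']
```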

7 Conclusion and Future Work

In this work, we propose a classification model for Digital Twins. We selected well-known papers in this area and classified the twins presented in them according to our classification model. We positioned our vision of Digital Twins within the classification
model. We see that the classification model provides a clear view and a good starting point for classifying different Digital Twin architectures. However, an important issue with the proposed classification model is that parts of the space generally remain empty. An example of this is edge-design twins: since there is generally no lack of resources at design time, the systems used while designing products will generally not be classified as edge devices. While this does not seem to produce any immediate issues, it is an imperfection of the model, showing that there is still room for improvement. Additionally, a more extensive review can be carried out, with which our proposed classification model can be verified and validated. Another point of improvement for our model is the trade-off between the classification of a Digital Twin as a singular point versus a line on a specific axis. This can be especially difficult along the model-data axis; since models and data are closely intertwined, it can be difficult to identify a Digital Twin as focusing on one over the other. Finally, we also believe that the "sharing" task is often ignored or minimized in many works related to Digital Twins. We believe this aspect to be critical in building scalable Digital Twins at the edge, allowing for efficient information exchange between different twins.

Acknowledgements. This work was performed within the SSAVE (Shared Situational Awareness for VEssels) project. The project was realised with the financial support of Flanders Innovation & Entrepreneurship (VLAIO) and the Blue Cluster.

References

1. Lu, Y., Liu, C., Wang, K.I.-K., Huang, H., Xu, X.: Digital twin-driven smart manufacturing: connotation, reference model, applications and research issues. Robot. Comput.-Integr. Manuf. 61, 101837 (2020). http://www.sciencedirect.com/science/article/pii/S0736584519302480
2. Negri, E., Fumagalli, L., Macchi, M.: A review of the roles of digital twin in CPS-based production systems. Procedia Manuf. 11, 939–948 (2017). 27th International Conference on Flexible Automation and Intelligent Manufacturing, FAIM2017, 27–30 June 2017, Modena, Italy. http://www.sciencedirect.com/science/article/pii/S2351978917304067
3. Qi, Q., Tao, F., Zuo, Y., Zhao, D.: Digital twin service towards smart manufacturing. Procedia CIRP 72, 237–242 (2018). 51st CIRP Conference on Manufacturing Systems. http://www.sciencedirect.com/science/article/pii/S2212827118302580
4. Uhlemann, T.H.-J., Lehmann, C., Steinhilper, R.: The digital twin: realizing the cyber-physical production system for Industry 4.0. Procedia CIRP 61, 335–340 (2017). The 24th CIRP Conference on Life Cycle Engineering. http://www.sciencedirect.com/science/article/pii/S2212827116313129
5. National Research Council: NASA Space Technology Roadmaps and Priorities: Restoring NASA's Technological Edge and Paving the Way for a New Era in Space. The National Academies Press, Washington, DC (2012). https://www.nap.edu/catalog/13354/nasa-space-technology-roadmaps-and-priorities-restoring-nasas-technological-edge
6. Rosen, R., von Wichert, G., Lo, G., Bettenhausen, K.D.: About the importance of autonomy and digital twins for the future of manufacturing. IFAC-PapersOnLine 48(3), 567–572 (2015). 15th IFAC Symposium on Information Control Problems in Manufacturing. http://www.sciencedirect.com/science/article/pii/S2405896315003808


7. Alam, K.M., El Saddik, A.: C2PS: a digital twin architecture reference model for the cloud-based cyber-physical systems. IEEE Access 5, 2050–2062 (2017)
8. Lund, A.M., Mochel, K., Lin, J.-W., Onetto, R., Srinivasan, J., Gregg, P., Bergman, J.E., Hartling, K.D., Ahmed, A., Chotai, S., et al.: Digital twin interface for operating wind farms. Patent No. US9995278B2 (2015). https://patents.google.com/patent/US9995278B2/
9. Schleich, B., Anwer, N., Mathieu, L., Wartzack, S.: Shaping the digital twin for design and production engineering. CIRP Ann. 66(1), 141–144 (2017). http://www.sciencedirect.com/science/article/pii/S0007850617300409
10. Zheng, Y., Yang, S., Cheng, H.: An application framework of digital twin and its case study. J. Ambient Intell. Humanized Comput. 10(3), 1141–1153 (2019). https://doi.org/10.1007/s12652-018-0911-3
11. White, F.E.: Data Fusion Lexicon. Technical report 0704 (1991). https://apps.dtic.mil/dtic/tr/fulltext/u2/a529661.pdf
12. Balemans, D., Casteels, W., Vanneste, S., de Hoog, J., Mercelis, S., Hellinckx, P.: Resource efficient sensor fusion by knowledge-based network pruning. Internet Things 11, 100231 (2020). https://doi.org/10.1016/j.iot.2020.100231
13. Meng, T., Jing, X., Yan, Z., Pedrycz, W.: A survey on machine learning for data fusion. Inf. Fusion 57(2), 115–129 (2020). https://doi.org/10.1016/j.inffus.2019.12.001
14. Ehrlinger, L., Wöß, W.: Towards a definition of knowledge graphs. In: SEMANTiCS (2016)
15. Geisler, S., Quix, C., Weber, S., Jarke, M.: Ontology-based data quality management for data streams. J. Data Inf. Qual. 7(4) (2016). https://doi.org/10.1145/2968332
16. Tao, F., Cheng, J., Qi, Q., Zhang, M., Zhang, H., Sui, F.: Digital twin-driven product design, manufacturing and service with big data. Int. J. Adv. Manuf. Technol. 94 (2018). https://link.springer.com/article/10.1007/s00170-017-0233-1
17. Beck, K., Beedle, M., van Bennekum, A., Cockburn, A., Cunningham, W., Fowler, M., Grenning, J., Highsmith, J., Hunt, A., Jeffries, R., Kern, J., Marick, B., Martin, R.C., Mellor, S., Schwaber, K., Sutherland, J., Thomas, D.: Manifesto for Agile Software Development (2001). http://agilemanifesto.org/
18. Microsoft: What is Azure Digital Twins? Azure Digital Twins overview (2020). https://docs.microsoft.com/en-us/azure/digital-twins/overview
19. Microsoft: Digital Twins Definition Language (DTDL). GitHub repository (2020). https://github.com/Azure/opendigitaltwins-dtdl/blob/master/DTDL/v2/dtdlv2.md
20. Kellogg, G., Champin, P.-A., Longley, D.: JSON-LD 1.1 – a JSON-based serialization for linked data. Technical report (2020). https://hal-lara.archives-ouvertes.fr/hal-02141614/
21. Vanneste, S., de Hoog, J., Huybrechts, T., Bosmans, S., Eyckerman, R., Sharif, M., Mercelis, S., Hellinckx, P.: Distributed uniform streaming framework: an elastic fog computing platform for event stream processing and platform transparency. Future Internet 11(7), 158 (2019). https://www.mdpi.com/1999-5903/11/7/158

Towards Hybrid Camera Sensor Simulation for Autonomous Vehicles

Dieter Balemans2(B), Yves De Boeck1, Jens de Hoog2, Ali Anwar2, Siegfried Mercelis2, and Peter Hellinckx2

1 University of Antwerp, Antwerp, Belgium
  [email protected]
2 IDLab - Faculty of Applied Engineering, University of Antwerp - imec, Sint-Pietersvliet 7, 2000 Antwerp, Belgium
  {dieter.balemans,jens.dehoog,ali.anwar,siegfried.mercelis,peter.hellinckx}@uantwerpen.be

Abstract. To accurately test and validate algorithms used in autonomous vehicles, numerous test vehicles and very large data sets are required, resulting in safety constraints and increased financial cost. For this reason, it is desirable to train the algorithms at least partly in simulation. In this work we focus on the camera sensor and propose a novel methodology for injecting instances of simulated vehicles into the camera data of real vehicles. To obtain qualitative results and improve generalization capabilities, the simulated data must sufficiently correspond to real-world sensor data in order to prevent loss of performance when moving the model to a real environment after training. The realism of the output is evaluated by object detection systems and a realism score produced by a CNN. Results show the potential of this approach for improving hybrid simulators for the validation of autonomous vehicles.

1 Introduction

In recent years, autonomous driving applications have been receiving much attention. However, the development and verification of these autonomous vehicles is challenging, as the systems should be guaranteed to have full control in all circumstances. Viable testing and training of these systems requires immensely large datasets as well as other expensive resources such as multiple test vehicles and a safe testing environment. Furthermore, verification of all possible scenarios in real life is extremely challenging. For example, how can we verify and validate these systems in complex situations, such as dangerous maneuvers of other vehicles? An appropriate solution for this problem is using simulated data in a hybrid setup. The concept of hybrid simulation is not new, as proposed in the work of de Hoog et al. [1] for the simulation of LiDAR data. It has however been observed that this data does not fully correspond to real-life data; the hybrid data is only an abstract representation of the simulated vehicle instead of an accurate imitation of a real one. Additionally, camera sensors are more
common in vehicles nowadays due to their relatively low cost in comparison with LiDAR sensors; it is therefore important to study a hybrid simulation system incorporating the raw RGB camera stream. Hence, in this work, we focus on creating a hybrid simulation setup for RGB camera sensors, in which the hybrid data is represented as a vehicle that is visually close to a real-life one. Our setup consists of two stages: (i) an image generation stage, and (ii) an image composition stage. The generation stage accurately renders a 3D model of a car based on the properties of the simulated entity: positional and orientation information, along with its shape. Afterwards, the generated image is blended and harmonized with the intercepted camera image. For the harmonization part, two different deep learning models were explored [2,3]. The remaining part of this work is organized as follows: in Sect. 2 related work is discussed. Section 3 elaborates on the concept of this work. Section 4 shows our test setup, while results are presented in Sect. 5. In Sect. 6 a conclusion can be found.

2 Related Work

In this work, our goal is to improve the simulation environment of a hybrid simulator for autonomous vehicles. We modify a real camera data stream to include information from the virtual world, similar to augmented reality. This enables interaction between the real and virtual worlds. Simulators such as [4] succeed in accurately virtualizing sensors; however, they do not offer interaction between heterogeneous vehicles (real and simulated), nor were they designed with scalability and flexibility in mind. The work of [1] already addresses these challenges and proposes a simulator in which LiDAR data is being manipulated. In our work, we aim to tackle the challenge by manipulating the data stream of a camera sensor. Over the years, several techniques have been developed to obtain natural "harmonization" of a source object and a target image. To obtain acceptable results, spatial and color consistencies between the source and target images have to be improved. Some approaches use color and tone matching techniques to ensure consistent appearances, such as transferring global statistics [5], matching multi-scale statistics [6] or utilizing semantic information [7]. The most popular image blending techniques, however, aim to provide gradient-domain smoothness [8]. While these methods do improve the realism of the composite images in terms of color and/or texture, they do not take the contents of the composite images into account. This leads to unreliable results when the foreground and background regions are vastly different. There are, however, methods that studied the concept of realism [9,10] and took context and semantics into account, such as [2,3,11]. In this paper, the usage of the methods proposed in [2,3] will be examined.


Fig. 1. Schematic representation of our approach.

3 General Approach

To tackle the problem discussed earlier, we propose the general process flow illustrated in Fig. 1. In a setup without hybrid simulation capabilities, the data of the camera sensor is passed directly to a controller for processing. In our hybrid simulation approach, the frame is manipulated in a few additional steps (i.e. adding the simulated car to the image frame) before it gets passed on. A simulator dictates the position, orientation and possibly other features (e.g. color, type) of one or multiple simulated vehicles and passes this information on to the generator (for image generation). The generator produces an image of a virtual car rendered based on the parameters of the simulator. Next, the generated car image is correctly placed on the camera frame and the resulting composite is harmonized by a machine learning network specialized in image composition. Only after these steps is the frame passed on to the controller logic. Regarding the generation of realistic images, modern developments in machine learning can be used to achieve realistic images of cars in the scene. For example, Generative Adversarial Networks (GANs) are known to produce photorealistic images; models such as StyleGAN2 [12] are able to produce hyper-realistic images of cars. The disadvantage of this technique is that control over position and orientation is limited, since this involves controlling the complex latent space which forms the input of the generator network. In contrast, our approach consists of using a predefined 3D model of a car for image generation. The cars are rendered using the Blender Cycles renderer [13]. Based on the positional, orientational and shape information from the simulator, the model is manipulated in order to be rendered and transformed into a 2D image. Afterwards, the 2D image is blended and harmonized into the background image from the real camera; a sketch of this composition step is given at the end of this paragraph. This approach offers less realism in comparison, but full control over the position and orientation of the cars. In order to deal with the problem of decreased realism in our composite images, we first explored Deep Image Harmonization (DIH) [2]. In this method only foreground pixels get adjusted. It primarily consists of convolutional layers. However, the model is limited to input image sizes of 512 × 512. This is a limitation of the method, since most cameras produce larger images. This problem can, however, be solved through splitting or scaling of the original images.
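For illustration, the following Python sketch shows what the copy-paste composition step could look like before harmonization. It is a minimal sketch under our own assumptions: the function name, the RGBA render format and the placement coordinates are illustrative and not taken from the paper's implementation; the rendered car is assumed to fit inside the frame.

```python
import numpy as np

def composite(frame: np.ndarray, car_rgba: np.ndarray, top: int, left: int) -> np.ndarray:
    """Paste a rendered RGBA car onto a camera frame at (top, left).

    frame:    HxWx3 uint8 background from the real camera
    car_rgba: hxwx4 uint8 render produced from the simulator state
    """
    out = frame.astype(np.float32).copy()
    h, w = car_rgba.shape[:2]
    alpha = car_rgba[:, :, 3:4].astype(np.float32) / 255.0
    region = out[top:top + h, left:left + w]
    # Standard alpha blend; the result is the "copy-paste" composite
    # that is afterwards refined by DIH or GP-GAN.
    out[top:top + h, left:left + w] = alpha * car_rgba[:, :, :3] + (1 - alpha) * region
    return out.astype(np.uint8)
```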


Secondly, we also evaluated the Gaussian-Poisson Generative Adversarial Network (GP-GAN) [3]. It combines the closed-form solution of the Gaussian-Poisson equation [14] with a GAN loss to synthesize realistic blended images. We used this technique as it offers higher-resolution, realistic images with less bleeding and fewer unpleasant artifacts compared to other gradient-based methods. Moreover, it provides many adjustable parameters, such as the color weight, the sigma for the Gaussian-Poisson equation and the size of the latent vector. Compared to DIH, GP-GAN also adjusts the background pixels around the object, which makes the blending more realistic.

4 Test Setup

In this work, three different car models are used along with frames from a real camera stream which serve as background images. We employed the KITTI Vision Benchmark [15] for this purpose. This data set provides real images along with ground truths for the positions of the detected cars. We used the positional ground truth as input for our algorithm. A total of six recordings were selected from their raw data recordings. Aside from the visual analysis performed on the images, validation was performed on two aspects: (a) the accuracy of the car positioning algorithm and (b) the realism of the resulting frames.

4.1 Image Generation Accuracy

To evaluate the extent to which the application is able to accurately position a simulated vehicle at the desired position, we rendered the cars at the positions of real cars based on the KITTI tracklet file coordinates. By plotting the original bounding boxes of the tracklets afterwards, we effectively created a ground truth to compare with the position of the simulated vehicles.

4.2 Image Realism

Evaluating realism in images is not straightforward, since no generally accepted method is available. As an alternative to visual analysis, the method proposed by Zhu et al. [10] was adopted; it uses a Convolutional Neural Network (CNN) model to predict a visual realism score of a scene in terms of color, lighting and texture compatibility. This provides us with a numerical parameter to compare against. Using this model, scores are predicted for the four frame types that are produced: (a) original images, (b) composites (copy-paste), (c) DIH harmonized images and (d) GP-GAN harmonized images. For each batch, we count the number of times the harmonized image scores were better than their copy-paste equivalent. By labelling the images as "realistic composite" (DIH, GP-GAN, original) and "unrealistic composite" (the copy-paste composites), we created a binary detection task for the CNN and determined the Receiver Operating Characteristic (ROC) curve along with the corresponding ROC score [16]. The ROC score is a commonly used measure for evaluating binary detectors
and provides us with a relative comparison measure for our harmonization techniques, as it was also used as the evaluation metric for the discriminative CNN in the work of Zhu et al. [10]; a sketch of this evaluation is shown below. Tests were conducted on 8491 KITTI images, consisting of original images, DIH images and GP-GAN images. Additionally, two object detection systems were deployed to check the effect on the overall features of the cars and the extent to which they remain detectable after harmonization. YOLOv3 [17] was chosen because it shows state-of-the-art results in mean average precision (mAP) as well as speed. Since YOLOv3 takes a fundamentally different approach than most other object detection systems, we also implemented R-FCN [18] for comparison, which takes a more common region-based approach. The R-FCN model was implemented using a ResNet-101 network trained for detection on PASCAL VOC 2007. For both systems, the average precision (AP) of the PASCAL VOC challenge [19] was used as an evaluation metric. The object detection experiment was conducted on 9230 and 11277 images using YOLOv3 and R-FCN respectively.
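The ROC evaluation can be sketched as follows, e.g. with scikit-learn; the score values below are purely illustrative and not taken from the experiments.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Realism scores predicted by the CNN of [10]; values are illustrative.
scores_harmonized = np.array([0.71, 0.64, 0.58, 0.80])  # "realistic" class (label 1)
scores_copy_paste = np.array([0.55, 0.60, 0.49, 0.52])  # "unrealistic" class (label 0)

y_true = np.concatenate([np.ones_like(scores_harmonized), np.zeros_like(scores_copy_paste)])
y_score = np.concatenate([scores_harmonized, scores_copy_paste])

# ROC score of the binary "realistic vs. copy-paste" detection task.
print(f"ROC score: {roc_auc_score(y_true, y_score):.2f}")

# Success ratio: fraction of pairs where harmonization improved the score.
print(f"Success ratio: {np.mean(scores_harmonized > scores_copy_paste):.2f}")
```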

5 Evaluation Results

As discussed in Sect. 4, we evaluated the realism of the images by two parameters: (a) the Receiver Operating Characteristic (ROC) score based on the realism score output of a CNN, and (b) the Average Precision (AP) using YOLOv3 and R-FCN. Additionally, the rendering accuracy of the cars was evaluated in a third experiment. In the following sections, the results of the three experiments are presented.

5.1 Realism Score

In this experiment, we use a CNN to predict realism scores for the images. For this purpose, mainly the ROC score and success ratio metrics were determined.

Table 1. Averaged results of realism evaluation using the model of [10].

Parameter          Images  Original  DIH   GP-GAN
Success ratio [-]  8491    0.39      0.39  0.60
ROC score [-]      9481    0.46      0.46  0.54

Table 1 shows the weighted average of the two metrics for each image type. It can be seen that GP-GAN scores best here, which shows that the CNN deemed the GP-GAN frames more realistic than the equivalent unharmonized frames in 60% of the cases. DIH scores considerably worse, but still equivalent to the original frames. Note that the scores of the original frames are particularly unexpected, since this means that the CNN found the original image more realistic than the copy-paste in only 39% of the cases. It should also be noted that, overall, the results show much lower ROC scores than the results achieved in [10]. The
reason for this is most likely the fact that the model is originally meant to rank a set of random, very different composites by realism score. In our experiment, the model is faced with a much harder task, since the background of each tuple of frames it has to rank is completely the same, and the car sometimes makes up a very small portion of the image.

Fig. 2. More detailed results of realism evaluation using the model presented in [10]. (a) Results of the ROC score for each harmonization technique grouped by KITTI recording. (b) Results for the car types grouped by harmonization technique. (c) Phase-1 results of the ROC score for the car types grouped by harmonization technique.

Figure 2a shows the ROC scores per KITTI recording. One can see that GP-GAN effectively scores better in most cases. Also observe the very low scores for the kitti17 recording. Due to a scaling error with the data in this recording, the composites do not match the original frames. This apparently made the frame much more realistic according to the CNN, resulting in lower success rates for the original frames. It is worth noting that this was the only recording in which cars appeared transversally on the road. Figures 2b and 2c show the results for each car type. Note that we included the StyleGAN2 scores as a reference, as it is not used in the final approach. This shows that our 3D models are lagging behind the photorealistic cars from StyleGAN2 by a small margin. This was expected, as our 3D models were required to be rendered as fast as possible.

5.2 Object Detection

In this experiment, we evaluated the extent to which the cars remained detectable after harmonization. This was done by determining the average precision (AP) for each combination of image type, car type and KITTI recording, and then averaging the results. Table 2 shows the main results of this experiment. We observe that for YOLOv3 the original images get detected best (AP 78.84%), while we lose some detection capability when pasting our simulation cars on the original frame (AP 71.07%). If we then also harmonize that frame with DIH or GP-GAN, the object detection system again loses some of its AP (63.90% and 65.97% respectively). Exactly the same trend can be observed for R-FCN, but with a much lower overall AP (29.69% for the original frames). For GP-GAN, this trend could be explained by its effect of smoothing out the boundaries of the object, thus making


Table 2. Averaged results of the AP using YOLOv3 [17] (row 1) and R-FCN [18] (row 2).

Parameter    Images  Original  Composites  DIH    GP-GAN
YOLOv3 [%]   9230    78.84     71.07       63.90  65.97
R-FCN [%]    9481    29.69     23.77       21.92  21.14

it harder for the object detection system to distinguish the features of the car object from the background of the image. DIH is a harmonization technique that only adjusts the pixels of the car object region, so the reason for its lower score cannot be attributed to the same effect. However, as explained in Sect. 3, the inner-product layer of the DIH net forces us to split our KITTI frames into separate parts before feeding them to the network. We observe that this sometimes leads to heterogeneity in the color of the car objects, which may contribute to the poor performance. In future work, this problem can be solved by either changing the network's architecture or applying a scaling method to the image first. If we look at the detailed results given in Figs. 3a and 3b, we see that the discussed trend is fairly consistent across all KITTI recordings. The strong increase in AP for kitti17 can be explained by the fact that this is the only recording in which the cars appear in the transverse direction, perpendicular to the longitudinal axis of the road. This makes them easier to detect than a smaller front view of a car. All cars also appear at the center of the screen on a relatively large scale, i.e. no small parked cars in the distance etc. For kitti15 the latter is very much the case, which is reflected in the low scores for all frame types.

Fig. 3. More detailed results for object detection, showing the AP for all frame types grouped by KITTI recording for YOLOv3 (a), and the AP for all frame types grouped by KITTI recording for R-FCN (b).

5.3 Rendering Accuracy

To evaluate the accuracy of the rendering process, the ground truth bounding boxes were plotted on each composite image of the KITTI experiments. Some
results are shown in Fig. 4. In general, the positions of the rendered cars correspond very well to the ground truth. However, it is observed that an increasing deviation from the ground truth builds up as the car moves further from the center and closer to the edge of the image. This is due to the fact that the transformation from the 2D bounding box coordinates to a correct 3D scaling factor of the car in the Blender renderer was calibrated on a base car in the center of the screen. Each car is scaled relative to the scale of this base car, based on the diagonal of the 2D bounding box of the car in question. Since cars that appear on the left or right edge of the screen are seen from an angle in the XY-plane, their bounding box will have a higher width/height ratio compared to a car of the same size placed in the center of the frame. This causes cars to be scaled too much for bounding boxes at the edges of the screen. To remedy this, an extra parameter was introduced that limits the scaling when the width/height ratio exceeds 0.6; a sketch of this heuristic is given below. The consequence is that cars that are very close by do not fill the whole bounding box. A second observation is that, when objects in the KITTI frames were detected as a car but were partly hidden behind some other object in the scene, the car was placed over that other object. This can be categorized as an occlusion problem and is hard to solve, because no tracklet information is available about the blocking object in question.
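As an illustration of the described scaling heuristic, the following Python sketch derives a scale factor from a 2D bounding box. The function name, the base-diagonal calibration constant and the exact clamping formula are our own assumptions; only the 0.6 ratio threshold comes from the text.

```python
import math

def render_scale(bbox_w: float, bbox_h: float, base_diagonal: float,
                 max_ratio: float = 0.6) -> float:
    """Derive the 3D scale factor of a car model from its 2D bounding box.

    The scale is taken relative to a calibration car in the image centre
    (base_diagonal). Boxes seen at the image edges have a larger
    width/height ratio, which inflates the diagonal; the effective width
    is therefore clamped once the ratio exceeds max_ratio.
    """
    if bbox_w / bbox_h > max_ratio:
        # Limit the effective width so edge cars are not scaled too much.
        bbox_w = max_ratio * bbox_h
    return math.hypot(bbox_w, bbox_h) / base_diagonal
```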

Fig. 4. Frame 6 of the kitti56 recording as the original frame and as composites for all three car models. Ground truth boxes of the original car in the frame are drawn to indicate the accuracy of the positioning algorithm.

6 Conclusion and Future Work

In this work we propose a method for hybrid simulation of a camera sensor. Our method allows real and virtual cars to interact based on camera images. The main advantage of this hybrid simulation is that it allows for faster and safer training and testing, by not requiring as many test vehicles and scenarios. One requirement of a hybrid simulation is realism: our composite images should be as close as possible to real sensor data. In this work
we presented a method to generate a simulated vehicle and project it onto an existing image using composition techniques. We validated this by using a CNN trained for scoring realism, along with experiments on state-of-the-art object detection systems. In the detection experiment, all frames were submitted to two object detection systems, namely YOLOv3 and R-FCN. In general, we observe that harmonization of the frames makes them less detectable. For both detection systems, DIH and GP-GAN show very comparable APs, but score lower than the composites, which in turn score lower than the original frames. This can be explained by the fact that the used harmonization methods also change the edges of the cars, making it harder for the object detection systems to detect specific features. We believe that this can be improved by fine-tuning the generation methods and detection mechanisms; this has, however, not been taken into account in this work. Future improvements of this approach include optimization of the currently used techniques. The influence of the many parameters of GP-GAN should be further investigated in order to suppress unwanted side effects. We also aim to continue exploring the path of the GAN generator by gaining more control over the latent space. Furthermore, we want to optimize validation techniques for evaluating realism and search for alternatives in this respect. As our end goal is to achieve a real-time system, further optimizations need to be made. One of the optimizations we would like to explore is the use of different rendering systems. Currently the Cycles renderer is being used; in a future version of the system we want to look at the newer Eevee rendering system [20] as well.

Acknowledgements. This research received funding from the Flemish Government under the "Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen" programme.

References

1. de Hoog, J., Pepermans, M., Mercelis, S., Hellinckx, P.: Towards a scalable distributed real-time hybrid simulator for autonomous vehicles. In: Advances on P2P, Parallel, Grid, Cloud and Internet Computing, vol. 24, pp. 447–456. Springer International Publishing, Cham (2019)
2. Tsai, Y., Shen, X., Lin, Z., Sunkavalli, K., Lu, X., Yang, M.: Deep image harmonization. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2799–2807. IEEE Computer Society, Los Alamitos, July 2017
3. Wu, H., Zheng, S., Zhang, J., Huang, K.: GP-GAN: towards realistic high-resolution image blending. In: Proceedings of the 27th ACM International Conference on Multimedia, ser. MM 2019, pp. 2487–2495. Association for Computing Machinery, New York (2019)
4. Gechter, F., Dafflon, B., Gruer, P., Koukam, A.: Towards a hybrid real/virtual simulation of autonomous vehicles for critical scenarios. In: The Sixth International Conference on Advances in System Simulation (SIMUL 2014), pp. 14–17 (2014)
5. Reinhard, E., Adhikhmin, M., Gooch, B., Shirley, P.: Color transfer between images. IEEE Comput. Graphics Appl. 21(4), 34–41 (2001)


6. Sunkavalli, K., Johnson, M.K., Matusik, W., Pfister, H.: Multi-scale image harmonization. ACM Trans. Graph. 29(4), 1–10 (2010)
7. Tsai, Y.-H., Shen, X., Lin, Z., Sunkavalli, K., Yang, M.-H.: Sky is not the limit. ACM Trans. Graph. 35(4), 1–11 (2016)
8. Pérez, P., Gangnet, M., Blake, A.: Poisson image editing. ACM Trans. Graph. 22(3), 313–318 (2003)
9. Xue, S., Agarwala, A., Dorsey, J., Rushmeier, H.: Understanding and improving the realism of image composites. ACM Trans. Graph. 31(4), 1–10 (2012)
10. Zhu, J.-Y., Krahenbuhl, P., Shechtman, E., Efros, A.A.: Learning a discriminative model for the perception of realism in composite images. In: 2015 IEEE International Conference on Computer Vision (ICCV), pp. 3943–3951. IEEE, December 2015
11. Zhang, L., Wen, T., Shi, J.: Deep image blending. In: 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 231–240. IEEE Computer Society, Los Alamitos, March 2020
12. Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., Aila, T.: Analyzing and improving the image quality of StyleGAN. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8110–8119 (2020)
13. Blender Foundation: Cycles Open Source Production Rendering (2018). https://www.cycles-renderer.org/. Accessed 30 June 2020
14. Burt, P., Adelson, E.: The Laplacian pyramid as a compact image code. IEEE Trans. Commun. 31(4), 532–540 (1983)
15. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354–3361. IEEE, June 2012
16. Davis, J., Goadrich, M.: The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd International Conference on Machine Learning, ser. ICML 2006, pp. 233–240. Association for Computing Machinery, New York (2006)
17. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)
18. Dai, J., Li, Y., He, K., Sun, J.: R-FCN: object detection via region-based fully convolutional networks. In: Proceedings of the 30th International Conference on Neural Information Processing Systems, ser. NIPS 2016, pp. 379–387. Curran Associates Inc., Red Hook (2016)
19. Everingham, M., Gool, L.V., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes (VOC) challenge. Int. J. Comput. Vision 88(2), 303–338 (2009)
20. Blender Foundation: Eevee rendering introduction (2018). https://docs.blender.org/manual/en/latest/render/eevee/introduction.html. Accessed 30 June 2020

Lane Marking Detection Using LiDAR Sensor

Ahmed N. Ahmed1(B), Sven Eckelmann2, Ali Anwar3, Toralf Trautmann2, and Peter Hellinckx3

1 Faculty of Applied Engineering, University of Antwerp, Campus Groenenborger, Groenenborgerlaan 171, 2020 Antwerp, Belgium
  [email protected]
2 Mechlab, Hochschule für Technik und Wirtschaft Dresden, Friedrich-List-Platz 1, 01069 Dresden, Germany
  {sven.eckelmann,toralf.trautmann}@htw-dresden.de
3 IDLab - Faculty of Applied Engineering, University of Antwerp - imec, Sint-Pietersvliet 7, 2000 Antwerp, Belgium
  {ali.anwar,peter.hellinckx}@uantwerpen.be

Abstract. To achieve fully autonomous driving, the vehicle needs visualization of the surrounding environment, making it dependent on multiple perception sensors. Lane detection is significant in this case, as multiple tasks rely on its accuracy, for example Simultaneous Localization And Mapping (SLAM), automatic lane keeping and lane centring, which are commonly used in Advanced Driving Assistance Systems (ADAS), and other functions that require lane departure or trajectory planning decisions. These functions are responsible for minimizing the number and severity of road accidents, as they enable the car to position itself properly within the road lanes. Lane marking is challenging to model due to road scene variations, and it is therefore a complicated task. In this paper, we implement an automated algorithm for extracting road markings from a LiDAR point cloud, utilizing its intensity variance properties. Our technique detects lane line coordinates based on computer vision algorithms, without any dependency on or knowledge of the test field parameters, like road width or centre-line coordinates. Experimental testing is conducted on a test field with ground-truth coordinates of the lane markings, and it shows that the proposed algorithm provides a promising solution to lane marking detection.

1 Introduction

Recently, much research has been conducted in the area of computer vision in order to enhance the performance of object and lane detection algorithms for self-driving cars. This is mainly because lane detection is not only required to make lane changes or lane deviation warnings in advanced driver assistance systems (ADAS) [1], but also to assist the vehicle in maintaining a stable position between the lane lines while driving straight in the cruising phase. The
main challenges that arise during the development of real-time lane detection algorithms are the conditions of the surrounding ecosystem and the environmental variations, in addition to the reliability requirements that enforce an exact and precise decision-making process for autonomous vehicles, which is in turn highly dependent on the vehicle's vision. Other examples of these challenges include erased lane markings, varying illumination, snow on the road, shadows, and any other factor that might affect the colour gradient and intensity of the returned data that needs to be processed for making faultless detections. Road marking detection is essential in many ADAS functions, like lane departure warning, which gives a warning to the driver if the car leaves or deviates from the lane, and lane-keeping assist, which takes control of the steering wheel in case the car leaves or diverges from the lane, bringing it back to a safe position [2]. The light detection and ranging (LiDAR) sensor measures the orientation and intensity of objects based on the time taken for the light beam to be reflected, the depth of the reflected ray and the reflection intensity. LiDAR sensors have been widely used in robotics and autonomous driving for the identification of objects, as well as the position and size of obstacles. The LiDAR sensor has a significant advantage in determining the distance between itself and an object. Although it is more expensive than cameras, researchers are heavily exploring it as a replacement for the camera in autonomous driving applications, since it is not affected by illumination and does not suffer from the uncertainties caused by shadows and other environmental conditions. Additionally, the variation of reflectance properties between the road's asphalt and the lane markings leads to different point cloud intensity values, as shown in Fig. 2. Due to these capabilities of LiDAR sensors, research is actively carried out in the area of lane detection using it as the primary sensor. The remaining part of this work is organized as follows: in Sect. 2 related work is discussed, where we elaborate on the work implemented in the literature to tackle the lane detection problem using the LiDAR sensor. Section 3 elaborates on the concept of this work, presenting the lane detection algorithm that operates by applying range-dependent thresholding to the intensity values and the field of view (FOV) in order to extract lane markings. The process can be encapsulated into three steps: (1) point cloud preprocessing; (2) extraction of ROI and intensity from point clouds; (3) lane line fitting. Results are presented in Sect. 4. Finally, the paper is concluded in Sect. 5.

2 Related Work

The authors in [3] tackled the lane detection problem by visualizing the road as a dynamic system. First, the street was divided into different planes; the focal point of each plane and its heading were then determined using Kalman filtering. Then the random sample consensus (RANSAC) method was used to fit the mathematical models of the road. In [4], the authors used a sensor fusion algorithm to fuse images from a camera and a scanning laser radar (LADAR) to identify curbs. The solution presented in this paper used nonlinear Markov


Fig. 1. (a) BMW i3 car used in the testing process. (b) A top view of the test field. (c) OS1 LiDAR sensor.

Fig. 2. A sample from the 3D point cloud data on rviz as the vehicle drives through the test field. The intensities corresponding to the lane markings are indicated by the white arrows.

switching. The authors in [5] extracted the lane lines by developing an algorithm that combines both Gradient Vector Flow (GVF) and Balloon Parametric Active Contour models. First, the 3D LiDAR point cloud data were segmented based on elevation, intensity and ROI; they were then converted into 2D raster surfaces to reduce the noise in the image and exclude outliers while detecting lane lines, after which thresholding and Canny edge detection were applied. Next, a snake curve was used to construct road markings based on the road points available from the LiDAR data. This method provided high precision in middle lane detection; however, it was not very accurate in detecting lanes at the edges of the road, and the predicted lanes interfered with the road curbs as well. The lane detection algorithm proposed in [6] worked on extracting lanes from LiDAR data based on an intensity threshold as well as an ROI to limit the amount of processed point cloud data. Afterwards, this data was converted to a 2D image,
and the authors applied linear dilation to restore completely rubbed-off lanes. The algorithm was tested over 93 roads and managed to detect the markings in 80 roads. The authors attributed the failures on the remaining test roads to the fact that these roads had many wiped and eroded lanes, which caused low intensity returns to the LiDAR as well as low point density. A novel range-dependent methodology was also proposed by [7] to determine lane lines. The algorithm was fed with the LiDAR data of the trajectory road; the received road data was then segmented into horizontal blocks. These blocks were used to detect the edges (road curbs) based on differences in elevation, in order to determine the surface as well as the boundaries of the road. After that, the lanes were detected by the Inverse Distance Weighted (IDW) prediction method, with the help of the intensities measured by the LiDAR. The last step of the detection was to remove the noise and to interpolate the incomplete lane markings; this was done by applying density-based thresholding. The comparison between the predicted lane lines and the real lanes was done manually. The authors claimed to accomplish a correctness of 0.83 in the detection of the lane markings. In [8], the authors developed a lane marking detection algorithm based on a dynamic thresholding method. First, the data points received from the LiDAR were processed using a probability density function (PDF), and the maximum reflectance was matched with the highest values of the PDF. Then dynamic thresholding was applied to this reflected data, and since the lane markings are the ones that return high reflectance, due to material reflection properties, they were cut out of their neighboring context. In [9], the authors presented a lane detection algorithm based on a scan-line-based method. First, the data from the LiDAR sensor was collected based on timestamps; these points were then segmented into scan lines based on the scanner angle, and the authors claim that organizing the data in such a way makes further processing more efficient. After that, the road data was determined based on a "Height Difference" (HD). The road limit identification process was done by applying moving least squares, which only accepted points that lie within a certain threshold set by the authors. The lanes were classified based on the intensity values that differentiate the road markings from the asphalt. In addition, the authors also proposed an "Edge Detection and Edge Constraints" (EDEC) technique that detects fluctuations in the intensity; by using this method the noise was minimized and the detected lanes were smoothed. The algorithm was tested on data from the Jincheng highway in China, and the authors claimed to accomplish an accuracy level of 0.90, which is high and demonstrates the robustness of the proposed algorithm.

3 Methodology

The objective of this work is to detect lane markings using the LiDAR sensor mounted on the roof of a BMW i3, shown in Fig. 1a. As the vehicle drives through the test field shown in Fig. 1b, the LiDAR sensor generates a 3D point cloud of the whole environment in front of the vehicle, as shown in Fig. 2. The high point density of the point cloud and the intensity variances caused by reflected laser beams make it possible to specify unique features to be extracted from the point cloud set available in the region of interest. The LiDAR sensor used in the detection process is the OS1-64, shown in Fig. 1c, which has a vertical resolution of 64 channels; the full specifications of the sensor are shown in Table 1.

Table 1. The LiDAR sensor used (OS1-64) specifications.

Vertical resolution          64 channels
Horizontal resolution        512, 1024, or 2048
Range                        120 m
Vertical field of view       45° (±22.5°)
Vertical angular resolution  0.35°–2.8° (multiple options)
Precision                    1.5–5 cm
Points per second            1,310,720
Rotation rate                10 or 20 Hz
Power draw                   14–20 W
Weight                       455 g
Ingress protection rating    IP68, IP69K

3.1 Feature Extraction

The 3D LiDAR point cloud points are analyzed based on the intensity of the reflected rays and the regions of interest. The reflection differences between the road asphalt and the lane markings lead to variances in the intensity values; in turn, the intensity values that lie within a pre-set threshold range are characterized as lane-line points, and the resulting point cloud is presented in Fig. 3b. The corresponding coordinates of these intensities are classified as lane points and projected onto an (x, y) plane, where the x-axis is the road width and the y-axis represents the road length. The points are then converted to a 2D binary image, as shown in Fig. 3a. After generating the binary image, the starting point of the lane line is determined using a histogram. The histogram applied to the binary image detects pixels with the highest color gradient (corresponding to the light color gradient of the lane markings) and represents them as peaks indicating the base x-positions of the lane lines, as shown in Fig. 3. Afterwards, a sliding window algorithm is applied using the base x-positions as starting points to search for the lines: a window slides in the y-direction around the line centers, one window on top of the other, to find and follow the regions with high color gradient up to the top of the frame. These regions are then classified as lane markings. Subsequently, the data of interest is fitted with a predefined polynomial; in our case this is a second-degree function, which produces the coefficients of the polynomial as shown in Eq. 1. A sketch of this procedure is given after the equation.


Fig. 3. (a) 2D binary images extracted from LiDAR point cloud data, where white points indicate the lane lines. (b) The histogram representing the x-axis base position of the lane lines. (c) A top view of the test field detected 3D point cloud data represented as lane lines.

$$x_{pred} = Ay^2 + By + C \quad (1)$$
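The following Python sketch illustrates the histogram and sliding-window procedure for a single lane line. It is a minimal sketch under our own assumptions: the window count, margin and minimum-pixel parameters are illustrative, and the function name is not taken from the paper's implementation.

```python
import numpy as np

def fit_lane(binary: np.ndarray, n_windows: int = 9, margin: int = 30, min_pix: int = 20):
    """Fit x = A*y^2 + B*y + C to one lane line in a binary image (Eq. 1).

    binary: 2D array where nonzero pixels are candidate lane points.
    Returns the polynomial coefficients (A, B, C).
    """
    h, w = binary.shape
    # Histogram of the lower half gives the base x-position of the line.
    histogram = binary[h // 2:, :].sum(axis=0)
    x_base = int(np.argmax(histogram))

    ys, xs = binary.nonzero()
    window_h = h // n_windows
    x_current, lane_idx = x_base, []
    for win in range(n_windows):
        y_low, y_high = h - (win + 1) * window_h, h - win * window_h
        inside = ((ys >= y_low) & (ys < y_high) &
                  (xs >= x_current - margin) & (xs < x_current + margin))
        idx = np.flatnonzero(inside)
        lane_idx.append(idx)
        if idx.size > min_pix:
            # Re-centre the next window on the mean x of the found pixels.
            x_current = int(xs[idx].mean())
    lane_idx = np.concatenate(lane_idx)
    # np.polyfit returns [A, B, C] for x_pred = A*y**2 + B*y + C.
    return np.polyfit(ys[lane_idx], xs[lane_idx], 2)
```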

3.2 Transformation

In order to visualize the detected lane markings from the vehicle's perspective, it is required to apply the right frame transformation, which instantaneously maps the orientation of the LiDAR sensor mounted on the vehicle to the vehicle's orientation and the pose of the vehicle on the map. Before diving into more details of the transformation process, we first need to give a brief description of the quaternion. A quaternion is a mathematical abstraction designed to encapsulate representations, orientations and rotations in three dimensions. The quaternion form is illustrated in Eq. 2: the representation of a rotation as a quaternion consists of four numbers and is composed of two parts, where the first part xi + yj + zk is the vector part and the second part w is the scalar part. Compared to Euler angles, quaternions are simpler to compose and do not suffer from the problem of gimbal lock. Compared to
rotation matrices, they are more compact, more numerically stable, and more efficient. Rotation in ROS is represented as a quaternion, which is used to encode the orientation of the robot (the vehicle in our case) in 3D space. The general structure of the quaternion can be written as:

$$Q = w_1 + x_1 i + y_1 j + z_1 k \quad (2)$$

The aim of the transformation part is to transform from the LiDAR frame to the base laser frame, then to the base link frame, and finally to the map frame. The transform (tf) package provides an efficient implementation of a TransformListener (a class that inherits from tf and automatically subscribes to ROS transform messages) to make the task of receiving quaternion and translation transforms less complex. However, before applying the transformation, one adjustment is needed: since we converted the LiDAR points into a binary image, the lane line coordinates were detected in the image frame (pixel coordinates), so the centre of the image needs to be shifted from the top left corner to the middle of the image in order to get the detected lines in the right orientation. After transforming the lane points from the pixel frame, a TransformListener object is created to receive the transformations from the laser frame to the map frame. The quaternion coefficients (x, y, z, w) are converted to an Euler rotation (x, y, z) to enable the transformation of the lane markings in the 3D space represented in ROS. The rotation matrix derived from the quaternion representation is shown in Eq. 3. The transformation to the global coordinates is accomplished by first taking the dot product of the lane coordinates with the generated rotation matrix and then adding the translation (x, y). The new coordinates of the lane markings are then ready to be published and presented over the global coordinates of the ground truth lanes. A representation of the detected lanes is shown in Fig. 4a.

$$R = \begin{bmatrix} 2w^2 - 1 + 2x^2 & 2xy - 2wz & 2xz + 2wy \\ 2xy + 2wz & 2w^2 - 1 + 2y^2 & 2yz - 2wx \\ 2xz - 2wy & 2yz + 2wx & 2w^2 - 1 + 2z^2 \end{bmatrix} \quad (3)$$
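As an illustration, the following Python sketch builds the rotation matrix of Eq. 3 and applies it to lane points. The function names are our own, and representing the translation as a full 3D vector (rather than the (x, y) pair in the text) is an assumption made for generality.

```python
import numpy as np

def quaternion_to_rotation(x: float, y: float, z: float, w: float) -> np.ndarray:
    """Build the rotation matrix of Eq. 3 from a unit quaternion (x, y, z, w)."""
    return np.array([
        [2*w**2 - 1 + 2*x**2, 2*x*y - 2*w*z,       2*x*z + 2*w*y],
        [2*x*y + 2*w*z,       2*w**2 - 1 + 2*y**2, 2*y*z - 2*w*x],
        [2*x*z - 2*w*y,       2*y*z + 2*w*x,       2*w**2 - 1 + 2*z**2],
    ])

def to_map_frame(lane_xyz: np.ndarray, quat, translation) -> np.ndarray:
    """Transform Nx3 lane points from the laser frame to the map frame."""
    R = quaternion_to_rotation(*quat)
    # Rotate first (dot product with R), then add the translation.
    return lane_xyz @ R.T + np.asarray(translation)

# Identity rotation leaves the points untouched (sanity check).
pts = np.array([[1.0, 2.0, 0.0]])
print(to_map_frame(pts, (0, 0, 0, 1), (10.0, 5.0, 0.0)))  # [[11. 7. 0.]]
```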

3.3 Real-Time Simulation

The Robot Operating System (ROS) is used to simulate the lane marking detection in real time as the vehicle drives through the test field. ROS uses a so-called publisher and subscriber mechanism to send and receive messages, respectively. Therefore, in order to receive LiDAR data, a subscriber is implemented with the name of the topic that holds the LiDAR data; this is the topic specified on the vehicle's computer on which all reported LiDAR data is saved. Besides the topic that contains the data, a message type needs to be specified to allow only the type of data that we are interested in to be transmitted, i.e. the point cloud set. After the subscriber has received a message, a callback function is activated that applies the detection algorithm to the received point clouds, and from there the lanes get identified. The detected lane coordinates then get published on rviz (a 3D visualization tool in ROS) to be visualized in real time as the vehicle cruises. A minimal node skeleton is sketched below.
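For illustration, a minimal rospy node skeleton could look as follows. The topic names and the detect_lanes placeholder are hypothetical assumptions; only the subscribe/callback/publish structure reflects the description above.

```python
import rospy
from sensor_msgs.msg import PointCloud2
from visualization_msgs.msg import Marker

# Topic names are illustrative; the real names depend on the vehicle setup.
LIDAR_TOPIC = "/os1_cloud_node/points"

def detect_lanes(cloud: PointCloud2) -> Marker:
    """Placeholder for the detection pipeline of Sect. 3 (thresholding,
    histogram, sliding window, polynomial fit and frame transformation)."""
    return Marker()  # a real implementation would fill in the lane geometry

def callback(cloud: PointCloud2):
    marker_pub.publish(detect_lanes(cloud))  # visualized in rviz

rospy.init_node("lane_detector")
marker_pub = rospy.Publisher("/detected_lanes", Marker, queue_size=1)
rospy.Subscriber(LIDAR_TOPIC, PointCloud2, callback, queue_size=1)
rospy.spin()
```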


Fig. 4. (a) Lane markings being detected using the LiDAR sensor. (b) Root mean square error of the detected lanes throughout the test field.

Fig. 5. Histogram resulting from a curved path.

4 Result Evaluation

The sensor data is recorded from the vehicle in a rosbag file, which is a ROS format for storing ROS message data. The bagfile is created by subscribing to the topic assigned to store the LiDAR data, which consists of the point clouds (x, y, z, i), where "x, y, z" are the coordinates and "i" is the intensity, in such a way that every new measurement (arriving every 0.10 s) is stored in the bagfile. This bagfile is used to send the LiDAR data to the lane detection algorithm. When recording the bagfile, the vehicle was driving at a set speed of 15 km/h and the weather was cloudy. The algorithm was implemented in Python using the OpenCV2 library on a Lenovo Thinkpad T590 with 16 GB RAM, Core i7-8565U (1.8–4.6 GHz, 8 MB), UHD Graphics 680, and the results were simulated on rviz. The detected lane markings are evaluated using the root mean square error (RMSE); we can determine the error through Eq. 4, where Ap denotes the predicted lane, while Ag represents the lane ground truth. The predicted lanes at each pose were compared with the ground truth values. The RMSE throughout the test field track is shown in Fig. 4. It can be seen that when the vehicle is driving
through a lane with shallow curvature, the detection accuracy is quite good, with an RMSE between 0.1 and 0.5, whereas for steep curvatures the RMSE increases to more than 0.5. This can be related to the fact that the histogram shows more than one peak in curved paths, which disrupts the process of finding the lane's x-position, as shown in Fig. 5. The formula used to calculate the RMSE of N points in the point clouds is given as

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(A_{p,i} - A_{g,i}\right)^2} \quad (4)$$

In addition to the accuracy, the average lane marking processing time with the proposed technique was 0.11 s for a single-pose lane marking detection (i.e. it works at a frequency of 11 Hz). This frequency is constrained by the frequency of the OS1 LiDAR sensor, which is 10 Hz.
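A direct numpy implementation of Eq. 4 is sketched below; the sample values are illustrative only.

```python
import numpy as np

def lane_rmse(predicted: np.ndarray, ground_truth: np.ndarray) -> float:
    """Compute Eq. 4 for N sampled lane points at one vehicle pose."""
    return float(np.sqrt(np.mean((predicted - ground_truth) ** 2)))

# Illustrative values: a shallow-curvature pose with small deviations.
pred = np.array([1.02, 1.48, 2.05, 2.61])
gt = np.array([1.00, 1.50, 2.00, 2.60])
print(f"RMSE: {lane_rmse(pred, gt):.3f}")
```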

5 Conclusion

This paper presented a LiDAR-based method for road marking detection that was performed using the reflective intensity data captured by the LiDAR sensor, which is insensitive to illumination. In summary, the proposed detection technique shows high accuracy when the vehicle is driving in a straight line and through shallow curvatures; however, the technique encounters lower accuracy at steep curves. The lane marking experiments were performed in Mechlab's test field (shown in Fig. 1b) with a BMW i3 testing vehicle with an OS1 LiDAR sensor. All the results showed satisfactory performance on the test field, except on the sharp turns.

Future Work and Remarks: In order to minimize the lane localization error on steep curvatures, the radius of curvature can also be used in the detection technique. In future work, the LiDAR data will be fused with a camera, which would enhance the performance of the lane detection process because it balances the strengths of the different sensors, since the camera is able to detect lanes based on color gradient variations. There is also another approach, discussed in [10], in which a Catmull-Rom spline based technique is used to model the lane; it was able to fit a wider range of lane structures. Even though it was tested using camera images, it can also be implemented in our approach, since we convert the LiDAR's point cloud to a binary image.

Acknowledgements. This work was done at the mechatronics laboratory "Mechlab" at the Hochschule für Technik und Wirtschaft Dresden (HTW Dresden) as part of a joint master thesis, titled "Lane detection techniques for self-driving cars", between HTW Dresden and University of Antwerp IDLab.


References

1. Akanegawa, M., Tanaka, Y., Nakagawa, M.: Basic study on traffic information system using LED traffic lights. IEEE Trans. Intell. Transp. Syst. 2(4), 197–203 (2001)
2. Pang, G.K.H., Liu, H.H.S.: LED location beacon system based on processing of digital images. IEEE Trans. Intell. Transp. Syst. 2(3), 135–150 (2001)
3. Lam, J., Kusevic, K., Mrstik, P., Harrap, R., Greenspan, M.: Urban scene extraction from mobile ground based LiDAR data, pp. 1–8 (2010)
4. Kodagoda, K.R.S., Sardha Wijesoma, W., Balasuriya, A.P.: CuTE: curb tracking and estimation. IEEE Trans. Control Syst. Technol. 14(5), 951–957 (2006)
5. Kumar, P., McElhinney, C.P., Lewis, P., McCarthy, T.: An automated algorithm for extracting road edges from terrestrial mobile LiDAR data. ISPRS J. Photogramm. Remote Sens. 85, 44–55 (2013)
6. Kumar, P., McElhinney, C.P., Lewis, P., McCarthy, T.: Automated road markings extraction from mobile laser scanning data. Int. J. Appl. Earth Obs. Geoinf. 32, 125–137 (2014)
7. Guan, H., Li, J., Yu, Y., Wang, C., Chapman, M., Yang, B.: Using mobile laser scanning data for automated extraction of road markings. ISPRS J. Photogramm. Remote Sens. 87, 93–107 (2014)
8. Thuy, M., León, F.: Lane detection and tracking based on lidar data. Metrol. Meas. Syst. 17(3), 311–321 (2010)
9. Yan, L., Liu, H., Tan, J., Li, Z., Xie, H., Chen, C.: Scan line based road marking extraction from mobile LiDAR point clouds. Sensors 16(6), 903 (2016)
10. Wang, Y., Shen, D., Teoh, E.K.: Lane detection using spline model. Pattern Recogn. Lett. 21(8), 677–689 (2000)

Applying Artificial Intelligence for the Detection and Analysis of Weather Phenomena in Vehicle Sensor Data

Wouter Van den Bogaert1(B), Toon Bogaerts2, Wim Casteels2, Siegfried Mercelis2, and Peter Hellinckx2

1 Faculty of Applied Engineering, University of Antwerp, Sint-Pietersvliet 7, 2000 Antwerp, Belgium — [email protected]
2 IDLab - Faculty of Applied Engineering, University of Antwerp - imec, Sint-Pietersvliet 7, 2000 Antwerp, Belgium — {toon.bogaerts,wim.casteels,siegfried.mercelis,peter.hellinckx}@uantwerpen.be

Abstract. Weather predictions arise from observatory stations at fixed locations, forming a nationwide grid. The low resolution of this grid does not allow for the prediction and discovery of local road weather conditions. This paper aims to identify weather conditions at a high spatial resolution by applying machine learning to vehicle sensor data. The model classifies anomalous samples in time series data into a road weather condition. We examine how Decision Trees can be applied to classify anomalous vehicle behavior into weather phenomena, and we specify which preparation steps on sensor observations are advisable before a model is applied. We constructed numerous Random Forest and Gradient Boosted Tree classifiers to classify anomalies in real-world vehicle data. A grid search over classifier hyperparameters and input configurations shows that well-considered feature selection and filtering have a significant impact on the accuracy.

1

Introduction

Over the last decades, vehicles have been equipped with an ever-increasing number of sensors. These sensors gather specific information and communicate with each other. The exchange of measurements between the sensors and various subsystems happens on one or more central networks in the vehicle: the CAN-bus [3]. Modern vehicles are equipped with, for example, a brightness sensor used by the subsystem that (de)activates the headlights, and an antilock brake sensor that detects differential wheel speeds, which are processed by the ABS subsystem to adjust the braking system. Besides being critical for in-car safety and comfort, these sensors can provide an indirect view of the vehicle's environment. For example, active windshield wipers could indicate a kind of precipitation, and the ABS system


supports the diagnosis of road slipperiness [12]. As road conditions have a major impact on the driveability of a vehicle and on road safety, a driver has to adapt his driving behavior appropriately. In Smart Highway and Connected Car ecosystems, this kind of observation can be shared in real time with vehicles within the same boundary to warn other (autonomous) drivers. Furthermore, the spatial coverage of vehicles allows for weather identification on a finer scale compared to fixed weather stations. In brief, the potential availability of a substantial amount of vehicle sensor observations from which weather-related road conditions can be analyzed offers opportunities for both traffic safety and local road-weather prediction. This paper examines how to distinguish road-weather-related events using CAN data and machine learning. A model is trained to classify anomalous data present on the vehicle CAN-bus; the anomalies are classified as weather-dependent or not. Section 2 of this paper describes relevant state-of-the-art techniques, Sect. 3 discusses methodologies for data analysis, pre-processing, and feature selection. The classification results are discussed in Sect. 4. In Sect. 5, conclusions are drawn and future work is described.

2

State of the Art

The features we feed to the classification model originate from the fusion of measurements from multiple sensors. The diversity of these sensors and the fact that measurements are performed on a moving object involve complications that we need to address. It is necessary to have sufficient understanding of the interaction between the driver and the environment, which is described in [15]. Whereas some sensor measurements can be used directly as features (e.g., temperature), [8] introduces roughness and friction coefficients as indirect features based on the fusion of various sensors. Principal Component Analysis for feature extraction is applied in [9] and [6]. [14] uses backward feature selection to determine an optimal feature set, which improves the classification performance of their model. The sensor measurement data on which we perform analysis are received as a multivariate time series. Once an anomaly is detected, the objective is to apply a machine learning technique to classify the anomalous state into the weather condition in which the vehicle is situated. We selected the Decision Tree as classifier algorithm because it can be constructed relatively fast and allows for classification at high speed while demanding limited resources [16]. It is considered a white-box model, as it is easy to construct, understand, and interpret. When the accuracy of a tree declines due to high dimensionality in the data, one can turn to ensemble learners such as Random Forests or Gradient Boosted Trees. Various studies have previously applied Decision Trees to vehicle data. Vuong et al. classify attacks on robot vehicles using a Decision Tree [17]. The model takes both cyber and physical features into account, and they achieve an accuracy of up to 94%. Freudling achieves an even higher accuracy of 97% when applying a Decision Tree to the detection of oversteering [6].


Breiman [2] outlines the Random Forest as an improvement over bagged tree predictors in which each tree depends on the values of a random vector. Bagging means that successive trees are not correlated with earlier trees. Each tree is constructed using a bootstrapped sample from the input set, after which a majority vote determines which classification outcome is retained. The split of a node is determined by the best subset of randomly selected features. The generalization error converges to a limit as the size of the forest becomes large; Liaw and Wiener state in [10] that this strategy is robust against overfitting. While Random Forest builds an ensemble of non-correlated trees, gradient boosting creates and adds sequentially dependent trees. Each new tree attempts to improve the classification outcome of previous trees through boosting [1]. The boosting method assigns extra weight to outcomes misclassified by earlier predictors. Ultimately, this leads to a weighted vote, in contrast to the majority vote used in Random Forest [10]. The Gradient Boosted Tree has proven successful across various domains [1]. The detection of anomalies, i.e. outliers in the vehicle sensor data, is performed by an Isolation Forest and a Long Short-Term Memory (LSTM) network. An anomaly exhibits two main characteristics: (i) it occurs less frequently than normal instances, and (ii) its attribute values differ profoundly from those of normal instances. Isolation Forest is an anomaly detection algorithm that works well on high-dimensional data with a large number of irrelevant attributes [11]. It constructs a binary search tree for each sample of a random subset. To obtain the anomaly score for a new sample, the algorithm inserts the new sample into each random tree. The score is then calculated by taking the mean of the insertion depth over all trees. Long Short-Term Memory (LSTM) is a type of Recurrent Neural Network (RNN). To perform time series forecasting, an RNN uses information from past observations to calculate the output. RNNs perform well when the forecast depends only on recent observations. When the dependency is spread out over a longer period, backpropagation takes a long time due to a decaying error backflow. Hochreiter and Schmidhuber address this problem with the LSTM algorithm, which is designed to learn long-term dependencies [7].
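To make the two anomaly-detection building blocks concrete, here is a minimal sketch using scikit-learn's Isolation Forest on multivariate sensor features; the feature matrix and its column names are hypothetical stand-ins for the CAN signals, not the paper's actual dataset or code.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical feature matrix: rows are time samples, columns are
# CAN-derived signals (e.g., wheel speeds, acceleration).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6))
X[::100] += 5.0  # inject a few obvious outliers

# Isolation Forest builds random trees; samples that are isolated
# with fewer splits (shallower average depth) get higher anomaly scores.
iso = IsolationForest(n_estimators=100, random_state=0).fit(X)
scores = -iso.score_samples(X)  # larger value = more anomalous
print("top anomaly indices:", np.argsort(scores)[-5:])
```

The negated `score_samples` convention simply turns scikit-learn's "higher is more normal" output into the "higher is more anomalous" score used later when filtering samples.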

3

Methodology

This section describes the data in Sect. 3.1, followed by the applied pre-processing in Sect. 3.2. Section 3.3 describes the expansion of the feature space and the applied filters. 3.1

Data Analysis

The data set on which this research is conducted contains journey information from a Mercedes A180 (2016). The data cover typical work-home journeys, i.e. a balanced amount of highway and non-highway sections, which took place between December 2019 and March 2020. An in-depth analysis is performed on three trajectories; for each of these trajectories, a journey in rainy circumstances as well as one in dry circumstances is present. Data are collected from the CAN-bus of the Mercedes and reverse engineered as described by J. de Hoog [5]. Features include wheel speeds, acceleration, steering wheel position, and many more. The windshield wipers are also present among the available signals; however, as we aim to explore the possibilities of detecting weather-dependent vehicle and driver behavior, the wiper data is removed. Besides the labeled features, there are some unidentified signals which are continuous and are considered as features for our algorithms. To give some visual context to the data, Fig. 1 illustrates distinct patterns present in the wheel speeds when conducting certain maneuvers.

Fig. 1. Evolution of individual wheel speeds at a roundabout (left) and when making a U-turn (right).

3.2

Data Pre-processing

To obtain labeled data, the vehicle CAN-bus data needs to be merged with weather data. The Royal Meteorological Institute of Belgium (RMI) provided us with the weather forecasts linked to the timestamp and location of the Mercedes. The information obtained from the RMI contains precipitation type and rate. CAN-bus data has an inconsistent sampling rate across the available signals. To handle this inconsistency, all features are resampled at 20 Hz. Gaps within features are filled using linear interpolation. At this sample rate, wheel slips can still be detected, while excessively high sampling rates would increase the complexity of the data and result in inefficient algorithms. The chosen sampling rate is thus a trade-off between efficiency and loss of information. Most safety features of vehicles, such as anti-lock braking systems or electronic stability programs, operate at 20 Hz to 50 Hz. Although the data is continuously labeled, one can intuitively say that under normal driving conditions, such as standstill or slow driving at constant velocity, no differences can be observed between a wet and a dry road surface. For this reason, filters are applied to the data to isolate events of interest, such as behavior that deviates strongly from normal and high-speed sections.
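A hedged sketch of the resampling and gap-filling step with pandas follows; the column names and the DataFrame construction are illustrative assumptions, not the authors' pipeline.

```python
import pandas as pd
import numpy as np

# Hypothetical CAN signal with an irregular sampling rate.
idx = pd.to_datetime(np.cumsum(np.random.default_rng(1).uniform(0.01, 0.2, 200)),
                     unit="s")
signal = pd.Series(np.sin(np.linspace(0, 10, 200)), index=idx, name="wheel_speed_fl")

# Resample to a uniform 20 Hz grid (50 ms bins) and fill gaps
# with linear interpolation, as described in the pre-processing step.
resampled = signal.resample("50ms").mean().interpolate(method="linear")
print(resampled.head())
```

Taking the mean within each 50 ms bin and interpolating across empty bins keeps the trade-off described above: the grid is fine enough to catch wheel slips, but coarse enough to keep the data volume manageable.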

3.3

Feature Engineering

As mentioned in Sect. 3.1, the data contains a large number of discrete and continuous features. Discrete signals describe, for example, the activation of headlights, indicators, etc., whilst continuous signals include wheel speeds, velocity, etc. In this research, we focus on weather-dependent vehicle behavior. For this reason, all features directly linked to weather, such as the wipers, are removed from the data. Two methods are applied to expand the feature space. First, secondary features are added: acceleration derived from the vehicle speed, wheel speed difference on the front/rear axles, standard-scaled versions of features, and the steering wheel rate derived from the steering wheel position. Second, anomaly detection algorithms are used to score samples based on their deviation from normal behavior, as described by S. Mercelis [13]. In general, these algorithms score high when a sample deviates from a Gaussian distribution fitted to its respective scaled feature. Each of these anomaly scores is added as a feature to our data set. Furthermore, an LSTM neural network is trained to predict regular behavior of the wheel speeds. When these predictions deviate significantly from the ground truth, the anomaly score rises. Related research remarks that these anomaly scores may vary as a function of speed [14]. This dependency biases the anomaly score: the values of a sample recorded at low speed fall within a different range than those of a sample recorded at higher speed, even though they represent the same anomalies. To eliminate this dependency, a polynomial is fitted and subtracted from the scores, as shown in Fig. 2 (see the sketch below). The filters used on this data are defined as follows: a speed filter, which removes all samples where the vehicle speed is below 70 km/h, aiming to remove parts of the data where differences between vehicle behavior on dry and wet roads are uncommon; and a filter on the overall anomaly score, which keeps only samples with a score between quantile 2 and quantile 3. The expanded feature space of 53 original signals and 192 derived features is then correlated with the weather information, as shown in Fig. 3. This figure shows the importance of filtering: both filters have a reinforcing influence on the correlation between the weather type and the feature space. As a benchmark for a clear indicator of the weather, the rear wipers are added for comparison with the indirect features. We can see that multiple features have a strong correlation with the precipitation type, highlighting the potential of weather-dependent vehicle behavior.
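The speed-dependence removal and the two filters can be sketched as follows; the degree of the polynomial (third) matches the caption of Fig. 2, while the variable names and thresholds are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
speed = rng.uniform(0, 130, 5000)                     # km/h
# Hypothetical anomaly score that drifts with speed plus noise.
score = 0.002 * speed + 1e-5 * speed**2 + rng.normal(0, 0.05, 5000)

# Fit a third-degree polynomial of score vs. speed and subtract it,
# removing the speed bias so scores are comparable across speeds.
coeffs = np.polyfit(speed, score, deg=3)
detrended = score - np.polyval(coeffs, speed)

# Speed filter: keep samples above 70 km/h.
mask_speed = speed >= 70
# Score filter: keep samples between the 2nd and 3rd quartile of the score.
q2, q3 = np.quantile(detrended, [0.50, 0.75])
mask_score = (detrended >= q2) & (detrended <= q3)

kept = mask_speed & mask_score
print(f"kept {kept.sum()} of {len(speed)} samples")
```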


Fig. 2. Anomaly scores dependent on the vehicle speed, fitted by a third-degree polynomial.

Fig. 3. Significant correlations to precipitation type. The correlation increases when filtering is applied. The unidentified signals are likely to be pressure or temperature related.

4

Results

We performed a grid search over the 644 combinations of hyperparameters and input configurations. The grid search was iterated three times, so for each combination three models were trained and validated. Out of each group of three, we retained the model with the highest accuracy. It is important to note


that accuracy as a metric alone leads to a biased interpretation because of an imbalance in the data set. When evaluating the models, we therefore considered the F1-scores of each class as secondary metrics. When preventing feature leakage by removing the directly linked weather parameters, the accuracy averaged 0.88. Although this is reasonable, the F1-score of the wet class was close to zero, both for the Gradient Boosted and Random Forest models. The highest wet F1-score for an optimized configuration is barely 0.16, while still maintaining a maximum dry F1-score of 0.92. Small Random Forests outperformed larger forests; however, Gini outperforms information gain as the splitting criterion here. All Gradient Boosted models achieved a poor wet F1-score of 0.01. The wet F1-score could not be improved by tuning the hyperparameters, nor by reducing the number of features. To investigate whether the classifiers were hindered by the large number of features, we also varied the number of input features, with no improvement. Significant improvements in the wet F1-score occur when filtering is applied to the optimized configuration. Figure 4a shows how the wet F1-score of the Random Forest increases from 0.15 up to 0.64. The evolution of the wet F1-score of the Gradient Boosted classifier is shown in Fig. 4b. Though filtering on score or speed separately has a substantial effect, the best result is achieved when both filters are combined. The performance of the Gradient Boosted tree is mainly determined by the tree method parameter: hist produces better results than auto. Varying other hyperparameters results in no notable difference. The Random Forest produces the best results when the forest size is limited to 5 trees, in combination with a depth of 10 or 20. When filtering on speed or score individually, the Gini criterion scores best; however, information gain takes the lead when both filters are applied. Our best-achieved accuracy and hyperparameter configurations are shown in Table 1 (a sketch of the search setup follows below).
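As a hedged illustration of such a grid search, the sketch below tunes a small Random Forest with scikit-learn and reports per-class F1-scores; the parameter grid only loosely mirrors the ranges in Table 1, and the data is synthetic and imbalanced to mimic the dry/wet skew.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import classification_report

# Synthetic imbalanced data: ~90% "dry" (0), ~10% "wet" (1).
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.9, 0.1],
                           random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=3)

grid = GridSearchCV(
    RandomForestClassifier(random_state=3),
    param_grid={
        "n_estimators": [5, 10, 20],          # small forests, as in the paper
        "max_depth": [10, 20, None],
        "criterion": ["gini", "entropy"],     # entropy = information gain
    },
    scoring="f1",   # optimize F1 of the minority (wet) class, not accuracy
    cv=3,
)
grid.fit(X_tr, y_tr)
print(grid.best_params_)
print(classification_report(y_te, grid.predict(X_te),
                            target_names=["dry", "wet"]))
```

Scoring on the minority-class F1 rather than accuracy reflects the paper's observation that accuracy alone is misleading on imbalanced data.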

Fig. 4. a) Filtering on samples improves the maximum Random Forest classifier scores (optimized feature set). b) Filtering on samples improves the maximum Gradient Boosted classifier scores (optimized feature set).

Table 1. Best feature and hyperparameter configurations

| Feature configuration | Benchmark        | Solution 1       | Solution 2       |
|-----------------------|------------------|------------------|------------------|
| Classifier            | Random Forest    | Random Forest    | Gradient Boosted |
| Filters               | None             | Speed & Score    | Speed & Score    |
| Accuracy              | 0.87             | 0.92             | 0.92             |
| F1 Dry                | 0.93             | 0.96             | 0.96             |
| F1 Wet                | 0.20             | 0.64             | 0.64             |
| Criterion/Method      | Information Gain | Information Gain | Hist             |
| Max Depth             | No limit         | No limit         | 6/10/20          |
| Forest Size           | 5                | 5                | 1/5/10/20        |
| Leaf Size             | 2                | 2                | -                |
| Split Size            | 4                | 4                | -                |
| Learning Rate         | -                | -                | 0.1              |
| Bin Size              | -                | -                | 128/256/512      |
Furthermore, we reduced the feature space to include only the features most correlated with the weather type. As this feature configuration contains only eight features, it greatly reduces the complexity of the models. This configuration still achieves similar results and is thus preferred. It has to be stressed that misclassification cannot be avoided entirely, due to mismatches between the actual road weather situation and the forecast: although no rain was forecast, the road surface could still be wet from earlier precipitation.

5

Conclusions and Future Work

This work researched the classification of time series vehicle sensor data into a road-weather condition. We developed a data preparation tool that converts raw measurements into features by performing interpolation, normalization, and merging with weather data. The merge with the weather data provided a 'wet' or 'dry' classifier label for each sample. Analysis of anomalous samples, predicted by an Isolation Forest, exposed patterns in the wheel speed curves. These patterns coincide with common driving maneuvers. During feature selection, we introduced four extracted features and scaled duplicates of features. We associated each sample, as well as each feature, with an anomaly score predicted by an LSTM. A correlation matrix identified a set of features with a small correlation to the prediction label. Correlations increased heavily when we employed filters on speed and score. These filters had a significant positive impact on the classifier accuracy; without them, the accuracy dropped sharply. The results show that the Random Forest and the Gradient Boosted Tree performed equally, obtaining an accuracy of 0.92. The F1-score for dry averaged 0.92, and the wet F1-score averaged 0.64.


Our experiment shows that the identification of road weather circumstances based on vehicle sensor data is possible: dry driving circumstances are successfully distinguished from wet ones. When this model is deployed in vehicles, it can provide road weather information to third parties, such as the RMI and downstream vehicles. Other classifier methods, such as HMMs or SVMs, are left for future work. First, it is worth investigating whether classifier fusion could improve overall accuracy. Second, further analysis of feature selection using PCA or backward feature selection might improve the accuracy and reduce the complexity of the models. Last, a windowed approach might be investigated, considering that this work operated on single samples. Acknowledgement. The SARWS (Real-time location-aware road weather services composed from multi-modal data [4]) Celtic-Next project (October 2018–2021) combines the expertise of commercial partners Verhaert New Products & Services, Be-Mobile, Inuits and bpost with the scientific expertise of research partners imec - IDLab (University of Antwerp) and the Royal Meteorological Institute of Belgium, together with an international consortium with partners in Portugal, France, South Korea, Turkey, Romania and Spain. The Flemish project was realised with the financial support of Flanders Innovation & Entrepreneurship (VLAIO, project nr. HBC.2017.0999).

References

1. Boehmke, B., Greenwell, B.M.: Hands-On Machine Learning with R. CRC Press, Boca Raton (2019)
2. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
3. CAN in Automation: History of the CAN Technology (2019). https://www.can-cia.org/can-knowledge/can/can-history/
4. Celtic-Next SARWS: Real-time location-aware road weather services composed from multi-modal data (2019). https://www.celticnext.eu/project-sarws/
5. de Hoog, J., Bogaerts, T., Casteels, W., Mercelis, S., Hellinckx, P.: Online reverse engineering of CAN data. Internet Things, 100232 (2020)
6. Freudling, T., BMW Group: Detecting Oversteering in BMW Automobiles with Machine Learning (2019). https://nl.mathworks.com/company/newsletters/articles/detecting-oversteering-in-bmwautomobiles-with-machine-learning.html
7. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
8. Irschik, D., Stork, W.: Road surface classification for extended floating car data. In: 2014 IEEE International Conference on Vehicular Electronics and Safety, pp. 78–83. IEEE, December 2014
9. Jonsson, P.: Road condition discrimination using weather data and camera images. In: 2011 14th International IEEE Conference on Intelligent Transportation Systems (ITSC), pp. 1616–1621. IEEE, October 2011
10. Liaw, A., Wiener, M.: Classification and regression by randomForest. R News 2(3), 18–22 (2002)
11. Liu, F.T., Ting, K.M., Zhou, Z.H.: Isolation forest. In: 2008 Eighth IEEE International Conference on Data Mining, pp. 413–422. IEEE, December 2008
12. Mahoney, B., Drobot, S., Pisano, P., McKeever, B., O'Sullivan, J.: Vehicles as mobile weather observation systems. Bull. Am. Meteorol. Soc. 91(9), 1179–1182 (2010)
13. Mercelis, S., Watelet, S., Casteels, W., Bogaerts, T., Van den Bergh, J., Reyniers, M., Hellinckx, P.: Towards detection of road weather conditions using large-scale vehicle fleets. In: 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring), pp. 1–7. IEEE, May 2020
14. Perttunen, M., Mazhelis, O., Cong, F., Kauppila, M., Leppänen, T., Kantola, J., Collin, J., Pirttikangas, S., Haverinen, J., Ristaniemi, T., Riekki, J.: Distributed road surface condition monitoring using mobile phones. In: International Conference on Ubiquitous Intelligence and Computing, pp. 64–78. Springer, Heidelberg, September 2011
15. Petty, K.R., Mahoney III, W.P.: Weather Applications and Products Enabled Through Vehicle Infrastructure Integration (VII): Feasibility and Concept Development Study (No. FHWA-HOP-07-084), United States Federal Highway Administration (2007)
16. Sharma, H., Kumar, S.: A survey on decision tree algorithms of classification in data mining. Int. J. Sci. Res. (IJSR) 5(4), 2094–2097 (2016)
17. Vuong, T.P., Loukas, G., Gan, D., Bezemskij, A.: Decision tree-based detection of denial of service and command injection attacks on robotic vehicles. In: 2015 IEEE International Workshop on Information Forensics and Security (WIFS), pp. 1–6. IEEE, November 2015

Proposal of a Traditional Craft Simulation System Using Mixed Reality Rihito Fuchigami(B) and Tomoyuki Ishida Fukuoka Institute of Technology, Fukuoka, Fukuoka 811-0295, Japan [email protected], [email protected]

Abstract. In this study, we implemented a traditional craft simulation system using Mixed Reality (MR) technology. Using MR, the user can experience 3DCG models of traditional crafts superposed on real space. The user experiences this system through a head mounted display (HMD), which allows the user to simulate traditional craft objects with a high sense of presence and absorption. This system contributes to the spread of traditional Japanese crafts.

1 Introduction In Japan, traditional crafts such as textiles, dyed and woven goods, porcelain, lacquerware, and Japanese paper have been manufactured by craftsmen since ancient times. Some of these are designated as important cultural properties. Traditional crafts have been deeply involved in the Japanese lifestyle and culture. However, the current situation surrounding traditional crafts involves the following major issues [1]. • Stagnant demand due to changes in Japanese lifestyles and the import of inexpensive household goods from overseas • Shortage of successors due to population decline, high economic growth, and modernization of industry • Limitations on mass production due to the careful work traditional crafts require and their various complicated manufacturing processes • Decrease and depletion of the limited natural raw material resources used as materials for traditional crafts

2 Previous Study Lu et al. [2] implemented a highly immersive virtual traditional craft presentation system using an inexpensive Head Mounted Display (HMD). This system provides the user with a highly realistic experience of traditional crafts by constructing an interior space that combines "Japanese" and "Western" styles with virtual reality technology. In addition, this system enables remote users to cooperate by realizing a remote sharing function of the

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. Barolli et al. (Eds.): 3PGCIC 2020, LNNS 158, pp. 321–329, 2021. https://doi.org/10.1007/978-3-030-61105-7_32


space via network technology. However, in order for users to experience traditional crafts, developers have to construct VR spaces, such as Japanese and Western style houses and room spaces, in advance. Iyobe et al. [3–5] implemented a mobile traditional craft presentation system using augmented reality technology. The user can easily use this system since a mobile terminal is used as the platform. Using AR technology, this system superposes 3D objects of traditional crafts on the real room space captured by the camera of the mobile terminal. In addition, this system enables intuitive operation of 3D objects through the multi-touch function of the mobile terminal. However, this system lacks the realism of traditional crafts, because the presentation is performed on the two-dimensional flat display of a tablet terminal.

3 Research Objective The traditional craft presentation system using an HMD in the previous study realized movement and exchange of traditional crafts in VR space, as well as VR space sharing with remote users. However, this system needs a VR space suitable for traditional crafts to be constructed in advance, and the amount of work is enormous. On the other hand, the mobile traditional craft presentation system using AR in the previous study realized operations such as moving, rotating, and scaling traditional crafts by superposing them on real space. However, this system lacks a high sense of presence and absorption, because the user experiences the traditional craft objects superposed on the real space through a tablet terminal. Therefore, in this study, we implement a traditional craft simulation system using MR that offers high presence and absorption without the need to construct a VR space in advance.

4 System Architecture The system architecture is shown in Fig. 1. This system consists of an object presentation system control function, a head mounted display function, a mixed reality space control function, and a traditional craft object storage. 4.1 Object Presentation System Control Function The object presentation system control function consists of a user interface, a mixed reality space control manager, a real space scan manager, an object generate and delete manager, an object grab manager, and an object scaling manager. • User Interface The user interface is the interface between the user and the application, and provides the user with MR space and virtual objects of traditional crafts.

Proposal of a Traditional Craft Simulation System Using Mixed Reality

323

Fig. 1. System architecture of the traditional craft simulation system.

• Mixed Reality Space Control Manager The mixed reality space control manager manages controller position detection, object collision detection, button input detection, and so on. In addition, this manager creates objects and scans the real space by menu management in the MR space. • Real Space Scan Manager The real space scan manager scans the real space from the menu displayed in the MR space. • SRworks SDK The SRworks SDK provides modules such as depth, spatial mapping, and seethrough.

324

R. Fuchigami and T. Ishida

• Object Generate and Delete Manager The object generate and delete manager generates or deletes objects stored in the traditional craft object storage in MR space. • Object Grab Manager The object grab manager provides the user with a function to grab an object by operating the trigger while the controller is touching the object arranged in MR space. • Object Scaling Manager The object scaling manager provides the user with an object scaling function, activated by triggering the other controller while holding the object via the object grab manager. 4.2 Head Mounted Display Function The head mounted display function is a function of the head mounted display itself, and comprises a dual camera and a display. 4.3 Mixed Reality Space Control Function The mixed reality space control function controls the real space, via dual camera images, and the virtual space where virtual objects are arranged. 4.4 Traditional Craft Object Storage The traditional craft object storage stores the 3D objects of traditional crafts used in this study.

5 Traditional Craft Simulation System Prototype We use 3D objects of traditional crafts converted from CAD format (.dxf) to Autodesk FBX format (.fbx) (Fig. 2). 5.1 Initial Screen of the Traditional Craft Simulation System When the user launches the traditional craft simulation system, the Unity logo is displayed for a few seconds. Then, 3DCG models of both hands holding the user's controllers are displayed over the real space captured by the dual camera of the HMD, as shown in Fig. 3. At this point, since the real space has not yet been scanned by the dual camera of the HMD, the space viewed by the user is not the MR space. The user can scan the real space and arrange traditional craft objects by displaying the menu.


Fig. 2. Traditional craft objects of Nanao City, Ishikawa Prefecture handled in this study.

Fig. 3. Initial screen of the traditional craft simulation system.

5.2 Menu Screen of the Traditional Craft Simulation System The user can display menu screen by pressing the menu button on the left-hand controller as shown in Fig. 4. From this menu screen, the user can select an “Object generation function” button to arrange traditional craft objects in space and a “Scanning” button to scan the real space as shown in Fig. 5.

326

R. Fuchigami and T. Ishida

Fig. 4. Button operation for expanding the menu screen.

Fig. 5. Menu screen of the traditional craft simulation system.

5.3 Traditional Craft Object Selection Screen The user can select a traditional craft object to be arranged in MR space by pressing the “Object generation function” button from the menu screen as shown in Fig. 6. This system provides the user with the traditional craft shoji (paper sliding door) objects and fusuma (sliding door) objects.

Proposal of a Traditional Craft Simulation System Using Mixed Reality

327

Fig. 6. Traditional Craft object selection screen.

5.4 Traditional Craft Object Delete Function When deleting a traditional craft object from MR space, the user can delete the arranged object by pressing the grip button as shown in Fig. 7.

328

R. Fuchigami and T. Ishida

Fig. 7. Deleting traditional craft objects.

6 Conclusion In this study, we implemented a traditional craft simulation system using MR technology. This system provides the user with an MR environment offering high presence and absorption. The user can freely arrange traditional crafts in MR space through the traditional craft simulation system. This makes it possible to enjoy simulating traditional crafts from various viewpoints.


References 1. Traditional Craft Industry Office, Current situation and future promotion measures of the traditional craft industry. https://www.meti.go.jp/committee/summary/0002466/006_06_00.pdf. Accessed 7 July 2020 2. Lu, Y., Ishida, T., Miyakawa, A., Shibata, Y., Habuchi, H.: Proposal of a high-presence japanese traditional crafts presentation system integrated with different cultures. In: Proceedings of the 22nd International Conference on Network-Based Information Systems, pp. 341–349 (2019) 3. Iyobe, M., Ishida, T., Miyakawa, A., Shibata, Y.: Kansei retrieval method by principal component analysis of Japanese traditional crafts. In: Proceedings of the 23rd International Symposium on Artificial Life and Robotics, pp. 588–591 (2018) 4. Iyobe, M., Ishida, T., Miyakawa, A., Sugita, K., Uchida, N., Shibata, Y.: Development of a mobile virtual traditional crafting presentation system using augmented reality technology. Int. J. Space-Based Situated Comput. (IJSSC) 6(4), 239–251 (2017) 5. Iyobe, M., Ishida, T., Miyakawa, A., Shibata, Y.: Implementation of a mobile traditional crafting application using kansei retrieval method. IT CoNvergence PRAct. (INPRA) 5(1), 15–44 (2017)

Development and Evaluation of an Inbound Tourism Support System Using Augmented Reality Yusuke Kosaka and Tomoyuki Ishida(B) Fukuoka Institute of Technology, Fukuoka, Fukuoka 811-0295, Japan [email protected], [email protected]

Abstract. In this study, we have developed an inbound tourism support system that links tourism resources by plotting the tourism resource data on a map and transmitting the attractiveness of the city to tourists including foreign tourists visiting Japan. The inbound tourism support system developed in this study consists of a regional sightseeing content management agent that manages regional tourism content provided to users and a tourist agent that actually discovers the attractiveness of the city via mobile application. The regional sightseeing content management agent registers and edits regional tourism content via the regional tourism content management system, and links the regional tourism content with each other. On the other hand, the tourist agent browses various regional tourism contents registered by the regional sightseeing content management agent via the mobile application.

1 Introduction The following issues are evident from the results of a questionnaire survey conducted by the Japan Tourism Agency concerning the improvement of the environment for accepting foreign tourists visiting Japan [1]. • Communication with staff at facilities • Free public wireless LAN environment • Multilingual display of tourist information and maps Focusing on multilingual display, large costs are required for the multilingualization of tourist information boards and maps and for securing human resources capable of handling multiple languages. As a means to solve these issues, the smartphones that have spread rapidly in recent years are useful. At present, many foreign tourists visiting Japan use smartphones as a means of collecting information during their visits. In order to reduce the cost of installing and translating multilingual guides, we try to solve these issues by combining augmented reality (AR) technology with the smartphones used by many foreign tourists visiting Japan. © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. Barolli et al. (Eds.): 3PGCIC 2020, LNNS 158, pp. 330–338, 2021. https://doi.org/10.1007/978-3-030-61105-7_33


2 Research Objective In this study, we develop the inbound tourism support system that links tourism resources by plotting the tourism resource data on a map and transmitting the attractiveness of the city to tourists including foreign tourists visiting Japan. This system realizes effective information presentation and navigation to sightseeing spots to users by using augmented reality technology.

3 System Configuration and Architecture This system consists of the regional sightseeing content management agent, the tourist agent, the regional sightseeing content management server, and the regional sightseeing content database server. The system configuration of the inbound tourism support system is shown in Fig. 1, and the system architecture, shown in Fig. 2, consists of the regional sightseeing content management agent, the tourist application system, the regional sightseeing content management server, the regional sightseeing content database server, and the Web API. 3.1 Regional Sightseeing Content Management Agent The regional sightseeing content management agent manages the regional tourism content provided to the tourist agent via the administrator's Web application. The content managed by this agent includes the names and images of tourism resources, resource descriptions, coordinates, and AR content linked to the coordinates. 3.2 Tourist Agent The tourist agent browses the regional tourism contents stored in the regional sightseeing content database server via the regional sightseeing content management server. In addition, the user can browse AR content according to the user's current location and use the route guidance function to the destination via the mobile application. 3.3 Regional Sightseeing Content Management Server The regional sightseeing content management server registers content in the regional sightseeing content database server according to requests from the regional sightseeing content management agent that manages the regional tourism content. In addition, this management server provides content to the end user from the regional sightseeing content database server according to requests from the tourist agent. 3.4 Regional Sightseeing Content Database Server The regional sightseeing content database server stores regional tourism content (names and images of tourism resources, resource descriptions, coordinates, AR content linked to the coordinates, etc.).


Fig. 1. System configuration of the inbound tourism support system.

Fig. 2. System architecture of the inbound tourism support system.

4 Inbound Tourism Support System The inbound tourism support system consists of an inbound tourism support mobile application for foreign tourists visiting Japan to browse regional tourism content and an inbound tourism support management web application for managing regional tourism content. The functions of the inbound tourism support mobile application and the inbound tourism support management web application are shown in Table 1 and Table 2.


Table 1. Inbound tourism support mobile application functions.

| Function name        | Function summary |
|----------------------|------------------|
| AR Display Function  | This function superimposes and displays regional tourism contents as AR in real space |
| Inbound Function     | This function switches the regional tourism content superimposed and displayed as AR between Japanese, English, and Chinese |
| Map Display Function | This function displays a map on a tablet terminal and visualizes regional tourism content as markers |
| Navigation Function  | This function navigates to tourism resources and displays routes on a map |

Table 2. Inbound tourism support management web application function.

| Function name                              | Function summary |
|--------------------------------------------|------------------|
| Regional Tourism Content Management Function | This function manages regional tourism contents provided to the inbound tourism support mobile application |

334

Y. Kosaka and T. Ishida

Fig. 3. Top screen (left) and menu screen (right).

4.3 Map Mode of the Inbound Tourism Support Mobile Application When the user selects "Map mode" on the menu screen, the screen transitions to the map mode screen, as shown in the left of Fig. 5. The user's current location and the registered regional tourism contents are displayed on the map mode screen. When the user selects a marker indicating registered regional tourism content on the map mode screen, an explanation of the tourism content is displayed. In addition, when the user selects the "Start Navigate" button while the explanation of the tourism content is displayed, a route to the registered regional tourism content is displayed, as shown in the right of Fig. 5. 4.4 Inbound Tourism Support Management Web Application The registration screen for regional tourism content is shown in Fig. 6. When the administrator clicks on an arbitrary point on the map, a marker is displayed and the latitude and longitude are automatically entered in the registration form. The administrator then enters an arbitrary content name into the content name form. In addition, the administrator can register any regional tourism content data by selecting a regional tourism content file via the "Select File" button and pressing the "New Registration" button.


Fig. 5. Map mode screen (left) and route navigation screen (right).

335

336

Y. Kosaka and T. Ishida

Fig. 6. Registration screen for regional tourism content using the inbound tourism support management web application.

5 System Evaluation To evaluate the operability, effectiveness, and applicability of each function of the inbound tourism support system, we conducted a questionnaire survey of 17 subjects after they had actually experienced the AR application. 5.1 AR Mode Operability Evaluation Result Regarding the AR mode operability of the inbound tourism support system, 76% of subjects answered "Easy" or "Somewhat easy", as shown in Fig. 7. From this result, we were able to confirm the high AR mode operability of the inbound tourism support system. 5.2 Map Mode Operability Evaluation Result Regarding the map mode operability of the inbound tourism support system, 88% of subjects answered "Easy" or "Somewhat easy", as shown in Fig. 8. From this result, we were able to confirm the high map mode operability of the inbound tourism support system. 5.3 Inbound Tourism Support System Effectiveness Evaluation Result Regarding the effectiveness of the inbound tourism support system, 82% of subjects answered "Effective" or "Somewhat effective", as shown in Fig. 9. From this result, we were able to confirm the high effectiveness of the inbound tourism support system.


Fig. 7. AR mode operability evaluation result of the inbound tourism support system (n = 17).

Fig. 8. Map mode operability evaluation result of the inbound tourism support system (n = 17).

Fig. 9. Effectiveness evaluation result of the inbound tourism support system (n = 17).

338

Y. Kosaka and T. Ishida

5.4 Inbound Tourism Support System Applicability Evaluation Result Regarding the applicability of the inbound tourism support system, 100% of subjects answered "Possible" or "Somewhat possible", as shown in Fig. 10. From this result, we were able to confirm the high applicability of the inbound tourism support system.

Fig. 10. Applicability evaluation result of the inbound tourism support system (n = 17).

6 Conclusion In this study, we developed and evaluated an inbound tourism support system that links tourism resources by plotting tourism resource data on a map and transmitting the attractiveness of the city to tourists, including foreign tourists visiting Japan. The inbound tourism support system developed in this study consists of the regional sightseeing content management agent and the tourist agent. The tourist agent can browse the various regional tourism contents registered by the regional sightseeing content management agent in either AR mode or map mode. We conducted a questionnaire evaluation with 17 subjects to evaluate the inbound tourism support system. As a result of the evaluation, the operability, effectiveness, and applicability of each function received high ratings.

Reference 1. Japan Tourism Agency: Questionnaire Results on Improvement of Accepting Environment for Foreign Tourists visiting Japan. https://www.mlit.go.jp/common/001281549.pdf. Accessed 7 May 2020

A Study on the Relationship Between Refresh-Rate of Display and Reaction Time of eSports Koshiro Murakami1(B) , Kazuya Miyashita2 , and Hideo Miyachi2 1 Graduate School of Environmental and Information Studies, Tokyo City University,

3-3-1 Ushikubo-nishi Tsuzuki-ku, Yokohama City, Japan [email protected] 2 Department of Information Systems, Tokyo City University, 3-3-1 Ushikubo-nishi Tsuzuki-ku, Yokohama City, Japan {g1772082,miyachi}@tcu.ac.jp

Abstract. In the Electronic Sports (eSports) field, high-frequency displays such as 144 Hz and 240 Hz are used more often than 60 Hz displays. In addition, it is said that people who use a high-performance computer with a graphics board have an advantage in eSports. Therefore, we have been studying the relationship between the performance of computers and reaction time in eSports. As a preliminary test, we conducted a simple reaction time test in which a button was clicked when the screen color changed. As a result, the differences between 60 Hz, 120 Hz, and 240 Hz cannot be ignored. Assuming that the reaction time of each individual is constant, this difference may be due to the refresh rate, because the time at which the light stimulus is presented differs between refresh rates. Therefore, it is suggested that using a display with a high refresh rate is advantageous when competing in simple games.

1 Introduction In recent years, Electronic Sports (eSports) have become very popular around the world [1]. eSports are defined as competitive matches of games such as board games, card games, shooting games, and fighting games. In the eSports field, there are discussions about the positive and negative effects eSports have on people [2–4]. Especially in Japan, playing eSports is often seen as a mere hobby, and it is often claimed to have negative effects on health and studying. However, eSports have no relation to gender, region, age, or physical strength. Therefore, they have the potential to become a national sport that contributes to education and health. In our experience, amateurs cannot compete with professionals, so professionals must have skills that are improved by eSports. Hence, we started a project to clarify those skills. To obtain an objective measure, we have to estimate the effect that computer performance has on game scores. In this study, we measured the latency in reaction time that is affected by the refresh rate.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. Barolli et al. (Eds.): 3PGCIC 2020, LNNS 158, pp. 339–347, 2021. https://doi.org/10.1007/978-3-030-61105-7_34


2 Related Works In recent years, multi-faceted studies on eSports have been conducted, addressing economic and social issues [5]. However, only a limited number of studies have reported on human skills in eSports. In general, the reaction time to the display of professional gamers is considered to be faster than that of amateurs. In order to measure it accurately, it is necessary to estimate the computer delay included in the reaction time. Our literature search failed to reveal any work on the relationship between the delay caused by the computer system and the reaction time in eSports. However, videos can be found on the internet that show how computer performance affects game play [6–9]. NVIDIA GeForce [6] pointed out that three kinds of problems — animation smoothness, screen tearing, and ghosting (motion blur) — occur when playing games on a low-performance computer. First, the animation smoothness problem concerns delays caused by a difference in drawing update timing between the software and the display. For example, comparing 60 FPS (frames per second) and 30 FPS, a picture is drawn every 1/60 s at 60 FPS but every 1/30 s at 30 FPS. Hence, a car at 30 FPS may still be drawn at its initial position at t = 1/60 s, and may only reach the same position as at 60 FPS at t = 2/60 s, as shown in Fig. 1.

Fig. 1. Animation Smoothness

Second, screen tearing is a problem in which an old picture and a new picture (#N and #N + 1 respectively in Fig. 2) are drawn at the same time, split across the scan line, when the display updates, as shown in Fig. 2. Third, ghosting is a problem in which multiple frames overlap and are visible at the same time, as shown in Fig. 3. The video showed that these three problems occur on low-performance computer systems, but it did not report which system specifications cause them. NVIDIA GeForce [7] demonstrated that two kinds of delays, input delay and display delay, occur in relation to the performance of the graphics board and monitor. These delays cause a problem in which players cannot defeat the enemy even if they shoot when the timing seems exactly right. In the case of a game called Double Door, in which the player must defeat an enemy crossing a gap at the end of a passage, a player's hit ratio at 60 Hz/60 FPS was 2/30, but the same player's hit ratio at 240 Hz/240 FPS increased


Fig. 2. Screen Tearing

Fig. 3. Ghosting (Motion Blur)

to 7/10. When two people shoot at one target using machines with different speeds, the player with the slower machine can hardly ever defeat the target. It has been suggested that the performance of the computer greatly influences not only the problems that occur on the monitor but also the game score. However, no specific causal relationship was shown in the video. Pixio Japan [8, 9] demonstrated that the refresh rate of the display affects game scores. When four players with ample eSports experience each played a simple practice game, clicking on targets that appeared on the screen, for three hours on the same computer, it was observed that, normalizing the average score at 60 Hz to 100, the average scores at 144 Hz and 240 Hz were 108 and 110, respectively. This is considered an experiment investigating the effect of the display refresh rate on the game score, because the load such a simple game places on the CPU and GPU is extremely light. From this experiment, the video concluded that a


240 Hz display is effective for advanced gamers, while displays up to 144 Hz are sufficient for beginners.

3 Computer System-Derived Delay In this chapter, the relationship between the issues discussed in the previous chapter and computer system is discussed. Figure 4 illustrates the various kinds of latency that occur when playing a game.

Fig. 4. Reaction time and various kinds of latency

In this experiment, the subject presses a button on the mouse or keyboard when the color of an object on the screen changes from white to blue. The structure of the reaction time measured in the test is shown in Fig. 4. The measurement starts when the game software determines that the color changes from white to blue (character "S" in Fig. 4). Based on this determination, the game software executes the rewriting process of the scene to be displayed — in this test, the process of changing the object's color from white to blue. Next, rendering is performed on the updated scene, and the result is written to the frame buffer. The data in the frame buffer is scanned by the display and the screen is updated. The subject perceives and recognizes the color change, makes a judgment, and presses a button on the mouse or keyboard. The signal that the button was pressed is written to the event buffer via the IO controller, and the measurement ends when the game program detects the event (character "E" in Fig. 4). The reaction time is the sum of three components: computer processing time, display scanning time, and human reaction time. The computer processing time consists of two parts. One is the time from the start of measurement to the end of rendering — in


other words, the sum of the game processing time and the rendering time in Fig. 4, which takes about one cycle of the game software loop. The other is the latency from when the button is pressed until the game software detects it. This depends on the timing at which the event signal arrives and takes up to one cycle; assuming events arrive at random timing, it is estimated at about 0.5 cycle on average. Therefore, the computer processing time is expected to take about 1.5 cycles. The display scanning time is the time needed to update the image on the screen. The duration of one cycle of the game software can be adjusted by setting the FPS in Unity. In this test, since this was set to 240 FPS, one cycle is expected to be about 4.2 ms. The refresh rate of the display was set to 240 Hz, 120 Hz, or 60 Hz from the OS settings. With this setting, the problem of Fig. 1 in the previous chapter occurs in proportion to the refresh rate. The screen tearing of Fig. 2 can be prevented by turning V-Sync on, but in this test the setting was asynchronous. The ghosting of Fig. 3 is related to the response time of the display. When a movie was shot at 920 frames per second on the display used in this test, it responded in about 4 ms, corresponding to 240 Hz, so ghosting is considered not to affect this test.
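To make the latency decomposition concrete, a minimal sketch of the expected system-induced delay follows; it implements the 1.5-cycle estimate plus an average display-scan wait of half a refresh period, which is our reading of the paper's model rather than the authors' own calculation.

```python
def expected_system_delay_ms(game_fps: float, refresh_hz: float) -> float:
    """Expected system delay: ~1.5 game-loop cycles of processing
    plus, on average, half a refresh period waiting for the next scan."""
    cycle_ms = 1000.0 / game_fps          # one game-loop cycle
    scan_wait_ms = 0.5 * 1000.0 / refresh_hz  # mean wait for next refresh
    return 1.5 * cycle_ms + scan_wait_ms

for hz in (60, 120, 240):
    print(f"{hz:3d} Hz: ~{expected_system_delay_ms(240, hz):.1f} ms")
# At 240 FPS game loop: ~14.6 ms at 60 Hz, ~10.4 ms at 120 Hz, ~8.3 ms at 240 Hz
```

Under this model, moving from 60 Hz to 240 Hz saves only a handful of milliseconds, which frames the paper's later question of whether the observed per-tester differences can be explained by system delay alone.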

4 Experiment Environment 4.1 Software According to R. J. Kosinski [10], human reaction time is discussed in three types: simple reaction time, recognition reaction time, and selective reaction time. In this experiment, we measure simple reaction time, defined as the time required for a subject to initiate a prearranged response to a defined stimulus. Using the Unity game engine, we developed a game in which a player clicks the mouse button or a key when the color of a rectangular plate changes. We measured the time from the color change of the rectangular plate until the player clicked the button as the reaction time. It is said that the human reaction time to a light stimulus is 200 ms [11], but the range within which humans can discriminate when the light stimulus occurs has not been clarified. We fix the Application.targetFrameRate variable in Unity [12] to 240 FPS, turn off V-Sync, change the display refresh rate to 60, 120, and 240 Hz, and investigate the reaction time for each refresh rate. This experiment does not measure pure human reaction time; rather, it measures the delay on the system. If the reaction time of individuals becomes an average value with a sufficient number of trials, the time obtained by subtracting the human reaction time from one-loop time described above is considered to be the delay caused by the system. If the one-loop time is constant, this shows that the system delay is so small that it does not affect human reaction time. How well an individual can discriminate system delays is considered an individual ability distinct from reaction time. 4.2 Hardware Five testers played the game using the same hardware, shown in Table 1.


Table 1. Performance of computer

Display: Dell AW2720HF
Resolution: 1920 x 1080
Bit depth: 8 bit
Color format: RGB
Color space: Standard Dynamic Range (SDR)
Processor: Intel® Core™ i9-9900K CPU @ 3.60 GHz
RAM: 32.0 GB
OS: Windows 10 Home
Version: 1909

5 Experiment Result

In this study, we conducted two experiments to measure human simple reaction time. In each experiment, we measured the time from the moment the color of the rectangular plate changed to the moment the player clicked the button. In the first experiment, the tester pressed a key on the keyboard when the color changed; in the second, the tester clicked the mouse button. In each experiment, each tester performed the reflex test 30 times at each display refresh rate: 60, 120, and 240 Hz. We sorted the data from fastest to slowest and used the middle 10 values as valid data, in order to reject extremely fast or slow trials (a minimal sketch of this trimming is given below). The results of the two experiments are shown in Fig. 5 and Fig. 6. In each figure, the points on the left show the results at 240 Hz, the middle ones at 120 Hz, and the ones on the right at 60 Hz. As these figures show, almost every tester clicked faster as the display refresh rate increased. As we expected, this clarifies that the display refresh rate directly affects the measured reaction time. However, our prediction was that, if human reaction time is constant, the difference in reaction time between refresh rates should come from the system delay alone, and this may not hold here: the difference values vary completely from tester to tester, as shown in Table 2 and Table 3.
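The trimming step can be expressed compactly. The following is an illustrative Java sketch (not the authors' analysis code); the class and method names are hypothetical:

```java
import java.util.Arrays;

// Sort the 30 trial times and average the middle 10, discarding the
// extremely fast and slow trials as described in the text.
public class TrimmedMean {
    static double middleTenMean(double[] trials) {   // expects 30 samples
        double[] sorted = trials.clone();
        Arrays.sort(sorted);                          // fastest to slowest
        int start = (sorted.length - 10) / 2;         // e.g. index 10 of 30
        double sum = 0;
        for (int i = start; i < start + 10; i++) {
            sum += sorted[i];
        }
        return sum / 10.0;                            // mean of the middle 10
    }
}
```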


Fig. 5. Reaction time of keyboard

Fig. 6. Reaction time of mouse button


Table 2. Difference value of reaction time between each display refresh-rate in experiment 1

Table 3. Difference value of reaction time between each display refresh-rate in experiment 2


6 Conclusion

In this study, we focused on the relation between display refresh rate and reaction time. The results show that the display refresh rate directly affects the measured reaction time. This suggests that using a display with a high refresh rate is advantageous mostly in eSports titles that depend on reflexes, such as shooting or fighting games. However, the difference in reaction time between refresh rates varies completely from tester to tester, and in most cases these differences are larger than the expected system delay. From these results, our prediction that human reaction time is constant is not satisfied. In addition, there may be delays other than the system delay, but they have not been clarified. In this study, we measured simple reaction time with a simple game; as future work, we should test with a complex game and compare the results with this study to understand more deeply how computer performance relates to player skill.

Acknowledgments. This research was supported by a Grant from The Telecommunication Advanced Foundation.

References

1. Chikish, Y., Carreras, M., Garci, J.: eSports: a new era for the sports industry and a new impulse for the research in sports (and) economics? In: Sports (and) Economics, pp. 477–508. FUNCAS (Spanish Savings Banks Foundation) (2019)
2. Happonen, A., Minashkina, D.: Professionalism in esport: benefits in skills and health & possible downsides. LUT Scientific and Expertise Publications (2019). ISBN 978-952-335-375-6
3. Griffiths, M.D.: The psychosocial impact of professional gambling, professional video gaming, and eSports. Casino Gaming Int. 28, 59–63 (2017)
4. Choi, C., Hums, M., Bum, C.H.: Impact of the family environment on juvenile mental health: esports online game addiction and delinquency. Int. J. Environ. Res. Public Health 15(12), 2850 (2018). https://doi.org/10.3390/ijerph15122850
5. Pedraza-Ramirez, I., Musculus, L., Raab, M., Laborde, S.: Setting the scientific stage for esports psychology: a systematic review. Int. Rev. Sport Exerc. Psychol. 13, 319–352 (2020). https://doi.org/10.1080/1750984X.2020.1723122
6. GeForce Powered High FPS CS:GO SLO-MO Video. https://www.youtube.com/watch?v=uJxxCgKa0mU
7. Does High FPS Make You a Better Gamer? – Favorite Moments. https://www.youtube.com/watch?v=x-kwlaKKhp4
8. Difference of hitting rate and scores between each refresh-rate of display 60hz, 144hz and 240hz [First half] (translated from Japanese). https://www.youtube.com/watch?v=-UMnJHeyibk
9. Difference of hitting rate and scores between each refresh-rate of display 60hz, 144hz and 240hz [Second half] (translated from Japanese). https://www.youtube.com/watch?v=HfiAOrdSKf0
10. Kosinski, R.J.: A Literature Review on Reaction Time. Clemson University, September 2013. https://www.cognaction.org/cogs105/readings/clemson.rt.pdf
11. Oyama, T.: Historical background and the present status of reaction time studies. Jpn. J. Ergon. 21(2), 57–64 (1985) (in Japanese)
12. Unity Documentation: Application.targetFrameRate (2020). https://docs.unity3d.com/ScriptReference/Application-targetFrameRate.html

Basic Consideration of Video Applications System for Tourists Based on Autonomous Driving Road Information Platform in Snow Country Yoshitaka Shibata(B) , Akira Sakuraba, Yoshiya Saito, Yoshikazu Arai, and Jun Hakura Iwate Prefectural University, Takizawa, Japan {shibata,a_saku,y-saito,hakura}@iwate-pu.ac.jp

Abstract. Autonomous driving is becoming more popular year by year around the world. However, applications running on autonomous vehicles have not been discussed much so far. In this paper, video applications including AR and VR contents for tourists on an autonomous vehicle are proposed for the normal case. For the emergency case, automatic disaster information delivery and road navigation from the current area to a safe evacuation area are also considered. The system configuration, the architecture of the autonomous driving road information platform, and its presentation method are explained in detail. Finally, a prototype system using a currently available autonomous driving platform is discussed.

1 Introduction

Recently, autonomous driving systems have been investigated and developed in industrial countries around the world. Some countries, such as the U.S., China, Japan and those in Europe, have produced practical autonomous driving cars that can run on exclusive roads and highways at level 3 or 4. However, most of those autonomous driving cars are only used to carry persons, goods and food in urban areas, in the same way as ordinary human-driven cars. In snow countries there are many tourist spots, such as national parks, historical places, festivals, sports event venues, food centers, entertainment areas, spas and hotels. However, autonomous driving in those areas has not been considered. Since road conditions in winter are the worst (the road surface may be snowy, damp, sherbet-like or icy), it is very dangerous to drive on those roads. The information network infrastructure in rural areas is also not well developed. For those reasons, there are so far few applications of autonomous vehicles that take advantage of them in a wide range of fields, beyond secure and trusted life applications such as disaster response, medical and emergency transportation, and COVID-19 response. For example, consider sight-seeing by foreign tourists using public transportation or rental cars: in the current situation it is very difficult for them to go and look around tourist spots without trouble, because they are not familiar with the roads, signs and parking lots.


Furthermore, in an emergency such as an earthquake or tsunami, such tourists cannot evacuate safely and quickly to a safe evacuation place. If autonomous vehicles are used for these applications, tourists can enjoy more comfortable sight-seeing in the normal case and can evacuate safely and quickly to the evacuation place in the emergency case. As the autonomous vehicle, an electric vehicle (EV) is very useful, cost effective and easy to introduce even in rural areas, because the automatic control of an electric car is relatively simple, requiring only rotation control of the motor and steering, compared with fuel-based autonomous driving cars. An EV can also easily charge its battery even at home, which matters because there are no gas stations in rural or mountain areas. An EV is simple, light and small enough to drive even on very narrow, mountainous or bad roads. For those reasons, we adopt an EV as the autonomous driving car and combine it with our road state sensing system and V2X communication system. Thus, in order to realize an autonomous EV system, we introduce a road state information platform. The road state information platform collects, transmits and shares road states in realtime so that vehicles can drive automatically even in winter. As the video application system, an AR/VR system is introduced to provide tourists with information and video contents about tourist spots. Using this system, tourists can enjoy AR/VR video images with a surround view of the roads being driven, national parks and historical places. By integrating the EV-based autonomous driving system, the road information platform and the video applications, more comfortable and attractive tourist-oriented applications can be realized on a cost-effective autonomous driving system. In the following, the autonomous driving road information platform is introduced in Sect. 2. The EV-based autonomous vehicle and its control system are explained in Sect. 3. The video application system on the road information platform is shown in Sect. 4. After that, the disaster evacuation guidance system for the emergency case and its functions are explained in Sect. 5. Then, a prototype system used to evaluate the preliminary and basic functions and performance of the proposed system is explained in Sect. 6. Finally, conclusions and future work are summarized in Sect. 7.

2 Autonomous Driving Road Information Platform

Based on our previous research, we introduce a new-generation autonomous driving road information platform based on crowd sensing and V2X technologies, as shown in Fig. 1 [1, 2]. The wide area road surface state information platform mainly consists of multiple roadside wireless nodes, namely Smart Relay Shelters (SRS), Gateways, and mobile nodes, namely Smart Mobile Boxes (SMB). Each SRS or SMB is organized into a sensor information part and a communication network part [3]. The sensor information part on the vehicle includes various sensor devices, such as a quasi-electrostatic field sensor, an acceleration sensor, a gyro sensor, a temperature sensor, a humidity sensor, an infrared sensor and a sensor server. Using those sensor devices, various road surface states such as dry, rough, wet, snowy and icy roads can be decided quantitatively [4, 5]. In our system, SRS and SMB nodes organize a large-scale information infrastructure without a conventional wired network such as the Internet. The SMB on the car collects various sensor data, including acceleration, temperature, humidity and frozen-surface sensor data as well as GPS data, and carries and exchanges them with other smart nodes as a message ferry while moving from one end of a road to the other [6, 7].


On the other hand, an SRS not only collects and stores sensor data from its own sensors in its database server, but also exchanges sensor data with an SMB on a vehicle when the vehicle passes the roadside wireless node, using a V2X communication protocol. The sensor data at both SRS and SMB are periodically uploaded to a cloud system through the Gateway and synchronized. Thus, the SMB serves as a mobile communication means even when the communication infrastructure is in a challenged environment or not available. This network not only performs road sensor data collection and transmission functions, but also functions as an Internet access network to deliver various data, such as sightseeing information, disaster prevention information and shopping information, as an ordinary public wide area network for residents. Therefore, many applications and services can be realized.

Fig. 1. Wide road surface state information platform

3 EV Control System

Figure 2 shows the control system that automatically controls the EV in combination with the road state information system in cloud computing. Various sensors, including a dynamic accelerometer, a gyro sensor, an infrared temperature sensor, a humidity sensor, a quasi-electrostatic field sensor, a camera and GPS, measure time-series physical sensor data. Those sensor data are processed by the road surface decision unit (machine learning), and the current road state is identified in realtime. Next, the road state data are input to the ECU to calculate the amount of braking and steering, and the results are sent to the braking and steering components to optimally control the speed and direction of the EV. This closed loop of measuring the EV speed and direction, sensing road data, deciding the road state, and computing and applying braking/steering is repeated within several milliseconds (a simplified sketch of this loop is given below). On the other hand, the road state data are also transmitted to the road state server in the cloud computing system through edge computing by a V2X communication protocol, and are processed to organize the wide-area road state GIS platform. Those data are distributed to all running EVs so that each can know the road state ahead of its current location. From the received ahead-of-vehicle road state, the EV can look ahead and predict proper target values of speed and direction. Thus, by combining the control of both the current and future speed and direction of the EV, more correct and safer autonomous driving can be attained.
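The structure of this closed loop can be pictured with a short Java sketch. All types and names here (SensorSuite, RoadStateModel, Ecu) are hypothetical placeholders, not the actual control software:

```java
// Hedged sketch of the closed control loop described above.
public class EvControlLoop {
    interface SensorSuite { double[] sample(); }               // accel, gyro, temp, ...
    interface RoadStateModel { String classify(double[] s); }  // ML road-state decision
    interface Ecu { void actuate(String roadState, double targetSpeed); } // braking/steering

    static void run(SensorSuite sensors, RoadStateModel model, Ecu ecu,
                    double targetSpeed) throws InterruptedException {
        while (true) {
            double[] s = sensors.sample();     // measure speed/direction and road data
            String state = model.classify(s);  // decide the current road state in realtime
            ecu.actuate(state, targetSpeed);   // compute and apply braking/steering
            Thread.sleep(5);                   // the loop repeats within several ms
        }
    }
}
```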

Fig. 2. Automatic EV control system

4 Video Applications System on Platform

Figure 3 shows the system architecture of our proposed system. When a client device in the vehicle, such as a smart terminal or Google glasses, receives a wireless signal from the SRS, the signal identification module identifies the ID that corresponds to the objective tourist information. The point of interest (POI) information manager module, which handles the corresponding longitude and latitude coordinates, sends the ID to the tourist information server and receives the tourist information [8, 9]. Then the POI information manager module sends the content ID to the contents server, and the corresponding contents are received and managed by the contents manager module. The contents animation module calculates the horizontal and vertical angles of the client device using various sensors and determines the coordinates at which to display the contents. In the AR viewer, the contents are displayed over the image from the camera. Thus, even when the tourist moves and rotates around the POI, the contents are automatically and correctly traced to the tourist's movement and rotation. Figure 4 shows the directions of the contents to be displayed and of the client device displaying them, together with the parameters used to determine the coordinates from the sensor data. In this case, the coordinates (x, y) at which the contents are displayed are calculated as follows:

$$x = \frac{window_x}{2} + (\theta_{contents} - \theta_x) \times \frac{window_x}{\theta_{camera}}$$

$$y = \frac{window_y}{2} + (90 + \theta_y) \times \frac{window_y}{\theta_{camera}}$$
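A direct, illustrative translation of these two formulas into code might look as follows; the method and parameter names simply mirror the symbols in the equations and are otherwise hypothetical:

```java
// Compute the on-screen coordinates at which AR contents are placed,
// following the two formulas above.
public class ArPlacement {
    static double[] contentCoordinates(double windowX, double windowY,
                                       double thetaContents, double thetaX,
                                       double thetaY, double thetaCamera) {
        double x = windowX / 2 + (thetaContents - thetaX) * (windowX / thetaCamera);
        double y = windowY / 2 + (90 + thetaY) * (windowY / thetaCamera);
        return new double[] {x, y};
    }
}
```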


Fig. 3. System architecture of AR application

Fig. 4. Definition of Parameters for AR presentation

5 Disaster Evacuation Guide System

Figure 5 shows the Disaster Evacuation Guide System that uses our proposed system in the emergency case [10, 11]. When a disaster occurs, push-type disaster information and evacuation information are automatically delivered to the tourists in their own languages. The autonomous EV can be automatically guided to the nearest safe evacuation place, so the tourists can safely evacuate from their current location to the proper shelter. Through the mobility information infrastructure, disaster state information, resident safety information, and required medicines, food and materials are also collected and transmitted by mobile nodes between the counter-disaster headquarters and the evacuation shelters, as shown in Fig. 6.

Fig. 5. Disaster Evacuation Navigation System

6 Prototype System

In order to verify the effects and usefulness of the proposed system, a prototype system based on the EV is constructed, and its functions and performance are evaluated. Figure 6 shows the autonomous EV system, which is based on electromagnetic induction line technology. The EV direction is guided by an electromagnetic induction line embedded in the ground, so the vehicle can drive safely and reliably along the line. Only the motor needs to be controlled to adjust the EV speed, taking into account the ahead road state data from the SRSs along the street. The prototype EV is made by YAMAHA, carries up to 7 persons and runs at a maximum of 12 km/h. The prototype also includes a sensor server system and a communication server system, a Smart Mobile Box (SMB) for the mobile node and a Smart Relay Shelter (SRS) for the roadside station, as shown in Fig. 6. As the prototype of the two-wavelength communication, we currently use a Buffalo WI-U2-300D for Wi-Fi communication at 2.4 GHz and an Oi Electric OiNET-923 for 920 MHz band communication. The WI-U2-300D is a commercially available device, and its rated bandwidth in this prototype setting is 54 Mbps, while the OiNET-923 has a communication distance of up to 1 km and a bandwidth of 50 kbps to 100 kbps. In the sensor server system, several sensors are used: a Biglobe BL-02 as a 9-axis dynamic sensor with GPS, an Optex CS-TAC-40 as a far-infrared temperature sensor, an Azbil HTY7843 as a humidity and temperature sensor, a RIS RoadEye, and a quasi-electrostatic field sensor for the road surface state. The sensor data are synchronously sampled every 10 ms and averaged every 1 s to reduce sensor noise by a Raspberry Pi 3 Model B+ acting as the sensor server. The data are then sent to an Intel NUC Core i7, which is used for sensor data storage and for data analysis by AI-based road state decision. Both the sensor and communication servers are connected to an Ethernet switch. Currently, we are evaluating the road surface decision function using the video camera, comparing the decided state with the actual road surface state.
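The 10 ms sampling and 1 s averaging step can be sketched as follows; readSensor() and send() are hypothetical stand-ins for the real sensor I/O and network transfer, not the prototype's code:

```java
// Sample roughly every 10 ms; average 100 samples (about 1 s) to reduce
// sensor noise, then forward the averaged value to the analysis server.
public class SensorAverager {
    public static void main(String[] args) throws InterruptedException {
        double sum = 0;
        int count = 0;
        while (true) {
            sum += readSensor();            // one 10 ms sample
            count++;
            if (count == 100) {             // 100 samples x 10 ms = 1 s
                send(sum / count);          // forward the 1 s average
                sum = 0;
                count = 0;
            }
            Thread.sleep(10);
        }
    }
    static double readSensor() { return Math.random(); }  // stand-in for sensor I/O
    static void send(double v) { System.out.println(v); } // stand-in for network send
}
```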

Fig. 6. A prototype of autonomous driving system

7 Conclusions

In this paper, we proposed an autonomous driving road information platform for EVs and a video application system that provides tourist information by augmented reality based on points of interest (POI). The basic system configuration of the platform and the video application technology were introduced. With the video application system, POI information and video contents triggered by wireless signals from SRSs are downloaded and overlaid on the real image on a smart device, realizing augmented reality (AR) while driving in tourist areas. A prototype system was constructed to evaluate its functionality. Currently, we are testing and evaluating the effects of our proposed system. We are also developing more sophisticated tourist contents, such as 3D objects, to realize more attractive tourist video services.

Acknowledgement. The research was supported by Strategic Information and Communications R&D Promotion Program (SCOPE) Grant Number 181502003 by Ministry of Affairs and Communication, JSPS KAKENHI Grant Number JP20K11773 and a Strategic Research Project Grant by Iwate Prefectural University in 2020.

References

1. Shibata, Y., Sakuraba, A., Arai, Y., Saito, Y., Hakura, J.: Road state information platform for automotive EV in snow country. In: The 34th International Conference on Advanced Information Networking and Applications (AINA-2020), pp. 587–594. Elsevier, April 2020
2. Shibata, Y., Sakuraba, A., Arai, Y., Sato, G., Uchida, N.: Predicted viewer system of road state based on crowd IoT sensing toward autonomous EV driving. In: The 23rd International Conference on Network-Based Information Systems (NBiS-2020). Elsevier, August 2020
3. Shibata, Y., Sato, G., Uchida, N.: A new generation wide area road surface state information platform based on crowd sensing and V2X technologies. In: The 21st International Conference on Network-Based Information Systems (NBiS 2018), Lecture Notes on Data Engineering and Communications Technologies (LNDECT), vol. 22, pp. 300–308 (2018)
4. Shibata, Y., Sato, G., Uchida, N.: Road state information platform based on multi-sensors and bigdata analysis. In: The 14th International Conference on Broad-Band Wireless Computing, Communication and Applications (BWCCA 2019), Antwerp, Belgium, November 2019, pp. 504–511 (2019)
5. Shibata, Y., Arai, Y., Saito, Y., Hakura, J.: Development and evaluation of road state information platform based on various environmental sensors in snow countries. In: The 8th International Conference on Emerging Internet, Data & Web Technologies (EIDWT 2020), Kitakyushu, Japan, pp. 268–276, February 2020
6. Shibata, Y., Sakuraba, A., Sato, G., Uchida, N.: Realtime road state decision system based on multiple sensors and AI technologies. In: The 13th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS 2019), Sydney, Australia, pp. 114–122, July 2019
7. Sakuraba, A., Shibata, Y., Uchida, N., Sato, G., Ozeki, K.: Evaluation of performance on N-wavelength V2X wireless network in actual road environment. In: The 13th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS 2019), Sydney, Australia, pp. 555–565, July 2019
8. Hirakawa, G., Sato, G., Hisazumi, K., Shibata, Y.: Data gathering system for recommender system in tourism. In: 18th International Conference on Network-Based Information Systems, pp. 521–525 (2015)
9. Shibata, Y., Sasaki, K.: Tourist information system based on beacon based augmented reality technologies. In: The 11th International Workshop on Network-based Virtual Reality and Tele-existence (INVITE 2016), Technical University of Ostrava, September 2016
10. Sato, G., Uchida, N., Shiratori, N., Shibata, Y.: Evaluation of never die network system for disaster prevention based on cognitive wireless technologies. In: The 11th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS 2017), Istituto Superiore Mario Boella (ISMB), Torino, Italy, pp. 139–151, July 2017
11. Otomo, M., Hashimoto, K., Uchida, N., Shibata, Y.: Mobile cloud computing usage for onboard vehicle servers in collecting disaster data information. In: The 8th International Conference on Awareness Science and Technology (iCAST 2017), CD-ROM, The Splendor Hotel, Taichung, Taiwan, November 2017

Design of In-depth Security Protection System of Integrated Intelligent Police Cloud Fahua Qian1 , Jian Cheng1 , Xinmeng Wang2(B) , Yitao Yang2 , and Chanchan Li2 1 Public Security Bureau of Jiangsu Province, 1st Yangzhou Rd, Jiangsu, China

[email protected], [email protected] 2 Information Technology Department, Nanjing Forest Police College, Jiangsu, China

[email protected], [email protected], [email protected]

Abstract. In order to strengthen the security protection of the police cloud platform, ensure the safe and stable operation of the cloud infrastructure, core data and key applications, and ensure the overall security and controllability of the police cloud, this paper constructs an infrastructure guarantee environment for cloud computing and big data based on the four-layer architecture composed of IaaS, PaaS, DaaS and SaaS (i.e., the infrastructure, platform service, data service and application service layers), and builds a multi-level, cross-network, wide area network vertical, in-depth security protection system for the police cloud. Keywords: Police cloud · IaaS · PaaS · DaaS · SaaS

1 Introduction

With the rapid development of cloud computing and big data technology, the application of police big data has been carried out in depth. The police cloud provides basic support for various research and judgment means, such as storage, association and integration, and behavior trajectory analysis, social relationship analysis and biological analysis for various types of trajectories, videos and images. It has initially realized information sharing among different departments, police forces and regions, and provides strong support for the public security organs in criminal investigation, public security management, traffic management, social services and so on. At the same time, data resources (such as data related to public security and police work, personal information related to citizens' privacy, and case-handling information related to case clues) are stored in the cloud, various analysis and mining means are applied in the cloud, and all kinds of police work is carried out on the cloud, which puts forward new requirements for cloud security.

The police cloud computing environment is one in which many users share cloud facilities. Although constrained by internal network security management factors, the security threats it faces cannot be ignored, such as data loss and leakage, abuse and illegal use of cloud computing resources, security isolation issues caused by the sharing of cloud facilities, and risk situations that individual departments cannot perceive. Overall, police cloud security issues are mainly related to data privacy, system security, network theft, attacks on the cloud itself, cloud service data leakage and other problems [1, 2].

In response to the security threats faced by the police cloud, and with the goal of enhancing the basic security protection capabilities of the Jiangsu police cloud, we carried out research and application demonstrations of key security protection technologies for the public safety industry, such as cloud computing and big data, and explored the construction of a multi-level, cross-node, wide-area-network distributed integrated cloud security resource pool. Selected districts and cities conduct comprehensive demonstrations to build a security protection management service system that matches the characteristics of smart policing and the industry. The system provides comprehensive, three-dimensional security protection for various types of cloud hosts, physical machines, storage, network and big data resources, and uses big data intelligent analysis technology to quickly discover and respond to cloud security risks, realizing intelligent monitoring, security operations, security audits and situational awareness of resources on the cloud, and providing a safe and stable operating environment for cloud resource services.

According to the relevant construction requirements of the Ministry of Public Security, the Jiangsu police cloud mainly constructs the security environment of the cloud computing and big data infrastructure. The police cloud in-depth security protection system meets three major characteristics: multi-level, cross-network and wide area network. The overall construction architecture includes IaaS (infrastructure services), PaaS (platform services), DaaS (data services) and SaaS (software services), namely the infrastructure layer, platform service layer, data service layer and application service layer [3]. It is mainly deployed in a cross-network environment including the public security network, the video network and the mobile police network. In terms of construction scope, the provincial department and the municipal public security bureaus have carried out cloud platform construction in accordance with unified standards and specifications, and the framework of a WAN-integrated cloud platform has basically formed. Focusing on the police cloud security protection system of Jiangsu Province, this paper studies the network security implementation details of the IaaS, PaaS, DaaS and SaaS layers and designs a vertical, in-depth security protection architecture for the police cloud. The paper is organized as follows: Sect. 2 presents the overall design architecture; Sects. 3 to 5 describe the technical architectures used to implement the three characteristics of multi-level, cross-network and wide-area-network protection; and Sect. 6 concludes the paper.

2 The Overall Design Framework of the In-Depth Security Protection System of the Police Cloud

2.1 Overall Architecture

The composition of the police cloud can be logically divided into seven types of protection objects: physical environment, network system, host system, virtualization, software platform, data and application, each provided with targeted security protection measures. Based on the different service modes of IaaS, PaaS, DaaS and SaaS provided by the police cloud, and relying on the public security intranet, the image private network and other private networks, a multi-level, cross-network, WAN-integrated vertical, in-depth security protection system for the police cloud is constructed. The overall structure is shown in Fig. 1.

Fig. 1. Overall architecture of multi-level security protection system

2.2 Composition of the Platform

The hierarchical architecture of the big data police cloud computing platform consists, from bottom to top, of the IaaS layer, PaaS layer, DaaS layer and SaaS layer [3]. In short, the IaaS layer mainly includes the hardware infrastructure layer and infrastructure management; the PaaS layer is mainly composed of the platform supporting software layer; the DaaS layer provides data resource services for various applications; and the SaaS layer mainly provides various application services.

(1) IaaS layer. The IaaS layer contains the most important hardware infrastructure and physical resources that constitute the big data police cloud computing platform, forming the hardware resource pool shared by the various departments and police application systems. It mainly includes computing resources, storage resources and network resources. In order to effectively schedule and share the physical resources in the resource pool, virtualization software is needed to virtualize the large number of physical computing servers, storage servers and network resources. At the same time, in order to provide dynamic and elastic expansion of resource allocation capacity for the application systems, infrastructure management software is needed to manage, schedule and monitor the use of the various virtualized and physical resources [4].

(2) PaaS layer. The PaaS layer mainly provides the platform support system software necessary for cloud computing and cloud storage, including the cloud storage system and the cloud computing system. The cloud storage system needs to provide storage and quick query of structured data, as well as storage and processing of large amounts of unstructured and semi-structured data. It first needs to provide relational database systems (such as MySQL, SQL Server, Sybase, Oracle, etc.), and at the same time a distributed file system to store massive data. The most mature and widely accepted distributed file system for massive data storage is HDFS of the open source Hadoop system. The cloud computing system is mainly used for parallel processing of massive data, completing the analysis and mining of all kinds of massive police data. At present, the most mature parallel processing software for massive data is Hadoop, which provides MapReduce, Spark and other parallel computing frameworks [5].

(3) DaaS layer. Between the PaaS layer and the SaaS layer there is a police application DaaS layer based on the cloud storage system, which includes various shared data resource services, as well as massive cloud application data services such as road monitoring, image monitoring and cloud search.

(4) SaaS layer. The SaaS layer mainly includes the common service resources of the various police cloud computing application systems, as well as the big data police cloud computing application systems themselves. The common service resources include portal services, message services, geographic information services, data extraction and integration services, query services, statistical analysis and data mining services, security services, unified data resource access and other public service modules and programs.

3 Design of Multi-level Security Protection

(1) Security protection design of the IaaS layer

IaaS layer security protection involves virtualization resource isolation, virtual network isolation, virtual network threats, image security and host security [4].

1) Isolation of virtualized resources: The security isolation of virtual resources between different tenants of the cloud platform is a basic requirement of cloud security. The virtual machines inside the police cloud platform rely on CPU virtualization, memory virtualization, storage virtualization and other technologies to achieve resource isolation between different virtual machines.


2) Isolation of virtual networks: Different virtual hosts can be isolated by security groups. Each security group can define a set of access rules; when a virtual machine joins a security group, it is protected by that rule set. Virtual machines in the same security group can communicate with each other, but virtual machines in different security groups are not allowed to communicate by default, and a communication relationship can be established only by configuring the relevant access control rules.

3) Virtual network threats: There are two protection models: (a) a layer-2 network threat protection scheme, in which virtual network security policies prevent IP and MAC address spoofing and DHCP server impersonation; and (b) a layer-4 to layer-7 virtual network threat protection scheme, which achieves protection inside the virtual network through SDN (Software Defined Networking) combined with NFV (Network Function Virtualization).

4) Image security: All the computing and management nodes of the police cloud use the SUSE Linux operating system, so the underlying operating system of the platform needs to be hardened, including system service hardening, file directory control, account and password hardening, authentication, authorization and access control, and setting a system security baseline. At the kernel level, some default functions of the Linux system can be disabled by modifying kernel configuration parameters, preventing malicious users from modifying system settings.

5) Host and virtual machine security: System security is strengthened for the hosts, covering system operation monitoring, protection of key system processes, malicious code protection and intrusion protection.

(2) Security protection design of the PaaS layer

Security protection of the platform service layer mainly supports platform components such as development services, authorization services, authentication services, the API gateway and transmission exchange. Its measures mainly include multi-level tenant data isolation, big data cluster security, platform management system security, identity authentication and authorization management, and RDS service security.

1) Big data cluster security and multi-tenant data isolation: Through the construction of a Hadoop cluster security audit function, log data of big data components such as HDFS, Hive, Storm, HBase, Redis, Flume and Kafka in the provincial police cloud computing platform are collected, providing cloud security services such as big data component monitoring, multi-user big data access monitoring and big data node monitoring. At the same time, an MPPDB cluster security audit function provides cloud audit services such as database vulnerability detection, realtime behavior monitoring, fine-grained protocol analysis with two-way audit, and three-tier application association audit.

2) Security of the platform management system: A big data cluster operation and maintenance management function provides cloud security management services such as centralized management of big data components and nodes and intelligent operation and maintenance.


3) Identity authentication and authorization management: Through the security management function of the big data platform, identity authentication and authorization management for different users can be realized.

4) Security of the RDS service: An RDS service security function provides cloud security services for relational databases, such as network isolation, access control, transmission encryption, automatic backup and snapshots, data replication and data deletion.

5) Security of interface services: Interface service security covers the internal interfaces of the virtualization software and the external service interfaces. The interfaces provided by the cloud platform should be assessed regularly to prevent hacker attacks, with periodic penetration tests and vulnerability assessments of the external interfaces to ensure their security.

(3) Security protection design of the data service (DaaS) layer

Security of the data service layer sorts out the technical means and tools required for data security protection across the key links of the data life cycle, such as police data generation, collection, transmission, storage, use, sharing and destruction, so as to ensure the security and reliability of the data services provided by the police cloud platform.

1) Access control of data services: Data service access control verifies the authenticity and legitimacy of the application service identity, identifies the access request permission, and ensures that the application service does not exceed its authorized scope. Following the principle of least privilege, data service access rights are assigned to application services through the authorization center; data services are accessed through trusted API agents, which authenticate the application services in collaboration with the authentication services of the security infrastructure to ensure the legitimacy of application service identities; application service access rights are identified by the authorization services of the security infrastructure.

2) Data authorization and authentication: Data authorization configures data access policies according to user attributes, data attributes and data operation behavior. Through the authorization service of the security infrastructure, data access permission policies are configured based on user level and data level. These policies should include business scope definition, data access frequency, time range definition, query condition filtering and data sensitivity control; access rights can be adjusted dynamically according to the environment attributes and security status of the subject.

3) Audit of data operation security: In the process of data use and sharing, a data audit system is established to monitor the compliance of data access behavior, audit the access behavior of users, devices and applications with legal identities, and generate audit logs. An audit log includes the access subject, the accessed data, the access time, the access behavior type (read or write) and the result of the access, and is stored in a separate audit storage space. Audit logs serve as an important basis for behavior monitoring and traceability. At the same time, the data audit system can collect and analyze the audit logs of users, devices and applications to realize monitoring, identification and evidence collection of illegal use of police data by legitimate users, and to investigate violations afterwards. If the circumstances are serious, they should be handled according to laws and regulations, so that "illegal use cannot escape" and unauthorized use by internal police officers is reduced. Auditing of distributed file systems, structured storage systems, non-relational databases and other components is supported.

4) Data desensitization: Data desensitization deforms sensitive information according to desensitization rules, protecting sensitive private data. Sensitive data contained in shared data should be transformed and masked by the data desensitization service to prevent secondary leakage, and the data level after desensitization should be lower than before desensitization. Sharing public security big data with external units such as party, government and military organs, whether online or offline, should both meet their data use needs and prevent the leakage of data that should not be shared; desensitizing shared data effectively prevents the disclosure of sensitive data (a minimal illustrative sketch of rule-based masking is given at the end of this section).

5) Detection of data leakage: In the process of data opening and sharing, to prevent leakage of data that should not be shared due to violations or misoperation, a network data leakage prevention system identifies the sensitivity of outgoing data, so as to discover and intercept in time any data prohibited from sharing before it flows out of the big data center. PCRE regular expressions and other common syntax rules should be supported. Sensitive data is identified by data identifiers (ID card numbers, mobile phone numbers, household registration information and so on). Through machine learning, Chinese semantic recognition, and optimized clustering and classification algorithms, the patterns of sensitive data are captured to build a sensitive data model used for detection, preventing sensitive data that should not be shared from leaving the police cloud center through illegal operation or misoperation when data is shared with external organizations such as party, government and military bodies.

(4) Security protection design of the SaaS layer

The SaaS layer is the presentation layer of police applications. Application security mainly guarantees the security of the applications themselves and of their use, including access control, security isolation, communication encryption, attack protection and a vulnerability management protection system.

1) Web security attack protection: Cloud WAF components produced by the resource pool protect web application systems from deep application attacks, such as SQL injection, command injection, cookie injection, cookie counterfeiting, cross-site scripting (XSS), sensitive information disclosure, malicious code, misconfiguration, hidden fields, parameter tampering and application-layer denial-of-service attacks.

2) Web tamper-proofing: By deploying an anti-tamper system on the website server, the website can be protected, and the anti-tamper engine monitors tampering behavior. Web pages usually consist of static and dynamic files. For dynamic files, a web anti-attack module is embedded in the site, intercepting scanning and illegal access requests through keyword, IP and time filtering rules. For static files, a tamper-proof module locks static pages and monitors static files inside the site; when illegal operations such as modifying or deleting web pages are found, protection and alarms are triggered.

3) Application security audit: A web audit unit produced by the resource pool performs security audits of the applications on the police cloud platform, recording how all police officers log in to and use the business applications, regularly reviewing the application system logs to find illegal use, and providing functions to query, analyze and generate audit reports from the audit records.

4) Vulnerability management of the SaaS layer: The web vulnerability scanner of the security resource pool scans the application systems on the cloud for web vulnerabilities, detects application vulnerabilities, discovers security weaknesses of cloud applications in time and improves the security capability of on-cloud systems. Established security baseline criteria check the security configuration of the application systems and provide rectification suggestions; through the baseline verification service in the resource pool, automatic baseline verification of SaaS-layer police cloud application systems is carried out.
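As an illustration of the rule-based desensitization and sensitive-data identification described above, the following minimal Java sketch masks mobile phone numbers with a regular expression. The pattern and masking policy are assumptions for illustration, not the deployed rules:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Desensitizer {
    // Assumed rule: 11-digit mobile numbers starting with 1;
    // keep the first 3 and last 4 digits, mask the middle.
    private static final Pattern PHONE = Pattern.compile("(1\\d{2})\\d{4}(\\d{4})");

    static String maskPhones(String text) {
        Matcher m = PHONE.matcher(text);
        return m.replaceAll("$1****$2");
    }

    public static void main(String[] args) {
        System.out.println(maskPhones("Contact: 13812345678")); // Contact: 138****5678
    }
}
```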

4 Design of Cross-network Security Protection

To avoid repeated investment in security construction, a security resource pool management system is designed that spans networks (the public security network and the image private network), different security domains (user domain and data domain), and levels (province and city), providing unified security protection for the different networks and domains. The security protection of a single node in the network follows the multi-layer protection model (IaaS, PaaS, DaaS, SaaS) directly. The NFV security components in the resource pool implement cross-network security protection between nodes across service networks [6].

The security resource pool node is attached to the forwarding system, and service chain policies are distributed centrally by the security controller to guide the forwarding of traffic in the forwarding system. There are two core modules: the policy module and the traffic scheduling module. The policy module is responsible for parsing the security protection policies required by the business and converting them into routing entries stored in the routing table. For each forwarding node, import, export and node detection components are designed to execute the forwarding of the security service chain. The typical forwarding process of the security service chain is as follows (a simplified sketch of the next-hop lookup is given after this section's text). When traffic enters the forwarding system from the image network, the import component matches the flow against the rules in the routing table to identify the service chain it belongs to, and determines the current position of the traffic in the service chain according to the MAC address (or port). Then MAC-in-MAC encapsulation is performed based on this information, with the outer destination MAC address set to the MAC of the next-hop service node. Finally, when the last security service node finishes processing, the traffic is forwarded to the security exchange system and reaches the public security intranet.
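The next-hop decision in this forwarding process can be pictured with a small Java sketch. Everything here (the Rule type, the flow key, the table layout) is a hypothetical simplification of the routing-table matching described above, not the system's implementation:

```java
import java.util.List;

public class ServiceChainForwarder {
    // One routing entry: a flow-matching pattern plus the ordered MACs
    // of the security service nodes in the chain (requires Java 16+).
    record Rule(String flowMatch, List<String> chainMacs) {}

    // Identify the flow's service chain, find the current position by MAC,
    // and return the next hop's MAC for the outer MAC-in-MAC header.
    static String nextHopMac(List<Rule> routingTable, String flowKey, String currentMac) {
        for (Rule r : routingTable) {
            if (flowKey.matches(r.flowMatch())) {
                int pos = r.chainMacs().indexOf(currentMac);
                if (pos >= 0 && pos + 1 < r.chainMacs().size()) {
                    return r.chainMacs().get(pos + 1);
                }
                return null; // last node: hand off to the security exchange system
            }
        }
        return null; // no matching service chain
    }
}
```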

5 Design of WAN-Integrated Security Protection

Distributed security resource pools protect the different networks and data centers of the provincial department and the cities. At the same time, the principle of intensive security construction must be met: a unified security management platform is built to manage all the security resource pools, monitor security across the whole province, and realize province-wide situational awareness of the police cloud. A unified security portal provides unified operation and maintenance management of the multiple security resource pools. The cloud security management platform provides a unified cloud security portal module, through which the security administrator can select the security resource pool of a DC (data center) on demand, which greatly improves the experience of cloud tenants and reduces construction costs.

6 Conclusion

In recent years, the application of cloud computing in the public security industry has gradually emerged. Many provincial and municipal public security agencies have launched explorations of cloud computing and big data technologies and obtained very good application results. Starting from application requirements and choosing appropriate technologies to build a police cloud service platform ensures that the project construction really plays a role and solves practical application problems. In addition, cloud platform technology has developed rapidly in recent years, and new parallel computing technologies emerge endlessly [7, 8]. Therefore, choosing an open, converged technical architecture for the police cloud service platform and constantly absorbing mature, stable new technologies is particularly important for the continuous development of big data and police cloud service platforms.

Acknowledgments. The work described in this paper was fully supported by a grant from the Key Research and Development Program of Jiangsu Province in 2019 (No. SBE2019710010), and the Fundamental Research Funds for the Central Universities (No. LGYB202001).


References

1. Jiajun, L.V., Mao Cheng, P.Y.: Proposal for cloud security police cloud. China Comput. Commun. 23, 135–136 (2016)
2. Rong, S.: Research on security mechanism of policing cloud. People's Public Security University of China (2017)
3. Qing, W., Ping, J.: Design and construction of urban big data police cloud computing platform: taking Nanjing Public Security Big Data Police Cloud Computing platform as an example. Police Technol. 2016(5), 12–15 (2016)
4. Yang, S.: Analysis of IaaS cloud security technology of Shanghai Telecom cloud resource pool. Telecommun. Technol. 2017(6), 84–88 (2017)
5. Wei, L.: Research on big data PaaS platform for smart city. Pract. Electron. 2019(14), 38–40+32 (2019)
6. Yongxin, C., Kai, L.C., Jiang, F., et al.: Design of a cross network switching security resource service architecture. Commun. Technol. 52(11), 2765–2769 (2019)
7. Huibo, H.: Research on security technology and information fusion platform based on cloud computing resource pool. Super Sci. 2017(15), 293–294 (2017)
8. Yongjian, W., Yunqi, Z., Yang, X., et al.: Design and research of smart government security system based on cloud computing. Commun. Technol. 49(04), 84–90 (2016)

Design and Implementation of Secure File Transfer System Based on Java Tu Zheng, Su Yunxuan, Wang Xu An(B) , and Li Ruifeng Engineering University of PAP, Xi’an 710086, China [email protected], [email protected], [email protected], [email protected]

Abstract. The popularization of computers and the close integration of computer technology and communication technology have made the transmission of information more convenient. In daily work, the use of file transfer systems can greatly improve efficiency, but while a file transfer system brings convenience, it also carries security risks. Through research and analysis of the current state of office informatization and information security issues, and according to the daily work needs of enterprises and individuals, a file transfer system with strong security and convenient operation is designed and implemented by combining file transfer technology with cryptographic theory. The system consists of four modules: a file transfer module, an encryption and decryption module, an instant communication module and a log management module. It performs encrypted transmission of files, provides instant messaging functions, and keeps transfer records. Testing and use of the system show that secure transmission of data can be realized, which provides a good guarantee for the security of the transferred files.

1 Introduction

With the development of office informatization, modern information technology is used in daily work. In the information age, information technology is applied to all aspects of people's lives. It is an important cornerstone of politics, military affairs, economy, energy, transportation, finance, culture and other fields, as well as a strong support for maintaining national security, economic stability, production and social activities. By using modern computer communication technology, it is possible to build channels between departments to support rapid data transmission, and to build an efficient data exchange mechanism that strengthens information communication and realizes the rapid exchange and transmission of information between nodes [3]. However, there is a risk of data being eavesdropped on, intercepted or tampered with during transmission. Once high-value information is leaked, enterprises and individuals suffer losses; therefore, information security issues must be taken seriously. Data encryption and transmission have always been an important part of the information security field, and they are essential to ensure the secure transfer of sensitive documents [9]. The system presented here is written in the Java language. Building on a traditional file transfer system, the AES and RSA encryption algorithms are used to encrypt the transmitted data and generate digital signatures, which meets the needs of information interaction while ensuring the confidentiality, integrity and authenticity of the data during transfer [5]. The main contributions of this paper are summarized as follows:

(1) We collect relevant papers and information, and analyze the feasibility of the system on the basis of the current information security situation, the actual situation of the departments concerned, and relevant laws.
(2) Through communication with users, we learn their daily work requirements and carry out a functional requirements analysis based on common information security attack methods and cryptographic theory.
(3) Based on these results, we establish a system model, design the overall architecture and the specific functional modules, choose the encryption algorithms and design an encryption scheme.
(4) We implement and integrate each functional module in Java, realize the file transfer system and test it.

2 Demand Analysis

2.1 User Demands

(1) A user-friendly interface that is convenient to use.
(2) Point-to-point file transfer with directed message delivery.
(3) Encryption of the transmitted data and generation of digital signatures, protecting the data from attack during transmission and ensuring confidentiality and integrity.
(4) Signature verification and decryption of received files.
(5) The file transfer system can run and provide its functions on different platforms.
(6) A point-to-point instant messaging function allowing simple communication between users.
(7) Access to file transfer records and instant messaging content records [2].

2.2 Java Language

The Java language is designed with a set of security strategies focused on protecting code security. Relying on its security manager and external Socket interface, it can avoid most attacks and guarantee system security well. Java also has strong cross-platform capabilities: its structural mechanism makes systems written in it robust, and they can easily be used on most mainstream platforms, such as UNIX, MacOS and Windows, without additional modification [4].

3 System Design

3.1 Functional Module Design
(1) File transfer module: realizes file sending and receiving. It receives the data from the tab to create the receiver and sender objects, and receives the files to be transmitted. It receives and sends the encrypted data, calling the encryption and decryption module to process it. When the work is finished, it calls the log management module to record the operation.
(2) Encryption and decryption module: accepts calls from the file transfer module and the instant communication module to encrypt and decrypt data. When called by the file transfer module, it encrypts or decrypts the received data, generates the digital signature of the encrypted data or verifies the digital signature of the decrypted data. It also encrypts and decrypts instant messages to ensure their security.
(3) Instant communication module: provides instant communication and is responsible for receiving and sending instant messages. When it receives a call request, it receives or sends the instant message and calls the encryption and decryption module to process it.
(4) Log management module: accepts calls from the file transfer module and the instant communication module. It records file transfer behavior and keeps the records in the database. The record format is: {file/instant message sender, file/instant message recipient, operation (file transfer/instant messaging), file name/message content, time stamp} [1] (Fig. 1).

Fig. 1. System module composition

3.2 Encryption Scheme Design
The encryption and decryption method used by this system to process the transferred data is a hybrid encryption scheme. The AES algorithm encrypts the transferred data, the AES key is encrypted with the receiver's RSA public key, and the sender's RSA private key is used to generate the digital signature of the data. The ciphertext, the digital signature and the ID of the file sender are packaged and sent to the file receiver for processing, decryption and verification of the digital signature [8].
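As an illustration of this hybrid scheme, the following is a minimal Java sketch using the standard JCE API; the class and method names are our own and are not taken from the paper's implementation.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Encrypts the data with a fresh AES key, wraps the AES key with the
// receiver's RSA public key, and signs the plaintext (MD5 digest) with
// the sender's RSA private key, as in the scheme described above.
public class HybridEncryptor {
    public static byte[][] encrypt(byte[] plain, PublicKey receiverPublic,
                                   PrivateKey senderPrivate) throws Exception {
        SecretKey aesKey = KeyGenerator.getInstance("AES").generateKey();

        Cipher aes = Cipher.getInstance("AES");
        aes.init(Cipher.ENCRYPT_MODE, aesKey);
        byte[] cipherText = aes.doFinal(plain);             // AES-encrypted data

        Cipher rsa = Cipher.getInstance("RSA");
        rsa.init(Cipher.WRAP_MODE, receiverPublic);
        byte[] wrappedKey = rsa.wrap(aesKey);               // AES key under RSA

        Signature sig = Signature.getInstance("MD5withRSA");
        sig.initSign(senderPrivate);
        sig.update(plain);
        byte[] signature = sig.sign();                      // digital signature

        return new byte[][] { cipherText, wrappedKey, signature };
    }
}

The three returned byte arrays correspond to the package of ciphertext, wrapped AES key and digital signature that the scheme sends to the receiver.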


We suppose A is the sender of the data and B is the receiver. When A and B complete the WebSocket connection through the system, the system generates the AES key and two pairs of RSA keys and distributes them to A and B for use during the data transfer process [10] (Figs. 2 and 3).

Fig. 2. Decryption process

Fig. 3. Encryption process

3.3 System Process

3.3.1 Establish a WebSocket Connection
(1) Run the file transfer system.
(2) Use a browser to open the tab. The user then enters the sender's ID, the receiver's ID and their names. Assume two users A and B, with ID numbers 1 and 2, want to transfer files. A opens the sender tab and enters sender ID 1 and receiver ID 2; B opens the receiver tab and enters sender ID 2 and receiver ID 1. After entering the information, each clicks connect.
(3) The background receives the information entered in the foreground to form an object and establishes a WebSocket connection based on the received information. If the connection is established successfully, the RSA keys (public key and private key) are distributed to A and B. At this point the system has established two WebSocket connections [7].

3.3.2 File Transfer Process
See Fig. 4.

Fig. 4. Ciphertext transmission process


3.3.3 Instant Message Transfer Process See Fig. 5.

Fig. 5. Instant message transmission process

4 System Implementation

The programming of this system is based on Java. IntelliJ IDEA is used as the development tool, Spring Boot is used as the basic framework of the Web project, and the dependency packages referenced by the file transfer system are downloaded and managed through Maven. The system connects to MySQL through JDBC, and records of system behavior are stored in the database for preservation and management.


4.1 Create Object
The system obtains the information of the file sender and file receiver entered by users through the Web interface. The system then creates the sender and receiver objects based on this information (Fig. 6).

Fig. 6. Get information

Two objects are created with the structure {sid (sender ID), username (sender nickname), targetId (receiver ID)}. Assume the file sender's name is A with ID number 1, and the receiver's name is B with ID number 2 (Fig. 7).

Fig. 7. Object structure

4.2 File Transfer Module
The primary task is to establish a WebSocket connection between the file sender and receiver. The sender sends a request to establish a WebSocket connection with the receiver; after the connection is successfully established, the sender and receiver can transmit data through it. The implementation of the WebSocket connection is packaged in a configuration class provided by the Spring Boot framework, which can be called directly by adding a scan of this class in the configuration file. The implementation code of the WebSocket configuration is as follows:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.socket.server.standard.ServerEndpointExporter;

@Configuration
public class WebSocketConfig {
    @Bean
    public ServerEndpointExporter serverEndpointExporter() {
        return new ServerEndpointExporter();
    }
}
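The exporter above publishes every @ServerEndpoint-annotated bean. A minimal endpoint of the kind the file transfer module would register might look like the following sketch; the path, parameter name and class name are illustrative assumptions, not the authors' code.

import org.springframework.stereotype.Component;
import javax.websocket.OnClose;
import javax.websocket.OnMessage;
import javax.websocket.OnOpen;
import javax.websocket.Session;
import javax.websocket.server.PathParam;
import javax.websocket.server.ServerEndpoint;

// A minimal WebSocket endpoint sketch for the file transfer module.
@Component
@ServerEndpoint("/transfer/{sid}")
public class FileTransferEndpoint {
    @OnOpen
    public void onOpen(Session session, @PathParam("sid") String sid) {
        // register the connection under the user's ID
    }

    @OnMessage
    public void onMessage(byte[] payload, Session session) {
        // hand the encrypted payload to the encryption/decryption module
    }

    @OnClose
    public void onClose(Session session) {
        // remove the connection from the registry
    }
}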


4.3 Encryption and Decryption Module
The server obtains the object information and distributes the RSA keys and the AES key (Fig. 8).

Fig. 8. Generate the keys

The plaintext is digested by MD5, and the digest is encrypted with the sender's RSA private key to obtain the digital signature. The AES function is called first to encrypt the data stream, and then the RSA function is called to encrypt the AES key with the receiver's RSA public key (Fig. 9).

Fig. 9. Data processing

The receiver receives the ciphertext and calls the RSA function to decrypt it with the receiver's private key to obtain the AES key. It then calls the AES function to decrypt the ciphertext using the AES key, and calls the MD5 function to digest the decrypted plaintext. The digest is compared with the received digest to complete the verification, and the data is stored after the verification succeeds.

4.4 Instant Communication
The instant message sender inputs the message through the Web interface. The encryption and decryption module is then called to process the message and return the processed data stream to the instant communication module, which transfers the processed data to the receiver through the WebSocket connection. After the message is successfully transferred, the log management module is called to record this behavior (Fig. 10).

4.5 Database Connection
To implement the log management module, the system connects to the MySQL database through JDBC, which requires the management account and password of the relational database. A simple configuration is as follows:

spring.datasource.driverClassName=com.mysql.cj.jdbc.Driver
spring.datasource.url=
spring.datasource.username=root
spring.datasource.password=123456

After a file or instant message is successfully transferred, the log management module is called to record the behavior.
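As a sketch of how the log management module could write the record format given in Sect. 3.1 through JDBC, the table and column names below are assumptions for illustration only.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

// Writes one record in the format of Sect. 3.1:
// {sender, recipient, operation, file name/message content, time stamp}.
public class TransferLogDao {
    public void record(Connection conn, String sender, String receiver,
                       String operation, String content) throws SQLException {
        String sql = "INSERT INTO transfer_log "
                   + "(sender, receiver, operation, content, ts) VALUES (?, ?, ?, ?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, sender);
            ps.setString(2, receiver);
            ps.setString(3, operation);
            ps.setString(4, content);
            ps.setTimestamp(5, new Timestamp(System.currentTimeMillis()));
            ps.executeUpdate();
        }
    }
}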

Fig. 10. Recording sheet

5 Conclusion

Modern computer network technology has undergone revolutionary development. While new technologies emerge, they also bring many risks: cyberattacks and technological crimes are exploding, and public network information security incidents happen frequently [6]. On the basis of consulting a large number of papers and documents, this paper conducts a requirements analysis according to the actual situation of enterprises and individuals. We realize a file transfer system with a degree of security based on Java by combining data encryption technology with a traditional file transfer system. Testing and use show that the expected goal is achieved, and the system provides practical experience for solving the problem of secure file transfer.

Acknowledgements. This work is supported by the National Cryptography Development Fund of China (No. MMJJ20170112), Natural Science Basic Research Plan in Shaanxi Province of China (grant no. 2018JM6028), National Natural Science Foundation of China (No. 61772550, U1636114 and 61572521), the Foundation of Guizhou Provincial Key Laboratory of Public Big Data (No. 2019BDKFJJ008), and National Key Research and Development Program of China (No. 2017YFB0802000). This work is also supported by Engineering University of PAP's Funding for Scientific Research Innovation Team (No. KYTD201805) and Engineering University of PAP's Funding for Key Researcher (No. KYGG202011).

References 1. Li, X.: Design and JAVA realization of file encryption transmission system. Sci. Technol. Innov. Herald (24), 32 (2010) 2. Wang, R., Wang, J.: Design and implementation of file encrypted transmission system based on JAVA. Netw. Secur. Technol. Appl. (9), 28 (2009)


3. Wang, X.: Design of a secure transmission system for large files in a public network environment. Anhui University of Science and Technology (2012)
4. Zhang, J.: Research and implementation of network file transfer system based on Java three-tier architecture. Netw. Secur. Technol. Appl. 10, 49–50 (2007)
5. Haiyan, B.: Research and implementation of DES and RSA hybrid encryption algorithm under the VC++ environment. J. Jinzhong Univ. 36(3), 61–65 (2019)
6. Liu, W.: Analysis of data encryption technology in computer security technology. Inf. Syst. Eng. 30, 66–67 (2012)
7. Juan, Z.: Design and implementation of a simple file transfer system. Chin. Market 50, 242 (2015)
8. Shi, H.: Design and implementation of network security transmission system based on hybrid cryptography. Inf. Secur. Technol. (05), 05–19 (2011)
9. Wang, J.: Research and design of SSL-based file secure transmission system. Chengdu University of Technology (2012)
10. Xing, J.: Research on file encryption algorithm and its application in network transmission. Xidian University (2012)

Secure Outsourcing Protocol Based on Paillier Algorithm for Cloud Computing Su Yunxuan, Tu Zheng, Wang Xu An(B) , and Li Ruifeng Engineering University of PAP, Xi’an 710086, China [email protected], [email protected], [email protected], [email protected]

Abstract. With the rapid development of the Internet, network technology has gradually penetrated the lives of ordinary people, and the advent of the big data era means that much of people's work is done through computers. The development of cloud computing provides people with a virtual platform for solving computing problems. However, when users outsource data to cloud computing platforms, cloud security issues follow. Because cloud computing is an emerging technology, its confidentiality mechanisms are not yet sufficient to meet user needs, the relevant laws and regulations are not perfect, and mature service agreement management standards and computing mechanisms are lacking. To address the security problems of cloud computing, homomorphic encryption came into being. By analyzing the development trend of cloud computing and the application of homomorphic encryption in cloud computing, this paper introduces the basics of the Paillier encryption algorithm and proposes a secure outsourcing protocol for cloud data, which partially solves the blind computing problem. The method, which splits the decryption between the two parties through key division, ensures the security of the data during computation and protects the integrity and confidentiality of the outsourced data throughout the process. Protocol analysis shows that the method achieves the expected result.

1 Introduction

With the rapid development of computer technology, the requirements for data computing are getting higher and higher, and the emergence of cloud computing provides great convenience for users [1]. Cloud computing is favored by users because of its large computing scale, high reliability and good scalability. However, the security of cloud computing has also become a huge hidden danger: data loss, data destruction, leaks by internal personnel and similar events cause a series of security problems [2]. Homomorphic encryption is different from current encryption technology and can, in theory, solve the security problems of cloud computing well. Ordinarily, data must be encrypted before being uploaded to the cloud computing platform and then transferred to cloud storage; if a user wants to perform further computation on the data, it must first be decrypted, which has become a serious flaw in cloud computing [3]. But if the data is homomorphically encrypted first, the corresponding computation can be performed on the data without decrypting the ciphertext on the cloud computing platform, and the result is the same as performing the corresponding operations after decrypting the data. The original data is never exposed, and the computation operates on ciphertext throughout. Therefore, choosing a reasonable encryption scheme between users and cloud service providers is the guarantee for users to obtain reliable cloud computing services [4]. The main contributions of this paper are summarized as follows:
1) We propose a homomorphic encryption scheme based on the Paillier algorithm, which protects data privacy and security after the data is outsourced to a cloud computing platform. This is a suitable solution for cloud data outsourcing.
2) We prove the feasibility of the scheme with rigorous mathematics, and demonstrate its safety and reliability through theoretical analysis and comparison.
The rest of the paper is arranged as follows: in the second and third parts, we introduce the background of homomorphic encryption and the basic knowledge involved in the encryption scheme. The fourth part introduces the specific process of the scheme and proves its feasibility through calculation and analysis. Finally, the fifth part summarizes the work and outlines further work.

2 Background and Preliminary Preparation

2.1 System Model
As shown in Fig. 1, we divide the entire outsourcing protocol into the following three parts:

Fig. 1. Scheme flow chart

Cloud administrator: responsible for distributing the keys used in the communication between the user and the cloud key management center. The role is similar to a first inspector, who verifies the authenticity of the user's identity and distributes keys on that basis.
Cloud key center: mainly responsible for distributing the keys used in the communication between users and the cloud computing center. In the process of generating the encryption and decryption keys, it divides the decryption key into two parts.
Cloud data processing platform: responsible for the data that users upload to the platform. It computes on the data according to the processing method negotiated between the user and the cloud provider, and stores the computed data in the cloud database.
The basic idea of the protocol is based on the Paillier homomorphic encryption algorithm. The decryption key is divided into two parts, which are distributed to the user and the cloud data processing center. The ciphertext is partially decrypted with the different key shares and the partial results are then merged, which ensures the security of the data.

2.1.1 Scheme Framework
See Figs. 2 and 3.

2.2 Preliminary Knowledge
1) Carmichael function: let n be a positive integer and let λ(n) be the minimum positive integer m satisfying a^m ≡ 1 (mod n) for all integers a coprime to n. λ(n) is called the Carmichael function. When n is 1, 2, 4, a power of an odd prime, or twice a power of an odd prime, λ(n) equals the Euler function φ(n); when n is a power of 2 other than 2 and 4, λ(n) = φ(n)/2 [5].
2) Composite residuosity problem: let m be the product of two primes p and q. An integer z is an n-th residue modulo m^2 if there is an integer y satisfying y^n ≡ z (mod m^2). Then:
(1) The set of n-th residues modulo m^2 is a subgroup of the multiplicative group Z*_{m^2};
(2) Each n-th residue z modulo m^2 has n different n-th roots, and among these n roots exactly one is smaller than m;
(3) The n-th roots of unity modulo m^2 have the form (1 + m)^x ≡ 1 + mx (mod m^2).
3) Composite residuosity class:
(1) An integer-valued function F_g: F_g : Z_n × Z*_n → Z*_{n^2}, F_g(x, y) = g^x y^n mod n^2, g ∈ Z*_{n^2}.
(2) If g ∈ B and w ∈ Z*_{n^2}, the n-th residuosity class of w with respect to g is defined as the x for which there is a y ∈ Z*_n satisfying F_g(x, y) = g^x y^n mod n^2 = w. This residuosity class of w is denoted [w]_g.


Fig. 2. Encryption process

Fig. 3. Decryption process


(3) N-th residuosity class problem. N-th residue: an integer z is an n-th residue modulo n^2 if there is an integer y ∈ Z*_{n^2} such that y^n = z mod n^2. The set of n-th residues forms a subgroup under multiplication. Every n-th residue has n n-th roots; let the smallest root be denoted ⁿ√z, which is smaller than n. Because x^n = (x + kn)^n mod n^2, we can always find such a root less than n, and any two distinct numbers a, b less than n satisfy a^n ≠ b^n mod n^2.

2.3 Paillier Encryption Principle
1) Public and private key generation: select two large prime numbers p and q, compute n = p × q and the Carmichael function λ = λ(n) = lcm(p − 1, q − 1). Select a random number g ∈ Z*_{n^2} and ensure the existence of u = (L(g^λ mod n^2))^{-1} mod n, where L(x) = (x − 1)/n. The public key is (n, g) and the private key is (λ, u).
2) Encryption: select the plaintext message m ∈ [0, n − 1] and a random number r ∈ (0, n − 1], and compute C = g^m r^n mod n^2.
3) Decryption: m = L(C^λ mod n^2) · u mod n [6].
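To make the three steps concrete, the following is a minimal Java sketch of textbook Paillier using BigInteger, with the simplification g = n + 1 that is also adopted in Sect. 4; key sizes and names are illustrative only, not the paper's implementation.

import java.math.BigInteger;
import java.security.SecureRandom;

// A minimal sketch of textbook Paillier with g = n + 1.
public class Paillier {
    static final SecureRandom RNG = new SecureRandom();
    final BigInteger n, n2, g, lambda, u;

    Paillier(int bits) {
        BigInteger p = BigInteger.probablePrime(bits, RNG);
        BigInteger q = BigInteger.probablePrime(bits, RNG);
        n = p.multiply(q);
        n2 = n.multiply(n);
        g = n.add(BigInteger.ONE);
        BigInteger p1 = p.subtract(BigInteger.ONE);
        BigInteger q1 = q.subtract(BigInteger.ONE);
        lambda = p1.multiply(q1).divide(p1.gcd(q1));   // lcm(p - 1, q - 1)
        u = L(g.modPow(lambda, n2)).modInverse(n);     // u = L(g^lambda mod n^2)^{-1} mod n
    }

    BigInteger L(BigInteger x) {                       // L(x) = (x - 1) / n
        return x.subtract(BigInteger.ONE).divide(n);
    }

    BigInteger encrypt(BigInteger m) {                 // C = g^m * r^n mod n^2
        BigInteger r;
        do {
            r = new BigInteger(n.bitLength() - 1, RNG);
        } while (r.signum() == 0 || !r.gcd(n).equals(BigInteger.ONE));
        return g.modPow(m, n2).multiply(r.modPow(n, n2)).mod(n2);
    }

    BigInteger decrypt(BigInteger c) {                 // m = L(C^lambda mod n^2) * u mod n
        return L(c.modPow(lambda, n2)).multiply(u).mod(n);
    }
}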

3 Our Scheme

In this section, we focus on the process and details of the scheme. First, the user and the cloud provider negotiate an outsourced computing agreement based on the Paillier encryption algorithm. When the user makes a request to the cloud platform, the cloud administrator distributes the key K1 to the user, which is used to communicate with the cloud key management center. After receiving the service request, the key management center generates the encryption and decryption keys. This key generation process is the same as in the Paillier encryption algorithm; the only difference is that, after the decryption key is generated, the cloud key management center divides the private key λ into two parts λ1 and λ2. The cloud key management center transmits the generated computation parameters to the user and the cloud data processing platform, and the user encrypts the plaintext message m with these parameters to obtain the ciphertext C = g^m r^n mod n^2, which is then uploaded to the cloud data processing platform. The platform performs the computation operations required by the user, and the resulting data is stored in the database of the cloud platform. When the user requests the data, the cloud key management center sends the private key λ1 to the user and the private key λ2 to the cloud data processing platform, so that the platform uses λ2 to partially decrypt the ciphertext: c2 = c^{λ2} = r^{nλ2}(1 + mnλ2) mod n^2. The platform then sends this partially decrypted ciphertext together with the original ciphertext to the user. The user decrypts on the local computing server, computing c1 = c^{λ1} = r^{nλ1}(1 + mnλ1) mod n^2, merges the two parts as T = (c1 × c2) mod n^2, and finally computes m = L(T) to obtain the plaintext.
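A sketch of the key-splitting step and the merge is given below. The CRT-based construction of λ1 and λ2 is our own illustration of one way to satisfy λ1 + λ2 ≡ 0 (mod λ) and λ1 + λ2 ≡ 1 (mod n) (assuming gcd(λ, n) = 1); the paper does not prescribe a particular construction. The sketch reuses the Paillier values n, n2 = n^2 and λ from the previous sketch.

import java.math.BigInteger;
import java.security.SecureRandom;

// Sketch of key splitting and the two-party decryption merge.
public class SplitDecryption {
    static final SecureRandom RNG = new SecureRandom();

    // Choose lambda1 at random and solve for lambda2 so that
    // lambda1 + lambda2 = 0 (mod lambda) and lambda1 + lambda2 = 1 (mod n).
    static BigInteger[] splitKey(BigInteger lambda, BigInteger n) {
        BigInteger mod = lambda.multiply(n);
        // By the CRT, s = lambda * (lambda^{-1} mod n) is 0 mod lambda and 1 mod n.
        BigInteger s = lambda.multiply(lambda.modInverse(n)).mod(mod);
        BigInteger lambda1 = new BigInteger(mod.bitLength() - 1, RNG);
        BigInteger lambda2 = s.subtract(lambda1).mod(mod);
        return new BigInteger[] { lambda1, lambda2 };
    }

    // The platform computes c2 = c^{lambda2} mod n^2, the user computes
    // c1 = c^{lambda1} mod n^2, and merging the two shares recovers m.
    static BigInteger merge(BigInteger c1, BigInteger c2,
                            BigInteger n, BigInteger n2) {
        BigInteger t = c1.multiply(c2).mod(n2);              // T = (c1 * c2) mod n^2
        return t.subtract(BigInteger.ONE).divide(n).mod(n);  // m = L(T) mod n
    }
}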


4 Security Analysis

The cloud computing outsourcing protocol designed in this paper is based on the Paillier homomorphic encryption algorithm and processes the data by key splitting. For the choice of g we set g = n + 1: the order of g in Z*_{n^2} must be divisible by n, and (n + 1) is an element of order exactly n, so it satisfies the condition. The encryption computation is C = g^m r^n mod n^2. Moreover, since ε_g : Z_n × Z*_n → Z*_{n^2}, (x, y) → g^x y^n mod n^2 is a bijection, it is not computationally feasible to find a different pair (x, y) that yields the same ciphertext [7]. For the keys, the private key λ is divided into λ1 and λ2, so that the cloud data processing platform computes one part of the decryption, c2 = r^{nλ2}(1 + mnλ2), and the local server computes the other part, c1 = r^{nλ1}(1 + mnλ1). Merging gives

T = (c1 × c2) mod n^2 = r^{n(λ1+λ2)}(1 + mnλ1 + mnλ2 + m^2 n^2 λ1 λ2) mod n^2.

Since λ1 + λ2 ≡ 0 mod λ, λ1 + λ2 ≡ 1 mod n and r ∈ Z*_n, we have r^{n(λ1+λ2)} ≡ 1 mod n^2 and m^2 n^2 λ1 λ2 ≡ 0 mod n^2, so that

T = 1 + mn(λ1 + λ2) mod n^2, and L(T) = m(λ1 + λ2) mod n = m [8].

This proves that the algorithm is computationally feasible. Moreover, even if a ciphertext share becomes known to a third party, it is only part of the ciphertext, which ensures the integrity of the ciphertext to a certain extent.

5 Applications

Here we describe the first application. To be close to real life, we apply the scheme to an electronic voting system. The ballot information is handed over to a trusted third party for processing, which helps to protect the users' personal information. After processing, we obtain the voting results without revealing the users' personal or ballot information. For example, when a user votes for someone, the system encrypts the ballot information and uploads it to the cloud data processing platform. After the platform receives the ciphertexts of the ballots uploaded by the users, it performs the corresponding statistical operations on the ciphertext information. When returning the result, the cloud data processing platform decrypts one part of the ciphertext and returns it to the user, and the user decrypts the remaining part to obtain the complete ballot information (Fig. 4).

Fig. 4. Vote information encryption process
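The statistical operation on the ballot ciphertexts relies on the additive homomorphism of the Paillier algorithm, E(m1) · E(m2) mod n^2 = E(m1 + m2), a standard property of the scheme not spelled out above. The following minimal tallying sketch is our own illustration, not part of the paper's protocol.

import java.math.BigInteger;

// Multiplying Paillier ciphertexts adds the underlying plaintexts, so the
// platform can tally encrypted 0/1 ballots without ever decrypting them.
public class VoteTally {
    static BigInteger tally(BigInteger[] ballots, BigInteger n2) {
        BigInteger product = BigInteger.ONE;
        for (BigInteger c : ballots) {
            product = product.multiply(c).mod(n2);   // E(m1) * E(m2) = E(m1 + m2)
        }
        return product;  // decrypting this yields the total number of "yes" votes
    }
}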

Here we describe the second application. We apply the scheme to an online payment system. For example, the payment information, which is sent from mobile devices such as mobile phones or computers to the background, is encrypted. The platform generates a payment code based on the information submitted by the user and returns it to the user, decrypting one segment of the code in the background; the user then decrypts the remaining segment on the mobile device, so that the complete payment code is presented to the user.


Throughout the whole process, the platform guarantees the privacy and security of the payment code. Because the payment information sent by the user differs each time, the generated payment code also differs, which ensures payment security (Fig. 5).

Fig. 5. Electronic payment encryption scheme

6 Conclusion

In recent years, the rapid development of cloud computing technology has promoted the redistribution of Internet resources. However, as a new technology, cloud computing still faces many security challenges, and how to ensure the privacy of data in cloud computing remains a serious problem. This paper proposes a cloud data outsourcing protocol based on the Paillier algorithm and demonstrates its security through protocol analysis. At present, however, the protocol is not yet complete: there is no specific decryption scheme covering all kinds of computing operations on data in the cloud computing platform. Designing a more practical outsourcing protocol is future work.

Acknowledgements. This work is supported by the National Cryptography Development Fund of China (No. MMJJ20170112), Natural Science Basic Research Plan in Shaanxi Province of China (grant no. 2018JM6028), National Natural Science Foundation of China (No. 61772550, U1636114 and 61572521), the Foundation of Guizhou Provincial Key Laboratory of Public Big Data (No. 2019BDKFJJ008), and National Key Research and Development Program of China (No. 2017YFB0802000). This work is also supported by Engineering University of PAP's Funding for Scientific Research Innovation Team (No. KYTD201805) and Engineering University of PAP's Funding for Key Researcher (No. KYGG202011).


References
1. China Cryptography Society Group: China Cryptography Development Report 2011. China Information Technology, no. 20, p. 71 (2012)
2. Wang, W.: Development of modern cryptography technology and application in data security. Comput. Secur. (02), 36–39 (2012)
3. Liu, R.: Confidential computing research and application in cloud environment. Huazhong University of Science and Technology (2019)
4. Xu, W.: Blockchain-based Electronic Health Records Privacy Protection Mechanism
5. Wei, W.: Paillier homomorphic cryptography and its application in privacy protection
6. Paillier, P.: Public-key cryptosystems based on composite degree residuosity classes. In: International Conference on Advances in Cryptology-Eurocrypt. Springer, Heidelberg (1999)
7. Bai, J., Yang, Y., Li, Z.: Paillier public key cryptography system homomorphic characteristics and efficiency analysis. Beijing Inst. Electron. Sci. Technol. J. 20(04), 1–5 (2012)
8. Liu, X., Deng, R.H., Yang, Y., Tran, H.N., Zhong, S.: Hybrid privacy-preserving clinical decision support system in fog-cloud computing. Future Gener. Comput. Syst. (2017)

Energy Consumption and Computation Models of Storage Systems

Wenlun Tong(B), Takumi Saito, and Makoto Takizawa

Hosei University, Tokyo, Japan
[email protected], [email protected], [email protected]

Abstract. It is critical to reduce the electric energy consumption of information systems in order to realize green societies. Applications like database and web applications make use of data in the storage systems of servers. In this paper, we consider RAID storage systems which are composed of multiple drives, i.e. hard disk drives (HDDs) and solid state drives (SSDs). Three types of RAID storage systems, RAID0, RAID10(1+0), and RAID5, are considered. The performance and reliability of RAID storage systems have been studied by many researchers: the more storage drives can be accessed in parallel in a RAID storage system, the more electric energy is consumed, while higher reliability and availability are supported. However, the electric energy consumed by RAID storage systems to read and write data has so far not been discussed. In this paper, we measure the power consumption of RAID storage systems and the time to read and write data in the storage systems in order to build a power consumption model of a storage system. Through experiments, we make clear how much energy each type of RAID storage system consumes to sequentially and randomly read and write data.

Keywords: Electric energy consumption · RAID storage systems · Hard Disk Drive (HDD) · Solid State Drive (SSD) · Power consumption model

1 Introduction

Information systems are getting more scalable, as in cloud computing systems [9] and the IoT (Internet of Things) [5]. Here, a huge amount of electric energy is consumed due to this scalability, e.g. millions of devices are interconnected in the IoT. It is critical to decrease the electric energy consumed in information systems in order to reduce carbon dioxide emissions on the earth. Macro-level power consumption and computation models of a computer to perform application processes have been proposed [8,10] in order to design and implement energy-efficient information systems; models and algorithms to select servers [8,10] to perform application processes and to make virtual machines migrate [7] in a cluster of servers have been proposed in order to reduce energy consumption. In this paper, we consider how much energy a storage system consumes to perform application processes which read and write data in multiple drives, i.e. hard disk drives (HDDs) and solid state drives (SSDs). The RAID (Redundant Arrays of Independent Disks) models of storage systems [6] are used to improve the performance and reliability of storage systems. Here, data units, i.e. blocks of each file f, are distributed and replicated on multiple storage drives. Blocks on different drives are accessed in parallel to increase performance, and multiple replicas of each block are stored on different drives to increase reliability and availability: one replica of each block remains operational on another drive even if one drive is faulty. The performance and reliability of RAID storage systems have been studied by many researchers [6]; however, the energy consumption of the storage systems has not been discussed. In this paper, we discuss how much electric energy each model of RAID storage system consumes to read and write data. We measure the power consumption [W] of a RAID storage system and the time to sequentially and randomly read and write data in the storage system through experiments. In Sect. 2, we present a model of RAID storage systems. In Sect. 3, we measure the electric energy consumption and the time to read/write data in RAID storage systems.

2 System Model

A storage system S is composed of multiple storage drives SD1, ..., SDd (d ≥ 1) where files are stored and accessed. Each drive SDi is a hard disk drive (HDD) or a solid state drive (SSD) (i = 1, ..., d). We consider three types of RAID (Redundant Arrays of Independent Disks) [2] storage systems in this paper: RAID0, RAID10(1+0), and RAID5. A file f is a sequence of blocks b1, ..., bm (m ≥ 1). A block is a storage unit which is read and written in a read and write operation, i.e. a block is the unit of a read/write operation. A pair of blocks bi and bi+1 (1 ≤ i < m) are referred to as consecutive in the file f. If the file f is sequentially accessed, a block bi is read before bi+1; in random access, a block bi is accessed directly. The RAID0 storage system supports striping (Fig. 1). That is, the first block b1 of a file f is stored on the first storage drive SD1 and the second block b2 is stored on the second drive SD2. Thus, each block bi is stored on the drive SD(i−1)%d+1, where x % y stands for the modulo of integer x by y. A pair of blocks bi and bj where i % d ≠ j % d, i.e. which are on different drives, can be accessed in parallel. If one drive SDi is faulty, the blocks on the drive SDi are lost, since no block is replicated. The RAID10 storage system supports reliability and availability by using mirroring in addition to striping, i.e. parallel access (Fig. 2). Suppose a storage system is composed of four drives (d = 4). Each block bi is replicated on a pair of different drives: for example, the first block b1 of a file f is stored on the drives SD1 and SD2, and the second block b2 is stored on the drives SD3 and SD4.


Fig. 1. RAID0 (d = 4).

Then, the third block b3 is stored on the drives SD1 and SD2. Thus, each block bi is replicated on two different drives, and the storage size for a file f is double that of RAID0, since each block is replicated twice. Even if one drive gets faulty, one replica of each block bi remains proper on the other drive.

Fig. 2. RAID10 (d = 4).

In the RAID5 storage system, the sequence of blocks b1, ..., bm of a file f is divided into subsequences, each of which includes (d − 1) consecutive blocks bi, bi+1, ..., bi+d−2. A parity block is created for each subsequence of (d − 1) blocks: for example, a parity block pb1,d−1 of a subsequence of blocks b1, ..., bd−1 is created by taking the exclusive or (xor) ⊕ of the blocks, i.e. pb1,d−1 = b1 ⊕ ... ⊕ bd−1. The subsequence b1, ..., bd−1, pb1,d−1 of the (d − 1) blocks and the parity block is stored on the drives SD1, ..., SDd−1, SDd, respectively. The next subsequence bd, ..., b2d−2 and its parity block are stored so that the parity block occupies a different drive, i.e. the position of the parity block rotates across the drives from one subsequence to the next. Hence, the size of the data stored in the RAID5 storage system is smaller than in RAID10 and larger than in RAID0. Even if one block bi is faulty, the faulty block bi can be recovered by taking the xor ⊕ of the other blocks and the parity block in its subsequence. For example, if the block b5 is lost, it is obtained as b5 = b4 ⊕ pb4,6 ⊕ b6 (Fig. 3).

Fig. 3. RAID5 (d = 4).
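As a small, self-contained illustration of the parity computation and recovery (our own sketch, assuming equal-length blocks):

// XOR-based parity as in RAID5: pb = b1 ^ ... ^ b_{d-1}, and a lost block
// is recovered by XOR-ing the parity block with the surviving blocks.
public class Raid5Parity {
    static byte[] xor(byte[]... blocks) {
        byte[] out = new byte[blocks[0].length];   // all blocks assumed equal length
        for (byte[] b : blocks) {
            for (int i = 0; i < out.length; i++) {
                out[i] ^= b[i];
            }
        }
        return out;
    }
    // e.g. pb46 = xor(b4, b5, b6);  a lost b5 = xor(b4, pb46, b6)
}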

Table 1 summarizes the properties of each type of RAID storage system, where a file of m (≥ 1) blocks b1, ..., bm is stored on d (d ≥ 2) drives SD1, ..., SDd. In the RAID0 storage system, no replica of any block bi is created, i.e. there is no redundancy; on the other hand, blocks on different drives can be concurrently accessed by applications. In the RAID10 storage system, two replicas of each block are stored on different drives, so RAID10 supports more reliability than RAID0. In the RAID5 storage system, one parity block is created for every (d − 1) consecutive blocks, and each subsequence of (d − 1) blocks plus the parity block is stored on different drives. In total, m · d/(d − 1) blocks are stored on the d drives SD1, ..., SDd. The size of the data stored on the d drives of the RAID5 storage system is d/(d − 1) times larger than RAID0 and d/[2(d − 1)] times smaller than RAID10.

3 Experiment

3.1 System Configuration
We measure the execution time [sec] to read and write data and the energy consumption [J] of each model of RAID storage system, i.e. RAID0, RAID10, and RAID5.


Table 1. Properties of RAID.

RAID   Storage size    Redundancy   Number of blocks
0      m               0            m/d
10     2m              1            2m/d
5      m · d/(d−1)     d/(d−1)      m/(d−1)

m = number of blocks. d = number of drives.

We consider the storage system "Yottamaster Y-Focus Series 4-Bay" [4], which can be configured as RAID0, RAID10, or RAID5. Here, the storage system is composed of four storage drives SD1, ..., SD4 (d = 4). In each drive slot SDi, an HDD (Seagate BarraCuda, 2 [TB]) or an SSD (Crucial MX500, 500 [GB]) can be installed for the experiments. The power consumption [W] of the storage system is measured by using the UWmeter [3]. The electric power is supplied to the RAID storage system S through the UWmeter as shown in Fig. 4. The power consumption [W] of the RAID storage system S is measured every one hundred [milliseconds] by the UWmeter, and the measured electric power [W] is transferred to a notebook PC via Bluetooth communication.

Fig. 4. Experiment.

The storage system S is connected to a Windows PC as shown in Fig. 4. First, we measure the electric power of the storage system S where k (≤ d) storage drives are used. Initially, the storage system consumes the minimum power minE = 3.618 [W] when no drive is accessed. Figures 5 and 6 show the power consumption of the storage system S for the number k (≤ d) of storage drives, for HDDs and SSDs, respectively. Here, a file f of 10 [GB] is copied to each of the k drives. The storage system S of HDDs consumes 15.063, 16.207, 16.824, and 17.294 [W] for k = 1, 2, 3, and 4, respectively. The storage system S of SSDs consumes 5.491, 5.91, 6.063, and 6.142 [W] for k = 1, 2, 3, and 4, respectively.


Fig. 5. Power consumption of HDDs.

Fig. 6. Power consumption of SSDs.

3.2 RAID for Sequential Access
First, the RAID type (RAID0, RAID10, or RAID5) is fixed in the experiment. Then, a file f is sequentially written to the storage system by using a copy command from the PC. In turn, the file f is sequentially read by a copy command. Here, we measure the power consumption of the storage system S for the RAID0, RAID10, and RAID5 types.


Figures 7 and 8 show the power consumption of the RAID storage system S composed of HDDs and SSDs, respectively. Here, a file of 10 [GB] is written to and read from the RAID storage system. The power consumption of the storage system composed of HDDs is 16.99, 18.20, and 16.83 [W] for the RAID0, RAID10, and RAID5 types, respectively. In the storage system composed of SSDs, the power consumption is 6.78, 6.93, and 6.84 [W] for the RAID0, RAID10, and RAID5 types, respectively. The power consumption of the storage system S composed of SSDs is about 60% smaller than that of the HDDs. The RAID10 storage system S consumes more power than RAID0 and RAID5: for the HDDs, RAID10 consumes the most power and RAID5 the least, while for the SSDs, RAID0 consumes the least power and RAID10 the most.

Fig. 7. Power consumption (HDDs) in sequential access.

Figures 9 and 10 show the execution time [sec] versus the data size [GB] of a file f written to the RAID storage system S. The execution time of the RAID storage system S composed of HDDs to write a file f of 10 [GB] is 261, 265, and 353 [sec] for the RAID0, RAID10, and RAID5 types, respectively: RAID0 is the fastest and RAID5 the slowest for the HDDs. For the RAID storage system S composed of SSDs, the execution time to write a file f of 10 [GB] is 259, 257, and 256 [sec] for RAID0, RAID10, and RAID5, respectively: RAID5 is the fastest and RAID0 the slowest.
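Since energy [J] is average power [W] times execution time [sec], the measured figures above can be combined directly. The following sketch reproduces the HDD sequential-access power and 10 [GB] write-time figures reported in the text; treating the write time against the sequential-access average power is an approximation, not a value reported by the experiment.

// Energy [J] = average power [W] x execution time [sec].
public class EnergyEstimate {
    public static void main(String[] args) {
        String[] raid  = { "RAID0", "RAID10", "RAID5" };
        double[] power = { 16.99, 18.20, 16.83 };   // [W], HDDs, sequential access
        double[] time  = { 261, 265, 353 };          // [sec], 10 GB write, HDDs
        for (int i = 0; i < raid.length; i++) {
            System.out.printf("%s: about %.0f J%n", raid[i], power[i] * time[i]);
        }
    }
}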


Fig. 8. Power consumption (SSDs) in sequential access.

Fig. 9. Execution time (HDDs) in sequential access.

3.3 RAID for Random Access
Next, data in a RAID storage system S is randomly accessed. By using HD Tune [1], data of size 512 [B] to 1 [MB] is randomly read in the file f of 10 [GB]; the average size of the accessed data is 4 [KB]. Figures 11 and 12 show the power consumption [W] of the storage system S composed of HDDs and SSDs, respectively. The power consumption of the storage system composed of HDDs is 19.742, 17.05, and 19.318 [W] for the RAID0, RAID10, and RAID5 types, respectively. The power consumption of the storage system composed of SSDs is 6.919, 5.933, and 7.047 [W] for the RAID0, RAID10, and RAID5 types, respectively. The power consumption of the storage system S composed of SSDs is about 65% smaller than that of the HDDs. The RAID0 storage system S of HDDs consumes more power than RAID10


Fig. 10. Execution time (SSDs) in sequential access.

Fig. 11. Power consumption (HDDs) in random access.

and RAID5, and RAID10 consumes the smallest power among RAID0, RAID10, and RAID5. Compared with sequential access, the power consumption of the RAID storage system composed of HDDs is smaller in sequential access than in random access for the RAID0, RAID10, and RAID5 types. In the storage system composed of SSDs, the power consumption of random access is also larger than that of sequential access. Figures 13 and 14 show the execution time [sec] of the storage system S of HDDs and SSDs, respectively. The execution time in random access of the RAID storage system S of HDDs to read a file f of 10 [GB] is 92, 87, and 92 [sec] for the RAID0, RAID10, and RAID5 types, respectively. For the RAID storage system S composed of SSDs, the execution time in random access to read a


Fig. 12. Power consumption (SSDs) in random access.

Fig. 13. Execution time (HDDs) in random access.

file f of 10 [GB] is 43, 43, and 42 [sec] for the RAID0, RAID10, and RAID5 types, respectively. RAID5 is the fastest while RAID0 and RAID10 are the slowest; for the SSDs, the execution time in random access is about 76% smaller than the execution time in sequential access.


Fig. 14. Execution time (SSDs) in random access.

4 Concluding Remarks

The RAID storage system is widely used to realize reliable, high-performance storage systems. The more storage drives on which data is stored and replicated, the more efficient and reliable the storage system; on the other hand, the more electric energy is consumed. In this paper, we measured the energy consumption [J] and execution time [sec] of the RAID0, RAID10, and RAID5 storage systems with HDDs and SSDs to sequentially and randomly read and write data. By taking advantage of the measured data, we are now building the power consumption model and the computation model of a storage system.

References
1. HD Tune. http://hdtune.com
2. RAID wiki. https://ja.wikipedia.org/wiki/RAID
3. UWmeter. http://www.metaprotocol.com/UWmeter/UWmeter/TOP.html
4. Yottamaster RAID Y-Focus Series 4-Bay. https://www.amazon.co.jp/Yottamster
5. Arridha, R., Sukaridhoto, S., Pramadihanto, D., Funabiki, N.: Classification extension based on IoT-big data analytic for smart environment monitoring and analytic in real-time system. International Journal of Space-Based and Situated Computing (IJSSC) 7(2), 82–93. https://doi.org/10.1504/IJSSC.2017.10008038
6. Chen, P.M., Lee, E.K.: RAID: high-performance, reliable secondary storage. ACM Computing Surveys 26(2), 145–185 (1994)
7. Duolikun, D., Enokido, T., Takizawa, M.: An energy-aware algorithm to migrate virtual machines in a server cluster. International Journal of Space-Based and Situated Computing (IJSSC) 7(1), 32–42. https://doi.org/10.1504/IJSSC.2017.10004986


8. Enokido, T., Ailixier, A., Takizawa, M.: An extended simple power consumption model for selecting a server to perform computation type processes in digital ecosystems. IEEE Transactions on Industrial Informatics 10(2), 1627–1636 (2014)
9. F., A.H., Alenezi, A., Alharthi, A.: Integration of cloud computing with Internet of Things. In: IEEE International Conference on Internet of Things (2017)
10. Kataoka, H., Sawada, A., Duolikun, D., Enokido, T., Takizawa, M.: Multi-level power consumption and computation models and energy-efficient server selection algorithms in a scalable cluster. In: Proc. of the 19th International Conf. on Network-Based Information Systems (NBiS-2016), pp. 210–217 (2016)

Performance Analysis of WMNs by WMN-PSODGA Simulation System Considering Uniform Distribution of Mesh Clients and Different Router Replacement Methods

Seiji Ohara1(B), Admir Barolli2, Phudit Ampririt1, Keita Matsuo3, Leonard Barolli3, and Makoto Takizawa4

1 Graduate School of Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
[email protected], [email protected]
2 Department of Information Technology, Aleksander Moisiu University of Durres, L.1, Rruga e Currilave, Durres, Albania
[email protected]
3 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
{kt-matsuo,barolli}@fit.ac.jp
4 Department of Advanced Sciences, Faculty of Science and Engineering, Hosei University, Kajino-Machi, Koganei-Shi, Tokyo 184-8584, Japan
[email protected]

Abstract. Wireless Mesh Networks (WMNs) are an important networking infrastructure and have many advantages, such as low cost and high-speed wireless Internet connectivity. However, they have some problems, such as router placement, coverage of mesh clients and load balancing. To deal with these problems, in our previous work we implemented a Particle Swarm Optimization (PSO) based simulation system, called WMN-PSO, and a simulation system based on Genetic Algorithm (GA), called WMN-GA. Then, we implemented a hybrid simulation system based on PSO and distributed GA (DGA), called WMN-PSODGA. Moreover, we added to the fitness function a new parameter for the load balancing of the mesh routers called NCMCpR (Number of Covered Mesh Clients per Router). In this paper, we consider a Uniform distribution of mesh clients and five router replacement methods and carry out simulations using the WMN-PSODGA system. The simulation results show that the LDVM and RDVM router replacement methods have better performance than the other methods. Comparing LDVM and RDVM, we see that RDVM has better behavior.

1 Introduction

Wireless networks and devices can provide users with access to information and communication anytime and anywhere [3,8–11,14,20,26,27,29,33]. Wireless Mesh Networks (WMNs) are gaining a lot of attention because of their low-cost nature, which makes them attractive for providing wireless Internet connectivity. A WMN is dynamically self-organized and self-configured, with the nodes in the network automatically establishing and maintaining mesh connectivity among themselves (creating, in effect, an ad hoc network). This feature brings many advantages to WMNs, such as low up-front cost, easy network maintenance, robustness and reliable service coverage [1]. Moreover, such an infrastructure can be used to deploy community networks, metropolitan area networks, municipal and corporate networks, and to support applications for urban areas and medical, transport and surveillance systems. Mesh node placement in WMNs can be seen as a family of problems which are shown (through graph-theoretic approaches or placement problems, e.g. [6,15]) to be computationally hard to solve for most formulations [37]. We consider the version of the mesh router node placement problem in which we are given a grid area where a number of mesh router nodes are to be deployed and a number of mesh client nodes at fixed positions (of an arbitrary distribution) in the grid area. The objective is to find a location assignment of the mesh routers to the cells of the grid area that maximizes network connectivity and client coverage while considering load balancing for each router. Network connectivity is measured by the Size of Giant Component (SGC) of the resulting WMN graph, while user coverage is simply the number of mesh client nodes that fall within the radio coverage of at least one mesh router node, measured by the Number of Covered Mesh Clients (NCMC). For load balancing, we added to the fitness function a new parameter called NCMCpR (Number of Covered Mesh Clients per Router). Node placement problems are known to be computationally hard to solve [12,13,38]. In previous works, some intelligent algorithms have been investigated for the node placement problem [4,7,16,18,21–23,31,32]. In [24], we implemented a Particle Swarm Optimization (PSO) based simulation system, called WMN-PSO. Also, we implemented another simulation system based on Genetic Algorithm (GA), called WMN-GA [19], for solving the node placement problem in WMNs. Then, we designed and implemented a hybrid simulation system based on PSO and distributed GA (DGA), which we call WMN-PSODGA. In this paper, we present a performance analysis of WMNs using the WMN-PSODGA system considering a Uniform distribution of mesh clients and different router replacement methods. The rest of the paper is organized as follows. We present our designed and implemented hybrid simulation system in Sect. 2. The simulation results are given in Sect. 3. Finally, we give conclusions and future work in Sect. 4.

2 Proposed and Implemented Simulation System

2.1 Particle Swarm Optimization

In PSO a number of simple entities (the particles) are placed in the search space of some problem or function and each evaluates the objective function at its current location. The objective function is often minimized and the exploration of the search space is not through evolution [17]. Each particle then determines its movement through the search space by combining some aspect of the history of its own current and best (best-fitness) locations with those of one or more members of the swarm, with some random perturbations. The next iteration takes place after all particles have been moved. Eventually the swarm as a whole, like a flock of birds collectively foraging for food, is likely to move close to an optimum of the fitness function. Each individual in the particle swarm is composed of three D-dimensional vectors, where D is the dimensionality of the search space. These are the current position xi, the previous best position pi and the velocity vi. The particle swarm is more than just a collection of particles. A particle by itself has almost no power to solve any problem; progress occurs only when the particles interact. Problem solving is a population-wide phenomenon, emerging from the individual behaviors of the particles through their interactions. In any case, populations are organized according to some sort of communication structure or topology, often thought of as a social network. The topology typically consists of bidirectional edges connecting pairs of particles, so that if j is in i's neighborhood, i is also in j's. Each particle communicates with some other particles and is affected by the best point found by any member of its topological neighborhood. This is just the vector pi for that best neighbor, which we will denote with pg. The potential kinds of population "social networks" are hugely varied, but in practice certain types have been used more frequently. We show the pseudo code of PSO in Algorithm 1. In the PSO process, the velocity of each particle is iteratively adjusted so that the particle stochastically oscillates around the pi and pg locations.

2.2 Distributed Genetic Algorithm

Distributed Genetic Algorithm (DGA) has been used in various fields of science and has shown its usefulness for the resolution of many computationally hard combinatorial optimization problems. We show the pseudo code of DGA in Algorithm 2.
Population of individuals: unlike local search techniques that construct a path in the solution space by jumping from one solution to another through local perturbations, DGA uses a population of individuals, giving the search a larger scope and better chances of finding good solutions. This feature is also known as the "exploration" process, in contrast to the "exploitation" process of local search methods.


Algorithm 1. Pseudo code of PSO.
/* Initialize all parameters for PSO */
Computation maxtime := Tp_max, t := 0;
Number of particle-patterns := m, 2 ≤ m ∈ N^1;
Particle-pattern initial solution := P_i^0;
Particle-pattern initial position := x_ij^0;
Particle initial velocity := v_ij^0;
PSO parameter := ω, 0 < ω ∈ R^1;
PSO parameter := C1, 0 < C1 ∈ R^1;
PSO parameter := C2, 0 < C2 ∈ R^1;
/* Start PSO */
Evaluate(G^0, P^0);
while t < Tp_max do
  /* Update velocities and positions */
  v_ij^(t+1) = ω · v_ij^t + C1 · rand() · (best(P_ij^t) − x_ij^t) + C2 · rand() · (best(G^t) − x_ij^t);
  x_ij^(t+1) = x_ij^t + v_ij^(t+1);
  /* If the fitness value is increased, a new solution is accepted. */
  Update_Solutions(G^t, P^t);
  t = t + 1;
end while
Update_Solutions(G^t, P^t);
return best found pattern of particles as solution;
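As a concrete illustration of the velocity and position update in Algorithm 1, the following minimal Java sketch is our own rendering of the standard PSO step, not code from the WMN-PSODGA system.

import java.util.Random;

// One velocity/position update step for a single particle, following the
// two update lines in Algorithm 1.
public class PsoStep {
    static final Random RAND = new Random();

    static void update(double[] x, double[] v, double[] pBest, double[] gBest,
                       double omega, double c1, double c2) {
        for (int j = 0; j < x.length; j++) {
            v[j] = omega * v[j]
                 + c1 * RAND.nextDouble() * (pBest[j] - x[j])
                 + c2 * RAND.nextDouble() * (gBest[j] - x[j]);
            x[j] += v[j];   // move the particle by its new velocity
        }
    }
}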

Fitness: The determination of an appropriate fitness function, together with the chromosome encoding, is crucial to the performance of DGA. Ideally we would construct objective functions with "certain regularities", i.e. objective functions such that for any two individuals which are close in the search space, their respective values of the objective function are similar.
Selection: The selection of individuals to be crossed is another important aspect of DGA, as it impacts the convergence of the algorithm. Several selection schemes have been proposed in the literature to cope with the premature convergence of DGA. In our system, we implement two selection methods: the Random method and the Roulette wheel method (a sketch of the latter is given after these operator descriptions).
Crossover operators: The use of crossover operators is one of the most important characteristics of DGA. The crossover operator is the means by which DGA transmits the best genetic features of parents to offspring over the generations of the evolution process. Many crossover operators have been proposed, such as Blend Crossover (BLX-α), Unimodal Normal Distribution Crossover (UNDX) and Simplex Crossover (SPX).
Mutation operators: These operators intend to improve the individuals of a population by small local perturbations. They aim to provide a component of randomness in the neighborhood of the individuals of the population. In our system, we implement two mutation methods: uniformly random mutation and boundary mutation.
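The roulette-wheel selection mentioned above picks individuals with probability proportional to their fitness values. A minimal sketch (our own illustration, assuming nonnegative fitness values):

import java.util.Random;

// Roulette-wheel selection: individual i is chosen with probability
// fitness[i] / (sum of all fitness values).
public class RouletteWheel {
    static final Random RAND = new Random();

    static int select(double[] fitness) {
        double total = 0;
        for (double f : fitness) total += f;
        double spin = RAND.nextDouble() * total;
        for (int i = 0; i < fitness.length; i++) {
            spin -= fitness[i];
            if (spin <= 0) return i;
        }
        return fitness.length - 1;   // guard against floating-point rounding
    }
}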


Escaping from local optima: GA itself has the ability to avoid falling prematurely into local optima and can eventually escape from them during the search process. DGA has one more mechanism to escape from local optima: it considers several islands, each of which runs a GA for optimization, and genes migrate between islands, providing the ability to escape from local optima (see Fig. 1).
Convergence: The convergence of the algorithm is the mechanism by which DGA reaches good solutions. Premature convergence would cause all individuals of the population to become similar in their genetic features, so that the search becomes ineffective and the algorithm gets stuck in local optima. Maintaining the diversity of the population is therefore very important to this family of evolutionary algorithms.

Algorithm 2. Pseudo code of DGA.
/* Initialize all parameters for DGA */
Computation maxtime := Tg_max, t := 0;
Number of islands := n, 1 ≤ n ∈ N^1;
Initial solution := P_i^0;
/* Start DGA */
Evaluate(G^0, P^0);
while t < Tg_max do
  for all islands do
    Selection();
    Crossover();
    Mutation();
  end for
  t = t + 1;
end while
Update_Solutions(G^t, P^t);
return best found pattern of particles as solution;

Fig. 1. Model of migration in DGA.

2.3 WMN-PSODGA Hybrid Simulation System

In this subsection, we present the initialization, particle-pattern, fitness function and replacement methods. The pseudo code of our implemented system is shown in Algorithm 3. Our implemented simulation system also uses a Migration function, as shown in Fig. 2; the Migration function swaps solutions among the lands included in the PSO part.

Algorithm 3. Pseudo code of WMN-PSODGA system.
Computation maxtime := T_max, t := 0;
Initial solutions: P;
Initial global solutions: G;
/* Start PSODGA */
while t < T_max do
  Subprocess(PSO);
  Subprocess(DGA);
  WaitSubprocesses();
  Evaluate(G^t, P^t);
  /* Migration() swaps solutions (see Fig. 2). */
  Migration();
  t = t + 1;
end while
Update_Solutions(G^t, P^t);
return best found pattern of particles as solution;

Fig. 2. Model of WMN-PSODGA migration.

Initialization
We decide the velocity of particles by a random process considering the area size. For instance, when the area size is W × H, the velocity is decided randomly in the range from −√(W^2 + H^2) to √(W^2 + H^2).

Particle-Pattern
A particle is a mesh router. A fitness value of a particle-pattern is computed from the combination of mesh router and mesh client positions. In other words, each particle-pattern is a solution, as shown in Fig. 3.


Fig. 3. Relationship among global solution, particle-patterns, and mesh routers in PSO part.

Gene Coding
A gene describes a WMN. Each individual has its own combination of mesh nodes; in other words, each individual has a fitness value. Therefore, the combination of mesh nodes is a solution.

Fitness Function
WMN-PSODGA uses a fitness function to evaluate the temporary solution of the router placements. The fitness function is defined as:

Fitness = α × NCMC(xij, yij) + β × SGC(xij, yij) + γ × NCMCpR(xij, yij).

This function uses the following indicators:
• NCMC (Number of Covered Mesh Clients): the number of clients covered by the SGC's routers.
• SGC (Size of Giant Component): the maximum number of connected routers.
• NCMCpR (Number of Covered Mesh Clients per Router): the number of clients covered by each router; this indicator is used for load balancing.
WMN-PSODGA aims to maximize the value of the fitness function in order to optimize the placement of the routers using the above three indicators. The weight-coefficients of the fitness function are α, β, and γ for NCMC, SGC, and NCMCpR, respectively, and they are implemented such that α + β + γ = 1. A sketch of this fitness computation is given below.
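The following Python sketch shows how such a fitness function might be computed for concrete router and client coordinates. The connectivity rule (two routers are linked when their distance is at most 2 × radius), the averaging used for NCMCpR, and the default weights are our assumptions; the paper does not specify them at this level of detail.

```python
import math
from collections import deque

def fitness(routers, clients, radius, alpha=0.4, beta=0.4, gamma=0.2):
    """alpha*NCMC + beta*SGC + gamma*NCMCpR for lists of (x, y) points."""
    n = len(routers)
    adj = [[j for j in range(n)
            if j != i and math.dist(routers[i], routers[j]) <= 2 * radius]
           for i in range(n)]
    seen, giant = set(), []
    for s in range(n):                      # BFS to find the giant component
        if s in seen:
            continue
        seen.add(s)
        comp, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        if len(comp) > len(giant):
            giant = comp
    sgc = len(giant)
    ncmc = sum(1 for c in clients
               if any(math.dist(routers[r], c) <= radius for r in giant))
    per_router = [sum(1 for c in clients
                      if math.dist(routers[r], c) <= radius) for r in giant]
    ncmcpr = sum(per_router) / len(per_router) if per_router else 0.0
    return alpha * ncmc + beta * sgc + gamma * ncmcpr
```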


Router Replacement Methods
A mesh router has x and y positions and a velocity, and mesh routers are moved based on their velocities. There are many router replacement methods, such as the following.

Constriction Method (CM)
CM is a method in which the PSO parameters are set to a weak stable region (ω = 0.729, C1 = C2 = 1.4955), based on the analysis of PSO by M. Clerc et al. [2, 5, 35].

Random Inertia Weight Method (RIWM)
In RIWM, the ω parameter changes randomly from 0.5 to 1.0, while C1 and C2 are kept at 2.0. The ω can be estimated by the weak stable region; the average of ω is 0.75 [28, 35].

Linearly Decreasing Inertia Weight Method (LDIWM)
In LDIWM, C1 and C2 are set to 2.0, constantly. On the other hand, the ω parameter changes linearly from the unstable region (ω = 0.9) to the stable region (ω = 0.4) as the number of iterations increases [35, 36].

Linearly Decreasing Vmax Method (LDVM)
In LDVM, the PSO parameters are set to the unstable region (ω = 0.9, C1 = C2 = 2.0). A value Vmax, the maximum velocity of particles, is considered, and it keeps decreasing linearly as the number of iterations increases [30, 34].

Rational Decrement of Vmax Method (RDVM)
In RDVM, the PSO parameters are set to the unstable region (ω = 0.9, C1 = C2 = 2.0). The Vmax keeps decreasing with the increase of iterations as

Vmax(x) = √(W² + H²) × (T − x) / x,

where W and H are the width and the height of the considered area, respectively, and T and x are the total number of iterations and the current iteration, respectively [25].
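A sketch of the LDIWM, LDVM, and RDVM schedules follows, assuming iterations are counted from t = 1 (RDVM is undefined at t = 0); the function names are ours.

```python
import math

def omega_ldiwm(t, t_total, w_start=0.9, w_end=0.4):
    """LDIWM: inertia weight decreases linearly from 0.9 to 0.4."""
    return w_start + (w_end - w_start) * t / t_total

def vmax_ldvm(t, t_total, width, height):
    """LDVM: Vmax decreases linearly from sqrt(W^2 + H^2) to 0."""
    return math.hypot(width, height) * (t_total - t) / t_total

def vmax_rdvm(t, t_total, width, height):
    """RDVM: Vmax(x) = sqrt(W^2 + H^2) * (T - x) / x, for t >= 1."""
    return math.hypot(width, height) * (t_total - t) / t
```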

3 Simulation Results

In this section, we present the simulation results. Table 1 shows the common parameters for each simulation. Figure 4 shows the visualization results after the optimization.

Table 1. The common parameters for each simulation.

Parameters                   | Values
Distribution of mesh clients | Uniform distribution
Number of mesh clients       | 48
Number of mesh routers       | 32
Radius of a mesh router      | 2.0–3.5
Number of GA islands         | 16
Number of migrations         | 200
Evolution steps              | 9
Selection method             | Random method
Crossover method             | SPX
Mutation method              | Uniform mutation
Crossover rate               | 0.8
Mutation rate                | 0.2
Area size                    | 32.0 × 32.0

Fig. 4. Visualization results after the optimization.

Fig. 5. Transition of the standard deviations.

Figure 5 shows the transitions of the standard deviations. The standard deviation is related to load balancing: when the standard deviation increases, the numbers of mesh clients covered by the routers tend to differ; when it decreases, the numbers of mesh clients covered by the routers tend to become closer to each other. The value of r in Fig. 5 is the correlation coefficient. Figures 5(a) and 5(b) show that there is no correlation between the number of updates and the standard deviation. In Fig. 5(c), the standard deviation increases with the number of updates. On the


other hand, the standard deviations in Fig. 5(d) and 5(e) decrease as the number of updates increases. In particular, Fig. 5(e) shows better behavior than Fig. 5(d). Thus, we conclude that RDVM behaves better than the other methods.

4 Conclusions

In this work, we evaluated the performance of WMNs using a hybrid simulation system based on PSO and DGA (called WMN-PSODGA). We considered the uniform distribution of mesh clients and five router replacement methods for WMN-PSODGA. The simulation results show that the LDVM and RDVM router replacement methods perform better than the other methods. Comparing LDVM and RDVM, we see that RDVM behaves better. In future work, we will consider other distributions of mesh clients.

References

1. Akyildiz, I.F., Wang, X., Wang, W.: Wireless mesh networks: a survey. Comput. Netw. 47(4), 445–487 (2005)
2. Barolli, A., Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L., Takizawa, M.: Performance evaluation of WMNs by WMN-PSOSA simulation system considering constriction and linearly decreasing Vmax methods. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 111–121. Springer (2017)
3. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: Performance analysis of simulation system based on particle swarm optimization and distributed genetic algorithm for WMNs considering different distributions of mesh clients. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 32–45. Springer (2018)
4. Barolli, A., Sakamoto, S., Ozera, K., Barolli, L., Kulla, E., Takizawa, M.: Design and implementation of a hybrid intelligent system based on particle swarm optimization and distributed genetic algorithm. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 79–93. Springer (2018)
5. Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002)
6. Franklin, A.A., Murthy, C.S.R.: Node placement algorithm for deployment of two-tier wireless mesh networks. In: Proceedings of Global Telecommunications Conference, pp. 4823–4827 (2007)
7. Girgis, M.R., Mahmoud, T.M., Abdullatif, B.A., Rabie, A.M.: Solving the wireless mesh network design problem using genetic algorithm and simulated annealing optimization methods. Int. J. Comput. Appl. 96(11), 1–10 (2014)
8. Goto, K., Sasaki, Y., Hara, T., Nishio, S.: Data gathering using mobile agents for reducing traffic in dense mobile wireless sensor networks. Mob. Inf. Syst. 9(4), 295–314 (2013)
9. Inaba, T., Elmazi, D., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: A secure-aware call admission control scheme for wireless cellular networks using fuzzy logic and its performance evaluation. J. Mob. Multimedia 11(3&4), 213–222 (2015)
10. Inaba, T., Obukata, R., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of a QoS-aware fuzzy-based CAC for LAN access. Int. J. Space Based Situated Comput. 6(4), 228–238 (2016)
11. Inaba, T., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: A testbed for admission control in WLAN: a fuzzy approach and its performance evaluation. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 559–571. Springer (2016)
12. Lim, A., Rodrigues, B., Wang, F., Xu, Z.: k-center problems with minimum coverage. Theoret. Comput. Sci. 332(1–3), 1–17 (2005)
13. Maolin, T., et al.: Gateways placement in backbone wireless mesh networks. Int. J. Commun. Netw. Syst. Sci. 2(1), 44–50 (2009)
14. Matsuo, K., Sakamoto, S., Oda, T., Barolli, A., Ikeda, M., Barolli, L.: Performance analysis of WMNs by WMN-GA simulation system for two WMN architectures and different TCP congestion-avoidance algorithms and client distributions. Int. J. Commun. Netw. Distrib. Syst. 20(3), 335–351 (2018)
15. Muthaiah, S.N., Rosenberg, C.P.: Single gateway placement in wireless mesh networks. In: Proceedings of the 8th International IEEE Symposium on Computer Networks, pp. 4754–4759 (2008)
16. Naka, S., Genji, T., Yura, T., Fukuyama, Y.: A hybrid particle swarm optimization for distribution state estimation. IEEE Trans. Power Syst. 18(1), 60–68 (2003)
17. Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization. Swarm Intell. 1(1), 33–57 (2007)
18. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: A comparison study of simulated annealing and genetic algorithm for node placement problem in wireless mesh networks. J. Mob. Multimedia 9(1–2), 101–110 (2013)
19. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: A comparison study of hill climbing, simulated annealing and genetic algorithm for node placement problem in WMNs. J. High Speed Netw. 20(1), 55–66 (2014)
20. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: A simulation system for WMN based on SA: performance evaluation for different instances and starting temperature values. Int. J. Space Based Situated Comput. 4(3–4), 209–216 (2014)
21. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Performance evaluation considering iterations per phase and SA temperature in WMN-SA system. Mob. Inf. Syst. 10(3), 321–330 (2014)
22. Sakamoto, S., Lala, A., Oda, T., Kolici, V., Barolli, L., Xhafa, F.: Application of WMN-SA simulation system for node placement in wireless mesh networks: a case study for a realistic scenario. Int. J. Mob. Comput. Multimedia Commun. (IJMCMC) 6(2), 13–21 (2014)
23. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: An integrated simulation system considering WMN-PSO simulation system and network simulator 3. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 187–198. Springer (2016)
24. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation and evaluation of a simulation system based on particle swarm optimisation for node placement problem in wireless mesh networks. Int. J. Commun. Netw. Distrib. Syst. 17(1), 1–13 (2016)
25. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation of a new replacement method in WMN-PSO simulation system and its performance evaluation. In: The 30th IEEE International Conference on Advanced Information Networking and Applications (AINA-2016), pp. 206–211 (2016)
26. Sakamoto, S., Obukata, R., Oda, T., Barolli, L., Ikeda, M., Barolli, A.: Performance analysis of two wireless mesh network architectures by WMN-SA and WMN-TS simulation systems. J. High Speed Netw. 23(4), 311–322 (2017)
27. Sakamoto, S., Ozera, K., Barolli, A., Ikeda, M., Barolli, L., Takizawa, M.: Implementation of an intelligent hybrid simulation systems for WMNs based on particle swarm optimization and simulated annealing: performance evaluation for different replacement methods. Soft. Comput. 23(9), 3029–3035 (2017)
28. Sakamoto, S., Ozera, K., Barolli, A., Ikeda, M., Barolli, L., Takizawa, M.: Performance evaluation of WMNs by WMN-PSOSA simulation system considering random inertia weight method and linearly decreasing Vmax method. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 114–124. Springer (2017)
29. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Implementation of intelligent hybrid systems for node placement problem in WMNs considering particle swarm optimization, hill climbing and simulated annealing. Mob. Netw. Appl. 23(1), 27–33 (2017)
30. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Performance evaluation of WMNs by WMN-PSOSA simulation system considering constriction and linearly decreasing inertia weight methods. In: International Conference on Network-Based Information Systems, pp. 3–13. Springer (2017)
31. Sakamoto, S., Ozera, K., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of intelligent hybrid systems for node placement in wireless mesh networks: a comparison study of WMN-PSOHC and WMN-PSOSA. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 16–26. Springer (2017)
32. Sakamoto, S., Ozera, K., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of WMN-PSOHC and WMN-PSO simulation systems for node placement in wireless mesh networks: a comparison study. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 64–74. Springer (2017)
33. Sakamoto, S., Ozera, K., Barolli, A., Barolli, L., Kolici, V., Takizawa, M.: Performance evaluation of WMN-PSOSA considering four different replacement methods. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 51–64. Springer (2018)
34. Schutte, J.F., Groenwold, A.A.: A study of global optimization using particle swarms. J. Global Optim. 31(1), 93–108 (2005)
35. Shi, Y.: Particle swarm optimization. IEEE Connections 2(1), 8–13 (2004)
36. Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. Evol. Program. VII, 591–600 (1998)
37. Vanhatupa, T., Hannikainen, M., Hamalainen, T.: Genetic algorithm to optimize node placement and configuration for WLAN planning. In: Proceedings of the 4th IEEE International Symposium on Wireless Communication Systems, pp. 612–616 (2007)
38. Wang, J., Xie, B., Cai, K., Agrawal, D.P.: Efficient mesh router placement in wireless mesh networks. In: Proceedings of IEEE International Conference on Mobile Adhoc and Sensor Systems (MASS-2007), pp. 1–9 (2007)

Forecasting Electricity Consumption Using Weather Data in an Edge-Fog-Cloud Data Analytics Architecture

Juan C. Olivares-Rojas1(B), Enrique Reyes-Archundia1, José A. Gutiérrez-Gnecchi1, Ismael Molina-Moreno1, Arturo Méndez-Patiño1, and Jaime Cerda-Jacobo2

1 División de Estudios de Posgrado e Investigación, Tecnológico Nacional de México/Instituto Tecnológico de Morelia, Morelia, Michoacán, Mexico
{juan.or,enrique.ra,jose.gg3,ismael.mm,arturo.mp}@morelia.tecnm.mx
2 Facultad de Ingeniería Eléctrica, Universidad Michoacana de San Nicolás de Hidalgo, Morelia, Michoacán, Mexico
[email protected]

Abstract. The forecasting of electricity consumption is a well-studied research problem; however, electricity consumption follows a complex model that depends on many factors, so forecasts are not always accurate. The accuracy of this forecasting has an impact, for example, on the utilities in the bulk generation of electricity and on the end-users through economical prices. This work shows the implementation of a forecasting model that considers weather data across the smart metering system infrastructure, using an edge-fog-cloud architecture for data analytics. The results show that using weather data across an edge-fog-cloud architecture is an excellent alternative for forecasting electricity consumption.

Keywords: Data analytics · Edge-Fog-Cloud · Forecasting electricity consumption · Smart meters

1 Introduction

The fourth industrial revolution has changed the way we live; it has been transforming all human activities, achieving the digital transformation (DX) [1]. One of these activities is the power grid, which is now called the Smart Grid (SG) due to its capabilities for data processing and communication between the utilities and end-users [2]. The most visible part of the SG is the Smart Metering System (SMS), where the Smart Meter (SM) is the cornerstone of the system. The SM is an embedded device with telecommunication, storage, and processing modules, which is the reason why some authors consider it an Internet of Things (IoT) device [3]. The most commonly implemented SMS is the Advanced Metering Infrastructure (AMI), which gives end-users more visibility into their consumption and production of electrical energy and gives the utilities the automation of


diverse operations such as cuts, reconnections, and data collection, automatically and without human interaction [2].

One of the most important challenges in the SG is the forecasting of electricity consumption, because a correct estimation of future consumption makes it possible to control the bulk generation, decrease the energy losses (improving the final costs to the end-user), and pollute the environment less [4]. The issue with electricity forecasting is a low estimation rate, because electricity consumption and production follow a complex model influenced by many variables, such as weather and the human factor, among others.

Recently, the storage and processing capabilities of SMs have noticeably increased, allowing new applications such as information processing and analysis using artificial intelligence and machine learning techniques [5]. These new computing and storage capabilities can be used to forecast electrical consumption in a better way.

On the other hand, new distributed computation architectures are appearing. One of the most widespread is the Edge-Fog-Cloud Architecture (EFCA) [6]. Traditionally, computation is performed at the cloud level in huge datacenters. Computation in the IoT devices is called edge computing, while computation at an intermediate level is called fog computing. This new computing paradigm can be adapted to the SG and SMS [7]. This work presents an electricity forecasting model that considers weather data acquired across the SMS infrastructure, implemented through an edge-fog-cloud data analytics architecture.

This paper is structured as follows. In Sect. 2, a literature review of related works is presented. Section 3 shows the implementation of the proposed architecture. Section 4 shows the results and their discussion. Finally, Sect. 5 presents the conclusions and remarks.

2 Related Work

Electricity forecasting is a well-studied research problem, and diverse works focus on solving its issues. In particular, there are some works and approaches in the SMS field [4]. Below, a literature review discussion is presented.

Most of the works focus on a particular geographic area, such as urban [8, 25], tropical [14], or cold [15] areas; on countries such as the U.K. [10], the Netherlands [11], India [14], Russia [15], Thailand [16, 17], Qatar [19], Portugal [20], Cyprus [21], the U.S.A. [22], China [23, 25], and South Korea [29]; or on specific places such as Los Angeles [22] and Nanjing [23].

Another way to classify the related work is by the mathematical techniques used for forecasting, for example, Artificial Neural Networks (ANN) [8, 9, 27], Bagged Regression Trees (BRT) [8], Support Vector Machines (SVM) [9, 28], Multiple Regression Models (linear and non-linear) [10, 14, 17, 19, 23, 30], Genetic Algorithms (GA) [12], fuzzy models [18, 29], simulation [20], decision trees [24], and Particle Swarm Optimization [28, 29]. Other related works focus on specific devices, like Energy Management Systems (EMS) [8], weather stations [12], and microgrids [13].


The forecasting horizon is another crucial way to classify the related works: day-ahead (daily) [8, 17–19, 26], hourly [11, 28], monthly [10, 14, 29], short-term [15, 18, 27], long-term [16, 17], yearly (seasonal) [17, 26], and medium-term [20, 30]. Most of the related works used their own data, but some works [9, 11, 27, 30] used open data. The data processing is done in the cloud or on traditional devices such as laptops [11] or data servers. A few works, like [31, 32], focus on EFCA using machine learning for the SG, but their number will increase in the next years. This work presents an electric consumption forecasting system using an EFCA for SMS that considers weather variables and implements Linear Regression Models (LRM).

3 An Edge-Fog-Cloud Architecture for Electricity Forecasting

The SMS, particularly the AMI architecture, is described as follows. The devices which consume electrical energy, such as Appliances (A) and Smart Appliances (SA), are connected to a power grid and a telecommunication network called the Home Area Network (HAN), and the SM measures their electrical consumption. Additionally, the end-users not only consume electrical energy but can also produce their own electricity using Distributed Energy Resources (DER) such as solar or wind. The HAN also integrates the DER devices and the SM in a special local network. The data communication in the HAN is through a Wireless Sensor Network (WSN) or over the same electrical cable, which is named Power Line Communication (PLC). In addition, the electricity-consuming devices can be different in buildings and in industries; for this reason, the local power and communication networks can be a Building Area Network (BAN) or an Industrial Area Network (IAN).

The SM's consumption readings are sent to a Data Concentrator (DC). The DC concentrates the data of diverse SMs in a geographical area or by the density of nodes (SMs). Also, in some architectures, an SM can communicate with other SMs using a mesh network. The communication network in this part is called the Neighbor Area Network (NAN), and usually the communication is made through Radio Frequency (RF) or other wireless networks. The NAN is equivalent in extension to a campus area network or a metropolitan area network, depending on the DC separations.

The SM's readings are sent at a specific interval; the most common period is 15 min. The DC readings are stored in embedded databases. The first generations of SMs have a small embedded database inside, but the next generation of SMs is increasing its processing and storage capacities. The original reason for using DCs was related to costs: it is easier and cheaper to send single data packets from a DC than for each SM to send its data packet individually. A DC can communicate with other DCs. The DCs are spread along the distribution and transmission electrical lines. The communication between DCs is by fiber optics due to the high voltages and rugged protections. This network is called the Field Area Network (FAN).

Finally, the last DCs send the information to the utility's data center. The total data of the SMS is stored in the Metering Database Management System (MDMS). The MDMS is connected with other information systems of the utility, such as billing and the Outage Management System (OMS), among others. The information in the MDMS is consulted by the prosumers (end-users who produce and consume electrical energy) through a website or mobile application.


Figure 1 shows the adaptation of the SMS AMI architecture to an EFCA to forecast electricity consumption. The Edge layer is composed of the SMs and the SAs, As, and DERs in the HAN, IAN, or BAN. The Fog layer is formed by the DCs. Note that the SMs and DCs have an integrated Weather Module (WM) to take weather readings. The Cloud layer is composed of the MDMS servers. Additionally, a Weather Station (WS) is used to produce weather readings. The WS is located at http://clima.itmorelia.edu.mx/.

Fig. 1. Proposed Edge-Fog-Cloud architecture for electricity consumption forecasting.

The testing architecture is composed of seven SMs, two DCs, and one server. The SMs use a Raspberry Pi model 3B+ with a SmartPi energy board. The DCs are composed of a LattePanda Alpha board with current and voltage sensors. The server is an HPE ProLiant DL380 Gen10 with an Intel Xeon Silver 4114 2.20 GHz processor, 32 GB RAM, and 8 TB of disk space. Both the SMs and the DCs use embedded Linux distributions, while the server runs the Linux Debian Buster distribution. Communication between the SMs and the DCs uses WiFi connections; communication between the DCs and the server, as well as between the server and the WS, uses gigabit Ethernet. DC number 1 (DC1) has four SMs connected, while DC number 2 (DC2) has three SMs connected. The embedded database in the SM is SQLite, while the database for the DCs and the MDMS is PostgreSQL. The SM is a three-phase meter that records 27 variables about the electrical signal. The variable groups are current (A), voltage (V), frequency (Hz), active power (W), reactive power (W), power factor (%), electricity consumption (kWh), and electricity production (kWh). The WM can measure the air temperature (°C), humidity (%), rain (mm), solar radiation (W/m²), and UV index (0–16).


4 Results and Discussion

The first step was to reduce the data dimensionality. The SM data relevant to electricity consumption are the consumption and production variables. The other variables, such as current, frequency, voltage, and power, are relevant in other contexts like power quality, but not for electricity demand. The consumption and production variables were consolidated into a single consumption variable, where a positive value indicates a consumption that the prosumer must pay to the utility, while a negative value indicates that the production is higher than the consumption, so the utility must pay this surplus to the prosumer. Table 1 shows the general structure of the dataset used for the forecasting; the first row shows the minimum values and the second row shows the maximum values of each variable.

Table 1. The general structure of the dataset to forecast (min and max values).

Timestamp           | Consumption | Temperature | Humidity | Rain | Solar radiation | UV index
2018-01-01 00:00:00 | 0           | 3           | 5        | 0    | 0               | 0
2019-12-31 23:45:00 | 1.2432      | 34.8        | 100      | 16   | 14.21           | 11.9

The total data has a sampling rate of 15 min and covers all of 2018 and 2019. The embedded databases in the DCs also include a variable named SM_ID to identify the SM; the MDMS includes SM_ID and DC_ID to identify the DC data.

The second step was to analyze the correlation between the weather variables and consumption. The Pearson correlation coefficient method was used. Table 2 shows the correlation between consumption and the weather variables. The most significant variables are temperature and solar radiation, because their absolute values are close to one.

Table 2. Correlation between consumption and weather variables.

Variable        | Correlation
Temperature     | 0.87
Humidity        | 0.24
Rain            | 0.19
Solar radiation | −0.74
UV index        | −0.07
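For reference, the Pearson coefficient used to build Table 2 can be computed with a few lines of Python; this is a generic sketch, not code from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```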

The machine learning model used was an LRM, which is defined in Eq. (1):

y = β0 + β1·x1 + β2·x2 + … + βk·xk + ε   (1)

where y = electricity consumption, β0, …, βk = regression coefficients, x1 = temperature, x2 = solar radiation, and ε = error.
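A minimal sketch of fitting such a model by ordinary least squares with NumPy follows, assuming the inputs are 1-D arrays. The function name and the use of np.linalg.lstsq are our choices; the paper does not state how the coefficients were estimated.

```python
import numpy as np

def fit_lrm(temperature, solar_radiation, consumption):
    """Fit y = b0 + b1*x1 + b2*x2 and return (coefficients, R^2)."""
    X = np.column_stack([np.ones(len(temperature)),
                         temperature, solar_radiation])
    beta, *_ = np.linalg.lstsq(X, consumption, rcond=None)
    pred = X @ beta
    ss_res = np.sum((consumption - pred) ** 2)
    ss_tot = np.sum((consumption - np.mean(consumption)) ** 2)
    return beta, 1.0 - ss_res / ss_tot
```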


The LRM was selected because of its simplicity and easy implementation in the smart meter hardware architecture (edge computing), as well as its easy implementation in the fog and cloud layers. Figure 2 shows how the EFCA works in data analytics. Each layer, the Edge Layer (EL), Fog Layer (FL), and Cloud Layer (CL), individually executes a Local Forecasting (LF) using an LRM with its own data, and each layer obtains the Parameters (P), in this case the regression coefficients β0, β1, and β2. These Ps are sent to the next layers (Edge to Fog, and Fog to Cloud), and the upper layers calculate a new LF and send New Parameters (NP) to the lower layers (Cloud to Fog, Cloud to Edge, Fog to Edge). The lower layers calculate the forecast using the new parameters and check whether the NP are better. The models are validated using the coefficient of determination R². The best model is used to forecast new values of electricity consumption in each layer/device.
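The parameter exchange can be summarized in a few lines; the sketch below is our own schematic reading of this protocol, with fit and r2 passed in as callables rather than tied to any concrete library.

```python
def choose_model(local_x, local_y, received_params, fit, r2):
    """A layer fits its own LRM, then keeps whichever parameter set
    (its own P or a received NP) scores the highest R^2 locally."""
    candidates = [fit(local_x, local_y)] + list(received_params)
    return max(candidates, key=lambda p: r2(p, local_x, local_y))
```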

Fig. 2. Machine learning in EFCA

The dataset for training and testing the model includes the complete records of the two years analyzed. The dataset was divided into two random groups: 70% for the training dataset and the remaining 30% for the testing dataset. Table 3 shows an example of improving the forecasting using the proposed EFCA on SM number 1 (SM1) in the first period. The period of data analysis was bimonthly during the two-year analysis. The forecasting model was improved with the NP of the FL, because this model has a better R² coefficient. We chose the R² coefficient instead of other evaluation metrics, such as RMSE, MAE, or MAPE, because of its simplicity for comparing models rather than measuring precision. Table 4 shows the final results of improving the forecasting accuracy using the proposed EFCA. The SM forecasts were executed six times in 2018 and six times in 2019 (12 per analysis), and each execution ran twice (once for the FL and once for the CL). DC1 executed forecasts in each of the 12 periods four times (one for each SM) plus an additional one for the NP of the CL (12 × 6 times). These values are shown in the Forecast Models Executed column. On the other hand, the First Better Case Forecast column expresses the number of occasions on which the original LF was the better model.

Table 3. SM1 forecasting.

P                                   | R²     | NP of FL                            | R²     | NP of CL                             | R²
β0 = 177.3, β1 = 27.51, β2 = −11.36 | 62.13% | β0 = 166.53, β1 = 29.17, β2 = −9.74 | 63.43% | β0 = 175.66, β1 = 27.73, β2 = −11.68 | 61.97%

Table 4. Final forecast result in each layer/device.

Device | Forecast models executed | First better case forecast | % of improving
SM1    | 36                       | 3                          | 91.67%
SM2    | 36                       | 4                          | 88.89%
SM3    | 36                       | 3                          | 91.67%
SM4    | 36                       | 2                          | 94.44%
SM5    | 36                       | 3                          | 91.67%
SM6    | 36                       | 4                          | 88.89%
SM7    | 36                       | 3                          | 91.67%
DC1    | 72                       | 6                          | 91.67%
DC2    | 60                       | 5                          | 91.67%
MDMS   | 36                       | 4                          | 88.89%

We can observe that the forecasting models are improved in 91.11% of the cases (this value is the average of the percentages of all devices), but the number of executed forecasts is high, particularly at the intermediate layers/devices such as the FL and the DCs.

5 Conclusions

The forecasting of electricity consumption is a vital necessity for utilities and prosumers, and for a long time different approaches have been arising that try to improve the accuracy of the predictions. Forecasting electricity consumption is complex because it is produced by different variables. This paper proposes an EFCA to forecast electricity consumption using LRM. The results show that the forecasting accuracy is improved using the proposed EFCA, although the number of executed models increases and more time is consumed. Considering a bimonthly period, and that the forecasting process can be executed offline, the proposed architecture is feasible for the electricity forecasting problem.

The forecasting can be improved using different approaches, such as using hardware accelerators at the edge, selecting a better model (for example, deep learning techniques),


and choosing better interactions between the edge, fog, and cloud tiers. All these approaches are part of our ongoing work.

Acknowledgment. This work is partially supported by Tecnológico Nacional de México under grants 7948.20-P and 8000.20-P.

References

1. Vial, G.: Understanding digital transformation: a review and a research agenda. J. Strateg. Inf. Syst. 28(2), 118–144 (2019). https://doi.org/10.1016/j.jsis.2019.01.003
2. Dileep, G.: A survey on smart grid technologies and applications. Renew. Energy 146, 2589–2625 (2020). https://doi.org/10.1016/j.renene.2019.08.092
3. Borovina, D., et al.: Error performance analysis and modeling of narrow-band PLC technology enabling smart metering systems. Int. J. Electr. Power Energy Syst. 116 (2019). https://doi.org/10.1016/j.ijepes.2019.105536
4. Balaji, J., et al.: Machine learning approaches to electricity consumption forecasting in automated metering infrastructure (AMI) systems: an empirical study. In: Silhavy, R., Senkerik, R., Kominkova Oplatkova, Z., Prokopova, Z., Silhavy, P. (eds.) Cybernetics and Mathematics Applications in Intelligent Systems. CSOC 2017. Advances in Intelligent Systems and Computing, vol. 574. Springer (2017). https://doi.org/10.1007/978-3-319-57264-2_26
5. Rokan, B., Kotb, Y.: Towards a real IoT-based smart meter system. In: Luhach, A., Kosa, J., Poonia, R., Gao, X.Z., Singh, D. (eds.) First International Conference on Sustainable Technologies for Computational Intelligence. Advances in Intelligent Systems and Computing, vol. 1045. Springer (2020). https://doi.org/10.1007/978-981-15-0029-9_11
6. Adam, A., et al.: The fog cloud of things: a survey on concepts, architecture, standards, tools, and applications. Internet Things 9 (2020). https://doi.org/10.1016/j.iot.2020.100177
7. Forcan, M., Maksimović, M.: Cloud-fog-based approach for smart grid monitoring. Simul. Model. Pract. Theory 101 (2020). https://doi.org/10.1016/j.simpat.2019.101988
8. Dehalwar, V.: Electricity load forecasting for urban area using weather forecast information. In: 2016 IEEE International Conference on Power and Renewable Energy (ICPRE), Shanghai, pp. 355–359 (2016). https://doi.org/10.1109/ICPRE.2016.7871231
9. Zeng, Q., et al.: An optimum regression approach for analyzing weather influence on the energy consumption. In: 2016 International Conference on Probabilistic Methods Applied to Power Systems (PMAPS), Beijing, pp. 1–6 (2016). https://doi.org/10.1109/PMAPS.2016.7764178
10. Hor, C., et al.: Analyzing the impact of weather variables on monthly electricity demand. IEEE Trans. Power Syst. 20(4), 2078–2085 (2005). https://doi.org/10.1109/TPWRS.2005.857397
11. Prabakar, A., et al.: Applying machine learning to study the relationship between electricity consumption and weather variables using open data. In: 2018 IEEE PES Innovative Smart Grid Technologies Conference Europe (ISGT-Europe), Sarajevo, pp. 1–6 (2018). https://doi.org/10.1109/ISGTEurope.2018.8571430
12. Moreno-Carbonell, S., et al.: Rethinking weather station selection for electric load forecasting using genetic algorithms. Int. J. Forecast. 36(2), 695–712 (2020). https://doi.org/10.1016/j.ijforecast.2019.08.008
13. Agüera-Pérez, A., et al.: Weather forecasts for microgrid energy management: review, discussion and recommendations. Appl. Energy 228, 265–278 (2018). https://doi.org/10.1016/j.apenergy.2018.06.087
14. Jose, D., et al.: Weather dependency of electricity demand: a case study in warm humid tropical climate. In: 2016 3rd International Conference on Electrical Energy Systems (ICEES), Chennai, pp. 102–105 (2016). https://doi.org/10.1109/ICEES.2016.7510624
15. Rusina, A., et al.: Short-term electricity consumption forecast in Siberia IPS using climate aspects. In: 2018 19th International Conference of Young Specialists on Micro/Nanotechnologies and Electron Devices (EDM), Erlagol, pp. 6403–6407 (2018). https://doi.org/10.1109/EDM.2018.8435002
16. Parkpoom, S., et al.: Climate change impacts on electricity demand. In: 39th International Universities Power Engineering Conference (UPEC 2004), Bristol, UK, vol. 2, pp. 1342–1346 (2004). https://ieeexplore.ieee.org/abstract/document/1492245
17. Parkpoom, S., Harrison, G.: Analyzing the impact of climate change on future electricity demand in Thailand. IEEE Trans. Power Syst. 23(3), 1441–1448 (2008). https://doi.org/10.1109/TPWRS.2008.922254
18. Shakouri, H., Nadimi, R., et al.: Investigation on the short-term variations of electricity demand due to the climate changes via a hybrid TSK-FR model. In: 2007 IEEE International Conference on Industrial Engineering and Engineering Management, Singapore, pp. 807–811 (2007). https://doi.org/10.1109/IEEM.2007.4419302
19. Gastli, A., et al.: Correlation between climate data and maximum electricity demand in Qatar. In: 2013 7th IEEE GCC Conference and Exhibition (GCC), Doha, pp. 565–570 (2013). https://doi.org/10.1109/IEEEGCC.2013.6705841
20. Fidalgo, J., et al.: Impact of climate changes on the Portuguese energy generation mix. In: 2019 16th International Conference on the European Energy Market (EEM), Ljubljana, Slovenia, pp. 1–6 (2019). https://doi.org/10.1109/EEM.2019.8916539
21. Zachariadis, T.: Forecast of electricity consumption in Cyprus up to the year 2030: the potential impact of climate change. Energy Policy 38(2), 744–750 (2010). https://doi.org/10.1016/j.enpol.2009.10.019
22. Burillo, D., et al.: Forecasting peak electricity demand for Los Angeles considering higher air temperatures due to climate change. Appl. Energy 236, 1–9 (2019). https://doi.org/10.1016/j.apenergy.2018.11.039
23. Li, G., et al.: Relations of total electricity consumption to climate change in Nanjing. Energy Procedia 152, 756–761 (2018). https://doi.org/10.1016/j.egypro.2018.09.241
24. Ahmad, T., et al.: Smart energy forecasting strategy with four machine learning models for climate-sensitive and non-climate sensitive conditions. Energy 198 (2020). https://doi.org/10.1016/j.energy.2020.117283
25. Zhang, C., Liao, H., Mi, Z.: Climate impacts: temperature and electricity consumption. Nat. Hazards 99, 1259–1275 (2019). https://doi.org/10.1007/s11069-019-03653-w
26. Staffell, I., Pfenninger, S.: The increasing impact of weather on electricity supply and demand. Energy 145, 65–78 (2018). https://doi.org/10.1016/j.energy.2017.12.051
27. Aslam, Z., et al.: An enhanced convolutional neural network model based on weather parameters for short-term electricity supply and demand. In: Barolli, L., Amato, F., Moscato, F., Enokido, T., Takizawa, M. (eds.) Advanced Information Networking and Applications. AINA 2020. Advances in Intelligent Systems and Computing, vol. 1151. Springer (2020). https://doi.org/10.1007/978-3-030-44041-1_3
28. Nadtoka, I., Al-Zihery, A.: Mathematical modelling and short-term forecasting of electricity consumption of the power system, with due account of air temperature and natural illumination, based on support vector machine and particle swarm. Procedia Eng. 129, 657–663 (2015). https://doi.org/10.1016/j.proeng.2015.12.087
29. Son, H., Kim, C.: Short-term forecasting of electricity demand for the residential sector using weather and social variables. Resour. Conserv. Recycl. 123, 200–207 (2017). https://doi.org/10.1016/j.resconrec.2016.01.016
30. De Felice, M., et al.: Seasonal climate forecasts for medium-term electricity demand forecasting. Appl. Energy 137, 435–444 (2015). https://doi.org/10.1016/j.apenergy.2014.10.030
31. Fei, X., et al.: CPS data streams analytics based on machine learning for Cloud and Fog Computing: a survey. Future Gener. Comput. Syst. 90, 435–450 (2019). https://doi.org/10.1016/j.future.2018.06.042
32. Spiliotis, E., et al.: Cross-temporal aggregation: improving the forecast accuracy of hierarchical electricity consumption. Appl. Energy 261 (2020). https://doi.org/10.1016/j.apenergy.2019.114339

Vision-Referential Speech Enhancement with Binary Mask and Spectral Subtraction

Mitsuharu Matsumoto(B)

University of Electro-Communications, 1-5-1 Chofugaoka, Chofu-shi, Tokyo, Japan
[email protected]

Abstract. This paper proposes vision-referential speech enhancement with a binary mask and spectral subtraction, as a sensor fusion of visual and audio information. Nowadays, we can find many smartphones and tablet devices with a camera and a microphone in the world. We improve the sound quality of the audio signal by using mask information obtained from the visual information. Although the frame rate of the camera in such devices is limited, it is useful for enhancing the speech signal if both signals are used adequately. We therefore aim to design a vision-referential speech enhancement method. Throughout the experiments, it was confirmed that the speech could be enhanced even when there was a high level of real noise in the environment.

1 Introduction

Speech enhancement, which emphasizes the target speech in mixed signals, is one of the important issues in acoustic processing. A multi-channel approach using a microphone array is a typical approach to enhance the speech signal: speech enhancement is realized by using the differences in phase and amplitude of the sound entering each microphone [1–3]. Although it is an attractive approach, it is necessary to set up multiple microphones and to estimate their positions. When we install a speech enhancement system in machines such as robots and vehicles, the background noise degrades its performance. In order to solve this problem, we focus on speech enhancement technology based on sensor fusion that combines the image signal and the audio signal. P. Duchnowski et al. [4] proposed a method for extracting speech from the speaker's lips using image information and tracking the speaker's face based on the extracted lips to assist speech enhancement. H. Kulkarni et al. [5] proposed lip reading that can automatically enhance speech in correspondence with a language dataset by combining deep learning and lip reading. In this way, sensor fusion is a technology that has been attracting attention for a long time. However, speech enhancement technology that simultaneously transmits and receives an image signal and an audio signal has not been studied so far. Recently, we proposed a speech enhancement technology of cooperative transmission and reception [6]. In this method, the speaker transmits not only the audio signal through the loudspeaker but also the audio information as an image signal through a display. The listener emphasizes the target acoustical signal from the image and audio


signals received through the standard camera and microphone mounted on smartphones and tablets. This is a framework for transmission/reception-cooperative sensor fusion. The proposed framework makes it possible to implement speech enhancement that is not affected by the type of external noise, which was difficult with conventional speech enhancement techniques. Since the mask information can be obtained as visual information, non-overlapping noise can be removed by the binary mask even for mixed speech that overlaps on the frequency axis. However, noise that overlaps on the time-frequency plane remains after the binary mask. In this paper, we focus on reducing this overlapping noise by using spectral subtraction, and we give some experimental results to show the effectiveness of the proposed approach.
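To make the two-stage processing concrete, the following Python sketch applies a received binary mask to the mixture spectrogram and then performs magnitude spectral subtraction with half-wave rectification. The array shapes and the assumption that a noise magnitude estimate is available are ours; this is only an illustrative sketch of the pipeline described above, not the paper's implementation.

```python
import numpy as np

def enhance(stft_mix, binary_mask, noise_mag):
    """Binary mask removes non-overlapping noise; spectral subtraction
    then reduces noise that overlaps the speech in time-frequency.
    stft_mix: complex STFT of the mixture (frames x bins);
    binary_mask: 0/1 array of the same shape;
    noise_mag: noise magnitude estimate broadcastable to that shape."""
    masked = stft_mix * binary_mask
    mag = np.abs(masked) - noise_mag            # spectral subtraction
    mag = np.maximum(mag, 0.0)                  # half-wave rectification
    return mag * np.exp(1j * np.angle(masked))  # keep the mixture phase
```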

2 Problem Formulation

In this section, we describe the supposed situation and formulate the problem. Figure 1 shows an example of how to use the proposed method. It can be used to send voice to an unspecified number of people, for example an election speech in a public space. The speaker not only generates the voice through the loudspeaker, but also transmits the mask information of the voice to the listener through a display. When listeners would like to hear the speaker's voice, they point the camera of the smartphone or tablet at the display and capture the audio signal and image information. The listeners can then obtain the speaker's voice with high quality, even in a noisy situation, by using the proposed method. Let us consider a target signal s(t) and the i-th noise ni(t). The mixed speech x1(t) acquired from the microphone array is described as follows.

Fig. 1. Assumed usage scenario of the proposed method.

x1(t) = s(t) + Σ_{i=1}^{n} ni(t)   (1)

422

M. Matsumoto

The mask information received as image information is then defined. Even if audio information and image information are sent at the same time, there is a time lag in the signal information received at the receiving side. Therefore, the mask signal X 2 (τ, ω) received by the receiving side can be expressed as follows: X2 (τ, ω) = M (τ − δ, ω)

(2)

where δ is the time delay. Here, when  is defined as the maximum delay between sensors, the following equation is satisfied. |δ| ≤ 

(3)

Considering that the audio signal received by the microphone is restored by the image signal received by the camera, its output is expected to be maximum when there is no delay. Hence, the delay can be estimated as follows:  |X2 (τ + δ, ω)X1 (τ, ω)| δ˜ = arg max (4) δ