Advances in Information and Communication: Proceedings of the 2019 Future of Information and Communication Conference (FICC), Volume 1 [1st ed.] 978-3-030-12387-1, 978-3-030-12388-8

This book presents a remarkable collection of chapters that cover a wide range of topics in the areas of information and

English Pages XIII, 1061 [1074] Year 2020

Table of contents:
Front Matter ....Pages i-xiii
Visible Light Communication Security Vulnerabilities in Multiuser Network: Power Distribution and Signal to Noise Ratio Analysis (Rana Shaaban, Prakash Ranganathan, Saleh Faruque)....Pages 1-13
The Applications of Model Driven Architecture (MDA) in Wireless Sensor Networks (WSN): Techniques and Tools (Muhammad Waseem Anwar, Farooque Azam, Muazzam A. Khan, Wasi Haider Butt)....Pages 14-27
Real Time Multiuser-MIMO Beamforming/Steering Using NI-2922 Universal Software Radio Peripheral (Aliyu Buba Abdullahi, Rafael F. S. Caldeirinha, Akram Hammoudeh, Leshan Uggalla, Jon Eastment)....Pages 28-50
5G Waveform Competition: Performance Comparison and Analysis of OFDM and FBMC in Slow Fading and Fast Fading Channels (Muhammad Imran, Aamina Hassan, Adnan Ahmed Khan)....Pages 51-67
Mitigating the Nonlinear Optical Fiber Using Dithering and APD Coherent Detection on Radio Over Fiber (Fakhriy Hario, Sholeh H. Pramono, Eka Maulana, Sapriesty Nainy Sari)....Pages 68-74
NavAssist-Intelligent Landmark Based Navigation System (Ratnakumar Madhushan, Cassim Farook)....Pages 75-86
An Enhanced RSSI-Based Detection Scheme for Sybil Attack in Wireless Sensor Networks (Yinghong Liu, Yuanming Wu)....Pages 87-102
Optimization of Polar Codes in Virtual MIMO Systems (Idy Diop, Papis Ndiaye, Papa Alioune Fall, Boly Seck, Moussa Diallo, Sidi Mohamed Farssi)....Pages 103-116
MAC Protocols for Wireless Mesh Networks with Multi-beam Antennas: A Survey (Gang Wang, Yanyuan Qin)....Pages 117-142
Accurate Attitude Estimation for Drones in 5G Drone Small Cells (Vahid Vahidi)....Pages 143-153
Existence of an Optimal Perpetual Gossiping Scheme for Arbitrary Networks (Ivan Avramovic, Dana S. Richards)....Pages 154-163
How to Achieve Traffic Safety with LTE and Edge Computing (Niklas Hehenkamp, Christian Facchi, Stefan Neumeier)....Pages 164-176
Design of Microstrip Patch Antenna with Inset Feed in CST for EBS Channel (Mian Mujtaba Ali, Muhammad H. D. Khan, Omer Farooq)....Pages 177-184
Dynamic Spectrum Access of Virtualized-Operated Networks over MIMO-OFDMA Dedicated to 5G Cognitive WSSNs (Imen Badri, Mahmoud Abdellaoui)....Pages 185-202
Expanding Coverage of an Intelligent Transit Bus Monitoring System via ZigBee Radio Network (Ahmad Salman, Samy El-Tawab, Zachary Yorio)....Pages 203-216
CityAction a Smart-City Platform Architecture (Pedro Martins, Daniel Albuquerque, Cristina Wanzeller, Filipe Caldeira, Paulo Tomé, Filipe Sá)....Pages 217-236
Simplified Neural Networks with Smart Detection for Road Traffic Sign Recognition (Wei-Jong Yang, Chia-Chun Luo, Pau-Choo Chung, Jar-Ferr Yang)....Pages 237-249
Weighted Histogram of Oriented Uniform Gradients for Moving Object Detection (Wei-Jong Yang, Yu-Xiang Su, Pau-Choo Chung, Jar-Ferr Yang)....Pages 250-260
Enabling Pedestrian Safety Using Computer Vision Techniques: A Case Study of the 2018 Uber Inc. Self-driving Car Crash (Puneet Kohli, Anjali Chadha)....Pages 261-279
Towards Improved Drink Volume Estimation Using Filter-Based Feature Selection (Henry Griffith, Subir Biswas)....Pages 280-290
Coordinated Scheduling of Fuel Cell-Electric Vehicles and Solar Power Generation Considering Vehicle to Grid Bidirectional Energy Transfer Mode (Benslama Sami, Nasri Sihem, Zafar Bassam, Cherif Adnen)....Pages 291-303
Optimization of Bus Service with a Spatio-Temporal Transport Pulsation Model (Shuhan Lou, Ling Peng, Yunting Song, Xuantong Chen, Chengzeng You)....Pages 304-318
The Artificial Intelligence Application in the Management of Contemporary Organization: Theoretical Assumptions, Current Practices and Research Review (Dorota Jelonek, Agata Mesjasz-Lech, Cezary Stępniak, Tomasz Turek, Leszek Ziora)....Pages 319-327
Identification of Remote IoT Users Using Sensor Data Analytics (Samera Batool, Nazar Abbas Saqib, Muazzam Khan Khattack, Ali Hassan)....Pages 328-337
From Smart Concept to User Experience Practice a Synthetic Model of Reviewed and Organized Issues to Conceive Qualified Interactions (Cristina Caramelo Gomes)....Pages 338-357
Democratization of Intelligent Sensor Network for Low-Connected Remote Healthcare Facilities—A Framework to Improve Population Health & Epidemiological Studies (Santosh Kedari, Jaya Shankar Vuppalapati, Anitha Ilapakurti, Chandrasekar Vuppalapati, Sneha Iyer, Sharat Kedari)....Pages 358-376
Latency-Aware Distributed Resource Provisioning for Deploying IoT Applications at the Edge of the Network (Cosmin Avasalcai, Schahram Dustdar)....Pages 377-391
IntelliEppi: Intelligent Reaction Monitoring and Holistic Data Management System for the Molecular Biology Lab (Arthur Neuberger, Zeeshan Ahmed, Thomas Dandekar)....Pages 392-407
Smart and Pervasive Health Systems—Challenges, Trends, and Future Directions (Ramesh Rajagopalan)....Pages 408-419
CityBook: A Mobile Crowdsourcing and Crowdsensing Platform (Gilberto Marzano, Velta Lubkina)....Pages 420-431
A Framework for a Fuzzy Smart Home IoT e-Health Support System (Moses Adah Agana, Ofem Ajah Ofem, Bassey Igbo Ele)....Pages 432-447
Evaluation of Accuracy: A Comparative Study Between Touch Screen and Midair Gesture Input (Zeeshan Haider Malik, Miran Arfan)....Pages 448-462
Uses of Virtual Reality for Communication in Financial Services: A Case Study on Comparing Different Telepresence Interfaces: Virtual Reality Compared to Video Conferencing (Abraham G. Campbell, Thomas Holz, Jonny Cosgrove, Mike Harlick, Tadhg O’Sullivan)....Pages 463-481
Effects of Virtual Agent Gender on User Performance and Preference in a VR Training Program (Xiumin Shang, Marcelo Kallmann, Ahmed Sabbir Arif)....Pages 482-495
The Psychoinformatic Complexity of Humanness and Person-Situation Interaction (Suraj Sood)....Pages 496-504
Improving a Design Space: Pregnancy as a Collaborative Information and Social Support Ecology (Tamara Peyton, Pamela Wisniewski)....Pages 505-525
Image Gravity: Defining Spatial Constructs for Invisible Phenomena (Dana Karwas)....Pages 526-534
Service Robot Arm Controlled Just by Sight (Kohei Arai)....Pages 535-545
Usability Evaluation of Online Flight Reservation Systems (Zeeshan Haider Malik, Tayyab Munir, Mesan Ali)....Pages 546-559
MMORPG Player Classification Using Game Data Mining and K-means (Bruno Almeida Odierna, Ismar Frango Silveira)....Pages 560-579
Accurate, Timely, Reliable: A High Standard and Elusive Goal for Traveler Information Data Quality (Douglas Galarus, Ian Turnbull, Sean Campbell, Jeremiah Pearce, Leann Koon, Rafal Angryk)....Pages 580-598
Systematically Dealing Practical Issues Associated to Healthcare Data Analytics (Zeeshan Ahmed, Bruce T. Liang)....Pages 599-613
Towards Optimizing Data Analysis for Multi-dimensional Data Sets (Arialdis Japa, Daniel Brown, Yong Shi)....Pages 614-625
Clustering of Economic Data with Modified K-Mean Technique (Trung T. Pham)....Pages 626-644
Analysis of Data Governance Implications on Big Data (Lomso Trom, Johannes Cronje)....Pages 645-654
Predicting Human Position Using Improved Numerical Association Analysis for Bioelectric Potential Data (Imam Tahyudin, Berlilana, Hidetaka Nambo)....Pages 655-666
Design of an Analysis Guide for User-Centered Process Mining Projects (Yaimara Céspedes-González, Julio J. Valdes, Guillermo Molero-Castillo, Patricia Arieta-Melgarejo)....Pages 667-682
LDM: Lineage-Aware Data Management in Multi-tier Storage Systems (Pratik Mishra, Arun K. Somani)....Pages 683-707
Potential Data Sources for Sentiment Analysis Tools for Municipal Management Based on Empirical Research (Dorota Jelonek, Agata Mesjasz-Lech, Cezary Stępniak, Tomasz Turek, Leszek Ziora)....Pages 708-724
Crime Alert! Crime Typification in News Based on Text Mining (Hugo Alatrista-Salas, Juandiego Morzán-Samamé, Miguel Nunez-del-Prado)....Pages 725-741
Classification Model for Student Performance Amelioration (Stewart Muchuchuti, Lakshmi Narasimhan, Freedmore Sidume)....Pages 742-755
Towards Enhancing Historical Analogy: Clustering Users Having Different Aspects of Events (Ryohei Ikejiri, Ryo Yoshikawa, Yasunobu Sumikawa)....Pages 756-772
Educational Database Analysis Using Simple Bayesian Classifier (Byron Oviedo, Cristian Zambrano-Vega)....Pages 773-792
Two Approaches to Country Risk Evaluation (Ramin Rzayev, Sevinj Babayeva, Inara Rzayeva, Adila Ali)....Pages 793-812
Conceptual Model for the New Generation of Data Warehouse System Catalog (Danijela Jaksic, Patrizia Poscic, Vladan Jovanovic)....Pages 813-825
Towards the Processes Discovery in the Medical Treatment of Mexican-Origin Women Diagnosed with Breast Cancer (Guillermo Molero-Castillo, Javier Jasso-Villazul, Arturo Torres-Vargas, Alejandro Velázquez-Mena)....Pages 826-838
SPARQλ: SPARQL as a Function (Christian Vogelgesang, Torsten Spieldenner, René Schubotz)....Pages 839-856
A Holistic Approach to Requirements Elicitation for Mobile Tourist Recommendation Systems (Andreas Gregoriades, Maria Pampaka, Michael Georgiades)....Pages 857-873
A Marketing Game: A Model for Social Media Mining and Manipulation (Matthew G. Reyes)....Pages 874-892
Acoustic Event Detection with Sequential Attention and Soft Boundary Information (Jingjing Pan, Xianjun Xia)....Pages 893-903
Statistical Prediction of High-Cost Claimants Using Commercial Health Plan Data (Amy Z. Cao, Liana DesHarnais Castel)....Pages 904-912
A Personalized Blood Pressure Prediction Model Using Recurrent Kernel Extreme Reservoir Machine (Sundus Abrar, Ghalib Ahmad Tahir, Habeebah Adamu Kakudi, Chu Kiong Loo)....Pages 913-929
A Chronicle Review of Code Mixing and Switching or Language Exchanging in Punjabi Movie Names (Sanjeev Sharma, Deepak Sharma)....Pages 930-938
A Track Fuzzy Control of Robot Manipulator with Elastic Links (Nguyen Hoang Mai, Pham Anh Tuan)....Pages 939-951
Development and Initial Validation of the Big Data Framework for Agile Business: Transformational Innovation Initiative (Bhuvan Unhelkar, Joe Askren)....Pages 952-960
Estimation Model Based on Spectral-Reflectance Data (Tao Chi, Guangpu Cao, Bingchun Li, Zi Kerr Abdurahman)....Pages 961-969
Ethics in Analytics and Social Media (Ed Lindoo)....Pages 970-982
The Effects of the Number of Chinese Visitors on Commercial Sales in Japan (Koi Kyo)....Pages 983-997
Preliminary Multi-lingual Evaluation of a Question Answering System Based on the Node of Knowledge Method (Sanja Candrlic, Martina Asenbrener Katic, Alen Jakupovic)....Pages 998-1009
Empirical Similarity for Absent Data Generation in Imbalanced Classification (Arash Pourhabib)....Pages 1010-1030
Prediction Model for Prevalence of Type-2 Diabetes Complications with ANN Approach Combining with K-Fold Cross Validation and K-Means Clustering (Md Tahsir Ahmed Munna, Mirza Mohtashim Alam, Shaikh Muhammad Allayear, Kaushik Sarker, Sheikh Joly Ferdaus Ara)....Pages 1031-1045
Internet of Things Based Smart Community Design and Planning Using Hadoop-Based Big Data Analytics (Muhammad Babar, Waseem Iqbal, Sarah Kaleem)....Pages 1046-1057
Back Matter ....Pages 1059-1061

Lecture Notes in Networks and Systems 69

Kohei Arai, Rahul Bhatia (Editors)

Advances in Information and Communication Proceedings of the 2019 Future of Information and Communication Conference (FICC), Volume 1

Lecture Notes in Networks and Systems Volume 69

Series editor: Janusz Kacprzyk, Polish Academy of Sciences, Systems Research Institute, Warsaw, Poland

The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. 
** Indexing: The books of this series are submitted to ISI Proceedings, SCOPUS, Google Scholar and Springerlink **

Advisory Board
Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA and Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada and Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, KIOS Research Center for Intelligent Systems and Networks, Department of Electrical and Computer Engineering, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong

More information about this series at http://www.springer.com/series/15179

Editors
Kohei Arai, Faculty of Science and Engineering, Saga University, Saga, Japan
Rahul Bhatia, The Science and Information (SAI) Organization, Bradford, UK

ISSN 2367-3370
ISSN 2367-3389 (electronic)
Lecture Notes in Networks and Systems
ISBN 978-3-030-12387-1
ISBN 978-3-030-12388-8 (eBook)
https://doi.org/10.1007/978-3-030-12388-8
Library of Congress Control Number: 2018968383
© Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

After the success of the Future of Information and Communication Conference (FICC) 2018, FICC 2019 is held on March 14–15, 2019 in San Francisco, USA. The Future of Information and Communication Conference (FICC) 2019 focuses on bringing together experts from both industry and academia to exchange research findings in the frontier areas of Communication and Computing. This conference delivers programs of the latest research contributions and future vision (inspired by the issues of the day) in the field, and their potential impact across industries. It features an innovative format for presenting new research, focussing on participation and conversation rather than passive listening.

FICC 2019 attracted a total of 462 submissions from many pioneering academic researchers, scientists, industrial engineers, and students from all around the world. These submissions underwent a double-blind peer review process. Of those 462 submissions, 160 (including 15 poster papers) have been selected for inclusion in these proceedings. They cover several hot topics, including Ambient Intelligence, Intelligent Systems, Data Science, Machine Learning, Internet of Things, Networking, Security and Privacy. This conference showcases paper presentations of new research, demos of new technologies, and poster presentations of late-breaking research results, along with inspiring keynote speakers and moderated challenge sessions for participants to explore and respond to big challenge questions about the role of technology in creating thriving, sustainable communities.

Many thanks go to the Keynote Speakers for sharing their knowledge and expertise with us, and to all the authors who have spent the time and effort to contribute significantly to this conference. We are also indebted to the organizing committee for their great efforts in ensuring the successful implementation of the conference. In particular, we would like to thank the technical committee for their constructive and enlightening reviews of the manuscripts within the limited timescale.

We hope that all the participants and the interested readers benefit scientifically from this book and find it stimulating in the process. See you at the next SAI Conference, with the same amplitude, focus and determination.

Saga, Japan

Regards, Kohei Arai

Visible Light Communication Security Vulnerabilities in Multiuser Network: Power Distribution and Signal to Noise Ratio Analysis

Rana Shaaban, Prakash Ranganathan, and Saleh Faruque

University of North Dakota, Grand Forks, ND 58201, USA

Abstract. In the near future, Visible Light Communication (VLC) is expected to be used in many environments in which radio frequency (RF) should not be employed because of RF congestion and health limitations. VLC combines optical wireless communication with illumination. Because of the misconception that VLC-based communications cannot be eavesdropped on by a malicious attacker, since light does not penetrate solid objects such as walls, VLC security and privacy have hardly been studied. In this work, we study various techniques for the physical-layer security of VLC-based communication. We then propose a new VLC framework to defend against eavesdropping attacks. A three-step process was followed to achieve this aim: first, implementing more APs in the multiuser VLC network; then, reducing the semi-angle of the LED; and finally, using a protected zone around the AP from which eavesdroppers are restricted. The performance is measured in terms of the received optical power and SNR. The simulation results indicate that VLC secrecy performance can be enhanced using the proposed model.

Keywords: VLC · Security · Safety · Wireless communication

1 Introduction

Visible light communication (VLC) is a promising candidate for future high-speed broadband communications [1–3]. VLC potentially offers 10,000 times more bandwidth capacity than RF-based technologies [4]. VLC technology is mostly based on the intensity modulation of white light-emitting diodes (LEDs), which can be switched on and off at a very high rate, thus enabling data communication as well as illumination [5]. LEDs are widely used in everyday infrastructure, including homes, offices, street and traffic lights, and smartphones. By reusing the existing lighting infrastructure and moving wireless communication to the visible spectrum, VLC could mitigate the spectrum crunch in present radio frequency (RF) wireless systems. To commercialize VLC in the near future, recent efforts have standardized short-range wireless optical communication using VLC for local and metropolitan area networks [6]. Moreover, VLC is attractive because it uses existing lighting systems and operates in license-free spectrum, which results in a lower implementation cost.

© Springer Nature Switzerland AG 2020
K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 1–13, 2020. https://doi.org/10.1007/978-3-030-12388-8_1


It is also considered safe for electromagnetically sensitive areas where RF is not allowed for safety reasons. Additionally, VLC can be deployed alongside current wireless networks, since it neither receives interference from nor adds interference to its RF counterparts. Because visible light cannot penetrate walls, VLC has a high frequency reuse factor and hence a high area spectral efficiency. However, the broadcast nature of VLC raises security and privacy concerns; eavesdropping in particular has attracted serious attention, since addressing it is a crucial step toward successful VLC deployment in the wild [7]. Figure 1 shows a VLC network consisting of one sender (Alice), who uses the light sources for data transmission, one legitimate receiver (Bob), and one eavesdropper (Eve) [8].

Communication systems are designed to send information from a source to one or more destinations. The general communication system block diagram is shown in Fig. 2. The information generated by the source may take the form of voice, a picture, video, or plain text in some particular language; it is then converted into a sequence of binary digits by the source encoder. The channel encoder introduces, in a controlled manner, some redundancy into the binary information sequence, which the receiver can use to overcome the effects of noise and interference encountered during transmission through the channel. The output of the channel encoder is passed to the modulator and then transmitted through the channel, which can be a wired or wireless medium, such as copper wire, coaxial cable, waveguide, fiber-optic cable, antennas, and laser or LED. Similarly, the receiver can be a wired interface, an antenna, or a photodetector.

Fig. 1. An example of Eavesdropping [8]

In the case of optical transmission, the photodetector recovers the signal, which is then demodulated and decoded to reconstruct the original data. The communication protocol is susceptible to several attacks, including packet falsification, replay attacks, jamming, membership falsification, message manipulation, and hijacking, as shown in Fig. 2. Walls can block the light and provide a certain degree of privacy, but there are still potential concerns for legitimate users and network administrators regarding information privacy and confidentiality, particularly in public areas such as train stations, offices, and libraries. Traffic safety is also the major concern of Intelligent Transportation Systems (ITS), which are highly vulnerable to attackers, as shown in Fig. 3. The fundamental objective of ITS is to reduce traffic accidents by maintaining timely and adequate data collection about events such as accidents, road conditions, and traffic jams.


Fig. 2. Communication system block diagram

Vehicular Ad Hoc Networks (VANETs) help to reduce traffic fatalities through vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication [9] based on dedicated short-range communications (DSRC) and VLC, as proposed in [10]. VANETs provide autonomous information exchange using wireless communication between vehicles. They are expected to be used in the near future to mitigate traffic problems and support self-driving cars [11]. VANET applications transmit Cooperative Awareness Messages (CAMs) frequently to maintain safe and efficient traffic flow. A CAM (also known as a Basic Safety Message (BSM) or beacon) includes information such as timestamp, position, speed, and heading. Because this information is broadcast publicly [12], a critical privacy threat can result, especially if these CAMs are collected and analyzed. VANET systems are vulnerable to various malicious scenarios such as packet falsification and replay attacks. In the former, the attacker continually listens to the communication channel among vehicles; once it receives a packet, it changes the information and rebroadcasts it as if the packet came from the platoon leader. In the latter, the adversary retransmits a valid data packet, pretending to be a legitimate platoon member: it eavesdrops on the platoon communication, saves the packets broadcast by platoon members, and later replays them as if they had just been generated. The replayed packet carries out-of-date information, which deceives the platoon members and degrades platoon stability. Experimental security analysis of a modern automobile shows that attackers can evaluate the information in data packets using an automotive diagnostic tool [13]. For instance, consider a scenario in which a malicious actor changes the acceleration of the platoon from slowing down to speeding up. Modifying the acceleration may result in a collision.
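To make the replay scenario concrete, the following minimal Python sketch models a CAM with the fields listed above and shows why a replayed packet carries out-of-date information. The `Cam` class, the `is_stale` helper, and the 0.5 s freshness window are hypothetical and chosen only for illustration; they are not taken from the paper or from any ITS standard.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical freshness window in seconds (illustrative assumption only).
MAX_AGE_S = 0.5


@dataclass(frozen=True)
class Cam:
    """Simplified Cooperative Awareness Message with the fields named in the text."""
    sender_id: str
    timestamp_s: float            # time at which the sender generated the message
    position_m: Tuple[float, float]
    speed_mps: float
    heading_deg: float


def is_stale(cam: Cam, now_s: float, max_age_s: float = MAX_AGE_S) -> bool:
    """Return True if the CAM was generated longer ago than the freshness window."""
    return now_s - cam.timestamp_s > max_age_s


# A platoon member broadcasts a CAM at t = 10.0 s.
original = Cam("member-2", 10.0, (120.0, 3.5), 25.0, 90.0)

# An eavesdropper stores the packet and rebroadcasts it unchanged at t = 14.0 s.
replayed = original

fresh_on_arrival = not is_stale(original, now_s=10.05)
replay_detected = is_stale(replayed, now_s=14.0)
print(fresh_on_arrival, replay_detected)
```

A timestamp check of this kind only exposes an unmodified replay. In the packet falsification case the attacker rewrites the fields, so a plain freshness test would not help there.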
Wyner pioneered the wiretap channel [14], an information-theoretic model of physical-layer security in which an eavesdropper observes a corrupted version of the transmitted signal. Csiszár and Körner then extended the degraded wiretap channel to the non-degraded broadcast channel [15]. Their work showed that ideal secrecy can be achieved as long as the legitimate user has a less degraded channel than the eavesdropper, and that the secrecy capacity is the difference between the two users' information capacities. Password protection and user admission control are common security-enhancing methods that are implemented at the upper layers of the communication chain.
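The secrecy-capacity statement above can be written compactly. The symbols $C_B$ and $C_E$ (capacities of the legitimate and eavesdropper channels) are notation introduced here for illustration. The second line is only an illustrative special case for real Gaussian channels under an average-power constraint, which is an assumption and not the amplitude-constrained optical channel used in VLC.

```latex
% Secrecy capacity as the (non-negative) difference of the two capacities
C_s = \left[\, C_B - C_E \,\right]^{+}, \qquad [x]^{+} = \max(x, 0)

% Illustrative special case (assumption): Gaussian channels with average-power constraint
C_s = \left[\, \tfrac{1}{2}\log_2\!\left(1 + \mathrm{SNR}_B\right)
            - \tfrac{1}{2}\log_2\!\left(1 + \mathrm{SNR}_E\right) \right]^{+}
```

In this form the secrecy capacity is positive only when the legitimate user's SNR exceeds that of the eavesdropper.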


Fig. 3. Autonomous VANET communication architecture

Physical-layer security, in contrast to these upper-layer methods, limits the amount of legitimate information that unauthorized eavesdroppers can detect by exploiting the randomness of the wireless channel noise [14, 15]. The works in [14, 15] focus on RF-based wireless networks. For VLC, on the other hand, [16] proposed fuzzy timing passwords, which use different time delays to differentiate between the legitimate user and the eavesdropper. In [17], screen view angles and user-induced motions were leveraged between smartphones in secure barcode-based visible light communication (SBVLC). Furthermore, to secure the physical layer, Mostafa et al. suggested the use of VLC-friendly jamming [18], VLC artificial noise [19], and VLC beamforming [8]. For multiuser wireless networks, unlike point-to-point communication, large-scale secrecy operation requires knowledge not only of the locations of legitimate users but also of the locations of eavesdroppers that may reach the legitimate users.

The remainder of this paper is organized as follows. In Sect. 2, we study the challenges and motivation in the indoor model and then in the vehicular model. In Sect. 3, we introduce the indoor room model with channel analysis results and discuss the MATLAB simulation results, comparing the proposed model with previous work on enhancing secrecy performance. A study of the SNR and of the effect of using multiple APs on the legitimate user's signal strength is presented in Sect. 4. Finally, concluding remarks and future work are given in Sect. 5.

2 Challenges and Motivation

2.1 Indoor Infrastructure VLC Prototype

Pioneering RF studies that characterize secrecy performance in multiuser wireless networks from an information-theoretic view rely on the secrecy graph model to study node connectivity [20, 21] and the maximum secrecy rate [22]. In addition, the secrecy capacity scaling laws in a wireless network were derived in [23] to study the secrecy rate per source-destination pair. Beyond network information theory, later works used mathematical tools from stochastic geometry to study the secrecy performance in multiuser wireless networks [24, 25].

In contrast to RF communication, VLC uses intensity modulation and direct detection (IM/DD), owing to the use of low-cost light-emitting diodes (LEDs) and photodiodes (PDs) as the optical transmitter and receiver, respectively. The signal in VLC is modulated onto the LED intensity, so it must respect the dynamic range of typical LEDs and practical illumination guidelines [26–29], that is, average, peak, and non-negative amplitude constraints. Results on the secrecy capacity obtained for RF networks therefore cannot be applied directly to VLC networks. Although LEDs have a nonlinear electrical-to-optical transfer characteristic, this nonlinearity can be compensated well by pre-distortion [30]. Also, the VLC channel does not exhibit multipath fading: the wavelength of visible light is hundreds of nanometers, while the detection area of a typical PD is millions of square wavelengths, which provides spatial diversity. Because the secrecy capacity is tied to the channel capacity [14, 15], it is important to obtain the information capacity of the VLC channel under average, peak, and non-negativity constraints before determining the secrecy capacity of a VLC network. However, the exact information capacity of the VLC channel is still unknown. Although some lower and upper bounds have been derived, the capacity remains unknown even for the simplest single-input single-output (SISO) case. The work in [8] studied lower and upper bounds on the secrecy capacity of the amplitude-constrained Gaussian wiretap channel with one transmitter, one legitimate user, and one eavesdropper. The same work [8] applied beamforming to enhance the secrecy capacity of the multiple-input single-output (MISO) VLC channel. Subsequently, the optimal beamformer design problem subject to amplitude constraints was examined further in [31]. However, for the MISO case, previous work entirely ignores the effect of channel correlation on VLC security. The secrecy of a VLC system using one access point (AP) in a single cell was studied in [32].
However, the random behavior of legitimate users and eavesdroppers, and specifically the interaction between them, has not been precisely evaluated when considering the secrecy performance of multiuser VLC networks. Moreover, to the best of the authors' knowledge, previous work neglected light reflections, which is not realistic for an indoor model. It has also been shown that eavesdroppers can recover legitimate information from a small portion of the reflected signal [33]. The work in [34] studied physical-layer secrecy in a three-dimensional multiuser VLC network using mathematical tools from stochastic geometry. It derived analytical expressions for the secrecy outage probability and the ergodic secrecy rate, as well as their lower and upper bounds, in tractable forms and verified them with Monte Carlo simulations. That work also studied how access point (AP) cooperation improves the secrecy performance of VLC networks. The impact of reflected paths and channel correlation on VLC security was modeled and analyzed in [35], where VLC security was enhanced by exploiting VLC's multipath redundancy with time-reversal and random-choice techniques under SISO-VLC and MISO-VLC models. This makes the transmitted signal focus automatically on the legitimate user while interfering with the eavesdropper's channel. Furthermore, this framework applies the K-L transform to cope with correlation so that the secrecy capacity improves. Table 1 summarizes different VLC network privacy enhancement methods.

Table 1. VLC secrecy enhancement techniques

Paper      Method
[8]        Studied lower and upper bounds on the secrecy capacity of the amplitude-constrained Gaussian wiretap channel. Applied beamforming to enhance the secrecy capacity for the multiple-input single-output (MISO) VLC channel
[14]       Showed that a non-negative secrecy rate can only be achieved when the legitimate user achieves a higher SNR than the strongest eavesdropper
[16]       Fuzzy timing passwords, which used different time delay to differentiate between the legitimate user and the eavesdropper
[17]       Screen view angles and leveraged user induced motions was used between smartphones in the secure barcode-based visible light communication (SBVLC)
[18, 19]   To secure the physical layer VLC-friendly jamming, VLC-artificial noise, and VLC beamforming was used
[34, 35]   Studied the effect of installing more access point (AP) and their cooperation results in improving the secrecy performance of VLC networks. The impact of reflected paths and channel correlation on VLC security was modeled and analyzed
[46]       Used a strategy named the "protected zone" to enhance the secrecy performance of legitimate user in VLC networks

2.2 Outdoor Vehicular VLC Prototype

Basically, privacy schemes in VANETs change pseudonyms frequently in hidden mix-contexts to avoid linkability of CAMs. Privacy requires preventing an adversary from using CAM information to correlate two successive messages sent under the old and new pseudonyms. Unobserved mix-contexts are created by using a silent period before a pseudonym change or by changing pseudonyms in cryptographic mix-zones (e.g., at road intersections). As shown in [36, 37], changing pseudonyms without these unobserved mix-contexts does not prevent vehicle tracking. In addition, Sampigethaya et al. [38] proposed silent periods in VANETs when vehicles are merging or changing lanes and when entering or exiting a freeway. In [39], Freudiger et al. suggested cryptographic mix-zones (CMIX), in which vehicles obtain a symmetric key from the Road-Side Unit (RSU) that controls the mix-zone. Keys are also forwarded, on request, to vehicles outside the range of the RSU so that they can decrypt messages received from vehicles within the zone. Buttyán et al. [40] proposed stopping message transmission when the vehicle speed is low, for example at intersections, because fatal accidents are less likely to happen at low speed. Moreover, Palanisamy et al. [41] replaced the mix-zone model with a framework, MobiMix, which is robust against timing and transition attacks. More recently, Yu et al. [42] suggested MixGroup, which efficiently exploits the sparse meeting opportunities among vehicles to change pseudonyms. The study in [43] showed the effect of various VANET security and privacy schemes on an emergency braking alarm application by simulating a dense platoon of vehicles moving at relatively high speed and counting the vehicle collisions that occur when a leading vehicle suddenly brakes. Furthermore, Lefèvre et al. [44] evaluated the impact of the silent period duration on the effectiveness of intersection collision avoidance (ICA) systems. They proposed an ICA system, analyzed a silent period scheme in terms of missed and avoided collisions, and claimed that the ICA system can function effectively with silent periods of less than two seconds.

In this work, we focus on analyzing the secrecy performance of an indoor multiuser VLC network by considering the properties of the VLC channel using Lambertian radiant intensity. Our proposed model is based on the received optical power and SNR to reach the optimal secrecy rate. This study enhances the secrecy performance for the legitimate user by (1) implementing more cooperating APs in the multiuser VLC network, (2) reducing the semi-angle of the LED, and (3) using a protected zone around the AP from which eavesdroppers are restricted.

3 System Model

The indoor visible light communication system model uses LED lights. In room 1, the APs are represented by four LEDs evenly distributed on the ceiling of the room, while room 2 uses six APs at the locations shown in Table 2. We consider a downlink transmission scenario of a multiuser VLC network in which both legitimate users and eavesdroppers are present inside a three-dimensional space. In this system model, white LEDs emit high-frequency light waves that carry modulated signals through the air to the receiver, providing illumination and wireless data transmission at the same time. The VLC APs are vertically fixed, since they are attached to the room ceiling, and their horizontal positions are given in Table 2. Similarly, mobile users are assumed to be at a fixed height. The simulation results are shown in Fig. 4.

Table 2. The main parameters of simulation (system parameters for a VLC link)

Room 1
  Size                                          5 × 5 × 3 m³
  Source location (4 LEDs)                      (1.25, 1.25, 3), (1.25, 3.75, 3), (3.75, 1.25, 3), (3.75, 3.75, 3)
  Semi-angle at half power (FWHM)               35°
  Transmitted power (per LED)                   20 mW
  Number of LEDs per array                      60 × 60 (3600)
Room 2
  Size                                          5 × 5 × 3 m³
  Source location (6 LEDs)                      (1.6, 1.25, 3), (1.6, 3.75, 3), (2.5, 1.25, 3), (2.5, 3.75, 3), (4.1, 1.25, 3), (4.1, 3.75, 3)
  Semi-angle at half power (FWHM)               15°
  Transmitted power (per LED)                   20 mW
  Number of LEDs per array                      60 × 60 (3600)
Receiver
  Receive plane above the floor                 0.85 m
  Active area (A_rx)                            1 cm²
  FOV                                           60°
  Amplifier bandwidth B_a                       50 MHz
  Concentrator gain g                           6.0
  Photodiode responsivity r                     0.4 A/W
  Amplifier noise density i_am                  5 pA/√Hz
  Ambient noise power P_n                       19.272 μW
  Noise bandwidth factor I_1                    0.562
  Optical filter's transmission coefficient T   1.0

Fig. 4. Optical power distribution in the receive plane for an FWHM of (a) 35° with four APs (room 1) and (b) 15° with six APs (room 2); (c) and (d) are top views of room 1 and room 2, respectively

3.1 Transmitter

The channel transfer function for a white LED light source pointing directly toward the optical receiver is given by [45]:

$$H_{LOS} = \begin{cases} \dfrac{A_{rx}}{h^{2}}\, R(\phi)\cos(\psi), & 0 \le \psi \le \psi_c \\ 0, & \psi > \psi_c \end{cases} \tag{1}$$

where $A_{rx}$ denotes the effective detector area of the PD, $h$ is the distance between the transmitter and the receiver, and $\psi$ is the angle of incidence. The PD at each user is assumed to be facing vertically upwards with a field of view (FOV) of $\psi_c$. The VLC APs are assumed to have a Lambertian radiation profile $R(\phi)$, where $\phi$ is the emission angle at the LED:

$$R(\phi) = \frac{n+1}{2\pi}\cos^{n}(\phi) \tag{2}$$

$$n = -\frac{\ln 2}{\ln\left(\cos\phi_{1/2}\right)} \tag{3}$$

where $n$ is the Lambertian emission coefficient, determined by the semi-angle at half power $\phi_{1/2}$ of the LED. Equation (4) gives the total received power for the LOS channel only:

$$P_{LOS} = \sum_{i=1}^{LED_{num}} P \times H_{LOS} \tag{4}$$

The receiver consists of a photodiode, a concentrator, and an optical filter; consequently, the received power for the LOS channel is:

$$P_{re} = P_{LOS} \times g \times T \tag{5}$$

where g is the concentrator gain and T is the optical filter's transmission coefficient.

3.2 Simulation Results and Discussion

In this section, we use a MATLAB implementation to validate our proposed model. Two typical rooms with a size of 5 × 5 × 3 m³ are considered; the network parameters used for the simulation setup are listed in Table 2. First, we consider the scenario in which the legitimate users are served by four APs, as depicted in Fig. 4(a). In this case, malicious eavesdroppers can be as close as possible to the legitimate user, as shown in the top view of room 1 in Fig. 4(c). It can be seen that a large semi-angle at half power (35°) combined with few APs (four in the first room) significantly reduces the secrecy at the legitimate user. However, when the semi-angle is reduced to 15° and the number of APs in the second room is increased to six, Fig. 4(b), the secrecy performance of the legitimate user clearly increases, as shown in the top view of room 2 in Fig. 4(d). The received power is highly concentrated in a zone 1 m in diameter around the legitimate user and decreases with distance from that zone, so eavesdroppers cannot reconstruct the signal at power levels lower than 4 mW. Therefore, using both methods, reducing the semi-angle at half power of the LED and deploying more APs in the indoor VLC system, automatically forms a protected zone around the legitimate users that enhances their privacy and secrecy in VLC networks. If any eavesdropper enters the protected zone, the AP is made aware of it, notifies the legitimate user, and temporarily stops the communication. In practice, the protected zone in VLC networks can be implemented with the motion sensors that are already built into modern energy-efficient lighting devices [46]. A secrecy protected zone can be defined by its center, i.e., its associated AP position, and a security radius, which is described as the smallest horizontal distance between the AP and any eavesdroppers that are undetectable.
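The LOS power model of Eqs. (1)–(5) can be sketched in a few lines of Python using the geometry and receiver values of Table 2. This is a minimal sketch and not the authors' MATLAB code. It assumes that each AP radiates the combined power of its 3600 LEDs (3600 × 20 mW), that LEDs face straight down and the PD faces straight up, and that reflections are ignored. The function and variable names are hypothetical.

```python
import math

# Source positions and receiver values as listed in Table 2.
ROOM1_LEDS = [(1.25, 1.25, 3.0), (1.25, 3.75, 3.0), (3.75, 1.25, 3.0), (3.75, 3.75, 3.0)]
ROOM2_LEDS = [(1.6, 1.25, 3.0), (1.6, 3.75, 3.0), (2.5, 1.25, 3.0),
              (2.5, 3.75, 3.0), (4.1, 1.25, 3.0), (4.1, 3.75, 3.0)]
RX_HEIGHT_M = 0.85      # receive plane above the floor
AREA_M2 = 1e-4          # active area A_rx = 1 cm^2
FOV_DEG = 60.0          # receiver field of view
GAIN = 6.0              # concentrator gain g
FILTER_T = 1.0          # optical filter transmission coefficient T
# Assumption: one AP emits the power of a full 60 x 60 array at 20 mW per LED.
P_AP_W = 3600 * 20e-3


def lambertian_order(semi_angle_deg):
    """Eq. (3): Lambertian emission coefficient from the semi-angle at half power."""
    return -math.log(2.0) / math.log(math.cos(math.radians(semi_angle_deg)))


def h_los(led, rx, semi_angle_deg):
    """Eqs. (1)-(2): LOS gain from one downward-facing LED to one upward-facing PD."""
    dx, dy, dz = led[0] - rx[0], led[1] - rx[1], led[2] - rx[2]
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    cos_phi = dz / dist     # emission angle at the LED
    cos_psi = dz / dist     # incidence angle at the PD (parallel planes, so equal)
    if cos_psi < math.cos(math.radians(FOV_DEG)):
        return 0.0          # incidence angle exceeds the FOV
    n = lambertian_order(semi_angle_deg)
    radiation = (n + 1.0) / (2.0 * math.pi) * cos_phi ** n
    return AREA_M2 / dist ** 2 * radiation * cos_psi


def received_power(rx, leds, semi_angle_deg):
    """Eqs. (4)-(5): sum the LOS contributions, then apply g and T."""
    p_los = sum(P_AP_W * h_los(led, rx, semi_angle_deg) for led in leds)
    return p_los * GAIN * FILTER_T


# Compare a point directly under an AP with a point near the room corner.
under_ap_room2 = received_power((1.6, 1.25, RX_HEIGHT_M), ROOM2_LEDS, 15.0)
corner_room2 = received_power((0.1, 0.1, RX_HEIGHT_M), ROOM2_LEDS, 15.0)
corner_room1 = received_power((0.1, 0.1, RX_HEIGHT_M), ROOM1_LEDS, 35.0)
print(under_ap_room2, corner_room2, corner_room1)
```

Under these assumptions the 15° configuration leaves far less power near the corner than the 35° configuration, which is the concentration effect described above. Absolute values depend on the per-AP power assumption and should not be read as a reproduction of Fig. 4.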

4 Signal to Noise Ratio Analysis

The authors of [34] proved that better secrecy performance is reached when the legitimate user achieves a higher SNR than the strongest eavesdropper. Using the parameters in Table 2, the simulated SNR for the two rooms is shown in Fig. 5.

Fig. 5. SNR at the receiver for an FWHM of (a) 35° with four APs (room 1), (b) 15° with six APs (room 2)

The photodetector converts light signals into electrical signals, and the SNR is given by:

$$SNR = \frac{i^{2}}{\sigma^{2}} \tag{6}$$

where $\sigma^{2}$ is the total noise variance and $i$ is the photodiode's output current, given by:

$$\sigma^{2} = \sigma_{sh}^{2} + \sigma_{am}^{2} \tag{7}$$

$$i = P_{re} \times r \tag{8}$$

$$\sigma_{sh}^{2} = 2\,(P_{re} + P_n) \times q \times r \times B_n \tag{9}$$

$$B_n = I_1^{2} R_b \tag{10}$$

$$\sigma_{am}^{2} = i_{am}^{2} B_a \tag{11}$$

where $\sigma_{sh}^{2}$ is the shot-noise variance, $\sigma_{am}^{2}$ is the amplifier noise variance, $r$ is the photodiode responsivity, $q$ is the electron charge, $P_n$ is the ambient light noise power, $B_n$ is the noise bandwidth, $I_1$ is the noise bandwidth factor, $R_b$ is the data rate, and $i_{am}$ and $B_a$ are the amplifier noise density and the amplifier bandwidth, respectively.

In comparison, when the semi-angle at half power is decreased from 35° to 15° and six APs are used rather than four, the SNR outside the circular protected zone at each AP drops from 95 dB to 75 dB, as shown in Fig. 5(a) and (b). This result agrees with what has been proved: increasing the density of APs can definitely enhance the secrecy and privacy performance of the legitimate user in VLC networks, since it significantly decreases the SNR of eavesdroppers located outside the protected zone while increasing the SNR of the legitimate user.
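The SNR chain of Eqs. (6)–(9) and (11) can be sketched as follows, using the receiver values of Table 2. The data rate is not listed in Table 2, so the noise bandwidth is passed in directly instead of being derived through Eq. (10); the 56.2 MHz value and the two received power levels below are hypothetical and serve only to exercise the function. This is a sketch under those assumptions, not the authors' simulation.

```python
import math

Q_COULOMB = 1.602176634e-19   # electron charge


def snr_db(p_re_w, noise_bw_hz, responsivity=0.4, p_ambient_w=19.272e-6,
           i_am_a_per_sqrt_hz=5e-12, amp_bw_hz=50e6):
    """Return the electrical SNR in dB for a received optical power p_re_w (watts)."""
    current = p_re_w * responsivity                                        # Eq. (8)
    var_shot = (2.0 * (p_re_w + p_ambient_w) * Q_COULOMB
                * responsivity * noise_bw_hz)                              # Eq. (9)
    var_amp = i_am_a_per_sqrt_hz ** 2 * amp_bw_hz                          # Eq. (11)
    snr = current ** 2 / (var_shot + var_amp)                              # Eqs. (6)-(7)
    return 10.0 * math.log10(snr)


# Hypothetical noise bandwidth and received powers (assumptions for illustration).
NOISE_BW_HZ = 56.2e6
legit_snr_db = snr_db(1e-3, NOISE_BW_HZ)   # user inside the protected zone
eve_snr_db = snr_db(1e-4, NOISE_BW_HZ)     # eavesdropper outside the zone
snr_advantage_db = legit_snr_db - eve_snr_db
print(legit_snr_db, eve_snr_db, snr_advantage_db)
```

A positive `snr_advantage_db` corresponds to the condition stated at the start of this section, namely that the legitimate user achieves a higher SNR than the eavesdropper.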

5 Conclusion

In this paper, we have proposed a new VLC model for indoor environments that enhances security in VLC technologies. First, we implemented six cooperating APs rather than four in a typical 5 × 5 × 3 m³ multiuser VLC network (an office). Then, we reduced the semi-angle of the LED from 35° to 15° to further improve secrecy performance by directing the power into the specified zone for the legitimate user. Finally, by analyzing the SNR and applying the protected zone around the AP from which eavesdroppers are restricted, we found that these techniques strengthen the legitimate user's signal and weaken the signals outside the protected zone. The eavesdropper's signal strength dropped from 95 dB to 75 dB, which validates our work and the improvement in network secrecy performance.

References

1. Komine, T., Nakagawa, M.: Fundamental analysis for visible-light communication system using LED lights. IEEE Trans. Consum. Electron. 50(1), 100–107 (2004)
2. Dimitrov, S., Haas, H.: Principles of LED light communications: towards networked Li-Fi. Cambridge University Press, Cambridge (2015)
3. Haas, H., Yin, L., Wang, Y., Chen, C.: What is LiFi? J. Light. Technol. 34(6), 1533–1544 (2016)
4. Basnayaka, D.A., Haas, H.: Hybrid RF and VLC systems: improving user data rate performance of VLC systems. In: IEEE Vehicular Technology Conference, vol. 2015 (2015)
5. Shaaban, R., Faruque, S.: A survey of indoor visible light communication power distribution and color shift keying transmission. In: IEEE International Conference on Electro Information Technology, pp. 149–153 (2017)
6. IEEE Computer Society: IEEE Standard for Local and metropolitan area networks - Part 15.7: Short-Range Wireless Optical Communication Using Visible Light, IEEE Std 802.15.7-2011, vol. 1, no. September, pp. 1–286 (2011)
7. Mostafa, A., Lampe, L.: Enhancing the security of VLC links: physical-layer approaches. In: 2015 IEEE Summer Topicals Meeting Series, SUM 2015, pp. 39–40 (2015)
8. Mostafa, A., Lampe, L.: Physical-layer security for MISO visible light communication channels. IEEE J. Sel. Areas Commun. 33(9), 1806–1818 (2015)
9. Ucar, S., Ergen, S.C., Ozkasap, O.: Multihop-cluster-based IEEE 802.11p and LTE hybrid architecture for VANET safety message dissemination. IEEE Trans. Veh. Technol. 65(4), 2621–2636 (2016)
10. Ucar, S., Ergen, S.C., Ozkasap, O.: Security vulnerabilities of IEEE 802.11p and visible light communication based platoon. In: 2016 IEEE Vehicular Networking Conference, pp. 1–4 (2016)
11. Emara, K.: Safety-aware location privacy in VANET: evaluation and comparison. IEEE Trans. Veh. Technol. 66(12), 10718–10731 (2017)
12. ETSI: ETSI EN 302 637-3 Intelligent Transport Systems (ITS); Vehicular Communications; Basic Set of Applications; Part 3: Specifications of Decentralized Environmental Notification Basic Service, Etsi, vol. 1, pp. 1–73 (2014)
13. Koscher, K., et al.: Experimental security analysis of a modern automobile. In: Proceedings IEEE Symposium on Security and Privacy, pp. 447–462 (2010)
14. Wyner, A.D.: The wire-tap channel. Bell Syst. Tech. J. 54(8), 1355–1387 (1975)
15. Csiszár, I., Körner, J.: Broadcast channels with confidential messages. IEEE Trans. Inf. Theory 24(3), 339–348 (1978)
16. Araki, T., Suzuki, T.: Fuzzy timing passwords for providing easy user authentication to disable persons and their application to visible light communication. In: World Automation Congress (WAC), pp. 1–5 (2012)
17. Zhang, B., Ren, K., Xing, G., Fu, X., Wang, C.: SBVLC: secure barcode-based visible light communication for smartphones. IEEE Trans. Mob. Comput. 15(2), 432–446 (2016)
18. Mostafa, A., Lampe, L.: Securing visible light communications via friendly jamming. In: 2014 IEEE Globecom Workshops, GC Wkshps 2014, pp. 524–529 (2014)
19. Mostafa, A., Lampe, L.: Physical-layer security for indoor visible light communications. In: 2014 IEEE International Conference on Communications, ICC 2014, pp. 3342–3347 (2014)
20. Haenggi, M.: The secrecy graph and some of its properties. In: IEEE International Symposium on Information Theory - Proceedings, pp. 539–543 (2008)
21. Pinto, P.C., Barros, J., Win, M.Z.: Secure communication in stochastic wireless networks, Part I: connectivity. IEEE Trans. Inf. Forensics Secur. 7(1), 125–138 (2012)
22. Pinto, P.C., Barros, J., Win, M.Z.: Secure communication in stochastic wireless networks, Part II: maximum rate and collusion. IEEE Trans. Inf. Forensics Secur. 7(1 Part 2), 139–147 (2012)
23. Koyluoglu, O.O., Koksal, C.E., El Gamal, H.: On secrecy capacity scaling in wireless networks. IEEE Trans. Inf. Theory 58(5), 3000–3015 (2012)
24. Zhou, X., Ganti, R.K., Andrews, J.G., Hjørungnes, A.: On the throughput cost of physical layer security in decentralized wireless networks. IEEE Trans. Wirel. Commun. 10(8), 2764–2775 (2011)
25. Wang, H., Zhou, X., Reed, M.C.: Physical layer security in cellular networks: a stochastic geometry approach. IEEE Trans. Wirel. Commun. 12(6), 2776–2787 (2013)
26. Ma, H., Lampe, L., Hranilovic, S.: Coordinated broadcasting for multiuser indoor visible light communication systems. IEEE Trans. Commun. 63(9), 3313–3324 (2015)
27. Lapidoth, A., Moser, S.M., Wigger, M.A.: On the capacity of free-space optical intensity channels. IEEE Trans. Inf. Theory 55(10), 4449–4461 (2009)
28. Wang, J.B., Hu, Q.S., Wang, J., Chen, M., Wang, J.Y.: Tight bounds on channel capacity for dimmable visible light communications. J. Light. Technol. 31(23), 3771–3779 (2013)
29. Chaaban, A., Morvan, J.M., Alouini, M.S.: Free-space optical communications: capacity bounds, approximations, and a new sphere-packing perspective. IEEE Trans. Commun. 64(3), 1176–1191 (2016)
30. Dimitrov, S., Haas, H.: Information rate of OFDM-based optical wireless communication systems with nonlinear distortion. J. Light. Technol. 31(6), 918–929 (2013)
31. Mostafa, A., Lampe, L.: Optimal and robust beamforming for secure transmission in MISO visible-light communication links. IEEE Trans. Sig. Process. 64(24), 6501–6516 (2016)
32. Pan, G., Ye, J., Ding, Z.: On secure VLC systems with spatially random terminals. IEEE Commun. Lett. 21(3), 492–495 (2017)
33. Classen, J., Chen, J., Steinmetzer, D., Hollick, M., Knightly, E.: The spy next door: eavesdropping on high throughput visible light communications. In: 2nd International Workshop on Visible Light Communications Systems, VLCS 2015, pp. 9–14 (2015)
34. Yin, L., Haas, H.: Physical-layer security in multiuser visible light communication networks. IEEE J. Sel. Areas Commun. 36(1), 162–174 (2018)
35. Liu, X., Wei, X., Guo, L., Liu, Y., Zhou, Y.: A new eavesdropping-resilient framework for indoor visible light communication. In: Proceedings of 2016 IEEE Global Communications Conference, GLOBECOM 2016 (2016)
36. Emara, K., Woerndl, W., Schlichter, J.: Vehicle tracking using vehicular network beacons. In: 2013 IEEE 14th International Symposium "A World Wireless, Mobile Multimedia Networks", pp. 1–6 (2013)
37. Wiedersheim, B., Ma, Z., Kargl, F., Papadimitratos, P.: Privacy in inter-vehicular networks: why simple pseudonym change is not enough. In: 7th International Conference on Wireless On-Demand Network Systems and Services, WONS 2010, pp. 176–183 (2010)
38. Sampigethaya, K., Li, M., Huang, L., Poovendran, R.: AMOEBA: robust location privacy scheme for VANET. IEEE J. Sel. Areas Commun. 25(8), 1569–1589 (2007)
39. Freudiger, J., Raya, M., Félegyházi, M., Papadimitratos, P., Hubaux, J.-P.: Mix-zones for location privacy in vehicular networks. In: ACM Workshop Wireless Networking Intelligent Transportation Systems, vol. 51, pp. 1–7 (2007)
40. Buttyán, L., Holczer, T., Weimerskirch, A., Whyte, W.: SLOW: a practical pseudonym changing scheme for location privacy in VANETs. In: 2009 IEEE Vehicular Networking Conference, VNC 2009 (2009)
41. Palanisamy, B., Liu, L.: Attack-resilient mix-zones over road networks: architecture and algorithms. IEEE Trans. Mob. Comput. 14(3), 495–508 (2015)
42. Yu, R., Kang, J., Huang, X., Xie, S., Zhang, Y., Gjessing, S.: MixGroup: accumulative pseudonym exchanging for location privacy enhancement in vehicular social networks. IEEE Trans. Dependable Secur. Comput. 13(1), 93–105 (2016)
43. Papadimitratos, P., Calandriello, G., Hubaux, J.-P., Lioy, A.: Impact of vehicular communications security on transportation safety. In: IEEE Conference Computer Communications Workshops, IEEE INFOCOM 2008, vol. 00, no. c, pp. 1–6 (2008)
44. Lefèvre, S., Petit, J., Bajcsy, R., Laugier, C., Kargl, F.: Impact of V2X privacy strategies on intersection collision avoidance systems. In: IEEE Vehicular Networking Conference, VNC 2013, pp. 71–78 (2013)
45. Lu, H., Su, Z., Yuan, B.: SNR and optical power distribution in an indoor visible light communication system, pp. 1063–1067 (2014)
46. Romero-Zurita, N., McLernon, D., Ghogho, M., Swami, A.: PHY layer security based on protected zone and artificial noise. IEEE Sig. Process. Lett. 20(5), 487–490 (2013)

The Applications of Model Driven Architecture (MDA) in Wireless Sensor Networks (WSN): Techniques and Tools

Muhammad Waseem Anwar (✉), Farooque Azam, Muazzam A. Khan, and Wasi Haider Butt

Department of Computer & Software Engineering, College of Electrical & Mechanical Engineering (CEME), National University of Sciences & Technology (NUST), Islamabad, Pakistan
{waseemanwar,farooq,muazzamak,wasi}@ceme.nust.edu.pk

Abstract. Wireless Sensor Networks (WSNs) comprise several sensor nodes that work under certain operational constraints. Traditional software development approaches do not perform well when dealing with the complexity and real-time properties of WSNs. Consequently, Model Driven Architecture (MDA) is commonly applied to WSNs to verify the required system constraints in the early development phases. As MDA is a highly suitable development approach for WSNs, there is a strong need to explore and summarize the latest MDA trends in this field. Therefore, this article performs a Systematic Literature Review (SLR) that identifies 27 research studies published during 2013–2018. The identified studies are classified into four MDA categories and five WSN groups. Moreover, 24 existing tools are identified and organized into Model-driven (10), WSN-related (9) and Other (5) groups. Furthermore, 12 tools developed by researchers through the combination of MDA and WSN concepts are presented. In addition, MDA-based algorithms (2) and protocols (2) for WSNs are presented. Finally, a comparative analysis of the proposed tools is performed to analyze the benefits and limitations of MDA for WSNs. It is concluded that major MDA attributes such as reusability and early design verification are fully exploited in the domain of WSNs. However, it is always challenging to choose the right modeling and transformation approaches due to the diverse characteristics of WSNs.

Keywords: WSN · MDA · Tools · Model-driven · Wireless networks

© Springer Nature Switzerland AG 2020
K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 14–27, 2020. https://doi.org/10.1007/978-3-030-12388-8_2

1 Introduction

Wireless Sensor Networks (WSNs) typically consist of various distributed sensors that are interconnected to forward particular data, such as temperature or pressure, about an environment and/or machinery [1]. Technological improvements in domains like industrial automation [2] and healthcare [3] lead to the requirement of a WSN platform for data acquisition. Consequently, WSNs are becoming an integral part of a variety of domains such as avionics and industrial control systems. As each domain has its own data acquisition needs, it is essential to tailor the basic characteristics of a WSN to meet the particular requirements. For example, one general requirement is to reduce the energy consumption of sensor nodes due to limited batteries. Similarly, another significant requirement is to detect sensor node failures and take corrective measures. Such diverse characteristics and requirements of WSNs are difficult to manage through traditional software development approaches. This leads to the application of Model Driven Architecture (MDA) in WSNs.

MDA is a modern software development approach whose primary focus is to provide a higher abstraction level for requirement specification through a modeling phase [4]. Subsequently, the target model is produced through a model transformation phase, i.e. Model-to-Model (M2M) and/or Model-to-Text (M2T). Finally, the system design/constraints are verified in the early development phases through the created target model. MDA is frequently applied in different domains like embedded systems [5] and demonstrates promising outcomes. MDA has typical features that are highly supportive of the development of WSNs.

As MDA is commonly applied in WSNs, there is a strong need to explore and summarize the latest trends in this regard. This helps to identify the targeted WSN areas where modern MDA approaches have been applied so far. Moreover, it facilitates the selection of the right MDA approaches for a particular WSN area. Furthermore, it enables practitioners and researchers to make the correct tool selection decisions as per their requirements. Therefore, this article performs a Systematic Literature Review (SLR) [6] to answer the following research questions:
RQ1: What are the significant model-driven approaches that have been utilized in the domain of WSNs during 2013–2018?
RQ2: What are the available tools for the MDA-based development of WSNs?
RQ3: What are the leading tools, protocols and algorithms that have been proposed/offered by researchers for the MDA-based development of WSNs?
RQ4: What are the key benefits and limitations of utilizing model-driven approaches in the domain of WSNs?

We select four highly renowned repositories (i.e. IEEE [7], ACM [8], Springer [9] and Elsevier [10]) to perform this study. A review protocol (Sect. 2) is developed that includes selection rules (Sect. 2.1) for the search process (Sect. 2.2). Consequently, we identify 27 studies [11–37] to achieve the objective of this SLR, as shown in Fig. 1. In particular, a dual categorization of the selected studies is made with respect to four MDA approaches and five WSN areas (Sect. 3.1). This leads to the identification of 24 existing tools (Sect. 3.2), i.e. Model-driven (10), WSN-related (9) and Other (5). Moreover, 12 tools are presented that are proposed in the selected studies for the MDA-based development of WSNs. Furthermore, the application of model transformation approaches and UML diagrams (i.e. 4 structural, 3 behavioral and 2 constraint specification diagrams) is investigated (Sect. 3.3) for WSNs. Finally, a comparative analysis of the proposed tools is performed to investigate the benefits and limitations of MDA for WSNs. The answers to the RQs and the discussion are given in Sect. 4, and concluding remarks are given in Sect. 5.


Fig. 1. Research Summary

2 Review Protocol

This section summarizes the details of the review protocol, as given in the subsequent sections.

2.1 Selection Rules

We define three rules for the selection of research studies. These rules not only help to answer the RQs (Sect. 1) but also ensure high-quality outcomes. The rules are:
(1) A research study can be selected only if MDA is used as the development approach for WSNs.
(2) A research study can be selected only if it is published in one of the following renowned scientific repositories: Springer, IEEE, Elsevier or ACM.
(3) Only research studies published during 2013–2018 can be selected.
A research study is selected only if it satisfies all three selection rules. Violating even a single rule leads to rejection; e.g. a study that satisfies the first two rules but was published before 2013 is rejected.
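The three selection rules act as a conjunctive filter over candidate studies. The following Python sketch illustrates this under stated assumptions: the `Study` record, its field names and the helper functions are hypothetical and not part of the review protocol, and the rule (1) flag stands for the outcome of the manual assessment of a study.

```python
from dataclasses import dataclass

# Repositories named in selection rule (2).
ALLOWED_REPOSITORIES = {"Springer", "IEEE", "Elsevier", "ACM"}
# Publication window from selection rule (3), inclusive on both ends.
FIRST_YEAR, LAST_YEAR = 2013, 2018


@dataclass(frozen=True)
class Study:
    """Hypothetical record for one candidate study (field names are assumptions)."""
    title: str
    repository: str
    year: int
    uses_mda_for_wsn: bool  # outcome of the manual rule (1) assessment


def violated_rules(study):
    """Return the numbers of the selection rules that the study violates."""
    violated = []
    if not study.uses_mda_for_wsn:
        violated.append(1)
    if study.repository not in ALLOWED_REPOSITORIES:
        violated.append(2)
    if not FIRST_YEAR <= study.year <= LAST_YEAR:
        violated.append(3)
    return violated


def is_selected(study):
    """A study is selected only if it violates none of the three rules."""
    return not violated_rules(study)
```

For instance, `violated_rules(Study("Example", "IEEE", 2012, True))` reports only rule 3, which mirrors the rejection example given above for a study published before 2013.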

2.2 Search Process

We perform the search process on the basis of the well-defined selection rules. We consider only the four recognized repositories given in the second selection rule (Sect. 2.1). We apply the year filter (i.e. 2013–2018) during the search process to enforce the third selection rule. We apply combinations of different search terms to get the optimum search results. The results are summarized in Table 1.

Table 1. Summary of search results

Sr.#  Search query                  Springer  IEEE  Elsevier  ACM
1     MDA WSN                       64        8     49        528
2     Model-driven WSN              138       24    437       789
3     UML WSN                       154       9     79        432
4     MDA wireless sensor           286       18    97        1054
5     Model-driven wireless sensor  688       77    945       1843

We use different search queries and obtain the corresponding search results from each repository, as shown in Table 1. The broader search queries (e.g. Model-driven wireless sensor) return thousands of search results, which cannot all be fully evaluated against the first selection rule. Therefore, we apply advanced filters, such as journal selection (Elsevier), to further reduce the search results. To evaluate the first selection rule, we perform the steps shown in Fig. 2.

Fig. 2. Steps for selection of research studies
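The selection steps amount to a screening funnel in which each step rejects part of the remaining candidates. As a minimal sketch, the following Python snippet replays the counts reported in this section (3248 initial results, then 1912, 953 and 287 rejections, with 27 studies finally selected). The function and variable names are illustrative assumptions.

```python
# Counts reported in Sect. 2.2 of this review.
INITIAL_RESULTS = 3248
REJECTIONS = [
    ("irrelevant title", 1912),
    ("abstract violating rule 1", 953),
    ("reading of different sections", 287),
]
SELECTED = 27


def screening_funnel(initial, rejections):
    """Return (reason, remaining) pairs after each rejection step."""
    remaining = initial
    stages = []
    for reason, rejected in rejections:
        remaining -= rejected
        stages.append((reason, remaining))
    return stages


stages = screening_funnel(INITIAL_RESULTS, REJECTIONS)
for reason, remaining in stages:
    print(f"after rejection on {reason}: {remaining} remain")

# Studies that reach the final, thorough investigation step.
fully_investigated = stages[-1][1]
rejected_last = fully_investigated - SELECTED
print(f"fully investigated: {fully_investigated}, selected: {SELECTED}, "
      f"rejected in the last step: {rejected_last}")
```

The subtraction leaves 96 studies for thorough investigation, which agrees with the figure stated in the text.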


Overall, we consider 3248 search results for evaluation against the first selection rule. We reject 1912 search results on the basis of an irrelevant study title. Furthermore, we reject 953 studies after reading the abstract, as the first selection rule is clearly violated. Subsequently, we reject 287 studies after reading different sections. Finally, we thoroughly investigate the remaining 96 studies and select 27 studies that satisfy the selection rules (Sect. 2.1).

2.3 Quality Assessment

We select prominent scientific repositories that are recognized for publishing high-quality research studies. This supports the quality of the outcomes of this study because the results of the selected studies are reliable. We also try to consider the latest studies as much as possible. The year-wise distribution of the selected studies is given in Table 2.

Table 2. Distribution of selected studies according to publication year

Sr.#  Publication year  Selected studies                      Total
1     2013              [11, 12, 14, 15, 22, 23, 25, 29, 36]  9
2     2014              [16, 18, 21, 26]                      4
3     2015              [17, 27, 34]                          3
4     2016              [19, 24, 28, 30–33, 35]               8
5     2017              [13, 20]                              2
6     2018              [37]                                  1

2.4 Data Extraction

We define various data extraction and synthesis parameters to extract/analyze the desired information from the selected studies, as given in Table 3.

Table 3. Data extraction/synthesis parameters

Sr.#  Parameter                        Specifics
1     Common info                      Authors' names, title of study, publisher, year of publication
Data extraction
2     Summary of study                 Purpose, impact, importance and findings
3     Limitations                      Assumptions (if any)
4     Proof-of-concept                 Experimental evaluation or other proof-of-concept methods
Synthesis of data
5     Categorization                   Categorization of selected studies w.r.t. model-driven approaches (Table 4) and wireless sensor network areas (Table 5)
6     Tools                            (1) Existing model-driven, WSN and other tools that have been utilized in the selected studies (Table 6); (2) tools proposed by researchers (Table 7)
7     Algorithms and protocols         Proposed algorithms and protocols in the selected studies (Table 7)
8     UML diagrams                     The static and dynamic UML diagrams used in the selected studies to model WSN requirements (Table 8)
9     Model transformation approaches  Model transformation approaches utilized in the selected studies, i.e. M2M, M2T or both (Table 9)

3 Results

This section provides the results obtained from the SLR as per the guidelines of the review protocol (Sect. 2). The details are given in the subsequent sections.

3.1 Categorization of Selected Studies

We selected 27 studies published during 2013–2018. First, we categorize the selected studies with respect to MDA approaches into four groups, as given in Table 4.

Table 4. Categorization of selected studies on the basis of MDA approaches

Sr.#  Category                         Studies
1     Standard UML Profile             [11, 25, 28, 34]
2     Customized UML-based extensions  [12, 13, 15, 20, 30]
3     DSML                             [17, 21, 29, 32, 36, 37]
4     Other                            [14, 16, 18, 19, 22–24, 26, 27, 31, 33, 35]

The definition of each category is as follows:
(1) Standard UML Profile: This category includes the selected studies where the standard UML profile and its extensions, like SysML and MARTE, are applied for WSNs.
(2) Customized UML-based extensions: This category includes the selected studies where the UML profile is extended to propose new customized profiles/meta-models for WSNs.
(3) DSML (Domain Specific Modeling Languages): This category includes the selected studies where novel DSMLs/meta-models are proposed for WSNs.
(4) Other: This category includes the selected studies where different models (e.g. Simulink models) and approaches are applied for WSNs.

Table 4 gives a clear picture of the latest MDA approaches for WSNs. However, it does not classify the particular WSN area in which the MDA approaches are applied. Therefore, we also classify the selected studies with respect to the particular WSN category, as given in Table 5. The description of each category is as follows:

Table 5. Categorization of selected studies on the basis of WSN areas

Sr.#  Category          Studies
1     Power management  [11, 16, 24–26, 31, 34]
2     WSAN              [13, 33]
3     VANET             [19, 32, 35]
4     WBSN              [23, 27, 36]
5     General           [12, 14, 15, 17, 18, 20–22, 28–30, 37]

(1) Power Management: This category includes the selected studies that deal with the power management issues (e.g. energy consumption) of WSNs.
(2) WSAN (Wireless Sensor and Actuator Networks): This category includes the selected studies that particularly target WSAN.
(3) VANET (Vehicular Ad hoc Network): This category includes the selected studies that particularly focus on various WSN aspects in VANET.
(4) WBSN (Wireless Body Sensor Network): This category includes the selected studies that particularly target WBSN.
(5) General: This category includes the selected studies that target broader WSN concepts like application development and verification of general WSN properties.

3.2 Tools

We distinguish the tools into two groups, i.e. existing tools used in the selected studies and proposed tools. We identify 24 existing tools, as given in Table 6, and categorize them into three groups, i.e. Model-driven (10), WSN-related (9) and Other (5). The purpose/function of each identified tool is also provided to give a brief overview. Some tools cannot be directly related to the model-driven or WSN-related groups but play an important role in the selected studies; we place all such tools in the Other category. We also identify 12 tools that are proposed in the selected studies for the MDA-based development of WSNs. It is important to note that a few studies propose a novel framework without giving it a name. We summarize such unnamed frameworks in serial no. 2 of Table 7. In addition to the proposed tools, we also identify model-based algorithms (2) and protocols (2) for WSNs. The summary of proposed tools, algorithms and protocols is given in Table 7.

Table 6. Existing tools used in the selected studies

Sr.#  Tool/framework   Purpose                       Studies
Model-driven
1     ModelicaML       Modeling and verification     [11, 25]
2     Topcased         Modeling and verification     [11, 25]
3     ATL              Model transformation          [11, 13, 17, 21, 30]
4     Acceleo          Model transformation          [12, 17, 20, 21, 30]
5     XText            Model transformation          [24]
6     MagicDraw        Modeling                      [12, 34]
7     Papyrus          Modeling                      [13, 30]
8     GEF              Modeling editor development   [17]
9     StateFlow tool   Modeling and verification     [27]
10    Matlab Simulink  Modeling and verification     [19, 23]
WSN-related
11    TinyOS           OS/platform                   [13, 16, 21, 22, 30, 34]
12    TinyDDS          WSN middleware for app dev.   [14]
13    TRMSim-WSN       Simulator for trust           [20]
14    TikiriDB         Database abstraction layer    [21]
15    Contiki          OS/platform                   [24, 30]
16    NS-2             Simulator                     [26]
17    OMNeT++          Simulation framework          [31]
18    VEINS            Simulation framework          [32]
19    Agilla           WSN middleware mobile agents  [34]
Other
20    UPPAAL           Formal verification           [20]
21    VDM-SL Toolbox   Formal verification           [33]
22    CPN tools        Formal verification           [35]
23    Blender          3D toolkit                    [23]
24    Flux             Eclipse framework for cloud   [30]
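The grouping in Table 6 can also be held as a small lookup structure, which makes it easy to see which tools a given study combines. The sketch below transcribes the table rows into Python; the variable and function names are illustrative assumptions, and the snippet only restates the table rather than adding new findings.

```python
from collections import Counter

# (tool, group, citing studies) transcribed from Table 6.
TOOLS = [
    ("ModelicaML", "Model-driven", [11, 25]),
    ("Topcased", "Model-driven", [11, 25]),
    ("ATL", "Model-driven", [11, 13, 17, 21, 30]),
    ("Acceleo", "Model-driven", [12, 17, 20, 21, 30]),
    ("XText", "Model-driven", [24]),
    ("MagicDraw", "Model-driven", [12, 34]),
    ("Papyrus", "Model-driven", [13, 30]),
    ("GEF", "Model-driven", [17]),
    ("StateFlow tool", "Model-driven", [27]),
    ("Matlab Simulink", "Model-driven", [19, 23]),
    ("TinyOS", "WSN-related", [13, 16, 21, 22, 30, 34]),
    ("TinyDDS", "WSN-related", [14]),
    ("TRMSim-WSN", "WSN-related", [20]),
    ("TikiriDB", "WSN-related", [21]),
    ("Contiki", "WSN-related", [24, 30]),
    ("NS-2", "WSN-related", [26]),
    ("OMNeT++", "WSN-related", [31]),
    ("VEINS", "WSN-related", [32]),
    ("Agilla", "WSN-related", [34]),
    ("UPPAAL", "Other", [20]),
    ("VDM-SL Toolbox", "Other", [33]),
    ("CPN tools", "Other", [35]),
    ("Blender", "Other", [23]),
    ("Flux", "Other", [30]),
]

# Number of tools per group.
group_sizes = Counter(group for _, group, _ in TOOLS)


def tools_used_by(study):
    """Return the names of all Table 6 tools used in the given study."""
    return sorted(name for name, _, studies in TOOLS if study in studies)


# Tool cited by the largest number of selected studies.
most_used = max(TOOLS, key=lambda row: len(row[2]))[0]
```

Querying `tools_used_by(23)`, for example, returns Blender together with Matlab Simulink, i.e. a tool from the Other group combined with a model-driven tool.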

Table 7. Proposed tools, protocols and algorithms for WSN

Sr.#  Tool/framework                                              Study
1     ArchWiSeN (Architecture for WSAN) framework                 [13]
2     Unnamed framework                                           [12] [14] [17]
3     RWiN-environment                                            [20]
4     iLTLChecker                                                 [22]
5     BodySim                                                     [23]
6     SAMSON (Self-Adaptive Middleware for SensOr Networks)       [24]
7     Midgar                                                      [29]
8     COMFIT                                                      [30]
9     DSDVANET (Decentralized software defined VANET)             [32]
10    Agilla Modeling Framework (AMF)                             [34]
11    VeriSensor                                                  [36]
12    A4WSN                                                       [37]
Protocols
13    Time-triggered New Zigbee (TTNZ)                            [18]
14    Secure routing protocol                                     [19]
Algorithms
15    CApture Modeling Algorithm (CAMA)                           [26]
16    Subnet-based Failure Recovery Algorithm (SFRA)              [33]

3.3 UML Diagrams and Model Transformation Approaches

We identify 9 UML diagrams used to model the system structure (4), behavior (3) and properties (2) of WSNs, as shown in Table 8. It can be seen that the UML class diagram is commonly used to model the WSN structure. The behavior of a WSN is usually modeled through activity and state machine diagrams. Finally, the requirement and parametric diagrams are typically used to model WSN constraints/properties.

Table 8. UML diagrams utilized for WSN

Sr.#  Diagram                   Studies
System structure
1     Block definition diagram  [11, 25]
2     Class                     [13, 15, 28, 34]
3     Component                 [13]
4     Use case                  [28]
System behavior
5     State machine             [11, 13, 25]
6     Activity                  [12, 13, 28, 30, 34]
7     Sequence                  [28]
Properties
8     Requirement               [11, 25]
9     Parametric                [11, 25]

We also investigate the application of model transformation approaches in the domain of WSNs to generate the desired target models from the source models. The results are summarized in Table 9. It can be seen that Model-to-Text (M2T) is the most commonly used approach. Furthermore, M2M and M2T are also utilized together to perform complex transformations.

Table 9. Application of model transformation approaches for WSN

Sr.#  Transformation approach  Study
1     Model-to-Model (M2M)     [11]
2     Model-to-Text (M2T)      [12, 15, 20, 24]
3     Both (M2M and M2T)       [13, 17, 21, 30]
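To make the distinction between the two transformation types concrete, the following Python sketch chains a toy M2M step (platform-independent model to platform-specific model) with a toy M2T step (model to generated text). It is an illustration only: the two model classes, their fields and the generated pseudo-code are hypothetical and do not reproduce the metamodels or transformation tools (e.g. ATL or Acceleo) used in the selected studies.

```python
from dataclasses import dataclass


@dataclass
class SensorNodeModel:
    """Platform-independent source model (hypothetical metamodel)."""
    name: str
    sensed_quantity: str
    sampling_period_s: int


@dataclass
class PlatformNodeModel:
    """Platform-specific target model (hypothetical metamodel)."""
    component: str
    sensor_interface: str
    timer_period_ms: int


def model_to_model(src):
    """M2M: map source-model elements onto target-model elements."""
    return PlatformNodeModel(
        component=f"{src.name}C",
        sensor_interface=f"Read<{src.sensed_quantity}>",
        timer_period_ms=src.sampling_period_s * 1000,
    )


def model_to_text(model):
    """M2T: render the target model as text through a simple template."""
    return (
        f"module {model.component} {{\n"
        f"  uses interface {model.sensor_interface};\n"
        f"  // sample every {model.timer_period_ms} ms\n"
        f"}}\n"
    )


source = SensorNodeModel("TempNode", "temperature", 30)
platform_model = model_to_model(source)
print(model_to_text(platform_model))
```

Studies in the third row of Table 9 combine both kinds of step; studies in the first two rows use only one of them.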

4 Answers and Discussion

RQ1 Answer. We identify 27 research studies and categorize them into four leading model-driven approaches (Table 4). Moreover, the selected studies are also categorized according to the respective WSN area (Table 5). Furthermore, UML diagrams (Table 8) and model transformation approaches (Table 9) are presented to highlight the in-depth MDA applications for WSNs.

RQ2 Answer. We identify 24 existing tools (Table 6) and categorize them into three groups, i.e. Model-driven tools, WSN-related tools and Other tools. The function of each tool is also provided. The analysis shows that different types of tools are integrated with model-driven tools to provide WSN solutions. For example, Asare et al. [23] utilize the Blender 3D toolkit with Simulink to provide a model-driven solution for WBSN.

RQ3 Answer. We present 12 proposed tools, 2 algorithms and 2 protocols for the MDA-based development of WSNs, as given in Table 7. The analysis shows that most of the proposed tools are based on a combination of model-driven and WSN tools. In particular, the model-driven approaches and tools are used to specify WSN requirements. Furthermore, model transformation approaches are applied to generate the target WSN model. Finally, WSN tools are used for design verification by utilizing the target WSN model.

RQ4 Answer. The general benefits of MDA are: (1) ease of requirement specification (modeling), (2) target model generation and (3) early design verification. These benefits are directly related to other important factors like reusability and productivity. To analyze the pros and cons of MDA for WSNs, it is essential to investigate the major MDA benefits in the domain of WSNs. Therefore, we analyze the proposed tools (Table 7) with the following parameters:
(1) Modeling: This parameter evaluates the existence of a modeling activity in the proposed tools. It allows us to evaluate the "ease of requirement specification" MDA benefit for WSNs.
(2) Model Transformation (MT): This parameter evaluates the support for model transformation in the proposed tools. It allows us to evaluate the "target model generation" MDA benefit for WSNs.
(3) Verification Type (VT): Generally, there are two types of design verification, i.e. dynamic and formal [5]. This parameter evaluates the supported verification type (i.e. Dynamic Verification (DV), Formal Verification (FV), or both) in the proposed tools. It allows us to evaluate the "early design verification" MDA benefit for WSNs.
(4) Availability: This parameter evaluates the availability of the proposed tools, i.e. Public, No or Not Applicable (N-A). Although this parameter is not directly related to the objective of this comparison, it helps to perform further investigation of the proposed tools.

The analysis results are summarized in Table 10.

Table 10. Comparative analysis of proposed tools for key MDA parameters

Sr.#  Tool/framework          Modeling  MT   VT   Availability
1     ArchWiSeN               Yes       Yes  DV   Public (a)
2     Unnamed framework [12]  Yes       Yes  DV   No
      Unnamed framework [14]  Yes       Yes  FV   No
      Unnamed framework [17]  Yes       Yes  DV   No
3     RWiN-environment        Yes       Yes  FV   Public (b)
4     iLTLChecker             Yes       No   FV   Public (c)
5     BodySim                 Yes       Yes  DV   Public (d)
6     SAMSON                  Yes       Yes  DV   No
7     Midgar                  Yes       Yes  N-A  No
8     COMFIT                  Yes       Yes  DV   No
9     DSDVANET                Yes       Yes  DV   No
10    AMF                     Yes       Yes  DV   Public (e)
11    VeriSensor              Yes       Yes  FV   Public (f)
12    A4WSN                   Yes       Yes  DV   Public (g)

(a) http://www.consiste.dimap.ufrn.br/projects/archwisen/
(b) http://lisi-lab.wixsite.com/rwinproject#!services/c1739
(c) http://osl.cs.illinois.edu/software/iltl/
(d) http://wirelesshealth.virginia.edu/content/bodysim
(e) http://sealabtools.di.univaq.it/tools.php?tools_id=7
(f) https://pages.lip6.fr/Yann.Ben-Maissa/?N1=2&N2=0
(g) http://a4wsn.di.univaq.it/downloads.html
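The rows of Table 10 can be tallied mechanically per parameter. The short Python sketch below does so with the table transcribed as tuples; the tuple layout and variable names are assumptions made for illustration, and the three unnamed frameworks are counted as separate rows, as in the table.

```python
from collections import Counter

# (tool, modeling, model transformation, verification type, publicly available)
# transcribed from Table 10; DV = dynamic, FV = formal, N-A = not applicable.
TABLE_10 = [
    ("ArchWiSeN", True, True, "DV", True),
    ("Unnamed framework [12]", True, True, "DV", False),
    ("Unnamed framework [14]", True, True, "FV", False),
    ("Unnamed framework [17]", True, True, "DV", False),
    ("RWiN-environment", True, True, "FV", True),
    ("iLTLChecker", True, False, "FV", True),
    ("BodySim", True, True, "DV", True),
    ("SAMSON", True, True, "DV", False),
    ("Midgar", True, True, "N-A", False),
    ("COMFIT", True, True, "DV", False),
    ("DSDVANET", True, True, "DV", False),
    ("AMF", True, True, "DV", True),
    ("VeriSensor", True, True, "FV", True),
    ("A4WSN", True, True, "DV", True),
]

with_modeling = sum(row[1] for row in TABLE_10)
with_transformation = sum(row[2] for row in TABLE_10)
verification = Counter(row[3] for row in TABLE_10)
public_tools = [row[0] for row in TABLE_10 if row[4]]

print(f"modeling supported: {with_modeling} of {len(TABLE_10)} rows")
print(f"model transformation supported: {with_transformation} of {len(TABLE_10)} rows")
print(f"verification types: {dict(verification)}")
print(f"publicly available: {len(public_tools)} tools")
```

Under this transcription, every row supports modeling, all rows except iLTLChecker support model transformation, dynamic verification is the most frequent verification type, and seven tools are publicly available.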

The analysis shows that all tools support the MDA modeling activity and simplify the requirement specification process. Similarly, model transformation (MT) is supported by almost all tools to automatically generate the target WSN model. Furthermore, dynamic verification is commonly performed to verify the system design of WSNs. Finally, we find seven tools that are publicly available for further investigation/enhancement. From the detailed analysis (Table 10), it is concluded that MDA provides certain benefits, like ease of requirement specification (modeling), target model generation and early design verification, in the domain of WSNs. Consequently, MDA simplifies the development process of WSN systems. The analysis also reveals a few limitations of applying MDA to WSNs. For example, it is difficult to develop a generic modeling approach that covers all major aspects within a particular WSN area; e.g. different modeling approaches exist to target the power consumption issues of WSNs.


Consequently, the selection of the right modeling and target verification approaches is always important when utilizing MDA for WSNs. In this context, there is a strong need to develop large-scale model-driven solutions that manage diverse WSN characteristics within a single framework. A typical example of such a framework is A4WSN, where modeling capabilities are provided through three languages to represent several WSN requirements together, e.g. the software, hardware and deployment characteristics of WSNs. In addition, a complete transformation solution is provided to generate target WSN code from source models. Although we completely follow the SLR guidelines [6] to carry out this study, there are still a few unavoidable limitations. For example, we consider four well-known repositories for the selection of studies, so some relevant studies from other sources, like Google Scholar and Wiley, may have been missed. However, we expect that such unexploited sources do not significantly affect the outcomes of this SLR because the selected repositories are widely accepted.

5 Conclusions and Future Work

This article investigates the applications of Model Driven Architecture (MDA) for Wireless Sensor Networks (WSNs). A Systematic Literature Review (SLR) is performed to select 27 research studies published during 2013–2018. Subsequently, a dual categorization of the selected studies is made with respect to four MDA approaches and five WSN areas. The analysis of the selected studies leads to the identification of 24 existing tools, i.e. Model-driven (10), WSN-related (9) and Other (5). Moreover, 12 tools are presented that are proposed in the selected studies for the MDA-based development of WSNs. Furthermore, algorithms (2) and protocols (2) are identified in the given research context. In addition, the application of model transformation approaches and UML diagrams (i.e. 4 structural, 3 behavioral and 2 constraint specification diagrams) is investigated for WSNs. Finally, a comparative analysis of the proposed tools is performed to investigate the benefits and limitations of MDA for WSNs. It is concluded that major MDA attributes such as reusability and early design verification are fully exploited in the domain of WSNs. However, it is always challenging to choose the right modeling and transformation approaches due to the diverse characteristics of WSNs. This study presents modern MDA trends for WSNs and can be extended in multiple directions. For example, one interesting direction is a comparative analysis of the identified modeling approaches (e.g. UML extensions, DSML) to determine the best-suited requirement specification methods for a particular WSN area, e.g. power consumption or WBSN.

References

1. Engel, A., Koch, A.: Hardware-accelerated data compression in low-power wireless sensor networks. LNCS, vol. 8405, pp. 167–178. Springer (2014)
2. Flammini, A., Sisinni, E.: Wireless sensor networks for distributed measurements in process automation. LNEE, vol. 268, pp. 317–320. Springer (2014)
3. Dmitriev, A.S., Ryzhov, A.I., Lazarev, V.A., Malyutin, N.V., Mansurov, G.K., Popov, M.G.: Experimental ultrawideband wireless sensor network for medical applications. J. Commun. Technol. Electron. 60(9), 1027–1036 (2015)
4. Rashid, M., Anwar, M.W., Khan, A.M.: Towards the tools selection in model based system engineering for embedded systems - a systematic literature review. JSS 106, 150–163 (2015)
5. Anwar, M.W., Rashid, M., Azam, F., Kashif, M.: Model-based design verification for embedded systems through SVOCL: an OCL extension for SystemVerilog. J. Des. Autom. Embedded Syst. 21(1), 1–36 (2017)
6. Kitchenham, B.: Procedures for Performing Systematic Reviews, TR/SE-0401/NICTA, Technical report 0400011T, Keele University (2004)
7. IEEE scientific database. http://ieeexplore.ieee.org/. Accessed July 2015
8. ACM. http://dl.acm.org/. Accessed July 2015
9. Springer. http://link.springer.com/. Accessed July 2015
10. Elsevier. http://www.sciencedirect.com/. Accessed July 2015
11. Berrani, S., Hammad, A., Mountassir, H.: Mapping SysML to modelica to validate wireless sensor networks non-functional properties. In: IEEE (ISPS) (2013)
12. Di Marco, A., Pace, S.: Model-driven approach to Agilla agent generation. In: 9th (IWCMC) (2013)
13. Rodrigues, T., Delicato, F.C., Batista, T., Pires, P.F., Pirmez, L.: An approach based on the domain perspective to develop WSAN applications. SoSym (4), 949–977. Springer (2017)
14. Boonma, P., Somchit, Y., Natwichai, J.: A model-driven engineering platform for wireless sensor networks. In: Eighth International Conference on 3PGCIC. IEEE (2013)
15. Paulon, A.R., Frohlich, A.A., Becker, L.B., Basso, F.P.: Model-driven development of WSN applications. In: 3rd (SBESC). IEEE (2013)
16. Potsch, T., Pei, L., Kuladinithi, K., Goerg, C.: Model-driven data acquisition for temperature sensor readings in wireless sensor networks. In: IEEE 9th ISSNIP (2014)
17. Tei, K., Shimizu, R., Fukazawa, Y., Honiden, S.: Model-driven-development-based stepwise software development process for wireless sensor networks. IEEE TSMCS 45, 675–687 (2014)
18. Ro, J.W., Bhatti, Z.E., Roop, P.S.: A model-driven approach with synchronous semantics for developing hard real-time WSNs. IEEE (ETFA) (2014)
19. Maxa, J.A., Mahmoud, M.S., Larrieu, N.: Joint model-driven design and real experiment-based validation for a secure UAV ad hoc network routing protocol. In: Integrated Communications Navigation and Surveillance (ICNS). IEEE (2016)
20. Grichi, H., Mosbahi, O., Khalgui, M., Li, Z.: RWiN: new methodology for the development of reconfigurable WSN. IEEE Trans. ASE 14(1), 109–125 (2017)
21. Shimizu, R., Tei, K., Fukazawa, Y., Honiden, S.: Toward a portability framework with multi-level models for wireless sensor network software. In: SMARTCOMP. IEEE (2014)
22. Kwon, Y., Agha, G.: Performance evaluation of sensor networks by statistical modeling and euclidean model checking. ACM TSN 9(4), 39 (2013)
23. Asare, P., Dickerson, R.F., Wu, X., Lach, J., Stankovic, J.A.: BodySim: a multi-domain modeling and simulation framework for body sensor networks research and design. In: Proceedings of the 8th Body Area Networks Conference, pp. 177–180. ACM (2013)
24. Jesus, M.T., Flavia, C.D., Paulo, F.P., Taniro, C.R., Thais, V.B.: SAMSON: self-adaptive middleware for wireless sensor networks. In: Proceedings of the 31st Applied Computing, pp. 1315–1322. ACM (2016)
25. Hammad, A., Mountassir, H., Chouali, S.: An approach combining SysML and modelica for modelling and validate wireless sensor networks. In: Software Engineering for SoS, pp. 5–12. ACM (2013)
26. Dezfouli, B., Radi, M., Whitehouse, K., Razak, S.A., Tan, H.P.: CAMA: efficient modeling of the capture effect for low-power wireless networks. ACM TSN 11(1), 20 (2014)
27. Sayyah, P., et al.: Virtual platform-based design space exploration of power efficient distributed embedded applications. ACM TECS 14(3), 49 (2015)
28. Uke, S., Thool, R.: UML based modeling for data aggregation in secured wireless sensor network. Procedia Comput. Sci. 78, 706–713 (2016)
29. García, C.G., G-Bustelo, B.C., Espada, J.P., Cueva-Fernandez, G.: Midgar: generation of heterogeneous objects interconnecting applications. A domain specific language proposal for internet of things scenarios. J. Comput. Netw. 64, 143–158 (2014)
30. de Farias, C.M., et al.: COMFIT: a development environment for the internet of things. FGCN 75, 128–144 (2016)
31. Snajder, B., Jelicic, V., Kalafatic, Z., Bilas, V.: Wireless sensor node modelling for energy efficiency analysis in data-intensive periodic monitoring. Ad Hoc Nets 49, 29–41 (2016)
32. Kazmi, A., Khan, M.A., Bashir, F., Saqib, N.A., Alam, M., Alam, M.: Model driven architecture for decentralized software defined VANETs. LNCS, vol. 185, pp. 46–56. Springer (2016)
33. Afzaal, H., Zafar, N.A.: Formal analysis of subnet-based failure recovery algorithm in wireless sensor and actor and network. Complex Adaptive System Modeling, pp. 4–27. Springer (2016)
34. Berardinelli, L., Di Marco, A., Pace, S., Pomante, L., Tiberti, W.: Energy consumption analysis and design of energy-aware WSN agents in fUML. LNCS, vol. 9153, pp. 1–17 (2015)
35. Hussain, S.A., Khan, N.A., Sadiq, A., Ahmad, F.: Simulation, modeling and analysis of master node election algorithm based on signal strength for VANETs through colored petri nets. Neural Comput. Appl., 1–17 (2016)
36. Maissa, Y.B., Kordon, F., Mouline, S., Thierry-Mieg, Y.: Modeling and analyzing wireless sensor networks with VeriSensor. LNCS, vol. 8100, pp. 24–27. Springer (2013)
37. Malavolta, I., Mostarda, L., Muccini, H., et al.: A4WSN: an architecture-driven modelling platform for analysing and developing WSNs. Softw. Syst. Model., 1–21 (2018)

Real Time Multiuser-MIMO Beamforming/Steering Using NI-2922 Universal Software Radio Peripheral

Aliyu Buba Abdullahi1,2,3, Rafael F. S. Caldeirinha1,2, Akram Hammoudeh1,2, Leshan Uggalla1, and Jon Eastment1,4

1 Wireless and Optoelectronic Research Group (WORIC), University of South Wales, Cardiff, UK [email protected]
2 Instituto de Telecomunicações (IT), Delegação de Leiria, ESTG, Polytechnic Institute of Leiria, Leiria, Portugal
3 Electrical and Electronics Engineering, School of Engineering, The Federal Polytechnic Mubi, PMB 35, Mubi, Adamawa, Nigeria
4 Science and Technology Facilities Council, Swindon, UK

Abstract. Exponential growth in wireless service subscriptions, and the corresponding predicted growth in data traffic, threatens to overwhelm the current 4G system. This has triggered research into the next generation wireless system (5G), aimed at improving the system data rate and the overall network capacity. In addition to high data rates, other key features essential for successful system deployment have been proposed and addressed within the 5G framework using various disruptive technologies. These include millimeter-wave spectrum, massive MIMO, beamforming/beamsteering, and the Heterogeneous Network. These features in turn prompt a re-examination of the system design and performance trade-offs relative to the existing 4G wireless system. This article presents the system design, implementation, and hardware prototyping of 5G beamforming/beamsteering. The hardware prototype uses National Instruments Software Defined Radios (NI USRP 2922) and an antenna array for system performance analysis. The results obtained show improved performance as the number of antennas increases.

Keywords: 5G · SDR · MIMO · Beamforming/beamsteering · EVM · BER · USRP

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 28–50, 2020. https://doi.org/10.1007/978-3-030-12388-8_3

1 Introduction

In recent years, there has been exponential growth in wireless service subscriptions, resulting in over 7 billion subscriptions worldwide [1]. Subscribers' demand for higher data rates, driven by this growth, is another fundamental challenge and causes significant data traffic in 4G mobile and wireless networks [2, 3]. Meeting these demands calls for an approach that can easily adapt to the growth in users and their data rate demands over time, hence the next generation (5G) wireless system. Various disruptive technologies have been proposed towards a 5G framework to expand the network capacity significantly beyond that of current 4G systems and so meet growing subscriber requirements [4–7]. These technologies include the new millimeter-wave spectrum, multi-antenna MIMO (massive MIMO), beamforming/beamsteering, and the Heterogeneous Network (HetNet) architecture. However, before deployment, it is very important to test the system in practice with a design that uses these technologies as proposed.

As one of the key enabling features of the 5G system, MIMO has a reputation for enhancing wireless system performance, including spectral efficiency, capacity, and bit error rate (BER) link reliability, and has therefore been widely adopted in almost all 4G wireless standards. Beyond performance improvement, multi-antenna MIMO with large antenna arrays (massive MIMO) can form and control the antenna radiation pattern; the process of combining the signals of the array elements and focusing their radiation on the desired location is referred to as beamforming. As one of the envisioned 5G technologies, and particularly given the propagation characteristics of the proposed spectrum and other related wireless channel effects, beamforming/beamsteering using specific array structures, signal processing, and additional sensitivity control feedback can further improve the receiver signal quality and minimise interference.

Advances in digital signal processing (DSP) design have led to system implementation in the digital domain, although alternative approaches, e.g. analogue and hybrid, have also been presented to demonstrate beamforming/beamsteering [8, 9]. In both cases, the transmitting elements rely on either a switched-beam antenna array with phase shifters or a baseband DSP algorithm to control the antenna array radiation adaptively. Beamforming/beamsteering using array antennas provides high resolution with precise radiation pattern control, which improves the receiver signal quality, the Error Vector Magnitude (EVM), the received signal power, the system BER, etc.

Performance improvement increases with the number of elements in the antenna array. The technique also makes it possible for receivers to operate with much less power than in a conventional system. Furthermore, most assessments in the literature rely on simulation with propagation models that are generally limited by their assumptions; it is therefore very important to test the system design in a hardware testbed scenario. The hardware prototype design, implementation, and performance evaluation presented here use a multiuser MIMO system with an antenna array, built on the National Instruments Universal Software Radio Peripheral (NI USRP™-2922) software defined radio, for real-time validation of results. The NI USRP software defined radio is a plug-and-play platform that offers solutions for wireless system design, implementation, and prototyping, and is therefore used by various research groups [10–12].

2 Beamforming/Beamsteering

As Fig. 1 shows, beamforming creates a radiation pattern using either switched antenna arrays or adaptive antenna arrays. Phased-array beamforming has a set of pre-defined patterns through which the beam is directed to the receiver. Its major disadvantage is that the beam may sometimes point in a direction that is not exactly what the user requires. Adaptive antenna array beamforming, on the other hand, can adjust the directivity in real time by modifying the radiation pattern as required via an adaptive beamforming algorithm. As the most appropriate beamforming technique for mobile applications, adaptive beamforming uses either the entire array or subarrays, with individual antenna weights, to produce the desired directivity [13–15]. Beamsteering then continuously samples the input of each antenna element, combines the samples, and updates the weights via feedback so that the beam points in the direction of the receiver at all times. Beamforming with beamsteering usually relies on an array of at least four to eight antennas [15].

Fig. 1. Antenna arrays beamforming techniques

Beamsteering, on the other hand, uses an analogue, digital, or hybrid technique to control the beam. The analogue technique uses phase shifters to change the relative phase of the RF signals driving the elements, whereas the digital counterpart uses a DSP algorithm so that the power amplifier output phases are controlled digitally and individually through weighting of the quadrature phases. It is worth noting that prototyping with the NI-2922 can only use the digital technique via DSP because of certain radio limitations: the radio has a single embedded daughterboard containing a software-defined RF front-end (including the power amplifier) [16]. Beamforming prototyping, particularly with USRP radios, depends on the radio sampling frequency (clock rate). For digital beamsteering with these USRPs, the steering angle is limited by the motherboard clock rate, as confirmed in [17].

Now consider a narrowband delay-sum beamformer. Assuming $T = \tau$ (i.e. the channels are all time aligned) for all signals from direction $\theta$, the sum of the beamformer weights $w_m$ is the gain in the direction $\theta$, and the gain is considerably less in other directions owing to incoherent addition. Hence, the received signal with respect to the beamforming weights is represented mathematically as

$$y(t) = \sum_{m=0}^{M-1} w_m \, x_m\big(t - [M - m - 1]\,T\big) \tag{1}$$

where $T$ is the delay step that sets the minimum angle to which the beam can be steered digitally. Assuming an ideal wireless channel, $H_{ij} = h(t) = \delta(t)$, and applying a complex weight vector $\mathbf{w} = [w_0, w_1, w_2, \ldots, w_{M-1}]$, the signal received at the second antenna is a phase-shifted version of the signal received at the first antenna, and similarly for each subsequent antenna as the array grows to $M$ elements. The array response to a signal arriving from direction $\theta$ (the steering vector), $\mathbf{a}(\theta)$, is expressed as

$$\mathbf{a}(\theta) = \left[\,1,\; e^{j\frac{2\pi}{\lambda} d \sin\theta},\; \ldots,\; e^{j\frac{2\pi}{\lambda}(M-1)\, d \sin\theta}\,\right]^{T} \tag{2}$$

where $d$ is the element spacing and $\lambda$ is the wavelength.

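The received signal vector $\mathbf{x}(t)$ can be expressed as

$$\mathbf{x}(t) = \mathbf{a}(\theta)\,x(t) + \mathbf{n}(t) \tag{3}$$

so that the beamformer output is

$$y(t) = \mathbf{w}^{H}\mathbf{x}(t) = \mathbf{w}^{H}\mathbf{a}(\theta)\,x(t) + \mathbf{w}^{H}\mathbf{n}(t) \tag{4}$$

which, neglecting the noise term, reduces to

$$y(t) = \mathbf{w}^{H}\mathbf{a}(\theta)\,x(t) \tag{5}$$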
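Equations (2) to (5) can be checked numerically with a short sketch. The Python below is illustrative only and is not the authors' Matlab implementation: the function names, the half-wavelength element spacing, the 30° steering angle and the test symbol are all assumptions made for the example. It builds the steering vector of Eq. (2), applies matched (delay-sum) weights, and evaluates the noise-free output of Eq. (5) toward and away from the steering direction.

```python
import cmath
import math

def steering_vector(theta_deg, num_elements, spacing_wavelengths=0.5):
    """Eq. (2): a(theta) for a uniform linear array, spacing given as d / lambda."""
    phase_step = 2 * math.pi * spacing_wavelengths * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * m * phase_step) for m in range(num_elements)]

def beamformer_output(weights, snapshot):
    """y = w^H x: conjugate each weight, multiply element-wise, and sum."""
    return sum(w.conjugate() * x for w, x in zip(weights, snapshot))

M = 8              # assumed number of array elements
steer_deg = 30.0   # assumed steering direction
# Matched (delay-sum) weights, scaled so the gain toward steer_deg is 1.
w = [a / M for a in steering_vector(steer_deg, M)]

symbol = 1 + 1j    # one transmitted symbol x(t), noise-free as in Eq. (5)
on_target = beamformer_output(w, [a * symbol for a in steering_vector(steer_deg, M)])
off_target = beamformer_output(w, [a * symbol for a in steering_vector(-20.0, M)])

print(abs(on_target) / abs(symbol))   # coherent addition: gain close to 1
print(abs(off_target) / abs(symbol))  # incoherent addition: gain well below 1
```

The same two functions can be reused with any weight vector, for example weights produced by one of the adaptive algorithms, to inspect the gain in a given direction.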

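Let the USRP hardware sampling frequency be $\beta$ Hz, so that the sampling period $T = 1/\beta$ is the minimum value of the delay $\tau_o$. Then

$$\tau_o = \frac{d}{C}\,\sin\theta_o \;\Longleftrightarrow\; \theta_o = \sin^{-1}\!\left(\frac{\tau_o\,C}{d}\right)$$

where $C$ is the propagation speed and $d$ is the element spacing.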
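The dependence of the minimum steering angle on the sampling frequency can be made concrete with a small calculation. The sketch below simply evaluates the relation above; the function name, the propagation speed default and the element spacing are hypothetical values chosen to keep the arithmetic easy to follow, and they are not taken from the testbed.

```python
import math

def min_steering_angle_deg(sample_rate_hz, spacing_m, speed_m_s=3.0e8):
    """theta_o = asin(tau_o * C / d) with tau_o = 1 / sample_rate_hz.

    Returns None when tau_o * C / d exceeds 1, i.e. when one sample of delay
    is already longer than the travel time across the element spacing.
    """
    tau_o = 1.0 / sample_rate_hz
    ratio = tau_o * speed_m_s / spacing_m
    if ratio > 1.0:
        return None
    return math.degrees(math.asin(ratio))

# Hypothetical numbers, for illustration only.
coarse = min_steering_angle_deg(100e6, 6.0)  # 10 ns delay step
fine = min_steering_angle_deg(200e6, 6.0)    # 5 ns delay step
print(coarse, fine)  # the faster clock gives the smaller minimum angle
```

Doubling the sampling frequency halves the delay step and therefore lowers the smallest angle that a pure sample-delay beamformer can reach.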

Since the beamformer delay depends on the sampling frequency, the available steering angle is restricted by the hardware motherboard clock rate. Hence, $T$ from Eq. (1) gives the minimum possible angle to which the beam can be steered digitally, and beamsteering below $\theta_o$ requires a much higher sampling frequency $\beta$. The radio can handle digital beamsteering down to a minimum of $\theta_o = 42^\circ$, since the radio main board supports sampling frequencies up to 100 MHz. The weighting function $w$ (which is independent of the input signal) controls the directivity over a range of angles so that the directivity gain remains the same for different steering angles; it is thus added to the carrier. In a multicarrier system, a delay of $\tau_d = \tau_o$ is added on the carrier, while the sampling frequency is chosen to be an integer multiple of the carrier frequency, so that the phase delay introduced is expressed as

$$\varphi_a(nT) = w_m\, x_m(nT - m\,\tau_d)$$

where $w_m$ is the carrier frequency weighting function.

System implementation using a large array (massive MIMO) faces challenges, e.g. complexity, space limitations, and deployment cost. A compromise between system complexity and performance therefore calls for another beamforming/beamsteering approach, called hybrid, as proposed in [15, 18]. The hybrid approach usually separates the array into subarrays and includes an RF switch with a phase shifter connected to each element so that only the phase at the element level is controlled. Note that the phase shifters are analogue components, whereas the adaptive beamforming weights can be applied digitally. In summary, the main functions of beamforming/beamsteering, irrespective of the adopted technique, include:

1. Generate timing and possible weighting factors for transmission.
2. Apply time delays and signal processing in reception.
3. Apply weighting factors and sum the delayed signals.
4. Possible additional signal processing.

3 Adaptive Beamforming Algorithm

An adaptive array beamforming system requires feedback via an adaptive algorithm to define the complex weight vector $\mathbf{w}$ used for radiation pattern optimisation, by minimising the error between the desired signal and the array output. In a multiuser MIMO system, the beamforming weights are specified as precoding, which is determined using statistics of the signal $x_i(t)$ arriving at the array. As detailed in [15], various methods can be used to compute the complex weight vector, including the Recursive Least Squares (RLS) algorithm, the Conjugate Gradient (CG) algorithm, and the Least Mean Square (LMS) algorithm, as summarised below. The adaptive beamforming weight vector $\mathbf{w}$ for each algorithm is stated as follows.

A. RLS adaptive beamforming algorithm weights:

$$\mathbf{w}(n) = \mathbf{w}(n-1) + \mathbf{K}(n)\,\xi^{*}(n), \qquad n = 0, 1, 2, \ldots \tag{6}$$

where $\mathbf{K}(n)$ is the gain vector and $\xi(n)$ is the a priori estimation error, given by

$$\xi(n) = d(n) - \mathbf{w}^{H}(n-1)\,\mathbf{x}(n)$$

B. CG adaptive beamforming algorithm weights:

$$\mathbf{w}(n+1) = \mathbf{w}(n) - \mu(n-1) + \mathbf{D}(n) \tag{7}$$

where $\mu(n-1)$ is the step size and $\mu$ is the gain constant that controls the rate of adaptation. The CG beamforming algorithm computes the beamforming weights with an accelerated convergence rate, whereas the convergence rate of the preceding methods is sensitive to the eigenvalue spread of the correlation matrix. The main goal of this technique is to search iteratively for the optimum solution by choosing conjugate paths for each new iteration; this produces orthogonal search directions that result in the fastest convergence.

C. LMS adaptive beamforming algorithm weights:

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\,\mathbf{x}(n)\left[d^{*}(n) - \mathbf{x}^{H}(n)\,\mathbf{w}(n)\right] \tag{8}$$
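A minimal sketch of the LMS recursion may help to show how the weights adapt. It uses the standard complex LMS form, w ← w + μ x e* with e = d − w^H x; the array size, user angle, step size and training symbols below are hypothetical and the channel is noise-free, so this is an illustration rather than the authors' implementation.

```python
import cmath
import math

def lms_update(weights, snapshot, desired, step):
    """One LMS step: w <- w + step * x * conj(e), with e = d - w^H x."""
    output = sum(w.conjugate() * x for w, x in zip(weights, snapshot))
    error = desired - output
    new_weights = [w + step * x * error.conjugate() for w, x in zip(weights, snapshot)]
    return new_weights, error

# Hypothetical training setup: 4-element half-wavelength array, user at 20 degrees.
M = 4
phase_step = math.pi * math.sin(math.radians(20.0))
a = [cmath.exp(1j * m * phase_step) for m in range(M)]

w = [0j] * M
errors = []
for n in range(200):
    s = cmath.exp(1j * math.pi / 2 * (n % 4))  # known unit-power training symbol
    x = [am * s for am in a]                    # noise-free snapshot a(theta) * s
    w, e = lms_update(w, x, s, step=0.05)
    errors.append(abs(e))

print(errors[0], errors[-1])  # the error magnitude shrinks as the weights adapt
```

With these values the error shrinks by a constant factor of 1 − μM per iteration, which also shows why the step size has to stay inside a limited range for the recursion to remain stable.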

Adaptive beamforming using LMS computes the weights recursively with simplicity and computational ease, as it does not require off-line gradient estimation or repetition of data, as is the case with other methods. If the system is an adaptive linear combiner with the input vector and desired response available at each iteration, the LMS adaptive algorithm is generally considered the best choice compared with the other adaptive algorithms. The step size is usually chosen inversely proportional to the power of the reference signal, although the algorithm is stable only for a limited range of step sizes. The Normalized Least Mean Square (NLMS) adaptive algorithm is an improved version that overcomes some of these drawbacks, as also presented in [15], e.g. sensitivity to input scaling, which makes it very hard to choose a rate that ensures the stability of the algorithm. As a variant of the LMS adaptive algorithm, the NLMS algorithm solves this by normalising with respect to the power of the input signal. The beamforming weights are updated using

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \frac{\mu_{opt}\,\mathbf{x}(n)\,e(n)}{c + \mathbf{x}^{T}(n)\,\mathbf{x}(n)} \tag{9}$$

where $\mu_{opt}$ is the optimal rate for the NLMS algorithm, which is equal to 1 and is independent of the input, and $c$ is a small positive value, usually selected to be very small compared with $\mathbf{x}^{T}(n)\,\mathbf{x}(n)$. For demonstration purposes, the RLS algorithm is used here, since it does not require any matrix inversion: the inverse correlation matrix is computed directly, and the weight computation requires only the reference signal and correlation matrix information.
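The point that RLS avoids explicit matrix inversion can be illustrated with a compact sketch of the recursion in Eq. (6). The code follows the textbook exponentially weighted RLS form; the forgetting factor, the initialisation P(0) = I/0.01, the array geometry and the training symbols are assumptions made for this example and are not values reported for the prototype.

```python
import cmath
import math

def rls_update(weights, inv_corr, snapshot, desired, forgetting=0.99):
    """One RLS step (Eq. 6): w(n) = w(n-1) + K(n) * conj(xi(n))."""
    size = len(weights)
    # P(n-1) x(n)
    px = [sum(inv_corr[i][j] * snapshot[j] for j in range(size)) for i in range(size)]
    # Gain vector K(n) = P x / (forgetting + x^H P x)
    denom = forgetting + sum(snapshot[i].conjugate() * px[i] for i in range(size))
    gain = [value / denom for value in px]
    # A priori error xi(n) = d(n) - w^H(n-1) x(n)
    xi = desired - sum(weights[i].conjugate() * snapshot[i] for i in range(size))
    new_weights = [weights[i] + gain[i] * xi.conjugate() for i in range(size)]
    # P(n) = (P - K x^H P) / forgetting, updated without any matrix inversion
    xhp = [sum(snapshot[i].conjugate() * inv_corr[i][j] for i in range(size))
           for j in range(size)]
    new_inv_corr = [[(inv_corr[i][j] - gain[i] * xhp[j]) / forgetting
                     for j in range(size)] for i in range(size)]
    return new_weights, new_inv_corr, xi

# Hypothetical setup: 4-element half-wavelength array, user at 20 degrees.
M = 4
phase_step = math.pi * math.sin(math.radians(20.0))
a = [cmath.exp(1j * m * phase_step) for m in range(M)]

w = [0j] * M
P = [[(100.0 if i == j else 0.0) + 0j for j in range(M)] for i in range(M)]  # P(0) = I / 0.01
for n in range(50):
    s = cmath.exp(1j * math.pi / 2 * (n % 4))  # known training symbol
    w, P, xi = rls_update(w, P, [am * s for am in a], s)

gain_toward_user = sum(wi.conjugate() * ai for wi, ai in zip(w, a))
print(abs(gain_toward_user))  # approaches 1 as the weights converge
```

Only matrix-vector products and one scalar division appear in each update, which is the property the text relies on when choosing RLS for the demonstration.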

4 System Model

1. Simulation Model

The simulation model uses the LTE/LTE-Advanced downlink transmission mode 9 (Release 10) specification for PDSCH enhanced UE-specific beamforming. This specification uses codebook-based precoding with eight (8) layers to serve users in the same time-frequency resource. As the number of transmission antennas and the beamforming matrix values are not specified, transmission under this standard can apply beamforming with more transmitting antennas (≥ 8) by using an appropriately dimensioned beamforming matrix. More antennas can be used to improve the beamforming by setting the antenna element weights so that the beam is focused in a particular direction. The complete proposed scenario is shown in Fig. 2. The system is equipped with M ≤ 128 transmitting antennas to serve between two and four UE terminals (2 ≤ Users ≤ 4) simultaneously within a predefined area (in the x–y plane); each user is equipped with N ≤ 2 antennas. The system implements beamforming by pairing these multiple users with orthogonal precoders in a data region, and uses the users' specified positions for beamsteering.


Fig. 2. Multiuser-MIMO system model

Aiming at multiuser beamforming/beamsteering, the system processing requires the data streams to be modulated by weights corresponding to the chosen direction so that, when transmitted, the signal is maximised in that direction. The system uses a weighting vector to produce the radiation pattern. For 1-D multiuser-MIMO beamforming/beamsteering, the BS transmitter is positioned at a 0.0° elevation angle with reference to all the UE terminals, whereas the UEs are prepositioned along the azimuthal plane using the x–y plane. Each UE position defines the user terminal range (m) from the BS transmitter, its corresponding x–y coordinates, its azimuthal angle cut (θ°), and the path loss (dB) at the carrier frequency. Depending on the UE terminals to be served, the resultant orthogonal (sub-matrix) channel has dimension M × N. Thus, the radiation pattern (normalized power) with respect to the corresponding azimuthal angles is expected to have its peak power at each user terminal's specified azimuth angle cut.

Results Analysis I Beamforming/Beamsteering Results

The system uses the UE terminals' prepositions with the feedback to direct the beam to the individual UE terminals. Tables 1, 2 and 3 summarise the respective multiuser UE receiver positions on the azimuthal plane, referenced to 0.0° elevation angle, and other related simulation parameters. Figures 3, 4 and 5 present the respective beamforming responses and the equivalent multiuser azimuthal angle cut radiation patterns for different numbers of transmitting elements, grouped by the number of users served simultaneously (i.e. 2 ≤ Users ≤ 4) and defined as scenario 1, scenario 2 and scenario 3.

Table 1. 2-UEs MU-MIMO beamforming scenario

UEs     x plane (m)  y plane (m)  Range (m)  Path loss (dB)  Azim. (Deg.)
User-1  45.86        58.96        74.70      73.34           52.125
User-2  72.20        35.56        80.49      73.99           26.222
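The range and azimuth columns of Table 1 follow directly from the x–y coordinates. As a quick check, the sketch below recomputes them, assuming the BS transmitter sits at the origin and the azimuth is measured from the x axis; the function name is hypothetical. The path loss column is not reproduced because the carrier frequency and propagation model are not restated here.

```python
import math

def ue_geometry(x_m, y_m):
    """Range (m) and azimuth (degrees) of a UE at (x, y), with the BS at the origin."""
    return math.hypot(x_m, y_m), math.degrees(math.atan2(y_m, x_m))

# Coordinates taken from Table 1.
for name, x, y in [("User-1", 45.86, 58.96), ("User-2", 72.20, 35.56)]:
    rng, az = ue_geometry(x, y)
    print(f"{name}: range {rng:.2f} m, azimuth {az:.3f} deg")
```

The recomputed values agree with the tabulated ranges and azimuth angles to within rounding.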


Table 2. 3-UEs MU-MIMO beamforming scenario

UEs     x plane (m)  y plane (m)  Range (m)  Path loss (dB)  Azim. (Deg.)
User-1  59.99        4.62         60.17      71.46           4.399
User-2  37.72        49.15        61.95      71.72           52.496
User-3  57.68        28.26        64.23      72.03           26.114

Table 3. MU-MIMO beamforming scenario with 4-UEs

UEs     x plane (m)  y plane (m)  Range (m)  Path loss (dB)  Azim. (Deg.)
User-1  71.38        13.18        72.59      73.09           10.460
User-2  58.03        34.59        67.56      72.47           30.801
User-3  42.53        51.49        66.78      72.37           50.440
User-4  26.83        61.49        67.09      73.41           66.425

Fig. 3. Two UEs terminals beamforming/beamsteering with (a) 8 BS transmitters (Scenario 1A) and (b) 64 BS transmitters (Scenario 1B)


Fig. 4. Three UEs terminals beamforming/beamsteering with (a) 8 BS transmitters (Scenario 2A) and (b) 64 BS transmitters (Scenario 2B)

From Table 3, the system uses M transmitting antennas to serve four users simultaneously. These UE terminals (i.e. UE1, UE2, UE3 and UE4) are prepositioned at approximate ranges of 72.59, 67.56, 66.78 and 67.09 m respectively from the BS transmitter, corresponding to the coordinate positions UE1(71.38, 13.18), UE2(58.03, 34.59), UE3(42.53, 51.49) and UE4(26.83, 61.49) in the x–y plane. The responses show the peak normalized power at the azimuthal angle cuts specified by the respective UE positions. This shows that the beams formed are steered individually to the respective users. With few transmitting elements, however, the beams of the paired user terminals are not narrow enough, and the residual adjacent-channel interference between the users is not fully suppressed. As the number of transmitting elements increases, the beam directivity and the inter-user interference suppression are greatly improved, as presented in Figs. 3, 4 and 5b, c respectively. In summary, irrespective of the number of UE terminals served simultaneously, the beam directivity and the corresponding interference suppression in the adjacent channel improve with the number of transmitting antennas used. Thus, the more transmitting elements, the better the beam directivity and the UE terminal received signal. This demonstrates beamforming/beamsteering with multiple antennas (massive MIMO) as envisioned in 5G systems.
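The effect of the array size on inter-user leakage can be illustrated with a simple array-factor calculation. The sketch below is not the LTE simulation used for Figs. 3, 4 and 5: it assumes a uniform linear array with half-wavelength spacing, matched weights steered at UE1, and the Table 1 azimuth angles treated as angles from broadside. It compares the normalised response toward UE1 with the leakage toward UE2 for 8 and 64 elements.

```python
import cmath
import math

def normalized_response(num_elements, steer_deg, look_deg, spacing_wavelengths=0.5):
    """|w^H a(look)| for weights matched to steer_deg on a uniform linear array."""
    def phase_step(angle_deg):
        return 2 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    total = sum(
        cmath.exp(1j * m * (phase_step(look_deg) - phase_step(steer_deg)))
        for m in range(num_elements)
    )
    return abs(total) / num_elements

ue1_az, ue2_az = 52.125, 26.222  # azimuth angles of the two UEs in Table 1
for m in (8, 64):
    peak = normalized_response(m, ue1_az, ue1_az)
    leak = normalized_response(m, ue1_az, ue2_az)
    print(f"M={m}: gain toward UE1 {peak:.3f}, leakage toward UE2 {leak:.3f}")
```

Under these assumptions the gain toward the intended user stays at 1 while the leakage toward the other user drops markedly for the larger array, in line with the trend described above.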


Fig. 5. Four UEs terminals beamforming/beamsteering with (a) 8 BS transmitters (Scenario 3A), (b) 64 BS transmitters (Scenario 3B), and (c) 128 BS transmitters (Scenario 3C)

Results Analysis II Digital/Hybrid Results Comparison

Unlike digital beamforming/beamsteering, which relies on digital signal processing, the hybrid technique incorporates phase shifters for RF signal phase control, and its performance depends solely on the precision of the phase shifters. The results in Fig. 6 (Appendix Figs. 12 and 13) compare the user terminal received signal constellations for the different beamforming/beamsteering schemes. The results in panels (a) and (b) show equal performance, with no significant difference between the two schemes, which is somewhat misleading. This is simply because infinite phase shifter precision is assumed in the simulation; in reality, analogue phase shifters have only limited precision, usually categorised by the number of bits n used in the phase shifter (giving $2^n$ phase states). Figure 6c (Appendix Figs. 12 and 13) presents the corresponding user terminal received signal constellation for hybrid beamforming with finite phase shifter precision. As expected, the more quantisation bits used, the closer the performance of the hybrid system is to that of the digital system.
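The effect of finite phase shifter precision can be seen with a short calculation. The sketch below assumes a uniform quantiser with 2^n states over a full 360° turn, which is a common model but is not stated explicitly in the text; the function names and the 30° target phase are hypothetical. The bit counts match those used in the figures.

```python
import math

def quantize_phase(phase_rad, bits):
    """Snap a phase to the nearest of 2**bits uniformly spaced phase-shifter states."""
    step = 2 * math.pi / (2 ** bits)
    return (round(phase_rad / step) * step) % (2 * math.pi)

def worst_case_error_deg(bits):
    """Largest possible phase error for a uniform quantiser: half a step."""
    return 360.0 / (2 ** bits) / 2

wanted = math.radians(30.0)
for bits in (3, 4, 5, 7):
    applied = math.degrees(quantize_phase(wanted, bits))
    print(f"{bits} bits: wanted 30.0 deg, applied {applied:.3f} deg, "
          f"worst case error {worst_case_error_deg(bits):.3f} deg")
```

Each extra bit halves the worst-case phase error, which is why the hybrid constellation approaches the fully digital one as the number of bits grows.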

Fig. 6. Received signal constellation with (a) digital beamforming processing, (b) hybrid beamforming processing with infinite phase shifter precision, and (c) hybrid BF with limited phase shifter precision (4 bits)

Performance is further assessed using different modulation schemes; the UE terminal received signal EVM, effective SNR estimate, and system BER are presented to demonstrate the impact of beamforming. These results are presented in Fig. 7a–c for the respective techniques. Figure 7a shows a system percentage EVM with 8 elements of about 1.62, 1.65 and 1.90% for 64-QAM, 256-QAM and 1024-QAM respectively, compared with 1.32, 1.38 and 1.43% with 16 elements.
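For readers unfamiliar with the metric, the percentage EVM can be computed from received and reference constellation points as sketched below. This uses the common RMS definition normalised to the RMS reference amplitude; the text does not state which normalisation the authors used, so the definition, the function name and the sample symbols are assumptions.

```python
import math

def evm_percent(received, reference):
    """RMS error vector magnitude, as a percentage of the RMS reference amplitude."""
    error_power = sum(abs(r - s) ** 2 for r, s in zip(received, reference))
    reference_power = sum(abs(s) ** 2 for s in reference)
    return 100.0 * math.sqrt(error_power / reference_power)

# Hypothetical QPSK-like reference symbols and slightly displaced received symbols.
reference = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
received = [s + 0.01 for s in reference]
print(evm_percent(received, reference))
```

A tighter constellation cluster gives a smaller error vector and hence a lower percentage EVM, which is the sense in which the values quoted above improve with more elements.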

Fig. 7. (a) Digital-Hybrid percentage EVM. (b) Hybrid (finite) % EVM. (c) Effective SNR estimate. (d) 16-QAM system BER

A considerable performance enhancement is observed when the system uses 32 elements, as the percentage EVM drops further to 1.20, 1.22 and 1.30% with these modulations. Similarly, in Fig. 7c, an 8-element system with 64-QAM, 256-QAM and 1024-QAM modulation has received signal effective SNR estimates of about −5.3, −5.3 and −7.2 dB respectively, compared with −3.5, −3.8 and −4.2 dB for the 16-element system at the same reference point. A further enhancement is observed with 32 elements, at about −1.4, 1.3 and −1.8 dB, as expected. Tables 4 and 5 present the numerical summary of the performance achieved with various numbers of transmitting elements and different modulation schemes. Only a very small variation occurs between digital and hybrid (finite) beamforming performance when more quantisation bits are used. The system BER improves with more transmitting elements, as expected. Using a reference threshold of, say, 1 × 10^−4 for the 16-QAM results in Fig. 7d, this performance is attained at only 7 dB SNR using 32 transmitting elements for both digital and hybrid beamforming, compared with SNR requirements of 20 dB and 26 dB for 16 and 8 elements respectively. Thus, an SNR gain of 13 dB and 7 dB is achieved with 32 and 16 transmitting elements as compared with only eight transmitting elements, as expected. Tables 4 and 5 summarise the percentage EVM results and the estimated SNR at the user terminal. Performance results with higher numbers of elements are presented in Appendix Figs. 11, 12, 13 and 14a–h.

Table 4. System percentage EVM results

System       Beamforming scheme  16-QAM (%)  64-QAM (%)  256-QAM (%)  1024-QAM (%)
8 elements   Digital             1.61        1.62        1.65         1.90
8 elements   Hybrid (infinite)   1.61        1.62        1.65         1.90
8 elements   Hybrid (limited)    1.64        1.81        1.82         2.45
16 elements  Digital             1.32        1.38        1.38         1.43
16 elements  Hybrid (infinite)   1.32        1.38        1.38         1.43
16 elements  Hybrid (limited)    1.35        1.48        1.50         1.58
32 elements  Digital             1.18        1.20        1.22         1.30
32 elements  Hybrid (infinite)   1.18        1.20        1.22         1.30
32 elements  Hybrid (limited)    1.19        1.23        1.28         1.36

Table 5. System estimated SNR at user terminal

System       Beamforming scheme  16-QAM (dB)  64-QAM (dB)  256-QAM (dB)  1024-QAM (dB)
8-elements   Digital             −4.0         −4.7         −5.0          −6.3
16-elements  Digital             −2.3         −3.2         −3.3          −4.1
32-elements  Digital             −1.2         −1.3         1.4           1.8

Multiuser Beamforming Prototype

The beamforming/beamsteering testbed setup requires automated feedback from the UE on the channel and position, and is subdivided into two components. The radio-in-loop hardware components comprise the USRPs, their antennas, the Ethernet network cabling, and the external synchronisation; the software signal processing components comprise mainly the Matlab software toolkits, the hardware support packages, and the host PC. The USRP hardware setup depicting the complete 4 × 2 multiuser MIMO system is presented in Figs. 8 and 9. This hardware setup consists of a host PC together with an 8-port gigabit Ethernet switch (HP 1410-8G Switch) that connects the PC to the six (6) NI USRP-2922 radios (SDR). Table 6 summarises the components required to set up the testbed system.

Fig. 8. Complete 4x2 Testbed architecture

Fig. 9. USRP radio testBED Physical arrangement

Table 6. Multiuser-MIMO software/hardware components

Hardware component          Qty.
Host PC                     1
8-Port Ethernet Switch      1
SMA-to-SMA cable            2
NI-USRP + Ethernet cable    6
VERT2450 Antennas           4
Ext. Ref. Source            2
4 power splitter            2

Software component
Matlab Software
Matlab toolboxes
USRP Hardware Support Package

For real-time performance assessment using Matlab, a single host PC connects all the USRP radios in the loop. It is worth noting that this software tool does not support master-slave synchronisation mode with a MIMO cable; synchronisation to an external reference is therefore necessary. The radios must be connected to the same PPS and 10 MHz clock generator via cables of equal length. The testbed was synchronised successfully through a 6-way power splitter/combiner (ZBSC-615) connected to the REF IN and PPS IN terminals in order to run 4 × 2 multiuser beamforming/beamsteering. For the prototype demonstration shown in Fig. 9b, the system uses Gold sequence training to construct the multiuser payloads, with 64 symbols and an IFFT length of 256; each symbol uses 8 subcarriers. The UE1 payload creates a spectral null at UE2 by inverting the channel response and applying phase offsets. Similarly, the UE2 payload creates a spectral null at UE1 to achieve destructive interference. The propagation channel is considered fixed, unlike the simulation, which uses an ITU-defined channel model.

TestBED Real Time Results

To display the real-time testbed transmission results, the system uses a digital beamforming/beamsteering algorithm with a spectrum analyser included at the receiver terminal (software component) for processing the received signal constellation results. Figure 10a–h presents the corresponding signal constellations, and the system EVM, SNR estimates, and BER with different modulations are summarised in Table 7. This shows success in implementing a multiuser-MIMO beamforming/beamsteering testbed with USRP radios in real time, with BER results within the threshold limits of wireless system standards. TestBED performance with 16-QAM and 64-QAM presents a good EVM, estimated SNR and BER compared with the higher-order 256-QAM and 1024-QAM respectively, as expected.

Furthermore, from the system simulation results above, it is evident that both channel and multiuser interference are greatly improved with beamforming/beamsteering, and hence so is the user terminal response as the number of BS transmitting elements increases.
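The null-steering step described for the prototype payloads, in which each user's payload is nulled at the other user by inverting the channel response, can be read as a zero-forcing precoder. The sketch below illustrates that reading for a single flat-fading subcarrier with four transmit antennas and two single-antenna users. It is an assumption-laden illustration rather than the authors' Matlab code: the function names and the channel coefficients are hypothetical.

```python
def zero_forcing_precoder(h1, h2):
    """Columns w1, w2 of W = H^H (H H^H)^-1 for a two-user channel with rows h1, h2."""
    def inner(u, v):
        # <u, v> = sum of u_k * conj(v_k)
        return sum(a * b.conjugate() for a, b in zip(u, v))
    g11, g12, g21, g22 = inner(h1, h1), inner(h1, h2), inner(h2, h1), inner(h2, h2)
    det = g11 * g22 - g12 * g21
    # Inverse of the 2x2 Gram matrix G = H H^H
    i11, i12, i21, i22 = g22 / det, -g12 / det, -g21 / det, g11 / det
    w1 = [a.conjugate() * i11 + b.conjugate() * i21 for a, b in zip(h1, h2)]
    w2 = [a.conjugate() * i12 + b.conjugate() * i22 for a, b in zip(h1, h2)]
    return w1, w2

def received(h, w):
    """Signal seen by a user with channel row h when precoder column w is sent."""
    return sum(a * b for a, b in zip(h, w))

# Hypothetical fixed channel coefficients from the 4 BS antennas to each user.
h1 = [1 + 0j, 0.5 + 0.5j, -0.3 + 0.8j, 0.2 - 0.6j]
h2 = [0.7 - 0.2j, -0.4 + 0.9j, 1 + 0j, 0.1 + 0.3j]
w1, w2 = zero_forcing_precoder(h1, h2)

print(abs(received(h1, w1)), abs(received(h2, w1)))  # UE1 stream: unit gain at UE1, null at UE2
print(abs(received(h2, w2)), abs(received(h1, w2)))  # UE2 stream: unit gain at UE2, null at UE1
```

In a multicarrier payload the same computation would be repeated per subcarrier with that subcarrier's channel estimate, which is one way to interpret the phase offsets mentioned above.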

Fig. 10. Multiuser beamforming received signal constellation scope screenshots. (a) UE1 16-QAM received signal. (b) UE2 1024-QAM received signal. (c) UE1 64-QAM received signal. (d) UE2 256-QAM received signal. (e) UE1 256-QAM received signal. (f) UE2 64-QAM received signal. (g) UE1 1024-QAM received signal. (h) UE2 16-QAM received signal

Table 7. TestBED system performance (fully digital prototype system @ 10 dB, 4 elements)

Modulation scheme  % EVM  SNR Est. (dB)  BER
16-QAM             1.321  −4.0           7.23 × 10^−6
64-QAM             1.352  −4.7           6.328 × 10^−4
256-QAM            1.405  −5.0           9.314 × 10^−2
1024-QAM           1.701  −6.3           4.522 × 10^−1

5 Conclusion

System design, implementation, performance evaluation and hardware prototyping were carried out successfully using various beamforming/beamsteering techniques. The hardware prototype uses National Instruments Software Defined Radios (NI USRP 2922) and an antenna array for system performance analysis, and the results obtained show improved performance as the number of antennas increases. The hardware testbed is used to design, implement, and prototype beamforming/beamsteering. The system uses antenna arrays with relevant modulation schemes to present multiuser MIMO performance in real time, and the results show greatly enhanced performance. Although the prototype uses only a few USRP radios to implement and evaluate multiuser beamforming, the results obtained show significant performance improvement using beamforming/beamsteering, as expected.

Appendices See Figs. 11, 12, 13 and 14.

Fig. 11. (a) Fully digital BF. (b) Hybrid BF signal. (c) Hybrid with limited phase shifter precision: (i) 3 bits, (ii) 7 bits

Fig. 12. (a) Fully digital BF. (b) Hybrid BF signal. (c) Hybrid BF with limited phase shifter precision: (i) 3 bits, (ii) 4 bits, (iii) 5 bits

Fig. 13. Performance with different transmitting elements and modulation. (a)–(d) Digital-Hybrid % EVM. (e) Effective SNR estimate. (f) Effective SNR estimate

Fig. 14. Various-elements received signal constellation. (a), (b) 16-QAM constellation. (c), (d) 64-QAM constellation. (e), (f) 256-QAM constellation. (g), (h) 1024-QAM constellation

References 1. Sanou, B.: ICT Facts and Figures The World in 2015 (2015). http://www.itu.int/en/ITU-D/ Statistics/Documents/facts/ICTFactsFigures2015.pdf. Accessed Mar 2016 2. Astely, D., Parkvall, S.: The evolution of LTE towards IMT-advanced. J. Commun. 4(3), 146–154 (2009) 3. Stefan Parkvall, A.F., Dahlman, E.: The evolution of LTE toward LTE advanced. J. Commun. (2008). http://ojs.academypublisher.com/index.php/jcm/article/view/. Accessed 12 Oct 2011 4. Rodriguez, J.: Fundamentals of 5G Mobile Networks, 1st edn. John Wiley Pub., Chichester (2015) 5. Rappaport, T., Health, R., Daniels, R., Murdock, J.: Millimeter Wave Wireless Communication. Pearson Education Inc., Westford (2015) 6. Rappaport, T., et al.: Millimeter wave mobile communication for 5G cellular: it will work. IEEE Access 1, 335–349 (2013) 7. Rappaport, T.S., Ben-Dor, E., Murdock, J.N., Qiao, Y.: 38 GHz and 60 GHz angledependent propagation for cellular and peer-to-peer wireless communications. In: IEEE International Conference on Communication, June 2012 8. Bo, H., et al.: Directional transmission by 3-D beam-forming using smart antenna arrays. In: Wireless VITAE 2013 (2013) 9. Hakam, A., et al.: Robust DOA estimation using a 2D novel smart antenna array. In: 2014 6th International Conference on New Technologies, Mobility and Security (NTMS) (2014) 10. Harris, P., et al.: A distributed massive MIMO Testbed to assess real-world performance and feasibility. In: 2015 IEEE 81st Vehicular Technology Conference (VTC Spring) (2015) 11. Luther, E.: 5G massive MIMO Testbed: from theory to reality, December 2015. http://www. ni.com/white-paper/52382/en/. Accessed 21 Sept 2015 12. Prahlad, K., Ramamurthi, B.: Design and implementation of a multi-terminal channel emulator on LTE TestBed. In: 2015 Twenty First National Conference on Communications (NCC) (2015) 13. Rahman, M.M., Dey, S., Saha, N.: Adaptive array antenna for WLAN: a smart approach to beam switching through phase shifting in feed network. 
In: 2012 15th International Conference on Computer and Information Technology (ICCIT) (2012)


A. B. Abdullahi et al.

14. Elvira, V., Vía, J.: Diversity techniques for RF-beamforming in MIMO-OFDM systems: Design and performance evaluation. In: 2009 17th European Signal Processing Conference (2009) 15. Khalaf, A.A.M., El-Daly, A.R.B.M., Hamed, H.F.A.: Different adaptive beamforming algorithms for performance investigation of smart antenna system. In: 2016 24th International Conference on Software, Telecommunications and Computer Networks (SoftCOM) (2016) 16. National Instruments: NI USRP-292x/293x Datasheet Universal Software Radio Peripherals. National Instrument, p. 6 (2015) 17. Tan, K.S., et al.: An efficient digital beamsteering system for difference frequency in parametric array. In: 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing (2004) 18. Rozé, A., et al.: Comparison between a hybrid digital and analog beamforming system and a fully digital Massive MIMO system with adaptive beamsteering receivers in millimeterWave transmissions. In: 2016 International Symposium on Wireless Communication (2016)

5G Waveform Competition: Performance Comparison and Analysis of OFDM and FBMC in Slow Fading and Fast Fading Channels

Muhammad Imran, Aamina Hassan, and Adnan Ahmed Khan

College of Signals, National University of Sciences and Technology (NUST), Islamabad, Pakistan

Abstract. This paper compares an emerging physical layer multicarrier (MC) waveform, Filter Bank Multi Carrier (FBMC), with the Orthogonal Frequency Division Multiplexing (OFDM) waveform under a variety of channels. Filter banks are an advanced form of MC sub-band processing and promise to deliver better results than OFDM. They address the shortcomings that arise from the use of the Fast Fourier Transform (FFT), at the cost of added system complexity in the form of Poly-Phase Filter Networks (PPN). We investigate the transceiver design, Out-of-Band (OOB) emission, Power Spectral Density (PSD) and Bit Error Probability (BEP) of these waveforms. We validate our analyses through simulations, which show that FBMC performs better than OFDM and is therefore an ideal candidate for the 5G physical layer.

Keywords: OFDM · FBMC · 5G · Poly-Phase Filter

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 51–67, 2020. https://doi.org/10.1007/978-3-030-12388-8_4

1 Introduction

Multicarrier modulation is nowadays widely adopted for broadband communication in the form of OFDM [1]. MC modulation divides the spectrum into a number of narrowband subcarriers with the aid of the FFT. The FFT produces orthogonal subcarriers that do not interfere with each other. Adding a cyclic prefix (CP) at the transmitter caters for multipath fading effects. Due to these properties OFDM is widely adopted in Digital Subscriber Lines, Power Line Communication, LTE cellular communication, Wi-Fi and WiMAX [2]. Although OFDM is an attractive technique because of its robustness and its fast implementation using the FFT, it has several drawbacks, especially in low latency and spectrum efficient environments. 5G applications demand low latency, spectral efficiency (SE), robustness and loose synchronization [3]. OFDM suffers from high OOB emission and a very high peak to average power ratio (PAPR) [4]. It uses a CP, which adds spectral redundancy, and it demands strict synchronization at the receiver. Several techniques have been proposed to reduce the OOB radiation and PAPR of OFDM, such as windowing, companding [5], tone injection, interleaved OFDM, block coding, clipping and cosine filters [6], but they do not meet all the requirements of 5G.

The scientific community has therefore been working on other promising variants of the MC technique. FBMC appeared in the 1960s, preceding OFDM. Chang proposed the preliminary work for FBMC, followed by Saltzberg [7], who proposed the theory for designing a prototype filter that meets the Nyquist criterion, under which the filter achieves ideal reconstruction without Inter Symbol Interference (ISI) [8]. The prototype filter gained much attention because it keeps the side lobes as small as possible [9]. This paper focuses on the transceiver structures of OFDM and FBMC, highlighting their key differences, and on their performance in the AWGN, Pedestrian and Vehicular channels.

The waveforms already proposed for the 5G physical layer include FBMC, Universal Filtered Multi Carrier (UFMC), Generalized Frequency Division Multiplexing (GFDM) [10] and OFDM [11]. FBMC filters every subcarrier before it is transmitted, which reduces the OOB radiation severalfold. UFMC applies the filter to the whole symbol after the subcarriers have been modulated and added [12], while GFDM adds a tail biting technique for the CP [10]. All these techniques contribute to a lower PAPR than OFDM. FBMC is one of the most widely researched waveforms for 5G, and this paper covers only FBMC and its implications.

FBMC is an evolved form of OFDM and of the MC modulation scheme. It uses the FFT and inverse FFT (iFFT) blocks, adds a PPN structure to them [13] and produces more flexible results than OFDM, the existing 4G multiple access technology. FBMC does not use a CP, which adds to its SE. The CP in OFDM caters for multipath fading as long as the delay of the symbol is shorter than the CP. The filter banks give FBMC easy computations, band selectivity, low PAPR, low OOB radiation and low sampling rates.

The paper begins with the 5G challenges and the problems of the 4G physical layer scheme. Section 2 covers the OFDM transceiver model, supported with expressions, and the drawbacks of OFDM. Section 3 covers the FBMC transceiver model in detail, along with its implementation methods, and lists the key differences between the two modulation schemes together with our analysis. Section 4 presents the simulations and results. The paper concludes with future work, acknowledgements and references.

1.1 Challenges to 4G Waveform

Challenges trigger the process of research and development. Each cellular generation responds to new user needs and brings enhancements in quality and services. Over the last four decades cellular technology has evolved from 1G to 4G. 1G was the era of analogue communication, 2G the era of digital communication, 3G relied on pseudorandom spreading codes [14] and 4G on multicarrier modulation [15]. 5G is expected to cover aspects that the previous generations have not achieved so far: higher SE, high data rates, killer applications such as machine type communication (MTC) and the Internet of Things (IoT), battery efficient communication, target oriented services, high capacity, low spectrum leakage, self-organizing networks (SON) [3] and much more.

Current 4G networks use OFDMA. OFDM technologies have a few shortcomings that motivate the search for a new, flexible technique that can remedy the weaknesses of the 4G waveform. A CP, here one quarter of the symbol length, is copied from the end of the OFDM symbol and prepended to it, as shown in Fig. 1. The CP adds redundancy and hence reduces SE [16]. OFDM symbols also have high PAPR and OOB radiation. These shortcomings forced researchers to look for another solution. Filter banks address these problems and offer much better solutions when combined with the FFT/iFFT blocks.

Fig. 1. CP in OFDM
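The PAPR penalty of OFDM mentioned above can be illustrated numerically. The following is a minimal sketch, not the authors' MATLAB simulation: it assumes QPSK symbols, an illustrative block of 64 subcarriers and a plain inverse DFT, and it compares the peak to average power ratio of the symbols before and after the iFFT. The names `idft`, `papr_db` and `N_SUBCARRIERS` are hypothetical.

```python
import cmath
import math
import random


def idft(freq_symbols):
    """Inverse DFT with 1/N scaling: frequency-domain symbols -> time samples."""
    n_points = len(freq_symbols)
    return [
        sum(freq_symbols[k] * cmath.exp(2j * math.pi * k * n / n_points)
            for k in range(n_points)) / n_points
        for n in range(n_points)
    ]


def papr_db(samples):
    """Peak to average power ratio of a block of samples, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


random.seed(0)       # fixed seed so the sketch is repeatable
N_SUBCARRIERS = 64   # illustrative size, not a value taken from the paper
qpsk = [complex(random.choice((-1, 1)), random.choice((-1, 1)))
        for _ in range(N_SUBCARRIERS)]

papr_single_carrier = papr_db(qpsk)   # constant-envelope symbols
papr_ofdm = papr_db(idft(qpsk))       # the same symbols after the iFFT
print(f"single carrier: {papr_single_carrier:.2f} dB, OFDM: {papr_ofdm:.2f} dB")
```

The QPSK symbols all have the same magnitude, so their PAPR is 0 dB, while the sum of many subcarriers produces occasional large peaks. For N subcarriers the PAPR of the iFFT output cannot exceed 10 log10(N) dB.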

2 OFDM and Its Drawbacks

OFDM is one of the most widely used MC modulation techniques. It places the data on orthogonal subcarriers using FFT and iFFT modules, which are easy to implement [17]. The data is mapped to QAM symbols of any order and converted into parallel streams, which have low data rates and are easy to process. The iFFT block is then applied; see Fig. 2 for the block diagram. The iFFT converts the signal from the frequency domain to the time domain, and in OFDM it is what makes the subcarriers mutually orthogonal. (1) shows the iFFT operation and (2) shows the FFT operation for orthogonal subcarrier generation. The symbols used in the equations are defined in Table 1.

Fig. 2. OFDM and FBMC transceiver (panels: FBMC TX, FBMC RX)

Table 1. Equation symbols

Symbol | Mapping in equations
M | Data symbols
m | Symbol index
x(n) | Data in serial form
X(t) | Data in time domain
X(k) | Data in frequency domain
T | Symbol period
K | Interpolation co-efficient
k | Impulse index
L | Filter length
N | FFT points
Eb | Energy per bit
Pb | Bit error probability
Ccomp | Computational complexity
I | Modulation array
N0 | Noise density

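$$x(t) = \sum_{k=-N/2}^{N/2-1} X[k]\, e^{j 2\pi k t / N} \qquad (1)$$

$$X[k] = \frac{1}{N} \sum_{t=-N/2}^{N/2-1} x(t)\, e^{-j 2\pi k t / N} \qquad (2)$$

Orthogonality of two subcarriers is proven by (3).

$$\sum_{t=-N/2}^{N/2-1} e^{j 2\pi k t / N}\, e^{-j 2\pi p t / N} = 0, \quad \forall\, p \neq k \qquad (3)$$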
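The transform pair and the orthogonality condition can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: it evaluates the sums of (1), (2) and (3) directly for a small, even N with indices running from -N/2 to N/2-1. The function names and the test data are hypothetical.

```python
import cmath
import math


def centered_indices(n_points):
    """Index range -N/2 .. N/2-1 used in (1)-(3); N is assumed to be even."""
    return range(-n_points // 2, n_points // 2)


def ifft_eq1(freq_symbols):
    """Eq. (1): time samples x(t) from the subcarrier symbols X[k]."""
    n_points = len(freq_symbols)
    idx = centered_indices(n_points)
    return [sum(X_k * cmath.exp(2j * math.pi * k * t / n_points)
                for X_k, k in zip(freq_symbols, idx))
            for t in idx]


def fft_eq2(time_samples):
    """Eq. (2): subcarrier symbols X[k] from x(t), including the 1/N factor."""
    n_points = len(time_samples)
    idx = centered_indices(n_points)
    return [sum(x_t * cmath.exp(-2j * math.pi * k * t / n_points)
                for x_t, t in zip(time_samples, idx)) / n_points
            for k in idx]


def subcarrier_inner_product(k, p, n_points):
    """Left-hand side of Eq. (3) for subcarriers k and p."""
    return sum(cmath.exp(2j * math.pi * k * t / n_points)
               * cmath.exp(-2j * math.pi * p * t / n_points)
               for t in centered_indices(n_points))


N = 8
data = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
recovered = fft_eq2(ifft_eq1(data))
print(abs(subcarrier_inner_product(1, 3, N)))  # close to 0 for p != k
print(abs(subcarrier_inner_product(2, 2, N)))  # equals N for p == k
```

Two different subcarriers give a sum that vanishes up to rounding error, while a subcarrier correlated with itself gives N. This is why applying (2) to the output of (1) returns the original symbols.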

OFDM is highly sensitive to multipath fading effects, which make the demodulation of the symbols difficult [18]. Multipath fading introduces inter symbol interference (ISI): a delayed version of one symbol overlaps with the adjacent symbol. To counter multipath fading, a guard interval is introduced in OFDM by extending each symbol with a copy of part of itself. The symbol is made longer by copying its tail and inserting it at the beginning of the same symbol, as shown in Fig. 1. In the 802.11 standard the CP is 1/4 of the original symbol. Because the FFT treats the signal as periodic, FFT(n′), the FFT of the delayed signal, is a rotated version of the original FFT(n), as shown by (4).

$$\mathrm{FFT}(n') = e^{-2 j \pi r f}\, \mathrm{FFT}(n) \qquad (4)$$

A delay in the time domain corresponds to a rotation in the frequency domain, so the correct signal is obtained by the opposite rotation in the frequency domain. The received signal without multipath is given by (5).

$$y(t) \rightarrow \mathrm{FFT}(n) \rightarrow y[k] = H[k]\, X[k] \qquad (5)$$

(6) shows the behavior of the signal with multipath fading.

$$y(t) = \mathrm{FFT}(n + n') \rightarrow a \left(1 + e^{-2 j \pi r k}\right) X[k] = H'[k]\, X[k] \qquad (6)$$
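The role of the CP in (5) and (6) can be sketched as follows. This is a minimal illustration, assuming a single isolated OFDM symbol, a two-tap channel (a direct path plus one echo delayed by one sample) and a CP of N/4 samples. All names and values are hypothetical and are not taken from the paper's simulation.

```python
import cmath
import math


def dft(x):
    n_points = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
                for n in range(n_points)) for k in range(n_points)]


def idft(X):
    n_points = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / n_points)
                for k in range(n_points)) / n_points for n in range(n_points)]


def add_cyclic_prefix(symbol, cp_len):
    """Copy the last cp_len samples of the symbol to its front."""
    return symbol[-cp_len:] + symbol


def multipath(tx, taps):
    """Linear convolution with the channel taps, truncated to len(tx)."""
    return [sum(tap * tx[n - delay] for delay, tap in enumerate(taps) if n - delay >= 0)
            for n in range(len(tx))]


N = 8
CP_LEN = N // 4            # CP of one quarter of the symbol
TAPS = [1.0, 0.5]          # illustrative two-path channel
X = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
H = dft(TAPS + [0.0] * (N - len(TAPS)))   # per-subcarrier channel gain

# With CP: drop the prefix at the receiver, then equalize with one tap per subcarrier.
rx = multipath(add_cyclic_prefix(idft(X), CP_LEN), TAPS)
Y = dft(rx[CP_LEN:])
X_hat = [y / h for y, h in zip(Y, H)]

# Without CP the same channel no longer acts as a per-subcarrier gain.
Y_no_cp = dft(multipath(idft(X), TAPS))
error_without_cp = max(abs(y - h * x) for y, h, x in zip(Y_no_cp, H, X))
print(max(abs(a - b) for a, b in zip(X_hat, X)), error_without_cp)
```

With the CP in place the received block satisfies Y[k] = H[k] X[k], so dividing by H[k] recovers the data. Without it the relation breaks even for an isolated symbol, and in a continuous stream the missing samples would be replaced by interference from the previous symbol.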

OFDM has certain drawbacks, which include Inter-Carrier Interference (ICI), high PAPR, high OOB emission and low SE. These shortcomings motivate a redesign of the physical layer for 5G applications.

2.1 Sensitivity to Carrier Offset and High PAPR

A small frequency offset results in high ICI in OFDM, whereas single carrier systems are less sensitive to frequency offset and drift. PAPR is the ratio of the peak power of the transmitted signal to its average power [19]. The power amplifier of an OFDM transmitter must operate in its linear region in order to avoid saturation. Many techniques have been introduced to reduce PAPR, including clipping, windowing, interleaving and block coding [6]. However, these techniques do not fully deliver the results needed for 5G applications.

2.2 Higher OOB and Reduced SE

The subcarriers of OFDM have very high side lobes after modulation [20]. OOB radiation affects the data transferred in the main lobe and increases the PSD outside the main lobe. The CP in OFDM is added at the cost of bandwidth: it mitigates multipath fading, but it also reduces SE. The Signal to Noise Ratio (SNR) loss due to the CP is given by (7), where T_cp is the CP duration and T is the overall symbol duration, i.e. the sum of the CP duration and the original symbol duration, as shown in Fig. 1.

$$\mathrm{SNR}_{loss} = -10 \log_{10}\left(1 - \frac{T_{cp}}{T}\right) \qquad (7)$$
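As a worked example of the CP cost, the short sketch below evaluates the SNR loss for a CP equal to one quarter of the original symbol, the ratio quoted earlier for 802.11. It assumes that T is the total duration (CP plus original symbol) and that the loss is reported as a positive number of dB. The function name is hypothetical.

```python
import math


def cp_snr_loss_db(t_cp, t_symbol):
    """SNR loss in dB for a CP of duration t_cp added to a symbol of duration t_symbol."""
    t_total = t_cp + t_symbol          # overall symbol duration T
    return -10 * math.log10(1 - t_cp / t_total)


loss_db = cp_snr_loss_db(0.25, 1.0)    # CP = 1/4 of the original symbol
useful_fraction = 1.0 / (0.25 + 1.0)   # share of the time that carries data
print(f"SNR loss: {loss_db:.2f} dB, useful fraction: {useful_fraction:.2f}")
```

With these durations the CP occupies 20% of the transmitted symbol, which corresponds to a loss of slightly less than 1 dB. FBMC avoids this overhead because it does not use a CP.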

2.3 Time to Re-innovate Multiple Access Approach

Given the disadvantages of OFDM listed above and the requirements and key drivers of 5G, researchers are led to redesign the multiple access scheme so that it ensures high SE, higher data rates and low latency, and also meets the requirements of IoT and MTC [3]. Several new modulation schemes, as mentioned earlier, are under investigation by research and development groups in various countries. FBMC is one of the oldest and most widely researched candidate waveforms for 5G. It is the closest to OFDM and was proposed earlier than OFDM. This work analyses FBMC as a candidate waveform for 5G and focuses on its bit error analysis in more realistic environments than those considered before.

3 FBMC Model

FBMC is a derived form of MC modulation. It uses the FFT and iFFT quite differently from OFDM; see Fig. 2 for the detailed structure and the differences between the OFDM and FBMC transceivers. In the presence of multipath fading, FFT modulated signals cannot be fully reconstructed at the receiver (Rx), and there are two ways to deal with this. The first, followed in OFDM, is to add an extended sequence or guard interval longer than the channel impulse response. The second is to keep the symbol length intact and add extra processing to the FFT block; this approach is used in FBMC modulation [13]. The FBMC model exploits the filter characteristics of the FFT block and implements a bank of filters. The corresponding FIR filter is represented by (8).

$$I(f) = \frac{\sin(\pi f M)}{M \sin(\pi f)} \qquad (8)$$

Considering the FFT in (2) with index k and FFT size M, we get (9).

$$y_k = \frac{1}{M} \sum_{i=0}^{M-1} x(n - M + i)\, e^{-j 2\pi k i / M} \qquad (9)$$
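The filter interpretation of the FFT in (8) can be checked with a few lines of code. The sketch below, which uses hypothetical names and an illustrative size M = 16, compares the closed form of (8) with the frequency response of a length-M averaging filter and evaluates it at multiples of 1/M and between them.

```python
import cmath
import math


def fir_response_eq8(f, M):
    """Closed form of Eq. (8); f is the normalized frequency in (-1, 1)."""
    if abs(math.sin(math.pi * f)) < 1e-12:
        return 1.0                      # limit of the expression as f -> 0
    return math.sin(math.pi * f * M) / (M * math.sin(math.pi * f))


def averaging_filter_response(f, M):
    """Response of M equal taps of weight 1/M, evaluated directly."""
    return sum(cmath.exp(-2j * math.pi * f * i) for i in range(M)) / M


M = 16
zero_crossings = [fir_response_eq8(k / M, M) for k in range(1, M)]
first_sidelobe_db = 20 * math.log10(abs(fir_response_eq8(1.5 / M, M)))
print(max(abs(z) for z in zero_crossings), first_sidelobe_db)
```

The response is zero at every integer multiple of 1/M, which is where the neighboring subcarriers sit, but between the zeros it only drops to about -13 dB. That slow decay is the source of the high OOB emission of OFDM that the FBMC prototype filter is designed to reduce.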

The coefficients of the FFT are multiplied by an exponential term, which corresponds to a shift in frequency of k/M. When the M outputs of the FFT are considered together, a filter bank of size M is obtained. The orthogonality condition is satisfied at the zero crossings, which occur at frequencies that are integer multiples of 1/M, in accordance with the Nyquist criterion. So there exists a low pass filter in the FFT. Banks of these filters are combined in the form of a PPN in FBMC. The filter with zero frequency offset is termed the prototype filter [13]. The prototype filter is characterized by a constant K, defined in (10).

$$K = \frac{\text{Filter impulse response duration}}{\text{MC symbol duration } T} \qquad (10)$$

Here, the Nyquist criterion, which ensures zero ISI, requires the filter to be split into two halves, and the symmetry condition is met by the squares of the frequency coefficients. The coefficients are derived in [13] (see Table 1 of [13]) using the same criterion. The interpolation formula for the values of the coefficients is given in (11).

$$H(f) = \sum_{k=-(K-1)}^{K-1} H_k\, \frac{\sin\left(\pi \left(f - \frac{k}{MK}\right) MK\right)}{MK \sin\left(\pi \left(f - \frac{k}{MK}\right)\right)} \qquad (11)$$

3.1 Implementation Methods of FBMC

There are two methods to implement FBMC: the extended FFT or frequency spreading (FS) technique and the PPN technique, shown in Fig. 3a and b respectively. In the FS technique, each of the M data symbols fed to the frequency spreading block is spread over 2K−1 inputs of the adjacent iFFT block, so an iFFT block of size KM is used. The output is obtained with a weighted spreading method. The PPN approach is the most widely adopted FBMC implementation; it uses an iFFT block of size M, which is computationally more efficient than the FS method. In PPN-FBMC interleaved phase shifters are added in parallel (in a filter bank form), which reduces the iFFT block size to M instead of KM.

Fig. 3. FBMC implementation methods: a FS-FBMC b PPN-FBMC

In this paper the PPN technique is followed for the implementation of FBMC. The PPN adds more computations to the FFT and iFFT blocks, but the gains it produces, reported in Sect. 4, justify the cost.

3.2 Modulation Scheme with FBMC

After deciding on the filter bank implementation method for FBMC, the next task is to define the modulation scheme. In FBMC, unlike OFDM, orthogonality is needed between neighboring sub channels only, not between all subcarriers, so any modulation scheme can be used with FBMC in half-duplex communication. The full duplex case needs more thought, because the adjacent sub channels must now be taken into account as well. One way is to use the real part of the iFFT input on sub channels with even index and the imaginary part on sub channels with odd index. Because the filter satisfies the Nyquist criterion, this gives a basis for choosing the modulation scheme and increases the throughput. The throughput gain comes from the symmetry of the filter: the imaginary part crosses the time axis at integer multiples of the symbol period, and the real part crosses it at odd multiples of half the symbol period. Offset Quadrature Amplitude Modulation (OQAM) is adopted to fulfill the criteria of full-duplex mode in FBMC [21]. The offset refers to the shift of half a symbol period between the real and the imaginary part of the symbol.

3.3 Key Differences Between the Two Waveforms and Our Analysis

Having studied the Tx and Rx models of OFDM and FBMC, we list their key differences and our analysis in Table 2. These differences are the roadmap for the simulations and for testing the two schemes. They are also the baseline for the design of a new waveform that meets the requirements of 5G applications, of which low latency, SE and loose synchronization place direct demands on the physical layer. Our analysis is validated by the simulations and tests that follow.

Table 2. Summary of the differences between OFDM and FBMC

Parameter | OFDM | FBMC | Analysis
FFT/iFFT | FFT/iFFT used in their standard form, no modifications | Customized FFT/iFFT blocks used | Processing is added to the FFT blocks of FBMC at the cost of more computations than OFDM, which increases the complexity of the overall FBMC system
Guard band/CP | CP required | No CP required | SE is achieved in FBMC
Modulation | Higher order QAM for full duplex, e.g. QAM, 16QAM, 64QAM | Higher order OQAM for full duplex; QAM for half-duplex | OQAM is needed in FBMC to fully exploit the spectrum and to maintain orthogonality between sub channels
Orthogonality | Requires orthogonality in subcarriers | Requires orthogonality in adjacent sub channels only | Orthogonality of neighboring channels is achieved using OQAM
Filters | Window filter generally | PHYDYAS filter, Hermite prototype filter or IOTA filter | Localized frequency coefficients are added in the form of the PHYDYAS filter, which leads to low OOB emissions
Frequency exploitation | Divides the given frequency into a number of subcarriers | Divides the given frequency into a number of sub channels | PPN along with OQAM achieves full throughput in FBMC
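The OQAM staggering described in Sect. 3.2 can be sketched as follows. This is a minimal illustration that assumes the common FBMC/OQAM formulation, in which each complex QAM symbol is split into two real values sent half a symbol period apart and multiplied by a phase factor j^(m+n). It is not necessarily the mapping used in the authors' MATLAB simulation, and all function names are hypothetical.

```python
PHASES = (1, 1j, -1, -1j)   # j ** (m + n), indexed by (m + n) mod 4


def oqam_stagger(qam_grid):
    """qam_grid[m][n] is the QAM symbol on sub channel m at symbol time n.

    Returns real values at twice the symbol rate. Even sub channels send the
    real part first, odd sub channels send the imaginary part first.
    """
    staggered = []
    for m, row in enumerate(qam_grid):
        reals = []
        for symbol in row:
            if m % 2 == 0:
                reals.extend((symbol.real, symbol.imag))
            else:
                reals.extend((symbol.imag, symbol.real))
        staggered.append(reals)
    return staggered


def apply_phase(staggered):
    """Rotate each real value so that neighbors alternate between real and imaginary."""
    return [[value * PHASES[(m + n) % 4] for n, value in enumerate(row)]
            for m, row in enumerate(staggered)]


def oqam_destagger(staggered):
    """Inverse of oqam_stagger: rebuild the complex QAM symbols."""
    grid = []
    for m, row in enumerate(staggered):
        pairs = zip(row[0::2], row[1::2])
        grid.append([complex(a, b) if m % 2 == 0 else complex(b, a) for a, b in pairs])
    return grid


qam = [[1 + 1j, -1 - 1j], [1 - 1j, -1 + 1j], [-1 + 1j, 1 + 1j]]
real_stream = oqam_stagger(qam)
tx_grid = apply_phase(real_stream)
print(tx_grid[0], tx_grid[1])
```

In the resulting grid every purely real value is surrounded, in both time and sub channel index, by purely imaginary values. This is what allows adjacent sub channels to overlap without a CP while still carrying one complex symbol per symbol period.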

4 Simulations and Results

FBMC and OFDM modulations are simulated using MATLAB 2015. The blocks of Fig. 2 are implemented as the simulation algorithm in the same sequence. Several parameters of the two waveforms are simulated and compared: the filters, the subcarriers, the PSD and the BEP. The BEPs are simulated using the expressions derived in [22, 23].

4.1 Filters Comparison

The OFDM and FBMC filters are simulated in both the time domain and the frequency domain. The coefficient values used for the PHYDYAS filter are taken from Table 1 of FBMC Physical Layer: A Primer, which is the fundamental document for the FBMC model [13]. The parameters for the filters, subcarriers and PSD are listed in Table 3. For the filter comparison the number of channels M is set to 24 to visualize the difference between the side lobes of the two filters; if M were the same as in the BER simulations, the side lobe difference would not be clear.

Table 3. Parameters for filters, subcarriers and PSD

Parameter | OFDM | FBMC
M | 24 | 24
Modulation | QAM | OQAM
Modulation order | 64 | 64
Sub carrier | 7 | 7
k | – | 4

Figure 4 shows the OFDM window filter and the FBMC PHYDYAS filter in the time domain. The filter used for OFDM is generally a window filter, which is applied to a band of subcarriers to remove clipping. In FBMC a PHYDYAS filter is designed and applied to each subcarrier before it is transmitted. Figure 5 compares the filters in the frequency domain; the side lobes of the FBMC filter are clearly much lower than those of the OFDM filter. Because of its very low OOB radiation, FBMC experiences less ICI than OFDM and is well suited to cognitive spectrum use.
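The side lobe comparison can be reproduced approximately with the sketch below. It is an illustration under stated assumptions, not the authors' MATLAB code. The K = 4 frequency coefficients are the values commonly quoted from the primer [13] and should be checked against that document. The impulse response uses the usual cosine-sum form built from those coefficients, the OFDM filter is taken to be a rectangular window of length M, and M = 24 follows Table 3. All function names are hypothetical.

```python
import cmath
import math

# Assumed K = 4 coefficients, as commonly quoted from the primer [13].
PHYDYAS_K4 = (1.0, 0.971960, math.sqrt(2) / 2, 0.235147)


def phydyas_impulse_response(M, coeffs=PHYDYAS_K4):
    """Prototype filter of length K*M built from the frequency coefficients."""
    K = len(coeffs)
    length = K * M
    return [1 + 2 * sum((-1) ** k * coeffs[k] * math.cos(2 * math.pi * k * n / length)
                        for k in range(1, K))
            for n in range(length)]


def response_db(h, f):
    """Magnitude response at normalized frequency f, relative to the response at f = 0."""
    value = abs(sum(h_n * cmath.exp(-2j * math.pi * f * n) for n, h_n in enumerate(h)))
    return 20 * math.log10(max(value / abs(sum(h)), 1e-12))


def worst_sidelobe_db(h, M, points=400):
    """Largest response from 1.5 subcarrier spacings away up to f = 0.5."""
    start = 1.5 / M
    step = (0.5 - start) / points
    return max(response_db(h, start + i * step) for i in range(points + 1))


M = 24                                   # number of channels, as in Table 3
rect_db = worst_sidelobe_db([1.0] * M, M)
phydyas_db = worst_sidelobe_db(phydyas_impulse_response(M), M)
print(f"rectangular: {rect_db:.1f} dB, PHYDYAS: {phydyas_db:.1f} dB")
```

Under these assumptions the rectangular window still leaks at roughly -13 dB one and a half subcarrier spacings away from the center, while the prototype filter is tens of dB lower over the same range. This matches the qualitative difference described for Fig. 5.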

Fig. 4. Filters impulse response in time domain (curves: FBMC, OFDM; x-axis: normalized time; y-axis: PSD)

Fig. 5. Filters impulse response in frequency domain (curves: FBMC, OFDM; x-axis: normalized frequency; y-axis: magnitude in dB)

4.2 Subcarriers Comparison

The subcarriers of the two competitors are simulated using the same parameters as in the filters comparison; see Figs. 6 and 7 for the simulation results. The x-axis represents the normalized frequency and the y-axis the PSD. ICI in OFDM is greater than in FBMC. Moreover, the PSD curves of the subcarriers show that the side lobes of OFDM carry more power than the side lobes of FBMC. For an energy efficient system the side lobes must have very small PSD. Figure 8 compares the PSD of FBMC and OFDM and shows a visible difference between the two waveforms. The PSD results validate our analysis: when the PHYDYAS filter is applied in FBMC, the side lobes of the subcarriers are suppressed and do not interfere with the adjacent sub channel lobes, unlike OFDM subcarriers.

Fig. 6. OFDM subcarriers (x-axis: normalized frequency; y-axis: PSD)

Fig. 7. FBMC subcarriers (x-axis: normalized frequency; y-axis: PSD)

4.3 BEP of OFDM and FBMC in AWGN Channel

In the AWGN channel OFDM is simulated without CP, because the AWGN channel has no multipath fading and a CP is therefore not required. The BEP expression for the AWGN channel is taken from [23] and given in (12), where P_b is the bit error probability. The parameters for the BER simulations are listed in Table 4. They differ from the parameters used in the earlier simulations so as to make the curves and results clearer.

$$P_b \approx \frac{\sqrt{I} - 1}{\sqrt{I}\, \log_2 \sqrt{I}}\; \mathrm{erfc}\left[\sqrt{\frac{3 \log_2(I)\, E_b}{2 (I - 1) N_0}}\right] \qquad (12)$$
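The expression in (12) is easy to evaluate directly. The sketch below assumes that I is the modulation order of a square QAM constellation (64 in Table 4) and that E_b/N_0 is given in dB. The function name is hypothetical and the code is not the authors' simulation.

```python
import math


def qam_bep_awgn(ebn0_db, order):
    """Approximate bit error probability of square QAM in AWGN, following Eq. (12)."""
    ebn0 = 10 ** (ebn0_db / 10)
    root = math.sqrt(order)
    scale = (root - 1) / (root * math.log2(root))
    argument = math.sqrt(3 * math.log2(order) * ebn0 / (2 * (order - 1)))
    return scale * math.erfc(argument)


for ebn0_db in (0, 5, 10, 15, 20):
    print(ebn0_db, qam_bep_awgn(ebn0_db, 64))
```

For order 4 the expression reduces to 0.5 erfc(sqrt(E_b/N_0)), the well-known result for QPSK, which is a useful sanity check on the reconstruction. Because (12) depends only on the constellation and on E_b/N_0, it gives the same value for OFDM and FBMC, in line with the nearly identical AWGN curves reported for the two waveforms.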

Fig. 8. PSD of OFDM and FBMC (curves: OFDM, FBMC, PSD difference; x-axis: normalized frequency; y-axis: power spectral density in dB)

Figure 9 indicates that in the AWGN channel OFDM and FBMC show almost the same BEP. This supports the fact that the CP exists to cater for multipath fading, which does not occur in the AWGN channel. It also motivates simulating the waveforms in more realistic channels to investigate their performance and see a visible difference.

Fig. 9. BER of OFDM and FBMC in AWGN channel

Table 4. Parameters for BEP simulations

Parameter | OFDM | FBMC
M | 32 | 32
Modulation | QAM | OQAM
Modulation order | 64 | 64
CP | 1/4 carrier spacing | 0
Symbols | 3 | 7
K | – | 6
Carrier frequency | 2.5 GHz | 2.5 GHz
Sub carrier spacing | 15 kHz | 15 kHz

4.4 BEP of OFDM and FBMC in Vehicular Channel A and Pedestrian Channel B

Vehicular Channel Model A is a fast fading channel and Pedestrian Channel Model B is a slow fading channel. The parameters for the BEP simulations under these channels are the same as for the AWGN channel. The BEP is now simulated using a CP for OFDM with a one-tap equalizer. The channel models are defined by the ITU standard [24] and are also shown in Table 5. The objects for the channel models are available in MATLAB under the 802.11g libraries. The equations for the BEP simulations are taken from [22].

Table 5. ITU parameters for vehicular channel model A and pedestrian channel model B

Tap | Vehicular A relative delay (ns) | Vehicular A average power (dB) | Pedestrian B relative delay (ns) | Pedestrian B average power (dB)
1 | 0 | 0 | 0 | 0
2 | 310 | −1 | 200 | −0.9
3 | 710 | −9 | 800 | −4.9
4 | 1090 | −10 | 1200 | −8.0
5 | 1730 | −15 | 2300 | −7.8
6 | 2510 | −20 | 3700 | −23.9

Figures 10 and 11 show the BEP under Vehicular Channel Model A; in Fig. 10 the BEP is plotted against velocity and in Fig. 11 against SNR. The results indicate that FBMC performs better than OFDM in the mobile channel. CP-OFDM shows a lower BEP than FBMC at low velocities, but at higher velocities FBMC takes the lead. CP-OFDM achieves a low BEP at low velocities because the CP fully compensates for multipath fading, but as the velocity increases the PHYDYAS filter in FBMC shows its benefit. This indicates that when the channel becomes dominated by Doppler shift (the frequency change caused by motion, which makes the channel vary in time), FBMC performs better than CP-OFDM due to its PHYDYAS filter. OFDM without CP performs worst in both fading channels.
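To put the two channels in perspective, the sketch below computes the maximum Doppler shift for the velocities quoted in the figure captions (500 km/h and 10 km/h) at the 2.5 GHz carrier of Table 4, and compares it with the 15 kHz subcarrier spacing. It also checks that a CP of one quarter of the useful symbol covers the longest delays in Table 5. The CP length relative to the useful symbol is an assumption based on the 1/4 ratio used in the text, and all names are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0   # m/s


def max_doppler_hz(speed_kmh, carrier_hz):
    """Maximum Doppler shift f_d = v * f_c / c for a terminal moving at speed_kmh."""
    return speed_kmh / 3.6 * carrier_hz / SPEED_OF_LIGHT


CARRIER_HZ = 2.5e9               # Table 4
SUBCARRIER_SPACING_HZ = 15e3     # Table 4

fd_vehicular = max_doppler_hz(500, CARRIER_HZ)
fd_pedestrian = max_doppler_hz(10, CARRIER_HZ)
normalized_vehicular = fd_vehicular / SUBCARRIER_SPACING_HZ
normalized_pedestrian = fd_pedestrian / SUBCARRIER_SPACING_HZ

# Assumed CP: one quarter of the useful symbol duration 1 / spacing.
cp_duration_s = 0.25 / SUBCARRIER_SPACING_HZ
max_delay_vehicular_a_s = 2510e-9    # last tap of Vehicular A in Table 5
max_delay_pedestrian_b_s = 3700e-9   # last tap of Pedestrian B in Table 5

print(f"Doppler: {fd_vehicular:.0f} Hz vs {fd_pedestrian:.1f} Hz")
print(f"Doppler / spacing: {normalized_vehicular:.3f} vs {normalized_pedestrian:.4f}")
print(f"CP: {cp_duration_s * 1e6:.2f} us")
```

At 500 km/h the Doppler shift is close to 8% of the subcarrier spacing, whereas at 10 km/h it is about fifty times smaller. Under the assumed CP length the prefix is longer than the largest tap delay of both channels, which is consistent with the observation that delay spread is handled well by CP-OFDM and that the difference between the waveforms appears when Doppler dominates.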

Fig. 10. BEP curves, 500 km/h velocity

Fig. 11. BEP vs. SNR in vehicular channel, 500 km/h (curves: OFDM, CP-OFDM, FBMC; x-axis: signal-to-noise ratio in dB; y-axis: bit error probability)

Figure 12 shows the BEP curves for Pedestrian Channel Model B. As the Pedestrian channel is a slow fading channel, FBMC and CP-OFDM show nearly the same BEP curves, as in the case of the AWGN channel. Figures 11 and 12 show that FBMC performs better in the slow fading channel than in the fast fading channel.


Fig. 12. BEP vs. SNR in pedestrian channel, 10 km/h velocity

5 Conclusion

This work started by listing the key drivers for 5G and the motivation to redesign its physical layer. The concepts behind the FBMC and OFDM modulation schemes were developed, along with the differences between their transceivers, supported by expressions. The respective filters were simulated in both the time and the frequency domain and their PSD was visualized. The BEP was simulated in the AWGN channel, a fast fading Vehicular channel and a slow fading Pedestrian channel. The results show that, compared with OFDM, FBMC has much lower OOB radiation, an almost identical BEP in the AWGN channel and a very good BEP in the fast fading (Vehicular) and slow fading (Pedestrian) channels. This makes FBMC an ideal candidate waveform for 5G. The proposed work achieved very good bit error ratios for FBMC at the cost of adding more computations to the system in the form of filters and the PPN.

6 Future Work

The 5G waveform is to be standardized in 2020. This paper concludes that FBMC offers much lower OOB radiation. In cognitive systems, where white spectrum is to be exploited, waveforms that are well localized in frequency are desired. FBMC is one of the best schemes to consider, as it has localized frequency coefficients and high spectrum efficiency.

Acknowledgements. This research paper is an outcome of the Masters research work of Aamina Hassan. The authors are grateful to the National University of Sciences & Technology (NUST), Islamabad, Pakistan, for the motivation, encouragement and support throughout the research process.

References

1. Wu, Y., Zou, W.Y.: Orthogonal frequency division multiplexing: a multi-carrier modulation scheme. IEEE Trans. Consum. Electron. 41, 392–399 (1995)
2. Koffman, I., Roman, V.: Broadband wireless access solutions based on OFDM access in IEEE 802.16. IEEE Commun. Mag. 40, 96–103 (2002)
3. Andrews, J.G., Buzzi, S., Choi, W., Hanly, S.V., Lozano, A., Soong, A.C.K., Zhang, J.C.: What will 5G be? IEEE J. Sel. Areas Commun. 32, 1065–1082 (2014)
4. Armstrong, J.: OFDM for optical communications. J. Lightwave Technol. 27(3), 189–204 (2009)
5. Mahmud, Z., Hossain, M.S., Islam, M.N., Abdullah, M.I.: Comparative study of PAPR reduction techniques in OFDM. ARPN J. Syst. Softw. 1 (2011)
6. Bisht, M., Joshi, A.: Various techniques to reduce PAPR in OFDM systems: a survey. Int. J. Signal Process. Image Process. Pattern Recogn. 8, 195–206 (2015)
7. Saltzberg, B.: Performance of an efficient parallel data transmission system. IEEE Trans. Commun. Technol. 15, 805–811 (1967)
8. Farhang-Boroujeny, B.: Filter Bank Multicarrier Modulation: A Waveform Candidate for 5G and Beyond. ECE Department, University of Utah, Salt Lake City, UT 84112, USA (2014)
9. Farhang-Boroujeny, B.: Filter Bank Multicarrier (FBMC): An Integrated Solution to Spectrum Sensing and Data Transmission in Cognitive Radio Networks, USA (2009)
10. Michailow, N., Matthé, M., Gaspar, I.S., Caldevilla, A.N., Mendes, L.L., Festag, A., Fettweis, G.: Generalized frequency division multiplexing for 5th generation cellular networks. IEEE Trans. Commun. 62, 3045–3061 (2014)
11. Gerzaguet, R., et al.: The 5G candidate waveform race: a comparison of complexity and performance. EURASIP J. Wirel. Commun. Netw. 13 (2017)
12. Schaich, F., Wild, T.: Waveform contenders for 5G: OFDM vs. FBMC vs. UFMC. In: 6th International Symposium on Communications, Control and Signal Processing (ISCCSP) (2014)
13. Bellanger, M., et al.: FBMC Physical Layer: A Primer. http://www.ict-phydyas.org (2010)
14. Karjaluoto, H.: An Investigation of Third Generation (3G) Mobile Technologies and Services. University of Jyväskylä, October 2006
15. Reddy, M.H., Jaswanth, S., Pramod, N.V.: Evolution of mobile networks: from 1G to 4G. Adv. Res. Electr. Electron. Eng. 3(4), 307–310 (2016)
16. Shah, D.C., Rindhe, B.U., Narayankhedkar, S.K.: Effects of cyclic prefix on OFDM system. In: Proceedings of the ICWET 2010 International Conference & Workshop on Emerging Trends in Technology, India, January 2010
17. Pandharipande, A.: Principles of OFDM. IEEE Potentials 21, 16–19 (2002)
18. Sadouki, B.R., Chaker, H., Djebbouri, M.: The effect of multipath on the OFDM system. Int. J. Comput. Appl. 89(13) (2014). ISSN 0975-8887
19. Gangwar, A., Bhardwaj, M.: An overview: peak to average power ratio in OFDM system & its effect. Int. J. Commun. Comput. Technol. 1, 22–25 (2012)
20. Baltar, L.G., Waldhauser, D.S., Nossek, J.A.: Out-of-band radiation in multicarrier system: a comparison, Germany (2009)
21. Sahin, A., Guvenc, I., Arslan, H.: A survey on multicarrier communications: prototype filters, lattice structures, and implementation aspects. IEEE Commun. Surv. Tutor. 16(3), 1312–1338 (2014)
22. Nissel, R., Rupp, M.: OFDM and FBMC-OQAM in doubly-selective channels: calculating the bit error probability. IEEE Commun. Lett. 21, 1297–1300 (2017)
23. He, Q., Schmeink, A.: Comparison and evaluation between FBMC and OFDM systems. In: 19th International ITG Workshop on Smart Antennas, WSA 2015, 3–5 March 2015
24. Draft 802.20 Permanent Document: Channel Models for IEEE 802.20 MBWA System Simulations. IEEE (2003)

Mitigating the Nonlinear Optical Fiber Using Dithering and APD Coherent Detection on Radio Over Fiber

Fakhriy Hario, Sholeh H. Pramono, Eka Maulana, and Sapriesty Nainy Sari

Department of Electrical Engineering, Faculty of Engineering, Universitas Brawijaya, Malang, Indonesia

Abstract. Nonlinearity is one of the major impairments in the RoF (radio-over-fiber) scheme. In this scheme, optical fiber is an essential medium for transmitting data. Data transmission with high capacity and density is highly vulnerable to nonlinear effects. To overcome this problem, we propose the use of HFD (high-frequency dithering) and an APD (avalanche photodiode) to reduce the nonlinear effect on both the transmitter and the receiver side. This study places more emphasis on the SPM (self-phase modulation) and GVD (group velocity dispersion) nonlinear types. The results show an increase in the amplitude unit (a.u.) when the APD component is used in the coherent detection (CD). The maximum amplitude unit (a.u.) generated was 500 k and the minimum was −70 k at an optical fiber length of 10 km. Longer optical fiber lengths led to greater losses. Compared with the scheme without an APD on the receiving side, the results show a fivefold increase at an optical fiber length of 10 km.

Keywords: RoF · Dithering · APD · Nonlinear

1 Introduction

In fiber optic transmission, nonlinear effects must almost always be taken into account, and avoided where possible, when designing transmission systems based on optical fiber. In some conditions, the nonlinear effect is more significant than the attenuation or distortion of the channel. Many studies have therefore designed systems to be more resistant to nonlinear effects. The effects considered in this study are SPM (self-phase modulation) and GVD (group velocity dispersion) [1, 2]. Many studies have focused on methods to overcome the signal degradation caused by nonlinear SPM, XPM (cross-phase modulation), and FWM (four-wave mixing) in optical fiber communication systems [3, 4]. This study proposes a modified scheme to suppress these effects, especially SPM and GVD. It focuses on the reduction of the pulse width and phase shift of signals in a single channel for long-distance transmission over SMF (single-mode fiber) [5–9].

In previous studies [10–12], the developed methods focused on the transmitter side of the modulator, with the signal modulation process using the high-frequency dithering scheme (fd), and on the receiver side using coherent detection (CD) with a PIN detector component. This research builds on and extends those previous works [11, 12]. Its contribution, and its limitation, lie in modifying and combining the transmission components. This study retained the transmitter and signal modulation scheme, but on the CD side we used an APD as the photodetector. The aim was to clarify and improve the signal quality at the receiving end. In addition, with the new component on the CD side, the resulting signal, especially its constellation, was expected to be better than with the previous CD components.

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 68–74, 2020. https://doi.org/10.1007/978-3-030-12388-8_5

2 Method

Figure 1 shows the proposed scheme, whose novelty lies in combining an APD with the dithering technique. The APD internally multiplies the primary photocurrent before it enters the amplifier circuit. The avalanche is an electron/hole multiplication mechanism called impact ionization. Each newly generated carrier is also accelerated by the electric field, gaining enough energy to cause further impact ionization. Below the breakdown voltage, the number of generated carriers is finite; above the breakdown voltage, the number of generated carriers can grow without bound.
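The finite-versus-unbounded behaviour around the breakdown voltage can be illustrated with Miller's empirical relation for the multiplication factor, M = 1 / (1 − (V/V_BR)^n). This is only a minimal sketch and is not taken from the paper: the 100 V breakdown voltage, the exponent n, and the function name `apd_gain` are illustrative assumptions.

```python
import math


def apd_gain(v_bias, v_breakdown, n=3.0):
    """Multiplication factor from Miller's empirical relation (assumed model).

    n is a device-dependent fitting exponent; 3.0 is only a placeholder.
    Returns math.inf at or above breakdown, where the model diverges.
    """
    if v_bias < 0 or v_breakdown <= 0:
        raise ValueError("bias must be non-negative and breakdown voltage positive")
    ratio = v_bias / v_breakdown
    if ratio >= 1.0:
        return math.inf
    return 1.0 / (1.0 - ratio ** n)


# Hypothetical 100 V breakdown: the gain stays finite below it and diverges at it.
for v in (0.0, 50.0, 90.0, 99.0, 100.0):
    print(f"V = {v:5.1f} V  ->  M = {apd_gain(v, 100.0):.2f}")
```

Under these assumptions the gain is 1 at zero bias, rises steeply as the bias approaches breakdown, and is reported as infinite at breakdown, which mirrors the qualitative statement above.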

Fig. 1. RoF block diagram using high-frequency dithering and APD receiver (block labels: LASER, PIN, MODULATOR, FREQUENCY DITHERING, EDFA, APD (COHERENT DETECTION))

Dithering is the process of injecting a periodic signal (noise) into a linear or nonlinear system for a specific purpose, such as adding linear characteristics to an open/closed-loop system. Dithering can also be performed by varying the amplitude and frequency to find the dithering signal characteristics that provide a random (Gaussian) signal distribution. The dither can be an analog or digital signal, and random or deterministic. The dithering signal source can be a sine, triangle, or square wave generator. For a nonlinear characteristic, the input and output quantities are related by $y = f_{NL}(x)$. Figure 2 shows the block diagram of the dithering system, where $d(t)$ is the continuous-time dithering signal and $x(t)$ and $y(t)$ are the input and output signals of the nonlinear system. The system output can be expressed as follows [48, 72].

Fig. 2. The nonlinear single frequency with fd component after high-frequency dithering (signals: r_g(t), d(t), x(t), z(t), y(t); blocks: Σ, f_NL(·), H_1(jω) with impulse response h_1(t))

$$y(t) = h_1(t) \ast f_{NL}\big(x(t)\big) = h_1(t) \ast f_{NL}\big(r_g(t) + d(t)\big) \qquad (1)$$

where $h_1(t)$ is the impulse response of the filter $H_1(j\omega)$, that is, the time-domain counterpart of $H_1(j\omega)$.

The transmission side was divided into two parts, namely signal generation and RTO (radio-to-optical transmission). The parameters of the radio and laser signal transmission were the roll-off factor, linewidth, optical launch power (OLP), and modulation index (IM), measured on the CW laser. In the RTO section, the parameters were the phase, input voltage, and attenuation of the LiNbO3 MZM modulator. In the dithering scheme on the transmission side, the component used was an FM modulator, characterized by its frequency deviation and a sinusoidal input signal. The parameters of the optical fiber medium were the fiber type, SMF (single-mode fiber), the nonlinear index, the attenuation, and the power measured at the pumping laser and pumping coupler. On the receiver side, the components were the local oscillator and a CD stage consisting of multiple APDs.
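The dithering model of Eq. (1) can be sketched numerically. The example below is a hedged illustration, not the paper's simulation setup: it assumes a hard limiter as the nonlinearity f_NL, a triangle-wave dither d(t), and a simple average over one dither period as the filter h1. All function names and parameter values are hypothetical.

```python
def f_nl(x):
    """Assumed memoryless nonlinearity f_NL: a hard limiter (sign function)."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0


def triangle_dither(n, amplitude, period):
    """Sample n of a triangle-wave dither d(t) with the given peak amplitude."""
    phase = (n % period) / period  # position inside one period, in [0, 1)
    return amplitude * (4.0 * abs(phase - 0.5) - 1.0)


def dithered_output(r_g, amplitude=1.0, period=1000):
    """y = h1 * f_NL(r_g + d), with h1 taken as an average over one dither period."""
    z = [f_nl(r_g + triangle_dither(n, amplitude, period)) for n in range(period)]
    return sum(z) / period


for r_g in (-0.5, 0.0, 0.3, 0.8):
    print(f"r_g = {r_g:+.1f}: undithered {f_nl(r_g):+.1f}, dithered {dithered_output(r_g):+.3f}")
```

For inputs smaller in magnitude than the dither amplitude, the averaged output follows r_g divided by the amplitude, so the hard limiter behaves approximately linearly once the dither is filtered out. This is the "adding linear characteristics" effect described for dithering above.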

3 Results and Analysis

The overall design of the RoF (radio over fiber) system with the new model was based on the LiNbO3 MZM modulator and the dithering technique, together with the existing pumping components and a coherent detection scheme using an APD on the receiver side. Measurements and observations were made at several TPs (test points), which were classified into three basic blocks. The first was the RTO (radio-to-optical transmission) block, the second was the optical link (optical fiber), and the third was the OTR (optical-to-radio conversion) block. Signals generated by the RoF block were laser signals that carry data in the form of analog signals. The simulation results were sampled and observed at each TP of the three blocks. This paper shows only the measurements at the end of the receiver side, in the form of the signal constellation, which represents signal quality, and the signal spectrum.

The constellation results in Fig. 3 show an increase in amplitude (a.u.) when the APD component is used in the CD. The maximum amplitude generated was 500 k (a.u.) and the minimum was −70 k (a.u.) at an optical fiber length of 10 km. Longer optical fiber lengths led to greater losses. Compared with the scheme without an APD on the receiving side, the results show a fivefold increase at an optical fiber length of 10 km. In addition, the spectrum observations in Fig. 4 show a power increase of 10 dBm at an optical fiber length of 10 km.

Fig. 3. Constellation after using the proposed scheme, (a) 10 km, (b) 50 km, (c) 100 km.

Fig. 4. Spectrum generated from the proposed scheme, (a) 10 km, (b) 50 km, (c) 100 km.

Figure 5 shows that, without the APD, the maximum amplitude was 100 k (a.u.) at an optical fiber length of 10 km and 20 k (a.u.) at 100 km. The power received in the three cases and in the proposed scheme had the same (a.u.), but with different constellations. With only one modulator, the noise level and the resulting bit error were larger, and the single modulator causes the signal to be incoherent. Moreover, the noise and phase errors arising from the pumping worsen as the optical fiber length increases. A signal sent by an ideal transmitter is represented at the receiving end by constellation points at their ideal locations. However, various imperfections in the implementation (distortion, noise, loss, etc.) cause the actual constellation points to deviate from the ideal locations. One stage of the demodulation process is the phase shift. This results in an I-Q (in-phase/quadrature) stream which can be used as a reasonably reliable estimate of the ideal transmitted signal.

Fig. 5. The generated constellation without using APD on CD, (a) 10 km, (b) 50 km, (c) 100 km.

Figure 6 shows the relation between distance and received power (mW), compared with previous works. The received power achieved at 10, 50, and 100 km was 0.001, 0.01, and 0.0001 mW, respectively.

Fig. 6. Distance versus power received for various fiber spans.
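Since the spectrum results are quoted in dBm while Fig. 6 uses mW, it can help to convert between the two. The snippet below is a small sketch of the standard conversion, dBm = 10·log10(P / 1 mW), applied to the three received-power values reported above. The function name `mw_to_dbm` and the dictionary layout are illustrative choices, not part of the paper.

```python
import math


def mw_to_dbm(power_mw):
    """Convert a power in milliwatts to dBm (decibels relative to 1 mW)."""
    if power_mw <= 0:
        raise ValueError("power must be positive")
    return 10.0 * math.log10(power_mw)


# Received power per fiber length (km -> mW), as reported for Fig. 6.
received_mw = {10: 0.001, 50: 0.01, 100: 0.0001}

for length_km, power_mw in received_mw.items():
    print(f"{length_km:3d} km: {power_mw} mW = {mw_to_dbm(power_mw):.1f} dBm")
```

On this scale each factor of ten in mW corresponds to a 10 dB step.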

The measurement in this study used the nonsubtractive dither (NSD) technique; thus, the dither was not subtracted on the output side. In NSD settings, dithering is generally applied in the frequency domain on unused portions of the spectrum, which allows the dither signal to be used to separate the noise. The fundamental analysis of the dithering system is obtained by describing the NSD process on the digital signal mathematically, without removing the quantization process. Dithering adds amplitude to all signal samples in the digital process, which keeps the amplitude value below the next threshold level. The next higher signal amplitudes are the sum of the dither noise and the previous amplitude. In this system, dithering is used more specifically to suppress the dynamic behaviour of the system; it can suppress unwanted (nonlinear) characteristics of the system or attenuate resonance spikes. The dithering technique varies the amplitude and frequency to obtain dithering signal characteristics that provide a random signal distribution. With this technique, the signal broadening caused by non-constant amplitude is damped by varying the amplitude and frequency.

This study used an APD in the CD stage of the receiver. The APD has an internal gain obtained from a high electric field that generates electrons and holes. These electrons and holes can ionize, through collisions, the electrons bound in the valence band as a result of the nonlinear properties of light propagation in optical fibers. In addition, the added electrons and holes increase the energy available for the ionization process. The APD has a higher gain than the PIN diode. The results show that the ionization process can help release the electrons that are bound due to the nonlinear effect. The resulting output energy (a.u.) increased compared with the PIN diode. There was no significant change in the constellation, and the linear canonical properties obtained with the PIN diode were still fulfilled. In addition, the process of minimizing the free electrons caused by nonlinear effects was faster with the APD than with the PIN diode.

The combination of two modulators and the dithering technique can be used to solve nonlinear distortion problems by exploiting noise effects in high-speed, high-capacity data transmission. Improvements in transmission are essential to produce good laser quality. In this study, we used the dithering technique and an additional LiNbO3 MZM modulator to reduce the harmonics that arise from the nonlinear effect as the laser light propagates through the optical fiber. The essence of the problem lies in the characteristics of the laser light in an optical fiber. In the laser, the interaction between atoms and photons occurs because of the diversity of the energy distributed in the laser.
One characteristic of SPM is the occurrence of chirp in the system, which generates new frequencies and causes spectral broadening. In GVD, signal spreading is caused by non-constant signal sources and amplitude. Both GVD and SPM broaden the spectrum of the propagating pulse. When the spectral broadening is caused by SPM with increasing chirp, the chirped pulse spreads faster than under GVD alone when $\beta_2 > 0$. This behaviour stems from the interplay between the positive and negative GVD regimes: under the $\beta_2 < 0$ condition, GVD acts on the chirp produced by SPM.
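A common way to judge which of the two effects dominates over a given span is to compare textbook length scales: the effective length L_eff = (1 − e^(−αL))/α, the dispersion length L_D = T0²/|β2|, and the nonlinear length L_NL = 1/(γ·P0). The sketch below evaluates them for the three fiber lengths studied here. The fiber parameters (0.2 dB/km, β2 = −21.7 ps²/km, γ = 1.3 W⁻¹km⁻¹), the 25 ps pulse width, and the 1 mW launch power are assumed typical values for standard SMF near 1550 nm. They are not values reported in the paper.

```python
import math


def effective_length_km(length_km, alpha_db_per_km=0.2):
    """Effective nonlinear interaction length for a lossy fiber."""
    alpha = alpha_db_per_km * math.log(10) / 10.0  # convert dB/km to 1/km
    return (1.0 - math.exp(-alpha * length_km)) / alpha


def dispersion_length_km(t0_ps, beta2_ps2_per_km=-21.7):
    """Distance over which GVD noticeably broadens a pulse of width t0."""
    return t0_ps ** 2 / abs(beta2_ps2_per_km)


def nonlinear_length_km(p0_w, gamma_per_w_km=1.3):
    """Distance over which SPM accumulates about one radian of phase."""
    return 1.0 / (gamma_per_w_km * p0_w)


for span_km in (10, 50, 100):
    print(f"L = {span_km:3d} km -> L_eff = {effective_length_km(span_km):.2f} km")

print(f"L_D  = {dispersion_length_km(25.0):.1f} km (assumed 25 ps pulse)")
print(f"L_NL = {nonlinear_length_km(1e-3):.0f} km (assumed 1 mW launch power)")
```

Under these assumptions the effective length saturates at roughly 1/α, which is why extending the span from 50 km to 100 km adds little extra nonlinear interaction but still adds loss and dispersion.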

4 Conclusion

RoF performance using a combination of high-frequency dithering, laser pumping, and a coherent detection receiver with an APD component showed a significant effect on the generated energy and power, but no significant effect on the shape of the constellation. This means that modifying the system on the receiving side is essential to obtain better performance. The APD helps the ionization of the colliding free electrons caused by the nonlinear effects. Future work may develop other dithering methods, as well as injection algorithms in the APD or in other detectors, to correct data errors. With this scheme, the capacity of the system can be increased.

Acknowledgment. This research was carried out as a collaboration between the Laboratory of Telecommunication, Universitas Brawijaya, Indonesia, and Universitas Gadjah Mada, Indonesia.

References

1. Eason, G., Noble, B., Sneddon, I.N.: On certain integrals of Lipschitz-Hankel type involving products of Bessel functions. Phil. Trans. Roy. Soc. London A247, 529–551 (1955)
2. Maxwell, J.C.: A Treatise on Electricity and Magnetism, vol. 2, 3rd edn, pp. 68–73. Clarendon, Oxford (1892)
3. Jacobs, I.S., Bean, C.P.: Fine particles, thin films and exchange anisotropy. In: Rado, G.T., Suhl, H. (eds.) Magnetism, vol. III, pp. 271–350. Academic, New York (1963)
4. Wang, Y., Yu, J., Li, X., Xu, Y., Chi, N., Chang, G.K.: Photonic vector signal generation employing a single-drive MZM-based optical carrier suppression without pre-coding. J. Lightwave Technol. 33(24), 5235–5241 (2015)
5. Kanesan, T., Pang, W., Ghassemlooy, Z., Lu, C.: Impact of optical modulators in LTE RoF systems with nonlinear compensator for enhanced power budget. In: Proceedings of Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference, California, United States, pp. 1–3, March 2013
6. Kanesan, T., Pang, W., Ghassemlooy, Z., Lu, C.: Investigation of optical modulators in optimized nonlinear compensated LTE RoF system. J. Lightwave Technol. 32(23), 1944–1950 (2014)
7. Kanesan, T., Pang, W., Ghassemlooy, Z., Perez, J.: Optimization of optical modulators for LTE RoF in nonlinear fiber propagation. IEEE Photonics Lett. 24(7), 617–619 (2012)
8. North, T., Rochette, M.: Analysis of self-pulsating sources based on regenerative SPM: ignition, pulse characteristics and stability. J. Lightwave Technol. 31(23), 3700–3706 (2013)
9. Jiao, Z., Zhang, R., Zhang, X., Liu, J., Lu, Z.: Modeling of single-section quantum dot mode-locked lasers: impact of group velocity dispersion and self phase modulation. J. Lightwave Technol. 49(12), 1008–1015 (2013)
10. Partiansyah, F.H., Susanto, A., Mustika, I.W., Idrus, S.M., Purnomo, S.H.: Dithering analysis in an orthogonal frequency division multiplexing-radio over fiber link. Int. J. Electr. Comput. Eng. 6(3), 1112–1121 (2016)
11. Khair, F., Partiansyah, F.H., Mustika, I.W., Setiyanto, B.: Performance analysis of digital modulation for coherent detection of OFDM scheme on radio over fiber system. Int. J. Electr. Comput. Eng. 6(3), 1086–1095 (2016)
12. Hario, F., Mustika, I.W., Susanto, A., Hadi, S., Idrus, S.M.: A novel scheme for orthogonal frequency division multiplexing-radio over fiber based on modulator and dithering technique: impact of self phase modulation and group velocity dispersion. Int. J. Intell. Eng. Syst. 10(4), 117–125 (2017)

NavAssist-Intelligent Landmark Based Navigation System

Ratnakumar Madhushan and Cassim Farook

Informatics Institute of Technology, Colombo-06, Sri Lanka
{madhushan.2013202,cassim.f}@iit.ac.lk

Abstract. Travelling has become an important part of daily life. Finding the route to a location in an unfamiliar area when only its name is known is very difficult. Even regular navigation instructions can make travelling complicated when the traveler is unfamiliar with the route and its surroundings. Humans, however, are able to recognize and remember routes based on landmarks and their visual attributes. As a solution, this research proposes suggesting navigation routes based on significant landmarks that users can easily identify because of their visual prominence. This helps users reach a destination with less hassle. Along with the landmark-based route, the shortest path is also suggested for users who do not need landmarks to navigate. The system uses OpenStreetMap and GeoServer to display the route from one point to another, and is delivered as a web-based solution. It aims to make navigation much easier for travelers, especially those travelling to a new city.

Keywords: Spatial temporal systems · Geographic information systems · Short path · Route planning · OSM · Landmark based route suggestion

1 Introduction

Navigation, travelling from one place to another, has always been a major concern for human beings. Many GIS (Geographical Information System) applications, such as Google Maps, Bing Maps and Apple Maps, were developed to help people navigate efficiently and effectively. People use mobile phones, GPS devices, tablets, desktops and other smart devices to aid them in modern-day navigation [1]. To make navigation even simpler and smarter, this research proposes a landmark-based navigation system that shows the user both the shortest path and a short path with the maximum number of landmarks. This caters to the needs of both a traveler who is new to a city and a traveler who already knows it. To fulfill this need, landmarks must be collected and saved in the system. The landmarks are collected through user input, and the system prioritizes them based on their attribute scores.

Covering a large geographical area remains a challenge. To overcome it, a concept known as VGI (Volunteered Geographical Information) is used, in which users upload geographic information themselves. OpenStreetMap, a well-known open-source GIS built on this concept, allows users to edit the map by adding and removing landmarks at will. In the local context of Sri Lanka, however, few landmarks are available, so structures such as buildings, statues, large billboards, shopping malls and other well-known architecture could be made available as landmarks [2].

The distinguishing feature of the proposed system is its landmark data. Users are given the facility to add, remove and update landmarks. These landmarks are displayed on the map, so other travelers or users can identify them and make use of them. Among the prevailing navigation solutions, three widely used applications were chosen for comparison with the proposed solution, as shown in Table 1.

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 75–86, 2020. https://doi.org/10.1007/978-3-030-12388-8_6

Table 1. Comparison of navigation application features

Application   Use of landmarks   Best path suggestion for navigation   Add landmark to map   Traffic data
Google Map    ✗                  ✓                                     ✗                     ✓
Waze          ✗                  ✓                                     ✗                     ✓
Co-pilot      ✗                  ✓                                     ✗                     ✓
NavAssist     ✓                  ✓                                     ✓                     ✗

The paper first discusses the dataset used, then the methodology followed and the testing carried out, and finally presents the conclusion and future enhancements of the proposed system.

2 Dataset

For this research, OpenStreetMap was used as the base dataset. To display the different routes, multiple layers were created and attached to OpenStreetMap. The layers used for representation are described in Table 2 [3]. In addition to these layers, landmarks can be added manually by the user. After validation, the landmark is added to the map.

Table 2. GeoServer layers

Layer name                  Purpose
buffer                      Geometry details of the map
hh_2po_4pgr                 Handles geometry attributes of paths (Fig. 4)
hh_2po_4pgr_vertices_pgr    Holds data of road vertices
hh_2po_vertex               Keeps the count of landmarks in a particular route
landmarks                   Landmark layer; stores landmark data added by the user and shown on the map (Fig. 5)
nodes                       Contains latitude and longitude information; handles the OSM tags of places and maintains the timestamp
planet_osm_point            Holds data of highways, railway tracks, one-way roads, waterways and bridges
point_collection            Landmark information from the OpenStreetMap dataset of Sri Lanka (Fig. 6)
path                        Draws the path between two points over prevailing roads

3 Methodology

This navigation system is divided into three parts: route identification, landmark weightage calculation, and route suggestion. The flow of the system is shown in the high-level diagram in Fig. 1. The user accesses the web page by entering valid credentials; the system is secured with a username and password to prevent unauthorized access. On the map page, the user enters the start point and the destination point. The system obtains the latitude and longitude of both locations and displays the shortest-distance route as the shortest path. In parallel, it checks the distance and landmark contribution of all possible routes and displays the optimum path.

3.1 Landmark Model

The landmark model contains information about the places that are considered significant landmarks. Locations that lack significance are not considered landmarks. Landmarks have certain attributes that contribute to their significance; the height and the horizontal spread of a landmark are considered the main ones. Landmarks fall into two categories: man-made structures and natural entities. Commercial buildings (cafes, supermarkets, shopping malls, theatres), public places (hospitals, banks), religious places (churches, mosques, temples), billboards, clock towers and bridges are examples of man-made landmarks. Rocks, tanks, large trees and streams are examples of natural landmarks. Most landmarks are linked with cultural and social significance. Such landmarks can be identified quickly even by people who have not seen them before, and a person who has seen them before can recall them easily. This shows that the social and cultural significance of a place is an important attribute.

Fig. 1. High level design diagram

Apart from cultural and social significance, the visibility of the landmark is another important attribute. Certain landmarks are visible only by day or only by night, while others are visible both day and night. Therefore, the landmark model considers the height, width, social and cultural significance, and day/night visibility as the main attributes of a place. The equation below expresses the landmark significance [4, 5].

$$\text{landmark significance} = f(\text{Spread}, \text{Height}, \text{Social Significance}, \text{Cultural Significance}, \text{Day/Night visibility})$$

After the important attributes for the landmark model had been identified, the information was added to the system manually, as it is not available for the selected geographic area. To update and obtain landmarks, the volunteered geographic approach is used, but with qualitative measurements instead of exact numerical ones. Height and spread are quantitative measurements, but it is unrealistic to expect users to know the exact height or spread of a landmark; therefore, ordinal measurements are used for height and spread. Each place is given a score, and this score decides whether the place is considered a landmark. The overall weight of a place is calculated from the values given to its different attributes. Compared with social and cultural significance, day/night visibility, height and spread contribute more to the significance of a landmark [5]. The contribution of each attribute to a place was verified with a survey conducted among travelers [4]. The landmark attributes, their values and the assigned weightage are summarized in Table 3 [5].

Table 3. Landmark attributes and weightage

Attribute of landmark    Values and weightage
Social significance      High = 5, Low = 3, None = 0
Cultural significance    High = 5, Low = 3, None = 0
Spread                   High = 10, Medium = 6, Short = 3
Height                   Tall = 10, Medium = 6, Short = 3
Day/night visibility     Both = 10, Day = 6, Night = 3
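The weightage in Table 3 can be turned into a place score as sketched below. This is a minimal illustration under two assumptions that the paper does not state: the overall score is taken as the plain sum of the attribute weights, and the cut-off of 20 for accepting a place as a landmark is a made-up threshold. The dictionary keys and function names are likewise hypothetical.

```python
# Ordinal values and weights as listed in Table 3.
WEIGHTS = {
    "social_significance": {"high": 5, "low": 3, "none": 0},
    "cultural_significance": {"high": 5, "low": 3, "none": 0},
    "spread": {"high": 10, "medium": 6, "short": 3},
    "height": {"tall": 10, "medium": 6, "short": 3},
    "visibility": {"both": 10, "day": 6, "night": 3},
}


def landmark_score(attributes):
    """Sum the Table 3 weights of a place (summation is an assumption)."""
    return sum(WEIGHTS[name][attributes[name]] for name in WEIGHTS)


def is_landmark(attributes, threshold=20):
    """Accept a place as a landmark above a hypothetical score threshold."""
    return landmark_score(attributes) >= threshold


clock_tower = {
    "social_significance": "low",
    "cultural_significance": "high",
    "spread": "short",
    "height": "tall",
    "visibility": "both",
}
print(landmark_score(clock_tower), is_landmark(clock_tower))
```

Because users supply ordinal labels rather than measurements, a lookup table like this keeps the volunteered input simple while still giving visibility, height and spread twice the maximum weight of the social and cultural attributes.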

3.2 Navigation Model

An advanced model of the environment can be built from route directions, which provide a set of procedures and descriptions that help the person using them to traverse that environment.

3.3 Identifying Optimum Path

Completing the landmark model yields the places identified as landmarks, each with a weightage. To find the optimum path, the following SQL statement was used to retrieve the data needed to weight each road segment [6].

    sql := 'select id, km, length, landmark_count from path';

The weight of a segment was obtained by dividing the number of landmarks in that segment by the length of the segment, which is simply the number of landmarks per kilometer:

$$\text{Number of landmarks per kilometer} = \frac{\text{Number of landmarks in a segment}}{\text{Length of a segment}}$$
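The segment weight defined above can be computed per row as in the following sketch. The rows are invented sample data shaped like the result of the query on the path layer (using its `id`, `km` and `landmark_count` columns); the function name is an illustrative choice.

```python
def landmarks_per_km(landmark_count, length_km):
    """Segment weight: number of landmarks divided by segment length in km."""
    if length_km <= 0:
        raise ValueError("segment length must be positive")
    return landmark_count / length_km


# Hypothetical rows, shaped like the output of the query on the path layer.
segments = [
    {"id": 1, "km": 0.5, "landmark_count": 2},
    {"id": 2, "km": 1.5, "landmark_count": 3},
    {"id": 3, "km": 2.0, "landmark_count": 0},
]

for row in segments:
    weight = landmarks_per_km(row["landmark_count"], row["km"])
    print(f"segment {row['id']}: {weight:.2f} landmarks/km")
```

Normalizing by length keeps long segments from looking landmark-rich simply because they are long.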

A method was written to find the shortest path with the maximum number of landmarks. A SQL query was used to count the number of landmarks available in the selected path. Dijkstra's algorithm was used to identify the shortest path between two given points. Therefore, to obtain the shortest path with the maximum number of landmarks, a weighted cost was calculated from the weight of the path [7].

3.4 Navigation and Identifying Turns

Once the optimum path had been derived using the landmark model, the turns along it had to be identified. First, the intersection points between the optimum path and the other paths that meet it were identified [8]. For this, the PostGIS function ST_Touches was used [9], selecting the geometry of paths from the hh_2po_4pgr table. The SQL statement used is shown below.

    sql := 'select geom_way from hh_2po_4pgr where ST_Touches(''' || rec.st_makeline::text || ''', geom_way) AND id not in (select gid from pgr_bestpath(' || x1 || ',' || y1 || ',' || x2 || ',' || y2 || '))';

After querying all the roads that intersect the optimum path, the geometry of the intersection points was extracted and stored in a record [10]. The angle between a midway point and the intersection point on the resulting path was then calculated, and the turns were identified accordingly, as shown in Table 4.

Table 4. Navigation instruction for turns (x is the angle in degrees)

Angle (x)               Identified turn
x > 90 AND x ≤ 180      Left
x > 0 AND x < 90        Right
x > 270 AND x ≤ 360     Steep right
x > 180 AND x ≤ 270     Steep left
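The angle ranges of Table 4 map directly onto a small lookup function. The sketch below assumes the upper bounds at 180, 270 and 360 are inclusive and returns `None` for angles the table does not cover (0, exactly 90, or anything outside 0 to 360). The function name `classify_turn` is hypothetical.

```python
def classify_turn(x):
    """Map an angle x in degrees to a turn instruction following Table 4."""
    if 0 < x < 90:
        return "Right"
    if 90 < x <= 180:
        return "Left"
    if 180 < x <= 270:
        return "Steep left"
    if 270 < x <= 360:
        return "Steep right"
    return None  # angle not covered by Table 4


for angle in (45, 135, 225, 315, 90):
    print(angle, classify_turn(angle))
```

Returning `None` for uncovered angles makes the gap at exactly 90 degrees explicit, so a caller can decide how to handle it.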

Next, landmarks were used to give navigational instructions. This task was again carried out by an algorithm developed using PostGIS functions; ST_DWithin was used to build it, as shown in the SQL statement below [11].

    EXECUTE 'select geom, tags from point_collection where ST_DWithin(''' || intersection::text || ''', geom, 100)' INTO record;

A GeoServer layer was then created and its SQL view edited accordingly; finally, the layer was called from a new OpenLayers layer and the information was displayed on the map. The turns and the landmark-based navigational instructions were displayed as labels created with GeoServer styles, as shown in Figs. 2, 3 and 4.
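The landmark lookup around an intersection can be imitated outside the database as follows. This is a rough sketch, not the PostGIS implementation: it assumes the distance argument of 100 is meant in metres, uses the haversine great-circle distance on latitude/longitude pairs, and works on invented coordinates and landmark names.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def landmarks_near(intersection, landmarks, radius_m=100.0):
    """Names of the landmarks lying within radius_m of the intersection point."""
    lat0, lon0 = intersection
    return [name for name, (lat, lon) in landmarks.items()
            if haversine_m(lat0, lon0, lat, lon) <= radius_m]


# Hypothetical intersection and landmarks (coordinates are made up).
intersection = (6.9000, 79.8600)
landmarks = {
    "Clock tower": (6.9005, 79.8600),    # about 56 m north of the intersection
    "Shopping mall": (6.9020, 79.8600),  # about 222 m north of the intersection
}
print(landmarks_near(intersection, landmarks))
```

A turn instruction can then name the nearby landmark, for example by combining the returned name with the turn derived from Table 4.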

Fig. 2. planet_osm_point layer

Fig. 3. hh_2po_4pgr layer

3.5 Algorithms Used

3.5.1 Dijkstra's Algorithm

Dijkstra's algorithm solves the single-source shortest path problem in graph theory. As shown in Fig. 5, all edges must have non-negative weights and the graph must be connected. The original algorithm gives the length of the shortest path; with a slight modification, the path itself can be obtained [12].

3.5.2 A* Algorithm

A* is a best-first search algorithm: it searches the possible paths to the given destination for the one with the smallest distance or shortest time. Among all the available paths, it first considers the ones that appear to lead most quickly to the destination [13]. It operates on weighted graphs: starting from a particular node, it builds a tree of paths from that node, expanding the paths one step at a time until one of them ends at the destination.
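The following sketch shows Dijkstra's algorithm with the path-recovery modification mentioned above, run with two different edge costs on a toy road graph. The plain cost is the segment length. The landmark-aware cost, length divided by (1 + landmarks per kilometer), is a hypothetical formula chosen only to illustrate how landmark density could lower the cost of a segment; the paper does not specify its weighted cost. Node names and edge data are invented.

```python
import heapq


def build_graph(edges):
    """Build an undirected adjacency list from (a, b, km, landmark_count) tuples."""
    graph = {}
    for a, b, km, landmark_count in edges:
        edge = {"km": km, "landmark_count": landmark_count}
        graph.setdefault(a, []).append((b, edge))
        graph.setdefault(b, []).append((a, edge))
    return graph


def dijkstra(graph, source, target, cost):
    """Return the cheapest path as a list of nodes; cost(edge) must be non-negative."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, edge in graph.get(node, []):
            candidate = dist + cost(edge)
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1]


def by_length(edge):
    return edge["km"]


def by_landmarks(edge):
    # Hypothetical landmark-aware cost: denser landmarks make a segment cheaper.
    return edge["km"] / (1.0 + edge["landmark_count"] / edge["km"])


graph = build_graph([
    ("A", "B", 1.0, 0), ("B", "D", 1.0, 0),  # short route without landmarks
    ("A", "C", 1.2, 3), ("C", "D", 1.2, 3),  # slightly longer, landmark-rich route
])
print("shortest path:", dijkstra(graph, "A", "D", by_length))
print("optimum path: ", dijkstra(graph, "A", "D", by_landmarks))
```

Keeping the cost function as a parameter lets the same search produce both the shortest path and the landmark-based optimum path, which matches the two routes the system displays.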

4 Testing

4.1 Accuracy Testing

Eight test cases were conducted to measure the accuracy of the path suggestion, the suggested landmarks, and the shortest path. As shown in Fig. 6, the sample map result shows the shortest path (blue line) and the optimum path based on landmarks (red line). Based on these results, the accuracy of the solution was derived as shown below.

$$\text{Accuracy} = \frac{\text{Landmarks predicted correctly}}{\text{Expected landmarks}} \times 100$$

$$\text{Accuracy} = \frac{95}{136} \times 100 = 69.85\%$$

Fig. 4. Landmarks layer

Fig. 5. Dijkstra's Algorithm

$$\text{Variance of landmark prediction} = \frac{\sum (\text{Accuracy per test case} - \text{Mean accuracy})^2}{\text{Total expected landmarks} - 1} = \frac{0.2937}{135} = 0.00217$$

Fig. 6. Sample path results

$$\text{Standard deviation} = \sqrt{\text{Variance of landmark prediction}} = \sqrt{0.00217} = 0.04658$$
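The arithmetic of the accuracy figures can be re-traced with a few lines. The sketch uses only the totals reported above (95 correctly predicted landmarks out of 136 expected, a squared-deviation sum of 0.2937, and the reported variance of 0.00217); the per-test-case accuracies are not given in the paper, so they are not reproduced here. Variable names are illustrative.

```python
import math

predicted_correctly = 95
expected_landmarks = 136
squared_deviation_sum = 0.2937  # sum of squared deviations, as reported
reported_variance = 0.00217     # variance value, as reported

accuracy = predicted_correctly / expected_landmarks * 100
variance = squared_deviation_sum / (expected_landmarks - 1)
std_dev = math.sqrt(reported_variance)

print(f"accuracy           = {accuracy:.2f} %")
print(f"variance           = {variance:.5f}")
print(f"standard deviation = {std_dev:.5f} (i.e. {std_dev * 100:.3f} %)")
```

The standard deviation of 0.04658 follows from the variance as reported (0.00217); taking the square root of the unrounded quotient 0.2937/135 instead gives a value closer to 0.0466.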

Dividing the number of correctly suggested landmarks by the number of expected landmarks gives an accuracy of 69.85%, which fluctuates within ±4.658% of the mean. This result shows that the implemented solution has good accuracy in terms of landmark and path suggestion during navigation.

4.2 Unit Testing

In unit testing, all features were broken down into small units such as landmark suggestion, path selection, shortest path calculation, and optimum path suggestion. Eight random start and destination points were given, and the units mentioned above were tested individually. In addition, the solution underwent integration testing, functional testing, and non-functional testing.

5 Limitation

People use mobile applications more often than web applications. Being only a web-based application is therefore a major drawback for a navigation application with respect to the needs of current users. Owing to the lack of support and updates in OpenStreetMap, some road traffic regulations have not been updated, which affects the end result of the application.

6 Conclusion

The distinguishing feature of the proposed solution is its use of landmarks for navigation. Users are given the facility to add and update landmarks. These landmarks are displayed on the map, and other travelers can use them to find their way. The results show the shortest path and the optimum path based on the landmark weightage of a particular route. This solution should be helpful for users who are new to a particular area.

7 Future Works

Navigation is mostly used by people on the go, which calls for a mobile application, so extending the solution to the iOS and Android platforms will be the main focus of the next stage. To enhance the accuracy of route suggestion, incorporating traffic data along with the existing data will make navigation more accurate and easier.

Acknowledgment. My sincere gratitude goes to my supervisor, Mr. Cassim Farook, and to the authority of the Informatics Institute of Technology for the kind guidance, inspiration, suggestions and full support given to me in completing this research. I would also like to thank my parents and friends, who helped me in several ways to make this project successful.

References

1. Garber, M.: 8 tools we used to navigate the world around us before GPS and smartphones. CityLab (2013). https://www.citylab.com/life/2013/04/7-examples-how-we-used-navigateworld-around-us/5286/. Accessed 2 May 2018
2. Goodchild, M.F.: Volunteered geographic information (2007). http://ncgia.ucsb.edu/projects/vgi/. Accessed 16 Nov 2017
3. Obe, R.O., Hsu, L.S.: PgRouting: A Practical Guide. Locate Press, Chugiak (2017)
4. Zeeb, B., Kong, Q., Xia, J., Chang, E.: Development of landmark based routing system for in-car GPS navigation. In: IEEE International Conference on Digital Ecosystems and Technologies, pp. 132–136 (2013)
5. Chandrasekara, P., Mahaulpatha, T., Thathsara, D., Koswatta, I., Fernando, N.: Landmarks based route planning and linear path generation for mobile navigation applications (2016)
6. Ramsey, P. (ed.): PostGIS Manual
7. Noto, M., Sato, H.: A method for the shortest path search by extended Dijkstra algorithm. In: 2000 IEEE International Conference on Systems, Man and Cybernetics, SMC 2000 Conference Proceedings. Cybernetics Evolving to Systems, Humans, Organizations, and their Complex Interactions (Cat. No. 00CH37166), vol. 3, pp. 2316–2320 (2000)

8. Jin, Z., Wang, X., Morelande, M., Moran, W., Pan, Q.: Landmark selection for scene matching with knowledge of color histogram. IEEE Conference Publication. IEEE (2014). http://ieeexplore.ieee.org/document/6916115/. Accessed 16 Nov 2017
9. PostGIS: "ST_Touches," PostGIS 2.3.8dev Manual (2018). http://postgis.net/docs/ST_Touches.html. Accessed 11 Apr 2018
10. PostGIS: "ST_Intersection," PostGIS 2.3.8dev Manual (2018). http://postgis.net/docs/ST_Intersection.html. Accessed 11 Apr 2018
11. PostGIS: "ST_DWithin," PostGIS 2.3.8dev Manual (2018). http://postgis.net/docs/ST_DWithin.html. Accessed 11 Apr 2018
12. Morris, J.: Data structures and algorithms: Dijkstra's algorithm (1998). https://www.cs.auckland.ac.nz/software/AlgAnim/dijkstra.html. Accessed 17 Aug 2017
13. Edenwaith: Path Finding - A* Algorithm (2011). http://www.edenwaith.com/products/pige/tutorials/a-star.php. Accessed 20 Sept 2017

An Enhanced RSSI-Based Detection Scheme for Sybil Attack in Wireless Sensor Networks

Yinghong Liu (1) and Yuanming Wu (2, ✉)

(1) School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, China
(2) School of Optoelectronic Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
[email protected]

Abstract. In a Sybil attack, a single faulty entity illegitimately claims multiple identities to gather more packets, which is extremely detrimental to network performance. This paper proposes an enhanced RSSI-based detection scheme with an innovative detection framework comprising a suspicious node screening phase and a Sybil node verification phase. By introducing a reputation model and an adaptive threshold, malicious nodes are promptly identified and marked by the monitoring node, which is essential to guarantee the reliability of the detection nodes. In the collaborative work, the monitoring node first screens out the suspicious Sybil nodes and then selects two high-reputation nodes as detection nodes for every suspicious node to carry out the verification of Sybil nodes. Theoretical analysis and simulation results show that our scheme achieves fast locating ability and high detection accuracy with low energy consumption.

Keywords: Sybil node · RSSI · Reputation model · Adaptive threshold · Monitoring node · Detection node

1 Introduction

A wireless sensor network (WSN) is a self-organized network in which many small, cheap sensor nodes are deployed in a monitoring area and communicate through wireless multi-hop technology. It has been applied in many fields, such as medical accident rescue, urban management, intelligent households and military applications. Network security, one of the core issues, is drawing more and more researchers' attention [1]. Inside attacks are more difficult to detect than outside attacks, so researchers generally focus on inside attacks, in which malicious attacks are launched by compromised nodes; typical examples are the black-hole attack, selective-forwarding attack, wormhole attack and Sybil attack [2]. Among these attacks, the Sybil attack is the most difficult to detect. A Sybil attack is implemented by Sybil nodes, the identities created when a malicious node illegitimately takes on multiple identities. The multiple identities are used to cheat their neighbor nodes. Sybil nodes are more harmful than other malicious nodes, such as black-hole attack nodes, selective-forwarding attack nodes and wormhole attack nodes. The malicious node (S in Fig. 1) exploits its counterfeit identities (Sybil nodes S1 and S2 in Fig. 1) to entice their neighbors to forward more packets to the entity S. There are two specific forms of Sybil attack: in the first, all or part of the packets are dropped; in the second, packets are forwarded to specific nodes. The former is similar to the so-called black-hole attack or selective-forwarding attack, while the latter, an uneven-consumption attack, seems normal but causes the early death of specific nodes due to excessive energy consumption.

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 87–102, 2020. https://doi.org/10.1007/978-3-030-12388-8_7

Fig. 1. Sybil attack model

When it comes to the detection of Sybil attacks in WSNs, energy consumption is always the first consideration, followed by others such as detection accuracy, false detection rate and missed detection rate. Detection methods based on RSSI (received signal strength indicator), which need no extra equipment or additional information, are regarded as the most energy-efficient and promising. To minimize energy overhead, we put forward an innovative detection framework comprising a suspicious node screening phase and a Sybil node verification phase. At the beginning, reliable monitoring nodes estimate the distances of their neighbors by comparing their RSSIs. Once nodes are suspected to be Sybil nodes, the monitoring nodes employ two high-reputation, non-collinear neighbors as detection nodes to accomplish the final verification. Unlike other detection methods, our scheme employs detection nodes only after suspicious nodes have been found, so the energy overhead is greatly reduced compared with ordinary RSSI-based methods. To the best of our knowledge, in other existing schemes all verifiers are asked to participate fully, which wastes energy to a certain extent. On the other hand, a common potential danger in previous schemes is that malicious nodes may act as detection nodes. In this case, malicious detection nodes could cover up the real Sybil nodes and defame normal nodes as Sybil nodes, making the detection results unreliable. To avoid this risk, we introduce two extensions, a reputation model and an adaptive threshold. With the two combined, malicious nodes are promptly identified and marked by the monitoring node.

The remainder of this paper is arranged as follows. Section 2 outlines existing location verification-based schemes. Section 3 puts forward the problems and countermeasures, while our scheme is detailed in Sect. 4. Section 5 presents simulation results and analysis. Section 6 draws conclusions and ends up with future work.

2 Related Work

In a Sybil attack, the Sybil nodes, which are multiple illegitimate identities of the malicious device, belong to one and the same entity. Taking advantage of this feature, location verification-based methods have prevailed in recent years.

Kuo and Wei proposed a Sybil attack detection method based on neighbor sets in [3], whose main idea is for all nodes in the WSN to collect neighbor information via internode communication and then to identify suspicious Sybil nodes by their common neighbor set. Neither the nodes' locations nor special hardware modules are needed. However, this method increases the communication burden between nodes and shortens the lifetime of the network, because all nodes in the WSN bear the detection burden.

Utilizing the anti-interference capability of UWB (ultra-wideband) technology, a lightweight detection algorithm based on UWB ranging was designed for IEEE 802.15.4-compliant WSN operation in [4]. Equipped with UWB transmission, each node in the network periodically emits impulse radio UWB to construct a table containing the ranging estimates of every detected neighbor node. Once a legitimate node finds a distance match between at least two distinct nodes, it raises an alarm and revokes the Sybil nodes by inserting them into the node blacklist. No cooperation or information sharing is needed among the nodes, so the cost of communication is reduced. Nevertheless, the algorithm may misjudge normal nodes as Sybil nodes, because the blacklist may then include normal nodes as well as Sybil nodes.

Studies have found that the RSSI method consumes relatively little energy compared with other methods [5] and needs no special requirements or extra information, because a simple calculation is enough to obtain the distances among nodes from the RSSIs. As a promising approach to detecting Sybil attacks, RSSI-based methods are favored by more and more researchers. Murat was the first to utilize RSSI to detect Sybil attacks [5]. Based on the binding IDs of the sending nodes, the actual distance between nodes can be estimated from the RSSI values, and the Sybil nodes are identified by comparing the RSSI ratios over time. If the RSSIs of some nodes are the same, these nodes are judged to belong to one entity. In the later Cooperative RSSI-based Sybil Detection method (CRSD) in [6], each node groups the received RSSI values of its neighbor nodes and then broadcasts the group information. Sybil node detection results are given by the common determination of multiple nodes. Since no registration information is needed, storage consumption is saved. Regrettably, the participation of all nodes makes the energy overhead soar.

Unlike the CRSD scheme, the scheme proposed in [7] relies on the cooperation of four nodes to detect one Sybil node. To judge the Sybil node accurately, the scheme analyzes the deployment density and the node radius of the network, and uses the overlap area of two nodes to determine the Sybil node. With appropriate system parameters, the efficiency is improved without increasing the burden on nodes. The deficiencies of the scheme are that the simulation data set is small and that the scheme is not practical in large-scale networks. It is also a pity that no broad derivation is given for the system parameters proposed in the scheme. Marian and Mircea [8] went further and used only two nodes to detect one Sybil attack through collaboration. Moreover, in this scheme the feasibility of RSSI is analyzed by experiments, and it is concluded that RSSI is reliable enough in a controlled environment and a dedicated transceiver environment. It would be better if the generation of Sybil nodes were limited or a deployment plan were provided to validate the feasibility of the scheme.

Jan and Nanda proposed an RSSI-based Sybil attack detection scheme for a centralized clustering-based hierarchical network in [9], in which any two high-energy nodes can determine whether a node is malicious or not. It can calculate the packet loss rate and packet delivery rate in the network and prolong the network lifetime. Limited by its requirements for a special network layout and different node types, this scheme can only prevent Sybil nodes from being selected as cluster heads, and further discussion is needed on network detection. In the lightweight detection scheme of the Sybil-free APIT algorithm (SF-APIT) in [10], each sensor node builds a merged anchor list by exchanging information with its neighbor sensor nodes. Any pair of anchor nodes is declared to be Sybil nodes as long as more than three sensor nodes find that the RSSIs from them match. However, the reliability of those detection nodes is not discussed, and detection of Sybil attacks is conducted only among anchor nodes instead of the whole network.

What the above schemes do not discuss is that malicious detection nodes can spoil detection results by covering up the real Sybil nodes or framing normal nodes. With this in mind, we introduce two extensions, a reputation model and an adaptive threshold. With the two combined, malicious nodes are excluded early on and are therefore never selected as detection nodes. In addition, we notice that in the existing schemes for distributed WSNs without clusters, all verifiers are asked to participate fully, which wastes energy to a certain extent. We therefore put forward a detection framework in this paper comprising a suspicious node screening phase and a Sybil node verification phase. Trustable monitoring nodes, selected at the very beginning, take charge of screening out suspicious Sybil nodes. Only if suspicious Sybil nodes exist do the monitoring nodes select two high-reputation, non-collinear neighbors as detection nodes for each suspicious Sybil node to accomplish the final verification. Both the detection workload and the energy consumption are minimized by reforming the detection process into a two-phase collaboration of three nodes.

3 Problem Statement and Brief Solution

Because its multiple identities are owned by one and the same entity, the Sybil attack is the most sophisticated attack and is extremely difficult to detect. In this section, we describe the behavior of the Sybil attack and put forward our main problems together with countermeasures.

3.1 Problem Description

As shown in Fig. 1, the attacker S, usually with higher energy or an external energy supply, forges two Sybil nodes S1 and S2 to attack the network. Generally, S presents its counterfeit identities S1 and S2 as favorable forwarders to mislead neighbor nodes B, E and F into forwarding packets to S1, and C, D, E, J, K and N into forwarding packets to S2. In fact, all of these nodes transmit their packets to the attacker node S, because S knows all operations of its Sybil nodes S1 and S2. The attacker S handles the large amount of illegitimately obtained packets in one of two ways. In one case, S discards all packets or forwards packets selectively. In the other case, S forwards all packets to nodes G, I and H. The excessive energy consumption of G, I and H causes their early deaths. Worse still, this may lead to network black holes or network segmentation.

In the first attack form, Sybil nodes S1 and S2 are responsible for collecting packets, and the malicious entity S does not forward all of them. This abnormal behavior affects the reputation values of the nodes, so monitoring nodes can catch them easily and promptly with the reputation model. In the second attack form, however, the Sybil nodes forward the received packets as ordinary nodes do, and their normal reputation values help them escape the scrutiny of the reputation model.

No matter which form the Sybil attack takes, the multiple Sybil nodes belong to the same entity. In other words, the physical locations of Sybil nodes S1 and S2 are the same, and their distances to any other node (C, D, etc.) are equal; for nodes A and B, for instance, $d_{AS_1} = d_{AS_2}$ and $d_{BS_1} = d_{BS_2}$. Thus, it can be deduced that if multiple nodes have the same distance values from other nodes, they can be considered Sybil nodes. Two main problems then remain to be solved: to locate the Sybil nodes' entity, how many detection nodes are needed at the least, and how should they cooperate with each other to reduce energy consumption?

3.2 Brief Solution

As shown in Fig. 2, if only one detection node D is employed, the ordinary nodes A and B, which lie on the same circle ABS1S2 as the Sybil nodes, would be misjudged as Sybil nodes. When two detection nodes D1 and D2 are employed, as shown in Fig. 3, if node P1 (assumed to be a Sybil node) and node P2 (assumed to be an ordinary node) are located symmetrically on a perpendicular to line D1D2, they are detected as suspicious Sybil nodes or Sybil nodes in some methods, because their distances to D1 are the same and their distances to D2 are the same. The probability of misjudgment is 50% in theory.

Fig. 2. Decision of one detection node

Fig. 3. Decision of two detection nodes. (a) The suspicious nodes are located on the communication radius of the detection nodes. (b) General situation

Fig. 4. Multiple collinear nodes decision

The only difference between Figs. 3 and 4 is that more detection nodes are employed in Fig. 4, while the suspicious nodes P1 and P2 are still located symmetrically on a perpendicular to line D1D2…D6. It is crucial to notice that as long as the detection nodes are collinear, misjudgment is inevitable, no matter how many detection nodes are employed.
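The collinearity argument can be checked numerically. The following is a minimal sketch, not part of the original scheme: the coordinates, the helper names `dist` and `indistinguishable`, and the use of exact geometric distances in place of RSSI-derived estimates are all assumptions made for illustration.

```python
import math

def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def indistinguishable(p1, p2, detectors, eps=1e-9):
    """True if every detector measures (almost) the same distance to p1 and p2."""
    return all(abs(dist(d, p1) - dist(d, p2)) <= eps for d in detectors)

# Hypothetical layout: four detectors on the x-axis, P1 and P2 mirrored about it.
collinear = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (30.0, 0.0)]
p1, p2 = (12.0, 7.0), (12.0, -7.0)

# Replacing the extra collinear detectors by one node off the line breaks the symmetry.
non_collinear = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]

print(indistinguishable(p1, p2, collinear))      # True
print(indistinguishable(p1, p2, non_collinear))  # False
```

However many detectors are added on the same line, the mirrored pair stays indistinguishable; a single detector off the line is enough to separate the two positions.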

According to the above analysis, a detection group of three non-collinear nodes is a good choice. Using the triangle side theorem, we can easily judge whether three nodes lie on a straight line or not. To make the network as energy efficient as possible, we propose a detection framework comprising a suspicious node screening phase and a Sybil node verification phase. One monitoring node and two detection nodes are employed to accomplish the two-phase collaboration. At the very beginning, there are no attacks in the network and all nodes are reliable; it is at this time that the monitoring nodes are selected. Once suspicious nodes have been screened out, the monitoring node evaluates the reputation values of the monitored nodes and immediately selects two high-reputation, non-collinear nodes among them as detection nodes. Suspicious nodes are added to a Doubt List, which is initially built by the monitoring node and then checked by the two detection nodes in turn. Obviously, all neighbors of the monitoring node must also be neighbors of the suspicious nodes. Geometric analysis shows that this requirement is met as long as the communication range of the monitoring node lies within the common communication range of the suspicious nodes. To this end, the communication radius r of the monitoring node is equal to or less than half of the radius R of an ordinary node. As shown in Fig. 5, the monitoring range is largest when $r = R/2$.
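The triangle side test mentioned above needs only the three pairwise distances, which is convenient when nodes have no location information. A minimal sketch follows; the function name `are_collinear` and the tolerance value are assumptions (the tolerance is borrowed from the 0.5 m distance error used later in the simulations), not details given in the paper.

```python
def are_collinear(d_ab, d_bc, d_ac, tol=0.5):
    """Triangle side test on three pairwise distances.

    Three points lie on a line exactly when the longest side equals the sum
    of the other two; `tol` (an assumed value, in metres) absorbs ranging error.
    """
    longest, *others = sorted((d_ab, d_bc, d_ac), reverse=True)
    return sum(others) - longest <= tol

print(are_collinear(3.0, 4.0, 7.0))   # True: 3 + 4 = 7, degenerate triangle
print(are_collinear(3.0, 4.0, 5.0))   # False: a proper right triangle
```

With noisy RSSI ranging, a larger tolerance rejects more near-collinear triples, which is the safer choice when selecting the second detection node.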

Fig. 5. Monitoring node radius r = 1/2 R

4 Details of Our Scheme

Just after all nodes have been deployed randomly and uniformly in a certain area free of attacks, the monitoring nodes are selected in the same way that cluster heads are selected by the LEACH method. In this section, we give a detailed description of the scheme in immobile data-gathering WSNs.

4.1 Reputation Model

Reputation model is a mechanism which analyzes and evaluates the reliability of a node and eventually determines whether it is reliable or not.

Reputation Calculation

The reputation value is the comprehensive evaluation of node reliability. There are several classic evaluation models, such as the Bayesian method, the entropy method, the game-theoretic method, the fuzzy method and the Beta model [11, 12]. To avoid overusing the high-reputation nodes and thereby shortening the lifespan of the network, this paper adopts the reputation model based on residual energy proposed in [13]. In the reputation model, f and r denote the number of packets forwarded and received by node $n_{id}$ respectively; thus, the trust value of node $n_{id}$ is

$$Pr_{id} = \frac{f + 1}{r + 2} \qquad (1)$$

The reputation value of this node is

$$Val[node_{id}] = a \cdot Pr_{id} + b \cdot Power_{id} \qquad (2)$$
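As an illustration of Eqs. (1) and (2), the sketch below computes the trust and reputation values for one node. It is only a sketch under stated assumptions: the function names are hypothetical, the residual energy is assumed to be normalized to the range 0 to 1 (a fraction of the initial energy), and the default weights a = b = 0.5 are the values chosen later in the simulation section.

```python
def trust_value(forwarded, received):
    """Eq. (1): trust value from the packets forwarded (f) and received (r)."""
    return (forwarded + 1) / (received + 2)

def reputation(forwarded, received, residual_energy, a=0.5, b=0.5):
    """Eq. (2): weighted sum of trust value and residual energy.

    Assumption: residual_energy is normalized to [0, 1]. The weights must
    satisfy 0 < a < 1, 0 < b < 1 and a + b = 1.
    """
    if not (0 < a < 1 and 0 < b < 1) or abs(a + b - 1) > 1e-9:
        raise ValueError("weights must lie in (0, 1) and sum to 1")
    return a * trust_value(forwarded, received) + b * residual_energy

# A node that forwarded all 8 packets it received and has half its energy left.
print(trust_value(8, 8))       # 0.9
print(reputation(8, 8, 0.5))   # approximately 0.7
```

A node with no traffic history starts at a neutral trust value of 0.5, since (0 + 1)/(0 + 2) = 1/2.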

In Eq. (2), 0 < a < 1 and 0 < b < 1 with a + b = 1, and $Power_{id}$ is the residual energy of node $n_{id}$.

Adaptive Threshold

In practice, the channel quality changes often and a fixed threshold does not work well. As the sensor nodes are exposed to the outside environment for a period of time, the channel quality can be affected by complex environmental factors such as weather and noise. A high threshold under poor channel quality may cause a high false alarm rate. Conversely, a low threshold under good channel quality may cause a high missed detection rate. For this reason, the adaptive threshold designed in [13] is employed in our scheme. The adaptive threshold is based on the forwarding rate statistics of all nodes in the monitoring area. In a monitoring area, if a node receives $x_i$ packets in the ith round, p(i) is the forwarding rate of the node in the ith round, and $Pt_n$ is the total forwarding rate of the node over n rounds, then

$$Pt_n = \sum_{i=0}^{n} x_i \cdot p(i) / n \qquad (3)$$

The initial threshold T(0) can be specified according to the actual situation; in [13], the program is initialized with T(0) = 0.7. The threshold in the ith round is

$$T(i) = T(i-1) \cdot \{Pt_i + (1 - Pt_i) \cdot p(i)\} \qquad (4)$$
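The recursion in Eq. (4) can be sketched in a few lines. This is an illustrative reading only: the function name `update_threshold` is hypothetical, and both $Pt_i$ and p(i) are assumed to be rates in the range 0 to 1 that are supplied by the caller rather than computed here.

```python
def update_threshold(prev_threshold, total_rate, round_rate):
    """Eq. (4): T(i) = T(i-1) * {Pt_i + (1 - Pt_i) * p(i)}.

    Assumption: total_rate (Pt_i) and round_rate (p(i)) both lie in [0, 1].
    """
    return prev_threshold * (total_rate + (1 - total_rate) * round_rate)

threshold = 0.7  # T(0) as used in [13]
# One poor round (p = 0.2) while the long-run rate stays high (Pt = 0.9).
threshold = update_threshold(threshold, total_rate=0.9, round_rate=0.2)
print(round(threshold, 3))  # 0.644
```

In this example the multiplying factor is 0.9 + 0.1 × 0.2 = 0.92, so a single bad round moves the threshold by only 8% when the long-run rate is high.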

Although p(i) is likely to be affected by the current channel, the threshold fluctuates only slightly, because the relatively stable Pt controls the total variation factor.

4.2 Energy Consumption

Equation (5) explains how RSSI-based ranging works. $P_T$ and $P_R$ denote the transmitting and receiving power of the wireless signal respectively, d is the distance between the transmitter and the receiver, and n is the propagation factor, whose value depends on the wireless signal transmission environment.

$$P_R = P_T / d^n \qquad (5)$$
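Solving Eq. (5) for d gives $d = (P_T / P_R)^{1/n}$, which is the simple calculation a node performs to turn an RSSI reading into a distance estimate. The sketch below assumes powers in linear units (for example mW, not dBm) and a default propagation factor of n = 2; both the function names and these defaults are illustrative assumptions.

```python
def estimate_distance(p_tx, p_rx, n=2.0):
    """Invert Eq. (5): d = (P_T / P_R) ** (1 / n), with powers in linear units."""
    return (p_tx / p_rx) ** (1.0 / n)

def same_distance(d1, d2, eps=0.5):
    """Two estimates are treated as equal if they differ by at most eps metres."""
    return abs(d1 - d2) <= eps

d = estimate_distance(p_tx=100.0, p_rx=1.0)  # hypothetical readings
print(d)                                     # 10.0
print(same_distance(d, 10.3))                # True
```

Because the transmitting power is assumed unchanged for every node, two identities with the same received power map to the same distance estimate.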

In the data-gathering WSN, nodes can be classified into three types by function. Ordinary nodes act as data collectors and packet forwarders. Monitoring nodes are in charge of monitoring the behavior of ordinary nodes and calculating their comprehensive reputation, as well as screening for Sybil nodes and selecting detection nodes. Once an ordinary node is selected as a detection node, it is assigned the extra job of measuring the RSSIs of suspicious nodes. The energy consumed in calculation can be ignored, since the communication energy consumption is far greater. For wireless communication, a simple common energy consumption model [1] of nodes is shown in Fig. 6.

Fig. 6. Energy consumption model

The energy overhead of receiving a data packet of k bits at the receiver is $E_r$, and the energy overhead of sending a data packet of k bits by the transmitter is $E_t$:

$$E_r = l_k = k E_{elec} \qquad (6)$$

$$E_t = l_k + k \varepsilon_{fs} d^2 \qquad (7)$$

Here $l_k$ is the energy overhead of receiving or transmitting a k-bit data packet, $\varepsilon_{fs}$ is the free space factor, and d is the communication distance. $E_{elec}$ is the energy overhead of receiving or transmitting one bit, and $k \varepsilon_{fs} d^2$ is the energy overhead of the amplifier for a k-bit packet. With the nodes randomly and uniformly distributed in the WSN, when packets are forwarded hop by hop, we suppose the tasks are spread equally among all nodes of the same hop. In our network layout, shown in Fig. 7, if there are 3 hops in total, the numbers of first-hop, second-hop and third-hop nodes are in the ratio 1:3:5. The number of nodes in one monitoring area, $N_M$, depends on the density of the nodes. In one round of data gathering, the communication consumption of an ordinary node and a monitoring node can be estimated as shown in Table 1.
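The per-hop workloads that follow from the 1:3:5 ratio can be derived mechanically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the packet size (1000 bits) and distance (50 m) in the example are arbitrary choices, each node is assumed to originate one packet per round, and the radio constants are the values listed in the simulation parameters (Table 2).

```python
E_ELEC = 50e-9     # J/bit, from the simulation parameters
EPS_FS = 100e-12   # J/bit/m^2, from the simulation parameters

def rx_energy(k_bits):
    """Eq. (6): energy to receive a k-bit packet."""
    return k_bits * E_ELEC

def tx_energy(k_bits, d):
    """Eq. (7): energy to send a k-bit packet over distance d."""
    return k_bits * E_ELEC + k_bits * EPS_FS * d ** 2

def hop_loads(ratio=(1, 3, 5)):
    """Per-node (received, sent) packet counts per round, first hop first.

    Assumes traffic from each hop is shared equally by the next hop inward.
    """
    loads, incoming = [], 0.0
    for count in reversed(ratio):          # start at the outermost hop
        received = incoming / count
        sent = received + 1                # relayed packets plus the node's own
        loads.append((received, sent))
        incoming = sent * count            # total traffic handed inward
    return list(reversed(loads))

for hop, (rx, tx) in enumerate(hop_loads(), start=1):
    joules = rx * rx_energy(1000) + tx * tx_energy(1000, 50.0)
    print(hop, round(rx, 2), round(tx, 2), round(rx + tx, 2), joules)
```

The received and sent counts come out as (8, 9), (1.67, 2.67) and (0, 1) for the first, second and third hop, and their sums (17, 4.33 and 1) match the multipliers of $(N_M - 1) E_r$ listed for the monitoring node.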

Fig. 7. Network layout

Table 1. The communication consumption of ordinary node and monitoring node

Hop               Ordinary node            Monitoring node
The third-hop     E_t                      (N_M − 1) E_r
The second-hop    1.67 E_r + 2.67 E_t      4.33 × (N_M − 1) E_r
The first-hop     8 E_r + 9 E_t            17 × (N_M − 1) E_r

4.3 Network Layout

To achieve regional control by the monitoring nodes, the network layout needs to be carried out, which can be divided into the following steps:

Step 1: Deploy nodes randomly and uniformly in a certain range.
Step 2: The Sink node broadcasts a Hello message and receives ACK replies from the first-hop nodes.
Step 3: The first-hop nodes then broadcast a Hello message to their neighbors; receivers that are not first-hop nodes are marked as second-hop nodes and reply with an ACK message at the same time. During this process, nodes build their neighbor lists, with each neighbor's RSSI value recorded.
Step 4: The process continues until every node has obtained its hop count and neighbor list.
Step 5: Assign the monitoring nodes in the same way that cluster heads are selected in LEACH. Subsequently, the monitoring nodes reduce their transmission radius to half of the original one and rebuild their respective neighbor lists.

Finally, the network layout is formed, as shown in Fig. 7.
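Steps 2 to 4 amount to a breadth-first flood from the Sink. The sketch below models that flood centrally for illustration; it is not the distributed Hello/ACK protocol itself. The function name `assign_hops`, the node names, the coordinates and the ideal disc radio model (two nodes are neighbors exactly when they are within the communication radius) are all assumptions, and the sketch records neighbor lists without RSSI values.

```python
import math
from collections import deque

def assign_hops(positions, sink, radius):
    """Breadth-first stand-in for the Hello/ACK flood of Steps 2-4.

    positions: dict mapping node id -> (x, y). Returns (hops, neighbors).
    """
    ids = list(positions)
    neighbors = {
        i: [j for j in ids
            if j != i and math.dist(positions[i], positions[j]) <= radius]
        for i in ids
    }
    hops = {sink: 0}
    queue = deque([sink])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in hops:              # first Hello heard fixes the hop count
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops, neighbors

positions = {"sink": (0, 0), "a": (40, 0), "b": (80, 0), "c": (120, 0), "far": (500, 0)}
hops, neighbors = assign_hops(positions, "sink", radius=50)
print(hops)  # {'sink': 0, 'a': 1, 'b': 2, 'c': 3}
```

Nodes that never hear a Hello message, such as `far` in this example, receive no hop count and stay outside the layout.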

4.4 Phases of Our Scheme

Before introducing the specific algorithm, let us clarify the function of the monitoring nodes again. On the one hand, they are in charge of monitoring the behavior of ordinary nodes by listening to the packet stream, working as the first line of defense just like watchdogs. On the other hand, monitoring nodes also screen for Sybil nodes and select suitable detection nodes to verify the suspicious nodes. In our detection framework, the process comprises a suspicious node screening phase and a Sybil node verification phase. Trustable monitoring nodes, selected at the very beginning, take charge of screening for Sybil nodes. Only if there are suspicious nodes does the monitoring node employ two high-reputation, non-collinear neighbors as detection nodes to accomplish the final verification. Both the detection workload and the energy consumption are minimized by reforming the detection process into a two-phase collaboration of three nodes.

Part 1: Suspicious node screening phase. Work in this phase is all done by the monitoring nodes.

Step 1: Monitoring node $n_M$ finds that the RSSI values of some neighbors, say $n_p$ and $n_q$, are almost the same, that is, $|d_{Mp} - d_{Mq}| \le \varepsilon$ (where $\varepsilon$ is the error). $n_p$ and $n_q$ are then regarded as a set of suspicious nodes and added to the list Doubt[i].
Step 2: The monitoring node $n_M$ listens to the packet stream and evaluates the reputation values of nodes, marking nodes whose reputation values are below the adaptive threshold as malicious nodes and broadcasting them. Meanwhile, if the trust value of a node is below the threshold in 3 successive rounds, it is judged to be a malicious node as well.
Step 3: Once Doubt[i] is not empty, monitoring node $n_M$ appoints the neighbor with the highest reputation as detection node $n_a$ and sends Doubt[i] to node $n_a$.

Part 2: Sybil node verification phase.

Step 4: Detection node $n_a$ compares the RSSI values of the suspicious nodes in Doubt[i]. If the RSSI values of $n_p$ and $n_q$ are not the same from the viewpoint of $n_a$, this set of suspicious nodes is removed from Doubt[i]. After all sets in Doubt[i] have been processed by detection node $n_a$, we get a new list Doubt'[i], from which some or all of the ordinary nodes have been excluded.
Step 5: If no nodes are left in Doubt'[i], the verification is over. Otherwise, node $n_a$ asks node $n_M$ to select another detection node. Node $n_b$, which has the next highest reputation value and is not collinear with $n_M$ and $n_a$, is then selected as the additional detection node to conduct the final verification of Doubt'[i]. In the end, the remaining nodes are judged to be Sybil nodes and their information is spread to the whole network.
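The flow of Steps 1, 4 and 5 can be sketched as a pair of filters over the Doubt list. This is a simplified illustration, not the authors' implementation: reputation evaluation (Step 2) and the selection of detection nodes (Step 3) are left out, the detection nodes are simply given, exact geometric distances stand in for RSSI-derived estimates, and all names and coordinates are hypothetical.

```python
import math
from itertools import combinations

def distances_from(observer, nodes):
    """Distance from one observer to every node (stand-in for RSSI ranging)."""
    return {name: math.dist(observer, pos) for name, pos in nodes.items()}

def screen(distances, eps=0.5):
    """Step 1: pairs whose distances from the monitoring node differ by at most eps."""
    return {frozenset((p, q)) for p, q in combinations(sorted(distances), 2)
            if abs(distances[p] - distances[q]) <= eps}

def verify(doubt, distances, eps=0.5):
    """Steps 4-5: keep only the pairs that also look co-located to a detection node."""
    kept = set()
    for pair in doubt:
        p, q = sorted(pair)
        if abs(distances[p] - distances[q]) <= eps:
            kept.add(pair)
    return kept

def detect_sybil(d_monitor, d_a, d_b, eps=0.5):
    doubt = screen(d_monitor, eps)
    if doubt:                      # detection nodes are used only if needed
        doubt = verify(doubt, d_a, eps)
    if doubt:
        doubt = verify(doubt, d_b, eps)
    return doubt

# s1 and s2 share one physical position; n1 and n2 are ordinary nodes.
nodes = {"s1": (10.0, 10.0), "s2": (10.0, 10.0), "n1": (-10.0, 10.0), "n2": (10.0, -10.0)}
monitor, det_a, det_b = (0.0, 0.0), (20.0, 0.0), (0.0, 20.0)
d_m, d_a, d_b = (distances_from(o, nodes) for o in (monitor, det_a, det_b))

after_screening = screen(d_m)
after_first = verify(after_screening, d_a)
result = detect_sybil(d_m, d_a, d_b)
print(len(after_screening), len(after_first), sorted(sorted(p) for p in result))
# 6 3 [['s1', 's2']]
```

In this layout n2 mirrors the Sybil position across the line through the monitoring node and the first detection node, so it survives the first verification and is removed only by the second, non-collinear detection node.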

5 Simulation Results and Analysis

In this section, we implement simulations of our scheme to verify its feasibility.

5.1 Simulation Environment

Nodes are deployed randomly and uniformly in a stable and reliable area E, remain stationary and acquire no location information. In this paper, the number of sensor nodes varies from 50 to 600 and the size of E is 3000 m². Considering the accuracy of RSSI distance measurement, the distance error $\varepsilon$ in this paper is 0.5 m. In addition, it is assumed that the transmitting power of each node remains unchanged. The settings of the attributes and communication parameters of the nodes are shown in Table 2.

Table 2. The simulation parameters of the nodes

Parameters                            Values
Ordinary communication radius (R)     50 m
E_elec                                50 nJ/bit
ε_fs                                  100 pJ/bit/m²
Initial energy of node                2 J
Monitoring cycle                      50 s
Monitoring radius (r)                 25 m

5.2 Simulation Results and Analysis

Simulation of the Reputation Value Parameter

For the reputation value $Val[node_{id}]$ in Eq. (2), the choice of a and b changes which two detection nodes are selected. If a is too large and b is too small, nodes with high trust values are vulnerable to overuse, resulting in uneven energy consumption and a shortened network lifetime. On the contrary, if a is too small and b is too large, the selection degrades into an energy-equalization choice, making the overall forwarding rate of the network unstable. The two coefficients should be reasonably adjusted according to the actual situation. The simulation runs for 36,000 s, until all nodes are completely dead. The variation of the forwarding rate under different values of a and b is shown in Fig. 8, while the number of remaining nodes in area E is recorded in Fig. 9. As we can see, when a = 0.9 and b = 0.1, the network has an overall high forwarding rate, but nodes die much earlier. When a = 0.1 and b = 0.9, the life cycle of the network is the longest, but the overall network forwarding rate is low. So, for balance, our simulations set a = 0.5 and b = 0.5 for the next steps.

Detection Results of the Sybil Node

For the first form of Sybil attack, the forwarding rate of the black-hole-like attack is 0%, and that of the selective-forwarding-like attack is assumed to be between 30% and 50%. Meanwhile, the second form of Sybil attack, the uneven-consumption attack, can lead to network black holes in the long run.

Fig. 8. Forwarding rate under different values of a and b (x-axis: network life cycle (h); y-axis: forward probability; curves: a = 0.7, b = 0.3; a = 0.9, b = 0.1; a = 0.3, b = 0.7; a = 0.5, b = 0.5; a = 0.1, b = 0.9)

Fig. 9. Number of remaining nodes in the area (x-axis: network life cycle (h); y-axis: nodes; same five curves as in Fig. 8)

It takes 50 s to finish one round of data gathering in our experiments; that is, the monitoring period of the monitoring node is 50 s. In the simulation of 36,000 s (720 rounds), Tables 3 and 4 show the detection results when the Sybil node is introduced in the 50th and 500th round respectively. In practice, we let the forwarding rate of a malicious entity owning two identities plummet from 70–100% to 0–50% in the 50th round and the 500th round. Tables 3 and 4 show that our scheme can detect Sybil nodes quickly, regardless of the specific attack form.

Table 3. The detecting result of putting the Sybil node in the 50th round

The forwarding rate (%)    The round of detecting    The required rounds of detecting
0                          In the 52nd round         2 rounds
30                         In the 52nd round         2 rounds
40                         In the 52nd round         2 rounds
50                         In the 53rd round         3 rounds
70–100 (ordinary)          In the 52nd round         2 rounds

Table 4. The detecting result of putting the Sybil node in the 500th round

The forwarding rate (%)    The round of detecting    The required rounds of detecting
0                          In the 52nd round         2 rounds
30                         In the 52nd round         2 rounds
40                         In the 52nd round         2 rounds
50                         In the 53rd round         3 rounds
70–100 (ordinary)          In the 52nd round         2 rounds

Detection Accuracy

The detection accuracy is expressed as the ratio of the number of Sybil nodes detected to the total number of malicious nodes, where malicious nodes include Sybil nodes and other malicious nodes. Since the detection principle of the UWB method [4] is consistent with that of our scheme, the two schemes are compared in this section.

Fig. 10. Sybil node detection accuracy varies with the number of Sybil nodes (x-axis: Sybil nodes; y-axis: detection accuracy; curves: this scheme, UWB)

Figure 10 shows how the detection accuracy varies with the number of Sybil nodes when the total number of nodes is fixed at 150. As the number of Sybil nodes increases, the detection accuracy of both schemes decreases, but compared with the UWB method, the accuracy of our scheme does not decline significantly. Figure 11 shows how the detection accuracy varies with the density of nodes when the ratio of the number of Sybil nodes to the total number of nodes is fixed at 20%. As the density increases, the detection accuracy of both methods declines, but the decline of our scheme is very slow, while the detection accuracy of the UWB method declines almost linearly. When the number of nodes per 500 square meters reaches 600, the detection accuracy of our scheme drops to 92% and that of the UWB method drops to 80%.

Fig. 11. Sybil node detection accuracy varies with the total number of nodes (x-axis: density of nodes per 500 square meters; y-axis: detection accuracy; curves: this scheme, UWB)

The above results show that our scheme not only has a higher detection rate but also stronger stability. This is mainly because our scheme employs two high-reputation detection nodes with an improved reputation model, which reduces the false detection rate and achieves high detection accuracy. In addition, the monitoring of the packet stream and the evaluation of reputation values are carried out by the monitoring nodes, and the subsequent verification is conducted by the two selected detection nodes. Thus, both the number of participants and the detection workload are minimized, the total energy consumption is reduced, and the lifetime of the network is prolonged.

6 Conclusions

Detecting the Sybil attack is one of the key issues in ensuring the security of WSNs. This paper improves on the RSSI-based scheme, which is recognized as an energy-efficient and promising detection scheme for the Sybil attack. The novelty of the proposed enhanced RSSI-based detection scheme is that detection is carried out within a two-phase framework comprising a suspicious node screening phase and a Sybil node verification phase. Monitoring nodes screen the suspicious Sybil nodes and use the reputation model to evaluate the reputation values of monitored nodes, marking nodes below an adaptive threshold as malicious nodes and having two high-reputation, non-collinear neighboring detection nodes obtain the final Sybil node information. Compared with other existing RSSI-based Sybil attack detection schemes, our scheme minimizes the detection cost through the division of labor among nodes and avoids the situation where a malicious node acts as a detection node. Theoretical analysis and simulation results show that our scheme achieves fast detection and high detection accuracy with low energy consumption. Our scheme will need to be modified to adapt to more challenging WSNs, in which the transmission powers of nodes are adjustable, monitoring nodes may be captured and turned into malicious nodes, or some or all nodes may move freely.


References

1. Zhou, H., Wu, Y., Feng, L., Liu, D.: A security mechanism for cluster-based WSN against selective forwarding. Sensors 16(9), 1537–1553 (2016)
2. Wu, Y.M.: An energy-balanced loop-free routing protocol for distributed wireless sensor networks. Int. J. Sens. Netw. 23(1), 123–131 (2017)
3. Kuo, F.S., Wei, T.W., Wen, C.C.: Detecting Sybil attacks in wireless sensor networks using neighboring information. Comput. Netw. 53(18), 3042–3056 (2009)
4. Sarigiannidis, P., Karapistoli, E., Economides, A.A.: Detecting Sybil attacks in wireless sensor networks using UWB ranging-based information. Expert Syst. Appl. 42(21), 7560–7572 (2015)
5. Murat, D., Youngwhan, S.: An RSSI-based scheme for Sybil attack detection in wireless sensor networks. In: WoWMoM 2006, Proceedings of the 2006 International Symposium on a World of Wireless, Mobile and Multimedia Networks, pp. 564–570. Institute of Electrical and Electronics Engineering Computer Society, Buffalo-Niagara Falls (2006)
6. Lv, S.H., Wang, X.D., Zhou, X., Zhou, X.M.: Detecting the Sybil attack cooperatively in wireless sensor networks. In: Proceedings of the 2008 International Conference on Computational Intelligence and Security, CIS 2008, pp. 442–446. IEEE Computer Society, Suzhou (2008)
7. Liu, R.X., Wang, Y.L.: A new Sybil attack detection for wireless body sensor network. In: Proceedings of the 2014 10th International Conference on Computational Intelligence and Security, CIS 2014, pp. 367–370. Institute of Electrical and Electronics Engineers Inc., Kunming (2014)
8. Maarlan, S., Mircea, P.: Sybil attack type detection in wireless sensor networks based on received signal strength indicator detection scheme. In: SACI 2015, Proceedings of the 10th Jubilee IEEE International Symposium on Applied Computational Intelligence and Informatics, pp. 121–124. Institute of Electrical and Electronics Engineers Inc., Timisoara (2015)
9. Jan, M.A., Nanda, P., He, X.J., Liu, R.P.: A Sybil attack detection scheme for a centralized clustering-based hierarchical network. In: Proceedings of the 14th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2015, pp. 318–325. Institute of Electrical and Electronics Engineers Inc., Helsinki (2015)
10. Yuan, Y.L., Huo, L.W., Wang, Z.X., Hongefe, D.: Secure APIT localization scheme against Sybil attacks in distributed wireless sensor networks. IEEE Access 6, 27629–27636 (2018)
11. Yu, Y.L., Li, K.Q., Zhou, W.L.: Trust mechanisms in wireless sensor networks: attack analysis and countermeasures. J. Netw. Comput. Appl. 35(3), 867–880 (2012)
12. Josang, A., Ismail, R.: The beta reputation system. In: Proceedings of the 15th Bled Electronic Commerce Conference, Bled, Slovenia, pp. 1–17 (2002)
13. Hu, Y., Wu, Y.M., Wang, H.S.: Detection of insider selective forwarding attack based on monitor node and trust mechanism in WSN. Wirel. Sens. Netw. 6, 237–248 (2012)

Optimization of Polar Codes in Virtual MIMO Systems

Idy Diop1(✉), Papis Ndiaye1, Papa Alioune Fall2, Boly Seck2, Moussa Diallo1, and Sidi Mohamed Farssi1

1 Department of Computer Engineering, Polytechnic School (ESP), Cheikh Anta Diop University (UCAD), Dakar, Senegal
{idy.diop,idrissa.ndiaye}@esp.sn, moussa.diallo@ucad.edu.sn, [email protected]
2 Applied Physics Section, Faculty of Applied Science and Technology (SAT), Gaston Berger University (UGB), Saint-Louis, Senegal
{papa-alioune.fall,seck.boly}@ugb.edu.sn

Abstract. Decode and Forward (DF) mode is one of the best cooperation techniques used for half-duplex relay channels. In this article, we propose a low-complexity DF scheme based on polar codes that uses a successive cancellation list (SCL) decoder assisted by a cyclic redundancy check (CRC). With this proposal, the decoder at the relay node is designed to reduce decoding errors by exploiting the CRC detection technique and the path metrics (probabilities) of the L candidate decoding paths. Simulation results show that our scheme outperforms some previous work in terms of bit error rate, especially when the source-relay channel operates in the high-SNR region.

Keywords: Decode and forward · Polar codes · Successive cancellation list · Cyclic redundancy check

1 Introduction

The relay channel, introduced by van der Meulen [1], has been the subject of intense research activity. Various coding techniques have been studied [2, 3] and applied in cooperative communication. In addition, many relaying protocols have been proposed in the literature to obtain a compromise between the energy consumption, latency and spectral efficiency of cooperative systems. The most widely used are Amplify-and-Forward (AF) and Decode-and-Forward (DF) [4]. With the AF protocol, the relay does not modify the information in the received signal; it only amplifies the signal without further processing. However, since the signal is amplified in its entirety, the noise that accompanies it is also amplified. With the DF protocol, the relay decodes the signal received from the source and re-encodes it before sending it to the destination. In this case, the noise is eliminated. However, the fundamental difficulty with this protocol arises when there are decoding errors: the relay then re-encodes and sends a wrong packet, which causes error propagation.

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 103–116, 2020. https://doi.org/10.1007/978-3-030-12388-8_8


I. Diop et al.

In general, systems with the DF protocol give better performance than those with the AF protocol. The key question is therefore how to improve the DF protocol so as to reduce or eliminate error propagation. This paper attempts to answer this question by using very powerful error-correcting codes: polar codes. Coding with polar codes is a technique introduced in 2009 by Arikan [5]. The construction of these codes relies on the channel polarization phenomenon and uses encoders and decoders with low complexity. Polar codes were first studied for relay channels in [6]. These codes have been used for the degraded Gaussian relay channel with DF, and the authors showed that their use provides a substantial performance gain compared with LDPC codes [7]. In 2017, Duo et al. proposed a new version of DF, called the "generalized partial information relay protocol" [8]. Their method is developed for degraded multiple-relay networks that have orthogonal receiver components, based on the nested structure of polar codes. A polar cooperative coding scheme based on retransmitting bits from partially perfect channels by exploiting polarization is proposed in [9]. This approach offers interesting BER performance with SC decoding. The limitation of this method is its use of the SC decoder, which is not efficient for short codes. Recently, a polar-code-based Compress and Forward (CF) scheme with an SCL decoder at the relay in an LTE context was proposed in [10]. Simulation results show that this scheme outperforms some works based on LDPC codes in terms of BLER (Block Error Rate). The main contribution of this article is a new cooperative DF scheme based on polar codes that combines an SCL decoder with a CRC (CALSC) at the relay side in order to reduce not only the complexity of the decoding but also the bit error rate.

The next sections are organized as follows. Section 2 introduces the preliminaries and the system model. SCL and CALSC decoding are discussed in Sect. 3. Section 4 presents our proposed cooperative scheme, and Sect. 5 provides the analysis of simulation results. Section 6 concludes the paper.

2 System Model

2.1 Notations

In the following, we use bold letters, such as G, to designate a matrix. We use calligraphic characters, such as $\mathcal{A}$, to denote sets, and $|\mathcal{A}|$ to denote the cardinality of a set. $\mathcal{A}^c$ is the complement of $\mathcal{A}$. We use the notation $x_1^N$ to denote a vector $(x_1, x_2, \ldots, x_N)$ of dimension N, and $x_i^j$ to designate the subvector $(x_i, x_{i+1}, \ldots, x_j)$ of $x_1^N$ for $1 \le i, j \le N$.

2.2 The Polar Codes

We consider a binary-input discrete memoryless channel (B-DMC) with input alphabet $\mathcal{X}$, output alphabet $\mathcal{Y}$ and transition probability $W(y|x)$, $x \in \mathcal{X}$, $y \in \mathcal{Y}$, such that:

$$W : \mathcal{X} \rightarrow \mathcal{Y}.$$

For N independent B-DMC channels, we have $W^N : \mathcal{X}^N \rightarrow \mathcal{Y}^N$ with transition probability

$$W^N(y_1^N \mid x_1^N) = \prod_{i=1}^{N} W(y_i \mid x_i) \qquad (1)$$

The code length is $N = 2^n$, $n > 0$.

• Encoding: The encoding equation is as follows:

$$x_1^N = u_1^N G_N \qquad (2)$$

where $u_1^N$ is the message to send (uncoded bits) and $x_1^N$ is the code word. The matrix $G_N$ is given by $G_N = B_N F_2^{\otimes n}$, where $B_N$ is an $N \times N$ permutation matrix and $F_2^{\otimes n}$ is the nth Kronecker power of $F_2$, with

$$F_2 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}.$$
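The encoding rule in Eq. (2) can be illustrated with a short NumPy sketch. It assumes that the permutation matrix is the usual bit-reversal permutation and that indices are 0-based; the function names are illustrative and are not taken from the paper.

```python
import numpy as np

# Kernel F_2 as defined in the text.
F2 = np.array([[1, 0], [1, 1]], dtype=np.int64)


def bit_reversal_permutation(n: int) -> np.ndarray:
    """Indices 0..2^n-1 with their n-bit binary representation reversed."""
    return np.array([int(format(i, f"0{n}b")[::-1], 2) for i in range(2 ** n)])


def generator_matrix(n: int, bit_reversal: bool = True) -> np.ndarray:
    """Build G_N = B_N * (n-th Kronecker power of F_2) over GF(2)."""
    g = np.array([[1]], dtype=np.int64)
    for _ in range(n):
        g = np.kron(g, F2)
    if bit_reversal:
        # Left-multiplying by a permutation matrix permutes the rows.
        g = g[bit_reversal_permutation(n), :]
    return g


def polar_encode(u, n: int, bit_reversal: bool = True) -> np.ndarray:
    """Compute x = u * G_N with arithmetic modulo 2."""
    u = np.asarray(u, dtype=np.int64)
    return (u @ generator_matrix(n, bit_reversal)) % 2


x = polar_encode([1, 0, 1, 1], n=2)
```

Building the full matrix costs O(N^2) memory, so this form is only meant to make the algebra visible; practical encoders use the recursive butterfly structure instead.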

By recursively applying the channel combining and splitting techniques, N independent B-DMC channels are transformed into N synthesized channels, denoted $W_N^{(i)}$, $i = 1, 2, \ldots, N$. The transition probability of $W_N^{(i)}$ is given by:

$$W_N^{(i)}(y_1^N, u_1^{i-1} \mid u_i) \triangleq \sum_{u_{i+1}^N \in \mathcal{X}^{N-i}} \frac{1}{2^{N-1}} W_N(y_1^N \mid u_1^N) \qquad (3)$$

where $W_N(y_1^N \mid u_1^N) = W^N(y_1^N \mid u_1^N G_N) = \prod_{i=1}^{N} W(y_i \mid x_i)$ with $x_1^N = u_1^N G_N$.

Polarized channels are either almost completely noisy (bad channels) or almost perfect (good channels), and their state can be determined at the transmitter. Thus, during the transmission of a message, the frozen bits of the set $\mathcal{A}^c$ are assigned to the noisy channels, and the information bits of the set $\mathcal{A}$, coded or not, are sent over the almost perfect channels.
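To make the split between good and bad channels concrete, the sketch below ranks the synthesized channels by their Bhattacharyya parameter. It assumes a binary erasure channel, for which the recursion Z^- = 2Z - Z^2, Z^+ = Z^2 is exact; for the AWGN channels used later in the paper this would only be a heuristic. The function names and the 0-based indexing are illustrative choices.

```python
def bhattacharyya_bec(n: int, erasure_prob: float = 0.5) -> list[float]:
    """Bhattacharyya parameters of the 2^n synthesized channels of a BEC."""
    z = [erasure_prob]
    for _ in range(n):
        nxt = []
        for zi in z:
            nxt.append(2 * zi - zi * zi)  # degraded ("minus") channel
            nxt.append(zi * zi)           # upgraded ("plus") channel
        z = nxt
    return z


def choose_information_set(n: int, k: int, erasure_prob: float = 0.5) -> list[int]:
    """Indices of the k most reliable channels; all others are frozen."""
    z = bhattacharyya_bec(n, erasure_prob)
    best = sorted(range(len(z)), key=lambda i: z[i])[:k]
    return sorted(best)


info_set = choose_information_set(n=3, k=4)
frozen_set = [i for i in range(8) if i not in info_set]
```

For N = 8 and K = 4 with erasure probability 0.5, this selects positions 3, 5, 6 and 7 for information and freezes the rest.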


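• Decoding: The primary decoding algorithm for polar codes is successive cancellation (SC). This algorithm can be described as follows. Let $\hat{u}_1^N$ be the estimate of $u_1^N$. The $\hat{u}_i$ are decoded successively for $i = 1, \ldots, N$ and are given by:

$$\hat{u}_i \triangleq \begin{cases} u_i, & \text{if } i \in \mathcal{A}^c \\ h_i(y_1^N, \hat{u}_1^{i-1}), & \text{if } i \in \mathcal{A} \end{cases} \qquad (4)$$

where

$$h_i(y_1^N, \hat{u}_1^{i-1}) \triangleq \begin{cases} 0, & \text{if } \dfrac{W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} \mid 0)}{W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} \mid 1)} \ge 1 \\ 1, & \text{otherwise} \end{cases} \qquad (5)$$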
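The decision rule above can be sketched as a recursive decoder that works on log-likelihood ratios (LLRs) instead of likelihood ratios, so that the test "ratio >= 1" becomes "LLR >= 0". The sketch makes several assumptions that are not in the paper: natural bit order (no bit-reversal permutation), frozen bits fixed to 0, the min-sum approximation for the check-node update, and BPSK mapping 0 -> +1, 1 -> -1. All names are illustrative.

```python
import numpy as np


def polar_encode(u):
    """x = u * (n-th Kronecker power of F_2), natural order, computed recursively."""
    u = np.asarray(u, dtype=np.int64)
    if u.size == 1:
        return u.copy()
    half = u.size // 2
    left, right = polar_encode(u[:half]), polar_encode(u[half:])
    return np.concatenate([left ^ right, right])


def sc_decode(llr, frozen):
    """Return (u_hat, x_hat) from channel LLRs and a list of frozen flags."""
    llr = np.asarray(llr, dtype=float)
    if llr.size == 1:
        # Frozen bits are known (0 here); otherwise decide on the LLR sign.
        bit = 0 if (frozen[0] or llr[0] >= 0) else 1
        out = np.array([bit], dtype=np.int64)
        return out, out.copy()
    half = llr.size // 2
    a, b = llr[:half], llr[half:]
    # Min-sum approximation of the LLR of the XOR of two bits.
    f = np.sign(a) * np.sign(b) * np.minimum(np.abs(a), np.abs(b))
    u_left, x_left = sc_decode(f, frozen[:half])
    # Second half: combine both observations given the decided first half.
    g = b + (1 - 2 * x_left) * a
    u_right, x_right = sc_decode(g, frozen[half:])
    return (np.concatenate([u_left, u_right]),
            np.concatenate([x_left ^ x_right, x_right]))


frozen = [True, True, True, False, True, False, False, False]
u = np.array([0, 0, 0, 1, 0, 0, 1, 1])
x = polar_encode(u)
llr = 4.0 * (1 - 2 * x)  # noiseless BPSK observation, arbitrary positive scale
u_hat, _ = sc_decode(llr, frozen)
```

On a noiseless observation the decoder returns the transmitted message; with noise, an early wrong decision cannot be revisited, which is the weakness of SC decoding.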

2.3 Study Model

We consider a typical three-node relay channel (Fig. 1). It consists of a source node, a relay node, and a destination node, which are designated S, R, and D, respectively. The transmission on the channel is divided into two time slots. In the first time slot, the source encodes the information by a polar coder, and sends it by broadcast to the relay and to the destination. In the second time slot, the relay processes the received signal during the first time slot and transmits the result to the destination, while the source node remains silent. For simplicity, we assume that each node has a single antenna and the modulation scheme is BPSK.

Fig. 1. The proposed cooperative scheme


Furthermore, the S-D, R-D and S-R channels are all assumed to be additive white Gaussian noise (AWGN) channels. At the source node, the modulated signal is written $x_1^N$, where N is the length of the coded signal. At the relay node, the signal received from the source and the signal retransmitted to the destination are denoted $y_{1,SR}^N$ and $w_1^N$, respectively. $y_{1,SR}^N$ is given by:

$$y_{1,SR}^N = \sqrt{P_s}\, x_1^N + n_{1,SR}^N \qquad (6)$$

where $P_s$ is the transmission power of the source and $n_{1,SR}^N$ is the noise of the S-R channel. At the destination node, the signals received from the source and the relay are designated by $y_{1,SD}^N$ and $y_{1,RD}^N$, respectively, and are given by:

$$y_{1,SD}^N = \sqrt{P_s}\, x_1^N + n_{1,SD}^N \qquad (7)$$

$$y_{1,RD}^N = \sqrt{P_R}\, w_1^N + n_{1,RD}^N \qquad (8)$$

where $w_1^N$ is the signal after processing at relay R, $P_R$ is the transmission power of the relay, and $n_{1,SD}^N$ and $n_{1,RD}^N$ are the noises of the S-D and R-D channels, respectively. We assume that $P_R = P_s = 1$.
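Equations (6) to (8) can be simulated directly. The sketch below is a minimal illustration under stated assumptions: real-valued BPSK symbols, noise standard deviations passed as parameters, and a relay that forwards an error-free copy of the source signal in the usage example. The function and parameter names are hypothetical.

```python
import numpy as np


def awgn(signal, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return signal + sigma * rng.standard_normal(signal.shape)


def relay_channel_outputs(x, w, sigma_sr, sigma_sd, sigma_rd,
                          p_s=1.0, p_r=1.0, seed=0):
    """Return (y_SR, y_SD, y_RD) following Eqs. (6)-(8)."""
    rng = np.random.default_rng(seed)
    y_sr = awgn(np.sqrt(p_s) * x, sigma_sr, rng)  # source -> relay, slot 1
    y_sd = awgn(np.sqrt(p_s) * x, sigma_sd, rng)  # source -> destination, slot 1
    y_rd = awgn(np.sqrt(p_r) * w, sigma_rd, rng)  # relay -> destination, slot 2
    return y_sr, y_sd, y_rd


bits = np.array([0, 1, 1, 0])
x = 1.0 - 2.0 * bits  # BPSK mapping 0 -> +1, 1 -> -1
# Assumption for illustration: the relay re-encodes without error, so w = x.
y_sr, y_sd, y_rd = relay_channel_outputs(x, x, sigma_sr=0.3, sigma_sd=0.8, sigma_rd=0.5)
```

Choosing a smaller noise level on the S-R link than on the S-D link mirrors the condition under which the paper applies DF relaying.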

3 SCL and CALSC Decoding

3.1 SCL Decoding

The performance of the SC decoder is limited by its bit-by-bit decoding strategy: if a bit is wrongly decoded, there is no chance of correcting it in the rest of the decoding process. To improve the SC decoder, Tal and Vardy [11] proposed SCL decoding. The Successive Cancellation List (SCL) decoder is governed by a single integer parameter L, a power of 2, which denotes the size of the list. In general, larger values of L mean lower error rates but longer execution times. To better explain the SCL decoding principle, a binary tree representation is chosen. Like SC, SCL decodes the input bits sequentially, one by one (level by level). But instead of retaining a single path after processing, the SCL decoder simultaneously keeps up to L candidate paths for the next level. For example, when L = 2, the SCL decoder doubles the number of candidate paths for each bit $\hat{u}_i$ ($\hat{u}_i = 0$ and $\hat{u}_i = 1$) at each level, as shown in Fig. 2. If 2L candidate paths are obtained, a pruning procedure is used to select the L most likely paths (those with the largest metrics). These paths are stored in a list for processing at the next level. Note that for a frozen bit, the number of candidate paths is not doubled, because such a bit is fixed and its value is known.
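The extend-and-prune step described above can be sketched in isolation. In the toy code below, `prob_one` is a hypothetical stand-in for the decoder's estimate of the probability that the next bit is 1 given the bits already decided; a real SCL decoder would derive this from the channel observations. Frozen bits are assumed to be 0, and the path metric is taken to be a probability, so larger is better.

```python
from typing import Callable, List, Tuple

Path = Tuple[Tuple[int, ...], float]  # (bits decided so far, path metric)


def scl_step(paths: List[Path],
             prob_one: Callable[[Tuple[int, ...]], float],
             list_size: int,
             frozen: bool = False) -> List[Path]:
    """Extend every path by one bit, then keep the list_size best paths."""
    extended: List[Path] = []
    for bits, metric in paths:
        p1 = prob_one(bits)
        # Branch 0 is always explored; branch 1 only for information bits.
        extended.append((bits + (0,), metric * (1.0 - p1)))
        if not frozen:
            extended.append((bits + (1,), metric * p1))
    extended.sort(key=lambda p: p[1], reverse=True)
    return extended[:list_size]


def toy_prob_one(bits: Tuple[int, ...]) -> float:
    """Hypothetical bit probabilities, for illustration only."""
    return 0.6 if len(bits) % 2 == 0 else 0.3


paths: List[Path] = [((), 1.0)]
paths = scl_step(paths, toy_prob_one, list_size=2)               # 2 candidates, both kept
paths = scl_step(paths, toy_prob_one, list_size=2)               # 4 candidates, pruned to 2
after_frozen = scl_step(paths, toy_prob_one, list_size=2, frozen=True)  # no doubling
```

With L = 2, the second step produces four candidates and keeps the two with the largest metrics, which is the pruning the text describes.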


Fig. 2. SCL decoder search process [11]

At the end of the decoding process (when the leaf nodes are reached), the most likely of the L decoding paths (the one with the largest metric in the list) is selected as the output of the decoder. Figure 2 gives a simple example of searching a code tree using an SCL decoder, with L = 2, n = 4 and k = 4. At level 1, the SCL decoder visits both nodes, with $\hat{u}_1 = 0$ and $\hat{u}_1 = 1$. Then, at level 2, the two descendant nodes (child nodes) of each node of the previous level are explored (4 nodes in total). Since the size of the decoder list is L = 2, after calculating all 2L = 4 new path metrics associated with these child nodes, the SCL decoder keeps the L = 2 paths that have the largest metrics (the most likely ones). Then, at level 3, the 2L = 4 child nodes (the numbered black nodes in Fig. 2) connected to the L paths retained at the previous level (level 2) are visited by the SCL decoder. Pruning is also applied at this level to choose the L = 2 most likely paths. At the last level, level 4, 2L = 4 child nodes are again visited, of which the L = 2 most probable are retained. Finally, the decoder outputs the path with the largest metric of the two candidates in the list, which are (0011) with a metric of 0.2 and (1000) with a metric of 0.36. The valid decoding path (1000), which could not be found by the SC decoder, can now be obtained by the SCL decoder.

3.2 CALSC Decoding

When the transmitted code word is not the most likely path (the path chosen by the SCL decoder), decoding errors occur. Indeed, in this case there is another, more likely path than the transmitted one. This kind of case is not common, but if it happens, even the Maximum Likelihood (ML) decoder, known to be the most powerful, will fail to decode the correct code word. To solve this problem, it is useful to have a tool that identifies the transmitted code word if it is in the list, thus improving the performance of polar codes. This can easily be implemented using cyclic redundancy check (CRC) precoding [12]. This technique consists of adding more non-frozen bits to the polar code. The SCL decoder then first eliminates, among the L candidates, the paths that do not satisfy the CRC, and then chooses the most probable path among those remaining.
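The final selection rule of CRC-aided list decoding can be sketched as a filter followed by a maximum. The candidates, metrics and the even-parity check used below are hypothetical stand-ins chosen for brevity; a real decoder would use the actual CRC. Falling back to the best-metric path when no candidate passes is a common convention and is an assumption here, not something stated in the paper.

```python
from typing import Callable, List, Sequence, Tuple

Candidate = Tuple[Tuple[int, ...], float]  # (decoded bits, path metric)


def select_path(candidates: Sequence[Candidate],
                crc_ok: Callable[[Tuple[int, ...]], bool]) -> Tuple[int, ...]:
    """Pick the most probable candidate among those that pass the check."""
    valid: List[Candidate] = [c for c in candidates if crc_ok(c[0])]
    pool = valid if valid else list(candidates)  # assumed fallback
    return max(pool, key=lambda c: c[1])[0]


def even_parity(bits: Tuple[int, ...]) -> bool:
    """Toy stand-in for a CRC check: accept words with an even number of ones."""
    return sum(bits) % 2 == 0


candidates = [((1, 0, 1, 1), 0.40), ((1, 0, 0, 1), 0.35), ((0, 1, 1, 0), 0.25)]
chosen = select_path(candidates, even_parity)
```

Here the highest-metric candidate fails the check, so the second-best path is returned, which is exactly the situation the CRC is meant to resolve.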

4 Proposed Decode and Forward Scheme

4.1 Description of the Method

Decode and Forward (DF) is nowadays one of the best cooperation techniques implemented at the half-duplex relay level when the source-relay channel is better than the source-destination channel ($SNR_{SR} > SNR_{SD}$). In our model, we denote by M the message to be sent, of length $N = 2^n$ ($n \ge 0$), composed of K information bits and (N − K) frozen bits. Let $\hat{M}$ be the estimate of M. We first perform a CRC coding of the K information bits before the polar encoding at the source.

• CRC coding: CRC (Cyclic Redundancy Check) coding detects error bursts and, by exploiting polynomial algebra, can also correct them when they are not too large. Consider a vector S of k = K − c information bits. It is identified with a polynomial with binary coefficients:

$$S(X) = s_{k-1} X^{k-1} + \cdots + s_1 X + s_0 \qquad (8)$$

We introduce a generator polynomial of the form:

$$P(X) = X^r + \cdots + 1 \qquad (9)$$

The leading term is $X^r$ and the constant term is 1; the other coefficients are arbitrary. We now compute the remainder of the Euclidean division of $X^r S(X)$ by $P(X)$, operating modulo 2, and we have:

$$X^r S(X) = Q(X) P(X) + R(X) \qquad (10)$$

The remainder is a polynomial of degree strictly less than r, which we can write:

$$R(X) = c_{r-1} X^{r-1} + \cdots + c_1 X + c_0 \qquad (11)$$

For transmission, we complete S with the coefficients of the remainder polynomial, which amounts to transmitting the following sequence of length k + c = K (with c = r CRC bits):

$$v = [s_{k-1}, \ldots, s_1, s_0, c_{r-1}, \ldots, c_1, c_0] \qquad (12)$$

corresponding to the polynomial

$$V(X) = X^r S(X) + R(X) \qquad (13)$$

• Source: The source encodes the information bits and the frozen bits according to the channel polarization principle into the code word $x_1^N$, which contains k information bits, c CRC bits and N − K frozen bits. The frozen bits are chosen from the set $\mathcal{F}$ such that:

$$\mathcal{F} = \{ i \in \{0, 1, \ldots, N-1\} : Z(W_N^i) \ge \delta_N \} \qquad (14)$$

where $Z(W_N^i)$ is the Bhattacharyya parameter of the ith channel and $\delta_N$ is a threshold that depends on N and on a constant $\beta$. The code word with its attached CRC is then broadcast to the relay and to the destination in the first time slot.

• Relay: At the relay, after demodulation, the signal passes through the CALSC decoder; the decoding algorithm is described in Sect. 3.

• CRC decoding: On receipt of the signal $y_{1,SR}^N$, the information bits v′ are substantially equal to the original bits v, with a few inverted bits. Mathematically, this can be written as v′ = v + e, where e is a frame consisting of 0s except at the erroneous bit positions, where it contains 1s. In terms of polynomials, we obtain:

$$V'(X) = V(X) + E(X) \qquad (15)$$

Thus, the detector checks whether the received polynomial V′(X) is divisible by P(X):
  • If there are no errors in the transmission, V′(X) is a multiple of P(X).
  • Otherwise, V′(X) is not a multiple of P(X).

After these decoding operations, the signal is re-encoded by the relay as at the source and sent to the destination during the second time slot.

• Destination: The destination receives the two signals $y_{1,RD}^N$ and $y_{1,SD}^N$ coming from different channels (diversity order equal to 2) and combines them by the maximum ratio combining (MRC) method before the final CALSC decoding.
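The CRC encoding and the divisibility check described above amount to polynomial long division over GF(2). The sketch below represents polynomials as bit lists with the highest-degree coefficient first. The generator X^3 + X + 1 is a toy choice for readability; the paper uses a 24-bit CRC whose polynomial is not given, and all function names are illustrative.

```python
def poly_mod2_remainder(bits, generator):
    """Remainder of bits(X) divided by generator(X) over GF(2), MSB first."""
    work = list(bits)
    r = len(generator) - 1  # degree of the generator polynomial
    for i in range(len(work) - r):
        if work[i]:
            # Subtract (XOR) the generator aligned at position i.
            for j, g in enumerate(generator):
                work[i + j] ^= g
    return work[-r:]


def crc_encode(info, generator):
    """Append R(X) = X^r S(X) mod P(X) to the information bits."""
    r = len(generator) - 1
    remainder = poly_mod2_remainder(list(info) + [0] * r, generator)
    return list(info) + remainder


def crc_check(word, generator):
    """True if word(X) is a multiple of generator(X)."""
    return not any(poly_mod2_remainder(word, generator))


P = [1, 0, 1, 1]                  # toy generator X^3 + X + 1
v = crc_encode([1, 1, 0, 1], P)   # information bits followed by 3 CRC bits
corrupted = v.copy()
corrupted[2] ^= 1                 # flip one bit to emulate a channel error
```

The unmodified word passes the check, while the single-bit error leaves a non-zero remainder and is detected.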

4.2 Description of the Method Based on the Compress and Forward

The use of the CALSC decoder at the relay provides a gain in performance and considerably reduces the complexity of the cooperation scheme compared with some other cooperation schemes such as Compress and Forward (CF) [10]. In the CF cooperation method, the polar-coded signal is first broadcast to the relay and to the destination in the first time slot ($Y_{SD}$ and $Y_{SR}$), as shown in Fig. 3.

Fig. 3. Compress-and-forward scheme [10]

Cooperation takes place when the state of the source-relay channel is better than that of the source-destination channel. At the relay, if there is no decoding error, the message is encoded and sent to the destination during the second time slot, as in conventional DF mode. However, if there are decoding errors, a polar source encoding (compression of $Y_{SR}$, which gives $Y_Q$) is performed on the signal received from the source, and the result is modulated and sent to the destination in the second time slot. In this case, before the signals coming directly from the source and from the relay are combined at the destination, a decompression operation is performed using the frozen bits from the source.

5 Simulation and Results

For the simulations, we use the following parameters:

• Polar code word length: N = 1024
• Number of information bits: K = 512
• The input sequence of length K of the polar encoder contains $L_{CRC} = 24$ CRC bits
• Modulation: BPSK
• Channel type: half duplex

In Fig. 5, it can be seen that CALSC decoding gives better BER performance than SCL decoding. This confirms the theory of Tal and Vardy [11]. These authors proposed the use of an error detector in SCL decoding, especially when the transmitted code word is not the most likely one in the list. Cooperative communication further enhances performance over conventional communication systems (without cooperation), as shown in Fig. 4. The reason is that, at the destination, the signals from the source and the relay are combined to increase resistance to channel fluctuations by exploiting spatial diversity.
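The combining step at the destination can be illustrated in the LLR domain. For BPSK over AWGN with noise variance sigma^2, the LLR of a received sample y is 2y/sigma^2, and maximum ratio combining of two independent observations of the same symbol reduces to adding their LLRs. The sketch assumes unit transmit power and a relay that forwards the same code word as the source; the function name and the sample values are hypothetical.

```python
import numpy as np


def mrc_llr(y_sd, y_rd, sigma_sd, sigma_rd):
    """Combined LLRs of the S-D and R-D observations (BPSK, 0 -> +1)."""
    y_sd = np.asarray(y_sd, dtype=float)
    y_rd = np.asarray(y_rd, dtype=float)
    # Each branch is weighted by the inverse of its noise variance.
    return 2.0 * y_sd / sigma_sd ** 2 + 2.0 * y_rd / sigma_rd ** 2


# Hypothetical received samples, for illustration only.
y_sd = np.array([0.2, -1.1, -0.3])
y_rd = np.array([0.9, -0.8, 0.7])
llr = mrc_llr(y_sd, y_rd, sigma_sd=1.0, sigma_rd=0.5)
hard_bits = (llr < 0).astype(int)
```

In the third position the weak, noisy S-D sample alone would suggest bit 1, but the more reliable R-D branch dominates the sum, which illustrates the diversity benefit.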

Fig. 4. Comparison of our scheme with a non-cooperative scheme

Figure 6 shows the BER curves of our scheme and the Compress and Forward scheme proposed in [10]. Note that, taking into account some equivalent constraints during the simulation, the coding gain between the blue curve (our model) and the red curve [10], can reach 0.25. This is because CALSC decoding performs better than SCL decoding as shown in Fig. 5. In addition, the Compress and Forward mode is less efficient than DF when the state of the source-relay channel is good. This can be explained by the fact that the word will be decoded with less error at the relay before forwarding it to destination.


Fig. 5. Comparison of CALSC and SCL DF method based on polar code

Fig. 6. BER performance between our model and the scheme proposed in [10]


Fig. 7. BER performance for different sizes of CRC

It should also be noted that a long CRC increases the possibility of detecting errors but decreases the efficiency of the coding. As explained above, the CRC bits are taken from the information bits during precoding. Figure 7 shows different BER curves depending on the size of the CRC used. As expected, the larger the CRC, the better the performance. In practice, the length of the CRC is standardized and is usually a power of 2. Figure 8 shows the performance of CALSC decoding as a function of the list size L. The decoding becomes more effective as the list size L grows (as is the case for the CRC length). It should also be emphasized that the larger the list L, the more complex the decoding and the higher the latency. In general, the list size L and the CRC length in CALSC decoding are chosen according to the target application.
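The efficiency cost of a longer CRC can be quantified with the parameters of this section. Since the CRC bits are taken from the K non-frozen positions, the effective rate is (K − c)/N. The sketch below evaluates it for a few CRC lengths; only the 24-bit case is taken from the simulation parameters, and the other lengths are illustrative assumptions.

```python
def effective_rate(n_code: int, k_total: int, crc_len: int) -> float:
    """Fraction of code bits that carry payload when the CRC uses k_total slots."""
    if not 0 <= crc_len <= k_total <= n_code:
        raise ValueError("require 0 <= crc_len <= k_total <= n_code")
    return (k_total - crc_len) / n_code


# N = 1024 and K = 512 come from the simulation parameters.
rates = {c: effective_rate(1024, 512, c) for c in (8, 16, 24, 32)}
```

With the 24-bit CRC, 488 of the 1024 code bits carry payload, so the effective rate drops from 0.5 to about 0.477.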


Fig. 8. BER performance for different sizes of list

6 Conclusion

In this work, we propose a new cooperative coding method based on the use of polar codes. In addition to SCL decoding, which provides the L most likely candidates, the CRC detection technique, which eliminates the least suitable candidates, is applied to further reduce decoding errors. The simulation results show that the proposed method gives better performance than some previous works. Throughout the work, perfect knowledge of the state of the channels (S-R, S-D and R-D) has been assumed, which is never the case in practice. In our future research, we will therefore try to propose a new method for a better estimation of these channels.

References

1. van der Meulen, E.C.: Three-terminal communication channels. Adv. Appl. Probab. 3, 120–154 (1971)
2. Diop, I., Ndiaye, I.P., Fall, P.A., Diallo, M.: Optimization of LDPC codes used in cooperative relay systems: case of mobile telephony. In: IEEE International Symposium on Networks, Computers and Communications (ISNCC) (2017)
3. Wang, Y., Feng, W., Xiao, L., Zhao, Y., Zhou, S.: Coordinated multi-cell transmission for distributed antenna systems with partial CSIT. IEEE Commun. Lett. 16(7), 1044–1047 (2012)
4. Laneman, J.N., Tse, D.N., Wornell, G.W.: Cooperative diversity in wireless networks: efficient protocols and outage behavior. IEEE Trans. Inf. Theory 50(12), 3062–3080 (2004)
5. Arıkan, E.: Channel polarization: a method for constructing capacity-achieving codes for symmetric binary-input memoryless channels. IEEE Trans. Inf. Theory 55, 3051–3073 (2009)
6. Andersson, M., Rathi, V., Thobaben, R., Kliewer, J., Skoglund, M.: Nested polar codes for wiretap and relay channels. IEEE Commun. Lett. 14(8), 752–754 (2010)
7. Bravo-Santos, A.: Polar code for Gaussian degraded relay channels. IEEE Commun. Lett. 17(2), 365–368 (2013)
8. Duo, B., Zhong, X., Guo, Y.: Practical polar code construction for degraded multi-relay networks. China Commun. 14(4), 127–139 (2017)
9. Soliman, T., Yang, F., Ejaz, S., Almslmany, A.: Decode-and-forward polar coding scheme for receive diversity: a relay partially perfect retransmission for half-duplex wireless relay channels. IET Commun. 11(2), 185–191 (2017)
10. Madhusudhanan, N., Nithyanandan, L.: Compress-and-forward relaying with polar codes for LTE-A system. In: International Conference on Communication and Signal Processing, 3–5 April 2014, India
11. Tal, I., Vardy, A.: List decoding of polar codes. In: Proceedings of IEEE International Symposium on Information Theory (ISIT), Aug 2011
12. Tal, I., Vardy, A.: List decoding of polar codes. IEEE Trans. Inf. Theory 61, 2213–2226 (2015)

MAC Protocols for Wireless Mesh Networks with Multi-beam Antennas: A Survey

Gang Wang(✉) and Yanyuan Qin

Department of Computer Science and Engineering, University of Connecticut, Storrs, USA
[email protected]

Abstract. Multi-beam antenna technologies provide promising solutions to many of the challenges currently faced in wireless mesh networks. A multi-beam antenna can form several beams simultaneously and initiate concurrent transmissions or receptions using multiple beams, thereby increasing the overall throughput of the network. Multi-beam antennas can increase spatial reuse, extend the transmission range, improve transmission reliability, and reduce power consumption. Traditional Medium Access Control (MAC) protocols for wireless networks are largely based on the IEEE 802.11 Distributed Coordination Function (DCF) mechanism, which cannot take advantage of these unique capabilities of multi-beam antennas. This paper surveys the MAC protocols for wireless mesh networks with multi-beam antennas. The paper first discusses some basic information on designing multi-beam antenna systems and MAC protocols, and then presents the main challenges for MAC protocols in wireless mesh networks compared with traditional MAC protocols. A qualitative comparison of the existing MAC protocols is provided to highlight their novel features, which provides a reference for designing new MAC protocols. To provide some insights for future research, several open issues of MAC protocols for wireless mesh networks using multi-beam antennas are discussed.

Keywords: MAC protocols · Wireless mesh networks · Multi-beam antennas

1 Introduction

With the increasing popularity of wireless local access, there is a high demand to improve the throughput and energy efficiency of data transmission between terminals and access points (or base stations). Traditionally, wireless networks are designed to provide single-hop transmission, either to Wireless Local Area Network (WLAN) access points or to base stations. However, the explosive growth of wireless network deployments in practice has sparked the idea of Wireless Mesh Networks (WMN) [1], which extend wireless coverage, improve overall capacity and enable network auto-configuration. A wireless mesh network consists of communication nodes organized in a mesh topology and is also a form of wireless ad-hoc network. Wireless mesh networks often consist of mesh clients, mesh routers and gateways [2]. The use of multi-beam antennas (smart antennas) in wireless mesh networks has received growing attention thanks to their higher antenna gain, better spatial reuse, longer transmission range, and lower interference between beams [3]. Therefore, it is of great interest to consider the use of multi-beam smart antennas in wireless LANs, especially in wireless mesh networks. The increasing interest in WMNs, whose applications began in battlefield and disaster relief environments, has since extended to a broader arena [4]. The advantages of multi-beam antennas in WMNs have attracted researchers from both academia and industry, resulting in rapid commercialization as well as numerous standardization efforts [5]. With a multi-beam smart antenna system, multiple omni-antenna or directional-antenna nodes may transfer data to or from other nodes simultaneously, thus potentially increasing the throughput substantially. An omni-directional antenna usually spreads the electromagnetic energy of the wireless signal over a large area in space, while only a very small portion is actually received by the intended receiver, thus potentially limiting the overall capacity and performance. Omni-directional antennas also have some common problems, e.g., multipath fading, delay spread and co-channel interference (CCI) [7].

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 117–142, 2020. https://doi.org/10.1007/978-3-030-12388-8_9
Currently, the availability of low-cost computing capacity and the development of new algorithms for processing signals from arrays of simple antennas make beamforming antennas available to wireless communication systems [8,9]. Beamforming antennas commonly comprise arrays of simple smart antennas, which form multi-beam antennas (MBA). Multi-beam antennas can enhance the radiated electromagnetic waves in wireless communications by actively controlling the temporal phases among the radiating elements of the antenna array using Digital Signal Processing (DSP) units. A typical wireless local area network consists of an Access Point (AP) and a finite set of mobile stations. Generally, the AP is much more powerful and less physically constrained than the mobile stations, and acts as a kind of Full Function Unit (FFU). The AP is usually equipped with multiple smart antennas to boost the network throughput by exploiting spatial reuse [10]. Existing multi-beam smart antennas can be broadly classified into three categories: switched multi-beam antennas, adaptive array antennas, and multiple-input-multiple-output (MIMO) links [11]. Each of these antenna technologies has its pros and cons, which we discuss later. Switched multi-beam antennas are relatively simple and commercially available, and have been deployed in many real applications [3].

MAC Protocols for Wireless Mesh Networks with Multi-beam Antennas...

The superior capabilities of smart antennas can be leveraged through appropriately designed upper-layer network protocols, including MAC protocols. However, designing such MAC protocols raises several challenges that traditional MAC protocols do not face. Traditional network protocols were originally designed to run on nodes equipped with omni-directional antennas. They fail to interact with the underlying smart multi-beam antennas and, without appropriate control, may even degrade the overall performance below the level achieved with omni-directional antennas [12]. Hence, it is essential to investigate innovative protocols, especially in the MAC layer, that are capable of harnessing the potential benefits of smart multi-beam antennas in wireless mesh networks. The rest of the paper is organized as follows. Section 2 introduces background information on multi-beam antennas. Section 3 describes current design challenges for beamforming antennas. Section 4 presents a classification of MAC protocols. Section 5 surveys classic MAC protocols for WMNs with multi-beam antennas. Section 6 discusses the problems with current designs and gives predictions. Section 7 concludes this paper.

2

Basics of Multi-beam Antennas

In this section, we briefly introduce multi-beam antennas and MAC protocols in wireless mesh networks.

2.1

Multi-beam Smart Antennas

Wireless mesh networks typically extend the infrastructure-based single-hop wireless network [1]. Initially, almost all wireless architectures assumed omni-directional communication in a wireless mesh network, which causes poor spatial reuse in multi-hop networks and adversely affects the network capacity [13]. The transmission capacity can be enhanced considerably by using smart antennas, given their better spatial reuse [14]. Recent research has investigated the applicability of multi-beam antennas in wireless mesh networks. A multi-beam smart antenna, or multi-beam antenna for short, is a multiple-beam antenna array. This antenna array can simultaneously transmit (or receive) multiple packets on different beams using the same channel, thus substantially improving the single-hop throughput as well as the overall network throughput [15]. However, simultaneous transmission (or reception) by the same node requires smart antennas with spatial multiplexing and demultiplexing capability, which has been termed Space Division Multiple Access (SDMA) in the literature [8,16]. Before discussing the multi-beam antenna, it is helpful to discuss beamforming antennas. The primary function of any radio antenna is to couple electromagnetic energy from one medium to another. Omni-directional antennas, such as simple dipole antennas, radiate/receive energy equally to/from all directions.

G. Wang and Y. Qin

A directional antenna, another type of smart antenna, radiates/receives more energy to/from one specific direction than the others [5]. One of the most important characteristics for quantifying the quality of an antenna is its gain. The gain is usually given for directional antennas and indicates the relative power in a certain direction compared to an omni-directional antenna; it is often measured in dBi. Specifically, the gain of an omni-directional antenna equals 0 dBi. For reciprocal antennas, which both transmit and receive, the gain can be further separated into a transmission gain and a reception gain. Exact gain values are difficult to obtain due to the properties of wireless signals; the gain values in all directions of space are represented by the antenna radiation pattern. A directional antenna pattern usually consists of a high-gain main lobe and several lower-gain side and back lobes [5]. Figure 1 shows an example of an antenna radiation pattern with the main lobe pointing to 90° and side lobes with smaller gains. The axis of the main lobe is known as the boresight of the antenna; it lies along the direction of the peak gain, the maximum gain over all directions. The beam width formally refers to the angle subtended by the directions on either side of the boresight at which the gain is 3 dB below the peak gain. Directional antennas are often assumed to have an ideal pattern in which the gain is constant within the main lobe and zero outside it. However, such ideal antennas are not achievable in real designs because of interference, especially for multi-beam antennas.

Fig. 1. Antenna radiation pattern with a main lobe pointing to 90° and side lobes with small gains
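To make the definitions of boresight and beam width concrete, the following sketch estimates both from a sampled radiation pattern. The cosine-power main-lobe model, the flat -30 dBi floor for side and back lobes, the 10 dBi peak, and all function names are illustrative assumptions; they are not taken from Fig. 1 or from any measured antenna.

```python
import math

def pattern_gain_dbi(angle_deg, boresight_deg=90.0, peak_dbi=10.0, exponent=8):
    """Hypothetical pattern: gain falls off as cos^exponent of the offset
    from boresight; directions behind the array get a flat floor."""
    c = math.cos(math.radians(angle_deg - boresight_deg))
    if c <= 0.0:
        return -30.0  # crude stand-in for side and back lobes (assumed)
    return max(peak_dbi + 10.0 * exponent * math.log10(c), -30.0)

def boresight_and_beamwidth(gain_fn, samples_per_degree=10):
    """Return (boresight angle, 3 dB beam width), both in degrees."""
    n = 360 * samples_per_degree
    gains = [gain_fn(i / samples_per_degree) for i in range(n)]
    peak = max(gains)
    k = gains.index(peak)          # index of the boresight direction
    threshold = peak - 3.0         # half-power level
    # Walk away from boresight on each side while the gain stays within 3 dB.
    # The bound on the step count only guards against a pattern with no drop.
    right = 0
    while right < n and gains[(k + right + 1) % n] >= threshold:
        right += 1
    left = 0
    while left < n and gains[(k - left - 1) % n] >= threshold:
        left += 1
    return k / samples_per_degree, (left + right) / samples_per_degree

boresight, beamwidth = boresight_and_beamwidth(pattern_gain_dbi)
print(f"boresight = {boresight} deg, 3 dB beam width = {beamwidth} deg")
```

For this assumed pattern the main lobe points to 90° and the beam width is the angular span over which the gain stays within 3 dB of the peak.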

A smart antenna is usually equipped with an array of antenna elements that are physically separated by a fraction of the wavelength, which allows it to produce a specific antenna radiation pattern. The overall radiation pattern of an antenna array is determined by several important parameters, such as the number of antenna elements (e.g., dipoles), the spacing between the elements, the geometrical configuration of the array, and the amplitude and phase of the signal applied to each element [5]. A beamforming antenna is a type of smart antenna that includes a Multiple Input Multiple Output (MIMO) control system [6]. This system combines the antenna array with Digital Signal Processing (DSP) techniques to control transmission and reception at the antenna elements. Usually, the beamforming antenna employs sophisticated array scheduling and control algorithms to automatically and adaptively control the overall radiation pattern of the antenna. In this paper, the terms beamforming antenna and directional antenna are used interchangeably. Figure 2 shows the coverage range of different transmission modes: (a) Omni-directional Mode. (b) Uni-directional Mode. (c) Multi-directional Sequential Mode. (d) Multi-directional Concurrent Mode.

Fig. 2. The coverage range of different transmission modes
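As an illustration of how the array parameters listed above shape the radiation pattern, the following sketch evaluates the normalized array factor of a uniform linear array, a standard textbook model. The choice of a linear geometry, equal element amplitudes, the angle convention (measured from the array axis), and the function name are assumptions made for this example only.

```python
import cmath
import math

def array_factor_db(theta_deg, n_elements, spacing_wavelengths, steer_deg):
    """Normalized array factor (in dB) of a uniform linear array.

    Angles are measured from the array axis. Every element has the same
    amplitude; a progressive phase shift steers the main lobe to steer_deg.
    """
    kd = 2.0 * math.pi * spacing_wavelengths
    # Phase applied per element so that contributions add up at steer_deg.
    beta = -kd * math.cos(math.radians(steer_deg))
    psi = kd * math.cos(math.radians(theta_deg)) + beta
    total = sum(cmath.exp(1j * n * psi) for n in range(n_elements))
    magnitude = abs(total) / n_elements
    return 20.0 * math.log10(max(magnitude, 1e-6))  # clamp exact nulls

# Eight elements, half-wavelength spacing, main lobe steered to 60 degrees.
for theta in (30, 60, 90, 120):
    print(theta, round(array_factor_db(theta, 8, 0.5, 60), 1))
```

Changing only the per-element phase (`steer_deg`) moves the main lobe without moving the antenna, which is the property that beamforming MAC protocols rely on.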

Compared to traditional omni-directional antennas [17], beamforming antennas offer several potential benefits:
1. Significantly reduced interference: Because the radiated energy of a directional antenna is concentrated in the direction of the intended receiver, the transmission (or reception) causes little interference to neighboring nodes residing in other directions.
2. Increased Signal-to-Noise Ratio (SNR): With the same transmit power, the gain of a beamforming antenna focuses more energy in the intended direction, which increases the SNR and, in turn, the link quality and transmission rate.
3. Extended communication range: In wireless mesh networks, a longer communication range may lead to routes with fewer hops and consequently reduce the end-to-end delay [18]; it may also improve the connectivity of the network [19].
4. More energy-efficient communication [20].
5. More secure wireless communication: The beamforming antenna can reduce the risks of eavesdropping and jamming [21,22].
6. Location estimation [23] and efficient broadcasting [24].

2.2

Medium Access Control (MAC)

To reap the benefits of multi-beam smart antennas, the link layer needs to be properly designed to provide its service to the network layer. The IEEE 802.* standards provide the corresponding protocols for wireless LANs and ad hoc networks. One of the goals of the MAC protocol is to set the rules that enable efficient and fair sharing of the common wireless channel [26,27]. MAC protocols for wireless networks can be classified into two major categories: contention-free MAC and contention-based MAC [28]. Contention-free MAC is largely based on controlled access, in which the channel is allocated to each node according to a predetermined schedule. Contention-based MAC is usually implemented through random access, in which the nodes compete for access to the shared medium. When a conflict occurs, a distributed conflict resolution algorithm is used to resolve the collision. IEEE 802.11, the de facto standard for medium access control in wireless networks, is designed for omni-directional antennas [25]. The IEEE 802.11 standard provides one mandatory channel access function, DCF (Distributed Coordination Function), and one optional channel access function, PCF (Point Coordination Function). The difference between the two is that PCF is centralized, while DCF is fully distributed. Here we describe the general design of DCF in the MAC layer, and then we discuss the MAC-layer design challenges for multi-beam antennas. IEEE 802.11 DCF is a carrier-sense-based MAC protocol that employs the CSMA/CA (Carrier Sense Multiple Access with Collision Avoidance) mechanism at the MAC layer. CSMA/CA provides contention-based single-channel access to APs in the network. In the CSMA/CA mechanism, as shown in Fig. 3, when a node wishes to transmit data, it first performs physical carrier sensing before initiating the transmission, which is the CSMA part of the CSMA/CA protocol.
It may happen that two nodes that are outside each other's carrier sensing range try to transmit data to a common node. In this case, a collision occurs at the receiving node. To prevent such collisions, collision avoidance, the CA part of the protocol, is implemented by a handshaking mechanism before data transmission [29]. For

Fig. 3. Basic operation of IEEE 802.11 DCF

the handshaking mechanism, if the sender senses the channel idle for a DCF Interframe Space (DIFS) period, it transmits a short Request-To-Send (RTS) packet to the intended receiver, and the receiver in turn responds, after a Short Interframe Space (SIFS) period, with a short Clear-To-Send (CTS) packet. Both RTS and CTS packets contain the proposed duration of the transmission. If the channel is sensed busy, the node waits until it becomes idle again and then chooses a random backoff duration from the interval [0, CW], where CW is the contention window. The CW is an integer between CWmin and CWmax, both of which depend on physical layer characteristics. Initially, the CW is set to CWmin. The node decrements its backoff timer by one after each idle slot time. When the timer reaches zero and the channel is still sensed idle, the sender transmits its packet. If any activity is detected on the channel during the backoff period, the node freezes its backoff timer and resumes the countdown when the channel is idle again. If the CTS or ACK packet is not received, the sender assumes that a collision has occurred with some other transmission and invokes the binary exponential backoff algorithm: it doubles its CW, chooses a new backoff interval, and retries the transmission. The CW is doubled on each collision until it reaches the maximum threshold, CWmax. The number of retransmission attempts is also limited by a threshold, after which the packet is discarded. When a packet is successfully transmitted between the sender and the receiver, the contention window CW is reset to its minimum value for the next transmission.
The above CSMA/CA mechanism reduces the probability of two nodes transmitting data at the same time. The random deferral by each node ensures fair channel access in the long term and is given by $\mathrm{RandomBackoff} = \mathrm{Random}() \times \mathrm{SlotTime}$, where Random() returns a pseudo-random integer drawn from a uniform distribution over the interval [0, CW] and SlotTime depends on the corresponding physical layer characteristics. The design of the IEEE 802.11 standard implicitly assumes an omni-directional antenna at the physical layer. With multi-beam smart antennas, however, the IEEE 802.11 standards do not work properly.
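The backoff rule described above can be sketched in a few lines. This is a simplified illustration, not an implementation of the standard: the constants `CW_MIN`, `CW_MAX`, `SLOT_TIME_US`, and `RETRY_LIMIT` are assumed example values (the real ones depend on the physical layer), carrier sensing and timer freezing are omitted, and `attempt_succeeds` is a hypothetical callback standing in for "the CTS or ACK came back".

```python
import random

CW_MIN = 15          # assumed example value
CW_MAX = 1023        # assumed example value
SLOT_TIME_US = 9     # assumed slot time in microseconds
RETRY_LIMIT = 7      # assumed retransmission threshold

def next_contention_window(cw):
    """Double the window after a failed attempt, capped at CW_MAX.

    CW values are assumed to have the form 2^k - 1, so "doubling" maps
    15 -> 31 -> 63 and so on.
    """
    return min(2 * (cw + 1) - 1, CW_MAX)

def random_backoff_us(cw, rng=random):
    """Random Backoff = Random() x SlotTime, with Random() uniform in [0, CW]."""
    return rng.randint(0, cw) * SLOT_TIME_US

def send_with_backoff(attempt_succeeds, rng=random):
    """Return (attempt number, total backoff in us), or (None, total) if dropped."""
    cw = CW_MIN
    waited_us = 0
    for attempt in range(1, RETRY_LIMIT + 1):
        waited_us += random_backoff_us(cw, rng)
        if attempt_succeeds():
            return attempt, waited_us   # CW starts at CW_MIN again next time
        cw = next_contention_window(cw)  # assumed collision: widen the window
    return None, waited_us               # retry limit reached: packet discarded

# Example: the first two attempts collide, the third one succeeds.
outcomes = iter([False, False, True])
print(send_with_backoff(lambda: next(outcomes), random.Random(1)))
```

A node that repeatedly fails for a reason other than a collision, such as the deafness discussed in Sect. 3, still walks through the same window growth, which is why misdiagnosed failures are costly.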

3

MAC Design Challenges for Beamforming Antennas

As mentioned earlier in this paper, two types of multi-beam smart antenna systems have been widely studied in the current literature: one is based on adaptive arrays and the other on fixed-beam directional antennas. An adaptive-array smart antenna may work better in a multipath-rich environment; however, the transceiver and the corresponding MAC protocols are more complex to design. This section discusses the main challenges in designing a MAC protocol for a beamforming environment.

3.1

Beam-Synchronization Constraint

With a multi-beam antenna, the beams at the AP must cooperate for the AP to work correctly. To avoid the co-site interference problem, all sectors at the AP must be in either the transmission mode or the reception mode at the same time [3]. This requires sophisticated scheduling policies to handle the synchronization between the two modes.

3.2

Hidden Terminal Problem

The hidden terminal problem of traditional wireless networks occurs when two nodes are outside each other's carrier sensing range during CSMA and both attempt to communicate with a common node. The traditional solution is to perform the RTS/CTS handshake before data transmission to avoid collisions [29]. For example, as shown in Fig. 4, assume that station F wants to communicate with the AP while the AP is communicating with either station B or station E. Station F, which cannot hear the signal from station B or station E, infers that the medium is free and sends data to the AP, which then receives collided data.

Fig. 4. An example of a sectorized multi-beam antenna system. The beams and sectors are numbered in a clockwise direction

In a multi-beam antenna system, new hidden terminal problems arise. When a potential interferer cannot receive the RTS/CTS exchange because of its antenna orientation during the handshake and then initiates a data transmission, a collision may occur. There are two new types of hidden terminal problems [30]:

Hidden Terminal Due to Asymmetry in Gain. Multi-beam antennas use beamforming to increase the gain when transmitting data, so the gain in the omni-directional mode is much smaller than the gain in the beamformed mode. If an idle node is listening to the medium omni-directionally, it will be unaware of some ongoing transmissions that could be affected by its own directional transmission [5], which causes a collision.

Hidden Terminal Due to Unheard RTS/CTS. This type of hidden terminal problem is caused by the loss of channel state information during beamforming. When a node is involved in a beamformed communication, it appears deaf in other directions; therefore, important control information may be lost during that time. When a neighboring node fails to receive the channel reservation packets, such as RTS and CTS, exchanged by a transmitter-receiver pair, it becomes unaware of the imminent communication between that pair and could later initiate a transmission that causes a collision [5].

The hidden terminal problem is much more serious in wireless mesh networks, which need a more complex MAC protocol to synchronize the exchanged information among neighboring nodes, especially when dealing with mobility. For a traditional AP with omni-directional antennas, mobility may not be a big problem. For an AP with multi-beam antennas, however, it becomes a non-negligible problem because each node is physically associated with a beam-sector. Medium access control, especially downlink medium access control, is highly affected when a mobile node moves from one beam-sector to another. An efficient location updating algorithm is needed to keep the location information fresh with reasonable overhead, which increases the complexity of the wireless mesh network [3].

3.3

Deafness

One of the aims of using multi-beam antennas is to exploit spatial reuse. However, deafness may occur and is by far the most critical challenge in such a wireless network; it was first identified in the context of the basic directional MAC protocol [31–34]. Deafness occurs when a transmitter tries to communicate with a receiver but fails because the receiver is beamformed towards another direction, away from the transmitter. Unlike with omni-directional antennas, the intended receiver is then unable to receive the transmitter's signal and thus appears deaf to the transmitter. Figure 4 illustrates the deafness problem, also known as the receiver blocking problem. Assume that station B in sector 1 intends to send data to the AP while the AP is sending data to station A in sector 0. Without hearing the beamformed signal from the AP to station A, station B assumes at the end of its backoff that the medium is free and sends its data to the AP. Because of the beam-synchronization constraint, the multi-beam AP cannot receive while it is transmitting; it is deaf to station B's transmission and, as a result, unable to receive the data sent by station B. Without getting a response, such as a CTS, from the AP, station B typically interprets this failure as an indication of a collision and reacts accordingly: it invokes the binary exponential backoff algorithm before attempting retransmissions. Station B may keep retransmitting until its retry limit is reached. These unnecessary retransmissions reduce the network capacity and waste significant bandwidth. Also, the exponential increase of the contention window results in channel underutilization. To make matters worse, the transmission of station B may corrupt station A's reception from the AP, since station A is located close to station B. Deafness may also lead to short-term unfairness between flows that share a common receiver if the involved transmitter has multiple packets to send and keeps transmitting by choosing backoff intervals from the minimum contention window. Moreover, a deadlock may occur through a chain of deafness, in which each station attempting to communicate with a deaf station itself becomes deaf to another station [34].

3.4

Beam-Overlapping

Another aim of the multi-beam antenna is to improve spatial utilization. Because a multi-beam antenna forms multiple beams in the same smart antenna, some interference between different beams is unavoidable. Due to the physical imperfection of beamforming antennas, two adjacent beams generally share a small beam-overlapping area, as shown in Fig. 4. If a station lies in the beam-overlapping area, it can hear data transmissions from multiple sectors and, in turn, multiple sectors of the AP can hear data transmissions from that station. For example, in Fig. 4, assume that stations C and D are simultaneously sending data to the AP. Since both sector 0 and sector 2 can hear the signal from station D, sector 2 will receive collided data from stations C and D [10], which defeats the purpose of the multi-beam antenna. Moreover, the beam-overlapping problem can cause the back/side-lobe problem. Although beamforming antennas usually have higher gain than traditional antennas, and even though the APs are equipped with multiple high-gain narrow-beam directional antennas, the negative effects of the back/side-lobe interference cannot be totally ignored. For example, in Fig. 4, when station E intends to send data to the AP, all sectors may receive the signal from station E, since it is so close to the AP that it falls in the back-lobe or side-lobe of many other beams [10].

The beam-overlapping problem is also much more serious in a multipath-rich environment. Any station in the range of the AP, say station C in Fig. 4, may hear transmissions from all sectors and, in turn, all sectors at the AP can hear the transmission from station C. This is the multipath-rich problem. In this case, the system degrades to omni-directional communication and no spatial reuse can be exploited.

3.5

Unnecessary Defer

For multi-beam antennas, there are two kinds of unnecessary defer problems: one is the Head-of-Line (HoL) blocking problem of beamforming MAC protocols, which was first identified in [35]; the other is caused by the rules of CSMA/CA. A node with an omni-directional antenna typically uses a First-In-First-Out (FIFO) queueing policy to buffer its packets. This policy works fine with omni-directional antennas because all outstanding packets use the same medium: if the medium is busy, no packet can be transmitted. With multi-beam antennas, however, the medium is spatially divided and may be available in some directions but not in others, since the beams are separated from each other and have their own communication ranges. If the packet at the head of the FIFO queue is destined to a busy station or beam, it blocks all subsequent packets even though some of them could be transmitted. The HoL blocking problem may be aggravated when the head packet goes through a round of failed retransmissions, including their associated backoff periods [36].

Because stations in different beams may be geographically close to each other, or may even lie in a beam-overlapping area, one station can hear another station's signal and may defer its own transmission. For example, in Fig. 4, assume that one of the stations in sector 1, say station B, intends to send data to the AP while the AP is receiving data from station A in sector 0. Stations in different sectors of the same AP can transmit data to the AP simultaneously, so stations A and B could transmit their respective data at the same time because they are located in different sectors. However, since stations A and B are geographically close to each other, station B can hear station A's signal even though they lie in different sectors, and it will keep silent according to the rules of CSMA/CA, which reduces the throughput.

3.6

Miss-Hit

Many smart antenna systems rely on Direction of Arrival (DOA) estimation [37]. In a wireless mesh network, stations tend to be more mobile, which challenges the wireless communication. Based on DOA estimation techniques, when a station sends packets to an AP with a smart antenna system, the AP can identify the beam in which the sending station is located, or the beams if the beam-overlapping or back/side-lobe problem occurs [3]. Due to mobility in a wireless mesh network, the beam location information cached at the AP may become stale and incorrect when a mobile station moves, and it needs to be updated according to the new location. Otherwise, the AP may select the wrong beam for the corresponding downlink transmission, leading to the miss-hit problem.
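A minimal sketch of such a beam-location cache at the AP is given below. It is only meant to illustrate how stale entries lead to miss-hits; the class name, the mapping from a DOA angle to a sector index, and the age-based staleness rule are assumptions made for this example and are not part of any protocol discussed in this paper.

```python
class BeamLocationCache:
    """Hypothetical AP-side cache: station -> sector it was last heard in."""

    def __init__(self, num_sectors, max_age_s):
        self.num_sectors = num_sectors
        self.max_age_s = max_age_s      # entries older than this are distrusted
        self._entries = {}              # station -> (sector, time of update)

    def sector_for_doa(self, doa_deg):
        """Map a DOA estimate to a sector index (equal-width sectors assumed)."""
        sector_width = 360.0 / self.num_sectors
        return int((doa_deg % 360.0) // sector_width)

    def update(self, station, doa_deg, now):
        """Refresh the entry whenever an uplink packet from the station is heard."""
        self._entries[station] = (self.sector_for_doa(doa_deg), now)

    def lookup(self, station, now):
        """Return the cached sector, or None if it is unknown or stale."""
        entry = self._entries.get(station)
        if entry is None:
            return None
        sector, updated_at = entry
        if now - updated_at > self.max_age_s:
            return None  # stale: probe the station instead of risking a miss-hit
        return sector

cache = BeamLocationCache(num_sectors=4, max_age_s=5.0)
cache.update("B", doa_deg=100.0, now=0.0)
print(cache.lookup("B", now=3.0))    # fresh entry
print(cache.lookup("B", now=10.0))   # entry too old to be trusted
```

The trade-off is the one noted in Sect. 3.2: a shorter `max_age_s` reduces miss-hits for mobile stations but increases the location-update overhead.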

4

MAC Protocols Classification

Since IEEE 802.11 DCF-based WLANs have been widely deployed, the question is how an IEEE 802.11 node can transmit and receive data to/from a multi-beam access point given the above-mentioned challenges. The problem of designing an efficient MAC protocol for wireless mesh networks with multi-beam antennas has been of great interest in recent years. Generally, the MAC protocols in the literature can be broadly classified into random access protocols and synchronized access protocols. Random access protocols allow stations lying in different sectors to access the shared medium (the access point) randomly through contention with each other. This type of protocol typically employs CSMA/CA to avoid collisions among different beams. Synchronized access protocols allow the stations to access the medium according to a predetermined schedule, which can be achieved through local and/or global synchronization.

Random access protocols can be further classified into sub-categories according to the techniques used to deal with the MAC challenges described in the previous section. One sub-category relies solely on control packets, such as RTS/CTS, to avoid collisions. The other sub-category employs busy tones that are usually transmitted on a dedicated control channel [5].

Synchronized access protocols require some form of synchronization between the stations to coordinate simultaneous conflict-free transmissions. Because of the well-known co-site interference problem [38], and assuming all the beams operate in the same frequency band, all beams of the multi-beam antenna at the access point must be either in the receiving mode or in the transmission mode. Beam synchronization (for either reception or transmission) is therefore required to facilitate multiple parallel transmissions, which is hard to achieve. A common discipline is to allow only one node to transmit at a time. In this type of protocol, however, time is usually divided into frames, and each frame consists of sub-frames, which are simply groups of time slots [5]. The sub-frames are of two types: one sub-frame is used for channel contention and scheduling, and the remaining sub-frames carry the scheduled contention-free data [39–41]. Because global synchronization is generally difficult to achieve in multi-hop wireless mesh networks, most protocols in the literature choose to rely on local synchronization among neighboring stations [42,43].

Multi-beam antenna MAC protocols for wireless mesh networks can also be classified in other ways, according to different specifications, as shown in Table 1. Note that these classifications are not independent of each other, and one MAC protocol may belong to more than one class. In the next section, we summarize several typical state-of-the-art MAC protocols for wireless mesh networks with multi-beam antennas as examples, regardless of their classification.

Table 1. Taxonomy of MAC protocols for wireless mesh networks with antennas

Classifications        Categories
Antenna capabilities   Switched-beam antennas; steered-beam antennas (adaptive antennas)
Communication range    Omni-directional antennas; directional antennas
Channel number         Single channel; multiple channel
Access medium          Random access; synchronized access

5

Classic MAC Protocol Designs

Wireless mesh networks using multi-beam smart antennas have received intensive attention recently because of their higher gain and throughput. However, the popular contention-based MAC protocols, such as IEEE 802.11, are not very effective for multi-beam antennas, and many of the challenging problems described in Sect. 3 need to be resolved. This section presents several classic MAC protocols that deal with these challenges.

In paper [3], Wang et al. proposed a method for enhancing the performance of medium access control in WLANs with a multi-beam access point. The authors proposed a novel MAC protocol to carefully address these challenging problems and improve the communication efficiency. Their design also considers backward compatibility, whereby an IEEE 802.11 terminal can transparently access a multi-beam access point. In [3], the authors assume a system configured with multi-beam access points, omni-directional mobile nodes, and a single frequency channel, and they provide a distributed MAC layer solution based on IEEE 802.11 DCF. The medium access control is separated into two types: uplink medium access control and downlink medium access control. The basic idea of their MAC protocol is to introduce a timing structure that facilitates multiple handshakes (sequential or overlapping) before parallel collision-free data transmissions. Figure 5 shows the outline of both the uplink and the downlink super-frame in their design. For uplink medium access control, all the nodes with uplink packets in their queues contend for channel access during the contention period. The authors design a contention resolution scheme that lets multiple nodes win such that the winning nodes are collision-free with each other. Here, "collision-free" means that the winning nodes will not collide with each other when they simultaneously send data to, or receive ACKs from, the access point. Different winning nodes must also be in different beam-sectors in order to be collision-free. In wireless mesh networks, the access point needs to contend for the channel like an ordinary node. In this case, the access point can send request-to-receive (RTR) messages with a higher priority than the RTS messages exchanged by the mobile terminals. For downlink medium access control, the access point in a WMN needs to buffer the location information of each node. The authors consider two network scenarios: static and mobile. In the static scenario, the location information, obtained during the association process, is buffered in a static manner. In the mobile scenario, the beam-location information of each node is cached by the AP; it can be updated reactively whenever data needs to be exchanged, or proactively through periodic polling/probing initiated by the access point.

Fig. 5. Uplink and downlink super-frame

In paper [45], Emad et al. proposed a distributed asynchronous directional-to-directional MAC protocol for wireless ad hoc networks. Existing MAC protocols assume that the nodes can operate in both directional and omni-directional modes; however, using both modes can lead to the asymmetry-in-gain problem. The authors proposed a directional-to-directional (DtD) MAC protocol, in which both sender and receiver operate in a directional-only mode, and they also derived the saturation throughput of an ad hoc network using DtD MAC. The DtD MAC protocol is fully distributed, does not require synchronization, eliminates the asymmetry-in-gain problem, and alleviates the effects of deafness and collisions. DtD MAC assumes that sending nodes cache location information about their neighbor nodes; this information is later used to determine the direction in which a node should first try to send directional RTS (DRTS) messages. Idle nodes, i.e., potential receivers, continuously scan through their antenna sectors to emulate omni-directional antennas; once they hear a DRTS intended for themselves, they lock onto the respective direction and respond with a directional CTS (DCTS) message.
Each node estimates and caches the Angle-of-Arrival (AoA) of any message it overhears in order to estimate the direction of an intended receiver. Before sensing the medium, the sending node consults its AoA cache, using a machine learning mechanism, to determine the receiver's most likely direction. Nodes that overhear an ongoing communication set their Directional Network Allocation Vector (DNAV) accordingly, so that they refrain from interrupting the ongoing communication in those directions.

In paper [9], Bao et al. proposed a distributed receiver-oriented multiple access (ROMA) channel access scheduling protocol for ad hoc networks with directional antennas, each of which can form multiple beams and carry on several simultaneous communication sessions. Unlike random access schemes that use on-demand handshakes or signal scanning to resolve communication targets, ROMA determines a number of links for activation in every time slot using only two-hop topology information. By exploiting the multi-beam forming capability of directional antennas in both transmission and reception, ROMA significantly improves network throughput and delay. ROMA adopts Neighbor-aware Contention Resolution (NCR) to derive channel access schedules for a node. In each contention context, contention for the shared resource is resolved according to priorities assigned to the entities based on the context number and their respective identifiers; the entities with the highest priorities are then selected to access the common resource without conflicts. Each link in ROMA has a weight that reflects the data flow demand over the link in the ROMA network topology; this weight is determined dynamically by the head of the link, which monitors traffic demands or receives bandwidth requests from upper-layer applications. Nodes and links are assigned priorities based on their identifiers and the current time slot. When the current time slot is t, the priority of a node i can be expressed as i.prio = Hash(i ⊕ t) ⊕ i, where ⊕ denotes bit-wise concatenation of its operands and has lower precedence than other operations, and Hash() is a fast pseudo-random number generator.
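The priority rule i.prio = Hash(i ⊕ t) ⊕ i can be sketched as follows. This is only an illustration under stated assumptions: identifiers and slot numbers are taken to be non-negative integers below 2^32, SHA-256 truncated to 32 bits stands in for the unspecified Hash(), and the helper names (`concat`, `hash32`, `priority`, `wins_contention`) are hypothetical. The sketch covers only the priority comparison, not ROMA's link activation or receiver/transmitter decision.

```python
import hashlib
from typing import Iterable

WIDTH = 32  # assumed bit width of identifiers and slot numbers


def concat(a: int, b: int, width: int = WIDTH) -> int:
    """Bit-wise concatenation: the bits of a followed by the low bits of b."""
    return (a << width) | (b & ((1 << width) - 1))


def hash32(x: int) -> int:
    """Stand-in for Hash(); any fast pseudo-random function would do."""
    digest = hashlib.sha256(x.to_bytes(16, "big")).digest()
    return int.from_bytes(digest[:4], "big")


def priority(node_id: int, slot: int) -> int:
    """i.prio = Hash(i ⊕ t) ⊕ i, with ⊕ as concatenation."""
    return concat(hash32(concat(node_id, slot)), node_id)


def wins_contention(node_id: int, contenders: Iterable[int], slot: int) -> bool:
    """True if node_id has the highest priority among the contenders
    (for example, its two-hop neighborhood) in the given slot."""
    mine = priority(node_id, slot)
    return all(mine > priority(n, slot) for n in contenders if n != node_id)
```

Because the node identifier is appended to the hash, two distinct nodes can never have equal priorities, so exactly one node in any contender set wins a given slot, and every node can compute the winner locally from the same identifiers and slot number.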
ROMA is a link-activation, receiver-oriented multiple access protocol that exploits the multi-beam forming capability of multi-beam adaptive array antennas. Given up-to-date information about the two-hop neighborhood of a node and the link bandwidth allocations, ROMA decides whether a node i is a receiver or a transmitter in the wireless mesh network, and which of its links can be activated for reception or transmission during time slot t. Before the actual link activation at the transmitters, ROMA has to decide the active incoming links of each node in reception mode.

In paper [10], Chou et al. proposed a MAC protocol named M-HCCA (Multi-beam AP-assisted HCCA), which takes full advantage of the multi-beam antennas equipped at the AP. This protocol not only boosts the overall capacity of the WLAN, but also supports QoS (Quality of Service) and power conservation for individual mobile stations. The protocol has the following attractive features:
(1) It is a polling-based protocol, hence it inherently overcomes the problems induced by carrier sensing or directional signals.
(2) M-HCCA can adaptively adjust the sector configuration, employing a deterministic tree-splitting algorithm as its reservation scheme, to quickly resolve contention/collision and to increase data transmission parallelism within the bounded reservation time; it thus achieves high real-time throughput.
(3) M-HCCA adopts a mobile-assisted, run-time admission control mechanism, such that the AP can admit as many new streams as possible during the reservation procedure without violating the QoS guarantees made to already-admitted streams, even in a multipath environment.
(4) M-HCCA uses a beam-location-aware polling-based access scheme to reduce the energy wasted on collisions and retransmissions as far as possible; at the same time, it solves the beam-overlapping problem and the back/side-lobe problem.
(5) M-HCCA effectively alleviates the miss-hit problem by offering a location updating mechanism that promptly renews the beam-location information of a non-responsive station.

To obtain Wi-Fi service, a mobile station should first discover the presence of APs by passive or active scanning. The AP normally operates in the multi-beam antenna mode during the Contention-Free Period (CFP), except in a multipath-rich environment. Figure 6 shows the super-frame structure for a WLAN with a multi-beam AP using the M-HCCA protocol. If equipped with reconfigurable multi-beam antennas, the AP can adaptively adjust the sector configuration during the collision resolution period and the polling period to speed up the reservation process and minimize the average awake time of polled stations. During the collision resolution procedure, M-HCCA uses an identifier-based tree-splitting algorithm to resolve collisions. The algorithm uses a stack to implement a pre-order traversal of the dimension-splitting tree.
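A stack-driven, pre-order tree-splitting procedure of this general kind can be sketched as below. This is a generic identifier-based binary splitting scheme written for illustration; it is not the exact M-HCCA algorithm of [10], which also splits along the sector dimension. The function name, the use of binary identifier prefixes, and the slot outcomes recorded in the log are all assumptions.

```python
from typing import List, Set, Tuple


def tree_splitting(station_ids: Set[int], id_bits: int) -> List[Tuple[str, str]]:
    """Resolve contention by probing identifier prefixes in pre-order.

    Each probe invites all stations whose binary identifier starts with
    the given prefix. A collision splits the prefix into its two children.
    Returns the probe log as (prefix, outcome) pairs.
    Assumes every identifier fits in id_bits bits.
    """
    log: List[Tuple[str, str]] = []
    stack = [""]  # the empty prefix matches every station
    while stack:
        prefix = stack.pop()
        responders = [s for s in sorted(station_ids)
                      if format(s, f"0{id_bits}b").startswith(prefix)]
        if not responders:
            log.append((prefix, "idle"))
        elif len(responders) == 1:
            log.append((prefix, f"success:{responders[0]}"))
        else:
            log.append((prefix, "collision"))
            # Push the right child first so the left child is probed next,
            # which yields a pre-order traversal of the splitting tree.
            stack.append(prefix + "1")
            stack.append(prefix + "0")
    return log


if __name__ == "__main__":
    for prefix, outcome in tree_splitting({1, 2, 6}, id_bits=3):
        print(repr(prefix), outcome)
```

Since identifiers are unique, a full-length prefix matches at most one station, so the traversal always terminates after a bounded number of probes.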

Fig. 6. Super-frame structure for a WLAN with multi-beam AP

In paper [15], Jain et al. proposed on-demand medium access for multi-hop wireless networks with multiple beam smart antennas. Traditional on-demand MAC protocols for omnidirectional and single-beam directional antennas, based on the DCF mechanism, cannot take advantage of the unique capability of multiple beam antennas and do not facilitate concurrent transmissions or receptions. The paper proposed a novel protocol, hybrid MAC (HMAC), which is backward compatible with IEEE 802.11 DCF and enables concurrent packet reception (CPR) and concurrent packet transmission (CPT) with multiple beamforming antennas. The HMAC design considers a wide-azimuth switched-beam smart antenna comprised of a multiple beam antenna array (MBAA), which is capable of calculating the exact angle of arrival of an incoming signal. HMAC is a cross-layer protocol that uses information from both the network and the physical layers for its operation. Its novel features include its channel access mechanism, algorithms for mitigating deafness and resolving contention, and jump back-off and role priority switching mechanisms for enhancing throughput. In HMAC, each node maintains its neighbors' information in the Hybrid Network Allocation Vector (HNAV) table, as well as a beam table (BT), which stores the backoff duration and two boolean variables for each beam. To mitigate the deafness problem, HMAC employs a hybrid approach based on the algorithm for mitigating deafness (AMD). To solve the channel contention problem, HMAC uses a runtime sender-estimation algorithm (RSA): the node maintains a transmission probability for each neighbor node in its HNAV table, and the reciprocal of this probability is the current estimate of the number of transmitters in the corresponding beam. HMAC can also perform role priority switching, that is, switching between transmitter and receiver roles depending on the packets in its buffer. The rule is that as long as a data packet exists in the queue, the node gives priority to the transmission mode; otherwise, the reception mode takes precedence.

In paper [44], Lal et al. provided a performance evaluation of medium access control for multi-beam antenna nodes in a wireless LAN. The authors used a wide-azimuth switched-beam smart antenna system comprised of a multiple beam antenna array, and analyzed and simulated the one-hop performance of CSMA as well as slotted Aloha for these systems. The paper also investigated the problem of synchronization for multiple beams in CSMA. The results show that, under heavy offered load, CSMA is a good choice for nodes with multiple-beam smart antennas, despite the performance loss due to beam synchronization.
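For intuition about why heavy offered load is the interesting regime when comparing CSMA with slotted Aloha, the sketch below evaluates the textbook single-channel slotted Aloha throughput S = G·exp(−G) for Poisson offered load G. This is the classic result only, not the multi-beam analytical model of [44], and the function name is illustrative.

```python
import math


def slotted_aloha_throughput(offered_load: float) -> float:
    """Classic slotted Aloha throughput S = G * exp(-G), in packets per slot,
    for Poisson offered load G (single channel, no capture)."""
    if offered_load < 0:
        raise ValueError("offered load must be non-negative")
    return offered_load * math.exp(-offered_load)


if __name__ == "__main__":
    for g in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(f"G = {g:4.2f} -> S = {slotted_aloha_throughput(g):.3f}")
```

The throughput peaks at G = 1 with S = 1/e (about 0.368) and falls off on either side, which is the single-channel analogue of the collapse beyond the optimum offered load.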
In addition, CSMA provides a stable throughput that approaches unity and is invariant to fluctuations in the offered load. The performance of slotted Aloha, in contrast, drops drastically beyond the optimum offered load, although slotted Aloha is capable of a higher peak throughput in a narrow range of offered loads as more switched beams are employed. The analytical model considers basic CSMA, with no handshake and no ACK. Carrier sensing is performed on a channel different from the one used for data transmission; this channel comprises a set of narrow-band tones. The model assumes that the estimated receiver beam number is mapped to a discrete tone, and that the omnidirectional range of the tone is set to reach all nodes that potentially lie within a receiver beam, while being kept as low as possible to reduce the probability of misclassification. The analytical model of slotted Aloha is similar to that of CSMA, with some modifications. As with CSMA, the authors use the concept of a regeneration cycle to estimate the throughput; for a subchannel or beam, a cycle consists of an idle period (no terminal has a packet to send) and a busy period (one or more terminals have a packet to transmit).

In the paper [45], Sundaresan et al. presented medium access control for ad hoc networks with multiple-input multiple-output (MIMO) links. MIMO links can provide extremely high spectral efficiency in multipath channels by simultaneously transmitting multiple independent data streams in the same channel. The unique characteristics of MIMO links, coupled with several key optimization considerations, necessitate an entirely new MAC protocol. The authors present a centralized algorithm called stream-controlled medium access (SCMA) that incorporates these key optimization considerations in its design. The relevant physical layer has some unique characteristics: the incoming data is demultiplexed into M streams, and each stream is transmitted from a different antenna with equal power, at the same frequency, with the same modulation format, and in the same time slot. One aim of their work is to exploit the spatial multiplexing gain to increase the capacity of the system. The centralized SCMA protocol for ad hoc networks with MIMO links is based on an observation about the receiver overloading problem: there exists a specific subset of links in the network that contributes to the lack of receiver overloading when performing pure stream control. To handle these bottleneck links, they are scheduled in a non-stream-controlled fashion; such links can then essentially be removed from further scheduling considerations, leaving the scheduling algorithm with only independent contention regions within which pure stream control can be employed. The algorithm thus has two scheduling components: scheduling of the bottleneck links in a non-stream-controlled manner, and scheduling of the non-bottleneck links in the network based on pure stream control.

In the paper [46], Lal et al. proposed a novel MAC layer protocol for space division multiple access in wireless ad hoc networks. Most recent MAC protocols using directional antennas for wireless ad hoc networks are unable to attain substantial performance improvements because they do not enable the nodes to perform multiple simultaneous transmissions/receptions.
In this paper, the authors propose a MAC layer protocol that exploits space division multiple access (SDMA), using the property of directional reception to receive more than one packet from spatially separated transmitter nodes. The protocol employs time division duplexing (TDD) between transmission and reception, so the system does not require two separate antenna systems. It harnesses parallelism in the reception process to improve the throughput at a node, which requires the prospective transmitters to be synchronized to a receiver. Since each transmission is dictated by a potential receiver, the transmitting nodes cannot synchronize their own transmissions with those of others; the system therefore uses a receiver-initiated approach to achieve time synchronization for receptions.

In the paper [47], Choi et al. proposed complementary beamforming. The paper proposed two new methods, called "subspace complementary beamforming (SCBF)" and "complementary superposition beamforming (CSBF)", to deal with the complementary region, a region where some stations in the network cannot sense the directional signals (beams), which often causes the hidden beam problem. SCBF uses dummy data to ensure a controlled level of received energy in all directions of the eigenvectors unused by the downlink channels. CSBF, meanwhile, can also send data in the complementary beam. Using this method, the passive nodes in the network can receive "broadcast" information while the active nodes are engaged in the exchange of user-specific data. When space-division multiple access is not applied, the effect of complementary beamforming can be achieved simply by increasing the transmit power of only one of the antenna elements. The main idea behind complementary beamforming (CBF) is that, in general, much less power is needed for unintended users to correctly detect the presence of a transmission than is required to correctly decode the transmitted packet. The SCBF technique creates a flat beam pattern in the "otherwise hidden beam" directions, so that the transmit power in all directions or locations is guaranteed to be greater than a certain level. CSBF, in turn, applies a linear combination of the CBF vectors of SCBF: by modifying one of the SCBF downlink beamforming column vectors, its sidelobe level is increased without interfering with the desired beams. The overall objective of this design is to increase the probability that all nodes can hear the channel activity, so that they defer their transmissions. This means that the received power at any location in the service area should be greater than a certain threshold.

In the paper [48], Wang et al. proposed a MAC protocol for multi-beam directional antennas in which each beam sector has its own control channel and the communications in different beam sectors are independent. It uses the directional network allocation vector (DNAV) to record the connection establishment processes. After sensing all the sectors of the multi-beam antenna, the protocol uses a global assignment strategy to assign the directional communication channels. The protocol comprises several steps: connection initialization, channel contention, and data communication. These steps are carried out sequentially in each beam sector.

In the papers [49,50], Biomo et al.
presented the case that the full potential of the Multi-Packet Transmission/Multi-Packet Reception (MPT/MPR) capability of multi-beam antennas can be unlocked to drastically reduce the end-to-end delay in ad hoc networks. The authors defined a formal optimization model for delay reduction and observed that the optimal end-to-end delay is attained when links are scheduled in such a way that the opportunities for MPT/MPR are maximized. Their results show that using the shortest routes, a widespread criterion in traditional routing protocols for ad hoc networks, results in higher delays, because bridges among the routes incur waiting and rescheduling delays that add to the end-to-end delay.

In the paper [51], Kuperman et al. presented a novel unslotted, uncoordinated, ALOHA-like random access MAC policy for multi-beam directional systems that asymptotically achieves the capacity of the network. In their setting, each communication node acts independently of the others. Their Multi-Beam Uncoordinated Random Access MAC (MB-URAM) does not make use of any reservation messages, such as scheduling or RTS/CTS, and does not need to synchronize time slots or transmissions. The proposed protocol can asymptotically achieve the maximum possible throughput of any MAC approach, even a scheduled one. In the simulation, the authors considered the effect of practical considerations on the performance of MB-URAM, including power constraints, latency, beamwidth, and packet error rate.

In the paper [52], Babich et al. discussed the design requirements for enabling multiple simultaneous peer-to-peer communications in IEEE 802.11 asynchronous networks in the presence of adaptive antenna arrays, and proposed two novel access schemes to realize multi-packet communication (MPC). The first scheme, threshold access MPC (TAMPC), is based on a threshold on the load sustainable by a single node; the second, signal-to-interference ratio (SIR) access MPC (SAMPC), is based on an accurate estimation of the SIR and on the adoption of low-density parity-check codes. Both schemes rely on the information acquired by each node while monitoring the network activity, which makes them suitable for distributed and heterogeneous scenarios. The setting considers the coexistence of legacy and non-legacy nodes equipped with different antenna systems. In addition, both schemes are designed to be backward compatible with the IEEE 802.11 standard, and their performance is compared with the theoretical one and with that of the IEEE 802.11n extension in a mobile environment.

In the paper [53], Proulx et al. analyzed a new MAC protocol for multi-beam directional networks via high-fidelity simulation using a real-time emulator. The work focuses on exploiting simultaneous transmission to create a distributed, low-complexity, random access MAC protocol for multi-beam directional networks by exploiting the underlying physical abilities of a digital phased array (e.g., the ability to form receive beams a posteriori and to form multiple transmit or receive beams). The protocol includes location tracking and power control methods to ensure that the transmit beam is pointed correctly and with the correct power.
In addition, the proposed scheme is able to track the state of a neighbor's random access protocol in order to drastically reduce the number of dropped packets and the interference in the system. The proposed scheme was implemented on a new Extendable Mobile Ad-hoc Network Emulator (EMANE) model, which allows real-time, high-fidelity performance evaluation.

In the paper [54], Hong et al. reviewed multibeam antenna technologies for 5G wireless communications. The authors presented the key antenna technologies for supporting a high data transmission rate, an improved signal-to-interference-plus-noise ratio, increased spectral and energy efficiency, and versatile beam shaping. Multi-beam antennas hold great promise as the critical infrastructure for enabling the beamforming and massive MIMO that boost 5G. The paper provided a thorough discussion of implementing multi-beam antennas in 5G settings.

6 Discussions

In the previous section, we presented many state-of-the-art MAC protocols for wireless mesh networks with multi-beam antennas. Here we discuss some issues that are closely related to these MAC protocols.

6.1 Problems with Current MAC Protocols

1. Staleness of Beam Information. The beamforming information in a wireless mesh network needs to be obtained in advance and recorded in Look-up Tables (LUTs). However, because of node mobility in wireless mesh networks, the beamforming information stored in a LUT becomes stale if the gap between the cached and the actual beamforming information is larger than the beamwidth. The results obtained in paper [15] demonstrate that the performance of multiple beam antennas largely depends on the network topology. This problem cannot be solved unless the beamforming information in the mesh topology is collected on a per-packet basis. In the case of handoff in a WMN, when a node moves from one subnet covered by a transmitter's beam to another subnet covered by another beam, all packets outbound for this node need to be redirected to the new subnet. If the previous transmitter tries to transmit a packet to the moving node, the transmission fails, and the packets addressed to this node in the current subnet must be redirected to the new subnet. It is also important to detect these transmission failures at the MAC layer before they are reported to the network layer. This handoff issue may open up a new area of research in wireless mesh networks.

2. Neighbor Node Discovery. To communicate, a node should discover its neighboring nodes before data transmission. The neighbor discovery process does not lie within the domain of the MAC protocol; however, it has a great impact on the operation of the MAC layer, especially in wireless mesh networks. In a WMN, the nodes should not only discover which nodes are within their communication range, but also identify the beamforming information of these neighbor nodes. The beamforming information is usually determined based on either AoA estimation or the relative position of the nodes. Traditional research usually relies on channel contention resolution mechanisms, such as CSMA/CA in a distributed environment, to find the neighboring nodes. In current larger-scale WMNs, however, this issue becomes very evident and complex, and neighbor discovery may even fall into deadlock. The neighbor's address is provided by the network layer. Location-based beamforming usually requires additional hardware such as GPS and also implicitly assumes that a Line-of-Sight (LoS) path exists between the nodes. This assumption may not hold in a multipath, multi-hop environment with switched-beam antennas. It is therefore better to use AoA estimation to find neighboring nodes in a WMN.

3. Multipath. Multipath, which is very common in WMNs, occurs when multiple copies of the same signal are received by the receiver node from different directions of the multi-beam antenna. Multipath is primarily caused by reflection from terrestrial objects, such as buildings, and is thus very pronounced in urban areas. The multipath problem may result in a node activating several beams of the multi-beam antenna, thereby degrading the network performance. One possible solution is to install multiple beam smart antennas on Mesh Routers (MRs) or Internet Gateways (IGWs) at the top of buildings; in this case, multipath may not affect the performance drastically.

6.2 Prediction

1. Heterogeneous Antennas. In the previous sections, the analysis of most of the proposed MAC protocols is based on homogeneous antennas, that is, antennas with the same characteristics. The antenna homogeneity considered includes the antenna type, the number of beams, the radiation pattern, and sometimes the beamforming reference direction. However, in a large-scale WMN, there may exist different kinds of antennas; that is, communication nodes within the same WMN may be equipped with heterogeneous antennas. Heterogeneity-aware MAC protocols therefore need to be carefully designed to fit large-scale wireless mesh networks.

2. Fairness. One of the important characteristics of a MAC protocol is to provide fair channel access among the competing nodes in a WMN. A protocol should not only focus on optimizing performance metrics such as throughput and delay, but also consider the fairness among the competing nodes. One of the goals of MAC protocols is to enhance spatial reuse; as this goal is pursued further, these MAC protocols usually result in unfair medium access. Achieving fairness in wireless mesh networks with multi-beam antennas is therefore a very challenging task that needs further exploration.

3. QoS-aware Protocols. With the pressing need for real-time services and content-rich multimedia applications, Quality of Service (QoS) has become a vital component of wireless mesh networks. Most existing QoS-aware MAC protocols are limited to single-hop wireless networks. Although multi-hop wireless networks can improve the network performance to some extent, existing work usually focuses on throughput and delay, and little attention has been devoted to exploring their effectiveness in providing QoS guarantees, especially at the MAC layer. Thus, research needs to focus more on the design of QoS-aware MAC protocols, as well as on both intra-node and inter-node scheduling, in wireless mesh networks.

7 Conclusion

In this paper, we presented a comprehensive survey of MAC protocols for wireless mesh networks using multi-beam antennas. Theoretically, the capacity of a WLAN can be considerably boosted by the use of multi-beam smart antennas. However, if designers directly apply IEEE 802.11 to a WLAN with multi-beam antennas, they will inevitably encounter many challenges. The existing solutions to these challenges are based on DCF and hence are not suitable for multimedia applications. The design of MAC protocols needs to exploit the benefits of multi-beam antennas and overcome the beamforming-related challenges. With these aims in mind, we listed and discussed the main challenges facing MAC protocols in wireless mesh networks with multi-beam antennas. We also introduced the basics of multi-beam antennas, including multi-beam smart antennas and traditional medium access control protocols. Finally, we analyzed several classic MAC protocols for wireless mesh networks using multi-beam antennas.

References

1. Akyildiz, I.F., Wang, X., Wang, W.: Wireless mesh networks: a survey. Comput. Netw. 47(4), 445–487 (2005)
2. Akyildiz, I., Wang, X.: Wireless Mesh Networks, vol. 3. Wiley, Chichester (2009)
3. Wang, J., Fang, Y., Dapeng, W.: Enhancing the performance of medium access control for WLANs with multi-beam access point. IEEE Trans. Wirel. Commun. 6(2), 556–565 (2007)
4. Conti, M., Giordano, S.: Multihop ad hoc networking: the reality. IEEE Commun. Mag. 45(4), 88–95 (2007)
5. Bazan, O., Jaseemuddin, M.: A survey on MAC protocols for wireless adhoc networks with beamforming antennas. IEEE Commun. Surv. Tutor. 14(2), 216–239 (2012)
6. Sundaresan, K., Lakshmanan, S., Sivakumar, R.: On the use of smart antennas in multi-hop wireless networks. In: 2006 3rd International Conference on Broadband Communications, Networks and Systems, BROADNETS 2006. IEEE (2006)
7. Winters, J.H.: Smart antennas for wireless systems. IEEE Pers. Commun. 5(1), 23–27 (1998)
8. Cooper, M., Goldburg, M.: Intelligent antennas: spatial division multiple access. Ann. Rev. Commun. 4, 02–0013 (1996)
9. Bao, L., Garcia-Luna-Aceves, J.J.: Transmission scheduling in ad hoc networks with directional antennas. In: Proceedings of the 8th Annual International Conference on Mobile Computing and Networking. ACM (2002)
10. Chou, Z., Huang, C., Chang, J.: QoS provisioning for wireless LANs with multibeam access point. IEEE Trans. Mob. Comput. 13(9), 2113–2127 (2014)
11. Sundaresan, K., Sivakumar, R.: A unified MAC layer framework for ad-hoc networks with smart antennas. In: Proceedings of the 5th ACM International Symposium on Mobile Ad Hoc Networking and Computing. ACM (2004)
12. Athan, R.: Antenna beamforming and power control for ad hoc networks. In: Mobile Ad Hoc Networking, p. 139 (2004)
13. Li, J., et al.: Capacity of ad hoc wireless networks. In: Proceedings of the 7th Annual International Conference on Mobile Computing and Networking. ACM (2001)
14. Zhang, J.L., Liew, S.C.: Capacity improvement of wireless ad hoc networks with directional antennae. In: 2006 63rd IEEE Vehicular Technology Conference, VTC 2006-Spring, vol. 2. IEEE (2006)
15. Jain, V., Gupta, A., Agrawal, D.P.: On-demand medium access in multihop wireless networks with multiple beam smart antennas. IEEE Trans. Parallel Distrib. Syst. 19(4), 489–502 (2008)
16. Paulraj, A.J., Papadias, C.B.: Space-time processing for wireless communications. IEEE Sig. Process. Mag. 14(6), 49–83 (1997)
17. Li, G., et al.: Opportunities and challenges for mesh networks using directional antennas. In: WiMESH 2005 (2005)
18. Choudhury, R.R., Vaidya, N.H.: Performance of ad hoc routing using directional antennas. Ad Hoc Netw. 3(2), 157–173 (2005)
19. Li, P., Zhang, C., Fang, Y.: Asymptotic connectivity in wireless ad hoc networks using directional antennas. IEEE/ACM Trans. Netw. (TON) 17(4), 1106–1117 (2009)
20. Spyropoulos, A., Raghavendra, C.S.: Energy efficient communications in ad hoc networks using directional antennas. In: Proceedings of the Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies, INFOCOM 2002, vol. 1. IEEE (2002)
21. Noubir, G.: On connectivity in ad hoc networks under jamming using directional antennas and mobility. In: Wired/Wireless Internet Communications, pp. 186–200. Springer, Berlin (2004)
22. Zefreh, M.S., Khadivi, P.: Secure directional routing to prevent relay attack. In: 2008 3rd International Conference on Information and Communication Technologies: From Theory to Applications, ICTTA 2008. IEEE (2008)
23. Malhotra, N., et al.: Location estimation in ad hoc networks with directional antennas. In: 2005 Proceedings of the 25th IEEE International Conference on Distributed Computing Systems, ICDCS 2005. IEEE (2005)
24. Hu, C., Hong, Y., Hou, J.: On mitigating the broadcast storm problem with directional antennas. In: 2003 IEEE International Conference on Communications, ICC 2003, vol. 1. IEEE (2003)
25. IEEE 802.11 Working Group: Wireless LAN medium access control (MAC) and physical layer (PHY) specifications (1997)
26. Gummalla, A.C.V., Limb, J.O.: Wireless medium access control protocols. IEEE Commun. Surv. Tutor. 3(2), 2–15 (2000)
27. Jurdak, R., Lopes, C.V., Baldi, P.: A survey, classification and comparative analysis of medium access control protocols for ad hoc networks. IEEE Commun. Surv. Tutor. 6(1), 2–16 (2004)
28. Kumar, S., Raghavan, V.S., Deng, J.: Medium access control protocols for ad hoc wireless networks: a survey. Ad Hoc Netw. 4(3), 326–358 (2006)
29. Bharghavan, V., et al.: MACAW: a media access protocol for wireless LAN's. ACM SIGCOMM Comput. Commun. Rev. 24(4), 212–225 (1994)
30. Choudhury, R.R., et al.: On designing MAC protocols for wireless networks using directional antennas. IEEE Trans. Mob. Comput. 5(5), 477–491 (2006)
31. Choudhury, R.R., et al.: Using directional antennas for medium access control in ad hoc networks. In: Proceedings of the 8th Annual International Conference on Mobile Computing and Networking. ACM (2002)
32. Gossain, H., et al.: The deafness problems and solutions in wireless ad hoc networks using directional antennas. In: IEEE Global Telecommunications Conference Workshops, GlobeCom Workshops 2004. IEEE (2004)
33. Takata, M., Bandai, M., Watanabe, T.: A MAC protocol with directional antennas for deafness avoidance in ad hoc networks. In: 2007 IEEE Global Telecommunications Conference, GLOBECOM 2007. IEEE (2007)
34. Choudhury, R.R., Vaidya, N.H.: Deafness: a MAC problem in ad hoc networks when using directional antennas. In: 2004 Proceedings of the 12th IEEE International Conference on Network Protocols, ICNP 2004. IEEE (2004)
35. Kolar, V., Tilak, S., Abu-Ghazaleh, N.B.: Avoiding head of line blocking in directional antenna [MAC protocol]. In: 2004 29th Annual IEEE International Conference on Local Computer Networks. IEEE (2004)
36. Bazan, O., Jaseemuddin, M.: An opportunistic directional MAC protocol for multihop wireless networks with switched beam directional antennas. In: 2008 IEEE International Conference on Communications, ICC 2008. IEEE (2008)
37. Stevanovic, I., Skrivervik, A., Mosig, J.R.: Smart antenna systems for mobile communications. Technical report. Ecole Polytechnique Federale De Lausanne (2003)
38. Lovelace, W.M., Townsend, J.K.: Adaptive rate control with chip discrimination in UWB networks. In: 2003 IEEE Conference on Ultra Wideband Systems and Technologies. IEEE (2003)
39. Singh, H., Singh, S.: A MAC protocol based on adaptive beamforming for ad hoc networks. In: 2003 14th IEEE Proceedings on Personal, Indoor and Mobile Radio Communications, PIMRC 2003, vol. 2. IEEE (2003)
40. Zhang, Z.: Pure directional transmission and reception algorithms in wireless ad hoc networks with directional antennas. In: 2005 IEEE International Conference on Communications, ICC 2005, vol. 5. IEEE (2005)
41. Wang, J., Fang, Y., Wu, D.: SYN-DMAC: a directional MAC protocol for ad hoc networks with synchronization. In: 2005 IEEE Military Communications Conference, MILCOM 2005. IEEE (2005)
42. Chang, J.-J., Liao, W., Hou, T.-C.: Reservation-based directional medium access control (RDMAC) protocol for multi-hop wireless networks with directional antennas. In: 2009 IEEE International Conference on Communications, ICC 2009. IEEE (2009)
43. Wang, J., et al.: Directional medium access control for ad hoc networks. Wirel. Netw. 15(8), 1059–1073 (2009)
44. Lal, D., et al.: Performance evaluation of medium access control for multiple-beam antenna nodes in a wireless LAN. IEEE Trans. Parallel Distrib. Syst. 15(12), 1117–1129 (2004)
45. Sundaresan, K., et al.: Medium access control in ad hoc networks with MIMO links: optimization considerations and algorithms. IEEE Trans. Mob. Comput. 3(4), 350–365 (2004)
46. Lal, D., et al.: A novel MAC layer protocol for space division multiple access in wireless ad hoc networks. In: 2002 Proceedings of the Eleventh International Conference on Computer Communications and Networks. IEEE (2002)
47. Choi, Y.-S., Alamouti, S., Tarokh, V.: Complementary beamforming: new approaches. IEEE Trans. Commun. 54(1), 41–50 (2006)
48. Wang, G., Xiao, P., Li, W.: A novel MAC protocol for wireless network using multi-beam directional antennas. In: 2017 International Conference on Computing, Networking and Communications (ICNC). IEEE (2017)
49. Biomo, J.-D.M.M., Kunz, T., St-Hilaire, M.: Exploiting multiple beam antennas for end-to-end delay reduction in ad hoc networks. In: Ad Hoc Networks, pp. 143–155. Springer, Cham (2018)
50. Biomo, J.-D.M.M., Kunz, T., St-Hilaire, M.: Exploiting multi-beam antennas for end-to-end delay reduction in ad hoc networks. Mob. Netw. Appl. 23(5), 1293–1305 (2018)
51. Kuperman, G., et al.: Uncoordinated MAC for adaptive multi-beam directional networks: analysis and evaluation. In: 2016 25th International Conference on Computer Communication and Networks (ICCCN). IEEE (2016)
52. Babich, F., Comisso, M., Crismani, A., Dorni, A.: On the design of MAC protocols for multi-packet communication in IEEE 802.11 heterogeneous networks using adaptive antenna arrays. IEEE Trans. Mob. Comput. 14(11), 2332–2348 (2015)
53. Proulx, B., Madiedo, J., Jones, N.M., Kuperman, G.: Topology control in aerial multi-beam directional networks. In: 2018 IEEE Aerospace Conference. IEEE (2018)
54. Hong, W., Jiang, Z.H., Yu, C., Zhou, J., Chen, P., Yu, Z., Zhang, H., et al.: Multibeam antenna technologies for 5G wireless communications. IEEE Trans. Antennas Prop. 65(12), 6231–6249 (2017)

Accurate Attitude Estimation for Drones in 5G Drone Small Cells

Vahid Vahidi
Hanover College, Hanover, Indiana 47243, USA
[email protected]

Abstract. In this paper, a new attitude estimation procedure for drones in 5G drone small cells (DSCs) is described that uses an array of antennas placed on the drone body. The proposed method utilizes a fractal structure for the locations of the receiver antennas. By applying the least square (LS) procedure to the received signals, the proposed method estimates the drone attitude more accurately than the state-of-the-art attitude estimation method, which exploits a hexagonal antenna placement pattern. To improve the accuracy further, the initial estimated angles can be refined in a second phase by implementing the multiple signal classification (MUSIC) algorithm in two dimensions. The two-phase method is called fractal structure array (FSA). The simulation results indicate that employing the second phase enhances the accuracy of the attitude estimation considerably. In addition, a larger fractal structure using more receiver antennas can be applied to improve the performance of the proposed estimation method even further. The computational complexities of the methods are also compared, and it is concluded that, even with the addition of the second phase, the computational complexity of the FSA method is lower than that of the other methods.

Keywords: Drone · Attitude estimation · Fifth generation (5G) · Least square (LS) · Multiple signal classification (MUSIC) · Fractal structure

1 Introduction

Evolving applications for high-speed wireless communications, such as tangible Internet, remote sensing, virtual reality, real-time control, and road safety, have prompted the development of fifth generation (5G) cellular networks. To accomplish the enhancements required of 5G communication systems, the utilization of novel technologies is crucial. A promising technique is to employ drone small cells (DSCs) to increase the service coverage for mobile users [1]. Precise attitude estimation of drones is vital for their control and displacement. Error in measuring the attitude results in more power consumption and can cause accidents. In addition, the carrier frequency for 5G communication systems is expected to be between 27.5 and 71 GHz in the United States [2]; antennas are highly directive at those frequencies, and their gains degrade rapidly even for a small deviation from the peak of the radiation pattern. As a result, even a small error in attitude or angle of

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 143–153, 2020. https://doi.org/10.1007/978-3-030-12388-8_10


arrival (AOA) estimation causes considerable degradation in signal to noise ratio (SNR) and, consequently, in the performance of DSC communication systems. Proprioceptive sensors such as inertial measurement units (IMUs) are therefore not suitable because of their high inertial guidance error [3]; instead, antenna arrays located on the drones can be employed for attitude estimation.

Several papers have studied drone attitude estimation using arrays of antennas deployed on the drones [4–6]. In [4, 5], a 4-element cross-shaped antenna array was employed for estimating attitude. In that cross structure, two pairs of antennas are placed at both ends of the body and wings. In [6], which estimates the attitude more accurately than [4, 5], a hexagon-shaped seven-element electronically steerable parasitic antenna radiator (ESPAR) array was utilized to estimate the pitch and roll angles by utilizing the GPS position of the drone. In the hexagon-shaped array, one antenna is located at the center of the drone while the other six antennas are located symmetrically on the wings and body. In all the mentioned papers, the carrier frequency was assumed to be 30 MHz; therefore, the size of the drone is on the order of the wavelength, and antennas can only be placed at specific locations in order to prevent mutual coupling between them. However, because of the high carrier frequency of 5G systems, the wavelength is small and all the array elements can be located at the center of the drone. More effective array structures can therefore be developed for estimating the attitude more accurately, which is crucial for 5G communication systems.

In this paper, a new attitude estimation procedure is proposed. In this scheme, the receiver antennas are positioned according to a fractal structure on the drone's body. The proposed method applies the least square (LS) estimation method to estimate the spatial frequencies of the phase shifts from the signals received at the array of antennas. Afterwards, the estimated spatial frequencies are utilized for estimating the pitch and roll angles of the drone. The simulation results indicate that the proposed method estimates those angles more accurately and with lower computational complexity than the method of [6]. To increase the accuracy of attitude estimation and make it more applicable to 5G communication systems, another step is added: the multiple signal classification (MUSIC) scheme is applied to search for more accurate angles in the neighborhood of the angles estimated in the previous step. The new two-phase attitude estimation method is called fractal structure array (FSA). The proposed antenna placement scheme plays the major role in the performance of the FSA scheme. Since 5G communication systems are expected to employ massive multiple input multiple output (MIMO) [2], a larger number of antennas can be employed for attitude estimation. The simulation results indicate that by increasing the size of the fractal structure, the drone attitude is estimated more accurately.

The remainder of this paper is organized as follows. Section 2 describes the problem. Our proposed antenna placement structure is discussed in Sect. 3, and the attitude estimation methodology based on that structure is described in Sect. 4. The simulation results are presented in Sect. 5, and Sect. 6 concludes the paper.


2 Problem Statement

The scenario considered in this paper is as follows. An elevated base station with line of sight (LOS) to the drones is located at the center of the communication cell; therefore, the communication channel between the base station and the drones can be modeled by a one-tap delay line. Drones can fly with arbitrary pitch, roll and yaw angles. The problem is to estimate the pitch and roll angles from the signals that were transmitted by the base station and received by the array of antennas. In this paper, similar to [6], it is assumed that yaw is known a priori since it is estimated by a different method before estimating the pitch and roll angles [7]. Denoting the number of receiver antennas by $M$ and the number of transmitted symbols by $N$, the received signal at the array of antennas is described as:

$$X = A S^{T} + Z, \tag{1}$$

where $X$ is the $M \times N$ matrix of the received symbols, $S$ is the $N \times 1$ vector of the transmitted symbols, $A$ is the $M \times 1$ array steering vector, which depends on the receiver antenna structure and the pitch $\varphi$, roll $\theta$ and yaw $\psi$ angles and is defined in Sect. 4, and $Z$ is the $M \times N$ matrix of the additive noise. The goal in this paper is to estimate the angles $\varphi$ and $\theta$.
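The dimensions in (1) can be made concrete with a minimal sketch. It treats $A$ as an $M \times 1$ column so that $A S^{T}$ is $M \times N$; the phase delays, the QPSK symbols, the noise level and the function name `received_signal` are illustrative assumptions, not values from the paper.

```python
import cmath
import random

def received_signal(steering, symbols, noise_std, rng):
    """Return X = A S^T + Z as a list of M rows, each holding N complex samples."""
    rows = []
    for a_m in steering:                      # one row per receiver antenna
        row = []
        for s_n in symbols:                   # one column per transmitted symbol
            z = complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
            row.append(a_m * s_n + z)
        rows.append(row)
    return rows

rng = random.Random(0)
M, N = 6, 10
# Hypothetical per-antenna phase delays in radians (not taken from the paper).
phases = [0.0, 0.3, 0.6, -0.2, 0.1, -0.4]
A = [cmath.exp(1j * p) for p in phases]       # M x 1 steering vector
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
S = [rng.choice(qpsk) / abs(1 + 1j) for _ in range(N)]   # N x 1, unit power
X = received_signal(A, S, noise_std=0.05, rng=rng)
print(len(X), len(X[0]))                      # number of rows and columns of X
```

Each row of `X` corresponds to one antenna and each column to one transmitted symbol, which is the layout assumed by the correlation step in Sect. 4.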

3 Antenna Placement

The attitude estimation performance highly depends on the antenna placement. For example, [6] uses a hexagon-shaped seven-element array (HSSEA) structure, which is a modification of the 4-element cross structure used in [4] and [5]. The distance from the center antenna to each of its surrounding antennas is $\lambda/4$, where $\lambda$ is the wavelength. The equations are derived from the geometry of the antenna placement, and the $\varphi$ and $\theta$ angles are estimated using the 3-D unitary ESPRIT algorithm [6]. Since the small wavelength of the transmitted signals in 5G communications provides more flexibility in choosing antenna placement structures, in this paper we propose a new structure for antenna placement in order to estimate the $\varphi$ and $\theta$ angles more accurately. The fractal structure with one repetition for locating the receiver antennas is depicted in Fig. 1. Similar to [6], the distance between each antenna and its adjacent antenna is $\lambda/4$. The spatial frequencies along the axes are indicated in Fig. 1 and are defined as:

$$\mu \triangleq \varphi_2 - \varphi_1, \tag{2}$$

$$\vartheta \triangleq \varphi_4 - \varphi_1, \tag{3}$$

where $\varphi_i$, $i = 1, 2, 4$, indicates the phase delay of antenna $S_i$. The number of antennas that are considered for massive MIMO is more than 6 [2]; therefore, more repetitions of the structure in Fig. 1 can be deployed for the antenna locations. The second repetition of the structure is depicted in Fig. 2.


Fig. 1. Fractal structure with one repetition


Fig. 2. Fractal structure with two repetitions

4 Attitude Estimation Procedure

4.1 Basic Methodology

Attitude estimation is performed in two steps: LS and MUSIC.

Phase 1, LS

Sensor 1 is assumed to be the reference antenna; therefore, the following relations between the received symbols at the sensors are obtained:

$$[E_2, E_3, E_5]^{T} = e^{j\mu}\,[E_1, E_2, E_4]^{T}, \tag{4}$$

$$[E_4, E_5, E_6]^{T} = e^{j\vartheta}\,[E_1, E_2, E_4]^{T}, \tag{5}$$

where $E_i$, $i = 1, 2, \ldots, 6$, is the correlation of the received symbols at the $i$th receiver antenna, which is obtained as:

$$E_i = \frac{1}{N} X_i^{T} X_i^{*}, \tag{6}$$


where $X_i$ is the received symbol at the $i$th receiver antenna. By applying the LS method to (4) and (5), $\mu$ and $\vartheta$ are estimated. Afterwards, the pitch and roll angles are estimated based on the concepts that are defined in [6], and are denoted by $\hat{\varphi}_1$ and $\hat{\theta}_1$, respectively. Assuming that the position of antenna 2 is known a priori from GPS as $P_2 = [x_2, y_2, z_2]$, the positions of antenna 3 and antenna 6 are obtained as:

$$P_3 = P_2 + P'_3, \tag{7}$$

$$P_6 = P_2 + P'_6, \tag{8}$$

where $P'_3$ and $P'_6$ are obtained as:

$$P'_3 = \frac{\lambda}{4}\, R_z(\psi) R_x(\varphi) \begin{bmatrix} \cos(\theta) \\ 0 \\ \sin(\theta) \end{bmatrix}, \tag{9}$$

$$P'_6 = \sqrt{3}\,\frac{\lambda}{4}\, R_z(\psi) \begin{bmatrix} 0 \\ \cos[\theta] \\ \sin[\theta] \end{bmatrix}, \tag{10}$$

where $\psi$ is the yaw angle,

$$R_x(\varphi) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(\varphi) & -\sin(\varphi) \\ 0 & \sin(\varphi) & \cos(\varphi) \end{bmatrix} \quad \text{and} \quad R_z(\psi) = \begin{bmatrix} \cos(\psi) & -\sin(\psi) & 0 \\ \sin(\psi) & \cos(\psi) & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Using the phase differences, the following equations are obtained:

$$\|P_3\| - \|P_2\| = \frac{\lambda}{2\pi}\,\mu, \qquad \|P_6\| - \|P_2\| = \frac{\lambda}{2\pi}\,(2\vartheta - \mu). \tag{11}$$

By considering (11)–(13), $\hat{\varphi}_1$ is estimated as:

$$\hat{\varphi}_1 = \sin^{-1}\!\left( \frac{\frac{\lambda}{\pi}(2\vartheta - \mu)\|P_2\| + \frac{\lambda^2}{4\pi^2}(2\vartheta - \mu)^2 - \frac{\lambda^2}{16}}{\frac{\lambda}{2}\sqrt{\left(x_2\cos(\psi) - y_2\sin(\psi)\right)^2 + z_2^2}} \right) - \alpha, \tag{12}$$

where $\alpha = \tan^{-1}\!\left( \dfrac{x_2\cos(\psi) - y_2\sin(\psi)}{z_2} \right)$. By the assumption that the drones in DSCs do not fly upside down, i.e., $\theta \in [-90^\circ, 90^\circ]$, $\hat{\theta}_1$ is calculated by:

$$\hat{\theta}_1 = \sin^{-1}\!\left( \frac{\frac{\lambda}{\pi}\,\mu\,\|P_2\| + \frac{\lambda^2}{4\pi^2}(2\vartheta - \mu)^2 - \frac{\lambda^2}{16}}{\frac{\lambda}{4}\sqrt{A + B}} \right) - \beta, \tag{13}$$

where $\beta = \tan^{-1}\!\left( \dfrac{x_2\cos(\psi) + y_2\sin(\psi)}{x_2\sin(\psi)\sin(\hat{\varphi}_1) - y_2\cos(\psi)\sin(\hat{\varphi}_1) + z_2\cos(\psi)} \right)$, $A = \left(x_2\cos(\psi) + y_2\sin(\psi)\right)^2$ and $B = \left(x_2\sin(\psi)\sin(\hat{\varphi}_1) - y_2\cos(\psi)\sin(\hat{\varphi}_1) + z_2\cos(\psi)\right)^2$.
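The LS step applied to (4) and (5) can be illustrated with a minimal sketch. It assumes synthetic correlation values $E_1, \ldots, E_6$ whose phases follow the layout implied by (4) and (5), namely $0, \mu, 2\mu, \vartheta, \mu + \vartheta, 2\vartheta$ for antennas 1 to 6. The helper names `ls_phase` and `synthetic_correlations`, the noise level and the test values are hypothetical choices made for illustration only.

```python
import cmath
import random

def ls_phase(ref, shifted):
    """Least-squares phase w such that shifted ~= exp(j*w) * ref.

    Minimising ||shifted - c * ref||^2 over complex c gives
    c = (ref^H shifted) / (ref^H ref); the phase of c is the estimate.
    """
    num = sum(r.conjugate() * s for r, s in zip(ref, shifted))
    den = sum(abs(r) ** 2 for r in ref)
    return cmath.phase(num / den)

def synthetic_correlations(mu, vartheta, noise_std, rng):
    """Hypothetical E_1..E_6 with the phase layout implied by (4) and (5)."""
    phases = [0.0, mu, 2 * mu, vartheta, mu + vartheta, 2 * vartheta]
    return [cmath.exp(1j * p)
            + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            for p in phases]

rng = random.Random(1)
E1, E2, E3, E4, E5, E6 = synthetic_correlations(0.4, -0.7, 0.01, rng)
mu_hat = ls_phase([E1, E2, E4], [E2, E3, E5])        # relation (4)
vartheta_hat = ls_phase([E1, E2, E4], [E4, E5, E6])  # relation (5)
print(mu_hat, vartheta_hat)
```

Because each relation has a single complex unknown, the least-squares solution reduces to one inner product, which is why this step is cheap compared with the matrix operations of the later phase.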

Phase 2, MUSIC

At this stage, the MUSIC method utilizes $\hat{\varphi}_1$ and $\hat{\theta}_1$ to obtain a more refined estimate of the pitch and roll angles. We considered the following points for the implementation of the MUSIC method:

• Instead of searching the whole space for pitch and roll angles, only the neighboring values of $\hat{\varphi}_1$ and $\hat{\theta}_1$ are considered; therefore, the computational complexity decreases considerably.
• Since two angles should be estimated by the MUSIC method, 1-D MUSIC does not result in a single pair of estimated angles. On the other hand, as the antennas are located on triangles and not on a uniform rectangular array (URA), the 2-D MUSIC that is proposed in [8] would not be practical. As a result, we propose a new implementation of MUSIC that employs the received signals from the antennas located on the vertices of the outer triangle.
• For the estimation of $\mu$ and $\vartheta$, if the exact values of those parameters are small, the estimation error becomes smaller and $\hat{\varphi}_1$ and $\hat{\theta}_1$ are estimated more accurately. As a result, a smaller distance between the antennas yields more accurate angle estimates in the LS phase; however, too small a distance causes mutual coupling between the antennas, and therefore the distance between the sensors is set to $\lambda/4$, similar to [6]. On the other hand, for the MUSIC method, a larger distance between the antennas results in better accuracy for angle estimation, since the variation of the received signal's phase for a given variation in the angle is larger and therefore the angles are estimated more precisely. In conclusion, we employ the received signals of the antennas that are located on the vertices of the outer triangle for the MUSIC procedure.

Each pair of antennas on the vertices forms an antenna array, and the received signal at each array is defined as:

$$X_{q,r} = a_{q,r}(\varphi, \theta)\, S^{T} + Z_{q,r}, \quad (q, r) \in \{(1,3), (1,6), (3,6)\}, \tag{14}$$

where $X_{q,r}$ is the $2 \times N$ matrix of the received symbols, $Z_{q,r}$ is the $2 \times N$ matrix of the additive noise at the $q$th and $r$th sensors, and $a_{q,r}(\varphi, \theta)$ is the $2 \times 1$ array steering vector, which is equal to $[1, e^{j2\mu}]^{T}$, $[1, e^{j2\vartheta}]^{T}$ and $[e^{j2\mu}, e^{j2\vartheta}]^{T}$ for $(q, r)$ equal to (1,3), (1,6) and (3,6), respectively. Now, we define the MUSIC pseudospectrum as:

$$P_{MU}(\varphi, \theta) = \prod_{(q,r) \in \{(1,3),(1,6),(3,6)\}} \frac{1}{a_{q,r}^{H}(\varphi, \theta)\, V_{q,r} V_{q,r}^{H}\, a_{q,r}(\varphi, \theta)}, \tag{15}$$

where $V_{q,r}$ is the $2 \times 1$ least dominant eigenvector of the correlation matrix of the received symbols in the $(q, r)$th array. The $a_{q,r}(\varphi, \theta)$ vectors can be written in terms of the values of $\varphi$ and $\theta$ that are in the neighborhood of $\hat{\varphi}_1$ and $\hat{\theta}_1$. The $(\varphi, \theta)$ pair that maximizes $P_{MU}(\varphi, \theta)$ defines the estimated pitch and roll angles and is indicated by $(\hat{\varphi}_2, \hat{\theta}_2)$.
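The structure of the second phase can be sketched as follows. This is not the paper's implementation: for simplicity the local grid search runs over the spatial frequencies $(\mu, \vartheta)$ rather than over $(\varphi, \theta)$, because the mapping from angles to spatial frequencies is not reproduced here. The function names, the noise level, the grid width and the coarse starting point are all assumptions made for illustration.

```python
import cmath
import math
import random

def noise_eigenvector(x1, x2):
    """Least dominant eigenvector of the 2x2 sample correlation of two antennas."""
    n = len(x1)
    a = sum(abs(v) ** 2 for v in x1) / n                     # R[0][0]
    d = sum(abs(v) ** 2 for v in x2) / n                     # R[1][1]
    b = sum(u * v.conjugate() for u, v in zip(x1, x2)) / n   # R[0][1]
    lam_min = (a + d) / 2 - math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    if abs(b) < 1e-15:                                       # diagonal matrix
        return (1.0, 0.0) if a <= d else (0.0, 1.0)
    v = (b, lam_min - a)                                     # solves (R - lam_min I) v = 0
    norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
    return (v[0] / norm, v[1] / norm)

def steering(pair, mu, vartheta):
    """2x1 steering vectors of the outer-triangle pairs listed after Eq. (14)."""
    e_mu, e_vt = cmath.exp(2j * mu), cmath.exp(2j * vartheta)
    return {(1, 3): (1 + 0j, e_mu), (1, 6): (1 + 0j, e_vt), (3, 6): (e_mu, e_vt)}[pair]

def pseudospectrum(noise_vecs, mu, vartheta, eps=1e-12):
    """Product over the three pairs of 1 / |a^H v|^2, in the spirit of Eq. (15)."""
    p = 1.0
    for pair, v in noise_vecs.items():
        a = steering(pair, mu, vartheta)
        proj = a[0].conjugate() * v[0] + a[1].conjugate() * v[1]
        p *= 1.0 / (abs(proj) ** 2 + eps)
    return p

def refine(noise_vecs, mu0, vt0, half_width=0.1, step=0.005):
    """Grid search of the pseudospectrum in a neighbourhood of the coarse estimate."""
    k = int(round(half_width / step))
    best = None
    for i in range(-k, k + 1):
        for j in range(-k, k + 1):
            mu, vt = mu0 + i * step, vt0 + j * step
            p = pseudospectrum(noise_vecs, mu, vt)
            if best is None or p > best[0]:
                best = (p, mu, vt)
    return best[1], best[2]

rng = random.Random(2)
mu_true, vt_true, n_sym = 0.30, -0.50, 50
phase = {1: 0.0, 3: 2 * mu_true, 6: 2 * vt_true}             # vertex antennas only
sym = [cmath.exp(1j * rng.uniform(0, 2 * math.pi)) for _ in range(n_sym)]
rx = {m: [cmath.exp(1j * ph) * s + complex(rng.gauss(0, 0.02), rng.gauss(0, 0.02))
          for s in sym] for m, ph in phase.items()}
noise_vecs = {(q, r): noise_eigenvector(rx[q], rx[r]) for q, r in [(1, 3), (1, 6), (3, 6)]}
# The coarse values below stand in for the output of phase 1.
mu_hat, vt_hat = refine(noise_vecs, mu0=0.33, vt0=-0.46)
print(mu_hat, vt_hat)
```

Restricting the grid to a small neighbourhood keeps the number of pseudospectrum evaluations low, and it also sidesteps the phase ambiguity of a two-element array, since only one peak falls inside the search window in this example.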

4.2 Higher Repetitions for the Fractal Structure

In the structure of Fig. 2, the number of equations that can be defined for the LS step is considerably higher than in the structure of Fig. 1. By considering three sets of antennas, $S_1 = \{1, 2, 3, 6, 7, 10\}$, $S_2 = \{3, 4, 5, 8, 9, 12\}$ and $S_3 = \{10, 11, 12, 13, 14, 15\}$, $S_1$ can be mapped to $S_2$ and $S_3$ by $2\mu$ and $2\vartheta$ phase shifts, respectively. As a result, the equations that can be defined for the estimation of $\mu$ are obtained from the following four sets of equations:

$$
\begin{cases}
[E_2^2, E_3^2, E_7^2]^{T} = e^{j2\mu}\,[E_1^2, E_2^2, E_6^2]^{T} \\
[E_4^2, E_5^2, E_9^2]^{T} = e^{j2\mu}\,[E_3^2, E_4^2, E_8^2]^{T} \\
[E_{11}^2, E_{12}^2, E_{14}^2]^{T} = e^{j2\mu}\,[E_{10}^2, E_{11}^2, E_{13}^2]^{T} \\
[E_3^2, E_4^2, E_5^2, E_8^2, E_9^2, E_{12}^2]^{T} = e^{j2\mu}\,[E_1^2, E_2^2, E_3^2, E_6^2, E_7^2, E_{10}^2]^{T}
\end{cases} \tag{16}
$$
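The index sets $S_1$, $S_2$ and $S_3$ can be checked against a triangular lattice. The sketch below assumes that the 15 antennas of Fig. 2 are numbered left to right, bottom row first (1 to 5, 6 to 9, 10 to 12, 13 to 14, 15), with the column index counting steps along the $\mu$ axis and the row index counting steps along the $\vartheta$ axis. This numbering is inferred from the sets themselves and is not stated explicitly in the text; the function names are hypothetical.

```python
def lattice_coordinates(rows=5):
    """Map antenna number -> (column, row) on a triangular lattice.

    Assumed numbering: left to right, bottom row first.
    """
    coords, number = {}, 1
    for row in range(rows):
        for col in range(rows - row):
            coords[number] = (col, row)
            number += 1
    return coords

def shift_between(coords, source, target):
    """Return the common (d_col, d_row) offset mapping source onto target, else None."""
    src = sorted(coords[n] for n in source)
    dst = sorted(coords[n] for n in target)
    offsets = {(d[0] - s[0], d[1] - s[1]) for s, d in zip(src, dst)}
    return offsets.pop() if len(offsets) == 1 else None

coords = lattice_coordinates()
S1, S2, S3 = {1, 2, 3, 6, 7, 10}, {3, 4, 5, 8, 9, 12}, {10, 11, 12, 13, 14, 15}
print(shift_between(coords, S1, S2))   # (2, 0): two steps along the mu axis
print(shift_between(coords, S1, S3))   # (0, 2): two steps along the vartheta axis
```

Under this assumed numbering, the same helper with `rows=3` reproduces the one-step offsets between the antenna triples used in (4) and (5).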

A similar procedure can be applied for estimating $\vartheta$. This increase in the number of equations compared to (4) or (5) enhances the performance of the LS procedure considerably. In addition, in the two-repetition structure, phase 2 of the attitude estimation procedure (MUSIC) performs more accurately than with the structure in Fig. 1, since the distance between the vertices has been doubled and a small variation in the $(\varphi, \theta)$ pair causes a larger change in the pseudospectrum. By employing more repetitions of the fractal structure, a more accurate attitude estimation procedure would be achieved.

4.3 Computational Complexity of the Attitude Estimation Methods

Each attitude estimation scheme consists of several steps; therefore, the computational complexity of each method is the sum of the complexities of those steps. Tables 1 and 2 summarize the steps and the computational complexity of the HSSEA (method of [6]) and FSA methods, respectively. In those tables, the computational complexity of the angle estimation from the estimated spatial frequencies, Eqs. (12) and (13), is discarded, since each consists of one function evaluation and their complexity is negligible compared to the other steps, which deal with matrices. The reader is referred to [6] for the description of the steps of the HSSEA method.

Table 1. Computational complexity of the steps of the HSSEA method

| Step | Computational complexity |
| --- | --- |
| Construction of centro-Hermitian matrix | $O(4M^2N + 4MN^2)$ |
| Calculating the autocorrelation of centro-Hermitian matrix | $O(N^2)$ |
| Eigenvalue decomposition (EVD) of the autocorrelation matrix | $O(6M^3 + 12M^2)$ |
| Total | $O(6M^3 + 4M^2N + (4M + 1)N^2 + 12M^2)$ |

Table 2. Computational complexity of the steps of the FSA method

| Step | Computational complexity |
| --- | --- |
| Least square, (4) & (5) | $O(2M_p^3)$ |
| Calculating the autocorrelation matrix, (6) | $O(N^2)$ |
| MUSIC, (15) | $O(M_a^2(P + 3N))$ |
| Total | $O(2M_p^3 + N^2 + M_a^2(P + 3N))$ |

The numerical values of the parameters in Tables 1 and 2 are as follows. As can be seen in Fig. 1, $M = 7$. By considering Eqs. (4) and (16), $M_p$, which defines the size of the LS procedure, is 3 and 15 for the structures of Figs. 1 and 2, respectively. The number of antennas in each array pair in the MUSIC phase is denoted by $M_a$, which is set to 2 for the structures of both Figs. 1 and 2. The number of $(\varphi, \theta)$ pairs for which the MUSIC method calculates the pseudospectrum is denoted by $P$, and it is set to $10 \times 10 = 100$ in this paper since, after the estimation of $(\hat{\varphi}_1, \hat{\theta}_1)$, $\pm 5$ degrees in the neighborhood of the estimated angles are searched in Eq. (15). If only the MUSIC method is applied for attitude estimation, the computational complexity is obtained as $O(M_a^2(P_m + 3N))$, where $P_m$ scans all possible values of $(\varphi, \theta)$ and is therefore equal to $180 \times 180 = 32400$.

5 Simulation Results

In this section, the computational complexity and the attitude estimation accuracy of the FSA method are compared with those of the HSSEA method. By considering the computational complexities that are defined in Tables 1 and 2, the ratio

$$R = \frac{\text{number of functions for the attitude estimation method}}{\text{number of functions for the HSSEA method}}$$

versus the number of transmissions ($N$) is depicted in Fig. 3 for the FSA and MUSIC methods. For the sake of brevity, the FSA methods that utilize one repetition and two repetitions of the fractal structure are called FSA-S1 and FSA-S2, respectively. As indicated in Fig. 3, the number of functions for running the FSA-S1 method is smaller than that of the HSSEA method for any $N$, and the number of functions for running the FSA-S2 method is smaller than that of the HSSEA method for $N > 10$. The number of functions for running the MUSIC method is considerably larger than those of the FSA-S1 and FSA-S2 methods for any $N$, and it is larger than that of the HSSEA method for $N < 63$.


Fig. 3. R vs. N for proposed and MUSIC methods
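The ratio $R$ can be evaluated directly from the totals in Tables 1 and 2. The sketch below treats the expressions inside the $O(\cdot)$ terms as exact operation counts, which is an assumption about how the figure was produced, and uses the parameter values quoted in Sect. 4.3 ($M = 7$, $M_p \in \{3, 15\}$, $M_a = 2$, $P = 100$, $P_m = 32400$). The function names are hypothetical.

```python
def hssea_ops(M, N):
    """Total of Table 1: 6M^3 + 4M^2 N + (4M + 1) N^2 + 12 M^2."""
    return 6 * M**3 + 4 * M**2 * N + (4 * M + 1) * N**2 + 12 * M**2

def fsa_ops(Mp, N, Ma=2, P=100):
    """Total of Table 2: 2 Mp^3 + N^2 + Ma^2 (P + 3N)."""
    return 2 * Mp**3 + N**2 + Ma**2 * (P + 3 * N)

def music_only_ops(N, Ma=2, Pm=180 * 180):
    """MUSIC over the full angle grid: Ma^2 (Pm + 3N)."""
    return Ma**2 * (Pm + 3 * N)

def ratio(method_ops, N, M=7):
    """R = operations of the method divided by operations of HSSEA."""
    return method_ops / hssea_ops(M, N)

for N in (5, 10, 63, 64):
    print(N,
          round(ratio(fsa_ops(3, N), N), 3),      # FSA-S1
          round(ratio(fsa_ops(15, N), N), 3),     # FSA-S2
          round(ratio(music_only_ops(N), N), 3))  # MUSIC only
```

With these counts, $R$ for FSA-S2 drops below 1 at around $N = 10$ and $R$ for MUSIC-only drops below 1 just after $N = 63$, in line with the crossover points described above.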

The accuracy of the attitude estimation schemes is compared by means of Monte Carlo simulation. The carrier frequency is set to 60 GHz since 7 GHz of bandwidth is available at that center frequency [9]. The number of simulation runs is set to $10^5$, and at each run the yaw angle $\psi$ is generated uniformly in [0°, 360°] and is considered to be known. The pitch $\varphi$ and roll $\theta$ angles are generated randomly in [−90°, 90°], and $N$ is set to 10. The figure of merit is the root mean square error (RMSE) of the estimated pitch and roll angles, which is defined as:

$$\mathrm{RMSE}(\varphi, \theta) = \sqrt{\frac{1}{2L} \sum_{l=1}^{L} \left[ (\hat{\varphi}_l - \varphi_l)^2 + (\hat{\theta}_l - \theta_l)^2 \right]}, \tag{17}$$

where $L$ is the number of Monte Carlo simulations, $(\varphi_l, \theta_l)$ defines the actual value of the angles, and $(\hat{\varphi}_l, \hat{\theta}_l)$ defines the estimated value of the angles.

The $\mathrm{RMSE}(\varphi, \theta)$ of the attitude estimation methods versus signal to noise ratio (SNR) is presented in Fig. 4. In this figure, the RMSE curves for running only the MUSIC method for the fractal structure with one and two repetitions are not shown, since they coincide with the performance of the FSA-S1 and FSA-S2 methods, respectively. On the other hand, LS-S1 is shown in this figure; it represents the performance of the first phase of the FSA method alone, without the MUSIC phase, using one repetition of the fractal structure. As indicated in Fig. 4, the performance of the FSA method is considerably better than that of the HSSEA method, even when only the first phase of the FSA method is applied. Moreover, FSA-S2 performs better than FSA-S1, especially at low and medium SNR, and it is concluded that at very high SNR values it is not essential to employ higher repetitions of the fractal structure.
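The figure of merit in (17) is straightforward to compute. The following is a minimal sketch with made-up angle values in degrees; the function name `rmse` and the sample data are illustrative only.

```python
import math

def rmse(true_angles, est_angles):
    """RMSE over L runs of (pitch, roll) pairs, following Eq. (17)."""
    L = len(true_angles)
    total = 0.0
    for (phi, theta), (phi_hat, theta_hat) in zip(true_angles, est_angles):
        total += (phi_hat - phi) ** 2 + (theta_hat - theta) ** 2
    return math.sqrt(total / (2 * L))

truth = [(10.0, -5.0), (30.0, 20.0)]          # hypothetical true (pitch, roll)
estimates = [(11.0, -5.0), (30.0, 18.0)]      # hypothetical estimates
print(rmse(truth, estimates))                 # sqrt((1 + 4) / 4) = sqrt(1.25)
```

The factor $2L$ averages over both angles and all runs, so the result is a per-angle error in the same unit as the inputs.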


Fig. 4. RMSE vs. SNR for attitude estimation methods

6 Conclusion

A two-step attitude estimation procedure was proposed to address the problem of accurate pitch and roll estimation of drones in 5G DSCs. The proposed method benefits from a fractal structure for the placement of the receiver antennas. The first step of the proposed method applies an LS procedure to estimate the spatial frequencies, from which a first estimate of the angles is obtained. The estimated angles from the first step are then used for a more accurate estimation of the attitude by applying the MUSIC method in the second phase. Simulation results indicated that the FSA method estimates the attitude more accurately and with lower computational complexity than the HSSEA method. In addition, the simulation results show that the number of repetitions of the fractal structure can be increased on demand, especially at low SNR values, in order to improve the accuracy of attitude estimation. In the current work, we assumed a direct line of sight between the base station and the drones. In our next work, we will show how the procedure should be enhanced for ad hoc networks in which there is no base station and the signals are transmitted through a multipath-rich environment.

References

1. Kulali, S., Sabir, E., Taleb, T., Azizi, M.: A green strategic activity scheduling for UAV networks: a sub-modular game perspective. IEEE Commun. Mag. 54(5), 58–64 (2016)
2. 4G Americas' 5G Spectrum Recommendations, August 2015
3. Sitzmann, G.L., Drescher, G.H.: Tactical ballistic missiles trajectory state and error covariance propagation. In: Proceedings of IEEE Position Location and Navigation Symposium, Las Vegas, NV, USA, April 1994
4. da Costa, J.P.C.L., Schwarz, S., de A. Gadêlha, L.F., de Moura, H.C., Borges, G.A., Pinheiro, L.A.R.: Attitude determination for unmanned aerial vehicles via an antenna array. In: Proceedings of ITG IEEE Workshop on Smart Antennas (WSA 2012), Dresden, Germany, March 2012
5. Liu, K., da Costa, J.P.C.L., So, H.C., Gadelha, L.F.A., Borges, G.A.: Improved attitude determination for unmanned aerial vehicles with a cross-shaped antenna array. In: Proceedings of 14th IASTED International Conference on Signal and Image Processing (SIP 2012), Honolulu, Hawaii, USA, pp. 60–67, August 2012
6. Liu, K., da Costa, J.P.C.L., So, H.C., Roemer, F., Haardt, M., Gadelha, L.F.A.: 3-D unitary ESPRIT: accurate attitude estimation for unmanned aerial vehicles with a hexagon-shaped ESPAR array. Digital Signal Process. 23, 701–711 (2013)
7. Lai, Y.C., Jan, S.S.: Attitude estimation based on fusion of gyroscopes and single antenna GPS for small UAVs under the influence of vibration. GPS Solut. 15(1), 67–77 (2010)
8. Zhou, M., Zhang, X., Qiu, X., Wang, C.: Two-dimensional DOA estimation for uniform rectangular array using reduced dimension propagator method. Int. J. Antennas Propag. 2015, 10 (2015). Article ID 485351
9. Dyadyuk, V., et al.: A multi-gigabit millimeter-wave communication system with improved spectral efficiency. IEEE Trans. Microw. Theory Tech. 55(12), 2813–2821 (2007)

Existence of an Optimal Perpetual Gossiping Scheme for Arbitrary Networks

Ivan Avramovic and Dana S. Richards
George Mason University, Fairfax, VA 22030, USA
{iavramo2,richards}@gmu.edu

Abstract. Gossiping is a problem in which a peer-to-peer network must disperse the information held by each machine to all other machines in the minimum number of communication steps. In perpetual gossiping, new information may be introduced to any machine at any time, and the objective is to find a perpetual communication scheme which guarantees that new information will be completely dispersed in optimal time. The basic gossiping problem has a well-known solution, but the perpetual gossiping extension has defied a general solution. Additionally, prior to this paper, it has not been shown that there is even a means to arrive at an optimal solution on a case-by-case basis. Attempts at optimization have thus far taken place in a series of progressive refinements, broadening the scope of network topologies for which optimal or near-optimal solutions are known. This paper proceeds from the opposite direction, by demonstrating an algorithm which is guaranteed to find an optimal perpetual gossiping scheme for an arbitrary graph. The network model is then generalized so as to apply to a broader class of communication schemes.

Keywords: Perpetual gossip · Peer-to-peer networks · Optimization · Network topology

1 Introduction

Gossiping is a well-understood problem [8] in which one is given a network of machines, each containing a unique piece of information. The objective is to disseminate the information to other machines in the network through a sequence of bidirectional communications transferring information between pairs of neighboring machines until all information is known by all machines. The optimal solution (the solution requiring the minimum number of calls) to the gossiping problem is simple and well-known [2,8].

Perpetual gossiping is an extension to gossiping [13]. In perpetual gossiping, the communication scheme (known as a gossip scheme) is assumed to continue indefinitely, and new information can be introduced to any machine in the network at any time. The goal remains to disseminate all information to all machines, although now the performance of a gossip scheme is measured in terms of the window from the moment that new information is introduced to the moment that information is known by all machines. To be considered an optimal perpetual gossip scheme, the scheme's window size must be checked over all possible combinations of starting time and machine at which new information can be introduced.

Perpetual gossiping is an all-to-all communication problem in peer-to-peer networks. While the first association one might have is with social networks, there are other interesting and less obvious applications, involving, for instance, management of parallel data storage systems [9].

Finding optimal perpetual gossiping schemes has proven a challenge, however, and existing papers have approached the problem incrementally, either by expanding the topologies for which optimal solutions are known, or by improving the known bounds on optimality. This paper approaches the problem from the opposite direction, by demonstrating a guaranteed means by which one can arrive at optimality. Furthermore, the paper attempts to broaden the reach of existing gossiping research, by noting that the conventional abstract model used to describe gossiping may be too simplistic to describe a number of practical networks, and attempts to remedy this by presenting a generalized network model to apply to perpetual gossiping problems.

This paper demonstrates that an optimal perpetual gossiping scheme can be found for an arbitrary network G by providing an algorithm which will find one. The algorithm is based on the observation that, knowing the bounds on the size of an optimal scheme, there are a finite number of schemes which need to be searched in order to produce an optimal scheme. This paper also proposes a more general model for a gossiping network, since the basic network model implied by a gossiping graph is a limited one. It does not allow things like broadcast, half-duplex, simultaneous communication, or sources which interfere with one another. The paper therefore extends the optimality result to the general model.

The paper proceeds as follows: a discussion of related work; a presentation of definitions related to gossiping networks and perpetual gossiping schemes, which will clarify the goals of the optimal perpetual gossiping scheme algorithm; a proof of the correctness of the algorithm; the description of a generalized network model; a proof showing that the optimal scheme algorithm extends to the generalized model; an evaluation and discussion of the limitations of the model, followed by concluding statements.

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 154–163, 2020. https://doi.org/10.1007/978-3-030-12388-8_11

2 Related Work

The gossiping problem was first introduced by Boyd as follows [15]: “There are n ladies, and each of them knows some item of scandal which is not known to any of the others. They communicate by telephone, and whenever two ladies make a call, they pass on to each other, as much scandal as they know at that time. How many calls are needed before all ladies know all the scandal?” As originally phrased, the problem assumed a network built on top of a complete graph, and had a solution requiring 2n − 4 calls for n ≥ 4 [1,7,15].
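The 2n − 4 figure can be illustrated with one well-known construction on a complete graph: every node outside a group of four first calls a hub, four calls exchange everything within the group, and the hub then calls the outside nodes back. The sketch below only checks that this scheme uses 2n − 4 calls and leaves every node fully informed; it does not show that fewer calls are impossible. The function names are hypothetical.

```python
def run_calls(n, calls):
    """Apply calls in order; each call merges the knowledge of its two endpoints."""
    knows = [{i} for i in range(n)]
    for u, v in calls:
        merged = knows[u] | knows[v]
        knows[u], knows[v] = merged, set(merged)
    return knows

def four_hub_scheme(n):
    """One standard 2n - 4 call scheme on the complete graph (n >= 4)."""
    others = list(range(4, n))
    gather = [(0, k) for k in others]             # n - 4 calls into node 0
    core = [(0, 1), (2, 3), (0, 2), (1, 3)]       # 4 calls among nodes 0..3
    spread = [(0, k) for k in others]             # n - 4 calls out of node 0
    return gather + core + spread

n = 7
scheme = four_hub_scheme(n)
final = run_calls(n, scheme)
print(len(scheme), all(k == set(range(n)) for k in final))   # 10 True
```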


It was later shown that the gossiping problem on an arbitrary graph depends on whether the graph contains a 4-cycle [2,8]. Bumby showed that the optimal solution to the gossiping problem requires 2n − 4 calls in the case that the network contains a 4-cycle, and 2n − 3 when it does not. Krumme established that gossiping schemes can be reordered in a number of ways to produce equivalent results [11]. A number of variants to the gossiping problem have been proposed, including perpetual gossiping, which is the subject of this paper. For example, k-call gossiping permits multiple parties to join the same call, while parallel gossiping allows multiple independent calls to take place simultaneously [10]. Some versions attempt to minimize the total cost of calling rather than the total number of calls, or optimize other objectives such as the number of edges used [5]. Variants such as exact gossiping expect only a fraction of information atoms to be learned by each node [3,16]. A recent direction in the study of gossiping schemes is epistemic gossiping, an introspective problem which is interested in what each node knows about what the other nodes know [4]. Thus far, an optimal perpetual gossiping scheme has been found for several special case graphs [13,14]. For example, for machines arranged in a path, an optimal perpetual gossiping scheme requires 3n − 6 calls in the worst case, while in a cycle, 2n − 3 calls are required. However, finding a general solution for an arbitrary graph has proven to be difficult [6]. The form of the solution for a tree, for example, is not known [12].

3 Definitions

A gossiping network is a connected, undirected graph with at least two nodes. In an instance of a gossiping network, each node in the graph is associated with a set of information atoms (Fig. 1, left). If G = (V, E) is a gossip network, then


Fig. 1. On the left, the initial state of a gossiping network, in which each node contains a distinct, numbered atom of information. On the right, a call on the edge between nodes 4 and 5 shares all information between the two.

Optimal Perpetual Gossiping


a call can take place on an edge of G connecting nodes v1, v2 ∈ V. If v1 contains information A1 prior to the call and v2 contains A2, then both v1 and v2 contain A1 ∪ A2 after the call (Fig. 1, right). An instance of a gossiping network is said to have complete information if every node is associated with the same set of information (Fig. 2).


Fig. 2. The goal state of the gossiping network, in which the network has complete information.
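The call rule and the complete-information test can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not code from the paper: the helper names (`make_network`, `call`, `has_complete_information`, `four_hub_scheme`) are hypothetical, nodes are numbered 0..n−1, and `four_hub_scheme` assumes a complete graph, for which it builds one well-known sequence of 2n − 4 calls (n ≥ 4).

```python
def make_network(n):
    """Initial instance: node v holds only its own atom, numbered v."""
    return [{v} for v in range(n)]


def call(knowledge, v1, v2):
    """A call on edge {v1, v2}: afterwards both endpoints hold A1 ∪ A2."""
    merged = knowledge[v1] | knowledge[v2]
    knowledge[v1] = merged
    knowledge[v2] = set(merged)


def has_complete_information(knowledge):
    """True when every node is associated with the same set of atoms."""
    return all(k == knowledge[0] for k in knowledge)


def four_hub_scheme(n):
    """Hypothetical helper: 2n - 4 calls on a complete graph (assumes n >= 4).

    Nodes 4..n-1 first report to node 0, nodes 0..3 then gossip among
    themselves in four calls, and nodes 4..n-1 finally call node 0 again.
    """
    assert n >= 4
    outer = list(range(4, n))
    gather = [(v, 0) for v in outer]
    core = [(0, 1), (2, 3), (0, 2), (1, 3)]
    spread = [(v, 0) for v in outer]
    return gather + core + spread


# The call shown on the right of Fig. 1: nodes 4 and 5 exchange their atoms.
knowledge = make_network(6)
call(knowledge, 4, 5)

# Running the whole scheme from a fresh instance reaches complete information.
knowledge = make_network(6)
for a, b in four_hub_scheme(6):
    call(knowledge, a, b)
print(has_complete_information(knowledge))
```

Representing each node's information as a set keeps the call rule a one-line union; the matrix representation discussed later in the paper is an alternative with the same semantics.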

Assuming a sequence of discrete time steps, a gossiping scheme is a sequence of calls with one call corresponding to each time step. A perpetual gossiping scheme is a gossiping scheme containing an infinite number of calls. A gossiping scheme forms a complete gossip if, assuming that new information atoms are only introduced prior to the execution of the scheme, the network has complete information after the execution of the scheme. Note that a finite subsequence of a perpetual gossiping scheme can itself form a complete gossip, and if a scheme has such a subsequence then the full scheme must form a complete gossip as well. A gossiping window (or simply window) is a complete gossip such that any shorter scheme would no longer be a complete gossip (Fig. 3).


Fig. 3. A gossiping scheme consisting of calls c0, ..., cn. If complete information is first established as a result of call ci, then the gossiping window consists of calls c0, ..., ci (the shaded portion).

A perpetual gossiping scheme may have multiple windows of different sizes if multiple different starting points within the scheme are tried. The window size of a perpetual gossiping scheme is the length of the longest window over all starting points within the perpetual gossiping scheme. The optimal window size of a gossiping network is the best (minimum value) window size over all perpetual gossiping schemes for the given graph. An optimal perpetual gossiping scheme (or simply optimal scheme) for a given network is a perpetual gossiping scheme whose window size is the optimal window size.
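The window-size definition can be made concrete for perpetual schemes that repeat a finite period forever. The sketch below is an illustration only: the function names are hypothetical, every window is assumed to start from fresh, distinct atoms at each node, and the cutoff of n · n · len(period) calls is a conservative choice made here (a full period that teaches nobody anything can never be followed by progress).

```python
def window_length(n, calls):
    """Calls needed, from the start of `calls`, to reach complete information.

    Returns None if the given calls never establish complete information.
    """
    knowledge = [{v} for v in range(n)]
    for count, (a, b) in enumerate(calls, start=1):
        merged = knowledge[a] | knowledge[b]
        knowledge[a] = merged
        knowledge[b] = set(merged)
        if all(len(k) == n for k in knowledge):
            return count
    return None


def cyclic_window_size(n, period):
    """Longest window over all starting points of the scheme period, period, ..."""
    cutoff = n * n * len(period)  # assumed safe cutoff, see the lead-in
    worst = 0
    for start in range(len(period)):
        calls = [period[(start + i) % len(period)] for i in range(cutoff)]
        length = window_length(n, calls)
        if length is None:
            return None  # this scheme never forms a complete gossip
        worst = max(worst, length)
    return worst


# Path 0 - 1 - 2 with the two edges called alternately.
print(cyclic_window_size(3, [(0, 1), (1, 2)]))  # 3
```

Only the starting points inside one period need to be tried, because a cyclic scheme looks identical from any two starting points that are a whole number of periods apart.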

4 Optimal Scheme Algorithm

It will now be shown that for a gossiping network G = (V, E) with n = |V| > 1 nodes, an algorithm exists which will produce an optimal perpetual gossiping scheme.

4.1 Algorithm Description

1. Produce a set of all gossip schemes on G whose length is ≤ 4n − 6.
2. Remove from the set all gossip schemes which do not form a complete gossip.
3. Create a new set by taking schemes from the previous set and concatenating them. Each scheme in the new set is a composition of one or more schemes from the previous set such that each member of the previous set appears at most once in the scheme. Subject to this constraint, every possible composition is represented.
4. For each scheme in the new set, determine the window size of the scheme through simulation at each starting point in the scheme, assuming that the scheme is perpetual and repeats cyclically.
5. A scheme with the best window size from the previous step is a candidate optimal scheme.

4.2 A Note About Simulation

If there are n nodes in a network, then it is possible to use $O(n^2)$ time to simulate a gossip scheme of size O(n) to determine whether it forms a complete gossip or not, and if so, to determine the size of its window. This can be done by first setting up an n-by-n matrix representing which nodes know which other nodes’ information. Note that this initial set-up step already accounts for the $O(n^2)$ runtime by itself. The procedure is as follows. Initialize a counter to n. For every step in the gossip scheme, update the knowledge matrix of the pair of communicating nodes. If new information is gleaned by at least one of the communicating nodes, then update the counter as well. The process completes when either the counter reaches $n^2$ (complete information) or all of the steps in the scheme have been exhausted. Since there are O(n) steps taking O(n) time each, the total runtime is the stated $O(n^2)$.

4.3 Proof of Correctness

Since G is a connected graph, it must have a spanning tree. A complete traversal of the spanning tree requires not more than 2n − 3 steps (depending on the number of leaves in the tree). One complete traversal will guarantee that all information is consolidated at a single node, while a second traversal will guarantee that the information is dispersed from that one node to all other nodes. Thus, at worst, 4n − 6 calls are required in a perpetual gossiping scheme based on repeatedly traversing the spanning tree.

An optimal scheme must exist because there exists at least one scheme with window size ≤ 4n − 6, and there exist a finite number of possible window sizes ≤ 4n − 6. For every possible window size 0 < w ≤ 4n − 6, out of the set of all possible gossip schemes, either there exists a scheme with that window size or there does not. Thus there is a smallest w for which there is a scheme with window size w. That w is the optimal window size, and any scheme attaining it is an optimal scheme.

Let S be the set of all gossip schemes on G whose length is ≤ 4n − 6 and which form a complete gossip. Since G is a finite network, |E| must also be finite, so |S| is finite as well. For any perpetual gossip scheme with window size ≤ 4n − 6, every window of the scheme must be a member of S. Let S* be the set of gossip schemes formed by concatenating two or more sequences from S. Every perpetual gossip scheme with window size ≤ 4n − 6 must be a member of S*.

For any perpetual gossip scheme s ∈ S*, derive a perpetual gossip scheme s′ as follows: since s is formed by the concatenation of an infinite number of schemes, let s = s_0 s_1 s_2 ..., where each s_k ∈ S. Since there are a finite number of elements in S, the sequence of s_k’s must repeat at some s_i = s_j, 0 ≤ i < j. If i and j are chosen as the smallest such values, then the subsequence s_i ... s_{j−1} can be repeated over and over to give s′ (so s′ = s_i ... s_{j−1} s_i ... s_{j−1} ...). Since every window in s′ is identical to some window in s, it follows that the window size of s′ is no worse than the window size of s.

Let S′ be the set of all possible gossip schemes s′ which are derived as described above. Since there are a finite number of sequences formed by concatenating distinct s_k ∈ S, S′ must be finite. Furthermore, S′ must contain schemes of every possible window size ≤ 4n − 6. Since there is at least one scheme in that range, S′ must be nonempty. Therefore, an optimal scheme can be found in S′ by exhaustive search. Note that S* is an infinite set, so an actual algorithm would not rely on the construction of S*. However, S* is used to define S′, which is a set that can be constructed by an algorithm.
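The counter-based simulation of Sect. 4.2 and the flavour of the exhaustive search can be sketched in Python for very small networks. This is not the S′ construction itself: to stay runnable, the sketch swaps in a bounded search over cyclic schemes whose period has at most `max_period` calls (an assumed parameter), so the value it reports is only an upper bound on the optimal window size unless `max_period` is large enough. All function names are hypothetical.

```python
from itertools import product


def simulate(n, calls):
    """Counter-based simulation; returns the window length or None."""
    # knows[v][a] is True when node v knows the atom that originated at node a.
    knows = [[v == a for a in range(n)] for v in range(n)]
    counter = n  # every node initially knows exactly its own atom
    for step, (v1, v2) in enumerate(calls, start=1):
        for atom in range(n):  # O(n) work per call
            if knows[v1][atom] != knows[v2][atom]:
                knows[v1][atom] = knows[v2][atom] = True
                counter += 1
        if counter == n * n:  # complete information
            return step
    return None


def window_size(n, period, bound):
    """Longest window of the cyclic scheme, or None if a window exceeds bound."""
    worst = 0
    for start in range(len(period)):
        calls = [period[(start + i) % len(period)] for i in range(bound)]
        length = simulate(n, calls)
        if length is None:
            return None
        worst = max(worst, length)
    return worst


def best_cyclic_scheme(n, edges, max_period):
    """Bounded exhaustive search (a simplification of steps 1 to 5)."""
    bound = 4 * n - 6  # spanning-tree upper bound on the optimal window size
    best = None
    for length in range(1, max_period + 1):
        for period in product(edges, repeat=length):
            w = window_size(n, period, bound)
            if w is not None and (best is None or w < best[0]):
                best = (w, period)
    return best


# Path 0 - 1 - 2: two edges, periods of up to four calls.
print(best_cyclic_scheme(3, [(0, 1), (1, 2)], max_period=4))
```

Even this reduced search enumerates |E|^p periods for each period length p, which hints at why the full construction is of theoretical rather than practical interest.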

5 Generalized Network Model

Observe that the network model for a gossiping network is limited in a number of ways. To begin with, the graph is undirected, so it does not represent unidirectional or half-duplex communication channels. Only one call per time unit is allowed, so simultaneous calls and broadcasts are not permitted. The case of k-call gossip schemes (which allow up to k simultaneous calls) has been studied [14], but even then, not all practical cases are covered by the model. For example, two nearby wireless nodes may both be able to broadcast to all neighbors, but not simultaneously. In that case, the gossip scheme may allow k simultaneous calls, but those calls are only one-way, and not every set of k calls is allowed.

In the new model, G = (V, E) is now a strongly connected directed graph, and it is paired with an undirected interference graph G_I = (E, I) representing interference between pairs of edges (Fig. 4). If a call is made on an edge (v1, v2) ∈ E, where v1 contains information A1 prior to the call and v2 contains A2, then after the call, v1 still contains A1 while v2 contains A1 ∪ A2. In general, any number of simultaneous calls are allowed as long as there is no interference edge between any two of the calls. If e1, e2 ∈ E and {e1, e2} ∈ I, then calls on e1 and e2 cannot be made simultaneously.

Fig. 4. An example of a network expressed using the generalized model. The solid arrows represent network connections, while the dotted lines represent interference edges. In this example, each of the three nodes can broadcast to the other two, but no two nodes can communicate at the same time.

Under this model, the original behavior can be emulated by replacing every undirected edge with two directed edges, and by including an interference edge {e1, e2} for every e1, e2 ∈ E such that e1 and e2 do not refer to the same pair of nodes (in either order) in V. Broadcasts can be allowed by not including interference edges between broadcast channels. Half-duplex can be modeled by including interference between the two directions of a bidirectional edge. Wireless nodes which interfere with one another simply require a suitable selection of interference edges. If a schedule requires certain subsets of nodes to communicate at different times, then interference between subsets can be used to make sure that no two subsets are communicating at the same time. Under the new model, the lower bound of the original perpetual gossiping problem no longer applies. In particular, for a complete digraph with no interference edges, every node can communicate with every other node in a single step, so the optimal window size is 1, independent of the size of the network.
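One time step of the generalized model can be sketched as follows. The sketch rests on assumptions that the text does not spell out: calls are written as (sender, receiver) pairs, all calls in a step read the state from before the step (which is consistent with the window size of 1 stated for a complete digraph), and the interference set holds `frozenset` pairs of edges. The name `step` is hypothetical.

```python
from itertools import combinations


def step(knowledge, calls, interference):
    """Apply one time step of simultaneous one-way calls.

    Raises ValueError if two of the requested calls share an interference edge.
    """
    for e1, e2 in combinations(calls, 2):
        if frozenset((e1, e2)) in interference:
            raise ValueError(f"calls {e1} and {e2} interfere")
    before = [set(k) for k in knowledge]  # assumed: calls see the pre-step state
    for sender, receiver in calls:
        knowledge[receiver] |= before[sender]  # the sender keeps A1 unchanged


# Complete digraph on four nodes without interference: one step suffices.
n = 4
edges = [(u, v) for u in range(n) for v in range(n) if u != v]
knowledge = [{v} for v in range(n)]
step(knowledge, edges, set())
print(all(k == set(range(n)) for k in knowledge))

# A network in the spirit of Fig. 4: each of three nodes can broadcast to the
# other two, but calls from two different senders interfere.
edges3 = [(u, v) for u in range(3) for v in range(3) if u != v]
interference3 = {
    frozenset((e1, e2))
    for e1, e2 in combinations(edges3, 2)
    if e1[0] != e2[0]
}
knowledge = [{v} for v in range(3)]
for sender in (0, 1, 2):  # the nodes take turns broadcasting
    step(knowledge, [(sender, r) for r in range(3) if r != sender], interference3)
print(all(k == {0, 1, 2} for k in knowledge))
```

Checking interference pairwise over the requested calls mirrors the definition directly; for large call sets an adjacency structure over the interference graph would avoid the quadratic scan.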


Given this new model, it is still the case that an optimal solution can be found. To begin with, one can devise a simple scheme which is guaranteed to produce a complete gossip and use that to determine an upper bound on the optimal window size. Assume that n = |V| > 1. Since G is strongly connected, if v ∈ V is any node in the graph, then it requires n − 1 single-edge calls to consolidate all information from all nodes at v, and n − 1 single-edge calls to disseminate all information at v to all other nodes. If a scheme is devised to iteratively consolidate at v and then disseminate from v, then it requires 2n − 2 calls from the start of an iteration to achieve complete information. Since at most one full iteration may be wasted during a perpetual scheme, the window size of the iterative scheme is not more than 4n − 4. Since such a scheme exists, an optimal scheme exists.

The method to show that an optimal scheme can be found is similar to that used for the original version of the problem, in which sets S, S*, and S′ are defined, S′ is constructed, and an exhaustive search of S′ is used to find the best possible solution. Let S be the finite set of all possible complete gossips of length 4n − 4 or less. Note that unlike in the original model, schemes which appear in S may include schemes with multi-calls, depending on the gossiping network. As in the original model, in any perpetual gossip scheme with window size 4n − 4 or less, every window of the scheme must be a member of S. As before, let S* be the set of all concatenations of members of S. From S*, construct S′ as before. Since S′ is composed of cyclic perpetual gossiping schemes, where each cycle of a scheme in S′ is constructed from a finite number of members of S without repetitions, S′ must be finite. Furthermore, S′ is a nonempty set which must contain schemes of every possible window size ≤ 4n − 4. Therefore, an optimal scheme can be found in S′ by exhaustive search.
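The consolidate-then-disseminate iteration behind the 4n − 4 bound can be sketched with two breadth-first trees. The use of BFS and the names `bfs_tree` and `consolidate_then_disseminate` are choices made here for illustration; the text only states that n − 1 single-edge calls suffice in each direction on a strongly connected digraph.

```python
from collections import deque


def bfs_tree(adj, root):
    """Tree edges (parent, child) in the order BFS discovers them from root."""
    seen = {root}
    order = []
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                order.append((u, w))
                queue.append(w)
    return order


def consolidate_then_disseminate(n, edges, v=0):
    """One iteration: 2n - 2 one-way calls (sender, receiver) through node v."""
    out_adj = {u: [] for u in range(n)}
    in_adj = {u: [] for u in range(n)}
    for a, b in edges:
        out_adj[a].append(b)
        in_adj[b].append(a)
    # BFS over reversed edges gives every node a next hop toward v; replaying
    # the tree edges deepest-first moves every atom to v in n - 1 calls.
    toward_v = [(child, parent) for parent, child in reversed(bfs_tree(in_adj, v))]
    # BFS over forward edges spreads everything held at v in n - 1 more calls.
    away_from_v = bfs_tree(out_adj, v)
    return toward_v + away_from_v


# Directed cycle 0 -> 1 -> 2 -> 3 -> 0.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
scheme = consolidate_then_disseminate(4, cycle)
knowledge = [{u} for u in range(4)]
for sender, receiver in scheme:
    knowledge[receiver] |= knowledge[sender]
print(len(scheme), all(k == {0, 1, 2, 3} for k in knowledge))
```

Repeating this iteration forever gives a perpetual scheme; atoms introduced just after an iteration has begun may have to wait for the next one, which is where the factor of two in the 4n − 4 bound comes from.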

6 Evaluation and Limitations

The algorithms described will produce optimal solutions, as has been proven in the previous sections. However, there is no expectation that those solutions will be arrived at quickly. In the worst case, the first step of the algorithm produces the set S containing $O\left((n-1)^{4n-6}\right)$ gossip schemes to consider, while the set S′ of compositions of S has size $\Theta(|S| \cdot |S|!)$. Given S′, the window size of each scheme s′ ∈ S′ can be determined through simulation in a process which is $O(n^2 \cdot |s'|)$. While the algorithm is of theoretical interest, the complexity of the first two steps places it well beyond feasible computability for any reasonably sized network.

The best result for a static gossiping scheme is known, but it no longer applies for perpetual gossiping, because the arbitrary time at which information is introduced means that the starting point of the perpetual gossiping scheme cannot be guaranteed to coincide with the start of an optimal static gossiping scheme. However, the best result for a static gossiping scheme, 2n − 4, serves as an absolute lower bound on the performance of a perpetual gossiping scheme. It is also possible to produce a reasonable upper bound by noting that a perpetual gossiping scheme can be constructed as a repeated traversal of some spanning tree of G. No more than two full traversals of a spanning tree are required to disseminate information to all machines; therefore, the window size of an optimal perpetual gossiping scheme is not worse than 4n − 6 calls. This means that the guaranteed upper bound on window size is roughly twice the absolute lower bound of 2n − 4. For practical applications, this suggests that an investment in optimizing heuristics may be more productive than a search for a truly optimal solution.

With regard to the applicability of the gossip model, it is important to note the assumptions on the nature of the network. The generalized network model presented in this paper can be used to describe a broad range of network topologies. The algorithm presented in this paper can easily be adapted to incorporate several gossiping variations (for example, networks with costs associated with connections or gossip schemes in which partial information is sufficient). However, it is assumed that the network is a reliable, static network with a well-known, fixed, perpetual gossip scheme. Thus, probabilistic phenomena, such as unreliable communications or communication patterns with a stochastic component, are not covered by the model. Furthermore, dynamic networks are not considered. While it would be possible to recompute an optimal scheme for a network after it has changed its topology, the cost of doing so using this algorithm would be prohibitive.
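To get a feel for these magnitudes, the short Python sketch below counts every call sequence of length 1 to 4n − 6 on a complete graph (the candidates before any filtering in the first step) and evaluates the ratio between the 4n − 6 upper bound and the 2n − 4 lower bound. It is a back-of-the-envelope illustration with hypothetical function names, not a reproduction of the asymptotic bounds quoted above.

```python
def schemes_up_to(num_edges, n):
    """Number of single-call sequences of length 1..4n-6 over num_edges edges."""
    longest = 4 * n - 6
    return sum(num_edges ** k for k in range(1, longest + 1))


def bound_ratio(n):
    """Spanning-tree upper bound divided by the static lower bound (n >= 4)."""
    return (4 * n - 6) / (2 * n - 4)


for n in (4, 5, 6):
    complete_graph_edges = n * (n - 1) // 2
    print(n, schemes_up_to(complete_graph_edges, n), round(bound_ratio(n), 2))
```

The ratio approaches 2 as n grows, while the number of candidate sequences grows far faster than any polynomial in n.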

7 Conclusion

This paper gives an algorithm for producing an optimal perpetual gossiping scheme for an arbitrary graph, which proves both that an optimal scheme must always exist and that there is always a way to find one. This is noteworthy because for many classes of graphs, the form of the solution is not known. Finding the optimal window size would still require computing the optimal scheme, but this paper shows that it is possible.

The authors’ current and future work lies in extending the set of classes of network topologies for which optimal solutions are known, and analyzing selected networks to infer some necessary properties of optimal solutions. For example, the conjecture that any topology has an optimal solution composed purely of sequences of calls to contiguous edges seems straightforward, yet has been elusive to prove. It may be useful to study optimal perpetual gossip on tree graphs, because every connected graph has a spanning tree. The authors feel that the knowledge presented in this paper will simplify the task of seeking optimal solutions, either directly by providing a reference solution, or more subtly by eliminating the question of whether an optimal solution can be derived.

It should be noted that the algorithm runs in super-polynomial time. The authors have good reason to believe that the problem of finding an optimal scheme for a general network is NP-hard, due to prior analysis of certain restricted cases. Although it remains an open problem, it is reasonable to assume that the super-polynomial nature of the algorithm is not something that can be fixed for all cases. Thus, the work shown in this paper is likely to represent a satisfactory answer to the general optimality question for any practical purpose. Future work would involve definitively proving NP-completeness and providing some approximation algorithm for deriving a near-optimal solution.

The paper also shows how the result can be extended without problem to a more general network model which may be more practical in real-world scenarios. The fact that one can generalize the result suggests that it would not be difficult to generalize other results as well.

References

1. Baker, B., Shostak, R.: Gossips and telephones. Discret. Math. 2(3), 191–193 (1972). https://doi.org/10.1016/0012-365X(72)90001-5
2. Bumby, R.T.: A problem with telephones. SIAM J. Algebraic Discret. Methods 2(1), 13–18 (1981)
3. Chang, G.J., Tsay, Y.J.: The partial gossiping problem. Discret. Math. 148(1), 9–14 (1996). https://doi.org/10.1016/0012-365X(94)00257-J
4. van Ditmarsch, H., van Eijck, J., Pardo, P., Ramezanian, R., Schwarzentruber, F.: Epistemic protocols for dynamic gossip. J. Appl. Logic 20, 1–31 (2017). https://doi.org/10.1016/j.jal.2016.12.001
5. Fertin, G.: A study of minimum gossip graphs. Discret. Math. 215(1–3), 33–57 (2000)
6. Fertin, G., Labahn, R., et al.: Compounding of gossip graphs. Networks 36(2), 126–137 (2000)
7. Hajnal, A., Milner, E.C., Szemerédi, E.: A cure for the telephone disease. Canad. Math. Bull. 15(3), 447–450 (1972)
8. Hedetniemi, S.M., Hedetniemi, S.T., Liestman, A.L.: A survey of gossiping and broadcasting in communication networks. Networks 18(4), 319–349 (1988)
9. Khuller, S., Kim, Y.A., Wan, Y.C.J.: On generalized gossiping and broadcasting. J. Algorithms 59(2), 81–106 (2006). https://doi.org/10.1016/j.jalgor.2005.01.002
10. Knoedel, W.: New gossips and telephones. Discret. Math. 13(1), 95 (1975). https://doi.org/10.1016/0012-365X(75)90090-4
11. Krumme, D.W.: Reordered gossip schemes. Discret. Math. 156(1), 113–140 (1996). https://doi.org/10.1016/0012-365X(94)00302-Y
12. Labahn, R., Hedetniemi, S.T., Laskar, R.: Periodic gossiping on trees. Discret. Appl. Math. 53(1), 235–245 (1994). https://doi.org/10.1016/0166-218X(94)90187-2
13. Liestman, A.L., Richards, D.: Perpetual gossiping. Parallel Process. Lett. 3(04), 347–355 (1993)
14. Scott, A.D.: Better bounds for perpetual gossiping. Discret. Appl. Math. 75(2), 189–197 (1997)
15. Tijdeman, R.: On a telephone problem. Nieuw Archief voor Wiskunde 3(19), 188–192 (1971)
16. Tsay, Y.J., Chang, G.J.: The exact gossiping problem. Discret. Math. 163(1), 165–172 (1997). https://doi.org/10.1016/S0012-365X(96)00317-2

How to Achieve Traffic Safety with LTE and Edge Computing

Niklas Hehenkamp, Christian Facchi, and Stefan Neumeier

Technische Hochschule Ingolstadt, Ingolstadt, Germany
{niklas.hehenkamp,christian.facchi,stefan.neumeier}@thi.de

Abstract. Multi-Access Edge Computing (MEC) is an emerging technology that is promising for applications demanding a low latency and high bandwidth using cellular communication techniques. Vehicular communication is regarded as key technology on the way to fully autonomous vehicles. The requirements for safety critical applications in vehicles are harsh concerning timing and reliability. This paper analyzes the properties of MEC with regard to the requirements of vehicle safety applications. The paper elaborates the problems faced by vehicle-to-everything communication and possible approaches to solve them with MEC.

Keywords: Multi-Access Edge Computing · MEC · V2X · LTE · Road safety

1 Introduction

Achieving road safety is an ongoing effort in the research community. The communication of road users is a promising approach to decrease the number and severity of traffic accidents. In recent years, computational capacity has become increasingly prevalent in consumer devices or as a web-based service. A current approach, the deployment of computational resources at the edge of the Radio Access Network (RAN), and thus within the infrastructure of the cellular network between end users and the World Wide Web, is called Multi-access Edge Computing.

1.1 Multi-access Edge Computing

Over the past years cloud-based services have gained significance. While the performance of computing systems has been increasing continuously, many computationally intensive tasks are still performed on centralized remote servers rather than on local machines. Energy-intensive applications and services can be shifted to specialized nodes to decrease the power consumption of local machines and free up computational resources for time-critical tasks. Due to the signal latency in physical networks, the response time and bandwidth of cloud-based applications are limited. When a subscriber in a Long Term Evolution (LTE) cellular network requests a service on a server in the World Wide Web, the request is passed from the cellular user equipment to the Evolved Node B (eNodeB) and through the RAN to an internet gateway before reaching the destination server. The core network and the internet gateway therefore are bottlenecks, limiting the available bandwidth depending on the utilization of the RAN. Instead of requesting a service from a remote machine, the cloud server can be placed directly at the edge of the RAN in close physical proximity to the eNodeB. This approach is called Multi-access Edge Computing (MEC) or just Edge Computing. The proximity to the end users in the RAN can reduce the latency of communications, increase the bandwidth compared to cloud-based services and enable context-aware applications that utilize network traffic information available at the eNodeB. A MEC deployment strategy, system architecture and various use case scenarios for edge computing have been proposed by the European Telecommunications Standards Institute (ETSI).

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 164–176, 2020. https://doi.org/10.1007/978-3-030-12388-8_12

1.2 Vehicle to Everything Communication

Vehicles are becoming increasingly autonomous [1]. Current driver assistance systems already take over partial driving functionality and reduce the level of driver interaction [2]. A traffic system that comprises fully autonomous vehicles greatly depends on safety-related requirements that can be achieved through interconnection of road users and infrastructure. Vehicle-to-vehicle (V2V) and Vehicle-to-infrastructure (V2I) communication therefore is supposed to be an enabling technology for fully autonomous vehicles. Vehicle-to-everything (V2X) communication enables the distribution of sensory information among multiple road users. Infrastructure entities can share information such as camera images of intersections before approaching vehicles are in line of sight, and vehicles can share, for example, their driving trajectory, velocity and sensory information to help avoid dangerous situations. Road hazards that are not directly visible to certain road users are perceived through inter-vehicular information exchange.

Currently there are two basic approaches towards V2X communication: Wi-Fi-based and cellular-based communication. Both technologies have distinct advantages and disadvantages. Thus, mixed approaches featuring both device-to-device and cellular networking depending on the traffic situation have been discussed [3]. This paper focuses on V2X communication through cellular networks; it does not aim to settle the ongoing discussion on whether V2X should use Wi-Fi or LTE-V/5G as a method of communication.

1.3 MEC and Vehicle Safety

Compared with traditional cellular-based applications, the improvement areas of MEC include proximity, high bandwidth, low latency and context awareness. The density of the geographical distribution of MEC servers is only limited by the density of eNodeBs. In theory, MEC servers can be located even closer to end users using access technologies different from LTE such as Wi-Fi or Bluetooth. Thus, time-critical or bandwidth-intensive applications can be implemented through MEC. The support of Quality of Service (QoS) control in LTE and future cellular networking technologies benefits the fulfillment of real-time requirements in timing-constrained applications [4].

V2X communication is beneficial for road safety applications since it enables information sharing across multiple road users. Road hazards can be avoided, accidents can be foreseen and their severity can be reduced through proactive measures. The core requirements for safety-critical V2X applications are often a low deterministic latency, predictable latency variance and a high availability and robustness of communication. MEC is a promising approach to implementing V2X-based road safety applications because it is reliable and satisfies this core V2X requirement of low latency. MEC enables the sharing of traffic and sensory information across vehicles and aggregating data from multiple road users at one place, thus providing a bird’s eye perspective on the current traffic situation and networking conditions. Whether MEC is suitable for V2X safety applications depends on the following research questions:

– Do MEC-based services fulfill the requirements of V2X applications regarding timing and reliability?
– Are there road safety applications that especially suit the properties of MEC?
– How can vehicle safety be increased with MEC?
– Are there properties of MEC aside from its latency, bandwidth and robustness that could benefit vehicle safety?

The goal of this paper is to summarize open research questions regarding road safety applications based on LTE and MEC and to present possible solutions to these questions. The paper is structured as follows. Section 2 gives an overview of previous research related to the topics of MEC, V2X in LTE networks, V2X communication and MEC, and use cases for MEC and V2X. Section 3 addresses open research questions regarding the implementation of V2X-based road safety applications in cellular networks. Section 4 presents possible solutions for the aforementioned questions and discusses the opportunities of MEC. Section 5 gives an overview of future work and Sect. 6 concludes the paper.

2 State of the Art

The following sections review current research activities regarding MEC and V2X communication and present use cases of V2X communication.

2.1 Multi-access Edge Computing

MEC has been a subject of academic research for several years. While ETSI has been proposing standards regarding MEC, the research community discusses the architecture, orchestration and possible use-case scenarios.


Taleb et al. present a survey of MEC with regard to future cellular networks of the fifth generation (5G) [5]. They present a thorough overview of enabling technologies of MEC, ongoing standardization efforts, deployment scenarios and use cases. The use cases presented include computation offloading, distributed content delivery and caching, web performance enhancements, internet of things and big data as well as smart city services. The MEC reference architecture is analyzed and future research challenges are presented, not including V2X communication and road safety [5].

Ahmed et al. focus on the investigation of application scenarios of MEC [6]. They present a taxonomy of MEC that includes the characteristics, actors, applications, access technologies, objectives, computational platforms and key enablers of MEC. Characteristics of MEC include proximity, dense geographical distribution, low latency, location awareness and network context information. The proximity of MEC to potential actors due to its location in the RAN enables the analysis and use of device and traffic information. The low latency of MEC results from a higher bandwidth that is available in the edge network compared to the core network [6].

Mao et al. provide a survey of state-of-the-art MEC research with a focus on joint radio-and-computational resource management and present challenges and future research issues. The relationship between MEC and V2X is also discussed. Accordingly, existing car cloud services cannot satisfy the latency requirements of safety-critical services since those services work with an end-to-end latency ranging from 100 ms to 1 s. With MEC-based services, an end-to-end latency of 20 ms can be achieved [7].

Malandrino et al. discuss the conflict of utilization and latency in MEC. While a dense deployment of MEC servers would decrease the overall latency of MEC-based services for large areas, the uneven utilization of mobile network infrastructure could result in an underutilization of MEC services and thus unacceptable costs of the MEC infrastructure. They present solutions on how to approach the problem based on the operator’s deployment strategy and the geographical properties but do not address vehicular applications [8].

2.2 V2X Communication in LTE Networks

According to Tanenbaum, QoS requirements include timing and other non-functional requirements such as the bit rate at which data should be transported, the maximum delay until a session has been set up, the maximum end-to-end delay, the maximum jitter and the maximum round-trip delay of a message [9]. QoS requirements play a key role in V2X systems since safety-critical, road safety related services often have to fulfill hard real-time requirements. For example, pre-crash sensing warnings and high-density platooning demand latencies of 20 ms and 10 ms, respectively [10]. According to the LTE standards by the 3rd Generation Partnership Project (3GPP), QoS control can be applied per service data flow in LTE networks. Thus, because QoS is supported in modern cellular networks, a certain QoS policy can be applied per service [4].
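The latency figures quoted so far can be put side by side in a few lines of Python. This is only an arithmetic illustration: the dictionary keys and the function name are hypothetical, the requirement values are those cited from [10], the achievable values are the end-to-end latencies cited from [7] (taking the best case of 100 ms for car cloud services), and jitter, reliability and differences in how the sources define latency are ignored.

```python
# Latency requirements quoted from [10], in milliseconds.
REQUIRED_MS = {
    "pre-crash sensing warning": 20,
    "high-density platooning": 10,
}

# End-to-end latencies quoted from [7], in milliseconds.
ACHIEVABLE_MS = {
    "car cloud service (best case)": 100,
    "MEC-based service": 20,
}


def meets_budget(achievable_ms, required_ms):
    """True when the quoted achievable latency fits within the requirement."""
    return achievable_ms <= required_ms


for service, required in REQUIRED_MS.items():
    for platform, achievable in ACHIEVABLE_MS.items():
        verdict = "ok" if meets_budget(achievable, required) else "too slow"
        print(f"{service} ({required} ms) on {platform} ({achievable} ms): {verdict}")
```

On these numbers alone, the quoted MEC latency just matches the pre-crash budget and misses the platooning budget, which is one reason the timing question remains open.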


The LTE standards also include a study of LTE-based V2X services [11]. The study features three operation scenarios, the scope of their technical support, the architecture and high-level procedures for V2X. In the first operation scenario, direct V2V, Vehicle-to-pedestrian (V2P) and V2I communication is implemented through the LTE side link. Thus, there is device-to-device communication. Vehicles can communicate with Road Side Units (RSUs), Vulnerable Road Users (VRUs) or other vehicles directly, without using the eNodeB. In the second scenario, the RAN works as a relay for V2X communication. Road users communicate with others through the uplink and downlink to and from the eNodeB. The scenario includes Vehicle-to-network (V2N) communication, a case in which road users communicate with an application server in the RAN or the internet. MEC would be able to support the idea of a RAN-based application server for V2X applications. The third scenario describes a mixed operation of the first and the second scenarios. Vehicles could communicate with RSUs via a direct side link, while the RSU relays important messages to the eNodeB and possibly the application server through the LTE uplink. The eNodeB distributes the information to other road users and RSUs via LTE downlink [11].

Jeong et al. propose an architecture comprising Vehicular Ad-hoc Networks (VANETs) through Wi-Fi and cellular-network-based V2X communication. The mixed system is evaluated in several scenarios in a real driving environment. They also discuss the limitations of current technologies for safety-critical applications [12].

Chen et al. introduce LTE-V as an opposing technology to Wi-Fi-based device-to-device communication. They compare the different modes of LTE-V, cellular or direct, with existing 802.11p-based technologies and present an integrated V2X solution [13].

Ahmed et al. discuss privacy and security issues of LTE-based V2X services. They identify shortcomings of the security requirements in the LTE V2X standard regarding privacy preservation and propose privacy-preserving security for LTE-based V2X services [14].

2.3 V2X Communication and Edge Computing

The following section presents research activities that focus on V2X communication and MEC. Husain et al. give an overview of the standardization efforts for V2X services [15]. MEC is regarded as a technology that can facilitate traffic congestion indications and warnings from other vehicles. Thanks to the bird’s eye perspective of MEC-based services, a V2X system can take the user equipment of both vehicles and pedestrians into account. The low latency of MEC services enables almost real-time processing of vehicle sensor information, which can be used to avoid accidents and increase road safety. MEC is also promising for future smart cities because it enables location-aware services for traffic management, smart parking or fuel planning [15]. Hagenauer et al. present a concept in which parked vehicles serve as a virtual network infrastructure, thus forming temporary edge nodes. Parked cars form a

How to Achieve Traffic Safety with LTE and Edge Computing


cluster which can be accessed through certain parked vehicles serving as gateway nodes. The cluster forms an RSU spanning a larger geographical area that can be used to exploit unused computational capacity in parked vehicles [16]. Xiao et al. present the concept of vehicular fog computing, in which vehicles themselves become mobile edge computing nodes and use their mobility to provide cost-effective, on-demand fog computing for V2X applications. Possible actors in vehicular fog computing are buses or taxis, which provide computing and communication capacity where it is needed [17]. Li et al. propose a MEC-based architecture for cellular V2X communication comprising the communication scenarios V2I, V2V and V2N as well as cloud computing. They also present a classification of application types related to V2X and MEC and propose a MEC-based handover mechanism for vehicles [18]. Sasaki et al. present a vehicle control system that gathers sensor information from vehicles and is coordinated between cloud servers and MEC servers by dynamically allocating resources among them. A prototype using micro cars was implemented to investigate the concept and the influence of latency on the driving trajectory. The prototype implements cloud-based, MEC-based and mixed, automatically switched control of the micro cars. Their results show that a MEC-based approach minimizes deviations from the desired trajectory [19].

2.4 V2X Use Cases for Edge Computing

Lee et al. analyze the latency requirements of V2X applications and investigate whether LTE can fulfill them [10]. They list use cases related to road safety and automated driving together with the corresponding latency constraints. The use cases and their latency requirements are as follows [10]:

– Forward collision warning, 100 ms
– Control loss warning, 100 ms
– Emergency warning, 100 ms
– Emergency stop, 100 ms
– Queue warning, 100 ms
– Road safety services, 100 ms
– Pre-crash sensing warning, 20 ms
– Automated overtake, 10 ms
– Cooperative collision avoidance, 100 ms
– High density platooning, 10 ms
– See-through, 50 ms.
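The list above can be turned into a small lookup table to separate the use cases with relaxed latency budgets from the stricter ones. The following is a minimal sketch: the latency values are copied from the list, while the names `LATENCY_MS` and `group_by_budget` and the default threshold parameter are illustrative assumptions, with 100 ms chosen because this section treats that budget as still compatible with cloud computing.

```python
# Latency budgets in milliseconds, copied from the use case list [10].
LATENCY_MS = {
    "Forward collision warning": 100,
    "Control loss warning": 100,
    "Emergency warning": 100,
    "Emergency stop": 100,
    "Queue warning": 100,
    "Road safety services": 100,
    "Pre-crash sensing warning": 20,
    "Automated overtake": 10,
    "Cooperative collision avoidance": 100,
    "High density platooning": 10,
    "See-through": 50,
}


def group_by_budget(requirements, cloud_budget_ms=100):
    """Split use cases into those tolerating `cloud_budget_ms` and stricter ones.

    The threshold is an illustrative parameter, not a value fixed by a standard.
    """
    relaxed = sorted(name for name, ms in requirements.items() if ms >= cloud_budget_ms)
    strict = sorted((ms, name) for name, ms in requirements.items() if ms < cloud_budget_ms)
    return relaxed, strict


relaxed, strict = group_by_budget(LATENCY_MS)
print("Budget of 100 ms:", relaxed)
print("Stricter budgets:", strict)
```

Sorting the strict group by budget puts the most demanding use cases first, which is the order in which a low-latency platform would have to be dimensioned.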

Among these use cases, the pre-crash sensing warning demands a maximum latency of 20 ms. Automated overtake and high density platooning require a latency below 10 ms. The see-through use case, in which road users can see through obstacles that impair the driver’s vision, such as buildings or vehicles driving ahead, requires a latency below 50 ms [10]. All of these applications demand latencies that cannot be achieved in a cloud-based scenario, while those with a maximum latency requirement of 100 ms could be implemented with the support of cloud computing [7].

Buchenscheit et al. propose an emergency vehicle warning system that incorporates V2X communication in addition to the existing visual and acoustic warnings. They also show the additional benefits of V2X in emergency vehicles, such as the delivery of detailed traffic information over the air [20]. Fujioka et al. present field trial results for cellular-based warning of vehicles using geomessaging. The warnings focus on natural disasters and emergency vehicles [21]. Nunna et al. propose a real-time, context-aware collaboration system based on MEC, in which road accidents are automatically reported to the nearest MEC server. The MEC server takes a set of measures to minimize the time required for an ambulance to arrive at the scene by rerouting traffic in the area [22]. Rahman et al. present a MEC framework for the management of large crowds. Their example focuses on the largest pilgrim gathering worldwide, the annual Hajj in Mecca, and shows the delivery of context-aware, real-time services for a dense concentration of a large number of people. Their research indicates challenges that affect high density traffic scenarios [23].

3 Problem Statement and Research Gap

In a cellular V2X scenario there is no central node that aggregates the traffic information contained in the messages exchanged. Each vehicle that uses direct ad-hoc communication receives and sends messages, thus collecting information without knowing whether its picture of the current traffic situation is complete. There is no entity that receives all messages with very high probability, so road users are not necessarily fully informed. There is also no actor that collects and processes real-time network information to evaluate current network conditions such as the round-trip latency of messages or the overall message volume in an area. The message latencies between road users and the packet loss are unknown and not predictable. Cellular V2X lacks a road and network traffic observer with a bird’s eye view. The eNodeB in an LTE network collects valuable information about network conditions, but this information is currently not accessible to cellular V2X applications.

VRUs are not equipped with the hardware and software that vehicles use to send road-safety-related information, so pedestrians or cyclists cannot actively share their location or trajectory with other road users. They can be detected by sensors of vehicles or infrastructure units, but Wi-Fi-based device-to-device communication and cellular V2X do not directly include VRUs in their communications. Thus, a cyclist does not receive a warning if endangered. However, the eNodeB has access to knowledge about VRUs. Whether MEC can support the utilization of that knowledge and of infrastructure sensor information is an open research question.

Vehicular safety applications require a deterministic, low latency as well as an evaluation of the transmission reliability. Unless parameters such as the packet loss rate and the round-trip latency of the communication are known, applications on which human lives depend cannot be implemented, and the required Automotive Safety Integrity Level (ASIL) cannot be achieved. The eNodeB aggregates information regarding QoS. How can it be used with the help of MEC? The underlying research question is whether MEC can contribute to the evaluation of network reliability and QoS parameters, and whether MEC can address the following issues:

– Deterministic, low latency
– Deterministic jitter
– Observation of road activity and traffic
– Observation and analysis of network traffic
– Consideration of VRUs.

4 Realization of Vehicle Safety Applications Using MEC

Based on the properties of MEC servers, MEC-based applications can be regarded as cloud-based applications with much lower latency and higher bandwidth. These are the main arguments put forward for the deployment of MEC. The bandwidth properties of MEC are beneficial for V2X infotainment applications and the sharing of sensor information among road users. Since MEC servers are located close to the edge of the RAN, the throughput of MEC-based applications can be much higher than that of cloud-based services. A MEC server serves only those clients connected to certain eNodeBs, so the bandwidth is shared by fewer clients than in a centralized scenario and is therefore much higher for MEC-based than for cloud-based services. The low latency of MEC is particularly interesting in the safety domain. Since the MEC server is located very close to the clients it serves, the distance a packet has to travel and the number of network elements it has to pass are lower than for applications whose server is located on the internet. The latency is therefore determined only by the radio channel, the protocol implementation, the processing time at the eNodeB and the response time of the MEC application.

Taken alone, the bandwidth and latency properties of MEC do not necessarily justify a large-scale deployment of MEC servers in the RAN for V2X applications. A key argument for further investigation of MEC for V2X communication is its location at the eNodeB or at an aggregation point of multiple eNodeBs. The eNodeB can ensure QoS and therefore has access to information on round-trip latency, latency variance, packet loss and the current network utilization. These are important parameters for implementing vehicle safety applications. To evaluate and predict possible actions of drivers, information must be up to date. Furthermore, information about a vehicle and its environment must be reliable. Thus, reliability and timeliness of information are crucial factors for safety-critical applications. While the availability of computational capacity at the edge of the RAN can itself be beneficial for certain use cases, the combination of an eNodeB and a MEC node offers the unique opportunity to aggregate information about road and traffic conditions as well as network traffic and network conditions. The MEC server in that scenario has access to more network-related information than an internet-based cloud server. Its advantage over cellular V2X that simply relays messages is the aggregation of road information obtained by evaluating the messages received from vehicles. Compared to device-to-device communication, this solution supports QoS, so it is able to evaluate the network’s reliability and latency conditions. The reliability and latency of a transmission can be evaluated and categorized dynamically according to the requirements of different applications. The MEC server can decide whether an application can be offered to a road user according to the current reliability and latency level of the connection.

Figure 1 shows an early, exemplary prototype of a MEC-based V2X software platform that collects road traffic information and network information and stores it in a database of road users. QoS parameters are evaluated and predicted based on measurements. These measurements can be used to draw a detailed map showing the latency, jitter and packet loss for each location in the area of the MEC server. Various applications can be implemented on top of the platform, using the values stored in the database. Based on the QoS data, the MEC server can decide whether an application can be used reliably and in which geographical area it functions reliably. The platform also implements backup functions in case MEC servers fail, as well as edge-to-edge or handover functionality to share data across multiple MEC nodes. The platform accesses network traffic data available at the eNodeB to include both road users that actively transmit information and those that are subscribers only. The MEC platform can support real-time positioning of road users and evaluate movement patterns to predict future road hazards and dangerous situations.
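The decision described above, whether an application can be offered at the current latency and reliability level, can be sketched as a comparison of measured QoS values against per-application limits. This is only an illustration under stated assumptions: the class and function names are hypothetical, jitter is modelled as the standard deviation of the measured latencies, and the jitter and loss thresholds in the example are invented for demonstration. Only the latency budgets come from the use case list in Sect. 2.4.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class QosSample:
    latency_ms: float  # measured round-trip latency; ignored if the message was lost
    delivered: bool    # False if the message was lost


@dataclass
class AppRequirement:
    name: str
    max_latency_ms: float
    max_jitter_ms: float  # hypothetical threshold, not taken from the paper
    max_loss_rate: float  # hypothetical threshold, not taken from the paper


def summarize(samples):
    """Return (mean latency, jitter, packet loss rate) for one location."""
    if not samples:
        raise ValueError("at least one sample is required")
    latencies = [s.latency_ms for s in samples if s.delivered]
    loss_rate = 1.0 - len(latencies) / len(samples)
    if not latencies:
        return float("inf"), float("inf"), loss_rate
    # Assumption: jitter is the population standard deviation of the latency.
    return mean(latencies), pstdev(latencies), loss_rate


def can_offer(app, samples):
    """Decide whether `app` can currently be offered at the measured location."""
    latency, jitter, loss = summarize(samples)
    return (latency <= app.max_latency_ms
            and jitter <= app.max_jitter_ms
            and loss <= app.max_loss_rate)


samples = [QosSample(12.0, True), QosSample(14.0, True),
           QosSample(13.0, True), QosSample(0.0, False)]
see_through = AppRequirement("See-through", 50.0, 5.0, 0.30)
platooning = AppRequirement("High density platooning", 10.0, 5.0, 0.30)
print(can_offer(see_through, samples))
print(can_offer(platooning, samples))
```

Evaluating the same function per map cell would yield the kind of location-dependent reliability map the platform description mentions.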

Fig. 1. Architecture of a MEC-based software platform for V2X applications


VRUs that do not communicate directly in a road safety context are still connected to the eNodeB if they carry LTE-capable user equipment, e.g. a smartphone. Information about their position and movements can be processed by the MEC server and thus be included in road safety applications. Furthermore, infrastructure units such as traffic lights that are equipped with cameras can detect VRUs and share their location with the MEC server and thereby with other road users. If, for example, children play close to the road and suddenly cross it, sensor and camera data can be processed quickly at the edge of the RAN and used to warn approaching vehicles. The close proximity of computational capacity and the high bandwidth available with MEC are beneficial for applications such as the transmission and processing of camera images. Data acquired by cameras (located, e.g., on traffic lights) can be sent to the MEC server quickly and processed on site.

5 Future Work

A MEC server prototype that implements the basic functionality required for V2X applications will be implemented and evaluated with regard to the shortcomings of existing approaches. Figure 1 shows a possible software architecture of the prototype. Future research includes the further investigation of mixed scenarios comprising 802.11p and cellular-based technologies. MEC is not tied to LTE as its access technology, so a diverse, truly multi-access system will be investigated. It is not yet clear to what extent MEC can address the latency and throughput requirements of safety-critical applications in the automotive domain. The latency and jitter of MEC will be investigated further in an automotive test scenario. Future research will also examine the influence of the number of clients connected to the MEC server. Further research will also address two questions: at which density do MEC servers have to be deployed to serve safety-critical V2X services sufficiently, and how can the system’s reliability be achieved using multiple MEC servers? On the road to the fifth generation of cellular networks (5G), there will be concepts that could enable very low latency applications in the RAN. To what extent MEC will contribute to the 5G cellular network, and how automotive applications can profit from it, will also be part of future research.

6 Conclusion

Current cellular-based technologies used to implement V2X communication lack the ability to evaluate QoS parameters such as the round-trip latency of transmissions or jitter. These are crucial parameters for safety-critical applications in the vehicular domain. MEC offers the opportunity to aggregate network and road traffic information in close proximity to end users, resulting in low transmission latencies and high available bandwidth. MEC is a promising technology to overcome the challenges of V2X communication since it can fulfill the timing and reliability requirements of vehicle-safety-related applications and integrate VRUs into a unified road safety system. Commercially available V2X approaches do not include the control and analysis of QoS parameters such as latency, jitter and packet loss. These can be determined in real time by the MEC server and used to estimate the reliability of safety-critical V2X applications that have to fulfill strict timing and reliability requirements. The level of communication confidence can be determined and thus provided to the applications as additional information. This might be a possible way to reduce the effect of non-deterministic over-the-air communication.

The aggregation of road traffic information at a MEC server enables the management and prediction of hazardous traffic situations. A MEC server can thus observe road traffic from a bird’s eye perspective, a perspective that is currently not available for V2X communication. The MEC server can have access to information of the eNodeB. Therefore, VRUs can be included in road safety applications if they carry the necessary user equipment, such as a smartphone. The high bandwidth available with MEC can be used to collect sensor information from numerous vehicles and infrastructure facilities. Traffic lights or other infrastructure units can be equipped with cameras that observe roads and intersections. The images can be processed on the MEC server and used to detect hazards. Because of the low transmission latency, vehicles in range of the hazard can be warned and accidents can be avoided. The processing of vehicle sensor data is useful to predict hazardous driving maneuvers and warn road users accordingly. MEC is a key technology for increasing road and vehicle safety significantly because it uses information that is not available in conventional cellular or Wi-Fi-based environments.

Acknowledgement. The authors want to thank Biraj Parikh and Suprateek Banerjee for cross-reading and providing valuable feedback.

References

1. ETF Connectivity and A Driving: Automated Driving Roadmap, ERTRAC, resreport, July 2015. http://www.ertrac.org/uploads/documentsearch/id38/ERTRAC Automated-Driving-2015.pdf
2. ORADO Committee: Taxonomy and Definitions for Terms Related to On-Road Motor Vehicle Automated Driving Systems. SAE International Std., January 2014
3. Abboud, K., Omar, H.A., Zhuang, W.: Interworking of DSRC and cellular network technologies for V2X communications: a survey. 65(12), 9457–9470 (2016). http://ieeexplore.ieee.org/document/7513432/
4. 3GPP: Policy and charging control architecture, 3rd generation partnership project (3GPP). Technical Specification (TS) 23.203, version 15.1.0, 12 2017. http://www.3gpp.org/DynaReport/23203.htm
5. Taleb, T., Samdanis, K., Mada, B., Flinck, H., Dutta, S., Sabella, D.: On multi-access edge computing: a survey of the emerging 5G network edge architecture and orchestration. 1 (2017). http://ieeexplore.ieee.org/document/7931566/
6. Ahmed, A., Ahmed, E.: A survey on mobile edge computing. In: 2016 10th International Conference on Intelligent Systems and Control (ISCO), pp. 1–8. IEEE (2016). http://ieeexplore.ieee.org/abstract/document/7727082/
7. Mao, Y., You, C., Zhang, J., Huang, K., Letaief, K.B.: A survey on mobile edge computing: the communication perspective. 1 (2017). http://ieeexplore.ieee.org/document/8016573/
8. Malandrino, F., Kirkpatrick, S., Chiasserini, C.-F.: How close to the edge?: delay/utilization trends in MEC. In: Proceedings of the 2016 ACM Workshop on Cloud-Assisted Networking, ser. CAN 2016, pp. 37–42. ACM (2016). http://doi.acm.org/10.1145/3010079.3010080
9. Tanenbaum, A.S., Van Steen, M.: Distributed Systems: Principles and Paradigms. Prentice-Hall, Upper Saddle River (2007)
10. Lee, K., Kim, J., Park, Y., Wang, H., Hong, D.: Latency of cellular-based V2X: perspectives on TTI-proportional latency and TTI-independent latency. 5, 15800–15809 (2017). http://ieeexplore.ieee.org/document/7990497/
11. 3GPP: Study on LTE-based V2X services, 3rd Generation Partnership Project (3GPP). Technical Report (TR) 36.885, version 14.0.0, 07 2016. http://www.3gpp.org/DynaReport/36885.htm
12. Jeong, S., Baek, Y., Son, S.H.: A hybrid V2X system for safety-critical applications in VANET. In: 2016 IEEE 4th International Conference on Cyber-Physical Systems, Networks, and Applications (CPSNA), pp. 13–18 (2016)
13. Chen, S., Hu, J., Shi, Y., Zhao, L.: LTE-V: a TD-LTE-based V2X solution for future vehicular network. 3(6), 997–1005 (2016)
14. Ahmed, K.J., Lee, M.J.: Secure, LTE-based V2X service (2017)
15. Husain, S., Kunz, A., Prasad, A., Samdanis, K., Song, J.: An overview of standardization efforts for enabling vehicular-to-everything services. In: 2017 IEEE Conference on Standards for Communications and Networking (CSCN), pp. 109–114. IEEE (2017)
16. Hagenauer, F., Sommer, C., Higuchi, T., Altintas, O., Dressler, F.: Parked cars as virtual network infrastructure: enabling stable V2I access for long-lasting data flows, pp. 57–64. ACM Press (2017). http://dl.acm.org/citation.cfm?doid=3131944.3131952
17. Xiao, Y., Zhu, C.: Vehicular fog computing: vision and challenges. In: 2017 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), pp. 6–9. IEEE (2017). http://ieeexplore.ieee.org/abstract/document/7917508/
18. Li, L., Li, Y., Hou, R.: A novel mobile edge computing-based architecture for future cellular vehicular networks. In: 2017 IEEE Wireless Communications and Networking Conference (WCNC), pp. 1–6. IEEE (2017). http://ieeexplore.ieee.org/abstract/document/7925830/
19. Sasaki, K., Suzuki, N., Makido, S., Nakao, A.: Vehicle control system coordinated between cloud and mobile edge computing. In: 2016 55th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE), pp. 1122–1127. IEEE (2016). http://ieeexplore.ieee.org/abstract/document/7749210/
20. Buchenscheit, A., Schaub, F., Kargl, F., Weber, M.: A VANET-based emergency vehicle warning system. In: 2009 IEEE Vehicular Networking Conference (VNC), pp. 1–8. IEEE (2009). http://ieeexplore.ieee.org/abstract/document/5416384/
21. Fujioka, M., Takahashi, M., Matsumura, T., Wang, C., Hirakoba, H.: Field trial of cellular warning for automobiles using geomessaging in Japan. In: 2012 12th International Conference on ITS Telecommunications, pp. 497–501 (2012)
22. Nunna, S., Kousaridas, A., Ibrahim, M., Dillinger, M., Thuemmler, C., Feussner, H., Schneider, A.: Enabling real-time context-aware collaboration through 5G and mobile edge computing, pp. 601–605. IEEE (2015). http://ieeexplore.ieee.org/document/7113539/
23. Rahman, M.A., Hassanain, E., Hossain, M.S.: Towards a secure mobile edge computing framework for Hajj (2017)

Design of Microstrip Patch Antenna with Inset Feed in CST for EBS Channel

Mian Mujtaba Ali(✉), Muhammad H. D. Khan, and Omer Farooq

Department of Electrical Engineering, Bahria University, Islamabad, Pakistan
{mujtabaali.buic,omerfarooq.buic}@bahria.edu.pk, [email protected]

Abstract. In this paper, the authors design a rectangular microstrip patch antenna for the Educational Broadcasting Service (EBS). The EBS band is reserved by the Federal Communications Commission for educational purposes and devices. Wi-Fi routers and sensors designed for this band can be used by educational institutes to achieve higher data rates and less interference than with traditional devices. The antenna resonates at 2.5675 GHz with an S11 coefficient of −27.79 dB. The antenna is designed in Computer Simulation Technology (CST) Studio using theoretical calculations, and the simulated results show that the antenna has a −10 dB bandwidth of 40 MHz with a peak gain of 6.62 dBi. Return loss, far fields, efficiency and radiation pattern show that the design is suitable for wireless devices operating in the EBS channel.

Keywords: Antenna · Patch · CST · Microstrip · EBS

© Springer Nature Switzerland AG 2020
K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 177–184, 2020. https://doi.org/10.1007/978-3-030-12388-8_13

1 Introduction

An antenna transmits information by converting electrical power into EM waves and vice versa. Transmitting antennas are used in almost every communicating device, from television dish antennas to the antenna arrays of space communication systems. The antenna is the most vital component of a wireless communication system. One of the most commonly used antenna types is the patch antenna. Owing to their small footprint, low weight and low cost, patch antennas are used in many applications. A simple microstrip patch antenna comprises a conducting patch and a ground plane separated by a dielectric material with a particular dielectric constant [1]. A patch antenna radiates primarily due to the fringing fields between the patch and the ground plane. For good antenna performance, a thick dielectric substrate with a low dielectric constant ensures better efficiency, good bandwidth and better radiation [2]. Microstrip patch antennas can be fabricated in numerous shapes, such as circular or rectangular. The preferred shape usually depends on application requirements such as resonance frequency, antenna polarization and radiation efficiency. The most common applications for patch antennas are Bluetooth [3], Wi-Fi [4] and wireless sensor networks [5].

The Educational Broadcasting Service (EBS) is a flexible-use band licensed to educational institutes or non-educational institutes such as hospitals, nursing homes and training centers. The band can be used to provide broadband services and internet access [6]. This paper presents the design of an antenna that can be used to transmit information in the EBS channel. Such an antenna would help to design Wi-Fi routers or Bluetooth devices for educational institutes and some non-educational institutes that face less interference than traditional Wi-Fi and Bluetooth devices, since this channel differs from the traditional 2.4 GHz channel.

2 Antenna Design and Structure

The resonant frequency of an antenna is the center frequency at which the antenna communicates. The proposed antenna is designed to work in the EBS channel, so its resonant frequency is chosen as 2.5675 GHz. A substrate with a high dielectric constant is selected, which reduces the dimensions of the antenna [2]. The substrate is FR-4 with a dielectric constant $\varepsilon_r$ of 4.9. Patch antennas are mostly used in size-sensitive devices, so it is essential that the antenna is not bulky; for this reason the substrate height is kept at approximately 1.56 mm. A perfectly conducting metal with a thickness of 0.035 mm is used for the ground plane and the patch. Several methods can be used to feed microstrip patch antennas, e.g. microstrip line inset feed, coaxial probe feed and proximity coupling. The microstrip line inset feed is usually preferred because the inset line is narrower than the antenna, it is easier to fabricate on printed boards, it eliminates the soldering and drilling required for a coaxial feed, and impedance matching is achieved by controlling the length and width of the inset feed line. First, the antenna’s dimensions are calculated; these are then used to design and simulate the antenna in the CST software.

2.1 Patch Antenna Width (W)

The patch antenna width is calculated using the following equation [7]:

$$W = \frac{c}{2 f_r} \sqrt{\frac{2}{\varepsilon_r + 1}} \qquad (1)$$

Substituting the speed of light in vacuum $c = 3 \times 10^{8}$ m/s, the center (resonant) frequency $f_r = 2.5675$ GHz and the dielectric constant (relative permittivity) $\varepsilon_r = 4.9$ gives a patch width of W = 34.0148 mm.
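The width calculation can be checked numerically. The snippet below is a minimal sketch of Eq. (1) using the values given in the text; the function name `patch_width_m` is illustrative and not part of any antenna design tool.

```python
import math

C = 3e8  # speed of light in vacuum in m/s, as used in the text


def patch_width_m(f_r_hz, eps_r, c=C):
    """Patch width from Eq. (1): W = c / (2 f_r) * sqrt(2 / (eps_r + 1))."""
    return c / (2.0 * f_r_hz) * math.sqrt(2.0 / (eps_r + 1.0))


width_mm = patch_width_m(2.5675e9, 4.9) * 1e3
print(f"W = {width_mm:.4f} mm")
```

Up to rounding in the last digit, the result agrees with the 34.0148 mm quoted above.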

2.2 Effective Dielectric Constant (εreff)

The effective dielectric constant of the substrate is calculated using the following equation [7]:

$$\varepsilon_{reff} = \frac{\varepsilon_r + 1}{2} + \frac{\varepsilon_r - 1}{2} \left[ 1 + \frac{12h}{W} \right]^{-1/2} \qquad (2)$$

Substituting $\varepsilon_r = 4.9$ and h = 1.56 mm gives $\varepsilon_{reff} = 4.5704$.

2.3 Extended Length (ΔL)

The effective dielectric constant ($\varepsilon_{reff}$) is used to calculate the extended length $\Delta L$, the length that is electrically added to the physical length by the fringing fields. The following equation is used to calculate $\Delta L$ [7]:

$$\frac{\Delta L}{h} = 0.412 \, \frac{(\varepsilon_{reff} + 0.3)\left(\frac{W}{h} + 0.256\right)}{(\varepsilon_{reff} - 0.258)\left(\frac{W}{h} + 0.8\right)} \qquad (3)$$

Substituting $\varepsilon_{reff} = 4.5704$, h = 1.56 mm and W = 34.0148 mm gives $\Delta L = 0.16$ mm.

2.4 Length of Patch (L)

The actual length of the patch is calculated using the following equation [7]:

$$L = \frac{c}{2 f_r \sqrt{\varepsilon_{reff}}} - 2\Delta L \qquad (4)$$

Substituting $\varepsilon_{reff} = 4.5704$, $f_r = 2.5675$ GHz and $\Delta L = 0.16$ mm gives L = 26.0771 mm.

2.5 Patch Input Impedance (Ri)

The input impedance of the patch is calculated using the following equation:

$$R_i = \frac{90}{2 \left( \dfrac{W}{c / f_r} \right)^{2}} \qquad (5)$$

Substituting the values of the width, the speed of light and the resonance frequency gives $R_i = 530\ \Omega$. This impedance of the antenna patch is used to calculate the length and width of the inset feed line.
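As a quick numerical check, Eq. (5) can be evaluated with the width from Sect. 2.1. The sketch below reads the term c/f in the equation as the free-space wavelength at the resonance frequency; the function name is illustrative.

```python
def patch_input_resistance_ohm(width_m, f_r_hz, c=3e8):
    """Edge resistance from Eq. (5): R_i = 90 / (2 * (W / lambda0)**2)."""
    wavelength_m = c / f_r_hz  # free-space wavelength lambda0 = c / f_r
    return 90.0 / (2.0 * (width_m / wavelength_m) ** 2)


r_i = patch_input_resistance_ohm(34.0148e-3, 2.5675e9)
print(f"R_i = {r_i:.1f} ohm")
```

The result is about 531 Ω, in line with the rounded 530 Ω used in the text.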

2.6 Inset Feed Length (y0)

The inset feed length is the distance of the feed point from the edge of the patch, measured into the patch. It is calculated using the following equation [7, 8]:

$$R_i(y = y_0) = R_i(y = 0) \cos^{2}\left( \frac{\pi y_0}{L} \right) \qquad (6)$$

$R_i(y = y_0)$, or $Z_0$, is the desired input impedance, chosen as 50 Ω, the preferred antenna input impedance; $R_i(y = 0)$ is the resistance of the patch antenna, which in this case is 530 Ω. Substituting these values gives $y_0 = 10$ mm.
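Equation (6) can be inverted in closed form to obtain the inset depth for a desired input impedance. The sketch below does this with the values quoted in the text; the function name and the restriction to the branch between the patch edge and its center are assumptions made for illustration.

```python
import math


def inset_depth_mm(r_edge_ohm, r_target_ohm, length_mm):
    """Solve Eq. (6), R_target = R_edge * cos^2(pi * y0 / L), for y0 in [0, L/2]."""
    ratio = r_target_ohm / r_edge_ohm
    if not 0.0 < ratio <= 1.0:
        raise ValueError("target resistance must be positive and at most the edge resistance")
    return length_mm / math.pi * math.acos(math.sqrt(ratio))


y0 = inset_depth_mm(530.0, 50.0, 26.0771)
print(f"y0 = {y0:.2f} mm")
```

With the quoted values the closed-form solution is a little above 10 mm, close to the 10 mm inset used in the design.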

2.7 Microstrip Feed Line Width (W0)

The microstrip feed line, which delivers the input signal to the antenna, has a width $W_0$ determined from the following equation. The calculation sets the characteristic impedance of the feed line to 50 Ω ($Z_0$):

$$Z_c = \frac{60}{\sqrt{\varepsilon_{reff}}} \ln\left( \frac{8h}{W_0} + \frac{W_0}{4h} \right) \qquad (7)$$

Solving Eq. (7) gives a microstrip feed line width of $W_0 = 2.28$ mm.
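One way to sanity-check the chosen feed line width is to evaluate Eq. (7) in the forward direction. The following sketch uses the substrate height and effective dielectric constant given earlier; the function name is illustrative.

```python
import math


def feed_line_impedance_ohm(w0_mm, h_mm, eps_reff):
    """Characteristic impedance of the feed line from Eq. (7)."""
    return 60.0 / math.sqrt(eps_reff) * math.log(8.0 * h_mm / w0_mm + w0_mm / (4.0 * h_mm))


z_c = feed_line_impedance_ohm(2.28, 1.56, 4.5704)
print(f"Z_c = {z_c:.2f} ohm")
```

For $W_0 = 2.28$ mm the expression evaluates to roughly 49.5 Ω, i.e. close to the 50 Ω target.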

2.8 Microstrip Line Length (l1)

The microstrip line length $l_1$ is calculated using the following equation:

$$l_1 = \frac{116.8451}{4 \sqrt{\varepsilon_{reff}}} \qquad (8)$$

Substituting and solving yields $l_1 = 13.66$ mm.
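The constant 116.8451 in Eq. (8) matches the free-space wavelength c/f_r at 2.5675 GHz expressed in millimetres, so the equation can be read as a quarter of the guided wavelength. That reading is an interpretation, as is the function name in the sketch below.

```python
import math


def quarter_wave_length_mm(f_r_hz, eps_reff, c=3e8):
    """Eq. (8) read as l1 = lambda0 / (4 * sqrt(eps_reff)), with lambda0 = c / f_r."""
    wavelength_mm = c / f_r_hz * 1e3  # free-space wavelength in mm
    return wavelength_mm / (4.0 * math.sqrt(eps_reff))


l1 = quarter_wave_length_mm(2.5675e9, 4.5704)
print(f"l1 = {l1:.2f} mm")
```

With $\varepsilon_{reff} = 4.5704$ this agrees with the 13.66 mm stated above to two decimal places.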

2.9 Tooth Gap

The gap between the patch and the inset feed line is set to 1 mm on each side.

2.10 Substrate Dimensions

The length and width of the substrate are taken as double the length and width of the patch, respectively. The design and simulation of the antenna have been carried out in Computer Simulation Technology (CST) Studio. The three-dimensional shape of the antenna design in CST is shown in Fig. 1.

3 Results and Simulations

The CST simulation shows that the antenna has an S11 parameter (return loss) of −27.79 dB at the resonance frequency of 2.5675 GHz, as shown in Fig. 2. The VSWR (voltage standing wave ratio) of the antenna is 1.005. The antenna accepts 99.82% of the power supplied to it, and only 0.18% is lost to reflection; this high efficiency is due to the excellent S11 parameter.

Fig. 1. Antenna structure in CST studio

Fig. 2. S-Parameter

The −10 dB bandwidth of the antenna is 40 MHz. The bandwidth is read from the S-parameter plot (see Fig. 2) and is taken as the band of frequencies over which the S11 parameter is below −10 dB. Patch antennas normally have a narrow bandwidth, which is one of their disadvantages. The antenna operates in the Educational Broadband Service, which covers the band from 2495 to 2690 MHz. The bandwidth of an antenna is normally one of the factors that determine the achievable data rate. The input impedance of the inset feed antenna is 49.3214 Ω (resistive) with a reactive part of −1.8660 Ω at the resonance frequency. Figure 3 shows the Smith chart of the antenna. The input impedance targeted in the calculation of the inset feed was 50 Ω.
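The −10 dB bandwidth criterion described above can be applied to any sampled S11 curve. The sketch below shows one possible way to do so with linear interpolation at the band edges. The sweep data are synthetic and purely illustrative, not the simulated curve of Fig. 2, and the function name is an assumption.

```python
def bandwidth_below(freqs_mhz, s11_db, threshold_db=-10.0):
    """Return (f_low, f_high) of the band around the S11 minimum that stays below threshold_db."""
    i_min = min(range(len(s11_db)), key=s11_db.__getitem__)
    if s11_db[i_min] >= threshold_db:
        return None  # the curve never drops below the threshold

    def crossing(i_in, i_out):
        # Linear interpolation between a sample inside the band and its outside neighbour.
        f_in, f_out = freqs_mhz[i_in], freqs_mhz[i_out]
        s_in, s_out = s11_db[i_in], s11_db[i_out]
        return f_in + (threshold_db - s_in) * (f_out - f_in) / (s_out - s_in)

    lo = i_min
    while lo > 0 and s11_db[lo - 1] < threshold_db:
        lo -= 1
    hi = i_min
    while hi < len(s11_db) - 1 and s11_db[hi + 1] < threshold_db:
        hi += 1
    f_low = crossing(lo, lo - 1) if lo > 0 else freqs_mhz[0]
    f_high = crossing(hi, hi + 1) if hi < len(s11_db) - 1 else freqs_mhz[-1]
    return f_low, f_high


# Synthetic V-shaped notch (hypothetical data): minimum of -25 dB at 2570 MHz.
freqs = [2500 + 10 * i for i in range(15)]
s11 = [min(-1.0, -25.0 + 0.5 * abs(f - 2570)) for f in freqs]
f_low, f_high = bandwidth_below(freqs, s11)
print(f"-10 dB band: {f_low:.1f} to {f_high:.1f} MHz, width {f_high - f_low:.1f} MHz")
```

The same function could be applied to an S11 sweep exported from a simulator, provided the frequency samples are sorted in ascending order.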

Fig. 3. Smith chart

The good impedance matching of the inset feed line ensures that maximum power is delivered to the antenna. The radiation efficiency obtained from the simulation is 86%, which means that most of the power delivered to the antenna is radiated. The maximum directivity of the antenna is along its z-axis, which is normal to the plane of the antenna. The far-field simulation results in Fig. 4 show a gain of 6.62 dBi along the z-axis.

Fig. 4. Far field


The radiation pattern is normal to the plane of the patch antenna, as shown in the far-field simulation results (see Fig. 4). The maximum gain of 6.62 dBi is along the z-axis, as required for the antenna. Directivity is a desired property of patch antennas; to obtain directivity in more than one direction, an array of patch antennas is used. The maximum directivity and the main lobe of the antenna are along the z-axis (perpendicular to the plane of the patch antenna), as shown in the polar directivity plot (see Fig. 5). Two minor lobes are also present, perpendicular to the ground plane of the antenna. These minor lobes are attributed to the imaginary part of the impedance (capacitance and impedance caused by the materials used in the patch), which is visible in the Smith chart.

Fig. 5. Directivity

4 Conclusion

This paper presented a design approach for a rectangular microstrip patch antenna with a microstrip inset feed line. The antenna is designed for a resonance frequency of 2.5675 GHz and simulated in Computer Simulation Technology (CST) Studio. The simulation shows that it is a viable antenna for the Educational Broadband Service (EBS) band, with a VSWR of 1.005 and a gain of 6.62 dBi. The proposed antenna can be used for Wi-Fi routers or Bluetooth devices in the EBS channel. The maximum directivity is perpendicular to the plane of the antenna. In future work, an array of such antennas can be designed and simulated so that it radiates in multiple directions, as required for Wi-Fi routers.


References

1. Afridi, M.A.: Microstrip patch antenna - designing at 2.4 GHz frequency. Biol. Chem. Res. 2015, 128–132 (2015)
2. Harihara, S.G., Prabhu, S.S.: Design, analysis and fabrication of 2 × 1 rectangular patch antenna for wireless applications. Int. J. Adv. Res. Electron. Comm. Eng. 4, 599–603 (2015)
3. Dixit, A., Singh, O.P., Mishra, G.R.: Design and analysis of a patch antenna for bluetooth application. Int. J. Res. Eng. Tech. eISSN 2319-1163, pISSN 2321-7808
4. Sharma, G., Sharma, D., Katariya, A.: An approach to design and optimization of WLAN patch antennas for Wi-Fi applications. Int. J. Wirel. Commun. 1(2), 09–14 (2011)
5. Bhanarkar, M.K., Nadaf, A.J., Korake, P.M., Waghmare, G.B., Navarkhele, V.V.: Rectangular microstrip patch antenna for wireless sensor networks
6. FCC: Wireless Services: BRS & EBS Radio Services: BRS & EBS Home. Wireless.fcc.gov. N.p., 2016, 15 November 2016
7. Pete, B.: Antenna-Theory.Com - Rectangular Microstrip (Patch) Antenna - Feeding Methods. Antenna-theory.com. N.p., 2016, 19 November 2016
8. Ramesh, M.: YIP KB motorola design formula for inset fed microstrip patch antenna. J. Microwaves Optoelectron. 3(3), 5–10 (2003)

Dynamic Spectrum Access of Virtualized-Operated Networks over MIMO-OFDMA Dedicated to 5G Cognitive WSSNs

Imen Badri and Mahmoud Abdellaoui

WIMCS Research Team, ENET’COM, Sfax-University, Sfax, Tunisia
{Imenbadri1,mahmoudabdellaoui4}@gmail.com

Abstract. Wireless smart sensors networks (WSSNs) are expected to play a significant role in the Internet of Things (IoT) and in wireless application service delivery, for example in healthcare, environmental monitoring and intelligent agriculture. Cognitive Radio is promising for handling spectrum efficiently; however, the Cognitive Radio approach for WSSNs does not use spectrum efficiently, because these networks also suffer from interference, which induces collisions. In this paper, we present a dynamic spectrum access scheme for WSSNs based on the likelihood distribution of channel availability, obtained with a continuous-time Markov chain that considers the transmitting primary users, temporal channel usage, channel pattern and spatial distribution. In addition, as a promising 5G technique, Multiple-Input Multiple-Output Orthogonal Frequency Division Multiple Access (MIMO-OFDMA) based Cognitive Radio schemes are proposed to significantly improve system capacity while mitigating interference in dynamic spectrum access networks. The paper focuses on energy-efficient spectrum sensing that employs dedicated smart sensors and virtualized-operated networks. The experimental results show that the proposed approach improves the overall spectrum efficiency of Cognitive Radio wireless smart sensors networks. Regarding power-allocation policies for the MIMO-OFDMA based Cognitive Radio network, a set of simulations shows that the proposed scheme outperforms existing schemes in terms of effective capacity when implementing heterogeneous statistical QoS over the MIMO-OFDMA based Cognitive Radio network. Simulations also show an improvement in the lifetime and energy efficiency of the virtualized-operated network.

Keywords: Dynamic spectrum access · Virtualized operated networks · MIMO-OFDMA · 5G cognitive WSSNs · Intelligent agriculture · Automated irrigation system

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 185–202, 2020. https://doi.org/10.1007/978-3-030-12388-8_14

1 Introduction

The evolution of technology in computer science and electronics has allowed the emergence of wireless sensor networks (WSNs), a special type of ad hoc network whose nodes are low-cost, low-power and easy-to-deploy sensors. They can collect and transmit data autonomously in many applications. In recent years, many wireless sensor networks have been deployed in different applications to explore the requirements, constraints and guidelines for the overall design of the sensor network architecture. A static sensor is a device sensitive to the physical quantity to be measured; it transforms the observed state of that quantity into a usable quantity, for example an electrical voltage, a height of mercury, a current intensity or the deflection of a needle. The use of WSNs in precision farming makes it possible to make the right decision for each area of the farm [1]. One advantage of the traditional WSN is the absence of cabling, which greatly reduces installation cost; others are deployment flexibility and ease of maintenance. Indeed, the sensors are autonomous and require very little human intervention in the fields, especially when the communication protocols are fault-tolerant and support node mobility. Nodes can measure different parameters, such as air and soil parameters. For this network to be functional and complete at the architecture level, it needs a radio part that transfers and exchanges information from one source to another. A WSN equipped with Cognitive Radio is called a CWSN. The CWSN is a specialized ad hoc network whose nodes are wireless sensors administered by a SINK node. A wireless sensor is an instrument for measuring physical quantities in a well-defined field; its function is to collect data and transmit them to a base station. Sensors are the basic elements of data acquisition systems [1] and are used in the field of instrumentation.
In addition to the advantages presented above, this network has many limitations and disadvantages, and it does not meet the expectations and needs of users, since the features of a static wireless sensor are limited; moreover, a static wireless sensor cannot make decisions or send orders to actuators. To overcome the problems encountered with the classic WSN, we introduce the notion of a virtualized-operated system. Virtualization is a solution used in computing to share resources (execution, storage) between multiple applications while ensuring isolation between them. Network Functions Virtualization (NFV) is a key element in optimizing the use of network resources: it virtualizes functions usually implemented in proprietary hardware, thus reducing investment and operating costs for operators. Virtualized-operated networks are more agile, more scalable, more innovative and, above all, better adapted to operational requirements. Put another way, network virtualization consists of virtualizing pools of application resources, namely computation, storage, network services (DHCP, DNS), software firewalls and load balancing. In this work, we model a specific intelligent sensors network system based on virtualized-operated networks, applying the MIMO-OFDMA technique and Cognitive Radio. To solve the problems of the CWSN, we propose another type of network, made up of intelligent sensor nodes equipped with a Cognitive Radio for effective spectrum management, called the cognitive wireless smart sensors network (CWSSN); it is based on a Cognitive Radio network, virtualized-operated networks, smart sensors and MIMO-OFDMA technology.
It is a complete, dedicated system for different applications. In this research, we propose to apply our new network in the field of agriculture, to solve irrigation problems and to enable a comfortable, modernized, automated and intelligent agriculture with very good product quality. To ensure the proper functioning of this network and its security against attacks and intrusions, we introduce the concept of Software-Defined Wireless Sensors Cognitive Radio Networks (SD-WSCRNs). This intelligent sensors network consists of intelligent sensor nodes and a base station to which all detected data are transmitted. The network can be reconfigured remotely after deployment and when its database becomes saturated, empty or populated. It is agile and adapts to topological changes. It is also programmable, which facilitates network management [2]. To meet spectrum challenges, smart sensor nodes are equipped with a Cognitive Radio (CR), one of the technologies that enables intelligent sensor nodes (as secondary users) to detect and temporarily use underutilized licensed spectrum when primary users (PUs) do not use it [3]. In this way, existing problems of the CWSN are addressed, such as bandwidth, response time and resolution, among other limits. There are also several problems related to auto-configuration, self-monitoring, learning, decision-making and actuator control. Conventional wireless sensor networks are used in several fields of application [1]. Among these is agriculture, where such networks have not met the expectations and needs of farmers, since they require a large number of workers performing manual work that is prone to errors, first in the quantity of water delivered and second in the timing of water release and shutdown. For this reason, we propose our CWSSN, which can provide important support for agricultural practices and thus enable intelligent agriculture.
This paper is organized as follows. Section 2 defines the model of the proposed system. Section 3 presents the theoretical analysis of the network metrics. The application and simulations are presented in Sect. 4. Finally, Sect. 5 concludes the article and outlines future prospects.

2 Cognitive Wireless Smart Sensors Network System Model Based on MIMO-OFDMA

To overcome the limitations of the classic CWSN equipped with a Cognitive Radio, discussed in the previous section, we propose a specific wireless sensor network model consisting of smart sensors, called the Cognitive Wireless Smart Sensors Network (CWSSN). This type of network represents a new generation of embedded systems running in real time, with significant communication constraints that differ from those of the traditional network. The key element of the CWSSN is the intelligent sensor, an embedded system with several intelligent features such as self-diagnosis, self-calibration, remote reconfiguration, ease of maintenance, real-time storage of data on malfunctions and possibly fault diagnosis. Thanks to the complementarity of these functionalities, the intelligent sensor can be seen as a microsystem that intelligently manages the operation of an application. Several other features can also be integrated, namely:


• Advanced communication, in the form of a two-way digital dialogue with other related microsystems or smart sensors.
• The generation of virtual quantities (quantities that are not directly measurable but are derived from the available physical quantities and calculation models).
• Automatic switching to a fallback position when a malfunction occurs.
• Self-adaptation, that is, automatic choice of the scale best suited to the quantities to be measured.

In this study, we replace the static wireless sensor with a smart sensor. Smart sensor technology can detect and connect objects in different applications, particularly intelligent agriculture. The proposed network is an interconnected wireless network; such networks have grown massively because of the vast potential of wireless smart sensors to connect the physical world to the virtual world. It is a multi-hop network. CWSSN nodes are located in a monitoring field in a distributed manner. This CR-equipped network can help overcome bandwidth limitations by detecting spectrum holes and using MIMO-OFDMA technology, which improves spectrum sensing utilization on the one hand and minimizes interference on the other. Because of the characteristics and limitations of smart sensor nodes, coupling cognitive technology with them introduces other challenges, such as spectrum detection, spectrum sharing and spectrum management [4]. As the use of Cognitive Radio increases day by day, the problems associated with spectrum detection are also increasing. In our CWSSN, these communication problems are addressed through Cognitive Radio technology. The cognitive technique is the process of acquiring knowledge through perception, planning, reasoning, action and continuous updating with a learning history; this makes a network of distributed smart sensors possible.
CR offers several advantages: efficient use of the spectrum, room for new technologies, the use of several channels, energy efficiency, the use of the spectral band specific to each application and a reduction in attacks. One of the primary goals of integrating a CR into wireless smart sensors is to use unused licensed spectrum opportunistically. The overarching problem facing the CWSSN is implementing an effective spectrum access mechanism that responds to the existing challenges. In this work, we propose a mechanism for opportunistic access to the primary users' spectrum using a continuous-time Markov chain. Consider a network of cognitive radio wireless smart sensors that consists of a set of primary users, such as the base station (sink), and unlicensed secondary users (smart sensor devices). An intelligent sensing device is equipped with a Cognitive Radio for opportunistic access to licensed spectrum [4, 5]. Let U be the nonempty set of opportunistic channels accessible to intelligent sensors. Channel availability for an intelligent sensor device is limited by the characteristics of the primary users and by the positions of the smart sensor devices. The spectrum opportunity depends on channel availability, defined as the time during which the channel is available to intelligent sensing devices [4].
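To make the continuous-time Markov chain idea concrete, the following minimal sketch models each licensed channel as a two-state chain (idle or busy). This two-state model, the rate values and the function names are assumptions made for illustration; the text does not specify the chain used. With primary-user arrival rate λ (idle to busy) and departure rate μ (busy to idle), the probability that a channel is idle at time t has a closed form, which a secondary node could use to rank the channels in U.

```python
import math


def idle_probability(t, pu_arrival_rate, pu_departure_rate, idle_at_start=True):
    """P(channel idle at time t) for a two-state (idle/busy) continuous-time Markov chain."""
    total = pu_arrival_rate + pu_departure_rate
    steady_idle = pu_departure_rate / total        # long-run fraction of idle time
    decay = math.exp(-total * t)                   # memory of the initial state fades
    if idle_at_start:
        return steady_idle + (1.0 - steady_idle) * decay
    return steady_idle * (1.0 - decay)


def rank_channels(channels, horizon):
    """Sort channel names by their probability of still being idle after `horizon`."""
    return sorted(
        channels,
        key=lambda name: idle_probability(horizon, *channels[name]),
        reverse=True,
    )


# Hypothetical (arrival rate, departure rate) pairs per channel, in events per second
channels = {"ch1": (0.5, 2.0), "ch2": (2.0, 0.5), "ch3": (1.0, 1.0)}
print(rank_channels(channels, horizon=1.0))
```

A channel whose primary user arrives rarely and leaves quickly ranks first, which matches the intuition that it offers the longest transmission opportunities.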


Recall that Cognitive Radio is an emerging technology intended to solve the spectrum scarcity problem in wireless smart sensors communication networks, especially in virtualized-operated communication networks. Spectrum sensing, which enables dynamic spectrum access, is one of the most complex and power-intensive tasks in a Cognitive Radio. The smart sensor nodes within the communication range of each Cognitive Radio form clusters around it to sense the spectrum. These clusters (see Fig. 1) are further divided into disjoint subsets to improve energy efficiency; only one subset is active at a time to sense the spectrum. The benefits and application of dedicated smart sensors and virtualized-operated networks for Cognitive Radio spectrum sensing were developed and discussed in [4]. Sensing via a separate, dedicated smart sensors and virtualized-operated network is a promising approach to energy-efficient spectrum sensing. It benefits primary users and makes it possible to detect even a weak primary user signal effectively [4, 6–8].
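The subset rotation described above can be sketched as follows. This is an illustrative scheme only: the round-robin partition, the node names and the function names are assumptions, not the scheduling policy used by the authors.

```python
from itertools import cycle


def split_into_subsets(node_ids, n_subsets):
    """Partition a cluster into n_subsets disjoint groups (round-robin assignment)."""
    subsets = [[] for _ in range(n_subsets)]
    for index, node in enumerate(node_ids):
        subsets[index % n_subsets].append(node)
    return subsets


def sensing_schedule(subsets, n_slots):
    """Return the subset that senses in each slot; every other subset sleeps."""
    rotation = cycle(subsets)
    return [next(rotation) for _ in range(n_slots)]


cluster = [f"node{k}" for k in range(7)]           # hypothetical cluster members
subsets = split_into_subsets(cluster, 3)
for slot, active in enumerate(sensing_schedule(subsets, 4)):
    print(f"slot {slot}: sensing = {active}")
```

With three subsets, each node senses in roughly one slot out of three, which is where the energy saving comes from.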

Fig. 1. CWSSN network model

New-generation communication systems promise a wide variety of new features requiring the exchange of a very wide range of information. Older generations (2G, 3G and 4G) do not meet these growing communication needs, which calls for a new generation, 5G. 5G designs require MIMO antenna arrays with hundreds of antenna elements on base stations (eNodeB). MIMO technology uses spatial multiplexing gain to achieve high spectral efficiency. Since several transmit antennas can be applied to OFDM-based Cognitive Radio systems, researchers have proposed a very promising candidate, MIMO-OFDMA, which can compensate for the lack of capacity while increasing spectral efficiency [4, 9]. OFDMA is considered a modulation and multiple access technique for 5G wireless networks. It is an extension of orthogonal frequency division multiplexing (OFDM) that can achieve adaptive bandwidth allocation in intelligent wireless cognitive sensor networks. MIMO comes in two categories, depending on how the base station antennas are used to serve users: single-user MIMO and multi-user MIMO. In single-user MIMO, all streams from the base station antennas are directed to a single user. In multi-user MIMO, different streams, produced using combinations of different antennas, are directed to different users or subscribers [4, 10].

The model presented in Fig. 1 is formulated as a graph $R = (N; L; C; F)$, where $N$ is the set of network nodes; $L$ is the set of links, that is, the communication pairs that can carry transmissions; $C$ is the set of spectrum channels currently assigned to the links in $L$; and $F$ is the set of simultaneous multi-hop data streams in the network. We consider Cognitive Radio networks based on MIMO-OFDMA in which secondary source nodes $(E_1, E_2, \ldots, E_K)$, a MIMO-based relay station and secondary destination nodes $(R_1, R_2, \ldots, R_K)$ coexist with the primary users (PUs), as shown in Fig. 1. Each intelligent transmit/receive sensor node is assumed to be equipped with one transmit antenna and one receive antenna, while the relay station has $A_E$ broadcast antennas and $A_R$ receive antennas. In our application, we assume a finite number of antennas, $A_E = A_R = 100$ transmit/receive antennas on a given path.
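One way to hold the graph model R = (N; L; C; F) in memory is sketched below. The class name `NetworkModel`, its methods and the channel numbers are hypothetical and chosen only for illustration; the node names E1, relay and R1 follow the source/relay/destination roles of Fig. 1.

```python
from dataclasses import dataclass, field


@dataclass
class NetworkModel:
    nodes: set = field(default_factory=set)           # N: network nodes
    links: set = field(default_factory=set)           # L: (transmitter, receiver) pairs
    channel_of: dict = field(default_factory=dict)    # C: channel assigned to each link
    flows: list = field(default_factory=list)         # F: multi-hop data streams (node paths)

    def add_link(self, tx, rx, channel):
        """Register a link and the spectrum channel currently assigned to it."""
        self.nodes.update((tx, rx))
        self.links.add((tx, rx))
        self.channel_of[(tx, rx)] = channel

    def add_flow(self, path):
        """Register a multi-hop flow; every hop must be an existing link."""
        hops = list(zip(path, path[1:]))
        missing = [hop for hop in hops if hop not in self.links]
        if missing:
            raise ValueError(f"flow uses undefined links: {missing}")
        self.flows.append(list(path))

    def links_on_channel(self, channel):
        """Links that share a given spectrum channel (and so may interfere)."""
        return {link for link, ch in self.channel_of.items() if ch == channel}


model = NetworkModel()
model.add_link("E1", "relay", channel=1)
model.add_link("relay", "R1", channel=2)
model.add_flow(["E1", "relay", "R1"])
print(sorted(model.nodes), model.links_on_channel(1))
```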
In this research, the model is considered a virtualized-operated network composed of clusters. In our intelligent farming application, the clusters correspond to rows of apricot, peach and olive trees. Each cluster has a responsible node called the Cluster Head (CH) or SINK, the group leader that coordinates cluster members, aggregates and processes data and passes them to the data collector. Clustering serves several objectives: reducing energy consumption, minimizing communication, balancing load, extending network lifetime, increasing total connectivity, reducing delays, optimizing bandwidth and improving Quality of Service (QoS). Cluster members (smart sensor nodes) can access neighboring clusters and route data between them. They measure the necessary data and send them to the SINK. The SINK then processes the data streams and transmits them over the 5G operated network to the base stations, which are based on MIMO technology.
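The aggregation role of the Cluster Head can be illustrated with a small sketch: instead of relaying every member reading, the CH forwards one summary per cluster, which is one way clustering reduces communication. The field names, sample values and the use of a plain average are assumptions made for this example.

```python
from statistics import mean


def aggregate_cluster(readings):
    """Summarise member readings into one record for the data collector.

    readings -- list of dicts such as {"node": "n1", "soil_moisture": 31.0}
    """
    quantities = sorted({key for r in readings for key in r if key != "node"})
    summary = {q: mean(r[q] for r in readings if q in r) for q in quantities}
    summary["members"] = len(readings)              # how many nodes contributed
    return summary


# Hypothetical readings from one row of trees
row_readings = [
    {"node": "n1", "soil_moisture": 30.0, "air_temperature": 20.0},
    {"node": "n2", "soil_moisture": 34.0, "air_temperature": 22.0},
    {"node": "n3", "soil_moisture": 32.0},
]
print(aggregate_cluster(row_readings))
```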

3 Dynamic Spectrum Access, Spectral Allocation and Metric Networks

A dynamic spectrum access scheme is a technique that allows unlicensed users to dynamically access unused licensed bands, so as to minimize unused spectral bands or white spaces. In such a scheme, unlicensed users essentially use unused licensed frequency bands at no charge. When the licensed user starts using the frequency band, the unlicensed user must release the band and move to another idle band. The CR technique and the dynamic spectrum access scheme are therefore essential solutions for increasing spectrum use in cognitive radio wireless smart sensors networks [4].

In this paper, the channel impulse response from the $i$th transmit antenna to the $j$th receive antenna at $SU_k$ at time $t$, denoted $U_{i,j}^{l}(t)$, can be expressed as follows:

$$U_{i,j}^{l}(t) = \alpha_{i,j}^{l}\,\delta\!\left(t - \tau_{i,j}^{l}\right)\exp\!\left(J\phi_{i,j}^{l}\right) \tag{1}$$

where $J = \sqrt{-1}$, $i \in \{1, 2, \ldots, A_R\}$, $j \in \{1, 2, \ldots, A_E\}$ and $l \in \{1, 2, \ldots, L\}$; $\tau_{i,j}^{l}$ is the path delay, $\alpha_{i,j}^{l}$ is the envelope of the path and $\phi_{i,j}^{l}$ is the phase shift. We assume that the $\{\phi_{i,j}^{l}\}$ are independent and identically distributed (i.i.d.) random variables, uniformly distributed on $[0, 2\pi]$, and that the $\{\alpha_{i,j}^{l}\}$ are i.i.d. random variables. The path gains are assumed to be invariant within a frame duration $T_f$ but to vary independently from frame to frame. In subsequent discussions, we omit the time index $t$ when representing the channel impulse response. The impulse response of the $A_R \times A_E$ channels to the secondary relay station, denoted $U^{(l)}$, with $A_R = A_E = 100$, is:

$$U^{(l)} = \begin{bmatrix} U_{0,0}^{l} & U_{0,1}^{l} & \cdots & U_{0,99}^{l} \\ \vdots & \vdots & \ddots & \vdots \\ U_{99,0}^{l} & U_{99,1}^{l} & \cdots & U_{99,99}^{l} \end{bmatrix} \tag{2}$$

where $U_{i,j}^{l}$, with $i \in \{0, 1, \ldots, A_R - 1\}$, $j \in \{0, 1, \ldots, A_E - 1\}$ and $l \in \{1, 2, \ldots, L\}$, is the channel impulse response specified by (1). Assuming that the channel state information (CSI) is perfectly known at the receiver, the singular value decomposition (SVD) of the channel impulse response matrix is:

$$U^{(l)} = H^{(l)} \Lambda^{(l)} \left(V^{(l)}\right)^{\dagger} \tag{3}$$

where $(\cdot)^{\dagger}$ denotes the conjugate transpose; $H^{(l)} \in \mathbb{C}^{A_R \times A_E}$ and $V^{(l)} \in \mathbb{C}^{A_R \times A_E}$ are unitary matrices; and $\Lambda^{(l)}$ is a rectangular matrix whose diagonal elements $\lambda_{l,1} \ge \lambda_{l,2} \ge \cdots \ge \lambda_{l,A_{\min}} \ge 0$, with $A_{\min} = \min\{A_R, A_E\}$ $(1 \le A_{\min})$, are the ordered singular values of $U^{(l)}$. Consequently, the whole wideband MIMO channel decomposes into $A_{\min}$ independent subchannels. Since the transmit node and the receive node are assumed to be far apart, there is no direct link between them, owing to shadowing conditions and power limits in the network. The relay process consists of two steps. In the first step, the transmitting nodes transmit signals to the relay station after detecting spectral white space.
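The decomposition into independent subchannels can be illustrated numerically. To stay within the Python standard library, the sketch below uses a 2 × 2 channel instead of the 100 × 100 case, for which the two singular values have a closed form. The Rayleigh-distributed envelope is an assumption (the text only states that the envelopes are i.i.d.), and the function names are hypothetical.

```python
import cmath
import math
import random


def channel_tap(rng):
    """One tap gain alpha * exp(J * phi), with phi uniform on [0, 2*pi).

    The envelope alpha is drawn Rayleigh with unit mean-square value (assumption).
    """
    alpha = math.sqrt(-math.log(1.0 - rng.random()))
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return alpha * cmath.exp(1j * phi)


def singular_values_2x2(matrix):
    """Singular values of a 2x2 complex matrix, largest first.

    They are the square roots of the eigenvalues of M^H M, whose trace is the
    squared Frobenius norm of M and whose determinant is |det M|^2.
    """
    (a, b), (c, d) = matrix
    frob_sq = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det_sq = abs(a * d - b * c) ** 2
    disc = math.sqrt(max(frob_sq ** 2 - 4.0 * det_sq, 0.0))
    return (
        math.sqrt((frob_sq + disc) / 2.0),
        math.sqrt(max((frob_sq - disc) / 2.0, 0.0)),
    )


rng = random.Random(0)                              # fixed seed for repeatability
channel = [[channel_tap(rng) for _ in range(2)] for _ in range(2)]
lam1, lam2 = singular_values_2x2(channel)
print(f"subchannel gains: {lam1:.3f}, {lam2:.3f}")
```

Each singular value acts as the amplitude gain of one independent subchannel; for larger antenna counts a numerical SVD routine would replace the closed-form 2 × 2 expression.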

The received signal from the cognitive source node to the relay station, denoted $y_S^{(l)}$, can be expressed as follows:

$$y_S^{(l)} = \sqrt{p_S^{(l)}}\,U_S^{(l)} x^{(l)} + \sqrt{p_P}\,U_P^{(l)} x^{(P)} + \sum_{n \ne l}^{L} \sqrt{p_S^{(n)}}\,U_S^{(n)} x^{(n)} + n_S^{(l)} \tag{4}$$

where $p_S^{(l)}$ and $p_S^{(n)}$ are the transmission powers of the $l$th and $n$th source nodes, respectively; $p_P$ is the transmission power of the PU base station (BS); $x^{(l)}$ and $x^{(n)}$ denote the signals sent by the $l$th and $n$th source nodes, respectively; $x^{(P)}$ denotes the signal sent by the PU BS; $U_S^{(l)}$ and $U_S^{(n)}$ are the channel impulse responses from the transmitters of the $l$th and $n$th source nodes to the receiver of the relay node, as specified by (2); $U_P^{(l)}$ is the channel impulse response from the PU base station to the relay node receiver, as specified by (2); and $n_S^{(l)}$ is additive white Gaussian noise (AWGN) with zero mean and variance $\sigma_S^2$ [4, 9].

With regard to dynamic spectrum access, a fixed spectrum allocation scheme is generally used in CWSSN deployment. A CWSSN can be used in unlicensed and licensed bands [4, 10]. Licensed spectrum has a direct cost, which increases the cost of the network. However, LPWAN and SigFox-LoRaWAN networks can also operate in the unlicensed band. Access to multiple channels is needed to comply with different spectrum regulations: spectrum availability varies, and a band that is available in Tunisia may not be available in another country. Thus, smart sensor nodes designed with a predefined frequency band create a problem for the user. This problem can be overcome with Cognitive Radio, which changes the communication frequency according to spectrum availability. In the classical WSN [1], Dynamic Spectrum Access (DSA) is considered the main key to solving the global spectrum shortage. The open wireless medium exposes DSA systems to unauthorized use of the spectrum by illegitimate users. Secondary user authentication is therefore essential to ensure the proper functioning of DSA systems [1, 11–14].
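The frequency agility described above, where a node changes channel according to local availability and regulation, can be sketched as a simple selection rule. The function name, the region labels and the channel numbers are hypothetical; a real node would obtain the busy set from spectrum sensing.

```python
def select_channel(current, allowed_channels, busy_channels):
    """Pick the channel a secondary (unlicensed) node should use next.

    current          -- channel in use now, or None
    allowed_channels -- channels permitted by local regulation, in preference order
    busy_channels    -- channels on which a primary user was sensed
    Returns the chosen channel, or None when every allowed channel is busy.
    """
    if current in allowed_channels and current not in busy_channels:
        return current                      # no primary user present: stay put
    for channel in allowed_channels:
        if channel not in busy_channels:
            return channel                  # vacate and hop to the first idle band
    return None                             # nothing idle: stay silent


# Hypothetical channel plans for two regulatory regions
PLANS = {"region_a": [1, 2, 3, 4], "region_b": [3, 4, 5]}

print(select_channel(2, PLANS["region_a"], busy_channels={2}))
print(select_channel(2, PLANS["region_b"], busy_channels={3}))
```

The same node firmware can thus be deployed under different channel plans without a predefined frequency band.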
Crowd-based detection of spectrum misuse eliminates the need for dedicated sensor deployment and dramatically reduces deployment and maintenance costs. Applying the foregoing to CWSSNs, and in particular to a DSA system for CWSSNs, the spectrum owner leases its underutilized licensed spectrum to unlicensed users. To improve spectrum efficiency in CWSSNs, the spectrum owner can regulate spectrum access by issuing spectrum authorizations, each specifying a frequency channel, a geographic area and a duration. A valid spectrum authorization permits use of the corresponding frequency channel within the specified area and duration. In traditional WSNs [1], Quality of Service (QoS) is typically characterized by four parameters: bandwidth, delay, jitter and reliability. To avoid dangerous consequences in critical applications, especially in CWSSNs, local-storage virtualized-operated networks must maintain an adequate level of QoS. QoS support is a complex issue because of resource constraints in wireless smart sensor nodes, such as processing power, memory and power sources. QoS is a management concept that aims, on the one hand, to optimize the resources of a virtualized-operated network or a process and, on the other hand, to guarantee good performance for the applications that are critical to the organization. QoS makes it possible to offer users different speeds and response times depending on the applications considered and the protocols implemented at the structure level. Different objective functions are used to measure communication quality metrics in the network. Three objective functions of wireless communication performance are considered in this paper: maximizing throughput (5), reducing power consumption (6) and minimizing the bit error rate (BER) (7):

$$f_{\max\_debit} = \frac{\log_2(M)}{\log_2(M_{\max})} \tag{5}$$

$$f_{\min\_energy} = 1 - \frac{P}{P_{\max}} \tag{6}$$

$$f_{\min\_BER} = 1 - \frac{\log_{10}(0.5)}{\log_{10}(P_{BER})} \tag{7}$$
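The three objective functions (5)–(7) can be evaluated directly, as in the sketch below. The interpretation of the symbols is an assumption, since the text does not define them: M is taken as the modulation order in use, M_max as the largest available order, P and P_max as the current and maximum transmit power, and P_BER as the bit error probability. Each function returns a score where larger is better.

```python
import math


def f_max_throughput(m, m_max):
    """Eq. (5): log2(M) / log2(M_max); reaches 1 at the highest modulation order."""
    return math.log2(m) / math.log2(m_max)


def f_min_energy(p, p_max):
    """Eq. (6): 1 - P / P_max; approaches 1 as transmit power goes to zero."""
    return 1.0 - p / p_max


def f_min_ber(p_ber):
    """Eq. (7): 1 - log10(0.5) / log10(P_BER); equals 0 at the worst case P_BER = 0.5."""
    return 1.0 - math.log10(0.5) / math.log10(p_ber)


# Hypothetical operating point: 16-QAM out of 64-QAM, a quarter of the power budget
print(f_max_throughput(16, 64), f_min_energy(0.25, 1.0), f_min_ber(1e-3))
```

Normalising all three objectives to comparable scores makes it straightforward to combine them, for example in a weighted sum, although the weighting itself is not specified here.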

We also study the Signal-to-Interference-plus-Noise Ratio (SINR), which is used to measure the quality of communications. For a link transmission, the SINR is the received power of the intended signal at the receiver divided by the sum of the noise power and the received powers of unintended signals (interference) from other links on the same spectrum channel. For a link $(i, j)$ on spectrum channel $c$, the SINR is calculated as follows:

$$SINR_{ij}(c) = \frac{G_{ij}\,p_i}{\sigma^2 + \sum_{(a,b) \in I(c),\,(a,b) \ne (i,j)} G_{aj}\,p_a} \tag{8}$$

where $p_i$ is the transmission power of sender $i$ (in this paper, the transmit power of all links is assumed to be at a fixed level); $G_{ij}$ is the channel gain between transmitter $i$ and receiver $j$, which can be written $K/(d_{ij})^{\alpha}$; $K$ is the path loss constant; $d_{ij}$ is the distance between $i$ and $j$; $\alpha$ is the path loss exponent; $\sigma^2$ is the thermal noise, which can be considered constant; the summation represents the total interference at receiver $j$ generated by the links transmitting concurrently on the current spectrum channel; and $I(c)$ is the set of links sharing spectrum channel $c$ [4]. To ensure effective transmission on a link, each intended signal must be decoded successfully at the receiver.
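A minimal sketch of the SINR computation in (8) follows, using the stated gain model G = K / d^α and a single fixed transmit power for all links. The node positions, the default constants (K = 1, α = 3) and the noise power are hypothetical values chosen only to make the example run.

```python
import math


def channel_gain(distance, path_loss_constant=1.0, path_loss_exponent=3.0):
    """G = K / d**alpha, the path-loss gain model stated for Eq. (8)."""
    return path_loss_constant / distance ** path_loss_exponent


def sinr(link, links_on_channel, positions, tx_power, noise_power):
    """SINR of `link` = (tx, rx) given all links sharing the same channel."""
    tx, rx = link
    signal = channel_gain(math.dist(positions[tx], positions[rx])) * tx_power
    # Interference reaches rx from the transmitter of every other concurrent link
    interference = sum(
        channel_gain(math.dist(positions[other_tx], positions[rx])) * tx_power
        for other_tx, other_rx in links_on_channel
        if (other_tx, other_rx) != link
    )
    return signal / (noise_power + interference)


positions = {"s1": (0.0, 0.0), "r1": (10.0, 0.0), "s2": (10.0, 20.0), "r2": (30.0, 20.0)}
links = [("s1", "r1"), ("s2", "r2")]
value = sinr(("s1", "r1"), links, positions, tx_power=1.0, noise_power=1e-6)
print(f"SINR = {value:.2f} ({10 * math.log10(value):.1f} dB)")
```

A link would be considered decodable when this value exceeds a receiver-specific threshold, which is not given in the text.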


4 Spectral Detection Based on Cyclo-Stationary Intelligent Agriculture Application and Simulation The use of the sensors network in the field of agriculture is not new. Good crop irrigation is one of the most important aspects of product performance and quality. If the plant does not receive the necessary amount of water and therefore not enough moisture in the soil, it dries and veins. On the other hand, an excess of water or exceeding the water limit causes too much moisture which is suitable for the appearance and development of plant diseases [4, 15]. In addition, unnecessary irrigation is significantly wasting water resources. In this context, the intelligent sensors nodes for measuring soil moisture (dendrometers) and air temperature on agricultural land have been developed. The measured values from the sensors nodes had to be reliably transferred with minimal power consumption, regardless of their position in the field and the distance to the central control unit of the system. CWSSN has become a logical choice because it provides all the requirements. The proposed intelligent irrigation system (see Fig. 2.) is designed to autonomously decide when to activate and deactivate electronic valves for the release/shutdown of water in the agricultural field [16]. The main control unit manages the autonomous decision-making process. Thus, in addition to the automated control of the irrigation process, the user can configure an autonomous operation of the system. In this mode of operation, the appropriate decision on irrigation intervals is made by an intelligent irrigation algorithm, based on ambient temperature, soil moisture, brightness (day/night) and the season considered (spring, summer, autumn and winter) applied on the SigFox CWSSN [16]. In order to have a complete study, we detail the operating principle of the irrigation system of intelligent agriculture application. Figure 2. 
shows and presents the functioning diagram of the new irrigation management deployment. This irrigation management system architecture deployment is based on smart sensors technology and consists in allowing remote control of the irrigation system to facilitate the management of the water network [17]. Then, new irrigation management system allows an automatic control of the electronic valves that close or open the water flow. The system optimizes water consumption because it irrigates with the proper amount according to weather conditions and the plant’s needs [16]. It’s saving resource such as water with IoT technology and contributing to enhance the environment too [16–18]. To provide and to prospect the need of water for a parcel irrigation with a good estimation and precise accurate quantity of water [16, 17, 19]. Some apricot-peach and olive roots are too dry while others are water logged smart sensors. The moisture soil smart sensors are simultaneously placed at different depths, under each a tree (apricot/peach/olive), the local water retention in the soil can be assessed. By measuring evapotranspiration it is possible to work out how much irrigation water is being actually absorbed by the plants [16, 17, 20]. To promote agricultural research, to deliberate an important issues of agricultural research and to select/to use the better system versus nature of plants. For this, we have installed, at the same time and to be placed on the different footings: one connected at apricot-peach orchard whereas the other at olive orchard, two different wireless smart sensors systems to monitor soil water status to plan irrigation in an olive orchard and in peach-apricot


Fig. 2. Application for IoT intelligent agriculture [16]

orchard. Data were recorded in the same way at both sites, but were transmitted to the platform over two different connections (5G and SigFox). Two Libelium Waspmote Plug & Sense Smart Agriculture units were deployed with Watermark smart sensors at different depths to monitor soil moisture, fruit diameter dendrometer smart sensors to measure the size of the fruit (see Fig. 3), and temperature and humidity smart sensors to monitor environmental conditions. One of the smart sensor platforms is connected to a 5G shield and the other to SigFox. The information collected by the smart sensors is sent to the platform, which includes both 5G and SigFox technologies. To manage the SigFox-LoRaWAN stations, however, a server had to be configured. A Meshlium IoT Gateway with an embedded management system was used to make data handling easier. Farmers can obtain valuable information for scheduling irrigation so as to avoid stress conditions, which is fundamental for apricot-peach and olive plants [16, 17, 21].

Agricultural smart sensor networks using Waspmote send data over the SigFox-LoRaWAN communication system. Alarms can also be sent to the mobile phone network using Waspmote's LoRa board. Data gathered by the Waspmote smart sensor platform can be sent to a gateway or directly to the cloud. The information collected in the Meshlium Gateway can be visualized in a platform that aggregates it and shows the state of each parcel. The application can be controlled from computers, smartphones, and tablets. A multi-protocol router gathers all the data from the smart sensor nodes and stores them in the cloud. The new smart agriculture IoT kit is factory programmed and enables monitoring of environmental parameters in agriculture. The kit includes a visualization plugin in Meshlium where users can check data in real time, display a graph of every measured parameter between two points in time, geo-locate the smart nodes via GPS, and compare different parameters in the same node.

We distinguish two types of intelligent sensor: those used in the ground and those used on the plant. In our application, we can use Waspmote temperature and humidity sensors and agricultural product monitoring sensors called dendrometers (see Fig. 3).

Fig. 3. Apricot fruit diameter dendrometer smart sensors [16]

The deployment of intelligent sensor nodes in the plot to be monitored plays an essential role in the design of an efficient and cost-effective irrigation system for intelligent agriculture. In this context, we established a deployment topology for the intelligent sensors of the irrigation system designed for our intelligent farming application, as shown in Fig. 4. The smart sensor array consists of a centrally located receiver connected to a laptop computer and multiple smart sensor nodes installed in the field (irrigation parcel). Each node consists of smart sensors (soil moisture smart sensors, thermocouples, Waspmote, …) and an active transmitter that sends data to the receiver. The smart sensor boards acquire the sensor values and transmit them wirelessly (CR/5G/SigFox-LoRaWAN) to the centrally located radio frequency receiver. Each board can read up to three Watermark granular resistive-type soil moisture smart sensors and up to four thermocouple temperature smart sensors.


Fig. 4. Network topology of application IoT intelligent agriculture

The topology of the Cognitive-Radio IoT used in the simulation of the intelligent farming application is shown in Fig. 4, where 100 nodes are randomly deployed over an area of 2000 m × 1000 m. Each smart sensor node is equipped with a single TRX. Several data streams are active in the network simultaneously. The transmission power of each link is set to 13 dBm and the thermal background noise $\sigma^2$ is equal to −100 dBm. The gain of the channel is defined as $h_{ij} = k/(d_{ij})^{\alpha}$, where $d_{ij}$ is the distance between smart sensors $i$ and $j$. We adopt the path loss constant $k = 1$ and the path loss exponent $\alpha = 5$.

Figure 5 shows an example of a Multiple Inputs-Multiple Outputs (MIMO) system, which uses multiple antennas at both the transmitter and the receiver of a wireless communication system. MIMO systems are increasingly used in communication systems for the potential capacity gains that multiple antennas provide. Figure 6 shows the BER performance as a function of SINR for binary PSK modulation over an AWGN channel and compares theory with simulation.

The genetic algorithm (GA) is a method of solving optimization problems, with or without constraints, based on a natural selection process. We used this algorithm to simulate the metrics of our new CWSSN network, which are presented in Eqs. (8), (6) and (5). Figure 7 shows the SINR simulation, Fig. 8 the energy consumption simulation, and Fig. 9 the throughput simulation.
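The link parameters above can be turned into a small numerical check. The following Python sketch is an illustration only, not the simulation code used for the figures: it assumes distances in metres, ignores interference (so it yields an SNR rather than an SINR), and uses the textbook BPSK error rate over AWGN, $\frac{1}{2}\,\mathrm{erfc}(\sqrt{E_b/N_0})$, for the theoretical curve. The function names are hypothetical.

```python
import math

TX_POWER_DBM = 13.0   # transmission power of each link (from the text)
NOISE_DBM = -100.0    # thermal background noise sigma^2 (from the text)
K = 1.0               # path loss constant k
ALPHA = 5.0           # path loss exponent alpha


def dbm_to_mw(dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10)


def channel_gain(distance_m: float) -> float:
    """Channel gain h_ij = k / d_ij^alpha (distance assumed to be in metres)."""
    return K / distance_m ** ALPHA


def snr_db(distance_m: float) -> float:
    """Signal-to-noise ratio in dB for a single link, interference ignored."""
    received_mw = dbm_to_mw(TX_POWER_DBM) * channel_gain(distance_m)
    return 10 * math.log10(received_mw / dbm_to_mw(NOISE_DBM))


def bpsk_ber_awgn(ebn0_db: float) -> float:
    """Theoretical bit error rate of BPSK over an AWGN channel."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * math.erfc(math.sqrt(ebn0))


for d in (10, 20, 50):
    print(f"d = {d:3d} m  ->  SNR = {snr_db(d):6.1f} dB")
print(f"BPSK BER at Eb/N0 = 0 dB: {bpsk_ber_awgn(0.0):.4f}")
```

With an exponent of 5 the received power falls by about 15 dB each time the distance doubles, which is why the SNR drops quickly across a 2000 m × 1000 m field.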


Fig. 5. MIMO simulation

Fig. 6. Simulation of BER metric

Fig. 7. Simulation of SINR with genetic algorithm

Fig. 8. Power consumption metric simulation


Fig. 9. Throughput metric simulation

5 Conclusion

In this paper, a cognitive networking scheme with a dynamic spectrum access mechanism for WSSNs is introduced. The probability distribution of channel availability is computed using a continuous-time Markov chain, considering primary transmitting users, temporal channel usage, channel pattern, and spatial distribution. A slotted ALOHA medium access control is considered. The outcome shows a significant performance improvement: collisions are reduced by 18.5% and throughput improves by 40.2%. We proposed a resource allocation scheme that applies MIMO-OFDMA based on Cognitive Radio over virtualized-operated networks. On the other hand, the limited frequency spectrum available is not enough to meet the increasing traffic demand. Cognitive Radio technology therefore plays an important role in such a scenario by dynamically accessing the licensed spectrum. Among the cognitive tasks, spectrum sensing is found to be the most energy-consuming one. In this paper, an energy-efficient spectrum sensing technique was proposed that employs a dedicated wireless smart sensor network to sense the spectrum. The simulation results show that the proposed virtualized-operated network, with cluster, subset, and special subset formation, provides a more energy-efficient spectrum sensing approach and increases the overall lifetime of the Cognitive Radio network.

Acknowledgements. This work has been accomplished at WIMCS-Research Team, ENET'COM, Sfax-University, Tunisia. Part of this work has been supported by APIA-Tunisia Agriculture Ministry & MESRSTIC Scientific Research Group-Tunisia.


References

1. Abdellaoui, M.: Multitaskes-generic-intelligent-efficiency-secure WSNs and their applications. In: Part 4, Reliable WSNs and their Applications, pp. 186–323. LAMBERT Academic Publishing (LAP) (2017). ISBN: 978-3-330-04707-5
2. Sejaphala, L.C., Velempini, M.: Detection algorithm of sinkhole attack in software-defined wireless sensor cognitive radio networks. IEEE Glob. Wirel. Summit (GWS), 151–154 (2017). https://doi.org/10.1109/gws.2017.8300470
3. Maisuria, J., Mehta, S.: An overview of medium access control protocols for cognitive radio sensor networks. In: 4th Int. Electronic Conference on Sensors and Applications, vol. 2, no. 3, p. 135 (2017). https://doi.org/10.3390/ecsa-4-04963
4. Badri, I., Abdellaoui, M.: Spectral sensing & multi-objective spectrum allocation over MIMO-OFDMA based on 5G cognitive WSSNs for IoT intelligent agriculture. Int. J. Mod. Eng. Res. (IJMER) 6(8), 23–33 (2018). ISSN 2249-6645
5. Xu, Y., Wang, J., Wu, Q.: Opportunistic spectrum access in unknown dynamic environment: a game-theoretic stochastic learning solution. IEEE Trans. Wirel. Commun. 4(11) (2012). https://doi.org/10.1109/twc.2012.020812.110025
6. Giweli, N., Shahrestani, S., Cheung, H.: Spectrum sensing in cognitive radio networks: QoS considerations. Comput. Sci. Inf. Technol. (CS & IT) 09–19 (2015). https://doi.org/10.5121/csit.2015.51602
7. Jayakrishna, P.S., Sudha, T.: Energy efficient wireless sensor network assisted spectrum sensing for cognitive radio network. In: IEEE International Conference on Intelligent Techniques in Control, Optimization and Signal Processing (2017). ISSN 978-1-5090-47729/17
8. Ghasemi, A., Sousa, E.S.: Spectrum sensing in cognitive radio networks: requirements, challenges and design trade-offs. IEEE Commun. Mag. 4(46), 32–39 (2008). https://doi.org/10.1109/MCOM.2008.4481338
9. Zhang, X., Wang, J.: Heterogeneous statistical QoS-driven resource allocation over MIMO-OFDMA based 5G cognitive radio networks. In: IEEE Wireless Communications and Networking Conference (WCNC), pp. 1–6 (2017). 978-1-5090-4183-1/17
10. Shahraki, H.S., Mohamed-pour, K., Vangelista, L.: Sum capacity maximization for MIMO-OFDMA based cognitive radio networks. Phys. Commun. 10, 106–115 (2014). https://doi.org/10.1016/j.phycom.2012.10.002
11. Saroja, T.V., Ragha, L.: A dynamic spectrum access model for cognitive radio wireless sensor network. In: 4th International Conference on Electronics and Communication Systems (ICECS), pp. 7–11 (2017). https://doi.org/10.1109/ecs.2017.8067845
12. Akyildiz, I.F., Lo, B.F., Balakrishnan, R.: Cooperative spectrum sensing in cognitive radio networks: a survey. Phys. Commun. 1(4), 40–62 (2011). https://doi.org/10.1016/j.phycom.2010.12.003
13. Jin, X., Sun, J., Zhang, R., Zhang, Y., Zhang, C.: SpecGuard: spectrum misuse detection in dynamic spectrum access systems. IEEE Trans. Mob. Comput. 1–14 (2018). https://doi.org/10.1109/tmc.2018.2823314
14. Myrvoll, T.A., Hakegard, J.E.: Dynamic spectrum access in realistic environments using reinforcement learning. In: International Symposium on Communications and Information Technologies (ISCIT), Gold Coast, QLD, Australia, 2–5 October 2012 (2012). https://doi.org/10.1109/iscit.2012.6380943
15. Savic, T., Radonjic, M.: WSN architecture for smart irrigation system. In: IEEE 23rd International Scientific-Professional Conference on Information Technology (IT) Zabljak, Montenegro, pp. 1–4 (2018). https://doi.org/10.1109/spit.2018.8350859


16. Abdellaoui, M.: Two different smart irrigation agriculture systems to improve apricot-peach and olive production in Sidi Bouzid area. Agric. Res. J. 3(6), 62–68 (2016)
17. Abdellaoui, M.: Smart sensors & internet of things platform for remote control and identification of advanced irrigation agriculture project. In: European Advanced Materials Congress, Stockholm, Sweden, 22–24 August 2017 (2017)
18. Gubbi, J., Buyya, R., Marusic, S., Palaniswami, M.: Internet of things (IoT): a vision architectural elements and future direction. Future Gener. Comput. Syst. 29, 1645–1660 (2013)
19. Xiao, K., Xiao, D., Luo, X.: Smart water-saving irrigation system in precision agriculture based on wireless sensors network. Trans. CSAE 11(26), 170–175 (2010)
20. Veldis, G., Tucker, M., Perry, G., Kiven, G., Bednarz, C.: A real-time wireless smart sensors array for scheduling irrigation. Comput. Electron. Agric. 61, 44–50 (2008)
21. Gargouri, F., Abdellaoui, M.: Smart sensors & internet of things platform for remote control and identification of advanced irrigation agriculture project. In: Advanced Materials World Congress-American Sensors & Actuators Summit, Miami, USA, 03 August, December 2017 (2017)

Expanding Coverage of an Intelligent Transit Bus Monitoring System via ZigBee Radio Network

Ahmad Salman, Samy El-Tawab, and Zachary Yorio

College of Integrated Science and Engineering, James Madison University, Harrisonburg, VA 22807, USA
{salmanaa,eltawass,yoriozp}@jmu.edu

Abstract. Public transportation around midsize college towns has become increasingly vital as residential and commuter populations continue to grow every year. Our research team proposes a cyber-physical system that monitors the quality of service of the transit bus system around James Madison University. By using the power of the Internet of Things (IoT), it is possible to create a network of smart nodes that collect data on bus routing efficiency and ridership. Using the Big Data gathered, improvements can be made to bus route efficiency and traffic congestion in Harrisonburg, as well as in similar college towns in the future. This paper concentrates on the steps taken to improve and expand our current system's network infrastructure so that data can be collected outside the range of campus WiFi. Additionally, the security of passenger data during transmission from smart nodes and during storage is addressed.

Keywords: IoT (Internet of Things) · Cyber-physical system · Intelligent Transportation Systems · Cloud Computing · Zigbee radio · Hashing algorithm

1 Introduction

Recently, the power of IoT has played an essential role in several applications of Intelligent Transportation Systems (ITS). The United States Department of Transportation is looking for ITS that can improve transportation safety and mobility. With the expanded coverage of wireless communication, ITS uses these technologies to advance its applications [1]. Several applications (e.g., smart parking, bus systems, traffic control, incident detection) have used wireless communication to improve transportation systems [2–4]. It is clear that Intelligent Transportation has helped to decrease the number of crashes and deaths in the United States in the last ten years. According to the 2016 United States Department of Transportation (US-DoT) fatal traffic crash report, there were 3,450 deaths caused by distraction-related accidents, a decrease of 2.2%. The report found that, although distracted-driving and drowsy-driving fatalities declined, fatal crashes related to other behaviors (e.g., speeding and alcohol impairment) increased in 2016 [5]. Improving safety and decreasing the negative environmental impact of the transportation sector on society remains a central mission.

Despite all this improvement in the world of intelligent transportation, the reliability and security of these new systems are still under investigation. In this paper, we examine the data transmission issues and security vulnerabilities of the bus data collection system proposed as an example of an Intelligent Transportation System [6,7] and suggest remedies to design a more robust system.

The remainder of this paper is organized as follows: Sect. 2 gives an overview of recent related research in the field of tracking systems using communication. Section 3 presents the details of the previously deployed system that uses WiFi to monitor the Intelligent Transit Bus System, while Sect. 4 provides a solution for including nodes outside of WiFi range in our system. Section 5 discusses the security measures implemented in the recording and submission of data. Finally, Sect. 6 concludes the paper and shows directions for future work.

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 203–216, 2020. https://doi.org/10.1007/978-3-030-12388-8_15

2 Related Work

Recently, several researchers have been looking into the idea of using wireless communication as part of Intelligent Transportation Systems. For example, Tubaishat et al. [8] proposed using Wireless Sensor Networks (WSNs) as a significant improvement over traditional wired sensors for several applications (e.g., Smart Parking and Traffic Monitoring). Although most researchers have focused on how to use wireless technology in conjunction with the rapid growth of Cloud Computing (e.g., Vehicular Cloud [9,10]) and the Internet of Things (IoT) in intelligent transportation, only a few have highlighted the threat of wireless attacks [11]. Moreover, most of the latter focused on the security of Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communication [12]. Trends show that WiFi, Bluetooth, and other wireless technologies have gained more and more attention among researchers for intelligent transportation applications that do not depend on vehicle communication [13,14]. In fact, WiFi readers installed inside buses have been used to estimate the origin and destination of bus passengers [15]. However, WiFi readers inside the buses do not give accurate information about riders who gave up waiting or took another means of transportation.

The use of IoT devices for crowdsourcing has increased in the last couple of years [16]. Real-time sensing gains its efficiency from the evolution of sensing capabilities in smartphones, smart vehicles, and the Internet of Things (IoT). Smartphone usage doubled from 2010 to 2014 and was expected to double again by 2018 [17]. Almost every student, staff member, and faculty member carries a smartphone loaded with sensors that can easily be used for real-time sensing. Smart vehicles (e.g., Tesla), with all their sensing and communication capabilities (e.g., Bluetooth), are not far from the market; in 2013, their sales increased by 48% in California [18,19].

3 Overview of Tracking System via WiFi

El-Tawab et al. [6] introduced a cyber-physical system (CPS) to monitor the efficiency and quality of service of transit buses around an academic institution. This CPS detects the number of riders on each bus. It includes several components used to scan and analyze Bluetooth and WiFi data (e.g., a Raspberry Pi 3 as the IoT device and a rechargeable battery charged by a solar panel). The self-contained package is called a "smart node." Smart nodes are located at seven different bus stations at James Madison University in Virginia, United States of America. Each smart node has its wireless interface set to monitor mode to sniff wireless network traffic around the bus station (within a radius of 7 m) [20]. Using a network analyzer (e.g., Tshark) [21], we capture packets from various WiFi-enabled devices, including the arrival time, the MAC address of the device, and the strength of the WiFi signal. Because security is essential [22] (as discussed in Sect. 5), we also protect the MAC address, which is used to identify the number of riders waiting at the bus station. The data are saved in a cloud-based database, as shown in Fig. 1.
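As a rough illustration of how such captures could be turned into records, the following Python sketch parses one line of capture output. It assumes the analyzer was run with field output (`-T fields`) producing three tab-separated columns (epoch time, source MAC address, signal strength in dBm); this column layout, the `Observation` type, and the `parse_capture_line` helper are assumptions made for the example and are not taken from the deployed system.

```python
from typing import NamedTuple, Optional


class Observation(NamedTuple):
    timestamp: float  # capture time, seconds since the epoch
    mac: str          # source MAC address, lower-cased
    rssi_dbm: int     # received signal strength in dBm


def parse_capture_line(line: str) -> Optional[Observation]:
    """Parse '<epoch time>\t<source MAC>\t<RSSI>'; return None if malformed."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3 or not all(fields):
        return None  # incomplete frame, e.g. no signal field was reported
    time_text, mac, rssi_text = fields
    try:
        # If several antenna readings are listed (comma separated), keep the first.
        rssi = int(rssi_text.split(",")[0])
        return Observation(float(time_text), mac.lower(), rssi)
    except ValueError:
        return None


sample = "1490000000.25\tAA:BB:CC:DD:EE:FF\t-61\n"
print(parse_capture_line(sample))
```

Lines that lack a field or contain non-numeric values are dropped rather than guessed at, which keeps later counting steps simple.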

Fig. 1. Network architecture of smart nodes at several bus stations within WiFi coverage

In its current form, the cyber-physical system has a total of seven smart nodes, which were deployed in Spring 2017; the data collected were stored in a cloud-based database. The locations of the smart nodes are shown in Fig. 2. Data sent to the cloud storage consist of time stamps, MAC addresses, and Received Signal Strength Indicator (RSSI) values. Using the earliest and latest time stamps at which a specific device is observed at a bus station, we can also determine how long the passenger waited. These MAC address data are further filtered by re-matching them with the MAC addresses obtained at the next bus stop(s) to eliminate false positives and false negatives.
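The wait-time and re-matching steps can be sketched as follows. This is a minimal Python illustration under our own assumptions: observations are `(device_id, timestamp)` pairs for one station, where `device_id` stands for the protected MAC address, and the helper names `wait_times` and `rematch` are hypothetical.

```python
def wait_times(observations):
    """Return {device_id: seconds between first and last sighting} for one station."""
    first_last = {}
    for device_id, timestamp in observations:
        first, last = first_last.get(device_id, (timestamp, timestamp))
        first_last[device_id] = (min(first, timestamp), max(last, timestamp))
    return {dev: last - first for dev, (first, last) in first_last.items()}


def rematch(station_devices, next_stop_devices):
    """Keep only the devices that were also seen at the next stop(s)."""
    return set(station_devices) & set(next_stop_devices)


station = [("a", 100.0), ("b", 130.0), ("a", 400.0), ("b", 130.0)]
waits = wait_times(station)
riders = rematch(waits, {"a", "c"})
print(waits, riders)
```

A device seen only once has a wait time of zero, so very short values mostly indicate passers-by rather than waiting passengers.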

Fig. 2. Smart nodes located at 7 different bus stations around James Madison University with reliable WiFi signal

False positives and false negatives can occur in the data collected in the cloud-based database. We discuss these cases of false data, with examples and suggested ways to eliminate or ignore them, as follows:

– False positives: These cases can occur when students walk by the bus station, when vehicles with new communication capabilities drive by the bus station, or when WiFi signals come from inside the buildings surrounding the bus station. Suggested solutions: These data can be detected and removed using the device name in the captured data (parsed later for removal) and/or by removing inconsistent cases, such as a student sitting at the bus station for longer than the average time.
– False negatives: These cases occur when students do not carry a smartphone or when their phone battery has died. Suggested solutions: Given the large volume of data collected on the number of students riding the bus at each station, an approximate estimate is acceptable, and these cases can be ignored.
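One simple way to remove inconsistent cases is to bound the observed wait time. The Python sketch below is only an illustration: the thresholds `min_wait` and `max_wait` are hypothetical values chosen for the example, not parameters reported for the deployed system, and in practice they would be tuned from the collected data.

```python
def filter_riders(wait_seconds, min_wait=30.0, max_wait=1800.0):
    """Drop devices whose wait is too short (passers-by, passing vehicles)
    or too long (e.g. devices inside a nearby building)."""
    return {
        device: wait
        for device, wait in wait_seconds.items()
        if min_wait <= wait <= max_wait
    }


waits = {"passer_by": 5.0, "rider": 300.0, "office_laptop": 7200.0}
print(filter_riders(waits))
```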

4 Expanding Monitoring Coverage with ZigBee

For the expansion of this cyber-physical system to remain fiscally viable, a supplementary technique is required to monitor the many bus stops outside of campus WiFi range. These bus stations are located in the middle of the university or outside it, but are not yet covered by WiFi signals. ZigBee is lower in cost than WiFi or Bluetooth options for Wireless Personal Area Networks and has minimal energy consumption, allowing us to power the transmitters and receivers with the solar panel and rechargeable batteries already installed in our current smart nodes [23].

4.1 Implementing a ZigBee Network

During proof-of-concept trials, a peer-to-peer connection was achieved between one end device and its coordinator, a smart node within campus WiFi range. This network architecture is shown in Fig. 3. Bus data collected by the remote smart node were successfully transmitted to its coordinator and subsequently to the cloud database for storage. To scale beyond this example, ZigBee devices support multiple topological configurations, including star, tree, cluster tree, and mesh [23]. We believe that both star and tree architectures will prove useful in our network, since a one-to-one ratio of coordinators to end devices may not always be necessary. We can reduce costs and energy consumption in areas where multiple end devices can reliably transmit data to one central coordinator node in a star topology. For more obstructed areas (buildings, trees, etc.), a tree topology may be necessary. In such instances, some end devices will also act as routers for other end devices that are out of range of a coordinator. Figure 4 depicts the possible transmission of data packets from the end nodes across from Festival and Potomac Hall to the coordinator node near the Physics and Chemistry building. In this simple tree configuration, the end device across from Festival also acts as a router through which the Potomac Hall node communicates with the coordinator node at the Physics and Chemistry building. ZigBee extended-range modules have been proven effective over distances of up to two miles in outdoor scenarios. This is well beyond the range requirements for our campus-wide nodes, with nearby nodes usually within a quarter to a half mile of one another [24].

4.2 Transmitting Bus Data via ZigBee

Data transmission is handled by Python sending and receiving scripts, regulated by timers and acknowledgement receipts. The following summarizes the functionality of the code:


Fig. 3. Network architecture for Zigbee-enabled nodes

Fig. 4. Potential end device, router, and coordinator node locations on JMU’s East Campus


– The end device node collects bus ridership data for a set time increment
– The data is divided into packets of 80 bytes
– The data is then transmitted one packet at a time and recorded in a temporary file on the coordinator device
– The temporary file is then looped through to submit the data from the coordinator's local memory to the cloud via WiFi

Due to the robustness of ZigBee hardware and software, we are able to amend our code for particular needs and use cases. One example is the duration of data collection before an end node submits to a router or coordinator node. This can range from minutes to hours, depending on the amount and urgency of the data recorded. We are also able to control the rate of data transfer to ensure data integrity. Because of ZigBee's maximum packet size of 128 bytes, packet "chunking" is required to send the entire log of data. We decided on 80-byte chunks to accommodate the protocol and security overhead included in each packet sent.
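The chunking and acknowledgement logic described above can be sketched in Python as follows. This is a simplified illustration rather than our deployed scripts: the radio link is replaced by a stand-in function, and the names `split_into_chunks`, `send_log`, `send_packet`, and `max_retries` are assumptions made for the example.

```python
CHUNK_SIZE = 80  # bytes of log data per packet, as chosen in the text


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE):
    """Divide the log into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def send_log(data: bytes, send_packet, max_retries: int = 3) -> None:
    """Send the log one chunk at a time.

    `send_packet(chunk)` must return True once the chunk is acknowledged;
    an unacknowledged chunk is retried up to `max_retries` times.
    """
    for index, chunk in enumerate(split_into_chunks(data)):
        for _ in range(max_retries):
            if send_packet(chunk):
                break
        else:
            raise RuntimeError(f"chunk {index} was not acknowledged")


# Stand-in for the radio link: the "coordinator" appends each chunk to a buffer.
received = bytearray()


def fake_link(chunk: bytes) -> bool:
    received.extend(chunk)
    return True


log = b"2017-03-01T08:00:00,hash123,-61\n" * 10
send_log(log, fake_link)
print(len(split_into_chunks(log)), "chunks sent,", len(received), "bytes received")
```

Sending the next chunk only after the previous one is acknowledged keeps the temporary file on the coordinator in order, at the cost of some throughput.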

5 Security and Privacy Concerns

As previously mentioned in Sect. 3, we chose to identify patrons at bus stations through MAC addresses collected from their WiFi-enabled devices. This way we can uniquely identify patrons without redundancy, as MAC addresses are unique to the device that broadcasts them. However, gathering such information might be considered an invasion of personal privacy [25]. Hence, instead of storing the unique MAC addresses, we perform an authenticated hashing operation on the collected data using Message Authentication Codes (MAC) [26]. This way, if someone else is collecting MAC addresses from patrons and our data is compromised, they cannot regenerate the MAC value, because they lack the key that was used to generate it, and the data they collect will not provide them with any useful information. In addition, if an attacker tries to imitate one of our nodes and send hashed MAC addresses to our database, the transmitted hash values will not be authenticated and entered into the database, since the attacker cannot authenticate the data with one of our secret keys. We took these extra security measures compared to the work done in [27].

5.1 Message Authentication Code (MAC)

Like hash functions, MAC algorithms generate a fixed-length value, known as the MAC value, from a message of arbitrary length. The only difference between regular hash functions and a MAC is that the latter requires a key whereas the former does not, which provides authenticity as an extra layer of protection, as shown in Fig. 5.
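The difference between an unkeyed hash and a keyed MAC can be seen with the Python standard library. The sketch below is illustrative only: the device address and the two 128-bit keys are made-up example values, and the `verify` helper is a hypothetical name.

```python
import hashlib
import hmac

message = b"aa:bb:cc:dd:ee:ff"  # made-up device address used as the message
key_a = bytes.fromhex("00112233445566778899aabbccddeeff")  # example 128-bit key
key_b = bytes.fromhex("ffeeddccbbaa99887766554433221100")  # a different key

# Unkeyed hash: anyone who knows the message can recompute this value.
plain_hash = hashlib.sha256(message).hexdigest()

# Keyed MAC: the value depends on both the message and the secret key.
tag_a = hmac.new(key_a, message, hashlib.sha256).hexdigest()
tag_b = hmac.new(key_b, message, hashlib.sha256).hexdigest()


def verify(key: bytes, msg: bytes, tag_hex: str) -> bool:
    """Recompute the MAC and compare it in constant time."""
    expected = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag_hex)


print(plain_hash != tag_a, tag_a != tag_b, verify(key_a, message, tag_a))
```

A party without `key_a` can neither reproduce `tag_a` from an observed address nor forge a value that passes `verify`.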


Fig. 5. Derivation of a MAC value from a message

Table 1. Security requirements of a MAC

| Security requirement | Given | Computationally infeasible to find |
|---|---|---|
| Preimage resistance | $y$ | $x$, such that $MAC_k(x) = y$ |
| Second preimage resistance | $x$ and $y = MAC_k(x)$ | $x' \neq x$, such that $MAC_k(x') = MAC_k(x) = y$ |
| Collision resistance | - | $x' \neq x$, such that $MAC_k(x') = MAC_k(x)$ |

In addition to the security requirements summarized in Table 1, MAC algorithms are required to be resistant to known-text attacks, chosen-text attacks, and adaptive chosen-text attacks [28].

5.2 Generating HMAC

The Keyed-Hash Message Authentication Code (HMAC) was introduced as a method of calculating a MAC using hash functions [26], and since its introduction it has been a widely used and established secure hashing method [29]. To generate the HMAC, we use the Secure Hash Algorithm (SHA-256) [30] in combination with a Message Authentication Code (MAC) and a key size of 128 bits, according to the following steps:

– The key is brought to the block size of 512 bits by concatenating it with 384 zero bits
– The padded key is XORed with a 512-bit block of the repeated constant value 0x36, known as ipad, and the output is stored as a 512-bit value known as key_ipad
– The padded key is also XORed with a 512-bit block of the repeated constant value 0x5C, known as opad, and the output is stored as a 512-bit value known as key_opad

These steps are pre-computed and the results are stored in secure memory. The remaining steps are data dependent and are calculated only when MAC addresses are collected. They can be summarized as follows:

– key_ipad is concatenated with the message (i.e., the MAC address) and input to the SHA-256 function to produce a fixed 256-bit value known as HMAC_one
– key_opad is then concatenated with HMAC_one and input to the SHA-256 function to produce the final 256-bit output HMAC_SHA256, which is the value we store in our database for further analysis

The HMAC calculation steps are illustrated in Fig. 6. An important property required of a hash function is that, given the hash value h(m) of a message, it is computationally infeasible to find the original message m. This property ensures that storing the hash value of the MAC addresses will not violate the privacy of the patrons, since a MAC address cannot be derived from its stored hash value.

5.3 Secure Hash Algorithm (SHA-256)
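Before turning to the hash function itself, the way SHA-256 is used inside the HMAC steps of Sect. 5.2 can be sketched in Python. This is a minimal illustration, not our node software: the key and device address are made-up values, and the function names `precompute_pads` and `hmac_sha256` are hypothetical. The result is compared against Python's standard `hmac` module as a sanity check.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block size in bytes (512 bits)


def precompute_pads(key: bytes):
    """Key-dependent steps: pad the key to one block, then XOR with ipad/opad."""
    if len(key) > BLOCK_SIZE:
        raise ValueError("this sketch only handles keys up to one block long")
    padded = key.ljust(BLOCK_SIZE, b"\x00")      # 16-byte key + 48 zero bytes
    key_ipad = bytes(b ^ 0x36 for b in padded)
    key_opad = bytes(b ^ 0x5C for b in padded)
    return key_ipad, key_opad


def hmac_sha256(key_ipad: bytes, key_opad: bytes, message: bytes) -> bytes:
    """Data-dependent steps: inner hash (HMAC_one), then outer hash."""
    hmac_one = hashlib.sha256(key_ipad + message).digest()
    return hashlib.sha256(key_opad + hmac_one).digest()


key = bytes(range(16))               # example 128-bit key (made up)
mac_address = b"aa:bb:cc:dd:ee:ff"   # example device address (made up)

key_ipad, key_opad = precompute_pads(key)
tag = hmac_sha256(key_ipad, key_opad, mac_address)

# Sanity check against the standard library implementation.
print(tag == hmac.new(key, mac_address, hashlib.sha256).digest())
```

Because the two padded keys depend only on the secret key, they can be computed once and reused for every collected address, leaving two SHA-256 calls per address.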

We use the Secure Hash Algorithm (SHA-256) [30] function to derive hash values. Not only does SHA-256 provide all the security requirements needed in a hash function, it also has a fast software implementation, which allows the detected MAC addresses to be hashed without the need for a large buffer [30]. Usually, information hashing operations, such as password hashing, require the underlying hash function to have special characteristics, such as slow processing and a cryptographic salt as part of the input [31]. These requirements are meant to prevent attacks, such as dictionary attacks, that aim to retrieve the original password. However, we are not concerned about such attacks, and for this reason we chose a hash function that processes the information fast, to prevent bottlenecks when collecting data from multiple stations at once.

5.4 Confidentiality with AES

Fig. 6. Computing an HMAC value from MAC addresses

In order to ensure secure communication between the remote nodes and the central node connected to the database, we encrypt our data using the Advanced Encryption Standard (AES). This way, even if the data is intercepted, it does not reveal any information: first because it is encrypted, and second because it does not hold any private information, as explained earlier in this section. We use AES-128, which requires a 128-bit key and 10 rounds of operation. Our implementation uses the standard AES with four main operations, as follows:

– AddRoundKey: calls the key scheduler to add the key for a specific round
– SubBytes: an S-box that replaces the bytes in the original message with different data to provide confusion
– ShiftRows: shifts the bytes in the block rows to provide diffusion. The first row remains the same, the second is shifted by one byte, the third by two bytes, and the fourth by three bytes
– MixColumns: another form of diffusion, performed by mixing the columns in a block. This operation is done only in the first nine rounds of the encryption and decryption processes

Figure 7 summarizes the AES operations.
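As an illustration of one of these operations, the ShiftRows step can be written in a few lines of Python. The sketch assumes the usual column-major layout of the 16-byte AES state (byte `row + 4*col`); it shows the byte permutation only and is not a complete or hardened AES implementation.

```python
def shift_rows(block: bytes) -> bytes:
    """Rotate row r of the 4x4 AES state left by r bytes (column-major layout)."""
    if len(block) != 16:
        raise ValueError("an AES block is 16 bytes")
    out = bytearray(16)
    for col in range(4):
        for row in range(4):
            out[row + 4 * col] = block[row + 4 * ((col + row) % 4)]
    return bytes(out)


def inv_shift_rows(block: bytes) -> bytes:
    """Inverse permutation, as used during decryption."""
    if len(block) != 16:
        raise ValueError("an AES block is 16 bytes")
    out = bytearray(16)
    for col in range(4):
        for row in range(4):
            out[row + 4 * ((col + row) % 4)] = block[row + 4 * col]
    return bytes(out)


state = bytes(range(16))
print(list(shift_rows(state)))
```

The first byte of every column (row 0) stays in place, while the other rows pull bytes from later columns, which is what spreads each column's bytes across the block.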

Fig. 7. AES-128 encryption and decryption rounds illustration

5.5 MAC Address Randomization

MAC address randomization is a technique developed aiming to protect user privacy by preventing tracking through MAC addresses. The idea behind it is that instead of broadcasting MAC addresses, devices perform randomization to the MAC address and then broadcast that MAC address to other WiFi enabled devices and access points. Most of the smart phones running iOS or Android, perform MAC address randomization nowadays which affects our data collection methodology. However, in [32], the authors show that by sending certain control frame to client devices performing MAC randomization, it was possible to reveal the global MAC address of these devices. The technique was applied to various iOS and Android devices with a success rate of 100%. This shows that MAC


A. Salman et al.

addresses can still be revealed after some minor modifications to our current methodology. We should highlight that, regardless of how we obtain MAC addresses, we will always use the secure hashing technique proposed in this section and never store MAC addresses, in order to protect user privacy.
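As a minimal sketch of the idea of storing only a digest and never the address itself, the following Python fragment normalizes a MAC address, checks whether it looks randomized, and hashes it with SHA-256 from the standard `hashlib` module. The helper names are hypothetical. The sketch uses a plain, unsalted digest; whether the authors' scheme adds a salt or a key is not stated in this part of the text. The randomization check relies on the locally administered bit of the first octet, which randomized addresses normally set.

```python
import hashlib


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and use ':' as the only separator."""
    digits = mac.replace("-", "").replace(":", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def is_locally_administered(mac: str) -> bool:
    """True if bit 0x02 of the first octet is set (typical of randomized MACs)."""
    first_octet = int(normalize_mac(mac)[:2], 16)
    return bool(first_octet & 0x02)


def anonymize_mac(mac: str) -> str:
    """Return the SHA-256 digest of the normalized address as a hex string."""
    return hashlib.sha256(normalize_mac(mac).encode("ascii")).hexdigest()
```

Normalizing before hashing means that the same device always maps to the same digest regardless of how the capture tool formats the address, so digests can still be counted and compared without keeping the address.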

6

Conclusions and Future Directions

In this article, we discussed the viability of introducing a ZigBee radio network into our current cyber-physical monitoring system. With it, remote locations in and around the James Madison University (JMU) campus can submit bus passenger data to our cloud database without any additional infrastructure beyond ZigBee transmitters and receivers. This will improve the sampling of data per deployment of our smart nodes and allow for more detailed data analysis and traffic modeling. We also discussed the steps being taken to provide security and anonymity for public transit patrons via SHA-256 hashing techniques. With a proof of concept now complete, future work will involve deploying multiple nodes with ZigBee transmitters in the more remote locations around the JMU campus. Receiving node locations must also be decided on so that the remote nodes’ data is reliably submitted to the cloud database.

Acknowledgements. This work was supported by the 4-VA Collaborative at James Madison University http://4-va.org/ Fall 2017. The authors would like to thank James Madison University Public Safety and transit bus manager Mr. Lee Eshelman for allowing us to conduct our experiments.

References

1. United States Department of Transportation. Intelligent Transportation Systems, November 2015
2. Centenaro, M., Vangelista, L., Zanella, A., Zorzi, M.: Long-range communications in unlicensed bands: the rising stars in the IoT and smart city scenarios. IEEE Wirel. Commun. 23(5), 60–67 (2016)
3. Chatzigiannakis, I., Vitaletti, A., Pyrgelis, A.: A privacy-preserving smart parking system using an IoT elliptic curve based security platform. Comput. Commun. 89, 165–177 (2016)
4. Popescu, O., Sha-Mohammad, S., Abdel-Wahab, H., Popescu, D.C., El-Tawab, S.: Automatic incident detection in intelligent transportation systems using aggregation of traffic parameters collected through V2I communications. IEEE Intell. Transp. Syst. Mag. 9(2), 64–75 (2017)
5. USDOT Releases 2016 Fatal Traffic Crash Data, October 2017
6. El-Tawab, S., Oram, R., Garcia, M., Johns, C., Park, B.B.: Data analysis of transit systems using low-cost IoT technology. In: First International Workshop on Mobile and Pervasive Internet of Things 2017 - 2017 IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops), March 2017
7. Evers, K., Oram, R., El-Tawab, S., Heydari, M.H., Park, B.B.: Security measurement on a cloud-based cyber-physical system used for intelligent transportation. In: 2017 IEEE International Conference on Vehicular Electronics and Safety (ICVES), pp. 97–102. IEEE (2017)
8. Tubaishat, M., Zhuang, P., Qi, Q., Shang, Y.: Wireless sensor networks in intelligent transportation systems. Wirel. Commun. Mob. Comput. 9(3), 287–302 (2009)
9. Florin, R., Ghazizadeh, P., Zadeh, A.G., El-Tawab, S., Olariu, S.: Reasoning about job completion time in vehicular clouds. IEEE Trans. Intell. Transp. Syst. PP(99), 1–10 (2016)
10. Ghazizadeh, P., Olariu, S., Zadeh, A.G., El-Tawab, S.: Towards fault-tolerant job assignment in vehicular cloud. In: 2015 IEEE International Conference on Services Computing, pp. 17–24, June 2015
11. Blum, J., Eskandarian, A.: The threat of intelligent collisions. IT Prof. 6(1), 24–29 (2004)
12. Mejri, M.N., Ben-Othman, J., Hamdi, M.: Survey on VANET security challenges and possible cryptographic solutions. Veh. Commun. 1(2), 53–66 (2014)
13. Elhamshary, M., Youssef, M., Uchiyama, A., Yamaguchi, H., Higashino, T.: TransitLabel: a crowd-sensing system for automatic labeling of transit stations semantics. In: Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services, pp. 193–206. ACM (2016)
14. Garcia, M., Rose, P., Sung, R., El-Tawab, S.: Secure smart parking at James Madison University via the cloud environment (SPACE). In: 2016 IEEE Systems and Information Engineering Design Symposium (SIEDS), pp. 271–276 (2016)
15. Dunlap, M., Li, Z., Henrickson, K., Wang, Y.: Estimation of origin and destination information from Bluetooth and Wi-Fi sensing for transit. In: Transportation Research Board 95th Annual Meeting, no. 16-6837 (2016)
16. Liu, J., Shen, H., Narman, H.S., Chung, W., Lin, Z.: A survey of mobile crowdsensing techniques: a critical component for the internet of things. ACM Trans. Cyber-Phys. Syst. 2(3), 18 (2018)
17. Statista: Forecast: number of smartphone users in the U.S. 2010–2018. Technical report (2015)
18. United states vehicle registration data, automobile statistics and trends. Technical report (2015)
19. Salem, A., Nadeem, T., Cetin, M., El-Tawab, S.: DriveBlue: traffic incident prediction through single site Bluetooth. In: 18th IEEE International Conference on Intelligent Transportation Systems, 15-18 September 2015 (2015)
20. Dimatteo, S., Hui, P., Han, B., Li, V.O.K.: Cellular traffic offloading through WiFi networks. In: 2011 IEEE Eighth International Conference on Mobile Ad-Hoc and Sensor Systems, pp. 192–201, October 2011
21. Merino, B.: How-to Instant Traffic Analysis with Tshark. Packt Publishing Ltd., Birmingham (2013)
22. Tellez, M., El-Tawab, S., Heydari, H.M.: Improving the security of wireless sensor networks in an IoT environmental monitoring system. In: 2016 IEEE Systems and Information Engineering Design Symposium (SIEDS), pp. 72–77, April 2016
23. Kumar, T., Mane, P.B.: ZigBee topology: a survey. In: International Conference on Control, Instrumentation, Communication and Computational Technologies Paper (2016)
24. Parallax Inc.: XBee-PRO ZB S2B extended range module, wire antenna
25. Salman, A., Ferozpuri, A., Homsirikamol, E., Yalla, P., Kaps, J.P., Gaj, K.: A scalable ECC processor implementation for high-speed and lightweight with side-channel countermeasures. In: 2017 International Conference on ReConFigurable Computing and FPGAs (ReConFig), pp. 1–8, December 2017
26. National Institute of Standards and Technology. FIPS PUB 180-4: The Keyed-Hash Message Authentication Code (HMAC). pub-NIST, August 2008
27. Yorio, Z., Oram, R., El-Tawab, S., Salman, A., Heydari, M.H., Park, B.B.: Data analysis and information security of an internet of things (IoT) intelligent transit system. In: 2018 Systems and Information Engineering Design Symposium (SIEDS), pp. 24–29, April 2018
28. Stevens, M.M.J.: Attacks on hash functions and applications. Ph.D. thesis, Mathematical Institute, Faculty of Science, Leiden University, June 2012
29. Salman, A., Rogawski, M., Kaps, J.P.: Efficient hardware accelerator for IPSec based on partial reconfiguration on Xilinx FPGAs. In: 2011 International Conference on Reconfigurable Computing and FPGAs, pp. 242–248, November 2011
30. National Institute of Standards and Technology. FIPS PUB 180-4: Secure Hash Standard. pub-NIST, August 2015
31. Hatzivasilis, G., Papaefstathiou, I., Manifavas, C.: Password hashing competition - survey and benchmark. Cryptology ePrint Archive, Report 2015/265 (2015). https://eprint.iacr.org/2015/265
32. Martin, J., Mayberry, T., Donahue, C., Foppe, L., Brown, L., Riggins, C., Rye, E.C., Brown, D.: A study of MAC address randomization in mobile devices and when it fails. CoRR abs/1703.02874 (2017)

CityAction a Smart-City Platform Architecture

Pedro Martins(B), Daniel Albuquerque, Cristina Wanzeller, Filipe Caldeira, Paulo Tomé, and Filipe Sá

Department of Computer Sciences, Polytechnic Institute of Viseu, Viseu, Portugal
{pedromom,dfa,cwanzeller,caldeira,ptome,filipe.sa}@estgv.ipv.pt

Abstract. Fast population growth in cities and surrounding regions forces cities to become smarter in order to sustain their economy, social quality, and environmental well-being. Smart-Cities will be the ones that use information and communication technologies to make city services more efficient (in performance and cost), interactive, and aware of events. For a city to become smarter, it needs to make use of emerging technologies related to the Internet of Things (IoT), not only to collect information and interact (actuate, command, control) but also to provide services for analytics and other applications. This paper studies the concept of the smart-city in the context of the CityAction project, tested in the city of Castelo Branco, Portugal. The project focuses on the relationship between IoT, monitoring, actuating, and displaying data. Based on data collected from sensors spread across the city, the proposed project aims to make “smart” decisions that optimize resources, cost, well-being, and environmental impact. The results introduce an architecture that integrates multiple heterogeneous sensors, a dashboard that displays data in a user-friendly way, and a mobile app that makes this information available to the population and other users. This mechanism makes it possible to reach better decisions on city management and behavior, and to put in place the mechanisms needed to improve response time, safety, and well-being.

Keywords: CityAction · Architecture · Platform · Framework · Performance · Bigdata · Smart-city · Mobile · Management · Urban areas · Internet of Things (IoT) · Wireless sensor networks · Quality of service · Computer architecture · Telecommunication services

1

Introduction

Smart-Cities require tools for efficient service management. It is also important, however, to share information with different stakeholders to promote the creation of innovative services that go beyond the direct supervision of the urban space, with a particular focus on citizens' well-being. Information and communication technologies therefore play a vital role in the Smart-Cities ecosystem and in the implementation of the city of the future.

© Springer Nature Switzerland AG 2020
K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 217–236, 2020. https://doi.org/10.1007/978-3-030-12388-8_16


P. Martins et al.

Smart-Cities are built using sensors distributed in the urban space, which communicate over wired or wireless networks. These devices may be owned by actors other than the city council. Data is collected and handled by different management systems, for example, garbage collection, street illumination, and traffic lights. All data is open, allowing the community to develop new applications for smart-cities. A Data Broker facilitates the exchange of information between systems and devices. Apart from sensors, other sources may be available, such as smart-phones or stand-alone systems. Access to this information requires security and control mechanisms. This ecosystem is also prepared to monetize IoT events, promoting the creation of business models that cut across the different domains of activity. The Operational Management Center (OMC) of the CityAction architecture is the tool that allows services to control the city efficiently, centralizing all relevant data and correlating it in a single point. This tool exposes the information produced by the different systems and presents it in a structured way to policymakers, facilitating decision making in near-real-time.

1.1

Motivation

Cities and urban areas are complex systems whose managers face challenges that require decisions based on the available information. In [13] it is recognized that information control is essential to promote smarter urban management. It is therefore necessary to move from a reactive (and very bureaucratic) logic of urban management to a proactive logic based on available information, which makes the context of the city known at every moment and enables preventive actions. The state-of-the-art in sensor technologies and the growth of IoT, coupled with social networks (Internet of People), provide urban management with new sources of data that have high potential for decision support. Current solutions for the remote monitoring and control of city subsystems and infrastructures are structured in independent areas: environmental monitoring, urban mobility, lighting, waste, and safety, among others. The management model is predominantly sectoral, with little interaction between services and a low level of automation. This logic of separating information into silos leads to inefficient use of city resources. Replacing it with holistic management gives an overview of the state of the multiple systems operating in the city. Only in this way is it possible to cross-reference information from different domains and act globally. There are visible links between car traffic and air pollution, meteorological conditions and irrigation, city events and the demand for public transportation, etc. The European Innovation Partnership on Smart Cities and Communities (EIP-SCC) recognizes that the integration of platforms for Smart-Cities is a crucial element for city management to improve quality of life [3].


The proposed platform, CityAction, is being developed to respond to all these challenges. The platform collects and analyzes near-real-time data about the city infrastructures and associates it with data from social networks and open-data sources. It is then possible to dynamically adjust the operation of various systems according to the real city context, which enables more efficient management.

1.2

CityAction Objectives

The main objective of CityAction is the design and development of an integrated platform that combines city data from different sources, producing information for more efficient urban management and contributing to a better quality of life for citizens and a more sustainable environment. The CityAction platform will be tested with a set of smart-cities applications implemented within the proposed platform. The specific objectives of CityAction are as follows:
– Build a platform to manage cities, based on IoT and on information from citizens (Internet of People).
– Integrate IoT components.
– Create a dashboard that provides a near-real-time integrated view of the city, parameterizes rules for the dynamic control of components, and acts on various urban spaces.
– Provide data in an open format to facilitate the creation of services by third parties.
– Demonstrate the advantages of CityAction through the testing and validation of smart-cities solutions with an impact on quality of life.

1.3

Document Organization

This paper is organized as follows. Section 2 gives an overview of related work on Smart-Cities and associated platforms. Section 3 explains the architecture of the proposed platform. Section 4 describes the experimental scenarios considered to test and evaluate the performance of the proposed tool. Section 5 presents the conclusions of the proposed work, new research directions, and future work.

2

Related Work and Reference Architectures

The first stage of smart-city conception included the concept of the wired city and society [22]. More recently, smart-cities have evolved toward IoT. The evolution of smart-cities also includes improved competitiveness, in which the community and quality of life are enhanced. A smart-city cannot focus on economic aspects only, or it will not be smart at all. The social conditions of its citizens should be a high priority.


This project envisages that smart-cities need to focus on the components that make the city work both as a competitive entity and as a social organism. The idea of CityAction is to improve performance and generate a better quality of life [18]. Recent IoT applications from academic and industrial researchers will provide a new generation of near-real-time and time-critical monitoring and actuation in different areas, e.g., location, social [24,38], context-aware services, emergency and health [35], and surveillance [36]. Other research works on smart-cities focus on event monitoring, control, and prediction [12]; sensor data fusion with delays and losses in transmission [16]; business [26]; ontologies [40]; different service models [17]; and quality of use [23]. In other fields, such as environment monitoring [14,28], medical diagnostics [27,30], social interests [37], and traffic monitoring [39], dedicated applications were developed that take advantage of different types of sensors. The problem with these dedicated applications is that they are not integrated or prepared to work as a single organism. Heterogeneity, decentralization, and the number of distinct systems raise integration issues for anyone trying to build innovative applications and to manage a city from a single point. Middleware solutions try to abstract the development of smart-cities applications. However, they do not address the interoperability issues of managing smart-cities. Some middlewares concentrate on IoT environments [21,29,33], others on the integration of cloud platforms [25,34]. Some middleware approaches, such as OverStart [31], do focus on interoperability, but unfortunately they are not aimed at smart-cities. Further works in the field of System-of-Systems (SoS) [15,19,20] can be adapted for use in the context of Smart-Cities. However, they do not address the needs and complexity of smart-cities.

2.1

Industry Oriented Architectures

Existing platforms on the market, when compared with the proposed one, are predominantly vertical IoT platforms, usually “closed” to the devices of a specific manufacturer (e.g., Philips, Cisco), and do not take into account the requirements of integrated management. In addition, standards, technological protocols, and communication interfaces are still being defined, leading to some uncertainty about which solutions to adopt. These issues are a significant challenge for compatibility between vertical solutions that are developed by autonomous entities, in distinct languages, and not intended for the horizontal interaction desired in the CityAction system. Some examples of commercial platforms available on the market are:
– AGT Platform for IoT [5]. An IoT platform that supports vertical application development and the interconnection of several sensors, and performs detailed analysis focused on predictive models, combining information from various sources. The focus of this platform is IoT applications for industry.


– Cisco Platform - Cisco Internet of Things [11]. Presented as a management platform for communications, IoT devices, and large-scale objects. This platform is closed: at the connectivity level, it only features interfaces with Cisco products for IoT.
– IBM Platform - Watson Internet of Things [4]. Presented as middleware for device interconnection and application development, with scalability and elasticity of resources as a fundamental design point. It is a generic platform that can be integrated at the application level for smart-cities scenarios; however, doing so requires specialized development, knowledge of the various services, and articulation between systems developed for different purposes.
– NEC Platform - Cloud City Operations Center (CCOP) [1]. Allows the integration of vertical solutions developed by various actors to solve specific problems, such as traffic congestion, accessibility, or health care. This platform promises to help decision-makers through a holistic approach that enables the deployment of an economical and efficient solution. In the area of smart-cities, according to the company responsible for the platform, it allows the collection and analysis of information throughout the city in real-time using Artificial Intelligence (AI) and the Internet of Things (IoT).
– Philips and SAP Platform - Philips with SAP HANA [8]. Provided as a solution that collects real-time information from Philips luminaires and data from other sensors and shows them in a single integrated panel of the city. This platform combines real-time information gathered by the SAP HANA platform with the Philips CityTouch street lighting management to help decision-makers (e.g., in street lighting, waste management, parking, and traffic).
– Compta Platform - Compta Smart Cities [2]. This tool offers a set of IoT services, all managed from the same back-end interface, giving cities a unified global view of day-to-day operations and management. The platform consists of a set of proprietary service modules and an aggregation layer. The aggregation layer is generic, but a new service module must be created whenever a new vertical application is added.
– Siemens Platform - Synergy City [10]. The Siemens Intelligent Cities platform aims to aggregate data from multiple infrastructures in a city to create an integrated system for its management. According to [32], the platform is being tested in several pilot cities to show how to increase the efficiency of the city infrastructure and reduce energy consumption. Currently, it receives information from multiple sources, ranging from commercial and residential buildings, power generation infrastructures, and traffic control to water distribution systems. It aims to use sensors or other forms of measurement in everything that concerns the city. For example, existing video acquisition systems can collect information on traffic volume and speed, which is then incorporated into the city traffic management system, for instance by changing the duration of traffic light phases.


– Microsoft Platform - CityNext [7]. The CityNext platform incorporates a portfolio of Microsoft technologies and products that a global network of partners can use to develop solutions that meet the challenges of smart cities. It brings together products such as the Azure platform, mobile application support, and data management capabilities to aggregate and manage city data. The CityNext initiative focuses on various areas, such as public administration, security, health, buildings, tourism, education, the transport network, and energy and water networks, with the aim of assisting decision making in cities.
– Schneider Electric Platform - EcoStruxure for Smart Cities [9]. The EcoStruxure platform offers a suite of applications and analytics for managing sustainability, performance, and operational resources. Its software integration capability connects many parts of the city and allows real-time management. Although the platform can integrate multiple parts of the city, it is strongly oriented toward the management of energy resources.
– Huawei Platform - e-Government platform [6]. This platform consists of an interconnected and inter-operable government network supported by city devices and cloud processing. Because it uses cloud processing, it includes a multidimensional security system and a data-sharing platform that allows citizens to contribute information. The urban infrastructure network links citizens' equipment and the city, while the cloud-based urban data center stores, shares, and integrates the data services of every sector and subsystem. Based on the urban information sharing platform, it is possible to develop various smart applications, such as smart government, safe city, intelligent transport network, smart enterprise, smart education, and smart hospital.

All the platforms mentioned are comparable to the CityAction platform in purpose: all of them aim to manage the city in an integrated way. However, they are all predominantly vertical, proprietary IoT platforms, in which it is challenging to integrate devices from different manufacturers. The evolution from a “conventional” city to a smart-city is not based on creating a new city in which all equipment comes from a single manufacturer. The transition has to be made by adapting the existing systems of the “conventional” city, which most of the time come from different manufacturers. Given the lack of defined standards, protocols, and communication interfaces, there is significant uncertainty about the best solutions for a given city. The CityAction project addresses this weakness and proposes a set of mechanisms and solutions that allow different systems to be integrated, creating an ecosystem of horizontal interaction for city management.

3

CityAction Architecture

The architecture proposed in this project, shown in Fig. 1, integrates the vertical systems of intelligent cities and holistically supports data visualization, rule parameterization, and real-time actuation in the different vertical systems.

Fig. 1. CityAction architecture

The CityAction architecture (Fig. 1) consists of four main layers:
– Device layer, which aggregates the IoT sensors, actuators, and gateways of the various vertical systems in a city.
– M2M connectivity layer, responsible for connecting the devices to the Internet. Inside this layer is the connectivity management layer, which manages mobility services both in mobile networks and in low-power wide-area networks (LPWAN). This layer is a framework of services and tools that simplifies the M2M connectivity of new services and applications. It is a fundamental ecosystem service for controlling and monitoring connectivity regardless of the communication technology used (2G/3G/4G mobile networks, LPWAN technologies).
– Middleware layer, which integrates several blocks: the data broker, which ensures the exchange of information between the various systems and IoT devices; monetization, for IoT event pricing; data management and analytics, for data processing and management; the vertical M2M management solutions; and the API management layer, which is transversal to all these blocks and exposes the data from the lower layers to the application layer.
– Application layer, which uses the IoT data to create applications with value for the municipality. This layer can also use open data from other sources to enrich the portfolio of applications.
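To make the role of the data broker in the middleware layer more concrete, the following Python fragment sketches a toy in-memory publish/subscribe broker. It is an assumption-laden illustration, not the CityAction implementation: the class name `DataBroker`, the topic strings, and the message fields are all hypothetical, delivery is synchronous only, and a real broker would add security, persistence, and protocol adapters.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

# A handler receives the topic name and the message payload.
Handler = Callable[[str, Dict[str, Any]], None]


class DataBroker:
    """Toy broker that routes messages from publishers to subscribers by topic."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler (e.g. the analytics block) for one topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Dict[str, Any]) -> int:
        """Deliver message to every handler of topic; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, message)
        return len(handlers)
```

In this sketch, a gateway in the device layer would call `publish("air-quality", {...})`, while the analytics block and any dashboard would each call `subscribe("air-quality", ...)`, so neither side needs to know about the other.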


The data produced by the IoT sensors, aggregated by the gateways, or generated by the vertically managed systems is transferred through the data broker and integrated into the data management and analytics block, so that both the raw data and the processed information can be exposed. The data management and analytics block is responsible for monitoring, analyzing, and correlating events from IoT devices as well as from open sources, for example, social networks. This block can trigger a set of tasks independently and automatically. The IoT event monetization block is indispensable for the creation of business models in the field of smart cities. The middleware and managed connectivity layers are linked to the systems that support operations and to the operators' business. The device layer contains sensors and actuators of the most varied types, which characterize different contexts, and gateways, which mainly aggregate the data coming from sensors. Only managed systems are used in this project: vertical sensors do not connect directly to the platform, and their data first passes through the respective management systems. Nevertheless, the architecture also contemplates integration with native devices.

3.1

Operational Management Center (OMC)

The Operational Management Center (OMC) is an application that integrates all the information coming from the city, allowing municipal services to control the city effectively by centralizing all relevant data in a single point. This tool exposes the information produced by the different systems and presents it in a structured way to both operational and political decision makers, thus facilitating decision-making. The OMC supports a graphical configuration environment and exposes at any moment only the information that is considered relevant or that the user has configured. The OMC is fed by data generated in different systems, for example, waste management, public lighting, air quality, or energy consumption in buildings, and can also be enriched with information provided by smart-phones, social networks, or open data on the Internet. The collected information is available through a common dashboard, which allows different systems to be monitored in real-time. The information is presented in an integrated way, exposing correlations between indicators independently of the service provider and the area of activity. The information gathered can also be made available to the developer community, allowing the development of new applications for smart cities.

The OMC also enables direct operation, controlled by a municipal operator. In certain situations, municipal services need to modify the state of certain equipment in real-time in response to changes in external factors. It may be necessary, for example, to control street lighting in a particular neighborhood or to dynamically modify traffic signaling. For this, the OMC has mechanisms of direct control that allow real-time actuation on the different devices distributed in the urban space.

A vital capability of the OMC is the parameterization of rules across domains and their concrete application. The idea is to adapt in real-time to the context of the city, taking into account data from the different managed systems. This makes it possible to dynamically configure operations and parameters that depend on the collected data, together with the actions to be performed in the different managed systems. Whenever certain events occur, the OMC triggers actions, which may happen through automatic actuation in certain systems or in a more operational way, involving the technicians. This makes it possible to keep the technicians abreast of events and to adapt the different infrastructures to correct certain behaviors. For instance, the number of lanes on a road and the state of street lighting may be changed according to the number of vehicles circulating and the levels of CO2 in the air.

The use of an OMC (Fig. 2) brings several benefits to municipalities. The OMC is a handy tool for ensuring informed decision-making based on up-to-date and relevant information. Governance is more effective and transparent because it is supported by concrete data collected in the context of the urban space itself. The manager gains a holistic view of the state of the municipality through a single tool that groups all the information. The use of a single interface facilitates the interpretation of the data and improves the user experience. It becomes possible to visualize in real-time what is happening in the city and to analyze trends. Historical information is available and useful for assessing future changes in behavior. The possibility of acting on the different systems helps to guarantee control of the various infrastructures. Rule parameterization dynamically adapts the methods to the state of the municipality, ensuring a well-“oiled” operation and thus improving the quality of life of citizens.

Fig. 2. Functionalities of the Operational Management Center
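The cross-domain rule parameterization described for the OMC can be illustrated with a small rule evaluator. The Python sketch below is hypothetical throughout: the `Rule` structure, the context keys, the thresholds, and the action labels are invented for illustration and are not taken from the CityAction platform. It only shows the general pattern of evaluating conditions over data from several managed systems and collecting the actions to request.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Latest indicator values gathered from the managed systems.
Context = Dict[str, float]


@dataclass
class Rule:
    name: str
    condition: Callable[[Context], bool]
    action: str  # label of the action to request from a managed system


def evaluate(rules: List[Rule], context: Context) -> List[str]:
    """Return the actions of every rule whose condition holds for this context."""
    return [rule.action for rule in rules if rule.condition(context)]


# Hypothetical thresholds, chosen only for illustration.
RULES = [
    Rule("open extra lane",
         lambda c: c["vehicles_per_min"] > 60 and c["co2_ppm"] > 450,
         "traffic:open_reversible_lane"),
    Rule("dim lights on empty road",
         lambda c: c["vehicles_per_min"] < 5,
         "lighting:dim_to_30_percent"),
]
```

The first rule combines a traffic indicator with an air-quality indicator, which is the kind of cross-domain condition that isolated vertical systems cannot express on their own.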

3.2

Middleware

The middleware layer allows to process and work data to make it available to the application layer. Following are the main modules that make up this layer: – Data Broker. The Data Broker block is the service mediation block that supports secure

226

P. Martins et al.

data transfer between IoT devices and applications, using both synchronous and asynchronous communication modes. Data brokers typically include functionality for device management and support multiple protocols, different manufacturers and development tools (e.g., sandbox, RAD environments).
– Event Management and Correlation. The management and correlation of events require the extraction of useful information from the data received from the devices or from third parties. It is in this block that data must be cleaned (identification of incomplete, incorrect or inaccurate data and its replacement, modification or removal), normalized and enriched. It is also here that real-time event processing is performed and analytics are applied, which are fundamental for cross-checking and correlating data from multiple domains.
– Monetization. The monetization block is a services framework that enables the monetization of IoT services resulting from complex business models. It includes:
  • a flexible, scalable and real-time charging engine;
  • a catalog of products/services that allows the design and control of joint business offers;
  • advanced billing features to manage revenue streams.
– API Management. The API management layer is a mediation layer that exposes the IoT APIs to the application layer while ensuring:
  • security and privacy, including mechanisms for authentication, authorization and access control;
  • policy control;
  • orchestration of system services belonging to several domains.
This layer is also essential to enable the creation of an API marketplace for service providers and applications from other entities.
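The cleaning, normalization and enrichment steps of the Event Management and Correlation block can be sketched as a small pipeline. This is an illustration under stated assumptions: the event fields (`sensor_id`, `value`, `unit`), the plausibility range and the `SENSOR_LOCATION` lookup table are hypothetical and not part of the CityAction specification.

```python
from typing import Dict, List

# Hypothetical static metadata used for enrichment.
SENSOR_LOCATION = {"temp-01": "city-hall", "temp-02": "river-park"}


def normalize(event: Dict) -> Dict:
    """Express every temperature reading in degrees Celsius."""
    value = float(event["value"])
    if event.get("unit") == "F":
        value = (value - 32.0) * 5.0 / 9.0
    return {"sensor_id": event["sensor_id"], "value": round(value, 2), "unit": "C"}


def enrich(event: Dict) -> Dict:
    """Attach the sensor location, if known."""
    location = SENSOR_LOCATION.get(event["sensor_id"], "unknown")
    return {**event, "location": location}


def process(raw_events: List[Dict]) -> List[Dict]:
    """Clean (drop incomplete or implausible data), normalize and enrich."""
    result = []
    for event in raw_events:
        if event.get("sensor_id") is None or event.get("value") is None:
            continue  # incomplete reading: removed
        normalized = normalize(event)
        if not -40.0 <= normalized["value"] <= 60.0:
            continue  # implausible reading (assumed range): removed
        result.append(enrich(normalized))
    return result


events = process([
    {"sensor_id": "temp-01", "value": 68, "unit": "F"},
    {"sensor_id": None, "value": 21.5, "unit": "C"},
    {"sensor_id": "temp-02", "value": 999, "unit": "C"},
])
```

Here invalid readings are simply dropped; replacing or correcting them, which the text also allows, would take the place of the two `continue` branches.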

4

Experimental Scenarios

The CityAction project demonstrates the functionalities and potential of the CityAction platform through a prototype that is tested with a set of smart city applications, developed in close collaboration with the municipalities of Aveiro, Castelo Branco and Viseu (Portugal). The objective is to demonstrate the advantages of integrating information generated by different urban subsystems into a single city management center.

4.1

Tourism Monitoring

Many cities have a WiFi network that provides free Internet access to citizens and visitors. At the same time, city managers would like to know more precisely how many tourists visit the city, in order to analyze the impact of their tourism promotion strategy. However, many points of tourist interest are free-access areas, with no access control or ticket sales. Even where tickets are sold, for example,

CityAction a Smart-City Platform Architecture


museums, the number of tickets sold does not allow visits to be correlated, nor recurring visitors to be distinguished from tourists. The WiFi coverage provided by smart cities opens up a whole new set of possibilities for monitoring and analyzing the flow of tourists. The WiFi network interface of a smartphone, tablet or laptop has a unique identifier, the Media Access Control (MAC) address, that is transmitted whenever a user connects to a WiFi access point. Thus, a tourist who wanders the city and accesses the WiFi network at different points leaves a “digital track” that can be used for a statistical analysis of tourists in that city (Fig. 3). The MAC address alone reveals neither the phone number of the smartphone nor the identity of the person, so privacy is preserved.
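The “digital track” idea can be sketched as follows. The sketch assumes a hypothetical access log of (MAC address, access point) pairs and treats a device never seen before as a tourist, which is a simplification; the function names and the sample data are illustrative, not part of the CityAction platform.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def visit_statistics(log: Iterable[Tuple[str, str]], known_macs: Set[str]):
    """Group access points by device and split new from recurring devices.

    log: (mac, access_point) pairs observed in the period of interest.
    known_macs: devices seen in earlier periods (assumed to be locals).
    """
    places_by_mac: Dict[str, Set[str]] = defaultdict(set)
    for mac, access_point in log:
        places_by_mac[mac].add(access_point)
    new_devices = {mac for mac in places_by_mac if mac not in known_macs}
    recurring_devices = set(places_by_mac) - new_devices
    return places_by_mac, new_devices, recurring_devices


def co_visit_probability(places_by_mac, macs, point_a: str, point_b: str) -> float:
    """Estimate P(device visits point_b | device visits point_a)."""
    visited_a = [mac for mac in macs if point_a in places_by_mac[mac]]
    if not visited_a:
        return 0.0
    both = sum(point_b in places_by_mac[mac] for mac in visited_a)
    return both / len(visited_a)


sample_log = [
    ("aa", "museum"), ("aa", "cathedral"),
    ("bb", "museum"),
    ("cc", "museum"), ("cc", "cathedral"),
    ("dd", "park"),
]
places, tourists, locals_ = visit_statistics(sample_log, known_macs={"dd"})
p_cathedral_given_museum = co_visit_probability(places, tourists, "museum", "cathedral")
```

In the sample log, two of the three new devices seen at the museum were also seen at the cathedral, so the estimated co-visit probability is 2/3.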

Fig. 3. Tourists count in areas with WiFi coverage

Under this scenario, the CityAction project provides a software platform for monitoring tourist flows using WiFi access, with the following functionalities:
– Availability of analytical information: number of daily, weekly, monthly and annual accesses;
– Distinction between tourists (devices with new MAC addresses) and accesses from local citizens (devices with recurring MAC addresses);
– Correlation of the number of visits between tourist areas visited in the city (probability of a tourist visiting point A also visiting point B);
– Detection of groups of tourists based on groups of MAC addresses that move together in the city;
– Representation on a map of hot spots (heat maps) with the number of accesses and the most visited tourist areas.

4.2

Public Transportation

This scenario is in the area of urban mobility in the city of Castelo Branco, Portugal, and aims to provide citizens with information on the position of buses in the urban network, enabling them to better manage their waiting time at stops (Fig. 4). It comprises the following tasks:
– Development and production of the hardware modules (GPS and 4G) to be installed in the buses of the fleet operating in the city’s urban network.


Fig. 4. Indicative panel of the bus arrival time

– Development of software for estimating delays and for communicating with the electronic panels to be installed in bus stop shelters;
– Installation and configuration of outdoor LED electronic panels in three bus shelters, with information on the estimated time of arrival (in minutes) of the vehicles.
– Integration of the electronic panels with the system that monitors environmental parameters, making this information available to the population.
– Integration with the CityAction platform.
The electronic panels can also be used to provide other information to the population (climate, pollution, etc.).

4.3

Water Quality

Many cities are crossed by rivers, canals or other waterways. In these cases, the quality of water has an impact on the environment and the city’s economy, so it is important to monitor it on a continuous basis. Continuous monitoring allows detecting changes in the physicochemical composition of water: for example, a chemical discharge can be quickly detected by measuring variations in temperature, pH, and dissolved oxygen values. Another interesting application is the continuous monitoring of water quality at river beaches, making this information available to swimmers. In this scenario (Fig. 5), the proposed system makes available graphical information related to the following water parameters:
– Oxygen concentration dissolved in water [mg/L];
– pH, on the Sørensen scale;
– Conductivity [µS/cm];
– Reduction potential [mV];
– Temperature [°C].

Data is updated every 15 min. Water quality degradation alerts will be sent via email to the responsible entities in the Municipality. Historical water quality data may be combined with meteorological information and used for scientific research, for example, in predictive models of pollution levels.
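A degradation alert of the kind described above can be sketched as a range check applied to each 15-minute sample. The parameter names and the acceptable ranges below are assumptions made for illustration; real limits depend on the water body and on applicable regulation, and the e-mail delivery step is omitted.

```python
from typing import Dict, List, Tuple

# Illustrative acceptable ranges only; not taken from the project.
LIMITS: Dict[str, Tuple[float, float]] = {
    "dissolved_oxygen_mg_l": (5.0, 14.0),
    "ph": (6.5, 9.0),
    "temperature_c": (0.0, 28.0),
}


def check_sample(sample: Dict[str, float]) -> List[str]:
    """Return one alert message per parameter outside its assumed range."""
    alerts = []
    for name, (low, high) in LIMITS.items():
        value = sample.get(name)
        if value is not None and not low <= value <= high:
            alerts.append(f"{name}={value} outside [{low}, {high}]")
    return alerts


alerts = check_sample(
    {"dissolved_oxygen_mg_l": 3.2, "ph": 7.1, "temperature_c": 18.0}
)
```

A sudden drop in dissolved oxygen, as in this sample, is the kind of variation the text associates with a chemical discharge.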


Fig. 5. Remote monitoring of river water quality

4.4

Smart Waste Management

In waste collection, there are surface and underground containers. These are normally collected by different vehicles. However, the process of depositing waste is similar in both cases. Intelligent waste management requires knowing the fill volume of each container, which is obtained by installing a sensor in the lid or at another top location to take a volumetric reading of the container (Fig. 6).

Fig. 6. Sensor installed in container + 360Waste platform
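As a rough illustration of a volumetric reading taken from the lid, the sketch below converts a measured distance to the waste surface into a fill fraction. The source does not state the sensing principle, so the distance-based measurement, the function names and the 80% collection threshold are assumptions.

```python
def fill_fraction(distance_to_waste_m: float, container_depth_m: float) -> float:
    """Fill level in [0, 1] from the lid-to-waste distance (assumed sensor output)."""
    if container_depth_m <= 0:
        raise ValueError("container depth must be positive")
    level = 1.0 - distance_to_waste_m / container_depth_m
    return min(1.0, max(0.0, level))  # clamp noisy readings


def needs_collection(fraction: float, threshold: float = 0.8) -> bool:
    """Hypothetical policy: schedule collection above the threshold."""
    return fraction >= threshold


half_full = fill_fraction(0.5, 1.0)
nearly_full = fill_fraction(0.1, 1.0)
```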

The sensor operates autonomously: once installed, it begins to report the volume of the container. It integrates batteries and a communications system, allowing it to upload data to the 360Waste platform. The exchange of information between the sensor and 360Waste also allows the sensor to receive behavioral settings regarding how and when to perform volumetric readings. Thus, there is in this process a direct dependence between the sensor and the 360Waste central data platform that cannot be removed. The CityAction platform is fed by several data sources; each source has its own technology and interconnection process. In this scenario, the CityAction platform will interconnect with the 360Waste platform via a web service.

4.5

Electrical Consumption

Energy monitoring is an essential tool when it is intended to characterize, over time, the energy consumption of a given building and, later, to evaluate the impact of the implementation of saving measures. As constituent parts of an energy monitoring system, it is important to note:


– Energy analyzer placed in the intended building;
– Communications gateway;
– Remote server with databases holding the configuration of each element of the system and storing the information collected by the sensors/analyzers;
– Smart city application for processing and presenting information to the user.
Figure 7 shows the connection between the building with the energy analyzer and the database server that stores the obtained data.

Fig. 7. Building with energy analyzer connected to the energy monitoring platform

The energy analyzer is connected to a communications gateway, which is responsible both for exchanging configuration information between the monitoring platform and the analyzer and for sending the analyzer values to the platform. The monitoring platform thus consists of two databases (one storing the values obtained and one storing the configuration parameters) and an API-type application. Within the CityAction project, any application developer who wishes to access the data collected by the energy analyzer should establish a connection to the Power Monitoring Platform API.

4.6

Smart Illumination

Public lighting plays a key role in the safety of people and goods, both by providing good visibility to those on the public highway and as a deterrent to illegal or criminal activities. Keeping a city illuminated during the night or in low-light periods has a very significant energy cost and a non-negligible ecological footprint. For this reason, it is worthwhile to adopt intelligent public lighting systems to achieve significant savings. The proposed intelligent public lighting system consists of:
– Illumination points with ambient light, current and presence/movement sensors and communication gateways;
– Lighting management platform, with an API-type application;
– App on the CityAction platform to check the history of measurements made and to diagnose faults.
Each lighting point consists of a current sensor (SC) for fault detection, a light sensor (SL), a motion sensor (SM) and a gateway to allow communication with the lighting management platform (Fig. 8).


Fig. 8. Point of illumination, with respective sensors and communications

Fig. 9. Example of using the intelligent lighting system

Figure 9 represents a situation where the lighting points remain off (green) until a vehicle is detected. In this case, the lighting points that the vehicle approaches light up automatically (blue in Fig. 9). The light sensor included in each unit ensures that the illumination is not switched on during the day, without requiring a preset schedule. Another feature to note is the ability of each lighting unit to measure the electrical current it is consuming, which makes it possible to detect faults in the luminaire. All identified events and the status of each lighting point are transmitted to the lighting management platform for storage.
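The behavior of a single lighting point can be sketched with two small decision functions, one combining the light and motion sensors and one using the current sensor for fault detection. The lux threshold, the nominal current and the tolerance are hypothetical values chosen for illustration, not figures from the project.

```python
def lamp_should_be_on(ambient_lux: float, motion_detected: bool,
                      dark_threshold_lux: float = 20.0) -> bool:
    """Switch on only when it is dark (light sensor) and something approaches."""
    return ambient_lux < dark_threshold_lux and motion_detected


def lamp_fault(lamp_on: bool, current_a: float,
               nominal_a: float = 0.45, tolerance: float = 0.3) -> bool:
    """Flag a fault when the measured current contradicts the lamp state."""
    if lamp_on:
        # An energized lamp should draw roughly its nominal current.
        return abs(current_a - nominal_a) > tolerance * nominal_a
    # A lamp that is off should draw almost nothing.
    return current_a > tolerance * nominal_a


night_with_vehicle = lamp_should_be_on(5.0, True)
daytime_with_vehicle = lamp_should_be_on(300.0, True)
burnt_out = lamp_fault(True, 0.0)
```

Using the light sensor rather than a clock is what removes the need for a preset schedule, as the text notes.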

5

Conclusions and Future Work

This document presents the first version of the CityAction project architecture. Based on the defined scenarios and the requirements extracted from them, the initial version of the system is designed to meet the needs of the “cities of the future”. Intelligent cities use information and communication technology to manage their urban space. The dissemination of sensors and actuators facilitates real-time knowledge of the city, enabling informed decisions supported by reliable data. The defined architecture proposes a set of functional entities that guarantee holistic management as well as a detailed view of what is happening at each moment, ensuring efficient management of municipal services. An added value of the proposed system lies in the possibility of correlating typically disjoint information. This approach makes it possible to perceive the impact that one system has on another. For example, it will be possible to see the connection between the number of cars and the levels of air pollution. The need to act on the different infrastructures was also taken into account, to guarantee human control of every one of the management systems.


To promote the economy and foster innovation, the system developed in the CityAction project supports the sharing of information among different entities. The use of these data by entities external to the municipality will boost the creation of new businesses and facilitate scrutiny of the city’s governance. The architecture designed in CityAction and its deployment on the ground are key elements in building the cities of the future for the benefit of citizens and a more sustainable society.

5.1

Requirements

Based on the scenarios identified in this document, this section defines the functional requirements of the CityAction service platform. The goal is to have a specification aligned with the needs and expectations of the end consumer/market. The functional requirements will then be mapped into technical requirements. In general, the following requirements of the CityAction services are highlighted:
– Dashboard of the city: To make it possible to visualize the state of the city in real time, to parameterize rules that cut across the different systems, and to act on the different urban systems.
– Interoperability: Ability to integrate information generated by different urban subsystems into a single city management center. This requires the definition of standardization requirements, particularly regarding the types of messages exchanged and access to information.
– Scalability: Property of the system whereby the various entities are independent and replicable, which allows them to be developed separately and allows the number of units of the same type, and consequently the size of the system, to grow without impairing its functionality.
– Sensor autonomy: The implemented solutions should, whenever possible, be energy efficient, either through the adoption of mechanisms to reduce their consumption or through the incorporation of energy generation and storage capacities. In this way, we intend to consider and eventually apply the concepts of energy harvesting and off-grid power.
– Information security: All entities of the system must be developed taking into account guarantees of integrity, confidentiality, availability and authenticity of the information. This is because the information collected, stored and processed by the system must be reliable and accessible to the system entities that need it, while being protected from unauthorized consultation or alteration. Access for consultation, editing and insertion of new data must be possible while ensuring that unauthorized intrusion or loss of information does not occur.
– APIs open to third parties: Without prejudice to the information security concerns described in the previous point, open data access may be granted for the development of new smart city applications. In this way, the concept of the CityAction ecosystem can be extended to the community.


At the global CityAction system level, the following requirements must also be mentioned:
– Integrative architecture of vertical systems for intelligent cities.
– Visualization of city data in a holistic way, parameterization of rules and real-time actuation in different vertical systems.
– Controlled exposure layer for communication between services in a Service Oriented Architecture (SOA) model.
– Inclusion of an open abstraction layer for the exchange of information between systems and IoT devices.
– Monetization of IoT events, allowing the creation of business models in the field of smart cities.
– Component for monitoring, analysis and correlation of events from IoT devices as well as from open sources (for example, social networks), able to trigger a set of tasks independently and automatically.
– Integration of the information provided by IoT devices with vertical management systems, allowing the exposure of raw data or information.

5.2

Future Work/Scenarios

With all of the technological advancements, the concept of a smart city is not as far-fetched as it once seemed. With the development of IoT and big data analysis, a new potential is emerging, allowing the evolution and improvement of urban spaces. In the context of the CityAction project, there are some future research topics that we would like to highlight:
– Apply new deep learning techniques to make decisions more autonomous, fast and efficient.
– Use multi-objective decision-making techniques.
– Act smartly in emergency situations (e.g., clear traffic on the path of an emergency ambulance).
– Implement blockchain processing across the city network to optimize decisions.

Acknowledgment. “This article is a result of the CityAction project CENTRO-01-0247-FEDER-017711, supported by Centro Portugal Regional Operational Program (CENTRO 2020), under the Portugal 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), and also financed by national funds through FCT - Fundação para a Ciência e Tecnologia, I.P., under the project UID/Multi/04016/2016. Furthermore, we would like to thank the Instituto Politécnico de Viseu for their support.” Additionally, we want to thank the project members: Nuno Gomes (Exatronic), Filipe Cabral Pinto (Altice Labs), Paulo Marques (IPCB and Allbesmart) and Hélio Silva (Evox).

References 1. Cloud city operations center. http://www.nec.com/en/global/solutions/govpublic. html. Accessed 30 Mar 2018

234

P. Martins et al.

2. Compta - smart-cities, Intelligent solutions for demanding cities. https://www.cebsolutions.com/pt-pt/solucoes/cidades-inteligentes/. Accessed 30 Mar 2018 3. European innovation partnership on smart-cities and communication. http://ec. europa.eu/eip/smartcities/index en.htm. Accessed 30 Mar 2018 4. The internet of things becomes the internet that thinks with Watson IoT. https:// www.ibm.com/internet-of-things?lnk=mpr iot&lnk2=learn. Accessed 30 Mar 2018 5. IoT analytics explained. http://www.agtinternational.com/iot-analytics-inaction/iot-analytics-explained/. Accessed 30 Mar 2018 6. Leading new ICT, creating a smart city nervous system. http://e.huawei.com/en/ solutions/technical/smart-city. Accessed 30 Mar 2018 7. Microsoft citynext. https://enterprise.microsoft.com/en-us/industries/citynext/# fbid=xWPGkEusJWO. Accessed 30 Mar 2018 8. Phillips with SAP HANA. http://www.lighting.philips.com/main/inspiration/ smart-cities/lighting-in-smart-cities/smart-city-ecosystem-collaboration-saphana#. Accessed 30 Mar 2018 9. Schneider-electric, smart-cities challenges. https://www.schneider-electric.com/ en/work/solutions/for-business/smart-cities/challenges.jsp. Accessed 30 Mar 2018 10. Siemens - synergy city. https://www.siemens.com/innovation/en/home/picturesof-the-future/infrastructure-and-finance/smart-cities-city-intelligence-platform. html. Accessed 30 Mar 2018 11. Cisco IoT system, November 2017. http://www.cisco.com/c/m/en us/solutions/ internet-of-things/iot-system.html. Accessed 30 Mar 2018 12. Adeleke, J.A., Moodley, D., Rens, G., Adewumi, A.O.: Integrating statistical machine learning in a semantic sensor web for proactive monitoring and control. Sensors 17(4), 807 (2017) 13. ANEXO I: Cidades sustent´ aveis 2020 (2015) 14. Bellavista, P., Giannelli, C., Zamagna, R.: The pervasive environment sensing and sharing solution. Sustainability 9(4), 585 (2017) 15. 
Billaud, S., Daclin, N., Chapurlat, V.: Interoperability as a key concept for the control and evolution of the system of systems (SoS). In: International IFIP Working Conference on Enterprise Interoperability, pp. 53–63. Springer, Heidelberg (2015) ´ 16. Caballero-Aguila, R., Hermoso-Carazo, A., Linares-P´erez, J.: Optimal fusion estimation with multi-step random delays and losses in transmission. Sensors 17(5), 1151 (2017) 17. Choi, H.-S., Rhee, W.-S.: IoT-based user-driven service modeling environment for a smart space management system. Sensors 14(11), 22039–22064 (2014) 18. Correia, L.M., W¨ unstel, K.: Smart cities applications and requirements. White Paper. Net (2011) 19. Coulson, G., Blair, G., Elkhatib, Y., Mauthe, A.: The design of a generalised approach to the programming of systems of systems. In: 2015 IEEE 16th International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), pp. 1–6. IEEE (2015) 20. Curry, E.: System of systems information interoperability using a linked dataspace. In: 2012 7th International Conference on System of Systems Engineering (SoSE), pp. 101–106. IEEE (2012) 21. Delicato, F.C., Pires, P.F., Batista, T., Cavalcante, E., Costa, B., Barros, T.: Towards an IoT ecosystem. In: Proceedings of the First International Workshop on Software Engineering for Systems-of-Systems, pp. 25–28. ACM (2013) 22. Dutton, W.H., Blumler, J.G., Kraemer, K.L.: Wired Cities: Shaping the Future of Communications. GK Hall & Co., Boston (1987)

CityAction a Smart-City Platform Architecture

235

23. Floris, A., Atzori, L.: Managing the quality of experience in the multimedia internet of things: a layered-based approach. Sensors 16(12), 2057 (2016) 24. Govoni, M., Michaelis, J., Morelli, A., Suri, N., Tortonesi, M.: Enabling social-and location-aware IoT applications in smart cities. In: International Conference on Smart Objects and Technologies for Social Good, pp. 305–314. Springer, Heidelberg (2016) 25. Grace, P., Bromberg, Y.-D., R´eveill`ere, L., Blair, G.: OverStar: an open approach to end-to-end middleware services in systems of systems. In: ACM/IFIP/USENIX International Conference on Distributed Systems Platforms and Open Distributed Processing, pp. 229–248. Springer, Heidelberg (2012) 26. Guijarro, L., Pla, V., Vidal, J.R., Naldi, M., Mahmoodi, T.: Wireless sensor network-based service provisioning by a brokering platform. Sensors 17(5), 1115 (2017) 27. Guo, Y., Chen, X., Wang, S., Sun, R., Zhao, Z.: Wind turbine diagnosis under variable speed conditions using a single sensor based on the synchrosqueezing transform method. Sensors 17(5), 1149 (2017) 28. Hu, Q., Wang, S., Bie, R., Cheng, X.: Social welfare control in mobile crowdsensing using zero-determinant strategy. Sensors 17(5), 1012 (2017) 29. Lea, R., Blackstock, M.: City hub: a cloud-based IoT platform for smart cities. In: 2014 IEEE 6th International Conference on Cloud Computing Technology and Science (CloudCom), pp. 799–804. IEEE (2014) 30. Liu, J., Hu, Y., Wu, B., Wang, Y., Xie, F.: A hybrid generalized hidden Markov model-based condition monitoring approach for rolling bearings. Sensors 17(5), 1143 (2017) 31. Lopes, F., Loss, S., Mendes, A., Batista, T., Lea, R.: SoS-centric middleware services for interoperability in smart cities systems. In: Proceedings of the 2nd International Workshop on Smart, p. 4. ACM (2016) 32. 
Lytra, I., Engelbrecht, G., Schall, D., Zdun, U.: Reusable architectural decision models for quality-driven decision support: a case study from a smart cities software ecosystem. In: 2015 IEEE/ACM 3rd International Workshop on Software Engineering for Systems-of-Systems (SESoS), pp. 37–43. IEEE (2015) 33. Mitton, N., Papavassiliou, S., Puliafito, A., Trivedi, K.S.: Combining cloud and sensors in a smart city environment (2012) 34. Petrolo, R., Loscri, V., Mitton, N.: Towards a smart city based on cloud of things. In: Proceedings of the 2014 ACM International Workshop on Wireless and Mobile Technologies for Smart Cities, pp. 61–66. ACM (2014) 35. Rahmani, A.M., Gia, T.N., Negash, B., Anzanpour, A., Azimi, I., Jiang, M., Liljeberg, P.: Exploiting smart e-health gateways at the edge of healthcare internetof-things: a fog computing approach. Future Gener. Comput. Syst. 78, 641–658 (2018) 36. Rashid, B., Rehmani, M.H.: Applications of wireless sensor networks for urban areas: a survey. J. Netw. Comput. Appl. 60, 192–219 (2016) 37. Saikar, A., Parulekar, M., Badve, A., Thakkar, S., Deshmukh, A.: Trafficintel: smart traffic management for smart cities. In: 2017 International Conference on Emerging Trends & Innovation in ICT (ICEI), pp. 46–50. IEEE (2017) 38. Saleem, Y., Crespi, N., Rehmani, M.H., Copeland, R., Hussein, D., Bertin, E.: Exploitation of social IoT for recommendation services. In: 2016 IEEE 3rd World Forum on Internet of Things (WF-IoT), pp. 359–364. IEEE (2016) 39. Villanueva, F.J., Santofimia, M.J., Villa, D., Barba, J., Lopez, J.C.: Civitas: the smart city middleware, from sensors to big data. In: 2013 Seventh International

236

P. Martins et al.

Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 445–450 (IMIS) 40. Wu, Z., Xu, Y., Yang, Y., Zhang, C., Zhu, X., Ji, Y.: Towards a semantic web of things: a hybrid semantic annotation, extraction, and reasoning framework for cyber-physical system. Sensors 17(2), 403 (2017)

Simplified Neural Networks with Smart Detection for Road Traffic Sign Recognition

Wei-Jong Yang, Chia-Chun Luo, Pau-Choo Chung, and Jar-Ferr Yang

Department of Electrical Engineering, Institute of Computer and Communication Engineering, National Cheng Kung University, Tainan, Taiwan

Abstract. Improving driver safety is the main goal of advanced driver assistance systems, which have been widely deployed for proactive driving security in recent years. For road driving, an advanced driver assistance system should visually recognize circular prohibition and triangular warning traffic signs to help drivers grasp the complete traffic conditions. In this paper, we propose a low-computation neural assistance system for traffic sign recognition. First, we propose shape-based detection algorithms to detect regions containing circular and triangular traffic signs within designated regions of interest. To classify the detected regions, we then suggest a convolutional neural network that achieves about a 5% improvement in top-1 accuracy over the LeNet model on the German Traffic Sign Recognition Benchmark (GTSRB) dataset. For real applications, we also establish a Taiwanese traffic sign database to train the proposed neural network. Simulation results on self-collected driving videos demonstrate that the proposed traffic sign recognition system achieves a recognition rate above 97% and can be effectively adopted in ADAS applications.

Keywords: Advanced driver assistance system · Traffic sign detection · Traffic sign recognition · Convolutional neural network

1 Introduction

Improving the safety of the driving environment is always an important issue for drivers. Every year, many traffic accidents cause injuries and property damage. In order to avoid accidents, the development of intelligent transportation systems has become an active research trend in the automobile industry. The advanced driver assistance system (ADAS) is designed to improve vehicle safety and reduce traffic accidents. To reduce the driver’s load, an automatic traffic sign recognition (TSR) system is one of the key technologies in intelligent vehicles. Generally, an automatic TSR system is composed of traffic sign detection and traffic sign classification. Currently, deep neural networks [1–3] can be used to achieve this goal; however, they require so much computation that a low-cost ADAS becomes impossible. To cover all traffic signs, we need to detect circular and triangular regions first. With the detected regions, we can apply a purpose-designed convolutional neural network (CNN) for classification with low computation for real-time applications. To achieve

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 237–249, 2020. https://doi.org/10.1007/978-3-030-12388-8_17


W.-J. Yang et al.

lower computation and higher performance, the proposed TSR system contains a simple detection unit of active regions and a classification unit with a learning neural network.

2 Related Works

Traffic signs are characterized by uniform, simple shapes and fixed, known colors. Thus, most TSR algorithms consist of detection and classification stages. Candidate detection approaches are mostly based on these natural properties. For detection, some methods use color features for image segmentation [4–7]. Color-based sign detection methods frequently suffer from decreased detection rates when the scene undergoes illumination changes or varying weather conditions. To avoid such problems, some candidate detection methods consider only the shape information of traffic signs in grayscale images [8]. Barnes et al. used the fast radial symmetry detector to detect circular signs effectively and with low false-negative rates [9]. Other detection methods combine the natural properties of color and shape [5, 10]. Some approaches apply HOG features [30, 31] for traffic sign feature extraction and use a support vector machine (SVM) over sliding windows of different sizes to locate the sign [11–13]. In particular, Xiao et al. not only use the HOG descriptor with an SVM but also a Boolean CNN [14]: HOG with SVM detects potential regions, and the Boolean CNN checks the detected regions to reject false positives. For classification, the classifiers, which are mostly trained on hand-labeled training data, can be support vector machines (SVM) [6, 11], k-d trees [15], random forests [16, 17], AdaBoost [18] and neural networks [19, 27, 28, 32]. With deep learning schemes, some recent approaches use a CNN to recognize the content of road signs [29]. Yang et al. [20] proposed an automatic traffic sign detection and recognition system. They used a CNN to classify each subclass of traffic sign in the classification stage; in other words, the system involves three CNNs simultaneously. They perform a coarse classification in the detection stage before the actual CNN classification. Hu et al. [21] proposed a novel branch convolutional neural network (BCNN) to accelerate testing by analyzing the losses of each branch classifier. The core concept of BCNN is early stopping during testing: starting from a pre-trained CNN model, the BCNN predicts some traffic sign samples at earlier layers rather than at the original final output layer. There is also a complete approach that uses an end-to-end CNN for both traffic sign detection and classification, detecting and classifying traffic signs at the same time [2]. However, training such a CNN requires labeling not only the traffic sign classes but also the bounding box positions and pixel-wise labels, whereas training a CNN classifier requires only the class labels. It is hard to collect a large amount of such complex training data of different types. In this paper, the proposed system is based on shape-based detection for computation reduction. We then suggest a new CNN architecture for classification such that the proposed traffic sign recognition system can reduce the computation and achieve high performance. The proposed system can then be realized with a low-cost chip to

Simplified Neural Networks with Smart Detection for Road TSR


meet the ADAS requirements for intelligent vehicles. By using the smart detection techniques, we can find the target regions to reduce the computation load. By deep learning under limited regions, we can achieve more robust classification in real environments.

3 The Proposed TSR System

As shown in Fig. 1, the proposed traffic sign recognition (TSR) system involves detection, a state machine and a convolutional neural network for recognition of the traffic sign. In traffic sign detection, we consider the visual characteristics of the sign by referring to sign shape and color. When the system detects candidates and enters the Alarm state, we then use the convolutional neural network to recognize the traffic sign. The details of the proposed TSR system are described in the following subsections.

Fig. 1. Flow chart of the proposed TSR system (blocks: Input Image, ROI Regions, Detection State, Color Transform, Edge Detection, Circle Detection, Triangular Detection, Feature Extraction, Active Block Detection, Warning State, Alarm State, GTSRB, Convolutional Neural Network, Validation, Output Decision)

3.1

Regions of Interest

In the proposed traffic sign recognition (TSR) system, the regions of interest (ROIs) shown in Fig. 2 are selected. In Taiwan, traffic signs appear on the right side of the main road. Three ROIs in the input frame are selected to cover traffic signs at different distances in practical applications. Searching all three ROIs in every frame would still cost too much computation. Because a traffic sign moves from far to near across consecutive images, we exploit this fact to design a detection state machine that reduces the search and improves detection performance. The state machine contains three states: detection, warning and alarm.


Fig. 2. Selected ROI regions for traffic sign detection

3.2

Traffic Sign Detection

Apart from contour information, the shape of a traffic sign is a significant characteristic. Thus, two shape-based detection algorithms are proposed in this TSR system. Traffic signs are roughly divided into red circular prohibition signs, blue circular information signs and red triangular signs. We first use a Sobel filter to find the edges Edge(x, y) in the three ROIs. For circle detection, we search for circles on Edge(x, y), the binary gradient map. The traditional Hough circle transform searches for circles from every edge pixel and votes for the best center. In contrast, we propose a polar-coordinate search method, which slides a candidate center and checks whether reliable edges exist on the binary gradient map along the circumference given by:

$$ x = x_0 + r \sin\theta, \qquad y = y_0 + r \cos\theta \tag{1} $$

where $(x_0, y_0)$ are the coordinates of the circle center, $r$ is the radius, and $\theta \in [0, 2\pi]$. The proposed circle detection searches for circles in polar coordinates over the center and radius. Because the radius values are ranged, we can estimate the size of the circle beforehand in each ROI. The values of $\sin\theta$ and $\cos\theta$ are stored in a lookup table in advance, and the values of (x, y) are rounded to integers. As a result, the proposed circle detection is not sensitive to yaw rotation of the traffic sign, whereas the Hough circle transform is too sensitive to this condition.

The flow chart of triangular detection is shown in Fig. 3. Triangular detection is also based on Edge(x, y). From a geometric point of view, we can detect and classify triangles with three fixed slopes, which are 0 (HL), 60 (RU) and −60 (LU) degrees with respect to the y-axis. We therefore scan every slope along its normal direction to find the position with the maximum count of edge points. We scan the two fixed slanted slopes together with the dominant horizontal line to find the y-intercepts of the three lines with the maximum edge counts. We then use Cramer's rule to determine the intersection points, which are the triangle vertices. Note that relying on the maximum counts allows triangles of any size to be detected.

In real environments, the image can be affected by illumination changes, which cause large color distortion. To address this issue, we suggest a color transform to normalize the color. Figure 4 shows example images of the transformed colors with respect to the RGB color image. The transformed red and blue colors, XR and XB, are used to detect red and blue, respectively.
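As a rough illustration of the polar-coordinate search in Eq. (1), the following Python sketch slides a candidate center over a binary edge map and scores each candidate radius by the fraction of sampled circumference points that fall on edge pixels. The function names, the 64-angle sampling, the acceptance threshold `min_score` and the exhaustive scan over all centers are assumptions made for illustration; the paper does not specify them.

```python
import math

N_ANGLES = 64  # assumed angular sampling density
# Lookup tables for sin(theta) and cos(theta), built once as described in the text.
SIN = [math.sin(2.0 * math.pi * k / N_ANGLES) for k in range(N_ANGLES)]
COS = [math.cos(2.0 * math.pi * k / N_ANGLES) for k in range(N_ANGLES)]


def circle_points(x0, y0, r):
    """Integer circumference points from Eq. (1): x = x0 + r sin, y = y0 + r cos."""
    return [(int(round(x0 + r * SIN[k])), int(round(y0 + r * COS[k])))
            for k in range(N_ANGLES)]


def circle_score(edge, x0, y0, r):
    """Fraction of sampled circumference points that land on edge pixels."""
    height, width = len(edge), len(edge[0])
    hits = 0
    for x, y in circle_points(x0, y0, r):
        if 0 <= x < width and 0 <= y < height and edge[y][x]:
            hits += 1
    return hits / N_ANGLES


def find_circle(edge, radii, min_score=0.8):
    """Slide the center over the map; return the best (x0, y0, r, score) or None."""
    best = None
    for y0 in range(len(edge)):
        for x0 in range(len(edge[0])):
            for r in radii:
                score = circle_score(edge, x0, y0, r)
                if score >= min_score and (best is None or score > best[3]):
                    best = (x0, y0, r, score)
    return best


# Synthetic 41 x 41 edge map containing one circle of radius 10 centered at (20, 20).
edge_map = [[0] * 41 for _ in range(41)]
for px, py in circle_points(20, 20, 10):
    edge_map[py][px] = 1

result = find_circle(edge_map, radii=range(8, 13))
```

Restricting `radii` to a small range mirrors the remark that the circle size can be estimated beforehand in each ROI, which keeps the scan cheap.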

[Figure: flow chart with blocks Image, Edge(x, y), Scan Slope Count, Threshold, Vertex Compute, Detected Triangle and Discard.]

Fig. 3. Flow chart of triangular detection
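The vertex computation step can be sketched as the intersection of two detected lines solved with Cramer's rule. This is a minimal illustration: the line representation `a*x + b*y = c`, the helper names and the example triangle (slanted edges with slopes of ±tan 60° in image coordinates) are assumptions, not the authors' implementation.

```python
import math


def intersect(line1, line2, eps=1e-12):
    """Intersect two lines a*x + b*y = c with Cramer's rule.

    Returns (x, y), or None when the lines are (nearly) parallel.
    """
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    det = a1 * b2 - a2 * b1
    if abs(det) < eps:
        return None
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


def line_from_slope(slope, intercept):
    """Rewrite y = slope * x + intercept as (a, b, c) with a*x + b*y = c."""
    return (-slope, 1.0, intercept)


# Hypothetical triangle: two slanted edges through the origin and a horizontal base.
SQRT3 = math.tan(math.radians(60.0))
right_edge = line_from_slope(SQRT3, 0.0)
left_edge = line_from_slope(-SQRT3, 0.0)
base = line_from_slope(0.0, 10.0 * SQRT3)

apex = intersect(right_edge, left_edge)
right_vertex = intersect(base, right_edge)
left_vertex = intersect(base, left_edge)
```

With y-intercepts taken from the slope scans, the same three calls would give the three triangle vertices.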


Fig. 4. Color transformed images of: (a) RGB; (b) XR; (c) XB and (d) XG

3.3

Traffic Sign Recognition

The purpose of traffic sign recognition is to classify traffic signs into their specific classes. Because the HOG feature is hand-coded, an SVM classifier built on it may not be robust enough in special cases. To achieve robust recognition, we instead make use of the convolutional neural network (CNN). CNNs have achieved good recognition rates in many classification competitions. Their advantage is that the model's input is a raw image rather than hand-coded features. The proposed CNN architecture, shown in Fig. 5, combines LeNet-5 and GoogLeNet. We aim to inherit the advantages of both models, since each performs excellently in its task domain. In a traffic sign recognition system for ADAS in intelligent vehicles, real-time operation is critical, and a real-time system is also more likely to be used in practice. Thus, a low computation load is another important factor to consider. For these reasons, we initially applied the original LeNet-5 model with all of its original settings, such as grayscale raw data and the original input size of the training images. However, the accuracy of the original LeNet-5 model was unsatisfactory, so improving the classification accuracy became the primary issue. GoogLeNet achieves high accuracy in classification tasks, but it is too deep and uses input images far larger than those of LeNet-5 in training. This does not match our purpose, because of the limited resolution of traffic signs captured by a normal camera. Combining the advantages of LeNet-5 and GoogLeNet is therefore a good choice for us. We modify the CNN architecture based on the LeNet-5 model and add the concept of the inception module to improve the accuracy. In the proposed CNN architecture, we regard color information as a crucial characteristic; using only a grayscale input image is insufficient and not robust. For these reasons, the CNN takes color images as input, and we resize all input images to a fixed size, since the inputs of the CNN must have the same resolution.

Fig. 5. The proposed CNN architecture

To reduce the number of parameters of the proposed CNN, we employ 1 × 1 convolutions before the convolutional layers with other kernel sizes in each inception layer. Each inception layer applies convolutions with different kernel sizes to the output of the previous layer to produce many feature maps. All of these feature maps are then concatenated, so they must share the same size. The total numbers of filter banks after concatenation in the three inception modules are 61, 200 and 400, respectively. In the first inception module, the numbers of filter banks are 20, 20, 20 and 1 for the 1 × 1, 3 × 3 and 5 × 5 convolutions and max-pooling, respectively. In the second inception module, the number of filter banks is 45 in each kernel convolution: 1 × 1, 3 × 3 reduce, 3 × 3, 5 × 5 reduce, 5 × 5 and pool projection, respectively. In the third inception module, the number of filter banks is 100 in each kernel convolution: 1 × 1, 3 × 3 reduce, 3 × 3, 5 × 5 reduce, 5 × 5 and pool projection, respectively. In our CNN architecture, the pooling layers mainly apply max-pooling to subsample the feature maps. The first fully connected layer outputs a vector of length 500, and the output length of the second fully connected layer equals the number of classes.


Finally, our system predicts the category of every detected potential candidate in each frame. However, showing the prediction for every candidate in every frame may bother the driver when a false prediction occurs. Thus, we consider the predictions for the potential region candidates over consecutive frames. Giving every classification result equal weight is too dangerous; therefore, our decision refers to the prediction probabilities. The average score of each class is described mathematically as:

$$ S_k = \frac{1}{N} \sum_{n=1}^{N} P_k(R_n), \quad k \in [1, 2, 3, \ldots, K] \tag{2} $$

where $k$ is the class index and $R_n$ is the $n$th potential region candidate.
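A minimal sketch of Eq. (2) is given below. It assumes that $P_k(R_n)$ is the class-probability vector produced by the CNN for candidate $R_n$, and it assumes that the final decision picks the class with the highest average score; the paper states only that the decision refers to the prediction probabilities. The function names and the probability values are made up for illustration.

```python
def average_scores(candidate_probs):
    """Eq. (2): S_k = (1/N) * sum_n P_k(R_n) for every class k."""
    n = len(candidate_probs)
    num_classes = len(candidate_probs[0])
    return [sum(p[k] for p in candidate_probs) / n for k in range(num_classes)]


def decide(candidate_probs):
    """Assumed decision rule: report the class with the highest average score."""
    scores = average_scores(candidate_probs)
    best = max(range(len(scores)), key=lambda k: scores[k])
    return best, scores[best]


# Three candidates of one sign across consecutive frames, two classes (made-up values).
probs = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
label, score = decide(probs)
```

Here the first frame alone would vote for class 0, but the averaged scores favor class 1, which is the kind of single-frame error the averaging is meant to suppress.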

4 Experimental Results

The experimental videos are captured by a Chimei-Motor camera with resolution and installed on the front window inside the vehicle. The system is implemented with Visual Studio 2012 and OpenCV 2.4.9, a library of programming functions. The CNN training is performed with NVIDIA DIGITS [22] and Caffe [23].

4.1

Performances of Traffic Sign Detection

We tested on 13 videos with 10721 frames in total, and the experimental results are shown in Tables 1 and 2. To evaluate the performance of the proposed circle detection, the detection rate and false positive rate are defined as:

$$ \text{Detection Rate} = \frac{\text{Number of Traffic Signs Detected}}{\text{Number of Traffic Signs inside ROI in Total Frames}} \tag{3} $$

$$ \text{False Positive Rate} = \frac{\text{Number of False Positives}}{\text{Number of Total Frames}} \tag{4} $$

Table 1. Comparisons of different circle detection methods

Methods                      Detection rate (%)   False positive rate (%)
Proposed circle detection    94.44                3.35
Hough circle transform       88.89                1.10

Table 2. Comparisons of our different triangular detection methods

Methods                            Detection rate (%)   False positive rate (%)
Proposed triangular detection      100.00               0.93
Proposed triangular detection (a)  100                  1.73

(a) With sliding windows


We compare the proposed circle detection method with the Hough circle transform. Both methods can detect circle candidates. Table 1 shows that the proposed method achieves a higher detection rate than the Hough circle transform (94.44% versus 88.89%), at the cost of a higher false positive rate (3.35% versus 1.10%). The reason is that circular traffic signs are usually viewed with some rotation, and the Hough circle transform is too sensitive to such rotated views. Thus, the proposed circle detection method is more suitable for real applications. Table 2 shows the performance of the triangular detection method. We find that the proposed triangular detection with sliding windows is better. Without sliding windows, triangular detection is performed only once per ROI, and it may find a wrong y-intercept of the scan line, because the edges at that y-intercept may not belong to the triangular traffic sign within the rectangular ROI. For this reason, we apply the proposed triangular detection with sliding windows.

4.2

Classification Performances of CNN Architectures

A. Test on GTSRB

To verify the performance of the proposed CNN against well-known CNN architectures, we test all the architectures on a public online database, the German Traffic Sign Recognition Benchmark (GTSRB) [24]. Sample images of GTSRB traffic signs are shown in Fig. 6.

Fig. 6. Sampled images of traffic signs in GTSRB

The GTSRB database has 43 classes of traffic signs and more than 50000 images from real environments in total, and the size of the raw traffic sign images varies between 15 × 15 and 222 × 193 pixels. Data augmentation increases the amount of training data and provides a wider variety of training conditions. Data for specific conditions are sometimes difficult to collect, so synthesizing data is one way to obtain them. We additionally apply three types of data augmentation: random scaling by a factor from 1 to 1.2, random translation by 1% to 10% of each side length, and random rotation in the range (−5°, 5°). Table 3 shows the performances of different CNN architectures. In the experiments, we set the learning rate to 0.0005, use Adam [25] as the solver, and use 28 × 28 pixel color images as input. The label "LeNet-5 + Conv." denotes LeNet-5 with one additional convolution layer. Based on the results in Table 3, we choose the CNN architecture for the proposed traffic sign recognition system.
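The three augmentations described above can be sketched as sampling one set of parameters per training image and composing them into a single affine transform. In this sketch the uniform sampling, the random sign of the translation, the rotation about the image center and all function names are assumptions; the paper gives only the parameter ranges.

```python
import math
import random


def sample_augmentation(rng):
    """Draw scale, rotation (degrees) and fractional shifts from the stated ranges."""
    scale = rng.uniform(1.0, 1.2)
    angle_deg = rng.uniform(-5.0, 5.0)
    # The shift direction is not stated in the text; a random sign is assumed here.
    tx_frac = rng.uniform(0.01, 0.10) * rng.choice((-1, 1))
    ty_frac = rng.uniform(0.01, 0.10) * rng.choice((-1, 1))
    return scale, angle_deg, tx_frac, ty_frac


def affine_matrix(scale, angle_deg, tx_frac, ty_frac, width, height):
    """2 x 3 matrix that scales and rotates about the image center, then shifts."""
    a = math.radians(angle_deg)
    c, s = scale * math.cos(a), scale * math.sin(a)
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    tx, ty = tx_frac * width, ty_frac * height
    return [[c, -s, cx - c * cx + s * cy + tx],
            [s, c, cy - s * cx - c * cy + ty]]


def apply_affine(matrix, x, y):
    """Map a source pixel coordinate through the 2 x 3 matrix."""
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


rng = random.Random(0)
params = sample_augmentation(rng)
matrix = affine_matrix(*params, width=28, height=28)
```

The resulting matrix could be passed to any image-warping routine; with a 28 × 28 input, a 10% shift corresponds to 2.8 pixels.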

Table 3. Performance comparisons of various CNN architectures (GTSRB)

CNN architectures           TOP 1 (%)   TOP 5 (%)
LeNet-5                     92.68       97.98
LeNet-5 + Conv.             93.29       98.14
LeNet-5 + inception         95.55       96.36
Proposed architecture_28    97.67       99.71

We compare the accuracy of the proposed CNN architecture with that of other methods on the GTSRB database; the results of the other methods come from the GTSRB website, as shown in Table 4. "Proposed architecture_56", "Proposed architecture_32" and "Proposed architecture_28" denote training with 56 × 56, 32 × 32 and 28 × 28 input sizes, respectively. As expected, a larger input size yields higher accuracy, because it provides more information. With color images, the accuracy of the proposed CNN architecture comes close to human performance [1], whereas the results of [2, 26] were both obtained with grayscale images. The experimental result achieved by [1] is higher than that of our proposed CNN architecture because it employs an even larger number of filters in the final convolution stages than ours.

Table 4. Comparison of the accuracy of different methods on GTSRB

Method                      Accuracy (%)
Human performance [1]       98.84
Proposed architecture_56    98.39
Multi-scale CNNs [2]        98.31
Proposed architecture_32    97.79
Proposed architecture_28    97.67
Random forests [16]         96.14
LDA on HOG 2 [1]            95.68
LDA on HOG 1 [1]            93.18
LDA on HOG 3 [3]            92.34

B. Test on Taiwanese Traffic Sign Database

In Taiwan, the classes of traffic signs differ from those in GTSRB. However, no public Taiwanese traffic sign database is available. To solve this problem, we collected Taiwanese traffic signs from videos recorded by a car recorder in real driving environments. We built the dataset and divided it mainly into eight classes: speed limit 25, speed limit 40, speed limit 50, speed limit 60, speed limit 90, speed limit 100, speed limit 110, and non-speed-limit signs, as shown in Fig. 7. Originally, the dataset was established mainly to recognize the content of speed limit signs. It also contains many other types of traffic signs, but only a few samples of each have been collected so far. In CNN training, unbalanced data make it hard to learn the classes that have only a few training samples. We therefore group the traffic signs that have only a few training samples into a single class. Table 5 shows the classification results. The proposed architecture_28 performs the best.

Fig. 7. Classes of traffic sign in Taiwanese traffic sign

Table 5. Comparisons of different CNN architectures on the Taiwanese traffic sign testing database

CNN architecture            TOP 1 (%)   TOP 2 (%)   TOP 3 (%)
LeNet-5                     95.36       97.25       98.20
LeNet-5 + Conv.             96.39       97.82       98.15
LeNet-5 + inception         96.69       98.18       98.59
Proposed architecture_28    97.85       98.30       98.66

C. Fusion Version

In practical use, the proposed TSR system may fail for a few frames. To evaluate the performance, the success rate, false alarm rate and candidate accuracy rate are defined as:

$$ \text{Success Rate} = \frac{\text{Number of Traffic Sign Correct Recognition}}{\text{Number of Traffic Sign in Total Videos}} \tag{6} $$

$$ \text{False Alarm Rate} = \frac{\text{Number of None Traffic Sign False Positive}}{\text{Number of Traffic Sign in Total Videos}} \tag{7} $$

$$ \text{Candidate Accuracy Rate} = \frac{\text{Number of Traffic Sign Candidate Correct Recognition}}{\text{Number of Potential Traffic Sign Candidate Region}} \tag{8} $$

The final performance of the fusion version of the proposed TSR system is shown in Table 6. The proposed TSR system achieves a 95% success rate with a zero false alarm rate, which avoids annoying false warnings. The overall candidate recognition accuracy is 92.79%.


Table 6. Performance of fusion version of proposed TSR system

Method                 Success (%)   False alarm (%)   Candidate accuracy (%)
Proposed TSR system    95.00         0.00              92.79

5 Conclusions

In this paper, we proposed a traffic sign recognition (TSR) system that uses traditional image processing techniques to reduce the computation load of traffic sign detection. First, we set the ROIs and apply the traffic sign detection methods to find the target block images. Then, a state-of-the-art deep neural network for classification is applied only to the detected blocks to achieve robust recognition. For traffic sign detection, we proposed simple circle detection and triangular detection algorithms. Experimental results verify that the proposed detection system achieves a better detection rate than the other methods. Furthermore, we proposed a new CNN architecture that combines the LeNet-5 structure with Google's inception modules and improves top-1 accuracy by almost 5% over the LeNet model on the German Traffic Sign Recognition Benchmark (GTSRB) testing dataset. For practical use, we also established a small Taiwanese traffic sign database, which contains fewer image samples than the GTSRB database. The results on it also show that the proposed TSR system achieves better performance. We will keep collecting image samples to further improve the performance.

Acknowledgements. This work was supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 105-2221-E-006-065-MY3.

References

1. Stallkamp, J., Schlipsing, M., Salmen, J., Igel, C.: Man vs computer: benchmarking machine learning algorithms for traffic sign recognition. Neural Netw. 32, 323–332 (2012)
2. Sermanet, P., LeCun, Y.: Traffic sign recognition with multi-scale convolutional networks. In: International Joint Conference on Neural Networks, pp. 2809–2813 (2011)
3. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
4. Fleyeh, H.: Color detection and segmentation for road and traffic signs. In: IEEE Conference on Cybernetics and Intelligent Systems, vol. 2, pp. 809–814 (2004)
5. Bahlmann, C., Zhu, Y., Ramesh, V., Pellkofer, M., Koehler, T.: A system for traffic sign detection, tracking, and recognition using color, shape, and motion information. In: Proceedings of the Intelligent Vehicles Symposium, pp. 255–260 (2005)
6. Maldonado-Bascón, S., Lafuente-Arroyo, S., Gil-Jimenez, P., Gómez-Moreno, H., López-Ferreras, F.: Road-sign detection and recognition based on support vector machines. IEEE Trans. Intell. Transp. Syst. 8(2), 264–278 (2007)
7. Shadeed, W., Abu-Al-Nadi, D.I., Mismar, M.J.: Road traffic sign detection in color images. In: 2003 Proceedings of the IEEE International Conference on Electronics, Circuits and Systems, vol. 2, pp. 890–893 (2003)
8. Loy, G., Barnes, N.: Fast shape-based road sign detection for a driver assistance system. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 1, pp. 70–75 (2004)
9. Barnes, N., Zelinsky, A., Fletcher, L.S.: Real-time speed sign detection using the radial symmetry detector. IEEE Trans. Intell. Transp. Syst. 9(2), 322–332 (2008)
10. Kim, S., Kim, S., Uh, Y., Byun, H.: Color and shape feature-based detection of speed sign in real-time. In: Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, pp. 663–666 (2012)
11. Adam, A., Ioannidis, C.: Automatic road sign detection and classification based on support vector machines and HOG descriptors
12. Overett, G., Petersson, L.: Large scale sign detection using HOG feature variants. In: IEEE Intelligent Vehicles Symposium IV, pp. 326–331 (2011)
13. Greenhalgh, J., Mirmehdi, M.: Real-time detection and recognition of road traffic signs. IEEE Trans. Intell. Transp. Syst. 13(4), 1498–1506 (2012)
14. Xiao, Z., Yang, Z., Geng, L., Zhang, F.: Traffic sign detection based on histograms of oriented gradients and Boolean convolutional neural networks. In: International Conference on Machine Vision and Information Technology, pp. 111–115 (2017)
15. Zaklouta, F., Stanciulescu, B.: Real-time traffic-sign recognition using tree classifiers. IEEE Trans. Intell. Transp. Syst. 13(4), 1507–1514 (2012)
16. Zaklouta, F., Stanciulescu, B., Hamdoun, O.: Traffic sign classification using k-d trees and random forests. In: International Joint Conference on Neural Networks, pp. 2151–2155 (2011)
17. Greenhalgh, J., Mirmehdi, M.: Traffic sign recognition using MSER and random forests. In: Proceedings of the European Signal Processing Conference (EUSIPCO), pp. 1935–1939 (2012)
18. Xu, Y., Wang, Q., Wei, Z., Ma, S.: Traffic sign recognition based on weighted ELM and AdaBoost. Electron. Lett. 52(24), 1988–1990 (2016)
19. Kouzani, A.Z.: Road-sign identification using ensemble learning. In: IEEE Intelligent Vehicles Symposium, pp. 438–443 (2007)
20. Yang, Y., Luo, H., Xu, H., Wu, F.: Towards real-time traffic sign detection and classification. IEEE Trans. Intell. Transp. Syst. 17(7), 2022–2031 (2016)
21. Hu, W., Zhuo, Q., Zhang, C., Li, J.: Fast branch convolutional neural network for traffic sign recognition. IEEE Intell. Transp. Syst. Mag. 9(3), 114–126 (2017)
22. NVIDIA, Deep Learning GPU Training System NVIDIA DIGITS. https://github.com/NVIDIA/DIGITS
23. Caffe: a fast open framework for deep learning. https://github.com/BVLC/caffe
24. Stallkamp, J., Schlipsing, M., Salmen, J., Igel, C.: The German traffic sign recognition benchmark: a multi-class classification competition. In: International Joint Conference on Neural Networks, pp. 1453–1460 (2011)
25. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
26. Qian, R., Yue, Y., Coenen, F., Zhang, B.: Traffic sign recognition with convolutional neural network based on max pooling positions. In: International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery, pp. 578–582 (2016)
27. Jin, J., Fu, K., Zhang, C.: Traffic sign recognition with hinge loss trained convolutional neural networks. IEEE Trans. Intell. Transp. Syst. 15(5), 1991–2000 (2014)
28. Abedin, Z., Dhar, P., Hossenand, M.K., Deb, K.: Traffic sign detection and recognition using fuzzy segmentation approach and artificial neural network classifier respectively. In: International Conference on Electrical, Computer and Communication Engineering, pp. 518–523 (2017)
29. Zhu, Z., Liang, D., Zhang, S., Huang, X., Li, B., Hu, S.: Traffic-sign detection and classification in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2110–2118 (2016)
30. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings of the IEEE Computer Vision and Pattern Recognition, vol. 1, pp. 886–893 (2005)
31. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
32. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)

Weighted Histogram of Oriented Uniform Gradients for Moving Object Detection

Wei-Jong Yang, Yu-Xiang Su, Pau-Choo Chung, and Jar-Ferr Yang

Department of Electrical Engineering, Institute of Computer and Communication Engineering, National Cheng Kung University, Tainan, Taiwan

Abstract. With the growth of automotive electronics technology, the advanced driver assistance system (ADAS) is becoming more and more important. In particular, moving object detection (MOD) is an important issue for ADAS in intelligent vehicles. In realistic systems, MOD faces two critical challenges: computing time and detection rate. To overcome these problems, we propose a novel moving object detection system that contains pre-processing, feature extraction, classification and a state machine. The pre-processing contains ROI extraction and skipping of low-busyness windows, which reduces the computing time. To improve the performance, the weighted histogram of oriented uniform gradient (WHOUG) with a support vector machine (SVM) is proposed to raise the detection accuracy. In addition, a finite state machine further improves the robustness of the proposed system. The results demonstrate that the proposed system achieves better performance than the traditional one while maintaining real-time computation.

Keywords: Moving object detection · Histogram of oriented gradient · Support vector machine



1 Introduction

Moving object detection has wide application domains in surveillance, robotics, intelligent vehicles, etc. Owing to the large variations among pedestrians, cars, bicycles and motorbikes, as well as varying backgrounds and illumination, it remains a challenging task in computer vision. In recent years, machine learning and pattern classification approaches have shown successful results in moving object detection. These approaches mainly involve two key points: feature extraction and classifiers. In feature extraction, positive and negative sub-images are densely scanned from the top left to the bottom right with sliding windows in the region of interest. Dominant features such as edges, patches and shapes are extracted from the positive and negative sub-images. These features are then used to train a classifier. During the testing phase, the entire input image is scanned by feature extraction together with the classifier to detect the moving objects.

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 250–260, 2020. https://doi.org/10.1007/978-3-030-12388-8_18


In the past decades, several researchers have concentrated on the domain of intelligent vehicles. Studies based on global features use template matching [1–4], which constructs templates of the human shape at different angles and postures and detects humans with them. In contrast, local features [5–9] are first extracted from blocks in the image and then combined with a machine learning classifier such as the support vector machine (SVM) or AdaBoost [10]. Local features are less sensitive than global features to changes in the appearance, posture and lighting of objects. Belongie et al. proposed the shape context [11], a way of describing shapes that allows shape similarity to be measured and point correspondences to be recovered. In [12], Dalal et al. used the histogram of oriented gradients (HOG) to represent the characteristics of humans. First, edge information is extracted from the training image and its gradient is calculated. Then, a histogram over the gradient angles forms a high-dimensional vector, the HOG feature. Finally, the vector is normalized and sent to the SVM classifier to detect the objects. Zhu et al. [7] integrated a cascade-of-rejecters approach, using the AdaBoost algorithm for feature selection, with modified HOG features to reduce the detection time and improve the accuracy. In this paper, we propose a more powerful moving object detector based on the weighted histogram of oriented uniform gradient (WHOUG) feature in downscaled videos, with a linear SVM [13] used to train the moving object classifier.

2 The Proposed MOD System

As shown in Fig. 1, the proposed MOD system involves four major functions: prescreening, WHOUG feature extraction, SVM classification and a state machine. The prescreening unit, which includes grayscale transformation, region of interest (ROI) selection and skipping of low-busyness windows, is described in Sect. 2.1. The proposed feature extraction algorithm, the weighted histogram of oriented uniform gradient (WHOUG), is then presented. Finally, the state machine maintains the stability of detection.

2.1

Prescreening

To reduce computation time, we prescreen the image with some simple operations before feature extraction: color space transformation, ROI selection and skipping of low-busyness windows. Searching for moving objects and performing classification in all testing windows of the whole frame would cost too much time for MOD. If we can detect vacant windows, which obviously do not contain any moving objects, we can skip those areas in advance to save computation. Areas without moving objects usually lie on the road, since the road area is smooth without large variations. To detect edge variations, the horizontal and vertical uniform gradients are respectively given as:

$$ g'_x(x, y) = \frac{1}{2}\left[I(x+2, y) + I(x+1, y)\right] - \frac{1}{2}\left[I(x-2, y) + I(x-1, y)\right] \tag{1} $$

and

$$ g'_y(x, y) = \frac{1}{2}\left[I(x, y+2) + I(x, y+1)\right] - \frac{1}{2}\left[I(x, y-2) + I(x, y-1)\right] \tag{2} $$

[Figure: flow chart with an on-line path (Video, Grayscale Transformation, Region of Interest Selection, Prescreening, Weighted Histogram of Oriented Uniform Gradient, SVM Classifier, State Machine, Detection Results, Discard) and an off-line path (Pedestrian, Motorbike, Bicycle and Car samples, Weighted Histogram of Oriented Uniform Gradient, SVM Training).]

Fig. 1. The flow chart of the proposed MOD system

Thus, the gradient magnitude is computed as:

$$ M_{g'}(x, y) = \left( g'^{2}_{x}(x, y) + g'^{2}_{y}(x, y) \right)^{1/2} \tag{3} $$
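The uniform gradients of Eqs. (1) to (3) can be sketched in a few lines of Python. The sketch also anticipates the busyness test that this section goes on to describe, where a pixel counts as busy when its magnitude reaches 20 and a window is kept when the busy count reaches a threshold T. The list-of-rows image layout, the function names and the skipping of a two-pixel border are assumptions made for illustration.

```python
import math

BUSY_PIXEL_THRESHOLD = 20  # per-pixel magnitude threshold used by the busyness test


def uniform_gradients(img, x, y):
    """Eqs. (1)-(2): centered operator with weights [-0.5, -0.5, 0, 0.5, 0.5]."""
    gx = 0.5 * (img[y][x + 2] + img[y][x + 1]) - 0.5 * (img[y][x - 2] + img[y][x - 1])
    gy = 0.5 * (img[y + 2][x] + img[y + 1][x]) - 0.5 * (img[y - 2][x] + img[y - 1][x])
    return gx, gy


def magnitude(gx, gy):
    """Eq. (3): gradient magnitude."""
    return math.hypot(gx, gy)


def count_busy_pixels(img, x_start, y_start, width, height):
    """Count pixels in the window whose magnitude reaches the busy threshold.

    Pixels closer than two pixels to the image border are skipped (an assumption),
    because the operator needs two neighbors on each side.
    """
    rows, cols = len(img), len(img[0])
    busy = 0
    for y in range(max(y_start, 2), min(y_start + height, rows - 2)):
        for x in range(max(x_start, 2), min(x_start + width, cols - 2)):
            gx, gy = uniform_gradients(img, x, y)
            if magnitude(gx, gy) >= BUSY_PIXEL_THRESHOLD:
                busy += 1
    return busy


def window_is_active(img, x_start, y_start, width, height, t):
    """A window is kept for classification when its busy count reaches T."""
    return count_busy_pixels(img, x_start, y_start, width, height) >= t


# A flat 16 x 16 patch (smooth road) and a patch with a vertical step edge.
flat = [[100] * 16 for _ in range(16)]
step = [[0] * 8 + [200] * 8 for _ in range(16)]
```

On these synthetic patches the flat one produces no busy pixels, so it would be skipped for any positive T, while the step edge produces a band of busy pixels four columns wide.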

Subsequently, the busyness of each window is computed and used to determine whether the window is busy. To achieve precise gradient orientations, the x- and y-direction gradients defined in (1) and (2) differ from the normal gradients: they adopt a one-dimensional centered gradient operator weighted by [−0.5, −0.5, 0, 0.5, 0.5] for both the horizontal and vertical gradients. The busy detection of a block is determined as follows:

Step 1: Determine whether each point is a busy pixel:

$$ B(x, y) = \begin{cases} 1, & \text{if } M_{g'}(x, y) \ge 20 \\ 0, & \text{if } M_{g'}(x, y) < 20 \end{cases} \tag{4} $$

Step 2: Accumulate the busyness per search window $\Omega$ as:

$$ B_{sum} = \sum_{(x, y) \in \Omega} B(x, y) \tag{5} $$

Step 3: Decide whether the window contains a moving object as:

$$ Active = \begin{cases} 1, & \text{if } B_{sum} \ge T \\ 0, & \text{otherwise} \end{cases} \tag{6} $$

where $T$ is a selected threshold. If Active = 1, detection is performed; otherwise, the block is directly classified as "definitely no object" and discarded.

2.2

Weighted Histogram of Oriented Uniform Gradient

Traditionally, the histogram is built from gradient orientations with equal weights, which ignores the fact that most gradient orientations of moving objects are vertical. Thus, a novel feature extraction method called weighted histogram of oriented uniform gradient (WHOUG) is proposed as follows. With the gradient magnitude $M_{g'}(x, y)$ stated in (3), the orientation $\theta'_g(x, y)$ is computed as

$$\theta'_g(x, y) = \arctan\left(\frac{g'_y(x, y)}{g'_x(x, y)}\right), \tag{7}$$

and the HOG concept can then be applied directly for classification. In this paper, we suggest a weighted histogram, which assigns different weights to different orientations. The directional range of 0–180° is divided into eight bins of 22.5° each. Following the HOG concept, the counting strength $q_i(x, y)$, which contains the magnitude weighted according to the gradient orientation, enters the histogram as

$$v_j = [b_0, \ldots, b_i, \ldots, b_7], \tag{8}$$

$$b_i = \sum_x \sum_y q_i(x, y), \tag{9}$$

where $q_i(x, y)$ is weighted by $\omega_i$ as

$$\begin{cases} q_i(x, y) = M_{g'}(x, y) \times \omega_i, & \text{if } 0 \le \theta'_g(x, y) < 11.25 \text{ or } 168.75 \le \theta'_g(x, y) < 180,\ i = 0 \\ q_i(x, y) = M_{g'}(x, y) \times \omega_i, & \text{if } i \times \frac{180}{2m} \le \theta'_g(x, y) < (i + 2) \times \frac{180}{2m},\ i \in [1, 7] \end{cases} \tag{10}$$

with

$$\omega_i = \begin{cases} 1/24, & i = 0, 1, 6, 7 \\ 15/24, & i = 2, 3, 4, 5 \end{cases} \tag{11}$$

where $b_i$ denotes the sum of weighted magnitudes under the $i$th bin in a cell and $m$ is 8.
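The weighted histogram of (8) to (11) can be sketched as follows. The bin layout is an assumption: each orientation is assigned to the bin whose center (a multiple of 22.5°) is nearest, which agrees with the i = 0 case of (10); the function names are illustrative only.

```python
NUM_BINS = 8                     # m in the text
BIN_WIDTH = 180.0 / NUM_BINS     # 22.5 degrees
# Weights of Eq. (11): small for bins 0, 1, 6, 7 and large for bins 2 to 5.
WEIGHTS = [1 / 24, 1 / 24, 15 / 24, 15 / 24, 15 / 24, 15 / 24, 1 / 24, 1 / 24]


def bin_index(theta_deg):
    """Nearest-centre bin for an orientation in [0, 180) (assumed layout)."""
    return int((theta_deg + BIN_WIDTH / 2) // BIN_WIDTH) % NUM_BINS


def weighted_histogram(magnitudes, orientations, weights=WEIGHTS):
    """Return [b_0, ..., b_7] for one cell, as in Eqs. (8) and (9)."""
    hist = [0.0] * NUM_BINS
    for mag, theta in zip(magnitudes, orientations):
        i = bin_index(theta % 180.0)
        hist[i] += mag * weights[i]   # q_i = M * omega_i
    return hist


# Three made-up pixels: two near 0/180 degrees and one at 90 degrees.
print(weighted_histogram([24.0, 24.0, 48.0], [5.0, 170.0, 90.0]))
```

With these weights, a gradient near 90° contributes fifteen times more per unit of magnitude than one near 0° or 180°.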

Fig. 2. Histograms by linearly weighted magnitudes for one cell (a cell of 8 × 8 pixels is mapped to the weighted linear histogram v′j with bins b′0, …, b′7)

For the WHOUG, we also suggest linearly combining the neighboring weighted magnitudes, which differs from the weighted histogram that counts the weighted strength as stated in (10). The same directional range of 0–180° is chosen for each cell, as shown in Fig. 2. However, the counting strength $q'_i(x, y)$ is obtained by linear interpolation with respect to the nearby bin centers and is weighted by $\omega_i$ according to the gradient orientation. The weighted linear histogram is computed by

$$v'_j = [b'_0, \ldots, b'_i, \ldots, b'_7], \tag{12}$$

$$b'_i = \sum_x \sum_y q'_i(x, y), \tag{13}$$

where

$$\begin{cases} q'_i(x, y) = M_{g'}(x, y) \times \dfrac{\theta'_g(x, y) + c_i}{180/m} \times \omega_i, & \text{if } 0 < \theta'_g(x, y) \le c_i,\ i = 0 \\ q'_i(x, y) = M_{g'}(x, y) \times \dfrac{c_{i+1} - \theta'_g(x, y)}{180/m} \times \omega_i, & \text{if } c_i < \theta'_g(x, y) \le c_{i+1},\ i \in [0, 6] \\ q'_{i+1}(x, y) = M_{g'}(x, y) \times \dfrac{\theta'_g(x, y) - c_i}{180/m} \times \omega_i, & \text{if } c_i < \theta'_g(x, y) \le c_{i+1},\ i \in [0, 6] \\ q'_i(x, y) = M_{g'}(x, y) \times \dfrac{c_{i+1} - \theta'_g(x, y)}{180/m} \times \omega_i, & \text{if } c_i < \theta'_g(x, y) \le 180,\ i = 7 \end{cases} \tag{14}$$

with

$$\omega_i = \begin{cases} 2/24, & i = 0, 1, 6, 7 \\ 15/24, & i = 2, 3, 4, 5 \end{cases} \tag{15}$$

where $b'_i$ denotes the sum of weighted magnitudes under the $i$th bin in a cell, $c_i$ is the center of the $i$th bin and $m$ is 8. Then, four neighboring cells form a block. To reduce the influence of illumination and background variation, normalization is essential. The L2-norm is used to normalize each block $V = \{v_1, v_2, v_3, v_4\}$:

$$\tilde{V} = \frac{V}{\sqrt{\lVert V \rVert_2^2 + \varepsilon^2}}, \tag{16}$$
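The linearly interpolated histogram and the block normalization can be sketched in Python as below. This is a hedged illustration rather than the authors' implementation: the bin centers are assumed to lie at (i + 0.5) × 22.5°, both interpolated shares are weighted by ω_i exactly as (14) is printed, and all function names are hypothetical.

```python
import math

M_BINS = 8                         # m in the text
WIDTH = 180.0 / M_BINS             # 22.5 degrees per bin
# Assumed bin centres c_0..c_8 (c_8 is only needed by the i = 7 case).
CENTERS = [(i + 0.5) * WIDTH for i in range(M_BINS + 1)]
# Weights of Eq. (15).
OMEGA = [2 / 24, 2 / 24, 15 / 24, 15 / 24, 15 / 24, 15 / 24, 2 / 24, 2 / 24]


def linear_weighted_histogram(magnitudes, orientations):
    """Accumulate b'_0..b'_7 for one cell, following the cases of Eq. (14)."""
    hist = [0.0] * M_BINS
    for mag, theta in zip(magnitudes, orientations):
        if theta <= CENTERS[0]:
            # i = 0 case: orientation at or below the first bin centre.
            hist[0] += mag * (theta + CENTERS[0]) / WIDTH * OMEGA[0]
        elif theta > CENTERS[M_BINS - 1]:
            # i = 7 case: orientation above the last bin centre.
            hist[7] += mag * (CENTERS[M_BINS] - theta) / WIDTH * OMEGA[7]
        else:
            # c_i < theta <= c_{i+1}: share the magnitude between bins i and i + 1.
            i = math.ceil((theta - CENTERS[0]) / WIDTH) - 1
            hist[i] += mag * (CENTERS[i + 1] - theta) / WIDTH * OMEGA[i]
            hist[i + 1] += mag * (theta - CENTERS[i]) / WIDTH * OMEGA[i]
    return hist


def l2_normalize(block, eps=1e-3):
    """Block normalization V / sqrt(||V||^2 + eps^2); eps is an assumed value."""
    norm = math.sqrt(sum(v * v for v in block) + eps * eps)
    return [v / norm for v in block]


def feature_length(num_blocks, cells_per_block=4, bins=M_BINS):
    """Length of the concatenated descriptor for one sliding window."""
    return num_blocks * cells_per_block * bins


print(linear_weighted_histogram([24.0], [90.0]))
print(feature_length(105), feature_length(78), feature_length(55))
```

The last line reproduces the descriptor lengths of the three sliding-window sizes from their block counts (four cells per block, eight bins per cell).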

In this normalization, $\varepsilon$ is a small constant that avoids division by zero. Finally, all blocks of each sliding window are collected in an overlapping fashion to form the WHOUG features. The sliding-window sizes are 64 × 128, 56 × 112 and 48 × 96, which yield 105 (7 × 15), 78 (6 × 13) and 55 (5 × 11) blocks and hence 3360-, 2496- and 1760-element feature vectors, respectively. These vectors are sent to the SVM classifier to determine whether the window contains a moving object.

2.3 State Machine

After receiving the recognition results from the SVM classifier, a state machine is used to make the proposed system more robust. The state machine contains four states: Empty S1, Warning S2, Alarming S3 and Vanishing S4, as shown in Fig. 3.

Fig. 3. The flow chart of the proposed state machine (state boxes: No Moving Object; Moving Objects Possibly Existing; Moving Objects Existing; Moving Objects Possibly Disappearing. Transition labels: SVM Score > 1.0; SVM Score < 1.0; SVM Score > 0.9; SVM Score < 0.9; twice detection within 5 frames; SVM Score < 0.9 when 5 frames passed)


The Empty state S1 denotes that no target object is detected within the ROI, i.e., the SVM score is lower than the threshold, which is set to 1.0. If the score is higher than the threshold, the state machine enters the Warning state S2, indicating that moving objects possibly exist. In this state, the MOD system does not inform the driver and uses a Kalman filter [14] to predict the position of the moving objects in the next frame. If moving objects are detected twice within five frames, the MOD system enters the Alarming state S3, where a warning signal is generated to indicate that the object might be dangerous for the driver. In the Warning state, the threshold is set slightly lower, to 0.9, since the detection rate must be improved to avoid hitting the object and the position of the moving objects must be updated more frequently. If the score is lower than 0.9 within five frames, the state machine enters the Vanishing state S4, which means that the moving objects are possibly disappearing. If the moving objects are not detected in one of five frames, the state machine needs a transient period to confirm whether the moving object has disappeared. If the score stays lower than the threshold 0.9 for five frames, the state machine returns to the Empty state S1.
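One possible reading of these transitions is sketched below. The class name, the counters and the handling of cases the text leaves open (for example, falling back from Warning to Empty when no second detection arrives within five frames) are assumptions; the Kalman filter prediction and the warning signal are omitted.

```python
EMPTY, WARNING, ALARMING, VANISHING = "S1", "S2", "S3", "S4"


class ModStateMachine:
    """Four-state filter over per-frame SVM scores (illustrative sketch)."""

    def __init__(self, enter_thr=1.0, track_thr=0.9, window=5):
        self.state = EMPTY
        self.enter_thr = enter_thr   # threshold used in the Empty state
        self.track_thr = track_thr   # lower threshold used afterwards
        self.window = window         # length of the confirmation period in frames
        self.frames = 0              # frames spent in the current transient state
        self.hits = 0                # detections counted while in Warning

    def step(self, svm_score):
        if self.state == EMPTY:
            if svm_score > self.enter_thr:
                self.state, self.frames, self.hits = WARNING, 0, 1
        elif self.state == WARNING:
            self.frames += 1
            if svm_score > self.track_thr:
                self.hits += 1
            if self.hits >= 2:                    # detected twice within the window
                self.state = ALARMING
            elif self.frames >= self.window:      # assumed fallback to Empty
                self.state = EMPTY
        elif self.state == ALARMING:
            if svm_score < self.track_thr:
                self.state, self.frames = VANISHING, 1
        else:  # VANISHING
            if svm_score > self.track_thr:
                self.state = ALARMING
            else:
                self.frames += 1
                if self.frames >= self.window:    # low score for five frames
                    self.state = EMPTY
        return self.state


machine = ModStateMachine()
scores = [0.5, 1.2, 0.3, 0.95, 1.1, 0.5, 0.95, 0.2, 0.2, 0.2, 0.2, 0.2]
trace = [machine.step(s) for s in scores]
print(trace)
```

Requiring two detections before alarming suppresses isolated false positives, while the Vanishing state keeps a single missed frame from cancelling an active alarm.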

3 Experimental Results

3.1 Environment and Database Training of Experiments

To verify the proposed MOD system, we implemented it on a computer with an Intel Core i7-4770 CPU at 3.40 GHz and 16 GB of memory, using Visual Studio 2010 and the OpenCV 2.4.9 library. The videos were captured at 1280 × 720 resolution by a Chimei-motor camera mounted on the cars. We conducted experiments on 11 urban videos containing 6630 frames. For performance comparison, the proposed method with two different features is named HOUG-1 (histogram of oriented uniform gradient) and HOUG-2 (linear histogram) with $\omega_i = 1$, while WHOG-1 (weighted histogram of oriented gradient), WHOG-2, WHOUG-1 and WHOUG-2 use the weights suggested in (15). All the methods are compared with the HOG [12]. Besides the detection rate and false positive rate, the success rate and false signaling rate are defined as

$$\text{Success Rate} = \frac{\text{Number of Successful Moving Objects Detection}}{\text{Number of Moving Objects}}, \tag{17}$$

and

$$\text{False Signaling Rate} = \frac{\text{Number of False Signalings}}{\text{Number of Total Frames}}, \tag{18}$$
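As a small worked example, (17) and (18) can be evaluated as percentages. The counts below are hypothetical and chosen purely for illustration; only the total of 6630 frames comes from the text.

```python
def success_rate(successful_detections, moving_objects):
    """Eq. (17), expressed as a percentage."""
    return 100.0 * successful_detections / moving_objects


def false_signaling_rate(false_signalings, total_frames):
    """Eq. (18), expressed as a percentage."""
    return 100.0 * false_signalings / total_frames


# Hypothetical counts: 33 of 36 objects alarmed in time, 12 false alarms.
print(round(success_rate(33, 36), 2))
print(round(false_signaling_rate(12, 6630), 2))
```

The two rates use different denominators: the success rate is counted per moving object, whereas the false signaling rate is counted per frame.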

In (17) and (18), a successful moving-object detection means that the moving object is detected when the system enters the Alarming state in the dangerous zone, and a false signaling means a false positive detected when the system enters the Alarming state. As shown in Table 1, using only the uniform gradient or only the weighted histogram does not improve the performance for every testing video. On the other hand, the detection rate of the proposed WHOUG is improved by roughly 35 to 37 percentage points over the HOG (from 52.20% to 87.06% and 88.93%), and the success rates are also better than that of the HOG. Some selected images of the corresponding results are shown in Fig. 4.

Table 1. Detection performances achieved by the different methods

Methods    Detection rate (%)   Success rate (%)   False positive rate (%)   False signaling rate (%)
HOG        52.20                91.67              0.45                      0.18
HOUG-1     49.57                97.22              0.57                      0.39
HOUG-2     51.39                94.44              0.92                      0.63
WHOG-1     48.76                94.44              3.00                      1.79
WHOG-2     59.16                100.00             0.64                      0.43
WHOUG-1    87.06                100.00             6.98                      5.33
WHOUG-2    88.93                100.00             1.99                      1.37

Fig. 4. Selected detection results achieved by using: (a) HOG and (b) WHOUG-2 features

3.2 Comparison of Computing Time

In this subsection, the computing time obtained by skipping low-busyness windows is presented. The same testing videos are used, with WHOUG-2 features, to demonstrate the behavior of the proposed system. As mentioned in Sect. 2, the threshold for the busyness test is 1900. We further verify this choice by varying the threshold T of Bsum from 1600 to 2000. Table 2 shows the detection rate, false positive rate and computing time for the different thresholds.

Table 2. Detection rate, false positive rate and computing time with different thresholds

Threshold (T)   Detection rate (%)   False positive rate (%)   Computing time (ms/frame)
0               88.93                3.80                      34.31
1600            88.93                1.99                      29.74
1700            88.93                1.99                      29.07
1800            88.93                1.99                      28.50
1850            88.93                1.99                      28.16
1900            88.93                1.99                      28.15
1950            88.25                1.91                      28.09
2000            88.25                1.91                      28.05
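The relative savings implied by Table 2 can be checked with a few lines of Python. The dictionary simply copies the computing-time column; the function names are illustrative.

```python
# Computing time in ms/frame for each threshold T, copied from Table 2.
TABLE_2_MS_PER_FRAME = {
    0: 34.31, 1600: 29.74, 1700: 29.07, 1800: 28.50,
    1850: 28.16, 1900: 28.15, 1950: 28.09, 2000: 28.05,
}


def time_saving_percent(threshold, baseline=0):
    """Relative reduction in computing time with respect to no prescreening."""
    base = TABLE_2_MS_PER_FRAME[baseline]
    return 100.0 * (base - TABLE_2_MS_PER_FRAME[threshold]) / base


def frames_per_second(threshold):
    """Throughput corresponding to a per-frame computing time."""
    return 1000.0 / TABLE_2_MS_PER_FRAME[threshold]


for t in (1600, 1900, 2000):
    print(t, round(time_saving_percent(t), 1), round(frames_per_second(t), 1))
```

For T = 1900 the saving is just under 18%, which corresponds to roughly 35.5 frames per second compared with about 29 frames per second without prescreening.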

The results show that, with the threshold set to 1900, the computing time is reduced by about 18% (from 34.31 to 28.15 ms/frame) while the detection rate does not decrease. Moreover, skipping low-busyness windows reduces the false positive rate from 3.80% to 1.99%. Although the computing time with a threshold of 1950 is slightly lower, the detection rate starts to decrease, which indicates that some of the skipped windows contain moving objects at that threshold.

3.3 INRIA Person Database

In this subsection, we report experiments on the INRIA person database [15], which contains 1208 positive samples and 1218 negative samples. First, we mirror the 1208 positive samples and crop 64 × 128 pixels from the center of each initial 96 × 160-pixel positive sample. Then, each of the 1218 negative samples is cropped into 10 images of 64 × 128 pixels. Therefore, we obtain 12,180 negative samples and 2416 positive samples for training. Figure 5 shows some of the positive samples.

Fig. 5. Examples of positive samples of the INRIA database

In the testing phase, there are 1126 testing images that contain a person, each of 70 × 134 pixels, as shown in Fig. 6. The detection rates obtained with the HOG and the proposed features are shown in Table 3. The proposed methods perform better than the HOG because they effectively handle standing people. In contrast to the video experiments, WHOG-2 is better than WHOUG-2 on the INRIA database. We think that the resolution and the noise of the video may influence the detection results; in that situation, we suggest choosing WHOUG-2. On the other hand, if the input source is clear, WHOG-2 performs better.

Fig. 6. Examples of testing images of the INRIA database

Table 3. The detection rate with different methods

Methods    Detection rate (%)
HOG        87.21
HOUG-1     88.72
HOUG-2     98.40
WHOG-1     96.98
WHOG-2     98.84
WHOUG-1    97.33
WHOUG-2    98.31

4 Conclusions

In this paper, a novel moving object detection system, which can perform pedestrian, scooter, bicycle and car detection, is proposed. Since computational cost is a general problem for moving object detection, the proposed prescreening method not only reduces the computation but also decreases the false positives. In addition, the novel weighted histogram of oriented uniform gradient (WHOUG) is proposed to improve the detection rate for moving objects. Generally, the weights help to retrieve important features in images. Besides, the finite state machine makes the proposed MOD system robust and establishes a comfortable but precise alarming mechanism for the driver. The experimental results verify that the proposed MOD system achieves a better detection rate and less computing time than the original system. Furthermore, the proposed MOD system can achieve real-time detection and can be implemented in practical ADAS applications.


Acknowledgements. This work was supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 105-2221-E-006-065-MY3.

References

1. Gavrila, D.M., Philomin, V.: Real-time object detection for smart vehicles. In: Proceedings of IEEE International Conference on Computer Vision, vol. 1, pp. 87–93 (1999)
2. Borgefors, G.: Distance transformations in digital images. Comput. Vis. Graph. Image Process. 34(3), 344–371 (1986)
3. Gavrila, D.M.: A Bayesian, exemplar-based approach to hierarchical shape matching. IEEE Trans. Pattern Anal. Mach. Intell. 29(8), 1408–1421 (2007)
4. Nguyen, D.T., Li, W., Ogunbona, P.: A part-based template matching method for multi-view human detection. In: Proceedings of IEEE International Conference on Image and Vision Computing, New Zealand, pp. 357–362 (2009)
5. Mohan, A., Papageorgiou, C., Poggio, T.: Example-based object detection in images by components. IEEE Trans. Pattern Anal. Mach. Intell. 23(4), 349–361 (2001)
6. Wu, B., Nevatia, R.: Detection of multiple, partially occluded humans in a single image by Bayesian combination of edgelet part detectors. In: Proceedings of IEEE International Conference on Computer Vision, vol. 1, pp. 90–97 (2005)
7. Zhu, Q., Yeh, M.C., Cheng, K.T.: Fast human detection using a cascade of histograms of oriented gradients. In: Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 1491–1498 (2006)
8. Chen, Y.T., Chen, C.S.: A cascade of feed-forward classifiers for fast pedestrian detection. In: Proceedings of Asian Conference on Computer Vision, pp. 905–914 (2007)
9. Leibe, B., Seemann, E., Schiele, B.: Pedestrian detection in crowded scenes. In: Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 878–885 (2005)
10. Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. In: Proceedings of European Conference on Computational Learning Theory, pp. 23–37. Springer, Berlin (1995)
11. Belongie, S., Malik, J.: Matching with shape contexts. In: Proceedings of IEEE Workshop on Content-based Access of Image and Video Libraries, vol. 1, pp. 20–26 (2000)
12. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 886–893 (2005)
13. Suykens, J.A.K., Vandewalle, J.: Least square support vector machine classifiers. Neural Netw. Lett. 9, 293–300 (1999)
14. Bishop, G., Welch, G.: An introduction to the Kalman filter. In: Proceedings of SIGGRAPH, Course 8 (2001)
15. Dalal, N., Triggs, B.: INRIA Person Dataset (2005). http://pascal.inrialpes.fr/data/human/

Enabling Pedestrian Safety Using Computer Vision Techniques: A Case Study of the 2018 Uber Inc. Self-driving Car Crash

Puneet Kohli and Anjali Chadha

Texas A&M University, College Station, TX, USA
puneetkohli,anjali [email protected]

Abstract. Human lives are important. The decision to allow self-driving vehicles to operate on our roads carries great weight. This has been a hot topic of debate among policy-makers, technologists and public safety institutions. The recent Uber Inc. self-driving car crash, resulting in the death of a pedestrian, has strengthened the argument that autonomous vehicle technology is still not ready for deployment on public roads. In this work, we analyze the Uber car crash and shed light on the question, “Could the Uber Car Crash have been avoided?”. We apply state-of-the-art Computer Vision models to this highly practical scenario. More generally, our experimental results are an evaluation of various image enhancement and object recognition techniques for enabling pedestrian safety in low-lighting conditions using the Uber crash as a case study.

Keywords: Computer vision · Pedestrian detection · Image enhancement · Autonomous vehicles

1 Introduction

Nearly 1.3 million people die in road crashes each year, averaging 3287 deaths a day. Another 20–50 million people are injured or disabled. These statistics are alarming, and if no action is taken, road accidents would be the fifth leading cause of death by 2030 [1]. Human error, such as distraction, fatigue, speeding, and misjudgment, has been shown to be responsible for 90% of road accidents [2]. One of the promises of Autonomous Vehicle (AV) technology is to significantly mitigate this impending public health crisis [3,4]. Autonomous vehicles cannot get distracted, drunk, or tired. An additional boost in performance for AVs comes from the advances in artificial intelligence, sensor fusion, and computer vision techniques that essentially self-drive the vehicle. Despite this, various incidents have shown that self-driving technology is still far from perfect. A recent study shows that self-driving cars would have to be driven hundreds of millions of miles to accurately demonstrate their safety [5]. Many companies and research institutions alike have started deploying self-driving cars and autonomous vehicles on the roads for testing, in an arms race to secure a foothold in this upcoming market.

The Uber Inc. Self-Driving Car Crash¹, which occurred in Tempe, Arizona (USA), was one of the most recent accidents involving self-driving cars, and it unfortunately resulted in the death of a pedestrian. The crash involved a Volvo XC90 sport utility vehicle in fully autonomous mode hitting a pedestrian as she was crossing the road at night in a generally low-visibility environment. Figure 1 shows a few frames taken just before the crash. It is shocking, and still unknown, why Uber's systems were unable to detect the pedestrian before the crash happened. Although reports claim that the car's system was unable to detect the pedestrian due to the lighting conditions, a study by Intel shows that their proprietary Mobileye system was able to detect the pedestrian one second before the crash happened [6].

Along similar lines, we evaluate the performance of various state-of-the-art object recognition frameworks, both neural network based and traditional machine learning based, in detecting the pedestrian in the post-crash dashboard camera (dash-cam) footage released by Uber. We also apply a variety of image enhancement techniques in a best-effort approach to detect the pedestrian sooner than with just the raw footage. In Sect. 3, we first go through the image enhancement techniques we applied to the video, followed by the object recognition techniques we used. Sections 4 and 5 discuss our experimental setup and findings in depth. Section 6 presents our vision for the future prospects of this work.

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 261–279, 2020. https://doi.org/10.1007/978-3-030-12388-8_19

Fig. 1. Frames 60, 70, 80, and 90 of the dash-cam footage released by Uber Inc.

2 Related Work

Over the last decade, multiple researchers have studied the problem of pedestrian detection [7–11]. However, current systems are not sophisticated enough to be deployed in self-driving cars without a human behind the wheel. Pedestrian detection is challenging because of the variation in human postures and appearances. The problem discussed here is two-fold: pedestrian detection and image enhancement. Therefore, we discuss the prior work related to the two topics in separate sections.

¹ VentureBeat - Uber self-driving car crash in Tempe, Arizona. YouTube video: https://youtu.be/XtTB8hTgHbM.

2.1 Pedestrian Detection

Previous works on this topic can be broadly classified into two categories: classical models using handcrafted features and models using neural networks for pedestrian detection.

Classical Methods. Earlier work relied on sliding window techniques for object detection using proposal generation or Histogram of Oriented Gradients (HOG) [12] for feature extraction, followed by classification methods such as SVM [13] and Adaptive Boosting [14] for pedestrian detection. Before the recent advancements in neural networks, boosted decision forests gave the best performance for pedestrian detection. SquaresChnFtrs [15], InformedHaar [11] and SpatialPooling [16] are some of the top-performing boosted decision trees for the integral channel features architecture [9]. Multiple works also investigated the performance of features extracted from motion to detect humans [17–19], while others experimented with models combining motion and intensity features [20]. These hand-crafted features have been widely used for object detection.

Edge detection techniques have shown good results for pedestrian detection as well [21–23]. An edge is a boundary between the background and an object. There are different categories of edge detectors: gradient [24–26], zero crossing [27], Laplacian of Gaussian [26,27], Gaussian edge [26], and colored edge detectors [28]. In our study, we use the classical Canny Edge Detection approach [29].

Partial occlusion of pedestrians is another hurdle for human detection. In fact, around 70% of the pedestrians captured in street scenes are partially occluded in at least one frame of the video [30]. Previous works have tried to tackle this problem by specifically training detectors for different types of occlusions [31,32], or by modeling part visibility as latent variables [33–36]. Mathias et al. trained classifiers for specific types of occlusion, for instance bottom-up or right-left occlusions [32]. Other works divided the pedestrian into multiple parts and inferred their visibility with latent variables [34,36]. A different class of part-based pedestrian detectors assumes that the head of the pedestrian is always visible [37].

Neural Network Method. With the recent advances in neural networks, Convolutional Neural Networks (CNNs) have been successfully applied to object detection [38–43]. Recent works also focus on using convnets to build better pedestrian detection models [10,44–47]. Zhang et al. [44] explored the use of Region Proposal Networks (RPN) [38] for generating pedestrian candidate boxes, followed by a cascaded boosted forest [48] for classifying whether a candidate box is a pedestrian. Tian et al. [47] join the task of pedestrian detection with semantic tasks such as pedestrian attributes (for example, carrying a backpack). ConvNet [10] applied convolutional sparse coding for unsupervised pre-training of CNNs for pedestrian detection.

Researchers have tried to combine the classic methods of occlusion handling in pedestrian images [32] with novel deep learning architectures [46] to achieve better results. In contrast to [32], where the body parts were pre-defined, [46] determines these parts automatically from data, so they may vary with the datasets and the scenarios. This step is followed by training an ensemble of individual part detectors for pedestrian detection. One advantage of this model over the part-based R-CNN model [49] is that it does not require part annotations in training.

Datasets. Multiple pedestrian detection datasets have been made publicly available over the years. ETH [50], INRIA [12], Caltech-USA [30,51], KITTI [52], and TUD-Brussels [53] are a few of the more popular ones. The datasets vary in the environmental settings (urban, city, mountains), capturing angle (rear-view; stereo rig mounted on a stroller), diversity of the data (multiple cities/countries, crowded areas), and the inclusion of partially or fully occluded pedestrians of varied sizes and postures. We evaluate our model using the Caltech Pedestrian Dataset, as it consists of footage recorded from a vehicle driving through an urban scenario with regular traffic conditions, which is fitting for our use case.

2.2 Image Enhancement

One of the major challenges for Computer Vision research and applications is the low quality of input images. Many researchers have worked on image enhancement to tackle this problem. Our work primarily focuses on pedestrian detection in low-light scenarios. Existing literature on low-light image enhancement can be broadly classified into two groups: histogram-based and Retinex-based methods. Histogram equalization (HE) [54] is one of the most intuitive and common ways to fix bad lighting in an image. Contrast Limited Adaptive Histogram Equalization (CLAHE) [55] limits the HE result to a certain area. Other works explore non-linear functions, such as the gamma function, to enhance image contrast. Image denoising has been explored using K-SVD, BM3D and non-linear filters. Retinex methods are based on the Retinex theory introduced by Land [56] to explain the color perception of the human vision system. Single Scale Retinex (SSR) is based on the center/surround Retinex. Multi-scale Retinex models [57] consider the weighted sum of several different SSR outputs. There are also deep learning models, such as LLNet [58], HDRNet [59] and MSRNet [60], which enhance the image using neural networks. For instance, Fu et al. [61] removed rain from single images via a deep detail network. Cai et al. [62] introduced a trainable end-to-end system called DehazeNet, which takes a hazy image as input and estimates a transmission map that is then used to recover a haze-free image.

3 Methodology

Our methodology is divided into two parts: first, the image processing and enhancement techniques, and second, the object detection techniques. We ran our best object detection model on both the raw video feed and the videos enhanced via our processing algorithms.

3.1 Image Processing and Enhancement

Motion Detection. One of the first ideas we experimented with is detecting motion in the video, as in [63]. We performed background subtraction on consecutive frames of the video. There are multiple methods to perform background subtraction, as discussed in [64–66]. We used the Gaussian-mixture-based algorithm for background and foreground segmentation discussed in [65], which selects the appropriate number of Gaussian distributions for each pixel. This step segments the foreground from the background. These algorithms are sophisticated enough to distinguish between actual motion and shadowing or lighting changes in the video.

Gamma Correction. The notion of gamma stems from the fact that our eyes react to light in a non-linear way: the human eye is more sensitive to small variations in light in dark settings than in bright ones. Gamma correction is a non-linear operation used to correct an image's luminance. Each pixel in an image has a brightness level called luminance, which varies between 0 and 1, where 0 means complete darkness and 1 is the brightest. Gamma correction is also known as the power-law transform:

$$V_{out} = V_{in}^{1/\gamma}$$
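The power-law transform is simple to express in code. The sketch below assumes luminance values already scaled to [0, 1], writes the input as V_in, and uses illustrative function names; it is not taken from the experimental pipeline.

```python
def gamma_correct(luminance, gamma):
    """Apply V_out = V_in ** (1 / gamma) to a 2-D list of values in [0, 1]."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [[v ** (1.0 / gamma) for v in row] for row in luminance]


def gamma_correct_8bit(pixels, gamma):
    """Same transform for 8-bit grey levels (0..255)."""
    return [[round(255 * (p / 255) ** (1.0 / gamma)) for p in row] for row in pixels]


dark_frame = [[0.0, 0.25, 1.0]]
print(gamma_correct(dark_frame, 2.0))   # mid-tones are lifted
print(gamma_correct(dark_frame, 0.5))   # mid-tones are pushed down
```

The end points 0 and 1 are left unchanged for any positive γ, so only the distribution of the intermediate grey levels is reshaped.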

In this transform, γ < 1 shifts the image towards the darker end of the spectrum, while γ > 1 makes the image appear lighter. This technique is often used in night vision systems; for example, [67,68] apply gamma correction for autonomous and driver-assistance vehicles. To illuminate our video, we tried various γ values. Figure 2 shows one of the frames of the video for γ = 3.5.

Fig. 2. Frame 75 after applying Gamma correction (γ = 3.5)

Fig. 3. Grey levels for the RGB channels (depicted in red, green, and blue) for the Uber crash video. The black line represents the mean grey level

Histogram Equalization. While gamma correction modifies the luminance of an image, histogram equalization adjusts its contrast. Contrast is determined by the difference in color and brightness between an object and other objects within the same field of view. Histogram equalization achieves contrast enhancement by equalizing the image intensities. Multiple works have successfully used this technique for vision enhancement [55,69], object detection [70], and medical image processing [71]. By redistributing the pixels between the brightest and the darkest portions of an image, we make a dark (underexposed) image less dark and a bright (overexposed) image less bright. Since histogram equalization considers the global contrast of the image, it does not lead to a better image in our scenario, as there are large intensity variations in every frame of the video. In other words, the histogram covers a large range, i.e., both bright and dark pixels are present. Figure 4 illustrates this on frame 75 of the Uber crash video. To overcome this problem, we used an adaptive histogram equalization method, CLAHE, in which every image is divided into small blocks and histogram equalization is performed on each block. As can be seen in Fig. 5, this overcomes the problem.
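For reference, plain global histogram equalization of an 8-bit grey-level image can be sketched as below. This is the textbook cumulative-histogram mapping, not CLAHE: the per-block processing and contrast limiting are left out, and the function name is illustrative.

```python
def equalize_histogram(pixels, levels=256):
    """Global histogram equalization of a 2-D list of integer grey levels."""
    flat = [p for row in pixels for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative histogram: number of pixels at or below each grey level.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:
        return [row[:] for row in pixels]   # constant image: nothing to stretch
    # Map each grey level so that the output levels span the full range.
    lut = [max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
           for c in cdf]
    return [[lut[p] for p in row] for row in pixels]


# A dark, low-contrast toy image is stretched over the whole 0..255 range.
print(equalize_histogram([[10, 10, 10], [20, 30, 40]]))
```

Because the mapping is computed from the histogram of the whole frame, a frame containing both very dark and very bright regions gains little, which is the limitation that motivates the block-wise variant.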

Fig. 4. Histogram Equalized image for Frame 75

Fig. 5. Results after applying CLAHE to frame 75

Edge Detection. Multiple works have used the Canny Edge Detection algorithm for pedestrian detection [21–23]. We applied the same approach to create an edge map for every frame of the video; a sample edge map is shown in Fig. 8. Canny Edge Detection is a multi-stage algorithm involving image smoothing using Gaussian convolution, application of a 2-D first derivative operator, and non-maximal suppression. The effect of the Canny operator is determined by the width of the Gaussian kernel used for smoothing and by the upper and lower thresholds used by the tracker.

Fig. 8. Canny Edge Detection on frame 75

Adaptive Thresholding. Tian et al. [72] introduced an adaptive thresholding segmentation algorithm for nighttime pedestrian detection. They follow a two-step approach: a detection phase, which performs image segmentation, followed by a recognition phase, which classifies whether the object is a pedestrian using Support Vector Machines (SVM). We implemented a slightly modified version of their recognition algorithm in order to detect pedestrians. Our version is as follows:

1. Convert the input frame to grey-scale
2. Identify all the connected components in the frame using Block Based Component Labelling (BBDT) [73]
3. Remove components that are greater than 10000 or less than 50 pixels in size
4. Remove components whose bottom-most pixels are in the top or bottom 10% of the frame
5. Remove components whose area ratio (ratio of actual area to area of bounding box) is less than 0.5.

Figure 6 shows the comparison of our method against [72] on a sample pedestrian image. Figure 7 shows the results for this method on Uber video frame 75.

Fig. 6. Comparison of adaptive thresholding segmentation algorithms. The image on the top is the original image, the one in the center is from [72], and the one on the bottom is using our algorithm

Fig. 7. Adaptive thresholding segmentation on frame 75. The bicycle is outlined almost perfectly but the image is not devoid of noise. Our algorithm barely works on most other frames, completely failing to identify any object

Multi-Exposure Fusion Framework. Ying et al. [74] proposed a new Retinex-based model which uses a dual-exposure fusion framework to enhance the contrast and lightness of images. They create an enhanced image by fusing the original image with a synthetic image. This method has achieved state-of-the-art results, and we experimented with it for our image enhancement step. The mean grey levels (Fig. 3) were calculated for use in this model.

Camera Response Model. Ying et al. [75] proposed a new image enhancement method which uses the response characteristics of cameras. We implemented this approach to check its effectiveness in improving the lighting conditions of our video. First, we investigated the relationship between two images with different exposures to obtain an accurate camera response model. Following this, we borrowed illumination estimation techniques to estimate the exposure ratio map. In the final step, this camera response model is used to adjust each pixel to its desired exposure according to the estimated exposure ratio map.
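The component-filtering rules in steps 3 to 5 of the adaptive-thresholding procedure can be sketched as follows. The sketch assumes the connected components have already been extracted (steps 1 and 2) as lists of (row, col) pixel coordinates with row 0 at the top of the frame; `keep_component`, `filter_components` and the `rectangle` helper are hypothetical names used only for illustration.

```python
def keep_component(pixels, frame_height, min_size=50, max_size=10000,
                   margin=0.10, min_area_ratio=0.5):
    """Return True if a connected component survives steps 3 to 5."""
    pixels = list(pixels)
    area = len(pixels)
    # Step 3: discard components that are too small or too large.
    if area < min_size or area > max_size:
        return False
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    # Step 4: discard components whose lowest pixel lies in the top or
    # bottom 10% of the frame.
    bottom = max(rows)
    if bottom < margin * frame_height or bottom >= (1.0 - margin) * frame_height:
        return False
    # Step 5: discard components that fill less than half of their bounding box.
    bbox_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return area / bbox_area >= min_area_ratio


def filter_components(components, frame_height):
    return [c for c in components if keep_component(c, frame_height)]


def rectangle(top, left, height, width):
    """Hypothetical helper: pixel list of a filled rectangle."""
    return [(r, c) for r in range(top, top + height)
            for c in range(left, left + width)]


candidates = {
    "solid": rectangle(100, 10, 20, 5),            # kept
    "tiny": rectangle(100, 10, 3, 3),              # fails step 3
    "low": rectangle(370, 10, 20, 5),              # fails step 4
    "thin": [(100 + k, k) for k in range(60)],     # diagonal line, fails step 5
}
kept = [name for name, comp in candidates.items() if keep_component(comp, 400)]
print(kept)
```

These rules rely only on size, position and compactness, so they are cheap to evaluate but cannot tell a pedestrian from any other compact bright region.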

3.2 Object Detection Approaches

HOG + SVM. This approach uses locally normalized HOG descriptor as features as described in the related work and uses linear svm as a baseline classifier. This approach provides good performance, in fact better than other existing feature sets including wavelets. You Only Look Once (YOLO). YOLO [39,40,76] is one of the most promising state-of-the-art, real time object detection systems. What separates it from other object detector systems is the approach of applying a single neural network to the full image instead of applying the model to multiple locations of an image. Compared to other region proposal classification networks (Fast-RCNN) which perform detection on various region proposals and thus end up performing prediction multiple times for various regions in an image, YOLO architecture passes the image once through the network and the output is a prediction. YOLO network divides the image into grid cells. Each of these grid cells is responsible for predicting 5 bounding boxes. A bounding box is described by a rectangle that encloses an object. YOLO also outputs a confidence score that informs us the certainity with which an object might be present in a grid cell. The YOLO architecture only uses standard layer types—convolutional layers with a 3 * 3 kernel and max-pooling with a 2 * 2 kernel. In the second version of

Enabling Pedestrian Safety Using Computer Vision Techniques


YOLO, the fully connected layers are removed. The very last convolutional layer has a 1 × 1 kernel. YOLO is implemented in the Darknet framework. YOLO and other object detection models use Intersection over Union (IoU) as an evaluation metric. Any algorithm that outputs predicted bounding boxes can be evaluated using IoU. The metric is the ratio of the area of overlap between the predicted and ground-truth bounding boxes to the area of their union. Multiple works have applied YOLO or similar architectures to autonomous driving applications and have achieved good prediction accuracy [77–79].

SSD. Another popular deep learning based object detection method is the Single Shot Detector (SSD) [80]. Like YOLO, SSD uses a single network. However, it divides the image into a set of default boxes over different aspect ratios. For each default box, the network predicts the confidence for all object categories and adjusts the box location to better match the object location. For the task of pedestrian detection, SSD gives high accuracy, on par with state-of-the-art object detectors [81].

RetinaNet. Like SSD and YOLO, RetinaNet is a one-stage detector [82]; it outperforms two-stage detectors such as Faster R-CNN [38] in object detection. The paper introduces Focal Loss, a new classification loss function that increases performance significantly. This loss replaces the cross-entropy loss and targets the extreme foreground-background class imbalance encountered during training.
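The IoU ratio described above can be sketched in a few lines. This is a minimal illustration, assuming axis-aligned boxes given as `(x_min, y_min, x_max, y_max)` tuples; the function name `iou` and the box layout are choices made here for illustration and are not taken from any of the evaluated frameworks.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this layout is an assumption
    made for this sketch.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    # Guard against degenerate (zero-area) boxes.
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0, 0, 2, 2)
    ground_truth = (1, 1, 3, 3)
    print(f"IoU = {iou(predicted, ground_truth):.3f}")
```

A prediction is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold; the threshold value is left to the evaluation protocol.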

4 Experimentation

4.1 Experimental Settings

We ran all our experiments on Linux CentOS 7 machines with Intel Xeon E5-2680 v4 2.40 GHz 14-core processors (128 GB) and two NVIDIA Tesla K80 Graphics Processing Units (GPUs). The footage provided by Uber is shot at 24 frames per second. To detect the pedestrian in the Uber crash video, we converted the video into a set of 165 frames, of which every frame after frame 95 is the same. Frame 95 is where the actual crash happened. The pedestrian's feet first become visible at frame 60, and the complete silhouette is visible by frame 73. In other words, the crash happened 3.95 s into the video, 0.91 s after the pedestrian was first fully visible and 1.45 s after the pedestrian's feet were first visible. Based on a subjective analysis, we deduce that Mobileye was able to identify the pedestrian at frame 74, i.e., 0.86 s before the crash.
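The timings above follow from the frame indices and the 24 frames-per-second rate. The sketch below reproduces that arithmetic; the helper name `seconds_between` and the event labels are hypothetical, and the computed values agree with the reported ones only to within about 0.02 s, presumably because of rounding.

```python
FPS = 24          # frame rate of the footage, as stated in the text
CRASH_FRAME = 95  # frame at which the crash occurs


def seconds_between(frame_start, frame_end, fps=FPS):
    """Elapsed time in seconds between two frame indices."""
    return (frame_end - frame_start) / fps


# Frame indices taken from the text; the labels are illustrative.
events = {
    "feet first visible": 60,
    "full silhouette visible": 73,
    "Mobileye detection (subjective)": 74,
}

if __name__ == "__main__":
    print(f"crash at {seconds_between(0, CRASH_FRAME):.2f} s into the video")
    for label, frame in events.items():
        lead = seconds_between(frame, CRASH_FRAME)
        print(f"{label}: frame {frame}, {lead:.2f} s before the crash")
```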

4.2 Results

Table 1 shows the time taken to process all 165 frames of the input video, as well as the FPS, for each of the image processing and enhancement techniques we applied to the video. The more "basic" techniques, such as Histogram Equalization, Canny Edge Detection, Gamma Correction, and Binary Thresholding, were able to process the whole video at over 100 frames per second. On the other hand, more robust models such as the Camera Response Model and the Multi-Exposure Fusion Framework take up to 1 s per frame. This already gives us some intuition that a real-time system that could potentially be deployed in an autonomous vehicle would need to stick to simpler image processing techniques that can process frames extremely fast.

Table 1. Processing time of image enhancement techniques

| Method                          | Time (s) | FPS    |
|---------------------------------|----------|--------|
| Histogram equalization          | 0.85     | 192.41 |
| Canny edge detection            | 1.00     | 165.16 |
| Binary thresholding             | 1.14     | 145.16 |
| Gamma correction                | 1.51     | 115.98 |
| CLAHE                           | 1.72     | 96.00  |
| Adaptive threshold segmentation | 5.08     | 32.51  |
| Motion map (Shadows)            | 9.90     | 16.56  |
| Motion map (No Shadows)         | 10.42    | 15.75  |
| Harris corner detection         | 50.50    | 3.27   |
| Camera response model           | 136.37   | 1.21   |
| Multi-exposure fusion framework | 164.55   | 1.00   |
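To make the real-time argument concrete, the sketch below recomputes throughput from the processing times in Table 1 and lists the techniques that keep up with the source frame rate. Using the 24 frames-per-second capture rate as the real-time threshold is an assumption made here, not a criterion stated by the authors, and the recomputed FPS values differ slightly from the table because the listed times are rounded.

```python
N_FRAMES = 165   # frames in the converted video
VIDEO_FPS = 24   # capture rate; used here as an assumed real-time threshold

# Total processing times in seconds, copied from Table 1.
processing_time_s = {
    "Histogram equalization": 0.85,
    "Canny edge detection": 1.00,
    "Binary thresholding": 1.14,
    "Gamma correction": 1.51,
    "CLAHE": 1.72,
    "Adaptive threshold segmentation": 5.08,
    "Motion map (Shadows)": 9.90,
    "Motion map (No Shadows)": 10.42,
    "Harris corner detection": 50.50,
    "Camera response model": 136.37,
    "Multi-exposure fusion framework": 164.55,
}


def throughput_fps(total_time_s, n_frames=N_FRAMES):
    """Frames processed per second over the whole video."""
    return n_frames / total_time_s


def real_time_capable(times, required_fps=VIDEO_FPS):
    """Names of techniques whose throughput meets the required frame rate."""
    return [name for name, t in times.items() if throughput_fps(t) >= required_fps]


if __name__ == "__main__":
    for name in real_time_capable(processing_time_s):
        print(f"{name}: {throughput_fps(processing_time_s[name]):.1f} FPS")
```

Under this assumed threshold, only the six fastest techniques qualify; any enhancement step also has to share the per-frame budget with the detector that follows it.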

Table 2 shows the time taken by the various object recognition models to process the video. We also report the frame in which the person was first detected. Both RetinaNet and YOLO were able to detect the pedestrian at frame 74, but RetinaNet's frame rate was, surprisingly, the lowest among our chosen models at only 1.57 frames per second. Another surprising result was that the classical HOG + SVM approach successfully detected the pedestrian at frame 75, only one frame after YOLO and RetinaNet. The drawback of this approach was that it produced too many false positives throughout the video. SSD nearly failed to recognize the pedestrian, first detecting it at frame 90, five frames (about 0.2 s at 24 frames per second) before the actual crash. As the Uber video is challenging due to the unknown downsampling, low quality, and low-light scenario, we also tested our models against numerous other videos and against specific frames from the Caltech Pedestrian Dataset. On this dataset, we observed a similar pattern in the accuracy and processing time of each model.


Table 2. Results. YOLO outperforms all other models in terms of both accuracy and speed of detection. SSD performs the worst, whereas RetinaNet has comparable accuracy but very slow speed

| Method    | Person detected (frame #) | Processing time (s) | FPS  |
|-----------|---------------------------|---------------------|------|
| RetinaNet | 74                        | 105.46              | 1.57 |
| HOG + SVM | 75                        | 44.23               | 3.73 |
| SSD       | 90                        | 33.36               | 4.94 |
| YOLO      | 74                        | 18.59               | 8.87 |

Of the chosen models, YOLO clearly outperforms the others in both detection accuracy and processing time. By running YOLO on the raw input video frames, we were able to detect the pedestrian at frame 74, on par with the proprietary Mobileye model by Intel. In terms of time, this is 0.86 s before the actual crash. Figure 9 shows the result of YOLO detection on frame 87.

Fig. 9. YOLO detection on frame 87. YOLO was able to detect the pedestrian as well as the bicycle

As YOLO was our best performing model, we ran the enhanced sets of frames only on this model. None of the techniques described in Sect. 3.1 improved the detection of the pedestrian, with the earliest detection remaining at frame 74. Interestingly, most of the enhancement techniques reduced the detection accuracy, with YOLO often intermittently failing to detect the pedestrian on the enhanced image sets.

5 Conclusion

In this work we showed that the YOLO object detection framework is able to detect the pedestrian from the Uber crash approximately one second before the actual crash. Our results with YOLO are on par with the proprietary models that have claimed similar results. Though we tried a variety of image enhancement and processing techniques, none of them could further improve the detection rate on the Uber video. Our experiments on other videos and datasets were limited, but we are confident that in a general low-lighting scenario these techniques would be beneficial for enabling pedestrian safety. While we still do not have a concrete answer to the original question, "Could the Uber car crash have been avoided?", our experiments show that a variety of available object detection models could successfully detect the pedestrian well before the actual accident. Whether the crash could actually have been avoided is a question best left for Uber to answer, as other factors such as decision-making time and emergency braking system performance would also come into play. It is only a matter of time before Uber reveals what went wrong in their object recognition system and why the pedestrian could not be detected.

6 Future Work

Currently, our image enhancement step is not directly linked to the object recognition models in a single pipeline. We first perform the image enhancement step and manually send the output to the recognition frameworks. As an immediate goal, we would like to create a joint model that processes low-light video frames and performs object recognition in real time. We aim to supplement this pipeline with a 'day/night' classifier that can judge whether low-light enhancement is required on an input video stream. Investigating various permutations and sequences of image enhancement techniques is another potential direction, with the aim of visually enhancing the pedestrians in a scene while filtering out unwanted background content. Further work would include fine-tuning our model specifically for pedestrian detection under various low-lighting scenarios. This would involve creating a synthetic dataset for pedestrian detection in low-light conditions by augmenting the Caltech Pedestrian Dataset. We expect this to significantly improve detection accuracy in low-lighting conditions.

References

1. Association for Safe International Road Travel: Annual global road crash statistics (2013). www.asirt.org
2. Blincoe, L.J., Miller, T.R., Zaloshnja, E., Lawrence, B.A.: The economic and societal impact of motor vehicle crashes, 2010 (Revised) (Report No. DOT HS 812 013). National Highway Traffic Safety Administration, Washington, DC (2015)


3. Anderson, J.M., Nidhi, K., Stanley, K.D., Sorensen, P., Samaras, C., Oluwatola, O.A.: Autonomous Vehicle Technology: A Guide for Policymakers. Rand Corporation (2014)
4. Fagnant, D.J., Kockelman, K.: Preparing a nation for autonomous vehicles: opportunities, barriers and policy recommendations. Transp. Res. Part A: Policy Pract. 77, 167–181 (2015)
5. Kalra, N., Paddock, S.M.: Driving to safety: how many miles of driving would it take to demonstrate autonomous vehicle reliability? Transp. Res. Part A: Policy Pract. 94, 182–193 (2016)
6. Shashua, A.: Experience counts, particularly in safety-critical areas (2018). https://newsroom.intel.com/editorials/experience-counts-particularly-safetycritical-areas/
7. Benenson, R., Mathias, M., Timofte, R., Van Gool, L.: Pedestrian detection at 100 frames per second. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2903–2910. IEEE (2012)
8. Dollár, P., Appel, R., Belongie, S., Perona, P.: Fast feature pyramids for object detection. IEEE Trans. Pattern Anal. Mach. Intell. 36(8), 1532–1545 (2014)
9. Dollár, P., Tu, Z., Perona, P., Belongie, S.J.: Integral channel features. In: Cavallaro, A., Prince, S., Alexander, D.C. (eds.) BMVC, pp. 1–11. British Machine Vision Association (2009)
10. Sermanet, P., Kavukcuoglu, K., Chintala, S., LeCun, Y.: Pedestrian detection with unsupervised multi-stage feature learning. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3626–3633. IEEE (2013)
11. Zhang, S., Bauckhage, C., Cremers, A.B.: Informed Haar-like features improve pedestrian detection. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 947–954. IEEE (2014)
12. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), vol. 1, pp. 886–893, June 2005
13. Burges, C.J.C.: A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov. 2(2), 121–167 (1998)
14. Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55(1), 119–139 (1997)
15. Benenson, R., Omran, M., Hosang, J.H., Schiele, B.: Ten years of pedestrian detection, what have we learned? CoRR, abs/1411.4304 (2014)
16. Paisitkriangkrai, S., Shen, C., Van Den Hengel, A.: Strengthening the effectiveness of pedestrian detection with spatially pooled features. In: European Conference on Computer Vision, pp. 546–561. Springer, Heidelberg (2014)
17. Liu, F., Picard, R.W.: Finding periodicity in space and time. In: Sixth International Conference on Computer Vision (IEEE Cat. No. 98CH36271), pp. 376–383, January 1998
18. Polana, R., Nelson, R.: Detecting activities. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 2–7, June 1993
19. Cutler, R., Davis, L.S.: Robust real-time periodic motion detection, analysis, and applications. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 781–796 (2000)
20. Viola, P., Jones, M.J., Snow, D.: Detecting pedestrians using patterns of motion and appearance. Int. J. Comput. Vis. 63(2), 153–161 (2005)
21. Barinova, O., Lempitsky, V., Kohli, P.: On detection of multiple object instances using Hough transforms. IEEE Trans. Pattern Anal. Mach. Intell. 34(9), 1773–1784 (2012)


22. Sharma, V., Davis, J.W.: Integrating appearance and motion cues for simultaneous detection and segmentation of pedestrians. In: IEEE 11th International Conference on Computer Vision, ICCV 2007, pp. 1–8. IEEE (2007)
23. Spinello, L., Triebel, R., Siegwart, R.: Multimodal detection and tracking of pedestrians in urban environments with explicit ground plane extraction. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2008, pp. 1823–1829. IEEE (2008)
24. Chidiac, H., Ziou, D.: Classification of image edges. Vis. Interface 99, 17–24 (1999)
25. Clavier, E., Clavier, S., Labiche, J.: Image sorting and image classification: a global approach. In: Proceedings of the Fifth International Conference on Document Analysis and Recognition, ICDAR 1999, pp. 123–126. IEEE (1999)
26. Haralick, R.M., Shapiro, L.G.: Computer and Robot Vision. Addison-Wesley, Boston (1992)
27. Bovik, A.C.: Handbook of Image and Video Processing. Academic Press, Cambridge (2010)
28. Garcia, P., Pla, F., Gracia, I.: Detecting edges in colour images using dichromatic differences. In: Seventh International Conference on Image Processing And Its Applications (Conf. Publ. No. 465), vol. 1, pp. 363–367, July 1999
29. Canny, J.: A computational approach to edge detection. In: Readings in Computer Vision, pp. 184–203. Elsevier, Amsterdam (1987)
30. Dollár, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: an evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 34(4), 743–761 (2012)
31. Wojek, C., Walk, S., Roth, S., Schiele, B.: Monocular 3D scene understanding with explicit occlusion reasoning. In: 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1993–2000. IEEE (2011)
32. Mathias, M., Benenson, R., Timofte, R., Van Gool, L.: Handling occlusions with Franken-classifiers. In: 2013 IEEE International Conference on Computer Vision (ICCV), pp. 1505–1512. IEEE (2013)
33. Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)
34. Ouyang, W., Wang, X.: A discriminative deep model for pedestrian detection with occlusion handling. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3258–3265. IEEE (2012)
35. Ouyang, W., Zeng, X., Wang, X.: Modeling mutual visibility relationship in pedestrian detection. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3222–3229. IEEE (2013)
36. Ouyang, W., Wang, X.: Joint deep learning for pedestrian detection. In: 2013 IEEE International Conference on Computer Vision (ICCV), pp. 2056–2063. IEEE (2013)
37. Wu, B., Nevatia, R.: Detection of multiple, partially occluded humans in a single image by Bayesian combination of edgelet part detectors. In: Tenth IEEE International Conference on Computer Vision, ICCV 2005, vol. 1, pp. 90–97. IEEE (2005)
38. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
39. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. CoRR, abs/1612.08242 (2016)


40. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779–788, June 2016
41. Girshick, R.: Fast R-CNN. arXiv preprint arXiv:1504.08083 (2015)
42. Saini, S., Nikhil, S., Konda, K.R., Bharadwaj, H.S., Ganeshan, N.: An efficient vision-based traffic light detection and state recognition for autonomous vehicles. In: 2017 IEEE Intelligent Vehicles Symposium (IV), pp. 606–611, June 2017
43. Girshick, R.B., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. CoRR, abs/1311.2524 (2013)
44. Zhang, L., Lin, L., Liang, X., He, K.: Is Faster R-CNN doing well for pedestrian detection? In: European Conference on Computer Vision, pp. 443–457. Springer, Heidelberg (2016)
45. Li, J., Liang, X., Shen, S.M., Tingfa, X., Feng, J., Yan, S.: Scale-aware fast R-CNN for pedestrian detection. IEEE Trans. Multimed. 20(4), 985–996 (2018)
46. Tian, Y., Luo, P., Wang, X., Tang, X.: Deep learning strong parts for pedestrian detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1904–1912 (2015)
47. Tian, Y., Luo, P., Wang, X., Tang, X.: Pedestrian detection aided by deep learning semantic tasks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5079–5087 (2015)
48. Appel, R., Fuchs, T., Dollár, P., Perona, P.: Quickly boosting decision trees – pruning underachieving features early. In: International Conference on Machine Learning, pp. 594–602 (2013)
49. Zhang, N., Donahue, J., Girshick, R., Darrell, T.: Part-based R-CNNs for fine-grained category detection. In: European Conference on Computer Vision, pp. 834–849. Springer, Heidelberg (2014)
50. Ess, A., Leibe, B., Schindler, K., Van Gool, L.: A mobile vision system for robust multi-person tracking. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2008, pp. 1–8. IEEE (2008)
51. Dollár, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: a benchmark. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 304–311, June 2009
52. Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: the KITTI dataset. Int. J. Rob. Res. 32(11), 1231–1237 (2013)
53. Wojek, C., Walk, S., Schiele, B.: Multi-cue onboard pedestrian detection. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 794–801. IEEE (2009)
54. Pizer, S.M., Amburn, E.P., Austin, J.D., Cromartie, R., Geselowitz, A., Greer, T., ter Haar Romeny, B., Zimmerman, J.B., Zuiderveld, K.: Adaptive histogram equalization and its variations. Comput. Vis. Graph. Image Process. 39(3), 355–368 (1987)
55. Reza, A.M.: Realization of the contrast limited adaptive histogram equalization (CLAHE) for real-time image enhancement. J. VLSI Sig. Process. Syst. Sig. Image Video Technol. 38(1), 35–44 (2004)
56. Land, E.H.: The retinex theory of color vision. Sci. Am. 237(6), 108–129 (1977)
57. Jobson, D.J., Rahman, Z., Woodell, G.A.: A multiscale retinex for bridging the gap between color images and the human observation of scenes. IEEE Trans. Image Process. 6(7), 965–976 (1997)
58. Lore, K.G., Akintayo, A., Sarkar, S.: LLNet: a deep autoencoder approach to natural low-light image enhancement. CoRR, abs/1511.03995 (2015)


59. Gharbi, M., Chen, J., Barron, J.T., Hasinoff, S.W., Durand, F.: Deep bilateral learning for real-time image enhancement. ACM Trans. Graph. 36(4), 1–12 (2017)
60. Shen, L., Yue, Z., Feng, F., Chen, Q., Liu, S., Ma, J.: MSR-Net: low-light image enhancement using deep convolutional network. CoRR, abs/1711.02488 (2017)
61. Fu, X., Huang, J., Zeng, D., Huang, Y., Ding, X., Paisley, J.: Removing rain from single images via a deep detail network. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
62. Cai, B., Xiangmin, X., Jia, K., Qing, C., Tao, D.: DehazeNet: an end-to-end system for single image haze removal. IEEE Trans. Image Process. 25(11), 5187–5198 (2016)
63. Xu, Y., Xu, L., Li, D., Wu, Y.: Pedestrian detection using background subtraction assisted support vector machine. In: 2011 11th International Conference on Intelligent Systems Design and Applications, pp. 837–842, November 2011
64. KaewTraKulPong, P., Bowden, R.: An Improved Adaptive Background Mixture Model for Real-time Tracking with Shadow Detection, pp. 135–144. Springer, Boston (2002)
65. Zivkovic, Z.: Improved adaptive Gaussian mixture model for background subtraction. In: Proceedings of the 17th International Conference on Pattern Recognition, ICPR 2004, vol. 2, pp. 28–31, August 2004
66. Zivkovic, Z., van der Heijden, F.: Efficient adaptive density estimation per image pixel for the task of background subtraction. Pattern Recogn. Lett. 27(7), 773–780 (2006)
67. Shadeed, W.G., Abu-Al-Nadi, D.I., Mismar, M.J.: Road traffic sign detection in color images. In: Proceedings of the 2003 10th IEEE International Conference on Electronics, Circuits and Systems, ICECS 2003, vol. 2, pp. 890–893. IEEE (2003)
68. Broggi, A., Cerri, P., Medici, P., Porta, P.P., Ghisio, G.: Real time road signs recognition. In: 2007 IEEE Intelligent Vehicles Symposium, pp. 981–986. IEEE (2007)
69. Tarel, J.-P., Hautiere, N., Caraffa, L., Cord, A., Halmaoui, H., Gruyer, D.: Vision enhancement in homogeneous and heterogeneous fog. IEEE Intell. Transp. Syst. Mag. 4(2), 6–20 (2012)
70. Singh, S., Godara, A., Gaurav, G.: Detection of partial invisible objects in images using histogram equalization. Int. J. Comput. Appl. 85(9), 40–44 (2014)
71. Pisano, E.D., Zong, S., Hemminger, B.M., DeLuca, M., Johnston, R.E., Muller, K., Braeuning, M.P., Pizer, S.M.: Contrast limited adaptive histogram equalization image processing to improve the detection of simulated spiculations in dense mammograms. J. Digit. Imaging 11(4), 193 (1998)
72. Tian, Q.M., Luo, Y.P., Hu, D.C.: Nighttime pedestrian detection with a normal camera using SVM classifier. In: Third International Conference on Image and Graphics (ICIG 2004), pp. 116–119. IEEE (2004)
73. Chang, W.-Y., Chiu, C.-C., Yang, J.-H.: Block-based connected-component labeling algorithm using binary decision trees. Sensors 15(9), 23763–23787 (2015)
74. Ying, Z., Li, G., Ren, Y., Wang, R., Wang, W.: A new image contrast enhancement algorithm using exposure fusion framework. In: International Conference on Computer Analysis of Images and Patterns, pp. 36–46. Springer, Heidelberg (2017)
75. Ying, Z., Li, G., Ren, Y., Wang, R., Wang, W.: A new low-light image enhancement algorithm using camera response model. In: 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), pp. 3015–3022, October 2017
76. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)


77. Wu, B., Iandola, F., Jin, P.H., Keutzer, K.: SqueezeDet: unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving. arXiv preprint arXiv:1612.01051 (2016)
78. Teichmann, M., Weber, M., Zoellner, M., Cipolla, R., Urtasun, R.: MultiNet: real-time joint semantic reasoning for autonomous driving. arXiv preprint arXiv:1612.07695 (2016)
79. Alvar, S.R., Bajić, I.V.: MV-YOLO: motion vector-aided tracking by semantic object detection. arXiv preprint arXiv:1805.00107 (2018)
80. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S.E., Fu, C.-Y., Berg, A.C.: SSD: single shot multibox detector. CoRR, abs/1512.02325 (2015)
81. Du, X., El-Khamy, M., Lee, J., Davis, L.: Fused DNN: a deep neural network fusion approach to fast and robust pedestrian detection. In: 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 953–961. IEEE (2017)
82. Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. arXiv preprint arXiv:1708.02002 (2017)

Towards Improved Drink Volume Estimation Using Filter-Based Feature Selection

Henry Griffith and Subir Biswas

Michigan State University, East Lansing, MI 08544, USA

Abstract. Maintaining adequate hydration is crucial for ensuring positive health outcomes. A variety of solutions have been proposed to assist individuals in achieving optimal water intake, including smart-bottles with embedded consumption tracking. To improve user convenience, we have proposed an attachable solution aimed at providing retrofittable smart-bottle functionality. This paper summarizes recent progress towards improving the volume estimation accuracy of this attachable device. Namely, four filter-based feature selection tools are employed to rank a superset of attributes describing the inclination trajectory of the bottle as estimated from an accelerometer sensor. By sequentially constructing feature sets of varying order based upon the rankings provided by each algorithm, binary volume classification accuracy is increased versus our previously employed feature set for each of the considered algorithms. Results are demonstrated using a newly constructed set of 1,200 unique drink events for partitions of varying volume separations. For the median partition case, error rate is decreased for the best-case set by 5%. For a partition along the upper and lower quartiles, a best-case improvement of over 10% is achieved.

Keywords: Hydration monitoring · Machine learning · Inertial measurement unit (IMU) sensors · Filter-based feature selection

1 Introduction

1.1 Purpose

The research described herein seeks to improve the performance of a sensing solution providing retrofittable water consumption tracking functionality to conventional water bottles. Performance improvement is sought through application of filter-based feature selection techniques within existing machine learning workflows. The device architecture consists of a six-degree-of-freedom inertial measurement unit (IMU) sensor fastened around the exterior of a bottle using an elastic strap. Features are extracted from the estimated bottle inclination angle, which is computed by leveraging the redistribution of the acceleration due to gravity within the local coordinate frame of the accelerometer under the assumption of negligible applied translational forces [1]. A set of four filter-based feature selection algorithms is used to rank 32 hand-engineered features that describe the general morphology of the inclination trajectory. Preliminary assessment of the performance of each algorithm is conducted through examination of the binary volume classification accuracy achieved using a linear support vector machine (SVM) classifier with feature subsets constructed sequentially from the top-ranked features as defined by each algorithm. Results are benchmarked against a four-element feature set applied in our prior work using a newly gathered data set consisting of 1,200 unique drinking events [2].

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 280–290, 2020. https://doi.org/10.1007/978-3-030-12388-8_20

1.2 Background

To maintain baseline hydration levels, a daily intake of up to 10% of total body water is required to offset losses associated with various biological processes [3]. Failure to maintain requisite hydration is associated with a variety of negative health outcomes, including seizures, hypovolemic shock, and death [4]. Moreover, research has noted an association between mild chronic dehydration and some forms of cancer, diabetes, and cardiovascular disease [5]. While the human body has evolved thirst as an internal regulation mechanism to ensure adequate fluid intake, mild chronic dehydration is common amongst certain populations, including physically active individuals and the elderly [6].

A variety of innovative solutions have been proposed to assist individuals in maintaining optimal hydration levels. Smartphone applications capable of estimating daily water requirements, tracking self-logged consumption, and providing reminder notifications are currently available on the market (e.g., Hydro Coach, Waterlogged). Smart-bottles, which estimate and track the volume of water consumed automatically, are also gaining in popularity (e.g., H2OPal, Ozmo Active Smart Bottle). Beyond augmented drinking devices, a variety of alternative solutions, such as wearable devices [7] and smart coasters [8], have been proposed. To improve user convenience, we have previously proposed an attachable sensor solution aimed at providing retrofittable smart-bottle functionality [2]. Further detail regarding the newly deployed sensor architecture and estimation method enhancements is provided in the following section.

2 Methods

2.1 Data Collection Protocol

All subject recruitment, data collection, and record archiving is being conducted according to a protocol approved by the Institutional Research Board at Michigan State University. Study participants completed 12 drinks during each session from a Kor Delta 750 mL refillable bottle. To ensure that a representative variety of drink volumes was captured, participants were instructed to consume a large, small, or medium volume with each drink, consistent with their typical daily consumption (hereafter denoted as a round), with bottle mass and fill height recorded after each drink. The experiment was conducted 102 times by 64 unique participants, with two trials discarded due to hardware failure, leaving 100 sessions of 12 drinks each (1,200 drink events). The diversity of drink volumes recorded over the experimental set is depicted in Fig. 1.


Fig. 1. Distribution of drink volumes (N = 1,200)

2.2 Sensor Hardware

A six-degree-of-freedom inertial measurement unit sensor, composed of a 3-axis accelerometer (Analog Devices ADXL345) and a gyroscope (InvenSense IMU-3000), was attached near the bottom of the bottle during each trial. Sensor height with respect to the bottle was controlled by ensuring that the sensor and bottle bottoms were co-aligned (chosen to ensure minimal interference with gripping), whereas the azimuthal location of the sensor was not controlled (consistent with expectations regarding intended use). For all subsequent discussion, the local coordinate system of the sensor is denoted as $(x', y', z')$, where the positive $y'$ axis points upwards along the bottle (i.e., the axial component), the positive $x'$ axis points parallel to the surface of the bottle (i.e., the parallel component), and the positive $z'$ axis points outward from the surface of the bottle (i.e., the normal component). Data were transmitted from the sensor to a MEMSIC IRIS base station through an 802.15.4 wireless link at an average rate of 20 Hz and transferred for offline processing in MATLAB through the USB port of a nearby laptop.

The output of each session was parsed into discrete drinking events using a customized algorithm that exploits the stationary position of the bottle between drinks, while bottle mass and fill height were recorded. Namely, the accelerometer component directed parallel to the bottle's axis was subjected to an initial threshold determined empirically according to the noise floor of the device, with a subsequent parameterized merging operation performed to ensure that the entire movement interval was captured. This parsing process is depicted in Fig. 2. After parsing, all signals were bias adjusted and low-pass filtered with a 2-sample moving average filter, consistent with the preprocessing scheme of our prior work.
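The threshold-then-merge parsing and the 2-sample moving average can be sketched as follows. This is only an illustration of the general idea and is not the authors' MATLAB implementation: the function names, the use of deviation from a resting value as the thresholded quantity, the `max_gap` merging parameter, and the edge handling of the filter are all assumptions introduced here.

```python
def active_runs(axial, rest_value, threshold):
    """Half-open index runs [start, end) where the axial signal deviates
    from its resting value by more than the threshold (assumed criterion)."""
    runs, start = [], None
    for i, sample in enumerate(axial):
        moving = abs(sample - rest_value) > threshold
        if moving and start is None:
            start = i
        elif not moving and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(axial)))
    return runs


def merge_runs(runs, max_gap):
    """Merge runs separated by at most max_gap samples (hypothetical parameter)
    so that a brief pause inside one drink does not split the event."""
    merged = []
    for start, end in runs:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


def moving_average_2(signal):
    """2-sample moving average; the first sample is passed through unchanged
    (an assumption about edge handling)."""
    if not signal:
        return []
    return [signal[0]] + [(signal[i - 1] + signal[i]) / 2 for i in range(1, len(signal))]


if __name__ == "__main__":
    # Synthetic axial acceleration in units of g: near 1 g at rest, lower while tilted.
    axial = [1.0, 1.0, 0.6, 0.5, 0.95, 0.5, 0.6, 1.0, 1.0, 1.0, 1.0, 1.0, 0.4, 1.0]
    runs = active_runs(axial, rest_value=1.0, threshold=0.2)
    events = merge_runs(runs, max_gap=2)
    print("raw runs:", runs)
    print("merged events:", events)
    for start, end in events:
        print(moving_average_2(axial[start:end]))
```

In practice the threshold would be set from the measured noise floor and the merging gap tuned so that a full inclination and declination cycle falls within one interval.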


Fig. 2. Visualization of drink parsing algorithm

2.3 Definition of the Inclination Trajectory Signal

The components of the accelerometer output exhibit a characteristic morphology consistent (under the assumption that any applied translational forces are negligible) with the redistribution of the acceleration due to gravity within the local sensor coordinate bases during the drinking event. As the bottle is inclined, the projection of the gravitational vector along the local axial component is reduced, with corresponding increases in the two components located in the bottle’s cross-sectional plane. This process is reversed during declination, resulting in the signal morphology exhibited in Fig. 3.
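The redistribution described above can be written out explicitly. The following is a sketch under the stated static assumption; the inclination angle $\theta$ measured from the upright position, the azimuthal mounting angle $\varphi$, and the sign conventions are introduced here for illustration and are not taken from the source.

$$
\begin{aligned}
a_{y'} &= g\cos\theta, \\
a_{x'} &= g\sin\theta\cos\varphi, \\
a_{z'} &= g\sin\theta\sin\varphi, \\
\sqrt{a_{x'}^{2} + a_{z'}^{2}} &= g\sin\theta \quad (0 \le \theta \le \pi).
\end{aligned}
$$

Under these assumptions the magnitude of the cross-sectional component does not depend on $\varphi$, which suggests why an inclination estimate built from it is insensitive to where around the bottle the sensor is strapped.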

Fig. 3. Characteristic morphology of accelerometer signals during drink event

Fig. 4. Variability in inclination angle morphology across participants/fill ratio/drink volume

284 H. Griffith and S. Biswas

The inclination trajectory may be estimated through trigonometry using the 2-argument arctangent function as shown in (1). This approach is analogous to the estimation of tilt angle performed in consumer devices for determination of screen orientation [1].

$$\hat{\theta} = \operatorname{atan2}\left(\sqrt{a_{x'}^{2} + a_{z'}^{2}},\; a_{y'}\right) \qquad (1)$$
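As a small sketch of Eq. (1), the inclination angle can be computed per sample from the three accelerometer components. The function names are illustrative, and the conversion to degrees is an assumption made here for readability (the feature definitions later use degree thresholds).

```python
import math


def inclination_deg(a_x, a_y, a_z):
    """Inclination estimate of Eq. (1), in degrees.

    a_y is the axial component; a_x and a_z lie in the bottle's
    cross-sectional plane. Translational acceleration is assumed negligible.
    """
    return math.degrees(math.atan2(math.hypot(a_x, a_z), a_y))


def inclination_trajectory(samples):
    """Apply Eq. (1) to a sequence of (a_x, a_y, a_z) tuples."""
    return [inclination_deg(ax, ay, az) for ax, ay, az in samples]


upright = inclination_deg(0.0, 1.0, 0.0)      # gravity entirely along the axis
horizontal = inclination_deg(1.0, 0.0, 0.0)   # gravity entirely in the cross-section
```

Using `atan2` rather than a plain arctangent of the ratio keeps the estimate well defined when the axial component passes through zero.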

Estimated inclination trajectories for two rounds of drink cycles for the first two participants are shown in Fig. 4, where FR denotes the fill ratio, defined as the ratio of the fill height at the beginning of the drink event relative to the initial fill. As suggested by this visualization, considerable variability in signal morphology was noted across participants, drink volumes, and fill ratios across the entire data set.

2.4

Feature Engineering

Preliminary feature engineering efforts have been employed solely on the inclination trajectory waveform defined above, with exploitation of the gyroscope outputs targeted for future efforts. Feature extraction has been motivated by the assumption that participants may control drink volume through various parameters of the inclination trajectory through principles of kinematic redundancy, including inclination angle, duration, and inclination rate. Therefore, the features tabulated in Table 1 attempt to inclusively capture various measures of signal intensity, its rate of change, and its duration, where $\theta^{j} = \{\theta_{1}^{j}, \theta_{2}^{j}, \ldots, \theta_{N_j}^{j}\}$ represents the estimated inclination angle parsed for the jth drinking event. For benchmarking purposes, classification accuracy was compared against a set of 4 features employed in our previous work as described in Table 2, where $a_{ax}^{j} = \{a_{ax,1}^{j}, a_{ax,2}^{j}, \ldots, a_{ax,N_j}^{j}\}$ represents the axial component of the accelerometer for the jth drinking event.

2.5

Filter-Based Feature Selection

Filter-based feature selection schemes describe a subset of feature selection methods which rank attributes according to the intrinsic structure of the data, operating independently of the actual classification algorithm employed. Filter methods offer the advantages of low computational complexity, along with robustness with respect to overfitting when compared to alternative feature selection techniques such as wrapper and embedded methods [9]. For preliminary investigation, a set of four algorithms was identified and applied to the feature superset described in the prior section: (1) the mutual information (MI) algorithm [10], (2) the Relief-F algorithm [11], (3) the Fisher method [12], and (4) infinite latent feature selection (ILFS) [13]. These four methods were chosen for preliminary convenience, due to the availability of the newly released MATLAB Feature Selection toolbox by the authors of the latter method. Due to the limited scope of this manuscript, readers interested in further details regarding the mechanisms of each algorithm should refer to the listed citations.
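To make the idea of a filter concrete, the sketch below ranks features by the classical Fisher score (between-class scatter divided by within-class scatter, computed per feature). This is an illustrative stand-in written for this passage: it is not the toolbox implementation used by the authors, and it is the basic score rather than the generalized variant of [12]. The names `fisher_scores` and `rank_features` are hypothetical.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Classical per-feature Fisher score.

    X -- list of samples, each a list of feature values
    y -- list of class labels, one per sample
    """
    classes = sorted(set(y))
    scores = []
    for f in range(len(X[0])):
        column = [row[f] for row in X]
        overall_mean = mean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            between += len(values) * (mean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        # A feature with zero within-class spread separates perfectly.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def rank_features(scores):
    """Feature indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


X = [[0.0, 5.0], [0.2, 1.0], [1.0, 4.0], [1.2, 2.0]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
ranking = rank_features(scores)
```

Note that the score depends only on the data and labels, never on a classifier, which is what distinguishes a filter from wrapper and embedded methods.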

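Table 1. Description of feature superset

| Feature ID | Feature symbol | Feature definition | Description |
|---|---|---|---|
| 1 | $\theta_{\max}$ | $\max(\theta^{j})$ | Maximum inclination angle during drink event |
| 2–10 | $D_{\theta}^{k}$ | $\operatorname{count}(\theta^{j} > k),\ k \in \{10^{\circ}, 20^{\circ}, \ldots, 90^{\circ}\}$ | Number of samples for which inclination angle exceeds $k$ degrees |
| 11–19 | $D^{P_k}$ | $\operatorname{count}(\theta^{j} / \theta_{\max} > P_k),\ P_k \in \{10\%, 20\%, \ldots, 90\%\}$ | Number of samples for which normalized inclination angle exceeds $P_k$ percent of maximum inclination |
| 20 | $Q_{\theta}$ | $\theta_{\max} / D_{\theta}^{10}$ | Ratio of maximum inclination value to duration for which inclination exceeds ten degrees |
| 21 | $\bar{\theta}$ | $\operatorname{mean}(\theta^{j})$ | Mean inclination angle |
| 22 | $D_{\theta}^{R}$ | $\dfrac{\arg\max(\theta^{j})}{N_j - \arg\max(\theta^{j})}$ | Ratio of time for which inclination angle is increasing relative to decreasing |
| 23–24 | $S_{\theta}$, $S_{\theta}^{T}$ | $\sum_{m=1}^{U} \theta_{m}^{j}\, T$ | Riemann sum approximation to integral of inclination curve over entire duration ($U = N_j$) or inclination interval ($U = \arg\max(\theta^{j})$) |
| 25 | $RE_{\theta}$ | $\dfrac{\max(\theta^{j}) - \theta_{1}^{j}}{\arg\max(\theta^{j}) - 1}$ | Slope of line intersecting inclination trajectory at start of trajectory and time of maximum value |
| 26 | $FE_{\theta}$ | $\dfrac{\theta_{N_j}^{j} - \max(\theta^{j})}{N_j - \arg\max(\theta^{j})}$ | Slope of line intersecting inclination trajectory at time of maximum value and end of trajectory |
| 27 | $\theta'_{\max,T}$ | $\max\left(\theta'^{\,j}(1 : \arg\max(\theta^{j}))\right)$ | Maximum rate of inclination, where $\theta'$ is a numerical estimate of the derivative of $\theta$ |
| 28 | $\theta'_{\max,A}$ | $\max\left(\theta'^{\,j}(\arg\max(\theta^{j}) : N_j)\right)$ | Maximum rate of declination |
| 29 | $\bar{\theta}'_{T}$ | $\operatorname{mean}\left(\theta'^{\,j}(1 : \arg\max(\theta^{j}))\right)$ | Mean rate of inclination |
| 30 | $\bar{\theta}'_{A}$ | $\operatorname{mean}\left(\theta'^{\,j}(\arg\max(\theta^{j}) : N_j)\right)$ | Mean rate of declination |
| 31 | $s_{\theta'}^{T}$ | $\operatorname{std}\left(\theta'^{\,j}(1 : \arg\max(\theta^{j}))\right)$ | Standard deviation of rate of inclination |
| 32 | $s_{\theta'}^{A}$ | $\operatorname{std}\left(\theta'^{\,j}(\arg\max(\theta^{j}) : N_j)\right)$ | Standard deviation of rate of declination |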
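A few of the Table 1 features can be computed directly from a parsed inclination trajectory, as sketched below. The function and dictionary key names are illustrative only; indices follow the 1-based convention of the table, and the guards against division by zero (peak at the first or last sample) are assumptions added here, since the source does not discuss those edge cases.

```python
def inclination_features(theta):
    """Subset of Table 1 features for one drink event.

    theta -- inclination angles in degrees, one per sample.
    """
    n = len(theta)
    peak = max(theta)
    k_peak = theta.index(peak) + 1  # 1-based argmax, as in the table

    features = {
        "max": peak,                                   # feature 1
        "mean": sum(theta) / n,                        # feature 21
        # features 2-10: samples above 10, 20, ..., 90 degrees
        "count_above": {k: sum(1 for t in theta if t > k)
                        for k in range(10, 100, 10)},
    }
    # feature 22: samples up to the peak relative to samples after it
    features["rise_fall_ratio"] = (
        k_peak / (n - k_peak) if k_peak < n else float("inf"))
    # feature 25: slope from first sample to the peak
    features["rising_slope"] = (
        (peak - theta[0]) / (k_peak - 1) if k_peak > 1 else 0.0)
    # feature 26: slope from the peak to the last sample
    features["falling_slope"] = (
        (theta[-1] - peak) / (n - k_peak) if k_peak < n else 0.0)
    return features


example = inclination_features([0.0, 20.0, 40.0, 30.0, 10.0, 0.0])
```

Slopes here are expressed per sample; multiplying by the sampling rate would convert them to degrees per second.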

Table 2. Benchmark feature set

| Feature ID | Feature symbol | Feature definition | Description |
|---|---|---|---|
| 1B | H | $\max(a_{ax}^{j}) - \min(a_{ax}^{j})$ | Range of axial accelerometer signal |
| 2B | D | $N_j$ | Duration of drink event |
| 3B | M | $\operatorname{mean}(a_{ax}^{j})$ | Mean of axial accelerometer signal |
| 4B | R | $\dfrac{\arg\max(a_{ax}^{j})}{N_j - \arg\max(a_{ax}^{j})}$ | Ratio of time for which inclination angle is increasing relative to decreasing |

To begin the analysis, all data were partitioned into a training and testing set using 90:10 holdout validation. Each algorithm was subsequently implemented on the training set to develop rankings for each member of the feature superset. Once rankings were obtained using each of the four methods, feature sets of varying order were formed by choosing the top-k ranked features from each method, with $k \in \{1, 2, \ldots, 15\}$. Performance was then assessed according to the binary classification accuracy resulting from employing each set in a support vector machine classifier which was trained and tested using the same partition. Binary classification was chosen for the current analysis due to its simplicity and direct compatibility with the filtering algorithms considered. However, as quantitative estimates of drink volume are ultimately desired, future work will explore regression techniques.
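The construction of the candidate feature sets and of the binary labels can be sketched as follows. The helper names are hypothetical, the median split follows the class definition given with Table 3, and the classifier itself is deliberately left out: any binary classifier could be evaluated on each subset, whereas the paper uses a support vector machine.

```python
from statistics import median


def median_labels(volumes):
    """Label 1 for drink volumes above the median, 0 otherwise."""
    m = median(volumes)
    return [1 if v > m else 0 for v in volumes]


def top_k_feature_sets(ranking, max_k=15):
    """Nested feature subsets: the top-1, top-2, ..., top-max_k ranked features."""
    limit = min(max_k, len(ranking))
    return [ranking[:k] for k in range(1, limit + 1)]


labels = median_labels([10.0, 30.0, 20.0, 40.0])
subsets = top_k_feature_sets([4, 2, 7], max_k=15)
```

Because each subset simply extends the previous one by the next ranked feature, this construction ignores interactions between features, a limitation the authors acknowledge in their conclusions.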

Fig. 5. Two analysis partitions considered: top: entire data set, bottom: extended class separation

To investigate variability in volume estimation performance for varying degrees of class separation, the analysis was repeated for both the entire data set ($N_A = 1{,}200$) as well as a subset ($N_B = 588$) developed by first partitioning the entire set into quartiles, and subsequently choosing only drink volumes above the 75th percentile and below the 25th percentile. These two partitions are depicted in Fig. 5.
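A minimal sketch of the extended class separation partition is given below. It assumes Python's `statistics.quantiles` with its default ("exclusive") method as the percentile definition; the authors' MATLAB percentile convention is not stated and may differ slightly. The function name is hypothetical.

```python
from statistics import quantiles


def extreme_quartile_subset(volumes):
    """Keep only volumes below the 25th or above the 75th percentile.

    Returns (indices, labels): label 1 for the upper group, 0 for the lower.
    """
    q1, _, q3 = quantiles(volumes, n=4)
    indices, labels = [], []
    for i, v in enumerate(volumes):
        if v > q3:
            indices.append(i)
            labels.append(1)
        elif v < q1:
            indices.append(i)
            labels.append(0)
    return indices, labels


idx, lab = extreme_quartile_subset(list(range(1, 13)))
```

Discarding the middle half of the distribution widens the gap between the two classes, which is why lower error rates are expected on this partition.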

3 Results

Total classification error rates for each of the aforementioned partitions are tabulated below in Tables 3 and 4. For ease of visualization, feature sets whose performance exceeds the benchmark value (listed in the heading of each table) are bolded and underlined. The minimum error achieved for each feature selection algorithm across all subsets is denoted in the final row of each table. As noted, there exist at least three top-k ranked feature sets constructed naively from the rankings provided by each filtering algorithm which provide superior classification performance relative to the benchmark. The proportion of considered subsets providing superior performance increases for the extended class separation partition, as shown in Table 4.

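Table 3. Entire data set ($A_1 = \{V : V > \operatorname{median}(V)\}$, $A_0 = \{V : V \le \operatorname{median}(V)\}$). Benchmark error rate: 34.2%

| k | % Error (MI) | % Error (Relief-F) | % Error (Fisher) | % Error (ILFS) |
|---|---|---|---|---|
| 1 | 40.0 | 40.0 | 48.3 | 40.0 |
| 2 | 36.7 | 38.3 | 44.2 | 37.5 |
| 3 | 36.7 | 36.7 | 44.2 | 37.5 |
| 4 | 35.8 | 35.0 | 42.5 | 35.8 |
| 5 | 35.0 | 35.0 | 34.2 | 35.8 |
| 6 | 35.0 | 35.0 | 34.2 | 35.0 |
| 7 | 35.8 | 35.0 | **33.3** | 35.0 |
| 8 | 35.8 | **31.7** | 34.2 | 35.0 |
| 9 | 34.2 | **33.3** | **33.3** | 34.2 |
| 10 | 34.2 | **33.3** | **33.3** | 34.2 |
| 11 | **32.5** | **33.3** | 34.2 | 35.0 |
| 12 | **33.3** | **33.3** | 34.2 | 34.2 |
| 13 | **33.3** | **30.8** | 34.2 | **32.5** |
| 14 | 34.2 | **30.0** | 37.5 | **32.5** |
| 15 | 34.2 | **29.2** | 36.7 | **33.3** |
| Min. | 32.5 | 29.2 | 33.3 | 32.5 |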

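Table 4. Extended class separation ($B_1 = \{V : V > \operatorname{percentile}(V, 75)\}$, $B_0 = \{V : V \le \operatorname{percentile}(V, 25)\}$). Benchmark error rate: 24.1%

| k | % Error (MI) | % Error (Relief-F) | % Error (Fisher) | % Error (ILFS) |
|---|---|---|---|---|
| 1 | 24.1 | 25.9 | 34.5 | 31.0 |
| 2 | **22.4** | **22.4** | 34.5 | **17.2** |
| 3 | **22.4** | **19.0** | 24.1 | **17.2** |
| 4 | **19.0** | **17.2** | **22.4** | **17.2** |
| 5 | **19.0** | **22.4** | **20.7** | **20.7** |
| 6 | **19.0** | **22.4** | **20.7** | **20.7** |
| 7 | **20.7** | **19.0** | **20.7** | **17.2** |
| 8 | **20.7** | **20.7** | **19.0** | **17.2** |
| 9 | **20.7** | **15.5** | 25.9 | 25.9 |
| 10 | **17.2** | **13.8** | 25.9 | 25.9 |
| 11 | **17.2** | **13.8** | 24.1 | 24.1 |
| 12 | **17.2** | **13.8** | 24.1 | **22.4** |
| 13 | **17.2** | **17.2** | **22.4** | 24.1 |
| 14 | **17.2** | **17.2** | **22.4** | 24.1 |
| 15 | **19.0** | **15.5** | 24.1 | 27.6 |
| Min. | 17.2 | 13.8 | 19.0 | 17.2 |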

4 Conclusions and Future Work

A set of four novel filter-based feature selection tools (mutual information (MI) algorithm, Relief-F algorithm, Fisher method, and infinite latent feature selection (ILFS) algorithm) were investigated herein for purposes of improving the volume estimation accuracy of an attachable sensor providing retrofittable smart-bottle functionality. Features were extracted from the bottle's estimated inclination trajectory, which was computed using trigonometry under the assumption of negligible applied translational forces. Sets of varying order formed naively through forward sequential addition of ranked features exhibited improved volume classification performance versus our previously considered feature set for each algorithm considered, with the Relief-F algorithm providing the best performance across the two analyses performed. Classification was accomplished using a linear SVM, trained and tested using a 90:10 holdout validation partition on a newly collected data set consisting of 1,200 unique drink events. In the initial analysis, responses were labeled according to the location of each drink volume in the distribution with respect to the median. For each of the four filtering techniques considered, at least 3 of the 15 feature sets sequentially constructed provided improved classification accuracy, with a best-case reduction in error rate of 5 percentage points (from 34.2% to 29.2%). The analysis was repeated for a partition consisting of drink volumes above the 75th and below the 25th percentile. As expected, classification performance was enhanced when the separation between volume distributions was increased, with a best-case reduction in classification error relative to our prior benchmark of over 10 percentage points (from 24.1% to 13.8%).

As the sequential method utilized for the construction of feature sets within this work is known to be suboptimal due to its failure to consider feature interactions, future efforts will focus on further enhancing volume accuracy through application of more sophisticated set construction techniques. Moreover, to achieve the desired sensor functionality, efforts will increase partition resolution for class labeling, ultimately moving towards continuous regression approaches. Additionally, consideration of alternative classification models, along with utilization and fusion of information provided by the gyroscope sensor output, will be performed.

References

1. Pedley, M.: Tilt sensing using a three-axis accelerometer. Freescale Semicond. Appl. Note 1, 2012–2013 (2013)
2. Dong, B., Gallant, R., Biswas, S.: A self-monitoring water bottle for tracking liquid intake, pp. 311–314 (2014)
3. Lentner, C.: Geigy Scientific Tables, 8th edn. Ciba-Geigy, Basle (1981)
4. Popkin, B.M., D'Anci, K.E., Rosenberg, I.H.: Water, hydration and health. Nutr. Rev. 68(8), 439–458 (2010)
5. Maughan, R.J.: Impact of mild dehydration on wellness and on exercise performance. Eur. J. Clin. Nutr. 57(S2), S19–S23 (2003)
6. Begum, M.N., Johnson, C.S.: A review of the literature on dehydration in the institutionalized elderly. e-SPEN Eur. e-J. Clin. Nutr. Metab. 5(1), e-47–e-53 (2010)
7. Amft, O., Bannach, D., Pirkl, G., Kreil, M., Lukowicz, P.: Towards wearable sensing-based assessment of fluid intake. In: 2010 8th IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops), pp. 298–303 (2010)
8. Chan, A.B., Scaer, R.: Hydration Tracking Coaster with BLE Android App (2018)
9. Guyon, I.: Feature Extraction: Foundations and Applications, vol. 207. Springer Science & Business Media, Heidelberg (2006)
10. Zaffalon, M., Hutter, M.: Robust feature selection using distributions of mutual information. In: Proceedings of the 18th International Conference on Uncertainty in Artificial Intelligence (UAI 2002), pp. 577–584 (MI) (2002)
11. Liu, H., Motoda, H. (eds.): Computational Methods of Feature Selection. Chapman & Hall Data Mining and Knowledge Discovery Series (RELIEF-F) (2008)
12. Gu, Q., Li, Z., Han, J.: Generalized fisher score for feature selection. Computing Research Repository (CoRR) (2012)
13. Roffo, G., Melzi, S., Castellani, U., Vinciarelli, A.: Infinite latent feature selection: a probabilistic latent graph-based ranking approach. In: Computer Vision and Pattern Recognition (2017)

Coordinated Scheduling of Fuel Cell-Electric Vehicles and Solar Power Generation Considering Vehicle to Grid Bidirectional Energy Transfer Mode

Benslama Sami1, Nasri Sihem2, Zafar Bassam1, and Cherif Adnen2

1 Information System Department, King Abdulaziz University, Jeddah, Saudi Arabia
2 Analysis and Processing of Electric and Energetic Systems Unit, Faculty of Sciences, Tunis EL MANAR II, PB1060, Tunis, Tunisia

Abstract. Home-to-Vehicle (H2V) is an interesting research area because its public services incorporate new technologies and new devices for better quality of life. The objective is to study and analyze house energy needs in order to optimize energy production more efficiently and economically. In this context, hydrogen-based hybrid electric stand-alone systems are considered a promising option to ensure efficient, uninterrupted power generation and to meet vehicle fuel requirements. To this end, a specific H2V simulation system is developed incorporating electrolyzer technology, solar energy and a supercapacitor. To maintain the energy balance between demand and production, the excess electrical energy is stored in different forms (electrical or chemical (H2 gas)) according to the system constraints. The hydrogen produced from the excess then fuels the vehicle after its state and needs have been analyzed. The flows are exchanged between the home and the PEMFC hybrid electric vehicle while supplying the appropriate amount of H2. It is therefore necessary to develop an intelligent energy management (IEM) strategy for the H2V system. The proposed IEM processes user preferences and manages the energy production and storage. The results obtained are tested and discussed using MATLAB/Simulink software.

Keywords: Hydrogen · Supercapacitor · Vehicle · Control · Storage · Production · H2V

Nomenclature

$I_{GEN}$: PV generated current (A)
$I_{DEM}$: Load consumption current (A)
$I_{AP}$: Appliance consumption current (A)
$Q_P$: H2 produced amount (mol)
$SOC_{SC}$: Supercapacitor state of charge
$I_{SC}$: Supercapacitor current (A)
$I_{SC}^{max}$: Supercapacitor maximum current (A)
$P_{st}$: Tank pressure (bar)
$T_{st}$: Tank temperature (°C)
$V_{st}$: Tank volume (l)
$SOC_{H2}$: H2 tank state of charge
$Q_S$: H2 tank stored amount (mol)
$Q_{Smax}$: H2 tank maximum stored amount (mol)
$N_{EL}$: Number of electrolyzer cells
$SOC_{H2}^{EST}$: Estimated H2 tank state of charge
$SOC_{SC}^{CH}$: Estimated supercapacitor state of charge
$I_{ST}$: Estimated excess generated current (A)
$I_{SC}^{CH}$: Supercapacitor charging current (A)
$Q_{H2V}$: Vehicle fuel delivery (mol)
$Q_n$: H2 needed amount (mol)
$Q_{VEH}$: Vehicle fuel reserve (mol)
$F$: Faraday constant
$R$: Ideal gas constant

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 291–303, 2020. https://doi.org/10.1007/978-3-030-12388-8_21

1 Introduction

Excessive energy consumption, together with its environmental impact and the imbalance between demand and production, is a major research problem for various communities and calls for an immediate and radical solution [1, 2]. The technology developed for smart home networks has emerged as an important solution: a set of existing and emerging, standards-based and interoperable appliances and technologies intended to develop the existing power grid [3, 4]. In addition, the use of renewable energies (such as solar and wind energy) to produce electricity and the use of vehicles to supply domestic energy have become an opportunity for effective investigation. By exploiting new technologies, new homes can be designed to incorporate charging stations for vehicles (especially electric vehicles), which can later serve as potential sources of energy for the home [5, 6]. Various smart home projects have considered home recharging of vehicles (H2V) a novelty in a sophisticated home energy system, in which V2H has been described as an energy system used to provide electricity to the house [7]. The proposed H2V technology is thus intended to provide the amount of H2 fuel required to operate the vehicle. Renewable energy sources such as solar energy and wind turbines can also be considered as a source of energy for vehicles in order to meet the energy demand of the home [8]. In addition, several methods and platforms have been studied and developed for the smart home, such as intelligent sensor technologies, home networks, and smart home appliances [9]. However, the full potential of smart homes has yet to be realized, owing to the complexity and diversity of the systems and to control strategies that are repeated without addressing optimality. Intelligent home energy management aims to control the acquisition and application of data, as well as generation, transmission and grid electricity [10, 11]. This smart management has also attracted growing interest from the research community in applying modern automation technology to the smart home.

This article aims to present and develop a precise smart home management scheme for vehicle fuel charging using a hybrid hydrogen power system. To control energy production and storage, and to ensure the fuel exchange between home and vehicle, an intelligent energy management (IEM) strategy is presented that is intended to:

• Synchronize domestic energy production with electric vehicle demand.
• Regulate residential demand and surplus production by storing the surplus under proper conditions.

The next section briefly reviews the published articles, roughly classified by thematic area. The paper is organized as follows. Section 2 surveys the related works and contributions; Sect. 3 presents the system model description. An intelligent energy management (IEM) strategy is proposed and developed in Sect. 4. Simulation results are presented and evaluated in Sect. 5. Finally, conclusions are drawn in Sect. 6.

2 Literature Review and Contributions

2.1

Literature Review

To highlight the contributions of our work, this section surveys reported research on home-vehicle applications and on household energy management in smart homes. In the literature, several studies have presented different applications of H2V. The proposed configurations analyze energy production as a function of power needs, based on the possibility of charging an electric vehicle at home for later use as backup energy. For example, the authors in [12, 13] presented a system using a plug-in hybrid electric vehicle (PHEV) which can be recharged by connecting to an external power supply. In order to optimize the proposed model, an efficient algorithm was developed and discussed. Subsequently, the proposed system design was extended to a setting that combines several homes, solar power systems and electric vehicles. To balance the demand and the supply of energy, a new control algorithm was suggested. The results obtained demonstrate the effectiveness of the home-vehicle system under different scenarios. The authors of [14, 15] proposed a smart home energy management scheme that optimally schedules appliances based on real-time electricity price forecasts. The results obtained demonstrate the effectiveness of the proposed algorithm in coordinating the load request and the load. In [5, 6], the battery energy storage system (BESS) and the household appliances were scheduled in a coordinated way under high solar penetration. In [15, 16], a load engagement platform was proposed to minimize household operating costs. The authors in [17] proposed a renewable hybrid power system for vehicle charging from home using H2V technology. The proposed system aims to optimize a framework for efficient energy management and component sizing of a single smart home with a home battery, PEV, and photovoltaic (PV) arrays. It seeks to maximize the home economy while satisfying home power demand and PEV driving. It develops a CP control law in home to vehicle (H2V) mode and vehicle to home (V2H) mode that aims not to sell electric energy from the grid during peak electricity price periods.

2.2

Contributions

Compared to previous related work, the main contribution expected of this work is to develop an accurate smart home management scheme for charging a vehicle at home using home-to-vehicle (H2V) technology, in order to achieve the following improvements:

• Ensure that the energy storage process takes place under favorable conditions.
• Protect the system components: prevent supercapacitor overcharging and overproduction of H2 gas.
• Charge the vehicle with fuel from home while taking the energy demand into consideration.

3 System Design and Description

The proposed system aims to study the energy balance between demand and production, particularly the storage process and vehicle charging from the house. Hence, the described system has two main duties:

• Controlling energy storage while taking the demand into consideration.
• Regulating vehicle charging when the vehicle lacks fuel.

In this context, the study and development of the energy storage system is highlighted. The system is composed of a set of equipment, namely an electrolyzer and a supercapacitor, employed to regulate the storage of excess energy. In addition, the system uses the hydrogen flow generated at home to supply the electric vehicle (H2V) in order to compensate for insufficient vehicle fuel and to control the energy production by storing the excess (see Fig. 1). Moreover, an Intelligent Energy Management (IEM) strategy is presented to control the system operation and protect it against H2 fuel overproduction and supercapacitor overcharging.

Fig. 1. H2V system design: (a) general design; (b) MATLAB/Simulink prototype. [Diagram blocks: PV roof; H2 production station (electrolyzer); electricity storage (supercapacitor); H2 station. Flows: electricity, power excess, H2 gas, H2 flow, deficit power process (V2H).]

• Energy production: the energy supplied by the photovoltaic module located on the house roof. The PV generated current is expressed as [18]:

$$I_{GEN} = I_{ph} - I_s\, e^{\frac{N_S V_{PV} + N_P I_{PV} R_s}{V_T}} - \frac{N_S V_{PV} + N_P I_{PV} R_s}{R_{sh}} \qquad (1)$$

• Energy consumption: defined as the total consumption of the installed appliances. This energy may be used for lighting, heating, leisure, etc.

$$I_{DEM} = \sum_{i=0}^{n} I_{AP_i} \qquad (2)$$

• Electrolyzer: an electrochemical device that, in the presence of an excess power current, produces hydrogen by decomposing the water molecule into H2 and O2 gas. The generated hydrogen flow is calculated from Faraday's law as [19]:

$$Q_P = \frac{\eta_{ELF}\, N_{EL}\, I_{EL}}{2F} \qquad (3)$$

• Supercapacitor: an electrical device whose discharge accomplishes the energy recovery process. To protect it against deep discharge, the supercapacitor is controlled through its state of charge, given by the following expression [20]:

$$SOC_{SC} = \left(\frac{I_{SC}}{I_{SC}^{max}}\right)^{2} \qquad (4)$$

Instead of a battery, a supercapacitor is used as a short-term energy storage device that intervenes in periods when the excess generated power is lower than the nominal power of the electrolyzer.

• H2 station: equipment dedicated to storing the produced hydrogen flow. Storage takes place under high pressure and follows the law described by the equation below [21]:

$$P_{st} = \frac{R\, T_{st}}{V_{st}}\, Q_S \qquad (5)$$

The hydrogen storage is controlled by checking the state of charge, expressed as:

$$SOC_{H2} = \frac{Q_S}{Q_{Smax}} \qquad (6)$$

• Electric vehicle: a fuel-cell-based electric vehicle that requires hydrogen as fuel to run.
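The component relations in Eqs. (3) to (6) can be evaluated with a few lines of code, sketched below. Several points are assumptions rather than statements from the paper: the Faraday efficiency defaults to 1, Eq. (3) is read as a molar rate (mol/s) when the current is in amperes, and Eq. (5) is evaluated in SI units (kelvin, cubic metres, pascals), whereas the nomenclature lists °C, litres and bar. All function names are illustrative.

```python
FARADAY = 96485.33212        # C/mol
GAS_CONSTANT = 8.314462618   # J/(mol K)


def h2_production_rate(i_el, n_cells, faraday_efficiency=1.0):
    """Eq. (3): hydrogen production rate in mol/s for current i_el in A."""
    return faraday_efficiency * n_cells * i_el / (2.0 * FARADAY)


def supercap_soc(i_sc, i_sc_max):
    """Eq. (4): supercapacitor state of charge."""
    return (i_sc / i_sc_max) ** 2


def tank_pressure_pa(q_s, t_st_kelvin, v_st_m3):
    """Eq. (5): ideal-gas tank pressure in Pa (SI units assumed)."""
    return GAS_CONSTANT * t_st_kelvin * q_s / v_st_m3


def tank_soc(q_s, q_s_max):
    """Eq. (6): hydrogen tank state of charge."""
    return q_s / q_s_max


rate = h2_production_rate(i_el=10.0, n_cells=5)   # N_EL = 5 as in the parameter table
soc_sc = supercap_soc(3.0, 6.0)
pressure = tank_pressure_pa(q_s=1.0, t_st_kelvin=300.0, v_st_m3=0.01)
soc_h2 = tank_soc(31.0, 50.0)
```

The factor 2 in Eq. (3) reflects the two electrons transferred per H2 molecule, so the rate scales linearly with both the cell count and the electrolyzer current.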

4 Intelligent Energy Management

The overproduction of energy, together with the intermittent nature of renewable sources, makes energy storage a major challenge in covering demand during critical shortage periods. To achieve the best control of energy flow transfer, an optimal energy management method is presented that strikes a balance between production and demand by organizing the storage activities. In this context, the proposed system uses an intelligent management strategy based on a theoretical analysis of the home vehicle recharging station (H2V). The H2V technology relies on charging the vehicle with fuel from the home. The basic idea of the IEM rests on the measurement of several control parameters, such as the energy production and the home electricity demand, and on vehicle charging conditions based on need analysis and control. Hence, a model is used to forecast and estimate the excess energy and the amount of vehicle fuel required. To clarify how the presented management strategy works, an algorithm represented by a control flow chart is given in Fig. 2, followed by a Petri net model (see Fig. 3). The Petri net is used instead of linear programming to overcome the difficulty of developing an analytical model, which cannot accurately reflect all system constraints. It offers a procedural description by detailing a set of cause-and-effect sequences. This approach is mainly used in the preparation phases of a space mission to validate different decisions (Table 1).

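[Figs. 2 and 3: control flow chart and Petri net model of the IEM. Surviving labels: System Startup; Energy Demand: $I_{DEM}$; Energy Generation: $I_{GEN}$; Estimate H2 tank state, $SOC_{H2}^{EST} = f(I_{ST})$; Estimate SCap charging state, $SOC_{SC}^{CH} = f(I_{ST})$; transitions TE1, TE4, TE5 ($SOC_{H2} > 1$).]

Table 1. Next states and energy flows

| Next state | Energy flows |
|---|---|
| SE1: Excess process | $I_{ST} = I_{GEN} - I_{DEM}$ |
| SE2: H2 production | $I_{EL} = I_{ST}$ |
| SE3: SCap charging | $I_{SC} = I_{ST}$ |
| Home vehicle charging | $Q_{H2V} = Q_n - Q_{VEH}$ |
| SE2: H2 production | $I_{EL} = I_{ST}$ |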
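The dispatch rule suggested by the text and by the energy flows above can be sketched as a single decision step. This is a hedged reading, not the authors' Petri net: the routing of excess current to the supercapacitor when it is below the electrolyzer nominal current follows the results discussion, while the state-of-charge guards, the `q_stored` availability check and all function and parameter names are assumptions introduced for illustration.

```python
def iem_step(i_gen, i_dem, i_nominal, soc_h2, soc_sc,
             q_n, q_veh, q_stored, soc_max=1.0):
    """One decision step of a simplified IEM dispatch.

    Returns the excess current and how it is routed:
    i_el  -- current sent to the electrolyzer (H2 production)
    i_sc  -- current sent to the supercapacitor
    q_h2v -- hydrogen delivered to the vehicle (mol)
    """
    i_st = i_gen - i_dem  # excess current, I_ST = I_GEN - I_DEM
    action = {"i_st": i_st, "i_el": 0.0, "i_sc": 0.0, "q_h2v": 0.0}

    if i_st > 0:
        if i_st < i_nominal:
            # Excess too small for the electrolyzer: charge the supercapacitor,
            # unless that would overcharge it (assumed guard).
            if soc_sc < soc_max:
                action["i_sc"] = i_st
        elif soc_h2 < soc_max:
            # Enough excess to run the electrolyzer, and the tank is not full.
            action["i_el"] = i_st

    # Refuel the vehicle only if it is short of fuel and enough hydrogen
    # is stored (assumed availability check).
    shortfall = q_n - q_veh
    if shortfall > 0 and q_stored >= shortfall:
        action["q_h2v"] = shortfall
    return action


small_excess = iem_step(5.0, 3.0, 4.0, 0.5, 0.5, q_n=10.0, q_veh=4.0, q_stored=20.0)
large_excess = iem_step(10.0, 3.0, 4.0, 0.5, 0.5, q_n=10.0, q_veh=10.0, q_stored=20.0)
```

Keeping the routing decision in one pure function makes it straightforward to replay a generation and demand profile step by step and inspect which component is activated in each excess period.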

The energy flows and the losses are taken into account in the simulation prototype in order to analyze the system behavior in a relevant way, supported by adequate and appropriate dimensioning. To keep the model general and to focus on the development of the management and supervision approach, these details are not covered in this work. The innovation of the proposed management strategy lies in controlling the transfer of hydrogen flow from home to car: a local production station allows residents to check the condition of their electric car and to recharge it if necessary with the hydrogen produced, whereas other works in this context only treat the use of a battery. In addition, the house is considered to be self-electrified from the PV panels installed on the roof so as to be independent of the power grid and its disturbances. This is why H2V technology is considered here instead of V2G, which remains another approach to be developed and studied in the near future.

5 Findings and Results

To evaluate the system performance and to demonstrate its robustness against unexpected events, numerous simulation tests were carried out in the MATLAB/Simulink environment. For the simulation, experimental consumption profiles were used to describe the estimated electricity delivered to meet the energy required by the domestic appliances. The profile covers the house energy consumption over a 4-day period (96 h). Figure 4 illustrates the balance between energy production and consumption. As can be seen, the system goes through four excess periods, during which it chooses the appropriate storage component according to the control algorithm.

Fig. 4. Energy balance between demand and generation

During the first excess period, the excess current is lower than the electrolyzer nominal current ($I_{ST} < I_N$). The transition TE4 is therefore validated, which activates the supercapacitor to carry out the storage process (see Fig. 5). In the meantime, the electrolyzer remains inactive, which explains why no hydrogen is produced. In the second period, however, the home is ready to generate H2 gas because the required conditions are met. At this time, the system checks the vehicle status and analyzes its needs (see Fig. 7). For $Q_{VEH} < Q_n$, the home supplies the vehicle to compensate for the lack of fuel. The operation of the components during the excess periods is summarized in Fig. 6. Hydrogen production is allowed during four excess power periods, while the supercapacitor intervenes during eight periods.

Fig. 5. Energy recovery generation

Fig. 6. Component activation versus simulation time

Fig. 7. Vehicle H2 fuel control

The balance between the amount of stored hydrogen and the vehicle fuel supplied is presented in Fig. 8, which shows the periods of home-vehicle charging. The quantity of hydrogen stored is proportional to that of the vehicle in order to ensure its supply under favorable conditions. However, the vehicle is only refueled from home when enough hydrogen has been produced.

Fig. 8. Balance between H2 production and storage

Moreover, Fig. 9 shows the variation in the state of charge of both the H2 storage tank and the supercapacitor, which is monitored to control the inlet energy flow and protect the components against overproduction and overcharging. As can be seen, the SCap and H2 states of charge are kept below the maximum ($SOC_{H2}$ = 62% < 100%; $SOC_{SC}$ = 70% < 100%).

Fig. 9. Components states of charge

Finally, we studied the mean efficiency of the system; the result is depicted in Fig. 10. The system reaches an overall efficiency of 13.7%, which is limited by the low PV efficiency (17.75%). The SCap efficiency (94.48%), however, exceeds that of the electrolyzer (81.7%).

Fig. 10. Mean efficiencies of system and its components (PV: 17.75%; EL: 81.70%; SCap: 94.48%; System: 13.70%)
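As a quick arithmetic check, the reported figures are numerically consistent with the overall efficiency being the product of the three component efficiencies, as would hold for components acting in series. The paper does not state how the overall value was obtained, so this relation is an observation made here, not the authors' stated method.

```python
def chained_efficiency(*efficiencies):
    """Overall efficiency of stages in series: the product of stage efficiencies."""
    total = 1.0
    for eta in efficiencies:
        total *= eta
    return total


# Mean component efficiencies reported in Fig. 10.
eta_pv, eta_el, eta_scap = 0.1775, 0.8170, 0.9448
eta_system = chained_efficiency(eta_pv, eta_el, eta_scap)
```

Under this reading the PV stage dominates the loss: even ideal storage components could not lift the overall figure above 17.75%.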

6 Conclusion

In this paper, a model of a home-based vehicle charging station is proposed and developed. This work was carried out as part of the study of a home self-electrification system based on rooftop PV modules that supports the fuel supply to the stationary car. The main purpose was to show the value of hydrogen energy storage in covering the home energy needs on the one hand and the vehicle fuel on the other. To monitor the power distribution state and the transfer of energy flows, a smart supervisory strategy is proposed and analyzed. The system performance is then tested and evaluated using a model prototype built in MATLAB/Simulink. Based on the results obtained, the proposed modeling and management methods prove reliable and effective in handling the events and states that hinder system operation, as the home vehicle charging process is carried out under favorable conditions. As future work, we intend to study the overall system behavior by integrating the energy recovery part, which examines the impact of deficit power and the possibility of covering it through the vehicle (V2H) to meet the house requirements.


Input simulation parameters of the H2V system

Parameter                           Value
PV maximal power: Pmax              60 W
PV maximal voltage: Vmax            17.1 V
PV maximal current: Imax            3.5 A
PV short-circuit current: ISC       3.8 A
PV open-circuit voltage: VOC        21.1 V
PV cell numbers                     NS = 3; Np = 6
SC state of charge: SOCmax          0.87
Number of cells                     NEL: 5


Optimization of Bus Service with a Spatio-Temporal Transport Pulsation Model

Shuhan Lou1, Ling Peng2, Yunting Song3, Xuantong Chen4, and Chengzeng You2

1 Capital Normal University, Beijing 100048, China
2 Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100094, China
3 Wuhan University, Wuhan 430072, China
4 Peking University, Beijing 100871, China

Abstract. With rapid urbanization, the transportation system, especially the bus system, plays an increasingly prominent role in city planning. However, current bus evaluation models usually focus on just one aspect, either spatial or temporal. Inspired by “Pulsation Analysis”, this study presents a new approach, called “Public Transit Pulsation Analysis”, for evaluating the Service Level of Bus as well as the Demand for Public Transport based on both the spatial and temporal aspects of bus trips. In this approach, the transport pulsation assessment model, which assesses the supply side, and the demand of community for bus model, which calculates the demand side, are combined to optimize the bus frequency settings. The proposed method is tested on a real case study in Tianjin, China, which indicates its usefulness in evaluating the service level of the bus system and improves that service level by 17.6%. As well as developing a bus frequency optimization model, this study also demonstrates a real-world application of “Pulsation Analysis” for decision-making in city planning.

Keywords: Public transit pulsation analysis · Service level of bus · Demand for public transport

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 304–318, 2020. https://doi.org/10.1007/978-3-030-12388-8_22

1 Introduction

With rapid city growth, public transport plays a pivotal role in urban development and in residents’ living standards. Ground transportation, especially the public transportation system, can effectively relieve traffic pressure and support the healthy and sustainable development of the city. Moreover, due to the increasing use of various sensors and the continuous improvement of public information platforms in smart cities, large volumes of data of multiple types, time scales and spatial scales can be collected and managed, making it possible to analyze the spatio-temporal data of public transport in order to evaluate and optimize the public transport service level [1, 2]. Peng et al. carried out “Pulsation Analysis” of traffic problems such as traffic flow characteristics and bus service area [3–7]. In-depth research on the spatio-temporal data from the public data platform can uncover potential patterns, discover problems, and further support fine-scale city management [1].

According to field research, the existing bus system of the Sino-Singapore Tianjin Eco-city, especially its bus frequency, fails to meet the residents’ requirements. The frequency of public transportation, expressed through the interval between consecutive buses on a line, is a critical factor in the design of public transport systems and directly affects the service quality and economic benefits of public transport [8]. Therefore, this paper proposes an evaluation and analysis model for the quantitative analysis of the spatio-temporal aspects of a bus system, which aims to assist decision-making in bus service, especially to improve its frequency setting.

Scholars in various fields have studied the public transport system, but usually pay attention to just the space or the time dimension. On the one hand, existing evaluation models based on bus frequency often focus only on the time dimension. Tilahun et al. viewed the timeline of a single line as a multi-objective fuzzy optimization problem [9]. Ruiz et al. optimized the bus frequency by taking only the number of lines and shifts as the service level [10]. Besides, some scholars include changes in passenger flow over time as part of the evaluation criteria. For example, Yang used bus card records to improve the departure frequency [11], and Yu [12] and Niu [13] also considered user satisfaction when optimizing bus frequency. However, most of these studies rely on traffic-related information and require a bus capacity threshold to be set manually, which makes it difficult to assess frequency settings when such data are absent.
On the other hand, most public transportation evaluation models based on spatial indicators do not include a quantitative analysis of temporal indicators. The accessibility model proposed by Han et al., for example, focuses only on the bus station rather than on the bus lines or the community, does not fully consider the actual needs of the residents, and simply uses an area ratio to represent the accessibility of the bus system [3]. Although some scholars have analyzed the public transit system from a spatio-temporal perspective, they have not directly addressed the problem of departure frequency. For example, Peng used pulsation analysis to present traffic flow features and calculate the bus service area [3–7]. Pulsation analysis refers to the analysis of urban problems using data obtained from real-time monitoring [1, 4], with the results visualized [5, 6].

After considering the limitations of previous research and the feasibility of spatio-temporal pulsation analysis, we propose the concept of “Public Transit Pulsation Analysis”: analyzing the dynamic changes of traffic and their impact factors through transit data mining, in support of decision-making in bus system optimization. Based on this concept, this paper proposes a transport pulsation assessment model and a demand of community for bus model, and then quantifies whether the bus services can meet the needs of nearby residents, so as to explore the bus design that best balances service level and operating cost. The transport pulsation assessment model is based on a service level index with two subordinate indexes: the carrying capacity in the time dimension and the accessibility index in the spatio-temporal dimension. The proposed methodology is applied to a representative Chinese district, the Tianjin Eco-city, over one time period (June 2017). We also show that, as expected, bus frequency settings proposed by the optimization approach can improve the service level of bus in the study area.

2 Methodology

Public Transit Pulsation Analysis builds on the evaluation of service level indicators based on accessibility and carrying capacity. The supply is calculated with the transport pulsation assessment model and the demand with the demand of community for bus model. An overview of the models is given in Fig. 1.

Fig. 1. The overview of models

2.1 The Transport Pulsation Assessment Model

To reveal the pattern of bus service and to balance the cost of bus trips against the requirements of residents, the transport pulsation assessment model is proposed to quantify the service level of the whole district. The model is defined as a function that varies with time:

$$\mathrm{sum}(DS) = \sum DS = \sum \frac{T}{x_i} \times C \times V \times A \times D \tag{1}$$

where DS is the District Service Level; D is the Matrix of Available Bus for each community; T is the Time Unit; $x_i$ is the Departure Interval time; $T/x_i$ is the number of buses in a certain time unit, which varies with time; C is the number of passengers that a single bus can accommodate (see Eq. (3)); V is the velocity, which varies with time; and A is the Accessibility.

2.2 The Evaluation of Bus Service Levels Based on Spatio-Temporal Dimension

Unlike the public transport satisfaction measures proposed in previous research, the bus service level measures both the carrying capacity and the accessibility of a bus line. It reflects the convenience of different bus lines in terms of travel time and space:

$$S_{l_x} = A_{l_x} \times CC_{l_x} \tag{2}$$

where $S_{l_x}$ is the service level of bus line $l_x$, $A_{l_x}$ is the accessibility of line $l_x$, and $CC_{l_x}$ is the carrying capacity of $l_x$; $l_x$ denotes the x-th line, x = 1, 2, …, n. The bus service level is thus built on two levels: the bus lines themselves and the impact of the bus lines on the surrounding areas. The analysis of the bus lines starts with measuring their carrying capacity, and the influence of the bus lines is expressed by the accessibility. Both are quantified below.

2.3 Carrying Capacity Index Based on the Spatio-Temporal Aspect

For a single resident using one bus line, the carrying capacity of the line can be measured by the time the resident needs to reach the destination. If the passenger demand is unknown, it can instead be measured by the distance the bus travels per unit of time. For a group of residents, the corresponding carrying capacity is the total distance over which all passengers can be transported within a unit of time. This index is mainly influenced by the speed of the bus and the number of passengers transported per unit of time [14, 15]. Therefore, the bus line carrying capacity in this study is determined using the following formula:

$$CC_{l_x} = F \times V \times C \tag{3}$$

where $CC_{l_x}$ is the carrying capacity of the x-th line (x = 1, 2, 3, …), F is the number of bus trips per unit of time, V is the average speed of the bus on this line, and C is the total number of passengers that can be accommodated in a single bus.
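Equations (1) to (3) can be illustrated with a minimal sketch. It assumes an hourly time unit, so that F equals 60 divided by the departure interval in minutes, and it reads the matrix D in Eq. (1) as a 0/1 table marking which lines are available to which community; that reading, the function names, the speed of 20 km/h and the accessibility values are illustrative assumptions rather than values from the paper. The capacity of 75 passengers and the 12-minute interval are the case-study figures reported later in Sect. 3.

```python
def trips_per_hour(interval_min):
    """F in Eq. (3): departures per hour for a departure interval in minutes."""
    return 60.0 / interval_min


def carrying_capacity(interval_min, speed_kmh, seats=75):
    """CC = F * V * C, Eq. (3)."""
    return trips_per_hour(interval_min) * speed_kmh * seats


def service_level(accessibility, capacity):
    """S = A * CC, Eq. (2)."""
    return accessibility * capacity


def district_service_level(availability, line_levels):
    """One reading of Eq. (1): for every community, add up the service
    levels of the lines marked as available, then sum over communities."""
    return sum(
        sum(level for available, level in zip(row, line_levels) if available)
        for row in availability
    )


# Hypothetical inputs: 12-minute interval, 20 km/h, accessibility 0.5.
cc_line1 = carrying_capacity(interval_min=12, speed_kmh=20)   # 5 * 20 * 75
s_line1 = service_level(accessibility=0.5, capacity=cc_line1)

# Two communities, two lines; the second line's level (1000.0) is made up.
availability = [[1, 0], [1, 1]]
total = district_service_level(availability, [s_line1, 1000.0])
print(cc_line1, s_line1, total)
```

Halving the departure interval doubles F and therefore doubles both the carrying capacity and the service level of that line, which is the lever the frequency optimization acts on.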


Considering that the bus departure interval is not less than one minute in real life, F in this formula refers to the number of departures within one hour. The indicator obtained from the above formula mainly measures the transit capacity of the bus line per unit of time, that is, it is the service level evaluation index in the time dimension.

2.4 Accessibility Index Based on the Spatio-Temporal Aspect

In addition to the quantitative indicator based on the time dimension, namely the carrying capacity, the evaluation model of bus routes also needs to incorporate the spatial service level in different regions, that is, the spatial accessibility index. The accessibility of an entire line is calculated by summing the index values of its individual bus stations. In addition, bus routes have interchange functions, so the sum of the accessibility of the transfer lines, multiplied by the transfer rate, is added to simulate the actual travel behavior of residents. We define the line accessibility indicator as follows:

$$A_{l_x} = \sum_{i=1}^{n} A_{s_i} + C_t \times \sum_{k=1}^{n} A_{I_k} \quad (0 < C_t < 1) \tag{4}$$

where $A_{l_x}$ is the accessibility of the x-th line, $A_{s_i}$ is the accessibility of the i-th station, $A_{I_k}$ is the accessibility of the k-th interchange line, and $C_t$ is the constant transfer rate. In the accessibility index, the functional weight of the functional zones is based on the spatio-temporal dimension, while the distance weight is based on the spatial dimension. The functional weight refers to the weighted value of residents’ travel volume at different times in different functional zones; the distance weight refers to the weighted distance of every residential area to the different bus routes. Therefore, the accessibility of each station is calculated by multiplying the functional weight and the distance weight:

$$A_{s_i} = \sum_{j=1}^{n} DW_j \times FW_j \tag{5}$$

where $DW_j$ is the distance weight of district j, $FW_j$ is the functional weight of district j, and s denotes a bus station. In formula (5), the distance weight is inversely proportional to the Euclidean distance from the bus station to the district, and the functional weight is the normalized result of the trip timing characteristics of residents in different regions. For example, the demand for buses in industrial areas is large during the rush hour. Therefore, the functional weight is the total passenger flow of a district multiplied by the proportion of that district's passenger flow in a given time period:

$$FW_j = SF_j \times F(t)_j \tag{6}$$

where $SF_j$ is the total passenger flow in district j and $F(t)_j$ is a function of time t that represents the proportion of passenger flow in district j at time t. The accessibility of a given bus line can then be calculated with formulas (4)–(6).

2.5 The Demand of Community for Bus Model

In addition to evaluating the service level of the bus lines, that is, the supply, the optimization of the bus departure frequency is also based on the demand for buses in each community. Shi Fei’s model for forecasting residents’ travel volume uses a land use-based traffic generation model and adds the most basic demographic factors to predict residents’ traffic generation [16]. In addition, following the studies by Li [17] and Shi [18], the Floor Area Ratio (the volume ratio) is added to the prediction model. The Floor Area Ratio is an important indicator for controlling the intensity of urban land development. It is the ratio of the building floor area to the land area of an urban development site and reflects the intensity of urban land use: the higher the ratio, the greater the land development intensity [16]. At the same time, the portion of private car travel is subtracted to obtain the demand for buses in a given community. Combining these three points, the formula is as follows:

$$D_i = n \times F_i \times (P_i - C_o \times Q_i) \tag{7}$$

where $D_i$ is the demand of the i-th community, n is the average number of trips per person per day in the area, $F_i$ is the Floor Area Ratio of the community, $P_i$ is the total population of the i-th community, and $Q_i$ is the quantity of parking spaces in the i-th community. Considering the traffic restriction (“limit line”) policy, $C_o$ is the constant ratio of occupied parking spaces, which is introduced to better reflect the actual situation.

2.6 The Optimization Method of Supply and Demand Balance

In this study, bus design optimization is built on the sum of bus service levels in the study area as well as on the balance of service supply and demand among regions. Therefore, there are two main goals for optimizing the bus departure frequency: (1) the sum of the bus service levels of all communities should be as large as possible; (2) the distribution of bus service should be fair with respect to the distribution of service needed in every community. Goal 2 can be evaluated using the GINI index. The GINI index was originally used to measure differences in the distribution of resident income. Based on it, related research has established a similar method to measure distribution differences of bus services [19]. This method mainly measures the spatial balance between the allocation of public transport services and the demand in each region. Mathematically, it represents the difference between the actual distribution curve and the perfectly equal distribution curve. It can be computed with the following formula:

$$G = 1 - \sum_{k=1}^{n} (X_k - X_{k-1})(Y_k + Y_{k-1}) \tag{8}$$

where n is the number of regions, $X_k$ is the ratio of the sum of the demand of the first to k-th communities to the total demand, and $Y_k$ is the ratio of the sum of the corresponding bus service levels of the first to k-th communities to the total service level of all communities, with $X_0 = Y_0 = 0$. This service level optimization problem can be solved with non-linear integer programming methods.
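The demand model of Eq. (7) and the balance measure of Eq. (8) can be sketched together as follows. The paper does not state how communities are ordered before the cumulative shares are formed; the sketch assumes the usual Lorenz-curve convention of sorting by service per unit of demand, and it assumes strictly positive demand. Function names and all numeric inputs in the usage lines are hypothetical.

```python
def community_demand(trips_per_person, floor_area_ratio, population,
                     parking_spaces, occupied_ratio):
    """D_i = n * F_i * (P_i - Co * Q_i), Eq. (7)."""
    return trips_per_person * floor_area_ratio * (
        population - occupied_ratio * parking_spaces
    )


def gini_index(demand, service):
    """G = 1 - sum_k (X_k - X_{k-1}) * (Y_k + Y_{k-1}), Eq. (8),
    with X_0 = Y_0 = 0.

    Assumption: communities are ordered by service per unit of demand
    (ascending); demand values must be strictly positive.
    """
    pairs = sorted(zip(demand, service), key=lambda p: p[1] / p[0])
    total_demand = sum(d for d, _ in pairs)
    total_service = sum(s for _, s in pairs)

    g = 1.0
    x_prev = y_prev = 0.0
    cum_demand = cum_service = 0.0
    for d, s in pairs:
        cum_demand += d
        cum_service += s
        x = cum_demand / total_demand      # cumulative demand share X_k
        y = cum_service / total_service    # cumulative service share Y_k
        g -= (x - x_prev) * (y + y_prev)
        x_prev, y_prev = x, y
    return g


# Hypothetical community: 2.2 trips per person, FAR 1.5, 1000 residents,
# 300 parking spaces, occupied ratio 6/7.
demand_example = community_demand(2.2, 1.5, 1000, 300, 6 / 7)

# Service exactly proportional to demand gives a perfectly balanced allocation.
balanced = gini_index([1, 2, 3], [2, 4, 6])
# All service concentrated in one of two equally demanding communities.
skewed = gini_index([1, 1], [0, 1])
```

A value near 0 indicates that service is allocated in proportion to demand, and larger values indicate a growing mismatch between the two distributions.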

3 Case Study

3.1 Study Area

The Sino-Singapore Tianjin Eco-City (SSTEC), the pilot area selected for our study, is a strategic cooperation project between the governments of China and Singapore. It is about 40 km away from the central city of Tianjin and covers an area of about 30 km2. The objectives of the Sino-Singapore Tianjin Eco-City’s green transportation planning are as follows: bus trips should account for 70% of outbound travel and more than 25% of internal travel [20], and the service area of a bus station should have a radius of 500 m. Currently there are 4 bus routes within the SSTEC bus system, which is therefore a small bus system. The study area contains a total of 35 residential communities. After excluding the residential areas whose data were invalid because their names had changed, the remaining 28 residential areas were considered in this study. However, there are still some problems in the Sino-Singapore Tianjin Eco-City’s public transportation system: as the number of both residents and companies increases, the current travel plan cannot meet the growing need. Therefore, the departure frequency of the different lines still has room for improvement. In order to strike a balance between economy and convenience, this study combines the transport pulsation assessment model and the demand of community for bus model to optimize the bus frequency. The research data include the regulatory plan map, the four internal urban bus routes and station maps, the distribution of communities with their numbers of households and parking spaces, bus departure frequencies, daily passenger flow reports, and bus scheduling. All data are from June 2017.

3.2 The Transport Pulsation Assessment for the Service Level of Bus

This study calculates the carrying capacity and accessibility of the bus lines in discrete time periods, and then calculates the total sum of service levels along with the GINI index of bus service in the Sino-Singapore Tianjin Eco-City. The study period is divided according to the departure times of the first and last buses and the traffic flow of the Sino-Singapore Tianjin Eco-City: the bus operating hours are divided into six time periods, 6:00–8:00, 8:00–11:00, 11:00–14:00, 14:00–17:00, 17:00–19:00 and 19:00–21:00.

First, using the bus driving data and bus schedule provided by the management department of the Sino-Singapore Tianjin Eco-City, the average number of bus trips and the bus speed of each line in the different periods under the original scheme were calculated (shown in Fig. 2). In addition, the same bus model, with a capacity of 75 passengers, is used on all four lines. According to formula (3), the carrying capacity of each line under this scheme can be calculated. The analysis of the traffic flow shows that the main traffic peak hours in the region are 6:00–8:00 and 17:00–19:00 on weekdays, which is consistent with the residents’ travel pattern in small and medium-sized cities reported in previous studies [21–23].

[Fig. 2, four panels, vertical axes in km/h: (a) The bus frequency and velocity of line 1; (b) The bus frequency and velocity of line 2; (c) The bus frequency and velocity of line 3; (d) The bus frequency and velocity of line 4]

Fig. 2. Bus frequency and velocity in different time

Our study used the travel pattern data of residents in small and medium-sized cities collected from the studies of Zhang [21], Shen [22], and Li [23]. The passenger flow percentages of the morning, noon and evening peaks are 19, 14 and 12%, respectively, and are assigned to office areas such as the industrial district. Since the designed service area of a bus station in the SSTEC is 500 m, the distance weight of districts within 100 m is 5, and that of districts beyond 500 m is 0. In the calculation of interchange accessibility, this paper uses the average accessibility of the internal bus routes in the study area in place of the accessibility of the intercity bus lines. Following the research by Li, the transfer rate in this paper is 0.4 [24]. The accessibility index value can then be calculated according to Eqs. (4)–(6), and the service level is obtained from Eq. (2) based on the carrying capacity and accessibility. The variation of the service level over time is illustrated in Fig. 3.

[Fig. 3, four panels, vertical axes: normalized value: (a) The service level of line 1 in different time period; (b) The service level of line 2 in different time period; (c) The service level of line 3 in different time period; (d) The service level of line 4 in different time period]

Fig. 3. The service level of bus lines in different time

Since the basic residential unit of the city is a single community, the spatial unit of the bus line service level is also a community. For each community, the bus service level is the sum of the service levels of the bus lines near it. According to the study of Shalalfah et al. in Toronto, 60% of the passengers arriving at a bus station come from the area within 300 m [25]. Adding up the service levels of the bus lines that overlap with the 300 m buffer zone of each community gives the regional service level of the community (as shown in Fig. 4).

Fig. 4. The service level of different communities
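The accessibility calculation of Eqs. (4) to (6), with the case-study settings quoted above, can be sketched as follows. The text only fixes the distance weight at 5 within 100 m and 0 beyond 500 m; the intermediate 100 m bands used here (4, 3, 2, 1) are an assumption made for illustration, as are the function names and all flow values. The transfer rate of 0.4 and the 19% morning-peak share are taken from the text.

```python
def distance_weight(distance_m):
    """Step weight DW_j: 5 within 100 m and 0 beyond 500 m (from the text).
    The intermediate bands are an assumption, not given in the paper."""
    for limit_m, weight in ((100, 5), (200, 4), (300, 3), (400, 2), (500, 1)):
        if distance_m <= limit_m:
            return weight
    return 0


def functional_weight(total_flow, share_at_t):
    """FW_j = SF_j * F(t)_j, Eq. (6)."""
    return total_flow * share_at_t


def station_accessibility(districts):
    """As_i = sum_j DW_j * FW_j, Eq. (5).

    districts: iterable of (distance_m, total_flow, share_at_t) tuples.
    """
    return sum(
        distance_weight(dist) * functional_weight(flow, share)
        for dist, flow, share in districts
    )


def line_accessibility(station_values, interchange_values, transfer_rate=0.4):
    """A_l = sum_i As_i + Ct * sum_k AI_k, Eq. (4)."""
    return sum(station_values) + transfer_rate * sum(interchange_values)


# Hypothetical station serving two districts during the morning peak (19%).
station = station_accessibility([(80, 1000, 0.19), (450, 500, 0.19)])

# Hypothetical line with two stations and one interchange line.
line = line_accessibility([station, 500.0], [800.0])
print(station, line)
```

Because the functional weight depends on the time-of-day share, the same station yields different accessibility values in different periods, which is what gives the index its temporal component.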

In general, the spatial distribution of bus lines fluctuates according to the demand of the community: the areas with higher levels of regional service lie at the intersections of bus lines and in the aggregation areas of residential communities, reflecting the mutual influence between bus lines and residential areas. The distribution of bus routes also affects the distribution of service levels. Figure 5 shows how the service level in the region changes over time. There are noticeable peaks in three time periods: early morning, noon and late evening. The level of bus service reaches its peak in the evening, which is due to the return trips to residential areas. The noon peak is relatively flat because residents working outside the Sino-Singapore Tianjin Eco-City seldom return home at that time.

[Fig. 5, vertical axis: normalized value]

Fig. 5. The normalized service level in different time

3.3 Demand for Public Transport of Communities

The community bus demand is calculated from the Floor Area Ratio and the residential population in the district planning data provided by the management department of the Sino-Singapore Tianjin Eco-City. The vehicle trip rate is 6/7. In addition, the number of daily trips per person, 2.2, is taken from the most recently released Tianjin statistical data, from 2006 [26]. Based on formula (7), the distribution of residents’ needs is illustrated in Fig. 6. A comparison with Fig. 4 shows that the spatial distributions of demand and of public transport service in this region are unbalanced; for example, demand is high in the northwestern communities while the level of public transport service there is relatively low.

3.4 Analysis of Bus Frequency Optimization Results

In this paper, the Monte Carlo algorithm, a common and feasible method for non-linear integer programming problems, is used. It is applied here to find the solution for which the regional bus service level is highest while the GINI index does not increase. The Monte Carlo algorithm produces an approximate solution by random sampling based on the probability of occurrence of a random event. Since changes in the bus departure interval are generally not too large, the probability model was taken as a normal distribution whose mathematical expectation is the original departure interval and whose standard deviation is 5. When solving, the constraint that the GINI index must not be greater than the original GINI index also has to be satisfied, so as to ensure that the service level difference between communities does not increase. In addition, considering the operating cost of the bus company and the difficulty of scheduling, the following constraints are imposed on the departure intervals {x1, x2, x3, …, xn} of each bus line in each period:

1. The sum of the departure intervals after optimization is not lower than the original sum, to ensure that the total cost of bus operations does not increase;
2. The departure interval in any period is not less than the minimum departure interval of the current lines, which limits the cost of bus service.

Fig. 6. The demand for bus of different communities

The original bus departure intervals are shown in Table 2 and the optimized ones in Table 3. The new scheme increases the departure frequency for most morning and evening peaks, in line with the needs of residents. The departure frequencies of the high-demand lines (i.e. Line 1 and Line 4) in peak periods have increased, while the frequency of line 3, which is in low demand, has been reduced in some periods. The new combination of departure intervals influences the final regional service level by affecting the carrying capacity. As shown in Fig. 7, the carrying capacity of lines 1, 2 and 4 has been increased, especially during the peak hours, while the carrying capacity of line 3 has been reduced most of the time. Under this scheme, the sum of the communities’ service levels increases by 17.58%. Table 1 shows that the GINI index under the original bus scheme is about 0.194, which means that the overall service level and demand are relatively balanced, although there is still room for improvement. After the bus departure frequency was optimized using this model, the GINI index was reduced by 2.34% compared with that of the original scheme.

[Fig. 7, four panels, vertical axes in percent: (a) The change of carrying capacity of line 1; (b) The change of carrying capacity of line 2; (c) The change of carrying capacity of line 3; (d) The change of carrying capacity of line 4]

Fig. 7. The change of carrying capacity

Table 1. The service level of original bus lines and optimized ones

                                      Original scheme   Optimized scheme
The sum of community service levels   2,137,075.23      2,512,704.88
GINI index                            0.1942            0.1896
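The relative changes quoted for the two schemes can be re-derived from the values in Table 1. The short check below uses only those tabulated numbers; the helper name is ours. With the rounded GINI values of the table, the reduction comes out at roughly 2.4%, close to the 2.34% quoted in the text; the small gap is plausibly due to rounding of the tabulated values, which is an assumption on our part.

```python
def relative_change(before, after):
    """Relative change from `before` to `after`, as a fraction of `before`."""
    return (after - before) / before


# Values taken from Table 1.
service_gain = relative_change(2_137_075.23, 2_512_704.88)
gini_change = relative_change(0.1942, 0.1896)

print(f"service level: {service_gain:+.2%}")
print(f"GINI index:    {gini_change:+.2%}")
```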

Table 2. The bus schedule of original bus lines (departure interval, min)

Time period    Original line 1   Original line 2   Original line 3   Original line 4
6:00–8:00      12                13.3              20                17
8:00–11:00     13.1              20                30                17
11:00–14:00    20                24.3              30                17
14:00–17:00    13.8              18.5              30                17
17:00–19:00    10.4              17.9              20                17
19:00–21:00    16                20                30                17

Table 3. The bus schedule of new bus lines (departure interval, min)

Time period    New line 1   New line 2   New line 3   New line 4
6:00–8:00      10           14           25           12
8:00–11:00     10           24           35           22
11:00–14:00    14           32           35           14
14:00–17:00    10           26           37           13
17:00–19:00    11           10           25           11
19:00–21:00    18           13           28           12

The optimized service level of each community is illustrated in Fig. 8. A comparison with the original service level map shows that the service level of each community has been improved to some extent. In terms of the overall spatial distribution, however, the relative service levels of the communities have not changed much. This is because the bus routes available to each community are fixed, so changing the bus departure frequency can only slightly reduce the service level gap.

Fig. 8. The optimized service level map
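The Monte Carlo search described in Sect. 3.4 can be outlined in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: candidate intervals are drawn from a normal distribution centred on the original interval with a standard deviation of 5 and rounded to whole minutes (the rounding is our assumption), and a candidate is kept only if it respects the minimum-interval, total-interval and GINI constraints. The `score` and `gini` callables, the line weights and the sample count are hypothetical placeholders for the service level and balance models.

```python
import random


def monte_carlo_search(original, score, gini, n_samples=20000, sd=5.0, seed=0):
    """Random search over departure intervals (minutes).

    original: current departure intervals, one per line (or line and period).
    score:    callable returning the total service level of a candidate.
    gini:     callable returning the GINI index of a candidate.
    """
    rng = random.Random(seed)
    min_interval = min(original)   # constraint 2: no interval below current minimum
    interval_sum = sum(original)   # constraint 1: total interval must not shrink
    gini_limit = gini(original)    # the GINI index must not increase

    best, best_score = list(original), score(original)
    for _ in range(n_samples):
        candidate = [round(rng.gauss(mu, sd)) for mu in original]
        if min(candidate) < min_interval or sum(candidate) < interval_sum:
            continue
        if gini(candidate) > gini_limit:
            continue
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


# Toy usage: intervals of the 6:00-8:00 period in Table 2, with hypothetical
# line weights standing in for accessibility * speed * capacity.
weights = [3.0, 1.0, 0.5, 2.0]
original = [12, 13.3, 20, 17]


def toy_score(intervals):
    return sum(w * 60.0 / x for w, x in zip(weights, intervals))


def toy_gini(intervals):
    return 0.0  # placeholder: treats every candidate as equally balanced


best, best_score = monte_carlo_search(original, toy_score, toy_gini)
```

Because the total interval may not shrink, any gain in the toy score comes from shifting departures from low-weight lines to high-weight ones, which mirrors the pattern visible between Table 2 and Table 3.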


The transport pulsation assessment model proposed in this paper extends the depth of urban pulsation analysis and applies it to the optimization of a particular travel plan in urban public transport. This reflects the powerful role of urban pulsation analysis in decision support, especially when combined with big data.

4 Conclusion

Based on multiple urban spatio-temporal datasets, this paper proposes a transport pulsation assessment model for assessing the service level of urban public transit and then integrates it with a model of community demand for buses to optimize bus travel plans. Taking the Sino-Singapore Tianjin Eco-City as the study area, the model was used to evaluate the existing bus system and to analyze the pulsating spatio-temporal factors in the current bus frequency. Next, the Monte Carlo algorithm was implemented to explore the balance between operating costs and residents’ needs. By adjusting the frequency setting, the level of bus services in the entire study area was improved by 17.58%. In addition, the service level index proposed in this paper for each bus line can also reflect the distribution of accessibility in different regions. It makes it possible, in practical applications, to redesign bus routes to further improve the GINI index.

Public transit pulsation analysis can not only be applied to bus frequency settings but also opens many future research directions. For example, the correlation between other kinds of urban big data and traffic data can be exploited to optimize bus routes and frequency settings more accurately.

Acknowledgements. This paper is supported by the Innovation Practice Training Program for College Students of the Chinese Academy of Sciences (No. 201707000121). We would like to thank all members of this project, and we also thank the administration section of the Sino-Singapore Tianjin Eco-City for providing us with the necessary data.

References

1. Peng, L., Chi, T., Yao, X.: The Theory and Practice of Fluctuating Analysis in Smart Cities, pp. 123–135. Science Press, San Francisco (2017)
2. Chi, T., Peng, L., Yang, L.: Smart City Spatial Information Public Platform, pp. 123–135. Science Press, Beijing (2015)
3. Han, B., Nei, W., Wang, W., Wu, A.: Bi-level programming model for optimizing bus frequencies and its algorithm. J. Shenzhen Univ. Sci. Eng. 30, 98–102 (2013)
4. Peng, L., Li, X., Xu, Y., Chen, W., Hu, Y., Li, G., You, C.: Urban pulsation research based on spatio-temporal big data. Geomatics World 23, 5–12 (2016)
5. Peng, L., Chen, W., Li, G., Chi, T.: Research on the pulsating visualization of smart cities based on spatial big data. Geomatics World 23, 58–63 (2016)
6. Peng, L., Wu, T., Li, G., Li, X., Hu, Y., You, C.: Research on urban diseases based on the fluctuation law of multi-source spatial temporal data of SMA. Geomatics World 24, 29–35 (2017)
7. Xu, Y.: Research on technologies of spatio-temporal data pulsation analysis based on GIS platform. Master thesis, University of Chinese Academy of Sciences (Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences) (2017)
8. Guo, J., Sun, M., Liu, X., An, J.: Thoughts and suggestions on prioritizing urban public transportation development in China. Urban Transp. China 11, 7–12 (2013)
9. Tilahun, S.L., Ong, H.C.: Bus timetabling as a fuzzy multiobjective optimization problem using preference-based genetic algorithm. Promet - Traffic - Traffico 24, 183–191 (2012)
10. Ruiz, M., Segui-Pons, J.M., Mateu-Lladó, J.: Improving bus service levels and social equity through bus frequency modelling. J. Transp. Geogr. 58, 220–233 (2017)
11. Yang, Z., Zhao, S., Zhao, Q.: Research on bus scheduling based on artificial immune algorithm. In: International Conference on Wireless Communications, Networking and Mobile Computing, pp. 1–4. IEEE, October 2008
12. Yu, B., Yang, Z., Cheng, C., Zuo, Z.: Bi-level programming model for optimizing bus frequencies and its algorithm. J. Jilin Univ. (Eng. Technol. Ed.) 36, 664–668 (2006)
13. Niu, X.Q., Chen, Q., Wang, W.: Optimal model of urban bus frequency determination. J. Traffic Transp. Eng. 3(4), 68–72 (2013)
14. Martínez, H., Mauttone, A., Urquhart, M.E.: Frequency optimization in public transportation systems: Formulation and metaheuristic approach. Eur. J. Oper. Res. 326, 27–36 (2014)
15. Currie, G.: Quantifying spatial gaps in public transport supply based on social needs. J. Transp. Geogr. 18, 31–41 (2008)
16. Shi, F., Wang, W., Lu, J.: Improvement and conclusion about resident trip generation. Urban Transp. China 3, 43–46 (2005)
17. Li, X., Shao, C., Jia, H.: Land use and resident trip generation model and its parameter calibration. J. Jilin Univ. (Eng. Technol. Ed.) 37, 1300–1303 (2011)
18. Shi, F., Jiang, W., Wang, W., Lu, J.: Research on forecast method for traffic creating based on characteristic of land utilizing. China Civil Eng. J. 38, 115–118 (2005)
19. Delbosc, A., Currie, G.: Using Lorenz curves to assess public transport equity. J. Transp. Geogr. 19, 1252–1259 (2011)
20. Yin, G., Li, Q.: Green transportation system planning and implementation: a case study from Sino-Singapore Eco-City, T. Urban Transp. China 7, 58–65 (2009)
21. Zhang, T.: Analysis on characteristics of the resident trip and study on policy of the transport development in medium or small Urban. Sci. Technol. Inf. Water Transp. 3, 58–65 (2005)
22. Shen, J., He, B., Sun, J.: Analysis on characteristics of the resident trip and study on policy of the transportation development in small or medium city. Highw. Eng. 36, 123–126 (2011)
23. Li, Y., Duan, H., Chen, J., Meng, Q.: Characteristic of resident trip and countermeasure of transport development in medium or small city. Transp. Stand. 214, 28–31 (2010)
24. Li, Q.: Urban public transport network planning study. Master thesis, Hunan University (2002)
25. Shalalfah, B.W., Shalaby, A.S.: Case study: relationship of walk access distance to transit with service, travel, and personal characteristics. J. Urban Plan. Dev. 133, 114–118 (2007)
26. Ma, K., Cao, B., Fan, X.: Analysis on the resident trip characteristics and study on the transport polices in Tianjing. Transp. Stand. 4, 112–113 (2007)

The Artificial Intelligence Application in the Management of Contemporary Organization: Theoretical Assumptions, Current Practices and Research Review

Dorota Jelonek, Agata Mesjasz-Lech, Cezary Stępniak, Tomasz Turek, and Leszek Ziora

Faculty of Management, Czestochowa University of Technology, Czestochowa, Poland
{dorota.jelonek,agata.mesjasz,cezary.stepniak,tomasz.turek,leszek.ziora}@wz.pcz.pl

Abstract. Artificial intelligence solutions, together with data science and business analytics solutions such as Business Intelligence systems, big data and data mining, play a crucial role in the management of many contemporary business organizations. Their benefits include improvement of the whole management process of a business organization, especially decision making, and the automation of tasks in many areas. The aim of the paper is to present the role of artificial intelligence solutions in the management of contemporary organizations, their theoretical assumptions, development and current practices. The paper also presents the authors’ research carried out among a group of 12 respondents. The aim of the study was to find out how the benefits and drawbacks of artificial intelligence solutions are perceived by the respondents. The review of foreign research includes an analysis of practices in areas such as production management, logistics, retail trade and the financial sector.

Keywords: Artificial intelligence · Neural networks · Machine learning · Big data · Business Intelligence · Data mining

1 Introduction

AI solutions currently find application in many areas, such as enterprise management, cybersecurity analysis, market analytics, supply chain management in logistics, and production management. Combined with broadly understood business analytics and with solutions such as business intelligence systems, big data, and data mining methods, techniques and tools, they form a holistic approach that can facilitate and automate the management of a contemporary enterprise. Artificial intelligence solutions have been developed since the 1950s and, with the latest developments in machine learning and reinforcement learning, are becoming an indispensable tool in supporting the management of contemporary organizations, especially in improving the decision making process.

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 319–327, 2020. https://doi.org/10.1007/978-3-030-12388-8_23


2 The Origin, Theoretical Assumptions and Development of Artificial Intelligence Solutions

A prominent early researcher in the field of AI was Alan Turing, who developed a theory of computation and, in 1950, dealt with issues concerning the future of machine intelligence. The term artificial intelligence was coined by John McCarthy, one of the pioneers in this field, who organized a conference at Dartmouth in 1956 outlining the future prospects of its development. The Dartmouth AI conference and summer research project gathered pioneers of this newly formed field of research, including John McCarthy, Allen Newell, Herbert Simon, Marvin Minsky, Nathaniel Rochester and Claude Shannon. Marvin Minsky, a key researcher in the field, in his 1961 paper “Steps toward Artificial Intelligence” “outlined basic features of AI programs that still form the basis of artificial problem solving today” [1, 28]. Multiple definitions of AI can be found in the literature. The definition adopted for the purpose of this paper is the one provided by Dubitzky and Azuaje, according to which “artificial intelligence embraces concepts, methodologies, and techniques that form part of a computer system or program that exhibits characteristics akin to intelligent behavior” [2]. It is worth mentioning that early AI solutions emphasized a top-down approach, which tried to simulate or mimic higher-level concepts of the brain such as planning, reasoning and language understanding. Later, in the 1960s, the bottom-up approach began to dominate; it allowed for modelling lower-level concepts such as neurons and subsequently led to the development of Hebbian learning, perceptrons and advanced network architectures [3].
Artificial intelligence solutions may embrace the whole set of data science and data mining methods, techniques and tools, as well as classical and advanced statistical methods, such as classification with decision trees and rule-based methods, k-nearest neighbor, linear discrimination, regression analysis for forecasting specific variables, and cluster analysis. Such solutions also embrace machine learning, e.g. Bayes trees; neural networks, including supervised networks using perceptrons and multi-layer perceptrons and unsupervised learning such as Kohonen networks and Learning Vector Quantizers; and a variety of algorithms such as Naïve Bayes, genetic algorithms used for optimization, where successively better solutions to a problem are found, and rough sets, which are suited to reasoning and discovering relationships in data [4]. Other AI components worth mentioning include expert systems, intelligent agents, fuzzy logic, natural language processing, bots and heuristics [5]. Neural networks and neurocybernetic models can also be used as a tool for knowledge synthesis [6]. AI solutions may also comprise cognitive computing methods, virtual assistants available in all types of operating systems, and augmented reality that supports and facilitates the performance of specific tasks. As far as neural networks are concerned, it is worth mentioning that such solutions refer to the functionality of the human brain. A large scientific project in this area, the Human Brain Project co-funded by the European Union, is currently under way; in the field of brain emulation it may be a prominent step towards the development of artificial intelligence, especially with the development and application of neuromorphic chips.


A significant component of AI is machine learning, which comprises many different methods, such as supervised and unsupervised learning. Example algorithms include decision trees and nearest neighbor learning. The most advanced solutions combine different methods and tools, and such hybrid solutions seem to be the most efficient ones; e.g. Google DeepMind’s AlphaGo is based on an algorithm combining neural networks, machine learning and Monte Carlo Tree Search. Hybrid systems applying more than one solution can effectively support corporate management processes [7]. An example framework presenting the key components of AI applied in the management of a contemporary business organization is presented in Fig. 1.

[Fig. 1. Key AI components supporting management of an organization. Diagram labels: Neural networks, Machine learning, Big Data, Data mining, Business Intelligence. AI benefits: management support; improvement and perfectness of decision processes; improvement of organization functionality; automation of tasks. Source: Authors’ study]

The determinants of AI development include, e.g., models of cognitive processes, systems engineering, biological models of organisms and their behavior, neurophysiology and the brain model, computational models, mathematical formalization of models and theories, and semantic models of language [8].


3 The Role of AI in the Management of Contemporary Business Organization

Artificial intelligence solutions are beginning to play a significant role in the management of contemporary organizations. They can be applied in the prediction of enterprises’ key performance indicators (KPIs). Branscombe notes that machine learning and predictive analytics “can have a positive impact on project management outcome” [9]. Machine learning, and especially reinforcement learning, are AI areas that are developing at a rapid pace. Andrew Ng presents applications of machine learning, especially supervised learning, in areas such as photo tagging, loan approvals, targeted online ads, speech recognition, language translation, preventive maintenance and self-driving cars [10]. Machine learning, neural networks and statistics can be used to predict corporate bankruptcy. In the banking domain, expert systems, neural networks, genetic algorithms, fuzzy logic, pattern recognition and the “study of decision processes are used for trading purpose in order to improve performance while reducing the risks of operating in the financial markets” [11]. The classification procedure applied for decision and forecast support “on the basis of available information allow for repetitive judgment in new situations” [12]. Classification is applied, for example, in the automated sorting of letters on the basis of machine-read postcodes and in deciding on credit assignment on the basis of financial statements and other additional information [12]. The role of artificial intelligence solutions in the management of contemporary organizations can be perceived holistically: hybrid solutions are the most effective, such as a combination of statistical methods with neural networks, machine learning, big data solutions, Business Intelligence and Management Information Systems, sentiment analysis and so on.
Big data solutions support decision making at the strategic, tactical and operational levels of management, as well as at every stage of the decision making process, from problem definition to the final decision [13]. Sentiment analysis using machine learning and a lexical-based approach is applied in the analysis of opinions, attitudes and emotions related to a particular product or service [14]. AI and data science solutions are applied in the financial and telecommunication industries [15]. Another area of application of artificial intelligence in the management of a modern enterprise is supply chain management. Dolgui et al. state that, for lot sizing and scheduling under uncertainty (e.g. uncertainty in supply quality, purchasing price and lead time), genetic algorithms and lot sizing support optimal production planning decisions: they answer the questions of which products should be processed and in which order, and which machine should be used, and they determine the number of units of each product that should be launched at the input of the production system to satisfy customer demand at the output [16]. Effective application of artificial intelligence in the enterprise requires managers to perceive its advantages. It is necessary to realize that artificial intelligence is a key success factor and that it is essential to transform existing business plans, define key decisions and make appropriate investments in this context [17].
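The lexical-based approach to sentiment analysis mentioned above [14] can be illustrated with a very small sketch in Python. It is not taken from the cited work: the lexicon, the negation rule and the function names are assumptions made purely for illustration, and practical systems rely on much larger lexicons and richer linguistic rules.

```python
import re

# Hypothetical toy lexicon: word -> polarity score. Real lexicons are far larger.
LEXICON = {
    "good": 1, "great": 2, "reliable": 1, "fast": 1,
    "bad": -1, "poor": -1, "slow": -1, "terrible": -2,
}
NEGATIONS = {"not", "never", "no"}


def sentiment_score(text):
    """Sum the polarity of known words, flipping the sign right after a negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True
            continue
        polarity = LEXICON.get(token, 0)
        if polarity:
            score += -polarity if negate else polarity
        negate = False  # a negation only affects the word immediately after it
    return score


def label(text):
    """Map the numeric score to a coarse opinion label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for opinion in ("Great product, reliable and fast",
                "The delivery was not fast and the support was terrible"):
    print(opinion, "->", label(opinion))
```

A rule this simple misses constructions such as "not very good", which is one reason why lexical methods are often combined with machine learning, as the cited approach suggests.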


4 Review of Research and Current Practices

The application of artificial intelligence in an enterprise not only supports its functioning and development but is also associated with certain threats. The opportunities and threats of implementing artificial intelligence in enterprise management are noticed not only by managers but also by the surveyed students of economic faculties. The research involved a group of 12 logistics students of Czestochowa University of Technology, seven in first-degree studies and five in second-degree studies. The first degree comprised education in the engineering discipline, the second in the non-engineering discipline. A survey was carried out in this group to identify the perceived benefits and drawbacks of implementing artificial intelligence solutions in the management of business organizations. The relationship examined was that between the level of studies and the perception of the advantages and disadvantages of artificial intelligence in the context of the enterprise or in the context of the information system. The students’ responses related either to the general functioning of the enterprise (supporting decision-making; improving the management of resources such as stocks and energy; enhancing the effects of work; lowering prices; reducing machine errors, jobs and professions, etc.) or to the functioning of the information system and data security (easier access to information, loss of information, cybercrime and so on). The respondents also indicated the following advantages of applying AI in the management of an organization: conducting simulations, e.g. predictions and experiments; creation and deployment of intuitive interfaces; interaction with virtual assistants; creative thinking support; saving time and reducing the costs of service and product development; shorter business analyses; optimization and acceleration of managerial decisions; improvement of business processes; and support in creating the company’s strategy. The drawbacks indicated by the respondents also included the threat of job loss, the possibility of errors, the lack of the experience, creativity and empathy of real humans, and the possibility of an IT security breach.

Due to the small number of respondents, Fisher’s exact test was used for the dependency analysis; it is useful for contingency tables with a small number of cells and where the sample size does not exceed 20 units [18]. The results of Fisher’s exact test are presented in Table 1.

Table 1. Fisher’s exact test for research sample

Correlation between: the level of study and the perception of the advantages of artificial intelligence in the context of enterprise functioning or in the context of the information system
  Probability of obtaining the observed numbers in the table cells: 0.5833
  p-value (significance level): 0.05

Correlation between: the level of study and the perception of the disadvantages of artificial intelligence in the context of enterprise functioning or in the context of the information system
  Probability of obtaining the observed numbers in the table cells: 0.0265
  p-value (significance level): 0.05

Source: Authors’ study
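As a sketch of how the probabilities in Table 1 can arise, the Python snippet below computes the hypergeometric probability of a 2×2 table and a two-sided Fisher p-value using only the standard library. The cell counts are an assumption: the paper does not list them, and they were chosen only because they agree with the group sizes (seven first-degree and five second-degree students) and reproduce the reported values 0.5833 and 0.0265. The function names are illustrative.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of the 2x2 table [[a, b], [c, d]] with fixed margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_two_sided(a, b, c, d):
    """Sum the probabilities of all tables with the same margins
    that are no more likely than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)
    total = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        if p <= p_observed * (1 + 1e-9):  # small tolerance for float comparison
            total += p
    return total


# ASSUMED counts (not given in the paper). Rows: first degree, second degree.
# Columns: enterprise functioning, information system.
advantages = (6, 1, 5, 0)
disadvantages = (5, 2, 0, 5)

for name, cells in (("advantages", advantages), ("disadvantages", disadvantages)):
    print(name,
          f"table probability = {table_probability(*cells):.4f},",
          f"two-sided p = {fisher_two_sided(*cells):.4f}")
```

Under these assumed counts, the two-sided p-value for the disadvantages table is 22/792, about 0.028, which is below 0.05, while for the advantages table it equals 1.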


On the basis of the obtained results, the following conclusions can be formulated:
1. There is no reason to reject the hypothesis of no correlation between the level of studies and the perception of the advantages of artificial intelligence in the context of the functioning of the enterprise or in the context of the information system.
2. The hypothesis of no correlation between the level of study and the perception of the disadvantages of artificial intelligence in the context of the functioning of the enterprise or in the context of the information system should be rejected.

Therefore, it can be argued that, while students agree that the advantages of artificial intelligence primarily concern the overall functioning and development of an enterprise, their opinions are divided when it comes to the perception of disadvantages. For second-degree students, the disadvantages of artificial intelligence relate mainly to the information system and data security, while first-degree students relate artificial intelligence threats mainly to the functioning of the enterprise.

Kolbjørnsrud, Amico and Thomas address the question of how AI will redefine management in their Accenture survey, carried out in 14 countries among 1770 frontline, mid-level and executive-level managers and 37 executives responsible for digital transformation at their organizations. The authors of the survey identified five practices related to AI and underline, as a crucial advantage of artificial intelligence, that it “will soon be able to do the administrative tasks that consume much of managers’ time faster, better, and at a lower cost” [19]. Leaving administration to AI will allow for the automation of multiple tasks, such as writing reports, which is already possible for some analytical management reports.
The authors provide the example of the Narrative for Tableau tool, which allows for the automated creation of written explanations of Tableau graphics [19]. The second practice indicated by the authors of the cited report is a focus on judgment work, where machine learning solutions can support managers in making decisions, especially in judgment-oriented skills requiring experimentation, creative thinking, data analysis and interpretation, as well as strategy development. The next practice is treating intelligent machines as “colleagues”, where intelligent machines can assist in decision support and data-driven simulations (78% of managers would trust AI solutions to help them make business decisions in the future). Another practice, “work like a designer”, reflects the fact that a third of the surveyed managers indicated creative thinking and experimentation as a significant factor in achieving success in AI-driven administrative work. The last practice mentioned is developing social skills and networks [19]. Artificial intelligence solutions undoubtedly allow for:
1. Reimagining business processes - improving the functioning of the organization by obtaining results impossible to achieve through the work of a human or a machine alone.
2. Redesigning the human-machine relationship - improving the productivity of employees by automating their most routine and low-budget tasks and through the greater involvement of managerial staff in the implementation of the company’s strategy.


3. Unlocking the value of data - acquiring and gathering data in such a way as to solve previously unsolvable problems, unlock hidden value and increase the value of implemented processes [20].

Artificial intelligence finds application in creating value for clients. It supports the processes occurring in the enterprise by transforming customer data into information useful for business management. Examples include:
– supporting retail food sales by transforming data on grocery shopping into information on customer nutrition methods,
– energy management by transforming energy consumption data into real-time information on energy consumption,
– managing the financial sector by transforming transaction data into information on consumer finance [21].

In the context of customers, artificial intelligence is used primarily in sales, where it allows in particular for:
– an accurate selection of actions in relation to a specific target group of clients,
– sales forecasts,
– informing about the possibility of customers leaving to competitors,
– conducting effective up-selling and cross-selling campaigns,
– customer segmentation,
– optimization of sales activities,
– tracking customer behavior over time,
– analysis of customer behavior,
– analysis of sales trends [22].

Another example of using artificial intelligence is supporting the customer service process in the library [23]. Artificial intelligence is also widely used in reputation risk management, especially for predictive analytics that help identify ethical problems in the organization which, if unrecognized, could threaten its proper functioning. Special software would scan emails to capture keywords that indicate unethical behavior. Behavior violating ethics would be identified by means of an archive in which patterns of abnormal behavior are stored [24]. Moreover, AI solutions are applied in production management and intelligent manufacturing systems [25, 26]. Oztemel emphasizes that automation is an indicator of change in manufacturing and that its advantage lies not only in decreasing the cost of the product but also in ensuring that products comply more closely with customers’ needs and specifications. The author further claims that in most contemporary manufacturing facilities machines can make decisions and show intelligent behavior in all functions of manufacturing systems, such as design, process planning, production planning, quality assurance, storage and shipment. Intelligent manufacturing systems can diagnose machines and perform maintenance actions, arrange materials automatically, monitor the performance of production processes and reduce human involvement in the manufacturing process to a minimum [27].


Artificial intelligence is also used in production management, especially in the field of simulation in a dynamic and stochastic production environment, where finding the optimal solution is too time-consuming and unrealistic. Finding the optimal solution may be difficult due to insufficient system flexibility in the context of taking into account frequent changes in consumer needs [25].

5 Summary

Artificial intelligence solutions bring many advantages to the management of an organization, such as the automation of different procedures and the improvement of decision making processes. It should be remembered that AI solutions are not perfect and are still being developed and improved. Further development of technological solutions, e.g. cloud computing and the increase of computational power, will contribute to the improvement of existing AI solutions and the creation of new ones, and will allow the business value of organizations to increase. Some limitations also exist, which might be connected with current technological restrictions, but with the advent of new algorithms and such promising information and communication technologies as quantum computing, the future should bring many innovative solutions in the support of management and in many other areas of science and industry.

References

1. Henderson, H.: Artificial Intelligence: Mirrors for the Mind, p. 62. Chelsea House Publishers, New York (2007)
2. Dubitzky, W., Azuaje, F.: Artificial Intelligence Methods and Tools for Systems Biology. Springer, Heidelberg (2004)
3. Jones, M.T.: Artificial Intelligence: A Systems Approach, p. 7. Infinity Science Press LLC, Hingham, New Delhi (2008)
4. Munakata, T.: Fundamentals of the New Artificial Intelligence: Neural, Evolutionary, Fuzzy and More. Springer, London (2008)
5. Rutkowski, L.: Methods and techniques of Artificial Intelligence (in Polish). PWN, Warsaw (2018)
6. Tadeusiewicz, R. (ed.): Theoretical Neurocybernetics (in Polish). WUW, Warsaw (2009)
7. Nowicki, A., Stanek, S., Ziora, L.: The applicability of hybrid systems in support of the corporate management process: a review of selected practical examples. Pol. J. Manag. Stud. 8, 269–279 (2013)
8. Flasiński, M.: Introduction to Artificial Intelligence (in Polish), p. 251. PWN, Warsaw (2018)
9. Branscombe, M.: How AI could revolutionize project management. https://www.cio.com/article/3245773/project-management/how-ai-could-revolutionize-project-management.html. Accessed 12 Jan 2018
10. Ng, A.: What Artificial Intelligence can and can’t do right now. Harvard Business Review. https://hbr.org/2016/11/what-artificial-intelligence-can-and-cant-do-right-now
11. Ein-Dor, P. (ed.): Artificial Intelligence in Economics and Management, p. 78. Kluwer Academics Publishers, Dordrecht (1996)
12. Michie, D., Spiegelhalter, D.J., Taylor, C.C. (eds.): Machine Learning, Neural and Statistical Classification. Overseas Press, London (2009)
13. Jelonek, D., Stępniak, C., Ziora, L.: The meaning of big data in the support of managerial decisions in contemporary organizations: review of selected research. In: Proceedings of 2018 Future of Information and Communication Conference, Singapore, pp. 195–198. IEEE, New York (2018)
14. Ziora, L.: The sentiment analysis as a tool of business analytics in contemporary organizations. In: Economics Studies. Research papers of the University of Economics in Katowice, Katowice, no. 281, pp. 234–241 (2016)
15. Nowicki, A., Ziora, L.: The application of data mining models and methods in enterprises. Review of Selected foreign financial and telecommunication industry case studies. In: Economics Studies. Research papers of the University of Economics in Katowice, Katowice, no. 88, pp. 85–94 (2011)
16. Dolgui, A., Grimaud, F., Shchamialiova, K.: Supply chain management under uncertainties: lot-sizing and scheduling rules. In: Benyoucef, L., Grabot, B. (eds.) Artificial Intelligence Techniques for Networked Manufacturing Enterprises Management. Springer, London (2010)
17. Plastino, E., Purdy, M.: Game changing value from artificial intelligence: eight strategies. Strat. Leadersh. 46(1), 16–22 (2018)
18. Szajt, M.: Space in Economics Studies (in Polish). Faculty of Management, Czestochowa University of Technology Publishing House (2014)
19. Kolbjørnsrud, V., Amico, R., Thomas, R.J.: The promise of artificial intelligence. Redefining management in the workforce of the future. https://www.accenture.com/us-en/insightpromise-artificial-intelligence
20. Shukla, P., Wilson, H.J., Alter, A., Lavieri, D.: Machine reengineering: robots and people working smarter together. Strat. Leadersh. 45(6), 50–54 (2017)
21. Riikkinen, M., Saarijärvi, H., Sarlin, P., Lähteenmäki, I.: Using artificial intelligence to create value in insurance. Int. J. Bank Mark. 84 (2018)
22. Syam, N., Sharma, A.: Waiting for a sales renaissance in the fourth industrial revolution: machine learning and artificial intelligence in sales research and practice. Ind. Mark. Manag. 69, 135–146 (2018)
23. Massis, B.: Artificial intelligence arrives in the library. Inf. Learn. Sci. (2018). https://doi.org/10.1108/ILS-02-2018-0011
24. Hirsch, P.B.: Tie me to the mast: artificial intelligence & reputation risk management. J. Bus. Strat. 39(1), 61–64 (2018)
25. Kasie, F.M., Bright, G., Walker, A.: Decision support systems in manufacturing: a survey and future trends. J. Model. Manag. 12(3), 432–454 (2017)
26. Benyoucef, L., Grabot, B. (eds.): Artificial Intelligence Techniques for Networked Manufacturing Enterprises Management. Springer, London (2010)
27. Oztemel, E.: Intelligent manufacturing systems. In: Benyoucef, L., Grabot, B. (eds.) Artificial Intelligence Techniques for Networked Manufacturing Enterprises Management, pp. 1–3. Springer, London (2010)
28. Minsky, M.: Steps toward artificial intelligence. Proc. IRE 49(1), 8–30 (1961)

Identification of Remote IoT Users Using Sensor Data Analytics

Samera Batool, Nazar Abbas Saqib, Muazzam Khan Khattack, and Ali Hassan

National University of Sciences and Technology, Islamabad, Pakistan
[email protected], {nazarabbas,alihassan}@ceme.nust.edu.pk, [email protected]

Abstract. The rapid progress of sensor technology and the Internet of Things (IoT) has enabled a range of smart services delivered through smart applications. These services include remote sensing, monitoring, control and operations in fields such as health care, transportation and weather forecasting. Alongside these benefits, user and device security remains a major challenge. Recently, existing biometric identification methods have been combined with other remote user recognition techniques to improve performance. In this paper we introduce a novel user identification framework that uses sensor data from the walk activity. Accelerometer and heart rate sensors are used in combination, because the readings of the two sensors are biologically correlated during walking. Heart rate is a unique biometric parameter for user identification, whereas the accelerometer is known for its effectiveness in activity recognition. A fusion method is adopted to make the proposed identification technique more customized and to reduce the overlapping probabilities of existing classification methods. A real data set of 15 subjects is used for the experiments, and the results are presented to demonstrate the validity of the proposed approach. Identification accuracy is improved and overlapping is reduced to a certain level, despite the low accuracy of the heart rate sensors currently embedded in smart IoT devices.

Keywords: Internet of Things · Biometric identification · Activity recognition

1 Introduction

The growing use of Internet of Things (IoT) devices has made enormous amounts of personal data available for research. Sensors embedded in smart phones and wearables, such as accelerometer, proximity and heart rate sensors, enable the collection of user-specific data sets without any additional setup or computational cost. This facilitates the use of this valuable data for analytics, but it also calls for effective identification and secure access methods. Various identification mechanisms based on parameters such as passwords and user accounts are in use, but they are susceptible to security breaches caused by user negligence and lack of technical expertise [1].

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 328–337, 2020. https://doi.org/10.1007/978-3-030-12388-8_24


Since the introduction of the first handheld phone in 1979, the mobile industry has progressed remarkably and smart phones are now in widespread use. According to a survey by Ericsson, the number of IoT connected devices will reach 50 billion by the end of 2020 [2], and cell phone users are expected to reach 72% of the world population by the end of 2019 [3]. Cell phone based sensors are being used effectively in Human Activity Recognition (HAR) systems. Applications of wearable and cell phone embedded sensors are increasing rapidly in fields such as healthcare, transportation and industrial development [4]. Many researchers have also explored these sensors for user identification and authentication, with promising results [5]. In healthcare applications, the heart rate sensor is the most commonly used sensor for the diagnosis of patients [6]. Heart rate is considered a unique biological feature of the human body, and it changes with body movement as different activities are performed. Among the qualities that make it suitable for biometric identification, the heart rate signal is unique to each subject and remains stable over long periods [7, 8]. Heart rate sensors have also been incorporated in most recent smart phone models. Given the drawbacks of existing authentication techniques such as user ids and passwords noted above, biometric authentication mechanisms used alongside other authentication methods have attracted considerable attention. In this study, we explore the heart rate data of a subject during the walk activity in order to build more specific human identification models. Accelerometer data from different activities has already produced good results for activity recognition and user identification. We therefore combine the walk activity data of a person from the heart rate and accelerometer sensors to improve the recognition of remote users in IoT. The novel contributions of this research are the following:

• Use of heart rate sensor data from the walk activity for user identification and authentication.
• Collection of real data sets from 15 subjects in a real-life scenario using readily available devices equipped with IoT sensors.
• Experimental evaluation of the proposed approach on real IoT data sets for validation.
• Use of the least possible amount of data for model building, to minimize the data collection and storage overhead of IoT devices while retaining high accuracy in user recognition.

The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 presents the motivation behind this work. Section 4 presents the proposed approach, followed by Sect. 5, in which the experiments and results are presented. Section 6 presents the conclusion and future work.

2 Related Work

A brief study of the existing literature illustrates the usefulness of data obtained from sensors such as accelerometers and heart rate sensors for HAR and user authentication systems. In this section, we review work on user identification or recognition as well as work on HAR.


In our previous work [9] we used the accelerometer sensor for activity recognition and user identification. A data set from a single accelerometer sensor was collected from different individuals while they performed a basic set of activities, i.e., walking, running and standing, and it produced highly encouraging results. However, a certain level of overlapping was found among the classes of users and activities. To address this issue, we customize the proposed authentication models in this research. A recent work [10] uses cell phone sensor data and the usage patterns of social applications for continuous verification of users in IoT. The authors propose an add-on application for smart devices that verifies users based on their social biometric and sensor data, and they encourage the use of smart devices equipped with this intelligent add-on, which verifies users with a 10% lower false rejection rate. The combination of accelerometer and ECG data for activity recognition is proposed in [11], with a focus on activities that consume energy without any movement. The data set covers 13 subjects and six activities, i.e., sitting, standing, ascending, resting, walking and running. An accuracy of up to 96.35% is achieved, which is much higher than that of activity recognition based on accelerometer data alone. The authors also conclude that human physiological data can be applied effectively to activity recognition. ECG data has been used for user authentication as well, owing to its unique characteristics in each person. In [12] the authors use chest-worn ECG sensors to collect data on various activities from four subjects and apply their approach to authenticate and identify the users. Their experiments show a fairly well-controlled error rate of 6 to 13%.

Another approach [13] uses IoT sensors for user authentication. In this technique, authentication is based on the proximity sensor embedded in IoT devices. Every user is registered with a unique IoT device in the network, and when a new device of the same user asks for access or authentication, the system estimates the distance between the two devices using acoustic signals. The authors' proof of concept indicates that the approach is secure and personalized for authentication. The energy consumption of IoT devices is also of great importance for activity recognition in existing approaches. In [14] the authors give a detailed account of the classification methods applied to accelerometer sensor data for activity recognition on IoT devices: decision trees, dynamic time warping and support vector machines (SVM). Among the three, SVM is the dominant method, with results of about 90% for activity recognition and about 99% for motion recognition. The recent literature shows the growing attention of researchers to the use of sensor data for human activity recognition and motion prediction. So far, the accelerometer has been among the most widely used sensors for activity recognition and user authentication [15]. This work uses data sets from the readily available sensors embedded in smart phones to create profile models of individual users for authentication. Accelerometer data is used together with heart rate sensor data for this purpose.


3 Motivation

Personal data sets generated by IoT devices such as smart phones and wearable sensors are an economic asset. Medical science and the existing literature have also shown that ECG data is unique to each subject, and it has been used effectively in the literature for the biometric authentication of users [16]. Because ECG data is unique to every individual, it cannot be replicated or easily stolen [17, 18]. The heart rate sensor is one of the newly added sensors in recent smart phone models such as the Samsung S5 and later series. Heart rate is directly correlated with various physical activities of humans. This paper aims to carry out experiments on heart rate and accelerometer sensor data from the walk activity, which belongs to the basic set of activities, in order to build classification models of the subjects that can later be used as templates for verifying the users. The use of smart phones enables us to collect the required data quite easily.

4 Authentication of Remote IoT Users

In this section, we explain the details of the proposed framework for remote user authentication in IoT. Figure 1 presents the overall architecture and data flow of the proposed framework.

Fig. 1. Proposed framework of user identification


4.1 Data Processing

For the prototype implementation of the proposed framework, we acquired real IoT data from 15 subjects. The data was collected with cell phone based sensors on a Samsung S7 edge device, which was selected for the high-quality sensors of the S series, including an embedded heart rate sensor and accelerometer. We acquired walk activity data from each subject and recorded the readings of both sensors. The attributes of the participants are listed in Table 1.

Table 1. List of attributes of subjects who participated in data collection

Gender    No. of persons    Age range (years)    Weight (kg)
Female    14                17–33                45–68
Male      1                 16                   50

During data acquisition the phone was attached to the waist of the subject. The same area of the college building was used for all subjects. We used two Android apps to acquire the accelerometer and heart rate data. Heart rate was recorded with Samsung S Health [19], the built-in Samsung app, and accelerometer data was recorded with the open-access Android app Accelerometer Log [20], which saves the data to storage for later analysis. The setup for data acquisition is presented in Fig. 2.

Fig. 2. Data collection from user walk activity

4.2 Noise Removal

Heart rate data is recorded during the walk activity with the S Health app on the Samsung S7 edge device. About 20 samples are taken for each subject. This data is already in processed form, so no noise removal is needed. The accelerometer data, in contrast, is affected by the movement of other body parts and similar disturbances. To remove this noise, we apply a band-pass filter that removes the frequencies above and below the band of interest from the signal.
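
The band-pass step can be illustrated with a minimal sketch. It is not the authors' implementation: the source does not state the filter design, the cut-off frequencies or the sampling rate, so the cascade of first-order high-pass and low-pass stages and the values 0.5 Hz, 5 Hz and 50 Hz below are assumptions chosen only for the example.

```python
import math


def high_pass(samples, cutoff_hz, fs_hz):
    """First-order IIR high-pass: suppresses slow drift and constant offsets."""
    if not samples:
        return []
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    out = []
    prev_x = samples[0]
    prev_y = 0.0
    for x in samples:
        prev_y = alpha * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out


def low_pass(samples, cutoff_hz, fs_hz):
    """First-order IIR low-pass: suppresses fast jitter above the cut-off."""
    if not samples:
        return []
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = []
    prev_y = samples[0]
    for x in samples:
        prev_y = prev_y + alpha * (x - prev_y)
        out.append(prev_y)
    return out


def band_pass(samples, low_cut_hz=0.5, high_cut_hz=5.0, fs_hz=50.0):
    """Keep roughly the band between the two cut-offs (all defaults are assumed values)."""
    return low_pass(high_pass(samples, low_cut_hz, fs_hz), high_cut_hz, fs_hz)


# Synthetic example: one axis with a constant gravity offset plus a 2 Hz walking-like component.
fs = 50.0
axis = [9.81 + math.sin(2.0 * math.pi * 2.0 * n / fs) for n in range(200)]
filtered = band_pass(axis, fs_hz=fs)
```

In this sketch the high-pass stage removes the constant gravity component and the low-pass stage attenuates fast jitter; a production pipeline would more likely use a higher-order filter from a signal-processing library.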

4.3 Filtering

The required data is retained and unnecessary data is removed from the data set. To reduce the cost overhead, we used only the minimum number of data instances from both the heart rate and accelerometer sensors for model building and classification. The same corridor of the college building was used to collect the walk activity data, and every participant completed the same number of events.

4.4 Labeling

During data preprocessing, we labeled each attribute of the collected data for each subject. The following attributes are used for each subject:

• Subject id
• X axis
• Y axis
• Z axis
• Heart rate

These attributes represent the acceleration readings and the heart rate reading of each participant during the walk activity.

4.5 Feature Extraction

Labeled and preprocessed data instances are used for feature extraction. The features are selected according to the requirements of the classification model and of the classifier used to build a model that uniquely identifies a person. This feature set acts as the unique profile of a person and is saved in the database during the registration process. Later, when the user accesses a service, the saved profile is loaded into the system. As shown in the proposed approach in Fig. 1, the model is checked for either registration or authentication.
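
The labeling and profile-building steps can be sketched as follows. This is only an illustration under stated assumptions: the paper does not list its features, so the per-attribute mean and standard deviation are placeholders, and `LabeledSample`, `extract_profile`, `register` and the in-memory dictionary standing in for the database are hypothetical names introduced here.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Dict, List


@dataclass(frozen=True)
class LabeledSample:
    """One labeled reading with the five attributes listed in Sect. 4.4."""
    subject_id: str
    x: float
    y: float
    z: float
    heart_rate: float


def extract_profile(samples: List[LabeledSample]) -> Dict[str, float]:
    """Summarize one subject's samples as mean and standard deviation per attribute.

    Mean/std are placeholder features (an assumption, not taken from the paper).
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if len({s.subject_id for s in samples}) != 1:
        raise ValueError("all samples must belong to the same subject")
    profile: Dict[str, float] = {}
    for name in ("x", "y", "z", "heart_rate"):
        values = [getattr(s, name) for s in samples]
        profile[name + "_mean"] = mean(values)
        profile[name + "_std"] = pstdev(values)
    return profile


def register(store: Dict[str, Dict[str, float]], samples: List[LabeledSample]) -> str:
    """Registration step: save the profile under the subject id.

    `store` is an in-memory stand-in for the database mentioned in the text.
    """
    profile = extract_profile(samples)
    subject_id = samples[0].subject_id
    store[subject_id] = profile
    return subject_id
```

At authentication time, the stored profile would be looked up by subject id and compared with features computed from fresh readings; how that comparison is done depends on the classifier and is left out of the sketch.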

5 Experimental Evaluation

In this section we implement a prototype of the proposed system to validate the approach on real data sets. The experiments use the data set described in the previous section.

5.1 Experiment 1

The first experiment takes the accelerometer sensor data as input to develop a classification model for identifying the subjects. This first data set builds a general model for all the subjects for whom data is provided, and the model can then be tested with the input data of each subject separately. The following attributes are used for Experiment 1:


• X axis
• Y axis
• Z axis
• Subject id

We use the random forest algorithm to build the models from the data set, and the experimental setup performs 10-fold cross-validation. The saved model is loaded back into the system and checked against a person's data whenever identification is required. The output of the model building process is presented in Table 2.

Table 2. Summary of the performance of Experiment 1

Total no. of instances    Correctly classified instances    Incorrectly classified instances    Accuracy (%)
231                       207                               24                                  89.61

Table 2 shows that the user identification model reaches an accuracy of 89.61% on the accelerometer sensor data. Figure 3 presents the confusion matrix of the model evaluation, which highlights the overlapping between subjects. The highlighted areas of the matrix mark where overlapping is found, and the circled instances are those incorrectly classified by the algorithm.

Fig. 3. Confusion matrix of Experiment 1 using accelerometer sensor data set
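
The bookkeeping behind this evaluation (splitting the instances into 10 folds, then reading accuracy and overlap from a confusion matrix) can be sketched as follows. It is a simplified illustration, not the WEKA procedure used by the authors: the folds here are shuffled but not stratified, no classifier is trained, and the function names are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_instances: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle instance indices and deal them into k folds of near-equal size."""
    if not 2 <= k <= n_instances:
        raise ValueError("k must be between 2 and the number of instances")
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def confusion_matrix(actual: Sequence[str], predicted: Sequence[str]) -> Dict[Tuple[str, str], int]:
    """Count (actual, predicted) pairs; keys with actual != predicted are the overlaps."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts: Dict[Tuple[str, str], int] = {}
    for a, p in zip(actual, predicted):
        counts[(a, p)] = counts.get((a, p), 0) + 1
    return counts


def accuracy_percent(matrix: Dict[Tuple[str, str], int]) -> float:
    """Share of instances on the diagonal of the confusion matrix, in percent."""
    total = sum(matrix.values())
    if total == 0:
        raise ValueError("the confusion matrix is empty")
    correct = sum(count for (a, p), count in matrix.items() if a == p)
    return 100.0 * correct / total


# Each fold serves once as the test set while the other nine are used for training.
folds = k_fold_indices(231, k=10)
```

With the counts reported in Table 2 (207 of 231 instances on the diagonal), `accuracy_percent` yields 89.61 after rounding to two decimals.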

5.2 Experiment 2

In this experiment we build the user classification models by adding the heart rate data of the subjects to the accelerometer sensor data, in order to reduce the overlapping found in Experiment 1 and to improve the accuracy of the user identification models. The output is presented in Table 3. The results show that the accuracy improves to 93.93%. The confusion matrix shows that the overlapping is removed to a certain level.

Table 3. Summary of the performance of Experiment 2

Total no. of instances    Correctly classified instances    Incorrectly classified instances    Accuracy (%)
231                       217                               14                                  93.93

The confusion matrix of Experiment 2 in Fig. 4 also reflects the improved accuracy; the reduced overlapping is highlighted with crossed lines. These results support the concept of using heart rate sensor data together with other IoT based sensors for user identification.
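
The way the two sensor streams are combined into one instance per reading can be sketched as a simple feature-level fusion. The source does not describe how accelerometer and heart rate readings are aligned, so pairing them by position within each subject (and dropping unmatched readings) is an assumption, and `fuse_readings` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

AccelReading = Tuple[float, float, float]                 # (x, y, z)
FusedInstance = Tuple[float, float, float, float, str]    # (x, y, z, heart_rate, subject_id)


def fuse_readings(
    accel: Dict[str, List[AccelReading]],
    heart_rate: Dict[str, List[float]],
) -> List[FusedInstance]:
    """Append a heart rate value to each accelerometer reading of the same subject.

    Readings are paired by position (assumed alignment); zip() stops at the
    shorter list, so readings without a partner are dropped.
    """
    fused: List[FusedInstance] = []
    for subject_id in sorted(accel):
        rates = heart_rate.get(subject_id, [])
        for (x, y, z), bpm in zip(accel[subject_id], rates):
            fused.append((x, y, z, bpm, subject_id))
    return fused
```

Each fused tuple then carries the four input attributes of Experiment 2 plus the subject id as the class label.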

Fig. 4. Confusion matrix of Experiment 2 using customized data set of heart rate sensor with accelerometer data


6 Conclusion and Future Work

In this paper we have presented a novel framework for analyzing IoT data for user identification. The activity based authentication technique proposed here uses data from the cell phone based accelerometer and heart rate sensors for user recognition. We built user recognition models from this sensor data to authenticate users effectively. A prototype implementation of the proposed framework is described in detail through experiments on real data, with the WEKA classification tool used as the experiment tool. There is a visible improvement in the accuracy of the classification results. The overlapping observed in existing techniques based solely on accelerometer sensor data is reduced by adding the heart rate data, which supports the idea of building more customized recognition models by adding more sensor data.

References

1. Yosef Ashibani, F., Dylan Kauling, S., Qusay, H. Mahmoud, T.: A context-aware authentication service for smart homes. In: IEEE Annual Consumer Communications & Networking Conference (CCNC) (2017)
2. Bayat, A., Pomplun, M., Tran, D.A.: A study on human activity recognition using accelerometer data from smartphones. In: 11th International Conference on Mobile Systems and Pervasive Computing (MobiSPC-2014) (2014)
3. Ehatisham-Ul-Haq, M., et al.: Authentication of smartphone users based on activity recognition and mobile sensing. Sensors 17(9) (2017)
4. Chetty, G., White, M., Akther, F.: Smart phone based data mining for human activity recognition. In: ELSEVIER, International Conference on Information and Communication Technologies (ICICT), pp. 1181–1187 (2015)
5. Ugulino, W., et al.: Wearable computing: accelerometers’ data classification of body postures and movements. In: Proceedings of 21st Brazilian Symposium on Artificial Intelligence. Advances in Artificial Intelligence—SBIA, pp. 52–61 (2012)
6. Islam, M.S.: Heartbeat biometrics for remote authentication using sensor embedded computing devices. Int. J. Distrib. Sens. Netw. (2015)
7. Lena Biel, F., Ola Pettersson, S., Lennart Philipson, T.: ECG analysis: a new approach in human identification. IEEE Trans. Instrum. Meas. 50(3), 808–812 (2001)
8. Chan, A.D.C., Hamdy, M.M., Badre, A.: Wavelet distance measure for person identification using electrocardiograms. IEEE Trans. Instrum. Meas. 57(2), 248–253 (2008)
9. Batool, S., Saqib, N.A., Khan, M.A.: Internet of things data analytics for user authentication and activity recognition. In: Second International Conference on Fog and Mobile Edge Computing (FMEC) (2017)
10. Anjomshoa, F., et al.: Social behaviometrics for personalized devices in the internet of things era. J. IEEE Access (2017)
11. Lee, J., Kim, J.: Energy-efficient real-time human activity recognition on smart mobile devices. Hindawi J. Mob. Inf. Syst. (2016)
12. Šprager, S., Trobec, R., Jurič, M.B.: Feasibility of biometric authentication using wearable ECG body sensor based on higher-order statistics. In: 40th International Convention on Information and Communication Technology, Electronics and Microelectronics (2017)


13. Gong, N.Z., et al.: PIANO: proximity-based user authentication on voice-powered internet-of-things devices. In: IEEE 37th International Conference on Distributed Computing Systems (2017)
14. Bisio, I., et al.: Enabling IoT for in-home rehabilitation: accelerometer signals classification methods for activity and movement recognition. IEEE Internet Things J. 4(1) (2017)
15. Murmuria, R., et al.: Your data in your hands: privacy-preserving user behavior models for context computation. In: First IEEE International Workshop on Behavioral Implications of Contextual Analytics (PerCom Workshops) (2017)
16. Arteaga-Falconi, J.S., Al Osman, H., El Saddik, A.: ECG authentication for mobile devices. IEEE Trans. Instrum. Measur. 65(3) (2016)
17. Zahra Fatemian, S., Hatzinakos, D.: A new ECG feature extractor for biometric recognition. In: Proceedings of the 16th International Conference on Digital Signal Processing (DSP ’09), pp. 1–6 (2009)
18. Agrafioti, F., Hatzinakos, D.: Signal validation for cardiac biometrics. In: IEEE International Conference on Acoustics, Speech and Signal Processing (2010)
19. https://play.google.com/store/apps/details?id=com.sec.android.app.shealth&hl=en
20. https://play.google.com/store/apps/details?id=com.alfav…accelerometerlog&hl=en

From Smart Concept to User Experience Practice: A Synthetic Model of Reviewed and Organized Issues to Conceive Qualified Interactions

Cristina Caramelo Gomes

CITAD, Architecture and Arts Faculty, Lusíada University of Lisbon, Lisbon, Portugal
[email protected]

Abstract. Nowadays, we live in environments where smart products enable new processes of human and non-human interaction. Smart concepts push functions beyond traditional levels of expectation. It is widely accepted that smart concepts enhance products to make them respond to human needs and expectations. However, regardless of the level of complexity and intelligence of those products (and regardless of scale, from a home environment to an object), if the environment is not user-friendly, users will reject them and eventually abandon them. The conceptualization of intelligent solutions is more often centered on the requirements of functions and overlooks the profile of users, a critical aspect in outlining the experience. Present-day interaction between users and smart products should be enabled by their physical model, finishing, interface features and functional capabilities. The quality of experience is not only grounded on the type and efficiency of the task performed. It also depends on the physical experience (dimension, height, shape, finishing); the sensorial experience (how the artifact stimulates the human sensorial system); the cognitive experience (how easy it is to interact with the technology); and the emotional experience (is this a pleasant experience? Would we repeat it?). Moreover, outlining the users’ demands and expectations of environments, products and technology is also critical to reaching qualified and appealing solutions. The aim of this paper is to establish the need for a different approach when conceptualizing smart environments and products, arguing that users can only understand inherent intelligence if the parameters that foster an emotionally appealing experience are considered throughout the design process.

Keywords: Smart product · User experience · Emotional experience design

1 Prologue

Contemporary society is witnessing the spread of the term “smart” across the products and environments that we live in and interact with. Living communities, domestic environments and daily objects all, without exception, allow new functions, new ways of performing traditional ones and, above all, new processes of human (as well as non-human) interaction.

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 338–357, 2020. https://doi.org/10.1007/978-3-030-12388-8_25


More than the accomplishment of a function, the smart concept encourages the delivery of services. Yet however complex and sophisticated the technology, if the user’s experience is not pleasant the final solution tends to be abandoned by the user and overlooked by the market. It is commonly accepted that the smart concept enhances product behavior so that it responds to human needs and expectations through the performance of several similar tasks and functions. Despite the intelligence shown by a product (regardless of scale, from a home environment to an object) and its efficiency in carrying out an order or request, an experience that is not user-friendly leads to its abandonment and rejection. Commonly, the conceptualization of intelligent solutions is centered on the requirements of the function itself and neglects the identity of the user, which is critical in outlining the experience. The contemporary user is informed and demanding, so the experience is built on the interaction between user and product, stimulated by the product’s physical model, finishing, interface features and functional capabilities. The quality of the experience is grounded on the type and efficiency of the task performed; the physical experience (dimension, height, shape, finishing); the sensorial experience (how the artifact stimulates the sensorial system, namely the visual, auditory and tactile senses); the cognitive experience (how easy it is to interact with the technology); and the emotional experience (is this a pleasant experience? Would we repeat it?). The experience grounded on the interaction between human and machine, alongside the interactions between machines needed to deliver a service, emerges as the central point to develop in the smart concept. For that, it is crucial to understand what users expect from environments, products and technology in order to achieve a qualified and appealing solution.

2 Introduction

“Any sufficiently advanced technology is indistinguishable from magic.” Arthur C. Clarke

“Man is the only animal for whom his own existence is a problem which he has to solve.” Erich Fromm

Design is a dynamic and focused method that individuals use to change the world according to their needs and expectations. Through design, individuals eliminate barriers and build supportive environments, objects and systems that enable them to accomplish their objectives. Design actions have matured and evolved with human experience and advances in technology [1]. Conceiving empowering environments, objects and systems is a significant ambition of designers. Needless barriers to autonomy, independence and social participation should be avoided in order to improve every human experience. Design solutions that restrict human performance harm social identity and increase various forms of dependency, at significant cost to society. Barriers emerge at different levels of human experience, embracing the physical, the sensory, the cognitive and the communicative dimensions.


The design of interactive interfaces for smart objects is about people. It means understanding how people live and interact with physical and virtual “objects” to perform a task in a safe, comfortable and pleasant way. It takes a wide-ranging view of what the object is and how it helps an identified user to achieve a function efficiently and easily. It considers interfaces and cognitive ergonomics (beyond other human factors considerations) to build something that fosters a functional and enjoyable experience [2]. The design process for interactive smart objects and environments is the way to create something that people use. The process recognises a need or opportunity and satisfies it through the analysis of the functions and of the users’ requirements. Testing the solution through prototype evaluation, virtual objects and environments, interviews, questionnaires and focus groups, among other methodologies, validates the solution against the characteristics of the task and the users [3]. Smart objects and the Internet of Things are two concepts that describe the near future, depending on and complementing each other. The interconnections between smart (as well as non-smart) objects can expand intelligence to unpredicted boundaries. This relationship relies on a network that interconnects objects around the world. Cities, houses, cars, machines, domestic equipment or any object that can sense, respond, work or simply ease the daily living of users tends to become a smart object [4]. By “object” we understand any device or thing, whether intelligent or not. When people refer to the connection between objects, the interconnection may be between smart objects, between non-smart objects or between both.

Intelligent objects are those that continuously monitor, react and adapt to the environment, perform optimally and communicate actively. Their acceptance by users depends on the interactions made possible by their physical and digital appearance [5]. Interaction Design has its roots in graphic and web design, yet it has matured into a field of its own [8]. Interaction Design concerns the practice of designing interactive digital objects, environments, systems and services. Although the term is usually associated with digital artefacts, its aim can be extended to the design of physical objects by studying their possible interactions with the user [6]. An object, like an environment, fosters user experiences that can be made easier and more pleasant by its physical features (such as form, color and texture, and how these promote its functional use) and by the usability of the interfaces found in so many products and environments, which encourage a more efficient and qualified user experience [7]. Interaction design is the link between user and technology: it models how information is retrieved and used, and how information embedded in objects (as well as environments) encourages and extends their use [8]. This area of knowledge focuses on digital artefacts, which orientates (but does not limit) its interest toward the requirements of digital interfaces that foster communication between individuals and technology [9]. Interaction Design is supported by five dimensions (words, visual representations, physical objects and space, time, and behavior) and aims to design for the heterogeneity of human performance while responding to users’ requirements and expectations [10]. This language dimension aims to enhance the user’s experience, since it contributes to boosting the


communication between user and machine, and to afford the user with emotional demand, an aesthetical sense towards a positive feeling emerging from the graphical layout and communication strength of the interface [11]. To interact with users, the objects depend on their form (dimensions, color, texture) functionalities and the interfaces that allow and encourage to choose functions according with users’ needs and expectations. The design of artefacts, namely the ones that includes technology is regularly focused on the way the artefact must be used. Ergonomics is usually introduced as the need to combine the system with human features. Because of informed decisions, equipment, tools, tasks and environments can be selected and designed to respond to human skills and restrictions heterogeneity. Cognitive ergonomics lay emphasis on the balance between human cognitive skills and restrictions and the object, task and environment despite its physical and/or virtual character. Cognitive ergonomics is a critical area to the design of objects, equipment and environments that imply complex, high-tech or automated systems. The practical purpose is to enhance human performance as well as safety and health avoiding human error and needless load and stress. A poor designed object, equipment and environment have a considerable impact on its target audience, which can limit its production and distribution, as well as promote unpleasant and/or dangerous users’ experiences. Cognitive ergonomics aims to spot the importance of how the use of artefact changes the way the user thinks about it, work with it and accept it. The ability of objects and technology to amplify users’ efficiency, implies the migration from the physical to the cognitive parts of work. It is not that work become more cognitive but also because users do not have the monopoly of performing cognitive work or cognitive tasks [12]. 
This area of knowledge discusses mental processes such as perception, memory, reasoning and motor response, since they have a considerable impact on interactions between individuals, objects and other elements of a system [13]. It informs and mediates graphic representation in order to conceive interfaces that boost interactions (between users and artefacts/objects, or between objects) and generate learning processes (in users and objects) through the design of digital contents and their layout on dissimilar devices (desktops, tablets, smartphones, equipment displays, etc.) [14]. Cognitive ergonomics aims to support human information processing so as to increase efficiency, reduce errors and accidents, and improve the sense of well-being; its attention is oriented to psychological phenomena such as knowledge, perception and planning. Cognitive ergonomics responds by designing user interfaces that accommodate cognitive limitations. User interfaces ought to be understood as the users’ means of performing tasks; the user interface is understood as the knowledge that users need to successfully perform tasks with a computer system. Cognitive ergonomics thus refers to user interface design [15]. By definition, the User Interface is the space where interactions between users and objects take place, aiming at the operation and control of the equipment by the individual, while the equipment at the same time feeds back information that supports the operator’s decision-making process [16]. It aims to conceive interfaces featured by

C. C. Gomes

a set of parameters (being easy, efficient, enjoyable and user friendly) to boost the user experience [17]. The end of the 20th century witnessed the design of interfaces, namely those related to information and communication technologies (ICT), based on graphical elements to organize menus and differentiate items, and on logical dialogue boxes to facilitate navigation and the achievement of the end result [18]. Improvements in graphical interfaces and in speech and handwriting recognition, as well as the advent of the internet, wireless networks, sensor technologies and a range of other new technologies delivering small and large displays, pose a new challenge to user interface design: new forms of interaction [19]. The latest developments in ICT call for gesture-based, tactile-based and emotion-based interactions, while at the same time there is a trend towards combining the physical and digital worlds, enabling individuals to access and interact with information in their work, social and everyday lives [20] using an array of available technologies [21]. Informed and supported by several areas of Human Factors, Design is shaping interactions between humans and physical objects, but also with the physical/technical, informational, social, political, economic and organizational elements of a system. These interactions can be experienced physically, cognitively and emotionally from different perspectives. Today, technological progress challenges and changes everyday life, namely through improved internet access, digital technologies and associated products and services. The embedding of technology in products dramatically alters the user-product/object interaction [22]. The change in product/object characteristics coincides with demographic changes, especially an ageing society, posing a significant challenge to design practice.
Matching users’ and tasks’ requirements, and particularly users’ expectations, with technological developments increasingly requires granting intelligence to usually static and unintelligent objects. Beyond the physical artefact, there can be an array of selectable functions (through an interface such as a visual display), and the possibility of connecting to the internet allows information to be managed between the machine and the user, as well as between machines, supporting useful services which also need to be designed. At this moment, the number and nature of the possible user-machine interactions justify, more than ever before, the consideration of Human Factors input into the Design process from the beginning. The success of a product depends on its acceptance by the user (customer), and this acceptance will be easier to achieve if the product is conceived following a process that considers the human being in all his/her dimensions [23]. Inclusive Design aims to conceive urban areas, buildings, facilities, objects and systems with human heterogeneity in mind, to be used by as many people as possible. The Inclusive Design Research Centre identifies three dimensions of Inclusive Design:

Recognize diversity and uniqueness: As users differ from the non-existent average model, the needs and desires of users at the margins are even more divergent. This suggests that a mass solution is not the best one. Flexibility is the goal: although aimed at a target user, solutions can also suit other users and purposes, without stigmatizing any of them. Stigmatizing solutions are not economically or technically sustainable.

Inclusive process and tools: Inclusive project teams must take a multidisciplinary approach and must include users as well as other stakeholders involved in use, delivery, sale and maintenance. They should use user-centered and participatory design approaches (which can be combined).

Broader beneficial impact: The designer must be conscious of the context and of the impact of the design solution beyond the user. The inclusive designer must keep in mind the possible interactions (as well as prospective opportunities) between the user and systems [24].

Innovation and competitiveness demand emotionally appealing user experiences. Unpleasant experiences will lead to the abandonment of the product [25]. Moreover, throughout the product/object development cycle, designers must include potential users, individually or in focus groups, to analyze and recognize task and product requirements, evaluate design strategies, and test prototypes. Products are defined by their form, dimensions, color, texture and the functions they perform. Technology enables new functions, which are extended with “smartness” and connectivity, such as the Internet of Things [26]. A smart object is characterized as an object which is responsive to human and machine interactions. The complexity that this kind of object can achieve leads to more complex interactions: they are not simply physical but extend to the sensorial and cognitive, pushing the need for inclusive design and usability to the edge [27]. Smart objects can be described as tangible products with embedded Information Technology. The enablers for smart objects are sensors, processors and semantic technology. Smart objects present a set of capabilities that make them exceptional: context-awareness, pro-activity and self-organization. These characteristics enable them to make decisions based on different contexts as well as to anticipate the user’s activities and choices.
Smart objects’ behavior (action) may result from information gathered from different sensors embedded in the object, but also from a so-called semantically enabled smart environment [28]. The smart object is an innovative concept that spans several areas of knowledge [29]. It is not easy or realistic to forecast its future development, but this added intelligence, sensing aptitude and semantic capability suggest benefits and opportunities that can dramatically transform the way the user interacts with, thinks about, and benefits from the object. For the Designer, this is an opportunity to develop and apply concepts such as User-Centered Design, Usability, Interaction Design, Experience Design and Emotional Design, to mention just a few, and to create objects that can be accepted, used and desired by users. Currently, users ask for objects that present expanded functionality, an improved and friendly interface, enhanced smartness and undeniable aesthetics [30]. This refined functionality, complexity and smartness indicate the potential of products to identify and react to changes in the environment and in the user’s requirements. This potential must be realized with higher quality and reliability and with reduced manufacturing and societal costs. Threats to the success of smart objects are safety of use, usability (product-user interaction in both physical and virtual dimensions), control, and ethical issues related to the data gathered and managed [31]. These threats must be addressed by research.

Most smart objects have their performance increased by the opportunity to use or activate a service, sequential or optional, that expands their functionality and purpose; the possibilities offered by smart objects and applications demand that this service also be designed [32]. A sustainable approach encourages responsible objects that make a difference in the way they influence and affect users’ lives and work, as well as the Earth’s ecology [33]. Considerable research has been conducted to study the object life cycle and its impact on the natural world, but little research has been developed to identify the way object features affect their users physically, psychologically and behaviorally. The results achieved contribute to new materials and manufacturing processes, showing the importance and the near-future acceptance of these themes [34]. ICT stimulates a different way of accepting and managing technology. Individuals perceive technological devices as digital beings rather than as traditional inanimate objects, and they accept and expect smart objects. More important than the functionalities boosted by technology is the social integration and interaction needed by human beings [35]. The functionalities provided by smart objects and the possibility of accessing a significant range of services are crucial for being in contact with another being, even if virtual, supported by the visual, acoustic and tactile senses. Designers who embrace a sustainable philosophy, while creating an object, conceive a service that extends beyond the immediate customer to other people, to other species and, ultimately, to future generations [36].

3 Emotional User Experience

“The notion that we are purely rational creatures has been dispelled definitively by behavioral economics and, over the last decades, we have seen an increased awareness of the role emotions play in our decision making and perception. However, the way we have been designing the spaces we inhabit, the products we use, and the services we deliver has only taken emotions into partial consideration” [37]. User experience is characterized by the appraisal of the experience that the individual has when interacting with an object and/or an environment [38]. User experience highlights the practical, emotional, meaningful and valued features of interaction design and product ownership, as well as the user’s perception of more mundane facets such as the utility, ease of use and efficiency of a system. User experience is dynamic, since it fluctuates over time and across contexts [39]. Taking the experience as the artefact’s objective requires a new approach to technology, questioning why technology is important and what its intended impact is. In contrast to the task/function-oriented method, the experience-oriented method is focused on the user’s subjective experience of interacting with a product (whether object or environment), understanding interaction as a dynamic story able to create emotions and meaning [40]. The subjectivity of the user’s emotions throughout the experience places demands on its design. Still, it is possible to outline a set of steps to encourage the experience: define a purpose (what is the reason to design the experience and what is desired to attain);

define a path (the story) for users to follow that best supports reaching the objective; define the emotional movements that make up the emotional path and the distinct stimuli that provoke the emotional events in the user. Since it is not feasible to know an experience until it is experienced, it is essential to offer a prototype (physical or virtual) in order to observe and evaluate the way users interact with it [41]. Given the heterogeneity of the human condition and the resulting ways of interacting with objects and environments, it is reasonable to draw patterns of behavior from a selected sample of users in order to reach the conclusions that will support the result.

4 Smart (Dwelling) Environments

The use of technology helps human beings to control and settle into their natural environment and empowers societies. The last century saw the emergence of several technologies aimed at performing the most demanding activities (in effort and time). From the telephone and the washing machine to the computer, peripherals and mobile phones, these technologies have attracted our attention and investment. Technology exists in our houses, and our ways of living create a sense of assistance and dependence in accomplishing the simplest or most complex tasks. These technologies appeared gradually, within particular houses and families, but the standardization of models and their intensive use democratized their presence in most family homes. The development of ICT (Information and Communication Technologies), with special attention to the Internet and wireless technology, stimulates the connection between various pieces of equipment and contact (driven by different causes) with and between institutions and individuals. Notwithstanding the importance of every technology, ICT emerges as the technology that has most significantly, and in the shortest period, challenged our way of living and interacting. The revolution inspired and motivated by ICT stimulates new forms of work and living in which access to information and the possibilities of communication reduce geographical and social barriers. People can work, interact with friends and relatives, shop, learn, be entertained, etc. from home, provided that this technology is available and they have the ability to manage ICT. Contemporary lifestyles show that most daily routines are supported by technology, which means that technology must be designed for common users, encompassing a range of capabilities and not focusing on impairment or disability [42].
To design an inclusive solution (environment, product, communication, system and experience), the designer must be informed about the requirements of users and the available technology. Research information is available, but designers are not familiar with it, or the information offered does not fit the requirements of the solutions. The focus is to give designers the awareness needed to integrate more usability and inclusivity into the conceptual process of solutions. Designers must take this chance to conceive solutions that appeal to larger markets and are more sustainable [43]. “…assistive environments is a field that pushes innovation and interdisciplinary collaboration forward, catalyzes the union of computation with the development of new materials (e.g., sensors) and devices (e.g., assistive robots), new or improved drugs

and drug administration, new methods and software for rehabilitation, physical/occupational therapy, new materials for safety or disorder monitoring (e.g., sleep disorders connected with epilepsy or depression), new algorithms for mental health and emotion monitoring (e.g., through facial expressions or speech), new types of virtual life coaches (e.g., avatars to promote exergaming) and many other innovations” [44]. Smart and assistive technologies help to simplify life for everyone, regardless of age, who may need support to perform daily activities, through home modification and adaptation. Technical devices may be applied in different environments such as home and work, for recreation and leisure, for outdoor activities and for activities of daily living, which include hygiene, dressing, taking medication, preparing meals, feeding, and household chores. They can also make a significant contribution to enhancing communication, safety, education, mobility, and transportation within a variety of environments and situations. Assistive technology is a significant component of smart homes, which aim at “… the integration of technology and services through home networking for a higher quality of living at home” [45]. “Smart homes are houses or apartments equipped to enhance the safety of day-to-day activities, monitor health and stimulate mental and physical exercise. Key devices are fall detectors, wearable sensors that measure physiological responses, embedded sensors in furniture or walkers that evaluate health, personal emergency responders, easy-to-use cell phones, two-way video conferencing equipment, assisted technologies and computer games like the Wii that promote mental and physical exercise” [46]. “They are about using the latest information and communications technology to link all the mechanical and digital devices available today, and so create a truly interactive house.
They started by designers examining the way people live now, and then exploring how society might look in the future. This generated a number of new ideas that could improve people’s lives and help them stay independent for longer” [47]. Moreover, smart homes are considered a form of ambient intelligence, which is sensitive and adaptive to contemporary human and social needs. The use of smart home technologies is intended to improve home comfort for every person through the automation of domestic tasks, easier communication, and higher security. Smart home users are able to increase their capacity to interact with their domestic environment, perform tasks and engage in activities that would previously have been challenging or unmanageable [48]. Smart homes can be characterized as having five basic features: “Automation: the ability to accommodate automatic devices or perform automatic functions; Multi-functionality: the ability to perform various duties or generate various outcomes; Adaptability: the ability to adjust (or be adjusted) to meet the needs of users; Interactivity: the ability to interact with or allow for interaction among users; Efficiency: the ability to perform functions in a time-saving, cost-saving and convenient manner” [49].

More than including a variety of gadgets and technology in a home, it is important to define what really matters to users’ health, safety, comfort and wellbeing. This sense of wellbeing is provided not only because the user can be informed about possible intrusions into the house (which can be more a source of constant concern than of wellbeing) but also by the communication between different technologies to recognize and react to occupants’ routines. The crucial issue is to engage users in the design process. Quoting Bierhoff and Panis [49], “Involving users in this case means not only consulting them when the product is finished but giving them an active role in the design process and the actual shaping of Ambient Intelligence.” Inclusive prototypes and tests to try out and validate results will also be an important contribution to a better solution. The inclusive design concept comprises the use of the equipment as well as the usability of the communication panels conceived to interact with individuals. Reality shows that the acceptance of the smart home concept is delayed by the lack of research focused on user requirements and expectations. Furthermore, a deficient understanding of user requirements and weak demand for services to be operated in smart homes are recognized. In addition, the industrial cluster is driven by providers offering technology-push instead of demand-pull, contributing to customer dissatisfaction [50]. Several research projects show the interesting and useful direction that this area of knowledge is pursuing. Every person has his/her own lifestyle and will need a spatial and technological solution that answers his/her personal and professional requirements. The static and indifferent character of the dwelling environment illustrates the “non-smart” attitude of the construction cluster towards the stimuli presented by contemporary society.
Questions such as where, or which, is the best place to live do not have an accurate answer without an understanding of users’ expectations [51]. In fact, the home should be neither a project made by investors, architects and designers (and other professionals from the construction cluster) in which their concepts or preferences override users’ requirements, nor a technological testing ground in which continuous experimentation can lead to technological evolution while forgetting the user, who perceives the dwelling environment through his/her senses, mobility, daily routines and technological adaptation. The concept of smart houses responds to different approaches, ranging from security and energy automation, remote equipment control and assisted environments for individuals with special requirements to the supervision of human behavior within dwelling and community/urban environments. The answer given by smart houses must be supported by the skills of the group, family or individual, more precisely by what they can do and what their expectations are [52]. The holistic support of the concept leads us to some conclusions (Tables 1 and 2). The relationship between the information shown in these tables indicates that the inclusion of smart environments in dwelling settings stimulates more responsiveness to new functions and different users. Smart environments enhance the capacity to carry out routine activities, contributing to the independence and, consequently, the self-esteem of individuals. Intelligent environments are not limited to the use of equipment by the person, but they boost the

Table 1. Users expectations from Dwelling Environment (what people expect from dwelling environment)

Expectations | Reasons to support expectations
Affordable price | Mortgage impact in salary budget
The guarantee of comfort and dignity during the human life cycle | Person living all alone, from which the elderly emerges as the group more in need; Support the challenges of life and the support to relatives
Functionality | To respond to the requirements of inhabitants’ daily activities; Support technology to respond to the trials of different every day routines
Flexibility | To respond to the new functions inside the home (with special emphasis to remote work) and user physical and sensorial capabilities
Being sensitive to individuals and family goals | Enhance individuals and family self-esteem

interaction between user and environment, extending it to the expression of our emotions and the projection of our expectations and illusions, through the performance of everyday tasks or remote communication with others. However, technology does not replace human contact. Smart environments can be programmed and can systematize routine human activities. The challenge is to transform this apparatus, which obeys our commands, into machines that recognize our identity, dignity, requirements and behavior. Diverse research projects pursue this purpose; special attention is due to those based on home labs and user participation: HOUSE_n, MIT (House_n Research Group, Department of Architecture, MIT, website at: http://web.mit.edu/cron/group/house_n/); AwareHome, Georgia Institute of Technology (Aware Home Research Initiative (AHRI) at Georgia Institute of Technology, website at: http://www.awarehome.gatech.edu/); Centre for Usable Home Technology, University of York (website at: http://icuhtec.org/about/); Making Smart Homes Smarter, Ulster University (Smart Environments Research Group, website at: https://www.ulster.ac.uk/research/institutes/computerscience/groups/smart-environments). The results achieved show the importance and the near-future acceptance of these themes. The interaction allowed and motivated by computer networks stimulates a different way of accepting and managing technology. Individuals perceive technological devices as digital beings rather than as traditional inanimate objects, and they will come to perceive the home environment in the same way. Nevertheless, more important than the activities and supervision enabled by technology is the social integration and interaction needed by all human beings. Crucial to achieving it is the opportunity provided by the system to be in contact with another being, even if virtual, supported by the visual, acoustic and tactile senses. Human communication and interaction can be the key to performing and monitoring activities.
As Juan Carlos Augusto claims, there is a trend of allowing different houses to be called “smart houses”; however, it will become “hard to say how many actually deserve

Table 2. Users expectations from Technology (what people expect from Technology)

Expectations | Reasons to support expectations
Simple to use | Usability to enhance the use of technology and communication between people
Support the performance of daily routines (Activities of Daily Living) | Person living all alone, from which the elderly emerge as the most demanding group; Support the challenges of life and the support to relatives
Sense of security provided by different devices | Intrusion; Fire; Flooding
Sense of comfort | Control of light and temperature
Work, entertainment and shopping | Reliability of communication, regardless of its professional or personal character; Promotion of interaction between individuals; Security of individual’s information; Security of individual’s identity; Balance geographical and time constraints
Assistive technology | Encouragement of individual independence, namely of those with special requirements; Promotion of individuals’ interaction regardless of their demands and background; Increase the functionality and flexibility of different facilities within dwelling environments; To supervise human behaviour to store information and further communication with (health/care) professionals
Being sensitive to individuals and family goals | Enhance individuals and family self-esteem and wellbeing

the label—as this depends on where the line is drawn between something behaving intelligently or not” [53]. The previous quotation raises the question: which are the right features/technology to improve the experience and interaction in the home environment? How can we implement it?

5 Issues to Discuss

This research shows that a Smart Home Environment must be inclusive and responsive to human needs and expectations as well as to functional requirements. These are decisive parameters for achieving the balance between the built environment and human life cycles in order to enhance the future human quality of life.

Continuous demographic shifts and changes in individuals’ lifestyles pose relevant challenges that demand the best solutions for the new requirements and expectations. Endless technological progress makes it hard to forecast tomorrow’s reality; nevertheless, it is commonly accepted that the improvements achieved in the last decade, and the present as we know it, will very soon be no more than a memory. Facing this reality, the possibility of anticipating and speculating about near-future worlds will remain. The goal is to be creative, to discuss, to debate, and to produce artefacts oriented towards users’ present-day and future needs and expectations. The way designers work and think carries a solid potential for bringing together these two fields, future and design, with the awareness that somebody’s future is somebody else’s present. The importance of the home environment to ageing in place is undeniable. The innumerable items that make up the home environment and enable human interactions pose a challenge for the areas of Design intervention. Multidisciplinary teams and user-centred design can be the key to friendly, usable solutions oriented to the present and the future. The presented scheme aims to be a synthetic model of reviewed and organized issues that must be considered to conceive the Inclusive/Smart Home Environment for our present and future selves.

Fig. 1. Synthetic model, core

The smart home environment should be conceived to respond to human needs and desires; although the user and the experience of the environment must be the focus of smart solution design, it is vital to understand technology and home environment requirements in order to achieve a successful result. Understanding the features of each element (user, home environment and technology) makes it possible to establish the relationships between the elements of this trilogy (see Fig. 1). Technical performance and progress must exist and be developed with a view to their use and their support for positive experiences and better performance. Smart objects and environments are no exception. The emergence of smart objects, boosted by the Internet of Things, allows the user to interact with the physical object and the associated information through the interface, and this reality extends to the use of space, namely the home. Just as users update their abilities to interact with technology, so must the home environment be updated. Homes must be conceived and built to include the smart concept, responding to contemporary demands and fulfilling the principles of the four pillars of sustainability: environmental, economic, social and cultural. The desired interaction between the user and the home environment can be enhanced by a technologically enabled and flexible layout that ensures the spatial dimensions needed to perform a task as well as the privacy required by the user. Nowadays,

Fig. 2. Synthetic model, southeast quadrant, Interaction between user and home environment

home is more than our personal space; from home we can perform work and leisure activities using technical devices. Nevertheless, finishes must stimulate the human senses and provide aesthetic value and comfort in order to qualify the interaction experience, without depriving the home of its symbolic identity (see Fig. 2). A smart environment is more than a collection of technical gadgets. A smart environment ought to enhance the human condition while responding to security, safety and wellbeing requirements. The technical advances of the last decade make it difficult to forecast the future. Possible answers about the technologies expected for the next decade will be outdated tomorrow. Advances in technology allow new kinds of functions as well as new ways of performing the traditional ones; thus the dwelling environment has to present updated solutions, in its layout as well as in its construction process and materials, to allow and boost the use of new technologies. The unpredictable future of technological development demands flexibility from the built environment and from products (many of which are part of our daily professional and personal routines and have a strong impact on them). Flexible architectural

Fig. 3. Synthetic model, north half, Interaction between home environment and technology

Fig. 4. Synthetic model, southwest quadrant, Interaction between user and technology

and design solutions (the same is true for objects and systems) are the only way to support such technologies. The technologies embedded in home environments aim to support the human life cycle and must be easy to add, adapt and update (see Fig. 3). The smart concept of objects and environments must become part of human routines, like other technical/mechanical devices. The design of smart objects and environments must follow a user-centered process. It is important to understand which technology, and how much of it, will be useful throughout the human life cycle. Understanding the human condition, objectives, limitations (physical, sensorial, cognitive, emotional, etc.) and expectations shows that the technology users may need changes for two reasons: changes in users’ professional and individual needs, and technological progress (see Fig. 4).

6 Conclusions

Smart concept research continually emerges in research projects driven towards technical innovation and development, along with the performance of tasks and activities. However, these approaches are usually unaware of contextual scenarios and of the heterogeneity of human performance. Whenever the conceivable use of the object and environment is addressed, the experience is usually carried out by professionals and personas


to measure interaction usability, neglecting the emotional factor that is crucial to qualify the user experience. The quality of the experience is perceived through the satisfaction felt by the user while performing a function or task and interacting with an object or environment. Human interaction with an object and an environment is physical, sensorial, cognitive and emotional. The information provided in this paper, namely in Tables 1 and 2, identifies what users request from the home and from technology. Interaction with objects is part of daily human routines, as is interaction with the environment, particularly the dwelling. From the environment, individuals require and expect a guarantee of comfort and dignity throughout their life cycle, with the functionality and flexibility to be sensitive to individual and family/community objectives. From technology, individuals require and expect support in personal, work and entertainment activities; it should be easy to use, secure, comfortable and sensitive to individual and family/community objectives. Standard human routines have long remained the same; what has changed, mostly due to technology, is the way they are performed, and technology places its own requirements on environment features. The fact that users are better prepared for technology does not mean that they will not assess the experience and reject the object and environment if the experience is unpleasant. Knowledge of the previous statements calls for a new attitude when designing an object or environment. Nowadays, objects and environments fulfil functions in a more efficient, effective and comfortable way, but a new period has already begun: the time when objects and environments communicate with individuals, objects, environments and systems. More than performing a function, they deliver a service.
The advances of the smart concept and its attachment to objects increase the ability of an object to communicate with more complex ones while helping users to achieve their goals: complex functions are completed through the mediation of a computerized system regardless of time and place, since technology permits it. The increase in functionality implies further and more complex interactions. Physical interaction grounded on shape, dimensions, finish and features, which stimulates mainly our visual and tactile senses and reach, gives way to interactions that arouse the visual, auditory and tactile senses as well as our cognitive abilities. Users are more knowledgeable than ever before and desire more than usability from interaction: they require and desire an emotionally positive experience. To accomplish this, a new design method is essential. The design process ought to be focused on the experience, as should the multidisciplinary team required to conceive it. Knowledge is required about the function and tasks to perform and about the target user; the analysis of reality promotes an understanding of the way users perform a function. From this stage it is possible to conceive a physical or virtual prototype to be experienced by sample groups, assessed and improved. The final experience must be grounded on the emotions aroused (pleasant, fun, enjoyable, valued, etc.) to boost the approval rate of the smart concept in objects and environments among users and therefore in the market.

Acknowledgment. This work is financed by national funds by FCT - Foundation for Science and Technology, under the Project UID/AUR/04026/2019.


References 1. Kopec, D.: Environmental Psychology for Design. Fairchild Books, New York (2012) 2. O’Grady, J.V., O’Grady, K.: A Designer’s Research Manual: Succeed in Design by Knowing Your Clients and What They Really Need. Rockport Publishers, Beverly (2009) 3. Stone, T.L.: Managing the Design Process-Concept Development: An Essential Manual for the Working Designer. Rockport Publishers, Beverly (2010) 4. Buyya, R., Dastjerdi, A.V.: Internet of Things: Principles and Paradigms. Morgan Kaufmann, Cambridge (2016) 5. Rowland, C., Goodman, E., Charlier, M., Light, A., Lui, A.: Designing Connected Products: UX for the Consumer Internet of Things. O’Reilly Media, Sebastopol (2015) 6. Cooper, A., Reimann, R., Cronin, D., Noessel, C.: About Face: The Essentials of Interaction Design. Wiley, Indianapolis (2014) 7. Norman, D.: The Design of Everyday Things: Revised and Expanded Edition. Basic Books, New York (2015) 8. Preece, J., Sharp, H., Rogers, Y.: Interaction Design: Beyond Human-Computer Interaction. Wiley, Indianapolis (2015) 9. Pannafino, J.: Interdisciplinary Interaction Design: A Visual Guide to Basic Theories, Models and Ideas for Thinking and Designing for Interactive Web Design and Digital Device Experiences, 2nd edn. Assiduous Publishing (2018) 10. Weinschenk, S.: 100 Things Every Designer Needs to Know About People. New Riders, Berkeley (2011) 11. Greenberg, S., Carpendale, S., Marquardt, N., Buxton, B.: Sketching User Experiences: The Workbook. Morgan Kaufmann, Waltham (2011) 12. Hollnagel, E.: Handbook of Cognitive Task Design (Human Factors and Ergonomics). CRC Press, New Jersey (2003) 13. Salvendy, G., Karwowski, W.: Advances in Cognitive Ergonomics (Advances in Human Factors and Ergonomics Series). CRC Press, New Jersey (2010). (Vol 3) 14. Falzon, P.: Cognitive Ergonomics: Understanding, Learning, and Designing HumanComputer Interaction (Computers and People). Academic Press. Kindle Edition (2015) 15. 
Werby, O.: Interfaces.com: Cognitive Tools for Product Designers. CreateSpace. Kindle Edition (2008) 16. Moggridge, B.: Designing Interactions. The MIT Press, Cambridge (2007) 17. McKay, E.N.: UI is Communication: How to Design Intuitive, User Centered Interfaces by Focusing on Effective Communication. Morgan Kaufmann, Waltham (2013) 18. Tidwell, J.: Designing Interfaces: Patterns for Effective Interaction Design. O’Reilly Media, Sebastopol (2011) 19. Hoober, S., Berkman, E.: Designing Mobile Interfaces: Patterns for Interaction Design. O’Reilly Media, Sebastopol (2011) 20. Crumlish, C., Malone, E.: Designing Social Interfaces: Principles, Patterns, and Practices for Improving the User Experience. O’Reilly Media, Sebastopol (2014) 21. LaViola, J.J., Kruijff, E., McMahan, R.P., Bowman, D., Poupyrev, I.P.: 3D User Interfaces: Theory and Practice. Wesley Professional, New York (2017) 22. King, S., Chang, K.: Understanding Industrial Design: Principles for UX and Interaction Design. O’Reilly Media, Sebastopol (2016) 23. Langdon, P., Clarkson, P.J., Robinson, P.: Designing Inclusive Interactions: Inclusive Interactions Between People and Products in Their Contexts of Use. Springer, New York (2010)


24. Inclusive Design Research Centre—OCAD University. The Three Dimensions of Inclusive Design. http://idrc.ocadu.ca/. Last accessed 2018/6/8 25. Norman, D.: Emotional Design: Why We Love (or Hate) Everyday Things. Basic Books, New York (2005) 26. Fortino, G., Trunfio, P.: Internet of Things Based on Smart Objects: Technology, Middleware and Applications. Springer, New York (2014) 27. Rose, D.: Enchanted Objects: Innovation, Design, and the Future of Technology. Scribner, New York (2015) 28. Stanford, J.: Product Design and The Internet of Things: Designing The Future. Kindle Edition. Amazon Digital Services LLC (2016) 29. Clements-Croome, D.: Intelligent Buildings: An Introduction. Routledge, New York (2013) 30. Johnson, J.: Designing with the Mind in Mind—Simple Guide to Understanding User Interface Design Guidelines, 2nd edn. Morgan Kaufmann, Waltham (2014) 31. Kuniavsky, M.: Smart Things: Ubiquitous Computing User Experience Design. Morgan Kaufmann, Waltham (2010) 32. Bodine, K.: Service Design: The Most Important Design Discipline You’ve Never Heard Of. http://blogs.forrester.com/kerry_bodine/13-10-01-service_design_the_most_important_ design_discipline_youve_never_heard_of. Last accessed 2018/6/8 33. Manzini, E., Coad, R.: Design, When Everybody Designs: An Introduction to Design for Social Innovation (Design Thinking, Design Theory). The MIT Press, Cambridge (2015) 34. Goodwin, K., Cooper, A.: Designing for the Digital Age: How to Create Human-Centered Products and Services. Wiley, New Jersey (2009) 35. Hartson, R., Pyla, P.: The UX Book: Process and Guidelines for Ensuring a Quality User Experience. Morgan Kaufmann, Waltham (2012) 36. Hurff, S.: Designing Products People Love: How Great Designers Create Successful Products. O’Reilly Media, Sebastopol (2016) 37. Ospina, D.: Emotion & Design: The Emotional Experience Design Framework. https://medium.com/@conductal/the-future-of-design-emotional-design-798a3f15d698. Last accessed 2018/6/8 38. 
Moggridge, B.: Designing Interactions. MIT Press, Cambridge (2007) 39. International Organization for Standardization: Ergonomics of human system interaction— Part 210: Human-Centered Design for Interactive Systems (formerly known as 13407) (2009) 40. Hassenzahl, M.: Experience Design: Technology for All the Right Reasons. Morgan & Claypool Publishers, California (2010) 41. Albert, W., Tullis, T.: Measuring the User Experience—Collecting, Analyzing, and Presenting Usability Metrics (Interactive Technologies), 2nd edn. Morgan Kaufmann, Sebastopol (2013) 42. Steinfeld, E., Maisel, J.: Universal Design: Creating Inclusive Environments. Wiley, New Jersey (2012) 43. Carse, B., Thomson, A., Stansfield, B.: Use of biomechanical data in the inclusive design process: packaging design and the older adult. J. Eng. Des. 21(2–3), 289–303 (2010) 44. Maglogiannis, I., Betke, M., Pantziou, G., Makedon, F.: Assistive environments for the disabled and the senior citizens. Pers Ubiquit Comput. 18, 1–3 (2014). https://link.springer. com/article/10.1007/s00779-012-0616-0, last accessed 2018/6/8 45. Tiresias: Smart Home Environment. http://www.tiresias.org/cost219ter/inclusive_future/ (14).pdf. Last accessed 2018/6/8 46. Crawford, M.: “Smart Homes”—Lay foundation for aging in place. In: FutureAge, pp. 36– 38. July/August 2009. http://artikelpdf.co.cc/link/smart-homes-lay-foundation-for-aging-inplace/. Last accessed 2018/6/8


47. Pragnell, M., Spence, L., Roger Moore, R.: The Market Potential for Smart Homes, pp. X. http://www.jrf.org.uk/sites/files/jrf/1859353789.pdf. Last accessed 2018/6/8 48. Lê, Q., Hoang, B.N., Barnet, T.: Smart Homes for Older People: Positive Aging in a Digital World. Future Internet 4(4), 607–617 (2012) 49. Bierhoff, I., Panis, P.: Active involvement of older users in the design process of smart home technology. Gerontechnology 7(2), 74 (2008) 50. Suryadevara, N.K., Mukhopadhyay, S.C.: Smart Homes—Design, Implementation and Issues. Springer, New York (2015) 51. Gallagher, W.: The Power of the Place: How our Surroundings Shape our Thoughts, Emotions and Actions. Harper Perennial, New York (2007) 52. Dewsbury, G.: Intelligent or “Smart” Home Technology. http://www.smartthinking.ukideas. com/IntellBuild.html. Last accessed 2018/6/6 53. Augusto, J.C.: Making SMART Homes Smarter. http://news.ulster.ac.uk/releases/2005/ 1669.html. Last accessed 2018/6/8

Democratization of Intelligent Sensor Network for Low-Connected Remote Healthcare Facilities: A Framework to Improve Population Health & Epidemiological Studies

Santosh Kedari¹, Jaya Shankar Vuppalapati¹, Anitha Ilapakurti², Chandrasekar Vuppalapati² (✉), Sneha Iyer², and Sharat Kedari²

1 Hanumayamma Innovation and Technologies Private Limited, HIG-II, Block-2/Flat-7, Baghlingampally, Hyderabad, Telangana, India
{skedari,jaya.vuppalapati}@sanjeevani-ehr.com
2 Hanumayamma Innovations and Technologies Inc, 628 Crescent Terraces, Fremont, CA, USA
{Ailapakurti,cvuppalapati,siyer,sharath}@hanuinnotech.com

Abstract. Healthcare associated infections (HAI), or infections acquired in health-care settings, are the most common detrimental events in health-care delivery worldwide. Millions of patients are affected by HAI each year, leading to high mortality rates and financial losses. Out of every 100 patients hospitalized at a given time, 7 in developed and 10 in developing countries will be affected by at least one HAI. These infections are responsible for approximately 2 million cases and around 80,000 deaths per year in developing countries. HAI is more frequent and more acute in rural areas than in urban areas. One chief reason is the "connectivity gap". Unlike many urban healthcare facilities, where providers usually have Dedicated Internet Access (DIA) with greater bandwidth and better reliability to aid acute services, healthcare facilities in rural areas have low or no connectivity and are less equipped to prevent the spread of HAI. This research paper provides an innovative and low-cost alternative that overcomes the connectivity obstacle by developing a decentralized intelligent sensor network, based on MQTT, that brings connected intelligence to non-connected healthcare facilities, thereby overcoming the "connectivity gap" barrier. The paper presents a prototype solution design, its application and a few experimental results.

Keywords: MQTT · MQTT-SN · Electronic health records · EHR · Healthcare associated infections (HAI) · Preventive healthcare · Supervised machine learning · Internet of things · IoT · Health · Real-time stream analytics · IoT architecture · Sanjeevani electronic health records · Association rule mining · Naïve Bayes classifier · Outpatient

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 358–376, 2020. https://doi.org/10.1007/978-3-030-12388-8_26


1 Introduction

Healthcare associated infections (HAI), or infections acquired in health-care settings, are the most common detrimental events in health-care delivery worldwide [1]. Millions of patients are affected by HAI each year, leading to high mortality rates and monetary losses for health facilities [1]. Out of every 100 patients hospitalized at a given time, 7 in developed and 10 in developing countries will be affected by at least one HAI [1]. HAIs are responsible for approximately 2 million cases and around 80,000 deaths per year in developing countries¹ (see Figs. 1 and 2). Some of the most common HAIs are influenza, tuberculosis and norovirus.

Fig. 1. Acute care hospitals

Fig. 2. Bed bugs [10]

The digital divide is one of the major factors that make HAI more acute and intense in rural healthcare settings [2, 3]. The "connectivity gap", i.e., internet access that is unreliable, costly, or simply unavailable at the speeds necessary for enabling new innovations in care, is a major hindrance to high-quality health care in rural areas [3]. The effect of the "connectivity gap" is so stark that the prevalence of life-threatening chronic diseases is higher in rural communities than in suburban or urban locations [4, 5]. For instance, as per the Centers for Disease Control and Prevention (CDC),

1 National HAI: https://www.cdc.gov/HAI/pdfs/progress-report/hai-progress-report.pdf.


the death rates for the five leading² causes of death in the United States are higher in rural areas [6–8]; these causes are cancer, heart disease, unintentional injury (which includes opioid overdoses and vehicle accidents), stroke and chronic lower respiratory disease. One of the major reasons for the digital divide and the "connectivity gap" is that rural healthcare facilities exhibit a long-tail healthcare delivery and patient experience.

1.1 The Long Tail: Issue at Stake

The internet has changed the dynamics of the healthcare industry [9]. Since its development, every aspect of the healthcare industry has evolved deeply and continuously: connected networks, electronic health records (EHR), voice-enabled personal assistants, location and wellness applications, analog and digital data, sensor data, Bluetooth Low Energy (BLE), mining of healthcare data, recommendations gleaned from Web 2.0 and 3.0 data sources, the Internet of Things (IoT), wearable data sources and new business models are reshaping the industry. This process has recently been accelerated by the emergence of disruptive innovations, from the evolution of web and mobile technologies to Artificial Intelligence (AI) platforms. In this paper, the long tail refers to the multitude of small, independent remote healthcare facilities (Fig. 3) that have limited digital infrastructure and lack the intelligent systems needed to grasp and synthesize this complexity.

Fig. 3. Long tail

Having intelligent sensor networks at rural healthcare facilities not only addresses HAI but also addresses prevailing issues such as overcrowding³ and the lack of highly skilled human resources [8]. In addition, the intelligent sensor network addresses the golden rule in infection control: "there is no ideal disinfectant and the best compromise should be chosen according to the situation." Fig. 4 shows the routes by which infection spreads [8].

2 Reducing Potentially Excess Deaths from the Five Leading Causes of Death in the Rural United States: https://www.cdc.gov/mmwr/volumes/66/ss/ss6602a1.htm?s_cid=ss6602a1_w.

3 When hospitals infect you: https://www.thehindu.com/sci-tech/health/When-hospitals-infect-you/article17289370.ece.


Fig. 4. The spread of nosocomial infection [8]

1.2 Connectivity and Intelligent Sensor Network

In the IoT world, the integration of devices with wireless sensor networks (WSNs) is crucial to bring people, data, and processes together. Among the several IoT protocols available for such communication challenges, Message Queuing Telemetry Transport (MQTT) is already widely implemented. MQTT is a publish/subscribe messaging system designed specifically for resource-limited devices on constrained networks [10]. This paper explains the use of MQTT-SN (MQTT for Sensor Networks), a version of MQTT designed for the peculiarities of a wireless sensor environment. In general, wireless radio links have higher failure rates than wired ones because they are susceptible to interference and fading. They also have lower transmission rates. MQTT-SN is optimized so that it can be implemented on a typical WSN: one with many hundreds of low-cost, battery-operated sensors and actuators, which usually communicate over RF-based protocols such as Zigbee and Bluetooth Low Energy (Bluetooth LE). MQTT-SN serves as a lightweight protocol for edge processing and is extremely useful where connectivity is low or intermittent, such as in remote locations. This paper describes an implementation for IoT devices in rural areas that must cope with poor, slow or intermittent network connections and frequent network disruptions.

This research paper addresses the challenge by building an autonomous disinfectant dispenser, enabled by an intelligent sensor network, that brings connected intelligence to non-connected healthcare venues with limited compute. The paper presents a prototype solution design, its application and a few experimental results. The structure of this paper is as follows: Sect. 2 discusses the basics of MQTT, MQTT-SN, edge analytics and machine learning. Section 3 talks about our MQTT architecture. Section 4 explains the related designs and implementation. Section 4 presents a case study. Section 5 includes the conclusion and future work.


2 Understanding Machine Learning Algorithms Democratization of Intelligent Sensor Network

2.1 MQTT

MQTT (Message Queuing Telemetry Transport) is a lightweight publish/subscribe protocol designed especially for machine-to-machine communication and IoT devices. It is an open protocol best suited for communication over networks with limited bandwidth and in places where the network connection may be intermittent [9]. MQTT became an OASIS standard [11] in 2014 and was later standardized by ISO [12] in 2016. It requires an underlying network such as TCP/IP for connections, which is too complex for simple, small-footprint devices like sensors and actuators [13]. Therefore, in 2008, A. Stanford-Clark and H. Linh Truong from IBM published MQTT-SN (MQTT for Sensor Networks) [13], a pub/sub protocol based on MQTT in which UDP is used instead of the underlying TCP. MQTT uses a topic-based filtering mechanism in which every message carries a topic (subject). The broker uses this topic to determine whether a subscribed client should receive the message (Tables 1 and 2).

Table 1. Publish data packet as part of gateway packet

A client is allowed to subscribe to multiple topics at the same time. The topic is specified as a character string, which the broker uses to filter messages for the different subscribed clients. A topic can contain one or more hierarchical topic levels, separated by a forward slash character "/". For example, a temperature sensor placed in the kitchen on the first floor could publish its data using the hierarchical topic "myhouse/firstfloor/kitchen/temperature". Topics are case sensitive. Wildcard characters are used to subscribe to multiple topics simultaneously. For example, "myhouse/firstfloor/+/temperature" can be used to subscribe to the data generated by all the temperature sensors placed on the first floor.
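The topic filtering described above can be illustrated with a minimal sketch. The function name `topic_matches` is hypothetical, and the multi-level wildcard "#" is part of the MQTT specification rather than of the text above; special handling of topics that start with "$" is deliberately left out.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if a concrete topic matches an MQTT subscription filter.

    Illustrative helper (not from the paper): "+" matches exactly one
    topic level, "#" matches all remaining levels and must come last.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last filter level.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False  # the filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal levels are compared case-sensitively
    # Without "#", both sides must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


kitchen = "myhouse/firstfloor/kitchen/temperature"
print(topic_matches("myhouse/firstfloor/+/temperature", kitchen))
print(topic_matches("myhouse/#", kitchen))
print(topic_matches("MyHouse/#", kitchen))
```

The last call shows the case sensitivity mentioned above: "MyHouse" and "myhouse" are different topic levels.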

Table 2. MQTT gateway structure


1. QoS

The Quality of Service (QoS) defines the reliability of delivery of a specific message between its sender and its receiver. MQTT provides basic end-to-end Quality of Service [14] and has three QoS levels:

QoS level 0 (at most once): Best-effort delivery. The recipient does not acknowledge the message, hence there is no guarantee of delivery.
QoS level 1 (at least once): The message is guaranteed to be delivered to the receiver, so this level is more reliable than QoS level 0. The sender keeps sending the message until it gets a PUBACK acknowledgement packet from the receiver. Because of retransmissions, a message may arrive at the receiver several times.
QoS level 2 (exactly once): The message is guaranteed to be delivered to the receiver, and the intended recipient receives it only once.

The client can choose the QoS level depending on the application and the reliability of the network. As mentioned, MQTT clients do not communicate with each other directly. An MQTT connection is established between a client and the broker [15]. The client sends a CONNECT message to the broker to initiate a connection. The broker responds with a CONNACK acknowledgement message and a status code. After the connection is established, the broker keeps it open until it receives a disconnect message from the client or the connection breaks. A ClientID identifies each MQTT client connected to the MQTT broker. The broker uses this ID to identify the client and track its state. Hence, each client-broker connection should have a unique ID (Tables 3 and 4).
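The "at least once" behaviour of QoS level 1 can be made concrete with a small simulation. This is a sketch under stated assumptions, not an MQTT client: `publish_qos1` and `LossyReceiver` are hypothetical names, the lost acknowledgements are scripted rather than random, and packet identifiers and the DUP flag of the real protocol are omitted.

```python
from typing import Callable, List


def publish_qos1(message: str, deliver: Callable[[str], bool],
                 max_attempts: int = 5) -> int:
    """Resend `message` until `deliver` reports a PUBACK; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        if deliver(message):  # True stands for "PUBACK received"
            return attempt
    raise TimeoutError("no PUBACK received")


class LossyReceiver:
    """Receiver whose PUBACK is lost for the first `lost_acks` deliveries."""

    def __init__(self, lost_acks: int) -> None:
        self.lost_acks = lost_acks
        self.received: List[str] = []

    def __call__(self, message: str) -> bool:
        self.received.append(message)  # the message itself always arrives
        if self.lost_acks > 0:
            self.lost_acks -= 1        # but its acknowledgement is lost
            return False
        return True


receiver = LossyReceiver(lost_acks=2)
attempts = publish_qos1("temp=22.5", receiver)
print(attempts, receiver.received)
```

With two lost acknowledgements the sender needs three attempts and the receiver ends up holding three copies of the same message, which is the duplication that QoS level 2 is designed to rule out.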

Table 3. MQTT client receive


Table 4. MQTT broker to gateway communication

2.2 MQTT-SN

MQTT-SN is designed to be as similar to MQTT as possible, but is architected specifically to suit the peculiarities of the wireless networks typical of sensor environments. Its features include support for short message lengths, low bandwidth and link failures. It is designed to be implemented on low-cost, battery-operated devices and on hardware with resource constraints [11]. MQTT-SN does not need the TCP/IP stack (Figs. 5 and 6).

Fig. 5. MQTT SN architecture

Fig. 6. Decision tree

2.3 Pub/Sub Architecture

In a publish/subscribe (pub/sub) architecture, clients that are interested in certain information register their interest. A client that registers its interest in certain information is called a subscriber, and a client that wants to send out information is called a publisher; hence the name pub/sub model. There is no direct communication between publishers and subscribers. A third component, the broker, handles the connection between them. The broker filters all incoming messages and distributes them to the appropriate subscribers. The broker can filter messages in three different ways: topic based, type based and content based [15]. This paper uses MQTT and MQTT-SN cooperatively to set up the iDispenser in hospitals and healthcare facilities with unreliable network connections.
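A minimal in-memory sketch can make the decoupling between publishers and subscribers concrete. It is an illustration only, not the broker used in this work: the `Broker` class and the topic names are hypothetical, there is no network transport, and filtering is by exact topic match.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Handler = Callable[[str, str], None]


class Broker:
    """Toy topic-based broker: publishers and subscribers never meet directly."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a subscriber's interest in one topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> int:
        """Deliver the payload to every subscriber of the topic; return the count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


inbox: List[Tuple[str, str]] = []
broker = Broker()
broker.subscribe("ward1/temperature", lambda t, p: inbox.append((t, p)))

delivered = broker.publish("ward1/temperature", "22.5")  # one subscriber
ignored = broker.publish("ward1/humidity", "40")         # nobody subscribed
print(delivered, ignored, inbox)
```

The publisher only knows the broker and a topic name, so subscribers can be added or removed without touching the publishing side.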

2.4 Machine Learning

(1) Decision Tree Algorithm: This algorithm classifies attributes and helps select the outcome of the class attribute (Fig. 6). To build a decision tree, both the item attributes and the class attribute are needed. The structure of a decision tree is similar to a tree, where the intermediate nodes describe attributes of the data, the leaf nodes describe the data outcome and the branches carry the attribute values. The decision tree algorithm is commonly used for classification since no domain knowledge is required to construct it. Fig. 6 shows a simple decision tree [16, 19]. Computing the root node is the primary step of the algorithm. There are several methods to choose the root node; Gini impurity and information gain are the main ones. The root node decides which side of the decision tree the data will fall into. Like all other classification methods, decision trees are built using training data and tested using test data [19].

Information Gain: For a dataset D, the information gain is computed using entropy and information [16, 19]. With p_i denoting the probability that a record in D belongs to class i, and m the number of classes, the entropy is

$$Info(D) = -\sum_{i=1}^{m} p_i \log_2(p_i)$$

The formula given below is used to calculate the information of the attribute A, which splits D into v partitions D_1, ..., D_v:

$$Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times Info(D_j)$$

$$Gain(A) = Info(D) - Info_A(D)$$

The information gain of a particular attribute is defined as the difference between the entropy and the information of that attribute. The attribute with the maximum information gain forms the root node, the attributes with the next highest information gain form the next-level nodes, and so on [16, 19].

(2) Naïve Bayes Classifier: It is a probabilistic classifier based on Bayes' theorem. To make computations easier, the classifier assumes that the data features are conditionally independent given the class value (activity label). For example, consider estimating the conditional probability distribution of activity labels Y over the observed features X, i.e. P(Y|X). According to Bayes' theorem, this conditional probability can be calculated as:

$$P(Y|X) = \frac{P(X|Y)\,P(Y)}{P(X)}$$


In the above equation, P(X|Y) gives the probability of observing a set of feature values for a particular activity label, P(Y) is the prior probability distribution over the class values before the evidence is seen, and P(X) is the probability of finding that particular feature vector in the entire dataset.

(3) Sliding Window: This algorithm is used to identify and analyze trends in a data stream. A sliding window performs a common pattern analysis of real-time and continuous data and uses rolling counts of incoming data to measure trending, humidity and temperature [17, Decaying window, 18].

(4) Vector Space Model: The basics of a vector space model are as follows. Both a document and a query are represented as vectors in a high-dimensional space that corresponds to all the keywords, and an approximate similarity measure is used to compute the degree of similarity between the document vector and the query vector. The similarity values are used to rank the documents [9]. With a set of documents d and a set of terms t, every document can be modeled as a vector v in the t-dimensional space $\mathbb{R}^t$; hence the name vector-space model. The term frequency freq(d, t) is the number of times the term t occurs in the document d. The (weighted) term-frequency matrix TF(d, t) measures the association of a term t with the given document d: it is 0 if the term is not present in the document, and nonzero otherwise. There are several ways to specify the weighting for the nonzero entries of such a vector, for example:

$$TF(d,t) = \begin{cases} 0 & \text{if } freq(d,t) = 0 \\ 1 + \log(1 + \log(freq(d,t))) & \text{otherwise} \end{cases}$$

Along with the frequency measure, the inverse document frequency (IDF) also plays an important role. IDF is a scaling factor that shows how important a term t is. If a term t occurs in many documents, its importance is scaled down because of its reduced discriminative power:

$$IDF(t) = \log\frac{1 + |d|}{|d_t|}$$

where d is the collection of documents and d_t is the set of documents that contain the term t. If $|d_t| \ll |d|$, the term has a larger IDF scaling factor, and vice versa. In a complete vector-space model, TF and IDF are combined to give the TF-IDF measure [9]:

$$TF\text{-}IDF(d,t) = TF(d,t) \times IDF(t)$$
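The TF, IDF and TF-IDF formulas above translate directly into a few lines of code. The sketch below assumes natural logarithms (the text does not state the base) and a tiny made-up corpus of symptom keywords; the function names are illustrative.

```python
import math
from typing import List


def tf(document: List[str], term: str) -> float:
    """Weighted term frequency: 0 if absent, else 1 + log(1 + log(freq))."""
    freq = document.count(term)
    if freq == 0:
        return 0.0
    return 1.0 + math.log(1.0 + math.log(freq))


def idf(documents: List[List[str]], term: str) -> float:
    """Inverse document frequency: log((1 + |d|) / |d_t|)."""
    containing = sum(1 for doc in documents if term in doc)
    if containing == 0:
        raise ValueError("term occurs in no document")
    return math.log((1 + len(documents)) / containing)


def tf_idf(documents: List[List[str]], document: List[str], term: str) -> float:
    """TF-IDF(d, t) = TF(d, t) x IDF(t)."""
    return tf(document, term) * idf(documents, term)


# Hypothetical corpus: each document is a list of keywords.
docs = [
    ["fever", "cough", "fever"],
    ["cough", "rash"],
    ["rash", "fatigue"],
]
print(tf_idf(docs, docs[0], "fever"))  # rare term, repeated in the document
print(tf_idf(docs, docs[0], "cough"))  # more common term, occurs once
```

"fever" scores higher than "cough" in the first document because it is both repeated there and found in fewer documents overall.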


(5) Committee Machine: The performance of classification models can usually be improved by combining several models instead of using a single one. For example, we can train M different models. A combination of the models, such as a simple average or a particular voting strategy, determines the winning label for a given data point. The meta-model formed by combining multiple models is sometimes called a committee machine⁴. The well-established Bayesian averaging can be regarded as following the committee machine approach. Assume that a statistical model expresses its inference about the variable y as the predictive probability density P(y|w), where w denotes a vector of model parameters. Also assume that a data set D carries information about the parameter vector w as the probability density P(w|D). We then get [8, 15]

$$P(y|D) = \int P(y|w)\,P(w|D)\,dw \approx \frac{1}{M}\sum_{i=1}^{M} P(y|w_i)$$

where $\{w_i\}_{i=1}^{M}$ are M samples generated from the distribution P(w|D).
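Returning to the decision tree in item (1) of this section, the information-gain criterion for choosing the root node can be worked through on a small example. This is a sketch only: the four records and the attribute names ("humidity", "occupancy", "dispense") are invented for illustration and are not the data set of the paper.

```python
import math
from collections import Counter
from typing import Dict, List


def info(labels: List[str]) -> float:
    """Entropy Info(D) = -sum_i p_i * log2(p_i) of the class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def info_gain(rows: List[Dict[str, str]], attribute: str, target: str) -> float:
    """Gain(A) = Info(D) - Info_A(D) for one candidate attribute A."""
    labels = [row[target] for row in rows]
    partitions: Dict[str, List[str]] = {}
    for row in rows:
        # Group the class labels by the value of the candidate attribute.
        partitions.setdefault(row[attribute], []).append(row[target])
    info_a = sum(len(part) / len(rows) * info(part)
                 for part in partitions.values())
    return info(labels) - info_a


# Hypothetical training records.
rows = [
    {"humidity": "high", "occupancy": "high", "dispense": "yes"},
    {"humidity": "high", "occupancy": "low", "dispense": "yes"},
    {"humidity": "low", "occupancy": "high", "dispense": "no"},
    {"humidity": "low", "occupancy": "low", "dispense": "no"},
]

gains = {a: info_gain(rows, a, "dispense") for a in ("humidity", "occupancy")}
root = max(gains, key=gains.get)
print(gains, root)
```

Here "humidity" separates the two classes perfectly, so it has the larger gain and would be chosen as the root node, while "occupancy" leaves each partition as mixed as the full data set.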

3 System Overview

The architecture of our implemented system is shown in Fig. 9. The gateway, an important component of the system, is the middle layer between the cloud and the wireless sensors. It collects data from the wireless sensor layer, reports it to the cloud, and delivers the commands generated by the cloud to the actuators. The forwarder also forms a critical part of the architecture. It is embedded inside the gateway and bridges a BLE UART to UDP packets. Besides the gateway and the forwarder, the iDispenser is essential to the system. The temperature and humidity sensors are part of the iDispenser; they collect the indoor temperature and humidity. The iDispenser communicates with the local gateway using the MQTT-SN protocol, which runs over non-IP wireless networks; Bluetooth LE (BLE) is used for these non-IP wireless sensor networks. The gateway and the cloud layer communicate using the MQTT protocol, which requires the TCP/IP stack.

3.1 Architecture

The architecture of the system is divided into four major parts: (1) Cloud Layer (2) Gateway plus Forward Layer, (3) Sensor Layer, and (4) Device Layer.

4 Committee Machines: https://pdfs.semanticscholar.org/538f/0fa4649f05b522624a4ba58a644c7145263a.pdf.

Democratization of Intelligent Sensor Network …


Fig. 7. Bluetooth module

(1) Cloud Layer: The Cloud Layer posts the data from the Gateway (Fig. 7) to the backend data center for storage and analysis. Currently, we have implemented a Service Oriented Architecture on the Cloud Layer to communicate with the Web and database servers.
(2) Gateway/Forward Layer: The Gateway collects MQTT packets from clients, processes them, and assembles the payload, which it forwards to the Cloud Layer through a REST call (Tables 5, 6 and 7).
(3) Sensor Layer: The Sensor Layer collects occupancy rates, temperature, noise and other location-specific data and posts the collected data to the MQTT Gateway.

Table 5. MQTT message read at broker

Table 6. Sensor layer to broker

S. Kedari et al.

Table 7. Decision tree code

Fig. 8. Board


(4) Device Layer: The Device Layer collects the occupancy rate at healthcare facilities, temperature variations in the room and other location-specific contextual information. The collected payload is sent to the MQTT Gateway. The Device Layer consists of a Board, a Connectivity Module and Sensors (Figs. 8, 9, 10 and 11).

Fig. 9. Sensor network platform

Fig. 10. PI module

Fig. 11. Decision tree


(1) Board: The board that we use for the FreeRTOS embedded application has an LPC1769 ARM Cortex-M3 microcontroller that runs at a frequency of 100 MHz. The ARM Cortex-M has a 32-bit RISC architecture. It is commonly used in microcontrollers, offers a high level of integration and has low power consumption. The SJOne board has 64 K of RAM, 512 K of on-chip ROM and 1 M of SPI flash memory. It has several GPIOs, two UARTs, two SPIs, I2C and onboard sensors such as an accelerometer, a temperature sensor, a light sensor and IR.
(2) Bluetooth Module: Bluetooth Low Energy (BLE) is a low-power wireless personal area network technology that transmits small chunks of data over shorter distances, whereas classic Bluetooth consumes more power to transmit larger amounts of data over longer distances. The nRF8001 uses a v4.0 BLE radio and has a serial interface that supports a large number of external application microcontrollers. The SJOne board is interfaced with the BLE module (Fig. 7) using UART at a baud rate of 9600 bps.

3.2 System Function

(1) Service APIs: Our experimental setup had two Intelligent Dispenser devices with sensors, connected to a Raspberry Pi that is used as a gateway and forwarder. To simplify the study, we split the architecture into two parts. On the wireless sensor network side, we had the iDispenser devices connected to the Raspberry Pi using Bluetooth LE (BLE). The dispensers act as the MQTT-SN clients, and the Raspberry Pi is used as a gateway that translates MQTT-SN to MQTT and vice versa. A Node.js service implemented on the Raspberry Pi acts as the forwarder and bridges the BLE UART to a UDP socket. The bridge maintains one UDP socket per BLE device [18]. The Eclipse Paho MQTT-SN gateway implementation is used to translate MQTT-SN to MQTT.
(2) Analytics: Given the intermittent connectivity at remote healthcare locations, deploying analytics at the edge makes it possible to process the incoming MQTT payloads and apply supervised decision tree machine learning locally. The MQTT payload contains: room temperature, room humidity, location trend (derived from social sentiment), number of occupants, and the class (dispense or not).
(3) Data: Data collected at a remote healthcare facility:

Room temperature  Room humidity  Occupants  Location trend  Dispense
Medium            High           20         Disease Trend   No
Medium            High           50         No              No
Low               High           70         Disease Trend   Yes
High              Secondary      33         Disease Trend   Yes
High              Primary        5          Disease Trend   Yes
High              Primary        100        No              No
Low               Primary        10         No              Yes
Medium            Secondary      90         Disease Trend   No
Medium            Primary        20         Disease Trend   Yes
High              Secondary      40         Disease Trend   Yes
Medium            Secondary      20         No              Yes
Low               High           50         Disease Trend   Yes
Low               High           60         Disease Trend   Yes
High              Secondary      10         No              No

(4) Model: The supervised model is codified as part of the Broker to make local intelligent decisions. As the model clearly demonstrates, the number of occupants dictates whether to dispense or not. On the other side, we have the Raspberry Pi, which is connected to the cloud. The data is sent to the cloud for further processing and analysis.
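To make the decision tree step more concrete, the sketch below computes the entropy of the Dispense class and the information gain of the categorical attributes over the 14 rows listed in the data table. This is only an illustrative calculation of the kind a decision tree learner performs when choosing a split; it is not the code of Table 7, and the names `ROWS`, `entropy` and `information_gain` are hypothetical. The numeric Occupants column is left out because splitting on it would require a threshold that the paper does not state.

```python
import math
from collections import Counter, defaultdict

# (room temperature, room humidity, location trend, dispense), copied from the data table.
ROWS = [
    ("Medium", "High", "Disease Trend", "No"),
    ("Medium", "High", "No", "No"),
    ("Low", "High", "Disease Trend", "Yes"),
    ("High", "Secondary", "Disease Trend", "Yes"),
    ("High", "Primary", "Disease Trend", "Yes"),
    ("High", "Primary", "No", "No"),
    ("Low", "Primary", "No", "Yes"),
    ("Medium", "Secondary", "Disease Trend", "No"),
    ("Medium", "Primary", "Disease Trend", "Yes"),
    ("High", "Secondary", "Disease Trend", "Yes"),
    ("Medium", "Secondary", "No", "Yes"),
    ("Low", "High", "Disease Trend", "Yes"),
    ("Low", "High", "Disease Trend", "Yes"),
    ("High", "Secondary", "No", "No"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attr_index, label_index=-1):
    """Entropy reduction obtained by splitting the rows on one categorical attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr_index]].append(row[label_index])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[label_index] for row in rows]) - remainder


class_entropy = entropy([row[-1] for row in ROWS])
gains = {name: information_gain(ROWS, i)
         for i, name in enumerate(["temperature", "humidity", "trend"])}
```

A tree learner would split first on the attribute with the largest gain and then repeat the calculation on each resulting subset.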

4 A Case Study

The intelligent sensor network was deployed and tested in a remote healthcare facility in India. Our analysis confirms that having a connected network improves healthcare and overall health outcomes.

5 Conclusion and Future Work

This paper presented a new approach to the Democratization of Intelligent Sensor Network for Low-Connected remote healthcare facilities, a framework to improve population health and epidemiological studies. We firmly believe that MQTT-enabled intelligent sensor networks will be crucial in preventing healthcare infection issues at remote healthcare facilities that are plagued by low or no connectivity and have no data collection and processing capabilities. A decentralized low-cost sensor network enables healthcare facilities to become data-enabled and improves overall capabilities for future generations. We strongly believe that intelligent sensor networks will decrease costs for outpatients as well as save lives.


References

1. WHO: Patient Safety: Health care-associated infections fact sheet. http://www.who.int/gpsc/country_work/gpsc_ccisc_fact_sheet_en.pdf. Accessed 4 Aug 2018
2. Grant, M.: Study finds a lack of internet access is having a big impact in Southern Indiana, 10 May 2018
3. Samuels, K., McClellan, M.B., Kaushal, M., Patel, K., Darling, M.: Closing the rural health connectivity gap: how broadband funding can improve care, 1 April 2015
4. Warshaw, R.: Health disparities affect millions in rural U.S. communities, 31 Oct 2017
5. O'Connor, A., Wellenius, G.: Rural-urban disparities in the prevalence of diabetes and coronary heart disease, Oct 2012; 126(10), 813–820. https://doi.org/10.1016/j.puhe.2012.05.029. Epub 24 Aug 2012
6. Befort, C.A., Nazir, N., Perri, M.G.: Prevalence of obesity among adults from rural and urban areas of the United States: findings from NHANES (2005–2008), 31 May 2012. https://doi.org/10.1111/j.1748-0361.2012.00411.x
7. The Center for Disease Control and Prevention: Urban-rural differences in COPD, 8 March 2018. https://www.cdc.gov/features/copd-rural-areas/index.html
8. WHO: Hospital hygiene and infection control. http://www.who.int/water_sanitation_health/medicalwaste/148to158.pdf. Accessed 6 Aug 2018
9. Chen, W.-J., Gupta, R., Lampkin, V., Robertson, D.M., Subrahmanyam, N.: Responsive mobile user experience using MQTT and IBM MessageSight, IBM Corp., Ed. (2014)
10. Anand, C.: When hospitals infect you, 12 Feb 2017. https://www.thehindu.com/sci-tech/health/When-hospitals-infect-you/article17289370.ece. Accessed 6 Aug 2018
11. "Oasis message queuing telemetry transport (mqtt)," OASIS MQTT v3.1.1 (2014)
12. "Information technology – message queuing telemetry transport (mqtt) v3.1.1," ISO/IEC 20922:2016 (2016)
13. Stanford-Clark, A., Truong, H.L.: MQTT for sensor networks (MQTT-SN). http://mqtt.org/new/wp-content/uploads/2009/06/MQTT-SN_spec_v1.2.pdf. Nov 2013
14. Chen, D., Varshney, P.K.: QoS support in wireless sensor networks: a survey. In: International Conference on Wireless Networks, pp. 227–233 (2004)
15. Hive MQ – Enterprise MQTT Broker, MQTT Essentials, Web. https://www.hivemq.com/blog/mqtt-essentials-part-1-introducing-mqtt
16. Stanford-Clark, A., Hunkeler, U., Truong, H.L.: MQTT-S: A Publish/Subscribe Protocol for Wireless Sensor Networks, IBM Corp., 2013
17. Eugster, P.T., Felber, P.A., Guerraoui, R., Kernmarrec, A.-M.: The many faces of publish/subscribe. ACM Comput. Surv. 35(2), 114–131 (2003)
18. Cabe, B.: ble-uart-to-udp, Github repository (2017). https://github.com/kartben/ble-uart-to-udp

Latency-Aware Distributed Resource Provisioning for Deploying IoT Applications at the Edge of the Network

Cosmin Avasalcai and Schahram Dustdar

Distributed Systems Group, TU Wien, Vienna, Austria
{c.avasalcai, dustdar}@dsg.tuwien.ac.at

Abstract. With the increased success of the Internet of Things (IoT), conventional centralized cloud computing is encountering severe challenges (e.g., high latency, non-adaptive machine-type communication) and has proved insufficient to meet the stringent requirements of IoT applications. Besides requiring fast response times and increased security and privacy, IoT applications lack computational resources at the edge of the network. Motivated to solve these challenges, new technologies are driving a trend that distributes the computational resources and shifts the function of centralized cloud computing to the edge. Several edge computing technologies, such as the edge and fog paradigms, originating from different backgrounds have been emerging to overcome these challenges. However, to fully utilize these limited devices, we need advanced resource management techniques. In this paper, we present a novel distributed resource allocation algorithm with the purpose of enabling seamless integration and deployment of different applications in an IoT infrastructure. The algorithm decides: (i) the mapping of an IoT application at the edge of the network; (ii) the dynamic migration of parts of the application, such that the Service Level Agreement (SLA) is satisfied. Furthermore, we analyze and discuss our approach and its potential to minimize the latency of different IoT applications.

Keywords: Resource management · Edge computing · Fog computing · Internet of Things

1 Introduction

Over the past decades, cloud computing has been adopted in most industries due to its high cost-efficiency and flexibility, achieved through consolidation, in which computing, storage, and network management functions work in a centralized manner. Since the number of connected devices in the IoT has increased dramatically in the last couple of years, generating more and more data, the existing centralized cloud computing architecture is encountering severe challenges. For instance, transferring all this data to the cloud introduces congestion in the network and extra delays in the application. Hence, deploying a real-time IoT application, which requires fast response times and increased security, entirely to the cloud is not an effective strategy anymore. As a result, processing applications closer to the edge of the network has become a necessity [3]. Shifting the storage and computational power from the cloud closer to the edge provides a series of benefits, such as a smaller end-to-end (e2e) delay, better scalability of the network, enhanced privacy and independence from the cloud [14]. Applications such as self-driving cars [6] and aids for people with disabilities [2] benefit from a minimal e2e delay and fast response time. For example, when care services are extended from hospitals to private homes, with a focus on remote monitoring of elderly people or people suffering from chronic diseases, the faster the response in an emergency, the sooner help can be provided (e.g., the command sent to the sensors should be carried out immediately, stabilizing the patient until a nurse arrives). Moreover, there are applications where security and privacy play an important role, like ambient assisted living [5]. Finally, since most of the computation is performed at the edge, only a small portion of the generated data is sent to the cloud for further processing. This greatly reduces the network overhead and preserves correct functionality even when the connection to the cloud is lost. The underlying premise of new paradigms like fog [4] and edge [16] computing is to deploy distributed heterogeneous computational resources closer to the IoT sensors. We differentiate the two by their resource capabilities and location. For example, edge devices are placed closer to the source of data and have limited resources (e.g., limited computational power, limited energy). On the other hand, fog devices have more computational resources and are located between the edge devices and the cloud.

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 377–391, 2020. https://doi.org/10.1007/978-3-030-12388-8_27
However, deploying applications in such a diverse IoT infrastructure, where mobile devices can come and go without prior notice, is impossible without the support of novel resource management techniques. Since the cloud has almost unlimited but faraway resources, resource management comes as a solution to the constraints imposed by the IoT devices. Deploying IoT applications at the edge comes with a set of challenges. The devices in the network are heterogeneous, have limited resources (i.e., limited computational power and energy supply) and are subject to different environments. Moreover, the nodes can be mobile, introducing even more uncertainty in the system, since nodes can enter or leave the network without any prior announcement. In edge computing, the diversity found in such networks and resource management play a significant role in assigning and distributing tasks (generated by local devices) to the remote cloud (the center of the network) or to local servers/devices (the edge of the network). A common approach to resource management in edge computing is to assign tasks to the remote cloud or local servers according to several factors, such as energy and bandwidth consumption, with the final goal of minimizing latency. In this paper, we introduce a decentralized resource management algorithm with the purpose of deploying IoT applications at the edge of the network such that the e2e delay is minimized. The algorithm tackles resource allocation [19] from two different perspectives: where to place an application and when to migrate. This ensures that the system is able to dynamically adapt to the uncertainties of the IoT architecture and provide better availability of applications deployed at the edge of the network. Furthermore, to aid the algorithm in deciding which tasks to migrate, we propose a task ranking system that classifies the tasks on a node based on criteria such as battery level, trustworthiness and communication latency. This is important, especially for handling variations in resource demand while assuring a good quality of service (QoS) [15]. The remainder of the paper is structured as follows: In Sect. 2 we discuss related work on resource management techniques. Section 3 defines the IoT application and architecture considered in this paper. In Sect. 4 we describe the implementation details of our proposed algorithm. Section 5 presents the challenges of such a framework and discusses the evaluation we intend to make. Finally, Sect. 6 concludes the paper and provides an outlook on future work.

2 Related Work

With the advent of new paradigms like edge and fog computing, and the tremendous impact IoT devices are having on people's lives, researchers are proposing new solutions to take advantage of all available resources found closer to the IoT sensors. In the context of using the available computational resources, the authors in [17] present a mapping algorithm that deploys a pre-partitioned application at the edge of the network, reducing the cloud communication and minimizing the latency. Another deployment algorithm, described in [8], is composed of two stages: (i) partition the IoT application into multiple tasks and annotate them with location information, and (ii) place the obtained tasks on multiple edge nodes based on their location. In [9], the authors suggest a cooperative fog platform using a distributed communication model. The purpose of the platform is to achieve better collaboration between multiple static and mobile fog devices. Moreover, to improve the service efficiency of IoT applications, an allocation algorithm selects hosts based on system characteristics. A similar approach is presented in [18], where a service placement optimization algorithm is developed to share fog resources. However, the algorithm will always try to map an application to the deployment node or to a neighbor device. In case of failure, the application is mapped to the cloud. Another important topic in resource allocation that helps adapt to the increased uncertainty is resource migration. In [12], the authors present an algorithm to dynamically migrate virtual machines (VMs) and find the best communication paths based on predictions of user movements. Another solution to service migration is presented in [13], where the authors propose an edge-enabled publish-subscribe middleware that continuously monitors the QoS and transparently migrates clients to another host in close proximity. For our self-adaptation solution, we propose a similar approach that measures the latency of nearby neighbor devices and addresses the unpredictability of mobile nodes. However, in comparison, our approach supports more diverse IoT applications. Multiple other solutions that tackle the challenges of deploying IoT applications at the edge have been proposed. The authors in [10] define algorithms to find the best location to distribute applications in a cloud-assisted vehicular network architecture. Others have proposed adaptive workload management mechanisms [7] or resource estimation techniques [1] to provide services closer to the edge similar to those in the cloud. Furthermore, the authors in [20] introduce a solution to minimize the response time of video analytics applications close to the edge. Most of the aforementioned papers emphasize the placement of edge devices, whilst load distribution is considered a hot topic. In contrast, our proposal provides a more comprehensive resource management solution that is independent of IoT applications and environment. By combining the initial placement of the application at runtime with dynamic migration and neighborhood selection, the edge devices become more intelligent.

3 IoT Infrastructure

The proposed solution is a decentralized algorithm designed to ensure that deployed IoT applications meet their SLA, i.e., the maximum e2e delay for an application to function as intended. The framework is divided into three modules deployed on each computational device in an IoT architecture.

3.1 IoT Architecture

For the architecture, we consider that every edge device, fog device and cloud is connected in a peer-to-peer manner. This changes the pyramidal approach, in which there are four layers (i.e., sensor, edge, fog, and cloud), into a flatter approach in which devices have a direct connection to different cloud providers and to all nearby nodes (see Fig. 1).

Fig. 1. Our system deployed in an IoT architecture (the dispatcher node and the neighbor node each run the latency monitoring, deployment policy and offer generator modules; legend: physical communication, broadcast to neighbors, cloud, edge node, fog node, deploy IoT application from edge node)

3.2 IoT Application Model

Due to the limited computational power of edge devices, we model an IoT application as a Directed Acyclic Graph (DAG) where vertices represent tasks (i.e., sets of instructions that perform a specific computation) and edges show the dependencies between them. Each task is characterized by its computational requirements (i.e., CPU utilization, RAM, storage) and the addresses of its dependent tasks. An example of such an IoT application is presented in Fig. 2. Moreover, to deploy the application in such a heterogeneous IoT network, the tasks are placed in individual Docker containers. A Docker container is a lightweight, stand-alone, executable package that contains everything needed to run the task it encapsulates [11]. Furthermore, since tasks are isolated from the environment, better security and fast and easy deployment are ensured. In this paper, we assume that the application is divided into multiple Docker containers by the developer and sent directly to the deployment edge node once everything is prepared.
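The application model described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Task` fields, their units, the example tasks and the helper names `deployment_order` and `is_dag` are hypothetical, and dependencies are stored as task names rather than network addresses. The standard-library `graphlib` module (Python 3.9 or later) is used to order the tasks and to reject graphs that contain a cycle.

```python
from dataclasses import dataclass, field
from graphlib import CycleError, TopologicalSorter


@dataclass
class Task:
    name: str
    cpu: float          # required CPU utilization (fraction of one core; assumed unit)
    ram_mb: int         # required RAM in MB (assumed unit)
    storage_mb: int     # required storage in MB (assumed unit)
    depends_on: list = field(default_factory=list)  # names of tasks this one waits for


def deployment_order(tasks):
    """Return task names so that every task appears after the tasks it depends on."""
    graph = {t.name: set(t.depends_on) for t in tasks}
    return list(TopologicalSorter(graph).static_order())


def is_dag(tasks):
    """True if the dependency graph has no cycle, i.e. it is a valid application model."""
    try:
        deployment_order(tasks)
        return True
    except CycleError:
        return False


# Hypothetical three-task application: sensor task A feeds B, which feeds actuator task C.
app = [
    Task("A", cpu=0.2, ram_mb=64, storage_mb=10),
    Task("B", cpu=0.4, ram_mb=128, storage_mb=20, depends_on=["A"]),
    Task("C", cpu=0.1, ram_mb=32, storage_mb=5, depends_on=["B"]),
]
order = deployment_order(app)
```

Keeping the resource requirements on each vertex lets a node decide, task by task, whether it has enough free capacity to make an offer.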

4 Resource Management Solution

We design an algorithm to dynamically distribute and adapt the application such that the e2e delay is minimized. In our approach, the algorithm is deployed on every edge and fog node in the IoT architecture and is divided into three modules, each performing a different function, as depicted in Fig. 1. Moreover, the algorithm is designed to work in two different states: (i) a deployment state, which represents the place from where an application is deployed (i.e., the dispatcher node), and (ii) a resource sharing state, in which offers are generated to receive tasks for further processing (i.e., the neighbor node). In the deployment state, the deployment policy is aided by the latency module to find the best placement for the IoT application, while the offer generator module is inactive. In contrast, in the resource sharing state, the offer generator module takes the place of the deployment policy to choose the tasks and compute the offers.

Fig. 2. IoT application model (tasks A to F; legend: task connected to the sensors, task connected to the actuator, normal task)

The functionality of our algorithm is inspired by the rules of a real-world private auction house. Once an item has arrived at the auction house and is ready to be sold, a list is created of the people who are invited to the event. These people are not chosen randomly, since they have to fulfill some criteria and prove that they own the necessary “resources” to pay for the items. Once the item is placed for auction, the interested buyers can place offers within a limited period of time. In the end, the buyer who offered the most wins the object. In our case, the IoT devices found in close proximity to the dispatcher are eligible to participate. However, since each IoT application is different, the latency monitoring module creates a personalized list of invitations from all the neighbors, based on the application's characteristics. When the participants are selected and the list is created, a message is sent to present the application model and start a timer during which offers can be submitted. During this period, the participating nodes send offers for one or more tasks depending on their available resources. Once the time has expired, all the received offers are collected and the winners are announced. Finally, each winner receives the desired tasks.

4.1 Latency Monitoring Module

The primary function of the latency monitoring module is to filter the nearby IoT devices and create, at runtime, a list of participants that meet the IoT application requirements. Since each IoT application is different, the generated list of participants is unique. The difference comes from the filter threshold used by the module. In this case, the threshold depends on the application's maximum e2e delay and ensures that the communication latency from the dispatcher to the participants does not exceed a certain percentage of it. The threshold is computed as in Eq. 1:

$$T_{max} = e2e_{delay} \times p \qquad (1)$$

where $e2e_{delay}$ is the maximum e2e delay of the application and $p$ represents the percentage taken from $e2e_{delay}$. Furthermore, the threshold provides an upper bound on the accepted latency of each participant. Thus, the algorithm is in charge of monitoring the latency to the neighbor nodes by measuring the time between sending a message and receiving it back, similar to [13]. Hence, this technique is well suited to monitoring mobile devices and dealing with their uncertainty. For example, if the latency of such a mobile device increases, this indicates that the mobile device has left the area. The entire functionality of this module is presented in Fig. 3. The second function of this module is to keep track of the nodes that joined the network and to provide an easy way to enter the neighborhood of an edge device. Consequently, each node has knowledge of all the computational resources in its vicinity. Since the module knows exactly how old a node is (i.e., the period of time since the node joined the network), a better decision can be made when inviting participants. A list of participants is presented in Fig. 1.

4.2 Offer Generator Module

This module is used only when a device is in the resource sharing state, and its purpose is to create and send offers to the dispatcher. Each offer reflects the resources that a node wishes to share and represents the Worst-Case Response Time (WCRT) of the selected tasks. Moreover, the offers are compared to a computed threshold to filter them and send only the ones that could improve the overall e2e delay. This strategy lowers the network communication and at the same time helps the neighbor node make better decisions regarding its resources. The threshold is calculated based on two parameters, the number of selected tasks and a percentage of the e2e delay. Through this approach, we ensure that only very good offers are sent to the dispatcher. The mathematical formula is presented in Eq. 2:

$$T_{max} = e2e_{delay} \times p \times n \qquad (2)$$

where $n$ represents the number of selected tasks and $p$ represents the percentage taken from $e2e_{delay}$. The algorithm starts executing when an invitation message is received. At the same time, the algorithm starts a counter that tracks how long it takes to generate the current offer. Next, based on the available resources, the module selects a number of tasks for which an offer is prepared. However,


[Flowchart of the latency monitoring module: receive request; compute threshold (t); choose a nearby node; compute latency (l); are there any nodes left? (Yes/No)]
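The flowchart steps (receive request, compute threshold, visit each nearby node, compute its latency) together with the two threshold formulas of Sect. 4 can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' code: the function names, the millisecond unit and the example node names are hypothetical, latencies are assumed to have been measured already, and the acceptance test `latency <= threshold` is an assumption because the exact comparison in the flowchart is not recoverable from the text.

```python
def latency_threshold(e2e_delay_ms, p):
    """Eq. 1: upper bound on the accepted latency of a participant."""
    return e2e_delay_ms * p


def offer_threshold(e2e_delay_ms, p, n_tasks):
    """Eq. 2: upper bound used by a neighbor to decide whether an offer is worth sending."""
    return e2e_delay_ms * p * n_tasks


def select_participants(neighbor_latency_ms, e2e_delay_ms, p):
    """Keep the neighbors whose measured latency does not exceed the threshold.

    neighbor_latency_ms maps a node name to its measured round-trip latency in ms.
    The <= comparison is an assumption of this sketch.
    """
    t = latency_threshold(e2e_delay_ms, p)
    return [node for node, latency in neighbor_latency_ms.items() if latency <= t]


# Hypothetical measurements for an application with a 100 ms maximum e2e delay and p = 0.2.
measured = {"edge-1": 5.0, "fog-1": 18.0, "mobile-1": 42.0}
participants = select_participants(measured, e2e_delay_ms=100.0, p=0.2)
```

Because the latency table is rebuilt from fresh measurements on every request, a mobile node whose latency has grown beyond the threshold simply drops out of the next invitation list.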

If Infor(Xmax, C) > 0, then this variable provides additional information about C and is inserted in the network and in X. Its parent set is calculated with PARENTS(Xmax, X ∪ {C}, Πi). In theory, C ∈ Πi always holds, since otherwise Infor(Xmax, C) = 0, although, due to the greedy nature of the procedure, C ∉ Πi remains a remote possibility. In other words, the variable giving the most information is added to the network provided that the information is positive, and its parent set is taken to be the best parent set provided by this function. Since the information is positive, the class variable can be assumed to be included in the parent set. The algorithm ends when Infor(Xmax, C) ≤ 0 [8].

3 Experimentation

3.1 Analysis with UCI Database

A set of UCI data is analyzed, which refers to the performance of secondary school students in two Portuguese schools [2]. The attributes of the data include the student's grades and personal characteristics as well as social and demographic information. This information was obtained from school reports and questionnaire-type surveys. The data sets provided refer to performance in two different subjects: mathematics and Portuguese [2]. The data set was modeled under binary and five-level classification and regression tasks. The final grade G3 has a strong correlation with attributes G2 and G1, since G3 is the final grade, while G1 and G2 correspond to the grades in the 1st and 2nd courses. For this reason, variables G1 and G2 have been removed from our study. Tables 1, 2, 3 and 4 show the variables with their discretization proposals, compliance, themes and attribute information. Training is then carried out, analyzing the educational level of the Portuguese population in the basic classes of Mathematics and Portuguese.

B. Oviedo and C. Zambrano-Vega

Table 1. Variables and description of compliance in UCI

Variable  Description
A         GP = 1; MS = 0
B         Female = 1; Male = 0
D         Urban = 1; Rural = 0
F         Yes = 1; No = 0
L         Mother = 1; Father = 0

Table 2. Courses related to the subjects of the courses

Grade  Description    Type     Observation
G3     Final courses  Numeric  ≤10 fail; >10 meet the goal

4 Results

4.1 Classification Using Weka with the Data of the UCI Mathematics Course

Figs. 1 and 2 show the results obtained for each of the attributes with reference to the class G3 (final grade). Tables 5, 6, 7 and 8 contain the descriptive analysis of the variables.

Educational Database Analysis Using Simple Bayesian Classifier

Table 3. Variables and attribute information

Variable  Name            Description
A         School          Student's school
B         Sex             Student's sex
C         Age             Student's age
D         Address         Student's home address
E         Famtam          Size of the student's family
F         Pestado         Live with parents
G         Medu            Mother's level of education
H         Pedu            Father's level of education
I         Mwork           Mother's occupation
J         Ftrabajo        Father's occupation
K         Reason          Reason why school is changed
L         Representative  Student's representative
M         Transfers       Time it takes to go from home to school
N         Tstudy          Time dedicated to study in the week
O         Failures        Number of failures in previous classes
P         Esupport        Extra educational support
Q         Fapoyo          Family support for studies
R         Payment         Payment for classes outside the course
S         Activities      Extra curricular activities
T         Nursing         Use of school nursing
U         EduSuperior     Wants to pursue higher education
V         Internet        Internet access from home
W         Rsentimental    Keeps a love relationship
X         Rfamily         Quality of the family relationship
Y         Free time       Has a lot of free time after school
Z         Oscio           Share with friends
AA        dalcohol        Drinks alcohol on weekdays
AB        fsalcohol       Drinks alcohol on weekends
AC        Health          Health situation
AD        Absence         Number of absences to school

Using Bayesian classifiers for UCI data math course

Given that students may or may not pass the course, they should be tested. The final result can be positive or negative. If the result is positive, the student is determined to have a high chance of passing. Sensitivity and specificity are two probability values that quantify the diagnostic reliability of a test. Results were obtained using the Naive Bayes and BayesNet classifiers, the latter with different alternatives such as K2 with 1 and 5 parents, TAN, and Hill Climber with 1 and 5 parents.
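For reference, sensitivity is the true positive rate, TP / (TP + FN), and specificity is the true negative rate, TN / (TN + FP). The short sketch below computes both, plus accuracy, from the four cells of a confusion matrix. The function name `confusion_metrics` and the counts in the example are hypothetical and are not results from this study.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # share of actual positives that are detected
    specificity = tn / (tn + fp)                 # share of actual negatives that are detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # share of all cases classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts, chosen only to illustrate the arithmetic.
sens, spec, acc = confusion_metrics(tp=40, fn=10, tn=30, fp=20)
```

Reporting both rates matters when the classes are unbalanced, since a classifier can reach a high accuracy while missing most cases of the minority class.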

Table 4. Variables and type of attributes

Variable  Type     Discretization proposal
A         Binary   GP = Gabriel Pereira = 0, MS = Mousinho da Silveira = 1
B         Binary   F = female = 0 or M = male = 1
C         Numeric  15–22
D         Binary   U = urban = 0, R = rural = 1
E         Binary   LE3 = X ≤ 3, GT3 = X > 3; LE3 = 0, GT3 = 1
F         Binary   T = they live together = 1 or A = apart = 0
G         Numeric  0 = null, 1 = primary (4th grade), 2 = 5th to 9th, 3 = high school or 4 = higher education
H         Numeric  0 = null, 1 = primary (4th grade), 2 = 5th to 9th, 3 = high school or 4 = higher education
I         Nominal  Teacher = 0, related to health = 1, civil services = 2, home = 3, other = 4
J         Nominal  Teacher = 0, related to health = 1, civil services = 2, home = 3, other = 4
K         Nominal  Close to home = 0, school reputation = 1, course preference = 2, other = 3
L         Nominal  Mother = 0, father = 1, other = 2
M         Numeric  1 = ≤15 min, 2 = 15 to 30 min, 3 = 30 min to 1 h, 4 = ≥1 h
N         Numeric  1 = ≤2 h, 2 = 2 to 5 h, 3 = 5 to 10 h, 4 = ≥10 h
O         Numeric  n if 1 ≤ n < 3, else 4
P         Binary   Yes = 1 or no = 0
Q         Binary   Yes = 1 or no = 0
R         Binary   Yes = 1 or no = 0
S         Binary   Yes = 1 or no = 0
T         Binary   Yes = 1 or no = 0
U         Binary   Yes = 1 or no = 0
V         Binary   Yes = 1 or no = 0
W         Binary   Yes = 1 or no = 0
X         Numeric  From 1 = very bad to 5 = excellent
Y         Numeric  From 1 = very little to 5 = a lot
Z         Numeric  From 1 = very little to 5 = a lot
AA        Numeric  From 1 = very little to 5 = a lot
AB        Numeric  From 1 = very little to 5 = a lot
AC        Numeric  From 1 = very bad to 5 = very good
AD        Numeric  0 = 0; 1–15 = 1; 16–30 = 2; 31–45 = 3; 46–60 = 4; 61–75 = 5


Fig. 1. Results obtained by each of the attributes in reference to the class

Fig. 2. Results obtained by each of the attributes in reference to the class

As Table 9 shows, we worked with 395 cases, and BayesNet with TAN achieves the best classification, with 67.0886% of cases correctly classified and a true negative rate of 0.591. It identifies the students who will pass the mathematics course in 74.2% of cases. According to Fig. 3, when working with BayesNet with Hill Climber, the class variable (G3) is directly related to whether or not the student wants to go to university (U). It is also directly related to the number of failures in previous classes (O). On the other hand, the parents' level of education influences the type of work they have, but these variables are not directly related to student failure. Social variables are also strongly related to each other, as

Table 5. Variables and descriptive analysis

| Variable | Description | Amount | Percentage |
|---|---|---|---|
| A | GP | 349 | 88.35 |
| A | MS | 46 | 11.65 |
| B | F | 208 | 52.66 |
| B | M | 187 | 47.34 |

is the case when a student goes out with friends (Z) because he or she has free time after school (Y) and consumes alcohol on weekends (AB).

Using tree classifiers for UCI mathematics data

Results were obtained using the J48 and Random Forest tree classifiers. As Table 10 shows, the J48 tree classifier correctly classifies 59.4937% of the 395 cases. Random Forest correctly classifies 59.7468% of the cases and identifies the students who pass the course with 66% certainty. The random forest consists of 100 trees, each built with 5 random features.

Using classification rules for UCI mathematics data

Results were obtained using ZeroR classification rules and decision tables. As Table 11 shows, ZeroR correctly classifies 52.9114% of the 395 cases, while decision tables improve this percentage to 63.7975%, with 77.90% certainty in identifying the students who will pass the course. Table 12 compares the results obtained with the different algorithms on the UCI data for students taking the mathematics course.

4.2 Classification Using Weka with the Data of the UCI Portuguese Course

Figs. 4 and 5 show the results obtained by each of the attributes in reference to the class G3 (final grade). On this basis, Tables 13, 14, 15 and 16 were built, which contain the descriptive analysis of the variables. The Portuguese course data comprise a total of 649 students with the same attributes as those of the mathematics course.

Using Bayesian classifiers for Portuguese UCI data

To determine whether or not a student is likely to complete his or her studies, each student is tested, and the result may be positive or negative. A positive result indicates that the student has a high chance of passing. Results were obtained using Naive Bayes and BayesNet classifiers with different alternatives such as K2 with 1 and 5 parents, TAN, and Hill Climber with 1 and 5 parents.

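Table 6. Variables and descriptive analysis

| Variable | Description | Amount | Percentage |
|---|---|---|---|
| C | 15 | 82 | 20.76 |
| C | 16 | 104 | 26.33 |
| C | 17 | 98 | 24.81 |
| C | 18 | 82 | 20.76 |
| C | 19 | 24 | 6.08 |
| C | 20 | 3 | 0.76 |
| C | 21 | 1 | 0.25 |
| C | 22 | 1 | 0.25 |
| D | U | 307 | 77.72 |
| D | R | 88 | 22.28 |
| E | GT3 | 281 | 71.14 |
| E | LE3 | 114 | 28.86 |
| F | A | 41 | 10.38 |
| F | T | 354 | 89.62 |
| G | 0 | 3 | 0.76 |
| G | 1 | 59 | 14.94 |
| G | 2 | 103 | 26.08 |
| G | 3 | 99 | 25.06 |
| G | 4 | 131 | 33.16 |
| H | 0 | 2 | 0.51 |
| H | 1 | 82 | 20.76 |
| H | 2 | 115 | 29.11 |
| H | 3 | 100 | 25.32 |
| H | 4 | 96 | 24.30 |
| I | Home | 59 | 14.94 |
| I | Health | 34 | 8.61 |
| I | Other | 141 | 35.70 |
| I | Services | 103 | 26.08 |
| I | Teacher | 58 | 14.69 |
| J | Home | 29 | 7.34 |
| J | Health | 217 | 54.94 |
| J | Other | 111 | 28.10 |
| J | Services | 18 | 4.56 |
| J | Teacher | 20 | 5.06 |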

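Table 7. Variables and descriptive analysis

| Variable | Description | Amount | Percentage |
|---|---|---|---|
| K | Course | 145 | 36.71 |
| K | Other | 36 | 9.11 |
| K | Home | 109 | 27.60 |
| K | Reputation | 105 | 26.59 |
| L | Mother | 273 | 69.11 |
| L | Father | 90 | 22.78 |
| L | Other | 32 | 8.10 |
| M | 1 | 257 | 65.06 |
| M | 2 | 107 | 27.09 |
| M | 3 | 23 | 5.82 |
| M | 4 | 8 | 2.03 |
| N | 1 | 105 | 26.58 |
| N | 2 | 198 | 50.13 |
| N | 3 | 65 | 16.46 |
| N | 4 | 27 | 6.84 |
| O | 0 | 312 | 78.99 |
| O | 1 | 50 | 12.66 |
| O | 2 | 17 | 4.30 |
| O | 3 | 16 | 4.05 |
| P | Yes | 51 | 12.91 |
| P | No | 344 | 87.09 |
| Q | No | 153 | 38.73 |
| Q | Yes | 242 | 61.27 |
| R | No | 214 | 54.18 |
| R | Yes | 181 | 45.82 |
| S | No | 194 | 49.11 |
| S | Yes | 201 | 50.89 |
| T | Yes | 314 | 79.49 |
| T | No | 81 | 20.51 |
| U | Yes | 375 | 94.94 |
| U | No | 20 | 5.06 |
| V | No | 66 | 16.71 |
| V | Yes | 329 | 83.29 |
| W | No | 263 | 66.59 |
| W | Yes | 132 | 33.41 |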

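Table 8. Variables and descriptive analysis

| Variable | Description | Amount | Percentage |
|---|---|---|---|
| X | 1 | 8 | 2.03 |
| X | 2 | 18 | 4.56 |
| X | 3 | 68 | 17.22 |
| X | 4 | 195 | 49.37 |
| X | 5 | 106 | 26.84 |
| Y | 1 | 19 | 4.81 |
| Y | 2 | 64 | 16.20 |
| Y | 3 | 157 | 39.75 |
| Y | 4 | 115 | 29.11 |
| Y | 5 | 40 | 10.13 |
| Z | 1 | 23 | 5.82 |
| Z | 2 | 103 | 26.08 |
| Z | 3 | 130 | 32.91 |
| Z | 4 | 86 | 21.77 |
| Z | 5 | 53 | 13.42 |
| AA | 1 | 276 | 69.87 |
| AA | 2 | 75 | 18.99 |
| AA | 3 | 26 | 6.58 |
| AA | 4 | 9 | 2.28 |
| AA | 5 | 9 | 2.28 |
| AB | 1 | 151 | 38.23 |
| AB | 2 | 85 | 21.52 |
| AB | 3 | 80 | 20.25 |
| AB | 4 | 51 | 12.91 |
| AB | 5 | 28 | 7.09 |
| AC | 1 | 47 | 11.90 |
| AC | 2 | 45 | 11.39 |
| AC | 3 | 91 | 23.04 |
| AC | 4 | 66 | 16.71 |
| AC | 5 | 146 | 36.96 |
| AD | 0 | 115 | 29.11 |
| AD | 1 | 247 | 62.53 |
| AD | 2 | 28 | 7.09 |
| AD | 3 | 2 | 0.51 |
| AD | 4 | 2 | 0.51 |
| AD | 5 | 1 | 0.25 |
| G3 | 0 | 186 | 47.09 |
| G3 | 1 | 209 | 52.91 |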

Table 9. Results obtained with classifiers, UCI mathematics data

| Classifier | Classified correctly (%) | TN rate | TP rate |
|---|---|---|---|
| NaiveBayes | 65.8228 | 0.575 | 0.732 |
| BayesNet with K2, 1 parent | 66.0759 | 0.581 | 0.732 |
| BayesNet with K2, 5 parents | 66.0759 | 0.597 | 0.718 |
| BayesNet with TAN | 67.0886 | 0.591 | 0.742 |
| BayesNet with Hill Climber, 1 parent | 61.7722 | 0.468 | 0.751 |
| BayesNet with Hill Climber, 5 parents | 58.9873 | 0.409 | 0.751 |

Fig. 3. Network obtained with BayesNet classifier with Hill Climber with only one parent


Table 10. Results obtained with tree classifiers for UCI mathematics

| Classifier | Classified correctly (%) | TN rate | TP rate |
|---|---|---|---|
| J48 | 59.4937 | 0.554 | 0.632 |
| Random forest | 59.7468 | 0.527 | 0.66 |

Table 11. Results obtained with different classification rules for UCI mathematics

| Classifier | Classified correctly (%) | TN rate | TP rate |
|---|---|---|---|
| ZeroR | 52.9114 | 0 | 1 |
| Decision table | 63.7975 | 0.457 | 0.779 |

Table 12. Results with the database of students of the UCI mathematics course

| Data | SBND BDE | SBND BIC | SBND Ak | SBND K2 | BAN BDEu | BAN BIC |
|---|---|---|---|---|---|---|
| UCIMat | 60.558 | 63.103 | 61.532 | 57.237 | 63.25 | 62.994 |

| Data | BAN K2 | RPDag BDEu | RPDag BIC | RPDag K2 | TAN | NaiveBayes |
|---|---|---|---|---|---|---|
| UCIMat | 63.256 | 63.859 | 63.859 | 59.25 | 61.551 | 66.564 |

Fig. 4. Results obtained by each of the attributes in reference to the class

As Table 17 shows, we worked with 649 cases, and the classifier with the highest percentage of correctly classified cases is BayesNet with Hill Climber with a maximum of 5 parents, at 79.1988%, with a true negative rate of 0.569; it identifies the students who will pass the Portuguese course in 88.90% of cases. Table 17 also reports the true positive rate, which represents the sensitivity, and the true negative rate, which represents the specificity.
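As a rough illustration of how the rates reported in Table 17 relate to each other, the following Python sketch computes accuracy, sensitivity (TP rate) and specificity (TN rate) from a confusion matrix. The counts are an assumption: they are reconstructed here from the rates reported for BayesNet with Hill Climber (5 parents) and from the G3 class totals (452 pass, 197 fail), and are not taken from the original Weka output. The function name `classification_rates` is hypothetical.

```python
def classification_rates(tp, fn, tn, fp):
    """Return accuracy, sensitivity (TP rate) and specificity (TN rate)."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # share of passing students detected
        "specificity": tn / (tn + fp),   # share of failing students detected
    }


# Assumed counts, reconstructed from Table 17 (Hill Climber, 5 parents):
# 452 students pass (positive class), 197 fail (negative class).
rates = classification_rates(tp=402, fn=50, tn=112, fp=85)

for name, value in rates.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts, the three values agree with the 79.1988%, 0.889 and 0.569 reported for this classifier, which shows that a high accuracy can coexist with a fairly low specificity when the classes are unbalanced.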

Fig. 5. Results obtained by each of the attributes in reference to the class

Table 13. Variables and descriptive analysis

| Variable | Description | Quantity | Percentage |
|---|---|---|---|
| A | GP | 423 | 65.18 |
| A | MS | 226 | 34.82 |
| B | F | 383 | 59.01 |
| B | M | 266 | 40.99 |

In this case, the variables representing the number of absences (AD) and whether the student consumes alcohol on weekdays (AA) are directly related to the class variable (G3). The social variables are still related to each other, as in the case of the mathematics course.

Using tree classifiers for Portuguese UCI data

Results were obtained using the J48 and Random Forest tree classifiers. As Table 18 shows, the J48 tree classifier correctly classifies 78.2743% of the 649 cases. The random forest consists of 100 trees, each built with 5 random features, and correctly classifies 77.0416% of the cases. In this case, J48 achieves the higher percentage of correctly classified cases, with an 88.30% effectiveness in identifying the students who will pass the Portuguese course.

Using classification rules for Portuguese UCI data

Results were obtained using ZeroR classification rules and decision tables. As Table 19 shows, ZeroR correctly classifies 69.6456% of the 649 cases, while decision tables improve this percentage to 80.1233%, reaching a 92.70% effectiveness in identifying the students who will pass the course.

An experiment was carried out in which all the algorithms previously evaluated separately were tested together, using a cross

Table 14. Variables and descriptive analysis

| Variable | Description | Quantity | Percentage |
|---|---|---|---|
| C | 15 | 112 | 17.26 |
| C | 16 | 177 | 27.27 |
| C | 17 | 179 | 27.58 |
| C | 18 | 140 | 21.57 |
| C | 19 | 32 | 4.93 |
| C | 20 | 6 | 0.92 |
| C | 21 | 2 | 0.31 |
| C | 22 | 1 | 0.15 |
| D | U | 452 | 69.65 |
| D | R | 197 | 30.35 |
| E | GT3 | 192 | 29.59 |
| E | LE3 | 457 | 70.41 |
| F | A | 80 | 12.33 |
| F | T | 569 | 87.67 |
| G | 0 | 6 | 0.92 |
| G | 1 | 143 | 22.03 |
| G | 2 | 186 | 28.66 |
| G | 3 | 139 | 21.42 |
| G | 4 | 175 | 26.96 |
| H | 0 | 7 | 1.08 |
| H | 1 | 174 | 26.81 |
| H | 2 | 209 | 32.20 |
| H | 3 | 131 | 20.18 |
| H | 4 | 128 | 19.72 |
| I | Home | 72 | 11.09 |
| I | Health | 48 | 7.40 |
| I | Other | 136 | 20.96 |
| I | Services | 135 | 20.80 |
| I | Teacher | 258 | 39.75 |
| J | Home | 36 | 5.55 |
| J | Health | 23 | 3.54 |
| J | Other | 181 | 27.89 |
| J | Services | 42 | 6.47 |
| J | Teacher | 367 | 56.55 |

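Table 15. Variables and descriptive analysis

| Variable | Description | Quantity | Percentage |
|---|---|---|---|
| K | Course | 149 | 22.96 |
| K | Other | 143 | 22.03 |
| K | Home | 285 | 43.1 |
| K | Reputation | 72 | 11.09 |
| L | Mother | 455 | 70.11 |
| L | Father | 153 | 23.57 |
| L | Other | 41 | 6.32 |
| M | 1 | 366 | 56.39 |
| M | 2 | 213 | 32.82 |
| M | 3 | 54 | 8.32 |
| M | 4 | 16 | 2.47 |
| N | 1 | 212 | 32.67 |
| N | 2 | 305 | 47.00 |
| N | 3 | 97 | 14.95 |
| N | 4 | 35 | 5.39 |
| O | 0 | 549 | 84.59 |
| O | 1 | 70 | 10.79 |
| O | 2 | 16 | 2.47 |
| O | 3 | 14 | 2.16 |
| P | Yes | 581 | 89.52 |
| P | No | 68 | 10.48 |
| Q | No | 251 | 38.67 |
| Q | Yes | 398 | 61.33 |
| R | No | 610 | 93.99 |
| R | Yes | 39 | 6.01 |
| S | No | 334 | 51.46 |
| S | Yes | 315 | 48.54 |
| T | Yes | 128 | 19.72 |
| T | No | 521 | 80.28 |
| U | Yes | 69 | 10.63 |
| U | No | 580 | 89.37 |
| V | No | 151 | 23.27 |
| V | Yes | 498 | 76.73 |
| W | No | 410 | 63.17 |
| W | Yes | 239 | 36.83 |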

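Table 16. Variables and descriptive analysis

| Variable | Description | Quantity | Percentage |
|---|---|---|---|
| X | 1 | 22 | 3.39 |
| X | 2 | 29 | 4.47 |
| X | 3 | 101 | 15.56 |
| X | 4 | 317 | 48.84 |
| X | 5 | 180 | 27.73 |
| Y | 1 | 45 | 6.93 |
| Y | 2 | 107 | 16.49 |
| Y | 3 | 251 | 38.67 |
| Y | 4 | 178 | 27.43 |
| Y | 5 | 68 | 10.48 |
| Z | 1 | 48 | 7.40 |
| Z | 2 | 145 | 22.34 |
| Z | 3 | 205 | 31.59 |
| Z | 4 | 141 | 21.73 |
| Z | 5 | 110 | 16.95 |
| AA | 1 | 451 | 69.49 |
| AA | 2 | 121 | 18.64 |
| AA | 3 | 43 | 6.63 |
| AA | 4 | 17 | 2.62 |
| AA | 5 | 17 | 2.62 |
| AB | 1 | 247 | 38.06 |
| AB | 2 | 150 | 23.11 |
| AB | 3 | 120 | 18.49 |
| AB | 4 | 87 | 13.41 |
| AB | 5 | 45 | 6.93 |
| AC | 1 | 90 | 13.87 |
| AC | 2 | 78 | 12.02 |
| AC | 3 | 124 | 19.11 |
| AC | 4 | 108 | 16.64 |
| AC | 5 | 249 | 38.37 |
| AD | 0 | 244 | 37.60 |
| AD | 1 | 384 | 59.17 |
| AD | 2 | 20 | 3.08 |
| AD | 3 | 1 | 0.15 |
| AD | 4 | 0 | 0.00 |
| AD | 5 | 0 | 0.00 |
| G3 | 0 | 197 | 30.35 |
| G3 | 1 | 452 | 69.65 |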

Table 17. Results obtained with Portuguese UCI data classifiers

| Classifier | Classified correctly (%) | TN rate | TP rate |
|---|---|---|---|
| NaiveBayes | 78.1202 | 0.635 | 0.845 |
| BayesNet with K2, 1 parent | 78.1202 | 0.64 | 0.843 |
| BayesNet with K2, 5 parents | 77.6579 | 0.599 | 0.854 |
| BayesNet with TAN | 77.3498 | 0.574 | 0.861 |
| BayesNet with Hill Climber, 1 parent | 75.6549 | 0.614 | 0.819 |
| BayesNet with Hill Climber, 5 parents | 79.1988 | 0.569 | 0.889 |

Table 18. Results obtained with tree classifiers for Portuguese UCI

| Classifier | Classified correctly (%) | TN rate | TP rate |
|---|---|---|---|
| J48 | 78.2743 | 0.553 | 0.883 |
| Random forest | 77.0416 | 0.365 | 0.947 |

Table 19. Results obtained with different classification rules for Portuguese UCI

| Classifier | Classified correctly (%) | TN rate | TP rate |
|---|---|---|---|
| ZeroR | 69.6456 | 0 | 1 |
| Decision table | 80.1233 | 0.513 | 0.927 |

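Table 20. Experiment with all Portuguese UCI data algorithms

| Description | Value |
|---|---|
| Analyzing | Percentage correct |
| Data sets | 1 |
| Result sets | 10 |

| Algorithm | UCI port (% correct) | Standard deviation |
|---|---|---|
| K2-5p | 76.86 | 4.69 |
| K2-1p | 77.80 | 5.08 |
| TAN | 76.61 | 5.07 |
| Hill-1p | 76.61 | 4.95 |
| Hill-5p | 78.79 | 4.36 |
| NBayes | 77.89 | 5.12 |
| ZeroR | 69.65 | 0.65 |
| Decision table | 79.94 | 4.14 |
| J48 | 77.92 | 4.36 |
| RF | 77.57 | 4.44 |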

validation with 10 folds to compare the percentage of correctly classified cases and its standard deviation, which produced Table 20. When all 10 algorithms are entered in the Weka analyzer, decision tables yield the highest percentage of correctly classified cases (79.94%), with a standard deviation of 4.14. A comparison is also made of the results obtained with the following algorithms: SBND BDE, SBND BIC, SBND Ak, SBND K2, BAN BDEu, BAN BIC, BAN K2,

Table 21. Results with a database of students of the Portuguese UCI

| Data | SBND BDEu | SBND BIC | SBND Akaike | SBND K2 | BAN BDEu | BAN BIC |
|---|---|---|---|---|---|---|
| UCIPort | 79.661 | 79.815 | 76.269 | 79.351 | 78.572 | 77.493 |

| Data | BAN K2 | RPDag BDEu | RPDag BIC | RPDag K2 | TAN | NaiveBayes |
|---|---|---|---|---|---|---|
| UCIPort | 73.793 | 79.343 | 77.959 | 73.038 | 78.120 | 77.651 |

RPDag BDE, RPDag BIC, RPDag K2, TAN and NaiveBayes, using the UCI data for students taking the Portuguese course. The results obtained for SBND are very good in general and superior to those of other Bayesian classifiers, as can be seen in Table 21.

5 Conclusions

In this paper we have introduced a Bayesian classifier known as SBND, which is based on quickly obtaining a Markov boundary that is easy to learn and very competitive. SBND was applied to the analysis of educational data. This classifier is fast to learn and very competitive with other known classifiers. Several experiments were conducted with the UCI database of two Portuguese schools. The problem of student dropout analyzed here is complex and difficult, and it has been necessary to use methods that combine several factors (Bayesian classifiers) in order to obtain some improvement over the trivial classifier, which predicts that no student drops out. For future work, the costs of incorrect classifications must be included in the problem, since a false positive is not the same as a false negative. If we consider the cost of a false negative to be greater than that of a false positive, we could detect more students who would drop out, and the number of students considered at risk of dropping out would increase.


Two Approaches to Country Risk Evaluation

Ramin Rzayev1, Sevinj Babayeva1(✉), Inara Rzayeva2, and Adila Ali3

1 Department of Information Systems, Institute of Control Systems, ANAS ICS, Baku, Azerbaijan
{raminrza,babayevasevinj}@yahoo.com
2 Department of International Economics, Azerbaijan State University of Economics, UNEC, Baku, Azerbaijan
[email protected]
3 Department of MSc Business Analytics, University College London, UCL, London, UK
[email protected]

Abstract. Two approaches to evaluating country risk levels on the basis of expert judgments are considered: weighted attribute estimates and fuzzy inference. Both approaches are used to obtain the final estimates of the country risk levels for an arbitrary set of alternatives from expert conclusions regarding the factors of country risk. The study concludes with a comparative analysis of the final country risk estimates.

Keywords: Country risk · Estimate · Expert conclusion · Concordance coefficient · Fuzzy set · Fuzzy conclusion

1 Introduction

Country risk (CR) is a multifactor category characterized by a combined system of financial, economic, socio-political, and legal factors that distinguishes the market of any country. According to the degree of risk, all countries are ranked by quantitative assessments of CR levels. A consolidated risk indicator R is used, which aggregates the relative influence of the considered CR factors (variables) $x_i$ (i = 1, ..., n) through the function $R = R(x_1, x_2, \ldots, x_n)$. Ranking countries by degree of CR includes the following stages:

• selection of the financial, economic, socio-political and legal variables of the CR;
• identification of the weights of the selected CR variables, based on their relative impact on the CR-level;
• expert evaluation of the CR-factors using the expert scale;
• determination of a weighted index reflecting the CR-level.

Currently, many world rating agencies and international institutions, such as the Economist Intelligence Unit, Euromoney, Institutional Investor, Moody's Investors

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 793–812, 2020. https://doi.org/10.1007/978-3-030-12388-8_54


Service, Standard & Poor's Rating Group, the European Bank for Reconstruction and Development (EBRD), the World Bank (WB), etc., rank countries by CR level, and their approaches are determined by qualitative and/or quantitative, economic, combined and structurally-qualitative methods of CR estimation. To date, there are quite a lot of numerical methods for solving this type of problem. In particular, in the Boolean case such estimates can be obtained with the Boolfilter and BoolNet packages, whose vignettes were considered in [1, 2], respectively. However, the main purpose of this study is to evaluate the levels of country risk by applying fuzzy inference to identify the function $R = R(x_1, x_2, \ldots, x_n)$.

2 Selection of the List of CR-Factors

CR evaluation is a multi-criteria procedure that implies the use of a composite rule for aggregating the assessments of each of the selected risk factors. To date, there is no unified approach to calculating the CR index, since there are different points of view regarding the composition of CR factors. For example, in the ranking process, EBRD analysts use indicators such as macroeconomic stability, taxation conditions, the quality of the judicial system, the level of corruption in the country, the finances of the leading base enterprises, and the infrastructure. Another authoritative opinion on the investment attractiveness of states is the WB rating, which is established on the basis of CR evaluations. The WB assessment methodology takes into account CR factors such as the risks of nationalization and expropriation, risks related to private and foreign capital, the level of state policy, including the stability of the government's policy and its popularity among citizens, the industrial cycle stage, market capacity and the resulting financial and currency risks, and labor force qualification. To demonstrate the proposed methods for CR evaluation, we chose a rather limited list of risk factors used by the audit company PricewaterhouseCoopers in its ranking of the investment attractiveness of states [3], namely:

• x1: the level of corruption;
• x2: compliance with legislation;
• x3: level of economic development;
• x4: state policy on accounting and control;
• x5: state regulation.

3 Ranking of CR-Variables in the Order of Experts' Preferences

Suppose that expert estimates of the degree of importance of the CR-factors $x_i$ (i = 1, ..., 5) are determined by a separate survey of 15 core specialists. Each expert was invited to rank the variables $x_i$ according to the following principle: the most important variable is designated by the number "1", the next most important by the number "2", and so on in descending order of importance. All the rank estimates obtained are summarized in Table 1.

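Table 1. Ranking of CR-variables: rank estimates $r_{ij}$ by expert

| Expert number | x1 | x2 | x3 | x4 | x5 |
|---|---|---|---|---|---|
| 01 | 1 | 2 | 4 | 3 | 5 |
| 02 | 1 | 3 | 2 | 4 | 5 |
| 03 | 2 | 1 | 5 | 4 | 3 |
| 04 | 1 | 2 | 4 | 5 | 3 |
| 05 | 2 | 1 | 3 | 4 | 5 |
| 06 | 1 | 2 | 4 | 3 | 5 |
| 07 | 2 | 1 | 4 | 3 | 5 |
| 08 | 1 | 2 | 4 | 5 | 3 |
| 09 | 1 | 3 | 2 | 4 | 5 |
| 10 | 1 | 3 | 2 | 5 | 4 |
| 11 | 1 | 3 | 4 | 2 | 5 |
| 12 | 1 | 2 | 3 | 5 | 4 |
| 13 | 2 | 1 | 4 | 3 | 5 |
| 14 | 3 | 1 | 2 | 4 | 5 |
| 15 | 1 | 2 | 5 | 4 | 3 |
| $\sum r_{ij}$ | 21 | 29 | 52 | 58 | 65 |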

To establish the degree of consistency of expert opinions, we use the Kendall concordance coefficient, which demonstrates the multiple rank correlation of expert opinions. According to [4, 5], this coefficient is calculated by the formula:

$$W = \frac{12 \cdot S}{m^2 (n^3 - n)}, \tag{1}$$

where m is the number of experts, n is the number of CR-variables, and S is the deviation of expert conclusions from the average value of the CR-variables ranking, which is calculated, for example, by the formula [3]:

$$S = \sum_{i=1}^{n} \left( \sum_{j=1}^{m} r_{ij} - \frac{m(n+1)}{2} \right)^2, \tag{2}$$

where $r_{ij} \in \{1; 2; 3; 4; 5\}$ is the rank of the i-th CR-variable established by the j-th expert. Then, with S = 1450 calculated from formula (2) and the data in Table 1, the Kendall concordance coefficient is W = 0.6444 > 0.6. This indicates sufficiently strong agreement of the expert conclusions regarding the degree of importance of the CR-variables.
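Formulas (1) and (2) can be checked directly against the rank data. The following Python sketch is only an illustration: the matrix is transcribed from Table 1 (one row per expert, columns x1 to x5), and the function name `kendall_w` is hypothetical.

```python
# Rank estimates r_ij transcribed from Table 1: one row per expert, columns x1..x5.
RANKS = [
    [1, 2, 4, 3, 5], [1, 3, 2, 4, 5], [2, 1, 5, 4, 3],
    [1, 2, 4, 5, 3], [2, 1, 3, 4, 5], [1, 2, 4, 3, 5],
    [2, 1, 4, 3, 5], [1, 2, 4, 5, 3], [1, 3, 2, 4, 5],
    [1, 3, 2, 5, 4], [1, 3, 4, 2, 5], [1, 2, 3, 5, 4],
    [2, 1, 4, 3, 5], [3, 1, 2, 4, 5], [1, 2, 5, 4, 3],
]


def kendall_w(ranks):
    """Return (rank totals per variable, S from formula (2), W from formula (1))."""
    m = len(ranks)        # number of experts
    n = len(ranks[0])     # number of CR-variables
    totals = [sum(row[i] for row in ranks) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    w = 12 * s / (m ** 2 * (n ** 3 - n))
    return totals, s, w


totals, s, w = kendall_w(RANKS)
print(totals, s, round(w, 4))
```

With m = 15 and n = 5 the mean rank total is 45, and the script reproduces S = 1450 and W = 0.6444; the rank total it obtains for x4 from the individual rows is 58.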


4 Identification of Weights of the CR-Variables

Now, suppose that at the preliminary stage of the separate questionnaire each expert was also instructed to establish the values of the normalized estimates of the CR-variables, which determine the specific density (weight) of the influence of each factor on the scale of the unit interval. The results of this questionnaire are summarized in Table 2.

Table 2. Normalized estimates of CR-variables: normalized estimates $a_{ij}$ by expert

| Expert number | x1 | x2 | x3 | x4 | x5 |
|---|---|---|---|---|---|
| 01 | 0.300 | 0.250 | 0.150 | 0.225 | 0.075 |
| 02 | 0.350 | 0.175 | 0.200 | 0.150 | 0.125 |
| 03 | 0.225 | 0.250 | 0.150 | 0.175 | 0.200 |
| 04 | 0.275 | 0.250 | 0.175 | 0.100 | 0.200 |
| 05 | 0.250 | 0.275 | 0.200 | 0.175 | 0.100 |
| 06 | 0.300 | 0.250 | 0.150 | 0.200 | 0.100 |
| 07 | 0.200 | 0.375 | 0.150 | 0.175 | 0.100 |
| 08 | 0.325 | 0.300 | 0.150 | 0.025 | 0.200 |
| 09 | 0.275 | 0.175 | 0.200 | 0.100 | 0.250 |
| 10 | 0.300 | 0.200 | 0.250 | 0.100 | 0.150 |
| 11 | 0.300 | 0.175 | 0.150 | 0.250 | 0.125 |
| 12 | 0.300 | 0.250 | 0.200 | 0.100 | 0.150 |
| 13 | 0.225 | 0.250 | 0.175 | 0.200 | 0.150 |
| 14 | 0.200 | 0.300 | 0.250 | 0.150 | 0.100 |
| 15 | 0.300 | 0.250 | 0.125 | 0.150 | 0.175 |
| $\sum a_{ij}$ | 4.125 | 3.725 | 2.675 | 2.275 | 2.200 |

Starting from the data presented in Table 2, let us make preliminary calculations for the subsequent identification of the weights of the CR-variables: it is necessary to define their group estimates and the numerical characteristics (degrees) of competence of each expert. To calculate the average value $a_i$ for the i-th group of normalized estimates of CR-variables, it is possible to use the weighted degrees of expert competence in the following difference equation:

$$a_i(t+1) = \sum_{j=1}^{m} w_j(t)\, a_{ij}, \tag{3}$$

where $w_j(t)$ is the weight characterizing the degree of competence of the j-th expert (j = 1, ..., m) at time t. The process of finding the group estimates of the normalized values is iterative and is completed under the condition:

$$\max_i \{ |a_i(t+1) - a_i(t)| \} \le \varepsilon, \tag{4}$$

where $\varepsilon$ is the allowable accuracy of calculations, which is set in advance. In our case, let $\varepsilon = 0.0001$. At the initial stage t = 0 we assume that the experts have the same degrees of competence. Then, taking $w_j(0) = 1/m$ as the initial value of the competence degree of each expert, the average value for the i-th group of normalized estimates of CR-variables in the first approximation is obtained from the particular equality:

$$a_i(1) = \sum_{j=1}^{m} w_j(0)\, a_{ij} = \frac{1}{m} \sum_{j=1}^{m} a_{ij}. \tag{5}$$

In accordance with (5), the averaged group estimates of the CR-variables in the first approximation are the following numbers: $\{a_1(1); a_2(1); a_3(1); a_4(1); a_5(1)\}$ = {0.27500; 0.24833; 0.17833; 0.15167; 0.14667}. Condition (4) cannot yet be satisfied after the first approximation. Therefore, before moving to the next iteration step, it is necessary to calculate the normalizing coefficient as:

$$\eta(1) = \sum_{i=1}^{5} \sum_{j=1}^{15} a_i(1)\, a_{ij} = 3.2042.$$

Then the competence indicators of the experts can be calculated according to the following expressions:

$$\begin{cases} w_j(1) = \dfrac{1}{\eta(1)} \sum\limits_{i=1}^{5} a_i(1) \cdot a_{ij} \quad (j = 1, \ldots, 14), \\ w_{15}(1) = 1 - \sum\limits_{j=1}^{14} w_j(1), \\ \sum\limits_{j=1}^{15} w_j(1) = 1, \end{cases} \tag{6}$$

where $w_{15}(1)$ is the competence indicator of the 15th expert. Thus, on the basis of expressions (6), the competence indicators of the experts in the 1st approximation are: $\{w_1(1); w_2(1); \ldots; w_{15}(1)\}$ = {0.0676; 0.0676; 0.0645; 0.0666; 0.0668; 0.0675; 0.0674; 0.0698; 0.0645; 0.0668; 0.0652; 0.0679; 0.0648; 0.0660; 0.0672}.

Now we can proceed to the calculation of the mean group estimates of the CR-variables in the 2nd approximation by formula (3), or more precisely by its particular expression:

$$a_i(2) = \sum_{j=1}^{15} w_j(1)\, a_{ij}.$$

In this case, the average estimates of the CR-variables for groups i = 1, ..., 5 are the following numbers: $\{a_1(2); a_2(2); a_3(2); a_4(2); a_5(2)\}$ = {0.27547; 0.24876; 0.17821; 0.15116; 0.14640}. Checking these values against condition (4), we find that it is again not fulfilled:

$$\max_i \{ |a_i(2) - a_i(1)| \} = 0.0005 > \varepsilon,$$

so let us calculate the normalizing coefficient as:

$$\eta(2) = \sum_{i=1}^{5} \sum_{j=1}^{15} a_i(2)\, a_{ij} = 3.2056.$$

Then the expert competence indicators in the 2nd approximation $w_j(2)$ (j = 1, ..., 15) will be: $\{w_1(2); w_2(2); \ldots; w_{15}(2)\}$ = {0.0676; 0.0676; 0.0645; 0.0666; 0.0668; 0.0675; 0.0674; 0.0699; 0.0645; 0.0668; 0.0652; 0.0679; 0.0647; 0.0660; 0.0672}. The average group estimates for the CR-variables in the 3rd approximation can be obtained from the following particular case of formula (3): $a_i(3) = \sum_{j=1}^{15} w_j(2)\, a_{ij}$. In this case, the average estimates of the CR-variables for groups i = 1, ..., 5 are the following numbers: $\{a_1(3); a_2(3); a_3(3); a_4(3); a_5(3)\}$ = {0.27547; 0.24876; 0.17821; 0.15115; 0.14640}. As can be seen, the accuracy of the group estimates of the CR-variables in the 3rd approximation already satisfies condition (4), i.e. $\max_i \{ |a_i(3) - a_i(2)| \} = 0.00001 < \varepsilon$, which is the reason for stopping the calculations. In this case, the values of the group estimates of the CR-variables, i.e. $\{a_1(3); a_2(3); a_3(3); a_4(3); a_5(3)\}$, are the final (consolidated) weights of the variables $x_i$ (i = 1, ..., 5).
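The iteration defined by formulas (3) to (6) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the matrix is transcribed from Table 2, the function name `consolidate_weights` and the `max_steps` guard are hypothetical, and the competence weights are normalized by dividing by the coefficient η, which is equivalent to the closing condition in (6) that the weights sum to 1.

```python
# Normalized estimates a_ij transcribed from Table 2: one row per expert, columns x1..x5.
ESTIMATES = [
    [0.300, 0.250, 0.150, 0.225, 0.075], [0.350, 0.175, 0.200, 0.150, 0.125],
    [0.225, 0.250, 0.150, 0.175, 0.200], [0.275, 0.250, 0.175, 0.100, 0.200],
    [0.250, 0.275, 0.200, 0.175, 0.100], [0.300, 0.250, 0.150, 0.200, 0.100],
    [0.200, 0.375, 0.150, 0.175, 0.100], [0.325, 0.300, 0.150, 0.025, 0.200],
    [0.275, 0.175, 0.200, 0.100, 0.250], [0.300, 0.200, 0.250, 0.100, 0.150],
    [0.300, 0.175, 0.150, 0.250, 0.125], [0.300, 0.250, 0.200, 0.100, 0.150],
    [0.225, 0.250, 0.175, 0.200, 0.150], [0.200, 0.300, 0.250, 0.150, 0.100],
    [0.300, 0.250, 0.125, 0.150, 0.175],
]


def consolidate_weights(estimates, eps=1e-4, max_steps=100):
    """Iterate formulas (3)-(6) until condition (4) holds; return (group estimates, steps)."""
    m, n = len(estimates), len(estimates[0])
    competence = [1.0 / m] * m          # w_j(0): equal competence for all experts
    previous = None
    for step in range(1, max_steps + 1):
        # Formula (3): competence-weighted group estimate for each CR-variable.
        group = [sum(competence[j] * estimates[j][i] for j in range(m)) for i in range(n)]
        # Condition (4): stop when successive group estimates differ by at most eps.
        if previous is not None and max(abs(g - p) for g, p in zip(group, previous)) <= eps:
            return group, step
        # Expert scores and the normalizing coefficient eta; formula (6).
        scores = [sum(group[i] * estimates[j][i] for i in range(n)) for j in range(m)]
        eta = sum(scores)
        competence = [score / eta for score in scores]
        previous = group
    raise RuntimeError("no convergence within max_steps")


weights, steps = consolidate_weights(ESTIMATES)
print([round(a, 5) for a in weights], steps)
```

Because every row of Table 2 sums to 1 and the competence weights sum to 1, the consolidated weights also sum to 1, which is convenient when they are used in a weighted index.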

5 Determination of Weighted CR-Level on the Basis of Expert Estimations

The method of expert assessments involves a discussion, by a group of experts specially engaged for this purpose, of the factors that affect the CR-level of a particular country. Each of the experts is provided with a list of possible risks on the basis of the

CR-variables $x_i$ (i = 1, ..., 5) and is invited to give a separate assessment of the probability of their occurrence in percentage terms on the basis of the following five-point rating system:

• 5: insignificant risk;
• 4: the risk situation will most probably not occur;
• 3: nothing definite can be said about the possibility of risk;
• 2: the risk situation will most probably occur;
• 1: the risk situation will surely occur.

Further, expert assessments of risk situations are analyzed for consistency (or inconsistency) according to the rule: the maximum allowable difference between two expert opinions for any type of risk relative to $x_i$ (i = 1, ..., 5) should not exceed a value of 3. This rule makes it possible to filter out inadmissible deviations in expert assessments of the probability of risk occurrence for a separate CR-variable. The total index, theoretically ranging from 0 to 100, can be calculated by the following evaluation criterion:

$$R = \frac{\sum_{i=1}^{5} a_i e_i}{\max \sum_{i=1}^{5} a_i e_i} \times 100, \tag{7}$$

Table 3. Gradation of the total weighted estimates of CR

| Interval | CR-level | Explanation |
|---|---|---|
| (90; 100] | Too low or absent | The financial-economic, socio-political, and state-legal statuses are estimated as stable in the long-term outlook |
| (80; 90] | Very low or insignificant | The financial-economic, socio-political, and state-legal statuses are estimated as stable in the medium-term outlook |
| (70; 80] | More than low | The financial-economic, socio-political, and state-legal statuses are estimated as stable in the near-term outlook |
| (60; 70] | Low | The main indicators of the financial-economic, socio-political, and state-legal conditions are estimated as satisfactory and stable in the near-term outlook |
| (50; 60] | High | The main indicators of the financial-economic, socio-political and state-legal conditions are estimated as satisfactory, but their stability is doubtful |
| (40; 50] | More than high | The main indicators of financial-economic, socio-political, and state-legal conditions are estimated as close to satisfactory, but their stability is more than doubtful |
| (30; 40] | Very high or significant | The financial-economic, socio-political, and state-legal statuses are estimated as unsatisfactory or close to satisfactory, but unstable |
| [0; 30] | Too high or impermissible | Financial-economic, socio-political and state-legal statuses are estimated as stably unsatisfactory |


where ai is the significance weight of the i-th CR-variable and ei is the expert estimate of the probability of risk occurrence for the i-th CR-variable on the five-point rating system. In this case, the minimum index means the maximum risk, and vice versa; the index of the CR-level is established from the gradation of the resulting weighted estimates, which is summarized in Table 3. Now let us assume that the expert community is asked to test 10 alternative countries ak (k = 1–10) on the five-point system in order to assess the degree of influence of the financial, economic, socio-political and state-legal factors in these countries on their CR-levels. For these countries, the consolidated (average) expert estimates of the CR-level are obtained by applying the total evaluation criterion (7). These estimates are summarized in Table 4.

Table 4. Total estimates of CR-levels (normalized estimates of the CR-variables e1–e5; the first row gives the identified weights ai(3) of the CR-variables)

| State | e1 | e2 | e3 | e4 | e5 | Ratio |
|---|---|---|---|---|---|---|
| Weights ai(3) | 0.2755 | 0.2488 | 0.1782 | 0.1512 | 0.1464 | |
| a1 | 4.50 | 4.75 | 4.50 | 4.75 | 4.25 | 91.27 |
| a2 | 4.85 | 4.50 | 4.55 | 2.75 | 3.75 | 84.62 |
| a3 | 3.75 | 4.00 | 3.25 | 3.85 | 3.25 | 73.30 |
| a4 | 4.25 | 3.45 | 2.85 | 2.75 | 1.85 | 64.47 |
| a5 | 4.00 | 2.55 | 3.00 | 2.25 | 1.85 | 57.64 |
| a6 | 3.55 | 2.85 | 2.00 | 1.25 | 0.85 | 47.13 |
| a7 | 2.25 | 1.75 | 1.25 | 1.85 | 1.50 | 35.54 |
| a8 | 2.25 | 1.85 | 1.25 | 0.75 | 0.25 | 29.06 |
| a9 | 5.00 | 4.75 | 4.85 | 4.85 | 4.75 | 97.04 |
| a10 | 3.25 | 2.85 | 3.75 | 4.25 | 3.50 | 68.55 |
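The ratios in Table 4 can be re-derived with a short script. The sketch below assumes that criterion (7) is the weighted sum of the scores divided by the top mark 5 and scaled to 100; this form is inferred here from the tabulated numbers rather than quoted from the text, and the names WEIGHTS, SCORES and weighted_ratio are illustrative only.

```python
# Weights a_i of the five CR-variables and consolidated expert scores e_i
# (five-point scale) for the ten alternative countries, copied from Table 4.
WEIGHTS = [0.2755, 0.2488, 0.1782, 0.1512, 0.1464]
SCORES = {
    "a1": [4.50, 4.75, 4.50, 4.75, 4.25],
    "a2": [4.85, 4.50, 4.55, 2.75, 3.75],
    "a3": [3.75, 4.00, 3.25, 3.85, 3.25],
    "a4": [4.25, 3.45, 2.85, 2.75, 1.85],
    "a5": [4.00, 2.55, 3.00, 2.25, 1.85],
    "a6": [3.55, 2.85, 2.00, 1.25, 0.85],
    "a7": [2.25, 1.75, 1.25, 1.85, 1.50],
    "a8": [2.25, 1.85, 1.25, 0.75, 0.25],
    "a9": [5.00, 4.75, 4.85, 4.85, 4.75],
    "a10": [3.25, 2.85, 3.75, 4.25, 3.50],
}
MAX_SCORE = 5.0  # assumption: the top mark of the five-point scale


def weighted_ratio(scores, weights=WEIGHTS, max_score=MAX_SCORE):
    """Weighted sum of the scores as a percentage of the best possible mark."""
    weighted_sum = sum(w * e for w, e in zip(weights, scores))
    return 100.0 * weighted_sum / max_score


RATIOS = {name: weighted_ratio(e) for name, e in SCORES.items()}

if __name__ == "__main__":
    for name, ratio in RATIOS.items():
        print(f"{name}: {ratio:.2f}")
```

Under this assumption the computed values agree with the Ratio column to within the last printed digit; small differences come from rounding versus truncation.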

6 Determination of the CR-Level Using the Fuzzy Inference

All existing models of CR-evaluation have certain advantages and disadvantages. For example, the approach described above, which is based on the expert evaluation system, is criticized for the absence of cause-effect relations. In particular, the gradation of the CR-levels presented in Table 3 was chosen conditionally, without any objective justification. As a rule, such a gradation is established by the expert community or from heuristic knowledge. Therefore, before we begin to form a model for estimating the CR-level, it is necessary to construct a justified gradation scale.

A. CR-levels classification

CR-level evaluation, being a multi-criteria procedure, implies the application of a composite rule for aggregating the evaluations in each specific case. To estimate the CR-level we choose eight estimated concepts (or terms): u1: “too low”; u2: “very low”; u3: “more than low”; u4: “low”; u5: “high”; u6: “more than high”; u7: “very high”; u8: “too high”. More simply, by the set C = (u1, u2, u3, u4, u5, u6, u7, u8) we will mean the set of criteria for classifying the CR-levels. Then, treating the CR factors as linguistic variables, the CR-level can be estimated by applying a sufficient set of consistent rules of the form “If …, then …”, and on their basis it is possible to establish the corresponding scale for grading the final estimates of the CR-levels. The basic judgments can be formulated as follows:
d1: “If there is no corruption and economic development is observed, then the CR-level is acceptable”;
d2: “If in addition to the above requirements the state policies on accounting and control are implemented, then the CR-level is more than acceptable”;
d3: “If in addition to the conditions stipulated in d2 there is appropriate legislation and state regulation is implemented, then the CR-level is low”;
d4: “If there is no corruption, there is appropriate legislation, economic development is observed, and the state policies on accounting and control are implemented, then the CR-level is very acceptable”;
d5: “If there is adequate legislation, economic development is observed, and state policies on accounting and control are implemented, but there is display of corruption, then the CR-level is still acceptable”;
d6: “If there is display of corruption, there is no development of the economy, and there is no state regulation, then the CR-level is unacceptable”.
In the above statements, which reflect the internal cause-effect relations, the factors influencing the CR-level are considered as inputs in the form of linguistic variables xi (i = 1–5), and the output is a linguistic variable y whose terms reflect the CR-levels.
Then, having specified the corresponding terms of these variables, on the basis of the above statements it is possible to construct implicative rules as follows [6]:
d1: “If x1 = absent and x3 = observed, then y = acceptable”;
d2: “If x1 = absent and x3 = observed and x4 = implemented, then y = more than acceptable”;
d3: “If x1 = absent and x2 = exist and x3 = observed and x4 = implemented and x5 = implemented, then y = low”;
d4: “If x1 = absent and x2 = exist and x3 = observed and x4 = implemented, then y = very acceptable”;
d5: “If x1 = display and x2 = exist and x3 = observed and x4 = implemented, then y = acceptable”;
d6: “If x1 = display and x3 = not visible and x5 = not implemented, then y = unacceptable”.
The linguistic variable y can be defined on the discrete set J = {0; 0.1; 0.2; …; 1}. Then, ∀j ∈ J, its terms can be described by fuzzy subsets of J with the following membership functions [6]: S = acceptable, μS(j) = j; MS = more than acceptable, μMS(j) = √j; L = low, μL(j) = 1 if j = 1 and μL(j) = 0 if j < 1; VS = very acceptable, μVS(j) = j²; US = unacceptable, μUS(j) = 1 − j.


The fuzzification of the terms in the left-hand parts of the rules can be realized by the Gaussian membership function μ(u) = exp{−(u − u0)²/σi²} (i = 1–5), which restores fuzzy subsets of the discrete universe C = (u1, u2, u3, …, u8), where uk = (ak+1 + ak)/2 (k = 1–8) (see Fig. 1). In this case, the density of element distribution σi² for the i-th factor is chosen individually according to its criticality. It should be noted that the inaccuracy resulting from an arbitrary choice of density is eliminated during the intersection of the fuzzy sets in the left-hand parts of the rules. In Fig. 1, the gradation of CR-factors is presented in a general form. However, the segment [a0, a8] can obviously be reduced to the unit segment [0; 1] by the simple transformation t = (u − a0)/(a8 − a0), where u ∈ [a0, a8], t ∈ [0; 1].

Fig. 1. Uniform gradation of CR-factors

Fig. 2. Uniform gradation of CR-factors at the scale of the unit segment

When the CR-level is estimated from the point of view of the factors xi (i = 1–5), which are graded on the scale of the unit segment (Fig. 2), where ak = 0.125k (k = 0–8), all terms from the left-hand parts of the rules can be fuzzified in the following form:
• Absent (corruption): A = {0.9070/u1; 0.6766/u2; 0.4152/u3; 0.2096/u4; 0.0870/u5; 0.0297/u6; 0.0084/u7; 0.0019/u8};
• Exist (appropriate legislation): B = {0.9070/u1; 0.6766/u2; 0.4152/u3; 0.2096/u4; 0.0870/u5; 0.0297/u6; 0.0084/u7; 0.0019/u8};
• Observed (economic development): C = {0.9394/u1; 0.7788/u2; 0.5698/u3; 0.3679/u4; 0.2096/u5; 0.1054/u6; 0.0468/u7; 0.0183/u8};
• Implemented (state policies on accounting and control): D = {0.9497/u1; 0.8133/u2; 0.6282/u3; 0.4376/u4; 0.2749/u5; 0.1557/u6; 0.0796/u7; 0.0367/u8};
• Implemented (state regulation): E = {0.9575/u1; 0.8406/u2; 0.6766/u3; 0.4994/u4; 0.3379/u5; 0.2096/u6; 0.1192/u7; 0.0622/u8}.
Taking these formalisms into account, the implicative rules in symbolic form are:
d1: (x1 = A) & (x3 = C) ⇒ (y = S);
d2: (x1 = A) & (x3 = C) & (x4 = D) ⇒ (y = MS);
d3: (x1 = A) & (x2 = B) & (x3 = C) & (x4 = D) & (x5 = E) ⇒ (y = L);
d4: (x1 = A) & (x2 = B) & (x3 = C) & (x4 = D) ⇒ (y = VS);
d5: (x1 = ¬A) & (x2 = B) & (x3 = C) & (x4 = D) ⇒ (y = S);
d6: (x1 = ¬A) & (x3 = ¬C) & (x5 = ¬E) ⇒ (y = US).
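The membership degrees listed for the terms A, C, D and E can be reproduced numerically. The sketch below is a reconstruction under stated assumptions: the grid points uk = 0.125k, the centre u0 = 0 and the widths σ = 0.40, 0.50, 0.55 and 0.60 are not given in the text; they are chosen here only because they match the printed values to four decimals. The function and variable names are illustrative.

```python
import math


def gaussian_term(points, center, sigma):
    """Gaussian membership exp(-(u - u0)^2 / sigma^2) evaluated on a grid."""
    return [math.exp(-((u - center) ** 2) / sigma ** 2) for u in points]


# Assumed grid: eight evaluation points on the unit segment, u_k = 0.125 * k.
U = [0.125 * k for k in range(1, 9)]

# Assumed widths, inferred from the printed membership degrees (not from the text).
SIGMA = {"A": 0.40, "C": 0.50, "D": 0.55, "E": 0.60}

TERMS = {name: gaussian_term(U, 0.0, sigma) for name, sigma in SIGMA.items()}

if __name__ == "__main__":
    for name, values in TERMS.items():
        print(name, [f"{v:.4f}" for v in values])
```

A wider σ gives a flatter term, which is how the individually chosen density expresses the criticality of each factor.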


Further, for the left-hand parts of these rules, it is necessary to find the membership functions of the fuzzy sets obtained by intersection [6]:
d1: μM1(u) = min{μA(u), μC(u)}, M1 = {0.9070/u1; 0.6766/u2; 0.4152/u3; 0.2096/u4; 0.0870/u5; 0.0297/u6; 0.0084/u7; 0.0019/u8};
d2: μM2(u) = min{μA(u), μC(u), μD(u)}, M2 = {0.9070/u1; 0.6766/u2; 0.4152/u3; 0.2096/u4; 0.0870/u5; 0.0297/u6; 0.0084/u7; 0.0019/u8};
d3: μM3(u) = min{μA(u), μB(u), μC(u), μD(u), μE(u)}, M3 = {0.9070/u1; 0.6766/u2; 0.4152/u3; 0.2096/u4; 0.0870/u5; 0.0297/u6; 0.0084/u7; 0.0019/u8};
d4: μM4(u) = min{μA(u), μB(u), μC(u), μD(u)}, M4 = {0.9070/u1; 0.6766/u2; 0.4152/u3; 0.2096/u4; 0.0870/u5; 0.0297/u6; 0.0084/u7; 0.0019/u8};
d5: μM5(u) = min{1 − μA(u), μB(u), μC(u), μD(u)}, M5 = {0.0930/u1; 0.3234/u2; 0.4994/u3; 0.2910/u4; 0.1453/u5; 0.0622/u6; 0.0228/u7; 0.0072/u8};
d6: μM6(u) = min{1 − μA(u), 1 − μC(u), 1 − μE(u)}, M6 = {0.0425/u1; 0.1594/u2; 0.3234/u3; 0.5006/u4; 0.6621/u5; 0.7904/u6; 0.8808/u7; 0.9378/u8}.
As a result, the rules can be described as:
d1: (x = M1) ⇒ (y = S);
d2: (x = M2) ⇒ (y = MS);
d3: (x = M3) ⇒ (y = L);
d4: (x = M4) ⇒ (y = VS);
d5: (x = M5) ⇒ (y = S);
d6: (x = M6) ⇒ (y = US).

These rules are transformed by Lukasiewicz’s implication [7]:

$$\mu_{U \times J}(u, j) = \min\{1,\; 1 - \mu_U(u) + \mu_J(j)\}, \tag{8}$$

as a result of which, for each pair (u, j) ∈ U × J, the fuzzy relations are obtained in the form of the corresponding matrices. In each matrix the bordering column lists the antecedent degrees μMi(uk) (k = 1–8) and the bordering row lists the membership values of the consequent term over J = {0; 0.1; …; 1}:

$$
R_1 = \begin{array}{c|ccccccccccc}
 & 0 & 0.1 & 0.2 & 0.3 & 0.4 & 0.5 & 0.6 & 0.7 & 0.8 & 0.9 & 1 \\ \hline
0.9070 & 0.0930 & 0.1930 & 0.2930 & 0.3930 & 0.4930 & 0.5930 & 0.6930 & 0.7930 & 0.8930 & 0.9930 & 1.0000 \\
0.6766 & 0.3234 & 0.4234 & 0.5234 & 0.6234 & 0.7234 & 0.8234 & 0.9234 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.4152 & 0.5848 & 0.6848 & 0.7848 & 0.8848 & 0.9848 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.2096 & 0.7904 & 0.8904 & 0.9904 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0870 & 0.9130 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0297 & 0.9703 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0084 & 0.9916 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0019 & 0.9981 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000
\end{array};
$$

$$
R_2 = \begin{array}{c|ccccccccccc}
 & 0 & 0.3162 & 0.4472 & 0.5477 & 0.6325 & 0.7071 & 0.7746 & 0.8367 & 0.8944 & 0.9487 & 1 \\ \hline
0.9070 & 0.0930 & 0.4093 & 0.5403 & 0.6408 & 0.7255 & 0.8001 & 0.8676 & 0.9297 & 0.9875 & 1.0000 & 1.0000 \\
0.6766 & 0.3234 & 0.6396 & 0.7706 & 0.8711 & 0.9558 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.4152 & 0.5848 & 0.9010 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.2096 & 0.7904 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0870 & 0.9130 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0297 & 0.9703 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0084 & 0.9916 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0019 & 0.9981 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000
\end{array};
$$

$$
R_3 = \begin{array}{c|ccccccccccc}
 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ \hline
0.9070 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 1.0000 \\
0.6766 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 1.0000 \\
0.4152 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 1.0000 \\
0.2096 & 0.7904 & 0.7904 & 0.7904 & 0.7904 & 0.7904 & 0.7904 & 0.7904 & 0.7904 & 0.7904 & 0.7904 & 1.0000 \\
0.0870 & 0.9130 & 0.9130 & 0.9130 & 0.9130 & 0.9130 & 0.9130 & 0.9130 & 0.9130 & 0.9130 & 0.9130 & 1.0000 \\
0.0297 & 0.9703 & 0.9703 & 0.9703 & 0.9703 & 0.9703 & 0.9703 & 0.9703 & 0.9703 & 0.9703 & 0.9703 & 1.0000 \\
0.0084 & 0.9916 & 0.9916 & 0.9916 & 0.9916 & 0.9916 & 0.9916 & 0.9916 & 0.9916 & 0.9916 & 0.9916 & 1.0000 \\
0.0019 & 0.9981 & 0.9981 & 0.9981 & 0.9981 & 0.9981 & 0.9981 & 0.9981 & 0.9981 & 0.9981 & 0.9981 & 1.0000
\end{array};
$$

$$
R_4 = \begin{array}{c|ccccccccccc}
 & 0 & 0.01 & 0.04 & 0.09 & 0.16 & 0.25 & 0.36 & 0.49 & 0.64 & 0.81 & 1 \\ \hline
0.9070 & 0.0930 & 0.1030 & 0.1330 & 0.1830 & 0.2530 & 0.3430 & 0.4530 & 0.5830 & 0.7330 & 0.9030 & 1.0000 \\
0.6766 & 0.3234 & 0.3334 & 0.3634 & 0.4134 & 0.4834 & 0.5734 & 0.6834 & 0.8134 & 0.9634 & 1.0000 & 1.0000 \\
0.4152 & 0.5848 & 0.5948 & 0.6248 & 0.6748 & 0.7448 & 0.8348 & 0.9448 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.2096 & 0.7904 & 0.8004 & 0.8304 & 0.8804 & 0.9504 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0870 & 0.9130 & 0.9230 & 0.9530 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0297 & 0.9703 & 0.9803 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0084 & 0.9916 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0019 & 0.9981 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000
\end{array};
$$

$$
R_5 = \begin{array}{c|ccccccccccc}
 & 0 & 0.1 & 0.2 & 0.3 & 0.4 & 0.5 & 0.6 & 0.7 & 0.8 & 0.9 & 1 \\ \hline
0.0930 & 0.9070 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.3234 & 0.6766 & 0.7766 & 0.8766 & 0.9766 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.4994 & 0.5006 & 0.6006 & 0.7006 & 0.8006 & 0.9006 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.2910 & 0.7090 & 0.8090 & 0.9090 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.1453 & 0.8547 & 0.9547 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0622 & 0.9378 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0228 & 0.9772 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\
0.0072 & 0.9928 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000
\end{array};
$$

$$
R_6 = \begin{array}{c|ccccccccccc}
 & 1 & 0.9 & 0.8 & 0.7 & 0.6 & 0.5 & 0.4 & 0.3 & 0.2 & 0.1 & 0 \\ \hline
0.0425 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 0.9575 \\
0.1594 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 0.9406 & 0.8406 \\
0.3234 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 0.9766 & 0.8766 & 0.7766 & 0.6766 \\
0.5006 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 0.9994 & 0.8994 & 0.7994 & 0.6994 & 0.5994 & 0.4994 \\
0.6621 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 0.9379 & 0.8379 & 0.7379 & 0.6379 & 0.5379 & 0.4379 & 0.3379 \\
0.7904 & 1.0000 & 1.0000 & 1.0000 & 0.9096 & 0.8096 & 0.7096 & 0.6096 & 0.5096 & 0.4096 & 0.3096 & 0.2096 \\
0.8808 & 1.0000 & 1.0000 & 0.9192 & 0.8192 & 0.7192 & 0.6192 & 0.5192 & 0.4192 & 0.3192 & 0.2192 & 0.1192 \\
0.9378 & 1.0000 & 0.9622 & 0.8622 & 0.7622 & 0.6622 & 0.5622 & 0.4622 & 0.3622 & 0.2622 & 0.1622 & 0.0622
\end{array}.
$$


As a result of the intersection of the fuzzy relations R1, R2, …, R6 we finally obtain a general functional solution R reflecting the cause-effect relations between the factors xi (i = 1–5), on the one hand, and the CR-level, on the other (rows u1–u8, columns j ∈ J):

$$
R = \begin{array}{c|ccccccccccc}
 & 0 & 0.1 & 0.2 & 0.3 & 0.4 & 0.5 & 0.6 & 0.7 & 0.8 & 0.9 & 1 \\ \hline
u_1 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.0930 & 0.9575 \\
u_2 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.3234 & 0.8406 \\
u_3 & 0.5006 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 0.5848 & 0.6766 \\
u_4 & 0.7090 & 0.7904 & 0.7904 & 0.7904 & 0.7904 & 0.7904 & 0.7904 & 0.7904 & 0.6994 & 0.5994 & 0.4994 \\
u_5 & 0.8547 & 0.9130 & 0.9130 & 0.9130 & 0.9130 & 0.8379 & 0.7379 & 0.6379 & 0.5379 & 0.4379 & 0.3379 \\
u_6 & 0.9378 & 0.9703 & 0.9703 & 0.9096 & 0.8096 & 0.7096 & 0.6096 & 0.5096 & 0.4096 & 0.3096 & 0.2096 \\
u_7 & 0.9772 & 0.9916 & 0.9192 & 0.8192 & 0.7192 & 0.6192 & 0.5192 & 0.4192 & 0.3192 & 0.2192 & 0.1192 \\
u_8 & 0.9928 & 0.9622 & 0.8622 & 0.7622 & 0.6622 & 0.5622 & 0.4622 & 0.3622 & 0.2622 & 0.1622 & 0.0622
\end{array}.
$$
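The construction of R from the six rules can be traced in code. The following sketch applies Lukasiewicz's implication (8) to each rule and then intersects the six relations element-wise with min; it starts from the antecedent sets M1–M6 as printed above, and the helper names (lukasiewicz, intersect, OUTPUT_TERMS) are illustrative rather than taken from the paper.

```python
import math

# Discrete universe of the output variable y.
J = [k / 10 for k in range(11)]

# Membership functions of the output terms S, MS, L, VS, US on J.
OUTPUT_TERMS = {
    "S": lambda j: j,
    "MS": lambda j: math.sqrt(j),
    "L": lambda j: 1.0 if j == 1.0 else 0.0,
    "VS": lambda j: j ** 2,
    "US": lambda j: 1.0 - j,
}

# Antecedent fuzzy sets over u1..u8 as printed in the text (M1..M4 coincide).
M1_TO_M4 = [0.9070, 0.6766, 0.4152, 0.2096, 0.0870, 0.0297, 0.0084, 0.0019]
M5 = [0.0930, 0.3234, 0.4994, 0.2910, 0.1453, 0.0622, 0.0228, 0.0072]
M6 = [0.0425, 0.1594, 0.3234, 0.5006, 0.6621, 0.7904, 0.8808, 0.9378]

RULES = [(M1_TO_M4, "S"), (M1_TO_M4, "MS"), (M1_TO_M4, "L"),
         (M1_TO_M4, "VS"), (M5, "S"), (M6, "US")]


def lukasiewicz(antecedent, term):
    """Relation matrix with entries min(1, 1 - mu_M(u) + mu_Y(j))."""
    mu_y = OUTPUT_TERMS[term]
    return [[min(1.0, 1.0 - m + mu_y(j)) for j in J] for m in antecedent]


def intersect(relations):
    """Element-wise minimum of equally shaped relation matrices."""
    rows, cols = len(relations[0]), len(relations[0][0])
    return [[min(r[i][k] for r in relations) for k in range(cols)]
            for i in range(rows)]


R = intersect([lukasiewicz(m, term) for m, term in RULES])

if __name__ == "__main__":
    for label, row in zip(range(1, 9), R):
        print(f"u{label}", " ".join(f"{v:.4f}" for v in row))
```

Taking the minimum over all rules means that each entry of R is governed by whichever rule is most restrictive for that pair (u, j).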

To determine the CR-level it is necessary to apply the rule of composite conclusion in a fuzzy environment [6]:

$$E_k = G_k \circ R, \tag{9}$$

where Ek is the acceptability degree of risk relative to the k-th CR-level (k = 1–8) and Gk is the mapping of the k-th CR-level in the form of a fuzzy subset of the discrete universe J. Then, choosing the composite rule as [6]

$$\mu_{E_k}(j) = \max_{j \in J}\{\min[\mu_{G_k}(j),\ \mu_R(j)]\} \tag{10}$$

and assuming that in this case

$$\mu_{G_k}(j) = \begin{cases} 0, & j \ne j_k; \\ 1, & j = j_k; \end{cases}$$

we finally have μEk(u) = μR(jk, u); in other words, Ek is the k-th row of the matrix R. Now, to classify the CR-levels, a defuzzification procedure is applied to the fuzzy outputs of the model. So, for the estimated concept u1 of risk acceptability, the fuzzy interpretation of the corresponding CR-level is the following fuzzy subset of the universe J: E1 = {0.0930/0; 0.0930/0.1; 0.0930/0.2; 0.0930/0.3; 0.0930/0.4; 0.0930/0.5; 0.0930/0.6; 0.0930/0.7; 0.0930/0.8; 0.0930/0.9; 0.9575/1}. Setting the level sets E1α and calculating the corresponding powers M(E1α) by the formula

$$M(E_{1\alpha}) = \frac{1}{n}\sum_{r=1}^{n} x_r,$$

we have:
• for 0 < α < 0.0930: Δα = 0.0930, E1α = {0; 0.1; 0.2; 0.3; 0.4; 0.5; 0.6; 0.7; 0.8; 0.9; 1}, M(E1α) = 0.5;
• for 0.0930 < α < 0.9575: Δα = 0.8645, E1α = {1}, M(E1α) = 1.
For numerical estimation of the fuzzy outputs Ek (k = 1–8) the following formula can be applied [8, 9]:

$$F(E_k) = \frac{1}{\alpha_{\max}} \int_0^{\alpha_{\max}} M(E_{k\alpha})\, d\alpha, \quad (k = 1\text{–}8), \tag{11}$$


where αmax is the maximum value on Ek. Thus, in this case we have:

$$F(E_1) = \frac{1}{0.9575} \int_0^{0.9575} M(E_{1\alpha})\, d\alpha = \frac{0.5 \cdot 0.0930 + 1.0 \cdot 0.8645}{0.9575} = 0.9514.$$

For the estimated concept u8 of risk acceptability, the corresponding CR-level is reflected by the following fuzzy set: E8 = {0.9928/0; 0.9622/0.1; 0.8622/0.2; 0.7622/0.3; 0.6622/0.4; 0.5622/0.5; 0.4622/0.6; 0.3622/0.7; 0.2622/0.8; 0.1622/0.9; 0.0622/1}, for which we have, respectively:
• for 0 < α < 0.0622: Δα = 0.0622, E8α = {0; 0.1; 0.2; 0.3; 0.4; 0.5; 0.6; 0.7; 0.8; 0.9; 1}, M(E8α) = 0.5;
• for 0.0622 < α < 0.1622: Δα = 0.1, E8α = {0; 0.1; 0.2; 0.3; 0.4; 0.5; 0.6; 0.7; 0.8; 0.9}, M(E8α) = 0.45;
• for 0.1622 < α < 0.2622: Δα = 0.1, E8α = {0; 0.1; 0.2; 0.3; 0.4; 0.5; 0.6; 0.7; 0.8}, M(E8α) = 0.40;
• for 0.2622 < α < 0.3622: Δα = 0.1, E8α = {0; 0.1; 0.2; 0.3; 0.4; 0.5; 0.6; 0.7}, M(E8α) = 0.35;
• for 0.3622 < α < 0.4622: Δα = 0.1, E8α = {0; 0.1; 0.2; 0.3; 0.4; 0.5; 0.6}, M(E8α) = 0.30;
• for 0.4622 < α < 0.5622: Δα = 0.1, E8α = {0; 0.1; 0.2; 0.3; 0.4; 0.5}, M(E8α) = 0.25;
• for 0.5622 < α < 0.6622: Δα = 0.1, E8α = {0; 0.1; 0.2; 0.3; 0.4}, M(E8α) = 0.20;
• for 0.6622 < α < 0.7622: Δα = 0.1, E8α = {0; 0.1; 0.2; 0.3}, M(E8α) = 0.15;
• for 0.7622 < α < 0.8622: Δα = 0.1, E8α = {0; 0.1; 0.2}, M(E8α) = 0.10;
• for 0.8622 < α < 0.9622: Δα = 0.1, E8α = {0; 0.1}, M(E8α) = 0.05;
• for 0.9622 < α < 0.9928: Δα = 0.0307, E8α = {0}, M(E8α) = 0.
Then the numerical estimate of the fuzzy output E8 is:

$$F(E_8) = \frac{1}{0.9928} \int_0^{0.9928} M(E_{8\alpha})\, d\alpha = 0.2579.$$
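The level-set calculation above is piecewise constant, so the integral in (11) reduces to a finite sum. A minimal sketch of this point estimate is given below; it takes a fuzzy output as a list of membership degrees over J, and the function name point_estimate is an illustrative choice.

```python
# Discrete universe J and two fuzzy outputs copied from the text.
J = [k / 10 for k in range(11)]
E1 = [0.0930] * 10 + [0.9575]
E8 = [0.9928, 0.9622, 0.8622, 0.7622, 0.6622, 0.5622,
      0.4622, 0.3622, 0.2622, 0.1622, 0.0622]


def point_estimate(memberships, support=J):
    """Average of the level-set means M(E_alpha) over alpha in (0, alpha_max]."""
    levels = sorted({m for m in memberships if m > 0.0})
    if not levels:
        return 0.0  # an empty fuzzy set has no level sets to average
    area, previous = 0.0, 0.0
    for level in levels:
        # Level set for alpha in (previous, level]: elements with degree >= level.
        cut = [x for x, m in zip(support, memberships) if m >= level]
        area += (level - previous) * sum(cut) / len(cut)
        previous = level
    return area / levels[-1]


if __name__ == "__main__":
    print(f"F(E1) = {point_estimate(E1):.4f}")
    print(f"F(E8) = {point_estimate(E8):.4f}")
```

With the four-decimal inputs printed here the results agree with the stated F(E1) and F(E8) to about three decimals; the last digit can differ because the text evidently works with unrounded membership degrees.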

Point estimates for the remaining fuzzy outputs are calculated similarly: for the estimated concept u2 of risk acceptability, F(E2) = 0.8077; for u3, F(E3) = 0.5741; for u4, F(E4) = 0.4689; for u5, F(E5) = 0.3964; for u6, F(E6) = 0.3324; for u7, F(E7) = 0.2863. F(E8) = 0.2579 is the smallest defuzzified output of the applied model of multicriterion assessment of the CR-level; as an upper bound it corresponds to the consolidated estimate of the CR-level “too high or impermissible”. From the point of view of the influence of the CR-factors, the other defuzzified outputs give, respectively:
• 0.2863 is the upper bound of the estimate “very high or significant”;
• 0.3324 is the upper bound of the estimate “more than high”;
• 0.3964 is the upper bound of the estimate “high”;
• 0.4689 is the upper bound of the estimate “low”;
• 0.5741 is the upper bound of the estimate “more than low”;
• 0.8077 is the upper bound of the estimate “very low or insignificant”;
• 0.9514 is the upper bound of the estimate “too low or absent”.

As the criterion for forming the final estimate, the equality

$$E = \frac{F(E_k)}{F_{\max}} \times 100 \tag{12}$$

is applied, where F(Ek) is the estimate of the k-th CR-level (and, more broadly, any other estimate) and Fmax = F(E1) = 0.9514. Then, under the accepted assumptions, the justified scale for estimating the CR-level within the segment [0; 100] is summarized in Table 5.

Table 5. Gradation of CR-levels using the fuzzy inference

| Interval | CR-level |
|---|---|
| (84.90; 100] | Too low or absent |
| (60.34; 84.90] | Very low or insignificant |
| (49.29; 60.34] | More than low |
| (41.66; 49.29] | Low |
| (34.94; 41.66] | High |
| (30.09; 34.94] | More than high |
| (27.11; 30.09] | Very high or significant |
| [0; 27.11] | Too high or impermissible |
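The interval bounds of Table 5 follow directly from (12) and the eight point estimates F(E1)–F(E8). The sketch below recomputes the bounds and maps a final estimate on [0; 100] to its CR-level; the names upper_bounds and classify are illustrative, and the half-open interval convention is the one shown in the table.

```python
# Point estimates F(E1)..F(E8) and the matching CR-level names, from the text.
F_VALUES = [0.9514, 0.8077, 0.5741, 0.4689, 0.3964, 0.3324, 0.2863, 0.2579]
LABELS = [
    "Too low or absent",
    "Very low or insignificant",
    "More than low",
    "Low",
    "High",
    "More than high",
    "Very high or significant",
    "Too high or impermissible",
]


def upper_bounds(f_values):
    """Scale each point estimate by the largest one, as in criterion (12)."""
    f_max = max(f_values)
    return [100.0 * f / f_max for f in f_values]


BOUNDS = upper_bounds(F_VALUES)  # descending: 100, 84.90, 60.34, ...


def classify(score, bounds=BOUNDS, labels=LABELS):
    """Return the CR-level whose interval (lower; upper] contains the score."""
    for k in range(len(bounds) - 1):
        if score > bounds[k + 1]:
            return labels[k]
    return labels[-1]  # closed lowest interval [0; 27.11]


if __name__ == "__main__":
    print([round(b, 2) for b in BOUNDS])
    print(classify(60.47), "|", classify(30.01))
```

Because every bound is derived from a defuzzified rule output, changing a rule or a membership function shifts the whole scale consistently, which is the sense in which this gradation is justified rather than conventional.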

B. CR-levels classification

To construct the fuzzy inference system for CR-level estimation, the statements d1–d6 above are chosen as the basic verbal model. As alternatives, ten hypothetical states ak (k = 1–10) are used, which have passed expert examination on a five-mark grading system for the influence of the CR-factors xi (i = 1–5) on their CR-levels (see Table 4). In this case, for the terms in the left-hand parts of the rules d1–d6, fuzzification can be applied somewhat differently: each term is represented as a fuzzy subset of the finite set of estimated alternatives (countries) {a1, a2, …, a10} as Ai = {μAi(a1)/a1; μAi(a2)/a2; …; μAi(a10)/a10}, where μAi(at) is the value of the membership function of the fuzzy set Ai, i.e. it characterizes the country at with respect to the assessment criterion Ai. As the membership function, the Gaussian function is chosen in the form

$$\mu_{A_i}(a_t) = \exp\{-[e_i(a_t) - 5]^2/\sigma_i^2\},$$

where ei(at) is the consolidated expert evaluation of the country at (t = 1–10), given on the five-mark grading system, of how far the i-th CR-factor can be regarded as non-existent, and σi² is the density of the location of the nearest elements, which is chosen as 4 for all cases of fuzzification. Then, treating each of the CR-factors xi (i = 1–5) as a linguistic variable, one of its terms, namely “non-existent risk”, can be described by the corresponding fuzzy subset Ai of the discrete universe U = {a1, a2, …, a10} as follows [7]:
• A1 = {0.9394/a1; 0.9944/a2; 0.6766/a3; 0.8688/a4; 0.7788/a5; 0.5912/a6; 0.1510/a7; 0.1510/a8; 1.0000/a9; 0.4650/a10};
• A2 = {0.9845/a1; 0.9394/a2; 0.7788/a3; 0.5485/a4; 0.2230/a5; 0.3149/a6; 0.0713/a7; 0.0837/a8; 0.9845/a9; 0.3149/a10};
• A3 = {0.9394/a1; 0.9506/a2; 0.4650/a3; 0.3149/a4; 0.3679/a5; 0.1054/a6; 0.0297/a7; 0.0297/a8; 0.9944/a9; 0.6766/a10};
• A4 = {0.9845/a1; 0.2821/a2; 0.7185/a3; 0.2821/a4; 0.1510/a5; 0.0297/a6; 0.0837/a7; 0.0109/a8; 0.9944/a9; 0.8688/a10};
• A5 = {0.8688/a1; 0.6766/a2; 0.4650/a3; 0.0837/a4; 0.0837/a5; 0.0135/a6; 0.0468/a7; 0.0036/a8; 0.9845/a9; 0.5698/a10}.
Then, taking into account these formalisms and the formal descriptions of the terms from the right-hand parts of the rules d1–d6 presented above, the basic model is written as follows:
d1: (x1 = A1) & (x3 = A3) ⇒ (y = S);
d2: (x1 = A1) & (x3 = A3) & (x4 = A4) ⇒ (y = MS);
d3: (x1 = A1) & (x2 = A2) & (x3 = A3) & (x4 = A4) & (x5 = A5) ⇒ (y = L);
d4: (x1 = A1) & (x2 = A2) & (x3 = A3) & (x4 = A4) ⇒ (y = VS);
d5: (x1 = ¬A1) & (x2 = A2) & (x3 = A3) & (x4 = A4) ⇒ (y = S);
d6: (x1 = ¬A1) & (x3 = ¬A3) & (x5 = ¬A5) ⇒ (y = US).
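The fuzzy sets A1–A5 follow from the expert scores of Table 4 and the Gaussian function with centre 5 and σi² = 4. A minimal sketch of that step is shown below; the table layout and the name no_risk_membership are illustrative choices.

```python
import math

# Consolidated expert scores e_1..e_5 of the ten countries (Table 4).
SCORES = {
    "a1": [4.50, 4.75, 4.50, 4.75, 4.25],
    "a2": [4.85, 4.50, 4.55, 2.75, 3.75],
    "a3": [3.75, 4.00, 3.25, 3.85, 3.25],
    "a4": [4.25, 3.45, 2.85, 2.75, 1.85],
    "a5": [4.00, 2.55, 3.00, 2.25, 1.85],
    "a6": [3.55, 2.85, 2.00, 1.25, 0.85],
    "a7": [2.25, 1.75, 1.25, 1.85, 1.50],
    "a8": [2.25, 1.85, 1.25, 0.75, 0.25],
    "a9": [5.00, 4.75, 4.85, 4.85, 4.75],
    "a10": [3.25, 2.85, 3.75, 4.25, 3.50],
}


def no_risk_membership(score, ideal=5.0, sigma_sq=4.0):
    """Degree to which a score supports the term 'non-existent risk'."""
    return math.exp(-((score - ideal) ** 2) / sigma_sq)


# A[i] lists the membership degrees of factor x_(i+1) over countries a1..a10.
A = [[no_risk_membership(scores[i]) for scores in SCORES.values()]
     for i in range(5)]

if __name__ == "__main__":
    for i, degrees in enumerate(A, start=1):
        print(f"A{i} =", [f"{d:.4f}" for d in degrees])
```

A score of 5 maps to full membership, and the degree falls off smoothly as the score moves away from the top mark.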

Similarly, the intersections of the fuzzy sets from the left-hand parts of the rules are established. In the discrete case, these are determined by finding the minimum of the corresponding values of the membership functions, namely:
d1: μM1(u) = min{μA1(u), μA3(u)}, M1 = {0.9394/a1; 0.9506/a2; 0.4650/a3; 0.3149/a4; 0.3679/a5; 0.1054/a6; 0.0297/a7; 0.0297/a8; 0.9944/a9; 0.4650/a10};
d2: μM2(u) = min{μA1(u), μA3(u), μA4(u)}, M2 = {0.9394/a1; 0.2821/a2; 0.4650/a3; 0.2821/a4; 0.1510/a5; 0.0297/a6; 0.0297/a7; 0.0109/a8; 0.9944/a9; 0.4650/a10};
d3: μM3(u) = min{μA1(u), μA2(u), μA3(u), μA4(u), μA5(u)}, M3 = {0.8688/a1; 0.2821/a2; 0.4650/a3; 0.0837/a4; 0.0837/a5; 0.0135/a6; 0.0297/a7; 0.0036/a8; 0.9845/a9; 0.3149/a10};
d4: μM4(u) = min{μA1(u), μA2(u), μA3(u), μA4(u)}, M4 = {0.9394/a1; 0.2821/a2; 0.4650/a3; 0.2821/a4; 0.1510/a5; 0.0297/a6; 0.0297/a7; 0.0109/a8; 0.9845/a9; 0.3149/a10};
d5: μM5(u) = min{1 − μA1(u), μA2(u), μA3(u), μA4(u)}, M5 = {0.0606/a1; 0.0056/a2; 0.3234/a3; 0.1312/a4; 0.1510/a5; 0.0297/a6; 0.0297/a7; 0.0109/a8; 0.0000/a9; 0.3149/a10};
d6: μM6(u) = min{1 − μA1(u), 1 − μA3(u), 1 − μA5(u)}, M6 = {0.0606/a1; 0.0056/a2; 0.3234/a3; 0.1312/a4; 0.2212/a5; 0.4088/a6; 0.8490/a7; 0.8490/a8; 0.0000/a9; 0.3234/a10}.
As a result, the rules are described in a more compact form:
d1: (x = M1) ⇒ (y = S);
d2: (x = M2) ⇒ (y = MS);
d3: (x = M3) ⇒ (y = L);
d4: (x = M4) ⇒ (y = VS);
d5: (x = M5) ⇒ (y = S);
d6: (x = M6) ⇒ (y = US).
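The antecedent sets M1–M6 combine the factor sets by minimum, with a negated factor replaced by its complement 1 − μ. The sketch below encodes each rule as a list of signed factor indices (a negative index marks a negated term); this encoding and the names ANTECEDENTS and antecedent_set are illustrative conventions, not notation from the paper.

```python
# Fuzzy sets A1..A5 over the countries a1..a10, copied from the text.
A = {
    1: [0.9394, 0.9944, 0.6766, 0.8688, 0.7788, 0.5912, 0.1510, 0.1510, 1.0000, 0.4650],
    2: [0.9845, 0.9394, 0.7788, 0.5485, 0.2230, 0.3149, 0.0713, 0.0837, 0.9845, 0.3149],
    3: [0.9394, 0.9506, 0.4650, 0.3149, 0.3679, 0.1054, 0.0297, 0.0297, 0.9944, 0.6766],
    4: [0.9845, 0.2821, 0.7185, 0.2821, 0.1510, 0.0297, 0.0837, 0.0109, 0.9944, 0.8688],
    5: [0.8688, 0.6766, 0.4650, 0.0837, 0.0837, 0.0135, 0.0468, 0.0036, 0.9845, 0.5698],
}

# Left-hand sides of d1..d6; a negative index means the complement of that set.
ANTECEDENTS = {
    "d1": [1, 3],
    "d2": [1, 3, 4],
    "d3": [1, 2, 3, 4, 5],
    "d4": [1, 2, 3, 4],
    "d5": [-1, 2, 3, 4],
    "d6": [-1, -3, -5],
}


def antecedent_set(factors):
    """Intersect (min) the listed factor sets, complementing negated ones."""
    columns = []
    for f in factors:
        values = A[abs(f)]
        columns.append([1.0 - v for v in values] if f < 0 else values)
    return [min(col[t] for col in columns) for t in range(len(columns[0]))]


M = {rule: antecedent_set(factors) for rule, factors in ANTECEDENTS.items()}

if __name__ == "__main__":
    for rule, degrees in M.items():
        print(rule, [f"{d:.4f}" for d in degrees])
```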

As above, these rules are transformed by Lukasiewicz’s implication (8) into the fuzzy relations R1, R2, …, R6, whose intersection gives the following general matrix solution R (rows a1–a10, columns j ∈ J):

$$
R = \begin{array}{c|ccccccccccc}
 & 0 & 0.1 & 0.2 & 0.3 & 0.4 & 0.5 & 0.6 & 0.7 & 0.8 & 0.9 & 1 \\ \hline
a_1 & 0.0606 & 0.0706 & 0.1006 & 0.1312 & 0.1312 & 0.1312 & 0.1312 & 0.1312 & 0.1312 & 0.1312 & 0.9394 \\
a_2 & 0.0494 & 0.1494 & 0.2494 & 0.3494 & 0.4494 & 0.5494 & 0.6494 & 0.7179 & 0.7179 & 0.7179 & 0.9944 \\
a_3 & 0.5350 & 0.5350 & 0.5350 & 0.5350 & 0.5350 & 0.5350 & 0.5350 & 0.5350 & 0.5350 & 0.5350 & 0.6766 \\
a_4 & 0.6851 & 0.7279 & 0.7579 & 0.8079 & 0.8779 & 0.9163 & 0.9163 & 0.9163 & 0.9163 & 0.9163 & 0.8688 \\
a_5 & 0.6321 & 0.7321 & 0.8321 & 0.9163 & 0.9163 & 0.9163 & 0.9163 & 0.9163 & 0.9163 & 0.8788 & 0.7788 \\
a_6 & 0.8946 & 0.9803 & 0.9865 & 0.9865 & 0.9865 & 0.9865 & 0.9865 & 0.8912 & 0.7912 & 0.6912 & 0.5912 \\
a_7 & 0.9703 & 0.9703 & 0.9510 & 0.8510 & 0.7510 & 0.6510 & 0.5510 & 0.4510 & 0.3510 & 0.2510 & 0.1510 \\
a_8 & 0.9703 & 0.9964 & 0.9510 & 0.8510 & 0.7510 & 0.6510 & 0.5510 & 0.4510 & 0.3510 & 0.2510 & 0.1510 \\
a_9 & 0.0056 & 0.0155 & 0.0155 & 0.0155 & 0.0155 & 0.0155 & 0.0155 & 0.0155 & 0.0155 & 0.0155 & 1.0000 \\
a_{10} & 0.5350 & 0.6350 & 0.6851 & 0.6851 & 0.6851 & 0.6851 & 0.6851 & 0.6851 & 0.6851 & 0.6851 & 0.6766
\end{array}
$$

On the discrete set J the matrix R reflects the cause-effect relations between the consolidated expert assessments of the countries by the CR-factors xi (i = 1–5), on the one hand, and their corresponding CR-levels, on the other. According to (9) and (10), the k-th row of the matrix R is a fuzzy conclusion about the aggregated CR-level of the k-th alternative (country). In order to interpret each of these fuzzy conclusions numerically, it is necessary to apply the defuzzification procedure based on the method of point estimation of fuzzy sets. In particular, for the fuzzy conclusion regarding the CR-level of the first alternative, E1 = {0.0606/0; 0.0706/0.1; 0.1006/0.2; 0.1312/0.3; 0.1312/0.4; 0.1312/0.5; 0.1312/0.6; 0.1312/0.7; 0.1312/0.8; 0.1312/0.9; 0.9394/1}, according to the above arguments we have:
• for 0 < α < 0.0606: Δα = 0.0606, E1α = {0; 0.1; 0.2; 0.3; 0.4; 0.5; 0.6; 0.7; 0.8; 0.9; 1}, M(E1α) = 0.5;
• for 0.0606 < α < 0.0706: Δα = 0.01, E1α = {0.1; 0.2; 0.3; 0.4; 0.5; 0.6; 0.7; 0.8; 0.9; 1}, M(E1α) = 0.55;
• for 0.0706 < α < 0.1006: Δα = 0.03, E1α = {0.2; 0.3; 0.4; 0.5; 0.6; 0.7; 0.8; 0.9; 1}, M(E1α) = 0.60;
• for 0.1006 < α < 0.1312: Δα = 0.0306, E1α = {0.3; 0.4; 0.5; 0.6; 0.7; 0.8; 0.9; 1}, M(E1α) = 0.65;
• for 0.1312 < α < 0.9394: Δα = 0.8082, E1α = {1}, M(E1α) = 1.


Then, according to (11), the numerical estimate of E1 is:

$$F(E_1) = \frac{1}{0.9394} \int_0^{0.9394} M(E_{1\alpha})\, d\alpha = 0.9388.$$

The point estimates of the fuzzy conclusions about the CR-levels of the other alternative countries are established similarly: F(E2) = 0.7687; F(E3) = 0.6047; F(E4) = 0.5370; F(E5) = 0.5206; F(E6) = 0.4552; F(E7) = 0.3055; F(E8) = 0.3001; F(E9) = 0.9927; F(E10) = 0.5140. As a result, in accordance with the classification presented in Table 5, the total estimates of the CR-levels on the scale of the interval [0; 100] are obtained by simply multiplying these values by 100.
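The whole chain of this subsection, from a country's five expert scores to its point estimate, can be followed end to end in a few lines. The sketch below strings together the Gaussian fuzzification (centre 5, σi² = 4), the six rules with Lukasiewicz's implication, the min-intersection, and the level-set point estimate (11). The rule encoding and all function names are illustrative; membership degrees are kept unrounded, so results may differ from the printed values in the last digit.

```python
import math

J = [k / 10 for k in range(11)]
TERMS = {
    "S": lambda j: j,
    "MS": lambda j: math.sqrt(j),
    "L": lambda j: 1.0 if j == 1.0 else 0.0,
    "VS": lambda j: j ** 2,
    "US": lambda j: 1.0 - j,
}
# Each rule: (list of (factor index, negated?) pairs, consequent term).
RULES = [
    ([(0, False), (2, False)], "S"),
    ([(0, False), (2, False), (3, False)], "MS"),
    ([(0, False), (1, False), (2, False), (3, False), (4, False)], "L"),
    ([(0, False), (1, False), (2, False), (3, False)], "VS"),
    ([(0, True), (1, False), (2, False), (3, False)], "S"),
    ([(0, True), (2, True), (4, True)], "US"),
]


def fuzzify(score, ideal=5.0, sigma_sq=4.0):
    """Gaussian degree of the term 'non-existent risk' for one expert score."""
    return math.exp(-((score - ideal) ** 2) / sigma_sq)


def conclusion(scores):
    """Row of R for one country: min over rules of the Lukasiewicz implication."""
    mu = [fuzzify(e) for e in scores]
    row = []
    for j in J:
        per_rule = []
        for antecedent, term in RULES:
            m = min((1.0 - mu[i]) if neg else mu[i] for i, neg in antecedent)
            per_rule.append(min(1.0, 1.0 - m + TERMS[term](j)))
        row.append(min(per_rule))
    return row


def point_estimate(memberships, support=J):
    """Average of level-set means over alpha in (0, alpha_max], as in (11)."""
    levels = sorted({m for m in memberships if m > 0.0})
    if not levels:
        return 0.0
    area, previous = 0.0, 0.0
    for level in levels:
        cut = [x for x, m in zip(support, memberships) if m >= level]
        area += (level - previous) * sum(cut) / len(cut)
        previous = level
    return area / levels[-1]


def cr_estimate(scores):
    """Final estimate on [0; 100] for a country with the given five scores."""
    return 100.0 * point_estimate(conclusion(scores))


if __name__ == "__main__":
    print(f"a1: {cr_estimate([4.50, 4.75, 4.50, 4.75, 4.25]):.2f}")
    print(f"a8: {cr_estimate([2.25, 1.85, 1.25, 0.75, 0.25]):.2f}")
    print(f"a9: {cr_estimate([5.00, 4.75, 4.85, 4.85, 4.75]):.2f}")
```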

7 Conclusion

Two approaches to the evaluation of CR-levels have been considered, both based on expert conclusions regarding the degrees of influence of the factors xi (i = 1–5) on the CR-level. Applying the method of weighted estimates of attributes made it possible to determine the coefficient of rank correlation of the CR-factors xi (i = 1–5), which indicated not only a sufficiently high degree of agreement between the expert opinions but also a close relationship between the considered CR-factors. In addition, within the framework of this approach the generalized values of the weights of the CR-factors xi (i = 1–5) were calculated by analytical reasoning, which became the basis for justifying and developing recommendations for forming the final estimates of the CR-levels by the established comparison criterion on the scale of the interval [0; 100]. The method of weighted estimates can be used in the decision-making process as an effective mechanism for multicriterion evaluation of alternatives characterized by a certain set of attributes.

Fuzzy inference, which is the essence of the second approach, solves the discussed problem similarly, with the only difference that it relies not on indirect but on direct cause-effect relations between the factors xi (i = 1–5) and the CR-levels. As a result of applying fuzzy inference, it was possible to formulate a justified scale of CR-level gradation, and it is relatively easy to obtain the final estimates of the CR-levels. A comparative analysis of the CR-level estimates obtained by both methods for the hypothetical alternatives (countries) ak (k = 1–10) is presented in Table 6. As can be seen from Table 6, the orders of the final estimates of the CR-levels differ only for alternatives a4, a5 and a10. The verbal denominations of the CR-levels also do not always coincide, which is explained by the different approaches to forming the grading scale for the final estimates. Nevertheless, the classification based on the fuzzy approach inspires more confidence, since in this case the cause-effect relations between the influence factors and the CR-levels are traced, even though these relations are formulated on the basis of trivial, but consistent and sufficiently valid, implicative rules.

Table 6. Comparative analysis of the obtained results

| ai | Weighted estimation: final estimate | Weighted estimation: CR-level according to uniform gradation | Weighted estimation: order | Fuzzy inference: final estimate | Fuzzy inference: CR-level according to fuzzy gradation | Fuzzy inference: order |
|---|---|---|---|---|---|---|
| a1 | 91.27 | Too low or absent | 2 | 93.88 | Too low or absent | 2 |
| a2 | 84.62 | Very low or insignificant | 3 | 76.87 | Very low or insignificant | 3 |
| a3 | 73.30 | More than low | 4 | 60.47 | Very low or insignificant | 4 |
| a4 | 64.47 | Low | 6 | 53.70 | More than low | 5 |
| a5 | 57.64 | High | 7 | 52.06 | More than low | 6 |
| a6 | 47.13 | More than high | 8 | 45.52 | Low | 8 |
| a7 | 35.54 | Very high or significant | 9 | 30.55 | More than high | 9 |
| a8 | 29.06 | Too high or impermissible | 10 | 30.01 | Very high or significant | 10 |
| a9 | 97.04 | Too low or absent | 1 | 99.27 | Too low or absent | 1 |
| a10 | 68.55 | Low | 5 | 51.40 | More than low | 7 |

Acknowledgment. The authors consider it necessary to express their sincere appreciation to the directorate of the Institute of Control Systems of the Azerbaijan National Academy of Sciences, represented by professors T. A. Aliev and O. G. Nusratov, for the help they rendered during the writing and preparation of this article.

References

1. McClenny, L.D., et al.: BoolFilter: an R package for estimation and identification of partially-observed boolean dynamical systems. Biometrics 18(1), 2–8 (2017)
2. Mussel, C., et al.: BoolNet package vignette. http://medicinaycomplejidad.org/pdf/BoolNet_package_vignette.pdf. Accessed 13 Jan 2017
3. PwC Global Risk podcast series. https://www.pwc.com/gx/en/services/advisory/consulting/risk/coso-erm-framework/podcasts.html. Accessed 13 Jan 2017
4. Lin, L.: A note on the concordance correlation coefficient. Biometrics 56, 324–325 (2012)
5. Lin, L., Hedayat, A.S., Wu, W.: Statistical Tools for Measuring Agreement. Springer, New York (2012)
6. Zadeh, L.A.: The concept of a linguistic variable and its application to approximate reasoning. Inf. Sci. 8(3), 199–249 (1965)
7. Lukasiewicz, J.: On three-valued logic. In: Borkowski, L. (ed.) Selected Works, pp. 87–88. North-Holland Publishing Company, Amsterdam (1970)
8. Rzayev, R.R.: Intellectual Analyses in Decision Support Systems. LAP Lambert Academic Publishing, Verlag (2013). (in Russian)
9. Rzayev, R.R.: Analytical Support for Decision-Making in Organizational Systems. Palmarium Academic Publishing, Saarbrucken (2016). (in Russian)

Conceptual Model for the New Generation of Data Warehouse System Catalog

Danijela Jaksic1(✉), Patrizia Poscic1, and Vladan Jovanovic2

1 Department of Informatics, University of Rijeka, Rijeka, Croatia
{danijela.jaksic,patrizia}@inf.uniri.hr
2 Department of Computer Science, Georgia Southern University, Statesboro, GA, USA
[email protected]

Abstract. This paper introduces a formal definition of a Data Vault model and a conceptual data model of a new Data Warehouse (DW) system catalog (Metadata Vault Repository - MDV), which is based on the Data Vault (DV) method for database modeling. The goal of this conceptual MDV model is to serve as a basis for the future development of a new generation of DW temporal system catalogs: catalogs that will track and manage changes in the DW data and metadata, as well as in its schemas. The main contributions of this paper are: (a) a formal definition of the DV model and its main concepts, (b) a conceptual MDV model, (c) a final set of fundamental changes over the DW schema, and (d) a formal algebra for DW schema maintenance.

Keywords: Data warehouse · Data vault · Schema evolution · Conceptual model · System catalog

1 Introduction

A Data Warehouse (DW) environment is extremely dynamic. A number of (heterogeneous) data sources are subject to frequent changes of data and structure, along with frequent changes in the information requirements set by business users. Data Warehouses implement extremely complex tasks. They must at all times be able to adapt to changes in the data sources as well as satisfy users’ requests for information. The problem explored in this paper is known and recognized in the literature as the DW evolution problem [1–7, 9, 17–19]: tracking and storing changes in the scope and structure of data and metadata over a very long time period. This paper presents a formal definition of a Data Vault model and a conceptual data model of a new DW system catalog (Metadata Vault Repository - MDV), which is based on the Data Vault (DV) method for database modeling. This conceptual MDV model serves as a first step towards the development of a new generation of DW system catalogs, catalogs that will track and manage changes in the DW data and metadata, as well as in its schemas. With a DW system and its catalog designed this way, DW schema evolution will be carried out only by expanding the existing schema and without loss of information.

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 813–825, 2020. https://doi.org/10.1007/978-3-030-12388-8_55


D. Jaksic et al.

System catalogs built on this MDV model will serve as an extension of traditional relational database system catalogs. In order to build a practical prototype and to test the proposed solution, a permanent and comprehensive conceptual metadata repository model for tracking DW data and schema changes was developed, along with a final set of fundamental changes over the DW schema and a formal algebra for DW schema maintenance.

The paper is organized as follows: Sect. 2 gives a brief summary of related work; Sect. 3 briefly describes the Data Vault modeling method; Sect. 4 gives a formal definition of the Data Vault model and presents the conceptual MDV system catalog model; Sect. 5 defines our research hypothesis, along with a formal algebra for DW schema maintenance and a formal set of DV schema changes; and Sect. 6 gives the conclusions and guidelines for future work.

2 Related Work

With the evolution of technology, Web 2.0 tools and cloud computing, as well as the phenomenon of Big Data, the schema evolution problem has become even harder to solve because of the sheer volume of existing and newly created data, the speed at which it is created, and the variety of its source systems and data types. Relational databases were not designed to cope with the agility, scale, commodity storage and processing power demanded by modern applications. This can be seen in the related academic research, as well as in industry trends.

With respect to the literature, DW evolution can be traced through three approaches: schema evolution, schema versioning and view maintenance. The first two approaches [1–7, 9, 17–19] are of more interest because they are based on a DW defined as a multidimensional schema, whereas the third is based on a DW defined as a set of materialized views. A good comparative study of three dominant DW architecture approaches (Inmon, Kimball and Data Vault) is given in [21]. An extensive state-of-the-art review of the DW schema evolution problem can be found in [20]; from the related work it can generally be concluded that schema evolution and versioning are still demanding in terms of invested time and resources. It is necessary to balance the resource requirements and the quality of the schema evolution process. Perhaps the biggest problems are the preservation of schema consistency and data integrity (there is still no integrated system of record) and the simultaneous execution of temporal queries against multiple versions of the schema. In addition, migration and transformation of data are still slow and expensive, information is still lost during these processes, and effective integration, organization and management of metadata are lacking.

Different approaches to solving the DW schema evolution problem are presented in the literature (including a variety of techniques, algorithms, algebras, models, prototypes, methodologies and frameworks), but there is still no widely accepted solution or general framework for managing DW schema changes. More importantly, previous research does not emphasize the fact that DW requirements are growing in the scope and structure of data and metadata (a growing number of data sources and more new and different types of data), which additionally requires developing some

Conceptual Model for the New Generation of Data Warehouse System Catalog


new approaches and solutions to the schema evolution problem. In [22] a comparative analysis is made of the four most prominent DW data models (namely the relational/normalized model, the data vault model, the anchor model and the dimensional model) and the Domain/Mapping model (DMM) is presented. The DMM has been designed to reconcile the semantic differences of existing data warehouse conceptual models and allows temporal aspects to be captured, but only on a conceptual level. We believe that the Data Vault modeling methodology is better suited for logical and physical modeling of DW repositories, especially with regard to its usage and status in the industry [25].

As for industry trends, many organizations are already combining NoSQL database technology with relational databases in order to gain the needed flexibility and scalability [23, 24]; traditional relational databases alone are not enough. Many of the leading database management system vendors are upgrading their products to address the schema evolution problem and to track the history of schema changes. For example, Microsoft implemented temporal tables in its SQL Server 2016 system catalog, and the 2017 version added graph database capabilities for modeling many-to-many relationships, all with the goal of making it easier to track and manage data and schema changes. This is an early version of a temporal system catalog for a database management system. The aim of this paper is to take this idea from industry and develop it further in the area of data warehousing, so that not only traditional relational databases but also data warehouses will have temporal system catalogs, and to move this industry approach from a relational model to a Data Vault model, which we believe is better suited to sustaining schema evolution in today's rapidly changing environment.

3 About the Data Vault Method

Data Vault (DV) is a method for database modeling, specially designed to build a DW for long-term storage of historical data collected from various sources [11–16]. The DV method is based on the assumption that the DW environment is in a state of constant change, so it emphasizes the need to track the origin of data through an empirically defined set of metadata. This makes it possible to trace values back to the data source and to track the history of changes. Structural data is explicitly separated from descriptive attributes, no matter which source the data comes from, which allows for fast parallel loads, reduces costs, and makes the model flexible to changes in the business environment. Additionally, the DV method does not distinguish between good and bad data: all the data is stored at all times, no matter which source it comes from and whether or not it conforms to business rules, so loss of information is avoided. All changes to the model are implemented solely as independent extensions of the existing model, meaning that the changes do not affect existing applications, and all versions of applications can be based on the same evolving database. All versions of the schema are subsets of the DV schema.

The three basic concepts of the DV model are hub, link and satellite, as can be seen in Fig. 1. Hubs represent identities or business keys. Links represent relationships between two or more hubs, i.e. representations of M:M relationships from a relational model. One


can distinguish so-called “same-as” links, which help define semantically identical data structures, from hierarchical links, which help define a hierarchical structure between business keys (hubs). Satellites represent historical descriptive attributes, i.e. all non-key attributes of the hub or link. Additionally, there is a fourth concept, the reference table, which represents reference data (codes, internal or external standardized data, business classifications or categorizations, etc.) and can be related to the other three concepts in the model.

Fig. 1. Entities and relationships between entities in the Data Vault model
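
The three basic concepts and the insert-only handling of history described in this section can be illustrated with a minimal Python sketch. It is not taken from the paper: the class names (`Hub`, `Link`, `SatelliteRecord`, `Vault`), the `load_date` and `record_source` fields, and the sample customer data are assumptions chosen only to show how descriptive attributes are versioned by appending records rather than updating them.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Hub:
    """An identity / business key, e.g. a customer number."""
    name: str
    business_key: str


@dataclass(frozen=True)
class Link:
    """A relationship between two or more hubs."""
    name: str
    hubs: tuple

    def __post_init__(self):
        if len(self.hubs) < 2:
            raise ValueError("a link must connect at least two hubs")


@dataclass(frozen=True)
class SatelliteRecord:
    """One historical version of the descriptive attributes of a hub or link."""
    parent: object        # the Hub or Link being described
    attributes: tuple     # (name, value) pairs, sorted by name
    load_date: datetime   # when the record was loaded (assumed metadata field)
    record_source: str    # which source system it came from (assumed field)


@dataclass
class Vault:
    """Insert-only store: records are appended, never updated or deleted."""
    records: list = field(default_factory=list)

    def load(self, parent, attributes, load_date, record_source):
        record = SatelliteRecord(parent, tuple(sorted(attributes.items())),
                                 load_date, record_source)
        self.records.append(record)
        return record

    def history(self, parent):
        """All versions stored for a parent, oldest first."""
        rows = [r for r in self.records if r.parent == parent]
        return sorted(rows, key=lambda r: r.load_date)

    def current(self, parent):
        """The most recently loaded attribute values, or None."""
        rows = self.history(parent)
        return dict(rows[-1].attributes) if rows else None


# Hypothetical sample data.
customer = Hub("Customer", "C-001")
order = Hub("Order", "O-042")
placed = Link("CustomerOrder", (customer, order))

vault = Vault()
vault.load(customer, {"city": "Rijeka"}, datetime(2018, 1, 1), "crm")
vault.load(customer, {"city": "Zagreb"}, datetime(2018, 6, 1), "erp")

print(vault.current(customer))       # {'city': 'Zagreb'}
print(len(vault.history(customer)))  # 2
```

Both versions of the customer's attributes remain available through `history`, which is the sense in which no information is lost.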

4 Formal Definition of the Data Vault Model

Since the Data Vault model was developed in industry and is primarily used in business practice, it was necessary to give a formal definition of a Data Vault model (and all its basic concepts) for the purpose of further scientific research, as well as for the development and formal validation of our conceptual MDV model. The formal definition is given in Table 1. Principles of set and graph theory were used to develop it: the DV model and its concepts are treated as graphs. The definition is applicable to both the data and the metadata level of DV modeling. The same definition is used for the new system catalog based on a DV model (MDV - Metadata Vault Repository) and for the central/integrated enterprise data warehouse repository (EDV - Enterprise Data Vault), which is also based on a Data Vault model. The set of DV model concepts from [15, 16] is extended with Attributes to allow more detailed tracking and managing of data and schema changes (as well as to differentiate between the data and the schema level). Keeping in mind the formal definition of general DV concepts, our conceptual (and, further on, logical) MDV model can be defined as: “Acyclic, directed multigraph (C, B) with shallowly connected nodes, where C is the finite set of nodes (vertices), and B is the finite set of edges (arcs). A node represents a concept in the DV model (hub, link, satellite, reference, attribute). An edge represents the relationship between two concepts in the DV model.” DV graphs (by extension this includes our EDV and MDV graphs as well) have the following characteristics:


Table 1. Formal definition of a data vault model

Definition 1 (Hub): Hub H is a proper subset of the finite set of nodes C. For each Hub node there is only one path to its Satellite nodes.
Definition 2 (Link): Link L is a proper subset of the finite set of nodes C. Each Link node is strongly associated with Hub nodes: each Link node has at least two edges going toward Hub nodes. For each Link node there is only one path to its Satellite nodes.
Definition 3 (Satellite): Satellite S is a proper subset of the finite set of nodes C. There is only one path from every Satellite node to its Hub or Link node.
Definition 4 (Attribute): Attribute A is a proper subset of the finite set of nodes C. There is only one path from every Attribute node to its Satellite node.
Definition 5 (Reference): Reference R is a proper subset of the finite set of nodes C.
Definition 6 (DV schema): DV schema SHESP is a 5-tuple [H, R, L, S, A], where H is the finite set of Hubs, R the finite set of References, L the finite set of Links, S the finite set of Satellites and A the finite set of Attributes.

• oriented - the edges have a direction associated with them (directed edges, arcs)
• no loops - there are no edges that start and end in the same node
• multigraph - the graph can have multiple (parallel) edges, i.e. two nodes can be associated with more than one edge (the edges form a multiset)
• acyclic - has no directed cycles
• evenly shallow - paths have height one or two (Hub -> Satellite or Hub -> Link -> Satellite)
• nodes are named - there is a finite set of roles U for each node, U ∈ {Hub, Link, Satellite, Attribute, Reference}
• edges are named - there is a finite set of roles U for each edge, U ∈ {HL edge, HS edge, LS edge}
• for each Hub and Link node there is only one path to its Satellite nodes (leaves)
• Link nodes are strongly linked to Hub nodes - there are at least two edges, and the number of paths from Hub nodes to a Link's Satellite node is equal to the number of the Link's edges toward Hubs

Based on the formal definition of the DV model and the DV graph, the conceptual MDV model is shown in Fig. 2. The model was developed using the IDEF1X method [10] and the CA ERwin Data Modeler 9.5 modeling tool [8]. Only hubs and links are shown in the model in Fig. 2; the satellites and attributes are omitted for simplicity of display and better readability (however, each hub and link in the model has its own satellite with its own attributes). The conceptual MDV model is based on the idea that all elements of the model are nodes or edges in the graph. It can be seen from Fig. 2 that the MDV model consists of four parts: data sources (DS_Schema), the EDV central enterprise data vault repository (DV_Schema), data marts/data cubes (SS_Schema) and security and access control


Fig. 2. Conceptual MDV model (simplified conceptual model of a new system catalog)

(User and UserAccess). It can be seen that all four parts are interlinked through the DS_Integration and DV_Integration schema integration concepts, where the mappings between data source schemas and EDV schemas can be monitored, as well as mappings between EDV schemas and Data Mart (DM) schemas. Also, user access history on a DM schema can be monitored. Part of the MDV model that represents data source metadata serves to store general data about: (a) resources and physical data sources (DS and DS_Schema hubs, and their associated links and satellites), (b) source structures such as tables, columns, relationships, and constraints (DS_Node and DS_Operation hubs, and their associated links and satellites), and (c) their mappings into EDV part of the MDV model (DV_NodeMap and DS_Integration links, with the associated satellites). The scope of the research presented in this paper covers only relational data sources. Part of the MDV model representing the EDV central data vault repository metadata stores information about the structure of the EDV repository, that is, information about the hubs and links and their associated satellites and attributes. Here information is found about: (a) mappings between relational data source structures and EDV structures (DV_NodeMap and DS_Integration, with the corresponding satellites), (b) the structure of the EDV repository (DV, DV_Schema, DV_Node, and their associated satellites), and (c) mappings between EDV structures and DM structures (DV_Operation, DV_Integration and SS_NodeMap, with their associated satellites). All these mappings are associated with data about business rules and transformations used (Rule and Transformation hubs with their associated satellites). It can be noted that there are no reference tables in the conceptual MDV model. This is because the reference data is pre-consolidated and internal business rules or external


standardization has already been applied to the data, so they are stored at the data level, i.e. in the EDV central data vault repository. Later on, in the logical and physical MDV model, the REFERENCE hub in the EDV will store their metadata. The part of the MDV model that represents the Data Mart (DM) metadata stores data on facts and dimensions and their associated attributes (measures and dimension elements), as well as dimensional levels, dimensional hierarchies and data cubes (SS_Schema and SS_Node with their associated satellites). The mappings of the EDV schema to the DM schema are also stored and monitored (DV_Integration and SS_NodeMap with their associated satellites). The part of the MDV model that represents user access rights metadata includes all user roles and privileges, their authentication, and the complete history of user access to the DM schema (objects within the DM), in order to better control and monitor access and data management (hub User and link UserAccess with their associated satellites). It should be emphasized that, because the DV modeling method is used, the MDV model is temporal, extremely flexible and expandable, and can be constantly enriched with new and useful metadata types and structures [11–16].
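
The graph constraints from the formal definition in this section (Table 1 and the list of DV graph characteristics) can be checked mechanically. The following Python sketch is one possible reading, not the authors' implementation. It assumes that edges are oriented Hub -> Link -> Satellite -> Attribute, which is the orientation suggested by the "evenly shallow" property, and it adds SA and RA edge roles that appear in Table 2 but not in the list of named edge roles. The function name `validate` and the satellite name `DS_Integration_Sat` are hypothetical, and linking `DS_Schema` and `DV_Schema` through `DS_Integration` is an illustrative guess at the structure in Fig. 2.

```python
from collections import Counter

NODE_ROLES = {"Hub", "Link", "Satellite", "Attribute", "Reference"}

# Allowed (source role, target role) pairs. HL, HS and LS are the edge roles
# named in the text; SA and RA are assumed from the edges used in Table 2.
# The orientation Hub -> Link -> Satellite -> Attribute is also an assumption.
EDGE_ROLES = {
    ("Hub", "Link"): "HL",
    ("Hub", "Satellite"): "HS",
    ("Link", "Satellite"): "LS",
    ("Satellite", "Attribute"): "SA",
    ("Reference", "Attribute"): "RA",
}


def validate(nodes, edges):
    """Return a list of violated constraints; an empty list means valid.

    nodes: dict mapping node name -> role
    edges: list of (source, target) name pairs; repeats are allowed
           because the graph is a multigraph
    """
    problems = []
    for name, role in nodes.items():
        if role not in NODE_ROLES:
            problems.append(f"{name}: unknown role {role!r}")

    for src, dst in edges:
        if src not in nodes or dst not in nodes:
            problems.append(f"{src}->{dst}: edge uses an unknown node")
        elif src == dst:
            problems.append(f"{src}: loops are not allowed")
        elif (nodes[src], nodes[dst]) not in EDGE_ROLES:
            # Every allowed edge leads to a strictly "lower" role, so passing
            # this check also rules out directed cycles.
            problems.append(f"{src}->{dst}: edge role not allowed")

    hubs_of_link = Counter(
        dst for src, dst in edges
        if nodes.get(src) == "Hub" and nodes.get(dst) == "Link")
    parents_of_sat = Counter(
        dst for src, dst in edges if nodes.get(dst) == "Satellite")
    for name, role in nodes.items():
        if role == "Link" and hubs_of_link[name] < 2:
            problems.append(f"{name}: link needs at least two hub edges")
        if role == "Satellite" and parents_of_sat[name] != 1:
            problems.append(f"{name}: satellite needs exactly one parent")
    return problems


# A small fragment loosely based on the MDV concepts named in the text.
nodes = {
    "DS_Schema": "Hub",
    "DV_Schema": "Hub",
    "DS_Integration": "Link",
    "DS_Integration_Sat": "Satellite",   # hypothetical satellite name
}
edges = [
    ("DS_Schema", "DS_Integration"),
    ("DV_Schema", "DS_Integration"),
    ("DS_Integration", "DS_Integration_Sat"),
]

print(validate(nodes, edges))      # []
print(validate(nodes, edges[1:]))  # the link is left with a single hub edge
```

Because the allowed edge roles only ever point from a hub toward links, satellites and attributes, the acyclic property does not need a separate cycle search in this sketch.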

5 Research Hypothesis

The hypothesis is framed as a question: “Can the conceptual MDV model serve as a basis for further development of a logical MDV model as well as a new DW system catalog which will ultimately support DW schema evolution only by the expansion of the existing schema without loss of information?” Our answer is that it can, given a DW system catalog based on a permanent MDV model: DW schema evolution will then be carried out only by expanding the existing schema and without loss of information. The MDV model is considered permanent if: (a) all data changes are implemented in the model as additions only (no loss of data), and (b) all schema changes are implemented in the model as simple expansions (no loss of schema). The first step was to formally validate the conceptual MDV model and confirm the research hypothesis. The hypothesis is considered confirmed if, based on the developed MDV model and a formal set of fundamental changes over the DW schema, it is shown that schema evolution over the model can be executed using only two basic evolution operations. In order to prove statements (a) and (b), a formal algebra for DW data and schema maintenance is developed and a formal and final set of fundamental changes over the DW schema is defined.

5.1 Formal Algebra for DW Schema Maintenance

The formal algebra for DW schema maintenance includes six operations for DW data and schema evolution:

• AddN – add a new node
• DeleteN – delete an existing node
• ChangeN – change an existing node
• AddE – add a new edge
• DeleteE – delete an existing edge
• ChangeE – change an existing edge
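
To make the six operations concrete, the sketch below models a schema as a pair of sets (C, B) and implements only AddN and AddE as structural changes; deletions and renames are recorded as appended data, and the remaining change operations are composed from the two additive ones, in line with the paper's claim that these two operations suffice. This is a simplified illustration under stated assumptions rather than the authors' algebra: edges are kept in a plain set, so parallel edges are not modelled, and the names `Schema`, `log`, `change_type` and `extends`, as well as the sample node names, are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Schema:
    nodes: frozenset = frozenset()   # C
    edges: frozenset = frozenset()   # B, as (source, target) pairs
    log: tuple = ()                  # appended metadata records, A -> I(A)


def add_n(sh, node):
    """AddN: extend C with a new node."""
    return replace(sh, nodes=sh.nodes | {node})


def add_e(sh, src, dst):
    """AddE: extend B with a new edge between existing nodes."""
    if src not in sh.nodes or dst not in sh.nodes:
        raise ValueError("both endpoints must already be in the schema")
    return replace(sh, edges=sh.edges | {(src, dst)})


def _record(sh, operation, target):
    """Data-level change: (C, B) is untouched, only a record is appended."""
    return replace(sh, log=sh.log + ((operation, target),))


def delete_n(sh, node):
    """DeleteN: the node is marked as deleted in the log, not removed."""
    return _record(sh, "DeleteN", node)


def delete_e(sh, src, dst):
    """DeleteE: the edge is marked as deleted in the log, not removed."""
    return _record(sh, "DeleteE", (src, dst))


def rename_n(sh, node, new_name):
    """ChangeN (rename): recorded as data, structure unchanged."""
    return _record(sh, "ChangeN", (node, new_name))


def change_type(sh, satellite, new_attribute):
    """ChangeN (attribute data type): AddN(A) plus the S -> A edge."""
    return add_e(add_n(sh, new_attribute), satellite, new_attribute)


def change_e(sh, old_src, old_dst, new_src, new_dst):
    """ChangeE: DeleteE of the old edge followed by AddE of the new one."""
    return add_e(delete_e(sh, old_src, old_dst), new_src, new_dst)


def extends(new, old):
    """True if `new` keeps every node and edge of `old`."""
    return old.nodes <= new.nodes and old.edges <= new.edges


# Hypothetical evolution of a tiny schema.
v0 = Schema()
v1 = add_n(add_n(v0, "Customer"), "Customer_Sat")
v2 = add_e(v1, "Customer", "Customer_Sat")
v3 = add_e(add_n(v2, "phone_int"), "Customer_Sat", "phone_int")
v4 = change_type(v3, "Customer_Sat", "phone_text")
v5 = delete_n(v4, "phone_int")

versions = [v0, v1, v2, v3, v4, v5]
print(all(extends(b, a) for a, b in zip(versions, versions[1:])))  # True
print("phone_int" in v5.nodes)                                     # True
```

Every version is a superset of the previous one, so earlier schema versions can always be recovered as subsets of the latest one.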

It will be shown (using set theory principles) that two basic evolution operations (AddN and AddE) are enough to change the DV schema. All other operations over the DV schema are a combination of those two basic ones (shown formally in Table 2). Let Sh(C, B) be a DV schema (with C as nodes and B as edges), A a set of attributes, S a set of satellites, H a set of hubs, L a set of links, R a set of reference tables and I(Ai) a set of instances of attribute Ai, where Ai -> I(Ai). For all operations on a DW schema, the changes are implemented in the metadata repository (system catalog) as data changes (A -> I(A)). This means there are no structural changes in the MDV schema, only data changes (the MDV stores all the data on EDV structural (schema) changes). However, since both the EDV and the MDV schema in our research are based on a Data Vault model and its formal definition, the same operation is valid in both EDV and MDV for every type of schema change. All other DV evolution operations (simple or more complex) are a combination of the two basic ones. For example, the operation of merging two tables uses a combination of AddN/AddE operations to add new, merged tables, and DeleteN/DeleteE operations to invalidate the old tables. To verify this further, a final set of schema changes over the DV schema is defined and their effect on both the central EDV repository schema and the MDV metadata repository schema is determined.

5.2 Schema Changes Over the DV Schema

In the context of change, we observe changes in data sources, with the scope of our research limited to relational data sources. These changes directly affect the central EDV data vault repository and its schema. From the previously defined set of operations, it is clear that one can distinguish between changes affecting the graph content (instances of nodes or edges) and changes affecting the graph structure (nodes or edges), depending on the level from which one looks at the graph (the data level with instances and values, or the schema level with the basic concepts of the DV model). Schema changes (structural changes over the graph) are the main focus of this research, while data-level changes are simply implemented by adding new instances to the node or edge (i.e. adding new records to a database) and are not considered further. Let us define the set of changes in relational data sources and their effect on the EDV and MDV schema. The final set of changes in the data source schema is defined as the set I = {DR, IR, BR, DA, IAn, IAt, BA, DV, BV, IRV, IVR, IAR, IRA}: Add relation - DR, Rename relation - IR, Delete relation - BR, Add attribute - DA, Change attribute (Rename attribute - IAn, Change attribute data type - IAt), Delete attribute from relation - BA, Add relationship - DV, Delete relationship - BV, Change relation into relationship -


Table 2. Formal proof for operations

AddN
Add a new attribute node: $\mathrm{New}(Sh, \mathrm{AddN}(A)) := (C \cup \{A\},\ B \cup \{\widehat{SA}\})$
Add a new satellite node: $\mathrm{New}(Sh, \mathrm{AddN}(S)) := (C \cup \{S\},\ B \cup \{\widehat{SA_1}, \widehat{SA_2}, \ldots, \widehat{SA_k}\} \cup \{\widehat{HL\,S}\})$, where $HL \in \{H, L\}$
Add a new link node: $\mathrm{New}(Sh, \mathrm{AddN}(L)) := (C \cup \{L\} \cup \{S\} \cup \{A\},\ B \cup \{\widehat{SA_1}, \widehat{SA_2}, \ldots, \widehat{SA_k}\} \cup \{\widehat{LS_1}, \widehat{LS_2}, \ldots, \widehat{LS_k}\} \cup \{\widehat{LH_1}, \widehat{LH_2}, \ldots, \widehat{LH_k}\})$
Add a new hub node: $\mathrm{New}(Sh, \mathrm{AddN}(H)) := (C \cup \{H\} \cup \{L\} \cup \{S\} \cup \{A\},\ B \cup \{\widehat{SA_1}, \widehat{SA_2}, \ldots, \widehat{SA_k}\} \cup \{\widehat{HS_1}, \widehat{HS_2}, \ldots, \widehat{HS_k}\} \cup \{\widehat{HL_1}, \widehat{HL_2}, \ldots, \widehat{HL_k}\} \cup \{\widehat{LS_1}, \widehat{LS_2}, \ldots, \widehat{LS_k}\})$, where $\{\widehat{SA_1}, \widehat{SA_2}, \ldots, \widehat{SA_k}\} \in HL$
Add a new reference table node: $\mathrm{New}(Sh, \mathrm{AddN}(R)) := (C \cup \{R\} \cup \{A\},\ B \cup \{\widehat{RA_1}, \widehat{RA_2}, \ldots, \widehat{RA_k}\} \cup \{\widehat{SA_1}, \widehat{SA_2}, \ldots, \widehat{SA_k}\})$

DeleteN
Delete an existing attribute node: $\mathrm{New}(Sh, \mathrm{DeleteN}(A)) := (C, B) := A \to I(A)$ - no change
Delete an existing satellite node: $\mathrm{New}(Sh, \mathrm{DeleteN}(S)) := (C, B) := A \to I(A)$ - no change
Delete an existing link node: $\mathrm{New}(Sh, \mathrm{DeleteN}(L)) := (C, B) := A \to I(A)$ - no change
Delete an existing hub node: $\mathrm{New}(Sh, \mathrm{DeleteN}(H)) := (C, B) := A \to I(A)$ - no change
Delete an existing reference table node: $\mathrm{New}(Sh, \mathrm{DeleteN}(R)) := (C, B) := A \to I(A)$ - no change

ChangeN
Rename an existing node: $\mathrm{New}(Sh, \mathrm{ChangeN}(C)) := (C, B) := A \to I(A)$ - no change
Change attribute data type: $\mathrm{New}(Sh, \mathrm{ChangeN}(A)) := \mathrm{New}(Sh, \mathrm{AddN}(A)) := (C \cup \{A\},\ B \cup \{\widehat{SA}\})$ - no change

AddE
$\mathrm{New}(Sh, \mathrm{AddE}(b)) := (B \cup \{b\})$, where $b \in \{\widehat{SA}, \widehat{SH}, \widehat{SL}, \widehat{HL}\}$

DeleteE
$\mathrm{New}(Sh, \mathrm{DeleteE}(b)) := (C, B) := A \to I(A)$ - no change

ChangeE
$\mathrm{New}(Sh, \mathrm{ChangeE}(b)) := \mathrm{New}(Sh, \mathrm{DeleteE}(b)) := (C, B) := A \to I(A)$ - no change
$\mathrm{New}(Sh, \mathrm{AddE}(b)) := (B \cup \{b\})$, where $b \in \{\widehat{SA}, \widehat{SH}, \widehat{SL}, \widehat{HL}\}$

IRV, Change relationship into relation - IVR, Change attribute into relation - IAR, and Change relation into attribute - IRA. The operation of adding new relational data sources is viewed as the operation of mapping data source schemas (as it is mostly considered in the literature [1–7, 9, 17–19]). The final set of changes when adding new data sources, DI, is a proper subset of the final set I (DI ⊂ I) and includes the same changes as the existing data source schema.


Accordingly, the final set of changes in the EDV schema is defined as M = {DH, IH, BH, DS, ISn, ISt, BS, DL, BL, IHL, ILH, ISH, IHS}: Add hub - DH, Rename hub - IH, Delete hub - BH, Add satellite - DS, Change satellite (Rename satellite - ISn, Change satellite data type - ISt), Delete satellite - BS, Add link - DL, Delete link - BL, Change hub into link - IHL, Change link into hub - ILH, Change satellite into hub - ISH, and Change hub into satellite - IHS. Table 3 shows that the effects of changes from the set M on the MDV schema/graph are equivalent to the effects of changes from the set I on the EDV schema/graph, with the exception of IH, ISn and ISt; however, one can generalize, because only the levels of application (data/schema, i.e. content/structure of the graph) differ. From Table 3 it is further concluded that the final set of changes I in data sources changes the schema of the EDV but has no effect on the MDV schema (data-level changes only). In the EDV, schema changes are implemented solely as an extension of the existing schema/graph (through only two operations, AddN and AddE, which add new nodes and edges with their associated roles). These EDV changes are implemented in the MDV exclusively at the data level (changes to the content/instances of the graph), as additions of new records to the corresponding tables. Structural changes to the MDV schema are treated in the same way as structural changes to the EDV graph (since both are based on a DV model). All other changes (in both the EDV and the MDV schema) can be treated the same way, through only the two operations AddN and AddE.
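
The correspondence between the change sets I and M can be written down as a lookup table. The sketch below encodes the rows of Table 3 as a Python dictionary and checks the property that the section relies on: every EDV-level effect is either a data-level change or a schema expansion that needs only AddN and AddE. The codes and pairings are copied from Table 3; the names `CHANGE_MAP`, `EXPANSION`, `DATA_ONLY` and `plan` are hypothetical. The MDV column is omitted because Table 3 lists a data-level effect for every row.

```python
EXPANSION = frozenset({"AddN", "AddE"})   # schema-level effect in the EDV
DATA_ONLY = frozenset()                   # data-level effect only

# data source change -> (EDV change, operations needed in the EDV schema)
CHANGE_MAP = {
    "DR":  ("DH",  EXPANSION),   # add relation -> add hub
    "IR":  ("IH",  DATA_ONLY),   # rename relation -> rename hub
    "BR":  ("BH",  EXPANSION),   # delete relation -> delete hub
    "DA":  ("DS",  EXPANSION),   # add attribute -> add satellite
    "IAn": ("ISn", DATA_ONLY),   # rename attribute -> rename satellite
    "IAt": ("ISt", EXPANSION),   # change data type -> change satellite type
    "BA":  ("BS",  EXPANSION),   # delete attribute -> delete satellite
    "DV":  ("DL",  EXPANSION),   # add relationship -> add link
    "BV":  ("BL",  EXPANSION),   # delete relationship -> delete link
    "IRV": ("IHL", EXPANSION),   # relation into relationship -> hub into link
    "IVR": ("ILH", EXPANSION),   # relationship into relation -> link into hub
    "IAR": ("ISH", EXPANSION),   # attribute into relation -> satellite into hub
    "IRA": ("IHS", EXPANSION),   # relation into attribute -> hub into satellite
}


def plan(source_changes):
    """Translate data source changes into EDV changes and operations."""
    result = []
    for code in source_changes:
        edv_change, operations = CHANGE_MAP[code]
        level = "schema" if operations else "data"
        result.append((code, edv_change, level, sorted(operations)))
    return result


# No change in the table needs anything beyond the two additive operations.
print(all(ops <= EXPANSION for _, ops in CHANGE_MAP.values()))  # True
print(plan(["DR", "IR"]))
# [('DR', 'DH', 'schema', ['AddE', 'AddN']), ('IR', 'IH', 'data', [])]
```

A table of this kind is only a restatement of Table 3; the prototype planned by the authors would still have to generate the concrete nodes and edges for each change.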

6 Conclusion and Future Work

In this paper we presented a formal definition of a Data Vault model and a conceptual data model of a new DW system catalog (Metadata Vault Repository - MDV), which is based on the Data Vault (DV) method for database modeling. This conceptual MDV model serves as an extension of traditional DW system catalog models. Its goal is to track and manage changes in the DW data and metadata, as well as in its schema. With our MDV model as a basis for a new DW system catalog, we expect that DW schema evolution will be carried out only by expanding the existing schema and without loss of information. Besides a permanent and comprehensive conceptual metadata repository model (the MDV model) for tracking DW data and schema changes, the paper introduced a final set of fundamental changes over the DW schema and a formal algebra for DW schema maintenance. We have formally and empirically validated our MDV model, proved our research hypothesis and shown that all changes in the data source schema are implemented in the EDV and MDV as simple schema extensions (exclusively through the addition operations AddN and AddE). Old data and schemas are kept, so there is no loss of information (about data or schema). This makes the model comprehensive and permanent. Regarding our future work, the main goal now is to formalize our logical MDV system catalog model and to develop a physical MDV model for our new system catalog. After the development of all three models, we will build a prototype DW system on a selected business case. This way we will additionally confirm our research


Table 3. Change effects in EDV and MDV schemas/graphs

Data source schema change | Effect in EDV schema | EDV schema change | Effect in MDV schema
Add relation, DR | EDV schema expansion (new nodes and edges) - AddN and AddE | Add hub, DH | Data level: no schema changes – only data changes in the MDV system catalog
Rename relation, IR | Data level: no schema changes – only data changes in the EDV | Rename hub, IH | Data level: no schema changes – only data changes in the MDV system catalog
Delete relation, BR | EDV schema expansion (new satellite validity nodes and edges) - AddN and AddE | Delete hub, BH | Data level: no schema changes – only data changes in the MDV system catalog
Add attribute, DA | EDV schema expansion (new satellite nodes and edges) - AddN and AddE | Add satellite, DS | Data level: no schema changes – only data changes in the MDV system catalog
Rename attribute, IAn | Data level: no schema changes – only data changes in the EDV | Rename satellite, ISn | Data level: no schema changes – only data changes in the MDV system catalog
Change attribute data type, IAt | EDV schema expansion (new satellite validity nodes and edges) - AddN and AddE | Change satellite data type, ISt | Data level: no schema changes – only data changes in the MDV system catalog
Delete attribute, BA | EDV schema expansion (new satellite validity nodes and edges) - AddN and AddE | Delete satellite, BS | Data level: no schema changes – only data changes in the MDV system catalog
Add relationship, DV | EDV schema expansion (new link and satellite nodes and edges) - AddN and AddE | Add link, DL | Data level: no schema changes – only data changes in the MDV system catalog
Delete relationship, BV | EDV schema expansion (new satellite validity nodes and edges) - AddN and AddE | Delete link, BL | Data level: no schema changes – only data changes in the MDV system catalog
Change relation into relationship, IRV | EDV schema expansion (new link and satellite nodes and edges) - AddN and AddE | Change hub into link, IHL | Data level: no schema changes – only data changes in the MDV system catalog
Change relationship into relation, IVR | EDV schema expansion (new hub and satellite nodes and edges) - AddN and AddE | Change link into hub, ILH | Data level: no schema changes – only data changes in the MDV system catalog
Change attribute into relation, IAR | EDV schema expansion (new hub, link and satellite nodes and edges) - AddN and AddE | Change satellite into hub, ISH | Data level: no schema changes – only data changes in the MDV system catalog
Change relation into attribute, IRA | EDV schema expansion (new link and satellite nodes and edges) - AddN and AddE | Change hub into satellite, IHS | Data level: no schema changes – only data changes in the MDV system catalog

hypothesis. We will consider it fully validated (formally and empirically) if queries submitted over different parts of the prototype architecture (data sources, EDV repository, DM repository and MDV system catalog) return the same or a greater set of results: we expect the EDV to return the same or a greater set of results than the data sources, and the same holds when comparing the results of the traditional system catalog with its extension in the form of an MDV repository.

Acknowledgements. This paper is based upon work supported by the University of Rijeka under project no. 13.13.2.2.06, titled “Metode i modeli za dizajn i evoluciju skladišta podataka” (“Methods and models for data warehouse design and evolution”).

References

1. Andany, J., Leonard, M., Palisser, C.: Management of schema evolution in databases. In: Proceedings of the 17th International Conference on Very Large Databases, Barcelona (1991)
2. Banerjee, S., Davis, K.C.: Modeling data warehouse schema evolution over extended hierarchy semantics. In: Journal on Data Semantics XIII. LNCS, vol. 5530, pp. 72–96. Springer, Heidelberg (2009)
3. Bebel, B., Krolinkowski, Z., Wrembel, R.: Formal approach to modeling a multiversion data warehouse. In: Bulletin of the Polish Academy of Sciences, Technical Sciences, vol. 54, no. 1 (2006)
4. Bellahsene, Z.: Schema evolution in data warehouses. Knowl. Inf. Syst. 4(3), 283–304 (2002)
5. Cui, Y., Widom, J.: Practical lineage tracing in data warehouses. In: Proceedings of the 16th International Conference on Data Engineering (ICDE 2000), San Diego, California (2000)
6. Date, C.J., Darwen, H., Lorentzos, N.: Temporal Data & the Relational Model. Morgan Kaufmann Publishers, Burlington (2002)
7. Eder, J., Koncilla, C.: Evolution of dimension data in temporal data warehouses. Technical report (2000)
8. ErWin Data Modeler. http://erwin.com/products/data-modeler. Accessed 17 Nov 2017
9. Golfarelli, M., Lechtenbörger, J., Rizzi, S., Vossen, G.: Schema versioning in data warehouses. In: ER Workshops 2004. LNCS, vol. 3289, pp. 415–428. Springer, Heidelberg (2004)
10. Idef1X. http://www.idef.com/idef1x-data-modeling-method/. Accessed 14 Jan 2018
11. Inmon, W.H., Strauss, D., Neushloss, G.: DW 2.0: The Architecture for the Next Generation of Data Warehousing. Morgan Kaufmann Publishers, Burlington (2008)
12. Inmon, W.H., Linstedt, D.: Data Architecture: A Primer for the Data Scientist: Big Data, Data Warehouse and Data Vault. Morgan Kaufmann, Burlington (2014)
13. Jovanović, V., Bojičić, I.: Conceptual data vault model. In: Proceedings of the Southern Association for Information Systems Conference, Atlanta, USA (2012)
14. Jovanović, V., Bojičić, I., Knowles, C., Pavlic, M.: Persistent staging area models for data warehouses. In: Issues in Information Systems, vol. 13, no. 1, pp. 121–132 (2012)
15. Linstedt, D.: SuperCharge Your Data Warehouse: Invaluable Data Modeling Rules to Implement Your Data Vault. CreateSpace Independent Publishing Platform, USA (2011)
16. Linstedt, D., Olschimke, M.: Building a Scalable Data Warehouse with Data Vault 2.0: Implementation Guide for Microsoft SQL Server 2014. Morgan Kaufmann, Burlington (2015)
17. Malinowski, E., Zimányi, E.: A conceptual model for temporal data warehouses and its transformation to the ER and the object-relational models. Data Knowl. Eng. 64, 101–133 (2008)
18. Quix, C.: Repository support for data warehouse evolution. In: Proceedings of the International Workshop DMDW, Heidelberg, Germany (2004)
19. Rundensteiner, E.A., Koeller, A., Zhang, X.: Maintaining data warehouses over changing information sources. Commun. ACM 43, 57–62 (2000)
20. Subotić, D., Jovanovic, V., Poščić, P.: Data warehouse schema evolution: state of the art. In: 25th Central European Conference on Information and Intelligent Systems CECIIS, Varaždin, Croatia (2014)
21. Yessad, L., Labiod, A.: Comparative study of data warehouses modeling approaches: Inmon, Kimball and Data Vault. In: International Conference on System Reliability and Science (ICSRS), pp. 95–99 (2016)
22. Bojicic, I., Marjanovic, Z., Turajlic, N., Petrovic, M., Vuckovic, M., Jovanovic, V.: Domain/mapping model: a novel data warehouse data model. Int. J. Comput. Commun. Control 12(2), 166–182 (2017)
23. Rocha, L., Vale, F., Cirilo, E., Barbosa, D., Mourão, F.: A framework for migrating relational datasets to NoSQL. Procedia Comput. Sci. 51(1), 2593–2602 (2015)
24. Hanine, M., Bendarag, A., Boutkhoum, O.: Data migration methodology from relational to NoSQL databases. Int. J. Comput. Electr. Autom. Control Inf. Eng. 9, 2369–2373 (2015)
25. Jakšić, D., Jovanović, V., Poščić, P.: Integrating evolving MDM and EDW systems by data vault based system catalog. In: Proceedings of the 40th Jubilee International Convention on Information and Communication Technology, Electronics and Microelectronics MIPRO, Opatija, pp. 1633–1638 (2017)

Towards the Processes Discovery in the Medical Treatment of Mexican-Origin Women Diagnosed with Breast Cancer

Guillermo Molero-Castillo1 (corresponding author), Javier Jasso-Villazul1, Arturo Torres-Vargas2, and Alejandro Velázquez-Mena1

1 Universidad Nacional Autónoma de México, Mexico City, Mexico
mena@fi-b.unam.mx
2 Universidad Autónoma Metropolitana, Unidad Xochimilco, Mexico City, Mexico

Abstract. In recent decades, society has gone from being predominantly analog to digital. This has had a significant impact on the way new analyses are carried out in business, communications, education, health, and safety, among other areas of interest. This technological advance, in turn, represents one of the main challenges for today's organizations, because of the need to extract information and value from stored data, for example, to discover the real behavior of the operating processes registered in information systems, and to define the way in which resources interact with the process, that is, activities, roles, assignment rules, and priorities. This paper presents a proposal for the design of a method for the discovery of processes in the medical treatment of Mexican-origin women diagnosed with breast cancer, a disease caused by the uncontrolled growth of cells. These cells, which the body does not need, could eventually form a cancerous tumor.

Keywords: Process mining · Process discovery · Breast cancer · Medical treatment

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 826–838, 2020. https://doi.org/10.1007/978-3-030-12388-8_56

1 Introduction

Data analysis currently represents a valuable resource for organizations that manage their data [1, 2]. Analyzing these data adequately is one of the current challenges in data science (DS). DS is relevant not only for business and society but also for current research [3]. This is due to the growing collection of data in various fields of application, such as healthcare, public safety, market analysis, commercial practices, industrial processes, scientific discoveries, and public policies, among other topics of interest. Data analysis thus creates value for people and society, and its importance is increasing in public and private organizations [4]. This trend and its potential constitute a major challenge for data-based scientific discovery, which unifies theory, experimentation, and computation. Some examples of data-oriented organizations are [2] Google, Facebook, IBM, and Netflix, among others, along with government initiatives such as those of the United Nations [5], the National Science Foundation of the United States [6], and the European Commission [7]. Other examples of data-driven organizations that have changed the way of doing business are [8] Airbnb, which helps people find private and tourist accommodation; Uber, which connects drivers and users for the use of private vehicles as taxis; and Alibaba Group, an online business-to-business platform.

Currently, an interdisciplinary field of DS is process mining (PM), which aims to discover, verify, and improve real processes by extracting knowledge from the data stored in current information systems [8, 9]. One example is healthcare information systems (HIS), which store evidence of diagnoses and of the treatments performed for a certain pathology, as well as information on the beginning and end of the process, together with other data on the patient and the medical personnel, such as oncologists, surgeons, laboratory technicians, nurses, and other health professionals [10]. These activities are stored for the support, control, and analysis of the operational processes in medical care.

Medical care processes can be classified as medical treatment processes, also known as clinical processes, or as administrative processes [11]. Clinical processes are related to the patient and follow a cycle that begins with a diagnosis and continues with the treatment received [12]. This treatment includes observation, reasoning, and action [10], and depends on medical knowledge, which involves the interpretation of patient information and specific decisions. Administrative processes, on the other hand, support medical treatment in general, for example, the scheduling of medical consultations, the request for examinations, and the delivery of results, among others. In this sense, the basis for PM is a set of events, also known as an event log [10].
Each event is an activity, that is, a step within a process. It is related to a resource, be it a person or a device, that initiates the activity; it has a timestamp; and it is associated with other variables, such as age, diagnosis year, and treatments received, among others. Thus, processes are seen as traces of activities, and PM is useful for identifying problems in organizations based on facts instead of assumptions [8]. For this, it is necessary to relate the data to operational processes, that is, to analyze the processes from beginning to end.

A growing problem in the field of health is therefore the need to exploit data in order to analyze operating processes [13]. Some examples of operational processes are the preparation and execution of surgery and the treatment of patients suffering from cancer. The aim of these analyses is to propose improvements in medical care processes, whether to reduce operating costs, improve the care of the population, seek other types of care in response to population growth, or reduce waiting times, among other aspects that cause dissatisfaction when medical treatments are postponed [10]. Process mining has an important role in this effort.

This paper describes a proposal for the design of a method for the discovery of processes in the medical treatment of Mexican-origin women diagnosed with breast cancer. The intended data source is the clinical logs of the Surveillance, Epidemiology and End Results (SEER) program of the National Cancer Institute of the United States. SEER is responsible for gathering information on diagnosed cancer cases (incidence), on deaths attributed to this disease (mortality), and on the survival of patients with cancer. Its purpose is to understand and show cancer patterns in different populations, as is the case of the Mexican-origin population residing in that country.
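The event structure described in this introduction (an activity, a resource, a timestamp, and further variables per event, grouped into one trace per case) can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `Event`, the function `build_traces`, the case identifiers, and the sample activities are hypothetical and are not taken from the paper or from any real clinical data.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Event:
    """One logged step of a process (all field names are illustrative)."""
    case_id: str          # hypothetical identifier of the patient/case
    activity: str         # the step that was executed
    resource: str         # person or device that initiated the step
    timestamp: datetime   # when the step happened
    attributes: dict = field(default_factory=dict)  # e.g. age, diagnosis year


def build_traces(events):
    """Group events by case and order each case's activities by time."""
    traces = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        traces[event.case_id].append(event.activity)
    return dict(traces)


# Made-up sample events, deliberately given out of order.
EVENTS = [
    Event("P1", "diagnosis", "oncologist", datetime(2015, 3, 2), {"age": 52}),
    Event("P1", "surgery", "surgeon", datetime(2015, 4, 10)),
    Event("P2", "diagnosis", "oncologist", datetime(2015, 5, 1), {"age": 61}),
    Event("P1", "radiation", "technician", datetime(2015, 6, 1)),
    Event("P2", "chemotherapy", "nurse", datetime(2015, 6, 15)),
]
TRACES = build_traces(EVENTS)
```

Each entry of `TRACES` is the ordered sequence of activities for one case, which is the form in which process mining techniques usually consume an event log.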

2 Background

Currently, data are collected at any time and anywhere in a given context, whether about a person, object, organization, or society [1]. This technological advance, in turn, represents one of the main challenges for today's organizations, because it is necessary to extract information and value from the stored data, for example, to discover the real behavior of the operating processes stored in information systems, and to define the way in which resources interact with the process, that is, activities, roles, assignment rules, and priorities.

2.1 Process

The elements of an organization are its employees, structure, policies, culture, and operational processes. The structure of the organization varies depending on several factors, such as the strategy, the size of the organization, its processes, technology, the people involved, and the activities that are carried out [14]. It follows that the structure of the organization is adjusted to its environment and to certain needs, which are marked by the historical moment, the market, and the activities the organization carries out. In this sense, the tendency of organizations to structure themselves around processes, together with the incorporation of information technologies (IT), led to the emergence of business process management, an approach that combines information technologies and processes [8]. It is founded on a set of methods, tools, techniques, and technologies that allow the design, modeling, and execution of business processes in an iterative, closed cycle.

Figure 1 shows an example of a medical care process in terms of events. The flow of activities is represented through a network, that is, a transition graph in which activities (squares) are connected through states (circles). The process begins with the admission of a patient who needs a certain treatment. This admission has a single input (begin) and produces two output tokens, c1 and c2. Once the two tokens are present, three transitions, b, c, and d, are enabled. Through c1 there are two options: a thorough evaluation of the patient (b) or a quick evaluation (c). In parallel, c2 enables the review of the patient's medical history (d). Subsequently, b and c each produce the token c3, whereas d produces the token c4. In this way, the process is synchronized before any surgery (e). After surgery, the token c5 enables three transitions: f, examining possible complications; g, postoperative care; and h, consultations. The process ends with a token in the final node.
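The token flow described for Fig. 1 resembles a Petri net, in which transitions consume tokens from input places and produce tokens in output places. A minimal Python sketch of that replay is given below. It is not the model from [10]: the wiring is read off the prose, treating f, g, and h as alternative steps from c5 to the final place is an assumption, and the names `NET`, `enabled`, `fire`, and `replay` are hypothetical.

```python
from collections import Counter

# transition -> (input places, output places), following the prose for Fig. 1.
# Assumption: f, g and h are alternatives that each move the token c5 to "end".
NET = {
    "a": (["begin"], ["c1", "c2"]),  # admission
    "b": (["c1"], ["c3"]),           # thorough evaluation
    "c": (["c1"], ["c3"]),           # quick evaluation
    "d": (["c2"], ["c4"]),           # review of the medical history
    "e": (["c3", "c4"], ["c5"]),     # surgery, synchronizes both branches
    "f": (["c5"], ["end"]),          # examining possible complications
    "g": (["c5"], ["end"]),          # postoperative care
    "h": (["c5"], ["end"]),          # consultations
}


def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    inputs, _ = NET[transition]
    return all(marking[place] >= n for place, n in Counter(inputs).items())


def fire(marking, transition):
    """Return the marking after firing; raise if the transition is not enabled."""
    if not enabled(marking, transition):
        raise ValueError(f"transition {transition!r} is not enabled")
    inputs, outputs = NET[transition]
    updated = Counter(marking)
    updated.subtract(inputs)
    updated.update(outputs)
    return +updated  # unary plus drops places left with zero tokens


def replay(trace):
    """True if the trace runs from one token in 'begin' to one token in 'end'."""
    marking = Counter({"begin": 1})
    for transition in trace:
        if not enabled(marking, transition):
            return False
        marking = fire(marking, transition)
    return marking == Counter({"end": 1})
```

Under these assumptions, replaying the trace a, b, d, e, g ends with a single token in the final place, whereas a, b, e, f is rejected because surgery (e) also needs the token c4 produced by the history review (d).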
Another example is the purchase of an air ticket over the Internet, where the user interacts with airlines, travel agencies, banks, and intermediaries through Web pages. If the purchase is successful, the customer receives the electronic ticket. Current information systems are thus increasingly intertwined with the operational processes they support, and these systems store multiple events.


Fig. 1. Process of medical attention in terms of events. Adapted from [10].

Today, there are tools that support business process management (BPM). These tools are known as business process management systems (BPMS) and management information systems, such as (a) enterprise resource planning (ERP) systems, which support the basic functional processes of an organization; (b) supply chain management (SCM) systems, which focus on the management of suppliers; and (c) customer relationship management (CRM) systems, which allow a direct relationship with customers. The use of these systems has had a positive impact on the functioning of organizations by allowing the storage of events associated with the execution of processes. However, these events could also be stored in databases, e-mail files, and transactional logs, among others. The volume of data reaching these sources is thus a limiting factor for process analysis, and specialized technologies have been developed to process the data and obtain information of interest in order to support decision-making [15]. One of these technologies is process mining, which is currently an important field linked to organizations and information technologies.

2.2 Process Mining

Process mining is a set of techniques capable of discovering, monitoring, and improving processes by extracting knowledge from data on specific events [8, 9]. By analyzing these events, we could discover bottlenecks, verify compliance, diagnose deviations, and predict the duration of process flows, to mention a few examples. Figure 2 shows a summary of how these analyses are performed. In the first instance, we have the processes that are supported by the information systems. These systems store and coordinate the events that describe the history of the processes. From these events, real processes can be discovered through process mining techniques.

Fig. 2. General scheme of process mining. Adapted from [8].

The main source for process mining is information systems, which currently store data from different operating processes and from workflow management. These systems log multiple events. However, today's organizations have problems extracting value from those data. Therefore, the aim of process mining is to use these data to extract information related to a certain process. It is important to note that each process is described in terms of activities and possibly sub-processes [8]. The order of these activities is modeled by describing dependencies and temporal properties, for example, the way in which resources interact with a certain process, that is, assignment rules, roles, and priorities, among other aspects of interest.

In recent years, process mining has been applied successfully in several domains, for example, banking, insurance, logistics, production, e-government, customer relationship management, monitoring, and diagnostics [10]. Through process mining, the actual behavior of people, machines, and organizations can be related to the previously modeled behavior. This reveals a reality that differs from the perceptions, opinions, and beliefs of those involved in the processes.

2.3 Types of Process Mining

There are three main types of process mining [8]: (a) process discovery, which takes a set of events and produces a process model without using any a priori information; (b) conformance checking, which compares an existing process model with a set of events of the same process and is useful to verify whether reality conforms to the model and vice versa; and (c) process improvement, which extends or improves an existing process model using information from the logged events. On the other hand, several research challenges have been posed in current process mining, such as [8, 15–17]: (a) collecting data from heterogeneous sources; (b) including several types of analysis, such as control flow, data, resources, and time, in process simulation models; (c) automating the redesign of processes for decision-making, that is, suggesting process improvements based on event analysis; (d) improving analysis techniques; (e) choosing the appropriate process mining techniques, whose results can meet the project objectives; (f) improving the presentation and visualization of results, especially for non-expert users; and (g) having better tools and methods for the extraction and analysis of events. Addressing the last challenge is of particular interest in this research, since adequate methods are needed for the discovery of processes in the medical treatment of certain diseases, such as breast cancer.
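To make the first two types concrete, the sketch below discovers a directly-follows graph from a toy event log and then checks whether a new trace conforms to it. The directly-follows graph is chosen here only because it is one of the simplest discovery techniques; it is an assumption of this example, not the method proposed in this paper, and the function names and sample traces are hypothetical.

```python
from collections import Counter


def directly_follows(traces):
    """Discovery: count how often activity x is directly followed by y."""
    dfg = Counter()
    for trace in traces:
        for x, y in zip(trace, trace[1:]):
            dfg[(x, y)] += 1
    return dfg


def conforms(trace, dfg):
    """Conformance: every consecutive pair of the trace is an edge of the model."""
    return all((x, y) in dfg for x, y in zip(trace, trace[1:]))


# Made-up traces, one list of activities per case.
LOG = [
    ["diagnosis", "surgery", "radiation"],
    ["diagnosis", "surgery", "chemotherapy"],
    ["diagnosis", "chemotherapy", "surgery"],
]
MODEL = directly_follows(LOG)
```

With this toy log, the edge from diagnosis to surgery is observed twice, and a trace such as diagnosis, chemotherapy, surgery, radiation conforms to the discovered graph even though it never occurred as a whole. That over-generalization is a known limitation of such a simple model.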

2.4 Breast Cancer

Cancer, in general, is a disease caused by the uncontrolled growth of one or more cells [18]. Cancer begins in the cells, which are the basic units that make up the tissues. In turn, these tissues form the organs of the body [19]. Normally, cells grow and divide to form new cells as the body needs them; that is, when cells age and die, they are replaced by new cells. However, sometimes this orderly process gets out of control: new cells continue to form when the body does not need them, and old cells do not die when they should [20]. These unnecessary cells could form a mass of tissue, which is known as a tumor [21]. Figure 3 shows the general anatomy of the breast, composed of lymph nodes, fatty tissue, lobes, ducts, areola, nipple, and lobules [22].

When cancer cells spread from the breast, they are usually found in the lymph nodes near the breast and could invade any part of the body, often the bones, liver, lungs, and brain. The new tumor has the same type of abnormal cells and the same name as the primary tumor. That is, if breast cancer spreads to the lungs, the cancer cells in the lungs are breast cancer cells, and the disease is called metastatic breast cancer, not lung cancer [23]. In this case, the disease is treated as breast cancer and not as lung cancer. At first, breast cancer does not cause pain. However, it is advisable to see a specialist if there is pain in the breast or any other symptom that does not go away. The most common symptoms of breast cancer, associated with changes in how the breast or nipple looks and feels, are [22]: (a) a change in breast size or shape; (b) a nipple sunk into the breast; (c) a thickening or lump in the breast, near it, or in the armpit; (d) sensitivity in the nipple; (e) scaly, red, or swollen skin on the breast, areola, or nipple, possibly with small dimples resembling orange peel; and (f) secretion or fluid from the nipple, especially if there is blood.


3 Research Problem

Current healthcare organizations seek to improve their operating processes and reduce their operating costs [10]. However, to date, traditional methods for the discovery and improvement of processes, based on work meetings, observations, surveys, interviews, questionnaires, and workflow analysis, are still used to try to understand how the processes work. The results obtained through these methods are generally subjective, because they reflect individual and even idealized perceptions [10, 13], whereas reality is different. In addition, these traditional methods consume time and effort [12], since they involve extensive analysis and careful observation. The problem with traditional methods stems from the discrepancies between the real processes and the way in which people perceive or describe them [12]. The more complex, dynamic, and ad hoc the processes are, the harder it is for people to describe them.

Therefore, there is an increasing need to exploit data in order to analyze operational processes based on facts instead of assumptions [8, 15]. Process mining is particularly useful for this type of analysis, since it can identify how a process is actually executed, as well as deviations, bottlenecks, and inconsistencies in the processes. This reveals a reality that differs from the perceptions, opinions, and beliefs of those involved in the processes. Consequently, the analysis of data collected in healthcare centers can be useful to improve the processes of medical treatment, reduce patient waiting times, reduce operating costs, improve the capacity to meet demand, improve the productivity of resources, and increase the transparency of processes [13]. The analysis of clinical and administrative processes can therefore help to meet these objectives [15].
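As an illustration of how bottlenecks and waiting times can be read from facts rather than perceptions, the following Python sketch computes the mean number of days between consecutive activities in a toy log and picks the slowest transition. The function `mean_waits`, the log layout, and the sample dates are assumptions made for this example only; they do not come from the paper or from any real dataset.

```python
from collections import defaultdict
from datetime import date


def mean_waits(log):
    """Average days between consecutive activities, per (from, to) pair."""
    waits = defaultdict(list)
    for events in log.values():
        ordered = sorted(events, key=lambda e: e[1])  # order by date
        for (a, t1), (b, t2) in zip(ordered, ordered[1:]):
            waits[(a, b)].append((t2 - t1).days)
    return {pair: sum(days) / len(days) for pair, days in waits.items()}


# Made-up cases: case id -> list of (activity, date).
LOG = {
    "P1": [("diagnosis", date(2015, 3, 1)),
           ("surgery", date(2015, 3, 31)),
           ("radiation", date(2015, 4, 20))],
    "P2": [("diagnosis", date(2015, 5, 1)),
           ("surgery", date(2015, 6, 20)),
           ("radiation", date(2015, 7, 4))],
}
WAITS = mean_waits(LOG)
BOTTLENECK = max(WAITS, key=WAITS.get)  # transition with the longest mean wait
```

In this toy log the wait from diagnosis to surgery averages 40 days against 17 days from surgery to radiation, so the first transition is flagged as the bottleneck.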
A current trend is thus the search for new methods for the extraction and analysis of events, in this case related to the medical treatment of patients with breast cancer. Medical treatment is characterized by flexibility and personalized (ad hoc) decision-making, because each patient requires a specific treatment rather than a mechanized process [10]. In addition, medical treatment and its processes have special characteristics due to their dynamism, complexity, and multidisciplinary nature. These characteristics stem from the discovery of new medicines, new treatments, technological development, and the evolution of medical knowledge [12]. Hence, there is a growing interest in scientific and technological research to develop new methods and support tools that identify behaviors and trends of certain diseases, such as cancer. Based on the above, the purpose of this research project is to propose the design of a method for process discovery in the medical treatment of Mexican-origin women diagnosed with breast cancer.

4 Method

Given the objective of proposing a method for the discovery of processes in the medical treatment of Mexican-origin women diagnosed with breast cancer, four stages of work were defined for this investigation: (a) begin, (b) analysis and design, (c) construction, and (d) validation. The research is exploratory, given that process mining is an emerging discipline, and applied, because it relies on specific knowledge. Together, these stages make up the method defined for this research. Figure 4 shows a projection of the stages and activities considered in the scientific method.

Fig. 4. Proposal of the scientific method.

In the begin stage, a theoretical analysis of process mining in the context of health was made, specifically for the medical treatment of breast cancer. For this, the literature was reviewed in detail and the methods currently used in process mining were analyzed. In the second stage, the main variables in the medical treatment of breast cancer were analyzed, and the conceptual design of the proposed process discovery method was carried out. The third stage corresponds to the development of the case study related to the medical treatment of breast cancer in Mexican-origin women, as well as the review and adjustment of the results obtained. In the last stage, left for future work, the results will be evaluated, the hypothesis verified, and the fulfillment of the proposed objectives validated.

4.1 Data Source

The SEER (Surveillance, Epidemiology and End Results) program is responsible for the national cancer registry and is the main authoritative source of information on this disease in the United States. Its purpose is to understand and address cancer in the population of that country. At present, several types of research use cancer data collected by the SEER program. These data are available to researchers, physicians, public health officials, legislators, politicians, research groups, and the general public. Current investigations are oriented to [24]: monitoring cancer trends over time, detecting cancer patterns in different populations, supporting the establishment of priorities in the allocation of resources, guiding the planning and evaluation of cancer control programs, and promoting research activities in the medical and epidemiological areas.
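Before process discovery can be applied, registry-style case records have to be flattened into events. The Python sketch below shows one way this could look. The record layout is entirely hypothetical: the keys `case_id`, `diagnosed`, and `treatments`, the function `to_events`, and the sample values are illustrative assumptions and do not reproduce the actual SEER file format or its variables.

```python
from datetime import date

# Hypothetical record layout, for illustration only (not the SEER format).
RECORDS = [
    {"case_id": "C1", "diagnosed": date(2014, 2, 3),
     "treatments": [("radiation", date(2014, 4, 15)),
                    ("surgery", date(2014, 3, 1))]},
    {"case_id": "C2", "diagnosed": date(2014, 7, 9), "treatments": []},
]


def to_events(record):
    """Flatten one case record into (case_id, activity, date) events."""
    case_id = record["case_id"]
    events = [(case_id, "diagnosis", record["diagnosed"])]
    events += [(case_id, name, when) for name, when in record["treatments"]]
    return sorted(events, key=lambda event: event[2])  # chronological order


EVENT_LOG = [event for record in RECORDS for event in to_events(record)]
```

Each tuple of `EVENT_LOG` carries a case identifier, an activity, and a date, which is the minimum an event log needs for the discovery techniques discussed in Sect. 2.3.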


The SEER database contains demographic data of the patient, the location and morphology of the tumor, the stage of cancer at the time of diagnosis, treatment, and follow-up of the disease, among others. The data are collected and registered through medical establishments, such as hospitals, clinics, and pathology laboratories, which send information about the cases evaluated to their respective state cancer registries. In general, most of the information comes from hospitals, where authorized employees transfer information from patients' medical records to local databases, to be sent later to the central cancer registry [25]. Information on cancer cases and on deaths from this disease is thus crucial to report on cancer trends, establish whether prevention and control efforts are effective, participate in research, and take action when possible increases in the incidence of cancer are reported.

4.2 Medical Treatment

Current research on process mining in medical care, specifically in medical treatment, is a growing field [12, 13], with promising results due to its applicability and potential [10]. For example, studies have been conducted with data on patients with stroke [26], in which the healthcare practices used to treat patients with similar characteristics were analyzed. Another work [27] used process mining to analyze the flow of medical care for patients with gynecological cancer, showing the potential of process mining for understanding medical processes and their variants. A subsequent work [28] analyzed the types of data in healthcare information systems that could be used in process mining. People who work in healthcare need information about patients, such as their medical history, the treatments received, and the results of clinical examinations, such as laboratory tests, x-rays, and biopsies, among others. The administration also needs timely information about the costs and services offered by the organization. All this information must be reliable and up to date to achieve quality medical care [10]. Therefore, process mining in the context of health can be especially useful to understand operational processes as they are executed, in order to identify deviations, analyze bottlenecks, and monitor the behavior of the organization.

5 Related Research

In the current literature, five main methods and procedures used in process mining have been identified: (a) Life Cycle L* [9]; (b) PMPM, Process Mining Project Methodology [17]; (c) PMMF, Process Mining Methodology Framework [29]; (d) PM2, Process Mining Project Methodology [15]; and, with a focus on the context of health, (e) Business Process Analysis (BPA) applied in Health [12]. These procedures have points in common, that is, they are structured in sequential stages related to each other. Table 1 shows the main characteristics of the procedures. Emphasis is placed on the key points that differentiate them, such as their stages, area of application, tools used, and limitations.


Table 1. Methods and procedures used in process mining.

Life Cycle L*
Author: van der Aalst [9]
Stages: Planning and justification; Extraction; Creation of the flow control model; Creation of the integrated model; Operational support
Application: Public and private organizations
Tools: ProM
Limitations: Oriented to discover a unique model; It does not address an iterative analysis

PMPM
Author: van der Heijden [17]
Stages: Scope; Data comprehension; Creation of event logs; Process mining; Evaluation; Deployment
Application: Financial services, Rabobank, Netherlands
Tools: ProM
Limitations: It does not specify how to perform data preparation and how to do process mining

PMMF
Author: De Weerdt et al. [29]
Stages: Data preparation; Data exploration; Analysis; Results
Application: Insurance enterprise
Tools: ProM
Limitations: It does not specify the project planning and how to approach the problem understanding

PM2
Author: van Eck et al. [15]
Stages: Planning; Extraction; Data processing; Mining and analysis; Evaluation; Process improvement and support
Application: IBM
Tools: ProM; Disco
Limitations: It has no information on how to do process mining

BPA applied in Health
Author: Rebuge and Ferreira [12]
Stages: Clustering of sequences; Clusters analysis; Regular behavior analysis; Infrequent behavior analysis; Clusters selection
Application: Public hospital in Portugal
Tools: ProM; Medtrix Process Mining Studio
Limitations: Limited to the use of clustering techniques; Oriented to hospital processes from a business approach

The identified procedures are structured in different stages related to each other. Some, such as Life Cycle L*, PMPM, and PM2, start from the planning and analysis of the scope of the project. This is important because it is necessary to define the project objectives, schedule the activities to develop the project, allocate resources, define the project scope, and define the work team, among other aspects of interest.


BPA applied in Health begins with data preparation, which makes it focus on technical aspects of the project and leave aside the stage of problem analysis and understanding. In addition, each of the procedures includes specific tasks for understanding, selecting, and preparing the event logs, as well as for applying process mining algorithms. The evaluation stage is also important in all cases, because of the need to analyze the results obtained; for example, PMPM, PMMF, and PM2 interpret and evaluate the results based on compliance with the project requirements.

On the other hand, these procedures were taken from theory to practice in public hospitals (BPA applied in Health), financial institutions (PMPM), insurance companies (PMMF), and technology companies (PM2). However, the proposals were tested by their own authors, so their applicability could not be evaluated externally. Another aspect of interest is the use of ProM as a process mining tool, which can be associated with the fact that this tool is one of the pioneers in the analysis of processes from event logs.

Despite their variety, the procedures mentioned do not include specific methods for process discovery in medical treatment, except BPA applied in Health, which analyzes hospital processes from an administrative process approach through clustering techniques. This proposal was applied in a public hospital in Portugal, with the objective of analyzing uncommon hospital processes and behaviors. Its scope is limited to the use of clustering techniques, which restricts its application in other contexts, such as the discovery and analysis of medical treatment processes.

In this sense, given that process mining is an emerging area and that data storage in the health field is growing rapidly, there are important research challenges today in the development of new models, processes, methods, algorithms, and efficient tools for process analysis [15, 17], including the analysis of medical treatment. Making sense of these data is therefore a fundamental challenge, which calls for new solutions to manipulate the data and understand the processes in support of decisions focused on the care and protection of health.

6 Conclusions

Today, more organizations recognize the value of data as a strategic asset and invest in infrastructure, resources, human talent, and equipment to support organizational innovation and create differentiators that increase competitiveness and productivity. In today's society, data are collected about a specific context, time, and place. These data, information, and knowledge are currently considered a strategic asset in organizations. Practically all areas of organizations collect data on their operations, workflows, and other processes of interest. This availability of data has led to a growing interest in defining new methods to extract useful information and acquire knowledge through data analysis.


Process mining can be used not only to understand a process as it is executed but also to identify deviations, analyze bottlenecks, and monitor organizational behavior in certain contexts, for example, for the discovery and analysis of processes related to the medical treatment of certain pathologies, such as breast cancer. The processes of medical treatment are a series of clinical activities aimed at diagnosing, treating, and preventing diseases in order to improve the patient's health. They involve the participation of different actors, known as resources, such as oncologists, surgeons, nurses, technical specialists, and medical personnel in general, who may vary from one organization to another.

In the context of health, information on cancer is important for several reasons: researchers need up-to-date data on cancer to study possible causes of this disease, medical administrators use data on cancer to make decisions regarding the purchase of equipment and the development of prevention programs, and health departments use these data to investigate possible cancer groups and their causes.

As future work, and in order to answer the question of how clinical processes can be discovered in the medical treatment of breast cancer, the analysis of events related to the medical treatment of Mexican-origin women diagnosed with breast cancer will be used as a case study. This type of research is useful to monitor cancer trends over time, show patterns of cancer in different populations, evaluate cancer prevention, and promote research activities in the medical and epidemiological areas.


SPARQλ: SPARQL as a Function

Christian Vogelgesang, Torsten Spieldenner (✉), and René Schubotz

German Research Center for Artificial Intelligence (DFKI), Saarbrücken Graduate School of Computer Science, 66123 Saarbrücken, Germany
{Christian.Vogelgesang, Torsten.Spieldenner, Rene.Schubotz}@dfki.de

Abstract. With more and more applications providing semantic data to improve interoperability, the number of available RDF datasets is constantly increasing. The SPARQL query language is a W3C recommendation that provides query capabilities on such RDF datasets. Yet, as the coverage of RDF datasets with efficient and available SPARQL endpoints is still limited, integration of data from different RDF sources is a bottleneck that mostly has to be handled in RDF-consuming clients. We tackle this bottleneck by introducing SPARQλ, an extension to the SPARQL 1.1 query language. SPARQλ enables dynamic injection of RDF datasets during evaluation of the query, and thereby lifts SPARQL to a tool for writing templates for RDF-producing functions in functional programming style. This is an important step towards reducing the effort of writing SPARQL queries that work on data from various sources. Moreover, SPARQλ can be translated directly to an RDF-described Web service interface, which allows integration of data and re-provisioning of integrated results to be lifted from clients to cloud environments.

Keywords: SPARQL · Data integration · RDF · Functional programming

1 Introduction

The idea of the Semantic Web was to improve existing web applications by using a defined semantic data model called RDF [1]. In the last few years, the Linked Data initiative around Tim Berners-Lee¹,² has gained enough traction that more and more big datasets like DBpedia³ are available as Linked Open Data. The hosting of Linked Data datasets is a well-understood problem [20–22]. However, the wide range of different interface types for Linked Data datasets [24] requires a respective range of experts to access the entirety of the data. Linked Data clients must be able to interact with these interface types. In the past, clients were mostly data consumers that gained knowledge from RDF data to improve local applications. The trend to move more and more functionality to the cloud resulted in prosumers [23] that act not only as clients, but also as servers by providing their own results to other clients. For this reason, prosumers face the same challenges as data providers, and it is getting more complicated to provide flexible, robust and scalable solutions.

One remedy from the world of Web applications is so-called Serverless Computing or Functions-as-a-Service (FaaS) [7,15]. Functions-as-a-Service provides concepts and frameworks for small, task-focused Web services that are usually considered stateless functions on data. To our knowledge, however, there are no comparable solutions for Linked Data processing and RDF data integration. There do exist widely recognized extensions to SPARQL, the standard query language for RDF dataset access and processing. But like SPARQL itself, solutions like SPARQL/Update [5], SPARQL-LD [13,14] or SPARQL Microservices [18,19] provide only limited means of parameterization for dynamic specification of the datasets against which a query is executed.

To this end, we add to the canon of extensions our proposal for SPARQλ, a SPARQL extension that is focused on flexible and dynamic integration of RDF datasets. We remove the static dataset handling and introduce the possibility to declare resource IRIs (Internationalized Resource Identifiers) of SPARQL GRAPHs in CONSTRUCT queries⁴ as parameters. Based on SPARQλ, we then define SPARQλ functions, which provide a functional programming style interface and allow the definition of the dataset by parameter bindings. SPARQλ functions can evaluate either to a new, partially evaluated SPARQλ function, or to RDF data that can be used as input for other RDF-consuming applications. Following the rules of Linked Data, we also provide a self-descriptive, machine-readable interface that can be created automatically by a proposed mapping.

¹ https://www.w3.org/DesignIssues/LinkedData.html
² https://www.w3.org/TR/ldp/
³ https://wiki.dbpedia.org/

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 839–856, 2020. https://doi.org/10.1007/978-3-030-12388-8_57
Our contributions in this paper are:

– SPARQλ, an extended subset of SPARQL for dynamic dataset integration
– SPARQλ functions, a service definition that allows a query to be executed as a service invocation with a functional programming style interface
– A formal approach to generate SPARQλ functions from SPARQλ functions by partial evaluation
– SPARQλ functions as Web services with RDF-based interface description
– An implementation proposal based on FaaS frameworks

We outline the required preliminaries in Sect. 2. In Sect. 3 we define the SPARQλ language, followed by the definition of SPARQλ functions in Sect. 4. We then present a proposal for a Web interface to expose the SPARQλ functions in Sect. 5. Section 6 explains the implementation of our prototype. Finally, we provide an outlook on future work in Sect. 8 and draw a conclusion about our presented work in Sect. 9.

⁴ https://www.w3.org/TR/2013/REC-sparql11-query-20130321/#construct

2 Preliminaries

2.1 RDF Graphs and Datasets

The Resource Description Framework (RDF) is the standard data model for the Semantic Web. We briefly introduce the abstract representation of the formal RDF model [1]. Formally, let I, L and B be pairwise disjoint infinite sets of IRIs (Internationalized Resource Identifiers), literals and blank nodes, respectively. The elements in I ∪ B ∪ L are collectively known as RDF terms. The subset I ∪ L of RDF terms denotes resources in some universe of discourse; we use T = I ∪ L as a short-hand notation where appropriate. Blank nodes in B are treated as simply indicating the existence of a resource, without using an IRI to identify any particular resource. The infinite set of all RDF triples is 𝒯 = (I ∪ B) × I × (T ∪ B). Asserting an RDF triple (s, p, o) says that some resource, denoted by p, establishes a binary relationship between the resources denoted by s and o. An RDF graph G ⊂ 𝒯 is a finite set of RDF triples. We call an RDF graph G ⊂ 𝒯* = I × I × T blank-node-free.

An RDF dataset D is a set {G0, ⟨u1, G1⟩, ..., ⟨un, Gn⟩}, where G0, ..., Gn are RDF graphs and u1, ..., un ∈ I are distinct IRIs. The graph G0 is called the default graph, and G1, ..., Gn are called named graphs with names u1, ..., un, respectively. Note that the graphs in D use pairwise disjoint sets of blank nodes. For a given dataset D and i ∈ I, we define σD[i] = G if ⟨i, G⟩ ∈ D and σD[i] = ∅ otherwise, to select a graph G by its name i from the dataset D. Furthermore, we use the function νD = {ui | ⟨ui, Gi⟩ ∈ D} to list the names of all named graphs in a given dataset D.

2.2 SPARQL Query Language

The SPARQL Query Language (SPARQL) is the W3C recommended query language for RDF datasets. SPARQL queries are built around a graph pattern matching facility, i.e. triple patterns, which form the core of the language. In the following, we briefly present the syntax and semantics of SPARQL graph patterns. We use V to denote the infinite set of variables that is disjoint from I ∪ B ∪ L. The set of variables occurring in a syntax expression E is given by var(E). A tuple from (T ∪ V) × (I ∪ V) × (T ∪ V) is called a triple pattern. Triple pattern components may be bound, i.e. from set T, or unbound, i.e. from set V. Using binary operators AND, UNION, OPT, FILTER, SERVICE and GRAPH, a SPARQL graph pattern expression is defined recursively as follows [9].

(1) A triple pattern t ∈ (T ∪ V) × (I ∪ V) × (T ∪ V) is a graph pattern.
(2) If P1 and P2 are graph patterns, then the expressions (P1 AND P2), (P1 UNION P2), and (P1 OPT P2) are graph patterns.
(3) If P is a graph pattern and a ∈ (I ∪ V), then the expression (a GRAPH P) is a graph pattern.


(4) If P is a graph pattern and a ∈ (I ∪ V), then the expression (a SERVICE P) is a graph pattern.

The semantics of SPARQL graph patterns is defined in terms of an evaluation function $[\![\cdot]\!]^D_G$ which returns a set of mappings for a given SPARQL graph pattern expression, a fixed and active dataset D = {G0, ⟨u1, G1⟩, ..., ⟨un, Gn⟩} and an active graph G within D. A mapping from V to (T ∪ B) is a partial function μ : V → (T ∪ B). The domain dom(μ) of a mapping is the subset of V on which μ is defined. The evaluation $[\![P]\!]^D_G$ of a SPARQL graph pattern P over a dataset D with active graph G is defined as follows.

(1) If P is a triple pattern, then $[\![P]\!]^D_G = \{\mu : \mathrm{var}(P) \to T \mid \mu(P) \in G\}$.
(2) If P = (g GRAPH P1) with g ∈ (I ∪ V), then

\[
[\![P]\!]^D_G =
\begin{cases}
[\![P_1]\!]^D_{\sigma_D[g]} & \text{if } g \in I \\
\bigcup_{u \in I} \left( [\![P_1]\!]^D_{\sigma_D[u]} \bowtie \{[g \to u]\} \right) & \text{if } g \in V
\end{cases}
\]

For the formal evaluation of AND, UNION, OPT, and other operators, we refer to the literature [9]. A CONSTRUCT query q is an expression of the form

q = CONSTRUCT H WHERE { P }

where H ⊂ (T ∪ V) × (I ∪ V) × (T ∪ V) is a finite set of triple patterns, called a template, and P is a SPARQL graph pattern. Next, we define the semantics of a CONSTRUCT query over an input dataset D as

\[
\mathrm{ans}(q, D) = \{\mu(t) \mid \mu \in [\![P]\!]^D_{G_0},\ t \in H\}
\]
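The definitions of this section can be made concrete with a small sketch. The following Python fragment is only an illustration under simplifying assumptions, not part of the formal model: terms are plain strings, variables are strings starting with `?`, named graphs are kept in a dictionary, and only single triple patterns inside a GRAPH pattern are supported. All function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]
Mapping = Dict[str, str]


def is_var(term: str) -> bool:
    # Assumption of this sketch: variables are written "?name".
    return term.startswith("?")


def sigma(named: Dict[str, Graph], iri: str) -> Graph:
    """sigma_D[i]: the graph named iri, or the empty graph if there is none."""
    return named.get(iri, frozenset())


def eval_triple(pattern: Triple, graph: Graph) -> List[Mapping]:
    """All mappings mu over var(pattern) such that mu(pattern) is in graph."""
    solutions = []
    for triple in sorted(graph):  # sorted only to make the order deterministic
        mu: Mapping = {}
        if all((mu.setdefault(p, v) == v) if is_var(p) else (p == v)
               for p, v in zip(pattern, triple)):
            solutions.append(mu)
    return solutions


def eval_graph(name: str, pattern: Triple, named: Dict[str, Graph]) -> List[Mapping]:
    """Evaluate (name GRAPH pattern) following the two cases of rule (2)."""
    if not is_var(name):
        return eval_triple(pattern, sigma(named, name))
    # Graphs that are not in D are empty and contribute no solutions,
    # so iterating over the names in D is enough for this sketch.
    out = []
    for u in sorted(named):
        for mu in eval_triple(pattern, named[u]):
            if mu.get(name, u) == u:  # join with the mapping {name -> u}
                out.append({**mu, name: u})
    return out


def construct(template: List[Triple], solutions: List[Mapping]) -> Set[Triple]:
    """ans(q, D): instantiate each template triple with each solution mapping."""
    result = set()
    for mu in solutions:
        for t in template:
            inst = tuple(mu.get(x, x) for x in t)
            if not any(is_var(x) for x in inst):  # skip partially bound triples
                result.add(inst)
    return result


named = {
    "ex:rates": frozenset({("ex:EUR", "ex:rateToUSD", "1.1")}),
    "ex:offers": frozenset({("ex:offer1", "ex:priceEUR", "10")}),
}
sols = eval_graph("?g", ("?s", "ex:rateToUSD", "?r"), named)
answer = construct([("?s", "ex:usdRate", "?r")], sols)
```

Here `sols` contains one mapping that also binds `?g` to the name of the graph in which the pattern matched, which is exactly the behavior that SPARQλ later changes for graph variables.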

3 SPARQλ

In order to use SPARQL queries as parameterizable functions as motivated in Sect. 1, the evaluation of a SPARQL query must support the assignment of external datasets to undistinguished variables. This is similar to a function in classic programming languages that binds calling parameters to local variables in the function scope. The SPARQL 1.1 query language specification, as outlined in Sect. 2, requires external data to be specified by the dataset clauses FROM and FROM NAMED, with the respective IRIs known beforehand. Commands to load external data exist in the SPARQL/Update⁵ language specification, but they are not available in the standard SPARQL query language. We therefore propose SPARQλ as a SPARQL 1.1 extension. SPARQλ extends the semantics of the GRAPH operator to allow on-demand fetching of specified named graphs, and allows a GRAPH IRI to be expressed by a variable that is bound during evaluation of the query. We restrict the query forms to the CONSTRUCT form because the results of the SELECT and ASK forms are not RDF and are therefore not suitable as input to other RDF-consuming services.

⁵ https://www.w3.org/TR/sparql11-update/

3.1 Extending the GRAPH Operator

In the following, we extend the semantics of the GRAPH operator to allow queries like the one given in Fig. 1. The example computes the price for an offering in US dollars. The IRIs of the currency graph and the offering graph are not explicitly stated, but represented by variables. We refer to variables that represent GRAPH IRIs as graph variables for short. A standard SPARQL 1.1 processor would iterate over all named graphs for each GRAPH keyword, evaluate the triple pattern and take the union of the results.

Fig. 1. A SPARQλ construct query to compute a price specification in US-Dollar

SPARQλ changes this behavior: variables in GRAPH patterns are bound to an IRI during processing, and the respective GRAPH is retrieved from the remote resource that is specified by the IRI. The external GRAPH fetched in this way is subsequently set as the active graph, as in the original SPARQL 1.1 behavior. Loading such a requested GRAPH into the dataset requires calls to external endpoints, which are potentially not reachable. Following the SPARQL 1.1 specification for handling reachability, we include the SILENT keyword for the GRAPH operator to suppress errors.


We thus modify the SPARQL 1.1 grammar as follows:

    GraphGraphPattern ::= 'GRAPH' ( 'SILENT' )? VarOrIri GroupGraphPattern

We moreover modify the evaluation semantics of GraphGraphPattern to:

    D        : a dataset
    D(G)     : D, a dataset with active GRAPH G (the one patterns match against)
    D[i]     : the GRAPH with IRI i in dataset D
    P        : GRAPH pattern
    μ        : a solution mapping
    SilentOp : boolean variable indicating that a failed fetch is suppressed and
               the corresponding GRAPH pattern is evaluated on an empty graph

    Definition: Evaluation of Graph

    if IRI is a GRAPH name in D
        eval(D(G), Graph(IRI, P, SilentOp)) =
            the result is eval(D(D[IRI]), P)
    if IRI is not a GRAPH name in D
        eval(D(G), Graph(IRI, P, SilentOp)) =
            D[IRI] := FetchGraph(IRI, SilentOp)
            the result is eval(D(D[IRI]), P)
    eval(D(G), Graph(var, P, SilentOp)) =
        the result is eval(D(G), Graph(μ(?var), P, SilentOp))

    where:
    FetchGraph(IRI, SilentOp) is the result of executing a GET request to the given IRI.
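The three cases of this definition can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the prototype's code: the dataset is a dictionary from IRI to graph, the HTTP GET is replaced by an injectable `fetcher` callable so that the example runs offline, and `FetchError`, `fetch_graph`, `eval_graph` and `stub_fetcher` are hypothetical names.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


class FetchError(Exception):
    """Raised by a fetcher when the remote graph cannot be retrieved."""


def fetch_graph(iri: str, fetcher: Callable[[str], Graph], silent: bool) -> Graph:
    """FetchGraph(IRI, SilentOp): with SILENT, a failed fetch yields an empty graph."""
    try:
        return fetcher(iri)
    except FetchError:
        if silent:
            return frozenset()
        raise


def eval_graph(dataset: Dict[str, Graph], name: str, mu: Dict[str, str],
               eval_pattern: Callable[[Graph], object],
               fetcher: Callable[[str], Graph], silent: bool = False):
    # Graph(var, P, SilentOp): resolve the graph variable through mu first.
    iri = mu[name] if name.startswith("?") else name
    # IRI is not a GRAPH name in D: fetch it and add it to the dataset.
    if iri not in dataset:
        dataset[iri] = fetch_graph(iri, fetcher, silent)
    # Evaluate the inner pattern with D[IRI] as the active graph.
    return eval_pattern(dataset[iri])


# Stand-in for remote resources; a real fetcher would issue an HTTP GET.
REMOTE = {
    "http://www.currency2currency.org":
        frozenset({("ex:EUR", "ex:rateToUSD", "1.1")}),
}


def stub_fetcher(iri: str) -> Graph:
    if iri not in REMOTE:
        raise FetchError(iri)
    return REMOTE[iri]


dataset: Dict[str, Graph] = {}
found = eval_graph(dataset, "?currencyGraph",
                   {"?currencyGraph": "http://www.currency2currency.org"},
                   len, stub_fetcher)
missing = eval_graph(dataset, "http://example.org/unreachable", {},
                     len, stub_fetcher, silent=True)
```

Passing `len` as the pattern evaluator simply counts the triples of the active graph; any evaluator for the inner GroupGraphPattern could be plugged in instead.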

3.2 Dataset Specification

The usual SPARQL dataset clauses FROM and FROM NAMED only allow for a dataset specification that is fixed before the execution of a query, and that requires the IRIs of the included graphs to be known beforehand. These mechanics are not suitable for dynamic dataset integration as intended for SPARQλ. We therefore omit support of said dataset clauses, and instead exploit the modified semantics of GRAPH as defined in Sect. 3.1. As soon as a GraphGraphPattern is evaluated, the respective data source as specified by a given IRI is dynamically fetched, and the respective named GRAPH is selected as scope for the GroupGraphPattern. We thus combine the specification of named GRAPH datasets, as done by FROM NAMED statements in SPARQL 1.1, and the selection of a named GRAPH as scope for the query based on its IRI, into one single GRAPH statement. We do not make any assumptions about the default GRAPH for a SPARQλ query, and leave the interpretation of query fragments that operate outside of a GraphGraphPattern to the implementation of the processor, as for usual SPARQL 1.1 queries.

The IRI in a GraphGraphPattern can be explicitly stated, or expressed by a variable. This variable may either be bound to an IRI as the result of evaluating another part of the query, or work as a parameter to the query. Parameters may be specified manually when executing the query. Passing parameters to SPARQλ queries is discussed in detail in Sects. 4 and 5.

3.3 Evaluation Strategy

If all graph variables are bound, the query can be executed as a SPARQL 1.1 query. Otherwise, we partially evaluate the query and generate a new query with pre-evaluated solution mappings Ωp for each evaluable subpattern in the query. Ωp is stored in a placeholder object in the generated query and is used in the final evaluation. Formally, we define the partial evaluation as follows: Let evaluable(P) be a boolean function that tests whether a pattern does NOT contain a GraphGraphPattern with an unbound variable as argument and thus can be pre-evaluated. Let getSubPatterns(P) be a function that creates a list of all GroupGraphPatterns in the given pattern, and let SolutionPlaceholder(P) be a wrapper that holds a pre-evaluated Ωp that contains valid solution mappings. We can then perform a partial evaluation by:

    partialEval(P):
        if evaluable(P):
            Ωp := eval(P)
            replace(P, SolutionPlaceholder(Ωp))
        elif P = ∅:
            return
        else:
            for each subPat in getSubPatterns(P):
                partialEval(subPat)

Furthermore, we define an extension to the SPARQL 1.1 evaluation semantics for the SolutionPlaceholder:

    eval(D(G), SolutionPlaceholder) = multiset of solution mappings
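The partial evaluation procedure can be illustrated on a toy pattern tree. The sketch below is an assumption-laden simplification: `Pattern` is a hypothetical node type whose `graph_var` field is set only for GRAPH patterns with a variable name, the actual pattern evaluation is passed in as a callable, and the function returns the rewritten tree instead of replacing nodes in place.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Union


@dataclass
class SolutionPlaceholder:
    """Wrapper holding pre-evaluated solution mappings (Omega_p in the text)."""
    solutions: list


@dataclass
class Pattern:
    """Hypothetical pattern node; graph_var is set for GRAPH ?var { ... }."""
    name: str
    graph_var: Optional[str] = None
    children: List[Union["Pattern", SolutionPlaceholder]] = field(default_factory=list)


Node = Union[Pattern, SolutionPlaceholder]


def evaluable(p: Node, bound: Set[str]) -> bool:
    """True if p contains no GRAPH pattern with an unbound graph variable."""
    if isinstance(p, SolutionPlaceholder):
        return True
    if p.graph_var is not None and p.graph_var not in bound:
        return False
    return all(evaluable(c, bound) for c in p.children)


def partial_eval(p: Node, bound: Set[str],
                 evaluate: Callable[[Pattern], list]) -> Node:
    """Replace every maximal evaluable subpattern by a SolutionPlaceholder."""
    if isinstance(p, SolutionPlaceholder):
        return p
    if evaluable(p, bound):
        return SolutionPlaceholder(evaluate(p))
    p.children = [partial_eval(c, bound, evaluate) for c in p.children]
    return p


# Nested GRAPH patterns with two graph variables, of which only one is bound.
inner = Pattern("rates", graph_var="?currencyGraph")
outer = Pattern("offer", graph_var="?offering", children=[inner])
result = partial_eval(outer, {"?currencyGraph"},
                      lambda p: [f"solutions of {p.name}"])
```

After the call, the outer pattern is still present because `?offering` is unbound, while the inner pattern has been replaced by a placeholder carrying its pre-evaluated solutions.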

If all references to a referred graph variable are covered by the partial evaluation, the corresponding graph itself can be removed from the dataset. In our example in Fig. 1, we have two graph variables ?offering and ?currencyGraph. If the latter variable is bound, the inner graph pattern can be fully evaluated and the graph can be removed from the dataset. However, if ?offering is bound with an unbound ?currencyGraph, the referred graph has to be added to the dataset of the resulting query. For some queries, there might exist equivalent patterns that allow a higher degree of partial evaluation. However, such optimizations are subject of future research.


4 SPARQλ Functions

With the modified GRAPH semantics that treat variables in a GraphGraphPattern as free variables that are bound to IRIs during query execution (Sect. 3.1), the dataset management strategy that dynamically fetches remote datasets as named graphs when the graph pattern is evaluated (Sect. 3.2), and the evaluation strategy of Sect. 3.3, we can use SPARQλ to write Lambda functions as SPARQλ CONSTRUCT queries. To this end, we provide a formal definition of parameterizable SPARQλ functions, and specify the evaluation behavior of both fully and partially satisfied SPARQλ functions.

4.1 Formal Definition

Let CONSTRUCT H WHERE S be a SPARQλ query with query graph pattern S. We define the following sets of variables and notions for a SPARQλ query graph pattern S according to Sect. 2.2. Note that, from here on, V and B denote the sets defined below and not the sets of all variables and blank nodes from Sect. 2.

– V = {vi | vi ∈ var(S), vi occurs as graph name in a pattern (vi GRAPH P′) of S} ⊆ var(S), the set of all free variables that describe an undistinguished GRAPH IRI ("graph variables").
– B = {(vi, bi) | vi ∈ V, bi an RDF term (Sect. 2.1)}, the set of all bound graph variables in the SPARQλ query, for which μ(vi) = bi holds.
– We define func(P) → G as a SPARQλ function that takes a list of parameters P = {(vi, pi) | vi ∈ V, pi ∈ I} and performs a mapping vi → pi before the query is evaluated. func returns an RDF graph G by computing μ on the set of remaining free variables after parameter matching.
– We call the SPARQλ function fully distinguished if, after evaluation of the query, ∀vi ∈ V : ∃(vi, bi) ∈ B holds; otherwise we call it partially distinguished.

In the following, we motivate an understanding of a SPARQλ query execution as the evaluation of a Lambda function λV.S(P) based on the notions above. The SPARQλ query graph pattern S takes the role of encoding a function func: The binding vi → pi of variables vi ∈ V to parameters in a function λV.S(P) is equivalent to binding the respective free variables to the provided parameters pi ∈ I, and subsequently retrieving a named graph as specified by pi using the strategy for dataset management specified in Sect. 3.2. Binding GRAPH variables to actual IRIs for a SPARQλ query graph pattern S is then done by rewriting S to include BIND commands for each parameter pair (vi, pi) ∈ P:

    S    : the SPARQL pattern
    V    : a set of unbound graph variables in S
    P    : a set of input parameters (vi, pi)
    BIND : the SPARQL 1.1 BIND statement

    Definition: bind(V, P), the injection of IRI bindings into the SPARQλ query:

    bind(V, P) :=
        foreach vi in V
            S = BIND(vi, P[vi]) S
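As a rough illustration of this rewriting step, the following Python sketch prefixes a group graph pattern, given as a string, with one SPARQL 1.1 `BIND(<iri> AS ?var)` clause per supplied parameter. It is a simplification under stated assumptions: the pattern is treated as opaque text, IRIs are assumed to be well-formed (no escaping or validation is performed), and the function name and the example pattern are hypothetical.

```python
from typing import Iterable, Mapping


def bind(pattern: str, graph_vars: Iterable[str], params: Mapping[str, str]) -> str:
    """Prefix the pattern with a BIND clause for every supplied graph variable."""
    # Graph variables without a parameter stay unbound (partial application).
    clauses = [f"BIND(<{params[v]}> AS ?{v})" for v in graph_vars if v in params]
    # BIND must precede the first use of the variable in the group, so the
    # clauses are placed at the start of the rewritten group graph pattern.
    return "{ " + " ".join(clauses + [pattern]) + " }"


rewritten = bind(
    "GRAPH ?currencyGraph { ?c ex:rateToUSD ?r }",
    ["offering", "currencyGraph"],
    {"currencyGraph": "http://www.currency2currency.org"},
)
```

For the call above, `rewritten` binds `?currencyGraph` to the supplied IRI and leaves `?offering` untouched, which corresponds to a partially distinguished function.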

We thus bind each of the unbound GRAPH IRIs to an IRI that is specified by a set of input parameters when calling the SPARQλ function. Following this, the invocation of the SPARQλ function with all graph variables bound, which is equivalent to |B| = |V|, allows the query to be evaluated as a SPARQL 1.1 query as described in Sect. 3.3.

4.2 Partially Evaluated Functions

After binding parameters to graph variables, and evaluating the distinguished part Sd of S with evaluable(Sd) = true (cf. Sect. 3.3), some of the graph variables in S may still be unbound, i.e. |B| < |V|. In this case, we consider the result of the function invocation as a partial function application which results in a SPARQλ function with the unbound graph variable list Vu with:

Vu = {vi | vi ∈ V, (vi, bi) ∉ B}

Vu contains all graph variables that could not yet be bound by evaluating Sd during execution of the SPARQλ function. A partial evaluation of a function thus results in a new function with a reduced set of unbound graph variables Vu compared to the original variable set V. For a fully distinguished set of graph variable bindings B, the new function computes the same results as the original SPARQλ function would on the same set B. This follows the so-called Futamura concept as presented by Jones et al. [16]. We employ the evaluation strategy for SPARQλ queries as defined in Sect. 3.3 to perform partial evaluation in the following fashion:

➀ A SPARQλ query is executed with a parameter set P by which only a subset of graph variables is bound: |P| < |V|.
➁ The SPARQλ query graph pattern S is divided into its fully distinguished part Sd with evaluable(Sd) = true, and its (partially) undistinguished part Su with evaluable(Su) = false that contains Vu, the remaining undistinguished variables after evaluation.
➂ Sd is evaluated as a SPARQL 1.1 query fragment (cf. Sect. 3.3). The result is an updated dataset D′ that contains the already retrieved named graphs, possibly updates to the graphs from the evaluation of satisfied subqueries, and a set of solution mappings Ωp as the result of the executed subqueries: $\bigcup_{p_i \in P} [\![S_d]\!]^{D}_{\sigma[p_i]} \Rightarrow (D', \Omega_p)$.
➃ The resulting new SPARQλ function then consists of the undistinguished query pattern Su, the set of remaining unbound variables Vu, and a dataset D′ equivalent to the dataset of the original SPARQλ function after evaluation of the distinguished query fragment: λV.S → λVu.Su, with dataset D′. Evaluation of λVu.Su then has to take into account the pre-evaluated variable mappings Ωp as given in ➂.
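The signature-level behavior of steps ➀ to ➃ resembles partial function application, which the following Python sketch tries to capture. It models only the bookkeeping of bound and unbound graph variables and deliberately leaves out the pre-evaluation of Sd and the dataset D′; the class `SparqlLambda` and the `evaluate` callback are hypothetical names introduced for this illustration.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional


class SparqlLambda:
    """Toy model of a SPARQλ function with graph variables as parameters."""

    def __init__(self, query: str, graph_vars: Iterable[str],
                 evaluate: Callable[[str, Dict[str, str]], object],
                 bound: Optional[Dict[str, str]] = None):
        self.query = query
        self.graph_vars = frozenset(graph_vars)
        self.evaluate = evaluate
        self.bound = dict(bound or {})

    @property
    def unbound(self) -> FrozenSet[str]:
        """V_u: the graph variables that have no binding yet."""
        return self.graph_vars - set(self.bound)

    def __call__(self, **params: str):
        unknown = set(params) - self.unbound
        if unknown:
            raise KeyError(f"not an unbound graph variable: {sorted(unknown)}")
        bound = {**self.bound, **params}
        if self.graph_vars - set(bound):
            # Partially distinguished: return a new function over V_u.
            return SparqlLambda(self.query, self.graph_vars, self.evaluate, bound)
        # Fully distinguished: evaluate like an ordinary query.
        return self.evaluate(self.query, bound)


def run(query: str, bound: Dict[str, str]):
    # Stand-in for query evaluation; it just reports the bindings it received.
    return sorted(bound.items())


f = SparqlLambda("CONSTRUCT ...", {"offering", "currencyGraph"}, run)
g = f(currencyGraph="http://www.currency2currency.org")
staged = g(offering="http://example.org/offer")
direct = f(offering="http://example.org/offer",
           currencyGraph="http://www.currency2currency.org")
```

In this toy model, `staged` and `direct` are equal, mirroring the requirement that the partially evaluated function computes the same result as the original function once all graph variables are bound.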

5 SPARQλ Functions as a Web Service

Based on the definition of the SPARQλ language and SPARQλ functions, we now define the interface of SPARQλ function services and function invocations.

5.1 Interface Design and Invocation

The SPARQL 1.1 Protocol [4] defines an interface for query invocations on SPARQL 1.1 endpoints. In the following, we adapt the protocol to support the provisioning of graph IRI parameters and thereby allow invocation of SPARQλ endpoints via HTTP requests. As SPARQλ is limited to CONSTRUCT clauses, we build on top of the SPARQL query interface definition. The function invocation is triggered by an HTTP request to the invocation endpoint. We support three different types of requests (see Table 1): a GET request with all function parameters specified as query parameters in the IRI, a POST request with a www-form-urlencoded body, and finally a POST request that defines one additional named graph in the dataset by sending a serialized RDF graph in the request body. The name of the graph is specified by the post-graph-name query parameter. Binding this graph to a function parameter avoids dynamic fetching from the given IRI.

Table 1. Defined HTTP request types to a SPARQλ function service endpoint, expected parameters and parameter types

HTTP method | Query string parameters | Request content type | Request message body
GET | parameter-name (#unbound variables) | None | None
POST | None | application/x-www-form-urlencoded | URL-encoded, &-separated query parameters
POST | parameter-name (#unbound variables), post-graph-name (exactly 1) | RDF serialization format | RDF graph

If fetching of a graph fails and GRAPH is not used in conjunction with the SILENT keyword, the invocation endpoint has to supply an error message with the affected parameter name and the bound IRI. If the error is suppressed, the result of the fetching process is assumed to be an empty graph. Other errors are handled as defined in Sect. 2.1 of the SPARQL Protocol.⁶ The different types of requests are interchangeable. In the example of currency conversion (Fig. 1), the following two calls result in the same generated partially evaluated function:

Variant 1:
    HTTP GET: http://example.org/function?currencyGraph=http://www.currency2currency.org
    REQUEST BODY: (empty)
    RETURN BODY: http://example.org/newfunction

Variant 2:
    HTTP POST: http://example.org/function
    REQUEST BODY: currencyGraph=http://www.currency2currency.org
    RETURN BODY: http://example.org/newfunction
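For readers who want to try the three request types of Table 1, the following Python sketch builds them with the standard library only; the requests are constructed but not sent, since http://example.org/function is just the placeholder endpoint from the example above. The third request reflects one possible reading of the protocol, namely that the posted graph is bound to a parameter by passing the same name as parameter value and as post-graph-name; the graph name `urn:example:posted`, the Turtle payload and the `text/turtle` content type are assumptions of this sketch.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/function"
params = {"currencyGraph": "http://www.currency2currency.org"}

# Variant 1: GET with the graph variable bindings in the query string.
# urlencode percent-encodes the IRI value so that it is safe inside a URL.
get_request = Request(f"{ENDPOINT}?{urlencode(params)}", method="GET")

# Variant 2: POST with the same bindings as a form-encoded body.
post_request = Request(
    ENDPOINT,
    data=urlencode(params).encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)

# Variant 3 (assumed usage): POST a serialized graph and name it through
# post-graph-name, so that the bound parameter does not need to be fetched.
turtle = "<http://example.org/EUR> <http://example.org/rateToUSD> 1.1 ."
posted = {"currencyGraph": "urn:example:posted",
          "post-graph-name": "urn:example:posted"}
rdf_request = Request(
    f"{ENDPOINT}?{urlencode(posted)}",
    data=turtle.encode("utf-8"),
    headers={"Content-Type": "text/turtle"},
    method="POST",
)

# Sending a request would look like urllib.request.urlopen(get_request).
```

Variants 1 and 2 carry the same parameter binding and differ only in where it is transported, which is why they are interchangeable from the point of view of the function service.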

5.2 Describing SPARQλ Functions

As described in Sect. 1, SPARQλ functions are meant to be used in Linked Data environments. For this reason, the functions have to provide a self-descriptive interface that allows the exploration of the function signature, the functionality and the available endpoints. A SPARQλ function has to provide multiple endpoints. One endpoint has to provide the description itself with links to all other relevant endpoints, which are those for the function invocation, for the original SPARQλ query and, optionally, for a constraint shape of the result graph. Figure 2 shows an example function description based on our example query in Fig. 1. We have three different categories of function attributes. The first one links other relevant endpoints of the function: lambda:sparqlSource gives access to the source code of the original SPARQλ query, and the function invocation endpoint is given by sd:endpoint. The second category comprises attributes that are related to the execution of the function. Our proposal contains sd:EntailmentRegime to specify the entailment regime used in basic graph pattern matching [4].

⁶ https://www.w3.org/TR/sparql11-protocol/#query-operation


Fig. 2. Example of a self-describing SPARQλ function


The last set of attributes defines the signature of the function. Free graph variables of the SPARQλ query are exposed as lambda:unboundParameter of the function. Variables that are already bound by partial evaluation are described by lambda:boundParameter. The parameters are defined by the name of the variable using dct:identifier and, if applicable, by the graph IRI they are bound to using lambda:boundTo. The serialization format of the result graph is specified by sd:resultFormat. Additionally, the function may optionally provide a constraint shape in an appropriate RDF constraint language [2,3] with lambda:resultShape. The function-signature-specific attributes can be derived automatically from the SPARQλ query as follows: Let λV.S be a SPARQλ function as defined in Sect. 4, let V ⊂ var(S) and B be the sets of unbound and bound variables, respectively, and let ν be an IRI minting function that generates a valid IRI from a variable name. We then map from variable sets to the semantic parameter description by employing the following mapping rules:

➀ ∀ vi ∈ V:
    ν(vi) rdf:type lambda:UnboundVariable .
    ν(vi) dct:identifier "vi"^^xsd:string .

All unbound variables vi ∈ V expose their identifier, which can be used by the SPARQλ protocol to bind the parameters.

➁ ∀ (vi, bi) ∈ B:
    ν(vi) rdf:type lambda:BoundVariable .
    ν(vi) dct:identifier "vi"^^xsd:string .
    ν(vi) lambda:boundTo bi .

All bound variables expose their identifier and the value they are bound to.
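The two mapping rules lend themselves to a mechanical implementation. The following Python sketch generates Turtle-like statements for the unbound and bound graph variables of a function. It is an illustration under assumptions: the minting function `mint` and its base IRI are hypothetical stand-ins for ν, prefixes are assumed to be declared elsewhere, and identifiers are typed with the standard datatype xsd:string.

```python
from typing import Dict, Iterable, List


def mint(var: str) -> str:
    # Hypothetical IRI minting function (nu); the base IRI is an assumption.
    return f"<http://example.org/function#{var}>"


def describe(unbound: Iterable[str], bound: Dict[str, str]) -> List[str]:
    """Apply mapping rule 1 to unbound and rule 2 to bound graph variables."""
    lines: List[str] = []
    for v in sorted(unbound):
        lines.append(f"{mint(v)} rdf:type lambda:UnboundVariable .")
        lines.append(f'{mint(v)} dct:identifier "{v}"^^xsd:string .')
    for v in sorted(bound):
        lines.append(f"{mint(v)} rdf:type lambda:BoundVariable .")
        lines.append(f'{mint(v)} dct:identifier "{v}"^^xsd:string .')
        lines.append(f"{mint(v)} lambda:boundTo <{bound[v]}> .")
    return lines


description = describe(
    ["offering"],
    {"currencyGraph": "http://www.currency2currency.org"},
)
```

For the currency conversion example after binding `?currencyGraph`, this yields two statements for the still unbound `offering` parameter and three for the bound `currencyGraph` parameter.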

6 Prototype Implementation

We base our prototype on the Apache Jena Framework⁷ and employ the architecture shown in Fig. 3. Each SPARQλ function is provided with its own SPARQλ processor to evaluate the provided SPARQλ query, a triple store containing the local dataset and, as supplementary data, the potentially pre-evaluated solution mappings Ωp. The deployment of SPARQλ functions can be directly mapped to existing Function-as-a-Service frameworks. In our implementation, we used the OpenFaaS framework⁸ due to its wide range of supported languages and its fine-grained

⁷ https://github.com/apache/jena/
⁸ https://github.com/openfaas/faas


C. Vogelgesang et al.

control over the deployment process itself. The FaaS framework is responsible for the creation and management of new functions and their endpoints. We have created an OpenFaaS environment with a Jena triple store, the SPARQλ processor and a framework for creating the function descriptions. Before starting such an environment, we specialize it by injecting the query and, in the case of a partially evaluated function, the pre-evaluated Ωp. SPARQλ queries are executed against the triple store by a modified Apache ARQ⁹ SPARQL processor. The ARQ library provides a rich set of interfaces for customizations. For binding function parameters to graph variables, we use a predefined QuerySolutionMap. The OpExecutor interface enables users to implement custom semantics. We specialized it to implement the new graph semantics with dynamic fetching of datasets. For the implementation of the partial evaluation, we used the OpWalker interface to traverse the query and find evaluable subpatterns as defined in Sect. 3.3. The patterns found are evaluated, the result is serialized, and the patterns are replaced by a new PlaceholderOp storing a GUID of the result. The evaluation of the PlaceholderOp is again implemented using the OpExecutor interface: a lookup by this GUID in a map of all available Ωp is used to inject the pre-evaluated solution bindings. Figure 3 furthermore illustrates the process of the partial evaluation. After the function invocation, the SPARQλ processor evaluates the query and creates a new SPARQλ query, a set of Ωp and an RDF dataset as a result.
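To illustrate the replacement step, the following sketch mimics the traversal on a toy query algebra in Python. It does not use the Jena ARQ API: the Pattern, Join and Placeholder classes, the partially_evaluate function and the evaluability test (a pattern is taken to be evaluable once its graph variable is bound) are simplifying assumptions for this example, and the definition in Sect. 3.3 may differ.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Union


@dataclass(frozen=True)
class Pattern:
    """Leaf pattern; graph_var is None when no graph parameter is needed."""
    name: str
    graph_var: Optional[str] = None


@dataclass(frozen=True)
class Join:
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class Placeholder:
    """Stands in for an already evaluated subpattern, keyed by a GUID."""
    guid: str


Node = Union[Pattern, Join, Placeholder]
Solutions = List[Dict[str, str]]


def is_evaluable(node: Node, bound: Set[str]) -> bool:
    """Assumed criterion: every graph variable in the subtree is bound."""
    if isinstance(node, Placeholder):
        return True
    if isinstance(node, Pattern):
        return node.graph_var is None or node.graph_var in bound
    return is_evaluable(node.left, bound) and is_evaluable(node.right, bound)


def partially_evaluate(node: Node, bound: Set[str],
                       evaluate: Callable[[Node], Solutions],
                       store: Dict[str, Solutions]) -> Node:
    """Replace each maximal evaluable subtree by a Placeholder.

    The pre-evaluated solutions are kept in `store` under the GUID, so a
    later evaluation can inject them with a simple lookup.
    """
    if isinstance(node, Placeholder):
        return node  # already evaluated in an earlier stage
    if is_evaluable(node, bound):
        guid = str(uuid.uuid4())
        store[guid] = evaluate(node)
        return Placeholder(guid)
    if isinstance(node, Join):
        return Join(partially_evaluate(node.left, bound, evaluate, store),
                    partially_evaluate(node.right, bound, evaluate, store))
    return node  # pattern whose graph variable is still unbound
```

Applied to a join of a local pattern and a pattern over an unbound graph variable, only the local side is replaced by a placeholder; once the variable is bound, the whole join collapses into a single placeholder.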

Fig. 3. Architecture of a SPARQλ function service for partial evaluation of SPARQλ queries

With this data, a new function is created and spawned in the FaaS framework. During the initialization process, the function description is created and made available at an endpoint.

⁹ https://github.com/apache/jena/tree/master/jena-arq

SPARQλ: SPARQL as a Function

7 Related Work

SPARQL Endpoints in the Cloud: With the ever-increasing amount of data, and with it the increasing size of data sets, providing efficient and reliable access to data becomes a challenging task [10,20]. Several works tackle the question of how best to distribute large data sets over cloud infrastructure. Leng et al. [17] investigate RDF graph partitioning techniques to distribute the workload of SPARQL query execution over sub-graphs with their own respective SPARQL endpoints. Rietveld et al. [22] provide a novel architecture for Linked Data endpoint federation. Rakhmawati et al. [21] provide a thorough comparison of frameworks for SPARQL endpoint federation. More recently, Abdelaziz et al. claimed in their survey [6] that, because systems in a distributed setup are often highly specialized towards the underlying dataset, special care needs to be applied when updating datasets, a claim that we also see as true for data integration over endpoints.

Data Integration with SPARQL: SPARQL 1.1 Update [5] is an official companion language of SPARQL 1.1, recommended by the W3C, and is intended for specifying and executing updates to RDF graphs in RDF datasets. Specifically, the LOAD keyword allows external datasets to be merged into an RDF dataset, either in the default graph or in a named graph. INSERT and DELETE allow the creation and deletion of triples in a graph. Fafalios et al. present SPARQL-LD [13,14], which generalizes the semantics of the SPARQL 1.1 SERVICE keyword. SPARQL-LD allows RDF datasets to be fetched dynamically from Web resources, also during evaluation of the query, without a named graph having to be declared under the respective URI beforehand. C-SPARQL [8] extends SPARQL 1.1 by a dataset specification that allows the usage of streamed datasets and supports execution on data specified by a time window. However, as in SPARQL 1.1, the dataset handling is based on IRIs that are specified in the query itself [12]. Daga et al. present the cloud platform BASIL [11]. BASIL exposes stored SPARQL queries as HTTP Web APIs, with the goal of simplifying data integration and interoperation for Linked Data and RDF datasets for non-RDF experts. BASIL is based on SPARQL 1.1 and, contrary to SPARQλ services, does not allow for dynamic data set bindings based on input parameters provided with the API call. Michel et al. proposed SPARQL microservices [18,19] to wrap arbitrary JSON-based Web APIs and query them via SPARQL. For this, a SPARQL microservice forwards parameters that are provided with a SPARQL query to a non-RDF Web API, translates the received result of the API call into triple fragments and runs the provided query against the resulting data set. In contrast to our approach, the Web endpoints from which data is retrieved are fixed after deployment of the service.

In conclusion, federation of SPARQL endpoints as discussed in [6,17,21,22] makes the importance of data integration and interoperability apparent. Even though this is a topic in BASIL [11] and SPARQL microservices [18,19], those approaches do not support dynamic data integration from federated RDF Web resources or SPARQL endpoints. SPARQL-LD [13,14] provides semantics for the SERVICE keyword similar to our specification of GRAPH, but it lacks a formal definition that allows the mapping to a lambda-function-like micro-service interface, which was our main goal in this paper. As a matter of choice, we have instead provided this formalism as an extension of the semantics of the GRAPH keyword. We consider it the more fitting choice, as we load new data into our dataset, similar to how named graphs work in SPARQL 1.1, rather than performing a subquery on a remote endpoint. The purpose of SPARQL Update is graph modification and not graph creation. It does not support the CONSTRUCT keyword to create new graphs. The LOAD keyword allows the fetching of graphs at execution time, like SPARQλ, but requires the IRIs to be known beforehand in the query definition.

8 Future Work

We see possible future work on SPARQλ in supporting more types of data sources beyond RDF, comparable to SPARQL microservices. We moreover see promising work in adding to SPARQλ better handling of different RDF dataset interfaces and of large datasets, as already offered by SPARQL-LD. The partial evaluation of SPARQλ functions can be improved by an automatic optimization step that maximizes the evaluable fragment of the query. Multi-staged programming is another concept from functional programming that allows expressions to be evaluated to entirely new functions. This would require that the evaluation of SPARQλ functions can produce entirely new queries, not only partially evaluated functions.

9 Conclusion

We presented SPARQλ, an extension to the SPARQL 1.1 query language that is optimized for reusable, predefined and parameterizable CONSTRUCT queries. Our modifications allow the injection of dataset sources into the query and cover possible reachability problems of external graphs. On top of the query language, we specified a function interface for encapsulating SPARQλ queries as services with a functional-programming-style interface. We have formally defined the functions and their respective evaluation strategies, including partial evaluation as a means to generate new SPARQλ functions with reduced parameter sets. The functions describe themselves using a modified SPARQL service description that can be generated semi-automatically from the given query by the presented mapping. The invocation of the functions follows the notions of the SPARQL protocol, which we extended with means to provide function parameters and from which we removed the dataset handling.


We presented an implementation proposal based on a Function-as-a-Service framework and a customized Jena implementation that allows a flexible and scalable execution of the defined queries.

Acknowledgment. This work is supported by the Federal Ministry of Education and Research of Germany in the project Hybr-iT (Förderkennzeichen 01IS16026A).

References

1. RDF 1.1 Primer. https://www.w3.org/TR/rdf11-primer/
2. Shape Expressions Language 2.0. https://www.w3.org/TR/shex-semantics/
3. Shapes Constraint Language (SHACL). https://www.w3.org/TR/shacl/
4. SPARQL 1.1 Protocol. https://www.w3.org/TR/sparql11-protocol/
5. SPARQL 1.1 Update. https://www.w3.org/TR/sparql11-update/
6. Abdelaziz, I., Harbi, R., Khayyat, Z., Kalnis, P.: A survey and experimental comparison of distributed SPARQL engines for very large RDF data. Proc. VLDB Endow. 10(13), 2049–2060 (2017)
7. Baldini, I., Castro, P., Chang, K., Cheng, P., Fink, S., Ishakian, V., Mitchell, N., Muthusamy, V., Rabbah, R., Slominski, A., et al.: Serverless computing: current trends and open problems. In: Research Advances in Cloud Computing, pp. 1–20. Springer (2017)
8. Barbieri, D.F., Braga, D., Ceri, S., Valle, E.D., Grossniklaus, M.: C-SPARQL: a continuous query language for RDF data streams. Int. J. Semant. Comput. 4(01), 3–25 (2010)
9. Buil-Aranda, C., Arenas, M., Corcho, O., Polleres, A.: Federating queries in SPARQL 1.1: syntax, semantics and evaluation. Web Semant. Sci. Serv. Agents World Wide Web 18(1), 1–17 (2013). Special Section on the Semantic and Social Web
10. Buil-Aranda, C., Hogan, A., Umbrich, J., Vandenbussche, P.Y.: SPARQL web-querying infrastructure: ready for action? In: International Semantic Web Conference, pp. 277–293. Springer (2013)
11. Daga, E., Panziera, L., Pedrinaci, C.: A BASILar approach for building web APIs on top of SPARQL endpoints. In: CEUR Workshop Proceedings, vol. 1359, pp. 22–32 (2015)
12. Dia, A.F., Kazi-Aoul, Z., Boly, A., Chabchoub, Y.: C-SPARQL extension for sampling RDF graphs streams. In: Advances in Knowledge Discovery and Management, pp. 23–40. Springer (2018)
13. Fafalios, P., Tzitzikas, Y.: SPARQL-LD: a SPARQL extension for fetching and querying linked data. In: International Semantic Web Conference (Posters and Demos) (2015)
14. Fafalios, P., Yannakis, T., Tzitzikas, Y.: Querying the web of data with SPARQL-LD. In: International Conference on Theory and Practice of Digital Libraries, pp. 175–187. Springer (2016)
15. Fox, G.C., Ishakian, V., Muthusamy, V., Slominski, A.: Status of serverless computing and function-as-a-service (FaaS) in industry and research. arXiv preprint arXiv:1708.08028 (2017)
16. Jones, N.D., Gomard, C.K., Sestoft, P.: Partial Evaluation and Automatic Program Generation. Prentice-Hall International Series in Computer Science. Prentice-Hall, New York (1993)
17. Leng, Y., Zhikui, C., Zhong, F., Li, X., Hu, Y., Yang, C.: BRGP: a balanced RDF graph partitioning algorithm for cloud storage. Concurr. Comput. Pract. Exp. 29(14), e3896 (2017)
18. Michel, F., Faron-Zucker, C., Gandon, F.: Bridging web APIs and linked data with SPARQL micro-services. In: Extended Semantic Web Conference (ESWC) (2018)
19. Michel, F., Zucker, C.F., Gandon, F.: SPARQL micro-services: lightweight integration of web APIs and linked data. In: LDOW 2018-Linked Data on the Web, pp. 1–10 (2018)
20. Millard, I., Glaser, H., Salvadores, M., Shadbolt, N.: Consuming multiple linked data sources: challenges and experiences (2010)
21. Rakhmawati, N.A., Umbrich, J., Karnstedt, M., Hasnain, A., Hausenblas, M.: A comparison of federation over SPARQL endpoints frameworks. In: International Conference on Knowledge Engineering and the Semantic Web, pp. 132–146. Springer (2013)
22. Rietveld, L., Verborgh, R., Beek, W., Vander Sande, M., Schlobach, S.: Linked data-as-a-service: the semantic web redeployed. In: European Semantic Web Conference, pp. 471–487. Springer (2015)
23. Stadtmüller, S., Speiser, S., Harth, A.: Future challenges for linked APIs. In: SALAD@ ESWC, pp. 20–27 (2013)
24. Verborgh, R., Vander Sande, M., Colpaert, P., Coppens, S., Mannens, E., Van de Walle, R.: Web-scale querying through linked data fragments. In: LDOW. Citeseer (2014)

A Holistic Approach to Requirements Elicitation for Mobile Tourist Recommendation Systems

Andreas Gregoriades1(✉), Maria Pampaka2, and Michael Georgiades3

1 Cyprus University of Technology, Limassol, Cyprus [email protected]
2 The University of Manchester, Manchester, UK [email protected]
3 Primetel PLC, Limassol, Cyprus [email protected]

Abstract. Mobile recommendation systems (MRS) are becoming ever more popular in the tourism industry, due to their potential to declutter the decision-making process of tourists. Despite their proliferation, such systems seem to lack accuracy and relevance to the needs of their users. This paper describes the mobile recommendation problem and explores the relationships between personality, emotion, context and recommendations for tourists. Its aim is to investigate user requirements of prospective mobile recommendation systems for tourists and the influence of personality and emotional state on user needs. To that end, a survey was conducted with tourists in Cyprus at a point of interest to identify their recommendation needs. The collected data have been analyzed, and preliminary results indicate that user requirements differ across contextual factors. This indicates that the contextualization of these applications in accordance with users’ personality and emotional state is essential to realize their full potential.

Keywords: Mobile recommendation systems · User requirements · Personality · Emotion · Context

© Springer Nature Switzerland AG 2020 K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 857–873, 2020. https://doi.org/10.1007/978-3-030-12388-8_58

1 Introduction

Recommender systems (RS) have gained acceptance as software tools and techniques that provide suggestions to users by exploiting the abundance of information available on the web. They use information retrieval and fusion techniques to provide personalized recommendations to users during decision-making and have been extensively used in the tourist domain. Their most important feature is their ability to predict users’ preferences and interests by analyzing their behavior and the behavior of other similar users. Commonly used recommendation techniques include collaborative filtering, content-based, knowledge-based and hybrid techniques [1, 7, 8, 10].

An important aspect of RS that has not been adequately addressed in the existing literature is the relationship between recommendations and the combination of users’ inherent characteristics, such as their emotions, personality and physiology. Some of these have been addressed in isolation and found to be significant for improving recommendations [5, 31, 32]. Contextual information about location, temperature and weather has been analyzed independently and also found to improve recommendation relevance and accuracy [15, 41] through Mobile Recommendation Systems (MRS). This is a growing area of RS that utilizes the ubiquitous and context-aware capabilities of smartphones. The availability of built-in sensors such as GPS, gyroscopes and motion sensors enables MRSs to offer their users recommendations about things to do or buy on the go. Geo-recommendation is indeed redefining the way outdoor shopping is performed, bringing notable opportunities to the tourism industry. Existing MRSs in e-tourism acquire the user’s needs and desires, either explicitly through questions or implicitly by mining the user’s online activity, and suggest destinations to visit, points of interest, tour plans that include transportation, restaurants and accommodation, events/activities or even complete tourist packages. The main objective of tourist MRSs is to ease the information search process for the traveler and to convince the user of the suitability of the recommended services or products.

MRSs, however, have failed to gain wider acceptance, primarily due to their inability to effectively utilize contextual information about their users, such as time, location, emotion and activity, so as to make relevant and valuable recommendations on the go, but also due to their difficulty of use [1]. According to the literature [1, 13], the main issue in RS is the lack of understanding of the relevance of contextual information to the recommendation problem. This is due to a mismatch between user needs and the implemented MRS functionality that aims to support these needs, a well-known issue in the Design Science and Requirements Engineering domains.
Additionally, MRSs mainly use a sterile, rationalistic approach based on either collaborative filtering or content-based filtering that focuses more on the recommendation itself than on how users’ decision making is inherently influenced by human factors such as their personality, emotion or physiology. These constitute contextual information that needs to be incorporated into the recommendation problem.

Designing and implementing RS requires a systematic process that starts with the requirements elicitation phase. Generally, the development of successful software requires a tremendous amount of time spent with the user, who is involved in prototyping, experimentation and providing feedback, so that the developer can understand the problem domain and then identify the requirements of the prospective system. The quality of the software improves when requirements are defined correctly at an early stage of system development, hence their importance. By definition, software requirements are a set of statements describing what the software system must implement, the qualities it must achieve, and the constraints it must satisfy. These requirements are defined at an early stage of system development and reflect user needs [2]. Requirements are classified into four categories: business requirements, user requirements, functional requirements and nonfunctional requirements. Business requirements describe the rationale for an organization implementing the system, while user requirements describe the functionality that users expect the system to have. In this paper we concentrate on user requirements, and on tourist users in particular.

However, requirements engineering methodologies tend to focus on the software rather than the people. While the last couple of decades have seen a move away from this, with advancements such as social modelling [3] and user stories, which focus on why users want something in addition to what they want, the focus is still very much on the behavioral “why”. Work by Miller [4] introduced the concept of emotional goals, which capture the desired feelings of stakeholders in a prospective system. This, however, serves as a means to an end that aims for the desired emotion after using the system. The challenge is to incorporate user emotion and personality as an influencing factor in the design of a system. The design of systems has now moved beyond its traditional goals of efficiency and ease of use, towards systems designed for desirability, seductiveness and persuasion, properties that are linked to inherent user attributes. This is more relevant in applications such as MRS, where personality and emotion have an important effect on the acceptance of recommendations [5].

The goal in MRS is to accurately recognize the user’s context and tailor the response accordingly. Hence, there are two problems to be resolved: the recognition of the user’s state and the recommendation of the best option given the identified state. The latter has been addressed by many authors [1, 8, 11, 17, 54], while the former is still an open problem due to the lack of reliable non-intrusive techniques and equipment for recognizing the user’s physiological state and emotion. According to the literature [5], personality affects user needs and wants and the way users interact with products or services. Personality also influences the way people use software [6].
Therefore, it is imperative to incorporate user personality along with other contextual cues during the early stages of an application development process, such as the requirements elicitation phase. Based on the above, the design of prospective MRS should adopt a holistic approach by incorporating all facets of the recommendation problem. This work partly addresses this goal by examining the disparity between MRS users’ needs and the implemented MRS functionality. It focuses on investigating the need to incorporate contextual information regarding users’ state (inherent and circumstantial) in the recommendation process. It reports on an exploratory study that identifies relationships between tourists’ information needs, decision-making strategies, personality and emotional state and prospective MRS requirements. This aims to highlight inherent user requirements and the way they relate to personality and emotions, which is required to specify functionalities of future MRSs that will satisfy the contextual needs of their users.

The research questions addressed in this paper are the following. What design features (software requirements) do most tourists want to see in a future MRS? What is the relationship of personality with these prospective features? Is there an association between emotional state and type of recommendation?

The paper is organized as follows. The next section presents a review of the literature, starting with RSs and their application in tourism and then focusing on their relationship with decision making, personality and emotion. This is followed by the research methodology, the data collection and the analysis, before the results are presented. The paper concludes with a brief discussion and the main conclusions.


2 Literature Review

2.1 Recommendation Systems

An RS is a computational system that can make meaningful recommendations to potential users and is currently one of the main application areas of machine learning and artificial intelligence in the information technology domain. Commercial applications include online advertising and item recommendation [7] within Netflix, TripAdvisor and Amazon. Recent years have witnessed an explosive increase in the use of mobile technology among tourists [7, 8]. e-Tourism systems [9] provide a good opportunity for mobile services to assist travelers in their decision making by offering recommendations based on their preferences, current location and weather (context). There are different types of RSs depending on the technique used to make recommendations, including collaborative filtering, content-based, knowledge-based and context-based.

Collaborative filtering [10] is a popular method that bases its predictions on the behavior of similar users. The fundamental assumption behind this method is that other users’ opinions can be used to provide a reasonable prediction of the active user’s preferences, as similar users are expected to have similar preferences. Hence, if users agree about the quality or relevance of some items, they are likely to agree about other items [11]. Content-based techniques, on the other hand, recommend articles, products or services that are similar to items previously preferred by a specific user [12, 54]. They are based on the analysis of descriptions (content) of items preferred by a particular user to determine the key attributes (preferences) that can be used to distinguish these items. These preferences are stored in a user profile. The match between user and item is assessed by comparing each item’s attributes with the user profile to identify what to recommend. In contrast to content-based approaches, collaborative filtering does not require human intervention for tagging content, because item knowledge is not required. Recommendations are made using the nearest-neighborhood method, hence users whose rating profiles are most similar to that of the target user are considered relevant. The collaborative filtering technique can be either user-based or item-based. In the former, a user receives recommendations for items liked by similar users, while in the latter a user receives recommendations of items similar to items he/she liked in the past. The drawback of this approach is that it describes users as the average of their friends. These are called memory-based approaches, since they are based on past data. Model-based approaches use a model of the user, created either explicitly or implicitly using experts or machine learning techniques, and utilize user profile data. They can improve the performance of the recommender system by taking into account the preferences of the active user as well as the aggregate of the neighborhood users.

The knowledge-based recommendation technique recommends items to users based on knowledge about the users, the items and their relationships. Usually, such systems use a functional knowledge base describing how an item meets a specific user’s need. This can be achieved based on relationships between user needs and a possible recommendation [13]. Case-based reasoning is a common technique used for knowledge-based recommendation in which items are “cases” and recommendations are generated by retrieving the cases most similar to the user’s query or profile [12].
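As an illustration of the user-based, nearest-neighborhood variant of collaborative filtering, the sketch below predicts a rating as the similarity-weighted average over the k most similar users who rated the item. The cosine similarity over co-rated items, the choice of k and the rating data are assumptions made for the example and are not taken from any of the cited systems.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity restricted to the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(ratings, user, item, k=2):
    """Predict `user`'s rating of `item` from the k most similar raters.

    ratings: dict mapping user -> {item: rating}
    Returns None when no neighbour with positive similarity rated the item.
    """
    neighbours = [
        (cosine(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbours = sorted(neighbours, reverse=True)[:k]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return None
    return sum(sim * rating for sim, rating in neighbours) / total


# Illustrative data: ann rates like bob and unlike eve.
ratings = {
    "ann": {"museum": 5, "beach": 1},
    "bob": {"museum": 5, "beach": 1, "castle": 4},
    "eve": {"museum": 1, "beach": 5, "castle": 2},
}
prediction = predict(ratings, "ann", "castle", k=1)
```

With k=1 the prediction for ann follows bob, her nearest neighbour; with a larger k the dissimilar user eve contributes with a lower weight, which reflects the "average of their friends" behaviour noted above.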


Alternatively, pattern identification using association rules can be applied. If the above are not suitable, domain knowledge can be used to make these associations explicit in the knowledge base. Knowledge-based systems use explicit knowledge about the relationships among an item, user preferences and recommendation criteria, and are applied in situations where collaborative filtering and content-based filtering cannot be used due to data sparsity.

Traditional recommendation systems ignore key contextual information when making suggestions to users. The emerging context-based RS, which notably involve spatiotemporal criteria in the recommendation algorithm, utilize contextual data through location-based services. They are based on the use of location-awareness and “ubiquity”. The former refers to knowledge of the user’s physical position at a particular time, while the latter refers to the ability to deliver information and services to users wherever they are and whenever they need them. In the tourism domain, contextual information is of key importance. Tourist MRS belong to this category and employ Artificial Intelligence techniques to analyze the behavior of their users, learn their preferences and provide proactive recommendations depending on the context [14]. Other tourist MRS focus on suggesting attractions and use automated planners to schedule activities within temporal and geographic constraints [15]. They can provide the opening and closing times of attractions, or the time needed to go from one point of interest to another. This, however, is a very complex planning and scheduling problem that researchers try to solve using optimization techniques, such as ant colony or meta-heuristic iterative methods [16]. In the same vein, automatic clustering algorithms are used to classify tourists with similar preferences or similar features into groups [17].
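The spatiotemporal criteria mentioned above can be illustrated with a minimal sketch that keeps only those points of interest that are open at the given hour and lie within a chosen distance of the user. The Poi record, the 2 km radius, the use of whole hours and the coordinates are hypothetical simplifications; a real MRS would combine such a filter with a ranking or planning model.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Poi:
    name: str
    lat: float
    lon: float
    opens: int   # opening hour, 0-23
    closes: int  # closing hour, 1-24


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two coordinates."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def nearby_open(pois, lat, lon, hour, max_km=2.0):
    """Names of POIs open at `hour` and within max_km, nearest first."""
    hits = [
        (haversine_km(lat, lon, p.lat, p.lon), p)
        for p in pois
        if p.opens <= hour < p.closes
    ]
    return [p.name for d, p in sorted(hits, key=lambda h: h[0]) if d <= max_km]


# Illustrative coordinates only.
pois = [
    Poi("castle", 34.6722, 33.0418, 9, 17),
    Poi("museum", 34.6850, 33.0560, 9, 14),
    Poi("far_away_park", 35.1700, 33.3600, 0, 24),
]
afternoon = nearby_open(pois, 34.675, 33.043, hour=15)
morning = nearby_open(pois, 34.675, 33.043, hour=10)
```

The same user position yields different suggestions in the morning and in the afternoon, which is the kind of context dependence that traditional RS ignore.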
Alternative techniques for tourist recommendation include approximate reasoning methods, such as Bayesian networks. These are used to manage the uncertainty in the relationships between user preferences and available activities [18]. Rule-based systems are used to deduce user preferences from data or experts using machine learning [19].

The most frequently used recommendation techniques in the tourism domain are content-based and collaborative filtering. Content-based recommendation, however, is problematic because it only recommends items closely related to those the user liked in the past. Hence, no new items are recommended; this is referred to as the overspecialization problem [20]. Collaborative filtering, on the other hand, suffers from the new item problem, where new items cannot easily be recommended to other users because they have no ratings [20]. This is referred to as the data sparsity problem. To address these problems, researchers have proposed knowledge-based RSs [13]. Rather than requiring a large amount of data (item ratings from users), such methods require only sufficient knowledge to judge how similar items are to each other. For example, [21] developed an RS that uses the knowledge of domain experts to describe the relations among items and features, together with user-defined recommendation preferences. However, the knowledge-based RS exhibited the cold-start problem and lacked an explicit mechanism to identify the constructs used to describe user preferences and item features.

The cold-start problem describes the case when the system needs to recommend an item to a new user about whom it has no past information, and the case when a new item has few or no ratings. Thus, the problem concerns sparsity and the lack of relevant information about users and items. Techniques to address this problem involve asking users to rate a number of items to gather information about their preferences. In the case of context-based RS, cold-start users are users located in unfamiliar areas with no physical location histories. Social-based collaborative filtering methods represent a solution to the cold-start problem and use the social neighborhood (users similar to the active user) as an indication of the active user’s preferences.

The tourism domain is notably a flourishing application field for MRSs, which offers massive opportunities to provide highly accurate and effective recommendations for tourists based on personal preferences and contextual parameters. However, the recommendation accuracy of current MRS is far from optimal. To improve this, it is essential to delve into the complexities of the tourist decision-making process and into tourists’ inherent properties, such as personality and emotions, to identify how these influence decisions and how MRS can utilize relevant information to improve the accuracy of the recommendations.

2.2 Tourist Decision Making Process

RSs promise to streamline people’s decision-making process by decluttering the decision space. It is thus imperative to examine the theoretical background of decision making when designing artifacts that aim to support it. The main drawback of MRS is the danger of irrelevant recommendations or recommendations made at the wrong time. These distract the user and result in the user rejecting the MRS. Hence, to design effective MRS that serve user needs, it is essential to understand the decision-making process of tourists before and during their holidays, to identify which part of the process could be best served by MRS, and to identify the optimum recommendation time along with the best recommendation for the dynamic needs of the user.

Decision-making research investigates the influence of internal and external factors on the human decision maker. The most dominant decision-making models are the rational model and the bounded rationality model. Much work on tourist decision-making has adopted the rational decision-maker model. In this paradigm, tourists engage with the decision-making problem in a motivationally driven process of searching for an efficient means of satisfying desires and needs in relation to travel [46]. This approach is based on consumer behaviour [47] knowledge that represents the rational decision-making process followed by consumers when deciding what service or product to buy. This is expressed as a directed search for information about available and accessible options to satisfy a desire, and an evaluation of these options against some criteria. In tourism research, the dominant rationalistic approach to decision-making does provide some useful insights across tourism choice and could be applied during the tourist destination choice stage, which occurs before the trip [48]. However, it is less suited to the often relatively unplanned, hedonic, opportunistic and impulsive decision-making that characterises tourists’ behaviours on-site within a destination [47, 48]. It is arguable that rational models of motivation and decision-making underestimate the importance of emotional processes in tourists’ behaviour [56]. There has also been criticism of the rational approach [57], highlighting the insight from psychology that behaviour is an adaptive process based on interaction.

A Holistic Approach to Requirements Elicitation


The bounded rationality model [22] offers a more realistic view, claiming that time constraints, cognitive capacity and incomplete information make individuals settle for a ‘good enough’ (‘satisficing’) solution rather than the optimal one [22]. Related to this, [23] claimed that decisions are made only where an alternative is definitively better than the status quo, and expressed this with the rationality of tourists bounded by constraints including travel stimuli, psycho-social state and environmental conditions. With regard to the temporal aspects of tourist decision making, decisions range from planned and early decisions before the trip to more spontaneous decisions at a destination [51]. Gunn [52] classifies decisions regarding a trip into primary (before the trip), secondary (list of ‘to do’s’ at a destination) and tertiary (dynamically encountered at a destination). RSs can support all three levels of decision making, while MRSs focus more on supporting secondary and tertiary decisions made while at a destination. Generally, tourist decision making models are informed by models of consumer behaviour [24]. The main variables in these models relate to socio-psychological processes, personality and environmental variables. Alternatively, formal approaches to modelling tourist behaviour are based on economic and marketing theories [49]. Such modelling, however, operates on reasonably coarse-grained assumptions about the relevant properties of the tourist and the environment within which tourists express their behaviours. Finer-grained approaches to modelling tourist behaviour take a bottom-up approach and focus on individual tourists’ presumed decision-making strategies. They model tourists’ behaviour using the agent-based paradigm [50], in which tourist decision makers are represented by agents that interact in a simulation environment.
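The contrast drawn above between the rational model and bounded rationality can be made concrete with a small sketch. It is illustrative only: the option names, ratings and aspiration threshold are hypothetical and do not come from the study.

```python
from typing import Callable, Optional, Sequence

def choose_rational(options: Sequence[str], utility: Callable[[str], float]) -> str:
    """Rational model: evaluate every option and return the best one."""
    return max(options, key=utility)

def choose_satisficing(options: Sequence[str],
                       utility: Callable[[str], float],
                       aspiration: float) -> Optional[str]:
    """Bounded rationality: accept the first option that is 'good enough'."""
    for option in options:
        if utility(option) >= aspiration:
            return option
    return None  # nothing encountered so far meets the aspiration level

# Hypothetical on-site options, in the order a tourist encounters them.
ratings = {"cafe": 3.5, "museum": 4.2, "beach bar": 4.8}
encountered = list(ratings)

def utility(name: str) -> float:
    return ratings[name]

best = choose_rational(encountered, utility)
good_enough = choose_satisficing(encountered, utility, aspiration=4.0)
```

Here the rational chooser inspects all three options, whereas the bounded chooser stops at the first option that clears the threshold, which is closer to the spontaneous on-site decisions an MRS is meant to support.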
Based on the above, it is evident that the tourist decision-making process is a subset of tourist behavior, which in turn is a subset of consumer behavior, in which emotions and personality are highly influential. Modelling these effectively in MRS therefore seems imperative in order to address current limitations of MRS and to support spontaneous tourist decisions by utilizing inputs from diverse sources external and internal to the user. However, because it is difficult to elicit inherent emotional and physiological properties of tourists on the fly, it is essential to investigate approaches for eliciting this knowledge. This will make the link between the user requirements of a prospective MRS and personality, emotion and context more explicit.

2.3 Personality in Recommendation

Most approaches to building RSs focus on recommendation precision and ignore how users are inherently influenced by their own emotions, personality and other human factors during decision making. Personality is one of the factors that differentiate individuals and has been found to affect the way users interact with technology [6]. By definition, personality refers to the set of emotional, attitudinal and interpersonal processes that are specific to each individual person [25]. Consequently, personality can be considered one of the most important factors influencing human behaviour, as it can affect how people react, behave and interact with other individuals. There is evidence relating the personality of individuals to their tastes and interests,


A. Gregoriades et al.

for example, affective experience and social behaviour [26]. This implies that individuals with matching personalities might have similar interests, which has direct application in the tourism domain. Several studies aimed at finding the best features to describe someone’s personality. Tupes and Christal [28] were the first authors to identify five features in personality, while a model composed of five features known as the Big Five model was presented in [29]. This model is recognized as a valid mechanism for defining the most essential aspects of personality as expressed in the following 5 dimensions: Agreeableness, Extraversion, Openness to Experience, Conscientiousness and Neuroticism. Additional work [30] presents the development and validation of the 10-Items Personality Inventory Questionnaire that is a short version of the big five model (also used in the current study, see Methodology section). With regards to software systems requirements, personality has also been found to be significant in predicting the required functionality of software systems [27]. Specifically, extraverts and emotionally stable people demonstrate higher intention to use new technology, while ‘open to experience’ people show a positive relationship with system’s ease of use. Extraverts are more likely to use technology, prefer to use applications on their mobile phone, and are also more likely to act based on the opinions of those whom they consider as significant [27]. This is the underlying theory used in collaborative filtering technique of RSs. Despite the importance of personality in the recommendation problem, the literature on application of personality in tourism recommenders has been scarce. There are however some applications of personality in movie recommenders, such as in [5] where the authors added personality scores to a content-based movie RS to generate more personalized recommendations. 
Their comparative analysis revealed that the recommendations were superior when personality was taken into consideration. In the same vein, in [31, 32] the researchers included personality scores based on the Big Five model as complementary information in a traditional rating-based collaborative filtering RS. Their approach was compared to a traditional rating-based filtering system, showing that the system combining ratings and personality significantly outperformed systems using either ratings or personality features alone. Consideration of personality traits in the tourism domain includes work by [33, 34], which employed personality along with places of interest to make tourist recommendations. These results indicate that personality could significantly improve MRS recommendations in the tourist domain.

2.4 Emotion, Physiology and Recommendation

Emotions are directly linked to decision making and personality. In the marketing literature, emotions are considered critical to understanding the underlying reasoning of consumers’ behaviors [35], as they can explain variations in individuals’ responses beyond rationality [55]. According to [36], there are six main emotions: happiness, sadness, fear, surprise, anger, and disgust. Emotions are a dominant driver of most significant decisions in life and also affect human behavior. Incorporating emotions in future MRS therefore seems promising.


Essentially, decisions guide people’s actions towards avoiding negative and increasing positive feelings [53]. Emotions emerge after we evaluate and interpret an event or stimulus [37]; that is, they arise from subjective evaluations of a situation or an event. The same event can therefore provoke different emotions in different individuals, depending on how the event affects them. To improve MRS recommendations, it is thus useful to know the user’s emotional state (positive/negative) at any given time [38]. Some studies have also shown that emotions are influenced by time and surroundings [39], which is an important consideration when designing an MRS that must identify the optimum time to make a recommendation to its user. This need becomes more apparent when travelling [40]. Nawijn [41] also reported that temperature may change travelers’ emotions. This is particularly important for an MRS, which should not distract or annoy the user. Morris and Geason [42] showed that emotions can explain personal intentions more than cognition, while [43] highlights that emotional changes are fundamental determinants of tourists’ decisions and account for 45% of the variance in tourists’ intentions to visit a place. In the same vein, the impact of the user’s physiological state, such as level of fatigue and mood, has been mostly ignored in the literature. When fatigued, users lose interest in most activities, so user physiology is highly relevant to MRS. According to [44], users could become annoyed and lose trust in the system if the same item is presented multiple times. In their work they utilize user mood (fatigue, tiredness) to recommend tourist attractions. Similarly, [45] also recognized the importance of user physiology to the recommendation problem. Their MRS, SenSay, adjusts its recommendations dynamically to the changing environmental and physiological states of its user.
SenSay utilizes mobile sensors such as accelerometers and light sensors in conjunction with physiological sensors mounted at numerous points on the body to provide data about the user’s physiology. Based on the above, it is important to highlight that an isolated analysis of the recommendation problem, using only a subset of its dimensions such as preferences, content, users, or context alone, yields mediocre results. Research indicates that the solution lies in holistic approaches that integrate all aspects of the recommendation problem. This is the method employed in this work.

3 Methodology

The main limitation of existing approaches to MRS stems from the fact that they do not take a holistic view of the problem but instead concentrate on only a few dimensions, such as users’ past preferences and geolocation. This, in essence, yields mediocre results. To improve user recommendation, it is important to employ a holistic approach that integrates not only external cues (such as environmental, historical and contextual data) but also internal information relating to the psychological and physiological state of the user. The method proposed herein is based on this paradigm and as such falls under the category of hybrid MRSs. However, due to the wide scope of the problem space, the focus of the research will be on the elicitation of the user


requirements of MRS using field-based questionnaires to address each dimension of the problem (psychological, physiological, contextual etc.). Another limitation of existing MRS frameworks is the scarcity or unavailability of data for training the recommendation algorithm, which is also referred to as the cold-start problem [8]. Most MRS use secondary data from social networks to hypothesize user behavior and to develop recommendation models. These, however, are based on mere assumptions about the tourist experience. Even though in some cases this approach yields good results, in our case historical data regarding the emotional and physiological state of users is not available, nor is collated data regarding all the dimensions of the external influences, such as weather, traffic conditions and the user’s social network activity. To tackle this limitation, primary data (internal and external) are obtained directly from tourists while on their holidays in Cyprus. This initial knowledge is pre-processed and subsequently utilized to identify user needs with regard to new MRS requirements. The proposed conceptual model is depicted in Fig. 1. As shown, tourists’ personality, emotion, physiology and context are proposed to relate to different features of a prospective MRS. According to the literature, there appear to be links between physiology, context, personality and emotions, and these properties then influence user needs, which are expressed in terms of user requirements.

[Figure: conceptual model with boxes labelled Physiology, Personality, Emotional State, Context, User needs and MRS requirement]

Fig. 1. Hypothesized conceptual model of features influencing MRS requirements

The methodological approach of this study is composed of the following steps. Initially a literature review was conducted in the fields of personality, emotion, mood, and tourist decision making to identify the main constructs for the design of the instrument for data collection. Subsequently a pilot study was conducted to verify the quality of the questions used in the questionnaire. The second task involved identifying candidate tourists that could take part in the survey research. The first criterion for participant selection was the location of the participant and their activity status. Therefore we recruited only foreign tourists during their holiday activities. This was


necessary in order to identify among others, MRS requirements that relate to user needs during their daily activities. The data collection was conducted at different tourist attractions in the Limassol area with many points of interests, restaurants, cafeterias, shops, museums and tourist activities at each point. Participants were interviewed while they were approaching the attraction to distill strategies they used when deciding their spontaneous activities or when selecting a point of interest to visit. The research instrument used to acquire user personality is based on the simplified Big five model as was adapted by [30], namely the Ten-Items Personality Inventory Questionnaire. This instrument is used in this study due to its validity and simplicity. Emotional and physiological state questions were used to capture users’ mental and physiological state using the main emotions [36] and main physiological states as reviewed earlier [44]. Contextual information included the current state of the user (active, relaxing etc.), location, weather and time. During interaction with participants, the following information was also collected: purpose of their visit to Cyprus along with demographics, their interests, food preferences, recommendations they would prefer based on their current state from a hypothetical MRS. Based on tourists’ current situation (geolocation, emotional & physiological state, preferences and personality type), participants were asked to choose a recommendation from a list of candidate predefined options. Participants were also asked to denote the best type of recommendation(s) given their current situation irrespective of the list provided in the point above. The above data was integrated and annotated with spatiotemporal and weather information. This aimed to identify links between their current state and required service. Collected data was preprocessed and subsequently analyzed to identify relationships among different constructs. 
Data-mining and statistical modelling techniques will be used to identify and investigate patterns in the dataset. The decision making strategies and patterns that emerged from the data analysis will be used as the basis for the specification of the MRS functionality and subsequently its implementation in future work.
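As an illustration of how responses to the Ten-Item Personality Inventory [30] are commonly turned into Big Five scores, the sketch below applies the usual scoring convention: items are rated on a 1 to 7 scale, each trait averages one directly keyed item and one reverse-keyed item. The function name and the example responses are hypothetical, and the study itself may have analysed the items differently. Note that the instrument yields Emotional Stability, the reverse pole of Neuroticism.

```python
from statistics import mean

# Trait -> (directly keyed item, reverse-keyed item), items numbered 1..10
# in questionnaire order. Assumes the standard TIPI item ordering.
TIPI_KEY = {
    "Extraversion": (1, 6),
    "Agreeableness": (7, 2),
    "Conscientiousness": (3, 8),
    "Emotional Stability": (9, 4),
    "Openness to Experience": (5, 10),
}

def score_tipi(responses: dict) -> dict:
    """Return one score per trait from ten ratings on a 1-7 scale."""
    if sorted(responses) != list(range(1, 11)):
        raise ValueError("expected ratings for items 1..10")
    if any(not 1 <= r <= 7 for r in responses.values()):
        raise ValueError("ratings must lie between 1 and 7")
    scores = {}
    for trait, (direct, reverse) in TIPI_KEY.items():
        # A reverse-keyed rating r is recoded as 8 - r before averaging.
        scores[trait] = mean([responses[direct], 8 - responses[reverse]])
    return scores

# Hypothetical participant.
example = {1: 6, 2: 3, 3: 7, 4: 2, 5: 5, 6: 2, 7: 6, 8: 1, 9: 6, 10: 3}
example_scores = score_tipi(example)
```

Trait scores computed this way could then be related to the requested MRS features, alongside the item-level analysis reported in the next section.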

4 Analysis and Results

The questionnaire was in English and all participants were fluent in the English language. This study was based on a sample of seventy participants: 27 males (39%) and 43 females (61%). Twenty-six were British (38%), twenty-two were from other EU countries (36%) and sixteen were non-EU nationals (26%). Forty-nine (69%) of them had higher education qualifications (College 12, Bachelors 20, Masters 15, PhD 2). Their mean age was 41.1, with 44 (66%) being employed, 14 (20%) retired, 5 (7.1%) students, and 3 (4.3%) not employed. The majority were in Cyprus with their spouse or family (76%), 20% with friends or colleagues, and 4% alone. The sample reported a variety of preferences with regard to the purpose of their visit to Cyprus: beach holiday (35%), walking and nature (26%), cultural holiday (23%), nightlife and clubbing (11.4%), relaxation (55%), active holiday (13%).


Initially, we conducted a nonparametric Spearman’s correlation analysis, which revealed the following links between personality traits and MRS user needs based on contextual information about the user’s situation (Table 1). The relationships are color-coded as positive (blue) or negative (red).
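For readers unfamiliar with the statistic, Spearman’s coefficient is the Pearson correlation of the ranks of the two variables, with tied values receiving their average rank. The following is a minimal, standard-library sketch of that definition; the data at the end are hypothetical and are not taken from the survey.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rank correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical ratings: a 7-point personality item against a 5-point
# rating of interest in one MRS feature.
calm = [2, 5, 6, 3, 7, 4]
wants_feature = [1, 3, 4, 2, 5, 3]
rho = spearman_rho(calm, wants_feature)
```

Because only ranks are used, the coefficient suits ordinal questionnaire items such as those collected here; significance testing would need an additional step not shown in this sketch.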

Table 1. Correlations between personality and MRS user-needs - on the go

Columns, left to right: Enthusiastic; Argumentative; Reliable; Anxious; Open To Experiences; Quiet; Caring; Disorganized; Calm; Conservative

On the go MRS ...          Correlations
Events                     -.103  .199  .154  .065  -.060  .033  .096  -.013  .141  .019
[row label not recovered]  .212  .119  -.023  .069  .006  .009  .034  -.116  .132  -.019
Attractions                -.105  .187  .112  .036  -.337**  -.129  .073  -.134  -.091  .410**
Restaurants                -.117  .200  .222(*)  -.024  .008  -.045  .126  -.090  .066  .044
Cafeterias                 -.097  .273*  -.105  -.024  -.284*  .009  .013  -.096  -.075  .002
[row label not recovered]  -.213(*)  .205  -.023  .140  -.326**  .154  -.219(*)  .051  .020  .101
Offers Near Me             .112  .044  .170  .075  -.201  -.064  .290*  -.067  .105  .174
Physical Effort            .002  .156  -.039  .123  -.239(*)  -.011  .003  .023  .020  .122
Very Close Friends         -.073  .007  .050  .148  -.014  -.073  .103  .019  -.048  -.039
Activities Near Me         .068  .081  .048  .031  -.122  .122  .094  -.139  .084  -.047

Row labels whose values could not be aligned: Sport Activities; Transportation Methods; Other People Preferences; Tiredness Recommend; Hungry Recommend; People With Similar Preferences

Unaligned values, in extraction order:
-.092  -.082  -.009
.017  -.067  -.022  -.005
-.035
.069  .106  -.030  .039  .190  -.220(*)  -.124  -.178  -.045  .108
-.078  .044  -.103  .180  -.050  -.084  -.169  -.092  .111  .045
.015  .104  -.102  .101  -.129  .095  .088  -.028  -.006  .104
-.236(*)  -.083

N=61 to 65, (*) p

relaxation=1 25 lift:(1.12) lev:(0.04) [2] conv:(1.4)
festival=2 shops=2 27 ==> sightseeing=1 24 lift:(1.15) lev:(0.05) [3] conv:(1.54)
argumentative=1 anxious=1 30 ==> relaxation=1 26 lift:(1.08) lev:(0.03) [2] conv:(1.2)
conservative=1 festival=2 29 ==> spirit=2 25 lift:(1.26) lev:(0.07) [5] conv:(1.82)
ecotour=2 relaxation=1 29 ==> conservative=1 25 lift:(1.21) lev:(0.06) [4] conv:(1.66)
conservative=1 50 ==> relaxation=1 42 lift:(1.05) lev:(0.03) [1] conv:(1.11)
argumentative=1 festival=2 30 ==> sightseeing=1 25 lift:(1.08) lev:(0.03) [1] conv:(1.14)
conservative=1 ecotour=2 30 ==> spirit=2 25 lift:(1.22) lev:(0.06) [4] conv:(1.57)

Fig. 2. Emerged association rules using Weka tool
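The measures printed next to each rule in Fig. 2 can be recomputed from counts. The sketch below is a hedged illustration: it takes the number of instances matching the rule’s premise, the number matching both premise and consequent, the number matching the consequent, and the total number of instances. Two inputs are assumptions rather than reported values: a total of 70 (the stated sample size) and a consequent count of 56 for relaxation=1, inferred because it reproduces the printed figures. The "+1" in the conviction denominator is likewise an assumption chosen to match the printed values; the textbook definition of conviction has no such term.

```python
def rule_metrics(premise_count, joint_count, consequent_count, total):
    """Confidence, lift, leverage and a smoothed conviction for A ==> B.

    premise_count:    instances matching A
    joint_count:      instances matching both A and B
    consequent_count: instances matching B
    total:            all instances
    """
    confidence = joint_count / premise_count
    consequent_support = consequent_count / total
    lift = confidence / consequent_support
    leverage = joint_count / total - (premise_count / total) * consequent_support
    # Assumed smoothing: +1 on the count of premise instances that violate B.
    conviction = (premise_count * (total - consequent_count) / total) / (
        premise_count - joint_count + 1
    )
    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }

# Rule from Fig. 2: conservative=1 (50) ==> relaxation=1 (42).
# total=70 and consequent_count=56 are assumptions, see the text above.
metrics = rule_metrics(premise_count=50, joint_count=42,
                       consequent_count=56, total=70)
```

A lift only slightly above 1, as in most of the listed rules, means the premise raises the probability of the consequent only marginally relative to its base rate, so such rules are better read as weak tendencies than as strong patterns.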

5 Conclusions and Future Directions

Personality is one of the most commonly used differentiators of individuals and has been found to affect tourist decisions and information needs in prospective MRS. In the same vein, emotion and physiology are additional dimensions that affect decision making. Emotion influences information needs and decisions and is directly linked to the user needs of a prospective MRS, as well as to the propensity to accept it. This work presents preliminary results from the analysis of tourists’ personalities and emotions and their association with prospective MRS user requirements using field-based data. Initial results from this study indicate that different personalities have different user requirements, which supports our initial assumptions. Similarly, emotion is found to be an influencing factor on user requirements. Finally, user acceptance of a prospective MRS is linked to personality. In conclusion, due to the complex relationships among variables, we suggest further analysis through regression and data mining, using association rules and decision trees, as part of our future work to identify patterns that could be used in the design of a prospective MRS for tourists.

References

1. Sassi, I.B., Mellouli, S., Yahia, S.B.: Context-aware recommender systems in mobile environment: on the road of future research. Inf. Syst. 72(C), 27–61 (2017)
2. Sommerville, I., Sawyer, P.: Requirements Engineering: A Good Practice Guide. Wiley, New York (1997)
3. Yu, E., Giorgini, P., Maiden, N., Mylopoulos, J. (eds.): Social Modeling for Requirements Engineering. MIT Press, Cambridge (2011)
4. Miller, T., Pedell, S., Lopez-Lorca, A.A., Mendoza, A., Sterling, L., Keirman, A.: Emotion-led modelling for people-oriented requirements engineering: the case study of emergency systems. J. Syst. Softw. 105, 54–71 (2015)
5. Wu, W., Chen, L., He, L.: Using personality to adjust diversity in recommender systems. In: 24th ACM Conference on Hypertext and Social Media (HT 2013), Paris, France, pp. 225–229 (2013)
6. Svendsen, G.B., Johnsen, J.K., Sorensen, L.A., Vitterso, J.: Personality and technology acceptance: the influence of personality factors on the core constructs of the technology acceptance model. Behav. Inf. Technol. 32(4), 323–334 (2013)
7. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning, vol. 1. MIT Press, Cambridge (2016)
8. Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 17(6), 734–749 (2005)
9. Buhalis, D.: eTourism: Information Technology for Strategic Tourism Management. Prentice Hall, Upper Saddle River (2003)
10. Kabassi, K.: Personalizing recommendations for tourists. Telematics Inform. 27(1), 51–66 (2010)
11. Ekstrand, M.D., Riedl, J.T., Konstan, J.A.: Collaborative filtering recommender systems. Found. Trends Hum.-Comput. Interact. 4(2), 81–173 (2011)
12. Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.): Content-Based Recommendation. The Adaptive Web, pp. 342–376. Springer, Heidelberg (2007)
13. Burke, R.: Hybrid recommender systems: survey and experiments. User Model. User-Adap. Interact. 12, 331–370 (2002)
14. Batet, M., Moreno, A., Sánchez, D., Isern, D., Valls, A.: Turist: agent-based personalised recommendation of touristic activities. Expert Syst. Appl. 39(8), 7319–7329 (2012)


15. Vansteenwegen, P., Souffriau, W., Vanden, G., Van Oudheusden, B.D.: The city trip planner: an expert system for tourists. Expert Syst. Appl. 38(6), 6540–6546 (2010)
16. Lee, C.S., Chang, Y.C., Wang, M.H.: Ontological recommendation multi-agent for Tainan city travel. Expert Syst. Appl. 36(3), 6740–6753 (2009)
17. Gavalas, M., Kenteris, A.: web-based pervasive recommendation system for mobile tourist guides. Pers. Ubiquitous Comput. 15(7), 759–770 (2011)
18. Huang, Y., Bian, L.: A Bayesian network and analytic hierarchy process based personalized recommendations for tourist attractions over the internet. Expert Syst. Appl. 36(1), 933–943 (2009)
19. Lamsfus, C., Alzua-Sorzabal, A., Martin, D., Salvador, Z., Usandizaga, A.: Human-centric ontology-based context modelling in tourism. In: Proceedings of the International Conference on Knowledge Engineering and Ontology Development, Funchal, Madeira, Portugal, pp. 424–434 (2009)
20. Degemmis, M., Lops, P., Semeraro, G.: A content-collaborative recommender that exploits WordNet-based user profiles for neighborhood formation. User Model. User-Adap. Interact. 17(3), 217–255 (2007)
21. Hsu, C.K., Hwang, G.J., Chang, C.K.: Development of a reading material recommendation system based on a knowledge engineering approach. Comput. Educ. 55(1), 76–83 (2010)
22. Simon, H.A.: A behavioral model of rational choice. Q. J. Econ. 69(1), 99–118 (1955)
23. Schmoll, G.: Tourism Promotion. Tourism International Press, London (1977)
24. Sirakaya, E., Woodside, A.G.: Building and testing theories of decision-making by travelers. Tour. Manag. 26(6), 815–832 (2005)
25. Funder, D.C.: Personality Puzzle. W.W. Norton Incorporated, New York (2012)
26. Cuperman, R., Ickes, W.: Big Five predictors of behavior and perceptions in initial dyadic interactions: personality similarity helps extraverts and introverts, but hurts “disagreeables”. J. Pers. Soc. Psychol. 97(4), 667 (2009)
27. Oliveira, R.D., Cherubini, M., Oliver, N.: Influence of personality on satisfaction with mobile phone services. ACM Trans. Comput.-Hum. Interact. 20(2), 1–23 (2013)
28. Tupes, E.C., Christal, R.E.: Recurrent personality factors based on trait ratings. J. Pers. 60(2), 225–251 (1992)
29. Costa Jr, P.T., McCrae, R.: Revised NEO personality inventory (NEO-PI-R) and NEO five factor model (NEO-FFI) professional manual. Psychological Assessment Center, Odessa, FL, USA (1992)
30. Gosling, S.D., Rentfrow, P.J., Swann, W.B.: A very brief measure of the Big-Five personality domains. J. Res. Pers. 37(6), 504–528 (2003)
31. Hu, R., Pu, P.: Enhancing collaborative filtering systems with personality information. In: Proceedings of the Fifth ACM Conference on Recommender Systems (RecSys 2011), Chicago, IL, USA, pp. 197–204 (2011)
32. Tkalcic, M., Kunaver, M., Tasic, J., Košir, A.: Personality based user similarity measure for a collaborative recommender system. In: Proceedings of the Fifth Workshop on Emotion in Human–Computer Interaction-Real World Challenges, Cambridge University, Cambridge, UK, pp. 30–37 (2009)
33. Braunhofer, M., Elahi, M., Ricci, F.: STS: a context-aware mobile recommender system for places of interest. In: CEUR Workshop Proceedings, Aalborg, Denmark (2014)
34. Braunhofer, M., Ricci, F.: Selective contextual information acquisition in travel recommender systems. Inf. Technol. Tour. 17(5), 5–29 (2017)
35. Dubé, L., Menon, K.: Multiple roles of consumption emotions in post-purchase satisfaction with extended service transactions. Int. J. Serv. Ind. Manag. 11(3), 287–304 (2000)


36. Ekman, P.: Facial expressions. In: Handbook of Cognition and Emotion, vol. 16, pp. 301–320 (1999)
37. Zeelenberg, M., Pieters, R.: Beyond valence in customer dissatisfaction: a review and new findings on behavioral responses to regret and disappointment in failed services. J. Bus. Res. 57(4), 445–455 (2004)
38. Ortigosa, A., Martín, J.M., Carro, R.M.: Sentiment analysis in Facebook and its application to e-learning. Comput. Hum. Behav. 31, 527–541 (2014)
39. Penner, L., Shiffman, S., Paty, J.A., Fritzsche, B.A.: Individual differences in intraperson variability in mood. J. Pers. Soc. Psychol. 66(4), 712 (1994)
40. Servidio, R., Ruffolo, I.: Exploring the relationship between emotions and memorable tourism experiences through narratives. Tour. Manag. Perspect. 20, 151–160 (2016)
41. Nawijn, J.: Determinants of daily happiness on vacation. J. Travel Res. 50(5), 559–566 (2011)
42. Morris, J., Geason, J.: The power of affect: predicting intention. J. Advert. Res. 42(3), 7–17 (2002)
43. White, C.J., Scandale, S.: The role of emotions in destination visitation intentions: a cross-cultural perspective. J. Hosp. Tour. Manag. 12(2), 168–179 (2005)
44. El Moemen, S.A., Soliman, T.H., Sewisy, A.: A context-aware recommender system for personalized places in mobile applications. Int. J. Adv. Comput. Sci. Appl. 7(3), 442–448 (2016)
45. Siewiorek, D.P., et al.: SenSay: a context-aware mobile phone. In: Seventh IEEE International Symposium on Wearable Computers (2003)
46. Woodside, A., King, R.: An updated model of travel and tourism purchase-consumption systems. J. Travel Tour. Mark. 10, 3–27 (2001)
47. Pizam, A., Mansfeld, Y.: Consumer Behavior in Travel and Tourism. Haworth Hospitality Press, New York (1999)
48. Decrop, A., Snelders, D.: Planning the summer vacation: an adaptable process. Ann. Tour. Res. 31(4), 1008–1030 (2004)
49. Jafari, J.: Encyclopedia of Tourism. Routledge, London (2003)
50. Correia, A., Kozak, M., Ferradeira, J.: Impact of culture on tourist decision-making styles. Int. J. Tour. Res. 13, 433–446 (2011)
51. Becken, S., Wilson, J.: Trip planning and decision-making of self-drive tourists, a quasi-experimental approach. J. Travel Tour. Mark. 20(3/4), 47–62 (2006)
52. Gunn, C.A.: Tourism Planning, 2nd edn. Taylor & Francis, New York (1988)
53. Coughlan, R., Connolly, T.: Predicting affective responses to unexpected outcomes. Organ. Behav. Hum. Decis. Process. 85, 211–225 (2001)
54. Brusilovsky, P., Kobsa, A., Nejdl, W.: Content-based recommendation systems. In: The Adaptive Web, pp. 325–341. Springer, Heidelberg (2007)
55. Lee, S.A., Shea, L.: Investigating the key routes to customers’ delightful moments in the hotel context. J. Hosp. Mark. Manag. 24(5), 532–553 (2015)
56. Goossens, C.: Tourism information and pleasure motivation. Ann. Tour. Res. 27(2), 301–321 (2000)
57. Smallman, C., Moore, K.: Process studies of tourists’ decision-making: the riches beyond variance studies. Ann. Tour. Res. 37(2), 397–422 (2010)
58. Bayardo, R.J.: Efficiently mining long patterns from databases. In: Proceedings of the 1998 ACM-SIGMOD International Conference on Management of Data, Seattle, Washington, USA, pp. 85–93 (1998)

A Marketing Game: A Model for Social Media Mining and Manipulation

Matthew G. Reyes

Independent Researcher and Consultant, Ann Arbor, MI 48105, USA
[email protected]
https://matthewgreyes.com, http://amarketinggame.com

Abstract. This paper derives marketing-influenced Glauber dynamics for socially-contingent consumer choice, which rests on the foundation of socially-contingent random utility. This dynamics model provides companies with a reinforcement learning approach to influencing consumer decision-making. The paper presents a procedure for using machine learning algorithms to estimate consumer preferences as well as direct and social biases on the network. The paper discusses the use of market research to estimate inherent biases and marketing responses for individual consumers. Finally, the paper illustrates on a star-chain network how optimization of marketing allocation depends on parameter estimation.

1 A Marketing Game

This is principally a position paper in which we argue for a model of consumer decision-making that affords analysis and optimization of marketing influence. Our model builds on well-established models in economics, and while it differs in subtle points, it is precisely these points that open the door to understanding and applying influence on decision-making within a social network. Consider a market in which consumers choose between two alternatives, Product A and Product B, according to their perception of the value of these two choices. The Products may be, for example, commercial products or political candidates, with an individual’s perceived utility of a Product owing to enhanced productivity, enjoyment, or status. In exchange for the Product they choose, consumers give Company A or Company B, respectively, their money or their vote, for example. To enhance the perception that consumers have of their respective Products, Companies A and B market their Products to consumers in a social network, as depicted in Fig. 1. The point of emphasizing the social network is to underscore the role that social connections play in influencing the decisions of consumers, and the strategy involved in companies selecting which consumers to target with marketing. One may approach this problem with the scientific objective of seeking to characterize the influence imparted by particular (types of) individuals or a particular network topology, or with the engineering objective of seeking to optimize influence irrespective of the respective individuals or topology. This paper addresses the engineering objective.

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 874–892, 2020. https://doi.org/10.1007/978-3-030-12388-8_59


Fig. 1. (a) Perpendicular view of a star-chain network that does not show marketing. The consumer with $d + 1$ neighbors is the hub $c_1$; the $d$ consumers whose only neighbor is the hub are the leaves $l_1, \ldots, l_d$; the remaining consumers are the chain consumers $c_2, \ldots, c_7$. (b) Parallel view of the network illustrating marketing influence by Company A (red) and Company B (blue).

The market share [3] for Company A with respect to consumer $k$ is the probability $p_k(A)$ that consumer $k$ chooses Product A, and likewise for Product B. With respect to the entire network, the market share of a Company is the sum, over all consumers in the network, of the probabilities that each consumer chooses its Product. The bias $\mu_k$ of consumer $k$ is the difference in the probabilities of selecting A and B, i.e.,

$$\mu_k \triangleq p_k(A) - p_k(B).$$

The total bias on the network is the sum of the biases of all consumers,

$$\sum_{k \in V} \mu_k = \sum_{k \in V} \left[ p_k(A) - p_k(B) \right] = \sum_{k \in V} p_k(A) - \sum_{k \in V} p_k(B), \tag{1}$$

i.e., the difference in market share [3] of the two companies. Note that Company A wants to maximize, whereas Company B seeks to minimize, total bias. Companies A and B select respective marketing allocations, which are subsets of consumers to target with consumer-specific types of marketing, by learning models for the consumer choice probabilities {pk } that take into account the


M. G. Reyes

effect of marketing, and by optimizing over expected total bias. It is this that we refer to as A Marketing Game. The framework presented in this paper provides a firm foundation on which to construct a decision-influencing operation. However, in order for such an operation to be viable, it must leverage advances in so-called affective computing [8], which connects states of mind, for example preference between Products A and B, and the effects of such preference, for example content shared on social media. Moreover, in [33] we introduced a marketing response for each consumer that reflects consumers’ individual responsiveness to a particular type of marketing. Such a marketing response will necessarily abide by the well-known Weber-Fechner or Stevens Laws [36] in quantifying the change in a consumer’s perception of the value, or utility, of a Product, as a function of the marketing intensity, e.g., frequency and duration. In other words, this paper proposes a theoretical scaffold upon which to organize relevant social science contributions. The following section discusses related work and the contributions of this paper. Section 3 provides an overview of socially-contingent random utility and introduces the marketing-influenced parametrization thereof. Section 4 discusses the emphasis on influences in our model in the context of prior emphasis on so-called influentials. Section 5 briefly discusses the connection between consumer preferences and posts on social media. Section 6 outlines the basic data analytic components of A Marketing Game. Finally, Sect. 7 concludes with a discussion of limitations and future work.

2 Related Work and Contributions

The problem of influencing decision-making on a social network has attracted a great deal of attention [1,12,15,19,27,34,44]. Some of this attention has focused on the role played by network topology [27,44]. Social networks have been characterized in different ways, the two most prominent being the small-world [43] and scale-free [2] properties. Small-world networks are defined as occupying an intermediate position between completely random and completely regular networks, in terms of so-called clustering coefficient and average (shortest) path length. Scale-free networks possess so-called power-law degree distributions, which result in a characteristic of relatively few hubs, individuals who are highly connected within the network, with most others connected to few others. There has been recent work showing that spanning trees of scale-free networks are typically themselves scale-free [21]. As such we feel that the so-called star-chain network illustrated in Fig. 1a is a useful atom to consider. The numerical analysis we present in Sects. 6.1 and 6.3 will be with respect to this network. In addition to network topology, there is the issue of choice dynamics on the network, the process by which consumers form and modify their preferences in response to interactions with neighboring consumers. Kempe et al. [19] and Watts and Dodds [44] have examined dynamics ranging from contagion models inspired by epidemiology to threshold models tantamount to best-response in the parlance of so-called interaction games [4,27]. The language of contagion


and epidemic can in part be traced to The Tipping Point [15], in which Gladwell draws an analogy between widespread product adoption and the outbreak of disease. Unfortunately, the epidemic analogy is rather misleading with respect to product adoption. For example, if neighboring consumers i and j are “infected”, respectively, with Products A and B, and i infects j, then j is no longer infected with B. But this is not how diseases work, and as such, an epidemic is not a good model for the adoption of preferences on a network. On the other hand, Gladwell also identified more germane forces influencing the spread of a Product. Through a number of examples he argues that connectors, mavens, and salesmen facilitate diffusion of product preference. In addition, he introduces the idea of a Product’s stickiness to indicate the likelihood that consumers will continue to choose a product after gaining experience with it. A central argument of this paper is that these concepts, while informal, are nevertheless captured in a rather simple parametrization of consumer choice based on random utility [26]. In particular, the socially-contingent extension of random utility introduced by Blume [4], and explored by Montanari and Saberi [27] in the context of Product adoption, includes the inherent bias αi of a consumer towards the Products, and social biases θj→i and θi→j indicating the influence that neighboring consumers i and j exert upon one another. That is to say, the inherent bias αi captures the stickiness of Gladwell’s formulation, while the social biases θj→i and θi→j capture the relative ‘maven-ness’ of neighboring consumers towards one another. Montanari and Saberi [27] considered the effect of network topology on preference adoption in the case of non-uniform inherent biases where all inherent biases favored the same Product. While we feel, and discuss briefly in Sect. 3.2, that such a model is useful for certain markets of social decision-making, it is ill-suited for modeling product adoption. For one, it, and indeed the majority of works that can be interpreted within the interaction game paradigm [1,9,14,19,34,44], model choice updates as best-responses to inherent and social biases, conditioned on the preferences of one’s neighbors. Moreover and more importantly, the model in [27] does not include marketers for the Products. Indeed, the common approach of these works is seeding a Product at a select subset of consumers and then analyzing product adoption under best-response choice dynamics, without any ongoing efforts to market the Product. In contradistinction, in our model consumers update their preferences randomly,¹ according to the logit choice response model [26] with inherent biases {αi} and social biases {θj→i} determined from data. The authors of [10] consider the problem of learning the influences between neighboring consumers. Their learning criterion is a squared error metric with respect to a deterministic choice dynamics model. On the other hand, random utility theory [5,26] reflects the fact that, with respect to a particular market, consumer choices will appear random, due to the fact that decisions made between alternatives A and B will

¹ It is important to note that in [4, 27], among others, randomness is included in the choice updates. However, such noisy best-response dynamics are employed as a stratagem to force convergence to the payoff- rather than risk-dominant strategy.


nevertheless be influenced by considerations external to the market. In other words, utility is viewed as a parametrization of the actual frequencies with which consumers exhibit preference among alternatives. The modeler will decompose the parametrization into different influences that one suspects may be important, and which correspond to data that can be observed. That is, random utility theory is a truly data-driven framework for modeling consumer decision-making. More important than the resulting stochastic choice dynamics, the power of the random utility parametrization is that by including marketing in the parametrization, Companies A and B can approach the problem of optimizing their marketing allocations within the framework of reinforcement learning [37]. Specifically, miA and miB are marketing biases applied to consumer i by Companies A and B, respectively. Decisions by Companies A and B as to which consumers to target with marketing will be determined by learning inherent and social biases, as well as the marketing responses [33] of individual consumers to different types of marketing. The marketing response and the level of investment by Company A, for instance, in marketing to consumer i, are what determine the value of the marketing bias miA. The marketing responses will be learned by market research. On the other hand, inherent and social biases can be learned by well-known inference algorithms for graphical models [41]. Marketers are indicated in Fig. 1b, and correspond to the salesmen of Gladwell’s model. Recently, Abebe et al. have considered the problem of maximizing influence from the perspective of modifying consumers’ so-called susceptibilities [1]. Our argument here is that consumer i’s susceptibility, if you will, can be decomposed into inherent bias αi, social biases θj→i, and marketing biases miA and miB.
In the context of this paper, then, modifying a consumer’s susceptibility amounts to Company A, for example, investing more, presumably in effective marketing, to increase the marketing bias miA applied to consumer i. In future work, it could correspond to efforts to influence social biases between neighboring consumers, for example the effect of rumor spreading. This paper rigorously derives the marketing-influenced parametrization that enables incorporation of marketing responses into a model of consumer network decision-making. This paper likewise presents a general template for learning network biases from social media content. Such learned biases are fused with marketing responses gleaned from market research to form a model of network decision-making. We illustrate in Sect. 6.3 the importance of parameter estimation in optimizing marketing allocation. In particular, if Company A is unable to distinguish between an inherent bias in favor of Company B and marketing bias applied by Company B, what Company A determines to be the best allocation could in fact be the worst allocation.

3 Choice Dynamics Model

Consumers choose between alternatives A and B in part as a result of the perceived utility of each alternative. Utility can be viewed as a scale for measuring differences in a consumer’s perception of the respective value provided by each


alternative [24,39]. This perceived value is partly due to objective matters such as price and monetary returns, but also to more subjective matters such as enhanced enjoyment and status. For example, prospect theory [20] posits that choices are often determined more by minimization of risk than by maximization of monetary expectation, presumably because losing a gamble can result in a loss of status, and therefore greater utility is assigned to a more certain though less obviously beneficial possibility. Random utility theory [5,26], on the other hand, is somewhat more agnostic, positing instead that consumers maximize utility, but that the utility assigned to an alternative by a consumer can be decomposed into known and unknown sources. Such a parametrization may, of course, obscure explanation of the observed frequencies of choice, for example, as provided by prospect theory [20]. In this section we introduce the marketer into the random utility parametrization of socially-contingent choice, which, as we discuss in Sect. 6, enables Companies to combine market research, data analytics, and simulation, to optimize marketing allocation. Let

$$x_i = \begin{cases} 1 & \text{if consumer } i \text{ chooses A} \\ -1 & \text{if consumer } i \text{ chooses B} \end{cases} \tag{2}$$

numerically denote consumer $i$’s choice or preference, and $X_i$ the random variable associated with consumer $i$’s possible choice. We will abuse notation and let $x_i$ refer to both the numerical value (1 or −1) and the choice (A or B). Let $x = (x_1, \ldots, x_{|V|})$ denote a configuration of choices on the network, and $X = (X_1, \ldots, X_{|V|})$ the random field associated with choices on the network. Here, $V$ is the set of consumers.

3.1 Random Utility Parametrization of Choice

In choosing between alternatives A and B, consumers seek to maximize over the utilities

$$U = \begin{bmatrix} u_A + \epsilon_A \\ u_B + \epsilon_B \end{bmatrix}, \tag{3}$$

where $u_A$ and $u_B$ are the known sources of utility assigned respectively to Products A and B, and $\epsilon_A$ and $\epsilon_B$ are the respective unknown sources of utility. For example, if a modeler opted to use expected monetary gain as the known sources of utility $u_A$ and $u_B$, then a consumer’s preference for a certain gain over a less certain but larger gain would be attributable to the unknown sources of utility $\epsilon_A$ and $\epsilon_B$. In this case, by observing the actual frequencies with which consumers choose one alternative over the other, the modeler would fit a value for the parameter associated with numerical expectation that predicted monetary gains obtained from each of the alternatives. When a consumer updates his choice by maximizing the utility in (3), he will choose Product A if $u_A + \epsilon_A > u_B + \epsilon_B$. Because the sources of utility $\epsilon_A$


and $\epsilon_B$ are unknown, we model them as random variables. Therefore, whether a consumer chooses Product A will likewise be random, with probability

$$p(A) = p(u_A + \epsilon_A > u_B + \epsilon_B) = p(\epsilon_A - \epsilon_B > u_B - u_A). \tag{4}$$

Assume that the unknown sources of utility $\epsilon_A$ and $\epsilon_B$ are distributed as the maxima of sequences of independent and identically distributed random variables. In this case (4) becomes a logit response distribution [26,38], i.e.,

$$p(A) = \frac{e^{u_A}}{e^{u_A} + e^{u_B}}, \tag{5}$$

and likewise for $p(B)$. As such, the utilities $u_A$ and $u_B$ can be viewed as a parametrization of individual choice probabilities. In part for this reason, the logit model is common in marketing research [18,25]. One can also derive randomness in individual choice by hypothesizing bounded rationality [35] on the part of consumers. Different assumptions about $\epsilon_A$ and $\epsilon_B$ will lead to different choice rules. For example, if the unknown sources of utility for different alternatives are instead modeled as dependent, one instead derives a nested logit model, which can be viewed as an iterative logit model akin to Tversky’s model of aspect elimination [40]. Moreover, if $\epsilon_A$ and $\epsilon_B$ are modeled as sums rather than maxima of i.i.d. variables, one derives a probit rather than a logit model [38].

3.2 Socially-Contingent Parametrization of Choice
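Before turning to social contingency, the logit response (5) of Sect. 3.1 can be checked numerically. The following is a minimal Python sketch, not part of the model itself: it assumes that the unknown utilities are independent standard Gumbel variables (the usual extreme-value choice for maxima of i.i.d. sequences), and the function names `gumbel`, `logit_share`, and `simulated_share`, as well as the utility values, are hypothetical.

```python
import math
import random


def gumbel(rng):
    """Draw one standard Gumbel variate by inverting its CDF exp(-exp(-z))."""
    u = max(rng.random(), 1e-300)  # guard against log(0)
    return -math.log(-math.log(u))


def logit_share(u_a, u_b):
    """Closed-form probability of choosing A, as in (5)."""
    return math.exp(u_a) / (math.exp(u_a) + math.exp(u_b))


def simulated_share(u_a, u_b, num_draws, rng):
    """Fraction of draws in which u_A + eps_A exceeds u_B + eps_B, as in (4)."""
    wins = 0
    for _ in range(num_draws):
        if u_a + gumbel(rng) > u_b + gumbel(rng):
            wins += 1
    return wins / num_draws


if __name__ == "__main__":
    rng = random.Random(0)
    u_a, u_b = 0.3, -0.5  # illustrative known utilities (assumed values)
    print("simulated share of A:", simulated_share(u_a, u_b, 100_000, rng))
    print("logit share of A (5):", logit_share(u_a, u_b))
```

Under these assumptions the two printed values should agree up to sampling error, which illustrates why the known utilities can be read as a parametrization of choice frequencies.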

When the parametrization of consumers’ utilities is contingent upon the choices of other consumers, the interdependence of utility is referred to as a game [4]. Such socially contingent choice can be extended to networks of consumers [4,7] whose individual choices are contingent upon a small subset of other consumers, referred to as neighbors. Let $\partial i$ denote the set of neighbors of consumer $i$. The known sources of utility $u_A$ and $u_B$ for consumer $i$ will be decomposed additively into utility derived through agreement or disagreement with his neighbors in $\partial i$. Let $u^i_{A|A}$ and $u^i_{B|A}$ be the respective known utilities that consumer $i$ derives from Products A and B conditioned on neighbor $j \in \partial i$ having chosen Product A. Likewise for $u^i_{A|B}$ and $u^i_{B|B}$ conditioned on $j$ having chosen B. This dependence can be summarized by the matrix

$$u^i_{|j} = \begin{bmatrix} u^i_{A|A} & u^i_{A|B} \\ u^i_{B|A} & u^i_{B|B} \end{bmatrix},$$

where the column corresponds to the choice by $j$. Following Blume [4], we can re-write $u^i_{|j}$ as

$$u^i_{|j} = \begin{bmatrix} \theta_{j \to i} & -\theta_{j \to i} \\ -\theta_{j \to i} & \theta_{j \to i} \end{bmatrix} + \begin{bmatrix} \alpha_{i|j} & \alpha_{i|j} \\ -\alpha_{i|j} & -\alpha_{i|j} \end{bmatrix}, \tag{6}$$


where

$$\theta_{j \to i} = \frac{u^i_{A|A} - u^i_{A|B} - u^i_{B|A} + u^i_{B|B}}{4} \quad \text{and} \quad \alpha_{i|j} = \frac{u^i_{A|A} + u^i_{A|B} - u^i_{B|A} - u^i_{B|B}}{4}.$$

Here, $\theta_{j \to i}$ is the social bias exerted upon consumer $i$ by neighbor $j$, due to whether $i$ makes the same choice as $j$. With some abuse of notation, we define

$$\alpha_i \triangleq \sum_{j \in \partial i} \begin{bmatrix} \alpha_{i|j} \\ -\alpha_{i|j} \end{bmatrix} = \begin{bmatrix} \alpha_i \\ -\alpha_i \end{bmatrix} \tag{7}$$

to be the inherent bias of consumer $i$, with $\alpha_i > 0$ indicating a bias in favor of Product A and $\alpha_i < 0$ a bias in favor of Product B. Using the numerical representation of choice (2), the utility vector (3) for consumer $i$, conditioned on the choices $x_{\partial i}$ of his neighbors, can now be decomposed as

$$U|x_{\partial i} = \begin{bmatrix} \alpha_i + \sum_{j \in \partial i} \theta_{j \to i} x_j + \epsilon^i_A \\ -\alpha_i - \sum_{j \in \partial i} \theta_{j \to i} x_j + \epsilon^i_B \end{bmatrix}. \tag{8}$$

Applying the socially-contingent decomposition of utility (8) to the random utility update (5) yields the following choice dynamics:

$$p(x_i | x^{(t)}_{\partial i}) = \frac{\exp\Big( \sum_{j \in \partial i} \theta_{j \to i} x_i x^{(t)}_j + \alpha_i x_i \Big)}{Z_{i|x^{(t)}_{\partial i}}}, \tag{9}$$

referred to as Glauber dynamics [16] in the statistical mechanics literature, where $Z_{i|x^{(t)}_{\partial i}}$ is the normalizing constant, referred to as the (local) partition function at $i$ conditioned on $x_{\partial i}$ at time $t$. Best- or near best-response dynamics [4,13,14,27,28,44,45] have long been considered. As discussed in [38], scaling the utilities $u_A$ and $u_B$ by a constant $\beta > 0$ amounts to adjusting the relative importance of the known and unknown sources of utility. For example, the probability of a consumer choosing Product A becomes

$$p(A) = p(\beta u_A + \epsilon_A > \beta u_B + \epsilon_B) = p\left( u_A + \frac{\epsilon_A}{\beta} > u_B + \frac{\epsilon_B}{\beta} \right),$$


in which case the limit β → ∞ would amount to a market in which consumers’ perception of utility for Products A and B significantly outweighs sources of utility external to the market; moreover, a market in which the modeler is fully aware of the within-market sources of utility. Such a construction, considered explicitly in [27] and implicitly in [14,44], is useful from the scientific objective of understanding social norms, where emphasis on fitting in likely outweighs influences from other markets. However, from the engineering objective of using data to influence consumer decision-making, consumer choice should not be modeled under a β → ∞ scaling of known utility, but rather stochastically, as in (9), where utilities αi and θj→i can be estimated from data as those that predict actual choice frequencies. More importantly, one needs to include the marketer in the model so that companies can account for the influence that marketing has on choice dynamics.

3.3 Including the Marketer in Choice Parametrization

On a purely mathematical level, the primary contribution of this paper is incorporating the influence of marketing for Products A and B into the above parametrization of choice dynamics. In pioneering work, Harold Lasswell [22] examined the social role that media advertising plays in consumer decision-making. Around the same time, David Ogilvy [29] opened what would become Ogilvy and Mather, ushering in an age of content marketing, in which emphasis was placed on communicating with consumers in a more informative rather than promotional manner. It therefore makes sense to view the marketer as a social connection with a constant preference. We include in the socially-contingent decomposition of utility (8) the marketing biases $m^i_A > 0$ and $m^i_B > 0$ applied by Companies A and B, respectively, to consumer $i$, as

$$U|x_{\partial i} = \begin{bmatrix} \alpha_i + m^i_A - m^i_B + \sum_{j \in \partial i} \theta_{j \to i} x_j + \epsilon^i_A \\ -\alpha_i - m^i_A + m^i_B - \sum_{j \in \partial i} \theta_{j \to i} x_j + \epsilon^i_B \end{bmatrix}. \tag{10}$$

The marketing biases applied to a consumer are tantamount to the social biases neighboring consumers exert upon each other, for example as indicated in Fig. 1b. The strength of the marketing biases will depend on the marketing response [33] of consumer i to the particular type of marketing, which indicates the degree of influence as a function of marketing intensity. This is discussed briefly in Sect. 6.2. The choice dynamics now become 

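$$p(x_i | x^{(t)}_{\partial i}) = \frac{\exp\Big( \sum_{j \in \partial i} \theta_{j \to i} x_i x^{(t)}_j + \theta_i x_i \Big)}{Z_{i|x^{(t)}_{\partial i}}}, \tag{11}$$

where $\theta_i = \alpha_i + m^i_A - m^i_B$ is the direct bias of consumer $i$.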
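The update rule (11) can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: the three-consumer chain, all numerical bias values, and the choice of updating one uniformly chosen consumer at a time are assumptions made here for illustration, and the names `prob_choose_a` and `glauber_step` are hypothetical.

```python
import math
import random


def prob_choose_a(direct_bias, incoming):
    """Probability that a consumer picks A (x_i = +1) under (11).

    direct_bias: theta_i = alpha_i + m_A^i - m_B^i
    incoming: list of (theta_{j->i}, x_j) pairs, one per neighbor j
    """
    field = direct_bias + sum(theta * x_j for theta, x_j in incoming)
    # exp(field) / (exp(field) + exp(-field)), rewritten with one exponential
    return 1.0 / (1.0 + math.exp(-2.0 * field))


def glauber_step(x, neighbors, direct, rng):
    """Resample, in place, the choice of one uniformly chosen consumer."""
    i = rng.randrange(len(x))
    incoming = [(theta, x[j]) for j, theta in neighbors[i]]
    x[i] = 1 if rng.random() < prob_choose_a(direct[i], incoming) else -1
    return i


if __name__ == "__main__":
    # Hypothetical chain 0 - 1 - 2; neighbors[i] lists (j, theta_{j->i}).
    neighbors = {0: [(1, 0.4)], 1: [(0, 0.8), (2, 0.2)], 2: [(1, 0.5)]}
    alpha = [0.0, -0.3, 0.0]   # inherent biases (assumed)
    m_a = [1.0, 0.0, 0.0]      # Company A markets to consumer 0 only
    m_b = [0.0, 0.0, 0.5]      # Company B markets to consumer 2 only
    direct = [a + ma - mb for a, ma, mb in zip(alpha, m_a, m_b)]

    rng = random.Random(0)
    x = [rng.choice((1, -1)) for _ in direct]
    totals = [0] * len(x)
    num_steps = 20_000
    for _ in range(num_steps):
        glauber_step(x, neighbors, direct, rng)
        for k, x_k in enumerate(x):
            totals[k] += x_k
    print("empirical bias per consumer:", [t / num_steps for t in totals])
```

Averaging the choices over a long run gives an empirical estimate of each consumer's bias under the assumed parameters, which is the quantity a Company would compare across candidate marketing allocations.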

4 Influences Versus Influentials

Slightly before Lasswell’s work, Paul Lazarsfeld et al. analyzed the influence of media on consumer choice in a presidential election, concluding that for most consumers, social bias was more influential than the applied bias of media marketing. Based on these findings, they introduced the two-step model of communication [23], whereby the role of a marketer was to target influential opinion leaders, as they were the gateway to everyone else. In The Tipping Point [15], Malcolm Gladwell revisited the idea of influentials with his salesmen, mavens, and connectors. In contrast, it is important to note that our model does not stress influential individuals, but rather influences. For example, if θi→j > θj→i for each neighbor j ∈ ∂i, then we might refer to consumer i as a maven, one who exerts a strong influence over the preferences of others. On the other hand, we might simply say that i is a maven in his interactions with neighbor j1, if θi→j1 > θj1→i, even if θi→j2 < θj2→i for some other neighbor j2 ∈ ∂i. If i is a hub in the network, or connects otherwise distant regions of the network, we might refer to consumer i as a connector. Moreover, an inherent bias αi can be interpreted as the difference in stickiness between Product A and Product B, with respect to consumer i. The relative importance of these influences will depend on the overall constellation of influences. Indeed, Watts and Dodds [44] found that under certain combinations of inherent and social biases,² targeting connectors would result in a so-called cascade, while with other combinations, targeting connectors did not lead to a cascade. Our objective is not to affirm or contradict the so-called influentials hypothesis, nor to identify a priori which influences are more important. Rather, the relative importance of influences will manifest in simulation of the network under influences estimated from data.
As indicated above, our primary mathematical contribution is including the influence of the salesmen, through marketing biases miA and miB .

5 Social Media Posts

At a given time $t$, the configuration of choices $x^{(t)} = (x^{(t)}_1, \ldots, x^{(t)}_{|V|})$ on the network represents preferences for the two Products. There is a corresponding configuration $y^{(t)} = (y^{(t)}_1, \ldots, y^{(t)}_{|V|})$ of consumer posts. Posts can be an image, a block of text, or a combination of the two. If at time $t$ consumer $i$ prefers Product A, the post $y^{(t)}_i$ could reflect positive sentiment with respect to A or negative sentiment with respect to B [30].

When consumer $i$ updates his choice to $x^{(t+1)}_i$, he does so based on his inherent bias towards Products A and B, the applied bias from Companies A and B, the social biases from his neighbors, and his understanding of the preferences indicated in his neighbors’ posts $\{y^{(t)}_j : j \in \partial i\}$. We assume in this paper that the post $y^{(t)}_i$ is perfectly correlated with consumer $i$’s choice $x^{(t)}_i$, so that consumer $i$’s neighbors can be said to observe $i$’s choice $x^{(t)}_i$. In general, however, there may be some ambiguity between consumer $i$’s post $y^{(t)}_i$ and his actual preference $x^{(t)}_i$ [31]. When consumer $i$ updates his choice to $x^{(t+1)}_i$, he creates a new post $y^{(t+1)}_i$.

In Sect. 6 we briefly discuss the use of deep learning [17] and sentiment analysis [8] to detect semantic relationships between objects and topics in consumers’ social media posts. In order to fully leverage these tools, Companies will additionally need models correlating a user’s preference towards Products (commercial or electoral) and the semantic content of their social media posts. For example, consumers $i$ and $j$ may both prefer political candidate A, but whereas consumer $i$ may post favorable content with respect to candidate A, consumer $j$ may post unfavorable content with respect to candidate B.

² They referred to the susceptibility of a consumer.

6 Analytics of A Marketing Game

As mentioned in Sect. 2, introducing marketing biases into the parametrization of choice dynamics places A Marketing Game within the purview of reinforcement learning. That is, Companies A and B will learn marketing biases miA and miB, respectively, through market research; direct biases θi = αi + miA − miB and social biases θj→i through a combination of deep learning [17] and graphical model inference algorithms [32,41]; and use the resulting learned model of network choice dynamics to select the marketing allocation that optimizes predicted market share. To be sure, successfully implementing this approach will require considerable interdisciplinary effort. Namely: on the marketing side, we need to understand how users’ mental states will respond to content, so that we can create content marketing; on the analytics side, we need to understand how users’ mental states create content, so that we can respond with appropriate content analytics; in particular, we need to understand how to use deep learning to infer from posted social media content consumer preferences that are in turn influenced by marketing content posted by Companies. In this section we briefly discuss the basic components of an analytics pipeline, illustrated in Fig. 2, for leveraging the marketing-influenced socially-contingent parametrization of consumer choice introduced in Sect. 3.3. The two components that interact directly with the network are the API/data collection module, which scrapes user posts from social media, and the content marketing module, which exposes users to marketing content analogous to the social media posts of a user’s neighbors. In Sect. 6.1 we discuss estimation of direct and social biases from social media posts. In Sect. 6.2 we discuss marketing research required to estimate consumers’ marketing responses. In Sect. 6.3 we discuss simulation of the network under candidate marketing allocations and illustrate the importance of distinguishing between inherent and marketing biases in favor of the other Company.


Fig. 2. Block diagram illustrating data analytic components of A Marketing Game.

6.1 Estimation of Direct and Social Biases

A Company will use an application programming interface (API) to collect data, for example posts $y^{(t-T)}, \ldots, y^{(t)}$, from a social media network. For a given consumer $i$ and post $y^{(t-\tau)}_i$, Company A uses machine learning algorithms, for example a recurrent or convolutional neural network [17], to determine the subject of the post, and sentiment analysis [8] to determine the consumer’s attitude with respect to the subject. Using models correlating preferences and posts, Company A will form an estimate $\hat{x}^{(t-\tau)}_i$ of consumer $i$’s preference, which can be viewed as a noisy version of $x^{(t-\tau)}_i$, the noise being determined by the accuracy of the machine learning algorithms. From the perspective of Company A, the preferences $X$ constitute a hidden Markov random field in which noisy observations $\{\hat{x}^{(t-\tau)}\}$ are observed, while true choices $\{x^{(t-\tau)}\}$ are unobserved. In order to simplify things, assume that Company A’s machine learning algorithms are perfect, so that Company A observes the true sequence of choice configurations $\{x^{(t-\tau)}\}$. Company A can leverage the Markov property of $X$, that knowing the choices of a consumer’s neighbors renders the consumer’s choices independent of choices outside of his neighborhood, by minimizing the conditional description length [32]

$$\bar{D}(x^{(t-T:t)}_i | x^{(t-T:t)}_{\partial i}; \bar{\theta}_i) = -\sum_{\tau=0}^{T} \log p(x^{(t-\tau)}_i | x^{(t-\tau)}_{\partial i}; \bar{\theta}_i)$$

of consumer $i$’s choices conditioned on the choices of his neighbors. Here, $x^{(t-T:t)}_i \triangleq x^{(t-T)}_i, \ldots, x^{(t)}_i$ indicates the sequence of observations and $\bar{\theta}_i = \theta_i \cup \{\theta_{j \to i} : j \in \partial i\}$ denotes the parameter specifying consumer $i$’s choice. Figure 3 illustrates estimates of the direct bias for the hub consumer $c_1$ in Fig. 1a. When the machine learning algorithm of Company A is not perfect and estimated parameters $\tilde{\theta} = (\tilde{\theta}_i, \tilde{\theta}_{j \to i})$ must be determined using noisy observations $(\hat{x}^{(t-\tau)})$, Company A can use a variant of the well-known expectation-maximization (EM) algorithm [11].
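The minimization of the conditional description length can be sketched as follows for the idealized case of perfectly observed choices. This is a minimal illustration, not the estimator used to produce Fig. 3: plain gradient descent, the step size, the iteration count, the synthetic data with independent uniformly random neighbor choices, and the names `description_length`, `fit_biases`, and `sample_choice` are all assumptions introduced here.

```python
import math
import random


def field_of(theta_bar, x_nb):
    """theta_i + sum_j theta_{j->i} x_j, with theta_bar = [theta_i, theta_{j->i}, ...]."""
    return theta_bar[0] + sum(t * x for t, x in zip(theta_bar[1:], x_nb))


def description_length(theta_bar, samples):
    """-sum over samples of log p(x_i | x_neighbors; theta_bar), with p as in (11)."""
    return sum(math.log1p(math.exp(-2.0 * x_i * field_of(theta_bar, x_nb)))
               for x_i, x_nb in samples)


def fit_biases(samples, num_neighbors, step=0.2, iterations=200):
    """Minimize the average description length by plain gradient descent."""
    theta_bar = [0.0] * (num_neighbors + 1)
    for _ in range(iterations):
        grad = [0.0] * len(theta_bar)
        for x_i, x_nb in samples:
            field = field_of(theta_bar, x_nb)
            # derivative of log(1 + exp(-2 x_i field)) with respect to field
            d_field = -2.0 * x_i / (1.0 + math.exp(2.0 * x_i * field))
            grad[0] += d_field
            for k, x_j in enumerate(x_nb):
                grad[k + 1] += d_field * x_j
        theta_bar = [t - step * g / len(samples) for t, g in zip(theta_bar, grad)]
    return theta_bar


def sample_choice(theta_bar, x_nb, rng):
    """Draw x_i in {+1, -1} from (11) given the neighbors' choices."""
    p_a = 1.0 / (1.0 + math.exp(-2.0 * field_of(theta_bar, x_nb)))
    return 1 if rng.random() < p_a else -1


if __name__ == "__main__":
    rng = random.Random(0)
    true_theta = [-0.5, 0.4, 0.3]  # hypothetical direct bias and two social biases
    samples = []
    for _ in range(2000):
        x_nb = [rng.choice((1, -1)) for _ in range(2)]
        samples.append((sample_choice(true_theta, x_nb, rng), x_nb))
    print("estimated biases:", fit_biases(samples, num_neighbors=2))
```

Note that the first component recovered this way is the direct bias as a whole; nothing in the choice data alone separates it into inherent and marketing parts.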

Fig. 3. Estimates $\hat{\theta}_{c_1}$ of the direct bias for hub consumer $c_1$ in Fig. 1a as a function of sample complexity. Note that the estimates converge to a direct bias of $\hat{\theta}_{c_1} = -2$, which favors Company B. Note also that this estimate does not disambiguate $\alpha_{c_1} + m^{c_1}_A - m^{c_1}_B$. In Sects. 6.2 and 6.3 we will assume that this estimation is performed by Company A and used in the evaluation of candidate marketing allocations.

The reason for focusing on estimating consumer preferences rather than simply social media posts is that our ultimate concern is whether a consumer will purchase a product or vote for a candidate. Therefore, we want to go from posts back to states of mind, i.e., preferences with respect to Product alternatives. This implies that one has to be somewhat judicious regarding what posts one “pulls” for the purposes of estimating network biases.

6.2 Estimating Marketing Response

The relationship between marketing investment in a consumer and the resulting perceived utilities miA and miB would, in practice, be determined by market research. Such a relationship between stimulus (i.e., marketing) intensity and perception of value will likely obey the well-known Weber-Fechner [6] or Stevens [36] Laws. For example, there would be a saturation effect where additional investment has only negligible influence on consumer choice. Moreover, the response of a consumer to marketing by a Company would likely depend on any inherent bias of the consumer towards Products A or B. For example, if a consumer has a bias towards one or the other product, we would expect that he will be less responsive to marketing from both Companies than if he has no bias. For instance, if a consumer is biased in favor of Product A, marketing by Company A will only incrementally add to the effective attractiveness of Product A. On the other hand, if a consumer is biased in favor of Product B, marketing by Company B can only do so much to attract the consumer. Figure 4 illustrates hypothetical marketing responses for a consumer.

Fig. 4. Hypothetical marketing responses indicating marketing bias miA applied to consumer i by Company A as a function of investment diA , for different values of inherent bias αi .
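To make the qualitative shape of such a response concrete, the following Python sketch uses one purely hypothetical functional form: logarithmic growth in investment, in the spirit of a Weber-Fechner-type law, damped by the magnitude of the consumer's inherent bias. Neither the form nor the parameters `gain` and `scale` come from market research or from Fig. 4; they are assumptions for illustration only.

```python
import math


def marketing_response(investment, inherent_bias, gain=1.0, scale=1.0):
    """Hypothetical marketing bias m bought by a given investment d.

    Logarithmic growth in d gives diminishing returns; dividing by
    1 + |alpha| makes biased consumers less responsive. Both are assumptions.
    """
    if investment < 0:
        raise ValueError("investment must be non-negative")
    return gain * math.log1p(investment / scale) / (1.0 + abs(inherent_bias))


if __name__ == "__main__":
    for alpha in (0.0, 1.0, 2.0):
        curve = [round(marketing_response(d, alpha), 3) for d in range(6)]
        print("alpha =", alpha, "->", curve)
```

In practice the response curve would be fitted per consumer, or per consumer segment, from surveys, focus groups, or A/B tests rather than postulated.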

In the previous section, Company A forms an estimate

$$\tilde{\theta}_{c_1} = \tilde{\alpha}_{c_1} + \tilde{m}^{c_1}_A - \tilde{m}^{c_1}_B \tag{12}$$

of the direct bias at consumer $c_1$. If Company A knows the shape of consumer $c_1$’s marketing response, for example as illustrated in Fig. 4, then given an estimate $\tilde{\alpha}_{c_1}$ of consumer $c_1$’s inherent bias, Company A can form an estimate $\tilde{m}^{c_1}_A$ of the marketing strength applied to consumer $c_1$ at the relevant level of investment. Company A can then estimate $\tilde{m}^{c_1}_B$ from (12). In particular, it can determine whether a direct bias at consumer $c_1$ in favor of Product B is due to inherent bias or marketing. We will see in the next section that being able to disambiguate inherent and marketing bias can have dramatic consequences for the allocation of marketing resources. Estimating inherent biases and marketing responses can be carried out with surveys, focus groups, and A/B testing.

6.3 Simulation and Optimization

The estimated direct and social biases from Sect. 6.1 will be combined with the estimated marketing responses from Sect. 6.2 into a model of network decision-making for candidate marketing allocations. Company A will in general simulate the corresponding choice dynamics (11) and select the allocation that optimizes expected market share. To illustrate, we consider a simplified scenario: symmetric social biases on the star-chain network of Fig. 1a, where all consumers except the hub consumer $c_1$ have inherent bias $\alpha_i = 0$ and marketing biases $m^i_B = 0$ from Company B. We consider optimization of Company A’s allocation under three cases for the direct bias of $c_1$. The first two correspond to the estimated direct bias $\hat{\theta}_{c_1} = -2$ shown in Fig. 3. In other words, how does Company A’s optimal allocation depend on whether that direct bias is an inherent bias in favor of Company B versus a marketing bias from Company B? The third case considers that consumer $c_1$ has an inherent bias in favor of Company A. Blume [4] shows that if the social biases are symmetric, that is, $\theta_{j \to i} = \theta_{i \to j} = \theta_{ij}$ for all pairs of neighboring consumers $i$ and $j$, then the dynamics of (11) converge to the equilibrium Gibbs distribution given by

$$p(x; \theta) = \frac{1}{Z(\theta)} \exp\Big\{ \sum_{\{i,j\}} \theta_{ij} x_i x_j + \sum_{i \in V} \theta_i x_i \Big\}, \tag{13}$$

where

$$Z(\theta) = \sum_{x} \exp\Big\{ \sum_{\{i,j\}} \theta_{ij} x_i x_j + \sum_{i \in V} \theta_i x_i \Big\}$$

is the (global) partition function. The total bias (1) can be computed as

$$\sum_{k \in V} \mu_k = \sum_{k \in V} \frac{Z_k(A) - Z_k(B)}{Z_k(A) + Z_k(B)},$$

where the vector $Z_k$, with components

$$Z_k(A) = \sum_{x : x_k = A} \exp\Big\{ \sum_{\{i,j\} \in E} \theta_{ij} x_i x_j + \sum_{i \in V} \theta_i x_i \Big\} \tag{14}$$

and

$$Z_k(B) = \sum_{x : x_k = B} \exp\Big\{ \sum_{\{i,j\} \in E} \theta_{ij} x_i x_j + \sum_{i \in V} \theta_i x_i \Big\},$$

is the belief for consumer $k$. For the star-chain network, we can compute $Z_k(A)$ and $Z_k(B)$ using Belief Propagation [41]. Figure 5 illustrates the total bias that results from Company A allocating a marketing bias $m_i^A$ to a single consumer in this star-chain network, for the three different cases of direct bias at the hub consumer $c_1$. We see that if Company A knows only that there is a direct bias in favor of B, without knowing whether that bias is due to inherent bias or the effect of marketing, then Company A could actually make the worst marketing allocation when it thinks it is making the best. That is, Company A placing a unit of marketing allocation at the hub $c_1$ is optimal if the direct bias at $c_1$ in favor of B is due to marketing, but is the worst allocation if it is an inherent bias in favor of B. This comports with the intuition that marketing to consumers who are loyal towards an opposing brand is not a good investment. On the other hand, if the direct bias at $c_1$ is an inherent bias in favor of Company A, then Company A allocating its single unit of marketing to $c_1$ likewise results in the lowest possible total bias. The reason is that consumer $c_1$'s inherent bias towards Product A already serves as a form of marketing to his neighbors, and thus greater total bias can be achieved by Company A allocating elsewhere.
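As a rough illustration of the quantities $Z_k(A)$, $Z_k(B)$ and the total bias defined above, the following sketch computes them by brute-force enumeration rather than by Belief Propagation, which is feasible only for very small networks. It assumes the encoding A = +1 and B = −1, a single shared social bias `theta_edge` on every edge, and a small hypothetical star-chain graph `EDGES`; none of these are taken from the paper. Direct biases are supplied as given, so the marketing response that maps investment to bias is not modeled here.

```python
import itertools
import math


def total_bias(edges, theta_edge, theta_node):
    """Sum over k of (Z_k(A) - Z_k(B)) / (Z_k(A) + Z_k(B)), by enumeration.

    Assumption: choices are encoded as A = +1 and B = -1, and every edge
    carries the same symmetric social bias theta_edge.
    """
    n = len(theta_node)
    z_a = [0.0] * n  # Z_k(A): total weight of configurations with x_k = A
    z_b = [0.0] * n  # Z_k(B): total weight of configurations with x_k = B
    for x in itertools.product((1, -1), repeat=n):
        exponent = sum(theta_edge * x[i] * x[j] for i, j in edges)
        exponent += sum(t * xi for t, xi in zip(theta_node, x))
        weight = math.exp(exponent)
        for k in range(n):
            if x[k] == 1:
                z_a[k] += weight
            else:
                z_b[k] += weight
    return sum((a - b) / (a + b) for a, b in zip(z_a, z_b))


# Hypothetical toy network: hub 0 with leaves 1 and 2, plus a chain 0-3-4.
EDGES = [(0, 1), (0, 2), (0, 3), (3, 4)]
# Hypothetical direct biases: hub favors B with strength 2, others neutral.
THETA_NODE = [-2.0, 0.0, 0.0, 0.0, 0.0]

print(total_bias(EDGES, 0.5, THETA_NODE))
```

Comparing candidate allocations would amount to calling `total_bias` once per candidate vector of direct biases and keeping the largest value. The enumeration visits $2^{|V|}$ configurations, so it serves only as a check on small examples.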

Fig. 5. Total bias on star-chain network with a direct bias at consumer $c_1$. The direct bias is either an inherent bias in favor of Product B, i.e., $m^B_{c_1} = 0$ and $\alpha_{c_1} = -2$; an applied marketing bias from Company B, i.e., $m^B_{c_1} = 2$ and $\alpha_{c_1} = 0$; or an inherent bias in favor of Product A, i.e., $m^B_{c_1} = 0$ and $\alpha_{c_1} = 2$.

7 Summary, Limitations, and Future Directions

We have derived the marketing-influenced parametrization of socially-contingent consumer choice, discussed the estimation of network biases from data, and analyzed a star-chain network to illustrate optimization of marketing allocation. The limitations of this paper mainly concern the work that remains to be done in order to implement this program on a large scale, for example, examining real data from a social network and implementing machine learning algorithms to infer consumer preferences. This will inevitably require us to understand asynchronous choice dynamics, as users post content at varying rates. Moreover, the simplified scenario in this paper consisted of symmetric social biases, which in the case of two products lead to a tractable equilibrium Gibbs distribution. In general, social biases will not be symmetric and the resulting dynamics will not have an equilibrium. For instance, social connections on Twitter are asymmetric. Preliminary analysis with asymmetric social biases suggests that, while not possessing an equilibrium per se, such asymmetric dynamics do have as a stationary distribution the Gibbs equilibrium corresponding to symmetric dynamics, albeit with different direct biases. This would be significant for actual implementation of this model, as it has been shown [42] that suboptimal variational methods with respect to an equilibrium Gibbs model can yield better performance than exact Monte Carlo simulation when computing resources are at a premium. Furthermore, as highlighted in [10], it is important to consider the case where the choice dynamics themselves are non-stationary. Social connections and biases change over time, and it is unlikely that a given set of network biases will remain constant long enough for a stationary distribution to be attained. Again, preliminary work with asymmetric social biases makes some inroads into transient phase analysis. Lastly, there is considerable need to focus on market research to understand marketing responses and on the use of such information to disambiguate estimated direct biases.

References

1. Abebe, R., Kleinberg, J., Parkes, D., Tsourakakis, C.E.: Opinion dynamics with varying susceptibility to persuasion. In: The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), KDD 2018, London, UK, August 2018
2. Barabasi, A.-L., Bonabeau, E.: Scale-free networks. Sci. Am. 288, 60–69 (2003)
3. Bell, D.E., Keeney, R.L., Little, J.D.C.: A market share theorem. J. Mark. Res. 12(2), 136–141 (1975)
4. Blume, L.E.: Statistical mechanics of strategic interaction. Games Econ. Behav. 5(3), 387–424 (1993)
5. Block, H.D., Marschak, J.: Random orderings and stochastic theories of responses. In: Contributions to Probability and Statistics. Stanford University Press (1960)
6. Britt, S.H.: How Weber's Law can be applied to marketing. Bus. Horiz. 18(1), 21–29 (1975)
7. Brock, W.A., Durlauf, S.N.: Discrete choice with social interactions. Rev. Econ. Stud. 68, 235–260 (2001)
8. Cambria, E.: Affective computing and sentiment analysis. IEEE Intell. Syst. 31(2), 102–107 (2016)
9. Chierichetti, F., Kleinberg, J., Oren, S.: On discrete preferences and coordination. In: ACM Conference on Electronic Commerce, pp. 233–250 (2013)
10. De, A., Bhattacharya, S., Bhattacharya, P., Ganguly, N., Chakrabarti, S.: Learning a linear influence model from transient opinion dynamics. In: The 23rd ACM International Conference on Information and Knowledge Management, pp. 401–410. ACM (2014)
11. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 39(1), 1–38 (1977)
12. Domingos, P., Richardson, M.: Mining the network value of customers. In: Proceedings of SIGKDD 2001, San Francisco, CA, pp. 57–66 (2001)
13. Ellison, G.: Learning, local interaction, and coordination. Econometrica 61(5), 1047–1071 (1993)
14. Fazeli, A., Jadbabaie, A.: Game Theoretic Analysis of a Strategic Model of Competitive Contagion and Product Adoption in Social Networks, December 2012. http://repository.upenn.edu/ese_papers/618
15. Gladwell, M.: The Tipping Point. Little, Brown and Company (2000)
16. Glauber, R.J.: Time-dependent statistics of the Ising model. J. Math. Phys. 4, 294–307 (1963)
17. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
18. Green, P.E., Carmone, F.J., Wachspress, D.P.: On the analysis of qualitative data in marketing research. J. Mark. Res. 14, 52–59 (1977)
19. Kempe, D., Kleinberg, J., Tardos, E.: Influential nodes in a diffusion model for social networks. In: Proceedings of the 32nd International Conference on Automata, Languages, and Programming (ICALP), pp. 1127–1138 (2005)
20. Kahneman, D., Tversky, A.: Prospect theory: an analysis of decision under risk. Econometrica 47(2), 278 (1979)
21. Kim, D.-H., Noh, J.D., Jeong, H.: Scale-free trees: the skeletons of complex networks. Phys. Rev. E 70, 046126 (2004)
22. Lasswell, H.D.: The structure and function of communication in society. In: Bryson, L. (ed.) The Communication of Ideas. Harper and Brothers (1948)
23. Lazarsfeld, P.F., Berelson, B., Gaudet, H.: The People's Choice: How the Voter Makes up his Mind in a Presidential Campaign. Columbia University Press, New York City (1944)
24. Luce, D.: Individual Choice Behavior. Dover, Illinois (1959)
25. Malhotra, N.K.: The use of linear logit models in marketing research. J. Mark. Res. 21, 20–31 (1984)
26. McFadden, D.: Conditional logit analysis of qualitative choice behavior. In: Frontiers in Econometrics. Academic Press, New York (1974)
27. Montanari, A., Saberi, A.: The spread of innovations in social networks. PNAS 107(47), 20196–20201 (2010)
28. Morris, S.: Contagion. Rev. Econ. Stud. 67, 57–78 (2000)
29. Ogilvy, D.: Ogilvy On Advertising. Vintage, New York City (1985)
30. Ravi, K., Ravi, V.: A survey on opinion mining and sentiment analysis: tasks, approaches and applications. Knowl.-Based Syst. 89, 14–46 (2015)
31. Reyes, A., Rosso, R., Buscaldi, D.: From humor recognition to irony detection: the figurative language of social media. Data Knowl. Eng. 74, 1–12 (2012)
32. Reyes, M.G., Neuhoff, D.L.: Minimum conditional description length estimation of Markov random fields. In: Information Theory and Applications Workshop, February 2016
33. Reyes, M.G.: A marketing game: a rigorous model for strategic resource allocation. In: ACM Workshop on Machine Learning in Graphs, London, UK, August 2018
34. Richardson, M., Domingos, P.: Mining the network value of customers. In: Proceedings of SIGKDD, Edmonton, Alberta, Canada (2002)
35. Simon, H.A.: A behavioral model of rational choice. Q. J. Econ. 69(1), 99–118 (1955)
36. Stevens, S.S.: To honor Fechner and repeal his law. Science 133(3446), 80–86 (1961)
37. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
38. Train, K.: Discrete Choice Models with Simulation. Cambridge University Press, Cambridge (2002)
39. Thurstone, L.L.: Psychological analysis. Am. J. Psychol. 38(3), 368–389 (1927)
40. Tversky, A.: Elimination by aspects: a theory of choice. Psychol. Rev. 79(4), 281 (1972)
41. Wainwright, M., Jordan, M.: Graphical models, exponential families, and variational inference. Technical report, UC Berkeley (2003)
42. Wainwright, M.J.: Estimating the "wrong" graphical model: benefits in the computation-limited setting. J. Mach. Learn. Res. 7, 1829–1859 (2006)
43. Watts, D.J., Strogatz, S.H.: Collective dynamics of 'small-world' networks. Nature 393, 440 (1998)
44. Watts, D.J., Dodds, P.S.: Influentials, networks, and public opinion formation. J. Consum. Res. 34, 441–458 (2007)
45. Young, H.P.: The evolution of conventions. Econometrica 61(1), 57–84 (1993)

Acoustic Event Detection with Sequential Attention and Soft Boundary Information

Jingjing Pan1(B) and Xianjun Xia2

1 China University of Mining and Technology, Xuzhou 221008, Jiangsu, China [email protected]
2 The University of Western Australia, Perth, WA 6009, Australia [email protected]

Abstract. Acoustic event detection aims to perceive the surrounding auditory sound and is popularly performed by multi-label classification based approaches. The concatenated acoustic features of consecutive frames and the hard boundary labels are adopted as the input and output, respectively. However, the different input frames are treated equally and the hard boundary based outputs are error-prone. To deal with these problems, this paper proposes to utilize sequential attention together with soft boundary information. Experimental results on the latest TUT Sound Event database demonstrate the superior performance of the proposed technique.

Keywords: Acoustic event detection · Multi-label classification · Sequential attention · Soft boundary

1 Introduction

Acoustic event detection (AED) aims to determine the event types and the boundaries of the active acoustic events (AE). AED is useful in various automatic monitoring systems [1], indoor and outdoor activities [2] and human-computer interactions [3,4]. AED remains challenging and largely unresolved due to large intra-class variations in event durations, non-stationary background noise, and the polyphonic characteristic of acoustic events. Many recent works [5–10] and various campaigns [11–13] have addressed the challenges facing AED. The detection of acoustic events is typically applied over sliding time windows, with statistical models adopted to represent the acoustic features. The neural network based multi-label classification framework is popularly applied to polyphonic acoustic event detection due to the advances of deep learning in image and acoustic signal processing. Frame based acoustic features, e.g. Mel-Frequency Cepstral Coefficients (MFCCs) [14] and log mel-band energies [15], are adopted as the classifier inputs. To capture the context information, the acoustic features of consecutive frames are usually concatenated. In [16–18], Deep Neural Networks (DNNs) with concatenated features are applied to AED. Recurrent Neural Networks (RNNs) [15,19] and Convolutional Recurrent Neural Networks (CRNNs) [7,20] have been used to capture the context information when performing AED.

© Springer Nature Switzerland AG 2020. K. Arai and R. Bhatia (Eds.): FICC 2019, LNNS 69, pp. 893–903, 2020. https://doi.org/10.1007/978-3-030-12388-8_60

However, the aforementioned approaches treat the input features equally. For the DNN based approaches [17,18], the acoustic features of different frames are concatenated with the same importance. In [15,19], the RNN was applied to capture the sequential information. Although the Long Short Term Memory (LSTM) or the Gated Recurrent Unit (GRU) scheme was adopted to address the vanishing gradient problem, the importance of the previous frames varies, and the earliest frames may be more important than the neighboring frames. Another problem is that the manually labeled outputs are hard boundary based and error-prone, especially when the acoustic events overlap or are too short to be annotated accurately. The authors in [21] adopted soft boundary based labels as the classifier outputs to overcome the error-prone characteristic of the hard boundary based outputs. However, the acoustic features of consecutive frames were still treated equally and assigned the same importance.

Inspired by the recent advances of the sequential attention scheme in natural language processing [22] and the superior performance of the soft boundary based output labels [21], this paper proposes to apply sequential attention to the input space and soft boundary based labels to the output space. The sequential attention is represented by different weights assigned to different frames. The soft boundary based labels are represented by frame-wise confidence measures. The closer the current frame is to the middle of the manually labeled event, the higher the confidence measure will be.
The advantages of the proposed approach with sequential attention and soft boundary information are two-fold: (1) sequential attention in time helps the AED system capture higher-level context information, so that information from frames far in the past can also be captured with higher weights; and (2) the soft boundary based labels utilize the internal boundary information of the acoustic events, which facilitates acoustic boundary detection. The rest of this paper is organized as follows. In Sect. 2, the multi-label classification based baseline AED system is introduced. The proposed ideas and algorithms are elaborated in Sect. 3. In Sect. 4, the experimental configurations and results are given, followed by the conclusion and future work in Sect. 5.

2 Multi-Label Classification Based AED

The multi-label classification based training space of an AED system can be written as:

$$\Omega = \{I_t, CO_t\} \qquad (1)$$

Here, $t$ is the time index and $I_t$ is the concatenated acoustic features of $L$ consecutive frames, which can be written as:

$$I_t = \{F_{t-L+1}, F_{t-L+2}, \ldots, F_t\} \qquad (2)$$

where $F_t$ denotes the acoustic features at time $t$. $CO_t$ is the hard boundary based training output label vector, which is derived from the manually labeled beginning and end times of the acoustic events. The classification based output label vector $CO_t$ at frame index $t$ is in binary format, with one entry per acoustic event class $c$, and can be written as:

$$CO_t = \{CO_{t,1}, CO_{t,2}, \ldots, CO_{t,c}, \ldots, CO_{t,C}\} \qquad (3)$$

where $CO_{t,c}$ equals 1 when the $c$th event class is active at time index $t$ and $C$ is the total number of acoustic event classes of interest. During training, the sigmoid function is used in the output layer, as some acoustic events may overlap, and the cross-entropy is adopted as the loss function, which is expressed as:

$$J_{CE}(W_c, b_c; S) = \frac{1}{T} \sum_{t=1}^{T} J_{CE}(W_c, b_c; F_t, CO_t) + \frac{\lambda}{2} \|W_c\|^2 \qquad (4)$$

where $W_c$ and $b_c$ are the weight and bias parameters of the trained acoustic models and $\lambda$ is the regularization weight. $J_{CE}(W_c, b_c; F_t, CO_t)$ is expressed as:

$$J_{CE}(W_c, b_c; F_t, CO_t) = -\sum_{c=1}^{C} \log q_c \qquad (5)$$

Here $q_c$ is the probability $P_{NN}(c \mid F_t)$ estimated by the neural network. During testing, with the trained acoustic classifier and the given test audio stream, each time index $t$ corresponds to $C$ output probabilities, expressed as:

$$cp_t = \{cp_{t,1}, cp_{t,2}, \ldots, cp_{t,c}, \ldots, cp_{t,C}\} \qquad (6)$$

where $cp_{t,c}$ represents the probability that the current frame $t$ belongs to the $c$th event type. Afterwards, a global threshold $\tau$, which is set empirically, is applied to the output probabilities. Event classes with a probability higher than the global threshold are detected as the final active acoustic events.
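To make the input and decision steps of the baseline concrete, the following minimal sketch builds the concatenated inputs of Eq. (2) and applies the global threshold to the output probabilities of Eq. (6). The function names `stack_context` and `detect_events` are hypothetical, and dropping the first $L-1$ frames (rather than padding them) is an assumption, since the text does not say how the start of the stream is handled. The classifier itself is omitted.

```python
def stack_context(frames, L):
    """Build I_t = {F_{t-L+1}, ..., F_t} for every t with a full history.

    frames: list of per-frame feature lists (F_0, F_1, ...).
    Assumption: frames without L-1 predecessors are skipped, not padded.
    """
    return [
        [value for frame in frames[t - L + 1:t + 1] for value in frame]
        for t in range(L - 1, len(frames))
    ]


def detect_events(probs, tau):
    """Return, per frame, the indices c whose probability cp_{t,c} exceeds tau."""
    return [
        [c for c, p in enumerate(frame_probs) if p > tau]
        for frame_probs in probs
    ]


# Toy data: four frames with two features each, context length L = 3.
toy_frames = [[0, 1], [2, 3], [4, 5], [6, 7]]
print(stack_context(toy_frames, 3))

# Toy classifier outputs for two frames and three event classes.
toy_probs = [[0.2, 0.7, 0.9], [0.5, 0.1, 0.4]]
print(detect_events(toy_probs, 0.5))
```

Because the comparison is strict, a probability exactly equal to $\tau$ is not reported as active, which matches the wording "higher than the global threshold".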

3 AED with Sequential Attention and Soft Boundary

The flowchart of the proposed AED system is shown in Fig. 1. The whole AED system consists of the input space module and the output space module. As shown in Fig. 1, the concatenated acoustic features $I_t$ of the baseline system are replaced by an RNN based input space, and the hard boundary based output labels $CO_t$ are replaced by the soft boundary based confidence measures $p(t)$, which can be written as:

$$p(t) = \{p(t)_1, p(t)_2, \ldots, p(t)_c, \ldots, p(t)_C\} \qquad (7)$$

where $p(t)_c$ denotes the assigned confidence that frame $t$ belongs to the $c$th acoustic event of interest.


Fig. 1. The flowchart of the acoustic event detection with sequential attention and soft boundary.

In this section, the calculation of the soft boundary based confidence measures [21] and the LSTM based RNN unit are briefly introduced. Then the application of the sequential attention scheme is elaborated. By utilizing the sequential attention and the soft boundary information, the acoustic features of more predictive frames receive larger weights and the error-prone characteristic of the labeled acoustic events is overcome.

3.1 Soft Boundary Based Confidence Measure

Figure 1 shows how the confidence measures $p(t)$ are calculated for the acoustic event "car". The rectangular solid boxes denote the manually labeled results of the acoustic event "car" and the dotted parabolic lines denote the corresponding confidence for each frame. The reason for adopting parabolic lines is given in [21]. $LB$ and $LE$ denote the manually labeled beginning and end boundaries, respectively. However, due to the labeling inaccuracies in human annotation at the boundaries, the actual time at which the acoustic event occurs may differ from the manually labeled boundaries. Hence, we assume a soft boundary with different confidences. The closer the current frame is to the centre of the manually labeled acoustic event, the higher the confidence $p(t)_{car}$ will be. $SB$ and $SE$ denote the soft beginning and end times, respectively. When the confidence at the manually labeled boundaries is pre-fixed to $v$ ($v$ is experimentally set to 0.3 based on the development set), the frame-wise confidence measures, the soft beginning time $SB$ and the soft end time $SE$ can be expressed as:

$$p(t)_{car} = \Big\{ 1 - (1 - v) \Big( \frac{2t - LB - LE}{LE - LB} \Big)^2 \Big\} I(t) \qquad (8)$$

$$SB = \max\Big\{ \frac{LE + LB - (LE - LB) \frac{1}{\sqrt{1 - v}}}{2},\ 0 \Big\} \qquad (9)$$

$$SE = \min\Big\{ \frac{LE + LB + (LE - LB) \frac{1}{\sqrt{1 - v}}}{2},\ LA \Big\} \qquad (10)$$

Here, the soft beginning time $SB$ is bounded below by 0, the soft end time $SE$ is bounded above by the given maximum length $LA$ of the audio file, and $I(t)$ is defined as:  1 (SB