Intelligence Enabled Research: DoSIER 2020 [1st ed.] 9789811592898, 9789811592904

This book gathers extended versions of papers presented at DoSIER 2020 (the Second Doctoral Symposium on Intelligence En


English Pages XVII, 123 [134] Year 2021


Table of contents :
Front Matter ....Pages i-xvii
DEBM: Differential Evolution-Based Block Matching Algorithm (Abhishek Dixit, Ashish Mani, Rohit Bansal)....Pages 1-9
Multi-modality of Occupants’ Actions for Multi-Objective Building Energy Management (Monalisa Pal, Sanghamitra Bandyopadhyay)....Pages 11-19
A Novel Self-adaptive Salp Swarm Algorithm for Dynamic Optimization Problems (Sanjai Pathak, Ashish Mani, Mayank Sharma, Amlan Chatterjee)....Pages 21-32
Digital ID Generation and Management Framework Using Blockchain (Suchira Banerjee, Kousik Dasgupta)....Pages 33-43
HFAIR: Hello Devoid Optimized Version of FAIR Protocol for Mobile Ad hoc Networks (Abu Sufian, Anuradha Banerjee, Paramartha Dutta)....Pages 45-54
Performance Evaluation of Language Identification on Emotional Speech Corpus of Three Indian Languages (Joyanta Basu, Swanirbhar Majumder)....Pages 55-63
Disaster Severity Prediction from Twitter Images (Abhinav Kumar, Jyoti Prakash Singh)....Pages 65-73
A Study on Energy-Efficient Communication in VANETs Using Cellular IoT (R. N. Channakeshava, Meenatchi Sundaram)....Pages 75-85
Recognition of Transforming Behavior of Human Emotions from Face Video Sequence: A Triangulation-Induced Circumradius-Incenter-Circumcenter Combined Approach (Md Nasir, Paramartha Dutta, Avishek Nandi)....Pages 87-96
A Study on Radio Labelling of Evolving Trees for Path \(P_n\) (Alamgir Rahaman Basunia, Laxman Saha, Kalishankar Tiwary)....Pages 97-104
Secure Blockchain Smart Contracts for Efficient Logistics System (Ajay Kumar, Kumar Abhishek)....Pages 105-112
COVID-19 Outbreak Prediction Using Quantum Neural Networks (Pranav Kairon, Siddhartha Bhattacharyya)....Pages 113-123


Advances in Intelligent Systems and Computing 1279

Siddhartha Bhattacharyya Paramartha Dutta Kakali Datta   Editors

Intelligence Enabled Research DoSIER 2020

Advances in Intelligent Systems and Computing Volume 1279

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Győr, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong

The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. Indexed by SCOPUS, DBLP, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST), SCImago. All books published in the series are submitted for consideration in Web of Science.

More information about this series at http://www.springer.com/series/11156


Editors Siddhartha Bhattacharyya CHRIST (Deemed to be University) Bangalore, India

Paramartha Dutta Visva-Bharati University Bolpur Santiniketan, India

Kakali Datta Visva-Bharati University Bolpur Santiniketan, India

ISSN 2194-5357 ISSN 2194-5365 (electronic) Advances in Intelligent Systems and Computing ISBN 978-981-15-9289-8 ISBN 978-981-15-9290-4 (eBook) https://doi.org/10.1007/978-981-15-9290-4 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Dr. Siddhartha Bhattacharyya would like to dedicate this book to Prof. Balachandran Krishnan, Head, Department of Computer Science and Engineering, CHRIST (Deemed to be University), Kengeri Campus, Bangalore, India, and to all his colleagues in the department.

Dr. Paramartha Dutta would like to dedicate this book to his parents, Late Arun Kanti Dutta and Late Bandana Dutta.

Dr. Kakali Datta would like to dedicate this book to her father, Dr. Jayanta Kumar Datta, and her mother, Mrs. Shyamali Datta.

Preface

Computational Intelligence (CI) has become a buzzword in today’s workaday world. The use of CI-enabled techniques has gained impetus over the last few decades. At present, it is difficult to think of any technological advancement that does not have CI as one of its integral components. The impact is so vast that almost every facet of modern human civilization has embraced intelligence-enabled tools and techniques to achieve failsafe and robust end solutions. CI has had its most compelling influence in signal and data processing, smart manufacturing, predictive control, robot navigation, smart cities, and sensor design, to name a few. In tune with rapidly changing technical innovations, the Government of India has also come forward to promote the use of computational intelligence in different sectors. The Doctoral Symposium on Intelligence Enabled Research is one such attempt in this direction. The 2019 First Doctoral Symposium on Intelligence Enabled Research (DoSIER 2019) was organized by RCC Institute of Information Technology, Kolkata, India, during October 19–20, 2019, with the aim of giving doctoral students and early career researchers an opportunity to interact with colleagues working on the foundations, techniques, tools, and applications of computational intelligence, thereby creating an intelligent workforce for the future. The 2020 Second Doctoral Symposium on Intelligence Enabled Research (DoSIER 2020), organized by Visva-Bharati University, Santiniketan, India, during August 12–13, 2020, continues this tradition in the making. Owing to the severe pandemic affecting the world’s socioeconomic structure, DoSIER 2020 was held in virtual mode on the Webex Meet platform, while following all the stipulations maintained in a physical mode of organization. More than 100 participants attended the online proceedings of DoSIER 2020.

The symposium was technically sponsored by IEEE Computational Intelligence, Kolkata Chapter. DoSIER 2020 featured three keynotes delivered by eminent researchers and academicians from across the globe, along with two technical tracks. The keynote speakers were (i) Prof. Vaclav Snasel, VSB-Technical University of Ostrava, Ostrava, Czech Republic, (ii) Prof. Prasun Chakrabarti, Techno India NJR Institute of Technology, Udaipur, India, and (iii) Mr. Aninda Bose, Senior Editor, Springer India.


DoSIER 2020 received a good number of submissions from doctoral students in the country. After peer review, only 13 papers were accepted for presentation at the conference. Authors from different parts of the country presented their peer-reviewed articles under the technical tracks of DoSIER 2020.

August 2020

Siddhartha Bhattacharyya, Bangalore, India
Paramartha Dutta, Santiniketan, India
Kakali Datta, Santiniketan, India

DoSIER 2020 Committees

Honorary Chair Dr. Kalyanimoy Deb, Michigan State University, USA

General Chairs Dr. Siddhartha Bhattacharyya, CHRIST (Deemed to be University), Bangalore, India Dr. Paramartha Dutta, Visva-Bharati University, Santiniketan, India Dr. Kakali Datta, Visva-Bharati University, Santiniketan, India

Program Chairs Mr. Debaditya Barman, Visva-Bharati University, Santiniketan, India Dr. Debabrata Samanta, CHRIST (Deemed to be University), Bangalore, India Mr. Tathagato Mukhopadhyay, Visva-Bharati University, Santiniketan, India

Technical Co-Chairs Mr. Arindam Karmakar, Visva-Bharati University, Santiniketan, India Dr. B. K. Tripathy, VIT University, India Mr. Avishek Nandi, Visva-Bharati University, Santiniketan, India Md. Nasir, Visva-Bharati University, Santiniketan, India Ms. Mili Ghosh, Visva-Bharati University, Santiniketan, India Dr. Ashish Mani, Amity University, India


Organizing Secretaries Prof. Alak Kumar Datta, Visva-Bharati University, Santiniketan, India Mr. Madhusudan Paul, Visva-Bharati University, Santiniketan, India Prof. Utpal Roy, Visva-Bharati University, Santiniketan, India

International Advisory Committee Dr. Vincenzo Piuri, Universita’ degli Studi di Milano, Italy Dr. Debotosh Bhattacharjee, Jadavpur University, India Dr. Sushmita Mitra, ISI Kolkata, India Dr. Aboul Ella Hassanien, Cairo University, Egypt Dr. Ujjwal Maulik, Jadavpur University, India Dr. Asit K. Datta, University of Calcutta, India Dr. Elizabeth Behrman, Wichita State University, USA Dr. Nikhil R. Pal, ISI Kolkata, India Dr. Mita Nasipuri, Jadavpur University, India Dr. Xiao-Zhi Gao, University of Eastern Finland, Finland Dr. Leo Mrši´c, Algebra University College, Croatia Dr. Debashish Chakravarty, IIT Kharagpur, India

Technical Program Committee
Dr. Nilanjan Dey, Techno International New Town, Kolkata, India
Dr. Sourav De, Cooch Behar Government Engineering College, India
Dr. Jyoti Prakash Singh, NIT Patna, India
Dr. Rik Das, Xavier Institute of Social Service, India
Dr. Anasua Sarkar, Jadavpur University, India
Dr. Eduard Babulak, Liberty University, USA
Dr. Kakali Datta, Visva-Bharati University, India
Dr. Koushik Mondal, IIT Dhanbad, India
Dr. Kousik Dasgupta, Kalyani Government Engineering College, India
Dr. Satadal Saha, MCKVIE, India
Dr. Indradip Banerjee, University Institute of Technology, Burdwan University, India
Dr. Debasish Mondal, RCC Institute of Information Technology, Kolkata, India
Dr. Debashis De, MAKAUT, India
Dr. Mihaela Albu, Politehnica University of Bucharest, Romania
Dr. Mariofanna Milanova, University of Arkansas at Little Rock, USA
Dr. Ashish Mani, Amity University, India
Dr. Tanusree Chatterjee, Techno International New Town, India
Dr. Debajyoti Mukhopadhyay, NHITM, India
Dr. Debarka Mukhopadhyay, CHRIST (Deemed to be University), Bangalore, India
Dr. Abhishek Bhattacharya, Institute of Engineering and Management, India
Dr. Surbhi Bhatia, King Faisal University, Saudi Arabia
Ms. Pampa Debnath, RCC Institute of Information Technology, Kolkata, India
Dr. Kolla Bhanu Prakash, K L University, India
Dr. Srijibendu Bagchi, RCC Institute of Information Technology, Kolkata, India
Mr. Soumen Mukherjee, RCC Institute of Information Technology, Kolkata, India
Mr. Swalpa K. Roy, Jalpaiguri Government Engineering College, India
Dr. Debajyoty Banik, KIIT University, India
Dr. Shyantani Maiti, RCC Institute of Information Technology, Kolkata, India
Dr. Soham Sarkar, RCC Institute of Information Technology, Kolkata, India
Dr. Sandip Dey, Sukanta Mahavidyala, India
Dr. Abhishek Basu, RCC Institute of Information Technology, Kolkata, India
Mr. Debanjan Konar, Sikkim Manipal Institute of Technology, East Sikkim, India
Dr. Anirban Mukherjee, RCC Institute of Information Technology, Kolkata, India
Dr. Indrajit Pan, RCC Institute of Information Technology, Kolkata, India
Dr. Mousumi Gupta, Sikkim Manipal Institute of Technology, East Sikkim, India
Dr. Anirban Das, University of Engineering and Management, India
Dr. Tirthankar Ghosal, Sikkim Manipal Institute of Technology, East Sikkim, India
Mr. Siddhartha Chatterjee, GMIT, India
Dr. Abhijit Das, RCC Institute of Information Technology, Kolkata, India
Mr. Arpan Deyasi, RCC Institute of Information Technology, Kolkata, India
Dr. Pijush Barthakur, KLS Gogte Institute of Technology, India
Dr. Tirtha Sankar Das, Purulia Government Engineering College, India
Dr. Shibakali Gupta, University Institute of Technology, Burdwan University, India
Dr. Rajarshi Gupta, University of Calcutta, India
Mr. Soumyadip Dhar, RCC Institute of Information Technology, Kolkata, India
Dr. Sanjoy Pratihar, IIT Kalyani, India
Mr. Arup Bhattacharjee, RCC Institute of Information Technology, Kolkata, India
Mr. Hiranmoy Roy, RCC Institute of Information Technology, Kolkata, India
Dr. Deepak Gupta, Maharaja Agrasen Institute of Technology, India

Publicity and Sponsorship Chairs Dr. Siddhartha Bhattacharyya, CHRIST (Deemed to be University), India Dr. Debabrata Samanta, CHRIST (Deemed to be University), India

Local Hospitality Chairs Kajol Sk, Visva-Bharati University, India


Mr. Swapan Kumar Das, Visva-Bharati University, India Mr. Tapas Chandra Ghosh, Visva-Bharati University, India

Finance Chair Mr. Subhasis Banerjee, Visva-Bharati University, India

Organizing Committee Dr. Alak Kumar Datta, Visva-Bharati University, India Mr. Arindam Karmakar, Visva-Bharati University, India Mr. Avishek Nandi, Visva-Bharati University, India Mr. Debaditya Barman, Visva-Bharati University, India Kajol Sk, Visva-Bharati University, India Dr. Kakali Datta, Visva-Bharati University, India Mr. Madhusudan Paul, Visva-Bharati University, India Ms. Mili Ghosh, Visva-Bharati University, India Md. Nasir, Visva-Bharati University, India Dr. Paramartha Dutta, Visva-Bharati University, India Mrs. Sanchita Pal Choudhuri, Visva-Bharati University, India Mr. Subhasis Banerjee, Visva-Bharati University, India Mr. Swapan Kumar Das, Visva-Bharati University, India Mr. Tapas Chandra Ghosh, Visva-Bharati University, India Mr. Tathagato Mukhopadhyay, Visva-Bharati University, India Dr. Utpal Roy, Visva-Bharati University, India.

Contents

DEBM: Differential Evolution-Based Block Matching Algorithm
Abhishek Dixit, Ashish Mani, and Rohit Bansal

Multi-modality of Occupants’ Actions for Multi-Objective Building Energy Management
Monalisa Pal and Sanghamitra Bandyopadhyay

A Novel Self-adaptive Salp Swarm Algorithm for Dynamic Optimization Problems
Sanjai Pathak, Ashish Mani, Mayank Sharma, and Amlan Chatterjee

Digital ID Generation and Management Framework Using Blockchain
Suchira Banerjee and Kousik Dasgupta

HFAIR: Hello Devoid Optimized Version of FAIR Protocol for Mobile Ad hoc Networks
Abu Sufian, Anuradha Banerjee, and Paramartha Dutta

Performance Evaluation of Language Identification on Emotional Speech Corpus of Three Indian Languages
Joyanta Basu and Swanirbhar Majumder

Disaster Severity Prediction from Twitter Images
Abhinav Kumar and Jyoti Prakash Singh

A Study on Energy-Efficient Communication in VANETs Using Cellular IoT
R. N. Channakeshava and Meenatchi Sundaram

Recognition of Transforming Behavior of Human Emotions from Face Video Sequence: A Triangulation-Induced Circumradius-Incenter-Circumcenter Combined Approach
Md Nasir, Paramartha Dutta, and Avishek Nandi

A Study on Radio Labelling of Evolving Trees for Path \(P_n\)
Alamgir Rahaman Basunia, Laxman Saha, and Kalishankar Tiwary

Secure Blockchain Smart Contracts for Efficient Logistics System
Ajay Kumar and Kumar Abhishek

COVID-19 Outbreak Prediction Using Quantum Neural Networks
Pranav Kairon and Siddhartha Bhattacharyya

Editors and Contributors

About the Editors

Dr. Siddhartha Bhattacharyya [FIET (UK), FIEI (I), FIETE, LFOSI] is currently a Professor in the Department of Computer Science and Engineering of CHRIST (Deemed to be University), Bangalore, India. Prior to this, he was the Principal of RCC Institute of Information Technology, Kolkata, India. He served as a Senior Research Scientist at the Faculty of Electrical Engineering and Computer Science of VSB Technical University of Ostrava, Czech Republic, from October 2018 to April 2019. Prior to this, he was the Professor of Information Technology at RCC Institute of Information Technology, Kolkata, India. He is a co-author of 5 books and co-editor of 68 books, and has more than 300 research publications in international journals and conference proceedings to his credit. His research interests include soft computing, pattern recognition, multimedia data processing, hybrid intelligence, and quantum computing.

Dr. Paramartha Dutta received his bachelor’s and master’s degrees in Statistics from the Indian Statistical Institute, Calcutta, in 1988 and 1990, respectively. He completed his Master of Technology in Computer Science at the same institute in 1993 and his Doctor of Philosophy in Engineering at Bengal Engineering and Science University, Shibpur, in 2005. He has served on various projects funded by the Government of India, e.g. for the Defence Research and Development Organization, Council of Scientific and Industrial Research, Indian Statistical Institute, Calcutta, etc. Dr. Dutta is currently a Professor at the Department of Computer and System Sciences, Visva-Bharati University, West Bengal, India. He has co-authored four books and has one edited book to his credit. He has published ca. 100 papers in various journals and conference proceedings, both national and international.

Dr. Kakali Datta, born in 1974, is an Assistant Professor (STAGE-III) in the Department of Computer and System Sciences, Visva-Bharati University, Santiniketan, India. She graduated from Presidency College, Kolkata, with honours in Statistics in 1995. She then earned her Bachelor of Technology degree in Computer Science and Engineering from the University of Calcutta and her Master of Engineering degree in Computer Science and Technology from Bengal Engineering College (Deemed University), Shibpur, in 1995 and 2000, respectively. She received her doctoral degree from Visva-Bharati University, Santiniketan, in 2018. Her areas of interest are quantum-dot cellular automata, image processing, approximation algorithms, etc. She has fourteen publications in various peer-reviewed journals and national/international conference proceedings, and several book chapters with reputed international publishers. She is supervising two scholars at present. She has two patents (one international and one national) to her credit.

Contributors Kumar Abhishek Department of Computer Science & Engineering, NIT Patna, Bihar, India Sanghamitra Bandyopadhyay Machine Intelligence Unit Indian Statistical Institute, Kolkata, India Anuradha Banerjee Department of Computer Application, Kalyani Government Engineering College, Kalyani, India Suchira Banerjee Kalyani Government Engineering College, Kalyani, West Bengal, India Rohit Bansal Department of Management Studies, Rajiv Gandhi Institute of Petroleum Technology, Harbanshganj, Jais, Amethi, Uttar Pradesh, India Joyanta Basu CDAC Kolkata, Kolkata, India Siddhartha Bhattacharyya CHRIST (Deemed to be University), Bangalore, India R. N. Channakeshava Department of Computer Science, Government Science College, Chitradurga, India Amlan Chatterjee California State University Dominguez Hills, Carson, CA, USA Kousik Dasgupta Kalyani Government Engineering College, Kalyani, West Bengal, India Abhishek Dixit Department of Computer Science, Amity School of Engineering and Technology, Amity University, Noida, Uttar Pradesh, India Paramartha Dutta Department of Computer & System Sciences, Visva-Bharati University, Santiniketan, West Bengal, India Pranav Kairon Delhi Technological University, Bawana, Delhi, India Abhinav Kumar National Institute of Technology Patna, Patna, India


Ajay Kumar Department of Computer Science & Engineering, NIT Patna, Bihar, India Swanirbhar Majumder Department of Information Technology, Tripura University, Agartala, Tripura, India Ashish Mani Amity Innovation & Design Centre, Amity University, Noida, Uttar Pradesh, India Avishek Nandi Visva-Bharati University, Santiniketan, West Bengal, India Md Nasir Visva-Bharati University, Santiniketan, West Bengal, India Monalisa Pal Machine Intelligence Unit Indian Statistical Institute, Kolkata, India Sanjai Pathak Amity University, Noida, Uttar Pradesh, India Alamgir Rahaman Basunia Department of Mathematics, Balurghat College, Balurghat, India Laxman Saha Department of Mathematics, Balurghat College, Balurghat, India Mayank Sharma Amity University, Noida, Uttar Pradesh, India Jyoti Prakash Singh National Institute of Technology Patna, Patna, India Abu Sufian Department of Computer Science, University of Gour Banga, Malda, India; Department of Computer & System Sciences, Visva-Bharati University, Santiniketan, India Meenatchi Sundaram School of Computational Sciences, Garden City University, Bengaluru, India Kalishankar Tiwary Department of Mathematics, Raiganj University, Raiganj, India

DEBM: Differential Evolution-Based Block Matching Algorithm

Abhishek Dixit, Ashish Mani, and Rohit Bansal

Abstract Among motion estimation approaches, block matching is the most popular and effective method for video sequences. In recent years, nature-inspired algorithms have been applied effectively to motion estimation. This study proposes a novel approach: a differential evolution-based block matching (DEBM) algorithm for motion estimation. To demonstrate its effectiveness, we use four well-known test video sequences that cover diverse resolutions, formats, and frame counts. The proposed approach is compared with standard block matching algorithms in terms of structural similarity (SSIM) and peak signal-to-noise ratio (PSNR). The empirical results indicate that the proposed algorithm performs better than the standard block matching algorithms.

Keywords Differential evolution algorithm · Motion estimation · Block matching · Video compression

A. Dixit (corresponding author)
Department of Computer Science, Amity School of Engineering and Technology, Amity University, Noida, Uttar Pradesh, India

A. Mani
Amity Innovation & Design Centre, Amity University, Noida, Uttar Pradesh, India

R. Bansal
Department of Management Studies, Rajiv Gandhi Institute of Petroleum Technology, Harbanshganj, Jais, Amethi, Uttar Pradesh, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. Bhattacharyya et al. (eds.), Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1279, https://doi.org/10.1007/978-981-15-9290-4_1


1 Introduction

Video and visual messaging applications usually involve large quantities of data. Efficient coding of digital video is therefore important, because transmission bandwidth and storage are limited. Motion estimation improves video coding efficiency by exploiting the correlation between successive frames. Various motion estimation techniques are available to reduce video coding complexity and improve efficiency, such as block matching (BM) algorithms, optical flow [1], parametric-based models [2], and pel-recursive techniques [3]. Of these, BM is the most popular and effective technique for both software and hardware implementations, and it is widely adopted by video coding standards. In BM, each video frame is divided into macro blocks, and for each block the best matching block is sought within a search window of the previous frame, bounded by a maximum allowable displacement s. The displacement between the template block in the current frame and its best match in the previous frame is called the motion vector. The goal is to minimize the mean squared error (MSE), the mean absolute difference (MAD), or the sum of absolute differences (SAD) between blocks. However, evaluating these measures for every candidate block is highly time-consuming. Motion estimation by block matching is therefore treated as an optimization problem whose aim is to find the best matching block for the target block. Numerous methods have been proposed in the literature to reduce the complexity and speed up block matching, for example three-step search [4], new three-step search [5], four-step search [6], diamond search [7], and adaptive rood pattern search [8].
These approaches are effective; however, they fail to strike a balance between accuracy and speed. An alternative is to use evolutionary algorithms. Lin et al. [9] proposed an extension of the three-step search (3SS) based on a genetic algorithm (GA). So et al. [10] proposed a four-step genetic search that hybridizes GA and four-step search. Bhattacharjee et al. [11] explored another approach that combines block matching with cuckoo search optimization. Choudhary et al. [12] reviewed several nature-inspired algorithms that have been applied to video motion estimation. Bhattacharjee et al. [13] proposed two hybrid approaches for motion estimation: artificial bee colony (ABC) with the differential evolution (DE) algorithm, and harmony search with DE. This paper proposes a novel DE-based algorithm that reduces the number of search locations in motion estimation. The approach uses a simple fitness calculation based on the nearest neighbor interpolation (NNI) algorithm. It can therefore significantly lower the number of function evaluations while preserving the search capability of DE. Compared with other fast BM algorithms, the approach gives competitive rates and more precise motion vectors.


This paper is organized into five sections. Section 2 discusses preliminaries, and Sect. 3 presents the proposed algorithm. Section 4 presents the results and a comparative analysis. Finally, Sect. 5 gives the conclusion and future directions.

2 Related Works

This section describes the techniques relevant to the proposed algorithm.

2.1 Block Matching Algorithm

Block matching is a motion estimation technique in which the current video frame \(I_t\) is divided into macro blocks of size \(N \times N\). For each block, the best matching block is sought within a search space of size \((2W + 1) \times (2W + 1)\) in the previous frame \(I_{t-1}\), where \(W\) is the maximum allowable displacement. The displacement between the template block in the current frame and its best match in the previous frame is called the motion vector. The goal is to minimize the mean squared error (MSE), the mean absolute difference (MAD), or the sum of absolute differences (SAD) between blocks. The SAD between the template block at position \((x, y)\) in the current frame and the candidate block at \((x + \hat{u}, y + \hat{v})\) in the previous frame is

\[ \mathrm{SAD}(\hat{u}, \hat{v}) = \sum_{j=0}^{N-1} \sum_{i=0}^{N-1} \left| g_t(x + i, y + j) - g_{t-1}(x + \hat{u} + i, y + \hat{v} + j) \right| \tag{1} \]

where \(g_t\) and \(g_{t-1}\) are the gray values of pixels in the current frame \(I_t\) and the previous frame \(I_{t-1}\), respectively. The motion vector is then calculated as

\[ MV(u, v) = \arg\min_{(\hat{u}, \hat{v}) \in S} \mathrm{SAD}(\hat{u}, \hat{v}) \tag{2} \]

where \(S = \{ (\hat{u}, \hat{v}) \mid -W \le \hat{u}, \hat{v} \le W \text{ and } (x + \hat{u}, y + \hat{v}) \text{ is a valid pixel position in } I_{t-1} \}\).
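As an illustration of Eqs. (1) and (2), the following is a minimal Python sketch of SAD-based exhaustive (full) search, which is the baseline that fast block matching methods try to approximate with fewer evaluations. It is not the authors' implementation. The function names `sad` and `full_search`, the representation of frames as lists of rows, and the stricter validity test (the whole candidate block must lie inside the previous frame) are assumptions made for the example.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale frame stored as rows of pixel values (assumed layout)


def sad(cur: Frame, prev: Frame, x: int, y: int, u: int, v: int, n: int) -> int:
    """Eq. (1): SAD between the n x n block at (x, y) in the current frame and
    the block displaced by (u, v) in the previous frame (x = column, y = row)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[y + j][x + i] - prev[y + v + j][x + u + i])
    return total


def full_search(cur: Frame, prev: Frame, x: int, y: int, n: int, w: int) -> Tuple[int, int]:
    """Eq. (2): arg-min of SAD over all displacements in [-w, w] x [-w, w]."""
    height, width = len(prev), len(prev[0])
    best: Tuple[int, int] = (0, 0)
    best_cost: Optional[int] = None
    for v in range(-w, w + 1):
        for u in range(-w, w + 1):
            # Assumption: skip displacements whose block would leave the frame.
            inside = 0 <= x + u and x + u + n <= width and 0 <= y + v and y + v + n <= height
            if not inside:
                continue
            cost = sad(cur, prev, x, y, u, v, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (u, v), cost
    return best
```

Full search evaluates up to \((2W + 1)^2\) candidates per block, which is the cost that motivates treating block matching as an optimization problem.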











2.2 Differential Evolution Algorithm (DE)

The differential evolution algorithm was proposed by Storn and Price [14] in 1997 as an exploratory algorithm for finding an optimal solution in a large search space. First, the solution vectors are randomly initialized as

\[ x_{i,j} = x_j^{\min} + R(0, 1) \left( x_j^{\max} - x_j^{\min} \right) \]

where \(i = \{1, 2, \ldots, NP\}\) and \(NP\) is the population size, \(j = \{1, 2, \ldots, D\}\) and \(D\) is the dimensionality of the search space. \(R(0, 1)\) is a uniformly distributed random number in the range (0, 1), and \(x_j^{\min}\) and \(x_j^{\max}\) are the predefined minimum and maximum values of parameter \(j\). Differential operators are then applied to improve the individuals, through the steps of fitness evaluation, mutation, crossover, and selection.

Mutation: A new vector is generated by adding the weighted difference of two vectors to a third vector. Let us suppose that \(X_{i,g}\) is a target vector and \(V_{i,g+1}\) is its mutant vector:

\[ V_{i,g+1} = X_{r_1,g} + F \cdot \left( X_{r_2,g} - X_{r_3,g} \right) \tag{3} \]

where \(r_1, r_2, r_3\) are random indices in \([1, NP]\) with \(r_1 \ne r_2 \ne r_3\), and \(F\) is the scaling factor, whose value lies in the range [0, 2]. This factor scales the difference between \(X_{r_2,g}\) and \(X_{r_3,g}\).

Crossover: In this step, the target vector and the mutant vector are mixed on the basis of the crossover rate:

\[ U_{i,j,g+1} = \begin{cases} V_{i,j,g+1} & \text{if } \mathrm{rand}_{j,i} \le Cr \text{ or } j = j_{rand} \\ X_{i,j,g} & \text{otherwise} \end{cases} \tag{4} \]

where \(U_{i,g+1}\) is a trial vector with \(j = 1, 2, \ldots, D\), \(\mathrm{rand}_{j,i}\) is a uniformly distributed random number between 0 and 1, and \(Cr \in [0, 1]\) is the crossover probability.

Selection: If the trial vector \(U_{j,g+1}\) at generation \(g + 1\) yields a cost value no greater than that of the target vector, the trial vector replaces the target vector in the next generation:

\[ X_{j,g+1} = \begin{cases} U_{j,g+1} & \text{if } f(U_{j,g+1}) \le f(X_{j,g}) \\ X_{j,g} & \text{otherwise} \end{cases} \tag{5} \]

where \(f(X)\) is the objective function with decision variable \(X\) and \(j = 1, 2, \ldots, NP\). Mutation, crossover, and selection are repeated until a stopping condition is met.
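The DE loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name, the default parameter values, the clipping of mutants to the bounds, the exclusion of the target index from \(r_1, r_2, r_3\), and the sphere test function are all choices made for this example rather than details from the text.

```python
import numpy as np

def differential_evolution(f, x_min, x_max, np_size=20, F=0.5, Cr=0.9,
                           generations=200, seed=0):
    rng = np.random.default_rng(seed)
    x_min = np.asarray(x_min, dtype=float)
    x_max = np.asarray(x_max, dtype=float)
    D = x_min.size
    # Random initialization inside [x_min, x_max].
    pop = x_min + rng.random((np_size, D)) * (x_max - x_min)
    cost = np.array([f(x) for x in pop])
    for _ in range(generations):
        for i in range(np_size):
            # Three distinct indices, also different from i (assumed convention).
            others = [k for k in range(np_size) if k != i]
            r1, r2, r3 = rng.choice(others, size=3, replace=False)
            mutant = pop[r1] + F * (pop[r2] - pop[r3])      # Eq. (3)
            mutant = np.clip(mutant, x_min, x_max)          # assumed bound handling
            j_rand = rng.integers(D)
            mask = rng.random(D) <= Cr
            mask[j_rand] = True                             # at least one mutant gene
            trial = np.where(mask, mutant, pop[i])          # Eq. (4)
            trial_cost = f(trial)
            if trial_cost <= cost[i]:                       # Eq. (5)
                pop[i], cost[i] = trial, trial_cost
    best = int(np.argmin(cost))
    return pop[best], float(cost[best])

# Demo on a 3-dimensional sphere function (minimum 0 at the origin).
sphere = lambda x: float(np.sum(x ** 2))
best_x, best_cost = differential_evolution(sphere, [-5, -5, -5], [5, 5, 5])
```

Because selection only accepts a trial that is no worse than its target, the best cost in the population never increases from one generation to the next.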

3 Proposed Algorithm This section presents the proposed algorithm for motion estimation.


3.1 BM Based on DE

The differential evolution algorithm is combined with the block matching algorithm to lower the number of search locations and to improve the coding quality. DE iteratively attempts to improve the solution vector through the following steps: population initialization, mutation, crossover, and selection.

Population Generation: The population consists of \(NP\) two-dimensional blocks \(B_i\), \(i \in \{1, \ldots, NP\}\), each of size 16 × 16 pixels. These blocks are created using a static shape from the blocks of the \((2W + 1) \times (2W + 1)\) search space.

Mutation: We use the DE/best/1 strategy for mutation. Two random blocks \(B_{n1}\) and \(B_{n2}\) are selected from the present generation; during the iteration process, the two random blocks are chosen by comparing their indices. \(B_{best}\) is the block with the minimal SAD value, and the mutant is obtained from it and the two randomly selected blocks:

\[ V = B_{best} + F * (B_{n1} - B_{n2}) \tag{6} \]

where \(F\) is the scale factor and \(V\) is the mutant vector.

Crossover: A trial block \(U\) is generated by applying uniform crossover between the parent block \(B_i\) and the mutant block \(V\) with crossover probability \(CP\). The value is taken from the mutant block if \(\mathrm{rand}(0, 1) \le CP\) or \(j = j_{rand}\); otherwise the value of the parent block is taken.

\[ U_{j,i} = \begin{cases} V_{j,i} & \text{if } \mathrm{rand}(0, 1) \le CP \text{ or } j = j_{rand} \\ B_{j,i} & \text{otherwise} \end{cases} \tag{7} \]

Selection: The selection operation compares the SAD values of the parent-child block pair. The child block replaces the parent block if its SAD value is no worse; otherwise the parent remains.

\[ B_i = \begin{cases} U_i & \text{if } f(U_i) \le f(B_i) \\ B_i & \text{otherwise} \end{cases} \tag{8} \]
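The steps of Eqs. (6) to (8) can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes that each individual is an integer displacement \((u, v)\) identifying a candidate block, that mutants are rounded and clipped to the search window, that the zero-motion candidate is seeded into the population, and that the caller keeps the whole window inside the frame. The names `sad` and `debm_search` and all default values are hypothetical.

```python
import numpy as np

def sad(cur, prev, x, y, u, v, n=16):
    a = cur[y:y + n, x:x + n].astype(np.int64)
    b = prev[y + v:y + v + n, x + u:x + u + n].astype(np.int64)
    return int(np.abs(a - b).sum())

def debm_search(cur, prev, x, y, n=16, w=7, np_size=10, F=0.5, cp=0.9,
                iters=30, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.integers(-w, w + 1, size=(np_size, 2))   # candidate (u, v) pairs
    pop[0] = (0, 0)                                    # assumed: seed zero motion
    fit = np.array([sad(cur, prev, x, y, u, v, n) for u, v in pop])
    for _ in range(iters):
        best = pop[np.argmin(fit)].copy()
        for i in range(np_size):
            n1, n2 = rng.choice(np_size, size=2, replace=False)
            # Eq. (6): DE/best/1 mutation, rounded to integer displacements.
            mutant = np.rint(best + F * (pop[n1] - pop[n2])).astype(int)
            mutant = np.clip(mutant, -w, w)
            # Eq. (7): uniform crossover with a guaranteed mutant component.
            mask = rng.random(2) <= cp
            mask[rng.integers(2)] = True
            trial = np.where(mask, mutant, pop[i])
            f_trial = sad(cur, prev, x, y, trial[0], trial[1], n)
            # Eq. (8): keep the candidate with the lower (or equal) SAD.
            if f_trial <= fit[i]:
                pop[i], fit[i] = trial, f_trial
    k = int(np.argmin(fit))
    return (int(pop[k][0]), int(pop[k][1])), int(fit[k])

rng = np.random.default_rng(1)
prev = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
cur = np.roll(prev, shift=(1, 2), axis=(0, 1))
mv, cost = debm_search(cur, prev, x=24, y=24)
```

Since the zero-motion candidate is in the initial population and selection is elitist per individual, the returned SAD can never be worse than the SAD of the zero displacement.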

Table 1 Different experimental parameters with their setting values

Defined parameters      Setting value
Iteration count         100
Population (NP)         50
Crossover rate          0.9
Scale factor (F)        0.2
Min scaling factor      0.5
Max scaling factor      1.0
Threshold               0.6

Table 2 Test video sequences

Sequence     Motion types   Format              Total frames
Container    Low            QCIF (176 × 144)    299
Carphone     Medium         QCIF (176 × 144)    381
Foreman      Medium         QCIF (352 × 288)    398
Akiyo        Medium         QCIF (352 × 288)    211

4 Experimental Results

4.1 Experimental Settings

The parameter values used in the experiments for the DE and BM algorithms are listed in Table 1. The performance and effectiveness of the proposed DE-BM algorithm are evaluated on the four common test video sequences listed in Table 2. The proposed DE-BM technique is compared with the search algorithms TSS [4], NTSS [5], 4SS [6], DS [7], and ARPS [8]. All experiments are performed on an Intel® Core™ i7-8665U CPU @ 1.90 GHz with 16 GB of RAM.

Result Analysis: Coding quality and search efficiency are used as the two performance indexes. Coding quality is represented by the PSNR value, which indicates the reconstruction quality when the motion vectors are calculated by a block matching technique. In PSNR, the original data frames define the signal, and the error introduced by the calculated motion vectors is treated as noise. The PSNR is calculated as:

\[ \mathrm{PSNR} = 10 \cdot \log_{10} \left( \frac{255^2}{\mathrm{MSE}} \right) \tag{9} \]

In the above equation, MSE is the mean squared error between the original frames and those compensated by the motion vectors. We have determined the PSNR, SSIM, and quality loss of all the algorithms.

Table 3 PSNR and SSIM comparison of BM methods for Carphone

BM algorithm   PSNR      Avg SSIM   Quality loss
ES             32.7196   0.9372     2.7231
TSS            32.4837   0.9339     2.0167
SESTSS         32.2893   0.9299     1.4267
NTSS           32.5627   0.9347     2.2544
4SS            32.4554   0.9336     1.9312
DS             32.5153   0.9342     2.1119
ARPS           32.4357   0.9331     1.8717
DEBM           31.47     0.9218     -

Table 4 PSNR and SSIM comparison of BM methods for Container

BM algorithm   PSNR      Avg SSIM   Quality loss
ES             44.1108   0.9926     0.3414
TSS            44.0624   0.9925     0.2319
SESTSS         44.0584   0.9925     0.2228
NTSS           44.0624   0.9925     0.2319
4SS            44.0448   0.9925     0.1920
DS             44.0439   0.9925     0.1900
ARPS           44.0198   0.9925     0.1353
DEBM           43.9806   0.9924     -

Quality loss designates the PSNR degradation ratio. It is calculated for a specific algorithm as the percentage by which the PSNR has been degraded, where the PSNR degradation ratio is the percentage reduction of PSNR w.r.t. exhaustive search.

\[ \text{Quality loss} = \left( \frac{\mathrm{PSNR} - \mathrm{PSNR}_{DEBM}}{\mathrm{PSNR}} \right) \times 100 \tag{10} \]
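Eqs. (9) and (10) translate directly into code. The sketch below is a minimal illustration; the function names, the handling of a zero MSE, and the synthetic 8 × 8 frames are assumptions made for this example, and the numbers it produces are unrelated to the values reported in the tables.

```python
import numpy as np

def psnr(original, reconstructed):
    """Eq. (9) for 8-bit frames; returns infinity when the frames are identical."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(255.0 ** 2 / mse))

def quality_loss(psnr_ref, psnr_alg):
    """Eq. (10): percentage reduction of psnr_alg relative to psnr_ref."""
    return (psnr_ref - psnr_alg) / psnr_ref * 100.0

# Synthetic demo: every pixel of the reconstruction is off by 5 gray levels,
# so MSE = 25 and PSNR = 10 * log10(65025 / 25).
original = np.full((8, 8), 100, dtype=np.uint8)
reconstructed = np.full((8, 8), 105, dtype=np.uint8)
demo_psnr = psnr(original, reconstructed)
demo_loss = quality_loss(40.0, 38.0)
```

Casting to floating point before subtracting avoids the wrap-around that unsigned 8-bit arithmetic would otherwise introduce.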

Tables 3, 4, 5 and 6 show the comparison results of the different block matching algorithms on the test video sequences Carphone, Container, Akiyo, and Foreman, respectively.

Table 5 PSNR and SSIM comparison of BM methods for Akiyo

BM algorithm   PSNR      Avg SSIM   Quality loss
ES             44.10     0.9931     0.6366
TSS            43.98     0.993      0.3614
SESTSS         43.87     0.9928     0.1253
NTSS           44.09     0.9931     0.6211
4SS            44.02     0.993      0.4466
DS             44.09     0.9931     0.6028
ARPS           44.07     0.9931     0.5627
DEBM           43.15     0.9927     -

Table 6 PSNR and SSIM comparison of BM methods for Foreman

BM algorithm   PSNR      Avg SSIM   Quality loss
ES             32.6896   0.9201     10.1821
TSS            32.009    0.8976     8.2723
SESTSS         31.5079   0.8866     6.8135
NTSS           32.2292   0.9019     8.8990
4SS            32.053    0.8989     8.3982
DS             32.2209   0.9028     8.8756
ARPS           32.3647   0.9086     9.2804
DE-BM          31.51     0.8367     -

The results show that the proposed DEBM has better quality in comparison to the other algorithms, which show higher loss in quality. In our approach, the average number of search points is used as the measure of computational complexity. The results show that combining the differential evolution algorithm with block matching improves computational complexity, and that the number of search points is significantly reduced. From our experiment we can therefore deduce that the DEBM algorithm is at least on par with the other approaches in lowering the number of search points. The main advantage of using the differential evolution algorithm is that it keeps the balance between quality and computational complexity.

5 Conclusion

In this paper, we have proposed a new algorithm for the BM process, built on differential evolution (DE), to lower the number of search locations. The approach hybridizes the traditional DE with a fitness estimation approach that chooses individual search locations for either estimation or actual evaluation in order to save computation time. The method can therefore significantly lower the number of function evaluations while keeping the search capabilities of DE. During the evolution process, the fitness calculation approach evaluates the fitness (SAD value) of a search location by utilizing previously evaluated neighboring locations. Positions close to the location with the best fitness value are evaluated using the actual fitness function. Similar positions that lie in the same region and have not been evaluated previously are also evaluated. The remaining search points are assigned the fitness value of the nearest known location. In this way, the actual SAD value is calculated for only a limited number of search points, while the remaining points are merely estimated. Because the proposed approach does not rely on any fixed search pattern or on any other assumption about movement, there is a high possibility of finding the correct motion vector irrespective of the complexity of movement in the video sequence.
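The fitness estimation idea summarized above can be sketched as follows. The sketch is only one possible reading: the function name `nni_fitness`, the `radius` parameter, the Chebyshev distance used to decide what counts as "close to the best location", and the toy fitness function are all assumptions introduced for illustration.

```python
def nni_fitness(pos, known, best_pos, radius, true_fitness):
    """Return (fitness, evaluated) for a search location `pos`.

    known        : dict mapping already evaluated (u, v) -> SAD value
    best_pos     : location with the best fitness found so far
    radius       : assumed neighbourhood size around best_pos
    true_fitness : callable computing the actual SAD of a location
    """
    if pos in known:
        return known[pos], False
    near_best = max(abs(pos[0] - best_pos[0]), abs(pos[1] - best_pos[1])) <= radius
    if near_best or not known:
        # Close to the best location: pay for an actual evaluation.
        known[pos] = true_fitness(pos)
        return known[pos], True
    # Otherwise reuse the fitness of the nearest evaluated location.
    nearest = min(known, key=lambda q: (q[0] - pos[0]) ** 2 + (q[1] - pos[1]) ** 2)
    return known[nearest], False

# Toy fitness standing in for the SAD of a displacement (hypothetical).
toy_sad = lambda p: abs(p[0] - 2) + abs(p[1] + 1)
known = {(0, 0): toy_sad((0, 0))}
near_value, near_evaluated = nni_fitness((1, 0), known, (0, 0), 1, toy_sad)
far_value, far_evaluated = nni_fitness((6, 0), known, (0, 0), 1, toy_sad)
```

Only the location next to the current best triggers a real evaluation; the distant location inherits the value of its nearest evaluated neighbour, which is how the number of SAD computations is kept low.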


References

1. Barron J, Fleet D, Beauchemin SS (1994) Performance of optical flow techniques. Int J Comput Vis 12(1):43–47
2. Tzovaras D, Kompatsiaris I, Strintzis MG 3D object articulation and motion estimation in model-based stereoscopic videoconference image sequence analysis and coding. Signal Process Image Commun 14(10):817–840
3. Gharavi H, Reza-Alikhani H (2001) Pel-recursive motion estimation algorithm. Electron Lett 37(21):1285–1286
4. Kulkarni SM, Bormane DS, Nalbalwar SL (2015) Coding of video sequences using three step search algorithm. Procedia Comput Sci 49:42–49
5. Li R, Zeng B, Liou M (1994) A new three-step search algorithm for block motion estimation. IEEE Trans Circuits Syst Video Technol 4(4):438–442
6. Po L-M, Ma W-C (1996) A novel four-step search algorithm for fast block motion estimation. IEEE Trans Circuits Syst Video Technol 6(3):313–317
7. Zhu S, Ma KK (1997) A new diamond search algorithm for fast block matching motion estimation. In: ICICS international conference of information, communications and signal processing
8. Nie Y, Ma KK (2002) Adaptive rood pattern search for fast block-matching motion estimation. IEEE Trans Image Process 11(12):442–451
9. Lin CI, Wu JL (1998) A lightweight genetic block-matching algorithm for video coding. IEEE Trans Circuits Syst Video Technol 8(4):386–392
10. So MF, Wu A (2018) Four-step genetic search for block motion estimation. In: Proceedings of the 1998 IEEE international conference of acoustics, speech and signal processing
11. Bhattacharjee K, Kumar S (2018) A novel block matching algorithm based on Cuckoo search. In: 2nd international conference on telecommunication and networks (TEL-NET)
12. Choudhury AH, Sinha N, Saikia M (2019) Nature inspired algorithms (NIA) for efficient video compression–a brief study. Eng Sci Technol Int J
13. Bhattacharjee K, Kumar S, Pandey HM, Pant M, Windridge D, Chaudhary A (2018) An improved block matching algorithm for motion estimation in video sequences and application in robotics. Comput Electr Eng 68:92–106
14. Storn R, Price K (1997) Differential evolution–a simple and efficient heuristic for global optimization over continuous spaces. J Global Optim 11(4):341–359

Multi-modality of Occupants’ Actions for Multi-Objective Building Energy Management Monalisa Pal and Sanghamitra Bandyopadhyay

Abstract A prevalent goal of building energy management is to search for the optimal schedule of occupants’ actions by minimizing multiple objectives pertaining to occupants’ dissatisfaction. Existing approaches have ignored the multi-modal (different schedules having the same objectives) and the multi-view (different facets of occupants’ actions related to door, window and heater) nature of the action schedules. For addressing these problem characteristics, a recent multi-modal multi-objective evolutionary algorithm, known as LORD, is customized in this work. Moreover, a decision-making strategy is proposed to consider the user preference in the decision space. This strategy also complies with the existing decision-making strategies to avoid neglecting the preference in the objective space. The superior performance of the proposed strategies on a real-world dataset establishes their effectiveness. Keywords Multi-modal multi-objective optimization · Evolutionary algorithms · Graph Laplacian-based spectral clustering

1 Introduction

Building energy management has been a trending topic over the past decade, as a significant proportion (20–40%) of the global energy consumption is from the buildings sector [1]. One of the prevailing strategies for building energy management, even applicable to non-green buildings, is regulating occupants’ actions to attain the finest indoor ambience [2]. This optimal schedule of occupants’ actions helps to generate causal explanations such that the occupants can learn to modify their actions toward an energy-efficient routine [3].

M. Pal (B) · S. Bandyopadhyay
Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700108, India
e-mail: [email protected]
S. Bandyopadhyay e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. Bhattacharyya et al. (eds.), Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1279, https://doi.org/10.1007/978-981-15-9290-4_2


Depending on the characterization of indoor ambience, the involved optimization is often a multi-objective problem [4]. As a result, the solution is a set of trade-offs. However, only one of the solutions can be implemented in practice. For building energy management, the literature on this selection of the most relevant solution (decision-making) from the Pareto-optimal Set (PS) is scarce. Earlier, the study in [5] transformed the multi-objective problem into a single objective problem and thus estimated a single optimal solution. Around 2016, the multi-objective approaches [6] relied on expert knowledge for obtaining the relevant solution. Another study considered distance-to-compromises for automated decision-making by forgoing a nearly equal amount from all objectives [2]. A slider prototype was designed to incorporate occupants’ preference while informing them about the optimal ranges of the objectives [3]. Furthermore, in 2018, decision-making in the presence of multiple subjective preferences was explored [7]. However, the existing approaches consider neither the solution distribution in the decision space nor the multi-view nature of the problem (actions are characterized by different domains: window, door and heater). Motivated by the fact that occupants will be more inclined to adopt a recommended schedule if it bears more resemblance to their usual schedule, this work contributes to building energy management in the following aspects:

1. A Multi-Modal Multi-Objective Evolutionary Algorithm (MMMOEA) is used to focus also on the solution distribution in the decision space.
2. For catering to the multi-view characteristics of the problem, a recent MMMOEA (graph Laplacian-based Optimization using Reference vector-assisted Decomposition, LORD [8]) is customized.
3. A novel decision-making strategy is proposed by considering the occupants’ preference in the decision space.

In the rest of the article, Sect. 2 outlines the proposed algorithmic framework, Sect. 3 discusses the results of implementing the proposed work on a real-world dataset, and Sect. 4 concludes the article with the future scope.

2 Building Energy Management Framework and Its Optimization Problem

The overall building energy management framework [3] is outlined in Fig. 1. The loop begins with sensor-fitted rooms and the creation of a database (\(H_{DB}\)) to store the usual occupants’ actions (opening/closing of windows \(\tilde{\zeta}_W\) and doors \(\tilde{\zeta}_D\), and turning on/off the room heater \(\tilde{\zeta}_H\)) and contextual variables (outdoor temperature \(T_{out}\), corridor temperature \(T_n\), occupancy \(n\), plug load energy consumption \(P_{elec}\), fuel cost \(E_{fuel}\) and electricity cost \(E_{elec}\)). For the \(k\)th hour, this data is used by the simulation models [3] such that the indoor physical variables (temperature \(T_{in}^k\), CO2 concentration \(C_{in}^k\) and heater energy consumption \(P_{fuel}^k\)) can be evaluated for hypothetical actions (\(\zeta_W^k\), \(\zeta_D^k\) and \(\zeta_H^k\)).

Fig. 1 Building energy management framework where the optimization module (dashed box) aims at estimating the relevant Pareto-optimal schedule

This work considers a 24-hour action schedule, which is denoted by a 72-dimensional solution vector (\(X_B\), Fig. 2) as follows:

\[ X_B = \left[ x_{B,1}, \ldots, x_{B,72} \right] = \left[ \zeta_W^0, \ldots, \zeta_W^{23}, \zeta_D^0, \ldots, \zeta_D^{23}, \zeta_H^0, \ldots, \zeta_H^{23} \right]. \tag{1} \]

These variables (input and simulated) are used to minimize thermal discomfort \(\sigma_{temp}\), aeraulic discomfort \(\sigma_{air}\), heater associated cost indicator \(\sigma_{cost}\) and the number of changes in a schedule \(\delta_{WD}\). Thus, using the formulation from Table 1, the optimization objective vector \(F_B\) is given as follows:

\[ F_B = \left[ f_{B,1}, f_{B,2}, f_{B,3}, f_{B,4} \right] = \frac{1}{24} \left[ \sum_{k=0}^{23} \sigma_{temp}^k, \sum_{k=0}^{23} \sigma_{air}^k, \sum_{k=0}^{23} \sigma_{cost}^k, \sum_{k=1}^{23} \delta_{WD}^k \right]. \tag{2} \]

Thus, the minimization of \(F_B(X_B)\) estimates the optimal actions \(\zeta_W\), \(\zeta_D\) and \(\zeta_H\) under the same recorded context used by the simulation models (fetched from the database \(H_{DB}\)). These recommended actions can be compared with the usual actions for generating causal explanations, from which the occupants can learn by themselves the impact of their actions [2, 3]. The concerned optimization problem is tackled using LORD (with the proposed customization for multi-view problems), which consists of the following steps:

[Step 1: Initialization] The binary action variables are randomly set to 0 or 1 to initialize a population (\(A_{G=0}\)) of \(n_{pop}\) solutions. Moreover, as LORD is a decomposition-based algorithm [8], the objective space is partitioned into \(n_{dir}\) sub-spaces (\(S_1, \ldots, S_{n_{dir}}\)) using Das and Dennis’ approach [9, 10].
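The initialization step can be sketched as below. The Das and Dennis construction enumerates all weight vectors whose components are multiples of 1/H and sum to 1. Using H = 4 divisions for the four objectives is an assumption made here because it yields 35 directions, which matches the value of \(n_{dir}\) used in the experiments; the function name and the random seed are likewise illustrative, and LORD's own partitioning code may differ.

```python
import numpy as np
from itertools import combinations

def das_dennis(n_obj, divisions):
    """Reference directions with components k/divisions that sum to 1."""
    dirs = []
    # Stars-and-bars enumeration: choose positions of the n_obj - 1 bars.
    for bars in combinations(range(divisions + n_obj - 1), n_obj - 1):
        prev, counts = -1, []
        for b in bars:
            counts.append(b - prev - 1)
            prev = b
        counts.append(divisions + n_obj - 2 - prev)
        dirs.append([c / divisions for c in counts])
    return np.array(dirs)

directions = das_dennis(n_obj=4, divisions=4)   # assumed H = 4
n_dir = len(directions)
n_pop = 3 * n_dir

# Random binary population: 24 window, 24 door and 24 heater variables each.
rng = np.random.default_rng(0)
population = rng.integers(0, 2, size=(n_pop, 72))
```

Each row of `directions` identifies one sub-space of the objective space, and each row of `population` is one candidate 24-hour schedule \(X_B\).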


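Table 1 Mathematical formulation of the four optimization objectives

Objective:
\[ \sigma_{temp}^k \left( T_{in}^k \right) = \begin{cases} \dfrac{294.15 - T_{in}^k}{294.15 - 291.15}, & \text{if } T_{in}^k < 294.15 \text{ and } n^k > 0 \\ 0, & \text{if } 294.15 \le T_{in}^k \le 296.15 \text{ or } n^k = 0 \\ \dfrac{T_{in}^k - 296.15}{299.15 - 296.15}, & \text{if } T_{in}^k > 296.15 \text{ and } n^k > 0 \end{cases} \]
Parameters: Simulated \(T_{in}^k\) (in K) and occupancy (\(n^k\)) at the \(k\)th hour

Objective:
\[ \sigma_{air}^k \left( C_{in}^k \right) = \begin{cases} 0, & \text{if } C_{in}^k \le 400 \text{ or } n^k = 0 \\ \dfrac{C_{in}^k - 400}{1500 - 400}, & \text{if } C_{in}^k > 400 \text{ and } n^k > 0 \end{cases} \]
Parameters: Simulated \(C_{in}^k\) (in ppm) and occupancy (\(n^k\)) at the \(k\)th hour

Objective:
\[ \sigma_{cost}^k \left( P_{elec}^k, P_{fuel}^k \right) = \begin{cases} \dfrac{P_{elec}^k E_{elec} + P_{fuel}^k E_{fuel}}{1000}, & \text{if } n^k > 0 \\ 0, & \text{if } n^k = 0 \end{cases} \]
Parameters: Simulated \(P_{fuel}^k\) (in W) and recorded \(P_{elec}^k\) (in W), \(E_{elec}\) and \(E_{fuel}\) (in Euros per kWh) at the \(k\)th hour

Objective:
\[ \delta_{WD}^k \left( \zeta_{pair}^k \right) = \begin{cases} \delta_{WD}^{k-1} \left( \zeta_{pair}^{k-1} \right) + 1, & \text{if } \zeta_{pair}^k \ne \zeta_{pair}^{k-1} \\ \delta_{WD}^{k-1} \left( \zeta_{pair}^{k-1} \right) + 0, & \text{if } \zeta_{pair}^k = \zeta_{pair}^{k-1} \end{cases} \]
where \(\zeta_{pair}^k = \left( \zeta_W^k, \zeta_D^k \right)\) and \(\delta_{WD}^0 \left( \zeta_{pair}^0 \right) = 0\)
Parameters: \(X_B\)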
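The piecewise definitions of Table 1 can be written as small functions. This is a minimal sketch: the function names and argument order are assumptions, occupancy is treated as absent whenever it is not positive, and the simulated inputs (temperature, CO2 concentration, power) are simply passed in as numbers rather than produced by the simulation models of [3].

```python
def sigma_temp(t_in, n):
    """Thermal discomfort for indoor temperature t_in (K) and occupancy n."""
    if n <= 0 or 294.15 <= t_in <= 296.15:
        return 0.0
    if t_in < 294.15:
        return (294.15 - t_in) / (294.15 - 291.15)
    return (t_in - 296.15) / (299.15 - 296.15)

def sigma_air(c_in, n):
    """Aeraulic discomfort for CO2 concentration c_in (ppm) and occupancy n."""
    if n <= 0 or c_in <= 400:
        return 0.0
    return (c_in - 400) / (1500 - 400)

def sigma_cost(p_elec, p_fuel, e_elec, e_fuel, n):
    """Cost indicator from power (W) and prices (Euros per kWh)."""
    if n <= 0:
        return 0.0
    return (p_elec * e_elec + p_fuel * e_fuel) / 1000

def delta_wd(zeta_w, zeta_d):
    """Cumulative number of changes of the (window, door) pair, starting at 0."""
    delta = [0]
    for k in range(1, len(zeta_w)):
        changed = (zeta_w[k], zeta_d[k]) != (zeta_w[k - 1], zeta_d[k - 1])
        delta.append(delta[-1] + (1 if changed else 0))
    return delta

changes = delta_wd([0, 1, 1, 0], [0, 0, 1, 1])
```

For example, a CO2 concentration of 950 ppm in an occupied room lies halfway between the 400 ppm and 1500 ppm bounds and therefore gives an aeraulic discomfort of 0.5.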

Fig. 2 (a) Reproduction of \(X_B^{child}\), (b) Proposed decision-making (S2) as compared to distance-to-compromises (S1) for schedule selection

[Step 2: Perturbation] Given \(S_i\), the first parent is sampled from \(S_i\). The second parent is sampled from a non-empty sub-space within a neighborhood of \(S_i\) to benefit from the neighborhood property [10]. If \(S_i\) is empty, then the first parent is also sampled similarly. Due to the multi-view nature of \(X_B\), variables are extracted for any one of the views (window, door or heater) from both the parents for single-point binary crossover and bit-flip mutation [3]. The generated variables are replaced for the respective view in the first parent vector to output the child solution (\(X_B^{child}\)), as illustrated in Fig. 2.
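The view-wise perturbation can be sketched as follows. The sketch assumes the variable layout of Eq. (1) (window, door, heater, 24 variables each), picks the view uniformly at random, and uses a mutation probability of 1/24; the function name, the way the view is chosen and the parents used in the demo are illustrative assumptions rather than LORD's actual code.

```python
import numpy as np

VIEWS = {"window": slice(0, 24), "door": slice(24, 48), "heater": slice(48, 72)}

def perturb(parent1, parent2, rng, p_mut=1 / 24):
    """Create a child that differs from parent1 in at most one view."""
    name = list(VIEWS)[rng.integers(len(VIEWS))]
    view = VIEWS[name]
    a, b = parent1[view], parent2[view]
    # Single-point binary crossover on the 24 variables of the chosen view.
    cut = rng.integers(1, 24)
    segment = np.concatenate([a[:cut], b[cut:]])
    # Bit-flip mutation.
    flips = rng.random(24) < p_mut
    segment = np.where(flips, 1 - segment, segment)
    # Put the new segment back into a copy of the first parent.
    child = parent1.copy()
    child[view] = segment
    return child, name

rng = np.random.default_rng(0)
p1 = rng.integers(0, 2, size=72)
p2 = rng.integers(0, 2, size=72)
child, changed_view = perturb(p1, p2, rng)
```

Restricting crossover and mutation to one view keeps the other two views of the first parent intact, so each child explores a single facet of the schedule at a time.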


Fig. 3 Proposed population filtering step for LORD

[Step 3: Filtering] After \(X_B^{child}\) is generated, it is added to the population pool (\(A_G \cup X_B^{child}\)). One of the solutions has to be removed to maintain a constant \(n_{pop}\). As illustrated in Fig. 3, filtering considers the following steps:

1. Using non-dominated sorting [9], the population is partitioned into non-domination ranks. While rank-1 estimates the Pareto Front (PF), one of the solutions from the last rank is to be eliminated. If the last rank has only one solution, it is deleted and filtering stops.
2. All the solutions in the last rank undergo spectral clustering, considering the multi-view characteristics of the problem. For each of the three domains, the nearest neighbor graph is obtained using the cosine similarity (for binary variables) between the solutions (nodes). The adjacency matrices (\(G_{window}\), \(G_{door}\) and \(G_{heater}\)) are constructed considering pair-wise node similarity to be more than \(\varepsilon_L = 0.6\). Subsequently, the symmetric normalized graph Laplacians (\(L_{sym}^{window}\), \(L_{sym}^{door}\) and \(L_{sym}^{heater}\)) are obtained. Thereafter, the eigendecomposition is performed separately on \(L_{sym}^{window}\), \(L_{sym}^{door}\) and \(L_{sym}^{heater}\). The smallest non-zero eigenvalue (Fiedler value) for each of the domains (\(\lambda_2^{window}\), \(\lambda_2^{door}\) and \(\lambda_2^{heater}\)) represents the quality of the graph partitioning [11]. Hence, these Fiedler values determine the influence of a particular domain on the overall Laplacian (\(L_{sym}^{comb}\)) as follows:

\[ L_{sym}^{comb} = \frac{\lambda_2^{window} \times L_{sym}^{window} + \lambda_2^{door} \times L_{sym}^{door} + \lambda_2^{heater} \times L_{sym}^{heater}}{\lambda_2^{window} + \lambda_2^{door} + \lambda_2^{heater}}. \tag{3} \]

   The algebraic multiplicity of the 0 eigenvalue of \(L_{sym}^{comb}\) gives the number of connected components (\(k_{CC}\)) for the overall cluster structure (in the decision space). The eigenvectors of \(L_{sym}^{comb}\) corresponding to the second smallest to the \(k_{CC}\)th eigenvalue are clustered (\(C_1, \ldots, C_{k_{CC}}\)) using the k-means algorithm.
3. For the set of solutions in each cluster, the crowding distances [8] in the objective space are noted.
4. The solutions in the last rank are finally rearranged, where at first the candidates with the highest crowding are selected from each cluster, then candidates with the second-highest crowding are selected from each cluster, and so on. On iterating from the end (worst position) of this rearranged set of solutions, if a solution belonging to a non-singleton sub-space is encountered, it is deleted and the filtering stops. If all solutions are from singleton sub-spaces, then the last solution from this rearranged last rank is deleted.

The perturbation and the filtering steps are repeated for all the sub-spaces within one generation. After \(G_{max}\) generations, LORD is terminated to yield the PS (\(A_{G_{max}}\)) and the PF (\(A_{F,G_{max}} = \{ F_B(X_B) \mid X_B \in A_{G_{max}} \}\)).

[Step 4: Selecting the relevant and Pareto-optimal solution] After \(G_{max}\) generations, Algorithm 1 is executed. In Algorithm 1, lines 2 to 5 determine the minimum deviation (\(\Delta_{min}^{sch}\)) over all the solutions in \(A_{G_{max}}\) from the occupants’ usual/preferred solution (\(\tilde{X}_B\)). Line 6 forms a set (pruned PF: \(A_{F,G_{max}}^{sch}\)) of the objective vectors for the schedules corresponding to \(\Delta_{min}^{sch}\). Finally, lines 7 to 9 use an existing decision-making strategy (e.g., distance-to-compromises [2]) in the objective space to select the schedule \(X_B^{\star}\) for recommendation. This decision-making strategy is illustrated in Fig. 2 for a two-objective case.
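The Fiedler-weighted combination of Eq. (3) from the filtering step can be sketched with NumPy as below. Several details are assumptions made for illustration: the adjacency matrices are binary (an edge exists when the cosine similarity exceeds 0.6), isolated nodes get an all-zero Laplacian row, all-zero schedules are given similarity 0, and a plain average is used if every Fiedler value is zero. The function names are hypothetical, and the k-means and crowding-distance parts of the step are not shown.

```python
import numpy as np

VIEWS = (slice(0, 24), slice(24, 48), slice(48, 72))  # window, door, heater

def view_adjacency(bits, eps=0.6):
    """Binary adjacency from pairwise cosine similarity above eps."""
    x = bits.astype(float)
    norms = np.linalg.norm(x, axis=1)
    norms[norms == 0] = 1.0          # assumed: all-zero rows get similarity 0
    sim = (x @ x.T) / np.outer(norms, norms)
    adj = (sim > eps).astype(float)
    np.fill_diagonal(adj, 0.0)
    return adj

def sym_laplacian(adj):
    """Symmetric normalized Laplacian; isolated nodes yield zero rows."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    return np.diag((deg > 0).astype(float)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def fiedler_value(lap, tol=1e-9):
    """Smallest eigenvalue above tol, or 0.0 if there is none."""
    eig = np.linalg.eigvalsh(lap)    # ascending order
    positive = eig[eig > tol]
    return float(positive[0]) if positive.size else 0.0

def combined_laplacian(solutions):
    """Eq. (3): Fiedler-value-weighted average of the per-view Laplacians."""
    laps = [sym_laplacian(view_adjacency(solutions[:, v])) for v in VIEWS]
    lams = [fiedler_value(lap) for lap in laps]
    total = sum(lams)
    if total == 0.0:
        return sum(laps) / len(laps)  # assumed fallback for degenerate graphs
    return sum(lam * lap for lam, lap in zip(lams, laps)) / total

rng = np.random.default_rng(0)
last_rank = rng.integers(0, 2, size=(8, 72))
l_comb = combined_laplacian(last_rank)
```

Because the weights are non-negative and each per-view Laplacian is positive semidefinite, the combined matrix is symmetric and positive semidefinite as well, so its eigendecomposition can be used for the subsequent clustering.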

Algorithm 1 Choosing a relevant and Pareto-optimal schedule
Input: \(A_{G_{max}}\): Estimated PS; \(A_{F,G_{max}}\): Estimated PF; \(\tilde{X}_B\): Usual/Preferred schedule
Output: \(X_B^{\star}\): Relevant and Pareto-optimal schedule
1: procedure Schedule_Select(\(A_{G_{max}}\), \(A_{F,G_{max}}\), \(F_B^{ref}\), \(\tilde{X}_B\))
2:   for i = 1 to \(n_{pop}\) (for each solution vector in \(A_{G_{max}}\)) do
3:     \(\Delta X_{B,i} = \sum_{j=1}^{N} \left| \tilde{x}_{B,j} - x_{B,i,j} \right|\)   (Net deviation of \(X_{B,i} \in A_{G_{max}}\) from \(\tilde{X}_B\))
4:   end for
5:   \(\Delta_{min}^{sch} \leftarrow \min_i \Delta X_{B,i}\)   (Minimum deviation over all schedules)
6:   \(A_{F,G_{max}}^{sch} = \{ F_B(X_B) \mid \Delta X_B = \Delta_{min}^{sch} \}\)   (Pruned PF for schedules with \(\Delta_{min}^{sch}\))
7:   \(F_B^{\star} \leftarrow \arg\min D_B\) over \(A_{F,G_{max}}^{sch}\) using Eq. (4)   (distance-to-compromises [2])
8:   \(X_B^{\star} = \arg F_B^{\star}\)   (Obtain the corresponding schedule)
9:   return \(X_B^{\star}\) as the relevant and Pareto-optimal schedule
10: end procedure
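A compact Python rendering of Algorithm 1 is given below as a sketch. It assumes that the Pareto set and Pareto front are passed as arrays with one row per solution, and that the distance-to-compromises is the weighted sum of Eq. (4) with weights 1, 1, 1 and 0.01; the function name and the tiny four-variable demo are hypothetical and only serve to show the two-stage selection.

```python
import numpy as np

def schedule_select(pareto_set, pareto_front, usual, weights=(1.0, 1.0, 1.0, 0.01)):
    ps = np.asarray(pareto_set)
    pf = np.asarray(pareto_front, dtype=float)
    # Lines 2-4: net deviation of every schedule from the usual schedule.
    deviation = np.abs(ps - np.asarray(usual)).sum(axis=1)
    # Line 5: minimum deviation over all schedules.
    d_min = deviation.min()
    # Line 6: indices of the pruned PF (schedules with minimum deviation).
    idx = np.flatnonzero(deviation == d_min)
    # Lines 7-8: weighted sum of objectives, Eq. (4), on the pruned PF.
    d_b = pf[idx] @ np.asarray(weights)
    best = idx[np.argmin(d_b)]
    # Line 9: recommended schedule and its proportion of change.
    return ps[best], d_min / ps.shape[1]

# Tiny hypothetical example with 4 binary variables and 4 objectives.
usual = [1, 0, 1, 0]
pareto_set = [[1, 0, 1, 1], [1, 1, 1, 0], [0, 1, 0, 1]]
pareto_front = [[0.2, 0.1, 0.3, 2], [0.1, 0.1, 0.3, 5], [0.0, 0.0, 0.1, 1]]
chosen, change_ratio = schedule_select(pareto_set, pareto_front, usual)
```

In the demo, the third schedule has the best objectives but differs from the usual schedule in all four variables, so it is pruned; among the two schedules that differ in one variable, the second has the smaller weighted sum and is selected.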

The subsequent section discusses the performance of this optimization framework using real-world data for building energy management.


3 Performance on Real-World Data

The customized version of LORD is implemented in Python 3 on a computer with 8 GB RAM and an Intel Core i7 processor. It uses the simulation models, which are trained using the data from April 2015 to October 2016 for an office room in Grenoble Institute of Technology, France [3]. The existing approach of distance-to-compromises (S1) is compared with the proposed decision-making strategy (S2). As LORD also considers the decision space during filtering, it is compared with a state-of-the-art optimization algorithm, NAEMO [10], which focuses only on the objective space during population filtering. In the decision space, the proportion of change in schedules (\(\Delta_{min}^{sch}/N\) with \(N = 72\)) is noted, and in the objective space, the distance-to-compromises (\(D_B(X_B)\)) is noted as follows:

\[ D_B(X_B) = \sum_{i=1}^{4} w_i \times f_{B,i}(X_B), \text{ with } w_1 = w_2 = w_3 = 1 \text{ and } w_4 = 0.01. \tag{4} \]

The general parameters considered for this experiment are \(n_{dir} = 35\), \(n_{pop} = 3 \times n_{dir}\) and \(G_{max} = 300\). As the solution vector dimension is reduced to one-third during perturbation (Fig. 2a), a mutation probability of 1/24 is considered. The best result is combined over 5 runs of the algorithm, as recommended in [3]. The results obtained from the above approaches for the 20 random days are noted in Table 2, from which the following insights are obtained:

– For both LORD and NAEMO, as S1 prioritizes \(D_B(X_B^{\star})\) over \(\Delta_{min}^{sch}\), larger \(\Delta D_B\) values are obtained by S1 approaches. Similarly, as S2 prioritizes in the reverse order, smaller \(\Delta_{min}^{sch}/N\) values are obtained by S2 approaches.
– Although S2 yields a lower \(\Delta D_B\) value, it is numerically very close to that of S1 at a much lower \(\Delta_{min}^{sch}/N\) value. For example, on 26-Jan-2016, NAEMO attains a similar \(\Delta D_B\) value with only a 61% change in the schedule using S2 instead of a 79% change using S1. Thus, S2 is a better approach than S1 for choosing the schedule to be recommended.
– LORD (S2) is noted to perform as well as or better than NAEMO (S2) in 15 cases (using both the indicators). Moreover, for five days (numbered 2, 5, 8, 13, and 19), \(\Delta_{min}^{sch}/N\) from LORD (S2) is at least 10% lower than that from NAEMO (S2). This superiority is because LORD can obtain more equivalent schedules, which result in the same objectives (multi-modality).

Thus, recommending a relevant and Pareto-optimal schedule can be beneficial for energy management. For example, on 01-Sep-2015, with only a 16% change from \(\tilde{X}_B\), LORD (S2) has obtained a better (Pareto-optimal) schedule \(X_B^{\star}\) than the usual schedule \(\tilde{X}_B\).

Table 2 Amount of change in schedule (\(\Delta_{min}^{sch}/N = \sum_{j=1}^{N} \left| \tilde{x}_{B,j} - x_{B,j}^{\star} \right| / N\)) required for a deviation of \(\Delta D_B\) (\(= D_B(\tilde{X}_B) - D_B(X_B^{\star})\)) in global criteria

                       Decision Space (Δ_min^sch/N ↓)                 Objective Space (ΔD_B ↑)
Day                    NAEMO    NAEMO    LORD     LORD            NAEMO    NAEMO    LORD     LORD
Number  Date           (S1)     (S2)     (S1)     (S2)            (S1)     (S2)     (S1)     (S2)
1       01-Apr-2015    0.4583   0.4583   0.3889   0.3889          0.0275   0.0275   0.0275   0.0275
2       20-May-2015    0.6806   0.6806   0.4167   0.4167          0.0191   0.0191   0.0183   0.0183
3       30-Sep-2015    0.5139   0.4167   0.3472   0.3333          0.0318   0.0303   0.0320   0.0307
4       08-Oct-2015    0.4167   0.4167   0.3889   0.3889          0.0270   0.0270   0.0270   0.0270
5       03-Nov-2015    0.7778   0.6667   0.7639   0.4861          0.0210   0.0170   0.0211   0.0179
6       07-Dec-2015    0.7083   0.5833   0.6111   0.5694          0.1650   0.1625   0.1650   0.1545
7       26-Jan-2016    0.7917   0.6111   0.6528   0.6250          0.1122   0.1112   0.1126   0.1116
8       01-Feb-2016    0.7778   0.7500   0.6806   0.5417          0.3345   0.3340   0.3349   0.3348
9       17-Mar-2016    0.7222   0.6250   0.7083   0.6528          0.3726   0.3706   0.3732   0.3728
10      13-Apr-2016    0.6944   0.5833   0.7778   0.6250          0.0069   0.0061   0.0073   0.0069
11      23-May-2016    0.6250   0.6250   0.5694   0.5694          0.0088   0.0088   0.0092   0.0092
12      02-Jun-2016    0.4444   0.4444   0.3472   0.3472          0.0176   0.0176   0.0176   0.0176
13      20-Oct-2016    0.7917   0.7917   0.4444   0.4444          0.1467   0.1467   0.1463   0.1463
14      16-Jun-2015    0.4028   0.4028   0.3611   0.3611          0.5867   0.5867   0.5872   0.5872
15      07-Jul-2015    0.5000   0.3889   0.2917   0.2917          0.6655   0.6610   0.6668   0.6668
16      01-Sep-2015    0.4306   0.2083   0.4028   0.1667          0.6765   0.5469   0.6972   0.5306
17      30-Jun-2016    0.3889   0.3611   0.2917   0.2917          0.7737   0.7347   0.7737   0.7737
18      26-Jul-2016    0.3611   0.3611   0.3333   0.3333          0.7635   0.7635   0.7635   0.7635
19      31-Aug-2016    0.3611   0.3611   0.2222   0.2222          0.7317   0.7317   0.7319   0.7319
20      08-Sep-2016    0.3472   0.2639   0.3472   0.1806          0.7563   0.7356   0.7563   0.7230
Mean                   0.5597   0.5000   0.4674   0.4118          0.3122   0.3019   0.3134   0.3026
Wilcoxon Signed Rank   (Sig.)   (Sig.)   (Sig.)   (vs. others)    (Not)    (Not)    (Sig.)   (vs. others)
Test (Sig./Not)

4 Conclusion

In this article, the multi-view and the multi-modal nature of occupants’ actions are considered for better building energy management. The perturbation and filtering steps of a recent MMMOEA (known as LORD) are modified to satisfy the multi-view characteristics. Furthermore, to deal with user preference in the decision space, a novel decision-making strategy is proposed. Results on a real-world dataset establish the efficacy of the proposed strategies. While Algorithm 1 performs relevant schedule selection in the presence of a preferred occupants’ schedule, decision-making among equivalent schedules in the absence of such preference is an interesting direction for further exploration. Moreover, additional views (like appliance usage, water consumption) and objectives (like air pressure- and humidity-related discomfort) may be considered in future studies.

References

1. Corrado V (2018) Energy efficiency in buildings research perspectives and trends. Thermal Sci 22(4):971–976
2. Alyafi AA, Pal M, Ploix S, Reignier P, Bandyopadhyay S (2017) Differential explanations for energy management in buildings. In: 2017 computing conference, pp 507–516
3. Pal M, Alyafi AA, Ploix S, Reignier P, Bandyopadhyay S (2019) Unmasking the causal relationships latent in the interplay between occupants actions and indoor ambience: a building energy management outlook. Appl Energy 238:1452–1470
4. Nguyen AT, Reiter S, Rigo P (2014) A review on simulation-based optimization methods applied to building performance analysis. Appl Energy 113:1043–1058
5. Asadi E, da Silva MG, Antunes CH, Dias L, Glicksman L (2014) Multi-objective optimization for building retrofit: A model using genetic algorithm and artificial neural network and an application. Energy Build 81:444–456
6. Papadopoulos S, Azar E (2016) Optimizing HVAC operation in commercial buildings: a genetic algorithm multi-objective optimization framework. In: Proceedings of the 2016 winter simulation conference. IEEE Press, pp 1725–1735
7. Pal M, Bandyopadhyay S (2018) Consensus of subjective preferences of multiple occupants for building energy management. In: 2018 IEEE symposium series on computational intelligence (SSCI), pp 1815–1822
8. Pal M, Bandyopadhyay S (2020) Decomposition in decision and objective space for multimodal multi-objective optimization. Preprint arXiv:2006.02628
9. Li K, Deb K, Zhang Q, Kwong S (2015) An evolutionary many-objective optimization algorithm based on dominance and decomposition. IEEE Trans Evol Comput 19(5):694–716
10. Sengupta R, Pal M, Saha S, Bandyopadhyay S (2019) NAEMO: neighborhood-sensitive archived evolutionary many-objective optimization algorithm. Swarm Evol Comput 46:201–218
11. Fiedler M (1975) A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory. Czechoslovak Math J 25(4):619–633

A Novel Self-adaptive Salp Swarm Algorithm for Dynamic Optimization Problems Sanjai Pathak, Ashish Mani, Mayank Sharma, and Amlan Chatterjee

Abstract Many real-world applications can be cast as dynamic optimization problems, in which the optimizer must locate and track the trajectory of the changing global optimum while finding the global best solution in a dynamic and uncertain environment. In this article, we present a novel nature-inspired meta-heuristic optimizer for dynamic optimization problems, namely the self-adaptive salp swarm algorithm (SA-SSA). SA-SSA combines a self-adaptive parameter control technique with a multi-population and ageing mechanism so that individuals maintain diversity during the optimization process. The overall performance of SA-SSA is evaluated on the widely known generalized dynamic benchmark problems provided for the CEC’09 competition. Preliminary results show that the proposed SA-SSA is promising.

Keywords Computational intelligence · Dynamic optimization problems · Swarm intelligence · Evolutionary algorithm

S. Pathak (B) · A. Mani · M. Sharma
Amity University, Noida, Uttar Pradesh, India

A. Chatterjee
California State University Dominguez Hills, Carson, CA, USA

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. Bhattacharyya et al. (eds.), Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1279, https://doi.org/10.1007/978-981-15-9290-4_3



1 Introduction

Most research on evolutionary computation has focused on stationary optimization problems. However, many real-world applications are dynamic: changes occur over time, and such applications can be represented as dynamic optimization problems (DOPs). In such cases, the optimization algorithm has to find and track the moving optimum as closely as possible, rather than just find a single good solution. The solution of a DOP changes over time during the optimization process, which affects the performance of many real-world applications [1]. A DOP γ can be defined as

γ = g(x, δ, μ)

where g is the objective function, x is a candidate solution from the set of solutions X, δ denotes the system parameters that determine the position of the solution in the fitness landscape, and μ is the time. To solve DOPs, the search strategy of an algorithm must be capable of locating and tracking the changing global optimum over time μ while also finding a high-quality solution in the fitness landscape. The desirable attributes of a good optimizer for DOPs are the ability to create diversity in the solutions, to maintain that diversity, to store good solutions, and to use a multi-population strategy [2].

In 2017, Mirjalili et al. [3] proposed the salp swarm algorithm (SSA), a nature-inspired, population-based metaheuristic that mimics the swarming behaviour of salps in deep oceans. To some extent, this swarming behaviour keeps the solutions from converging to a local optimum. SSA works efficiently and has shown good results in many real-world applications, but in some cases its search process does not perform well and fails to reach the global optimum. In particular, SSA lacks a good search strategy for situations in which the global optimum attained so far must be improved, so it struggles to reach the expected global optimum in the search space.
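To make the notation γ = g(x, δ, μ) concrete, the following minimal Python sketch defines a single-peak objective whose optimum drifts with time. The function name `moving_peak`, the drift rule and the layout of `delta` are illustrative assumptions; they are not part of the CEC’09 benchmark or of SA-SSA.

```python
def moving_peak(x, delta, mu):
    """Toy dynamic objective g(x, delta, mu), to be maximised.

    x     -- candidate solution (list of floats)
    delta -- system parameters (height, width, drift per time step); hypothetical layout
    mu    -- discrete time index
    """
    height, width, drift = delta
    # The peak centre moves along every axis as time advances.
    centre = [drift * mu for _ in x]
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return height / (1.0 + width * dist_sq)


delta = (10.0, 1.0, 0.5)
x_old = [0.0, 0.0]                    # the optimum at mu = 0
print(moving_peak(x_old, delta, 0))   # fitness while x_old is still optimal
print(moving_peak(x_old, delta, 4))   # fitness of the same point after the peak has moved
```

A solution that was optimal at μ = 0 loses most of its fitness once the peak has moved, which is why a DOP solver has to keep tracking the optimum instead of converging once.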
DOPs are one such kind of problem: the global optimum changes over time, and SSA struggles to improve on the optimum attained so far in order to reach the expected global optimum in the search space. The existing literature shows increasing interest in applying SSA to various stationary problems, such as river flow forecasting in hydrology [4], feature selection with a binary SSA [5] and with a novel chaotic SSA [6], and SDN multi-controller placement with a chaotic SSA [7]. Faris et al. [5] addressed the feature selection task in machine learning. Sayed et al. [8] proposed a chaos-based salp swarm algorithm (CSSA) for the same task. Ismael et al. [9] used the original version of SSA to choose the best electrical line in a real-world power system design problem in Egypt. Ekinci and Hekimoglu [10] applied SSA to tune the stabilizer of a multimachine power system, an important task for delivering constant voltage regardless of changes in the input voltage of the power system. The results of that experiment confirmed that SSA outperformed other intelligent techniques. All these works demonstrate that SSA is flexible and capable of managing the exploration and exploitation propensities of a nature-inspired algorithm, especially in the feature selection task in machine learning, where it has shown satisfactory performance and flexibility in detecting near-optimal solutions.

This article proposes a novel technique to improve the overall performance of the original SSA in solving DOPs, since the original SSA cannot improve the global optimum attained so far in a dynamic search space. To the best of our knowledge, this is the first attempt to apply SSA with a self-adaptive technique to DOPs. The article is structured as follows: Sect. 2 presents the self-adaptive SSA algorithm and the techniques applied for solving DOPs. The experimental evaluation of the proposed optimizer is presented in Sect. 3. Section 4 presents the results and discussion of the experiments, and finally, Sect. 5 concludes this article.

2 Self-adaptive SSA Algorithm

The adequate balance between exploration and exploitation that SSA shows on stationary optimization problems makes it appealing for DOPs. SSA has a unique advantage that some other standard optimizers, such as particle swarm optimization [11, 12], the grey wolf optimizer [13] and the whale optimization algorithm [14], do not offer: it is capable, simple, flexible, easy to understand, and applicable to complex real-world problems. Moreover, its single adaptively decreasing control parameter makes it well suited to optimization problems because it gives a better balance between the exploration and exploitation propensities. SSA mimics the swarming behaviour of salps during the optimization and forms a salp chain. This chain can avoid stagnation in local optima to some extent [7, 15]. However, the search process of the original SSA is not efficient at improving the global optimum attained so far, which is required to track and locate the global optimum in a dynamic fitness landscape. The original SSA therefore lacks the expected ability to explore and maintain diversity in DOPs, and it is unable to find and pursue the changing global best solution in a dynamic and uncertain environment. A self-adaptive control parameter technique with a multi-population mechanism is an efficient approach for solving DOPs: the control parameters of the algorithm adapt themselves according to the progress of the optimization process, and the multi-population with ageing strategy helps in searching for and tracking the ever-changing global optimum [16].


2.1 Mathematical Model

To model the salp chain mathematically, the whole population is divided into two groups: leaders and followers. The salp at the front of the chain is the leader, whose responsibility is to guide the swarm. The remaining salps are followers, which follow one another and therefore the leader, directly or indirectly. As in other swarm-based techniques, the search space may be described by an N × d matrix, where d is the dimension of the given problem and N is the number of salps. Hence, the positions of all salps can be stored in a matrix $X_i$, as in Eq. (1). We assume the food source S to be the salp swarm’s target and consider it the best food position. In SA-SSA, the positions of the leader and the followers are calculated using Eqs. (2) and (3), respectively, during the optimization process.

$$
X_i = \begin{bmatrix}
x_1^1 & x_2^1 & x_3^1 & \cdots & x_d^1 \\
x_1^2 & x_2^2 & x_3^2 & \cdots & x_d^2 \\
x_1^3 & x_2^3 & x_3^3 & \cdots & x_d^3 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
x_1^N & x_2^N & x_3^N & \cdots & x_d^N
\end{bmatrix} \tag{1}
$$

$$
X_j^1 = \begin{cases}
S_j + C_{1,j}\left(S_j - X_j^1\right) * C_{2,j}, & k_3 \le 0 \\
S_j - C_{1,j}\left(S_j - X_j^1\right) * C_{2,j}, & k_3 \ge 0
\end{cases} \tag{2}
$$

$$
C_{1,j} = k_1 \cdot \mathrm{rand}_1, \qquad C_{2,j} = \log\left(\frac{\mathrm{rand}_1}{\mathrm{rand}_2}\right) + k_2
$$

$$
X_j^i = X_j^{i-1} + C_i^{t+1}\left(S_j - X_j^{i-1}\right) - \left(X_j^{i-1} - X_j^{i-2}\right) \tag{3}
$$

$$
C_i^{t+1} = \left(\sin(P * R) - \frac{t}{T}\right) * \mathrm{rand}(0, 1)
$$

where $C_{1,j}$, $C_{2,j}$ and $C_i^{t+1}$ are self-adaptive parameters, $X_j^1$ is the position of the first salp (i.e. the leader) in the jth dimension, and $S_j$ is the position of the food source, the salp swarm’s target, in the jth dimension. $k_2$ and $k_3$ are uniformly generated random values in the range [0, 1]; $k_3$ indicates whether the next position in the jth dimension moves toward positive infinity or negative infinity. The coefficient $k_1$, given in Eq. (4), is the most important parameter in the SSA algorithm because it balances the exploration and exploitation propensities of EAs.

$$
k_1 = 2e^{-\left(\frac{4l}{L}\right)^2} \tag{4}
$$

where l and L indicate the current and maximum iteration number, respectively. Hence, as l increases during the search process, the SSA parameter $k_1$ decreases, which results in exploration during the initial stage and more focus on exploitation during the later stages of the search process.
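The decay of the coefficient in Eq. (4) can be checked numerically. The short Python sketch below evaluates k1 = 2·exp(−(4l/L)²) at a few iterations; the function name `k1_schedule` and the choice L = 100 are illustrative assumptions.

```python
import math


def k1_schedule(l, L):
    """Eq. (4): k1 = 2 * exp(-(4l / L)^2), with l the current and L the maximum iteration."""
    return 2.0 * math.exp(-((4.0 * l / L) ** 2))


L = 100
for l in (0, 25, 50, 100):
    # k1 starts at 2 and decays rapidly towards 0 as l approaches L.
    print(l, k1_schedule(l, L))
```

With these values the coefficient starts at 2, drops to 2/e after a quarter of the run, and is practically zero at the end, which matches the shift from exploration to exploitation described above.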

2.2 Self-adaptive Techniques in SSA

In evolutionary algorithms (EAs), choosing appropriate control parameters is extremely important because the choice has serious implications for the performance of the algorithm. There is no consistent technique for choosing suitable parameter values, and most of the time they are chosen arbitrarily from some predefined range of values [17]. There are two major ways to set the parameter values of EAs: parameter tuning and parameter control. Parameter tuning is the commonly practised approach in the existing literature. It involves finding good parameter values beforehand through different techniques, and these values remain fixed while the algorithm is applied to similar problems. Parameter control means that the parameter values keep changing during the optimization process. It can be categorized into three classes, namely deterministic, adaptive and self-adaptive [18]. In this paper, we consider the self-adaptive technique because the parameter values are updated while the actual search is in progress. We also found it to be an efficient technique for solving dynamic optimization problems. Several ideas have been proposed in the literature to improve the performance of SSA, but we used the original version of SSA and applied a multi-population with ageing mechanism and self-adaptive parameter control as described in [16]. The objective of this approach is to improve the global best solution obtained so far by SSA and to track the trajectory of the global optimum. Further, we used the Gaussian hyperparallelepiped approach to allow some of the salps in the population, known as Brownian salps, to oscillate around the best salp [19]. This approach helps maintain the diversity of individuals and avoids re-initializing the population when a change is detected, since re-initialization causes a severe loss of information.
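The idea of salps oscillating around the best salp can be illustrated with a small Python sketch. It assumes that a Brownian salp is obtained by adding zero-mean Gaussian noise to each coordinate of the best salp and clipping to the search bounds; the text does not give the exact rule, so the function name `brownian_salps`, the spread `sigma` and the clipping are all hypothetical.

```python
import random


def brownian_salps(best, sigma, count, lower, upper, rng=random):
    """Create `count` salps scattered around the best salp (illustrative rule only).

    sigma is a hypothetical spread parameter; positions are clipped to [lower, upper].
    """
    salps = []
    for _ in range(count):
        pos = [min(upper, max(lower, b + rng.gauss(0.0, sigma))) for b in best]
        salps.append(pos)
    return salps


rng = random.Random(1)  # fixed seed so the sketch is repeatable
for salp in brownian_salps([1.0, -2.0], sigma=0.2, count=3, lower=-5.0, upper=5.0, rng=rng):
    print(salp)
```

Because these salps stay near the best known position while remaining spread out, they can notice a shift of the optimum without the whole population being re-initialized.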
Maintaining diversity is extremely important in dynamic optimization problems, because the global optimum of such a problem changes over time, and if the salps are clustered in a tight region, an individual salp may not be able to detect the change in the problem function. Also, the explorative propensity of EAs is influenced by population diversity: exploration power is lower when the individuals of the population are similar. In SA-SSA, Eqs. (2) and (3) update the positions of the salps and produce new solutions. The values of the self-adaptive control parameters $C_{1,j}$, $C_{2,j}$ and $C_i^{t+1}$ change over the iterations of the algorithm and produce new values for updating the positions of the salps during execution. The quantities $\mathrm{rand}_k$, $k \in \{1, 2, 3\}$, represent uniform random values within the range [0, 1]. P and R were taken as the fixed values 1.489 and 0.04, respectively. The control parameter values are modified before the position update, so they influence the new positions of the salps as they move towards the global optimum. Pseudo-code of the self-adaptive SSA is presented in Algorithm 1.
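One simple way to quantify how tightly the salps are clustered is the mean distance of the individuals from the population centroid. The Python sketch below is only an illustration of that idea; the measure and the name `population_diversity` are assumptions, since the text does not state which diversity measure, if any, SA-SSA computes.

```python
import math


def population_diversity(population):
    """Mean Euclidean distance of the individuals from the population centroid."""
    n = len(population)
    dim = len(population[0])
    centroid = [sum(ind[j] for ind in population) / n for j in range(dim)]
    return sum(math.dist(ind, centroid) for ind in population) / n


clustered = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
spread = [[0.0, 0.0], [2.0, 0.0]]
print(population_diversity(clustered))  # identical salps: no diversity
print(population_diversity(spread))     # salps one unit either side of the centroid
```

A value near zero signals that the swarm has collapsed onto one point and is unlikely to detect a change elsewhere in the landscape.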


3 Experiments and Computational Analysis

3.1 Setup of the Experimental Environment

The experiments were run on an Intel® Core™ i7-3520M CPU @ 2.90 GHz with 16 GB of RAM and the Microsoft Windows 10 Home operating system. The source code of the self-adaptive SSA is implemented in C++.

3.2 Benchmark Problems on Real Space

The performance of the self-adaptive SSA algorithm is evaluated on the six benchmark problems (F) proposed by Li et al. [1, 20] for the CEC’09 special session on evolutionary computation in dynamic and uncertain environments: F1: Rotation peak function, F2: Composition of Sphere’s function, F3: Composition of Rastrigin’s function, F4: Composition of Griewank’s function, F5: Composition of Ackley’s function, F6: Hybrid composition function. The generalized dynamic benchmark generator (GDBG) supports seven types of dynamic change: small step (T1), large step (T2), random (T3), chaotic (T4), recurrent (T5), recurrent with noise (T6) and dimensional change (T7) [20].
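For orientation, the static base functions behind F2 to F5 can be written in a few lines of Python. This is only a sketch of their textbook forms; the GDBG problems build dynamic compositions on top of such functions, and that composition and the dynamics are not reproduced here.

```python
import math


def sphere(x):
    return sum(v * v for v in x)


def rastrigin(x):
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)


def griewank(x):
    s = sum(v * v for v in x) / 4000.0
    p = math.prod(math.cos(v / math.sqrt(i)) for i, v in enumerate(x, start=1))
    return s - p + 1.0


def ackley(x):
    n = len(x)
    a = -20.0 * math.exp(-0.2 * math.sqrt(sum(v * v for v in x) / n))
    b = -math.exp(sum(math.cos(2.0 * math.pi * v) for v in x) / n)
    return a + b + 20.0 + math.e


origin = [0.0] * 5
for f in (sphere, rastrigin, griewank, ackley):
    # All four functions attain their minimum value of 0 at the origin.
    print(f.__name__, f(origin))
```

Rastrigin’s function has a dense grid of local minima, which is consistent with F3 being the hardest case in the results reported later.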


3.3 Parameter Settings and Testing Procedures

The self-adaptive SSA is designed to work on the GDBG framework and requires five parameters: the global optima, the fitness evaluation (passed as a function parameter to record the performance value), the previous salp positions, the required changes in age, and the control parameters. All other parameters required by the algorithm, such as the number of salps, the maximum number of generations (i.e. iterations), the number of dimensions, the lower and upper bounds of the search space, and the various objective functions, are managed by other functions of GDBG. In this article, the performance of the self-adaptive SSA is compared with the original SSA [3], conventional PSO [11] and clustering PSO [21]. The PSO variants were evaluated on the dynamic benchmark problems and presented at CEC’2009 with a maximum population size of 50. For the evaluation of the self-adaptive SSA on the GDBG framework, the maximum population size is also 50. For problem function F1, the number of peaks is 10 and 50, which gives two versions of the test problem. For the remaining problem functions F2 to F6, the number of peaks is 10. The self-adaptive control parameters are calculated using the equations defined in Sect. 2.1, and $k_2$ and $k_3$ are uniformly generated in the range [0, 1], according to the dynamic benchmark framework used in the SA-SSA algorithm. The remaining algorithm parameters for the dynamic benchmark problems are set as usual without any fine-tuning. For all the basic test functions, the testing procedure of the dynamic benchmark problem is used, and an interface was created to perform all the changes required from the SA-SSA implementation perspective. The self-adaptive control parameters are adjusted in code to maintain the expected ability to explore and exploit the search space of DOPs.

The self-adaptive SSA algorithm is set to run for the pre-defined number of evaluations of the evaluation functions (F), taking the required parameters and the change types (T) as input. That is, no information about changes to the problem, such as changes in the number of peaks, dynamic changes or dimensional changes, was passed to the algorithm during its execution. The method proposed in [1] was used to calculate the error values of the self-adaptive SSA.
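As a rough illustration of how such error values can be summarised, the Python sketch below computes an average best, average mean, average worst and standard deviation from a matrix of errors indexed by run and by change. The definitions used here are an assumption based on a common reading of the benchmark report [1]; the function name `summarise_errors` and the sample data are hypothetical.

```python
import math


def summarise_errors(errors):
    """errors[r][c] = |f(best found) - f(true optimum)| for run r just before change c."""
    runs = len(errors)
    flat = [e for run in errors for e in run]
    avg_best = sum(min(run) for run in errors) / runs    # best error per run, averaged
    avg_worst = sum(max(run) for run in errors) / runs   # worst error per run, averaged
    avg_mean = sum(flat) / len(flat)                     # mean over all runs and changes
    std = math.sqrt(sum((e - avg_mean) ** 2 for e in flat) / (len(flat) - 1))
    return avg_best, avg_mean, avg_worst, std


errors = [[0.0, 1.0, 3.0], [2.0, 0.0, 6.0]]  # two runs, three changes each (made-up numbers)
print(summarise_errors(errors))
```

Under these definitions, an average best of 0.00 only says that each run reached the optimum before at least one change, so the mean and STD rows are the more informative ones.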

4 Results and Discussion

The self-adaptive SSA was executed and evaluated on the DOPs, and the performance values calculated for each function were recorded as the average best, average mean, average worst and standard deviation (STD). They are presented in Tables 1, 2, 3, 4, 5, 6 and 7. Table 8 presents the overall performance of the SA-SSA algorithm over all six dynamic optimization problems and seven types of change, with a combination of different test cases. The assessment results and the analysis of the data show that the self-adaptive SSA outperformed the original SSA, PSO, GA and CPSO on most of the test functions. The performance of the self-adaptive SSA was measured for all types of change and for all the problem functions, including F3, where the results of CPSO are worse than those of the simple genetic algorithm (SGA) because SGA has better diversity.

The analysis of the results indicates that SA-SSA is a competitive optimizer for DOPs. The chaotic and small step changes are easier for SA-SSA on most of the dynamic optimization problems. SA-SSA performed better than CPSO on all of the test functions of the DOPs. The large displacement and the Gaussian displacement were tough in problem function F3, which was also difficult to optimize under dimensional changes. The problem-wise performance indicates that SA-SSA has excellent performance on functions F2, F4 and F5 for the small step (T1) and chaotic (T4) change types. The algorithm secured superior results for each type of change in functions F5 (Ackley’s function) and F2 (Sphere’s function). The consistent performance of SA-SSA on function F6, the hybrid composition function, for all types of change (T1–T7) can be observed from Table 8 by comparing the average best values. Rastrigin’s function with large step change is the most difficult test case among all the dynamic optimization problems for SA-SSA.

Table 1 Value achieved for problem F1 on 10 peaks

| Errors | T1 | T2 | T3 | T4 | T5 | T6 | T7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Mean best | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Mean worst | 1.67 | 32.90 | 36.74 | 0.01 | 20.85 | 42.73 | 46.93 |
| Mean | 0.04 | 5.44 | 9.27 | 0.00 | 2.39 | 1.91 | 7.22 |
| STD | 0.31 | 8.82 | 11.29 | 0.00 | 4.13 | 6.46 | 13.32 |

Table 2 Value achieved for problem F1 on 50 peaks

| Errors | T1 | T2 | T3 | T4 | T5 | T6 | T7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Mean best | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Mean worst | 4.95 | 27.03 | 38.70 | 1.29 | 13.40 | 65.47 | 25.67 |
| Mean | 0.36 | 5.71 | 10.62 | 0.09 | 1.68 | 4.59 | 4.17 |
| STD | 0.95 | 6.47 | 10.26 | 0.27 | 2.82 | 11.12 | 5.93 |

Table 3 Value achieved for problem F2 on 10 peaks

| Errors | T1 | T2 | T3 | T4 | T5 | T6 | T7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Mean best | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Mean worst | 23.24 | 525.57 | 494.61 | 47.08 | 485.13 | 319.65 | 49.64 |
| Mean | 2.67 | 103.19 | 71.64 | 1.69 | 77.08 | 14.05 | 8.73 |
| STD | 5.31 | 182.53 | 152.69 | 6.81 | 147.86 | 53.25 | 12.44 |

Table 4 Value achieved for problem F3 on 10 peaks

| Errors | T1 | T2 | T3 | T4 | T5 | T6 | T7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Mean best | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Mean worst | 577.38 | 967.54 | 940.40 | 1094.54 | 956.58 | 1381.92 | 873.27 |
| Mean | 17.60 | 711.52 | 595.33 | 195.03 | 607.37 | 428.36 | 254.28 |
| STD | 74.02 | 321.15 | 359.39 | 366.50 | 375.38 | 477.66 | 356.00 |

Table 5 Value achieved for problem F4 on 10 peaks

| Errors | T1 | T2 | T3 | T4 | T5 | T6 | T7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Mean best | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Mean worst | 25.08 | 591.22 | 557.91 | 11.06 | 541.01 | 59.23 | 541.85 |
| Mean | 4.34 | 174.15 | 84.26 | 1.39 | 135.05 | 5.53 | 30.24 |
| STD | 8.11 | 244.03 | 167.34 | 2.85 | 199.28 | 11.23 | 99.38 |

Table 6 Value achieved for problem F5 on 10 peaks

| Errors | T1 | T2 | T3 | T4 | T5 | T6 | T7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Mean best | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Mean worst | 35.79 | 14.93 | 12.25 | 8.59 | 15.91 | 11.26 | 32.53 |
| Mean | 2.12 | 0.83 | 1.18 | 0.18 | 1.31 | 0.93 | 2.43 |
| STD | 7.31 | 2.48 | 2.97 | 1.14 | 3.07 | 2.25 | 6.57 |

Table 7 Value achieved for problem F6 on 10 peaks

| Errors | T1 | T2 | T3 | T4 | T5 | T6 | T7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Mean best | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Mean worst | 29.80 | 60.32 | 68.21 | 46.71 | 51.94 | 52.95 | 41.13 |
| Mean | 5.84 | 14.19 | 20.31 | 9.08 | 14.38 | 10.87 | 15.11 |
| STD | 8.30 | 16.03 | 16.31 | 13.02 | 15.72 | 14.22 | 15.19 |

Table 8 Algorithm overall performance

| | F1(10) | F1(50) | F2 | F3 | F4 | F5 | F6 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| T1 | 0.01468568 | 0.01456347 | 0.018940368 | 0.014745144 | 0.017719296 | 0.018931464 | 0.01645008 |
| T2 | 0.01311785 | 0.01314255 | 0.009891984 | 0.001709942 | 0.007836144 | 0.01896648 | 0.012040944 |
| T3 | 0.0121935 | 0.012049185 | 0.011837328 | 0.003065688 | 0.010410672 | 0.018846456 | 0.011134104 |
| T4 | 0.01485975 | 0.01483671 | 0.021195408 | 0.010122384 | 0.020433312 | 0.022725456 | 0.014645136 |
| T5 | 0.01387166 | 0.01410951 | 0.0119016 | 0.003691632 | 0.009776832 | 0.0198198 | 0.014944896 |
| T6 | 0.01364502 | 0.013053585 | 0.014114832 | 0.005162448 | 0.015609408 | 0.018529968 | 0.01233312 |
| T7 | 0.00851145 | 0.00898691 | 0.010177232 | 0.005157584 | 0.009644848 | 0.012168832 | 0.008371728 |
| Mark | 0.0908849 | 0.09074192 | 0.098058752 | 0.043654822 | 0.091430512 | 0.129988456 | 0.089920008 |

Performance = Sum of the marks secured in each case ∗ 100 = 63.46793654
Original SSA: 14.1646, GA: 38.0792, SPSO: 33.1020, CPSO: 57.5742
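The overall performance figure in Table 8 follows directly from the Mark row. The short Python check below sums the seven marks and multiplies by 100; the dictionary simply copies the Mark values from the table, and the variable names are illustrative.

```python
# Mark row of Table 8 (one entry per test problem).
marks = {
    "F1(10)": 0.0908849,
    "F1(50)": 0.09074192,
    "F2": 0.098058752,
    "F3": 0.043654822,
    "F4": 0.091430512,
    "F5": 0.129988456,
    "F6": 0.089920008,
}

# Performance = sum of the marks secured in each case * 100.
performance = 100.0 * sum(marks.values())
print(round(performance, 4))
```

The result agrees with the reported performance of 63.46793654 up to the rounding of the tabulated marks, and it also shows that F3 contributes by far the smallest share.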

5 Conclusion

This paper presents a novel approach based on the nature-inspired salp swarm algorithm (SSA) for solving the dynamic optimization problems proposed in the CEC’2009 special session. The self-adaptive parameter control technique is used with a multi-population mechanism, in which population diversity is maintained by the Gaussian hyperparallelepiped approach, to locate and track the ever-changing global optimum in an uncertain and dynamic environment. The overall performance of SA-SSA was evaluated on the widely known generalized dynamic benchmark problems (F1–F6) with seven types of change (T1–T7). The results show that SA-SSA is a promising algorithm and can find a reasonable solution for most of the DOPs. Although the current self-adaptive SSA is effective in solving problems in dynamic and uncertain environments, additional cooperation between the follower salps will be considered in future work to improve its performance.

References

1. Li C, Yang S, Nguyen T, Yu E, Yao X, Jin Y, Beyer H-G, Suganthan P (2008) Benchmark generator for CEC’2009 competition on dynamic optimization. CEC’2009 Special Session
2. Branke J (2000) Evolutionary optimization in dynamic environments 3. https://doi.org/10.1007/978-1-4615-0911-0
3. Mirjalili S, Gandomi AH, Mirjalili SZ, Saremi S, Faris H, Mirjalili SM (2017) Salp Swarm Algorithm: a bio-inspired optimizer for engineering design problems. Adv Eng Softw 114:163–191. ISSN 0965-9978
4. Yaseen ZM, Sulaiman SO, Deo RC, Chau K-W (2019) An enhanced extreme learning machine model for river flow forecasting: State-of-the-art, practical applications in water resource engineering area and future research direction. J Hydrol 569:387–408
5. Faris H, Mafarja MM, Heidari AA, Aljarah I, Al-Zoubi AM, Mirjalili S, Fujita H (2018) An efficient binary Salp Swarm Algorithm with crossover scheme for feature selection problems. Knowl Based Syst 154:43–67
6. Jour S, Gehad I, Khoriba G, Haggag MH (2018) A novel chaotic salp swarm algorithm for global optimization and feature selection. Appl Intell 3462–3481
7. Ateya AA, Muthanna A, Vybornova A, Algarni AD, Abuarqoub A, Koucheryavy Y, Koucheryavy A (2019) Chaotic salp swarm algorithm for SDN multi-controller networks. Eng Sci Technol Int J 2(4):1001–1012
8. Sayed GI, Khoriba G, Haggag MH (2018) A novel chaotic salp swarm algorithm for global optimization and feature selection. Appl Intell 48:3462–3481
9. Ismael SM, Abdel Aleem SHE, Abdelaziz AY, Zobaa AF (2018) Practical considerations for optimal conductor reinforcement and hosting capacity enhancement in radial distribution systems. IEEE Access 6:27268–27277. https://doi.org/10.1109/access.2018.2835165
10. Ekinci S, Hekimoglu B (2018) Parameter optimization of power system stabilizer via Salp Swarm algorithm. In: 2018 5th international conference on electrical and electronic engineering (ICEEE), Istanbul, pp 143–147. https://doi.org/10.1109/iceee2.2018.8391318
11. Eberhart R, Kennedy J (1995) A new optimizer using particle swarm theory. In: MHS’95. Proceedings of the sixth international symposium on micro machine and human science, Nagoya, Japan, pp 39–43. https://doi.org/10.1109/mhs.1995.494215
12. Kennedy J, Eberhart R (1995) Particle swarm optimization. In: Proceedings of ICNN’95 - international conference on neural networks, vol 4, Perth, WA, Australia, pp 1942–1948. https://doi.org/10.1109/icnn.1995.488968
13. Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61
14. Mirjalili S, Lewis A (2016) The whale optimization algorithm. Adv Eng Softw 95:51–67
15. Abbassi R, Abbassi A, Heidari AA, Mirjalili S (2018) An efficient salp swarm-inspired algorithm for parameters identification of photovoltaic cell models. Energy Convers Manag 179:362–372. https://doi.org/10.1016/j.enconman.2018.10.069
16. Brest J, Zamuda A, Boskovic B, Maucec MS, Zumer V (2009) Dynamic optimization using self-adaptive differential evolution. In: 2009 IEEE congress on evolutionary computation, Trondheim, pp 415–422. https://doi.org/10.1109/CEC.2009.4982976
17. Maruo MH, Lopes HS, Delgado MR (2005) Self-adapting evolutionary parameters: encoding aspects for combinatorial optimization problems. In: Raidl GR, Gottlieb J (eds) Evolutionary computation in combinatorial optimization. EvoCOP 2005. Lecture Notes in Computer Science, vol 3448. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31996-2_15
18. Brest J, Greiner S, Boskovic B, Mernik M, Zumer V (2006) Self-adapting control parameters in differential evolution: a comparative study on numerical benchmark problems. IEEE Trans Evol Comput 10(6):646–657. https://doi.org/10.1109/TEVC.2006.872133
19. Zaharie D, Zamfirache F (2006) Diversity enhancing mechanisms for evolutionary optimization in static and dynamic environments
20. Li C, Yang S (2008) A generalized approach to construct benchmark problems for dynamic optimization. In: Li X et al (eds) Simulated evolution and learning. SEAL 2008. Lecture Notes in Computer Science, vol 5361. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89694-4_40
21. Li C, Yang S (2009) A clustering particle swarm optimizer for dynamic optimization. In: 2009 IEEE congress on evolutionary computation, Trondheim, pp 439–446. https://doi.org/10.1109/CEC.2009.4982979

Digital ID Generation and Management Framework Using Blockchain Suchira Banerjee and Kousik Dasgupta

Abstract In the age of digitization and the Internet, one of the most compelling needs is a user identity management system that is decentralized, secure, and interoperable across different online service domains. The success of the cryptographically secured, decentralized ledger system called blockchain as the underlying platform for the Bitcoin cryptocurrency system has made it a promising candidate for research and development in the identity management domain as well. One of the challenges in designing blockchain-based digital identity management solutions is to balance the degree of decentralization and self-sovereignty in order to maximize security and the credibility of the personal data mapped to the digital identity (ID), and to increase usability. In this work, a novel digital ID generation and management framework over blockchain is proposed that balances a centralized digital ID issuance process with decentralized peer verification of the issued ID on the blockchain before the ID is confirmed and ready to be shared. The proposed work has been simulated on the Hyperledger Sawtooth blockchain framework to demonstrate its feasibility. The results show the efficacy and usability of the proposed work. The work concludes by outlining future research directions for improving the privacy of the proposed identity management system.

Keywords Blockchain · Digital identification · Security

S. Banerjee (B) · K. Dasgupta
Kalyani Government Engineering College, Kalyani, West Bengal, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. Bhattacharyya et al. (eds.), Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1279, https://doi.org/10.1007/978-981-15-9290-4_4

1 Introduction

An identity management system binds each individual with a distinct, unambiguous identifier that is composed of a set of personal information, traits, or attributes of the individual [1]. The prevalent national identification schemes are highly centralized in terms of data storage and depend on a single authority for ID issuance, which makes the system vulnerable to security threats such as data theft and privacy breaches. Also, the various stages of the ID issuance process and the ID life cycle are not transparent enough, so irregularities and malpractices have an opportunity to creep in. Differences in the requirements and compatibility of consumer applications and services pose additional challenges in establishing a uniform standard for data exchange, encoding schemes, data storage, and data usage in an identity management system.

In this work, we propose a digital ID generation and management framework over a consortium blockchain network. It provides each user with a unique ID, referred to as the Digital ID, bound both to the blockchain address of the user and to the user as a person. This ID is issued, recorded, managed, and shared with transparency on the distributed, cryptographically secured, append-only ledger of the blockchain. The use of blockchain technology brings not only transparency to the proposed Digital ID life cycle but also decentralization to the validation and issuance process. The system also provides user authentication through digital signatures and privacy through standard encryption. The contributions of this paper include the following:

• Detailed design of the Digital ID generation process over blockchain.
• Security analysis of the proposed framework.
• Implementation of a prototype application over Hyperledger Sawtooth to demonstrate the feasibility of the Digital ID generation protocols.

The remainder of this paper is organized as follows. In Sect. 2, we discuss the related works. Section 3 provides an overview of the proposed system and its design goals. Section 4 elaborates on the design of the system, discusses identity management and use cases, and analyzes the security aspects of the system. The implementation details of the prototype blockchain system for Digital ID generation are provided in Sect. 5. Screenshots of output from the prototype application are provided and discussed in Sect. 6, and the conclusion is given in Sect. 7.

2 Related Work

From 2014 to 2018, many blockchain-based solutions in this field were developed, including Sovrin [2], Blockstack [3], ShoCard [4] and uPort [5]. Some of these solutions and their approaches have been studied and analyzed in detail in research papers and surveys [1, 6]. The study by Dunphy and Petitcolas [1] on blockchain-based identity management systems identifies two widely used approaches:

1. Self-sovereign identity (SSI): The user has full control and ownership over his identity data. Self-sovereignty was given the prime focus by identity management systems such as Blockstack, Sovrin and uPort.
2. Decentralized trusted identity: The identity is provided by a centralized issuing authority. The ShoCard application uses this approach, as data are scanned from a government ID at the time of bootstrapping to the blockchain. Unlike SSI, this approach requires the user to disclose more information for the issuance of a new digital identity.

However, fool-proofing the attestation of personal attributes while bootstrapping an ID is not fully addressed in applications such as ShoCard or uPort, because data attestation by a third party is kept optional after the ID has been bootstrapped. The SSI approach, in turn, might give rise to scenarios in which the owner hides or alters some of his attribute values, causing a breach of trust. In this work, we try to address these issues by balancing self-sovereignty with centralized authentication of the identity attributes, using the principle of federated attestation. The concept of federated attestation is used by the Bloom Protocol, a blockchain project for a decentralized credit scoring system and secure identity [7]. The Bloom protocol uses “peer-to-peer staking” to ensure the creditworthiness and authenticity of the BloomID [7]: peers that reflect the social relationships of a user vouch for the user’s PII. Although the BloomID generation approach closely resembles our proposed approach, Bloom does not take into account the presence of faulty nodes in the network, that is, non-responding or malfunctioning user or peer clients. In our proposed Digital ID generation method, we seek to reduce the dependency on the correctness of the system-defined trusted parties by requiring a super-majority quorum that accounts for possibly faulty participants in the peer-verification process.
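A super-majority quorum of this kind can be sketched in a few lines of Python. The text does not state the threshold, so the sketch assumes the customary rule of more than two thirds of the selected peers, together with the classical bound that n participants tolerate f faulty ones when n ≥ 3f + 1. All function names are illustrative.

```python
def supermajority_quorum(n_peers):
    """Smallest number of matching votes that exceeds two thirds of n_peers (assumed threshold)."""
    return (2 * n_peers) // 3 + 1


def max_faulty(n_peers):
    """Largest f such that n_peers >= 3f + 1 (classical Byzantine fault bound)."""
    return (n_peers - 1) // 3


def is_verified(approvals, n_peers):
    """True when enough peers have vouched for the ID to reach the quorum."""
    return approvals >= supermajority_quorum(n_peers)


for n in (4, 7, 10):
    print(n, supermajority_quorum(n), max_faulty(n))
print(is_verified(3, 4), is_verified(2, 4))
```

Under this assumption, four selected peers need three matching votes, so a single non-responding or malfunctioning peer cannot block or forge a verification.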

3 System Overview

This section gives an overview of the system architecture.

3.1 Participants and Process Overview

The proposed Digital ID generation and management framework uses blockchain as a backend system for issuing, storing, and exchanging Digital IDs. The participants access the blockchain through a CLI/GUI-based client frontend. Figure 1 presents an overview of the proposed system. The blockchain participants are given special privileges on the basis of their role, which are leveraged by the role-specific client application. The participants can be classified as follows:

1. Primary ID issuer: an authorized entity responsible for acquiring user details and verifying them against valid acceptable proofs such as government IDs, birth certificates, etc., in the physical presence of the user.
2. Secondary certifier: authorized entities that provide key services (banks, universities, hospitals, etc.) and can act as ID verifiers on the blockchain.


Fig. 1 Overview of Digital ID generation with four selected peers on blockchain

3. User: any person or organization that seeks to obtain a Digital ID and avail applicable services through Digital ID authentication.
4. Peer user: each user selects a list of peers who are willing to vouch for the validity and authenticity of the personal information present in the Digital ID. The peers can be chosen from secondary certifiers or from other users in the network who possess a valid Digital ID and are assumed to be known to the requesting user.

To make the blockchain network functional for Digital ID generation, it starts with a trusted organization as the primary ID issuer and at least four secondary certifiers. It is assumed in this work that only the primary ID issuer and the secondary certifiers have permission to take part in the consensus for adding new blocks to the blockchain.


3.2 Parameters Used by the Digital ID Generation Framework

[i] Trust score: a trust score is assigned to a user’s or certifier’s profile in the Digital ID network to quantify its trustworthiness, and it is used as the weight of each vote cast by that user or certifier.
[ii] Credibility score: the credibility score is calculated at the attribute level of a Digital ID and gives a numeric measure of the credibility or trustworthiness of the attribute’s data. It is calculated as the sum of the trust scores of all the peer verifiers who vouched for the correctness of the attribute’s value.
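The credibility score defined above can be illustrated with a minimal Python sketch. The function name, the dictionary layout, and the peer names other than certifier1 are hypothetical and serve only as an illustration; counting each peer at most once is also an assumption, not something the text states.

```python
from typing import Dict, Iterable


def credibility_score(vouching_peers: Iterable[str],
                      trust_scores: Dict[str, int]) -> int:
    """Sum the trust scores of the peers who vouched for one attribute.

    Each peer is counted once (assumption), so duplicate entries are ignored.
    """
    return sum(trust_scores[peer] for peer in set(vouching_peers))


# Hypothetical trust scores: certifiers hold 20, confirmed users hold 10.
trust_scores = {"certifier1": 20, "certifier2": 20, "user0x2": 10, "user0x3": 10}

# Three peers vouched for the attribute "Name".
name_credibility = credibility_score(
    ["certifier1", "certifier2", "user0x2"], trust_scores
)
print(name_credibility)
```

Because every vote is weighted by the voter's trust score, a vote from a certifier contributes twice as much as a vote from a confirmed user in this example.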

3.3 Design Goals

In designing this system, we specify the following requirements to be met by the solution:

1. Endpoint addressability: the proposed solution must allow easy addressability and discoverability of the blockchain participants.
2. Correct binding and correctness of PII: ensure reliability and trustworthiness of the Digital ID.
3. Security: the solution must address the possible security threat from identity theft.
4. Fault tolerance: the peer-verification algorithm and the blockchain consensus algorithm should achieve Byzantine fault tolerance.

4 Detailed Design

In this section, we describe the procedure for Digital ID generation on blockchain in the proposed system and the use cases for Digital ID management.

4.1 Connecting to Digital ID Blockchain Network

To connect to the blockchain network, each participant performs the following steps:

1. Runs a role-specific blockchain client on his/her device and creates a user profile by generating a private–public key pair using an Elliptic Curve Digital Signature Algorithm (ECDSA) key generation algorithm [8].
2. Obtains a 160-bit public blockchain address, Ax, from the client. The address is generated by double hashing the public key: the 160-bit RACE Integrity Primitives Evaluation Message Digest (RIPEMD-160) algorithm is applied over the 512-bit hash produced by the Secure Hash Algorithm (SHA-512) on the public key. This double-hashing approach was originally proposed and used by Bitcoin [8].
3. Obtains the current blockchain state, including the user’s trust score linked with Ax, from the client.
4. If the participant’s state is not initialized, which is the case on first-time use, then
(a) if the participant is a primary issuer or a secondary certifier, the maximum allowed trust score, θmax = 20, is saved to the blockchain state associated with Ax. The value θmax = 20 was chosen empirically;
(b) otherwise, a minimum trust score, θ0 = 0, is assigned locally by the client to the general user as the initial trust score.
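Step 4 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' client code: the role labels, the function name, and the use of a single dictionary as a stand-in for both the on-chain state and the client's local state are hypothetical simplifications.

```python
from typing import Dict

THETA_MAX = 20  # maximum trust score for issuers and certifiers (from the text)
THETA_0 = 0     # initial trust score of a general user (from the text)

# Hypothetical role labels used only in this sketch.
PRIVILEGED_ROLES = {"primary_issuer", "secondary_certifier"}


def ensure_initialized(state: Dict[str, int], address: str, role: str) -> int:
    """Return the trust score linked with `address`, initializing it on first use.

    `state` is a plain dict standing in for the stored state (assumption).
    """
    if address not in state:
        state[address] = THETA_MAX if role in PRIVILEGED_ROLES else THETA_0
    return state[address]


state: Dict[str, int] = {}
print(ensure_initialized(state, "addr_certifier1", "secondary_certifier"))
print(ensure_initialized(state, "addr_user0x1", "user"))
```

A second call for an already initialized address leaves the stored score untouched, which mirrors the "first-time use" condition in step 4.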

4.2 Creation of New Digital ID on Blockchain

In the proposed system, the Digital ID is created from a set of personal identification information (PII). Each ID attribute consists of PII along with its processing status, a certificate from the primary certifier, a list of peer verifiers, and the credibility score gathered so far. The Digital ID also contains unique random numbers at both the ID level and the attribute level, generated at the time of ID creation and every time attribute field values are updated. The Digital ID creation flow consists of the following four steps:

• The user raises a new Digital ID request with the primary ID issuer, who has the sole authority to acquire the user’s identification information.
• The primary issuer bootstraps (links) the verified PII on the blockchain after adding a digital signature as the issuer’s certificate. Successful completion of this step sets the trust score of the user, θuser, to 1.
• The user self-verifies the bootstrapped PII in the pre-confirmed ID obtained from the previous step, and the PII then undergoes on-chain peer verification to reach a protocol-specified quorum for each Digital ID attribute.
  – For each attribute in the Digital ID, the user chooses at least four and at most five peers from a previously selected group of peers such that the sum of the trust scores of all chosen peers is at least Qpeer, calculated from Eq. 1:

    \[ Q_{peer} = S \cdot \theta_{max} \tag{1} \]

    where θmax is obtained empirically and S is the super-majority number of votes required to make the peer voting Byzantine fault tolerant, as per the voting-based Practical Byzantine Fault-Tolerant (PBFT) consensus algorithm [9]. S is obtained from Eq. 2:

    \[ S = 2 \lfloor (N - 1)/3 \rfloor + 1 = 2 \lfloor \mathrm{total}_{peer}/3 \rfloor + 1 \tag{2} \]

    where totalpeer is the total number of selected peers and N is the total number of participants in the peer-verification process, including the requesting user, as given by Eq. 3:

    \[ N = \mathrm{total}_{peer} + 1 \tag{3} \]

    With the division in Eq. 2 rounded down to an integer, S = 3 for four to five selected peers, so Qpeer for ID verification, as calculated by Eq. 1, comes out to be 60 using θmax = 20.
  – The peer-verification request transactions are then sent as an atomic batch. In response, each confirming peer attests the requested PII attributes of the user by digitally signing the data with its own private key on the blockchain.
• The user confirms the Digital ID once the required quorum has been attained for each identity attribute in the ID, and the primary issuer then acknowledges the confirmation to finalize it. With a confirmed ID, the trust score of the user (θuser) becomes a maximum of 10.
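The quorum arithmetic of Eqs. 1 to 3 can be checked with a short Python sketch. The function names are hypothetical, and the integer (floor) division is an assumption chosen because it reproduces the stated value Qpeer = 60 for both four and five peers.

```python
from typing import List

THETA_MAX = 20  # maximum trust score, as given in the text


def supermajority(total_peers: int) -> int:
    """S = 2 * floor((N - 1) / 3) + 1 with N = total_peers + 1 (Eqs. 2 and 3)."""
    n = total_peers + 1
    return 2 * ((n - 1) // 3) + 1


def peer_quorum(total_peers: int, theta_max: int = THETA_MAX) -> int:
    """Q_peer = S * theta_max (Eq. 1)."""
    return supermajority(total_peers) * theta_max


def selection_meets_quorum(peer_trust_scores: List[int]) -> bool:
    """Check the peer-count bounds (4 to 5) and the trust-score quorum."""
    count = len(peer_trust_scores)
    if not 4 <= count <= 5:
        return False
    return sum(peer_trust_scores) >= peer_quorum(count)


print(peer_quorum(4), peer_quorum(5))
print(selection_meets_quorum([20, 20, 20, 20]))  # four certifiers
print(selection_meets_quorum([10, 10, 10, 10]))  # four confirmed users
```

In this sketch, four certifiers with trust score 20 each give a maximum achievable vote of 80, which clears the quorum of 60 even if one of them fails to respond. Four peers with trust score 10 each cannot reach the quorum.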

4.3 Sharing of Digital ID on Blockchain and Additional Use Cases

A finalized Digital ID generated using the above steps is valid to be shared for authentication over the blockchain by providing a digital signature on the Digital ID along with a reference to the final transaction carrying the acknowledgement of the confirmed ID. As additional use cases, Digital ID update, invalidation, and recovery operations can be implemented using the proposed framework; these would involve seeking acknowledgement or verification from the primary certifier and a set of user-chosen peers, as in the Digital ID generation process.

4.4 The Consensus Algorithm for Block Generation

We have used the voting-based Practical Byzantine Fault-Tolerant (PBFT) consensus algorithm by Hyperledger-Sawtooth [10] as the preferred mechanism for transaction ordering and new block creation in the blockchain. Using PBFT reduces the cost and time involved in probabilistic leader election-based consensus algorithms [11]. Transaction verification and blockchain state validation algorithms augment the blockchain consensus. The integrity and uniqueness of the Digital ID are maintained by including a random number every time a message is serialized for sending in a transaction and then signing over the Digital ID hash to create certificates.
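The role of the random number in keeping each serialized Digital ID message unique can be sketched as below. This is only an illustration under assumptions: the JSON layout, the 16-byte nonce, and the choice of SHA-512 for the message digest are hypothetical, and the signing step over the digest is omitted.

```python
import hashlib
import json
import secrets


def serialize_with_nonce(attributes: dict) -> bytes:
    """Serialize ID attributes together with a fresh random number (nonce)."""
    payload = {"attributes": attributes, "nonce": secrets.token_hex(16)}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def message_digest(message: bytes) -> str:
    """Hash the serialized message; a certificate would sign this digest."""
    return hashlib.sha512(message).hexdigest()


attributes = {"Name": "user0x1"}
first = message_digest(serialize_with_nonce(attributes))
second = message_digest(serialize_with_nonce(attributes))

# Identical attribute values still yield different digests because of the nonce.
print(first != second)
```

Since the digest changes with every serialization, a certificate created by signing one digest cannot be reused for a later message that carries the same attribute values.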


4.5 Security Analysis

This section analyzes how the proposed Digital ID generation system meets the design goals specified in Sect. 3.3.

Endpoint addressability: the 160-bit blockchain address can identify each of the blockchain participants uniquely and deterministically from the corresponding public key while keeping the participant pseudo-anonymous.

Binding: to ensure correct linking of a user’s personal details to his/her Digital ID, the proposed solution performs linking in two steps:
1. Linking a person with his/her blockchain address: done by offline verification and registration of user identity data by the primary issuer and subsequent peer verification.
2. Linking the blockchain address with the issued digital identity: done by application-specific transaction verification and state validation logic as part of consensus.

Security and integrity: the blockchain consensus, together with various operation-specific condition checks in the clients before any transaction is sent, reduces the chances of impersonation when using a Digital ID on the chain. Data integrity is maintained by the unique message digest of each distinct Digital ID.

Fault tolerance of the Digital ID peer-verification algorithm: as per the requirement of the PBFT consensus algorithm [11], the Digital ID peer-verification algorithm attains Byzantine fault tolerance for at most one faulty node. This has been checked from Eqs. 1 and 2.

5 Implementation

The open-source enterprise blockchain platform Hyperledger-Sawtooth [12] is chosen as the underlying blockchain platform for implementing the proposed Digital ID framework, because of its high modularity, dynamic consensus, and flexibility in permissioning participating nodes. A validator node in a Sawtooth network runs the validator process, which is responsible for validating transaction batches and grouping them into blocks, coordinating communication in the network, and running consensus to finalize the blocks to be included in the chain [13]. The application is coded in Python using Sawtooth’s Python SDK [14]. For our implementation, we use Sawtooth’s command “sawtooth keygen” [15] to generate each participant’s private–public key pair.


Fig. 2 (a) Screenshot of Digital ID request operation for user0x1 with additional attributes “Education” and “Guardian” as highlighted in the image. (b) Screenshot displaying certifier1 filling in Digital ID data and submitting transaction for user0x1 to blockchain

6 Results

In this section, we demonstrate the functionality of the proposed framework with screenshots from a test run of the Digital ID generation process on a single validator node running Sawtooth’s random-leader selection-based Devmode Consensus Engine [13, 16].

Starting the user’s client and sending an ID request: we run the client application using the id_wallet command for user user0x1 to create its very first Digital ID. We request a new Digital ID using the request command, as highlighted in Fig. 2a.

Digital ID issuance by primary issuer: on receiving the request from user0x1, the primary issuer, certifier1, runs the process_request command in its client to fill in the PII for user0x1 and issue a certificate, as displayed in Fig. 2b.

Requesting peer verification: user0x1 runs the peer_verify command to self-verify and send the Digital ID for peer verification. As shown in Fig. 3a, the peer_verify command displays the Digital ID saved in the blockchain state, with its current processing status, ON_VERIFICATION, and the details of the attribute “Name”, as highlighted in the image, for self-verification. Once user0x1 confirms the displayed ID data, the peer-verification process continues with peer selection for individual attributes in order to send peer-verification request transactions. The peer-selection process for the “Name” attribute is shown in Fig. 3b, where the maximum achievable peer vote of 80 (the sum of the trust scores of the selected peers) meets the minimum target peer-quorum requirement of 60, as highlighted in the image.

Confirmation and acknowledgement: the finalized Digital ID after successful peer verification and user confirmation is shown with the display command in Fig. 4.


Fig. 3 (a) Screenshot displaying peer-verification sub-step for user0x1 and (b) Screenshot displaying peer-selection process after successful self-verification of the Digital ID

Fig. 4 Screenshots (a)–(b) displaying the finalized Digital ID of user0x1 with status CONFIRMED

7 Conclusion

In this paper, we have proposed a blockchain-based identity generation and management framework that aims to build a system of trust using both human resources and a permissioned blockchain network. The proposed system leverages protocols in which a specially authorized primary ID issuer verifies the personal data to ensure the correctness of a Digital ID, followed by on-chain self-verification and quorum-based peer verification of the personal data to strengthen the trustworthiness of the issued ID on the blockchain. The peer-verification algorithm introduces resiliency to faulty nodes in the network to improve the federated attestation process. The proposed work can be extended by employing blockchain-based protocols for ensuring privacy and non-traceability of the transactions. Also, an appropriate data compression method can be used to reduce the space requirement.

References

1. Dunphy P, Petitcolas FAP (2018) A first look at identity management schemes on the blockchain. IEEE Secur Privacy 16(4):20–29. https://doi.org/10.1109/msp.2018.3111247
2. Sovrin Foundation (2018) Sovrin: a protocol and token for self-sovereign identity and decentralized trust [White paper]. https://sovrin.org/wp-content/uploads/Sovrin-Protocol-and-Token-White-Paper.pdf. Accessed 8 July 2020
3. Ali M, Nelson J, Blankstein A, Shea R, Freedman MJ (2019) The blockstack decentralized computing network [White paper]. https://blockstack.org/blockstack_usenix16.pdf. Accessed 8 July 2020
4. Ebrahimi A (2020) Identity management service using a blockchain providing certifying transactions between devices (Ping Identity Corp Patent). U.S. Patent No. 10,657,532 B2. U.S. Patent and Trademark Office, Washington, DC
5. uport-project/specs (nd) GitHub. https://github.com/uport-project/specs/blob/develop/README.md. Accessed 8 July 2020
6. Lim SY, Fotsing PT, Almasri A, Musa O, Kiah MLM, Ang TF, Ismail R (2018) Blockchain technology the identity management and authentication service disruptor: a survey. Int J Adv Sci Eng Inf Technol 8(4–2):1735–1745. https://doi.org/10.18517/ijaseit.8.4-2.6838
7. Leimgruber J, Meier A, Backus J (2018) Bloom protocol: decentralized credit scoring powered by Ethereum and IPFS [White paper]. https://bloom.co/whitepaper.pdf. Accessed 8 July 2020
8. Antonopoulos AM (2018) Mastering bitcoin: programming the open blockchain. O’Reilly Media
9. Castro M, Liskov B (1999) Practical Byzantine fault tolerance. OSDI 99:173–186
10. Introduction-Sawtooth PBFT v1.0.0 documentation (nd) Sawtooth.Hyperledger.Org. https://sawtooth.hyperledger.org/docs/pbft/releases/1.0.0/introduction-to-sawtooth-pbft.html. Accessed 8 July 2020
11. Xiao Y, Zhang N, Li J, Lou W, Hou Y (2019) Distributed consensus protocols and algorithms. In: Shetty S, Kamhoua C, Njilla L (eds) Blockchain for distributed systems security, 1st edn. Wiley, pp 25–50
12. Introduction-Sawtooth v1.2.5 documentation (nd) Sawtooth.Hyperledger.Org. https://sawtooth.hyperledger.org/docs/core/releases/latest/introduction.html. Accessed 8 July 2020
13. Architecture Guide-Sawtooth v1.2.5 documentation (nd) Hyperledger.Org. https://sawtooth.hyperledger.org/docs/core/releases/latest/architecture.html. Accessed 8 July 2020
14. Python SDK API Reference-Sawtooth v1.2.5 documentation (nd) Sawtooth.Hyperledger.Org. https://sawtooth.hyperledger.org/docs/core/releases/latest/sdk_python.html. Accessed 8 July 2020
15. hyperledger/sawtooth-core (nd) GitHub. https://github.com/hyperledger/sawtooth-core/blob/master/cli/sawtooth_cli/admin_command/keygen.py. Accessed 8 July 2020
16. Glossary-Sawtooth v1.2.5 documentation (nd) Sawtooth.Hyperledger.Org. https://sawtooth.hyperledger.org/docs/core/releases/latest/glossary.html. Accessed 8 July 2020

HFAIR: Hello Devoid Optimized Version of FAIR Protocol for Mobile Ad hoc Networks

Abu Sufian, Anuradha Banerjee, and Paramartha Dutta

Abstract Mobile ad hoc networks suffer largely from the constraint of battery power. A node continuously senses information about the network by sending HELLO packets and receiving acknowledgment packets. This consumes a large amount of energy at each battery-powered node. This article proposes an optimized HELLO-devoid version of the FAIR protocol that intelligently reduces the need for continuous HELLO packets and, as a result, preserves a large amount of energy, which leads to significant performance improvement in terms of energy consumption, network lifetime, end-to-end delay, and network throughput.

Keywords Energy-efficient routing · MANET · Reactive routing · Simulation study

A. Sufian: Department of Computer Science, University of Gour Banga, Malda, India
A. Banerjee: Department of Computer Application, Kalyani Government Engineering College, Kalyani, India
A. Sufian (corresponding author) · P. Dutta: Department of Computer & System Sciences, Visva-Bharati University, Santiniketan, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. Bhattacharyya et al. (eds.), Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1279, https://doi.org/10.1007/978-981-15-9290-4_5

1 Introduction

A mobile ad hoc network (MANET) is an infrastructure-less network that can be formed by a group of mobile nodes in an ad hoc manner. Nodes are battery powered and may move arbitrarily, which is why the topology is highly dynamic and suffers from frequent route breakage. In addition, each node may also play the role of a router [1, 2]. MANETs are very useful in hostile situations such as natural disasters and war, where a traditional network may not be active [3]. Several routing protocols have been proposed [4]; some of them are proactive, some are reactive, and some are hybrid [5]. Recently, research on energy-efficient routing protocols has become popular [6, 7]. This article presents an energy-efficient routing scheme for MANETs, called HFAIR, that modifies the state-of-the-art routing protocol FAIR [8]. The concept is based on our earlier work, reported as Minus HELLO [9]. The main contributions of the article are:

• An extension of the state-of-the-art FAIR [8] to reduce unnecessary HELLO packets.
• An optimized energy-efficient scheme that combines the benefits of FAIR with the idea of Minus HELLO [9].
• Simulation studies that show the performance improvement of HFAIR over AODV [10], FAIR [8], and MMBCR [11].

The rest of the article is organized as follows: Sect. 2 describes the background of the study, and Sect. 3 presents the methods of the proposed HFAIR. Simulation studies are discussed in Sect. 4, and Sect. 5 concludes the article.

2 Background

In general, in every reactive routing protocol for MANETs, all nodes must broadcast a HELLO packet at regular intervals to the nodes that fall within their respective radio-ranges [12]. Typical attributes of a HELLO packet are as follows: (i) message type id (1), (ii) sender id (ni), (iii) sender location (xi(t), yi(t)), (iv) radio-range (rad(i)), and (v) current timestamp (t). In reply to each HELLO packet, an acknowledgment (ACK) packet has to be sent back to that node from each downlink neighbor node. The format of an ACK packet is the same as that of a HELLO packet. These continuous network-maintenance procedures burn a large amount of residual energy, especially because HELLO and ACK packets are transferred periodically [13]. However, these packets are not always required, since the information they contain does not change frequently. HFAIR builds on the observation that this information is not relevant until a request for communication is received from the node. In HFAIR, information about neighbor nodes is gathered during the flooding of the Route Request (RREQ). Eliminating unnecessary HELLO and ACK packets saves a large amount of residual energy in MANETs, which increases the average lifetime of nodes. As a result, frequent link breakages are reduced, which in turn reduces energy consumption, increases the active lifetime of the network, decreases the average source-to-destination delay per session, and increases the throughput of the network. The concept behind this method is not old; we first reported it in our earlier work [9]. Therefore, no directly related work was found except that work.
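The HELLO packet attributes listed above, and the scale of the periodic traffic they generate, can be illustrated with a small Python sketch. The class and field names are hypothetical, and the count ignores ACK replies, whose number depends on how many neighbors each node has.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelloPacket:
    """Typical HELLO attributes (i) to (v); field names are illustrative."""
    message_type: int   # 1 identifies a HELLO packet
    sender_id: str      # n_i
    x: float            # sender location x_i(t)
    y: float            # sender location y_i(t)
    radio_range: float  # rad(i)
    timestamp: float    # t


def hello_broadcasts(num_nodes: int, duration_ms: int, interval_ms: int) -> int:
    """Number of HELLO broadcasts when every node sends one per interval."""
    return num_nodes * (duration_ms // interval_ms)


hello = HelloPacket(1, "ni", 120.0, 75.0, 80.0, 4.0)
print(hello)

# 20 nodes, one second of operation, one HELLO every 10 ms.
print(hello_broadcasts(num_nodes=20, duration_ms=1000, interval_ms=10))
```

Even this small configuration produces thousands of control broadcasts per second before any ACK is counted, which is the overhead a HELLO-devoid scheme aims to remove.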


3 Methods of Proposed HFAIR

In this section, we discuss HFAIR, a Minus HELLO embedded FAIR [8] protocol that follows the same principle as our earlier work [9].

3.1 Route Discovery

Limited-area flooding, or directional flooding, is applied in FAIR [8], provided that a recent location of the destination node is known. This specifies a broadcast predictive region that depends on the last known location of the destination, the corresponding timestamp, and the maximum possible velocity of the destination. For example, consider Fig. 1, where node ni is a router that knows the location of the destination node nd at timestamp tm. The total time-to-live of an RREQ packet is TTL. Let us assume that the RREQ packet started its journey from source node ns at time ts and arrived at node ni at time t. The packet can therefore remain alive for a maximum time of {TTL − (t − ts)}. If the maximum possible velocity of the destination node nd is vmd, then during the remaining lifetime of the RREQ packet, the destination can travel a distance of at most [vmd × {TTL − (t − ts)}]. The circle of this radius around the last known location of nd is the embedding circle. The broadcast predictive region is bounded by the two tangents from node ni to this embedding circle, and it includes the part of the embedding circle that lies outside the region bounded by those two tangents, as shown in Fig. 1. HFAIR retains this technique because it has nothing to do with the HELLO packet. Location information of the destination may be known from an earlier RREQ generated by the same or another source and destined to the same node.

In classical FAIR, each node broadcasts the HELLO packet within its radio-circle, announcing its unique identifier, radio-range, location, velocity, and residual energy. The ACK packet contains similar information about its sender. Whenever an RREQ arrives at node nj from node ni, node nj evaluates the connectivity of its link with node ni and appends the result to the RREQ packet. HFAIR changes this mechanism so that whenever node ni broadcasts an RREQ, it includes its own node-id, radio-range, location, velocity, and residual energy in the packet.
As soon as node nj receives the RREQ, it checks whether the same packet has been received earlier. If so, a loop is found and the RREQ packet is discarded. Otherwise, node nj stores information about that RREQ in its route-request table. The attributes of this table are similar to those in our previous work [9]. Node nj is well aware of its own attributes. Based on the velocity, location, radio-range, and residual energy of node ni, node nj computes the residual energy quotient, neighbor affinity, radio-quotient, and weight of the link from node ni to node nj, and combines them with the route performance obtained so far, starting from source node ns, using a fuzzy controller as in FAIR. This combined performance information up to node nj replaces the performance information up to node ni in the RREQ packet, as in classical FAIR. Then, HFAIR instructs nj to replace the radio-range, location, velocity, and residual energy information of ni with those of nj, so that the successor node nk of node nj in the RREQ propagation can compute the residual energy quotient, neighbor affinity, etc., of its link with nj.

Fig. 1 Directional flooding

Fig. 2 RREQ generated by ns

Fig. 3 RREQ forwarded by ni

Figures 2, 3, and 4 show the RREQ generated by ns and forwarded by node ni and node nj, respectively. Assume that node ns generated the RREQ at timestamp 1, and that it is forwarded by node ni at timestamp 4 and by node nj at timestamp 9. Here aij(t), bij(t), cij(t), and dij(t) denote the residual energy quotient, neighbor affinity, radio-quotient, and weight of the link from node ni to node nj at time t. Similarly, ei(t), ri, and vi(t) specify the residual energy (battery power) of ni at time t, its radio-range, and its velocity at the same time, respectively. The initiator node is ns (the current RREQ is a fresh one), so the difference between the maximum hop count of the current RREQ and the maximum possible hop count is 0, which is reflected in the last field of the RREQ packet. Currently, session no. 3 is going to be established between source ns and destination nd, and five data packets are to be transmitted. f(s, i) is the performance of the path from ns to ni, obtained as the output of a fuzzy controller embedded in all nodes in classical FAIR as well as in HFAIR. In the RREQ generated by ns, no path performance has been computed yet, so the 11th field of the RREQ in Fig. 2 is null. The 12th field is also null because no router has yet been appended to the RREQ. Appending consecutive router-ids is done in the classical version of FAIR too. The current timestamp is 1 for the generation of the RREQ by ns (13th field), and the 14th field is 0. This field is not changed by subsequent routers unless a link breakage is detected.
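The broadcast predictive region used for directional flooding in this section can be sketched geometrically as follows. This is a hedged illustration rather than the FAIR implementation: the function names and the tuple representation of positions are hypothetical, and the region is modeled as the triangle between ni and the two tangent points together with the embedding circle.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def predictive_radius(v_md: float, ttl: float, t: float, t_s: float) -> float:
    """Radius v_md * (TTL - (t - t_s)); zero once the RREQ has expired."""
    return v_md * max(0.0, ttl - (t - t_s))


def in_predictive_region(p: Point, n_i: Point, n_d: Point, radius: float) -> bool:
    """True if p lies in the embedding circle or between the two tangents."""
    # Case 1: p is inside the embedding circle around the last known n_d.
    if math.hypot(p[0] - n_d[0], p[1] - n_d[1]) <= radius:
        return True
    dx, dy = n_d[0] - n_i[0], n_d[1] - n_i[1]
    d = math.hypot(dx, dy)
    if d <= radius:
        return False  # n_i is inside the circle, so no tangents exist
    # Case 2: p is inside the triangle formed by n_i and the tangent points.
    half_angle = math.asin(radius / d)
    tangent_len = math.sqrt(d * d - radius * radius)
    px, py = p[0] - n_i[0], p[1] - n_i[1]
    axial = (px * dx + py * dy) / d         # projection on the axis n_i -> n_d
    lateral = abs(px * dy - py * dx) / d    # distance from that axis
    return (0.0 <= axial <= tangent_len * math.cos(half_angle)
            and lateral <= axial * math.tan(half_angle))


# RREQ issued at t_s = 1, received at t = 3, TTL = 5, v_md = 10 m/s.
r = predictive_radius(v_md=10.0, ttl=5.0, t=3.0, t_s=1.0)
print(r)
print(in_predictive_region((50.0, 5.0), (0.0, 0.0), (100.0, 0.0), r))
print(in_predictive_region((50.0, 40.0), (0.0, 0.0), (100.0, 0.0), r))
```

A node at (50, 5) lies between the tangents and would take part in the flooding, while a node at (50, 40) lies outside the region and would not rebroadcast the RREQ.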


Fig. 4 RREQ forwarded by nj

3.2 Loop Detection

On receiving an RREQ, a node first reads the maximum allowable hop count mentioned in the RREQ. If the hop count of the packet is below that value, the node searches its RREQ table to check whether the same RREQ packet arrived earlier; if it is found, the packet is dropped. However, if the session-id of the incoming RREQ is more recent, the previous entry is replaced by the latest one. If the same packet did not arrive earlier, it is added to the table.
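The loop-detection rule, together with the per-hop replacement of sender attributes described in Sect. 3.1, can be sketched as below. This is a simplified illustration under assumptions: the field names, the table keyed by source and destination, and the dropping of older sessions are hypothetical choices, and the fuzzy-controller update of the path performance is omitted.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Rreq:
    source: str
    destination: str
    session_id: int
    hop_count: int
    max_hops: int
    sender_id: str                       # attributes of the last forwarder
    sender_location: Tuple[float, float]
    sender_radio_range: float
    sender_velocity: float
    sender_residual_energy: float
    routers: Tuple[str, ...] = ()


@dataclass
class Node:
    node_id: str
    location: Tuple[float, float]
    radio_range: float
    velocity: float
    residual_energy: float


def handle_rreq(node: Node, table: Dict[Tuple[str, str], int],
                rreq: Rreq) -> Optional[Rreq]:
    """Return the RREQ to forward, or None if the packet is dropped."""
    if rreq.hop_count >= rreq.max_hops:
        return None                      # hop budget exhausted
    key = (rreq.source, rreq.destination)
    seen = table.get(key)
    if seen is not None and rreq.session_id <= seen:
        return None                      # same (or older) RREQ seen: drop
    table[key] = rreq.session_id         # add, or replace with latest session
    # Replace the previous sender's attributes with this node's own.
    return replace(rreq, hop_count=rreq.hop_count + 1,
                   sender_id=node.node_id, sender_location=node.location,
                   sender_radio_range=node.radio_range,
                   sender_velocity=node.velocity,
                   sender_residual_energy=node.residual_energy,
                   routers=rreq.routers + (node.node_id,))


nj = Node("nj", (10.0, 5.0), 80.0, 12.0, 7.5)
table: Dict[Tuple[str, str], int] = {}
incoming = Rreq("ns", "nd", 3, 0, 10, "ni", (0.0, 0.0), 60.0, 15.0, 8.0)
print(handle_rreq(nj, table, incoming))   # forwarded with nj's attributes
print(handle_rreq(nj, table, incoming))   # duplicate: dropped
```

The forwarded packet carries nj's own radio-range, location, velocity, and residual energy, so the next receiver can evaluate its link with nj without any HELLO exchange.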

3.3 Route Maintenance

In classical FAIR, link breakage is detected through the HELLO packet, whereas in HFAIR, it is detected as in our earlier work [9].

3.4 Comparing Sizes of Various Messages in FAIR and HFAIR

The information sent in the RREQ packet of HFAIR is the same as the information sent in one HELLO packet in classical FAIR [8]. The analysis of the link-fail, repair-request, and repair-permission messages is the same as in our earlier work [9].

4 Simulation Results

The simulation experiments are done using the NS-2 [15] network simulator. The simulation environment is given in Table 1. The proposed protocol HFAIR is compared with three state-of-the-art protocols: AODV [10], FAIR [8], and MMBCR [11]. The performance metrics are energy consumption, network lifetime, average end-to-end delay, and network throughput. Results are shown in Figs. 5, 6, 7, 8, 9, 10, 11, and 12.

Table 1 Simulation framework

Parameter                                        Value
Area of the network                              500 × 500 m²
Traffic type                                     CBR (constant bit rate)
Size of each packet                              512 bytes
Interval of two consecutive HELLO packets        10 ms
Mobility model                                   Random waypoint [14]
Mobility of node                                 10–30 m/s
Frequency of the signals                         2.40 GHz
Channel bandwidth                                2 Mbps
Packet transmission power                        300–600 mW
Packet receiving power                           50–300 mW
Radio-range                                      50–100 m
Initial residual power of nodes                  5–10 J
Number of nodes considered                       20, 40, 60, 70, 80, 90, 100

Fig. 5 Energy consumption with different number of nodes

Fig. 6 Network lifetime with different number of nodes

Fig. 7 End-to-end delay (in seconds) per session with different number of nodes

Fig. 8 Network throughput with different number of nodes

Fig. 9 Energy consumption with different packet loads

Fig. 10 Network lifetime with different packet loads

Fig. 11 End-to-end delay per session with different packet loads


Fig. 12 Network throughput with different packet loads

4.1 Battery Power (Energy) Consumption

Compared to the state-of-the-art protocols AODV, FAIR, and MMBCR, energy consumption at each node is greatly reduced in HFAIR. As described in this article, nominal changes to the structure of the RREQ packet make HELLO packets unnecessary, especially in reactive, stability-oriented, and energy-aware routing protocols. In addition, route maintenance in HFAIR is performed in a way that consumes less energy than in those protocols. Whenever a router node on a live communication path detects a link breakage, HFAIR exploits the fact that the distance from that router to the destination is significantly smaller than the distance between the source and destination nodes. Therefore, HFAIR injects RREQ packets only into that portion of the network, which largely saves energy, as depicted in Figs. 5 and 9.

4.2 Lifetime of Networks

As energy consumption increases, more nodes run out of energy quickly, and as a result, network lifetime decreases. As depicted in Figs. 6 and 10, the lifetime of networks increases with an increasing number of nodes. When the number of nodes is fixed and the packet transfer load is varied, the lifetime of networks decreases with increasing load.

4.3 End-to-End Delay

The route re-discovery process increases the overall end-to-end delay, because frequent route breakage increases the flow of RREQ packets for route re-discovery, which delays data packets. By reducing such breakages, HFAIR reduces the end-to-end delay. Figures 7 and 11 show the reduction of end-to-end delay for HFAIR compared to the other three protocols.


4.4 Throughput of Networks

The throughput of networks also depends largely on route re-discovery. An increase in RREQ packets causes a large amount of packet contention and collision in the network. As a result, throughput, along with other performance metrics, degrades. It is evident from Figs. 8 and 12 that HFAIR gives better simulated results. The initial improvement for all protocols is due to more stable network connectivity; on the other hand, once the network becomes dense, the throughput starts falling. Throughput also decreases with an increase in packet load.

5 Conclusion

Reducing the energy required for maintaining network connectivity in MANETs is very effective, since most performance metrics depend directly or indirectly on the energy efficiency of nodes. Reducing network control packets such as the HELLO packet lowers energy consumption significantly. HFAIR is a technique proposed for this purpose. It is an optimized version of the state-of-the-art protocol FAIR that eliminates unnecessary HELLO packets. This reduction saves a large amount of residual energy, and as a result, performance improvement is observed in the simulation studies.

References 1. Chlamtac I, Conti M, Liu JJ-N (2003) Mobile ad hoc networking: imperatives and challenges. Ad hoc Netw 1(1):13–64 2. Boukerche A (2004) Performance evaluation of routing protocols for ad hoc wireless networks. Mob Netw Appl 9(4):333–342 3. Basagni S, Conti M, Giordano S, Stojmenovic I (2013) Mobile ad hoc networking: cutting edge directions, vol 35. Wiley 4. Moussaoui A, Boukeream A (2015) A survey of routing protocols based on link-stability in mobile ad hoc networks. J Netw Comput Appl 47:1–10 5. Quy VK, Ban NT, Nam VH, Tuan DM, Han ND (2019) Survey of recent routing metrics and protocols for mobile ad-hoc networks. J Commun 14(2):110–120 6. Chawda K, Gorana D (2015) A survey of energy efficient routing protocol in manet. In: 2015 2nd International conference on electronics and communication systems (ICECS). IEEE, pp 953– 957 7. Sufian A, Banerjee A, Dutta P (Apr 2019) Energy and velocity based tree multicast routing in mobile ad-hoc networks. Wirel Pers Commun 107:2191–2209 8. Banerjee A, Dutta P (2010) Fuzzy-controlled adaptive and intelligent route selection (fair) in ad hoc networks. Eur J Sci Res 45(3):367–382 9. Banerjee A, Sufian A, Dutta P (2019) Minus hello: Minus hello embedded protocols for energy preservation in mobile ad-hoc networks. arXiv:1910.11916 10. Perkins CE, Royer EM (1999) Ad-hoc on-demand distance vector routing. In: Proceedings WMCSA’99. Second IEEE workshop on mobile computing systems and applications. IEEE, pp 90–100


11. Toh CK (2001) Maximum battery life routing to support ubiquitous mobile computing in wireless ad hoc networks. IEEE Commun Mag 138–145 12. Patel DN, Patel SB, Kothadiya HR, Jethwa PD, Jhaveri RH (2014) A survey of reactive routing protocols in manet. In: International conference on information communication and embedded systems (ICICES2014). IEEE, pp 1–6 13. Xiao H, Ibrahim DM, Christianson B (2014) Energy consumption in mobile ad hoc networks. In: 2014 IEEE Wireless communications and networking conference (WCNC). IEEE, pp 2599– 2604 14. Bettstetter C, Resta G, Santi P (2003) The node distribution of the random waypoint mobility model for wireless ad hoc networks. IEEE Trans Mob Comput 2(3):257–269 15. Issariyakul T, Hossain E (2010) Introduction to Network Simulator NS2, 1st edn. Springer Publishing Company, Incorporated

Performance Evaluation of Language Identification on Emotional Speech Corpus of Three Indian Languages

Joyanta Basu and Swanirbhar Majumder

Abstract This paper describes the performance evaluation of spoken language identification (SLID) on emotional speech data of three Indian languages, namely Assamese (AS), Bengali (BN), and Santali (SA). A speech corpus containing six basic human emotions (anger, fear, happy, sad, surprise, and neutral) has been created and used for the experiments. Different experiments are carried out to build SLID models for evaluation. A spectral feature, shifted delta cepstral (SDC), is explored to investigate the presence of language-specific information in emotional data. Support vector machine (SVM) and Gaussian mixture model (GMM)-based models are developed to represent the language-specific information captured through the spectral features. In addition, to build modern SLID systems, i-vectors, time delay neural networks (TDNN), and recurrent neural networks with long short-term memory (LSTM-RNN) are considered. For the evaluation of the different emotional speech utterances, equal error rate (EER) and average cost function (\(C_{avg}\)) are used as the performance metrics of the SLID system. The evaluation on emotional speech utterances indicates that the EER and \(C_{avg}\) of the SLID systems for neutral, fear, and sad are better than for the anger, happiness, and surprise emotions.

Keywords Emotional speech corpus · Spoken language identification (SLID) · Shifted delta cepstral (SDC) · Gaussian mixture model · Support vector machine · i-vectors · Linear discriminant analysis · Probabilistic linear discriminant analysis · Time delay neural networks · Recurrent neural network · Long short-term memory

J. Basu (B) CDAC Kolkata, Salt Lake, Sector - V, Kolkata 700091, India e-mail: [email protected] S. Majumder Department of Information Technology, Tripura University, Suryamaninagar, Agartala 799022, Tripura, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Bhattacharyya et al. (eds.), Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1279, https://doi.org/10.1007/978-981-15-9290-4_6


1 Introduction

SLID is the task of automatically identifying the language spoken in a speech sample, usually under the assumption that a single language is present [1]. India has around 1576 mother tongues, of which 114 are major languages; they come from five language families, namely Indo-European, Dravidian, Austro-Asiatic, Tibeto-Burmese, and Semito-Hamitic [2]. Due to globalization and the interaction of human populations, multilingual systems have become a necessity for people of different linguistic backgrounds [3]. Hence, SLID from natural, emotional communication plays an important role in various real-life applications, including spoken document retrieval, multilingual communication systems [4], multilingual speech recognition [5], and spoken language translation. For this reason, interest in SLID has increased. Recently, SLID challenges [6] have been gaining importance, and their outcomes are providing better solutions.

Most modern SLID methods are based on discriminative language modeling, such as the i-vector model [7–9], as in speaker recognition (SR) systems [8] and automatic speech recognition (ASR). In [10, 11], the authors investigated different low-resource eastern and northeastern languages of India using language-specific acoustic features as well as traditional spectral features and classifiers. The x-vector-based model [12] and the phonetic temporal neural (PTN) model [7], both built on deep neural networks (DNN), are gaining popularity among researchers. Recently, DNN-based SLID models have been observed to give better results than the i-vector model based on the GMM-UBM (universal background model) framework [13]. Because they process their input sequentially, LSTM-RNN networks have shown improvements in performance over DNN. Another variation of DNN is TDNN. Going deeper from the first layer, every TDNN layer operates at a different, incrementally increasing temporal resolution. In ASR, TDNN is used for acoustic modeling [14]. Recently, the DNN-based x-vector has been introduced [12] for SLID as well as SR research.

In [15], the researchers showed that SLID performance degrades due to dialectal variations within a language family and short-duration speech utterances. Some attempts have been made to study SLID system performance on telephone narrowband speech and noise [8]. The studies [16–18] showed that SR and SLID systems perform better on neutral speech data than on speech in other emotional states.

2 Motivation

In this paper, the authors work on SLID for an emotional speech corpus of three languages, namely AS, BN, and SA. Few attempts have been reported on evaluating the performance of SLID systems across different human emotional states.

Table 1 Speech corpus distribution of speakers

| Language name | Emotional corpus: Male | Emotional corpus: Female | Emotional corpus: Total | Emotional corpus: Duration (in hours) | Readout corpus: Male | Readout corpus: Female | Readout corpus: Total | Readout corpus: Duration (in hours) |
|---|---|---|---|---|---|---|---|---|
| AS | 3 | 2 | 5 | 5.50 | 9 | 8 | 17 | 9.86 |
| BN | 5 | 4 | 9 | 8.10 | 14 | 11 | 25 | 10.00 |
| SA | 4 | 3 | 7 | 6.30 | 13 | 9 | 22 | 9.90 |

All these facts motivated the authors to investigate the performance of SLID systems under different speaker emotional states for the above-mentioned Indian languages. This work investigates spoken language identification from emotional audio data (anger, fear, happy, sad, surprise, and neutral) and studies SLID performance in detail using different state-of-the-art techniques. The originality of this paper lies in evaluating SLID performance on the developed readout and emotional corpora of the three Indian languages and then identifying an unknown language using an appropriate model.

3 About Emotional Speech Database

For the emotional speech corpus, 139 sentences are created in six different emotions, namely anger, fear, happy, sad, surprise, and neutral. Similarly, a readout speech corpus is created without emphasizing the emotional state of the speakers. Both corpora are collected from native speakers from different eastern and northeastern parts of India within the age group of 15–55 years. Table 1 shows the speaker-wise distribution of the collected speech corpora. Both corpora are recorded in home, office, and studio environments in 16-bit, 22.05 kHz digitized format.

4 Experimental Setup

For this work, the authors use the open-source Kaldi toolkit [19] to extract features and build classification models.

4.1 Experimental Data

The distribution of the corpus for training and testing of the SLID system is shown in Table 2. In total, eight sets of utterances from the three considered languages are prepared for the experiments: a training set, a test set (from the readout corpus), test neutral, test anger, test fear, test happy, test sad, and test surprise. The training and test sets are designed so that no linguistic content or speaker is shared between them.

Table 2 Details of the database for analysis

| Corpus type | Purpose | Assamese (in hours) | Bengali (in hours) | Santali (in hours) | Total (in hours) |
|---|---|---|---|---|---|
| Readout speech corpus | Train | 7.89 | 8.00 | 7.92 | 23.81 |
| Readout speech corpus | Test | 1.97 | 2.00 | 1.98 | 5.95 |
| Emotional speech corpus | Test neutral | 0.92 | 1.26 | 1.10 | 3.28 |
| Emotional speech corpus | Test anger | 0.87 | 0.98 | 0.89 | 2.74 |
| Emotional speech corpus | Test fear | 0.95 | 1.47 | 1.20 | 3.62 |
| Emotional speech corpus | Test happy | 0.85 | 1.32 | 0.84 | 3.01 |
| Emotional speech corpus | Test sad | 0.97 | 1.67 | 1.10 | 3.74 |
| Emotional speech corpus | Test surprise | 0.94 | 1.40 | 1.17 | 3.51 |

4.2 Baseline SLID System

MFCC features are derived from speech frames of 25 ms with an overlap of 10 ms, using 24 filter bands placed on the mel scale. The parameter specification of the extracted SDC feature is 7-1-3-7. The SDC features are post-processed using the feature mapping technique. In this study, the authors use 56 coefficients per frame, formed by appending 7 MFCC and 49 SDC coefficients. Non-speech frames are then removed using energy-based voice activity detection (VAD).

GMM [20] and SVM [21] are used to build the traditional language models and to evaluate the test utterances. The GMM models are built with 1024 mixtures, and the SVM models use a radial basis function (RBF) kernel. The 400-dimensional i-vectors [8, 9] are extracted using 2048 Gaussian mixtures. LDA [22] and PLDA [23] transforms are applied; the dimension of the i-vectors is 150 after LDA reduction.

As baseline DNN systems, two different approaches are used, namely TDNN [24] and LSTM-RNN [25]. For both, the 56-dimensional SDC feature vectors are used as input. The TDNN has six hidden layers. No splicing is used for the second and the last hidden layers; this configuration was decided empirically. Each layer contains 650 units, and the activation function is the rectified linear unit (ReLU). As in the first layer of the TDNN architecture, an affine transformation layer with a sequential context is used as the first layer of the LSTM network. It is followed by a stacked long short-term memory layer with a total of 512 cells.
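The 7-1-3-7 SDC configuration and the 56-coefficient frame vector can be illustrated with a minimal NumPy sketch. It assumes the usual N-d-P-k reading of the four numbers (N = 7 cepstra, delta spread d = 1, block shift P = 3, k = 7 blocks). The function names `sdc` and `sdc_features` and the edge-padding rule are illustrative assumptions; they are not taken from the paper or from Kaldi.

```python
import numpy as np

def sdc(cepstra, d=1, p=3, k=7):
    """Shifted delta cepstra for a (frames x N) matrix of cepstral coefficients.

    Block i of frame t is c(t + i*p + d) - c(t + i*p - d), for i = 0..k-1.
    """
    n_frames = cepstra.shape[0]
    # Assumption: frames beyond the utterance edges repeat the edge frame.
    padded = np.pad(cepstra, ((d, (k - 1) * p + d), (0, 0)), mode="edge")
    blocks = []
    for i in range(k):
        ahead = padded[i * p + 2 * d : i * p + 2 * d + n_frames]
        behind = padded[i * p : i * p + n_frames]
        blocks.append(ahead - behind)
    return np.hstack(blocks)  # shape: (frames, N * k)

def sdc_features(mfcc):
    """Append the static MFCCs and their SDC blocks: 7 + 7 * 7 = 56 per frame."""
    return np.hstack([mfcc, sdc(mfcc)])
```

With a (frames × 7) MFCC matrix as input, `sdc_features` returns a (frames × 56) matrix: the 7 static coefficients followed by 7 delta blocks of 7 coefficients each.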


4.3 Evaluation Metrics

As specified in the evaluation plan of the 2009 language recognition evaluation [26], \(C_{avg}\) is used as the error measure to assess the competence of the SLID system. Another parameter for evaluating the performance of the SLID system is the EER. The false acceptance rate (FAR) and false rejection rate (FRR) are used to determine the EER threshold: the equal error rate is the error rate at the threshold where FAR and FRR are equal. In the current challenge programs [6, 7], these primary metrics are used for evaluating SLID systems. The calculation of \(C_{avg}\) is based on the pairwise loss between the target class and the non-target classes, which is calculated as:

\[ C(L_t, L_n) = P_{Target}\, P_{Miss}(L_t) + (1 - P_{Target})\, P_{FA}(L_t, L_n) \tag{1} \]

where \(L_t\) is the target language class and \(L_n\) is a non-target language class. \(P_{Target}\) is the prior probability of the target language; it is set to 0.5 during evaluation. \(C_{avg}\) is the average of the pairwise losses:

\[ C_{avg} = \frac{1}{N} \sum_{L_t} \sum_{L_n} C(L_t, L_n) \tag{2} \]

where \(N\) is the number of languages.
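As a rough illustration of the two metrics, the following sketch computes an EER from target and non-target scores and a \(C_{avg}\) value from per-language miss and false-alarm rates. It follows Eqs. (1) and (2) literally. The function names, the threshold sweep over observed scores, and the input layout (`p_fa[t][n]` as the false-alarm rate for target language t and non-target language n) are assumptions made for this example. The NIST plan [26] additionally spreads the non-target prior over the N − 1 non-target languages, which is not done here.

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """EER estimated at the observed threshold where FAR and FRR are closest."""
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    best_gap, eer = np.inf, 1.0
    for thr in np.sort(np.concatenate([target, nontarget])):
        far = np.mean(nontarget >= thr)  # non-targets accepted
        frr = np.mean(target < thr)      # targets rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return float(eer)

def pairwise_loss(p_miss_t, p_fa_tn, p_target=0.5):
    """Eq. (1): loss for one (target, non-target) language pair."""
    return p_target * p_miss_t + (1.0 - p_target) * p_fa_tn

def c_avg(p_miss, p_fa, p_target=0.5):
    """Eq. (2): pairwise losses summed over all pairs and divided by N."""
    n = len(p_miss)
    total = 0.0
    for t in range(n):
        for k in range(n):
            if k != t:
                total += pairwise_loss(p_miss[t], p_fa[t][k], p_target)
    return total / n
```

In practice the EER is usually interpolated between neighbouring thresholds; the sweep above is the simplest variant that works on a finite set of scores.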

5 Results and Discussion

The baseline performance of the SLID system on the emotional speech corpus as well as the readout speech corpus is given in Table 3. The performance on the readout test corpus is always better than on any emotional test corpus. This is because a similar corpus is used for training and testing, so other variations are limited. In real-life situations this cannot be guaranteed, because an SLID system may sometimes need to evaluate emotional speech from users. The TDNN and LSTM-RNN systems perform better than the different variations of the i-vector system as well as the traditional GMM and SVM systems. The TDNN model performs better than the other models, with an EER of 7.67% using the SDC feature. Test neutral, test sad, and test fear perform better in almost all cases than the test anger, test happy, and test surprise corpora. This is because speakers sometimes express their emotions (anger, happy, and sad) in ways that differ from their normal speaking style. Moreover, most of these scenarios were not captured in the training data. Figure 1 shows the performance of the SLID system on short-duration test samples (1/3/10/30 s) for the TDNN model only. The performance of the system deteriorates as the test sample becomes shorter. In all cases with a test duration of 1 s, poor results are obtained, and for surprise, the EER is

Table 3 Performance evaluation of SLID on speech corpus of three Indian languages (EER in %)

| Type of data | GMM EER | GMM C avg | SVM EER | SVM C avg | i-vector EER | i-vector C avg | i-vector + LDA EER | i-vector + LDA C avg | i-vector + PLDA EER | i-vector + PLDA C avg | TDNN EER | TDNN C avg | LSTM-RNN EER | LSTM-RNN C avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Test | 16.45 | 0.1589 | 17.59 | 0.1756 | 13.59 | 0.1361 | 12.69 | 0.1267 | 11.56 | 0.1165 | 7.67 | 0.0729 | 9.56 | 0.0951 |
| Test neutral | 18.21 | 0.1829 | 19.52 | 0.1932 | 15.39 | 0.1486 | 13.54 | 0.1361 | 12.78 | 0.1378 | 9.53 | 0.0967 | 10.49 | 0.1172 |
| Test anger | 31.79 | 0.3068 | 33.63 | 0.3367 | 29.39 | 0.3398 | 29.76 | 0.2929 | 29.61 | 0.3156 | 25.65 | 0.2683 | 26.73 | 0.2671 |
| Test fear | 28.67 | 0.2865 | 30.58 | 0.3018 | 24.67 | 0.2139 | 22.59 | 0.2182 | 21.43 | 0.2039 | 19.49 | 0.1843 | 20.53 | 0.2051 |
| Test happy | 30.95 | 0.3067 | 34.42 | 0.3441 | 25.78 | 0.2576 | 24.51 | 0.2678 | 23.73 | 0.2372 | 21.39 | 0.2142 | 21.67 | 0.2329 |
| Test sad | 29.27 | 0.2926 | 31.28 | 0.2934 | 23.66 | 0.2587 | 21.61 | 0.2429 | 20.28 | 0.2028 | 17.71 | 0.1783 | 18.67 | 0.1949 |
| Test surprise | 32.75 | 0.3289 | 33.71 | 0.3389 | 30.71 | 0.2832 | 27.53 | 0.2732 | 25.67 | 0.2467 | 20.54 | 0.1956 | 21.48 | 0.2198 |

Fig. 1 Effect of short speech utterance on SLID evaluation

38.32%. Interestingly, the test and test neutral cases achieve good EER in almost all short-duration cases, where an EER of 7.78% is obtained. The performance of traditional classifiers like GMM and SVM is encouraging. Another experiment was carried out by combining the readout speech corpus and the emotional speech corpus. After fusing the corpora, 80% of the data was used for training and 20% for testing in the different categories. The SLID system performance after corpus fusion is shown in Fig. 2. Since the baseline results show that the TDNN and LSTM-RNN-based systems provide better performance, only these two approaches are used to build the SLID system in this experiment. The figure shows that the SLID system performance improves for the emotional test databases. In all cases, the TDNN-based model outperforms the LSTM-RNN model and achieves an EER of 6.74%. An improvement of around 14% in EER is observed using the TDNN system.

Fig. 2 Performance evaluation of SLID using corpus fusion technique

6 Conclusion

This paper addressed the performance evaluation of state-of-the-art SLID systems on an emotional speech corpus as well as a readout corpus. An emotional speech corpus of six basic emotions was created and evaluated using the developed models. This study used traditional systems such as GMM and SVM models as well as current techniques such as i-vectors with linear discriminant analysis and probabilistic linear discriminant analysis, time delay neural networks, and recurrent neural networks with long short-term memory architectures to build the SLID systems. The results show that TDNN and LSTM-RNN perform better than the other models, and that performance is better for the neutral, sad, and fear emotions than for the other emotions. An EER of 7.67% is achieved using the TDNN model. The performance on shorter speech samples is also reported, with some encouraging results. After applying the speech corpus fusion technique, the SLID system achieves better performance, with around 14% improvement in EER using the TDNN system. With corpus fusion, the TDNN system outperforms the LSTM-RNN system for SLID. This result should encourage researchers to work in this area. However, normalization of acoustic features may need to be studied to improve SLID performance. In future work, different adaptation and fusion techniques will be explored to improve the performance of SLID for different speakers’ emotional states.

Acknowledgements The authors are thankful to all native speakers for providing the emotional audio data for building the speech corpus. The authors would like to acknowledge the support of the Centre for Development of Advanced Computing, Kolkata, India to carry out this research initiative.

References

1. Dehak N, Torres-Carrasquillo PA, Reynolds D, Dehak R (2011) Language recognition via i-vectors and dimensionality reduction. In: INTERSPEECH, pp 857–860
2. MHRD (2016) Language education. Department of Higher Education, MHRD, Govt. of India [Online]. https://mhrd.gov.in/language-education
3. Waibel A, Geutner P, Tomokiyo LM, Schultz T, Woszczyna M (2000) Multilinguality in speech and spoken language systems. Proc IEEE 88(8):1297–1313
4. Nasution AH, Syafitri N, Setiawan PR, Suryani D (2017) Pivot-based hybrid machine translation to support multilingual communication. In: 2017 international conference on culture and computing (culture and computing), pp 147–148
5. Toshniwal S et al (2018) Multilingual speech recognition with a single end-to-end model. In: 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 4904–4908
6. Sadjadi SO et al (2018) Performance analysis of the 2017 NIST language recognition evaluation. In: INTERSPEECH, pp 1798–1802
7. Tang Z, Wang D, Chen Y, Li L, Abel A (2018) Phonetic temporal neural model for language identification. IEEE/ACM Trans Audio Speech Lang Process 26(1):134–144
8. Dehak N, Kenny PJ, Dehak R, Dumouchel P, Ouellet P (2011) Front-end factor analysis for speaker verification. IEEE Trans Audio Speech Lang Process 19(4):788–798
9. Dehak N, Dehak R, Kenny P, Brümmer N, Ouellet P, Dumouchel P (2009) Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification. In: INTERSPEECH
10. Basu J, Majumder S (2020) Identification of seven low-resource North-Eastern languages: an experimental study. In: Bhattacharyya DP, Mitra S (ed) Intelligence enabled research. Advances in intelligent systems and computing, vol 1109. Springer, Singapore, Kolkata, pp 71–81
11. Basu J et al (2017) Acoustic analysis of vowels in five low resource north East Indian languages of Nagaland. In: 2017 20th conference of the oriental chapter of the international coordinating committee on speech databases and speech I/O systems and assessment (O-COCOSDA), pp 1–6
12. Snyder D et al (2018) Spoken language recognition using x-vectors. In: Odyssey, pp 105–111
13. Wong E, Pelecanos J, Myers S, Sridharan S (2000) Language identification using efficient gaussian mixture model analysis. In: Australian international conference on speech science and technology, pp 78–83
14. Fathima N, Patel T, Mahima C, Iyengar A (2018) TDNN based multilingual speech recognition system for low resource Indian languages. In: INTERSPEECH, pp 3197–3201
15. Ambikairajah E, Li H, Wang L, Yin B, Sethu V (2011) Language identification: a tutorial. IEEE Circuits Syst Mag 11(2):82–108
16. Jain P, Gurugubelli K, Vuppala AK (2020) Study on the effect of emotional speech on language identification. In: 2020 national conference on communications (NCC), pp 1–6
17. Markov I, Nastase V, Strapparava C, Sidorov G (2018) The role of emotions in native language identification. In: Proceedings of the 9th workshop on computational approaches to subjectivity, sentiment and social media analysis, pp 123–129
18. Macková L, Čižmár A (2014) Emotional speaker verification based on i-vectors. In: 2014 5th IEEE conference on cognitive infocommunications (CogInfoCom), pp 533–536
19. Povey D et al (2011) The Kaldi speech recognition toolkit. In: IEEE 2011 workshop on automatic speech recognition and understanding
20. Reynolds DA, Speaker identification and verification using Gaussian mixture speaker models. Speech Commun 17(1–2):91–108
21. Vapnik VN (2000) The nature of statistical learning theory. Springer, New York, New York, NY
22. Balakrishnama S, Ganapathiraju A (1998) Linear discriminant analysis-a brief tutorial. Inst Signal Inf Process 18:1–8
23. Prince SJD, Elder JH (2007) Probabilistic linear discriminant analysis for inferences about identity. In: 2007 IEEE 11th international conference on computer vision, pp 1–8
24. Lang KJ, Waibel AH, Hinton GE (1990) A time-delay neural network architecture for isolated word recognition. Neural Netw 3(1):23–43
25. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
26. NIST (2009) The 2009 NIST SLR evaluation plan. www.itl.nist.gov/iad/mig/tests/lre/2009/LRE09_EvalPlan_v6.pdf

Disaster Severity Prediction from Twitter Images

Abhinav Kumar and Jyoti Prakash Singh

Abstract Damage assessment is an essential situation awareness task to know the severity of impairment for organizing relief efforts. In this work, a model is proposed to automatically classify disaster-related Twitter images into severe, mild, and little or no damage classes. These classification results can be used to estimate the damage caused by the disaster. Three pre-trained models, namely VGG-16, ResNet-50, and Xception, are used to extract features from the images, and these features are then used by a dense neural network and seven different conventional machine learning classifiers for the classification task. The models are validated on four real-life disaster-related datasets (hurricane, earthquake, flood, and wildfire). The results show that the conventional machine learning classifiers learn better than the dense neural network from the features extracted by the pre-trained models.

Keywords Disaster · Damage assessment · Twitter images · VGG-16 · ResNet-50 · Xception

A. Kumar (B) · J. P. Singh National Institute of Technology Patna, Patna, India e-mail: [email protected] J. P. Singh e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Bhattacharyya et al. (eds.), Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1279, https://doi.org/10.1007/978-981-15-9290-4_7

1 Introduction

Natural calamities such as hurricanes, earthquakes, floods, and wildfires lead to significant losses in infrastructure and human lives. On average, 388 natural disasters have occurred annually, resulting in economic losses of 156.7 billion dollars [7]. Collecting and processing timely information on the destruction caused by a natural catastrophe is extremely important to mitigate the loss, guide the resource allocation, and speed up the recovery [5, 7, 12]. However, collecting such information in real time through traditional news media and government agencies is difficult due to the lack of reporters and government officials in the disaster region [1]. Hence, damage assessment becomes a major challenge for humanitarian organizations. It takes weeks and even months to understand the exact situation of the damage caused by the disaster [9].

Social media users post a massive amount of disaster-related images, texts, and videos during natural or man-made disasters. The pervasiveness and rapid accessibility of this disaster-related social media content can augment traditional news reports. Nowadays, both government and non-government agencies are shifting to Twitter to disseminate information to the people and understand the ground reality of the disaster [5, 7]. Several works [5–7, 10] used the textual content of social media posts to propose methods that help humanitarian agencies and governments prioritize relief operations. Besides textual content, people nowadays post large volumes of images of the affected area on social media within minutes of a disaster occurring. These disaster-related images can be a valuable source of information for finding the level of destruction in the disaster region and the approximate number of affected people in the disaster area. Estimating the level of damage during a disaster is one of the main objectives of humanitarian response agencies, so that they can understand the magnitude of the destruction and prepare their recovery efforts accordingly.

In the current research work, we have tried to estimate the level of destruction caused by a disaster using disaster-related images posted on social media. However, analyzing social media images is quite a challenging task as (i) the objects are usually not in well-defined shapes, e.g., damage to buildings, damage to bridges and roads, and destruction of trees, (ii) the level of noise in disaster-related datasets is remarkably high, and (iii) the amount of labeled imagery data relevant to the actual disaster is comparatively small. Because of this, the classification of damage-related images is quite different from the standard task of classifying images [9]. Recently, Daly and Thom [4] and Lagerstrom et al. [8] have tried to identify images containing fire-related damage. Nguyen et al. [9, 11] and Attari et al. [3] tried to classify images into different damage classes such as severe damage, mild damage, and no damage. Pre-trained VGG-16 models have been the preferred choice for extracting features from images. We found hardly any use of other pre-trained models such as ResNet-50 and Xception in the literature for the extraction of damage-related features.

In this study, we use three different pre-trained models, namely VGG-16, ResNet-50, and Xception, to extract features from the images. These extracted features are then used by a dense neural network and conventional machine learning models to investigate classifier performance in classifying disaster-related images into severe, mild, and little or no damage classes. The contribution of this paper can be summarized as follows:

– Comparative analysis of pre-trained models, VGG-16, ResNet-50, and Xception for feature extraction from damage-related images.
– Evaluation of the efficiency of the dense neural network and different machine learning classifiers on the extracted features of the pre-trained models.

The rest of the paper is structured as follows: Sect. 2 describes the detailed approach. The results of the experiments are listed in Sect. 3. The discussion of the findings is presented in Sect. 4, and the paper concludes in Sect. 5.

Fig. 1 Flow diagram of the proposed methodology (input image of size 224 x 224 x 3; pre-trained CNN model: VGG16, ResNet50, or Xception; 256-dimensional features; classification by a dense layer with 3 neurons or by Support Vector Machine, Logistic Regression, K-Nearest Neighbors, Naive Bayes, Decision Tree, Random Forest, or Gradient Boosting; output classes: severe damage, mild damage, little or no damage)

2 Methodology

This section introduces the proposed methods for identifying the level of damage caused by disasters. To extract features from images, we used three different pre-trained models based on the Convolutional Neural Network (CNN), namely VGG-16, ResNet-50, and Xception. These extracted features are then used by eight different classifiers to classify the images: (i) Dense Neural Network, (ii) Support Vector Machine (SVM), (iii) Naive Bayes (NB), (iv) K-Nearest Neighbors (KNN), (v) Logistic Regression (LR), (vi) Decision Tree (DT), (vii) Random Forest (RF), and (viii) Gradient Boosting (GB). The flow diagram of the proposed methodology can be seen in Fig. 1.

2.1 Data Description and Pre-processing

This study uses the CrisisMMD dataset [2], released at the AAAI conference on web and social media. The dataset includes images from seven different disasters. Three of them are related to hurricane events, two are related to earthquakes, and the other two are related to floods and wildfires. The dataset includes images labeled with three different classes: (i) severe damage, (ii) mild damage, and (iii) little or no damage. In our work, the datasets of similar events are combined into one, i.e., the datasets of the hurricane events are merged into one, and likewise the datasets of the earthquakes are merged into one. The complete summary of the datasets used in this work is shown in Table 1. The images are converted to the size (224 × 224 × 3), where 3 represents the RGB components of the image and (224 × 224) represents the height and width of the image. The pixel values of the images are normalized to the range 0–1 by dividing each pixel value by 255. After all the pre-processing, the images are fed directly to the input of the pre-trained models to extract the relevant features from them.

Table 1 The detailed class-wise data statistic for different disaster events

| Class | Hurricane | Earthquake | Flood | Wildfire |
|---|---|---|---|---|
| Severe damage | 1385 | 306 | 60 | 465 |
| Mild damage | 725 | 36 | 30 | 51 |
| Little or no damage | 446 | 9 | 5 | 15 |
| Total | 2556 | 351 | 95 | 531 |

To adapt the pre-trained networks for our model development, the last layer of all three pre-trained models (VGG-16, ResNet-50, and Xception), which contains 1,000 output classes, was removed. Two dense layers containing 256 and 3 neurons were added. We used 3 neurons in the output layer for the severe, mild, and little or no damage classes. For training, the last three layers of the model are marked as trainable and the remaining layers are marked as non-trainable. The proposed models use the Adam optimizer and the categorical cross-entropy loss function and are trained for 100 epochs. The experiments were performed by changing the learning rate and the batch size to find the most suitable values for the models. The models performed best with a learning rate of 0.001 and a batch size of 16. After training the model, features are extracted from the dense layer containing 256 neurons, and these features are then used by the different classifiers. The detailed experimental results of each of the classifiers are discussed in Sect. 3.
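The pre-processing step described in this section (conversion to 224 × 224 × 3 and scaling of pixel values to the range 0–1) can be sketched with NumPy as follows. The paper does not say which interpolation method was used for resizing, so the nearest-neighbour index sampling below is an assumption, as is the function name `preprocess`.

```python
import numpy as np

def preprocess(image, size=224):
    """Resize an (H x W x 3) uint8 RGB image to (size x size x 3), scale to 0-1.

    Assumption: nearest-neighbour sampling stands in for whatever resizing
    method the authors actually used.
    """
    image = np.asarray(image)
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size  # source row for each output row
    cols = np.arange(size) * width // size   # source column for each output column
    resized = image[rows][:, cols]
    return resized.astype(np.float32) / 255.0
```

The returned array has shape (224, 224, 3) and values between 0 and 1, which matches the input size given above for the pre-trained models.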

3 Experimental Results

To measure the performance of the proposed models, weighted Precision (P), Recall (R), and F1-score (F1) are used. The results for the VGG-16, ResNet-50, and Xception networks for Hurricane, Earthquake, Flood, and Wildfire are shown in Table 2, Table 3, and Table 4, respectively. In the case of the hurricane, Xception network features with Naive Bayes and Xception network features with Gradient Boosting performed best with an F1-score of 0.59. In the event of an earthquake, Xception network features with Dense Neural Network performed best with an F1-score of 0.82. In the case of flood, ResNet-50 features with KNN performed best with an F1-score of 0.75. In the case of wildfires, VGG-16 network features with KNN and Xception network features with Random Forest performed best with an F1-score of 0.88. Irrespective of the earthquake event where Xception network features with dense neural networks performed best with an improvement of F1-score by 1% over KNN,

Disaster Severity Prediction from Twitter Images

69

Table 2 Results of the different classifiers in case of VGG-16

VGG-16      Hurricane           Earthquake          Flood               Wildfire
Classifier  P     R     F1      P     R     F1      P     R     F1      P     R     F1
Dense       0.58  0.61  0.53    0.83  0.85  0.79    0.40  0.63  0.49    0.80  0.90  0.85
SVM         0.34  0.58  0.43    0.69  0.83  0.75    0.40  0.63  0.49    0.80  0.90  0.85
LR          0.53  0.57  0.54    0.75  0.83  0.78    0.57  0.63  0.56    0.80  0.90  0.85
KNN         0.53  0.56  0.54    0.77  0.83  0.79    0.58  0.63  0.59    0.86  0.91  0.88
NB          0.56  0.55  0.55    0.78  0.75  0.76    0.51  0.32  0.29    0.90  0.12  0.16
DT          0.53  0.53  0.53    0.74  0.79  0.76    0.58  0.63  0.60    0.83  0.81  0.82
RF          0.55  0.57  0.56    0.76  0.83  0.78    0.45  0.63  0.52    0.84  0.90  0.86
GB          0.55  0.57  0.56    0.75  0.83  0.78    0.61  0.63  0.61    0.83  0.88  0.85

Table 3 Results of the different classifiers in case of ResNet-50

ResNet-50   Hurricane           Earthquake          Flood               Wildfire
Classifier  P     R     F1      P     R     F1      P     R     F1      P     R     F1
Dense       0.34  0.58  0.43    0.69  0.83  0.75    0.40  0.63  0.49    0.80  0.90  0.85
SVM         0.34  0.58  0.43    0.69  0.83  0.75    0.40  0.63  0.49    0.80  0.90  0.85
LR          0.34  0.58  0.43    0.69  0.83  0.75    0.40  0.63  0.49    0.80  0.90  0.85
KNN         0.49  0.53  0.50    0.70  0.80  0.75    0.79  0.79  0.75    0.82  0.86  0.84
NB          0.56  0.28  0.16    0.78  0.23  0.28    0.57  0.47  0.48    0.90  0.14  0.19
DT          0.48  0.45  0.46    0.73  0.75  0.74    0.41  0.58  0.48    0.83  0.80  0.81
RF          0.49  0.53  0.50    0.68  0.77  0.73    0.64  0.68  0.64    0.84  0.90  0.86
GB          0.50  0.56  0.49    0.75  0.82  0.77    0.53  0.58  0.54    0.85  0.89  0.87

Table 4 Results of the different classifiers in case of Xception

Xception    Hurricane           Earthquake          Flood               Wildfire
Classifier  P     R     F1      P     R     F1      P     R     F1      P     R     F1
Dense       0.60  0.62  0.57    0.81  0.86  0.82    0.39  0.26  0.26    0.85  0.91  0.87
SVM         0.34  0.58  0.43    0.69  0.83  0.75    0.40  0.63  0.49    0.80  0.90  0.85
LR          0.57  0.59  0.58    0.77  0.83  0.79    0.70  0.68  0.69    0.87  0.85  0.85
KNN         0.51  0.54  0.52    0.79  0.85  0.81    0.53  0.58  0.55    0.83  0.88  0.85
NB          0.60  0.60  0.59    0.75  0.68  0.71    0.60  0.53  0.55    0.91  0.42  0.52
DT          0.50  0.49  0.50    0.75  0.82  0.77    0.60  0.53  0.55    0.86  0.85  0.86
RF          0.55  0.59  0.56    0.73  0.80  0.76    0.50  0.58  0.52    0.88  0.90  0.88
GB          0.58  0.60  0.59    0.79  0.85  0.80    0.56  0.47  0.51    0.84  0.84  0.84


Table 5 Result comparison of dense neural network and gradient boosting classifier in case of hurricane

Hurricane (Xception)   Dense neural network   Gradient boosting     No. of testing
Class                  P      R      F1       P      R      F1      samples
Severe damage          0.66   0.90   0.76     0.70   0.80   0.75    297
Mild damage            0.46   0.32   0.38     0.36   0.31   0.33    140
Little or no damage    0.58   0.09   0.16     0.51   0.36   0.42    75
Weighted average       0.60   0.62   0.57     0.58   0.60   0.59    512

Table 6 Result comparison of dense neural network and KNN classifier in case of earthquake

Earthquake (Xception)  Dense neural network   K-nearest neighbors   No. of testing
Class                  P      R      F1       P      R      F1      samples
Severe damage          0.87   0.98   0.92     0.88   0.97   0.92    59
Mild damage            0.75   0.33   0.46     0.50   0.33   0.40    9
Little or no damage    0.00   0.00   0.00     0.00   0.00   0.00    3
Weighted average       0.81   0.86   0.82     0.79   0.85   0.81    71

all other conventional machine learning classifiers performed better than the dense neural network. Most of the time, conventional machine learning classifiers outperformed the dense neural network in terms of F1-score, even in the case of data imbalance. Comparative results of the dense neural network and the best-performing conventional machine learning classifier for hurricane, earthquake, flood, and wildfire can be seen in Table 5, Table 6, Table 7, and Table 8, respectively.
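The weighted averages reported in Tables 5 to 8 can be checked against the per-class scores and the number of testing samples per class. The following is a minimal sketch, assuming the usual support-weighted definition of the weighted average; the function name `weighted_average` is illustrative, and the numbers are the dense neural network rows of Table 5.

```python
def weighted_average(scores, supports):
    """Average per-class scores, weighting each class by its sample count."""
    total = sum(supports)
    return sum(s * n for s, n in zip(scores, supports)) / total

# Per-class values for the dense neural network on hurricane images (Table 5),
# ordered as: severe damage, mild damage, little or no damage.
supports = [297, 140, 75]
precision = [0.66, 0.46, 0.58]
recall = [0.90, 0.32, 0.09]
f1 = [0.76, 0.38, 0.16]

weighted_p = weighted_average(precision, supports)
weighted_r = weighted_average(recall, supports)
weighted_f1 = weighted_average(f1, supports)

print(f"P={weighted_p:.2f} R={weighted_r:.2f} F1={weighted_f1:.2f}")
```

Because the per-class inputs are already rounded to two decimals, the recomputed values agree with the reported weighted averages (0.60, 0.62, 0.57) only to within about 0.01.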


Table 7 Result comparison of dense neural network and KNN classifier in case of flood

Flood (ResNet-50)      Dense neural network             K-nearest neighbors              No. of testing
Class                  Precision  Recall  F1-score      Precision  Recall  F1-score      samples
Severe damage          0.63       1.00    0.77          0.75       1.00    0.86          12
Mild damage            0.00       0.00    0.00          1.00       0.50    0.67          6
Little or no damage    0.00       0.00    0.00          0.00       0.00    0.00          1
Weighted average       0.40       0.63    0.49          0.79       0.79    0.75          19

Table 8 Result comparison of dense neural network and random forest classifier in case of wildfire

Wildfire (Xception)    Dense neural network   Random forest         No. of testing
Class                  P      R      F1       P      R      F1      samples
Severe damage          0.91   1.00   0.95     0.91   0.98   0.94    96
Mild damage            0.00   0.00   0.00     0.33   0.14   0.20    7
Little or no damage    1.00   0.25   0.40     1.00   0.25   0.40    4
Weighted average       0.85   0.91   0.87     0.88   0.90   0.88    107

4 Analysis of Results

The main finding of this study is that the Xception network performed best at feature extraction for hurricanes, earthquakes, and wildfires, whereas ResNet-50 performed best at feature extraction for floods, as can be seen from Tables 2, 3, and 4. Another finding is that conventional classifiers such as SVM, LR, KNN, NB, DT, RF, and GB on top of pre-trained models outperformed the Dense Neural Network (DNN)-based classifier in almost all cases, the exception being the earthquake when the Xception network was used (see Tables 5, 6, 7, and 8). The features extracted from the Xception network in the case of the hurricane performed best when the GB classifier was used. The performance of the little


or no damage class was improved by the GB classifier in comparison to DNN, as can be seen from Table 5. Similarly, for the features extracted from the ResNet-50 network in the case of flood, KNN performed better than DNN for the severe and mild damage classes, as can be seen from Table 7. For the features extracted from the Xception network in the case of wildfire, RF performed better than DNN for the mild damage class, as can be seen from Table 8. From all these findings, it is evident that conventional machine learning on top of pre-trained models works efficiently even for imbalanced datasets.
The proposed system can work effectively in real-time settings. When disaster-affected people post disaster-related images, the proposed system classifies the images into varying degrees of severity. The predictions of the proposed system can be interpreted by the government and humanitarian groups to capture the ground reality of the disaster and, as a result, to prioritize rescue and relief operations so that severely damaged regions receive a prompt response from the government and humanitarian organizations.
One of the theoretical contributions of the proposed system is that it does not need feature engineering, as the pre-trained models automatically extract the relevant features from the images. The other theoretical contribution concerns the early stage of a disaster, when labeled data samples are difficult to find: the proposed model uses a pre-trained model, which generally generalizes well with fewer training samples, takes less training time, and still achieves significant performance. Because the current system uses only images to determine disaster severity, it can be combined with any social media platform, such as Facebook, Twitter, and Instagram.
One possible application of the proposed system is to implement the model in an Android application that runs on a mobile device. Such an application can be used effectively in the event of flooding, where users can receive timely situational awareness alerts so that people in affected regions know which regions are at high risk, which roads are blocked by floods, and so on. One limitation of this work is that it directly uses Twitter images to identify disaster severity without checking the authenticity of the images. During a disaster, people also post and retweet similar images from past events without checking their authenticity. Therefore, in the future, a model can be built to filter relevant images of the current event, and the proposed system can then be applied to identify the severity of the disaster.

5 Conclusion

The analysis of disaster-related social media images is challenging because the objects of interest, e.g., damage to buildings, damage to bridges and roads, and destruction of trees, do not have well-defined shapes. In this work, we used VGG-16, ResNet-50, and Xception networks to extract relevant features from disaster-related images. Conventional classifiers on top of the pre-trained models performed better than the dense neural network. In the future, the proposed model can be tested with


some other pre-trained networks, such as VGG-19, Inception, and ResNetV2, to see the performance of the classifiers. A multi-modal system can also be developed to see the performance of the models in classifying disaster-related images having damage information. The available geolocation and the location references written in the tweet text can also be integrated into the current system to locate the damage on the ground, which can help to better organize relief operations.

References
1. Ahmad K, Riegler M, Riaz A, Conci N, Dang-Nguyen DT, Halvorsen P (2017) The JORD system: linking sky and social multimedia data to natural disasters. In: International conference on multimedia retrieval. ACM, pp 461–465
2. Alam F, Ofli F, Imran M (2018) CrisisMMD: multimodal Twitter datasets from natural disasters. In: AAAI conference on web and social media
3. Attari N, Ofli F, Awad M, Lucas J, Chawla S (2017) Nazr-CNN: fine-grained classification of UAV imagery for damage assessment. In: IEEE conference on data science and advanced analytics. IEEE, pp 50–59
4. Daly S, Thom JA (2016) Mining and classifying image posts on social media to analyse fires. In: Proceedings of ISCRAM
5. Kumar A, Singh JP (2019) Location reference identification from tweets during emergencies: a deep learning approach. Int J Disaster Risk Reduct 33:365–375
6. Kumar A, Singh JP, Saumya S (2019) A comparative analysis of machine learning techniques for disaster-related tweet classification. In: Proceedings of IEEE R10 HTC. IEEE, pp 222–227
7. Kumar A, Singh JP, Dwivedi YK, Rana NP (2020) A deep multi-modal neural network for informative Twitter content classification during emergencies. Ann Oper Res 1–32
8. Lagerstrom R, Arzhaeva Y, Szul P, Obst O, Power R, Robinson B, Bednarz T (2016) Image classification to support emergency situation awareness. Front Robot AI 3:54
9. Nguyen DT, Ofli F, Imran M, Mitra P (2017) Damage assessment from social media imagery data during disasters. In: Proceedings of IEEE/ACM ASONAM. ACM, pp 569–576
10. Nguyen DT, Al Mannai KA, Joty S, Sajjad H, Imran M, Mitra P (2017) Robust classification of crisis-related data on social networks using convolutional neural networks. In: Proceedings of AAAI conference on web and social media
11. Nguyen DT, Alam F, Ofli F, Imran M (2017) Automatic image filtering on social networks using deep learning and perceptual hashing during crises. arXiv:1704.02602
12. Singh JP, Dwivedi YK, Rana NP, Kumar A, Kapoor KK (2019) Event classification and location prediction from tweets during disasters. Ann Oper Res 283:737–757

A Study on Energy-Efficient Communication in VANETs Using Cellular IoT
R. N. Channakeshava and Meenatchi Sundaram

Abstract Vehicular ad hoc networks (VANETs) are formed for safety message propagation in a traffic model. VANETs are specialized mobile ad hoc networks (MANETs). They generally use an architecture in which vehicles can communicate among themselves and also through road side units (RSUs). Vehicles in a VANET form a cluster and communicate within it. Messages sent from one vehicle may reach every other vehicle in multiple hops. As vehicle density increases, vehicles can carry and forward message packets to neighbors easily, but this introduces more energy consumption, delay, and packet duplication. Using cellular IoT, we attempt to fix clusters of vehicles to a cellular tower. This simplifies the VANET architecture and communication, reducing delay and energy consumption while keeping packet delivery the same for all traffic densities.

Keywords Cellular IoT · NB-IoT · LTE-M · VANETs · Clustering · Energy efficient · 3GPP

R. N. Channakeshava (B)
Department of Computer Science, Government Science College, Chitradurga, India

M. Sundaram
School of Computational Sciences, Garden City University, Bengaluru, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. Bhattacharyya et al. (eds.), Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1279, https://doi.org/10.1007/978-981-15-9290-4_8

1 Introduction

VANETs are formed for routing safety messages among vehicles moving in the same geographical region. Safety messages about oncoming vehicles, ambulances, road works, broken paths, emergency braking, turning vehicles, and so on are communicated among the vehicles moving in the same region and in the same direction, which improves safety by providing extra information about the route. Various communication models have been proposed for communicating safety messages among vehicles, and research continues to improve VANETs. Current research trends in VANETs all aim at having clusters of vehicles


Fig. 1 Communication between vehicles

communicating with road side units (RSUs) (Fig. 1). WAVE is used as the communication technology between vehicles, and different technologies such as Wi-Fi and WAVE are used for communication between vehicles and RSUs and between RSUs. The major drawback of this model is its power consumption, which is high because of the use of WAVE, clustering, and the architecture used for sending and receiving. Adopting cellular IoT in VANETs not only reduces power consumption but also reduces implementation complexity and cost. It also eliminates RSUs, which are dedicated infrastructure deployed only for VANETs. Using cellular IoT, we can make better use of existing infrastructure for developing VANETs. A cellular tower can serve the purpose of an RSU. A centralized or remote server may be deployed to monitor the messages to be communicated or the formation of clusters. The clustering process in VANETs requires vehicles to send and receive presence messages among themselves, and cellular towers are used for forming clusters. A vehicle moving in a certain direction under a tower's coverage can be tagged to a cluster.

2 Related Works on VANETs

VANETs are a special type of mobile network, like MANETs. Nodes in MANETs have limited mobility, whereas vehicles move randomly, so nodes in VANETs have high mobility. Data communication is heavier in MANETs, where voice and even


video calls are popular as well as short messages. In VANETs, only a small piece of information is transmitted to other vehicles in the cluster, so data transfer is limited. How VANETs are formed and divided into clusters is discussed in [1], which gives a detailed explanation of VANET creation. That work explains how clusters are created and how messages are propagated between vehicles and RSUs. It also notes that clusters of vehicles can be fixed to a physical location and lists the advantages of doing so. The procedure for creating clusters in VANETs for safety message propagation is described in [2]. The paper also describes some case studies, example messages, and their priorities. The architecture and procedures described in [2] are well suited to, and easily implemented with, communication through cellular IoT instead of WAVE and RSUs. Routing with geographical clustering is studied in [3]. Cellular IoT is compatible with all generations of cellular communication (2G, 3G, 4G, and 5G); currently, NB-IoT and LTE-M are the most widely adopted cellular IoT communication technologies. The authors of [4] discuss evaluation methodologies for NB-IoT. Some clustering methodologies for VANETs and their drawbacks are discussed in [5]. Collision detection is one of the features of VANETs. In [6], a simulation procedure using OMNET++, SUMO, and VEINS is discussed. This paper also emphasizes the study of MAC layer protocols for communication using cellular IoT. WAVE has advantages as well as disadvantages. It can communicate over long distances of up to 100 km in low-frequency mode, whereas high-frequency signals can travel only a few meters, and very high-frequency signals only centimeters. At the same time, low-frequency signals have signal quality issues, whereas high-frequency signals have high quality. As the signal frequency increases, larger amounts of data can be transmitted; at low frequency, less data can be transmitted. The different standards of WAVE technology, its implementation in different layers (especially the MAC layer), and its message formats are discussed in [7]. A good understanding of the topology of routes in the network improves message propagation and efficiency [8, 9]. Energy efficiency is the main goal of all IoT equipment [10], and [11] has a detailed discussion of energy consumption in VANETs. Delivery percentage, delay, throughput, and handover from one cluster to another are discussed in detail in [12]. IP-based routing via Wi-Fi and the Internet is proposed in [13], where measures such as packet delay, delivery percentage, and duplication are discussed for different traffic densities, and the model is simulated using NS2. The study in [14] emphasizes specific cases of collision detection and avoidance by forming VANETs with on-the-fly message communication. When an alert passes through several levels, such as from vehicle to RSU and then to the controlling server, delay is introduced at each level; the overall delay is studied in [15].


3 Cellular IoT

We use VANETs to communicate safety messages, most of which are generated by sensors and some of which are entered manually by humans. We may use a specialized communication technology that is evolving particularly for IoT. Once we join the network, the remaining transmission is taken care of by the carrier itself, which leaves time to concentrate on improving other aspects of VANETs. In describing cellular IoT, we refer to the communication module in the vehicle as user equipment (UE). Cellular IoT was proposed by 3GPP to utilize unused spectrum in cellular networks and to transmit over those networks. It was specified as an alternative to unlicensed LPWAN technologies such as LoRaWAN and Sigfox. As cellular IoT uses cellular networks, it can be upgraded to the latest technology. The two major forms of cellular IoT widely adopted at present are NB-IoT and LTE-M. NB-IoT uses a 200 kHz bandwidth in GSM or LTE mobile network bands (700, 800, or 900 MHz). There is no limit on the number of messages that can be communicated. Moreover, NB-IoT offers a maximum message payload length of 1600 bytes at a speed of 200 kbps.

5G (5th Generation New Radio) is an evolving technology that allows communication not only over cellular links but also over multiple technologies such as Wi-Fi, satellite, and fixed line. Once 5G is completely established, both NB-IoT and LTE-M will be covered by it, and the whole may be called Massive IoT.
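The NB-IoT figures quoted in this section (a 1600-byte maximum payload at 200 kbps) give a rough sense of how long a safety message occupies the link. The sketch below is a back-of-the-envelope calculation only: it ignores protocol overhead, scheduling, and retransmissions, and the 64-byte alert size is a hypothetical value chosen for illustration.

```python
def transmission_time_s(payload_bytes, rate_bps):
    """Time to push a payload through a link at a given bit rate, ignoring overhead."""
    return payload_bytes * 8 / rate_bps

MAX_PAYLOAD_BYTES = 1600   # maximum NB-IoT message payload quoted in the text
RATE_BPS = 200_000         # 200 kbps, as quoted in the text
ALERT_BYTES = 64           # hypothetical size of a short safety alert

max_payload_time = transmission_time_s(MAX_PAYLOAD_BYTES, RATE_BPS)
alert_time = transmission_time_s(ALERT_BYTES, RATE_BPS)

print(f"maximum payload: {max_payload_time * 1000:.1f} ms")
print(f"short alert:     {alert_time * 1000:.2f} ms")
```

Under these assumptions even a maximum-size payload takes well under a tenth of a second of raw transmission time.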

4 Advantages of Cellular IoT Against Other Communication

• Cellular IoT works on the existing infrastructure of cellular networks, so there is no need to establish a private network for VANETs.
• Scalable: Cellular networks take care of connectivity for a rapidly increasing number of IoT devices.
• With the latest updates to 3GPP standards, UEs are allowed to remain connected without packet data networks, which helps millions of devices keep an idle connection for a long time and transmit whenever required. When devices are connected without a packet data network (PDN), they can send SMS to other devices, and the SMS can also be used to establish a PDN connection.
• Power saving mode (PSM) is an NB-IoT feature that helps IoT devices conserve battery power. Extended Discontinuous Reception (eDRX) is another power-conserving feature of LTE networks.
• The major advantage of NB-IoT is its long-distance coverage and deep penetration, which enable coverage from vehicles parked in the basement of a building to highway traffic.


• Economical: NB-IoT M2M modules are low in cost, at around 20–30 euros or below 2000 INR.
• 3GPP has set several standards, and guidelines are issued for implementing cellular IoT.

Many of the features and standards discussed below are taken from [16].

5 Power-Saving Techniques PSM and eDRX in Cellular IoT

Both PSM and eDRX reduce the power consumed by an IoT device under NB-IoT. The UE may be configured with either power-saving method or with both.

PSM: Once the UE initiates PSM, the network keeps the registration information of the UE until the timer expires, and the UE need not execute a reattach procedure to the network within that period (Fig. 2). During the attach procedure, the UE can also request a periodic tracking area update (TAU). The value of the PSM timer is always less than the TAU period. According to 3GPP standards, the device can remain reachable in the network for up to 186 min, and the UE may sleep for up to 413 days. PSM timers have little meaning in a vehicular network, where a vehicle may be in a network-reachable area for at most 10 min. Even though power consumption is a constraint, strict power saving is not needed, because VANETs are real-time systems in which messages must be sent immediately.

eDRX: Extended Discontinuous Reception is a feature of LTE networks that can be used with or without PSM. A UE using this feature temporarily switches off the receive section of its radio module for a fraction of a second. This brief disconnection does not degrade the performance of a messaging system, because it is very short compared with the time a driver needs to react after an alert is received (Fig. 3).

Deployment of NB-IoT module: 3GPP has specified three deployment modes:
1. Standalone deployment.
2. LTE guard-band deployment.
3. In-band deployment.

Fig. 2 PSM cycles (diagram labels: UE Reachable, Data Transfer, PSM Cycle, TAU Period)

Fig. 3 eDRX: extended Discontinuous Reception (diagram labels: eDRX Cycle, Data Transfer)
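The argument that PSM timers suit vehicles poorly can be made concrete with a small check. This is a minimal sketch under stated assumptions, not a 3GPP procedure: the function `check_psm` and its parameter names are hypothetical, the 413-day limit is treated as the bound on the TAU period, and the 10 min dwell time is the figure given in the text for how long a vehicle stays in a network-reachable area.

```python
MAX_REACHABLE_MIN = 186          # reachable-window limit quoted in the text
MAX_TAU_MIN = 413 * 24 * 60      # 413 days in minutes (assumed to bound the TAU period)

def check_psm(reachable_min, tau_period_min, dwell_time_min=10):
    """Return the reasons a PSM setting is invalid or unhelpful for a vehicle.

    dwell_time_min is the time a vehicle spends in a network-reachable area;
    the default of 10 min is the figure given in the text.
    """
    issues = []
    if reachable_min > MAX_REACHABLE_MIN:
        issues.append("reachable window exceeds 186 min")
    if tau_period_min > MAX_TAU_MIN:
        issues.append("TAU period exceeds 413 days")
    if reachable_min >= tau_period_min:
        issues.append("reachable window must be shorter than the TAU period")
    if tau_period_min > dwell_time_min:
        issues.append("PSM cycle outlasts the vehicle's time in coverage")
    return issues

# A setting typical of a stationary sensor: reachable for 1 min, TAU once a day.
typical_iot = check_psm(reachable_min=1, tau_period_min=24 * 60)
# A hypothetical very short cycle that fits inside the 10 min dwell time.
short_cycle = check_psm(reachable_min=1, tau_period_min=5)

print(typical_iot)
print(short_cycle)
```

Only cycles shorter than the dwell time pass the last check, which is one way to read the remark that long PSM timers are not meaningful for vehicles.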

3GPP also recommends that mobile network operators support paging for messages from UEs, and the paging information is communicated to the UEs.

SCEF (service capabilities exposure function): This feature of NB-IoT allows UEs to access network capabilities through homogeneous network APIs.

MME-SCEF interface (mobility management entity-service capabilities exposure function): This feature allows roaming (visiting) devices to use non-IP data over the control plane.

Abstraction: 3GPP provides the following network and interface features:
1. Underlying protocol connectivity, routing and traffic control.
2. Mapping specific APIs onto appropriate network interfaces.
3. Protocol translation.

Events monitoring: Events such as UE reachability, the location of the UE and changes in its location, loss of connectivity, and communication failure are monitored. Cell reselection is allowed so that UEs can select better-quality signals.

Vehicular ad hoc networks: Vehicles may form clusters as discussed in [1]. The NB-IoT transmitter, or UE, takes care of the data to be transmitted and received. At the receiving end, the cluster manager takes care of authenticating the messages received. Alerts received from sensors and manual alerts submitted by humans inside the vehicles are all processed through the alerts processing module.

6 Formation of VANETs Using Cellular IoT

VANETs may use different routing techniques, such as ad hoc, cluster, broadcast, geo-cast, and position-based routing. Now that GPS has become popular and cheap, position-based routing is becoming a mature technology. We need an architecture that combines the warning message with the GPS information.

Requirements of the architecture:
• Receive the messages from vehicles.
• Identify the vehicles that are in the same region and moving in the same direction to form a cluster.


• Provide for communication between vehicles within the cluster.
• Check the authenticity of the messages received.
• Check the authenticity of all vehicles in the cluster that are communicating.

6.1 Clustering of Vehicles

To simplify communication and speed up the delivery of messages, vehicles are to be allowed to communicate with each other. To achieve this, the system first has to decide which vehicles form a cluster; here, vehicles that are close together and moving in a particular direction are clustered, as discussed in [2]. A vehicle moving from one cluster to another sends a request to the server, which is communicated through the cellular tower. The server tags all vehicles that are within the cellular tower's range and moving in a particular direction to one cluster. The acceptance is sent back to the vehicle, and at the same time the member details are updated for all vehicles in that cluster and the adjacent cluster. A cluster join request may contain Vehicle_id, length_of_vehicle, Vehicle_type, Previous_cluster, average_speed, and current_position (Fig. 4). In its reply, the server adds the current cluster_name to the above information and sends the update to all vehicles in the cluster and the adjacent cluster. At the same time, clusters far behind, which need not keep this vehicle's information, are notified to disconnect. Alerts need not pass through the server for processing, as they can be relayed back directly from the cellular tower itself. Any adjacent clusters that may be affected are also notified of the alert. An alert may include the following information: Vehicle_id, Alert_message, Alert_position. At the UE (vehicle) end, alerts are only passed on to the cellular network; the tower does the further work of sending messages to the server for clustering or sending alerts to neighboring vehicles and clusters (Figs. 5 and 6).
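The join request, reply, and alert described above can be sketched as simple data structures. This is an illustrative sketch, not the authors' implementation: the field names follow the text, while the class `ClusterRegistry`, the `tower_id` and `direction` arguments, the cluster naming rule, and the units and position format are all assumptions. The registry stands in for both the server and tower roles, and updates to adjacent clusters are omitted.

```python
from dataclasses import dataclass, asdict

@dataclass
class JoinRequest:
    vehicle_id: str
    length_of_vehicle: float   # metres (unit assumed)
    vehicle_type: str
    previous_cluster: str
    average_speed: float       # km/h (unit assumed)
    current_position: tuple    # (latitude, longitude), format assumed

@dataclass
class Alert:
    vehicle_id: str
    alert_message: str
    alert_position: tuple

class ClusterRegistry:
    """Toy registry that tags vehicles to clusters by tower and travel direction."""

    def __init__(self):
        self.members = {}  # cluster_name -> {vehicle_id: JoinRequest}

    def join(self, request, tower_id, direction):
        # Hypothetical naming rule: one cluster per (tower, direction) pair.
        cluster_name = f"{tower_id}-{direction}"
        # Drop the vehicle from whichever cluster held it before.
        for vehicles in self.members.values():
            vehicles.pop(request.vehicle_id, None)
        self.members.setdefault(cluster_name, {})[request.vehicle_id] = request
        # The reply echoes the request fields plus the assigned cluster name.
        reply = asdict(request)
        reply["cluster_name"] = cluster_name
        return reply

    def recipients(self, alert, cluster_name):
        """Vehicles in the cluster that should receive the alert (all but the sender)."""
        cluster = self.members.get(cluster_name, {})
        return sorted(v for v in cluster if v != alert.vehicle_id)

registry = ClusterRegistry()
registry.join(JoinRequest("V1", 4.2, "car", "", 60.0, (14.23, 76.40)), "T1", "north")
reply = registry.join(
    JoinRequest("V2", 10.5, "bus", "T0-north", 55.0, (14.24, 76.40)), "T1", "north"
)
alert = Alert("V1", "road works", (14.25, 76.40))
targets = registry.recipients(alert, reply["cluster_name"])
print(reply["cluster_name"], targets)
```

Keying clusters on the tower and direction mirrors the rule that vehicles under one tower moving the same way share a cluster; a fuller model would also notify adjacent clusters and tell far-behind clusters to disconnect.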

Fig. 4 Cellular coverage and clustering


Fig. 5 Procedure for membership of vehicle in a cluster (diagram labels: Cluster A, Cluster B, Cluster C; Request, Accept, Accept & Update, Update Request, Update, Disconnect)

Fig. 6 Safety message communication between vehicles (diagram labels: Cluster A, Cluster B, Cluster C; Alert Raised, Alert Processed, Disconnect)


Fig. 7 Simulation model using Omnet++, SUMO and VEINS (diagram labels: Comfort, Traffic, Safety, ITS Application, Data Dissemination, Medium Access, Physical Layer, Channel, Emission, Behavior, Mobility, Road Traffic Simulation, VEINS, SUMO, OMNET++)

7 Simulation Model

Researchers use various network simulators; OMNET++, NS2, and NS3 are popular choices. We chose OMNET++ because it is open source and frameworks specific to simulating VANETs are available for it. “VEINS” is a framework for OMNET++ that simulates vehicles in network simulation using SUMO (Fig. 7). A traffic model is simulated using SUMO with a random number of vehicles varying from 50 to 100. Lanes in the road are bidirectional. The speed of the vehicles varies from 40 to 120 km/h.
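The scenario parameters above can be written down as a small generator. This sketch only illustrates the stated ranges; it does not use SUMO's file formats or API, and the function name `make_scenario`, the uniform speed distribution, and the direction labels are assumptions made for illustration.

```python
import random

def make_scenario(seed):
    """Draw one hypothetical traffic scenario within the ranges given in the text."""
    rng = random.Random(seed)                 # seeded for repeatable runs
    vehicle_count = rng.randint(50, 100)      # inclusive on both ends
    vehicles = []
    for i in range(vehicle_count):
        vehicles.append({
            "id": f"veh{i}",
            "speed_kmh": rng.uniform(40, 120),                # distribution assumed
            "direction": rng.choice(["forward", "reverse"]),  # bidirectional lanes
        })
    return vehicles

scenario = make_scenario(seed=1)
print(len(scenario), "vehicles")
```

Fixing the seed makes a run repeatable, which helps when the same traffic pattern has to be replayed for two communication technologies.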

8 Comparison of Results

Preliminary observations show that, with cellular IoT, delay is almost flat across all traffic densities and the delivery ratio is also flat. Battery lifetime is higher than with local clustering of vehicles. Because the clustering model is simple and avoids inter-vehicular communication, all communication passes through cellular routers, so increased traffic does not affect delay or delivery ratio. Battery life can also be extended because of the absence of inter-vehicular communication (Table 1).

Table 1 Comparison of results

Criterion          Low-traffic density (50)   Medium-traffic density (70)   High-traffic density (100)
                   LTE-M    WAVE              LTE-M    WAVE                 LTE-M    WAVE
Delivery ratio     0.95     0.89              0.95     0.93                 0.95     0.95
Delay (s)          0.25     0.22              0.26     0.26                 0.26     0.34
Battery lifetime   421      250               421      229                  421      209
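The trends in Table 1 can be summarized with two simple quantities: the ratio of LTE-M to WAVE battery lifetime at each density, and the spread of delay across densities for each technology. The sketch below uses only the values from Table 1; the helper names `lifetime_ratio` and `delay_spread` are illustrative, and the battery lifetime unit is left as in the table.

```python
# Values copied from Table 1, keyed by the number of vehicles in the scenario.
battery_lifetime = {
    50: {"LTE-M": 421, "WAVE": 250},
    70: {"LTE-M": 421, "WAVE": 229},
    100: {"LTE-M": 421, "WAVE": 209},
}
delay_s = {
    50: {"LTE-M": 0.25, "WAVE": 0.22},
    70: {"LTE-M": 0.26, "WAVE": 0.26},
    100: {"LTE-M": 0.26, "WAVE": 0.34},
}

def lifetime_ratio(density):
    """How many times longer the LTE-M battery lifetime is than the WAVE one."""
    row = battery_lifetime[density]
    return row["LTE-M"] / row["WAVE"]

def delay_spread(technology):
    """Difference between the largest and smallest delay across densities."""
    values = [delay_s[d][technology] for d in delay_s]
    return max(values) - min(values)

for density in battery_lifetime:
    print(density, round(lifetime_ratio(density), 2))
print("LTE-M delay spread:", round(delay_spread("LTE-M"), 2))
print("WAVE delay spread:", round(delay_spread("WAVE"), 2))
```

The lifetime ratio grows from roughly 1.7 at 50 vehicles to roughly 2.0 at 100 vehicles, and the LTE-M delay varies far less across densities than the WAVE delay.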


9 Discussions and Conclusions

This model using cellular IoT is very efficient compared with other models that use WAVE technology. Cellular IoT is energy-efficient, scalable, and lower in cost because the infrastructure is already available, and it can be upgraded as new cellular technology comes into use. Alerts are messages generated by sensors, and the networks formed interconnect these sensors, so a specialized communication technology for IoT should be used; cellular IoT serves this purpose best. As cellular networks reach remote areas and the basements of large buildings, vehicles entering the road are also covered. GSMA is also working to make automotive communication a reality, through regular updates from cellular operators, automotive OEMs, relevant industry associations, regulatory bodies, and others in the field on the requirements for adopting vehicular safety messages.

9.1 Challenges in Adopting Cellular-IoT in VANETs

Many cellular operators are yet to adopt cellular IoT standards. Even with the high availability of cellular networks, some locations are not yet covered, and message communication will not be possible in those areas. However, with the rapid growth of technology, cellular IoT is being adopted by many cellular service providers, so these challenges should be overcome soon.

References
1. Channakeshava RN, Ashok kumar TA (2009) Multi-hop cluster based message propagation with cellular-IoT in vehicular communication. IJRAR 6(2)
2. Channakeshava RN, Sundaram M (2020) Overview of algorithm for clustering in VANETS. TEST-Eng Manag J 5462–5467
3. Sree Divya N, Bobba V (2019) A theoretical research on routing protocols for vehicular AD HOC networks (Vanets). Int J Recent Technol Eng (IJRTE) 8(1S4). ISSN: 2277-3878
4. Foni S, Pecorella T, Fantacci R, Carlini C, Obino P, Di Benedetto M-G. Evaluation methodologies for the NB-IOT system: issues and ongoing efforts
5. Kalaivani D, Rajkumar S (2019) A research on VANET: various broadcasting and clustering techniques. Int J Innov Technol Explor Eng (IJITEE) 8(6S4). ISSN: 2278-3075
6. Dhole KV, Dhudhe AS (2017) Effective vehicle collision detection system by using vehicular ad-hoc network. IJCSMC 6(4):284–288
7. Ahmed SAM, Ariffin SHS, Fisal N (2013) Overview of wireless access in vehicular environment (WAVE) protocols and standards. Indian J Sci Technol
8. Mukunthan A, Cooper C, Safaei F, Franklin D, Abolhasan M, Ros M (2013) Experimental validation of the CORNER urban propagation model based on signal power measurements in a vehicular environment. In: Conference paper
9. Giordano E, Frank R, Pau G, Gerla M. CORNER: a realistic urban propagation model for VANET. IEEE Xplore
10. Laroiya N, Lekhi S (2017) Energy efficient routing protocols in Vanets. Adv Comput Sci Technol 10(5):1371–1390. ISSN 0973-6107
11. Elhoseny M, Hassanien AE (eds) Energy efficient optimal routing for communication in VANETs via clustering model. In: Emerging technologies for connected Internet of vehicles and intelligent transportation system networks, studies in systems, decision and control, vol 242
12. Bhalaji N (2019) Performance evaluation of flying wireless network with Vanet routing protocol. J ISMAC 01(01):56–71
13. Benslimane A et al (2010) An efficient routing protocol for connecting vehicular networks to the Internet. Pervasive Mob Comput. https://doi.org/10.10.16/j.pcmj2010.09.002
14. Sivaganesan D (2019) Efficient routing protocol with collision avoidance in vehicular networks. J Ubiquitous Comput Commun Technol (UCCT) 1(02):76–86
15. Vitale C, Chiasserini CF, Malandrino F, Tadesse SS (2019) Characterizing delay and control traffic of the cellular MME with IoT support. ACM Mobihoc
16. NB-IoT deployment guide to basic feature set requirements. GSM Association Official Document. www.gsma.com/IoT-April-2018

Recognition of Transforming Behavior of Human Emotions from Face Video Sequence: A Triangulation-Induced Circumradius-Incenter-Circumcenter Combined Approach

Md Nasir, Paramartha Dutta, and Avishek Nandi

Abstract Emotion recognition systems are increasingly used in affective computing applications such as sign language understanding, identification of mental disorders, and human-computer interaction. Here, we report a video frame-based procedure for estimating human emotional behavior. We introduce the Circumradius-Incenter-Circumcenter combined geometric signature (CIC), which is induced by our proposed triangulation method. The method first identifies salient landmarks in face image frames using the Active Appearance Model (AAM). We then extract geometric features from the triangles formed by the landmark points and select core triangles based on the CIC feature, which plays an important role in capturing how human emotions change. Finally, the core features extracted from the core triangles are fed to a Multilayer Perceptron (MLP) classifier to obtain the recognition accuracy. The discriminative power of the proposed system is evaluated on three well-known benchmark video face databases, viz., CK+, MMI, and MUG. The performance of the proposed procedure is further validated by comparison with existing methods.

Keywords Triangulation method · Salient landmark points · Active appearance model (AAM) · Circumradius-incenter-circumcenter combined signature (CIC) · Multilayer Perceptron (MLP)

M. Nasir (B) · P. Dutta · A. Nandi
Visva-Bharati University, Santiniketan, West Bengal, India
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. Bhattacharyya et al. (eds.), Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1279, https://doi.org/10.1007/978-981-15-9290-4_9

1 Introduction

Machine-based human emotion recognition is the ability of a machine to imitate the way humans read sign language in nonverbal communication. Facial expressions and other body gestures provide important clues that help people understand the actual meaning of spoken words. Facial expression makes the major contribution (55%) to interpreting messages concerning human emotions, and the rest comes from other body gestures [1]. The first attempt to identify the six basic human emotions, viz., anger (AN), disgust (DI), fear (FE), happiness (HA), sadness (SA), and surprise (SU), was made in [2]. A recognition system extracts information from either a single face frame or a video and then analyzes the extracted information to recognize the emotion; recognition accuracy is the foremost goal of the system. To observe the motion of an emotional transition (from the neutral facial expression to the peak expression), we need to track the movements of facial muscle points. The advantage of this approach is that it reveals the nature of the image transition through the intermediate frames between the neutral and peak expressions.

The most important parts of a recognition system are feature extraction and classification. Features can be extracted with geometric-based techniques or appearance-based techniques. In appearance-based techniques, features are produced from statistics of image pixels, such as intensity values used to compute histogram-oriented information and texture information of face images. Geometric-based techniques, on the other hand, produce features from geometric shape information collected from facial points on images. Such features are then used in the classification task to detect each basic expression separately as well as the transition status of that emotion. Different recognition modules, such as the Support Vector Machine (SVM) [3] and the Hidden Markov Model (HMM) [4], are used to assign the class label.
Gap Analysis: The authors of [5, 6] use only static image-based information in their emotion recognition systems and analyze only the neutral or the peak emotion in the transition of a basic emotion. They do not use temporal information, which shows the dynamic behavior of human emotions over time. Our proposed system tries to capture this temporal behavior by considering intermediate frame-based information in the emotional transition. Although we have chosen a geometric-based recognition system like other existing work, we reduce the dimension of the feature set by selecting crucial triangles from face images, and this gives superior recognition results. Our proposed work makes the following contributions:
(1) Video datasets are used in our work to study delicate variations such as emotion transition.
(2) The popular landmark detection method, the Active Appearance Model (AAM), is applied in our study to locate the important face regions.
(3) The face regions located by the landmark points are used to build geometric features with our proposed triangulation method. Here, the Circumradius-Incenter-Circumcenter (CIC) geometric signature is produced by drawing triangles on face regions and is used as a unique representation of the sequence.
(4) Finally, a Multilayer Perceptron (MLP) is used to classify that signature into several emotion classes.


The rest of the paper is arranged as follows. Section 2 describes our proposed methodology and covers salient landmark detection, feature exploration with triangle formation, feature selection with triangle selection, and MLP-based transition recognition of emotion. Section 3 discusses the experiments and results. Section 4 compares our system with other existing methods. Finally, Sect. 5 concludes the paper.

2 Proposed Methodology

The framework of our proposed work is shown in Fig. 1.

Fig. 1 Diagram of our proposed framework

2.1 Salient Landmark Detection

The geometric locations of facial muscle points are initialized and tracked on consecutive face frames with the well-known landmark tracker, the Active Appearance Model (AAM) [7]. AAM is a statistical model built by combining the shape and texture information of a deformable object. Initially, the model requires a set of annotated face images for fitting. Procrustes analysis is performed on every image to set initial landmark points on the face shape, represented by a vector $s$, and this forms the shape model. With this set of images and initial landmark points, the model is trained repeatedly to match the landmark locations with the mean shape $\bar{s}$. This entire process for the shape model is used in the Active Shape Model (ASM) [8]. Similarly, eigenanalysis is performed to build a texture model that stores the patterns of intensity values or color information of every grayscale image in a vector $g$ after normalization. Finally, these two models (shape and texture) are correlated with each other through parameter learning by the rules $s = \bar{s} + Q_s C$ and $g = \bar{g} + Q_g C$ to get the appearance model. Here, $\bar{s}$ is the mean shape, $\bar{g}$ is the mean texture, $Q_s$ describes the shape variation, and $Q_g$ describes the texture variation obtained from the training images. $C$ is the appearance parameter.

We have used the landmark tracker AAM [9] in our system, and it detects a total of 68 landmark points on every image frame. Among them, only 23 landmarks are considered very sensitive to the various facial expressions [6]. These key points cover all major regions of the face (eyes, eyebrows, lips, and nose). Figure 2 shows the generation of 68 landmarks followed by the collection of 23 landmarks from a single face frame in the sequence.

Fig. 2 Major regions covered by key landmarks

2.2 Feature Exploration with Triangle Formation

In our proposed procedure, we obtain 23 points for each frame of each sequence, and our triangulation method forms a triangle from every combination of three of those points. To derive geometry-based features for each triangle in consecutive frames, we first retrieve the Circumcenter, Incenter, and Circumradius of every triangle over the sequence. The geometric locations of the two centers, Circumcenter ($C$) and Incenter ($I$), for the triangle formed by points $i, j, k$ over the $m$th sequence are given in Eqs. 1 and 2.

$$C^{m}_{i,j,k} = \left[(x^{c}_{1}, y^{c}_{1})^{m}_{i,j,k},\ (x^{c}_{2}, y^{c}_{2})^{m}_{i,j,k},\ (x^{c}_{3}, y^{c}_{3})^{m}_{i,j,k},\ \ldots,\ (x^{c}_{n}, y^{c}_{n})^{m}_{i,j,k}\right] \tag{1}$$

$$I^{m}_{i,j,k} = \left[(x^{I}_{1}, y^{I}_{1})^{m}_{i,j,k},\ (x^{I}_{2}, y^{I}_{2})^{m}_{i,j,k},\ (x^{I}_{3}, y^{I}_{3})^{m}_{i,j,k},\ \ldots,\ (x^{I}_{n}, y^{I}_{n})^{m}_{i,j,k}\right] \tag{2}$$

Next, the distance between them in the $l$th frame is calculated by Eq. 3.

$$(d_l)^{m}_{i,j,k} = \left(\sqrt{(y^{c}_{l} - y^{I}_{l})^2 + (x^{c}_{l} - x^{I}_{l})^2}\right)^{m}_{i,j,k} \tag{3}$$

The distance ($d$) over the $m$th sequence is given by Eq. 4.

$$d^{m}_{i,j,k} = \left[(d_1)^{m}_{i,j,k},\ (d_2)^{m}_{i,j,k},\ (d_3)^{m}_{i,j,k},\ \ldots,\ (d_n)^{m}_{i,j,k}\right] \tag{4}$$

Similarly, we find the Circumradius ($R$) of each triangle formed by points $i, j, k$ over the $m$th sequence, as given by Eq. 5.

$$R^{m}_{i,j,k} = \left[(R_1)^{m}_{i,j,k},\ (R_2)^{m}_{i,j,k},\ (R_3)^{m}_{i,j,k},\ \ldots,\ (R_n)^{m}_{i,j,k}\right] \tag{5}$$

Finally, we obtain the feature vector (CIC), which combines the Circumradius with the Incenter-Circumcenter distance. It is defined by Eq. 6.

$$CIC^{m}_{i,j,k} = \left[(d_1/R_1)^{m}_{i,j,k},\ (d_2/R_2)^{m}_{i,j,k},\ (d_3/R_3)^{m}_{i,j,k},\ \ldots,\ (d_n/R_n)^{m}_{i,j,k}\right] \tag{6}$$

Here, $n$ is the number of frames used in a single sequence. In our computation, $n = 10$. In this manner, we obtain a total of $\binom{23}{3} = 1771$ feature vectors (CIC), which equals the number of triangles created by our proposed triangulation method for each frame. Not all vectors are equally important; some of them have more discrimination power and give efficient results for transition recognition of emotions. The feature exploration is illustrated in Fig. 3.

Fig. 3 CIC feature generation by the triangulation method
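The CIC value of Eq. 6 for one triangle in one frame can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`circumcenter`, `incenter`, `cic_value`, `cic_signature`, `all_triangles`) and the representation of a frame as a list of (x, y) landmark tuples are hypothetical, and the value returned for collinear landmarks is an assumption, since the source does not say how degenerate triangles are handled.

```python
import math
from itertools import combinations

def circumcenter(a, b, c):
    """Circumcenter of triangle abc, or None if the points are collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    det = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(det) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / det
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / det
    return (ux, uy)

def incenter(a, b, c):
    """Incenter: vertices weighted by the length of the opposite side."""
    la, lb, lc = math.dist(b, c), math.dist(c, a), math.dist(a, b)
    p = la + lb + lc
    return ((la * a[0] + lb * b[0] + lc * c[0]) / p,
            (la * a[1] + lb * b[1] + lc * c[1]) / p)

def cic_value(a, b, c):
    """d / R for one triangle: incenter-circumcenter distance over circumradius."""
    center = circumcenter(a, b, c)
    if center is None:
        # Assumption: use the limiting value d/R -> 1 for a degenerate triangle.
        return 1.0
    radius = math.dist(center, a)
    return math.dist(center, incenter(a, b, c)) / radius

def cic_signature(frames, i, j, k):
    """CIC vector of triangle (i, j, k) over a sequence of landmark frames."""
    return [cic_value(frame[i], frame[j], frame[k]) for frame in frames]

def all_triangles(num_landmarks=23):
    """Every combination of three landmark indices."""
    return list(combinations(range(num_landmarks), 3))
```

For a 3-4-5 right triangle, the circumcenter is the midpoint of the hypotenuse and the incenter lies one unit from each leg, so `cic_value((0, 0), (4, 0), (0, 3))` evaluates to 1/sqrt(5). By Euler's relation between the two centers, the ratio always lies in [0, 1), which keeps the feature independent of the scale of the face in the frame.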

2.3 Feature Selection with Triangle Selection

In this section, we describe the detailed procedure for core feature selection, which identifies core triangles from the pool of triangles available in a sequence. The procedure is presented in Algorithm 1. Triangles are first selected according to a variation score. The score captures how sensitive a triangle is to the changes in an emotional transition. It uses our computed CIC features to measure how much a triangle changes over the frames. Hence, it helps to detect the triangles most responsible for the dynamic behavior of human emotions.


Algorithm 1 Core feature identification
Input: Image sequences with the CIC feature vectors of their triangles
Output: Core CIC feature vectors with core triangles
1: Divide the sequences into basic expression groups: $S_{AN}, S_{DI}, S_{FE}, S_{HA}, S_{SA}, S_{SU}$
2: $TOP_{S_{AN}} = NULL$
3: for each sequence $s \in S_{AN}$ do
4:     for each triangle $t \in s$ do
5:         Furnish the computed feature vector $(CIC)^{m}_{i,j,k}$
6:         Compute the variation score by $(VS)^{m}_{i,j,k} = \sum_{l=1}^{10} \left((CIC_{l+1})^{m}_{i,j,k} - (CIC_{l})^{m}_{i,j,k}\right)$
7:     end for
8:     Find the set of top 50 triangles for the sequence based on the variation score, denoted by $TOP_{50}$
9:     Accumulate the important triangles for the anger class by $TOP_{S_{AN}} = TOP_{S_{AN}} \cup TOP_{50}$
10: end for
11: Repeat step 2 to step 10 for the other groups: $S_{DI}, S_{FE}, S_{HA}, S_{SA}, S_{SU}$
12: Obtain $TOP_{S_{DI}}, TOP_{S_{FE}}, TOP_{S_{HA}}, TOP_{S_{SA}}, TOP_{S_{SU}}$
13: Find the core triangles by $coreT = TOP_{S_{AN}} \cap TOP_{S_{DI}} \cap TOP_{S_{FE}} \cap TOP_{S_{HA}} \cap TOP_{S_{SA}} \cap TOP_{S_{SU}}$
14: Obtain the core feature vectors $coreCIC$ by taking the CIC features corresponding to the triangles in $coreT$
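The selection logic of Algorithm 1 can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the names `variation_score`, `top_triangles`, and `core_triangles`, and the data layout (each sequence as a dictionary from a triangle to its CIC vector, each emotion group as a list of such sequences) are assumptions. The score here sums absolute frame-to-frame differences over the frames actually present; that is also an assumption, made because a signed sum of consecutive differences collapses to the difference between the last and first values.

```python
def variation_score(cic):
    """Total absolute frame-to-frame change of one CIC vector (assumed form)."""
    return sum(abs(later - earlier) for earlier, later in zip(cic, cic[1:]))

def top_triangles(sequence, top_k=50):
    """Triangles of one sequence with the highest variation scores.

    sequence: dict mapping a triangle (i, j, k) to its CIC vector.
    """
    ranked = sorted(sequence,
                    key=lambda t: variation_score(sequence[t]),
                    reverse=True)
    return set(ranked[:top_k])

def core_triangles(groups, top_k=50):
    """Union of top triangles within each class, intersected across classes.

    groups: dict mapping an emotion label to a list of sequences.
    """
    per_class = []
    for sequences in groups.values():
        top_class = set()  # initialised once per class, before its sequences
        for sequence in sequences:
            top_class |= top_triangles(sequence, top_k)
        per_class.append(top_class)
    return set.intersection(*per_class) if per_class else set()

def core_features(sequence, core):
    """CIC vectors of one sequence restricted to the core triangles."""
    return {t: sequence[t] for t in sorted(core) if t in sequence}
```

Because the final step is an intersection over all emotion classes, a triangle survives only if it ranks among the most variable triangles in at least one sequence of every class, which is why the number of core triangles can differ widely between datasets.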

2.4 MLP-Based Transition Recognition of Emotion

To classify our extracted feature set into the basic emotional transitions, we use a Multilayer Perceptron (MLP) recognizer, which is an artificial neural network. To model the dependency between the feature sets and the basic expressions, the network is trained in a supervised manner. The network consists of three layers: an input layer with 17710 nodes, a hidden layer with 10 nodes, and an output layer with 6 nodes. In the forward pass, the features are processed with the previously assigned network parameters (connection weights and biases) and the network produces the computed output. The network error is then found by the formula $\varepsilon_j(n) = \tau_j(n) - \delta_j(n)$. Here, $j$ denotes the output neuron, $n$ denotes the input feature, $\tau$ denotes the target output, and $\delta$ denotes the computed output. In the backward pass, the connection weights are tuned with the scaled conjugate gradient backpropagation learning algorithm. The learning process stops when the minimum error value is reached. The error that is minimized is $E(n) = \frac{1}{2} \sum_{j} \varepsilon_j^2(n)$.
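As a small worked example of the two error formulas in this section, the sketch below computes the per-neuron error (target output minus computed output) and the half sum of squared errors for one input. The function names and the sample target and output vectors are hypothetical; the training procedure itself (scaled conjugate gradient backpropagation) is not reproduced here.

```python
def error_signals(target, output):
    """Per-neuron error: target output minus computed output."""
    return [t - o for t, o in zip(target, output)]

def instantaneous_error(target, output):
    """Half the sum of squared per-neuron errors for one input."""
    return 0.5 * sum(e * e for e in error_signals(target, output))

# Hypothetical one-hot target for a 3-class toy case and a computed output.
target = [1.0, 0.0, 0.0]
output = [0.8, 0.1, 0.1]
errors = error_signals(target, output)
total = instantaneous_error(target, output)
```

With these sample values the per-neuron errors are approximately 0.2, -0.1, and -0.1, and the quantity being minimized is approximately 0.03.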

3 Experimentation and Result Discussion

Our system is tested separately on three video sequence datasets, viz., CK+ [10], MMI [11], and MUG [12]. Before the extracted feature set is passed to the MLP modules, each dataset is partitioned into three sets: 70% of the data are used for network training, 15% for validation, and 15% for testing. The training data are used to obtain the best-fit network model, the validation data are used to avoid network overfitting, and the test data are used to measure accuracy on unseen data that appear in neither the training nor the validation set. Our proposed triangle selection method has identified 38 core triangles on CK+, 210 core triangles on MMI, and 600 core triangles on MUG.

Fig. 4 Emotional profiles of happiness expression on CK+, MMI, and MUG datasets

3.1 CK+ Dataset and Results

The dataset is organized by capturing the emotional profiles of 210 different subjects. All subjects are 18–50 years old, and most of them are Euro-American and female. In our experimentation, the dataset includes 327 emotional profiles of the six basic human emotions (stated earlier) and the contempt (CON) expression. Each profile contains 10–60 face frames, where the first frame shows the neutral expression and the last frame shows one of the 7 expressions. Figure 4 displays the video frames of the happiness emotion, and Table 1 gives the identification results of the sequences as a confusion matrix. From this confusion matrix, we found 98.16% overall accuracy, with validation and testing accuracies of 97.95% and 89.76%, respectively. Contempt, happiness, and surprise expressions are identified with 100% accuracy. The lowest accuracy, 89.28%, is found for the sadness class. We also computed the tenfold cross-validation accuracy on this dataset, which is 97.04%.
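The overall and per-class figures quoted for CK+ can be recomputed from the CK+ confusion matrix of Table 1. The sketch below assumes that rows are true classes and columns are predicted classes (the row sums add up to the 327 profiles, which is consistent with that reading); the function and variable names are illustrative only.

```python
LABELS = ["AN", "CON", "DI", "FE", "HA", "SA", "SU"]

# CK+ confusion matrix from Table 1 (assumed: rows = true, columns = predicted).
CK_PLUS = [
    [44, 0, 1, 0, 0, 0, 0],
    [0, 18, 0, 0, 0, 0, 0],
    [1, 0, 58, 0, 0, 0, 0],
    [0, 0, 0, 24, 1, 0, 0],
    [0, 0, 0, 0, 69, 0, 0],
    [2, 0, 0, 0, 0, 25, 1],
    [0, 0, 0, 0, 0, 0, 83],
]

def overall_accuracy(matrix):
    """Percentage of sequences on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total

def per_class_accuracy(matrix, labels):
    """Percentage of each true class that was recognised correctly."""
    return {label: 100.0 * row[i] / sum(row)
            for i, (label, row) in enumerate(zip(labels, matrix))}

overall = overall_accuracy(CK_PLUS)
by_class = per_class_accuracy(CK_PLUS, LABELS)
```

Here 321 of the 327 sequences lie on the diagonal and 25 of the 28 sadness sequences are correct, which matches the reported 98.16% and 89.28% when the percentages are truncated to two decimals.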

3.2 MMI Dataset and Results

A total of 236 MMI videos display emotional profiles for the basic human emotions. These are recorded from 28 different participants. Profiles are recorded with both 90-degree side-view and frontal face images. Among all videos available in MMI, the 202 emotional profiles with frontal face images are used in our experiment. Figure 4 shows an example sequence of the happiness emotion, and the confusion matrix on MMI is given in Table 1. Here, the overall recognition rate is 92.07%, the validation accuracy is 76.66%, and the testing accuracy is 70%. A tenfold cross-validation accuracy of 89.76% is achieved on the MMI dataset. Happiness is recognized with 100% accuracy, and the lowest accuracy is seen for sadness.

Table 1 Confusion matrix on CK+, MMI, and MUG

Confusion Matrix on CK+
       AN   CON   DI   FE   HA   SA   SU
AN     44     0    1    0    0    0    0
CON     0    18    0    0    0    0    0
DI      1     0   58    0    0    0    0
FE      0     0    0   24    1    0    0
HA      0     0    0    0   69    0    0
SA      2     0    0    0    0   25    1
SU      0     0    0    0    0    0   83

Confusion Matrix on MMI
       AN   DI   FE   HA   SA   SU
AN     28    0    1    0    2    0
DI      1   30    1    0    0    0
FE      1    0   25    0    0    2
HA      0    0    0   42    0    0
SA      7    0    0    0   21    0
SU      0    0    1    0    0   40

Confusion Matrix on MUG
       AN   DI   FE   HA   SA   SU
AN     28    0    1    0    2    0
DI      1   30    1    0    0    0
FE      1    0   25    0    0    2
HA      0    0    0   42    0    0
SA      7    0    0    0   21    0
SU      0    0    1    0    0   40

3.3 MUG Dataset and Results

A total of 1462 MUG videos are collected from 86 different participants performing the 6 basic emotions. Among the participants, 51 are male and 35 are female, aged between 20 and 35 years. The emotional profiles are recorded at a rate of 19 frames/s. Our module takes 802 videos from MUG for classification. Figure 4 shows a sample MUG video sequence of the happiness emotion, and Table 1 gives the confusion matrix on MUG. We obtained an overall recognition rate of 98.87% on this dataset. The validation and testing accuracies are 95% and 97.5%, respectively. We also achieved a tenfold cross-validation accuracy of 97.48% on MUG. Here, sadness achieves 100% accuracy, and the lowest accuracy, 98.13%, is obtained for happiness.

Table 2 Comparison of our system performance with various other existing recognition systems

Method       Dataset   Number of classes   Recognizer                          Average accuracy (%)
[13]         CK+       6                   SVM                                 92.54
[14]         CK+       7                   SVM                                 83.00
[15]         CK+       6                   SVM                                 80.00
[16]         MMI       7                   SVM                                 86.90
[17]         MUG       7                   SVM                                 95.24
[3]          CK+       6                   SVM                                 97.80
[3]          MMI       6                   SVM                                 77.22
[3]          MUG       6                   SVM                                 95.50
Our method   CK+       7                   MLP                                 98.16
Our method   CK+       7                   MLP with tenfold cross-validation   97.04
Our method   MMI       6                   MLP                                 92.07
Our method   MMI       6                   MLP with tenfold cross-validation   89.76
Our method   MUG       6                   MLP                                 98.87
Our method   MUG       6                   MLP with tenfold cross-validation   97.48

4 Comparison with Other Existing Approaches

The recognition results computed on the CK+, MMI, and MUG datasets show that our system outperforms other existing approaches in accuracy. For a better understanding of our system performance, we also computed the tenfold cross-validation accuracy and compared it with other approaches. Table 2 summarizes the comparison and lists the comparison parameters: dataset, number of classes, recognizer type, and average accuracy.

5 Conclusion

To better understand the transitional nature of human emotions, it is necessary to identify where the major portions of the face move the most. With this in mind, we developed a triangulation technique that tracks the variation of triangle shapes over the frames of an emotional sequence using the Circumradius-Incenter-Circumcenter (CIC) geometric signature. Different triangles are drawn from different major portions of the face, and they may not be equally responsible for emotion changes. For this reason, our proposed approach identifies the core triangles that have the maximum variation among all possible triangles in a sequence. In our experimentation, these core triangles achieved good tenfold cross-validation accuracy and average accuracy on the CK+, MMI, and MUG datasets and outperformed other approaches. Hence, the results show the capability of core triangles to detect changes in human emotions.

Acknowledgements The authors are grateful to Prof. Maja Pantic and Dr. A. Delopoulos for making the MMI and MUG databases available. The authors also thank the Department of Science and Technology, Ministry of Science and Technology, Government of India, for supporting this research with a DST-INSPIRE Fellowship (INSPIRE Reg. no. IF160285, Ref. No.: DST/INSPIRE Fellowship/[IF160285]). The authors are thankful to the Department of Computer & System Sciences, Visva-Bharati University, for providing infrastructure support.

References
1. Mehrabian A, Russell JA (1974) An approach to environmental psychology. MIT Press
2. Ekman P, Friesen WV (1971) Constants across cultures in the face and emotion. J Pers Soc Psychol 17(2):124
3. Ghimire D, Lee J, Li Z-N, Jeong S (2017) Recognition of facial expressions based on salient geometric features and support vector machines. Multimed Tools Appl 76(6):7921–7946
4. Cruz AC, Bhanu B, Thakoor NS (2014) Vision and attention theory based sampling for continuous facial emotion recognition. IEEE Trans Affect Comput 5(4):418–431
5. Barman A, Dutta P (2020) Human emotion recognition from face images. Springer
6. Barman A, Dutta P (2019) Facial expression recognition using distance and texture signature relevant features. Appl Soft Comput 77:88–105
7. Cootes TF, Edwards GJ, Taylor CJ (2001) Active appearance models. IEEE Trans Pattern Anal Mach Intell 23(6):681–685
8. Cootes TF, Taylor CJ, Cooper DH, Graham J (1995) Active shape models-their training and application. Comput Vis Image Underst 61(1):38–59
9. Sagonas C, Antonakos E, Tzimiropoulos G, Zafeiriou S, Pantic M (2016) 300 faces in-the-wild challenge: database and results. Image Vis Comput 47:3–18
10. Lucey P et al (2010) The extended Cohn-Kanade dataset (CK+): a complete dataset for action unit and emotion-specified expression. In: 2010 IEEE computer society conference on computer vision and pattern recognition-workshops. IEEE
11. Valstar M, Pantic M (2010) Induced disgust, happiness and surprise: an addition to the MMI facial expression database. In: Proceedings of 3rd international workshop on EMOTION (satellite of LREC): corpora for research on emotion and affect
12. Aifanti N, Papachristou C, Delopoulos A (2010) The MUG facial expression database. In: 11th international workshop on image analysis for multimedia interactive services WIAMIS 10. IEEE
13. Yaddaden Y, Adda M, Bouzouane A, Gaboury S, Bouchard B (2017) Facial expression recognition from video using geometric features, pp 4–6
14. Saeed A, Ayoub A-H, Robert N, Moftah E (2014) Frame-based facial expression recognition using geometrical features. Adv Hum Comput Interact
15. Wan C, Tian Y, Liu S (2012) Facial expression recognition in video sequences. In: Proceedings of the 10th world congress on intelligent control and automation. IEEE, pp 4766–4770
16. Shan C, Gong S, McOwan PW (2009) Facial expression recognition based on local binary patterns: a comprehensive study. Image Vis Comput 27(6):803–816
17. Rahulamathavan Y, Phan RC-W, Chambers JA, Parish DJ (2012) Facial expression recognition in the encrypted domain based on local fisher discriminant analysis. IEEE Trans Affect Comput 4(1):83–92

A Study on Radio Labelling of Evolving Trees for Path $P_n$

Alamgir Rahaman Basunia, Laxman Saha, and Kalishankar Tiwary

Abstract A radio labelling of a graph $G$ of diameter $d$ is a function $f$ that assigns a non-negative integer to each vertex of $G$ such that $|f(x) - f(y)| \ge d + 1 - d(x, y)$ holds for every pair of distinct vertices $x, y$ of $G$, where $d(x, y)$ is the distance between $x$ and $y$. The difference between the maximum and minimum values in $f(V)$ is called the span of $f$. The smallest span among all radio labellings of $G$ is called the radio number of $G$. Here we investigate the radio labelling problem for evolving trees of the path $P_n$ and determine the radio number for several trees.

Keywords Channel assignment problem · Radio labelling · Path · Span · Evolving trees

A. Rahaman Basunia · L. Saha (B)
Department of Mathematics, Balurghat College, Balurghat 733101, India
e-mail: [email protected]
A. Rahaman Basunia
e-mail: [email protected]
K. Tiwary
Department of Mathematics, Raiganj University, Raiganj 733134, India
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. Bhattacharyya et al. (eds.), Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1279, https://doi.org/10.1007/978-981-15-9290-4_10

1 Introduction

The channel assignment problem, first posed by Hale [1], led to the development of the concept of radio labelling of graphs. The problem is to allocate channels to different transmitters or stations so that the spectrum of channels used is as small as possible and interference is avoided. The interference between two stations depends on how close they are: it is inversely proportional to the distance between them. To translate this problem into graph theory, we represent each station by a vertex, and we join two vertices by an edge if the corresponding stations are close to each other.

Chartrand et al. [2, 3] first introduced the concept of radio labelling. For two vertices $x$ and $y$, we denote the distance between them by $d(x, y)$. The diameter of $G$ is the greatest distance between any pair of vertices of $G$ and is denoted by diam($G$). A function $g : V(G) \to \{0, 1, 2, \ldots\}$ is called a radio labelling of $G$ if the following condition is satisfied for each pair of distinct vertices $x$ and $y$ in $G$:

$$|g(x) - g(y)| \ge \mathrm{diam}(G) + 1 - d(x, y). \tag{1}$$

The absolute difference between the maximum and minimum values in the range of $g$ is called the span of $g$ and is denoted by span($g$). Thus, $\mathrm{span}(g) = \max\{|g(x) - g(y)| : x, y \in V(G)\}$. The minimum span among all radio labellings of $G$ is called the radio number of $G$, denoted by $rn(G)$.

Many researchers have investigated the radio number of trees. Liu [4] gave a general lower bound for the radio number of trees. There are several families of trees whose radio numbers coincide with this bound (cf. [4–7]). In the literature, almost all results determine optimal radio labellings for a few well-structured trees, for example, the path $P_n$, regular caterpillars, and complete $m$-ary trees. Hence, the following questions arise naturally:

Question 1 How can the radio labelling problem be solved for trees that are not well structured?

Question 2 How is a radio labelling or the radio number affected if we add new vertices by adjoining edges to these well-structured trees?

The main objective of this paper is to address Questions 1 and 2. For Question 1, the idea is as follows: we start with a well-structured tree $T$, evolve this tree by adjoining not just a few vertices but a list of forests (which need not be well structured), and then extend the radio labelling of $T$ to the evolved trees. In this article, we take the path $P_n$ as the initial tree and evolve it by adjoining a list of forests. In Sect. 3, we examine the vulnerability of the radio labelling and the radio number for evolving trees of the path $P_n$.
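The radio condition (1) and the span can be checked mechanically for any small graph. The sketch below is a minimal illustration with hypothetical helper names (`path_graph`, `bfs_distances`, `is_radio_labelling`, `span`); it represents a graph as an adjacency dictionary and tests every pair of distinct vertices by brute force. The sample labelling of the path on five vertices is an example chosen for this sketch, not one taken from the source.

```python
from collections import deque
from itertools import combinations

def path_graph(n):
    """Adjacency dictionary of the path with vertices 0, 1, ..., n - 1."""
    return {v: [u for u in (v - 1, v + 1) if 0 <= u < n] for v in range(n)}

def bfs_distances(adj, source):
    """Distances from source to every vertex of a connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def is_radio_labelling(adj, labels):
    """Check |g(x) - g(y)| >= diam(G) + 1 - d(x, y) for all distinct x, y."""
    dist = {v: bfs_distances(adj, v) for v in adj}
    diameter = max(max(row.values()) for row in dist.values())
    return all(abs(labels[x] - labels[y]) >= diameter + 1 - dist[x][y]
               for x, y in combinations(adj, 2))

def span(labels):
    """Difference between the largest and smallest label used."""
    return max(labels.values()) - min(labels.values())

# Example labelling of the path on 5 vertices (illustrative values).
p5 = path_graph(5)
labelling = {0: 3, 1: 7, 2: 0, 3: 10, 4: 4}
```

For this example, `is_radio_labelling(p5, labelling)` holds and the span is 10, whereas labelling each vertex by its own index fails the condition because adjacent vertices would differ by only 1.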

2 Preliminaries and Notations

For a tree $T$, the weight of $T$ at a vertex $z$ is defined by $\omega(z) = \sum_{u \in V(T)} d(z, u)$. The smallest value of $\omega(z)$ among all vertices $z$ is called the weight of $T$ and is denoted by $\omega(T)$. A vertex $z$ is said to be a centroid of $T$ if $\omega(z) = \omega(T)$. The following result was obtained in [4].

Theorem 1 [4] For an $n$-vertex tree $T$ with diameter $d$,
$$rn(T) \ge (|V(T)| - 1)(\mathrm{diam}(T) + 1) - 2\omega(T) + 1.$$

For an $n$-vertex tree $T$ with diameter $d$, we denote $(|V(T)| - 1)(\mathrm{diam}(T) + 1) - 2\omega(T) + 1$ by $\Omega(T)$. The following theorem is due to Liu et al. [8].

Theorem 2 Let $T$ be an $n$-vertex tree of diameter $2d$ having at least one centroid $s$ of degree 2. If both the left and right branches of $T - s$ contain $d$-level vertices, then
$$rn(T) \ge \Omega(T) + \frac{|V_d|}{2},$$
where $V_d$ denotes the set of all vertices at level $d$.
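The quantities in Theorem 1 are straightforward to compute for a small tree. The sketch below is an illustration under one stated assumption: the weight of the tree is taken as the smallest vertex weight, which is the convention under which the bound is informative for paths. The helper names (`tree_invariants`, `lower_bound`, `path_graph`) are hypothetical.

```python
from collections import deque

def path_graph(n):
    """Adjacency dictionary of the path with vertices 0, 1, ..., n - 1."""
    return {v: [u for u in (v - 1, v + 1) if 0 <= u < n] for v in range(n)}

def bfs_distances(adj, source):
    """Distances from source to every vertex of a tree."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def tree_invariants(adj):
    """Return (diameter, weight of the tree, list of centroids)."""
    dist = {v: bfs_distances(adj, v) for v in adj}
    diameter = max(max(row.values()) for row in dist.values())
    vertex_weight = {v: sum(dist[v].values()) for v in adj}
    # Assumption: the weight of the tree is the smallest vertex weight.
    weight = min(vertex_weight.values())
    centroids = [v for v, w in vertex_weight.items() if w == weight]
    return diameter, weight, centroids

def lower_bound(adj):
    """(|V| - 1)(diam + 1) - 2 * weight + 1, the bound of Theorem 1."""
    diameter, weight, _ = tree_invariants(adj)
    return (len(adj) - 1) * (diameter + 1) - 2 * weight + 1
```

For the path on five vertices, the middle vertex is the only centroid, the weight is 1 + 2 + 1 + 2 = 6, and the bound evaluates to 4 · 5 − 12 + 1 = 9.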

3 Evolving Trees for Paths $P_{2k+1}$ and Their Radio Numbers

Let $\mathcal{F} = \{F_1, F_2, \ldots, F_t\}$ be a collection of forests and $\mathbf{r} = (r_1, r_2, \ldots, r_t)$ be a vector of positive integers. Let $F_i$ have $c_i$ connected components. We construct evolving trees by adjoining new edges to a tree $T$ with the following algorithm.

Algorithm 1: Construction of evolving trees by adjoining new edges between $T$ and a set of forests.
Input: A tree $T$; a collection $\mathcal{F}$ of forests; a vector $\mathbf{r} = (r_F)_{F \in \mathcal{F}}$, where $r_F$ is the number of copies of $F$ to be placed in $T$ by adjoining new edges.
Step 1: Assign a set $U(F) \subset V(T)$ to each $F \in \mathcal{F}$ by the following rule: (a) $|U(F)| = r_F$ and (b) $U(F) \cap U(G) = \emptyset$ for distinct $F, G \in \mathcal{F}$.
Step 2: For each forest $F \in \mathcal{F}$ and each $u \in U(F)$, connect $u$ to a copy of $F$ by adjoining exactly one edge for each connected component of that copy of $F$.
Output: Evolving trees of $T$.

Notation 3.1 We denote by $\mathcal{T}_{\mathcal{F}, \mathbf{r}, U}(T)$ the set of all evolving trees $T'$ of $T$ that are produced by Algorithm 1, provided the diameters of $T'$ and $T$ are the same. From Algorithm 1, it is clear that a copy of a forest $F \in \mathcal{F}$ is adjoined to each of $a$ distinct vertices of $T$ for some $a \in \mathbf{r}$. If all coordinates of $\mathbf{r}$ are equal, say to $a$, then we write $\mathbf{r} = (a)$.

Our main aim is to determine an optimal radio labelling of an evolving tree $T' \in \mathcal{T}_{\mathcal{F}, \mathbf{r}, U}(T)$ by extending an optimal radio labelling of $T$. Since radio labelling is directly related to the diameter of a graph, we consider only evolving trees that have the same diameter as the initial tree. Note that
$$|V(T')| = |V(T)| + \sum_{F \in \mathcal{F}} r_F \cdot |F|$$
for every tree $T' \in \mathcal{T}_{\mathcal{F}, \mathbf{r}, U}(T)$. Lemmas 1–3 are essential to prove our main results.

Lemma 1 Let $T$ be an $n$-vertex tree with diameter $d$. Let $u_0, u_1, \ldots, u_{n-1}$ be an alternating sequence of vertices of $T$ satisfying the following:
(a) $\ell(u_i) + 2\phi(u_i, u_{i+2}) \le \frac{d+1}{2}$, $0 \le i \le n - 3$.
(b) For $i, j \in \{0, 1, 2, \ldots, n-1\}$ with $j > i + 1$, $2\phi(u_i, u_j) \le j - i - 1$.
Then the mapping $f : V(T) \to \{0, 1, \ldots\}$ defined by $f(u_{i+1}) = f(u_i) + d + 1 - \ell(u_i) - \ell(u_{i+1})$ is a radio labelling of $T$.

Proof Let $u_i$ and $u_j$ be two arbitrary vertices of $T$. Without loss of generality, we may assume $j > i$. For $j = i + 1$, the radio condition is automatically satisfied as $u_i$ and $u_{i+1}$ are in different branches of $T$. For $j = i + 2$, from the definition of $f$, we have
$$\begin{aligned}
f(u_{i+2}) - f(u_i) &= 2(d + 1) - (\ell(u_i) + \ell(u_{i+2})) - 2\ell(u_{i+1}) \\
&= 2(d + 1) - d(u_i, u_{i+2}) - 2\ell(u_i) - 2\phi(u_i, u_{i+2}) \\
&\ge d + 1 - d(u_i, u_{i+2}) \quad \text{(from the given condition (a))}.
\end{aligned}$$
Therefore, the radio condition is satisfied for any two distinct vertices $u_i$ and $u_j$ when $j = i + 2$. Now we take $j \ge i + 3$. Then
$$\begin{aligned}
f(u_j) - f(u_i) &= \sum_{t=i}^{j-1} \left[ f(u_{t+1}) - f(u_t) \right] \\
&\ge (j - i)(d + 1) - (\ell(u_i) + \ell(u_j)) - \sum_{t=i+1}^{j-2} (\ell(u_t) + \ell(u_{t+1})) \\
&\ge (j - i)(d + 1) - (\ell(u_i) + \ell(u_j)) - (j - i - 1)d \quad (\text{as } \ell(u_t) + \ell(u_{t+1}) \le d) \\
&= d + 1 - d(u_i, u_j) + j - i - 1 - 2\phi(u_i, u_j) \\
&\ge d + 1 - d(u_i, u_j) \quad \text{(from the given condition (b))}.
\end{aligned}$$

Therefore, the radio condition is satisfied for any two distinct vertices of $T$.

Lemma 2 Let $T$ be an $n$-vertex tree with diameter $d$. Let $u_0, u_1, \ldots, u_{n-1}$ be an arrangement of the vertices of $T$ that satisfies $d(u_i, u_{i+1}) \le \frac{d}{2} + 1$, $0 \le i \le n - 2$. Then the mapping $f : V(T) \to \{0, 1, \ldots\}$ defined by $f(u_{i+1}) = f(u_i) + d + 1 - d(u_i, u_{i+1})$ for $0 \le i \le n - 2$ is a radio labelling of $T$.

Proof Let $u_i$ and $u_j$ be two arbitrary vertices of $T$. We assume $j > i$. From the definition of $f$, it is not hard to see that the radio condition is satisfied for $u_i$ and $u_{i+1}$. So let $j \ge i + 2$. Then, from the definition of $f$,
$$\begin{aligned}
f(u_j) - f(u_i) &= \sum_{t=i}^{j-1} \left[ f(u_{t+1}) - f(u_t) \right] \\
&= (j - i)(d + 1) - \sum_{t=i}^{j-1} d(u_t, u_{t+1}) \\
&\ge (j - i)(d + 1) - (j - i)\left(\frac{d}{2} + 1\right) \quad \text{(from the given condition)} \\
&= \frac{(j - i)d}{2} \ge d \quad (\text{as } j \ge i + 2).
\end{aligned}$$

This shows that $f$ is a radio labelling of $T$.

Remark 1 In the above lemma, if $u_0, u_1, \ldots, u_{n-1}$ is an alternating sequence of $T$, then the condition stated in Lemma 2 reduces to $\ell(u_i) + \ell(u_{i+1}) \le \frac{d}{2} + 1$ and, consequently, the labelling $f$ becomes $f(u_{i+1}) = f(u_i) + d + 1 - \ell(u_i) - \ell(u_{i+1})$, where $i \in \{0, 1, \ldots, n - 1\}$.

Vertex index scheme for $P_{2k+1}$: From here onward, $\{v_0, v_1, \ldots, v_{n-1}\}$ denotes the vertex set $V(P_n)$ of $P_n$. We rearrange the vertices of $P_{2k+1}$ by $w_i = v_{\sigma(i)}$, where $\sigma$ is the permutation of $\{0, 1, \ldots, 2k\}$ defined by $\sigma(0) = k$, $\sigma(1) = 0$ and, for $i \ge 2$,
$$\sigma(i) = k - \frac{i - 1}{2} \quad \text{or} \quad 2k + 1 - \frac{i}{2}$$
according as $i$ is odd or even.

Lemma 3 For the arrangement $\gamma : w_0, w_1, \ldots, w_{n-1}$ of the vertices of $P_{2k+1}$, the following hold.
(a) $w_{2i-1}$ and $w_{2i}$ ($i \ge 1$) are left and right side vertices of $P_n$, respectively, with respect to the middle vertex as we move from left to right.
(b) $d(w_0, w_1) = k$, $d(w_1, w_2) = 2k$ and, for $i \ge 2$, $d(w_i, w_{i+1}) = k$ or $k + 1$ according as $i$ is odd or even.
(c) Let $d(w_a, w_b) \le k - 1$. If $w_a$ and $w_b$ are in the left and right sides of the middle point of $P_{2k+1}$, then $a < b + 4$.
(d) The mapping $g : \{w_0, w_1, \ldots, w_{n-1}\} \to \{0, 1, \ldots\}$ defined by $g(w_{i+1}) = g(w_i) + 2k + 1 - \ell(w_i) - \ell(w_{i+1}) + \delta$, where $\delta = 1$ if $i = 2$ and $0$ otherwise, is an optimal radio labelling of $P_{2k+1}$.

Theorem 3 For a null forest $F$, let $\mathcal{F} = \{F\}$, $\mathbf{r} = (2)$ and $U = \{U(F)\} = \{\{u, w\}\}$, where $u$ and $w$ are on opposite sides of the middle vertex of $P_{2k+1}$ with $d(u, w) \le k - 1$. Then for every tree $T \in \mathcal{T}_{\mathcal{F}, \mathbf{r}, U}(P_{2k+1})$, $rn(T) = \Omega(T) + 1$.

Proof Let $T$ be an arbitrary tree in $\mathcal{T}_{\mathcal{F}, \mathbf{r}, U}(P_{2k+1})$. Then $|V(T)| = |V(P_{2k+1})| + 2|F| = 2k + 1 + 2|F|$. By the symmetry of $P_n$, we may assume that $u = w_p$ and $w = w_q$ are left and right side vertices of $P_n$, respectively, with respect to the middle vertex as we move from left to right. Then, from Lemma 3, $p$ and $q$ are odd and even integers, respectively. Moreover, $d(w_p, w_q) \le k - 1$ implies $p < q$ by Lemma 3. Let $V_c$ be the vertex set of $T$.
Recall that the vertices of P2k+1 may be written in terms of wi ’s as presented in vertex index scheme. Let G be a copy of F. Let x1 , x2 , . . . , xr be the adjacent


vertices to w p and y1 , y2 , . . . , yr be the same for wq . Let us denote {x1 , x2 , . . . , xr } and {y1 , y2 , . . . , yr } by V (F) and V (G), respectively. Now consider an arrangement of vertices of T as γ : w0 w1 . . . , w p y1 x1 . . . yr xr w p+1 . . . wn−1 . Consider a function f : V (T ) → {0, 1, . . .} defined by f (wt ) = g(wt ), 0  t  p; f (z 1 ) = f (w p ) + 2k + 1 − (w p ) − (z 1 ); f (z t+1 ) = f (z t ) + 2k + 1 − (z t ) − (z t+1 ), 1  t  r − 1; f (w p+1 ) = f (zr ) + 2k + 1 − (zr ) − (w p+1 ); f (wt+1 ) = f (wt ) + 2k + 1 − (wt ) − (wt+1 ), p + 1  t  n − 1; where z 2i−1 = yi and z 2i = xi for 1  i  r . We show f is a radio labelling of T . We consider the partition of V (T ) with partite sets U1 , U2 , U3 , where U1 = {w0 , w1 , . . . , w p }; U2 = {z 1 , z 2 . . . , zr } and U3 = {w p+1 , . . . , wn−1 }. Our claim | f (x) − f (y)| = |g(x) − g(y)|  2k + 1 − d(x, y) for all x, y ∈ U1 ∪ U2 ∪ U3 . For any two vertices x, y ∈ U1 , | f (x) − f (y)| = |g(x) − g(y)|  2k + 1 − d(x, y) as g is a radio labelling of P2k+1 . As d(w p , wq )  k − 1, so d(w p , y1 ) = d(w p , wq ) + 1  k and d(xi , yi ) = d(w p , wq ) + 2  k + 1. Also from Lemma 3, we have d(wi , wi+1 )  k + 1 and d(xr , w p+1 ) = d(w p , w p+1 ) + 1 = k + 1 as p is odd and so d(w p , w p+1 ) = k. Combining all of these distance inequalities or equalities, we have for any two consecutive vertices a, b in the segment w p y1 x1 . . . yr xr w p+1 . . . wn−1 of γ , d(a, b)  k + 1. Thus, by the logic given in Lemma 2, we have | f (x) − f (y)  2k + 1 − d(x, y) for all x, y ∈ U2 ∪ U3 . Now the remaining case is to show | f (x) − f (y)| = |g(x) − g(y)|  2k + 1 − d(x, y) for all x ∈ U1 and y ∈ ∪U2 ∪ U3 . For x = w p and y = z 1 , this is true directly from the definition of f . We take x = w p and y = z 2 . 
Now we calculate f (z 2 ) − f (w p ) via z 1 as f (z 2 ) − f (w p ) = f (z 2 ) − f (z 1 ) + f (z 1 ) − f (w p )  2(2k + 1) − ((z 2 ) + (z 1 )) − ((z 1 ) + (w p ) = 2(2k + 1) − 2(k + 1) = 2k  2k + 1 − d(w p , z 2 ).

(2)

Since f (x) < f (w p ) and f (y) > f (z 2 ) for all x ∈ U1 \ {w p } and y ∈ U2 ∪ U3 \ {z 1 , z 2 }, inequality (2) implies that f (y) − f (x) > f (z 2 ) − f (w p )  2k for all x ∈ U1 \ {w p } and y ∈ U2 ∪ U3 \ {z 1 , z 2 }. Therefore, f is a radio labelling of T . It is clear that span( f ) = Ω(T ) + 1. From Theorem 2, r n(T )  Ω(T ) + 1. Hence the theorem. Theorem 4 For a collection F of null forests, let r = (2) F∈F and U = {U (F) : F ∈ F}, where u, v ∈ U (F) if u and v are in opposite sides of the middle vertex of P2k+1 with d(u, v)  k − 1. Then for every tree T ∈ TF, r, U (P2k+1 ),


r n (T ) = Ω(T ) + 1. Proof Let T be an arbitrary tree in TF, r, U (T ), where F, r and U are given as in the statement of this theorem. Here r F = 2 for all F ∈ F, so |V (T )| =  |V (P2k+1 | + 2 F∈F |F| and two copies of F are to be placed in P2k+1 by adjoining edges according to Algorithm 1 for each F ∈ F. Here for each F ∈ F, the set U (F) contains exactly two vertices, one is in left side and another is in right side with respect to the middle vertex of P2k+1 , and two copies of F are to be placed into these two vertices by adjoining edges according to Algorithm 1. For each F ∈ F, let F L and F R be two copies of F which are to be placed in left and right sides of the middle vertex of P2k+1 , respectively. Algorithm 2: Radio labelling of all trees in the class TF, r, U (T ) and in some subclass of TF, r, U (T ) when T is a path. Input: P2k+1 , F, r and U. Initialization: V (P2k+1 ) = {w0 , w1 , . . . , wn−1 } according to the vertex index scheme as explained in first paragraph of this section. Let us denote c by the number of expected visit of Step 1 and let initially it be 1. Let γ (c − 1) : w0 w1 . . . wn−1 be the sequential arrangement of verticesof P2k+1 and g = f c−1 be the radio labelling of F. Pn as in Lemma 3. Denote Fc = F∈F

Step 1: Letpc be the smallest integer such that w pc is adjacent to some vertex u c ∈ Fc = F∈F F. Let u c ∈ Fc for some Fc ∈ F. Extend γ (c − 1) to the sequential arrangement γ (c) : w0 , w1 , w pc (FcR ↔ FcL )w pc +1 . . . wn−1 . Here (FR ↔ FL ) stands for an alternating arrangement amongst the vertices of FcR and FcL starting from a vertex in FcR . Step 2: Rename the vertices of γ (w pc ) by u 0 , u 1 , u 2 , . . . , u 2k+1 , . . . , u |γ (w pc )|−1 , where u i is the (i + 1)-th element in γ (w pc ) for 0  i  |γ (w pc )| − 1 and |γ (w pc )| denotes the number of strings (vertices) in γ pc . Step 3: Define a mapping f c : {u 0 , u 1 , u 2 , . . . , u 2k+1 , . . . , u |γ (w pc )|−1 } → {0, 1, . . .} by f c (u i ) = f c−1 (u i ), 0  i  w pc − 1; f c (u i+1 ) = f c (u i ) − (u i ) − (u i+1 ), w pc  i  |γ (w pc )| − 2. Step 4: If Fc \ Fc = Φ, then stop. Otherwise, do Step 1 to Step 3 after the following replacement: (a) replace Fc by Fc \ Fc and (b) replace c by c + 1. Output: Radio labelling of all trees in TF, r, U (T ). Remark 2 By reasoning proved in Theorem 3, it can be shown that each f c defined in Algorithm 2 is a radio labelling of some sub-tree of T ∈ TF, r, U (T ). For a forest F, we define h(F) = max{h(T ) : T is connected component of F} and call it by height of F, where h(T ) represents the height of the tree T . For the proof of two theorem stated below, we apply Algorithm 2.


Theorem 5 Let \(\mathcal{F}\) consist of a single forest \(F\) of height \(h < k - 2\), let \(r = (2)\) and \(U = \{U(F)\} = \{\{u, w\}\}\), where \(u\) and \(w\) are on opposite sides of the middle vertex of \(P_{2k+1}\) with \(d(u, w) \le k - h - 1\). Then for every tree \(T \in \mathcal{T}_{\mathcal{F}, r, U}(P_{2k+1})\), \(rn(T) = \Omega(T) + 1\).

Theorem 6 For a collection \(\mathcal{F} = \{F_i : 1 \le i \le t\}\) of forests, let \(r = (2, 2, \ldots, 2)\) and \(U = \{U(F) : F \in \mathcal{F}\}\), where \(u, w \in U(F)\) if \(u\) and \(w\) are on opposite sides of the middle vertex of \(P_{2k+1}\) with \(d(u, w) \le k - 1 - h(F)\). Then for every tree \(T \in \mathcal{T}_{\mathcal{F}, r, U}(P_{2k+1})\), \(rn(T) = \Omega(T) + 1\).

Similar results also hold for the even path \(P_{2k}\).
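The labelling of Lemma 3(d) can be checked by brute force on small paths. The sketch below is only an illustration under stated assumptions: it takes the level of a vertex to be its distance from the middle vertex \(v_k\), reads the vertex index scheme as \(\sigma(0) = k\), \(\sigma(1) = 0\), and \(\sigma(i) = k - (i-1)/2\) or \(2k + 1 - i/2\) according to whether \(i\) is odd or even, and tests the radio condition \(|g(x) - g(y)| \ge 2k + 1 - d(x, y)\) for every pair of vertices. The function names are hypothetical.

```python
from itertools import combinations


def vertex_order(k):
    """Return sigma(0..2k), where w_i = v_{sigma(i)} on the path P_{2k+1}."""
    sigma = [k, 0]  # sigma(0) = k (middle vertex), sigma(1) = 0
    for i in range(2, 2 * k + 1):
        if i % 2 == 1:
            sigma.append(k - (i - 1) // 2)    # odd i: vertex on the left side
        else:
            sigma.append(2 * k + 1 - i // 2)  # even i: vertex on the right side
    return sigma


def label_path(k):
    """Label P_{2k+1} along w_0, w_1, ...; returns {vertex index: label}."""
    sigma = vertex_order(k)

    def level(v):
        # Assumption: the level of v_j is its distance |j - k| from the middle vertex.
        return abs(v - k)

    g = {sigma[0]: 0}
    for i in range(2 * k):
        delta = 1 if i == 2 else 0
        a, b = sigma[i], sigma[i + 1]
        g[b] = g[a] + 2 * k + 1 - level(a) - level(b) + delta
    return g


def is_radio_labelling(k, g):
    """Check |g(u) - g(v)| >= 2k + 1 - d(u, v) for all pairs (diameter is 2k)."""
    return all(abs(g[u] - g[v]) >= 2 * k + 1 - abs(u - v)
               for u, v in combinations(range(2 * k + 1), 2))


if __name__ == "__main__":
    for k in range(2, 6):
        labels = label_path(k)
        print(k, is_radio_labelling(k, labels), max(labels.values()))
```

For the small values of \(k\) tried here, the labelling satisfies the radio condition and has span \(2k^2 + 2\); the check is exhaustive over pairs, so it is practical only for short paths.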

4 Conclusion

This article investigated the radio labelling problem for trees that are not well structured. It identifies how vulnerable the radio number is when edges or vertices are adjoined to a well-structured tree, namely a path. The same concept can be used to compute this vulnerability for other families of trees, such as complete binary trees, caterpillars, and m-distance trees for small m. We determined the radio numbers of evolving trees for paths \(P_n\) obtained by inserting a list of forests into the path.

Acknowledgements The second author is grateful to the National Board for Higher Mathematics (NBHM), India, for providing monetary and logistic support (Grant No. 2/48(22)/R & D II/4033).


Secure Blockchain Smart Contracts for Efficient Logistics System

Ajay Kumar and Kumar Abhishek

Abstract Blockchain and allied technologies have been investigated as a panacea for the problems faced by the supply chain and logistics industry. However, earlier studies have focused on limited aspects of a typical supply chain, such as monitoring assets and securing traceability, while largely neglecting data integrity and data access. To overcome these drawbacks, this paper proposes secure smart contracts based on the ERC20 interface in a permissioned blockchain, with relevant processes and functions, to obtain a holistic framework for securing supply chain and logistics operations. The efficacy of the proposed framework is demonstrated in a case study, which shows that critical loopholes in a current supply chain can be overcome using the proposed framework. Additionally, several directions for future research are outlined.

Keywords Blockchain · Distributed ledger · Intelligent logistics system · Secure smart contracts

A. Kumar (corresponding author) · K. Abhishek, Department of Computer Science & Engineering, NIT Patna, Bihar, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. Bhattacharyya et al. (eds.), Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1279, https://doi.org/10.1007/978-981-15-9290-4_11

1 Introduction

Blockchain technologies have the potential to reduce costs in supply chain and logistics by 15%, contributing to profits and improving the margins of organizations [1]. Several startups have taken the initiative to provide blockchain-integrated supply chain monitoring solutions to the industry [2]. Supply chain and logistics are two sides of the same coin. The supply chain focuses on building a cohesive system for delivering value created at one point to the end user. Logistics links the various entities and defines their roles and responsibilities in the supply chain [2, 3]. However, the diverse entities present in the supply chain create coordination issues (e.g., exporting from East Africa to Europe requires 200 interactions between 30 or more entities) [4, 5]. Excessive paperwork and regulatory compliance are time-consuming in the case of inter-state or inter-continental movement of goods. Between 25 and 50% of supply chain experts advocate blockchain for reducing transaction costs and improving transparency in the supply chain [6]. Earlier, paper-based systems were used, which suffered from the drawbacks of file management systems. With the digitization brought about by databases, the issues of paper-based procedures were overcome to a large extent [7].

Conceived in 2008 by an unknown individual (or group of individuals) using the pseudonym "Satoshi Nakamoto", the Bitcoin cryptocurrency has since emerged as the most successful cryptocurrency among its peers, reaching an adoption level unrealized by older digital currencies [8–10]. As of 19 March 2020, Bitcoin had a market cap of USD 98,584,789,143 with 18,277,112 bitcoins (BTCs) in circulation, each valued at USD 5,393.89. Bitcoin differs from its traditional online banking peers by relying on a decentralized consensus scheme for verifying the correctness and authenticity of currency transfers between users [11–13]. The decentralized consensus scheme is made possible by an organized collection of nodes in the Bitcoin system known as "miners". The miners confirm the authenticity of each transaction. This increases security in the Bitcoin system and upholds the core philosophy of Bitcoin, "maintain trust in an untrusted environment", without the need for a trusted third party. As a reward, miners collect transaction fees for the transactions that they confirm. In the Bitcoin network, individual users use a private key to sign transactions. Without the private key, no party can transfer funds. Public key cryptography makes it easy to verify that a transaction was initiated by a given user without the user having to divulge the private key [14]. Thus, Bitcoin ensures integrity and non-repudiation of transactions. Also, as Bitcoin users do not disclose their identities, anonymity and privacy are ensured.

1.1 Contributions

• Explored the applicability of blockchain in streamlining the supply chain and logistics
• Conducted a systematic study of blockchain and allied techniques
• Proposed a distributed ledger technology-based framework for logistics and supply chain operation management
• Conducted a thorough comparison of the proposed framework with the existing payment process in trade.

1.2 Novelty

There is a lack of research, in the financial sector and even in logistics, on applying blockchain to provide decentralized decision-making, integrity protection, anonymity, and transaction confidentiality. Also, few logistics experts are aware of blockchain and pursue its implementation [5, 15]. The proposed work addresses the above-mentioned problems and provides a framework that highlights the scope for future research in this domain.

1.3 Outline

The remainder of the paper is organized as follows: Sect. 2 gives the background and state-of-the-art techniques on blockchain; Sect. 3 discusses the proposed work; Sect. 4 gives the results and a discussion of the framework; and the conclusion is given in Sect. 5.

2 Related Works

In this section, we describe the current state of the trade process in India that a blockchain application service would address. Figure 1 describes the steps needed to export pharmaceutical products from India. The exporter needs to apply for a certificate of origin and a no-objection certificate for the consignment. The exporter also needs to fill in forms for excise and value-added tax exemption. He then delivers the consignment to the customs clearing house and obtains a shipping bill, which contains information such as the buyer, the buyer's destination, the time needed to ship the goods to the destination, the value of the goods, the time needed to receive payment from the seller, the quantity of goods, and a unique identifier for the type of goods. With the shipping bill, the exporter approaches the shipping company to transport the goods. The shipping company provides the exporter with a bill of lading. Upon showing the bill of lading to the exporter's bank, the exporter receives his payment. The importer arranges for his bank to issue a letter of credit in favor of the exporter. The letter of credit is forwarded to the exporter's bank, and when the terms mentioned in the letter of credit are fulfilled by the exporter, the exporter's bank receives the payment made by the importer. The importer's bank transfers the payment for the goods to the exporter's bank in return for the bill of lading. Once the importer's bank receives the bill of lading, the importer is informed. With the bill of lading, the importer can collect his shipment from the shipping company, subject to customs clearance. The intermediary banks ensure trust between exporter and importer and receive convenience fees in exchange.

Fig. 1 Export and import procedures: sequence diagram

2.1 Issues

• Bill of lading is forged to claim the consignment fee
• Letter of credit is forged to defraud exporter
• Bill of lading is lost and importer cannot claim the consignment
• Bill of lading is stolen and consignment is claimed in place of importer
• Bill of lading is stolen and consignment fee is claimed in place of exporter
• Consignment is overvalued or undervalued to claim lower tax or higher insurance payout
• Convenience fee of intermediary banks may increase costs.

2.2 State-of-the-Art

In this subsection, we study how the above issues relate to existing work on blockchain applications. The key intermediary is the bank, which acts as a trusted third party and a channel of communication between the various actors. For its mediation, the bank earns a commission or fee. Owing to this commission, transactions of very small amounts are infeasible. Kuhi et al. [16] focused on evaluating the efficacy of blockchain-based systems in a logistics case study. Hackius et al. [15] interviewed stakeholders in multiple logistics and supply chain organizations to understand how their organizations had adopted blockchain for seamless operations. The authors also highlighted the hype versus the reality of blockchain in the logistics sector. Fu et al. [1] studied the application of blockchain in diminishing security threats and traceability and privacy risks in intelligent logistics systems. Li et al. [3] presented a model for improving the traceability of transactions in blockchain. The aim of their model was to overcome the issues related to consensus algorithms. Maiti et al. [17] integrated sensors with blockchain to create a system for monitoring assets in the supply chain. Their framework used blockchain to calculate asset quality in the supply chain. Caro et al. [18] proposed integrating sensors for farm-to-fork monitoring of agro-based commodities. However, the authors focused only on domestic trade in agro-based commodities, which has a different supply chain from international trade in commodities. From the literature, it is concluded that blockchain is an important technology that can have a far-reaching impact on the logistics sector. However, as limited work has been proposed in this domain, the current paper proposes a framework for securing logistics and supply chain management using blockchain.


3 System Framework

The proposed framework integrates smart contracts with a private, permissioned blockchain to ensure a seamless experience for the actors involved in an international transaction (see Fig. 2). The actors and instruments participating in such a transaction are listed along with their roles:

• Letter of credit (LC): Documentary agreement of clauses between trading partners that establishes payment if the specified clauses are met.
• Smart contracts: Programmable protocols capable of self-executing if the conditions specified while framing the contract are met.
• Exporter and importer: Trading partners.
• Regulator: Statutory body.
• Issuing bank and advising bank: The issuing bank issues the LC to the importer. The advising bank receives the bill of lading (BL) from the exporter.

The architecture has modules that define the roles of the various users of the intelligent trading system. It uses smart contracts to avoid centralized decision-making. The roles of the various smart contracts are explained further. With the use of blockchain, the entire flow of interaction is documented on the distributed ledger. It is tamper-proof and also provides traceability for audit. All entities are identified by wallet addresses, and hence the system provides pseudo-anonymity. The roles of the smart contracts are specified in Fig. 2; with smart contract support, the intermediaries in the form of the issuing and advising banks can be eliminated.

Fig. 2 Flow of interaction between letter of credit smart contract with Alice (importer) and Bob (exporter)

Table 1 Data integrity in supply chain

Scheme                LC     BL     Export order   Contract
Kuhi et al. [16]      No     Yes    Yes            Yes
Hackius et al. [15]   Yes    Yes    No             Yes
Fu et al. [1]         Yes    No     Yes            Yes
Proposed              Yes    Yes    Yes            Yes

With the ERC20 standard, the letter of credit smart contract has four interactions (see Fig. 2). Alice creates an AliceCoin and an AliceLC ERC20 contract. She issues a token to her own address. The AliceLC contract issues this token to Bob on receiving the bill of lading documents. Alice calls the approve() method, allowing AliceLC to issue the token. This triggers an approval authorizing AliceLC to transfer the token on receipt of the BL. When Bob transfers the BL, AliceLC calls transferFrom(), issuing the AliceCoin token to Bob, which can be used to withdraw money.
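The approve() and transferFrom() interaction described in this section can be sketched as a small in-memory model. This is not the Solidity contract deployed by the authors: it is a plain Python simulation, and the class names `Token` and `LetterOfCredit`, the method `submit_bill_of_lading`, the addresses, and the amounts are all hypothetical. The method `transfer_from` mirrors the role of ERC20 transferFrom() in snake case.

```python
class Token:
    """Minimal in-memory model of an ERC20-style token (balances and allowances)."""

    def __init__(self, owner, supply):
        self.balances = {owner: supply}  # tokens issued to the creator's own address
        self.allowances = {}             # (owner, spender) -> approved amount

    def approve(self, owner, spender, amount):
        # The owner authorizes the spender to move up to `amount` tokens.
        self.allowances[(owner, spender)] = amount

    def transfer_from(self, spender, owner, recipient, amount):
        # The spender moves previously approved tokens from owner to recipient.
        allowed = self.allowances.get((owner, spender), 0)
        if amount > allowed or amount > self.balances.get(owner, 0):
            raise ValueError("transfer exceeds allowance or balance")
        self.allowances[(owner, spender)] = allowed - amount
        self.balances[owner] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


class LetterOfCredit:
    """Hypothetical LC contract: pays the exporter once the bill of lading arrives."""

    def __init__(self, address, token, importer, exporter, amount):
        self.address, self.token = address, token
        self.importer, self.exporter, self.amount = importer, exporter, amount
        self.paid = False

    def submit_bill_of_lading(self, sender, document):
        # Only the exporter may submit, the document must be non-empty, and
        # payment is released at most once.
        if sender != self.exporter or not document or self.paid:
            raise ValueError("bill of lading rejected")
        self.token.transfer_from(self.address, self.importer,
                                 self.exporter, self.amount)
        self.paid = True


def run_demo():
    coin = Token(owner="alice", supply=100)                     # Alice creates AliceCoin
    lc = LetterOfCredit("alice_lc", coin, "alice", "bob", 40)   # and the AliceLC contract
    coin.approve("alice", "alice_lc", 40)                       # Alice calls approve()
    lc.submit_bill_of_lading("bob", "BL-0001")                  # BL triggers the transfer
    return coin.balances


if __name__ == "__main__":
    print(run_demo())
```

The allowance is what lets the contract, rather than Alice herself, release the payment: Alice commits the funds up front through the approval, and the transfer then depends only on Bob delivering the BL.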

3.1 Experimental Setup

The MetaMask Chrome extension was used to set up the Ethereum wallet for creating smart contracts [19]. The Ropsten test network on Ethereum was used to deploy the proof of concept. The contracts were written in the Solidity language in the Remix IDE and compiled with Solidity compiler version 0.6.0 into Ethereum virtual machine (EVM) bytecode for deployment on the Ropsten test network blockchain (Table 1).

4 Case Study and Evaluation

The feasibility and efficacy of the system proposed in Sect. 3 are evaluated to check whether it tackles the issues listed in Sect. 2.1:

• Bill of lading is stolen, and the consignment is claimed in place of the importer: The BL is a digital document on the blockchain and hence cannot be stolen or tampered with. Without the private keys of all parties involved (importer, exporter, shipping company), the consignment cannot be claimed.
• Bill of lading is stolen, and the consignment fee is claimed in place of the exporter: The BL is a digital document on the blockchain and hence cannot be stolen or tampered with.
• Consignment is overvalued or undervalued to claim lower tax or higher insurance payout: The consignment value is based on the Merchant smart contract, which is agreed upon by the exporter and importer with the regulator and insurer. It cannot be changed once agreed and is placed on the blockchain, making it tamper-proof.
• Convenience fee of intermediary banks may increase costs: The intermediaries are miners that verify transactions and receive a fee for their work. Fees are minimal as the verification process is automated.

Figure 3 illustrates the features and specifications supported by the proposed model vis-à-vis other state-of-the-art models. The functionalities supported by the proposed model (see Fig. 3) are timely notifications of insurance, consignment delivery, document delivery, and shipping documents, as well as blockchain-based data storage and retrieval.

Fig. 3 Functional characteristics supported
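The tamper-evidence property that the case study relies on can be illustrated with a minimal hash-linked ledger. The sketch below is a toy model only: it uses SHA-256 from the Python standard library, the record contents and function names are hypothetical, and it leaves out the consensus, digital signatures, and networking that a real permissioned blockchain adds.

```python
import hashlib
import json


def block_hash(index, payload, prev_hash):
    """SHA-256 over the block's contents and the previous block's hash."""
    data = json.dumps({"index": index, "payload": payload, "prev": prev_hash},
                      sort_keys=True)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a record (e.g. a bill of lading) linked to the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({"index": index, "payload": payload, "prev": prev_hash,
                  "hash": block_hash(index, payload, prev_hash)})


def is_valid(chain):
    """Recompute every hash and check each block's link to its predecessor."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else "0" * 64
        if block["prev"] != expected_prev:
            return False
        if block["hash"] != block_hash(block["index"], block["payload"],
                                       block["prev"]):
            return False
    return True


def build_demo_chain():
    chain = []
    append_block(chain, {"doc": "letter_of_credit", "value": 40})
    append_block(chain, {"doc": "bill_of_lading", "consignment": "C-001"})
    append_block(chain, {"doc": "payment", "value": 40})
    return chain


if __name__ == "__main__":
    ledger = build_demo_chain()
    print(is_valid(ledger))
    ledger[1]["payload"]["consignment"] = "C-999"  # tamper with the stored BL
    print(is_valid(ledger))
```

Altering a stored record changes its recomputed hash, so the check fails unless every later block is rewritten as well, which is what the other participants' copies of the ledger are meant to prevent.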

5 Conclusion

The logistics industry is searching for new technologies to improve existing processes, cut costs, and increase supply chain transparency. Blockchain technology provides a solution to most current problems. As blockchain and its application in supply chain and logistics are still in their infancy, there is a need for application-oriented research. The current paper demonstrates a use-case exemplar of blockchain for securing the supply chain and logistics in international trade. Through the blockchain framework, it was demonstrated that fraud scenarios in traditional supply chain operations can be avoided. Additionally, lack of coordination between the entities involved in logistics, excessive paperwork, centralized decision-making, data unavailability, lack of security, lack of confidentiality, and lack of non-repudiation are other issues tackled by the proposed framework. The critical question that arises from our research is whether the missing data integrity features explain the insufficiency of existing research. Additional security features of blockchain need to be engineered to improve functionality; these tasks are earmarked for future work.


References

1. Fu Y, Zhu J (2019) Operation mechanisms for intelligent logistics system: a blockchain perspective. IEEE Access 7:144202–144213
2. Perboli G, Musso S, Rosano M (2018) Blockchain in logistics and supply chain: a lean approach for designing real-world use cases. IEEE Access 6:62018–62028
3. Li X, Lv F, Xiang F, Sun Z, Sun Z (2020) Research on key technologies of logistics information traceability model based on consortium chain. IEEE Access 8:69754–69762
4. Chang SE, Chen Y (2020) When blockchain meets supply chain: a systematic literature review on current development and potential applications. IEEE Access 8:62478–62494
5. Hackius N, Petersen M (2017) Blockchain in logistics and supply chain: trick or treat? In: Digitalization in supply chain management and logistics: smart and digital solutions for an industry 4.0 environment. Proceedings of the Hamburg international conference of logistics (HICL), vol 23, pp 3–18. epubli GmbH, Berlin
6. Kolb J, Becker L, Fischer M, Winkelmann A (2019) The role of blockchain in enterprise procurement. HICSS, pp 1–10
7. Paardenkooper K (2019) Creating value for small and medium enterprises with the logistic applications of blockchain. In: International conference on digital technologies in logistics and infrastructure (ICDTLI 2019). Atlantis Press
8. Park S, Im S, Seol Y, Paek J (2019) Nodes in the bitcoin network: comparative measurement study and survey. IEEE Access 7:57009–57022
9. Feng Q, He D, Zeadally S, Khan MK, Kumar N (2019) A survey on privacy protection in blockchain system. J Netw Comput Appl 126:45–58
10. Wang L, Shen X, Li J, Shao J, Yang Y (2019) Cryptographic primitives in blockchains. J Netw Comput Appl 127:43–58
11. Rahouti M, Xiong K, Ghani N (2018) Bitcoin concepts, threats, and machine-learning security solutions. IEEE Access 6:67189–67205
12. Nakamoto S (2019) Bitcoin: a peer-to-peer electronic cash system. Technical report, Manubot
13. Aggarwal S, Chaudhary R, Aujla GS, Kumar N, Choo KKR, Zomaya AY (2019) Blockchain for smart communities: applications, challenges and opportunities. J Netw Comput Appl 144:13–48
14. Astarita V, Giofrè VP, Mirabelli G, Solina V (2020) A review of blockchain-based systems in transportation. Information 11(1):21
15. Hackius N, Petersen M (2020) Translating high hopes into tangible benefits: how incumbents in supply chain and logistics approach blockchain. IEEE Access 8:34993–35003
16. Kuhi K, Kaare K, Koppel O (2018) Ensuring performance measurement integrity in logistics using blockchain. In: 2018 IEEE international conference on service operations and logistics, and informatics (SOLI), pp 256–261. IEEE
17. Maiti A, Raza A, Kang BH, Hardy L (2019) Estimating service quality in industrial internet-of-things monitoring applications with blockchain. IEEE Access 7:155489–155503
18. Caro MP, Ali MS, Vecchio M, Giaffreda R (2018) Blockchain-based traceability in agri-food supply chain management: a practical implementation. In: 2018 IoT vertical and topical summit on agriculture-tuscany (IOT Tuscany), pp 1–4. IEEE
19. Lee WM (2019) Using the MetaMask Chrome extension. In: Beginning Ethereum smart contracts programming, pp 93–126. Springer

COVID-19 Outbreak Prediction Using Quantum Neural Networks

Pranav Kairon and Siddhartha Bhattacharyya

Abstract Artificial intelligence has become an important tool in the fight against COVID-19. Machine learning models for predicting the COVID-19 global pandemic have shown higher accuracy than the statistical models previously used by epidemiologists. With the advent of quantum machine learning, we present a comparative analysis of continuous variable quantum neural networks (variational circuits) and the quantum backpropagation multilayer perceptron (QBMLP). We analyze the convoluted and sporadic data of two affected countries and hope that our study will help in effective modeling of the outbreak while shedding light on the bright future of quantum machine learning.

Keywords COVID-19 · Corona virus · Quantum machine learning · Quantum neural network

P. Kairon (corresponding author), Delhi Technological University, Bawana, Delhi, India
S. Bhattacharyya, CHRIST (Deemed to be University), Bangalore, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. Bhattacharyya et al. (eds.), Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1279, https://doi.org/10.1007/978-981-15-9290-4_12

1 Introduction

The novel coronavirus disease (COVID-19) was first detected in Wuhan, China, on December 31, 2019. COVID-19 is regarded as the most pernicious respiratory infection to affect the worldwide population since the 1918 H1N1 influenza [1]. Spreading rapidly, it has taken the lives of half a million people, with around 13 million confirmed cases, bringing the lives of millions to a standstill. According to the World Health Organization report, 170 countries had reported at least one case as of July 2020. While the overall mortality rate of the disease is 4.5%, it rises to 8.0% for the age group 70–79 and to 14.8% for those above 80. People already suffering from heart disease and diabetes are at especially high risk, and being older than 50 only exacerbates the situation [2]. Middle East Respiratory Syndrome (MERS), Severe Acute Respiratory Syndrome (SARS), and COVID-19 all belong to the family of severe acute respiratory diseases caused by coronaviruses. Symptoms appear within 2–14 days [3] and range from fever, cough, and shortness of breath to pneumonia. Flattening the epidemic curve is a key part of managing this pandemic, since there is no commercially available vaccine against COVID-19. There is some hope, however, as researchers in Russia and at Oxford University have completed human trials of vaccines, which show positive results.

Machine learning has proven its worth in healthcare through effective predictions. Having augmented the pattern recognition and expertise of radiologists, machine learning is now helping them to predict disease and diagnose it earlier. Also, with only meager data available on COVID-19, the role of data scientists has grown drastically, since they have to integrate the data and then release results that help in taking sound decisions [4].

Quantum computing [5] is an amalgamation of quantum physics and computer science. Originally proposed in the 1970s, the research remained mostly theoretical until the early 2000s, with the advent of Shor's and Grover's algorithms [6]. We are presently in the NISQ (noisy intermediate-scale quantum) era of quantum computing, which implies that the presently available quantum processors are small and noisy. Since many countries are investing heavily in quantum technologies, growth has accelerated in the past 2 years. Quantum computers surpass classical computers by harnessing quantum effects for computational advantage, providing polynomial and even exponential speedups for specific problems. With Moore's law coming to an end, quantum computers become even more prominent, since they are built not from ever-smaller transistors but from qubits. On the other hand, we are living in an age of data, and machine learning provides robust models for prediction, classification, and organization tasks.

2 Motivation

Presently available artificial neural network models provide robust solutions for a plethora of problems ranging from stock prediction and statistical modeling to image recognition and data mining [7]. However, quantum neural networks are still somewhat unexplored in terms of their applicability to diverse and practical problems. In terms of convergence rate and fitting ability, quantum backpropagation neural networks have outperformed classical ANNs, since they can harness the mathematical advantage of complex numbers [8] using a quantum neuron model built on the quantum phase gate and the two-qubit controlled-NOT gate. Continuous variable quantum neural networks [9] are similar to variational circuits in that they are quantum algorithms that depend on free parameters. They are gaining attention as an alternative way to characterize NISQ-era devices. Given the predictive abilities of quantum neural systems, these networks are quick to drive down the cost during training. These properties have motivated us to apply these network models to regression tasks, such as predicting the rise in COVID-19 cases, and to compare them against paradigmatic classical ANNs.

3 Proposed Methodology

The experimental data has been collected from [15] for India and the USA. We consider a live time-series dataset from January 30 to July 1, 2020. It provides the number of confirmed cases, the number of deaths, and the number of people who have recovered from the infection. Two additional features can be derived. The first is the number of active cases, given by

\[ \text{Active cases} = \text{No. of confirmed cases} - (\text{No. of deaths} + \text{No. of recovered patients}) \]

The second is the number of new cases added each day, which is the difference between the active cases on the \(n\)th day and on the \((n-1)\)th day. The data for all three columns is plotted in Fig. 1. The pervasive nature of the infection is evident from the exponential curves in Fig. 1.

In this work, Quantum Neural Networks (QNNs) have been employed to predict the rise in cases. QNNs are seldom applied to regression tasks [10]. We consider two kinds of quantum neural networks. The first is a Quantum Backpropagation Multilayer Perceptron (QBMLP), which utilizes the superposition feature and complex numbers [8, 11]. The second is a Continuous Variable Quantum Neural Network (CVQNN), provided by Xanadu in their PennyLane and Strawberry Fields packages [9]. CVQNNs are variational circuits that use Gaussian and non-Gaussian gates to perform affine and non-linear transformations. Both the QBMLP and the CVQNN can be generalized to any number of hidden layers and are characterized by the Tanh and Kerr activation functions, respectively. The accuracy of both models in predicting the rise in the number of cases and deaths due to COVID-19 in India and the USA is tested. Moreover, a comparative study of the two networks is made using a two-tailed statistical t-test. CVQNN outperforms QBMLP in some cases, whereas the opposite holds in others, as the two-tailed t-test shows.
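The two derived features can be computed directly from the three cumulative columns. The following sketch uses toy numbers chosen only for illustration (they are not taken from the dataset in [15]), and the function names are hypothetical.

```python
def active_cases(confirmed, deaths, recovered):
    """Active = confirmed - (deaths + recovered), computed day by day."""
    return [c - (d + r) for c, d, r in zip(confirmed, deaths, recovered)]


def daily_new_cases(active):
    """Difference between the active cases on day n and on day n - 1."""
    return [today - yesterday for yesterday, today in zip(active, active[1:])]


# Toy cumulative counts for five consecutive days (illustrative values, not real data).
confirmed = [100, 130, 170, 220, 280]
deaths = [2, 3, 5, 6, 8]
recovered = [10, 20, 35, 60, 90]

active = active_cases(confirmed, deaths, recovered)
new = daily_new_cases(active)

if __name__ == "__main__":
    print(active)
    print(new)
```

Note that the daily difference series is one element shorter than the input, since the first day has no predecessor.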

4 Models

Deep feedforward neural networks have shown tremendous results on problems like image processing, stock predictions, etc. They owe their success to affine transformations done through hidden layers, which are the building blocks of a multilayer perceptron, and activation functions that introduce non-linearity to map input x to output y. A typical architecture can be represented by

\[ L : x \rightarrow y = \varphi(W x + b) \tag{1} \]

Fig. 1 Total number of cases for all the countries from January 30 to July 1, 2020: (a) India COVID-19 cases, (b) USA COVID-19 cases. Each panel plots the number of cases (confirmed, death, recovered) against days.

In Eq. (1), for l input variables and m output variables, W is an m × l matrix and b is a constant vector of length m, where W and b are real-valued. The activation φ is usually chosen from a pool of functions such as sigmoid, tanh, or ReLU(x) = max(0, x). The output of each layer forms the input of the next layer, and the weights are optimized for the defined task on the basis of hyperparameters adjusted by the user, which include the number of layer-wise input variables (features) and the number of layers (depth d).
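A single layer of the form in Eq. (1) can be sketched as follows. The sketch assumes that W is stored with one row per output variable, so that the product W x is defined for an input of length l; the function names and the example weights are hypothetical.

```python
import math


def relu(z):
    """ReLU(z) = max(0, z)."""
    return max(0.0, z)


def dense_layer(x, W, b, phi=math.tanh):
    """Return y = phi(W x + b) for one layer.

    x : list of l inputs
    W : list of m rows, each of length l (one row per output variable)
    b : list of m biases
    """
    if len(W) != len(b) or any(len(row) != len(x) for row in W):
        raise ValueError("shape mismatch between W, b and x")
    return [phi(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]


# Two inputs mapped to three outputs with hypothetical weights.
x = [1.0, 2.0]
W = [[0.5, 0.5], [1.0, 0.0], [1.0, -1.0]]
b = [0.0, 0.0, 0.5]
y = dense_layer(x, W, b, phi=relu)
```

Stacking such layers, with the output of one call passed as the input of the next, gives a network of depth d.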

4.1 Fuzzified Quantum Backpropagation Multilayer Perceptron

Consider a dataset of N features that can be represented as \(x = (x_1, x_2, \ldots, x_N)\). The input data can be squashed into the range [0, 1] using a fuzzy linear membership function [11], with 1 signifying the highest value and 0 lying at the opposite end of the spectrum. Membership values are allotted to all the data points in between using a linear scaling rule [8]. Given the inputs x, we define a function f(x) as

\[ |x\rangle = \cos(x)\,|0\rangle + \sin(x)\,|1\rangle \equiv \cos(x) + i \sin(x) \tag{2} \]
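The squashing step and the encoding of Eq. (2) can be illustrated as below. The sketch assumes that the linear membership function is a min-max scaling in which the smallest value maps to 0 and the largest to 1; the function names are hypothetical, and the details of the membership function used in [11] may differ.

```python
import cmath


def linear_membership(values):
    """Linearly scale values into [0, 1]; max maps to 1, min maps to 0 (assumed)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal, scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]


def encode(x):
    """Eq. (2): represent the qubit as the complex number cos(x) + i sin(x)."""
    return cmath.exp(1j * x)


memberships = linear_membership([2, 4, 6])
states = [encode(m) for m in memberships]
```

Each encoded value has unit modulus, so the squared real and imaginary parts can be read as the probabilities of the two basis states.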

The weights are modeled as rotation gates \(R(\theta_i)\) applied to the input qubits \(|x_i\rangle\), with input bias λ. Hence, the modified neural network equation takes the form

\[ u = \sum_{i=1}^{n} R(\theta_i)\,|x_i\rangle - |\lambda\rangle \tag{3} \]


Now, instead of applying just a non-linear function as in the classical case, the transformed output is given by

\[ y = \frac{\pi}{2}\, g(\delta) - \arctan\!\left( \frac{\mathrm{amplitude}(|1\rangle)}{\mathrm{amplitude}(|0\rangle)} \right) \tag{4} \]

where \( g(\delta) = \frac{1 - e^{-2\delta}}{1 + e^{-2\delta}} \) is the tanh function.

Similar to the classical multilayer perceptron, the QBMLP is designed with a first layer accepting inputs in the form of qubits, followed by any number of hidden layers, and ending at an output layer with a single neuron (for regression tasks). The qubits for the QBMLP are prepared by transforming the squashed input x into the range \([0, \frac{\pi}{2}]\) so that the values become suitable angles for rotation gates, after which they can be written as

\[ \left|\tfrac{\pi}{2} x_i\right\rangle = \cos\!\left(\tfrac{\pi}{2} x_i\right) |0\rangle + \sin\!\left(\tfrac{\pi}{2} x_i\right) |1\rangle \tag{5} \]

The output of the model is a quantum state expressed as a complex number in Euler's form. To extract the physical meaning of this output, either a quantum measurement can be performed and the probability of obtaining \(|0\rangle\) or \(|1\rangle\) taken as the output or, equivalently, the final output can be obtained as the square of either the real or the imaginary part of the output. In this paper, we take the imaginary part of the output, which is the same as taking the probability of obtaining \(|1\rangle\) as the final result. The backpropagation analysis needs to be changed accordingly if one decides otherwise.

\[ O = \mathrm{Probability}(|1\rangle) = |\mathrm{Im}(y)|^2 = \sin^2(y) \tag{6} \]

Interested readers may refer to [11] for details on the architecture and operation of the network.
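To make Eqs. (3) to (6) concrete, the forward pass of a single quantum neuron can be sketched as follows. This is a minimal sketch under several assumptions that are not stated in the text: each qubit is represented by the complex number of Eq. (2), a rotation gate \(R(\theta)\) acts as multiplication by \(e^{i\theta}\), the bias \(|\lambda\rangle\) is the complex number \(e^{i\lambda}\), δ is treated as a free parameter, and the arctan of the amplitude ratio is evaluated with atan2 so that the quadrant is preserved. The function names are hypothetical; the actual network is described in [11].

```python
import cmath
import math


def qubit(x):
    """Eq. (5): map a squashed input x in [0, 1] to the angle (pi/2) x."""
    return cmath.exp(1j * (math.pi / 2) * x)


def neuron(xs, thetas, lam, delta):
    """Eqs. (3) and (4) for one neuron (assumptions listed in the lead-in)."""
    # Eq. (3): rotated inputs summed, minus the bias state.
    u = sum(cmath.exp(1j * t) * qubit(x) for t, x in zip(thetas, xs))
    u -= cmath.exp(1j * lam)
    # Eq. (4): imaginary part plays the role of amplitude(|1>),
    # real part the role of amplitude(|0>).
    return (math.pi / 2) * math.tanh(delta) - math.atan2(u.imag, u.real)


def output_probability(y):
    """Eq. (6): O = sin^2(y)."""
    return math.sin(y) ** 2


# One input at x = 0, no rotation, bias angle pi/2 and delta = 0.
y = neuron([0.0], [0.0], math.pi / 2, 0.0)
O = output_probability(y)
```

In this toy configuration u = 1 − i, so the phase term equals −π/4 and the output probability is one half. Training would adjust the angles θ, λ and the parameter δ by backpropagation, which is not shown here.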

4.2 Continuous Variable Quantum Neural Network

PennyLane is a quantum machine learning Python library provided by Xanadu [12] that works on the principle of quantum differentiable programming. It combines quantum simulators with classical machine learning, providing an efficient platform for training various circuits [13]. Strawberry Fields is a photonic quantum computing library that is used to solve a plethora of problems such as boson sampling and graph optimization. It has been used here to construct a photonic neural network model with continuous variable gates.

Variational circuits [14] behave similarly to neural networks: there is a definite input (an input quantum state) that is embedded into the circuit using a suitable embedding, weights (circuit parameters) that need to be learnt by the model, and a quantum measurement of an observable, say A, which generates a classical output on which training rules are applied for parametric learning. A three qumode CVQNN architecture is shown in Fig. 2a. Variational circuits are quite useful algorithms for NISQ devices in problems such as quantum chemistry and optimization.

This continuous variable quantum neural network actually contains classical neural networks within itself. The use of continuous variable Gaussian and non-Gaussian operations, such as displacement operators, squeezing gates, and Kerr gates, lends novelty to this approach, given that it does not contain quantum features such as entanglement and superposition, in contrast with other variational quantum circuits.

If the input dataset has N features (columns), it can be encoded into the circuit using displacement operators. Let the classical input be \(x = (x_1, x_2, \ldots, x_N)\) and let the initial quantum state of the circuit be \(|x\rangle = \bigotimes_{i=1}^{N} |x_i = 0\rangle\). Then the displacement operators D(x) can be applied to all qumodes to encode the classical input x such that

\[ |x\rangle = \bigotimes_{i=1}^{N} D(x_i)\,|x_i = 0\rangle \tag{7} \]

Table 1 Various CV quantum gates used to simulate affine transformation

Classical                  CV analog                                      Transformation
Weight matrix: W           Interferometer: \(\hat{U}\)                    \(\hat{U}|x\rangle = \bigotimes_{i=1}^{N} \left|\sum_{j=1}^{N} C_{ij} x_j\right\rangle = |Cx\rangle\)
                           Squeezing gate: \(\hat{S}\)                    \(\hat{S}(r)|x\rangle = e^{-\frac{1}{2}\sum_i r_i}\,|\Sigma x\rangle\)
Bias: b                    Displacement operator: \(\hat{D}\)             \(\hat{D}(\alpha)|x\rangle = |x + \alpha\rangle\)
Activation function: φ     Non-Gaussian transformation: \(\hat{\Phi}\)    \(\hat{\Phi}|x\rangle = |\varphi(x)\rangle\)

Just as in classical feedforward neural networks, each layer takes its inputs from the output of the preceding layer, and the results are extracted by applying homodyne measurements on each qumode. The same type of transformation is realized here using continuous variable quantum analogs. Information about the various CV analogs applied in the quantum neural network can be found in Table 1. Combining all of the above, we obtain the full affine transformation \( D \circ U \circ S \circ U\,|x\rangle = |W x + d\rangle \), which simulates a classical neural network layer. Broken down, such a layer is the transformation given by Eq. 8. The post-processing of the output is similar to what is done in all neural network models, i.e., the parameters are updated on the basis of training rules.

\[ L(z) = \varphi(M z + \alpha) \tag{8} \]
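The way the gate sequence D ∘ U ∘ S ∘ U reproduces an affine map can be checked numerically on classical vectors. The sketch below tracks only the position values x, not a quantum state, and it rests on assumptions that go beyond the text: two qumodes, interferometers without phases so that they act on x as real rotation matrices, and squeezing that rescales each component by \(e^{-r_i}\), following the construction in [9]. All function names and parameter values are hypothetical.

```python
import math


def rotation(theta):
    """Real 2x2 orthogonal matrix standing in for a phaseless interferometer."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matvec(M, v):
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def cv_affine(x, theta1, r, theta2, d):
    """Apply U(theta1), then S(r), then U(theta2), then D(d) to the vector x."""
    v = matvec(rotation(theta1), x)                    # first interferometer
    v = [math.exp(-ri) * vi for ri, vi in zip(r, v)]   # squeezing rescales
    v = matvec(rotation(theta2), v)                    # second interferometer
    return [vi + di for vi, di in zip(v, d)]           # displacement adds bias


def weight_matrix(theta1, r, theta2):
    """W = O2 * Sigma * O1, the matrix realised by the three Gaussian gates."""
    sigma = [[math.exp(-r[0]), 0.0], [0.0, math.exp(-r[1])]]
    return matmul(rotation(theta2), matmul(sigma, rotation(theta1)))


x = [0.3, 0.7]
params = dict(theta1=0.4, r=[0.2, -0.1], theta2=1.1)
d = [0.05, -0.02]

gate_by_gate = cv_affine(x, d=d, **params)
W = weight_matrix(**params)
as_matrix = [wx + di for wx, di in zip(matvec(W, x), d)]
```

Both routes give the same vector, which is the sense in which two orthogonal matrices and a diagonal scaling, a singular value decomposition of W, suffice to represent the weight matrix of a classical layer.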

Fig. 2 Three qumode CVQNN architectures: (a) standard CVQNN architecture, (b) modified CVQNN architecture. The circuits are drawn with squeezing gates \(S(r_i)\), displacement gates \(D(\alpha_i)\), non-Gaussian gates \(\Phi(\lambda_i)\), interferometers \(U_k(\phi_k, \theta_k)\), and beamsplitters \(BS(\phi_k, \theta_k)\).

5 Experimental Results

Both the QBMLP and the CVQNN have been applied to study the rise of coronavirus cases in India and the USA. The problem is the same as the multi-dimensional regression analysis commonly done using classical neural nets. The networks have been trained to map non-linear functions on the basis of the given data points, and their performance has been compared on the basis of a two tailed t-test.

A 3-3-1 architecture has been designed for the QBMLP (i.e., 3 input neurons, 3 hidden neurons, and 1 output neuron). For the CVQNN, a 3 qumode quantum circuit has been constructed with only 1 hidden layer, as shown in Fig. 2a. The output of the first qumode is finally measured using homodyne measurement in position eigenstates. The four parameters used from the data are the confirmed cases, the number of deaths, the number of recoveries, and the number of days passed. Three of these features have been used to predict the confirmed cases and the number of deaths, since this information is quite important with regard to public safety and social distancing standards.

For the quantum inspired model, the activation function used is tanh, as opposed to the more commonly used sigmoid function. The learning rate has been set to 0.001 for 15000 iterations (\(I_{QBMLP}\)). Since a nested for loop has been used in the QBMLP algorithm, its time complexity is \(O(\mathrm{Len}(y) \cdot I_{QBMLP})\), where Len(y) is the length of the output column. For the CVQNN, the learning rate has been chosen as 0.1 (using stochastic gradient descent) with 200 iterations (\(I_{CVQNN}\)). The Strawberry Fields simulator used to perform the CV computation requires time/memory of \(O(C^{W} \cdot I_{CVQNN})\), where C is the cutoff dimension (a hyperparameter equal to 10 in our case) and W is the number of wires (or qumodes) involved in the computation, which is three in this case because there are three features.

The caveat is that, if the cutoff dimension is chosen to be low in order to speed up the computation, operations such as displacement, squeezing, and Kerr gates push the quantum states out of the defined space, giving low fidelity. Hence, the standard layer architecture has been modified to a novel version that speeds up the computation, as shown in Fig. 2b. From the complexity expressions above, the CVQNN has better time complexity than our QBMLP model, but owing to the sluggish nature of the Strawberry Fields simulator the former takes longer to execute in real time than the latter. The prediction results and the cost function decay for the QBMLP and the CVQNN on the test dataset are shown in Figs. 3 and 4, respectively, and the t-test results are listed in Table 2.

Fig. 3 Results and cost function decays of QBMLP for number of deaths and confirmed cases for COVID-19. Panels (a) to (d) compare actual and predicted values for India death cases, USA death cases, India confirmed cases, and USA confirmed cases. Panels (e) to (h) show the corresponding cost function decay (cost against iterations, up to 15000 iterations).

Fig. 4 Results and cost function decays of CVQNN for number of deaths and confirmed cases for COVID-19. Panels (a) to (d) compare actual and predicted values for India death cases, USA death cases, India confirmed cases, and USA confirmed cases. Panels (e) to (h) show the corresponding cost function decay (cost against iterations, up to 200 iterations).

Table 2 Two tailed t-test for QBMLP and CVQNN

Dataset                        QBMLP                    CVQNN
                               Statistic    p value     Statistic    p value
India: death prediction        0.0572       0.9545      0.01998      0.98411
India: confirmed prediction    −0.0189      0.9849      0.06372      0.94939
USA: death prediction          −0.02261     0.9820      −0.01964     0.98438
USA: confirmed prediction      0.0612       0.95132     0.08899      0.92936
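For readers who wish to reproduce statistics of the kind listed in Table 2, a two tailed t-test between an actual and a predicted series can be sketched with the Python standard library alone. The chapter does not state which variant of the test was used, so the pooled-variance (Student) two-sample form below is an assumption, as are the function names and the toy data. The p value is obtained by numerically integrating the t density with Simpson's rule.

```python
import math
from statistics import mean, variance


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def two_tailed_p(t, df, steps=2000):
    """p = 1 - 2 * integral of the t density from 0 to |t| (composite Simpson)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(s):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(s * s / df))

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    return 1.0 - 2.0 * total * h / 3.0


# Hypothetical normalised actual and predicted values.
actual = [0.10, 0.25, 0.40, 0.62, 0.81]
predicted = [0.12, 0.24, 0.43, 0.60, 0.80]
t, df = t_statistic(actual, predicted)
p = two_tailed_p(t, df)
```

A statistic close to zero together with a p value close to one indicates that the means of the two series cannot be distinguished, which is how the entries of Table 2 are read.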

6 Discussions and Conclusion

In this study, the ability of quantum neural networks to perform regression tasks has been investigated, specifically in predicting the rise in COVID-19 cases. The operational features used are the number of days passed and the numbers of confirmed, recovered, and death cases. The performance of a quantum backpropagation multilayer perceptron and a continuous variable quantum neural network has been compared for this task on the basis of a two tailed statistical t-test, and it is found to be almost similar. While the QBMLP model performs better for the rise of confirmed cases in both countries, the CVQNN model outperforms it for the rise in the number of deaths. Although the difference in p values between the two models is not large, it provides an interesting insight into the nature of both models.

References

1. Darwish A, Rahhal Y, Jafar A (2020) A comparative study on predicting influenza outbreaks using different feature spaces: application of influenza-like illness data from early warning alert and response system in Syria. BMC Res Notes 13
2. Zheng Y-Y et al (2020) COVID-19 and the cardiovascular system. Nat Rev Cardiol 17:259–260
3. Bai Y et al (2020) Presumed asymptomatic carrier transmission of COVID-19. JAMA 323:1406
4. Koike F, Morimoto N (2018) Supervised forecasting of the range expansion of novel nonindigenous organisms: alien pest organisms and the 2009 H1N1 flu pandemic. Glob Ecol Biogeogr 27:991–1000
5. Steane A (1998) Quantum computing. Rep Prog Phys 61:117–173
6. Monz T et al (2016) Realization of a scalable Shor algorithm. Science 351(6277):1068–1070
7. Huarng K, Yu TH-K (2006) The application of neural networks to forecast fuzzy time series. Physica A: Stat Mech Appl 363:481–491
8. Mitrpanont JL, Srisuphab A (2002) The realization of quantum complex-valued backpropagation neural network in pattern recognition problem. In: 9th International conference on neural information processing, ICONIP'02, vol 1, pp 462–466. IEEE
9. Killoran N et al (2019) Continuous-variable quantum neural networks. Phys Rev Res 1
10. Diep DN, Nagata K, Nakamura T (2020) Nonparametric regression quantum neural networks. arXiv preprint arXiv:2002.02818
11. Bhattacharyya S, Bhattacharjee S, Mondal NK (2015) A quantum backpropagation multilayer perceptron (QBMLP) for predicting iron adsorption capacity of calcareous soil from aqueous solution. Appl Soft Comput 27:299–312
12. Bergholm V et al (2020) PennyLane: automatic differentiation of hybrid quantum-classical computations. arXiv preprint arXiv:1811.04968
13. Killoran N et al (2019) Strawberry Fields: a software platform for photonic quantum computing. Quantum 3:129
14. Liu Y et al (2020) Variational quantum circuits for quantum state tomography. Phys Rev A 101
15. The humanitarian data exchange. https://data.humdata.org/dataset/novel-coronavirus-2019ncov-cases