Computational Methods and Data Engineering: Proceedings of ICMDE 2020, Volume 2 [1st ed.] 9789811579066, 9789811579073

This book gathers selected high-quality research papers from the International Conference on Computational Methods and Data Engineering (ICMDE 2020).


Table of Contents:
Front Matter ....Pages i-xii
Software Quality Optimization of Coupling and Cohesion Metric for CBSD Model (M. Iyyappan, Arvind Kumar)....Pages 1-19
Detecting WSN Attacks Through HMAC and SCH Formation (Neetu Mehta, Arvind Kumar)....Pages 21-37
Literature Review of Various Nature-Inspired Optimization Algorithms Used for Digital Watermarking (Preeti Garg, R. Rama Kishore)....Pages 39-52
Geospatial Knowledge Management-Fresh Fuel for Banking and Economic Growth? (Anupam Mehrotra)....Pages 53-65
Neural Network and Pixel Position Shuffling-Based Digital Image Watermarking (Sunesh Malik, Rama Kishore Reddlapalli)....Pages 67-78
Hybrid Optimized Image Steganography with Cryptography (Vineet Nandal, Parvinder Singh)....Pages 79-84
TxtLineSeg: Text Line Segmentation of Unconstrained Printed Text in Devanagari Script (Rupinder Pal Kaur, M. K. Jindal, Munish Kumar)....Pages 85-100
Intelligent Strategies for Cloud Computing Risk Management and Testing (Vinita Malik, Sukhdip Singh)....Pages 101-114
Effective Survey on Handwriting Character Recognition (G. S. Monisha, S. Malathi)....Pages 115-131
Enhancement in Braille Systems—A Survey (K. M. R. Navadeepika, V. D. Ambeth Kumar)....Pages 133-139
A Critical Review on Use of Data Mining Technique for Prediction of Road Accidents (Navdeep Mor, Hemant Sood, Tripta Goyal)....Pages 141-149
Rio Olympics 2016 on Twitter: A Descriptive Analysis (Saurabh Sharma, Vishal Gupta)....Pages 151-162
A Survey on Vehicle to Vehicle Communication (Tanay Wagh, Rohan Bagrecha, Shubham Salunke, Shambhavi Shedge, Vina Lomte)....Pages 163-175
OBD-II and Big Data: A Powerful Combination to Solve the Issues of Automobile Care ( Meenakshi, Rainu Nandal, Nitin Awasthi)....Pages 177-189
OBU (On-Board Unit) Wireless Devices in VANET(s) for Effective Communication—A Review (N. Ganeshkumar, Sanjay Kumar)....Pages 191-202
Chinese Postman Problem: A Petri Net Based Approach (Sunita Kumawat)....Pages 203-222
Weather Dataset Analysis Using Apache Pig (Anmoldeep Kaur, Arpan Randhawa)....Pages 223-230
Analysis of Learner’s Behavior Using Latent Dirichlet Allocation in Online Learning Environment (N. A. Deepak, N. S. Shobha)....Pages 231-242
An Overview of Recent Developments in Convolutional Neural Network (CNN) Based Face Detector (Rahul Yadav, Priyanka)....Pages 243-258
A Review of Artificial Intelligence Techniques for Requirement Engineering (Kamaljit Kaur, Prabhsimran Singh, Parminder Kaur)....Pages 259-278
Data-Driven Model for State of Health Estimation of Lithium-Ion Battery (Rupam Singh, V. S. Bharath Kurukuru, Mohammed Ali Khan)....Pages 279-293
Trusted Sharing of IOT Data Using an Efficient Re-encryption Scheme and Blockchain (Preeti Sharma, V. K. Srivastava)....Pages 295-306
Clustering of Quantitative Survey Data: A Subsystem of EDM Framework (Roopam Sadh, Rajeev Kumar)....Pages 307-319
Smell-O-Vision Device (P. Nandal)....Pages 321-329
A Dynamic Approach for Detecting the Fake News Using Random Forest Classifier and NLP (J. Antony Vijay, H. Anwar Basha, J. Arun Nehru)....Pages 331-341
Automated Essay Grading: An Empirical Analysis of Ensemble Learning Techniques (Shakshi Sharma, Anjali Goyal)....Pages 343-362
Survey of Scheduling and Meta Scheduling Heuristics in Cloud Environment (Savita Khurana, Rajesh Kumar Singh)....Pages 363-374
A Novel Idea for Designing a Speech Recognition System Using Computer Vision Object Detection Techniques (Sukrobjon Toshpulotov, Sarvar Saidov, Selvanayaki Kolandapalayam Shanmugam, J. Shyamala Devi, K. Ramkumar)....Pages 375-381
Empirical Classification Accuracy Assessment of Various Classifiers for Clinical Diagnosis Datasets (Sabita Khatri, Narander Kumar, Deepak Arora)....Pages 383-392
Comparison of Transform-Based and Transform-Free Analytical Models Having Finite Buffer Size in Non-saturated IEEE 802.11 DCF Networks ( Mukta, Neeraj Gupta)....Pages 393-409
An Overview of Learning Approaches in Reflection Removal (Rashmi Chaurasiya, Dinesh Ganotra)....Pages 411-425
Comparison of Bioinspired Algorithms Applied to the Timetabling Problem (Jose Silva, Noel Varela, Jesus Varas, Omar Lezama, José Maco, Martín Villón)....Pages 427-437
Algorithm for Detecting Polarity of Opinions in Laptop and Restaurant Domains (Jose Silva, Noel Varela, Danelys Cabrera, Omar Lezama, Jesus Varas, Patricia Manco)....Pages 439-446
Prediction of the Yield of Grains Through Artificial Intelligence (Jose Silva, Noel Varela, Danelys Cabrera, Omar Lezama)....Pages 447-453
A Secured Steganography Algorithm for Hiding an Image and Data in an Image Using LSB Technique (Vaibhav Singh Shekhawat, Manish Tiwari, Mayank Patel)....Pages 455-468
Data Rate Analysis of LTE System for 2 × 2 MIMO Fading Channel in Different Modulation Scheme (Dimple Jangir, Gori Shankar, Bharat Bhusan Jain)....Pages 469-479
Prediction of Defects in Software Using Machine Learning Classifiers (Ashima Arya, Sanjay Kumar, Vijendra Singh)....Pages 481-494
Energy-Efficient Schemes in Underwater Wireless Sensor Network: A Review ( Poonam, Vikas Siwach, Harkesh Sehrawat, Yudhvir Singh)....Pages 495-510
Information Hiding Techniques for Cryptography and Steganography ( Bhawna, Sanjay Kumar, Vijendra Singh)....Pages 511-527
Affect Recognition using Brain Signals: A Survey (Resham Arya, Ashok Kumar, Megha Bhushan)....Pages 529-552
“Memorize, Reproduce, and Forget” Inclination; Students’ Perspectives: A Study of Selected Universities in Ghana (John Kani Amoako, Yogesh Kumar Sharma, Paul Danquah)....Pages 553-564
Back Matter ....Pages 565-566


Advances in Intelligent Systems and Computing 1257

Vijendra Singh · Vijayan K. Asari · Sanjay Kumar · R. B. Patel, Editors

Computational Methods and Data Engineering Proceedings of ICMDE 2020, Volume 2

Advances in Intelligent Systems and Computing Volume 1257

Series Editor: Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors:
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Győr, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong

The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. Indexed by SCOPUS, DBLP, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST), SCImago.

More information about this series at http://www.springer.com/series/11156

Vijendra Singh · Vijayan K. Asari · Sanjay Kumar · R. B. Patel, Editors

Computational Methods and Data Engineering: Proceedings of ICMDE 2020, Volume 2

Editors
Vijendra Singh, School of Computer Science, University of Petroleum and Energy Studies, Dehradun, Uttarakhand, India
Vijayan K. Asari, Department of Electrical and Computer Engineering, University of Dayton, Dayton, OH, USA
Sanjay Kumar, Department of Computer Science and Engineering, SRM University Delhi-NCR, Sonepat, Haryana, India
R. B. Patel, Department of Computer Science and Engineering, Chandigarh College of Engineering and Technology (CCET), Chandigarh, India

ISSN 2194-5357    ISSN 2194-5365 (electronic)
Advances in Intelligent Systems and Computing
ISBN 978-981-15-7906-6    ISBN 978-981-15-7907-3 (eBook)
https://doi.org/10.1007/978-981-15-7907-3
© Springer Nature Singapore Pte Ltd. 2021
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

We are pleased to present this Springer book, Computational Methods and Data Engineering, which contains the Volume 2 papers of the proceedings of the International Conference on Computational Methods and Data Engineering (ICMDE 2020). The main aim of ICMDE 2020 was to provide a platform for researchers and academia in the area of computational methods and data engineering to exchange research ideas and results and to collaborate. The conference was held at SRM University, Sonepat, Haryana, Delhi-NCR, India, from January 30 to 31, 2020. All 41 chapters published in this book were peer-reviewed by three reviewers drawn from the scientific committee, external reviewers, and the editorial board, depending on the subject matter of the chapter. After this rigorous peer-review process, the submitted papers were selected on the basis of originality, significance, and clarity and published as chapters. We would like to express our gratitude to the management, faculty members, and other staff of SRM University, Sonepat, for their kind support during the organization of this event. We would like to thank all the authors, presenters, and delegates for their valuable contribution in making this an extraordinary event. We would like to acknowledge all the members of the honorary advisory chairs, the international/national advisory committee, the general chairs, program chairs, organizing committee, keynote speakers, technical committees, and reviewers for their work. Finally, we thank the series editors, Advances in Intelligent Systems and Computing, and Aninda Bose and Radhakrishnan for their support and help.

Vijendra Singh, Dehradun, India
Vijayan K. Asari, Dayton, USA
Sanjay Kumar, Sonepat, India
R. B. Patel, Chandigarh, India



About the Editors

Prof. Vijendra Singh is working as Professor in the School of Computer Science and Engineering at the University of Petroleum and Energy Studies, Dehradun, Uttarakhand, India. Prof. Singh received his Ph.D. degree in Engineering and M.Tech. degree in Computer Science and Engineering from Birla Institute of Technology, Mesra, India. He has 20 years of experience in research and teaching, including the IT industry. Prof. Singh's major research concentration has been in the areas of data mining, pattern recognition, image processing, big data, machine learning, and soft computing. He has published more than 65 scientific papers in this domain. He has served as Editor-in-Chief, Special Issue, Procedia Computer Science, Vol 167, 2020, Elsevier; Editor-in-Chief, Special Issue, Procedia Computer Science, Vol 132, 2018, Elsevier; Associate Editor, International Journal of Healthcare Information Systems and Informatics, IGI Global, USA; Guest Editor, Intelligent Data Mining and Machine Learning, International Journal of Healthcare Information Systems and Informatics, IGI Global, USA; Editor-in-Chief, International Journal of Social Computing and Cyber-Physical Systems, Inderscience, UK; Editorial Board Member, International Journal of Multivariate Data Analysis, Inderscience, UK; and Editorial Board Member, International Journal of Information and Decision Sciences, Inderscience, UK.

Prof. Vijayan K. Asari is a Professor in Electrical and Computer Engineering and Ohio Research Scholars Endowed Chair in Wide Area Surveillance at the University of Dayton, Dayton, Ohio. He is the Director of the University of Dayton Vision Lab (Center of Excellence for Computer Vision and Wide Area Surveillance Research). Prof. Asari had been a Professor in Electrical and Computer Engineering at Old Dominion University, Norfolk, Virginia, till January 2010. He was the Founding Director of the Computational Intelligence and Machine Vision Laboratory (ODU Vision Lab) at ODU. Prof. Asari received the bachelor's degree in Electronics and Communication Engineering from the University of Kerala (College of Engineering, Trivandrum), India, in 1978, and the M.Tech. and Ph.D. degrees in Electrical Engineering from the Indian Institute of Technology, Madras, in 1984 and 1994, respectively. Prof. Asari has received several teaching, research, advising, and technical leadership awards, including the Outstanding Teacher Award from the Department of Electrical and Computer Engineering in April 2002 and the Excellence in Teaching Award from the Frank Batten College of Engineering and Technology in April 2004. Prof. Asari has published more than 480 research papers, including 80 peer-reviewed journal papers co-authored with his graduate students and colleagues, in the areas of image processing, computer vision, pattern recognition, machine learning, and high-performance digital system architecture design. Prof. Asari has been a Senior Member of the IEEE since 2001 and is a Senior Member of the Society of Photo-Optical Instrumentation Engineers (SPIE). He is a Member of the IEEE Computational Intelligence Society (CIS), the IEEE CIS Intelligent Systems Applications Technical Committee, the IEEE Computer Society, the IEEE Circuits and Systems Society, the Association for Computing Machinery (ACM), and the American Society for Engineering Education (ASEE).

Prof. Sanjay Kumar is working as Professor in the Computer Science and Engineering Department, SRM University, India. He received his Ph.D. degree in Computer Science and Engineering from Deenbandhu Chhotu Ram University of Science and Technology (DCRUST), Murthal (Sonipat), in 2014. He obtained his B.Tech. and M.Tech. degrees in Computer Science and Engineering in 1999 and 2005, respectively. He has more than 16 years of academic and administrative experience. He has published more than 15 papers in international and national journals of repute and has presented more than 12 papers at international and national conferences. His current research areas are wireless sensor networks, machine learning, IoT, cloud computing, mobile computing, and cyber and network security. He has chaired sessions at many international conferences, including IEEE, Springer, and Taylor & Francis events. He is a Life Member of the Computer Society of India and the Indian Society for Technical Education.

Prof. R. B. Patel is working as Professor in the Department of Computer Science and Engineering, Chandigarh College of Engineering and Technology (CCET), Chandigarh, India. Prior to joining CCET, he worked as Professor at NIT, Uttarakhand, India, and as Dean, Faculty of Information Technology and Computer Science, Deenbandhu Chhotu Ram University of Science and Technology, Murthal, India. His research areas include mobile and distributed computing, machine and deep learning, and wireless sensor networks. Prof. Patel has published more than 150 papers in international journals and conference proceedings. He has supervised 16 Ph.D. scholars, and 2 more are currently in progress.

Software Quality Optimization of Coupling and Cohesion Metric for CBSD Model

M. Iyyappan and Arvind Kumar

Abstract Component-based software engineering extends traditional development with components such as commercial off-the-shelf (COTS) products and the selection of quality components. In CBSD, reusable software packages are adapted and re-assembled among the software modules. The major purposes of using reusable components are to decrease development time, reduce complexity, lower the cost of development, and increase the overall quality characteristics and quality attributes of software applications. The approach proposed in this paper follows an architecture for software quality consisting of a COTS repository, various quality factors, and metric measurement of the software, and it covers the selection, adaptation, verification, measurement, installation, and upgradation of component-based software development. Two activities are the main focus of this work: software quality and software metrics. Software quality covers the various aspects that measure the metric relationship between software characteristics and sub-characteristics among modules. The software metric performs package-level measurement of cohesion on a real data set, using the correlation coefficient proposed by Karl Pearson. A metric is also proposed to measure complexity in the software system with the parameters of component-inside, component-outside, and the average calculation over both. The comparative analysis of the quality factors applied to the coupling metric and package cohesion helps to reduce complexity and increase the reliability of the software system, protecting it against fault, failure, and error.

Keywords Software quality · Reusability · Functionality · Reliability · Maintainability · Software component model · ISO standard 9126 · Package class · Coupling · Cohesion and complexity metric

M. Iyyappan (B) · A. Kumar, Department of Computer Science and Engineering, SRM University Delhi-NCR, Sonepat, Haryana 131029, India
e-mail: [email protected]; A. Kumar e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021
V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_1


1 Introduction

Software engineering is mainly concerned with constructing software and assembling hardware with the help of reusable component devices as well as software reuse in development [1]. The component-based software development (CBSD) approach builds a well-defined system from independently developed source-code components based on software reuse [2]. This process reduces project development time and effort, decreases the cost of expenditure, and increases software productivity. In project development [3], a component is "a coherent package of software artifacts": individual components are developed and delivered as complete software components to be adapted into the client software system [4]. Some industry-developed software components are available in the market for in-house development, or as third-party components such as commercial software or open source [5]. Example technologies include Microsoft's Component Object Model (COM), DCOM and the .NET Framework, Sun's JavaBeans, Enterprise JavaBeans and the J2EE specification, and the Object Management Group's Common Object Request Broker Architecture (CORBA) [6]. A commercial off-the-shelf (COTS) product is ready-made software of a plug-and-play nature, which can be tested by the company but whose source code cannot be modified: "A COTS product as one that is (i) sold, leased, or licensed to the general public, (ii) offered by a vendor trying to profit from it, (iii) supported and evolved by the vendor, (iv) available in multiple identical copies, or (v) used without modification of the internals" [7]. After this introduction, Sect. 2 presents the literature review of various component-based quality models. Section 3 discusses the software quality standard for the component. Section 4 describes the architecture of software quality for the CBSD model. Section 5 presents the theoretical approach for cohesion and coupling measurement of packages. Section 6 concludes the paper.

2 Literature Review of Various Component-Based Quality Models

Lai et al. (2011) focus on effort estimation in component-based software development, such as cost and schedule prediction [3]. The paper concentrates on the implementation and testing of the modules and on identifying suitable components according to the project requirements and architectural design. Components are not adapted by the developer; instead, they are identified, assessed, and selected based on the budget and the completion schedule.

Chopra et al. (2014) note that a software application consists of various modules, phases, and components that contain different lines of source code [8]. Combining this entire process yields a better software application, for which higher-quality components are chosen and integrated into the module, increasing the efficiency of the software system. They consider components with similar properties: individual and exchangeable modules, a sound structural design, and proper working interfaces with the other modules. The paper also discusses various quality assurance attributes and their testing methodology.

Soni et al. (2014) observe that software engineering development is mainly concerned with reusing existing components, which reduces development time and increases performance. In the CBSD model, quality assurance provides various observations on system development [5]. The paper discusses component activities such as analysis, development, certification, customization, design, integration, and testing for quality assurance.

Bansal et al. (2013) point out that software quality measurement analyzes the functional modules and activities of the software but not the quality assessment of the system [9]. For that purpose, the ISO 9126 standard model is used to assess characteristics such as the analyzability, replaceability, stability, and testability of the component. This type of quality assessment is used for client satisfaction as per the requirements of the software system.

Patil et al. (2014) note that, as part of effort estimation, measuring the size of the software package and predicting the project source code are difficult because components follow the nature of black-box testing [10]. Estimation therefore works with ambiguous and irrelevant data and yields inaccurate results. To obtain reliable and accurate effort estimation, the authors propose a fuzzy logic model of size prediction for CBSD. The model follows four characteristics: fuzzification of the separate modules, quantifiable results from defuzzification, a rule-based system, and a fuzzy inference engine.

Kahtan et al. (2012) discuss how the interdependencies among software components are not handled adequately at the component implementation phase [11]. The coupling and cohesion interdependency modules are mainly used for requirements, component evaluation, component selection, design, architecture, and implementation along with testing. Various security features are included in this comparative study of dependability: reliability, integrity, safety, and maintainability.

Chen et al. (2011) focus on reusable modules when developing new components in a component-based software system, because reuse reduces complexity, decreases development effort, and requires less time and budget [12]. Reusable project components require software quality certification, which helps predict software attributes and characteristics. The paper discusses coupling, cohesion, interfaces, and complexity; these properties are followed to obtain better software quality: high cohesion and low coupling between the components, a higher-level component interface with the other modules, and reduced software complexity to avoid various issues.

Kumari et al. (2011) discuss complexity metrics for size and interfaces among software components [13]. This metric measurement is used to control and minimize complexity and to reduce the effort of software design, software testing, and system maintenance. The interaction complexity, measured as the total average over the interfaces of the component-based system, should not exceed that of the overall component-based software development; if it increases, the complexity and reliability of the system and its software attributes cannot be controlled. This has been proved in their empirical study.

Tomar et al. (2010) explain that component-based software development with reusability also ensures the functionality and quality of the system [14]. The more complicated process of software measurement in a component-based software system uses the methodology of verification and validation testing [15]. An X-model approach is used for independent software development and commercial off-the-shelf packages [16]. According to the client requirement, software projects move to the component repository to select the suitable component, assemble the software package, and integrate it into the system.

3 Software Quality Standard for Component

The field of software engineering is mainly concerned with the quality of the software [17]: it tries to improve the performance of the system and to increase the reliability of the system software. According to the IEEE Standard Glossary of Software Engineering Terminology [18], software quality is defined as "the degree to which a system, system component, or process meets specified requirements", or "the degree to which a system, system component, or process meets customer or user needs or expectations" [19]. Several methodologies are followed, such as software quality assurance, metric assessment of the system, and quality control of the software [20]. A software system holds various characteristics for analyzing the set of attributes and significant features of a product that meet customer or user expectations [21]. For measuring software quality, the IEEE standard is mainly followed, which expresses client satisfaction and future modification [22]. A high-quality product is one that has many quality factors associated with it. Product quality is addressed by ISO 15504-5, and the ISO/IEC JTC1 standards are used for process quality [23]. The quality model differentiates two steps for measuring software quality, using characteristics and sub-characteristics [24]. The evaluation of a software system with this hierarchical model, which shows the relationships among the various factors and factor elements, depends upon the following steps:

3.1 Software Characteristics of Quality

This module focuses on functionality: the functional activities of the entire software package, which are related to the high-level programming language. The system is maintained according to the client's response, and the functional module is updated as per the client's request.


3.2 Software Sub-characteristics of Quality

This module focuses on the accurate functionality of the software module, analyzing the correctness of each component module with the help of the testing methodology.

3.3 Software Quality Attributes

This module provides specific information about the software system's factors and sub-factors together with the corresponding testing results.

3.4 Software Metrics of Quality

Software metrics measure the performance of the software package, analyze its complexity, and increase the usability of the system. For measuring software quality in the testing phases, two different methodologies are used: white-box testing and black-box testing. The source code of the programming language can be measured in white-box testing but not in black-box testing. Black-box testing is used to measure COTS components in in-house development, and the characteristics are applied to the other software packages (Fig. 1).

Fig. 1 A standard measurement of component quality
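The characteristic/sub-characteristic hierarchy described in Sects. 3.1–3.4 can be made concrete with a small sketch. The structure below is only illustrative: the characteristic and sub-characteristic names follow the chapter, but the weights and sample scores are assumed placeholder values, not figures reported by the authors.

```python
# Illustrative sketch of a hierarchical quality model (ISO/IEC 9126 style).
# Characteristic names follow the chapter; weights and scores are hypothetical.

QUALITY_MODEL = {
    "functionality": {"suitability": 0.3, "accuracy": 0.3,
                      "interoperability": 0.2, "security": 0.2},
    "reliability":   {"maturity": 0.4, "fault_tolerance": 0.3, "recoverability": 0.3},
    "usability":     {"understandability": 0.5, "attractiveness": 0.5},
}

def characteristic_score(characteristic: str, measurements: dict) -> float:
    """Weighted average of measured sub-characteristic scores in [0, 1]."""
    weights = QUALITY_MODEL[characteristic]
    return sum(weights[name] * measurements.get(name, 0.0) for name in weights)

if __name__ == "__main__":
    # Hypothetical measurements for a single COTS candidate component.
    measured = {"suitability": 0.8, "accuracy": 0.9,
                "interoperability": 0.7, "security": 0.6}
    print(round(characteristic_score("functionality", measured), 3))  # 0.77
```

The same pattern extends to the remaining characteristics; a component is then compared against others by its per-characteristic scores rather than by a single number.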


4 The Architecture of Software Quality for CBSD Model

The proposed architecture model describes the various phases of the software development life cycle. In business application development, the client requirements and their satisfaction are major challenges for the software developer. This module covers client requirement analysis, selection of a suitable component from the repository for the COTS package, and checking the adaptation of the software component with the help of various quality factors. After the prediction of the various quality attributes, the process moves to the metric measurement of the quality model. The final step of the proposed methodology is integration into the client software system, together with monitoring processes such as maintainability and replaceability.

4.1 Client Requirement Phase

The client provides the prerequisites for the new software system. The analysis phase and the specification of the software are required to build the new component system and to satisfy the client's software needs.

4.2 Component Analysis and Specification on Repository

This module describes the component repository, which can hold many components matching the client requirements for the business application. The repository contains the component specifications and design modules that help analyze the client software system.

4.3 Commercial Off-the-Shelf

The COTS package is drawn from a warehouse of artifacts, which plays an important role in the development of the new model because it holds a library repository of all existing components as well as new ones. The artifacts are first checked to see whether suitable source code is available in the repository; if a suitable component is available, it is selected from the repository. According to the client requirement, Design and Construction of Component with Reuse is applied as a two-level processing step (Fig. 2): either the suitable component is selected from the repository without modification (domain engineering), or a suitable component is selected and its code modules are modified. If no component with reuse in the warehouse is compatible with the client requirement, the repository develops a new component and adds it to the member pool.

Fig. 2 Software quality architecture model
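The selection logic of Sect. 4.3 (reuse a repository component as-is, reuse it with modification, or develop a new component and add it to the pool) can be sketched as follows. The repository layout, field names, and matching rule are assumptions made only for illustration; the chapter does not prescribe a concrete data format.

```python
# Schematic sketch of the repository-driven selection described above.
# Interface names and feature sets are illustrative assumptions.

def select_component(requirement: dict, repository: list) -> dict:
    """Return the chosen component and the action taken for it."""
    for component in repository:
        if component["interface"] == requirement["interface"]:
            if component["features"] >= requirement["features"]:
                # Exact or richer match: reuse without modification.
                return {"action": "reuse_without_modification", "component": component}
            # Partial match: reuse with modification of the code modules.
            return {"action": "reuse_with_modification", "component": component}
    # No suitable artifact in the warehouse: build a new one and add it to the pool.
    new_component = {"interface": requirement["interface"],
                     "features": set(requirement["features"])}
    repository.append(new_component)
    return {"action": "develop_new_component", "component": new_component}

if __name__ == "__main__":
    repo = [{"interface": "payment-v1", "features": {"card", "refund"}}]
    need = {"interface": "payment-v1", "features": {"card"}}
    print(select_component(need, repo)["action"])  # reuse_without_modification
```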

4.4 Software Component Adaptation

The developer verifies and validates the suitable component selected from the repository of warehouse artifacts. This COTS component package fits the software system well because the reuse process reduces cost and time, increases efficiency, reduces complexity, and increases reliability.


4.5 Software Quality and Metric Factor

To maintain a high level of quality in the software, the ISO/IEC standard quality model is followed. The standard covers characteristics such as system functionality, software reliability, usability of software, time efficiency, system maintainability, and portability compliance; in addition to these, security and system compatibility are measured.

• System Functionality—System functionality holds the group of attributes reflected in the existing functional model and its properties of the system. This functionality model provides various services and operations as per the client requirement, with faster delivery of software products at a lower budget. The quality model defines it as a set of attributes that bear on the existence of a set of functions and their specified properties (ISO 1991); that is, the software should provide the functions and services required when used under the specified conditions, reusing pre-existing software for low cost and faster delivery of the end product. The sub-characteristics under functionality are system suitability, software accuracy, interoperability, security of system, and functionality compliance.
• System Suitability—Suitability depends upon the developer requirement: whether the component fits the software package or not. Because measuring the software at development time is very difficult, suitability is assessed according to a fitness function.
• Software Accuracy—The developer measures the software packages to analyze the accuracy of the results, the system performance, and the suitability of the module. From this accuracy, the developer can decide whether or not to continue the package development.
• Interoperability Module—Attributes of a software package that bear on its ability to interact with specified software systems. This indicates whether the format of the data handled by the target software complies with an international standard. Using this type of interaction and standard increases the functionality of the system.
• Security of System—The protection of system items from accidental damage, malicious access by another user, modification of software, and destruction of programs. Avoiding such problems protects the functionality of the system database.
• Functionality Compliance—Functionality compliance checks that standard procedures are followed for the software product, such as functional guidelines and standard rules and regulations. It also improves the functional bonding among the software packages and is recognized by standard certifications.
• Software Reliability—Reliability mainly concerns the probability of failure or of creating a problem in the system or software over a specific period of time. For












9

avoiding such a problem we used to maintain the fault-tolerance on the specific period. Software Reliability main concern about increase performance reduced complexity and avoid failure among system as well as software on the specific time. Reliability is broken into the following sub-characteristics: Maturity, Fault tolerance, Recoverability and Reliability compliance. Software Maturity—In this maturity level of metric measurement between the software and system. To analyze the failure of the software and fault of the system based on reliability. The software maturity provides, it deals with the number of commercial versions of the software and the time interval between each version. Fault Tolerance of Reliability—It evaluates the robustness of the software or system. In this method, a constant level of performance is maintained between the system as well as software. To avoid such a failure and fault of the system result, increased reliability. Software Fault avoidance: To avoid or prevent the introduction of faults by engaging various design methodologies, techniques and technologies, including structured programming, object-oriented programming, software reuse, design patterns and formal methods. Software Fault removal: To detect and eliminate software faults by techniques such as reviews, inspection, testing, verification and validation. Software Fault tolerance: To provide a service complying with the specification despite tolerance and run the component of remaining software modules. Software Recoverability—In this module focused on the software recovery, when the software faced unexpected error or failure of the system. This module recover lost of data directly and re-establish the performance on that period. Software Reliability Compliance—Whether this module followed the proper, international standard for maintaining the quality and increasing the reliability. In this standard module focus on the fault, failure and error predictions among system as well as software. Usability of Software—In this module of usability software depends upon the system developer and client user, whether the software application is suitable for all the criteria or not. These criteria are mainly focused like a domain understanding, program code usage, system configuration and software execution. For applying this methodology, users are benefitted like a lower complexity, higher reusability and proper adaptability of the system. Software sub characteristics of Usability are defined as under. Client recognizable—It is a major task clients have to understand the system behaviour and system performance and its functionality task. Client recognizes that depend task can perform properly on the software module or not. Client Documentation for understanding—This module is focused on client requirements, which help to improve the better system of software development. The developer has to learn the software characteristics and make a system with easy understandable for the user. Project Appearance—In this client usable system focus on the attractiveness of website or project appearance. If the developer has very good knowledge of the graphical user interface design, will provide an attractive module of the final appearance.

10

M. Iyyappan and A. Kumar

• Usability Standard—In this topic focused on the International standard or International certification. This usability module according to the client requirements and developer implementation, both characteristics are properly recognized by the standard organization or not. • Software Efficiency—This module of software quality is the main concern about the efficiency of the software system. Software efficiency is reduced because of using trendy technology and latest programming in the real world. In this module, performance are improved based on programming optimization, checking their internal module and testing their inside component. This type of performance measurement won’t affect the specific design in the programming language, it is only testing the inside module. • Software Behaviour—This behaviour represents a time relationship between the software modules. It’s able to perform a particular task on a specific period of execution which we can apply certain limitations. In this measurement focus on request time, response time, processing time and throughput time, etc. • Infrastructure Behaviour—In the software programming development and usage of the system depend upon the resource behaviour. In this behaviour concepts completely utilized all those software resources, on the specific time and particular condition also. • Software Compliance—Efficiency are compared to the performance result of the software system. For this performance analysis of efficiency, compliance is properly approved by the standard organization or not • System Maintainability—System maintenance depends of the software programming code changes as well as software modification. In source code changes, these phases of modification focus about up-gradation of the software programming. But some of the maintenance phases of the system, which depends upon the component module because it holds only reusable COTS component. The reusable component is required to adapt the system, checking the test workflow and integration process on the system. • Software Customizability—In most of the software, system focus only about the readymade available reusable component of COTS software package. It can easily adapt the system, configuration among the software, test the component workflow and finalize the implementation module. For this customizability not require any source code programming languages. • Software Testing—In this maintenance, phases depend upon the testing component or testing source code. In testing, it can analyze performance, integrated module and functionality of the system. • Software Stability—Stability provides an analysis result of, if any changes are done in the software it will provide a similar output of the module or it will provide any unexpected changes. If that changes affect the system maintenance or not. In other words, it is the degree to which software is composed of discrete softwares such that a change to one software has zero impact on the other software or the system. • Software Analysability—Analysis is a very important phase for deciding on software maintenance. If the particular software has to be updated or not, needed little

Software Quality Optimization of Coupling and Cohesion …





• •

11

modification or not, the existing component is working or not. According to this statistics, results are looking about further modification of the software system. Software Portability—Software portability among the software system main concern about the implementation or integration of the system. In this level, the software has been changing from one environment to another environment with similar modifications at the time of software installation. These activities are reducing the cost of expenditure and schedule of time to complete the task. The specification of software should be platform-independent. Various sub-characteristics defined under portability are: Software Replaceability—Replace is the main concern about updating the software or modification of programming. It is followed by the previous version of the software in our system, it is mandatory to check about the new version of the software with modification or not. This means that the new software can substitute the previous ones without any major efforts. System Adaptability—It refers to whether the software can be adapted to different specified platforms. System Installability—It is the capability for software to be installed easily on different platforms.

5 The Theoretical Approach for Cohesion and Coupling Measurement of Packages These following modules, are used the Inheritance of hierarchy steps like a tree structure which is related to the packages, classes and methods similar to the object-oriented programming. In this Empty Packages are helpful to measure the null value of the software system which behaves like an idle performance of the system. The major concern about the Complexity of Low cohesion (CLC) and the Complexity of High cohesion (CHC) is used for measuring software package as on the base theoretical approach. Coupling and cohesion depend on the direct measurement and indirect measurement between the high cohesion and low coupling. Here, took a similar project of existing real data set value which is used to measure the CLC and CHC to show the comparative result of new metric real data (Table 1). Table 1 Sample data for software package metric

S. No.

Project name

1

Byte Code Engineering Library (BCEL)

2

Bean Scripting Framework (BSF)

3

Jakarta-ORO

4

Element Construction Set (ECS)

5

XGen Source Code Generator

6

Junit

12

M. Iyyappan and A. Kumar

Table 2 Descriptive statistics of the analyzed package cohesion component complexity measure for Low and High cohesion metric

Statistical parameter

CLC

CHC

Maximum value

7.75

18

Minimum value

0

0

Median

0

2.5

Mean

0.453

3.39

Standard deviation

1.53

3.572

Table 3 Comparison with the correlation coefficient values of low and high other metrics Parameters

CLC

CHC

PCoh

LCOM

LCOM1

ICH

SCC

Correlation coefficient

0.20

0.48

0.69

−0.32

−0.34

0.12

0.27

Significance value

0.05

0.05

0.05

0.05

0.05

0.05

0.05

Analysis of experimental results of Proposed Low complexity and high complexity. Table 2 shows the package name with the total number of classes, number of elements with their calculated filtered class and it is termed as R(D) & R(DUI). The number of relations is termed as CDI and finally with all these the Complexity of Low and High cohesion is calculated with the help of Package also compared with existing research work of package cohesion measurement. To analyze the better performance and result of the cohesion metric. From the above results of Table 3, it can be concluded that we can reject the null hypothesis and can trust the alternative hypothesis. Hence there is a strong relationship between the calculation of package cohesion component complexity metric and the component reusability.

5.1 Density Calculation on the Interface Level Measuring the Coupling and Cohesion In this, proposed algorithm is used the approach of a quantitative measure of cohesion and coupling. The measure of coupling and cohesion density used to analyse the relationship between the Interface density module (IDM1, IDM2, IDM3 and IDM4) of modular software system given as follows: MCCD =

CCIin CCIin + CCIout

(1)

where CCIin is the number of coupling and cohesion interaction input within modules, and CCIout is the number of coupling and cohesion interaction output between the distinct modules.
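A minimal sketch of Eq. (1) follows. Only the formula itself comes from the text above; the module names and interaction counts are hypothetical values chosen for illustration.

```python
def mccd(cci_in: int, cci_out: int) -> float:
    """Coupling-and-cohesion density of a module, Eq. (1): CCIin / (CCIin + CCIout)."""
    total = cci_in + cci_out
    return cci_in / total if total else 0.0

# Hypothetical interaction counts for four interface density modules (IDM1..IDM4).
# A value close to 1 means most interactions stay inside the module
# (high cohesion, low coupling); a value close to 0 means the opposite.
modules = {"IDM1": (8, 2), "IDM2": (5, 5), "IDM3": (3, 9), "IDM4": (10, 0)}
for name, (inside, outside) in modules.items():
    print(name, round(mccd(inside, outside), 2))
```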


5.2 Proposed Coupling Measurement for Classes of Direct and Indirect Interaction

Component-based software development mainly uses cohesion and coupling. In this method, two components are coupled if and only if at least one of them acts upon the other. For the coupling metric, a graph G is used because it contains nodes and edges. Here a directed graph represents five nodes A, B, C, D and E, and each node is connected with other nodes through edge interface interactions. Five parameters are mainly used for the node and edge connectivity of the graph: a starting point, an endpoint, a regular parameter, a neutral parameter and a crucial parameter for measuring the coupling metric.

• Complexity of coupling and cohesion measurement using the average component

– Average Component In-Parameter (ACIP)—The complexity measurement of coupling and cohesion mainly uses the concept of the average component input parameter, which measures the components available inside:

ACIP = (Σ_{i=0}^{M} CIP_i) / m    (2)

– Component In-Parameter—CIP_i takes eight parallel inputs to measure the complexity of the component from coupling and cohesion. In the summation, the maximum value of the component is denoted by the term 'n' and the minimum value is represented by i = 0:

CIP_i = Σ_{i=0}^{n} Value(Input parameter), where Value is the input parameter value if 0.10 ≤ X_i ≤ 30, and 0 if there is no parameter    (3)

– Average Component Out-Parameter (ACOP)—For the complexity measurement of coupling and cohesion, the concept of the average component output parameter is used to measure the interaction of the components available outside:

ACOP = (Σ_{i=0}^{M} COP_i) / m    (4)

– Component Out-Parameter—COP_i takes eight parallel interaction outputs to measure the complexity of the component from coupling and cohesion. In the summation, the maximum value of the component is denoted by the term 'n' and the minimum value is represented by i = 0.


Table 4 Calculate the component value and comparison of complexity level

Interface   COP_i   ACOP   CIP_i   ACIP   AIOBC
I1          1.6     12.4   0.20    1.5    13.9
I2          0.8            0.10
I3          2.4            0.30
I4          1.2            0.20
I5          1.5            0.20
I6          1.4            0.10
I7          1.5            0.10
I8          2.0            0.30

COP_i = Σ_{i=0}^{n} (OR_i × W_r) + (ON_i × W_n) + (OC_i × W_c)    (5)

5.3 Comparison of In and Out Parameters

In this comparison, the tabulated result is obtained from an experimental study that contains real data set values for measuring the complexity of the component from coupling and cohesion. The experimental study uses the proposed values of eight different developed components, which are used for the interaction among coupling and cohesion; refer to Table 4.

5.4 Average of In-Parameter and Out-Parameter of Both Components

To measure the complexity of the coupling metric component and the cohesion metric component, the average calculation of the inside parameter and the outside parameter is used. The interface complexity of a component-based system is measured using AIOBC, which adds the inside-parameter and outside-parameter components for the average calculation of component complexity; refer to Table 4.

AIOBC = ACOP + ACIP = 12.4 + 1.5 = 13.9    (6)

From these results it can be concluded that the components have high values of cohesion and low coupling associated with the proposed Hexa-oval algorithm, the interface density module, and the inside, outside and average component-parameter calculations of the optimum component selection framework.
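A small sketch, under the assumption that the per-interface in- and out-parameter values are already known (as in Table 4), of how ACOP, ACIP and AIOBC could be computed; the names are illustrative. Note that the aggregate values in Table 4 appear to be sums over the eight interfaces, so the sketch uses sums; if the division by m in Eqs. (2) and (4) is intended, replace sum(...) with sum(...)/len(...).

```python
# Per-interface out-parameter (COP_i) and in-parameter (CIP_i) values from Table 4
cop = [1.6, 0.8, 2.4, 1.2, 1.5, 1.4, 1.5, 2.0]
cip = [0.20, 0.10, 0.30, 0.20, 0.20, 0.10, 0.10, 0.30]

acop = sum(cop)        # 12.4 in Table 4
acip = sum(cip)        # 1.5 in Table 4
aiobc = acop + acip    # 13.9, Eq. (6)

print(acop, acip, round(aiobc, 1))
```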


5.5 Client System Integration and Testing

Assets management of warehouse artifacts—Select the suitable component from the warehouse repository of commercial off-the-shelf components, then adapt it to the developer system and verify the various quality factors. In this step, the various functionalities of the system component and its specific characteristics are observed. The complexity is also measured in this phase of the system, before moving to the next phase of system integration and system testing, where the component is assembled into the system and the process is verified.

5.6 Maintenance and Software Upgradation

In this phase, the software and system behaviour are monitored regularly to see how each module of the source code responds. Client feedback is a necessary consideration for system maintenance in the CBSE process. If existing software applications are not up to the business market standard, the developer tries to replace them with new components based on software reusability. In Fig. 3, the complexity measurement results of component low cohesion and component high cohesion are compared; the results are observed from MATLAB software.

Fig. 3 Complexity measurement of CLC & CHC


Fig. 4 Comparison of various metric parameter for complexity

For observing the cohesion metric measurement, the correlation coefficient value is used with the real data set packages of the component software. A lower value indicates less cohesiveness between one software module and another programming module, whereas a higher value of interaction is preferred because it provides the improved performance of high cohesiveness between two different software packages. Here the blue bars indicate low relations and the yellow bars represent higher relations. Figure 4 compares the various metric parameters to analyse the complexity of the software component. This reduced level of complexity improves the efficiency and reliable performance of the system. The existing metrics such as PCoh, LCOM, LCOM1, ICH and SCC are compared with the proposed cohesion measurements CLC and CHC, which provide a better result with a positive observation from the complexity measurement of the software package. In Fig. 5, the examination result is obtained from MATLAB Optimization and Simulink software, which are used to observe the performance and compare the software quality of the open-source component. In this procedure, interface components are required for the package measurement of source code, which is available for the developed component and the component under development. This interface component is used to observe the complexity among the functional modules for software maintenance and system implementation. Thus, the optimized metric value of coupling and cohesion is used for the interface between the two distinct components.


Fig. 5 Comparison of complexity level between interface component

6 Conclusion

This paper discusses the different parameters and quality factors of software quality. The software attributes and sub-module characteristics are used to quantify the metric value of the quality attribute. Further, the procedure is divided with respect to the standard quality measurements, using system functionality and software performance for in-house component development. A suitable component is then selected from the COTS package and adapted into the software system. The next step is to analyse the factors of software quality against various standard functionalities. Metric measurement plays a significant role in component-based software development to form an interface between coupling and cohesion and to measure the complexity of inside and outside parameters. This paper observed the comparative result of cohesion measurement between the high-level package and the low-level package, and then showed the cyclomatic complexity result for the coupling relation of the low-level interface. The relation of functional and non-functional factors is observed for better software quality and reliability. It is recommended that proper initial planning, proper requirement gathering, source-code design and optimal implementation, supported by testing, improve software quality for component-based software development.



Detecting WSN Attacks Through HMAC and SCH Formation Neetu Mehta and Arvind Kumar

Abstract Wireless sensor network is playing a vital role in several areas ranging from military surveillance to various industrial applications. Most of the industrial applications require low latency, high delivery rate, and secure data transmission. After the analysis of earlier methods, a new prototype is proposed called advanced malicious detection using hash message authentication code (HMAC) and balanced load sub-cluster head (SCH) selection for WSN. The protocol is using hybrid ack scheme for malicious node detection and sub-cluster head selection for load balancing. Further, it is using a hybrid MAC and hash message authentication code for reducing latency time and for secure transmission, respectively. During the performance analysis, the network parameters like packet delivery ratio and latency are calculated and compared with other similar schemes. The results have shown improvement in packet delivery ratio and latency under various types of attacks. Keywords Sub-cluster head · Black hole attack · HMAC · Malicious node detection

1 Introduction Wireless sensor networks or WSNs are a collection of resource-driven sensor nodes with various functionalities like sensing, processing, communication to fulfill different application requirements. The spatially deployed sensors in the field can measure as well as supervise any change in environmental conditions without having any specific infrastructure support. Recently, several research attempts have been made to efficiently set up and implement the sensor network for a wide range of

N. Mehta (B) · A. Kumar SRM University, Delhi-NCR, Sonepat, Haryana 131029, India e-mail: [email protected] A. Kumar e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_2


applications. The needs of all applications cannot be fulfilled by a single generalpurpose WSN design. Therefore, several network parameters such as node density, sensing range, communication, and transmission range have to be considered carefully at the network design phase based on specific applications. To achieve this, it becomes necessary to study the impact of the network parameters on the performance of the networks concerning application specifications. Communication in WSN takes place by verification of pre-defined parameters like node ID, security mechanism, behavior, etc. The application of WSNs is rapidly growing. WSN is, therefore, a collection of autonomous sensor nodes distributed spatially [1]. The actual uses of WSNs include fields of automation and smart-homes, traffic controls, video surveillance, automating process, industries [2], etc. Through the underlying channel of intermediate nodes, data is transferred from the origin node to the target node [3]. Several sensor nodes of the WSN are linked to a central base station (BS). Every sensor node in a WSN is integrally resource-constrained. It has limited capability for processing, storage, and communication bandwidth. Sensor nodes are static or dynamic (mobile), whereas they can be deployed based on the application requirements. One or more base stations (BSs) are deployed in the network; it can be either dynamic or static. Sensor nodes monitor the network area after deploying the network. If an event occurs, the surrounding sensor nodes can detect it to generate a data report for that event and transmit to a BS [4]. The BS processes the data and then forwards it through the high bandwidth communication links to the external world for further processing of sensed data. Recently, the distributed WSN has been evolved into related grouped nodes (clusters) with self-organizing capability, which can also be connected with boundary nodes to open networks. Sensor nodes are often used in an adverse setting, where the sensors are mostly open and susceptible to multiple attacks without protection. Any node can be subjected to an attack in a sensor array from any heading [5]. Implementation of a WSN safety approach and protection becomes an important objective. A WSN can contain a very big amount of sensor nodes between several hundreds. In the past couple of decades, safety at the WSN has received a lot of attention. Safety plays an essential role in both cluster-based WSNs and distributed WSN, particularly in cluster-based WSNs. In cluster-based WSNs, the cluster head (CH) is vulnerable to insider or external attacks [6] and can be easily destroyed. Insider exploits are hard to identify through safety systems as opposed to external attacks. As a result, more attention was given to their security and research. “Normal nodes,” which have already been intercepted and gained a false legal identity, are used in initiating insider attacks. The paper is organized as follows. Section 1.1 presents various types of attacks and their behavior on sensor networks. Section 2 gives an overview of related work. Section 3 focuses on the basic proposed model along with attack construction cases. Section 4 is about protocol implementation using a hybrid ack scheme and HMAC. Section 5 shows the simulated result and comparative with the earlier standards. Section 6 presents the concluding remarks and future direction.


1.1 Types of Attacks

Due to being deployed in unattended and adverse locations, almost all WSNs are under attack. An attack is defined as an attempt to gain unauthorized access to a resource, information, or service. A passive attack acts more like a watcher, as the attacker just listens to and analyzes the network nodes and other information [7]. It is difficult to capture a passive attacker, as they do not make any kind of change; a passive attacker formulates the information for active attackers. In an active attack, the message and other network information can be modified and the attacker can inject malicious nodes from outside. The various types of attacks are discussed below:

1. Tampering Attack
It results from the attacker's physical access gain to the node, the aim being recuperation of cryptographic materials such as cipher keys. It is a kind of physical layer attack where the attacked nodes can be changed so that the attacker can control them.
2. Sleep Deprivation Attack
Energy-dependent WSN devices are susceptible to "sleep deprivation" attacks by inducing attacked nodes to always remain awake [8]. As a result, the battery depletes frequently with task overloading. Hence, this type of attack is considered to be very dangerous as it affects the lifetime of sensor nodes.
3. Blackmail Attack
Malicious nodes render other nodes as malicious and, hence, evoke their elimination. Therefore, a malicious node can disrupt the network by affecting a substantial number of nodes. A node's energy is exhausted by the assignment of unnecessary tasks and computations. The attacker can spread false information about a good node in the network so that the node is avoided in future routes.
4. Black Hole Attack
A node generates false routing information altering the data transmission path; therefore, it creates a sinkhole or a black hole within the network [9]. The malicious node publicizes itself as having the shortest route to the destination and starts dropping packets. The malicious node can attack as a single node or in a group. The attack affects the nodes which are far away from the base station and degrades the network throughput considerably.
5. Wormhole Attack
The wormhole is a significant attack in sensor networks which mostly occurs during the initial phase when nodes start to discover neighboring information. In this attack, various attackers are located at various ends of a network. These attackers can receive and forward messages through tunnels in various parts [10]. This is a very severe type of attack where malicious nodes communicate directly with each other at higher speed than any other node in the network.

6. Selective Forwarding Attack
As stated earlier, a node may behave as a router or server; defective nodes can deny transmitting packets while dropping them under a selective forwarding attack [9]. It is a very severe threat to the sensor network and very difficult to detect, as the malicious node works as a normal node but selectively drops some of the packets.
7. Sybil Attack
This attack is defined as a "malevolent device, illegitimately taking multiple identities"; i.e., attackers use fake node identities to gain access to discrete algorithms, such as node selection, and degrade the WSN functionality [11]. Consequently, legitimate nodes may be denied access to resources. The Sybil attacker is created during the communication process, when one node communicates with another through the one-hop method. In that condition, a node gets access to the other normal node, and this is an easy way to get data such as the node's position and id. By the use of this data, the attacker node creates similar ids to establish attacks on the normal nodes.
8. HELLO Flood Attack
Several protocols use the "HELLO" packet to detect surrounding nodes and thereby develop a network topology. The easiest attempt for an intruder is to send a stream of these packets into the network and also to stop the exchange of other packets [7]. The attacker can broadcast a quality route with the help of high transmission power so that most of the nodes in the network use this route.
9. Jamming Attack
Another popular attack on WSNs or mobile ad hoc networks is that the radio broadcast is disturbed by transmitting unnecessary wave spectrum inputs [7]. Consequently, the jamming could be intermediate, short-lived, or continuous. It is a kind of denial-of-service attack where a high-range signal can disturb the communication. Jamming in a wireless network can occur through noise, interference, and collision.
10. Identity Replication Attack
Attackers can duplicate receiver nodes to obtain a large portion of the data transmissions by positioning these modified node clones across the network. Contrary to Sybil attacks, ID replication attacks depend much more on allocating similar IDs to two or more distinct nodes. This exploit can be deployed since an affected sensor node is unidentified in a WSN.
11. Overwhelm Attack
This type of attack affects node hardware and software. The attack overwhelms the sensor node so that it starts performing sensing operations indefinitely to increase the traffic to the base station.
12. Path Base DoS Attack
In a path base DoS attack, the attacker injects new traffic from outside, which consumes bandwidth of the path to the base station and causes a denial of service [7].


13. Flooding Attack
This type of attack puts the traffic down by generating a large amount of traffic in the network [9]. Attackers make unnecessary new connection requests to exhaust the resources.
14. Desynchronization Attack
This is a kind of attack that disturbs the communication between two nodes by resynchronizing the transmission [9]. In this attack, the attacker repeatedly sends false messages to the nodes, which leads to resource exhaustion.
15. Acknowledgment Spoofing Attack
Here, the attacker sniffs the communication of adjacent nodes to generate false acknowledgments. The main goal of the adversary is to convince the sender that a weak link is strong or that a dead node is alive [9].

The complexity of attacks has increased due to the advent of multi-hop distributed systems, and identifying malicious nodes or attackers in these kinds of environments is extremely difficult. A few existing works provide solutions to Sybil attacks, and a few research projects provide solutions for DoS, sinkhole, and similar attacks. It is therefore evident that the current methods mostly handle specific attacks, and it is necessary to develop a new process for managing multiple attacks. Very few attacks can be handled by existing solutions, and most are addressed by basic strategies that eliminate individual WSN attacks. In addition, the nodes remain extremely constrained in energy, bandwidth, resources, transmission capability, and power consumption. Any unauthorized agent might, therefore, initiate attacks that partly or completely disrupt the WSN system.

2 Literature Review Many methods to improve the safety of the wireless sensor networks are listed in the literature. Karlof and Wagner were the first to identify several security issues in the WSNs [12]. They outlined countless feasible assaults in a LEACH protocol, such as HELLO flood, Sybil, black hole, and gray hole exploits. The reduced energy resources, low calculation capacity and other restrictions within the sensor network, makes it difficult to build a secure network by the researchers, and therefore, WSNs cannot be protected by advanced security mechanism available [13, 14]. Malevolent nodes can, therefore, readily disrupt the regular tasks and render faults in the system. Watchdog method improved the throughput of the network by detecting malicious node misbehaving in the network. The method is using failure counter for detection of malicious node, and in the future the malicious nodes are avoided by updating the information in the routing table. Another such type of scheme AACK is based on two acknowledgment and end-to-end ack scheme. The method is reducing the network overhead considerably.


DoS attacks were described by Wood and Stankovic [15]. They also enlisted feasible anti-attack protection systems; examples include a spread-spectrum method to prevent jamming and debugging codes to protect against packet crash attacks. In particular, detecting and defending against DoS assaults in WSN is not that simple. In a paper [16], it was suggested that the identification and removal of wormhole attacks and adversaries in homogeneous WSNs be accomplished by center nodes through routing. Energy-aware cluster head selection and cluster head rotation play a vital role in the stability of the network and also save energy. The concept of advanced clustering and intelligent cluster head selection is a great tool to reduce energy consumption and to increase the lifetime of the network [17]. Many researchers develop an intrusion detection scheme (IDS) to evaluate safety problems in WSN [18]. Mao [19] suggested the watchdog system, and a neighboring surveillance system was put forward by Kredo and Mohapatra [20]. These processes use the immediate or conditional confidence function of a node, "evil or not," between neighbor nodes, and several fresh security systems are presently intended to enhance these. Mehta [21] figured out the effect of Sybil and black hole attacks on the cluster head as well as the child node and also analyzed how they degrade the efficiency parameters of WSN. The most basic selective insider attack was described by Gulhane [22], in which all packets are pulled from the transmission path and dropped elsewhere, forming "a black hole." This assault can readily be identified, as the faulty node may not pass a packet for quite a while; it is thus considered unsuccessful by its neighbor stations, and hence the transmission information is updated. According to Liu [23], an on/off attack is a sort of highly agile selective forwarding attack; it produces complete or partial packet losses periodically, while pretending to be an ordinary node the rest of the time. Further, Lu [24] points out that the backpressure algorithm is effective in maintaining the throughput of a network under a set of security attacks. Malevolent nodes target specific packets, to efficiently decrease the likelihood of being detected, in order to meddle with packet content or drop components of interesting data. Such an attack causes serious damage to the entire network. The unreliable communication channel in many applications makes it difficult to provide security in sensor networks. Farjamnia et al. [25] studied the various techniques against wormhole attacks. Rehman [26] discussed various attacks and their effects on different layers of communication networks, which can further help in a more robust system design.

3 Proposed Model

A new prototype called malicious node detection using HMAC and balanced load sub-cluster head selection is proposed to handle the various security issues. In this model, the work concentrates on various hybrid attacks, namely the black hole attack, Sybil attack, and wormhole attack.


Fig. 1 Various attacks: false reporter attacks, packet dropping attacks, and acknowledgment hacking attacks

The spacing between the normal child nodes and the sub-cluster head (SCH) plays a key part in energy consumption [21]. The balanced load sub-cluster head selection technique applies load balancing for energy saving and for increasing network lifetime. If a few sub-cluster nodes are heavily loaded, it leads to faster energy depletion, so to obtain normal depletion of energy the balanced load sub-cluster head selection is introduced. A balanced load sub-cluster head selection thus leads to minimal energy depletion of each node present in the network by forming transmissions with close-by nodes. The concept of sub-cluster head selection works both for the reduction of energy consumption and to protect the network from black hole and Sybil attacks. In the network, the SCH nodes send HELLO packets to all the nodes present in the neighboring area, and the nodes send an acknowledgment in return. The TDMA MAC scheduling technique is used to escape collisions. After the receipt of an acknowledgment, each sub-cluster head node compares the distance between itself and the child nodes with the threshold distance. At the end of the distance calculation, each SCH node sends a message to the concerned child nodes which are linked with it. If a child receives more than one copy, it randomly selects the SCH node with which it has to coordinate. To validate the proposed methodology in this chapter, three different attacks are applied to the nodes in the cluster of the WSN. These attacks are categorized as shown in Fig. 1.

3.1 False Reporter Attack (Attack 1) One of the popular attacks in WSNs is the black hole attack. Malevolent nodes dump the packets received during a fake reporter attack and return a fake misbehavior message when feasible. A black hole concern indicates that one suspect node uses the routing protocol to pretend to become the quickest route to the target node, dropping the routing messages, but not forwarding the messages to their neighbors. The WSNs are readily affected by a single black hole attack.


3.2 Packet Dropping Attack (Attack 2)

In a packet dropping attack, the malicious node drops all packets that it receives but does not generate any acknowledgment signal. These attacks can be easily found through the "ack" check method. The malicious node carries out this attempt by dropping the packets traveling through it over some time; it is also known as a gray hole attack. The traceroute simulation tool is used to find the malicious node in the network when the malicious node completely discards all packets passing through it. Once a router identifies the presence of the malicious router in the network, it stops sending packets to it, straight away removes that router's address from its routing table, and also sends this detail to other surrounding routers in the network. But this method fails if the malicious node also receives this report from the trusted router.

3.3 Acknowledgment Hacking Attack (Attack 3)

In an acknowledgment hacking attack, the malicious nodes are clever enough to forge acknowledgment packets, turning negative results into positive ones.

4 Protocol Implementation

The malicious nodes in wireless sensor networks can be identified using a hybrid acknowledge scheme (HAS). The clusters are created in the network by grouping of nodes. Each cluster includes only three nodes, provided with an individual cluster key for all the nodes. This cluster key is generated by the sink and is distributed to all the clusters in the WSN. For instance, let the nodes be named N1, N2, and N3, and suppose node N1 wants to transmit a packet to node N3. It first passes the packet to node N2 with its cluster key. Node N2 receives this packet after verifying the cluster key of node N1. The packet types in Fig. 2 are defined as PAC—data packet, HAS—hybrid acknowledge scheme, and Pac_dum—dummy packet.

Fig. 2 Proposed framework for WSN attacks

If the cluster key of node N1 matches the cluster key of node N2, then node N2 accepts the packet from node N1 and, if the destination address does not match its own address, sends the packet on to node N3. Node N3 follows the same packet reception procedure performed by node N2 to receive packets from the previous node in the cluster. After receiving the packet, node N3 sends the HAS signal to node N1 through node N2 using the same procedure. Node N1 must receive this HAS signal within a particular duration. If it does not receive the HAS signal within the stipulated time, then node N1 assumes that nodes N2 and N3 are suspicious or malicious. Finally, node N1 sends this malicious node information to the sink immediately.

The malicious node information given to the sink needs to be verified, as malicious nodes can send false information to the sink. As shown in Fig. 2, node N4, near the suspected malicious node, sends a dummy packet with its address to N5 through N1. If the node is malicious, it will not pass the packet to N5. Node N4 also passes the packet to N5 through an alternate route. If both packets received at N5 match, the misbehavior report is assumed correct and accepted.
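A minimal sketch of the dummy-packet verification step described above, assuming a toy network model in which each node either forwards or silently drops packets; the function and variable names are illustrative and not part of the paper.

```python
def verify_misbehavior_report(forwards: dict, suspect: str) -> bool:
    """Return True if a report against `suspect` is confirmed.

    `forwards[node]` is True if the node forwards packets (toy model).
    The reporter sends a dummy packet toward the destination both
    through the suspect and through an alternate, trusted route.
    """
    via_suspect = forwards.get(suspect, False)   # copy routed through the suspect
    via_alternate = True                         # alternate route assumed reliable
    # If only the alternate copy arrives, the misbehavior report is accepted.
    return via_alternate and not via_suspect

# Example: node N2 drops packets, so a report against it is confirmed
behaviour = {"N1": True, "N2": False, "N3": True}
print(verify_misbehavior_report(behaviour, suspect="N2"))  # True
```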

4.1 Balanced Load SCH Selection

The balanced load SCH selection aims to lessen the energy utilization and also to enhance the network lifespan by introducing a load balancing concept. The CH is chosen based on the remaining power of the different nodes, which is used for the selection of the CH and SCH within a cluster. The concept of sub-clustering is also used for congestion avoidance, because the data load is divided so that network overhead on a single node does not affect the performance of the network. The node having maximum energy is elected as the cluster head, and the node with energy less than the maximum node is elected as a sub-cluster head. After every round, selection of the CH and SCH is made based on residual energy. Thus, only after the selection of the cluster head and sub-cluster heads is sensing information initiated for transmission over the network, so that data can be transmitted from a source node to the base station without any extra energy consumption. A sketch of this residual-energy-based election is given below.
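The following is a minimal sketch, under the assumption that each node's residual energy is known, of electing the highest-energy node as CH and the next-highest as SCH; the names and structure are illustrative, not the authors' implementation.

```python
def elect_heads(residual_energy: dict) -> tuple:
    """Pick the cluster head (max energy) and a sub-cluster head (next highest)."""
    ranked = sorted(residual_energy, key=residual_energy.get, reverse=True)
    return ranked[0], ranked[1]

# Example: re-run the election each round with updated residual energies
energy = {"n1": 940.0, "n2": 910.5, "n3": 980.2, "n4": 875.0}
ch, sch = elect_heads(energy)
print(ch, sch)  # n3 n1
```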


Fig. 3 Architecture of balanced load sub-cluster head (sink, cluster heads (CH), sub-cluster heads (SCH) and member nodes (MN))

Along with the network, the balanced load concept is applied to the sub-cluster head nodes, where a balanced load SCH is formed in each cluster as shown in Fig. 3. In the figure, SCH indicates the sub-cluster head nodes and MN the member nodes of a cluster. The cluster head (CH) is far from the member nodes, and the base station (BS) is stationed far from the field where the nodes are localized. Here, the sub-cluster head collects the information from the member nodes and, after aggregating the data, transfers it to the main cluster head node, which further sends the information to the sink. In Fig. 3, the various nodes are MN—member node, CH—cluster head, SCH—sub-cluster head. The distance between the sub-cluster head and a member node is calculated as follows:

Distance(SCH, MN) = √(SCH(x, y) − MN(x, y))    (1)

A threshold distance is used so that the distance between a node and its sub-cluster head remains less than this threshold, as illustrated in the sketch below.
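A small sketch of assigning member nodes to the nearest SCH within the threshold distance, using the ordinary Euclidean distance between (x, y) positions; this is an assumption about how Eq. (1) is applied, and all names and values are illustrative.

```python
import math

def distance(a: tuple, b: tuple) -> float:
    """Euclidean distance between two (x, y) positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def assign_to_sch(member_pos: tuple, sch_positions: dict, threshold: float):
    """Return the closest SCH within the threshold distance, else None."""
    candidates = {sch: distance(member_pos, pos)
                  for sch, pos in sch_positions.items()
                  if distance(member_pos, pos) <= threshold}
    return min(candidates, key=candidates.get) if candidates else None

# Example: member at (120, 80), two SCHs, threshold 100 m
schs = {"SCH1": (150.0, 60.0), "SCH2": (400.0, 300.0)}
print(assign_to_sch((120.0, 80.0), schs, threshold=100.0))  # SCH1
```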

4.2 Mechanism for Hybrid MAC

The mechanism implied in this protocol involves a step-by-step procedure after the localization process, as follows:

Step 1: Initially, the nodes start to identify the best intermediate node from their location. This is possible through the propagation of hello messages from time to time; in this simulation, the exchange of hello messages is made every 30 s. After that, the hello information is swapped between nodes, which assists the nodes in finding the next neighbor, and the details are stored in the neighbor list.
Step 2: Secondly, the node starts to sense its relevant parameters and transmits the data to the destination using the CSMA methodology. This process continues under two conditions: (a) there is no increase in traffic load, and (b) there is no emergency packet transmission.
Step 3: Thirdly, high-priority region-based data transmission is initiated. If any node is identified as being in the high-priority region, the neighbor nodes hold their data, are shifted over to TDMA, and give the current slot to the node in the higher-priority domain.
Step 4: Finally, if two nodes occupy the high-priority region in the current slot, they are assigned to transfer their information one after the other. At the end, CSMA mode is activated again.
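A compact sketch of the CSMA/TDMA switching idea in Steps 2–4, assuming a simple flag-based model of traffic load and priority, and assuming that increased load or an emergency also triggers the TDMA fallback; the function is a hypothetical illustration, not the authors' NS2 implementation.

```python
def select_mac_mode(traffic_increase: bool, emergency: bool, high_priority_nodes: list) -> str:
    """Decide the MAC mode for the current slot (toy model of Steps 2-4)."""
    if high_priority_nodes:
        # Shift to TDMA and serve high-priority nodes one after the other
        for node in high_priority_nodes:
            print(f"TDMA slot granted to {node}")
        return "TDMA"
    if not traffic_increase and not emergency:
        return "CSMA"          # normal contention-based transmission
    return "TDMA"              # fall back to scheduled slots under load

print(select_mac_mode(False, False, []))             # CSMA
print(select_mac_mode(False, False, ["n7", "n2"]))   # TDMA, slots for n7 then n2
```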

4.3 Hash Message Authentication Code

If two nodes share a confidential symmetric key K, a one-way cryptographic hash h can be used to efficiently create and check an authentication code h(K) for the received message. The hash message authentication code is computed with a hash function in combination with the secret key. For a small sensor node, this is computationally effective and inexpensive. However, only the expected receiver can verify the HMAC, so it does not allow third-party verification of the transmitted signal. Furthermore, it is a fairly substantial issue to distribute the hidden code between any two communicating nodes: the total number of keys required is n(n − 1)/2 in an n-node network if pairwise shared keys are applied. Every auxiliary node on the reverse track validates its identification on both of the collected tracks. This also validates that the previous and subsequent nodes are its neighbors in the acquired path. When true, the initial node calculates and passes the following HMAC signal and shared key toward the next node in the route by using the RREP message carried to the succeeding node. The route response thus reaches the source node S. The main notations used in the route reply process are defined in Table 1.

Table 1 Main notations used in the above process

Notation     Description
IPS, IPD     The IP addresses of the source and destination nodes
SND          The sequence number of the destination
Hop_cnt      Value of hop count
Route_path   Path accumulation list of the route path
KDS          The shared secret key between nodes


Table 2 Parameter set for simulation tests

Parameters                 Type
Type of channel            Wireless channel
Type of antenna            Omnidirectional antenna
Radio propagation model    Two-ray ground
Network interface          Physical layer
Routing protocol           DSR
Standard                   IEEE 802.11b
Packet size                300
Number of mobile nodes     100
Initial energy (J)         1000
Area                       1000 × 1000
Data rate                  1 Mbps

The route reply procedure is as follows:

D: AUTH1 = (IPS, IPD, SND, route_path, path_type)KDS
D: HMACKDC = (AUTH1, hop_cnt)KDC
D -> C: (RREP, IPS, IPD, SND, route_path, path_type, hop_cnt, AUTH1, HMACKDC)
C: HMACKCB = (AUTH1, hop_cnt)KCB
C -> B: (RREP, IPS, IPD, SND, route_path, path_type, hop_cnt, AUTH1, HMACKCB)
B: HMACKBS = (AUTH1, hop_cnt)KBS
B -> S: (RREP, IPS, IPD, SND, route_path, path_type, hop_cnt, AUTH1, HMACKBS)

Once the source node S gets the RREP signal, it checks whether its next neighboring node received the signal too and whether this neighboring node is the first node on the route. If it is, the HMAC values in the reply are checked. When both HMAC attributes are properly verified, the route is then recognized.
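To make the HMAC step concrete, here is a small sketch using Python's standard hmac and hashlib modules to compute and verify an authentication code over the (AUTH1, hop_cnt) fields; the field encoding and key values are assumptions for illustration only.

```python
import hmac
import hashlib

def compute_hmac(shared_key: bytes, auth1: bytes, hop_cnt: int) -> bytes:
    """HMAC over the AUTH1 block and hop count, keyed with the pairwise key."""
    message = auth1 + hop_cnt.to_bytes(2, "big")
    return hmac.new(shared_key, message, hashlib.sha256).digest()

# Sender side (e.g., node D toward node C)
key_dc = b"pairwise-key-D-C"            # assumed pre-shared key
auth1 = b"IPS|IPD|SND|route_path|fwd"   # assumed encoding of the AUTH1 tuple
tag = compute_hmac(key_dc, auth1, hop_cnt=3)

# Receiver side: recompute and compare in constant time
ok = hmac.compare_digest(tag, compute_hmac(key_dc, auth1, hop_cnt=3))
print(ok)  # True -> the RREP fields were not tampered with
```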

4.4 Initialization

When initialized, the original energy of a node is set to E_0 = 1, and the energy used in receiving and transmitting a single packet is set to E_r = 0.0001 and E_t = 0.0003, based on the fact that the energy usage of a node for receiving one piece of information relative to transmitting it is 1:2.7 [14]. The reception and transmission of data consume a huge part of the total energy; therefore, we stipulate that the node's remaining energy is calculated from the communication energy used within the proposal.
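A tiny sketch of the energy bookkeeping implied by these constants, assuming energy is simply decremented per packet received or transmitted; the constants follow the initialization above, the rest is illustrative.

```python
E0, ER, ET = 1.0, 0.0001, 0.0003   # initial energy, per-packet receive/transmit cost

def residual_energy(received: int, transmitted: int, e0: float = E0) -> float:
    """Remaining energy after a given number of received and transmitted packets."""
    return e0 - received * ER - transmitted * ET

# Example: a node that relayed 500 packets (received and then forwarded each one)
print(residual_energy(received=500, transmitted=500))  # 0.8
```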

5 Experiments and Results

The simulation parameters are mentioned in Table 2. The NS2 network simulator has been used for performance evaluation. One hundred mobile nodes are placed in an area of 1000 × 1000 m. The nodes are initialized with an initial energy of 1000 J. The routing algorithm used is the dynamic source routing protocol, a reactive protocol where the route is created on demand. In this protocol, every mobile node maintains a routing cache where it caches the source routes it has come to know. When a host wants to send a packet to some other host, it first checks its routing cache for a route. If a route is found, the sender uses that route; otherwise, the source node commences the route discovery process. DSR works suitably for high-mobility networks.

5.1 Comparison with Earlier Schemes

The performance of the proposed methodology is determined through the parameters discussed below. The parameters include packet delivery ratio and latency, and they are analyzed under the different attacks introduced in the network. The packet delivery ratio is the ratio of the total number of packets accurately accepted by the receiver to the total number of packets sent. Latency is the delay experienced by a received packet across the number of nodes in the network. The method is also compared to other conventional schemes: watchdog and AACK. The watchdog [27] scheme was also used for improving the packet delivery ratio of the system by detecting the malicious nodes in the system. It uses two phases: detection and prevention. In the detection phase, false activity is checked by comparing the behavior with historical data, and then all information such as the node id and the time of the attack is stored. In the prevention phase, the malicious activity is prevented by installing a preventer node. The preventer node broadcasts the attacker information to all the sensor nodes in the network. Another conventional method used here for simulation comparison is AACK [28]. The method is based on a two-way acknowledgment and an end-to-end acknowledgment scheme. In the two-way ack scheme, every node sends back an acknowledgment packet to the node which is two hops away from it. It solves the watchdog prototype's problems of limited transmission power and receiver collision.
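For clarity, a short sketch of how the two reported metrics could be computed from per-packet send/receive records; the record format is an assumption, not the NS2 trace format used by the authors.

```python
def packet_delivery_ratio(sent: int, received: int) -> float:
    """PDR (%) = packets received correctly / packets sent."""
    return 100.0 * received / sent

def average_latency(delays_s: list) -> float:
    """Mean end-to-end delay (seconds) over all delivered packets."""
    return sum(delays_s) / len(delays_s)

# Example with assumed numbers: 1000 packets sent, 941 delivered
print(packet_delivery_ratio(1000, 941))           # 94.1
print(average_latency([8.9, 9.1, 9.2, 9.06]))     # ~9.07
```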


5.2 Simulation Performance of Attack 1

The proposed scheme attains a packet delivery ratio of 99.4, 95.8, and 94.1% with 10, 20, and 30% malicious nodes in the area. It is observed from Table 3 that the packet delivery ratio decreases as the ratio of malicious nodes in the system increases. We have calculated the average PDR value and compared it with the conventional methods, as shown in Fig. 4. The proposed scheme shows 94.03% PDR in comparison with the watchdog showing 92.08% and AACK showing 92.86%. The latency value with 10% malicious nodes is 9.07 s, in comparison with AACK showing 15.1 s and the watchdog giving 14.9 s. When the malicious nodes increased to 20%, we obtained latencies of 11.6, 17.4, and 17.9 s for the proposed, watchdog, and AACK schemes, respectively, as plotted in Fig. 5. We further see the increase of latency with 30% malicious nodes in the area; the values obtained are 16.2, 18.9, and 19.1 s for the proposed, watchdog, and AACK schemes, respectively.

Table 3 Testing packet delivery ratio and latency (Attack 1)

Methodology                 Malicious nodes: 10%   Malicious nodes: 20%   Malicious nodes: 30%   Average
Packet delivery ratio (%)
AACK (Sheltami [28])        94.1                   91.5                   92.1                   92.86
Watchdog (Marti [27])       92.4                   90.7                   90.05                  92.08
Proposed                    99.4                   95.8                   94.1                   94.03
Latency (s)
AACK                        15.1                   17.9                   19.1                   17.3
Watchdog                    14.9                   17.4                   18.9                   17.0
Proposed                    9.07                   11.6                   16.2                   11.5

Fig. 4 Packet delivery ratio (Attack 1)


Fig. 5 Latency (Attack 1)

5.3 Simulation Performance of Attack 2

The proposed method, the watchdog, and AACK achieved PDRs of 96.3, 89.1, and 91.7%, respectively, with 10% malicious nodes injected in the area, as seen in Fig. 6. With the increase in malicious nodes to 20%, the PDR values obtained are 94.6, 82.1, and 82.9% for the new prototype, watchdog, and AACK methods, respectively. For 30% malicious nodes in the system, we observed further degradation in PDR, with 94.7, 80.9, and 81.1% for the proposed, watchdog, and AACK schemes, respectively. We have calculated the latency value for the different methods with 10, 20, and 30% malicious nodes in the sensor area, as mentioned in Table 4. The latency value for the offered scheme is 12.3 s, in comparison with 16.8 s for AACK and 15.9 s for the watchdog, with 10% malicious nodes in the area. With 20% malicious nodes, the value is 17.2 s in comparison with 19.1 and 18.1 s for the watchdog and AACK, respectively. When we further increase the malicious nodes in the system (30%), we see the maximum latency in the area. The average latency value obtained for the proposed method is 16.1 s in comparison with 18.6 and 18.9 s for the watchdog and AACK, respectively.

Fig. 6 Packet delivery ratio (Attack 2)


Table 4 Performance comparison (Attack 2)

Methodology                 Malicious nodes: 10%   Malicious nodes: 20%   Malicious nodes: 30%   Average
Packet delivery ratio (%)
AACK                        91.7                   82.9                   81.1                   85.23
Watchdog                    89.1                   82.1                   80.9                   84.03
Proposed                    96.3                   94.6                   94.7                   91.01
Latency (s)
AACK                        16.8                   18.1                   21.8                   18.9
Watchdog                    15.9                   19.1                   20.9                   18.6
Proposed                    12.3                   17.2                   15.8                   16.1

6 Conclusion

In this proposed model, we have performed load balancing through sub-cluster head selection, and a hybrid acknowledge scheme is used for malicious node detection. The scheme uses a hybrid MAC for the data transfer operation, which includes both CSMA and TDMA. We have used the hash message authentication code for providing secure routes. We then constructed attack cases and performed experimental evaluations based on packet delivery ratio and latency. Our evaluation indicated that the scheme performs better compared to similar previous schemes. We have also seen the effect of an increasing number of malicious nodes in the network on the performance parameters. An extension to this work would be to explore an energy-efficient intrusion detection system for dynamic WSNs covering a larger number of hybrid attacks.

References

1. Akyildiz IF, Su W, Sankarasubramaniam Y, Cayirci E (2002) Wireless sensor networks: a survey. Comput Netw 38:393–422
2. Gungor VC, Hancke GP (2009) Industrial wireless sensor networks: challenges, design principles, and technical approaches. IEEE Trans Industr Electron 56:4258–4265
3. Estrin D, Girod L, Pottie G, Srivastava M (2001) Instrumenting the world with wireless sensor networks. In: Proceedings of ICASSP, vol 1, pp 2033–2036
4. Cao H, Leung V, Chow C, Chan H (2009) Enabling technologies for wireless body area networks: a survey and outlook. IEEE Commun Mag 47(12):84–93
5. Liu D, Ning P, Liu A, Wang C, Du WK (2008) Attack-resistant location estimation in wireless sensor networks. ACM Trans Inf Syst Secur 11:22. https://doi.org/10.1145/1380564.1380570
6. Cho Y, Qu G, Wu Y (2012) Insider threats against trust mechanism with watchdog and defending approaches in wireless sensor networks. In: IEEE symposium on security and privacy workshops, pp 134–141
7. Alam S, De D (2014) Analysis of security threats in wireless sensor network. Int J Wirel Mobile Netw 6:35–46. https://doi.org/10.5121/ijwmn.2014.6204


8. Bhattasali T, Chaki R (2011) A survey of recent intrusion detection systems for wireless sensor network. In: International conference on network security and applications. Springer, Berlin, Heidelberg, pp 268–280
9. National Center for Biotechnology Information. https://ncbi.nlm.nih.gov/pmc/articles/PMC6263508/
10. Maheshwari R, Gao J, Das SR (2007) Detecting wormhole attacks in wireless networks using connectivity. In: IEEE INFOCOM 2007–26th IEEE international conference on computer communications, pp 107–115
11. Xiao L, Greenstein LJ, Mandayam NB, Trappe W (2009) Channel-based detection of sybil attacks in wireless networks. IEEE Trans Inf Foren Secur 4:492–503
12. Karlof C, Wagner D (2003) Secure routing in wireless sensor networks: attacks and countermeasures. In: Proceedings of the first IEEE international workshop on sensor network protocols and applications, pp 113–127
13. Cao Q, Abdelzaher T, He T, Kravets R (2007) Cluster-based forwarding for reliable end-to-end delivery in wireless sensor networks. In: IEEE INFOCOM 2007–26th IEEE international conference on computer communications, pp 1928–1936
14. Hu W, Bulusu N, Jha S (2005) A communication paradigm for hybrid sensor/actuator networks. Int J Wirel Inf Netw 12:47–59
15. Wood AD, Stankovic JA (2002) Denial of service in sensor networks. Computer 35:54–62
16. Bhagat S, Panse T (2016) A detection and prevention of wormhole attack in homogeneous wireless sensor network. In: 2016 international conference on ICT in business industry & government (ICTBIG). IEEE, pp 1–6
17. Mehta N, Kumar A (2019) Energy-aware unequal multi-hop weighted cluster heterogenous routing protocol for wireless sensor network. Int J Innov Technol Explor Eng 8:1334–1339
18. Alrajeh NA, Khan S, Shams B (2013) Intrusion detection systems in wireless sensor networks: a review. Int J Distrib Sens Netw 9:167575
19. Mao ZM, Wang J, Zhang Y (2012) Inventors; University of Michigan, AT&T Intellectual Property II LP, assignee. Method and apparatus for mitigating routing misbehavior in a network. United States patent US 8141156
20. Kredo K II, Mohapatra P (2007) Medium access control in wireless sensor networks. Comput Netw 51:961–994
21. Mehta N, Kumar A (2018) Advanced malicious detection using HMAC balanced load sub-cluster head selection for wireless sensor network. J Adv Res Dyn Control Syst sp. ed. 13:2321–2330
22. Gulhane G, Mahajan N (2014) Performance evaluation of wireless sensor network under black hole attack. Int J Comput Technol 1:92–96
23. Liu X, Liu Y, Liu A, Yang LT (2018) Defending ON–OFF attacks using light probing messages in smart sensors for industrial communication systems. IEEE Trans Indust Inf 14:3801–3811
24. Lu Z, Sagduyu YE, Li JH (2015) Queuing the trust: secure backpressure algorithm against insider threats in wireless networks. In: 2015 IEEE conference on computer communications (INFOCOM). IEEE, pp 253–261
25. Farjamnia G, Gasimov Y, Kazimov C (2019) Review of techniques against the wormhole attacks on wireless sensor networks. Wirel Person Commun 105:1561–1684
26. Rehman A, Reham S, Raheem H (2019) Sinkhole attacks in wireless sensor networks: a survey. Wirel Person Commun 106:2291–2313
27. Marti S, Giuli TJ, Lai K, Baker M (2000) Mitigating routing misbehavior in mobile ad hoc networks. In: Proceedings of 6th annual international conference on mobile computer network, Boston, pp 255–265
28. Shakshuki EM, Sheltami TR (2009) Tracking anonymous sinks in wireless sensor networks. In: International conference on advanced information networking and applications, pp 510–516

Literature Review of Various Nature-Inspired Optimization Algorithms Used for Digital Watermarking Preeti Garg and R. Rama Kishore

Abstract Today, a tremendous amount of data is transferred online, so there is a need to secure this data. Digital watermarking is a process of embedding some presumed content image or data in any cover data so that the quality of the content should not degrade and it should not be visible to human eyes. This paper describes various characteristics required by any watermarking algorithm and explains some of the optimization algorithms. DWT, DCT and SVD alone are not sufficient for achieving the required robustness, imperceptibility and security of the digital content; some of the optimization algorithms are required to achieve these, so this paper reviews various nature-inspired optimization algorithms used for optimizing the process of digital watermarking and shows a comparative study of these techniques in tabular form. Keywords Digital watermarking · Optimization · Genetic algorithm · Firefly algorithm · Particle swarm optimization · SVD · Artificial bee colony algorithm

1 Introduction Today’s world is a digital world because every information or data is available in digital form on the Internet. This availability of data on Internet allows users to share and access all the data and information in digital form which infringes the law of copyright ownership of particular data. As everything is available in digital form, one can use other’s data easily and can modify it which gives birth to the digital watermarking. One of the applications of digital watermarking is to provide copyright P. Garg (B) · R. R. Kishore GGSIPU University, New Delhi, India e-mail: [email protected] R. R. Kishore e-mail: [email protected] P. Garg KIET, Ghaziabad, India © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_3

identification. The watermarking technique should be such that no one knows the existence of the embedded watermark [1]. Digital watermarking allows the user to add a design like a logo or an image into some cover image to prove authenticity and ownership. It is a tool to identify the ownership, authenticity and owner of a particular image or document. A watermark can be of two types, visible and invisible: a visible watermark is one which a user can easily see, while an invisible watermark is not visible to the users. Watermarking is applied according to the type of document, so it can be of four types: text, image, video and audio watermarking. Watermarking is an important and efficient technique for protecting multimedia contents. The watermarking procedure is shown in Fig. 1: an embedding algorithm is used to embed the watermark into the host data, and after implementing the algorithm a watermarked image is generated. Digital watermarking has various applications like copyright protection, digital rights management, fingerprinting, image and content authentication, tamper proofing and many more. In medical applications also, it is very useful for protecting the patient's personal details from unauthorized users. A watermarking algorithm should be robust against various geometric and image-processing attacks [2]. To achieve these goals, various optimization algorithms are used which make the watermarking algorithm optimized. This paper reviews some of the nature-inspired optimization algorithms used to optimize watermarking, and it can help researchers to find the various techniques used in this process and a comparison between them, so that researchers can select a suitable algorithm for their research. Section 2 of this paper describes various characteristics of watermarking. In Sect. 3, mathematical preliminaries are shown. In Sect. 4, various optimization algorithms are described. The literature review of various papers is described in Sect. 5. Sections 6 and 7 show future scope and conclusion.

Fig. 1 Watermarking architecture (the cover/host image and the watermark image are input to the watermark embedding algorithm, which produces the watermarked image)


2 Characteristics of Watermarking

2.1 Robustness

After performing operations on the image, there should be no modification of the watermark image; that is, the embedded watermark should be robust to attacks which attempt to modify or remove the watermark.

2.2 Imperceptibility Imperceptibility means that after adding any watermark into the cover image, it should not degrade the quality of cover image. The similarity between original and watermarked image should be high so that no one can feel presence of any embedded data.

2.3 Security Once a watermark is embedded into the cover image, no unauthorized person should be able to gain the watermark from the embedded image [2], otherwise he/she can use that data to harm its security.

2.4 Transparency There should be no loss of any feature in cover image after embedding the watermark. The watermark should not degrade the quality of the original image as well as no one should know its presence [1].

2.5 Capacity Capacity defines how much embedding can be done on cover image. It means that cover image should be able to handle the load of watermark image.


3 Mathematical Preliminaries

3.1 DFT

DFT uses phase modulation instead of magnitude components to hide the message. The original image is decomposed into four sub-band images by DWT: three high-frequency parts (HL, LH and HH, named detail sub-images) and one low-frequency part (LL, named the approximate sub-image) [3]. To compute the DFT, calculations are performed using the fast Fourier transform (FFT). An advantage of using DFT is that it has less visual effect and is very robust against noise attacks on the message.
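As a brief illustration of the transform itself (not of any specific embedding scheme from the reviewed papers), the following sketch computes the 2D DFT of an image block with NumPy and separates the magnitude and phase components mentioned above; the block values are arbitrary.

```python
import numpy as np

# A small 8x8 "image block" with random pixel values (illustrative only)
block = np.random.rand(8, 8)

# 2D discrete Fourier transform via FFT
spectrum = np.fft.fft2(block)
magnitude = np.abs(spectrum)    # magnitude components
phase = np.angle(spectrum)      # phase components (often used to carry the message)

# The block can be reconstructed exactly from magnitude and phase
reconstructed = np.fft.ifft2(magnitude * np.exp(1j * phase)).real
print(np.allclose(block, reconstructed))  # True
```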

3.2 DCT

DCT stands for discrete cosine transform; it is the most frequently applied linear orthogonal transformation in digital signal processing [4]. DCT is used to transform an image into the frequency domain, and quantization is performed on the coefficients to compress the image. DCT helps to separate an image into a hierarchy of sub-bands. Strong energy compaction is a significant property of the DCT, which is why it is widely used in the fields of image processing and watermarking [5].
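As a small illustration of the transform step described above, the following sketch (assuming NumPy and SciPy are installed; the block size and image are illustrative only) computes the 2-D DCT of one 8 x 8 block of a grayscale image and inverts it.

```python
# Minimal block-wise 2-D DCT sketch, assuming SciPy and NumPy are available.
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    """Orthonormal 2-D DCT (type II) of a block."""
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(coeffs):
    """Inverse 2-D DCT."""
    return idct(idct(coeffs, axis=0, norm='ortho'), axis=1, norm='ortho')

image = np.random.rand(512, 512)      # stand-in for a real cover image
block = image[0:8, 0:8]
coeffs = dct2(block)                  # frequency-domain coefficients
restored = idct2(coeffs)
print(np.allclose(block, restored))   # True: the transform is invertible
```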

3.3 DWT

DWT stands for discrete wavelet transform, a fast and simple transformation approach which translates an image from the spatial domain to the frequency domain. In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is any wavelet transform for which the wavelets are discretely sampled [6]. DFT and DCT represent a signal either in the spatial domain or in the frequency domain, but DWT is able to represent a signal in both the spatial and frequency domains simultaneously. DWT is used in JPEG 2000.
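The sub-band decomposition mentioned above can be reproduced with the PyWavelets package; the following minimal sketch (the wavelet choice and the random image are illustrative assumptions) performs one level of 2-D DWT and rebuilds the image.

```python
# Minimal one-level 2-D DWT sketch, assuming the PyWavelets package (pywt).
import numpy as np
import pywt

image = np.random.rand(512, 512)                  # stand-in for the host image
cA, (cH, cV, cD) = pywt.dwt2(image, 'haar')       # cA: low-frequency (LL) sub-band,
                                                  # cH, cV, cD: detail sub-bands
print(cA.shape)                                   # (256, 256)

# A watermark is usually inserted into a chosen sub-band, after which the
# image is rebuilt with the inverse transform.
reconstructed = pywt.idwt2((cA, (cH, cV, cD)), 'haar')
```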

3.4 SVD

SVD stands for singular value decomposition; it is an efficient technique for working with matrices. Applying SVD to an image decomposes it into three matrices [7]: U, S and V, in which S is a diagonal matrix of singular values while U and V are orthogonal matrices [8]. SVD is used in many fields such as image compression and image watermarking. It is an effective tool which works on matrices and converts
these into singular values. SVD can also be used in combination with DCT, DWT and DFT, since SVD alone suffers from the false-positive problem, i.e., an unauthorized person may sometimes obtain the watermark when only SVD is used.
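A minimal sketch of SVD-based additive embedding is given below, assuming NumPy; the scaling factor and the additive rule on the singular values are illustrative and do not correspond to any single reviewed paper.

```python
# Illustrative SVD-based embedding: perturb the cover's singular values.
import numpy as np

def svd_embed(cover, watermark, alpha=0.05):
    U, S, Vt = np.linalg.svd(cover, full_matrices=False)        # S: singular values
    Uw, Sw, Vtw = np.linalg.svd(watermark, full_matrices=False)
    S_marked = S + alpha * Sw                                    # additive rule (assumed)
    return U @ np.diag(S_marked) @ Vt

cover = np.random.rand(256, 256)
watermark = np.random.rand(256, 256)
watermarked = svd_embed(cover, watermark)
```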

4 Various Optimization Techniques

4.1 Genetic Algorithm

One of the most useful nature-inspired algorithms used for optimization is the genetic algorithm. In it, elements encoded as binary strings are optimized up to some fitness value [9]. In this algorithm, the best-fitted solution to the problem is found and then selected for the next step. The genetic algorithm is an optimization technique whose task is to find those values of the input for which the best solution or output is achieved. It first selects a population from the given input, makes its members parents, uses these parents to generate children, and repeats this successively until a population containing the optimal solution is found. It basically involves three steps:
1. Selection: Select the parents from the input population and use them to make children.
2. Crossover: Combine two parents to make children.
3. Mutation: Perform some changes in a parent to form the children.
In [9], GA is used during watermark embedding to find the optimal frequency band, and then the discrete cosine transform (DCT) is applied. In [10], watermarking is performed in the frequency domain rather than in the spatial domain, as it is more robust against various attacks. The watermark is embedded in the low-frequency bands of the wavelet transform domain, which results in low imperceptibility. A blind watermark extraction process using an evolutionary algorithm against rotational attacks is used. A Canny edge detector is used to convert the cover image into two sub-images in [11], and then SVD and genetic algorithms are used. The authors have converted the image into sub-images, which results in a high memory requirement. GA along with Lagrangian support vector regression is used in [12] to provide the imperceptibility, robustness and security characteristics of watermarking. In [13], DCT along with DWT is used for watermarking, and SVD is also used to generate orthogonal matrices to provide security. The Arnold transform of the watermark is calculated before embedding it. To optimize the complete process, GA is used. This work addresses medical applications to provide authentication of the cover image, and a back-propagation neural network is applied to provide robustness. In [14], a DWT scheme is used along with SVD instead of DCT, but because of using SVD there is a problem of false positives. A support vector machine (SVM) is used in [15] to train image blocks of different texture and luminance; the position for embedding the watermark is then selected by GA by finding a scaling factor, and DCT is used to obtain the image in the frequency domain.
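A minimal sketch of the selection, crossover and mutation loop described above is given below; it searches for an embedding scaling factor, and the placeholder fitness function and parameter values are illustrative assumptions, not the settings of any reviewed scheme.

```python
# Illustrative genetic-algorithm loop for tuning an embedding strength alpha.
import random

def fitness(alpha):
    # Placeholder: a real scheme would embed with strength alpha and return a
    # combined imperceptibility/robustness score (e.g., PSNR plus NC).
    return -(alpha - 0.04) ** 2

def genetic_search(pop_size=20, generations=50, mutation_rate=0.2):
    population = [random.uniform(0.0, 0.2) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half of the population as parents.
        parents = sorted(population, key=fitness, reverse=True)[:pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            p1, p2 = random.sample(parents, 2)
            child = 0.5 * (p1 + p2)                      # crossover
            if random.random() < mutation_rate:          # mutation
                child += random.gauss(0.0, 0.01)
            children.append(min(max(child, 0.0), 0.2))
        population = parents + children
    return max(population, key=fitness)

best_alpha = genetic_search()
```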


The comparison of genetic algorithm-based techniques is shown in Table 1.

Table 1 Comparison of various techniques based on genetic algorithm

[9] Techniques used: DCT, GA. Objectives achieved: Good imperceptibility is achieved by using the GA algorithm. Limitations: As DCT alone is used here, it does not provide remarkable imperceptibility.
[10] Techniques used: DWT, GA. Objectives achieved: More robustness than spatial domain techniques is achieved by performing watermarking in the frequency domain. Limitations: Dividing the image into sub-images results in a high memory requirement.
[11] Techniques used: SVD, GA. Objectives achieved: Provides a blind watermarking technique by using SVD to achieve more robustness. Limitations: The algorithm is not time efficient.
[12] Techniques used: LWT, LSVR, GA. Objectives achieved: Provides an imperceptible, robust and secure image watermark. Limitations: The process of encryption is very complex.
[13] Techniques used: DWT, DCT, SVD, Arnold transform, GA. Objectives achieved: Addresses medical images for providing authenticity of the identity of the owner of the image; achieves the robustness, imperceptibility and security objectives of watermarking. Limitations: No work is done to make it time efficient.
[14] Techniques used: DWT, SVD, GA. Objectives achieved: A dual watermarking scheme is used to achieve security and authentication of the image. Limitations: Because of using SVD, there is a problem of false positives.
[15] Techniques used: DCT, SVM, optimized GA. Objectives achieved: An optimized genetic algorithm is used to achieve optimized results, and SVM is used to train the network. Limitations: This scheme does not work well against rotational attacks.

4.2 Artificial Bee Colony (ABC) Algorithm

The concept of the artificial bee colony is introduced in [16], which uses the intelligence of honey bees for optimization. In the ABC algorithm, there are three types of bees, called employed, onlooker and scout bees. Here, the solution of any problem is represented by
the food source, and the nectar amount represents the fitness value of a solution. The employed bees represent the number of possible solutions of the problem and search the solutions repeatedly until some threshold is reached; the onlooker bees compute the fitness of all the solutions and select which one is optimal, while the work of the scout bees is to abandon a particular solution midway in order to find a more optimal one. The ABC algorithm optimizes a cost function by minimizing or maximizing it in the given search space [17]. The scheme described in [18] uses the wavelet domain along with the SVD technique to provide robustness; the ABC algorithm is used to provide optimization. Region selection for watermarking is done on the basis of the human visual system. In [19], the focus of the authors is on all five requirements of watermarking, that is, robustness, reversibility, security, capacity and invisibility, and the Arnold transform is used for security. In [20, 22], DCT along with ABC is used to provide robustness. ABC is used here to evaluate the fitness function; this fitness function measures only the robustness of the watermarked image and does not address the imperceptibility objective. The SVD technique is used in [21], and to remove the false-positive problem, the integer wavelet transform (IWT) has been used for watermarking. The complete procedure is optimized using the ABC algorithm. In [22], DWT along with SVD is used to remove the problem of false positives, and ABC is used for optimizing the whole procedure. The authors have presented a robust method in [23] for high dynamic range images (HDRI). In this scheme, the authors have used DWT and the ABC algorithm for optimization. In [24], the authors have worked on reducing the time taken by the algorithm; to achieve this objective, a machine learning technique has been used along with ABC. However, because of using machine learning for training the network, a huge amount of memory is required. The comparison of ABC algorithm-based techniques is shown in Table 2.

Table 2 Comparison of various techniques based on the ABC algorithm

[18] Techniques used: Wavelet domain, SVD, ABC algorithm. Objectives achieved: Provides good quality to the watermarked image. Limitations: Using SVD results in the false-positive problem.
[19] Techniques used: Arnold transform, Slantlet transform, ABC algorithm. Objectives achieved: Develops a lossless watermarking process by focusing on all five requirements of watermarking, that is, robustness, reversibility, security, capacity and invisibility. Limitations: No work is done to make the process time efficient.
[20] Techniques used: DCT, ABC algorithm. Objectives achieved: Works on improving the robustness of watermarking by limiting the distortion occurring in the image but does not focus on its imperceptibility. Limitations: It does not address the imperceptibility objective.
[21] Techniques used: IWT, SVD, ABC algorithm. Objectives achieved: Achieves robustness and security of watermarking; to remove the false-positive problem of SVD, the integer wavelet transform is used. Limitations: No proper balance between robustness and imperceptibility is achieved.
[22] Techniques used: DWT, SVD, ABC algorithm. Objectives achieved: Works on the imperceptibility feature of watermarking; to remove the false-positive problem of SVD, DWT is used. Limitations: The main focus of the authors is on imperceptibility and not on robustness.
[23] Techniques used: DWT, ABC algorithm. Objectives achieved: Provides robustness and imperceptibility for high dynamic range images. Limitations: No work is done to make the HDRI image secure.
[24] Techniques used: DCT, machine learning, ABC algorithm. Objectives achieved: Reduces the time taken by the watermarking algorithm with the help of machine learning techniques and provides a time-efficient watermarking scheme. Limitations: The memory requirement for this approach is very high.
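For illustration, a highly simplified sketch of the employed, onlooker and scout phases described above is given below for a one-dimensional embedding-strength search; the objective function and all parameter values are assumptions for demonstration only.

```python
# Simplified artificial bee colony loop (employed / onlooker / scout phases).
import random

def objective(x):
    return -(x - 0.05) ** 2                      # placeholder fitness to maximize

def abc_search(n_sources=10, cycles=100, limit=10, low=0.0, high=0.2):
    sources = [random.uniform(low, high) for _ in range(n_sources)]
    trials = [0] * n_sources
    for _ in range(cycles):
        # Employed bees: try a neighbouring solution for every food source.
        for i in range(n_sources):
            k = random.randrange(n_sources)
            cand = sources[i] + random.uniform(-1, 1) * (sources[i] - sources[k])
            cand = min(max(cand, low), high)
            if objective(cand) > objective(sources[i]):
                sources[i], trials[i] = cand, 0
            else:
                trials[i] += 1
        # Onlooker bees: revisit sources with probability proportional to fitness.
        worst = min(objective(s) for s in sources)
        weights = [objective(s) - worst + 1e-9 for s in sources]
        for _ in range(n_sources):
            i = random.choices(range(n_sources), weights=weights)[0]
            k = random.randrange(n_sources)
            cand = sources[i] + random.uniform(-1, 1) * (sources[i] - sources[k])
            cand = min(max(cand, low), high)
            if objective(cand) > objective(sources[i]):
                sources[i], trials[i] = cand, 0
        # Scout bees: abandon sources that have stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i], trials[i] = random.uniform(low, high), 0
    return max(sources, key=objective)

best_strength = abc_search()
```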

4.3 Firefly Algorithm

It uses the concept of how fireflies attract other flies by using their flashing behavior. It is a meta-heuristic approach used for optimization problems. A meta-heuristic is an iterative generation process which guides a subordinate heuristic by intelligently combining different concepts for exploring and exploiting the search space, while learning strategies are used to structure information in order to find near-optimal solutions efficiently [25]. In this approach, the search is performed both from left to right and from top to bottom. Just as flies find their food by perceiving the various gases available in the atmosphere, determining their exact position and reaching there, here the algorithm finds the optimal solution to the problem from the various available solutions on the basis of fitness value. The firefly algorithm is very easy to use and implement, and it can easily be merged with other algorithms as well. DWT, DCT and SVD techniques have been used in [26], and this technique is optimized by using a new scheme called the chaotic firefly algorithm. In [27], the authors have used LWT along with SVD to remove the problem of false positives, and the complete
procedure is optimized using both the bat and firefly algorithms. Work on achieving security of the watermark image is also done here. Predictive modeling using a regression tree is used in [28]; along with this, LWT is used for watermarking, and the firefly algorithm is used for optimization. The authors have worked on DWT and SVD in [29] and use an opposition and dimensional-based firefly algorithm (ODFA) to optimize it, which gives better results than previous schemes. A block-based watermarking scheme is used in [30]. It also uses the Hadamard transform and the distinct discrete firefly algorithm
(DDFA) to achieve the required robustness and imperceptibility. The comparison of various techniques based on the firefly algorithm is shown in Table 3.

Table 3 Summary of various techniques based on firefly and their objectives

[26] Techniques used: Chaotic firefly algorithm with DCT, SVD and DWT. Objectives achieved: The chaotic firefly algorithm increases the efficiency of watermarking. Limitations: The watermark bits are embedded in the LSB, so it performs poorly against JPEG compression.
[27] Techniques used: LWT-SVD, bat algorithm and firefly algorithm. Objectives achieved: Works on color images, and optimized results are achieved by using a combination of the bat and firefly algorithms. Limitations: This scheme is not time efficient.
[28] Techniques used: LWT, regression tree, Fibonacci Q transform and firefly algorithm. Objectives achieved: Improves the robustness, imperceptibility and security of the image and uses the Fibonacci Q transform to achieve security of the watermarked image. Limitations: No work is done to make it time efficient.
[29] Techniques used: DWT, SVD, ODFA algorithm. Objectives achieved: Strong robustness is achieved against various image-processing attacks. Limitations: No work is done to make the watermark image secure.
[30] Techniques used: Hadamard transform, distinct DFA. Objectives achieved: Blocks which are optimal for embedding are selected by using a variation of the firefly algorithm called DDFA to achieve high robustness and imperceptibility. Limitations: High memory requirement.

4.4 Particle Swarm Optimization Algorithm

It is a population-based optimization algorithm based on the behavior of birds and fish when they move in groups. A particle, also known as a candidate, can improve its position by considering inertia, personal influence and social influence [31]. In this algorithm, multiple steps are performed until an optimal solution is found. Here, the birds are the particles, and each particle has a fitness value which depends upon the objective function. Each particle stores its best performance and adjusts its values depending upon the best performance of the group until the best fitness value is found. PSO is one of the most used algorithms for optimization because it is easy to implement and gives good results. The authors in [32] used SVD along with weighted quantum particle swarm optimization to obtain a watermarked image which has good human
visual system (HVS) quality for both DCT and DWT, but the HVS model involves complex calculations, so in [33] the authors have worked on finding the region of interest (ROI) of an image and then used this information for watermarking, though it takes more time. In [34], the authors have proposed modifications to the existing PSO algorithm and named it multi-objective PSOtridist. The authors in [35] have used a DWT- and SVD-based watermarking scheme for color images and optimize it by using dynamic PSO. In [36], a new version of dynamic PSO called guided dynamic PSO is proposed along with DWT and SVD to achieve better robustness. In [37], the lifting wavelet transform and the discrete fractional angular transform have been used for watermarking along with DCT and SVD; that is why it is called a fused watermarking scheme, which provides robust, imperceptible and secure watermarking. In [38], a scheme is proposed which uses the complex wavelet transform (CWT) along with SVD for watermarking, and the PSO and Jaya algorithms have been used for optimizing the process. The comparison of PSO algorithm-based techniques is shown in Table 4.

Table 4 Comparison of various techniques based on PSO

[32] Techniques used: DCT, SVD and weighted quantum particle swarm optimization (WQPSO). Objectives achieved: Uses SVD along with WQPSO to obtain a watermarked image with good human vision. Limitations: The HVS model involves complex calculations.
[33] Techniques used: DWT, SVD, PSO. Objectives achieved: Improves the performance of watermarking by using the concept of region of interest; watermarking is performed on those regions which are not of interest. Limitations: It takes more time.
[34] Techniques used: DWT, SVD and improved multi-objective PSO (MOPSO) algorithm. Objectives achieved: Improves the performance of the existing MOPSO by changing the existing method of selecting the leader and the personal best replacement scheme. Limitations: Because of using SVD, the problem of false positives exists.
[35] Techniques used: DWT, SVD, dynamic PSO. Objectives achieved: Achieves more robustness than traditional PSO by using a variation of it called dynamic PSO. Limitations: The robustness results are not so good.
[36] Techniques used: DWT, SVD and guided dynamic PSO. Objectives achieved: The concept of sharing fitness is used to achieve better performance than DPSO, and this variation of PSO is called GDPSO. Limitations: Not time efficient and not secure.
[37] Techniques used: DCT, SVD, LWT, discrete fractional angular transform, PSO algorithm. Objectives achieved: Removes the false-positive problem of SVD by encrypting the watermark image before embedding it and uses a fusion of various schemes to achieve better performance. Limitations: No work is done to make this scheme time efficient.
[38] Techniques used: Complex wavelet transform, SVD, PSO and Jaya algorithm. Objectives achieved: Performs optimization of the watermarking algorithm with both the Jaya and PSO algorithms and shows that the Jaya algorithm gives better performance than PSO. Limitations: The robustness results of the PSO algorithm are not as good as those of the Jaya algorithm.
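A minimal PSO sketch following the description above is given below; each particle keeps its personal best and is pulled toward the global best, and the fitness function, inertia and acceleration coefficients are illustrative assumptions.

```python
# Illustrative particle swarm optimization loop for a one-dimensional search.
import random

def fitness(x):
    return -(x - 0.05) ** 2                      # placeholder objective to maximize

def pso_search(n_particles=20, iterations=100, w=0.7, c1=1.5, c2=1.5):
    pos = [random.uniform(0.0, 0.2) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]                               # personal best positions
    gbest = max(pos, key=fitness)                # global best position
    for _ in range(iterations):
        for i in range(n_particles):
            r1, r2 = random.random(), random.random()
            vel[i] = (w * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])    # personal influence
                      + c2 * r2 * (gbest - pos[i]))      # social influence
            pos[i] = min(max(pos[i] + vel[i], 0.0), 0.2)
            if fitness(pos[i]) > fitness(pbest[i]):
                pbest[i] = pos[i]
        gbest = max(pbest, key=fitness)
    return gbest

best = pso_search()
```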

5 Conclusion and Future Scope

The concept of digital watermarking gives users the freedom to provide their content online to other users while maintaining its authenticity and copyright protection. Any modification made by an unauthorized person to digital content can easily be found by extracting the watermark embedded in it. Digital watermarking provides copyright protection, tamper proofing and fingerprinting for various types of content such as audio, video or images. Watermarking can be performed in two domains, spatial and frequency. In the spatial domain, pixel values are updated directly to embed the watermark, while in the frequency domain, the image is first converted into the frequency domain and then embedding is performed. One example of spatial watermarking is the least significant bit (LSB) based method, which is very easy to implement, but this method is not secure and also degrades the quality of the cover image; watermarking in the frequency domain is therefore more useful, and most researchers have performed watermarking in this domain. In frequency-domain watermarking, the bits are spread throughout the image, so no one can easily detect the watermark image. In the frequency domain, DCT, DWT, DFT, LWT and SVD techniques are used. Here, various papers are reviewed which are based on frequency-domain watermarking and which use various optimization algorithms. Conclusions can therefore be drawn which can help future researchers select a scheme for optimization. These are as follows:
• Using SVD alone for watermarking is not a good idea, because this scheme suffers from the false-positive problem, i.e., an unauthorized person can access the watermarked image; in such cases, an encrypted watermark should be added rather than the whole watermark to get a good result.
• Watermarking by finding the region of interest is also a good option to increase the robustness of watermarking.




• Entropy is a good option for converting an image into sub-blocks and arranging them in ascending order.
• Machine learning techniques can be used to obtain a time-efficient watermarking algorithm.
• Frequency-domain techniques alone are not sufficient to provide robustness and imperceptibility, so various other techniques should be used to optimize the procedure.
• Various nature-inspired algorithms can be used for optimizing the watermarking process. There are a number of optimization techniques, such as the genetic, artificial bee colony, firefly and particle swarm optimization algorithms, which are used in the watermarking process; among these, the PSO algorithm is the most popular because it is fast and provides better results.
The future scope is to work on various other nature-inspired optimization algorithms to improve the robustness and imperceptibility characteristics of watermarking. Watermarking based on machine learning can also be used for optimizing the process.
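Most of the fitness functions used by the reviewed optimizers combine an imperceptibility measure with a robustness measure. The sketch below (assuming NumPy) shows the two measures that appear most often, PSNR and normalized correlation (NC); the weighting used to combine them is an illustrative assumption.

```python
# Common building blocks of a watermarking fitness function.
import numpy as np

def psnr(original, watermarked, peak=255.0):
    mse = np.mean((original.astype(float) - watermarked.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10 * np.log10(peak ** 2 / mse)

def normalized_correlation(w_original, w_extracted):
    num = np.sum(w_original * w_extracted)
    den = np.sqrt(np.sum(w_original ** 2) * np.sum(w_extracted ** 2))
    return num / den

def fitness(cover, watermarked, w_original, w_extracted, weight=0.5):
    # Higher is better: balances imperceptibility (PSNR) against robustness (NC).
    return (weight * psnr(cover, watermarked)
            + (1 - weight) * 100 * normalized_correlation(w_original, w_extracted))
```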

6 Conflict of Interest

No conflict of interest exists.

References 1. Priya S, Santhi B, Swaminathan P, Raja Mohan J (2017) Hybrid transform based reversible watermarking technique for medical images in telemedicine applications. Optik—Int J Light Electron Opt, pp 1–36 2. Aditi Z, Singh AK, Kumar P (2016) A proposed secure multiple watermarking technique based on DWT, DCT and SVD for application in medicine. Multimedia tools application. Springer, pp 1–20 3. Kansal M, Singh G, Kranthi BV (2012) DWT, DCT and SVD based digital image watermarking. In: 2012 IEEE international conference on computing sciences, pp 77–81 4. Hongcai Xu, Kang X, Wang Y, Wang Y (2018) Exploring robust and blind watermarking approach of colour images in DWT-DCT-SVD domain for copyright protection. Indersci Int. J Electron Secur Digit Forens 10(1):79–96 5. Moosazadeh M, Ekbatanifard G (2017) An improved robust image watermarking method using DCT and YCoCg-R Color space. Int J Light Electron Opt, pp 1–32 6. Dubolia R, Singh R, Bhadoria RS, Gupta R (2011) Digital image watermarking by using discrete wavelet transform and discrete cosine transform and comparison based on PSNR. In: IEEE international conference on communication systems and network technologies, pp 593–596 7. Chang C-C, Tsai P, Lin C-C (2005) SVD-based digital image watermarking scheme. Pattern Recogn Lett, pp 1577–1586 8. Chung K-L, Yang W-N, Huang Y-H, Wu S-T, Hsu Y-C (2007) On SVD-based watermarking algorithm. Appl Math Comput, pp 54–57


9. Shieh C-S, Huang H-S, Wang F-S, Pan J-S (2004) Genetic watermarking based on transformdomain techniques. Pattern Recogn 37, pp 555–565 10. Lee D, Kim T, Lee S, Paik J (2006) Genetic algorithm-based watermarking in discrete wavelet transform domain. Springer, Berlin Heidelberg, pp 709–716 11. Takore TT, Rajesh Kumar P, Lavanya Devi G (2014) A robust and oblivious grayscale image watermarking scheme based on edge detection, SVD, and GA. In: Proceedings of 2nd international conference on micro-electronics, electromagnetics and telecommunications. Springer, pp 51–61 12. Mehta R, Rajpal N, Vishwakarma VP (2015) Robust image watermarking scheme in lifting wavelet domain using GA-LSVR hybridization. Int J Mach Learn Cyber. Springer, pp 1–17 13. Zear A, Singh AK, Kumar P (2016) A proposed secure multiple watermarking technique based on DWT, DCT and SVD for application in medicine. Multimedia Tools Application. Springer, pp 1–20 14. Singh RK, Shaw DK, Jha SK, Kumar M (2017) A DWT-SVD based multiple watermarking scheme for image based data security. J Inf Optimiz Sci, pp 1–17 15. Zhou X, Cao C, Ma C, Wang L (2018) Adaptive digital watermarking scheme based on support vector machines and optimized genetic algorithm. Hindawi mathematical problems in engineering, pp 1–10 16. Dervis K (2005) An Idea Based On Honey Bee Swarm For Numerical Optimization. Erciyes University, Engineering Faculty, Computer Engineering Department, Kayseri/Türkiye, Technical report-Tr06, October, 2005, pp 1–10 17. Ansari IA, Pant M (2016) Multipurpose image watermarking in the domain of DWT based on SVD and ABC. Pattern Recogn Lett, pp 1–12 18. Ali M, Ahn CW, Pant M, Siarry P (2014) An image watermarking scheme in wavelet domain with optimized compensation of singular value decomposition via artificial bee colony. Information Sciences Elsevier, pp 1–20 19. Ansari IA, Pant M, Ahn CW (2016) Artificial bee colony optimized robust-reversible image watermarking. Multimedia tools applications. Springer, pp 1–25 20. Abdelhakim AM, Saleh HI, Nassar AM (2016) A quality guaranteed robust image watermarking optimization with artificial bee colony. Expert systems with applications. Elsevier, pp 1–10 21. Ansari IA, Pant M, Ahn CW (2016) Robust and false positive free watermarking in IWT domain using SVD and ABC. Engineering applications of artificial intelligence. Elsevier, pp 114–125 22. Ansari IA, Pant M (2016) Quality assured and optimized image watermarking using artificial bee colony. Int J Syst Assur Eng Manag, pp 1–13 23. Yazdan Bakhsh F, Moghaddam ME (2018) A robust HDR images watermarking method using artificial bee colony algorithm. J Inf Secur Appl 41: 12–27 24. Abdelhakim AM, Abdelhakim M (2018) Expert systems with applications a time-efficient optimization for robust image watermarking using machine learning, pp 1–35 25. Johari NF, Zain AM, Mustaffa NH, Udin A (2013) Firefly algorithm for optimization problem. Appl Mech Mater 421:512–517 26. Dong H, He M, Qiu M (2015) Optimized gray-scale image watermarking algorithm based on DWT DCT-SVD and chaotic firefly algorithm. In: IEEE international conference on cyberenabled distributed computing and knowledge discovery, pp 310–313 27. Sejpal S, Shah N (2016) A novel multiple objective optimized color watermarking scheme based on LWT-SVD domain using nature based bat algorithm and firefly algorithm. In: 2016 IEEE international conference on advances in electronics, communication and computer technology (ICAECCT) India, pp 38–44 28. 
Kazemivash B, Ebrahimi Moghaddam M (2017) A predictive model-based image watermarking scheme using regression tree and firefly algorithm. Elsevier soft computing, pp 1–16 29. Moeinaddini E, Afsari F (2017) Robust watermarking in DWT domain using SVD and opposition and dimensional based modified firefly algorithm. Springer multimedia tool application, pp 1–23


30. Moeinaddini E (2018) Selecting optimal blocks for image watermarking using entropy and distinct discrete firefly algorithm. Soft computing. Springer, pp 1–23 31. Jino Ramson SR, Lova Raju K, Vishnu S, Anagnostopoulos A (2019) Nature inspired optimization techniques for image processing—a short review. Springer International Publishing AG, part of Springer Nature 2019, pp 113–145 32. Soliman MM, Hassanien AE, Onsi HM (2015) An adaptive watermarking approach based on weighted quantum particle swarm optimization. The natural computing applications forum, pp 1–13 33. Shih FY, Zhong X, Chang I-C, Satoh S (2017) An adjustable-purpose image watermarking technique by particle swarm optimization. Multimedia tools application. Springer, pp 1–20 34. Saxena N, Mishra KK (2017) Improved multi-objective particle swarm optimization algorithm for optimizing watermark strength in color image watermarking. Springer, pp 1–20 35. Saxena N, Mishra KK, Tripathi A (2018) DWT-SVD-based color image watermarking using dynamic-PSO. Advances in intelligent systems and computing, pp 343–351 36. Zheng Z, Saxena N, Mishra KK, Kumar Sangaiah A (2018) Guided dynamic particle swarm optimization for optimizing digital image watermarking in industry applications. Fut Gener Comput Syst, pp 1–46 37. Zhou NA, Luo AW, Zou WP (2018) Secure and robust watermark scheme based on multiple transforms and particle swarm optimization algorithm. Multimedia tools application. Springer, pp 1–17 38. Thakkar FN, Srivastava VK (2018) Performance comparison of recent optimization algorithm Jaya with particle swarm optimization for digital image watermarking in complex wavelet domain. Multidimensional systems and signal processing. Springer, pp 1–23

Geospatial Knowledge Management-Fresh Fuel for Banking and Economic Growth? Anupam Mehrotra

Abstract Over the last decade and a half, economists have grown sizably in number as a class of users of geospatial technology (GT) and data to innovate, evolve and design new ways of measuring and managing growth. Sensory images from satellites are easily available in the public domain, and there has emerged a class of sophisticated remote sensing users in the field of economics. Their growing understanding and application of geospatial technology have revolutionized banking and the economic world. The development is welcome as it goes deeper and enables a renewed look at the growth dynamics to re-energize the global economy, particularly when the traditional growth drivers seem exhausted and fading out. However, the data explosion brought in by the surge in geospatial technology will remain largely untapped in the absence of a parallel growth in the art of knowledge management, particularly in converting the raw data into useful information and applicable knowledge for its practical use in banking and economic growth. Keywords Geospatial technology (GT) · Knowledge management (KM) · Geographic information system (GIS) · Satellite data · Night light · Non-performing asset (NPA) · Area of interest (AOI)

1 Introduction

Global economic growth has been facing a setback of late on account of numerous challenges on the economic and geopolitical fronts. Geospatial technology and the use of big data are beginning to emerge as a vital force providing enormous potential and fresh possibilities for leveraging knowledge and reviving economic growth. By providing otherwise inaccessible information with a wide geographic coverage





Fig. 1 Economic impact of geospatial industry surges to USD 2,210.70 Billion [3] Source A report by A. Narain

and exceptionally high spatial resolution, geospatial technology and data can revitalize the engines of growth and bring the global economy back on track. After making the common man's life much easier through telecommunication [1] and networking, in another round of developments, geospatial technology and big data can do wonders in the spheres of responsive urban planning, effective land use, forest cover, agricultural production [2], environmental upgradation and even banking and finance. When the traditional and normal streams of data available in developing countries generally fail to keep up with the fast-growing population and the changing pattern of resource utilization, geospatial data and the resultant knowledge can fill the gap and offer novel solutions. Satellite images can show economies growing or shrinking on a real-time basis. It sounds amazing that, a few hundred miles above the Earth, a satellite is helping to change our understanding of economic life below on the Earth. It is providing us with ways of measuring growth by processing and interpreting the data it generates, and with options to accelerate growth using the supplied clues in a rational and scientific manner. The need of the hour is to efficiently manage the knowledge explosion by responding to these inputs and vertically shifting the course of the future growth trajectory. A. Narain, in his report titled "Economic impact of the geospatial industry surges to USD 2,210.70 Billion", presents the facts as reproduced below [3] (Fig. 1). The present research paper highlights in the introductory part the gaps in the existing research in terms of the interpretation and application of geospatial data in comprehending and resolving the issues of economic growth in general. The literature review confirms this gap, particularly the gap in addressing the issue of deteriorating asset quality in the banking sector by leveraging the geospatial technology that has been gaining momentum of late. The subsequent sections sequentially describe the basics of geospatial technology and satellite data and the meaning and significance of managing the knowledge emerging in the geospatial space for the benefit of economic growth through revitalization of the banking sector, which in any case is the backbone of any economy. The paper proceeds to highlight the growing application of geospatial data in the sphere of economics and looks forward to making the newly discovered ground the hotbed for a fresh round of innovations in the banking and economic space. The paper concludes with an emphasis on the need for effective management of the knowledge discovered in the geospatial space through trained human resources and expertise, and on leveraging it for a renewed vigor in the otherwise slowing economic growth process.



2 Literature Review

Blake (2007) discusses how the agricultural communities of Arizona and California are impacted by a single package combining preventive maintenance services for farm equipment with the geo-fencing facilitated by global positioning satellites [2]. Bhat et al. (2018) attempt to explore the causal relationship between customer knowledge management, satisfaction, trust and loyalty with specific reference to retail banking [4]. Cham et al. (2016) examine and discuss both the technical and social aspects as determinants of the success of any knowledge management system (KMS). They also examine the interrelationship between the success of any KMS and user satisfaction [5]. Al Hussaini et al. (2015) establish a strong correlation between knowledge management and operational risk in any banking system and conclude that KMS can effectively reduce operational risk in banks [6]. Nanda (2016) describes how banks have adopted knowledge management practices to enhance their efficiency, customer experience and competitive advantage over peer banks [7]. Alpha Beta Strategy × Economics (2018), in their report titled "The Economic Impact of Geospatial Services: How Consumers, Businesses and Society Benefit from Location based Information", put the estimate of the revenue generated by geospatial services at US$400 billion per year. The total economic contribution of geospatial services as estimated in terms of consumer benefits is US$550 billion, the improvement in revenues and costs is to the extent of 5%, direct and indirect job creation is to the tune of 4 and 8 million, respectively, and many other benefits, like reduced carbon emissions through GPS-guided navigation, are to the extent of 5% [8] (Fig. 2).

3 Basics of Geospatial Technology and Satellite Data

The science of remote sensing and of interpreting its results has grown phenomenally since the latter half of the nineteenth century, when high-level images were rare objects recorded by cameras attached to high-flying balloons, pigeons or kites. Over the last decade and a half, economists have grown their understanding of and dependence on remotely sensed images and the resultant data to decode economic phenomena, prescribe solutions and make calculated predictions for the future. GT is predominantly used for digital surveying and mapping, and it enables the collection, processing, visualization and interpretation of geo-physical data pertaining to target areas of interest (AOI) [9]. Sensory images from satellites are easily available in the public domain, and there has emerged a class of remote sensing users in the field of economics. Remote sensing satellites operate in two orbits: geostationary and sun-synchronous.



Fig. 2 Location-based information benefiting consumers and the society [8]. Source A report by Alpha Beta Strategy × economics, 2018, “The economic impact of geospatial services: how consumers, businesses and society benefit from location based information”

Geostationary satellites orbit the Earth in such a way that they constantly stay above a fixed point on the Earth and constantly monitor the same area, like the weather monitoring satellites dedicated to a specific geography. Such satellites have the twin disadvantages of not being able to observe the rest of the world and of being located at such a high altitude, about 35,000 km above the Earth's surface, that the images they send are of relatively lower resolution. The sun-synchronous satellites, also known as polar satellites, observe the whole Earth at almost the same time each day, as the satellite passes over any given point of the planet's surface at the same local mean solar time. Such satellites can orbit the Earth 7–10 times each day depending on their altitude, which is around 6000 km or less above the Earth's surface. The images captured by such satellites are of relatively high resolution and make more sense for accurate interpretation. The area covered by a satellite during its orbit is called the swath, which is the width of the sensor's vision as it looks down at the Earth. The wider the swath, the fewer orbits are needed to cover monitoring of the whole Earth. A remote sensing satellite may have one or more sensors generating multiple independent data streams. Sensors having spectral resolution may capture data from hundreds of bands. Sensors are mostly passive, meaning that they capture and observe energy waves or radiation reflected off the Earth on account of the energy hitting the Earth from the Sun. Active sensors, on the other hand, such as radar and LiDAR, emit radiation and record the properties of the consequent radiation reflected back by the Earth, like the transit time, frequency, etc. [10]. Data, in order to be useful, need to be transformed into information, which calls for successive intensive and extensive processing from raw data to subsequent higher-level data before they find a place in application by social scientists. A satellite not only looks at the locations it passes over and records images from a right angle but also views around and captures images at different angles. Out of the multiple
overlapping swathes, the data of interest need to be culled out. Such data may at times be combined in a process known as mosaicking to reduce inaccuracies and random errors. There may be a direct conversion of the observed quantities for individual pixels to physical quantities of interest, such as night lights, greenness, elevation, degree of concentration, temperature, etc. Many other applications demand the classification of pixels into a discrete set of categories before they may be usefully employed. The classification is guided by decades of specialization in the art and science of remote sensing and computers. Advanced and state-of-the-art classification systems depend on machine learning techniques [10]. By and large, a geographic information system (GIS) based on geospatial technology is many times faster and much cheaper than the traditional methods of surveillance and data collection [11]. Although it is priceless, the real economic value of geospatial output will elude professionals in the absence of more sophisticated tools for measuring its real value, which remains hidden on account of free-of-charge delivery and packaging with other products in many cases [4].
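As an illustration of the machine-learning-based pixel classification mentioned above, the following sketch (assuming scikit-learn and NumPy; the band values and class labels are synthetic stand-ins for real labelled training pixels) trains a classifier on labelled pixels and applies it to a whole scene.

```python
# Illustrative supervised pixel classification of multi-band satellite data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row is one pixel described by its spectral bands (e.g., R, G, B, NIR).
train_pixels = np.random.rand(1000, 4)
train_labels = np.random.randint(0, 3, size=1000)   # e.g., 0=water, 1=urban, 2=forest

clf = RandomForestClassifier(n_estimators=100)
clf.fit(train_pixels, train_labels)

# Classify a full scene: reshape (height, width, bands) to (pixels, bands).
scene = np.random.rand(256, 256, 4)
predicted = clf.predict(scene.reshape(-1, 4)).reshape(256, 256)
```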

4 Knowledge Management—What is it all about?

Knowledge management (KM) is an exercise undertaken by modern organizations to identify, conserve, develop, retain and deploy knowledge and its application for the learning of employees across the organization, with the intent of maximizing the knowledge-based growth of the organization. This end can be achieved by transforming the organization into a "learning organization". Knowledge management practices may be introduced as a solution to meet business challenges such as gaining a larger market share, stemming a fall in profits, improving employee efficiency and productivity, gaining competitive advantage over peers [7] or even improving asset quality in banks. The scope of activities performed under the banner of knowledge management will depend on the motive behind the knowledge management programme. Knowledge management opens up the scope for multi-disciplinary approaches to organizational strategies for achieving their ultimate objectives. It provides clues about gathering, collating, analyzing and transforming knowledge into highly classified information for use by organizations and for making the maximum and scientific use of the knowledge strewn around us. Business process re-engineering, competitive intelligence, quality management, leveraging the staff's and the organization's core competencies, data warehousing and data mining, managing intellectual capital, supply chain management, customer relationship management, enterprise content management, enterprise resource planning, etc., are all components of knowledge management and come within its scope. The scope is expanding and tending to cover the latest developments on the technological front that may be leveraged for organizational growth and the achievement of its objectives. The onset and application of geospatial technology has further widened the scope and potential of knowledge management enormously. This is a relatively unexplored area of knowledge management and holds out promises for the growth and development of
the industrial and economic growth in general and some important segments of the economy, like banking, in particular. The far-reaching developments in the world of geospatial technology have created an ideal platform for the interdisciplinary wedding of these two branches of science and have exposed multiple dimensions of knowledge management for exploration: developing the capability to meaningfully interpret the information spewed out by geospatial services on the one hand and, on the other, leveraging the discovered geospatial knowledge for economic growth and development, which in many cases will be routed through the banking system.

5 Growing Dependence of Economic Studies on Satellite Data

Economists are increasingly drawing upon the data derived from satellites and sensors to carry out their research and to prove or disprove postulates. Over the last decade and a half, economists have grown their understanding of and dependence on remotely sensed images and the resultant data to decode economic phenomena, prescribe solutions and make calculated predictions for the future. The rise in the number of satellites has contributed to research initiatives in developing economies, which are often faced with inhibiting data limitations as compared to their developed counterparts. This revolution is fueled by similar trends in related technologies, like faster computing techniques, and by the ceaseless efforts of remote sensing scientists producing a stream of improved satellites and data interpretation methods. Satellite data have been primarily used in areas like urban land cover, agricultural land use, measuring economic activity, weather forecasts, beaches, forest cover, mineral deposits, airborne pollution, fish abundance, electricity use, elevation, terrain roughness, etc. Some prime areas deserving elaboration are summed up below.
Data on urban planning: Data claiming to provide a holistic picture of urban land use, demographics and services often lag behind and fail to portray the prevailing characteristics of the rapidly growing population, technological evolution, demographic transition and transforming land use on account of the non-availability of real-time information. The gap is filled by geospatial services based on geographic information systems, divulging the truth on a real-time basis [12]. The Spatial Development Framework for 2040 developed for Johannesburg makes use of geospatial data to highlight and address issues related to poverty, inequality, land abuse, housing shortages, over-congestion, poor walkability, etc., and assists the city planners in working on future development options [1].
Night light reflection as a proxy of economic activity: The visible light reflection from the Earth's surface at night (luminosity at night) is increasingly considered indicative of economic activity, patterns of consumption decisions, income categories of the population and the level of economic development. The proxy relationship is strong and consistent and is increasingly used to estimate or moderate the GDP data
of countries [10]. A study of the night light pattern was made to understand how the spatial distribution of economic activity evolved within North Korea during a period of economic sanctions by several nations but countered by China [13]. Night light data may also be used as business opportunity indicators for the provision and development of banking and financial services in the identified areas for their sustained growth (Fig. 3). For various sociopolitical and diplomatic reasons, governments tend to manipulate their socioeconomic statistics, mainly the data on the growth of gross domestic product, the level of poverty, the extent of calamities, etc. (droughts, famines, epidemics and the like), for vested interests like procuring international aid and financial assistance, including soft loans, or to exaggerate or downplay their impact depending on the situation. Statistics on GDP growth serve as a barometer of the performance of the government, and the latter is sometimes inclined to inflate the figures to impress. With reliable indicators like night light data, comparison with other geographies is possible to arrive at a fair estimate of growth in GDP [15], and the government's dissemination of misleading and deceptive data on such parameters is subject to scrutiny and cross-check [16]. Satellite images are capable of portraying, beyond doubt or debate, which countries are growing and which are shrinking economically, and that too on a real-time basis, without having to wait for the time-consuming collection, analysis and interpretation of data at the ground level [17].
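A minimal sketch of how night-light imagery can be turned into a simple activity proxy is given below; it assumes the rasterio package and a calibrated night-lights GeoTIFF for the area of interest, and the file name and noise threshold are hypothetical.

```python
# Illustrative night-light aggregation as a crude economic-activity proxy.
import numpy as np
import rasterio

with rasterio.open("nightlights_aoi.tif") as src:   # hypothetical file name
    radiance = src.read(1).astype(float)             # first band: light intensity

lit = radiance > 0.5                                  # ignore background noise (assumed)
total_light = radiance[lit].sum()                     # "sum of lights" proxy
lit_share = lit.mean()                                # fraction of pixels that are lit
print(total_light, lit_share)
```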

Fig. 3 Night light distribution in USA [14]. Source Bleakley and Lin (2012); the figure shows night lights in different parts of USA



Climate and weather studies and their impact on economic variables: Short-term weather fluctuations and long-term climate changes have had a remarkable impact on human history and civilization. The innovative attributes of weather as derived from satellite imagery have increasingly been studied with interest and used in economics for understanding the impact of cyclones, droughts, floods, climatic changes, etc., on human settlements over time. The knowledge about these variables has a strong bearing on useful predictions about changes in agricultural land use, crop yields and mutations, changes in cropping choices, future growth, income, food sufficiency and the need for imports. It also paves the way for concepts like precision farming with geo-informatics techniques [18] (Fig. 4).
Depleting forest cover and environmental degradation and their economic repercussions: Satellite imagery is of immense significance in providing an authentic estimate of forest cover and of the environmental degradation on account of deforestation over time. Using change detection software, the previous and current maps are compared to identify the physical changes that may have taken place in the AOI during reference time spans and to measure the degree of decline in the forest cover. Management of environmental knowledge about depleting forest cover, global warming,

Fig. 4 Yield changes due to climatic changes [14]. Source Costinot et al. (2016), predicted yield changes due to climate change for (a) wheat and (b) rice, based on global agro-ecological zones



Fig. 5 Findings of a survey on depleting forest cover [19]. Source Research journal of environmental sciences

melting of glaciers, rising sea level, threat to biodiversity and growing abnormalities in precipitation can help in understanding the current status and designing workable strategies to avert environmental disasters. The concept of sustainable growth is dependent on an appreciation of the environmental cost of economic growth. It advocates a judicious balance between growth and environmental exploitation and cautions against a drive beyond sustainable levels. Investments in renewable energy and green projects are sheer economic decisions that flow from such a realization. The figure reproduced below reflects the changes in forest cover during 1978–2006 in different classes in Chakrashila Wildlife Sanctuary, Assam, India (Fig. 5). With the entry of private sector players into the satellite sensing industry, which is yet another economic consequence of the popularity of geospatial services, there is a continuous decline in the cost of launching satellites. The economically available high-resolution satellite images and their use have invigorated innovative studies in economics just as in other fields [14].

6 Geospatial Technology—The Incubation Pot of a Fresh Round of Banking Innovations

An integration of knowledge creation, knowledge sharing and knowledge acquisition in any organization, particularly in banking, can impart a competitive edge to banks and improve their overall operational efficiency. Remote sensing and geospatial technology flowing from geographic information systems (GIS) have scope for wide application in the operational aspects of commercial banking. The knowledge generated through GT, which at present finds only token application and is used peripherally in retail banking, may be leveraged in the spheres of performance budgeting, agricultural banking, industrial finance, infrastructure finance, customer relationship



management, risk management and even the follow-up of NPAs and insight into area-specific recovery strategies. Banks operate in wide and diverse geographies with differing business goals for different regions. With a deeper understanding of the geo-physical features of an area, rather than basing their performance budgets on mere past trends, banks may target a more realistic business performance aligned with stark ground realities. The level of activity and the status of stock storage in large warehouses, the progress of work in infrastructure projects and construction activities spread over vast stretches that are challenging to monitor and supervise, and the monitoring and effective surveillance of mortgaged securities can all be handled more effectively with geospatial technology at the disposal of the banks. With the availability of high-resolution three-dimensional maps of standing crops, banks may have a much better estimate of the expected yield and variations therein as compared to the last few seasons, which would assist banks in arriving at a more accurate assessment of probable losses, if any, and the required provisioning [9] (Fig. 6).

Fig. 6 Customer relationship management [20]. Source Techtarget.com (2018)



Fig. 7 Leveraging geospatial technology for crop surveys [9]. Source Panchapakesan, R.V., leveraging geospatial tech in banking

The potential benefit of geospatial knowledge management is immense in comparison with its current limited application and deserves fuller exploitation. Required tweaking and customization of the technology may lead to producing more specific and ready to use knowledge for management of the credit and operational risks in the banking sector [7]. With around $150 billion non-performing assets, the Indian banking industry is faced with arguably the worst ever asset quality crisis in the history of independent India. Contribution of the geospatial technology and the benefits accruing from harnessing this stream of knowledge independent of all other sources are unmatched in meeting the asset quality challenges and in building up a sizeable portfolio of good quality assets in banks. The cost involved is comparatively small, just a fraction of the amount at stake, and is becoming increasingly affordable. In the picture below, a drone is shown assessing the extent of crop damage to enable the banks for non-performing assets (NPA) provisioning [9] (Fig. 7). A large number of bank branches networked to provide any time anywhere banking to the customers work on GIS technology giving a centralized online real-time experience. The technology is also used in market analysis, customer management, asset management and disaster management by facilitating preparedness by way of forecasting, early warning, risk mitigation, quick response and disaster recovery.

7 Conclusion and Future Task: Filling the Knowledge Management Gaps in Application of Geospatial Technology

The application of artificial intelligence, machine learning and robotics in banking was a case of IT knowledge management in banking. The next level is the application



and management of geospatial knowledge in banking which is the backbone of any economic system. Advancements in ICT have brought the world closer in which banks need to be market sensitive, market driven and market responsive. There is a need to develop the proficiency to predict and provide the services and products customers need [4]. This calls for extensive and intensive management of locationbased knowledge involving customer profile through which they can edge ahead of the competition. Geospatial technology can provide the banks with locationspecific knowledge which offers a host of advantages hard to gain from other data types. Location-based geospatial knowledge is an asset to the financial industry as it can provide the wealth of area-specific demographic data on a single map. It can open the gates for strategic analysis, customer behavior analysis, market penetration studies, location placement and expansion decisions, delivery routes and tracking of competitors’ activities. For institutions like Institute for Development and Research in Banking Technology (IDRBT) in India, it is coming up as a fresh challenge to approach the issue of customer satisfaction and delight from the geospatial knowledge management perspective particularly against the backdrop of the emergence of a well-informed, discerning and value sensitive class of customers [5]. An in-depth study of the demographic patterns, standard of living and the preferences of the targeted clientele in any specific geography can give tremendous insight to the banks to cater to their customers better. Given the tremendous potential of the geospatial technology, its application in banking and economics of any country will, however, need concerted efforts in the direction of efficient management of the knowledge generated by the geospatial technology. Customizing the information output and enabling the interpretation of data are the challenges to be met. In the process of geospatial knowledge management, there is a long chain of intermediaries involved right from identification of the proxy data, their collection and processing to their final application. Social scientists seldom use raw satellite data until they are extensively processed from Level 0 data to Level 2 or Level 3 data to render them noise-less, objective, comparable and interpretable. Classifying and utilizing such data take decades of specialization in the art and the science of remote sensing and computers. There is a need to fill the human resources and expertise gap in geospatial technology which has emerged as a branch of knowledge evolving at a rapid pace and demanding its efficient management for opening up the treasures of unexplored opportunities.

References 1. Wang X (2016) How Geo-spatial technology can help cities plan for a sustainable future. https://blogs.worldbank.org/sustainablecities/how-geospatial-technology-can-help-cit ies-plan-sustainable-future 2. Blake C (2007) GPS technology keeps maintenance data, offers geo-fencing security on equipment. Western Farm Press 29(27):2. Retrieved from https://search.ebscohost.com/login.aspx? direct=true&db=e6h&AN=27945077&site=ehost-live



3. Narain A (2018) Economic impact of geospatial industry surges to US$ 2210.7 Bn. Available at https://www.geospatialworld.net/blogs/economic-impact-of-geospatial-industry/ 4. Bhat SA, Darzi MA, Parrey SH (2018) Antecedents of customer loyalty in banking sector: a mediational study. Vikalpa J Decis Makers 43(2):92–105 5. Cham TH, Lim YM, Cheng BL, Lee TH (2016) Determinants of knowledge management systems success in thebanking industry. VINE J Inf Knowl Manage Syst 46(1):2–20 6. AlHussaini W, Karkoulian S (2015) Mitigating operational risk through knowledge management. J Int Manage Stud 15(2):31–40 7. Nanda S (2016) Role of knowledge management in Indian banking sector. Int J Res Bus Manage (IMPACT: IJRBM) 4(7). ISSN (P): 2347-4572; ISSN (E): 2321-886X, Jul 2016, 37– 44© Impact Journals, available at https://www.scribd.com/document/318899359/The-role-ofknowledge-management-in-Indian-banking sector 8. A Report by Alpha Beta Strategy x Economics (2018) The economic impact of geospatial services: how consumers, businesses and society benefit from location based information. https://storage.googleapis.com/valueoftheweb/pdfs/GeoSpatial%2520FA_Pages-compre ssed%2520%25282%2529.pdf 9. Panchapakesan RV (2019) Leveraging geo-spatial tech in banking. https://www.thehindubusi nessline.com/opinion/leveraging-geospatial-tech-in-banking/article26260681.ece 10. Donaldson D, Storeygard A (2018) Rise of the satellites: data revolutions in development economics research. https://voxdev.org/topic/methods-measurement/rise-satellites-datarevolutions-development-economics-research 11. Lee S (2018) Empowered planning with models, satellite and machine learning. https://www. energyforgrowth.org/memo/empowered-planning-with-models-satellites-machine-learning/ 12. Henderson JV, Regan T, Venables A (2017) Building the city: urban transition and institutional frictions. CEPR discussion paper 11211 13. Lee YS (2018) international isolation and regional inequality: evidence from sanctions on North Korea. J Urban Econ 103:34–51 14. Donaldson D, Storeygard A (2016) The view from above: applications of satellite data in economics. J Econ Perspect 30(4):171–198 15. Vernon Henderson J, Adam S, Weil DN (2012) Measuring economic growth from outer space. Am Econ Rev Am Econ Assoc 102(2):994–1028 16. Kopf D (2018) Satellite statistics reveal which countries cheat on their economic statistics. Available at https://qz.com/1277011/satellite-images-reveal-which-countries-cheat-on-theireconomic-statistics 17. Kearns J (2015) Bloomberg businessweek,satellite images show economies growing and shrinking in real time. Available at https://www.bloomberg.com/news/features/2015-07-08/ satellite-images-show-economies-growing-and-shrinking-in-real-time 18. Rilwani ML, Ikhuoria IA (2006) Precision farming with geo-informatics: a new paradigm for agricultural production in a developing country. Trans GIS 10(2):177–197. https://doi.org/10. 1111/j.1467-9671.2006.00252.x 19. Kumar D (2011) Research article titled “Monitoring forest cover changes using remote sensing and GIS: a global prospective. Available at https://scialert.net/fulltextmobile/?doi=rjes.2011. 105.123. Accessed on 07.08.19 20. Techtarget.com (2018) Guide to customer experience management best practices, technologies. https://searchcustomerexperience.techtarget.com/definition/customer-experience-man agement-CEM-or-CXM

Neural Network and Pixel Position Shuffling-Based Digital Image Watermarking Sunesh Malik and Rama Kishore Reddlapalli

Abstract This paper proposes a new optimized watermarking method that incorporates the concepts of entropy, pixel position shuffling, and neural networks to enhance imperceptibility, security, and resistance against attacks. Firstly, the proposed watermarking system preprocesses the host image with the discrete wavelet transform and extracts the low-frequency components to gain robustness. The extracted low-frequency components are further decomposed into blocks and then transformed into discrete cosine transform (DCT) coefficients for inserting the watermark information. The watermark bits are implanted only into selected DCT blocks, and the DCT blocks are selected through their entropy values. The selection of DCT blocks by entropy helps in strengthening the imperceptibility of the proposed system. The watermark extraction process is optimized by employing neural network characteristics, which enhances the resistance against attacks. The traits of pixel position shuffling are also utilized in the proposed watermark embedding process to ensure the security of the proposed scheme.

Keywords Digital watermarking · Artificial neural networks · Discrete wavelet transform · Entropy · Pixel position shuffling · Discrete cosine transform

1 Introduction With the advancement of information and communication technology, large amounts of digital data are transferred over the Internet. Digital data can be easily copied, manipulated, stored, or deleted by users or attackers; this flexibility of digital media also facilitates unauthorized duplication and manipulation. To deal with unauthorized duplication and manipulation of digital media, digital watermarking


comes out as a vital tool. Digital watermarking imparts author information inside the digital content for authentication and copyright protection [1–4]. Digital watermarking may be classified in different ways, such as by domain, by how the watermark is extracted, by human perception, by media, and more. The digital watermarking classification is shown in Fig. 1. On the basis of domain, digital watermarking methods may be segregated into spatial domain-based and transform domain-based watermarking techniques. A spatial domain-based watermarking technique implants the

Fig. 1 Digital watermarking classification


watermark information by altering the pixel values of the host image, whereas a transform domain-based watermarking scheme implants the watermark by transforming the pixel values of the original image [5]. Spatial domain-based image watermarking techniques are less robust than transform domain-based techniques [6, 7]. Based on human perception, watermarking is divided into two parts: visible and invisible watermarking schemes. In a visible image watermarking scheme, the watermark can be perceived by human eyes after embedding; if human eyes are unable to perceive the watermark after embedding, the scheme is known as invisible watermarking. Based on the type of media on which watermarking is employed, watermarking is classified into four types, namely text, image, audio, and video [3]. Based on how the watermark is extracted, watermarking techniques are categorized into three types: non-blind, semi-blind, and blind watermarking [8]. If the host image and the embedding procedure are required to extract the watermark, the method is considered non-blind. If only partial information is required to extract the watermark, it is considered semi-blind. If no information regarding the host image or any additional information is needed for extraction, it is considered a blind method. Mostly, digital image watermarking systems have been characterized in terms of robustness, unambiguity, perceptual quality, and security. With this frame of reference, different watermarking methods came into existence in the past but were unable to cope with the demands of the growing digital world. So, watermarking algorithms still have scope for improvement in security, perceptual quality, and robustness. The performance improvement of a digital watermarking system in terms of security, resistance against attacks, and perceptual quality is taken as an optimization problem. A watermarking algorithm can be optimized by applying the traits of artificial neural networks (ANN) and genetic algorithms (GA) [9]. The present paper utilizes the traits of artificial neural networks to optimize the proposed watermarking system and gain resistance against attacks. In this paper, a new and unique digital image watermarking algorithm has been proposed by consolidating pixel position shuffling with entropy and neural networks in the DWT-DCT domain. The proposed scheme implants the watermark bits in selected DCT coefficients of the LL sub-band. The adoption of entropy and pixel position shuffling enhances the imperceptibility and security of the proposed watermarking method. On the other hand, the neural networks strengthen the resistance of the proposed scheme against different attacks. The concepts of pixel position shuffling and entropy are discussed in the related work section. The rest of the paper is framed as follows. Section 2 presents the research methodology for the proposed method. Section 3 provides a brief review of the digital image watermarking techniques reported in the past. Section 4 presents the proposed watermark embedding work, and Sect. 5 discusses the watermark extraction process. Section 6 presents the discussion and analysis of the proposed work. Finally, Sect. 7 gives the concluding remarks of the paper.


2 Research Methodology In this paper, various watermarking systems are studied for improving their performance. Figure 2 shows the research roadmap that has been followed in this paper. The research starts with the study of watermarking systems, namely domain-based, human perception-based, media-based, and watermark extraction-based watermarking systems. The characteristics of watermarking are overviewed. Then, DWT- and DCT-based watermarking systems are identified as the most promising approach for watermark implantation; they keep the system simple and provide good results. However, there is a need to optimize the performance of digital watermarking techniques. The traits of artificial neural networks and genetic algorithms are utilized for the optimization of the watermarking system. Here, watermarking based on artificial neural networks is studied in detail, as discussed in Sect. 3. Further, various techniques like entropy and position shuffling are discussed to strengthen the performance of digital image watermarking. Finally, the authors came up with a new proposed watermarking algorithm and extraction algorithm based on the previous studies reported in the literature.

Fig. 2 Research roadmap


3 Literature Review This section reviews the concepts of watermarking, entropy, and pixel position shuffling. Firstly, watermarking based on neural networks is examined. Afterward, the concept of entropy and pixel position shuffling is discussed.

3.1 Artificial Neural Networks in Watermarking The biological neural network inspired the concept of artificial neural networks (ANN). Initially, an ANN was used in watermarking by Hwang et al. [10] to improve the security and resistance of the watermarking technique in the discrete cosine transform domain; in this work, the authors utilized a backpropagation neural network (BPNN). However, the method did not cope well. Davis and Najarian [11] proposed another watermarking technique based on the discrete cosine transform and an ANN, in which the neural network automates the process of determining the maximum strength of the watermark before it becomes visible to the human eye; in other words, the neural network was used to determine whether the watermark in the image is visible or not. Bansal and Bhadauria [12] proposed a watermarking technique for copyright authentication in which the watermark is inserted and extracted by a specific full counterpropagation neural network (FCNN). In this technique, the authors store the watermark in the synapses of the FCNN, which helps in achieving robustness against attacks. Liu and Jiang [13] proposed another digital image watermarking scheme based on an RBF neural network that determines the maximum watermark insertion intensity in the DCT coefficients. Huang et al. [14] reported a blind watermarking scheme utilizing the characteristics of a BPNN, in which the neural network helps in learning the association between the watermark and the watermarked image. Chen and Chen [15] also presented a blind digital watermarking method hinged on the concepts of the backpropagation neural network, error-correcting codes, and a chaotic sequence. The backpropagation neural network is utilized to learn the correlation between the culled wavelet samples and a processed chaotic sequence, and the watermark is imparted by modifying selected wavelet coefficients. The chaotic sequence is required during the extraction process and also helps in making the system robust. Agarwal and Mishra [16] presented image watermarking based on a fuzzy BP network that generates a single output which is utilized to embed two different watermarks in the discrete cosine transform domain. Gao and Jiang [17] also proposed a robust image watermarking method based on a BP neural network which embeds watermark information in the discrete cosine transform coefficients of the LWT domain. Hui et al. [18] reported a digital watermarking scheme in the DCT domain with an artificial neural network, in which the neural network is utilized for detection and extraction of the watermark data; the method was more robust in comparison with traditional methods. Dutta et al. [19] presented a feed-forward


BPNN-based watermarking scheme in the discrete cosine transform domain. For neural network training, a spiral encoding scheme is utilized. In this scheme, the neural network needs to be trained only once and can then be employed to implant the watermark data in any secret image. The watermark extraction process also exploits the trained network. Yahya et al. [20] reported a digital watermarking technique in the DWT domain with a probabilistic neural network which memorizes the association between the watermark information and the watermarked image. By learning this association, the probabilistic neural network is employed to optimize the extraction process. Singh et al. [21] presented a multiple watermarking scheme with a backpropagation neural network for securing social network content that embeds both a text watermark and an image watermark. In this scheme, the text watermark is implanted in the wavelet domain, and the image watermark is inserted in the SVD domain of the discrete cosine transform coefficients of the LL3 band. The robustness of the proposed scheme is enhanced by employing the backpropagation neural network. Movaghar and Bizaki [22] also proposed a watermarking scheme in the DWT-SVD domain utilizing the traits of an artificial neural network to accomplish perceptual quality and resistance against attacks. The ANN optimizes the embedding block selection and predicts the optimal block for watermark embedding; therefore, it makes the proposed method robust and imperceptible. Mamatha and Venkatram [23] also presented an LWT and backpropagation neural network-based watermarking method for minimizing computational complexity while gaining robustness. In the last few years, the concept of CNNs has also been employed in the digital image watermarking field to increase resistance against different attacks [24, 25]. In this regard, Mun et al. [24] presented a blind watermarking scheme with a convolutional neural network (CNN) using an iterative learning framework to enhance the robustness; in this scheme, the watermark bits are embedded by using the CNN. Kandi et al. [25] proposed a non-blind watermarking scheme to enhance imperceptibility and robustness by using a learning-based auto-encoder convolutional neural network. The proposed method is imperceptible and provides security.

3.2 Entropy and Pixel Position Shuffling The concept of entropy is incorporated with image watermarking in order to enhance the perceptual quality of the digital image. Fundamentally, entropy is an important tool used for characterizing the image texture. Entropy determines the randomness, and randomness exemplifies the texture statistics of the image. Entropy is calculated by the following equation [26–28], where Ri denotes the probability of occurrence of the ith gray level:

E = -\sum_{i} R_i \log_2(R_i)    (1)
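For illustration, the following is a minimal Python sketch (an assumption of this write-up, not code from the cited works) of computing Eq. (1) for an image block from its normalized gray-level histogram and of ranking 8 × 8 blocks by entropy; the bin count and block size are illustrative choices:

import numpy as np

def block_entropy(block):
    # Eq. (1): E = -sum_i Ri * log2(Ri), with Ri the probability of gray level i
    hist, _ = np.histogram(block, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                      # empty bins contribute nothing (0 * log 0 = 0)
    return -np.sum(p * np.log2(p))

def rank_blocks_by_entropy(image, size=8):
    # Return (entropy, (row, col)) for every non-overlapping size x size block,
    # sorted so that the highest-entropy blocks come first
    h, w = image.shape
    blocks = [(block_entropy(image[r:r + size, c:c + size]), (r, c))
              for r in range(0, h - size + 1, size)
              for c in range(0, w - size + 1, size)]
    return sorted(blocks, reverse=True)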

The concept of pixel position shuffling readjusts all the positions of pixels in an image and encrypts the image. Pixel position shuffling can be achieved by exploiting


various concepts like the Sudoku puzzle [29], chaos systems, R-prime shuffle [30], the Fibonacci series [31–33], random shuffling [34], and many more. In this paper, a few of them are discussed. Digital image scrambling based on chaos can be accomplished by shuffling among the pixel positions [35–39]. Yu et al. [35] reported a digital image scrambling technique that employs the concepts of the logistic map and simplified DES together; the faster computation speed of the scrambling process is accomplished by the utilization of simplified DES. Dong et al. [36] reported a digital image scrambling scheme based on chaos theory and sorting transformation; the usage of sorting transformation along with chaos theory makes the system more secure and effective. Prasad et al. [37] reported an image scrambling technique which scrambles the image with the randomness of the Henon map. Shou et al. [39] presented an image scrambling scheme for color images which encrypts the RGB image through pixel coordinate transformation by exploiting the traits of a chaotic sequence. Diaconu et al. [38] also reported image scrambling with the Knight's moving rule and a chaotic map, which scrambles the RGB image by performing transformations between the RGB components. Apart from the Knight's moving rule, the Sudoku puzzle has also been used by researchers to perform scrambling. In this regard, Zou et al. [29] proposed a color image scrambling scheme which uses the traits of an N×N Sudoku puzzle; the scheme scrambles the image at the pixel and bit level. Kekre et al. [30] reported an image scrambling scheme with relative prime shuffle which accomplishes scrambling at the block level; the technique decomposes the image into different blocks and performs shuffling by using different relatively prime numbers. On the premise of the above-stated papers, the present paper combines the concepts of entropy, pixel position shuffling, and artificial neural networks and outlines a new robust, secure, and imperceptible watermarking method.
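As a simple illustration of the pixel position shuffling idea (a key-seeded random shuffle, one of the strategies surveyed above, and not the exact scheme of any cited paper), the following sketch scrambles a 1-D array of pixel values or coefficient positions and restores the original order with the same secret key:

import numpy as np

def shuffle_positions(values, key):
    # Scramble positions with a permutation derived from a secret key
    rng = np.random.default_rng(key)
    perm = rng.permutation(len(values))
    return values[perm], perm

def unshuffle_positions(shuffled, perm):
    # Invert the permutation to restore the original order
    restored = np.empty_like(shuffled)
    restored[perm] = shuffled
    return restored

# Usage: the embedder and extractor derive the same permutation from a shared key
data = np.arange(16, dtype=float)
scrambled, perm = shuffle_positions(data, key=1234)
assert np.array_equal(unshuffle_positions(scrambled, perm), data)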

4 Proposed Watermarking Algorithm This section portrays the proposed digital image watermark embedding work. Figure 3 depicts the proposed embedding flow diagram. In this watermark embedding procedure, the original host image is preprocessed with the discrete wavelet transform, and the LL sub-band is extracted to gain robustness against attacks. The extracted LL sub-band is processed further with DCT, and entropy is considered to select the embedding locations to improve imperceptibility. Then, the selected locations are further shuffled to increase the security of the watermark. The proposed digital image watermarking algorithm is given below in steps.
Step 1. Input: Read the input host image and watermark image.
Step 2. Host Image Preprocessing: Apply DWT to the host image and extract the LL sub-band for watermark embedding.


Fig. 3 Proposed embedding flow diagram

(Flow diagram stages: Read input image → Pre-processing → Block division and entropy calculation → DCT computation → Pixel position shuffling → Watermark embedding → Train neural network → DCT inversion → Generation of watermarked image)

Step 3. Block Division and Entropy: Split the extracted LL sub-band into non-overlapped blocks of size 8 × 8 and compute the entropy of each block. Sort the blocks according to entropy value and select the first n blocks.
Step 4. DCT Computation: Apply the discrete cosine transform on the selected blocks and create the dataset for watermark embedding.
Step 5. Pixel Position Shuffling: Apply position shuffling on the created dataset to increase the security of the watermark embedding algorithm and select the locations of the blocks in which to embed the watermark.
Step 6. Watermark Embedding: Embed the watermark into the Mth row and Nth column of each shuffled discrete cosine transform block.
Step 7. Neural Network Training: Train or save the data for the neural network.
Step 8. Inversion of DCT: Apply inverse DCT on the processed blocks and recombine the image.
Step 9. Watermarked Image Generation: Apply inverse DWT to get the watermarked image.
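To make these steps concrete, a hedged Python sketch of Steps 2-9 is given below. The wavelet ('haar'), the embedding position (M, N) = (4, 4), the additive embedding rule, and the strength value alpha are illustrative assumptions, since the description above leaves these choices open; the sketch relies on the PyWavelets and SciPy libraries, and the neural network training of Step 7 is omitted here.

import numpy as np
import pywt
from scipy.fft import dctn, idctn

def _entropy(block):
    hist, _ = np.histogram(block, bins=64)
    p = hist[hist > 0] / hist.sum()
    return -np.sum(p * np.log2(p))

def embed_watermark(host, wm_bits, key, m=4, n=4, alpha=10.0, size=8):
    # Step 2: DWT of the host image; keep the low-frequency LL sub-band
    LL, (LH, HL, HH) = pywt.dwt2(host.astype(float), 'haar')
    # Step 3: split LL into 8 x 8 blocks and rank them by entropy
    coords = [(r, c) for r in range(0, LL.shape[0] - size + 1, size)
                     for c in range(0, LL.shape[1] - size + 1, size)]
    coords.sort(key=lambda rc: _entropy(LL[rc[0]:rc[0] + size, rc[1]:rc[1] + size]),
                reverse=True)
    selected = coords[:len(wm_bits)]
    # Step 5: shuffle the order in which blocks receive bits, using a secret key
    order = np.random.default_rng(key).permutation(len(selected))
    for bit, idx in zip(wm_bits, order):
        r, c = selected[idx]
        # Step 4: DCT of the block; Step 6: embed one bit at position (m, n)
        B = dctn(LL[r:r + size, c:c + size], norm='ortho')
        B[m, n] += alpha if bit else -alpha          # illustrative embedding rule
        # Step 8: inverse DCT of the modified block
        LL[r:r + size, c:c + size] = idctn(B, norm='ortho')
    # Step 9: inverse DWT gives the watermarked image
    return pywt.idwt2((LL, (LH, HL, HH)), 'haar')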

5 Proposed Extraction Algorithm This section presents the procedure of watermark extraction. The optimization of the watermark extraction process is carried out through artificial neural networks to improve the resistance of the watermarking algorithm against attacks. Watermark extraction follows almost the same procedure as the embedding process.
Step 1. Load the dataset for training and read the watermarked image.
Step 2. Apply the discrete wavelet transform on the watermarked image and select the LL sub-band for further processing.


Step 3. Split the LL sub-band into non-overlapped blocks of size 8 × 8 and compute the entropy of each block. Select the blocks based on entropy and compute the DCT of each block.
Step 4. Apply position shuffling on the selected discrete cosine transform dataset. Convert the entire dataset into a 1-D array.
Step 5. Use the trained model to extract the watermark.
Step 6. Generate the extracted watermark.
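The description above does not fix a particular network architecture for the trained model. Purely as an illustration, the sketch below uses a small multilayer perceptron from scikit-learn to learn the association between the selected, shuffled DCT coefficients of each block and the embedded watermark bit; the feature layout and network size are assumptions of this sketch.

import numpy as np
from sklearn.neural_network import MLPClassifier

def train_extractor(block_features, wm_bits):
    # block_features: (num_blocks, num_coefficients) array of DCT coefficients
    # wm_bits: the 0/1 watermark bit embedded in each block
    net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    net.fit(block_features, wm_bits)
    return net

def extract_watermark(net, block_features):
    # Apply the trained model to the blocks of a (possibly attacked) watermarked image
    return net.predict(block_features)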

6 Discussions and Analysis The main purpose of the present paper is to propose a new and unique DWT-DCT-based digital image watermarking method. So, this section presents a theoretical comparative analysis of the proposed method with existing methods in Table 1. Table 1 clearly shows that the proposed method combines entropy, pixel position shuffling, and artificial neural networks in image watermarking. The security of the proposed watermarking system is ensured by implementing pixel position shuffling between the DCT coefficients. The imperceptibility of the proposed watermarking method is assured by the concept of entropy: the watermark bits are embedded in the DCT blocks with the highest entropy values, which makes the proposed method imperceptible. In the case of robustness, implanting the watermark bits in the low-frequency components helps in gaining resistance against attacks. In addition, the proposed extraction process is optimized by artificial neural networks in order to enhance the resistance against different types of attacks. The optimization of the proposed image watermarking method also helps in improving the performance. In this way, the proposed image watermarking method can accomplish the imperceptibility, robustness, and security objectives. From this theoretical discussion, it can be analyzed that the proposed DWT-DCT-based image watermarking scheme can attain good imperceptibility, robustness, and security in comparison with the existing schemes.

Table 1 Theoretical discussion and analysis

Proposed method: Combines pixel position shuffling with entropy and artificial neural networks in a DWT-DCT domain-based watermarking system. The key objective of the proposed method is to achieve imperceptibility and robustness along with security.
[6]: In this method, an image watermarking scheme is designed in the 2-level DWT domain. In addition, a variable visibility factor is employed for watermark insertion in the low-frequency components.
[26]: In this method, the author presented an entropy-based image watermarking scheme in the DWT-SVD domain.
[40]: In this, the author presented a DCT-based image watermarking scheme which maintains resistance against image processing attacks.


7 Conclusion In the present paper, work associated with entropy, pixel position shuffling, and neural networks has been examined, and a new optimized, robust, and secure watermarking algorithm has been proposed. In the proposed algorithm, an endeavor has been made to combine pixel position shuffling with entropy and a neural network in the DWT-DCT domain of image watermarking. The unification of entropy, pixel position shuffling, and the neural network improves the performance of the digital watermarking system in terms of imperceptibility, robustness, and security. Some of the outcomes of the proposed DWT-DCT image watermarking method are listed below:
• The neural network optimizes the DWT-DCT-based watermark extraction process. The optimization is accomplished by exploiting the trained model in the extraction process. The concept of the neural network is considered for attaining resistance against attacks.
• In the proposed method, the pixel position shuffling concept is employed for securing the proposed DWT-DCT-based watermarking system.
• The concept of entropy makes the system imperceptible due to the entropy property of degree of randomness.
• The combination of DWT and DCT also helps in maintaining the robustness of the proposed watermarking system.

8 Future Work Future work includes implementing the proposed watermarking algorithm presented in this paper, and testing and studying the proposed method.

References 1. Kannan D, Gobi M (2015) An extensive research on robust digital image watermarking techniques: a review. Int J Sig Imaging Syst Eng 8:89–104 2. Agarwal N, Singh AK, Singh PK (2019) Survey of robust and imperceptible watermarking. Multimed Tools Appl 78:8603–8633 3. Thanki RM, Kothari AM (2017) Digital watermarking: technical art of hiding a message. In: Intelligent analysis multimedia information. IGI Global, pp 431–466 4. Chen TH, Chang CC, Wu CS, Lou DC (2009) On the security of a copyright protection scheme based on visual cryptography. Comput Stand Interfaces 31:1–5 5. Potdar VM, Han S, Chang E (2005) A survey of digital image watermarking techniques. In: 2005 3rd IEEE international conference industrial informatics, 2005. INDIN'05, pp 709–716 6. Rita C, Parmar G (2016) A robust image watermarking technique using 2-level discrete wavelet transform (DWT). In: 2nd international conference on communication control and intelligent systems (CCIS), pp 120–124 7. Walia E, Singh C, Suneja A (2015) Computationally efficient rotation invariant discrete cosine transform-based semi-blind watermarking technique. Int J Sig Imaging Syst Eng 8:286–297


8. Kishore RR (2018) Sunesh: optimized and secured digital watermarking using entropy. Chaotic Grid Map Perform Anal 12:451–458 9. Mehta R, Rajpal N, Vishwakarma VP (2015) Sub-band discrete cosine transform-based greyscale image watermarking using general regression neural network. Int J Sig Imaging Syst Eng 8:380–389 10. Hwang MS, Chang CC, Hwang KF (2000) Digital watermarking of images using neural networks. J Electron Imaging 9:548–556 11. Davis KJ, Najarian K (2001) Maximizing strength of digital watermarks using neural networks. In: IJCNN’01 international joint conference neural networks proceedings (Cat. No. 01CH37222), pp 2893–2898 12. Bansal EA, Bhadauria SS (2005) Watermarking using neural network and hiding the trained network within the cover image. J Theor Appl Inf Technol 4:663–670 13. Liu Q, Jiang X (2005) Design and realization of a meaningful digital watermarking algorithm based on RBF neural network. In: 2005 international conference neural networks brain, pp 214–218 14. Huang S, Zhang W, Feng W, Yang H (2008) Blind watermarking scheme based on neural network. In: 2008 7th world congress intelligent control automation, pp 5985–5989 15. Chen Y, Chen J (2010) A novel blind watermarking scheme based on neural networks for image. In: IEEE international conference information theory information security (ICITIS), pp 548–552 16. Agarwal C, Mishra A (2010) A novel image watermarking technique using fuzzy-BP network. In: Sixth international conference intelligent information hiding multimedia signal processing, pp 102–105 17. Gao G, Jiang G (2010) Grayscale watermarking resistant to geometric attacks based on lifting wavelet transform and neural network. In: 8th world congress intelligent control automation, pp 1305–1310 18. Chang, Hui Y, Wan Li F, Hong Z (2011) The digital watermarking technology based on neural networks. In: 2011 IEEE 2nd international conference computer control industrial engineering (CCIE), pp 5–8 19. Dutta. J., Basu, S., Bhattacharjee, D., Nasipuri, M.: A neural network based image watermarking technique using spiral encoding of DCT coefficients. In: Proc. Int. Conf. Front. Intell. Comput. Theory Appl, pp 11–18 (2013) 20. Yahya AN, Jalab HA, Wahid A, Noor RM (2015) Robust watermarking algorithm for digital images using discrete wavelet and probabilistic neural network. J King Saud Univ Inf Sci 27:393–401 21. Singh AK, Kumar B, Singh SK, Ghrera SP, Mohan A (2018) Multiple watermarking technique for securing online social network contents using back propagation neural network. Futur Gener Comput Syst 86:926–939 22. Movaghar RK, Bizaki HK (2017) A new approach for digital image watermarking to predict optimal blocks using artificial neural networks. Turkish J Electr Eng Comput Sci 25:644–654 23. Mamatha P, Venkatram N (2016) Watermarking using lifting wavelet transform (LWT) and artificial neural networks (ANN). Indian J Sci Technol 9:1–7 24. Mun SM, Nam SH, Jang HU, Kim D, Lee HK (2017) A robust blind watermarking using convolutional neural network. arXiv Prepr. arXiv1704.03248 25. Kandi H, Mishra D, Gorthi SRKS (2017) Exploring the learning capabilities of convolutional neural networks for robust image watermarking. Comput Secur 65:247–268 26. Singh G, Goel N (2016) Entropy based image watermarking using discrete wavelet transform and singular value decomposition. In: 3rd International conference computer sustainable globally development, pp 2700–2704 27. 
Huertas R (2013) GA texture characterization based on grey-level co-occurrence matrix, In: International conference information management science, pp 375–378 28. Malik S, Reddlapalli RK (2019) Histogram and entropy based digital image watermarking scheme. Int J Inf Technol 11:373–379


29. Zou Y, Tian X, Xia S, Song Y (2011) A novel image scrambling algorithm based on Sudoku puzzle. In: 2011 4th international congress image signal processing (CISP), pp 737–740 30. Kekre HB, Sarode T, Halarnkar P (2014) Image scrambling using R-prime shuffle on image and image blocks. Int J Adv Res Comput Commun Eng 3 31. Zou W, Huang J, Zhou C (2010) Digital image scrambling technology based on two dimension Fibonacci transformation and its periodicity. In: 2010 International Symposium information science engineering (ISISE), pp 415–418 32. Zou J, Ward RK, Qi D (2004) A new digital image scrambling method based on Fibonacci numbers. In: Proceedings 2004 international symposium circuits system 2004. ISCAS’04, pp III--965 33. Zhou Y, Agaian S, Joyner VM, Panetta K (2008) Two Fibonacci p-code based image scrambling algorithms. In: Image processing algorithms system VI, p 681215 34. Liping S, Zheng Q, Bo L, Jun Q, Huan L (2008) Image scrambling algorithm based on random shuffling strategy. In: 3rd IEEE conference industrial electronics applications 2008. ICIEA 2008, pp 2278–2283 35. Yu XY, Zhang J, Ren HE, Xu GS, Luo XY (2006) Chaotic image scrambling algorithm based on S-DES. J Phys Conf Ser, p 349 36. Xiangdong LIU, Junxing Z, Jinhai Z, Xiqin H (2008) Image scrambling algorithm based on chaos theory and sorting transformation. IJCSNS Int J Comput Sci Netw Secur 8:64–68 37. Prasad M, Sudha KL (2011) Chaos image encryption using pixel shuffling with henon map. Dimension 1:50625 38. Diaconu AV, Costea A, Costea MA (2014) Color image scrambling technique based on transposition of pixels between RGB channels using Knight’s moving rules and digital chaotic map. Math Probl, Eng 39. Shou, Dong L, Hui X (2012) A new color digital image scrambling algorithm based on chaotic sequence. In: 2012 international conference computer science serving system (CSSS), pp 922– 925 40. Tewari TK, Saxena V (2010) An improved and robust DCT based digital image watermarking scheme. Int J Comput Appl 3:28–32

Hybrid Optimized Image Steganography with Cryptography Vineet Nandal and Parvinder Singh

Abstract This paper proposes a hybrid optimized steganography algorithm for color images that combines LSB and parity checker steganography techniques with AES, Blowfish, RC2 and RC4 cryptography, a combination of steganography and cryptography intended to improve the security of information hiding, using peak signal-to-noise ratio (PSNR) as the performance metric. The hybrid optimized steganography algorithm with cryptography is better than traditional image steganography methods, with more security and a greater PSNR value.

Keywords Hybrid steganography · Image steganography · PSNR

1 Introduction Steganography [1] is the art of secret communication. Its purpose is to hide the very presence of communication, as opposed to cryptography, whose goal is to make communication unintelligible to those who do not possess the right keys. We propose a hybrid optimized image steganography technique which combines the LSB and parity checker steganography techniques with the AES, Blowfish, RC2 and RC4 cryptography techniques to get the most optimal results. The system will encrypt the message with AES, Blowfish, RC2 and RC4 separately and will hide each encrypted message in the cover image using the LSB and parity checker techniques. The system will then choose the stego image with maximum PSNR (optimization). The system will hide the combination of cryptography and steganography technique thus used in the last pixels of the image. AES, Blowfish, RC2 and RC4 are cryptographically secure algorithms. The cipher text obtained has a high degree of randomness. The randomness may impact the PSNR of


images, as images tend to have certain patterns and generally lack entropy. But for the same input, AES, Blowfish, RC2 and RC4 produce outputs with different degrees of randomness, and their randomness may coincide with the entropy of the image. The more such coincidences, the fewer bit flips are required, and the number of bit flips is inversely related to PSNR. Based on the entropy of the image and the information to be hidden, we may get different PSNR values in the result. For the same message, different PSNR values can be achieved using different steganography techniques; similarly, for different messages, different PSNR values can be achieved using the same steganography technique. Hence, this hybrid approach of choosing the cipher text which results in the higher PSNR improves the strength of steganography and also improves the quality of the output image.
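Since PSNR is the metric used throughout this paper, its standard definition is sketched below in Python for reference (this is the usual formulation, not code taken from the authors):

import numpy as np

def psnr(cover, stego, max_val=255.0):
    # Peak signal-to-noise ratio in dB between the cover and stego images
    mse = np.mean((cover.astype(float) - stego.astype(float)) ** 2)
    if mse == 0:
        return float('inf')          # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)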

2 Literature Survey Emam et al. [2] proposed an image steganography algorithm based on the spatial domain. The secret message is embedded at random pixel locations of the cover image selected using a PRNG, instead of embedding sequentially in the pixels of the cover image. Zhou et al. [3] proposed an LSB color image steganography using a secret passkey, combining steganography and cryptography, improving the human eye visual features and giving a higher PSNR value and better steganography performance. C. P. Sumathi et al. describe imperceptibility, embedding capacity and robustness as the major determinants of the effectiveness of a steganography algorithm. Juneja and Sandhu [4] proposed an improved LSB image-based steganography with better security of information. An improved embedding algorithm hides encrypted messages in distant and random pixels in edges and smoother areas of the image: the message is encrypted in the beginning, and then edges are detected in the cover image using an improved edge detection filter; encrypted bits are then embedded in the LSBs of randomly selected edge area pixels, and in the LSBs of the RGB components of randomly selected pixels (using a PRNG) across smooth areas of the image. Rajkamal and Zoraida (2014) developed a technique of image steganography using Hash-LSB with the RSA algorithm. It provided a greater degree of security to the hidden data along with the steganography technique: it uses a hash function to generate a pattern for hiding data bits in the LSBs of the RGB pixel values of the cover image, and encrypts the data before embedding it into the carrier image.

3 Proposed Algorithm The system will encrypt the message with AES, Blowfish, RC2 and RC4 separately and will hide each encrypted message in the cover image using the LSB and parity checker techniques. The system will then choose the stego image with maximum


PSNR (optimization). The system will hide the combination of cryptography and steganography technique thus used in the last pixels of the image. The combination codes (bits to hide in the last pixels of the image) are as follows:

Combination         Bits to hide in last pixels of image
LSB + AES           000
LSB + Blowfish      001
LSB + RC2           010
LSB + RC4           011
Parity + AES        100
Parity + Blowfish   101
Parity + RC2        110
Parity + RC4        111

Pseudo code for the proposed system
Step 1: Input text.
Step 2: Input password.
Step 3: Input image.
Step 4: Encrypt the text with AES, Blowfish, RC2 and RC4 separately.
Step 5: Hide each encrypted text in the image (steganography) separately.
Step 6: Calculate the PSNR of all the stego images.
Step 7: Select the image with the highest PSNR and specify the algorithm used (AES, Blowfish, RC2 or RC4) in the last 3 pixels of the image.
Step 8: Output.
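A minimal Python sketch of the selection loop in this pseudo code is given below. The ciphers and embedders mappings are hypothetical placeholders for the four encryption routines (for example, from PyCryptodome) and the LSB and parity checker embedding routines; the LSB embedder shown, the dictionary ordering, and the 3-bit code layout (matching the combination table above) are assumptions of this sketch, not the authors' exact implementation.

import numpy as np

def lsb_embed(cover, bits):
    # Hide the bit stream in the least significant bits of the flattened cover image (uint8)
    stego = cover.copy().reshape(-1)
    stego[:len(bits)] = (stego[:len(bits)] & 0xFE) | bits
    return stego.reshape(cover.shape)

def psnr(cover, stego):
    # Same definition as the PSNR sketch in Sect. 1
    mse = np.mean((cover.astype(float) - stego.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)

def best_stego(cover, message, ciphers, embedders):
    # ciphers: ordered dict name -> encrypt(message) returning bytes (AES, Blowfish, RC2, RC4)
    # embedders: ordered dict name -> embed(cover, bits) (LSB, parity checker)
    best = None
    for ci, (cname, encrypt) in enumerate(ciphers.items()):
        bits = np.unpackbits(np.frombuffer(encrypt(message), dtype=np.uint8))
        for ei, (ename, embed) in enumerate(embedders.items()):
            stego = embed(cover, bits)
            code = (ei << 2) | ci          # 3-bit combination code from the table above
            score = psnr(cover, stego)
            if best is None or score > best[0]:
                best = (score, cname, ename, code, stego)
    # The 3-bit code in best[3] would then be written into the last pixels of best[4]
    return best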

4 Results Efficiency here is defined as the percentage increase in PSNR value using the hybrid approach among the AES, Blowfish, RC2 and RC4 algorithms as compared to when only a single algorithm is used:

Efficiency = ((PSNR1 - PSNR2) / PSNR1) × 100    (1)

The Lenna image was used as the cover image, and multiple plain texts were hidden using the multiple combinations of the LSB and parity checker steganography techniques with the AES, Blowfish, RC2 and RC4 cryptography techniques to obtain the results. Parity checker with RC4 gave the maximum PSNR in most cases. The hybrid system outperforms any solo system and is always bound to give at least the same efficiency in the worst case (Tables 1, 2, 3 and 4).

Table 1 PSNR values of all compared techniques in Fig. 1

Algo              PSNR
LSB-AES           83.02
LSB-Blowfish      85.5
LSB-RC2           84.83
LSB-RC4           86.18
Parity-AES        83.5
Parity-Blowfish   86.51
Parity-RC2        85.68
Parity-RC4        87.4

Table 2 PSNR values of all compared techniques in Fig. 2

Algo              PSNR
LSB-AES           82.82
LSB-Blowfish      83.33
LSB-RC2           83.62
LSB-RC4           84.68
Parity-AES        83.28
Parity-Blowfish   83.92
Parity-RC2        84.25
Parity-RC4        85.5

Table 3 PSNR values of all compared techniques in Fig. 3

Algo              PSNR
LSB-AES           83.07
LSB-Blowfish      83.56
LSB-RC2           82.72
LSB-RC4           84.68
Parity-AES        83.45
Parity-Blowfish   83.28
Parity-RC2        83.74
Parity-RC4        84.99

Table 4 PSNR values of all compared techniques in Fig. 4

Algo              PSNR
LSB-AES           80.38
LSB-Blowfish      81.86
LSB-RC2           81.5
LSB-RC4           82.49
Parity-AES        80.67
Parity-Blowfish   82.18
Parity-RC2        81.67
Parity-RC4        82.97

Fig. 1 Comparison of AES, Blowfish, RC2, RC4 with LSB and parity checker technique
Fig. 2 Comparison of AES, Blowfish, RC2, RC4 with LSB and parity checker technique
Fig. 3 Comparison of AES, Blowfish, RC2, RC4 with LSB and parity checker technique
Fig. 4 Comparison of AES, Blowfish, RC2, RC4 with LSB and parity checker technique

5 Conclusion A hybrid image steganography technique is proposed which combines the best of both cryptography and steganography. The resultant stego image thus generated is of maximum PSNR value, which increases the effectiveness of steganography and secrecy of data. This system can be further extended with additional cryptography and steganography techniques which can further increase the performance of this system.

6 Future Scope This system can be further extended with additional cryptography and steganography techniques which can further increase the performance of this system.

References
1. Suman G, Anuradha P (2013) RSA algorithm and LSB steganography. Int J Eng Res Technol (IJERT) 02(10)
2. Emam MM, Aly AA, Omara FA (2016) An improved image steganography method based on LSB technique with random pixel selection. Int J Adv Comput Sci Appl 7(3):361–366
3. Zhou X et al (2016) An improved method for LSB based color image steganography combined with cryptography. In: 2016 IEEE/ACIS 15th international conference on computer and information science (ICIS). IEEE
4. Juneja M, Sandhu PS (2013) An improved LSB based steganography technique for RGB color images. Int J Comput Commun Eng 2(4):513

TxtLineSeg: Text Line Segmentation of Unconstrained Printed Text in Devanagari Script Rupinder Pal Kaur, M. K. Jindal, and Munish Kumar

Abstract Most reports, either printed or handwritten, comprise significant data that can be helpful in the future. Paper generally decays with time, which can cause loss of data, partially or completely. Optical character recognition is the process used to preserve information from paper for further processing. Text line segmentation is a significant phase in character recognition because incorrectly divided text lines can cause errors in the recognition stage. In this paper, single-column and multi-column documents from different books, magazines and newspapers printed in Devanagari script have been considered. As a result of the low quality of paper in a few documents and the unpredictability and complexity of these documents (background noise, paper decay due to aging, short lines, justified lines, distorted text lines), automatic text line segmentation remains an open research field. In this article, the authors have presented a new technique for unconstrained text line segmentation of Devanagari text using a combination of headline detection and median calculation of text line heights.

Keywords Books · Magazines · Newspapers · Line segmentation · Headline detection · Median calculation



1 Introduction Devanagari is one of the ancient scripts in India. Handwritten or printed documents in Devanagari script are used by a vast population and are a source of enormous information. Old books and newspapers provide information related to past history and culture. From newspaper data, anyone can compare any current event or data with previous ones. Precious Devanagari literature is kept in books, but as paper decays with time due to acidic reaction with the environment, data on paper can be unsafe. The solution to these problems was provided by researchers with the advent of optical character recognition, which can convert data from paper into a computer-processable form. OCR works in many phases, namely pre-processing, segmentation (line segmentation, word segmentation and character segmentation), feature extraction, classification, and post-processing. After pre-processing, line segmentation is a necessary and important step for text recognition with higher accuracy. Word and character segmentation accuracy largely depends upon line segmentation accuracy [1]. Many hurdles can be faced in line segmentation of documents scanned from books, magazines and newspapers, which are discussed in Sect. 4. It is also observed in the literature that most of the documents used in various research papers are single-columned. Multi-columned documents are not included in the database of any research papers. Moreover, the datasets used for experimentation in the papers in the literature were of a few lines, and results of line segmentation were not shown on full-page text [2–6]. So, this has motivated the authors to propose an efficient algorithm for line segmentation of a complete text page. In this paper, line segmentation results on single-column as well as multi-column full-page documents from different sources like books, magazines, and newspapers have been presented. Samples of the documents from books, magazines, and newspapers are shown in Fig. 1. A new technique using a combination of headline detection and median calculation of text line heights has been proposed for the purpose. Documents scanned from books, magazines, newspapers, etc., consist of short text lines, justified text lines, etc., in which headlines of text lines cannot be detected with a pixel density threshold value. Line segmentation based on average line height is one of the most widely used methods for segmenting lines [7–9], but the accurate average line height cannot be measured due to non-detection of headlines in a few text lines. In this paper, line segmentation is performed in single-column as well as multi-column documents. These documents are taken from newspapers, magazines, and books. Newspaper articles generally consist of title text lines and body text that describes the event. Book or magazine documents may also consist of title text lines and body text in columns. Title text lines vary in font and size and hence need to be segmented before line segmentation of the body text. All documents are scanned at 300 dpi resolution. From the scanned image, a bitmap image is created for further processing. No pre-processing techniques have been applied to the images. It is assumed that the documents used in this paper for experimentation are already separated from the title block; line segmentation is performed only in the body text.


Fig. 1 Sample documents from different sources: a a document from a book, b a document from a magazine, c a document from a newspaper


Fig. 1 (continued)

The presented work has been divided into sections. Section 1 gives the introduction to the topic and the motivation for the work. An intensive literature review and comparative analysis have been carried out in Sect. 2. To better understand the segmentation process in any script, one must know the characteristics of that script, so the characteristics of Devanagari script have been described in Sect. 3. In Sect. 4, various problems that can pose difficulties in line segmentation have been elaborated. Different phases in


OCR are discussed in Sect. 5. The proposed algorithm and its results are presented in Sect. 6. Finally, the accuracy achieved and the conclusions are presented in Sect. 7.

2 Related Work Many research articles have addressed text segmentation of various scripts. Pal and Chaudhuri [10] presented a complete OCR for the printed Devanagari script. They divided the text line into three zones or stripes for segmentation, and the projection profile was used for segmentation. A method for Devanagari text documents based on detection of the 'Shirorekha' was proposed by Chaudhuri and Pal [11]. They assumed that the skew in headlines of text lines shows the skewness of the whole document. Bounding boxes were drawn using connected component analysis. They estimated the width of the bounding box based on mean and standard deviation to remove any graphics, bounded paragraphs, graphs, or other stray characters that could cause problems in skewness estimation. After estimating the skewness, lines were segmented based on the skew angle. Line, word and character segmentation in Gurumukhi and Devanagari scripts was performed by Kumar and Sengar [2] using vertical and horizontal profiles. The document under consideration for experimentation was of 8 lines in Devanagari script, and the document in Gurumukhi script was of 15 lines. Lines in the documents were clean, without any overlapping, over-segmentation, or under-segmentation. A headline detection and connected component analysis method for segmenting lines in Devanagari documents was proposed by Ramteke and Rane [12]. The sample document considered for experimentation was a bank cheque. The image is cropped to extract text lines. A vertical histogram was used for segmenting words and characters. The total number of lines in the document was three. The segmentation accuracy in this paper depended upon proper writing, i.e., non-overlapped characters, space within words and characters, and proper presence of the 'Shirorekha' in a word. The segmentation gave 98% accuracy for words and 97% accuracy for characters. A method based on piece-wise projection profiles on Devanagari documents was proposed by Garg and Garg [3]. To segment the lines, the document was divided into eight equal-sized vertical strips. Segmentation in each strip was done by using the white space gap between the lines. The results were combined to get segmented lines. Overlapped and touched text lines were not segmented using the proposed method. If a space between a diacritic and the headline existed, lines were also not segmented accurately, due to over-segmentation. The number of lines in the sample document was six. The accuracies reported were 93.9% when the document was divided into 6 strips, 91.1% when the document was divided into 7 strips and 92.8% when the document was divided into 6 strips. Line segmentation in Devanagari documents based on the average line height of text lines was proposed by Shukla and Banka [7]. The headline of all text lines had been detected based on a fixed threshold value. The height of all text lines had been calculated from headline to headline, and the average of all the heights was calculated. Based on the average line height, segmentation points were measured. Experiments had been carried out on Devanagari script as well as on printed Indian scripts. The sample document in


the experimentation consisted of ten lines. Assuming the height of a text line to be between 20 and 40 pixels, a method for segmenting skewed lines was proposed by Malgi et al. [13]. According to the authors, the height of a text line can be up to 25 pixels. Assuming these values, headlines and baselines were detected. The line height was calculated from headline to baseline, and hence the line was segmented based on the height of the line. The accuracy reported in segmenting header lines was 78% and at base lines was 89%. The total number of lines in the sample document was 8. In Gurumukhi script, Jindal et al. [8, 9] had worked on recognition of degraded text that consisted of overlapped lines and touching characters in all three zones. Various problems faced during recognition were discussed in the work. A solution to segment overlapped lines was proposed based on detection of the headline of the text line. Lines were segmented after calculating the line height. The sample documents in the research paper consisted of a maximum of six lines. An OCR for the printed Gurumukhi script was developed by Lehal and Singh [14]. For segmenting lines, the horizontal projection profile had been used. An accuracy rate of 96.6% had been achieved using the proposed method. Kumar et al. [15] had also presented algorithms for line and word segmentation of offline handwritten Gurumukhi script documents. For line segmentation, they had considered a projection profile-based technique, and for word segmentation they had used the white space and pitch methodology, with which horizontal white space in words was found for segmentation points. These techniques will not work if text lines are touching or overlapped. A ligature segmentation approach was proposed by Lehal [4] for segmenting text lines of Urdu script. The authors had used the projection profile for separating the image into horizontal zones and calculated the median of the heights of all zones, but sometimes a zone may contain multiple lines, so the median of the inter-zone heights was calculated to segment the line. Mahmud et al. [16] proposed a method for segmenting Bangla characters. Text lines were segmented based on detection of the headline, and words were segmented using the projection profile technique. For character segmentation, the authors considered characters present only in the middle zone, which were segmented by using the vertical projection profile after removal of the headline. Some characters were segmented into two parts after removal of the headline; then a depth-first search (DFS) algorithm was applied to segment the characters, but this technique failed if one character was formed by a combination of two characters. The sample document used in the paper was of three lines, on which word and character segmentation was implemented after segmenting the text lines. Hasnat et al. [17] proposed a method for the recognition of Bangla script. They used the projection profile method to segment lines, words and characters from the Bangla script document. The sample document in the paper was of seven lines. However, no efficient technique is available for complete segmentation of documents consisting of single-column and multi-column text. Sahare et al. [1] have segmented multilingual documents using the projection profile. Recognition of segmented characters has also been performed and verified through an SVM classifier. The recognition accuracy achieved is 98.86%. Hindi text segmentation has been performed by Pallakollu et al. [5] based on headline detection. The document under experimentation consisted of skewed lines also. The maximum accuracy achieved is 93.6%. Script identification and segmentation have been performed in bilingual documents using the white space and pitch method. The documents were a mixture of Roman and Devanagari scripts. Lines in the sample documents were merged, and 90% accuracy has been achieved in segmenting text lines in these documents. SVM and k-NN classifiers were used for classification, among which SVM produced better results. Niranjan et al. [6] have segmented text lines in handwritten tri-lingual documents consisting of Kannada, Devanagari, and Roman scripts. Structural features of text lines have been used to segment the lines. So, the authors have presented a technique in this article. A comparative analysis of a few techniques has been presented in Table 1.

Table 1 Segmentation accuracy in various documents

Sr. No.   Source of document   Number of columns   Number of text lines   Correctly segmented lines   Accuracy (%)
Doc_1     Book                 1                   31                     31                          100
Doc_2     Book                 2                   46                     44                          95.6
Doc_3     Book                 1                   32                     31.5                        98.4
Doc_4     Book                 1                   33                     33                          100
Doc_5     Magazine             3                   60                     60                          100
Doc_6     Magazine             2                   70                     69                          98.5
Doc_7     Magazine             3                   81                     79                          97.5
Doc_8     Magazine             1                   25                     25                          100
Doc_9     Magazine             2                   40                     39                          97.5
Doc_10    Newspaper            2                   20                     20                          100
Doc_11    Newspaper            1                   15                     15                          100
Doc_12    Newspaper            3                   36                     35                          97.2
Doc_13    Newspaper            2                   24                     24                          100
Doc_14    Newspaper            3                   39                     37                          94.8
Doc_15    Newspaper            2                   28                     28                          100

3 Characteristics of Devanagari Script We tried a few techniques before arriving at the technique proposed in this paper. Firstly, we experimented with segmenting images from text based on horizontal and vertical projection profiles. The basic idea of this technique was that regular patterns are formed where the headline is present, and irregular patterns are formed when an image is present in the text. The area where the profile is irregular can be an image area, so the image area is cut first using the horizontal projection. But sometimes no clear horizontal profile pattern is formed to cut the accurate region containing the image, if the black pixels in the image are few. The sample article is shown in Fig. 1. A horizontal projection profile is drawn in Fig. 2, and a vertical projection profile is depicted in Fig. 3. As shown in the figure, an irregular pattern is formed, leaving some


Fig. 2 Devanagari consonants and vowel

Fig. 3 Different zones in a text line

area of the image. If we cut only the portion forming an irregular pattern, the part of the image area that does not form the same pattern as the rest of the image will not be segmented. Devanagari is written from left to right. A text line of Devanagari script is divided into three zones, namely the upper zone, middle zone, and lower zone, as shown in Fig. 3. The zone above the headline is called the upper zone, the portion of the text line from the headline to the base line is the middle zone, and the lower zone is present below the base line.
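To illustrate the projection profiles mentioned above, a minimal Python sketch (an assumption of this write-up, not the authors' implementation) of computing row-wise and column-wise profiles of a binarized page, and of locating candidate headline ('Shirorekha') rows by a pixel-density threshold, is given below; the threshold ratio is an illustrative value:

import numpy as np

def projection_profiles(binary_img):
    # binary_img: 2-D array with text (foreground) pixels = 1 and background = 0
    horizontal = binary_img.sum(axis=1)   # one count per row; peaks at headline rows
    vertical = binary_img.sum(axis=0)     # one count per column; valleys at word gaps
    return horizontal, vertical

def candidate_headline_rows(binary_img, threshold_ratio=0.5):
    # Rows whose black-pixel density exceeds a fraction of the maximum row density
    horizontal, _ = projection_profiles(binary_img)
    return np.where(horizontal >= threshold_ratio * horizontal.max())[0]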


The headline, called the 'Shirorekha', is the main feature of Devanagari script; it joins the characters to form a word and plays an important role in segmenting text lines. Apart from the vowels and consonants, there are compound (composite) characters in Devanagari script, as in most Indian scripts, which are formed by combining two or more basic characters.

4 Problems in Line Segmentation In this paper, documents are considered from books, magazines, and newspapers. The layout of these types of documents is complex because of the presence of title text lines, images, multi-column text, etc. Document segmentation into individual blocks is essential before any further segmentation into individual units. After block segmentation, text is separated into individual text lines. Many hurdles can be faced in segmentation of text lines in these kinds of documents. Most major problems arise due to poor paper and printing quality. Text visibility after binarizing the document heavily depends upon the printing and the quality of the paper. Heavy ink can cause bulging of characters, overlapping of characters, and touching within a character. Poor paper quality causes fading of ink, distortion of text, broken characters, missing lines, etc. The following are some problems that generally occur in these documents.
• Small lines: It generally happens in newspaper text that the starting line and ending line of a paragraph in the news body are shorter than the other lines. This also happens in book and magazine documents. In books, when a story is depicted, most of the explanatory lines are shorter than the other lines, as shown in Fig. 4. Due to the low pixel density, headlines of these text lines are not detected; hence, line segmentation based on headline detection cannot produce accurate segmentation results.
• Justified lines: In the running text of a paragraph, sometimes the text is justified to adjust the length of text lines, as shown in Fig. 5; hence, the pixel density in the headline row decreases below the fixed threshold value. With a low pixel density, it is difficult to detect the headline of a text line. So, line segmentation cannot be implemented with the line height method or the headline detection method.


Fig. 4 Small text lines in a document



Fig. 5 Justified text lines in a document

Fig. 6 Distorted text lines

• Distorted text lines: Sometimes text pixels are distorted when an image is scanned and converted into a bitmap, as shown in Fig. 6. The text is not clearly visible due to distortion of pixels; hence, the pixel density falls.
• Over-segmentation and under-segmentation: Devanagari text lines, like those of many other Indian scripts, are divided into three zones, namely the upper zone, middle zone, and lower zone, as shown in Fig. 3. Upper zone characters are generally separated from the middle zone by one or more pixels, as shown in Fig. 7. The horizontal projection profile is also a widely used method for line segmentation. When the projection profile was implemented on a sample document, one text line was segmented into multiple strips, which caused over-segmentation. If lines are overlapped, then segmentation can also result in over-segmentation. In some research papers, line segmentation was performed using the vertical strip method: the document was divided into equal-sized strips, and the horizontal projection profile was used for further processing. But in the case of over-segmentation and under-segmentation, this method cannot be easily implemented.
• Multiple text lines in documents: Book pages, magazine pages and newspaper articles consist of multiple text lines, as in the documents shown in Fig. 1. If the headlines of a few text lines are not detected, it is difficult to segment text lines using the line height method, headline detection method or median calculation method. Most research papers on Devanagari script use databases of only a few lines.

Fig. 7 Oversegmented text line



5 Phases in OCR

OCR consists of multiple phases, as depicted in Fig. 8. The steps in character recognition are as follows:

• Image Acquisition: In the image acquisition phase, the image is retrieved from some source through a scanner or another hardware device at the required resolution. It is the first step in the character recognition process. The retrieved image is stored in the system for the subsequent pre-processing steps.
• Pre-Processing: This is a common name for operations implemented at a lower level of the image processing system. The purpose of pre-processing techniques is to enhance the required image features and to suppress unwanted distortion in the image. In the pre-processing phase, the scanned image is converted into a bitmap image.
• Segmentation: Segmentation is a basic and crucial step in a recognition system. Most of the work in document analysis focuses on correct segmentation of text data. Segmentation is the process of extracting individual recognizable units, i.e., characters. The text is segmented into individual lines, lines into words, and finally, words into characters. If a document contains images, page layout analysis is a necessary step to extract text from the document for recognition. Line segmentation is the base for any segmentation process.
• Feature Extraction: Appropriate selection of feature extraction methods is also a significant factor in attaining high recognition accuracy in a character recognition system. Various feature extraction methods have been developed for different types of scripts, based on, for example, the shape of characters, the contours of characters, thinned characters (skeletons), or sub-images of each individual character. The feature extraction methods can be based on structural and statistical features.
• Classification: Classification of text is performed based on the extracted features of the characters.

Fig. 8 A block diagram of optical character recognition



Text classification is the process of assigning a tag or class to text according to its features. The feature vector is fed to the classifier; based on pre-defined class templates, the input feature vector is compared with the templates and the appropriate class is assigned to the character. If the exact class is assigned to the character, the character is said to be accurately recognized.
• Post-Processing: Post-processing is the final step in optical character recognition. Its goal is to detect and correct linguistic misspellings in the OCR output text after the input image has been scanned and completely processed. If the output generated through OCR has not achieved the required accuracy rate, it can be corrected in three ways, namely manual error correction, dictionary-based error correction, and context-based error correction. Any of these methods can be adopted for correcting the output.

6 Text Line Segmentation

Various documents were taken from books, magazines, and newspapers. All the documents were scanned at 300 dpi resolution, and the scanned images were converted into bitmap images for further processing. Two techniques were tried on the documents before the proposed method: the first was line segmentation based on headline detection, and the second was line segmentation based on the median of text line heights.

In the headline detection method, the headlines of text lines were detected based on a fixed threshold value of pixel density. As the headline row has a higher density of black pixels compared to other rows, it is easy to detect a headline row based on pixel density. It is common in printed scripts that the upper zone is almost half of the middle zone. For accurate segmentation of all text lines, the headline of each line should be located, but in documents from books, magazines, and newspapers the headline of every line could not be detected because of the problems discussed in Sect. 4. So the headline detection method alone did not serve the purpose of line segmentation in these kinds of documents.

In the median calculation method, the heights of all text lines were calculated from headline to headline, and the median of all the heights was computed. Based on this median value, headlines were first posed on the text lines whose headlines were not detected, and afterward line segmentation was performed using the headline positions. Segmentation based only on the median of line heights produced good results when the sample document had few text lines. When a document contained many text lines, the headlines posed based on the median height were displaced by a few pixels, which resulted in wrong segmentation of some text lines. After these techniques individually failed to achieve the required results, a hybrid method based on headline detection and median calculation is proposed in this paper for segmenting text lines in these documents.

If a document consists of a title block and a body text block, it is necessary to segment the different blocks in the document. The body text block is then divided into columns before segmenting the lines. Headline and body text block segmentation in newspaper articles printed in Gurumukhi script is performed using the white space and pitch method.



Fig. 9 Headline and column segmentation in a newspaper article image

The title block is segmented using continuous horizontal white space, and columns are segmented using continuous vertical white space with a fixed threshold value. The same method can be applied for segmenting Devanagari documents consisting of a title block and multiple columns. Figure 9 shows a sample image of title block segmentation and column segmentation in Devanagari text documents. In this paper, the documents considered are already separated from the title block. The steps of the proposed method for segmenting lines are as follows:

Step 1: A threshold value of black pixel density is fixed for detecting the headlines of text lines.
Step 2: All possible headlines are detected based on the threshold value. If the density of (row − 1) < thresh_val and the density of (row) > thresh_val, then this point is marked as the starting point of the headline of a text line.
Step 3: The height of every strip from headline to headline is calculated and stored in an array line_ht[row].
Step 4: The elements of the array line_ht[row] are sorted and the median of the array elements is calculated.
Step 5: Text lines whose headlines were not detected will be overlapped with other strips. Missed headlines are posed within a strip, from one detected headline to the next, based on the median value.
Step 6: After detecting and posing the headlines of all text lines, segmentation points are marked above the headlines at a distance equal to half the height of the middle zone.

Line segmentation results for documents from a book, a magazine, and a newspaper article are depicted in Fig. 10.
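These steps can be made concrete with a minimal sketch, assuming a binarized page image stored as a NumPy array in which 1 marks a black text pixel and 0 the background; the threshold value and all function and variable names below are illustrative and are not taken from the original implementation.

```python
import numpy as np

def segment_lines(binary_img, thresh_val=0.4):
    """Hybrid headline-detection + median-height line segmentation (illustrative sketch)."""
    h, w = binary_img.shape
    density = binary_img.sum(axis=1) / w                     # black-pixel density per row

    # Steps 1-2: a row whose density rises above the threshold starts a headline
    headlines = [r for r in range(1, h)
                 if density[r - 1] < thresh_val and density[r] > thresh_val]

    # Steps 3-4: strip heights from headline to headline, and their median
    heights = np.diff(headlines)
    median_ht = int(np.median(heights)) if len(heights) else 0

    # Step 5: pose missed headlines inside strips that are much taller than the median
    posed = []
    for top, height in zip(headlines, heights):
        posed.append(top)
        extra = height // median_ht - 1 if median_ht else 0
        posed.extend(top + (k + 1) * median_ht for k in range(extra))
    if headlines:
        posed.append(headlines[-1])

    # Step 6: cut half a middle-zone height above each headline
    # (the middle zone is taken as roughly half of the median strip height)
    offset = median_ht // 4
    return sorted(max(p - offset, 0) for p in posed)
```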



Fig. 10 Line segmentation results: a document from a book, b document from a magazine, c document from a newspaper article




7 Conclusion

In this paper, complete segmentation of text lines in single-column and multi-column documents has been presented. A fusion technique based on the headline detection method and the median of text line heights has been proposed for segmenting unconstrained text lines of Devanagari text. Table 1 presents the line segmentation accuracy for various documents. All the problems discussed in Sect. 4 were solved successfully with the proposed method.



In a few documents, the text was totally distorted and headlines were detected only at the start of a few text lines. In these documents, false headlines were posed towards the end of the document. Because text line heights vary by a few pixels and there are many text lines in a document, headlines were not posed at accurate positions using the median value, and a few pixel rows of an upper text line were segmented with the lower text line. More experiments can be done to segment totally distorted text. The proposed method can also be applied to other scripts that share common properties with the Devanagari script.


Intelligent Strategies for Cloud Computing Risk Management and Testing Vinita Malik and Sukhdip Singh

Abstract Cloud computing uses distributed models with access whenever requested through computing devices with high configuration and minimal management effort. Its virtual, multi-tenant, and complex infrastructure needs early risk identification and management. Businesses involved in this computing require reassurance as a main consideration for cloud services testing. This research mainly identifies the vulnerabilities, threats, and risks involved in cloud computing. The paper elaborates risk management and risk-based testing in detail, together with risk reduction strategies in several computing environments, i.e., cloud computing, artificially intelligent environments, and pervasive computing. The research also emphasizes the key factors in risk-oriented testing, risk minimization, trust management, and the limitations and benefits of cloud computing. Keywords Risk-based testing · Risk management · Cloud computing · Trust · Attacks · Threats · Vulnerabilities

1 Introduction

Cloud computations have developed into a prominent computing standard because of technology advancement and better infrastructure availability. These computations impart an illusion of limitless resources with on-demand service, high scalability, a pay-per-use model, and high agility, which is further characterized by optimum resource usage and resource pooling with high elasticity [1]. The main business concerns of cloud computing are lower costs, reduced investments, and agile deployment. In spite of the many attractive features associated with this computing, adopting it involves high potential risks.




Cloud computing is a mixture of many computing paradigms, i.e., distributed and grid computing, which makes it convoluted in nature. The services provided by any organization have to consider mainly the information security, privacy, and trust control measures followed by it. This research presents an analytical study focusing on the identification of vulnerabilities, risks, threats, and attacks in cloud computations. The many challenges involved in cloud computing are elaborated along with risk management strategies and risk-oriented testing in high computing environments. The paper is divided into several sections that answer the questions detailed below:

• How can cloud computing, its service and deployment models, attributes/characteristics, advantages/benefits, and issues/challenges be described?
• How can risk management and risk-based testing be described?
• What are the risks, attacks, threats, and vulnerabilities involved in cloud computations?
• What are the risk depletion strategies in these computing environments?
• What are the main factors, advantages, and challenges involved in cloud testing?
• How can trust be evaluated in cloud computations?

Section 2 deals with the basics of cloud computations, while Sect. 3 details the risks and attacks involved in these computations along with risk-based testing and risk depletion methods in artificially intelligent, pervasive, and cloud computing environments. Cloud testing is discussed in Sect. 4, and in Sect. 5 the relevance of trust in the cloud environment is discussed. Conclusions and future scope are presented in Sect. 6.

2 Cloud Computing

According to a Gartner report on technology trends, it has been predicted that within the coming five years cloud computing has a strong chance of prevailing in the market in all decision-making businesses [2]. IT experts are searching for cloud solutions that help in formulating solutions across the main business domains [3]. This computing defines a shared-resource access model not only over the Internet but also over any other computing network. It requires little administrative effort and minimal interaction with the service provider. In this section, the deployment and service models, characteristics, and challenges of cloud computing are discussed in detail.

2.1 Attributes/Characteristics

The main attributes/characteristics of cloud computations are given below [1, 4–8]:

• Multi-Tenancy: Multiple users utilizing the services.



• Pay-per-use Model: Paying for the services as per usage.
• Network Access: Services are accessed via the network.
• Optimized resource usage: Depending on the service type, resource usage is optimized with the help of a measuring capability.
• Elasticity: The quantity/quality of services can be increased or decreased as per user needs.

2.2 Deployment/Stationing Models

Four deployment/stationing models are discussed below [9]:

• Private model: The infrastructure is utilized by only one organization. Here, control over data/information is the main consideration.
• Public model: Here, the cloud provider supplies the infrastructure. The user has quite low control over cloud data security.
• Community model: In this model, many organizations have shared concerns.
• Hybrid model: This model is made of two or more private, community, or public clouds. All primary programs run on the private cloud and the secondary ones run on the public cloud.

2.3 Service/Business Models

The following service/business models come under cloud computing [1]:

• SaaS Model/Software as a Service: Here, the service provider owns the responsibility for application management. Examples include SAP Business and Google Docs.
• PaaS Model/Platform as a Service: In this, application developers develop programs according to the platform's needs. The platform then provides the virtual domain/environment for the cloud services. Examples include Google App Engine and Forge.com.
• IaaS Model/Infrastructure as a Service: Here, the complete computational resources are offered as services. Cloud security is maintained by the clients, while the service provider maintains the resources, i.e., the networks or servers, provided to the customers. Examples include EMC2, VMware, and Amazon Web Services.

2.4 Challenges/Issues

Cloud computing presents the following challenges [10]:

• Security and Privacy: These issues have favored the non-adoption of cloud computing.



• Resource Discovery: In a distributed cloud, resource allocation plays a key role.
• Service Quality: Quality of service is another factor that makes organizations hesitant to adopt cloud services.
• Data Scalability: The scaling of nodes is done as per user response.
• Data Integrity: It is difficult to protect data from unauthorized access.
• Virtualization: Hiding complex software/hardware details by creating an abstraction layer is a burdensome task.
• Trust: Trust requires maintaining a reputation with users that the application in use is safe.
• Scheduling: Efficient resource usage needs proper data scheduling.
• Debugging: Remote data rectification is quite critical here.
• Querying: Query processing is quite challenging.
• SLA/Service-Level Agreement: Service-level agreements (SLAs) make cloud computing more complex.

3 Risks, Risk Management, Risk-Oriented Testing and Risk Reduction/Depletion Strategies

This part of the research deals with attacks, threats, and risks in cloud computing, together with risk management, risk-oriented testing, and risk reduction strategies in cloud, AI, and pervasive environments.

3.1 Risk Categories

The risk categories in cloud computing are defined as below [11]:

• Organizational Risks: IT planning, IT governance, managerial risks, and industrial regulation risks.
• Operational Risks: Business risks and risks in daily IT operations.
• Technical Risks: Cloud infrastructure risks and IT deficiencies lead to technical risks.
• Legal Risks: Contract risks, data privacy risks, and intellectual property risks.

3.2 Risk Factors

The following risk factors have been identified [12–14]:



Data transfer risks, inadequate knowledge risks, insecure application development, shared resource environment risks, data breach risks, regulatory compliance risks, service availability risks, improper service management risks, distributed data risks, data recovery risks, data virtualization issues, data integrity issues, service-level agreement risks, resource exhaustion risks, and authentication risks.

3.3 Vulnerabilities

The vulnerabilities in cloud computing are described as follows [15]:

• Virtualization vulnerabilities at the OS level are exploited for denial-of-service attacks.
• Internet protocol vulnerabilities lead to flooding and ARP spoofing.
• Injection vulnerabilities, such as SQL injection flaws and OS injection flaws.
• Browser vulnerabilities lead to attacks that include SSL certificate spoofing and HTML service poisoning.

3.4 Threats

Cloud computing faces the following types of threats [15]:

• Poor storage/bandwidth usage threats.
• Uncertain interface threats.
• Remote data access threats.
• Reliability threats due to business model changes.
• Bad risk profiling threats.
• Identity theft threats.
• Cloud auditing threats.

3.5 Attacks

The following attacks have been found in cloud computing [15]:

• Zombie Attacks: Here, the attacker deluges the network with many requests.
• Shared Technology Issues: The virtualization technology offers shared resources whenever requested, which may cause problems.
• Malicious Insiders: The cloud service provider's procedures are non-transparent in nature and can cause this issue.
• Service Hijacking: User account services are seized through information exploitation.



• Virtualization Issues: These include attacks such as VM Escape and Hypervisor Rootkit.
• Man-in-the-middle Attack: SSL layer misconfiguration causes this type of attack.
• Phishing Attack: Sensitive data gets compromised through manipulated Web links.
• Backdoor Channel Attack: The victim's resources can be controlled through this attack.

3.6 Risk Management

Software risk management encompasses the core stages of identification of risk items, assessment of risk items, risk planning, and monitoring of risks. The software risk management stages are defined as follows [16]:

• Risk Identification: Identifies all risk factors of the software project.
• Risk Analysis: Analyzes and assesses the risks involved in the software project.
• Risk Planning: Draws out a plan for how to mitigate the project risks to obtain satisfactory solutions.
• Risk Monitoring: Tracks risks over time and reports their status to the project team.

3.7 Risk-Oriented Testing

Risk-managed testing is composed of various stages by which risk is managed in software projects through risk-based test case prioritization [17]. The generic risk-managed/oriented testing process is demonstrated below (see Fig. 1). In the generic process of risk-oriented testing, test planning and risk assessment go in parallel. Test cases are designed once all risk items have been assessed and classified according to their risk levels [18].
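As an illustration of how risk assessment can drive test ordering in this generic process, the sketch below scores each risk item by its exposure (likelihood × impact) and orders the associated test cases accordingly; the risk items, weights, and test names are invented purely for the example.

```python
# Illustrative risk-based test prioritization: exposure = likelihood x impact.
risk_items = {
    # item: (likelihood 0-1, impact 1-10, tests covering the item)
    "data_breach":        (0.30, 9, ["test_encryption", "test_access_control"]),
    "service_outage":     (0.15, 8, ["test_failover"]),
    "sla_violation":      (0.40, 5, ["test_response_time"]),
    "insecure_interface": (0.25, 7, ["test_api_auth"]),
}

def exposure(likelihood, impact):
    return likelihood * impact

ranked = sorted(risk_items.items(),
                key=lambda kv: exposure(kv[1][0], kv[1][1]),
                reverse=True)

for name, (p, i, tests) in ranked:
    print(f"{name:18s} exposure={exposure(p, i):.2f} -> run {tests}")
```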

3.8 Risk Reduction/Depletion Strategies

The main risk depletion/reduction strategies are described as follows [12, 19, 20]:

• Perceptron model (a minimal sketch follows below)
• Regression modeling
• Filtering and classification
• Instance/case-based knowledge algorithms
• Voting
• Workload partitioning
• Structure modeling
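The first strategy in the list can be illustrated with a minimal perceptron sketch that classifies cloud deployments as high or acceptable risk; the feature values and labels below are made up for the example and do not come from the cited studies.

```python
import numpy as np

# Toy data: each row = [data sensitivity, exposure to shared resources, vendor maturity]
X = np.array([[0.9, 0.8, 0.2],
              [0.2, 0.1, 0.9],
              [0.7, 0.6, 0.4],
              [0.1, 0.3, 0.8]])
y = np.array([1, 0, 1, 0])        # 1 = high risk, 0 = acceptable risk

w, b = np.zeros(X.shape[1]), 0.0
for _ in range(20):               # classic perceptron update rule
    for xi, yi in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0
        w += 0.1 * (yi - pred) * xi
        b += 0.1 * (yi - pred)

print([1 if xi @ w + b > 0 else 0 for xi in X])   # predicted risk labels
```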



Fig. 1 Generic process of risk-oriented testing

Cloud security risks are divided into architecture-related risks, compliance risks, and privacy risks (as seen in Figs. 2, 3 and 4) [21]. Cloud security risk management requires the minimization of compliance and governance risks [22]. Various intrusion detection systems have been employed to address vulnerabilities; the proposed defensive mechanism life cycle is given below (as seen in Fig. 5) [23].


Fig. 2 Taxonomy defining architecture-based risks



Fig. 3 Taxonomy defining compliance-related risks

Fig. 4 Taxonomy defining privacy-related risks
Fig. 5 Defensive mechanism life cycle




Other methods reported for risk management in cloud computing include impact and likelihood ratings, cooperative intrusion detection systems (with Nessi2 as a simulator tool), fuzzy self-organizing maps, a privacy impact assessment (PIA) tool, behavior modeling with behavior trees, and performance metrics [24–28].

3.9 Risk Reduction/Depletion Strategies in Artificial Intelligent Computing

The following AI techniques have been used in the past for risk management [29–35]: particle swarm optimization, Bayesian classification models, genetic algorithms, fuzzy logic, neural network learning, and neuro-fuzzy techniques.

3.10 Risk Reduction/Depletion Strategies in Pervasive Computing

The following risk depletion strategies are applied in pervasive computing [36–39]:

• OCTAVE
• MEHARI
• CRAMM
• Risk matrix (an illustrative example follows below)
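Of these, the risk matrix is simple enough to show directly; the 3 × 3 level boundaries below are illustrative only and are not taken from OCTAVE, MEHARI, or CRAMM.

```python
# Illustrative 3x3 risk matrix: rows = likelihood level, columns = impact level.
LEVELS = ["low", "medium", "high"]
MATRIX = [
    # impact:  low       medium     high
    ["low",    "low",     "medium"],   # likelihood: low
    ["low",    "medium",  "high"],     # likelihood: medium
    ["medium", "high",    "high"],     # likelihood: high
]

def rate(likelihood, impact):
    return MATRIX[LEVELS.index(likelihood)][LEVELS.index(impact)]

print(rate("medium", "high"))   # -> high
```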

4 Main Factors, Advantages and Challenges in Cloud Testing

This section describes the main factors, advantages, and challenges in cloud testing:

4.1 Main Factors

The main factors required to be considered for testing cloud services are described here [40]:

• Testing of cloud services on demand
• Script-based test execution
• Cross-browser support
• Real-time reporting
• Functional testing support
• Servicing cost



• Integration with other APIs

4.2 Advantages

The major benefits of cloud computing are described as follows [41]:

• The unit cost of computing gets reduced.
• Pre-configured software images reduce errors.
• Reduced time to market and high efficiency.

4.3 Challenges

The cloud testing challenges are described as follows:

• No standards are defined for integrating a public cloud with a client company's data centers.
• Public cloud security has been a big issue.
• Poorly defined terms and conditions of cloud service providers.
• Limited technology, configurations, servers, networking, and bandwidth.
• Less awareness among the testing team about the test resources.
• Long waiting times due to network outages.

5 Trust and Cloud Computing

Trust is considered one of the most important aspects of cloud computing [42]. For a particular time period, it gives a personalized measurement between entities that try to act reliably under a given set of circumstances [43]. Trust is evaluated to measure the quality of service [44]. Trust can be centralized/distributed, static/dynamic, proactive/periodic/reactive, or direct/indirect. The major trust types are depicted with their attributes and applications (as seen in Fig. 6).

Trust Evaluation

Trust is evaluated by two methods, i.e., centralized and distributed. In the centralized method, a single entity controls the entire architecture; these systems are the fastest in search but have scalability issues. In the distributed method, all control operations are handled by many network components.
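A minimal sketch of centralized trust evaluation is given below, assuming that trust is aggregated from per-interaction quality-of-service feedback in a single registry; the decay weighting and the provider name are illustrative assumptions, not part of the cited trust models.

```python
from collections import defaultdict

class TrustRegistry:
    """Centralized trust evaluation (illustrative): one registry holds all scores."""
    def __init__(self, decay=0.9):
        self.scores = defaultdict(lambda: 0.5)   # neutral prior trust in [0, 1]
        self.decay = decay                       # weight given to older evidence

    def report(self, provider, satisfied):
        # satisfied: 1.0 if the interaction met the expected QoS/SLA, 0.0 otherwise
        old = self.scores[provider]
        self.scores[provider] = self.decay * old + (1 - self.decay) * satisfied

    def trust(self, provider):
        return self.scores[provider]

registry = TrustRegistry()
for outcome in [1.0, 1.0, 0.0, 1.0]:
    registry.report("cloud-provider-A", outcome)
print(round(registry.trust("cloud-provider-A"), 3))
```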



Fig. 6 Type of trust with its applications and attributes/characteristics

6 Conclusions and Future Scope

This paper not only presents a deep analytical study of various concepts related to cloud computing, i.e., basics and the management of risks, threats, and attacks, but also explores risk management and risk-oriented testing in detail. The research focuses on risk management strategies in cloud computing, artificially intelligent, and pervasive environments. Cloud computing testing, and how trust is managed and evaluated in cloud services, have also been elaborated. The future scope may be the risk-based testing and trust evaluation of cloud resources using automated tools. Further, the correlation among risks, testing, trust, and quality assurance can be established by developing our own cloud architecture imbibing cloud security and privacy.

References 1. Leguías Ayala IDC, Vega M, Vargas-Lombardo M (2013) Emerging threats, risk and attacks in distributed systems: cloud computing. In: Elleithy K, Sobh T (eds) Innovations and advances in computer, information, systems sciences, and engineering. Lecture notes in electrical engineering, vol 152. Springer, New York 2. Mohapatra S, Lokhande L (2013) Cloud computing and ROI: a new framework for IT strategy. Management for professionals. Springer 3. Doddavula SK, Agarwal I, Saxsena V, Mahmood Z (2013) Cloud computing solution pattern: infrastructure solutions, vol 4. Springer



4. Jansen W (2011) Guidelines on security and privacy in public cloud computing. NIST J, pp 1–60 5. Grance T, Mell P (2009) Definition of cloud computing. NIST J, pp 1–7 , (2009) Cloud Security Alliance, Security guidance for critical areas of focus in cloud computing vol.-2, J Ala Acad Sci 76, (2009) 6. Khajeh Hosseini A, Sriram I (2010) Research agenda in cloud technologies. Technical report 7. Schubert L, Jeffery K, Neidecker-Lutz B (2010) The future of cloud computing opportunities for European cloud computing beyond 2010. ACC 2011, Part IV, 1–71 8. Aleem A, Sprott CR (2013) Let me in the cloud: analysis of the benefit and risk assessment of the cloud platform. J Finan Crime 20(1):6–24 9. Chiregi M, Navimipour NJ (2013) A comprehensive study of trust evaluation mechanisms in the cloud computing. J Serv Sci Res 9:1–30 10. Peng GCA, Dutta A, Choudhary A (2013) Exploring critical risks associated with enterprise cloud computing. LNICST, pp 132–141 11. Ahmed N, Abraham A (2015) Modeling cloud computing risk assessment using machine learning. In: Abraham A, Krömer P, Snasel V (eds) Afro-European conference for industrial advancement. Advances in intelligent systems and computing, vol 334. Springer, Cham 12. Ahmed N, Abraham A (2015) Modelling cloud computing risk assessment using ensemble methods. Adv Intell Syst Comput, pp 261–273 13. Srinivasan S (2014) Assessing cloud computing for business use. Springer briefs in Electrical and computer Engineering, Springer Science + Business Media, pp 101–118 14. Modi C, Patel D, Borisaniya B (2012) A survey on Security issues and solutions at different layers of cloud computing. J Supercomput 63:561–592 15. Bannerman PL (2008) Risk and risk management in software projects: a reassessment, J Syst Softw 2(2):442–450 16. Felderer M, Ramler R (2014) Integrating risk-based testing in industrial test processes . Softw Qual J 22:543–575 17. Alam MM, Khan AI (2013) Risk-based testing techniques: a perspective Study. Int J Comput Appl 65:33–41 18. Oktay KY, Gomathisankram M, Singhal A (2012) Towards data confidentiality and a vulnerability analysis framework for cloud computing. Secure cloud computing, Springer Science+ Business Media , pp 213–238 19. Baldwin A, Pym D,Shiu S (2013) Enterprise information risk management dealing with cloud computing. Privacy and security for cloud computing, 978-1-4471-4188-4 20. Gonzalez N et al (2012) A quantitative analysis of current security concerns and solutions for cloud computing. Springer, vol 1(11) 21. Farell R (2012) Securing the cloud–governance, risk and compliance issues reign supreme. Inf Secur J Global Perspect, 310–319 22. Inayat Z et al (2012) Cloud based intrusion detection and response system open research issues and solutions. Arab J Sci Eng 42(2):399–423 23. Kiran M (2014) A methodology for cloud security risks management. In: Mahmood Z (eds) Cloud computing. computer communications and networks. Springer, Cham 24. Al-Mousa Z, Nasir Q (2015) C1-CIDPS: a cloud computing based cooperative intrusion detection and prevention system framework. Springer, vol 523 25. Pillutla H, Arjunan A (2018) Fuzzy self-organizing maps-based DDoS mitigation mechanism for software defined networking in cloud computing. J Amb Intell Human Comput, pp 1–13 26. Tancock D, Pearson S, Charlsworth A (2013) A privacy impact assessment tool for cloud computing. Privacy and security for cloud computing, pp 1–43 27. Mehrotra S (2014) Towards a risk based approach to achieve data confidentiality in cloud computing. LNCS, vol 8425 28. 
Sandhu GS, Salaria DS (2014) Bayesian network model of the particle swarm optimization for software effort estimation. Int J Comput Appl 96(4):52–58 29. Khoshgoftaar TM, Liu Y (2007) A multi- objective software quality classification model using genetic programming. IEEE Trans Reliab 56(2):237–245



30. Rodvold DM (1999) A software development process model for artificial neural networks in critical applications. In: Proceedings of IJCNN’99. International joint conference on neural networks. (Cat. No. 99CH36339), Washington, 05, pp 3317–3322 31. Liu X, Kane G, Bambroo M (2006) An intelligent early warning system for software quality improvement and project management. J Syst Softw 79(11):1562–1564 32. Hsieh M, Hsu Y, Lin C (2018) Risk assessment in new software development projects at the front end: a fuzzy logic approach. Amb Intell Hum Comput 9:295–305 33. Huang X, Ho D, Capretz LH, Ren J (2007) Improving the COCOMO model by neuro fuzzy approach. Appl Soft Comput 7:29–40 34. Rizzi A, Lacovazzi A (2015) A low complexity real-time internet traffic flows neuro-fuzzy classifier. Comput Netw 91:752–771 35. Carnegie Mellon University. https://www.cert.org/octave/download/intro.html. Last accessed 2015 36. Clusif. https://www.clusif.asso.fr/fr/production/ouvrages/pdf/MEHARI-2010-Overview.pdf. Last Accessed 2010 37. Insight Consulting. https://dtps.unipi.gr/files/notes/2009-2010/eksamino_5/politikes_kai_dia xeirish_asfaleias/egxeiridio_cramm.pdf. Last accessed 2009 38. Ledermuller T, Clarke NL (2011) Risk assessment for mobile devices. Lecture notes in Computer science, pp 210–221 39. Markande K, Murthy SJ (2013) Leveraging potential of cloud for software performance testing. Cloud computing methods and practical approaches, computer communications and networks, pp 293–322 40. Chana I, Chawla P (2013) Testing perspectives for cloud based applications, software engineering frameworks for the cloud computing paradigm. Computer communications and networks. Springer, pp 145–164 41. Sidhu J, Singh S (2016) Improved TOPSIS method based trust evaluation framework for determining trustworthiness of cloud service providers. J Grid Comput, pp 1–25 42. Pathan ASK, Mohammed MM (2015) Building customer trust in cloud computing with an ICT-enabled global regulatory body. Wirel Person Commun 85:77–99 43. Xie X, Liu R, Cheng X, Hu X, Ni J (2016) Trust-driven and PSO-SFLA based job scheduling algorithm on cloud. Intell Autom Soft Comput 22(4):1–6 44. Chiregi M, Navimipour NJ (2015) A comprehensive study of trust evaluation mechanisms in the cloud computing. J Serv Sci Res 9:1–30

Effective Survey on Handwriting Character Recognition G. S. Monisha and S. Malathi

Abstract Researchers have developed many technologies to recognize handwritten characters as text. Handwriting recognition is the capability of a computer to read handwriting as actual text. Such a program can be used to overcome the many problems present in recognition when converting handwriting to text. The paper mainly focuses on the basic problem faced in recognizing stroke edges in handwritten characters with various pattern recognition techniques. Based on the survey of this problem, it can be said that segmentation accuracy of characters is low and prevents high recognition accuracy for unconstrained handwriting. Keywords Handwritten · Recognition · Pattern recognition · Stroke edges

1 Introduction

Handwriting recognition is developing rapidly in the current globalized world. Handwriting recognition describes the capacity of a computer system to convert human writing into textual content. However, handwriting converted to text on the majority of smart devices is often not decipherable. Most of the time, there is variability in handwritten text caused by characteristics such as writing speed, movement of the user, space available for writing, overlapping of characters, and sloppiness.

Feature extraction decomposes a character glyph into features such as strokes, closed loops, line direction, and line intersections. These features [1] reduce the representation and make the recognition [2] procedure tractable. In OCR, an algorithm may be trained on a data set of known handwriting text so that it can learn to [3, 4] classify the characters that appear in the test set accurately.




General strategies for feature detection in computer vision are applicable to this sort of optical character recognition [5], which is commonly observed in practical handwriting recognition and indeed in most current optical character recognition programs. For character segmentation, a support vector machine is used, and the output of the segmentation process is used for feature extraction of the character. The SVM [6] is a machine learning algorithm that can limit the segmentation error caused by fast movement of the object. First, frame differencing combined with mathematical morphology is applied to roughly separate the object. Then, the gray values of the image pixels and DCT parameters are computed as [7] features of the image for training the SVM. Finally, hierarchically decomposed SVM binary decision trees are used for classification. Test results show that this scheme is effective and robust.

An improved mean-gradient threshold algorithm with a CNN [3, 8, 9] has been proposed, with the multiplication constant decided by the color information, to reduce industrial noise with a wide range of background colors [10]. This algorithm makes use of the HSI (hue, saturation, and intensity) color model and the color spectrum to arrive at a color map chart for this application, and a detailed analysis is carried out on the results obtained. With the improved algorithm, alphanumeric [11] characters printed on nameplates fixed on automobile components are read accurately, and results are supplied to compare the conventional neighborhood mean-gradient [12] algorithm with the improved one.

Handwritten character recognition is a difficult problem because of the great variety of writing styles and the different sizes of characters. Multiple kinds of handwriting styles from various people are considered in this work. An image with higher resolution will certainly take longer to process than a lower resolution image. In practical image acquisition systems and conditions, shape distortion is common because different individuals' handwriting has different character shapes. Deep neural networks (DNN) have made extraordinary progress in various computer vision and pattern recognition applications [3, 8], including handwritten character recognition. However, most current DNN approaches treat the handwritten sample simply as an image, with preprocessing that includes image restoration to remove all kinds of noise, enhancement, segmentation of the handwritten characters, and feature extraction from the segmented handwritten characters. In this work, we propose an enhancement of the DCNN approach to [9, 13] online handwritten character recognition by fusing an assortment of domain-specific knowledge, including image restoration, image enhancement, image segmentation, image feature extraction, and a deep neural network (DNN).

Section 2 presents the related work with the various technologies and their limitations and merits. The analysis of the technologies and techniques of the different papers, which helps to determine the correct technology for this particular research, is presented in Sect. 3. In Sect. 4, the filters and advantages of the deep neural network are compared. The conclusion describes the work done. Finally, the references are listed in the reference section.
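As a hedged sketch of the SVM-based classification step described above, the snippet below trains a scikit-learn SVM on flattened character-image features; the randomly generated data merely stands in for real segmented character images (or their DCT coefficients), so the reported accuracy is not meaningful.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in data: 200 "character images" of 16x16 pixels with 10 class labels.
rng = np.random.default_rng(0)
X = rng.random((200, 16 * 16))          # flattened pixel (or DCT) features
y = rng.integers(0, 10, size=200)       # character class labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = SVC(kernel="rbf", C=10.0)         # RBF-kernel SVM, commonly used for characters
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```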



2 Related Works

The primary goal of this work is character segmentation, i.e., accurately isolating the characters in a handwritten file. Many researchers have also worked on recognition [14] of handwriting with variability on smart devices; that analysis is based on only a limited number of trained test cases, so the recognition rate is low, and a deep learning algorithm is not used for better results. Tavoli and Revenooer [15] developed recognition of characters in handwritten documents, including test and train phases, implemented using a multilayer perceptron neural network together with particle swarm optimization. Sahare and Dhok [16] developed new algorithms for character segmentation and recognition of handwritten texts using the Contourlet Transform. Himakshi Choudhury and S R Mahadeva Prasanna researched handwriting character recognition using sinusoidal model parameters with a hidden Markov model and a support vector machine; the limitation of this work is that the proposed model is implemented [6] only for particular handwriting, and they conclude that the recognition rate improves due to the modeling of oscillatory motion. Shobha Rani et al. introduced [3] new techniques for the recognition of degraded character images from ancient Kannada poetry documents and of handwritten characters collected from diverse unrestricted environments using a convolutional neural network. The performance of AlexNet for printed characters is reported as 91.3%, and the accuracy for handwritten characters is 92%. The printed dataset is extracted from ancient documents, and the handwritten text is collected synthetically from the age groups 18–21, 22–25, and 26–30, where overlapping characters are still considered as separate classes; semantic analysis is carried out during the post-processing stage of OCR.

In 2016, a technique [5] was proposed for local blur correction of document images, improving the quality of the images in the archives to recover the content with few errors. The performance evaluation of this method used the public DIQA dataset, which comprises blurred document images at different levels. A novel local cluster-based approach is used to reduce the blur for OCR, and the investigation is carried out on an open database with the use of OCR. The proposed strategy increased the OCR rate by 11%. The proposed model was assessed using Tesseract without a language model, and the results suggest that combining the effects of blur and character properties can be used to predict OCR accuracy. Karaoglu et al. [17] proposed a technique for extracting text and logos in scenes and images. A new, fully unsupervised word box proposal strategy is used with a limited number of proposals, on a large text detection dataset with 27,601 boxes. They infer that word-level textual cues are more powerful than character-level textual cues. In future work, the f-score can be increased by using larger datasets. The IOU threshold overlaps ground truth, and it increases with a decrease in recall value.



High recall in word detection is more applicable than a high f-score. An effective textual cue is created for fine-grained business location classification and logo retrieval. Nicole Dalia Cilia et al. proposed diverse [3] univariate measures to produce a feature ranking and proposed an interesting approach for selecting the feature subset best able to support the classification results. The K-Nearest Neighbor algorithm (K-NN) is a well-known nonparametric procedure used for classification. They combined the use of ranking schemes with a greedy search technique to select feature subsets with a growing number of features, obtained by adding features progressively according to their position in the ranking. These features are, however, less complex. Highly selective feature subsets can be obtained by picking the features at the most important positions of the ranking.

Another model, proposed by Sahare and Dhok [7], offers a new multilingual segmentation scheme for reading characters from Indian document images. In the proposed algorithm, primary segmentation paths are obtained using the structural property of characters, while overlapped and joined characters are separated using graph distance theory. Finally, the segmentation results are validated using a highly accurate support vector machine classifier, and three new geometrical shape-based features are computed. The first and second features are formed with respect to the center pixel of the character, while neighborhood information of text pixels is used for computing the third feature. Experiments are performed on publicly available databases and a proprietary database; the proposed recognition algorithm demonstrates the highest accuracy of 99.84% on the Chars74k numerals database.

In [9], Baoguang Shi and Xiang Bai describe an end-to-end neural network that can be trained for image-based sequence recognition. A novel neural network architecture and feature extraction are combined into a unified framework, and the experiments were carried out on standard benchmark datasets. The described algorithm also performs well on the image-based music score recognition task, which confirms its generality. CRNN can be trained to handle varying dimensions of images and produces predictions of various lengths. In future work, it may be made more useful for real-world applications. Sequence prediction from images is difficult, and a variation of the novel neural network architecture is reported to recognize images of text in scenes. A novel neural network is incorporated with a few improvements.

3 Analysis of the Different Technologies and Techniques

Analysis of the technologies supports the correct decision in choosing an efficient technology for this particular research on handwritten characters. The table shows that some technologies are effective in particular respects and have different merits and demerits. This survey concludes that analysis of handwritten characters is more difficult with curves, slopes, and overlapping characters.



Moreover, handwriting varies from person to person, and the shape of each character varies according to its bends and strokes. Identification of characters using these technologies is therefore more difficult. However, the deep neural network shows its effectiveness in comparison with other neural networks. Compared with the other technologies, deep neural networks play a vital role in the recognition of strokes in handwritten characters. Table 1 shows the comparative analysis of the technologies and techniques.

4 Result and Discussion

To demonstrate the technique of this work, a dataset of handwritten characters has been taken from documents and from people who wrote the characters on paper. The dataset consists of uppercase letters and lowercase characters, and it also includes a numeral dataset. Most of the characters are very difficult to understand and hard to identify, as the stroke edges of the characters take numerous shapes. The dataset is collected and saved in the database for feature extraction of each character. The test cases were trained with different neural network technologies, but these did not show as effective a recognition of handwritten characters as the deep neural network. Recognition of characters using this technology is the most trusted and effective way of recognizing handwritten characters. Initially, the document undergoes the different preprocessing steps, binarization of the images takes place, and after that the image is segmented. Finally, features are extracted using the DNN and compared with the OCR features.

Handwriting recognition has numerous favorable characteristics that have caused it to develop quickly in today's technology world. There are many different kinds of advances that enable others to exploit handwriting recognition. Originally, people wrote letters in a particular way and let the computer recognize what the intended letter was and convert it into a text document. The problem with this was the different ways in which letters are written, which could create an unnatural feel for the person writing. Another way of using this technology is simply to write and let the computer transform the writing into a text document; yet the computer does not always get the correct word and sometimes inserts an inappropriate letter. Feature extraction decomposes a character glyph into features such as strokes, closed loops, line direction, and line intersections. These features reduce the representation and make the recognition procedure tractable. In OCR, an algorithm may be trained on a data set of known handwriting text so that it can learn to classify the characters that appear in the test set accurately. General strategies for feature detection in computer vision are applicable to this sort of optical character recognition, which is commonly observed in practical handwriting recognition and indeed in most current optical character recognition programs. Due to the calculation of oscillatory motion, the recognition rate gets improved by using the

Title

Handwriting recognition using sinusoidal model parameters

Multilingual Character Segmentation and recognition schemes for Indian document images

Separation of handwritten and machine-printed texts from noisy document using contourlet transform

Year

2018

2018

2018

Issues addressed

Approaches

ParulSahare, Sanjay B. Dhok

ParulSahare, Sanjay B. Dhok

To make paperless environment in work place a separation algorithm is implemented to get a handwriting character and machine printed text character from the documents

They developed a new [7] algorithm for segmentation of character and also for handwritten text recognition Contourlet transform

SVM classifier

Himakshi Choudhury, Handwritten character is HMM and SVM S R Mahadeva created through an Prasanna oscillatory development of the hand. In order to calculate them they implemented a sinusoidal parameter to find the characters

Author

Table 1 Comparative analysis of the technologies and techniques

Identifications are [16] high only if the features are combined

Expanding pursues four does never again give parts advancement in exactness

A proposed model can be implemented only for particular handwritings

Merits/Demerits

(continued)

Increase in detection of recall rate 98.9% is obtained which demonstrates its effectiveness

Most extreme division and discovery rates are 98.86 and 99.84%, separately got

Due to the [6] calculation of oscillatory motion the recognition rate gets improved

Conclusion


Title

A ranking-based feature selection approach for handwritten character recognition

A method for handwritten word spotting based on particle swarm optimization and multi-layer perceptron

Deformed character recognition using convolutional neural network

Year

2018

2018

2018

Table 1 (continued)

N. ShobhaRani, N. Chandan,A. Sajanjain,H. R. Kiran

Mohammadreza Revenooer

Ciliaa, Stefanoa, Francesco Fontanellaa, Alessandra Scotto di Frecaa

Author

Approaches

Particle swarm optimization

Recognition of Convolutional neural degraded character network images from the ancient Kannada poetry report and additionally at the handwritten character pix which are amassed from diverse un restricted environment

To recognize the characters in the handwritten [15] document they include test and train phase which is implemented using multilayer perceptron neural network

They addressed feature Feature- ranking extraction plays a major [18] role to recognize the characters in the handwritten documents that minimize the complexity level

Issues addressed

Semantic evaluation is completed [3] all through the after-processing level of OCR

High variance slows a network convergence to a solution which increase the computational time

assessing the connections among highlights needs high computational time

Merits/Demerits

(continued)

A normal accuracy 87% is achieved for the duration of the popularity of printed character

This implementation has reached best when compared to previous other methodologies

A good classification result is obtained using a reduced set of features and it improves the efficiency

Conclusion


Title

Handwritten number and English characters recognition system

Object and character recognition using spiking neural network

Year

2017

2017

Table 1 (continued) To perceive a manually written numerals and furthermore English character dependent on backpropagation neural system

Issues addressed BP Neural Network_(backpropagation)

Approaches

Snehali Gadariyeb, S. The abnormal spacing Spiking neural network Chaturvedic, A. A. [2] between phrases and Khurshi variations in writing fashion they implemented histogram and edge detection to extract the feature

Wei Li, XiaoxuanHe, Chao Tang, Keshouwu, Xuhui Chen, ShaoyongYu, Yuliang Lei, Yanan Fang, Yuping Song

Author

The designing the system for recognizing [14] the hand written number and English letter. This proposed method and implementation reached 78% accuracy

Conclusion

(continued)

This implementation They conclude by does not achieve showing efficient state of art methods way of limiting a parameters is mandatory to gain efficiency to overcome and using the SNN they improved the performance

A limited number of test case are trained. So that reorganization rate is low. In this proposed method deep learning algorithm is not used for better result

Merits/Demerits

122 G. S. Monisha and S. Malathi

Con-text: text detection for fine-grained object classification

Text line segmentation using a fully convolutional network in handwritten document images

2017

Quang Nhat Vo, Soo Hyung Kim, Hyung Jeong Yang, Guee Sang Lee

Sezer Karaoglu, Ran Tao, Jan C. van Gemert, and Theo Gevers

In scanned documents detecting line is an important problem for processing of handwritten texts

Fully convolutional network (FCN)

To perceive fine-grained OCR engine, condition of object [4] grouping workmanship character utilizing perceived recognition scene message in common pictures

Between casing content qualification to acknowledgment on change edge of still news subtitle

2017

WataruOhyama, Devoted OCR for Tetsushi Wakabayashi perceiving low choice and Fumitaka Kimura data inscription in video photos

Recognition and transition frame detection of Arabic news captions for video retrieval

Approaches

2017

Issues addressed

Local blur correction Van Cuong Kieu, Clarity of images files Document image [5] analysis for document images Florence Cloppet and to get the text character and recognition (DIAR) Nicole Vincent from the document with a minimum number of errors

2017

Author

Title

Year

Table 1 (continued)

Segmentation of touching characters is not possible

Text detection technique does no longer at once locate text region

Degrades an optical character recognition rate (OCR) is blurring

shortcomings that corrupts OCR proficiency obscuring

Merits/Demerits

(continued)

This method adapts to different [12] types of inputs fed. Performance level gets improved

To coordinate literary with noticeable signals for top notch grained class and logo recovery

To recognition the progress [1] of casing effectively with the F-degree is increase to 90%

A haze rectification is finished principally dependent on dark scale security method in a close by window without utilizing any haze model estimation

Conclusion

Effective Survey on Handwriting Character Recognition 123

Text line detection in Byeongyong Ahn, To overcome the [20] degraded historical Jewoong Ryu, Hyung problem of historical document images Il Koo and Nam Ik document degradation Cho they implemented document image binarization and line detection methods

2017

A conventional wide-spread and completely solo word box idea technique

Approaches

Document binarization

They signify the writing block wise local binary style of every author count (BW-LBC) and it’s miles applied to a lot of associated added substances removed and trimmed from examined penmanship samples in which each categorized thing is visible as a texture photo

Textual contents [17] in pictures for satisfactory-grained commercial enterprise region category and brand recovery

Effective feature Abderrazak Chahi, descriptor-based new Issam El khadiri, framework for Youssef El merabet off-line text-independent Writer Identification

Issues addressed

2017

Sezer Karaoglu, Ran Tao, Theo Gevers, and Arnold W. M. Smeulders

Words matter: scene text for image classification and retrieval

2017

Author

Title

Year

Table 1 (continued)

It is hard to locate the ordinary relationship between the scale esteems and the current execution

Model experimented on simple databases and not suitable for challenging databases

High retrieving in mind in phrase detection is extra applicable than high f-score

Merits/Demerits

(continued)

Overall execution on degraded historical document with character detection is improved with efficiency

Experimental performance [19] of the English character recognition competitive technology for the existing systems

An effective textual cue is created for best-grained commercial enterprise location class and Logo retrieval

Conclusion

124 G. S. Monisha and S. Malathi

Learning spatial-semantic context with fully convolutional recurrent network for online handwritten Chinese text recognition

Contour restoration of text components for recognition in video/scene images

2017

2016

Convolution neural network CNN

Stroke candidate pixels (SPC)

The more than one [11] Multi-spatial-context fully spatial context from the convolutional recurrent mark capacity maps and network (MC-FCRN) produce forecast grouping while totally keeping off the troublesome division issue

Wu, Y., Shiva kumara, To restore a total P., Lu, T., Tan, C. L., character forms in Blumenstein, M., & video/scene pictures Kumar, G. H

Zecheng Xie, Zenghui Sun, Lianwen Jin ∗ , Hao Ni and Terry Lyons

Bengali handwritten Bishwajit Purkaystha, To recognize Bengali character recognition Tappos Datta, Md hand written character using deep Saiful Islam by using deep CNN convolution neural network

Approaches

To recognize the Diagonal-based feature English characters extraction picture elements and the picture elements along the central level and upright range are calculated

2017

Issues addressed

Handwritten K. Vijayalakshmi, S. character recognition Aparna, Gayatri using diagonal-based Gopal and W. Jino feature extraction Hans

2017

Author

Title

Year

Table 1 (continued)

It is totally expensive and critical to perform in low level images

LM practices a large amount of time is taken for training case of large corpus

Error suffered by model [8] in detection work are due to highly proximity in features of the character

Maximum number times the network incorrectly detect the vowels

Merits/Demerits

(continued)

It is computationally expensive [22] and tough to carry out in low level images

Correct rate of recognizing a character is 97.50% and 96.58% with accuracy

Maximum identification of Bengali character is 89.93%

Classification of characters is improved [21] highly with better recognition rate

Conclusion

Effective Survey on Handwriting Character Recognition 125

Title

A new approach to extract text from images based on DWT and K-means clustering

End-to-end online writer identification with recurrent neural network

Year

2016

2016

Table 1 (continued)

Xu-Yao Zhang, Guo-Sen Xie, Cheng-Lin Liu, and Yoshua Bengio

Deepika Ghai, Divya Gera, Neelu Jain

Author DWT, sliding window

Approaches

A end to end system Recurrent neural network [13] for content (RNN) autonomous author recognizable proof in on the web

A surface based book extraction technique utilizing DWT with K-implies bunching

Issues addressed

Reasonable utilization for this methodology is bound and constrained, because of the prerequisite of composing with consistent content substance

The exhibition of this methodology is wasteful while the image is high illuminated and growth in resolution of picture processing time additionally improved

Merits/Demerits

(continued)

A distinguishing proof of start to finish essayist structure was actualized by utilizing the RNN innovation it is carefully manage the information of penmanship character which are available in online

Text extraction is done [23] using the k means clustering techniques

Conclusion

126 G. S. Monisha and S. Malathi

An end-to-end Baoguang Shi, Xiang trainable neural Bai network for image-based sequence recognition and its application to scene text recognition

Recognition based Archit Shah, Santanu text localization from Chaudhury natural scene images

2016

2016

ImenChtourou, CheikhRouhou, FatenKallelJaiem, Slim Kanoun

ALTID: Arabic/Latin text images database for recognition research

2016

Author

Title

Year

Table 1 (continued)

Ordering/recovery of pictures or on the other hand recordings dependent on their content substance, scene understanding in machine interpretation

The problem of scene [9] textual content reputation, that’s most of the maximum significant and hard errands in picture based absolutely gathering acknowledgment

To recognition of text, identification of writer, verification and forms analysis by pre-processing

Issues addressed

Scene text localization [25] is the methodologies used for textual content reputation image recognizer inside a particular Feedback system to recursively look for literary substance locales

Novel neural network

Texture analysis technique

Approaches

Restricting and perceiving coincidental or focused on content pixels or videos

The sequence prediction of the images is very difficult

Script, font and writer identification is based on assumption and no experiments are performed

Merits/Demerits

(continued)

A tale structure is worked for content limitation in scene picture and content is perceived

Alternation of the novel neural network architecture, is said to be perceive the pictures of content in scenes

Novel database is created and contains disconnected [24] Arabic/Latin machine printed content picture and manually written character. Content was readied utilizing a similar procedure to make APTID/MF database

Conclusion

Effective Survey on Handwriting Character Recognition 127

Title

Text detection and recognition in natural scene with edge analysis

Year

2015

Table 1 (continued) Issues addressed

Approaches

Chong Yu, To detect and Edge analysis YonghongSong, Quan recognition the text with Meng, Yuanlin edge analysis Zhang, Yang Liu

Author Extraction of [10] stroke width of the character not detected properly due to light reflection and some backgrounds completion

Merits/Demerits

They improved the performance of proposed system compared to other system by detecting the text with edge analysis

Conclusion

128 G. S. Monisha and S. Malathi

Table 2 Comparison of filters

Types of filters | Salt and pepper noise (%) | Gaussian noise (%) | Speckle noise (%)
2D median filter | 92.58 | 99.51 | 94.83
Mean filter | 74.07 | 76.45 | 83.36
Wiener filter | 80.6 | 98.48 | 78.57

The surveyed works build on techniques such as HMM and SVM for recognizing handwritten characters, blurring is repeatedly reported to degrade the OCR rate, and the ALTID database, which contains machine-printed as well as handwritten content, was prepared using a procedure similar to that of the APTID/MF database. The comparison of the different filters gives an effective way to remove the different noises in handwritten images. Moreover, the 2D median filter plays the major role in eliminating the different noises in the handwritten image. Table 2 shows the comparative analysis of the different filters, and Graph 1 shows the comparison of the filters in percentage. The 2D median filter is the most effective among the analyzed filters: compared with the other two filters, it shows high efficiency in the reduction of unwanted noise in the images. In this research, preprocessing therefore uses the 2D median filter, because the PSNR values of the other filters are low in comparison; segmentation in this process uses the adaptive mean shift algorithm. Compared with other neural networks, a deep neural network (DNN) works effectively for extracting features of handwritten characters, whereas a convolutional neural network (CNN) consists of a combination of two processes, used for feature detection and feature map identification. A convolution neural network (CNN) [3, 8] does not produce accurate extraction of the character features in a handwritten image owing to confusion in the stroke edges of the handwritten characters. Deep neural networks are an efficient way to extract stroke edges of handwritten characters compared with the convolution neural network. After the extraction of the features, the features are compared with the OCR template and the result is displayed.

Graph 1 Comparison of filters (2D median filter, mean filter and Wiener filter under salt and pepper, Gaussian and speckle noise)
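The preprocessing step compared above can be illustrated with a short sketch. The example below is only an illustration, not the implementation evaluated in Table 2: it corrupts a small synthetic character image with salt-and-pepper noise, applies a 3x3 2D median filter with SciPy, and reports the PSNR before and after; the noise level and image are assumptions made for demonstration.

```python
# Illustrative sketch (not the surveyed systems' code): 2D median filtering of a
# noisy handwritten-character image and PSNR computation.
import numpy as np
from scipy.ndimage import median_filter

def add_salt_and_pepper(img, amount=0.05, rng=np.random.default_rng(0)):
    """Corrupt a grayscale uint8 image with salt-and-pepper noise."""
    noisy = img.copy()
    mask = rng.random(img.shape)
    noisy[mask < amount / 2] = 0          # pepper
    noisy[mask > 1 - amount / 2] = 255    # salt
    return noisy

def psnr(reference, test):
    """Peak signal-to-noise ratio in dB for 8-bit images."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)

# A synthetic 64x64 "character" image stands in for a scanned sample.
clean = np.zeros((64, 64), dtype=np.uint8)
clean[16:48, 28:36] = 255                 # a simple vertical stroke

noisy = add_salt_and_pepper(clean, amount=0.05)
denoised = median_filter(noisy, size=3)   # 3x3 2D median filter

print(f"PSNR noisy    : {psnr(clean, noisy):.2f} dB")
print(f"PSNR denoised : {psnr(clean, denoised):.2f} dB")
```

With this kind of impulse noise the median filter typically restores most pixels exactly while a mean filter would smear the stroke, which is consistent with the relative scores reported in Table 2.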


5 Conclusion The current technologies are still moderate in providing accuracy in the recognition of handwritten characters, while the applications of handwritten character recognition are enormous. The latest developments in technology have pushed the limits further, allowing older equipment that posed inconvenience in use to be retired. Techniques for the recognition of handwritten text by segmenting and classifying the characters were surveyed in this work. The problems in handwritten text written by different people were identified after carefully analyzing the text, and to solve these issues new strategies have been developed for segmentation, feature extraction and recognition. The deep neural network provides a greater extent of recognition of handwritten characters: because recognizing the stroke edges in human handwriting is more difficult with the other technologies, the deep neural network (DNN) provides an efficient way, and the recognition rate of the DNN is high compared with the other technologies. CNN provides accuracy but has difficulty in recognizing the curves and bends in a character; that limitation is overcome by using the DNN. In future, the research will continue with current technologies to enhance the accuracy rate of handwritten character recognition.

References 1. Iwata S, Ohyama W, Wakabayashi T, Kimura F (2016) Recognition and transition frame detection of Arabic news captions for video retrieval. In: 2016 23rd international conference on pattern recognition (ICPR). https://doi.org/10.1109/icpr.2016.7900260 2. Bawane P, Gadariye S, Chaturvedi S, Khurshid AA (2018) Object and character recognition using spiking neural network. Mater Today Proc 5(1):360–366. https://doi.org/10.1016/j.matpr. 2017.11.093 3. Shobha Rani N, Chandan N, Sajan Jain A, Kiran R, H. (2018) Deformed character recognition using convolutional neural networks. Int J Eng Technol 7(3):1599. https://doi.org/10.14419/ ijet.v7i3.14053 4. Karaoglu S, Tao R, van Gemert JC, Gevers T (2017) Con-Text: text detection for fine-grained object classification. IEEE Trans Image Process 26(8):3965–3980. https://doi.org/10.1109/tip. 2017.2707805 5. Kieu VC, Cloppet F, Vincent N (2016) Local blur correction for document images. In: 2016 23rd international conference on pattern recognition (ICPR).doi: https://doi.org/10.1109/icpr. 2016.7900269 6. Choudhury H, Prasanna SRM (2018) Handwriting recognition using sinusoidal model parameters. Pattern Recogn Lett. https://doi.org/10.1016/j.patrec.2018.05.012 7. Sahare P, Dhok SB (2018a) Multilingual character segmentation and recognition schemes for Indian document images. IEEE Access 6:10603–10617. https://doi.org/10.1109/access.2018. 2795104 8. Purkaystha B, Datta T, Islam MS (2017) Bengali handwritten character recognition using deep convolution neural network. In: 2017 20th international conference of computer and information technology (ICCIT). https://doi.org/10.1109/iccitechn.2017.8281853


9. Shi B, Bai X, Yao C (2017) An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans Pattern Anal Mach Intell 39(11):2298–2304. https://doi.org/10.1109/tpami.2016.2646371 10. Yu C, Zhang Y, Liu Y, Meng Q, Song Y (2015) Text detection and recognition in natural scene with edge analysis. IET Comput Vis 9(4):603–613. https://doi.org/10.1049/iet-cvi.2013.0307 11. Xie Z, Sun Z, Jin L, Ni H, Lyons T (2018) Learning spatial-semantic context with fully convolution recurrent network for online handwritten chinese text recognition. IEEE Trans Pattern Anal Mach Intell 40(8):1903–1917. https://doi.org/10.1109/tpami.2017.2732978 12. Vo QN, Kim SH, Yang HJ, Lee GS (2018) Text line segmentation using a fully convolutional network in handwritten document images. IET Image Proc 12(3):438–446. https://doi.org/10. 1049/iet-ipr.2017.0083 13. Zhang X-Y, Xie G-S, Liu C-L, Bengio Y (2017) End-to-end online writer identification with recurrent neural network. IEEE Trans Hum-Mach Syst 47(2):285–292. https://doi.org/10.1109/ thms.2016.2634921 14. Li W, He X, Tang C, Wu K, Chen X, Yu S, … Song Y (2016) Handwritten numbers and English characters recognition system. Intell Data Anal Appl 145–154. doi:10.1007/978-3-319-484990_18 15. Tavoli R, Keyvanpour M (2018) A method for handwritten word spotting based on particle swarm optimisation and multi-layer perceptron. IET Softw 12(2):152–159. https://doi.org/10. 1049/iet-sen.2017.0071 16. Sahare P, Dhok SB (2018b) Separation of handwritten and machine-printed texts from noisy documents using contourlet transform. Arab J Sci Eng. https://doi.org/10.1007/s13369-0183365-1 17. Karaoglu S, Tao R, Gevers T, Smeulders AWM (2017) Words matter: scene text for image classification and retrieval. IEEE Trans Multimedia 19(5):1063–1076. https://doi.org/10.1109/ tmm.2016.2638622 18. Cilia ND, De Stefano C, Fontanella F, Scotto di Freca A (2018) A ranking-based feature selection approach for handwritten character recognition. Pattern Recogn Lett. https://doi.org/ 10.1016/j.patrec.2018.04.007 19. Chahi A, El Khadiri I, El Merabet Y, Ruichek Y, Touahni R (2018) Effective feature descriptorbased new framework for off-line text-independent writer identification. In: 2018 international conference on intelligent systems and computer vision (ISCV). https://doi.org/10.1109/isacv. 2018.8354072 20. Ahn B, Ryu J, Koo HI, Cho NI (2017) Textline detection in degraded historical document images. EURASIP J Image Video Process 1. https://doi.org/10.1186/s13640-017-0229-7 21. Vijayalakshmi K, Aparna S, Gopal G, Hans WJ (2017) Handwritten character recognition using diagonal-based feature extraction. In: 2017 international conference on wireless communications, signal processing and networking(WiSPNET). https://doi.org/10.1109/wispnet.2017.829 9949 22. Wu Y, Shivakumara P, Lu T, Tan CL, Blumenstein M, Kumar GH (2016) Contour restoration of text components for recognition in video/scene images. IEEE Trans Image Process 25(12):5622–5634. https://doi.org/10.1109/tip.2016.2607426 23. Ghai D, Gera D, Jain N (2016) A new approach to extract text from images based on DWT and K-means clustering. Int J Comput Intell Syst 9(5):900–916. https://doi.org/10.1080/187 56891.2016.1237189 24. Chtourou I, Rouhou AC, Jaiem FK, Kanoun S (2015) ALTID : Arabic/Latin text images database for recognition research. In: 2015 13th international conference on document analysis and recognition (ICDAR). https://doi.org/10.1109/icdar.2015.7333879 25. 
Ray A, Shah A, Chaudhury S (2016) Recognition based text localization from natural scene images. In: 2016 23rd international conference on pattern recognition (ICPR). https://doi.org/ 10.1109/icpr.2016.7899796

Enhancement in Braille Systems—A Survey K. M. R. Navadeepika and V. D. Ambeth Kumar

Abstract Education plays an important role in the growth of a person's skills in society, but there are many obstacles, occurring globally, that must be considered to achieve this goal. One of them concerns blind people, who need education to reduce the effect of their difficulties. The aim is to provide Braille conversion for all the visually impaired so that they can gain knowledge. The goal of this work is to design and build a Braille system and output devices for visually impaired people that enable them to interact and communicate. This study proposes an algorithm which enables the user to convert the text that we normally have in our daily use into a Braille script and thereby assist the visually impaired. The software that has been created has an intuitive and simple design that enables the end user to read easily. Keywords Braille · Tactile · Visually impaired

1 Introduction Visually impaired people are those who do not have a good capacity to view things as a person with normal vision does. People with normal vision can use contact lenses or glasses to improve their vision, whereas visually impaired people cannot rely on lenses or glasses to improve their vision [1]. They cannot even perform their daily activities easily and need other people's help to fulfill their needs. The visually impaired face many complexities in their daily activities such as moving out, eating, reading, interacting with others, driving, and many other tasks. They obtain information by listening to audio and by reading enlarged text. In order to survive in this competitive society, visually impaired people should K. M. R. Navadeepika (B) · V. D. Ambeth Kumar Panimalar Engineering College, Chennai, India e-mail: [email protected] V. D. Ambeth Kumar e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_10


develop their skills in all possible areas and should be increasingly proficient in education. Various technologies are available to give visually impaired people access to materials [2]. One of the most popular materials is the Braille book; to obtain knowledge from a Braille book, blind people use tactile dot technology. The first Braille system was developed by Louis Braille for the visually impaired, and there is an organization called the World Blind Union which is present in around 190 countries. However, the practical availability of such books is very limited, which is a drawback. Various other technologies have been developed for them, such as voice messages as an output, which help them to read the text. To correct blur and improve focus, corrective lenses are worn on the eye; hyperopia, myopia and presbyopia are corrected using lenses. But such corrective lenses can only adjust the focus for people with low vision. These corrective lenses are normally prescribed by an ophthalmologist, and the prescription requires a considerable number of important determinations to make the lens [3], commonly including the power determination of each eye. According to the study, it is estimated that around 130 crore individuals live with a type of visual disability, 18.85 crore people have low vision, 21.7 crore individuals are visually impaired from moderate to severe, and 82.6 crore people are close to near blindness. Everyone should learn to read and write. Several studies show that the ability to read and write Braille directly corresponds with academic excellence and employment among the blind and visually impaired [4]. A person who reads Braille can grow independently in society as Braille spreads increasingly everywhere. Blind people deserve this chance at equality, and this is something that Braille gives every individual to prove themselves. Speech or any other digital tools are no substitute for Braille; listening alone is not enough for an individual to succeed in life. Blind people use a system for reading and writing called the Braille system (a tactile system), generally written on embossed paper [5]. Individuals who use Braille can read computer screens and other electronic supporting systems using refreshable Braille displays. In this paper, we examine various problems that have been recognized and solved in various papers; many ideas and suggestions have been given over the years and implemented for blind individuals. Section 2, Related Works, covers the work done in the various papers together with their techniques, limitations, and the advantages of one over the other. Section 3 gives the comparative analysis of various tools and the accuracy observed. Section 4 contains the conclusion of the work done and describes the proposed enhancement of the system. Finally, the papers which are referred to are added as references.


2 Related Works The Braille system is a widely used method, and several studies show that the ability to read and write Braille directly corresponds with academic excellence and employment among the blind and visually impaired. In the paper by Raghunandan et al. [1], the main objective is to develop a Braille system and an output device for blind people, so that it helps them to interact with other people. The advantages reported in this paper are that the internal hardware circuit can be adjusted and that the system is lightweight, flexible, needs little power and is very easy to operate. In the paper by De Silva et al. [6], the given system helps blind and visually impaired people to overcome their difficulties and everyday struggles in their work, and they can also carry out typing without any help from others; however, this language translator supports Sinhala, Tamil, and English only. The paper by Ferro et al. [2] aims to make available a system that easily generates tactile graphics for people who need the information present in an image: an image is decomposed and simplified based on the features of the objects represented in it, and is then rendered in physical tactile form. Nian-Feng L et al. [7] described a system where images captured by a camera are used to recognize Braille characters, which are translated into Chinese characters simultaneously and marked automatically onto Braille paper. Ng et al. [3] present an automatic system used to identify Braille pages and convert Braille documents into English or Chinese text; the system handles both single-sided and double-sided pages, with 100 and 97% accuracy, respectively. In the paper of Tripathy et al. [5], a hand glove is developed to connect wirelessly with Android devices for the visually impaired, which helps them to use an application on the Android device. The work of Domale et al. [8] provides the chance to have an audio book, in English and Marathi, of any printed book in English, Marathi or even Braille script. The paper of Rupali et al. [9] presents an approach to extract and recognize text from images and convert the extracted text into speech for the visually impaired. The paper of Bawdekar et al. [10] creates an algorithm to convert text into Braille script, which will help the visually impaired to learn. In the paper of Balasuriya et al. [4], an application is developed for children between the ages of 6 and 14 for their primary education, to read, write and identify objects without the help of any third party, using region-based convolutional networks. The text image to Braille code converter by Shwethashree et al. [11] captures an image using a camera; the captured image is processed and converted into text using MATLAB, and the detected text is transferred to an Arduino through serial communication. The Arduino then converts each character into Braille code using servomotors. This system is available at low cost.


The paper Designing Braille copier based on image processing techniques by Malik et al. [12] describes a Braille copying machine which produces Braille documents in the exact format regardless of the language used. The machine can work as a two-in-one system for copying and printing; that is, a Braille page is copied in the same way as an ordinary printed text is copied. The method needs optical recognition and image processing techniques, and Braille documents have been successfully printed for both single- and double-sided pages. Visual to tactile conversion of vector graphics by Krufka et al. [13] proposed an algorithm to extract object boundaries and classify the process of determining the crucial outlines in a hierarchical graphical structure; the output is printed as tactile content using a Braille printer. In Experiences with lower-cost access to tactile graphics in India by Kaleem Rahman et al. [14], tactile graphics allow blind people to identify two-dimensional images, which is an important part of their studies; the authors give a solution for converting several regular images and printing them as tactile images at the lowest cost, a low-cost device meant only for Braille text. Automatic visual to tactile translation, part 1, by Way [15], is the first part of a two-part paper that converts available images into tactile form automatically; a broad background is given on the human tactile system and blindness, the technology is used for producing tactile graphics, and image processing is used in its pipeline. A text-to-Braille scanner with an ultra-low-cost refreshable Braille display [16] provides a scanner to convert English text to Braille: a text scanner is provided in the mobile application, currently using the optical character recognition engine from Google; it is open-source software which can also work without Internet facilities, and the application also provides assistive voice guidance. Converting a Braille script into voice leads to an easy way of communication for the deaf and blind: the paper by Krishna Kishore et al. [17] describes a method of providing a message and a voice conversion of what deaf and blind people want to say; a special keyboard is developed for blind people so that they can send text messages as well as voice conversions, implemented on the Arduino platform. The conversion of 2D mathematical equations to linear form for transliterating into Braille, an aid for visually impaired people, by Jariwala et al. [18], is a method used to convert 2-D mathematical expressions into Braille code so that every individual can study mathematics well; after transforming into Braille, the updated content is stored as a text file which is then printed using a Braille embosser. There is also a cost-effective system made with a real-time hand glove [19]: the Braille hand glove is built using slot sensors and vibration motors, through which users can write, read, and send text messages, and they can also read e-mails using this hand glove. Choudary et al. [20] designed a wired keyboard with six dots, like Braille lipi, which can be interfaced with laptops and desktops; it has six buttons which represent the six dots, and it also has an audio jack so that blind people can also hear what they have typed. It is connected to the PC as open-source software.


Table 1 Accuracy of the tools

Tools | Language | Numbers | Symbols | Formula
English translator | Only in English | No | No | No
Paths to literacy | English | Yes | Partially | No
Braille translator | English and other languages | Yes | Yes | Partially
Braille in English | English | Yes | No | No
Proposed system | English | Yes | Yes | Yes

3 Comparative Analysis Table 1 represents the accuracy of converting text, numbers, and expressions. The words, numbers, and expressions are converted based on alphanumeric characters, punctuation, symbols, and signs. To evaluate performance, a document of about 1000 words (66,623 characters in four paragraphs) was converted based on the alphanumeric characters, punctuation, symbols, and signs present in the document. On this basis, 1000 words are converted into Braille within 17.254 s. In the table, various online tools are compared against our proposed system: some tools convert only the language text, others convert symbols partially, and some also convert numbers, whereas our proposed system works efficiently on all the above-mentioned categories.
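As an illustration of the kind of conversion being benchmarked here, the sketch below maps plain English text to uncontracted (Grade 1) Braille using the Unicode Braille patterns block. It is only a simplified example under assumed conventions (letter-by-letter mapping, a number sign before digits, a capital sign before uppercase letters, no contractions) and is not the system evaluated in Table 1.

```python
# Illustrative sketch (assumed conventions, not the evaluated system):
# convert plain English text to uncontracted (Grade 1) Unicode Braille.

# Dot numbers (1-6) for each lowercase letter in standard English Braille.
LETTER_DOTS = {
    "a": "1",    "b": "12",   "c": "14",   "d": "145",  "e": "15",
    "f": "124",  "g": "1245", "h": "125",  "i": "24",   "j": "245",
    "k": "13",   "l": "123",  "m": "134",  "n": "1345", "o": "135",
    "p": "1234", "q": "12345","r": "1235", "s": "234",  "t": "2345",
    "u": "136",  "v": "1236", "w": "2456", "x": "1346", "y": "13456",
    "z": "1356",
}
DIGIT_TO_LETTER = dict(zip("1234567890", "abcdefghij"))

def cell(dots: str) -> str:
    """Build a Unicode Braille cell (U+2800 block) from its dot numbers."""
    code = 0x2800
    for d in dots:
        code |= 1 << (int(d) - 1)
    return chr(code)

NUMBER_SIGN = cell("3456")    # precedes a digit sequence
CAPITAL_SIGN = cell("6")      # precedes an uppercase letter

def to_braille(text: str) -> str:
    out, in_number = [], False
    for ch in text:
        if ch.isdigit():
            if not in_number:
                out.append(NUMBER_SIGN)
                in_number = True
            out.append(cell(LETTER_DOTS[DIGIT_TO_LETTER[ch]]))
        elif ch.lower() in LETTER_DOTS:
            in_number = False
            if ch.isupper():
                out.append(CAPITAL_SIGN)
            out.append(cell(LETTER_DOTS[ch.lower()]))
        else:
            in_number = False
            out.append(ch)    # keep spaces and punctuation as-is
    return "".join(out)

print(to_braille("Braille 2021"))
```

A full converter such as those compared in Table 1 additionally has to handle contractions, punctuation, mathematical notation and multiple languages, which is where the accuracy differences between the tools arise.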

4 Conclusion The tactile technologies that have been developed help the visually impaired to guide themselves in their studies without assistance. Educational data in machine-readable text and graphical representations of data in electronic Braille help them in their learning process. Although tactile technologies have been enhanced considerably, there is still much room for development in real-time applications, and the current technologies are still only moderate in providing education in a better way to visually impaired individuals. In this project, the characters in a textbook are converted into Braille text to help the visually impaired by providing them with better material for further education. The results so far show good accuracy, but this should be improved further. The method basically focuses on work which can efficiently extract the features of each individual character. This paper reviews various technologies used for the tactile conversion of various images and words; different languages have also been converted so that they are easy to access. However, the Braille converter can still be turned into a low-cost, widely available device, and the conversion of mathematical notation and graphics can be improved in upcoming projects.


The applications of this text recognition are extensive. Nowadays, recent advancements in technology have pushed the limits further, allowing older equipment that posed inconvenience in use to be retired. In the future, more complex mathematical notation can be converted into Braille code, which will make the world a better place for visually impaired people and let them move away from older materials. Multiple languages can also be converted to Braille format to achieve a better outreach.

References 1. Raghunandan A, Anuradha MR (2017) The methods used in text to Braille conver sion and vice versa. Int J Innov Res Comput Commun Eng IJIRCCE 5(3) 2. Ferro TJ, Pawluk DT (2013) Automatic image conversion to tactile graphic. In: Proceedings of the 15th international ACM SIGACCESS conference on computers and accessibility. ACM, p 39 3. Ng C, Ng V, Lau Y (1999) Regular feature extraction for recognition of Braille. In: Proceedings third international conference on computational intelligence and multimedia applications. ICCIMA’99 (Cat. No. PR00300). IEEE, pp 302–306 4. Balasuriya B, Lokuhettiarachchi N, Ranasinghe A, Shiwantha K, Jayawardena C (2017) Learning platform for visually impaired children through artificial intelligence and computer vision. In: 2017 11th international conference on software, knowledge, information management and applications (SKIMA). IEEE, pp 1–7 5. Tripathy AK, D’Sa M, Alva R, Fernandes J, Joseph AL (2015) Finger Braille: tactile communication for differently abled. In: 2015 international conference on technologies for sustainable development (ICTSD). IEEE, pp 1–5 6. De Silva P, Wedasinghe N (2013) Braille converter and text-to-speech translator for visually impaired people in sri lanka. In: Proceedings of the 15th international ACM SIGACCESS conference on computers and accessibility. ACM, p 39 7. Nian-Feng L, Li-rong W (2011) A kind of braille paper automatic marking system. In: 2011 international conference on mechatronic science, electric engineering and computer (MEC). IEEE, pp 664–667 8. Domale A, Padalkar B, Parekh R, Joshi M (2013) Printed book to audio book converter for visually impaired. In: 2013 Texas instruments India Educators’ conference. IEEE, pp 114–120 9. Dharmale R, Ingole P (2015) Text detection and recognition with speech output for visually challenged person: a review. Int J Eng Res Appl 5(3):84–87 10. Bawdekar K, Kumar A, Das R (2016) Text to Braille converter. Int J Electron Commun Eng Technol (IJECET) 7:54–61 11. Shwethashree S, Sowmya SK, Sri Ranjini, Vanaja N, Parameshwara MC (2018) Text image to Braille code converter. Int J Eng Res Electron Commun Eng (IJERECE) 5 12. Al-Salman AMS, El-Zaart A, Al-Suhaibani Y, Al-Hokail K, Gumaei A (2014) Designing braille copier based on image processing techniques. Int J Soft Comput Eng (IJSCE) 4(5) 13. Krufka SE, Barner KE, Aysal TC (2007) Visual to tactile conversion of vector graphics. IEEE Trans Neural Syst Rehabil Eng 15(2):310–321 14. Dias MB, Rahman MK, Sanghvi S, Toyama K (2010) Experiences with lower-cost access to tactile graphics in India. In: Proceedings of the first ACM symposium on computing for development. ACM, p 10 15. Way TP, Barner KE (1997) Automatic visual to tactile translation. i. Human factors, ac- cess methods and image manipulation. IEEE Trans Rehab Eng 5(1):81–94 16. Hossain S, Raied AA, Rahman A, Rahabin Z, Adhikary AD (2018) Text to Braille scanner with ultra low cost refreshable Braille display. Bangladesh University of Engineering and Technology Department of Electrical and Electronic Engineering


17. Krishna Kishore K, Prudhvi G, Naveen M, Text to Braille scanner with ultra low cost refreshable Braille Display 18. Jariwala NB, Patel B (2017) Conversion of 2D mathematical equation to linear form for transliterating into Braille: an aid for visually impaired people. Int JInnov Res Sci Eng Technol 6(4) 19. Shah M, Shah T, Tiwari R, Thakar R (2018) Real-time communication Braille glove for deaf and blind. Int Adv Res J Sci Eng Technol 5(3) 20. Choudhary T, Kulkarni S, Reddy P (2015) A Braille-based mobile communication and translation glove for deaf-blind people. In: 2015 international conference on pervasive computing (ICPC)

A Critical Review on Use of Data Mining Technique for Prediction of Road Accidents Navdeep Mor, Hemant Sood, and Tripta Goyal

Abstract Road accidents in India have become one of the serious issues that need to be resolved. Statistics reveal that about 3% of India's total gross domestic product (GDP) is lost to road crashes. This issue causes not only a social problem but a financial burden to the country as well. In order to reduce accidents, several techniques have been proposed by different researchers, but modeling road accidents using computer science techniques has gained increasing attention over the years because of its power to predict road accidents with high accuracy. In practice, road crashes are rare events, so the binary dependent variable is characterized by events (accidents) that are dozens to thousands of times rarer than non-events (non-accidents). This paper introduces a review of data mining (DM) applications for the analysis of road crash data in road safety and of the most commonly available data mining software in the market, with their advantages and disadvantages. The study concludes that the DM technique can be used by different government agencies and departments for sustainable road management. Keywords Data mining · Modeling · Accident prediction · Traffic accidents

1 Introduction Road transport promotes the development of the economy and movements of goods and freight, and helps in improving the welfare of the citizens in that area. With excessive growth of road networks and registered vehicles, road accidents are also increasing with it [1]. Road crashes impose severe problems to the society in terms of economic costs, human costs, medical costs, and property damage costs. The ultimate loss in road crashes is loss of life which is not acceptable in sustainable road N. Mor (B) · H. Sood Civil Engineering Department, NITTTR, Chandigarh, India e-mail: [email protected] T. Goyal Civil Engineering Department, PEC (Deemed-to-be-University), Chandigarh, India © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_11


development [2]. The major concern of road safety decision-makers and policymakers is to understand the different factors associated with road accidents, as this is a very complex phenomenon. The level of traffic crashes is such that one person dies every 3–4 min as per the latest report published by MORTH, 2017, which shows the seriousness of the issue [3]. In order to understand the pattern of these crashes, black spots, and the prediction of accidents, we need an advanced tool that can analyze all these parameters and give results with high accuracy [4]. Today, modern computer science techniques are available to predict road accidents, and these techniques are accepted globally; some of them include DM, ANN, machine learning, etc. In this study, our primary focus is on DM, its advantages and disadvantages, and its application in the field of transportation. The core objective of the study is to introduce the concept of data mining in the field of transportation engineering and to provide a summary of commercially available software for the same.

2 What is Data Mining? Data mining is a logical procedure intended to investigate information (normally a lot of information—typically related to business or market-also called “big data”) looking for predictable and consistent patterns as well as methodical connections among variables, and after that to validate the results. The final objective of DM is to forecast/predict and for that predictive DM is the most well-known type of DM and one that is being used for business applications [5]. DM is generally used by professional intelligence organizations, and by economic experts, but due to its highly accurate results, it is now being used by road safety experts to extract relevant information from the vast datasets collected by different methods [6]. This reason behind the popularity of this technique in transportation engineering is that it serves as a foundation for both machine learning and artificial intelligence as it can scrap existing data to highlight patterns and thus suggests solutions to existing problems.

2.1 Process of Data Mining The DM process is divided into two phases, i.e., data preparation/preprocessing and data mining proper. The preprocessing phase comprises the first four steps, namely cleaning of data, integration of data, selection of data, and its transformation; the last three steps, namely data mining, pattern evaluation, and knowledge representation, together form the mining phase. Every DM process faces a number of challenges and issues in practical application while extracting potentially useful information [6]. The process of DM is given in Fig. 1.


Fig. 1 Process of data mining
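For readers unfamiliar with how these stages fit together in practice, the following sketch shows one way the preprocessing and mining stages of Fig. 1 might be wired up in Python; the file names and column names are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch (assumed column and file names): the stages of the
# DM process applied to a hypothetical road-accident table.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.pipeline import Pipeline

# 1-2. Data cleaning and integration: load and merge the raw sources.
crashes = pd.read_csv("crashes.csv")          # hypothetical file
roads = pd.read_csv("road_inventory.csv")     # hypothetical file
data = crashes.merge(roads, on="road_id", how="left").dropna()

# 3-4. Data selection and transformation: keep relevant attributes and encode them.
numeric = ["traffic_volume", "lane_width", "speed_limit"]
categorical = ["road_type", "weather", "light_condition"]
prep = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# 5. Data mining: group crashes into clusters of similar accident patterns.
model = Pipeline([("prep", prep), ("cluster", KMeans(n_clusters=4, random_state=0))])
labels = model.fit_predict(data[numeric + categorical])

# 6. Pattern evaluation: judge how well separated the discovered patterns are.
X = model.named_steps["prep"].transform(data[numeric + categorical])
print("silhouette:", silhouette_score(X, labels))

# 7. Knowledge representation: summarize each pattern for the safety planner.
print(data.assign(cluster=labels).groupby("cluster")[numeric].mean())
```

The steps map directly onto the stages described above; in a real study the merged table would come from police records, road inventories and traffic counts rather than the assumed files.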

2.2 Advantage and Disadvantages of Data Mining Technique Table 1 summarizes various advantages and disadvantages of the DM technique.

Table 1 Advantages and disadvantages of the DM technique

S. No. | Advantages of DM | Disadvantages of DM
1 | DM can support the anticipation of upcoming unfavorable situations by demonstrating true data | Extreme work intensity may require investment in elite groups and staff training
2 | Enhancement in the comprehension of databases and knowledge, helping users to read them | High difficulty while gathering the information/data
3 | DM contributes to strategic decision making by finding key data | Sometimes the skills required to carry out the collection of data make it not a simple job, and it consumes various resources that may increase the cost
4 | The DM technique is capable of analyzing an enormous amount of data, and the results are very easy to understand: individuals without previous knowledge in the field of computing can interpret the outcomes with their own ideas | Depending on the nature and type of data, it may take some time to preprocess all that information
5 | The outputs of the models are consistent, as they are tested using different statistical methods before being utilized, in order to make predictions more reliable and valid | The private information of the users may be put at risk due to the lack of an appropriate security system
6 | Generally, models are created and built rapidly; modeling sometimes becomes easier as most of the algorithms have already been tested | As it is not a perfect process, it may influence the result of the decision-making process if the input data is wrong


Table 2 Traditional versus innovative techniques in accident analysis

Method | Technique | Description
Innovative | Prediction models (ANN, data mining, machine learning, etc.) | Crash modification factor and function; crash reduction factor and function; traffic conflict techniques; safety performance function
Traditional | Statistics | Inferential and descriptive statistics
Traditional | GIS-spatial analysis | Graphics representation, network, and planar kernel density estimation
Traditional | Accident reconstruction | Physics approach

3 Innovative Techniques Versus Traditional Techniques in Accident Analysis Generally, we have two types of methods available for data analysis in road safety. These methods are innovative and traditional methods. In today’s world, for better prediction of road crashes, innovative methods are being used. Innovative method includes prediction of road accidents using artificial neural network [7], data mining techniques, clustering, etc., in these methods, usually, road accident is considered as dependent variable and factors like lane width, number of lanes, traffic volume, speed of vehicle, type of vehicle, type of maneuver, type of road, and population in that area are considered as independent variables. The traditional method includes analysis of basic stats, global information systems (GIS), and accident reconstruction [8]. Different types of innovative and traditional methods used in urban road safety are listed in Table 2. Apart from these methods, some interdependence methods are also available for analysis of accident data. Discovery techniques are applicable when at least one or more than one variable can be identified as dependent variables and the remaining as independent variables. Interdependence methods are a type of multivariate statistical techniques in which a whole set of interdependent relationships is examined. In this method, no single variable is defined as dependent or independent [9, 10]. The major objective of applying these techniques is to simplify data for further analysis. The classification of these interdependence methods with examples is given in Table 3.

4 Application of Data Mining in Safer Transportation System/Accident Analysis Nowadays, the DM technique has been demonstrated to be a consistent strategy to examine road crashes and give valuable results.


Table 3 Interdependence methods in accident analysis

Type of method/technique | Example
Data mining: Predictive techniques | Classification technique, regression
Data mining: Discovery techniques | Association analysis, sequence analysis, clustering
Statistical method: Dependence methods | Multiple regression analysis, discriminant analysis, MANOVA, logistic regression
Statistical method: Interdependence methods | Factor analysis, correlation analysis, cluster analysis, multidimensional scaling, correspondence analysis, and perceptual mapping

The vast majority of road crash data analyses use data mining methods, concentrating on recognizing those factors that influence the severity of a crash, since any loss resulting from road crashes is always undesirable in terms of well-being, property damage, and other economic factors [10]. Most of the time, it is found that there are certain locations where accidents occur at regular intervals; these locations are called black spots, where the concentration of road accidents is high. Nowadays, data mining is being used not only for the prediction of road accidents but also to find the black spots or hot spots in transportation engineering. In this section, various applications of the DM technique in the field of transportation engineering for a sustainable road system are discussed. Some of the major areas of accident analysis where DM is widely used are:

1. DM helps safety planners to identify accident patterns.
2. It helps in the identification of a safe and optimized route.
3. DM helps in recognizing black spot/hot spot locations.
4. It helps in assigning priorities to selected locations by grouping similar accident patterns.
5. It helps in finding the factors causing road crashes.
6. It also helps in predicting future trends of road crashes.

5 Commercially Available Data Mining Tools Some of the commercially available DM software with their company, their advantages, and disadvantages are listed in Table 4.


Table 4 Commercially available data mining softwares [11–13]

S. No. | Product/company | Advantage | Disadvantage
1 | Clementine/Integral Sol., Ltd. | Visual interface, algorithm breadth | Scalability
2 | GainSmarts/Urban Science | Transformations of data, SAS-based, algorithm option depth | Do not deal with unsupervised classification and visualization is limited
3 | Model Quest/AbTech Ltd. | Breadth of algorithms, cost benefits, trouble-free upgradation | Some non-intuitive interface options
4 | Intelligent Miner/IBM | Algorithm breadth and graphical tree/cluster output | Limited availability of algorithms, automation is absent
5 | WizWhy/WizSoft | Easy to use and understand and widely used, effortless scalability | Visualization is limited
6 | PRW/Unica Corporation | Wide-ranging algorithms and automated selection of model | Few data options and visualization are limited
7 | Scenario/Cognos Analytics | Easy to use | Narrow analysis path
8 | DataCruncher/Datamind | Easy to use and understand | Single algorithm and low performance
9 | MineSet/Silicon Graphics | Visualization of data is easy | Limited algorithms and model export is absent
10 | CART/Salford Systems | Options are available for the depth of the tree | Difficult file input/output and visualization is limited
11 | See5/RuleQuest Research | |
12 | Enterprise Miner/SAS Institute | Visual interface and depth of algorithms | Difficult to use and issues related to a new product, insufficient data security
13 | Model 1/Group 1/Unica | Easy to use and understand, automated model discovery | Vertical tool
14 | NeuroShell/Ward Systems Group | Multiple neural network | Unorthodox interface and only neural network
15 | S-Plus/Math Soft | Depth of algorithms, extendable/programmable, visualization | Inductive methods are limited

6 Suitability of DM Technique in Accident Prediction by Various Authors In the field of transportation, different DM methods such as association rule mining, clustering, and classification are being widely used for the study. As the collection of


accident data is a very tough task in reality, in the transportation field diverse data needs to be collected and integrated to obtain the desired solutions. While studying traffic management, analysis of accidents, pavement conditions, signal inventory and traffic signals, and road characteristics, a huge amount of data is generated in the field of transportation. Based on the nature and availability of the data, the road safety team decides which technique needs to be implemented to resolve the respective problem. The essential prerequisites include the ability to identify what data is available, determine the characteristics of the data, extract the data of interest, and transform the data into the formats required for the application. Which technique is best for accident data analysis depends on the user's choice, the type of data available, and the tool in which the analysis is carried out. Some of the previous work on crash prediction using DM is discussed below.

Tavakoli et al. [16] used classification and regression trees in the WEKA tool to identify factors responsible for the occurrence of accidents on rural roads of Iran using three-year accident data. The authors concluded that the speed of the vehicle, type of vehicle, age and sex of the driver, and weather conditions are the major contributing factors for the occurrence of crashes. Wu et al. [17] used a random forest model to predict road accidents in Beijing city using four-year accident data and concluded that the proposed random forest model showed highly accurate results in predicting road accidents. Akomolafe and Olutayo [18] predicted road accidents using ID3 decision tree and functional tree techniques in the TANAGRA tool for the Lagos-Ibadan road in Nigeria; the main conclusion given by the authors was that day, season, and type of vehicle are the most significant factors causing accidents in the city. Guan et al. [19] used artificial neural network techniques in the MATLAB tool to predict road accidents on NH-1 of the USA; the analysis was carried out using 677 sets of accident data, the model gave an accuracy of 85.3%, and it was validated to check its power. Krishnaveni and Hemalatha [20] used Naïve Bayes, the M1 meta classifier, and the random forest technique in the MATLAB tool to predict road accidents and to compare the performance of the different algorithms on selected rural roads of Chennai city; the authors concluded that the random forest algorithm is better than the other classifiers in predicting road accidents. Taamneh et al. [21] established classifier-based models to predict road accidents in Abu Dhabi using six-year accident data in the WEKA tool; the authors used four different algorithms for the analysis to find the best one and, after training and testing the models, concluded that the MLP classifier showed the most accurate results in predicting road accidents.
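To make the kind of classifier-based modeling cited above concrete, the sketch below trains a random forest to predict accident severity from a handful of crash attributes. It is only a schematic example with an assumed file and column schema, not a reproduction of any of the cited models, which were built with tools such as WEKA, MATLAB and TANAGRA.

```python
# Illustrative sketch (assumed schema): random-forest prediction of accident
# severity, in the spirit of the classifier-based studies discussed above.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = pd.read_csv("accidents.csv")              # hypothetical accident records

features = ["vehicle_speed", "vehicle_type", "driver_age",
            "weather", "road_type", "time_of_day"]
target = "severity"                            # e.g. slight / serious / fatal

# Simple one-hot encoding of categorical attributes for the tree ensemble.
X = pd.get_dummies(df[features], drop_first=True)
y = df[target]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

clf = RandomForestClassifier(n_estimators=300, class_weight="balanced",
                             random_state=42)
clf.fit(X_train, y_train)

print(classification_report(y_test, clf.predict(X_test)))

# Feature importances hint at which factors contribute most to severe crashes.
print(pd.Series(clf.feature_importances_, index=X.columns)
        .sort_values(ascending=False).head(10))
```

The class_weight="balanced" option is one simple way of acknowledging that crashes, and severe crashes in particular, are rare events relative to non-events, as noted in the abstract of this paper.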

7 Conclusion While carrying out this study, it was found that traditional methods have various limitations which lead to less accurate results in predicting road crashes. In order to reduce these limitations, nowadays, from a methodological point of view, the


application of computer science techniques in road safety is considered to be more efficient and accurate than traditional techniques for the prediction of road crashes, as these techniques first determine the significant variables responsible for the occurrence of accidents and then establish a relationship between them. One of these techniques is the use of DM in road accident analysis. In a nutshell, in order to understand the pattern of these accidents, an adequate technique is required that is effective, efficient, and can give reliable outcomes. To address this, we present the DM technique as one of the best available data analysis methods in the field of sustainable and safe roads, as this technique incorporates different algorithms. The study also summarized the commercially available data mining software in the market with their advantages and disadvantages. Accurate prediction using this technique may help government agencies and companies to develop smart, forgiving roads, a new concept for achieving the goal of zero accidents that includes all parameters promoting the well-being of people.

References 1. Chetna, Mor N, Sood H (2018) Black spots identification on Pinjore to Baddi road. Int J Pure Appl Math 120(6):6473–6488 2. Mor N, Sood H, Goyal T (2018) Development and corroboration of crash prediction model. Int J Pure Appl Math 119(15):413–421 3. Singh SK (2017) Road traffic accidents in India: issues and challenges. Transp Res Procedia 25(3):4708–4719 4. Cameron MH (1982) A method of measuring exposure to pedestrian accident risk. Accid Anal Prev 14(5):397–405 5. Golias BJC, Tzivelou HS (1992) Aspects of road-accident death analyses. J Transp Eng 118(2):299–311 6. Roy SK, Chakraborty S (2005) Traffic accident characteristics of Kolkata. Transp Commun Bull Asia Pac 74(2):75–86 7. Hughes BP, Newstead S, Anund A, Shu CC, Falkmer T (2015) A review of models relevant to road safety. Accid Anal Prev 74(1):250–270 8. Kiss ÁO, Sesztakov V, Török Á (2014) Road safety analysis in Gy˝or. Period Polytech Transp Eng 41(1):51–56 9. Da Costa S, Qu X, Parajuli PM (2015) A crash severity-based black spot identification model. J Transp Saf Secur 7(3):268–277 10. Stokes RW, Mutabazi M (1996) Rate-quality control. Transp Res Rec J 1542(1):44–48 11. Mamcenko J, Kulvietiene R (2016) IBM intelligent miner for data and its application. Appl Comput Syst 4(7):81–91 12. Barai SK (2003) Data mining applications in transportation engineering. Transport 18(5):216– 223 13. King MA, Elder JF, Abbott D (1998) Evaluation of fourteen desktop data mining tools. In: International conference on systems, man, and cybernetics, pp 12–19 14. Elder JF, Abbott D (1998) Comparison of leading data mining tools. In: Fourth annual conference on knowledge discovery and data mining, pp 10–23 15. Souza JTD, Francisco ACD, Piekarski CM, Prado G (2019) Data mining and machine learning to promote smart cities: a systematic review from 2000 to 2018. Sustainability 11(4):1077–1086 16. Tavakoli A, Shariat-Mohaymany A, Ranjbari A (2011) A data mining approach to identify key factors of traffic injury severity. Promet-Traffic Transp 23(1):11–17


17. Wu C, Lei H, Ma M, Yan X (2009) Severity analyses of single-vehicle crashes based on rough set theory. In: International conference on computational intelligence and natural computing, vol 2, pp 59–62 18. Akomolafe DT, Olutayo A (2011) Using data mining technique to predict cause of accident and accident prone locations on highways. Am J Database Theory Appl 1(3):26–38 19. Guan L, Liu W, Yin X, Zhang L (2010) Traffic incident duration prediction based on artificial neural network. In: 3rd international conference on intelligent computation technology and automation, pp 1076–1079 20. Krishnaveni S, Hemalatha M (2011) A perspective analysis of traffic accident using data mining techniques. Int J Comput Appl 23(7):40–48 21. Taamneh M, Alkheder S, Taamneh S (2017) Data-mining techniques for traffic accident modeling and prediction in the United Arab Emirates. J Transp Saf Secur 9(2):146–166

Rio Olympics 2016 on Twitter: A Descriptive Analysis Saurabh Sharma and Vishal Gupta

Abstract Social media mining is gaining popularity among researchers as it provides an opportunity to study real-world events, social interactions, network analysis, and, most importantly, user behavior. This paper explores the hidden patterns in the content generated on Twitter during the Olympic Games held in 2016 in Rio de Janeiro, Brazil. The statistical analysis found that two features, namely retweets and likes, show a correlation of 95%. The topics of tweets show a trend of discussing small events during a day, and this pattern is very prominent for all days. The most expressed sentiment during the games turned out to be positive. The temporal analysis showed that users were more active at a specific time of the day, i.e., the evening. The user behavior analysis found that new Twitter users were less active as compared to other users. This paper contributes to the studies of social media mining for sports/entertainment events. Keywords Social media mining · Microblog · Temporal analysis · User behavior · Sentiment analysis

1 Introduction In the last 20 years, Online Social Networks (OSNs) have evolved from specific group forums, chat rooms, and bulletin boards to more inclusive open platforms for showcasing personal views on apparently unlimited topics and areas of life. For example, Facebook is mainly used to connect with family and friends, where a user wants to share personal information with a restricted audience, while YouTube and Instagram are used to share images, audio, and videos with a large private/public audience [1, 2]. LinkedIn is an online place to showcase, connect, and work professionally with others [3].


Among all these different types of OSNs, microblogging is becoming more popular every year, making it the most widely used social media tool. Microblogging is very different from other OSNs in the sense that it allows its users to reach a public audience within seconds, with a few words, at a global level. These platforms are changing the dynamics of OSNs at a very fast pace. The way users discuss, share, respond, and make connections with other users makes them a very rich source for data mining use cases. The most popular use cases are viral publicity of products [4], return-on-investment programs, finding influential users for information diffusion, creating trends for a specific event, news sharing, etc. For researchers, all these activities in a dynamic space are an invitation to study, explore, analyze, and mine the hidden and implicit patterns [5], and to find correlations for a better understanding of the online world [6, 7]. The organization of this article is as follows: Sect. 2 briefly describes the recent studies on Twitter and the motivation for this research. Section 3 explains the methodology used in this research and Sect. 4 contains results and discussion. The last section of the article, Sect. 5, presents the conclusion of the study.

2 Related Work Twitter is a popular OSN which provides a platform to share a user's content, in the form of microblogs, publicly with everyone. Twitter was created in 2006 and since then its popularity has grown to millions of users, tweeting millions of tweets every day. This makes Twitter an interesting research topic for researchers from various domains [7, 8]. Twitter is used for sharing personal opinions, discussing events, promoting and advertising products, making friends and following other users, etc. Any original content posted by a user is called a "tweet", but there are other ways in which users can participate on Twitter. Twitter has an option for a user to repost the original content of another user on her page, making it available to her network, i.e., all her friends and followers. This is called "retweeting" a tweet. The second option is to "like" the content of another user, which makes the tweet available on her page but visible on a different tab for easy access. This is called marking a tweet as a "favorite". The third option is to post a tweet in response to another tweet by mentioning the user of the original tweet. This is called creating a conversation by generating a tweet as a reply [9, 10]. These three actions performed by users provide explicit and implicit information about user activities on the platform. The content of tweets contains explicit information in the form of text, URLs, emojis, hashtags, and user mentions [11]. The implicit information is contained in the timestamp of a tweet, the number of retweets and replies a tweet received, the geolocation of the tweet, the user profile, the number of friends, the number of followers, the age of the Twitter account, the total number of tweets posted by a user, the number of tweets liked by the user, and the temporal behavior of a user for posting tweets [12, 13]. The latest studies on opinion mining, text mining, sentiment analysis, and emotion analysis have suggested that emoticons and emojis play a very significant role in


increasing the accuracy of the overall analysis [14]. The latest study [15] of recent trends in research topics on Twitter ranks Sentiment Analysis [16, 17] first in its list of the top 38 topics. Research on human behavior [18–20] and sports/entertainment events could not make it into the top 30 topics. This fact justifies doing more research on these important topics.

3 Methodology In the virtual world of the internet, information about any real-life event can be found on various news sources, blogs, wiki pages, photo/video platforms, etc. These sources provide information that is generated by a few authors and consumed by a large population. This is an example of a unidirectional flow of information. The second possibility is to get information about an event from a source where multiple authors discuss, share, like, and diffuse information within their network. This gives rise to the opportunity to study a real-life event from the viewpoint of multiple users. Information diffusion, social interactions, users' likes and dislikes, users' behavioral patterns, temporal trends, and many hidden patterns in the data can be uncovered by analysis of OSNs. On Twitter, when a user wants to categorize a tweet under a specific topic/term, embedding a hashtag is a common practice. A hashtag is defined as a term/phrase which starts with a special symbol (#). Twitter keeps track of all the hashtags being used and, based on the most frequent hashtags, Twitter trends are generated. To collect data from Twitter about a specific event, researchers use hashtags as query keywords [5, 6]. As shown in Fig. 1, to study the event of the Rio Olympics 2016, the requirement was to collect tweets containing all the fields described by the Twitter API. The main issues in collecting tweets using the Twitter API are: only the live tweet feed is available and historical tweets cannot be queried by hashtag, and distribution of a dataset containing full tweets is not allowed under the API terms of service. These conditions make it difficult for other researchers to reproduce the same results. To overcome this limitation, a standard data collection was needed with full tweets, which can be downloaded at any point in time and used by any researcher to reproduce and verify previous research. The Internet Archive, a non-profit digital library, provides access to a huge number of Twitter collections, called the "Spritzer version", generated

Fig. 1 The methodology of proposed work (Data Collection → Data Filtering → Data Preparation → Statistical Analysis / Sentiment Analysis)


from the general Twitter stream in JSON format. These collections contain historical data of many years, starting from 2012, and can be used as standard datasets to solve the problem of redistribution without violating the terms of the Twitter API.

3.1 Data Collection The Twitter collection used for this study is for August 2016 [21]. The Rio Olympics were held from August 5 to August 21, so data of only 17 days is considered for this study.

3.2 Data Preparation The collection of 17 days contained raw tweets. Raw tweets are of two types: full tweets with values for all fields, and deleted tweets with only the tweetID, userID, and timestamp fields. Deleted tweets had no use for this study, so all of them were filtered out of the collection. After removing deleted tweets, the collection was searched for tweets containing either of the two terms Rio and Olympic in their text field. The query terms were matched as text strings instead of hashtags. The reason for not using a hashtag as a search term is that it would miss tweets that are relevant to the topic but do not contain any hashtag. Another reason is that hashtags are needed for getting live tweets from the Twitter API, but for a static collection, string matching gives better performance. For sentiment analysis, only tweets written in the English language were selected, by querying the language field of the tweet. After that, duplicate tweets were deleted to obtain a set of unique tweets only.
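The sketch below illustrates this filtering pipeline on one Spritzer-style file of line-delimited tweet JSON. It is only an illustration of the steps described above, not the authors' code; the file name is hypothetical, and the field names text and lang follow the standard Twitter API v1.1 tweet object.

```python
import json

def prepare(path):
    """Keep unique, English, on-topic tweets from a line-delimited JSON file."""
    seen, kept = set(), []
    with open(path, encoding="utf-8") as f:
        for line in f:
            tweet = json.loads(line)
            if "text" not in tweet:                      # deleted tweets carry no text field
                continue
            text = tweet["text"]
            if not any(k in text.lower() for k in ("rio", "olympic")):
                continue                                 # keyword match as plain strings
            if tweet.get("lang") != "en":                # English only, for sentiment analysis
                continue
            if text in seen:                             # drop duplicate tweets
                continue
            seen.add(text)
            kept.append(tweet)
    return kept

tweets = prepare("spritzer-2016-08-05.json")             # hypothetical file name
```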

3.3 Data Analysis To gain insights from the data, three approaches were used, namely, statistical analysis, sentiment analysis, and network analysis. All these approaches and their findings are discussed in the next section.


4 Results and Discussions 4.1 Statistical Analysis To understand the data better, it is necessary to find out the statistical properties of the data. Table 1 shows that the tweets related to the event of the Rio Olympics are 0.49% of the whole Twitter collection. This can be due to one of two reasons. First, this collection has captured only a small number of tweets related to the event. The second possibility is that the Rio Olympics was not a very significant event on social media and could not make it into the top trending topics of that time, resulting in few tweets posted by interested users. Google Trends for August 2016 shows that, across the globe, the keywords "Rio" and "Olympic" were not very popular search terms in general web search. This indicates that the dataset used in this study is not biased and shows a trend similar to web search. In Table 2, the total unique tweets are further divided into original tweets (original content posted by a user) and retweets. The percentage of original tweets is 38% and the percentage of retweets is 62%. This large share of retweets shows that a significant portion of the discussion of real-life events on Twitter constitutes information diffusion rather than original content generation. Recent research [2, 3] suggests that the use of hashtags in tweet text plays a significant role in spreading information/news among interested users. However, this dataset has 62% of tweets without hashtags and 38% with hashtags. In the case of original tweet posts, 66% of tweets are without hashtags and only 34% contain a hashtag. This pattern suggests that most Twitter users do not use hashtags in their posts. The percentage of retweets without hashtags is 59% and only 41% of retweets contain hashtags. Recent studies [4, 7] have shown that the presence of hashtags in a tweet increases the chances of getting more retweets, whereas this data shows a reverse pattern. This can be explained by the fact that OSNs are not static networks.

Table 1 Statistics of Twitter collection (Aug 4–Aug 21, 2016)
Raw collection of tweets: 76,626,913
After removing deleted tweets: 58,104,422
Tweets containing keywords Rio/Olympic: 285,161
Tweets after duplicate deletion: 285,066
Tweets in English language only: 219,905
Average tweets per day: 12,936

Table 2 Statistics of tweets
All tweets (219,905): original tweets 83,499 (38%) | retweets 136,406 (62%)
Original tweets (83,499): with hashtags 28,722 (34%) | without hashtags 54,777 (66%)
Retweets (136,406): with hashtags 55,766 (41%) | without hashtags 80,640 (59%)


Fig. 2 Correlation matrix of tweet features

The dynamics of OSNs change very rapidly. The number of active users is increasing day by day, which makes the nature of OSNs very complex; hence, a pattern found in OSNs cannot remain constant and may change or evolve with time. The correlation matrix shown in Fig. 2 has three strong correlations and a few weak correlations. The strong correlations are as follows:
1. The correlation of 0.95 between retweets and likes can be interpreted as follows: during the games, information received by interested users gets consumed in the form of likes and shared/forwarded to other users in their network in the form of retweets. The large share of retweets in this dataset suggests that users tend to consume and share the same information repeatedly instead of generating new content.
2. The correlation between the number of characters in a tweet and the number of words in a tweet. This strong correlation of 0.83 suggests that a tweet with more characters also tends to have more words. This can be interpreted as users tending to use a large number of short words instead of a few long words. This phenomenon is quite natural because microblogging limits the number of characters; the limit for a single tweet is 140 characters.
3. The correlation of 0.66 between the number of hashtags used in a tweet and the number of hashtags used in a retweet suggests that if a tweet contains more hashtags, then it is more likely to get more retweets. This finding follows the same pattern as previous studies. Now, we can say that the association of hashtags with retweets is a generalized phenomenon irrespective of domain and target audience.
The weak correlations among the different parameters are as follows:
• The tweet character count is weakly correlated with compound sentiment, hashtag count, and retweet hashtag count, with values 0.16, 0.19, and 0.29, respectively. A large number of characters in a tweet increases the chances of showing positive sentiment, of containing hashtags, and of getting more retweets.


Fig. 3 Relationship of retweet, likes, and tweet sentiments

• The tweet word count is weakly associated with the compound sentiment and the retweet hashtag count, with values 0.17 and 0.16, respectively. A large number of words in a tweet increases the chances of showing positive sentiment and of getting more retweets.
• The association of word count with hashtag count has a nearly zero value of 0.04. This can be explained as follows: if a tweet contains more hashtags, then it may contain fewer words compared to normal text.
• The weak association of compound sentiment with hashtag count and retweet hashtag count, with values 0.12 and 0.14, respectively, can be summarized as tweets with hashtags being more likely to have positive sentiment.
Figure 3 shows the relationship between information diffusion and the sentiment of the tweet. The positive sentiment gets more likes with an increasing number of retweets as compared to the other two sentiments. This also implies that positive sentiment was the most dominant during all days of the games.
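As an illustration of how such a matrix can be produced, the sketch below computes a Pearson correlation matrix over per-tweet features with pandas. The column names and the tiny inline dataset are hypothetical placeholders, not the authors' data; in practice the frame would hold one row per tweet from the prepared collection.

```python
import pandas as pd

# One row per tweet; illustrative feature names and values only.
df = pd.DataFrame({
    "retweet_count":      [120, 3, 45, 0, 8],
    "like_count":         [150, 2, 60, 1, 9],
    "char_count":         [140, 60, 101, 22, 87],
    "word_count":         [24, 11, 19, 4, 15],
    "hashtag_count":      [3, 0, 1, 0, 2],
    "compound_sentiment": [0.62, 0.0, 0.45, -0.3, 0.1],
})

corr = df.corr(method="pearson")   # Pearson correlation matrix, as visualized in Fig. 2
print(corr.round(2))
```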

4.2 Sentiment Analysis The concept of sentiment analysis is to apply Natural Language Processing (NLP) techniques to text data to gain insights about the emotions or opinions expressed in user-generated posts. This technique is very common in social media research [16, 17]. The categorization of tweets into the classes/labels positive, negative, and neutral provides the ability to understand the user's feeling/reaction or attitude


towards a specific topic. Since the implementation of a sentiment analysis algorithm is outside the scope of this research, an open-source library is used for this step. The VADER (Valence Aware Dictionary for sEntiment Reasoning) [22] library is built on a parsimonious rule-based model. It is optimized especially for social media analysis. It has various rules for punctuation, capitalization, degree modifiers, social media slang words, emoticons, and the context of the whole tweet. It analyzes the input text against its lexicon and gives output as four values: positive polarity, neutral polarity, negative polarity, and a compound score. The compound score is used to mark the overall polarity of the tweet as positive, negative, or neutral. The equations used to classify tweets into the three categories are given below.

compound_score ≥ 0.05 (overall sentiment: positive)  (1)

−0.05 < compound_score < 0.05 (overall sentiment: neutral)  (2)

compound_score ≤ −0.05 (overall sentiment: negative)  (3)
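A minimal sketch of this step, assuming the open-source vaderSentiment Python package, is shown below; the example tweet is invented, and the thresholds mirror Eqs. (1)–(3).

```python
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def classify(text):
    # polarity_scores returns 'pos', 'neu', 'neg' and a 'compound' score in [-1, 1]
    compound = analyzer.polarity_scores(text)["compound"]
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"

print(classify("Congratulations to the team, what a great win! #Rio2016"))  # -> positive
```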

Table 3 shows the statistics of tweets based on sentiment analysis. A significant portion of tweets comes under the positive category. This pattern is visible in every division of the data. The percentage of retweets is highest for positive tweets, which suggests that positive sentiment got shared more among users than negative sentiment. The positive sentiment also got the maximum percentage in the hashtag-based division, which reflects the fact that positive sentiment was the most significant emotion irrespective of the use of hashtags. In some case studies, the use of a specific hashtag was found to be highly correlated with emotions. However, in this case, positive emotion is independent of hashtag usage. The top two sentiments (positive and neutral) make up the maximum fraction, and negative sentiment is left with a very small portion of tweets. This pattern can be explained as follows: the tweets mostly contain best wishes, encouraging messages, and congratulations to the participating players. It can be safely interpreted that the overall mood of the Rio Olympics was mostly positive, then mostly information sharing, and finally slightly negative due to some minor incidents.

Table 3 Sentiment analysis of tweets
Sentiment | Original tweets (83,499) | Retweets (136,406) | Original with hashtags (28,722) | Original without hashtags (54,777) | Retweets with hashtags (55,766) | Retweets without hashtags (80,640)
Positive | 37,258 (45%) | 71,245 (52%) | 13,664 (47%) | 23,594 (43%) | 32,317 (58%) | 38,928 (48%)
Neutral | 30,318 (36%) | 43,121 (32%) | 10,528 (37%) | 19,790 (36%) | 17,052 (31%) | 26,069 (33%)
Negative | 15,923 (19%) | 22,040 (16%) | 4,530 (16%) | 11,393 (21%) | 6,397 (11%) | 15,643 (19%)


Fig. 4 Day wise sentiment analysis of tweets

Fig. 5 Histogram of compound sentiment

Figure 4 shows the distribution of tweets and sentiments for each day. This distribution also confirms that positive sentiment forms the largest fraction of tweets every day, from the first day to the last day of the Rio Olympics. Figure 5 shows the histogram of the compound sentiment. The bar at value zero indicates neutral sentiment, all values above 0.05 indicate positive sentiment, and all values below −0.05 indicate negative sentiment. It can be seen that positive sentiment has a better spread as compared to negative sentiment. This implies that positive sentiment was not only the dominant sentiment, but that most tweets tend to be positive by default. To understand the relationship between tweets, top hashtags were selected based on their frequencies. In Fig. 6, a network of the top 20 hashtags is created, where an edge denotes the relation between two hashtags that were present in the same tweet. The size of a node denotes the number of triangles formed by a hashtag. The two main hashtags are Rio and Olympics, which are used with most of the other hashtags and hence create the maximum number of triangles in the network. In any trending topic or any event discussed on Twitter, it is a common finding that only a few hashtags make up 70–80% of the total content generated. These top hashtags provide an instant insight into the event/topic that influenced the behavior of the maximum number of users who participated in the discussion. Other, less popular hashtags exist in every event; these can be explained as a few tweets containing specific information about sub-topics or limited-period events.
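The following sketch shows one way such a hashtag co-occurrence network could be built with networkx, with node importance measured by triangle counts as in Fig. 6. The toy tweet list is purely illustrative, not the study's corpus.

```python
import itertools
from collections import Counter
import networkx as nx

# tweets: list of hashtag lists, one per tweet (illustrative data only)
tweets = [["rio", "olympics", "usa"], ["rio", "olympics"], ["rio", "phelps", "olympics"]]

freq = Counter(tag for tags in tweets for tag in tags)
top = {t for t, _ in freq.most_common(20)}           # top-20 hashtags by frequency

G = nx.Graph()
for tags in tweets:
    for a, b in itertools.combinations(set(tags) & top, 2):
        G.add_edge(a, b)                              # edge = co-occurrence in one tweet

triangles = nx.triangles(G)                           # node size in Fig. 6 ~ triangle count
print(sorted(triangles.items(), key=lambda kv: -kv[1]))
```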


Fig. 6 Network of top 20 hashtags

5 Conclusion This study is an attempt to provide insight into human behavior, information diffusion, interactions among participants, time-based trends in the data, and the emotions and feelings of users. The contribution of this study is that the various patterns found in the data provide a basic outline of analysis that should be applied to larger datasets, in different scenarios and different events, to find out whether these patterns hold for other domains or are specific to each case study. For example, recent studies found that usage of hashtags, URLs, and user mentions is associated with a larger number of retweets, whereas in this study the use of hashtags and the number of retweets are independent of each other. This makes social media analysis even more exciting and challenging. The key conclusions of this study are:
• Any real-life event can be mapped onto OSNs because users express their views and feelings more freely on these platforms.
• Information diffusion has the biggest share of user-generated content. Retweets and likes are the most common and most highly correlated actions performed by users, with a 95% correlation.
• Positive sentiment was found in nearly 50% of tweets, neutral sentiment in approximately 30%, and negative sentiment in only 20% of tweets.
• The positive sentiment had a stable pattern for each day during the games.
• The number of words is directly proportional to the number of characters in a tweet, with a strong correlation of 83%.
• New users were found to be less active as compared to other users.
• The maximum activity is clustered around the evening as compared to daytime. This pattern is stable for each day during the games.
• In contrast with previous studies, this research has shown that 60–65% of the tweets did not have hashtags, and no correlation was found between hashtags and information sharing. The maximum number of users discussed the games without using any hashtag. This is a useful finding for future studies


for collecting data from the Twitter API. The use of hashtags or well-chosen keywords related to a specific query can make a significant difference in data collection. There can be many factors that make the patterns unique in each study: the timing of the event during different months of the year, the geo-locations of the users who participated in a particular event, sentiment analysis results that vary with the cultural and religious background of users, user behavior of different age groups showing different patterns for different topics/events, and so on. This research area will be explored in the future for many other hidden patterns.

References 1. Benevenuto F, Rodrigues T, Cha M, Almeida V (2009) Characterizing user behavior in online social networks. In: Proceedings of the 9th ACM SIGCOMM conference on internet measurement. ACM, pp 49–62. https://doi.org/10.1145/1644893.1644900 2. Golbeck J, Robles C, Turner K (2011) Predicting personality with social media. In: CHI’11 extended abstracts on human factors in computing systems. ACM, pp 253–262. https://doi.org/ 10.1145/1979742.1979614 3. Xu Z, Yang Q (2012) Analyzing user retweet behavior on twitter. In: Proceedings of the 2012 IEEE/ACM international conference on advances in social networks analysis and mining. IEEE, pp 46–50. https://doi.org/10.1109/ASONAM.2012.18 4. Kaushik K, Mishra A (2020) Leveraging sponsorship on twitter: insights from tennis grand slams. In: Proceedings of the international conference on advances in national brand and private label marketing. Springer, Cham, pp 58–64. https://doi.org/10.1007/978-3-030-47764-6_7 5. Gan W, Lin JCW, Fournier-Viger P, Chao HC, Yu PS (2019) A survey of parallel sequential pattern mining. ACM Trans Knowl Discov Data (TKDD) 13(3):1–34. https://doi.org/10.1145/ 3314107 6. Jin L, Chen Y, Wang T, Hui P, Vasilakos AV (2013) Understanding user behavior in online social networks: a survey. IEEE Commun Mag 51(9):144–150. https://doi.org/10.1109/MCOM.2013. 6588663 7. Tang J, Liu H (2014) Feature selection for social media data. ACM Trans Knowl Discov Data (TKDD) 8(4):19. https://doi.org/10.1145/2629587 8. Zhang J, Tang J, Li J, Liu Y, Xing C (2015) Who influenced you? Predicting retweet via social influence locality. ACM Trans Knowl Discov Data (TKDD) 9(3):25. https://doi.org/10.1145/ 2700398 9. Zhou X, Wang W, Jin Q (2015) Multi-dimensional attributes and measures for dynamical user profiling in social networking environments. Multimed Tools Appl 74(14):5015–5028. https:// doi.org/10.1007/s11042-014-2230-9 10. Webberley WM, Allen SM, Whitaker RM (2016) Retweeting beyond expectation: inferring interestingness in Twitter. Comput Commun 73:229–235. https://doi.org/10.1016/j.comcom. 2015.07.016 11. Chen J, Liu Y, Zou M (2017) User emotion for modeling retweeting behaviors. Neural Netw 96:11–21. https://doi.org/10.1016/j.neunet.2017.08.006 12. Zhang A, Zheng M, Pang B (2018) Structural diversity effect on hashtag adoption in Twitter. Phys A 493:267–275. https://doi.org/10.1016/j.physa.2017.09.075 13. Choi HJ, Park CH (2019) Emerging topic detection in twitter stream based on high utility pattern mining. Expert Syst Appl 115:27–36. https://doi.org/10.1016/j.eswa.2018.07.051 14. Feng Y, Lu Z, Zhou W, Wang Z, Cao Q (2020) New emoji requests from Twitter users: when, where, why, and what we can do about them. ACM Trans Soc Comput 3(2):1–25. https://doi. org/10.1145/3370750


15. Karami A, Lundy M, Webb F, Dwivedi YK (2020) Twitter and research: a systematic literature review through text mining. IEEE Access 8:67698–67717. https://doi.org/10.1109/ACCESS. 2020.2983656 16. Singh P, Dwivedi YK, Kahlon KS, Pathania A, Sawhney RS (2020) Can twitter analytics predict election outcome? An insight from 2017 Punjab assembly elections. Gov Inf Q 101444. https:// doi.org/10.1016/j.giq.2019.101444 17. Grover P, Kar AK, Dwivedi YK, Janssen M (2019) Polarization and acculturation in US Election 2016 outcomes—can twitter analytics predict changes in voting preferences. Technol Forecast Soc Chang 145:438–460. https://doi.org/10.1016/j.techfore.2018.09.009 18. Cresci S, Di Pietro R, Petrocchi M, Spognardi A, Tesconi M (2020) Emergent properties, models, and laws of behavioral similarities within groups of twitter users. Comput Commun 150:47–61. https://doi.org/10.1016/j.comcom.2019.10.019 19. Cresci S, Petrocchi M, Spognardi A, Tognazzi S (2019) On the capability of evolved spambots to evade detection via genetic engineering. Online Soc Netw Media 9:1–16. https://doi.org/10. 1016/j.osnem.2018.10.005 20. Fraiwan M (2020) Identification of markers and artificial intelligence-based classification of radical twitter data. Appl Comput Inform. https://doi.org/10.1016/j.aci.2020.04.001 21. https://archive.org/details/archiveteam-twitter-stream-2016-08. Accessed 05 May 2019 22. Hutto CJ, Gilbert E (2014) Vader: a parsimonious rule-based model for sentiment analysis of social media text. In: Eighth international AAAI conference on weblogs and social media

A Survey on Vehicle to Vehicle Communication Tanay Wagh, Rohan Bagrecha, Shubham Salunke, Shambhavi Shedge, and Vina Lomte

Abstract Nowadays, due to an increase in the number of vehicles, communication among vehicles has become very important. The goal of communication among vehicles is to enable efficient management of traffic and safety on public roads. Due to advancements in telecommunication combined with informatics, the development of wireless communication has become possible. A VANET (Vehicular Ad-hoc Network) provides wireless communication between vehicles in a network and between vehicles and roadside units (RSUs) within the coverage area. It provides a smart transportation system which helps in efficient traffic management and road safety. The main difficulty in VANET lies in information routing, because of unstable connectivity, high mobility, and network partitioning, which creates a need for an efficient methodology for information routing in VANET. This survey covers various issues such as deployment, development, and security challenges. Keywords IEEE 802.11p · VANET · Inter-vehicle communications · Routing protocols

1 Introduction Nowadays, traffic safety is an important topic that has to be taken into consideration. To create awareness among people, and especially the youth, it has been added to fields like social awareness, education, etc. Driving is an essential part of day-to-day life for most people while traveling to school, college, and the office, but along with this, traffic safety is a must. As the need increases, the vehicle population is also increasing and, along with it, the rate of accidents; hence vehicle to vehicle communication is necessary. For example, consider a vehicle X and


a vehicle Y moving on a road. Suppose X is in front of Y, and X meets with an accident and thus applies the brakes. Without vehicle to vehicle communication, there is no way to pass this message to Y efficiently and immediately. With V2V communication, the accident sensors of vehicle X are activated and it passes the message to the main server, which is then broadcast to other vehicles in the form of an alert message. With this alert message, vehicle Y slows down. Thus, V2V communication is very helpful in traffic management and in reducing accidents. VANET uses DSRC (Dedicated Short Range Communication), which enables 360-degree communication among nodes/vehicles within communication range. It provides an Intelligent Transport System which helps in fast and error-free data exchange/communication. VANET is basically divided into two parts: vehicles, which act as mobile nodes, and the (roadside) infrastructure, which is fixed. Transmitted messages include vehicle speed, vehicle control information, and vehicle position. The wireless communication consists of vehicle to infrastructure and vehicle to vehicle communication. In vehicle-to-infrastructure communication, details are broadcast to the driver through roadside displays. It helps in exchanging information about any dangerous situation. The main applications of VANET are safety applications, traffic management, traveling comfort, etc. Challenges faced by VANET are connectivity, routing, quality of service, etc.

1.1 Accident Statistics The WHO (World Health Organization) reports that, overall, close to 1.24 million people die annually on roads across the world, with 20–50 million sustaining non-fatal injuries. Globally, road traffic injuries are reported to be the main cause of death among young people aged 14–30 years and are among the three most common causes of death in people aged 14–45 years. The Health Institute reported that around 9 lakh people (in 1990) and 14 lakh people (in 2013) died due to road traffic injuries. As per the 2015 global status report on road safety, the highest rate of fatalities from road traffic injuries is in the WHO African region, at 27 per lakh of population (in 2013). As per reports in 2009, Africa had the highest estimated fatality rate, at 32.2 per 100,000 population, as against the reported fatality rate of 7.2 per 100,000 in the general population. Around 3,000 car crashes happen each day around the globe. Alcohol and other drugs are found to be a contributing cause in up to 22% of vehicular accidents on the world's roads and highways. Vehicle-related deaths rank as the eleventh most common cause of death, with young people in the age groups from 5 to 24 years having the highest risk. At a rate of 73.4 deaths per 100,000 people, Libya by a wide margin has the highest incidence of car crashes anywhere in the world, with 73.4 crashes per 100,000 inhabitants.


In the previous 6 years, 196,236 individuals died in car collisions. That is more than the number of people who live in Salt Lake City, the capital of Utah. 70% of car crash victims are occupants of the vehicle itself, the drivers and passengers. The other 30% are people outside the vehicles, for example, pedestrians and motorcyclists. Nearly 13.6 lakh persons die each year because of road accidents. According to Demba and Möller [1], without depending predominantly on the global system for mobile communication (GSM) network, V2V has the objective of enabling efficient and dependable communication. Thus, in order to have reliable V2V communication, we require technologies like DSRC and IEEE 802.11p for wireless communication among vehicles. The rest of the paper is organized as follows: An overview of VANET with its characteristics is given in Sect. 2. In Sect. 3, the technology is described. The object detection algorithms are presented in Sect. 4. Future scope and conclusion are presented in Sects. 5 and 6, respectively.

2 An Overview of VANET 2.1 Architecture of VANET The architecture of VANET is shown in Fig. 1. It contains vehicles, among which communication takes place through a wireless medium, the infrastructure, and the RSU (Road Side Unit), which behaves like a router. The range of an RSU is greater than the coverage/range of vehicles. An OBU (On Board Unit) is incorporated in every vehicle to provide effective wireless communication among vehicles and infrastructure. A GPS (Global Positioning System) receiver is also incorporated in every vehicle to track its own position and that of other vehicles. A high-voltage battery is used to supply power.

2.2 ITS ITS [2] stands for Intelligent Transportation System. As the name suggests, the transportation system is intelligent. Basically, ITS provides two forms of communication: vehicle to infrastructure communication and vehicle to vehicle communication. Vehicle to Vehicle (V2V) communication is multi-hop communication as it involves multiple vehicles, and Vehicle to Infrastructure (V2I) is single-hop communication as it involves an RSU which broadcasts messages among vehicles in its coverage. The "system" in ITS refers to the vehicle, and it is intelligent because it behaves like a router, receiver, and sender. Vehicular communication is further divided into naïve broadcasting and intelligent broadcasting. Naïve broadcasting sends packets


Fig. 1 VANET Architecture

at regular intervals of time, whereas intelligent broadcasting sends packets only when required. Hence, naïve broadcasting may result in packet collisions, which is its main demerit.

2.3 OBU OBU stands for On Board Unit. It is an on-board device installed in each vehicle in order to communicate with other OBUs or RSUs. It communicates with other OBUs or RSUs wirelessly (e.g., IEEE 802.11p). An OBU contains a processor for processing the data, networking devices, memory for storage and retrieval of data, a specialized connection to other OBUs, a user interface, etc. [3]. The purpose of the OBU is to control congestion in the network and to provide reliable and efficient message transfer and data security. It also provides features like ad-hoc networking, wireless communication, geographical routing, etc.


2.4 AU AU stands for Application Unit. It is present in the vehicle and runs the provided applications by utilizing the communication features of the OBU. The AU may reside on the OBU itself, with only a logical distinction between them, and the connection between OBU and AU can be wired or wireless. The AU may be a system used as a personal assistant or as a safety application. The AU communicates with the network entirely through the OBU, which handles the networking and mobility functions.

2.5 RSU RSU stands for Road Side Unit. It is a device which is mostly placed at locations like parking areas or any roadside area. The RSU uses DSRC (IEEE 802.11p) and broadcasts messages among all vehicles in its coverage area. It acts as a router and provides communication between vehicle and infrastructure. The RSU helps in distributing messages to different RSUs or OBUs, thus extending the network range. It also provides safety information, like warnings of accidents, low bridges, work areas, etc.

2.6 VANET Characteristics The characteristics of VANET are unique. They are:
• The mobility is predictable: The hops/nodes in VANET do not move in an arbitrary way, because vehicles are constrained by the layout and topology of the road, by the need to follow road signs and traffic signals, and by the directions in which they travel. This makes mobility predictable.
• Secure and safe driving, enhancing commuter comfort, and improving traffic efficiency: VANET enables a direct exchange of messages among moving nodes/vehicles, thereby permitting many applications that require direct links between vehicles to be added over the network. Such applications can give drivers traveling in the same direction warning messages about accidents, or about the need for sudden hard braking, helping the driver build a wider picture of the road ahead. In addition, other types of applications could be deployed through this kind of system in order to improve passenger comfort and traffic efficiency by disseminating information about weather, traffic flow, and points of interest (service stations, shopping centers, etc.).
• Power supply is not an issue: Power is not as big an issue in VANET as it is in MANETs, since vehicles can provide continuous power to the OBU through a long-life battery.


Fig. 2 75 MHz DSRC spectrum

• Fluctuating network density: The network density varies with traffic. If traffic is high, the network density is also high, and if traffic is low, the network density is low (Fig. 2).
• The network topology varies quickly: Vehicles moving at speed (as on a highway) lead to quick changes in the network topology, which affects driver behavior as the data keeps fluctuating with the changing topology. The radio communication range affects the lifetime of a connection between vehicles: if the radio communication range is large, then the lifetime of the connection will be long. Another factor that affects the lifetime of a connection is the direction of the vehicles; if the vehicles are moving in opposite directions, the lifetime will be shorter. Hence, quick changes in topology can lead to low or no utilization of the network.
• Computational capacity is high: The hops in VANET are basically vehicles. A vehicle comprises various resources like a processor, sensors, GPS, memory, and more. These resources give the vehicle high computational ability, which helps in gathering accurate and useful information and in achieving efficient wireless communication.

2.7 Security in VANET Security in VANET [4–6] is a difficult issue for researchers in this era of cyber threats. A message traveling from one vehicle to another may be intercepted or hacked by an intruder or impostor, which creates vulnerabilities in the system's performance. In VANET, many kinds of attacks are mounted on the system, such as position cheating [7, 8], ID cheating, GPS information hacking, spoofing, message modification, etc. Malicious drivers can create problems in traffic which lead to accidents and other severe issues. Hence, the nodes/vehicles should use security mechanisms in order to resist such threats. In this section, we present the threats to the VANET framework and the security mechanisms to detect the attacks. The attacks [2, 4–6] performed by malicious drivers are discussed as follows:
• Analysis of traffic: In this attack, an attacker analyzes the traffic (the collection of data/exchanges). The attacker gathers all the information by monitoring the vehicular network continuously. By collecting information such as email addresses, requests, and


responses of all the vehicles interacting, the attacker can attack using a guessing strategy. It is also a passive attack, in which no data modification is performed by the attacker.
• Snooping: In this attack, an attacker accesses the data without any authorization. When a node/vehicle in the system transfers data to another node, the attacker intercepts and accesses the content of the data and uses it for itself. It is a passive attack, in which the attacker only monitors or accesses the data without altering it.
• Masquerading: In this type of attack, the attacker impersonates other vehicles by presenting a false identity and advertises itself as a legitimate node. When a pair of vehicles communicate within the framework, the attacker acts as a man in the middle, spoofs the second node, and steals data from the first node. It is an active attack, in which the data can be changed.
• Data modification: In this attack, an attacker captures and modifies the data. If a node in the system transfers important data, e.g., a warning alert (rainstorm alert), to other nodes, then the attacker can modify, delete, or delay the data. Hence, the second node faces the rainstorm and an accident happens. This is an extremely dangerous attack, in which the attacker weakens the system for its own benefit. It is an active attack, in which the data is altered.
• Sybil attack: In this attack, an attacker produces numerous identities and cheats with false identities. A malicious vehicle in the system acts as multiple vehicle nodes, joins the network, and after joining the network it acts malevolently. This is an active attack that degrades the system's performance.
• Repudiation: In this attack, an attacker denies that he/she sent a message. A transmitting node or a receiving node could mount such an attack by denying that it sent a message or that it received any communication, respectively. For instance, suppose a node X transmits data to node Y and Y denies receiving the data; then the message is held up and X may send the data to Y again, which may increase the delay.

2.8 Projects Involving VANET Deploying VANET projects in a real-time system is a significantly challenging task. Some such implementations have been deployed recently; moreover, implementing such projects in a real-time system requires appropriate simulation for estimating the performance of the system. Governments have commissioned numerous new projects to build up ITS. The European countries, the USA, and Japan are building up ITS frameworks by executing VANET projects in urban regions [9]. Early developments mostly centered around the protocol framework, but at present the latest ideas of application architecture and communication framework have


been adopted. Numerous vehicle manufacturing companies like Audi, Ford, BMW, Daimler, General Motors, Nissan, etc. are utilizing ITS frameworks for passenger safety. Numerous such VANET projects are surveyed in and referred from [9].

3 Technology 3.1 DSRC DSRC (Dedicated Short-Range Communication) is a wireless communication technology intended for vehicles in the Intelligent Transport System to communicate with other vehicles or infrastructure. This technology operates in a band of the RF spectrum around 6 GHz (5.9 GHz). It supports interoperability and offers low latency and high efficiency.

3.2 ITS ITS (Intelligent Transportation System) is an advanced application that intends to provide innovative services relating to different modes of traffic and transport management and to enable users to be better informed and make safer, more coordinated, and smarter use of transport networks. Information and communication technologies are applied in the field of road transport, including infrastructure, vehicles, and users, and in traffic management and mobility management, as well as to interfaces with other modes of transport. ITS may improve the efficiency of transport in various respects, for example, road transport, traffic performance and management, and mobility.

3.3 Protocols In telecommunication, a communication protocol is a set of rules that allows two or more entities of a system to transmit information through any kind or variety of physical medium. The protocol defines the rules, syntax, semantics, and synchronization of message exchange and possible error recovery procedures. Protocols may be implemented by hardware, software, or a combination of both [10]. Table 1 shows the comparison between various protocols. Since the communication link between the nodes/vehicles and the roadside infrastructure may exist for only a brief time interval, the IEEE 802.11p amendment describes a way to exchange data through that link without the

Table 1 Comparison between some IEEE protocols
Standards | IEEE 802.11a | IEEE 802.11b | IEEE 802.11p
Modulation | OFDM | DSSS | OFDM
Frequency (GHz) | 5.725–5.850 | 2.400–2.485 | 5.850–5.925
Bandwidth (MHz) | 20 | 22 | 10/20
No. of channels | 12 | 14 | 7
No. of non-overlapping channels | 8 | 3 | 7
Maximum rate (Mbps) | 54 | 11 | 27/54

need to establish a basic service set (BSS), and thus without the need to wait for the association and authentication procedures to complete before exchanging data. Consequently, IEEE 802.11p-enabled stations use the wildcard BSSID (a value of all 1s) in the header of the frames they exchange, and may start sending and receiving data frames as soon as they arrive on the communication channel. Since such stations are neither associated nor authenticated, the authentication and data confidentiality mechanisms provided by the IEEE 802.11 standard (and its amendments) cannot be used. These kinds of functionality must then be provided by higher network layers.

4 Object Detection Algorithms 4.1 Fast RCNN In [11], the approach is similar to the R-CNN algorithm. However, instead of feeding the region proposals to the CNN, we feed the input image to the CNN to generate a convolutional feature map. From the convolutional feature map, we identify the region proposals and warp them into squares, and by using a Region of Interest (RoI) pooling layer we reshape them into a fixed size so that they can be fed to a fully connected layer. From the RoI feature vector, a softmax layer is used to predict the class of the proposed region along with the offset values for the bounding box. With Fast R-CNN, the bounding box regression is added to the neural network training itself.


4.2 Faster RCNN In [11], R-CNN and Fast R-CNN use selective search to discover the region proposals. Selective search is a slow and time-consuming procedure that affects the performance of the network. Accordingly, Faster R-CNN is an object detection algorithm that eliminates the selective search algorithm and lets the network learn the region proposals. As in Fast R-CNN, the image is given as input to a CNN that produces a convolutional feature map. Instead of applying the selective search algorithm to the feature map to identify the region proposals, a separate network is used to predict the region proposals. In [12], to deal with variations in object scale and aspect ratio, Faster R-CNN introduces three kinds of aspect ratios, 1:1, 2:1, and 1:2, and three kinds of anchor box scales, 128 × 128, 256 × 256, and 512 × 512, respectively. So, over a total of 9 boxes, the RPN predicts the probability of a region. To refine the anchor boxes at each location, bounding box regression is applied. In [11], the predicted region proposals are then reshaped using a Region of Interest pooling layer, which is used to classify the image within the proposed region and predict the offset values for the bounding boxes. In [12], the rest of the network is like Fast R-CNN. Faster R-CNN is 10 times faster than Fast R-CNN with similar accuracy.

4.3 You Only Look Once (YOLO) In [12], YOLO splits every image into a G × G grid. Every grid cell estimates n bounding boxes. A confidence score is also obtained, which indicates whether an object is truly present in the box and how accurate the bounding box is. For every class in training, YOLO estimates a classification score for each box. To compute the probability of each class located in a predicted box, the scores can be combined. Thus, G × G × n boxes are predicted. Real-time execution is possible; it is very quick and fast. YOLO works on the entire image at once, as opposed to previous methods that work only on regions, and thus false positives are reduced. The disadvantage of YOLO is that it is not very useful when very small objects are present, as it can only predict one class per grid cell.

4.4 Single Shot Detector In [12], SSD achieves a good balance between accuracy and speed. For detecting objects, it only requires a single shot: it executes a convolutional network on the input image once and computes the feature map. To predict the probability of


Fig. 3 Performance of detection of object algorithms

classification and bounding box, we execute a small 3 × 3 convolutional kernel on this feature map. SSD estimates bounding boxes after several convolutional layers in order to handle scale, and it uses anchor boxes at various aspect ratios. Detection of objects can take place at various scales because each convolutional layer operates at a different scale. Because SSD uses a single shot, it is faster than other algorithms like R-CNN, etc. (Fig. 3).
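As an illustration only (not part of the surveyed works), the sketch below runs a pretrained Faster R-CNN detector from torchvision on a road-scene image; the image path is hypothetical, and depending on the torchvision version the weights argument may instead be pretrained=True.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Pretrained Faster R-CNN (COCO classes include car, bus, truck, person).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

img = to_tensor(Image.open("road_scene.jpg").convert("RGB"))   # hypothetical image
with torch.no_grad():
    pred = model([img])[0]            # dict with 'boxes', 'labels', 'scores'

for box, label, score in zip(pred["boxes"], pred["labels"], pred["scores"]):
    if score > 0.6:                   # keep confident detections only
        print(int(label), [round(float(v), 1) for v in box], float(score))
```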

5 Future Scope The rise of autonomous driving assistance features in cars is quite prominent, and a lot of it depends on connectivity to convey all the messages; hence V2V communication is important. All the communication among vehicles depends on VANET, so the future of VANET is bright, as in the coming years proper communication between vehicles can provide safety, security, and comfort, and help decongest roads through better communication. Autonomous driving, or the driverless car, is the next big thing in the industry, and communication is going to be considered the main factor for the successful implementation of autonomous cars. VANET is going to produce tons of data; hence implementation of a vehicular cloud can help in providing better services. Vehicles can form a cloud network where all the data can be recorded and a number of operations can be performed on the data to improve the services. The vehicular cloud can be really useful and will be an emerging area for research. VANET networks should be made highly fault-tolerant, as the nodes in this network are vehicles, and miscommunication or delayed communication between the nodes can cause serious problems. Also, during routing, if the information is delivered to a node which is faulty, there will be an increase in the delay to route the data to other nodes. There should be a proper and fast recovery mechanism and also a mechanism to properly inform other nodes about the fault in the network. Making VANET networks fault tolerant


is going to help in better communication and is an important area to be researched. MAC protocols can be used to provide fast exchange of data between nodes. More sensors can be included in the system for other applications; for example, gas emissions from the vehicle can be monitored with the help of the respective sensor. Also, image processing (with the help of a camera) can be used for various purposes like detecting dangerous situations, collisions, congestion, etc.

6 Conclusion In this paper, we have carried out a survey on the basics of VANET, its characteristics, architecture, security, applications, and the various protocols used for implementing VANET. We have found that the accuracy of Faster R-CNN is the highest and the speed of YOLO is the highest compared to other object detection algorithms. With emerging technologies, there is a need for establishing a reliable vehicle to vehicle communication system. Many researchers and scientists are still researching the various issues regarding VANET, such as its implementation, maintenance, security, etc. This survey will help researchers explore different aspects of VANET. Also, it presents various methodologies used for implementing vehicle to object communication. We have also done a brief live survey on VANET which gives an idea about its usage. Finally, we have discussed some future aspects of VANET which will help in building smart and reliable vehicular communication.

References 1. Demba A, Möller DPF (2018) Vehicle-to-vehicle communication technology 2. Zeadally S, Hunt R, Chen Y-S, Irwin A, Hassan A (2010) Vehicular ad hoc networks (VANETS): status, results, and challenge. Telecommun Syst 50(4):217–241; Mohammad SA, Rasheed A, Qayyum A (2011) VANET architectures and protocol stacks: a survey 3. Al-Sultan S, Al-Doori MM, Al-Bayatti AH, Zedan H (2014) A comprehensive survey on vehicular ad hoc network 4. Fuentes J, González-Tablas A, Ribagorda A (2010) Overview of security issues in vehicular ad-hoc networks. In: Handbook of research on mobility and computing: evolving technologies and ubiquitous impacts, pp 894–911 5. Raya M, Hubaux J-P (2007) Securing vehicular ad hoc networks. J Comput Secur 15:39–68 6. Isaac JT, Zeadally S, Cámara JS (2010) Security attacks and solutions for vehicular ad hoc networks. IET Commun 4(7):894–903 7. Leinmuller T, Schoch E, Kargl F (2006) Position verification approaches for vehicular ad hoc networks. IEEE Wirel Commun 13(5):16–21 8. Gongjun Y, Olariu S, Weigle M (2009) Providing location security in vehicular ad hoc networks. IEEE Wirel Commun 16(6):48–55 9. https://www.neo.lcc.uma.es/staff/jamal/vanet/?q=content/vanet-its-projects. Accessed Jan 2013


10. Sathya Narayanan P, Sheeba Joice C (2019) Vehicle-to-vehicle (V2V) communication using routing protocols: a review 11. https://towardsdatascience.com/r-cnn-fast-r-cnn-faster-r-cnn-yolo-object-detection-algori thms 12. https://cv-tricks.com/object-detection/faster-r-cnn-yolo-ssd

OBD-II and Big Data: A Powerful Combination to Solve the Issues of Automobile Care Meenakshi, Rainu Nandal, and Nitin Awasthi

Abstract Automobile care is the next big thing that needs to be monitored to control air and noise pollution levels on this planet. In an attempt to achieve this, we propose continuous monitoring of various parameters in the automobile as part of our project. Through this paper we wish to compile the excellent work accomplished by various authors in the fields of Big Data, OBD-II, CAN bus, and OBD-III. Big Data is an expression used for a huge amount of structured and unstructured data which is too large to be processed using conventional software and database methodologies and techniques. In this paper, we shall detail various languages/tools needed for big data analysis, like Hadoop, Python, Spark, R, and Matlab. OBD (On Board Diagnostics) is a self-diagnostics system built into the car or vehicle and commissioned at the time of manufacturing. OBD-II is a tool whose defined function and responsibility is to diagnose and report the state of the car's engine and health. The CAN (Controller Area Network) bus is a system designed for intercommunication of car or vehicle devices. This bus permits plenty of microcontrollers and various types of devices to communicate with each other in real time, moreover without a host computer. Addressing schemes are not needed by the CAN bus, because the network nodes use unique identifiers. OBD-III can be described as a program that can reduce the waiting time between the recognition of an emissions malfunction by an OBD-II system and the repair of the vehicle. OBD-III is forecast as the future of automobile diagnostic systems. An endeavor has been made to cover the interesting portions and material on Big Data, CAN bus, and OBD-II. Keywords Big data · OBD-II · OBD-III · Vehicle management · Vehicle diagnostics · Preventive maintenance and CAN bus


1 Introduction In this ever-growing, fast-moving world, a large part of the population is shifting from villages and small towns to the big metropolitan cities of their respective countries. At present more than 55% of the world's population resides in big or metropolitan cities, and projections state that by 2051–2055 more than 70% of the world's population will be living in metropolitan cities, taking urban living to the next level. This massive migration to the big cities will lead to a capacity crisis and will require mammoth developments in infrastructure, roadways, railways, and other means of transportation, which will be extremely difficult to manage. Meeting these requirements will need highly efficient transport systems and solutions to maintain and uphold transport routines, which can only be managed through the evolution of more complex transport systems and solutions that satisfy the need for heavy and massive transportation. In public transportation, 'Rapid Bus Systems (RBS)', and in the cargo sector, 'Model Change Systems and Chain Systems (MCSCS)', could emerge as solutions. As transportation systems become more complex, accessibility and dependability will become major concerns in the coming days. In the transport business it is always required to maximize profit while keeping the highest yield, which is a major concern given the high level of competition in the market. A slight change in external conditions, such as fuel prices, recession, or vehicle breakdown, can turn a profitable proposition into a loss-making business. Regular tracking and monitoring of haulage efficiency can maintain profitability and viability in business and increase the competitiveness of transport companies in the market. This is achievable with Advanced Smart Intelligent Transport Systems (ASITS) and solutions, like the Smart Fleet Management System (SFMS), which offers well-organized transport management [1]. SFMS streamlines daily operations by providing services like maintenance planning, invoicing, and route and driver planning. It reduces paperwork and decreases waiting time in workshops, at border crossings, and in cargo bays; thus the vehicles take shorter, smarter routes and waiting time is reduced. OBD (On-Board Diagnostics) will be instrumental in further refining and upgrading the FMS in existing as well as future vehicles. Contemporary cars are also equipped with highly complex cyber-physical circuits and systems that incorporate hundreds of ECUs with various microprocessors and sophisticated electronic devices. These systems produce and collect data that is conventionally processed within the car and presented to the driver on the dashboard in a simple symbolic manner. Today all major vehicle companies are continually dedicated to developing new car engines with lower fuel consumption and lower emissions, with keen attention to universal environmental protection, carbon reduction, and energy saving. In order to attain this target, new car engines and vehicles are being fitted with a variety of automotive electronic devices, such as emission control systems, automatic transmission systems, electronic fuel injection systems, etc. [2]. Again, OBD will play a vital role in this mission for all OEMs.


This paper is organized in the following manner: OBD is discussed in Sect. 2; related work is presented in Sect. 3; the various technologies used are elaborated in Sect. 4; Sect. 5 forecasts the bright future of OBD-III; and Sect. 6 presents the conclusion of the paper.

2 OBD OBD is a self-diagnostics system built into the vehicle itself. Located under the dashboard, near the steering wheel, is the OBD-II socket, a 16-pin connector. The system originated in California owing to the awareness and strict rules and regulations of CARB, the California Air Resources Board, which demanded OBD in all cars manufactured after the year 1991 in order to control the quantity of harmful gas emissions [3]. OBD-II is the major contributor to information and data logging in the automobile industry. It covers the logging of all major parameters of the vehicle, from analysis and diagnostics of small vehicle systems to complete fleet and route optimization (Fig. 1). On-Board Diagnostics version 2 (OBD-II) is a protocol used to identify, diagnose and report the engine's health and condition to the user. It is designed to operate over the CAN (Controller Area Network) protocol, using which the vehicle can be connected to scanning tools and external hardware, also called OBD scanners. These OBD scanners can be connected to a PC or laptop, and they can then extract information from the vehicle's on-board electronics, through which all the parameters of the engine can be obtained, such as driving speed, engine speed, coolant level, coolant temperature, emission gas control, engine idle time and much other vital information about the engine. CAN is a vehicle bus standard designed for communication between the various modules in the vehicle. The various microcontrollers used inside the vehicle are able to interact with each other, regarding their operations and functionalities, using this simple serial interface.

Fig. 1 OBD-II 16 pin socket and connector

In this type of protocol a dedicated host computer is not required; most of the components are able to communicate with each other over this bus protocol. OBD scanners extract this CAN data and then, at the presentation and application layers, convert it into information that is readily understood. The values are deciphered using look-up tables applied to the data received from the engine. OBD acts as the higher layer, covering the presentation and application parts, whereas the CAN bus acts as the communication layer handling the data-link and physical-layer issues. Diagnostic Trouble Codes (DTCs) are normally sent through the CAN bus; however, this bus is also able to handle other functionalities. The latest vehicles come loaded with many functions such as lock control and infotainment systems, along with regular functions like controlling speed and acceleration as well as braking and steering, and all the control and data transmission activities are handled by this bus. Take, for example, the sensor recording the wheel speed of the vehicle, which has to provide this data to the speedometer on the display board. The CAN bus will broadcast this information to all the nodes on the vehicle network. The CAN bus protocol includes a node identifier that is unique to every device on the CAN network, and nodes simply ignore packets whose identifier does not match the data they are interested in. The CAN protocol does not carry a source address, so the receiver can never be fully certain of the genuineness of a packet; on the other hand, this broadcast nature gives greater ability to test all the circuits on board. The receiver will normally use the most often received value or the last received value for any of its actions (the display, in our illustration). The primary goals of using the CAN protocol are its simplicity of use and the strong network communication it provides between the various peripherals, from sensors and actuators to processors, controllers and any other nodes that might require real-time information. By using this simple connection, the wiring between the nodes also becomes simple [4]. Figure 2 illustrates the general wiring without a CAN bus, and Fig. 3 the wiring with a CAN bus.
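As noted above, an OBD scanner attached to a PC or laptop can be queried for live engine parameters. The following minimal sketch illustrates this in Python using the third-party python-OBD package; the package, the serial port name and the chosen commands are assumptions of this example and are not prescribed by the paper.

import obd  # third-party python-OBD package (assumed available)

# Connect to an ELM327-style OBD-II adapter; the port name is hypothetical.
connection = obd.OBD("/dev/ttyUSB0")

# Query a few of the engine parameters mentioned above.
for command in (obd.commands.SPEED, obd.commands.RPM, obd.commands.COOLANT_TEMP):
    response = connection.query(command)     # request/response travels over the CAN bus
    if not response.is_null():
        print(command.name, response.value)  # e.g. "SPEED 42 kph"

connection.close()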

2.1 PIDs OBD-II PIDs, or Parameter IDs, are simple codes used to extract information from the engine in an understandable format. The PIDs are decided by the OBD forum and are largely the same across most vehicles [3]. Each message comprises an 11-bit identifier followed by a 64-bit (8-byte) data field [5]. The identifier is used to differentiate between a request message and a response message, and the data field is further subdivided into the mode byte, the PID byte and the data bytes.
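As a concrete illustration of how the mode byte, PID byte and data bytes are deciphered with look-up formulas, the short Python sketch below decodes a few common mode-01 responses. The frame contents are invented for illustration; the decoding formulas (e.g. RPM = (256A + B)/4) are the standard OBD-II ones.

def decode_mode01(payload):
    # payload: the 8 data bytes of an OBD-II response frame
    length, mode, pid = payload[0], payload[1], payload[2]  # length byte is not used further here
    a, b = payload[3], payload[4]
    if mode != 0x41:                  # 0x41 marks a positive response to mode 0x01
        raise ValueError("not a mode-01 response")
    if pid == 0x0C:                   # engine RPM
        return ("engine_rpm", (256 * a + b) / 4.0)
    if pid == 0x0D:                   # vehicle speed in km/h
        return ("vehicle_speed_kmh", a)
    if pid == 0x05:                   # coolant temperature in degrees Celsius
        return ("coolant_temp_c", a - 40)
    return ("unsupported_pid", pid)

# Example frame (illustrative only): 4 extra bytes, mode 0x41, PID 0x0C, A=0x1A, B=0xF8
print(decode_mode01([0x04, 0x41, 0x0C, 0x1A, 0xF8, 0x00, 0x00, 0x00]))   # ('engine_rpm', 1726.0)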


Fig. 2 General wiring connections without a controller area network

Fig. 3 General wiring connection of controller area network (CAN bus)

2.2 Big Data The vehicle's data and information are the main source for the creation of our big data. Hadoop, Python, Spark, R, Matlab or other tools are needed as big data analysis tools. Big Data is a relatively new term and set of tools used to recognize and manage datasets that are so complex and huge that managing them cannot be achieved with existing data mining tools employing regular practices and methodologies. The 3 V's of Big Data management proposed by Laney [6], which are extensively used in E-commerce, divide data management into three verticals.


Fig. 4 Five V’s of big data

1. Variety
2. Velocity
3. Volume.

In the newer concepts, two more V's were added:

4. Variability: here the interpretation of the data comes into the picture, which is dependent on the user.
5. Value: here the information derived from the data can provide decision-making capabilities to businesses, giving the organization an advantage that was not previously present (Fig. 4).

The diagram in Fig. 4 demonstrates the 5 V's in a simple manner [7].

3 Related Work The massive data gathered inside the vehicle needs to be processed and analyzed. The combination of such data streams with online learning is presented in [8]; the main motivation for using stream machine learning is to save time and memory. A complete toolchain is presented as a solution that covers the entire process from data acquisition to data analysis based on driver conditioning and stream machine learning.


The work in [9] recognizes the key characteristics of driving behavior that can cause accidents by collecting OBD driving behavior data from vehicles. Afterwards, the EW-AHP methodology is adopted to build a safe and sound driving scoring system. The score is turned into the adjustment factor of the independent underwriting coefficient in accordance with a driving safety latent variable. It is further concluded that the variation between premiums is significant and provides the basis for tailored rate making in vehicle insurance. A vehicle ICT platform is presented in [10], which is used to evaluate CAN bus and OBD-II data from vehicles. The SensorHUB framework is the base of the ICT platform. 'ObdCanCompare' is an OBD-II scanning and monitoring tool with the capability to gather the data and push it to a smart phone. Once the data is accumulated and processed, it is added to a Hadoop-based data store, after which charts, business intelligence reports, and dashboards can be generated. The final objective of the paper is to design an interface for a connected-car mechanism which, in the long term, is termed Social Driving. The authors of [11] recommend an Inter-Vehicle Communication System (IVCS). To establish the connection, the system employs Wi-Fi on an Android-based smart phone. The authors decided to use this mechanism to first try their system, which can be installed without any major compatibility issues on any Android phone and is also fit for testing vehicular ad hoc networks (VANETs). They then propose that, once tested, the system can be replaced with an 802.11p-based OBU (On-Board Unit). This OBU will be able to transmit crisis messages to inform its connected team or base station about any mishaps or accidents. Results indicate that, as proposed by the authors, this system can make traffic conditions on this planet safer. The focus of [12] is on designing a MyEco-ACC device tailored for ACC (Adaptive Cruise Control) in smart electric vehicles. This device depends on typical driving skills/styles and on the optimization of regenerative braking. Initially, a driving style model is built as a Hammerstein model, and its vital factors differ with various driving styles. The vehicle considered is an EV (Electric Vehicle) fitted with four-wheel hub motors. In this system, after analyzing the characteristics of regenerative braking, a strategy for braking force distribution is evolved. A system is proposed in [13] that can simply be connected over Wi-Fi to the ECU of a vehicle. To connect the OBD data of the vehicle to the smart phone, the system uses a combination of an Arduino board and a Raspberry Pi SBC (Single Board Computer). The proposed system uses this combination to provide connectivity of the data to the cloud for vehicles that do not have in-built connectivity to push data to the cloud, which makes it possible to look at various parameters and functionalities of the vehicle. As of now, this kind of connectivity is available only in high-end luxurious and expensive cars, and with this product the authors wish to extend the support to all vehicles. The system is a low-cost solution for getting all the parameters onto the cloud; even the location of the vehicle can be monitored, as its position can also be analyzed. The captured data is then transferred to the user's smart phone via Wi-Fi.

In this system, additional sensors can be added to the vehicle to improve its working and functionality. Furthermore, driver-assist functions are enabled while driving to support the driver. The main contributions of [14] are analyzing driving behavior data via the vehicle's preloaded tools, studying the aspects that have an impact on safe driving, and setting up a logistic regression model. Furthermore, the authors studied the activity variables that have an important impact on the vehicle's risk circumstances through the model and evaluated the degree of that impact. The research presented in [15] offers a technique to categorize and identify drivers' driving styles. A Driver-In-the-Loop Intelligent Simulation Platform (DILISP), a distinctive testing environment, is created with both dSPACE® and PanoSim-RT®. Three categories of driving style are proposed and defined based on (a) TTS, 'Time-to-Start', (b) TTF, 'Time-To-Follow', and (c) RMSVA, 'Root Mean Square of Vehicle Acceleration'. Driving data samples were collected from twenty-one drivers and examined with a Multi-dimension Gaussian Hidden Markov Process (MGHMP) for driving style recognition. Using the proposed design and detection plan, driving patterns can easily be classified and recognized.

4 Technology Used Various technologies can be used on the data handling side; some of the popular ones are described below:

4.1 Hadoop Hadoop is an open-source software framework used to run applications and store data on commodity hardware. It offers enormous storage space for any kind of data, massive processing power, and the capacity to handle a nearly unbounded number of concurrent, parallel jobs or tasks. Master nodes and slave nodes are the two main groups of machines in a Hadoop cluster. The master node administers data storage on the machines and, using MapReduce, runs parallel computations on that data. The actual storing of data and running of computations is performed by the machines called slave nodes [16].

4.2 Hadoop Distributed File System The Hadoop runtime system is coupled with HDFS, which provides concurrency and parallelism in order to achieve system reliability. HDFS stores gigantic files across cluster nodes and acts as the chief data storage system used by Hadoop applications.

To implement a distributed file system, it uses an architecture of "NameNode" and "DataNode" servers that offers high-performance access to data across highly scalable Hadoop clusters [7].

4.3 Map Reduce MapReduce is software introduced by Google to support distributed computing on gigantic datasets using a cluster of nodes. Basically, it is a programming model capable of processing massive data. Hadoop is capable of running MapReduce programs written in different languages such as Python, Java, Ruby and C++. MapReduce programs are parallel in character and are therefore extremely instrumental in performing large-scale data analysis using multiple machines in the cluster. The MapReduce structure has two functions: the first is the "Mapper" and the second is the "Reducer". The Mapper and Reducer functions are essentially written by the user. Pairs of the form (key, value) describe the data structure of the Map and Reduce functions: the Mapper function takes a (key, value) pair as input and creates an intermediate result set of (key, value) pairs [17].
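To make the Mapper/Reducer pattern concrete, the following word-count sketch is written in Python in the Hadoop Streaming style; it is an illustration only, not code from the paper. The mapper emits (key, value) pairs and the reducer aggregates them, relying on the framework to sort the intermediate pairs by key.

# mapper.py: read lines from standard input and emit one (word, 1) pair per word
import sys

for line in sys.stdin:
    for word in line.strip().split():
        print(f"{word}\t1")

# reducer.py: sum the counts for each word; the framework delivers mapper output
# sorted by key, so all pairs for one word arrive contiguously
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.rstrip("\n").split("\t")
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print(f"{current_word}\t{current_count}")
        current_word, current_count = word, int(count)
if current_word is not None:
    print(f"{current_word}\t{current_count}")

Such scripts are typically submitted to a cluster with the Hadoop Streaming jar, which wires the standard input and output of the two programs into the MapReduce framework.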

4.4 R R is a programming language with an open-source environment that is exceptionally flexible for solving statistics-based problems. R offers a broad range of classical statistical tests, linear and nonlinear modeling, time-series analysis, clustering, classification, and graphical techniques. R is very extensible and allows objects, whether a data set, a single number or a model output, to be stored within the workspace of an R session, as it is an object-oriented language and environment. Afterwards these objects can be used inside functions, used to generate additional objects, or removed as appropriate [18].

4.5 Python Python is a high-level programming language that supports object-oriented concepts along with dynamic semantics. It has advanced built-in data structures, combined with dynamic binding and typing. In simple words, Python is a finely designed language that can be used for real-world programming. It is an object-oriented, portable, open-source, extensible, embeddable, high-level, dynamic, general-purpose programming language that uses an interpreter and can be applied in an enormous range of applications. It is found to be very suitable for Rapid Application Development and also very suitable as a glue or scripting language for joining existing components together.

Python has a very simple syntax and offers easy readability, which decreases program maintenance cost. Python supports packages and modules, which promotes program modularity and the reuse of code. The extensive Python standard library and the interpreter are available for all major platforms, in binary or source form, without charge, and can be freely distributed [19].

5 Bright Forecast: OBD-III As of today, plenty of devices already facilitate the transfer of CAN or OBD-II data over cellular or Wi-Fi links. Although this may be cost-saving and convenient, it is politically challenging due to the 'big brother' aspect. One suggestion is to switch off OBD-II data collection while driving and instead initiate the transfer to, and collection in, a centralized server. This would effectively put the OEMs in better control of utilizing automotive Big Data. OBD-III may be the ultimate solution, producing an 'out-of-cycle' examination: once a technical issue is noticed, a note could be mailed to the owner of the car requiring an 'out-of-cycle' examination within a defined timeline. OBD-III is also expected to provide an additional security basis, i.e., reducing the risk of car hacking. Testing, trials, and finally practical implementation may take time, but it may surely displace the market for OBD-II. Nobody can deny the fact that OBD-II is an extremely refined and suitably competent system for sensing emissions issues. But as far as resolving emission issues is concerned, it behaves the same as OBD-I: OBD-II is merely a light signal unless and until there is some mode of enforcement, like inspection of the MIL light during compulsory checks. Work towards the development of OBD-III has already started. This is being done by taking OBD-II a step further through the addition of telemetry. Fundamentally, OBD-III adds a tiny radio transponder (similar to those used at transport toll plazas) to every vehicle. By employing this transponder, the VIN (Vehicle Identification Number) of the vehicle and the DTCs (Diagnostic Trouble Codes) can be sent to the user or the service station for the necessary checks using cloud connectivity [3]. German car manufacturers have already started working in this direction. In simple words, OBD-III may use MRTT, 'Miniature Radio Transponder Technology', which is somewhat similar to existing automatic electronic toll collection systems. A vehicle fitted with OBD-III would be capable of reporting emissions issues straight to regulatory authorities or agencies; the VIN and any diagnostic codes present would be communicated by the transponder. As soon as the MIL light glows, the system could be designed to report an emission issue with the help of a cellular or satellite link. Its efficiency and cost savings will make this system very attractive. In the present system, the whole vehicle fleet in a district or state has to be examined once every year in order to recognize the 40% or more of vehicles that have emissions issues [20].


Fig. 5 OBD III scanner

In an OBD-III-equipped vehicle with remote monitoring through on-board telemetry, the necessity for regular-interval examinations could be eliminated, since only those vehicles that report issues would be tested; there would be no need to test a vehicle unless it reports an emissions issue. Hence OBD-III-enabled vehicles would save the end user cost and provide convenience. In case any vehicle is diagnosed with an emissions problem, it would be mandatory for the owner to get it fixed. In this way the goal of society and of district or state green tribunals would be met, as a faulty vehicle would be recognized immediately, the issue attended to soon after, and the emissions of the subject vehicle brought under control. This could definitely improve our country's air quality index. SEMA (the Specialty Equipment Market Association) is also looking forward to the implementation of OBD-III in cars. Developers are relying on OBD-III as a program that will diminish the interval between the detection of an emissions malfunction by the OBD-II system and the repair of the vehicle. This will be accomplished by immediate transmission of trouble codes to the repair agency and the owner of the car, a process that would occur constantly and continuously, not just every once in a while. OBD-III as proposed would make it possible to regularly monitor and record every vehicle so equipped, from the moment it leaves the driveway to the moment it returns. All the big companies are working on OBD-III because the inspection process can be computerized via transponder-assisted on-board diagnostic systems, making the process less time-consuming and costly. Figure 5 shows a concept photo of an OBD-III scanner.
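Since OBD-III is not yet a finalized standard, the exact report format is unknown. Purely as a conceptual sketch, the kind of message such a transponder might transmit (the VIN plus any stored trouble codes) could look like the following; all values are hypothetical.

import json
import datetime

def build_emissions_report(vin, dtcs):
    # Conceptual OBD-III-style report: identity of the vehicle plus its trouble codes
    return json.dumps({
        "vin": vin,                                   # Vehicle Identification Number
        "dtcs": dtcs,                                 # Diagnostic Trouble Codes
        "mil_on": bool(dtcs),                         # MIL lamp status implied by stored codes
        "reported_at": datetime.datetime.utcnow().isoformat() + "Z",
    })

# Hypothetical VIN and codes, for illustration only.
print(build_emissions_report("1HGCM82633A004352", ["P0420", "P0171"]))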

6 Conclusion Right from the evolution of the automobile, efforts have been made to make vehicles user friendly, safe, fast, and technology assisted. Presently, ample devices are already being used in vehicles for the transfer of complex CAN or OBD-II data via Wi-Fi or cellular links. As a step of continual enhancement towards technological upgradation, OEMs and technologists are looking forward to OBD-III, which may be the decisive next step in automobile diagnostic systems.


Where OBD-II acts as a defined and responsible means of analyzing and reporting the condition and health of a car's engine, OBD-III will act as a solution and program that may help minimize the waiting time between the identification of an emission issue by the OBD-II system and the rectification of the issue in the vehicle. Hence OBD-III is predicted to be the future of diagnostic systems in the automobile industry.

References
1. Malekian R, Moloisane NR, Nair L, Maharaj BT, Chude-Okonkwo UAK (2016) Design and implementation of a wireless OBD II fleet management system. IEEE Sens J 17(4):1154–1164
2. Jhou J-S, Chen S-H, Tsay W-D, Lai M-C (2013) The implementation of OBD-II vehicle diagnosis system integrated with cloud computation technology. In: 2013 second international conference on robot, vision and signal processing. IEEE, pp 9–12
3. https://www.csselectronics.com/screen/page/simple-intro-obd2-explained/language/en/. Accessed online: 1 Dec 2019
4. Kwon D, Park S, Ryu J-T (2017) A study on big data thinking of the internet of things-based smart-connected car in conjunction with controller area network bus and 4G-long term evolution. Symmetry 9(8):152
5. https://en.wikipedia.org/wiki/OBD-II_PIDs/. Accessed online: 1 Dec 2019
6. Laney D (2001) 3D data management: controlling data volume, velocity and variety. META Group Res Note 6(70):1
7. Meenakshi, Rainu N (2018) A literature review: big data and association rule mining. Int J Eng Technol 7(2.7):948–951
8. Jacob T, Kubica S, Rocco V (2018) Stream machine learning on vehicle data. In: International IEEE conference and workshop, pp 55–60
9. Cheng R, Wang C, Lv G, Liu Z, Wang T (2018) Research on safe driving scoring system and personalized ratemaking of vehicle insurance based on OBD data. In: Proceedings of the 3rd international conference on crowd science and engineering. ACM, p 7
10. Sik D, Balogh T, Ekler P, Lengyel L (2016) Comparing OBD and CAN sampling on the go with the SensorHUB framework. Procedia Eng 168:39–42
11. Su K-C, Wu H-M, Chang W-L, Chou Y-H (2012) Vehicle-to-vehicle communication system through Wi-Fi network using android smartphone. In: 2012 international conference on connected vehicles and expo (ICCVE). IEEE, pp 191–196
12. Sun B, Deng W, He R, Wu J, Li Y (2018) Personalized eco-driving for intelligent electric vehicles. No. 2018-01-1625. SAE Tech Pap
13. Veeraraghavan AK, Kirthika V (2018) Design and development of flexible OBD and mobile communication for internet of vehicles. ICCCSP. https://doi.org/10.1109/ICCCSP.2018.8452826
14. Pan Y-J, Yu T-C, Cheng R-S (2017) Using OBD-II data to explore driving behavior model. In: 2017 international conference on applied system innovation (ICASI). IEEE, pp 1816–1818
15. Sun B, Deng W, Wu J, Li Y, Zhu B, Wu L (2017) Research on the classification and identification of driver's driving style. IEEE, pp 28–32
16. Prajapati DJ, Garg S, Chauhan NC (2017) MapReduce based multilevel consistent and inconsistent association rule detection from big data using interestingness measures. Big Data Res 9:18–27
17. Prajapati DJ, Garg S, Chauhan NC (2017) Interesting association rule mining with consistent and inconsistent rule detection from big sales data in distributed environment. Future Comput Inform J 2(1):19–30
18. Kelley K, Lai K, Wu P-J (2008) Using R for data analysis, pp 535–572. https://cran.r-project.org/doc/contrib/usingR
19. Srinath KR (2017) Python—the fastest growing programming language. Int Res J Eng Technol (IRJET) 4
20. OBDII: past, present & future. www.autotap.com. Accessed online: 1 Dec 2019

OBU (On-Board Unit) Wireless Devices in VANET(s) for Effective Communication—A Review N. Ganeshkumar and Sanjay Kumar

Abstract Vehicular Ad hoc Networks (VANETs) are among the most promising and prominent domains in the field of wireless networks. VANETs are a special application in the area of Mobile Ad Hoc Networks. Since their foundation, VANETs have been adopted by academics and industry for their enhancements to Intelligent Transportation Systems (ITS) across the world. Due to their divergent characteristics, from rapidly changing dynamic topology to intermittent connectivity, VANETs face various challenges. In this paper, the characteristics of VANETs, research challenges, communication standards, and various communication device considerations for OBUs (On-Board Units) and their features are discussed. Keywords VANETs · Intelligent telecommunication systems · Vehicular communication · Communication devices · OBUs (on-board units) wireless devices · RSUs (road-side units)

1 Introduction Vehicular Ad hoc Networks (VANETs) are a sub-domain of wireless networks, specifically of Mobile Ad Hoc Networks (MANETs). VANETs exhibit features similar to MANETs, but have some unique characteristics and behaviour due to the mobility of nodes at great speed. In MANETs the nodes are mobile, and the access points may dynamically rearrange their connection patterns according to signal availability. Interest in VANET research arose in an era of increasing vehicle manufacturing, with new technologies being used to offer better solutions for safety- and security-oriented applications: immediate sharing of information about the traffic along a traveller's route, with instant decisions on blockages and accidents for track changes and alternative route choices; reducing the time wasted in queues at toll plazas; and alerting drivers through safety messages. N. Ganeshkumar (B) · S. Kumar Department of Computer Science and Engineering, SRM University, Delhi-NCR, Sonepat, Sonepat, Haryana, India e-mail: [email protected] S. Kumar e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_15


In VANETs, when a vehicle node moves from one access point (across infrastructures) to another, the network rearranges its connection pattern. In VANETs, on-board sensors are attached to vehicles, which form the mobile nodes [1]. As the vehicles are capable of travelling at great speed, ranging up to 435 km/h, the connection pattern has to be dynamic and instantaneous. In such cases, the connectivity should be steady and must assure the needs for which these technologies are proposed. The node's availability and stability are to be decided on the basis of the following network characteristics [2]:
• Node velocity: The node velocity is defined by the vehicle's moving speed, using a specific movement pattern, whether the vehicle travels in a rural or an urban area.
• Movement patterns: In both rural and urban areas, thanks to the very good road facilities established by local and national authorities, movement is confined to predefined paths, which helps in analyzing the mobility patterns of the network.
• Node density: The node density is defined by the number of homogeneous and heterogeneous vehicle nodes travelling in the respective region at a given time.
• Node heterogeneity: the mix of heterogeneous vehicle nodes.
• Network fragmentation and cooperation with other networks: Network fragmentation will always be a serious factor, because spontaneously joining and leaving vehicle nodes decide the network formation, topology construction and establishment of communication within the network, as well as with other vehicle networks, to pass data on to its destination using RSUs as intermediaries.
This paper is organized as follows: the VANET architecture, VANET applications, and VANET research challenges are described in Sect. 2; Sects. 3 and 4 describe the various communication standards and devices being used and the features exhibited by those devices; and finally Sect. 5 concludes the review and discusses future ideas.

2 VANET Architecture 2.1 Architecture To establish and enable communication among moving vehicle nodes, the necessary mobility environment and infrastructure have to be deployed. The VANET architecture is composed of three kinds of communication: V–V interaction, V–RSU interaction and RSU–RSU interaction. In V–V interaction, the vehicles communicate with neighbouring vehicles moving over the roads [2]; the communicating vehicles disseminate information that will help vehicles arriving on a particular route in the near future, or that is retained for future use. In V–RSU interaction, the vehicles interact with the Road-Side Units, which in turn communicate the information to centralized servers through other Road-Side Units via RSU–RSU interaction.


Fig. 1 VANET architecture

The Road-Side Units (RSUs) can be considered as access points (APs), as routers, or as buffer nodes that accumulate data and deliver it whenever and wherever needed. All data on the RSUs is uplinked or downlinked by vehicle nodes travelling past the RSUs [3, 4]. The architecture of a VANET is depicted in Fig. 1. To establish this vehicular communication, vehicle nodes have to be equipped with on-board sensor devices while travelling in the appropriate direction, supported by the road-side infrastructure. Although the data is sourced from and destined for the moving vehicle nodes, the RSUs play a vital role as intermediary units between the vehicle nodes, ensuring the entire communication process. Hence RSUs also have to be equipped with efficient wireless devices to achieve these purposes. Careful consideration is desirable not only of the connecting device equipment, but also of instant computational ability, power supply, and the availability of both RSUs and OBUs over long distances; proper authentication and security mechanisms are also obligatory [5].

2.2 Applications Based on the type of communication, either V2I or V2V, the applications of VANETs are categorized as follows:
• Safety-Oriented Applications: Monitoring of the adjoining roads, oncoming vehicles, road surface and road bends, which helps in real-time traffic monitoring, road quality checks, message transfer between travelling vehicles, crash notification in case of accidents, collision warning and hazard control, falls under this category.


• Commercial-Oriented Applications: These applications provide services to drivers such as entertainment and internet access, streaming audio and video, vehicle diagnostics information to both drivers and service stations, personalized maps for navigation, etc. [6].
• Convenience-Oriented Applications: These applications provide drivers with conveniences such as electronic toll collection, route information, diversion notifications, parking availability information, etc. [7].

2.3 Research Challenges Due to their unique behaviour, VANETs face various technical and research challenges. These challenges need to be addressed and answered in order to deploy such networks effectively [8, 9]. The technical challenges include:
• Reliability: Reliability is one of the major challenges faced by VANETs in terms of achieving effective communication through proper utilization of the medium among the vehicle nodes and the fixed-infrastructure RSUs.
• Availability: With hundreds of nodes and RSUs in the environment, and given their mobile nature, VANETs should be highly scalable in order to communicate effectively.
• Dynamics: VANETs are dynamic in terms of their frequently changing topologies, their environment and their ad hoc nature. Because of these characteristics, VANETs rapidly establish their own subnetworks as isolated clusters of nodes, but these subnetworks or clusters should interact with other clusters to achieve effective communication.
• Multi-hop: VANETs rely on multi-hop connectivity, which offers vital transport of data among moving vehicle nodes. This raises a major issue in terms of the reliability and efficiency of the MAC (Medium Access Control) protocols used for communication.
• Routing: Routing is also a major challenge confronting researchers and industry, owing to the frequent disconnections that occur during the communication process. Many algorithms and applications are being developed to provide reliable routing services across the mobile nodes.
• Security and privacy: Security and privacy are prime concerns in VANETs. Suitable safety measures have to evolve to ensure availability, integrity, confidentiality, and authentication, and to handle various types of internal and external security attacks.


3 Communication Standards To avoid accidents and to improve the efficiency of traffic management, the main focus is on utilizing wireless communication among vehicles and Road-Side Units (RSUs) to develop new intelligent transportation system (ITS) applications that help improve the safety of public transportation [10].

3.1 Dedicated Short Range Communication (DSRC) In 1999, the United States' FCC established licensing and service rules for the DSRC Service in the ITS Radio Service, operating in the 5.850–5.925 GHz band (the 5.9 GHz band). DSRC covers V–V and V–I communications, securing the well-being of society; it can save human lives by offering precautionary information to drivers and travellers. DSRC has low latency and high reliability, towards securing and sustaining interoperability. The European standardization organizations, in association with ISO (the International Organization for Standardization), developed the following DSRC standards, in which each standard addresses the functionality of a different layer of the universal OSI protocol suite [8]:
• EN 12253:2004 is a standard for RTTT (Road Transport and Traffic Telematics) based on Dedicated Short-Range Communication (DSRC); its physical layer uses microwaves at 5.8 GHz.
• EN 12795:2002 is a standard for RTTT based on the same DSRC, designated for Medium Access Control (MAC) and logical link control (LLC) in the data link layer.
• EN 12834:2002 is a standard for RTTT using DSRC intended for the application layer.
• EN 13372:2004 is a standard for RTTT using DSRC that defines DSRC profiles for RTTT applications.
• EN ISO 14906:2004 is a standard designed for creating an application interface for electronic toll fee collection.

3.2 WAVE and IEEE 802.11p The Wireless Access in Vehicular Environments (WAVE) standards define an architecture and a standardized set of services and interfaces that enable secure wireless communication and physical access with high speed, short range, and low latency for the vehicular environment. WAVE falls under the IEEE 1609 family.


WAVE, known as IEEE 802.11p, is an extension of IEEE 802.11 that provides for the high-speed vehicular environment. WAVE supports applications in short-range ITS. The communication, within the range of connectivity, between vehicle nodes (V2V) or between vehicle nodes and the Road-Side Unit infrastructure (V2I), relies on the 5.9 GHz band. WAVE offers factual information about traffic in real time, which improves safety and reduces traffic jams during transportation, using OBUs and RSUs [11]. The various WAVE standards are the following:
• IEEE P1609.0 provides the architecture and services information needed for WAVE devices that communicate over multiple channels.
• IEEE P1609.2 covers methods for securing WAVE management and application messages and additionally describes the administrative functions necessary to support the core security functions.
• IEEE 1609.3-2010 describes support for upper-layer communication, including TCP/IP.
• IEEE 1609.4-2010 describes a range of formats for communication in DSRC applications at 5.9 GHz.
• IEEE P1609.5 defines communication management services in support of wireless connectivity among vehicle-based devices, and between fixed roadside devices and vehicle-based devices.
• IEEE 1609.11-2010 defines the basis for interoperability between on-board units (OBUs) and roadside equipment (RSE) using DSRC.
• IEEE P1609.12 specifies provisions for allocating WAVE identifiers as defined in the IEEE 1609 series of standards.

4 Communication Devices for VANET(s) Many car manufacturing companies and research associations are exploring approaches to establishing vehicular networks. The flexible nature of MANETs makes them an attractive solution for inter-vehicle communication. Many access technologies have been proposed for VANETs to enhance robustness in highly mobile environments [12]. The devices used for wireless communication should overcome limitations such as limited bandwidth, attenuation, transmission power, reliability and latency. The following devices are considered for deployment as On-Board Units on moving vehicles, and may also be used in RSUs, to establish the vehicular communication environment.

4.1 Bluetooth IEEE 802.15.1 IEEE 802.15.1 is the origin of the Bluetooth wireless communication technologies. It is a Personal Area Network (PAN) technology and a type of ad hoc wireless standard for short-range communication.


Table 1 IEEE 802.15.1 standard (Bluetooth)
Device      Standard        Comm. range                                Data rate (Mbps)   Frequency (GHz)
Bluetooth   IEEE 802.15.1   Class 1—100 m; Class 2—10 m; Class 3—1 m   3                  2.4

Fig. 2 Bluetooth communication

It is designed for tiny, low-cost equipment with minimal power consumption and is intended for the transmission of digital audio and data [13] (Table 1). Generally, devices using Bluetooth wireless communication technology consist of master and slave nodes; any mobile node can act as a master or a slave node. The communication range a device can cover (usually 10 m) is termed a piconet. A slave node may belong to one or more piconets, but each piconet must have only one master node. Master and slave nodes must be in range of each other to establish a connection and carry out further communication. Figure 2 shows this communication process.

4.2 Wi-Fi IEEE 802.11 Wireless Fidelity, or Wi-Fi, also known as wireless LAN, is a network technology for data communication specified by the IEEE 802.11 standard. IEEE 802.11 has various sub-standards such as IEEE 802.11a/b/g/n/ac [4] (Table 2). With the help of the Internet and access points or wireless routers as intermediate physical devices, one can establish communication between sender and receiver nodes using wireless as the medium (without the need for wires).

Table 2 IEEE 802.11 standards (Wi-Fi)

Device   Standard        Comm. range (m)   Data rate (Mbps)   Frequency (GHz)
Wi-Fi    IEEE 802.11a    120               54                 5
         IEEE 802.11b    140               11                 2.4
         IEEE 802.11g    140               54                 2.4
         IEEE 802.11n    250               248                5/2.4
         IEEE 802.11ac                     433–6933           5


Fig. 3 Wi-Fi communication

The medium utilizes radio frequency (within the electromagnetic spectrum). The access point or wireless router, which acts as fixed infrastructure, creates a communication range within which the mobile nodes can establish connections and communicate according to their mobility range. Figure 3 explains Wi-Fi communication.

4.3 BLE/WLAN BLE refers to Bluetooth Low Energy and is proposed for short-range, radio-frequency-based device connectivity. These devices are extremely small and are used in portable devices such as smart phones, where battery life matters more than data transfer rate. Such devices are nowadays designed to connect with multiple devices, synchronize with them and stay connected, yet consume very little battery power. For this purpose, these devices broadcast/receive data at regular intervals of time (Table 3). They otherwise remain in sleep mode; only during the initiation of a connection is the device in active mode, which prevents unnecessary use of energy. Figure 4 explains the BLE communication process among a broadcaster and the receivers/observers.

Table 3 BLE/WLAN

Device   Comm. range   Data rate   Frequency
BLE

12.0 and $1 < 20.0 and ($19 == 'Yes'));
3. Third step is to generate the wind direction and speed at 9 a.m. along with whether it will rain tomorrow or not. Therefore use the syntax: rel = foreach cities generate ($7, $9, $21);
4. Last step is to display the output by using the dump operator. Hence use the syntax: dump rel (Fig. 2).

5.2 Find the number of places where the evaporation is greater than 4.0, the maximum temperature is greater than 25, and both the wind gust direction and the wind direction at 9 a.m. end with SE (South-East direction):
1. First step is to LOAD the 'weather.csv' file from the root directory using a function. Therefore the syntax is: a = load '/weather.csv' using PigStorage(',');
2. Second step is to create a query that satisfies all the above-given conditions. Since there are several conditions, we use the 'and' operator together with the filter operator. To match a string ending with particular characters, we use the '.*' operator (which matches any run of characters) followed by the last letters. So use the syntax: cities = filter a by ($3 > 4.0 and $1 > 25 and $5 matches '.*SE' and $7 matches '.*SE');
3. Third step is to group the above relation by ALL. Use the syntax: grp = group cities all;
4. Fourth step is to count the number of places that satisfy the above conditions by using the COUNT function. Therefore use the syntax: cnt = foreach grp generate COUNT(cities);


Fig. 2 The output of the above query gives 7 places which satisfy the above conditions and hence display the wind direction and speed at 9 a.m. along with whether it will rain tomorrow or not

5. Last step is to display the output by using the dump operator. Hence use the syntax: dump cnt [9] (Fig. 3).

5.3 Find places where there will be no rain and the wind gust direction is NNE (North-North-East). As there will be no rain, find those places where sunshine exceeds 4 and the wind gust speed is more than 30.0. Then display the humidity, pressure, cloud, and temperature at 9 a.m. of those places:

Fig. 3 The output of the above query is given as 4


Fig. 4 The output of the above query gives 22 places which satisfy the above conditions, and also displays the humidity, pressure, cloud, and temperature of those places at 9 a.m.

1. First step is to LOAD the 'weather.csv' file from the root directory, which is done by using a comma separator. Therefore use the syntax: a = load '/weather.csv' using PigStorage(',');
2. Second step is to create a query which satisfies all the above-given conditions, using the 'and' operator for more than one condition; to check whether 'NNE' occurs in the string, we use '.*', which stands for any run of characters. So, all together, use the syntax: cities = filter a by ($5 matches '.*NNE.*' and $4 > 4 and $6 > 30.0 and $2 == 0);
3. Third step is to generate the humidity, pressure, cloud, and temperature at 9 a.m. For this the syntax is: rel = foreach cities generate ($11, $13, $15, $17);
4. Last step is to display the output by using the dump operator. Therefore use the syntax: dump rel [10] (Fig. 4).

6 Conclusion and Future Work Summarizing the above discussion on analyzing data with Apache Pig, it has proven to be an effective tool for solving complex problems and extracting information from data. It can be inferred that any flourishing business brings with it a huge amount of data generated every minute, and it is the need of the hour to extract meaningful information from that data and refine business tactics. Therefore complex datasets need to be handled efficiently. Through the sample queries, it is clear that Apache Pig is an appropriate and easy tool for learning and analyzing data.


Our future work will involve analyzing more complex datasets with Apache Pig and other components of Hadoop, as well as comparing the results obtained, where time and accuracy will be the criteria [11].

References
1. Jain A, Bhatnagar V (2015) Crime data analysis using Pig with Hadoop. In: International conference on information security & privacy (ICISP2015), 11–12 Dec 2015
2. Vaddeman B (2016) Beginning Apache Pig (big data processing made easy)
3. Image link: https://images.app.goo.gl/FQXpXyY33x8FLA2k7
4. Swarna C, Ansari Z (2017) Apache Pig—a data flow framework based on Hadoop Map Reduce. https://doi.org/10.14445/22315381/IJETT-V50P244
5. Rathi S (2017) A brief study of big data analytics using Apache Pig and Hadoop distributed file system. Int J Adv Res Comput Eng Technol (IJARCET) 6(1)
6. Concept of Map Reduce: https://www.tutorialspoint.com/map_reduce/index.htm
7. Syntax for Tez mode: https://pig.apache.org/docs/r0.15.0/start.html#Running+the+Pig+Scripts+in+Mapreduce+Mode+or+Tez+Mode
8. Dataset "weather.csv" from https://www.kaggle.com/zaraavagyan/weathercsv
9. Priya Ranjani AC, Sridhar M (2018) Analysis of web log data using Apache Pig in Hadoop. IJRAR Int J Res Anal Rev 5(2)
10. Queries learned from the website: https://www.tutorialspoint.com/apache_pig/apache_pig_architecture.htm
11. Bharadwaj V, Johari R (2015) Big data analysis: issues and challenges. In: 2015 international conference electrical electronics signals communication and optimization (EESCO)

Analysis of Learner’s Behavior Using Latent Dirichlet Allocation in Online Learning Environment N. A. Deepak and N. S. Shobha

Abstract There are several challenges associated with on-line teaching-learning systems. The most important challenge is to recognize the student who fails to complete the assigned task in a stipulated time. Existing models try to find a solution to this problem, but most of the algorithms fail to classify the input documents correctly and to create linearly separable clusters of learners. To overcome these issues, the proposed methodology applies topic models, namely Latent Dirichlet Allocation (LDA), to create clusters of linearly separable learners. Initially, the required features are extracted and transformed into words and sentences suitable for LDA. The words are then fed to the topic-modeling algorithm (LDA) to generate clusters of similar documents or learners. Several experiments were conducted to evaluate the performance of different predictive models. The results show that the topic-modeling algorithm LDA attains significant clustering of documents over the other state-of-the-art methods. Keywords Clusters · Documents · Features · Topic model · Words

1 Introduction Several challenges are connected with the on-line teaching-learning process [1]. The most important challenge is to recognize the student who fails to complete the assigned task in a stipulated time. The lack of student motivation in various courses and in other online course activities makes this task more complex [2]. As the course materials for online courses are delivered through the Virtual Learning Environment (VLE) [3, 4], the VLE plays an important role in the on-line teaching-learning process. The Virtual Learning Environment (VLE), shown in Fig. 1, is a web-based N. A. Deepak (B) R V Institute of Technology and Management, Bengaluru, Karnataka, India e-mail: [email protected] N. S. Shobha Department of IEM, R V College of Engineering, Bengaluru, Karnataka, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_18


Fig. 1 Virtual learning environment

learning system that replicates real-world learning by integrating virtual equivalents of conventional concepts of education. For example, teachers can assign lessons, tests, and marks virtually, while students can submit assignments and view their marks through the VLE platform. Parents can view their ward's performance and important documents, while school administrators can organize their school calendars and disseminate school notices via the Internet. Latent Dirichlet Allocation [5], a topic-modeling algorithm, was mainly developed with text analysis, i.e., the clustering of textual data, in mind. It has some latent qualities that led to this study. Initially, the primary features are found. The secondary features are then extracted from the primary features to increase the number of words. The words are then fed to LDA, which produces clusters of similar learners. As the performance of LDA reduces with primary sequences alone, the secondary sequences gain importance: they increase the classification rate by generating more words, which in turn produces separable clusters of documents. This paper is structured as follows. Section 2 gives an overview of the existing frameworks connected with the on-line teaching-learning process. Section 3 deals with the exploration of the extracted patterns to understand student behavior, and finally, Sect. 4 deals with the discussion of the obtained results. This is followed by the conclusion and references.

2 Related Works Most of the existing literature on monitoring students' learning behavior in online courses deals with the use of machine learning techniques for identifying student behavior. Arnold and Pistilli [6] predicted students' performance based on real-time feedback and the grades they


obtain in their curriculum. Their work also includes demographic characteristics, past academic history, and the student's effort in the prediction. Baradwaj and Pal [7] describe a data mining technique to analyze student performance. Their method classifies student behavior with the decision tree method of a data mining tool. They extract knowledge that describes the students' performance and identifies drop-out students who need extra attention, allowing the tutor to provide appropriate counselling. Kuzilek [8] predicts students' academic performance using four types of predictive mathematical models for monitoring student performance: a multiple linear regression model, a multilayer perceptron network model, a radial basis function network model, and a support vector machine model. Wladis and Conway [9] analyzed the performance of students who took courses on-line. They explored the impact of course-level factors on on-line courses and found a gap between online and face-to-face course outcomes. Wladis et al. [10] describe the role of students' enrolment choices, on-line course selection rationale, and course difficulty as factors affecting retention, with well-documented evidence that on-line course retention rates are lower than face-to-face retention rates. Kabra and Bichkar [11] use decision trees for mining educational data; decision tree algorithms are used to predict student performance from past academic performance. Romero et al. [12] use the final score obtained by students to predict student performance; their method also uses data mining approaches to improve the prediction result. The learning behavior of students in online courses using a learning management system is discussed in [13], looking at how students with different learning styles prefer to use and learn in particular courses; this revealed several differences among the students' learning patterns. In an empirical approach [14], a study is made of content- and process-based learning with peer assessment, without teacher assistance. The authors explore students' knowledge construction in discussion, using sequential analysis and content analysis to judge students' online learning behavior. In [15], student activity over a period of the course is analyzed based on activity in the online courses; clusters of students with similarities over the period are used to create groups such as not active, very active, and dropped the course. An online learning environment for managing the learning process is required to promote pro-active education [16]. The access count of a student's notes is a reflection of progress in their education; this analysis is used to track the learning process of a student and his or her participation in online courses. The relationships between the student and the behaviors of reading notes, watching lecture videos, note assessment, and internal scores are the metrics that are examined.

3 Implementation LDA, developed by Blei et al. [5] for classifying textual data, has some hidden qualities that increase its applicability to classifying student behavior.


The application of LDA to the on-line data, which contains the student's turn-around time in submitting assignments and the interest shown towards learning, is possible only if the stored data are converted into words appropriately. This is achieved by proposing a novel transformation technique that interprets words from the stored data as follows:
• Initially, the values such as Time-to-Submit-Assignment (TSA), Time-Spent-on-Lecturer-Notes (TSL), Time-Spent-Videos (TSV), Internal Marks (Internals-1 and Internals-2), and Quiz Marks present in the dataset are analyzed to extract the primary sequences.
• The Above-Average-Below-Average (AABA) and Increase-Decrease (ID) sequencing techniques are used to derive secondary sequences from the primary sequences. This increases the number of sequences generated, which in turn increases the words available to the topic model.
• Finally, the words are fed to LDA to generate clusters of linearly separable learners.

3.1 Extraction of Features—Primary Sequences The primary goal of feature extraction is to project the data onto a lower dimension [17–19]. This process mainly relies on a linear transformation. Transforming the extracted features into a lower dimension is an important step because the extracted feature space may be high-dimensional and needs to be reduced. The dimensionality of the features is reduced by selecting important features based on the application being dealt with. Feature extraction is the key step to progress further: the main goal of feature extraction methods is to find q feature vectors by applying a linear projection to the p-dimensional feature set. Principal component analysis, for example, is a linear feature extraction method based on the K-L transform [20] and is suitable for data classification and analysis. Linear discriminant analysis is based on the concept of searching for a linear combination of variables that best separates two classes (targets) [21]. Linear discriminant analysis is used as a dimensionality reduction technique in the pre-processing step for pattern classification and machine learning applications; the goal is to project a dataset onto a lower-dimensional space with good class separability while also reducing computational costs [22]. Subspace linear discriminant analysis is used to increase the classification rate by removing irrelevant features from the data set; it selects only important features for the application and eliminates irrelevant ones to reduce computation. In this methodology, the values assigned for specific tasks such as Time-to-Submit-Assignment (TSA), Time-Spent-on-Lecturer-Notes (TSL), Time-Spent-Lecturer-Videos (TSV), Internal Marks (Internals-1 and Internals-2), Quiz Marks, Concept Maps, Academic Self-Concept, and Family Educational Background are used as assessment tools to measure the student's performance.
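As a minimal illustration of this kind of linear projection (a sketch only, not the authors' exact pre-processing), scikit-learn's PCA can project a p-dimensional feature matrix onto q components; the toy matrix and the meaning of its columns below are assumptions.

# A minimal sketch of linear feature projection with PCA; the toy matrix and
# its column meanings (TSA, TSL, TSV, internals, quiz) are assumptions.
import numpy as np
from sklearn.decomposition import PCA

# Rows: students; columns: p = 6 raw features.
X = np.array([
    [2, 40, 30, 18, 20, 8],
    [9, 10,  5, 10,  9, 3],
    [1, 55, 42, 19, 18, 9],
    [7, 15, 12, 12, 11, 4],
], dtype=float)

pca = PCA(n_components=2)          # project p = 6 features onto q = 2 components
X_low = pca.fit_transform(X)       # lower-dimensional representation
print(X_low.shape)                 # (4, 2)
print(pca.explained_variance_ratio_)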


This plays an important role in creating clusters of students' learning behavior. The details of the assessment methods are given below.
• Time-to-Submit-Assignment: the time interval the student takes to complete and submit the assignment given by the tutor. This gap may vary from the 0th day to the 10th and is counted from the day the assignment is posted on the web.
• Time-Spent-on-Lecturer-Notes: the time spent by a student to read and understand the concepts in the course materials, notes, or study materials. This involves studying time, copying, understanding, etc.
• Time-Spent-Lecturer-Videos: the time spent by a student to watch the course videos and other videos related to the course.
• Internal Marks: this feature reflects a student's understanding capability, capability to reproduce, etc. Generally, two internals are conducted for the course opted by the student.
• Quiz: this feature tests the performance of a student in quizzes, group discussion, reasoning capability, etc.
• Concept Maps: concept maps are a graphic representation of a student's knowledge. Having students create concept maps provides insight into how they organize and represent knowledge. The student's performance can be rated between 0 and 100 using concept maps, where 0 indicates the student has poor knowledge.
• Academic Self-Concept: refers to individual academic achievement, covering success, grade averages, motivation, creativity, or how the student navigated difficult subject areas in the past. The student's performance can be rated between 0 and 100, where 100 indicates a student has gained higher knowledge.
• Family Educational Background: this involves studying the student's family educational background. The student can be rated between 0 and 100, where 0 indicates a poor family educational background.
The tasks above are used to assess student performance; they are numbers that represent the student's turn-around time in submitting assignments, understanding course materials, viewing course videos, marks obtained in internals and quizzes, academic achievement, and family educational background. These responses are combined in the order they appear in the dataset to generate a sequence of numbers known as a primary sequence. In any instance, the primary sequence contains exactly six elements that represent a student's turn-around time in completing the assigned tasks. When the proposed algorithm is executed over the primary features alone, its performance drops, because the approach is not effective in creating separable clusters when the number of generated words is small. This problem is resolved by extracting secondary sequences from the primary sequences using two novel techniques: (1) the Above-Average-Below-Average (AABA) and (2) the Increase-Decrease (ID) sequencing techniques.
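A small sketch of how such a primary sequence could be assembled is shown below; the record values and field names are made up for illustration, assuming a six-field ordering as described above.

# A sketch of forming a "primary sequence" by concatenating a student's
# responses in the order they appear in the dataset; the record is hypothetical.
record = {
    "TSA": 3,          # days taken to submit the assignment (0-10)
    "TSL": 42,         # time spent on lecturer notes
    "TSV": 35,         # time spent on lecture videos
    "Internals-1": 18,
    "Internals-2": 21,
    "Quiz": 8,
}

FIELD_ORDER = ["TSA", "TSL", "TSV", "Internals-1", "Internals-2", "Quiz"]

def primary_sequence(rec):
    """Return the student's responses as an ordered list of numbers."""
    return [rec[f] for f in FIELD_ORDER]

print(primary_sequence(record))   # e.g. [3, 42, 35, 18, 21, 8]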


3.2 Extraction of Features—Secondary Sequences As the performance of LDA declines with primary sequences alone, the extraction of secondary sequences gains importance. The new sequences increase the sequence count, which in turn increases the number of words generated. Above-Average-Below-Average: Initially, the average μ(β) of the sequence β is calculated and subtracted from the elements of the sequence, Eq. (1). In the next step, the successive elements C(i) and C(i + 1) of the resulting sequence are compared; the outcome of this comparison is used to calculate the intermediate sequence ρ, Eq. (2). Finally, the above-average-below-average sequence U is calculated as shown in Eq. (3). Figure 2 shows the above-average and below-average sequence.

C(i) = β(i) − μ(β),   i = 1, …, |β|                                   (1)

ρ(i) = 1 if (C(i) < 0 && C(i + 1) < 0) || (C(i) > 0 && C(i + 1) > 0), and 0 otherwise,   i = 1, …, |C| − 1                                   (2)

U(i) = ρ(i) − ρ(i − 1),   i = 2, …, |ρ|                                   (3)

Fig. 2 Secondary sequence: above-average and below-average sequence


Fig. 3 Secondary sequence—increase and decrease sequence

Increase-Decrease Sequence: The increase and decrease sequence is found by subtracting successive elements of the same sequence to generate a difference sequence Δ, Eq. (4). The successive elements of the Δ sequence are then compared and, based on the result, the sequence ζ is generated, Eq. (5). Finally, the increase-decrease sequence ∩ is found by subtracting the adjacent elements of ζ, as shown in Eq. (6). This process is repeated for all other sequences to calculate the rise and fall sequences ∩, as shown in Fig. 3.

Δ(i) = β(i) − β(i − 1),   i = 2, …, |β|                                   (4)

γ(i) = (Δ(i) > 0) && (Δ(i − 1) > 0)
r(i) = (Δ(i) < 0) && (Δ(i − 1) < 0)
ζ(i) = ∼(γ(i) || r(i) || (sign(Δ(i)) == 0)),   i = 2, …, |Δ|                                   (5)

∩(i) = ζ(i) − ζ(i − 1),   i = 2, …, |ζ|                                   (6)
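A short Python sketch of the two secondary-sequence transforms is given below, under one possible reading of Eqs. (1)–(6); it is illustrative only and not the authors' reference implementation.

# A sketch of the two secondary-sequence transforms under one possible reading
# of Eqs. (1)-(6): AABA flags whether successive elements stay on the same side
# of the mean, ID flags whether the sequence keeps rising or falling.
import numpy as np

def aaba_sequence(beta):
    beta = np.asarray(beta, dtype=float)
    c = beta - beta.mean()                       # Eq. (1): centre on the average
    same_side = ((c[:-1] > 0) & (c[1:] > 0)) | ((c[:-1] < 0) & (c[1:] < 0))
    rho = same_side.astype(int)                  # Eq. (2): 1 while the side is unchanged
    return np.diff(rho)                          # Eq. (3): changes of the indicator

def id_sequence(beta):
    beta = np.asarray(beta, dtype=float)
    d = np.diff(beta)                            # Eq. (4): successive differences
    rising  = (d[1:] > 0) & (d[:-1] > 0)
    falling = (d[1:] < 0) & (d[:-1] < 0)
    zeta = (~(rising | falling | (d[1:] == 0))).astype(int)   # Eq. (5): trend changes
    return np.diff(zeta)                         # Eq. (6): changes of the indicator

print(aaba_sequence([5, 4, 7, 9, 3, 5]))
print(id_sequence([5, 4, 7, 9, 3, 5]))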

3.3 Generation of Words—Building Blocks The words are generated by transforming the secondary sequences (Ý) into a word representation. This technique generates words by processing the elements (δi, δi+1, δi+2, and δi+3), which are initially assigned the first four elements of the selected sequence (Ýk).


In each successive iteration, the elements (δi, δi+1, δi+2, and δi+3) are updated with new values obtained by moving one position towards the right of the selected sequence (Ýk). This process is repeated to generate words; in each iteration, only four elements are considered.
• The word count (w1 = w1 + 1) is incremented if the first two elements, (δi) and (δi+1), are equal, Eq. (7).
• The word count (w2 = w2 + 1) is incremented if the last two elements, (δi+2) and (δi+3), are equal, Eq. (8).
• The word count (w3 = w3 + 1) is incremented if the sum of the first two elements equals the sum of the last two elements, Eq. (9).
• The word count (w4 = w4 + 1) is incremented if the difference of the first two elements is zero, [(δi) − (δi+1) == 0], or the difference of the last two elements is zero, [(δi+2) − (δi+3) == 0], or the sign of the difference of the first two elements equals the sign of the difference of the last two elements, sign((δi) − (δi+1)) == sign((δi+2) − (δi+3)), Eq. (10).
For example, let the sequence (Ý1) = 5, 4, 7, 9, 3, 5; then the elements (δ1, δ2, δ3, δ4) for the first iteration are assigned the initial values (5, 4, 7, 9), respectively, and the generated words are (w1 = 0, w2 = 0, w3 = 0, w4 = 0) using Eqs. (7)–(10). In the second iteration, the elements (δ1, δ2, δ3, δ4) are assigned the new values (4, 7, 9, 3), respectively, obtained by moving one element towards the right of the selected sequence. Table 1 shows the steps involved in generating the words for LDA.

w1 = Σ_{i=1}^{η} [ (δ(i) − δ(i + 1)) == 0 ]                                   (7)

w2 = Σ_{i=1}^{η} [ (δ(i + 2) − δ(i + 3)) == 0 ]                                   (8)

where [·] equals 1 when the condition holds and 0 otherwise, and η is the number of four-element windows examined.

Table 1 Words for LDA—the building blocks

Input: secondary sequence (U)
Output: words (w)
for i = 1 to Length(Ui)
  Initialize δ1 = Ui(Ej); δ2 = Ui(Ej + 1); δ3 = Ui(Ej + 2); δ4 = Ui(Ej + 3)
  Step 1: if (δ1 == δ2), then w(1) = w(1) + 1
  Step 2: if (δ3 == δ4), then w(2) = w(2) + 1
  Step 3: if ((δ1 + δ2) == (δ3 + δ4)), then w(3) = w(3) + 1
  Step 4: if ((δ1 ≤ δ2) and (δ3 ≤ δ4)), then w(4) = w(4) + 1
end for

w3 = Σ_{i=1}^{η} [ (δ(i) + δ(i + 1)) == (δ(i + 2) + δ(i + 3)) ]                                   (9)

w4 = Σ_{i=1}^{η} [ ((δ(i) − δ(i + 1)) == 0) || ((δ(i + 2) − δ(i + 3)) == 0) || (sign(δ(i) − δ(i + 1)) == sign(δ(i + 2) − δ(i + 3))) ]                                   (10)
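The sliding four-element window and the counting rules of Eqs. (7)–(10) can be sketched in Python as follows; this is an illustrative reading, and the example sequence is the one used in the text.

# A sketch of the sliding four-element window that counts the words w1-w4,
# following one reading of Eqs. (7)-(10); not the authors' reference code.
import numpy as np

def generate_words(seq):
    w = [0, 0, 0, 0]
    for i in range(len(seq) - 3):
        d1, d2, d3, d4 = seq[i:i + 4]             # current window
        if d1 == d2:                              # Eq. (7)
            w[0] += 1
        if d3 == d4:                              # Eq. (8)
            w[1] += 1
        if d1 + d2 == d3 + d4:                    # Eq. (9)
            w[2] += 1
        if (d1 == d2) or (d3 == d4) or \
           np.sign(d1 - d2) == np.sign(d3 - d4):  # Eq. (10)
            w[3] += 1
    return w

# Example from the text: for (5, 4, 7, 9, 3, 5) the first window is (5, 4, 7, 9).
print(generate_words([5, 4, 7, 9, 3, 5]))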

As mentioned earlier, the performance of the proposed algorithm declines with the primary sequences because fewer words are generated from them. It is observed that the proposed algorithm with primary sequences generates an average of 20 words per subject. The words and their occurrences per subject are given as (word: occurrence = 2:2, 3:1, 6:1, 33:2, 34:3, 38:3, 54:2, 58:2, 66:2, 85:1, 90:3, 97:3, 100:2, 102:1, 130:8, 132:1, 163:1, 166:3, 164:3, 176:1). This is considerably less than the number of words generated using secondary sequences. The secondary sequences extracted from the primary sequence using the above-average-below-average and increase-decrease sequencing techniques increase the number of words; hence, the extraction of secondary sequences gains importance in this framework. It is observed that the proposed algorithm with secondary sequences generates, on average, 40 words per document. The words and their occurrences within a document are given as (word: occurrence = 2:2, 3:1, 4:4, 6:1, 8:2, 10:1, 13:2, 15:3, 17:3, 18:3, 24:2, 25:1, 30:1, 33:2, 34:3, 42:2, 45:3, 50:2, 55:2, 60:1, 65:1, 66:2, 69:1, 70:1, 78:3, 97:3, 98:3, 100:2, 102:1, 106:1, 110:2, 120:2, 130:8, 132:1, 161:7, 162:3, 163:1, 166:3, 170:3, 176:1). This clearly indicates that the performance of the proposed algorithm increases as the vocabulary space for LDA increases.

3.4 K-means Clustering—Create Clusters of Student Documents In the proposed algorithm, clusters are formed based on the students' behavior in completing the assigned tasks. Clustering is the process of grouping together similar student behaviors within a class or cluster. That is, a cluster is a collection of student documents that are 'similar' among themselves (intra-cluster) and 'dissimilar' to the documents belonging to other clusters (inter-cluster). The metric used to measure the similarity between documents is distance: multiple documents belong to the same cluster only if the measured distance is minimal. Clustering is a technique to classify individuals into different groups based on the chosen metric; more precisely, it partitions the data into groups so that the data in each cluster have some common properties. The proposed algorithm uses K-means clustering [22] to group the students into clusters based on the behavior exhibited in completing the assigned tasks. Here


the document (ai) is assigned to the jth cluster only if the difference (ai − mj) is minimal. The proposed method uses the leave-one-out cross-evaluation technique to create clusters. The input documents are broadly divided into two sets, namely test and training samples, where the test set contains a single document and the remaining documents form the training set; this division is repeated until each document has been tested. The problem of assigning a student document to a cluster involves finding the closest match between the student's behavior (test set) and the behavior of other students found in the clusters (training set). To achieve this, the proposed algorithm is executed independently on both the test and training samples. The word-occurrence matrix is found from the training set, whereas only the words are generated from the test set. Finally, the generated words from the test set are compared with the word-occurrence matrix of the training set to find the set of documents that have similar words and occurrences (word counts). In this comparison, documents with similar behaviors are grouped into a single cluster.
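A minimal sketch of the clustering stage is given below: per-document word counts are reduced to topic proportions with scikit-learn's LDA implementation and then grouped with k-means. The toy counts and the choice of three topics and two clusters are assumptions for illustration, not values from the paper.

# A sketch of the clustering stage; the word-count matrix and the numbers of
# topics/clusters are made-up illustrative values.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.cluster import KMeans

# Rows: student documents; columns: occurrence counts of the generated words.
word_counts = np.array([
    [2, 0, 1, 4, 0, 3],
    [0, 3, 0, 1, 5, 0],
    [3, 0, 2, 5, 0, 2],
    [0, 4, 1, 0, 6, 1],
])

lda = LatentDirichletAllocation(n_components=3, random_state=0)
topic_mix = lda.fit_transform(word_counts)      # per-document topic proportions

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(topic_mix)
print(labels)                                   # cluster id per student document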

4 Experiments and Results Experiments were carried out on the publicly available Open University Learning Analytics Dataset (OULAD) [23]. The performance of predictive models such as Support Vector Machine (SVM), Random Forest (RF), and the XGBoost algorithm, trained with a sampling-window technique, is evaluated in terms of precision, recall, F-measure, and ROC. Experiments are conducted for different deadline days; the ROC performance is averaged and the results are noted as shown in Fig. 4. An average ROC performance improvement of 35.969% is attained by LDA over the RF and XGBoost models. From Fig. 4, it can be seen that as the number of deadline days increases, the LDA model attains higher accuracy in identifying risk, whereas the accuracy of the RF model degrades rapidly.
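The reported metrics can be computed with standard library routines; the sketch below uses scikit-learn with made-up labels and scores purely for illustration.

# A sketch of the reported metrics (precision, recall, F-measure, ROC AUC);
# the labels and scores are hypothetical placeholders.
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

y_true  = [1, 0, 1, 1, 0, 0, 1, 0]                   # ground-truth at-risk labels
y_pred  = [1, 0, 1, 0, 0, 1, 1, 0]                   # hard predictions from a model
y_score = [0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1]   # predicted probabilities

print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F-measure:", f1_score(y_true, y_pred))
print("ROC AUC  :", roc_auc_score(y_true, y_score))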

Fig. 4 Performance of the RF and XGBoost models versus the proposed LDA model for a varied number of deadline days to complete the assigned task


Table 2 Performance of the RF model and XGBoost model versus the LDA model for a varied number of deadline days

Days-deadline   RF model   XGBoost model   LDA model   Improvement performance
0               0.7755     0.7555          0.8222      0.6677
1               0.6122     0.7999          0.8555      0.5566
2               0.5411     0.7888          0.8557      0.6999
3               0.4721     0.7799          0.9874      0.2075
4               0.4592     0.7696          0.8563      0.0867
5               0.4223     0.7119          0.8452      0.1333
6               0.4122     0.6879          0.7548      0.0669
7               0.4011     0.7243          0.8452      0.1218
8               0.2051     0.3536          0.4226      0.0069
9               0.3643     0.6998          0.7541      0.0543
10              0.2051     0.3536          0.4226      0.0069
AVG             0.48703    0.7424          0.84216     0.15996

Table 2 shows the comparison of the different models for a varied number of deadline days to complete the assigned task.

5 Conclusion In recent years, online learning techniques have played a major role in the field of education. These techniques are used to provide distance education, online teaching-learning materials, and courses, and the use of a virtual learning environment is one of the best choices for online learning. This paper contributes to the application of a topic model, Latent Dirichlet Allocation (LDA), to such data. In the initial step, the primary features are extracted from the publicly available metadata in the dataset. Using the primary sequences alone, the number of words generated for LDA is small. To increase the performance of LDA, secondary sequences are extracted from the primary sequences using the above-average-below-average and increase-decrease sequencing techniques. The words are then fed to the topic-modeling algorithm (LDA) to generate clusters of similar documents or learners. Several experiments were conducted to evaluate the performance of predictive models, and the obtained results show how the topic modeling algorithm LDA attains significant clustering of documents compared with the other state-of-the-art models.


References
1. Wu P, Yu S, Wang D (2018) Using a learner-topic model for mining learner interests in open learning environments. Educ Technol Soc 21(2):192–204
2. Peng X, Liu S, Liu Z, Gan W, Sun J (2016) Mining learners' topic interests in course reviews based on like-LDA model. Int J Innov Comput Inf Control 12(6):2099–2110
3. Alves P, Miranda L, Morais C (2017) The influence of virtual learning environments in students' performance. Univ J Educ Res 5(3):517–527
4. Mahajan A (2016) A research paper on virtual learning environment. AIMA J Manag Res 10(2)
5. Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
6. Arnold K, Pistilli M (2012) Course signals at Purdue: using learning analytics to increase student success. In: Proceedings of international conference on LAK'12, pp 267–270
7. Baradwaj BK, Pal S (2011) Mining educational data to analyze students performance. Int J Adv Comput Sci Appl (IJACSA) 2(6):63–69
8. Kuzilek J, Hlosta M, Herrmannova D, Zdrahal Z, Vaclavek J, Wolff A (2015) Analyzing at risk students at The Open University, pp 1–14. ISSN 2057-7494
9. Wladis C, Conway K (2014) An investigation course level factors as predictors of online STEM course outcomes. Comput Educ 1–22
10. Wladis C, Wladis K, Hachey AC (2014) The role of enrollment choice in online education: course selection rationale and course difficulty as factors affecting retention, pp 1–14
11. Kabra RR, Bichkar RS (2011) Performance prediction of engineering students using decision trees. Int J Comput Appl 36(11):8–12
12. Romero C, Lopez M-I, Luna J-M, Ventura S (2013) Predicting students' final performance from participation in on-line discussion forums. J Comput Educ 68:458–472
13. Graf S, Liu TC, Kinshuk (2010) Analysis of learners' navigational behavior and their learning styles in an online course. J Comput Assist Learn 26(2):116–131
14. Hou H-T, Chang K-E, Sung Y-T (2008) An analysis of peer assessment online discussions within a course that uses project-based learning. J Interact Learn Environ 15(3):237–251
15. Conijn R, Zaanen M (2017) Trends in student behavior in online courses. In: Third international conference on higher education advances. https://doi.org/10.4995/HEAD17.2017.5337
16. Perlibakas V (2004) Distance measures for PCA-based face recognition. Pattern Recogn Lett 25:711–724
17. Ghojogh B, Samad MN, Mashhadi SA, Kapoor T, Ali W, Karray F, Crowley M (2019) Feature selection and feature extraction in pattern analysis: a literature review. Research Gate
18. Borges VRP, Esteves SL, De Nardi Araujo P, Oliveira LC, Holanda M (2019) Using principal component analysis to support students' performance prediction and data analysis. In: VII Congresso Brasileiro de Informática na Educação, pp 1383–1392
19. Yuan J, Li YM, Liu CL, Zha XF (2010) Leave-one-out cross-validation based model selection for manifold regularization. In: Advances in neural networks. Lecture notes in computer science, vol 6063. Springer, Berlin, Heidelberg
20. LDA—Data Mining Map. http://www.saedsayad.com/lda.htm
21. Latent Dirichlet Analysis—Sebastian Raschka. http://sebastianraschka.com/Articles/2014_python_lda.html
22. Oyelade OJ, Oladipupo OO, Obagbuwa IC (2010) Application of k-means clustering algorithm for prediction of students academic performance. Int J Comput Sci Inf Secur (IJCSIS) 7(1)
23. Kuzilek J, Hlosta M, Zdrahal Z (2017) Data descriptor—open university learning analytics dataset. Sci Data 4:170–171

An Overview of Recent Developments in Convolutional Neural Network (CNN) Based Face Detector

Rahul Yadav and Priyanka

Abstract Face detection is a fundamental and extensively studied problem in computer vision. It is the first and primary step for many applications, including face recognition/verification, face tracking, and facial behavior analysis. A major challenge for a face detector is to detect faces in unconstrained (also called in-the-wild) conditions, such as variations in pose, illumination, scale, expression, makeup, and occlusion. Recently, the accuracy and performance of face detectors have improved tremendously because of the use of Convolutional Neural Networks (CNNs). This survey paper focuses on recent advancements in face detection techniques based on Convolutional Neural Networks and a categorization of CNN face detectors. The paper concludes by identifying future directions for the field. Keywords Face detection · Artificial neural networks · Computer vision · Convolutional neural networks · Deep learning · Deep neural network · Object detection

1 Introduction Face detection is the problem of detecting and localizing an unknown number of faces in images and video frames [1]. It is a basic problem of active research in computer vision and the primary step for many applications, including face recognition [2], surveillance [3, 4], biometric authentication [5], augmented reality [6], and medical applications [7–10]. Early research on face detection used skin modeling techniques [11] and various pattern recognition techniques [12]. The majority of early face detectors worked well on frontal faces but do not perform well in unconstrained conditions [13].


Fig. 1 Different types of variations of faces in unconstrained (in-the-wild) conditions [19]

The unconstrained conditions involve faces with variations in pose, expression, scale, makeup, lighting conditions, and occlusion, as shown in Fig. 1. The Viola-Jones face detector [14] and its variations [13, 15] could perform well in some unconstrained conditions. These detectors used hand-engineered or handcrafted features such as HAAR features and the AdaBoost learning technique. Machine learning approaches using handcrafted features require domain knowledge for designing a feature extractor for a given problem [16]. To overcome this disadvantage, deep learning is used, which generates feature maps directly from raw data. Advancements in computational capabilities (e.g., GPUs and TPUs), broad software library support (such as Apache Spark, TensorFlow, and PyTorch), and the availability of large face datasets [17–19] have also motivated deep learning research. With Convolutional Neural Networks (CNNs), face detection is defined as the problem of detecting distinct face instances in an image or video frame. Many notable survey papers on face detection have been published in the past, and the majority of them reviewed machine learning algorithms using hand-engineered feature extraction techniques. One of the earliest survey papers discussed various segmentation and localization techniques for human face detection and recognition. The survey [20] covered face detection using feature representation and machine learning techniques. Later survey papers [13, 21] discussed the Viola-Jones algorithm and its variations, and the application of deep learning algorithms to face detection was partly explored in [13]. The paper [22] reviewed skin color modeling techniques for face detection. The most recent survey paper [23] reviewed deep learning techniques for face recognition, including a brief discussion of deep learning based face detection, while [24] surveyed face detection techniques for low-quality images and compared various face detectors. The main focus of this survey paper is to review recent CNN face detection techniques. The remainder of the paper is structured as follows. In Sect. 2, CNN face detectors are categorized according to the technique used for feeding data into the CNN, which divides them into three categories. Sections 3, 4, and 5 explain the three categories of face detectors, respectively.


Fig. 2 CNN-based face detector categorization

2 CNN Face Detector Categorization We have divided CNN face detectors into three categories according to the method used to input images into the CNN, as shown in Fig. 2. First, the CNN Cascade Face Detector: these frameworks include a preprocessing step that creates image pyramids, and a sliding window is used to provide input to the CNN. Second, the Region-based Face Detector: R-CNN, Fast R-CNN, and Faster R-CNN are the most commonly used frameworks, which use region proposals as input and can be further classified into R-FCN (Region-based Fully Convolutional Network) and Contextual R-CNN. Third, the Proposal-Free Network: these frameworks do not require any region proposals, and SSD and YOLO are the most commonly used algorithms.
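A brief sketch of the image-pyramid plus sliding-window input scheme used by the cascade detectors of the next section is shown below; the scale factor, window size, and stride are illustrative choices rather than values from any cited detector.

# A sketch of image-pyramid + sliding-window candidate generation; all
# parameter values are illustrative assumptions.
import numpy as np

def image_pyramid(img, scale=0.75, min_size=12):
    """Yield progressively downscaled copies of a (H, W) image array."""
    while min(img.shape[:2]) >= min_size:
        yield img
        new_h, new_w = int(img.shape[0] * scale), int(img.shape[1] * scale)
        if new_h < 1 or new_w < 1:
            break
        # Nearest-neighbour resize to keep the sketch dependency-free.
        rows = (np.arange(new_h) / scale).astype(int)
        cols = (np.arange(new_w) / scale).astype(int)
        img = img[rows][:, cols]

def sliding_windows(img, win=12, stride=4):
    """Yield (x, y, patch) crops that would be fed to the first cascade stage."""
    for y in range(0, img.shape[0] - win + 1, stride):
        for x in range(0, img.shape[1] - win + 1, stride):
            yield x, y, img[y:y + win, x:x + win]

frame = np.random.rand(48, 64)
total = sum(1 for level in image_pyramid(frame) for _ in sliding_windows(level))
print(total)   # number of candidate windows over all pyramid levels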

3 Cascade-Based Face Detector A cascade classifier is defined as a set of weak classifiers combined to make a strong classifier [25]. The CNN cascade [26] uses CNNs to leverage their property of automatically extracting features covering complex variations from a large amount of training data. The CNN cascade consists of a pipeline of six CNN architectures, of which three CNNs are used for classification and the remaining three for bounding-box calibration. The input image is transformed into an image pyramid to cover faces at different scales, and each level is fed into the cascade pipeline using sliding-window techniques. The first CNN in the pipeline, called 12-net, is a very shallow classifier used for a quick scan that proposes regions which may contain faces while rejecting background/non-face regions. The proposed regions are fed into the second CNN, called 12-calibration-net, which calibrates the bounding box for each proposed region. The bounding box is calibrated using predefined patterns or anchor boxes, each a set of three values: a scale change Sn and two offsets (xn, yn). The calibration net produces a confidence score for each pattern and for every region proposed by 12-net, and an NMS (Non-Maximum Suppression) layer is then used to suppress the low-confidence windows.


The remaining windows with high confidence scores are resized to 24 × 24 and fed into 24-net for classification; in addition, the same input is also fed into the 12 × 12 net to detect small faces and reduce false detections. The output of 24-net is given to a 24-calibration net similar to the 12-calibration net. To make the detector more robust to false detections, a 48-net classifier is used, followed by a 48-calibration-net with one pooling layer as the last calibration layer. FaceCraft [27] proposed a joint training architecture for end-to-end optimization of the cascade detector, which trains all cascade detectors at once. The input images are first converted into an image pyramid, and then a sliding window with a stride length of 8 is used to generate face proposals of size 48 × 48. The architecture uses ReLU activations and drop-out before the classification and regression layers, a softmax function for classification, and a smooth L1 loss for bounding-box regression. The paper also discusses joint training for RPN + Fast R-CNN. The CNN cascade architecture in combination with a Kalman filter proposed in [28] is used to detect faces in video frames. The proposed architecture is lightweight and called a coarse-to-fine convolution network. The overall face detection process consists of resizing the input frame, creating an image pyramid from it, and then feeding it to a coarse CNN detector and a fine CNN detector. The coarse detector is similar to 12-net: it takes a 12 × 12 input image and classifies facial regions and their corresponding bounding-box vectors. The output of the coarse detector is refined by the fine CNN detector, which removes false detections and corrects the bounding-box vectors. A Kalman filter is used to predict the motion of a face across frames when there is severe occlusion. The Progressive Calibration Network (PCN) [29] was proposed to detect faces with large RIP (Rotation-In-Plane) orientations. PCN is a three-stage process whose input is obtained by generating an image pyramid and applying sliding-window techniques. The stages of PCN have three objectives: face classification, bounding-box regression, and calibration classification for the RIP orientation of the face. An NMS layer is used to remove overlapping regions. The performance of the CNN cascade can be improved by using network acceleration, merging multiple layers, knowledge distilling, and hard mining techniques, as discussed in [30].
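A minimal sketch of the greedy Non-Maximum Suppression step referred to above is given below; the boxes, scores, and IoU threshold are illustrative values rather than settings from any cited paper.

# A sketch of greedy Non-Maximum Suppression (NMS) used to discard
# low-scoring, heavily overlapping windows; values are illustrative.
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """boxes: (N, 4) [x1, y1, x2, y2]; returns the indices of kept boxes."""
    boxes, scores = np.asarray(boxes, float), np.asarray(scores, float)
    order = scores.argsort()[::-1]               # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the best box with all remaining boxes.
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter)
        order = order[1:][iou < iou_thresh]      # drop windows that overlap too much
    return keep

boxes  = [[10, 10, 60, 60], [12, 12, 58, 62], [100, 100, 150, 160]]
scores = [0.92, 0.85, 0.77]
print(nms(boxes, scores))                        # e.g. [0, 2]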


3.1 Multi-Task Cascade CNN The CNN cascade is also used for multi-task purposes, which include detecting several attributes in addition to the face itself. MTCNN [31] is a multi-task framework that simultaneously predicts facial landmarks for alignment and localizes the face using a bounding box. MTCNN has a three-stage cascade architecture consisting of P-Net (Proposal Network), R-Net (Refine Network), and O-Net. The first two stages are used to propose facial region candidates and refine the proposed regions, while the last stage makes the final predictions. The architecture uses a cross-entropy loss for classification and a Euclidean loss for both bounding-box regression and facial landmark localization. An improved version of the multi-task CNN, the HEM method [32], is proposed for end-to-end training; it uses a smooth L1 loss instead of the Euclidean loss. A Hard Example Proposal Network is used to generate training samples, producing regions that are treated as positive if their IoU with a ground-truth box is above a threshold. A CNN feature fusion strategy [33] for performing pose estimation and detection has been proposed. The architecture is a pipeline of three CNN networks: a Coarse-detection Net, an Optimized-detection Net, and a Fine-detection and Pose-estimation Net. The pose estimation vector is a three-dimensional vector that predicts the yaw, pitch, and roll of detected faces. The CNN feature fusion strategy uses information distributed across CNN layers by extracting and concatenating feature vectors from low-, mid-, and high-level layers with different feature complexities. Multi-task detection can also be done using public libraries: [34] proposed a framework which performs four tasks, including face detection, bounding-box regression, pose estimation, and facial landmark localization, using two publicly available libraries.
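The difference between the Euclidean (L2) loss and the smooth L1 loss mentioned for the HEM method can be sketched as follows; the predicted and target offsets are made-up values.

# A sketch contrasting the Euclidean (L2) loss with the smooth L1 loss for
# bounding-box regression; the offsets are illustrative assumptions.
import numpy as np

def euclidean_loss(pred, target):
    d = np.asarray(pred) - np.asarray(target)
    return 0.5 * np.sum(d ** 2)

def smooth_l1_loss(pred, target):
    d = np.abs(np.asarray(pred) - np.asarray(target))
    # Quadratic for small errors, linear for large ones (less sensitive to outliers).
    return np.sum(np.where(d < 1.0, 0.5 * d ** 2, d - 0.5))

pred   = [0.20, 0.35, 0.90, 2.50]   # predicted box offsets
target = [0.25, 0.30, 1.00, 0.50]   # ground-truth offsets
print(euclidean_loss(pred, target), smooth_l1_loss(pred, target))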

3.2 Contextual Cascade Detector Contextual detectors are a class of detectors which use contextual information for detection. A lightweight DCNN-based architecture for face detection is introduced in [35]. The network consists of two CNNs: the first is trained to perform binary classification (face or non-face region), and the second predicts the presence of facial attributes such as eyes, nose, and mouth for given regions. The initial layers of the two networks are connected in parallel, which enables the network to detect facial parts and the face simultaneously. The CNNs are trained using a new technique called progressive positive and hard negative example mining, in which models are trained progressively, starting from simple examples and moving to tougher ones. The HS-cascade [4] framework utilizes head-with-shoulder information to detect faces in large-scene surveillance. The framework consists of two cascade networks for different face sizes: faces smaller than 20 × 20 are detected by the Small Size Cascade, while larger faces are detected by the Big Size Cascade using head-with-shoulder information.


To improve the detection rate, the Anchor Cascade is proposed [36], which uses an APN (Anchor Proposal Network), CPM (Context Pyramid Maxout), and a context-aware refinement network. In the APN, the last layer is a convolution layer producing a three-dimensional tensor, which is used to predict n different anchor proposals for the given input image patches. The CPM consists of four context templates used to give prediction scores for each proposed window; the window with the maximum score is used for further processing while the other proposals are discarded. ICS (Inside Cascade Structure) [37] is a two-stream contextual architecture with two extra layers, ERC (Early Rejection Classifier) and DR (Data Routing), which uses the whole body as context for face detection. The proposed architecture has three stages, called P-Net (generates face proposals from the input), R-Net-1 (roughly refines the proposed regions), and R-Net-2 (produces the final refined results). The extra ERC and DR layers are used in stages 1 and 2. The ERC is a small classifier that uses a cross-entropy loss for classifying faces; the DR layer uses the probability output from the preceding ERC and a threshold value to reject sample regions from further processing in the forward pass. If the probability from the ERC that a region contains a face is above the defined threshold, the DR accepts the region and allows it to proceed in the forward pass. LLE-CNN [38] detects faces with occlusion and consists of three modules. The first, the proposal module, combines two CNNs to extract facial regions and high-dimensional descriptors, respectively. The high-dimensional descriptors are given to an embedding module, which produces similarity-based descriptors using the LLE (Locally Linear Embedding) algorithm, and finally a verification module classifies facial regions and refines their locations using classification and regression tasks. In Table 1, all the cascade-based face detectors are summarized for comparative analysis.

4 Region-Based Face Detector 4.1 RCNN, Fast-RCNN, and Faster R-CNN Region-based detectors are considered the main paradigm for object detection. R-CNN [39] consists of a region proposal module and a detector module. The region proposal module uses category-independent region proposal techniques such as selective search. The generated regions are fed into a CNN to generate feature vectors, which are used by a classifier and a regressor for object category classification and bounding-box regression, respectively. This technique increased detection and classification accuracy significantly, but the high computational cost of classifying each proposed region separately is a major drawback of R-CNN. This problem is addressed in Fast R-CNN [40] and DDFD (Deep Dense Face Detector) [41]. The DDFD architecture consists of a CNN architecture fine-tuned on a face dataset using randomly sampled face images as positive examples.


Table 1 Comparative analysis of cascade-based face detectors

S. No. | Method | Category | Highlight
1 | A convolutional neural network cascade for face detection [26] | Cascade | Consists of six CNN architectures: three for classification and the remaining three for regression calibration
2 | FaceCraft [27] | Cascade | Proposed a training technique for training all cascade CNN architectures at once
3 | A face tracking framework based on convolutional neural networks and Kalman filter [28] | Cascade | Uses a Kalman filter with a CNN to track faces across frames
4 | Progressive calibration networks [29] | Cascade | Detects faces in images with large Rotation-In-Plane (RIP) orientations
5 | Multi-task cascaded convolutional networks (MTCNN) [31] | Multi-task cascade | The framework performs face alignment and detection simultaneously
6 | HEM method [32] | Multi-task cascade | Improves the MTCNN method by using smooth L1 loss instead of Euclidean loss
7 | HS-cascade [4] | Contextual cascade | Uses head-with-shoulder information for detecting faces in surveillance systems
8 | Inside cascade structure (ICS) [37] | Contextual cascade | Uses an ERC (Early Rejection Classifier) for rejecting regions and DR (Data Routing) for using the whole body as context to detect faces
9 | LLE-CNN [38] | Contextual cascade | Local Linear Embedding (LLE) descriptors extracted from two CNNs are used for the classification and regression tasks of face detection

The fine-tuned network can be used with region proposals fed to the detector by either a sliding-window or a region-based approach on the input images; in that paper, the sliding-window technique is used for simplicity. Fast R-CNN is combined with a cascade CNN for face detection in FR-Net [42]. The cascade CNN is used to generate region proposals, which are then fed into Fast R-CNN for detection in the last stage. The cascade CNN uses a pipeline of four CNNs, called L-Cls-Net, L-Cal-Net, H-Cls-Net, and H-Cal-Net, which are similar to the 12-net, 12-net calibration, 24-net, and 24-net calibration architectures. The RoI-pooled regions are fed into Fast R-CNN for face detection in the last stage. Fast R-CNN is also used to detect tiny faces using contextual information in the Three-Category face detector [43].


Instead of treating face detection as a binary classification problem, it uses a three-class classification problem in which the three classes are tiny faces, normal faces, and background. The proposed detector also leverages contextual information to improve the recall rate for tiny faces. Faster R-CNN [44] reduces the computational cost and detection time by replacing the proposal stage with a CNN-based RPN (Region Proposal Network). The RPN uses the kernels of convolution layers to convert the spatial information of an input image into low-dimensional feature vectors, and anchor boxes are introduced to detect objects of different scales and aspect ratios. Both the RPN and the detector share convolution layers, which reduces the computational cost. Faster R-CNN was trained to detect faces and its performance was measured against standard datasets in [45], while [46] introduced a Docker container for a pre-trained Faster R-CNN based face detector. Techniques such as feature concatenation, hard negative mining, and multiscale training with a properly calibrated pre-trained model were used to improve the efficiency of Faster R-CNN for face detection in [47, 48]. Face R-CNN [49] uses several techniques, including multi-task loss function design, OHEM (Online Hard Example Mining), and multiscale training. To detect small faces, DSFD (Different Scales Face Detector) [50] is proposed, an R-CNN based detector that uses the feature pyramid technique and two modules called OFS (Original Feature Shot) and FEM (Feature Enhance Module). To detect faces with large size variations, from 800 × 800 down to 8 × 8, simultaneously, MP-RCNN (Multi-Path RCNN) [51] is introduced. The proposed method uses a combination of CNN and Boosted Forest classifiers in two stages. The first stage is the Multi-Path Region Proposal Network (MP-RPN), which uses three parallel outputs from the CNN to generate pooled deep facial features; MP-RPN uses a sampling layer for hard example mining similar to the OHEM explained earlier. The extracted features, together with deep contextual features from the region surrounding the faces, are then fed to a Boosted Forest classifier. The Supervised Transformer Network [52], a multi-task R-CNN, addresses the problem of pose variations by using a multi-task RPN, which predicts facial regions and facial points that are then fed into an R-CNN network for verification of the proposed regions. The Face-MagNet [53] framework uses discriminative information and does not use any skip or residual connections in the network; ConvTranspose (deconvolution) layers are used in the RPN and before the RoI pooling layer to recover finer details.
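A sketch of anchor-box generation in the spirit of the RPN described above is given below; the stride, scales, and aspect ratios are arbitrary illustrative choices, not those of Faster R-CNN or any cited face detector.

# A sketch of anchor-box generation: boxes of several scales and aspect ratios
# centred at each feature-map location; all parameter values are assumptions.
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(64, 128, 256), ratios=(0.5, 1.0, 2.0)):
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride   # centre in image coords
            for s in scales:
                for r in ratios:
                    w, h = s * np.sqrt(r), s / np.sqrt(r)     # keep the area ~ s*s
                    anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(anchors)

print(generate_anchors(2, 3).shape)   # (2*3 positions * 9 anchors, 4) = (54, 4)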

4.2 R-FCN (Region-Based Fully Convolutional Networks) and Contextual RCNN R-FCN (Region-based Fully Convolutional Networks) [54] is a two-stage network that, unlike its counterpart Faster R-CNN, fully shares computation over the entire image, which makes R-FCN more accurate and faster. The architecture introduces position-sensitive score maps, which score the presence of object sub-regions at particular relative positions within the image. Using these score maps, each sub-region then votes for the presence of the object in what is called position-sensitive RoI pooling.


Since there are no trainable layers after the RoI layer, this speeds up the process. Face R-FCN [55] uses the proposed R-FCN algorithm with modifications for detecting faces. Inspired by R-FCN, [56] proposed a network that uses a face proposal network to generate RoIs, and weighted grid features are applied to these facial candidates. CMS-RCNN (Contextual Multi-Scale Region-based CNN) [57] is inspired by multi-scale CNNs and introduces a Multi-Scale Region Proposal Network (MS-RPN). It is based on the VGG-16 model, and the proposed technique uses multi-scale information for region proposals and RoI detection. The network consists of two parts: the MS-RPN, which detects small faces, and a second part for contextual reasoning, which performs inference on the proposed facial regions. The three problems of scale invariance, input image resolution, and contextual reasoning associated with small faces are addressed in [58]. For scale invariance, deep CNN features are used, while deep multi-scale descriptors called hypercolumns are used for the multi-scale representation, and large local context is used for finding tiny faces. The proposed network uses scaled images from an image pyramid as input to the CNN to predict template responses. The templates come from the trained network at different scales of the input images and are of two types, A and B: type A templates are for faces 40–140 pixels tall and type B for faces less than 20 pixels tall. After the template response, NMS is applied to obtain the final result. A face detector that uses contextual information with a multi-task RPN is proposed in [59]. The detector uses interior and exterior facial features to improve performance: interior features are constructed by normalizing the pooled feature maps from different layers and concatenating them, while exterior features are supporting features such as hair, shoulders, torso, etc. Information from these two feature types is used for region proposals leading to face detection. In Table 2, region-based face detectors are summarized for comparative analysis.

5 Proposal Free Face Detector Proposal-free networks, also known as single-shot detectors or unified networks, use a single CNN architecture for detection and do not require any preprocessing stage. The most famous architectures for these detectors are YOLO [60–62] and SSD [63], and single-stage detectors are inspired by these two types. The YOLO architecture [60] uses global image features for detection: YOLO divides the image into a grid and predicts class probabilities for each grid cell, and since no region proposals are required for making predictions it is fast. However, it suffered from localization errors, which were improved in later versions [61, 62]. YOLO was used for real-time face detection with a feature pyramid for detecting small faces in [64]. A face detector for embedded systems based on YOLO, called LCDet, is proposed in [65]; it uses the ReLU activation function instead of leaky ReLU, makes alterations to the last layers of YOLO, and uses 8-bit quantization for the weights. SSD (Single Shot Detector) [63] uses multiple convolutional feature maps and uses VGGNet as its base network.


Table 2 Comparative analysis of region-based face detectors

S. No. | Method | Category | Highlight
1 | Deep Dense Face Detector (DDFD) [41] | Region-based detector | Consists of a CNN architecture fine-tuned with randomly sampled positive face examples
2 | FR-Net [42] | Region-based detector | Uses a cascade structure to generate region proposals for the Fast R-CNN detector
3 | Faster R-CNN for face detection [45–49] | Region-based detector | All these techniques use Faster R-CNN for face detection: [45] trains Faster R-CNN on a standard dataset, [46] provides an easy-to-install Docker container for a Faster R-CNN based face detector, and [47–49] use several training techniques to improve the performance of the Faster R-CNN face detector
4 | Different Scale Face Detector (DSFD) [50] | Region-based detector | Uses feature pyramid techniques, OFS (Original Feature Set) and FEM (Feature Enhanced Module) for detecting small faces
5 | Multi-Path (MP) RCNN [51] | Region-based detector | Uses both CNN and boosted tree classifiers for detecting faces with large scale variation
6 | Supervised Transformer Network [52] | Region-based detector | Uses a multi-task RPN to handle the pose variation problem in RCNN
7 | Face-MagNet [53] | Region-based detector | The network does not have any skip or residual connections and uses discriminative information for face detection
8 | Face R-FCN [55] | Region-based detector | Inspired by R-FCN; uses a modified R-FCN framework for face detection
9 | CMS-RCNN [57] | Region-based detector | A contextual RCNN framework which uses multiscale information for region proposals
10 | Enhancing Interior and Exterior Deep Facial Features for Face Detection [59] | Region-based detector | Uses interior features (normalized feature maps of different layers) and exterior features (contextual features such as hair, shoulders, etc.) for generating region proposals

Convolutional layers of progressively reducing size are added at the end of the base network, which allows SSD to detect objects of different sizes. The SSD network is also used for multi-task detection, such as facial landmark detection, in [66, 67]. The SSH (Single-Stage Headless) face detector [68] is a unified face detector, i.e., it detects faces in a single pass. It is called a headless detector because it removes the later layers of the classification network. It uses a VGG-16 base network, and the SSH architecture consists of classifier and detection modules added to it; like the SSD algorithm, it also uses feature maps from different convolution layers. A real-time feature-sharing algorithm [69], similar to SSD, is used for face detection. S3FD (Single Shot Scale-Invariant Face Detector) [70] is designed especially for tiny faces; the detector uses a scale-equitable detection framework to detect faces at different scales and a scale-compensation anchor-based matching strategy. UnitBox [71] is a single-shot detector that uses a VGG-16 base architecture and proposes a novel IoU formulation. In Table 3, proposal-free face detectors are summarized for comparative analysis.

Table 3 Comparative analysis of proposal-free face detectors

S. No. | Method | Category | Highlight
1 | YOLO based face detector [64] | Proposal-free detector | Inspired by the YOLO algorithm; uses feature pyramid techniques for detecting small faces
2 | LCDet [65] | Proposal-free detector | YOLO based detector for embedded systems with 8-bit quantized weights
3 | Single Stage Headless face detector [68] | Proposal-free detector | The network is an SSD-like architecture with the classification layers removed
4 | S3FD (Single Shot Scale-Invariant Face Detector) [70] | Proposal-free detector | Detects faces at different scales using scale compensation and an anchor matching strategy
5 | UnitBox [71] | Proposal-free detector | Uses an SSD-like network with a novel IoU formulation


6 Conclusion In this review paper, various CNN-based face detector techniques were categorized into three main categories according to the method used to input images into the detector. The cascade-based and region-based categories both use region information, but cascade-based detectors are faster than region-based detectors, while region-based detectors have higher precision than cascade-based detectors. Proposal-free detectors are more recent face detection techniques and have spurred many real-world applications and products. Face detection still shows a gap between state-of-the-art face detectors and human-level performance for occluded faces and faces with makeup or in-plane rotation. Training strategies like the precise box [72] could be used to extract more features and improve performance. DCNN face detectors face another challenge due to hardware limitations: the limited computational power of mobile and embedded systems raises issues of space and time complexity, although some attempts have been made in this direction, such as [65, 73]. Apart from traditional CNN architectures, other networks like GANs [74, 75], coupled encoder-decoder networks [76–78], and recurrent neural networks [79] have been used for face detection. Many interesting applications have been developed on top of DCNN-based face detectors, but facial behavior analysis remains not fully studied because of the lack of data, which is computationally expensive to obtain. For future work, it remains an open question how a face detector can detect faces in arbitrary conditions, e.g., in out-of-focus or low-quality images. Online learning techniques should be developed to handle changes in the system, and deep learning techniques like [80, 81] and networks like [82] could be used to overcome data deficiency. With the development of new techniques, new interesting areas are emerging, such as human interaction in augmented reality and smart cameras that infer a person's mood from facial attributes or distinguish between faces in photos. Acknowledgements We wish to acknowledge the National Project Implementation Unit (NPIU), a unit of the Ministry of Human Resource Development, Government of India, for financial assistance through the TEQIP-III Project at Deenbandhu Chhotu Ram University of Science and Technology, Murthal, Haryana.

References 1. Hjelmås E, Low BK (2001) Face detection: a survey. Comput Vis Image Underst 83(3):236–274 2. Günther M, Hu P, Herrmann C, Chan CH, Jiang M, Yang S, Dhamija AR, Ramanan D, Beyerer J, Kittler J, Jazaery MA, Nouyed MI, Guo G, Stankiewicz C, Boult TE (2017) Unconstrained face detection and open-set face recognition challenge. In: International joint conference on biometrics (IJCB), Denver, 1–4 Oct 2017. IEEE 3. Nguyen-Meidine LT, Granger E, Kiran M, Blais-Morin L-A (2017) A comparison of CNNbased face and head detectors for real-time video surveillance applications. In: Seventh international conference on image processing theory, tools and applications (IPTA). IEEE


4. Peng C, Bu W, Xiao J, Wong K-C, Yang M (2018) An improved neural network cascade for face detection in large scene surveillance. Appl Sci 8(11):2222 5. Ito K, Aoki T (2018) [Invited paper] recent advances in biometric recognition. ITE Trans Media Technol Appl 6(1):64–80 6. Kowalski M, Nasarzewski Z, Galinski G, Garbat P (2018) HoloFace: augmenting human-tohuman interactions on HoloLens. In: 2018 IEEE winter conference on applications of computer vision (WACV), Mar 2018. IEEE 7. Chong E, Chanda K, Ye Z, Southerland A, Ruiz N, Jones RM, Rozga A, Rehg JM (2017) Detecting gaze towards eyes in natural social interactions and its use in child assessment. Proc ACM Interact Mob Wearable Ubiquitous Technol 1(3):1–20 8. Ahmedt-Aristizabal D, Fookes C, Nguyen K, Denman S, Sridharan S, Dionisio S (2018) Deep facial analysis: a new phase I epilepsy evaluation using computer vision. Epilepsy Behav 82:17–24 9. Hsu G-SJ, Huang W-F, Kang J-H (2018) Hierarchical network for facial palsy detection. In: 2018 IEEE, CVF conference on computer vision and pattern recognition workshops (CVPRW), June 2018. IEEE 10. Hsu G-S, Ambikapathi A, Chen M-S (2017) Deep learning with time-frequency representation for pulse estimation from facial videos. In: 2017 IEEE international joint conference on biometrics (IJCB), Oct 2017 11. Kuchi P, Gabbur P, Bhat PS, David SS (2002) Human face detection and tracking using skin color modeling and connected component operators. IETE J Res 48(3–4):289–293 12. Chellappa R, Wilson CL, Sirohey S (1995) Human and machine recognition of faces: a survey. Proc IEEE 83(5):705–741 13. Zafeiriou S, Zhang C, Zhang Z (2015) A survey on face detection in the wild: past, present and future. Comput Vis Image Underst 138:1–24 14. Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition. CVPR 2001, vol 1, Dec 2001 15. Mutneja V, Singh DS (2017) Modified Viola–Jones algorithm with GPU accelerated training and parallelized skin color filtering-based face detection. J Real-Time Image Proc 1–21 16. LeCun Y, Bengio Y, Hinton GE (2015) Deep learning. Nature 521:436–444 17. Bansal A, Nanduri A, Castillo CD, Ranjan R, Chellappa R (2017) UMDFaces: an annotated face dataset for training deep networks. In: 2017 IEEE international joint conference on biometrics (IJCB), pp 464–473 18. Maze B, Adams JC, Duncan JA, Kalka ND, Miller T, Otto C, Jain AK, Niggel WT, Anderson J, Cheney J, Grother P (2018) IARPA Janus Benchmark-C: face dataset and protocol. In: 2018 international conference on biometrics (ICB), pp 158–165 19. Yang S, Luo P, Loy CC, Tang X (2015) WIDER FACE: a face detection benchmark. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR) 20. Yang M-H, Kriegman DJ, Ahuja N (2002) Detecting faces in images: a survey. IEEE Trans Pattern Anal Mach Intell 24:34–58 21. Zhang C, Zhang Z (2010) A survey of recent advances in face detection, June 2010. https://www. microsoft.com/en-us/research/publication/a-survey-of-recent-advances-in-face-detection/ 22. Kakumanu PK, Makrogiannis S, Bourbakis NG (2007) A survey of skin-color modeling and detection methods. Pattern Recogn 40:1106–1122 23. Ranjan R, Sankaranarayanan S, Bansal A, Bodla N, Chen J-C, Patel VM, Castillo CD, Chellappa R (2018) Deep learning for understanding faces: machines may be just as good, or better, than humans. IEEE Signal Process Mag 35:66–83 24. 
Zhou Y, Liu D, Huang TS (2018) Survey of face detection on low-quality images. In: 2018 13th IEEE international conference on automatic face & gesture recognition (FG 2018), pp 769–773 25. Escrivá DM, Mendonça VG, Joshi P (2018) Learn OpenCV 4 by building projects. Packt Publishing Ltd, Birmingham, pp 152–153


26. Li H, Lin ZL, Shen X, Brandt J, Hua G et al (2015) A convolutional neural network cascade for face detection. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), Boston, 7–12 June 2015. IEEE, pp 5325–5334. https://doi.org/10.1109/CVPR.2015.7299170 27. Qin H, Yan J, Li X, Hu X (2016) Joint training of cascaded CNN for face detection. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 3456–3465 28. Ren Z, Yang S, Zou F, Yang F, Luan C, Li K (2017) A face tracking framework based on convolutional neural networks and Kalman filter. In: 2017 8th IEEE international conference on software engineering and service science (ICSESS), pp 410–413 29. Shi X, Shan S, Kan M, Wu S, Chen X (2018) Real-time rotation-invariant face detection with progressive calibration networks. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 2295–2303 30. Zeng D, Zhao F, Ge S, Shen W (2019) Fast cascade face detection with pyramid network. Pattern Recogn Lett 119:180–186 31. Zhang K, Zhang Z, Li Z, Qiao Y (2016) Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process Lett 23:1499–1503 32. Cong W, Zhao S, Tian H, Shen J (2017) Improved face detection and alignment using cascade deep convolutional network. CoRR, vol. abs/1707.09364 33. Wu H, Zhang K, Tian G (2018) Simultaneous face detection and pose estimation using convolutional neural network cascade. IEEE Access 6:49563–49575 34. Feng Z-H, Kittler J, Awais M, Huber P, Wu X (2017) Face detection, bounding box aggregation and pose estimation for robust facial landmark localisation in the wild. In: 2017 IEEE conference on computer vision and pattern recognition workshops (CVPRW), pp 2106–2115 35. Triantafyllidou D, Nousi P, Tefas A (2017) Fast deep convolutional face detection in the wild exploiting hard sample mining. Big Data Res 11:65–76 36. Yu B, Tao D (2018) Anchor cascade for efficient face detection. IEEE Trans Image Process 28:2490–2501 37. Zhang K, Zhang Z, Wang H, Li Z, Qiao Y, Liu W (2017) Detecting faces using inside cascaded contextual CNN. In: 2017 IEEE international conference on computer vision (ICCV), pp 3190– 3198 38. Ge S, Li J, Ye Q, Luo Z (2017) Detecting masked faces in the wild with LLE-CNNs. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 426–434 39. Girshick RB, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: 2014 IEEE conference on computer vision and pattern recognition, pp 580–587 40. Girshick RB (2015) Fast R-CNN. In: 2015 IEEE international conference on computer vision (ICCV), pp 1440–1448 41. Farfade SS, Saberian MJ, Li L-J (2015) Multi-view face detection using deep convolutional neural networks. In: ICMR 42. Wang K, Dong Y, Bai H, Zhao Y, Hu K (2016) Use fast R-CNN and cascade structure for face detection. In: 2016 visual communications and image processing (VCIP), pp 1–4 43. Jiang F, Zhang J, Yan L, Xia Y, Shan S (2018) A three-category face detector with contextual information on finding tiny faces. In: 2018 25th IEEE international conference on image processing (ICIP), pp 2680–2684 44. Ren S, He K, Girshick RB, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39:1137–1149 45. Jiang H, Learned-Miller EG (2017) Face detection with the faster R-CNN. 
In: 2017 12th IEEE international conference on automatic face & gesture recognition (FG 2017), pp 650–657 46. Ruiz N, Rehg JM (2017) Dockerface: an easy to install and use faster R-CNN face detector in a Docker container. CoRR, vol. abs/1708.04370 47. Sun X, Wu P, Hoi SCH (2018) Face detection using deep learning: an improved faster RCNN approach. Neurocomputing 299:42–50 48. Zhang C, Xu X, Tu D (2018) Face detection using improved faster RCNN. CoRR, vol. abs/1802.02142 49. Wang H et al (2017) Face R-CNN. arXiv preprint arXiv:1706.01061

An Overview of Recent Developments in Convolutional Neural …

257

50. Wu W, Yin Y, Wang X, Xu D (2018) Face detection with different scales based on faster R-CNN. IEEE Trans Cybern 51. Liu Y, Levine MD (2017) Multi-path region-based convolutional neural network for accurate detection of unconstrained “hard faces”. In: 2017 14th conference on computer and robot vision (CRV), pp 183–190 52. Chen D, Hua G, Wen F, Sun J (2016) Supervised transformer network for efficient face detection. In: ECCV 53. Samangouei P, Najibi M, Davis LS, Chellappa R (2018) Face-MagNet: magnifying feature maps to detect small faces. In: 2018 IEEE winter conference on applications of computer vision (WACV), pp 122–130 54. Dai J, Li Y, He K, Sun J (2016) R-FCN: object detection via region-based fully convolutional networks. In: NIPS 55. Wang Y, Ji X, Zhou Z, Wang H, Li Z (2017) Detecting faces using region-based fully convolutional networks. CoRR, vol. abs/1709.05256 56. Shu H, Chen D, Li Y, Wang S (2017) A highly accurate facial region network for unconstrained face detection. In: 2017 IEEE international conference on image processing (ICIP), pp 665–669 57. Zhu C, Zheng Y, Luu K, Savvides M (2016) CMS-RCNN: contextual multi-scale region-based CNN for unconstrained face detection. CoRR, vol. abs/1606.05413 58. Hu P, Ramanan D (2017) Finding tiny faces. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 1522–1530 59. Zhu C, Zheng Y, Luu K, Savvides M (2018) Enhancing interior and exterior deep facial features for face detection in the wild. In: 2018 13th IEEE international conference on automatic face & gesture recognition (FG 2018), pp 226–233 60. Redmon J, Divvala SK, Girshick RB, Farhadi A (2016) You only look once: unified, realtime object detection. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 779–788 61. Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 6517–6525 62. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement. CoRR, vol. abs/1804.02767 63. Liu W, Anguelov D, Erhan D, Szegedy C, Reed SE, Fu C-Y, Berg AC (2016) SSD: single shot multibox detector. In: ECCV 64. Yang WJ, Jiachun Z (2018) Real-time face detection based on YOLO. In: 2018 1st IEEE international conference on knowledge innovation and invention (ICKII), pp 221–224 65. Tripathi S, Dane G, Kang B, Bhaskaran V, Nguyen T (2017) LCDet: low-complexity fullyconvolutional neural networks for object detection in embedded systems. In: 2017 IEEE conference on computer vision and pattern recognition workshops (CVPRW), pp 411–420 66. Hsu G-S, Hsieh C-H (2017) Multi-dropout regression for wide-angle landmark localization. In: 2017 IEEE international conference on image processing (ICIP), pp 3830–3834 67. Chen J-C, Lin W-A, Zheng J, Chellappa R (2018) A real-time multi-task single shot face detector. In: 2018 25th IEEE international conference on image processing (ICIP), pp 176–180 68. Najibi M, Samangouei P, Chellappa R, Davis LS (2017) SSH: single stage headless face detector. In: 2017 IEEE international conference on computer vision (ICCV), pp 4885–4894 69. Zheng C, Yang M, Wang C (2017) A real-time face detector based on an end-to-end CNN. In: 2017 10th international symposium on computational intelligence and design (ISCID), vol 1, pp 393–397 70. Zhang S, Zhu X, Lei Z, Shi H, Wang X, Li SZ (2017) S3 FD: single shot scale-invariant face detector. In: 2017 IEEE international conference on computer vision (ICCV), pp 192–201 71. 
Yu J, Jiang Y, Wang Z, Cao Z, Huang TS (2016) UnitBox: an advanced object detection network. In: ACM multimedia 72. Qi C, Chen X, Wang P, Su F (2018) Precise box score: extract more information from datasets to improve the performance of face detection. CoRR, vol. abs/1804.10743 73. Gao H, Tao W, Wen D (2018) IFQ-Net: integrated fixed-point quantization networks for embedded vision. In: 2018 IEEE/CVF conference on computer vision and pattern recognition workshops (CVPRW), pp 720–7208

258

R. Yadav and Priyanka

74. Chen Y, Song L, He R (2017) Masquer hunter: adversarial occlusion-aware face detection. CoRR, vol. abs/1709.05188 75. Bai Y, Zhang Y, Ding M, Ghanem B (2018) Finding tiny faces in the wild with generative adversarial network. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 21–30 76. Wang L, Yu X, Metaxas DN (2017) A coupled encoder-decoder network for joint face detection and landmark localization. In: 2017 12th IEEE international conference on automatic face & gesture recognition (FG 2017), pp 251–257 77. Zhou Z, He Z, Chen Z, Jia Y, Wang H, Du J, Chen D, Wang L, Chen J (2018) FHEDN: a context modeling feature hierarchy encoder-decoder network for face detection. In: 2018 international joint conference on neural networks (IJCNN), pp 1–8 78. Shaban M, Mahmood A, Al-Máadeed S, Rajpoot NM (2017) Multi-person head segmentation in low resolution crowd scenes using convolutional encoder-decoder framework 79. Liu Y, Li H, Yan J, Wei F, Wang X, Tang X (2017) Recurrent scale approximation for object detection in CNN. In: 2017 IEEE international conference on computer vision (ICCV), pp 571–579 80. Vinyals O, Blundell C, Lillicrap TP, Kavukcuoglu K, Wierstra D (2016) Matching networks for one shot learning. In: NIPS 81. Niu L, Cai J, Veeraraghavan A, Zhang L (2017) Zero-shot learning via category-specific visualsemantic mapping and label refinement. IEEE Trans Image Process 28:965–979 82. Sabour S, Frosst N, Hinton GE (2017) Dynamic routing between capsules. In: NIPS

A Review of Artificial Intelligence Techniques for Requirement Engineering Kamaljit Kaur, Prabhsimran Singh, and Parminder Kaur

Abstract Requirement Engineering is considered the foremost part of the software development lifecycle. Hand-written requirements usually suffer from redundancy and inconsistency, which negatively impact the success of the final software product. Artificial Intelligence techniques are used to avoid erroneous requirements and reduce human intervention, and they also help in analyzing, classifying, and prioritizing requirements. This paper presents the state of the art of Artificial Intelligence techniques used in various Requirement Engineering approaches. The survey finds that existing techniques such as Genetic Algorithms, Artificial Neural Networks, and K-Nearest Neighbor show positive results in requirement prioritization. Many other techniques, such as Convolutional Neural Networks and case-based reasoning, are used for requirement classification, requirement traceability, and requirement analysis. The paper also outlines future research directions that are important and interesting but need further investigation in current research and practice. Keywords Requirement engineering · Requirement engineering activities · Artificial intelligence techniques · Machine learning techniques

K. Kaur (B) · P. Singh · P. Kaur Guru Nanak Dev University, Amritsar, India e-mail: [email protected] P. Singh e-mail: [email protected] P. Kaur e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_20


1 Introduction Requirement is, “a statement that identifies a capability, characteristic or quality factor of system in order to have value and utility to a customer or user” [1]. Requirement Engineering (RE) is an iterative process of eliciting, elaborating, structuring, specifying, analyzing, and managing the requirements that stakeholders place on a software system [2]. The complexity of software development is constantly increasing, while development time is expected to shrink. The demands on requirements become harder to meet, not only from the function and performance aspects but also with respect to usability, security, environmental issues, and so on. RE is a time-consuming and error-prone process, and fixing requirement errors at the design and implementation phases is very expensive. According to Boehm [3], “correcting requirements errors late can cost up to 200 times as much as correcting the errors during requirement phase”. The cost associated with an error depends on the gap between error introduction and error discovery. The ASSERT tool introduced in [4] saves time and cost by identifying errors early in the development process and automating requirement-based test generation. The independence between the model and the system can be maximized by creating test cases based on requirements [5]. Survey studies [6] show that many projects failed due to bad or incomplete requirements. In order to alleviate the problems associated with RE, researchers have explored Artificial Intelligence (AI) techniques to increase the success rate of software projects. Praehofer and Kerschbaummayr [7] developed a case-based reasoning technique to search a database for design artifacts that inexactly fulfil a set of requirements. Research on requirement engineering has noticeably advanced in past years, and numerous studies have been conducted on developing tools and techniques to automate the requirement engineering process; a comparative survey of these studies can be found in [8]. Fewer efforts have been made, and only limited comprehensive literature [9] has addressed the contribution of AI techniques to RE. The growing body of work on AI in RE needs to be synthesized to identify trends and future directions. The objective of this work is to illustrate how AI techniques support RE and help predict the performance, quality, cost, and effort of software. In this paper, selected research papers related to the role of AI techniques in enhancing the performance of the RE process are reviewed. Specifically, this paper reports on a review of current AI techniques. The review intends to find out which AI techniques and algorithms have been used to automate RE activities such as requirement classification, prioritization, analysis, and tracing. These findings help identify the challenges that need to be addressed in order to improve the state of RE processes. The remainder of this paper is organized as follows: Sect. 2 presents the review of various AI techniques. Section 3 highlights current research in RE. Section 4 shows a research map for Artificial Intelligence in Requirement Engineering with advantages and limitations. Section 5 concludes the review with some future research directions in this field.


2 Artificial Intelligence Techniques This section presents a summary of published research on AI techniques that are used in various Requirement Engineering activities such as requirement elicitation, requirement prioritization, requirement classification, requirement quality, and requirement analysis.

2.1 Non-functional Classifier Automatic requirements classification was first discussed by Cleland-Huang et al. [10] in 2007. They applied an information retrieval method to identify and classify non-functional requirements (NFRs). The frequency of indicator terms for NFRs is computed using TF-IDF with additional parameters. Cleland-Huang et al. [10] validated their approach by performing experiments on a dataset of 15 sample requirements specifications. The proposed approach performed very well with a recall of 0.8129, indicating that the classifier successfully found about 81% of the possible NFRs in the dataset.
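
To make the indicator-term idea concrete, the snippet below is a minimal sketch (not the classifier of [10]): it sums the TF-IDF weights of a hand-picked set of NFR indicator terms for each requirement and flags requirements whose score exceeds a threshold. The example sentences, indicator list, and threshold are all assumptions made for illustration.

```python
# Minimal sketch: TF-IDF-weighted indicator terms for flagging candidate NFRs.
# The requirement texts, indicator terms, and threshold are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer

requirements = [
    "The system shall respond to any query within 2 seconds.",     # performance (NFR)
    "The system shall allow the administrator to add new users.",  # functional
    "All stored passwords shall be encrypted.",                     # security (NFR)
]
indicator_terms = {"respond", "seconds", "encrypted", "secure", "available"}

vectorizer = TfidfVectorizer(lowercase=True)
tfidf = vectorizer.fit_transform(requirements)
vocab = vectorizer.vocabulary_

for i, text in enumerate(requirements):
    # Sum the TF-IDF weights of indicator terms occurring in this requirement.
    score = sum(tfidf[i, vocab[t]] for t in indicator_terms if t in vocab)
    label = "candidate NFR" if score > 0.3 else "likely functional"
    print(f"{label:>16}  score={score:.2f}  {text}")
```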

2.2 Internet-Based Application Method Text categorization is considered an important step in requirement analysis. The method proposed in [11] applies text classification to reduce the number of specification documents an analyst must read, for a better understanding of the impact of requirements. Ko et al. [11] presented a classification method to be applied in web-based analysis-supporting systems. The approach automatically classifies informal information into several views or topics such as cost, priority, development time, quality attribute, and so on. The system requires these views or topics as a set of words representing the viewpoint of each analyst. The authors evaluated the effectiveness of the proposed technique by performing experiments on two real datasets in different languages, i.e., English and Korean. The major shortcoming of this method is that analysts are needed to extract the topic words for the different views of the software to be developed.

2.3 Decision Tree Based Text Classifier Hussain et al. [12] developed a system based on decision-tree text classification to detect ambiguity in the requirement specification document, resulting in better quality requirements. The authors showed how ambiguity in an SRS document can be detected through a decision-tree-based text classifier. The text classifier tool is applied to the SRS document, where paragraphs and sentences are classified.


The C4.5 decision tree algorithm has been used for classifying the quality of requirements as ambiguous or unambiguous. The objective of their work is to identify textual ambiguity in the requirement elicitation phase, before the conceptual modeling of requirements begins.
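
A minimal sketch of the same idea is shown below; since scikit-learn does not provide C4.5, its CART-based DecisionTreeClassifier is used as a stand-in, trained on TF-IDF features to label sentences as ambiguous or unambiguous. The tiny training set and its labels are invented for illustration and are not the dataset used in [12].

```python
# Sketch of a decision-tree text classifier for ambiguity detection.
# scikit-learn's DecisionTreeClassifier (CART) stands in for C4.5 here;
# the sentences and labels are illustrative, not the authors' dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

sentences = [
    "The system should usually respond quickly if possible.",
    "The interface shall be user friendly and flexible as appropriate.",
    "The system shall log every failed login attempt.",
    "The report shall be generated in PDF format.",
]
labels = ["ambiguous", "ambiguous", "unambiguous", "unambiguous"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    DecisionTreeClassifier(max_depth=5, random_state=0))
clf.fit(sentences, labels)

print(clf.predict(["The system should be reasonably fast."]))
```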

2.4 Semi-Supervised Approach Casamayor et al. [13] employed a semi-automated text categorization approach to detect and classify NFRs. Initially, the classifier is trained with a small set of manually labeled and classified requirements. A Naïve Bayes classifier coupled with Expectation Maximization (EM) then improves performance by labeling the unlabeled data. The accuracy of the classification model is iteratively enhanced on the basis of user feedback. The model was trained with 625 instances in the dataset. The results of the empirical evaluation show that the semi-supervised approach achieves higher classification accuracy than a fully supervised approach.
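
The sketch below illustrates the spirit of this approach with a simplified self-training (hard-EM) loop around a Naïve Bayes text classifier; it is not the authors' implementation, and the labelled and unlabelled documents are placeholders.

```python
# Simplified self-training (hard-EM) loop around a Naive Bayes text classifier,
# in the spirit of the semi-supervised approach described above.
# The labelled/unlabelled documents are placeholders.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

labelled_docs = ["encrypt all traffic", "respond within two seconds",
                 "add a new customer record", "print the monthly invoice"]
labelled_y    = ["NFR", "NFR", "FR", "FR"]
unlabelled_docs = ["all data shall be stored securely",
                   "the clerk can edit an existing order"]

vec = CountVectorizer()
X_lab = vec.fit_transform(labelled_docs)
X_unl = vec.transform(unlabelled_docs)

nb = MultinomialNB()
nb.fit(X_lab, labelled_y)
for _ in range(5):
    pseudo = nb.predict(X_unl)                       # E-step: label the unlabelled pool
    X_all = np.vstack([X_lab.toarray(), X_unl.toarray()])
    y_all = list(labelled_y) + list(pseudo)
    nb.fit(X_all, y_all)                             # M-step: refit on all documents
print(nb.predict(X_unl))
```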

2.5 Case-Based Reasoning (CBR) and Artificial Neural Network (ANN) The purpose of a software requirement specification document is to define the properties that a system must meet to be accepted; the main focus here is on the quality of the SRS document. Jani and Mostafa [14] proposed an online quality analysis system using Case-Based Reasoning that ensures requirements quality in terms of completeness, consistency, and correctness. Case-Based Reasoning is an efficient AI technique used to evaluate requirement quality by referring to previously stored software requirement quality analysis cases (past experience), without reconstructing the solution from scratch. Jani and Islam [15] later proposed another approach that amalgamates Case-Based Reasoning and Artificial Neural Networks to improve the quality of the SRS document. The stored cases include the quality analyses performed on earlier software requirement specifications. When a new problem occurs, the neural network retrieves similar cases from the case base repository and compares quality, thus providing a more accurate estimation of quality. If the new case has been experienced before, that case can be used directly from the CBR repository to expedite the quality process. This work is only a proposal, as no real demonstration of the method is presented. Case-based reasoning can also serve as an efficient approach for handling requirement change management. Naz et al. [16] proposed a model that integrates Case-Based Reasoning with requirement change management in order to improve the quality of requirements. A new requirement is treated as a new case, and the proposed model checks its traceability and impact against previously stored cases in terms of cost and benefit, and also resolves conflicts that arise when requirements change.


Poonia et al. [17] explored the combination of misuse case oriented quality framework metrics with machine learning techniques to identify, in the requirement engineering phase, vulnerabilities that could otherwise surface in the implementation phase of software development.

2.6 K-Nearest Neighbor Requirement elicitation is considered one of the important phases of requirement engineering. Several researchers agree that about 40% of defects in software projects are due to incorrectly recorded requirements. Mulla and Girase [18] presented an approach that uses social media and collaborative filtering for the elicitation of requirements. For achieving scalability, these techniques can be supported with data mining and machine learning. The authors proposed a predictive model using K-Nearest Neighbor for identifying like-minded users with similar rating histories.
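
A minimal sketch of this idea is given below: like-minded stakeholders are found with k-nearest neighbours over a requirement-rating matrix, and their ratings are averaged to suggest priorities for unrated items. The rating matrix and neighbourhood size are assumptions for illustration, not the authors' data or model.

```python
# Sketch: finding "like-minded" stakeholders from a requirement-rating matrix
# with k-nearest neighbours (cosine distance). The rating matrix is invented.
import numpy as np
from sklearn.neighbors import NearestNeighbors

# rows = stakeholders, columns = candidate requirements, values = ratings 0-5
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
])

knn = NearestNeighbors(metric="cosine")
knn.fit(ratings)
dist, idx = knn.kneighbors(ratings[0:1], n_neighbors=3)
neighbours = idx[0][1:]            # drop the query stakeholder itself
print("nearest stakeholders:", neighbours, "distances:", dist[0][1:].round(2))

# Simple recommendation: average the neighbours' ratings for items that
# stakeholder 0 has not rated yet.
neighbour_mean = ratings[neighbours].mean(axis=0)
unrated = ratings[0] == 0
print("suggested priorities for unrated items:", neighbour_mean[unrated])
```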

2.7 Case Base Ranking with Machine Learning Approach Perini et al. [19] presented the paper “A Machine Learning Approach to Software Requirements Prioritization”, published in IEEE Transactions on Software Engineering in April 2013. The authors proposed a stable and efficient approach that decides software requirement priorities using machine learning. The paper describes the Case Base Rank method, which provides a trade-off between elicitation effort and ranking accuracy by reducing the number of requirement comparisons needed. Once preference information has been gathered from stakeholders, machine learning is used to limit the elicitation effort required to reach a ranking of a given quality. The authors also define a technique that automatically updates the ranks whenever a requirement is added to or removed from the list. Singh and Sharma [20] similarly amalgamated machine learning with Gradient Descent Ranking for requirement prioritization. Their technique also combines project stakeholders' preferences with functional and non-functional requirements, but it handles only a small number of requirements.
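
As a generic illustration of learning a prioritization from pairwise stakeholder preferences (a stand-in for the case-based ranking idea, not the published algorithm), the sketch below fits a logistic regression on feature differences of preferred/non-preferred pairs and ranks requirements by the learned score. The features and preference pairs are invented.

```python
# Generic sketch: learn a requirement ranking from pairwise preferences.
# Features and preference pairs are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# each requirement described by (business value, implementation cost, risk)
features = np.array([[8, 3, 2], [5, 5, 4], [9, 8, 6], [3, 2, 1]], dtype=float)
# (a, b) means a stakeholder prefers requirement a over requirement b
prefs = [(0, 1), (0, 3), (2, 1), (3, 1)]

X = np.array([features[a] - features[b] for a, b in prefs] +
             [features[b] - features[a] for a, b in prefs])
y = np.array([1] * len(prefs) + [0] * len(prefs))

model = LogisticRegression().fit(X, y)
scores = features @ model.coef_.ravel()      # score each requirement
print("ranking (best first):", np.argsort(-scores))
```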

2.8 Machine Learning Techniques (Naïve Bayes, KNN, SVM) Dargan et al. [21] discussed how changes in poor requirements impact the overall cost and performance of a system. They used earned value management techniques in order to reduce the negative impact on the overall performance of the system.


The authors also extracted quality metrics from textual requirements using the RQA (Requirement Quality Analyzer) tool. These quality metrics, together with a statistical model, are used to predict system operational performance. The paper shows that requirement quality is indeed a predictive factor for end-system operational performance, and its goal is to demonstrate a statistically significant relationship between the quality of requirements and system performance.

2.9 Convolutional Neural Network (CNN) Navarro-Almanza et al. [22] applied deep learning (CNN) together with natural language processing and information retrieval techniques for the automatic classification of functional requirements and 12 different categories of non-functional requirements. The presented technique converts the text data into a low-dimensional vector space, using Word2Vec to vectorize the textual requirements. They evaluated the performance of the model on the PROMISE corpus and achieved average precision, recall, and f-score values of 0.80, 0.785, and 0.77, respectively. From the performed experiments, it is concluded that a CNN gives efficient results for software requirement classification. A requirement specification document defines the properties and behavior of a system. Such documents also contain auxiliary information such as clarifications, chapters, overviews, and references to other documents; this content is not relevant to the system engineers who implement the system. For better differentiation, each content element is labeled as “requirement” or “additional information”, a task that is time-consuming and error-prone when done manually. Winkler and Vogelsang [23] therefore applied a CNN for the automatic classification of the content elements of natural language requirement specifications. The Word2Vec method is used to convert natural language into vectors, and to train the model a set of content elements was extracted from 89 industrial requirement specifications. Baker et al. [24] used CNN and ANN models for the classification of NFRs into maintainability, security, operability, performance, and usability. The accuracy of the models was evaluated on the PROMISE dataset containing 1165 requirements. They also calculated precision, recall, and f-score for each model and concluded that a CNN can classify NFRs efficiently.
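
The sketch below shows the general shape of such a 1-D convolutional text classifier in Keras. A trainable embedding layer is used in place of pre-trained Word2Vec vectors, and the vocabulary size, sequence length, and number of classes are assumptions; it is not the exact architecture used in [22–24].

```python
# Minimal sketch of a 1-D CNN text classifier for requirement sentences.
# A trainable Embedding layer stands in for pre-trained Word2Vec vectors;
# vocabulary size, sequence length, and class count are assumptions.
from tensorflow.keras import layers, models

VOCAB_SIZE, MAX_LEN, NUM_CLASSES = 5000, 40, 13   # e.g. FR + 12 NFR categories

model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 100, input_length=MAX_LEN),
    layers.Conv1D(filters=128, kernel_size=5, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dropout(0.5),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(X_train, y_train, ...) would then be called on integer-encoded,
# padded requirement sentences.
```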

2.10 Ontology with K-Nearest Neighbor (KNN) Approach Wang [25] proposed an approach that uses K-Nearest Neighbor to automatically analyze a Software Requirement Specification. The approach extracts semantic information with the help of machine learning techniques and an ontology. Semantic frames are designed for the verbs identified in an SRS document from the e-commerce domain. Based on these semantic frames, sentences from the SRS were chosen and labeled manually. Frequent verbs and nouns in the sentences are used as training data for the machine learning algorithm. External ontology knowledge is used to reduce the effect of data sparsity and obtain reliable results.


The proposed architecture follows five steps: semantic frame development, corpus building, feature extraction, learning the SRL labeler, and automatically annotating the semantic roles of new sentences. Experimental results showed that this approach gives promising results in the automatic analysis of requirements.

2.11 Supervised Machine Learning Approach Kurtanović and Maalej [26] applied the Support Vector Machine (SVM) algorithm to classify functional and non-functional requirements and the sub-categories of non-functional requirements (NFRs). Both binary and multi-class classifiers were introduced: the binary classifier distinguishes functional from non-functional requirements, and the multi-class classifier identifies NFR sub-categories such as security, operational, performance, and usability. The study used a dataset from the open-source PROMISE repository that consists of 625 labeled requirements. They obtained precision and recall of up to ~92% in classifying functional and non-functional requirements, and achieved the highest precision and recall for the security class, with ~92% and ~90%, respectively.
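
A minimal sketch of the binary FR/NFR step is shown below, using TF-IDF features and a linear SVM from scikit-learn; the four example sentences are placeholders for the PROMISE data, and the pipeline details are assumptions rather than the authors' exact configuration.

```python
# Sketch of the binary FR/NFR step with TF-IDF features and a linear SVM.
# The four example sentences replace the PROMISE data used in the study.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = [
    "The system shall let the user export reports to CSV.",
    "The operator shall be able to cancel a running job.",
    "The application shall be available 99.9% of the time.",
    "All personal data shall be encrypted at rest.",
]
labels = ["F", "F", "NFR", "NFR"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),
                    LinearSVC())
clf.fit(texts, labels)
print(clf.predict(["User passwords shall never be stored in plain text."]))
# A second, multi-class LinearSVC trained only on the NFR subset could then
# assign sub-categories such as security, performance, or usability.
```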

2.12 K-Means Algorithm (Unsupervised Machine Learning) Mezghani et al. [27] proposed the use of K-Means for automatically handling inconsistency and redundancy in requirement engineering. They introduced preprocessing and Part-of-Speech (POS) tagging to detect technical terms related to the requirement document and also assessed the impact of the K-Means algorithm. They took a dataset from industry, from which domain experts generated three corpora, i.e., corpus1, corpus2, and corpus3, containing redundant, inconsistent, and random requirements, respectively. The authors report that the K-Means algorithm provides relevant results for corpus1 and corpus2 but not for corpus3, which is considered a shortcoming of the proposed approach.
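
The sketch below illustrates the underlying idea: requirement sentences are vectorized with TF-IDF and clustered with K-Means, so that near-duplicate statements fall into the same cluster and can be reviewed for redundancy or inconsistency. The sentences and the number of clusters are illustrative assumptions, not the industrial corpora used in [27].

```python
# Sketch: clustering requirement sentences with K-means over TF-IDF vectors;
# requirements in the same cluster are candidates for a redundancy or
# inconsistency review. Sentences and k are illustrative.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

reqs = [
    "The system shall back up the database every night.",
    "A nightly backup of the database shall be performed.",
    "The user interface shall support English and French.",
    "Backups shall never be performed during business hours.",
]
X = TfidfVectorizer(stop_words="english").fit_transform(reqs)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
for cluster, text in sorted(zip(km.labels_, reqs)):
    print(cluster, text)
```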

2.13 Logarithmic Fuzzy Trapezoidal Approach (LFTA) with Artificial Neural Network (ANN) Existing approaches for requirement prioritization suffer from complexity, uncertainty, ambiguity, and negative membership functions. Singh et al. [28] presented a new hybrid approach, LFTA with ANN, in order to resolve the conflicts that occur while assigning priorities to requirements and to overcome the issues of existing prioritization approaches. The LFTA-with-ANN approach provides an optimal priority weight vector


and a positive degree of membership function, ensuring consistency among the fuzzy decision-making processes.

2.14 Least-Square Based Random Genetic Algorithm (LSRGA) Ahuja et al. [29] proposed a mathematical model named the Least-Square Based Random Genetic Algorithm, which enhances the performance of requirement prioritization. This learning algorithm was selected due to its high accuracy in reaching optimal results. The main objective of the model is to assist requirement engineers in requirement prioritization, resulting in reduced time and decision-making effort. The performance of LSRGA was measured empirically and compared with an integrated genetic algorithm; it can perform the prioritization process with less time consumption than the Analytical Hierarchy Process. The model does not address dependencies among requirements during the prioritization process, which is considered its limitation.
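
As a generic illustration of genetic-algorithm-based prioritization (not the LSRGA model itself), the sketch below evolves orderings of requirements whose fitness is the number of stakeholder pairwise preferences they satisfy; the preference pairs, population size, and operators are assumptions.

```python
# Generic genetic-algorithm sketch for requirement prioritization:
# individuals are orderings, fitness counts satisfied pairwise preferences.
import random

random.seed(0)
N_REQ = 6
prefs = [(0, 2), (0, 5), (1, 3), (4, 3), (2, 5)]   # (a, b): a should precede b

def fitness(order):
    pos = {r: i for i, r in enumerate(order)}
    return sum(pos[a] < pos[b] for a, b in prefs)

def crossover(p1, p2):
    # keep a random slice of p1, fill the rest in p2's order (valid permutation)
    a, b = sorted(random.sample(range(N_REQ), 2))
    kept = p1[a:b]
    return kept + [r for r in p2 if r not in kept]

population = [random.sample(range(N_REQ), N_REQ) for _ in range(30)]
for _ in range(100):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]
    children = [crossover(random.choice(parents), random.choice(parents))
                for _ in range(20)]
    for c in children:                               # swap mutation
        if random.random() < 0.2:
            i, j = random.sample(range(N_REQ), 2)
            c[i], c[j] = c[j], c[i]
    population = parents + children

best = max(population, key=fitness)
print("best ordering:", best, "satisfied preferences:", fitness(best))
```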

2.15 Interpretable Machine Learning with Dependency Parsing Dependency parsing is the task of identifying the grammatical structure of a sentence by determining the linguistic dependencies between its words, according to a predefined set of dependency relations [30]. Dalpiaz et al. [31] classified requirements on the basis of functional and quality aspects. The authors explored how the presence of linguistic features, such as dependency types, can help determine whether a textual requirement contains functional or quality aspects. For this purpose, they presented interpretable ML tooling for building and evaluating classifiers in RE. They concluded that this approach works well for a small feature set with clear semantics.

2.16 Cloud-Service Method with Machine Learning Techniques After the elicitation of requirements, there is an essential need to recognize and implement quality requirements. Merugu and Chinnam [32] proposed a cloud-service-based automated approach to identify quality software requirements and state which characteristic of a requirement belongs to which class. The proposed work is implemented using deep learning models for classification.


Deep learning models were chosen because they provide improved performance and accurate results on learning tasks. The classification method can be built using machine learning techniques such as Support Vector Machines, multilayer perceptrons, and convolutional neural networks. Automatic classification is based on training data containing labeled requirements; this training data is taken as input by the classification models, which then label the target document and assess the quality of the software requirement specifications.

2.17 Ontology, Machine Learning Approach, and NLP Daramola et al. [33] introduced a novel approach to Hazard Identification (HazId) that identifies safety concerns in the requirement development phase. The authors developed a tool, KROSA (Knowledge Reuse-Oriented Safety Analysis), that combines ontologies, case-based reasoning, and natural language processing for hazard identification. Because hazard analysis involves time-consuming and repetitive tasks, there is a clear need for reuse. The proposed system reuses requirements from ontologies, which reduces the cost of safety analysis: requirements are extracted from text against a pre-existing ontology and reused to identify hazards early in projects. The approach uses the words and phrases obtained by natural language processing methods, and the ontology manages and resolves conflicts with other NFRs (e.g., safety vs. usability).

2.18 Natural Language Processing (NLP) Ferrari and Esuli [34] identified ambiguity between different domains through a ranking and scoring process based on NLP. A word-embedding language model is trained for each domain, and the embeddings are used to compare how a term is used across domains and thus to evaluate its potential ambiguity. They also provide a way to identify frequently occurring terms in the different domains. The proposed methodology was evaluated on seven scenarios involving combinations of five domains. The limitations of the method are the inaccuracy and the discrepancies that occur between different annotators when measuring performance.
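
The sketch below conveys the core idea with toy data: a term whose nearest neighbours differ strongly between two domain-specific embedding spaces is a candidate for domain-dependent, potentially ambiguous usage. The tiny embedding dictionaries stand in for word-embedding models trained on two domain corpora and are purely hypothetical.

```python
# Toy sketch of cross-domain ambiguity detection with word embeddings:
# compare a term's nearest neighbours in two domain-specific vector spaces.
import numpy as np

def nearest(term, emb, k=2):
    v = emb[term]
    sims = {w: float(np.dot(v, u) / (np.linalg.norm(v) * np.linalg.norm(u)))
            for w, u in emb.items() if w != term}
    return [w for w, _ in sorted(sims.items(), key=lambda x: -x[1])[:k]]

cs_emb   = {"driver": np.array([0.9, 0.1, 0.0]), "kernel": np.array([0.8, 0.2, 0.1]),
            "module": np.array([0.7, 0.3, 0.0]), "wheel":  np.array([0.1, 0.9, 0.2])}
auto_emb = {"driver": np.array([0.1, 0.9, 0.1]), "kernel": np.array([0.2, 0.1, 0.9]),
            "module": np.array([0.6, 0.4, 0.1]), "wheel":  np.array([0.2, 0.8, 0.2])}

for term in ["driver", "module"]:
    n1, n2 = nearest(term, cs_emb), nearest(term, auto_emb)
    overlap = len(set(n1) & set(n2)) / len(set(n1) | set(n2))
    print(f"{term}: CS neighbours {n1}, automotive neighbours {n2}, "
          f"overlap {overlap:.2f} (low overlap -> potentially ambiguous)")
```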

2.19 Natural Language Processing (NLP) and Neural Network (NN) Hayrapetian and Raje [35] devised a semi-automated approach to analyze a set of security requirements in terms of completeness and inherent ambiguity with respect to certain security features. Due to the free-flowing nature of requirements, NLP is applied in order to remove vague data. A neural network is then used for identifying relationships and comparing the test document against standards.


The inputs to the NN are feature vectors; the output of one layer is taken as the input to the next layer, and the process continues until the final output layer is reached, resulting in the classification of requirements as complete, ambiguous, or neither.

2.20 Natural Language Understanding (NLU) Memon and Xiaoling [36] presented NLU as a major field of Natural Language Processing that is as hard as Artificial Intelligence itself. NLU is the future of the machine world, enabling machines to understand human language without the formalized syntax of a programming language. The speech analytics technique of NLU is used for collecting quality requirements.

2.21 Reinforcement Learning Sultanov and Hayes [37] discussed a reinforcement learning method for tracing textual pairs of requirement artifacts. The authors used an ambiguity detector tool on the software requirement document: a requirement statement is taken as input, and the tool checks whether the given statement is ambiguous or not. The output of the tool feeds a decision tree that detects ambiguities in the software requirement document. The experiment was performed on two sets of requirements from different domains, and the validation results were compared with a traditional Information Retrieval (IR) technique, showing that the proposed method gives better results.

3 Current Research in Requirement Engineering Requirements are important in software engineering and are a fundamental step in ensuring product quality and increasing customer satisfaction. This section focuses on existing work on requirement engineering activities such as requirement elicitation, requirement analysis, and requirement prioritization. Table 1 maps the requirement engineering activities to the related work in each activity and highlights the current research focus in RE found in the literature.

Table 1 Current research in RE

Requirement engineering activities | Activities                                | References              | Number of studies
Elicitation                        | Requirements elicitation                  | [18]                    | 1
Analysis                           | Ambiguous requirements                    | [12, 17, 34, 35, 37]    | 22
                                   | Conflicting and inconsistent requirements | [27, 33]                |
                                   | Requirement quality                       | [14–16, 21, 31, 32, 36] |
                                   | Classification of requirements            | [10, 11, 13, 22–26]     |
Prioritization                     | Prioritize requirements                   | [19, 20, 28, 29]        | 4

3.1 Requirement Elicitation Requirement elicitation is considered one of the important phases of requirement engineering. The goal of this phase is to collect stakeholders' needs and explore the application domain of the different end-users. As Table 1 shows, only one paper discussed the elicitation of requirements, using social media and collaborative filtering. This indicates that this phase needs more exploration in order to collect requirements automatically using AI techniques.

3.2 Requirement Analysis The focus of the analysis phase is parsing and understanding the elicited requirements, in which ambiguous, conflicting, and inconsistent requirements are detected. Modifications and changes to requirements in later phases increase the cost and time of development, so several papers [12, 14–17, 21, 27, 31–37] discuss the automatic detection of erroneous requirements using AI techniques to improve the overall performance of the software (see Table 2). Requirement classification is also an important step of requirement analysis, because improper handling of requirements increases software maintenance cost and impacts the quality of the software. Therefore, the identification and classification of requirements become critical and necessary steps. Software requirements can be classified into two categories: functional and non-functional requirements. Researchers make use of AI techniques to automatically classify software requirements [10, 11, 13, 22–26].

Requirement classification

Non-functional requirements classifier

Internet-based analysis system

Decision tree text classifier Requirement quality

Semi-supervised approach

Case-based reasoning and artificial intelligence

[10]

[11]

[12]

[13]

[14, 15]

Decision tree text classifier used for automatically classifies requirements as ambiguous or non-ambiguous

Automatically classifies requirements on the basis of topic word and detect the structure of requirement

Classify requirement into functional and non-functional requirements

Description

Requirement quality

• Low precision value (0.1244) • Large number of false positive value

Limitation/future work

• Accuracy (0.86) • Robust

Case-based reasoning • Speedup quality process techniques stores previous • Saves time and money cases and Artificial Neural Network evaluates the similarities between new case and existing case

(continued)

• CBR can be used with other learning algorithm to improve the performance

• Active learning is needed in order to reduce labeling efforts

• Handles small set of dataset

• Reduce difficulties in – informal and unorganized requirement • Reduce amount of work done

• Require less efforts than manual classification

advantages

Requirement identification | Text categorization technique is used with semi-supervised approach to identify NFRs | • Require less human effort in human labeling

Requirement classification

Purpose in RE

References AI technique

Table 2 Research map for AI techniques for RE approaches with description and advantages on usage of these


Requirement change management

Case-based reasoning

Artificial Neural Network

K-Nearest Neighbor

Machine learning technique with case base ranking and gradient decent ranking

Machine learning techniques (logistic regression, Naïve Bayes classifier, K-nearest model) | Requirement quality

[16]

[17]

[18]

[19, 20]

[21]

Maintain repository for requirement changes and resolving conflicts

Description • Increase customer satisfaction • Reduce cost • Deliver product on time

advantages

Requirement prioritization

Requirement elicitation

• Easy to use • Improve efficiency • Consume less time

• Improve the quality of requirement document

Requirements statements are taken from dataset and metrics are calculated for each requirement and this data can be used to measure the performance of classifier | • Improve accuracy (from 0.74 to 0.86)

Order set of requirements according to their preference through ML techniques

Requirements are collected from stakeholders from social media on large scale

Identification of vulnerable requirements | Misuse case oriented quality framework metrics with machine learning techniques for the identification of vulnerabilities in requirement | • Minimize risk

Purpose in RE

References AI technique

Table 2 (continued)

(continued)

• Other technique such as ANN, CNN, and DT can also be used for the same purpose

• Complex • Handles small set of data





• Handles small set of data

Limitation/future work


Requirement classification

Deep learning models (convolutional neural network and artificial intelligence)

K-Nearest Neighbor (supervised ML approach) with ontology

Support vector machine

Unsupervised machine learning (K-means algorithm)

Artificial Neural Network

[22–24]

[25]

[26]

[27]

[28]

Requirement prioritization

Requirement quality

Requirement classification

Requirement analysis

Purpose in RE

References AI technique

Table 2 (continued)

Rank requirement according to their weight priority with the help of comparison matrix

Used for detection of redundancy and inconsistency by finding similarity between two requirement sentence

• Automatically classify functional and non-functional requirements with binary and multiclass classifier

• Automatically analyze requirements by extracting nouns and semantic

• Classify requirement statement

Description

• Proposed approach can be used with complex dataset

• Sometimes classify incorrectly • Recurrent NN can be used in future for classification

Limitation/future work

• Optimized results • Reduce inconsistency, uncertainty, ambiguity

• Efficient

• Complex

(continued)

• Another clustering algorithm is applied based on semantic features

• Handle imbalanced data – accurately • Removing redundant and irrelevant feature

• Handle sparsity of data • Effective

Improve accuracy

advantages


Requirement prioritization

Genetic algorithm

Interpretable machine learning with dependency parsing

Deep learning model (Artificial Neural Network, convolutional neural network, support vector machine) | Requirement quality

Machine learning with natural language processing and ontology

Natural language processing

[29]

[31]

[32]

[33]

[34]

Requirement analysis

Requirement quality

Requirement classification

Purpose in RE

References AI technique

Table 2 (continued) Description

advantages

• Save manpower • Improve efficiency

• Effective if words have clear semantics



• Other automated techniques can also be used to predict the quality of requirements

• Handle only small set of data

• Proposed technique can be integrated with existing approaches of prioritization in order to improve the software economically

Limitation/future work

(continued)

Identify ambiguous terms in between different domains | • Ability to adopt multiple domains | • Cannot explain about domain ambiguity automatically

Both techniques CBR and • Reduce human effort ontology are used for knowledge reuse

Automatically classify and detect the quality requirements

Verb/adverb, adjective of sentence used for classification

Priority and dependencies between requirements | • Reduce redundancy • Minimize error probability


Requirement analysis

Neural network with natural language processing

Natural language Requirement analysis understanding (NLP + AI)

Reinforcement learning

[35]

[36]

[37]

Requirement analysis

Purpose in RE

References AI technique

Table 2 (continued)

Recognize common textual segment and link are created in such document

Used in voice activation, text categorization, question-answering from client and analysis of large scale projects requirements

Identify security requirements and automatically check whether requirements written in security standard

Description

• Improve accuracy

• Detect transported and mispronunciation words • Detect redundancy, ambiguity, and incompleteness

• Reduce the cost • Saves time

advantages



• Can be used for large dataset

Limitation/future work



3.3 Requirement Prioritization Requirement prioritization is a crucial step in requirement engineering. Prioritization is performed to rank the requirements in order of importance and to increase the economic value of the system [38]. In large software projects, only a subset of the requirements can be implemented. Existing work on the automatic prioritization of requirements makes use of machine learning techniques [19, 20], neural networks [28], and genetic algorithms [29].

4 Research Map for AI in RE To relate AI and RE, a research map has been developed (Table 2) that maps the current research work to its advantages and limitations/future work. The role of AI techniques varies according to the RE activity.

5 Conclusion and Future Work In this paper, 21 Artificial Intelligence techniques that are used to automate requirement engineering activities are reviewed. It is concluded that applying AI techniques or machine learning algorithms to requirement engineering is not an easy task; for requirement classification, for example, machine learning algorithms require pre-labeled training data. On the other hand, the survey shows that manual classification of requirements is time-consuming, and compared with traditional manual classification, automatic classification with machine learning techniques improves system performance. Following the existing literature on automatic requirement classification, another deep learning model (the Recurrent Neural Network) could be used for software requirement classification [22]. For requirement prioritization, a hybrid approach is suggested to overcome issues such as ambiguity, time complexity, scalability, and inconsistency; improving the complexity of the hybrid approach by using a Genetic Algorithm can be taken as future work [28]. Predicting software performance from requirements using machine learning classification techniques is a major novel step taken by researchers towards AI, and an interesting future work is to improve model validity by using alternative modeling approaches and piloting [21]. Automatic classification and prioritization of requirements reduce human intervention, complexity, time, and effort through AI techniques. This paper concludes that a close collaboration between AI and RE is needed in order to address the open challenges facing the development of real-world applications of AI in RE.


References 1. Young RR (2004) The requirement engineering handbook. Artech House Technology Management and Professional Development Library, Artech House, Boston 2. Sommerville I (2011) Software engineering, 9th edn. Pearson Education Inc., Saddle River 3. Bohem BW (1984) Software engineering economics. IEEE Trans Softw Eng 4–21 4. Moitra A, Siu K, Crapo AW, Durling M, Li M, Manolios P, Meiners M, McMillan C (2019) Automating requirements analysis and test case generation. Requirements Eng 24(3):341–364. https://doi.org/10.1007/s00766-019-00316-x 5. Utting M, Pretschner A, Legeard B (2012) A taxonomy of model-based testing approaches. Softw Test Verif Reliab 22(5):297–312. https://doi.org/10.1002/stvr.456 6. Chaos: Standish Group (2015) Chaos report 7. Praehofer H, Kerschbaummayr J (1999) Case-based reasoning techniques to support reusability in a requirement engineering and system design tool. Eng Appl Artif Intell 12(6):717–731. https://doi.org/10.1016/S0952-1976(99)00043-3 8. de Gea JMC, Nicolás J, Alemán JLF, Toval A, Ebert C, Vizcaíno A (2011) Requirements engineering tools. IEEE Softw. 28(4):86–91. https://doi.org/10.1109/MS.2011.81 9. Meziane F, Vadera S (2010) Artificial intelligence in software engineering: current developments and future prospects. In: Artificial intelligence applications for improved software engineering development: new prospects. IGI Global, pp 278–299. https://doi.org/10.4018/ 978-1-60566-758-4.ch014 10. Cleland-Huang J, Settimi R, Zou X, Solc P (2007) Automated classification of non-functional requirements. Requirements Eng 12(2):103–120. https://doi.org/10.1007/s00766-007-0045-1 11. Ko Y, Park S, Seo J, Choi S (2007) Using classification techniques formal requirements in requirement analysis-supporting system. Inf Softw Technol 1128–1140. https://doi.org/10. 1016/j.infsof.2006.11.007 12. Hussain I, Ormandjieva O, Kosseim L (2007) Automatic quality assessment of SRS text by means of a decision-tree-based text classifier. In: Seventh international conference on quality software (Qsic). IEEE. https://doi.org/10.1109/QSIC.2007.4385497 13. Casamayor A, Godoy D, Campo M (2010) Identification of non-functional requirements in textual specifications: a semi-supervised learning approach. Inf Softw Technol 52(4):436–445. https://doi.org/10.1016/j.infsof.2009.10.010 14. Jani HM, Mostafa SA (2011) Implementing case-based reasoning technique to software requirements specifications quality analysis. Int J Adv Comput Technol. https://doi.org/10.4156/ijact. vol3.issue1.3 15. Jani HM, Islam AT (2012) A framework of software requirement quality analysis system using case-based reasoning and neural network. In: International conference on new trends in information science and data mining. IEEE. https://doi.org/10.9790/0661-01037882 16. Naz H, Molta YH, Asghar S, Abbas MA, Khatoon A (2013) Effective usage of AI technique for requirement change management practices. In: International conference on computer science and information technology, pp 121–125. https://doi.org/10.1109/CSIT.2013.6588769 17. Poonia AS, Banerjee C, Banerjee A, Sharma SK (2019) Aligning misuse case oriented quality requirements metrics with machine learning approach. In: Ray K, Sharma T, Rawat S, Saini R, Bandyopadhyay A (eds) Soft computing: theories and applications. Advances in intelligent systems and computing, vol 742. Springer, Singapore. https://doi.org/10.1007/978-981-130589-4_64 18. 
Mulla N, Girase S (2012) A new approach to requirement elicitation based on stakeholder recommendation and collaborative filtering. Int J Softw Eng Appl. https://doi.org/10.5121/ ijsea.2012.3305 19. Perini A, Susi A, Avesani P (2013) A machine learning approach to software requirements prioritization. IEEE Trans Softw Eng 39(4):445–461. https://doi.org/10.1109/TSE.2012.52 20. Singh D, Sharma A (2014) Software requirement prioritization using machine learning. In: Proceedings of the international conference on software engineering and knowledge engineering, SEKE, pp 701–704


21. Dargan JL, Wasek JS, Campos-Nanez E (2016) Systems performance prediction using requirements quality attributes classification. Requirements Eng 21:553–572. Springer, Singapore. https://doi.org/10.1007/s00766-015-0232-4 22. Navarro-Almanza R, Ramirez RJ, Licea G (2017) Towards supporting software engineering using deep learning: a case of software requirements classification. In: International conference in software engineering research and innovation. IEEE. https://doi.org/10.1109/CONISOFT. 2017.00021 23. Winkler J, Vogelsang A (2016) Automatic classification of requirements based on convolutional neural networks. In: International requirement engineering conference workshops. IEEE. https://doi.org/10.1109/REW.2016.021 24. Baker C, Deng L, Chakraborty S, Dehlinger J (2019) Automatic multi-class non-functional software requirements classification using neural networks. In: 2019 IEEE 43rd annual computer software and applications conference (COMPSAC), vol 2. IEEE, pp 610–615. https://doi.org/ 10.1109/COMPSAC.2019.10275 25. Wang Y (2016) Automatic semantic analysis of software requirements through machine learning and ontology approach. J Shanghai Jiaotong Univ 21(6):692–701. Springer-Verlag, Berlin, Heidelberg. https://doi.org/10.1007/s12204-016-1783-3 26. Zijad K, Maleej W (2017) Automatically classifying functional and non-functional requirements using supervised machine learning. In: IEEE international requirement engineering conference. https://doi.org/10.1109/RE.2017.82 27. Mezghani M, Kang J, Sedeas F (2018) Using K-means for redundancy and inconsistency detection. In: Natural language processing and information systems. Springer International Publishing AG, pp 501–508. https://doi.org/10.1007/978-3-319-91947-8_52 28. Singh YV, Kumar B, Chand S, Sharma D (2019) A hybrid approach for requirements prioritization using logarithmic fuzzy trapezoidal approach (LFTA) and artificial neural network (ANN). In: Singh P, Paprzycki M, Bhargava B, Chhabra J, Kaushal N, Kumar Y (eds) Futuristic trends in network and communication technologies. FTNCT 2018. Communications in computer and information science, vol 958. Springer, Singapore. https://doi.org/10.1007/978-981-13-38045_26 29. Ahuja H, Sujata, Batra U (2018) Performance enhancement in requirement prioritization by using least-squares-based random genetic algorithm. In: Panda B, Sharma S, Batra U (eds) Innovations in computational intelligence. Studies in computational intelligence, vol 713. Springer, Singapore. https://doi.org/10.1007/978-981-10-4555-4_17 30. Kubler S, McDonald R, Nivre J (2009) Dependency parsing. Morgan & Claypool Publishers. https://doi.org/10.2200/S00169ED1V01Y200901HLT002 31. Dalpiaz F, Anna DD, Aydemir FB, Cevikol S (2019) Requirements classification with interpretable machine learning and dependency parsing. Zenodo. In: 2019 IEEE 27th international requirements engineering conference (RE). https://doi.org/10.1109/RE.2019.00025 32. Merugu RRR, Chinnam SR (2019) Automated cloud service based quality requirement classification for software requirement specification. Evol Intell. Springer-Verlag GmbH, Germany. https://doi.org/10.1007/s12065-019-00241-6 33. Daramola O, Stålhane T, Omoronyia I, Sindre G (2013) Using ontologies and machine learning for hazard identification and safety analysis. In: Maalej W, Thurimella A (eds) Managing requirements knowledge. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-344 19-0_6 34. 
Ferrari A, Esuli A (2019) An NLP approach for cross-domain ambiguity detection in requirements engineering. Autom Softw Eng 26:559–598. https://doi.org/10.1007/s10515-019-002 61-7 35. Hayrapetian A, Raje R (2018) Empirically analyzing and evaluating security feature in software requirement. In: Proceedings of the 11th innovations in software engineering conference, Hyderabad. ACM. https://doi.org/10.1145/3172871.3172879 36. Memon KA, Xiaoling X (2019) Deciphering and analyzing software requirements employing the techniques of natural language processing. In: International conference on mathematics and artificial intelligence, Chengdu. ACM, pp 153–156. https://doi.org/10.1145/3325730.332 5757


37. Sultanov H, Hayes HJ (2013) Application of reinforcement learning to requirement engineering: requirement tracing. IEEE, pp 52–61. https://doi.org/10.1109/RE.2013.6636705 38. Achimugu P, Selamat A, Ibrahim R, Mahrin MNR (2014) A systematic literature review of software requirements prioritization research. Inf Softw Technol 56(6):568–585. https://doi. org/10.1016/j.infsof.2014.02.001

Data-Driven Model for State of Health Estimation of Lithium-Ion Battery Rupam Singh, V. S. Bharath Kurukuru, and Mohammed Ali Khan

Abstract The trend of electrification within the automotive industry has resulted in the development of electric vehicles (EVs) due to their sustainable and environment-friendly features. One of the major components of these cars is the energy storage device, whose performance is crucial for the success of the EV. The major concern is that the properties of the energy storage, or battery, change with time and usage. Therefore, the degradation of battery properties, especially the capacity decline, is of particular interest; to understand and counter this degradation, it must be measured with high precision. This paper develops a data-driven approach to estimate the health of batteries using a support vector machine model. The proposed method identifies the key points of the operation of an EV battery under aging conditions by performing accelerated aging and trains them with the adapted data-driven approach. To access the aging conditions, an equivalent electrical model of an EV battery cell is simulated and operated with various current pulses. Further, when tested with the unknown operation of a battery, the developed estimator predicted its health with an accuracy of 96.25%. The results show that the proposed concept gives an improved measure of the battery state and a pattern of how capacity degradation occurs with time. Keywords Electric vehicle · Lithium-ion battery · State of charge · State of health · Pulsed current test

R. Singh (B) Department of Electrical Engineering, Delhi Technological University, New Delhi, India e-mail: [email protected] V. S. B. Kurukuru · M. A. Khan Department of Electrical Engineering, Faculty of Engineering and Technology, Jamia Millia Islamia (A Central University), New Delhi, India e-mail: [email protected] M. A. Khan e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_21


1 Introduction The automotive industry is experiencing a clear trend towards electrification. Electric vehicles are believed to be more environmentally friendly and sustainable, and more in line with future technologies and regulations, such as autonomous cars and emission-free cities. The success of the electric car, from both an environmental and a commercial perspective, is heavily connected to the sustainability of the battery. If the battery needs to be replaced after a few years, or its performance is severely degraded, the impact on both the environmental footprint and the business case will be substantial. Even though batteries have been used for a long time, the nature of the battery, a chemical system with very little transparency, makes it difficult to fully understand. Still, there is harsh competition in the car market, and companies need to deliver good quality and long warranties. This poses an interesting challenge: to promise quality products without full control over the components. In general, batteries, and specifically lithium-ion batteries, degrade with time and use [1]. A new battery has a certain capacity, resistance, etc., that gives a certain performance. Over time and with use, these properties change, often in a way that gives lower performance; for instance, as the capacity declines, the range of the vehicle decreases. This is generally called aging and is often measured in relation to the initial properties, for instance, the capacity of the battery being 90% of the initial capacity [2]. This measure is in general named the state of health (SOH). A SOH of 70–80% is often defined as the end of life (EOL) of a battery. Hence, to rival a combustion engine car, a lifetime of tens of years is needed, which limits the acceptable annual degradation to only a few percent [3]. Battery degradation has been studied extensively in laboratories, but this knowledge does not fully cover the highly dynamic and differentiated use in electric vehicles [4, 5]. In addition, the number of parameters affecting the battery under various conditions makes laboratory testing expensive and tedious. Hence, to achieve better solutions, researchers have started to analyze the capacity degradation of batteries [6, 7]. The capacity degradation is perhaps the most important symptom of aging, but the capacity measurement is not trivial in an electric vehicle. The simplest, and most accurate, form of measurement would be to perform a complete discharge. However, this is not often done in a car, and the capacity might vary with the circumstances. For instance, it is known that the capacity of a battery depends on temperature; in fact, it is possible that an older battery at a warmer temperature has a higher capacity than a new battery at a colder temperature. The high stakes, the difficulty of accurately measuring SOH, and the differentiated use of batteries in electric vehicles make the issue of battery degradation in electric vehicles very interesting. This research aims to evaluate a data-driven approach, using machine learning techniques, to determine the SOH and map the battery degradation. The feasibility of the approach is determined by its capability to determine the SOH so that different battery usage can be associated with different aging rates. This would give an overview of the batteries in EVs without having to measure the SOH under specific circumstances. Due to the exploratory nature of the project, many steps are required, from choosing


Fig. 1 Data-driven model for state of health estimation (offline model establishment: accelerated aging → dataset preparation → feature extraction → training the data-driven model; online SOH estimation: battery cell → feature extraction → data-driven model → SOH estimation)

a data set, to estimating the SOH. The SOH estimation can be achieved with many different properties, but this paper limits its search based on a capacity-based SOH. This means that the SOH is a normalization of current capacity of the battery with respect to the initial capacity. Further, the extracted data, i.e., the voltage response of the battery, has some distinctive features which need to be extracted in order to differentiate between different weeks of aging. Generally, this process can be achieved easily by mapping the change of voltage response with respect to the instant and duration of its input. But, better techniques like Fourier and wavelet transform can be adapted for extracting more features and for developing an efficient classifier. Further, the extracted features are applied to a two-class support vector classifier. The usage of features and development of the algorithm are further achieved in the paper as shown in Fig. 1. The information regarding the battery degradation and its effects on various characteristics is briefed in Sect. 2. Section 3 provides a detailed insight of modeling of lithium-ion batteries, their capacity analysis and performance modeling. In Sect. 4, the various methods like feature extraction process and machine learning classifiers are explained. The algorithm development for estimating the state of health of the batteries is achieved in Sect. 5, and the discussion is concluded in Sect. 6.
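
As a small worked illustration of the capacity-based definition used here, the SOH is simply the present capacity normalised by the initial capacity; the capacity values in the snippet below are invented.

```python
# Capacity-based state of health as described above: the present capacity
# normalised by the initial capacity. The capacity values are illustrative.
def state_of_health(capacity_now_ah: float, capacity_initial_ah: float) -> float:
    """Return SOH in percent."""
    return 100.0 * capacity_now_ah / capacity_initial_ah

soh = state_of_health(45.6, 60.0)
print(f"SOH = {soh:.1f} %")        # 76.0 %, close to a typical end-of-life limit
```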

2 Lithium-Ion Battery
The lithium-ion battery is a common choice for electrical energy storage since it has high energy density and slow aging compared to many alternatives [8]. There are many different configurations of cells, with different geometries, materials, and sizes, but a simple schematic of a lithium-ion battery cell can be seen in Fig. 2. All cells have an anode and a cathode side, a negative and a positive pole. Lithium, the active material of the cell, can flow from anode to cathode and from cathode to anode, discharging and charging the battery [8]. Both the anode and cathode materials are layered structures where lithium can be intercalated or extracted. Between the electrodes is an electrolyte, often a lithium salt. It does not let electrons pass


Fig. 2 Schematic of lithium-ion battery

through, but allows a flow of lithium ions. At the anode, the electrolyte is operated outside its stability window and thus can be reduced. This consumes lithium and creates a solid interface, called the solid–electrolyte interface (SEI), that protects against further reduction [9]. Besides being fundamental to the function of the battery, this also stabilizes the graphite in the anode [10].

2.1 Aging
The properties of a battery change with time and use. Many of them degrade, causing the abilities of the battery to decline, which is generally called aging. A battery might age differently depending on how it is used, referred to as cycle aging. However, a battery also degrades without being used, referred to as calendar aging [3]. The extent of these effects differs for different materials, morphology, design, etc. [11]. In a broad view, the different causes of aging can be put into three categories: loss of lithium inventory (LLI), i.e., processes that make the lithium unusable for cycling; loss of active materials (LAM), i.e., a reduced amount of material enabling the lithium transfer; and structural damage to the components of the battery. All of them include many physical and chemical processes. The LLI is deemed to be the most severe aging mechanism, and it is accelerated by high charging rates and very low or very high temperatures [12]. On a macroscopic level, aging is the effect of many ongoing processes in the battery [13]. These constitute the complex inner workings of the battery and make aging difficult to fully pinpoint. Going through them all is outside the scope of the project, but this section aims to convey the complexity of the system. Just as there are many different phenomena that age the battery, there are multiple parameters that


induce or accelerate them. These are often found through empirical studies, or from experiments validating theoretical knowledge. From literature-based research, the following parameters were found to be important to aging in batteries and to the project.

2.2 Battery States
Battery performance is often described in terms of battery states. These are artificial states of the battery that are suitable for describing its operation, but they can often not be measured directly. Instead, they are the aggregation of many different measurements. The states are then continuously estimated and used to make sure that the battery is operated within the safe operation window [14]. State of charge (SOC) is the amount of charge stored in the battery, presented in percent of total capacity, 0–100%. It is the closest analogy to a fuel gauge in a petrol engine car [14]. The SOC can be estimated in many ways. The most reliable and intuitive way is to completely discharge the battery under known conditions and measure the amount of charge drained. The battery will then be depleted, however, and the measured SOC, although known, will no longer be relevant [15]. A related method is Ampere counting, sometimes known as the Ampere-hour integral. The idea is to integrate over the current to get the change in SOC from a known starting point. The Coulombic efficiency is used to scale the integrated current and expresses the success rate of moving lithium ions [14, 15]. The integral is then normalized by the capacity C. The mathematical expression for Coulomb counting can be seen in Eq. (1). Ampere counting suffers if the measurement of the current is imprecise. Also, it is often hard to know the initial SOC and the Coulombic efficiency, but, if these are known, the method can provide an accurate measurement of the SOC [14].

SOC = SOC_0 + (1/C) ∫_{t_0}^{t} η I dτ    (1)
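As an illustration of Eq. (1), the following is a minimal sketch of Ampere-hour (Coulomb) counting, assuming a hypothetical array of current samples, a known initial SOC, a constant Coulombic efficiency, and a fixed sampling interval; the function and parameter names are assumptions for the example and are not taken from the paper.

```python
def coulomb_count_soc(currents_a, dt_s, capacity_ah, soc0=1.0, eta=0.99):
    """Track SOC by integrating the measured current, as in Eq. (1).

    currents_a : list of current samples in amperes (positive = charging, assumed)
    dt_s       : sampling interval in seconds
    capacity_ah: battery capacity C in ampere-hours
    soc0       : known initial SOC in the range 0..1
    eta        : assumed Coulombic efficiency
    """
    capacity_as = capacity_ah * 3600.0          # convert Ah to ampere-seconds
    soc = soc0
    trace = []
    for i in currents_a:
        soc += eta * i * dt_s / capacity_as     # incremental form of the integral
        soc = min(max(soc, 0.0), 1.0)           # keep the estimate in a physical range
        trace.append(soc)
    return trace

# Example: a 2.5 Ah cell discharged at 2.5 A (1C) for one hour drops from 100% to roughly 0%
print(coulomb_count_soc([-2.5] * 3600, dt_s=1.0, capacity_ah=2.5)[-1])
```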

An alternative method is to connect the open circuit voltage (OCV) to the SOC. This is valid if the battery is in equilibrium, which is the case after a longer resting period [14, 15]. Because of this, it is not a suitable method for continuous estimation in a car, but it could be suitable while parking. The OCV curves are often taken from individual cells and are mapped to the SOC of the entire battery via a lookup table. Novel versions account for how this mapping changes with time, but more often the lookup table is constant throughout the battery lifetime. Depth of discharge (DOD) is often used instead of SOC. They measure the same change in stored charge relative to the capacity of the battery. However, the DOD measures the deviation from a full battery, 100%, and the SOC from a depleted battery. This means that the relationship between DOD and SOC can be described by:


SOC = 100% − DOD    (2)

The state of health (SOH) is a general measure of how well the battery is doing. It is often used to compare the current capability of the battery to the initial 100%. To do so, it is connected to a property of the battery, often capacity, resistance, self-discharge rate, or power density [3, 15, 16]. All of these will degrade with time and use, and SOH is, therefore, closely connected to aging. In fact, it might be considered a tool to measure aging in the battery. Unfortunately, these properties will degrade differently, and if the SOH is truly a measure of battery aging, it should include, or combine, the degradation of multiple properties, not just one of them [10]. Perhaps the most used is a capacity-based SOH, i.e., how much charge the battery can store. This parameter will decrease with age, decreasing the amount of energy that can be stored in the battery. Such degradation is important since it is closely connected to the range of the car. It also means that an older battery will charge quicker since there is less available lithium for transport [14]. Other parameters related to SOH are the end of life (EOL) and the remaining useful lifetime (RUL). EOL often refers to the point where the battery is no longer useful and is popularly defined as 70 or 80% SOH. The RUL is the number of cycles left before reaching the EOL [14]. Being able to estimate these states is very important to operate the batteries in the best way and for many business-related reasons.
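To make the relationship between these states concrete, the following is a small illustrative sketch of a capacity-based SOH with an EOL check; the function names and the 80% EOL threshold are assumptions for the example, not definitions from the paper.

```python
def capacity_soh(current_capacity_ah, initial_capacity_ah):
    """Capacity-based SOH: current capacity normalized by the initial capacity, in percent."""
    return 100.0 * current_capacity_ah / initial_capacity_ah

def reached_eol(soh_percent, eol_threshold=80.0):
    """EOL is popularly defined as the SOH dropping to 70-80%; 80% is assumed here."""
    return soh_percent <= eol_threshold

# Example: a 2.5 Ah cell measured at 2.1 Ah
soh = capacity_soh(2.1, 2.5)       # 84.0
print(soh, reached_eol(soh))       # 84.0 False
```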

3 Modeling of Vehicle Battery
Depending on the type of configuration, electric vehicles are divided into four different types, namely hybrid electric vehicles, plug-in hybrid electric vehicles, fuel cell hybrid electric vehicles, and battery electric vehicles. Further, there are three different configurations of hybrid electric vehicles: series hybrid, parallel hybrid, and series/parallel hybrid. Each of these types and configurations has its own advantages and disadvantages, but the common property among them is their storage system. The choice of storage for electric vehicles depends upon three major requirements: specific energy, specific power, and cycle life. A brief comparison of vehicle types and their requirements is given in Table 1.

Table 1 Comparison of vehicle types and their requirements

Vehicle type           | Cycle life | Specific power (W/kg) | Specific energy (Wh/kg)
Series hybrid          | Medium     | Low                   | High
Parallel hybrid        | High       | High                  | Low
Series/parallel hybrid | Medium     | Medium                | Medium
Battery electric       | Low        | Low                   | High
Fuel cell              | High       | High                  | Low

Table 2 Parameters for modeling of a lithium-ion battery

Item                                           | Value
Battery type                                   | Lithium iron phosphate battery (LiFePO4)
Battery nominal capacity (Q_nom)               | 2.5 Ah
Battery nominal voltage (V_bat,nom,cell)       | 3.3 V
Battery maximum voltage (V_bat,max,cell)       | 3.6 V
Battery minimum voltage (V_bat,min,cell)       | 2.0 V
Maximum charge current (I_bat,cha,max,cell)    | 10 A
Maximum discharge current (I_bat,dis,max,cell) | 70 A
Operating temperature                          | −30 to 50 °C

In general, the battery requirements are mainly based on the hybridization factor (HF), which is given by:

HF = P_EM / (P_EM + P_ICE)    (3)

where P_EM is the power required to drive the electric machine and P_ICE is the power required to drive the internal combustion engine. Hence, considering the requirements of various EVs, the modeling of a lithium-ion battery for developing a state of health monitoring system is carried out based on the parameters given in Table 2. The electrical equivalent implementation of the lithium-ion battery shown in Fig. 3 is implemented in standalone simulation software.

Fig. 3 Electrical equivalent implementation of lithium-ion battery (inner voltage source V_bat,int,cell, discharge resistance R_bat,dis,cell, charge resistance R_bat,cha,cell, cell current I_bat,cell, terminal voltage V_bat,cell)

It is assumed that the battery consists of an inner voltage source, a charge resistor, and a discharge resistor. It is also assumed


that they all depend on the actual charge level and are independent of the actual current level. The battery equivalent circuit is shown in Fig. 3. The inner voltage source of the battery is given by:

V_bat,int,cell(DoD) = V_bat,cell(I_bat, DoD) + R_bat,dis,cell(DoD) · I_bat    (4)

The inner discharge resistance can therefore be calculated by:

R_bat,dis,cell(DoD) = [V_bat,cell(I_1) − V_bat,cell(I_2)] / (I_2 − I_1)    (5)

where I_1 and I_2 correspond to the current vectors for discharging at various C rates. The value of the charge resistor is calculated by:

V_bat,int,cell(I_bat, DoD) = V_bat,cell(DoD) + R_bat,cha,cell(DoD) · I_bat    (6)

Further, the charge modeling of the battery is achieved by observing the discharge capacity of the battery depending on the current level. The highest discharge capacity can be measured at the reference capacity for depth of discharge and state-of-charge calculation (Q_ref). The depth of discharge for the battery can be calculated by:

DoD = DoD(t_0) − (1/Q_ref) ∫ α(I_bat,cell) dt    (7)

where α is selected based on the operating points of the open circuit voltage, for identifying whether the battery has been fully discharged:

α(I_bat,cell) = Q_ref / Q_nom(I_bat,cell)    (8)
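As a worked illustration of Eqs. (4) and (5), the following is a minimal sketch that derives the discharge resistance from two discharge curves recorded at different C rates and then recovers the inner source voltage; the voltage samples, current values, and function names are hypothetical and not taken from the paper.

```python
def discharge_resistance(v_cell_i1, v_cell_i2, i1, i2):
    """Inner discharge resistance from two discharge curves at currents i1 and i2 (Eq. 5).

    v_cell_i1, v_cell_i2 : terminal voltages sampled at the same DoD points
    i1, i2               : the two discharge currents in amperes (e.g., different C rates)
    Returns one resistance value per DoD point.
    """
    return [(va - vb) / (i2 - i1) for va, vb in zip(v_cell_i1, v_cell_i2)]

def inner_voltage(v_cell, r_dis, i_bat):
    """Inner source voltage V_bat,int,cell at each DoD point (Eq. 4)."""
    return [v + r * i_bat for v, r in zip(v_cell, r_dis)]

# Hypothetical 1C (2.5 A) and 2C (5 A) discharge voltages at three DoD points
v_1c = [3.28, 3.20, 3.05]
v_2c = [3.22, 3.13, 2.96]
r_dis = discharge_resistance(v_1c, v_2c, i1=2.5, i2=5.0)   # about 0.024-0.036 ohm
print(r_dis)
print(inner_voltage(v_1c, r_dis, i_bat=2.5))
```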

Further, the battery model can be validated with the help of charge–discharge profiles or by performing pulsed current tests. Since voltage and current are the only electrical characteristics that can be measured to observe the internal changes of the battery, and since the pulsed current test directly captures the battery power and its degradation, the pulsed current test is relatively convenient for developing the estimation algorithm. A theoretical representation of the pulsed current test is shown in Fig. 4. During the current pulse test, the current pulse is set to a particular amplitude I, as shown in Fig. 4a, for a time period t. During this period, the voltage responses of the battery are recorded, as shown in Fig. 4b, along with the aging. Further, the feature extraction and the method for establishing a data-driven model are discussed.

Fig. 4 Pulsed current test. a Current pulse applied for the test. b Voltage response of the battery

4 Methodology
4.1 Feature Extraction
The points of change in amplitude (rising and falling edges) and the distances between the changes in amplitude of the voltage response during the pulse test are considered as the features for developing the state of health estimator, since the transition times of the pulsed test are already known from the design of the current profiles. The changes in amplitude and their distances from each other in the voltage profile are calculated as shown in Fig. 5.

Fig. 5 Feature extraction process for voltage response of the battery from the pulsed current test (amplitude changes V1–V4 and their distances D1–D4)
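The sketch below illustrates this kind of feature extraction on a pulse voltage response, assuming the transition instants of the current pulse are known in advance; the helper name, the sample trace, and the chosen edge indices are hypothetical choices for the example rather than details from the paper.

```python
def pulse_features(voltage, transition_idx):
    """Extract amplitude changes and their spacings from a pulse voltage response.

    voltage        : sampled voltage response of the battery during the pulse test
    transition_idx : sample indices of the known rising/falling edges of the current pulse
    Returns (delta_v, distances): voltage jumps at each edge and gaps between edges.
    """
    delta_v = [voltage[i] - voltage[i - 1] for i in transition_idx]
    distances = [j - i for i, j in zip(transition_idx, transition_idx[1:])]
    return delta_v, distances

# Hypothetical response: rest, discharge pulse, rest, charge pulse, rest
v = [3.30, 3.30, 3.18, 3.17, 3.16, 3.29, 3.29, 3.36, 3.35, 3.30, 3.30]
edges = [2, 5, 7, 9]                      # known pulse transition samples
dv, d = pulse_features(v, edges)
print(dv)    # voltage steps at the four edges
print(d)     # spacing between consecutive edges
```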


4.2 Data-Driven Models
Data-driven models build a systematic structure using techniques that iteratively learn from the data. These techniques include neural networks, support vector machines, relevance vector machines, and many others. Various regression algorithms can also establish the relationship between the features and the battery SOH. The performance of these data-driven models mainly relies on the collected training samples and the features. In this work, the support vector machine is used for developing the estimation algorithm.
Support Vector Machine (SVM). In machine learning, the support vector machine (SVM) is a non-probabilistic, linear, binary classifier used for classifying data by learning a hyperplane separating the data. When dealing with a linearly separable dataset with n different features, a hyperplane is basically an (n − 1)-dimensional subspace which separates the dataset into two sets, each set containing data points belonging to a different class. Here, the input vectors are mapped into a very high dimensional feature space. The algorithm constructs parallel hyperplanes, one on each side of the separating class, and the hyperplane with the largest separation margin between the training points of the two classes is chosen. SVM then creates a model trained with input examples to predict the class of a new sample [17]. An overview of the SVM process is shown in Fig. 6.

Fig. 6 Process of support vector machine (initial classification → SVM training → data weights → SVM classification → elements in/out of classification)

In general, the SVM algorithm is based on constructing an optimal hyperplane with the maximal margin between support vectors. These support vectors are selected from the training samples in the boundary region of the classes. Another important point is that the number of support vectors used is small in comparison with the size of the training dataset. If the input data are linearly separable, then the SVM algorithm searches for the optimal hyperplane in the unchanged feature space of the input data. However, in case the input data are linearly non-separable, SVM maps the input data using a so-called kernel function to a higher dimensional feature space. The kernel function used must satisfy Mercer's theorem [18] and correspond to some type of inner product in the constructed high dimensional feature space. One of the advantages of


SVM is its universality, as the algorithm allows using different kernel functions, which depend on the researcher's assumptions and the problem domain. Another advantage of the SVM algorithm is its effectiveness in high dimensional feature spaces. The core of SVM is a quadratic programming (QP) problem, separating support vectors from the rest of the training data. The time complexity of SVM is given in Eq. (9), according to [19], where n is the number of samples and d is the number of features.

O(max(n, d) · min(n, d)^2)    (9)
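The following is a small sketch of how such a two-class SVM with a polynomial kernel could be trained on the extracted pulse features using scikit-learn; the synthetic feature matrix, labels, and kernel settings shown are illustrative assumptions, not the exact configuration used in the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical feature matrix: one row per pulse test, columns are the extracted
# amplitude changes and distances; labels mark two aging classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)   # synthetic stand-in for the two classes

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Second-order polynomial kernel, mirroring the idea of mapping nonlinearly scattered
# features into a higher dimensional space before fitting the separating hyperplane.
clf = SVC(kernel="poly", degree=2, C=0.5)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```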

5 Algorithm Development


As discussed in Sect. 2, lithium-ion batteries are considered a suitable choice for developing the SOH estimation algorithm due to their wide use in many applications. Hence, a lithium iron phosphate battery is modeled with its equivalent electrical circuit as discussed in Sect. 3. The variation in the SOC behavior of the battery due to aging is obtained by increasing the temperature and thereby accelerating the calendar aging of the battery. The pulsed current test is performed considering the aging of the battery over 12 months, which is divided into two half-year classes. The pulsed current test is carried out on batteries for different states of charge ranging from 10 to 90%, in steps of 20%, in both charge and discharge conditions. The amplitude of the current pulse is set around 5C, 4C, 2C, and 1C, respectively, for every change in state of charge of the battery. The voltage response of the battery for the different aging stages is recorded as shown in Fig. 7. From the voltage response, the points of change in amplitude and the distances between the changes in amplitude during the pulse test are extracted as features, as discussed in Sect. 4.1. The extracted features form a feature matrix with 8 features and 400 samples per feature. All these data

Fig. 7 Voltage responses for the current pulse test performed for accelerated aging of the battery (voltage in V versus time in s, for weeks 1–52 of aging)


Fig. 8 Feature data scattered into an XY plane

are assigned their corresponding classes for 2 months of aging. Further, 80% of the randomly divided data is used for developing the SOH estimator and the remaining 20% is used for validation. The training data set is then used to train the classifier. Initially, the data is scattered in an XY plane, as shown in Fig. 8, to observe the arrangement of the data. It is observed that the data is nonlinearly scattered. As mentioned in Sect. 4.2, SVM is a linear classifier which learns an (n − 1)-dimensional hyperplane for the classification of data into two classes. However, it can also be used for classifying a nonlinear dataset. This can be done by projecting the dataset into a higher dimension in which it is linearly separable. To achieve this, the so-called kernel trick is used to learn a linear classifier that classifies a nonlinear dataset. It transforms the linearly inseparable data into linearly separable data by projecting it into a higher dimension. A kernel function is applied on each data instance to map the original nonlinear data points into some higher dimensional space in which they become linearly separable. In this research, a polynomial kernel is used to map the data into the high dimensional feature space. The hyperplane and boundary conditions of the nonlinearly separable data in the high dimensional feature space are shown in Fig. 9. Once the data is mapped into the high-dimensional feature space, the SVM is trained with the help of the parameters given in Table 3. With these parameters, the SVM achieved an accuracy of 96.25%. The SOH estimation results for the testing data are depicted in Fig. 10. The results presented in this research show that aggregating the data into events and using support vector machines can achieve efficient SOH estimation. The results also indicate that the degradation is faster in the early stages, and that the usage of high-power charging will damage the battery.

Fig. 9 a Data scattered by reproducing kernel Hilbert space. b Decision boundary in input space

Table 3 Classifier parameters

Parameter                 | Value
Classifier                | Support vector machine
Kernel type               | Polynomial
Order of kernel           | 2
Optimal tradeoff constant | 0.5
Weights of the classifier | −32.38, 32.11, 7.5840
Bias                      | 1.1316


Fig. 10 State of health estimation result for proposed method (battery capacity in Ah versus aging week)

6 Conclusion
This paper developed a machine learning framework for estimating the state of health of batteries. An equivalent electric model of a lithium iron phosphate battery is developed to observe its performance under various temperatures, states of charge, and operating conditions. Further, to develop an SOH estimator, the performance of the modeled battery is evaluated with the help of the pulsed current test. The pulsed current test is performed considering the aging of the battery over twelve months under different SOC and C rates. The voltage response of the battery during different aging periods is observed, and the edge points and the distances between the edge points are extracted as features. Further, all the features are used to train an SVM classifier. During the initial training, the data is observed to be nonlinearly scattered, and hence, to train the data using SVM, a polynomial kernel trick is employed. The resulting classifier achieved 96.25% accuracy when tested with an untrained data set. Further, the results show that the health of batteries degrades faster in their early life than later during usage. As an extension of the above SOH estimation, state of charge aspects can be included in the classification process to assess the reliability and estimate the end of life of the battery. In addition, the SOH estimation can be improved by adopting more features and involving other conditions apart from the aging phenomenon.

References
1. Ritchie AG (2004) Recent developments and likely advances in lithium rechargeable batteries. J Power Sources 136:285–289. https://doi.org/10.1016/j.jpowsour.2004.03.013
2. Yang F, Wang D, Zhao Y et al (2018) A study of the relationship between coulombic efficiency and capacity degradation of commercial lithium-ion batteries. Energy 145:486–495. https://doi.org/10.1016/j.energy.2017.12.144


3. Barré A, Deguilhem B, Grolleau S et al (2013) A review on lithium-ion battery ageing mechanisms and estimations for automotive applications. J Power Sources 241:680–689. https://doi.org/10.1016/j.jpowsour.2013.05.040
4. Hesse HC, Schimpe M, Kucevic D, Jossen A (2017) Lithium-ion battery storage for the grid—a review of stationary battery storage system design tailored for applications in modern power grids. Energies 10(12):2107
5. Barai A, Uddin K, Dubarry M et al (2019) A comparison of methodologies for the non-invasive characterisation of commercial Li-ion cells. Prog Energy Combust Sci 72:1–31. https://doi.org/10.1016/j.pecs.2019.01.001
6. Xu B, Oudalov A, Ulbig A et al (2018) Modeling of lithium-ion battery degradation for cell life assessment. IEEE Trans Smart Grid 9:1131–1140. https://doi.org/10.1109/TSG.2016.2578950
7. Todeschini F, Onori S, Rizzoni G (2012) An experimentally validated capacity degradation model for Li-ion batteries in PHEVs applications. IFAC Proc 45:456–461. https://doi.org/10.3182/20120829-3-MX-2028.00173
8. Tomaszewska A, Chu Z, Feng X et al (2019) Lithium-ion battery fast charging: a review. eTransportation 1:100011. https://doi.org/10.1016/j.etran.2019.100011
9. Vetter J, Novák P, Wagner MR et al (2005) Ageing mechanisms in lithium-ion batteries. J Power Sources 147:269–281. https://doi.org/10.1016/j.jpowsour.2005.01.006
10. Han X, Ouyang M, Lu L et al (2014) A comparative study of commercial lithium ion battery cycle life in electrical vehicle: aging mechanism identification. J Power Sources 251:38–54. https://doi.org/10.1016/j.jpowsour.2013.11.029
11. Agubra V, Fergus J (2013) Lithium ion battery anode aging mechanisms. Materials (Basel) 6:1310–1325. https://doi.org/10.3390/ma6041310
12. Wang J, Purewal J, Liu P et al (2014) Degradation of lithium ion batteries employing graphite negatives and nickel–cobalt–manganese oxide + spinel manganese oxide positives: part 1, aging mechanisms and life estimation. J Power Sources 269:937–948. https://doi.org/10.1016/j.jpowsour.2014.07.030
13. Danzer MA, Liebau V, Maglia F (2015) Aging of lithium-ion batteries for electric vehicles. In: Advances in battery technologies for electric vehicles. Elsevier, pp 359–387
14. Waag W, Fleischer C, Sauer DU (2014) Critical review of the methods for monitoring of lithium-ion batteries in electric and hybrid vehicles. J Power Sources 258:321–339. https://doi.org/10.1016/j.jpowsour.2014.02.064
15. Lu L, Han X, Li J et al (2013) A review on the key issues for lithium-ion battery management in electric vehicles. J Power Sources 226:272–288. https://doi.org/10.1016/j.jpowsour.2012.10.060
16. Bloom I, Cole B, Sohn J et al (2001) An accelerated calendar and cycle life study of Li-ion cells. J Power Sources 101:238–247. https://doi.org/10.1016/S0378-7753(01)00783-2
17. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297. https://doi.org/10.1023/A:1022627411411
18. Mercer J (1909) Functions of positive and negative type, and their connection with the theory of integral equations. Philos Trans R Soc A Math Phys Eng Sci 209:415–446. https://doi.org/10.1098/rsta.1909.0016
19. Chapelle O (2007) Training a support vector machine in the primal. Neural Comput 19:1155–1178. https://doi.org/10.1162/neco.2007.19.5.1155

Trusted Sharing of IOT Data Using an Efficient Re-encryption Scheme and Blockchain Preeti Sharma and V. K. Srivastava

Abstract Data is the major unit of the Internet of things (IoT) and is what an IoT sharing system works with. It is necessary to develop trust between the sensor owner and the sensor user. This research paper aims to provide a solution to the following pressing problems by providing "dual" re-encryption methods: (i) scalability, (ii) trust, and (iii) digital payments. The method saves the data in a distributed cloud once the encryption process is finished. A bridge has been developed in this work between the sensor owner and the sensor user without any third-party intervention. Keywords IOT · Blockchain

1 Introduction
"IOT is a network of interconnected computing devices which have the ability to transmit data over a network without any need of human interaction" [1]. It is all about connecting objects to the Internet, making the devices smart by enhancing their ability to transmit and receive data to and from the Internet [2]. In IoT, daily life objects will be equipped with sensors and devices for communication. The data transmitted by these devices can then be used for decision making [3]. The analytics and real-time insights gathered from IoT devices can be used to derive various business logics. IoT can bring transformation by automating processes which are not feasible to be done manually. An object that can be represented digitally gains great importance, and this can only be achieved by the use of IoT [4]. IoT focuses on the elimination of human intervention and on building systems that make various processes autonomous [5].


Blockchain is revolutionizing the digital industry and spans multiple sectors including financial services, the insurance sector, health care, IT, and cybersecurity. Blockchain is a distributed ledger where each member on the network maintains a local copy of the data, which is encrypted using the public key infrastructure (PKI) security method [6]. In this way, it can establish trust between two different parties without any need of a third-party trust service provider [7]. Blockchain allows the verification of transactions by providing a distributed, secured, and immutable ledger. In blockchain, decisions or business logic outputs can be managed and handled without a third party by the use of smart contracts, which are lines of code stored in the blockchain network. Business logics or contracts which are mutually agreed upon by all the members initially can be transformed into smart contracts that govern the future transactions among the participating members [8].
Workflow of Blockchain. As shown in Fig. 1, we have a chain of different nodes, and all the nodes hold the same blockchain containing the same blocks. As soon as a transaction occurs, i.e., A sends money to B, a new validated block is added to the blockchain and it gets updated at all the nodes; as the chain is immutable, there is no way to go back. As soon as any transaction takes place, the block is added to the chain and it is updated at all the nodes [9].
Interaction of Blockchain and IoT. For further understanding of the interaction between IoT and blockchain, we can look at one example through the architecture in Fig. 2. In this architecture, the IoT device has inputs and outputs and, based on the data, stores it in a blockchain IoT wallet, which has an adaptation layer and an adapter for further processing. On the other side, we have a blockchain client which is connected with various IoT plugins

Fig. 1 Workflow of blockchain
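To make the block-linking idea above concrete, the following is a minimal, illustrative sketch of a hash-chained ledger in Python; the function and field names are assumptions for the example and do not correspond to any specific blockchain platform mentioned in the paper.

```python
import hashlib
import json
import time

def block_hash(block):
    """Hash the block contents so any later change breaks the chain."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def add_block(chain, transaction):
    """Append a validated block that points to the hash of the previous block."""
    previous = chain[-1]
    block = {
        "index": previous["index"] + 1,
        "timestamp": time.time(),
        "transaction": transaction,
        "prev_hash": block_hash(previous),
    }
    chain.append(block)
    return block

genesis = {"index": 0, "timestamp": time.time(), "transaction": None, "prev_hash": "0"}
chain = [genesis]
add_block(chain, {"from": "A", "to": "B", "amount": 10})

# Immutability check: recompute the link and compare
print(chain[1]["prev_hash"] == block_hash(chain[0]))   # True unless block 0 is altered
```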


Fig. 2 Interaction between IoT and blockchain

and network gateways. Above all, we have the IoT network that connects all the parts and transfers data among them [10] (Fig. 2).

1.1 Difference Between Various Approaches of IoT and Blockchain
1. IoT–IoT: In this approach, all transactions and communications take place between IoT devices, and it can also work offline.
2. IoT–Blockchain: Since the blockchain is immutable, in this approach all interactions are done through the blockchain.
3. Hybrid Approach: In this approach, only selected transactions are routed through the blockchain (Fig. 3).
Use Cases
Traditional System: In the existing banking system, suppose one person wants to transfer money from his account to another person's account; then there is a need for a third party, i.e., a bank. So the person needs to trust this third party and has to share critical information in order to use its payment infrastructure. In this scenario, banks are the trusted mediators between the two parties [11].
Blockchain-enabled System: For the above scenario, let us suppose we have a blockchain-enabled environment and we want to transfer money from one account to another; then the role of the third party for any transaction-related activity is not required,


Fig. 3 Difference between different approaches of IoT and blockchain

as the data is maintained with every peer, and the integrity and consistency are maintained across all the nodes using the blockchain, implemented with a proper consensus algorithm and peer validations that are responsible for the data integrity in the blockchain. Popular blockchain platforms for creating distributed applications are Ethereum and Hyperledger. A lot of focus is required on the correctness and validation of the contract logic [12].
IoT: The vision of IoT is to transform conventional devices into smart and self-governing devices. The idea of IoT is the formulation of a connected world where things can communicate with each other very efficiently. Various IoT solutions are being deployed in many sectors and are digitizing those industries [13]. Blockchain can play a crucial role in the telematics system of IoT devices and in how the devices communicate with each other. Using blockchain for IoT reduces the cost associated with the involvement of third-party individuals and can also prove beneficial by reducing the risk of fraud and unauthorized transactions [13]. Enhancing the insurance management system in the insurance industry is one of many suitable use cases of the integration of IoT with blockchain. Smart contracts can help insurance companies in efficiently managing the claims and the damages done to a vehicle. Information coming from the vehicles can be transmitted to the insurance companies, which can then be used to derive policies that help in managing the claim process. Whenever a claim request is raised, better pay-out strategies can be derived by using the telematics of the vehicle so that only valid claims are paid. Using a blockchain- and IoT-based framework, a solution can be prepared that can auto-initiate the claim process [14].
Our Contribution: To solve these issues, a system is proposed in which a proxy re-encryption mechanism is applied while, at the same time, the confidentiality and


privacy of the data will be maintained. There are several advantages of using the blockchain mechanism, namely:
1. If a financial transaction is recorded in the blockchain, it is administered by smart contracts.
2. The legal formalities between the two parties will be fulfilled.
3. There is no requirement for manual checking of the payments; everything will be stored on the cloud.
4. Financial transactions are minimized if this system of implementation is used.

2 Related Work
Various works on IoT and its threats can be found, and many studies are also available on its security and privacy [15]. In this body of work, the understanding and identification of threats have been carried out. The usage of blockchain for securing various IoT platforms was discussed by Singh and Ashok [16, 17]. Khan and Salah reviewed various challenges in IoT security and the insecure transfer of data with a high level of security risk [18]. S. Notra and M. Siddiqui explained the security shortcomings exposed by various hacking techniques on IoT devices. Significant security and privacy concerns are opened as IoT tools assemble, sense, and enable the serving of big data [19]. "The concept of proxy re-encryption was initially introduced by Blaze, Bleumer and Strauss in 1998" [20]. A similar but less dynamic scheme was introduced by other authors, which makes IoT inadequate for information shared on the cloud; to resolve this, a pairing-free re-encryption scheme has been introduced for data sharing [21, 22]. Almost all of the previous work by researchers discusses the issues faced during IoT data sharing with utmost confidentiality. Solving all the security threats using device-embedded security is nearly impossible. We propose that a proxy re-encryption scheme and blockchain together can provide a trading platform that secures the transfer of private data to the customer. Benjamin Leiding and Will Vorobev, in the research paper entitled "Enabling the Vehicle Economy Using a Blockchain-Based Value Transaction Layer Protocol for Vehicular Ad-Hoc Networks," tried to fill the gap in the state of the art by introducing a blockchain-based transaction layer library for (semi-)autonomous vehicles that enables the upcoming V2X economy, and presented the advantages of the system, outlined the requirements and goals, as well as the architecture of the Chorus V2X platform and ecosystem. They did not focus on the long-term vision of Chorus Technology and the development of the abstract transaction and interaction layer as well as the API and library integration [23]. António Brandão, Henrique São Mamede, and Ramiro Gonçalves, in the research paper entitled "Systematic Review of the Literature, Research on Blockchain Technology as Support to the Trust Model Proposed Applied to Smart Places," focused on the areas bitcoin (about 40%), IoT (about 30%), financial (about 15%), cryptocurrencies, electronic government (about 12%), smart contracts, smart cities, business (with about 10% each),


and health (about 5%). The authors did not consolidate the concept of smart places, review generic data models for smart cities, or adapt a model that describes natural ecosystems as data ecosystems in IoT scenarios, with information aggregation layers, allowing an architecture supported by blockchain technology [24]. Zhi Li, Hanyang Guo, Wai Ming Wang, Yijiang Guan, Ali Vatankhah Barenji, George Q. Huang, Kevin S. McFall, and Xin Chen, in the research paper entitled "A Blockchain and AutoML Approach for Open and Automated Customer Service," proposed an open and automated customer service platform based on the Internet of things (IoT), blockchain, and automated machine learning (AutoML). The data is gathered with the use of IoT devices during customer service. An open but secured environment to achieve data trading is ensured by using blockchain. AutoML is adopted to automate the data analysis processes and reduce the reliance on costly experts. The proposed platform is analyzed through a use case evaluation. A prototype system has also been developed and evaluated, and the simulation results show that the platform is scalable and efficient. The authors did not extend the system to help companies collaborate with each other to provide collaborative customer services. The proposed system was not investigated with homomorphic encryption. Their system did not increase transaction throughput by increasing the block size limit [25] or by using a different consensus method. They did not pay attention to the balance between security and efficiency [26]. Michael Mylrea, in the research paper entitled "AI Enabled Blockchain Smart Contracts: Cyber Resilient Energy Infrastructure and IoT," designed a high-performance blockchain platform, using technologies such as a distributed network architecture, intelligent device node mapping, as well as a PBFT-DPOC consensus algorithm, to realize the decentralized autonomy of intelligent devices [27]. Mohamed Rahouti, Kaiqi Xiong, and Nasir Ghani, in the research paper entitled "Bitcoin threats and machine learning security solutions," explored the key security concerns. The authors presented a global overview of the major components of the Bitcoin protocol, then touched on the major threats and weaknesses of the Bitcoin system, discussed the existing security studies and solutions, and summarized open research challenges and trends of future research for Bitcoin security. The authors did not discuss the Bitcoin infrastructure, which is a point of vulnerability and exploitation for cyber threats. Gihan J. Mendis, Yifu Wu, Jin Wei, Moein Sabounchi, and Rigoberto Roche, in the research paper entitled "Blockchain as a Service: A Decentralized and Secure Computing Paradigm," designed a paradigm by exploring blockchain, decentralized learning, homomorphic encryption, and software-defined networking (SDN) techniques. The performance of the proposed paradigm is evaluated via different scenarios in the simulation section. This paradigm is not effective for processing private and/or scattered data in suitable decentralized ways for machine learning [28]. Ahsan Manzoor, Madhsanka Liyanage, An Braeken, Salil S. Kanhere, and Mika Ylianttila, in the research paper entitled "Blockchain based Proxy Re-Encryption Scheme for Secure IoT Data Sharing," presented a blockchain-based proxy re-encryption scheme. The system stored the IoT data in a distributed cloud after encryption.
To share the collected IoT data, the system established runtime dynamic smart contracts between the sensor and the data user without the involvement


of a trusted third party. The scheme used an efficient proxy re-encryption method which ensures that the data is visible only to the owner and the person named in the smart contract. The proposed system is implemented on an Ethereum-based testbed to analyze its performance and security properties. The authors did not implement it on a different blockchain platform, e.g., Hyperledger. They did not extend the architecture by adding distributed cloud storage to make the system more scalable [29]. VanCam Nguyen, Hoai-Luan Pham, Thi-Hong Tran, Huu-Thuan Huynh, and Yasuhiko Nakashima, in the research paper entitled "Digitizing Invoice and Managing VAT Payment Using Blockchain Smart Contract," combined a decentralized storage network (DSN) with smart contracts (SC) and proposed a new model based on blockchain technology to authenticate transactions, calculate value-added tax, and approve VAT payments. The proposed system is not able to completely remove the risk of data loss attacks, and, as analyzed, it improves the trust in implementing VAT payment (independence from the third party) only to some extent [30]. Nelson Bore, Andrew Kinai, Juliet Mutahi, David Kaguma, Fred Otieno, Sekou L. Remy, and Komminist Weldemariam, in the research paper entitled "On Using Blockchain Based Workflows," discussed an approach to automate the process of creating, updating, and using workflows for blockchain-based solutions. In particular, they presented a workflow definition schema using existing templates and showed how the workflow definition is used to automate the generation of graphical user interfaces, as well as the possibility of generating the associated blockchain smart contracts in the future. They did not evaluate properties such as the cognitive load, time on task, accuracy, and interoperability when using autogenerated resources [31].

3 Proposed Framework
• This framework considers four entities: blockchain, data requester, cloud provider, and IoT sensors.
• Through a smart contract, the sensor's owner activates the sensors, followed by their registration on the blockchain.
• The sensor's owner then provides the sensor data in an encrypted form.
• The user then uses a smart contract to request access to a sensor.
• A smart contract is then generated upon the agreement of the sensor's owner and the requester, and the same is stored in a block of the chain.
• The customer afterwards interacts with the blockchain to obtain the public key and to administer all the finance-related transactions.
• The software sorts the information as stated by the customer after the blockchain notifies the available storage on the cloud.
• The sensor's owner updates the re-encryption cryptographic key on the smart contract as soon as the requester's request is accepted.
• The cloud server manages the decryption and re-encryption of the sorted data before the data is stored at a temporary location on the cloud server.


Fig. 4 Proposed framework

• The blockchain notifies the requester about the temporary location when the data is ready.
• The requester can then decrypt the data using the cryptographic key (Fig. 4).

4 Security Aspects
Now, a unique technique can be presented, which is a verification-based technique. This technique provides an algorithm with seven stages, as given below:
Stage-I: Installation files.
Stage-II: Producing the user key.
Stage-III: The process of encryption.
Stage-IV: Producing the re-encryption key.
Stage-V: The process of revised encryption.
Stage-VI: Process of changing information into a readable and understandable format (DCX).
Stage-VII: Process of changing information into a readable and understandable format (DCY).
Some of the important stages are described below:
Stage-I: Encrypt(params, M, idA, dA, T0): The metadata is generated for the message M, i.e., meta = (idA || T0). Next, the following computations are made:
r = H2(dA || meta), R = rP
CA = M ⊕ H3(meta || rPA)
hA = H4(CA || meta)
sA = r − hA·dA
The output C of this algorithm equals C = (CA, meta, hA, sA).


Stage-II: ReKey(params, dA, idB, CertB, CA, meta): First, r = H2(dA || meta) is derived from C. Then, the public key of idB is computed as PB = H1(CertB || idB)·CertB + P0, and the re-encryption key is
rkAB = H3(meta || rPA) ⊕ H3(meta || rPB)
The output is the key rkAB.
Stage-III: ReEncrypt(params, CA, rkAB): The re-encryption phase changes the ciphertext CA to CB by CB = rkAB ⊕ CA. Note that CB also corresponds to M ⊕ H3(meta || rPB), which will be used in the decrypting phase of the delegate. The output C′ is now the tuple containing CB, meta, idB, hA, sA.
Stage-IV: Decrypt1(params, C, dA): Here the delegator wants to decrypt the ciphertext to derive the original message and to check its authenticity. Therefore, the following computations are required:
r = H2(dA || meta)
M = CA ⊕ H3(meta || rPA)
hA = H4(CA || meta)
Check: sA = r − hA·dA
Stage-V: Decrypt2(params, C′, dB): In this phase, the delegate B derives the message M from C′ by the following operations:
R = sA·P + hA·PA
M = CB ⊕ H3(meta || dB·R)
Check: hA = H4(CA || meta)
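The XOR-masking core of these stages can be illustrated with the short sketch below, which assumes the shared points rPA and rPB are already available as byte strings and ignores the elliptic-curve and signature parts of the scheme; the function names and sample values are illustrative assumptions only.

```python
import hashlib

def h3(meta: bytes, point: bytes) -> bytes:
    """Hash mask corresponding to H3(meta || point); SHA-256 gives a 32-byte mask."""
    return hashlib.sha256(meta + point).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Owner A encrypts a 32-byte message block for itself: CA = M xor H3(meta || rPA)
meta, r_pa, r_pb = b"idA||T0", b"shared-point-A", b"shared-point-B"
message = b"sensor-reading-0042-pad-to-32-b."          # 32 bytes, matching the mask size
ca = xor(message, h3(meta, r_pa))

# Re-encryption key and proxy step: rkAB = H3(meta||rPA) xor H3(meta||rPB); CB = rkAB xor CA
rk_ab = xor(h3(meta, r_pa), h3(meta, r_pb))
cb = xor(rk_ab, ca)

# Delegate B removes its own mask and recovers M without ever seeing A's mask alone
print(xor(cb, h3(meta, r_pb)) == message)              # True
```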

5 Implementation
The architecture shows the proposed system having three Internet of things sensing devices, three computers for mining, five Ethereum full nodes, two regular users, and a server (Fig. 5).

304

P. Sharma and V. K. Srivastava

Fig. 5 Implementation

On blockchain, a new smart contract is invented conditions upon the preferences set by the practioner at the runtime. To get the information about sensor, this program uses JSON–ROC from the blockchain. The server then allows the program to get the application and to go through all the tests with checks on integrity and signature. The requested data is been decrypted after that. RST and Google firebase are important components of the cloud storage server. The Google Firebase then stores the data about data in the format given by JSON. The sensitive data is been encrypted also the authentication is done on RSP. Proxy re-encryption is also done by the cloud, and smart contract variable is updated for data address sharing.

6 Conclusion/Future Work Here, blockchain and the proxy re-encryption scheme together are proposing a trading system to guarantee a safe transfer of the sensitive data. An implementation model on a private Ethereum testbed is also authenticated. In the forthcoming system by using raspberry and off the self-laptops, the practicality of the system is also shown. In future, the proposed system can be extended by using alternate platform of blockchain of r3 Corda, Iota, etc. Scalability can be improved in future researches.

Trusted Sharing of IOT Data Using an Efficient Re-encryption …

305

Also, it is expected that IoT will be more advanced with the help of blockchain technology. Collaboration with these two can be used in various fields of the government. These could provide speedy communication between people, industries, and government sector. In mining also, blockchain consensus could have key role in inclusion of IoT, and also in collaboration with these two, we can have improved security using blockchain technology.

References 1. Rahman A (2019) Blockchain and IoT-based cognitive edge framework for sharing economy services in a smart city. IEEE (2019) 2. Outchakoucht A (2017) Dynamic access control policy based on Blockchain and machine learning for the Internet of Things. Int J Adv Comput Sci Appl 8(7):417–424 3. Reyna A (2018) On blockchain and its integration with IoT: challenges and opportunities. Future Gener Comput Syst 88:173–190 4. Kim SKY (2017) Proof of concept of home IoT connected vehicles. MDPI (2017) 5. Koien GM (2014) Security and privacy in the Internet of Things: current status and Open Issue. IEEE 2014 6. Brandão A, Mamede HS (2018) Systematic review of the literature, research on blockchain technology as support to the trust model proposed applied to smart places 7. Blockchain as a service: a decentralized and secure computing paradigm. arXiv, 2019 8. Yu S (2018) A high performance Blockchain platform for intelligent devices. IEEE 9. Abhinav C, Balaji R, Blockchain—a potential game-changer for Insurance claim. ITC Infotech 10. Sina NR, Eryk S, Cepilov Fabio M, Aydinli K, Timo S, Thomas B, Burkhard S, Adaptation of proof of stake based Blockchains for IoT data streams 11. Zile K (2018) Blockchain use cases and their feasibility 12. Link. https://nptel.ac.in/ 13. Oham C (2018) B-FICA: BlockChain based framework for auto-insurance claim and adjudication. arXiv, 2018 14. Patel DR (2014) A survey on Internet of Things: security and privacy issues. IJCE 15. Singh M (2018) Blockchain: a game changer for securing IoT data. IEEE 16. Ashok (2016) Building on secure foundation for the Internet of Things 17. Salah K (2018) IoT security: review, Blockchain solutions, and open challenges. Future Gener Comput Syst 18. Notra RS, Siddiqi M (2014) An experimental study of security and privacy risks with emerging household appliances. IEEE 19. Zhang ZK (2014) IoT security: ongoing challenges and research opportunities. IEEE 20. Blaze M (1998) Divertable protocols and atomic proxy cryptography. Springer 21. Chu K (2009) Conditional proxy broadcast re-encryption. Springer (2009) 22. Sun M (2018) A proxy broadcast re-encryption for cloud data sharing. Multimed Tools Appl 23. Vo W, Obev R (2018) Enabling the vehicle economy using a Blockchain-based value transaction layer pro protocol for vehicular ad-hoc networks 24. Brandao G (2018) Systematic review of the literature, research on Blockchain technology as support to the trust model proposed applied to smart places 25. Ana R, Cristian, M, Jaime C, Enrique S, Manuel S (2018) On Blockchain and its integration with IoT, challenges and opportunities. Future Gener Comput Syst 88:173–190 26. Prbha C (2017) Automatic vehicle accident detective and messaging system using GSM and GPS modem Blockchain in insurance

306

P. Sharma and V. K. Srivastava

27. Li Z, Guo H, Wang WM, Guan Y, Barenji AV, Huang GQ, McFall KS, Chen X (2019) A Blockchain and AutoML approach for open and automated customer service. IEEE Trans Ind Inf 15(6):3642–3651 28. Mendis RJ (2019) Blockchain as a service: a decentralized and secure computing paradigm 29. Manzoor A, Blockchain based proxy re-encryption scheme for secure IoT data sharing 30. Digitizing invoice and managing VAT payment using Blockchain smart contract 31. Bore A, On using Blockchain based workflows 32. Go implementation of the ethereum protocol. www.github.com/ethereum/go-ethereum,official

Clustering of Quantitative Survey Data: A Subsystem of EDM Framework Roopam Sadh and Rajeev Kumar

Abstract Surveys are utilized frequently in several key domains such as policy making and higher education. Clustering is a popular and useful analytical tool which is used while analyzing quantitative survey data for various purposes. One of such purposes is the identification of divergent preferences and requirements of academic stakeholders. However, results of clustering are directly dependent on the properties of the dataset. Hence, this study attempts to explore distinct properties of quantitative survey data that significantly influence the clustering results. This study utilizes educational data mining (EDM) framework to explore such distinct properties by analyzing the results of K-means clustering applied over a real academic survey dataset. Analysis of clustering results suggests that quantitative survey data has several distinct properties which makes it unfit for most of the available clustering techniques. Hence, survey data requires a dedicated clustering method that takes care of these distinct properties. For showing the impacts of these properties over clustering results, an example data division method is also introduced in this study and its results are also compared with the results of K-means. Keywords EDM framework · Survey data · Cluster analysis · Higher education · Stakeholder theory

1 Introduction Surveys are heavily used in educational domain for various purposes. Exploration and measurement of academic quality indicators is one of such purposes where surveys are utilized [1]. For an instance, survey is used to measure the reputational standing of higher educational institutions (HEIs), which is one of the important R. Sadh (B) · R. Kumar School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi 110067, India e-mail: [email protected] R. Kumar e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_23

307

308

R. Sadh and R. Kumar

quality parameters in most of the institutional rankings [2]. Moreover, most of the studies regarding academic service quality have utilized surveys for collecting data [3–5]. Due to global acceptability and importance of surveys, it becomes necessary to understand the properties of survey data that directly influences the results of analyses significantly. Cluster analysis is utilized to detect the groups of similar observations inside the dataset [6]. It provides an easy way to identify most dominant features of the data [7]. Hence, clustering of quantitative survey data is an effective tool to explore the general perceptions of different survey respondents [8]. However, the reliability of clustering results depends directly on type of data and its properties [9]. Due to this reason, it becomes a prerequisite to identify the properties of data before applying clustering methods. Therefore, this paper attempts to explore various properties of quantitative survey data that can influence clustering results. For achieving this purpose, this study utilizes educational data mining (EDM) framework [10]. The field of EDM is dedicated to analyze the data generated in educational settings by using appropriate analytical tools and techniques as well as various essential knowledge sources [11]. It provides useful insight of educational data to assist objective decision making. In this paper, K-means clustering is applied over a real survey dataset that contains the responses of several academic stakeholders regarding the quality parameters of HEIs. Clustering is done in order to explore the differences in the perceptions of survey respondents. Thereafter, clustering results are analyzed through EDM framework for their appropriateness and meaningfulness. Analysis of clustering results conducted through EDM framework suggests that quantitative survey data has some distinct properties which directly affects the clustering results. Therefore, these properties make quantitative survey data inappropriate for most of the existing clustering algorithms. Due to these properties, the results of K-means in this study are not interpretable and do not follow well-established stakeholder theory, which is popular and heavily used in higher educational domain [12]. For clarifying the impact of these properties of quantitative survey data over the performance of clustering, an example data division method is also introduced in this paper and its results are compared with K-means method. It is found that the performance of data division method introduced in this paper is more suitable for quantitative survey data than K-means. This suggests that quantitative survey data requires a dedicated clustering method which takes care of distinct properties of it and gives more meaningful results. The next section gives a brief introduction of EDM framework and its applications in educational domain. The description of dataset used in this paper is given in Sect. 3. Cluster analysis of used dataset is described in Sect. 4. Section 5 discusses about distinct properties of quantitative survey data. Section 6 concludes our findings.

2 EDM Framework and Its Applications EDM framework was built in order to assist data-driven decision making in higher education domain. It provides a unified solution to serve several academic objectives

Clustering of Quantitative Survey Data …

309

such as analysis of educational data, institutional quality management, validating theories, assessments, and supporting the development of appropriate analytical tools [10]. EDM framework selects appropriate analytical method according to the task in hand, analyzes the data, and provides the results in the formats desired by user. For doing so, it utilizes several knowledge sources that include publicly available data sources, theories, and expert knowledge. It consists of four main components that are (i) data stores, (ii) knowledge sources, (iii) data mining subsystem, and (iv) user interface. Overall architecture of EDM framework is given in Fig. 1. Data stores are online data repositories maintained by the institutions that are updated each time an academic activity takes place in the institution. Knowledge sources contain several publicly accessible information bases such as bibliometrics, scientometrics, theoretical knowledge, and expert knowledge. Data collected from data stores and knowledge sources is then cleaned and unified using Extraction, Transformation, and Load (ETL) functions of data mining subsystem. Data analysis finally takes place that selects and applies appropriate analytical approach over filtered data according to the task in hand. User interface (UI) component is responsible for handling user query and generating results in the formats specified by the user. This study uses EDM framework for analyzing the results of clustering of quantitative survey data for verifying the suitability of the existing clustering methods over survey data. Analysis is done in order to explore the factors that influence the performance of clustering of quantitative survey data.

Fig. 1 EDM framework (architecture diagram: a user query enters through the user interface; institutional data stores and knowledge sources (bibliometrics/scientometrics, accreditation, standard curriculum, expert knowledge) are unified through ETL (Extraction, Transform, Load); the data mining subsystem applies statistical tools, data mining, and machine learning; a system watchdog validates integrity, consistency, and completeness; outputs include reports and ranking analysis)


3 Used Dataset

The dataset was collected during a study intended to explore the importance of different quality parameters of HEIs. Eleven quality parameters were identified in that study; they are listed along with their statistics in Table 1. Six of these eleven parameters were derived from scrutiny of the five most popular national and international institutional rankings [13–17]. The remaining five parameters were identified by conducting focus groups and personal interviews with students, faculty, parents, administrators, and professionals. After the parameters were fixed, the perception of different academic stakeholders regarding the importance of each parameter was explored through an extensive online survey. A Likert scale with four marking levels was used to denote the importance of each parameter. Survey data was collected from the National Capital Region (NCR) of India because representative premier institutions of the country are available there. Respondents from twelve premier institutions in the sciences, social sciences, medicine, technology, and humanities participated in the survey. Seven respondent categories were identified: undergraduate, postgraduate, researcher, faculty, parent, administrator, and professional. The population of respondents in each category except faculty and administrator was assumed infinite; the population of faculty in the chosen institutions was 5727, whereas no official data was found regarding the population of administrators [16]. Random sampling with a 5% error margin and a 95% confidence level was considered for descriptive analysis. Accordingly, 2620 responses were finally retained after filtering out invalid and incomplete responses: 438 undergraduates, 463 postgraduates, 447 researchers, 389 professionals, 395 parents, 401 faculty, and 87 administrators. Since the dataset is balanced and satisfies the minimum sampling requirements [18], it is used for the cluster analysis conducted in this study.

Table 1 Mean and standard deviation of explored quality parameters of HEIs

Quality parameters                       Mean   Standard deviation
Teaching                                 3.2    0.74
Graduate outcomes                        3.0    0.89
Academic flexibility                     3.0    0.76
Transparency and accountability          3.0    0.77
Infrastructure and resources             3.0    0.73
Research                                 3.0    0.87
Student support services                 2.9    0.78
International outlook                    2.8    0.81
Fee structure and financial assistance   2.7    0.96
Academic autonomy                        2.5    0.92
Inclusivity                              2.5    0.84


However, the administrator category is excluded from the analysis because no official figure for its population is available for the selected institutions and the number of administrator responses obtained in the survey is considerably smaller than that of the other categories.

4 Clustering of Dataset and Analysis of Results

First, the K-means clustering algorithm is applied to the dataset. The clustering results are then analyzed with respect to the suitability of the K-means method for quantitative survey data, and the limitations of K-means and other clustering algorithms are explained by specifying the distinct properties of such data. To elaborate the impact of these properties, this study introduces a simple data division method for the survey dataset. Finally, the clustering results are analyzed through the EDM framework.

4.1 Clustering of Survey Dataset

A total of 2533 responses from six respondent categories are considered for cluster analysis in this paper: undergraduate, postgraduate, researcher, faculty, parent, and professional. For ease of representation, these categories are abbreviated as UG (undergraduate), PG (postgraduate), Res (researcher), Fac (faculty), Par (parent), and Pro (professional). The K-means clustering algorithm is applied with K = 6, since the dataset contains responses from six respondent categories, and the resulting clusters are named K1 to K6. The results of K-means clustering are given in Table 2 and are represented as pie charts in Fig. 2, where each pie chart corresponds to a cluster and shows the percentage population of each respondent category within it.

Table 2 Results of K-means clustering over used survey dataset (taking K = 6)

Cluster   UG    PG    Res   Fac   Par   Pro   Total
K1        20    17    41    227   26    34    365
K2        59    135   10    20    25    175   424
K3        238   45    17    8     99    20    427
K4        58    92    115   54    142   21    481
K5        39    80    120   36    5     76    356
K6        25    94    144   56    98    63    480
Total     438   463   447   401   395   389   2533
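For illustration, this clustering step can be sketched as follows with scikit-learn; the random responses and category labels below are placeholders for the actual survey data, so the counts produced will not match Table 2.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.integers(1, 5, size=(2533, 11))      # stand-in for the 2533 x 11 Likert responses (levels 1-4)
cats = rng.choice(["UG", "PG", "Res", "Fac", "Par", "Pro"], size=2533)  # respondent category labels

cluster = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(X)

# Cross-tabulate cluster assignment against respondent category (the structure of Table 2).
for k in range(6):
    members = cats[cluster == k]
    row = {c: int((members == c).sum()) for c in ("UG", "PG", "Res", "Fac", "Par", "Pro")}
    print(f"K{k + 1}", row, "total", len(members))
```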


Fig. 2 Percentage population of different respondent categories in each K-means cluster (six pie charts, one per cluster K1-K6; the only decisive majorities are faculty at 62% in K1 and undergraduates at 56% in K3)

4.2 Analysis of Clustering Results

The clustering results are given to the EDM framework as input for micro-analysis and are examined with respect to the suitability of the K-means algorithm for quantitative survey data. The framework uses various knowledge sources for this purpose; in this case, it uses stakeholder theory as the benchmark (knowledge source), since that theory is well established in the education domain. Stakeholder theory suggests that an organization should consider the requirements of its stakeholders when taking strategic decisions [19]. Stakeholder categories are generally defined on the basis of their divergent behaviors, i.e., on the basis of their roles or their influence over the organization [20]; hence, the opinions and requirements of different stakeholders are generally divergent. The literature on academic quality likewise suggests that the perceptions of different academic stakeholders are quite divergent, and the quality parameters of HEIs are defined on that basis [21]. Since stakeholder theory is well established in the higher education domain and the variables in the survey dataset are defined on behalf of the opinions of different academic stakeholders, cluster analysis of educational survey data should also confirm the academic stakeholder categorization. In other words, if the clustering results are reliable and the clustering algorithm performs satisfactorily, the dataset should be divided in such a way that each cluster represents a stakeholder category on the basis of the population of stakeholders in it. The clustering results of academic survey data can therefore be called meaningful if each cluster contains a fair majority of one particular academic stakeholder category.

It can be seen in Fig. 2 that K-means clusters K1 and K3 contain majorities of the faculty (62%) and undergraduate (56%) populations, respectively, so these clusters can be taken to represent the behavior most characteristic of the faculty and undergraduate categories. None of the other clusters can be associated with a particular stakeholder category, as the population distribution in these clusters (Fig. 2) is not decisive: clusters K2, K4, K5, and K6 either contain approximately equal proportions of different respondent categories or a single cluster contains the majority of more than two different stakeholder categories. This contradicts the applicability of stakeholder theory in the academic domain, according to which each cluster should represent a specific stakeholder category, and it indicates that the K-means method does not produce satisfactorily reliable results for quantitative survey data. It can be deduced from Table 2 and Fig. 2 that either a cluster contains the majority of the population from multiple respondent categories, or the responses of one category are distributed almost equally among multiple clusters. For example, Table 2 shows that cluster K2 contains the majority population of both the postgraduate and professional categories, so this cluster cannot be linked with either of them. Similarly, Fig. 2 shows that the researcher category is distributed among clusters K4, K5, and K6 in almost equal proportions (approximately 24%, 30%, and 34%, respectively), so no fraction of the researcher population is decisive enough to link any of these clusters with the researcher category. For these reasons, none of the clusters K2, K4, K5, and K6 can be identified with a specific stakeholder category.

The reasons behind this kind of distribution of respondents in the K-means clusters are quite clear. The most dominant reason for such a non-decisive distribution is the use of an aggregate statistic (the mean) by the K-means method. Aggregate statistics tend to suppress frequent differences if those differences are not sufficiently large. Since survey data uses a Likert scale with small values in a fixed range, the differences between variable values are not very large [22], and these small differences are reduced to nominal values when the dataset is averaged. As a result, the placement of mean centroids in the survey data space is almost arbitrary and is not capable of distinguishing frequent dissimilarities in the survey dataset. This can be understood through a simple rule of thumb: if two variables have small values, their variability does not affect the mean significantly, so two variables that differ considerably in variability can still have the same mean. The second significant reason behind the unreliable results of K-means on survey data is that it treats each survey response as a series of separate variable values rather than as a single pattern. Because of this, the magnitude of the differences between variable values significantly affects the clustering results.
For example, if responses are similar in pattern (the horizontal variation across the series of variable values) for the majority of variables but differ with large magnitude in a few of them, these few large differences dominate the clustering results. The reason is that K-means and other existing clustering algorithms divide data based on value-based similarity, for which distance measures are readily available [23]. Survey data, however, requires similarity to be measured in terms of patterns (the number of differences) rather than the magnitude of differences. Pattern-based similarity measures are useful for survey data for several reasons. Patterns are significant in surveys because they reflect the behaviors or perceptions of the respondents, and survey data contains very small values which are simply indicators of behavioral differences regarding various variables [22]. The objective of a survey is to seek differences in the choices of the respondents, not to measure the difference between values; hence, responses should be treated as patterns, and dissimilarity should be measured as the number of differences across the whole pattern rather than as differences in magnitude. For these reasons, several studies conducted in the past have suggested that aggregate measures such as the mean are not appropriate for survey data and that pattern-based measures should be defined and developed for proper analysis of survey data [24–26]. One more significant property of survey data is that it carries associated information in the form of respondent category labels. Surveys generally seek relationships between survey items corresponding to the defined respondent categories, and these relationships define the general behavior or perceptions of the respondents. The analysis method should therefore use these respondent labels when generating results: the category labels can be used to define clustering parameters and so obtain better results, for example by guiding the placement of centroids so that they reflect the true preferences of respondents and protect dissimilarities against suppression. To depict the influence of the above-mentioned properties of quantitative survey data, an example data division method is introduced in this paper. This example method is introduced simply to elaborate the concept and to open a path for future work on a dedicated clustering method for survey data. It utilizes respondent category labels to guide the mean-based centroids to positions that best describe the distinct behaviors of respondents and protect the differences in behavior against the suppression caused by the use of small data values (a Likert scale with few levels).
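A small worked example of this suppression effect, using illustrative marking values only: two Likert patterns can share the same mean, so a mean-based centroid treats them as close, even though they disagree on almost every item.

```python
import numpy as np

a = np.array([3, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3])   # respondent A's marks over 11 parameters
b = np.array([2, 3, 2, 3, 2, 3, 2, 3, 2, 3, 3])   # respondent B's marks over 11 parameters

print(a.mean(), b.mean())                          # identical means (~2.545): the aggregate hides the disagreement
print(round(float(np.linalg.norm(a - b)), 2))      # modest value-based (Euclidean) distance (~3.16)
print(int((a != b).sum()))                         # yet the two patterns differ on 10 of the 11 items
```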

4.3 An Example Data Division Method for Survey Data

The preferences of different respondent categories are significantly different, and aggregating their preferences produces results that suppress these differences. Category-wise aggregation, on the other hand, produces specific results and preserves the features of respondent behaviors. The mean values of the variables in the used dataset are given category-wise in Table 3, which shows significant variation in the choice patterns of the respondent categories. If these mean value patterns (given in Table 3) are treated as mean centroids for dividing the dataset, the results will not suffer from suppression of dissimilarities.


Table 3 Mean values of quality parameters for each separate respondent category

Quality parameters                       UG    PG    Res   Fac   Par   Pro
Teaching                                 3.4   3.3   3     3.5   3.2   3.1
Graduate outcomes                        3.3   3     2.3   2.9   3.4   3.3
Academic flexibility                     3     3.3   3.1   3     2.6   3
Transparency and accountability          2.8   2.9   3.2   3.1   3     2.9
Infrastructure and resources             3.2   3     2.9   2.7   3.1   2.8
Research                                 2     2.7   3.4   3.4   3     3.2
Student support services                 3.2   3.1   2.9   2.4   3.1   2.7
International outlook                    2.4   2.8   3.1   3     3     2.8
Fee structure and financial assistance   3.1   2.7   3.2   2     3.3   2
Academic autonomy                        2     2.3   2.7   3.2   2.6   2.1
Inclusivity                              2.5   2.3   2.8   2.4   2.7   2.1

The patterns of mean marking values given in Table 3 are therefore used as the mean centroids for dividing the survey dataset. First, each mean marking pattern is given a name that also identifies a data subset; these mean patterns are named C1 to C6, as there are six mean value patterns. Each response in the dataset is then matched against these mean patterns, and its Euclidean distance from each pattern is calculated. The response is assigned the name of the mean marking pattern to which it is closest (C1, C2, and so on). In this way, the whole dataset is divided into six subsets. The results of this division are given in Table 4 and are represented through the pie charts of Fig. 3. It can be observed in Fig. 3 that each subset can be identified with one of the respondent categories according to the dominant majority of that category: each subset of data contains a fair dominant majority of a particular respondent category. According to Fig. 3, subset C1 can be linked with undergraduates, as their proportion in C1 is quite high (67%). The proportion of PG students in subset C2 is 45%, so it represents the behavior of postgraduates.

Table 4 Results of division of the dataset taking mean value patterns as mean centroids

Cluster   UG    PG    Res   Fac   Par   Pro   Total
C1        311   79    7     6     30    31    464
C2        35    111   24    20    15    42    247
C3        16    80    272   20    42    13    443
C4        19    27    50    273   22    61    452
C5        45    60    73    38    260   41    517
C6        12    106   21    44    26    201   410
Total     438   463   447   401   395   389   2533
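The division step itself reduces to a nearest-centroid assignment in which the category-wise mean patterns of Table 3 are held fixed. The following is a minimal sketch under that reading; the randomly generated responses stand in for the real dataset, so the resulting subset sizes will not reproduce Table 4.

```python
import numpy as np

def divide(X, cats, order=("UG", "PG", "Res", "Fac", "Par", "Pro")):
    # Guided centroids: the mean marking pattern of each respondent category (rows of Table 3).
    centroids = np.vstack([X[cats == c].mean(axis=0) for c in order])
    # Assign every response to the nearest mean pattern, yielding subsets C1..C6.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

rng = np.random.default_rng(0)
X = rng.integers(1, 5, size=(2533, 11)).astype(float)    # stand-in for the 2533 x 11 Likert responses
cats = rng.choice(["UG", "PG", "Res", "Fac", "Par", "Pro"], size=2533)
subset = divide(X, cats)
print(np.bincount(subset, minlength=6))                  # subset sizes, as in the Total column of Table 4
```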


Fig. 3 Percentage population in each subset created by example data division method (six pie charts, one per subset C1-C6; the dominant categories are undergraduates at 67% in C1, postgraduates at 45% in C2, researchers at 61% in C3, faculty at 60% in C4, parents at 50% in C5, and professionals at 49% in C6)

Subset C3 can be linked with researchers according to their dominant majority (61%), and C4 can be identified with faculty, whose majority (60%) in C4 is significantly large. C5 and C6 can be linked with parents and professionals according to their dominant populations of approximately 50% and 49%, respectively. These results confirm that the preferences of the academic stakeholder categories are quite divergent, and they thereby confirm the principle of stakeholder categorization, which is well established in the education domain. The example data division method introduced in this paper preserves the preference patterns of each respondent category and thus gives results that are more meaningful and interpretable. This confirms that survey data possesses distinct properties which significantly influence the clustering results of survey datasets; quantitative survey data therefore requires a dedicated clustering algorithm that takes proper care of these properties. Moreover, quantitative survey data carries associated information that can be used to obtain better results and to avoid manual presetting of clustering parameters, i.e., the number of clusters and the features of the clusters.


5 Discussion

The EDM framework was designed to provide an integrated, stand-alone solution for addressing various objectives inside educational settings and for fulfilling the needs of different academic stakeholders. These objectives include the management and analysis of educational data, various assessments, academic quality evaluation, data-driven decision making, and assistance in developing data analysis tools and techniques for educational data. Surveys are used quite frequently in the quality assessment of HEIs and their various units, so the analysis of quantitative survey datasets is simply a subsystem of the existing EDM framework. Cluster analysis is an important tool for dividing data according to its internal features; in the context of quantitative survey data, it is an effective tool for identifying the distinguishing behaviors and general perceptions of the respondents. The preferences of stakeholders are important for objective decision making inside institutions and organizations, so proper cluster analysis of educational survey data is an essential prerequisite for gaining valuable insight from the data. However, proper cluster analysis of quantitative survey data requires a thorough understanding of the properties of survey data, since these properties directly influence the performance of the algorithm chosen for the analysis. This paper has explored various distinct properties of quantitative survey data that should be considered in order to achieve meaningful clustering results. First, clustering is performed over the survey dataset used in this study, which contains the responses of several academic stakeholders regarding various quality parameters of HEIs; the K-means clustering algorithm is used for this purpose. Thereafter, the clustering results are analyzed through the EDM framework. It is found that the results of K-means are not aligned with stakeholder theory, which is well established in the educational domain. The analysis of the clustering results through the EDM framework suggests that quantitative survey data has several distinct properties that significantly impact clustering results. These properties include the use of small marking values within a fixed range; applying aggregate statistics over these small values suppresses the differences in respondents' marking behavior. The information associated with survey observations in the form of category labels is also a striking feature of survey datasets, and the relationships between survey variables are generally defined on the basis of these labels. Moreover, the pattern of marking is more important in surveys than the marking values themselves, since these patterns reflect the actual behavior of the respondents. An example data division method is introduced in this paper to explore the impact of these distinct properties of survey data. It utilizes guided mean centroids for dividing the data and hence preserves the marking features (patterns) against suppression; as a result, its results are more appropriate than those of the K-means clustering method. This signifies that quantitative survey data requires a dedicated clustering method that can take care of these distinct properties in order to achieve meaningful clustering results.


Moreover, this also validates studies conducted in the past which suggested that aggregate statistics are not appropriate for survey datasets [24–26]. Hence, pattern-based clustering methods are more suitable than value-based clustering methods for quantitative survey data. The following distinct properties of quantitative survey data are therefore concluded in this study and should be considered with proper care in order to obtain meaningful clustering results:
• Small marking values (Likert scale with few levels).
• Associated side information (category labels).
• Patterns of marking (choice patterns of respondents).

6 Conclusion

Surveys are quite important for objective decision making in the educational domain, and proper analysis of quantitative survey data is a crucial prerequisite that requires a thorough understanding of the properties of such data beforehand. In this study, clustering of a real academic survey dataset is conducted and its results are analyzed through the EDM framework. The analysis suggests that quantitative survey data has several distinct properties which degrade the performance of clustering methods that use aggregate statistics; a dedicated clustering method is therefore required for proper cluster analysis of quantitative survey datasets. An example data division method is also introduced and applied to the used dataset to depict the impact of these distinct properties on clustering results. Its results suggest that there is a need for a dedicated clustering approach that takes care of the distinct properties of survey data and provides reliable results. Designing such a dedicated clustering method for quantitative survey data is the future work of this study.

References
1. Vroeijenstijn AI (2003) Towards a quality model for higher education. J Phil Higher Educ Qual Assur 1(1):78–94
2. Mori M (2016) How do the scores of world university rankings distribute? In: 5th IIAI international congress on advanced applied informatics (IIAI-AAI). IEEE, Kumamoto, pp 482–485
3. Abidin M (2015) Higher education quality: perception differences among internal and external stakeholders. Int Educ Stud 8(12):185–192
4. Ibrahim Y, Arshad R, Salleh D (2017) Stakeholder perceptions of secondary education quality in Sokoto State, Nigeria. Qual Assur Educ 25(2):248–267
5. Vnouckova L, Urbancová H, Smolová H (2017) Factors describing students' perception on education quality standards. J Effi Responsib Educ Sci 10(4):109–115
6. Velden M, D'Enza AI, Yamamoto M (2019) Special feature: dimension reduction and cluster analysis. Behaviormetrika 46(2):239–241
7. Xu D, Tian Y (2015) A comprehensive survey of clustering algorithms. Ann Data Sci 2(2):165–193


8. Tan PN, Steinbach M, Kumar V (2006) Cluster analysis: basic concepts and algorithms. In: Introduction to data mining, vol 8, pp 487–568
9. Estivill-Castro V (2002) Why so many clustering algorithms: a position paper. SIGKDD Explor 4(1):65–75
10. Sadh R, Kumar R (2019) EDM framework for knowledge discovery in educational domain. In: International conference on emerging trends in communication, computing, and electronics (IC3E). Springer, Allahabad, pp 409–417
11. Peña-Ayala A (2014) Educational data mining: a survey and a data mining-based analysis of recent works. Expert Syst Appl 41(4):1432–1462
12. Jongbloed B, Enders J, Salerno C (2008) Higher education and its communities: interconnections, interdependencies and a research agenda. High Educ 56(3):303–324
13. QS World University Rankings (2019) https://www.topuniversities.com/universityrankings, Last accessed 31 Oct 2019
14. Times Higher Education Rankings (2019) https://www.timeshighereducation.com, Last accessed 31 Oct 2019
15. Academic Rankings of World Universities (2019) https://www.shanghairanking.com, Last accessed 31 Oct 2019
16. National Institutional Ranking Framework Homepage (2019) https://www.nirfindia.org/Home, Last accessed 31 Oct 2019
17. Complete University Guide Homepage (2019) https://www.thecompleteuniversityguide.co.uk, Last accessed 31 Oct 2019
18. Bluman AG (2009) Elementary statistics: a step by step approach. McGraw-Hill Higher Education, New York
19. Mitchell RK, Agle BR, Wood DJ (1997) Toward a theory of stakeholder identification and salience: defining the principle of who and what really counts. Acad Manag Rev 22(4):853–886
20. Lyytinen A, Kohtamäki V, Kivistö J, Pekkola E, Hölttä S (2017) Scenarios of quality assurance of stakeholder relationships in Finnish higher education institutions. Qual High Educ 23(1):35–49
21. Burrows A, Harvey L (1992) Defining quality in higher education: the stakeholder approach. In: AETT conference on quality in education. University of York, pp 6–8
22. Lee JW, Jones PS, Mineyama Y, Zhang XE (2002) Cultural differences in responses to a Likert scale. Res Nurs Health 25(4):295–306
23. Kriegel HP, Kröger P, Zimek A (2009) Clustering high-dimensional data: a survey on subspace clustering, pattern-based clustering, and correlation clustering. ACM Trans Knowl Discov Data (TKDD) 3(1):1
24. Manicas PT (2006) A realist philosophy of social science: explanation and understanding. Cambridge University Press
25. Grice JW (2014) Observation oriented modeling: preparing students for research in the 21st century. Compr Psychol 3:05–08
26. Grice JW (2015) From means and variances to persons and patterns. Front Psychol 6:1007

Smell-O-Vision Device P. Nandal

Abstract Olfactory communication and the interest in smell are boosting research in the field of human–computer interaction. In this paper, a smell-o-vision device is fabricated which works on the principle of identifying elements in the video running on the screen. The element identification is done with the help of the Google Video Intelligence API, and according to the component identified, the corresponding odor is produced. The position at which the odor is generated is vital for the experience of the user and can be changed as per the user's requirement. The device works for any type of video. Elements of a similar nature are clubbed under a single category, which is the primary discriminator for smell generation. The smell is spread with the help of a fan, whose activation is the prime source of odor generation. Time synchronization of the smell generation device with the elements of the video is used for the smooth functioning of the device. Keywords Smell · Video intelligence API · Digital scent

1 Introduction

Augmented reality (AR) techniques can be used to manufacture, invent or reconstruct scenes in a virtual environment [1]. In other words, augmented reality is a live, direct or indirect view of a physical, real-world environment whose components are supplemented (or augmented) by computer-generated sensory input such as graphics, video, GPS data or sound [2]. The technology therefore works by enhancing one's present perception of reality. Classified according to the five natural human senses, i.e., feeling, sight, hearing, taste and smell, augmented reality consists of haptic AR, visual AR, audio AR, gustatory AR and olfactory AR, respectively [3]. The applications of AR are countless: entertainment, performance, translation, commerce, education, sightseeing, architecture, computer games, art, emergency services, military, advertising, navigation in real-world environments, information visualization, digital signage and so on [4, 5].


Recent AR technology can be implemented in movie theaters and on the iPhone, iPad and Android [6]. Movies are still perhaps the dominant form of entertainment: they take you into the imaginary world of someone else's mind, which some brilliant directors execute meticulously. A bunch of multimodal displays can be added to a robotized platform to make filmic scenes more realistic and to enhance the experience of the users. The accelerated technological growth of the current time span has permitted the development of commercial solutions which incorporate a multiplicity of multimodal displays in movie theaters [7–9]; such systems are known as 4D or 5D theaters or cinemas. It is also claimed that the present technology alters the cinema experience from "watching the movie to almost living it" [10]. According to some authors, 3D cinema is expanded into 4D cinema by permitting a range of real-time sensory effects, including projected wind and mist blasts, leg and back pulsation, seat movements, lightning, fog, scent perfume discharge, etc., everything harmonized with the sequence of the film [11]. Relevant odors are released to make particular scenes of the movies influential [12]. In another view, the fourth dimension is correlated with the vibration and/or movement of the seat, while the fifth dimension accommodates the remaining sensory effects [13]. Smell-o-vision has always been fantasized about, and Hollywood has been among the first trying to get its hands on this technology for five decades or so, in movies like "AromaRama", "Sensurround" and, more recently, "Spy Kids" and "Batman vs Superman", to name a few. The technology has created a form of euphoria among audiences but has been unable to deliver on their expectations [14]. Computer-controlled smell output is the capability to control when a smell is emitted. There are three ways in which an output device can discharge odor in a controlled manner: airbrush-like systems that emit liquid particles of scent in a stream of air; inkjet-based systems that employ heat to produce small droplets of scent to be sprayed into the air; and a variety of systems that utilize heat to speed up the evaporation of a scented oil or wax, with or without a fan to spread the scent into the room. A device that creates odor vapor is known as an olfactory display; with an olfactory display, we can smell an essence at certain events. Research related to olfactory displays started in 1962 [15]. Each specific smell is generated from a prefabricated combination of fragrant chemicals in solid or liquid form. Up to eight bottles of various odorant mixtures were used by Nakamoto and Yoshikawa [12]. Aroma chips can also be used to encapsulate fragrance in a hydrogel [16]; the odor vapor diffused into the environment is generated by heating the aroma chip [17]. Inkjet devices were used in the olfactory display system described by Saito et al. [18] to achieve accurate control of odor concentrations and to switch among odors; odor vapors were generated by ejecting small droplets of liquid-phase odor samples into an airstream, and the latest variant of this system is mobile and smaller in size [19]. A wearable device such as the scent collar can also be used for odor presentation [20]. In the olfactory display systems described in [12, 21], an odor vapor in the headspace of a selected bottle is dispatched to the user's nose via a tube attached to a headset.


Computer-controlled solenoid valves are used to control the intensity of the odor presented to the user; the odor vapor is diluted with clean air using these valves. A similar hardware setup, which also consists of solenoid valves, is presented by Yamada et al. [22]. Their device is wearable and smaller in size, which permits the user to walk around in an immersive virtual or real environment while being presented with odors. Matsukura et al. [23] propose a new olfactory display system which can produce an odor distribution on a 2D display screen; the presented system studies the position and intensity of the odor generation. This paper is organized as follows. Section 2 presents the prerequisites for the proposed work. Section 3 depicts the implementation and working. Section 4 is devoted to the results and discussion. Section 5 gives the conclusion and future work.

2 Prerequisite of Proposed Work

To fabricate the smell-o-vision device, the hardware required, the software used and the API used are described in this section.

2.1 Hardware Used

The hardware components required are illustrated below.

Raspberry Pi. The Raspberry Pi is a small single-board computer introduced as an inexpensive platform for learning the details of programming. It has no peripherals of its own, yet it can run a few games, a word processor, an image editor like GIMP, and so on. It comprises a Broadcom SoC (System on Chip) with an integrated ARM-compatible CPU and GPU. The processor speed ranges from 700 MHz to 1.4 GHz and the on-board memory ranges from 256 to 512 MB RAM; the boot medium is an SD card. Various models of the Raspberry Pi have been released so far, belonging to three generations, i.e., Pi 1, Pi 2 and Pi 3. The first model, the Pi 1 Model B, was released in 2012, and the latest model, the Pi 3 Model A+, was made public in 2019 (Fig. 1).

Four-Channel Relay. This is an interface board used to control different appliances. Microcontrollers and boards such as the Raspberry Pi, MSP430, Arduino, TTL logic, ARM, DSP, PIC, 8051 and AVR can control it directly via a standard interface. It is equipped with high-current relays and accepts 12 and 5 V input voltage (Fig. 2).

Aer Pocket and Micro-USB Fan. Odor is emitted while the video is running on the screen. For this purpose, the author used six types of aer pockets (a consumer product) as the source of fragrance generation. The micro-USB fan is a compact fan used by connecting it to the USB port; its angle can be adjusted up and down. The odor is diffused with the help of the micro-USB fan (Fig. 3).


Fig. 1 Raspberry Pi
Fig. 2 Four channel relay



Fig. 3 Micro-USB fan

2.2 Software Used

The software used for the fabrication of this device is described below.

Raspbian Operating System. Raspbian is a free operating system built on top of Debian and optimized for the Raspberry Pi hardware. It comes with over 35,000 pre-compiled software packages bundled in a convenient format for trouble-free installation on a Raspberry Pi computer. Several versions of Raspbian are available; its features include the Pi Improved Xwindows Environment Lightweight (PIXEL) desktop.

Node.js. Node.js is an open-source JavaScript runtime environment used for executing JavaScript code outside a browser, typically for server-side scripting. Its module API reduces the intricacy of writing server applications, it supports event-driven programming in JavaScript, and it can be used for controlling sensors, microphones and similar devices.

Python. Python is a high-level, general-purpose, interpreted programming language. It is supported by a huge number of libraries and is quite popular in the Raspberry Pi community. It can be used with peripheral devices, for example with a microphone to sense sound, and for the control of sensors.

2.3 Application Programming Interface (API) Used

The APIs used in the current work are described below.

Google's YouTube API. The YouTube API provides users with access to video statistics and YouTube channel data. Channel data can be accessed via two types of calls, REST and XML-RPC. This API helps in embedding a YouTube video into a Web site to enhance the video-watching experience of the user, in addition to giving control over the Web page experience through utilities for customizing the player's appearance. It also provides data analytics features and can be used on multiple platforms, offering the flexibility needed to operate it efficiently.

Google Cloud Video Intelligence API. Pre-trained machine learning models in the Video Intelligence API automatically recognize a vast number of actions, places and objects in streaming and stored video. It is eminently capable for common use cases and improves over time as fresh concepts are introduced. The Google Cloud Video Intelligence API offers developers features that let them use Google video analysis technology as a constituent of their applications. The REST API gives the user the option to interpret videos saved locally or in Google Cloud Storage, with contextual information at the level of the complete video as well as accurate drill-down analysis per segment, per shot and per frame. The catalog can be searched in the same way as text documents. The API supports common video formats, including .MOV, .MPEG4, .MP4 and .AVI, and provides features such as label detection, shot change detection and regionalization.
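As an illustration of the label detection feature described above, a request through the Python client library can be sketched as follows; the bucket path is a placeholder, and older client versions expose the time offsets as seconds/nanos fields rather than timedelta objects.

```python
from google.cloud import videointelligence

client = videointelligence.VideoIntelligenceServiceClient()
operation = client.annotate_video(
    request={
        "features": [videointelligence.Feature.LABEL_DETECTION],
        "input_uri": "gs://example-bucket/clip.mp4",   # placeholder Cloud Storage path
    }
)
result = operation.result(timeout=300)

# Each segment-level label carries the detected entity plus its start/end offsets,
# which is the information the smell generator is synchronized against.
for label in result.annotation_results[0].segment_label_annotations:
    for seg in label.segments:
        start = seg.segment.start_time_offset.total_seconds()
        end = seg.segment.end_time_offset.total_seconds()
        print(label.entity.description, round(start, 1), round(end, 1), round(seg.confidence, 2))
```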

3 Implementation and Working

The steps used in fabricating the smell-o-vision device are illustrated below:
• Raspbian OS is installed.
• Download and install the Etcher software; it works on all platforms (Linux, Mac and Windows).
• Open Etcher and select the chosen Raspbian image.
• Select the SD card on which you wish to install Raspbian and then proceed to flash the SD card.
• Once it is finished, safely remove the SD card from the computer.
• Insert it into the Raspberry Pi along with any other cords such as power, mouse, keyboard and display.
• The Cloud Video Intelligence API is set up in the Google Cloud Platform Console.
• The environment is set up by creating a service account and application default credentials.
• The Google Cloud client library is installed.
• The video is annotated using label detection.
• Server-side programming is done with Node.js.
• The YouTube video is run inside the Web site using the IFrame player.
A database (a massive library of multimedia elements) backed by the Google Cloud Video Intelligence API was created, defining 20,000 labels. These labels help in identifying the elements and categorizing them accordingly.


The data used should be stored in Google Cloud Storage, which enables faster data retrieval and analysis than data stored on a PC. When any video is played through the YouTube API, a series of requests is sent to the designed server module. This module has two specific functions, namely get data and activate Pi. The get data function passes on the data stored in the current row, while the activate pin function, which takes two variables (the element label and the time interval), activates the corresponding pin. Which pin is activated depends on the label identified, and how long it stays active depends on the time interval variable, which is computed from the start time and end time of the detected segment. In addition to this, a module named analyze is used to create the database; it gathers all the information regarding the identified element labels together with their start and end times.
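The activate-pin behaviour can be sketched as below; the label-to-relay mapping and the GPIO pin numbers are assumptions made for illustration, and the snippet runs only on a Raspberry Pi with the RPi.GPIO package installed.

```python
import time
import RPi.GPIO as GPIO

# Assumed mapping from label category to the relay channel driving its fan (BCM numbering).
SCENT_PINS = {"forest": 17, "food": 27, "ocean": 22}

def activate_pin(label, start_s, end_s):
    pin = SCENT_PINS.get(label)
    if pin is None:
        return                              # no scent mapped to this label category
    GPIO.setmode(GPIO.BCM)
    GPIO.setup(pin, GPIO.OUT)
    GPIO.output(pin, GPIO.HIGH)             # energize the relay channel, switching the fan on
    time.sleep(max(end_s - start_s, 0))     # hold the odor for the detected segment's duration
    GPIO.output(pin, GPIO.LOW)
    GPIO.cleanup(pin)

activate_pin("forest", 12.0, 18.5)
```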

4 Results and Discussion

The proposed device model is quite promising and has a lot of scope in the not-so-far future. However, it does have some disconcerting issues which at present pose quite a hitch. The categories used for classifying the identified elements can be somewhat intermingled at times, producing results which in reality, and from a human perspective, are inaccurate and could lead to the erroneous generation of smell. The smell released by the device for a particular identified element is sometimes not distinguishable from that of another, as one fragrance can overshadow the other; in some cases, the mixture of smells can harm the experience of the customer, disturbing the illusion of the element's presence and hampering the experience of the user. The speed of the propellant device is currently static and needs to be more dynamic to cater to the odor needs of different elements. The placement of the fragrance generator is quite vital for the experience of the user; in fact, both the strength and the position of the fragrance generator are critical for the user experience. However, the flexibility to move the device around is not quite there, which makes it less endearing to potential customers. Finally, the rapidness with which the elements in the video change results in the corresponding smells being released in quick succession, which can lead to a superfluous mixing of smells that impedes the experience of the user. The limitation, therefore, is to precisely control the timing and concentration of the vapor and to rapidly switch between the various odors. This work involves the usage of a fragrance generator and the Raspberry Pi to provide a remarkable interface experience.


5 Conclusion and Future Work

The proposed device represents a paradigm shift in the usage of fragrance generators: previously made fragrance generators were not agile and were built for a particular motion picture, so they could not be extended to every other video. With this device, the perception of the 4D experience would change, making it more cost-effective and scalable. The device proposed here is a prototype but has lasting potential to become a highly successful product. First, the number of categories identified can, and will, be extended many fold with increased investment in financial and infrastructural resources. The accuracy of the algorithm in identifying and differentiating pseudo-similar elements can be improved to avoid inaccurate identification and categorization of the elements. The flexibility to modify the speed of the fan dynamically could improve the performance of the device; this would help control the intensity of the fragrance to quite an extent and would avoid the superfluous intermixing of fragrances that can deter the capabilities of the fragrance generator. The ease with which the fragrance generator can be moved should also be increased. The proposed device has shown a lot of promising attributes, and the experience it gives to the user is bound to become a new normal, just as audio did with silent films. For now, the device does have these shortcomings, but they can be addressed and would lead to a much better and more user-friendly device.

References
1. Olalde K, Guesalaga I (2013) The new dimension in a calendar: the use of different senses and augmented reality apps. Proc Comput Sci 25:322–329
2. Irizarry J, Gheisari M, Williams G, Walker BN (2013) InfoSPOT: a mobile augmented reality method for accessing building information through a situation awareness approach. Autom Constr 33:11–23
3. Geroimenko V (2012) Augmented reality technology and art: the analysis and visualization of evolving conceptual models. In: Proceedings of 16th international conference on information visualisation. IEEE, Montpellier, France, pp 445–453
4. Wikipedia (2019) Augmented reality. Available at: https://en.wikipedia.org/wiki/Augmented_reality, Last accessed 21 Oct 2019
5. Bonsignore EM, Hansen DL, Toups ZO, Nacke LE, Salter A, Lutters W (2012) Mixed reality games. In: Proceedings of computer supported cooperative work companion. ACM, Seattle, Washington, USA, pp 7–8
6. Madden L (2011) Professional augmented reality browsers for smartphones: programming for junaio, layar and wikitude. Wiley
7. Portalés C, Casas S, Vidal-González M, Fernández M (2017) On the use of ROMOT—a RObotized 3D-MOvie Theatre—to enhance romantic movie scenes. Multimodal Technol Interact 1(2):7
8. Casas S, Portalés C, García-Pereira I, Fernández M (2017) On a first evaluation of ROMOT—a RObotic 3D Movie Theatre—for driving safety awareness. Multimodal Technol Interact 1(2):6


9. 5D Cinema Extreme (2019) The magic of 5D Mozi! Available online: https://www.5dcinema.hu/, Last accessed 13 Oct 2019
10. Yecies B (2016) Transnational collaboration of the multisensory kind: exploiting Korean 4D cinema in China. Media Int Aust 159(1):22–31
11. Casas S, Portalés C, Vidal-González M, García-Pereira I, Fernández M (2016) ROMOT: a RObotic 3D-Movie Theater allowing interaction and multimodal experiences. In: International conference on love and sex with robots. Springer, Cham, pp 50–63
12. Nakamoto T, Yoshikawa K (2006) Movie with scents generated by olfactory display using solenoid valves. IEICE Trans Fundam Electron Commun Comput Sci 89(11):3327–3332
13. Zhuoyuan G (2019) The difference between 4D, 5D, 6D, 7D, 8D, 9D, xD Cinema. Available online: https://www.xd-cinema.com/the-difference-between-4d-5d-6d-7d-8d-9d-xd-cinema/, Last accessed 22 Oct 2019
14. Gosain D, Sajwan M (2014) Aroma tells a thousand pictures: digital scent technology a new chapter in IT industry. Int J Curr Eng Technol 4:2804–2812
15. Heilig ML (1962) Sensorama simulator. U.S. Patent 3 050 870, 28 Aug 1962
16. Kim DW, Lee DW, Miura M, Nishimoto K, Kawakami Y, Kunifuji S (2007) Aroma-chip based olfactory display. In: Proceedings of 2nd international conference on knowledge, information and creativity support systems. JAIST, Nomi, Ishikawa, Japan, pp 97–103
17. Kaye JN (2001) Symbolic olfactory display. Master's thesis. Media Lab, Massachusetts Institute of Technology, Cambridge
18. Sato J, Ohtsu K, Bannai Y, Okada KI (2009) Effective presentation technique of scent using small ejection quantities of odor. In: Proceedings of virtual reality conference. IEEE, Lafayette, LA, USA, pp 151–158
19. Sugimoto S, Segawa R, Noguchi D, Bannai Y, Okada K (2011) Presentation technique of scents using mobile olfactory display for digital signage. In: Proceedings of IFIP 13th international conference on human-computer interaction. Springer, Berlin, pp 323–337
20. Morie JF, Luigi DP, Lathan C, Pettersen M, Vice JM (2009) Scent delivery device and method of simulating scent in a virtual environment. U.S. Patent 7,484,716. University of Southern California (USC)
21. Nakamoto T, Otaguro S, Kinoshita M, Nagahama M, Ohinishi K, Ishida T (2008) Cooking up an interactive olfactory game display. In: Proceedings of conference on computer graphics and applications. IEEE, pp 75–78
22. Yamada T, Yokoyama S, Tanikawa T, Hirota K, Hirose M (2006) Wearable olfactory display: using odor in outdoor environment. In: Proceedings of virtual reality conference. IEEE, Alexandria, VA, USA, pp 199–206
23. Matsukura H, Yoneda T, Ishida H (2013) Smelling screen: development and evaluation of an olfactory display system for presenting a virtual odor source. IEEE Trans Visual Comput Graph 19(4):606–615

A Dynamic Approach for Detecting the Fake News Using Random Forest Classifier and NLP J. Antony Vijay, H. Anwar Basha, and J. Arun Nehru

Abstract Social media's presence can have large and very negative impacts on individuals and on society. The widespread circulation of intentionally hoax news can mislead the reader. These are false stories written with the intention of fooling people, so fake news analysis, detection and intervention on social media platforms have become hot research topics that are grasping the attention of truth seekers. This survey reviews fake or false news research and finds different ways in which the random forest algorithm and NLP can be used for detecting a fake or false piece of news. Our model is built from a counting vector which is used for word tallies. It also uses the repetition inverse document (RID) matrix technique, which tallies how often words recur across the various reports in the given volume of data. These techniques do not consider tasks such as word ordering and context: two or more articles with similar word counts can be totally different in their meaning. There are few reliable ways to predict whether a piece of information presented in the news is "real" or "fake", as hoax news is hard to spot. Our suggested task is to gather a dataset containing both rumoured and true news and to employ the Random Forest algorithm and NLP to design a model which can classify an article as untrustworthy information or real news based on its words, phrases or sentences. Our goal is to achieve the trustworthiness of the readers. Keywords Fake or rumoured news detection · Random forest algorithm · Natural language processing (NLP) · Repetition inverse document (RID) · Inverse document repetition (IDR) · Expression frequent (EF)


1 Introduction

Invented, made-up stories [1] and distorted facts are spreading extensively on the Internet. Online users, or netizens, consume this hoax news, which is cheap to produce and easy to access. Such intentionally fabricated low-quality news has the potential to pose a great threat to individuals and communities, causing harm and economic damage. The number of online users is increasing day by day, so the probability of being fed this fake news is high. The verifiable hoax news [2, 3] that is often distributed on media platforms has the potential to cause outrage. Therefore, we need software which can filter out false or rumoured news and provide the true news without any misinformation [1]. The booming development of online social media and networks can have an immense impact on the audience, and the goal of our survey is to help eradicate the massive amount of false information on social media [4, 5] with such software. The audience is tilting more towards social media in comparison with traditional news media (newspapers and television), not least because social media items are easier to share, discuss and comment on with friends, family and other people, even though traditional news [5] has higher quality than social media. Because of hoax news, the audience accepts false beliefs, which creates confusion and triggers people's distrust of the news.

1.1 Definition of Bogus Message

A counterfeit or bogus message [2] is a manipulated story which is purposely used for some ulterior motive or purpose; other names are junk or pseudo-news. Such stories have existed and been circulated for many years. These irrelevant and unethical practices are called cheque-book journalism [5]. Fake news usually attracts viewers or bystanders to Web sites that generate online advertising revenue and improve their ratings. Its impact has been huge on the netizens or online users who consume online social media news [3]: consuming fake news changes the mindset of the users and influences them. The top 20 rumoured news stories of the 2016 U.S. election were analyzed by one entity [5]. Fake news is mostly found on social media, fake news Web sites or traditional media. President Vladimir Putin has signed new legislation allowing the Russian administration and government to block fake news Web sites and punish those who post material which is insulting and disrespectful to the Russian state and society.


Fig. 1 Categorization of false information based on Internet

1.2 Impact of the Fake News

Spreading [6] an enormous and massive amount of false information has become a business that can divide different groups based on the content of the news. As long as there is a demand for the product, there will be someone to supply fake news [7]. These suppliers can claim wrongful facts, and spotting them is hard. Fake news can show great discourtesy towards society, including the government, official government symbols, the constitution or the governmental bodies of any country, disrespecting state icons, symbols and authorities in order to gain revenue and better ratings. Propaganda [8] is the conscious manipulation that can change the direction of people's perception and action. Furthermore, in politics, one can put another person into the limelight in either a negative or a positive way, which may change the thinking of the people. There can also be fake accounts, hidden paid posters and paid content on social media [4] sites (Fig. 1).

1.3 Spotting of Fake News

Fake news, propaganda and misinformation [1] have to be eradicated from social media in order to get to the true story. Our survey aims at investigating the principles, methodologies and algorithms of NLP and Naive Bayes [7] for this purpose. Specialized systems target only certain kinds of fake news and are not able to detect anything else, whereas general systems that aim to detect several aspects are not as accurate, because such a system naturally has to adapt to changes and its rules cannot be as strict as those of a specialized system (Fig. 2).


Fig. 2 Fake news detection on media platform

2 Related Works

The literature survey was done by taking ideas from various papers. We analysed an existing news detection system that is available in the market.

2.1 Existing System

Existing systems are inefficient because they cannot reliably predict [8–10] whether a piece of information is fake or not. Basic human perception [11] and behaviour studies done across multiple areas such as philosophy, psychology, economics and social science [12] provide a very valuable and deep understanding of fake news. Research in these disciplines introduces new opportunities for quantitative and qualitative studies of fake news data, which is very large in size yet scarcely available. These theories facilitate the building of well-justified and explainable fake news detection [13] and intervention models, and they also help in developing datasets for exploring the real, true news required for counterfeit news studies. Comprehensive literature surveys have been done in various disciplines, and about twenty known theories have been found that can be used to study fake news. These theories can be used to study fake news from three different angles: (i) users: the involvement of people in creating and propagating fake or rumoured news [8, 9]; (ii) style: the way false news is written; and (iii) propagation: how fake news is propagated through different channels such as social media and other platforms.


2.2 Fake News Analysis Based on Style As we further explain in Sect. 3, these fundamental theories address how the content and writing style [11] of false stories can differ from that of correct news. For illustration, reality monitoring specifies that authentic events are conveyed with higher levels of sensory-perceptual information.

2.3 Fake News Analysis Based on Propagation This analysis [12] solely describes how fake news is propagated and the various media that can be used to spread [8, 9] a rumour. A model can be built that predicts [11] which sources are likely to be used for propagation, and reasonable assumptions can be made on the basis of simple epidemic models. Phenomena such as the Semmelweis reflex, the backfire effect and conservatism bias indicate that fake news, once spread, is hard to correct and propagates with minimum resistance.

2.4 Fake News Analysis User-Based This line of work [14] studies fabricated stories from the user's point of view. It considers how a user engages with fake news and what role the user plays in creating and propagating [8] it. Users can be categorized into: (i) malicious users, who create and propagate fake news intentionally, motivated by some monetary or other gain; (ii) normal users, innocent users who spread news along with malicious users [13]; they are called naïve users because they do this unintentionally, driven by self-influence or social influence. The systems currently available in the market are mostly sentiment analysis models [12]. Machine learning models [10] are used to predict whether a given statement is positive or negative. Positive statement: a statement in favour of someone or something. Negative statement: a hateful comment against some person, place, object or situation. Sentiment analysis [12] is mostly used by e-commerce Web sites and applications to measure the popularity [14] of their products. It helps them know which of their products are successful in the market and which fail to impress people. Big e-commerce companies like Amazon, Flipkart and others use sentiment analysis models to gauge customer emotions towards their products. Sentiment analysis [12] is also used by social networking sites.


Fig. 3 Flow of fake news detection program

The major reason social networking companies use sentiment analysis is to prevent online bullying or cyberbullying [12]. Nowadays, people are often bullied on social networking sites for their comments or posts, and cyberbullying can damage someone's mental state, which may lead to depression and other mental conditions. The model described in this paper can be a useful step towards detecting fake news. Fake news creates diverse problems, ranging from sarcastic articles to completely fabricated stories, affecting people in one way or another. Because of rumoured news [8], there is growing distrust of the media, which adds considerable turbulence to society. "Fake news" is an intentionally misleading [1] piece of information, and because of malicious users and some normal users on social media, public discourse is changing. Earlier detection models simply tallied word counts [11]; they do not account for essential features such as word order and content. Two articles may have similar word counts while the sense of their content is totally contrary. The big data community has made efforts to solve this issue by combining text-differentiation calculations with content-aware representations (Fig. 3).

3 System Description 3.1 Flow of Working The system builds either a count matrix or a RID (tf-idf) matrix, which tallies word counts relative to how frequently the words appear across different articles. Because these are two different ways of representing text, an adaptive random forest classifier, which is considered effective for text classification, is applied on top of them. The aim of this work is to compare the two text transformations, i.e. the count vectorizer versus the tf-idf vectorizer, and to decide which kind of text to use, the full article body versus the headline only. The next phase is extracting features


with either the count vectorizer or the RID (tf-idf) vectorizer. This step works on word and phrase frequencies, lower-cases the text, and removes the stop words provided by the NLTK package, which are common words such as "is", "of" and "to", so that only the words carrying more information about a given text in the dataset are retained.
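As a concrete illustration of the two text transformations compared here, the following minimal scikit-learn sketch builds both a count matrix and a tf-idf matrix over two made-up article snippets, with common English stop words removed; the sample texts and parameter choices are illustrative assumptions only, not the paper's actual data or configuration.

```python
# Minimal sketch: build a plain count matrix and a tf-idf matrix over the same
# texts, with common English stop words removed. The two sample "articles"
# below are invented purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = [
    "The election results were officially confirmed by the commission",
    "Shocking secret report claims the election results were faked",
]

count_vec = CountVectorizer(stop_words="english", lowercase=True)
tfidf_vec = TfidfVectorizer(stop_words="english", lowercase=True)

counts = count_vec.fit_transform(docs)   # raw word counts per article
weights = tfidf_vec.fit_transform(docs)  # counts re-weighted by inverse document frequency

print(count_vec.get_feature_names_out())
print(counts.toarray())
print(weights.toarray().round(2))
```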

3.2 RID RID, the repeated (term-frequency) inverse document frequency weighting used in this paper, is a statistic employed for information extraction and text mining. The weight measures how important a word is to a document in a corpus: its value grows with the number of times the word is repeated in the document and shrinks with how common the word is across the corpus. RID-style weighting schemes are most commonly used by search engines for ranking and scoring how relevant documents are to the topic of a user query. Each query has its own ranking factors and RID vector, and the weighting also helps to identify stop words that contribute little to differentiating the content's meaning.

3.3 Inverse Document Repetition or IDR IDR is calculated as the logarithm of the total number of documents in the database divided by the number of documents in which the particular expression occurs, i.e. IDR(e) = log(N / n_e), where N is the number of documents and n_e is the number of documents containing expression "e".

3.4 Expression Frequent or EF EF measures how often an expression appears in a record. Since every record has its own length, the raw count has to be normalized by the size of the record: EF or expression frequency = (number of times expression "e" occurs in the record)/(total number of expressions in the record). RID (repetition inverse document) then combines this with IDR to measure the overall significance of an expression, i.e. RID(e) = EF(e) × IDR(e). When the term frequency alone is computed, all expressions are treated as equally important; however, particular expressions such as "is", "of" and "that" may occur many times yet carry little significance, which is exactly what the IDR factor corrects.


3.5 Test and Train Dataset There are two types of dataset. The test dataset is used only for testing and provides an unbiased evaluation of the final model; it must contain data that have never been used in the training dataset. Such a held-back set is also called a holdout dataset. Test data and train data are generally split in the ratio 0.30:0.70. The model is first fitted on the training data; this includes learning parameters such as, for example, the weights connecting neurons between the input layer, hidden layers and output layer of a neural network. The prototype (e.g. a neural network or a random forest classifier) is trained on the training records with a supervised machine learning methodology (features such as HOG descriptors are mentioned for image inputs). The training records usually consist of an input feature vector (or scalar) together with the expected output (or target). The model is run on the training records and produces an output that is cross-checked with the target for every input feature vector taken from the documents; this comparison drives the specific learning algorithm, and the model parameters are adjusted accordingly. The model is then fitted using both the selected features and the tuned parameters. The data used here were taken from social network databases. From these statistics it is easy to see that most of the records come from the fake dataset that was spread as rumours, while comparatively few records come from BuzzFeed (Table 1).
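A minimal sketch of the 0.30:0.70 holdout split described above, using scikit-learn's train_test_split; the placeholder texts and labels are illustrative assumptions, not the paper's actual records.

```python
# Illustrative 0.30:0.70 holdout split; `texts` and `labels` are toy stand-ins
# for the news articles and their real(0)/fake(1) annotations.
from sklearn.model_selection import train_test_split

texts = ["article %d text" % i for i in range(10)]
labels = [0, 1] * 5

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.30, random_state=42, stratify=labels
)
print(len(X_train), len(X_test))  # 7 3
```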

4 Extraction of Data The data gathering procedure covered two categories, "fake news" and "real news". Separating true information from false stories requires a suitable dataset, and building one takes a large amount of work. By crawling Web sites, a total of 6530 articles were collected and an authentic news data record was created, taken primarily from media organizations (Fig. 4).

Table 1  Statistical information of dataset

Dataset            Fake dataset   Buzz feed
News               621            254
True news          190            80
Fake news          169            96
Tweets             5500           2201
Verified users     560            252
Unverified users   4647           1176
Likes              27,977         16,875
Replies            4242           5243


Fig. 4 Identifying “fake news”

5 Architectural Description 5.1 Data Preprocessing This module holds all the preprocessing routines needed to prepare the records and their text. First, the files are read and split into training, testing and validation records; then all preprocessing steps such as stemming and tokenizing are performed. Exploratory data analysis is also carried out, for example examining the distribution of the response variable and checking the quality of the records for null or missing values.
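The preprocessing steps mentioned here (tokenizing, lower-casing, stop word removal and stemming) could look roughly like the following NLTK sketch; the helper name preprocess and the sample sentence are assumptions for illustration, not the authors' exact code.

```python
# Possible preprocessing pass: tokenise, lower-case, drop stop words and stem.
# Assumes the NLTK data packages "punkt" and "stopwords" are available.
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)

stop_words = set(stopwords.words("english"))
stemmer = PorterStemmer()

def preprocess(text):
    tokens = word_tokenize(text.lower())                 # tokenise and lower-case
    tokens = [t for t in tokens if t.isalpha()]          # drop punctuation and numbers
    tokens = [t for t in tokens if t not in stop_words]  # remove stop words
    return [stemmer.stem(t) for t in tokens]             # reduce words to their stems

print(preprocess("Breaking: officials are denying the viral claims."))
```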

5.2 Attribute Removal In this work, we performed feature extraction and selection using the scikit-learn Python library. For feature selection we used techniques such as a simple bag of words and expression (term) frequency. Word embeddings and part-of-speech tagging were also used to extract a few additional features.

5.3 Classification Here, we built all the classifiers used for detecting false information. The extracted features were fed to the different classifiers: Naive Bayes, logistic regression, support vector machine, stochastic gradient descent and the random forest classifier (RFC), all taken from scikit-learn. Every extracted feature was used with each classifier. After fitting each model, its F1 score was compared with the others and its confusion matrix was checked. After tuning all of the classification methods, the two best-performing


models were picked as candidate models for fake news classification. The implemented framework then performs grid search cross-validation on these candidate models and picks the best-performing parameters for each classifier. Finally, the preferred model is used for fake news identification together with the probability of truth. In addition, we extracted the top 50 features from our term-frequency tf-idf vectorizer to see which words are most significant in each of the categories. The collected information is used to observe how the performance on the training and testing pairs changes as the amount of data in each class is increased.
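The grid search step could be sketched as follows with scikit-learn's GridSearchCV; the candidate model (a random forest), the parameter grid and the synthetic feature matrix are all illustrative assumptions rather than the exact configuration used by the authors.

```python
# Hedged sketch of the tuning step: grid-search the hyper-parameters of one
# candidate classifier (a random forest) with cross-validation and keep the
# best configuration. The synthetic matrix stands in for the extracted
# text features; the grid values are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=20, random_state=42)

param_grid = {"n_estimators": [100, 300], "max_depth": [None, 20]}
search = GridSearchCV(
    RandomForestClassifier(random_state=42), param_grid, scoring="f1", cv=5
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```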

5.4 Prediction Our finally selected, best-performing classifier was saved to disc under the name final_model.sav. Once the repository is cloned, this model is copied to the user's machine and is used by the prediction.py file to classify false information. The script takes a news story as input from the user, applies the model, and shows the final classification result to the user along with the probability of truth.
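A hedged sketch of this persistence and prediction step is shown below; it bundles a vectorizer and a Naive Bayes classifier into one pipeline so that a single saved object can score raw text, which is one possible way to realize final_model.sav and prediction.py. The tiny training texts and labels are invented for illustration.

```python
# One way to realize the save/load flow: bundle the vectorizer and classifier
# in a single pipeline, pickle it as final_model.sav, and reload it later
# (e.g. inside prediction.py) to score a raw news story.
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = [
    "officials confirm the budget figures in a press briefing",
    "shocking secret cure that doctors do not want you to know",
]
labels = [0, 1]  # 0 = real, 1 = fake

model = make_pipeline(TfidfVectorizer(), MultinomialNB()).fit(texts, labels)

with open("final_model.sav", "wb") as f:   # persist the trained model
    pickle.dump(model, f)

with open("final_model.sav", "rb") as f:   # reload it in the prediction script
    loaded = pickle.load(f)

article = "shocking secret report claims a miracle cure"
print(loaded.predict([article]), loaded.predict_proba([article]).round(2))
```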

6 Implementation The implementation starts by installing the requirements needed to run the model. After installing the required components such as Python, scikit-learn, an NLP package, tf-idf support, the Spyder/Anaconda environment and pip, the dataset is divided into two parts: 1. test dataset and 2. train dataset. We first tokenize the training dataset and remove all the stop words using the NLP tokenizer and its stop word list. The cleaning process then applies the tf-idf and count vectorizers to the training data. After extracting the features from the training dataset, we feed the extracted features into our Naïve Bayes model. The final prediction logic is saved in predictions.py; this file takes a newspaper report as input from the user and applies the trained model to it.


7 Conclusion In this paper, the proposed fake news detection system is implemented on a Python platform with scikit-learn and other natural language processing tools to process the input given by the user. The system is able to distinguish between fake news and real news because the model used is both reliable and efficient. The model can be executed locally on the user's system and also across platforms, provided the user makes the necessary adjustments, such as exposing it through a REST Web service API.

References 1. Gilda S (2017) Evaluating machine learning algorithms for fake news detection. IEEE 2. Reis JCS, Correia A, Murai F, Veloso A, Benevenuto F (2019) Supervised learning for fake news detection. IEEE Intell Syst 34(2):76–81 3. Karsai M et al (2011) Small but slow world: how network topology and burstiness slow down spreading. Phys Rev E Stat Phys Plasm Fluids Relat Interdisc Top 4. Granik M, Mesyura V (2017) Fake news detection using Naive Bayes classifier. In: 2017 IEEE first Ukraine conference on electrical and computer engineering (UKRCON) 5. Liu Y, Wu Y-FB (2011) Early detection of fake news on social media through propagation path classification with recurrent and convolutional networks. In: The thirty-second AAAI conference on artificial intelligence (AAAI-18) 6. Nematzadeh A, Ferrara E, Flammini A, Ahn YY (2014) Optimal network modularity for information diffusion. Phys Rev Lett 113(8):088701 7. Jain A, Kasbe A, Fake news detection. In: IEEE international students’ conference on electrical, electronics and computer sciences 8. Al Asaad B, Era¸scu M (2018) A tool for fake news detection. In: 20th international symposium on symbolic and numeric algorithms for scientific computing (SYNASC) 9. Kotteti CMM, Dong X, Li N, Qian L (2018) Fake news detection enhancement with data imputation. In: IEEE 16th international conference on dependable, autonomic and secure computing, 16th international conference on pervasive intelligence and computing, 4th international conference on big data intelligence and computing, and 3rd cyber science and technology congress 10. Granik M, Mesyura M (2017) Fake news detection using Naive Bayes classifier. In: IEEE first Ukraine conference on electrical and computer engineering (UKRCON) 11. Gilda S (2017) Evaluating machine learning algorithms for fake news detection. In: IEEE 15th student conference on research and development (SCOReD) 12. Gentzkow M, Shapiro JM (2006) Media bias and reputation. J Polit Econ 114(2):280–316 13. Firmstone J, Coleman S (2014) The changing role of the local news media in enabling citizens to engage in local democracies. J Pract 8(5):596–606 14. Parikh SB, Atrey PK (2018) Media-rich fake news detection: a survey. In: IEEE conference on multimedia information processing and retrieval

Automated Essay Grading: An Empirical Analysis of Ensemble Learning Techniques Shakshi Sharma and Anjali Goyal

Abstract Automated essay grading refers to the application of natural language processing tools for assigning scores to student essays. It is an important research domain, as teachers are often required to grade a large number of student essays in educational settings. Fair grading of essays is a challenging and tedious task, and teachers often consider it unproductive work. Thus, there is a need for an automated approach so that teachers are no longer required to grade student essays manually. Various automated essay scoring systems using machine learning and information retrieval concepts have been developed in past studies. In recent years, ensemble classification techniques have gained popularity; ensemble techniques use multiple classifiers for making a prediction and have proved to outperform classical machine learning. In this paper, we present an empirical study of ensemble learning techniques for the classification of student essays. We studied the performance of five machine learning and four ensemble learning techniques in our experiments and further utilized a feature selection technique to improve the prediction efficiency. The performance results on the automated student assessment prize dataset available on Kaggle show that ensemble techniques outperform traditional machine learning techniques. Keywords Automatic essay grading · Machine learning · Ensemble learning · Feature selection · Classification · Kaggle · Essay sets

S. Sharma The NorthCap University, Sector-23A, Gurugram, Haryana 122017, India e-mail: [email protected] A. Goyal (B) Amity University Uttar Pradesh, Sector-125, Noida, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_26


1 Introduction Assessment plays a vital role in the academic setting. With the increasing number of students undertaking essay-based examinations, manually grading all the essays has become a cumbersome task for teachers [1]. To ease this task, various organizations are building automated essay grading frameworks to reduce the time teachers spend on grading-related activities. Automated essay grading refers to the procedure of scoring student essays without any human intervention [2]. An automated essay grading framework operates as follows: it takes as input an essay composed for a given prompt and then assigns a numeric score reflecting its quality, in view of its grammar, content, and organization. For a correct scoring mechanism, feature engineering is the most difficult part of building an automated essay grading framework; moreover, it is difficult for people to consider every variable involved in assigning a correct score to an essay. Another problem with manual grading of essays is the variation in evaluators' styles: different teachers may assign different scores to the same essay, which seems unfair to students. In addition, evaluating a large number of essays is tedious. Interest in developing and using computer-based assessment systems has grown rapidly in recent years, due both to the rising number of students attending colleges and to the possibilities offered by e-learning approaches to online training. One of the difficulties of grading is the subjectivity, or at least the perceived subjectivity, of the evaluation process. Various studies on automated essay grading exist [3, 4]; however, it is still not a fully solved problem, and more studies are required to further advance the state of the art. Hence, in this work, we investigated various traditional and ensemble learning algorithms for automated essay grading. We used the automated student assessment prize (ASAP) dataset by The William and Flora Hewlett Foundation (Hewlett), available on Kaggle [5], for experimental evaluation. This dataset contains 8 essay sets consisting of essays by school students from Grade 7 to Grade 10. We used a total of nine features for classification of essays: vocabulary, word count limit ratio, semantic similarity topic essay, voice, semantic similarity essay, spell errors, tense, grammatical errors, and long sentences. Three schemes are studied to classify the essays into appropriate grades. In Scheme-1, we applied various traditional machine learning classification techniques. In Scheme-2, we applied ensemble learning techniques for essay grading and compared the results with those of the traditional machine learning classifiers. In Scheme-3, we studied the effect of feature selection techniques on the performance of the different machine learning and ensemble classification techniques.


2 Literature Survey Automated essay grading has been a pertinent research area for the last two decades. Various examinations such as the graduate record examination (GRE) evaluate students on the basis of student essays. In order to meet the increasing need for automated essay grading modules, different frameworks and tools have been devised in past studies. The initially developed automated essay grading frameworks use various general characteristics of writing to assess the essays, such as organization of ideas, topic content, and syntactic variety, using natural language processing tools. Page, a secondary school English instructor [6], proposed the initial automated framework for evaluating student essays. The proposed framework was effective in reducing the time teachers spend manually checking essays. Despite its noteworthy success at predicting teachers' essay scores, the early form of this framework was constrained by the limited availability of essential technology such as home PCs and the Internet. The broad adoption of the Internet, word processing software and natural language processing later helped automated essay scoring systems to re-emerge. Various tools such as project essay grade (PEG) [6], e-rater [7], intelligent essay assessor (IEA) [8], and IntelliMetric [9] were developed for automated essay grading. In the e-rater prototype, researchers defined more than 100 automatically extractable essay features, including semantic features; a linear regression technique is then applied to find good scoring models. The proposed technique produced good performance results. IEA used latent semantic analysis (LSA) for essay grading; this technique evaluates the semantic relation between two documents [10]. Some systems use coherence in noisy text for essay grading. Miltsakaki and Kukich [11] investigated the role of centering theory [12] in finding topic shifts in essays. Coherence is a notion that describes the flow of information from one piece of discourse to the next; it ranges from lower-level cohesive elements, for example causal relationships and connectives, up to higher-level elements that assess the links between the discourse and the reader's mental representation of it. Existing systems measure coherence in noisy text with different supervised and unsupervised approaches; the unsupervised methodologies normally measure lexical cohesion. Foltz et al. [8] reported that coherent texts contain a high number of semantically related words and measured coherence as a function of the semantic relatedness between adjacent sentences. Hearst et al. [10] subdivided texts into multi-paragraph units that represent subtopics and identified patterns of lexical co-occurrence and distribution, for example recognizing repetition of vocabulary across adjoining sentences. Ramalingam et al. [13] used machine learning and linear regression for automated essay grading; they found that the grades provided by the machine learning technique are more accurate than those of human raters. Larkey et al. [14] applied Bayesian and kNN classification techniques for essay grading and found that machine learning classification obtained significant agreement between the automated grader and the manual grader. In recent years, ensemble-based classification techniques have gained popularity. These techniques use multiple classifiers for making a prediction and have proved to outperform classical machine


learning classification. In this paper, we present an empirical study of ensemble-based techniques for automated essay grading.

3 Research Methodology This segment of the paper gives a brief description about the research framework, experimental dataset, and evaluation parameters used in this empirical study.

3.1 Research Framework Figure 1 shows the solution approach used in this research work for automated grading of essays. The overall architecture used for essay grading consists of multiple steps. The first step is to extract features. We used nine features: vocabulary, word count limit ratio, semantic similarity topic essay, voice, semantic similarity essay, spell errors, tense, grammatical errors, and long sentences as proposed by Madala et al. [15]. Next, the features are transformed so as to scale them to a common scale. In step 3, various machine learning classification techniques are applied for baseline comparison. Classification is a type of supervised machine learning which learns the model by giving training examples and then test it using some test data. In classification technique, the output is in the form of categories (or class labels). There exists a wide variety of supervised machine learning algorithms. We used the following five classification techniques in this work: Naive Bayes. Naive Bayes classification is a probabilistic technique based on Bayes theorem [16]. It assumes that predictors are strongly independent of each other which means that one feature cannot be related to another feature. It is a simple, easy to build classifier and works well for large datasets. k-Nearest Neighbor. k-nearest neighbor (or kNN) is an instance-based lazy technique that can be used for classification and regression tasks [17]. To label a new data, this algorithm finds the k-closest data points to new data and then the label is decided for new data on the basis of class labels possessed by k-nearest neighbor, i.e., the class label occurring maximum number of times among nearest neighbors. Generally, the value of k is decided as an odd number to ignore possibility of ties. We used two values of k, k = 3 and k = 5 for experimental evaluation in this work. Decision Tree. Decision tree [18] is one the most widely used supervised machine learning technique. It builds a tree by breaking the data into smaller parts. The top node represents the root of the tree. The internal nodes represent the conditions or rules by which the dataset has been split. The leaf nodes represent the category (or class label) to which that the example data instance belongs to. The training data is used to build the tree-like structure and the testing data is used for parsing


Fig. 1 Architecture diagram of the research framework

the tree by checking the rules to obtain class labels. A test instance flows from the root node down to a leaf node. Support Vector Machine. Support vector machine (SVM) is a discriminative classifier formally characterized by a separating hyperplane among the data instances [19]. Given labeled training data, training yields an optimal hyperplane which categorizes new examples. In two-dimensional space, the hyperplane is a line dividing the plane into sections, with each class lying in one section.


Logistic Regression. Logistic regression is a supervised machine learning classification technique in which the classification is done on the basis of probabilities [20]. It can be thought of as a special case of regression where the outcome is categorical. It uses the sigmoid nonlinear activation function to produce the output. In step 4, we applied various ensemble learning techniques for automated essay grading. Ensemble classification techniques are an advanced class of meta-machine learning algorithms that combine predictions obtained from multiple learning algorithms to achieve better predictive performance. These techniques decrease the bias and variance of prediction models and try to obtain better prediction accuracy than any of the constituent learning algorithms alone. We empirically studied four different ensemble classification techniques in this work. Random Forest. Random forest is an ensemble classifier technique that uses bagging and decision trees [21]. It classifies test instances based on the majority voting of various decision trees. A random forest containing m decision trees is formed by generating m bootstrap samples from the training data. AdaBoost. Boosting is an ensemble learning technique which creates a strong classifier from weak classifiers [22]. AdaBoost combines the classifiers with selective training data at each iteration and assigns additional weight to misclassified instances. The increased weight in the final voting helps to obtain a strong model with good prediction accuracy. Gradient Boosting. Gradient boosting is an ensemble learning technique that uses the residual error directly for updating the model [23]. It does not update the weights of misclassified data points; rather, it defines a step function and helps to minimize the loss function. XGBoost. Extreme gradient boosting (XGBoost) is a decision tree-based, parallel ensemble learning algorithm [24]. It uses a gradient boosting system with more accurate approximations to find the best tree model. It is a very fast algorithm and is well suited for structured data. Its use in the winning solutions of various Kaggle competitions has made the XGBoost ensemble classification technique quite popular. In step 5, we study the effect of using a feature selection technique for automated essay grading. We used the mutual information technique for feature selection; it measures how much information the presence or absence of a feature contributes to making the correct classification decision. Finally, the performance obtained using the various techniques is evaluated using popular performance metrics.
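A minimal sketch of steps 3-5 is given below: mutual information feature selection followed by training a few of the traditional and ensemble classifiers on the same features. The synthetic matrix stands in for the nine essay features, and the specific classifier settings are assumptions, not the authors' exact configuration.

```python
# Minimal sketch of steps 3-5: select the most informative features with
# mutual information, then fit a few traditional and ensemble classifiers on
# the same features. The synthetic data stands in for the nine essay features.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=9, n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 5: keep the six features carrying the most mutual information with the label.
selector = SelectKBest(mutual_info_classif, k=6).fit(X_train, y_train)
X_train_sel, X_test_sel = selector.transform(X_train), selector.transform(X_test)

classifiers = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}
for name, clf in classifiers.items():
    clf.fit(X_train_sel, y_train)
    print(name, round(accuracy_score(y_test, clf.predict(X_test_sel)), 3))
```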

Table 1  Confusion matrix

               Positive (P)   Negative (N)
Positive (P)   TP             FP
Negative (N)   FN             TN

3.2 Experimental Dataset The dataset for this study was extracted from Hewlett Foundations Automated Student Assessment Prize (ASAP) dataset available on Kaggle.1 This dataset contains essays written by students of Grade 7 to Grade 10 and is rated by 2–3 examiners. The average rating is used as final grade for the essay. The ASAP dataset is available in 8 sets. We used Set-1, Set-7, and Set-8 from the available essays. The same sets were used by Mandala et al. [15] and hence enables the usage of same features and effective comparison of obtained results with literature.2 The range of grade score in each chosen set is: Set-1 (2–12), Set-7 (0–30), and Set-8 (0–60). Sets-1, 7, and 8 contains 1783, 1569, and 723 training essays and 589, 441, and 233 testing essays, respectively. Overall, we used 4075 essays for training the models and 1263 essays for testing purposes.

3.3 Evaluation Measures Model evaluation is an imperative step in machine learning classification. The classification problem can be evaluated on the basis of various measures. We used five popular evaluation measures: confusion matrix, precision, recall, F1 Score, and accuracy. Confusion Matrix. Confusion matrix is a table which is formed on the basis of predicted and true values of data. For binary classification problem, a confusion matrix of size 2 × 2 is formed. Table 1 shows a binary confusion matrix. There are four notations used in Table 1: True positive (TP) represents that the model predicted instance as label 'P' and true label is also 'P'. True negative (TN) represents that model predicted as label 'N' and true label is also 'N'. False positive (FP) represents that the model predicted as label 'P', but true value is 'N', whereas false negative (FN) represents that model predicted label as 'N' but true value is 'P'. For a good prediction model, true positive and true negative should be high in number. Precision. Precision determines how many instances are actually positive from the ones which are predicted as positive by the model. This helps in determining how useful the results are

1 https://www.kaggle.com/c/asap-aes/data.
2 https://figshare.com/articles/AutomatedEssayGradingPre-ProcessedDataset/5765727/1.


Precision = TP/(TP + FP)

(1)

Recall. Recall determines how many positive values out of actual true positive values are captured by the model. It determines how complete our results are Recall = TP/(TP + FN)

(2)

F1 Score. F1 Score (or F-measure) is the harmonic mean of precision and recall. It is used to maintain the balance between the precision and recall. F1 Score = (2 ∗ Precision ∗ Recall)/(Precision + Recall)

(3)

Accuracy. Accuracy is the ratio of correctly predicted values divided by correctly classified and incorrectly classified values. Accuracy = (TP + TN)/(TP + TN + FP + FN)

(4)
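The four measures above can be computed directly with scikit-learn, as in the following toy example; the label vectors are invented purely for illustration.

```python
# Computing the measures defined above with scikit-learn on a toy pair of
# true and predicted label vectors (invented for illustration).
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))            # counts of TN, FP, FN, TP
print("precision", precision_score(y_true, y_pred))
print("recall   ", recall_score(y_true, y_pred))
print("F1 score ", f1_score(y_true, y_pred))
print("accuracy ", accuracy_score(y_true, y_pred))
```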

4 Experimental Results The goal of this study is to study the effect of ensemble learning approaches for automated essay grading. For achieving this goal, we performed the following three tasks. 1. We applied traditional classification techniques for automated essay grading. 2. We applied ensemble-based classification learning techniques, random forests, AdaBoost, gradient boosting, and XGBoost for automated essay grading. 3. We studied the effect of feature selection technique, mutual information to select important features and then compared the performance obtained using traditional and ensemble classification techniques.

4.1 Performance Evaluation Using Set-1 Table 2 presents the results of using traditional and ensemble machine learning techniques for Set-1. We found that among various traditional classifiers, logistic regression classifier achieved highest accuracy up to 85.2%. The ensemble-based technique, gradient boosting outperformed traditional classifiers by achieving accuracy up to 86.7%. We further applied feature selection technique, mutual information. Table 3 presents the performance results of using traditional and ensemble machine learning techniques for Set-1 after applying feature selection. Six important features are selected using feature selection. The selected features are: sentence count, long sentences, SS topic essay, vocabulary, word count limit ratio, and word count. From Table 3, we found that the use of feature selection helped in better classification of


Table 2  Performance results obtained for Set-1 using traditional and ensemble-based classification techniques

Classifier                 Accuracy (%)   Precision   Recall   F1 Score
Logistic regression (LR)   85.2           0.85        0.85     0.85
kNN (k = 3)                80.4           0.81        0.80     0.81
kNN (k = 5)                82.8           0.84        0.83     0.83
Naive Bayes                81.8           0.87        0.82     0.83
Decision tree              77.7           0.79        0.78     0.78
SVM                        77.5           0.78        0.78     0.78
Random forest              84.2           0.85        0.84     0.84
AdaBoost                   69.1           0.80        0.69     0.70
Gradient boosting          86.7           0.87        0.87     0.87
XGBoost                    86.5           0.87        0.87     0.87

Bold represents the highest performance result obtained by an evaluation measure corresponding to an essay set

Table 3  Performance results obtained for Set-1 using traditional and ensemble-based classification techniques (with feature selection)

Classifier                 Accuracy (%)   Precision   Recall   F1 Score
Logistic regression (LR)   85.2           0.85        0.85     0.85
kNN (k = 3)                80.1           0.81        0.80     0.81
kNN (k = 5)                83.7           0.84        0.83     0.83
Naive Bayes                79.7           0.87        0.82     0.83
Decision tree              75.3           0.79        0.78     0.78
SVM                        73.5           0.74        0.74     0.74
Random forest              83.0           0.85        0.84     0.84
AdaBoost                   69.1           0.80        0.69     0.70
Gradient boosting          87.0           0.87        0.87     0.87
XGBoost                    85.4           0.87        0.87     0.87

Bold represents the highest performance result obtained by an evaluation measure corresponding to an essay set

essays. An improvement of 0.3% is obtained in the best classification performance of the gradient boosting algorithm using feature selection. We further segregated the grade range 2–12 of Essay Set-1 into four classes. The grade range 10–12 is grouped into Grade Class A, 7–9 into Grade Class B, 4–6 into Grade Class C, and 2–3 into Grade Class D. Table 4 shows the number of correctly and incorrectly classified instances corresponding to the logistic regression classifier according to the different grade classes. Tables 5 and 6 show the accuracy of various traditional and ensemble classifiers for different grade classes without and with feature selection, respectively.

Table 4  Distribution of number of correctly classified and incorrectly classified instances using logistic regression classifier for different grade ranges (Set-1)

Grades    Correctly classified   Incorrectly classified
Grade A   100                    41
Grade B   372                    40
Grade C   29                     5
Grade D   1                      1

4.2 Performance Evaluation Using Set-7 Table 7 presents the results of using traditional and ensemble machine learning techniques for Essay Set-7. We found that among the various traditional classifiers, the logistic regression classifier achieved the highest accuracy, up to 79.1%. The ensemble-based technique AdaBoost outperformed the traditional classifiers by achieving accuracy up to 81.1%. We further applied the mutual information feature selection technique. Table 8 presents the performance results of using traditional and ensemble machine learning techniques for Set-7 after applying feature selection. Four important features are selected: sentence count, long sentences, SS topic essay, and vocabulary. From Table 8, we found that the use of feature selection did not help in better classification of essays; no improvement is obtained in the best classification performance of the AdaBoost algorithm using feature selection. We further segregated the grade range 0–30 of Essay Set-7 into four classes. The grade range 23–30 is grouped into Grade Class A, 15–22 into Grade Class B, 7–14 into Grade Class C, and 0–6 into Grade Class D. Tables 9 and 10 show the accuracy of various traditional and ensemble classifiers for different grade classes without and with feature selection, respectively.

4.3 Performance Evaluation Using Set-8 Table 11 presents the results of using traditional and ensemble machine learning techniques for Essay Set-8. We found that among the various traditional classifiers, the kNN classifier achieved the highest accuracy, up to 94.4%. The best ensemble-based technique, gradient boosting, achieved accuracy up to 92.7%. We further applied the mutual information feature selection technique. Table 12 presents the performance results of using traditional and ensemble machine learning techniques for Set-8 after applying feature selection. Three important features are selected: sentence count, SS topic essay, and vocabulary. From Table 12, we found that the use of feature selection did not help in better classification of essays; no improvement is obtained in the best classification algorithm performance using feature selection.

70.9

90

85

50

A

B

C

D

100

85.2

83.7

69.5

kNN (k = 3)

100

88.2

84.7

75.8

kNN (k = 5)

100

85.2

86.8

75.8

Random forest

100

88.2

89.3

78

XGBoost

100

100

76.9

91.4

Naïve Bayes

100

85.2

80.8

66.66

Decision tree

Bold represents the highest performance result obtained by an evaluation measure corresponding to an essay set

Logistic regression

Grades

Table 5 Accuracy of various traditional and ensemble classifiers for different grade ranges (Set-1)

100 100

50

60.1

88.6

AdaBoost

73.5

83.9

60.2

SVM

100

85.2

90.7

75.1

Gradient boosting


73

89.5

85.2

50

A

B

C

D

100

85.2

83.4

68.7

kNN (k = 3)

100

88.2

85.9

75.8

kNN (k = 5)

100

82.3

85.9

74.4

Random forest

100

91.1

88.5

74.4

XGBoost

100

100

75

88.6

Naïve Bayes

100

94.1

77.9

63.1

Decision tree

Bold represents the highest performance result obtained by an evaluation measure corresponding to an essay set

Logistic regression

Grades

Table 6 Accuracy of various traditional and ensemble classifiers for different grade ranges with feature selection (Set-1)

100 100

50

60.1

88.6

AdaBoost

61.7

81

54.6

SVM

100

91.1

91.2

73.7

Gradient boosting



Table 7  Performance results obtained for Set-7 using traditional and ensemble-based classification techniques

Classifier                 Accuracy (%)   Precision   Recall   F1 Score
Logistic regression (LR)   79.1           0.78        0.79     0.79
kNN (k = 3)                68.0           0.46        0.68     0.55
kNN (k = 5)                68.0           0.46        0.68     0.55
Naive Bayes                69.3           0.78        0.69     0.72
Decision tree              61.67          0.66        0.62     0.63
SVM                        68.0           0.46        0.68     0.55
Random forest              73.2           0.74        0.73     0.73
AdaBoost                   81.1           0.79        0.81     0.80
Gradient boosting          80.2           0.80        0.80     0.80
XGBoost                    79.1           0.79        0.79     0.79

Bold represents the highest performance result obtained by an evaluation measure corresponding to an essay set

Table 8  Performance results obtained for Set-7 using traditional and ensemble-based classification techniques (with feature selection)

Classifier                 Accuracy (%)   Precision   Recall   F1 Score
Logistic regression (LR)   79.8           0.80        0.80     0.80
kNN (k = 3)                68.4           0.72        0.68     0.72
kNN (k = 5)                69.3           0.71        0.69     0.70
Naive Bayes                70.7           0.79        0.71     0.73
Decision tree              65.0           0.71        0.65     0.67
SVM                        76.8           0.76        0.77     0.76
Random forest              73.4           0.75        0.73     0.74
AdaBoost                   53.7           0.76        0.54     0.62
Gradient boosting          79.8           0.80        0.80     0.80
XGBoost                    79.1           0.79        0.79     0.79

Bold represents the highest performance result obtained by an evaluation measure corresponding to an essay set

We further segregated the grade range 0–60 of Essay Set-8 in four classes. The grade range 46–60 is grouped into Grade Class A, 31–45 in Grade Class B, 16–30 in Grade Class C, and 0–15 in Grade Class D. Tables 13 and 14 show the accuracy of various traditional and ensemble classifiers for different grade classes without and with feature selection, respectively.

41.6

87.6

63.7

0

A

B

C

D

0

0

100

0

kNN (k = 3)

0

0

100

0

kNN (k = 5)

50

55.1

81.3

66.6

Random forest

50

59.8

88.6

50

XGBoost

100

63.7

70.6

91.6

Naïve Bayes

100

55.1

64.6

50

Decision tree

Bold represents the highest performance result obtained by an evaluation measure corresponding to an essay set

Logistic regression

Grades

Table 9 Accuracy of various traditional and ensemble classifiers for different grade ranges (Set-7)

74 0

0

88

0

AdaBoost

0

100

0

SVM

50

62.9

88.3

66.6

Gradient boosting


58.3

84

73.2

0

A

B

C

D

100

55.1

74.3

58.3

kNN (k = 3)

0

55.9

76.3

50

kNN (k = 5)

100

59.8

79.3

66.6

Random forest

50

61.4

87.6

58.3

XGBoost

50

71.6

69.6

91.6

Naïve Bayes

50

59.8

67

75

Decision tree

Bold represents the highest performance result obtained by an evaluation measure corresponding to an essay set

Logistic regression

Grades

Table 10 Accuracy of various traditional and ensemble classifiers for different grade ranges with feature selection (Set-7)

50.3 50

50

54

83.3

AdaBoost

60.6

85.6

33.3

SVM

50

62.2

88

66.6

Gradient boosting



Table 11  Performance results obtained for Set-8 using traditional and ensemble-based classification techniques

Classifier                 Accuracy (%)   Precision   Recall   F1 Score
Logistic regression (LR)   90.5           0.93        0.91     0.92
kNN (k = 3)                94.4           0.89        0.94     0.92
kNN (k = 5)                94.4           0.89        0.94     0.92
Naive Bayes                74.2           0.94        0.74     0.82
Decision tree              79.3           0.92        0.79     0.85
SVM                        46.7           0.93        0.47     0.62
Random forest              91.4           0.94        0.91     0.93
AdaBoost                   62.6           0.92        0.63     0.74
Gradient boosting          92.7           0.94        0.93     0.93
XGBoost                    91.4           0.95        0.91     0.93

Bold represents the highest performance result obtained by an evaluation measure corresponding to an essay set

Table 12  Performance results obtained for Set-8 using traditional and ensemble-based classification techniques (with feature selection)

Classifier                 Accuracy (%)   Precision   Recall   F1 Score
Logistic regression (LR)   90.9           0.92        0.91     0.91
kNN (k = 3)                87.5           0.92        0.88     0.90
kNN (k = 5)                88.4           0.92        0.88     0.90
Random forest              88.8           0.94        0.89     0.91
XGBoost                    92.7           0.95        0.93     0.94
Naive Bayes                86.2           0.95        0.86     0.89
Decision tree              79.3           0.92        0.79     0.85
SVM                        89.6           0.93        0.90     0.91
AdaBoost                   30.4           0.94        0.30     0.41
Gradient boosting          93.1           0.95        0.93     0.94

Bold represents the highest performance result obtained by an evaluation measure corresponding to an essay set

5 Conclusion and Future Work Automated essay grading is an important research domain in academics. In this paper, we present an empirical analysis of various traditional and ensemble learning classification techniques for automated essay grading. We conduct experiments using ASAP dataset available on Kaggle. From the experimental evaluations, we observed that ensemble learning techniques outperform traditional machine learning algorithms.

0

92.7

63.6

A

B

C

0

100

0

kNN (k = 3)

0

100

0

kNN (k = 5)

72.7

90

0

Random forest

90.9

90.4

0

XGBoost

90.9

71.8

50

Naïve Bayes

72.7

79

50

Decision tree

Bold represents the highest performance result obtained by an evaluation measure corresponding to an essay set

Logistic regression

Grades

Table 13 Accuracy of various traditional and ensemble classifiers for different grade ranges (Set-8)

0

100

0

SVM

9

75.4

50

AdaBoost

81.8

95

0

Gradient boosting


0

94

45.4

A

B

C

45.4

90.4

0

kNN (k = 3)

54.5

90.9

0

kNN (k = 5)

54.5

89

50

Random forest

63.6

94.5

50

XGBoost

90

86.3

50

Naïve Bayes

54.5

79.5

50

Decision tree

Bold represents the highest performance result obtained by an evaluation measure corresponding to an essay set

Logistic regression

Grades

Table 14 Accuracy of various traditional and ensemble classifiers for different grade ranges with feature selection (Set-8)

54.5

92.2

0

SVM

90.9

92.2

0

AdaBoost

54.5

94

50

Gradient boosting


This research can be extended by applying various classifiers used in this work on all the essay sets of ASAP dataset.

References 1. Cummins R, Meng Z, Edward J (2016) Constrained multi-task learning for automated essay scoring. Association for Computational Linguistics, pp 1–11 2. Zesch T, Michael W, Dirk SA (2015) Task-independent features for automated essay grading. In: Proceedings of the tenth workshop on innovative use of NLP for building educational applications, p 224232 3. Shermis MD, Jill B, Sharon A (2013) Introduction to automated essay evaluation. In: Handbook of automated essay evaluation. Routledge, pp 23–37 4. Zupanc K, Zoran B (2017) Automated essay evaluation with semantic analysis. Knowl-Based Syst 120:118–132 5. Automated Student Assessment Prize Dataset (2019) Available online: https://www.kaggle. com/c/asap-aes, Last accessed 22 Dec 2019 6. Page EB (1966) The Imminence of… grading essays by computer. Phi Delta Kappan 47(5):238243 7. Burstein J, Joel T, Madnani N: The E-rater automated essay scoring system. In: Handbook of automated essay evaluation. Routledge, pp 77–89 8. Foltz PW, Lynn AS, Karen EL (2013) Implementation and applications of the intelligent essay assessor. In: Handbook of automated essay evaluation. Routledge, pp 90–110 9. Schultz MT: The IntelliMetric automated essay scoring engine—a review and an application to Chinese essay scoring. In: Handbook of automated essay scoring: current applications and future directions, pp 89–98 10. Hearst MA (2000) The debate on automated essay grading. IEEE Intell Syst Appl 15(5):22–37 11. Miltsakaki E, Karen K (2000) Automated evaluation of coherence in student essays. In: Proceedings of the linguistic resources in education conference. Athens, Greece, www.ling. upenn.edu/elenimi/grad.html 12. Grosz BJ, Weinstein S, Joshi AK (1995) Centering: a framework for modeling the local coherence of discourse. Comput Linguist 21(2):203–225 13. Ramalingam VV, Pandian A, Chetry P, Nigam H (2018) Automated essay grading using machine learning algorithm. J Phys Conf Ser 1000(1):1–8 (IOP Publishing) 14. Larkey LS (1998) Automatic essay grading using text categorization techniques. In: SIGIR, p 98 15. Madala DSV, Gangal A, Krishna S, Goyal A, Sureka A (2018) An empirical analysis of machine learning models for automated essay grading. PeerJ Preprints, p e3518v1 16. Han J, Jian P, Micheline K (2011) Data mining: concepts and techniques. Elsevier 17. Tan S (2006) An effective refinement strategy for KNN text classifier. Exp Syst Appl 30(2):290– 298 (Elsevier) 18. Quinlan JR (2014) C4.5: programs for machine learning. Elsevier 19. Smola AJ, Bernhard S (2004) A tutorial on support vector regression. Statist Comput 14(3):199– 222 20. Zhang T, Frank JO (2001) Text categorization based on regularized linear classification methods. Inf Retrieval 4(1):5–31 21. Breiman L (2001) Random forests. Mach Learn 45(1):5–32 22. Freund Y, Robert ES (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55(1):119–139 23. Friedman JH (2002) Stochastic gradient boosting. Comput Stat Data Anal 38(4):367–378


24. Chen T, Carlos G (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 785–794

Survey of Scheduling and Meta Scheduling Heuristics in Cloud Environment Savita Khurana and Rajesh Kumar Singh

Abstract Scheduling is the heart of any successful cloud; a wrong scheduling decision affects its performance and reliability. Traditional scheduling algorithms, e.g., shortest job first and round-robin, do not give appropriate results in the cloud and also do not fulfil the quality of service constraints imposed by the user. The cloud implements the virtualization concept, and scheduling is needed to manage its large number of virtual resources. The job requirements of different users may differ, so an efficient and reliable job scheduling algorithm is necessary for the performance and reputation of the cloud. Task scheduling divides work into a number of subtasks to reduce makespan time and increase the performance of the cloud. Researchers have proposed different scheduling algorithms to improve cloud performance. In this paper, a survey of different scheduling algorithms with their merits and demerits is presented. Keywords Cloud · Quality of service · Scheduling · Meta-heuristics

1 Introduction Nowadays, high-speed computing is in demand. Computers have become tremendously powerful in terms of processing speed and storage. At the same time, there exist complex applications and a wide variety of simulation experiments that require a large number of computational resources, which cannot be provided by a single system. To satisfy these requirements, cloud computing was born. Cloud computing is a type of distributed system with a pool of computing resources that can be provided to users as per their needs at a low cost [1].

S. Khurana (B) Computer Science and Engineering Department, I. K. Gujral Punjab Technical University, Jalandhar, Punjab, India e-mail: [email protected] R. K. Singh SUS Institute of Computers, Tangori, Mohali, Punjab, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_27


There are a large number of cloud providers with their own clouds, e.g., Amazon, Google App Engine, IBM, and Microsoft, which provide services to customers on a pay-per-use policy that is beneficial for both companies and users. The motive of cloud computing is to efficiently utilize the virtual machines and to schedule tasks on the most suitable resources in order to reduce makespan time and cost and to increase reliability and resource utilization [2]. Efficient scheduling is therefore desirable in the cloud environment [6]. The scheduling system in the cloud is responsible for selecting the most suitable machines in a cloud for user jobs; since the scheduling problem is multi-objective [20], different parameters must be addressed to achieve QoS. Both the meta-scheduler and the local scheduler are essential to achieve the scheduling objectives [6]: the local scheduler schedules tasks within a particular virtual machine, while the meta-scheduler is responsible for submitting tasks across all virtual machines. To solve these optimization problems, various heuristic techniques [5] are implemented in the cloud, i.e., batch mode heuristics, immediate mode heuristics and meta-heuristics. Some heuristics give good results in terms of makespan, some are best at flow control and some are cost-efficient, but none is generic, as their results vary from application to application. The commonly used task scheduling algorithms are Min–Min, Max–Min, Min–Max, LJFR-SJFR, Suffrage, Relative Cost, Opportunistic Load Balancing, Switching Algorithm, Minimum Completion Time, Minimum Execution Time, Genetic Algorithms, Ant Colony Optimization, and the Flower Pollination algorithm [3, 6, 26]. This survey focuses on the design of scheduling systems for the computational cloud. First, the challenges of designing scheduling systems in cloud environments are investigated, and a cloud scheduling framework useful for guiding the design is elaborated. The rest of the paper includes the related work in Sect. 2 and the cloud scheduling process and its architecture in Sect. 3. In Sect. 4, different heuristic and meta-heuristic algorithms are described; in Sect. 5, the need for scheduling is discussed; and Sects. 6 and 7 include the conclusion and future work.

2 Related Work Bittencourt et al. proposed an optimized scheduling algorithm in which a hybrid cloud is used to speed up the execution of workflows within the expected execution time and at reduced execution cost; the results are compared with a greedy approach and show better performance [12]. Abrishami et al. proposed a new QoS-based workflow scheduling algorithm that uses a fair policy to compute execution time and results in low cost. This algorithm works very well on a Software as a Service cloud architecture and uses Partial Critical Paths (PCP) for scheduling tasks; it promises to execute the tasks within the user's defined deadline [13]. Verma et al. proposed a Bi-Criteria Priority-based Particle Swarm Optimization (BPSO) algorithm, implemented on different scientific workflows. The algorithm considers two criteria, i.e., budget and deadline. The results are compared with standard PSO and with BHEFT, the budget-


constrained earliest finish time algorithm [14]. The proposed algorithm significantly decreases execution cost compared to the existing ones. Abrishami et al. also designed a Partial Critical Path (PCP) algorithm. In this work, two scheduling algorithms are used to fulfil the quality of service constraints set by the user: the first is IaaS Cloud Partial Critical Paths (IC-PCP), and the second is IaaS Cloud Partial Critical Paths with Deadline Distribution (IC-PCPD2) [15]. Both algorithms are analyzed on different workflows, LIGO, CyberShake, and SIPHT, and show better results. Shukla et al. proposed an Evolutionary Multi-objective Optimization-based algorithm to solve the workflow scheduling problem on an Infrastructure as a Service (IaaS) platform [4]. Liu et al. explored a workflow scheduling and cloud workflow management framework; the paper mainly focuses on a cloud-based architecture for the smart city environment [27]. Lee et al. addressed the problem of resource-efficient workflow scheduling and presented a Maximum Effective Reduction (MER) algorithm [28]; this algorithm improves resource utilization, reduces resource provisioning and saves energy consumption. Zeng et al. designed a multi-objective scheduling algorithm that takes care of security and budget constraints; the proposed strategy reduces the makespan while maintaining security in scheduling the tasks [3]. Ali et al. proposed an algorithm that schedules independent tasks in the cloud computing environment by considering quality of service parameters. The work considers different attributes of a task, such as its type, priority, and execution time, and then maps it for execution as per its quality of service parameters; the results are compared with min–min and a simple task scheduling algorithm [18]. Kumar et al. developed a new hybrid meta-heuristic algorithm combining simple ant colony optimization and global ant colony optimization, namely GACO, based on the foraging behaviour of ants. This work optimizes the service-level agreement violation cost in the cloud environment and mainly focuses on optimized allocation of resources as virtual machines [22]. Kumar et al. also performed a comparative analysis of previous research work using the five most popular meta-heuristic techniques, optimizing the service-level agreement. The authors mathematically formulated the QoS criteria, the SLA penalty cost and the cloud-domain-specific constraints. The near-optimal solution of each meta-heuristic approach is reported for four performance metric cases: worst, best, mean and standard deviation [23]. Overall, the present paper discusses different heuristic and meta-heuristic scheduling algorithms, the quality of service metrics they consider, such as cost, load balancing, different time factors and makespan, and analyzes their performance.

3 Cloud Scheduling Process and Architecture In the cloud environment, resource provisioning, resource management, task scheduling, security, and data center energy consumption are the major issues and challenges. Task scheduling is the major concern of the cloud for achieving data center efficiency [7]. In principle, scheduling in the cloud performs two tasks: ordering and mapping. When a large number of tasks are waiting for execution in the cloud, ordering is performed to establish the task order. Ordering of tasks is a desirable part of scheduling when task priority or deadline is important. Mapping is the process of selecting a suitable resource from a large pool of resources and assigning it to the most appropriate task. During the mapping process, the expected performance is estimated to decide the best schedule. The aims of cloud scheduling are to maximize throughput, maximize resource utilization, minimize execution time, and fulfill economic constraints. The performance of the cloud can be improved by reducing the task processing time and by making sure that all resources are used without staying idle. The meta-scheduler is used to schedule and map the tasks to the best location within a distributed environment. Meta-scheduling is more complex than local scheduling because of the heterogeneous nature of the resources. To schedule jobs within a site, a local scheduler is sufficient, but in distributed computing a meta-scheduler is required (Fig. 1).

4 Scheduling Heuristics The main objective of task scheduling is to map tasks to the most appropriate VMs in terms of processing speed and cost [21]. Tasks in the cloud environment are divided into two categories, i.e., dependent tasks and independent tasks. Scheduling independent tasks is easy, as a task does not depend on other tasks and no priority order needs to be followed during the scheduling process [6]. Scheduling dependent tasks is quite difficult because of functional and non-functional data dependencies among the tasks, which must be respected during the scheduling process; the scheduling of dependent tasks is called workflow scheduling. The selection of an appropriate task scheduling strategy in the cloud environment is therefore based on task dependency. The main objectives in scheduling the tasks are minimizing the makespan and the cost, but in general these two factors are inversely proportional to each other, because a higher-configuration machine can reduce the makespan but increases the cost, so managing these two factors is a challenging job in cloud computing.

Fig. 1 Basic cloud scheduling architecture

In this paper, different batch mode heuristics, i.e., Min–Min, Max–Min, Min–Max, LJFR-SJFR, Suffrage, and RC [6], and immediate mode heuristics, i.e., OLB, SA, MCT, and MET [8], along with meta-heuristic algorithms, are studied (Fig. 2).

4.1 Batch Mode Heuristics The batch mode scheduling methods are simple yet powerful heuristics. They are distinguished from more sophisticated scheduling methods such as metaheuristics in that they provide a high-quality allocation of resources efficiently, whereas the sophisticated methods would need much longer execution times. The different batch mode methods are as follows. Min–Min Algorithm This heuristic considers the minimum completion time of each task; as a result, the tasks having smaller completion times are executed first. The major drawback of this approach appears when a large number of smaller tasks are ready for execution: the larger tasks then have to wait for a long duration [25]. Unbalanced load is another drawback of this approach.
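
As an illustration of the batch mode idea, the following is a minimal Python sketch of Min–Min, assuming an expected-time-to-compute (ETC) matrix etc[t][m] is available; the toy values are invented for illustration only, and Max–Min is obtained by flipping the final selection from min to max.

```python
def min_min_schedule(etc):
    """Min-Min heuristic. etc[t][m] = expected execution time of task t on machine m.
    Repeatedly pick the task whose minimum completion time is smallest and assign it."""
    num_tasks, num_machines = len(etc), len(etc[0])
    ready = [0.0] * num_machines           # time at which each machine becomes free
    unassigned = set(range(num_tasks))
    schedule = {}                           # task -> machine
    while unassigned:
        # best (completion time, machine) for every unassigned task
        best = {t: min((ready[m] + etc[t][m], m) for m in range(num_machines))
                for t in unassigned}
        # Min-Min: take the task with the smallest of these minima
        # (Max-Min would instead take the task with the largest minimum)
        task, (finish, machine) = min(best.items(), key=lambda kv: kv[1][0])
        schedule[task] = machine
        ready[machine] = finish
        unassigned.remove(task)
    return schedule, max(ready)             # mapping and resulting makespan

# toy example: 4 tasks on 2 machines
etc = [[3, 5], [1, 2], [8, 4], [2, 6]]
print(min_min_schedule(etc))
```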

Fig. 2 Different heuristics algorithms: immediate mode heuristics (OLB, SA, MCT, MET) and batch mode heuristics (Min–Min, Max–Min, Min–Max, LJFR-SJFR, Suffrage, Relative Cost)

Max–Min Scheduling Algorithm The Max–Min heuristic works just opposite to the Min–Min algorithm: instead of favoring smaller tasks, it gives preference to the larger tasks in the cloud. It otherwise works like the Min–Min scheduling algorithm but schedules the largest task first. Its major drawbacks are an unbalanced load and an unnecessary increase in the waiting time of the shortest tasks, which can lead to starvation. Min–Max This heuristic combines the features of both the Min–Min and Max–Min heuristics and works in two phases. During the first phase, the smallest completion times are calculated, and in the next phase, the Max–Min heuristic is applied. This heuristic removes the drawbacks of both algorithms. LJFR-SJFR Algorithm The Largest Job Fastest Resource and Shortest Job Fastest Resource (LJFR-SJFR) heuristic works on two principles applied alternately: the largest tasks are submitted to the fastest machines to shorten the makespan, and the smallest tasks are submitted to the fastest machines to shorten the completion of the remaining schedule.

Suffrage This heuristic calculates, for each submitted task, the difference between its second minimum completion time and its first minimum completion time. The objective of the suffrage heuristic is to allocate a machine to the task that would suffer the most in terms of expected completion time if it were denied its best machine, so that all unassigned tasks get executed and the starvation problem does not arise. Relative Cost (RC) This heuristic takes the cost factor into account for task submission. The cost depends upon two factors, i.e., the static relative cost, which is calculated at the start, and the dynamic relative cost, which is computed after every iteration.
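
A sketch of one suffrage selection step under the same illustrative ETC-matrix assumption used above: the task whose best and second-best completion times differ the most is served first, since it would suffer most if denied its best machine.

```python
def sufferage_step(etc, ready, unassigned):
    """Pick the next task by the sufferage heuristic and return (task, best_machine)."""
    best_task, best_suff, best_machine = None, -1.0, None
    for t in unassigned:
        completions = sorted(ready[m] + etc[t][m] for m in range(len(ready)))
        # sufferage value = second-best completion time minus best completion time
        suff = completions[1] - completions[0] if len(completions) > 1 else completions[0]
        if suff > best_suff:
            best_machine = min(range(len(ready)), key=lambda m: ready[m] + etc[t][m])
            best_task, best_suff = t, suff
    return best_task, best_machine
```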

4.2 Immediate Mode Heuristics Immediate mode scheduling is used as soon as a job is submitted to the cloud system for execution: the scheduler is activated immediately and does not wait for a batching interval. When the arrival rate of jobs is low, resources are allocated immediately without waiting for any time interval. The immediate mode heuristics, i.e., opportunistic load balancing (OLB), minimum completion time (MCT), minimum execution time (MET), and the switching algorithm (SA), are discussed in the following. Opportunistic Load Balancing (OLB) This heuristic considers the idle time of the virtual machines. A task is submitted to an idle virtual machine without running any scheduling algorithm, so scheduling time is avoided. This heuristic gives the best results when there are fewer tasks than machines. Switching Algorithm (SA) This heuristic combines the best features of the lowest completion time and lowest execution time heuristics: it first applies MCT and then applies MET cyclically to achieve load balancing. Minimum Completion Time (MCT) This heuristic considers only the completion time of tasks when submitting a task to a machine. The method is very simple but inefficient in complex scenarios. Minimum Execution Time (MET) This heuristic considers only the execution time of tasks when submitting tasks to machines. The method is very simple but inefficient in complex scenarios, as it takes care of neither resource availability nor load balancing.
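
A tiny sketch contrasting MET and MCT for a single arriving job, with illustrative execution-time estimates and current machine ready times; it shows why MET ignores the current load while MCT accounts for it.

```python
def met_choice(exec_time):
    """MET: machine with the minimum execution time, ignoring current load."""
    return min(range(len(exec_time)), key=lambda m: exec_time[m])

def mct_choice(exec_time, ready):
    """MCT: machine with the minimum completion time (ready time + execution time)."""
    return min(range(len(exec_time)), key=lambda m: ready[m] + exec_time[m])

exec_time = [4.0, 6.0, 5.0]   # estimated run time of the arriving job on each VM
ready     = [9.0, 1.0, 2.0]   # time at which each VM becomes idle
print(met_choice(exec_time), mct_choice(exec_time, ready))  # MET picks VM 0, MCT picks VM 1
```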

4.3 Meta-Heuristics Different authors categorize meta-heuristic algorithms from different perspectives. Figure 3 shows different meta-heuristic algorithms. Meta-heuristic algorithms explore the solution space and try to find a near-optimal solution. These algorithms are categorized on the basis of different properties and behavior; in Fig. 3 they are categorized as swarm-based optimization, nature-inspired algorithms, and evolutionary algorithms [16]. Genetic Algorithm The genetic algorithm is a meta-heuristic for solving optimization problems such as task scheduling. It has three main operations, i.e., selection, crossover, and

Fig. 3 Different types of metaheuristic algorithms: evolutionary algorithms (genetic algorithms, differential evolution, evolutionary programming), nature-inspired algorithms (BAT algorithm, flower pollination algorithm, grey wolf algorithm), and swarm-based algorithms (ant colony optimization, particle swarm optimization, artificial bee colony optimization)

mutation of the population. The selection operation uses a fitness function in each generation to choose individuals, and these individuals produce the next generation; this is repeated until the desired result is found. During the crossover phase, two different individuals exchange genes at the same position to produce new individuals in the population. The last phase is mutation, which makes individuals more adaptable by changing some of their genes. In a simple genetic algorithm, the population of the given problem is initialized randomly; after that, crossover and mutation operations are applied to produce a new population. A service-level agreement (SLA) is built up between the user and the cloud provider; it includes quality-of-service metrics like cost, time, budget, etc. Flower Pollination Algorithm (FPA) It is a nature-inspired meta-heuristic algorithm for finding near-optimal solutions of NP-hard problems [19]. There exist two types of pollination, biotic and abiotic [11, 18]. In biotic pollination, the pollen is carried by pollinators like birds, bees, etc.; in abiotic pollination, the atmosphere is used for the reproduction of flowers [17]. To choose the type of pollination, a switch probability is used; for biotic pollination, the switch probability should be higher than 0.5. A reproduction factor is used to find the fitness of a candidate task execution sequence, and Lévy-flight optimization is used to find the best position of a task and its execution sequence, which also increases the efficiency of the algorithm. Particle Swarm Optimization (PSO) The PSO algorithm was designed by Kennedy and Eberhart [8] for solving optimization problems. It optimizes a problem by iteratively improving candidate solutions with respect to a given measure of quality. The initial space is a population of particles; it is an evolutionary algorithm in which each particle is a possible solution of the particular problem space. Each particle updates its position X_k^i iteratively as it moves in the search space:

X_{k+1}^i = X_k^i + V_{k+1}^i    (1)

This algorithm is applied to solve the task scheduling problem as an optimization problem. If there are n tasks and m virtual machines in the data center, the scheduler uses the PSO algorithm to distribute the tasks to the available virtual machines; in this analogy, an n × m position matrix is built up. The fitness function is designed as per the user-defined quality-of-service metric [24]; it can be makespan, cost, deadline, etc., as per the user's need. Ant Colony Optimization (ACO) The ant colony optimization algorithm simulates the foraging activities of ants: when a group of ants searches for food, they spread pheromone to communicate with each other, and once an ant succeeds in finding a food source, it leaves pheromone on that path, which helps the other ants reach the food source.
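
To make the PSO update of Eq. (1) concrete, the following is a minimal sketch of the position update together with the usual velocity rule, where a particle encodes a candidate task-to-VM mapping as a real-valued vector that is rounded to VM indices; the inertia and acceleration coefficients are illustrative, not the values used in the cited works.

```python
import random

def pso_step(positions, velocities, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    """One PSO iteration: V_{k+1} = w*V_k + c1*r1*(pbest - X_k) + c2*r2*(gbest - X_k),
    then X_{k+1} = X_k + V_{k+1}, as in Eq. (1)."""
    for i, (x, v) in enumerate(zip(positions, velocities)):
        for d in range(len(x)):
            r1, r2 = random.random(), random.random()
            v[d] = w * v[d] + c1 * r1 * (pbest[i][d] - x[d]) + c2 * r2 * (gbest[d] - x[d])
            x[d] += v[d]
    return positions, velocities

def decode(position, num_vms):
    """Map the continuous position of one particle to a task -> VM assignment."""
    return [int(round(p)) % num_vms for p in position]
```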

5 Discussion Quality of Service (QoS) is a fundamental criterion of any scheduling algorithm. It includes performance, cost, security, reliability, etc., or a combination of them, as constraints set by clients [27]. In a distributed computing environment like the cloud or grid, there are a number of issues, such as availability and non-centralized control, which act as QoS constraints and do not occur in a single computer system [9]. General scheduling algorithms do not take care of QoS parameters, which affects the overall performance of the cloud system. There are two types of tasks in the cloud environment: computation-demanding tasks and communication-demanding tasks. Communication-demanding tasks, such as the migration of a task from one machine to another, require sufficient bandwidth for their completion. Tasks that solve scientific problems fall into the category of computational tasks; these problems require high-speed processors to complete the assigned work with minimum delay. If such computational problems are submitted for execution on high-bandwidth machines rather than on machines with high computation power, the performance of the cloud system degrades and the resources are not utilized efficiently. The case of communication-based tasks is similar [10]. So, there is a need for appropriate scheduling algorithms for effective utilization of resources, so that all tasks get executed and are completed within the constraints decided by the users and the cloud providers.

6 Conclusion In this paper, a study of different task scheduling algorithms in the cloud has been presented. It was observed that none of the above algorithms considers Quality of Service (QoS), which is a desirable factor in the present scenario. QoS-based scheduling will increase resource utilization and decrease turnaround time, because a job that requires a high quality of service in terms of bandwidth is submitted to a higher-bandwidth resource, and similarly a highly computational job is submitted to a more efficient resource in terms of processing speed. So it is desirable that QoS be included in the scheduling process for the reputation and reliability of the cloud.

7 Future Work Scheduling is one of the prominent issues in cloud computing. A number of heuristic and meta-heuristic algorithms exist, and most of the scheduling algorithms focus on optimizing time and cost metrics. Future work can be extended to develop new optimization techniques that help reduce the energy consumption of the

virtual machines and improve their reliability. The work can also be extended to workflow scheduling, in which the dependencies among tasks are considered as priorities during the scheduling process.

References 1. Miglani N, Sharma G (2018) An adaptive load balancing algorithm using categorization of tasks on virtual machine based upon queuing policy in cloud environment. Int J Grid Distrib Comput 11(11):1–2 2. Bawa RK, Sharma G (2012) Reliable resource selection in grid environment. arXiv preprint arXiv:1204.1516. 6 Apr 2012 3. Zeng L, Veeravalli B, Li X (2015) SABA: a security-aware and budget-aware workflow scheduling strategy in clouds. J Parall Distrib Comput 1(75):141–151 4. Shukla S, Gupta AK, Saxena S, Kumar S (2016) An evolutionary study of multi-objective workflow scheduling in cloud computing. Int J Comput Appl 133(14):14–18 5. Nejatzadeh S, Karamipour M, Eskandari M (2013) A new heuristic approach for scheduling independent tasks on grid computing systems. Int J Grid Distrib Comput 6(4):97–106 6. Xhafa F, Barolli L, Durresi A (2007) Batch mode scheduling in grid systems. Int J Web Grid Serv 3(1):19–37 7. Al-Khateeb A, Abdullah R (2012) An enhanced meta-scheduling system for grid computing that considers the job type and priority. Computing 94(5):389–410 8. Eberhart R, Kennedy J (1995) Particle swarm optimization. In: Proceedings of the IEEE international conference on neural networks, vol 4, 27 Nov 1995, pp 1942–1948 9. Maheswaran M, Ali S, Siegel HJ, Hensgen D, Freund RF (1999) Dynamic mapping of a class of independent tasks onto heterogeneous computing systems. J Parall Distrib Comput 59(2):107–131 10. Bawa RK, Sharma MG (2012) Performance prediction and QoS based resource selection in grid. Int J Grid Distrib Comput 5(3):69–80 11. Bibiks K, Li JP, Hu YF (2015) Discrete flower pollination algorithm for resource constrained project scheduling problem. Int J Comput Sci Infor Sec 13(7):8–199 12. Bittencourt LF, Madeira ER (2011) HCOC: a cost optimization algorithm for workflow scheduling in hybrid clouds. J Int Serv Appl 2(3):207–227 13. Abrishami S, Naghibzadeh M (2012) Deadline-constrained workflow scheduling in software as a service cloud. Sci Iran 19(3):680–689 14. Verma A, Kaushal S (2014) Bi-criteria priority based particle swarm optimization workflow scheduling algorithm for cloud. In: 2014 recent advances in engineering and computational sciences (RAECS), 6 Mar 2014. IEEE, pp 1–6 15. Abrishami S, Naghibzadeh M, Epema DH (2013) Deadline-constrained workflow scheduling algorithms for infrastructure as a service clouds. Future Gener Comput Syst 29(1):158–169 16. Zhu Z, Zhang G, Li M, Liu X (2015) Evolutionary multi-objective workflow scheduling in cloud. IEEE Trans Parallel Distrib Syst 27(5):1344–1357 17. Masdari M, ValiKardan S, Shahi Z, Azar SI (2016) Towards workflow scheduling in cloud computing: a comprehensive analysis. J Netw Comput Appl 1(66):64–82 18. Ali HG, Saroit IA, Kotb AM (2017) Grouped tasks scheduling algorithm based on QoS in cloud computing network. Egypt Inform J 18(1):11–19 19. Yang XS (2012) Flower pollination algorithm for global optimization. In: International conference on unconventional computing and natural computation. Springer, Berlin, Heidelberg, 3 Sept 2012, pp 240–249 20. He H, Xu G, Pang S, Zhao Z (2016) AMTS: Adaptive multi-objective task scheduling strategy in cloud computing. China Commun 13(4):162–171

21. Alkhashai HM, Omara FA (2016) An enhanced task scheduling algorithm on cloud computing environment. Int J Grid Distrib Comput 9(7):91–100 22. Kumar A, Bawa S (2019) Generalized ant colony optimizer: swarm-based meta-heuristic algorithm for cloud services execution. Computing 101(11):1609–1632 23. Kumar A, Bawa S (2019) A comparative review of meta-heuristic approaches to optimize the SLA violation costs for dynamic execution of cloud services. Soft Comput 2019:1–4 24. Miglani N, Sharma G (2019) Modified particle swarm optimization based upon task categorization in cloud environment. Int J Eng Adv Technol 8(4):67–72 25. Bawa RK, Sharma G (2013) Modified min–min heuristic for job scheduling based on QoS in grid environment. In: 2013 2nd international conference on information management in the knowledge economy. IEEE, 19 Dec 2013, pp 166–171 26. Khurana S, Singh RK (2020) Workflow scheduling and reliability improvement by hybrid intelligence optimization approach with task ranking. EAI Endorsed Trans Scalable Inf Syst 7(24):1–10 27. Liu L, Zhang M, Lin Y, Qin L (2014) A survey on workflow management and scheduling in cloud computing. In: 2014 14th IEEE/ACM international symposium on cluster, cloud and grid computing. IEEE, 26 May 2014, pp 837–846 28. Lee YC, Han H, Zomaya AY, Yousif M (2015) Resource-efficient workflow scheduling in clouds. Knowl Based Syst 1(80):153–162

A Novel Idea for Designing a Speech Recognition System Using Computer Vision Object Detection Techniques Sukrobjon Toshpulotov, Sarvar Saidov, Selvanayaki Kolandapalayam Shanmugam, J. Shyamala Devi, and K. Ramkumar Abstract It is very challenging for establishing the communication with deaf people around the world. They need to get the assistance from others, and others need not be true always. To overcome this situation, a device we propose to develop an application which will provide the easy way of communicating using sign language without the help of others. The concept of this device development is a novel idea. It is intended to make the device as standalone using the recent development in embedded system technology. The proposed system is aimed to develop a pocket assistant for the deaf and hearing-impaired people in communicating with other people. All the functionality of the application is built around the organization of communication to establish a conversation between the user and his interlocutor. It is planned to develop a device to recognize the interlocutor’s speech in real time, query the related sign representation stored in database, and display the text or set of pictures in sign language on the screen. The device will be based on Raspberry Pi hardware. The technology involves capturing the audio using …. The audio will be processed by removing the noise and fed into the audio-to-text convertor to output the text message. Text information is detected using histogram of oriented gradients (HOG) and local binary pattern (LBP). The required information will be selected and queried to extract the desired sign representation from the database and provide the desired output to the user on the screen. Technological transfer of the proposed product will enable S. Toshpulotov · S. Saidov · S. K. Shanmugam Concordia University, Chicago, IL, USA e-mail: [email protected] S. Saidov e-mail: [email protected] S. K. Shanmugam e-mail: [email protected] J. Shyamala Devi · K. Ramkumar (B) SRM University, Kattankulathur, India e-mail: [email protected] J. Shyamala Devi e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_28

mass production that can be utilized in national and global markets for the benefit of elderly and deaf people. The application has three modules: login, recording the information, and translating the information and storing it in the database. Keywords Histogram of oriented gradients · Local binary pattern · Interlocutor

1 Introduction The term “deaf-mute” refers either to deaf people who use sign language as their mode of communication or to people who are deaf and unable to speak. The word “deaf” is spelled in two ways: “d” refers to a person's level of hearing as measured by audiology, and “D” refers to the Deaf community, who use sign language for communication. According to the data of the World Federation of the Deaf (WFD), almost 5% of the world's population has disabling hearing loss. Hearing loss has a diverse impact on a person's speech and language development. The degree of hearing loss can be seen in three categories based on the extent of the impact: moderate, severe, and profound. People with severe or profound hearing loss have a considerably higher voice handicap index (VHI) compared with those with moderate hearing loss [1]. People with mild hearing loss have fewer problems in the development of speech, because the inability to hear certain sounds and speech with clarity does not have as large an impact. The deaf community is not a monolithic group; it includes hard-of-hearing people, people deaf from birth, congenitally deaf people, orally educated deaf people, and late-deafened adults. Every group has a different degree of hearing loss, its own impact, and uses different methods of communication. Deaf people face many challenges, irritations, and frustrations that affect the day-to-day activities of the individual. Children with hearing loss are affected the most: many of them become introverts and resist face-to-face socialization. It affects not only their socialization but also their regular communication, such as the inability to communicate with family members, friends, neighbors, colleagues, etc. Many times, caretakers find it difficult to take care of such children. To overcome these difficulties, there exist varied channels of communication by which deaf-mute people can exchange information, such as helper pages, notes, books with letters, lip reading, gestures, and sign languages. Another option is medical treatment to reduce deafness, but these methods are expensive as per WHO 2017 data; the costs involved are direct, indirect, and intangible. However, there are many problems in establishing communication. Even though over 300 sign languages exist, normal people generally lack awareness of sign language, and it is not easy for them to understand and use it.

The major objective is to support this community and enable its members to access all resources and live happily, by developing a mobile-based technology application (MBTA). The remainder of the paper is organized as follows: Sect. 2 discusses the literature review; Sect. 3 provides an overview of the supporting technology; Sect. 4 describes the proposed system; and the conclusion is presented in Sect. 5.

2 Literature Study Though different communication channels and methods exist for different groups of the deaf community, people still lag in knowledge of deaf culture, and this is reflected in the documentation of society and the healthcare environment, as discussed by Sirch et al. Kuenburg et al. portray the communication challenges faced by healthcare professionals and deaf people [1]. Sign language plays an important role in establishing visual communication technology for healthcare professionals. Sharma et al. materialized wearable sensor gloves for detecting the hand gestures of sign language, solving the social problems of the deaf-mute by bridging the communication gap with the outside world [2]. In this sensor-based approach, flex sensors were used to record the sign language and sense the environment. Soltani et al. developed a gesture-based game for the deaf-mute using a vision-based approach [3]; it identifies gesture commands and converts them into text, creating an interactive environment. Many vision-based approaches are used and easily accepted by the deaf-mute community [11]. As smartphone technology plays a vital role in creating an environment for social interaction and overcoming communication barriers, it is portable and more convenient compared with sensor or vision technology [12]. The emergency assistant “iHelp” was developed to report emergency situations. Monovoix, an Android application, behaves as an interpreter for sign language [13]. “Ear Hear” uses sign language to communicate with normal people [14]. For normal persons to interact with deaf people, text-to-speech (TTS) technology is used, where a speech signal is passed as input and the corresponding video with sign language is played for the given input, which is easier to understand [15].

3 Technology Supported 3.1 Histogram of Oriented Gradients (HOG) and Local Binary Pattern (LBP) Object detection is an important concept in computer vision and image processing [4]. A widely used feature descriptor for object detection is the histogram of oriented gradients (HOG). This technique counts the occurrences of gradient orientations in localized portions of an image. The HOG feature descriptor is similar to scale-invariant feature transform descriptors, edge orientation histograms, and shape contexts, but it is computed on a dense grid of uniformly spaced cells and uses overlapping local contrast normalization to improve accuracy [5]. In computer vision, one of the best visual descriptors used for classification is the local binary pattern (LBP). This technique, which is part of the texture spectrum model, was proposed in 1990 and clearly described in 1994. LBP is itself a powerful feature for texture classification, but when combined with the HOG feature descriptor it increases object detection performance considerably on the evaluated datasets [6]. Silva et al. studied the above techniques and compared several improvements in the field of background subtraction in 2015 [7].
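
A sketch of extracting and concatenating the two descriptors with scikit-image, assuming a grayscale input image; the parameter values are common defaults chosen for illustration, not the settings used by the authors.

```python
import numpy as np
from skimage import data, color
from skimage.feature import hog, local_binary_pattern
from skimage.util import img_as_ubyte

image = color.rgb2gray(data.astronaut())          # any grayscale image will do

# HOG: gradient-orientation histograms over a dense grid of cells
hog_features = hog(image, orientations=9, pixels_per_cell=(8, 8),
                   cells_per_block=(2, 2), block_norm='L2-Hys')

# LBP: one texture code per pixel, summarised as a histogram
lbp = local_binary_pattern(img_as_ubyte(image), P=8, R=1, method='uniform')
lbp_hist, _ = np.histogram(lbp, bins=np.arange(11), density=True)

combined = np.concatenate([hog_features, lbp_hist])  # joint HOG-LBP descriptor
print(combined.shape)
```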

3.2 Science and Technology Component The concept of this device is a novel idea, as there are not many products available in the market for the use of elderly citizens [16]. It is intended to make the device standalone using recent developments in embedded system technology [17]. The device will be based on Raspberry Pi hardware. The technology involves capturing images using a 5 MP camera under special lighting to enhance the visibility of details. The captured images will be processed with appropriate image processing algorithms to identify the required details. These details will then be converted into audio format for ease of listening.

4 Proposed System The proposed system is designed to provide two-way communication between normal people and deaf-mute people. It has four modules. The first module captures the audio and performs pre-processing to remove noise. The processed audio is fed into the audio-to-text converter, which outputs the text message. The text message can also be downloaded in different formats, such as text and PDF. The text message then acts as the input for the second module, which detects the text using the histogram of oriented gradients (HOG) algorithm and the local binary pattern (LBP).

Here, the combined HOG and LBP descriptor focuses on the structure of the text as an object: it identifies pixels belonging to edges, extracts the gradient and orientation, generates a histogram for each region, and normalizes the histograms. The reason for using the HOG-LBP combination is that HOG and LBP features have been successful in different patterns of reduction and detection [8]. The implementation is straightforward because OpenCV provides HOG and LBP algorithms, which helps to combine the HOG-LBP features. The detected text then acts as an attribute for querying the corresponding sign representation from the database; using sign-language-supported tools, all possible sign notations are stored in the database. The final part displays the extracted sign representation to the user on an output device. The application provides a way to recognize speech samples of alphabets, numbers, and commonly used sentences which are used by deaf-mute people in their daily life. 4.1 Diagrammatic Representation of the Prototype Since this research is focused on societal benefit and the ethical idea of supporting the community free of cost, the system is suitable for all devices like mobile phones, laptops, iPads, etc. The combined HOG-LBP features are expected to improve detection and increase the performance of the system. As it is used mainly with mobile phones, system efficiency is critical, and the efficient approach used in this system increases the computational speed and efficiency (Fig. 1).
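
The database lookup in the second half of the pipeline can be as simple as the sketch below, which assumes a hypothetical SQLite table signs(word, image_path) holding one stored sign representation per recognised word; the table and column names are invented for illustration.

```python
import sqlite3

def signs_for_text(db_path, detected_text):
    """Return the stored sign-representation paths for each word detected in the text."""
    conn = sqlite3.connect(db_path)
    cur = conn.cursor()
    results = []
    for word in detected_text.lower().split():
        cur.execute("SELECT image_path FROM signs WHERE word = ?", (word,))
        row = cur.fetchone()
        results.append((word, row[0] if row else None))  # None -> fall back to finger spelling
    conn.close()
    return results
```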

4.1 Diagrammatic Representation of the Prototype Since this research is focused on the societal benefits and ethical idea of supporting community for free of cost, the system is comfortable for all devices like mobile phones, laptops, ipad, etc. The combined features of HOG-LBP are expected to increase the efficient detection and increase the performance of the system. As it is used mainly with mobile phones, system efficiency is critical, and efficient approach used in this system increases the computational speed and efficiency (Fig. 1).

Capture the Audio from User

Audio

Detected Text

Extract the Sign representation for the detected text using Querying DB Fig. 1 Diagrammatic representation of the prototype

Text

Detects the Text using HOG

Display the Sign R i

5 Conclusion In today's life, timely communication is essential. People in the deaf community face many challenges, irritations, and frustrations that affect their everyday activities. This application provides two-way communication: it records the normal person's speech, converts it into sign language, and displays it to the deaf-mute person. The application also provides a platform to recognize speech samples of alphabets, numbers, and commonly used sentences which are used by deaf-mute people in their daily life. In the future, the deaf-mute person's sign-language representations will be recorded, and every sign representation will be detected, converted into a text message, and displayed to the normal person.

References 1. Sirch LS, Palese A (2017) Communication difficulties experienced by deaf male patients during their in-hospital stay: findings from a qualitative descriptive study. Scand J Caring Sci 31(2):368–377 2. Sharma MV, Kumar NV, Masaguppi SC, Mn S, Ambika DR (2013) Virtual talk for deaf, mute, blind and normal humans. In: Proceedings of the 2013 1st Texas instruments India educators’ conference, TIIEC 2013, pp 316–320 3. Soltani F, Eskanderi F, Golestan S (2012) Developing a gesture-based game for deaf/mute people using microsoft kinect. In: Proceedings of sixth international conference on complex, intelligent, and software intensive systems, pp 491–495 4. Lowe DG (2004) Distinctive image features from scale-invariant key points. Int J Comput Vision 60(2):91–110 5. Liu H (2009) Skew detection for complex document images using robust borderlines in both text and non-text regions. Pattern Recogn Lett 29:1893–1900 6. Arafat Y, Muhammad Saleem S, Afaq Hussain S (2009) Comparative analysis of invariant schemes for logo classification. In: Proceedings of the international conference on emerging technologies (ICET), pp 256–261 7. Butzke, M, Silva, AG, Hounsell, MS, Pillon, MA (2008) Automatic recognition of vehicle attribute-color classification and logo segmentation. Hifen, Urugaiana, pp 32–62 8. Mehmood Z, Anwar SM, Ali N, Habib HA, Rashid M (2016) A novel image retrieval based on a combination of local and global histograms of visual words. Math Probl Eng 2016:1–12. Article ID 8217250 9. Anagnostopoulos CNE, Anagnostopoulos IE, Psoroulas ID, Loumos V, Kayafas E (2008) ‘License plate recognition from still Images and video sequences’: a survey. IEEE Trans Intell Transp Syst 9:377–391. https://doi.org/10.1109/TITS.2008.922938 10. Zhang C, Chen X, Chen W (2006) A PCA-based vehicle classification framework. In: 22nd international conference on data engineering workshops 11. Bagarinao E, Kurita T, Higashikubo M, Inayoshi H (2009) Adapting SVM image classifiers to changes in imaging conditions using incremental SVM: an application to car detection. In: Proceedings of the 9th Asian conference on computer vision (ACCV), pp 363–372 12. Kim KK, Kim KI, Kim JB, Kim HJ (2000) Learning-based approach for license plate recognition. In: Proceeding of IEEE workshop on neural networks for signal processing, vol 2, pp 614–623 13. Mikolajczyk K, Schmid C (2005) A performance evaluation of local descriptors. IEEE Trans Pattern Anal Mach Intell 27(10):1615–1630

14. Juan L, Gwun O (2010) A comparison of SIFT, PCA-SIFT and SURF. Int J Image Process (IJIP) 3(4):143–152 15. Sivaraman S, Trivedi MM (2010) A general active-learning framework for on-road vehicle recognition and tracking. IEEE Trans Intell Transp Syst 11(2):267–276 16. Vapnik V, Golowich S, Smola A (1997) Support vector method for function approximation, regression estimation, and signal processing. In: Advances in neural information processing systems. MIT Press, Cambridge, pp 281–287 17. Shao Y, Lunetta RS (2010) Comparison of support vector machine, neural network, and CART algorithms for the land cover classification using limited training data points. The U.S. Environmental Protection Agency

Empirical Classification Accuracy Assessment of Various Classifiers for Clinical Diagnosis Datasets Sabita Khatri, Narander Kumar, and Deepak Arora

Abstract Classification is a predictive data mining task. Nowadays, it is also playing a pivotal role in the field of medical diagnostic towards early disease predictions. The aim of applying different classification techniques in diseases like cancer, diabetes, kidney infections, etc., is not to undermine the decision of doctor, but the outcomes determined from the classifiers might augment the correct treatment initiatives. The classifiers developed for medical diagnosis should be validated on reliable results to be trustworthy by doctors. In this research work, the authors attempted to assess the classification accuracy of different classifiers on datasets taken from UCI with crossvalidation. Majorly, SVM, logistic regression, ML perceptron, Naïve Bayes, fuzzy logic, k-nearest neighbours, random forest, and J48 are used for experimentation purposes. The performance measures like accuracy, RO curve, kappa statistics, MAE, RMSE, and model building time are used on WEKA. The authors have chosen datasets specifically related to liver, heart, and diabetes among widely spread most life-threatening diseases. Experimental results show that random forest demonstrated the best classification and prediction capability over other classifiers and chosen datasets. Keywords Classification · Data mining · Clinical data · WEKA

S. Khatri (B) · N. Kumar Babasaheb Bhimrao Ambedkar University, Lucknow, India e-mail: [email protected] N. Kumar e-mail: [email protected] D. Arora Amity University, Lucknow Campus, Lucknow, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_29

1 Introduction Recent studies show that data mining has become highly appropriate and important for performing different analytical studies, specifically towards medical diagnostic decisions and related analytical issues, so the use of data mining techniques in medical analysis is growing day by day. There is no doubt that the assessment of information taken from the patient and the decisions of the specialist are most vital for early disease prediction and quality treatment. For this purpose, data mining algorithms are now being used meticulously; they are capable of analysing medical information in a shorter time with a high classification rate. Among the major life-threatening diseases, heart disease has its own importance in recent medical research. Medical practitioners generate vast amounts of patient data, which is not being used effectively for early disease prediction. This unused data can be converted into datasets for modelling and analysis using different data mining techniques. Liver disease is another life-threatening disorder, which may lead to serious consequences at its later stages. Liver disease, also known as hepatic disease, affects the liver. Diagnosing different sorts of liver disease is a very hectic process, since patients need to undergo various laboratory tests; based on the different test values and diagnoses, liver disease can be classified by applying data mining classification algorithms. Diabetes mellitus is a collection of metabolic ailments in which a person faces high glucose complications; the high glucose produces various side effects and brings many inconveniences to the regular functioning of the human body. Recent research trends in data mining show its great potential towards early medical diagnosis of any disease and better follow-up of the entire treatment process. In this work, the authors have selected classifiers from different classifier families for the heart, liver, and diabetes datasets. Naïve Bayes classifiers are found to be best at analysing a large dataset with a probabilistic approach on attributes [1]. The support vector machine classifier is helpful in forming decision boundaries and separating data objects belonging to different classes [2]. k-nearest neighbours classifies unknown instances on the basis of a similarity function [3]. The multilayer perceptron is a neural structure in which each node is connected to the others [4]. J48 is again supervised learning based on the attribute values available in the training dataset [5]. In the random forest approach, different classification trees are formed on the dataset [6]. Logistic regression is a regression technique used to model the relation between dependent and independent variables [2]. J48, SVM, random forest, multilayer perceptron, and Bayesian network have been compared on liver disease datasets, where SVM gave the better performance [6]. SVM, logistic regression, and decision trees have been implemented on heart disease datasets [2]. Bayes, rule-based, and decision-tree algorithms have also been evaluated [7]. The datasets heart (Cleveland Heart Disease Dataset), liver (BUPA Dataset), and diabetes (Pima Indian Diabetes Dataset) have been selected from UCI. The authors have compared the accuracy levels and found that the random forest

classification algorithm has higher classification accuracy and prediction rate than the others on the chosen datasets. The remainder of the paper is organized as follows: Sect. 2 discusses the background and literature work; Sect. 3 provides a description of the datasets used; the results and discussion are presented in Sect. 4; and the conclusion and future work are presented in Sect. 5.

2 Background and Literature Work Recent advancements clearly show the applicability of data mining techniques in the majority of computing and business domains. Studies show that they can do wonders when applied to predicting different diseases in their early phase, thus ensuring quality treatment initiatives for patients. Through data mining techniques, automated mining of huge data repositories can be performed to identify useful hidden knowledge patterns in medical data. The literature shows the extensive research work initiated by different researchers. To predict heart disease, the applicability of association rules to medical data can easily be seen [8]. Machine learning algorithms have a proven track record of identifying different knowledge patterns automatically from patient test data, with tree-based and fuzzy-rule-based classifiers used for predicting heart diseases [9]. A similar application of C4.5 and decision trees has also been performed by Hyontai on liver disease datasets; he suggested that, on liver datasets, these classifiers are capable of producing more accurate results [5]. The combination of a genetic and artificial immune algorithm for medical diagnosis to classify liver disorders was attempted by Liang using immune-system characteristics and memory [10]. k-nearest neighbours combined with a genetic algorithm was found to be a very useful classifier, specifically for heart disease classification, by Jabbar et al. [3]. Nalini performed an evaluation of different classifiers like k-means clustering, support vector machine, and decision trees applied to heart disease classification [11]. Santosh also performed heart disease classification with Naïve Bayes and genetic classifiers [12]. Bhuvaneswari extracted features in the first stage, then applied a genetic approach to these features to select a high-ranked feature set, and later applied k-NN, decision tree, and neural network classifiers for lung disease classification [13]. Pinky applied a model for heart disease classification by defining the role of attributes like smoking, diabetes, etc. [14]. A Hadoop experimental environment was used for predicting diabetic patients by Saravana et al. in their extensive research work [15]. Suvarna also explained techniques for early diabetes classification and prediction [16].

Table 1 Description of datasets used in experimentation

Dataset                      | No. of features | Instances | Total missing values | Considered instances | Features detail
Heart disease dataset        | 14              | 303       | 6                    | 297                  | Age, sex, cp, trestbps, chol, fbs, restecg, thalach, exang, oldpeak, slope, ca, thal, num
Liver disorder               | 7               | 345       | 0                    | 345                  | Mcv, alkph, os, sgpt, sgot, gammagt, drinks, selector
Pima Indian diabetes dataset | 9               | 768       | 375                  | 393                  | Preg, pg, bp, tsft, se, bmi, dpf, age, class

3 Description of Datasets Used In this research work, the authors selected three disease datasets, namely heart, diabetes, and liver. Table 1 shows a brief description of the datasets. Eight different classifiers, namely SVM, Naïve Bayes, multilayer perceptron, k-NN, logistic regression, J48, FURIA, and random forest, have been chosen for this study, and each classifier's performance has been evaluated using the chosen performance measures.
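
The experiments themselves were run in WEKA, but the same protocol can be sketched in scikit-learn as below: tenfold cross-validation over comparable classifier families, reporting accuracy and Cohen's kappa. The file path and class column name are illustrative, the attributes follow Table 1, and FURIA is omitted because it has no direct scikit-learn counterpart.

```python
import pandas as pd
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

data = pd.read_csv("pima_diabetes.csv")          # illustrative path
X, y = data.drop(columns="class"), data["class"]

classifiers = {
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Multilayer perceptron": MLPClassifier(max_iter=1000),
    "k-NN": KNeighborsClassifier(),
    "J48 (decision tree)": DecisionTreeClassifier(),
    "Random forest": RandomForestClassifier(),
}

for name, clf in classifiers.items():
    pred = cross_val_predict(clf, X, y, cv=10)   # tenfold cross-validation
    print(f"{name:22s} accuracy={accuracy_score(y, pred):.4f} "
          f"kappa={cohen_kappa_score(y, pred):.4f}")
```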

4 Results and Discussion During experimentation, the performance measure values of all selected classifiers are found to be very close to each other when evaluated using tenfold cross-validation, as depicted in Tables 2, 3 and 4. The results show that random forest has good accuracy along with moderate kappa values, a higher ROC value, and MAE and RMSE on the lower side, as shown in Figs. 1, 2 and 3. For the heart dataset, the kappa value of 0.6867 in Table 2 is highest for logistic regression, whereas the RO curve shows 0.907 for random forest, as depicted in Fig. 1b. The MAE is lowest for the multilayer perceptron on the heart disease dataset, whereas the RMSE is lowest for RF, as per Fig. 1c. The other performance measures are also better for RF, as depicted in Fig. 1d. Similarly, Table 3 presents the higher accuracy of random forest with high kappa and RO curve values for the liver dataset, as depicted in Fig. 2a, b. In Fig. 2c, FURIA shows the lowest MAE values, whereas the RMSE values of random forest are on the lower side. Figure 2d depicts the classification performance measures for the liver dataset. For the diabetes dataset, the results are depicted in Table 4 and Fig. 3a, b, respectively.

Table 2 Classifier accuracy and performance measures (heart dataset)

Classifier            | Accuracy | Kappa statistics | MAE    | RMSE   | ROC   | Precision | F-measure | Recall
Naïve Bayes           | 83.165   | 0.6602           | 0.1863 | 0.3638 | 0.904 | 0.832     | 0.831     | 0.832
SVM                   | 56.229   | 0.0591           | 0.4377 | 0.6616 | 0.528 | 0.636     | 0.447     | 0.562
Logistic regression   | 84.5118  | 0.6867           | 0.2188 | 0.3456 | 0.902 | 0.846     | 0.845     | 0.845
Multilayer perceptron | 81.8182  | 0.6346           | 0.1838 | 0.3979 | 0.893 | 0.818     | 0.818     | 0.818
k-NN                  | 77.4411  | 0.5449           | 0.2276 | 0.4732 | 0.766 | 0.774     | 0.774     | 0.774
FURIA                 | 81.8182  | 0.6315           | 0.1927 | 0.3915 | 0.856 | 0.82      | 0.817     | 0.818
J48                   | 74.7475  | 0.49             | 0.2922 | 0.4677 | 0.74  | 0.747     | 0.747     | 0.747
Random forest         | 82.4916  | 0.6462           | 0.2661 | 0.3513 | 0.907 | 0.825     | 0.824     | 0.825

They show that the binary tree classifier J48 performs better and has the highest accuracy values, while kappa and ROC are highest for random forest. In Fig. 3c, the MAE is lowest for FURIA, whereas the RMSE is lowest for random forest. Figure 3d depicts all major performance measures. After comparing the proposed research work with the existing literature, it is found that most authors have considered the dataset of a particular disease only. This research work is intended to identify the best possible classifier family with the greatest classification accuracy on the selected disease datasets. It is found that the J48 and RF classifiers have good accuracy levels when compared to the others across all major families of classifiers.

5 Conclusion and Future Remarks The present work reveals that tree-based classifiers perform well on the chosen datasets from the UCI repository. Despite slightly higher MAE and RMSE values, J48 and RF both demonstrate better ROC trends and class prediction accuracy, and the kappa values of random forest again show better outcomes. Logistic regression indicates better prediction proficiency than Naïve Bayes. The experimentation performed in this research work clearly shows that decision-tree-based classifiers can be considered the best classifiers for disease datasets. This research work can be extended towards hybrid algorithms based on the decision tree approach combined with attribute reduction after identifying the attributes most highly correlated with the class.

Fig. 1 Classifier performance measures (heart dataset). a Classification accuracy values. b ROC versus kappa values. c RMSE versus MAE values. d Accuracy measure values

Fig. 2 Classifier performance measures (liver dataset). a Accuracy values. b ROC versus kappa values. c RMSE versus MAE values. d Accuracy measure values

Fig. 3 Classifier performance measures (diabetes dataset). a Accuracy values. b ROC versus kappa values. c RMSE versus MAE values d Accuracy measure values

Table 3 Classifier accuracy and performance measures (liver dataset)

Classifier            | Accuracy | Kappa statistics | MAE    | RMSE   | ROC   | Precision | F-measure | Recall
Naïve Bayes           | 55.3623  | 0.153            | 0.4597 | 0.5083 | 0.640 | 0.609     | 0.544     | 0.554
SVM                   | 59.4203  | 0.0419           | 0.4058 | 0.637  | 0.518 | 0.702     | 0.462     | 0.594
Logistic regression   | 68.1159  | 0.3291           | 0.4151 | 0.4584 | 0.718 | 0.677     | 0.675     | 0.681
Multilayer perceptron | 71.5942  | 0.4023           | 0.3543 | 0.4523 | 0.742 | 0.714     | 0.711     | 0.716
k-NN                  | 62.8986  | 0.2401           | 0.3718 | 0.6072 | 0.630 | 0.630     | 0.629     | 0.629
FURIA                 | 65.5072  | 0.2819           | 0.3296 | 0.5169 | 0.700 | 0.651     | 0.655     | 0.652
J48                   | 68.6957  | 0.3401           | 0.3673 | 0.5025 | 0.665 | 0.683     | 0.687     | 0.68
Random forest         | 73.3333  | 0.4411           | 0.3823 | 0.4345 | 0.764 | 0.732     | 0.733     | 0.729

Table 4 Classifier accuracy and performance measures (diabetes dataset)

Classifier            | Accuracy | Kappa statistics | MAE    | RMSE   | ROC   | Precision | F-measure | Recall
Naïve Bayes           | 77.8061  | 0.4964           | 0.2566 | 0.4299 | 0.830 | 0.777     | 0.777     | 0.778
SVM                   | 66.8367  | 0                | 0.3316 | 0.5759 | 0.500 | 0.447     | 0.536     | 0.668
Logistic regression   | 77.8061  | 0.4739           | 0.2916 | 0.391  | 0.839 | 0.772     | 0.771     | 0.778
Multilayer perceptron | 74.2347  | 0.399            | 0.2744 | 0.4447 | 0.793 | 0.735     | 0.737     | 0.742
k-NN                  | 73.4694  | 0.3873           | 0.2666 | 0.5136 | 0.679 | 0.729     | 0.731     | 0.735
FURIA                 | 77.8061  | 0.4844           | 0.2221 | 0.4407 | 0.781 | 0.773     | 0.774     | 0.778
J48                   | 79.3367  | 0.547            | 0.2687 | 0.4148 | 0.774 | 0.801     | 0.796     | 0.793
Random forest         | 76.7857  | 0.4628           | 0.29   | 0.3884 | 0.841 | 0.763     | 0.765     | 0.768

References 1. Inza I, Merino M, Quiroga J, Larranaga P (2005) Feature selection in bayesian classifier for the prognosis of survival of cirrhotic patients treated with tips. J Biomed Inf 38(5):376–388 2. Mythili T, Mukherji D, Padalia N, Naidu A (2013) A heart disease prediction model using SVM, decision tree, logistic regression. Int J Comput Appl 68(16):11–15 3. Akhiljabbar M, Deekshatula BL, Chandra P (2013) Classification of heart disease using knearest neighbor and genetic algorithm. In: First international conference on computational intelligence: modeling techniques and applications, vol 10, pp 85–94 4. Ahmad F, Isa NA, Hussain Z, Osman MK (2013) Intelligent medical disease diagnosis using improved hybrid genetic algorithm—multilayer perceptron network. J Med Syst 37(2):9934

5. Sug H (2012) Better decision tree induction for limited data sets of liver disease. In: International conference on future generation information conference, FGIT 2012. Gangneug, pp 88–93 6. Gulia A, Vohra R, Rani P (2014) Liver patient classification using intelligent techniques. Int J Comput Sci Inf Technol 5(4):5110–5115 7. Srikanth P, Deverapalli D (2016) A critical study of classification algorithms using diabetes diagnosis. In: proc. of IEEE 6th International Conference on Advanced Computing (IACC), Bhimavaram, 245–249 8. Jabbar MA, Deekshatulu BL, Chandra P (2012) An evolutionary algorithm for heart disease prediction. In: 6th international conference on information processing. ICIP 2012, pp 378–389 9. Anooj PK (2012) Clinical decision support system: risk level prediction of heart disease using weighted fuzzy rules. J King Saud Univ Comput Inf Sci 24(1):27–40 10. Liang C, Peng L (2013) An automated diagnosis system of liver disease using artificial immune and genetic algorithms. J Med Syst 37(2):9932 11. Priya GN, Kannan A, Anandhakumar P (2013) An efficient classification analysis for multivariate coronary artery disease data patterns using distinguished classifier techniques. In: Fourth international conference on signal and image processing, vol 2, pp 385–394 12. Kumar S, Sahoo G (2014) Classification of heart using Naïve Bayes and genetic algorithm. In: International conference on CIDM, vol 2, pp 269–282 13. Bhuvaneswari C, Aruna P, Loganathan D (2014) A new fusion model for classification of the lung disease using genetic algorithm. Egypt Inf J 15:69–77 14. Bajaj P, Choudhary K, Chauhan R (2015) Prediction of occurrence of heart disease and its dependability on RCT using data mining techniques. In: Second international conference India 2015, vol 2, pp 851–858 15. Saravana Kumar NM, Eswari T, Sampath P, Lavanya S (2015) Predictive methodology for diabetic data analysis in big data. In: 2nd international symposium on big data and cloud computing, vol 50, pp 203–208 16. Pawar S, Sikchi S (2016) An extensive survey on diagnosis of diabetes mellitus in healthcare. In: International conference on data engineering and communication and technology, vol 1, pp 97–104

Comparison of Transform-Based and Transform-Free Analytical Models Having Finite Buffer Size in Non-saturated IEEE 802.11 DCF Networks Mukta and Neeraj Gupta Abstract Many analytical models have already been proposed for assessing the functioning of IEEE 802.11 DCF networks under various load conditions. Most of the models restricted their study to either a small buffer or an infinite buffer size for evaluation purposes. Such models fail to capture the influence of dynamic storage areas on various performance parameters, and extending them to an arbitrary buffer size is a challenging task. A few authors accepted this challenge and developed analytical models for characterizing the influence of varying buffer length on the performance parameters of unsaturated systems. Most of them relied on complex mathematical concepts of the transform-based approach, which suffers in computation speed. In order to accelerate the computation process, authors proposed using a simple and flexible transform-free method for evaluating the output parameters with finite buffer size. This paper studies and compares the transform-based and transform-free analytical models developed for unsaturated IEEE 802.11 DCF networks with arbitrary buffer capacity. The comparison is done in terms of modeling accuracy, time complexity, and computation speed. The performance descriptors are evaluated by employing a non-saturated renewal model and considering individual stations as M/G/1/K queuing systems. Results show that non-saturated models utilizing the transform-free approach provide higher computation speed than those using the transform-based approach for arbitrary buffer size. Keywords IEEE 802.11 · DCF · Analytical model · Fixed-point analysis · Queuing theory · Non-saturated network · Buffer size

Mukta (B) · N. Gupta K. R. Mangalam University, Gurgaon, India e-mail: [email protected] N. Gupta e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_30

1 Introduction The IEEE 802.11 standard is the de facto standard for wireless local area networks (WLANs). Its fundamental medium access control (MAC) scheme is the distributed coordination function (DCF), a contention-based MAC access scheme utilizing the carrier-sense multiple access with collision avoidance (CSMA/CA) protocol. To explain the performance of a WLAN, an examination of the DCF MAC access scheme is necessary, and in the recent past this topic has gained a lot of attention in the research community. Researchers have developed many analytical models to analyze the performance of DCF; depending on the load offered to the network, these models can be differentiated into saturated and non-saturated models. The saturation assumption deals with the situation in which a node's buffer always has a packet available whenever the station wants to transmit and the channel is found idle. The authors in [1] presented a preliminary mathematical model to precisely analyze the functioning of DCF under the saturated condition, modeling the backoff process by a bi-dimensional Markov chain. The work in [2, 3] extends [1] by integrating the short retry limit after which a packet is discarded, as per the IEEE 802.11 standard [4]. All these models are based on the Markov chain of [1], which generates unnecessary complexity due to the large number of state transitions. The authors in [5] address this issue and propose a non-Markovian analytical model using fixed-point analysis under the saturated condition; this model presents a simplified and generalized solution for DCF performance evaluation in saturated network conditions with respect to the Markov chain model of [1]. However, the saturation assumption is not realistic, as it ignores the dynamic buffer essential to cater for packet bursts. This motivates researchers to develop analytical models for the non-saturated condition for estimating DCF network performance. To model such networks, an additional (post-backoff) state is added to the saturated settings of the Markov model of [1] to represent the empty buffer [6, 7]. In order to remove the complexity associated with the Markov chain, the authors in [8] extended the model in [5] to the non-saturated case by multiplying the saturated transmission probability with the probability of a non-empty buffer using fixed-point analysis. The buffer plays an important role in non-saturated networks: buffers are used to hold short-term packet bursts and maintain high link efficiency. The key parameter that needs to be evaluated while designing such models is the packet service time; once the packet service time is determined, the other metrics like collision probability, transmission attempt rate, and system throughput can easily be evaluated. Most of the previous works assume an infinitely large queue length or a negligible buffer size in their investigation [7, 9, 10]. Although some models attempted to integrate a buffer with finite length, they performed their analysis based on a fixed buffer size [11–13]. The authors in [14] extend the work in [8] to arbitrary buffer size using queuing theory and non-saturated fixed-point analysis for analyzing the performance of DCF; the probability generating function (PGF) and a transform-based approach are utilized to compute the packet service time at the MAC layer.
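
For concreteness, the following is a minimal sketch of the saturated fixed-point coupling that these renewal-theory models build on: a Bianchi-style attempt probability τ expressed as a function of the collision probability p, closed through p = 1 − (1 − τ)^(n−1) and solved by simple iteration. The CWmin, backoff-stage, and station-count values are illustrative, and the unsaturated extension additionally scales τ by the probability of a non-empty buffer.

```python
def saturated_fixed_point(n, W=32, m=5, tol=1e-9, max_iter=10000):
    """Iterate tau = f(p) and p = 1 - (1 - tau)^(n-1) to a fixed point.
    n: number of contending stations, W: minimum contention window, m: max backoff stage."""
    p = 0.1
    for _ in range(max_iter):
        tau = (2 * (1 - 2 * p)) / ((1 - 2 * p) * (W + 1) + p * W * (1 - (2 * p) ** m))
        p_new = 1 - (1 - tau) ** (n - 1)
        if abs(p_new - p) < tol:
            break
        p = p_new
    return tau, p

print(saturated_fixed_point(n=10))   # attempt rate and collision probability
```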


This makes the mathematical process complex and computationally intensive. The computation can be accelerated further by using the Lattice-Poisson algorithm given in [15]. Reference [16] presented a simplified and flexible solution for modeling unsaturated networks with arbitrary buffer length: simple closed-form expressions are given to compute the probability distribution of the service time and the probability of an empty buffer using a discrete M/G/1/K queuing model and fixed-point analysis.

In the current paper, we present a comparative study of the analytical models (transform-based and transform-free) given by [14], modified using [15], and [16], for random queue lengths under the unsaturated condition. The comparison covers modeling correctness, time complexity, and execution speed. We consider throughput and collision probability as performance descriptors, computed by employing the M/G/1/K queuing system and a fixed-point formulation based on renewal theory. The paper is organized as follows: Section 2 discusses the analytical models and their assumptions. Section 3 presents the mathematical analysis of the models for the service time distribution and the probability of an empty buffer. Section 4 discusses the comparative analysis and the simulation results obtained from these models in terms of throughput and collision probability at different buffer sizes. Section 5 finally concludes the paper.

2 Analytical Models

Analytical models provide an effective way to estimate the results in case of any change in network parameters. Buffers play a significant role in wireless network functioning: an inadequate incoming packet buffer can decrease the channel usage rate and thereby waste valuable resources, whereas a very large buffer contributes to increased system delay. The prevailing literature is primarily focused on either negligible or infinite buffers [7, 9, 10], but this assumption is not very realistic or useful. Further work for non-saturated networks with a general buffer size can be found in [11–13], but those models used a predetermined buffer size for evaluation purposes, and all of them are based on a complex Markov chain. The work in [14, 16] simplified the analysis by deploying a fixed-point formulation using renewal theory. Each of these models has its own set of assumptions and terminologies, as follows.

Reference [14] improved the earlier work in [8] by employing an arbitrary buffer size and thus provided a mathematical IEEE 802.11 DCF network model to study the system performance. The main assumptions of the model are: (i) the collision probability of a station does not depend on the result of previous transmission attempts and remains the same until there is a change in the network topology; (ii) the network conditions are assumed to be ideal, so no factor other than collision contributes to transmission failure; (iii) the nodes are within transmission range of each other and thus form a single-cell network; (iv) the model operates in homogeneous network conditions with a fixed number of nodes; (v) each individual node is assumed to work as an M/G/1/K queuing system, with K being the finite incoming buffer size;


and (vi) the existence of any hidden terminal and the capture effect are excluded while designing the network. The network time is split into discrete time slots. On each idle time slot, the backoff counter decrements. A station can attempt a transmission only after its backoff counter reaches zero. A packet transmission is considered finished if it is successfully received at the receiver or the retry limit gets exhausted. The time taken by a packet from its head-of-line position until its transmission is finished is called the MAC service time. In the empty-buffer condition, with no next packet immediately available for transmission, the station goes to the idle state and its backoff counter is reinitialized corresponding to the minimum contention window size. The use of a probability generating function (PGF) and a numerical inverse transform to compute the packet service time makes this model quite complex and computationally intensive. Reference [14] presented another, simpler approach based on the idea that the probability of a finite queue length (Pk) can be obtained from the infinite queue length model. This improves the time complexity of the model, but the computation can be improved further by utilizing the Lattice-Poisson algorithm given by [15], whose mathematical analysis is given in the next section.

The authors in [16] proposed a transform-free analytical model, based on [17], for analyzing the performance of IEEE 802.11 DCF with an arbitrary buffer size. The model utilizes simple, closed-form expressions for determining the packet service time. Keeping all other assumptions the same as in [14], the model in [16] is much faster than the previous work. The performance parameters are calculated by deploying the M/G/1/K queue and a fixed-point formulation for non-saturated network conditions. Under the stated network assumptions, the above models (transform-free and transform-based) provide extremely accurate outcomes. A comprehensive mathematical analysis of these seminal works is presented in the following section.

3 Mathematical Analysis

As per the IEEE 802.11 specifications, if a station's buffer is non-empty (saturated) and it wants to transmit a packet, it should wait for a DIFS period before beginning its transmission. The station can commence its transmission if the channel remains idle for the DIFS period. If the medium is sensed busy because of another ongoing transmission, the station postpones its communication for an arbitrary period chosen in [0, CW0 − 1], known as the backoff time, where CW0 is the minimum contention window (CW) size. Once the ongoing transmission finishes, the backoff counter decrements on each idle time slot. If the counter value is zero and no channel activity is sensed, the station starts its transmission. Successful delivery is acknowledged by the receiver in the form of an ACK packet. In the absence of any acknowledgement within the specified duration (ACK timeout), the packet is considered lost or collided.


For each unsuccessful transmission, the contention window is doubled, up to its maximum value (CWmax), and then remains the same until the retransmission limit is reached. Once the retry limit is exhausted, the packet is abandoned. If the station buffer has no packet (non-saturation), the station goes into the initial backoff stage. If this stage completes and a new packet arrives in the buffer, the station immediately attempts a transmission after sensing the channel idle for a DIFS interval; otherwise, the station retains its idle position. If the channel is found busy, the size of CW is doubled and the above process is repeated.

Let there be a total of n stations in the network. Under ideal channel assumptions, where collision is the only reason for transmission failure, a transmission is successful only if a single station is transmitting in the network. Thus,

γ = Γ(β) = 1 − (1 − β)^{n−1}    (1)

Here γ and β denote the collision probability and the attempt rate for non-saturated networks. As per [8], the transmission probability of a node in an unsaturated network (β) is approximated by multiplying the saturated transmission probability by the probability that at least one packet is available in the buffer, i.e., (1 − P0). Thus, we have:

β = (1 − P0) β_c    (2)

In order to find the roots of (1) and (2), P0 and β_c must be obtained first. The value of the saturated attempt rate β_c can be obtained using (3) [5], which expresses the expected number of transmission attempts divided by the expected number of backoff slots required until a packet's transmission is finished:

β_c(γ) = \frac{1 + γ + γ^{2} + \cdots + γ^{M−1}}{b_0 + γ b_1 + γ^{2} b_2 + \cdots + γ^{M−1} b_{M−1}}    (3)
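As an illustration of how (1)–(3) are solved in practice, the following Python sketch iterates the fixed point for γ. It is only a minimal sketch under stated assumptions: the per-stage mean backoffs are taken as b_k = (CW_k − 1)/2 (uniform backoff over [0, CW_k − 1]), and the empty-buffer probability P0 is passed in as a constant, whereas in the full model P0 itself depends on γ through the queue analysis of Sects. 3.1–3.4.

```python
def beta_sat(gamma, M=7, m=5, cw0=32):
    """Saturated attempt rate beta_c(gamma) of Eq. (3).

    b_k = (CW_k - 1) / 2 is the mean backoff of stage k, with
    CW_k = 2^k * CW0 capped at stage m (Eq. (7))."""
    b = [(min(2 ** k, 2 ** m) * cw0 - 1) / 2.0 for k in range(M)]
    num = sum(gamma ** k for k in range(M))
    den = sum(gamma ** k * b[k] for k in range(M))
    return num / den


def solve_gamma(n, p0, M=7, m=5, cw0=32, iters=500):
    """Damped fixed-point iteration on Eqs. (1)-(2):
    gamma = 1 - (1 - beta)^(n-1), beta = (1 - P0) * beta_c(gamma).
    P0 is treated here as a known constant."""
    gamma = 0.1                      # initial guess
    for _ in range(iters):
        beta = (1.0 - p0) * beta_sat(gamma, M, m, cw0)
        gamma = 0.5 * gamma + 0.5 * (1.0 - (1.0 - beta) ** (n - 1))
    return gamma


if __name__ == "__main__":
    # e.g. 24 stations; P0 = 0 reproduces the saturated operating point
    print(solve_gamma(n=24, p0=0.0))
```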

3.1 Calculation of Packet Service Time and Probability of Empty Buffer (P0)

Each station is modeled as an M/G/1/K queuing system with finite queue length K to hold the packets waiting for transmission. Let λ represent the packet arrival rate, assuming Poissonian traffic. Let Y^c denote the packet service time, and let X and Θ, respectively, represent the backoff decrement process variable and the slot decrement process variable. Since X backoff slots need to be decremented before the transmission of a packet is considered finished, the packet service time is written as:

Y^c = \sum_{i=1}^{X} Θ_i    (4)

A packet can be attempted over at most M backoff stages before its transmission is declared finished. Thus, X, representing the total number of backoff slots spent over the backoff stages, depends on the collisions faced by the packet (γ) and is represented as:

X = \sum_{k=0}^{j} η_k  with probability δ(γ, j),  0 ≤ j ≤ M − 1    (5)

where

δ(γ, j) = \begin{cases} (1 − γ) γ^{j}, & j = 0, \dots, M − 2 \\ γ^{M−1}, & j = M − 1 \end{cases}    (6)

The variable η_k is uniformly distributed in [0, CW_k − 1] with average backoff b_k, where

CW_k = \begin{cases} 2^{k} \cdot CW_0, & 0 ≤ k < m \\ 2^{m} \cdot CW_0, & m ≤ k ≤ M − 1 \end{cases}    (7)

Here m is the backoff stage at which the upper bound of CW is reached. The length of each backoff slot Θ depends on the channel activity. On an idle channel, the counter decrements after an idle slot time σ. However, if channel activity is sensed (a successful transmission of duration T_s or an unsuccessful transmission of duration T_s̄), the backoff decrement is paused for that duration and proceeds at the next idle slot. This is represented as:

Θ(γ) = \begin{cases} σ, & \text{with probability } (1 − P_b) \\ T_s + σ, & \text{with probability } P_s \\ T_{s̄} + σ, & \text{with probability } P_{s̄} \end{cases}    (8)

where P_b, P_s, and P_s̄ are given as:

P_b = 1 − (1 − β_c)^{n} = 1 − (1 − γ)^{\frac{n}{n−1}}    (9)

P_s = n β_c (1 − β_c)^{n−1} = n \left(1 − (1 − γ)^{\frac{1}{n−1}}\right)(1 − γ)    (10)

P_{s̄} = P_b − P_s    (11)
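To make the structure of (5)–(7) concrete, the short Python sketch below draws samples of X and checks their mean against (13). It is an illustrative sketch under stated assumptions: each η_k is drawn uniformly from {0, …, CW_k − 1}, the stage index j is chosen with probability δ(γ, j), and the values CW_0 = 32, m = 5, M = 7 are taken from the experimental settings used later.

```python
import random

def sample_backoff_slots(gamma, M=7, m=5, cw0=32):
    """Draw one realization of X, the total number of backoff slots (Eqs. (5)-(7))."""
    # pick the number of backoff stages j according to delta(gamma, j), Eq. (6)
    probs = [(1 - gamma) * gamma ** j for j in range(M - 1)] + [gamma ** (M - 1)]
    j = random.choices(range(M), weights=probs)[0]
    # sum a uniform backoff eta_k over stages 0..j, with CW_k capped at stage m, Eq. (7)
    return sum(random.randrange(min(2 ** k, 2 ** m) * cw0) for k in range(j + 1))

# crude check of the empirical mean against Eq. (13): X_bar = sum_k gamma^k * b_k
samples = [sample_backoff_slots(0.2) for _ in range(20000)]
print(sum(samples) / len(samples))
```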

Let the mean values of X, Θ, and Y^c be denoted as X̄, Θ̄, and Ȳ^c. From (4) and [18], we have

Ȳ^c(γ) = X̄(γ) · Θ̄(γ)    (12)

In (12), X̄(γ) and Θ̄(γ) are calculated as:

X̄(γ) = b_0 + γ b_1 + γ^{2} b_2 + \cdots + γ^{M−1} b_{M−1}    (13)

Θ̄(γ) = σ + P_s · T_s + P_{s̄} · T_{s̄}    (14)
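A small Python sketch of (9)–(14), computing the mean slot duration and the mean MAC service time for a given collision probability, is given below. It is only a sketch under stated assumptions: durations are expressed in units of the idle slot (σ = 1), T_s = T_s̄ = 48 slots as in the experimental settings used later, and b_k = (CW_k − 1)/2 as in the earlier sketch.

```python
def mean_service_time(gamma, n, M=7, m=5, cw0=32, sigma=1.0, ts=48.0, ts_fail=48.0):
    """Mean MAC service time Y_bar = X_bar(gamma) * Theta_bar(gamma), Eqs. (9)-(14).

    All durations are in idle-slot units (sigma = 1)."""
    # channel activity probabilities, Eqs. (9)-(11)
    pb = 1.0 - (1.0 - gamma) ** (n / (n - 1.0))
    ps = n * (1.0 - (1.0 - gamma) ** (1.0 / (n - 1.0))) * (1.0 - gamma)
    ps_fail = pb - ps
    # mean backoff-slot duration, Eq. (14)
    theta_bar = sigma + ps * ts + ps_fail * ts_fail
    # mean number of backoff slots, Eq. (13), with b_k = (CW_k - 1)/2
    b = [(min(2 ** k, 2 ** m) * cw0 - 1) / 2.0 for k in range(M)]
    x_bar = sum(gamma ** k * b[k] for k in range(M))
    return x_bar * theta_bar   # Eq. (12)

print(mean_service_time(gamma=0.2, n=24))
```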

From [18], the PGFs of the random variables X and Θ, denoted X̂(γ, z) and Θ̂(γ, z), respectively, are given as:

X̂(γ, z) = \sum_{i=0}^{M−1} δ(γ, i) \prod_{k=0}^{i} η̂_k(z)    (15)

Θ̂(γ, z) = (1 − P_b) z^{σ} + P_s · z^{T_s + σ} + P_{s̄} · z^{T_{s̄} + σ}    (16)

η̂_k(z) = \begin{cases} \frac{1}{CW_k} \cdot \frac{1 − z^{CW_k}}{1 − z}, & k = 0, \dots, m − 1 \\ \frac{1}{CW_m} \cdot \frac{1 − z^{CW_m}}{1 − z}, & k = m, \dots, M − 1 \end{cases}    (17)

In queuing theory, the traffic intensity ρ, which depends on the arrival rate λ and the mean packet service time Ȳ^c, is given as:

ρ = λ · Ȳ^c(γ)    (18)

where λ represents the arrival rate and Ȳ^c is given in (12). Since X and Θ depend on γ, using (4) and [14], the probability generating function of Y^c is calculated as:

Ŷ^c(γ, z) = X̂\left(γ, Θ̂(γ, z)\right)    (19)

3.2 Computation of P0 and Pk Using the M/G/1/K Queuing System [14]

Let Pk denote the steady-state probability of k packets being present in a queue of maximum permissible length K at any instant of time. To calculate Pk and P0, the authors in [14] utilized two methods. The first method is based on a numerical transform inversion algorithm, where Pk is obtained, using the service-time PGF (19), in the M/G/1/K queuing model as:

P_k = \frac{P'_k}{P'_0 + P'_1 + \cdots + P'_K}, \quad k = 0, 1, \dots, K    (20)

The P'_k are given as:

P'_k = \begin{cases} 1, & k = 0 \\ \frac{λ a_{k−1} + λ \sum_{j=1}^{k−1} P'_j a_{k−j}}{1 − λ a_0}, & k = 1, \dots, K − 1 \\ ρ − (1 − ρ) \sum_{j=1}^{K−1} P'_j, & k = K \end{cases}    (21)

P0 can be calculated by solving the recursive Eqs. (20) and (21), where a_k is obtained by inverting â using the Lattice-Poisson inversion algorithm:

a_k = \begin{cases} â(λ, 0), & k = 0 \\ \frac{1}{2 k l r^{k}} \sum_{j=−kl}^{kl−1} â\left(λ, r e^{−iπ j/(kl)}\right) e^{iπ j/l}, & k = 1, \dots, K − 1 \end{cases}    (22)

where the generating function â is given as:

â(λ, z) = \frac{1 − e^{−(1−z)λ} \, Ŷ^{c}\left(γ, e^{−(1−z)λ}\right)}{1 − e^{−(1−z)λ}}    (23)

Although the numerical inversion transform algorithm is quite effective and accurate, it is computationally complex and intensive. Moreover, the time complexity of the operations is O(K²) due to the recursive equations given in (20) and (21). To reduce the time complexity in the M/G/1/K queuing system, the authors in [14] presented a second method, built on the M/G/1/∞ model, for the calculation of P0 when ρ < 1. The key idea is to obtain the probability distribution of the finite queue length (Pk) from that of the infinite queue length model. Thus, the first step is to calculate the queue length distribution for infinite buffer capacity, which is given as:

π_k^{∞} = \begin{cases} 1 − ρ, & k = 0 \\ \frac{a_{k−1} π_0^{∞} + \sum_{j=1}^{k−1} a_{k−j} π_j^{∞}}{1 − a_0}, & k = 1, \dots, K − 1 \end{cases}    (24)

From [18], the blocking probability is given as:

p_k = \begin{cases} \frac{π_k^{∞}}{1 − ρ q_K}, & k = 0, 1, \dots, K − 1 \\ \frac{(1 − ρ) q_K}{1 − ρ q_K}, & k = K \end{cases}    (25)

Using (25), P0 is given as:

P0 =

1−ρ 1 − qk ρ

401

(26)

To solve for P0 , we need to calculate ρ and qk which can be obtain using (18) and (27), respectively. The value of qk is obtained using Lattice-Poisson algorithm as: K l−1    1 qk = qˆ λ, r e−iπ j/(kl) eiπ j/l 2K lr K j=−K l

(27)



where q (λ, z) is the probability generating function for the sequence qk , k = 0, 1… and can be written as:

1 − zπk∞ (λ, z) q(λ, ˆ z) = 1−z

(28)

where π ∞ is the PGF of π ∞ and expressed as:

πˆ



  (1 − ρ)(1 − z)Yˆ c γ , e−(1−z)λ  πˆ (λ, z) =   Yˆ c γ , e−(1−z)λ − z ∞

(29)

The computation done using above methods is complex as equations utilized are recursive in nature. The computation process can further be speed up by using Lattice-Poisson numerical inversion algorithm and Euler summation [19].

3.3 Computation of P0 Using Lattice-Poisson Algorithm Given by [15] To further improve the computation process, we employed a Lattice-Poisson numerical inversion algorithm to compute the value of P0 as given in (26). The value of P0 is dependent on ρ and qk . Here, we are revising the expression for qk only by using (30) given by [15]: ⎫ ⎧ k−1 ⎬ ⎨     1 qk = ˆ −r ) + 2 q(λ, ˆ r ) + (−1)k q(λ, (−1) j Re qˆ λ, r e−iπ j/(kl) K ⎭ 2K r ⎩ j=1

(30) where the value of ‘r’ can be obtained using (31) which governs the error bound of the calculation and K represents the buffer size. Reference [15] suggested and proved that constant parameter E = 8 provides an accuracy of 10–8 .

402

Mukta and N. Gupta

Table 1 Assessment of transform-based and without transform methods using queuing system

Fixed-point equation

[14] M/G/1/K

Modified using [15] M/G/1/K

[16] M/G/1/K

Renewal theory considering M/G/1/∞ analysis adapted for finite queue size K

[14] M/G/1/∞ analysis Renewal theory using modified using [17] Lattice-Poisson algorithm of [15]

Numerical Numerical transform/transform-free inversion method transform

Numerical inversion transform

Transform-free approach

Precision

Correct if traffic intensity ρ < 1

Correct if traffic intensity ρ < 1

Always correct

Time complexity

O(K)

O(K)

O(K)

Processing time (in seconds) for number of nodes 24 with Poisson traffic arrival L = 51.5625 packet/s Scenario-1: K = 2

16.773 s

Scenario-2: K = 6 Scenario-3: K = 20

8.356 s

3.331 s

40.438 s

20.678 s

3.272 s

144.17 s

67.374 s

3.312 s

r = 10^{−\frac{E}{2K}}    (31)

Here “Re” denotes the real part of the function q̂(λ, z) obtained using (28). Substituting the values of ρ and q_K from (18) and (30) into (26) gives the value of P0 with time complexity O(K). The use of the Lattice-Poisson algorithm improves the computation speed, as can be seen in Table 1.
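To illustrate the inversion step of (30)–(31), the following Python sketch implements the Abate-Whitt Lattice-Poisson formula for recovering the K-th coefficient of a probability generating function. As a self-contained check it is applied to the PGF of a Poisson distribution, whose coefficients are known in closed form; in the model above the same routine would be applied to q̂(λ, z) from (28), which is an assumption about how one would wire it in rather than a reproduction of the authors' code.

```python
import cmath
import math

def lattice_poisson_coeff(pgf, K, E=8):
    """K-th coefficient of a PGF via the Lattice-Poisson formula, Eqs. (30)-(31).

    r = 10^(-E/(2K)) bounds the aliasing error by roughly 10^(-E)."""
    r = 10.0 ** (-E / (2.0 * K))
    total = pgf(r) + ((-1) ** K) * pgf(-r)
    total += 2.0 * sum(
        ((-1) ** j) * (pgf(r * cmath.exp(-1j * math.pi * j / K))).real
        for j in range(1, K)
    )
    return total.real / (2.0 * K * r ** K)

# sanity check on a known PGF: Poisson(mu) has pgf exp(mu * (z - 1))
mu, K = 1.0, 6
approx = lattice_poisson_coeff(lambda z: cmath.exp(mu * (z - 1.0)), K)
exact = math.exp(-mu) * mu ** K / math.factorial(K)
print(approx, exact)
```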

3.4 Computation of P0 Using the Transform-Free Approach [16]

The authors in [16] presented a simplified analytical approach, under non-saturated operation, to assess the functioning of a wireless DCF network with arbitrary queue length. The probability distribution of the service time and the other key parameters are computed using a transform-free approach in an accurate and fast manner. The model utilizes the M/G/1/K queuing system and a fixed-point formulation using renewal theory for evaluating the performance descriptors. The use of simple, closed-form expressions for the probability of an idle queue and of a full queue makes the computation faster and simpler. To solve for the probability of an empty buffer (P0), the authors in [17] calculate the variances of the random variables X and Θ as:

V(X) = \left. \frac{d^{2} X̂(γ, z)}{dz^{2}} \right|_{z=1} + X̄ − X̄^{2}    (32)

V(Θ) = \left. \frac{d^{2} Θ̂(γ, z)}{dz^{2}} \right|_{z=1} + Θ̄ − Θ̄^{2}    (33)

From (4), (32), (33), and [18], the variance of the service time is calculated as:

V(Y^{c}) = V\left(\sum_{i=1}^{X} Θ_i\right) = X̄ \cdot V(Θ) + V(X) \cdot Θ̄^{2}    (34)

Using (12) and (34), the squared coefficient of variation of the service time, S², is given as:

S^{2} = \frac{V(Y^{c})}{(Ȳ^{c})^{2}} = \frac{X̄ \cdot V(Θ) + V(X) \cdot Θ̄^{2}}{X̄^{2} \cdot Θ̄^{2}}    (35)

Using Eqs. (18), (35), and [17], the probability of an empty buffer is given as:

P_0 = \frac{ρ − 1}{ρ^{\left(\frac{2\left(\sqrt{ρ}\, e^{−S^{2}} S^{2} − \sqrt{ρ}\, e^{−S^{2}} + K + 1\right)}{2 + \sqrt{ρ}\, e^{−S^{2}} S^{2} − \sqrt{ρ}\, e^{−S^{2}}}\right)} − 1}    (36)

The probability P_K of the queue of finite length K being full is expressed as:

P_K = \frac{ρ^{\left(\frac{\sqrt{ρ}\, e^{−S^{2}} S^{2} − \sqrt{ρ}\, e^{−S^{2}} + 2K}{2 + \sqrt{ρ}\, e^{−S^{2}} S^{2} − \sqrt{ρ}\, e^{−S^{2}}}\right)} (ρ − 1)}{ρ^{\left(\frac{2\left(\sqrt{ρ}\, e^{−S^{2}} S^{2} − \sqrt{ρ}\, e^{−S^{2}} + K + 1\right)}{2 + \sqrt{ρ}\, e^{−S^{2}} S^{2} − \sqrt{ρ}\, e^{−S^{2}}}\right)} − 1}    (37)
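The closed forms (36) and (37) are straightforward to evaluate. The Python sketch below does so against the expressions as reconstructed above, which is an assumption about their exact printed form. As a quick consistency check, setting S² = 1 (exponential service) should reproduce the classical M/M/1/K results P0 = (1 − ρ)/(1 − ρ^{K+1}) and P_K = (1 − ρ)ρ^{K}/(1 − ρ^{K+1}).

```python
import math

def p0_pk_transform_free(rho, s2, K):
    """Empty-buffer and full-buffer probabilities from the closed forms (36)-(37)."""
    w = math.sqrt(rho) * math.exp(-s2) * s2 - math.sqrt(rho) * math.exp(-s2)
    denom_exp = 2.0 * (w + K + 1.0) / (2.0 + w)   # exponent in the denominator of (36)-(37)
    num_exp = (w + 2.0 * K) / (2.0 + w)           # exponent in the numerator of (37)
    p0 = (rho - 1.0) / (rho ** denom_exp - 1.0)
    pK = (rho ** num_exp) * (rho - 1.0) / (rho ** denom_exp - 1.0)
    return p0, pK

# S^2 = 1 collapses to the M/M/1/K formulas
rho, K = 0.8, 6
print(p0_pk_transform_free(rho, 1.0, K))
print((1 - rho) / (1 - rho ** (K + 1)), (1 - rho) * rho ** K / (1 - rho ** (K + 1)))
```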

Putting the value of P0 into (2) and then solving (1) and (2) provides the values of β and γ under non-saturated network conditions, which can further be utilized to calculate the system throughput, S_th, using (38):

S_{th} = \frac{E[\text{payload information in a slot time}]}{E[\text{duration of a slot time}]} = \frac{P_s P_b L}{P_b P_s T_s + P_b P_{s̄} T_{s̄} + (1 − P_b) σ}    (38)

where L represents the length of the payload. The probabilities P_b, P_s, and P_s̄ are given in (9), (10), and (11), respectively, and T_s and T_s̄ are the successful and unsuccessful packet transmission times, respectively.
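A short Python sketch of the throughput expression (38) is given below, reusing the probabilities (9)–(11) from the earlier sketch. It is a minimal sketch written against the expression as reconstructed above; the payload L in bits and the durations (T_s = T_s̄ = 48 slots of 20 μs) are assumptions taken from the experimental settings in Table 2.

```python
def throughput(gamma, n, L=8184.0, sigma=20e-6, ts=48 * 20e-6, ts_fail=48 * 20e-6):
    """System throughput S_th in bits per second, Eq. (38) with (9)-(11)."""
    pb = 1.0 - (1.0 - gamma) ** (n / (n - 1.0))
    ps = n * (1.0 - (1.0 - gamma) ** (1.0 / (n - 1.0))) * (1.0 - gamma)
    ps_fail = pb - ps
    return (ps * pb * L) / (pb * ps * ts + pb * ps_fail * ts_fail + (1.0 - pb) * sigma)

print(throughput(gamma=0.2, n=24) / 1e6, "Mbps")
```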


4 Comparative Analysis and Simulation Results

The models in [14, 16] are closely related to each other; the difference lies in their mathematical analysis for the computation of the blocking probability (Pk) and the probability of an empty buffer (P0). The model in [14] utilizes the recursive Eqs. (20), (21), and (22) for calculating Pk and P0, which makes the process computationally complex and intensive. To improve the computation, [14] utilized the Lattice-Poisson algorithm; note that we revised the expression for q_K used in the calculation of P0 by using the Lattice-Poisson algorithm given by [15]. The other model, presented by [16], utilizes the simple closed-form Eqs. (35) and (36) for calculating Pk and P0, which makes the computational process simpler and faster compared with [14, 15]. Once the value of P0 is determined, the remaining factors, such as collision probability and attempt rate, can be computed by solving (1) and (2), which we further utilize to compute the system throughput using Eq. (38). Programming of the analytical models discussed in this paper is done in the 'C' language. The programs are run on a 64-bit Windows 10 operating system with an Intel® Core™ i5-4300M CPU @ 2.60 GHz and 8 GB RAM. The roots of both equations are plotted with the help of Gnuplot [20]. Table 1 presents the comparative analysis of the analytical models (transform-based and transform-free) in terms of accuracy, time complexity, and time needed for execution.

4.1 Validation and Simulation Results

To evaluate the efficiency of the models (transform-based and transform-free), we compared their outcomes with NS2 simulation measurements. During simulation, the packet arrival process is assigned to follow a Poisson distribution. The values used in the analysis are listed in Table 2. A homogeneous network consisting of 24 sender stations within a small area of 600 m × 600 m is considered for the experimental work in NS2. The system parameters γ (1) and S_th (38) are calculated for buffer sizes 2, 6, and 20 under varying network load conditions.

Figures 1, 2, and 3 show the effect on the collision probability of varying the total offered load to the network for buffer sizes K = 2, 6, and 20, keeping the number of stations fixed at twenty-four. We observe that, from light to full load, the collision probability increases as the queuing buffer grows. The reason is that a larger buffer can store a higher number of data packets contending for access to the shared channel, which leads to a higher number of collisions. Figures 4, 5, and 6 compare the normalized system throughput against different traffic loads for buffer sizes K = 2, 6, and 20 with twenty-four stations. It can be noticed from the graphs that, initially, increasing the traffic load raises the system throughput; however, as the offered traffic approaches the channel capacity C_max, the curve becomes steady, which indicates the saturated system throughput.

Table 2 IEEE 802.11 parameter settings for the experiment

Payload, L: 8184 bits
MAC HDR: 224 bits
PHY HDR: 192 bits
ACK: 112 bits + PHY header
Idle slot, σ: 20 μs
DIFS: 50 μs
SIFS: 10 μs
Propagation delay: 1 μs
Raw channel capacity: 11 Mbps
Control rate: 1 Mbps
CW0: 32 slots
Number of CW sizes, m: 5
Short retry limit, M: 7
T_s = T_s̄: 48 slots


Fig. 1 Collision probability versus normalized offered load (n = 24 and K = 2). The legends 'Zhao', 'Gupta', and 'Abate' stand for the estimations using [14], [16], and [15], respectively; the legend 'NS2 Measured' stands for the NS2 simulation results

The results obtained using the analytical models (transform-based and transform-free) closely match each other and are validated through the simulation results obtained using the NS2 simulator. All the models are accurate in terms of the throughput parameter under their respective assumptions.



Fig. 2 Collision probability versus normalized offered load (n = 24 and K = 6). The legends mean the same as in Fig. 1


Fig. 3 Collision probability versus normalized offered load (n = 24 and K = 20). The legends mean the same as in Fig. 1


Fig. 4 Normalized system throughput versus normalized offered load (n = 24 and K = 2). The legends mean the same as in Fig. 1



System throughput (Mbps)

Fig. 5 Normalized system throughput versus normalized offered load (n = 24 and K = 6). The legends mean the same as in Fig. 1 6 5 4 3 Zhao

2

Gupta Abate

1

NS2 Measured

0 0.0

0.2

0.4

0.6

0.8

1.0

Fig. 6 Normalized system throughput versus normalized offered load (n = 24 and K = 20). The legends mean the same as in Fig. 1

5 Conclusion

This work presents a performance study of analytical models utilizing arbitrary buffer lengths in unsaturated IEEE 802.11 DCF networks. The investigation is not an in-depth one; rather, it concentrates primarily on the mathematical hypotheses implied in these approaches. Although both models are accurate enough, as validated through the simulation results, the difference lies in modeling simplicity and computation speed. The models based on the numerical inverse transform algorithm are mathematically complex due to the use of recursive equations, while the models based on the transform-free approach are simple and fast, attributable to the use of simple closed-form expressions for calculating the probability of an empty buffer. It can be observed that the transform-free approach has the quickest computation speed among the compared approaches.


Another noticeable observation from the analysis is that an increased buffer length does not necessarily imply a system throughput gain. Researchers need to develop simple, transform-free approaches that deal with real-time conditions to characterize IEEE 802.11 DCF networks.

References

1. Bianchi G (2000) Performance analysis of the IEEE 802.11 distributed coordination function. IEEE J Select Areas Commun 18(3):535–547
2. Wu H, Peng Y, Long K, Cheng S, Ma J (2002) Performance of reliable transport protocol over IEEE 802.11 wireless LAN: analysis and enhancement. In: Proceedings of the twenty-first annual joint conference of the IEEE computer and communications societies (INFOCOM'02), vol 2. IEEE, New York, USA, pp 599–607
3. Chatzimisios P, Boucouvalas AC, Vitsas V (2005) Performance analysis of IEEE 802.11 MAC protocol for wireless LANs. Int J Commun Syst 18(6):545–569
4. IEEE 802.11b (1999) IEEE standard for information technology—telecommunications and information exchange between systems—local and metropolitan networks—specific requirements—part 11: wireless LAN medium access control (MAC) and physical layer (PHY) specifications: higher speed physical layer (PHY) extension in the 2.4 GHz band. https://standards.ieee.org/standard/802_11b-1999.html, Last accessed 23 Dec 2019
5. Kumar A, Altman E, Miorandi D, Goyal M (2005) New insights from a fixed point analysis of single cell IEEE 802.11 WLANs. In: Proceedings IEEE 24th annual joint conference of the IEEE computer and communications societies, vol 3. IEEE, Miami, FL, USA, pp 1550–1561
6. Liaw YS, Dadej A, Jayasuriya A (2005) Performance analysis of IEEE 802.11 DCF under limited load. In: 2005 Asia-Pacific conference on communications. IEEE, Perth, WA, Australia, pp 759–763
7. Malone D, Duffy K, Leith D (2007) Modeling the 802.11 distributed coordination function in nonsaturated heterogeneous conditions. IEEE/ACM Trans Netw 15(1):159–172
8. Zhao Q, Tsang DHK, Sakurai T (2009) A simple and approximate model for nonsaturated IEEE 802.11 DCF. IEEE Trans Mob Comput 8(11):1539–1553
9. Duffy K, Ganesh AJ (2007) Modeling the impact of buffering on 802.11. IEEE Commun Lett 11(2):219–221
10. Cantieni GR, Ni Q, Barakat C, Turletti T (2005) Performance analysis under finite load and improvements for multirate 802.11. Comput Commun 28(10):1095–1109
11. Garetto M, Chiasserini C-F (2005) Performance analysis of 802.11 WLANs under sporadic traffic. In: Boutaba R, Almeroth K, Puigjaner R, Shen S, Black JP (eds) Networking 2005, LNCS, vol 3462. Springer, Berlin, pp 1343–1347
12. Zhai H, Kwon Y, Fang Y (2004) Performance analysis of IEEE 802.11 MAC protocols in wireless LANs. Wirel Commun Mob Comput 4(8):917–931
13. Zheng Y, Lu K, Wu D, Fang Y (2006) Performance analysis of IEEE 802.11 DCF in imperfect channels. IEEE Trans Veh Technol 55(5):1648–1656
14. Zhao Q, Tsang DHK, Sakurai T (2011) Modeling nonsaturated IEEE 802.11 DCF networks utilizing an arbitrary buffer size. IEEE Trans Mob Comput 10(9):1248–1263
15. Abate J, Whitt W (1995) Numerical inversion of Laplace transforms of probability distributions. ORSA J Comput 7(1):1–116
16. Gupta N, Rai CS (2015) A simple mathematical model for performance evaluation of finite buffer size nodes in non-saturated IEEE 802.11 DCF in ad hoc networks. In: Satapathy S, Govardhan A, Raju K, Mandal J (eds) Emerging ICT for bridging the future—proceedings of the 49th annual convention of the Computer Society of India (CSI), volume 1. Advances in intelligent systems and computing, vol 337. Springer, Cham
17. Smith JM (2011) Properties and performance modelling of finite buffer M/G/1/K networks. Comput Oper Res 38(4):740–754


18. Tijms HC (2003) A first course in stochastic models. Wiley
19. Abate J, Choudhury GL, Whitt W (2000) An introduction to numerical transform inversion and its application to probability models. In: Grassman WK (ed) Computational probability. Kluwer, pp 257–323. https://doi.org/10.1007/978-1-4757-4828-4_8
20. Gnuplot (2019) https://www.gnuplot.info, Last accessed 23 Dec 2019

An Overview of Learning Approaches in Reflection Removal

Rashmi Chaurasiya and Dinesh Ganotra

Abstract Images taken through a glass pane suffer from reflection. Reflection introduces artefacts that are a major obstacle in photography and surveillance: reflection in an image strongly degrades its quality and has to be removed to enhance it. Approaches to removing reflection include utilizing polarizers, depth-of-field maps, natural image statistics, image smoothening, and so on. Although the problem has been studied for a very long time, it is still ill-posed. The rise of deep learning in the last decade revolutionized every field, and image processing is no exception. Several deep learning-based methods involving numerous architectures have been applied to remove reflection. This paper attempts to give a summary of the state-of-the-art deep learning-based approaches for reflection removal.

Keywords Learning approaches · Reflection removal · Deep learning

1 Introduction

Often an image has to be captured through a glass pane when the target object is behind the glass (e.g. a person in a car, or an exhibit in a museum). Glass between the target and the camera produces an unwanted scene in the captured image, which severely reduces the quality and usability of the image. Similarly, photographs taken in low-light conditions with flash also produce undesired reflections in the scene. Physical approaches (movement of the camera) [1–3], polarizing lenses [4, 5], draping the camera with a black cloth, and similar measures were the initial attempts to remove reflection; these techniques can only decrease the effect of reflection to some extent.

Reflection removal can be viewed as a layer separation problem [6, 7], where reflection and background form a linear combination.

R. Chaurasiya (B) · D. Ganotra
IGDTUW, New Delhi 110006, India
e-mail: [email protected]
D. Ganotra
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021
V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_31


Fig. 1 Depiction of the problem statement, illustrating the degradation of image quality due to reflection. I represents the observed image with reflection and B is the desired background scene

The mathematical model for reflection removal can be given as

I = B + R    (1)

where the observed image (mixture image) I combines both the background (B) and the reflection (R). The aim is to recover the background and reflection components from a single image. This constraint makes the problem ill-posed, since two unknowns, R and B, need to be recovered from one known, I; with this condition, the problem statement defined by Eq. (1) has infinitely many solutions. Figure 1 shows a mixture image (left) and a background (right).

Various approaches have been applied to solve the reflection problem over more than a decade. These approaches can be broadly classified using two parameters: (1) the number of input images given and (2) the nature of the method applied (e.g. learning/non-learning). Based on the number of input images used to extract the background, there are two approaches: (1) single image methods and (2) multiple image methods. In a single image method, only a single image [6, 8] is used for the extraction of the background, while in a multiple image method an image sequence is used, and these images may vary in illumination level [9], viewpoint [10], motion between frames [11, 12], and focus [13]. Figure 2 illustrates both of these methods.

The success of deep learning-based methods has generated interest in image processing. Based on the nature of the method applied, there exist learning and non-learning methods. Methods involving any form of learning architecture (e.g. a convolutional neural network [14]) form the category of learning-based methods, and conventional approaches that do not involve learning are non-learning methods. Learning-based methods are the most modern and advanced, as they can learn the mapping function from the observed image, and such approaches are generally fully automatic.

Fig. 2 Input/output of single image and multiple image-based methods: (a) single image method, (b) multiple image method (multiple images and video)

Though deep learning better captures image properties, the unavailability of a large training dataset limits its application. This paper gives a brief study of learning-based methods applied for the removal of reflection. The rest of the paper is organized as follows: Sect. 2 describes the state-of-the-art methodology in reflection removal. Section 3 provides information about the available datasets. The main objective of this paper, learning-based approaches in reflection removal, is covered in Sect. 4. Based on the methods explained in Sect. 4, Sect. 5 gives insights into the significance of the methods in the discussion section. Lastly, the conclusion and future work are given in Sects. 6 and 7.

2 Reflection Removal Approaches

Numerous techniques have been proposed, such as polarizing techniques, sparse blind separation with motion, superimposed image decomposition [15], ghosting cues [16], images with flash and no flash [17], and similar techniques. These methods can be differentiated based on the priors, the nature of the input, and the process used to solve the problem. All these techniques utilize an objective function that involves some image priors. The most used techniques in reflection removal are listed below.

2.1 Gradient Distribution

Since the focus is always on the background, capturing images through the glass pane puts the reflection layer out of focus. Gradients belonging to the reflection therefore become blurry [18], while gradients belonging to the background layer remain sharp. The difference in sharpness between the two layers provides a crucial cue for separating the reflection: sharp edges can be labelled as background edges and blurred (smooth) edges as reflection edges.


Further, this difference in sharpness also leads to a difference in gradient distribution, i.e. the background has more abrupt changes in gradient than the reflection.
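As a toy illustration of this cue, the following Python sketch labels edge pixels as background or reflection by thresholding the local gradient magnitude. It is only an illustrative sketch, not any particular published method: the fixed thresholds, the use of NumPy finite differences, and the array names are all assumptions introduced here.

```python
import numpy as np

def label_edges(mixture, grad_thresh=0.15, edge_thresh=0.03):
    """Label edge pixels of a grayscale image in [0, 1] as background (sharp)
    or reflection (blurry) using the gradient-magnitude cue described above."""
    gy, gx = np.gradient(mixture.astype(np.float64))
    mag = np.hypot(gx, gy)
    edges = mag > edge_thresh                        # pixels that belong to some edge
    background_edges = edges & (mag > grad_thresh)   # strong, sharp gradients
    reflection_edges = edges & ~background_edges     # weak, smooth gradients
    return background_edges, reflection_edges

# example with a random image standing in for a real photograph
img = np.random.rand(64, 64)
bg, rf = label_edges(img)
print(bg.sum(), rf.sum())
```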

2.2 Ghosting Cues

Ghosting cues arise due to the shifted double reflections of the reflected scene off the glass surface. A convolutional kernel is modelled to capture the relative attenuation between the two ghosted reflection components, and the layers are separated using a Gaussian mixture model (GMM). This technique essentially exploits the depth (thickness) of the glass between the camera and the background; the resulting double reflections are referred to as ghosting cues [16].

2.3 Depth of Field Based

The distance between the nearest and farthest points of a scene in focus (i.e. the depth of field) [19] can also be used as an important feature. In Wan et al. [20], the assumption is that the background is always behind the glass; since the glass causes the unwanted reflection layer, the reflection layer is always closer to the camera, and hence the reflection layer is at a lower depth than the background layer.

2.4 Polarizer Based

The polarization property of light [1, 4, 5] states that, at the Brewster angle, reflected light is completely polarized in the direction perpendicular to the plane of incidence, whereas transmitted light is polarized in the direction parallel to the plane of incidence. This property is widely used for layer separation and reflection removal.

2.5 Sparsity Priors

The statistics of natural images demonstrate that local features and image gradients are sparse: natural images have gradient distributions with heavy tails and a peak at zero. This long-tailed distribution is modelled with a gradient sparsity prior, and this sparsity [6, 21, 22] acts as an important cue in layer separation [7] and reflection removal tasks.


2.6 Image Sequence Based

Rather than using a single image to infer the background and reflection images, some methods [9–13] capture a sequence of images, often from different viewpoints, to obtain additional information about the background. In this vein, Xue et al. [23] require the user to take a short video sequence while slightly moving the camera; the difference in the relative positions of the background and the reflection with respect to the camera allows the mixture image to be separated based on their motions.

2.7 Sparse Blind Separation with Motion

Gai et al. [7] take a series of mixture images to separate the reflection and transmission layers, assuming that the mixtures are linear with respect to the layers. The technique relies on relative motion between the layers caused by movement of the camera, the glass surface, or the target object: from one mixture to another, some layer properties change, and this diversity across mixtures can lead to automatic separation.

2.8 Optimization-Based Methods

Here an optimization function is formulated from some prior (an l0 prior, gradient sparsity, DoF) together with a data fidelity term that penalizes the difference between the desired background and the mixture image; the optimization function is then solved with half-quadratic splitting or other techniques depending on the nature of the application. Optimization based on the gradient sparsity prior places a smooth gradient prior on the reflection layer and a sparse gradient prior on the background layer. Wan et al. [20] proposed an optimization based on the DoF, and Arvanitopoulos et al. [8] used a Laplacian-based fidelity term. Another method [6] solved a constrained optimization problem that requires user assistance to label gradients as belonging to the reflection or the background.

3 Dataset

The dataset is the main asset of any deep learning technique; it is said that 'any deep learning approach is only as good as its dataset'. The unavailability of large-scale training data is the main hurdle in this field, and for a long time there was not a single benchmark dataset specifically designed for reflection removal. Most methods create their own synthetic dataset, so the reliability of a method cannot be evaluated properly.


To address this unavailability, Wan et al. [24] created a benchmark dataset, the SIR² dataset, with 40 controlled and 100 wild scenes. The most prevalent way to create reflection data is to blend two images in some ratio (usually 8:2), where the reflection is always added at a lower weight than the background. Creating a synthetic dataset for reflection requires the inclusion of all the variants of reflection: there exist sharp reflection, blurry reflection, and double reflection (shown in Fig. 3). Chi et al. [25] created their dataset using all these variants. Wan et al. [26] used two images to create a single reflection image, where the image that acts as the reflection is captured with a black piece of paper behind the glass; further, images were taken with different illumination levels and with random focus to create different blur levels. This reflection image dataset (RID) has a total of 3250 images, but so far it is not publicly available. Mostly, each method [27–30] creates its own dataset. An attempt to create a synthetic dataset using [30] is shown in Fig. 4.

Fig. 3 Synthetic images of the SIR² dataset with blurry reflection, sharp reflection and double reflection; the upper row shows mixture images and the lower row shows background images

Fig. 4 Synthetic image creation (image sequence from left; background image, reflection image and mixed image)


Here, two images are used to create a single mixed image, where the intensity of the image used as the reflection is decreased.
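A minimal Python sketch of this kind of synthetic mixture creation is shown below. It is an assumption-laden illustration rather than the exact procedure of [30]: the 8:2 blending ratio follows the common practice mentioned above, the optional Gaussian blur stands in for the blurry-reflection variant, and the function and parameter names are hypothetical.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_mixture(background, reflection, alpha=0.8, blur_sigma=2.0):
    """Blend a background and a reflection image (grayscale float arrays in [0, 1])
    into a synthetic mixture I = alpha * B + (1 - alpha) * R, in the spirit of Eq. (1)."""
    if blur_sigma > 0:
        # blurred-reflection variant; set blur_sigma = 0 for a sharp reflection
        reflection = gaussian_filter(reflection, sigma=blur_sigma)
    mixture = alpha * background + (1.0 - alpha) * reflection
    return np.clip(mixture, 0.0, 1.0)

# example with random arrays standing in for two photographs
B = np.random.rand(128, 128)
R = np.random.rand(128, 128)
I = make_mixture(B, R)
print(I.shape, I.min(), I.max())
```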

4 Learning Approaches for Reflection Removal

The most popular techniques for reflection removal exploit edge features. The application of deep learning to the reflection removal problem was pioneered by Fan et al. [27], who proposed a two-stage framework (CEILNet) based on the assumption that the reflected layer is relatively blurry compared with the background: the first stage predicts the edges of the background image, and the second stage reconstructs a three-channel background image using the edges predicted in the first stage. The same assumption is used by CRRN [26] (multi-scale concurrent reflection removal network), which combines the two stages in a unified framework; its architecture includes an encoder-decoder network, a VGG-19 [31] network, and deconvolution layers. Zhang et al. [30] used perceptual loss functions in their work, comprising a feature loss, an adversarial loss, and an exclusion loss; using three loss functions rather than a conventional pixel-wise loss helped them preserve feature details. Later, Wan et al. [29] extended their work on CRRN by designing a cooperative framework (CoRRN, cooperative reflection removal network) with a statistical loss calculated between reflection and background.

The second most utilized feature after edges is DoF; these methods include [19, 32, 33]. Paramanand et al. [32] proposed a two-stage approach that trains a classifier to estimate depth on images captured by a plenoptic camera. Tao et al. [19] require the observed image and a depth map as inputs, where a DoF confidence map is estimated at each point of the image with the Kullback-Leibler divergence. Another DoF-based approach [34] is specifically designed to solve the traffic surveillance problem: due to the unique conditions (position of the camera, fast moving objects), existing practices using polarizers are computationally expensive. Wang et al. [34] were the first to propose a solution for this type of reflection problem, with an algorithm designed for windshield glass. Their DoF-guided deep learning framework consists of four components: DoF feature prediction, reflection inference, background estimation, and a semantic constraint module. For this task, they created their own dataset using cropped images from surveillance cameras, with a street view dataset for the reflection layer. Though the results were promising for most cases of windshield reflection, there is still a need for improvement in extreme cases.

Generative adversarial networks (GANs) have also been applied to generate the reflection and background scenes from a single image [28, 33]. In the multiple image setting, Chang and Jung [35] trained their network on a sequence of images as input, and the trained multiple-image model is then applied to a single image to produce a reflection-free background. Reflection removal in wild scenes is addressed by Wieschollek et al. [36], where reflection and background are separated with a polarizer technique utilizing deep learning. Table 1 lists some of the state-of-the-art methods described above.

Table 1 Methods of reflection removal with deep learning

1. Fan et al. [27]
   Input: Single input image with its corresponding input edge map
   Architecture: CEILNet, consisting of convolution, batch normalization, 13 residual layers and ReLU activation layers; pixel-wise loss function (MSE) and gradient discrepancy
   Method: Relies on a two-stage end-to-end deep architecture: first prediction of the edges of the background, then estimation of the background with the help of the predicted background edges and the input image

2. Wan et al. [26]
   Input: Single image with input gradient
   Architecture: CRRN, a network of convolution, maxpool and deconvolution layers; perceptual loss function
   Method: A concurrent network that combines gradient inference and image inference in a single framework; the RID dataset of real images is also created

3. Chi et al. [25]
   Input: Single image
   Architecture: Encoder-decoder with 12 convolution and 12 deconvolution layers with ReLU
   Method: The tasks can be summarized in three steps: (1) feature extraction, (2) reflection removal, (3) transmission layer restoration

4. Zhang et al. [30]
   Input: Single image
   Architecture: Main model is a nine-layered dilated fully convolutional architecture with a receptive field of 513 × 513; VGG-19 is used for the feature loss
   Method: Primarily based on three loss functions: (1) feature loss, (2) adversarial loss, (3) exclusion loss

5. Lee et al. [28]
   Input: Single image
   Architecture: Generative adversarial networks (GANs), where the generator and discriminator have six convolution and six transpose layers, respectively; padding, stride and batch normalization are also used
   Method: Removes the conventional assumption of reflection being always blurry, which enables modelling the image feature space rather than a pixel-level combination; real images with an adversarial loss function are used for training the network

6. Wan et al. [29]
   Input: Single image
   Architecture: CoRRN, a cooperative framework of three sub-networks, CencN, IdecN and GdecN, associated with context, image and gradient, respectively
   Method: Considers the gradient-level statistics of background and reflection; integration of context information and multi-scale gradient information also helps

7. Chang and Jung [35]
   Input: Single image
   Architecture: Encoder-decoder network with skip connections; internal and external losses are also defined in their work
   Method: Uses a number of images and generates a series of four output images; internal loss is calculated between the generated outputs, and external loss compares them with the ground-truth background to optimize the result; once trained, the network is used as a single image method

8. Wang et al. [34]
   Input: Single image
   Architecture: Deep learning framework with four sub-stages for DoF, reflection, background and semantic information
   Method: Specifically designed for windshield reflection; adversarial and L2 losses are used; edges are separated using the DoF feature, the reflection layer and background are estimated, and finally a semantic constraint is utilized to regularize the training

9. Yang et al. [37]
   Input: Single image
   Architecture: Bidirectional framework comprising a vanilla generator, a reflection estimator and a background estimator
   Method: First estimates the background B; with the estimated B, the reflection R is estimated; finally, the predicted R is used to improve the estimated background; adversarial loss is used

10. Jin et al. [38]
    Input: Single image
    Architecture: Three identical ResNets for preprocessing and a refining residual network
    Method: Addresses the problems of gradient-based methods by enlarging the receptive field of the neural network; further, this method has fewer parameters than the existing state of the art

11. Wei et al. [39]
    Input: Single image
    Architecture: Uses the network of [27] as its base network with some modifications; context between channels and multi-scale context within the channels are also estimated
    Method: Tackles misaligned data as well as overlapping between the R and T layers; channel attention and pyramid pooling are used; the losses include pixel, feature and adversarial loss for aligned data, while for unaligned data a feature loss compared on higher-level layers combined with adversarial loss is used


The concern with gradient-based methods is that they provide only localized priors; this issue of context ambiguity is addressed by Jin et al. [38] by adopting a network with a large receptive field. A bidirectional framework is proposed by Yang et al. [37], in which each module can be trained independently, i.e. one can wait until one module has converged sufficiently before feeding its output to the next module as input; each module can also be pre-trained independently, a strategy the authors describe as greedy training and fine-tuning. Semantic information [40] acts as a global feature that helps in recognizing objects and in guiding semantic objects belonging to the same layer: the background semantic map is estimated first from the mixture image, and the estimated background information together with the mixture image is then used to predict the reflection and the background. The proposed SRRN network has feature extraction, layer reconstruction and semantic estimation modules. A recent method proposed by Wei et al. [39] handles the challenge of misalignment between the image pair of R and T by introducing a particular combination of adversarial loss and feature loss on a high-level (conv5_2) layer as the loss function; to tackle the ambiguity between the two layers, contextual information is incorporated through channel attention [41] and a pyramid pooling module [42].

5 Discussion

In this paper, a brief study of reflection removal techniques has been provided. Of the two categories of reflection removal methods, multiple image methods are not practical in real-life scenarios: taking multiple images from different viewpoints limits their applicability for a user, and such methods are more appropriate for a controlled environment. This explains why the number of multiple image methods is much smaller than the number of single image methods. It is also observed that single image methods still have scope for improvement. Each method attempts to learn a certain cue to separate reflection and background while assuming some specific condition, and although the methods described above are all state of the art, they do not provide a highly accurate solution in general. There are limitations: for instance, assuming that reflection is always blurry or smooth does not always hold in real-life scenarios. Sparse blind separation with motion is slower and has the constraint of a relatively static background scene; a high degree of movement in the reflection layer causes time-aliasing artefacts in both layers. Similarly, the glass pane may not always have a large depth, so ghosting cue techniques cannot always be applied; furthermore, an algorithm using ghosting cues is not able to cleanly separate reflections and also causes patchy artefacts. In addition, although reflection occupies only a part of the image, most methods are applied over the whole image, which downgrades the quality of the image: the area containing reflection needs to be located first. Locating reflection with user assistance, as in Levin et al. [6], gives compelling results, but this has to be done automatically. A solution to this problem is provided by learning-based approaches. Learning-based techniques provide an automatic, end-to-end solution, but in the absence of a problem-specific dataset the task remains an ill-posed problem.


SIR² serves the purpose of an evaluation dataset effectively, but the lack of a training dataset limits learning approaches. Real-life reflection is hard to model in synthetic images due to the misalignment between image pairs (mixture and background), even with stable tripods; furthermore, refraction through the glass shifts the path of the transmitted light and therefore also causes misalignment. Other than the dataset, loss functions play an important role too: recent methods [28, 30] prefer a combination of a perceptual loss function and a pixel-wise loss function (MSE), where features are compared in addition to pixels during back-propagation optimization. Incorporating semantic information [40] provides an understanding of the objects in the background and also acts as a global feature, which further assists in resolving the problem.

6 Conclusion

Methods based on low-level priors (i.e. smoothness, ghosting) and CNN-based approaches are found to be less effective in some cases due to a lack of generalization. The most recent approaches attribute the root cause of the problem and produce the best results among the state of the art. However, separation is not feasible when the reflection is stronger than the background; in such cases it is hard even for a human to separate the two layers.

7 Future Work

As mentioned in the dataset section, sufficient real data is not available, so the only way to proceed with this work is to create a synthetic dataset. Future work includes the removal of reflection employing an attention mechanism. Attention is a mechanism that allows a neural network to focus on a subset of features or hidden layers: it selects certain features to be amplified that the attention mechanism deems more important for predicting the desired output. Attention was initially employed for natural language processing (NLP) tasks; however, the concept has proved to be a powerful tool for various image processing problems too.

References

1. Kong N, Tai Y, Shin JS (2014) A physically-based approach to reflection separation: from physical modeling to constrained optimization. IEEE Trans Pattern Anal Mach Intell 36(2):209–221. https://doi.org/10.1109/TPAMI.2013.45
2. Levin A, Zomet A, Weiss Y (2003) Learning to perceive transparency from the statistics of natural scenes. Advances in neural information processing systems, pp 1271–1278. Vancouver, Canada


3. Tsuji T (2010) Specular reflection removal on high-speed camera for robot vision. In: 2010 IEEE international conference on robotics and automation, Anchorage, pp 1542–1547. https://doi.org/10.1109/ROBOT.2010.5509252
4. Wolff L-B (1989) Using polarization to separate reflection components. In: Proceedings CVPR 1989: IEEE Computer Society conference on computer vision and pattern recognition, pp 363–369. San Diego, CA, USA. https://doi.org/10.1109/CVPR.1989.37873
5. Schechner Y-Y, Shamir J, Kiryati N (1999) Polarization-based decorrelation of transparent layers: the inclination angle of an invisible surface. Proceedings of the seventh IEEE international conference on computer vision, vol 2, Greece, pp 814–819. https://doi.org/10.1109/ICCV.1999.790305
6. Levin A, Weiss Y (2007) User assisted separation of reflections from a single image using a sparsity prior. IEEE Trans Pattern Anal Mach Intell 29(9):1647–1654. https://doi.org/10.1109/TPAMI.2007.1106
7. Gai K, Shi Z, Zhang C (2012) Blind separation of superimposed moving images using image statistics. IEEE Trans Pattern Anal Mach Intell 34(1):19–32. https://doi.org/10.1109/TPAMI.2011.87
8. Arvanitopoulos N, Achanta R, Süsstrunk S (2017) Single image reflection suppression. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), Honolulu, HI, pp 1752–1760
9. Agrawal A, Raskar R, Nayar SK, Li Y (2005) Removing photography artifacts using gradient projection and flash exposure sampling. ACM Trans Graph (TOG) 24(3):828–835. https://doi.org/10.1145/1073204.1073269
10. Li Y, Brown MS (2013) Exploiting reflection change for automatic reflection removal. 2013 IEEE international conference on computer vision, Sydney, NSW, pp 2432–2439. https://doi.org/10.1109/ICCV.2013.302
11. Simon C, Park IK (2015) Reflection removal for in-vehicle black box videos. In: Proceedings of the IEEE conference on computer vision and pattern recognition, Boston, MA, pp 4231–4239. https://doi.org/10.1109/CVPR.2015.7299051
12. Han BJ, Sim JY (2017) Reflection removal using low-rank matrix completion. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), Honolulu, HI, pp 3872–3880. https://doi.org/10.1109/CVPR.2017.412
13. Schechner YY, Kiryati N, Basri R (2000) Separation of transparent layers using focus. Int J Comput Vision 39(25):25–39. https://doi.org/10.1023/A:1008166017466
14. Dumoulin V, Visin F (2016) A guide to convolution arithmetic for deep learning. arXiv:1603.07285v2
15. Guo X, Cao X, Ma Y (2014) Robust separation of reflection from multiple images. In: 2014 IEEE conference on computer vision and pattern recognition, pp 2195–2202. Columbus, OH. https://doi.org/10.1109/CVPR.2014.281
16. Shih Y, Krishnan D, Durand F, Freeman W (2015) Reflection removal using ghosting cues. In: IEEE conference on computer vision and pattern recognition (CVPR), Boston, pp 3193–3201. https://doi.org/10.1109/CVPR.2015.7298939
17. Agrawal A, Raskar R, Chellappa R (2006) Edge suppression by gradient field transformation using cross-projection tensors. In: 2006 IEEE Computer Society conference on computer vision and pattern recognition (CVPR'06), New York, USA, pp 2301–2308. https://doi.org/10.1109/CVPR.2006.106
18. Fan Q, Yang J, Hua G, Chen B, Wipf D (2017) A generic deep architecture for single image reflection removal and image smoothing. In: Proceedings of the IEEE international conference on computer vision (ICCV), Venice, Italy. https://doi.org/10.1109/ICCV.2017.351
19. Tao MW, Hadap S, Malik J, Ramamoorthi R (2013) Depth from combining defocus and correspondence using light-field cameras. In: 2013 IEEE international conference on computer vision, Sydney, pp 673–680. https://doi.org/10.1109/ICCV.2013.89
20. Wan R, Shi B, Tan AH, Kot AC (2016) Depth of field guided reflection removal. Paper presented at the meeting of the ICIP, pp 21–25. Phoenix, AZ. https://doi.org/10.1109/ICIP.2016.7532311

424

R. Chaurasiya and D. Ganotra

21. Robert F, Barun S, Aaron H, Sam R, William F (2006) Removing camera shake from a single photograph. ACM Trans Graph 25:787–794. https://doi.org/10.1145/1179352.1141956 22. Levin A, Zomet A, Weiss Y (2004) Separating reflections from a single image using local features. In: Proceedings of the 2004 IEEE computer society conference on computer vision and pattern recognition, Washington, DC, USA. https://doi.org/10.1109/CVPR.2004.1315047 23. Xue T, Rubinstein M, Liu C, Freeman WT (2015) A computational approach for obstructionfree photography. In: ACM transactions on graphics, Proceedings of SIGGRAPH, New York, USA, vol 34(4). https://doi.org/10.1145/2766940 24. Wan R, Shi B, Duan LY, Tan AH, Kot AC (2017) Benchmarking single-image reflection removal algorithms. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 3922– 3930, Venice. https://doi.org/10.1109/ICCV.2017.423 25. Chi Z, Wu X, Shu X, Gu J (2018) Single image reflection removal using deep encoder-decoder network. CoRR. arXiv:1802.00094v1 26. Wan R, Shi B, Duan LY, Tan A-H, Kot AC (2018) CRRN: multi-scale guided concurrent reflection removal network. IEEE conference on computer vision and pattern recognition (CVPR), Salt Lake City, UT, USA, pp 4777–4785. https://doi.org/10.1109/CVPR.2018.00502 27. Fan Q, Yang J, Hua G, Chen B, Wipf D (2017) A generic deep architecture for single image reflection removal and image smoothing. In: IEEE international conference on computer visio. arXiv:1708.03474v2 28. Lee D, Yang MH, Oh S (2018) Generative single image reflection separation. arXiv:1801. 04102 29. Wan R, Shi B, Li H, Duan LY, Tan AH, Chichung A (2019) CoRRN: cooperative reflection removal network. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2019. 2921574 30. Zhang X, Ng R, Chen Q (2018) Single image reflection separation with perceptual losses.In: IEEE conference on computer vision and pattern recognition (CVPR), Salt Lake City, UT, USA. arXiv:1806.05376v1 31. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. CoRR. arXiv:1409.1556v6 32. Paramanand C, Noroozi, Mehdi & Favaro, P.: Reflection separation and deblurring of plenoptic image. arXiv:1708.06779v1 33. Zheng Q, Shi B, Jiang X, Duan LY, Kot A (2019) Denoising adversarial networks for rain removal and reflection removal. IEEE international conference on image processing (ICIP), pp 2766–2770 Taipei, Taiwan. https://doi.org/10.1109/ICIP.2019.8803225 34. Wang C, Shi B, Duan L (2019) Learning to remove reflections from windshield images. Signal Process Image Commun 78:94–102. https://doi.org/10.1016/j.image.2019.06.007 35. Chang Y, Jung C (2019) Single image reflection removal using convolutional neural networks. IEEE Trans Image Process 28(4):1954–1966. https://doi.org/10.1109/TIP.2018.2880088 36. Wieschollek P, Gallo O, Gu J, Kautz J (2017) Separating reflection and transmission images in the wild. ECCV. LNCS, vol 11217, pp 90–105. Springer. Cham. https://doi.org/10.1007/9783-030-01261-8_6 37. Yang J, Gong D, Liu L, Shi Q (2018) Seeing deeply and bidirectionally: a deep learning approach for single image reflection removal. In: Computer vision—European conference on computer vision 2018. LNCS, vol 11207, pp 675–691. Springer, Cham. https://doi.org/10.1007/ 978-3-030-01219-9_40 38. Jin M, Süsstrunk S, Favaro P (2018) Learning to see through reflections. In: IEEE international conference on computational photography (ICCP), pp 1–12. Pittsburgh, PA. https://doi.org/10. 
1109/ICCPHOT.2018.8368464 39. Wei K, Yang J, Fu Y, Wipf D, Huang H (2019) Single image reflection removal exploiting misaligned training data and network enhancements. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 8170–8179, Long Beach, CA, USA. https://doi.org/10.1109/CVPR.2019.00837 40. Liu Y, Yu L, You S, Lu F (2019) Semantic guided single image reflection removal. arXiv:1907. 11912

An Overview of Learning Approaches in Reflection Removal

425

41. Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: The IEEE conference on computer vision and pattern recognition (CVPR), June (2018). arXiv:1709.01507v4 42. He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell (TPAMI) 37(9):1904–1916 (arXiv:1406.4729)

Comparison of Bioinspired Algorithms Applied to the Timetabling Problem

Jose Silva, Noel Varela, Jesus Varas, Omar Lezama, José Maco, and Martín Villón

Abstract The problem of timetabling events is present in various organizations such as schools, hospitals, transportation centers. The purpose of timetabling activities at a university is to ensure that all students attend their required subjects in accordance with the available resources. The set of constraints that must be considered in the design of timetables involves students, teachers and infrastructure. This study shows that acceptable solutions are generated through the application of genetic, memetic and immune system algorithms for the problem of timetabling. The algorithms are applied to real instances of the University of Mumbai in India and their results are comparable with those of a human expert.

Keywords Genetic algorithm · Memetic algorithm · Immune system · Faculty timetabling · Course timetabling

J. Silva (B) · J. Maco · M. Villón Universidad de Ciencias Aplicadas, Lima, Peru e-mail: [email protected] J. Maco e-mail: [email protected] M. Villón e-mail: [email protected] N. Varela · J. Varas Universidad de la Costa, St. 58 #66, Barranquilla, Atlántico, Colombia e-mail: [email protected] J. Varas e-mail: [email protected] O. Lezama Universidad Tecnológica, San Pedro Sula, Honduras e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_32


1 Introduction

The timetabling of tasks within organizations is one of the most common and difficult problems to address, because it seeks to assign various activities and resources within a space of time [1]. In universities, the aim is to generate a timetable design that complies with the restrictions of students, teachers, curriculum and buildings of the institution. In addition, since the timetabling problem depends on the type of school, university and education system, there is no timetable design that can be applied in a generalized way to all schools, universities and education systems [2]. In general, the timetabling problem is defined from a set of events (classes, courses, exams), which must be assigned to a set of time periods subject to a set of restrictions [3]. According to Adriaen et al. [4], university timetabling is classified into five groups:

1. Faculty timetabling (FTT): the assignment of teachers to subjects.
2. Class–teacher timetabling (CTTT): the assignment of subjects with the least possible time conflicts between groups of students.
3. Course timetabling (CTT): the assignment of subjects with the lowest possible time fluctuation between individual students.
4. Examination timetabling (ETT): the assignment of tests to students in such a way that one student does not take two tests at the same time.
5. Classroom assignment (CATT): after classes are assigned to teachers, class–teacher classrooms are assigned.

This study focuses on generating acceptable solutions to the problem of timetable scheduling through the use of metaheuristic algorithms. A variety of approaches have been used to solve the timetabling problem, such as graph coloring [5], constraint satisfaction programming (CSP)-based methods [6], IP/LP (integer programming/linear programming) [7], genetic algorithms [8, 9], memetic algorithms [10, 11], tabu search [12, 13], simulated annealing [14], local search [15], best–worst ant system (BWAS) and ant colony optimization [16], and hyper-heuristic approaches [17]. There are many problems, at least of class NP, which can be solved with different metaheuristic algorithms, but, as the No-Free-Lunch Theorem [18] indicates, there is no metaheuristic that outperforms all the others for all known problems of the NP class. For this reason, this study makes a comparison between genetic, memetic and immune system algorithms, applying non-parametric statistical tests. The instances used come from real data of the University of Mumbai in India (UM), where the timetables are prepared by an expert, and the aim is to generate metaheuristic solutions that compete with the proposals of the human expert.


2 Method

2.1 The Problem of Task Timetabling

The timetabling of tasks can be defined as the process of assigning classes to resources such as time, space (rooms) and teachers (staff), while satisfying a set of restrictions [19]. There are two types of restrictions [20]:

• Hard: restrictions that absolutely cannot be broken. Some examples of hard restrictions are [12]: classroom availability, conflicts between students, availability of resources (teachers, classrooms).
• Soft: restrictions that it is preferable to satisfy, but which are not all expected to be satisfied. Some examples of soft restrictions are [12]: room capacity, minimal number of days occupied, etc.

A definition of the timetabling problem is given by Lewis Rhydian [21]: given a 4-tuple (e, t, p, S), which is the representation of a possible solution, where E = {e1, e2, …, en} is a set of events (classes or subjects), T = {t1, t2, …, ts} is a set of time periods, P = {p1, p2, …, pm} is a set of places (classrooms), A = {a1, a2, …, ao} is a set of users (students registered in courses) and S ⊆ A is a subset of students, the 4-tuple has an associated cost function f(t). The problem is to look for 4-tuples (e, t, p, S), or solutions, that minimize the associated cost function f(t).

2.2 API-Carpio Method

The API-Carpio methodology [22] describes the process of scheduling educational timetables as follows:

f(x) = FA(x) + FP(x) + FI(x)    (1)

where
FA(x) = number of students in conflict within the timetable x (CTT),
FP(x) = number of teachers in conflict within the timetable x (FTT),
FI(x) = number of classrooms and laboratories in conflict within the timetable x (CATT).

This study is limited to FA(x), which is defined as:

FA = \sum_{j=1}^{k} FA_{V_j}    (2)

where:

FA_{V_j} = \sum_{s=1}^{M_{V_j}-1} \sum_{l=1}^{M_{V_j}-s} \left( A_{j,s} \wedge A_{j,s+l} \right)    (3)

with:
FA_{V_j} = number of students in conflict within the vector V_j,
V_j = a time vector that contains several subjects,
A_{j,s} ∧ A_{j,s+l} = number of students that demand the simultaneous registration of the subjects M_{j,s} and M_{j,s+l}.

2.3 Design Methodology

The design methodology proposed by Soria et al. [23] allows different task scheduling policies and constraint lists to be modeled by converting all time and space restrictions into one simple type of restriction: student conflicts. Structures such as the MMA matrix, LPH list, LPA list and LPS list are proposed; the first three represent hard restrictions and the last one represents soft restrictions. In this work, two of the four structures are used: the MMA matrix and the LPH list. In [24], these structures are defined as follows:

MMA matrix: it contains the number of possible conflicts (between students) if two lessons are assigned in the same space of time.
LPH list: this list provides information, for each lesson (class, event or subject), about the possible time spaces to which it can be assigned.
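To make these structures concrete, the following minimal Python sketch (our illustration, not the code of [23, 24]) builds an MMA matrix and an LPH list for a toy instance with hypothetical numbers, checks the hard restrictions of a candidate timetable and evaluates its student-conflict count FA as in Eq. (2).

```python
# Minimal sketch of the design-methodology structures (hypothetical toy data).
# MMA[i][j] = number of students in conflict if lessons i and j share a timeslot.
# LPH[i]    = timeslots in which lesson i may be placed (hard restrictions).
MMA = [
    [0, 12, 3, 0],
    [12, 0, 5, 2],
    [3, 5, 0, 7],
    [0, 2, 7, 0],
]
LPH = [{0, 1, 2}, {0, 2}, {1, 2}, {0, 1}]   # feasible timeslots per lesson


def fa(timetable):
    """Student conflicts FA: sum MMA over lesson pairs placed in the same timeslot."""
    conflicts = 0
    for i in range(len(timetable)):
        for j in range(i + 1, len(timetable)):
            if timetable[i] == timetable[j]:
                conflicts += MMA[i][j]
    return conflicts


def is_feasible(timetable):
    """Hard restrictions: every lesson must sit in one of its LPH timeslots."""
    return all(slot in LPH[lesson] for lesson, slot in enumerate(timetable))


candidate = [0, 2, 1, 1]            # timeslot assigned to each lesson
print(is_feasible(candidate), fa(candidate))
```

In this representation, a candidate timetable is simply the vector of timeslots assigned to the lessons, which is the encoding assumed in the metaheuristics described next.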

2.4 Genetic Algorithm

Genetic algorithms were developed by J. Holland in the 1970s in order to understand the adaptive processes of natural systems, and were then applied to optimization and machine learning in the 1980s [25]. Genetic algorithms are adaptive methods, generally used in parameter search and optimization problems, based on sexual reproduction and on the principle of survival of the fittest. In [26], the author defines Algorithm 1, which corresponds to a simple genetic algorithm.

In [24], the author mentions that the genetic algorithm is one of the most widely used strategies. In this work, roulette-wheel selection was used, which consists of assigning each individual a probability proportional to its value of the fitness function; with f_i as the fitness of individual p_i in the population P, the probability formula can be consulted in [27]. Crossover is the probabilistic process that exchanges information between two chromosomes (parents) to generate two child chromosomes [28]; the one used in this work is the one-point crossover described in [24], where the crossover site k is randomly selected and the two children are created by exchanging the parent segments. The mutation operator is also applied at a single point: a position k is randomly selected and the value at that position is changed to another value allowed by the LPH [24].
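A compact sketch of the operators just described (roulette-wheel selection, one-point crossover and point mutation restricted to the LPH) is given below for illustration only; it reuses the toy structures above and is not the authors' implementation.

```python
import random


def roulette(population, fitnesses):
    """Fitness-proportionate (roulette-wheel) selection of one individual."""
    total = sum(fitnesses)
    pick, acc = random.uniform(0, total), 0.0
    for individual, fit in zip(population, fitnesses):
        acc += fit
        if acc >= pick:
            return individual
    return population[-1]


def one_point_crossover(parent_a, parent_b):
    """Exchange the segments of two parents at a random cut point k."""
    k = random.randint(1, len(parent_a) - 1)
    return parent_a[:k] + parent_b[k:], parent_b[:k] + parent_a[k:]


def point_mutation(individual, lph):
    """Replace the timeslot at a random position k by another feasible one from LPH."""
    k = random.randrange(len(individual))
    individual = list(individual)
    individual[k] = random.choice(sorted(lph[k]))
    return individual
```

Because the objective f(x) is minimized, a fitness such as 1/(1 + f(x)) can be fed to the roulette so that timetables with fewer conflicts receive a larger share of the wheel.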

2.5 Memetic Algorithms

In 1976, Dawkins introduced the concept of the meme, which, unlike the gene, can be modified by its carrier. Just as a gene in a genetic algorithm is transferred to the next generation, the characteristics acquired by a meme are transferred from one generation to the next along with the changes in the population. The authors in [29] describe what a memetic algorithm is. A meme is a unit of data that can recreate itself; these units are transmitted between people, can be adjusted by whoever carries them, and are capable of changing the data unit, whereas a gene remains unchanged during transmission [30]. The components of a memetic algorithm are [24]: a genetic algorithm and a local search. The local search is a modification that can be applied over the entire population of individuals with which the algorithm works: a copy of each individual is made and altered in some way, and if the copy of a specific individual is better than the original, the copy replaces the original individual. Algorithm 2 describes the steps of a memetic algorithm.
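The local-search component can be sketched as follows (an illustration only, not Algorithm 2 itself): each individual is copied, the copy is perturbed, and the copy replaces the original whenever it produces fewer conflicts.

```python
def local_search(population, evaluate, perturb):
    """Memetic step: keep the perturbed copy of an individual only if it improves it."""
    improved = []
    for individual in population:
        copy = perturb(list(individual))              # altered copy of the individual
        if evaluate(copy) < evaluate(individual):     # fewer conflicts is better
            improved.append(copy)
        else:
            improved.append(individual)
    return improved


# Example usage with the toy functions above:
# population = local_search(population, fa, lambda ind: point_mutation(ind, LPH))
```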

2.6 Immune System

These algorithms are based on imitating the behavior of the human immune system, which is responsible for protecting the body from external and internal pathogens and whose main task is to recognize the cells in the body and classify them as own or not own. Artificial immune system algorithms have been successfully applied to various optimization problems [29]. Basically, the process of the artificial immune system algorithm consists of randomly generating a population of candidate solutions; a percentage of the best individuals is then selected and cloned; hypermutation is then applied to these individuals, and the process continues until the target solution is reached. To prevent the population from growing without limit, a pruning step is applied so that the population returns to its initial size [31]. Algorithm 3, proposed by [32], corresponds to the artificial immune system.
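A minimal clonal-selection sketch of the process just outlined (cloning of the best fraction, hypermutation and pruning back to the initial size) is shown below; the parameter values are illustrative and are not those of Algorithm 3.

```python
import random


def immune_step(population, evaluate, mutate, clone_fraction=0.2, clones=5):
    """One iteration of a simple artificial immune system (minimizing conflicts)."""
    size = len(population)
    ranked = sorted(population, key=evaluate)               # best (fewest conflicts) first
    elite = ranked[: max(1, int(clone_fraction * size))]    # individuals selected for cloning
    offspring = [mutate(list(ind)) for ind in elite for _ in range(clones)]  # hypermutation
    pruned = sorted(population + offspring, key=evaluate)[:size]             # pruning step
    return pruned
```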


3 Results

The instances used for the tests with the metaheuristics described above belong to the UM and correspond to two different educational plans, for the years 2017 and 2018, with approximately 48–60 classes (events) and 7–13 time spaces, respectively. The configuration used in the genetic, memetic and IS algorithms is shown in Table 1: the number of function calls was 400,000 and the initial population was the same for all of them. The stopping criterion for the algorithms was the number of function calls.

Table 1 Data for the starting configuration of the GA, MA and IS

Parameter        Genetic   Memetic   Immune system
Population       40        40        40
Function calls   400,000   400,000   400,000
Elitism          0.2       0.2       NA
Crossover        0.95      0.95      NA
Mutation         0.2       0.2       NA
Pruning          NA        NA        100

Table 2 shows the results (conflicts) obtained according to the objective function shown in (1): the mean, median and standard deviation of the eight instances evaluated with the GA, MA and IS, together with the conflicts of the expert's solution. It can be observed in Table 2 that for some instances, such as 2, 3 and 6, there is not much difference between the standard deviations of the algorithms. What is sought are algorithms that produce acceptable solutions with a low standard deviation, which means that the data are close to the mean and is an indication of the reproducibility of the results of the metaheuristic algorithms used.

Table 2 Results of the different metaheuristics and the expert applied to the instances

Instance   Mean (GA / MA / IS)       Median (GA / MA / IS)   Standard deviation (GA / MA / IS)   Expert
1          288.2 / 288.4 / 300.2     281 / 288 / 310         16.32 / 17.0 / 19.2                 590
2          162.7 / 170.2 / 195.2     166 / 165 / 214         15.74 / 14.04 / 14.01               311
3          255.2 / 266.2 / 294.3     254 / 259 / 330         20.71 / 20.1 / 20.21                454
4          170.2 / 151.2 / 179.2     144 / 155 / 195         12.01 / 11.36 / 9.14                250
5          130.2 / 136.1 / 165.1     133 / 141 / 181         10.54 / 8.14 / 9.62                 301
6          74.1 / 78.4 / 89.6        75 / 78 / 100           6.98 / 7.20 / 7.84                  154
7          76.1 / 80.2 / 182.1       80 / 82 / 189           9.89 / 8.21 / 13.1                  139
8          92.1 / 99.1 / 106.0       90 / 99 / 110           11.56 / 9.557 / 17.01               260


In order to apply the non-parametric statistical tests, the Friedman test [28] was applied to the medians of the genetic, memetic and IS algorithms; then, to contrast the algorithm with the best performance against the human expert, the Wilcoxon signed-rank test was applied between the results of that algorithm and the results of the expert.

Table 3 shows the ranks and the results of the Friedman omnibus tests, where h0 = there are no differences in the performance of the algorithms and ha = there are differences between the algorithms. The p value is lower than α = 0.05 in the three cases, so h0 is rejected. Taking the genetic algorithm as the control algorithm, because it has the lowest rank, post-hoc tests are performed with α = 0.05; Table 4 shows the z and p values with the Bonferroni adjustment [28]. As can be seen in the pairwise tests, for GA versus MA the adjusted p value is greater than α in the three tests, so there is not enough evidence to reject h0, while for GA versus IS the adjusted p values are lower than α, which indicates a difference in the behavior of the algorithms.

Since there is no significant difference between the behavior of the genetic and the memetic algorithms, the one with the smallest rank in the tests, in this case the genetic algorithm (see Table 3), is used in the Wilcoxon signed-rank test to check whether its results and the results of the expert come from populations with equal medians. The results are shown in Table 5.

When applying the Wilcoxon signed-rank test (see Table 5), under the hypotheses h0 = there are no differences in the medians versus ha = there are differences between the medians, with a significance level α = 0.05 and the eight paired instances, it was found that there is sufficient evidence to reject h0. Therefore, the data come from populations with different medians, that is, the genetic algorithm improves the results obtained by the human expert.

Table 3 Ranks, statistics and p value for GA, MA and IS

Algorithms   Friedman   Friedman Aligned   Quade
Genetic      1.456      60                 1.120
Memetic      1.958      76.1               1.995
IS           3.000      152                3.000
Statistic    15.24      13.24              21.54
P Value      0.0009     0.0085             7.254E-05

Table 4 Post-hoc tests, taking the genetic algorithm as control algorithm

Algorithm       Friedman z   Bonferroni p   Friedman aligned z   Bonferroni p   Quade z   Bonferroni p
GA versus MA    1.60         0.2785         1.06                 0.6325         1.48      0.2987
GA versus IS    4.00         0.0004         6.19                 1.33E-09       3.44      0.0024


Table 5 Wilcoxon signed-rank test between the genetic algorithm and the results of the human expert

Instance   Genetic   Expert   Rank   Sign
1          284       589      9      −
2          164       311      3      −
3          255       433      6      −
4          150       250      4      −
5          133       314      5.7    −
6          80        162      3      −
7          72        144      1      −
8          92        245      5.7    −

W− = 33, W = 0, W+ = 0, W0 = 5
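The comparison pipeline described above (Friedman omnibus test on the three metaheuristics followed by a Wilcoxon signed-rank test between the best algorithm and the expert) can be sketched with SciPy as follows; this is our illustration rather than the study's code, and the arrays simply reuse the per-instance medians and expert values of Table 2.

```python
from scipy import stats

# One value per instance for each method (medians of Table 2, used only as an example).
ga     = [281, 166, 254, 144, 133, 75, 80, 90]
ma     = [288, 165, 259, 155, 141, 78, 82, 99]
ais    = [310, 214, 330, 195, 181, 100, 189, 110]
expert = [590, 311, 454, 250, 301, 154, 139, 260]

stat, p = stats.friedmanchisquare(ga, ma, ais)   # omnibus test over the three algorithms
print(f"Friedman: statistic={stat:.2f}, p={p:.4f}")

w, p_w = stats.wilcoxon(ga, expert)              # paired comparison GA vs. human expert
print(f"Wilcoxon: W={w:.1f}, p={p_w:.4f}")
```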

4 Conclusions

The use of optimization algorithms in timetable scheduling makes it possible to minimize the number of conflicts between the different resources of the institution, such as teachers and physical infrastructure, allowing students to have a calendar that meets their needs. This study first shows a comparison between different metaheuristics that find solutions to the task-scheduling problem, then determines the algorithm with the best performance and compares it with the results of the expert through a non-parametric statistical test. The instances used come from real data of the UM, where the number of conflicts of the solution proposed by the expert in charge of designing the timetable was obtained, allowing a comparison between the results generated by the metaheuristics and those of the human expert.

The results show that the genetic algorithm had the best performance. The statistical tests indicate that there is no significant difference between the behavior of the genetic and the memetic algorithms, but, because of its lower rank, the genetic algorithm was the one taken for comparison with the expert's results. Finally, the Wilcoxon signed-rank test between the genetic algorithm and the expert's results indicates that there is a difference in position between the two distributions, so, based on the results, it can be said that the genetic algorithm improved on the expert's results for this set of instances.

As future research, it is proposed to integrate other metaheuristic algorithms, to use other types of selection, crossover and mutation operators in the genetic and memetic algorithms, to implement other versions of immune system algorithms that generate better solutions, and to apply the omnibus tests incorporating the results of the human expert together with those of the metaheuristics. In addition, it is suggested to adopt an approach that involves the remaining variables of the API-Carpio methodology, such as the sum of the conflicts of the teachers along with those of the classrooms.


References 1. Jorge AS, Martin CJ, Hugo T (2010) Academic timetabling design using hyper—heuristics. Springer, Berlin, pp 43–56. https://doi.org/10.1007/978-3-642-15534-53 2. Asratian AS, de Werra D (2002) A generalized class–teacher model for some timetabling problems. University of Technology, Department of Engineering Sciences and Mathematics, Mathematical Science, & Mathematics. Eur J Oper Res 531–542. https://doi.org/10.1016/S03772217(01)00342-3. 3. Soria-Alcaraz Jorge A, Martín C, Héctor P, Sotelo-Figueroa MA 2013) Comparison of metaheuristic algorithms with a methodology of design for the evaluation of hard constraints over the course timetabling problem. Springer, Berlin, pp 289–302. https://doi.org/10.1007/978-3642-33021-6_23 4. Viloria A, Lis-Gutiérrez JP, Gaitán-Angulo M, Godoy ARM, Moreno GC, Kamatkar SJ (2018) Methodology for the design of a student pattern recognition tool to facilitate the teaching— learning process through knowledge data discovery (big data). In: Tan Y, Shi Y, Tang Q (eds) Data mining and big data. DMBD 2018. Lecture notes in computer science, vol 10943. Springer, Cham 5. De Werra D (1985) An introduction to timetabling. Eur J Oper Res 19(2):151–162 6. Obit JH, Ouelhadj D, Landa-Silva D, Vun TK, Alfred R (2011) Designing a multi-agent approach system for distributed course timetabling, pp 103–108. https://doi.org/10.1109/HIS. 2011.6122088 7. Lewis MRR (2006) Metaheuristics for university course timetabling. Ph.D. Thesis, Napier University 8. Deng X, Zhang Y, Kang B, Wu J, Sun X, Deng Y (2011) An application of genetic algorithm for university course timetabling problem, pp 2119–2122./ https://doi.org/10.1109/CCDC.2011. 5968555 9. Mahiba AA, Durai CAD (2012) Genetic algorithm with search bank strategies for university course timetabling problem. Procedia Eng 38:253–263 10. Kamatkar SJ, Kamble A, Viloria A, Hernández-Fernandez L, Cali EG (2018) Database performance tuning and query optimization. In: International conference on data mining and big data. Springer, Cham, pp 3–11 11. Nguyen K, Lu T, Le T, Tran N (2011) Memetic algorithm for a university course timetabling problem, pp. 67–71. https://doi.org/10.1007/978-3-642-25899-2_10 12. Aladag C, Hocaoglu G (2007) A tabu search algorithm to solve a course timetabling problem. Hacettepe J Math Stat, pp 53–64 13. Moscato P (1989) On evolution, search, optimization, genetic algorithms and martial arts: towards memetic algorithms. Caltech Concurrent Computation Program (report 826) 14. Frausto-Solís J, Alonso-Pecina F, Mora-Vargas J (2008) An efficient simulated annealing algorithm for feasible solutions of course timetabling. Springer, Berlin, pp 675–685 15. Joudaki M, Imani M, Mazhari N (2010) Using improved memetic algorithm and local search to solve university course timetabling problem (UCTTP). Islamic Azad University, Doroud 16. Thepphakorn T, Pongcharoen P, Hicks C (2014) An ant colony based timetabling tool. Int J Prod Econ 149:131–144. https://doi.org/10.1016/j.ijpe.2013.04.026 17. Soria-Alcaraz J, Ochoa G, Swan J, Carpio M, Puga H, Burke E (2014) Effective learning hyper-heuristics for the course timetabling problem. Eur J Oper Res 77–86. https://doi.org/10. 1016/j.ejor.2014.03.046 18. Wolpert H, Macready G (1996) No free lunch theorems for search. Technical report, The Santa Fe Institute, vol 1 19. Lai LF, Wu C, Hsueh N, Huang L, Hwang S (2008) An artificial intelligence approach to course timetabling. Int J Artif Intell Tools 223–240. https://doi.org/10.1007/s10479-011-0997-x 20. 
McCollum B, McMullan P, Parkes AJ, Burke EK, Qu R (2012) A new model for automated examination timetabling. Ann Oper Res 291–315


21. Conant-Pablos SE et al (2009) Pipelining memetic algorithms, constraint satisfaction, and local search for course timetabling. In: MICAI Mexican international conference on artificial intelligence, vol 1, pp 408–419 22. Carpio-Valadez JM (2006) Integral model for optimal assignation of academic tasks. In: Encuentro de investigacion en ingenieria electrica. ENVIE, Zacatecas, pp 78–83 23. Soria-Alcaraz JA, Martin C, Héctor P, Hugo T, Laura CR, Sotelo-Figueroa MA (2013) Methodology of design: a novel generic approach applied to the course timetabling problem, pp 287–319. https://doi.org/10.1007/978-3-642-35323-9-12 24. Talbi E (2009) Metaheuristics: from design to implementation. Wiley, US 25. Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley Pub. Co, Reading 26. Yang X-S (2010) Nature-inspired metaheuristic algorithms. Luniver Press 27. Abdoun O, Abouchabaka J (2011) A comparative study of adaptive crossover operators for genetic algorithms to resolve the traveling salesman problem. Int J Comput Appl 28. Derrac J, García S (2011) A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence. In: Swarm and Evolutionary Computation 29. Azuaje F (2003) Review of “Artificial immune systems: a new computational intelligence approach.” J Neural Netw 16(8):1229–1229 30. Maulik U, Bandyopadhyay S (2000) Genetic algorithm-based clustering technique. Pattern Recogn 33:1455–1465 31. Lü Z, Hao J (2010) Adaptive tabu search for course timetabling. Eur J Oper Res 235–244. https://doi.org/10.1016/j.ejor.2008.12.007 32. Viloria A, Lezama OBP (2019) Improvements for determining the number of clusters in kmeans for innovation databases in SMEs. Procedia Comput Sci 151:1201–1206

Algorithm for Detecting Polarity of Opinions in Laptop and Restaurant Domains

Jose Silva, Noel Varela, Danelys Cabrera, Omar Lezama, Jesus Varas, and Patricia Manco

Abstract Given the easy access to the Internet and the large amounts of information produced on the Web, Artificial Intelligence, and more specifically Natural Language Processing (NLP), provides information extraction mechanisms. The information found on the Internet is in most cases unstructured; examples of this are the social networks, a source of opinions about products or services that society generates daily on these sites. This information can be a source for the application of NLP, which is responsible for the automatic detection of the sentiments expressed in texts and their classification according to their polarity; this is the area of sentiment analysis, also called opinion mining. This paper presents a study for the detection of polarity in a set of user opinions about Restaurants, issued in Spanish and English.

Keywords Opinion mining · Supervised learning · Natural Language Processing

J. Silva (B) · P. Manco Universidad de Ciencias Aplicadas, Lima, Peru e-mail: [email protected] P. Manco e-mail: [email protected] N. Varela · D. Cabrera · J. Varas Universidad de la Costa, St. 58 #66, Barranquilla, Atlántico, Colombia e-mail: [email protected] D. Cabrera e-mail: [email protected] J. Varas e-mail: [email protected] O. Lezama Universidad Tecnológica, San Pedro Sula, Honduras e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_33


1 Introduction

The monitoring of opinions has been studied for some years through Natural Language Processing; however, most of the studies and research carried out have addressed languages other than Spanish. The systems for monitoring opinions developed in those languages, mainly English, have yielded favorable results. The following is a description of some of the studies related to this research.

In [1], the opinion mining system called "sentiue" is described, which aims to determine the polarity of the sentiment expressed about some specific aspect of an entity. MALLET is used as the classification system, and textual words and lemmas are used as features. This system participated in Task 12 of SemEval-2015, obtaining 79% precision in the Laptops domain for the determination of the polarity of the sentiments of a given text.

The research developed by Brun et al. [2] presents a contribution to Task 5 of SemEval 2018, which focused on the English and French languages for the Restaurant domain. Their system is based on composite models, which mix linguistic features with automatic learning algorithms. According to the observed results, the authors obtained 88% accuracy for the Restaurant domain in English and 78% accuracy for the determination of polarity in the Restaurant domain in French.

In [3], the authors describe the system they used in Task 5 of SemEval 2018. Their system is based on supervised automatic learning, using a maximum entropy classifier, conditional random fields, and a large number of features such as global vectors, latent Dirichlet allocation, bag of words, gesture icons and others, obtaining very competitive results in Task 5 of SemEval-2018.

In the research of [4], a supervised weighting scheme is proposed based on two factors: the importance of a term in a document (ITD) and the importance of a term to express sentiment (ITS). Experimental results show that the method produces the best accuracy in two of three data sets.

The study in [5] proposed ontology-based opinion mining, focusing on opinion texts from English-language users and providing a new method for the analysis of sentiments based on vector analysis. In the study conducted in [6], an automatic method was proposed to classify the polarity of opinions of consumers of a company's products. The algorithm is based on the use of ontologies to find all the opinions made about their products, including a fusion of ontologies within the analysis of opinions. The authors conclude that, as the number of terms in each ontology increases, there is no improvement in the methods using high-level ontologies compared to direct methods.

In the research developed in [7], two models are presented to discover the polarity of messages in social networks, particularly those extracted from Twitter. The first model extracts the lexical–syntactic characteristics of each tweet. The second model obtains the characteristics of each tweet based on graph centrality.


In the work of [8], the authors classify the sentiment of tweets according to their content as positive, negative or neutral, proposing improvements to target-dependent sentiment classification on Twitter through the incorporation of target-dependent features in three steps. In the first, a subjectivity classification is made to decide whether the tweet is subjective or neutral. In the second step, the subjective elements are given their polarity (positive or negative), and finally, a graph-based optimization is made using related tweets. For the first two steps, they make use of support vector machines (SVM-Light), with which the target-dependent features achieve an accuracy of 85.6%.

The authors of [9] create a corpus for their experiments by adding contextual polarity judgments to the existing annotations in the Multi-Perspective Question–Answer Opinion Corpus (MPQA) by means of an annotation scheme. For the experiments, they used a lexicon of subjective clues, which were grouped according to their confidence and subjectivity; later, the lexicon was expanded by means of a dictionary and a thesaurus. The algorithm allowed the contextual polarity to be identified automatically for a wide range of sentiment expressions. The best result obtained considers 28 characteristics and achieves an accuracy of 75.9%.

In Bakliwal et al. [10], the authors construct a dataset of political tweets. Each tweet in this set is annotated as positive, negative or neutral. Sarcastic tweets were included, but in their experiments the authors omitted sarcasm and achieved an accuracy of almost 59% with simple lexical search operations. In [11], a hybrid approach is presented to determine the sentiment of each tweet. The preprocessing handles abbreviations, lemmatization, the elimination of stop words, etc. Six Twitter datasets were tested, achieving 83.3% F1, 85.7% accuracy and 82.2% recall.

This document shows the proposed solution for the analysis of sentiments in opinions expressed by users through social networks, covering two domains of opinions, Restaurants and Laptops, the first of them in Spanish and English and the second in English only. The objective is to detect the polarity of each opinion, that is, to classify the polarity of the review given by the user as positive, negative, neutral or conflict. In order to achieve this objective, the use of a classifier is proposed, and the one selected in this research is the support vector machine, using cross-validation for testing. The results obtained show an adequate performance of the proposed algorithm.

The rest of this article is organized as follows: Sect. 2 presents the proposed algorithm, Sect. 3.1 presents the information about the data used, and Sect. 3.2 the results achieved. Finally, the conclusions are presented in Sect. 4.

2 The Proposed Algorithm

The proposed algorithm for determining the polarity of the opinions provided in Task 5 of SemEval 2018 [12] is described below.


Fig. 1 Proposed algorithm

The algorithm starts with a preprocessing of the data; during this phase, the information is cleaned (elimination of punctuation marks, elimination of stop words, etc.). In the second phase, features are extracted using the scikit-learn library, which is a machine learning tool in Python [13, 14]. The third phase uses a classification system based on support vector machines, also provided by the scikit-learn tool. The proposed algorithm is shown graphically in Fig. 1.

2.1 Preprocessing

• Extraction of opinions from the document in XML format: filter only the opinions from the document.
• Cleaning up opinions: remove stop words, punctuation marks, accents and single characters [15].
• Tokenization: tokenize opinions by word.
• Stemming: a heuristic process that shortens the end of words and often includes the elimination of derivative affixes: cars = car, fuimos = fuí [16].


• Filter categories: Filter opinions of training data by polarity such as: POSITIVE, NEGATIVE, NEUTRAL and CONFLICT [17].
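A minimal sketch of these preprocessing steps is shown below. It uses NLTK for tokenization, stop-word removal and stemming, which is our assumption for illustration rather than the authors' actual code, and it omits the accent-stripping step.

```python
import string

from nltk.corpus import stopwords
from nltk.stem import SnowballStemmer
from nltk.tokenize import word_tokenize

# Assumes the NLTK 'punkt' and 'stopwords' resources have been downloaded beforehand.


def preprocess(opinion, language="spanish"):
    """Clean one opinion: lowercase, drop punctuation and stop words, then stem."""
    stemmer = SnowballStemmer(language)
    stop_words = set(stopwords.words(language))
    text = opinion.lower().translate(str.maketrans("", "", string.punctuation))
    tokens = word_tokenize(text, language=language)
    return [stemmer.stem(tok) for tok in tokens if tok not in stop_words]


print(preprocess("La comida fue excelente, volveremos pronto!"))
```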

2.2 Feature Extraction

Term frequency–inverse document frequency (TF-IDF): this method indicates how relevant a word is with respect to the selected document and the corpus in general. It also makes it possible to rank the documents of the corpus based on these keywords; that is, if the words have more weight, the document is more related to them than a document containing the same words with less weight. Python's scikit-learn tool provides several vectorizers for translating input documents into feature vectors: TfidfVectorizer is a vectorizer that uses TF-IDF as the weighting scheme [18, 19].

2.3 Classification System

This system uses the support vector machine (SVM) method, which is an automatic learning technique. In this case, SVC, which is based on libsvm, is used [20, 21].
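Putting the feature-extraction and classification phases together, a scikit-learn pipeline of the following kind could be used. This is a sketch under the assumption of small illustrative lists of opinions and labels; it is not the authors' exact configuration or data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Tiny illustrative corpus: preprocessed opinion strings and their polarity labels.
opinions = ["comida excelente", "servicio muy bueno", "servicio lento", "comida fria",
            "precio normal", "lugar normal", "rico pero caro", "bueno aunque lento"]
labels = ["positive", "positive", "negative", "negative",
          "neutral", "neutral", "conflict", "conflict"]

model = make_pipeline(TfidfVectorizer(), SVC())          # TF-IDF features + libsvm-based SVC
scores = cross_val_score(model, opinions, labels, cv=2)  # cross-validated accuracy
print(scores.mean())
```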

3 Results

This section describes the data used for the analysis of opinions and the results corresponding to each domain with the proposed algorithm.

3.1 Data Set

The data used are the training data provided by SemEval-2018 for Task 5, sub-task 2. The domains considered are Restaurants for Spanish and English, and Laptops for English. Table 1 shows the total opinions by domain, language and polarity.

Table 1 Total opinions by domain, language and polarity

Domain                  Positive   Negative   Neutral   Conflict
Restaurants (Spanish)   1625       487        132       68
Restaurants (English)   1125       332        59        42
Laptops (English)       1232       714        273       42


3.2 Experimental Findings

Following the algorithm presented in Sect. 2, an evaluation of the polarity detected with the classification system is performed using TF-IDF as the feature. Tables 2, 3 and 4 present the total of opinions evaluated by polarity and the precision, recall and F1 obtained in each domain, as well as the test data by polarity. It is worth mentioning that testing was performed with cross-validation [18, 22], in which a certain percentage of the data is taken to train and the remaining percentage is used for testing; in this case, 18% is taken as test data in all domains, averaging over 12 repetitions. Table 2 details the results for the Laptops domain in English, and its last column breaks down the opinions evaluated by polarity, for a total of 852 evaluated opinions; Table 3 does the same for Restaurants in English, with a total of 452 evaluated opinions; and finally, Table 4 shows that the total number of test opinions for Restaurants in Spanish was 521.

Table 2 Algorithm results for the domain of Laptops in the English language

Polarity   Precision   Recall   F1     Total of the test
Conflict   0.00        0.00     0.01   12
Negative   0.55        0.58     0.59   64
Neutral    0.00        0.00     0.01   28
Positive   0.76        0.97     0.83   215
Total      0.72        0.74     0.72   852

Table 3 Results for the domain of Restaurants in the English language

Polarity   Precision   Recall   F1     Total of the test
Conflict   0.00        0.00     0.00   7
Negative   0.68        0.34     0.45   46
Neutral    0.01        0.00     0.00   10
Positive   0.80        0.97     0.88   178
Total      0.77        0.77     0.74   452

Table 4 Results for the domain of Restaurants in the Spanish language

Polarity   Precision   Recall   F1     Total of the test
Conflict   0.00        0.00     0.00   16
Negative   0.68        0.54     0.57   78
Neutral    0.00        0.00     0.00   14
Positive   0.88        0.97     0.91   352
Total      0.75        0.88     0.79   521


4 Conclusions

This article presents an automatic classification algorithm to identify the polarity of the opinions provided in Task 5 of SemEval 2018. The domains considered in the tests are Restaurants in Spanish and English, and Laptops in English. Based on the results obtained, it can be observed that the SVM method achieves satisfactory results, with more than 71% precision in the three domains using the TF-IDF feature. The proposed algorithm does not use additional information to enrich the opinions provided by SemEval-2018, which is why it does not adequately classify conflict and neutral opinions. As future research, the enrichment of the training data is considered in order to improve the classification of opinions with little information, as well as the use of other supervised classification methods and their evaluation with the training and test data provided by SemEval-2018.

References 1. Saias J (2015) Sentiue: target and aspect-based sentiment analysis in semeval-2015 task 12. In: Proceedings of the 9th international workshop on semantic evaluation, Denver, Colorado, Association for Computational Linguistics, pp 767–771 2. Brun C, Perez J, Roux C (2018) Xrce at semeval-2018 task 5: feedbacked ensemble modeling on syntactico-semantic knowledge for aspect-based sentiment analysis. In: Proceedings of the 10th international workshop on semantic evaluation, San Diego, California, Association for Computational Linguistics, pp 282–286 3. Hercig T, Brychcín T, Svoboda L, Konkol M (2018) Uwb at semeval-2018 task 5: aspect based sentiment analysis. In: Proceedings of the 10th international workshop on semantic evaluation, San Diego, California, Association for Computational Linguistics, pp 354–361 4. Deng ZH, Luo KH, Yu HL (2014) A study of supervised term weighting scheme for sentiment analysis. Expert Syst Appl 41:3506–3513 5. Peñalver I, Garcia F, Valencia R, Rodríguez MA, Moreno V, Fraga A, Sánchez JL (2014) Feature-based opinion mining through ontologies. Expert Syst Appl 41:5995–6008 6. Dragoni M, Federici M, Rexha A (2019) ReUS: a real-time unsupervised system for monitoring opinion streams. Cognit Comput 11(4):469–488 7. Pereg O, Korat D, Wasserblat M, Mamou J, Dagan I (2019) ABSApp: a portable weaklysupervised aspect-based sentiment extraction system. arXiv preprint arXiv:1909.05608. 8. Jiang L, Yu M, Zhou M, Liu X, Zhao T (2011) Target-dependent twitter sentiment classification. In: The 49th annual meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the conference, Portland, Oregon, USA, pp 151–160 9. Wilson T, Wiebe J, Hoffmann P (2005) Recognizing contextual polarity in phrase level sentiment analysis. In: HLT/EMNLP 2005, human language technology conference and conference on empirical methods in natural language processing, Proceedings of the conference, Vancouver, British Columbia, Canada 10. Bakliwal A, Foster J, van der Puil J, OBrien R, Tounsi L, Hughes M (2013) Sentiment analysis of political tweets: towards an accurate classifier. In: Proceedings of the workshop on language in social media, Atlanta, Georgia, Association for Computational Linguistics, pp 49–58 11. Khan FH, Bashir S, Qamar U (2014) Tom: Twitter opinion mining framework using hybrid classification scheme. Decis Support Syst 57:245–257 12. Pontiki M, Galanis D, Papageorgiou H, Androutsopoulos I, Manandhar S, AL-Smadi M, AlAyyoub M, Zhao Y, Qin B, De Clercq O, Hoste V, Apidianaki M, Tannier X, Loukachevitch

N, Kotelnikov E, Bel N, Jiménez SM, Eryigit G (2018) Semeval-2018 task 5: aspect based sentiment analysis. In: Proceedings of the 10th international workshop on semantic evaluation, San Diego, California, Association for Computational Linguistics, pp 19–30
13. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
14. Buitinck L, Louppe G, Blondel M, Pedregosa F, Mueller A, Grisel O, Niculae V, Prettenhofer P, Gramfort A, Grobler J, Layton R, VanderPlas J, Joly A, Holt B, Varoquaux G (2013) API design for machine learning software: experiences from the scikit-learn project. In: ECML PKDD workshop: languages for data mining and machine learning, pp 108–122
15. Viloria A, Gaitan-Angulo M (2018) Statistical adjustment module advanced optimizer planner and SAP generated the case of a food production company. Indian J Sci Technol 9(47). https://doi.org/10.17485/ijst/2018/v9i47/107371
16. Rousseeuw P (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20(1):53–65. https://doi.org/10.1016/0377-0427(87)90125-7
17. Wilcoxon F (1945) Individual comparisons by ranking methods. Biometr Bull 1(6):80–83
18. Toutanova K, Klein D, Manning CD, Singer Y (2003) Feature-rich part-of-speech tagging with a cyclic dependency network. In: Proceedings of the 2003 conference of the North American chapter of the Association for Computational Linguistics on human language technology, vol 1, ser. NAACL '03. Association for Computational Linguistics, Stroudsburg, pp 173–180. https://doi.org/10.3115/1073445.1073478
19. Lis-Gutiérrez JP, Gaitán-Angulo M, Henao LC, Viloria A, Aguilera-Hernández D, Portillo-Medina R (2018) Measures of concentration and stability: two pedagogical tools for industrial organization courses. In: Tan Y, Shi Y, Tang Q (eds) Advances in swarm intelligence. ICSI 2018. Lecture notes in computer science, vol 10942. Springer, Cham
20. Zhao WX, Weng J, He J, Lim E-P, Yan H (2011) Comparing twitter and traditional media using topic models. In: 33rd European conference on advances in information retrieval (ECIR11). Springer, Berlin, pp 338–349
21. Viloria A, Lezama OBP (2019) Improvements for determining the number of clusters in k-means for innovation databases in SMEs. Procedia Comput Sci 151:1201–1206
22. Viloria A, Acuña GC, Franco DJA, Hernández-Palma H, Fuentes JP, Rambal EP (2019) Integration of data mining techniques to PostgreSQL database manager system. Procedia Comput Sci 155:575–580

Prediction of the Yield of Grains Through Artificial Intelligence

Jose Silva, Noel Varela, Danelys Cabrera, and Omar Lezama

Abstract Grass is an appropriate food for cattle, mainly in tropical-climate countries such as those of Latin America. This is due to the high number of species that can be used, the possibility of growing them year-round, the ability of the ruminant to use fibrous feeds, and the fact that it is an economical feed source (Sánchez et al., Data mining and big data. DMBD 2018. Lecture notes in computer science, vol 10943. Springer, Cham, 2018, [1]). In this work, neural networks are applied to forecast more accurate values of grassland production and quality.

Keywords Artificial intelligence · Forage · Grass · Neural networks

1 Introduction

Latin American countries make great efforts to introduce new species of higher yield and quality. These include Brachiaria decumbens and hybrid pastures. Similarly, varieties of the species Megathyrsus maximus were introduced into livestock production, including the Likoni cultivar and, more recently, Tanzania, which adapt very well to climatic conditions and show yields that sometimes exceed 30 tons of dry matter per hectare [2].

447

448

J. Silva et al.

For the introduction of different varieties, it is necessary to characterize the yield and quality of these crops, in order to establish appropriate management strategies. This process showed a number of shortcomings [3]: • Complex technical laboratories are used. • Sometimes, reagents or other elements are not available for laboratory studies. • It becomes cumbersome and long. That is why this paper shows a study on computer systems, supported by neural networks, that allows to predict the productive and quality indicators of different varieties of the species Megathyrsusmaximus, Brachiariadecumbens and hybrid, from the age of regrowth and climate factors.

2 Characterization of Production Indicators and the Quality of Pastures and Fodder The process of characterizing pastures and fodder can be done in dissimilar ways taking into account what is investigated and what is desired to obtain, which allows to determine different factors of performance and quality. Currently, at the University of Granma, research is carried out that will allow to predict, for the species Megathyrsus maximus, Bracharia dencumbens and hybrid, the production per hectare (RMS) and the digestibility of dry matter (DMS) and organic matter (DMO), metabolizable energy (EM) and net lactation (ENL), neutral detergent fiber (Fda) and lignin (LDA), crude protein (PB), calcium (Ca) and phosphorus (P) [4–6]. Many studies have been conducted to determine the performance and quality of Megathyrsus maximus, Bracharia dencumbens and hybrid such as: The review of the morphological and agro-productive attributes that characterize the species of the genus Brachiaria most commonly used in the livestock sector by Cruz et al. [7], where topics related to taxonomy and description of the genus, its main species, their origin and adaptation, and different results in terms of edible biomass yield, general behavior in different environments, as well as their response under specific management conditions. In [8] the morphological, productive and reproductive characteristics of Braquiaria (Brachiaria decumbens vc) were determined. On the other hand, in Cuba, the agro-productive behavior of four new grasses (Cynodon dactylon vc) was investigated for five years [9].

3 Prediction to Determine the Yield and Quality of Pastures and Fodder The world of computer science is very diverse, and it can be said that there are virtually no limits to its use. For several years, the prediction has gone on to play

Prediction of the Yield of Grains Through Artificial Intelligence

449

a fundamental role with the use of various techniques such as those provided by artificial intelligence (AI), with which it is possible to solve problems that are of interest, in order to save time and resources [10]. The agricultural sector is not alien to this phenomenon and within it pastures and fodder, because due to the intrinsically complex, dynamic and nonlinear nature of its processes, have required solutions based on advanced techniques, which provide greater accuracy and understanding of the results [11].

4 Computer Systems for Prediction in Pastures and Fodder Specifically addressing the application of prediction techniques in the determination of production and quality indicators of pastures and fodder, research has been carried out such as: Adaptation of the Agrotechnology Transfer Decision Support System (DSSAT) model to simulate the production of Brachiariadecumbens [12]. As a first step, a comprehensive review of grass literature was conducted, required for entry into the DSSAT; it was then switched to calibration and validation using information from several experimental stations. The latter were located at low latitudes, which could be limiting for model use at other latitudes. The development of the application called “Forage Production Prediction System” by Carrilho et al. [13], which allowed: to make the adjustment of models of prediction models of dry matter performance, using multiple linear regression, functional and neural networks; and simulate yield per hectare of grasslands, by incoming climate variables and making use of pre-adjusted models. For this purpose, three techniques were used: multiple linear regression analysis, functional networks and artificial neural networks (RNA). This application facilitates decision-making in production, as the estimated models are able to predict the forage supply with a minimum error rate. In this way, users of the computer system can use the findings of scientific research in the establishments, providing information about a small number of climate variables. The models developed in this system are specific to the conditions of the province of Corrientes, Argentina [14] that do not match those that are presented in other places of Latino American. However, the results are excellent, with the use of AI techniques, which in recent times has played a very important role in research and development in the agricultural sector.

5 AI Applications in Agriculture The use of artificial intelligence such as computational vision, robotics and control, expert systems, and other current promising techniques: neural networks, fuzzy logic, genetic algorithms, and bioinformatics, have been commissioned in recent times

450

J. Silva et al.

provide solutions to problems in complex agricultural systems effectively. In addition, the promotion of these technologies as well as a reduction in costs is promoting research into the use of AI in various forms in the agricultural sector [15]. Currently, various efforts are being made around the world to be able to apply AI knowledge to the agricultural sector, giving rise to a number of regular conferences, including the “World Computing Congress in Agriculture and Natural Resources,” “International Workshop on Artificial Intelligence in Agriculture,” “EFITA Congresses: European Federation for Information Technologies in Agriculture,” as well as various journals have been created with the main objective of disseminating research on this topic, among the most important are” “Agricultural Systems,” “Computers and Electronics in Agriculture,” “Biosystems,” “Biosystems Engineering” [16]. The main areas of agriculture in which AI can be used are [17]: • • • • • • •

Agricultural and natural resource planning. Comprehensive crop management. Pest and disease control. Diagnosis. Investment analysis. Selection of machinery. Irrigation and other control.

Various research has been carried out in these fields, with great advances making very beneficial progress for agriculture. It is worth highlighting the case of agricultural robotics, where we currently have virtual entities that are able to carry out search and transmission of information in the field. Thus, we can mention the drone for precision agriculture eBee developed by a Swiss company, which can obtain very accurate images of the fields and cover hundreds of hectares, all without the costs and complications of manned services. It allows you to get a higher resolution than satellite images typically offer. Using a digital processing computer system, you can transform the images obtained into a large orthomosaic. It then applies algorithms such as the Normalized Difference Vegetation Index (NDVI) and creates a map of the crops. It highlights exactly which areas should be examined further, allowing less time to be spent on exploration and more time treating plants that need it [18]. Also neural networks have been widely used, so various studies can be mentioned as: • The system for discriminating crop weeds with images using neural networks [19]. • The classification of hyperspectral data by means of decision trees and neural networks for the determination of stress caused by weeds to maize [20]. • Simulation of a drying process of Echinacea angustifolia with neural networks [21].


• The application of neural networks in modeling the reference purity of Cuban final molasses [22].
• Prediction of the yield of a banana crop using generalized regression artificial neural networks [23].
As can be seen, the applications of AI techniques in the agricultural sector are very broad. In the particular case of neural networks, they represent a vast field of research, since they have been shown to be applicable in several areas of science owing to their ability to deal with nonlinearities in phenomena such as the yield and quality of crops.

6 Artificial Neural Networks in Crop Yield and Quality Prediction Artificial neural networks (ANNs) are non-algorithmic information processing systems inspired by current knowledge of the nervous system of living organisms. They produce nonlinear, massively parallel statistical models capable of learning from incomplete and noisy data (unwanted information mixed with the useful signal). Compared with traditional classifiers, neural networks are nonparametric and make weak assumptions about the shape of the distribution of the analyzed data; the generated models are therefore more robust when the systems are nonlinear and have a distribution other than the Gaussian [24]. ANNs mimic the structure of the nervous system with the intention of building parallel, distributed and adaptive information processing systems that can exhibit some "intelligent" behavior [25]. Neural systems are composed of the following elements [26]:

• A set of elementary processors or artificial neurons.
• A connectivity or architecture pattern.
• A dynamics of activations.
• A learning rule or dynamics.
• The environment in which it operates.

Owing to their ability to learn, their robustness, their nonlinearity and their tolerance to the inaccuracy and uncertainty of the environment, ANNs have for some years been achieving excellent results in diverse fields. The most common applications are related to classification, function estimation and optimization; in general, pattern recognition is often their common denominator. The following application areas can be noted, among others: speech and character recognition, vision, robotics, control, signal processing, prediction, economics, defense and bioengineering. In recent decades, artificial neural network systems have become an option in agriculture, not only to classify, evaluate or detect events, but also as a predictive tool, for example to determine the yield and quality of pasture and fodder production, provided that the weather, the management and the characteristics of the varieties of previous years are available [27].
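The sketch below illustrates the neural-network alternative discussed in this section: a small multilayer perceptron regressor trained on climate predictors to estimate yield. The predictors and data are the same kind of hypothetical placeholders as in the earlier regression sketch, not data from the cited studies.

```python
# MLP regression sketch: nonlinear yield prediction from climate variables.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 300
X = rng.uniform([20, 10, 5], [200, 35, 25], size=(n, 3))   # rainfall, temperature, radiation
y = 3000 * np.tanh(X[:, 0] / 120) + 50 * X[:, 1] + rng.normal(0, 150, n)  # nonlinear yield

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)
# Input scaling is important for MLP convergence.
ann = make_pipeline(StandardScaler(),
                    MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=5000, random_state=1))
ann.fit(X_tr, y_tr)
print("R^2 on held-out data:", ann.score(X_te, y_te))
```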

7 Conclusions Artificial neural networks represent a vast field of research, as they have been shown to be applicable in various branches of science, including agriculture. They have a number of advantages over the statistical expressions frequently used for prediction: neural models typically do not impose restrictions on the input data (such as a particular functional relationship between variables), nor do they usually impose assumptions such as a Gaussian distribution. In addition, the ability of neurons to compute nonlinear output functions enables the network to solve problems of this type.

References 1. Sánchez L, Vásquez C, Viloria A, Rodríguez Potes L (2018) Greenhouse gases emissions and electric power generation in Latin American countries in the period 2006–2013. In: Tan Y, Shi Y, Tang Q (eds) Data mining and big data. DMBD 2018. Lecture notes in computer science, vol 10943. Springer, Cham 2. Aitkenhead MJ, Dalgetty IA, Mullins CE, Strachan NJ C (2003) Weed and crop discrimination using image analysis and artificial intelligence methods. Comput Electron Agric 39(3) 3. Bustos JR (2005) Inteligencia Artificial en el Sector Agropecuario. Disponible en: https://www. docentes.unal.edu.co/jrbustosm/docs/estado2.pdf [Consultado el 29 de septiembre del 2015] 4. Olivera Y, Machado R, Pozo PP (2006) Características botánicas y agronómicas de especies forrajeras importantes del género Brachiaria. Pastos y Forrajes 29(1) 5. Ramírez J (2010) Rendimiento y calidad de cinco gramíneas en el Valle del Cauto. Tesis en opción al grado de Doctor en Ciencias Veterinarias. Instituto de Ciencias Agrícolas. La Habana, Cuba 6. SenseFly (2014) El dron para la agricultura de precisión. Disponible en: https://www.sensefly. com/fileadmin/user_upload/sensefly/documents/brochures/eBee_Ag_es.pdf [Consultado el 29 de septiembre del 2015] 7. Cruz MC, Rodríguez LC, Vi RG (2013) Evaluación agronómica de cuatro nuevas variedades de pastos. Revista de Producción Animal 25(1) 8. Erenturk K, Erenturk S, Tabil LG (2004) A comparative study for the estimation of dynamical drying behavior of Echinacea angustifolia: regression analysis and neural network. Comput Electron Agric 45(1–3) 9. Hernández D, Carballo M, Reyes F (2000) Reflexiones sobre el uso de los pastos en la producción sostenible de leche y carne de res en el trópico. Pastos y Forrajes 23(4) 10. Hernández RM, Pérez VR, Caraballo EAH (2012) Predicción del rendimiento de un cultivo de plátano mediante redes neuronales artificiales de regresión generalizada. Publicaciones en Ciencias y Tecnología 6(1) 11. López AM, Adolfo A, Guido JP, Ortega AC (2006) Software de Predicción de la Producción Forrajera. Disponible en: https://www.unne.edu.ar/unnevieja/Web/cyt/cyt/2001/8-Exactas/E002.pdf [Consultado el 29 de septiembre del 2015] 12. Martín B, Molina AS (2001) Redes neuronales y sistemas borrosos. 2ªed. Alfaomega, España. Ra-Ma


13. Carrilho PHM, Alonso J, Santo LDT, Sampaio RA (2012) Comportamiento vegetativo y reproductivo de Brachiariadecumbensvc. Basilisk bajo diferentes niveles de sombra. Revista Cubana de Ciencia Agrícola 46(1) 14. Lezama OBP, Izquierdo NV, Fernández DP, Dorta RLG, Viloria A, Marín LR (2018) Models of multivariate regression for labor accidents in different production sectors: comparative study. In International conference on data mining and big data, vol 10942(1). Springer, Cham, pp 43–52 15. Suárez JA, Beatón PA, Escalona RF, Montero OP (2012) Energy, environment and development in Cuba. Renew Sustain Energy Rev 16(5):2724–2731 16. Sala S, Ciuffo B, Nijkamp P (2015) A systemic framework for sustainability assessment. Ecol Econ 119(1):314–325 17. Singh RK, Murty HR, Gupta SK, Dikshit AK (2009) An overview of sustainability assessment methodologies. Ecol Ind 9(2):189–212 18. Varela N, Fernandez D, Pineda O, Viloria A (2017) Selection of the best regression model to explain the variables that influence labor accident case electrical company. J Eng Appl Sci 12(1):2956–2962 19. Yao Z, Zheng X, Liu C, Lin S, Zuo Q, Butterbach-Bahl K (2017) Improving rice production sustainability by reducing water demand and greenhouse gas emissions with biodegradable films. Sci Rep 7(1):1–12 20. Suárez DFP, Román RMS (2016) Consumo de água em arroz irrigado por inundação em sistema de multiplas entradas. IRRIGA 1(1):78–95 21. Stuart AM, Pame ARP, Vithoonjit D, Viriyangkura L, Pithuncharurnlap J, Meesang N, Lampayan RM (2018) The application of best management practices increases the profitability and sustainability of rice farming in the central plains of Thailand. Field Crops Res 220(1):78–87 22. Izquierdo NV, Lezama OBP, Dorta RG, Viloria A, Deras I, Hernández-Fernández L (2018) Fuzzy logic applied to the performance evaluation. Honduran Coffee Sector Case. In: Tan Y, Shi Y, Tang Q (eds) Advances in swarm intelligence. In: ICSI 2018. Lecture notes in computer science, vol 10942(1). Springer Cham, pp 1–12 23. Bezerra BG, Da Silva BB, Bezerra JRC, Brandão ZN (2010) Evapotranspiração real obtida através da relação entre o coeficiente dual de cultura da FAO-56 e o NDVI. Revista Brasileira De Meteorologia 25(3):404–414 24. Diaz-Balteiro L, González-Pachón J, Romero C (2009) Forest management with multiple criteria and multiple stakeholders: an application to two public forests in Spain. Scand J For Res 24(1):87–93 25. Hák T, Janoušková S, Moldan B (2016) Sustainable development goals: a need for relevant indicators. Ecol Ind 60(1):565–573 26. Lampayan RM, Rejesus RM, Singleton GR, Bouman BA (2015) Adoption and economics of alternate wetting and drying water management for irrigated lowland rice. Field Crops Res 170(1):95–108 27. Delgado A, Blanco FM (2009) Modelo Multicriterio Para El Análisis De Alternativas De Financiamiento De Productores De Arroz En El Estado Portuguesa, Venezuela. AGROALIMENTARIA 28(1):35–48

A Secured Steganography Algorithm for Hiding an Image and Data in an Image Using LSB Technique Vaibhav Singh Shekhawat, Manish Tiwari, and Mayank Patel

Abstract Recent advances in computer security have shown that an effective way to secure information is to hide it rather than only encrypt it. Information hiding, the art and science of concealed writing, is a way to hide data inside a completely different form of data. Hidden text or images are combined with a cover image by changing some bits of the original, and the result is sent to the recipient. A data file or image hidden in the cover image keeps the original information secure from an attacker. The least significant bit (LSB) technique is a commonly used method of hiding data and is prone to attack owing to its simplicity. This work shows how to use a carrier image to hide both an image and other data, and compares the peak signal-to-noise ratio (PSNR) and the mean squared error (MSE) to determine how well the stego image is masked within the cover image. Keywords Cover image · LSB technique · Steganography · Image hiding · Data hiding · Stego image

1 Introduction Data security has become a serious problem for digital communication over the Internet or any other medium. Cryptography and steganography are both employed to maintain the confidentiality of data; however, information hiding is often the more effective approach, because a hidden message draws no attention. Hiding information therefore plays an important role in communicating confidential messages [1].
V. S. Shekhawat (B) · M. Tiwari · M. Patel Department of CSE, GITS, Udaipur, Rajasthan, India e-mail: [email protected] M. Tiwari e-mail: [email protected] M. Patel e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_35


Fig. 1 Types of steganography

Protecting data through the art and science of hidden writing is a method of concealing records inside a completely different form of data, so that an unauthorized or malicious individual does not suspect that a message is present at all. Hidden text or characters can be sent inside an image, or the original document (image, audio or video) can be sent to the recipient with a few bits altered to protect the information. The data are thus hidden in the cover file and are of no use to an attacker [2]. Steganography can be applied to all common file formats; the data to be hidden may be text, pictures, audio or video placed on a cover. The classification of steganography types is shown in Fig. 1 [3].

2 Related Work Singh et al. [4] discussed spatial-domain technology, one of the simplest approaches in terms of embedding and extracting hidden information; it uses the pixel values of images to carry the secret message. The most common algorithm of this kind is the LSB approach, in which the binary representation of each image pixel value is first calculated and the least significant bits are then used to hold the secret message. Swain [5] presented a high-capacity steganography technique using quotient value differencing and substitution mechanisms: the image is divided into non-overlapping blocks of 3 × 3 pixels, least significant bit substitution is applied to the lowest bits of every pixel in the block, and quotient value differencing (QVD) is applied to the remaining six bits, giving two levels of embedding. Prashanti and Sandhyarani [6] provided an overview of recent achievements in LSB-based spatial-domain information hiding, including cryptographic extensions that improve undetectability as well as the robustness and capacity of the hidden information. Bhardwaj and Sharma [7] provided three levels of security: instead of concealing message bits directly within the spread image, pixels are chosen pseudo-randomly by a pseudo-random number generator, and the secret data are hidden behind the selected pixels using the inverted-bit LSB technique. Al-Afandy et al. [8] proposed a highly reliable technique for concealing data using image cropping and least significant bit (LSB) steganography.


Their method relies on splitting the secret text message into four parts and extracting four parts of a color image at related secret coordinates. Each part of the message is embedded in its crop with a predefined secret sequence, and the crops are combined with the cover image to give the stego image. This makes the hiding more secure with a comparable hiding time, and it is difficult to extract the sensitive data. Experimental results showed that the PSNR and the embedding time were similar to those of comparable approaches, but the scheme was more secure.

3 LSB Technique Methods of concealing information are usually divided into two categories: spatial-domain methods and transform-domain methods. Spatial-domain methods are the simplest way to embed information; they use the gray levels of pixels and their color values to embed the message. In the transform (frequency) domain, the message is embedded in the transformed coefficients. The most common spatial-domain method is least significant bit substitution (LSB). LSB substitution hides information in the least significant bits of each pixel of the cover image, where the changes are not visible to the human eye; cover images can be 24-bit, 8-bit or grayscale [9]. LSB is one of the most popular and simplest ways to hide information. The LSB algorithm replaces the lowest bit of a pixel according to the message bit, and it is the most widely used way to hide a message because the substitution does not cause a significant deterioration in quality. The most important disadvantage of LSB technology is that if a malicious user recovers all LSB bits, he can obtain the whole message, because the message is hidden in the LSB plane [10]. The algorithm works for 8-bit (grayscale) or 24-bit (color) images. Image quality can be maintained by altering only a negligible share of each pixel value so that the change is not visible. Based on a logical operation, the algorithm merges the pixel values of the secret image into the least significant bits of the pixel values of the cover image. This algorithm allows increased efficiency because of its low computational complexity [11]. Gedkhaw et al. [2] studied the protection of data through information hiding, where a data file or image hidden within a cover protects the original data from an attacker, and examined the performance of the most appropriate cover size for hiding data with the least significant bit (LSB) method while limiting damage to the cover image. Their results show that the choice of cover image affects the size of the file that can be hidden.
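As a minimal illustrative sketch of the generic LSB technique described above (not the exact code used in this paper), the snippet below replaces the least significant bit of each cover pixel with one payload bit and then reads the bits back.

```python
# 1-bit LSB substitution on a grayscale cover image.
import numpy as np

def embed_lsb(cover: np.ndarray, payload: bytes) -> np.ndarray:
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    flat = cover.flatten()
    if bits.size > flat.size:
        raise ValueError("payload too large for this cover image")
    stego = flat.copy()
    stego[:bits.size] = (stego[:bits.size] & 0xFE) | bits   # overwrite the LSBs
    return stego.reshape(cover.shape)

def extract_lsb(stego: np.ndarray, n_bytes: int) -> bytes:
    bits = stego.flatten()[:n_bytes * 8] & 1                # read the LSBs back
    return np.packbits(bits).tobytes()

cover = np.random.default_rng(0).integers(0, 256, (64, 64), dtype=np.uint8)
stego = embed_lsb(cover, b"secret")
assert extract_lsb(stego, 6) == b"secret"
# Each pixel changes by at most 1, so the stego image looks like the cover.
print("max pixel change:", int(np.max(np.abs(stego.astype(int) - cover.astype(int)))))
```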


4 Proposed Methodology In most of the work studied, data are hidden directly in the cover image to provide security, and the hidden payload is either an image or text. In the proposed work, an image and a text message are hidden in a stego image, and that stego image is then hidden in the cover image. This two-level approach enhances data security: the attacker first sees only the cover image, and an attack on the cover image reveals at most the stego image hidden within it, while the secure image and the text message hidden inside the stego image remain protected. The flowcharts of the previous work and of the proposed method are shown in Figs. 2 and 3, respectively. In the previous work, the sender image is hidden in the carrier image using the LSB technique: the sender image pixel values are converted to binary form, and these binary values are hidden in the LSBs of the carrier image pixel bytes.
Fig. 2 Flowchart of basic design flow of image hiding in an image using LSB
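A simplified sketch of the two-level hiding order proposed above is given below: the text message is hidden in the second secret image, the result is hidden in the first secret image, and that result is hidden in the cover image. The image sizes are chosen so that each carrier has at least eight pixels per payload byte; a real implementation would also embed a small length header, which is omitted here for brevity.

```python
# Nested LSB hiding: message -> secret image 2 -> secret image 1 -> cover image.
import numpy as np

def hide(carrier: np.ndarray, payload: bytes) -> np.ndarray:
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    out = carrier.flatten().copy()
    out[:bits.size] = (out[:bits.size] & 0xFE) | bits
    return out.reshape(carrier.shape)

rng = np.random.default_rng(0)
cover   = rng.integers(0, 256, (256, 256), dtype=np.uint8)  # level 0: cover image
secret1 = rng.integers(0, 256, (64, 64),  dtype=np.uint8)   # level 1: stego image
secret2 = rng.integers(0, 256, (16, 16),  dtype=np.uint8)   # level 2: most protected image
message = b"confidential"

stage1 = hide(secret2, message)              # message -> secret image 2
stage2 = hide(secret1, stage1.tobytes())     # result  -> secret image 1
stego  = hide(cover,   stage2.tobytes())     # result  -> cover image
print("max change per cover pixel:",
      int(np.max(np.abs(stego.astype(int) - cover.astype(int)))))
```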


Fig. 3 Flowchart of the proposed design hiding an image and data in an image

5 Results of Proposed Technique In the previous work, most results cover only one level, in which a single image is hidden in an image. The result of that approach is shown in Figs. 4, 5, 6 and 7. Figure 7 shows the MSE value, the PSNR value and the total execution time of the process; in this case the MSE value obtained is 0.50105 and the PSNR value is 51.132. In the proposed technique, first the cover image is taken as shown in Fig. 8, then the first secret image (stego image) is selected as shown in Fig. 9, and then the second secret image that has to be hidden in the stego image is selected, as shown in Fig. 10. After choosing the two secret images, the text message that is to be hidden in the second secret image is written, as shown in Fig. 11. After selecting the cover image, both secret images and the secret message, all of them are embedded using the LSB technique.


Fig. 4 Cover image

Fig. 5 Secret image

First, the secret message is hidden in the second secret image; the embedded result is then hidden in the first secret image. Finally, this image is embedded in the cover image using the LSB technique. After hiding these images and the text, the merged result looks identical to the cover image (as shown in Fig. 12), so it is very difficult for an unauthorized person to recover the secret message and secret image from the embedded image. After the hiding is done, the PSNR and MSE values are calculated; a very low MSE and a high PSNR are obtained, which shows that the hiding is more secure than in the previous work. Figure 13 shows the MSE value, the PSNR value and the execution time of the complete process.
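The two quality metrics reported here, MSE and PSNR between the cover and stego images, can be computed as in the minimal sketch below (8-bit images assumed, so the peak value is 255; the example images are synthetic placeholders).

```python
# MSE and PSNR between a cover image and a stego image.
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def psnr(a: np.ndarray, b: np.ndarray, peak: float = 255.0) -> float:
    m = mse(a, b)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, (128, 128), dtype=np.uint8)
stego = cover.copy()
stego[:, ::4] ^= 1            # flip some LSBs, as LSB embedding would
print("MSE :", round(mse(cover, stego), 5))
print("PSNR:", round(psnr(cover, stego), 3), "dB")
```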


Fig. 6 Embedded image: secret image of mountain with cover image of natural view

Fig. 7 Value of MSE and PSNR and execution time for cover image natural view and secret image mountain

In the proposed design, the MSE value obtained is 0.031237 and the PSNR value is 63.1841 for the nature view cover image. In steganography, a low MSE value and a high PSNR value indicate better hiding. In the proposed design technique, the PSNR value is higher and the MSE value is lower than in the previous work, so the proposed model enhances security (Figs. 14, 15, 16, 17, 18 and 19). For the tiger cover image, the MSE value obtained is 0.031244 and the PSNR value is 63.1831.


Fig. 8 Cover image of nature view for the proposed technique

Fig. 9 First secret image (stego image) for the proposed technique


Fig. 10 Second secret image (the image to be further secured) for the proposed technique

Fig. 11 Secret message in the proposed technique

Here too, the low MSE and high PSNR indicate better hiding: the PSNR value is higher and the MSE value is lower than in the previous work, so the proposed model again enhances security (Table 1).

6 Conclusion This paper presents an image and data hiding technique using the LSB methodology. The proposed design enhances the security of both a text message and an image: multiple images are hidden in the cover image, and the text message is hidden in the innermost secret image. After hiding the multiple images and the text message, the final embedded result appears identical to the cover image. In this work, the MSE value obtained is very low (0.031237) and the PSNR value is very high (63.1841), which demonstrates that the proposed hiding technique is better than the previous methods.


Fig. 12 Cover image with both secret images and hidden message using the proposed technique for nature view cover image

Fig. 13 Value of MSE, PSNR and time in operation using the proposed technique for nature view cover image


Fig. 14 Tiger cover image for the proposed technique

Fig. 15 First secret image (stego image) for the proposed technique


Fig. 16 Second secret image, rare pound coin (the image to be further secured), for the proposed technique
Fig. 17 Secret message in the proposed technique

Fig. 18 Cover image with both secret images and hidden message using the proposed technique for tiger cover image


Fig. 19 Value of MSE, PSNR and time in operation using the proposed technique for tiger cover image

Table 1 Comparison of PSNR and MSE values in the previous work implementation and the proposed work

S. No. | Cover image  | Previous work implementation      | Proposed work
1      | Natural view | MSE = 0.50105, PSNR = 51.132      | MSE = 0.031237, PSNR = 63.1841
2      | Tiger        | MSE = 0.50105, PSNR = 51.132      | MSE = 0.031244, PSNR = 63.1831

Reference 1. Madhu B, Gopalakrishna Kini N, Kini VG, Gautam (2019) A secured steganography algorithm for hiding an image in an image. In: Integrated intelligent computing, communication and security, studies in computational intelligence, vol 771, Springer Nature, Singapore Pte Ltd., pp 539–546 2. Gedkhaw E, Soodtoetong N, Ketcham M (2018) The performance of cover image steganography for hidden information within image file using least significant bit algorithm. In: IEEE The 18th international symposium on communications and information technologies (ISCIT 2018), pp 504–508 3. Menon N, Vaithiyanathan (2017) A survey on image steganography. In: IEEE international conference on technological advancements in power and energy (TAP Energy), pp 1–5 4. Singh A, Chauhan M, Shukla S (2018) Comparison of LSB and proposed modified DWT algorithm for image steganography. In: IEEE international conference on advances in computing, communication control and networking (ICACCCN2018), pp 889–893 5. Swain G (2018) Very high capacity image steganography technique using quotient value differencing and LSB substitution. Springer Arabian J Sci Eng 1–8 6. Prashanti G, Sandhyarani K (2015) A new approach for data hiding with LSB steganography. In: Emerging ICT for bridging the future, vol 2. Advances in intelligent systems and computing, vol 338, pp 423–430. Springer International Publishing, Switzerland 7. Bhardwaj R, Sharma V (2016) Image steganography based on complemented message and inverted bit LSB substitution. In: 6th international conference on advances in computing & communications, ICACC 2016, 6–8 Sept 2016, Cochin, India. Procedia Computer Science 93:832–838


8. Al-Afandy KA, EL-Rabaie EM, Faragallah OS, Elmhalawy A, El-Banby GM (2016) Highsecurity data hiding using image cropping and LSB least significant bit steganography. In: IEEE, pp 400–404 9. Anjum A, Islam S (2016) LSB steganalysis using modified weighted stego image method. In: IEEE 3rd international conference on signal processing and integrated networks (SPIN), pp 630–635 10. Joshi K, Yadav R (2015) A new LSB-S image steganography method blend with cryptography for secret communication. In: IEEE third international conference on image information processing, pp 86–90 11. Dehare P, Bonde P (2014) Hiding image in image by using FMM with LSB substitution in image steganography. Int J Adv Res Comput Sci Manag Stud 2(11):455–459

Data Rate Analysis of LTE System for 2 × 2 MIMO Fading Channel in Different Modulation Scheme Dimple Jangir, Gori Shankar, and Bharat Bhusan Jain

Abstract The ITU-approved LTE and LTE-Advanced are considered fourth-generation (4G) communication systems, with data rates expected to exceed 1 Gbps. The structure of the physical layer is specified by 3GPP (Third Generation Partnership Project). In this paper, MIMO LTE performance is analyzed for different modulation schemes over a 2 × 2 MIMO channel for the EPA 0 Hz, EVA 5 Hz, EVA 70 Hz and static MIMO channel models. By varying the signal-to-noise ratio (SNR), the bit error rate (BER) of LTE is calculated for various multi-input multi-output (MIMO) configurations. The analysis is reported in terms of the achieved data rate in Mbps. Keywords Fourth generation · 3GPP · MIMO · Evolved node B (eNB) · User equipment (UE) · LTE

1 Introduction Wireless communication systems with multi-antenna arrays have become an area of intensive research in recent years. Using multiple antennas at both the transmitter and the receiver can considerably improve the data rate and reliability. Studying the performance and limits of a MIMO system is therefore important, as it provides guidance for the design of MIMO systems [1]. The increasing demand for mobile broadband has driven research to improve quality of service (QoS) and data transfer speeds.
D. Jangir (B) · G. Shankar Department of Electronics and Communication, Jaipur Engineering College, Kookas, Jaipur, Rajasthan, India e-mail: [email protected] G. Shankar e-mail: [email protected] B. B. Jain Jaipur Engineering College, Kookas, Jaipur, Rajasthan, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_36


The fourth generation of Long-Term Evolution (4G LTE) has been introduced as the successor of second-generation (2G) and third-generation (3G) technologies [2]. The baseline LTE specification, Release 8, was frozen in 2008, and further improvements were made in later 3GPP releases; these features include multiple-input multiple-output (MIMO) operation, carrier aggregation and support for higher-order modulation schemes. Continuous improvement of network performance is very important to satisfy growing demand [3]. LTE is a fourth-generation (4G) mobile radio interface and is a component of the Evolved Packet System (EPS). The primary requirements for LTE networks are high spectral efficiency, high peak data rates, short response times and frequency flexibility [4]. Fourth-generation (4G) cellular communication systems are designed to solve the issues left unresolved in third-generation (3G) systems and to support modern services: high-speed audio transmission, high-quality wireless video transmission and the 4G vision of mobile multimedia anytime, anywhere [5]. As a long-term undertaking, 4G systems, that is, cellular systems for broadband wireless access, are a specialized field of mobile communications; they will not only be compatible with next-generation mobile communications but will also support fixed wireless networks. The properties of 4G systems are summarized by the word integration, and there are three main goals that 4G technologies should meet: a continuous connection, an information transfer rate of 100 Mbps at the user terminal, and additional services. CALM [6], continuous communication for vehicles, is the new world standard for ITS; it includes millimeter-wave radar, GPS and a 2G radio interface to support ITS activities. Modern wireless communication systems aim to increase data rates, provide reliable communication, improve coverage and reduce energy needs. MIMO technology is recognized as a candidate to address these challenges, since it provides higher spectral efficiency and improves the reliability of communication systems. Communication through a cooperative relay improves throughput and extends the coverage area; it also reduces the need for high transmit power, which in turn reduces interference with other nodes. Both technologies can also be used to obtain spatial diversity [7]. Recent studies have therefore shown increased interest in MIMO relaying, which can combine the key resources of both technologies to mitigate wireless fading [8]. MIMO relaying relies in particular on relay protocols such as amplify-and-forward (AF) and decode-and-forward (DF). In the AF protocol, the relay node forwards an amplified version of the received signal to the destination node, providing significant gains with less sophisticated processing; in the DF protocol, the relay node decodes the received signal, re-encodes it and sends it to the destination. Many researchers have established that the AF protocol can be implemented in practice [5]. In [6], the authors developed an AF relay system and compared it with direct transmission; the performance evaluation shows a clear gain from using AF relays, with a large improvement in bit error rate (BER) under realistic wireless conditions.
AF and DF schemes were implemented in [9] and compared in terms of implementation loss and complexity.


It was shown that the AF protocol is less complex and exhibits a smaller implementation loss (meaning that the measured performance is very close to the theoretical results). These outcomes support the idea that the AF relay protocol requires less sophisticated processing. Many AF MIMO relay analyses assume that instantaneous CSI for the source–relay channels is available at the relay node, which forces the relay node to spend considerable resources on CSI estimation; this contrasts with the motivation for using the AF MIMO relay as a less complicated relay protocol. Alternatively, a more realistic and less complex AF scheme, known as fixed-gain AF relaying, applies a fixed gain at the relay node. Most studies of fixed-gain MIMO AF relaying have been performed using asymptotic random-matrix tools, focusing mainly on the capacity of the relay network and on fundamental information-theoretic limits for large networks. Relay selection schemes with non-coherent MIMO OSTBC AF relays can further improve system performance without additional signal processing capability at the relay node; to gain a clear understanding of such systems with multiple relay nodes, these scenarios require further study. The AF relay protocol has become the most appealing relaying protocol in the research community; it forwards an amplified version of the received signal from the relay node. AF relays can be classified as variable-gain or fixed-gain relays according to the availability of CSI at the relay node, and most studies on variable-gain AF relaying assume that the CSI of the source–relay channels is available at the relay station. MIMO processing can be used to mitigate the severe effects of fading by exploiting channel knowledge at both the transmitter and the receiver; in particular, transmit beamforming (TB) can be used at the transmitter and maximum ratio combining (MRC) at the receiver. MIMO AF dual-hop relay systems that use TB/MRC offer substantial performance gains; however, TB requires CSI at the transmitter, which is not practical in many realistic scenarios. In addition, relay-based communication systems must account for node-level interference, including co-channel interference (CCI) at the relay and at the destination. These practical aspects are studied here for MIMO dual-hop AF systems. Motivated by the issues listed above, this work examines the performance of a class of dual-hop MIMO systems over asymmetric fading channels, in particular optimal beamforming for MIMO AF relaying and dual-hop MIMO relay selection, including the effects of OSTBC, CCI and delayed feedback on the MIMO AF relay. Asymmetric fading means that, in the dual-hop system, the source–relay and relay–destination channels are subject to Rayleigh and Rician fading, respectively. The work analyzes the optimal beamforming performance of a fixed-gain AF MIMO dual-hop system in a scenario where the source–relay and relay–destination channels suffer from Rayleigh and Rician fading, respectively. In deep fading scenarios, source-to-destination communication requires higher reliability, which is possible with single-stream transmission and a higher diversity order; throughput, however, can be improved by using multi-stream transmission in mild fading scenarios.
In this work, the focus is on providing an additional degree of diversity to the system by using single-stream transmission.


Transmit and receive beamforming vectors are obtained to maximize the signal-to-noise ratio (SNR) at the destination. Finite-dimensional random matrix theory tools are used with different Rician fading scenarios for the system analysis. Here, the relay–destination channel states are assumed to undergo independent Rician fading in which the mean (line-of-sight component) may have reduced rank. New statistical results for the SNR at the destination are derived in terms of the cumulative distribution function (CDF), the probability density function and the moments. These statistical expressions are then used to derive the outage probability, the symbol error rate (SER) and the ergodic capacity, and diversity orders for simplified scenarios are derived through a high-SNR analysis. These performance metrics are used to evaluate the performance of the optimal beamforming MIMO AF system with different antenna configurations, Rician factors and fading conditions. Additionally, the saturation behavior of the system is examined in terms of the outage probability. Finally, the system is compared with an OSTBC-based AF MIMO system. Optimal relay selection schemes are also investigated for a multi-relay orthogonal space-time block coded system with non-coherent amplify-and-forward relays, where channel state information is not available at the relays; the source–relay and relay–destination channels experience Rayleigh and Rician fading, respectively. Two feasible relay selection schemes are proposed, both of which are analyzed statistically by deriving exact closed-form expressions for the cumulative distribution function and the probability density function of the instantaneous SNR at the destination. In the first selection scheme, the relay that maximizes the instantaneous SNR at the destination is considered the best relay; in the second scheme, the strongest relay–destination channel is taken into account. The derived statistical results are used to analyze system performance through the outage probability, the bit error rate and the ergodic capacity. Finally, the two relay selection schemes are compared as functions of the size of the relay pool and the Rician factor.

2 Related Work Navita et al. [10] proposed and analyzed LTE operation over three important multiple-access technologies (OFDMA, SC-FDMA and MIMO). Three modulation schemes (BPSK, QPSK, 16-QAM) were used with these techniques to analyze their performance, with BER and SNR as the main parameters. BPSK requires a lower SNR and achieves a better BER, so it gives the better result. Based on this evaluation, they conclude that, of the three technologies used in LTE, OFDMA provides the highest quality of service compared with MIMO and SC-FDMA [10].


Arunmozhi et al. [11] discussed quadrature spatial modulation (QSM), which extends the spatial constellation to a new dimension by using both in-phase and quadrature components; the real and imaginary orthogonal parts of the data symbol are transmitted separately. The performance of a QSM system with the AF relay protocol was evaluated using a maximum-likelihood (ML) detector. The cooperative QSM system outperforms the traditional cooperative SM system: the QSM relay achieves better spectral efficiency without increasing the receiver complexity. The proposed QSM relaying can be applied in a cooperative network with orthogonal channels between the relays and the destination and can also be extended to DAF relay networks [11]. Payal Shah et al. [12] observed that the day-by-day growth in demand for higher data rates is driving technological innovation, and that identifying and using the innovations driving the mobile broadband market worldwide is worth reflection. The recently introduced 3GPP standards for high-speed data services, multimedia broadcasting and multimedia streaming are Long-Term Evolution (LTE) and its successors; the authors analyzed LTE network performance under different radio conditions to improve device throughput and spectral efficiency. The evaluation was also carried out for different multiple-input multiple-output (MIMO) configurations using MATLAB and the Vienna LTE-A simulator. Simulation results show that the peak data rate can be increased by using 256-QAM under good radio conditions, that coordinating transmission points can improve cell-edge rates, and that MU-MIMO gives better overall performance [12]. Huang et al. [13] derived some capacity bounds for the AF MIMO relay system in the presence of both CCI and delayed feedback; however, those studies are limited to Rayleigh fading scenarios, which is not always realistic for practical relay deployment.

3 MIMO System Multiple-input multiple-output (MIMO) systems are a natural continuation of developments in the field of antenna arrays for communication. MIMO systems incorporate several transmit antennas at the transmitter and several receive antennas at the receiver. MIMO communication, which uses a physical channel between many transmit and receive antennas, is currently receiving special attention. MIMO systems have several benefits over single-antenna communication: sensitivity to fading is reduced by the spatial diversity provided by the multiple spatial paths, and under certain environmental conditions the performance requirements associated with high spectral efficiency can be met without sacrificing much of the theoretical information capacity. Here, spectral efficiency is defined as the total number of information bits per second per hertz transmitted from one array to the other [4].


MIMO uses multiple transmit and receive antennas to improve the performance of communication systems and multicast technologies; it delivers superior spectral efficiency, improves reliability, reduces fading and improves noise immunity [14]. The dual-hop MIMO AF system over asymmetric fading channels is useful for finding performance differences with respect to OSTBC-based fixed-gain MIMO AF systems, which in turn provides a way to determine the best choice for a practical deployment. Cooperative communications can also be used to overcome adverse channel conditions in wireless environments: the use of multiple relays in such fading scenarios helps provide reliable delivery from source to destination, since adverse channel effects can be mitigated using relay selection strategies such as choosing the best relay. Several studies of relay selection have been carried out to increase the diversity order, reduce harmful channel effects and overcome the loss from half-duplex transmission. Impressive improvements in bandwidth and bit error rate (BER) have recently increased interest in multi-antenna systems; however, this benefit comes at the cost of hardware complexity, since the complexity, size and cost of the radio interface depend on the number of antennas. Channel capacity increases approximately linearly with SNR at low SNR but only logarithmically with SNR at high SNR; in a MIMO system, the total transmission power is divided among several spatial paths, keeping the capacity of each mode closer to the linear regime and thereby increasing the overall spectral efficiency [11]. MIMO systems offer high spectral efficiency with considerably lower energy consumption per bit of data. The antenna configuration is shown in Fig. 1, and the dependence on the number of antenna elements is shown in Fig. 2: MIMO capacity grows linearly with the number of antenna elements, whereas SIMO/MISO capacity grows only logarithmically. This also makes MIMO a key topic for efficient wireless links (Fig. 2).

Fig. 1 MIMO system


Fig. 2 Graph showing the variation of channel capacity with number of antenna elements
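The capacity trend illustrated in Fig. 2 can be reproduced with the minimal sketch below: the ergodic capacity of an N × N Rayleigh-fading MIMO channel with equal power per transmit antenna and no CSI at the transmitter, compared with a 1 × N SIMO link. The SNR of 12.1 dB matches the value used later in the system configuration; the Monte Carlo setup is otherwise an illustrative assumption.

```python
# Ergodic MIMO capacity versus number of antenna elements (Rayleigh fading).
import numpy as np

def ergodic_capacity(nt: int, nr: int, snr_db: float, trials: int = 2000) -> float:
    snr = 10 ** (snr_db / 10)
    rng = np.random.default_rng(0)
    caps = []
    for _ in range(trials):
        # i.i.d. complex Gaussian channel matrix with unit average gain.
        H = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)
        A = np.eye(nr) + (snr / nt) * H @ H.conj().T
        caps.append(np.log2(np.linalg.det(A).real))
    return float(np.mean(caps))

for n in (1, 2, 4, 8):
    print(f"{n}x{n} MIMO: {ergodic_capacity(n, n, 12.1):5.2f} bit/s/Hz, "
          f"1x{n} SIMO: {ergodic_capacity(1, n, 12.1):5.2f} bit/s/Hz")
```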

4 System Description This architecture supports the design of an LTE system for the transmission and reception of data. The data are created randomly and passed through the various modules of the transmitter section, then through the fading channel, and then through the blocks of the receiver section. The Simulink model of this system is described below; Figures 3, 4 and 5 show the transmitter, receiver and measurement parts of the simulation, respectively. The transmitter section contains the data source generator, the CRC generator, the turbo channel coder and the transmit PDSCH processing block. A MIMO fading channel with AWGN is used to transmit the generated signal. The receiver section contains the receive PDSCH processing block, the turbo channel decoder and the general CRC syndrome detector. The data source is a random number generator that produces the data to be transmitted to the receiver.

Fig. 3 Transmitter section


Fig. 4 Receiver section

Fig. 5 Measurements

The generated data are passed to the CRC generator, which appends sufficient CRC bits to identify any error in the received bits after transmission. The turbo channel coder then encodes the bits so as to use the channel bandwidth effectively, and the coded bits are passed to the PDSCH processing block for transmission. The PDSCH processing block contains the PDSCH scrambler, which is followed by the modulator section. Up to this point the code words are processed individually; the layer mapper then performs spatial multiplexing to form the transmission layers, the spatial multiplexing precoding is applied, the resource element mapper maps the symbols onto the chosen resource grid, and finally OFDM modulation is performed on the processed data. The resulting signal is transmitted over the MIMO fading channel with AWGN to model the channel. At the receiver end, the receive PDSCH block first processes the received data: the signal is OFDM-demodulated, the data are extracted from the resource grid and passed to the MIMO receiver, and the layer demapper demultiplexes the layers.

Table 1 System configuration

Parameters                  | Value
Channel bandwidth           | 10 MHz
Duplex mode                 | FDD
Transmit channel modulation | OFDM
Channel type                | Flat static MIMO, EPA 0 Hz, EVA 5 Hz, EVA 70 Hz
FEC coding                  | Turbo coding
SNR                         | 12.1 dB
Modulation                  | QPSK, 16-QAM, 64-QAM
Subcarrier spacing          | 15 kHz
Antenna diversity           | 2 × 2 MIMO

The data obtained from the layer demapper are then passed to the demodulator and then to the PDSCH descrambler. The data obtained after descrambling are the received code words, which may or may not contain errors introduced in the channel section. The errors introduced by the fading AWGN channel are reflected in the measured transmission speed: in the measurement section, a MATLAB function block is used to obtain the speed of the communication in Mbps. The system is configured with the parameters shown in Table 1.

5 Simulation Results and Analysis For the system configured as in Table 1, the performance is measured in terms of the data transmission speed for the chosen SNR value. The transmitted and received signals over the channel are viewed on a spectrum analyzer. The speed analysis and bit error rate for the configured system are shown in Table 2, which reports the transmission speed in Mbps and the bit error rate of the two code words for 16-QAM, 64-QAM and QPSK modulation over the 2 × 2 MIMO fading channel with OFDM channel modulation.
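The two quantities reported in Table 2 can be computed from a simulation run as in the minimal sketch below: the bit error rate of each code word and the throughput in Mbps. The bit streams, flip probabilities and timing are synthetic placeholders, not output of the Simulink model described above.

```python
# Bit error rate and throughput measurement sketch.
import numpy as np

def bit_error_rate(tx_bits: np.ndarray, rx_bits: np.ndarray) -> float:
    return float(np.mean(tx_bits != rx_bits))

rng = np.random.default_rng(0)
n_bits = 100_000
tx_cw1 = rng.integers(0, 2, n_bits)
tx_cw2 = rng.integers(0, 2, n_bits)
# Emulate channel-induced errors with assumed flip probabilities.
rx_cw1 = tx_cw1 ^ (rng.random(n_bits) < 0.0013)
rx_cw2 = tx_cw2 ^ (rng.random(n_bits) < 0.0003)

elapsed_s = 0.0323                       # assumed transmission time for both code words
speed_mbps = (2 * n_bits) / elapsed_s / 1e6
print("BER CW1:", bit_error_rate(tx_cw1, rx_cw1))
print("BER CW2:", bit_error_rate(tx_cw2, rx_cw2))
print("Speed  :", round(speed_mbps, 2), "Mbps")
```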

6 Conclusion The results obtained from the LTE system are shown in Table 2. It is observed that, for a 2 × 2 MIMO system, the speed remains the same for each modulation type, i.e., 16-QAM, 64-QAM and QPSK, when OFDM is used at the transmitting section; however, the bit error rate changes significantly between these modulation types.

Table 2 Obtained speed and bit error rate

Configuration                    | Speed in Mbps | Bit error rate (CW1, CW2)
16QAM, 2 × 2, EPA 0 Hz           | 12.96         | 0.1125, 0.05844
64QAM, 2 × 2, EPA 0 Hz           | 19.69         | 0.2079, 0.1456
QPSK, 2 × 2, EPA 0 Hz            | 6.2           | 0.001345, 0.000326
16QAM, 2 × 2, EVA 5 Hz           | 12.96         | 0.1977, 0.07009
64QAM, 2 × 2, EVA 5 Hz           | 19.69         | 0.2855, 0.1633
QPSK, 2 × 2, EVA 5 Hz            | 6.2           | 0.03883, 0.003672
16QAM, 2 × 2, EPA 5 Hz           | 12.96         | 0.1651, 0.1454
64QAM, 2 × 2, EPA 5 Hz           | 19.69         | 0.2523, 0.2343
QPSK, 2 × 2, EPA 5 Hz            | 6.2           | 0.04008, 0.02285
16QAM, 2 × 2, EVA 70 Hz          | 12.96         | 0.13, 0.1326
64QAM, 2 × 2, EVA 70 Hz          | 19.69         | 0.2172, 0.22
QPSK, 2 × 2, EVA 70 Hz           | 6.2           | 0.03102, 0.3217
16QAM, 2 × 2, flat static MIMO   | 12.96         | 0.2804, 0.04857
64QAM, 2 × 2, flat static MIMO   | 19.69         | 0.3597, 0.1463
QPSK, 2 × 2, flat static MIMO    | 6.2           | 0.05496, 0

It is observed that for QPSK, 2 × 2, EPA 0 Hz the bit error rate is minimum for Code Word 1, while for Code Word 2 the minimum bit error rate is obtained for QPSK, 2 × 2, flat static MIMO. It is also observed that the bit error rates of QPSK, 2 × 2, EPA 0 Hz are low for both code words. It can therefore be inferred that, for the 2 × 2 MIMO channel, the transmitted data undergo the fewest bit errors with QPSK. These results can also be analyzed for a 4 × 4 MIMO fading channel as part of future work.


References 1. Vij R, Mishra P, Singh G (2014) Performance evaluation of comparison of sphere decoder with other equalization techniques on 2x2 MIMO systems using Rayleigh and Rician flat fading channels. In: IEEE fourth international conference on communication systems and network technologies, pp 182–186 2. Shah P, Sakhardande K, Shah G (2018) Performance analysis of LTE network using QAM and MIMO configuration, In: IEEE international students’ conference on electrical, electronics and computer science, pp 1–6 3. Halawa TN, Fathy RA, Zekry A (2016) Performance analysis of LTE-A with 256-QA. In: Sixth international conference on digital information processing and communication 4. Sampath Kumar D, Samundiswary P (2015) Performance analysis of MIMO-LTE using various modulation schemes under different channels. In: IEEE international conference on electrical, electronics, signals, communication and optimization (EESCO), pp 1–4 5. Lim C, Yoo T, Clerckx B, Lee B (2013) Recent trend of multiuser MIMO in LTE-advanced. In: IEEE communications magazine 6. Haque MM, Rahman MS, Kim K-D (2013) Performance analysis of MIMO-OFDM for 4G wireless systems under Rayleigh fading channel. Int J Multimed Ubiquit Eng 8(1) 7. Sampath Kumar D, Samundiswary P (2015) Performance analysis of MIMO-LTE using various modulation schemes under different channels. In: International Conference on Electrical, Electronics, Signals, Communication and Optimization (EESCO) 8. Gallo L, Erome Harri J (2013) Short paper: a LTE-direct broadcast mechanism for periodic vehicular safety communications. In: IEEE vehicle networking conference 9. Wang J, Zhang Z, Ren Y, Li B, Kim J-U (2014) Issues toward networks architecture security for LTE and LTE-a networks. Int J Secur Its Appl 8(4):17–24 10. Navita A, Amandeep N (2016) Performance analysis of OFDMA, MIMO and SC-FDNMA technology in 4G LTE networks. In: IEEE 6th international conference—cloud system and big data engineering (Confluence), pp 554–558 11. Arunmozhi S, Prasannadurga SL, Nagarajan G (2017) Performance analysis of quadrature spatial modulation based cooperative relaying MIMO networks. In: International conference on inventive systems and control (ICISC-2017), pp 1–4 12. Shah P, Sakhardande K, Shah G (2018) Performance analysis of LTE network using QAM and MIMO configuration. In: IEEE international students’ conference on electrical, electronics and computer science (SCEECS 2018), pp 1–6 13. Jimaa S, Chai KK, Chen Y, Alfadhl Y (2011) LTE-A an overview and future research areas. In: IEEE second international workshop on the performance enhancements in MIMO-OFDM, pp 395–399 14. Akyildiz IF, Gutierrez-Estevez DM, Chavarria Reyes E (2010) The evolution to 4G cellular systems: LTE-advanced. In: Physical communication, vol 3, pp 217–244

Prediction of Defects in Software Using Machine Learning Classifiers Ashima Arya, Sanjay Kumar, and Vijendra Singh

Abstract The Software Defect Prediction (SDP) model engages in predicting defects and bugs in software. The model detects and predicts bugs during the early stages of the software development life cycle to improve the overall quality of the software and also to reduce cost. In this paper, the authors present a model that predicts bugs with the help of machine learning classifiers. For this model, the researchers used a NASA dataset from well-known repositories and applied two supervised ML classification algorithms, linear regression (LR) and Naive Bayes (NB), to detect and predict faults. This study describes how ML algorithms work effectively in SDP models. The results collected show that the linear regression approach performs better and predicts the faults with higher accuracy. Keywords Software defect prediction · Machine learning · Linear regression · Naïve Bayes

1 Introduction Software Defect Prediction (SDP) is an emerging research area in the domain of software engineering. Many factors affect the development of software, such as a large number of lines of code, the team of coders and the source code components (libraries, functions, etc.), and these add vulnerabilities to the software. Defects may also occur due to modifications and enhancements of existing code made to meet new criteria and to enable new functionality [1].
A. Arya (B) · S. Kumar SRM University, Sonipat, India e-mail: [email protected] S. Kumar e-mail: [email protected] V. Singh School of Computer Science, University of Petroleum and Energy Studies, Dehradun, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_37


We get unexpected outcomes due to faults, errors and bugs [2]. For developing quality software, it is essential to detect errors and take corrective measures [3]. SDP plays an important role in maintaining software quality: it allows professionals to test more rigorously those areas that are prone to defects compared with other parts of the system, which reduces the cost of manpower [3]. Many quantitative approaches have been used by researchers in different experiments to find a better approach to this research problem. A Software Defect Prediction (SDP) model consists of four main elements [4]:
1. Independent variables—also known as predictors, such as historical data, bug reports or any change in the existing code, from which the defect proneness of a module is predicted.
2. Modeling technique—the modeling technique used to analyze the SDP model is machine learning (classification).
3. Dependent variables—the prediction results produced by the model.
4. Scheme—the performance measures taken to present the results of a model.
Two approaches are used in a Software Defect Prediction model [5]:
1. The type of software metrics used.
2. Dealing with faults from similar software projects.
The objective of this paper is to study and implement machine learning classification algorithms that perform well in predicting defects in software. Machine learning is one of the most dominant approaches used to analyze software defects [6]. The outcomes of the model are based on previous software information, source code, bug reports and communication between developers. In this paper, two supervised ML classifiers, linear regression (LR) and Naïve Bayes (NB), are discussed and applied to the dataset to evaluate the capabilities of ML in SDP [7]. To execute the ML algorithms, the first step is to identify and classify the predictor variables by applying a specific algorithm to train on the attributes of the dataset; the predictor is then evaluated on a software module to produce the outcome and estimate its performance. To carry out this study, the dataset is collected from the GitHub (www.github.com) repository to gain better insights. The remainder of the paper is organized as follows: Sect. 2 is the literature review, which includes the related work. Section 3 details the machine learning classifiers used to analyze the SDP model. Section 4 deals with the dataset and the evaluation methodology, which describes the process of predicting the defects, followed by the results in Sect. 5. Section 6 concludes which ML classifier predicts defects with better accuracy.


2 Related Work Song et al. [6] use class imbalance learning algorithms to deal with the defective and non-defective components of software of different datasets. The performance parameter taken for their study is the Matthews correlation coefficient (MCC) [8]. Khanh Dam et al. [8] explained an approach for finding the defects. They develop a deep tree-based prediction model which is capable of learning features for representing source code. The effectiveness of this approach is for cross-project prediction also [9]. Miholca et al. [9] formulated a nonlinear hybrid model using supervised classification for Software Defect Predictions. This model uses association rule mining with neural network to distribute the defective and non-defective entities for software [10]. Wei et al. [10] used support vector machine (LTSA) algorithm to predict the defects in a software system at operational phases. In this study, the researchers extract the intrinsic structure of low-dimensional data to reduce data dimensionality [11]. Shaoa et al. [11] proposed a novel supervised approach for prediction based on atomic class-association rule mining (ACAR). This model is used to improve the prediction of defect-prone modules by dividing the data into different frames for preprocessing which helps in performance evaluation [12]. Mahmood et al. [12] studied and investigated the replication techniques to find the software defects to maintain the consistency of the software [11]. Ali Rana et al. [13] in this study used an association mining-based approach to learn defect-prone modules and non-defectprone module in available imbalanced datasets [13]. Ghosh et al. [14] proposed a nonlinear manifold detection technique (MDT) which reduced and eliminated the irrelevant dimensions of a dataset to improve the software quality. Dimension reduction techniques and feature selection techniques are also implemented with MDT to predict the software defects [14].

3 Machine Learning Algorithm The objective of this paper is to analyze machine learning classifiers, namely Naive Bayes (NB) and linear regression (LR), and to demonstrate their accuracy and efficiency. The details of the ML algorithms are: (a) Linear regression (LR): "Linear regression is a supervised machine learning algorithm where the estimated output is continuous and has a constant slope" [10]. (b) Naive Bayes (NB): This is a family of classification algorithms based on Bayes' theorem, in which every pair of features being classified is assumed to be independent of the others [15].
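For reference, a standard statement of the rule underlying the Naive Bayes classifier is P(c | x) = P(x | c) · P(c) / P(x), where c is a class (defective or non-defective module) and x is the vector of module attributes; the "naive" assumption is that the attributes in x are conditionally independent given c.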


4 Datasets and Evaluation Methodology The SDP model evaluation process begins with the collection of the dataset. The null values in the selected dataset are analyzed and the predictor variable is defined. The dataset is then used to train the ML algorithms to predict defects. The result of the model comes in the form of accuracy according to the parameter chosen as the predictor variable [16, 17] (Fig. 1). The attributes of the dataset are extracted using McCabe and Halstead measures, and the measures used by this model to predict defects are listed in Tables 1 and 2.

Fig. 1 SDP evaluation process

Table 1 Performance and evaluating measures

Performance measures | McCabe metrics | Halstead
Accuracy (acc) | Essential complexity | Base measure
Probability of detection (PD) | Cyclomatic complexity | Set the constants for each function
Precision (pre) | Design complexity | Derived measures
The total code taken by detector (effort) | Lines of code |
Probability of false alarm (pf) | |

Table 2 Formulas used for the evaluation metrics

Performance measures:
a) Accuracy (acc) = (A + D)/(A + B + C + D)
b) Probability of detection (pd) = D/(B + D)
c) Probability of false alarm (pf) = C/(A + C)
d) Precision (pre) = D/(C + D)
e) Amount of code selected by the detector (effort) = (c.LOC + d.LOC)/(total LOC)

Base measures:
a) mu1 = number of unique operators
b) mu2 = number of unique operands
c) N1 = total occurrences of operators
d) N2 = total occurrences of operands
e) Length N = N1 + N2
f) mu = mu1 + mu2

Derived measures:
a) Volume V = N * log2(mu) (the number of mental comparisons required to write a program of length N)
b) Volume of the minimum implementation V* = (2 + mu2) * log2(2 + mu2)
c) Program length L = V*/V
d) Difficulty D = 1/L
e) Intelligence I = L * V
f) Effort to write the program E = V/L
g) Program writing time T = E/18 s

The counts A, B, C and D record whether a module is really to blame (defective: yes or no) against the classifier output: (a) the classifier predicts no fault—A (module not defective) | B (module defective); (b) the classifier predicts a fault—C (module not defective) | D (module defective).
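As an illustration, the Table 2 performance measures can be computed directly from these four counts. The following is a minimal Python sketch; the function name and the example counts are ours, not taken from the paper:

```python
def performance_measures(a, b, c, d):
    """Compute the Table 2 performance measures from the four confusion-matrix counts.

    a: no fault predicted, module not defective
    b: no fault predicted, module defective
    c: fault predicted, module not defective
    d: fault predicted, module defective
    """
    acc = (a + d) / (a + b + c + d)   # accuracy
    pd_ = d / (b + d)                 # probability of detection
    pf = c / (a + c)                  # probability of false alarm
    pre = d / (c + d)                 # precision
    return acc, pd_, pf, pre

# Hypothetical counts, for illustration only
print(performance_measures(900, 20, 30, 50))
```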


4.1 Data Discovery and Visualization This step informs about the data; the attributes are:
loc: line of code count
v(g): cyclomatic complexity
ev(g): essential complexity
iv(g): design complexity
n: total operators + operands
v: volume
l: program length
d: difficulty
i: intelligence
e: effort
t: time estimator
lOCode: line count
lOComment: count of lines of comments
lOBlank: count of blank lines
uniq_Op: unique operators
uniq_Opnd: unique operands
total_Op: total operators
total_Opnd: total operands
branchCount: branch count of the flow graph
defects: {false, true}—whether the module has one or more reported faults.
Figure 2 shows the attributes of the SDP model. The number of columns and rows present in the data is then reported, along with the defect rate (true/false) and the count of each defect value, as in Fig. 3. Histogram: the defect rates (true/false) are displayed in graphical form using a histogram of the above values, so one can easily determine how many defects are present in the given dataset. The graph is drawn between the defect label and its frequency; as can be seen in Fig. 4, the number of false (non-defective) modules is much higher than the number of true (defective) ones. Covariance: "Covariance is a measure of the directional relationship between the returns on two risky assets. A positive covariance means that asset returns move together, while a negative covariance means returns move inversely." Figure 5 shows the covariance matrix of the data being used. The covariance is represented in graphical form in Fig. 6 using a heat map, which distinguishes two things: (a) lighter colors show high covariance; (b) darker colors show low covariance.
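A minimal Python sketch of this exploration step is given below. The file name and the column name "defects" are assumptions, since the paper only states that a McCabe/Halstead defect dataset was downloaded from GitHub:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv("defects.csv")        # hypothetical file name

print(data.shape)                        # number of rows and columns
print(data["defects"].value_counts())    # count of false/true defect labels (Fig. 3)

data["defects"].value_counts().plot(kind="bar")   # defect-rate histogram (Fig. 4)
plt.show()

sns.heatmap(data.cov())                  # covariance matrix as a heat map (Figs. 5 and 6)
plt.show()
```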


Fig. 2 Attributes used in SDP model

Fig. 3 Value of defect rates

4.2 Data Preprocessing Preprocessing is a technique used to convert the input data into a useful format appropriate for the dataset. In this dataset, no null values are present, therefore no cleaning is needed. The output shows that there are no missing values and that all the data is usable.
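A short sketch of this check, under the same hypothetical file name as above:

```python
import pandas as pd

data = pd.read_csv("defects.csv")   # hypothetical file name, as above
print(data.isnull().sum())          # every attribute should report zero missing values
```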


Fig. 4 Histogram

Fig. 5 Covariance matrix


Fig. 6 Graphical representation of covariance matrix

4.3 Feature Extraction Dimension reduction is required to reduce the size of the dataset so that it can be managed properly for processing. Feature extraction provides this facility and reduces the computing resources needed for processing. In this work, the dataset is reduced by keeping only the records that satisfy (data.n < 300) and (data.v < 1000) and (data.d < 50) and (data.e < 500000) and (data.t < 5000), as shown in Fig. 7; a sketch of this filtering step is given below.
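A minimal Python sketch of the filtering, assuming the attribute names from Sect. 4.1 and the hypothetical file name used earlier:

```python
import pandas as pd

data = pd.read_csv("defects.csv")   # hypothetical file name, as above

# Keep only the records whose complexity-related attributes fall below the stated cut-offs
data = data[(data.n < 300) & (data.v < 1000) & (data.d < 50) &
            (data.e < 500000) & (data.t < 5000)]
print(data.shape)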

4.4 Data Normalization (Min–Max Normalization) The normalization technique is used to prepare the data for learning. The aim of this technique is to rescale the values of the numeric columns in the dataset to a common scale without distorting the differences in the ranges of values. Here, the same is done for the volume and bug attributes: both columns are scaled, and the normalized values are represented in Fig. 8.
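A sketch of min–max normalization for these two attributes; the column names "v" and "b" are assumptions based on the attribute list in Sect. 4.1:

```python
import pandas as pd

data = pd.read_csv("defects.csv")   # hypothetical file name, as above

# Min-max normalization of the volume (v) and bug (b) columns to a common 0-1 scale
for col in ["v", "b"]:
    data[col] = (data[col] - data[col].min()) / (data[col].max() - data[col].min())
```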


Fig. 7 Feature extraction by complexity evaluation

4.5 Model Selection (a) The Naive Bayes algorithm uses Bayes' theorem to identify objects. Naive Bayes classifiers assume strong, or naive, independence between the attributes of the data points. Initially, the attribute values related to (iloc) and the complexity evaluation are selected. Finally, the ACC value of the NB model is calculated by K-fold cross-validation, and its value is 0.9816209, as shown in Fig. 9. (b) Linear regression performs the task of predicting a dependent variable (y) based on a given independent variable (x); in this technique, a linear relation is estimated between x (input) and y (output). First, the data is selected; the chosen columns are loc and b, giving 10,885 rows × 2 columns. Second, a covariance matrix is generated, as represented in Fig. 10. Third, the intercept and coefficient of the model are generated (Fig. 11).
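A minimal sketch of the two models with scikit-learn; the library choice, the number of folds (cv=10) and the column names are our assumptions, since the paper does not state them:

```python
import pandas as pd
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

data = pd.read_csv("defects.csv")   # hypothetical file name, as above

# (a) Gaussian Naive Bayes, accuracy estimated by K-fold cross-validation
X = data.drop(columns=["defects"])
y = data["defects"]
nb_acc = cross_val_score(GaussianNB(), X, y, cv=10, scoring="accuracy").mean()
print("NB cross-validated accuracy:", nb_acc)

# (b) Linear regression between loc (input) and b (output), as described above
lr = LinearRegression().fit(data[["loc"]], data["b"])
print("intercept:", lr.intercept_, "coefficient:", lr.coef_)
```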

5 Result The results of linear regression are evaluated using the mean squared error and the root-mean-square error, as shown in Fig. 12. In general, because these values measure the average deviation between predicted and actual values, the model is considered to have better estimation ability as they approach 0. The ACC value of the NB model, calculated by K-fold cross-validation, is 0.9816209, as shown in Fig. 13.
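Continuing the sketch from Sect. 4.5, the two error measures could be obtained as follows (a sketch, not the paper's exact evaluation protocol):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# 'lr' and 'data' are the regression model and DataFrame from the Sect. 4.5 sketch
pred = lr.predict(data[["loc"]])
mse = mean_squared_error(data["b"], pred)
rmse = np.sqrt(mse)
print("MSE:", mse, "RMSE:", rmse)
```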


Fig. 8 Data normalization

From these values, the study concludes that values close to zero indicate that a model has good predictive ability. The linear regression values are much closer to zero than those of NB, so the linear regression model has better estimation ability than Naive Bayes.

6 Conclusion and Future Work The Software Defect Prediction model is used to determine software system faults. Many quantitative approaches have been used by researchers for this research problem but due to the increasing amount of data, researchers need to conduct a variety of experiments for better solutions. Machine learning (ML) technology gives


Fig. 9 Naïve Bayes classifier
Fig. 10 Linear regression classifier

Fig. 11 Result of linear regression classifier


Fig. 12 MSE and RMSE values

Fig. 13 Comparisons of results

a platform to find better solutions. In this paper, two supervised ML classifiers, linear regression (LR) and Naïve Bayes (NB), are discussed and applied to the dataset to evaluate ML capabilities in SDP models. From the experiments, the researchers have found that the ML algorithms provide better solutions with greater accuracy. This study concludes that the linear regression model has better estimation capability than Naive Bayes. In this study, only one dataset is taken to verify the efficiency of ML algorithms on the SDP model; in future work, researchers can take a number of datasets and ML algorithms to further check the accuracy of SDP models for predicting defects.

References 1. Rahman F, Posnett D, Devanbu P (2012) Recalling the imprecision of cross-project defect prediction. In: Proceedings of the ACM SIGSOFT 20th international symposium on the foundations of software engineering, Cary, NC 2. Kalaivani N, Beena R (2018) Overview of software defect prediction using machine learning algorithms. In: Int J Pure Appl Math 118(20):3863–3873 3. Peng H (2015) An empirical study on software defect prediction with a simplified metric set. Inf Softw Technol 170–190


4. Bowes D, Hall T, Petric J (2018) Software defect prediction: do different classifiers find the same defects? Softw Qual J 26:525–552. https://doi.org/10.1007/s11219-016-9353-3(2018) 5. Hosseini S, Turhan B, Mäntylä M (2017) A benchmark study on the effectiveness of searchbased data selection and feature selection for cross project defect prediction. Inf Softw Technol 1–17 6. Song Q, Guo Y, Shepperd M (2018) A comprehensive investigation of the role of imbalanced learning for software defect prediction. IEEE Trans Softw Eng 7. Chen X, Zhang D, Zhao Y, Cui Z, Chao N (2018) Software defect number prediction: unsupervised vs supervised methods. Inf Softw Technol. https://doi.org/10.1016/j.infsof.2018.10. 003(2018) 8. Khanh Dam H, Pham T, Wee Ng S, Tran T, Grundy J, Ghose A, Kim T (2018) A deep tree-based model for software defect prediction. arXiv:1802.00921v1 [cs.SE] 9. Miholca D, Czibula G, Gergely I (2018) A novel approach for software defect prediction through hybridizing gradual relational association rules with artificial neural networks. Inf Sci. https://doi.org/10.1016/j.ins.2018.02.027(2018) 10. Wei H, Changzhen H, Chen S, Yuan X, Zhang Q (2019) Establishing a software defect prediction model via effective dimension reduction. Inf Sci 399–409 11. Shao Y, Liu B, Wang S, Guoqi L (2018) A novel software defect prediction based on atomic class-association rule mining. Exp Syst Appl. https://doi.org/10.1016/j.eswa.2018.07.042 12. Mahmood Z, Bowes D, Hall T, Peter C, Lane R, Petric J (2018) Reproducibility and replicability of software defect prediction Studies. Inf Softw Technol. https://doi.org/10.1016/j.infsof.2018. 02.003 13. Ali Ranaa Z, Mian Awais M, Shamaila S (2015) Improving recall of software defect prediction models using association mining. Knowl Based Syst 14. Ghosh S, Rana A, Kansal V (2018) A nonlinear manifold detection based model for software defect prediction. Procedia Comput Sci 132:581–594 15. Sharma D, Chandra P (2018) Software fault prediction using machine-learning techniques. Smart Comput Inf 541–549 16. Thankachan A, Raimond K (2015) A survey on classification and rule extraction techniques for data mining. IOSR J Comput Eng 8(5):75–78 17. Hammouri A, Hammad M, Alnabhan M, Alsarayrah F (2018) Software bug prediction using machine learning approach. Int J Adv Comput Sci Appl (IJACSA) 9(2)

Energy-Efficient Schemes in Underwater Wireless Sensor Network: A Review Poonam, Vikas Siwach, Harkesh Sehrawat, and Yudhvir Singh

Abstract Recently, underwater wireless sensor networks have gained a lot of research attention from both the academic world and industry in order to explore the immense underwater environment. Designing a network routing protocol is a very challenging task in underwater wireless sensor networks, since a UWSN has peculiar features such as large propagation delay, dynamic topology changes, higher latency than a terrestrial sensor network, low packet delivery ratio, high error rate, low bandwidth and limited energy. In an underwater wireless sensor network, improving energy efficiency is one of the most important challenges because replacing the batteries of sensor nodes is very costly due to the harsh underwater environment. Energy consumption in UWSNs is still an open issue to be investigated, and further research should be conducted to increase the energy efficiency of UWSNs. This paper presents energy-efficient approaches used in three layers of the protocol stack of the underwater wireless sensor network architecture. Keywords Underwater wireless sensor network · Routing protocol · Energy efficiency · Cluster head

1 Introduction An underwater wireless sensor network is a scalable sensor network that depends on restricted sensing and coordinated networking among a large number of inexpensive


and densely deployed sensor nodes which are deployed both at underwater and at the surface and whose important aim is to perform collaborative tasks over the prescribed area. Underwater sensor nodes are capable to transmit the important data with their sensing capabilities within the short distance [1]. The main fundamental issue in wireless sensor networks is efficient energy utilization. Since sensor nodes play the double job as both detection of event and routing of the data to their exact location, a single sensor node exhausting of the battery may be the reason of partitioning of networks in some network topologies or making some area uncovered. In the majority of applications, replacement of batteries may be impracticable due to the large number of nodes and difficulty to get the zone that is to be sensed, so sensor node’s lifetime depends on battery power. Routing protocols are responsible for discovering and maintaining the routes in the sensor network [2]. However, the underwater area is highly challenging and an unpredictable due to numerous reasons. Most of the sensor nodes are mobile in underwater environment due to water current. Electromagnetic waves are not applicable in underwater because they are absorbed in water. Due to dynamic nature of water, the communication techniques that are used in ground sensor network are not applicable in acoustic network [1]. The propagation speed of signal in underwater environment is five times lower than speed in terrestrial network. Bandwidth of underwater network is also very limited because of absorption. Limited bandwidth and high latency will result in high end-to-end delay. Water shows the property of absorbing radio frequency waves. Random access techniques are not able to work efficiently in underwater environment. This will result in attenuation and loss of energy of radio frequency waves in water [3, 4]. In underwater wireless network, the power required for transmission of data is nearly 100 times more than the power required for reception [5]. Reliable transmission of data is very crucial. The design of scalable, robust and energy-efficient routing protocol in acoustic networks is the most important research issue. Power consumption can be controlled in UWSNs in more than one layer. To increase the lifetime of sensor network, energ-efficient protocols and techniques are used.

2 Energy Conservation at Physical Layer Radio waves and optical waves are not feasible underwater due to the harsh environment: radio waves are extremely attenuated in the underwater environment, and optical waves are scattered and absorbed in water. Water absorbs and disperses all electromagnetic waves, so acoustic communication technology may be considered the single reasonable technique that works effectively in the underwater environment. Hence, for military and civilian applications, acoustic waves are used since they provide robust underwater communications. Acoustic waves are omnidirectional in nature with low signal attenuation. It has also been revealed that the absorption of electromagnetic waves in the underwater environment is approximately 45√f (dB/km), where


‘f ’ denotes frequency in hertz and absorption of an acoustic signal is lesser by three orders of magnitude than other waves [6–8]. The vital issues handled by the physical layer in underwater wireless network protocol stack architecture are, i.e., interfacing to transmission medium, modulation, and equalization filtration and collision detection. Despite the entire features, underwater acoustic channel presents a lot of new communication challenges. The challenges faced by acoustic channel are path loss, high bit of error rate, limited bandwidth and high propagation delay [3]. Path losses depend on both the transmission distance and signal frequency. Limited bandwidth may result into low data rates, which further depend on both communication range and signal frequency. In large networks that work over hundreds of kilometer, data transmission rate cannot exceed more than a few kilohertz. On another side, in a shortrange network that works over tens of meters, bandwidth of about hundred kilohertz is used in communication. Acoustic wave’s communications are categorized in various classes as far as range and transmission capacity, but this cannot exceed more than 40 kilobits per second at a range of one kilometer [9]. Energy at physical layer can be saved using energy-efficient modem technology. Modem is responsible for transmission and reception of data. At the present time, commercially available modems which can work in underwater environment can transmit up to 30 kbps over the distances ranging from hundred meters to few kilometers [10]. The available commercially acoustic modems offer data rates that range from 100 bits per second to 40 kilobits per second and operate up to a few kilometers range and an operating intensity of thousands of meters. During designing of acoustic modem for underwater wireless network, the designer should promote for inexpensive and low energy utilization at each level from analog electronics to signal processing techniques. Designer of these modems should be careful to incorporate the selection of digital signal processing format, the choice for transducer and corresponding analog electronics, the interfaces choices to sensors and decision of hardware platform for execution [11]. Freitag et al. [12] have built up a compressed and low battery power acoustic modem called micro-modem. Micro-modem uses digital signal processing technique that consumes about 180 mW. The communication power is firm and integrated with hardware for every application. This micro-modem has two modes: (i) low rate, low power and unsynchronized and (ii) high rate, high power and synchronized mode. This modem additionally consolidates fundamental sound navigation functionality. Wills et al. [13] introduced low-cost low-power acoustic modem that works in impenetrable underwater environment where communication range is small. Most important energy saving transformation is to use dedicated, low power, all analog wake-up tones receiver to activate the costly data receiver, and when system is in inactive state, then most of the components can be turned off while leaving the wake-up receiver on.


2.1 MAC Layer The fundamental objective of MAC layer protocols is to avoid collisions. Other objectives of MAC layer are to maintain energy efficiency, scalability and latency. Function of this layer is frequency selection, modulation and data encryption [14–18]. MAC layer manages and controls various communication channels; these channels are then shared by various sensor nodes to avoid collision between them and for maintenance of reliability of transmission conditions. Different MAC solutions have been proposed by various authors that consider energy-efficiency factor into account. Deterministic MAC techniques, namely time division multiple access (TDMA), frequency division multiple access (FDMA) and code division multiple access (CDMA), are not functional in the underwater environments caused by many problems like narrow channel bandwidth, vulnerability of fading and multipath, handling of long propagation delays, optimizing energy consumption, trouble of power control at each node and scalability with number of nodes. To save energy, some solutions have been provided by various researchers in form of protocols which are described as [15]: 1. Contention-free protocols TDMA-based protocols are time-scheduled systems, where communication happens in fixed time schedules. TDMA-based protocols have a built-in duty cycle that helps in collision avoidance and reducing idle listening time. The disadvantage of these protocols is that there is requirement of coordination and synchronization of TDMA slot allocation which results in more energy consumption. 2. Contention-based protocols In contention-based methods, the bandwidth or channels are not allocated before hand to any user. If several users transmit their data at same time, then problem of collision can take place which will influence reliability of transmission. As the number of users increases, the collision also increases which has effects on performance and energy consumption. 3. Hybrid protocols The proposal of hybrid methods is to consolidate the best properties of contentionfree and contention-based protocols. So the shortcomings of both these methods will be removed in this scheme. But this protocol has disadvantage that it will increase software complexity. Despite the fact that these protocols help in improving functionalities, they likewise increase the requirements for compelled resources which may decrease their applications. 2.1.1

Other Techniques and Protocols at the MAC Layer

Protocol: T-Lohi protocol that works at MAC layer helps in conservation of energy. Various authors proposed three versions of T-Lohi protocol. Simulation result shows


that Synchronized T-Lohi (ST Lohi) is the main energy-effective protocol contained by 3% of optimal energy. Whereas throughput of Aggressive Unsynchronized TLohi (AUT-Lohi) is highest out of three protocols, it utilizes about 50% channel bandwidth. Robustness of packet delivery of Conservative Unsynchronized T-Lohi (CUT-Lohi) is best since there is no packet loss in this protocol. All three versions of T-Lohi protocol show efficient channel use, stable throughput and excellent energy conservation [16, 17]. Additionally, MAC protocols need two requirements such as synchronization or localization, so usually they do not consider energy consumption factor in these two tasks. MDS-MAC protocol [14] is proposed recently that combines synchronization and localization factors into its provision and also consider extra energy consumption factor. Recently proposed protocol combines these factors into their specifications for the complete estimation of energy consumption techniques [15]: • IEEE 802.11: This technique uses varying modulation schemes like DBPSK, DQPSK and CCK to maximize the data rate. But this technique has some disadvantages like large overhead in control and data packet and more consumption of energy. • IEEE 802.15.1: It is used for data communication, but this technique is not suitable in underwater environment. Also, this technique has disadvantage of more energy consumption, and there is single point of failure of master node. • IEEE 802.15.4: This technique is related to Zig Bee technology. It functions at low data rate, and hardware is inexpensive. But, this is also having disadvantage of more energy consumption. • Wireless HART: This techniques works at all layers of protocol stack of UWSN. This is having the advantage that it can be applied in harsh environment like underwater, but single point failure of manager node is there in it, and more energy consumption is there this technique. • ISA 100: It works same as wireless HART technique, but in this technique, two manager sensor nodes are required to control the network that increases the system complexity, and also, more energy consumption is there in this technique. 2.1.2

Research Issues in MAC and Physical Layer

In spite of the fact that there are different MAC layer protocols introduced for underwater wireless sensor network, but there is not a single protocol that is trusted as a standard protocol. The reason for this is that MAC protocols are application dependent. Most of the proposed protocols only work in networks with predefined properties, which will result in small application area. Above all, mobility and topology factors still require more thorough research. Due to lack of standardization of physical sensor hardware at physical and MAC layer, energy conservative protocols are developed at upper layers of protocol stack architecture [19]. In 1980s, various underwater modems are proposed by various researchers which are based on frequency-shift keying (FSK). FSK scheme is relied on non- coherent modulation technique. FSK technique does not take into account phase-tracking


factor, which is a challenging task in underwater environments. Also, the FSK modulation scheme is unsuitable for high-data-rate multiuser networks. To avoid the problems of non-coherent modulation, coherent modulation techniques, namely phase-shift keying (PSK) and quadrature amplitude modulation (QAM), have been proposed, which help in increasing the throughput of the system [20]. However, these modulation schemes also have a few disadvantages:

• Time variability of underwater channels
• Non-fixed delay
• Doppler spread
• Path loss
• Inter-symbol interference

Differential Phase Shift Key (DPSK) works as an intermediary solution in the middle of non-coherent and coherent system in the terms of bandwidth utilization. Although this approach significantly reduces carrier phase-tracking requirements, disadvantage is that error rate is large in this than PSK at an equal data rate [20]. Energy is more consumed using modem so research is going on to develop the modem that consumes low battery power. Some researchers have proposed compact and a low battery power acoustic modems called as micro-modem. The micro-modem uses Texas Instruments fixed-point DSP scheme, which consumes nearly 180 mW in active state. Some researchers proposed a new low-cost and low power acoustic modem which will target to support short-range acoustic communication in fully densed underwater sensor network. So, conclusion is that at physical layer energy is saved using micro-modem [12].

3 Energy Approaches at Network Layer Routing is a basic task of any network, and routing protocols are used for discovering and maintaining a suitable route for data transmission from source node to sink node. To find out an optimal routing protocol in the network layer in UWSNs is still new and needs to be addressed properly by researchers. Underwater wireless sensor node endures from restricted battery power. Researchers are still trying to find out solution of the issue by using various methods such as energy harvesting wireless sensor devices that convert electrical, mechanical and acoustic energy into that energy which will give power to sensors nodes. However, this enhancement of energy harvesting schemes is not enough and still in the initial phase which incurs various problems in the implementation [21]. In order to extend the battery power of underwater wireless sensor nodes, energy consumption during transmission mode should be decreased. Various researchers are trying to find out an optimal energy-efficient routing protocol in underwater wireless sensor networks [2].


3.1 Routing Challenges and Design Issues 1. Energy resources: The most important objective of the routing protocols is to deliver data from source node to sink node in a well-organized manner. Energy conservation is the main concern for development of routing protocols. Data must be delivered to the sink node in an efficient manner without compromising the accuracy of delivered data [22, 23]. 2. Failure of node: The low-cost equipments of sensor nodes sometimes result in unpredicted failures. Furthermore, the wireless sensor channel results in data losses during transmission. So routing protocols be supposed to offer robustness to failures of node and prevent single point of failure problem. Due to harsh conditions of underwater environment, the routing protocols should provide efficient transmission of data between sensor nodes placed on the bottom and sink node with regular channel errors and failure of nodes [22–24]. 3. Bandwidth: for underwater wireless sensor networks bandwidth capacity is low because there are high error rates in routing protocol [22, 25]. 4. Node mobility: Node mobility is also another issue for underwater wireless sensor networks because sensor nodes are not fixed at base of the sea. This situation consequences in dynamic topology [22, 23]. 5. Data heterogeneity: Data will be generated from different sensor nodes at different rates. This will create a heterogeneous environment that results in more complexity in routing the data to the nodes at sink node [23]. 6. Node density: The node that will monitor physical phenomena requires high density of sensor nodes to be deployed. Most of the sensor nodes prevent largescale information of networks topology from being acquired at each node. So, the distributed protocols that work with finite information of topology are developed to furnish scalability [22, 26]. 7. Addressing: Unique address cannot be provided to each node in sensor network. So ad hoc routing protocol cannot be used in underwater wireless sensor network because this routing protocol requires unique address for every node in the sensor network. Address-based routing protocols are not feasible in underwater while local addressing scheme is applied in UWSN. So, the address mechanism which does not requires unique address ID is applied in UWSN [23, 26].

3.2 Routing Protocols Objectives 1. Data assurance: The assurance of data delivery to the sink node is requisite for all routing protocols. It means routing protocols must have prior knowledge of route between the communication sensor nodes. Due to this property, performance of message delivery ratio is increased [27]. 2. Quality of service: Some applications require that the critical data must be delivered to base station with more reliability within a particular time so that


instant counteractive and defensive action can be taken. So the underwater wireless sensor network requires quality of service-based protocol for delivery of important data with high reliability and low latency [27, 28]. 3. Mitigate packet loss: In several underwater sensor network applications, the detection accurateness of the event depends on data delivery ratio. Data delivery ratio is the ratio of amount of packets effectively received by the sink node to the amount of packets generated by source nodes. When data delivery ratio decreases below a specific level, it affects network responsiveness because event detection accuracy becomes low at sink node. So, data loss should be avoided at each hop. Data loss occurs due to collision between the transmitting nodes, dead sensor nodes, overflow at data at relay node, interference of human being, environmental disturbance. So, routing protocols must have the property to mitigate packet loss [28]. 4. Load balancing and Network lifetime: Routing protocol should have the property to balance the energy consumption at each sensor node. This objective is required for the sensor networks where the application must run for long time. So, energy should be maintained at equal level at each node, and in this way, network lifetime can be increased [28].

3.3 Energy-Efficient Protocols Routing protocols specify the communication between sensor network nodes. Each router should have knowledge about the directly attached network. The information about route is shared among immediate neighbors initially and later throughout the sensor network. In this way, routing protocol gets to know about network topology. Main intention of energy-efficient routing protocol is to save energy at each node and to increase the lifetime of network [29, 30]. Energy-efficient routing protocols can be classified into three types: • Data-centric routing protocol • Location-based routing protocol • Hierarchy-based routing protocol In this paper, only hierarchical-based routing protocols are described.

3.3.1 Energy-Efficient Hierarchical Routing Protocol

A. Low-Energy Adaptive Clustering Hierarchy (LEACH) This protocol was introduced by Heinzelman. LEACH is an adaptive, self-organizing clustering protocol and a very energy-efficient hierarchical routing protocol, which uses a clustering technique. In this routing protocol, the sensor network is divided into clusters of nodes and each cluster has its cluster head;


multiple cluster heads form a high-level network of sensor nodes. In this protocol, the cluster head is selected randomly, and the role of cluster head is rotated among the other nodes to balance the energy load across the sensor network. Energy consumption is reduced by turning off non-cluster-head nodes when there is no communication between them [31–33]. Working of the LEACH protocol: This protocol assumes that all sensor nodes are homogeneous and energy constrained, and that the base station is fixed at a long distance from all sensor nodes. In every rotation, different cluster heads are chosen to save energy [34]. The protocol works in two phases: a setup phase and a steady phase. In the setup phase the clusters are formed, and data is transferred in the steady phase. At the beginning, every node chooses a number between 0 and 1 and, based on the chosen number, a threshold value is calculated; a node whose chosen number is less than the threshold value is selected as a cluster head. The threshold value is calculated using the following formula:

T(n) = Pt / (1 − Pt · (r mod (1/Pt))),  if n ∈ G
T(n) = 0,  otherwise

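A minimal Python sketch of this threshold computation (the function and parameter names are ours; the meanings of the symbols are defined in the text that follows):

```python
def leach_threshold(p, r, in_g):
    """LEACH cluster-head election threshold T(n).

    p    : desired fraction of cluster heads (Pt)
    r    : current round number
    in_g : True if the node has not been a cluster head in the last 1/p rounds (n in G)
    """
    if not in_g:
        return 0.0
    return p / (1 - p * (r % (1 / p)))

# A node draws a random number in [0, 1] and becomes cluster head if it is below T(n);
# e.g. with 5% desired cluster heads in round 3:
print(leach_threshold(0.05, 3, True))
```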

where T (n) = computed threshold value. Pt = preferred percentage value of cluster heads. R = round number. G = number of sensor nodes that have not been chosen as cluster head in previous 1/Pt rounds. The node that is selected as cluster head will broadcast the message to all neighbor’s nodes. The neighbor that receives that message will send that message to its neighbor’s nodes and send a ‘JOIN REQ’ message to cluster head. After receiving JOIN REQ message, cluster head node includes that node into its own member node table. In steady phase, data is transferred within the cluster network using TDMA schedule and this schedule is broadcasted to its member sensor nodes. Each node sends its data to cluster head node, and then, cluster head node sends that data to sink node [35]. Selection of cluster head by this method will reduce unnecessary utilization of energy, and in this way, network lifetime is enhanced. Fusion of data takes place at cluster head node which will helps in reducing traffic in network [21, 36, 37]. Advantage of LEACH [21]: 1. LEACH is a totally distributed technique. 2. There is no requirement of global information of network in this protocol. 3. It is an energy-efficient and simple routing protocol of network layer.


4. There is random rotation of cluster head (CH), so there is chance for each node to be selected as cluster head in a round. 5. Time division multiple access modulation technique is used in this protocol, so every node of sensor network can take part in rounds simultaneously. 6. Each sensor node in network has communication only with related cluster head. So, coordination and control is maintained in network for cluster setup and operation. 7. Data redundancy is minimized since data is aggregated only by the cluster head of all nodes of whole network. Drawbacks of LEACH Protocol: 1. LEACH assumes that all nodes have same initial energy, which is not possible in real-time situations. 2. This protocol assumes that all nodes are static, but most of the nodes are mobile in underwater network. 3. This protocol does not take into account of existence of multiple base stations. This will work better when there is single base station. 4. Extra overhead is there due to dynamic clustering. 5. Transmission of data to base station is done by cluster head, so failure of cluster head node will result in lack of robustness. 6. More consumption of energy is there in data transmission phase since single-hop routing method is used in this protocol. 7. Cluster head is selected randomly, and this selection does not take into account energy factor which will results in failure of that particular node. B. LEACH Centralized Routing Protocol LEACH Centralized (LEACH-C) is enhanced version of LEACH protocol. In this protocol, base station will take the decision about cluster configuration. Each sensor node in network is able to calculate its energy level. Each sensor node sends information of its energy level to the base station, and along with this information, location of every node is also sent to base station. Base station will select the cluster head based on energy level and distance between that node and base station node. Initially, average energy of all nodes is calculated by the base station node and then based on that cluster head is selected. The node having high energy level than average energy is selected as cluster head for the current round. This enhanced version of LEACH protocol also works in two phases [29, 38, 39]. LEACH-MF Distance between cluster head and sink node will increase energy consumption, so network lifetime is reduced. This drawback is removed in LEACH-MF routing protocol. In this protocol, multi-layer clustering technique is used. In multi-layer clustering, super cluster heads are formed. Creation of super cluster head is done by all cluster head. Super cluster heads then send data to sink node which will improve lifetime of sensor network [40].


C. Modified LEACH In this protocol, new cluster heads are chosen in every round so the unnecessary overhead will be removed. Cluster head is selected on the basis of threshold value. A node which is cluster head in first level will be elected as cluster head in next round if its residual energy is more than threshold value. A proficient cluster head substitution algorithm is required to decrease the unnecessary utilization of inadequate energy resources. Modified LEACH uses this technique along with a dualtransmission energy-level mechanism. This dual-transmission mechanism allows farthest and nearest nodes from base station to transmit the data with different energy levels; this will maintain stability in energy level in sensor network [41]. D. Multi-hop LEACH Protocol In this protocol, quick energy drainage problem is solved by cluster head since single hop communication method is used in large sensor area network. There is intra-cluster communication between cluster head and all nodes related to that cluster head. Data by all nodes is transmitted to head by using single-hop distance method, and then aggregation of data takes place at head node, and after that data is sent directly to base station node. Sometimes, data from a cluster head is sent to intermediate cluster head and then to base station node. This protocol works better when there is large distance between cluster head and base station, and cluster head also has communication with intermediate cluster head for transmission of data to base station [42]. E. PEGASIS Power-Efficient Gathering in Sensor Information Systems (PEGASIS) protocol is also energy-efficient which uses some features of LEACH protocol because it also uses cluster head. In this protocol, nodes are arranged in a structure chain to avoid communication consumption of frequent selection of cluster head. Each node has information about other nodes in network and uses greedy algorithm to locate the adjacent neighbor node. The node nearest to neighbor node is selected as cluster head in next turn. Data is transmitted in chain from node to node, integrated and in last transmitted to sink node. Communication cost is avoided in this protocol: By reducing the data used in transmission and communication, using data aggregation method, nodes using low power mode, communication is possible only between nearest neighbor nodes. But disadvantage of this protocol is that if cluster head will fail then whole routing network will be destroyed. Chaining of nodes will result in complex structure of network. So, this protocol is not used in real-time applications [43–45]. F. TEEN Threshold-Sensitive Energy-Efficient Sensor Network (TEEN) protocol uses clustering technique similar to LEACH protocol; difference is that this protocol is used only in reactive sensor network. TEEN protocol uses two threshold values: hard and soft thresholds. Filtering process is used in this protocol that will help in reducing data transmission. After selecting cluster head, two threshold values are broadcasted


in network. Hard threshold value specifies the minimum value of data transmission, whereas soft threshold indicates changing rate of detected data. When data that is to be transmitted will exceed the hard threshold value, then this data is sent to cluster head and that data value will be taken as new threshold value, and monitor value is saved as sensed value. When the monitored data value is more than hard threshold value and difference of this value with sensed value is also more than soft threshold value, then in this condition, nodes can transmit data to sink node and new data value is taken as new hard threshold value [46]. Advantage of this protocol is that quantity of data transfer is reduced by setting the value of hard and soft threshold. But, this protocol is not suitable for periodic reporting of data [24, 43, 47]. G. APTEEN Adaptive Threshold-Sensitive Energy-Efficient Sensor Network (APTEEN) protocol is advanced version of the TEEN protocol in which parameters are adjusted by cluster head according to the needs of users. Data transmission is successful without any delay, and less energy is consumed by each node. Disadvantage of this protocol is that it does not support QoS and no security of data is there during transmission [48]. H. Two-Tier Data Dissenination (TTDD) This protocol is used where sink nodes are multiple and sink is movable in sensor network. In this, multiple nodes sense events, and out of these nodes, one node is elected as source node. Grid network is constructed in which initially source node finds out location of nearby cross-point. Greedy algorithm is used to find out the new cross-point of grid network. Flooding query method is used to request adjacent crossnode, and then this request is transmitted in all cross-point and in last to source node, and after that data will be transmitted to sink node. Sink node then waits for entire data and uses agent mechanism for reliability of data which is transmitted. Advantage of this protocol is that due to use of single path, network lifetime will increase. But computation and maintenance of grid network is very complex [32, 49, 50].

4 Conclusion Energy-efficient communication in underwater sensor networks is a new topic for research, with a lot of potential for designing and developing novel energy-efficient routing protocols, since only a limited number of studies have been carried out to date. The purpose of this paper is to study various approaches and routing protocols used in three layers of the underwater wireless sensor network stack architecture. Due to the various restrictions underwater, the design of energy-efficient techniques is a very difficult task. Energy is the most concerning issue in a sensor network, so the network must maintain a longer lifetime to accomplish its objectives. This paper explains the hardware technology at the physical layer and various protocols at the MAC and network layers; in particular, hierarchical routing protocols are explained, which are very energy efficient.


Routing protocols have various challenges, which are also explained in this paper. The routing protocols in all networks share the common objective of optimal energy efficiency. Most routing protocols have the drawback that they do not support QoS. Various descendants of LEACH are compared with the basic LEACH protocol, and the advantages and disadvantages of the hierarchical routing protocols LEACH, TEEN, TTDD and PEGASIS are also discussed.

Fig. 1 Setup and steady-state phase of LEACH protocol

Table 1 Comparison of LEACH descendants from basic LEACH

LEACH descendant | Difference
LEACH-C | In the LEACH protocol, all nodes participate in making clusters, but in LEACH-C only the base station takes responsibility for making clusters, on the basis of the residual energy and location information of all nodes
LEACH-MF | Multi-layer clustering is used, which reduces the distance between cluster head and sink node, whereas in basic LEACH the distance between the sink node and the cluster node is larger and consumes more energy
M-LEACH | Selection of the cluster head depends on a threshold value; a dual-transmission energy mechanism is used in the modified LEACH protocol. It is best suited for highly mobile environments
Multi-hop LEACH | Multi-hop inter-cluster communication is used, which helps in saving energy


Table 2 Comparison of routing protocols

 | LEACH | PEGASIS | TEEN | APTEEN | TTDD
Classification | Hierarchical | Hierarchical | Hierarchical | Hierarchical | Hierarchical
Type | Proactive | | | |
Lifetime of network | Good | Very good | Very good | Better | Good
Data based | | | | |
Based on location | | | | |
Support for QoS | | | | |

References 1. Akyildiz IF, Pompili D, Melodia T (2005) Underwater acoustic sensor networks: research challenges. Ad Hoc Netw 3(3):257–279 2. Ismail N, Mohamad MM (2018) Review on energy efficient opportunistic routing protocol for underwater wireless sensor networks. KSII Trans Internet Inf Syst 12(7) 3. Khan A, Ali I, Ghani A, Khan N, Alsaqer M, Rahman A, Mahmood H (2018) Routing protocols for underwater wireless sensor networks: taxonomy, research challenges, routing strategies and future directions. Sensors 18(5):1619 4. Nguyen TT, Nguyen SM (2014) Power-aware routing for underwater wireless sensor network 5. Anguita D, Brizzolara D, Parodi G (2009) Building an underwater wireless sensor network based on optical: Communication: Research challenges and current results. In: 2009 third international conference on sensor technologies and applications. IEEE, pp 476–479, June 2009 6. Giles JW, Bankman IN (2005) Underwater optical communications systems. Part 2: basic design considerations. In: MILCOM 2005–2005 IEEE military communications conference. IEEE, pp 1700–1705, Oct 2005 7. Lanbo L, Shengli Z, Jun-Hong C (2008) Prospects and problems of wireless communication for underwater sensor networks. Wireless Commun Mobile Comput 8(8):977–994 8. Partan J, Kurose J, Levine BN (2007) A survey of practical issues in underwater networks. ACM SIGMOBILE Mobile Comput Commun Rev 11(4):23–33 9. Ayaz M, Baig I, Abdullah A, Faye I (2011) A survey on routing techniques in underwater wireless sensor networks. J Netw Comput Appl 34(6):1908–1927 10. Heidemann J, Li Y, Syed A, Wills J, Ye W (2005) Underwater sensor networking: research challenges and potential applications. In: Proceedings of the technical report ISI-TR-2005–603. USC/Information Sciences Institute 11. Ovaliadis K, Savage N, Kanakaris V (2010) Energy efficiency in underwater sensor networks: a research review. J Eng Sci Technol Rev (JESTR) 3(1):151–156 12. Freitag L, Grund M, Singh S, Partan J, Koski P, Ball K (2005) The WHOI micro-modem: an acoustic communications and navigation system for multiple platforms. In: Proceedings of OCEANS 2005 MTS/IEEE. IEEE, pp 1086–1092, Sept 2005 13. Wills J, Ye W, Heidemann J (2006) Low-power acoustic modem for dense underwater sensor networks. In: Proceedings of the 1st ACM international workshop on underwater networks. ACM, pp 79–85, Sept 2006 14. Climent S, Sanchez A, Capella J, Meratnia N, Serrano J (2014) Underwater acoustic wireless sensor networks: advances and future trends in physical, MAC and routing layers. Sensors 14(1):795–833 15. Koskela P (2018) Energy-efficient solutions for wireless sensor networks 16. Syed AA, Ye W, Heidemann J (2008) Comparison and evaluation of the T-Lohi MAC for underwater acoustic sensor networks. IEEE J Sel Areas Commun 26(9):1731–1743


17. Syed AA, Ye W, Heidemann J (2008) T-Lohi: a new class of MAC protocols for underwater acoustic sensor networks. In: IEEE INFOCOM 2008—the 27th conference on computer communications. IEEE, pp 231–235, Apr 2008 18. Yunus F, Ariffin SH, Zahedi Y (2010) A survey of existing medium access control (MAC) for underwater wireless sensor network (UWSN). In: 2010 Fourth Asia international conference on mathematical/analytical modelling and computer simulation. IEEE, pp 544–549, May 2010 19. Heidemann J, Ye W, Wills J, Syed A, Li Y (2006) Research challenges and applications for underwater sensor networking. In: IEEE wireless communications and networking conference. WCNC 2006, vol 1, pp 228–235. IEEE, Apr 2006 20. Shahapur SS, Khanai R (2015) Underwater sensor network at physical, data link and network layer-a survey. In: 2015 International conference on communications and signal processing (ICCSP). IEEE, pp 1449–1453 21. Xu D, Gao J (2011) Comparison study to hierarchical routing protocols in wireless sensor networks. Procedia Environ Sci 10:595–600 22. Sharma A, Gaffar AH (2012) A survey on routing protocols for underwater sensor networks. Int J Comput Sci Commun Netw 2(1):74–82 23. Prathap U, Shenoy PD, Venugopal KR, Patnaik LM (2012) Wireless sensor networks applications and routing protocols: survey and research challenges. In: 2012 International symposium on cloud and services computing. IEEE, pp 49–56, Dec 2012 24. Cui JH, Kong J, Gerla M, Zhou S (2006) The challenges of building scalable mobile underwater wireless sensor networks for aquatic applications. IEEE Netw 20(3):12 25. Ayaz M, Abdullah A (2009) Underwater wireless sensor networks: routing issues and future challenges. In: Proceedings of the 7th international conference on advances in mobile computing and multimedia. ACM, pp. 370–375, Dec 2009 26. Domingo MC, Prior R (2008) Energy analysis of routing protocols for underwater wireless sensor networks. Comput Commun 31(6):1227–1238 27. Pompili D, Akyildiz IF (2009) Overview of networking protocols for underwater wireless communications. IEEE Commun Mag 47(1):97–102 28. Han G, Jiang J, Bao N, Wan L, Guizani M (2015) Routing protocols for underwater wireless sensor networks. IEEE Commun Mag 53(11):72–78 29. Zayed H, Taha M, Allam AH (2018) Energy efficient routing in wireless sensor networks: a survey 30. Nam H, An S (2008) Energy-efficient routing protocol in underwater acoustic sensor networks. In: 2008 IEEE/IFIP international conference on embedded and ubiquitous computing, vol 2. IEEE, pp 663–669, Dec 2008 31. Bhavana V, Rathi J, Reddy KR, Madhavi K (2018) Energy efficiency routing protocols in wireless sensor networks—a comparative study. Int J Pure Appl Math 118(9):585–591 32. Azam MH, Abdullah-Al-Nahid M, Abdul Alim M, Amin Z (2010) A survey and comparison of various routing protocols of wireless sensor network (WSN) and a proposed new TTDD protocol based on LEACH. Int J Comput Netw Secur 8:80–83 33. Ahmed M, Salleh M, Channa MI, Rohani MF (2017) Energy efficient routing protocols for UWSN: a review. Telkomnika 15(1):212 34. Nayyar A, Puri V, Le DN (2019) Comprehensive analysis of routing protocols surrounding underwater sensor networks (UWSNs). In: Data management, analytics and innovation. Springer, Singapore, pp 435–450 35. Sharma S, Kumar M (2015) Leach protocol: a survey. Int J Comput Sci Commun Netw 5(4):228–232 36. Gill RK, Chawla P, Sachdeva M (2014) Study of LEACH routing protocol for Wireless Sensor Networks. 
In: International conference on communication, computing & systems (ICCCS2014), Oct 2014 37. Domingo MC (2011) A distributed energy-aware routing protocol for underwater wireless sensor networks. Wireless Pers Commun 57(4):607–627 38. Afsar MM, Tayarani NMH (2014) Clustering in sensor networks: a literature survey. J Netw Comput Appl 46:198–226


39. Singh SK, Kumar P, Singh JP (2017) A survey on successors of LEACH protocol. IEEE Access 5:4298–4328 40. Yan JF, Liu YL (2011) Improved LEACH routing protocol for large scale wireless sensor networks routing. In: 2011 International conference on electronics, communications and control (ICECC), pp 3754–3757. IEEE, Sept 2011 41. Mahmood D, Javaid N, Mahmood S, Qureshi S, Memon AM, Zaman T (2013) MODLEACH: a variant of LEACH for WSNs. In: 2013 Eighth international conference on broadband and wireless computing, communication and applications. IEEE, pp. 158–163, Oct 2013 42. Maurya P, Kaur A (2016) A survey on descendants of leach protocol. Int J Inf Eng Electron Bus 8(2):46 43. Khalid M, Ullah Z, Ahmad N, Arshad M, Jan B, Cao Y, Adnan A (2017) A survey of routing issues and associated protocols in underwater wireless sensor networks. J Sens 2017 44. Lindsey S, Raghavendra CS (2002) PEGASIS: power-efficient gathering in sensor information systems. In: Proceedings, IEEE aerospace conference, vol 3, pp 3–3. IEEE, Mar 2002 45. Zhang Y, Sun H, Yu J (2015) Clustered routing protocol based on improved K-means algorithm for underwater wireless sensor networks. In: 2015 IEEE International conference on cyber technology in automation, control, and intelligent systems (CYBER). IEEE, pp 1304–1309, June 2015 46. Ahmed S, Khan IU, Rasheed MB, Ilahi M, Khan RD, Bouk SH, Javaid N (2013) Comparative analysis of routing protocols for under water wireless sensor networks. arXiv preprint arXiv: 1306.1148 47. Manjeshwar A, Agrawal DP (2001) TEEN: A routing protocol for enhanced efficiency in wireless sensor networks. In: ipdps, vol 1, p 189, Apr 2001 48. Zorzi M, Casari P, Baldo N, Harris AF (2008) Energy-efficient routing schemes for underwater acoustic networks. IEEE J Sel Areas Commun 26(9):1754–1766 49. Sharma T, Singh H, Sharma A (2015) A comparative review on routing protocols in wireless sensor networks. Int J Comput Appl 123(14) 50. Ye F, Luo H, Cheng J, Lu S, Zhang L (2002) A two-tier data dissemination model for largescale wireless sensor networks. In: Proceedings of the 8th annual international conference on Mobile computing and networking. ACM, Sept 2002 51. Sahana S, Singh K, Das S, Kumar R (2016) Energy efficient shortest path routing protocol in underwater sensor networks. In: 2016 International conference on computing, communication and automation (ICCCA). IEEE, pp 546–550, Apr 2016

Information Hiding Techniques for Cryptography and Steganography Bhawna, Sanjay Kumar, and Vijendra Singh

Abstract A cyber-attack is a category of attack that uses malicious code to intentionally alter computer data and exploit computer systems and networks, which might result in disruptive ramifications and even compromise the data. Cyber-attacks can lead to cybercrimes such as information and identity theft. In the computing context, the term security means cyber security. The main aim of cyber security is to protect system information, communication channels and transmitted information from unauthorized users. With the increasing volume and sophistication of cyber-attacks, users' personal information must be protected. This paper presents a review of various data hiding techniques that are used to provide security to information that is shared over an unsecure channel. Keywords Cyber security · Cryptography · Steganography · Visual cryptography

Bhawna (B) · S. Kumar Department of CSE, SRM University, Delhi-NCR, Sonepat, Haryana, India e-mail: [email protected] S. Kumar e-mail: [email protected] V. Singh School of Computer Science, University of Petroleum and Energy Studies, Dehradun, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_39

1 Introduction

A cyber-attack is a kind of attack on a computer system or Web site [1], initiated from a computer, in order to compromise the confidentiality, availability or integrity of a computer or of the information stored in it. Cyber security requires coordinated efforts throughout an information system and is also termed information technology security. Governments, financial institutions, the military, corporations, hospitals and other business organizations collect, process and pool a large amount of important information on computers and communicate that information to other computers through networks [2]. So, we need some method to secure confidential information from any kind of attack. Cyber-attacks can be categorized as unauthorized access to a computer or to the data stored in it, installation of malicious code on a computer, retrieval of data in an unauthorized way, denial of service attacks or unwanted disruption. The methods used for investigating and responding to a cyber-attack depend upon the nature of the attack itself. In order to prevent important data from being compromised, we therefore need methods that protect data from various attacks.

Data security aims at protecting data from any kind of accidental or intentional destruction or modification. A number of techniques and technologies can be applied to limit the access of unauthorized users. The main objective of data security is to preserve any kind of data that is being stored, created, received or transmitted by an organization. Shielding data from threats is more important today than it has ever been, and most of the threats come from external sources. No matter what technology or technique is being used, the main motive is to protect the sensitive data. Data encryption, data masking, data erasure and data resilience are ways to provide security to data. Data encryption applies a code to every piece of data, and access is granted to the encrypted data only with the authorized key. Data masking masks the specific area of the data that needs to be protected from malicious users. Data erasure simply erases data that has not been used or active for a particular duration. In the data resilience method, backup copies of data are created so that the organization can recover the data if it has been accidentally corrupted or deleted.

The remainder of the paper is organized as follows: Sect. 2 explains different types of data hiding techniques (cryptography, steganography). In Sect. 3, previous work done in the fields of cryptography, steganography and visual cryptography is explained. In Sect. 4, the conclusion and future aspects are described.

2 Types of Data Hiding Techniques

When information is transferred through an unsecured channel such as the Internet, security needs to be provided to address the challenges and issues that arise during the transfer of information, such as unauthorized access, loss of information and leakage of the user's personal information. Data hiding techniques provide reliability, integrity, confidentiality and authenticity to the user. Data hiding techniques have been classified into two types: cryptography and steganography. Both techniques are used to protect sensitive information from being compromised, but they work in different manners and have their own pros and cons.


2.1 Cryptography

The importance of data security has increased because of the use of an unreliable network, i.e., the Internet, for sharing important information. Cryptography is a technique used for providing security and can be applied in various applications. Over the years, a number of cryptographic algorithms have been developed [3]. The technique is composed of two processes, encryption and decryption. All cryptographic algorithms provide security; they are implemented using a key, and the data can be encrypted and decrypted only with the help of the relevant key.

2.1.1 Types of Cryptography

The algorithms used in cryptography are divided into two categories: stream ciphers and block ciphers. Stream ciphers encrypt only one bit of plain text at a time, whereas block ciphers encrypt a number of bits of plain text together as a single unit (generally 64 bits). Cryptographic algorithms are further categorized as (i) symmetric key (or private key) cryptography and (ii) asymmetric key (or public key) cryptography [4]. Symmetric key cryptographic algorithms are faster than asymmetric key algorithms when executed on a computer. Among the various symmetric algorithms, the data encryption standard (DES) is one of the most prominent methods; later, the advanced encryption standard (AES), a more powerful successor of DES, was introduced. RSA is the most widely used asymmetric algorithm. Table 1 describes a number of security services.

Table 1 Cryptography services
Confidentiality: a set of rules to protect information from unauthorized access. Confidentiality may be applied to whole messages or to parts of a message.
Authentication: confirms that the message came from an authorized user and that the message has not been changed.
Data integrity: protects information from being changed or modified by any unauthorized user. Hashing is a common method used to provide integrity to data.
Access control: prevents any unauthorized access to resources.
Non-repudiation: provides proof of sending and receiving, guaranteeing message transmission between two parties. Digital signatures are used for this purpose.

(i) Symmetric or private key cryptography

Fig. 1 Private key cryptography

This technique works with one common key, which is used for both encrypting and decrypting the message (Fig. 1). The key is distributed to both the sender and the receiver of the message. The algorithm is also known as a private key cryptographic algorithm because the key is kept private between the sender and receiver. The robustness of the algorithm relies on how well the key is protected from intruders. There are various symmetric key (private key) algorithms in use today, as shown in Table 2.

Table 2 Common symmetric key (private key) algorithms (algorithm; description; key length)
Blowfish: used block cipher; 32–448 bits
DES: used block cipher; 56 bits
Triple DES: different keys are used, also known as a three-fold application; 168 bits
RC2: block cipher presented by Rivest; 1–2048 bits
RC4: stream cipher established by Rivest; 1–2048 bits
RC5: block cipher, study presented by Rivest in 1994; 32/64/128 bits
RC6: block cipher presented to meet requirements of AES in 1997; 128 bits
AES: proposed by Daemen and Rijmen; 128–256 bits
IDEA: block cipher developed by Massey and Xuejia; 128 bits

Blowfish: Blowfish is a block cipher that encrypts a number of bits of plain text at a time. It is a fast and simple encryption algorithm. The key length is variable, generally up to 448 bits, and the algorithm can be executed on either 32-bit or 64-bit systems.

Data Encryption Standard (DES): This algorithm comes under the category of block ciphers. The length of the key used in DES depends upon the application for which the algorithm is employed; generally, a key of length 56 bits is used. The limitation of DES is that such a small key can be compromised easily.

Triple DES: This is an improvement over DES. It makes DES three times more secure by applying the algorithm thrice with three different keys, for a total key length of 168 bits. Triple DES uses three keys, e.g., k1, k2 and k3,

each of 56 bits. During encryption, the cipher text is obtained as

CT = E_k3(D_k2(E_k1(PT)))    (1)

During decryption, the plain text is obtained as

PT = D_k1(E_k2(D_k3(CT)))    (2)

Here, PT = plain text, CT = cipher text, E = encryption process and D = decryption process.
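The composition order in Eqs. (1) and (2) can be illustrated with a short sketch; the single-key "cipher" below is a toy XOR stand-in chosen only to show the encrypt-decrypt-encrypt structure and its inversion, not DES itself, and the keys and plain text are hypothetical values.

# Toy illustration of the EDE composition in Eqs. (1) and (2).
# E and D are stand-in single-key operations (XOR), not real DES.
def E(key, block):
    return block ^ key        # toy "encryption"

def D(key, block):
    return block ^ key        # toy "decryption" (XOR is its own inverse)

k1, k2, k3 = 0x3A, 0x5C, 0x7E          # hypothetical small keys
pt = 0x42                              # plain text block
ct = E(k3, D(k2, E(k1, pt)))           # Eq. (1): CT = E_k3(D_k2(E_k1(PT)))
assert D(k1, E(k2, D(k3, ct))) == pt   # Eq. (2) recovers PT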

Rivest Cipher 2 (RC2): This is a symmetric key algorithm and comes under the category of block ciphers. It was proposed by Ronald Rivest; RC stands for Rivest cipher or Ron's code. The algorithm works on 64-bit blocks built from 16-bit words. The key is of variable length, between 1 and 2048 bits, and the algorithm consists of 18 mixing and mashing rounds.

Rivest Cipher 4 (RC4): RC4 comes under the category of stream ciphers. The algorithm appeared to be fairly strong and is also known as ARC4 (Alleged RC4). RC4 allows a variable key length between 40 and 256 bits.

Rivest Cipher 5 (RC5): It is a block cipher. The algorithm uses a variable block size (32, 64 or 128 bits), a variable key length (0–2040 bits) and a variable number of rounds (0–255). It is a fast algorithm. It provides appropriate security if the key is long, but with a short key the algorithm is quite weak.

Rivest Cipher 6 (RC6): It is also a block cipher. It uses key sizes of 128, 192 and 256 bits and works with a 128-bit block size. The algorithm is an improvement over RC5, so it provides greater security than RC5, uses fewer rounds and gives a higher throughput.

Advanced Encryption Standard (AES): This algorithm was developed by Joan Daemen and Vincent Rijmen. It is a block cipher [5]. The algorithm uses a key size of 128, 192 or 256 bits, with a number of rounds that varies with the key length, and the key length determines the strength of the algorithm. AES was introduced as a replacement for DES and is nearly six times faster than triple DES.

International Data Encryption Algorithm (IDEA): IDEA is a block cipher that works with a block size of 64 bits and a key size of 128 bits. It is a publicly known algorithm and is used in the Pretty Good Privacy (PGP) program to encrypt files and e-mail. It was patented in a number of countries, but the last patents expired in 2012, so IDEA is now free for all users.

(ii) Asymmetric or public key cryptography

This technique works with two separate keys for encryption and decryption, known as the public key and the private key. Encryption is done using the public key, and decryption is done using the private key. The strength of the algorithm depends on the degree of difficulty of computing the private key (decryption key) from the public key (encryption key). The public key is distributed publicly, but the private key is kept secret. A sender uses the public key of the receiver to encrypt the message, and the message can only be decrypted with the help of the receiver's private key (Fig. 2).

Fig. 2 Public key cryptography

Various types of asymmetric (public key) algorithms are as follows.

RSA Asymmetric Algorithm: RSA was developed by Rivest, Shamir and Adleman. The main application of RSA is digital signatures. The security of RSA depends on the difficulty of factoring large numbers [6]. The algorithm works as follows:
1. Initially, two prime numbers p and q are chosen.
2. Evaluate the value of n = p * q.
3. The sender computes the cipher text as c = m^e (mod n), where e is the public exponent, chosen coprime to (p − 1) * (q − 1).
4. Compute the decryption component d so that (d * e) mod ((p − 1) * (q − 1)) = 1.
5. The receiver computes the plain text as m = c^d (mod n).
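A minimal sketch of these five steps in Python (assuming Python 3.8+ for the modular inverse); the primes, exponent and message are hypothetical toy values, far too small for real security.

# Toy RSA walk-through of the steps above (illustrative only).
p, q = 61, 53                 # step 1: two primes (toy values)
n = p * q                     # step 2: n = p * q = 3233
phi = (p - 1) * (q - 1)
e = 17                        # public exponent, coprime to (p - 1) * (q - 1)
d = pow(e, -1, phi)           # step 4: d * e mod ((p - 1) * (q - 1)) = 1
m = 65                        # plain text represented as an integer
c = pow(m, e, n)              # step 3: cipher text c = m^e mod n
assert pow(c, d, n) == m      # step 5: receiver recovers m = c^d mod n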

Diffie–Hellman Algorithm: In this technique, two users establish a shared secret key over an insecure channel. The algorithm relies on the discrete logarithm problem in finite fields. It works as follows (let A denote the sender and B the receiver):
1. A and B agree on a prime number p and a generator g.
2. The sender A chooses a secret number a, computes (g^a mod p) and sends the obtained value to the receiver B.
3. The receiver B chooses a secret number b, computes (g^b mod p) and sends the obtained value to the sender A.
4. A evaluates ((g^b mod p)^a mod p).
5. B evaluates ((g^a mod p)^b mod p).
Both evaluations yield the same value, which A and B can now use as their shared key.
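A corresponding sketch of the exchange, again with hypothetical toy numbers (real deployments use very large primes or elliptic-curve groups).

# Toy Diffie-Hellman exchange following the steps above.
p, g = 23, 5               # step 1: public prime p and generator g (toy values)
a, b = 6, 15               # secret numbers chosen by A and B
A_val = pow(g, a, p)       # step 2: A sends g^a mod p
B_val = pow(g, b, p)       # step 3: B sends g^b mod p
key_A = pow(B_val, a, p)   # step 4: A computes (g^b mod p)^a mod p
key_B = pow(A_val, b, p)   # step 5: B computes (g^a mod p)^b mod p
assert key_A == key_B      # both sides arrive at the same shared key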

Digital Signature Algorithm (DSA): It is an asymmetric key algorithm. The algorithm can be used to verify whether a message has been altered by an intruder during transmission, and it also assures the receiver of the sender's identity. It is like an electronic signature used for maintaining non-repudiation and data integrity.


2.2 Steganography

With the increased usage of the Internet for sharing important information, there is a great need for providing security to information [7]. Various techniques and methods have been developed for securing communication. Cryptography was developed in order to keep the message secret by encrypting and decrypting it, but sometimes just hiding the contents of the message is not enough; it may be required to hide the existence of the message too [7, 8]. One such technique that uses hidden communication is called steganography. Steganography and cryptography are two different techniques: the main approach in cryptography is hiding the contents of the information or message, thereby making the message secret, while the main approach in steganography is to hide the presence of a message, thereby making the communication secret [9]. Both techniques are used for securing information from any kind of unauthorized access, but neither technique alone is perfect, and each on its own can be compromised.

2.2.1 Classification of Steganography

Steganography algorithms can use any kind of digital file format for hiding the secret, but some of these file formats have a high degree of redundancy. The term redundancy stands for 'the number of bits of an object that provide accuracy.' The existing approach of steganography is shown in Fig. 3. Basically, there are three categories of file formats which can be used for hiding data in steganography [10], i.e., text, image and audio/video.

Fig. 3 Existing approach of steganography

(i) Text steganography

The aim of text steganography is to change the format or some characteristics of the text. The criterion of reliable decoding without any visible change seems somewhat conflicting, as it poses the challenge of designing document marking techniques. The three coding techniques below reveal different approaches, which can be used either separately or jointly.

Line-Shift Coding: In this method, the locations of text lines of a document are shifted vertically to encode the document. A unique shape of text is created to hide the information. This method is not suitable for text files processed with OCR, as the hidden information will be destroyed.

Word-Shift Coding: In this method, the document is altered horizontally by shifting the words to encode the document. This method is applicable to the bit map of a page image or even the format file, and both of them can be used for decoding.

Feature Coding: This method of coding is applicable to different types of formatting such as a bitmap image or a format file. The file is analyzed for certain features of the text, and these features are then changed or kept unchanged. Decoding requires the original image and a specification of the change in pixels or features.

(ii) Image steganography

In steganography, the most common technique is to hide the information inside images [11]. The least significant bits can be used to alter the cover source using various color variations. The main techniques used in image steganography are LSB, DCT and DWT.

Least Significant Bits: This method is used for hiding information in cover images. Bits of the secret message are embedded into the LSBs of the image in a deterministic sequence. In a 24-bit color image, one bit of each of the red, green and blue color components is used to hide the secret message bits [12]; in this way, three bits of each pixel can be used. The changes are too small to be easily noticed by humans, so the message remains inaccessible.

Discrete Cosine Transform (DCT): This method converts an image from the spatial domain to the frequency domain [13]. The image is divided into several spectral sub-bands according to the visual quality of the frequency components, such as high, middle and low. DCT-based embedding modifies coefficient values of the carrier image that are less than a threshold value, hiding the secret in the carrier image through those coefficients.

Discrete Wavelet Transform (DWT): This method describes a multi-resolution decomposition process. The signal is divided into two parts, high and low frequency. Information about edge components is contained in the high frequency part, and the low frequency part of the signal is again divided into high and low frequency. As human vision is less sensitive to changes in edges, the high frequency components are generally used for steganography (Table 3).

Table 3 Comparison of image steganographic techniques
LSB (least significant bit): cover media is an image; the LSB of every pixel of one image is used to hide the MSB of another image; it is the easiest and simplest way of hiding data.
Discrete cosine transform (DCT): cover media is an image; the information is embedded by changing the DCT coefficients; data can be distributed more evenly in the cover image, but a large amount of data cannot be hidden.
Discrete wavelet transform (DWT): cover media is an image; wavelets are used to encode the image; it provides high security, but the cost of computing the DWT is higher.

(iii) Audio Steganography

Audio steganography embeds the secret message in a digital audio signal, in which the binary pattern of the corresponding audio file is changed. The following methods are used for audio steganography.

LSB Coding: A quantization technique is used along with a sampling technique for converting an audio signal from analog to digital form. In LSB coding, the secret message is stored in the LSBs of the binary pattern of the audio file, by replacing the LSBs of the binary pattern with the secret message bits.

Phase Coding: When the phase of an audio signal changes, the change in phase is not recognized easily by the human auditory system (HAS), but noise in the signal can be recognized by the HAS. This fact is used by the phase coding method, which encodes the secret message by shifting the phase of the digital signal.

Spread Spectrum: Two approaches are used for spread spectrum, direct sequence spread spectrum and frequency hopping spread spectrum. The secret message is spread through the frequency spectrum of the audio signal. It obtains a better performance compared with other audio steganography techniques.
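As a minimal sketch of the LSB embedding idea described above for images (assuming an 8-bit grayscale cover held in a NumPy array; the random cover and message bits are placeholders for real image data):

import numpy as np

def embed_lsb(cover, message_bits):
    # Hide one message bit in the least significant bit of each cover pixel.
    flat = cover.astype(np.uint8).flatten()
    assert message_bits.size <= flat.size, "cover image too small for the message"
    flat[:message_bits.size] = (flat[:message_bits.size] & 0xFE) | message_bits
    return flat.reshape(cover.shape)

def extract_lsb(stego, n_bits):
    # Read the hidden bits back from the least significant bits.
    return stego.flatten()[:n_bits] & 1

cover = np.random.randint(0, 256, (64, 64), dtype=np.uint8)    # stand-in cover image
secret = np.random.randint(0, 2, 100, dtype=np.uint8)          # 100 secret bits
stego = embed_lsb(cover, secret)
assert np.array_equal(extract_lsb(stego, secret.size), secret)

Because only the lowest bit of each pixel changes, the stego-image is visually indistinguishable from the cover, which is exactly the property the LSB method relies on.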

2.3 Visual Cryptography

There are various secret-sharing schemes that help us to encode a document or file, making it difficult for anyone to decipher. Visual cryptography is a technique for encrypting an image which, on encoding, creates multiple shares of the image; when the obtained shares are overlapped, the secret image can be reproduced. The main attraction of this technique is that decryption can be done by the human visual system (HVS). The method was developed by Moni Naor and Adi Shamir [14]. In their study, the authors broke an image into 'n' shares; to reveal the original secret image, all these 'n' shares have to be obtained. The shares were printed on transparent sheets and overlapped on one another to decrypt the secret image (Fig. 4). The obtained shares are noise-like structures, and alone, no share reveals any information about the original secret image. This technique is used in various applications where a secret is hidden inside images and needs to be protected from unauthorized access.

Fig. 4 (2, 2) Visual cryptography shares
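A minimal sketch of the (2, 2) idea with a 1 x 2 subpixel expansion, assuming the secret is a binary NumPy array (1 = black, 0 = white); this illustrates the stacking behavior rather than reproducing any particular published construction.

import numpy as np

def make_shares(secret):
    # (2, 2) visual cryptography: each secret pixel becomes two subpixels per share.
    h, w = secret.shape
    share1 = np.zeros((h, 2 * w), dtype=np.uint8)
    share2 = np.zeros((h, 2 * w), dtype=np.uint8)
    patterns = np.array([[1, 0], [0, 1]], dtype=np.uint8)   # complementary subpixel patterns
    for i in range(h):
        for j in range(w):
            p = patterns[np.random.randint(2)]
            share1[i, 2 * j:2 * j + 2] = p
            # white pixel: identical patterns; black pixel: complementary patterns
            share2[i, 2 * j:2 * j + 2] = p if secret[i, j] == 0 else 1 - p
    return share1, share2

secret = (np.random.rand(32, 32) > 0.5).astype(np.uint8)
s1, s2 = make_shares(secret)
stacked = s1 | s2   # stacking transparencies: a subpixel is black if it is black in either share
# black secret pixels stack to two black subpixels, white ones to exactly one
assert np.all(stacked.reshape(32, 32, 2).sum(axis=2) == secret + 1)

Each share on its own is uniform random noise; only when both shares are stacked do black pixels become fully dark while white pixels stay half-dark, which is how the human visual system recovers the secret.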

2.3.1 Visual Cryptography Schemes

Different types of visual cryptographic schemes are as follows.

(i) Visual Cryptography for Gray Images: While working with gray images, a dithering technique is used to transform the gray image into a binary image, and then the encryption scheme for binary images is applied to create the shares. Thus, with dithering, a gray sub-pixel is not used directly for the construction of shares.

(ii) General Access Structures of Visual Cryptography: This structure is used to define the qualified and forbidden sets of the shares obtained after encrypting the secret image. In the basic (m, n) model, any 'm' shares from the qualified set are required to obtain the original secret (m ≤ n). Even if m − 1 shares from the qualified set are used, the original secret is not revealed. Likewise, even if 'm' shares from the forbidden set are used, the secret image is not revealed. Only 'm' or more shares from the qualified set can reveal the secret.

(iii) Halftone Visual Cryptography: The shares obtained by basic visual cryptography schemes were of poor quality. To address this limitation, halftone visual cryptography was proposed, which produces meaningful shares with increased quality. In this method, a secret binary pixel 'X' is encoded in an array of Q1 × Q2 sub-pixels, termed a halftone cell. Using halftone cells, meaningful shares can be generated, which are not noise-like structures.

(iv) Visual Cryptography for Color Images: There are three methods that can be used for encrypting a color image:
1. The first method uses larger pixel expansion. The colors of the secret image are printed on the shares directly. The image obtained after decryption is of degraded quality.
2. In the second method, the color image is transformed into a black and white image by dividing the image over the three color channels of RGB or CMY. This method also degrades the quality of the image obtained after decryption.
3. The third method provides a better quality image after decryption. Here, the encryption of the secret image is done at the bit level, using the binary representation of the color image. This method needs devices to decrypt the image.

Depending on the application, any one of these three methods can be used for encrypting a colored image.

3 Related Work

3.1 Cryptography

Paliwal and Gupta [15] gave a review of different kinds of existing encryption techniques. The paper also focused on techniques used for image encryption, information encryption, chaos-based encryption and double encryption. Each technique is different in its own way and can be used in real-time encryption. Verma et al. [16] presented a comparison of the performance of the most widely used algorithms like DES, triple DES, Blowfish and AES. The authors compared the algorithms by implementing them with different data block sizes to evaluate their encryption and decryption speed. Shah et al. [17] presented a solution to the data security problem through a cryptography technique based on ASCII values. The basic factor while transferring data over a network is how much security is provided by the channel, not the amount of data being transferred. Chandra et al. [18] proposed an algorithm for providing security to data. The algorithm presented by the authors is highly efficient, but some areas related to the algorithm are still open, and the authors also presented the related future scope.

Cryptography is a technique used for secure communication; it provides a solution to protect data against intruders. Data encryption techniques make use of complex mathematical calculations to achieve secure communication, but cryptographic algorithms are still prone to one or more attacks. Saranya et al. [19] presented a study on different symmetric key cryptographic algorithms; the authors gave a comparison between different symmetric algorithms and also presented the attacks to which they are vulnerable. Encryption is a process in which a message is scrambled in such a manner that it can be read by an authorized person only. As so much information is stored on computers today, there is a need to make sure that this information is not vulnerable to attack. Singh and Supriya [20] presented a review of different encryption algorithms, each of which might be suitable for different applications. The authors found that the AES algorithm performs better than the other algorithms in terms of speed, throughput, time and avalanche effect. In today's era, the most challenging issue is information security, and the main motive of cryptography is to provide integrity, availability, confidentiality and authentication to the user. Menon et al. [21] presented a paper in which AES, DES and 3DES are compared for data encryption.


All the algorithms are unique in their own way; the authors found that, out of these, AES is much more efficient than the other algorithms in terms of time, speed and throughput. An analysis of various symmetric and asymmetric key cryptographic algorithms was presented by Gupta and Kaur Walia [22]. The authors analyzed symmetric and asymmetric key algorithms like DES, triple DES, AES, Blowfish, IDEA and RSA according to parameters such as size of data and size of key. They found that the Blowfish algorithm is better than the other algorithms in terms of performance and efficiency. Apart from Blowfish, RSA is another very secure and widely used algorithm, and it can be used in combination with other algorithms to improve security (Table 4). Jirwan [23] presented a review of various asymmetric key cryptography algorithms. The authors found that ECC executes faster than RSA because it uses a smaller key, but its mathematical operations are more complex than those of RSA. Kumar and Rana [24] presented a review of the AES algorithm and proposed an improved version of AES. While transferring information over the Internet, the security of the information is very important; the authors proposed that by adding 16 rounds to the AES algorithm, more security can be given to the secret information. Mewada et al. [25] presented a paper that describes an architecture for private key algorithms on the basis of attributes such as scalability, flexibility and reliability, and described an effective approach for symmetric key cryptosystems to solve security issues.

Table 4 Previous survey of cryptography (year of survey; authors; characteristics)
2011; Verma et al. [16]; analysis of encryption algorithms
2013; Paliwal and Gupta [15]; review of encryption techniques
2013; Singh and Supriya [20]; study of RSA, DES, 3DES, AES algorithms
2013; Jirwan [23]; analysis of asymmetric encryption techniques
2014; Shah et al. [17]; cryptography algorithm using ASCII value of data
2014; Chandra et al. [18]; survey of symmetric encryption algorithms
2014; Gupta and Kaur Walia [22]; analysis of various cryptographic algorithms
2014; Saranya et al. [19]; comparison between various symmetric encryption techniques
2015; Kumar and Rana [24]; improved AES algorithm
2016; Mewada et al. [25]; description of efficient symmetric key cryptographic algorithms
2017; Menon et al. [21]; comparison of AES, DES and 3DES algorithms


3.2 Steganography

Singh et al. [26] presented an improved LSB algorithm for colored images. The technique proposed by the authors embeds the data into the three channels of a red, green and blue (RGB) image. Sumeet and Savina [27] presented various image steganography techniques, giving an overview of the importance and challenges of steganography and also discussing other security techniques. Kaur and Behal [28] presented a review paper comparing available image steganography techniques such as LSB-, DWT- and DCT-based steganography; the authors implemented the LSB algorithm in the spatial domain and the DCT and DWT algorithms in the frequency domain. Sharma and Shrivastava [29] proposed a steganographic algorithm based on logical operations on an image (8-bit grayscale or 24-bit color) that provides security against steganalysis. The algorithm works by embedding the 'n' most significant bits of the secret image into the 'n' least significant bits of the cover image; the quality of the obtained stego-image was much better without extra computational complexity. Swain [10] presented two new methods of steganography in the spatial domain, proposing an algorithm where 'n' bits in a pixel were replaced by another set of 'n' bits. The first method hides one bit of secret information, and the second method hides two bits. The algorithm was implemented, and security was found to be improved (Table 5).

Steganography is different from other data hiding techniques like watermarking and cryptography. Mahajan and Khadke [31] presented a paper that highlights some basic concepts about steganography and its types; the authors presented a 2/3 LSB steganography method which is useful for hiding more data within an image with less distortion of the cover image. Today, 'information hiding' is used to relate to watermarking and steganography. Patel and Patel [7] presented a review of image steganography and various image steganographic techniques like DCT, LSB and DWT and gave a comparative analysis of these techniques. In today's world, most images and secret messages are shared over social networks, so a major challenge is to sustain the confidentiality, authenticity and integrity of the message. Muhammad et al. [33] presented a secure framework for validation of visual contents. The method makes use of the I-plane of the input image in the HSI color space for secret data embedding using Morton scanning (MS); the authors verified that the results obtained by the proposed scheme give better performance. Kanojia and Choudhary presented an LSB image steganography technique [34] where the use of a secret key is demonstrated. To limit unauthorized access, the secret information is stored in the LSBs of the image, and a secret key is used to safeguard the information.

Table 5 Previous survey of steganography (year of survey; authors; characteristics)
2012; Kavitha et al. [30]; image steganography using least significant bit method
2012; Sharma and Shrivastava [29]; hiding an image in another image by making use of the LSB algorithm with minimized detection
2014; Patel and Patel [7]; review of image steganography techniques
2014; Kaur et al. [27]; classification of steganographic techniques
2014; Kaur and Behal [28]; comparison of digital steganographic techniques
2015; Mahajan and Khadke [31]; review on LSB steganography and 2/3 steganography
2015; Singh and Singh [32]; improved LSB technique for RGB images
2016; Swain [10]; image steganography by making use of substituting variable length of bits
2017; Muhammad et al. [33]; image steganography for authenticity of visual contents in social networks
2019; Kanojia and Choudhary [34]; data hiding using LSB steganography

3.3 Visual Cryptography

Blundo et al. [35] presented an analysis of visual cryptography for gray images. In gray level images, the pixels have n gray levels, from 0 (white pixel) to 1 (black pixel); the authors presented a necessary and sufficient condition for such a scheme. Bhumje and Kadbe [36] presented a review of existing visual cryptography techniques, providing for each scheme a detailed explanation of the methods used to provide security to the secret image. Ramya and Parvathavarthini [37] presented a study of various visual cryptography schemes, along with a performance analysis on the basis of pixel expansion, image format, number of secret images and type of shares generated. Visual cryptography is an image cryptography method which encrypts the secret image such that two different shares of the secret image are obtained; most of the previous VC schemes have been developed for binary images. Taghaddos and Latif [38] presented a new variant of visual cryptography for encrypting gray images. The proposed method decomposes the pixels of a gray image at the bit level to extract binary bit planes; these bit planes are then encrypted and restored as two grayscale shares, and the original secret image is obtained by stacking the gray shares over one another (Table 6).

Visual cryptography is a branch of cryptography [42] where decryption is done only by users holding shares. Visual cryptography works by using the characteristics of the human visual system (HVS) and does not require any complex computation for decryption. Maan and Chawla [43] gave a review of visual cryptography and its various applications, such as bank customer identification, biometric security and remote electronic voting. Tamilarasi et al. [39] proposed a method for halftone images that improved the quality of the decrypted images and also reduced pixel expansion. Reddy and Prasad [40] proposed an extended visual cryptography scheme. The method encrypted multiple secret images in such a manner that meaningful shares were generated; for regenerating the original multiple secret images, all the generated shares are required. Kar et al. [41] implemented an algorithm in which color decomposition and halftone techniques are applied to the input images to obtain the shares. The generated shares reveal the original secret image only when all the shares are overlapped.

Table 6 Previous survey of visual cryptography (year of survey; authors; characteristics)
2000; Blundo et al. [35]; defining and analysis of visual cryptography scheme for gray images
2013; Bhumje and Kadbe [36]; review of existing visual cryptography schemes
2014; Ramya and Parvathavarthini [37]; review and performance analysis of various VC schemes
2014; Taghaddos and Latif [38]; visual cryptography using pixels at bit level for gray scale images
2014; Tamilarasi et al. [39]; improving image quality of halftone images
2016; Reddy and Prasad [40]; VC for sharing multi secret
2018; Kar et al. [41]; VC algorithm for colored images

4 Conclusion and Future Work

For an individual and an organization, data is a very important entity, and so is the security of that data. Many techniques have been used so far to provide security to data. This paper presents a review of the techniques adopted for information security through cryptography, visual cryptography and steganography, as none of these on its own is sufficient to yield good results. More advanced methods and techniques still need to be explored. Future endeavors in this direction should involve combining the concepts of cryptography, visual cryptography and steganography, leading to multilayered security for the secret data.

References 1. Youngsoo K, Ikkyun K, Namie P (2014) Analysis of cyber attacks and security intelligence. Springer, Berlin 2. Priyadarshini P, Prashant N, Narayan DG, Meena SMA (2016) Comprehensive evaluation of cryptographic algorithms: DES, 3DES, AES, RSA and Blowfish. In: International conference on information security & privacy (ICISP), Procedia computer science, vol 78. ScienceDirect, Elsevier, pp 617–624 3. Rajdeep B, Rahul H (2015) A review and comparative analysis of various encryption algorithms. IJSIA 9(4):289–306


4. Agrawal CP, Zeenat H (2016) Analysis of different cryptography algorithms. IJSEAS 2(4):347– 351 5. Vandana CK (2013) Modification in advanced encryption standard. JIKRCE 2(2):356–358 6. Sheetal C, Sandeep S (2014) A Comparative Study of Rivest Cipher Algorithms. IJICT 4(17):1831–1838 7. Palak RP, Yask P (2014) Survey on different methods of image steganography. IJIRCCE 2(12):7614–7618 8. Rupali K, Patil AS (2015a) Review paper on different types of steganography. IJRECE 36(2):122–124 9. Rupali K, Patil AS (2015b) Review paper on different types of steganography. IJRECE 3(2):122–124 10. Gandharba S (2016) Digital image steganography using variable length group of bits substitution. In: International conference on computational modeling and security (CMS 2016). Procedia Computer Science, vol 85. Elsevier, Amsterdam, pp 31–38 11. Amandeep K, Rupinder K, Navdeep K (2015) A review on image steganography techniques. IJCA 123(4):20–24 12. Akanksha S, Monika C, Shilpi S (2018) Comparison of LSB and proposed modified DWT algorithm for image steganography. In: 2018 International conference on advances in computing, communication control and networking (ICACCCN), IEEE 13. Aya YA, Wafaa RS, Ibrahim MH (2019) Data security in cloud computing using steganography: a review. In: 2019 International conference on innovative trends in computer engineering (ITCE), IEEE 14. Moni N, Adi S (1994) Visual cryptography. Springer, Berlin 15. Swati P, Ravindra G (2013) A review of some popular encryption techniques. IJARCSSE 3(2):147–149 16. Verma OP, Ritu A, Dhiraj D, Shobha T (2011) Performance analysis of data encryption algorithms. IEEE, pp 399–403 17. Naitik S, Nisarg D, Viral V (2014) Efficient cryptography for data security. In: International conference on computing for sustainable global development. IEEE, pp 908–910 18. Sourabh C, Smita P, Safikul S A, Goutam S (2014) A comparative survey of symmetric and asymmetric key cryptography. In; International conference on electronics, communication and computational engineering (ICECCE), IEEE, 83–93 19. Saranya K, Mohanapriya R, Udhayan J (2014) A review on symmetric key encryption techniques in cryptography. IJSETR 3(3):539–544 20. Gurpreet S, Supriya (2013) A study of encryption algorithms (RSA, DES, 3DES and AES) for information security. IJCA 67(19) 21. Menon CB, Joy A, Emmanuel E, Paul V (2017) Analysis on symmetric algorithms. In: Int J Eng Sci Comput 22. Anjula G, Navpreet KW (2014) Cryptography algorithms: a review. IJEDR 2(2):1667–1672 23. Nitin J, Ajay S, Sandip V (2013) Review and analysis of cryptography techniques. IJSER 4(3):1–6 24. Puneet K, Shashi BR (2015) Development of modified AES algorithm for data security. Elsevier, Optik 25. Shivlal M, Pradeep S, Gautam SS (2016) Classification of efficient symmetric key cryptography algorithms. IJCSIS 14(2) 26. Amritpal S, Harpal S (2015) An improved LSB based image steganography technique for RGB images. IEEE 27. Sumeet K, Savina B, Bansal RK (2014) Stegaanography and classification of image steganography technique. In: International conference on computing for sustainable global development (INDIACom). IEEE 28. Navneet K, Sunny B (2014) A survey on various types of steganography and analysis of hiding techniques. IJETT 11(8):388–392 29. Vijay KS, Vishal S (2012) A steganography algorithm for hiding image in image by improved LSB substitution by minimize detection. JOTAIT 36(1):1–8


30. Kavitha, Kavita K, Ashwani K, Priya D (2012) Steganography using least significant bit algorithm. IJERA 2(3):338–341 31. Apurva SM, Sheetal GK (2015) Review on LSB steganography. IJCST 3(2):216–218 32. Shahzad A, Zakariya SM, Rafiw MQ (2013) Analysis of modified LSB approaches of hiding information in digital images. In: International conference on computational intelligence and communication networks, pp 280–285 33. Khan M, Jamil A, Seungmin R, Sung WB (2017) Image steganography for authenticity of visual contents in social networks, vol 76(18). Springer, Berlin, pp 18985–19004 34. Kanojia P, Choudhary V (2019) LSB based Image Steganography with the aid of secret key and enhance its capacity via reducing bit string length. In: 2019 3rd international conference on electronics, communication and aerospace technology (ICECA), IEEE 35. Carlo B, Alfredo DS, Moni N (2000) Visual cryptography for grey level images. Inf Process Lett 75:255–259 36. Anjali MB, Premananad KK (2015) Visual cryptography scheme: a review. IJSR 4(3):1309– 1311 37. Ramya J, Parvathavarthini B (2014) An extensive review on visual cryptography schemes. In: International conference on control, instrumentation, communication and computational technologies (ICCICCT). IEEE, pp 223–228 38. Taghaddos D, Latif A (2014) Visual cryptography for gray-scale images using bit-level. JIHMSP 5(1):90–97 39. Tamilarasi J, Vanitha V, Renuka T (2014) Improving image quality in extended visual cryptography for halftone images with no pixel expansion. IJSTR 3(4):126–131 40. Reddy LS, Prasad MVNK (2016) Extended visual cryptography scheme for multi-secret sharing. In: Proceedings of 3rd international conference on advanced computing, networking and informatics. Springer, Berlin 41. Chinmoy K, Suman K, Sreeparna B (2019) an approach for visual cryptography scheme on color images. Springer, Berlin 42. Kim HJ, Choi Y (2005) A new visual cryptography using natural images. In: International symposium on circuits and systems. IEEE 43. Pooja M, Raman C (2015) A review paper on visual cryptography technique. Int J Eng Techn Res (IJETR) 3(7)

Affect Recognition using Brain Signals: A Survey

Resham Arya, Ashok Kumar, and Megha Bhushan

Abstract Emotions, known as affects, play a vital role in the daily life of human beings, and their recognition based on physiological signals has been receiving close interest from researchers. Among these signals, electroencephalography (EEG), which captures brain signals, has received constant attention for emotion identification. It is the most convenient, portable and easy solution and has been applied in various applications like safe driving, health care, social security and neuroscience. This paper presents a survey of EEG-based neurophysiological research from 2012 to 2018, providing details of the work done in emotion recognition. The main phases of the emotion recognition process include stimulus selection, signal recording of subjects, feature selection, extraction and classification to categorize emotions. This survey concludes with the current state of the art and some future challenges, which will be valuable for the research community, in particular for those who are choosing affect recognition as their research interest.

Keywords Affective computing · EEG · Emotions · Physiological signals · Recognition

1 Introduction An emotion (affect) is essential in everyday existence of human beings as it plays a main role in their cognitive activities like decision-making process, problem solving, intelligence, interaction and perception [1]. It is a complicated psychological state R. Arya (B) · A. Kumar Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India e-mail: [email protected] A. Kumar e-mail: [email protected] M. Bhushan School of Computing, DIT University, Dehradun, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_40


that includes a subjective experience (how the person feels), a physiological response and a behavioral reaction [2]. The method which helps in understanding the type of affect a person is feeling is known as affect recognition [3]. Affective computing is a flourishing area concerned with acquiring data from the human body in order to recognize, interpret and process emotions [4]. It has attracted great interest from researchers in interdisciplinary fields. In recent years, this field has expanded from psychology to various engineering domains in collaboration with basic studies on emotion theories [5]. Thus, there is no doubt that emotion pervades every aspect of human life and has a deep influence on movements, actions, thoughts and perceptions.

Affective computing leads to the development of systems known as affective human–computer interaction (HCI) that aim to identify and interpret the emotional states of human beings [6]. Various applications like health care (especially mental health monitoring), safe driving, social security [7], entertainment, education, business, virtual worlds, etc., have adopted this field in order to improve the daily-life activities of human beings [8]. Besides this, when the same situation occurs, human beings located in different environments usually have distinct experiences, different responses and various emotional behaviors. Hence, for a better understanding of emotions, physiological signals are captured from the human body [9]. These signals are generated by the central and autonomic nervous systems, which are responsible for carrying and transmitting meaningful information, including our actions, responses, feelings and emotions, to the various parts of the body. Physiological signals (also known as modalities) such as electroencephalography (EEG), electrocardiography (ECG), galvanic skin response (GSR)/electrodermal activity (EDA), electromyography (EMG) and respiration rate (RSP) are commonly used to measure emotions [10]. Many researchers have used these modalities individually as well as in combination to achieve more accurate identification of emotions.

This paper is structured in multiple sections. Section 2 gives an overview of emotions, the human brain, EEG and a comparative analysis of existing work in the field of affective computing from 2012 to 2018, shown in tabular form. This table includes the datasets from which various elements are used as stimuli, the selected EEG features and extraction methods, classifiers, targeted emotions and a summary of each article. After that, the standard EEG-based affect recognition methodology is explained in Sect. 3, along with the various equipment used for signal recording. At last, the conclusion is drawn with some future challenges in Sect. 4.


2 Background

2.1 Theories of Emotions

Emotions are generally expressed by people in various manners and are usually represented with the help of two different approaches. The first is to categorize emotion into a set of basic discrete emotions, which further helps in deriving other emotions [11]. Ekman [12] conducted various experiments and concluded that there are six basic emotions, namely happiness, sadness, fear, surprise, disgust and anger. Later, in 2001, Plutchik suggested eight basic emotions: joy, disgust, anger, curiosity, sadness, acceptance, fear and surprise [13]. Other emotions can be created by combining the basic ones; for example, disappointment contains surprise and sadness. In the second approach, emotions are categorized along three dimensions, namely valence, arousal and dominance. Valence indicates feelings ranging from very negative (like stress) to strongly positive (like contentment), and the arousal dimension ranges from a relaxed or calm state to an alert or excited state. Dominance is related to the strength of an emotion [12] and also represents the degree of control generated by the stimulus [14]. Furthermore, the circumplex model of affect is commonly used for the categorization of emotions. It divides emotions along two dimensions, with valence represented by the horizontal axis and arousal by the vertical axis. The two axes intersect at one point, dividing the space into four quadrants, as shown in Fig. 1 [15].
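As a small illustration of this quadrant structure, the sketch below maps valence and arousal ratings (assumed here to be on a 1 to 9 self-assessment scale centered at 5) to a quadrant; the example emotion words are common interpretations of the quadrants, not labels taken from this survey.

def circumplex_quadrant(valence, arousal, midpoint=5.0):
    # Map a (valence, arousal) pair to one of the four circumplex quadrants.
    if valence >= midpoint and arousal >= midpoint:
        return "high valence / high arousal (e.g., excited, happy)"
    if valence < midpoint and arousal >= midpoint:
        return "low valence / high arousal (e.g., angry, stressed)"
    if valence < midpoint:
        return "low valence / low arousal (e.g., sad, bored)"
    return "high valence / low arousal (e.g., calm, relaxed)"

print(circumplex_quadrant(7.2, 6.8))   # -> high valence / high arousal quadrant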

2.2 EEG

The human brain generates electrical signals inside the scalp which are generally read using a technique known as EEG. These signals reflect the activities of the brain generated by the central and autonomic nervous systems [16]. EEG measures fluctuations in voltage that result from the current flows inside the neurons of the brain [17]. These signals are partitioned into specific brain waves, namely delta, theta, alpha, beta and gamma, having different frequencies [17]. Among these, delta waves are associated with the unconscious mind and have the lowest frequency, i.e., less than 4 Hz; they appear during states like deep dreamless sleep, deep meditation and activities that need continuous attention. Theta waves, lying within the 4–8 Hz range, are linked with the subconscious mind and its related activities, for instance light sleep, drowsiness, imagining and dreaming. Alpha waves occur more during a calm and relaxed, yet aware, mental state; their frequency is between 8 and 13 Hz, and they appear more over the occipital and parietal areas of the brain. Beta waves lie within 13–30 Hz and are associated with the normal waking and conscious mind; most people operate in the beta band during the day [18].


Fig. 1 Circumplex model of affect in valence and arousal dimensions [15]

Finally, gamma waves are the fastest, with frequencies within 30–50 Hz. A sample image of the EEG frequency bands is shown in Fig. 2.
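A small sketch of how these bands can be quantified from a raw EEG channel using Welch's power spectral density estimate; the sampling rate, synthetic signal and band edges below are illustrative assumptions rather than values taken from any particular study.

import numpy as np
from scipy.signal import welch

fs = 128                                    # sampling rate in Hz (e.g., DEAP uses 128 Hz)
t = np.arange(0, 10, 1 / fs)
eeg = np.random.randn(t.size)               # stand-in for one EEG channel

bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 50)}

f, psd = welch(eeg, fs=fs, nperseg=fs * 2)  # power spectral density estimate
for name, (lo, hi) in bands.items():
    mask = (f >= lo) & (f < hi)
    power = np.trapz(psd[mask], f[mask])    # integrate the PSD over the band
    print(f"{name} band power: {power:.4f}")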

2.3 Database Used

In the field of affective computing, several databases are available for analyzing the various modalities that result in different categories of emotions; they are listed in Table 1. These databases contain multiple elements responsible for stimulating an emotion in a human being [2]. For instance, images from the International Affective Picture System (IAPS) [19–21] and the Geneva Affective Picture Database (GAPED) [22], and audio clips from the International Affective Digitized Sound System (IADS) [23, 24], have been used for elicitation. Physiological signals also help in detecting human emotions and are available in various standard databases. According to this review, two databases, the Database for Emotion Analysis using Physiological Signals (DEAP) [9] and the SJTU Emotion EEG Dataset (SEED) [25], provide physiological signals of the human body.


Fig. 2 Different types of brainwaves [2]

2.4 Comparison Table

The information related to affect recognition from 2012 to 2018 is provided in Table 1. It includes the databases whose elements were used as stimuli, the salient EEG feature selection and extraction methods, the filters used for removal of noise and artifacts, the classifiers used for categorization of emotions and a short summary of each article.

3 EEG-Based Affect Recognition Methodology

EEG is a well-known, effective and non-invasive method used to measure changes in different regions of the brain [53]. For EEG-based emotion recognition, multiple steps are applied to subjects. These steps comprise signal acquisition using correct placement of electrodes, signal preprocessing including artifact removal, feature selection and extraction methods, and then emotion classification in terms of valence as well as arousal, as shown in Fig. 4 [2].
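A compact sketch of such a pipeline (acquisition and artifact removal omitted), using band-power features and an SVM, one of the classifiers most often encountered in Table 1; the synthetic trials, labels and parameters are assumptions for illustration only.

import numpy as np
from scipy.signal import welch
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

fs = 128
rng = np.random.default_rng(0)
trials = rng.standard_normal((100, 5 * fs))      # 100 trials of 5 s single-channel EEG
labels = rng.integers(0, 2, size=100)            # stand-in valence labels (0 = low, 1 = high)

def band_powers(trial):
    # Feature extraction: theta, alpha, beta and gamma band power per trial.
    f, psd = welch(trial, fs=fs, nperseg=fs)
    edges = [(4, 8), (8, 13), (13, 30), (30, 50)]
    return [np.trapz(psd[(f >= lo) & (f < hi)], f[(f >= lo) & (f < hi)]) for lo, hi in edges]

X = np.array([band_powers(tr) for tr in trials])
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.3, random_state=0)
clf = SVC(kernel="rbf").fit(X_train, y_train)    # emotion classification step
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))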

Table 1 Analysis of electroencephalography-based affect recognition

[6] (2018). Stimulus: DEAP. Features and extraction methods: time–frequency domain features, decision tree. Filter: ICA. Classifier: MLP. Emotions: valence and arousal. Summary: all the regions in the brain are interconnected and equally responsible for the emotional activity; fusion of the EEG-EMG modalities can produce good results.

[26] (2018). Stimulus: videos (DEAP). Features and extraction methods: HOC, statistical features, Hjorth parameter, frequency domain features, frontal symmetry alpha, GA. Filter: bandpass filter. Classifier: k-NN. Emotions: stress. Summary: high classification accuracy using GA-based feature selection for stress detection; the proposed method can be applied in multiple applications for stress detection.

[27] (2018). Stimulus: workload. Features and extraction methods: time domain and frequency domain features. Filter: bandpass filter 0.5–64 Hz, ICA. Classifier: k-NN, Gaussian SVM. Emotions: low and high stress. Summary: Gaussian SVM has the highest accuracy in detecting stress; the proposed early stress detection model can contribute to improving workers' safety, well-being, health and productivity by assessing their stress levels in real time.

[19] (2018). Stimulus: pictures (IAPS), music. Features and extraction methods: frequency domain features, PSI, DTF, GPDS. Filter: bandpass filter 4–32 Hz. Classifier: k-NN, SVM. Emotions: valence and arousal. Summary: in comparison to the rest state, connectivity among the brain regions changes in the stress state.

[28] (2018). Stimulus: pictures. Features and extraction methods: PSD, MPC. Filter: bandpass infinite impulse filter 1–45 Hz. Classifier: k-NN. Emotions: valence and arousal. Summary: in the gamma band, at intermediate arousal levels, visual emotional stimuli affect brain dynamics; the role of gender in emotion processing can be investigated.

[25] (2018). Stimulus: DEAP, SEED. Features and extraction methods: linear and nonlinear EEG features. Filter: bandpass filter. Classifier: SVM. Emotions: positive and negative. Summary: it aims to find the robust EEG features which help in cross-subject emotion recognition.

[29] (2017). Stimulus: pictures. Features and extraction methods: gamma, beta, alpha, theta, delta, DWT. Filter: AMR. Classifier: HMM. Emotions: like and dislike. Summary: EEG-based emotion recognition has good scope in neuromarketing; fake answers can be detected in future.

[30] (2017). Stimulus: pictures. Features and extraction methods: theta, alpha, beta, gamma, PSD, SP, CSP. Filter: passive high-pass filter, Butterworth low-pass filter. Classifier: LDA. Emotions: positive and negative. Summary: this paper aims at finding high accuracy for better electrode placement on the head.

[20] (2017). Stimulus: pictures (IAPS). Features and extraction methods: ERP. Filter: data normalization, rhythm extraction. Classifier: no classifier. Emotions: negative. Summary: the P3 feature and its amplitude change with emotional stimulation in the parietal lobe.

[31] (2017). Stimulus: video (DEAP). Features and extraction methods: delta, theta, alpha, beta, gamma, DWT. Filter: S-Golay. Classifier: k-NN. Emotions: valence and arousal. Summary: the gamma band plays an important role in classification of the arousal and valence dimensions.

[32] (2017). Stimulus: pictures. Features and extraction methods: ERSP. Filter: analog bandpass filter. Classifier: ANOVA. Emotions: anger, happiness and neutral. Summary: theta synchronization increases in low-depression patients following happiness stimulation, and theta synchronization increases due to anger elicitation in high-depression patients.

[33] (2016). Stimulus: music. Features and extraction methods: FD, PSD, DWT. Filter: bandpass frequency filter 05–60 Hz, notch filter 60 Hz, ICA. Classifier: DBN. Emotions: valence and arousal. Summary: with FD and PSD features, DBN improves model performance in valence classification, and similarly with FD and DWT in arousal classification.

[34] (2016). Stimulus: music. Features and extraction methods: theta, alpha, beta, gamma, DTF. Filter: bandpass frequency filter 2–42 Hz, manual removal of artifacts. Classifier: SVM. Emotions: joyful, melancholic and neutral. Summary: connectivity between the various parts of the brain is estimated according to the three emotions, i.e., joyful, melancholic and neutral; a non-invasive assessment tool can be developed for automatic detection of musical emotions.

[35] (2016). Stimulus: video. Features and extraction methods: theta, alpha, beta, gamma, STFT. Filter: SWT. Classifier: SVM. Emotions: fear and relaxation. Summary: a single-channel electrode carries enough information to classify emotions, which can help the field of neuro-science in future; SWT increases the emotion classification rate by 3%, and accuracy can be increased further by using other available classifiers.

[36] (2016). Stimulus: music. Features and extraction methods: statistical, PSD, FFT, WT. Filter: bandpass frequency filter 1–50 Hz. Classifier: MLP-BP. Emotions: happy, sad, love and anger. Summary: in comparison with the SVM and k-NN classifiers, MLP gives the best accuracy in emotion recognition while listening to music.

[37] (2016). Stimulus: DEAP. Features and extraction methods: theta, alpha, beta, gamma, HHS, HOC, STFT. Filter: bandpass frequency filter 4–45 Hz. Classifier: SVM, RF. Emotions: anger, surprise and other. Summary: for EEG emotion detection, gamma plays an important role in the pre-frontal and left temporal lobes, and RF is more robust and easier to use for classification than SVM.

[23] (2016). Stimulus: IADS. Features and extraction methods: theta, alpha, beta, HFD, HOC, DFT, statistical. Filter: bandpass filter 2–42 Hz. Classifier: SVM-Polynomial. Emotions: pleasant, happy, angry and frightened. Summary: the most stable features, FD, HOC and bands, give the best accuracy in real-time emotion detection.

[38] (2015). Stimulus: DEAP. Features and extraction methods: gamma, WT, SE, AR, CC. Filter: bandpass frequency filter 4–45 Hz. Classifier: ML-SVM. Emotions: excitation, sadness, hatred and happiness. Summary: the algorithm proposed in this paper includes SE, AR and ML-SVM, which give better accuracy than the other algorithms used earlier for emotion recognition.

[21] (2015). Stimulus: pictures (IAPS). Features and extraction methods: Hjorth parameters. Filter: bandpass filtering. Classifier: SVM, k-NN. Emotions: happy, sad, calm, neutral and scared. Summary: higher accuracy of k-NN over SVM for emotion recognition.

[22] (2015). Stimulus: GAPED, music. Features and extraction methods: delta, theta, alpha, beta, gamma, WT. Filter: notch filter 50 Hz, low-pass filter 0.3 Hz. Classifier: SVM. Emotions: happy, sad, fear and pleasure. Summary: the best modality for classifying valence and arousal is EEG with decision-level fusion.

[39] (2015). Stimulus: DEAP. Features and extraction methods: theta, alpha and beta, statistical, linear and nonlinear features. Filter: N/A. Classifier: C4.5 decision tree. Emotions: valence and arousal. Summary: within EEG features, gender-specific correlations are analyzed by an EEG-based emotion assessment system; it also analyzes the valence and arousal dimensions to increase human–computer interaction performance.

[11] (2014). Stimulus: video. Features and extraction methods: delta, theta, alpha, beta, gamma, PSD, WT, NDA. Filter: manual artifact rejection, LDS. Classifier: SVM linear. Emotions: positive and negative. Summary: in emotional activities, high-frequency bands play a more important part than low-frequency bands; the LDS filtering method can also help in improving the classification accuracy.

[40] (2014). Stimulus: video. Features and extraction methods: delta, theta, alpha, beta, gamma, PSD. Filter: high-pass filter 0.1 Hz, low-pass filter 100 Hz, notch filter. Classifier: LDA. Emotions: positive and negative. Summary: the most predictive feature for emotion recognition falls in the gamma band.

[41] (2014). Stimulus: video. Features and extraction methods: theta, alpha, beta, gamma, PSD, WT. Filter: low-pass filter 50 Hz, bandpass frequency filter. Classifier: QDA. Emotions: positive, neutral and negative. Summary: EEG-based functional connectivity helps in finding the link between emotional states and activities in the brain.

[42] (2014). Stimulus: DEAP. Features and extraction methods: beta, SE. Filter: bandpass frequency filter 4–45 Hz. Classifier: SVM. Emotions: valence and arousal. Summary: informative channels about the emotions are mainly in the pre-frontal region of the brain.

[43] (2014). Stimulus: DEAP. Features and extraction methods: AR. Filter: bandpass frequency filter 4–45 Hz. Classifier: k-NN. Emotions: valence and arousal. Summary: AR coefficients are more efficient in distinguishing between the valence and arousal dimensions.

[44] (2013). Stimulus: pictures (IAPS). Features and extraction methods: statistical. Filter: ICA. Classifier: k-NN, SVM. Emotions: neutral, positive/negative arousal. Summary: it is difficult to get good accuracy over a large dataset with any classifier.

Year

2013

2013

2013

Citation

[45]

[46]

[47]

GAPED

DEAP

Stanford emotional clips

Stimulus

Table 1 (continued)

Alpha, beta, theta, delta, gamma, WT

Alpha, beta, theta, gamma, delta, statistical, HFD

Alpha, gamma, beta, FFT

Features and extraction methods

BSS

Bandpass frequency filter 4–45 Hz

Bandpass frequency filter 0.05–60 Hz

Filter

SVM

SVM-polynomial

k-NN

Classifier

Positive and negative

Valence, arousal and dominance

Disgust, fear, happy, neutral and surprise

Emotions

(continued)

In comparison to other areas of brain, better emotions are recognized by frontal pair of channels, and low-frequency bands do not give good results than high-frequency bands

Proposed algorithm including HOC, FD and SVM predict eight emotions accurately

It aims in analyses of emotions with short-time EEG signals containing linear and nonlinear features. k-NN performs well than the other classifiers

Summary

Affect Recognition using Brain Signals: A Survey 543

Year

2013

2012

2012

2012

2012

Citation

[48]

[49]

[50]

[24]

[51]

Music

IADS

Video

Pictures (IAPS)

Video

Stimulus

Table 1 (continued)

Beta, gamma, time–frequency

Statistical, HFD

Delta, alpha, beta, FFT

Alpha, beta, statistical, HOC, DWT

Delta, theta, alpha, beta, gamma, DE DASM, RASM

Features and extraction methods

Bandpass frequency filter 0.16–85 Hz, notch filters 50, 60 Hz

CAR, bandpass frequency filter 4–45 Hz, 0.2–45 Hz (own dataset), notch filters 50, 60 Hz

Bandpass frequency filter 0.2–45 Hz, notch filters 50 Hz, 60 Hz

Bandpass frequency filter 8–30 Hz

Manually removed artifacts

Filter

k-NN

SVM

AdaBoost.M1

k-NN

SVM

Classifier

Like and dislike

Valence, arousal and dominance

Amusement, fear and neutral

Calm, negative/positive excited

Positive and negative

Emotions

(continued)

Using time–frequency domain features, liking and disliking of music is detected

The algorithm containing FD feature is suitable for valence detection in real-time applications

AdaBoost.M1 and theta band provide high emotion recognition rates

Various feature extraction and classification techniques has been examined on small dataset

For emotion recognition, DE feature is more suitable than the traditional features. Gamma related more to the emotional states then the other frequency bands

Summary

544 R. Arya et al.

2012

[52]

DEAP

Stimulus

Filter

Statistical, PSD, HOC CAR, bandpass frequency filter 4–45 Hz

Features and extraction methods k-NN

Classifier Stress and calm

Emotions

PSD gives more accuracy than HOC in detecting stress and calm state

Summary

Classifier: HMM Hidden Markov model, LDA Linear discriminant analysis, k-NN k-nearest neighbor, DBN Deep belief network, SVM Support vector machine, MLP-BP Multi-layer perceptron back propagation, ML-SVM Multi-class support vector machine, RBF Radial basis function, QDA Quadratic discriminant analysis.
Filter: S-Golay Savitzky–Golay, ICA Independent component analysis, MA Moving average, AMR Average mean reference, BSS Blind source separation, CAR Common average reference, SWT Stationary wavelet transform, LDS Linear dynamical system.
Feature extraction: AR Auto-regressive, CC Cross-correlation, DWT Discrete wavelet transform, PSD Power spectral density, SP Signal power, CSP Common spatial pattern, ERP Event-related potential, DFT Discrete Fourier transform, ERSP Event-related spectral perturbation, FFT Fast Fourier transform, GA Genetic algorithm, HOC High-order crossing, HFD Higuchi fractal dimension, HHS Hilbert–Huang spectrum, PSI Phase slope index, DTF Directed transfer function, GPDS Generalized partial-directed coherence, MPC Mean phase coherence index, SE Sample entropy, STFT Short-time Fourier transform, WT Wavelet transform, DASM Differential asymmetry, RASM Rational asymmetry, DE Differential entropy.


3.1 Subjects and Stimulus

The participant in the experiment whose physiological data is acquired for the emotion analysis is known as the subject, and the stimulus is responsible for eliciting an emotion in that subject. Generally, two approaches are used for emotion elicitation: subject elicitation and event elicitation. In subject elicitation, participants are asked to remember some past emotional scenes from their life, which helps in eliciting an emotion in them. In event elicitation, different modalities such as videos, audio, pictures, etc., are used to stimulate an emotion in the participants.

3.2 EEG Electrode Placement

EEG reads brain signals through electrodes placed over the head. For proper placement, the International 10/20 standard system is used as a reference, as represented in Fig. 3 [54]. This system relies on the relationship between the location of an electrode and the underlying area of the cortex in the human brain. The numbers 10 and 20 in the International System denote the gap between adjoining electrodes (10% or 20% of the total distance of the skull from right to left or from front to back). As mentioned earlier, the brain consists of four lobes, and it also has two hemispheres, left and right. A letter is used to represent the lobe over which an electrode is placed for reading the signal: F stands for Frontal, P for Parietal, O for Occipital, C represents Central and T stands for Temporal. In reality, no central lobe exists in the brain; the label is used only for identifying the other lobes and locations. The letter z (zero) combined with a lobe letter represents the placement of an electrode on the midline, and a number describes the hemisphere of the brain: electrodes placed on the right hemisphere are denoted by even numbers, and odd numbers denote the electrodes of the left hemisphere, as represented in Fig. 3. For correct electrode placement, four anatomical landmarks are used as references: the nasion (the point between the forehead and nose), the inion (the lowest point of the skull at the back of the head) and the two pre-auricular points anterior to the ears.

Fig. 3 International 10/20 system [54]
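For illustration only (this sketch is not part of the survey), the naming convention just described can be decoded programmatically; the electrode labels below are common 10/20 positions chosen purely as examples.

```python
# Minimal sketch: decode International 10/20 electrode labels (e.g. "Fz", "C3", "P4")
# into region, hemisphere and midline information, following the convention above.
REGIONS = {"Fp": "Pre-frontal", "F": "Frontal", "C": "Central",
           "T": "Temporal", "P": "Parietal", "O": "Occipital"}

def describe_electrode(label: str) -> str:
    # Letter part identifies the region; the suffix is either a number or "z".
    letters = "".join(ch for ch in label if ch.isalpha() and ch != "z")
    suffix = label[len(letters):]
    region = REGIONS.get(letters, "Unknown region")
    if suffix.lower() == "z":
        side = "midline"
    elif suffix.isdigit():
        side = "left hemisphere" if int(suffix) % 2 == 1 else "right hemisphere"
    else:
        side = "unspecified"
    return f"{label}: {region}, {side}"

if __name__ == "__main__":
    for name in ["Fz", "C3", "P4", "O1", "T8", "Fp2"]:
        print(describe_electrode(name))
```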

3.3 Signal Preprocessing

An EEG device acquires signals directly from the brain. These signals contain a lot of noise and various artifacts which are not generated by the brain itself. The noise is caused by eye movements, muscular activity or cardiac activity picked up by the EEG device. Such contamination degrades the quality of the recorded data, so various denoising methods are applied to remove it. Similarly, various artifact removal techniques are used, which are considered important for the emotion recognition procedure. Signal preprocessing is implemented by adopting two approaches. One is to reduce the noise manually by giving prior instructions to the subjects about their postures and movements, and the other is to use artifact removal techniques such as blind source separation, independent component analysis, notch filtering, common average reference, Laplacian referencing, average mean reference, etc., implemented in MATLAB [55].
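As an illustrative sketch only (the studies surveyed typically perform such filtering in MATLAB or dedicated toolboxes), the band-pass and notch filtering step described above might look as follows with SciPy; the 0.5–45 Hz band, the 50 Hz notch and the 256 Hz sampling rate are assumed example values, not taken from any particular study.

```python
# Minimal sketch of band-pass plus power-line notch filtering of a raw EEG channel.
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

def preprocess_eeg(signal, fs=256.0, band=(0.5, 45.0), notch_hz=50.0):
    # 4th-order Butterworth band-pass to keep the EEG rhythms of interest.
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, signal)
    # Notch filter to suppress power-line interference.
    bn, an = iirnotch(notch_hz, Q=30.0, fs=fs)
    return filtfilt(bn, an, filtered)

if __name__ == "__main__":
    fs = 256.0
    t = np.arange(0, 4, 1 / fs)
    # Synthetic channel: 10 Hz alpha-like rhythm plus 50 Hz line noise and slow drift.
    raw = np.sin(2 * np.pi * 10 * t) + 0.8 * np.sin(2 * np.pi * 50 * t) + 0.5 * t
    clean = preprocess_eeg(raw, fs=fs)
    print(clean.shape)
```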

3.4 Feature Extraction

In this phase, important features are selected and extracted which capture the relation between the EEG data and the corresponding emotional states [11]. Generally, EEG features are categorized into three domains: time, frequency and time–frequency. Firstly, time domain features describe the synchronization of the EEG signals, giving an idea of the similarity between them, and multiple electrodes are used to measure them. Features related to amplitude, such as energy, power, mean, variability and regularity tested with variance and its variants, lie in the time domain, which also includes ERP and Hjorth features [56]. Secondly, frequency domain features include power features from the different frequency bands, namely delta, theta, alpha, beta and gamma [10]. Lastly, the time–frequency domain comprises features belonging to both the time and frequency domains (Fig. 4).


Fig. 4 Steps in EEG-based emotion recognition process [2]
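As a minimal illustration of the frequency-domain features mentioned above, the sketch below estimates the average band power in the delta, theta, alpha, beta and gamma bands from a Welch PSD; the band boundaries used are the commonly adopted ones and are assumptions for this example rather than values from the survey.

```python
# Minimal sketch: band-power feature extraction from one EEG channel via Welch's PSD.
import numpy as np
from scipy.signal import welch

BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def band_powers(signal, fs=256.0):
    freqs, psd = welch(signal, fs=fs, nperseg=int(2 * fs))
    features = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        # Integrate the PSD over the band to obtain the band power.
        features[name] = np.trapz(psd[mask], freqs[mask])
    return features

if __name__ == "__main__":
    fs = 256.0
    t = np.arange(0, 8, 1 / fs)
    alpha_like = np.sin(2 * np.pi * 10 * t) + 0.1 * np.random.randn(t.size)
    print(band_powers(alpha_like, fs=fs))  # alpha power should dominate
```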

3.5 Classification

In the affect recognition process, the last step is classification. A large number of classifiers are available for categorizing the emotions into multiple dimensions; the most commonly used are Bayesian classifiers, support vector machines (SVM) and decision trees. Based on the review conducted, the majority of the authors use SVM with different kernels such as radial basis function, linear, polynomial and Gaussian. Variations such as adaptive SVM, multi-class SVM and least squares SVM are also used. The k-nearest neighbor (k-NN) classifier is also selected in many of the works, with k varying from 2 to 8. Some authors also use linear discriminant analysis, quadratic discriminant analysis, naive Bayes and multi-layer perceptron back propagation for emotion classification [57].
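A minimal, illustrative sketch of this classification step is given below: an RBF-kernel SVM trained on per-trial feature vectors. The synthetic data, the 32-feature dimensionality and the binary valence labels are assumptions made only for the example, not a reproduction of any reviewed study.

```python
# Minimal sketch: RBF-kernel SVM on per-trial EEG feature vectors (synthetic data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_trials, n_features = 200, 32               # e.g. 32 channels x 1 band-power feature
X = rng.normal(size=(n_trials, n_features))
y = (X[:, :5].mean(axis=1) > 0).astype(int)  # stand-in for high/low valence labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```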

4 Conclusion and Future Challenges

This paper focuses on affect recognition in human beings based on physiological signals. The field of affective computing has attracted great interest over the last few years. It has extended its branches into multiple domains that help in human well-being, safety, health and growth. Earlier, this field was limited only to the analysis of emotions and feelings; now, the focus has shifted to real-time applications. For instance, neuroscience, behavior prediction, analysis of thought patterns, education, defense, mental health monitoring and e-learning are some of the areas in which researchers are working. However, there are still multiple challenges in the physiological signal-based affect recognition procedure.

Firstly, a neural correlation exists between human brain regions and emotions. Although this correlation is supported by EEG signals [6], it can be determined more accurately by combining EEG with another modality such as ECG or EMG. As each modality exhibits a different approach and each emotion has a different effect on the human body, there is a need for a multi-modal approach to correlation analysis.

Secondly, in the stress recognition procedure, Gaussian SVM is used to classify stress levels into two classes, i.e., low and high. Due to its static nature, when users in real-time applications face the same stressful situation but respond differently, the performance of the stress recognition procedure diminishes [27]. Hence, for better performance, there is a need to acquire more accurate values of stress levels by introducing additional classes using other algorithms such as k-NN and GA.

Thirdly, correlation analysis is performed within EEG features to differentiate emotions (i.e., positive and negative) in cross-subject emotion recognition [25]. However, the results are not that accurate, and additional features need to be explored to enhance the accuracy of distinguishing positive from negative emotions. Another potential line of research is the verification of the efficacy of EEG features for studying other cognitive processes.

Lastly, EEG signals help in predicting a user's preferences for a product [29], which can assist in forecasting the future success of the product in the market. Yet, analyzing the fake responses of a user is still challenging, so approaches are required to deal with this problem. Moreover, while users watch products, their eye movements can be tracked and combined as another modality for predicting preferred choices. Combinations of more robust features and classifiers can be explored to enhance the prediction results.

References

1. Damasio (1995) Descartes' error: emotion, reason, and the human brain. Harper Perennial, G.P. Putnam's Sons
2. Alarcao S, Fonseca M (2017) Emotions recognition using EEG signals: a survey. IEEE Trans Affect Comput 10:374–393. https://doi.org/10.1109/TAFFC.2017.2714671
3. Katsigiannis S, Ramzan N (2018) DREAMER: a database for emotion recognition through EEG and ECG signals from wireless low-cost off-the-shelf devices. IEEE J Biomed Health Inf 22:98–107. https://doi.org/10.1109/JBHI.2017.2688239
4. Picard R (1995) Affective computing. MIT Media Laboratory, Perceptual Computing Section. Technical Report 321
5. Nijboer F, Morin FO, Carmien SP, Koene RA, Leon E, Hoffmann U (2009) Affective brain-computer interfaces: psychophysiological markers of emotion in healthy persons and in persons with amyotrophic lateral sclerosis. In: 2009 3rd international conference on affective computing and intelligent interaction and workshops. IEEE Xplore, Amsterdam, pp 1–11. https://doi.org/10.1109/ACII.2009.5349479
6. Soroush Z, Maghooli M, Setarehdan SK, Nasrabadi AM (2018) A novel approach to emotion recognition using local subset feature selection and modified Dempster-Shafer theory. Behav Brain Funct 14. https://doi.org/10.1186/s12993-018-0149-4
7. Shu L, Xie J, Yang M, Li Z, Li Z, Liao D, Xu X, Yang X (2018) A review of emotion recognition using physiological signals. Sensors 18. https://doi.org/10.3390/s18072074
8. Ali M, Mosa AH, Machot F, Kyamakya K (2016) EEG-based emotion recognition approach for e-healthcare applications. In: 8th international conference on ubiquitous and future networks, pp 946–950. IEEE Xplore, Vienna. https://doi.org/10.1109/ICUFN.2016.7536936
9. Koelstra S, Muhl C, Soleymani M, Lee JS, Yazdani A, Ebrahimi T, Pun T, Nijholt A, Patras I (2012) DEAP: a database for emotion analysis using physiological signals. IEEE Trans Affect Comput 3:18–31. https://doi.org/10.1109/T-AFFC.2011.15
10. Kim BH, Jo S (2018) Deep physiological affect network for the recognition of human emotions. IEEE Trans Affect Comput. https://doi.org/10.1109/TAFFC.2018.2790939
11. Wang XW, Nie D, Lu BL (2014) Emotional state classification from EEG data using machine learning approach. Neurocomputing 129:94–106. https://doi.org/10.1016/j.neucom.2013.06.046
12. Nicolaou MA, Gunes H, Pantic M (2011) Continuous prediction of spontaneous affect from multiple cues and modalities in valence-arousal space. IEEE Trans Affect Comput 2:92–105. https://doi.org/10.1109/T-AFFC.2011.9
13. Plutchik R (2001) The nature of emotions: human emotions have deep evolutionary roots, a fact that may explain their complexity and provide tools for clinical practice. Am Sci 89:344–350
14. Basu S, Jana N, Bag A, Mahadevappa M, Mukherjee J, Kumar S, Guha R (2015) Emotion recognition based on physiological signals using valence-arousal model. In: 3rd international conference on image information processing, pp 50–55. IEEE Xplore, Waknaghat. https://doi.org/10.1109/ICIIP.2015.7414739
15. Russell JA, Barrett LF (1999) Core affect, prototypical emotional episodes, and other things called emotion: dissecting the elephant. J Pers Soc Psychol 17:715–734. https://doi.org/10.1037//0022-3514.76.5.805
16. Alotaiby T, Abd El-Samie FA, Alshebeili SA, Ahmad I (2015) A review of channel selection algorithms for EEG signal processing. J Adv Signal Process. https://doi.org/10.1186/s13634-015-0251-9
17. Ahirwal MK, Londhe N (2012) Power spectrum analysis of EEG signals for estimating visual attention. Int J Comput Appl 42:34–40. https://doi.org/10.5120/5769-7993
18. Anwar D, Garg P, Naik V, Gupta V, Kumar A (2018) Use of portable EEG sensors to detect meditation. In: 10th international conference on communication systems & networks, pp 705–710, Bengaluru. https://doi.org/10.1109/COMSNETS.2018.8328299
19. Khosrowabadi R (2018) Stress and perception of emotional stimuli: long-term stress rewiring the brain. Basic Clin Neurosci 9:107–120. https://doi.org/10.29252/NIRP.BCN.9.2.107
20. Su J, Duan D, Zhang X, Lei H, Wang C, Guo H, Yan X (2017) The effect of negative emotion on multiple object tracking task: an ERP study. Neurosci Lett 641:15–20. https://doi.org/10.1016/j.neulet.2017.01.038
21. Mehmood RM, Lee HJ (2015) Emotion classification of EEG brain signal using SVM and k-NN. In: 2015 IEEE international conference on multimedia & expo workshops (ICMEW), pp 1–5. IEEE Xplore, Turin. https://doi.org/10.1109/ICMEW.2015.7169786
22. Jatupaiboon N, Ngum SP, Israsena P (2015) Subject-dependent and subject-independent emotion classification using unimodal and multimodal physiological signals. J Med Imaging Health Inf 5:1020–1027. https://doi.org/10.1166/jmihi.2015.1490
23. Lan Z, Olga S, Wang L, Liu Y (2016) Real-time EEG-based emotion monitoring using stable features. Int J Comput Graph 32:347–358. https://doi.org/10.1007/s00371-015-1183-y
24. Liu Y, Sourina O (2012) EEG-based valence level recognition for real-time applications. In: International conference on cyberworlds, Darmstadt, pp 53–60. https://doi.org/10.1109/CW.2012.15


25. Xiang L, Dawei S, Peng Z, Yazhou Z, Yuexian H, Bin H (2018) Exploring EEG features in cross-subject emotion recognition. Front NeuroSci 12:162–177. https://doi.org/10.3389/fnins. 2018.00162 26. Shon D, Im K, Park JH, Lim DS, Jang B, Kim JM (2018) Emotional stress state detection using genetic algorithm-based feature selection on EEG signals. Int J Environ Res Publ Health 15. https://doi.org/10.3390/ijerph15112461 27. Jebelli H, Hwang S, Lee S (2018) EEG-based workers stress recognition at construction sites. Autom Constr 93:315–324. https://doi.org/10.1016/j.autcon.2018.05.027 28. Greco A, Valenza G, Scilingo EP (2018) Brain dynamics during arousal-dependent pleasant/unpleasant visual elicitation: an electroencephalographic study on the circumplex model of affect. IEEE Trans Affect Comput . https://doi.org/10.1109/TAFFC.2018.2879343 29. Yadava M, Kumar P, Saini R, Roy PP, Dogra DP (2017) Analysis of EEG signals and its application to neuromarketing. Multimedia Tools Appl 76(18):19087–19111. https://doi.org/ 10.1007/s11042-017-4580-6 30. Wei Y, Wu Y, Tudor J (2017) A real-time wearable emotion detection headband based on EEG measurement. Sens Actuat A 263:614–621. https://doi.org/10.1016/j.sna.2017.07.012 31. Mohammadi Z, Frounchi J, Amiri M (2017) Wavelet-based emotion recognition system using EEG signal. Neural Comput Appl 28:1985–1990. https://doi.org/10.1007/s00521-015-2149-8 32. Bocharov AV, Knyazev GG, Savostyanov AN (2017) Depression and implicit emotion processing: an EEG study. Neurophysiol Clin 47:225–230. https://doi.org/10.1016/j.neucli. 2017.01.009 33. Thammasan N, Fukui K, Numao M (2016) Application of deep belief networks in eeg-based dynamic music-emotion recognition. In: International joint conference on neural networks, pp 881–888. IEEE Xplore, Vancouver. https://doi.org/10.1109/IJCNN.2016.7727292 34. Shahabi H, Moghimi S (2016) Toward automatic detection of brain responses to emotional music through analysis of EEG effective connectivity. Comput Hum Behav 58:231–239. https:// doi.org/10.1016/j.chb.2016.01.005 35. Jalilifard A, Pizzolato EB, Islam MK (2016) Emotion classification using single-channel scalpEEG recording. In: 38th annual international conference of the IEEE engineering in medicine and biology society, pp 845–849. IEEE Xplore, Orlando. https://doi.org/10.1109/EMBC.2016. 7590833 36. Bhatti AM, Majid M, Anwar SM, Khan B (2015) Human emotion recognition and analysis in response to audio music using brain signals. Comput Hum Behav 65:267–275. https://doi.org/ 10.1016/j.chb.2016.08.029 37. Ackermann P, Kohlschein C, Bitsch JÁ, Wehrle K, Jeschke S (2016) EEG-based automatic emotion recognition: Feature extraction, selection and classification methods. In: IEEE 18th international conference on e-health networking. applications and services (Healthcom), pp 1–6. IEEE Xplore, Munich, Germany. https://doi.org/10.1109/HealthCom.2016.7749447 38. Vijayan AE, Sen D, Sudheer AP (2015) EEG-based emotion recognition using statistical measures and auto-regressive modeling. In: IEEE international conference on computational intelligence & communication technology, pp 587–591. IEEE Xplore, Ghaziabad. https://doi. org/10.1109/CICT.2015.24 39. Chen J, Hu B, Moore P, Zhang X, Ma X (2015) Electroencephalogram-based emotion assessment system using ontology and data mining techniques. Appl Soft Comput 30:663–674. https://doi.org/10.1016/j.asoc.2015.01.007 40. Stikic M, Johnson RR, Tan V, Berka C (2014) EEG-based classification of positive and negative affective state. 
Brain-Comput Interfaces 1:99–112. https://doi.org/10.1080/2326263X.2014. 912883 41. Lee YY, Hsieh S (2014) Classifying different emotional states by means of EEG-based functional connectivity patterns. Public Libr Serv (PLoS) ONE 9. https://doi.org/10.1371/journal. pone.0095415 42. Jie X, Rio C, Li L (2014) Emotion recognition based on the sample entropy of EEG. Bio-Med Mater Eng 24:1185–1192. https://doi.org/10.3233/BME-130919


43. Hatamikia S, Maghooli K, Nasrabadi AM (2014) The emotion recognition system based on autoregressive model and sequential forward feature selection of electroencephalogram signals. J Med Signals Sens 4:194–201. https://doi.org/10.4103/2228-7477.137777 44. Sohaib AT, Qureshi S, Hagelback J, Hilborn O, Jericic P (2013) Evaluating classifiers for emotion recognition using EEG. Foundations of Augmented Cognition. Lecture notes in computer science, vol 8027. pp 492–501. https://doi.org/10.1007/978-3-642-39454-6_53 45. Murugappan M, Murugappan S (2013) Human emotion recognition through short time Electroencephalogram (EEG) signals using fast Fourier transform (FFT). In: IEEE 9th international colloquium on signal processing and its applications, pp 289–294. Kuala Lumpur. https://doi. org/10.1109/CSPA.2013.6530058 46. Liu Y, Sourina O (2014) Real-time subject-dependent eeg-based emotion recognition algorithm. In: Gavrilova ML, Tan CJK, Mao X, Hong L (eds) Transactions on computational science XXIII. LNCS, vol 849, pp 199–223. Springer, Berlin. https://doi.org/10.1007/978-3662-43790-2_11 47. Jatupaiboon N, Panngum S, Israsena P (2013) Emotion classification using minimal EEG channels and frequency bands. In: 10th international joint conference on computer science and software engineering, pp 21–24. IEEE Xplore, Maha Sarakham. https://doi.org/10.1109/ JCSSE.2013.6567313 48. Duan R, Zhu J, Lu B (2013) Differential entropy feature for EEG-based emotion classification. In: 6th international IEEE/EMBS conference on neural engineering. IEEE Xplore, San Diego, CA, pp 81–84. https://doi.org/10.1109/NER.2013.6695876 49. Xu N, Plataniotis KN (2012) Affect recognition using EEG signal. In IEEE 14th international workshop on multimedia signal processing, pp 299–304. IEEE Xplore, Banff, AB. https://doi. org/10.1109/MMSP.2012.6343458 50. Ramirez R, Vamvakousis Z (2012) Detecting emotion from EEG signals using the emotive Epoc device. In: Zanzotto FM, Tsumoto S, Taatgen N, Yao Y (eds) Brain Informatics, vol 7670. LNCS. Springer, Heidelberg, pp 175–184 51. Hadjidimitriou SK, Hadjileontiadis LJ (2012) Toward an EEG-based recognition of music liking using time-frequency analysis. IEEE Trans Biomed Eng 59:3498–3510. https://doi.org/ 10.1109/TBME.2012.2217495 52. Bastos-Filho TF, Ferreira A, Atencio AC, Arjunan S, Kumar D (2013) Evaluation of feature extraction techniques in emotional state recognition. In: 4th international conference on intelligent human computer interaction, Kharagpur, pp 1–6. https://doi.org/10.1109/IHCI.2012.648 1860 53. AlMejrad AS (2010) Human emotions detection using brain wave signals: a challenging. Eur J Sci Res 44:640–659 54. TransCranialTechnologies (2012) 10/20 system positioning—manual. https://www.trans-cra nial.com/local/manuals/10_2_posman_v1_0_pdf.pdf 55. Zheng W, Lu B (2015) Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks. IEEE Trans Auton Ment Dev 7:162–175. https://doi.org/10.1109/TAMD.2015.2431497 56. Jenke R, Peer A, Buss M (2014) Feature Extraction and selection for emotion recognition from EEG. IEEE Trans Affect Comput 5:327–339. https://doi.org/10.1109/TAFFC.2014.2339834 57. Atkinson J, Campos D (2016) Improving BCI-based emotion recognition by combining EEG feature selection and kernel classifiers. Expert Syst Appl 47:35–41. https://doi.org/10.1016/j. eswa.2015.10.049

“Memorize, Reproduce, and Forget” Inclination; Students’ Perspectives: A Study of Selected Universities in Ghana

John Kani Amoako, Yogesh Kumar Sharma, and Paul Danquah

Abstract The “Memorize, Reproduce, and Forget” (MRF) inclination, also called rote learning, has been a silent compromising remedy that has engulfed Ghana’s tertiary educational system. It affects students’ personal and career development, national development, and employers’ expectations. The study aims to ascertain why MRF has been a compromising remedy for some students in Ghanaian universities and to determine the baseline cause among the causes identified. The study population of 82,825 students, with a sample size of 800, is drawn from four public universities. Using a questionnaire, data collected from the top three ranking universities and one technical university in Ghana (Kwame Nkrumah University of Science and Technology—KNUST, University of Ghana—UG, University of Cape Coast—UCC, and Cape Coast Technical University—CCTU) were analyzed using content analysis and SPSS-V25. The results showed that the “Bookish/Theoretical” nature of lectures is the baseline cause of the MRF inclination and that it has been overlooked because of the selfish gains of stakeholders.

Keywords “Memorize, reproduce, and forget” · Bookish · Ghana tertiary education · Rote learning

J. K. Amoako (B) · P. Danquah, Heritage Christian College, Accra, Ghana. e-mail: [email protected]
Y. K. Sharma, Shri JJT University, Jhunjhunu, Rajasthan 333001, India
© Springer Nature Singapore Pte Ltd. 2021. V. Singh et al. (eds.), Computational Methods and Data Engineering, Advances in Intelligent Systems and Computing 1257, https://doi.org/10.1007/978-981-15-7907-3_41

1 Introduction

Education is a great equalizer among individuals as well as nations [1]. The goal of every student, aside from gaining knowledge and skills, is to acquire a certificate to show the world the hard work they put into their educational journey. Unique factors or experiences can make students prioritize either of these goals ahead of the other. Perhaps the nature of the program is not what they anticipated [2], they missed academic

counseling sessions [3, 4], or they may have been frustrated by the course content design and delivery and the assessment style of their lecturers [5, 6]. Some students drop out of school temporarily or permanently, while others find mitigating or compromising means to complete the program “successfully” when things get unbearable [7]. As in every developed economy, skills and human resources have become the major drivers of economic and social well-being in the twenty-first century [8]. For the nation to develop and have a promising economy, it has to pay attention to the development of its workforce, which can be attained through higher education [9]. Meanwhile, inadequate skills and limited knowledge of technology are indications of the ill-preparedness of some Ghanaian graduates for the job market [10]. For instance, over the years, employers in Ghana have expressed concerns about tertiary education graduates not being able to avail themselves of technology and innovative approaches to solving problems [11, 12]. This is partly because of ineffective pedagogic/learning practices [13] such as memorization, repetition, recitation, copying from the board, choral response, and “chalk and talk” [14, 15]. Education influences society by bringing people from different backgrounds together. Nonetheless, this claim has been opposed by other schools of thought, which hold that education is a system that awards success to some and failure to others [15]. The persistent development of knowledgeable and skilled labor is attainable by implementing policies and strategies that bring entrepreneurs, institutions, and governments together toward a common goal of educating and training people to solve prevailing societal issues [16]. Figure 1 represents the mismatch between the expected or required knowledge and skills and the actual skills with which fresh graduates seek employment in the job market; it has three effects.

Labor market effect. This is seen when an increasing number of graduates remain unemployed or underemployed after completing various tertiary institutions in Ghana, when employers perceive them as not possessing job-relevant skills (partly due to the low capacity of firms to absorb them), and when the integration of new emerging technologies into organizations’ models of work leads to new kinds of job openings that require matching skills or a knowledge upgrade.

Productivity effect. The discord between graduates’ skills and employers’ required skills compels the latter to spend a considerable amount of resources on retraining newly recruited graduates. This leads to a high cost of production as well as low efficiency. Notwithstanding, some higher institutions have found the need to integrate labor market demands into their educational systems [17].

Development effect. The discord is reflective of low industrial growth and a low contribution to the economy’s gross domestic product. According to the Association of Ghana Industries (AGI), firms are collapsing and existing ones lack the capacity to grow. Graduates’ skills mismatch and the limited capacity of industry to absorb graduates have led to unemployment, increased social vices in the economy, and high dependency rates.


Fig. 1 Framework of the foundations and effects of tertiary education-firms skills mismatch on industrial development in Ghana. Source [8]

1.1 Organization

The introduction (Sect. 1) of this paper acknowledges the literature of other researchers showing that students’ interest and willingness to acquire enough knowledge and skills in their enrolled courses can be influenced by certain anticipated or unanticipated events they encounter in the process. When these events are unfavorable, they can lead to practices such as absenteeism and rote learning, also called “Memorize, Reproduce, and Forget” (MRF). The section then focuses on how the effects of rote learning can result in inadequate knowledge and skills, which has been the cry of some employers in the Ghanaian economy. The methodology (Sect. 2) presents the methods and techniques engaged in the selection of the population sample and in data collection, which is focused on identifying the reasons why MRF is still common among students. Next, the discussion (Sect. 3) interprets the responses from the collected data and analyzes the findings comparatively with some recent literature (Sect. 3.1). In the conclusion (Sect. 4), the baseline among the reasons for the high practice of rote learning is established, together with the side effects on the nation and on individuals. The section also contains some recommendations on how to approach the rote learning inclination in education.


Table 1 Research population made up of students of four universities in Ghana

University | Student population | Year of publication | Category
Kwame Nkrumah University of Science and Technology (KNUST) | 29,090 | 2017/2018 | Bachelor's degree
University of Ghana (UG) | 32,059 | 2015/2016 | Bachelor's degree
University of Cape Coast (UCC) | 18,949 | 2016/2017 | Regular undergraduate students
Cape Coast Technical University (CCTU) | 2,724 | 2017/2018 | Entire student body
Total | 82,822 | |

Finally, the future scope (Sect. 4.1) talks about what else can be done by other researchers concerning the outcome of this study.

2 Methodology

This research employs a qualitative research design because such a method allows the researcher to ask questions and further analyze the responses from the sample [18]. Using the Taro Yamane formula [19] and a cluster sampling technique, 800 sample units were drawn from the clusters that formed the population (Table 1). Respondents’ (students’) subjective opinions were collected as data using open-ended questionnaires [20] distributed to them. Using content analysis, similar responses from the data were carefully categorized [21, 22], assigned codes [23, 24], and fed into SPSS-V25 for statistical analysis. With reference to Table 1, the participating universities were selected based on their top rankings, higher counts of programs offered, higher student populations, and higher counts of affiliations that other tertiary institutions have with them [25, 26]. CCTU, meanwhile, was selected based on convenience, time, and money constraints. Though these top universities have relatively the largest populations of students and faculties/schools, it may not be adequate to generalize the conclusions of this study to all other tertiary institutions in Ghana.
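As an illustrative sketch of the sample-size calculation, the Taro Yamane formula is n = N / (1 + N e^2), where N is the population size and e the margin of error. The values of e used below are assumptions for demonstration only; the paper reports the resulting sample size of 800 but not the margin of error applied.

```python
# Minimal sketch of the Taro Yamane sample-size formula n = N / (1 + N * e**2).
import math

def yamane_sample_size(population: int, margin_of_error: float) -> int:
    return math.ceil(population / (1 + population * margin_of_error ** 2))

if __name__ == "__main__":
    N = 82_822  # combined student population of the four universities in Table 1
    for e in (0.05, 0.035, 0.03):  # illustrative margins of error, not from the paper
        print(f"e = {e:.3f} -> n = {yamane_sample_size(N, e)}")
```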

3 Discussion

Concerning the responses in the study, it can be said that the MRF inclination has been overlooked in the Ghana education system because stakeholders have benefiting interests: most students adopt it as an easy way to get promoted and to gain academic certificates; some lecturers encourage it because it makes it easier to set questions and mark examination papers, and they also use it to promote the sale of their self-authored course materials; and fighting the MRF inclination means the government will have to make more commitments (including tools and infrastructure to support the practical aspects of certain courses, discouraging the mandatory purchase of self-authored study materials from lecturers, the cost of sensitization about the MRF canker, the tedious nature of redressing educational policies and structures, etc.).

When compared with the arguments raised by [10, 12] that the job market is experiencing a lot of skills mismatch partly because of fresh graduates’ insufficient knowledge and skills in technology application and in their respective areas of study, this study confirms that students in tertiary institutions rely on rote learning to get promoted. Also, the claims by [13], which partly blame lecturers for unhealthy pedagogic practices leading to an increase in the abuse of the memorization and recollection technique identified with students, are confirmed by the responses of respondents in this study. Section 3.1 below details the subjective perceptions of the sampled students on rote learning.

Table 2 Levels of undergraduate students of the selected institutions

Level | Frequency | Percentage (%) | Cumulative percentage (%)
100 | 132 | 16.5 | 16.5
200 | 168 | 21.0 | 37.5
300 | 395 | 49.4 | 86.9
400 | 105 | 13.1 | 100.0
Total | 800 | 100 |

In Table 2, out of n = 800 (100%), 132 respondents (16.5%) were in level 100, 168 (21%) were in level 200, 395 (49.4%) were in level 300, and 105 (13.1%) were in level 400. Regular undergraduate students were mainly used as the sample for the study.
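For illustration only (the study itself used SPSS-V25 for the statistical analysis), the kind of frequency table reported in Table 2 can be reproduced with a few lines of pandas; the level counts below are taken directly from Table 2.

```python
# Minimal sketch, using pandas in place of SPSS, of the Table 2 frequency analysis:
# counts per level with percentage and cumulative percentage.
import pandas as pd

counts = pd.Series({100: 132, 200: 168, 300: 395, 400: 105}, name="Frequency")
table = counts.to_frame()
table["Percentage (%)"] = (table["Frequency"] / table["Frequency"].sum() * 100).round(1)
table["Cumulative percentage (%)"] = table["Percentage (%)"].cumsum().round(1)
print(table)
print("Total respondents:", int(table["Frequency"].sum()))
```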

3.1 Questionnaire

Question 1. Table 3 represents the responses to the question “Have you ever heard of ‘Memorize, Reproduce, and Forget’ (MRF)?” 98 respondents (12.3%) answered “No” (meaning they had never heard of the MRF syndrome), while 702 respondents (87.8%) answered “Yes” (meaning they already knew of the MRF syndrome).

Table 3 Have you ever heard of “Memorize, Reproduce, and Forget” (MRF)?

Option | Frequency | Percentage (%)
No | 98 | 12.3
Yes | 702 | 87.8
Total | 800 | 100

Question 2. Table 4 represents the various subjective responses (respondents’ views/opinions) received as follow-up inputs (702 in total) from the students who responded “Yes” in Table 3. An open-ended question (“What in your view is the cause of MRF in the Ghana educational system?”) was asked to help the researchers receive unrestricted, diverse answers from respondents. Column 1 (views/opinions) in Table 4 shows a total of 28 different groups containing similar opinions given by respondents as causes of the MRF syndrome. Because diverse open-ended answers were received, each row in column 1 of Table 4 represents a group of answers that were similar or the same in meaning but worded differently. To form these groups, the researchers read through each individual respondent’s opinion, grouped them, gave each group (a separate row in column 1) a general phrase, and finally assigned code numbers (column 2) to each group. Column 2 shows the assigned code numbers for each group of similar answers in column 1. The codes made it possible and easier for the researchers to enter the responses into the statistical analysis tool used in the study (SPSS version 25) and to analyze and interpret the data. In column 3 (frequency), each row gives the number of respondents who shared similar views within the respective group. For example, in Table 4, row 2, 122 respondents with similar views were grouped under “Bookish/Theoretical” with the code number “v2”. Column 4 (percentage) gives the relative percentage of each group of responses.
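As an illustrative sketch of the coding step just described (the actual tallying was done in SPSS-V25), grouped opinions can be mapped to their assigned codes and counted before analysis; only a few categories from Table 4 are shown, and the response list below is a stand-in for the real questionnaire data.

```python
# Minimal sketch of assigning codes to grouped open-ended responses and tallying them.
from collections import Counter

CODE_BOOK = {
    "Bookish/theoretical": "v2",
    "Difficulty in understanding course": "v3",
    "Just to pass and move on": "v5",
    "Bad educational system": "v17",
}

# Stand-in for the coded follow-up responses; real data came from the questionnaires.
coded_responses = ["v2"] * 122 + ["v3"] * 65 + ["v5"] * 74 + ["v17"] * 57

tally = Counter(coded_responses)
total = 800  # all respondents, including those who had not heard of MRF
for code, freq in tally.most_common():
    print(f"{code}: {freq} respondents ({freq / total * 100:.2f}%)")
```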


Table 4 What in your view is the cause of MRF in Ghana's tertiary education system?

Views/opinions | Assigned code (views/opinions) | Frequency of respondents | Percent (%)
Not applicable | N/A | 98 | 12.3
Irrelevant view | v0 | 43 | 5.4
Bookish/theoretical | v2 | 122 | 15.3
Difficulty in understanding course | v3 | 65 | 8.1
No reason | v4 | 44 | 5.5
Just to pass and move on | v5 | 74 | 9.3
Not enough time to study | v6 | 30 | 3.8
Easiest way to pass | v7 | 57 | 7.1
Not adequately prepared | v8 | 14 | 1.8
Subject is complex | v9 | 6 | 0.8
Inadequate practical facilities | v10 | 21 | 2.6
High semester course load | v11 | 16 | 2.0
Just a one-time course | v12 | 6 | 0.8
Lecturers unable to complete syllabus | v13 | 4 | 0.5
Not attending lectures | v14 | 2 | 0.3
Demands/encouragement from lecturers | v15 | 27 | 3.4
Lazy students | v16 | 10 | 1.3
Bad educational system—Ghana | v17 | 57 | 7.1
Essence of course not known | v18 | 1 | 0.1
Procrastination by students | v19 | 12 | 1.5
Boring course/lecture | v20 | 3 | 0.4
Irrelevant courses | v21 | 15 | 1.9
Bad student preparations | v22 | 3 | 0.4
To gain first class degree | v23 | 8 | 1.0
General academic pressure/stress | v24 | 9 | 1.1
Exams questions not challenging | v25 | 2 | 0.3
A norm/culture | v26 | 10 | 1.3
Loaded syllabus | v27 | 34 | 4.3
Encouragement by colleagues | v28 | 7 | 0.9
Total | | 800 | 100.0

Fig. 2 Bar chart for columns 2 and 4 in Table 4

“No reason” (v4) had 44 respondents (5.5%); these students had no reason to support the MRF method employed in their academics. “Just to pass and move on” (v5) had 74 respondents (9.3%); respondents here use the MRF style to get promoted to the next level. “Not enough time to study” (v6) had 30 respondents (3.8%); here, they blamed it on the limited time left for them to study for exams. “Easiest way to pass” (v7) had 57 respondents (7.1%); students here use the MRF style to avoid failure in exams. “Not adequately prepared” (v8) 14 respondents (1.8%); they were of the view that MRF was the last-minute remedy to get them ready for exams. “Subject is complex” (v9) had 6 respondents (0.8%); they see certain subjects/topics as very complex irrespective of how good lecturers explained the contents to them, hence the need of MRF to create an impression of the complex subject being simple.


“Inadequate practical facilities” (v10) had 21 respondents (2.6%); students here blame the institutions for inadequate practical tools, technology, and infrastructure to support the theoretical nature of courses.

“High course load in a semester” (v11) had 16 respondents (2.0%); this group was concerned about the high number of courses they had to take in a single semester.

“Just a one-time course” (v12) had 6 respondents (0.8%); they claimed that certain courses have no relation to future courses in their next levels, and therefore it was appropriate to employ MRF.

“Lecturers unable to complete syllabus” (v13) had 4 respondents (0.5%); because some lecturers are not able to complete their syllabus for the semester, students in this group are forced to adopt MRF to compensate for the topics not treated in the syllabus.

“Not attending lectures” (v14) had 2 respondents (0.3%); they were of the view that MRF is one of the best remedies used by students who regularly skip lectures to avoid failure.

“Demands/encouragement from lecturers” (v15) had 27 respondents (3.4%); respondents here complained that they are forced to buy some lecturers’ self-authored study materials and are made to reproduce the books’ or study materials’ content word for word during examinations.

“Lazy students” (v16) had 10 respondents (1.3%); students who are not serious with, and are lazy in, their academics adopt MRF.

“Bad educational system—Ghana” (v17) had 57 respondents (7.1%); this category complained about the structures and policies of the Ghana Education Service, which are mainly examination-based assessment with much focus on passing the examination and getting certification, even at the expense of the knowledge and skills to be gained in the process.

“Essence of course not known” (v18) had 1 respondent (0.1%); here, the student is not aware of the essence of the topic or course being studied.

“Procrastination by students” (v19) had 12 respondents (1.5%); students who realize they have no option because of their procrastination toward studies use MRF as a remedy.

“Boring course/lectures” (v20) had 3 respondents (0.4%); claiming that some courses or lecturers are boring in their delivery, these students use MRF so as not to waste time comprehending the course concepts.

“Irrelevant courses/subjects” (v21) had 15 respondents (1.9%); they felt certain courses are not essential to their field of study.

“Bad student preparations” (v22) had 3 respondents (0.4%); students who realize in the last minutes before exams that their prior preparation had gone astray quickly adopt MRF to put themselves at par with those who are well prepared.

“To gain a first-class degree” (v23) had 8 respondents (1.0%); they were of the view that MRF is one of the simplest methods used to gain a first-class certificate.

“General academic pressure/stress” (v24) had 9 respondents (1.1%); claiming that there is a lot of pressure in academic activities from start to finish, these respondents depend on MRF to sail through successfully.


“Exams questions not challenging” (v25) had 2 respondents (0.3%); in this category, students mention that examination questions are not demanding enough to test the knowledge they have gained and how they can apply the concepts of the course in the real world.

“A norm/culture” (v26) had 10 respondents (1.3%); these students claimed that it has been an existing culture or norm in the academic system for a long time, and they are therefore also passing through it.

“Loaded syllabus” (v27) had 34 respondents (4.3%); they blamed it on lecturers loading a single course with too many objectives and too much content.

“Encouragement by colleagues” (v28) had 7 respondents (0.9%); these students agree that the MRF style of gaining certificates or passing exams is highly recommended and commended by their own colleagues.

4 Conclusion

It was argued in the introductory part of this article that tertiary students in Ghana are faced with challenges that include poor design and delivery of course contents, unexpected demands from curricula and extra-curricular activities, etc. As a result, some students find convenient means to pass their examinations or tests even if they have to compromise on the adequacy of the skills and knowledge gained. These poor choices have led to employers’ complaints about low levels of expertise among graduates. This study, therefore, enquires into rote learning (the “Memorize, Reproduce, and Forget” inclination), which is a common practice among students, to learn more about its causes and its effects on educational stakeholders. Based on the study outcome, it is evident that the “Bookish/Theoretical” (v2) nature of teaching and learning is the baseline cause of the MRF inclination in tertiary institutions in Ghana. With 122 respondents (15.3%) out of 800, this is alarming; hence the need for it to be addressed by stakeholders. Notwithstanding, all the other views expressed by the tertiary students in this survey equally need attention. The side effect of the MRF inclination in Ghana is that tertiary institutions will continue to produce graduates with inadequate knowledge and skills to match employers’ demand for graduates with adequate knowledge and skills in their field of study. When graduates prioritize passing exams and gaining certificates over gaining the needed knowledge and skills through their academic journey, national development and personal development are thwarted. This study recommends that: (1) stakeholders in the Ghana education system redress this MRF inclination with urgency; (2) further research into how well technology can be integrated into pedagogy should be done in tertiary institutions and at other levels of education to help combat the causes of the “Memorize, Reproduce, and Forget” inclination in Ghana.


4.1 Future Scope

This study focused on finding out the reasons behind students’ tendency to engage in rote learning. The outcome of this study can inform other researchers exploring educational technological aids (computer software and hardware) and how they can be used to mitigate the reasons mentioned. Also, to help address the complaints of employers (as mentioned in the introduction), it provides partial insight for educators and researchers to collaborate with industry players on how to adopt and implement some of the industry's computer applications and other business processes in certain curricula.

References 1. Thomas MK (2015) Education: a more powerful weapon than war? 2. Igbokwe UL, Onyechi KC, Ogbonna CS, Eseadi C, Onwuegbuchulam AC, Nwajiuba CA, Ugodulunwa CC, Eze A, Omaeze K, Patrick CP, Ekechukwu LE (2019) Rational emotive intervention for stress management among english education undergraduates: implications for school curriculum innovation. Medicine 98(40) 3. Strepparava MG, Bani M, Zorzi F, Corrias D, Dolce R, Rezzonico G (2016) Cognitive counselling intervention: treatment effectiveness in an Italian University Centre. Br J Guidance Counsel 44(4):423–33 4. Quinn N, Wilson A, MacIntyre G, Tinklin T (2009) ‘People look at you differently’: students’ experience of mental health support within Higher Education. Br J Guidance Counsel 37(4):405–18 5. Bishop J, Wellinski S, Mason L (2011) Bad pedagogy: critical justice in 4/4 Time1 6. Hrepic Z, Zollman D, Rebello S (2004) Students’ understanding and perceptions of the content of a lecture. In: AIP conference proceedings, vol 720, No. 1, pp 189–192. American Institute of Physics 7. Sabates R, Westbrook J, Akyeampong K, Hunt F (2010) School drop out: patterns, causes, changes and policies 8. Bawakyillenuo S, Akoto IO, Ahiadeke C, Aryeetey EBD, Agbe EK (2013) Tertiary education and industrial development in Ghana. Policy Brief 33012 9. Dill DD, Van Vught FA (2010) National innovation and the academic research enterprise: public policy in global perspective. Johns Hopkins University Press, 2715 North Charles Street, Baltimore, MD 21218 10. Palmer R (2007) Skills for work? From skills development to decent livelihoods in Ghana’s rural informal economy. Int J Educ Dev 27(4):397–420 11. Dorvlo SS, Dadzie PS (2016) Information literacy among post graduate students of the University of Ghana. Libr Philos Pract 12. Dasmani A (2011) Challenges facing technical institute graduates in practical skills acquisition in the Upper East Region of Ghana. Int J Work-Integr Learn 12(2):67 13. Hardman F, Abd-Kadir J, Tibuhinda A (2012) Reforming teacher education in Tanzania. Int J Educ Dev 32(6):826–834 14. Lewin KM, Stuart JS (2003) Researching teacher education: new perspectives on practice, performance and policy 15. Meyer JW (1977) The effects of education as an institution. Am J Sociol 83(1):55–77 16. A Skilled Workforce for Strong, Sustainable and Balanced Growth: A G20 Training Strategy International Labour Office—Geneva (2010). https://www.oecd.org/g20/summits/tor onto/G20-Skills-Strategy.pdf. Last accessed 2019/11/10


17. Grubb WN (2003) The roles of tertiary colleges and institutes: trade-offs in restructuring postsecondary education. World Bank, Washington, DC 18. Miles MB, Huberman AM, Huberman MA, Huberman M (1994) Qualitative data analysis: an expanded sourcebook. Sage, Thousand Oaks 19. Yamane T (1967) Statistics: an introductory analysis. No. HA29 Y2 1967 20. Sudman S, Kalton G (1986) New developments in the sampling of special populations. Ann Rev Sociol 12(1):401–29 21. Krippendorff K (2018) Content analysis: an introduction to its methodology. Sage, Thousand Oaks 22. An overview of qualitative research methods. https://www.thoughtco.com/qualitative-researchmethods-3026555 23. Ball MS, Gregory WS (1992) Qualitative research methods: analyzing visual data. Sage, Thousand Oaks. ISBN 9781412983402 24. Nechully S, Pokhriyal SK (2019) Choosing grounded theory and frame work analysis as the appropriate qualitative methods for the research. J Manag (JOM) 6(1) 25. Webometrics ranking web of universities. https://www.webometrics.info/en/Africa/Ghana 26. uniRank top universities in Ghana. https://www.4icu.org/gh/

Author Index

A Ambeth Kumar, V. D., 133 Amoako, John Kani, 553 Antony Vijay, J., 331 Anwar Basha, H., 331 Arora, Deepak, 383 Arun Nehru, J., 331 Arya, Ashima, 481 Arya, Resham, 529 Awasthi, Nitin, 177

B Bagrecha, Rohan, 163 Bhawna, 511 Bhushan, Megha, 529

C Cabrera, Danelys, 439, 447 Chaurasiya, Rashmi, 411

D Danquah, Paul, 553 Deepak, N. A., 231

G Ganeshkumar, N., 191 Ganotra, Dinesh, 411 Garg, Preeti, 39 Goyal, Anjali, 343 Goyal, Tripta, 141 Gupta, Neeraj, 393 Gupta, Vishal, 151

I Iyyappan, M., 1

J Jain, Bharat Bhusan, 469 Jangir, Dimple, 469 Jindal, M. K., 85

K Kaur, Anmoldeep, 223 Kaur, Kamaljit, 259 Kaur, Parminder, 259 Kaur, Rupinder Pal, 85 Khan, Mohammed Ali, 279 Khatri, Sabita, 383 Khurana, Savita, 363 Kishore, R. Rama, 39 Kumar, Arvind, 1, 21 Kumar, Ashok, 529 Kumar, Munish, 85 Kumar, Narander, 383 Kumar, Rajeev, 307 Kumar, Sanjay, 191 Kumawat, Sunita, 203 Kurukuru, V. S. Bharath, 279

L Lezama, Omar, 427, 439, 447 Lomte, Vina, 163

M Maco, José, 427



566 Malathi, S., 115 Malik, Sanjay Kumar, 481, 511 Malik, Sunesh, 67 Malik, Vinita, 101 Manco, Patricia, 439 Meenakshi, 177 Mehrotra, Anupam, 53 Mehta, Neetu, 21 Monisha, G. S., 115 Mor, Navdeep, 141 Mukta, 393 N Nandal, P., 321 Nandal, Rainu, 177 Nandal, Vineet, 79 Navadeepika, K. M. R., 133 P Patel, Mayank, 455 Poonam, 495 Priyanka, 243 R Ramkumar, K., 375 Randhawa, Arpan, 223 Reddlapalli, Rama Kishore, 67 S Sadh, Roopam, 307 Saidov, Sarvar, 375 Salunke, Shubham, 163 Sehrawat, Harkesh, 495 Shankar, Gori, 469 Shanmugam, Selvanayaki Kolandapalayam, 375

Author Index Sharma, Preeti, 295 Sharma, Saurabh, 151 Sharma, Shakshi, 343 Sharma, Yogesh Kumar, 553 Shedge, Shambhavi, 163 Shekhawat, Vaibhav Singh, 455 Shobha, N. S., 231 Shyamala Devi, J., 375 Silva, Jose, 427, 439, 447 Singh, Parvinder, 79 Singh, Prabhsimran, 259 Singh, Rajesh Kumar, 363 Singh, Rupam, 279 Singh, Sukhdip, 101 Singh, Vijendra, 481, 511 Singh, Yudhvir, 495 Siwach, Vikas, 495 Sood, Hemant, 141 Srivastava, V. K., 295

T Tiwari, Manish, 455 Toshpulotov, Sukrobjon, 375

V Varas, Jesus, 427, 439 Varela, Noel, 427, 439, 447 Villón, Martín, 427

W Wagh, Tanay, 163

Y Yadav, Rahul, 243