Proceedings of International Conference on Network Security and Blockchain Technology: ICNSBT 2021 (Lecture Notes in Networks and Systems, 481) 9789811931819, 9789811931826, 981193181X

The book is a collection of best selected research papers presented at the International Conference on Network Security and Blockchain Technology (ICNSBT 2021).


English · 419 pages · 2022


Table of contents :
Preface
Contents
About the Authors
Security and Privacy
Cyber-Defense Mechanism Considering Incomplete Information Using POMDP
1 Introduction
2 Related Work
3 Problem Description
3.1 The Network
3.2 The Incomplete Information and the Defender’s Perspective
4 Methodology
4.1 Decision Strategies
4.2 Scalability
5 Results
5.1 Inexperienced Attacker
5.2 Experienced Attacker
6 Conclusion and Discussion
References
Monitoring, Recognition and Attendance Automation in Online Class: Combination of Image Processing, Cryptography in IoT Security
1 Introduction
2 Literature Review
3 Methodology
4 Result Analysis
5 Conclusion
References
Cyber Threat Phylogeny Assessment and Vulnerabilities Representation at Thermal Power Station
1 Introduction
1.1 Military Threats
2 Phylogenetic Attributes of TPS
3 Cyber-Threat Phylogeny Categorization Scheme
3.1 Procedure for Threat and Vector of Threat
3.2 Physical Accessibility
4 Template for a Phylogeny and Its Application in a Conformance Test
5 Conclusion
References
A Novel Data Encryption Technique Based on DNA Sequence
1 Introduction
2 Literature Survey
3 Methodology
4 Security Evaluation
5 Results
6 Conclusion and Future Scope
References
Continuous Behavioral Authentication System for IoT Enabled Applications
1 Introduction
2 Recent Contributions
3 Proposed Scheme
3.1 Methodology
4 Result and Discussion
4.1 Experimental Scenarios
4.2 Experimental Results
5 Conclusion and Future Work
References
A Secure `e-Tendering' Application Based on Secret Image Sharing
1 Introduction
2 Preliminary
2.1 Major Entities
3 Related Work
3.1 Review of Wang-Ma-Li's SIS Scheme
4 Proposed Scheme
4.1 Uploading and Storing of e-Tenders
4.2 Opening of e-Tenders
5 Implementation and Experimental Results
6 Conclusion
References
Video Based Graphical Password Authentication System
1 Introduction
2 Literature Survey
3 Problem Identified
4 Proposed Solution
5 Result Analysis
6 Conclusion
References
Designing Robust Blind Color Image Watermarking-Based Authentication Scheme for Copyright Protection
1 Introduction
2 Proposed Watermarking-Based Authentication Scheme
2.1 Watermark Embedding Procedure
2.2 Watermark Extraction Procedure
3 Experimental Results
3.1 For 64-Bit Random Sequence as Watermark
3.2 Comparison with Respect to Hybrid Attacks
4 Conclusions
References
LSB Steganography Using Three Level Arnold Scrambling and Pseudo-random Generator
1 Introduction
2 Literature Survey
3 Proposed Solution
4 Conclusion
References
Network, Network Security and their Applications
Performance Analysis of Retrial Queueing System in Wireless Local Area Network
1 Introduction
2 Real-World Application
3 Model Description
3.1 Arrival Manner
3.2 Retrial Manner
3.3 Service Mode and Feedback Rule
3.4 Removal and Repair Procedure
3.5 Vacation Procedure
4 Steady State Distributions
4.1 Steady State Equations
4.2 Steady State Solutions
5 Performance Measures
6 Comparison of the Proposed Model with the Existing Model
7 Numerical Results
8 Conclusion
References
IBDNA – An Improved BDNA Algorithm Incorporating Huffman Coding Technique
1 Introduction
2 Preliminaries
2.1 Huffman Coding
2.2 BDNA-A DNA Inspired Scheme
3 Proposed Algorithms
3.1 The Encryption Procedure
3.2 The Decryption Procedure
4 Implementation
5 Analysis of the Proposed Algorithm
5.1 Security Analysis
5.2 Experimental Results
5.3 Enhancements Over the Existing BDNA Cryptography [15]
6 Conclusion
References
Obfuscation Techniques for a Secure Endorsement System in Hyperledger Fabric
1 Introduction
2 Literature Survey
3 Proposed Linkable Ring Signature for Hyperledger Fabric
3.1 Proposed Construction
4 Obscuring Endorsement Policy
4.1 Implementing Membership Service Provider with Obfuscation of Endorsement in Hyperledger Fabric
5 Security Analysis for the Constructed Linkable Ring Signature Scheme
6 Security Analysis for a Privacy-Preserving Endorsement in Hyperledger Fabric
6.1 Policy Principal Anonymity and Unlinkability
6.2 Endorser Anonymity and Unlinkability
7 Performance Analysis of LRS
8 Experimental Analysis
9 Conclusion and Future Work
References
Mobile Operating System (Android) Vulnerability Analysis Using Machine Learning
1 Introduction
2 Literature Review
3 Methodology
3.1 Dataset
3.2 Preprocessing and Feature Extraction
3.3 Classifiers Framework (Paradigm)
3.4 Architecture of the GRU
4 Result Evaluation Analysis
4.1 ML
4.2 DL
5 Conclusion
References
Survey of Predictive Autoscaling and Security of Cloud Resources Using Artificial Neural Networks
1 Introduction
2 Background and Current Research
2.1 Autoscaling
2.2 Types of Autoscaling
2.3 Predictive Autoscaling Paradigms
2.4 Machine Learning Algorithms
2.5 Augmenting Autoscaling with Machine Learning
2.6 Cloud Native Process Automation
3 Implementation Challenges
4 Solution Analysis and Future Trends
5 Security Considerations
6 Conclusion
References
Systematic Literature Review (SLR) on Social Media and the Digital Transformation of Drug Trafficking on Darkweb
1 Introduction
2 Research Method
2.1 The Question for Research
2.2 Research Process
2.3 Search Strategy and Selection
2.4 Evaluation of Quality
2.5 Major Roles of the DW
2.6 The Process of Data Extraction
3 Result and Discussion
3.1 Threats of Criminal Activity in DW(RQ1)
3.2 Techniques for Tracking Down Criminals in DW(RQ2)
3.3 How Many SLRs Have Been Released Between 1st February 2015 and March 30, 2021 (RQ3)
4 Study Limitations
5 Conclusion
References
A Survey on Interoperability Issues at the SaaS Level Influencing the Adoption of Cloud Computing Technology
1 Introduction
2 Literature Review
2.1 Cloud Computing Characteristics, Services and Deployment Models
2.2 Common Cloud Computing Issues
2.3 General Solutions to Cloud Issues
3 Results
3.1 Chart Analysis of the Survey on SaaS Level Interoperability
4 Conclusion and Future Research Directions
References
Human Recognition Based Decision Virtualization for Effecting Safety-as-a-Service Using IoT Enabled Automated UV-C Sanitization System
1 Introduction
2 Related Work
3 Methodology
3.1 Proposed Pipeline
4 Result
5 Conclusion and Future Work
References
Blockchain Technology and its Applications
Security Optimization of Resource-Constrained Internet of Healthcare Things (IoHT) Devices Using Asymmetric Cryptography for Blockchain Network
1 Introduction
2 Related Work
3 Motivation
4 Concerns and Challenges in Implementation of Cryptographic Techniques to Resource Constrained IoHT Devices
5 Solutions to Enhance Security in Physical Layer of IoHT Devices
6 Blockchain Based IoHT Devices
7 Elliptic Curve Cryptography for Private Key Generation
7.1 Algorithm for ECC Based Secret Key Generation
7.2 RSA
8 Results and Discussion
9 Conclusion and Future Work
9.1 Future Work
References
Preserving Privacy Using Blockchain Technology in Autonomous Vehicles
1 Introduction
1.1 Autonomous Vehicles and Intelligent Transportation System
2 Literature Review
3 Safety Approaches for Autonomous Vehicles and Intelligent Transportation System
4 Role of Internet of Vehicles (IoVs) in Today’s Smart Era
5 Critical Challenges and Open Issues Towards Autonomous Vehicles and Intelligent Transportation System
6 Proposed Solution
7 Simulation Results
8 Conclusion and Future Opportunities
References
Security Protocols for Blockchain-Based Access Control in VANETS
1 Introduction
2 Ease of Use
2.1 Backgrounds
2.2 Motivation
2.3 Problem Statement
3 VANET Background
3.1 Authentication of VANET
4 Blockchain Background
4.1 Authentication of VANETs using Blockchain Technology
5 Projected Method Architecture
6 Results
6.1 Authentication Delay
6.2 Active Time of Channels
6.3 Packet Size of B.S.Ms
6.4 Extra Information Sent
7 Conclusion
8 Future Works
References
LIVECHAIN: Lightweight Blockchain for IOT Devices and It's Security
1 Introduction
2 Proposed Methodology and Implementation
2.1 Encryption-Decryption
2.2 Blockchain Structure
3 Experimental Results
4 Conclusions
References
Secure and Scalable Attribute Based Access Control Scheme for Healthcare Data on Blockchain Platform
1 Introduction
2 Background
2.1 Blockchain
2.2 Attribute Based Encryption
3 Related Work
4 Preliminaries
4.1 Bilinear Maps
4.2 Ciphertext-Policy Attribute Based Encryption
5 Proposed Model
5.1 Stakeholders and Their Roles
5.2 Operation of the Proposed Framework
6 Security Analysis
6.1 Security Analysis of Blockchain
6.2 Security Analysis of Hierarchical CP-ABE
6.3 Our Security Model
6.4 Formal Security Proof Using BAN Logic
7 Conclusion
References
Adaptive Neuro Fuzzy Inference System for Monitoring Activities in Electric Vehicles Through a Hybrid Approach and Blockchain Technology
1 Introduction
2 Literature Review
3 ANFIS Modelling and Analysis of Charging EV's During Peak Hours
4 Hybrid and Conventional Integration of Blockchain in the Electric Vehicles
4.1 Experimental Validation
5 Conclusion
References
Application of BLOCKCHAIN in Agriculture: An Instance of Secure Financial Service in Farming
1 Introduction
2 Literature Survey
3 Methodology
4 Experimentation and Results
5 Conclusion
References
Adaptive Electronic Health Records Management and Secure Distribution Using Blockchain
1 Introduction
1.1 Challenges in the Implementation of Adaptive Electronic Health Record Maintenance and Distribution
2 Literature Survey
3 Proposed System
4 Architecture
5 Result Analysis
6 Conclusion
References
Non-content Message Masking Model for Healthcare Data in Edge-IoT Ecosystem Using Blockchain
1 Introduction
2 System Model and Methodology
2.1 Authentication Mechanism
2.2 Proof of Work Scheme
2.3 Message Transfer
2.4 Message Broadcast Policy
2.5 Peer Address Generation Scheme
2.6 Plausible Deniability Pacify Service for Message Sender
3 Results and Evaluation
3.1 Timing Analysis
3.2 Comparative Analysis
4 Conclusion
References
Voting Based Consensus Protocol for Blockchain to Maintain COVID Patient Records in Consortium Networks
1 Introduction
2 Literature Survey
3 Need for Consensus Protocols
4 Proposed Solution to Maintain Health Records
5 Implementation
6 Execution Results and Discussions
7 Conclusion and Future Work
References
Blockchain Adaptation in Healthcare: SWOT Analysis
1 Introduction
2 Healthcare Pressing Issues
2.1 Lack of Interoperability
2.2 Rising Healthcare Costs
2.3 Centralization
2.4 Lack of Auditability
2.5 Health Data Silos
2.6 Security and Privacy
2.7 Heterogeneity in Device Resources
2.8 Information and Integrated Healthcare Services
3 Blockchain Solutions for Healthcare Issues
4 Blockchain Adaptation SWOT Analysis
4.1 Strength
4.2 Opportunities
4.3 Weaknesses
4.4 Threats
5 Conclusion
References
Blockchain-IoT Based Blood Supply Chain Management System
1 Introduction
2 Related Works
3 Proposed System
3.1 System Workflow
3.2 System Architecture
3.3 Membership and Identity Management
3.4 Access Control and Privacy
3.5 IoT Based Automated Data Gathering
3.6 Multiparty Endorsements
4 Performance Evaluation
4.1 Latency
4.2 Throughput
5 Comparative Analysis
6 Conclusion and Future Work
References
Blockchain-Based COVID-19 Detection Framework Using Federated Deep Learning
1 Introduction
1.1 Contribution
2 Literature Survey
2.1 Differential Privacy
2.2 A Secure Deep Learning Model
2.3 Federated Learning
2.4 The End-to-End Encrypted Deep Learning Application
2.5 Provenance Using Blockchain
3 System Design
3.1 Secure Model Design
4 Experiment
4.1 Dataset
4.2 Performance Matrix
5 Conclusion
References
Systematic Review of Attribute-Based Access Control for a Smart City Using Blockchain
1 Introduction
2 Literature Review
3 Result and Discussion
4 Research Challenges
5 Conclusion
References
A Methodical Study on Blockchain Technology
1 Introduction
2 Overview of Blockchain
2.1 Public Blockchain
2.2 Private Blockchain
2.3 Consortium Blockchain
2.4 Hybrid Blockchain
3 Recent Contributions
4 Results and Discussion
5 Conclusion
References
Author Index


Lecture Notes in Networks and Systems 481

Debasis Giri · Jyotsna Kumar Mandal · Kouichi Sakurai · Debashis De (Editors)

Proceedings of International Conference on Network Security and Blockchain Technology ICNSBT 2021

Lecture Notes in Networks and Systems Volume 481

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong

The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose ([email protected]).

More information about this series at https://link.springer.com/bookseries/15179


Editors

Debasis Giri
Department of Information Technology, Maulana Abul Kalam Azad University of Technology, Haldia, West Bengal, India

Jyotsna Kumar Mandal
Department of Computer Science and Engineering, University of Kalyani, Kalyani, West Bengal, India

Kouichi Sakurai
Department of Informatics, Kyushu University, Fukuoka, Japan

Debashis De
Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, West Bengal, India

ISSN 2367-3370  ISSN 2367-3389 (electronic)
Lecture Notes in Networks and Systems
ISBN 978-981-19-3181-9  ISBN 978-981-19-3182-6 (eBook)
https://doi.org/10.1007/978-981-19-3182-6

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

In the last two decades, scientific computing has become an important contributor to all scientific research programmes. It is particularly important for the solution of research problems that are unsolvable by traditional theoretical and experimental approaches, hazardous to study in the laboratory, or too time-consuming or expensive to solve by traditional means. The International Conference on Network Security and Blockchain Technology (ICNSBT 2021) is a premier forum for the presentation of new advances and research results in the fields of cryptography, network security, blockchain and its applications. The conference brought together leading academic scientists, experts from industry and researchers in their domains of expertise from around the world. ICNSBT 2021 aimed to bring together both novice and experienced scientists and developers, to meet new colleagues, collect new ideas and establish new cooperation between research groups, and to provide a platform for researchers from academia and industry to present their original work and exchange ideas, information, techniques and applications in the fields of (i) security and privacy, (ii) network security and its applications, and (iii) blockchain technology and its applications.

ICNSBT 2021 was organized by the Computer Society of India, Kolkata Chapter. The conference received 152 papers from across the globe, and this volume contains 32 full papers. We would like to thank the Chief Patron, Patron, General Chair, Organizing Chair, organizing committee members, attendees and authors who attended and made this conference a success. We are also thankful to the invited speakers: Prof. Bart Preneel (Leuven University, Belgium), Prof. Claudio Orlandi (Aarhus University, Denmark), Prof. Yingjiu (Joe) Li (Oregon University, USA), Prof. Michael David (National Intelligence University, Washington, D.C., USA), Prof. Feng Hao (Warwick University, UK), Prof. Amlan Chakrabarti (University of Calcutta, India), Prof. Samiran Chattopadhyay (TCG Centres for Research and Education in Science and Technology), and Mr. Koushik Nath (Security Architect, Cisco Systems). We are grateful to Springer Nature for publishing the presented papers of ICNSBT 2021 in the "Lecture Notes in Networks and Systems" series.


Our sincere gratitude goes to the esteemed authors and reviewers for extending their support and cooperation. We hope this volume will be a valuable resource for researchers and budding engineers.

Debasis Giri
Jyotsna Kumar Mandal
Kouichi Sakurai
Debashis De

Contents

Security and Privacy

Cyber-Defense Mechanism Considering Incomplete Information Using POMDP
Kshitij Pisal and Sayak Roychowdhury

Monitoring, Recognition and Attendance Automation in Online Class: Combination of Image Processing, Cryptography in IoT Security
Pritam Mukherjee, Abhishek Mondal, Soumallya Dey, Avishikta Layek, Sanchari Neogi, Monisha Gope, and Subir Gupta

Cyber Threat Phylogeny Assessment and Vulnerabilities Representation at Thermal Power Station
Vinod Mahor, Bhagwati Garg, Shrikant Telang, Kiran Pachlasiya, Mukesh Chouhan, and Romil Rawat

A Novel Data Encryption Technique Based on DNA Sequence
Abritti Deb, Satakshi Banik, Pankaj Debbarma, Piyali Dey, and Ankur Biswas

Continuous Behavioral Authentication System for IoT Enabled Applications
Vivek Kumar and Sangram Ray

A Secure ‘e-Tendering’ Application Based on Secret Image Sharing
Sanchita Saha, Arup Kumar Chattopadhyay, Suman Kumar Mal, and Amitava Nag

Video Based Graphical Password Authentication System
Bipin Yadav, Kaptan Singh, and Amit Saxena

Designing Robust Blind Color Image Watermarking-Based Authentication Scheme for Copyright Protection
Supriyo De, Jaydeb Bhaumik, and Debasis Giri

LSB Steganography Using Three Level Arnold Scrambling and Pseudo-random Generator
Sayak Ghosal, Saumya Roy, and Rituparna Basak

Network, Network Security and their Applications

Performance Analysis of Retrial Queueing System in Wireless Local Area Network
N. Sangeetha, J. Ebenesar Anna Bagyam, and K. Udayachandrika

IBDNA – An Improved BDNA Algorithm Incorporating Huffman Coding Technique
Mangalam Gupta, Dipanwita Sadhukhan, and Sangram Ray

Obfuscation Techniques for a Secure Endorsement System in Hyperledger Fabric
J. Dharani, K. Sundarakantham, Kunwar Singh, and Shalinie S Mercy

Mobile Operating System (Android) Vulnerability Analysis Using Machine Learning
Vinod Mahor, Kiran Pachlasiya, Bhagwati Garg, Mukesh Chouhan, Shrikant Telang, and Romil Rawat

Survey of Predictive Autoscaling and Security of Cloud Resources Using Artificial Neural Networks
Prasanjit Singh and Pankaj Sharma

Systematic Literature Review (SLR) on Social Media and the Digital Transformation of Drug Trafficking on Darkweb
Romil Rawat, Vinod Mahor, Mukesh Chouhan, Kiran Pachlasiya, Shrikant Telang, and Bhagwati Garg

A Survey on Interoperability Issues at the SaaS Level Influencing the Adoption of Cloud Computing Technology
Gabriel Terna Ayem, Salu George Thandekkattu, and Narasimha Rao Vajjhala

Human Recognition Based Decision Virtualization for Effecting Safety-as-a-Service Using IoT Enabled Automated UV-C Sanitization System
Ananda Mukherjee, Bavrabi Ghosh, Nilanjana Dutta Roy, Arijit Mandal, and Pinaki Karmakar

Blockchain Technology and its Applications

Security Optimization of Resource-Constrained Internet of Healthcare Things (IoHT) Devices Using Asymmetric Cryptography for Blockchain Network
Varsha Jayaprakash and Amit Kumar Tyagi

Preserving Privacy Using Blockchain Technology in Autonomous Vehicles
Meghna Manoj Nair and Amit Kumar Tyagi

Security Protocols for Blockchain-Based Access Control in VANETS
Kunal Roy and Debasis Giri

LIVECHAIN: Lightweight Blockchain for IOT Devices and It's Security
Mukuldeep Maiti, Subhas Barman, and Dipra Bhagat

Secure and Scalable Attribute Based Access Control Scheme for Healthcare Data on Blockchain Platform
Shweta Mittal and Mohona Ghosh

Adaptive Neuro Fuzzy Inference System for Monitoring Activities in Electric Vehicles Through a Hybrid Approach and Blockchain Technology
Tapashri Sur, Sudipto Dhar, Sumit Naskar, Champak Adhikari, and Indrajit Chakraborty

Application of BLOCKCHAIN in Agriculture: An Instance of Secure Financial Service in Farming
Sumit Das, Manas Kumar Sanyal, and Suman Kumar Das

Adaptive Electronic Health Records Management and Secure Distribution Using Blockchain
G. Jagadamba, E. L. Sai Krishna, J. P. Amogh, B. B. Abhishek, and H. N. Manoj

Non-content Message Masking Model for Healthcare Data in Edge-IoT Ecosystem Using Blockchain
Partha Pratim Ray

Voting Based Consensus Protocol for Blockchain to Maintain COVID Patient Records in Consortium Networks
D. Chetan Surya, R. Shree Harsha, R. Sahitya, G. R. Deepak, and N. R. Sunitha

Blockchain Adaptation in Healthcare: SWOT Analysis
Halim Khujamatov, Nurshod Akhmedov, Lazarev Amir, and Khaleel Ahmad

Blockchain-IoT Based Blood Supply Chain Management System
Bikramjit Choudhury, Nabajyoti Dewri, Prangana Das, and Amitava Nag

Blockchain-Based COVID-19 Detection Framework Using Federated Deep Learning
Puja Das, Moutushi Singh, and Deepsubhra Guha Roy

Systematic Review of Attribute-Based Access Control for a Smart City Using Blockchain
Gourav Mondal, Debasis Giri, and Kousik Barik

A Methodical Study on Blockchain Technology
Tiyasha Laha, Aritra Bandyopadhyay, Kaustuv Deb, and Sourabh Koley

Author Index

About the Authors

Dr. Debasis Giri is currently working as Associate Professor in the Department of Information Technology, Maulana Abul Kalam Azad University of Technology (formerly known as West Bengal University of Technology), West Bengal, India. Prior to this, he held academic positions as Professor in the Department of Computer Science and Engineering and Dean of the School of Electronics, Computer Science and Informatics, Haldia Institute of Technology, Haldia, India. He completed his master's degrees (M.Tech. and M.Sc.) and his Ph.D. at IIT Kharagpur, India, and was the tenth all-India rank holder in the Graduate Aptitude Test in Engineering in 1999. He has published more than 80 papers in international journals and conferences. His current research interests include cryptography, information security, e-commerce security and the design and analysis of algorithms. He is an editorial board member and reviewer for many international journals and a programme committee member of international conferences. He is a Life Member of the Cryptology Research Society of India, the Computer Society of India and the International Society for Analysis, its Applications and Computation (ISAAC), and an IEEE Member.

Jyotsna Kumar Mandal received his M.Tech. in Computer Science from the University of Calcutta in 1987 and was awarded a Ph.D. (Engineering) in Computer Science and Engineering by Jadavpur University in 2000. He is working as Professor of Computer Science and Engineering; he served as Dean, Faculty of Engineering, Technology and Management, University of Kalyani, for two consecutive terms during 2008–2012, and is Director of the IQAC, Kalyani University, and Chairman of the CIRM and the Placement Cell. He served as Professor of Computer Applications at KGEC, as Associate Professor and Assistant Professor of Computer Science at North Bengal University for fifteen years, and as Lecturer at NERIST, Itanagar, for one year. He has 34 years of teaching and research experience in coding theory, data and network security and authentication, remote sensing and GIS-based applications, data compression, error correction, visual cryptography and steganography; 26 Ph.D. degrees have been awarded under his supervision, with one thesis submitted and 8 students pursuing. He has supervised 3 M.Phil., more than 80 M.Tech. and more than 150 MCA dissertations. He is Guest Editor of the MST Journal (SCI indexed) of Springer, has published more than 450 research articles, of which 190 appear in international journals, has published 15 books with LAP Germany, IGI Global, Springer, etc., has organized more than 50 international conferences, is Corresponding Editor of edited volumes and conference publications of Springer, IEEE and Elsevier, and has edited more than 50 volumes as Volume Editor. He received the "Siksha Ratna" Award from Higher Education, Government of West Bengal, India, in 2018 for outstanding teaching activities; the Vidyasagar Award from the International Society for Science, Technology and Management at the Fifth International Conference on Computing, Communication and Sensor Network; the Chapter Patron Award, CSI Kolkata Chapter, in 2014; the "Bharat Jyoti Award" for meritorious services, outstanding performance and a remarkable role in the field of Computer Science and Engineering on 29 August 2012 from the International Friendship Society (IIFS), New Delhi; and the A. M. Bose Memorial Silver Medal and the Kali Prasanna Dasgupta Memorial Silver Medal from Jadavpur University.

Kouichi Sakurai received the B.S. degree in mathematics from the Faculty of Science, Kyushu University, in 1986, the M.S. degree in applied science in 1988, and the Doctorate in engineering in 1993 from the Faculty of Engineering, Kyushu University. He was engaged in research and development on cryptography at the Computer and Information Systems Laboratory of Mitsubishi Electric Corporation from 1988 to 1994. From 1994, he worked in the Department of Computer Science of Kyushu University as Associate Professor and became Full Professor there in 2002. He is now also working with the Adaptive Communications Research Laboratories and the Advanced Telecommunications Research Institute International (ATR) as a Visiting Researcher in information security. Professor Sakurai has published more than 400 academic papers on cryptography and information security.

Prof. Debashis De earned his M.Tech. from the University of Calcutta in 2002 and his Ph.D. (Engineering) from Jadavpur University in 2005. He is Professor and Director in the Department of Computer Science and Engineering of the West Bengal University of Technology, West Bengal, India, and Adjunct Research Fellow at the University of Western Australia, Australia. He is a Senior Member of the IEEE, a Life Member of CSI, and a Member of the International Union of Radio Science. He worked as an R&D Engineer for Telektronics and as a Programmer at Cognizant Technology Solutions. He was awarded the prestigious Boyscast Fellowship by the Department of Science and Technology, Government of India, to work at Heriot-Watt University, Scotland, UK, and received the Endeavour Fellowship Award during 2008–2009 from DEST Australia to work at the University of Western Australia. He received the Young Scientist Award both in 2005 at New Delhi and in 2011 at Istanbul, Turkey, from the International Union of Radio Science, Headquarters, Belgium. His research interests include mobile cloud computing and green mobile networks. He has published more than 250 peer-reviewed international journal papers with IEEE, IET, Elsevier, Springer, World Scientific, Wiley, IETE, Taylor & Francis and ASP, 100 international conference papers, six research monographs with Springer, CRC and NOVA, and ten textbooks published by Pearson Education. He is Associate Editor of the journal IEEE Access and Editor of Hybrid Computational Intelligence and the journal Array, Elsevier.

Security and Privacy

Cyber-Defense Mechanism Considering Incomplete Information Using POMDP

Kshitij Pisal (School of Computer Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30308, USA)
Sayak Roychowdhury (Department of Industrial and Systems Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India; [email protected])

Abstract. This article focuses on developing a cyber-defense system in which the defender has incomplete information about the state of the network under the threat of cyber-attack. It involves deriving optimal defense strategies for a layered network with an uncertain state of security. The strategies are derived using a Partially Observable Markov Decision Process (POMDP), modelling the uncertainty of the state space. The defender observes its environment via noisy alarms triggered by the Intrusion Detection System (IDS). Based on its belief about the current state of the network, the defender chooses an action, which changes the network's accessibility, thus blocking the adversary by decreasing its probability of reaching its goal(s). To circumvent the state-space explosion problem in traditional Markov models, an algorithm is developed which enumerates all the reachable paths an attacker can take, considering the possible sequential progress of the attacker within the structure of the network. This prunes the unreachable set of states out of the universal set, resulting in reduced computation time. Two scenarios based on the capability of the attacker are simulated, and the responses from the defender are recorded. It is shown that the POMDP-based defense strategy is both economical and effective in mitigating cyber-attacks.

Keywords: POMDP · Incomplete information · Recursive algorithm · Cybersecurity · Layered network

1 Introduction

Cyberwarfare has emerged as a front targeting not just organisations and big corporations but also governments and nations. According to a report published in [1], the average cost of a data breach has been estimated to be USD 3.86 million. In most cyber-attacks, the attacker exploits vulnerabilities already present in the network. One way to rectify this is to apply security patches as soon as a vulnerability is detected. However, a significant issue with this approach is the vulnerability exposure window. According to the report by [2], the average time between the discovery of a vulnerability and the deployment of a security patch is 140 days. During this period, the network is exposed to adversaries, who can create costly disruptions to bring down the functionality of the network or steal sensitive information. Hence, a real-time cyber-defense mechanism is of paramount importance to protect the system during the vulnerability exposure window.

In this article, a layered network structure is assumed in which an attacker tries to reach the deepest node, and the defender tries to block the attacker. The defender aims to learn the best sequence of actions possible to protect the network, as well as to minimize the network's disruption for trusted users. The model is tested in two different scenarios with different capabilities of the attacker. The changes in strategies are observed, and the corresponding frequencies are noted for the different actions taken in a simulation of fifty encounters between the two agents, the attacker and the defender. The problem of scalability in terms of the expansion of the network's size is also explored, and a recursive algorithm to find all the feasible paths that the attacker can take has been developed. The algorithm also serves the purpose of pruning out the unattainable states of the network, reducing the computational complexity.

The primary contributions of this article are:
1. The defender's decision problem with incomplete information to stall the progress of the attacker is solved for a layered network structure, using POMDP.
2. The defender's capability to defend the nodes, each of which may have multiple exploit points, is demonstrated.
3. A novel recursive-function based algorithm is developed to enumerate the possible attack paths and to prune out the unattainable states of the network.
4. The defender's strategies are demonstrated effectively for attackers with different levels of capability.

The rest of the article is organized as follows. In Sect. 2, a brief review of the literature in the relevant area is presented. Section 3 provides the problem description. In Sect. 4, the solution approach and methodology are explained. Section 5 shows the implementation using an example. Section 6 contains the conclusion and discussion.

2 Related Work

Optimization of cyber maintenance remains an active topic of research in the domain of cybersecurity. Kuhn and Madanat [3] demonstrate the use of robust optimization in the context of asset management involving epistemic uncertainty in Markovian systems. Similar challenges are faced in cybersecurity, where cost-effective policies are required in order to protect the network. A game-theoretic approach towards modelling strategic interaction between attackers and defenders in cyberspace is proposed in [4]. In [5], the authors use the MDP framework for the allocation of resources to render the attackers ineffective and for efficient recovery from the damage caused by a cyber-attack. In [6], the authors formulate a Bayesian Reinforcement Learning (BRL) approach which uses historical data obtained through limited scanning to derive optimal cyber vulnerability maintenance strategies.

To demonstrate cyber-attack and defense strategies in layered networks, the authors in [7] present a 4-node network as a higher abstraction of servers in a part of the internet. The start node denotes the attacker's computer, which is used to enter the cyber network. The attacker then has to traverse through the network using the two nodes connected to its computer to finally extract data from the target node. The study further implements the different reinforcement learning techniques that can be used to train the attacker and the defender in a complete-information game and compares them against each other in the given environment. A similar description of the attacker in a relatively more extensive network is discussed in [8]. The concept of exploit points is used, similar to that presented in [9]. The attacker targets these exploit points in order to compromise the connected node(s). Subsequently, the attacker moves deeper into the network to target specific goal nodes, in order to access the most valuable information or to inflict maximum damage. When the attacker attacks in order to compromise a network, it is disguised as a regular user, limiting the information about its position in the network and the strategies it employs.

In our article, a Partially Observable Markov Decision Process (POMDP) based model to derive the defender's decision-making strategy is presented, which takes into account the uncertainty attributed to the defender's incomplete information about the current state of the network. POMDP solution techniques for stochastic optimization initially appear in [10, 11, 12]. The computationally efficient PERSEUS algorithm, a heuristic approach to solving POMDPs, is proposed in [13]. In [14], the authors demonstrate the application of POMDP in deriving cost-effective maintenance actions in cyberspaces. A detailed application of POMDP in cyber-networks of large size is presented in [8]. The authors take a POMDP approach to train the defender against the attacker to minimize the damage caused to the network's normal functioning by blocking the exploits, as blocking an exploit also prevents a trusted user from using the node and the subsequent nodes of the network. To make their approach scalable, the authors in [8] also developed an online heuristic search algorithm through action selection and belief state updates using Monte Carlo simulation. In all the above networks, the observations are made through alarms triggered by the Intrusion Detection System (IDS). These observations are then used to find vulnerabilities in the network and assign resources to either fix the vulnerability or temporarily suspend the node to avoid further damage by the adversary.

3 Problem Description

There are two important aspects of the cybersecurity problem addressed in this article, viz. the network structure and the defender's incomplete information.

3.1 The Network

The network used in this article has a layered structure as described in [15]. The network has three nodes, which may be representative of network elements like servers, CPUs or modems. A good example of such a network in a real-world scenario is a tree topology, also known as a star-bus topology [16]. Since the star topology is a very common network structure, training a defender agent on one such network results in findings applicable to a large section of networks. The nodes in our network can be accessed via different methods, such as a user login or a document upload segment. These access methods are assumed to have specific vulnerabilities that the attacker can exploit. Hence, each node can be accessed via multiple exploit points; in our example, we consider two exploit points for each node. A trusted user accesses the same exploit points, without any harmful intent, to reach the node and its available data (see Fig. 1).

Fig. 1. The network showing the connection of nodes and exploit points

It is assumed that the attacker has to compromise a node in order to compromise all the subsequent nodes. The exploits e11, e12, e21, and e22 belong to the upper layer of the network, and the exploits e31, e32, e41, and e42 belong to the lower layer of the network. The attacker cannot simply attack node D without first compromising the nodes N1 or N2.
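For concreteness, the layered topology just described can be written down as a small lookup structure. The sketch below is an assumption: the assignment of e11/e12 to N1, e21/e22 to N2, e31/e32 to the N1-to-D link and e41/e42 to the N2-to-D link is inferred from Fig. 1, Fig. 2 and Table 2, and the code is illustrative rather than the authors' implementation.

```python
# Hypothetical encoding of the layered network of Fig. 1.
# exploit point -> (node it attacks, node that must already be compromised)
EXPLOITS = {
    "e11": ("N1", None), "e12": ("N1", None),   # upper layer: entry into N1
    "e21": ("N2", None), "e22": ("N2", None),   # upper layer: entry into N2
    "e31": ("D", "N1"),  "e32": ("D", "N1"),    # lower layer: from N1 to goal node D
    "e41": ("D", "N2"),  "e42": ("D", "N2"),    # lower layer: from N2 to goal node D
}

def reachable_exploits(compromised):
    """Exploit points the attacker can currently try, given the set of already
    compromised nodes (the layering rule of Sect. 3.1)."""
    return [e for e, (target, prereq) in EXPLOITS.items()
            if target not in compromised and (prereq is None or prereq in compromised)]

print(reachable_exploits(set()))     # only the entry exploits e11, e12, e21, e22
print(reachable_exploits({"N1"}))    # e21, e22 plus e31, e32 towards D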

3.2 The Incomplete Information and the Defender's Perspective

As the attacker traverses the network, the defender can be thought of as an agent with specific powers to modify the network's functioning in order to defend it from an adversary. However, the defender has very little knowledge of the network's state, as the attacker is disguised as a regular user while compromising the nodes and extracting information. This lack of information about the network's state and the attacker's strategy is compensated for by learning from the observations and the rewards that the defender receives. The IDS is not 100% reliable, and the possibility of both false detections and missed detections has to be catered for by learning via the POMDP model.

4 Methodology

The methodological contribution of this article features two primary components: (i) formulation of decision strategies, and (ii) scalability.


4.1 Decision Strategies

The defender's strategies are derived by modeling the problem as a POMDP. To learn more about POMDPs and solution algorithms, one may refer to Spaan et al. [13]. The following subsections briefly describe how a POMDP is adopted in the context of the research problem at hand.

Partially Observable Markov Decision Process (POMDP): Complete observability is rarely present in any real-life setting. This kind of uncertainty can be captured by a POMDP, which is an extension (or generalisation) of the Markov Decision Process (MDP) in the sense that the agent cannot directly observe the underlying state. The belief states are derived from the conditional probability distribution of observations given that the agent is in a specific state, and these belief states are updated as the agent takes decisions. In an MDP, on the other hand, the states are fully observable and the agent can determine the exact state from the probabilities of state transition given the initial state and the action. In the cyber-network, the POMDP becomes an effective technique to model the defender, which has very little knowledge about the attacker's position and its next move. The defender tries to decipher the state in which the network is, using the observations in the form of IDS alarms triggered by the attacker. A POMDP has seven elements, given as:

S: the finite set of all the states.
A: the finite set of all the actions.
T: S × A → Π(S), the state transition function. It gives the probability of transitioning from state s ∈ S to s' ∈ S for every action a ∈ A taken, denoted T(s'|s, a).
R: S × S → ℝ, the function giving the reward for every state transition from s ∈ S to s' ∈ S.
Ω: the finite set of all the observations.
O: S × A → Π(Ω), the observation function, which gives the probability of receiving an observation o ∈ Ω given the state s' ∈ S after taking action a ∈ A, denoted O(o|s', a).
γ: the discount factor.

The objective of the POMDP agent is to choose actions that maximize the expected discounted reward:

$$\max \; E\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t}\right]$$

where $r_t$ is the reward received by the agent at time t.

Belief State Update: In a POMDP, since the system state is not known with certainty, a probability distribution over the state space is used, known as the belief state. For an n-dimensional state space, the belief state is given by $B = [b(s_1), b(s_2), \ldots, b(s_n)]$, where $b(s_j)$ is the probability that the agent is in state $s_j$. The belief state probabilities are updated after every action taken that causes a state transition.


The belief state update for a transition to state s' can be given as:

$$b_{t+1}(s') = \frac{O(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b_t(s)}{\sum_{s' \in S} O(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b_t(s)}$$

where, for a transition of the state from s to s', O(o|s', a) denotes the probability of observing o in state s' when action a is taken, and T(s'|s, a) is the probability of transitioning from state s to s' when action a is taken.

For each time step t of a finite-horizon problem, the value function $V_t$ is parameterized by a finite set of vectors, called the alpha vectors. Each alpha vector is associated with a particular action and is |S|-dimensional. The alpha vectors at time step t can be written as $\alpha_t^j$ for $j = 1, 2, \ldots$. With each of these alpha vectors, an action $a(\alpha_t^j) \in A_t$ is associated, where $A_t$ is the set of all possible actions at time step t.

Defender's Actions: Each alpha vector (corresponding to a particular action) defines a region in the belief space (partitioning the space) for which the vector is the maximizing element of $V_t$. The value function is given by the maximum value of the dot products of the belief vector $b_t$ and the $\alpha_t^j$ for $j = 1, 2, \ldots$ at time t:

$$V_t^{*}(b) = \max_{j}\; b_t \cdot \alpha_t^{j}$$

Fig. 2. (a) Blocking off one exploit in the lower layer. (b) Partial blocking of the lower and upper layer, securing transition from node N1 to D and entry in node N2. (c) Blocking of the lower layer completely, securing the transition to D. (d) Inspecting node N1. (e) Inspecting node N1 and N2. (f) Complete shutdown


The corresponding optimal action $a_t^*$ for a given belief state $b_t$ is determined from the alpha vector that maximizes the value function, as below:

$$\alpha_t^{*} = \arg\max_{j}\; b_t \cdot \alpha_t^{j}, \qquad a_t^{*} = a(\alpha_t^{*})$$

In this article, the algorithm developed in [17] is used to solve the POMDP model. This solution algorithm uses optimal pruning along with PERSEUS for efficiency. An implementation of the solver is available at www.erwinwalraven.nl/solvepomdp/.

The defender's actions are primarily responsible for how the events progress. The transition probability of going from one state to the next, which corresponds to the attacker's action of exploiting and compromising a node, depends on the modification of the network's functionality that the defender carries out to defend the network. This is carried out by blocking the exploits connecting to a node. The defender can also inspect a node, which gives it the ability to understand the network's state more precisely, or it can choose to rely on the antivirus and default malware protection by not altering the network's functionality (see Fig. 2).

Cost of the Actions: The actions taken by the defender disrupt the normal functioning of the network even for a trusted user. The cost associated with each action is in accordance with the scale of disruption created in the network. For instance, blocking the entry exploit points of node N2 completely blocks the user from accessing N2 as well as the subsequent nodes. A considerable cost is also incurred by the defender when the goal node is compromised. Hence, the defender's ultimate goal remains to defend the goal node while minimizing the network's disruption.

Observations: After taking each action, the defender receives an observation. This observation is modelled as the alarms or security alert notifications generated by the Intrusion Detection System (IDS) as the attacker progresses through the network and compromises the nodes. The probabilistic occurrence of alerts is taken into account using the observation probability for an action, O(o|s', a).
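Putting the pieces of Sect. 4.1 together, a single defender decision step (belief update followed by alpha-vector action selection) can be sketched as below. This is a minimal illustration: the matrices T and O, the alpha vectors and their associated actions are toy placeholders, while the paper itself obtains its alpha vectors from the solver of [17].

```python
import numpy as np

def update_belief(b, a, o, T, O):
    """Belief update b_{t+1}(s') proportional to O(o|s',a) * sum_s T(s'|s,a) b_t(s).
    T[a] has shape (|S|, |S|) with T[a][s, s'] = T(s'|s, a);
    O[a] has shape (|S|, |Omega|) with O[a][s', o] = O(o|s', a)."""
    unnormalised = O[a][:, o] * (b @ T[a])
    return unnormalised / unnormalised.sum()

def best_action(b, alpha_vectors, alpha_actions):
    """Pick the action of the alpha vector maximising b · alpha."""
    values = alpha_vectors @ b            # one value per alpha vector
    j = int(np.argmax(values))
    return alpha_actions[j], float(values[j])

# Toy numbers only (2 states, 2 actions, 2 observations) to show the mechanics.
T = np.array([[[0.9, 0.1], [0.0, 1.0]],   # transition matrix for action 0
              [[1.0, 0.0], [0.0, 1.0]]])  # transition matrix for action 1
O = np.array([[[0.8, 0.2], [0.3, 0.7]],   # observation matrix for action 0
              [[0.8, 0.2], [0.3, 0.7]]])  # observation matrix for action 1
alpha_vectors = np.array([[10.0, -5.0], [2.0, 8.0]])   # placeholder solver output
alpha_actions = [0, 1]                                 # action attached to each vector

b = np.array([0.5, 0.5])                  # current belief state
b = update_belief(b, a=0, o=1, T=T, O=O)  # defender acted and observed an alarm
print(best_action(b, alpha_vectors, alpha_actions))
```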

4.2 Scalability

An algorithm to generate all the possible paths for an attacker is proposed in this article. The importance of this procedure increases significantly for a network with a large number of nodes. Taking the set of entry exploits as E0 = {φ, e1, e2, e3, ..., en}, we define the subsequent sets as E1 = {e1, e2, e3, ..., en}, E2 = {e2, e3, ..., en}, and so on up to En = {en}, where φ is the starting point in which the attacker has not compromised any nodes. For the (n + 1) sets, each set has the first element of the previous set removed. A child of an exploit e is defined as the exploit point(s) that comes immediately after the node connected to the exploit point e. The neighbours of an exploit point e are the output of the attack-path enumeration algorithm. Each exploit point is assumed to be a class object containing the information of its Neighbours (N) and Children (C): e.C stores the information of the children of the exploit point e, whereas e.N stores the information of the neighbours of e. The child of the terminating exploit, that is, the one which has no further children, is a set containing itself, given as e_terminating.C = {e_terminating}. We initialize φ.N as the set of all the starting exploit points. A Flag value can be seen as an interrupter, which prunes out the possible neighbours as the attacker progresses. The recursive algorithm is given below:
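The published listing is not reproduced in this text extraction. The following is a minimal Python sketch of the enumeration as described above; the ExploitPoint class layout, the DFS-style recursion and the prefix-based pruning that stands in for the flag interrupter are reconstructions from the prose, not the authors' pseudocode.

```python
class ExploitPoint:
    """An exploit point e with children e.C and enumerated neighbours e.N."""
    def __init__(self, name):
        self.name = name
        self.C = []   # exploits reachable once the node behind e is compromised
        self.N = []   # feasible attack paths continuing through e (filled below)


def enumerate_paths(e, prefix=()):
    """Recursive DFS enumerating every feasible attack path starting at e."""
    path = prefix + (e.name,)
    if not e.C or e.C == [e]:        # terminating exploit: its child is itself
        return [path]
    paths = []
    for child in e.C:
        paths.extend(enumerate_paths(child, path))
    e.N = paths                      # cache the feasible continuations at e
    return paths


def prune(paths, observed):
    """Flag-style pruning: keep only the paths consistent with the exploits the
    attacker has already been observed to compromise."""
    observed = tuple(observed)
    return [p for p in paths if p[:len(observed)] == observed]


# Toy example: phi (nothing compromised yet) with entry exploits e1 and e2;
# e1 leads on to e3 and e4, while e2 leads on to e4.
e1, e2, e3, e4 = (ExploitPoint(n) for n in ("e1", "e2", "e3", "e4"))
e1.C = [e3, e4]
e2.C = [e4]
phi = ExploitPoint("phi")
phi.C = [e1, e2]   # the paper initialises phi.N with the entry exploits;
                   # here they are stored as phi's children for the DFS

all_paths = enumerate_paths(phi)
print(all_paths)                          # all feasible attack paths
print(prune(all_paths, ("phi", "e1")))    # attacker observed at e1: prune the rest
```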

Fig. 3. Graph (b) shows partial enumeration of all the feasible paths that can be taken by the attacker while attacking the network (a)

The algorithm is similar to a recursive Depth First Search (DFS), and hence its time complexity is similar to that of recursive DFS, which is O(|V| + |E|) in the worst case, where |V| and |E| are the numbers of vertices and edges respectively. However, the size of the graph depends on the network structure. In the network shown in Fig. 3, since there are 6 nodes and each node can either be healthy or compromised, the number of possible states is 2^6. But since whether a node is compromised depends on how the attacker progresses, not all of these possibilities need to be considered. For each recursive call that the algorithm makes, the flag interrupt only looks at the paths ahead that are still possible. When the Flag interrupter is provided at every step, at most n recursive calls are made, in the case where all the exploit points are attacked.

Illustrative Example: As an example (see Fig. 3), the attacker follows the path shown as the thick red line in graph (b) while attacking network (a). Starting from the initial position, e1 is exploited at t = 0, and the possible paths that the attacker can take are all the neighbours of e1 at t = 0. But we can provide a flag interrupter in the recursive calls, indicating that only the branch corresponding to the neighbour e2 has to be considered, hence pruning the rest of the graph. Similarly, e4 and e6 are exploited in the subsequent timestamps. At each step, we prune out the paths which are no longer achievable, hence reducing the computation. Here the timestamps are given just for reference; the attacker may take more than one timestamp to progress along a path due to a failed attempt at compromising a node, but the path remains the same as suggested by the graph.
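Continuing the toy sketch above (and not the six-node network of Fig. 3 itself), the step-by-step pruning described in the example might look like this:

```python
# Hypothetical continuation of the earlier sketch: prune the enumerated paths as
# the attacker is observed compromising one exploit point per time step.
observed = ["phi"]
for exploit in ["e1", "e4"]:             # assumed observation sequence, t = 0, 1
    observed.append(exploit)
    feasible = prune(all_paths, observed)
    print(f"t = {len(observed) - 2}: {feasible}")   # the feasible set shrinks each step
```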

5 Results

The attackers can be categorized into two classes based on their experience and capabilities: (i) inexperienced and (ii) experienced. The two classes are manifested in the transition probabilities of the POMDP model. The inexperienced attacker has zero probability of success against a blocked exploit. In contrast, the experienced attacker, due to better capabilities, has the ability to bypass the security measure with a certain probability and compromise the node. In both cases, the observation probabilities of the IDS generating an alarm are kept the same. The results of the two cases are discussed in the subsequent subsections. Table 1 gives information on the different states of the network, Table 2 gives information on the actions taken by the defender, and Table 3 gives information on the observations made.

Table 1. States of the POMDP model with a description

State  Description
B0     All the nodes are healthy (not compromised)
B1     The node N1 is compromised
B2     The node N2 is compromised
B3     Both the nodes N1 and N2 are compromised
B4     Node D is compromised

Table 2. Actions taken by the defender

Action number  Action description
0   Blocking e11 and e12
1   Blocking e11 or e12
2   Blocking e21 and e22
3   Blocking e21 or e22
4   Blocking e31 and e32
5   Blocking e31 or e32
6   Blocking e41 and e42
7   Blocking e41 or e42
8   Blocking e11 and e12 along with e41 and e42
9   Blocking e11 or e12 along with e41 and e42
10  Blocking e11 and e12 along with e41 or e42
11  Blocking e21 and e22 along with e31 and e32
12  Blocking e21 or e22 along with e31 and e32
13  Blocking e21 and e22 along with e31 or e32
14  Blocking e31 and e32 along with e41 and e42
15  Blocking e31 or e32 along with e41 and e42
16  Blocking e31 and e32 along with e41 or e42
17  Rely on default intruder protection, no action performed
18  Total shut-down of the network
19  Inspect node N1
20  Inspect node N2
21  Complete inspection of the network

Table 3. List of observations with description

Observation  Observation description
0   Alarm at node N1
1   Alarm at node N2
2   Alarm at both the nodes N1 and N2
3   No observation
4   Node D is compromised
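Tables 1, 2 and 3 fix the dimensions of the defender's POMDP: 5 states, 22 actions and 5 observations. One possible way to lay such a model out is sketched below; every numerical value is an illustrative placeholder and not a probability or cost used in the paper.

```python
import numpy as np

# Dimensions fixed by Tables 1-3: states B0..B4, actions 0..21, observations 0..4.
N_STATES, N_ACTIONS, N_OBS = 5, 22, 5

# Placeholder model arrays (the paper's actual probabilities and costs are not
# reproduced here): T[a, s, s'] = T(s'|s,a), O[a, s', o] = O(o|s',a).
T = np.zeros((N_ACTIONS, N_STATES, N_STATES))
O = np.zeros((N_ACTIONS, N_STATES, N_OBS))
R = np.zeros((N_STATES, N_STATES))          # reward/cost of each state transition

# The two attacker classes of Sect. 5 differ only through T: the probability of
# compromising a node whose exploit points are all blocked (values assumed here).
P_BYPASS_BLOCKED = {"inexperienced": 0.0, "experienced": 0.2}
```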


5.1 Inexperienced Attacker

A Monte Carlo simulation is designed in which the belief states are randomly initialized. The state of the network is also initialized, of which the defender has no knowledge. The defender takes actions following the POMDP-based strategy for ten iterations, which is compared with randomly chosen actions. This simulation is run fifty times, and the expected cost incurred is recorded for both the POMDP-based and random-action strategies. The average cost incurred by the random-action strategy is 884.47 units with a standard deviation of 75.04, whereas for POMDP-based actions the average cost is 8.77 with a standard deviation of 6.19 (see Fig. 4). This shows that the expected cost is much lower when the POMDP-based strategy is adopted in uncertain situations. It indicates that for the inexperienced attacker, when the actions are directed by the POMDP strategy, the expected cost incurred at t = 0 is significantly reduced. This is due to the fact that the defender is successful in protecting the goal node in every simulation. In contrast, when random actions are taken, the attacker is successful in compromising the goal node in every simulation.

Fig. 4. Box plot for expected cost incurred by the defender, for 2 different strategies from 50 simulations against the inexperienced attacker

To stop the attacker from compromising the goal node, the defender takes different actions following the POMDP-based strategy as the belief states are updated (see Fig. 5). The belief state corresponding to each action taken has been averaged and shown as bar plots. In the case of the inexperienced attacker, the only action taken throughout the fifty simulations is action 14, i.e., blocking all the exploits reaching the goal node. It can be seen from the plot that the probability of the network being in state B3 (both the nodes N1 and N2 compromised) is the largest. Since the defender blocks the attacker in every instance from reaching the goal node, the attacker is stopped right before the goal node; hence the probability of the network being in state B3 is so high.


Fig. 5. Average belief probabilities with corresponding actions from the 50 simulations, for the inexperienced attacker

5.2 Experienced Attacker

To model this behaviour, the transition probabilities are assigned in such a way that the attacker succeeds in compromising a node with some probability even if the defender is blocking all the exploit points leading to that node. Intuitively, in this case, just blocking the exploits might not be a viable option for protecting the goal node, which is also evident from the simulation outcomes. For the experienced attacker, the average cost incurred under the random action strategy is 864.01 units with a standard deviation of 72.09, whereas for the POMDP-based actions the average cost is 156.35 with a standard deviation of 84.93 (see Fig. 6). It is interesting to note that in the case of the experienced attacker, unlike the inexperienced attacker, the mean cost for the defender is much higher even while following the POMDP-based strategy. This is attributed to the fact that, to mitigate this attacker, the defender has to rely on stricter actions, such as shutting down the entire network, which comes with a heavy penalty. Even against a seasoned attacker, the defender succeeds in protecting the goal node in each of the simulations when it makes decisions using the POMDP-based strategy, whereas when random actions are taken, the defender fails to save the goal node from being compromised. This demonstrates the effectiveness of our methodology for this type of layered network under the threat of cyber-attack. In the case of the experienced attacker, the actions taken form a set of three actions: 18, 19 and 20 (see Fig. 7). Action 18 is the strict action of completely shutting down the network, while actions 19 and 20 correspond to inspecting nodes N1 and N2, respectively. It is interesting to note that action 18 is taken when there is more uncertainty about the state of the network, while actions 19 and 20, which are less expensive, are taken when there is relatively more certainty of being in state B1 or B2. From these observations, it can be said that the POMDP-based strategy dynamically suggests stricter actions (e.g., blocking all nodes) in case of greater uncertainty about the state of the network.


Fig. 6. Box plot for expected cost incurred by the defender, for 2 different strategies from 50 simulations against the experienced attacker

The choice of actions is more economical when there is more evidence of being in a state of lower severity.

Fig. 7. Average belief probabilities with corresponding actions from the 50 simulations, for the experienced attacker


6 Conclusion and Discussion

In this article, a POMDP-based decision strategy is explored for modelling the incomplete-information scenario faced by a defender in a layered network under the threat of cyber-attack. The actions taken by the defender are chosen such that an immediate response to the adversary can be given. The decision strategy is shown to outperform a random action strategy. From the Monte Carlo simulations, it is evident that when the attacker is inexperienced, the less disruptive measure of blocking the exploit points is enough to stop the attacker from penetrating deeper into the network. In the case of an experienced attacker, a stricter action strategy (e.g., a complete network shut-down) is adopted when there is more uncertainty about the state of the network; when there is less uncertainty, the defender makes more informed decisions by taking less disruptive actions. The simulation of these two cases sheds light on the defender's adaptability to different scenarios. This solution framework may be scaled to larger networks, with the potential to save substantial resources by optimally choosing the action strategy. Another contribution of this article is towards addressing the problem of state-space explosion. The problem is addressed by enumerating all the possible paths taken by the attacker; in this process, the states that are unreachable because of the network constraints are automatically pruned from the universal set of states. This algorithm is used to partially enumerate a relatively larger network, and the results are presented. However, one of the shortcomings of the model is that the defender does not know the nature of the attacker beforehand. Further, the cases where more than one attacker attacks simultaneously are not considered; multiple attackers would add to the complexity of the model. The cost of actions under different states also needs to be estimated carefully from past data. Modelling the defender's strategy under an uncertain cost structure could be a potential avenue for future research.


Monitoring, Recognition and Attendance Automation in Online Class: Combination of Image Processing, Cryptography in IoT Security Pritam Mukherjee, Abhishek Mondal, Soumallya Dey, Avishikta Layek, Sanchari Neogi, Monisha Gope, and Subir Gupta(B) Department of Master of Computer Application, Dr. B. C. Roy Engineering College, Durgapur, West Bengal 713206, India [email protected]

Abstract. According to the Oxford and Cambridge dictionaries, monitoring means uninterrupted observation of a particular circumstance for a specific period in order to learn something new from it; in a word, we can tag it as "supervision." Regarding automation, both dictionaries define it as work executed by self-operating machinery without any human control; "mechanization" is a substitute for the same. During an online class, continuous monitoring is essential, and taking attendance is an obligatory task that demands additional effort and time beyond the class hours. But what if both of these exigencies come under one umbrella, with a fresh aspect and a firm conviction? IoT security and automation have been combined here to make this possible. This paper is an amalgamation of uninterrupted monitoring and guaranteed genuine automation of attendance marking. It includes data encryption using Fernet cryptography to prevent manipulation. Another contribution of this paper is that it shows how to detect human faces dexterously using a Haar cascade and a shape predictor. The paper proposes sharp face authentication and discrepancy elimination, and acts as a selectively permeable membrane for the class, providing a substantial replacement for manual attendance. The intention behind this work is to bring ease to online monitoring and the attendance-taking system. The proposed system exhibits a minor error of 4%, measured using percentage error. Keywords: Cryptography · Eye blinking · Image processing · IoT security · Online class monitoring

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
D. Giri et al. (Eds.): ICNSBT 2021, LNNS 481, pp. 18–27, 2022. https://doi.org/10.1007/978-981-19-3182-6_2

1 Introduction

We have become accustomed to the term "online class" nowadays due to the worldwide intimidating pandemic "COVID-19" [12]. If we step into the time machine that we all carry with us, that is, our "mind," and go back to those earlier days, as far as we can recall, we would not be able to see such expeditious and widespread use of these


platforms or any such arrangements. In that era, though the idea was present, the technology was not at the facilitated stage it is now. We did not use the technology as extensively as we are using it now, so we simply could not imagine the prevalent use of "virtual learning" [17]. But now it has become a very salient part of our daily life, and the student fraternity is well acquainted with this system at present. It also carries a sweet but tedious name, "online class" [16]. Why tedious? Because students have faced this situation abruptly. The brains of younger students (kindergarten and primary level) are not yet fully developed, so it is quite hard for them to adjust to this "fun-less" situation; they feel bored with the system, but as they have no other alternative, they have to accept it and become accustomed to it. Primary-level students are reluctant to attend these classes, as they are not interested in such courses [23]. The most striking point is that kindergarten students, whose syllabus is only to learn the ABCs, 1234 and similar material, are also facing online classes. They are the smallest buds of the education sector, and they usually do not want to sit intently through their respective courses. In some cases, both parents are working, so they cannot always oversee their children's education properly [7]. If a child is not attentive in class, it will be reflected in their studies in the future as well. Thus, it is indispensable to have an advanced monitoring system [2], an area in which to ensure that students are paying attention properly. It is therefore necessary to verify whether a student is attending the class properly and, after joining the course, whether that student remains attentive to it. So, keeping all these things in mind, it is necessary to make the attendance and monitoring system more capable [19], and that is why we have carried out this research work in combination with IoT [5]. After several literature reviews, it is time to look at the technical issues that we found as gaps in this research domain; these are matters to check, identify and fill. So, let us spotlight the technical issues. We usually see network issues during an online class: sometimes the sound or audio does not come through properly, and sometimes the audio arrives earlier than the visual, so the continuity of the class gets hampered. Another problem is that sometimes an unknown person enters the ongoing class and creates a nuisance; unauthorized access is not prevented. Here comes another issue: face recognition using image processing in an online course. We have seen cases where a student leaves the class while appearing to be present, by placing a dummy face or a still picture of their face in front of the camera. There is also the matter of time measurement: if the meeting host notices any irregularity in a class, the host terminates the student, but if the host overlooks some error inside the class, the overall purpose of the online course gets hampered. Documentation of the disturbance is also a problem: we need to identify the noise or disturbance very quickly, and before terminating a student, documentation is essential and should be kept securely in an encrypted format. These are the problems we have found as gaps in online classes and in monitoring students during an online course.
As the domain is relatively extensive, we cannot address all the mentioned issues and debug them under a single project at a time; that would need a larger comparative project. So, we have tried to address some of the research gaps here in our proposed system: some of the problems are fully managed, while others are partially addressed. Further details are discussed in the following sections. We have


described the rest of the paper in the following manner: Sect. 2 presents the literature review, Sect. 3 the methodology of the proposed system, Sect. 4 the result analysis (using diagrams and tables), and Sect. 5 the conclusion and future aspects.

2 Literature Review

The section "Literature review," also known as "Related works" or "Literature survey," presents the background analysis and prior research on the topic. IoT, internet security, image processing, data science, etc., are zones where machine learning can be applied to improve such systems [9, 10, 11]. Customarily, in an online monitoring system, two regions should be promoted, perceived, and given exceptional attention [21]: salient monitoring and reliable mechanized record-keeping [14]. Hence, aiming to monitor and to retain and trail computerized records, several mechanisms have been widely favoured [24]; a few excerpts of them are presented here. The face recognition system is an extensively used and most favoured technique in students' online monitoring systems, and researchers have used and advocated this approach persistently; usually, it scans the frontal face. Several face recognition approaches exist. Eigenface is among the most extensively used facial recognition and identification procedures [3]; it is an instance of principal component analysis in mathematics, and eigenvectors can be organized to constitute various contrasts of human faces. ANN (Artificial Neural Network) is also primarily used to distinguish human faces [8]; it supports a single layer that proves pliable in the noisy face detection process and performs face attestation using a bilayered WISARD network. HMM (Hidden Markov Model) is another approach that we can apply in face recognition [15]; in this process, the human face is separated into different segments like eyes, ears, nose, etc. This procedure gives almost 87% correct and accurate face detection results on a selected dataset; otherwise, an analogous model can detect the identity of the face. Template matching is a technique for face recognition in which the trial pictures are constituted as a 2D array of values and compared with a template using Euclidean distance; a whole face can be represented this way. This process is adept at judging a face from different angles or viewpoints and finally detects a single face [20]. Researchers have proposed "Radio Frequency Identification" for attendance monitoring. RFID technology is preferable to other technologies for automatic monitoring systems because it can overcome the restraints of other involuntary spotting processes [18]. This process gathers particulars in a mechanized manner by the use of radio-frequency data communication between a mobile object and an RFID reader [1]. A typical reader is a device that accommodates one or more antennas, from which radio waves are emitted and reverse signals are accepted from the tag; the same can be used to read and write data on a label and hand the data over to an apparatus for storage and processing. Fernet cryptography (associated with IoT security) is used in encoding and decoding [6]. It assures that, where an informative message has been concealed through this process, it is impossible to perpetrate any alteration or to unfold the message for reading if one does not have the decoding key. This process


is also known as a “Secret Key”/“Symmetric Authenticated Cryptography” [13]. It also facilitates to implementation of the critical rotation through MultiFernet. Regarding encryption-decryption, it spawns a very new fernet key which is the preeminent artifact to retain robustly by the user. It is the only way to decrypt a message encrypted by fernet cryptography. One of the prevailing and illustrious algorithms in the face recognition field is the K-Means algorithm. The usage is for clustering the face attributes [4]. An indispensable algorithm to resize an image is Nearest Neighbor Interpolation. Interpolation is required when one wants to resize or remap any image starting from a pixel grid and ending to another specific grid. Haar Cascade is an acclaimed customizable algorithm. We use it to pinpoint substances. This method’s actual usage or application is to descry the front side of a face and its segments, like eyes, ears, nose, mouth, etc. Regarding Shape, Predictors, also called Landmark Predictors, are utilized for forecasting and circumscribing the critical coordinates of a predisposed shape (chiefly in the human face) [22].

3 Methodology

We have chronicled the entire procedure in pictorial/diagrammatic form in the "Methodology Diagram" of Fig. 1. The system is proposed for students, so the term user signifies a student here. Students have to log in to the proposed platform using predefined credentials when the class starts: the student's name, roll number, and the subject code of the specific class being attended. These credentials are stored in the database, and the system authenticates the given information. The system will authorize the student to enter, or will terminate the attempt, depending on the accuracy of this information. An image folder and a CSV file are created following this verification. After entering, the camera will capture the faces (K-means algorithm) and the image will be rescaled (nearest neighbour interpolation algorithm).

Fig. 1. Methodology diagram of online student monitoring and attendance automation

Table 1. Pseudocode of methodology


The authentication then takes place, determining whether the student is known or unknown (using the Haar cascade and shape predictor algorithms). After accumulating all these data in a specific database, the session starts. A counter of 10 min is kept for unknown faces, and if the system fails to recognize the face within that time, it discards it. During the session, the camera must remain on to ensure uncompromising monitoring by the faculty. Along with that, the system checks through eye blinking whether a natural or a dummy face is present in the class; if a pre-recorded video is being streamed from a student's end, that student is immediately terminated. The system thus has the power to remove forgery and keep only the genuine student in the class. The proposed system then records the attendance in a conclusive CSV file (system call + CWD), which is encoded just after the session concludes. Eventually, it reaches the teacher in encrypted form, and only the teacher has the authority to decode it and mark the attendance. Encoding and encryption (the Fernet cryptography algorithm) assure the security, i.e., a guarantee of a manipulation-free system. Corresponding to Fig. 1, the pseudocode is shown in Table 1; we now explain it. Arrange every captured image into a 2D matrix representation. Fetch the width (width_img) and height (height_img) of the image and insert the values into p_u and p_v, respectively. Iterate (width_img, height_img) times to scale down the image matrix from B(u, v) to A(m, n), where u is the pixel count of the image in every row and v is the pixel count in every column. Next, determine whether the output image O(u) contains a face or not. If yes, find the matching face among the images stored in the database; if no match is found, set O(u) = 0, else increment the value of O(u). Then analyze the values of K(i) and M(0), which denote the total positive image count and the total detected image count. Finally, encrypt the machine-generated attendance data, Bytes_array: convert Bytes_array into the secretdata[] array; to decrypt the data, convert secretdata[] back into plaintext using Fernet cryptography.
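A compressed sketch of this pipeline is given below. It is an illustrative approximation, not the authors' implementation: the Haar cascade file shipped with OpenCV, the 64 × 64 resize target, the simple mean-absolute-difference matcher and its threshold, and the file names are all assumptions introduced here.

```python
# Condensed sketch of the attendance pipeline: capture -> resize -> detect ->
# match -> log -> encrypt. Thresholds and file names are illustrative only.
import csv
import cv2
import numpy as np
from cryptography.fernet import Fernet

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face(frame):
    """Return a rescaled grayscale face patch, or None if no face is found."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    # Nearest-neighbour interpolation to rescale the face patch, as in the text.
    return cv2.resize(gray[y:y + h, x:x + w], (64, 64),
                      interpolation=cv2.INTER_NEAREST)

def is_known(face, enrolled, threshold=40.0):
    """Match against enrolled images with a simple mean absolute difference."""
    return any(np.abs(face.astype(float) - ref.astype(float)).mean() < threshold
               for ref in enrolled)

def log_and_encrypt(rows, key, csv_path="attendance.csv"):
    """Write the attendance rows to a CSV file and store a Fernet-encrypted copy."""
    with open(csv_path, "w", newline="") as fh:
        csv.writer(fh).writerows(rows)
    with open(csv_path, "rb") as fh:
        token = Fernet(key).encrypt(fh.read())   # only the teacher holds the key
    with open(csv_path + ".enc", "wb") as fh:
        fh.write(token)
```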

4 Result Analysis

After assembling all the technologies pertinent to the problem definition and the proposed system, we carried out the testing several times to obtain the desired results. Having observed the entire process and the functioning of the combined technologies, we now turn to the examination of the outcome, known as "result analysis."

Fig. 2. a) Test case 1 (Known face) b) Test case 2 (Known face) c) Test case 3 (Known face)

The fundamental function of recording attendance is the bona fide detection of students' faces. To do this, the camera first captures the image of the student sitting in front of it.


The photos are captured at a predetermined, unvarying time interval; the fixed interval for capturing pictures in the intended system is 20 s. So, as per this reckoning, the image capture rate is one picture per 20 s, and consequently the mean (and highest) value of the face-match count per minute is 3. The count may oscillate over time, but around this mean value of 3, and this manifests clearly in the graphs used to try out the detection process. To test this, the face is purposefully kept away from the front of the camera, and as a result the graph clearly drops due to this activity. The graphical delineation is observed in this testing section (regarding known and unknown face identification), and the corresponding clarification is given. The student's face is enlisted previously in the database, so a comparison is made between the saved image and the captured image. Every 2 min of capture, we obtain six images. Figure 2(a, b, c) is for the known face, which the system matches and successfully acknowledges; the graph is "straight" in those particular cases, meaning the chart stays flat whenever successful identification occurs, as shown in Fig. 2.
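The capture schedule can be expressed as a small loop, sketched below under the stated 20-s interval; capture_frame and matches_enrolled are hypothetical stand-ins for the capture and matching steps sketched earlier.

```python
# Sketch of the fixed capture schedule: one frame every 20 s, i.e. three frames
# per minute (six per two-minute window), so a fully matched minute scores 3.
import time

CAPTURE_INTERVAL_S = 20      # one picture per 20 seconds
FRAMES_PER_MINUTE = 3

def match_count_per_minute(capture_frame, matches_enrolled):
    """Return how many of this minute's frames matched an enrolled face (0-3)."""
    count = 0
    for _ in range(FRAMES_PER_MINUTE):
        frame = capture_frame()              # grab one webcam frame
        if matches_enrolled(frame):          # known face detected in this frame?
            count += 1
        time.sleep(CAPTURE_INTERVAL_S)
    return count
```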

Fig. 3. a) Test case 1 (Unknown face) b) Test case 2 (Unknown face) c) Test case 3 (Unknown face)

On the other hand, the graphs in Fig. 3(a, b, c) are assigned to unknown face detection. When the camera detects an unfamiliar face, the match-count graph goes down and the unknown-count graph goes up; in short, the two charts of each test simply mirror each other, as the conditions are reversed. To get the appropriate result, both situations have been tested by intentionally showing and hiding the face. Both the tests and the resulting graphs imply that the algorithms are executing correctly, and they also signify whether the students are physically present in the class and whether a student is in a volatile mode or not. Now we come to the next part of the testing: the algorithm tests whether the face presented in front of the camera is original or a dummy. This testing is done through "eye nictitation" (blinking), and Fig. 4 is designated as evidence of the same. The eye-blink ratio (predefined here) is ten times per minute (minimum), and the figures marked 4(a, b, c) are its corroboration. Its motive is to identify the activity of the student in the class, i.e., whether the student is watchful during the course can be tested through this procedure. If a student is genuinely attending, then the original face will be spotted and displayed, and the face will be proclaimed as an original face by


the system. Finally, the credentials are stored by the system and are encoded and encrypted using the described methods when the session ends. The system holds all the data in a CSV file in tabular format, which acts as the ultimate attendance sheet. The file is encoded to prevent manipulation and encrypted to avoid any discrepancy; the teacher decodes the file and views the result using the encryption key. This outcome indicates that secure storage has been fully achieved. The overall proposed system has been tested 30 times over different cases, and the pie charts given here provide a clear picture of the attendance-monitoring authentication. The criterion assumes 75% attendance, and the pie chart in Fig. 5(a) is evidence of the same; the remaining two pie charts (Fig. 5(b) and 5(c)) show the variations of attendance tested here. All the pie charts (Fig. 5) clearly show that the result is genuine and the testing is successful. We have tried our proposed system from three different points of view, testing the design 30 times for each. The three perspectives are: the student is physically present during the online class (test case 1); the student has left the course and an unknown person is sitting there instead (test case 2).

Fig. 4. a) Test case 1 (Eye blinking) b) Test case 2 (Eye blinking) c) Test case 3 (Eye blinking)
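A minimal sketch of the eye-blink (nictitation) liveness check illustrated in Fig. 4 is given below, using dlib's 68-point shape predictor and the common eye-aspect-ratio heuristic. The landmark model file, the EAR threshold, and the use of the EAR heuristic itself are assumptions introduced here, since the paper does not specify how blinks are detected internally.

```python
# Sketch of the eye-blink liveness check: a face producing fewer than ~10
# blinks per minute is treated as a dummy/still image.
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
LEFT_EYE, RIGHT_EYE = slice(42, 48), slice(36, 42)   # 68-point landmark indices
EAR_THRESHOLD, MIN_BLINKS_PER_MIN = 0.21, 10         # assumed values

def eye_aspect_ratio(eye):
    """EAR = (||p2-p6|| + ||p3-p5||) / (2 * ||p1-p4||); drops when the eye closes."""
    a = np.linalg.norm(eye[1] - eye[5])
    b = np.linalg.norm(eye[2] - eye[4])
    c = np.linalg.norm(eye[0] - eye[3])
    return (a + b) / (2.0 * c)

def count_blinks(gray_frames):
    """Count EAR dips below the threshold over one minute of grayscale frames."""
    blinks, closed = 0, False
    for gray in gray_frames:
        faces = detector(gray)
        if not faces:
            continue
        pts = predictor(gray, faces[0])
        coords = np.array([[pts.part(i).x, pts.part(i).y] for i in range(68)])
        ear = (eye_aspect_ratio(coords[LEFT_EYE]) +
               eye_aspect_ratio(coords[RIGHT_EYE])) / 2.0
        if ear < EAR_THRESHOLD and not closed:
            blinks, closed = blinks + 1, True
        elif ear >= EAR_THRESHOLD:
            closed = False
    return blinks

def is_live(gray_frames):
    return count_blinks(gray_frames) >= MIN_BLINKS_PER_MIN
```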

Fig. 5. a) Attendance ratio (Test case 1) b) Attendance ratio (Test case 2) c) Attendance ratio (Test case 3)

Finally, the student has left the class after logging in to the course (test case 3). We want to mention that the system performs all detection and recognition within 10 min; it detects the error within that time slice and terminates the student from the class. Coming to the first case, a student has logged in to the course, attended the class appropriately, and at the end of the class, after responding to the respective attendance call, left the class. We have represented this case using the first pie chart: attendance ratio (test case 1). In the following case, during an online class a student leaves the course and a dummy face or someone else sits in front of the camera; the system detects that error and terminates the student within 10 min. We have


represented this case using the second pie chart: attendance ratio (test case 2). Another scenario is that a student enrolls in the class and then drops out after a short period; if the system cannot detect the student's face within 10 min, it treats that student as absent and terminates him from the class. We have represented this case in the third pie chart: attendance ratio (test case 3). All the attendance-related pie charts were developed after 30 trial runs for each individual test case, and the generated values were converted into the pie charts shown in Fig. 5. We stored the testing results and made these pie charts for the three different scenarios. For every case, we ran the test 30 times, and overall, when we computed the error across the three cases, we found a success rate of 96%, i.e., less than 5% error. So, we can consider the system's accuracy rate to be 96%; measured using percentage error, the proposed approach has an accuracy of 96%, which is satisfactory.
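For clarity, the percentage-error arithmetic can be written out as below; the per-case success counts are placeholders chosen only to illustrate how a roughly 96% figure can arise from 30 trials per case, and are not the authors' data.

```python
# Illustrative percentage-error/accuracy computation over the three test cases.
def accuracy(successes, trials=30):
    return 100.0 * successes / trials

cases = {"test_case_1": 29, "test_case_2": 29, "test_case_3": 28}   # hypothetical
per_case = {name: accuracy(ok) for name, ok in cases.items()}
overall = sum(per_case.values()) / len(per_case)
error = 100.0 - overall
print(per_case)
print(f"overall accuracy ~ {overall:.1f}%, percentage error ~ {error:.1f}%")
```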

5 Conclusion

Through the proposed system we could address the duplex mode of the class and enhance it successfully up to a certain extent. The strict and keen monitoring from the teacher's end efficiently addresses the duplex mode of an online class, and the teacher can observe and scrutinize the overall behaviour inside the class. One of the most advantageous parts of the proposed system is data encryption to avoid manipulation, and bias is avoided by keeping records in the image folder and CSV file: if a student raises any claim, the authority can challenge it, as the evidence is present on both ends. Through the execution process, we have observed that the proposed system runs successfully with an accuracy rate of 96%, which indicates that the system performs acceptably. The online class runs on an IoT platform, so the scheme is proficient at fending off misuse of the same. It has some constraints too: the system could be more advanced if machine learning techniques were applied, and uncertain brightness may hamper the image-capturing process. It is also presumed that a person blinks ten times per minute, whereas the blinking rate varies from person to person. Though it has these limitations, the proposed system is still quite adequate, given its results.

References
1. Alhanaee, K., et al.: Face recognition smart attendance system using deep transfer learning. Procedia Comput. Sci. 192, 4093–4102 (2021). https://doi.org/10.1016/j.procs.2021.09.184
2. Bhatti, K., Mughal, L., Khuhawar, F., Memon, S.: Smart attendance management system using face recognition. EAI Endorsed Trans. Creat. Technol. 5(17), 159713 (2018). https://doi.org/10.4108/eai.13-7-2018.159713
3. Erwin, et al.: A study about principle component analysis and eigenface for facial extraction. J. Phys. Conf. Ser. 1196(1), 012010 (2019). https://doi.org/10.1088/1742-6596/1196/1/012010
4. Farhan, H.R., et al.: Face recognition system based on continuous one-state model, 050001 (August 2019)


5. Farhan, M., et al.: IoT-based students interaction framework using attention-scoring assessment in eLearning. Futur. Gener. Comput. Syst. 79, 909–919 (2018). https://doi.org/10.1016/j.future.2017.09.037
6. Faritha Banu, J., Revathi, R., Suganya, M., Gladiss Merlin, N.R.: IoT based cloud integrated smart classroom for smart and a sustainable campus. Procedia Comput. Sci. 172, 77–81 (2020). https://doi.org/10.1016/j.procs.2020.05.012
7. Ghaffarian, S., Valente, J., van der Voort, M., Tekinerdogan, B.: Effect of attention mechanism in deep learning-based remote sensing image processing: a systematic literature review. Remote Sens. 13(15), 2965 (2021). https://doi.org/10.3390/rs13152965
8. Guo, G., Zhang, N.: A survey on deep learning based face recognition. Comput. Vis. Image Underst. 189, 102805 (2019). https://doi.org/10.1016/j.cviu.2019.102805
9. Gupta, S., et al.: Automatic recognition of SEM microstructure and phases of steel using LBP and random decision forest operator. Measurement 151, 107224 (2020). https://doi.org/10.1016/j.measurement.2019.107224
10. Gupta, S.: Chan–Vese segmentation of SEM ferrite–pearlite microstructure and prediction of grain boundary. 10, 1495–1498 (2019). https://doi.org/10.35940/ijitee.A1024.0881019
11. Gupta, S., et al.: Modelling the steel microstructure knowledge for in-silico recognition of phases using machine learning. Mater. Chem. Phys. 252, 123286 (2020). https://doi.org/10.1016/j.matchemphys.2020.123286
12. Ilieva, G., Yankova, T.: IoT in distance learning during the COVID-19 pandemic. TEM J. 9(4), 1669–1674 (2020). https://doi.org/10.18421/TEM94-45
13. John, N., Philip, A.: FERNET System 3(1), 1–3 (2021). https://doi.org/10.5281/zenodo.5090540
14. Khan, M., et al.: Face detection and recognition using OpenCV. In: Proc. 2019 Int. Conf. Comput. Commun. Intell. Syst. (ICCCIS 2019), pp. 116–119 (2019). https://doi.org/10.1109/ICCCIS48478.2019.8974493
15. Lal, M., et al.: Study of face recognition techniques: a survey. Int. J. Adv. Comput. Sci. Appl. 9(6), 42–49 (2018). https://doi.org/10.14569/IJACSA.2018.090606
16. Lemay, D.J., et al.: Transition to online learning during the COVID-19 pandemic. Comput. Hum. Behav. Rep. 4, 100130 (2021). https://doi.org/10.1016/j.chbr.2021.100130
17. Mishra, L., Gupta, T., Shree, A.: Online teaching-learning in higher education during lockdown period of COVID-19 pandemic. Int. J. Educ. Res. Open 1, 100012 (2020). https://doi.org/10.1016/j.ijedro.2020.100012
18. Mukherjee, T.: RFID based attendance management system. Int. J. Res. Appl. Sci. Eng. Technol. 9(VI), 268–275 (2021). https://doi.org/10.22214/ijraset.2021.34904
19. Muzaferija, I., et al.: Student attendance pattern detection and prediction. J. Eng. Nat. Sci. 3(1) (2021). https://doi.org/10.14706/jonsae2021313
20. Orrù, G., Marcialis, G.L., Roli, F.: A novel classification-selection approach for the self updating of template-based face recognition systems. Pattern Recognit. 100, 107121 (2020). https://doi.org/10.1016/j.patcog.2019.107121
21. Shen, Y., et al.: Smart classroom learning atmosphere monitoring based on FPGA and convolutional neural network. Microprocess. Microsyst. 103488 (November 2020). https://doi.org/10.1016/j.micpro.2020.103488
22. Shetty, A.B., et al.: Facial recognition using Haar cascade and LBP classifiers. Glob. Transit. Proc. 0–12 (2021). https://doi.org/10.1016/j.gltp.2021.08.044
23.
Tarik, A., et al.: Artificial intelligence and machine learning to predict student performance during the COVID-19. Procedia Comput. Sci. 184, 835–840 (2021). https://doi.org/10.1016/j.procs.2021.03.104
24. Taskiran, M., et al.: Face recognition: past, present and future (a review). Digit. Signal Process. 102809 (2020). https://doi.org/10.1016/j.dsp.2020.102809

Cyber Threat Phylogeny Assessment and Vulnerabilities Representation at Thermal Power Station Vinod Mahor1(B) , Bhagwati Garg2 , Shrikant Telang3 , Kiran Pachlasiya4 , Mukesh Chouhan5 , and Romil Rawat6 1 IES College of Technology, Bhopal, Madhya Pradesh 462001, India

[email protected] 2 Union Bank of India, Branch, Gwalior, MP 474001, India 3 Shri Vaishnav Vidhyapeeth Vishwavidyalaya, Indore, MP 452001, India 4 NRI Institute of Science and Technology, Bhopal, MP 462001, India 5 Government Polytechnic College, Sheopur, MP 476337, India 6 Shri Vaishnav Vidhyapeeth Vishwavidyalaya, Indore, MP 452001, India

Abstract. Cyber security issues relating to thermal power stations (TPS) are a major focus of concern in the development of digital computing equipment. Stuxnet, the computer virus that wrecked Iran's uranium enrichment facility, implies that a TPS may be vulnerable to cyber-attacks that result in the discharge of hazardous elements. However, in comparison with information technology (IT), the analysis of cyber security issues on ICS and SCADA (industrial control systems and supervisory control and data acquisition systems, respectively) is lacking, and analyzing cyber-attack phylogeny for TPSs is difficult given the characteristics of ICSs. Cyber-attack phylogeny research has progressed, but it does not reflect the inherent features of TPSs and lacks a systematic remedial plan. As a result, as outlined in RG-5.71 and RS-015, it is important to further investigate systematically the stability of regulators (operators) in relation to cyber security. This study proposes a framework for cyber-attack phylogeny based on TPS features and uses the template to illustrate a specific cyber-attack situation. Furthermore, by matching the remedies to essential digital assets and devices (DAD), this article presents a systematic and effective technique (SET). The cyber-attack examples studied with the suggested cyber-threat phylogeny are to be used as data for evaluating and validating cyber-threat and security conformance for the DAD to be employed, as well as for mitigation and functional protection against TPS cyber-threats. Keywords: Thermal power station · SCADA · Cyber security · Attack phylogeny

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
D. Giri et al. (Eds.): ICNSBT 2021, LNNS 481, pp. 28–39, 2022. https://doi.org/10.1007/978-981-19-3182-6_3

1 Introduction

Automation mechanisms of nuclear power plants (NPPs), also called thermal power stations (TPS), are transitioning from analogue to digital control and instrumentation (C&I)


devices, because analogue C&I devices perform poorly compared to digital C&I devices [1, 2], and analogue gadgets are more difficult to maintain. Cyber-attacks (CA) and security breaches against ICS and SCADA systems are documented in the ICS-CERT report [3]. Stuxnet, a computer virus that physically destroyed the centrifuges of Iran's uranium enrichment facility, is an example of a CA on an NPP. Furthermore, in the USA, the Davis-Besse TPS was attacked: a Slammer virus (malware) [21] corrupted the safety status (alerting) indication system, causing it to fail, and it remained inoperable for the next 5 h. In Korea, the computer network (CN) of Korea Hydro and Nuclear Power (KHNP) was penetrated, and intruders acquired the TPS's design (functioning) documents and manuals, as well as employees' personal information [4]. To prevent and respond to cyber threats as they become more prevalent, it is vital to identify the expected CA against TPSs and assess cyber security compliance for DAD that can provide reliability [5] and performance. NPPs, reactors, some navy fuel facilities, uranium enrichment plants, fuel fabrication factories, and uranium mines are all candidate targets for threats that might cause widespread radioactive contamination. Threats such as an air raid on a reactor complex, cyber-attacks (CA) [6], or commando-style ground attacks on equipment could, if successful, result in a reactor core meltdown and widespread radioactivity dispersal. Thermal power facilities were first evaluated as prospective targets of the September 11, 2001 attacks (USA 9/11 Commission) [6]. Such a threat may result in widespread radioactive contamination by terror cells [18] and might also cause a power plant's safety system to break down, resulting in a core meltdown and/or significant damage to spent fuel pools. According to the Federation of American Scientists, in order for thermal power to become more commonly employed, thermal stations must be made significantly more secure against dangers that may release huge amounts of radiation into the atmosphere. Passive nuclear safety elements in new reactor designs may be beneficial. The Nuclear Regulatory Commission in the USA [7] conducts exercises at all TPS locations at least once every three years. Thermal-neutron reactors have been popular targets during armed conflict and have been repeatedly targeted by military air threats during the past two or three decades. Since 1980, Plowshares [3] has shown how nuclear weapons sites may be accessed through different acts of civil disobedience, with the group's actions constituting notable security breaches at nuclear weapons locations in the USA. The National Nuclear Security Administration (NNSA) acknowledged the significance of the 2012 Plowshares action. Nuclear non-proliferation policy experts have questioned the use of civilian contractors to provide security at sites where the government's most dangerous military technology is created and stored [4]. The black market in nuclear weapons materials [26] is a worldwide worry, as is the risk of a militant group detonating a dirty bomb in a major city.

1.1 Military Threats

Thermal-neutron reactors have been popular targets during armed confrontations and have been repeatedly targeted during military threats, occupations, conquests, and campaigns over the previous three decades [18] (Table 1).

Table 1. Cyber-threat activity

Duration             Activity
March 2021           A hacker group (RedEcho), reportedly supported by the Chinese government, targeted India's power grid and transportation sector
25 March 1973        The People's Revolutionary Army briefly took over the Atucha I thermal station in Argentina before it was completed, stealing an FMK-3 submachine gun and three .45 calibre handguns from a security unit and exchanging fire with the officers as they withdrew, injuring two policemen [13]
30 September 1980    The Islamic Republic of Iran's Air Force launched Operation Scorch Sword, a surprise bombing of the Al Tuwaitha thermal plant in Ba'athist Iraq during the Iran-Iraq War. The thermal-neutron reactor was damaged in the raid, which took place 17 km southeast of Baghdad [8]
June 1981            The Israeli Air Force carried out Operation Opera, which completely destroyed Iraq's Osirak thermal research facility [9]
8 January 1982       On the anniversary of the founding of the African National Congress (ANC), the ANC's military wing attacked the Koeberg thermal power plant (then under construction), planting four limpet mines [8]. The explosions caused R 500 million in damage, and the plant's activation was delayed by several months [9]
1984–1987            Iraq struck the Bushehr thermal station six times [10, 18]
1991                 During the Persian Gulf War, the US Air Force destroyed three thermal-neutron plants along with an enrichment pilot plant, and Iraq launched Scud missiles towards Israel's Dimona thermal station during its missile threats against Israel and Saudi Arabia
September 2007       Israel attacked a Syrian reactor under construction in the Deir ez-Zor Governorate [18]

The remainder of the paper is arranged as follows: Sect. 2 discusses the phylogenetic attributes of TPSs; Sect. 3 highlights the cyber-threat phylogeny categorization scheme; Sect. 4 discusses the template for a phylogeny and its application in a conformance test; and Sect. 5 concludes the paper.

2 Phylogenetic Attributes of TPS

Advanced CA phylogeny study is mostly focused on the information technology (IT) industry, with the phylogeny for ICS and SCADA [12] concentrating on the concept of threat. Furthermore, phylogenetic studies are being carried out on specialized facilities, such as energy plants. Based on the threat type, attack target [26], vulnerabilities, and payload, Hansman [1, 10–12] suggested a categorization of computer and network threats into four dimensions. The attack class is a type of threat vector that categorizes a wide range of CAs by threat method. For example, password cracking is Level 1,


guessing is Level 2, while brute force is Level 3. The threat category for the attack target is expressed in terms of the CA objects; the threat target [2], for example, is classified into several categories: level 1 hardware, level 2 computers, level 3 network devices, and level 4 switches. The Common Vulnerabilities and Exposures (CVE) system is used to classify vulnerabilities, and Howard's vulnerability classification [13] was employed for implementation, design, and configuration vulnerabilities. The threat's payload is broken into three categories: information leaking, destruction, and incapacitation of service. By studying how the control system is treated, Fleury [1, 14] proposed an Attack-Vulnerability-Damage (AVD) model. The term "attack" represents the threat's repercussions, how to respond to the threat, and the defense mechanism's requirements regarding the attack (target, source, and technique). The term "vulnerability" refers to the reason for an attack's success as well as system flaws. The term "damage" refers to the severity of the threat. Taking the threat vector, operational effect, defense, information impact, and threat target into account, Simmons' phylogeny [15] presented a CA categorization model dubbed AVOIDIT. A vulnerability [26] and attack path are referred to as a threat vector. The object of a threat, such as a system (network), is known as the attack target. Keith [1, 16] suggested a phylogeny that considers the susceptibility of a particular system as well as the impact of cyber-threats on the community. Keith's phylogeny is divided into two categories: event vectors and effect vectors. An incident vector tells us about the source of the CA, the target of the threat, and the characterization of the CA, in addition to the methods and vulnerability. An effect vector identifies the community sectors that have been impacted by the CA, as well as the cause of the damage and the criteria used to measure the impact. The features of targeted threats were the subject of [17]; the categories were divided into four groups based on the objective of the threat, the initial threat vector, lateral mobility, and the location of the command and control (CC) server [1]. The path of the attack is the initial threat vector. After infecting the system, the threat uses lateral movement to spread throughout the system. The location of the CC server indicates how an intrusion could be attempted again. Flowers [14] suggested a categorization method in which the type of attacker is characterized case by case using an event-based matrix, depending on the industry, region, and virus type [2]. Dorottya [18, 19] also looked at embedded-system threats and weaknesses. To estimate cyber security threats, CMU [2, 8] considered system operation, human behaviour, system failures, internal process flaws, and external events. The existing phylogeny of CA is mostly focused on the network, and it does not take into consideration system architecture quirks such as those of TPSs. The architectural attributes are the design paradigm of the facilities. The Reactor Protection System (RPS) in typical Korean TPSs [1] contains four PLCs [2] in 2-from-4 logic, ensuring that even if a single PLC is destroyed by a CA, the regular operation of the TPS is not endangered. This indicates that, in order to have an effect on how a TPS works, a CA on the PLCs must disable three to four PLCs at the same time. It is not reasonable to assess the severity of CA and its treatments without considering the architectural nature of TPSs.
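The 2-from-4 coincidence logic mentioned above can be illustrated with a tiny voting function; this is an explanatory sketch, not plant code.

```python
# Illustrative 2-out-of-4 coincidence voting, as used by the four redundant RPS
# PLC channels described above: a trip is issued only when at least two of the
# four channels demand it, so a single compromised or failed channel cannot by
# itself change the outcome.
def two_out_of_four_trip(channel_votes):
    """channel_votes: iterable of 4 booleans, True = this PLC channel demands a trip."""
    votes = list(channel_votes)
    assert len(votes) == 4, "2-from-4 logic expects exactly four redundant channels"
    return sum(votes) >= 2

# One maliciously forced channel does not trip the plant ...
print(two_out_of_four_trip([True, False, False, False]))   # False
# ... and one suppressed channel does not mask a genuine demand from the rest.
print(two_out_of_four_trip([False, True, True, True]))     # True
```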
As a result, the architectural attributes of TPSs are reflected in this article, as well as information to consider for TPS cyber security. Furthermore, unlike conventional IT and ICS, TPS faces the possibility of radiation leakage and unlawful nuclear material transfer. TPSs differ from IT and ICS in terms of how they protect themselves from these threats. The attributes of TPS, such as the defense and


security strategy, reactor dead times, as well as the probability of radioactive material seeping out [20] in the event of a catastrophic accident, are discussed in this paper. Furthermore, this study proposes systematic ways of matching remedies to threat vectors and attack consequences, which have not been addressed in earlier studies. RG-5.71 [1, 2] also matches the attributes of CA and their related defenses. The purpose of the regulatory guide (RG) for nuclear thermal station cyber security initiatives is to ensure that security controls for the deployment of digital technology are met. As previously stated, taxonomic categories are organized according to particular notions. By enumerating additional CA cases using templates based on the proposed phylogeny categories, the resulting data can be used to verify the conformity of the digital C&I devices that will be deployed in TPSs [4]. This data can also be used to develop effective remedy methods for CA. For these reasons, this study proposes threat technique, threat vector, threat result, remedies, and vulnerability as the phylogeny categories.

3 Cyber-Threat Phylogeny Categorization Scheme

3.1 Procedure for Threat and Vector of Threat

The first phylogeny item is the threat process. We propose a phylogenetic template in this research so that the set of enumerated attack cases can be enlarged; various CA situations are created and tested [21]. The threat process can be used as a criterion for increasing the number of cases. CA are carried out in a systematic manner. The threat method consists of four steps: gathering information, obtaining access rights, CC, and action and exfiltration (theft of information) [18]. The information-gathering phase includes examining the network for vulnerabilities using hacking techniques (port scanning and social engineering) [7]; information about targets and vulnerabilities can be obtained in these ways. The stage of obtaining access rights involves attacking the system and obtaining the administrator's authority from the user; this step also includes using a CA, such as a brute-force attack, to find a password. The stage of executing commands remotely is known as the CC phase. The action-and-exfiltration step involves deleting or modifying the stored log data in order to keep the user unaware of a CA [13]. The threat vector is the second phylogeny component. A TPS runs on a closed communication network that is physically disconnected from the rest of the world. Past CA, however, suggest that a closed communication network is still not secure [11]. In this study, we classify the threat vector by examining the breaches caused by CA on ICS and SCADA systems; as a result, two types of threat vector are distinguished, physical access (PA) and network access (NA). When a portable device (a USB drive or a DAD) is physically placed in or connected to nuclear facilities, this is referred to as physical access (PA).

3.2 Physical Accessibility

3.2.1 Portable Storage Medium and Supply-Chain Devices (Importing and Installing). TPSs have complex structures and a variety of safety procedures in place to prevent radioactive [1] material from escaping when external devices from various suppliers are brought in and installed.


When the C&I devices provided through the supply chain (SC) are connected to the thermal system and contain infected, vulnerable programs [4], facilities like TPSs can be compromised in terms of cyber security. One of the most prominent incidents is Stuxnet [1, 2], which destroyed the centrifuges of an Iranian nuclear facility; it was triggered by the use of a Siemens [2] PLC device that had been programmed with vulnerable ladder logic [10] (Fig. 1).

Fig. 1. Cyber-threat phylogeny

3.2.2 Insider Threat and Network Access. In the case of TPSs, management and access control are built on a low level of privilege and access authority, but an insider threat using a high-access-authority account might have a bigger effect on the network. With network connectivity, insider CA is also a possibility; in 2010, for example, US military and diplomatic documents were leaked to WikiLeaks by an insider [1], a well-known violation. The term "network access" refers to a method of connecting to the network of a TPS via digital devices. Incomplete (imperfectly isolated) networks are a common aspect of CA through system access in closed networks like TPSs. Based on this, network access can be classified into three categories: use of an incomplete network


for upgrading, remote access, and wireless communication. While connecting to an external medium for repair, vulnerable code may travel across the incomplete network. An example of an incident induced by an imperfect network is the breach of the Monju TPS [2] in 2014, which occurred when a worker updated video-playback software, allowing more than 43,000 documents (components) to be grabbed by malicious code [1, 12].

3.2.3 Wireless Communication and Result of the Threat. Only approved access within a TPS's security control area (SCA), or external access (EA) within the security region, is permitted, and wireless technology in important DAD associated with critical safety functions is prohibited, according to the specification of RS-015 [11]. In 2003, a thermal expert at the Davis-Besse TPS in the USA corrupted a thermal system computer (server) while accessing a thermal station system over a VPN-encrypted channel [3] from home on a virus-ridden notebook [1, 19]. The result of the threat is the third phylogeny component. Stuxnet, which wrecked centrifuges at Iran's Natanz [1, 2] facility, suggests that CA could lead to a nuclear facility's physical destruction. CA can be divided into four categories: system destruction, disruption, information tampering, and information leaks. System destruction happens when the system is physically broken, as with BadUSB, or when the system's logic is altered through hard coding and vulnerable code, rendering the system inoperable and shutting down the TPS. Information tampering alters data and instrumentation-control signals in order to show false plant information, which might result in incorrect operating directives [3]. In addition to the above-mentioned generic repercussions of CA, in this paper we look at the attributes of TPSs and classify the consequences of CA based on whether or not they have an impact on "Safety, Security, and Emergency Preparedness" (SSEP) functions. CA that affect SSEP functions are further subdivided into threats involving illegal transfer or sabotage.

3.2.4 Consequences and Impact on SSEP Functions. CA occurrences can result in physical harm to system components or have a direct influence on the operation of TPSs' SSEP functions, such as causing a shutdown. Physical damage [3] to a TPS, unlike that to other ICS and SCADA systems, can result in a radioactive release. An assailant may keep note of these details and utilize CA to facilitate the unlawful movement of nuclear items. The act of endangering or destroying nuclear materials or infrastructure, or interfering with a TPS's regular functioning, is referred to as sabotage. Nuclear facility sabotage can result in radiation leaks and shutdown. Due to xenon oscillations [2, 3], when a TPS is shut down it is unavailable for a reactor dead time, which means it takes longer to restart than other types of plants. The DAD that fulfil SSEP activities must be identified and are referred to as vital digital assets (SET) in order to prevent the unlawful transfer of nuclear material and sabotage [5]. As a result, dependable procedures should be used to secure the SET in TPSs against CA.
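One way to make the proposed categories concrete is to record each attack case as a small structured object. The sketch below is an illustrative data layout, not part of the paper, and the Stuxnet-style example entry is hypothetical.

```python
# Sketch of a record type for the proposed phylogeny categories (threat method,
# threat vector, threat result, remedies, vulnerability).
from dataclasses import dataclass, field
from enum import Enum

class ThreatVector(Enum):
    PHYSICAL_ACCESS = "portable media / supply chain / insider"
    NETWORK_ACCESS = "incomplete network / remote access / wireless"

class ThreatResult(Enum):
    SYSTEM_DESTRUCTION = 1
    DISRUPTION = 2
    INFORMATION_TAMPERING = 3
    INFORMATION_LEAK = 4

@dataclass
class CyberAttackCase:
    threat_method: list            # the four stages: recon, access, C&C, action/exfiltration
    threat_vector: ThreatVector
    threat_result: ThreatResult
    affects_ssep: bool             # impact on Safety, Security, Emergency Preparedness
    remedies: list = field(default_factory=list)
    vulnerability: str = ""        # e.g. a CVE identifier for the affected DAD

stuxnet_like = CyberAttackCase(
    threat_method=["information gathering", "access via portable media",
                   "command and control", "action and exfiltration"],
    threat_vector=ThreatVector.PHYSICAL_ACCESS,
    threat_result=ThreatResult.SYSTEM_DESTRUCTION,
    affects_ssep=True,
    remedies=["supply-chain device verification", "portable media control"],
    vulnerability="vulnerable PLC ladder logic",
)
```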


3.2.5 Vulnerability

A system's vulnerability is a flaw in its design or operation that could be exploited by an attacker to carry out unauthorised operations [12]. In this document it is characterized by a specific operating system, computer hardware, application software, CPU type, communication mechanism, and CVE condition. In contrast to IT security, TPS systems are unable to update and patch antivirus software in real time, so a CA has the potential to be devastating. These issues can be addressed by performing penetration testing on an existing or upcoming DAD. This study proposes that, by matching a CA with DAD information and CVE entries, the vulnerability is represented as data that can be used in penetration testing. Furthermore, the findings of matching a CA with DAD and CVE information can be utilised to predict and respond to cyber-threats efficiently.

TPSs, unlike typical ICS and SCADA systems, must take greater account of availability, reactor safety, and enriched-uranium protection. Given these circumstances, DAD in a TPS is classified as SET, regardless of whether or not it impairs the SSEP function. NEI 13-10 divides SET into two categories: direct SET and non-direct SET. The three forms of non-direct SET are indirect SET, balance-of-plant (BOP) SET, and emergency-preparedness (EP) SET. EP SETs are SETs that function as facilities (systems) for responding to nuclear plant accidents with radiological impacts. BOP SETs are SETs that could directly or indirectly impact the reactivity of thermal plants, leading to unscheduled reactor shutdowns (transients). An indirect SET is one that does not have a negative influence on safety (security) functions before the failure is identified and compensated for by the operators. The non-direct SETs should fulfill the minimum requirements (A through G). To fulfill security measure A, the SET must be located in a Protected Area (PA) or a Vital Area (VA). According to security measure B, no asset linked to the SET may have wireless connectivity capabilities. According to security measure C, each asset linked to the SET must be physically air-gapped from the system (network). Security measure E requires the security impact to be evaluated and documented before a configuration modification is made; the SET's security measures should also be inspected on a regular basis; and security measure G requires baseline cyber-security protection standards to be monitored and evaluated regularly to ensure that they are met. The BOP and indirect SETs must fulfill all seven benchmark criteria, but the EP SET may meet only four of them (D, E, F, G). Direct SETs are categorized into seven groups based on their SET characteristics (software and hardware properties): A1 to A3, B1 to B3, and C. Software attributes include programmes, firmware, configuration upgrades, and HMI remote access. Hardware attributes include the console port, communications functionality with the external environment, and servicing and configuration ports. The goal of this categorization is to manage DAD effectively, since managing thousands of SET individually is impossible. Nuclear operators can also use SET categorization to develop proper CA response measures. Prior research on CA phylogeny does not discuss these properties of TPSs, and systematic solutions are lacking. As a result, this article proposes systematic remedy tactics by matching a CA to its SET class.
Furthermore, the RG-5.71 [15] security control system (its technical, operational, and administrative security controls) is matched with the CA remedies. This allows the security control system to be verified when


integrating TPS DAD, and a CA to be prevented by incorporating a security function into the digital device [5].

4 Template for a Phylogeny and Its Application in a Conformance Test

This study demonstrates how to apply the CA phylogeny using the proposed template and ARP spoofing. Name of cyber attack (CA): Address Resolution Protocol (ARP) spoofing (also known as ARP cache poisoning or ARP poison routing). Definition: an attacker sends fake (spoofed) ARP messages onto a local area network (LAN). Spoofing enables the attacker to capture frames, modify communications, or block all traffic. The threat serves as a gateway for additional attacks such as Denial of Service (DoS), Man-in-the-Middle (MitM), or session hijacking (Table 2).

Table 2. Example of cyber-threat phylogeny (ARP spoofing).

Threat procedure: the attacker sends spoofed packets into the network.
Threat vector: MAC and IP address association.
Vulnerability: Common Vulnerabilities and Exposures (CVE) entries CVE-1999-0667, CVE-2020-2033, CVE-2018-17195, CVE-2019-15022, CVE-2020-27218.
Threat consequence: traffic is sent towards the attacker's machine or a nonexistent location, affecting the SSEP function.
Remedies: Direct SET (Type C) affected by the CA – interception of communication on networking devices; client certificate authentication. Security controls – packet content filtering; avoidance of trust relationships; implementation of an ARP spoofing detection mechanism (a minimal detection sketch is given after the table); implementation of cryptographic network protocols.
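To illustrate the "ARP spoofing detection mechanism" listed among the security controls above, the following is a minimal, hedged Java sketch: it keeps a table of IP-to-MAC bindings observed on the LAN and raises an alert when a known IP address suddenly claims a different MAC address. The class and method names are illustrative only and are not the paper's implementation; in a real PMS or PDAS deployment the observations would come from a packet-capture source rather than the hard-coded samples shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal ARP-spoofing detector: flags an alert when an IP address is seen
// with a MAC address different from the one previously learned for it.
public class ArpSpoofDetector {

    private final Map<String, String> ipToMac = new HashMap<>();

    // Returns true if the observed (ip, mac) pair looks like spoofing.
    public boolean observe(String ip, String mac) {
        String known = ipToMac.get(ip);
        if (known == null) {
            ipToMac.put(ip, mac);   // first time we see this IP: learn the binding
            return false;
        }
        if (!known.equalsIgnoreCase(mac)) {
            // The same IP now maps to a different MAC: possible ARP spoofing.
            System.out.println("ALERT: " + ip + " changed from " + known + " to " + mac);
            return true;
        }
        return false;               // binding unchanged, nothing suspicious
    }

    public static void main(String[] args) {
        ArpSpoofDetector detector = new ArpSpoofDetector();
        detector.observe("192.168.1.10", "AA:BB:CC:DD:EE:01"); // learned
        detector.observe("192.168.1.10", "AA:BB:CC:DD:EE:01"); // consistent
        detector.observe("192.168.1.10", "AA:BB:CC:DD:EE:99"); // flagged as spoofing
    }
}
```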

The template includes all elements of the CA: definition, technique, vector, vulnerability, result, and remedies. As previously stated, the proposed phylogeny's goal is to create strategic remedies that reflect threat consequences and vectors, and to be utilized for compliance testing. This paper presents remedies for the case in which a plant monitoring system (PMS) is attacked by ARP spoofing, and discusses how to use the phylogeny in the


conformance test to verify its usefulness. Before presenting the remedies, it should be emphasized that the PMS remedies described in this paper are intended only to demonstrate the use of the phylogeny: they consider the PMS's basic functions, the actual communication techniques and design attributes are not depicted as security factors in the scenario, and they may differ from actual response scenarios. The PMS can select a control element assembly batch to employ the quick power cutoff system (RPCS); it serves duties relating to the control rod drive mechanisms but does not exercise direct control. As a result of a CA, the SSEP functions may therefore be jeopardized. The PMS comprises the plant data acquisition system (PDAS) and the plant computer system (PCS). The plant input variables are sent to the PCS through the PDAS; the PCS takes input data from the PDAS and analyses, computes, alarms, and stores them, in addition to notifying the operator through other adjacent systems. If an insider or malware exploits ARP spoofing in the PCS, the PDAS will be unable to complete its role owing to a huge number of packets, potentially preventing the PCS from collecting plant input variables. As a solution, defense in depth, which defines the PCS as a higher communication grade than the PDAS, can be utilized. Although data may be sent from a high-grade SET to a low-grade SET, the defense-in-depth technique prohibits signals and data from flowing from a low-grade SET to a high-grade SET (a minimal policy sketch is given below). In addition, the communication channels that are directly and indirectly coupled to SETs must be determined. The network protocol [24, 25] should not be designed to launch instructions outside of its own network range, nor should its commands be able to weaken the security status of the required DAD. Before installing software fixes and updates, the security impact should be assessed [6]. The suggested phylogeny may also be used for a conformance test. A conformance test is the technique for confirming the security and compatibility of products prior to their introduction into public and national organizations; it analyses (certifies) the safety and dependability of information security products (ISP) using the Common Criteria (CC). In the case of TPSs, RG-5.71 and RS-015 are used instead of CC, and verification is carried out after a document-based test that includes penetration testing. The document-based test ensures that the security controls described in the security standards map to the DAD security design criteria, and penetration testing can validate that the DAD security controls fulfill environmental requirements. However, there is a problem with using CA knowledge to perform penetration testing, and the phylogeny provided in this paper can be used to address it. To begin, the DAD must be selected so that it meets the access controls of the security requirements and is compatible with the design brief of the digital device.
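The defense-in-depth rule stated above (data may flow from a high-grade SET towards a low-grade SET, but never in the opposite direction) can be expressed as a simple policy check. The sketch below is illustrative only and assumes a hypothetical integer grading of SETs; it is not taken from the paper's implementation.

```java
// Illustrative defense-in-depth check: communication is allowed only from a
// SET of higher (or equal) communication grade towards a lower-grade SET.
public class DefenseInDepthPolicy {

    // Hypothetical grades: higher number = higher communication grade.
    public static final int GRADE_PCS = 2;   // plant computer system (stated as higher grade)
    public static final int GRADE_PDAS = 1;  // plant data acquisition system (lower grade)

    public static boolean isFlowAllowed(int sourceGrade, int destinationGrade) {
        // Signals and data must never flow from a low-grade SET to a high-grade SET.
        return sourceGrade >= destinationGrade;
    }

    public static void main(String[] args) {
        System.out.println(isFlowAllowed(GRADE_PCS, GRADE_PDAS));  // true: high -> low
        System.out.println(isFlowAllowed(GRADE_PDAS, GRADE_PCS));  // false: low -> high is blocked
    }
}
```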

5 Conclusion

This research proposed a CA phylogeny that takes TPS attributes into account. In general, the phylogenies used in the IT domain are not appropriate for nuclear regulatory objectives, and it was especially important to rebuild security in relation to nuclear safety concerns. Gathering information, obtaining access permission, command and control, and action and exfiltration are all parts of the threat procedure, which can be used as a criterion for conducting a thorough probe into a CA scenario. Physical and network access are the two


threat vectors. The cyber-consequence threats have been classified as threats to the SSEP function, and the CA on the SSEP function was split into two categories: nuclear sabotage and the illegal transfer of nuclear components (materials). This may be utilized to begin estimating the likelihood of a CA. The operating system, hardware components, software, CPU, and the communication mechanism are deemed to be susceptible. Vulnerabilities can be mapped to CVE entries, making it easier to locate vulnerability reports and CA recommendations. The remedies were coupled with SETs and with the security controls of RG-5.71 and RS-015. When a TPS is threatened by a CA, this mechanism allows for strategic responses.

References 1. Kim, S., Heo, G., Zio, E., Shin, J., Song, J.G.: Cyber attack taxonomy for digital environment in nuclear power plants. Nucl. Eng. Technol. 52(5), 995–1001 (2020) 2. Shin, J., Choi, J.-G., Lee, J.-W., Lee, C.-K., Song, J.-G., Son, J.-Y.: Application of STPASafeSec for a cyber-attack impact analysis of NPPs with a condensate water system test-bed. Nucl. Eng. Technol. 53(10), 3319–3326 (2021). https://doi.org/10.1016/j.net.2021.04.031 3. e Silva, R.B., Piqueira, J.R.C., Cruz, J.J., Marques, R.P.: Cybersecurity assessment framework for digital interface between safety and security at nuclear power plants. Int. J. Crit. Infrastruct. Prot. 34, 100453 (2021) 4. Zhang, F.: Nuclear power plant cybersecurity. In: Nuclear Power Plant Design and Analysis Codes, pp. 495–513. Elsevier (2021). https://doi.org/10.1016/B978-0-12-818190-4.00021-8 5. Rajawat, A.S., Rawat, R., Barhanpurkar, K., Shaw, R.N., Ghosh, A.: Vulnerability analysis at industrial internet of things platform on dark web network using computational intelligence. In: Bansal, J.C., Paprzycki, M., Bianchini, M., Das, S. (eds.) Computationally Intelligent Systems and their Applications. SCI, vol. 950, pp. 39–51. Springer, Singapore (2021). https:// doi.org/10.1007/978-981-16-0407-2_4 6. Aravindakshan, S.: Cyberattacks: a look at evidentiary thresholds in International Law. Indian J. Int. Law 59(1–4), 285–299 (2020). https://doi.org/10.1007/s40901-020-00113-0 7. Khazaei, J., Amini, M.H.: Protection of large-scale smart grids against false data injection cyberattacks leading to blackouts. Int. J. Crit. Infrastruct. Prot. 35, 100457 (2021) 8. Rawat, R., Rajawat, A.S., Mahor, V., Shaw, R.N., Ghosh, A.: Surveillance robot in cyber intelligence for vulnerability detection. In: Bianchini, M., Simic, M., Ghosh, A., Shaw, R.N. (eds.) Machine Learning for Robotics Applications. SCI, vol. 960, pp. 107–123. Springer, Singapore (2021). https://doi.org/10.1007/978-981-16-0598-7_9 9. Ning, X., Jiang, J.: Design, analysis and implementation of a security assessment/enhancement platform for cyber-physical systems. IEEE Trans. Industr. Inform. (2021) 10. MR, G.R., Ahmed, C.M., Mathur, A.: Machine learning for intrusion detection in industrial control systems: challenges and lessons from experimental evaluation. Cybersecurity 4(1) 1–12 (2021) 11. Djenna, A., Harous, S., Saidouni, D.E.: Internet of things meet internet of threats: new concern cyber security issues of critical cyber infrastructure. Appl. Sci. 11(10), 4580 (2021) 12. Xenofontos, C., Zografopoulos, I., Konstantinou, C., Jolfaei, A., Khan, M.K., Choo, K.K., R.: Consumer, commercial and industrial IoT (in) security: attack taxonomy and case studies. IEEE Internet Things J. (2021) 13. Qasim, S., Ayub, A., Johnson, J., Ahmed, I.: Attacking the IEC-61131 Logic Engine in Programmable Logic Controllers in Industrial Control Systems. Springer International Publishing, Cham (2021)


14. Naanani, A.: Security in Industry 4.0: Cyber-attacks and countermeasures. Turk. J. Comput. Math. Educ. 12(10), 6504–6512 (2021) 15. Rawat, R., Mahor, V., Chirgaiya, S., Rathore, A.S.: Applications of social network analysis to managing the investigation of suspicious activities in social media platforms. In: Daimi, K., Peoples, C. (eds.) Advances in Cybersecurity Management, pp. 315–335. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-71381-2_15 16. Robles-Durazno, A., Moradpoor, N., McWhinnie, J., Russell, G., Porcel-Bustamante, J.: Implementation and evaluation of physical, hybrid, and virtual testbeds for cybersecurity analysis of industrial control systems. Symmetry 13(3), 519 (2021) 17. https://en.wikipedia.org/wiki/Vulnerability_of_nuclear_plants_to_attack 18. Rawat, R., Mahor, V., Rawat, A., Garg, B., Telang, S.: Digital transformation of cyber crime for chip-enabled hacking. In: Handbook of Research on Advancing Cybersecurity for Digital Transformation, pp. 227–243. IGI Global (2021) 19. Tripathi, D., Singh, L.K., Tripathi, A.K., Chaturvedi, A.: Model based security verification of cyber-physical system based on petrinet: a case study of nuclear power plant. Ann. Nucl. Energy 159, 108306 (2021) 20. Zhang, F., Hines, J. W., Coble, J.: Industrial control system testbed for cybersecurity research with industrial process data. Nucl. Sci. Eng. (2021) 21. Abou el Kalam, A.: Securing SCADA and critical industrial systems: from needs to security mechanisms. Int. J. Crit. Infrastruct. Prot. 32, 100394 (2021) 22. Lee, C., Yim, H.B., Seong, P.H.: Development of a quantitative method for evaluating the efficacy of cyber security controls in NPPs based on intrusion tolerant concept. Ann. Nucl. Energy 112, 646–654 (2018) 23. Rawat, R., Rajawat, A.S., Mahor, V., Shaw, R.N., Ghosh, A.: Dark web—onion hidden service discovery and crawling for profiling morphing, unstructured crime and vulnerabilities prediction. In: Mekhilef, S., Favorskaya, M., Pandey, R.K., Shaw, R.K. (eds.) Innovations in Electrical and Electronic Engineering. LNEE, vol. 756, pp. 717–734. Springer, Singapore (2021). https://doi.org/10.1007/978-981-16-0749-3_57 24. Rawat, R., Mahor, V., Chirgaiya, S., Garg, B.: Artificial cyber espionage based protection of technological enabled automated cities infrastructure by dark web cyber offender. In: Intelligence of Things: AI-IoT Based Critical-Applications and Innovations, pp. 167–188. Springer, Cham (2021) 25. Rawat, R., Garg, B., Mahor, V., Chouhan, M., Pachlasiya, K., Telang, S.: Cyber threat exploitation and growth during COVID-19 times. In: Kaushik, K., Tayal, S., Bhardwaj, A., Kumar, M. (eds.) Advanced Smart Computing Technologies in Cybersecurity and Forensics, pp. 85–101. CRC Press, Boca Raton (2021). https://doi.org/10.1201/9781003140023-6 26. Mahor, V., Rawat, R., Kumar, A., Chouhan, M., Shaw, R. N., Ghosh, A.: Cyber warfare threat categorization on cps by dark web terrorist. In: 2021 IEEE 4th International Conference on Computing, Power and Communication Technologies (GUCON), pp. 1–6. IEEE (September 2021)

A Novel Data Encryption Technique Based on DNA Sequence Abritti Deb(B) , Satakshi Banik, Pankaj Debbarma, Piyali Dey, and Ankur Biswas Tripura Institute of Technology, Narsingarh, Agartala, Tripura 79909, India [email protected]

Abstract. The volume of data created and processed in computer devices is growing at an alarming rate these days. Between all of these devices, massive quantities of vital and sensitive data are transferred. As a result, ensuring the security of all of this vital data is crucial. DNA cryptography is a relatively new paradigm and one of the world’s fast-growing technologies, attracting a lot of attention in cyber security. Although there are several issues with DNA cryptography, scientists are working to resolve them since, due to the high information density and parallelism inherent in DNA computers, it is feasible to develop more complex Crypto algorithms. The focus of this research is on contemporary cryptography. A technique for encrypting data using DNA-based encryption is presented. DNA coding technique is utilized to transform hidden messages into DNA strings in this work. Finally, the proposed DNA cryptography algorithm’s implementation is being shown in Java. Keywords: Cryptography · DNA encryption · DNA coding · Data hiding · Pi matrix

1 Introduction

Cryptography is a technique for securing and protecting data during transmission. It is useful for preventing unauthorized individuals or groups from accessing sensitive information. Cryptography's two most important functions are encryption and decryption: data encryption transforms a message delivered over the network into an unintelligible encrypted message, and at the receiving end the message is transformed back to its original form, which is known as decryption. We are living in a digital era, whether it be ordering food, booking flights, submitting assignments, or booking a cab, and especially now, in this global pandemic situation, we are largely forced to do everything digitally, including our education. We constantly use the internet, inherently generating a large amount of data, and this data is stored in the cloud, which is basically a huge data center or data server that can be accessed online from an array of devices. For a hacker who tries to exploit and gain unauthorized access to a computer or network, this is a golden age: with so many access points and public IP addresses, tons of data, and constant traffic, it is now easier for black hat hackers to exploit vulnerabilities. On top of that, cyber-attacks


are increasing and evolving by the day, and hackers are becoming smarter and more creative with their techniques. Hence data encryption is crucial. We primarily focus on contemporary cryptography in this study, where we create a technique for encrypting data using DNA-based cryptography. A DNA coding technique is employed in our research to transform secret messages into DNA strings, thereby concealing and encrypting the data within DNA. In our approach, the sender needs to input a secret message, the key-matrix coordinates, and the key-matrix dimensions; the coordinates should be prime, and the product of the matrix dimensions should not be less than the length of the secret message. The sender also needs to input a DNA sequence (either manually or via 'generate automatically') in which to hide the data. After all inputs are provided and validated, the proposed algorithm converts the secret message into the form of a DNA (A, G, C, T) sequence, produces the key matrix from the pi-matrix with the help of the key-matrix coordinates and dimensions, and then converts the key matrix into a key-matrix 1D array. Finally, the implementation of the proposed DNA cryptography is presented. The remainder of the paper is organized as follows: Sect. 2 reviews previous work, Sect. 3 provides the necessary details of the proposed methodology, Sects. 4 and 5 present the security evaluation and experimental results respectively, and Sect. 6 presents the concluding remarks.

2 Literature Survey

In this section, several prior DNA sequence-based data-hiding techniques are discussed. A variety of approaches for concealing data inside DNA sequences have been presented [1–3]. DNA is a lengthy polymer made up of repeating units called nucleotides; DNA polymers are huge molecules made up of millions of nucleotides. DNA is frequently found in pairs rather than as a single molecule, and in the form of a double helix the two long strands entwine like vines. Every DNA encryption algorithm must meet a set of criteria, which were established in this work based on the limits of existing encryption methods. Several approaches for image encryption have been devised in recent years to safeguard the security of this sort of data [4–11]. Because digital pictures exhibit a strong correlation between neighboring pixels, early encryption algorithms like AES, DES, RSA, and IDEA are ineffective for encrypting them appropriately. Another common encryption approach in imaging is based on deoxyribonucleic acid (DNA), which offers several advantages that make it attractive for use in encryption. The DNA-based method is divided into two stages: the first stage transforms the plain image pixels into a DNA sequence using DNA coding rules, and the key image is then generated using those rules. The encoded plain-picture pixels are then processed by DNA operation rules, which produce a cipher image. A recent paper combines DNA cryptography with finite automata theory; the system consists of three components: a key pair generator, a transmitter, and a receiver [12]. The sender creates a secret 256-bit DNA key based on the characteristics of the recipient, which is then used for the encryption of data. The DNA sequence is then coded using a randomly generated Mealy machine, which makes the ciphertext more secure. The suggested technique can defend the system against a variety of security attacks, including brute-force, known-plaintext, differential-cryptanalysis, ciphertext-only, man-in-the-middle, and phishing attacks. The findings


and discussions demonstrate that the suggested method is more efficient and secure than existing solutions.

3 Methodology

DNA cryptography is one of the world's most quickly developing technologies. It provides increased speed while using less storage and power: it stores information at a density of roughly 1 bit/nm³, compared to about 10¹² nm³/bit for traditional storage media. Surprisingly, one gram of DNA stores 10²¹ DNA bases, or 10⁸ terabytes of information, so it is possible to store all of the world's data in just a few milligrams. DNA cryptography is the practice of concealing data using DNA sequences. The simplest method of encoding the four nucleotides of DNA is to depict them as four digits and represent data as: Adenine [A] (0) – 00, Guanine [G] (1) – 01, Thymine [T] (2) – 10, and Cytosine [C] (3) – 11 (a minimal coding sketch is given below, before Algorithm 1). In the proposed algorithm, the sender inputs a secret message, the key-matrix coordinates, and the key-matrix dimensions; the coordinates must be prime, and the product of the matrix dimensions must not be smaller than the secret message length. To disguise the data, the sender must additionally provide a DNA sequence (either manually or via the 'generate automatically' button). After all inputs are provided and validated, the proposed algorithm converts the secret message into the form of a DNA (A, G, C, T) sequence and then produces the key matrix from the pi-matrix with the help of the key-matrix coordinates and dimensions. The number π is a mathematical constant, defined as the ratio of a circle's circumference to its diameter and approximately equal to 3.14159; its decimal representation never ends and never follows any particular repeating pattern. In the proposed algorithm, a matrix of dimension 333 × 30 is created from this decimal representation of π and is referred to as the pi-matrix throughout the paper. Afterward, the key matrix is converted into a key-matrix 1D array. The first element of the key-matrix 1D array is the starting index in the given DNA sequence where encryption of the data starts, and each following element represents the gap in the DNA sequence before the next piece of data is embedded. The ciphered text is then ready to be stored in a text file. The proposed algorithm for encryption is presented in Algorithm 1.
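As a concrete illustration of the coding step just described, the following is a minimal Java sketch (not the authors' implementation) that converts each character of a message to 8 bits and maps every 2-bit group to a nucleotide using A = 00, G = 01, T = 10, C = 11, together with the reverse mapping. Class and method names are illustrative only.

```java
// Minimal sketch of the DNA coding step: text -> 8-bit binary -> nucleotides
// using the mapping A=00, G=01, T=10, C=11, plus the reverse decoding.
public class DnaCoder {

    private static final char[] NUCLEOTIDES = {'A', 'G', 'T', 'C'}; // index 0..3 = 00,01,10,11

    public static String encode(String message) {
        StringBuilder dna = new StringBuilder();
        for (char ch : message.toCharArray()) {
            int value = ch & 0xFF;                        // 8-bit character code
            for (int shift = 6; shift >= 0; shift -= 2) { // four 2-bit groups, left to right
                dna.append(NUCLEOTIDES[(value >> shift) & 0b11]);
            }
        }
        return dna.toString();
    }

    public static String decode(String dna) {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i + 4 <= dna.length(); i += 4) {  // 4 nucleotides = 1 character
            int value = 0;
            for (int j = 0; j < 4; j++) {
                value = (value << 2) | indexOf(dna.charAt(i + j));
            }
            text.append((char) value);
        }
        return text.toString();
    }

    private static int indexOf(char nucleotide) {
        switch (nucleotide) {
            case 'A': return 0;
            case 'G': return 1;
            case 'T': return 2;
            default:  return 3; // 'C'
        }
    }

    public static void main(String[] args) {
        String dna = encode("hello");
        System.out.println(dna);            // 'h' (01101000) -> GTTA, 'e' (01100101) -> GTGG, ...
        System.out.println(decode(dna));    // hello
    }
}
```

Encoding "hello" with this mapping yields GTTAGTGGGTCAGTCAGTCC, which matches the first twenty nucleotides of the worked example in Sect. 4.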


ALGORITHM 1: ENCRYPTION
1: START
2: INPUT SECRET MESSAGE
3: INPUT KEY MATRIX COORDINATES. IF COORDINATES ARE NOT PRIME THEN SHOW WARNING END IF
4: INPUT DIMENSIONS OF THE KEY MATRIX. IF ROW*COLUMN < MESSAGE LENGTH THEN SHOW WARNING END IF
5: STORE THE KEY-MATRIX COORDINATES IN ONE TEXT FILE (C.TXT) AND DIMENSIONS IN ANOTHER (D.TXT)
6: INPUT DNA SEQUENCE (EITHER MANUALLY OR CLICK 'GENERATE AUTOMATICALLY')
7: CONVERT THE SECRET MESSAGE INTO A DNA SEQUENCE
7A: CONVERT THE CHARACTERS OF THE SECRET MESSAGE INTO BINARY FORM AND STORE IN A STRINGBUFFER ARRAY OBJECT (SB1[ ]). IF LENGTH OF BINARY FORM < 8 THEN ADD ZERO(S) AT INDEX 0 OF THE BINARY STRING END IF
7B: DIVIDE THE 8-BIT BINARY NUMBER INTO GROUPS OF 2 DIGITS, CONVERT THEM INTO DNA NUCLEOTIDE FORM (A=00, G=01, T=10, C=11) AND STORE IN A STRINGBUFFER ARRAY OBJECT (SB2[ ])
8: PRODUCE THE KEY MATRIX FROM THE PI-MATRIX WITH THE HELP OF THE KEY-MATRIX COORDINATES AND DIMENSIONS
9: CONVERT THE KEY MATRIX INTO A KEY-MATRIX 1D ARRAY (KEY_1D)
10: HIDE THE SECRET MESSAGE IN THE DNA SEQUENCE
10A: USING THE KEY MATRIX:
FOR I IN RANGE 0 TO SECRET MESSAGE LENGTH
  IF I == 0 THEN INDEX = KEY_1D[I] AND REPLACE CHAR OF DNA(INDEX) BY SB2[I]
  ELSE INDEX = INDEX + 3 + KEY_1D[I] + 1 AND REPLACE CHAR OF DNA(INDEX) BY SB2[I]
  END IF
END FOR
11: STORE THE CIPHER CODE IN A PRIVATE STRING OBJECT (E_MSG)
12: CREATE A FUNCTION TO RETURN THE CIPHERED MESSAGE (E_MSG)
13: STORE THE ENCRYPTED MESSAGE IN A TEXT FILE (E.TXT)
14: END.
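To make the index arithmetic of step 10 concrete, the following is a hedged Java sketch of the hiding loop: the first key value gives the starting index in the cover DNA sequence, and every subsequent key value gives the gap before the next 4-nucleotide group is written. It assumes the message has already been converted to nucleotide groups (for example with the DnaCoder sketch above), that the cover sequence is long enough, and that "replace char of DNA(index) by SB2[i]" means replacing one cover nucleotide with the whole 4-nucleotide group; variable names are illustrative, not the authors' code.

```java
// Sketch of Algorithm 1, step 10: hide 4-nucleotide groups inside a cover DNA
// sequence, positioned by the key-matrix 1D array (start index, then gaps).
public class DnaHider {

    public static String hide(String coverDna, String[] nucleotideGroups, int[] key1d) {
        StringBuilder cipher = new StringBuilder(coverDna);
        int index = 0;
        for (int i = 0; i < nucleotideGroups.length; i++) {
            // First group starts at key1d[0]; later groups skip the group just
            // written (+3 extra characters, +1 position) plus a key-dependent gap.
            index = (i == 0) ? key1d[0] : index + 3 + key1d[i] + 1;
            // Replace one cover nucleotide with the 4-nucleotide group
            // (each hidden character therefore grows the cover by 3 characters).
            cipher.replace(index, index + 1, nucleotideGroups[i]);
        }
        return cipher.toString();
    }

    public static void main(String[] args) {
        String cover = "GATCCTCCATATACAACGGTATCTCCACCTCAGGTTTAGATCTCAACAACG";
        String[] groups = {"GTTA", "GTGG"};   // 'h' and 'e' in nucleotide form
        int[] key = {3, 2};                   // illustrative key values only
        System.out.println(hide(cover, groups, key));
    }
}
```

Under this reading, the ciphertext grows by three nucleotides per hidden character, which is consistent with the Sect. 4 example, where a 256-nucleotide key and an 11-character message yield a 289-nucleotide encrypted sequence (256 + 3 × 11 = 289).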

On the receiver side, the user inputs the encrypted secret message or clicks the 'auto-generate from file' button, which fetches the sequence directly from the text file, and also inputs the key-matrix coordinates and dimensions. After validating the key-matrix coordinates and dimensions, the proposed algorithm produces the key matrix from the pi-matrix and converts it into a key-matrix 1D array for decryption. The 1D array is used to find the starting point and the gaps needed to retrieve the secret message from the DNA sequence. The proposed algorithm then converts the retrieved message into binary form, converts the binary form to its real form (plain text), and finally shows


the decrypted message. The proposed algorithm for decryption is presented in Algorithm 2.

ALGORITHM 2: DECRYPTION
1: START
2: INPUT THE ENCRYPTED SECRET MESSAGE OR CLICK THE 'AUTO GENERATE FROM FILE' BUTTON, WHICH FETCHES THE ENCRYPTED MESSAGE FROM TEXT FILE E.TXT
3: INPUT KEY MATRIX COORDINATES. IF COORDINATES != COORDINATES IN C.TXT THEN SHOW WARNING
4: INPUT DIMENSIONS OF THE KEY MATRIX. IF DIMENSIONS != DIMENSIONS IN D.TXT THEN SHOW WARNING
5: PRODUCE THE KEY MATRIX FROM THE PI-MATRIX WITH THE HELP OF THE KEY-MATRIX COORDINATES AND DIMENSIONS
6: CONVERT THE KEY MATRIX INTO A KEY-MATRIX 1D ARRAY (KEY_1D)
7: CREATE TWO STRINGBUFFER OBJECTS - SB1[ ] OF LENGTH 8 TO STORE THE BINARY SECRET MESSAGE AND SB2[ ] OF LENGTH 4 TO STORE THE SECRET MESSAGE IN THE FORM OF DNA NUCLEOTIDES (A, G, T, C)
8: RETRIEVE THE SECRET MESSAGE FROM THE DNA SEQUENCE USING THE KEY MATRIX
FOR I IN RANGE 0 TO SECRET MESSAGE LENGTH
  IF I == 0 THEN INDEX = KEY_1D[I] AND APPEND CHARACTERS OF CIPHERED TEXT FROM INDEX TO (INDEX+4) TO SB2[I]
  ELSE INDEX = INDEX + 3 + KEY_1D[I] + 1 AND APPEND CHARACTERS OF CIPHERED TEXT FROM INDEX TO (INDEX+4) TO SB2[I]
  END IF
END FOR
9: CONVERT THE SECRET MESSAGE INTO BINARY FORM
FOR I IN RANGE 0 TO LENGTH OF SECRET MESSAGE
  FOR J IN RANGE 0 TO LENGTH OF SB2[I]
    IF CHAR AT SB2[I] == 'A' THEN APPEND '00' TO SB1[I]
    ELSE IF CHAR AT SB2[I] == 'G' THEN APPEND '01' TO SB1[I]
    ELSE IF CHAR AT SB2[I] == 'T' THEN APPEND '10' TO SB1[I]
    ELSE IF CHAR AT SB2[I] == 'C' THEN APPEND '11' TO SB1[I]
    END IF
  END FOR
END FOR
10: CREATE ANOTHER STRINGBUFFER OBJECT DM
11: CONVERT THE CIPHER TEXT INTO PLAIN TEXT
FOR I IN RANGE 0 TO LENGTH OF SECRET MESSAGE
  CONVERT BINARY TO CHARACTER
  APPEND CHARACTER TO THE STRINGBUFFER OBJECT DM
END FOR


12: STORE THE DECRYPTED MESSAGE IN A PRIVATE STRING OBJECT (D_MSG)
13: CREATE A FUNCTION TO RETURN THE DECRYPTED MESSAGE (D_MSG)
14: END.

The overall flowchart of the proposed cryptosystem using DNA sequence is shown in Fig. 1.

Fig. 1. The proposed cryptosystem using the DNA sequence: (a) the encryption algorithm; (b) the decryption algorithm.

4 Security Evaluation

In this section, some security properties of the proposed algorithm are discussed. Two indicators exist: only the transmitter and the recipient know the coordinates and dimensions of the key matrix, and over 163 million DNA sequences are publicly available. Any assailant who wants to obtain the secret message concealed and encrypted in the DNA sequence must guess which DNA sequence is being used to cover the data, with a probability of success of 1/(1.63 × 10⁸). The proposed algorithm also applies the pi-matrix, a matrix with large dimensions built from the number π. The constant π is the ratio of a circle's circumference to its diameter; being irrational, it is not a common fraction, although fractions such as 22/7 are commonly used to approximate its value of about 3.14159. In reality, the decimal representation of π never ends and never follows any particular repeating pattern; its digits appear to vary arbitrarily and to satisfy certain statistical tests of randomness. Therefore the pi-matrix used in the proposed method is also effectively random and very difficult to guess. Moreover, in this algorithm a key matrix is created from this pi-matrix, whose coordinates and


dimensions are known only to the sender and the recipient of the secret message. Therefore, even if an intruder determines the correct DNA sequence, finding the key matrix would still be very difficult; without this information, the probability of retrieving the secret message is minimal. Repetition of the key, when the key is short, is another classical weakness. The Friedman test is the principal technique for exploiting this vulnerability in order to identify the correct key length: it uses an index of coincidence, which measures how often ciphertext symbols coincide, to break the ciphertext. The key length estimated by the Friedman test is

\[ \text{Key Length} = \frac{K_p - K_r}{K_0 - K_r} \tag{1} \]

where K_p is the probability that two characters of the underlying (English) text selected at random are the same, about 0.067; K_r is the probability of coincidence for symbols selected at random from the alphabet — since the proposed algorithm uses the four DNA nucleotides A, C, T, and G, K_r = 1/4 = 0.25; and K_0 is the observed coincidence rate:

\[ K_0 = \frac{\sum_{i=1}^{c} f_i (f_i - 1)}{N (N - 1)} \tag{2} \]

where ‘c’ is the alphabet size, ‘N’ is the ciphertext length, ‘f’ is the frequency of the letter. We provide an example of how to secure the cipher based on the Friedman test. Binary Plain Text: 88 01101000 01100101 01101100 01101100 01101111 00100000 01000100 01101110 01100001 00100000 00100001 DNA Sequence Plain Text = 44 GTTAGTGGGTCAGTCAGTCCATAAGAGAGTCTGTAGATAAATAT Key = 256 GATCCTCCATATACAACGGTATCTCCACCTCAGGTTTAGATCTCAACAACG GAACCATTGCCGACATGAGACAGTTAGGTATCGTCGAGAGTTACAAGCTA AAACGAGCAGTAGTCAGCTCTGCATCTGAAGCCGCTGAAGTTCTACTAAG GGTGGATAACATCATCCGTGCAAGACCAAGAACCGCCAATAGACAACAT ATGTAACATATTTAGGATATACCTCGAAAATAATAAACCGCCACACTGTC ATTATT Encrypted DNA sequence = 289 GATGTTACTCCGTGGTATAGTCAAACGGTGTCATGTCCTCCACCTCAATAA GAGATTTAGATCTGTCTAACAACGTAGGAACCAATAATGCCATAGACATG AGACAGTTAGGTATCGTCGAGAGTTACAAGCTAAAACGAGCAGTAGTCAG CTCTGCATCTGAAGCCGCTGAAGTTCTACTAAGGGTGGATAACATCATCC GTGCAAGACCAAGAACCGCCAATAGACAACATATGTAACATATTTAGGAT ATACCTCGAAAATAATAAACCGCCACACTGTCATTATT


For the example above, the index of coincidence is

\[ IC = \frac{\sum_{i=1}^{c} f_i (f_i - 1)}{N (N - 1)} \tag{3} \]

\[ \sum_i f_i (f_i - 1) = A\,(14 \times 13) + C\,(6 \times 5) + G\,(12 \times 11) + T\,(13 \times 12) = 500 \]
\[ N(N - 1) = 44 \times 43 = 1892, \qquad IC = 500 / 1892 = 0.26427 \]
\[ \text{Key Length} = \frac{K_p - K_r}{K_0 - K_r} = \frac{K_p - 0.25}{0.26427 - 0.25} \approx \infty \]

As can be seen, the coincidence index (K_0) is close to 0.25, so the denominator is close to 0 and, based on the Friedman formula, the estimated key length tends to infinity; an accurate key length therefore cannot be found. Suppose that the unwanted third party, having failed to determine the key length, instead tries to identify the plaintext using frequency analysis. Using frequency analysis, however, we were unable to discover any relationship among the letters. As a result, recovering the plaintext or the secret key will be extremely difficult.
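The index-of-coincidence computation above (Eqs. (2)–(3)) is easy to reproduce. Below is a small, self-contained Java sketch (not part of the paper's implementation) that counts nucleotide frequencies, computes IC = Σ f_i(f_i − 1) / (N(N − 1)), and returns the Friedman key-length estimate for chosen K_p and K_r values.

```java
import java.util.HashMap;
import java.util.Map;

// Reproduces the Friedman-test quantities used in the security evaluation:
// index of coincidence of a DNA string and the resulting key-length estimate.
public class FriedmanTest {

    public static double indexOfCoincidence(String sequence) {
        Map<Character, Integer> freq = new HashMap<>();
        for (char c : sequence.toCharArray()) {
            freq.merge(c, 1, Integer::sum);           // count nucleotide frequencies
        }
        int n = sequence.length();
        double sum = 0.0;
        for (int f : freq.values()) {
            sum += (double) f * (f - 1);              // sum of f_i * (f_i - 1)
        }
        return sum / ((double) n * (n - 1));
    }

    // Friedman estimate: (Kp - Kr) / (K0 - Kr); it blows up when K0 is near Kr.
    public static double keyLengthEstimate(double kp, double kr, double k0) {
        return (kp - kr) / (k0 - kr);
    }

    public static void main(String[] args) {
        String dnaPlainText = "GTTAGTGGGTCAGTCAGTCCATAAGAGAGTCTGTAGATAAATAT";
        double ic = indexOfCoincidence(dnaPlainText);
        System.out.println("IC = " + ic);             // about 0.26, in line with the value above
        System.out.println("Key length estimate = " + keyLengthEstimate(0.067, 0.25, ic));
    }
}
```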

5 Results

This section illustrates the proposed technique with the encryption of the message 'hello'. Initially, the coordinates and dimensions of the key matrix are specified along with the secret message, and they are then used to generate the actual key matrix from the pi-matrix. The details of the key matrix used in the encryption procedure are shown in Fig. 2.

Fig. 2. The encryption procedure using the key matrix: (a) matrix coordinates and dimensions; (b) the key matrix.

The key matrix is converted to a 1D array in which the first element is the starting index in the DNA sequence where data hiding begins, and every other element denotes the number of gaps between DNA nucleotides before each character of the secret message is hidden. The 1D array representation is shown in Fig. 3. The conversion of the secret message 'hello' into nucleotide form, its embedding into the DNA sequence, and the final encrypted DNA sequence are shown in Fig. 4.


Fig. 3. The 1D array representation of the key matrix

Fig. 4. Final encrypted DNA sequence: (a) conversion of the secret message; (b) the DNA sequence.

A comparative analysis of different cryptography systems and the proposed method is shown in Table 1, while Table 2 compares DNA-based systems used in various areas, and their capabilities, with the proposed system. Here 'Error' represents the difference between the plain text (original) and the output of the decryption process.

Table 1. A comparative analysis of DNA cryptography systems

Method of cryptography | Utilized DNA technology | Description
A DNA-based encryption method | DNA digital coding, PCR primers | The message is transformed into a DNA template, with primers serving as the keys to encrypt and decrypt [13]
A pseudo-DNA cryptosystem | Conversion through transcription and splicing | Data is translated to protein as per the genetic code table by the sender; keys are also sent through protected channels [14]
A bimolecular DNA-based cryptosystem | Carbon nanotube | Nano-scale systems are presented for encryption [15]
Data encryption (proposed) | Pi matrix, 1D key matrix | The 1D key matrix is formed from the Pi matrix using coordinates and dimensions


Table 2. A comparison of DNA systems in various applications

System | Encoding binary information | Usage of binary encryption | Error
Molecular computing [16] | No | No | High
Hiding messages [17] | No | No | High
Organic data memory [18] | No | No | High
DNA-based watermarks [19] | Yes | Yes | Medium
Data encryption (proposed) | Yes | Yes | Low

6 Conclusion and Future Scope

Presently, an alarming and growing volume of data is generated and kept within computers, and tremendous quantities of vital and sensitive information are transferred among all these devices, so it is extremely important to ensure the safety of these data. The purpose of cyber security is to defend the data and integrity of computing assets against all threat factors throughout the entire life cycle of a cyber attack. DNA cryptography offers fresh hope for building algorithms that are practically impenetrable, because DNA computing is faster, requires less storage, and uses less power. In the proposed algorithm, a key matrix is formed from the pi-matrix and a DNA sequence is used as the key for encryption of the secret data. Additionally, the data is converted into the form of a DNA sequence (A, G, C, T), providing extra security so that no one other than the intended receiver can guess the keys or recover and understand the hidden data. In the future, additional code can be implemented so that the sender can directly encrypt and send the secret data to the receiver from the same application for convenience, and the receiver can receive and directly decrypt the data within the software itself.

References 1. Chang, C.-C., Lu, T.-C., Chang, Y.-F., Lee, C.-T.: Reversible data hiding schemes for deoxyribonucleic acid (DNA) medium. Int. J. Innov. Comput. Inform. Control 3(5), 1145–1160 (2007) 2. Shimanovsky, B., Feng, J., Potkonjak, M.: Hiding data in DNA. In: 5th International Workshop on Information Hiding, vol. 2578, pp. 373–386. LNCS, Netherlands (2002) 3. Shiu, H., Ng, J.K.L., Fang, J.F., Lee, R.C.T., Huang, C.H.: Data hiding methods based upon DNA sequences. Inform. Sci. 180(11), 2196–2208 (2010) 4. Chai, X., Gan, Z., Yuan, K., Chen, Y., Liu, X.: A novel image encryption scheme based on DNA sequence operations and chaotic systems. Neural Comput. Appl. 31, 219–237 (2019) 5. Enayatifar, R., Guimarães, F.G., Siarry, P.: Index-based permutation-diffusion in multipleimage encryption using DNA sequence. Opt. Lasers Eng. 115, 131–140 (2019) 6. Yu, S.-S., Zhou, N.-R., Gong, L.-H., Nie, Z.: Optical image encryption algorithm based on phase-truncated short-time fractional Fourier transform and hyper-chaotic system. Opt. Lasers Eng. 124, 105816 (2020)


7. Gong, L., Deng, C., Pan, S., Zhou, N.: Image compression-encryption algorithms by combining hyper-chaotic system with discrete fractional random transform. Opt. Laser Technol. 103, 48–58 (2018) 8. Nematzadeh, H., Enayatifar, R., Yadollahi, M., Lee, M., Jeong, G.: Binary search tree image encryption with DNA. Optik 202, 163505 (2020) 9. Luo, Y., Yu, J., Lai, W., Liu, L.: A novel chaotic image encryption algorithm based on improved baker map and logistic map. Multimed. Tools Appl. 78, 22023–22043 (2019) 10. Wang, Y., Zhao, Y., Zhou, Q., Lin, Z.: Image encryption using partitioned cellular automata. Neurocomput. 275, 1318–1332 (2018) 11. Ping, P., Wu, J., Mao, Y., Xu, F., Fan, J.: Design of image cipher using life-like cellular automata and chaotic map. Signal Process. 150, 233–247 (2018) 12. Pavithran, P., Mathew, S., Namasudra, S., Lorenz, P.: A novel cryptosystem based on DNA cryptography and randomly generated mealy machine. Comput. Secur. 104, 102160 (2021) 13. Cui, G., Qin, L., Wang, Y., Zhang, X.: An Encryption scheme using DNA Technology. In: Proc. 2008 3rd International Conference on Bio-Inspired Computing: Theories and Applications, pp. 37–42 (2008). https://doi.org/10.1109/BICTA.2008.4656701 14. Sadeg, S., Gougache, M., Mansouri, N., Drias, H.: An Encryption algorithm inspired from DNA. In: Proc. 2010 International Conference on Machine and Web Intelligence, pp. 344–349 (2010). https://doi.org/10.1109/ICMWI.2010.5648076 15. Chen, J.: A DNA-based, bimolecular cryptography design. In: Proc. 2003 IEEE International Symposium on Circuits and Systems, pp. III–III (2003). https://doi.org/10.1109/ISCAS.2003. 1205146. 16. Gehani, A., LaBean, T., Reif, J.: DNA-based cryptography. In: Jonoska, N., P˘aun, G., Rozenberg, G. (eds.) Aspects of Molecular Computing, pp. 167–188. Springer, Berlin, Heidelberg (2004). https://doi.org/10.1007/978-3-540-24635-0_12 17. Clelland, C.T., Risca, V., Bancroft, C.: Hiding messages in DNA microdots. Nature 399(6736), 533–534 (1999) 18. Wong, P.C., Wong, K.-K., Foote, H.: Organic data memory using the DNA approach. Commun. ACM 46(1), 95–98 (2003) 19. Heider, D., Barnekow, A.: DNA-based watermarks using the DNA-crypt algorithm. BMC Bioinform. 8(1), 176 (2007)

Continuous Behavioral Authentication System for IoT Enabled Applications Vivek Kumar(B) and Sangram Ray Department of Computer Science and Engineering, National Institute of Technology Sikkim, Ravangla 737139, Sikkim, India [email protected]

Abstract. In the modern era of digitization, the Internet of Things (IoT) plays a key role in many sectors such as smart factories, smart health monitoring, smart tracking, and smart cities, providing new opportunities for research and application development. However, the acceptance of IoT-based applications is impacted by their security and privacy challenges. The security concerns become more critical when an unauthorized user accesses network services using a paired network device (i.e., a smartphone or laptop). Therefore, an authentication system is required that continuously observes the activities of the end-device user. In order to address this challenge, this paper introduces a novel continuous behavioral authentication system that considers the limitations of IoT devices, such as battery power and other computational resources. The proposed authentication model uses machine-learning-based techniques to analyze application usage patterns and discover misleading user behavior. In addition, the correlation coefficient is used for feature ranking. The ranked features are used with three different machine learning algorithms, namely support vector machine (SVM) with sequential minimal optimization (SMO), radial basis function (RBF) based SVM-SMO, and a multi-layer perceptron (MLP). Experimental analysis has been carried out using a real-world application usage dataset to demonstrate the effectiveness of the proposed continuous behavioral authentication system. The experimental results demonstrate the superiority of the RBF-SVM-SMO technique over the other two machine learning models for the IoT-enabled continuous authentication model. Keywords: Behavioral authentication · Machine learning · Internet of Things · Supervised learning · Behavior classification · Smart mobiles

1 Introduction

The use of the Internet of Things (IoT) is becoming very popular everywhere because it allows remote sensing, automation, and data collection [1], transforming autonomous systems and bringing smart command and control to cyber-physical systems (CPS). At the same time, the expansion of machine learning (ML) has enhanced analytical capabilities, which we see in a growing number of applications and products such as self-driving cars and smart mobile applications [2]. In this context, IoT is an emerging paradigm that serves as a name for the


various technologies of intelligent objects, found everywhere and connected to the internet. These items are often deployed in open spaces to provide new services in a variety of application domains such as smart communities, smart cities, and smart health [3]. IoT-enabled applications require efficiency, security, and well-managed access control in order to be secure and adaptable [4]. In IoT-enabled applications, the IoT devices are used to collect data from the deployed environment, and the collected data are transmitted to a base station or server, which processes the data and identifies valuable insights for administrators or managers. The administrators use smart devices such as a smart mobile or a laptop to control and administer the IoT network devices. Therefore, such administrative applications for IoT devices require a suitable and effective authentication service that authorizes only the actual administrators to manage and control the network services. In this context, significant recent efforts [5–7] provide ways to authorize the administrative devices using behavioral biometrics, and these contributions motivate us to design and develop an improved authentication system for IoT-enabled applications. Therefore, in this paper a continuous behavioral authentication system based on behavioral biometrics is proposed. The proposed authentication technique collects and processes the user activity of administrative smart devices continuously, with a small sampling time. Current authentication techniques are limited to initial access control, whereas the proposed technique continuously keeps watch on the administrative device activity. This paper first investigates the different security needs of IoT-enabled applications through a brief review of existing methodologies. Based on the conducted review, a new continuous authentication system is proposed. Experiments on a publicly available dataset are then carried out and the performance of the prototype is described; the results and conclusion are drawn from this experimental analysis. Finally, research directions for the future extension and improvement of the proposed prototype are provided.

2 Recent Contributions

This section covers recent research efforts towards securing IoT-enabled networks; different intrusion detection systems (IDS) and authentication techniques are investigated. Internet of Things (IoT) devices are at risk of theft or loss and of being used by unauthorized users. Due to weaknesses in traditional authentication, the focus has turned to behavioral authentication, which makes it possible to continuously exploit usage profiles and sensor data. Y. Ashibani et al. [8] discuss behavioral authentication with regard to security and usability, and present an authentication scheme that verifies users with an average F-measure of 96.5%. Similarly, P. Li et al. [9] focus on user authentication in the cloud by analyzing and classifying user behaviors. They propose a Stochastic Petri net-based User Behaviour Authentication model (SPUBA), modify a K-modes algorithm, and propose an algorithm for calculating user behavior credibility; simulations are performed to analyze the execution time. Further, F. Liang et al.


[2] consider the good, the bad, and the ugly uses of ML. They examine the benefits of ML for security, such as IDS and decision accuracy, and the vulnerabilities from a security perspective, including ways in which ML can be compromised, misled, and subverted; finally, they identify a trend towards the utilization of ML in cyber attacks and intrusions. IoT devices produce confidential and sensitive data, so security is very important. In U. Khalid et al. [10], a decentralized authentication and access control mechanism based on fog computing and blockchain is proposed, and the results demonstrate superior performance. In order to enhance security in IoT, Y. Ashibani et al. [11] present ML-based user authentication for smart home networks using application access. A real-world dataset is used to validate the model, which is evaluated for continuous authentication that utilizes apps, and various classifiers are assessed regarding legitimate user identification. Since IDS can be a helpful tool for securing IoT-enabled applications, B. B. Zarpelão et al. [12] present a survey of IDS research for IoT whose objective is to identify leading trends, open issues, and future possibilities. They classify the literature according to detection method, placement strategy, security threat, and validation, and discuss the possibilities for each aspect of IoT-specific IDS schemes. P. Kaluzny et al. [13] present the use of behavioral biometrics in mobile banking and payment apps: while requiring secure services, customers often do not lock their devices and expose them to misuse and theft, so behavioral biometric methods can be utilized to secure such applications; the goal is to describe the areas in which behavioral biometrics can be used. S. Zhao et al. [14] investigate the applicability of Computational Intelligence (CI) in cybersecurity for IoT, including cybersecurity and privacy, cyber defense, IDS, and data security, and also provide directions and trends for security using CI. Next, according to A. Khraisat et al. [15], the diverse types of IoT devices make it challenging to protect the infrastructure using a traditional IDS; thus, an ensemble Hybrid Intrusion Detection System (HIDS) is proposed that combines a C5 classifier and SVM and unites the advantages of Signature-based IDS (SIDS) and Anomaly-based IDS (AIDS). Behaviour analysis of vehicles is helpful to grasp their condition, improve running efficiency, and ensure safe operation; using IoT to monitor and analyze car behavior, X. Feng et al. [16] design an IoT-based system in which the basic structure of the vehicle behavior analysis system is constructed and a deep network model based on Hadoop is established, and, compared with SVM, the algorithm can deal with massive data and improve prediction accuracy. K. A. P. da Costa et al. [17] focus on a rigorous review of the literature on ML applied to IoT and IDS, aiming at recent and in-depth research on relevant works that deal with intelligent techniques applied to IDS, with emphasis on IoT and ML. Smart health is important for continuous monitoring of patients and providing medical facilities, but these systems are connected over wireless networks that are vulnerable to threats, including Denial of Service (DoS), fingerprint and timing-based snooping, router attacks, select-and-forwarding attacks, sensor attacks, and replay attacks. S. A. Butt et al.
[18] discuss such attacks and their impact on health monitoring systems. L. Năstase et al. [19] present the application layer protocols that are used in the IoT: CoAP, MQTT, and


XMPP. They discuss these protocols separately and in comparison, together with the associated security protocols. G. Thamilarasu et al. [20] note that when attacks on IoT networks go undetected, they affect the availability of systems and increase data breaches and identity theft; they therefore develop an IDS for the IoT using a deep-learning algorithm that provides security as a service and facilitates interoperability between the communication protocols used in IoT. They evaluate the detection system using both real network traces and simulation, and the results confirm that the IDS can detect real-world intrusions effectively.

3 Proposed Scheme

The main aim of the proposed work is to design a continuous behavioral authentication system for IoT-enabled applications. The continuous authentication system keeps an eye on the activities of the administrative devices that the administrator uses to command the network application. Basically, the IoT devices are connected to the internet to communicate and receive commands from the user through the cloud server, and the user additionally has a mobile device or laptop to control the applications. If this paired device is lost or misplaced, any unknown user can misuse the application. Therefore, we need to identify the actual application owner by their application usage patterns: the model captures each and every activity of the administrative device to find anomalies in the application usage pattern, and the system evaluates the smartphone app usage activity on the server side and verifies the user.

3.1 Methodology

In order to distinguish between two different application usage patterns, we propose a machine-learning-based technique that processes the end user's activity log to obtain the pattern. An overview of the server-side machine learning component is given in Fig. 1.

Fig. 1. Proposed behavioral authentication

According to the given model, the system requires app usage data for analysis; thus, a real-world dataset is collected. The dataset can be defined as the user behavior


profile of Android application utilization. This dataset is a collection from more than 17 million users and is provided by Wandoujia, one of the leading Android marketplaces in China. It contains two categories:
• Activity data, including installations, un-installations, and updates.
• Network traces, which contain network activity (cellular or Wi-Fi), daily total access time, and the traffic generated.
In this work, the network usage data, which consist of app access activity, are used for experimentation. The dataset attributes are explained in Table 1.

Table 1. Dataset attributes and type

S. No. | Attribute | Type
1 | Date | Date
2 | Time | Time
3 | Package name | Text
4 | User id | Text
5 | F-Cellular time | Time (ms)
6 | F-Wi-Fi time | Time (ms)

The dataset contains a large number of users. During data processing we need to differentiate between users, so the applied algorithm predicts the user ID. In this experiment, we prepared five different sets of user data, containing the app access patterns of 6 to 10 users respectively (i.e., 6, 7, 8, 9, and 10 users). After that, data pre-processing is carried out; its main aim is to improve the quality of the learning data. The pre-processing involves two major steps:
1. Mapping of qualitative attributes into quantitative attributes
2. Elimination of low-frequency apps
The following process is used to perform the preprocessing task (Table 2):


Table 2. Preprocessing algorithm
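Based on the two pre-processing steps described above, a minimal Java sketch of the preprocessing could look as follows; the class and method names, the attribute encoding, and the frequency threshold are illustrative assumptions rather than the procedure of Table 2.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative pre-processing: (1) map qualitative attributes (e.g. package
// names) to numeric codes, and (2) drop records of apps used too rarely.
public class UsagePreprocessor {

    public static class Record {
        public final String packageName;
        public final String userId;
        public final long accessTimeMs;

        public Record(String packageName, String userId, long accessTimeMs) {
            this.packageName = packageName;
            this.userId = userId;
            this.accessTimeMs = accessTimeMs;
        }
    }

    // Step 1: assign each distinct qualitative value an integer code.
    public static Map<String, Integer> encodeQualitative(List<Record> records) {
        Map<String, Integer> codes = new HashMap<>();
        for (Record r : records) {
            codes.putIfAbsent(r.packageName, codes.size());
        }
        return codes;
    }

    // Step 2: keep only records whose app appears at least minFrequency times.
    public static List<Record> dropLowFrequencyApps(List<Record> records, int minFrequency) {
        Map<String, Integer> counts = new HashMap<>();
        for (Record r : records) {
            counts.merge(r.packageName, 1, Integer::sum);
        }
        List<Record> kept = new ArrayList<>();
        for (Record r : records) {
            if (counts.get(r.packageName) >= minFrequency) {
                kept.add(r);
            }
        }
        return kept;
    }
}
```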

After employing the above pre-processing steps, instances that contain very few usage patterns are also removed. Further, a feature-ranking technique is used: we compute the correlation coefficient of each attribute with respect to the user identity and reorganize the dataset based on the obtained coefficient values. The following ranking function is used to rank the attributes:

\[ rank = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}} \tag{1} \]

The dataset attributes are ranked according to the values obtained from Eq. (1). The transformed dataset is further partitioned into two subsets, i.e., training samples (70%) and test samples (30%); the training samples are used for learning with the machine learning algorithm, and the test set is used for validation of the trained model.
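As an illustration of the feature-ranking step, the following Java sketch computes the Pearson correlation coefficient of Eq. (1) between an attribute column and the numerically encoded user identity. It is a minimal sketch under the assumption that both inputs are numeric arrays of equal length, and it is not the authors' implementation.

```java
// Pearson correlation coefficient used for feature ranking (Eq. (1)):
// rank = sum((x_i - mean_x)(y_i - mean_y)) / sqrt(sum((x_i - mean_x)^2) * sum((y_i - mean_y)^2))
public class FeatureRanker {

    public static double correlation(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0.0, varX = 0.0, varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        double[] wifiTime = {120, 300, 250, 80, 400};   // example attribute values
        double[] userId   = {0, 1, 1, 0, 1};            // numerically encoded user identity
        System.out.println("rank = " + correlation(wifiTime, userId));
    }
}
```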


Now, in order to prepare a continuous authentication model, we need to continuously extract the user's mobile app usage data together with a machine learning model that continuously classifies the observed user's access log. In this context, if any anomaly is found in the app usage pattern, the pattern is labeled as suspect; further, if a user violates the historical patterns more than three times, the model predicts the mobile usage pattern as malicious or unauthorized access. We investigated different machine learning techniques and found that the SVM (Support Vector Machine) [21] and ANN (Artificial Neural Network) [22] are both frequently used for classification, prediction, and pattern recognition tasks. Therefore the proposed model incorporates SVM as the main classifier, and an ANN, more specifically an MLP (Multi-Layer Perceptron), is used for a comparative performance study. The SVM classifier mainly works well on binary classification problems; therefore, to improve the learning of SVM, we used the SMO (Sequential Minimal Optimization) technique for training and classification. In this context we implemented two variants of the SVM-based continuous authentication model. Initially, a linear SVM was used as the base classifier, but due to its low accuracy we implemented a second variant with the help of the RBF (Radial Basis Function) kernel. The RBF kernel can be described using the following equation:

\[ k(x, y) = \exp\!\left(-\frac{\lVert x - y \rVert^2}{2\sigma^2}\right) \tag{2} \]

where k(x, y) is the kernel function, which accepts two vectors x and y, and σ is a free parameter. Because the transaction log grows continuously and has a time-series nature, the data relationships cannot be approximated accurately in a linear manner, which is why the RBF kernel was implemented; after implementing the RBF-kernel-based multi-class SVM classifier, the performance is enhanced significantly. On the other hand, for the training of the MLP-based classifier the following parameters are used:
• learning rate 0.4,
• epoch cycles 4000,
• hidden layers 7.
This section has discussed the server-side model used to identify the behavioral pattern from the application usage patterns; the next section provides the experimental scenarios of the implemented system.
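The paper implements its classifiers with the LibSVM and WEKA Java libraries. As a hedged illustration (not the authors' exact configuration), the following sketch shows how an RBF-kernel multi-class SVM could be trained and queried with LibSVM's Java API; the feature values, labels, and parameter settings (C, gamma) are placeholders.

```java
import libsvm.svm;
import libsvm.svm_model;
import libsvm.svm_node;
import libsvm.svm_parameter;
import libsvm.svm_problem;

// Hedged sketch: train an RBF-kernel SVM on ranked usage features and predict
// the user identity of a new access-log sample. All values are placeholders.
public class RbfSvmSketch {

    private static svm_node[] toNodes(double[] features) {
        svm_node[] nodes = new svm_node[features.length];
        for (int i = 0; i < features.length; i++) {
            nodes[i] = new svm_node();
            nodes[i].index = i + 1;        // LibSVM feature indices are 1-based
            nodes[i].value = features[i];
        }
        return nodes;
    }

    public static void main(String[] args) {
        double[][] trainX = {{0.2, 0.7, 0.1}, {0.9, 0.1, 0.4}, {0.3, 0.6, 0.2}};
        double[] trainY = {1, 2, 1};       // user IDs as class labels

        svm_problem problem = new svm_problem();
        problem.l = trainX.length;
        problem.y = trainY;
        problem.x = new svm_node[trainX.length][];
        for (int i = 0; i < trainX.length; i++) {
            problem.x[i] = toNodes(trainX[i]);
        }

        svm_parameter param = new svm_parameter();
        param.svm_type = svm_parameter.C_SVC;     // multi-class handled internally (one-vs-one)
        param.kernel_type = svm_parameter.RBF;    // k(x,y) = exp(-gamma * ||x - y||^2)
        param.gamma = 0.5;                        // corresponds to 1 / (2 * sigma^2) in Eq. (2)
        param.C = 1.0;
        param.cache_size = 100;
        param.eps = 1e-3;

        svm_model model = svm.svm_train(problem, param);
        double predicted = svm.svm_predict(model, toNodes(new double[]{0.25, 0.65, 0.15}));
        System.out.println("Predicted user id: " + predicted);
    }
}
```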

4 Result and Discussion

In this section, the experimental setup for the implementation of the proposed RBF-SVM-SMO scheme and its outcomes are given. Further, the experimental results are compared with the existing SVM-SMO and MLP models in terms of precision, recall and F-score, and it is found that the proposed scheme performs better while offering comparable security.

4.1 Experimental Scenarios

The proposed continuous authentication model for IoT-enabled applications is implemented using Java technology. Additionally, the LibSVM and WEKA libraries are used to implement the SVM and MLP classifiers. The proposed model is implemented for three different experimental scenarios:

i. Classical SVM with SMO: in this scenario the SVM is developed with the linear classification technique; additionally, to support multi-class classification, SMO is implemented for the SVM.
ii. SVM with RBF kernel and SMO: in this experiment the SVM is implemented with the RBF kernel function, and SMO is used for multi-class classification.


iii. MLP based classification: finally, for a comparative performance study, an MLP-based model is implemented as a back-propagation neural network.

This section presented the proposed continuous authentication system using ML algorithms. The next section involves the performance evaluation of the proposed experimental models.

4.2 Experimental Results

The proposed work is intended to design and develop an accurate continuous authentication model which can work with a bulk amount of data and classify the user behavior accurately in less time for making effective and precise decisions. In this context, a model has been proposed for implementation and the experiments are extended to three different scenarios. The performance of all three experimental scenarios is measured and explained in this section. In order to demonstrate the performance of the proposed data models for IoT-enabled continuous behavioral authentication systems, the precision, recall, and F-score are measured. Additionally, to demonstrate the time efficiency, the training time of the algorithms is also measured. Fig. 2 and Table 3 present the results in terms of precision rate: the table lists the observed performance of the implemented system and Fig. 2 plots the performance of the models as a line graph. The X-axis of the line graph shows the number of user profiles used for experimentation and the Y-axis shows the precision of the implemented models.

Table 3. Precision of implemented techniques

No of user profiles | SVM-SMO | RBF-SVM-SMO | MLP
6                   | 0.79    | 0.87        | 0.82
7                   | 0.86    | 0.94        | 0.86
8                   | 0.83    | 0.92        | 0.87
9                   | 0.89    | 0.95        | 0.9
10                  | 0.87    | 0.94        | 0.91

According to the given precision results, the proposed RBF-based combination of SVM and SMO shows higher performance as compared to the other two data models. Similarly, Fig. 3 presents the results in terms of recall, and the corresponding values are given in Table 4. The observations show that the combination of RBF-SVM and SMO has greater potential as compared to the other two implemented models. Figure 4 shows the performance in terms of F-score. The F-score is the harmonic mean of precision and recall, thus it can be used to demonstrate the superiority of the proposed models. In these diagrams, the number of users' app-access data is represented on the X-axis and the Y-axis shows the relevant performance measure, i.e. precision, recall, and F-score.

Fig. 2. Precision of implemented techniques (line graph; X-axis: number of user profiles, Y-axis: precision; series: SVM-SMO, RBF-SVM-SMO, MLP)

Table 4. Recall of implemented techniques

No of user profiles | SVM-SMO | RBF-SVM-SMO | MLP
6                   | 0.72    | 0.79        | 0.74
7                   | 0.78    | 0.84        | 0.77
8                   | 0.75    | 0.81        | 0.78
9                   | 0.79    | 0.85        | 0.74
10                  | 0.76    | 0.88        | 0.79

Fig. 3. Recall of implemented techniques (line graph; X-axis: number of user profiles, Y-axis: recall; series: SVM-SMO, RBF-SVM-SMO, MLP)

The F-score of a classification model can be measured using the following equation:

$$F\text{-}score = 2 \times \frac{Precision \times Recall}{Precision + Recall} \qquad (3)$$

The experimental results given in Fig. 4 show the performance of the different implemented models in terms of F-score. In this experiment, the SVM trained using the SMO technique and the MLP-based technique produce similar performance. Additionally, the RBF-based SVM, which is also trained using the SMO technique, produces more accurate classification outcomes. Therefore the proposed model of continuous authentication is accurate as per the system requirements (Table 5 and Table 6).
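Continuing the illustrative scikit-learn sketch, the three reported metrics can be computed as follows; macro-averaging over the user classes is an assumption, as the paper does not state how per-user scores are aggregated.

```python
from sklearn.metrics import precision_recall_fscore_support

y_pred = rbf_svm.predict(X_te)          # classifier and test split from the earlier sketches
# Macro-averaged precision, recall and F-score (Eq. 3) over the user classes
prec, rec, f1, _ = precision_recall_fscore_support(y_te, y_pred, average="macro")
print(f"precision={prec:.2f} recall={rec:.2f} f-score={f1:.2f}")
```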


Fig. 4. F-score of developed models (line graph; X-axis: number of user profiles, Y-axis: F-score; series: SVM-SMO, RBF-SVM-SMO, MLP)

Table 5. F-score of developed models

No of user profiles | SVM-SMO | RBF-SVM-SMO | MLP
6                   | 0.7533  | 0.828       | 0.7779
7                   | 0.818   | 0.8871      | 0.8125
8                   | 0.7879  | 0.8615      | 0.8225
9                   | 0.837   | 0.8972      | 0.8121
10                  | 0.8112  | 0.909       | 0.8457

Continuous authentication not only requires a higher classification accuracy, it also requires the patterns to be analyzed in a small amount of time. Therefore the proposed work also evaluates the system in terms of the time required for algorithm learning. The learning time of all the implemented algorithms is shown in Fig. 5. As in the previous line graphs, the X-axis shows the number of user samples and the Y-axis shows the learning time of the algorithms with an increasing amount of data. In this investigation, we found that the number of samples directly impacts the learning algorithms' time complexity. According to the experimental results, the MLP requires a larger running time than the other two implemented techniques for user behavior classification. Therefore the proposed SVM-based classification scheme is efficient and accurate.
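In the illustrative scikit-learn setting, the training time reported in Table 6 would correspond to timing the fit call, e.g.:

```python
import time

start = time.perf_counter()
rbf_svm.fit(X_tr, y_tr)                       # classifier and training split from the earlier sketches
elapsed_ms = (time.perf_counter() - start) * 1000.0
print(f"training time: {elapsed_ms:.0f} ms")  # compare with the values in Table 6
```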


Table 6. Training time (in milliseconds)

No of user profiles | SVM-SMO | RBF-SVM-SMO | MLP
6                   | 237     | 279         | 348
7                   | 458     | 481         | 592
8                   | 727     | 799         | 903
9                   | 992     | 1087        | 1176
10                  | 1320    | 1448        | 1637

Fig. 5. Training time (line graph; X-axis: number of user profiles, Y-axis: training time in ms; series: SVM-SMO, RBF-SVM-SMO, MLP)

5 Conclusion and Future Work

The IoT (Internet of Things) is a new-generation network that enables automation and remote operations. This network generates a significant amount of data, some of which is sensitive and confidential. In this context, security is one of the key necessities of the network. In most automation settings, the remote devices are controlled by mobile phones or by desktop or laptop machines. Unauthorized access to these devices can affect the availability of the network services. We therefore need an advanced authentication system that understands the user behavior on the remote controller device (i.e. mobile or laptop). The proposed work is aimed at investigating behavioral biometrics for continuous end-device authentication. The proposed model uses machine learning and data analytics techniques to implement the required authentication system. Additionally, the application usage data is used to learn the user behavior and classify observed behavior according to the end-user. Therefore, at first a technique based on the SVM and SMO learning concepts is proposed. This technique is further enhanced by implementing the RBF kernel, which improves the classification accuracy significantly. In addition, for a comparative performance study, an MLP-based model is implemented. Based on the experimental performance, the proposed model is found to be accurate and efficient, which is acceptable for continuous authentication. However, the model needs the improvements covered in the following discussion. Therefore, in the near future, the following extensions are required:


i. The learning time for new samples needs to be improved; therefore, the next work shall involve pre-trained models which can retain previous learning as well as adapt to new patterns.
ii. The current application usage data carries very limited information about the device usage pattern; therefore, in the near future more features need to be identified for more accurate learning and classification.
iii. The current model provides a 0.90 F-score, which is acceptable in experimental scenarios but not effective enough for real-world deployment; therefore, the authentication accuracy needs to be improved further.
iv. The current system is limited to serving a limited number of devices due to the bulk amount of generated data; thus, we also need to explore deep learning models which can work efficiently and accurately on large amounts of data.

References 1. Ray, P.P.: A survey on Internet of Things architectures. J. King Saud Univ. – Comput. Inform. Sci. 30(3), 291–319 (2018) 2. Liang, F., Hatcher, W.G., Liao, W., Gao, W., Yu, W.: Machine learning for security and the internet of things: the good, the bad, and the ugly. IEEE Access 7, 158126–158147 (2019) 3. Zanella, A., Bui, N., Castellani, A., Vangelista, L., Zorzi, M.: Internet of things for smart cities. IEEE Internet of Things J. 1(1), 22–32 (2014) 4. Patwary, A.A.N., Fu, A., Naha, R.K., Battula, S.K., Garg, S., Patwary, M.A.K., Aghasian, E.: Authentication, access control, privacy, threats and trust management towards securing fog computing environments: a review. arXiv preprint arXiv:2003.00395 (2020) 5. Krašovec, A., Pellarini, D., Geneiatakis, D., Baldini, G., Pejovi´c, V.: Not quite yourself today: behaviour-based continuous authentication in IoT environments. Proc. ACM on Interact. Mob. Wearable Ubiquitous Technol. 4(4), 1–29 (2020) 6. Kumar, V., Sangram, R.: A smart mobile authentication technique using user centric attributes classifications. Int. J. Comput. Intell. & IoT 2(2), 435–440 (2019) 7. Liang, Y., Samtani, S., Guo, B., Yu, Z.: Behavioral biometrics for continuous authentication in the internet-of-things era: an artificial intelligence perspective. IEEE Internet Things J. 7(9), 9128–9143 (2020) 8. Ashibani, Y., Mahmoud, Q.H.: A behavior profiling model for user authentication in IoT networks based on app usage patterns. In: IECON 2018–44th Annual Conference of the IEEE Industrial Electronics Society, pp. 2841–2846 (2018) 9. Li, P., Yang, C., He, X., Lau, T.F., Wang, R.: User behaviour authentication model based on stochastic petri net in cloud environment. In: Chen, G., Shen, H., Chen, M. (eds.) Parallel Architecture, Algorithm and Programming, pp. 59–69. Springer Singapore, Singapore (2017). https://doi.org/10.1007/978-981-10-6442-5_6 10. Khalid, U., Asim, M., Baker, T., Hung, P.C.K., Tariq, M.A., Rafferty, L.: A decentralized lightweight blockchain-based authentication mechanism for IoT systems. Cluster Comput. 23(3), 2067–2087 (2020) 11. Ashibani, Y., Mahmoud, Q.H.: A machine learning-based user authentication model using mobile App data. In: Kahraman, C., Cebi, S., Onar, S.C., Basar Oztaysi, A., Tolga, C., Sari, I.U. (eds.) Intelligent and Fuzzy Techniques in Big Data Analytics and Decision Making: Proceedings of the INFUS 2019 Conference, Istanbul, Turkey, July 23-25, 2019, pp. 408–415. Springer International Publishing, Cham (2020). https://doi.org/10.1007/978-3-030-237561_51


12. Zarpelão, B.B., Miani, R.S., Kawakani, C.T., de Alvarenga, S.C.: A survey of intrusion detection in Internet of Things. J. Netw. Comput. Appl. 84, 25–37 (2017) 13. Kału˙zny, P.: Behavioral biometrics in mobile banking and payment applications. In: Abramowicz, W., Paschke, A. (eds.) Business Information Systems Workshops: BIS 2018 International Workshops, Berlin, Germany, July 18–20, 2018, Revised Papers, pp. 646–658. Springer International Publishing, Cham (2019). https://doi.org/10.1007/978-3-030-048495_55 14. Zhao, S., Li, S., Qi, L., Da Xu, L.: Computational intelligence enabled cybersecurity for the internet of things. IEEE Trans. Emerg. Top. Comput. Intell. 4(5), 666–674 (2020) 15. Khraisat, A., Gondal, I., Vamplew, P., Kamruzzaman, J., Alazab, A.: A novel ensemble of hybrid intrusion detection system for detecting internet of things attacks. Electronics 8(11), 1210 (2019) 16. Feng, X., Hu, J.: Research on the identification and management of vehicle behaviour based on Internet of things technology. Comput. Commun. 156, 68–76 (2020) 17. da Costa, K.A., Papa, J.P., Lisboa, C.O., Munoz, R., de Albuquerque, V.H.C.: Internet of Things: a survey on machine learning-based intrusion detection approaches. Comput. Netw. 151, 147–157 (2019) 18. Butt, S.A., Diaz-Martinez, J.L., Jamal, T., Ali, A., De-La-Hoz-Franco, E., Shoaib, M.: IoT smart health security threats. In: 2019 19th International conference on computational science and its applications (ICCSA), pp. 26–31 (2019) 19. Nastase, L.: Security in the internet of things: a survey on application layer protocols. In: 2017 21st international conference on control systems and computer science (CSCS), pp. 659–666 (2017) 20. Thamilarasu, G., Chawla, S.: Towards deep-learning-driven intrusion detection for the internet of things. Sensors 19(9), 1977 (2019) 21. Krishnamoorthy, S., Rueda, L., Saad, S., Elmiligi, H.: Identification of user behavioral biometrics for authentication using keystroke dynamics and machine learning. In: Proceedings of the 2018 2nd International Conference on Biometric Engineering and Applications, pp. 50–57 (2018) 22. Harun, N., Woo, W.L., Dlay, S.S.: Performance of keystroke biometrics authentication system using artificial neural network (ANN) and distance classifier method. In: International Conference on Computer and Communication Engineering (ICCCE’10), pp. 1–6 (2010)

A Secure ‘e-Tendering’ Application Based on Secret Image Sharing

Sanchita Saha1,2(B), Arup Kumar Chattopadhyay2, Suman Kumar Mal1, and Amitava Nag2

1 Haldia Institute of Technology, Haldia 721657, India
[email protected]
2 Central Institute of Technology Kokrajhar, Kokrajhar 783370, Assam, India

Abstract. Since an ‘e-Tendering’ application allows bidders to submit e-tenders which are confidential documents, the security of these documents is the highest requirement of any ‘e-Tendering’ system. The party inviting the e-tender (which may be an organization or a government) or any entities administering the ‘e-Tendering’ system (known as e-tender committee members) has the ability to see every e-tender submitted to the system. In the presence of multiple authorities, any dishonest authority may compromise the confidentiality of the e-tenders, and identification of the dishonest authority is difficult. Traditional cryptographic approaches are not enough to provide complete security to ‘e-Tendering’ systems. A secret sharing-based approach can be utilized to provide a cost-effective solution. If the e-tenders are submitted in the form of images, a type of secret sharing known as secret image sharing may be applied, which ensures that the authorities individually or in collusion with a certain number of authorities cannot compromise the confidentiality of the e-tenders. In this paper, we propose a mobile application, i.e., an ‘e-Tendering’ application based on an (n, n) secret image sharing scheme. We consider that the bidders upload their tender images through the app as per the requirement of the tender-inviting organization. The application is implemented to encode each secret tender-image into n share images (none of the share images discloses any information stored in the secret image). Thus, each of the n authorities of the organization receives exactly one share image. The recovery of the secret tender image is possible only in the presence of all the n share images. Thus, it requires all the authorities to be involved in retrieving the e-tender details. The underlying secret image sharing scheme of the system is based on basic Boolean operations, which guarantees high performance.

Keywords: e-tender · Secret image sharing · Security · Boolean operation

1 Introduction

Tendering is a crucial business activity that assists organizations in selecting the most suitable bidder through a competitive bid for winning a given project. The


Internet has grown in popularity as a platform for businesses where individuals can communicate with organizations to carry out e-business activities. An ‘eTendering’ system or electronic tendering system is a structured activity that replaces paper-based tender processes. It is more convenient and practical than a manual paper-based system because it automates the tendering process. ‘eTendering’ has been used in many countries due to its efficiency and efficacy. As a result, organizations get more opportunities to improve their business strategy through the system. However, there can be various security threats to these systems related to the confidentiality of tender documents and the privacy of the bidders. Thus, there is a need to ensure various aspects of information security such as confidentiality, integrity, availability, authentication, and nonrepudiation related to the e-tender documents in an ‘e-Tendering’ system. Many researchers analyzed the security requirements in ‘e-Tendering’ process and proposed various secure ‘e-Tendering’ systems [1,3,11,15,19,27]. In addition, some researchers utilized various secret sharing techniques to design secure ‘e-Tendering’ systems. Mohammadi and Jahanshahi [17] proposed an ‘e-Tendering’ architecture that uses Shamir’s [22] polynomial-based threshold secret share scheme for securing tender documents. However, the time complexity of most of the polynomial-base secret sharing schemes is O(rlog 2 r), where r-out-of-n or (r, t) is the threshold structure. On the contrary, the time complexity for most Boolean-based secret image sharing schemes is O(n) (most Booleanbased secret image sharing schemes consider an n-out-of-n or (n, n) threshold structure). Additionally, most polynomial-based schemes reconstruct slightly distorted versions of the secret images (lossy recovery). However, in Boolean-based schemes, the recovered secret images are completely lossless. Therefore, we use an (n, n) Boolean-based SIS scheme to ensure security along with computational efficiency to design our ‘e-Tendering’ application. Further, unlike most of the ‘eTendering’ systems that assume the tender committee members (the entities responsible for storing, verifying and evaluation of tender documents) are fully trusted, our scheme considers an environment where some tender committee members may be dishonest risking the privacy of the bidders being compromised. In this paper, we propose an ‘e-Tendering’ application based on an (n, n) secret image sharing (SIS) scheme. We consider e-tender documents as secret images and implement an ’e-Tendering’ application using the SIS scheme proposed by Wang et al. [24]. In our proposed scheme, we primarily focus on the confidentiality of the e-tender documents, such that any dishonest member inside the organization does not disclose those before the bid opening date and time. The organization of the rest of the paper is as follows: in Sect. 2, we discuss some preliminaries that are related to secret image sharing. In Sect. 3, we present the related work in the areas of e-tendering and secret image sharing. We proposed the design of our ‘e-Tendering’ application in Sect. 4. In Sect. 5, the implementation and experimental results are presented. Finally, in Sect. 6, we conclude.

2 Preliminary

In this section, we define (n, n) secret image sharing and its major entities.

Definition 1 ((n, n) secret image sharing). In an (n, n) secret image sharing scheme, a secret image I is shared among a group of participants P = {Pi}ni=1 so that every participant holds exactly one share image. The scheme guarantees reconstruction of the secret image in the presence of all the share images when all n participants collaborate; otherwise, any group consisting of n − 1 or fewer participants gets nothing.

2.1 Major Entities

– Secret Image. A secret image I needs to be shared among a group of participants.
– Dealer. The dealer D is responsible for producing n share images from the input secret image. D further distributes the share images among participants so that each participant holds exactly one share image.
– Participants. Participants are represented as the set P = {Pi}ni=1 and they are the users seeking the secret image.
– Combiner. A combiner C is responsible for decoding the secret image when a particular set of participants submit their share images.

3 Related Work

‘e-Tendering’ is progressively being used worldwide, which can be described as submitting, accessing, or receiving documents related to tender electronically through the Internet. It helps any organization collect many relevant e-tenderers by publishing a tender notice for a specific task. ‘e-Tendering’ acts as a bridge between the organization inviting the tenders and bidders. In addition, it helps to overcome numerous problems associated with the manual tender system. Various ‘e-Tendering’ systems were designed and implemented all over the world. These systems can be commercial third-party systems for supporting numerous government and private client organizations, and non-commercial systems for client organizations based on their business requirements. Maintaining the confidentiality of the e-tender documents and the privacy of the bidders are the major concerns in any e-tendering system design and implementation. Du et al. [11] initially identified security requirements and classified security architectures for ‘e-Tendering’. They proposed an overview of a distributed TTP (Trusted Third Party) architecture for an ‘e-Tendering’ system. Chan et al. [3] presented a design and implementation of an ‘e-Tendering’ system for the automation of tendering processes utilising web services. Oyediran et al. [19] performed a survey of construction clients’ opinions on ‘e-Tendering’


system for managing the bidding process in Nigeria. The survey clearly specifies the thorough understanding of the prospects of the ‘e-Tendering’ systems to manage the tendering process. Yutia and Budi [27] proposed a blockchain-based ‘e-Tendering’ System. Kavishe and Chileshe [15] proposed a validated publicprivate partnerships framework that improves housing delivery in developing countries. Abdullahi et al. [1] proposed an ‘e-Tendering’ system to automate existing manual tendering. They have developed a web-based system that supports some features like tender notification, submission, opening, evaluation, approval and awarding. The system is designed to improve the tendering skill of the procurement process utilized by Nigerian government procurement agencies. Using secret sharing technique can significantly reduce the computation cost compare to traditional techniques like encryption or steganography. Secret sharing method was independently invented by Shamir [22] and Blakley [2] in 1979. Shamir’s scheme is based on Lagrange interpolation of polynomial, and Blakley’s scheme is based on the principle of hyperplane geometry. In a (r, n) threshold secret sharing method, the owner of a secret (known as the dealer) encodes the secret into some parts or shares. Then the shares are distributed among the n number of participants so that each participant receives precisely one share. Reconstruction of the secret is possible if r or more participants collaborate and submit their shares. Here, r is called threshold value (r ≤ n). However, any group of r − 1 or lesser participants cannot compute the secret. Mohammadi and Jahanshahi [17] proposed an ‘e-Tendering’ architecture using Shamir’s threshold secret sharing scheme [22] for securing tender documents. In this approach, when a bidder uploads a tender document in the ‘e-Tendering’ system, the system produces share images from the tender image, which the participants hold. Thus, the system allows predefined participants to open the tender document. Secret sharing can be utilized to share multimedia documents such as images and audio. Sharing of secret images is known as secret image sharing or SIS. If the e-tender documents are accepted in form of images, then SIS schemes are very efficient to design any secure application like e-tendering. Thien and Lin [23] proposed an important polynomial-based (r, n)-threshold SIS scheme (also known as Thien-Lin’s scheme) which is based on Shamir’s secret sharing scheme. Wu [26] extended the Thien-Lin’s scheme [23] for light images. Several polynomial-based SIS schemes are presented in [4,13,14,16,21,25,26]. However, most polynomial-based SIS schemes have a lossy recovery process, where the recovered image is slightly distorted from the original secret image. Although a few polynomial-based SIS schemes can perform lossless recovery, it comes with high computation costs. Boolean-based SIS schemes are cost-efficient as they involve elementary Boolean operations (predominantly XOR-operations) and can perform lossless recovery. Several Boolean-based SIS schemes were proposed in [5–10,12,18,20, 24]. An (n, n) secret sharing scheme is a specialized and the strictest form of threshold secret sharing that demands the presence of all participants to reconstruct the secret. Thus, there is no chance to form any unfaithful group of r < n participants to ingress the secret without the permission of the rest (n − r) par-


ticipants. In other words, the recovery of the secret is only possible if all the participants collaborate. Most Boolean-based SIS schemes follow the (n, n) access structure.

3.1 Review of Wang-Ma-Li's SIS Scheme

An e-tender may be encrypted at the point of submission to ensure its confidentiality against any attack by intruders. However, we consider an environment where the tender committee members who are responsible for storing, verifying and evaluating the e-tender documents may not be fully trusted. A dishonest member may disclose e-tender documents before the bid closing and compromise the privacy of the bidders. An (n, n) secret image sharing (SIS) scheme can be used to provide a cost-effective solution to protect the confidentiality of the e-tender documents even in the presence of (n − 1) dishonest members. In the proposed scheme, we use the SIS scheme proposed by Wang et al. [24]. Using the SIS scheme, each submitted e-tender can be encoded into a number of shares and each member is assigned exactly one share. The e-tender can be disclosed only if all the members collaborate and combine their shares. Thus, any collusion of fewer than n members cannot compromise the privacy of the bidder. In this subsection, a brief review of the SIS scheme by Wang et al. is presented.

Construction of share images
– Let I be the secret image of given size w × h.
– Let P = {P1, P2, · · · , Pn} be the set of n participants.
– Let D be the dealer and C be the combiner.
The dealer D performs the following steps to generate the share images.
Step 1. Choose n − 1 random matrices R1, R2, · · · , Rn−1 which are of the same size as the secret image.
Step 2. Compute the n share images as follows:

$$S_i = \begin{cases} R_1, & \text{if } i = 1 \\ R_{i-1} \oplus R_i, & \text{if } 2 \le i \le n-1 \\ R_{n-1} \oplus I, & \text{if } i = n \end{cases}$$

Step 3. Transmit the share images through a secure channel so that each participant receives exactly one share image.

Recovery of Secret Image
If all n participants {Pi}ni=1 submit their share images {Si}ni=1 to the combiner C, C computes the secret as follows:

$$I = \bigoplus_{i=1}^{n} S_i$$

Time Complexity: The time complexity of a secret sharing scheme is usually measured for the secret-recovery phase. The recovery process in the above scheme [24] involves only n − 1 XOR operations (given it is an (n, n) scheme). Thus, the time complexity can be measured as O(n).
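A minimal numpy sketch of the construction and recovery steps above (a stand-in, not the authors' implementation); a small random array stands in for the grayscale e-tender image.

```python
import numpy as np

def make_shares(secret, n, rng=None):
    """(n, n) XOR-based sharing (Wang et al.): S1 = R1, Si = R(i-1) ^ Ri, Sn = R(n-1) ^ I."""
    rng = np.random.default_rng() if rng is None else rng
    R = [rng.integers(0, 256, size=secret.shape, dtype=np.uint8) for _ in range(n - 1)]
    shares = [R[0]]
    shares += [R[i - 1] ^ R[i] for i in range(1, n - 1)]
    shares.append(R[n - 2] ^ secret)
    return shares

def recover(shares):
    """Lossless recovery: the XOR of all n shares gives back the secret image."""
    out = shares[0].copy()
    for s in shares[1:]:
        out ^= s
    return out

secret = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)   # stand-in e-tender image
shares = make_shares(secret, n=4)
assert np.array_equal(recover(shares), secret)                    # all n shares recover I exactly
```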

4 Proposed Scheme

In this section, we present the design of our ‘e-Tendering’ application based on Wang’s [24] (n, n) secret image sharing scheme. The high-level use-case diagram is presented in Fig. 1. The identified actors and their roles in the system are defined as follows:

Fig. 1. Use case diagram of the application

– Organization Admin. The organization admin publishes the e-tender advertisements on behalf of the organization inviting tenders, and is responsible for inviting any n number of tender committee members. It also acts as the combiner during the reconstruction of the confidential e-tender image.


– Bidders. The bidders upload their tenders when any organization publishes an e-tender notice with specific requirements.
– Tender Committee Members. There are n tender committee members. These are the entities responsible for storing, verifying and evaluating tender documents. In the proposed scheme, each member holds a share of each original tender image uploaded by bidders until the end of the submission date and time. Then, when opening the tenders, the organization admin pools the share images from the tender committee members to recombine and reconstruct the secret e-tenders.
Figure 2 shows the entire process in the form of a flow diagram. The several tasks carried out by the system can be grouped into two main modules: (1) uploading and storing of e-tenders and (2) opening of e-tenders. In the following subsections, we describe the steps for these two modules.

Fig. 2. Flow-diagram of the application

4.1 Uploading and Storing of e-Tenders

1. The organization admin initially registers as admin to the system and also registers the tender-inviting organization by providing valid organization details.
2. The organization admin may create advertisements for inviting tenders with the required information and also provide instructions to upload the e-tender advertisement through the application.
3. The admin also invites n tender committee members who are responsible for collecting shares of each secret tender-image which each bidder uploads against any tender advertisement.


4. The n tender committee members of the organization register themselves in the application via an invitation link sent by the organization admin.
5. The bidders need to sign up in the application and individually upload the highly confidential tender image through the application as per the requirement.
6. After a bidder places a bid, the original e-tender image is temporarily stored on the server.
7. The application is implemented to encode the secret tender-image It into n share images (none of the random-like share images reveals any information about the secret image).

The Process of Generating Share Images: The system performs the following tasks:
(a) Produce n − 1 random images {R1, R2, · · · , Rn−1} of the same size as It.
(b) Compute the n share images {S1, S2, · · · , Sn} as follows:

$$S_1 = R_1,\quad S_2 = R_1 \oplus R_2,\quad \cdots,\quad S_i = R_{i-1} \oplus R_i,\quad \cdots,\quad S_{n-1} = R_{n-2} \oplus R_{n-1},\quad S_n = R_{n-1} \oplus I_t$$

8. Then, exactly one share image is assigned to each member of the tender committee.
9. Once the shares are assigned to the members, the temporary image is destroyed from the server.

4.2 Opening of e-Tenders

1. A bid-ending date is set for every tender advertisement by the organization admin, and after the ending date all n tender committee members have to forward their shares to the organization admin.
2. The admin checks his dashboard to find which advertisement's bidding date is over and instructs the system to recombine the shares from all the tender committee members to recover the actual e-tender image.

The Process of Recovering the e-Tender Image: If all the n shares are pooled, the system can reconstruct the e-tender image It as follows: It = S1 ⊕ S2 ⊕ · · · ⊕ Sn ,


3. Recombined images are added to a zip file, and this zip file is temporarily stored on the server. Organization admin receives the zip file via email. Then, the temporary zip file is destroyed from the server.
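To tie the two modules together, here is an illustrative end-to-end sketch (not the published code) that reuses make_shares and recover from the sketch in Sect. 3.1; the file paths, the grayscale conversion, and the e-mail wiring via Django's EmailMessage are assumptions about details the paper leaves open.

```python
import io
import os
import zipfile

import numpy as np
from PIL import Image
from django.core.mail import EmailMessage

def process_uploaded_tender(image_path, out_dir, n):
    """Sect. 4.1, steps 6-9: encode the temporarily stored e-tender image into n share
    PNGs (one per committee member) and destroy the original."""
    secret = np.array(Image.open(image_path).convert("L"), dtype=np.uint8)
    shares = make_shares(secret, n)                      # XOR-based (n, n) sharing, Sect. 3.1
    paths = []
    for i, share in enumerate(shares, start=1):
        p = os.path.join(out_dir, f"share_{i}.png")
        Image.fromarray(share).save(p)
        paths.append(p)
    os.remove(image_path)                                # the temporary original is destroyed
    return paths

def recover_and_send(share_paths, admin_email):
    """Sect. 4.2, step 3: pool all n shares, XOR them back, zip the result and e-mail it."""
    shares = [np.array(Image.open(p), dtype=np.uint8) for p in share_paths]
    tender = recover(shares)                             # lossless XOR recovery, Sect. 3.1
    png = io.BytesIO()
    Image.fromarray(tender).save(png, format="PNG")
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("tender.png", png.getvalue())
    msg = EmailMessage("Recovered e-tender", "See attachment.", to=[admin_email])
    msg.attach("tender.zip", archive.getvalue(), "application/zip")
    msg.send()                                           # requires a configured Django e-mail backend
```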

5 Implementation and Experimental Results

For implementation purposes, we have used several tools and technologies such as Python 3 with Django, Celery, RabbitMQ, SQLite. In our application, we have developed various functionalities of the system, which are described along with snapshots of the application interface as follows: 1. Various types of users need to register to use the system. The application interface for registering users is shown in Fig. 3.

Fig. 3. Registration interface for the bidders and the organization admin

2. The admin invites n tender committee members (we have presented the results of an experiment where we have considered four tender committee members). The application interface of the invitation is as shown in Fig. 4.

Fig. 4. Invitation to the committee members


3. The application interface for registration of the tender committee members is shown in Fig. 5. 4. Bidders sign up in the application and upload the confidential e-tender image through the application. The application interface for uploading the tenderimage is as shown in Fig. 6. 5. A sample e-tender image uploaded is as shown in Fig. 7. 6. The share images generated by the application when the bidder uploads the tender-image are as shown in Fig. 8. 7. The admin checks his/her dashboard. After the bidding ending date, the admin requests the tender committee members to submit the shares such that those can be combined to recover the secret e-tender images. The application interface for the same is as shown in Fig. 9. 8. The secret e-tender image is reconstructed as per the sample input image, is as shown in Fig. 10.

Fig. 5. Sign up of the committee members

Fig. 6. Tender-image upload


Fig. 7. Input tender-image

(a) Share-1

(b) Share-2

(c) Share-3

(d) Share-4

Fig. 8. Four share images


Fig. 9. Admin request for the recovery of e-tender

Fig. 10. Tender-image reconstruction

6 Conclusion

In this paper, we present the design and implementation of a secure ‘e-Tendering’ application based on the XOR-based (n, n) secret image sharing scheme proposed by Wang et al. [24]. In an ‘e-Tendering’ application, the e-tender document is considered a very confidential document, and its content should not be disclosed before the bid close date and time. Our design and implementation show that an ‘e-Tendering’ application can be efficiently implemented using a secret image sharing scheme, where e-tender is the secret image. When each bidder uploads his/her e-tender form, the e-tender image is encoded to share images (random-like images) using the (n, n) secret image sharing scheme. Then, those share images are stored with the tender committee members responsible for storing, verifying, and evaluating tender documents. After the due date of the bidding, the proposed system combines the shares pooled from the tender


committee members and reconstructs the original secret e-tender image. The functions designed to implement the security of the system are computationally efficient since they are based on elementary Boolean operations.

References 1. Abdullahi, B., Ibrahim, Y.M., Ibrahim, A.D., Bala, K.: Development of web-based e-tendering system for Nigerian public procuring entities. Int. J. Constr. Manage. 1–14 (2019) 2. Blakley, G.R.: Safeguarding cryptographic keys. In: 1979 International Workshop on Managing Requirements Knowledge (MARK), pp. 313–318. IEEE (1979) 3. Chan, L.S., Chiu, D.K., Hung, P.C.: E-tendering with web services: a case study on the tendering process of building construction. In: IEEE International Conference on Services Computing (SCC 2007), pp. 582–588. IEEE (2007) 4. Chang, C.C., Lin, C.C., Lin, C.H., Chen, Y.H.: A novel secret image sharing scheme in color images using small shadow images. Inf. Sci. 178(11), 2433–2447 (2008) 5. Chattopadhyay, A.K., Ghosh, D., Maitra, P., Nag, A., Saha, H.N.: A verifiable (n, n) secret image sharing scheme using XOR operations. In: 2018 9th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), pp. 1025–1031. IEEE (2018) 6. Chattopadhyay, A.K., Nag, A., Singh, J.P.: An efficient verifiable (t, n)-threshold secret image sharing scheme with ultralight shares. Multimed. Tools Appl. 1–31 (2021) 7. Chattopadhyay, A.K., Nag, A., Singh, J.P., Singh, A.K.: A verifiable multi-secret image sharing scheme using XOR operation and hash function. Multimed. Tools Appl. 80, 35051–35080 (2020). https://doi.org/10.1007/s11042-020-09174-0 8. Chen, C.C., Wu, W.J.: A secure Boolean-based multi-secret image sharing scheme. J. Syst. Softw. 92, 107–114 (2014) 9. Chen, C.-C., Wu, W.-J., Chen, J.-L.: Highly efficient and secure multi-secret image sharing scheme. Multimed. Tools Appl. 75(12), 7113–7128 (2015). https://doi.org/ 10.1007/s11042-015-2634-1 10. Chen, T.H., Wu, C.S.: Efficient multi-secret image sharing based on Boolean operations. Signal Process. 91(1), 90–97 (2011) 11. Du, R., Foo, E., Nieto, J.G., Boyd, C.: Designing secure E-tendering systems. In: Katsikas, S., L´ opez, J., Pernul, G. (eds.) TrustBus 2005. LNCS, vol. 3592, pp. 70–79. Springer, Heidelberg (2005). https://doi.org/10.1007/11537878 8 12. Faraoun, K.M.: Design of a new efficient and secure multi-secret images sharing scheme. Multimed. Tools Appl. 76(5), 6247–6261 (2016). https://doi.org/10.1007/ s11042-016-3317-2 13. Ghebleh, M., Kanso, A.: A novel secret image sharing scheme using large primes. Multimed. Tools Appl. 77(10), 11903–11923 (2017). https://doi.org/10.1007/ s11042-017-4841-4 14. Kanso, A., Ghebleh, M.: An efficient (t, n)-threshold secret image sharing scheme. Multimed. Tools Appl. 76(15), 16369–16388 (2017) 15. Kavishe, N., Chileshe, N.: Development and validation of public–private partnerships framework for delivering housing projects in developing countries: a case of Tanzania. Int. J. Constr. Manage. 1–18 (2019) 16. Lin, C.C., Tsai, W.H., et al.: Secret image sharing with capability of share data reduction. Opt. Eng. 42(8), 2340–2345 (2003)


17. Mohammadi, S., Jahanshahi, H.: A secure e-tendering system. In: 2009 IEEE International Conference on Electro/Information Technology, pp. 62–67. IEEE (2009) 18. Nag, A., Singh, J.P., Singh, A.K.: An efficient Boolean based multi-secret image sharing scheme. Multimed. Tools Appl. 79(23), 16219–16243 (2020) 19. Oyediran, O.S., Akintola, A.A.: A survey of the state of the art of e-tendering in Nigeria. J. Inf. Technol. Constr. (ITcon) 16(32), 557–576 (2011) 20. Pati, R.S., Nag, A.: A novel XOR-based visual secret sharing scheme with random grid. SmartCR 5(5), 400–407 (2015) 21. Sardar, M.K., Adhikari, A.: A new lossless secret color image sharing scheme with small shadow size. J. Vis. Commun. Image Represent. 68, 102768 (2020) 22. Shamir, A.: How to share a secret. Commun. ACM 22(11), 612–613 (1979) 23. Thien, C.C., Lin, J.C.: Secret image sharing. Comput. Graph. 26(5), 765–770 (2002) 24. Wang, D., Zhang, L., Ma, N., Li, X.: Two secret sharing schemes based on Boolean operations. Pattern Recogn. 40(10), 2776–2785 (2007) 25. Wang, R.Z., Su, C.H.: Secret image sharing with smaller shadow images. Pattern Recogn. Lett. 27(6), 551–555 (2006) 26. Wu, K.-S.: A secret image sharing scheme for light images. EURASIP J. Adv. Signal Process. 2013(1), 1–5 (2013). https://doi.org/10.1186/1687-6180-2013-49 27. Yutia, S.N., Rahardjo, B.: Design of a blockchain-based e-tendering system: a case study in LPSE. In: 2019 International Conference on ICT for Smart Society (ICISS), vol. 7, pp. 1–6. IEEE (2019)

Video Based Graphical Password Authentication System Bipin Yadav(B) , Kaptan Singh, and Amit Saxena Department of Computer Science and Engineering, Truba Institute of Engineering and Information Technology, Karond-Gandhi Nagar, Bypass Road, Bhopal, M.P. 462038, India [email protected]

Abstract. Text-based secret phrase is the ordinary strategy utilized for client verification. It stays as the most broadly utilized strategy since it is straightforward, modest contrasted with different strategies, and simple to carry out. Clients will in general choose basic and short passwords to recollect without any problem. It is exceptionally simple to crack the password for intruders, whereas arbitrary and extensive passwords are difficult to recollect. The primary issue with customary literary strategy is that passwords chosen for some, applications are either powerless or critical or secure however hard to recall. A few clients even utilize the name of the framework as a secret phrase. The extensive passwords give greater security at the same time; it is hard to recall a few such long passwords. Graphical passwords can be considered as the best solution where the user selects one picture as a password from several pictures and researchers have proved that graphical passwords are easy to remember as compared to textual passwords. In this paper, another incorporated arrangement of token and video-based confirmation has been proposed in which two-level of authentication is done. In the first phase, the digital image displayed on any token like mobile is used for authentication. In the second phase, frames of some random video are chosen as passwords. The information was tried for 50 clients which are attempting to figure out the password of another client. The chance of guessing the correct video and correct frame is almost negligible. However, there may be a chance that the selected token picture is available at any social networking site so, guessing possibility of picture is becomes 2 out of 50 users. At last, the chance of guessing both correct videos as well as correct token image is zero. Keywords: Textual password · Graphical password · Authentication · Security · SSIM recall-based · Recognition-based · Cued-recall

1 Introduction

Graphical passwords were acquainted as options with text-based passwords wherein the client imagines an image or different pictures to make a secret key. It is also proved by various studies that humans can realize and memorize graphical-based passwords more readily as compared to textual passwords. The Dual-coding hypothesis expresses that verbal (word-based) and non-verbal memory (picture-based) are handled and addressed


contrastingly in the human cerebrum. Applying a brute-force attack on graphical passwords is more difficult than on textual passwords: the attack programs would have to automatically produce precise mouse movements to mimic human input, which is a difficult undertaking. Numerous psychology studies have shown that people have superior memory for perceiving and recalling visual data as opposed to verbal or textual data [1, 2, 4]. Given the cognitive task associated with recollecting the passwords for verification, graphical-based passwords are mainly partitioned into 3 classifications: recognition-based, recall-based, and cued-recall techniques [3, 5, 9]. The recognition-based, recall-based and cued-recall methods are periodically alluded to as cognometric, drawmetric, and locimetric respectively. To overcome the drawbacks of textual passwords, many methods have been developed that belong to these 3 categories. In the authentication phase of a graphical password, many images are shown from which to choose a password during registration, and then during authentication the selected password is chosen. In recognition-based frameworks, clients select a bunch of pictures or faces shaping a portfolio from an enormous database during secret-key enrollment, and then during login he/she has to verify the images from that set. Various researchers concentrated on the impact of pictures and objects in learning and recall by people [6, 7]. The recognition-based technique is a simpler undertaking than the recall-based method [8]. In the (pure) recall-based strategy, the client needs to draw a secret on a grid. Recall-based passwords are equivalent to conventional passwords since they require the client to remember and recall the passwords during login time. There are no memory prompts or cues to assist the client with recalling the passwords. The cognitive burden on the client is higher and it is harder than all remaining techniques [11]. Cued recall is a simpler task than pure recall since prompts assist the clients with recovering the secret phrase from memory. Items might be accessible in human memory however not available for recovery [12]. It was shown that previously inaccessible data in a pure-recall circumstance can be recovered with the assistance of cues. In cued-recall frameworks, clients select a grouping of areas on a single picture, or the client might choose a single area in each round from an assortment of pictures introduced to him in an arrangement of rounds, or the client can drag objects on the picture to a different place to make a secret key.

2 Literature Survey

Since 1996, many techniques have been developed in the field of graphical passwords. They are mainly categorized as recall-based, recognition-based, and cued-recall [9, 13]. In a recall-based plan, a client is asked to reproduce a pre-drawn layout drawing with the mouse or pointer on a grid. A recognition-based plan requires the client to memorize an arrangement of pictures during secret-phrase creation, and afterwards recognize their pictures from among decoys during confirmation. A cued-recall plot aims to diminish the memory load on clients; it generally gives a foundation picture and the client should keep in mind and target explicit areas on the picture. The tables below show some recall-based, recognition-based and cued-recall strategies with their password space and description (Tables 1, 2 and 3).

Table 1. Recognition-based techniques

a. Déjà vu [13] — Password space: 25^5. In this method, the client chooses a bunch of images from various shown images. At the time of login, a test set with several pictures will be shown on the client's framework. The test set contains a couple of pictures from the client's portfolio and the remainder of the pictures from the leftover picture samples, which are called decoy pictures. For confirmation the client should recognize the pictures from his portfolio which are part of the test set.
b. Passfaces [14, 15] — Password space: 9^5. In this strategy, a client chooses a bunch of human faces during secret-phrase creation. At the time of login, a portfolio of human faces is shown to the client and he/she has to choose the correct faces that were selected during the registration phase.
c. Faces/Story [16] — In this system, during secret-word creation, a client chooses a succession of pictures and a story is created using those pictures to recall that arrangement. Those pictures could be taken from any medium, like animals, children, sports, or famous pictures of models used in everyday life. During login, the client needs to recognize the pictures in a similar grouping.
d. For mobile devices [17] — Password space: 30 (the password space is small). In this authentication system, each topic contains 30 thumbnail photographs. During enrollment, a client chooses a topic and afterwards chooses a succession of thumbnail photographs on the subject. For validation, during login time, the client recognizes and touches the thumbnail photographs chosen by him in a similar arrangement utilizing a pointer. A number is allocated to every thumbnail photograph and the arrangement of thumbnail photographs forms a numeric secret phrase.

3 Problem Identified However, security and convenience should go connected but this isn’t possible for real. The passwords which are usable (basic, short, and essential) are not secure and the passwords which are gotten (arbitrary and extensive) are not paramount. Numerous scientists are chipping away at these ideas and have proposed numerous methods of adjusting convenience and security [21]. There is no single answer for this issue and every procedure answers to some degree in at least one angle like decreased secret phrase creation time, diminished login time, expanded review achievement rate, expanded security. Acknowledgment-based passwords are essential however not secure with a single round. For better security, numerous rounds are required, bringing about expanded secret key creation time and login times. They are more defenseless against word reference and speculating assaults and for secret phrase catching, various login meetings ought to be noticed. Review-based strategies are secure against beast power and speculating assaults having more secret phrase space yet defenseless against shoulder surfing and malware assaults. The security increments with the expanded number of strokes yet, it is hard to recollect the arrangement of strokes in review-based frameworks. Signaled review strategies endure with beast power, speculating, and shoulder riding assaults. The secret phrase space is restricted for a solitary picture with not many snap focuses. In multi-round signaled review procedures, the necessity of an excessive number of pictures is the significant limit. The passwords are unsurprising a direct result of less secret keyspace or in light of continuous choice of problem areas in the pictures. After concentrating on the past works around here, it was perceived that review methods are safer against secret phrase speculating assaults than different procedures. Be that as it may, security increments with the number of strokes; the potential arrangements of strokes increments dramatically with the number of strokes. The client investigation of existing review-based methods demonstrated that recalling the grouping of strokes

Table 2. Recall-based techniques

a. DAS (Draw-A-Secret) [19] — Password space: for a 5 × 5 grid, the theoretical password space for passwords of length less than or equal to 12 is 58 bits. In this technique, the client draws a secret (picture) on a grid utilizing a pointer during secret-key enrollment. The secret word is an ordered sequence of coordinate pairs of the grid cells touched during the drawing by the client. The drawing may contain at least one pen stroke separated by pen-up events. For verification, during login time, the client needs to draw the image touching the grid cells in the same sequence.
b. Pass-doodle [19] — A hand-made drawing is made on a touch screen without any grid.
c. Pass-Go [20] — Password space: more than DAS. The client draws secret words on the grid utilizing intersections of the grid cells. For every intersection, sensitive regions are defined, and touching any point inside a sensitive region is equivalent to touching the intersection. The grid of size (X + 1) × (X + 1) in DAS is equivalent to the X × X grid in Pass-Go.
d. Pass-Shapes [20] — Password space: as each point allows eight possible strokes, for a password of length n the complexity is 8^n. In this technique, the shape is significant; size and location are not considered, only the order of the strokes is considered. Pass-Shapes are drawn by hand, which helps the client to remember the shapes.


Table 3. Cued recall-based techniques

a. PassPoints [18] — In this technique, a client chooses certain areas on a picture as a secret key. During login time, the client needs to reselect similar areas in a similar order for validation.
b. Cued click points [9] — In this method, in order to go to the next level a client has to tap on any one point. One more picture is shown in that round, and the client needs to click a point in that picture. This cycle is repeated multiple times, making a secret phrase of five click points over five pictures. At the time of login, the client needs to tap similar points in a similar sequence.
c. Inkblot authentication [9] — In this method, during secret-key enrollment an inkblot is shown to the user, who thinks of a word that depicts the inkblot and enters the first and last letters of that word as part of the secret word. This is repeated for several inkblots to produce a long secret phrase. During login, the inkblots are shown in the same order and the client needs to enter the first or last character of the words chosen for those inkblots.
d. Passlogix V-GO [9] — In this strategy, the client clicks/drags a number of found objects for secret-key creation. A sequence of activities, like setting up supper by picking required things and cooking, is a secret key. The secret-key creation depends upon the environment chosen for the secret key. During login, the client needs to repeat a similar interaction.

is more difficult however a large portion of the clients could recollect the (numerous) strokes. The creator is propelled to propose another review-based method utilizing local language passwords to deal with memorability of a grouping of strokes, holding the upside of better security of review-based procedures. Review-based procedures are solid against savage power assaults, yet helpless against shoulder riding assault [22, 23]. The current shoulder riding safe methods are to be contemplated and a component is to be applied to the local language secret word confirmation procedure and to research its protection from shoulder riding assault. Acknowledgment-based passwords are observed to be vital. The worker keeps an enormous graphical data set of pictures or articles in acknowledgment-based frameworks and readies a test set for each round for each client and sends a lot of information to the customer which forces significant overhead on the worker. Ordinary acknowledgment-based frameworks are equivalent to 4 digit PIN customary framework security because of the determination of the predetermined number of articles for the secret word. The creator is roused to propose a device-based acknowledgment method to exploit memorability related to acknowledgment-based passwords while taking out the overhead connected with different rounds in acknowledgment-based frameworks and to upgrade the security utilizing long passwords. An antiquated Indian


conventional game board is taken as an apparatus in the proposed acknowledgment-based plan for upgrading memorability, convenience, and security of passwords.

4 Proposed Solution

The primary issue with every authentication system is that the passwords chosen for many applications are either weak but memorable, or secure but hard to recall, and several studies have already proved that graphical passwords are easier to remember than textual passwords. However, security and convenience should go together, but this is not easy to achieve in practice. The passwords which are usable (basic, short, and essential) are not secure and the passwords which are tough (arbitrary and extensive) are not convenient. Utilizing the concept of graphical passwords, the author has proposed a new integrated approach of token- and video-based authentication systems (Fig. 1). The steps involved during the registration phase are as follows (a sketch of the frame-selection step follows Fig. 1):
1. Place a token containing an image in front of the camera; a token can be any device, like a mobile phone.
2. Select a video as a second password.
3. Tap on the live video to select up to 3 frames as passwords.

Fig. 1. Registration phase of the proposed method

At the time of registration, the user has to place the token image in front of the camera; afterwards, a frame is selected as a password from the live video that is shown. The security of this method lies in the fact that only a few frames are selected out of the many frames of a video (Fig. 2).


The steps involved during authentication are:
1. Place the chosen password image in front of the camera for password verification.
2. If the correct image is placed in front of the camera, proceed to the second pass.
3. Tap on the chosen video frames; if they are correct, the user is logged in to the system.

Fig. 2. Authentication phase of the proposed method

Related Work
Graphical password-based authentication systems are knowledge-based systems that build on the fact that humans can memorize and recognize pictures more easily than text passwords. Graphical passwords are broadly classified into recall-based (drawmetric) schemes, based on drawing or reproducing shapes on the screen; recognition-based (cognometric) schemes, based on selecting known items from a set of items; and cued-recall (locimetric) schemes, based on selecting regions of a known image. The proposed approach is related to the locimetric class and is a multifactor authentication, as it combines tokens with a choice of frames on live video. The SIFT image-processing algorithm is used to extract distinctive features from an image that can later be used for comparison between images.
A. Multifactor Authentication Scheme
To strengthen security, multifactor authentication systems can be used, which combine at least two independent processes. The proposed approach is a multifactor authentication


system in which both the token image and a frame from the video are chosen as the password. One researcher used mobile phones as the hardware token for one-time password generation. Another researcher proposed a challenge-response authentication system in which a user takes a picture of a QR code with a mobile phone; the data from this marker generates encrypted data that is used at the time of login. Such techniques are also vulnerable to certain kinds of attack, for example when a message sent between a user and the system is modified by a man in the middle.
B. Scale Invariant Feature Transform
SIFT is an algorithm used to detect and describe local features of an image. From any given image, interesting points of an object can be extracted to provide a "feature description" of that object. Importantly, the features extracted from the training image should remain detectable even under changes in image scale, noise, and illumination.
C. Technique Used for Matching the Similarity
The similarity between the images is measured using the Structural Similarity Index Measure (SSIM). It is intended to improve on conventional techniques such as peak signal-to-noise ratio (PSNR) and mean squared error (MSE), which have proved to be inconsistent.

SSIM(x, y) = ((2 A_x A_y + z1)(2 C_xy + z2)) / ((A_x^2 + A_y^2 + z1)(V_x + V_y + z2))    (1)

where A_x and A_y are the averages of x and y, V_x = σ_x^2 and V_y = σ_y^2 are their variances, C_xy is the covariance of x and y, z1 = (k1 L)^2 and z2 = (k2 L)^2 are stabilizers for weak denominators, L is the dynamic range of the pixel values, and k1 = 0.01, k2 = 0.03 by default. The SSIM formula compares the samples x and y with three measurements, namely luminance, contrast, and structure. The individual comparison functions are

Luminance:  l(x, y) = (2 A_x A_y + z1) / (A_x^2 + A_y^2 + z1)    (2)

Contrast:   c(x, y) = (2 σ_x σ_y + z2) / (V_x + V_y + z2)    (3)

Structure:  s(x, y) = (C_xy + z3) / (σ_x σ_y + z3)    (4)

with

z3 = z2 / 2    (5)

Now, SSIM can be defined as a combination of these three:

SSIM(x, y) = l(x, y)^α · c(x, y)^β · s(x, y)^γ    (6)

α, β, γ = 1 by default. The output of SSIM is a decimal number between −1 and 1: it equals +1 only when the two data sets are identical, and −1 when they are completely different. The measurement is computed as SSIM(presented picture, original picture).
D. Other Matching Techniques
Peak Signal-to-Noise Ratio (PSNR): The PSNR block computes the peak signal-to-noise ratio, in decibels, between two images. It can be used as a quality measurement between the original and a compressed image, and it is assumed that the higher the PSNR, the better the quality of the compressed or reconstructed image. The Mean Square Error (MSE) and the Peak Signal-to-Noise Ratio (PSNR) are the two commonly used error metrics to assess image compression quality. The MSE value gives the cumulative squared error between the compressed and the original image, while the PSNR value indicates the ratio of the peak signal to this error; the lower the MSE, the lower the error. To compute the PSNR, the mean squared error is first calculated as:

MSE = (1 / (M · N)) Σ_{m,n} [I1(m, n) − I2(m, n)]^2    (7)

where M and N are the numbers of rows and columns of the input images. Then, PSNR is computed as:

PSNR = 10 log10 (R^2 / MSE)    (8)

Here, R is the maximum fluctuation in the input image data type; for example, if the input image has a double-precision floating-point data type then R is 1, and if it has an 8-bit unsigned integer data type then R is 255.
Root-Mean-Square Deviation (RMSD): The root-mean-square deviation (RMSD) is a frequently used measure of the difference between the values predicted by a model or estimator and the values actually observed. The RMSD is the standard deviation of the differences between predicted and observed values. It is a good measure of accuracy, but only for comparing the prediction errors of different models for a particular variable, not between variables. The RMSD of predicted values ŷ_t for times t of a variable y, computed over n different predictions, is the square root of the mean of the squared deviations:

RMSD = sqrt( (1/n) Σ_{t=1}^{n} (ŷ_t − y_t)^2 )    (9)

During the verification stage, the user first has to present the pre-chosen token image and then select 3 frames from the pre-chosen video; if the correct password is given, the user is logged in successfully. The feasibility studies examine the system's reliability, usability, and resistance to observation. The reliability study suggests a system threshold of 90%, above which a presented image is judged to match the original. The usability study measures task completion times and error rates. Finally, the security study highlights the improved Pass-BYOP's resistance to observation attacks such as shoulder surfing, camera-based observation, or malware.
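As a rough illustration of the 90% matching threshold mentioned above, the presented token image could be compared with the stored one using SSIM; the threshold constant and function names below are assumptions rather than the authors' implementation, and scikit-image is an assumed dependency.

```python
import numpy as np
from skimage.metrics import structural_similarity  # window-based SSIM

MATCH_THRESHOLD = 0.90  # assumed acceptance threshold from the reliability study

def verify_token(captured_gray: np.ndarray, stored_gray: np.ndarray):
    """Accept the presented token image if its similarity to the stored image
    reaches the threshold; both inputs are 8-bit grayscale arrays of equal shape."""
    score = structural_similarity(captured_gray, stored_gray, data_range=255)
    return score >= MATCH_THRESHOLD, score
```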

5 Result Analysis
The results of the login stage are shown in Table 4. The data were collected for 50 users, each attempting to guess the password of another user. The chance of correctly guessing both the secret image and the video frames is very nearly zero. However, the secret image, i.e. the token image, was guessed correctly twice, because it is possible that this picture is available on a social networking site. The chance of guessing the frame from a video, or of guessing both the image and the frame, is zero.

Table 4. Target results from the login stage (password guessing tested on 50 users)

Outcome | Number of users
Success: correctly guessing the secret image as well as the video frame | 0
Failure: guessing the accurate secret image only | 2
Failure: guessing the accurate secret video frame only | 0
Failure: accurately guessing both the image and the video frame | 0
Failure: nothing correct | 48

The existing authentication scheme provides resistance to observation with an accuracy of 80%, whereas the proposed scheme gives an accuracy of 95.5% during the registration phase and 98.9% during the verification phase, as shown in Table 5. The authentication time analysis is shown in Table 6.


Table 5. Accuracy during the registration and authentication phases

Phase | Accuracy
Phase I (registration) | 95.5%
Phase II (authentication) | 98.9%

Table 6. Authentication phase time (in seconds)

Mean | Median | Standard deviation
174 | 119 | 42

6 Conclusion
Text-based passwords are the most widely used method because they are simple, cheap compared with other methods, and easy to implement. Users tend to choose simple, short passwords so that they can remember them easily. Graphical passwords were introduced as an alternative to text-based passwords, in which the user memorizes one or more images to create a password. Many psychology studies have shown that people have superior memory for recognizing and recalling visual information as opposed to verbal or textual information. However, although security and usability should go hand in hand for authentication, in practice they do not: passwords that are usable (simple, short, and meaningful) are not secure, and passwords that are secure (random and long) are not memorable. Many researchers are working on these ideas and have proposed numerous techniques balancing usability and security. There is no single answer to this problem; each technique responds partially in one or more aspects, such as reduced password creation time, reduced login time, increased recall success rate, or increased security. In this paper, a new integrated approach combining token-based and video-based authentication has been proposed. The data were collected for 50 users attempting to guess the passwords of another user. The chance of correctly guessing both the secret image and the video frames is nearly zero. However, the secret image, i.e. the token image, was guessed twice, because it may be available on sites such as social networks. The probability of guessing the combined password consisting of both the frame and the token image is zero.

References 1. De Angeli, A., Coventry, L., Johnson, G., Renaud, K.: Is a picture really worth a thousand words? exploring the feasibility of graphical authentication systems. Int. J. Hum Comput Stud. 63(1–2), 128–152 (2005) 2. French, R.S.: Identification of dot patterns from memory as a function of complexity. J. Exp. Psychol. 47, 22–26 (1954)


3. Mihajlov, M., Jerman-Blazic, B., Illievski, M.: Recognition-based graphical authentication with single-object images. Developments in E-systems Engineering. IEEE (2011) 4. Shepard, R.N.: Recognition memory for words, sentences, and pictures. J. Verbal Learn. Verbal Behav. 6, 156–163 (1967) 5. Suo, X., Zhu, Y., Owen, G.: Graphical passwords: a survey. In: Proc. ACSAC (2005) 6. Blonder, G.E.: Graphical Passwords. Lucent Technologies Inc., Murray Hill, NJ, U. S. Patent, Ed. United States (1996) 7. Paulson, L.D.: Taking a graphical approach to the password. Computer 35, 19 (2002) 8. Davis, D., Monrose, F., Reiter, M.: On user choice in graphical password schemes. In: 13th USENIX Security Symposium (Aug 2004) 9. Liu, X., Qiu, J., Ma, L., Gao, H., Ren, Z.: A novel cued-recall graphical password scheme. In: Sixth International Conference on Image and Graphics. IEEE 978–0–7695–4541–7/11 (2011) 10. Susan, W., Jim, W., Jean-Camille, B., Alex, B., Nasir, M.: Passpoints: design and longitudinal evaluation of a graphical password system. Int. J. Hum Comput Stud. 63, 102–127 (2005). July 11. Sobrado, L., Birget, J.C.: Graphical passwords. The Rutgers Scholar, An Electronic Bulletin for Undergraduate Research 4 (2002) 12. Weinshall, D., Kirkpatrick, S.: Passwords you will never forget, but can’t recall. In: Proceedings of Conference on Human Factors in Computing Systems (CHI), pp. 1399–1402. ACM, Vienna, Austria (2004) 13. Dhamija, R., Perrig, A.: Deja Vu: a user study using images for authentication. In: Proceedings of 9th USENIX Security Symposium (2000) 14. Real User: www.realuser.com 15. Brosto, S., Sasse, M.A.: Are Passfaces more usable than passwords: a field trial investigation. In: People and Computers XIV - Usability or Else: Proceedings of HCI. Springer-Verlag, Sunderland, UK (2000) 16. Adams, A., Sasse, M.A.: Users are not the enemy: why users compromise computer security mechanisms and how to take remedial measures. Commun. ACM 42, 41–46 (1999) 17. Takada, T., Koike, H.: Awase-E: image-based authentication for mobile phones using users favorite images. In: Human-Computer Interaction with Mobile Devices and Services, vol. 2795/2003. Springer-Verlag GmbH, pp. 347–351 (2003) 18. Wiedenbeck, S., Waters, J., Birget, J.-C., Brodskiy, A., Memon, N.: Passpoints: design and longitudinal evaluation of a graphical password system. Int. J. Hum.-Comp. Stud. 63, 102127 (July 2005) 19. Jermyn, I., Mayer, A., Monrose, F., Reiter, M.K., Rubin, A.D.: The design and analysis of graphical passwords. In: Proceedings of the 8th USENIX Security Symposium (1999) 20. Syukri, A.F., Okamoto, E., Mambo, M.: A user identication system using signature written with mouse. In: Third Australasian Conference on Information Security and Privacy (ACISP): Springer- Verlag Lecture Notes in Computer Science (1438), pp. 403–441 (1998) 21. Pate, S.E., Barhate, B.H.: A survey of possible attacks on text & graphical password authentication techniques. Int. J. Sci. Res. Comp. Sci. Eng. 6(Special Issue 1), 77–80 (Jan 2018). E-ISSN: 2320–7639 22. Irfan, K., Anas, A., Malik, S., Amir, S.: Text based graphical password system to obscure shoulder surfing. In: 2018 15th International Bhurban Conference on Applied Sciences and Technology (IBCAST), pp. 422–426 (2018). https://doi.org/10.1109/IBCAST.2018.8312258 23. Abhijith, S., Sam, S., Sreelekshmi, K.U., Samjeevan, T.T., Mathew, S.: Web based graphical password authentication system, Int. J. Eng. Res. Technol. (IJERT) 09(07) (2021). ICCIDT – 2021

Designing Robust Blind Color Image Watermarking-Based Authentication Scheme for Copyright Protection

Supriyo De1(B), Jaydeb Bhaumik2, and Debasis Giri3

1 Department of Electronics and Communication Engineering, Techno Engineering College Banipur, Habra 743 233, India
[email protected]
2 Department of Electronics and Telecommunication Engineering, Jadavpur University, Kolkata 700 032, India
[email protected]
3 Department of Information Technology, Maulana Abul Kalam Azad University of Technology, Nadia 741 249, India
debasis [email protected]

Abstract. Digital watermarking is a process of embedding information within digital content mostly for authentication and digital right management purposes. In this article, a new robust blind color image watermarking scheme based on two-level Discrete Wavelet Transform (DWT)-Discrete Fourier Transform (DFT) has been proposed for copyright protection and authentication. Additionally, a new geometric attack detection and correction procedure has been introduced in this scheme. The proposed scheme can thwart several signal processing, geometric and hybrid attacks. The cover image is transformed using DWT and DFT before embedding the encrypted watermark. Here, encryption of watermark is done using AES in CBC (Cipher Block Chaining) mode. Finally, the encrypted watermark is embedded in transform domain of cover image by using the logistic map based random position generator. Experimental results show that the proposed scheme is better than the relevant existing schemes in terms of imperceptibility and robustness.

Keywords: Blind image watermarking · DWT · DFT · Geometric distortion correction · Robustness · Imperceptibility

1 Introduction

Image watermarking plays an important role to verify the authenticity of the owner in multimedia data communication over a shared channel like Internet. The primary objective of image watermarking is to insert some digital information for copyright protection in the digital image. Image watermarking can be done either in spatial domain or in transform domain. Transform domain watermarking schemes provide more robustness for image watermarking than spatial domain approaches. Discrete Cosine Transform (DCT) based watermarking c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022  D. Giri et al. (Eds.): ICNSBT 2021, LNNS 481, pp. 91–104, 2022. https://doi.org/10.1007/978-981-19-3182-6_8


schemes were developed and discussed in [1,3]. Also there exist several multiple transformations based watermarking schemes [4,6]. Multiple transformations enhances the degree of imperceptibility as well as the robustness of watermarking schemes. A blind image watermarking technique based on DWT-SVD and DCT has been reported in [4]. It shows a sufficiently good robustness. But, it requires a significant amount of side information along with the watermarked image for retrieving the watermark. Hessenberg decomposition based blind color image watermarking technique has been found in [7]. The scheme has secured average PSNR value 35.0208 dB and 35.0046 dB for two different watermark images. However, in [7] robustness analyses in terms of geometric distortion are limited to cropping, scaling and 30◦ rotation. A novel digital image watermarking scheme based on DWT-DFT-SVD has been reported by Advith et al. [9]. Here, Y-Cb-Cr color space of cover image has been used for message hiding. The scheme moderately ensures the imperceptibility in terms of PSNR. But it also needs the side information at receiving end to extract the watermark. Kang et al. [12] have introduced a DWT-DFT based composite watermarking scheme. Here, the scheme achieved robustness against both affine transform and JPEG compression. On the contrary, it incorporates 4-level DWT, which implies a higher computation cost. Single level DWT-DFT based another non-blind watermarking scheme has been found in [5] where DFT has been applied to one of the sub-band obtained from DWT of host image. The coefficients of magnitude spectrum or phase spectrum obtained after DFT have been directly modified by the watermark image. In addition, the scheme is robust against only few distortions. However, it does not specify the distortion level and therefore the scheme is unable to confirm its strength against those distortions. Fazli et al. have proposed a DWT, DCT, and SVD based non-blind robust image watermarking method in [6] which incorporates a new technique for correction of main geometric attacks. It achieves overall good performance in terms of robustness. On the contrary, the scheme has embedded four copies of watermark in the cover image for more accurate extraction. In [10], two different methods have been adopted for data hiding. these are i) luminance component (Y) of cover image is used to hide the watermark bit sequence in DFT domain and ii) watermark bit patterns are inserted within the selected region of 2D histogram of blue-difference and red-difference (Cb-Cr) chrominance components. In [10], frequency domain embedding technique achieves robustness against signal processing distortion however it is unable to resist geometric distortion. On the other hand, 2D histogram based embedding technique achieves very poor robustness against signal processing distortion however; this approach provides very good robustness against geometric distortion. In addition, Roy et al. [2] proposed a watermarking scheme to track and repair the geometric distortion. Here, the scope to prevent geometric distortion is limited to cropping, scaling and rotation attacks.


In this paper, we have proposed a blind image watermarking scheme based on DWT-DFT which makes the scheme robust against signal processing attack. Also we have incorporated a simple technique which provides the robustness against geometric attack. Encryption of watermark has been performed by AES in CBC mode for the confidentiality. Furthermore, authentication of the proposed scheme has been built up by the key dependent random position generator using logistic map. The rest of the paper is organized as follows: Watermark embedding and watermark extraction techniques of proposed scheme are described in Sect. 2. We have furnished the experimental results with different set of comparative studies in Sect. 3. Finally, the paper is concluded in Sect. 4.

2 Proposed Watermarking-Based Authentication Scheme

In this section, the proposed watermarking-based authentication scheme using two level DWT-DFT has been elaborated. Spatial localization, multi-resolution features along with geometric structural advantages are incorporated here. Figure 1 represents the block diagram of watermark embedding and extraction techniques for the proposed scheme. It has two phases, namely watermark embedding and watermark extraction.

Fig. 1. Block diagram of watermark embedding technique and watermark extraction technique

2.1 Watermark Embedding Procedure

Two level DWT, Random position generator, DFT, Encryption and Hamming (7, 4) error correcting code have been used in watermark embedding procedure. The entire process can be divided into following sub-parts namely a) Two level DWT, b) Random position generator and DFT, c) Encryption and application of Hamming (7, 4) code to protect watermark bits, d) Message hiding technique, e) Watermarked image reconstruction. In addition, Algorithm 1 exhibits the algorithm of watermark embedding phase at the transmitting section and the algorithm for transformed domain embedding technique have been illustrated in Algorithm 3.

Input: Cover color image CI(M ×N ) , Watermark W(P ×Q) , Strength of watermark α, key1 , key2 ; Output: Watermarked image W M(M  ×N  ) ; Algorithm for Watermark embedding W = two2oneD(W );//two2oneD conversion is not required when a binary sequence is //used as a watermark W = EncryptionAES(W, key1 ); l = numberof bit(W ); for (i = 0; i < l/4; i + +) do W h(7i+1:7i+7) = HammingCode(W(4i+1:4i+4) ); end lb = numberof bit(W h); W M = W atermarkEmbeddingT echnique(CI, W h, lb, α, key2 ); W M = GenerateP rotectionBand(W M );//M  = M + 6, N  = N + 6 return W M ;

Algorithm 1: Watermark embedding phase
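For reference, a small sketch of the Hamming (7, 4) encoding step called in Algorithm 1 is given below; the generator matrix and codeword bit order follow one common convention and may differ from the authors' implementation.

```python
import numpy as np

# Hamming (7,4): encode 4 data bits into a 7-bit codeword (layout: p1 p2 d1 p3 d2 d3 d4).
G = np.array([[1, 1, 0, 1],
              [1, 0, 1, 1],
              [1, 0, 0, 0],
              [0, 1, 1, 1],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])

def hamming74_encode(data4):
    """Encode a length-4 bit vector into a length-7 codeword."""
    return (G @ np.asarray(data4)) % 2

# Example: protect one nibble of the encrypted watermark bit-stream.
print(hamming74_encode([1, 0, 1, 1]))   # -> [0 1 1 0 0 1 1]
```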

Input: Number of available non-overlapping (8×8) blocks in HL2 , LH2 subbands d, Initial value of the Logistic map used as key2 x0 ; Output: Random position vector r p; Algorithm for RandomPositionGenerator for (i = 0; i < buf f er; i + +) do xi+1 = 3.999xi (1 − xi );//buf f er is used for remove the inertia of the Logistic map //by eliminating first few numbers end j = 0, r p = 0; while (j < d) do xi+1 = 3.999xi (1 − xi ); temp = (xi+1 × 1014 )m od(d); if (temp ∈ / r p) then rpj = temp; j + +; end i + +; end return r p;

Algorithm 2: Random position generator
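A minimal Python sketch of the logistic-map position generator of Algorithm 2 is shown below; the map parameter 3.999 and the scaling by 10^14 follow the pseudocode, while the burn-in length used to remove the map's initial transient is an assumed value.

```python
def random_positions(d, key2, burn_in=1000):
    """Generate a permutation of block indices 0..d-1 from the logistic map
    x_{i+1} = 3.999 * x_i * (1 - x_i), seeded with key2 (0 < key2 < 1)."""
    x = key2
    for _ in range(burn_in):          # discard initial iterates to remove transients
        x = 3.999 * x * (1 - x)
    positions, seen = [], set()
    while len(positions) < d:
        x = 3.999 * x * (1 - x)
        idx = int(x * 1e14) % d       # scale and reduce modulo the block count
        if idx not in seen:           # keep only unused indices, as in Algorithm 2
            seen.add(idx)
            positions.append(idx)
    return positions

# Example: choose an embedding order for 64 blocks with secret key2 = 0.3141.
print(random_positions(64, 0.3141)[:8])
```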


Input: Cover color image CI(M ×N ) , Encoded watermark W h, Length of W h lb, Strength of watermark α, key2 ; Output: Watermarked image W M(M ×N ) ; Algorithm for WatermarkEmbeddingTechnique [CIR , CIG , CIB ] = separateP lanes(CI); [LL1 , HL1 , LH1 , HH1 ] = HaarW aveletT ransf orm(CIR ); [LL2 , HL2 , LH2 , HH2 ] = HaarW aveletT ransf orm(LL1 ); l = 1, pos = [row, col];//pos internally set as [4, 4] r p = RandomP ositionGenerator(d, key2 ); //d: number of available non-overlapping (8×8) blocks in HL2 , LH2 subbands //S B : vector of available non-overlapping (8×8) blocks in HL2 , LH2 subbands i = 0; while (l ≤ lb && i < d) do T EM P = DF T (SBrpi ); if (W h(l) == 1) then T EM Ppos = real(T EM Ppos ) + jα; end if (W h(l) == 0) then T EM Ppos = real(T EM Ppos ) − jα; end SBrpi = IDF T (T EM P ); i + +, l + +; end ∗ ∗ LL∗ 1 = invHaarW aveletT ransf orm(LL2 , HL2 , LH2 , HH2 ); ∗ //HL∗ 2 , LH2 are message embedded subbands W MR = invHaarW aveletT ransf orm(LL∗ 1 , HL1 , LH1 , HH1 ); W M = constructColorImage(W MR , CIG , CIB ); return W M ;

Algorithm 3: Transform domain embedding process
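The transform-domain step of Algorithm 3 amounts to forcing the sign of the imaginary part of one mid-frequency DFT coefficient of each selected 8 × 8 block. A NumPy sketch of that core operation, together with the matching sign-based extraction rule used later in Algorithm 5, is given below; the paper's coefficient position (4, 4) is written here in 0-based indexing as (3, 3), and the functions are illustrative.

```python
import numpy as np

POS = (3, 3)  # the paper's (4, 4) position in 1-based (MATLAB-style) indexing

def embed_bit(block8x8, bit, alpha=510.0):
    """Embed one watermark bit into an 8x8 sub-band block by pushing the
    imaginary part of the DFT coefficient at POS to +alpha (bit 1) or -alpha (bit 0)."""
    spec = np.fft.fft2(block8x8.astype(np.float64))
    spec[POS] = spec[POS].real + 1j * (alpha if bit else -alpha)
    return np.real(np.fft.ifft2(spec))   # back to the (real-valued) sub-band domain

def extract_bit(block8x8):
    """Blind extraction: recover the bit from the sign of the imaginary part at POS."""
    spec = np.fft.fft2(block8x8.astype(np.float64))
    return 1 if spec[POS].imag > 0 else 0

# Round-trip check on a small random block (alpha dominates the original coefficient).
blk = np.random.rand(8, 8)
assert extract_bit(embed_bit(blk, 1)) == 1 and extract_bit(embed_bit(blk, 0)) == 0
```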

Two Level DWT. DWT tools have been considered because of their outstanding spatial localization and multi-resolution characteristics. These features are analogous to the theoretical models of the human visual system (HVS). In addition we have considered the Haar filter which effectively separates frequencies into four sub-bands independently. Moreover, It provides a better negotiation between robustness and visibility factor with low computing requirements. As human vision is less sensitive in red region [11] we have considered the red plane of cover image (CIR ) for message hiding. DWT based on Haar filter is applied to the red plane of cover image (CIR ) of dimension (M × N ) to find out the wavelet coefficients at LL1 band as shown in Fig. 2. This LL1 band (dimension M/2×N/2) contains more power of the original image. Subsequently, second level of DWT using Haar filter has been applied to decompose the LL1 band. For further processing HL2 and LH2 sub-bands (with dimension M/4 × N/4) have been considered. These two sub-bands are used here to preserve the imperceptibility of the watermarking scheme as HL2 and LH2 sub-bands are nothing but a finest analysis of horizontal and vertical information which are less significant part to maintain the quality of the cover image than LL2 coefficient. In relation, LL2 and HH2 sub-bands are not considered for message hiding to maintain the imperceptibility and robustness respectively. We have selected 2-level DWT for maintaining a balance among computational complexity, payload and robustness. The said process is shown in (1).


[LL1, HL1, LH1, HH1] = HaarWaveletTransform(CI_R)
[LL2, HL2, LH2, HH2] = HaarWaveletTransform(LL1)    (1)

Fig. 2. Two level DWT - bands
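A short sketch of the two-level Haar decomposition of the red plane, mirroring Eq. (1), is given below; PyWavelets (pywt) is an assumed dependency, and the sub-band naming follows one common convention.

```python
import numpy as np
import pywt

def two_level_haar(red_plane):
    """Two-level Haar DWT of the cover image's red plane, as in Eq. (1)."""
    LL1, (LH1, HL1, HH1) = pywt.dwt2(red_plane.astype(np.float64), 'haar')
    LL2, (LH2, HL2, HH2) = pywt.dwt2(LL1, 'haar')
    return (LL1, HL1, LH1, HH1), (LL2, HL2, LH2, HH2)

# Example: for a 512 x 512 red plane, HL2 and LH2 are 128 x 128, giving
# 2 * (512/32) * (512/32) = 512 candidate non-overlapping 8 x 8 blocks.
red = np.random.randint(0, 256, (512, 512))
(_, HL1, LH1, _), (_, HL2, LH2, _) = two_level_haar(red)
print(HL2.shape, LH2.shape)   # -> (128, 128) (128, 128)
```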

Random Position Generator and DFT. At first, the logistic map based random position vector (rp) has been used for selecting the non-overlapping (8 × 8) blocks from the subbands HL2 , LH2 . Here, the initial value of the logistic map (x0 ) has been assigned by the key2 (0 < key2 < 1). To obtain the chaotic response from this map the scheme has set the value of controlling parameter 3.999 (i.e. ≈4). The detail operation of the random position generator has been presented in Algorithm 2. Two dimensional DFT is used immediately after randomly selecting the nonoverlapping (8 × 8) blocks from the subbands HL2 , LH2 . It decomposes the image into sine and cosine components. Moreover, it is an effective tool to obtain the geometric structure of the spatial domain representation. The maximum number of blocks are 2 × (M/32) × (N/32). Algorithm 3 describes the operation in details. Encryption and Application of Hamming (7, 4) Code to Protect Watermark Bits. In this stage, watermark bit-stream has been encrypted by AES in CBC mode with the help of 128-bit key (key1 ). Thereafter, Hamming (7, 4) error correcting code has been applied to improve the reliability of the encrypted watermark bit-stream (W ). Message Hiding Technique. We find out the non-overlapping (8 × 8) blocks using the random position vector (rp) from the subbands HL2 , LH2 . Next, the DFT operation has been performed for the selected block (SBrpi ) and a midfrequency co-efficient is selected to hide the watermark bits. Experimentally, it is observed that if low frequency co-efficient is replaced according to the watermark bit then PSNR (Peak Signal to Noise Ratio) is affected significantly but effects on BER (Bit Error Rate) is minimum. On the other hand, for high frequency co-efficient, BER is affected significantly but the effects on PSNR is minimum.


In this scheme, the frequency coefficient in the (4, 4) position is replaced according to the watermark bit to trade off between PSNR and BER. Algorithm 3 (transform domain embedding process) shows the operation in detail. Here, α is the strength of the watermark, and because the messages are hidden in the imaginary plane, the embedding affects the phase. This phenomenon builds up resistance to distortion.
Watermarked Image Reconstruction. Reconstruction of the watermarked image can be done using the following steps.
– Update the modified value of each selected (8 × 8) block (SB_rpi) by applying the inverse DFT (IDFT).
– Update the corresponding HL2* and LH2* subbands.
– Rearrange the LL2, HL2*, LH2* and HH2 sub-bands according to the order shown in Fig. 2 and perform inverse DWT (IDWT) using the Haar filter to obtain the modified LL1* band using Eq. (2).
– Again rearrange the LL1*, HL1, LH1 and HH1 bands according to the order shown in Fig. 2 and perform IDWT using the Haar filter to obtain the red plane (R) of the watermarked image (WM_R) using Eq. (2).

LL1* = invHaarWaveletTransform([LL2, HL2*, LH2*, HH2])
WM_R = invHaarWaveletTransform([LL1*, HL1, LH1, HH1])    (2)

Finally, a uniform border of three distinct gray levels (with gray values 0, 128, 255) is generated outside the watermarked image (WM) as a protection band, which helps to detect the correct geometric distortion irrespective of the background of the cover image. At the receiving end, the protection band does not participate in any transformation for message extraction.

2.2 Watermark Extraction Procedure

As it is a blind scheme, watermark extraction procedure does not depend upon initial cover image information. In this stage, newly proposed geometric distortion correction technique has been employed to correct (if any) the geometric distortion of received watermarked image (W M  ) as shown in Algorithm 4. In this occasion, different parameters such as, ∠s1 , ∠s2 , ∠s3 , ∠s4 , ∠a1 , ∠a2 , ∠a3 and ∠a4 have been determined from the watermarked images (W M  ) as shown in Fig. 3(a) & (b). Next, corrected image is used for watermark extraction. Watermark extraction technique associates with two level DWT, DFT, error correction, decryption and message extraction. Algorithm 5 has been presented the watermark extraction procedure in detail.
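The rotation branch of the geometric correction (Algorithm 4) can be pictured with the short sketch below: the slope of the outermost detected edge of the protection band gives the rotation angle, and the image is rotated back. SciPy is an assumed dependency, the two edge points are assumed to come from the edge/corner detection step, and the shearing and affine branches are omitted.

```python
import numpy as np
from scipy import ndimage

def correct_rotation(distorted, edge_point_a, edge_point_b):
    """Estimate the rotation angle from two points on the top edge of the
    protection band and rotate the watermarked image back (rotation branch only)."""
    (x1, y1), (x2, y2) = edge_point_a, edge_point_b
    angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))   # slope s1 of line l1
    return ndimage.rotate(distorted, -angle, reshape=True, order=1)
```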

3 Experimental Results

To analyze the performance of the proposed scheme with respect to robustness and imperceptibility, a selected set of relevant previous works have been considered here for different case study. The said algorithm is developed in MATLAB

Input: Watermarked image WM′(R×C), undistorted dimension of the watermarked image (M′ × N′);
Output: Corrected watermarked image WM′(M′×N′);

Algorithm for Geometric Distortion Detection and Correction W M  = unif ormW idthBorder(W M  ); E = CannyEdgeDetector(W M  ); [CT E, RBS] = cornerP ointDetection(W M  ); [l1 , l2 , l3 , l4 ] = detectOutermostLine(E); for (i = 1; i ≤ 4; i + +) do si = slope(li ); end for (i = 1; i ≤ 4; i + +) do ai = angle(l((i+2)m o d 4)+1 , li ); end ⎡ ⎤ a b 0 Af w = ⎣ c d 0 ⎦ ; 0 0 1 if s1 , s3 == 00 && s2 , s4 == 900 && a1 , a2 , a3 , a4 == 900 then if (R/M  == C/N  && R/M  , C/N  = 1) then sfrev = reverseScalingF actor(R, C, M  , N  );//scaling attack W M  = scaling(W M  , sfrev ); else //no attack or image translation end end if s1 , s3 == 00 && s2 , s4 = 900 && s2 == s4 && a1 , a2 , a3 , a4 = 900 then [a, b, c, d] = parameterDetection(R, C, M  , N  );//horizontal shearing attack Arev = inv(Af w ); W M  = af f ineT ransf orm(W M  , Arev ); end if s1 , s3 = 00 && s1 == s3 && s2 , s4 == 900 && a1 , a2 , a3 , a4 = 900 then [a, b, c, d] = parameterDetection(R, C, M  , N  );//vertical shearing attack Arev = inv(Af w ); W M  = af f ineT ransf orm(W M  , Arev ); end if s1 , s3 = 00 && s1 == s3 && s2 , s4 = 900 && s2 == s4 && a1 , a2 , a3 , a4 = 900 then [a, b, c, d] = parameterDetection(R, C, M  , N  , CT E, RBS);//affine transform attack Arev = inv(Af w ); W M  = af f ineT ransf orm(W M  , Arev ); end if s1 , s3 = 00 && s2 , s4 = 900 && s1 == s2 == s3 == s4 && a1 , a2 , a3 , a4 == 900 then W M  = imageRotate(W M  , −s1 );//rotation attack end p band = detectP rotectionBand(W M  ); W M  = reconstructImage(W M  , p band);//discard exterior area upto protection band return W M  ;

Algorithm 4: Geometric distortion detection and correction in watermarked image

7.1 and run under CPU INTEL CORE I3 at 1.7 GHz in Microsoft Windows 7 environment. The experiments are successfully performed with several images collected from different databases such as STARE image database [13], UCID image database [14], The USC-SIPI image database [15] and HDR Dataset Computational Vision Lab Computing Science by Funt et al. [16] as cover image. The sample set of cover images are shown in Fig. 4(a)–(d).


 Input: Watermarked image W M(M ×N ) , key1 , key2 ; ∗ Output: Extracted watermark W(p×q) ;

Algorithm for WatermarkExtraction    [W MR , W MG , W MB ] = separateP lanes(W M  );     [LL 1 , HL1 , LH1 , HH1 ] = HaarW aveletT ransf orm(W MR );     [LL 2 , HL2 , LH2 , HH2 ] = HaarW aveletT ransf orm(LL1 ); l = 1, pos = [row, col];//pos internally set as [4, 4] r p = RandomP ositionGenerator(d, key2 );  //d: number of available non-overlapping (8×8) blocks in HL 2 , LH2 subbands  //S B : vector of available non-overlapping (8×8) blocks in HL 2 , LH2 subbands i = 0; while (l ≤ lb && i < d) do T EM P = DF T (SBrpi ); if (imag(T EM Ppos ) > 0) then extW(l) = 1; end if (imag(T EM Ppos ) < 0) then extW(l) = 0; end i + +, l + +; end l = length(extW ); for (i = 0; i ≤ l/7; i + +) do  W(4i+1:4i+4) = invHammingCode(extW(7i+1:7i+7) ); end W ∗ = DecryptionAES(W  , key1 ); W ∗ = one2twoD(W ∗ );//one2twoD conversion is not required when a binary sequence is //used as a watermark return W ∗ ;

Algorithm 5: Watermark extracting phase

Fig. 3. (a) Edges of geometrically distorted image Lena (b) Different measured angles of geometrically distorted image

3.1 For 64-Bit Random Sequence as Watermark

The robustness of the scheme has been verified by observing the obtained BERs from the extracted watermark for several cover images. In most of the cases the strength of watermark (α) is taken as 510 to trade off between imperceptibility and robustness. The performance of the scheme has been examined against different signal processing and geometric distortions. In Table 1, three sample images from each image database have been selected as cover image and obtained BERs are recorded against different signal processing distortions. On the other hand,


Fig. 4. Sample color cover images: (a) Mandrill (b) im0001 (c) S0010 (d) ucid00786

the scheme has been evaluated against different geometric distortions using the same set of cover images and the corresponding outcomes are enlisted in Table 2. Further, in Table 3, results of two existing related schemes [6,10] are compared with the performance of the proposed scheme in terms of different signal processing and geometric distorions for the cover image Mandrill. The outcomes presented in Table 1, Table 2, and Table 3 successfully verify the robustness of proposed scheme. Table 1. Robustness analyses against signal processing distortions

USC-SIPI

STARE

HDR

UCID

Images

BER from extracted watermark GN

IN

PN

AF

MoF

HE

SPN

SN

BR

MF

GC

Mandrill

0

0

0

0

0

0

0

0

0

0

0

0

House

0.0313

0.0313

0

0

0

0

0

0.0625

0

0

0

0 0

Con

Jellybeans

0

0.0468

0

0

0

0

0

0.0156

0

0

0

im0001

0

0.0313

0

0

0

0

0

0

0

0

0

0

im0320

0.0156

0.0313

0

0

0

0

0.0156

0

0

0

0

0

im0370

0

0.0156

0

0

0

0

0

0.0156

0

0

0

0

S0010

0

0

0

0

0

0

0

0

0

0

0

0

S0040

0

0

0

0

0

0

0

0

0

0

0

0

S0770

0

0

0

0

0

0

0

0

0.0156

0

0

0

ucid00104

0

0

0

0

0

0

0

0

0

0

0

0

ucid00576

0

0

0

0

0

0

0

0

0

0

0

0

ucid00786 0 0 0 0 0 0 0 0 0 0 0 0 GN: Gaussian noise(0,0.015), SPN: Salt & pepper noise (0.01), IN: Impulsive noise density (0.1), SN: Speckle noise (μ = 0, sd = 0.05), PN: Poisson noise, BR: Brightness, AF: Average filter (3 × 3), MF: Median filter (3 × 3), MoF: Motion filter (length = 5, angle = 00 ), GC: Gamma correction (γ = 0.6), HE: Histogram equalization, Con: Contrast.

Further, Fig. 5(a) shows the imperceptibility of the proposed scheme with respect to the related existing scheme [10] in terms of PSNR, VIF, and NCD for different cover images. With the same experimental setup, the average PSNR value for Fazli et al.'s scheme is 56.604 dB, while Kang et al.'s scheme [12] reports an average PSNR of 42.5 dB for 60-bit message hiding. Moreover, the average Q-index value obtained for the proposed scheme is 0.9939. These comparative studies indicate that the proposed scheme offers reliable watermark imperceptibility.


Table 2. Robustness analyses against geometric distortions (BER from the extracted watermark)

Cover images: Mandrill, House, Jellybeans (USC-SIPI); im0001, im0320, im0370 (STARE); S0010, S0040, S0770 (HDR); ucid00104, ucid00576, ucid00786 (UCID).
Distortions: AT: Affine transform (matrix defined in Table 3), SC0.5: Scaling 0.5, SC1.5: Scaling 1.5, SC2.0: Scaling 2, SH0,0.5: Shearing (0, 0.5), SH0.5,0: Shearing (0.5, 0), SH0.5,0.5: Shearing (0.5, 0.5), AR1.5,1: Aspect ratio (1.5, 1), ROT15°: Rotation 15°, ROT30°: Rotation 30°.
All reported BERs are zero, except a single entry of 0.0313 for Mandrill and three entries of 0.0156 for S0770.

Table 3. BERs with respect to geometric and signal processing distortions (cover image Mandrill; '-' indicates a value not reported)

Attack | Proposed (α = 510) | Proposed (α = 663) | Fazli et al. [6] | Cedillo-Hernández et al. [10]
Geometric distortions
Rotation 75° | 0.0156 | 0 | - | 0/0.01
Translation (100, 100) | 0 | 0 | 0 | 0/0.01
Scaling (0.5) - (2) | 0 - 0 | 0 - 0 | 0 - 0 | 0 - 0 / 0.06 - 0.01
Flip horizontal & vertical | 0 | 0 | 0 | 0/0.61
Shearing (0.5, 0.5) | 0.0781 | 0 | - | 0/0.51
Aspect ratio (1.5, 1) | 0 | 0 | - | 0/0.20
Affine transformation [0.9 0.2 0; 0.1 1.2 0; 0 0 1] | 0 | 0 | 0 | 0/0.47
Signal processing distortions
Brightness | 0 | 0 | 0 | 0.53/0.04
Contrast | 0 | 0 | - | 0.53/0.01
Gaussian noise (0, 0.005) - (0, 0.015) | 0 - 0 | 0 - 0 | 0 - 0 | 0.53 - 0.53 / 0.04 - 0.12
Impulsive noise density (0.02) - (0.10) | 0 - 0 | 0 - 0 | - | 0.01 - 0.09 / 0.03 - 0.18
Median filter (3 × 3) | 0 | 0 | 0 | 0.53/0.01
Sharpen (0.2) | 0 | 0 | - | 0.53/0.03
Gaussian filter (3 × 3) | 0 | 0 | 0 | 0.53/0.01
Histogram equalization | 0 | 0 | 0 | 0.56/0.01

Moreover, the JPEG 2000 compression attack has been studied with different α values for the Lena cover image. Figure 5(b) shows that the scheme is robust up to compression ratios of 40, 80 and 100 for α = 510, α = 1275 and α = 2550, respectively.


Fig. 5. (a) Comparisons on watermark imperceptibility in terms of PSNR, VIF and NCD (b) JPEG 2000 compression vs BER for different α value

3.2 Comparison with Respect to Hybrid Attacks

A study on hybrid attacks is also considered in Table 4. It consists of different combinations of signal processing and geometric distortions. The proposed scheme has been compared with the existing scheme [8] with respect to BERs and NCs, as shown in Table 4. From this study, it has been found that the proposed scheme is 6.7% better in terms of average NC compared with Parah et al.'s scheme with respect to these hybrid attacks.

Table 4. BERs and NCs with respect to hybrid attacks

Attack | Proposed (α = 510) BER | Proposed (α = 510) NC | Parah et al. [8] BER | Parah et al. [8] NC
Salt & pepper noise (0.01) & Median filtering (3 × 3) | 0 | 1 | 0.1050 | 0.9405
Gaussian noise (0.001) & Scaling 0.5 | 0 | 1 | 0.1350 | 0.9060
Gaussian noise (0.001) & 5° Rotation & Sharpening (0.2) | 0.0938 | 0.8788 | 0.1670 | 0.8489
Histogram equalization & Sharpening (0.2) & Scaling 0.5 | 0.0776 | 0.9300 | 0 | 1

In addition, the BER values of the proposed scheme reach zero with respect to the scaling attack for scaling factors greater than 0.25, whereas Fang et al. reported BER values of 0.16 and 0.06 for scaling factors 0.5 and 0.75, respectively, and a BER of 0.03 even for scaling factors 1, 1.25 and 1.5. The proposed scheme also achieves zero BER with respect to median filtering, Gaussian filtering and average filtering attacks, whereas the scheme in [1] reports BER values of at least 0.22, 0.25 and 0.18 for median filtering, Gaussian filtering and average filtering, respectively. Finally, a performance summary of the proposed scheme is presented in Table 5. Here, the responses of the proposed scheme are compared briefly, in terms of imperceptibility and robustness, with different related existing schemes, namely Fang et al.'s scheme [1], Roy et al.'s scheme [2],


Table 5. Comparison with different existing and related image watermarking schemes

Feature | Ref. [1] | Ref. [2] | Ref. [6] | Ref. [8] | Ref. [10] | Proposed
Imperceptibility: Average PSNR (dB) | 44.5979 | 55.7365 | 56.6040 | 41.4667 | 48.6754 | 58.5383
Robustness: the original table marks each scheme as robust (✓) or not robust (✗) against JPEG compression, scaling and cropping; brightness; contrast, impulsive noise, aspect ratio and shearing; sharpen; histogram equalization; gamma correction and log transform; average filtering; median filtering; Gaussian filtering; circular average and horizontal motion filtering; Gaussian noise; salt & pepper noise; speckle and Poisson noise; translation, flip and affine transformation; and rotation. Only the attack categories compared are recoverable here; the individual marks are not reproduced.

Fazli et al.'s scheme [6], Parah et al.'s scheme [8] and Cedillo-Hernández et al.'s scheme [10]. It is found that, for a cover image of dimension (N × N), the overall time complexity of the proposed watermark embedding scheme is O(N^2). In terms of payload, the proposed scheme can hide a maximum of 2N bits for a cover image of dimension (N × N). From the entire set of studies, it has been observed that the proposed scheme is better than the existing related schemes for image copyright protection and authentication.

4 Conclusions

In this article, we have proposed a new robust color image watermarking scheme for copyright protection along with authentication. The proposed scheme provides confidentiality of the watermark information by employing the AES encryption technique. Further, the logistic map based random position generator has ensured the authentication. On the other hand, the scheme is specially designed to prevent geometric distortions. We have implemented a simplified algorithm to remove the geometric distortions. In order to protect the scheme against signal processing distortion, we have used two level DWT and DFT. A suitable coefficient has been selected in frequency domain to hide the watermark bits. The overall performance of the proposed scheme is better compared to existing DWT-DFT based watermarking schemes. The Hamming error correcting code has also been adopted to correct errors in extracted watermark. Consequently, in terms of imperceptibility and robustness, a significant improvement has been observed in the proposed scheme.


References 1. Fang, H., Zhou, H., Ma, Z., Zhang, W., Yu, N.: A robust image watermarking scheme in DCT domain based on adaptive texture direction quantization. Multimed. Tools Appl. 78(7), 8075–8089 (2018). https://doi.org/10.1007/s11042-0186596-y 2. Roy, R., Ahmed, T., Changder, S.: Watermarking through image geometry change tracking. Vis. Inform. 2, 125–135 (2018). https://doi.org/10.1016/j.visinf.2018.03. 001 3. De, S., Bhaumik, J., Dhar, P., Roy, K.: DCT-based gray image watermarking scheme. In: Communication, Devices, and Computing. Lecture Notes in Electrical Engineering (2018). https://doi.org/10.1007/978-981-10-8585-7 17 4. Singh, D., Singh, S.K.: DWT-SVD and DCT based robust and blind watermarking scheme for copyright protection. Multimed. Tools Appl. 76(11), 13001–13024 (2016). https://doi.org/10.1007/s11042-016-3706-6 5. Chen, A., Wang, X.: An image watermarking scheme based on DWT and DFT. In: Proceedings of 2nd International Conference on Multimedia and Image Processing (2017). https://doi.org/10.1109/ICMIP.2017.51 6. Fazli, S., Moeini, M.: A robust image watermarking method based on DWT, DCT, and SVD using a new technique for correction of main geometric attacks. Optik 127, 964–972 (2016). https://doi.org/10.1016/j.ijleo.2015.09.205 7. Su, Q.: Novel blind colour image watermarking technique using Hessenberg decomposition. IET Image Process. 10, 817–829 (2016). https://doi.org/10.1049/iet-ipr. 2016.0048 8. Parah, S.A., Sheikh, J.A., Loan, N.A., Bhat, G.M.: Robust and blind watermarking technique in DCT domain using inter-block coefficient differencing. Digit. Signal Process. 53, 11–24 (2016). https://doi.org/10.1016/j.dsp.2016.02.005 9. Advith, J., Varun, K.R., Manikantan, K.: Novel digital image watermarking using DWT-DFT-SVD in YCbCr color space. In: Proceedings of International Conference on Emerging Trends in Engineering, Technology and Science (2016). https:// doi.org/10.1109/ICETETS.2016.7603032 10. Cedillo-Hern´ andez, M., Garc´ıa-Ugalde, F., Nakano-Miyatake, M., P´erez-Meana, H.M.: Robust hybrid color image watermarking method based on DFT domain and 2D histogram modification. Signal Image Video Process. 8(1), 49–63 (2014). https://doi.org/10.1007/s11760-013-0459-9 11. Sawadogo, A., Lafon, D., Gb´et´e, S.D.: Statistical analysis of rank data from a visual matching of colored textures. J. Appl. Stat. 41, 2462–2482 (2014). https:// doi.org/10.1080/02664763.2014.920775 12. Kang, X., Huang, J., Shi, Y.Q., Lin, Y.: A DWT-DFT composite watermarking scheme robust to both affine transform and JPEG compression. IEEE Trans. Circuits Syst. Video Technol. 13, 776–786 (2003). https://doi.org/10.1109/TCSVT. 2003.815957 13. University of California, San Diego. STARE Image Database. https://cecas. clemson.edu/ahoover/stare/. Accessed 02 May 2018 14. Nottingham Trent University, UK. UCID Image Database. http://jasoncantarella. com/downloads/ucid.v2.tar.gz. Accessed 02 May 2018 15. University of Southern California. The USC-SIPI Image Database. http://sipi.usc. edu/database/database.php.. Accessed 02 May 2018 16. Funt, F.N., et al.: HDR dataset computational vision lab computing science. Simon Fraser University, Burnaby, BC, Canada. http://www.cs.sfu.ca/colour/ data/funthdr/. Accessed 02 May 2017

LSB Steganography Using Three Level Arnold Scrambling and Pseudo-random Generator Sayak Ghosal , Saumya Roy , and Rituparna Basak(B) University of Engineering and Management, Kolkata 700156, India [email protected]

Abstract. Cryptographic Image Scrambling is a widely relevant method used to transform images that are coherent, into an incomprehensible form of the image, to make it more tolerant to cryptanalysis. To minimize the possibilities of information detection, we intend to propose a novel solution, that utilizes multilevel Arnold’s transformation along with LSB steganography to further improve the security of the encryption process. Addition of a pseudo-random generator enhances the solution by lowering the penetrability risk and confusing any such attacks from threatening sources. Metrics, like NCPR, UACI and PSNR values between original and stego-image, have been used that confirms the viability of our scrambling technique. Keywords: Image scrambling · Arnold’s mapping · Pseudo-random generator · Private key · Encryption · Data hiding

1 Introduction Image steganography is an admissible method to hide important data in an inconspicuous manner within an image, which prevents unauthorized access of information. The fact that the hidden information is not comprehended by the naked eye, puts it into an advantage over cryptography, where the proof of encryption is evident. Data hiding [9, 13, 15] mechanisms encipher information and conveys its protection. The visual quality of the stego-image and the implementation of data hiding gives a metric of evaluation for data hiding performance. To perform this, we have utilized LSB steganography [2], where the Chromatic: Red value (of YCbCr) [4] is converted into its 8-bit binary counterpart. The least significant bits are then embedded with the data bits of our message, which particularly hides it in the vast array of information. The change in Chromatic red values of YCbCr, is not easily picked up by the human eye, thus it apparently produces no change to the image after LSB steganography. In nature, image pixels have a high degree of correlation with their neighboring pixels and hence various algorithms are utilized to properly scramble an image. Arnold’s Chaotic Map (ACM) [18] is generally used to randomize the pixel order of the image, however a simple single-layered transformation is not sufficient enough to provide data protection. Addition of Least significant bit steganography further increases the robustness of the solution. Therefore, we propose a novel approach, where three-layered ACM (TLA), is utilized along with © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 D. Giri et al. (Eds.): ICNSBT 2021, LNNS 481, pp. 105–116, 2022. https://doi.org/10.1007/978-981-19-3182-6_9


LSB steganography to produce a more tolerant solution with lower susceptibility to security threats. In the mapping function, the new position of each pixel during the scrambling process is obtained from a mapping function that takes the initial position as input. The function is one-to-one and invertible. The output of the function produces new position values and reorganizes the pixels. This permutation shifts the pixels from their original positions, which renders the transposed image encrypted and unreadable. A pseudo-random generator is then used that changes the RGB values of the pixels randomly with the help of a key, which in practice changes the intensity of each pixel. In the resulting new pixel organization, adjacent pixels with naturally close values take on considerably different values, making it difficult to decipher the encrypted image. We have primarily used JPEG-format images for the steganography because the metrics used to assess efficiency have shown good results for JPEG images.
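To make the Cr-channel LSB step concrete, a minimal OpenCV/NumPy sketch is given below; the bit packing, the use of a single LSB per Cr sample, and the function names are illustrative assumptions rather than the authors' exact implementation, and the stego data is kept in YCrCb form here because an integer round trip through RGB can disturb the embedded LSBs.

```python
import cv2
import numpy as np

def embed_in_cr_lsb(bgr_image, message: bytes):
    """Hide message bits in the least significant bits of the Cr channel."""
    ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)
    cr = ycrcb[:, :, 1].flatten()                       # Cr plane as a flat copy
    bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
    if bits.size > cr.size:
        raise ValueError("message too long for this cover image")
    cr[:bits.size] = (cr[:bits.size] & 0xFE) | bits     # overwrite the LSBs
    stego = ycrcb.copy()
    stego[:, :, 1] = cr.reshape(ycrcb[:, :, 1].shape)
    return stego                                        # stego image in YCrCb

def extract_from_cr_lsb(stego_ycrcb, n_bytes):
    """Read back n_bytes hidden in the Cr channel LSBs."""
    bits = stego_ycrcb[:, :, 1].flatten()[: n_bytes * 8] & 1
    return np.packbits(bits).tobytes()
```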

2 Literature Survey In paper [1] by Hariyanto et al. the author proposed the application of Arnold Mapping in Digital Image Encryption. The author described the algorithm and its properties. A key is used for the scrambling process which determines the number of iterations required to generate the new desired positions. The algorithm is applied on square images. Gupta et al. [2] proposed a chaotic mapping scheme for image cryptography in ‘Image Encryption Using Chaotic Maps’. The author has introduced the basics of key based chaotic maps which has generated promising results. The author has used fixed key space, which is a drawback of the proposed solution. In paper [3] Bhallamudi et al. studies the various processes of Image Steganography. The author has proposed a model for spatial domain Steganography using gray scale images which is a major drawback. The author has also studied the possibilities of image steganography in frequency domain by DCT steganography [12]. Dr. Amarendra K et al. in ‘Image Steganography Using LSB’ [4] proposes a model based on LSB steganography. The model hides the data in the RGB channels of the image. The author also proposes a key based LSB steganography. The drawback of this paper is that, the author has used the RGB channels for bit manipulation which is prone to steganalysis [14]. In paper [5] by Alwan et al. the author proposes image color image steganography in YCbCr space. Since the human eye is not very sensitive to changes in Chrominance, the author has chosen the Cb and Cr channels for data hiding. A XOR operation is used to embed the message bits into the LSB bits of these two channels.


In paper [6] by Shankar S et al., the author has proposed a promising solution to overcome the problem of periodicity involved with Arnold’s mapping. The model proposed in this paper performs a basic Arnold mapping followed by key based random generator. To overcome the problem of periodicity, the model uses Pseudo random generators to change the pixel values of the image. Though the model shows promising results, Arnold mapping is applied only once which makes the model still vulnerable to foreign attacks. Min et al. [6] proposes a novel approach towards pseudo-randomness and pseudorandom generators. The paper introduces a PLCM. The paper combines PLCM with the logistic map to design a chaotic pseudo-random generator. The application of the proposed method seems promising in the field of image encryption. In another paper [7] Min et al. tries to overcome the drawback of Arnold’s Algorithm by dividing rectangular images into square blocks and performing Arnold Transformation on each block. The solution seems quite promising and works well with most rectangular images. But the proposed solution does not overcome the periodicity problem of Arnold Transform. Moreover, for this model to work, the width and the height should have a common factor other than one.

3 Proposed Solution
In a 24-bit image, the 8th (least significant) bit of each channel can be replaced with a secret message bit, so 3 bits can be stored in a pixel. The small variations in colour intensity produced by changing the LSB cannot be perceived by the naked human eye [3]. We then perform image scrambling, in which the image pixels are scattered. An M × N RGB image is taken as input to the encryption model and converted to the YCbCr colour scale. In the YCbCr colour scale, 'Y' indicates the luminance/intensity value of each pixel, and Chromatic Blue (Cb) and Chromatic Red (Cr) carry the chrominance of the colour information. Since a small change in chrominance is not detected by the human eye, the model applies LSB steganography to the Cr channel of the image, keeping the overall deviation from the original image minimal. The image is then converted back to its RGB counterpart. Here the first stego image is generated.
Arnold's Mapping Algorithm: Next, the generated image is passed through 3 layers of key-based Arnold transform. The traditional Arnold transform can only be applied to square images; our proposed model augments the traditional algorithm for rectangular images. The augmented Arnold mapping is given by:

[x_{n+1}, y_{n+1}]^T = [[1, 1], [1, 2]] [x_n, y_n]^T mod D    (1)

where (x_n, y_n)^T is the current position of a pixel and (x_{n+1}, y_{n+1})^T is its Arnold-mapped position at the (n + 1)-th iteration. The number of iterations n at every level of Arnold mapping is equal to the private key (K) given as input. For an N × M image, D is defined as:

D = N if N > M, and D = M if M > N    (2)
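A compact NumPy sketch of the Arnold scrambling of Eq. (1) on a square D × D block is given below, together with its inverse (the matrix [[1, 1], [1, 2]] has determinant 1, so [[2, -1], [-1, 1]] mod D undoes it); the function names and the key-as-iteration-count usage are illustrative.

```python
import numpy as np

def arnold_scramble(block, iterations):
    """Apply the Arnold map of Eq. (1) `iterations` times to a square D x D block."""
    d = block.shape[0]
    out = block.copy()
    xs, ys = np.meshgrid(np.arange(d), np.arange(d), indexing="ij")
    for _ in range(iterations):
        nx = (xs + ys) % d            # x' = (1*x + 1*y) mod D
        ny = (xs + 2 * ys) % d        # y' = (1*x + 2*y) mod D
        nxt = np.empty_like(out)
        nxt[nx, ny] = out[xs, ys]     # move each pixel to its mapped position
        out = nxt
    return out

def arnold_unscramble(block, iterations):
    """Undo the scrambling by applying the inverse map [[2, -1], [-1, 1]] mod D."""
    d = block.shape[0]
    out = block.copy()
    xs, ys = np.meshgrid(np.arange(d), np.arange(d), indexing="ij")
    for _ in range(iterations):
        nx = (2 * xs - ys) % d
        ny = (-xs + ys) % d
        nxt = np.empty_like(out)
        nxt[nx, ny] = out[xs, ys]
        out = nxt
    return out

# Round trip on a random 64 x 64 block with key K = 7.
blk = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
assert np.array_equal(arnold_unscramble(arnold_scramble(blk, 7), 7), blk)
```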


Block Segregation: • The image taken is divided into M × N rectangular blocks with each block having a positional coordinate in the spatial domain. Every block that is obtained from segregation can hence be individually treated as an image themselves. A block dimension can thus be defined as the number of pixels that constitutes it. For example, an image of 512 × 512 dimensions can be broken down into smaller blocks, with each block having a dimension of 32 × 32. Thus the image is segregated into total of 16 blocks. The subpart within a block, defined as a segment, can either be a single pixel or a cluster of pixels. • Augmented Arnold mapping [8] is applied on the basis of the private key, K1, to scramble the pixel positions within each block. A segment within a block can be defined as a single pixel or a cluster of pixels. For each block, the segments positions are reorganized. Let the set of segments of the block be denoted by S. We define a function f: S → S, i.e. f gives permutation of the segments. In action, when a segment within a block is mapped, it may so happen that the mapped coordinate of the segment not be in the range of the block dimensions. The problem then arises, that since it is a one-to-one function, the segment may not be mapped and the transformation becomes ineffective. Such scenarios are handled by embedding gaussian segments with random values within color range 0 to 255 into the generated segments, so that the mapping is effective. We intentionally do not define the range of the function. This reduces redundancy and increases the robustness of the algorithm. • Similarly, the augmented Arnold mapping is applied on the basis of the private key, K2, to map the positions of the individual blocks in the spatial domain. Let the set of blocks of the image be denoted by B. We define a function f: B → B, i.e. f gives permutation of the blocks. In action, when a block within an image is mapped, it may so happen that the mapped coordinate of the segment not be in the range of the block dimensions. The problem then arises, that since it is a one-to-one function, the block may not be mapped and the transformation becomes ineffective. Such scenarios are handled by inserting cluster of gaussian blocks with random values within color range 0 to 255 into the generated segments, so that the mapping is effective. This reduces redundancy and increases the robustness of the algorithm. The entire image is furthermore inter-mapped on a lowest unit basis (a pixel), to further scramble and scatter the image from the original image. This is done with the help of our augmented Arnold’s map, with the private key K3.


Fig. 1. Application of Arnold’s Transformation to the Image I(x,y), produces mapping f:B(3,6) → B(4,8)

Pseudo-random generator: Our proposed method utilizes a key-based (key K4) pseudo-random generator [16] to randomize the color information (RGB) of the pixels by generating a sequence of random values [17]. The new pixel values are computed by adding these random values to the original pixel values. This ensures that adjacent pixels, which may be closely correlated, end up with considerably different values, making it difficult to decipher the encrypted image. The restructured image I(x', y') produced from an N × M image is equidimensional (a square image). The image obtained after this third level of scrambling is the one transmitted.
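A minimal sketch of this step is shown below. It seeds Python's standard generator with K4 as a stand-in for the Blum–Blum–Shub-style generator cited in [16, 17] (an assumption, not the authors' construction), and adds the generated values to the pixels modulo 256 so the step is exactly invertible with the same key.

```python
import numpy as np

def randomize_pixels(img, k4, reverse=False):
    """Add (or subtract, for descrambling) a key-K4-driven random byte stream
    to every color value, modulo 256, so adjacent correlated pixels diverge."""
    rng = np.random.default_rng(k4)              # stand-in keyed generator
    noise = rng.integers(0, 256, size=img.shape, dtype=np.int16)
    pixels = img.astype(np.int16)
    mixed = (pixels - noise) if reverse else (pixels + noise)
    return np.mod(mixed, 256).astype(np.uint8)

# The receiver recovers the pixels with the same key:
# original = randomize_pixels(randomize_pixels(img, 1234), 1234, reverse=True)
```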

Fig. 2. The outcome of scrambling with three-level ACM and Pseudo-random generator


Block Diagram: [The encryption pipeline shown in the block diagram: N×M image → conversion to YCbCr → insert text into the Cr channel → convert back to RGB → segregate the image into blocks and introduce Gaussian blocks → scramble pixels within each block with Arnold's scrambling (Key 1) → inter-map the positions of the blocks with Arnold scrambling (Key 2) → inter-map the whole picture on a pixel basis (Key 3) → change RGB values with the pseudo-random generator (Key 4) → output as a square image.]

Descrambling: The original image can be recovered by a process very similar to scrambling. First, the pseudo-random generator is re-run with its key K4 to regenerate the sequence of random values used for the initial change in pixel values; this sequence is then subtracted from the pixel values to recover the original color information. The image is then again broken into structural blocks. To re-scatter the image pixels towards their original arrangement, Arnold's mapping with key K3 is used. Once this initial mapping is done (which was the final step of our scrambling algorithm), the positions of the blocks are inter-mapped with key K2, which causes the blocks to regain their previous positions, much like in the source image. With the help of the private synchronous key K1, we apply Arnold's pixel transformation once more, this time on each smaller distinguished segment. Finally, the embedded Gaussian blocks are removed to regain the original dimensionality of the image. The image is then converted into the YCbCr color format and the Cr channel is considered for extraction of the hidden data: the LSB bits of the Cr channel contain the actual hidden message, so we extract the data using reverse LSB steganography on the Cr channel. Thus the original image I(x, y) is reproduced.

Fig. 3. The outcome of decryption of the stego image: (a) pixel scrambled; (b) block scrambled; (c) stego image.

Experimental Analysis: The proposed three-level Arnold mapping solution essentially inserts the text data into the YCbCr color image and then breaks the image into structural blocks, where the segments of the blocks are transformed using Arnold's chaotic mapping. Thereafter, the blocks themselves are reorganized and, finally, the restructuring occurs at the pixel level. This is the entire process of scrambling that takes place in the scrambling phase, and the RGB image thereby gets completely scrambled and scattered. In the reverse process, the blocks that were segregated are reintegrated and the stego image is retrieved. To measure the efficiency of our algorithm, we perform an analysis with metrics such as NPCR, UACI and PSNR. NPCR, the number of pixel change rate between the scrambled image and the original image, and UACI, the unified average changed intensity, are the two most common quantities used to evaluate the strength of image encryption algorithms/ciphers with respect to differential attacks. NPCR can be represented as

\mathrm{NPCR}\% = \frac{\sum_{i=1}^{M}\sum_{j=1}^{N} P(i,j)}{MN} \times 100

UACI, the unified average changed intensity, measures the change in intensity between the original and the scrambled image; the higher the value, the better. It can be represented as

\mathrm{UACI}\% = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} \frac{\left| I_{P}(i,j) - I_{S}(i,j) \right|}{255} \times 100

PSNR stands for peak signal-to-noise ratio. It is an expression for the ratio between the maximum possible value (power) of a signal and the power of the distorting noise that affects the quality of its representation; for a scrambled image, the lower the PSNR, the better. It is defined through the mean squared error

\mathrm{MSE} = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1} \left[ I(i,j) - K(i,j) \right]^{2}

When the noise is zero the MSE vanishes and the PSNR tends to infinity; in practice, zero noise is an ideal case. The PSNR (in dB) can be defined as

\mathrm{PSNR} = 10 \cdot \log_{10}\!\left(\frac{\mathrm{MAX}_I^{2}}{\mathrm{MSE}}\right) = 20 \cdot \log_{10}\!\left(\frac{\mathrm{MAX}_I}{\sqrt{\mathrm{MSE}}}\right) = 20 \cdot \log_{10}(\mathrm{MAX}_I) - 10 \cdot \log_{10}(\mathrm{MSE})

Histogram analysis is a formidable form of attack on a stego-image: the original and stego-image histograms are compared to understand the pixel distribution or to spot unusual shapes introduced by the embedding algorithm.
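The three metrics above can be computed directly from the original and processed images; a small NumPy sketch (ours, with 8-bit images assumed, so MAX_I = 255) is given below.

```python
import numpy as np

def npcr(original, scrambled):
    """Percentage of pixel positions whose value changed."""
    changed = (original != scrambled).astype(np.float64)
    return changed.mean() * 100.0

def uaci(original, scrambled):
    """Mean absolute intensity change, normalized by 255, in percent."""
    diff = np.abs(original.astype(np.float64) - scrambled.astype(np.float64))
    return (diff / 255.0).mean() * 100.0

def psnr(original, processed, max_i=255.0):
    """Peak signal-to-noise ratio in dB (infinite when the images are equal)."""
    mse = np.mean((original.astype(np.float64) - processed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_i ** 2 / mse)
```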

Fig. 4. Histogram analysis for sample Baboon: (a) histogram of original image; (b) histogram of stego image.

Fig. 5. Histogram analysis for sample MSG V: (a) histogram of original image; (b) histogram of stego image.

Fig. 6. Histogram analysis for sample Lena: (a) histogram of original image; (b) histogram of stego image.

The histograms shown above plot the number of pixels per color value, segregated into red, green and blue curves, i.e., they give the histogram analysis for the red, green and blue channels. On comparing the graphs of the stego image and the original image, the two histograms are statistically dissimilar, so steganalysis based on these graphs will fail. Table 1 presents the various metrics used to analyze our proposed model; the results obtained are excellent and the model performs better than other similar models already proposed.

Table 1. Metric computation of various ciphered color images

Samples    PSNR     UACI     NPCR
MSG V       6.26    29.64    99.59
Baboon     27.02    28.52    99.58
Lena        7.29    29.03    99.57

Table 2 represents the whole encryption process for three different images, Lena, Baboon and MSG V. The original images are shown in column 2, the block-scrambled images in column 3, and the final generated scrambled images in column 4. We have also tested against the chi-square attack and obtained a probability value of pr ≥ 0.80 with 100% embedding, and an MSE value of 0.0345 for the Lena image. Image LSB steganography can be evaluated with the help of three parameters, namely capacity, imperceptibility and image security. Histogram analysis is an efficient way to assess the security of a stego-image, i.e., how well it prevents an attacker from acquiring the data hidden in the image. The metrics we have used for image quality measurement are PSNR, NPCR and UACI; the higher the image quality, the less the distortion of the image. The payload capacity is the number of secret bits embeddable in the image while keeping the distortion of the stego-image minimal, and the bit rate is the number of bits that can be concealed in a pixel.

Table 2. Image Scrambling process

Comparative study: We compare our work with three existing scrambling algorithms. In paper [10], the authors propose a two-level Arnold transformation: rectangular images are divided into square blocks and two layers of Arnold scrambling are then applied. The paper delivers promising results, but the blocks must be of square dimensions and, for the algorithm to work, the width and height should not be co-prime. Our model implements the Arnold transform without any restriction on image or block sizes. In paper [11], the authors propose a similar triple chaotic scrambling algorithm, taking a completely different approach based on a key-based Hilbert curve for scrambling and also focusing on preventing statistical attacks on images; they use NPCR and UACI for performance analysis. The results of that paper are compared with our proposed model in Table 4, from which it can be seen that our model, while simpler, is still robust and efficient for image encryption. In paper [5], the authors propose a single-layer Arnold transform with a pseudo-random generator to prevent statistical attacks. The model produces promising outputs, but its layer of security is shallow and can be deciphered by brute-force attacks; moreover, it is only valid for square images. They use the PSNR value for performance analysis, and Table 5 shows the comparative study. As stated in the proposed solution, our model has no restrictions on image dimensions and has additional layers of security with encryption keys at each stage.


The comparisons are given in the tables below.

Table 3. Metric comparison for sample image Baboon

Paper name                                                   NPCR     UACI
Proposed model                                               99.58    28.52
Image scrambling through two level Arnold transform [10]     99.28    18.94

Table 4. Metric comparison for sample image Lena

Paper name                                                                          NPCR     UACI
Proposed model                                                                      99.57    29.03
Triple chaotic image scrambling on RGB – a random image encryption approach [11]*   99.36    30.06

*We have taken the average of the NPCR and UACI values presented by the authors.

Table 5. PSNR comparison for sample image Lena

Paper name                                                   PSNR
Proposed model                                               7.29
Data hiding in encrypted images using Arnold transform [5]   8.03

4 Conclusion

In this paper, our proposed image-scrambling algorithm utilizes Arnold's chaotic map. The image is segregated into blocks, which are further divided into segments; these segments are scrambled first, after which the blocks themselves are transformed. The reverse procedure is followed for descrambling with the help of the respective private keys used in each step. Compared to papers [5, 10] and [11], our model shows improvements in the image scrambling process while keeping the complexity lower than that of paper [11]. The model is resistant to statistical attacks and brute-force attacks. The scrambled image can further be hidden in a cover image of appropriate dimensions, and algorithms such as edge detection, skin detection, etc. can be used along with our model to increase the PSNR value of the cover image so that it resembles the original image. Newer approaches like CycleGAN have shown promising results in increasing the PSNR of the cover image.


References
1. Hariyanto, E., Rahim, R.: Arnold's Cat Map Algorithm in Digital Image. https://doi.org/10.21275/ART20162488
2. Gupta, A., Gupta, A.: Image Encryption Using Chaotic Maps. ISSN (Online) 2348–7550 (2015)
3. Amarendra, K., Mandhala, V.N., Chetan Gupta, B., Geetha Sudheshna, G., Venkata Anusha, V.: Image Steganography Using LSB. ISSN 2277–8616
4. Alwan, Z.A., Farhan, H.M., Mahdi, S.Q.: Color image steganography in YCbCr space. Int. J. Electr. Comput. Eng. 10(1), 202 (2020)
5. Siva Shankar, S., Rengarajan, A.: Data Hiding in Encrypted Images Using Arnold Transform. ISSN: 0976–9102 (Online)
6. Min, L., Hu, K., Zhang, L., Zhang, Y.: Study on pseudo-randomness of some pseudorandom number generators with application. In: 2013 Ninth International Conference on Computational Intelligence and Security, pp. 569–574 (2013). https://doi.org/10.1109/CIS.2013.126
7. Bhallamudi, S.: Image Steganography. https://doi.org/10.13140/RG.2.2.21323.18727
8. Li, M., Liang, T., He, Y.-J.: Arnold Transform Based Image Scrambling Method. https://doi.org/10.2991/icmt-13.2013.160
9. Bender, W., Gruhl, D., Morimoto, N., Lu, A.: Techniques for data hiding. IBM Syst. J. 35(3.4), 313–336 (1996)
10. Satish, A., Erapu, V.P., Tejasvi, R., Swapna, P., Vijayarajan, R.: Image scrambling through two level Arnold transform. In: Alliance International Conference on Artificial Intelligence and Machine Learning (AICAAM) (2019)
11. Padmapriya, P., Rengarajan, A., Karuppuswamy, T., John Bosco, B.R.: Chaotic image scrambling on RGB – a random image encryption approach. Secur. Commun. Network. 10(1), 3335–3345 (2010)
12. Walia, E., Jain, P.: An analysis of LSB & DCT based steganography. Global J. Comput. Sci. Technol. 8(6), 3043–3055 (December 2016–January 2017)
13. Joshi, K., Yadav, R., Chawla, G.: An enhanced method for data hiding using 2-bit XOR in image steganography. Int. J. Eng. Technol. 8(6), 3043–3055 (2016)
14. Provos, N.: Defending against statistical steganalysis. In: Proceedings of the 10th USENIX Security Symposium, pp. 323–335 (2001)
15. Marvel, L.M., Boncelet, C.G., Retter, C.T.: Reliable blind information hiding for images. In: Aucsmith, D. (ed.) IH 1998. LNCS, vol. 1525, pp. 48–61. Springer, Heidelberg (1998). https://doi.org/10.1007/3-540-49380-8_4
16. Blum, L., Blum, M., Shub, M.: A simple unpredictable pseudo-random number generator. SIAM J. Comput. 15(2), 364–383 (1986)
17. Blum, M., Micali, S.: How to generate cryptographically strong sequences of pseudo-random bits. In: IEEE 23rd Symposium on the Foundations of Computer Science, pp. 112–117 (1982)
18. Mohammed, H.: Performance evaluation measurement of image steganography techniques with analysis of LSB based on variation image formats. Int. J. Eng. Technol. 7(4), 3505–3514 (2018)

Network, Network Security and their Applications

Performance Analysis of Retrial Queueing System in Wireless Local Area Network N. Sangeetha(B) , J. Ebenesar Anna Bagyam, and K. Udayachandrika Avinashilingam Institute for Home Science and Higher Education for Women, Coimbatore, India [email protected], [email protected], [email protected]

Abstract. A general-service batch arrival feedback retrial G-queue with vacation is considered. Along with the essential service, the server provides service in S phases, and several different optional services are available at each level. The server's status determines whether or not each individual customer is admitted to the system. Positive customers arrive in batches according to a Poisson process. When a negative customer enters the system, the server goes down and the customer in service is removed from the system; the failed server is immediately sent for repair. If the server is idle, one of the admitted customers instantly enters service and the others join the orbit; otherwise, all admitted customers enter the orbit. After the completion of essential or optional services, a customer who is dissatisfied with the service may rejoin the orbit as feedback. The server leaves on vacation with a certain probability after providing service to a customer. The probability generating functions of the system are derived using the supplementary variable approach, and various performance measures are developed. The stochastic decomposition property is verified and special cases are deduced. The joint distributions of the server state and the number of customers in the retrial group are obtained, and the influence of the parameters on the system performance metrics is investigated. Keywords: Retrial G-queue · Optional service · WLAN · System performance

1 Introduction

Many researchers have examined retrial queues with server vacations, notably Choudhury and Ke [6] and Bagyam and Udayachandrika [4]. The stochastic modelling of many real-life situations leads to the use of a retrial queueing model with feedback. For example, in data transmission, a packet transported from the source to the destination may be returned, and this process may be repeated until the packet is entirely delivered; this form of retransmission is referred to as feedback. Arivudainambi and Gowsalya [2] analysed a feedback retrial queue with two types of service and Bernoulli vacation. Sharma [13] studied a repairable faulty server queueing model with Bernoulli feedback as well as a modified vacation strategy. Many batch arrival queueing systems have a constraint that prevents all clients from joining the system at the same time; this policy is known as the admission control policy.


The admission control mechanism in retrial queueing systems was introduced by Artalejo et al. [3]. Dimitriou [7] considered a failed single-server retrial queue with batch arrivals and admission control and investigated the stability condition and the joint steady-state queue length distribution. Niranjan et al. [12] analysed an immediate-feedback batch retrial queueing system with admission control, multiple vacations and a threshold. Gelenbe [9] proposed queueing models with negative customers, or G-queues, which have been used to represent neural networks, multi-processor computer systems, and industrial processes. Rajadurai et al. [15] investigated an unreliable retrial G-queue with orbital search, feedback, and Bernoulli vacation. Li and Zhang [11] examined an M/G/1 retrial G-queue with general retrial times in which the server is prone to operating failures and repairs. Many researchers have recently developed retrial queueing models with two or more stages. Dimitriou and Langaris [8] analysed a retrial queue with two phases of service and server vacations. Wang and Li [20] considered a multi-optional-service single-server retrial queueing system. Salehirad and Badamchi Zadeh [16] and Abdollahi and Salehirad [1] studied multiphase queueing systems with feedback, and Bagyam and Udayachandrika [4] developed the model with the concept of retrial queues. Radha et al. [14] investigated a group arrival faulty-server retrial G-queue with multi-optional stages of service and orbital search. Sangeetha and Udayachandrika [17] derived a feedback batch arrival multistage retrial G-queue with fluctuating modes of service. Jain and Kaur [10] derived an M^X/G/1 retrial queueing model combining all of these system features, owing to the applications of such models in real-life scenarios. Santhi [19] studied an M/G/1 retrial queue with second optional service and an exponentially distributed multiple working vacation policy. In the majority of the retrial queueing literature, optional services are offered at two or more levels; every level, however, has only one service. In this work a batch arrival retrial G-queue is examined with many levels of service and multiple alternatives at each level.

2 Real-World Application

This model might be used to improve the performance of Wireless Local Area Networks (WLANs) that use transmission protocols. Messages of varying length arrive at the stations and are then separated into a number of packets (Positive Customers) before being transferred to the destination station through wireless channels. The availability of the wireless channels is determined prior to transmission using specialised transmission protocols such as IEEE 802.11. If the channel is found to be accessible, one packet is chosen to be transmitted automatically, while the remainder are held in a buffer (Orbit). If the server is busy, all packets must be kept in the buffer and are retried later (Retrial). Unsuccessfully transferred messages are resent in this framework (Feedback). However, not all packets can be transferred successfully; to avoid congestion, each incoming packet is admitted to the system with a certain probability (Admission Control). Each sensor node is capable of sensing the intended physical phenomenon and routing the monitoring information toward the destination node through multi-optional paths


with no infrastructure. All possible paths for the selected source and destination pair are divided into two phases: a first phase of essential paths and a second phase of multi-optional paths. In the second phase, the multi-optional paths are subdivided into stages consisting of all possible node-to-node transmissions; the selection of nodes in each stage is collectively known as the multistage and multi-optional services. After the completion of the first essential path, the transmitted data may reach the destination directly; otherwise, it is allowed to pass through the stage-by-stage multi-optional paths. Transmission problems due to radio propagation in wireless sensor networks have been observed (Negative Customers). The interference that causes signal deterioration in radio links may be determined by computing the amplitude histograms of the received signal strength and their features (Repair). When there are no incoming requests, the server may perform self-upgrading maintenance (Vacation).

3 Model Description

The basic assumptions of the model under study are described here.

3.1 Arrival Manner

Negative clients arrive at rate λ⁻ according to a Poisson process. Positive clients arrive in batches according to a Poisson process with rate λ⁺. The sizes Y₁, Y₂, … of successive arriving batches are independent and identically distributed random variables with p.m.f. c_k = P(Y = k), k ≥ 1, and PGF C(γ), with m₁ and m₂ as the first two moments. Let ω_j, j = 1 to 4, be the probability of admitting a client when the server is idle, busy, under repair and on vacation, respectively. Then, for an arriving batch, the probability that k clients are admitted is

a_{j,k} = \begin{cases} \sum_{m=1}^{\infty} c_m (1-\omega_j)^m, & k = 0,\; j = 1 \text{ to } 4 \\ \sum_{m=k}^{\infty} c_m \binom{m}{k} \omega_j^{k} (1-\omega_j)^{m-k}, & k \ge 1,\; j = 1 \text{ to } 4 \end{cases}

Let λ⁺_j = λ⁺ \sum_{k=1}^{\infty} a_{j,k} for j = 1 to 4. The PGF a_j(γ) of {a_{j,k}, k ≥ 0} is given by a_j(γ) = C(ω_j γ + (1 − ω_j)), with ω_j m₁ and ω_j² m₂ as its first two moments, for j = 1 to 4. A small numerical sketch of these quantities is given after Sect. 3.2.

3.2 Retrial Manner

If an approaching batch of clients finds that the server is busy, on vacation, or inoperable, the batch joins the orbit. If the server is available, the first-stage service for one of the approaching clients begins, and the others enter the orbit.
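As an illustration of the thinning in Sect. 3.1, the sketch below computes a_{j,k} and λ⁺_j numerically for a geometric batch-size distribution (the distribution later used in Sect. 7); the truncation point and all names are our own choices, not part of the paper.

```python
import math

def admission_probs(batch_pmf, omega, k_max):
    """a_{j,k}: probability that exactly k customers of an arriving batch are
    admitted when each one is admitted independently with probability omega
    (binomial thinning of the batch-size distribution)."""
    a = [0.0] * (k_max + 1)
    for m, c_m in enumerate(batch_pmf, start=1):     # batch_pmf[m-1] = P(Y = m)
        for k in range(0, min(m, k_max) + 1):
            a[k] += c_m * math.comb(m, k) * omega**k * (1 - omega)**(m - k)
    return a

# Geometric batch size with mean 1/sigma, truncated for the computation.
sigma, lam_plus, omega1 = 0.5, 1.7, 0.6
pmf = [sigma * (1 - sigma)**(m - 1) for m in range(1, 80)]
a1 = admission_probs(pmf, omega1, k_max=10)
lam1_plus = lam_plus * (1 - a1[0])   # lambda_1^+ = lambda^+ * sum_{k>=1} a_{1,k}
print(round(a1[0], 4), round(lam1_plus, 4))
```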


3.3 Service Mode and Feedback Rule

The service mode and feedback rule follow the same manner as discussed in the paper of Sangeetha and Udayachandrika [18].

3.4 Removal and Repair Procedure

The arrival of a negative client removes the positive customer in service from the system, bringing the server to a halt. The failed server is repaired as soon as possible.

3.5 Vacation Procedure

After completing service to a client, the server may take a single vacation with probability ϑ or remain in the system with the complementary probability 1 − ϑ.

Table 1. Notations for the prescribed model

Table 1 shows the necessary notations of the model.


4 Steady State Distributions For the process {N(t), t ≥ 0}, I0 (t), Ik (α, t)dα, P0,k (α, t)dα, Pr,qr ,k (α, t) dα,R0,k (α, t)dα, Rr,qr ,k (α, t)dα andV(α, t) dα define the probability that the server is idle at time t in empty and non-empty system, busy in essential and optional services, under repair during essential and optional services and on vacation respectively. Let I0 , Ik (α), P0,k (α), Pr,qr ,k (α), R0,k (α), Rr,qr ,k (α) and Vk (α) be respectively the steady state probabilities of I0 (t), Ik (α, t), P0,k (α, t), Pr,qr ,k (α, t), R0,k (α,t), Rr,qr ,k (α, t) and Vk (α, t) (r = 1 to S, qr = 1 to lr ). 4.1 Steady State Equations The system of steady state equations that governs the model is ⎡ ⎤ ∞ lr ∞ S   ⎣ λ+ P0,0 (α) μ0 (α) dα + ϕr Pr,qr ,0 (α) μr,qr (α) dα⎦ 1 I 0 = ϑ ϕ0 r=1

0

∞ +

R0,0 (α) β0 (α) dα + 0

lr S  

qr =1 0 ∞



Rr,qr ,0 (α) βr,qr (α) dα +

r = 1 qr =1 0

V0 (α) ν(α) dα 0

(1) d Ik (α) = −(λ+ 1 + η(α)) Ik (α), k ≥ 1 dα

(2)

 d − + a2,m P0,k−m (α), k ≥ 0 P0,k (α) = −(λ+ 2 + λ + μ0 (α)) P0,k (α) + λ (1 − δ0n ) dα k

m=1

(3)  d − + Pr,qr ,k (α) = −(λ+ a2,k Pr,qr ,k - m (α), 2 + λ + μr,qr (α)) Pr,qr ,k (υ) + λ (1 − δ0k ) dα k

m=1

k ≥ 0, r = 1 to S, qr = 1 to lr (4)  d + R0,k (α) = −(λ+ a3,m R0,k−m (α), k ≥ 0 3 + β0 (α)) R0,k (α) + λ (1 − δ0k ) dα k

m=1

(5)  d + Rr,qr ,k (α) = −(λ+ a3,m Rr,qr ,k - m (α), 3 + βr,qr (α)) Rr,qr ,k (α) + λ (1 − δ0k ) dα k

m=1

k ≥ 0, r = 1 to S, qr = 1 to lr (6)  d + Vk (α) = −(λ+ a4,m Vk−m (α), k ≥ 0 4 + γ(α)) Vk (α) + λ (1 − δ0k ) dα k

m=1

(7)


with boundary conditions ∞ Ik (0) = ϑ [δ0

P0,k−1 (α)μ0 (α)dα + ∞ P0,k (α)μ0 (α)dα +

+ ϕ0

+

δr

0 lr ∞ S 

S 

Pr,qr ,k - 1 (α)μr,qr (α)dα

qr =1 0 l ∞

r=1

0

lr 



S 

r 

ϕr

∞ Pr,qr ,k (α)μr,qr (α)dα] +

qr =1 0 ∞

r=1

Rr,qr ,k (α)βr,qr (α)dα +

r = 1 qr =1 0

R0,r (α)β0 (α)dα 0

Vk (α) γ(α)dα, k ≥ 1

(8)

0



+

P0,0 (0) = λ a1,1 I0 +

I1 (α)η(α)dα

(9)

0 +

+

P0,k (0) = λ a1,k+1 I0 + λ



k 

∞ Ik−m+1 (α)dα +

a1,m

m=1

0

Ik+1 (α)η(α)dα, k ≥ 1 0

(10) ∞ P1,q1 ,k (0) = pq1

P0,k (α)μ0 (α)dα, k ≥ 0, q1 = 1 to l1

(11)

0

Pr,qr ,k (0) = pqr

lr - 1 

∞ Pr - 1,qr - 1 ,k (α)μr - 1,qr - 1 (α)dα, k ≥ 0, r = 2 to S, qr = 1 to lr

qr - 1 =1 0

(12) R0,k (0) = λ−

∞ P0,k (α)dα, k ≥ 0

(13)

0

Rr,qr ,k (0) = λ−

∞ Pr,qr ,k (α)dα, k ≥ 0 , r = 1 to S, qr = 1 to lr

(14)

0

∞ Vk (0) = ϑ[ϕ0

P0,k (α) μ0 (α) dα + 0

S 

∞ P0,k - 1 (α) μ0 (α) dα +

+ δ0 0

ϕr

r=1 S  r=1

∞ lr 

Pr,qr ,k (α) μr,qr (α) dα

qr = 1 0

δr

∞ lr 

Pr,qr ,k - 1 (α) μr,qr (α) dα], k ≥ 0

qr = 1 0

(15)


4.2 Steady State Solutions The following PGF are defined to solve the equations that govern the model ∞ 

I(α, γ) =

Ik (α)γ ; P0 (α, γ) =

k=1

Pr,qr (α, γ ) = Rr,qr (α, γ) =

∞ 

k

∞  k=0 ∞ 

P0,k (α)γ

k

k=0 ∞ 

Pr,qr ,k (α)γ ; R0 (α, γ) = k

Rr,qr (α)γk ; V(α, γ) =

k=0

R0,k (α)γk

k=0 ∞ 

Vk (α)γk

⎫ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎬ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎭

(16)

k=0

The necessary and sufficient condition for the system to be stable is A∗ (λ+ 1 ))

λ+ ω1 m1 λ+ 1

(1 −

+ T2 < 1. By using the definition of PGF, the steady state distributions of {N(t); t ≥ 0} are given by +

I(α, γ) = I(0, γ) e− λ1 α (1 − A(α)) + + λ− −λ+

P0 (α, γ) = P0 (0, γ) e−(λ

a2 (γ))α

+ + λ− − λ+

Pr,qr (α, γ) = Pr,qr (0, γ) e−(λ

+ − λ+

R0 (α, γ) = R0 (0, γ) e−(λ

+ − λ+

V(α, γ) = V(0, γ) e−(λ I(0, γ) =

R0 (0, γ) = Rr,qr (0, γ) =

(1 − R0 (α))

a3 (γ))α

a4 (γ))α

(1 − Br,qr (α))

(1 − Rr,qr (α))

(1 − V(α))

λ+ I0 [(a1 (γ) − a1,0 )T1 (γ) − γ(1 − a1,0 )] D(γ)

P0 (0, γ) = Pr,qr (0, γ) =

a2 (γ))α

a3 (γ))α

+ − λ+

Rr,qr (α, γ) = Rr,qr (0, γ) e−(λ

(1 − B0 (α))

I0 λ+ (a1 (γ) − 1) A∗ (λ+ 1) D(γ)

∗ ∗ I0 λ+ (a1 (γ) − 1) A∗ (λ+ 1 ) pqr r−1 (g(γ)) B0 (g(γ))

D(γ)

∗ λ− λ+ I0 A∗ (λ+ 1 ) (a1 (γ) − 1)(1 − B0 (g(γ))) /g(γ) D(γ)

∗ ∗ ∗ λ− λ+ I0 A∗ (λ+ 1 ) (a1 (γ) − 1) pqr r−1 (g(γ)) B0 (g(γ))(1 − Br,qr (g(γ))) /g(γ)

D(γ)

(17) (18) (19) (20) (21) (22) (23) (24)

(25) (26) (27)


λ+ I0 A∗ (λ+ 1 ) (a1 (γ) − 1) ϑ V(0, γ) =

M i=0

(δr γ + ϕr ) ∗r (g(γ)) B∗0 (g(γ)) (28)

D(γ)

where g(γ) = λ+ (1 – a2 (γ)) + λ− ; h1 (γ) = λ+ (1 – a3 (γ)); h2 (γ) = λ+ (1 – a4 (γ)) ∗0 (g(γ)) = 1, ∗r (g(γ)) =

T1 (γ) = ϑ

S 

r



m=1

lm  qm =1

pqm B∗m,qm (g(γ))

(ϕr + δr γ) ∗i (g(γ))B∗0 (g(γ)) + λ− ((1 − B∗0 (g(γ)))/g(γ))R∗0 (h1 (γ))

r=0

+ λ−

lr S   r = 1 qr =1



S  r=0

pqr ∗r - 1 (g(γ))((1 − B∗r,qr (g(γ)))/g(γ))R∗r,qr (h1 (γ))B∗0 (g(γ))

(ϕr + δr γ) ∗r (g(γ))B∗0 (g(γ))V∗ (h2 (γ))  ∗

D(γ) = γ − A

(λ+ 1)+

  λ+  ∗ + a1 (γ) − a1,0 (1 − A (λ1 )) T1 (γ) λ+ 1

The PGF of the orbit size at different states of the server are I(γ) =

λ+ I0 (1 − A∗ (λ+ 1 )) [(a1 (γ) − a1,0 ) T1 (γ) − γ(1 − a1,0 )] λ+ 1 D(γ)

P0 (γ) =

∗ λ+ I0 A∗ (λ+ 1 ) (a1 (γ) − 1)(1 − B0 (g(γ))) g(γ) D(γ)

λ+ I0 (a1 (γ) − 1) A∗ (λ+ 1) P(γ) =

lr S r=1 qr =1

(29) (30)

pqr ∗r - 1 (g(γ)) B∗0 (g(γ)) (1 − B∗r,qr (g(γ)))

g(γ) D(γ) (31)

R0 (γ) =

∗ ∗ λ− λ+ I0 A∗ (λ+ 1 ) (a1 (γ) − 1)(1 − B0 (g(γ)))(1 − R0 (h1 (γ)))) g(γ) h1 (γ) D(γ)

λ− λ+ I0 A∗ (λ+ 1 ) (a1 (γ) − 1) R(γ) =

lr S

r=1 qr =1

V(γ) =

pqr ∗r−1 B∗0 (g(γ))(1 − B∗r,qr (g(γ)))(1 − R∗r,qr (h1 (γ))) g(γ) h1 (γ) D(γ)

λ+ I0 A∗ (λ+ 1 ) (a1 (γ) − 1) ϑ

(32)

S r=0

(33)

(δr γ + ϕr ) ∗r (g(γ))B∗0 (g(γ))(1 − V∗ (h2 (γ))) h2 (γ) D(γ) (34)


Using the normalizing condition, the unknown constant I0 can be obtained as I0 = (1 −

l1  λ+ ω1 m1 (1) ∗ (λ+ )) − T )/A∗ (λ+ ) {1 + (1 − A pq1 f0 − T3 − T4 2 1 1 λ+ 1 q =1 1

− ϑ λ+ ω1 m1

S 

(δr + ϕr ) ∗r (λ− ) + λ+ (ω1 − ω2 ) m1 T5 − λ+ (ω1 − ω3 ) m1 T6 }

r=0

where

ϑλ+ ω1 m1

=

T2

r=0

λ+ m1 (ω2 T5 + ω3 T6 ) + T4 lr S  

T3 =

r = 1 qr =1



(1)

lr S  

S 

(δr + ϕr ) ∗r (λ− ) −

δr ∗r (λ− ) +

S 

(1)

r=1

∗ − T6 = (1 − B∗0 (λ− ))β(1) 0 + B0 (λ )

lr S  

lr S   r = 1 qr =1



pqr ∗r - 1 (λ− )(1 − B∗r,qr (λ− ))]

pqr ∗r - 1 (λ− )(1 − B∗r,qr (λ− ))β(1) r,qr



+ αe−λ α b0 (α)dα, f(1) r,qr = λ ω2 m1

0 (1)







αe−λ α br,qr (α)dα

0 (2)

= lim ∗r (g(γ)), Mr γ→1

+ T3 +

∗ − ∗ − (δr + ϕr ) (M(1) r B0 (λ ) + r (λ ) f0 )

r = 1 qr =1

Mr

q1 =1

(1)

T5 = (1/λ− )[1 − B∗0 (λ− ) + B∗0 (λ− )

(1)

(1)

pq1 f0

pqr ∗r - 1 (λ− ) B∗0 (λ− ) f(1) r,qr

r=0

f0 = λ+ ω2 m1

l1

pqr (Mi - 1 B∗0 (λ− ) + ∗r−1 (λ− ) f0 (1 − B∗r,qr (λ− ))

r = 1 qr =1

T4 = B∗0 (λ− )

S



= lim ∗r (g(γ)), γ→1

The PGF of the orbit size and system size are + Pq (γ) = I0 A∗ (λ+ 1 ) {(a3 (γ) − 1)(a4 (γ) − 1)[γ − T1 (γ) + λ (a1 (γ) − 1)T7 (γ)]

− λ− (a1 (γ) − 1) (a4 (γ) − 1)T8 (γ) + ϑ (a1 (γ) − 1)(a3 (γ) − 1)

S 

(δr γ + ϕr )

r=0

∗r (g(γ))B∗0 (g(γ))(1 − V ∗ (h2 (γ)))}/D(γ) (a3 (γ) − 1)(a4 (γ) − 1)

(35)

+ Ps (γ) = I0 A∗ (λ+ 1 ) {(a3 (γ) − 1)(a4 (γ) − 1)[γ − T1 (γ) + γ λ (a1 (γ) − 1)T7 (γ)]

− λ− (a1 (γ) − 1) (a4 (γ) − 1)T8 (γ) + ϑ (a1 (γ) − 1)(a3 (γ) − 1)

S  r=0

(δr γ + ϕr )

∗r (g(γ))B∗0 (g(γ))(1 − V ∗ (h2 (γ)))}/D(γ)(a3 (γ) − 1)(a4 (γ) − 1)

where T7 (γ) = 1 − B∗0 (g(γ)) + B∗0 (g(γ))

lr S r = 1 qr =1

(36)

pqr ∗r - 1 (g(γ))(1 − B∗r,qr (g(γ)))

T8 (γ) = (1 − B∗0 (g(γ))) (1 − R∗0 (h1 (γ))) + B∗0 (g(γ))

lr S   r = 1 qr =1

pqr ∗r - 1 (g(γ))

5 Performance Measures The steady state probabilities that the server is idle in the non-empty system, busy in providing essential and optional services, under repair during essential and optional services and on vacation respectively are given by I = lim I(γ) = γ→1

+ + I0 (1 − A∗ (λ+ 1 ))[(λ ω1 m1 /λ1 ) + T2 − 1] D

P0 = lim P0 (γ) = γ→1

(37)

∗ − λ+ ω1 m1 I0 A∗ (λ+ 1 )(1 − B0 (λ )) λ− D

∗ − λ+ ω1 m1 I0 A∗ (λ+ 1 ) B0 (λ )

P = lim P(γ) = γ→1

lr S r=1 qr =1 λ− D

(38)

pqr ∗r−1 (λ− )(1 − B∗r,qr (λ− )) (39)

R0 = lim R0 (γ) = γ→1

(1) ∗ − λ+ ω1 m1 I0 A∗ (λ+ 1 )(1 − B0 (λ )) β0 D

∗ − λ+ ω1 m1 I0 A∗ (λ+ 1 ) B0 (λ )

R = lim R(γ) = γ→1

lr S r = 1 qr =1 D

ϑ λ+ ω1 m1 I0 A∗ (λ+ 1) V = lim V(γ) = γ→1

S r=0

(40) (1)

pqr ∗r−1 (λ− )(1 − B∗r,qr (λ− )) βr,qr

(41)

(δr + ϕr ) ∗r (λ− ) B∗0 (λ− )ν1 D

The expected orbit size and expected system size can be obtained as

L_q = \lim_{\gamma \to 1} \frac{d}{d\gamma} P_q(\gamma) \quad \text{and} \quad L_s = \lim_{\gamma \to 1} \frac{d}{d\gamma} P_s(\gamma)

By comparing these two results, we can generalize

L_s = L_q + P_0 + P \qquad (42)


6 Comparison of the Proposed Model with the Existing Model If A* (λ+ )→ 1, C(γ)→γ, λ− = 0, ϑ = 0, qr = 1 (r = 1,2,…,S) and ω1 = ω2 = ω3 = ω4 = 1, then the model under study reduces to M/G/1 queueing model with S-phase optional services and Bernoulli feedback. In this case, the PGFs of the queue size distribution when the server is busy in 1st and 2nd phase services are obtained resp. as P0 (γ) =

I0 (1 − B∗0 (λ+ (1 − γ))) S r=0

Pr (γ) =

(δr γ + ϕr ) ∗r (λ+ (1 − γ)) (1 − B∗0 (λ+ (1 − γ))) − γ

I0 pr ∗r (λ+ (1 − γ)) B∗0 (λ+ (1 − γ)) (1 − B∗r (λ+ (1 − γ))) S (δr γ + ϕr ) ∗r (λ+ (1 − γ)) (1 − B∗0 (λ+ (1 − γ))) − γ r=0

where ∗r (λ+ (1 − γ)) =

r  k=1

pk B∗k (λ+ (1 − γ)).

The above results coincide with the results of Abdollahi and Salehi Rad [1].

7 Numerical Results

In this section, for the purpose of understanding the effect of the parameters on the system performance measures, it is assumed that the retrial time, first-phase service time, second-phase service time, repair time during the first phase, repair time during the second phase and vacation time are exponentially distributed with parameters η, μ0, μi,ji, β0, βi,ji and ν respectively, and the batch size follows a geometric distribution with mean 1/σ. The input parameters are λ+ = 1.7, λ− = 0.4, η = 50, S = 3, l1 = 2, l2 = 3, l3 = 2, p = 0.3, p1 = [0.3, 0.2], p2 = [0.3, 0.2, 0.2], p3 = [0.4, 0.2], δ0 = 0.3, δ = [0.1, 0.3, 0.6], ϕ0 = 0.2, ϕ = [0.2, 0.1, 0.4], ϑ = 0.4, μ0 = 20, β0 = 4, μ1 = [40, 30], μ2 = [35, 42, 52], μ3 = [25, 30], β1 = [1, 3], β2 = [5, 7, 10], β3 = [12, 14], ν = 7, θ = 0.5, σ = 0.5, ω1 = 0.6, ω2 = 0.7, ω3 = 0.6, ω4 = 0.4.

Table 2. Performance measures by varying ω1

ω1     I0       I        P0       P        R0       R        V
0.2    0.5306   0.0346   0.1202   0.1598   0.0120   0.0070   0.1357
0.4    0.3357   0.0583   0.1676   0.2228   0.0168   0.0097   0.1892
0.6    0.2314   0.0710   0.1929   0.2565   0.0193   0.0112   0.2178
0.8    0.1664   0.0789   0.2087   0.2775   0.0209   0.0121   0.2356
1      0.1221   0.0843   0.2194   0.2918   0.0219   0.0127   0.2478


Table 3. Performance measures by varying ω3

ω3     I0       I        P0       P        R0       R        V
0.2    0.2472   0.0691   0.1891   0.2514   0.0189   0.0109   0.2135
0.4    0.2393   0.0700   0.1910   0.2539   0.0191   0.0110   0.2156
0.6    0.2314   0.0710   0.1929   0.2565   0.0193   0.0112   0.2178
0.8    0.2232   0.0720   0.1949   0.2591   0.0195   0.0113   0.2200
1      0.2149   0.0730   0.1969   0.2618   0.0197   0.0114   0.2223

Table 4. Mean orbit size (Lq) for varying values of η, μ0 and ω1

η     μ0    ω1 = 0.6   ω1 = 0.7   ω1 = 0.8   ω1 = 0.9
20    20    3.0926     4.5496     7.6046     18.1964
20    25    2.7803     3.7405     5.2872     8.2195
20    30    2.6445     3.4383     4.6034     6.4965
30    20    2.2247     2.7590     3.4545     4.4021
30    25    2.1606     2.6130     3.1662     3.8623
30    30    2.1254     2.5416     3.0361     3.6368
40    20    1.9710     2.3409     2.7741     3.2904
40    25    1.9556     2.2911     2.6708     3.1064
40    30    1.9437     2.2629     2.6181     3.0180

8 Conclusion

A retrial queueing system with multistage services has been developed according to the WLAN phenomenon, and a numerical analysis has been carried out. In the analysis, the performance measures are calculated by varying the state-dependent admission probabilities ω1 and ω3 and are presented in Tables 2 and 3. From the results it is observed that an increase in the admission probabilities gives a decreasing trend in I0 and an increasing trend in the other performance measures; the same trend can be observed by varying ω2 and ω4. Table 4 shows how the orbit size changes for different values of the retrial rate (η), the essential service rate (μ0) and the admission probability at the idle state of the server (ω1). From the table it is observed that the mean orbit size moderately decreases as the retrial rate and the essential service rate increase, and sharply increases as ω1 increases.

References 1. Abdollahi, S., Salehirad, M.R.: On an M/G/1 queueing model with K-Phase optional services and Bernoulli feedback. J. Serv. Sci. Manage. 5, 280–288 (2012)


2. Arivudainambi, D., Gowsalya, M.: Performance analysis of an M/G/1 feedback retrial queue with two types of service and Bernoulli vacation. Stochastic Process. Models Oper. Res. 38–54 (2016) 3. Artalejo, J.R., Atencia, I., Moreno, P.: A discrete time Geo[X] /G/1 retrial queue with control of admission. Appl. Math. Model. 29, 1100–1120 (2005) 4. Bagyam, J.E.A., Udaya Chandrika, K.: Bulk arrival two phase retrial queue with two types service and extended Bernoulli vacation. Int. J. Math. Trends Technol. 4(7), 116–124 (2013) 5. Bagyam, J.E.A., Udaya Chandrika, K.: Multi-stage retrial queueing system with Bernoulli feedback. Int. J. Sci. Eng. Res. 4(9), 496–499 (2013) 6. Choudhury, G., Ke, J.C.: A batch arrival retrial queue with general retrial times under Bernoulli vacation schedule for unreliable server and delaying repair. Appl. Math. Model. 36, 255–269 (2012) 7. Dimitriou, I.: A batch arrival priority queue with recurrent repeated demands, admission control and hybrid failure recovery discipline. Appl. Math. Comput. 219, 11327–11340 (2013) 8. Dimitriou, I., Langaris, C.: Analysis of a retrial queue with two-phase service and server vacations. Queueing Syst. Theo. Appl. 60(1–2), 111–129 (2008) 9. Gelenbe, E.: Random neural networks with negative and positive signals and product form solution. Neural Comput. 1, 502–510 (1989) 10. Jain, M., Kaur, S.: Bernoulli vacation model for MX /G/1 unreliable server retrial queue with Bernoulli feedback. Balking Opt. Ser. RAIRO-Oper. Res. 55, S2027–S2053 (2021) 11. Li, T., Zhang, L.: An M/G/1 retrial G-Queue with general retrial times and working breakdowns. Math. Comput. Appl. 22(15), 1–17 (2017) 12. Niranjan, S.P., Chandrasekaran, V.M., Indhira, K.: State dependent arrival in bulk retrial queueing system with immediate Bernoulli feedback, multiple vacations and threshold, IOP Conf. Ser. Mater. Sci. Eng. 263, 1–15 (2017) 13. Sharma, P.: M/G/1 retrial queueing system with Bernoulli feedback and modified vacation. Int. J. Math. Trends Technol. 61(1), 10–21 (2018) 14. Radha, J., Indhira, K., Chandrasekaran, V.M.: A group arrival retrial g - queue with multi optional stages of service, orbital search and server breakdown. IOP Conf. Ser. Mater. Sci. Eng. 263, 1–16 (2017b) 15. Rajadurai, P., Chandrasekaran, V.M., Saravanarajan, M.C.: Analysis of an M[X] /G/1 unreliable G-queue with orbital search and feedback under Bernoulli vacation schedule. Opsearch 53(1), 197–223 (2015) 16. Salehirad, M.R., Badamchi Zadeh, A.: On the multi–phase M/G/1 queueing system with random feedback. Central Eur. J. Oper. Res. 17(2), 131–139 (2009) 17. Sangeetha, N., Udaya Chandrika, K.: Batch arrival multi-stage retrial G-Queue with fluctuating modes of services and feedback. Adv. Math. Sci. J. 8(3), 438–450 (2019) 18. Sangeetha, N., Udaya Chandrika, K.: Multi stage and multi optional retrial G-Queue with feedback and starting failure. In: AIP Conference Proceedings, vol. 2261, no. 030077, pp. 1–11 (2020) 19. Santhi, K.: An M/G/1 retrial queue with second optional service and multiple working vacation. Adv. Appl. Math. Sci. 20(6), 1129–1146 (2021) 20. Wang, J., Li, J.: A single server retrial queue with general retrial times and two-phase service. J. Syst. Sci. Complexity 22(2), 291–302 (2009)

IBDNA – An Improved BDNA Algorithm Incorporating Huffman Coding Technique Mangalam Gupta, Dipanwita Sadhukhan(B) , and Sangram Ray National Institute of Technology Sikkim, Ravangla 737139, India [email protected]

Abstract. In recent years digital data transmission over the Internet has increased significantly, and security is the fundamental issue to be taken care of in such transmission. Several cryptographic techniques are used to provide the essential security goals of confidentiality, integrity, and availability to the user. A reliable encryption algorithm is required for transmitting data in order to ensure data integrity as well as data confidentiality with better throughput, while protecting data from any unauthorized access. To achieve this goal we propose a novel symmetric key cryptographic technique named IBDNA for consumer data encryption that not only ensures data confidentiality but also drastically reduces the size of the ciphertext for faster data transmission. The proposed algorithm is an enhanced version of the multi-fold BDNA algorithm, which is based on a well-known DNA cryptographic algorithm. Moreover, through experimental results, we demonstrate that the proposed algorithm outperforms existing encryption algorithms such as BDNA, DNA, DES, AES, and Blowfish in terms of encryption time, ciphertext size, and throughput. Keywords: DNA cryptosystem · Huffman coding · Symmetric key cryptography

1 Introduction

In the present scenario, information has become the most valuable resource in the digital world. This gives rise to various new and innovative methodologies for secure data storage and high-speed data transmission. Cryptography plays a vital role in information security and provides a highly secure environment for information storage [1]. Cryptography transforms meaningful text into an illegible form to protect the data from unauthorized access or illegal alteration [2–4]. Cryptography requires two main operations, encryption and decryption, to achieve its three core goals of confidentiality, integrity, and availability. The information, called plain text, that has to be transmitted over the communication channel must be encoded into an indecipherable form called ciphertext using some keys; this process is called encryption. Decryption is required after the ciphertext is received at the receiving end, and it is simply the reverse mechanism of encryption. Depending on the types of keys used to encrypt and to decrypt, cryptography can be broadly classified into two types: symmetric key cryptography and asymmetric key cryptography. The former needs only one identical secret key, which is used for encrypting


the plain text at the sender end and for decrypting the ciphertext back to plain text at the receiver end, whereas the latter requires two different, complementary keys called the public key and the private key. The public key is used for encrypting the data, and decryption of the ciphertext back to the original text is possible only with the corresponding private key. The challenge in symmetric key cryptography is to establish a secure key between the sender and the receiver that cannot be accessed by any illegal party or adversary. There are many different cryptographic algorithms in the current scenario: the universally used symmetric key algorithms are AES, DES, IDEA, Blowfish, Triple DES, etc., while the most commonly used asymmetric key algorithms are RSA, ECC, DSA, etc. The most common application of symmetric key cryptography is bulk data encryption, and it is mostly used for secure online data communication, whereas the prime functions of asymmetric key cryptography are digital signature schemes, message authentication, etc. Symmetric key cryptography is the older mechanism and is often 100 to 1000 times faster than asymmetric key cryptography [2–6]; moreover, a symmetric key also requires less storage space. However, both methodologies suffer from different limitations. In symmetric key cryptography, the chief problem is key transportation: since almost all digital media/channels are insecure, secure key transmission is the major challenge. In asymmetric key cryptography, the major drawbacks are the encryption/decryption speed as well as the storage consumption, which increase its overhead. To mitigate these limitations, public key infrastructure (PKI) was invented; it binds user identities to the traditional cryptographic keys used to authenticate the encrypted data passing between the communicating parties. It is well established that, compared to other traditional cryptographic approaches, PKI resists various security attacks, provides better security, and maintains a good trade-off between security and performance (Fig. 1).

Fig. 1 Symmetric key cryptosystem

Literature Review: From the past to the present, several cryptographic protocols have been proposed and are used to secure data through encryption. In 2016, Sharma et al. [7] illustrated the history, fundamental flow, and implementation of the oldest cryptography technique, the Data Encryption Standard (DES), along with its significant limitations. An implementation of a Windows tool for encrypting files using Blowfish is scrutinized by Meyers and Desoky [8]; this paper demonstrates the results of the encryption tool in terms of the encryption timing of the S-boxes compared to the subkeys, and the authors also propose some possible improvements to the software tool that enhance the usefulness of the Blowfish algorithm. In 2010, Nie et al. [9] evaluated the performance of the two commonly known cryptographic algorithms Blowfish and


DES, which are mainly required for network data encryption. The authors examined the two algorithms in terms of encryption speed, security strength, and power consumption. Their experimental results show that the Blowfish algorithm is faster than DES, while the power consumption of the two is almost comparable; they conclude that Blowfish is much more applicable for wireless network security. In 2016, Kumar et al. [10] developed a modified version of the AES algorithm by increasing the number of rounds from 12 to 16; the authors claimed that the improved version of AES provides higher security and faster transmission speed, and the theoretical evaluation and experimental results of the proposed algorithm support their claim. In the same year, Rajput et al. [11] proposed enhanced data security in cloud computing using the AES encryption algorithm. The authors analyzed the AES method (Rijndael) for securing data in cloud storage; the method provides 128-bit keys in comparison to the 56-bit keys of the DES algorithm. The Rijndael symmetric block cipher provides higher security and is implementable in different working environments such as 8-bit microprocessors owing to its low storage requirements; further, it is efficient in both hardware and software implementations without degradation in performance. Several symmetric and asymmetric key cryptography algorithms are discussed by Bhardwaj et al. [12], with a prime focus on symmetric key cryptography; the authors also suggest the most suitable algorithm for cloud storage after comparing different cryptographic algorithms. Later, in 2016, Hossain et al. [13] proposed a novel technique for DNA-coding-based cryptography that utilizes a dynamic DNA sequence table to provide a higher level of security. However, Sohal et al. [14] identified the shortcomings of the protocol proposed by Hossain et al. [13] and presented an improved version of the protocol. In 2018, Sohal and Sharma [15] further developed a new protocol based on a DNA cryptographic algorithm that facilitates huge data storage in the cloud. To solve the fundamental issues of cloud storage (data privacy, security, reliability and interoperability), the authors presented a new client-side data encryption technique that provides better performance in terms of encryption time, ciphertext size and throughput, and they demonstrated experimental results of the implementation to support their claim. However, in this paper it is shown that the protocol developed by Sohal and Sharma [15] can be improved further to provide faster encryption and a more compressed ciphertext.

Our Contribution: In this research work we propose a novel hybrid encryption-decryption algorithm based on DNA cryptography that incorporates Huffman coding. In this client-side symmetric key encryption technique, the plain text is first compressed using Huffman coding and then encrypted; during decryption, the reverse procedure is applied. The technique is implemented and the experimental results are compared with traditional cryptographic algorithms like AES, DES and Blowfish as well as with previous research works like the DNA [13] and BDNA [15] algorithms. The results demonstrate that the proposed scheme outperforms all the previous works in terms of encryption time, ciphertext size, and throughput; therefore, it can be concluded that the proposed scheme offers better performance and higher efficiency.


2 Preliminaries

The fundamental requirements of the proposed scheme are illustrated below.

2.1 Huffman Coding

Huffman coding was developed by David A. Huffman and is one of the most famed greedy algorithms for lossless data compression [16–19]. Huffman's algorithm assigns a variable-length code to each input character (source symbol) and produces an encoding table. The table is derived from the frequency of occurrence (weight) of each possible value of the input source symbol: more frequent symbols get fewer bits and less frequent symbols get longer codes [16]. A sketch of the construction is given after the formulas below.

2.1.1 Computational Formulas
• Average code length per character = Σᵢ (Frequencyᵢ × Code lengthᵢ) / Σᵢ Frequencyᵢ
• Total number of bits in a message encoded with the Huffman algorithm = total number of characters in the message × average code length per character
• Time complexity of Huffman coding = O(n log n), where n is the number of unique characters in the text.
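A compact Python sketch of Huffman tree construction and encoding is shown below; it is our own illustration (using heapq), not the implementation used in the paper.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table {char: bitstring} from character frequencies."""
    freq = Counter(text)
    # Each heap entry: (weight, tie-breaker, {char: code-so-far})
    heap = [(w, i, {ch: ""}) for i, (ch, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate single-symbol message
        return {ch: "0" for ch in freq}
    tick = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two least frequent subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, tick, merged))
        tick += 1
    return heap[0][2]

text = "hello huffman"
codes = huffman_codes(text)
encoded = "".join(codes[ch] for ch in text)
avg_len = sum(text.count(ch) * len(code) for ch, code in codes.items()) / len(text)
print(codes, len(encoded), round(avg_len, 3))   # illustrates the formulas above
```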

2.2 BDNA – A DNA Inspired Scheme

BDNA [15] is a symmetric key cryptographic technique based on DNA coding [13]. The encryption process requires two encoding tables, shown in Table 1 and Table 2, a 14-bit encryption key, and a random number named N. The value N decides the shifting of the entries inside Table 1, and Table 2 is used to convert the binary data into the final ciphertext. At the receiving end, if a user seeks to access the stored cloud data, s/he first needs to authenticate himself/herself; then s/he accesses the confidential cloud data, the proper decryption key, and the random number N. The values in Table 1 are shifted according to the value N, and the decryption key then performs the decryption process successfully.

2.2.1 The Encryption Process

The encryption algorithm is explained below.
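The BDNA encryption and decryption steps themselves are presented as figures in [15] and are not reproduced here. As background for the role of N described above, the toy sketch below merely illustrates N acting as a cyclic shift of the code assignments in an encoding table; the table contents and names are hypothetical, not the actual BDNA tables.

```python
def shifted_encoding_table(n):
    """Toy stand-in for Table 1: map printable ASCII characters to 7-bit codes,
    with the code assignment cyclically shifted by the random number N."""
    chars = [chr(c) for c in range(32, 127)]          # 95 printable characters
    codes = [format(i, "07b") for i in range(len(chars))]
    shift = n % len(chars)
    return {ch: codes[(i + shift) % len(chars)] for i, ch in enumerate(chars)}

table_n0 = shifted_encoding_table(0)
table_n5 = shifted_encoding_table(5)
print(table_n0["A"], table_n5["A"])   # same character, different 7-bit code
```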

Table 1. Initial encoding of ASCII characters as 7-bit combinations for n = 0 [15].

2.2.2 The Decryption Process The decryption algorithm is explained below.


Table 2. Final encoding table [15]

3 Proposed Algorithms

The proposed algorithms are based on symmetric-key cryptography, which employs identical keys for encrypting the plain text to form the ciphertext and for decrypting the ciphertext to retrieve the plain text. The basic concept of these algorithms builds on the BDNA cryptographic algorithm [15] and the famous greedy Huffman coding algorithm explained in the previous section. Huffman coding is generally applied for lossless compression; the core idea behind using it in the proposed algorithm is to first compress the plaintext, decreasing its size and thereby increasing the efficiency of the algorithm. The proposed approach requires a 14-bit key to encrypt the plain text. The algorithm selects this 14-bit key before the actual encryption procedure starts, and


then the 14-bit key is divided into two 7-bit keys (k1 & k2), derived from the initial key depending on its last bit. The methodology of the overall procedure of the proposed scheme is discussed below.
• First: the plaintext is compressed into a binary stream of bits using Huffman coding;
• Second: the binary stream is transformed into binary data using the encryption keys (k1 & k2);
• Third: a table converts the binary data into the final ciphertext. The table is a randomized version of the Base64 [20] encoding.

3.1 The Encryption Procedure

The encryption process of the proposed algorithm is illustrated in this section. First of all, the plaintext is taken as input and converted into a binary sequence (P1) using Huffman coding, which compresses the number of bits of the plaintext. Next, a 14-bit key (K) is taken as input, and two 7-bit keys (K1 & K2) are generated from K depending upon its last bit; these two 7-bit keys are the actual keys involved in the encryption mechanism. We then perform an XOR operation between the plaintext bit sequence and the 7-bit key K1. The resulting bit sequence is split into two equal parts; if it contains an odd number of bits, a zero is padded at the end. The two halves are interchanged to generate another bit sequence, called P2, and an XOR operation is performed between P2 and the 7-bit key K2. The resulting bit sequence is then divided into blocks of 6 bits, with zeros padded at the end if needed. The final step of the algorithm is to generate the ciphertext by converting the resulting bit sequence using a table; the encoding procedure used here is Base64 encoding. Algorithm 1 explains the proposed encryption mechanism in steps (Fig. 2). A condensed sketch of these steps is given below, after Sect. 3.2.

3.2 The Decryption Procedure

The decryption mechanism is exactly the reverse of the encryption procedure. First, the ciphertext and the 14-bit key (K) are read, and K is converted into two 7-bit keys (K1 & K2) depending upon its last bit, as mentioned in the encryption procedure in Sect. 3.1. After that, the Base64 encoding table converts the ciphertext back into a bit sequence. An XOR operation is executed between the key K2 and the bit sequence after removing the padded zero bits. Next, the bit sequence is split into two halves and the halves are interchanged, and an XOR operation is performed between the resulting bit sequence and the key K1. The final step of the decryption process is to convert the bit sequence into plaintext using Huffman decoding. Algorithm 2 explains the proposed decryption process step by step (Fig. 3).
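The encryption steps of Sect. 3.1 can be condensed into the following sketch. It makes two assumptions the text leaves open: the 7-bit keys are applied as a repeating-key XOR over the bit stream, and the standard Base64 alphabet stands in for the paper's randomized table; the exact rule for deriving K1 and K2 from the last bit of K is also our guess. The Huffman routine can be any that returns a bit string, for instance the one sketched in Sect. 2.1.

```python
def xor_with_key(bits, key7):
    """Repeating-key XOR of a '0'/'1' string with a 7-bit key (assumed interpretation)."""
    return "".join(str(int(b) ^ int(key7[i % 7])) for i, b in enumerate(bits))

def ibdna_encrypt(plain_bits, key14):
    """plain_bits: Huffman-compressed plaintext as a '0'/'1' string (step P1)."""
    # Split the 14-bit key into two 7-bit keys; the dependence on the last bit
    # is not fully specified in the text, so a simple swap rule is assumed here.
    k1, k2 = (key14[:7], key14[7:]) if key14[-1] == "0" else (key14[7:], key14[:7])
    p = xor_with_key(plain_bits, k1)               # XOR with K1
    if len(p) % 2:                                 # pad to an even length
        p += "0"
    half = len(p) // 2
    p2 = p[half:] + p[:half]                       # interchange the two halves
    p2 = xor_with_key(p2, k2)                      # XOR with K2
    if len(p2) % 6:                                # pad to 6-bit blocks
        p2 += "0" * (6 - len(p2) % 6)
    alphabet = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                "abcdefghijklmnopqrstuvwxyz0123456789+/")
    # Map every 6-bit block to a symbol (standard Base64 alphabet as a stand-in).
    return "".join(alphabet[int(p2[i:i + 6], 2)] for i in range(0, len(p2), 6))
```

Decryption follows Sect. 3.2 by reversing these steps with the same two keys.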


Fig. 2. The proposed encryption algorithm

Fig. 3. The proposed decryption algorithm

4 Implementation

The proposed algorithm is implemented in C++ on an Intel Core i3 3210M processor, 2.67 GHz, with 4 GB RAM, running a 64-bit Linux operating system (version 18.04). The program input is the plaintext, and a successful run delivers the encrypted form of the input data (Fig. 4). For example:
Plain text: “The best way to find yourself is to lose yourself in the service of others.”
Key: 10100101011001


Fig. 4. Experimental results of the implementation

Ciphertext: kAsMKXrB0yVNgqvWW+t1Mxs+t1Mxs+Aa/4nE0oDToI8iLgsaKxFTnxXZ.

5 Analysis of the Proposed Algorithm

This section compares the proposed algorithm with conventional cryptographic techniques such as DES, AES, DNA, etc. with respect to encryption parameters like encryption time, ciphertext size, and throughput. The enhancement of the proposed IBDNA over the underlying BDNA algorithm is also illustrated here.

5.1 Security Analysis

In this section it is shown that the proposed IBDNA algorithm is secure against the chosen plaintext attack (CPA). Security against CPA holds when the adversary has only a negligible chance of guessing the plaintext within probabilistic polynomial time. An experiment SYMIBDNA^{CPA}_{A,ε} is considered, consisting of the following phases:
• Training Phase: The adversary A is given access to the encryption system of the proposed IBDNA algorithm. In this step, the adversary A sends query messages and obtains their encryptions.
• Challenge Phase: The adversary A can make queries of his/her own choice. The adversary chooses two plaintexts p0 and p1 of equal length and submits them to the challenger; a random key K is generated with the help of the key generator, a random bit b ∈ {0, 1} is selected, and the plaintext pb is encrypted to produce the challenge ciphertext.
• Post-Challenge Training Phase: The adversary A can again pick arbitrary query messages and obtain their ciphertexts.
• Response Phase: The adversary A finally submits a guess about the encrypted plaintext in the form of a bit b'. The adversary A wins the experiment SYMIBDNA^{CPA}_{A,ε} if the guessed value equals the value of b.
The proposed algorithm ε = {KGEN, ENC, DEC} is secure against CPA if and only if, for every adversary running in polynomial time,

\Pr\left[ \mathrm{SYMIBDNA}^{\mathrm{CPA}}_{A,\varepsilon} = 1 \right] \le \frac{1}{2} + \mathrm{negf}(k)


If an encryption algorithm is secure against CPA, then executing it twice on the same plaintext with the same key must generate dissimilar ciphertexts. In the proposed algorithm, the value of k is chosen arbitrarily for each encryption and is also shuffled for each submission of the plaintext; therefore, each time the plaintext is submitted, the resulting ciphertext is altered, and it can be stated that the proposed algorithm is CPA-secure. Consider a scenario where A inputs two plaintexts p0 and p1 such that p0 = p1. From the proposed algorithm,

ε{K, N, p0} ≠ ε{K, N, p1}

because, owing to the dynamic nature of the key k, the random value changes for each encryption. This leaves only a negligible chance of guessing the correct plaintext; hence, the proposed algorithm is free from the CPA attack.

5.2 Experimental Results

This section presents a comparative discussion of the encryption time, ciphertext size, and throughput of the proposed IBDNA, which reveals the better efficiency of the proposed scheme over other traditional algorithms like AES, DES [7], Blowfish [8], DNA [13], etc.

5.2.1 Cipher Text Size

Since the technique applies Huffman coding for the conversion of plaintext into a binary bit sequence, the scheme does not increase the size of the ciphertext; rather, it decreases the ciphertext size owing to the compression of the plaintext by Huffman coding. Furthermore, in this approach the data take less time to transmit through a channel, because the transmission time is directly proportional to the amount of data sent [6]. The results are shown in Table 3, along with a visual representation using the bar chart in Fig. 5 below.

Table 3. Comparison of ciphertext size among various techniques for a given plaintext size

Cipher text (KB) IBDNA

DES

Blowfish

DNA

BDNA

AES

5

3.53

6.86

6.76

6.67

5.83

6.86

10

7.42

13.70

13.52

13.33

11.66

13.70

15

10.99

20.55

20.27

20.0

17.5

20.55

20

16.34

27.39

27.03

26.67

23.34

27.39

25

19.82

34.25

33.78

33.33

29.17

34.25

30

24.39

41.09

40.5

40.0

35.0

41.09
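As a rough cross-check of Table 3 (a sketch; the values are copied from the table above and the averaged ratios are only illustrative):

```python
# Ciphertext size divided by plaintext size, averaged over the six plaintext sizes.
plaintext_kb = [5, 10, 15, 20, 25, 30]
ciphertext_kb = {
    "IBDNA":    [3.53, 7.42, 10.99, 16.34, 19.82, 24.39],
    "DES":      [6.86, 13.70, 20.55, 27.39, 34.25, 41.09],
    "Blowfish": [6.76, 13.52, 20.27, 27.03, 33.78, 40.5],
    "DNA":      [6.67, 13.33, 20.0, 26.67, 33.33, 40.0],
    "BDNA":     [5.83, 11.66, 17.5, 23.34, 29.17, 35.0],
    "AES":      [6.86, 13.70, 20.55, 27.39, 34.25, 41.09],
}
for algo, sizes in ciphertext_kb.items():
    ratios = [c / p for c, p in zip(sizes, plaintext_kb)]
    print(f"{algo:8s} expansion ratio ~ {sum(ratios) / len(ratios):.2f}x")
# IBDNA stays below 1x (compression), while the other schemes expand the data.
```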


Fig. 5. Analysis of cipher text size generated by IBDNA and other cryptographic algorithms

5.2.2 Encryption Time

The encryption time is defined as the time required to convert the input (i.e. plaintext) into the output (i.e. ciphertext). The lower the encryption time, the higher the efficiency of the algorithm; that is, efficiency is inversely proportional to encryption time [6]. This section presents a comparative study of the encryption time of the proposed IBDNA and the other cryptographic algorithms. The results demonstrate that the proposed algorithm takes less time than AES [10], DES [7], and BDNA [15] to encrypt plaintext of the same size. Hence, the proposed algorithm has a higher efficiency than the above-mentioned algorithms. The results are tabulated in Table 4 and visualised in the bar chart of Fig. 6 below.

5.2.3 Throughput

Throughput is a direct analytical measure of the performance of an algorithm: the greater the throughput, the better the performance [6, 7]. The throughput is given by the following formula:

Throughput = Plaintext size / Encryption time.

Using the recorded encryption times, the above formula gives the throughput of all the techniques. The throughput of the proposed algorithm is reported in Table 5 and Fig. 7, and it is higher than that of all the other techniques.


Table 4. Comparison of encryption time among various techniques for a given plaintext size (encryption time in seconds)

Plain text (KB)   IBDNA   DES     Blowfish   DNA     BDNA    AES
5                 0.014   0.152   0.054      0.200   0.048   0.053
10                0.027   0.283   0.074      0.483   0.066   0.081
15                0.040   0.405   0.097      0.625   0.090   0.106
20                0.052   0.571   0.130      0.972   0.106   0.172
25                0.066   0.785   0.133      1.250   0.123   0.198
30                0.079   0.955   0.170      1.487   0.149   0.218

Fig. 6. Analysis of encryption time generated by IBDNA and other symmetric algorithms

Table 5. Throughput of various techniques

Algorithm   Throughput (KB/sec)
IBDNA       373.75
BDNA        180.4
Blowfish    159.6
AES         126.8
DES         33.32
DNA         20.92
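A quick sanity check of the throughput formula against Tables 4 and 5 (a sketch; the exact averaging used by the authors may differ):

```python
# Recompute throughput = plaintext size / encryption time from Table 4 and average it.
plaintext_kb = [5, 10, 15, 20, 25, 30]
encryption_s = {
    "IBDNA": [0.014, 0.027, 0.040, 0.052, 0.066, 0.079],
    "BDNA":  [0.048, 0.066, 0.090, 0.106, 0.123, 0.149],
    "DES":   [0.152, 0.283, 0.405, 0.571, 0.785, 0.955],
}
for algo, times in encryption_s.items():
    throughputs = [p / t for p, t in zip(plaintext_kb, times)]
    print(f"{algo:6s} average throughput ~ {sum(throughputs) / len(throughputs):.1f} KB/sec")
# The IBDNA figure comes out close to the 373.75 KB/sec reported in Table 5.
```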

Fig. 7. Comparison of throughput of various symmetric algorithms with IBDNA

5.3 Enhancements Over the Existing BDNA Cryptography [15]

In BDNA [15], every character of the plaintext is converted into a 7-bit binary sequence using the ASCII table, for example y as 0000000 and W as 0000110, so each character occupies 7 bits. In the proposed scheme, Huffman coding is used instead for the conversion of characters to a bit sequence. Huffman coding is a compression technique that shortens the binary bit sequence: for example, converting the plaintext "hello" in BDNA yields a 35-bit sequence, whereas the proposed scheme yields only a 10-bit sequence, which is 3.5 times smaller than in the BDNA cryptographic technique. Moreover, in the BDNA technique [15] the first table, which converts plaintext into 7-bit sequences, is static, so anyone who obtains an intermediate string generated in the process can decode it; in this scheme, the Huffman-coded sequence cannot be decoded without the exact Huffman tree. The use of Huffman coding also decreases the size of the generated sequence. The BDNA technique uses one 7-bit key sequence extracted from a 14-bit key sequence given by the user, whereas we generate two 7-bit key sequences from the same 14-bit key and XOR them with the plaintext twice, which enhances the security of the scheme. Storing Table 1 in BDNA cryptography [15] also wastes storage space, since 1 KB of extra storage is needed, while our approach does not need to store any table, which decreases the run-time storage complexity. BDNA also takes more time to encrypt the plaintext, whereas in our approach the encryption time is drastically decreased. From the above discussion, it follows that the proposed IBDNA algorithm is more efficient than the BDNA algorithm.
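The size claim can be checked with a minimal Huffman coder (a sketch; the exact code table depends on tie-breaking, but the total length for "hello" is 10 bits either way):

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table for the characters of `text`."""
    freq = Counter(text)
    # Each heap entry: (frequency, tie-breaker, {char: code-so-far})
    heap = [(f, i, {ch: ""}) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    if len(heap) == 1:                       # degenerate single-symbol input
        return {ch: "0" for ch in heap[0][2]}
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in t1.items()}
        merged.update({ch: "1" + code for ch, code in t2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

codes = huffman_codes("hello")
encoded = "".join(codes[ch] for ch in "hello")
print(codes)                   # code table (exact codes depend on tie-breaking)
print(len(encoded), "bits")    # 10 bits, versus 5 * 7 = 35 bits in BDNA
```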

6 Conclusion

In this paper, we have proposed a novel cryptographic algorithm named IBDNA, which is inspired by the BDNA cryptographic technique. To enhance the performance and effectiveness of the proposed algorithm over other traditional symmetric key cryptographic algorithms as well as the BDNA algorithm, we have incorporated Huffman coding to compress the plaintext. The experimental results clearly show that IBDNA outperforms other symmetric key encryption algorithms (BDNA, AES, DES) in terms of encryption time, ciphertext size, and throughput. In the future we will implement the proposed algorithm in various resource-constrained environments such as IoT and its different applications, e.g. smart homes and e-health.


Acknowledgment. The research work is supported by the Ministry of Education, Govt. of India.

References 1. Bhardwaj, A., et al.: Security algorithms for cloud computing. Proc. Comput. Sci. 85, 535–542 (2016) 2. Chandra, S., et al.: Content based double encryption algorithm using symmetric key cryptography. Proc. Comput. Sci. 57, 1228–1234 (2015) 3. Forouzan, B.A.: Cryptography & Network Security. Special Indian Edition, Tata McGraw-Hill (2007) 4. Kahate, A.: Cryptography and Network Security, 2nd edn. Tata McGraw-Hill (2009) 5. Stallings, W.: Cryptography and Network Security Principles and Practices, 4th edn. Pearson Education, Prentice Hall (2009) 6. Forouzan, B.A., Mukhopadhyay, D.: Cryptography and Network Security (Sie). McGraw-Hill Education (2011) 7. Sharma, M., Garg, R.B.: DES: the oldest symmetric block key encryption algorithm. In: 2016 International Conference System Modeling & Advancement in Research Trends (SMART). IEEE (2016) 8. Meyers, R.K., Desoky, A.H.: An implementation of the blowfish cryptosystem. In: 2008 IEEE International Symposium on Signal Processing and Information Technology. IEEE (2008) 9. Nie, T., Song, C., Zhi, X.: Performance evaluation of DES and blowfish algorithms. In: 2010 International Conference on Biomedical Engineering and Computer Science. IEEE (2010) 10. Kumar, P., Rana, S.B.: Development of modified AES algorithm for data security. Optik-Int. J. Light Electron Opt. 127(4), 2341–2345 (2016) 11. Rajput, S., Dhobi, J.S., Gadhavi, L.J.: Enhancing data security using aes encryption algorithm in cloud computing. In: Proceedings of First International Conference on Information and Communication Technology for Intelligent Systems, vol. 2, pp. 135–143. Springer, Cham (2016) 12. Bhardwaj, A., et al.: Security algorithms for cloud computing. Proc. Comput. Sci. 85, 535–542 (2016) 13. Hossain, E.M.S., et al.: A DNA cryptographic technique based on dynamic DNA sequence table. In: 2016 19th International Conference on Computer and Information Technology (ICCIT). IEEE (2016) 14. Sohal, M., Sharma, S.: Enhancement of cloud security using DNA inspired multifold cryptographic technique. Int. J. Secur. Appl. 11(12), 15–26 (2017) 15. Sohal, M., Sharma, S.: BDNA-A DNA inspired symmetric key cryptographic technique to secure cloud computing. J. King Saud Univ.-Comput. Inf. Sci. (2018) 16. Liu, Y.K., Žalik, B.: An efficient chain code with Huffman coding. Pattern Recog. 38(4), 553–557 (2005) 17. Tsai, C.-W., Wu, J.-L.: On constructing the Huffman-code-based reversible variable-length codes. IEEE Trans. Commun. 49(9), 1506–1509 (2001) 18. Lathi, B.P.: Modern Digital and Analog Communication Systems 3e Osece. Oxford University Press, Inc. (1998) 19. Haykin, S.S., Moher, M., Song, T.: An introduction to analog and digital communications, vol. 1. Wiley, New York (1989) 20. Sathishkumar, G.A., Bagan, K.B.: A novel image encryption algorithm using pixel shuffling and base 64 encoding based chaotic block cipher (IMPSBEC). WSEAS Trans. Comput. 10(6), 169–178 (2011)

Obfuscation Techniques for a Secure Endorsement System in Hyperledger Fabric

J. Dharani1(B), K. Sundarakantham1, Kunwar Singh2, and Shalinie S Mercy1

1 Thiagarajar College of Engineering, Madurai 625015, India
[email protected], {kskcse,shalinie}@tce.edu
2 National Institute of Technology, Trichy 620015, India
[email protected]

Abstract. Blockchain is a distributed ledger that provides a platform for sharing information among the participating nodes. The transactions are transparent and are validated by all the nodes in the network. Though the transparent nature offers advantageous features like provenance and non-repudiation, there are privacy risks that need to be addressed. This paper addresses a privacy problem prevailing in the endorsement system of Hyperledger Fabric, a permissioned blockchain network. The endorsement process reveals the entity authorized to perform endorsement. It is revealed in two ways: one is through the endorser's signature and the other is through the endorsement policy. An endorsement policy is an expression that specifies which organizations are supposed to endorse a transaction. The endorser signature is secured by constructing a linkable ring signature secure in the standard model of security. The second form of leakage is prevented by committing to the endorsement policy and providing non-interactive zero-knowledge proofs. This paper formally proves that the privacy-preserving endorsement system achieves anonymity and unlinkability of the endorsers in Hyperledger Fabric.

Keywords: Blockchain · Hyperledger fabric · Endorsement · Linkable ring signature · Non-Interactive zero-knowledge proof

1 Introduction

Hyperledger Fabric is a distributed operating system that allows multiple organizations to collaborate and implement permissioned blockchain for a business objective [3]. A remarkable privacy feature in Hyperledger Fabric is the option to anonymize the clients through identity mixer/idemix which is implemented by a separate MSP. Through Idemix signatures a client can request a transaction proposal without revealing which organization is requesting a service. Idemix signatures also provide the property of unlinkability to the clients. This implies that the various requests generated by the same client are unlinkable. The problem is Fabric does not support the privacy property of anonymity and unlinkability c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022  D. Giri et al. (Eds.): ICNSBT 2021, LNNS 481, pp. 146–158, 2022. https://doi.org/10.1007/978-981-19-3182-6_12


for other nodes or peers in the network [1]. There are various attacks against the endorsing peers, such as the DDoS attack reported in [2] and other attacks such as spam attacks and eclipse attacks as given in [11,17]. All these attacks reduce the network efficiency or interrupt the transactions destined for a particular client. This is possible because the endorsing organizations for a transaction are known to all the nodes in the network. To provide the said privacy features to endorsers, we first need to identify the ways in which this information becomes known. Endorsing organizations are revealed in two places: one is the endorsement policy and the other is the endorser signature on the transaction response. This paper provides a framework to perform privacy-preserving endorsements. The proposed framework anonymizes the endorsers and secures the endorsement policy through advanced cryptographic techniques.

2 Literature Survey

Privacy-Preserving Endorsement in Hyperledger Fabric: Stathakopoulou and Cachin [20] were the first to address the problem of privacypreserving endorsement in Hyperledger Fabric. BLS [7,8] and RSA [21] threshold signatures are employed to mask the endorser signature. In FCsLRS scheme[15], the authors construct a scoped-linkable ring signature scheme to anonymize the endorsers. In the above cases, securing endorsement policy is not handled and the constructed signature is secure only in the random oracle model. A followup work [10] uses a linkable threshold ring signature scheme to anonymize the endorsers but turns to be inefficient. Androulaki et al. [4], introduced the security definitions for the privacy-preserving endorsement scheme in Hyperledger Fabric. The paper secures both the endorser signature and the endorsement policy. It provides a security framework by assuming that idemix signatures can be used by peers to make endorsements. But signing with idemix signatures is not yet supported to nodes other than clients. Xiao et al. [22] have constructed multi-signature schemes for Fabric to improve the transaction efficiency. The scheme does not concentrate in masking the endorser identity but intends only to reduce the storage complexity. Ring Signatures: Ring signatures were introduced in [19] to allow a signer from an authorized ring of signers to sign a message hiding the actual signer. Katz et al. formalized and gave stronger security notions for ring signatures and constructed a scheme secure under standard model [6]. Subsequently schemes secure without random oracles were constructed [9,14]. As the endorsement process in Hyperledger Fabric is analogous to the e-voting procedure, a linkable ring signature [13] is more suitable. In EUROCRYPT 2019, Backes et al. [5] gave a construction of a linkable ring signature scheme secure under the standard model of security. The linkability feature offered in this work is not scoped linkable and requires a fresh set of key pairs to sign every new transaction. The signature construction in this paper is an extension of the work by Malavolta and Schr¨ oder [14], which is based on [12], that appeared in ASIACRYPT’17 by making it a scoped-linkable ring signature scheme.

3 Proposed Linkable Ring Signature for Hyperledger Fabric

In the proposed work, first, the construction of a linkable ring signature scheme is presented. Then, a technique to obscure the endorsement policy is given. Finally, a new membership service provider is proposed to integrate the signature scheme into Hyperledger Fabric.

Basic Idea. The idea is to randomize the secret key and verification key pair and sign using the randomized secret key, and then prove in zero-knowledge that the randomized key used to sign a message originates from the ring R = {VK_1, ..., VK_n}. To incorporate the scoped-linkability feature, a verifiable random function (VRF) is employed. Every scope is uniquely identified by an identifier; in the context of blockchains, a transaction is termed a scope. Transactions are signed to imprint identifying information about who executed the transaction. As the proposed work conceals the source of the signature, it is important to ensure that the same signer has not signed the same transaction twice. The verifiable random function is a deterministic function that produces, for an input message m, a unique value v and a proof p that the value corresponds to the given input. Given a public verification key vk, a message m, and the pair (v, p), the correctness of the VRF output can be verified publicly. The value v of the input m is pseudorandom as long as the proof is not disclosed. To achieve anonymity, the proof part is encrypted and only the value is disclosed for verification in the Link function.

3.1 Proposed Construction

The construction of a linkable ring signature scheme LRS=LRGen,LRSign, LRVerify,LRLink is given below. Let, H = (Gen,Eval) be a programmable hash function; VRF = (Gen,Eval,Prove,Verify) be a verifiable random function; PKE = (Gen,Enc,Dec) be an asymmetric encryption scheme with ciphertext and key pseudo-randomness; NIZK = (Gen,Prove,Verify) be a non-interactive zeroknowledge proof system. The formal description of the linkable ring signature scheme is given in Fig. 1. – LRGen: Every signer generates a pair of verification key and secret key (V K, SK). It first samples two integers (x, α) from finite field Zp . Executes key generation of VRF and PKE functions and generates (vkL , skL ); (vknf , sknf ); (pk, sk) respectively. Key pair (vkL , skL ) helps to achieve linkability feature and (vknf , sknf ) is used to ensure non-frameability of linkable signatures. Sets secret key as a composition of (x, skL , sknf , sk). α is a trapdoor and is used to generate crs. Computes z and C as powers g2x and g2α respectively. Executes key generation of programmable hash function H and outputs k. Verification key V K is a composition of (z, k, C, vkL , vknf , pk). – LRSign: Every signer samples three integers to impart randomization at different levels. First, the x and z components of key pairs are re-randomized


procedure LRGen(1^λ):
    (x, α) ←$ Z_p^2;  z ← g2^x;  k ← HGen(1^λ);  C ← g2^α
    (vk_L, sk_L) ← VRFGen(1^λ):  β ←$ Z_p, vk_L ← g^β, sk_L ← β
    (vk_nf, sk_nf) ← VRFGen(1^λ):  γ ←$ Z_p, vk_nf ← g^γ, sk_nf ← γ
    (pk = g^m, sk = m) ← PKEGen(1^λ)
    VK := (z, k, C, vk_L, vk_nf, pk);  SK := (x, sk_L, sk_nf, sk)
    return (VK, SK)

procedure LRSign(SK, m, R):
    parse R = (VK_1, ..., VK_n);  if ∄ i : VK = VK_i then return ⊥
    parse VK := (z, k, C, vk_L, vk_nf, pk);  (ρ, δ, s) ←$ Z_p^3
    // re-randomization of keys
    z' ← z · g2^ρ;  x' ← x + ρ;  h ← HEval(k, m||R)^δ
    // VRF evaluation to impart linkability and non-frameability
    v_L ← e(g, g)^(1/(sk_L + tid));  p_L ← g^(1/(sk_L + tid))
    (vk_sOTS, sk_sOTS) ← Gen_sOTS(1^λ)
    v_nf ← e(g, g)^(1/(sk_nf + vk_sOTS));  p_nf ← g^(1/(sk_nf + vk_sOTS))
    // ElGamal encryption of the proof components
    ct_L ← p_L · g^(m·r_L);  ct_nf ← p_nf · g^(m·r_nf)
    // NIZK proof generation
    x_π := R || z' || c || (m, R) || ct_L;  π ← NIZK.Prove(C = Π_i C_i, x_π, w = (ρ, δ, i, r_L))
    y ← h^(1/(x' + s))
    θ := (s, y, h, v_L, ct_L, ct_nf, tid, π, z');  ς ← Sign_sOTS(sk_sOTS, θ)
    return σ = (θ, ς, pk_sOTS)

procedure LRVerify(R, Σ, m):
    parse R = (VK_1, ..., VK_n);  parse VK_i = (z_i, k_i, C_i, vk_L, vk_nf, pk_i)
    parse σ = (θ, ς, pk_sOTS);  parse θ = (s, y, h, v_L, ct_L, ct_nf, tid, π, z')
    x_π := R || z' || c || (m, R || ct_L)
    b ← NIZK.Verify(C = Π_i C_i, x_π, π);  b ← b ∧ Verify_sOTS(pk_sOTS, θ, ς)
    if e(y, z' · g2^s) = e(h, g2) then b' ← 1
    return (b = b' = 1)

procedure LRLink(Σ_1, Σ_2):
    if Σ_1.v_L = Σ_2.v_L then return 1 else return 0

Fig. 1. Linkable ring signature.


using one of the chosen randomization factors, ρ. The output of the hash evaluation function is then randomized by a factor of δ. Next, the transaction id tid, which uniquely identifies every transaction, is evaluated using the VRF. The proof component p_L is encrypted and the resulting ciphertext is denoted ct_L. The language of the NIZK proof system is given by:

L = { ((VK_1, ..., VK_n, VK*), (k_1, ..., k_n), c, m, ct_L) : ∃ (δ, ρ, i, r_L) : VK*/VK_i = g2^ρ ∧ h = HEval(k_i, m)^δ ∧ Dec(sk_i, ct_L) = p_L }.

The above relation states that there exists at least one verification key VK_i which is randomized to obtain VK*, that the evaluation of the message with key k_i yields h, and that decrypting the ciphertext ct_L with the corresponding secret key sk_i yields a correct proof p_L. The actual signature y is generated with the randomized part of the secret key x'. The final linkable ring signature is a composition of the partial signature component θ = (s, y, h, v_L, ct_L, ct_nf, tid, π, z'), a one-time signature ς over θ, and the public key of the one-time signature scheme.
– LRVerify: Verification proceeds by checking two criteria: first, the validity of the NIZK proof; second, the equality of the bilinear pairing equation [16] presented in Eq. (1):

e(y, z' · g2^s) ?= e(h, g2),   since   e(y, z' · g2^s) = e(h^(1/(x'+s)), g2^(x'+s)) = e(h, g2)^((x'+s)/(x'+s)) = e(h, g2).   (1)

The signature is accepted if both checks pass.
– LRLink: The linkability function allows a signer to sign two different scopes without being linked, but prevents a signer from signing a message twice on the same scope. The VRF values of signatures on a scope are compared pairwise to check that they are unique; a toy numerical sketch follows.
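For intuition only, the key re-randomization, the verification identity of Eq. (1), and the scope tag can be sketched "in the exponent": group elements g2^x are represented by their exponents modulo a prime q, and a pairing check becomes a multiplication of exponents. This is a toy model of the algebra, not the pairing-based construction itself (and it offers no security, since the "public" values equal the exponents):

```python
import secrets

q = (1 << 127) - 1          # prime order for the toy exponent group (2^127 - 1 is prime)

def keygen():
    x = secrets.randbelow(q) or 1
    return x, x                                  # secret x; verification key z "=" g2^x

def sign(x, h, scope_tid, sk_link):
    """Toy LRSign core: re-randomize the key, sign h, attach a scope tag."""
    rho = secrets.randbelow(q)
    s = secrets.randbelow(q)
    x_prime = (x + rho) % q                      # x' = x + rho
    z_prime = x_prime                            # z' "=" z * g2^rho
    y = (h * pow(x_prime + s, -1, q)) % q        # y "=" h^(1/(x'+s))
    v_link = pow(sk_link + scope_tid, -1, q)     # v_L "=" e(g,g)^(1/(sk_L + tid))
    return (s, y, z_prime, v_link)

def verify(h, sig):
    s, y, z_prime, _ = sig
    # e(y, z'*g2^s) == e(h, g2)  becomes  y * (z' + s) == h  in the exponent
    return (y * (z_prime + s)) % q == h % q

def link(sig1, sig2):
    return sig1[3] == sig2[3]                    # same signer + same scope => same tag

x, _ = keygen()
sk_link = secrets.randbelow(q) or 1
h = secrets.randbelow(q)
sig_a = sign(x, h, scope_tid=42, sk_link=sk_link)
sig_b = sign(x, h, scope_tid=42, sk_link=sk_link)
print(verify(h, sig_a), verify(h, sig_b), link(sig_a, sig_b))   # True True True
```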

4 Obscuring Endorsement Policy

In this section, a method to secure the endorsement policy from revealing the endorser organization identities is given. A formal description is presented in Fig. 2. The idea is that the policy creators commit to the organization identities and share the randomness and the corresponding commitments with the respective organizations. The endorsing organizations check the validity of the commitments and generate a non-interactive proof of knowledge of their randomness r_Orgid, secret key SK, and endorser identity EID. The validators verify the proofs and accept them if they are valid.
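A minimal sketch of the organization-pseudonym commitment (a toy Pedersen commitment over a small prime-order subgroup; the parameters and the integer encoding of Orgid are illustrative assumptions):

```python
import secrets

# Toy parameters: p = 23, q = 11 with q | (p - 1); g generates the order-q subgroup of Z_p*.
p, q = 23, 11
g = 4                                   # 4 = 2^2 has order 11 mod 23
a = secrets.randbelow(q - 1) + 1        # commitment trapdoor, known only to Setup
h = pow(g, a, p)

def gen_org_pseudonym(orgid):
    """Commit to an integer-encoded organization identity."""
    r = secrets.randbelow(q - 1) + 1
    opid = (pow(g, orgid, p) * pow(h, r, p)) % p
    return opid, r

def verify_org_pseudonym(opid, orgid, r):
    return opid == (pow(g, orgid, p) * pow(h, r, p)) % p

opid, r = gen_org_pseudonym(orgid=7)    # policy creator commits to organization "7"
print(verify_org_pseudonym(opid, 7, r)) # the endorsing organization checks it: True
```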

procedure Setup:
    Choose large primes p and q such that q | (p − 1); choose a generator g of the order-q subgroup of Z_p^*
    Randomly sample a ∈ Z_q and calculate h ← g^a mod p
    // CRS generation
    Randomly sample α ←$ Z_p and compute crs ← g2^α

procedure Generate Organization Pseudonym:
    GenOrgPseudonym(Orgid):  calculate opid ← g^Orgid · h^(r_Orgid)
    VerifyOrgPseudonym(opid, r_Orgid):  check opid ?= g^Orgid · h^(r_Orgid)

procedure Generate & Verify Membership Proof:
    Prove(x_1 = g^(r_Orgid), x_2 = g^SK, x_3 = g^EID; w_1 = r_Orgid, w_2 = SK, w_3 = EID):
        parse crs = T ∈ G2;  r ←$ Z_p        // homomorphism φ : G2 → G1
        compute R ← φ(g2^r);  P_R ← T^r;  P_Ai ← φ(T^(r·w_i)) ∀ i ∈ [1, 3], w_i ∈ Z_p
        return π = (P_Ai, R, P_R), i ∈ [1, 3]
    Verify(x_i, π, i ∈ [1, 3]):
        parse T = g^α;  x_i = A_i ∈ G1;  π = (P_Ai, R, P_R) ∈ G1^2 × G2
        check e(R, T) = e(φ(P_R), g2) ∧ e(A_i, P_R) = e(P_Ai, g2)

Fig. 2. Obscuring endorsement policy

– Setup: The system parameters required to perform the commitment are set by the policy creators, and the parameters needed to generate a common reference string (CRS) can be generated by the endorsers during the key generation process.
– Generate Organization Pseudonym: The endorsement policy creators generate Pedersen commitments [18] to the organization identities in the endorsement policy and forward each organization pseudonym and the allied randomness through an encrypted channel. The endorsers in the policy check the validity of the commitment as opid ?= g^Orgid · h^(r_Orgid) and proceed to generate a membership proof if the commitment is valid.
– Generate & Verify Membership Proof: Every policy principal in the endorsement policy generates a proof that it is authorized to execute the chaincode. An efficient non-interactive proof for the class of discrete logarithms without random oracles is employed [14]. The policy creators/endorsers forward their authorization proof along with the transaction response.


In the validation phase, every committing node checks the endorser's signature and the proof. It then checks whether a sufficient number of endorsements are available in the transaction, and verifies the proof as follows:

e(R, T) = e(φ(P_R), g2) ∧ e(A, P_R) = e(P_A, g2)                                      (2)

e(φ(g2^r), g2^α) = e(φ(g2^(α·r)), g2) ∧ e(g1^w, g2^(α·r)) = e(φ(g2^(α·r·w)), g2)      (3)

e(g1, g2)^(α·r) = e(g1^(α·r), g2) ∧ e(g1, g2)^(α·w·r) = e(g1^(α·w·r), g2)             (4)

Finally, the proof is accepted if the pairing equations (2)–(4) are satisfied. The endorsement now consists of an endorser signature and a membership proof that the signer belongs to the committed organization.

4.1 Implementing Membership Service Provider with Obfuscation of Endorsement in Hyperledger Fabric

The functions supported by the proposed privacy-preserving endorsement MSP are given below:
– Key Generation: (sk_I, pk_I, pp) ← Setup(1^λ) invokes the Setup algorithm and generates the public parameters and the secret/public key pair for the CA. (VK, SK) ← LR.Gen(1^λ) invokes the key generation algorithm and outputs the verification and secret key pair for the user.
– Policy Construction: C ← GenOrgPseudonym(Orgid) invokes the privacy-preserving policy construction and outputs a Pedersen commitment to the organization identity.
– Privacy-Preserving Endorsement: Σ ← LR.Sign(SK, m, R) invokes the linkable ring signature algorithm and outputs a signature. π ← Prove(x, w) invokes the NIZK proof of knowledge and generates a proof of membership (an illustrative sketch follows this list).
– Validation: b ← LR.Verify(R, Σ, m) ∧ LR.Link(Σ1, Σ2) ∧ Verify(x, π) invokes the verification and link algorithms of the linkable ring signature scheme and the commitment scheme, respectively. The endorsement is accepted if the signature and proof pass all the verification tests.
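For intuition, the "proof of knowledge of a discrete logarithm" behind the membership proof can be illustrated with a Schnorr-style non-interactive proof (Fiat–Shamir). This is a random-oracle-based stand-in for illustration only, not the pairing-based, random-oracle-free proof of [14] that the scheme actually uses; the group parameters are the same toy values as in the commitment sketch above:

```python
import hashlib
import secrets

p, q, g = 23, 11, 4                      # same toy order-q subgroup as before

def _challenge(*elements):
    data = "|".join(str(e) for e in elements).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove_dlog(w):
    """Prove knowledge of w such that x = g^w, without revealing w."""
    x = pow(g, w, p)
    k = secrets.randbelow(q - 1) + 1
    commitment = pow(g, k, p)
    c = _challenge(g, x, commitment)     # Fiat-Shamir: hash replaces the verifier's challenge
    response = (k + c * w) % q
    return x, (commitment, response)

def verify_dlog(x, proof):
    commitment, response = proof
    c = _challenge(g, x, commitment)
    return pow(g, response, p) == (commitment * pow(x, c, p)) % p

x, proof = prove_dlog(w=5)               # e.g. w = r_Orgid held by the endorser
print(verify_dlog(x, proof))             # True
```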

5 Security Analysis for the Constructed Linkable Ring Signature Scheme

Theorem 1 (Correctness). If the employed NIZK proof system, signature scheme, VRF, and PKE schemes are complete, then the proposed threshold ring signature scheme is correct. Proof. The proof for completeness of the LRS scheme follows directly from the underlying schemes.   Theorem 2. [Unforgeability] If the employed ring signature scheme is unforgeable, then the proposed construction of LRS is unforgeable.


Proof. The proof is straightforward as the proposed work is an extension of a ring signature scheme RSig. The underlying scheme has existential unforgeability. The proof is by contradiction. Let us assume there exists a PPT adversary A which succeeds in forging the proposed linkable ring signature scheme with a non-negligible probability (λ). Then a reduction R is constructed against the unforgeability of the underlying ring signature scheme. If A succeeds in forging the signature generated from the LRSign scheme then R successfully forges the ring signature RSig. This implies that the probability with which A succeeds is exactly equal to the winning probability of R. This contradicts the unforgeability of the ring signature scheme RSig. As the RSig scheme is existentially unforgeable so is our scheme.   Theorem 3 (Linkable Anonymity). If the NIZK is a statistically zeroknowledge argument and the signature scheme has perfect re-randomizable keys, PKE possesses key privacy and CPA-security, VRF possesses key privacy and residual pseudo-randomness, then the proposed construction of LRS has linkable anonymity. Proof. Consider the following sequence of hybrids: H0 : This is the original anonymity experiment as outlined in the definition for anonymity in [6]. H1 : It is the same as H0 with the difference that the NIZK proof π in the challenge signature is a simulated proof generated from trapdoor α without the witness w. H2 : It is the same as H1 with the difference of generating challenge signature with freshly generated key pair instead of randomized keys. The indistinguishability of the adjacent experiments is shown as follows: H0 ≈ H1 . As the employed NIZK is perfectly zero-knowledge, the hybrids H0 , H1 are identical. H1 ≈ H2 . The only difference between the two experiments is in the key generation process. H1 contains challenge signature under randomized keys while in H2 they are not. As the signature scheme possesses perfectly re-randomizable keys the distributions in the two experiments are statistically close. H0 ≈ H2 . The challenge signature computed in H2 is independent of the response bit b. Hence, the only possible attack by an unbounded adversary A is guessing with a probability of exactly 1/2. Hence the simulations H0 , H2 are identical. From the above observations, the winning probability of A is given by: |P r[AAnon (λ) = 1] − 1/2| ≤ negl(λ)

(5)

Theorem 4 (Non-Frameability). If the one-time signature scheme is strongly unforgeable, then the proposed construction of LRS is non-frameable.

Proof. Consider a PPT adversary A that wins by generating a signature that links to an honestly generated signature. A is said to win if it generates two signatures σi and σj that link, such that σi was generated by signer i and σj was produced by signer j. A is allowed to make corruption and signing queries, can obtain arbitrarily many signatures, and generates a new signature σi with respect to a fresh message and ring. A is provided with all the secret keys. It is successful if (i) it can generate another signature σj which links to σi and (ii) none of the users in the rings of both signatures were corrupted. By the strong unforgeability of the one-time signature, it is infeasible to obtain two signatures under the same vk_sOTS. Hence the proposed scheme is non-frameable.

Theorem 5 (Linkability). If the NIZK is computationally sound, the VRF has unique provability, and the PKE is correct, then the proposed construction of LRS possesses linkability.

Proof. The proof for linkability is given as a challenge-response game. Let the challenger generate k keys that constitute a ring R. Assume a PPT adversary A successfully generates k + 1 signatures without being linked, i.e. LRLink(σi, σj) = 0 for all i ≠ j. By the computational soundness of the NIZK argument there exists a witness extractor that extracts the witness w = (δ, ρ, i, r_L). The proof statement shows that there exists an index i in R which generated the signature. By the correctness of the PKE, Dec(sk_i, Enc(pk_i, p_L)) = p_L always holds. Finally, two unlinkable signatures σi, σj imply v_L^i ≠ v_L^j. By the pigeonhole principle, there exists at least one vk_L which is referenced twice for the same scope and message, but this contradicts the unique provability of the VRF.

6 Security Analysis for a Privacy-Preserving Endorsement in Hyperledger Fabric

In this section, it is proven through proof by reduction that our proposed system is secure by meeting the security requirements of anonymity and unlinkability for a privacy-preserving endorsement.

6.1 Policy Principal Anonymity and Unlinkability

Policy principal anonymity ensures that the generated organization pseudonym and membership proof leaks nothing beyond the truth of the statement. Policy principal unlinkability ensures that given two membership proofs, it is infeasible to tell if both the proofs were generated by the same policy principal or not. Theorem 6 (Policy Principal Anonymity and Unlinkability). If the commitment scheme possesses binding and hiding properties, NIZK possesses soundness and zero-knowledge property, then the proposed policy construction is anonymous and unlinkable under the generic model of security. Proof. On a high level, it is to be proven that there exists a PPT adversary A who can generate a valid membership proof of an organization without being a valid member. Assume by contradiction an efficient attacker A wins in the challenge-response game of Definition 4.1 and 4.2 with a non-negligible probability. A reduction R can be modeled against the security of commitment and


zero-knowledge schemes. At some point A outputs a tuple (ppid*, π*) consisting of an organization pseudonym and a membership proof claiming that its identity is bound to the organization pseudonym ppid. The reduction R receives it from A and invokes the extractor to obtain (x* = g^(r*_Orgid), w* = r*_Orgid, π*) ← ε_A with the same set of inputs and randomness as A. A successful attack by A on the security of the proposed system implies a successful attack by R on the commitment and NIZK schemes. By the soundness of the NIZK scheme, the probability of proving the validity of the statement without the corresponding witness is negligible. By the binding property of the commitment scheme, finding a different r*_Orgid that opens the same commitment succeeds only with negligible probability:

Pr[(r*_Orgid ≠ r_Orgid ∧ g^(r*_Orgid) = g^(r_Orgid)) = 1] = negl(λ)      (6)

Endorser Anonymity and Unlinkability

Endorser anonymity ensures that given an endorsement it is infeasible for a PPT adversary A to track the identity of the endorser who produced it. Endorser unlinkability ensures that given two endorsements, it is infeasible for A to tell if it was produced by the same endorser. Theorem 7 (Endorser Anonymity and Unlinkability). If the linkable ring signature scheme satisfies the security guarantees then the proposed endorsement scheme guarantees anonymity and unlinkability to its endorsers. Proof. It is to be proven that there exists a corrupt endorser acting as a PPT adversary A, who can generate a valid endorsement without being a valid endorser. On the contrary, it is assumed that A wins the challenge-response game in Definition 4.3 and 4.4 with a non-negligible probability. This can be directly reduced to the anonymity and unforgeability of the employed linkable ring signature and hence model a reduction R which attacks the security of the linkable ring signature scheme. This implies that a successful forgery by A means a successful forgery by R. By unforgeability property of the underlying scheme the probability that R wins in forging a signature is nearly close to negligible. This contradicts the initial assumption and hence the proposed scheme is secure. By the anonymity of ring signature, it is infeasible to find the source of a ring signature.  

7

Performance Analysis of LRS

Table 1 shows the performance of the constructed linkable ring signature scheme analyzed in terms of the number of operations required for signing and verification algorithms with respect to the size of the ring represented as n.

156

J. Dharani et al. Table 1. Efficiency of linkable ring signature scheme Operations done

Algorithm Signing Verification

Pairings



Modular exponentiation

(4.n + 3) 1

(4.n + 2)

Programmable hash computation n

n

VRF evaluation

2



sOTS computation

1

1

Encryption

2



It is inferred that the pairing operation, modular exponentiation and programmable hash computation are dependent on the size of the ring n. VRF evaluation, one-time signature computation and encryption are all independent of the number of members in the ring. Hence, the signature size is dependent on the number of members in the ring.

8

Experimental Analysis

Signature generation time w.r.t number of endorsers FCsLRS LRS

700

Signature verification time (ms)

Signature generation time (ms)

800

600 500 400 300 200 100 0

0

50

100

150

200

Number of Endorsers

(a) Signature generation time

Signature verification time w.r.t number of endorsers 2,800 2,600 FCsLRS 2,400 LRS 2,200 2,000 1,800 1,600 1,400 1,200 1,000 800 600 400 200 0

50

100

150

200

Number of Endorsers

(b) Signature verification time

Fig. 3. Signature generation and verification time w.r.t number of endorsers

Figure 3 shows a comparative analysis in terms of (3a) signature generation and (3b) verification time between the existing FCsLRS scheme and the proposed LRS scheme for λ = 1024 bits. It is evident that the FCsLRS scheme is efficient compared to the proposed work. This is because the signature generation and verification times are independent of the number of endorsers in the existing work. In the proposed LRS scheme, signature generation and verification times depend on the number of endorsers. But the proposed work outperforms the existing work in terms of security. It is secure under the classical assumptions of cryptography while the existing work is secure only under the random oracle model.

Obfuscation Techniques for a Secure Endorsement System

9

157

Conclusion and Future Work

In this work, a linkable ring signature scheme secure under the standard model of security is constructed. A privacy-preserving framework for the endorsement system in Hyperledger Fabric is provided which allows the endorsers to make endorsements using the proposed signature scheme. As a part of the framework, the endorsement policy is also secured and the endorsers produce zero-knowledge proof of their membership in the policy. Future work would be to design and employ a constant size linkable ring signature scheme secure under the classical model of security.

References 1. Current limitations of idemix (2020). https://hyperledger-fabric.readthedocs.io/ en/release-2.2/idemix.html 2. Andola, N., Gogoi, M., Venkatesan, S., Verma, S., et al.: Vulnerabilities on hyperledger fabric. Perv. Mobile Comput. 59, 101050 (2019) 3. Androulaki, E., et al.: Hyperledger fabric: a distributed operating system for permissioned blockchains. In: Proceedings of the Thirteenth EuroSys Conference, pp. 1–15 (2018) 4. Androulaki, E., De Caro, A., Neugschwandtner, M., Sorniotti, A.: Endorsement in hyperledger fabric. In: 2019 IEEE International Conference on Blockchain (Blockchain), pp. 510–519. IEEE (2019) 5. Backes, M., D¨ ottling, N., Hanzlik, L., Kluczniak, K., Schneider, J.: Ring signatures: logarithmic-size, no setup—from standard assumptions. In: Ishai, Y., Rijmen, V. (eds.) EUROCRYPT 2019. LNCS, vol. 11478, pp. 281–311. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-17659-4 10 6. Bender, A., Katz, J., Morselli, R.: Ring signatures: stronger definitions, and constructions without random oracles. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 60–79. Springer, Heidelberg (2006). https://doi.org/10.1007/ 11681878 4 7. Boneh, D., Drijvers, M., Neven, G.: Bls multi-signatures with public-key aggregation (2018) 8. Boneh, D., Lynn, B., Shacham, H.: Short signatures from the weil pairing. J. Cryptol. 17(4), 297–319 (2004) 9. Chow, S.S., Wei, V.K., Liu, J.K., Yuen, T.H.: Ring signatures without random oracles. In: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, pp. 297–302 (2006) 10. Dharani, J., Sundarakantham, K., Singh, K., Shalinie, S.M.: Design of anonymous endorsers in hyperledger fabric with linkable threshold ring signature. ReBICTE 6 (2020) 11. Feng, Q., He, D., Zeadally, S., Khan, M.K., Kumar, N.: A survey on privacy protection in blockchain system. J. Netw. Comput. Appl. 126, 45–58 (2019) 12. Fleischhacker, N., Krupp, J., Malavolta, G., Schneider, J., Schr¨ oder, D., Simkin, M.: Efficient unlinkable sanitizable signatures from signatures with rerandomizable keys. IET Inf. Secur. 12(3), 166–183 (2018) 13. Liu, J.K., Wei, V.K., Wong, D.S.: Linkable spontaneous anonymous group signature for ad hoc groups. In: Wang, H., Pieprzyk, J., Varadharajan, V. (eds.) ACISP 2004. LNCS, vol. 3108, pp. 325–335. Springer, Heidelberg (2004). https://doi.org/ 10.1007/978-3-540-27800-9 28

158

J. Dharani et al.

14. Malavolta, G., Schr¨ oder, D.: Efficient ring signatures in the standard model. In: Takagi, T., Peyrin, T. (eds.) ASIACRYPT 2017. LNCS, vol. 10625, pp. 128–157. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70697-9 5 15. Mazumdar, S., Ruj, S.: Design of anonymous endorsement system in hyperledger fabric. IEEE Trans. Emerg. Topics Comput. 9, 1780–1791 (2019) 16. Menezes, A.: An introduction to pairing-based cryptography. Recent Trends Cryptogr. 477, 47–65 (2009) 17. Moubarak, J., Chamoun, M., Filiol, E.: On distributed ledgers security and illegal uses. Future Gener. Comput. Syst. 113, 183–195 (2020) 18. Pedersen, T.P.: Non-interactive and information-theoretic secure verifiable secret sharing. In: Feigenbaum, J. (ed.) CRYPTO 1991. LNCS, vol. 576, pp. 129–140. Springer, Heidelberg (1992). https://doi.org/10.1007/3-540-46766-1 9 19. Rivest, R.L., Shamir, A., Tauman, Y.: How to leak a secret. In: Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 552–565. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45682-1 32 20. Stathakopoulous, C., Cachin, C.: Threshold signatures for blockchain systems. Swiss Federal Institute of Technology (2017) 21. Tang, S.: Simple threshold RSA signature scheme based on simple secret sharing. In: Hao, Y., et al. (eds.) CIS 2005. LNCS (LNAI), vol. 3802, pp. 186–191. Springer, Heidelberg (2005). https://doi.org/10.1007/11596981 28 22. Xiao, Y., Zhang, P., Liu, Y.: Secure and efficient multi-signature schemes for fabric: an enterprise blockchain platform. IEEE Trans. Inf. Forensics Secur. 16, 1782–1794 (2020)

Mobile Operating System (Android) Vulnerability Analysis Using Machine Learning Vinod Mahor1(B) , Kiran Pachlasiya2 , Bhagwati Garg3 , Mukesh Chouhan4 , Shrikant Telang5 , and Romil Rawat5 1 IES College of Technology, Bhopal 462001, Madhya Pradesh, India

[email protected] 2 NRI Institute of Science and Technology, Bhopal 462001, MP, India 3 Union Bank of India, Branch, Gwalior 474001, MP, India 4 Govt. Polytechnic College, Sheopur 476337, MP, India 5 Shri Vaishnav Vidhyapeeth Vishwavidyalaya, Indore 452001, MP, India

Abstract. Because of the computational processing, seamless functioning and benefits that it gives to Android-users, cyber thieves have been drawn towards it. Conventional AMD: android malware detection (analysis) approaches, including signature-based on detection for power use monitoring, may miss new malicious infections and vulnerable activities. Here, an approach for identifying malicious software variants (MSV) in Android Apps that use the Gated Recurrent Unit [GRU] [ANN-Artificial Neural Networks] is given. A comparison of traditional ML and deep learning methods is provided in order to identify the most effective model to identify Android MSV with the best results. From Android apps, we retrieved the following 2-static feature, Application Programming Interface [API] Alert, authorization should be activated. The CICAndMal2017 dataset is used to train and evaluate our method. The Deep learning (DL) approach surpasses numerous methods with an accuracy of 99.3%, according to the testing assessments. The Learning methods provide clarity of information with ease of training. Keywords: Malicious software variants · Android · Static analysis · Gated recurrent unit

1 Introduction Notably, as the count of users grows, the valuable details that a CVI may access grow as well. The use of smartphones is rising at a faster rate than it has ever been previously. By 2021, there will be 4.1 billion smart mobile gadgets on the planet [1]. Furthermore, the Android operating system is used by more than 75% of normal smartphones [2]. Furthermore, antivirus software is rarely installed on Android smartphones. Even those that install it may not be able to utilize it to identify infections very efficiently [3]. Because of the vast “number of users, the large amount and details able to access on these gadgets, these characteristics may make the Android system more appealing to cyber vulnerability Injectors (CVI) [4]. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 D. Giri et al. (Eds.): ICNSBT 2021, LNNS 481, pp. 159–169, 2022. https://doi.org/10.1007/978-981-19-3182-6_13

160

V. Mahor et al.

The attacker might be able to get access by using an app that was previously uploaded to G-Play, a victim then installs it, unwittingly giving the attacker-access [5, 6], More than 2.87 As of the third quarter of 2020, there were million Android mobile applications[AMA] available for download [7]. The MSV programmes were detected at a rate of 483,578 per month, or around 17,500 per day [8]. Because of the large number of malignant programmes, more advanced MSV detection technologies are required. Static, dynamic, and hybrid analysis are the three types of AMD methods [9]. Static analysis pulls features from Android apps without requiring them to be executed on a gadget or emulator. Suspicious or unusual conduct should be monitored. It can accomplish a high level of feature coverage, but it has a number of drawbacks, including code obfuscation and dynamic code loading. The features are extracted via dynamic analysis by running them on Android systems. This method has the potential to outperform static analysis by discovering additional characteristics or hazards that would otherwise go unnoticed by static analysis. In terms of time and computer resources, static analysis surpasses dynamic analysis [10]. The term “hybrid analysis” refers to the mix of static and dynamic analysis. In the detection procedure, it is more effective and efficient [11]. A Static analysis approach is used in our method. In essence, both in a number of research, ML techniques and DL approaches have been proven to be effective in recognizing MSV [12–14], whether in Android application analysis or other areas of cyber security. The performances of five classification methods are examined in this study. In identifying Android MSV, Sequential Minimal Optimization [SMO], Support Vector Machine [SVM], Decision Tree [DT], Logistic Regression Model [LRM], and Multilayer perceptron [MLP] are used. Table 1. Assessment algorithm comparative features Algorithms

Details

SVM

A supervised learning Technique required for regression, classification and outliers identification

SMO

Approach for solving the quadratic programming (QP) issues generated during training at SVM

DT

Use for selections and potential outcomes, such as a likelihood of incident consequences, cost objects, and utilities representations

LRM

Requited for predicting the probability of a target variables

MLP

Provides approximate results towards deep complex problems as in fitness approximation

GRU

Improves the memory capacities for RNN with ease for training the model

Then, using the CICAndMal2017 dataset [15], we compared its results to those of the DL GRU method. Because MSV requires unique permissions and API invoke alerts (API-IA) that are not found in benign apps, The dataset’s extraction of features and selecting technique includes include permissions and API-IA characteristics [16]. The

Mobile Operating System (Android) Vulnerability Analysis

161

below Table 1 provide features based representation of algorithms. The remainder of this study is organized as follows Sect. 2: analyses preceding literature, Sect. 3: describes the study methodological procedure, Sect. 4: presents research experiment results, and Sect. 5: summaries the work.

2 Literature Review Attackers that create harmful programmes have devised new strategies to attack Android users. As a result, numerous academics have confirmed the usefulness of existing techniques or proposed new tools that may be more successful in identifying malignant apps properly. Abdulrahman et al. [3] developed and trained a deep learning [DL] model on a data set containing around 30k malignant and 25k benign applications [17]. Present a novel technique for identifying malicious Android apps based on pseudodynamic analysis and the creation of an API-IA graph for each critical path. The researchers, on the other hand, built an API-IA tree to represent all conceivable execution paths that MSV may take over its lifespan. As a consequence, it transformed it into a low-dimensional numeric vector features and functionality suitable for use in a deep neural network (DNN) [31]. They also compared system performance to maximize network efficiency and contrasting various embedding techniques and adjusting different network setup settings to ensure that the optimum hyper-parameter assortment has been achieved in order to achieve the greatest accuracy results. Their findings show that the proposed MSV categorization has an accuracy of 98.86%, an F-measure of 98.65%, and recall and precision of 98.47% and 98.84%, respectively. To identify Android MSV, Suleiman et al.; presented a categorization technique built on similar ML. There were a maximum of 179 training sessions. Characteristics were collected and split in-to API-IA [18]. Instructions associated: more than 54 features, based on genuine MSV samples and benign apps. Permissions granted to the app: 125. Simple Logistic, Nave Bayes (NB), Decision Tree (DT), PART, and RIDOR were used to create a composite categorization model from a series of heterogeneous classifiers. According to their findings, PART outperformed all other classifiers, achieving truepositive rates of 0.95%, the true-negative (TN) rates of 0.96%, the false-positive (FP) rates of 0.03%, the false-negative rates of 0.04%, and accuracy of 0.04%, 0.96% accuracy, and 0.96%. Long et al. [3] proposed a lightweight A machine learning [ML]-based system that can be discriminate between favorable and malignant App. They also used both static and dynamic approaches to extract characteristics for applications [19]. They also present a novel technique for reducing the dimensionality of features (PCA-RELIEF), which has been shown to be successful in feature selection. Their research showed very efficient at identifying MSV by reducing the dimensions utilizing both their SVM model [32] and the suggested new model, paving the path for them to improve with a greater detection rate and reduced error detection rate. Conventional techniques for identifying MSV in the Android App were outperformed in all of these trials. The below Table 2 shows about available methods.

162

V. Mahor et al. Table 2. Background and comparison with the previous techniques

Reference

Research concentration

Approach

[25]

ML-AMD

ML

[26]

ML-AMD

ML with feature analysis technique

[27]

DL-AMD

DL

[28]

DL-AMD

Feature extraction technique

[29]

ML-AMD

Feature processing, ML, evaluation metrics

[30]

DL-AMD

Feature processing, DL, evaluation metrics

Proposed work

ANN-AMD

For identifying MSV in Android Apps that use the GRU [ANN-Artificial Neural Networks]

The fact that the data utilized in this article is current distinguishes it from the rest of the studies listed above. In this work, we compared the detection of Android MSV applications using standard categorization techniques against DL approaches.

3 Methodology Android Application Packages [APK] file collected by dataset that is publicly available [15]. The static properties are then retrieved with the help of a Python script written in the Jupyter-notebook environment [20]. API-IA and authorization are two of the features. For use in the training process, these features are produced and saved as a data frame in a Comma-Separated-Values (CSV) file. Finally, we utilize 10-fold cross-validation to test all classifiers and generate the evaluation metrics (Fig. 1). 3.1 Dataset Indeed, a large and trustworthy dataset is required to test and validate the AMD system using deep and ML models. As a result, a portion of the CICAndMal2017 dataset was used in this work, which was created and [15] published it, and it is found on the Canadian Centre of Cyber Security website [15, 21]. It was created by the use of realworld experiments. Our dataset contains 347 examples of benign Android apps and 365 MSV samples, with the MSVs categorized into 4-categories [adware, an ransomware, the scareware, and SMSware] [33, 34]. SMSware and Adware, for example, send out annoying and harmful advertising, whilst Scareware and Ransomware, on the other hand, solicit customers for payments and ransom in order for attackers to not harm their computers or stall their sensitive dataset. 3.2 Preprocessing and Feature Extraction It’s critical to pick characteristics that show which category the new record belongs to belongs to when it comes to categorization. From this perspective, all android apps’

Mobile Operating System (Android) Vulnerability Analysis

163

Fig. 1. MSV detect segments

permissions and API-IA are extracted, and both are included in data as features. Misspelled is a full-featured android file-interaction programme that is only available in python contexts [22]. Individual Android apps may be decompiled and reprogrammed using it. Misspelled is an application that analyses APK files by extracting the DEX access controls for all individual APK file. As a consequence, we developed a data frame (paradigm) with features [columns] and applications [rows], every column indicating a unique authorization or API-IA with binary, and rows indicating all MSV and benign files (APK). 3.3 Classifiers Framework (Paradigm) When dealing with labeled data, conventional ML Classifiers generally perform well [23]. SVM, SMO, DT, LRM, and MLP are some of the ML classifiers we employ (MLP). Various criteria have developed to assess classifiers and select the finest classifier capable of producing optimal outcomes. An Parameters [accuracy, F-measure, Recall,

164

V. Mahor et al.

and accuracy] scores are all considered in the analysis, which are computed using the formulae below.

Fig. 2. Evaluating parameters

The Accuracy is a measure for determining categorization models that relates to the model’s proportion of accurate predictions. Precision is described as the amount of accurately detected positive outcomes divided by the entire number of positive outcomes, including those that were missed. Furthermore, recall is the number of correctly identified results divided by total number of samples that should’ve been correctly identified. Precision may also be thought of as a measure of a classifier’s accuracy, with low precision suggesting a high false positives and false negatives. Low recall suggests a significant number of false negatives, whereas high recall exhibits classifier completeness. The F-Measure indicates the trade-off between precision, recall, and it may be used to select a model-based on this trade-off [24]. 3.4 Architecture of the GRU In order to construct a binary categorization model, we employ GRU. Figure 2 [1] depicts how the architecture is built. The model is made up of three parts: the primary source of information, the middle, and output block are the three blocks. The input-block has a GRU input data, yet the middle-block has 3-GRU layer along with (128, 255 and 512,) neurons, all of which is coupled to a regularization dropout layer having 0.2 dropout rate. Finally, the sigmoid function triggers a dense layer for binary classification in the output block. We utilized the widely used optimizer [Adam] having learning-rate (LR) of 0.0001 (Fig. 3). To keep track of loss and accuracy levels, we employed early halting and checkpoint approaches with ten percent of the epochs of patience. These approaches are effective in preventing falls in both over- and under fitting (OU-T) situations [8].

4 Result Evaluation Analysis We used two types of models in our experiments: standard ML classifiers and DL. We first use the dataset to train the classifiers, and then we test and evaluate them. The categorizations in both studies are based on characteristics collected from permissions and API requests.

Mobile Operating System (Android) Vulnerability Analysis

165

Fig. 3. DL model architecture

4.1 ML To develop a helpful model for identifying MSV apps, we specifically picked SVM, SMO, DT, LRM, and MLP, taking into account that these techniques are considered supervised learning. They’re all simple to set up and use, and they can help with both categorization and regression jobs. Table 3. Assessment of algorithms (Tested) for assessing android MSV Metrics/Values

SVM

SMO

DT

LRM

MLP

GRU

Accuracy (%)

97.3%

98.3%

97.3%

99.2%

94.8%

99.3%

Precision (%)

97.9%

98.8%

97.3%

99.5%

94.9%

97.3%

Recall (%)

95.6%

97.2%

98.4%

98.3%

94.6%

99.7%

F-measure (%)

97.3%

97.7%

97.3%

98.6%

94.8%

98.8%

Table 3 Shows about the generated result, With a score of 99.2%, the LRM (Logistic Regression Model) The classification has the accuracy and reliability and the most appropriate results, with the f score indicating that it correctly predicted 98.6% of the dataset. In addition, the LRM classifier has reached the 2nd -level of R-call, correctly predicting 98.3% of MSV samples. Regardless, the LRM classifier obtained the best degree of precision, identifying 99.5% of the whole dataset throughout the testing procedure.

166

V. Mahor et al.

Fig. 4. Efficiency graph

Figure 4, represents the efficiency and comparative modeling of algorithms using different parameters intervals. 4.2 DL Android was categorized using our DL method based on permissions and API requests. The quality and losses measures were used to calculate the high energy parameters. We terminated training the supermodel at 27th since the rate of loss in training and validation, the timeframe had dropped to 0.06%, and the accuracy level has achieved 99.3%. In addition, the DL classifier successfully predicted 98.8% of the dataset. Furthermore, it has a good R-call and precision score, as well as it’s able to properly detect 99.7% of MSV samples. The results demonstrate that DL techniques improved the model’s performance, with DL classifier results outperforming standard ML classifier results. However, both experiments yielded good results, thus In order to detect Smartphone MSV, models based on authorizations and API-IA are assumed to do is provide promising results.

5 Conclusion

A GRU-based architecture of ANN-DL methods offering a novel model for identifying MSV in Android OS apps is discussed here. Furthermore, a comparison between standard ML techniques and DL approaches is presented in an effort to find the most effective model for assessing Android MSV. The suggested classifiers are trained on the dataset taken from CIC and the Mal-2017 dataset, and tested using a static analysis of authentic, realistic MSV and benign apps. This research included both permissions and API-IA, as utilizing both showed that it was worthwhile to place a greater emphasis on


identifying Android MSV models. In AMD, the DL classifier beat all other classifiers, achieving a 99.3% level of accuracy and a 98.8% F-measure score using permissions and API-IA static data. The proposed work suits best when registered and licensed tools are used, but could be unreliable for unregistered tools and pirated installations. To address this limitation, a smart auto-upgrade mechanism should be created in mobile phones, at both the software and the hardware end, for identifying the susceptibility and transmitting it to a secure server for analysis and identification. A future task could be to develop an intelligent chip containing a self-healing mode and an alerting mode at secure stations.

References 1. Sharma, K., Gupta, B.B.: Towards privacy risk analysis in android applications using machine learning approaches. Int. J. E-Services and Mob. Appl. (IJESMA) 11(2), 1–21 (2019) 2. Sabhadiya, S., Barad, J., Gheewala, J.: Android malware detection using deep learning. In: 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), pp. 1254–1260. IEEE (April 2019) 3. Cui, J., Wang, L., Zhao, X., Zhang, H.: Towards predictive analysis of android vulnerability using statistical codes and machine learning for IoT applications. Comput. Commun. 155, 125–131 (2020) 4. Arslan, R.S., Do˘gru, ˙I.A., Bari¸sçi, N.: Permission-based malware detection system for android using machine learning techniques. Int. J. Software Eng. Knowl. Eng. 29(01), 43–61 (2019) 5. Garg, S., Baliyan, N.: Machine Learning Based Android Vulnerability Detection: A Roadmap. In: International Conference on Information Systems Security, pp. 87–93. Springer, Cham (Dec 2020) 6. Gencer, K., Ba¸sçiftçi, F.: Time series forecast modeling of vulnerabilities in the android operating system using ARIMA and deep learning methods. Sustainable Comp. Info. Sys. 30, 100515 (2021) 7. Malik, Y., Campos, C.R.S., Jaafar, F.: Detecting android security vulnerabilities using machine learning and system calls analysis. In: 2019 IEEE 19th International Conference on Software Quality, Reliability and Security Companion (QRS-C), pp. 109–113. IEEE (July 2019) 8. Rajawat, A.S., Rawat, R., Mahor, V., Shaw, R.N., Ghosh, A.: Suspicious big text data analysis for prediction—on darkweb user activity using computational intelligence model. In: Innovations in Electrical and Electronic Engineering, pp. 735–751. Springer, Singapore (2021) 9. Chen, S., Xue, M., Tang, Z., Xu, L., Zhu, H.: Stormdroid: a streaminglized machine learningbased system for detecting android malware. In: Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security, pp. 377–388 (May 2016) 10. Yuan, Z., Lu, Y., Xue, Y.: Droiddetector: android malware characterization and detection using deep learning. Tsinghua Sci. Technol. 21(1), 114–123 (2016) 11. Rawat, R., Mahor, V., Chirgaiya, S., Rathore, A.S.: Applications of social network analysis to managing the investigation of suspicious activities in social media platforms. In: Advances in Cybersecurity Management, pp. 315-335. Springer, Cham (2021) 12. Ghaffarian, S.M., Shahriari, H.R.: Software vulnerability analysis and discovery using machine-learning and data-mining techniques: a survey. ACM Comp. Surv. (CSUR) 50(4), 1–36 (2017) 13. Senanayake, J., Kalutarage, H., Al-Kadri, M.O.: Android mobile malware detection using machine learning: a systematic review. Electronics 10(13), 1606 (2021)


14. Rawat, R., Mahor, V., Chirgaiya, S., Shaw, R.N., Ghosh, A.: Sentiment analysis at online social network for cyber-malicious post reviews using machine learning techniques. Computationally Intelligent Systems and their Applications, 113–130 (2021) 15. Sharma, S., Kumar, N., Kumar, R., Krishna, C.R.: The paradox of choice: investigating selection strategies for android malware datasets using a machine-learning approach. Commun. Assoc. Inf. Syst. 46(1), 26 (2020) 16. Rawat, R., Mahor, V., Chirgaiya, S., Shaw, R.N., Ghosh, A.: Analysis of darknet traffic for criminal activities detection using TF-IDF and light gradient boosted machine learning algorithm. In: Innovations in Electrical and Electronic Engineering, pp. 671–681. Springer, Singapore (2021) 17. Islam, N., Das, S., Chen, Y.: On-device mobile phone security exploits machine learning. IEEE Pervasive Comput. 16(2), 92–96 (2017) 18. Rajawat, A.S., Rawat, R., Barhanpurkar, K., Shaw, R.N., Ghosh, A.: Vulnerability analysis at industrial internet of things platform on dark web network using computational intelligence. Computationally Intelligent Systems and their Applications, 39–51 (2021) 19. Martinelli, F., Mercaldo, F., Nardone, V., Santone, A., Vaglini, G.: Model checking and machine learning techniques for HummingBad mobile malware detection and mitigation. Simul. Model. Pract. Theory 105, 102169 (2020) 20. Rehman, Z.U., et al.: Machine learning-assisted signature and heuristic-based detection of malwares in Android devices. Comput. Electr. Eng. 69, 828–841 (2018) 21. Rasthofer, S., Arzt, S., Bodden, E.: A machine-learning approach for classifying and categorizing android sources and sinks. In: NDSS, Vol. 14, p. 1125 (Feb 2014) 22. Rajawat, A.S., Rawat, R., Shaw, R.N., Ghosh, A.: Cyber physical system fraud analysis by mobile robot. In: Machine Learning for Robotics Applications, pp. 47–61. Springer, Singapore (2021) 23. Pekta¸s, A., Acarman, T.: Ensemble machine learning approach for android malware classification using hybrid features. In: International Conference on Computer Recognition Systems, pp. 191–200. Springer, Cham (May 2017) 24. Singh, A.K., Goyal, N.: Understanding and mitigating threats from android hybrid apps using machine learning. In: 2020 IEEE International Conference on Big Data (Big Data), pp. 1–9. IEEE (Dec 2020) 25. Alqahtani, E.J., Zagrouba, R., Almuhaideb, A.: A survey on android malware detection techniques using machine learning algorithms. In: 2019 Sixth International Conference on Software Defined Systems (SDS), pp. 110–117. IEEE (June 2019) 26. Souri, A., Hosseini, R.: A state-of-the-art survey of malware detection approaches using data mining techniques. HCIS 8(1), 1–22 (2018). https://doi.org/10.1186/s13673-018-0125-x 27. Qiu, J., Zhang, J., Luo, W., Pan, L., Nepal, S., Xiang, Y.: A survey of android malware detection with deep neural models. ACM Computing Surveys (CSUR) 53(6), 1–36 (2020) 28. Naway, A., Li, Y.: A review on the use of deep learning in android malware detection. arXiv preprint arXiv:1812.10360 (2018) 29. Liu, K., Xu, S., Xu, G., Zhang, M., Sun, D., Liu, H.: A review of android malware detection approaches based on machine learning. IEEE Access 8, 124579–124607 (2020) 30. Wang, Z., Liu, Q., Chi, Y.: Review of android malware detection based on deep learning. IEEE Access 8, 181102–181126 (2020) 31. Rawat, R., Mahor, V., Chirgaiya, S., Garg, B.: Artificial cyber espionage based protection of technological enabled automated cities infrastructure by dark web cyber offender. 
In: Intelligence of Things: AI-IoT Based Critical-Applications and Innovations, pp. 167–188. Springer, Cham (2021) 32. Rawat, R., Garg, B., Mahor, V., Chouhan, M., Pachlasiya, K., Telang, S.: Cyber threat exploitation and growth during COVID-19 times. In: Advanced Smart Computing Technologies in Cybersecurity and Forensics, pp. 85–101. CRC Press


33. Mahor, V., Rawat, R., Kumar, A., Chouhan, M., Shaw, R.N., Ghosh, A.: Cyber warfare threat categorization on CPS by dark web terrorist. In: 2021 IEEE 4th International Conference on Computing, Power and Communication Technologies (GUCON), pp. 1–6. IEEE(Sept 2021) 34. Mahor, V., Rawat, R., Telang, S., Garg, B., Mukhopadhyay, D., Palimkar, P.: Machine learning based detection of cyber crime hub analysis using twitter data. In: 2021 IEEE 4th International Conference on Computing, Power and Communication Technologies (GUCON), pp. 1–5. IEEE (Sept 2021)

Survey of Predictive Autoscaling and Security of Cloud Resources Using Artificial Neural Networks

Prasanjit Singh(B) and Pankaj Sharma

Starzplay, Dubai, United Arab Emirates
[email protected], [email protected]

Abstract. This paper reports the results of a survey of predictive autoscaling methods and their security practices for cloud computing resources, based on machine learning models involving Artificial Neural Networks, Linear Regression and the Holt-Winters forecasting model. Autoscaling makes efficient use of resources by scaling the compute resources up and down according to the workload. Scaling up ensures service reliability, and scaling down ensures the cloud spend stays within budget. Despite all the advantages of autoscaling, achieving its full potential is challenging in situations where resources need to be scaled up or down abruptly. To deal with such challenges, predictive autoscaling is used, which forecasts the incoming traffic and prepares the system to handle it. This paper first discusses autoscaling and predictive autoscaling and their integration with machine learning, and then surveys autoscaling models aided by statistical forecasting models such as Holt-Winters and Linear Regression. Security practices to prevent abuse of predictive autoscaling mechanisms are also discussed.

Keywords: Predictive autoscaling · Cloud computing · Machine learning · Artificial neural networks · Holt-Winters model

1 Introduction

Before the advent of cloud computing, computing resource allocation was a big challenge; cloud computing has changed the way resources are allocated in a system. People no longer need hardware components to store their data, but the amount of resources required by a system is still an issue. In response, enterprises with large-scale component-based infrastructure are now introducing cloud autoscaling to maintain their quality of service (QoS) in compliance with service level agreements (SLAs). Autoscaling makes efficient use of resources by scaling them up and down according to the workload. This also helps keep operational expenditure low for software service providers. Autoscaling certainly offers a number of benefits, but reaching its maximum potential can be challenging in situations where resources need to be scaled up or down abruptly. To deal with such challenges, predictive autoscaling is used, which forecasts the incoming traffic and prepares the system to handle it.


This paper first discusses autoscaling and predictive autoscaling, then machine learning and its integration with autoscaling, and in the end, the autoscaling models invented previously are discussed. Security practices to prevent abuse of predictive auto scaling mechanisms have also been covered. The distribution of computing services like, applications, storage, and processing power, on-demand, is called cloud computing which is usually carried out on pay-as-you-go basis via the internet [1]. With the help of cloud computing, companies can access anything and everything from different applications to storage from a cloud service provider, on rent. This is much more convenient for companies than owning a data center or complete computing infrastructure. Successively, cloud computing service providers can promote economies to a great extent by providing similar services to a large range of clients. These days, with the massive increase in resource allocation to big enterprise software systems like Amazon, Google, Facebook, and Netflix, the users require a high level of reassurance regarding QoS metrics such as elevated throughput, total response time, and service availability. If such reassurances are not done, users stop trusting the service providers of these applications hence they have a churn in their user base, which in turn affects their revenues. For QoS properties, usually, users retain Service Level Agreements (SLAs) with the service providers. If the service providers fail to satisfy their QoS metrics, they end up having a huge loss of revenue in addition to churn in the user base. So, a major challenge for such enterprise systems is complying with the SLA while keeping the costs low, mainly due to the fluctuating number of incoming users to the system. A big issue with the resource allocation scheme in cloud computing is the possibility of over allocation where due to repeated changes and elevated deviation of workload, compute resources are needed to be added and removed now and then and this process includes a serious cost overhead. To cater for this issue, [3] a recent innovation; called Autoscaling, is now offered by cloud providers like Google, Amazon, Microsoft, etc. Autoscaling is a feature that lets the companies automatically scale up or scale down their computational load as per the customer demand in real-time. By autoscaling, companies can launch a new server that can sustain the availability of applications and can also scale the capacity of computation to serve the consumer according to their needs, without making the capacity pre-commitments.

2 Background and Current Research

2.1 Autoscaling

In the simplest terms [2], autoscaling means running exactly the number of servers needed to meet requirements and demand at all times, which offers a striking solution for handling uncertain demand in the market. Autoscaling is praised as one of the most useful and helpful features of cloud computing, with the major plus point of having no additional fees. Autoscaling comes into play whenever a website or an application needs additional server resources to keep up with processing jobs and requests [3, 10]. The general perception is that autoscaling can only be used to handle traffic spikes or other sudden bursts, whereas, in reality, autoscaling is equally advantageous


over the complete lifetime of a setup, be it one month or ten years. The main point is that a scalable architecture can now be easily designed that automatically scales up or down to meet the requirements of a system during its lifetime, no matter how fast or slow, big or small the setup grows during that time. Two very popular ways of autoscaling are 'front-end site traffic' autoscaling, which is done on the basis of the number of incoming requests such as web pages, data transfers and objects, and 'back-end autoscaling', which is based on the number of jobs in the queue, known as "Load-based Scaling", or on the duration of jobs in the queue, referred to as "Time-based Scaling".

2.2 Types of Autoscaling

There are three fundamental types of autoscaling based on the method of observation [4]:

Reactive. The reactive approach is closely tied to real-time observation of resources, because resources are immediately scaled up or down as a traffic spike occurs. This approach often also involves a "cool down" period: a set time during which resources are kept scaled up even if the traffic drops, so that any additional incremental traffic spikes can be handled.

Scheduled. Unlike the reactive approach, in the scheduled approach the user sets a time period for when the resources in the system will be increased. For instance, ahead of major events or for a peak period during the day, instead of waiting for the resources to scale up as demand increases, the resources can be pre-provisioned in advance.

Predictive. In the predictive autoscaling approach, the traffic load is analyzed with the help of machine learning and artificial intelligence techniques to predict when resources should be scaled up and when they should be scaled down. This paper focuses mainly on the predictive autoscaling type.

2.3 Predictive Autoscaling Paradigms Before starting with predictive autoscaling, first, consider a real-world scenario that signifies the importance and need of autoscaling, and more precisely, predictive autoscaling (Fig. 1). The figure above shows the typical workload of a commercial website, which is common to all commercial sites. Such websites have a highly varying number of incoming clients which are dependent on numerous elements such as the time of the day, the day of week, the week of the month, etc. For loads like these, capacity could be planned. There is a major difference in cost incurred when planning for an average load since less hardware is required, but performance is greatly reduced when a peak load occurs. The degraded performance will result in loss of customers which will then affect the revenue.


Fig. 1. Typical workload of a commercial website

Contrastingly, if the planned capacity is meant for the peak-level workload, it can easily take care of peak load, but when the load is low the resources will remain idle most of the time. To overcome these challenges, autoscaling is used, and cloud providers like AWS provide access to hardware that can be allocated or deallocated as demand requires. VM images can be created in advance and then spun up as required, and the virtual machines (VMs) can be released when they are no longer needed. Still, a desirable solution needs the capability to accurately predict the upcoming workload on a system and allocate the compute resources in advance. By doing so, the system will already be prepared to deal with the increase in load when it occurs. A corollary requirement is identifying how many virtual machines need to be provisioned to handle the predicted workload. For instance, consider a scenario in which N machines in a system are already running and managing U users. All of a sudden, the number of users increases from U to U + 500, so processor utilization also increases on the running nodes. Instinctively, the situation requires an increase in the number of allocated machines, but by how much is the real question (a simple sizing sketch is given after Fig. 2). Anything less than the required number will result in degraded performance, while anything more will indicate to the user that the cost paid for those resources is not even used by the application, which will make the user doubt the cloud provider. This highlights the fact that, in a cloud environment, autoscaling the resources is not an easy and simple task. Overcoming such challenges requires procedures that consider the state transition expenses when the number of resources is altered, the capability to precisely predict the future load on the server, and computation of the accurate number of resources required for the predicted increment or decrement in load.

2.4 Machine Learning Algorithms

Predictive autoscaling uses machine learning and artificial intelligence techniques to make predictions, and many machine learning algorithms have been successfully developed for predictive autoscaling in the literature. Predictive autoscaling can be used for container-based cloud applications, because such applications require automatic and timely provisioning and deprovisioning of resources for dynamic fluctuations in workload. To perform the autoscaling of Docker containers, [8] proposed a machine-learning


based approach that mainly had four steps: monitoring, analysis, planning and execution of the control loop. Different types of continuous data collection were done in the monitoring step. In the analysis step, a fast and accurate neural network model was employed to predict the future workload and determine the required number of containers, so that requests could be handled ahead of time and delays removed. The planning phase included schemes to prevent oscillations resulting from frequent scaling operations. Their experimental results showed that not only was their model comparable in accuracy to an auto-regressive integrated moving-average model, but it also offered a 600-times speedup of the prediction. In addition, the workload predicted by their model helped them use the minimum number of replicas (as shown in the figure below) to deal with the future workload (Fig. 2).

Fig. 2. Predictive auto scaling for container replica sets
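As a rough illustration of the sizing question raised in the previous section (how many machines or replicas a predicted load needs), the following sketch converts a predicted workload into an instance count; the per-instance capacity and headroom factor are invented values, not figures from the surveyed papers.

```python
# Hedged sketch: convert a predicted workload into an instance/replica count.
# USERS_PER_INSTANCE and HEADROOM are invented values used only for illustration.
import math

USERS_PER_INSTANCE = 250   # assumed concurrent users one instance can serve at target QoS
HEADROOM = 1.10            # assumed 10% safety margin against prediction error

def instances_needed(predicted_users: int) -> int:
    """Smallest instance count that covers the predicted load plus headroom."""
    return max(1, math.ceil(predicted_users * HEADROOM / USERS_PER_INSTANCE))

current_users = 2000
predicted_users = current_users + 500          # the U -> U + 500 jump from the text

before, after = instances_needed(current_users), instances_needed(predicted_users)
print(f"Scale from {before} to {after} instance(s) ahead of the predicted increase.")
```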

To perform the autoscaling of Virtual Network Functions (VNFs), in response to dynamic fluctuations in traffic [9] proposed a machine learning based approach. To make the scaling decisions in advance, they trained their machine learning classifier on the previous VNF scaling decisions and the common patterns generated by the network traffic workload. They used four virtualization technologies: LXC and Docker; that are based on container virtualization and KVM and Xen; that are based on hypervisor virtualization. They collected data from private ISP and their results proved that their machine learning classifier was very accurate. They reported a deep analysis of feature ranking, learning process and an influence of distinctive sets of training time, features, and testing times. Their results demonstrated minimized operational costs and improved QoS. 2.5 Augmenting Autoscaling with Machine Learning For both training and testing, there are [5] four common models: (i) inputs, (ii) outputs, (iii) parameters, and (iv) assessment metrics. Set of inputs and outputs are a requirement


of the machine learning model. Inputs are distributed into training and testing datasets, while outputs represent dataset predictions based on the trained model. Cloud services might include inputs like higher-level types of data tied to the applications or services themselves, like the number of requests per second or low-level CPU, memory, or network usage. As a first step, all the existing non-numerical data are encoded into a numerical form and the transformation is done with machine learning feature-engineering techniques. To combine the machine learning techniques with the autoscaling methodology, it is mostly considered that inputs and outputs are numeric data and time-series types. The parameters which are included are model parameters and hyper-parameters. Moreover, when testing and training results are assessed for precision, assessment metrics are very important. Hyperparameters are explained as variables that are usually detected by “greedy algorithms”. Model selection can also be done by the assessment metrics that is a method to choose an appropriate model amongst a set of diverse candidate models. Numerical applied values are also attributed to “assessment-metrics”. The machine learning methods and techniques are needed to be joined with the autoscaling process to scale resources on the basis of the prediction results. A significant number of manual tasks are needed to integrate machine learning components with autoscaling systems. In conclusion, different elements are in diverse characters and scales. Different machine learning models should plug-and-play in a uniform method to integrate machine learning with auto-scale, rather than model-by-model. Common elements are needed to be generalized and represented at an abstract level to provide a vast range of machine learning models. 2.6 Cloud Native Process Automation There are five common parts in the life cycle of an autoscaling system [5] which are the: (i) “autoscaling group” (ii) “monitor” (iii) “autoscaling policy” (iv) “scaling engine” and the (v) “launch configuration”. In the same manner, machine learning services involve some components like data, model storage, validation, monitoring, ML model selection and ML algorithms. To integrate the moving parts of machine learning with autoscaling, a significant number of manual tasks are needed, and to minimize the deployment effort, this integration process needs to be automated. Models should be converted into deployment elements to reduce the deployment effort, if not, then the models stay at the stage of design and are detached from other steps in a machine learning workflow. Present autoscaling techniques while made available by leading cloud service providers usually need an expert to deploy and configure the cloud components that host the application. Resources created by auto scaling techniques take care of the infrastructure deployment but do not take into account the configurations. The downtime of the system can be increased by manually configuring the resources. The deployment effort is based on the work needed to handle the extensive complications which arise by using distinct cloud services and technologies. Also, reducing the effort is of great significance as the deployment and configuration of services is very expensive.


3 Implementation Challenges

Although autoscaling solves many resource allocation problems, there are still some challenges involved in this technique [7]. Some of them are discussed in this section. The first challenge is workload forecasting. In the autoscaling process, resources are added or released as the workload changes over time, and adding and releasing resources requires programming against an API. Releasing resources is not very hard, but adding resources suffers from performance overheads for the following reasons. First, the cloud API must be called to start the procurement process. The servers then have to complete the boot process with the particular image, after which the applications must be initialized and a state update performed. The needed compute resources should therefore be obtained before the workload actually increases, and for this outcome to be achieved the future workload needs to be predicted from historical data (a brief sketch of this step is given at the end of this section). The next challenge is identifying the resource requirement for the incoming load. The number of customers using a system varies greatly from hour to hour, so the resources needed by the system also vary. The resources needed are a mathematical function of the types of calls made by each customer, the number of customers and the nature of the application. A very precise estimate of the required resources is needed, because an incorrect estimate can leave resources under-provisioned or over-provisioned in the cloud environment, which may have drastic effects on the system (Fig. 3).

Fig. 3. Resource provisioning & scaling decisions

The figure above shows how autoscaling can over-provision (red dashes) or under-provision (yellow dashes) the resources. The third challenge is balancing the different cost factors while doing the resource allocation. An ideal solution for utilizing resources fully would be the ability to change the resources at a set time interval exactly as the workload changes, under the assumption that the workload can always be overestimated. That would guarantee that the best possible number of resources is always used, but since resource changes are not instantaneous, such a plan is not likely to be possible.
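To illustrate the first two challenges above (forecasting the next interval's load and programming the scaling API ahead of the boot and initialization delay), here is a hedged sketch; the naive forecast rule, per-instance capacity, and Auto Scaling group name are assumptions, and boto3's set_desired_capacity call is shown only as one possible cloud scaling API.

```python
# Hedged sketch: forecast the next interval's load from history, then set the
# desired capacity ahead of time. The forecast rule, RPS_PER_INSTANCE and the
# group name are illustrative assumptions; running this requires AWS credentials.
import math
import boto3

RPS_PER_INSTANCE = 500          # assumed requests/second one instance can serve
history = [1200, 1350, 1500, 1700, 1900, 2100]   # recent load samples (assumed)

def forecast_next(samples, window=3):
    """Naive forecast: recent average plus the average recent trend."""
    recent = samples[-window:]
    trend = (recent[-1] - recent[0]) / (window - 1)
    return sum(recent) / window + trend

predicted_rps = forecast_next(history)
desired = max(1, math.ceil(predicted_rps / RPS_PER_INSTANCE))

autoscaling = boto3.client("autoscaling")
autoscaling.set_desired_capacity(
    AutoScalingGroupName="web-tier-asg",   # assumed group name
    DesiredCapacity=desired,
    HonorCooldown=False,                   # proactive change, ignore the cooldown
)
print(f"Predicted {predicted_rps:.0f} rps -> desired capacity {desired}")
```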


4 Solution Analysis and Future Trends

Different prediction-based autoscaling techniques and approaches have been followed by numerous researchers around the globe. In this section, we discuss most of those techniques and approaches. Hanieh Alipour [5] sought to make sense of memory, processor and network utilization using multiple resource-utilization metrics. To approach this problem, he adopted inference-based model learning to predict needs before any actions are taken. He designed a new machine learning-based method for autoscaling that includes the learning of multiple metrics for the cloud, and applied this method to continuously trained models and to workload forecasting. The results of the workload forecasting can then be used to automate the scaling of cloud resources. Finally, using platform- and language-orthogonal APIs, he built the serverless functions for this machine learning-based process, which included model selection, monitoring, microservices scheduling and ML models. He showed its architectural execution on Microsoft Azure and AWS (as shown in the figures below) and demonstrated the prediction results from machine learning (Fig. 4).

Fig. 4. Prediction results on Azure & AWS

Results of their proposed solution showed major cost reductions as compared to the generic threshold-based autoscaling. The model-driven framework is composed of firstclass elements to represent machine learning algorithm types, scores, outputs, inputs as well as parameters, to allow the machine learning prediction to be integrated with the autoscaling system. Several rules were set up to identify machine learning elements. Two abstraction levels were used to showcase how autoscaling and machine learning were related, a “cloud platform-specific” and a “cloud platform-independent” model. He automated the “model-to-deployment transformation” and “model-to-model transformation”. and integrated the models to be deployable and executable using ML model-driven approaches embedded with DevOps approach on the target cloud platform. Further the methods were demonstrated with deployment and scaling configuration of two opensource benchmark applications namely “Dell DVD” and “Netflix” on three different cloud platforms Azure, Rackspace Cloud and Amazon Web Services (Fig. 5). Results showed that his inference backed autoscaling along with model driven approach lessened around 27% of deployment toil as compared to the ordinary autoscaling. Samuel et al. [6] presented a model to meet Service Level Agreement (SLA) requirements. Accordingly, effective scaling of Virtual Machine compute resources in the cloud


Fig. 5. Prediction results for NDBench & DVD store

was needed to be deployed ahead to meet the demand in the SLAs and this could be done by predicting future resource demands. So, in order to do so, he developed a client prediction estimate model in the cloud for the TPC-W benchmark application and measured it with the help of three machine learning techniques which were: Support Vector Regression (SVR), Neural Networks (NN) and Linear Regression (LR). Intending to provide cloud users with a more vigorous scaling decision option, he included businesslevel metrics for throughput and response time in the prediction model. He carried out the inferences and analysis from the experiments on Amazon EC2 which demonstrated that SVR provides a superior prediction model for traffic patterns related to sporadic workloads. To address problems like performance overheads because of reactive scaling of resources, which also makes the programming of cloud infrastructure monotonous, Roy et al. [7] developed a forecasting resource allocation algorithm which was developed with predictive control models. He combined an ARMA model for future workload prediction, based on limited horizon, with the lookahead controller. It could adjust the resources allocated to the users, in advance. The experimental results of their approach showed that it was beneficial for both the cloud providers and users. Their presented work showed the viability of their methods in the framework of the minuscule number of servers used (Fig. 6).

Fig. 6. Future workload prediction

The experimental results in the figure prove the effectiveness of their algorithm which demonstrates that the required base resources were two servers while it increased to three or four servers whenever the load was elevated. The prediction of their algorithm closely matched the incoming load, so it proposed an increase in resources whenever


there was more load and a decrease in resources whenever there was less load. Hence, this figure reveals the efficiency of the algorithm and how it saves cost while simultaneously ensuring the performance of the running application. Another experiment [11], which uses the Holt-Winters statistical model, was also evaluated with encouraging results. A predictive Horizontal Pod Autoscaler that uses the Holt-Winters prediction method scales and reacts earlier than the standard Kubernetes Horizontal Pod Autoscaler. This is reflected in a higher number of replicas when scaling up, and in scaling earlier; the result is lower average and maximum delays and fewer failed requests, mainly at the moment of changing from a low load level to a high one. However, this effect only shows up after at least one full season (24 h); for the first season, since the predictor has no data to make predictions, its performance is roughly the same as that of the standard Kubernetes Horizontal Pod Autoscaler.
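A minimal sketch of the Holt-Winters forecasting used in [11] is shown below, assuming hourly request counts with a 24-hour season; the synthetic history is invented, and statsmodels' ExponentialSmoothing stands in for the predictor used by the predictive Horizontal Pod Autoscaler.

```python
# Hedged sketch of a Holt-Winters (triple exponential smoothing) forecast of hourly
# request rates with a 24-hour season, as used for predictive pod autoscaling in [11].
# The synthetic history below is invented for illustration.
import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing

hours = np.arange(48)                       # two full 24-hour seasons of history
history = 1000 + 600 * np.sin(2 * np.pi * hours / 24) + np.random.normal(0, 50, 48)

model = ExponentialSmoothing(history, trend="add", seasonal="add",
                             seasonal_periods=24).fit()
forecast = model.forecast(6)                # predicted request rate, next six hours

for h, load in enumerate(forecast, start=1):
    print(f"hour +{h}: predicted {load:.0f} requests/hour")
```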

5 Security Considerations

Bad bots can pose a threat and trigger autoscaling patterns, leading to unwanted upscaling of resources. To tackle this, we propose an enhanced custom AWS (Amazon Web Services) WAF (Web Application Firewall). We assume that the website is hosted on the AWS cloud with AWS CloudFront in front of it, which means policies can be applied so that the application servers accept traffic only from CloudFront using an origin access identity. Further, all logs are streamed in near real time using AWS Kinesis Data Firehose and parsed by a Lambda function to help identify fake or bad bots before the logs are stored back to Amazon S3. The serverless Lambda function essentially performs two steps. First, it inspects all traffic using its rules to detect bad and fake bots, applying forward and reverse DNS lookups to the client IP address of packets whose user-agent string resembles a good bot such as GoogleBot or BingBot. Second, once it confirms a bad or fake bot, the Lambda function updates the WAF IP set to permanently block requests coming from the fake bot's IP address. That is how the security of the architecture is ensured continuously.
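The forward/reverse DNS check performed by the Lambda function can be sketched as below; the list of trusted crawler domains and the helper name are assumptions, and the WAF IP-set update is omitted for brevity.

```python
# Hedged sketch of the forward/reverse DNS verification of a claimed "good bot".
# GOOD_BOT_DOMAINS and the helper name are assumptions; the real Lambda would also
# push confirmed fake-bot IPs into the WAF IP set, which is omitted here.
import socket

# Crawler domains that legitimate GoogleBot/BingBot requests resolve to (assumed list).
GOOD_BOT_DOMAINS = (".googlebot.com", ".google.com", ".search.msn.com")

def is_genuine_crawler(client_ip: str) -> bool:
    """Reverse-resolve the IP, check the domain, then forward-confirm the hostname."""
    try:
        hostname, _, _ = socket.gethostbyaddr(client_ip)          # reverse DNS
        if not hostname.endswith(GOOD_BOT_DOMAINS):
            return False
        forward_ips = socket.gethostbyname_ex(hostname)[2]        # forward DNS
        return client_ip in forward_ips                           # must round-trip
    except (socket.herror, socket.gaierror):
        return False

# A request claiming a GoogleBot user agent but failing this check is a fake bot,
# and its IP would be added to the blocking WAF IP set.
print(is_genuine_crawler("66.249.66.1"))   # example address from Googlebot's range
```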

6 Conclusion

Autoscaling helps deal with the uncertainties related to resource allocation. Resources can easily be scaled up or down according to the network traffic, which helps not only the customers but also the cloud service providers to serve the maximum possible number of customers while ensuring adequate quality and complying with the SLAs. Nevertheless, resources must be adjusted through APIs as the workload changes, because a server cannot handle a sudden change in traffic on its own; it is therefore imperative that prediction algorithms be embedded in autoscaling mechanisms to forecast the incoming traffic and prepare the servers for it beforehand. Many approaches to predictive autoscaling have been developed and have proved very successful, and a few of them are discussed in this paper. The future of predictive autoscaling lies in examining the scalability of the already developed techniques from a modern-day workload perspective. Finally, Infrastructure as Code (IaC) solutions like Terraform should be leveraged to provision infrastructure and security policies on the cloud based on the predictions obtained.


References

1. Ranger, S.: What is cloud computing? Everything you need to know about the cloud explained | ZDNet. https://www.zdnet.com/article/what-is-cloud-computing-everything-you-need-to-know-about-the-cloud/
2. Hillier, A.: Cloud Autoscaling Explained | Complete Overview of Scaling in the Cloud. https://www.densify.com/articles/autoscaling
3. Fazli, S., Shulman: The effects of autoscaling in cloud computing. Management Science, Articles in Advance, pp. 1–15 (2018)
4. Michael, K.S.: What is autoscaling? Cloud autoscaling explained. https://searchcloudcomputing.techtarget.com/definition/autoscaling
5. Alipour, H., Liu, Y., Hamou-Lhadj, A.: Model-driven machine learning for predictive cloud auto-scaling. IEEE Transactions on Software Engineering (2019)
6. Ajila, S.A., Bankole, A.A.: Using machine learning algorithms for cloud client prediction models in a web VM resource provisioning environment. In: Transactions on Machine Learning and Artificial Intelligence, p. 4 (2016)
7. Roy, N., Dubey, A., Gokhale, A.: Efficient autoscaling in the cloud using predictive models for workload forecasting. In: 2011 IEEE 4th International Conference on Cloud Computing, pp. 500–507. IEEE (2011)
8. Imdoukh, M., Ahmad, I., Alfailakawi, M.G.: Machine learning-based autoscaling for containerized applications. Neural Comput. Appl. 32, 9745–9760 (2020)
9. Rahman, S., Ahmed, T., Huynh, M., Tornatore, M., Mukherjee, B.: Auto-scaling VNFs using machine learning to improve QoS and reduce cost. In: 2018 IEEE International Conference on Communications
10. Singh, P., et al.: Mathematical rendition of generic process model-based design for decision making about cloud instance autoscaling actions. IOSR Journal of Mathematics (IOSR-JM) 17(3), pp. 49–56 (2021)
11. Thompson, J.: Evaluating Predictive Autoscaling in Kubernetes. https://jamiethompson.me/posts/Evaluating-Predictive-Autoscaling-Kubernetes

Systematic Literature Review (SLR) on Social Media and the Digital Transformation of Drug Trafficking on Darkweb

Romil Rawat1(B), Vinod Mahor2, Mukesh Chouhan3, Kiran Pachlasiya4, Shrikant Telang1, and Bhagwati Garg5

1 Shri Vaishnav Vidhyapeeth Vishwavidyalaya, Indore, MP 452001, India
[email protected]
2 IES College of Technology, Bhopal, Madhya Pradesh 462001, India
3 Government Polytechnic College, Sheopur, MP 476337, India
4 NRI Institute of Science and Technology, Bhopal, MP 462001, India
5 Union Bank of India, Gwalior, MP 474001, India

Abstract. The proposed study provides an annotated list of SLRs (systematic literature reviews) for use by dark web (DW) researchers, cyber experts, and security agencies in the investigation and detection of criminal activities targeting the international drug market via online social networking (OSN) platforms. To update prior studies, this research employed a broad systematic search. We used an automated search (AS) to look for SLRs that were produced between February 1, 2015 and March 30, 2021. These SLRs are compared to those detected in the original study in terms of number, quality, and source. In our extensive investigation, we found a total of 30 SLRs linked to 28 separate evaluations. Fifteen of the papers appeared to be relevant, with another 10 having the potential to be useful to practitioners. The quality of articles presented at workshops and conferences improved as more academics adopted SLR norms. SLRs have advanced beyond the level where they are fully utilized by innovators, but they are still not commonly used in criminal investigations involving the DW and DT (drug trafficking). We investigate and analyze the DW's influence in diverse socioeconomic scenarios, such as DT. Yet there are still gaps: primary research frequently fails to assess quality, and coverage of focused research on DT and other DW crime incidents remains limited across a wide range of issues, including online illicit business, organized criminal events, and potential illegal markets.

Keywords: Cyber terrorism · Dark web · Systematic literature review · Cybercrime · Drug trafficking

1 Introduction

Empirical dark web (DW) and cybercrime scholars should follow evidence-based practice as devised in the fields of cyber security and cybercrime; a Framework for Evidence-Based Dark Web (FEBDW) has been suggested, based on consolidating the best available


evidence to answer issues raised by professionals and scholars relating to the crime domain. When all empirical research on a topic is combined, the most trustworthy data is obtained. A systematic literature review (SLR) is the preferred approach for collecting empirical evaluations [5, 6]. Saiba Nazah and Shamsul Huda [1] adapted the SLRs' cybercrime recommendations to DW business channels and cybercrime techniques [9], and then modified them to add findings from sociological research. SLRs are a way of bringing together information on a DW and cybercrime issue or research question [5, 9]. By being auditable and repeatable, the SLR approach [1] strives to be as impartial as feasible. Secondary studies are referred to as SLRs, whereas main studies are referred to as primary evaluations.

• Meta-Analysis: This is utilized to do statistically based aggregation when enough comparable primary evaluations with a quantitative assessment of the difference between approaches are available. However, we have discovered that meta-analysis for SLRs in DW and cybercrime [2, 3] is seldom achievable because of the lack of a primary research base.

• Mapping Research: The goal of such research is to locate and categorize primary evaluations on a certain issue or specified area, e.g. "What do we know about issue z?" They can be used to find accessible related work before doing traditional SLRs. They employ the same search and data extraction methods as traditional SLRs, but they focus more on tabulating primary research into specified categories, as in the research on DW and cybercrime experiments [9–11]. Furthermore, some mapping evaluations are more interested in how academics do DW and cybercrime research [13] than in what exactly is known about a specific DW and cybercrime issue. Mapping evaluations give a more in-depth look at the subjects addressed in all primary evaluations, including main outcomes and primary study quality assessments.

The findings of a mapping study targeted at detecting DW and cybercrime SLRs were released [12], associating an SLR of secondary research with drug trafficking [4–6] crimes on the DW. The study's objective was to figure out how many SLRs associated with dark web crimes were released, the research areas addressed, and the constraints of existing SLRs. We followed a Manual Search Strategy (MSS) over a targeted set of more than 28 peer-reviewed research events (conferences and journals from February 1, 2015 to March 30, 2021 containing empirical evaluations and related work, previously used for other mapping evaluations [10, 14]). This yielded 18 SLRs, 10 mapping evaluations (ME) and 1 meta-analysis. Here, we compare and analyze the outcome of an automated search strategy that covered the period from February 1, 2015, through March 30, 2021. In practice, we are comparing three sets of SLRs:


Fig. 1. The analysis framework

Here, T-1 denotes papers directly connected to the research theme, and T-2.1 connects to related work associated with the research domain.

• Spanned the months of February 2015 to 30 June 2018 [13].
• Discovered via a broad automated search from February 2015 to 30 July 2018, but not included in the initial research. [14] compared and contrasted the outcome of an MSS with a broad Automated Search (AS) technique.
• Discovered between July 1, 2007 and July 30, 2008.

Contribution of the Paper. The present work focuses on the research environment and related articles dealing with crime on social media and the digital transformation of drug trafficking on the dark web. A variety of articles are searched using different automated techniques and quality evaluation, and a study based on this is presented here. In the rest of the work, these three sets of papers are referred to as T-1, T-2.1, and T-2.2, respectively (T for tertiary). T-1 refers to the original research [12], whereas T-2 refers to the current research. The remainder of the paper is organized as follows: Sect. 2 presents the research method, Sect. 3 the results and discussion, Sect. 4 the study limits, and Sect. 5 the conclusion of this work.

2 Research Method

Saiba Nazah and Shamsul Huda [1] described the fundamental SLR technique, which we used as also shown in Fig. 1. The following were the key differences between the procedures employed in this investigation and the methods utilized in the original study:


Rather than conducting a limited manual search, we employed a comprehensive automated search [13]. Three scholars gathered data on quality and classification. They used the value (median or mode, where applicable) as the consensus for publications found within the same time duration as the initial search, and utilized a "minority report" and "consensus" approach for extracting the data for the collection of publications discovered after the original search period.

2.1 The Research Questions

The first three research questions examined in this investigation were the same as those in the original study [12]:

• QOR1: Between February 1, 2015, and March 30, 2021, how many SLRs were released?
• QOR2: What research subjects are being pursued?

The third question in our initial study was "Who is driving the research effort?". However, because we were measuring activity rather than leadership, the research question has been changed to:

• QOR3: Who are the most active people and organizations in SLR-focused research?

The fourth research question in our initial study was "What are the limits of existing research?". The initial research found many flaws in existing SLRs: a large proportion of the paper evaluations looked into research techniques (8 of 21) rather than DW and pharmacological themes; drug criminality was not often discussed in DW issues; the number of main evaluations for mapping evaluations was significantly higher than for SLRs; only a few articles offered guidance tailored to the requirements of practitioners; and only a small number of SLRs evaluated the quality of primary research. As a result, the fourth question in this research has been modified to:

• QOR4: Are the limits of SLRs still a concern, as they were in the initial study?
• QOR5: What are the growing dangers and risks in DW crimes?
• QOR6: What methods are used to track down offenders on the DW?
• QOR7: Is SLR quality improving?

2.2 Research Process

We used the search approach recommendations described in [15, 16], detailed below, to provide a summary of the growing crimes occurring on the DW [7, 8, 12] and their effects. Digital libraries (IEEE, ACM, CiteSeer, ScienceDirect, Google Scholar


and WOS (Web of Science)) were used as broad indexing platforms for the search. Saiba Nazah and Shamsul Huda [1, 2] also looked through the SCOPUS database. The title, keywords, and abstract were all used for the searches, which were analyzed between July and March 2021. Except for SCOPUS indexing, the researchers utilized a set of basic search phrases [17–23] for all of the sources and combined the outcomes from all of the searches relating to dark web crime domains:

a) "Dark-web" and "cybercrime"
b) "Dark-net" and "Cyber Security"
c) "Dark-web" and "Threats"
d) "Attack" and "crime rates"
e) "Crypto market" or "dark-net marketplaces"
f) "Silk Road" and "TOR"
g) "Dark-net" and "illicit products"
h) "Dark-web" and "Law enforcement"
i) "Drug or human trafficking" and "fraud"
j) "Dark-web" and "Prostitution or terrorism"
k) "Dark-web" and "Terrorism"
l) "Dark-web" and "Data brall"
m) "Dark-web" and "Memex"

The Fig. 2 Describes about the process of Implementation of SLR technique and it comprises planning for identifying the Research domain, then the further process focuses for searching of literatures, then Useful data is extracted from the paper, after that Research focus and purpose is synthesizes to create review comment for analysis. The search keywords that were utilized to find relevant article, keywords is searched using Boolean search operation with mappings (ANDS and ORs). various search keywords were used to get relevant publications [24–27]. A search for other publications was also explored based on the references of the pertinent articles discovered Search 1 was carried out individually for 2015–2018 and then just for 2020 [17]. TITLE-ABSKEY-AUTH(“Dark-web”) AND TITLE-ABS-KEY-AUTH(“evidence-based dark web” OR “ review of studies and evaluation” OR “structured review” OR “systematic review” OR “literature review(LR)” OR Related work” “Previous Work” “Depth analysis” OR ‘literature analysis” OR “in-depth survey” OR “literature survey” OR “meta-analysis” OR “ Previous and Past studies”) AND SUBJECTAREA (“evidence-based dark web” OR “review of studies” (computer)). The outcome of the search was compared to the publications found in the original research (set T-1). As a result, we determined that the automated search for the most serious drug crime on DW sources was virtually as excellent as the MSS. Through automated search discovered 19 of the 26 evaluation found in the original study (excluding the two articles found through other ways). Two of the missing articles were on the edge of being included (abstract review article that wasn’t the paper’s major focus, and the other was a computer science study rather than a DW research), while the final missing study used the phrase “review.” Except for the Digital library (Springer), this does not enable narrow search to specific portions of the paper, the strings in the search terms of article (title, the abstract,


Fig. 2. Applied SLR methodology


and keywords). Based on criteria (inclusion and exclusion) described, we iterated and screened the most relevant publications. Following these screening procedures, 79 papers were chosen for this review study. We looked through the findings from the 2020 [17] search in further depth to see if our first search (which took place in July–August 2019) had overlooked any other pertinent articles. The 2020 search discovered 11 of the remaining 19 SLRs that were released between July 1, 2007 and July 30, 2019. The 2020 search overlooked seven SLRs because they utilized non-standardized terminology. They Didn’t used the phrase “literature review” (for example, use terminology like “literature review” [22, 23] or “assembly of studies” [16], or merely the terminology “review” without having qualifiers [10, 26]. And explained “searched publishing channels” [24] or “analyzed DW and drug for cyber crime experiments” [10] but didn’t mention any phrases linked to review. The system didn’t appear to have SCOPUS indexed. SCOPUS indexed-based search in July 2020 discovered all of the publications that utilized standard language and were mainstream DW research [28–30], as well as the three relevant papers that were overlooked in the first search and no further relevant evaluation. As a result, we came to the conclusion that we hadn’t missed any other big conventional publications. 2.3 Search Strategy and Selection The papers are arranged alphabetically [title, abstract, and keywords]. The assessment focused on removing publications that were clearly unrelated, repeats, or SLRs that we had previously discovered [22]. Phase 1: All of the papers were individually for inclusion by three scholars. All papers were randomly assigned to two scholars by a pool of five, omitting Saibanazah and shamsulhuda. All articles were examined to see if they might be rejected based on the abstract and title because they didn’t contain LR or weren’t about DW themes. Any differences were acknowledged, but the focus was on not rejecting any of the contested articles. As a result, 35 papers were rejected. Phase 2: The remaining 79 papers were retrieved in full and a second assessment was done using the following inclusion and exclusion criteria: (Not a PPT [3] or an extended abstract) but the entire manuscript; the paper contained an LR focused on defined search procedure; the article should be connected to DW and cybercrime rather than computer science(CS). Phase 3: We came up with 79 papers after using the criteria (inclusive and exclusive) in our iterative approach. The following are some examples of selecting procedures. Automatic Search: we were able to acquire 1200 publications using automatic search. Title-Based Selection: To speed up the article selection process, title-based selection was used. We select articles that are relevant to our SLR based on the title of the paper. The total number of papers now stands at 571.


Duplicate papers were deleted in this situation since some of the digital indexed papers are accessible in others. The count of articles was decreased to 383 after the duplicates were removed. Abstract-based selection: The abstracts of the 332 publications that were chosen were examined to see if they were connected to our SLR. At this point, the irrelevant abstract articles were discarded, and 90 Articles were chosen. Full-text Selection: All 110 papers were read entirety, and 90 papers were chosen as a result. Publications picked, 18-IEEE Xplore, 38-Google Scholar, 17- Science-Direct, 13-Springer, 11-Scopus, and the remaining taken from the ACM Library. • Scholars at random assigned to review all papers • Disagreements were communicated and worked out. • The emphasis was on not dismissing any articles that could be relevant. 43 publications were rejected because they conducted a LR but not having defined search methodology. Further 20 publications were eliminated because they had merely a relevant study part, were duplicates, or were not about DW and cybercrime. On the internet, the DW may be an arena for showing anonymous criminal characters. Security agencies, military officials, law enforcement, and intelligence need to keep tracking of activity done at dark web platform. On the DW [80], secrecy for shielding military command and control (CC) [81] installations for snooping [82] and hacking [83] by adversaries. DW might be used by the military to assess the region in which it works and discover potentially dangerous activities. The last 29 papers were divided into two groups. The first batch of 15 articles (T-2.1) included papers released between February 1, 2015, and July 30, 2017, while the second set (T-2.2) included papers released after July 1, 2017. We accomplished the qualitative examination and data extraction for the 12 papers in T-2.1 before starting on qualitative assessment and data retrieval for 18 articles at T-2.2.

2.4 Evaluation of Quality

Every SLR is assessed using the Centre for Reviews and Dissemination (CRD) Database of Abstracts of Reviews of Effects (DARE) standards [33–37], which are used at technical universities for research. Four questions form the basis of the criteria:

• Are the evaluation criteria (inclusion and exclusion) explicit and reasonable?
• How probable is it that the related work search found all significant evaluations?
• Were the integrity and reliability of the included evaluations assessed?
• Are the fundamental data/evaluations sufficiently presented?

The questions were graded as follows (Fig. 3).

Q1: Yes (Y), the inclusion criteria are explicitly specified in the article; Partly (P), the criteria are only implied; No (N), the requirements are not mentioned explicitly and are difficult to determine.


Fig. 3. Identification of included SLRs

Q2: Yes, the investigators used additional search techniques to search four or more digital libraries, or they identified and cited all articles that addressed the issue of interest. P, the scholars searched four or three digital libraries without using any other search techniques, or they searched a specific but limited selection of articles and conference proceedings; N, The authors used up to three digital libraries or a limited number of magazines to conduct their research. When analyzing question 2, the assessor must additionally consider if the digital resources were appropriate for the specific SLR. Q3: Yes, the authors specified qualitative criteria and retrieved them from all primary evaluations; P, the study subject addresses qualitative problems; N, no particular qualitative assessment of individual articles was conducted, or qualitative data was collected but not used. Q4: Yes, metadata about all papers is addressed so that data excerpts can be clearly traced back to relevant Article; P, only summarized information about individual papers


For Q4, P means that only summarized information about individual papers is portrayed (e.g., articles are categorized into groups but specific evaluations cannot be linked to all categories), and N means that individual study outcomes are not specified, i.e., primary evaluations are not cited. Q5: we initially retrieved the reasons from the articles and identified distinct related works based on the qualitative features and selection criteria. Themes were then extracted from the research using thematic analysis. Then, using the identified themes as a guide, we conducted qualitative data analysis. This provides architecture frameworks based on topics that have been utilized in related work. Q6: this data analysis depends on the Q1 output, which aids in organizing the different related works according to the analysis shown in Fig. 2. In the first stage, the motives were derived from the architecture summaries in order to determine the features of the models' architectures, including data size utilized, practical application, and model performance. The generalized models were then created using thematic analysis. This phase establishes the context and purpose of our architectural study. The scoring scheme was N = 0, P = 0.6, Y = 1. For qualitative data extraction, we employ two distinct approaches, in which the information collected from the available literature is analyzed comprehensively for sets T-1, T-2.1 and T-2.2 in Fig. 1. The consensus viewpoint was chosen as the intermediate value. For papers released between February 1, 2015, and March 30, 2020 (set T-2.2), we use a more stringent method to answer the qualitative questions, referred to as a "consensus and minority report": the consensus result is compared with that of a fourth independent assessment, and any discrepancies are addressed until a final consensus is reached.
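A minimal sketch of this per-question scoring, assuming the mapping stated above (N = 0, P = 0.6, Y = 1) and the median as the combining rule; the reviewer answers below are hypothetical.

```python
# Minimal sketch of the quality scoring described above.
# Mapping follows the text (N = 0, P = 0.6, Y = 1); the example answers are hypothetical.
from statistics import median

SCORE = {"N": 0.0, "P": 0.6, "Y": 1.0}

def question_score(reviewer_answers):
    """Combine several reviewers' Y/P/N answers for one question via the median."""
    return median(SCORE[a] for a in reviewer_answers)

def paper_quality(per_question_answers):
    """Total quality score of one paper over questions Q1..Q5."""
    return sum(question_score(ans) for ans in per_question_answers)

# Example: three reviewers answering Q1..Q5 for one (hypothetical) paper.
answers = [("Y", "Y", "P"), ("P", "P", "N"), ("N", "N", "P"), ("Y", "P", "Y"), ("P", "Y", "P")]
print(round(paper_quality(answers), 1))  # 3.2
```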

2.5 Major Roles of the DW

Silk Road, Hansa, and AlphaBay markets [11, 28, 38–41] have become well known over the years for facilitating the trading of illegal narcotics, firearms, stolen credit card information, child pornography, human trafficking and other nefarious products. Terrorist organizations [42–46] find the nature of DW networks appealing. The violent extremist (VE) [47–51] organizations, which carry out terrorist attacks [8, 9, 52–55, 79] all over the world and exploit the internet's global reach to address a global audience, are a source of concern.

2.6 The Process of Data Extraction

In the systematic review methodology, data harvesting is crucial. After obtaining a list of publications from the related-work search, reviewers collect information from every manuscript to serve as raw material for the synthesis process. The research question established during the planning stage determines the data to be extracted. The data extraction form should include sections for recording the information needed to answer the research question and a space for general remarks. In the synthesis stage, reviewers use the details from the form, together with evidence from the practical screen and the subsequent qualitative rating, to generate a full record for all evaluations. For each paper we recorded:


• The sort of research (SLR or mapping study).
• The review's focus on SLRs relating to the DW.
• Issues relating to the article's context for the SLR.
A minimal sketch of such an extraction record is given at the end of this subsection. Petticrew and Roberts [12] provide a second, albeit brief, treatment, in which they give an example of data retrieval and briefly explore the reliability of the critical features extracted for analysis, as well as the unequal impacts of treatments. An electronic, tool-based technique for coding and extracting documents from literature reviews is their most remarkable and significant accomplishment. Their method is based on grounded-theory coding [5, 17] and is a type of qualitative data analysis [8]. Finally, Okoli [13, 59, 60] concentrates solely on collecting theoretical components from primary research (ideas, boundary conditions, connections and explanations) [44]. Six papers explicitly state how they gathered data: Maxwell's systematic review [1, 56–58] and the studies by Nordstrom and Carlson, Tuttle, Weber, Akintaro, Feinberg, Marin, Catakoglu and Balduzzi, and Yang et al. [8–10, 14, 16–18, 61–66] devote a part of their methodology to revisiting the initial procedure and how they applied it while retrieving data.
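The sketch below expresses the data-extraction form described above as a typed record. The field names paraphrase the bullets in the text; they are an assumption about how such a form might be structured, not the authors' actual form.

```python
# Hypothetical structure of a data-extraction record for this SLR.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ExtractionRecord:
    study_type: str              # "SLR" or "mapping study" (SOM)
    review_focus: str            # focus of the review in relation to the DW
    context_issues: List[str] = field(default_factory=list)  # issues about the article's context
    general_remarks: str = ""    # free-text remarks used during synthesis

record = ExtractionRecord(
    study_type="SLR",
    review_focus="Dark Web crime detection techniques",
    context_issues=["anonymity of Tor marketplaces"],
)
```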

3 Results and Discussion

Here the three research questions (RQ) and the findings of our comprehensive related-work study are discussed. RQ1 evaluates whether or not DW crimes are becoming more dangerous; this is addressed in Sect. 3.1. RQ2 examines the various strategies used to track down criminals in the DW; in Sect. 3.2 we cover how to respond to these risks and assess the techniques used to counteract them, summarizing the contents of the 67 selected evaluations and their contributions. RQ3 looks at how many SLRs have been released between February 1, 2015, and March 30, 2021. Table 1 lists the 56 SLRs released in that period (omitting those mentioned in the tertiary research). For each review we identify: whether it asked precise RQs; whether it was particularly concerned with certain DW research trends (DWRT) or with research trends (RT) for conducting drug-crime investigations; the venue of the SLR (journal, PhD forum, workshop, book chapter, conference); the number of main evaluations utilized by the authors, as stated directly or in tabulations; and whether the research provided any recommendations for practitioners concerning the nature of DWRT publications and references to SLR guidelines.

3.1 Threats of Criminal Activity in the DW (RQ1)

To answer RQ1, we identified significant criminal dangers in the DW based on the studies we chose. The following is the list of criminal threats [73–77]:
• Human and sex trafficking [6]: these are major crimes that have risen dramatically through online social platforms, chat services [42], and the Dark Web's anonymity [46].
• Assassinations and their marketing: criminals utilize the DW to promote their assassin abilities. The sites MailOnline [34], White Wolves [1], and C'thuthlu [50] advertised criminals for hire [1] for $10,100 (USA) and $12,200 (Europe) [37].


Table 1. Overview of extracted evaluations between February 1, 2015, and March 30, 2021

| Ref | Theme | Review and research focus (RRF) | Quality | Citation guidance | Year |
|---|---|---|---|---|---|
| [19] | Security risks identification for digital transformation | RQ | Y | N | 2019 |
| [20] | Pharmaceutical drugs supply and usage in Australia | RQ | N | N | 2019 |
| [21] | Metric for cyber threat trends at dark web (limitations) | RQ | Y | Y | 2019 |
| [22] | Profiling model representation towards anomaly detection on internet | DWRT | Y | Y | 2019 |
| [23] | Countering the coercion in cyberspace (dark web) | RT | Y | N | 2019 |
| [24] | Online blackmailers and growth of ransomware (dark web) | RT | Y | Y | 2019 |
| [25] | Organized cyber-crime and terrorist networks (challenges) | RT | N | Y | 2018 |
| [26] | Methodology for SLR applied to engineering and education | RT | Y | Y | 2018 |
| [27] | Dark Net (drug dealing) | RT | Y | Y | 2015 |
| [28] | Dark web mining (drugs and fake ids) | RQ | N | N | 2016 |
| [29] | Tor marketplaces data analysis | RQ | N | Y | 2017 |
| [30] | Law enforcement at dark web | RQ | N | Y | 2017 |
| [31] | Illicit drugs in the digital world | RQ | N | Y | 2016 |
| [32] | Dark side of CRM | RT | Y | N | 2020 |
| [33] | Anonymity services of dark web (Tor, I2P) | RT | Y | N | 2018 |
| [34] | Dark web impact at online anonymity and privacy | DWRT | Y | Y | 2020 |
| [35] | Dark web (DW) content analysis and visualization | RT | Y | Y | 2019 |
| [36] | Detecting ransomware with honeypot techniques | RT | N | N | 2016 |
| [37] | Policing human trafficking: cultural blinders and organizational barriers | RQ | N | N | 2014 |
| [38] | Cryptocurrency: the associated criminal activity | RQ | Y | Y | 2020 |
| [28] | Mining the dark web for criminal events | RQ | N | N | 2016 |
| [39] | Studying illicit drug trafficking (darknet markets): Canadian perspective | RQ | N | N | 2016 |
| [40] | Functioning of online drugs markets | RQ | N | N | 2019 |
| [41] | Drug vendors at Tor network | RT | N | N | 2016 |
| [42] | The Dark Web (DW) dilemma: Tor, anonymity and online policing | RT | N | N | 2015 |
| [43] | Analysis of crimes via newspaper articles | RT | N | N | 2015 |
| [44] | Malicious hacker forums analysis | RT | N | N | 2016 |
| [45] | Terrorist migration to the dark web | RT | N | N | 2016 |
| [46] | Cluster data monitoring for darknet traffic features | RQ | N | N | 2015 |
| [47] | Dark web and Bitcoin analysis towards money laundering | RQ | Y | N | 2015 |
| [48] | Webcam child prostitution | RQ | N | N | 2017 |
| [49] | Dark web impact on internet governance | RT | N | N | 2015 |
| [50] | PRISMA-P approach | RT | N | N | 2015 |
| [51] | Geolocation of crowds in dark web | RT | N | N | 2018 |
| [52] | Social media law & cybercrime | RT | N | N | 2019 |
| [9] | Attacks landscape (dark side of the web) | RT | N | N | 2017 |
| [53] | Search of shadows (prosecuting crime on the dark web) | DWRT | N | Y | 2018 |
| [54] | Military fight (ISIS on the dark web) | RQ | N | N | 2015 |
| [55] | Ancient artifacts vs. digital artifacts | RQ | N | N | 2018 |
| [56] | IDS (classification, techniques and datasets) | RQ | Y | Y | 2017 |
| [57] | Performance crime and justice | RT | N | N | 2015 |
| [58] | Surveillance and privacy at deep web | RT | Y | N | 2017 |
| [59] | Causes, prevention and law concerned with cyber child pornography | RT | Y | N | 2015 |
| [60] | Mining user interaction patterns in dark web (predicting enterprise cyber incidents) | RQ | Y | Y | 2019 |
| [61] | Intrusion detection and big heterogeneous data: a survey | DWRT | Y | Y | 2020 |
| [62] | Suspicious big text data analysis for prediction (dark web user activity using computational intelligence model) | DWRT | Y | Y | 2021 |
| [63] | Dark web (onion hidden service discovery and crawling for profiling morphing and vulnerabilities prediction) | RT | Y | Y | 2021 |
| [64] | Social network analysis to managing (investigation of suspicious activities in OSN platform) | RT | Y | Y | 2021 |
| [65] | Sentiment analysis (OSN for cyber-malicious post reviews using machine learning techniques) | RT | Y | Y | 2021 |
| [66] | Darknet traffic analysis (criminal activities detection using TF-IDF and light gradient boosted machine learning algorithm) | RT | Y | Y | 2021 |
| [67] | Digital transformation (cyber crime for chip-enabled hacking) | DWRT | Y | Y | 2021 |
| [68] | Surveillance robot (cyber intelligence for vulnerability detection) | RQ | Y | Y | 2021 |
| [69] | Vulnerability analysis (industrial internet of things platform on dark web network using computational intelligence) | RQ | Y | Y | 2021 |
| [70] | CPS (fraud analysis by mobile robot) | RT | Y | Y | 2021 |
| [71] | Blockchain (model of expanding IoT device data security) | RT | Y | Y | 2021 |
| [72] | Geometrical and randomized (algorithms optimization for cryptographic applications) | DWRT | Y | Y | 2019 |


• Drug transactions [2]: on the Deep Web [3], there are generally two sorts of drug markets. The first type is devoted to a single kind of drug, such as heroin [12]; its customer relationships are quite strong because of product expertise and the vendor relationship. The second sort of drug market [8, 10] is the general store (buyer's market), which sells a variety of illicit goods such as firearms [7], pornography, stolen jewellery [9], black-market cigarettes, and credit cards [2, 4, 5].
• Child pornography: children use social media and apps like Omegle and Ask.fm [73] to communicate while hiding their identities [41]. Pedophiles use these apps to communicate with youngsters.
• Terrorism: on the DW, terrorist groups pose a serious threat to national integrity and security. Terrorist organizations such as al-Qaeda and ISIS [74] have taken advantage of the DW to further their goals by distributing misinformation.
• Drug trafficking crimes: drug trafficking [75] is a global trafficking method that includes the cultivation, manufacture, distribution and sale of narcotics, which is subject to the Prohibition of Drugs Act.

3.2 Techniques for Tracking Down Criminals in the DW (RQ2)

Cybercrime in the DW is quite comparable to crime in the real world, with the exception that it is difficult for law enforcement to monitor virtual crime conducted over the DW. One of the biggest issues that forensic analysts [21] may confront when trying to investigate [12] criminal behavior is the anonymity provided by DW services. As a result, forensic investigation of illegal activities is hampered. On the DW, several crime detection investigations (CSI) have been conducted to find crimes. In the subsections on law enforcement and detection [52] methods, we show how detection techniques and law-enforcement [4, 5, 12, 14] methods are used and launched for this goal.

3.3 How Many SLRs Have Been Released Between February 1, 2015, and March 30, 2021 (RQ3)

We detected 23 SLRs between February 1, 2015, and March 30, 2021 (set T-2.2), eleven of which were mapping evaluations. Except for three, all of the SLRs were DWRT-positioned (referenced either in DWRT articles or in the SLR guidelines). From 1 February 2015 to 30 July 2017 (T-1 + T-2.1), there were 41 evaluations, 21 of which were mapping evaluations. The count of evaluations per year indicates that the number of SLRs increased each year between 2015 and 2017 and was roughly equal to the count of evaluations each half year after 2018. The number of reviews identifying themselves as DW research technique (DWRT) reviews and SLRs increased between February 2018 and July 2020. The tables below indicate that we classified all five of these studies as mapping evaluations, and that four of them received a qualitative rating of less than two on the scale. As a result, while a huge count of papers can be acquired, the resulting evaluation may be of poor quality, particularly in terms of traceability from original research to conclusions (Q4) and repeatability. The inclusion and exclusion criteria used in such surveys (Q1) are likely to suffer, and individual publications will most likely not be evaluated for quality (Q3), research approach (Q4), or statistical method mapping (Q5) (Table 2 and Table 3).


Table 2. Scores analysis of each question

| Ref. | Q1 | Q2 | Q3 | Q4 | Q5 |
|---|---|---|---|---|---|
| SOM | N | P | N | P | Y |
| SLR | Y | P | Y | P | P |
| SLR | Y | Y | N | P | P |
| SLR | Y | P | N | Y | Y |
| SLR | Y | P | N | N | Y |
| SOM | P | Y | N | Y | Y |
| SOM | Y | Y | Y | P | N |
| SLR | N | Y | N | P | P |
| SOM | Y | N | N | N | P |
| SLR | P | P | N | P | Y |
| SOM | P | P | N | P | N |
| SLR | P | Y | N | N | Y |
| SOM | Y | N | N | P | Y |
| SOM | P | Y | N | P | Y |
| SLR | Y | Y | N | P | Y |
| SLR | Y | Y | N | P | P |
| SLR | Y | Y | Y | Y | P |
| SLR | Y | P | Y | Y | P |
| SLR | Y | P | N | Y | P |
| SOM | N | P | N | P | Y |
| SLR | Y | P | Y | P | P |
| SLR | Y | Y | N | P | P |
| SOM | Y | P | N | Y | P |
| SOM | P | P | N | P | Y |
| SOM | P | P | N | P | N |
| SOM | P | Y | N | N | Y |
| SLR | Y | N | N | P | Y |
| SLR | P | Y | N | P | Y |
| SLR | Y | Y | N | P | Y |
| SLR | Y | Y | N | P | P |
| SOM | Y | Y | Y | Y | P |
| SLR | Y | P | Y | Y | P |
| SOM | Y | P | Y | P | P |
| SLR | Y | P | N | P | P |
| SOM | Y | P | N | P | P |
| SOM | P | P | N | P | P |
| SLR | Y | P | N | P | P |
| SLR | Y | P | Y | P | P |
| SLR | Y | P | Y | P | P |
| SOM | Y | P | Y | P | P |
| SLR | N | P | Y | P | P |
| SOM | Y | P | Y | P | P |
| SLR | Y | P | Y | P | P |
| SOM | N | P | Y | P | P |
| SLR | Y | P | N | P | P |
| SLR | Y | P | Y | P | P |
| SLR | Y | P | Y | P | P |
| SLR | Y | P | Y | P | P |
| SOM | N | P | Y | P | P |
| SLR | Y | P | Y | P | P |
| SOM | Y | P | Y | P | P |
| SLR | N | P | Y | P | P |
| SOM | Y | P | Y | P | P |
| SLR | Y | P | Y | P | P |
| SOM | Y | P | Y | P | P |
| SOM | Y | P | Y | P | P |


Table 3. Widely used keywords

| Keywords | References |
|---|---|
| Dark-net, Dark-web, Hidden web, Deep-web, Violence, war on drugs, drug trafficking, Illegal logging | [1, 7, 14, 16, 17, 25, 41, 44] |
| Drug Market, Human Trafficking, Silk Road, black markets, unreported economy, Anonymous network, Bitcoin Fraud, Intelligence agency, CIA, Interpol, UN-Office (Drugs & Crime) | [2, 9–11, 13, 23, 24] |
| Cyber crime, cyber attack, Darkweb crypto-markets, organized crime, Hidden ToR Market | [1, 7, 14, 17, 19, 23, 44, 54, 56, 57, 60, 62, 65] |
| Terrorism, cyber threat, crowd-funded assassinations, illegal drug economy, underground economy, shadow economy, tax gap, Social Network propaganda, Online extremist recruitment | [5–8, 13, 19, 21, 23, 26, 39, 40, 42, 43, 47, 48] |
| Criminal, Illicit Drugs, Child pornography, smuggling, Arms trafficking, Biological organs Trading, illegal drug trade | [6–13, 15, 17, 19, 24, 28, 30, 31, 33] |
| Cyber security, Online Social Network (OSN), Illicit Business, darknet markets, clandestine market, Grey market, cannabis, modafinil, LSD, cocaine, designer drugs, MDMA, Temazepam | [1–7, 13, 44, 46, 47, 50, 52, 54, 56, 65, 67–69] |

When comparing the distribution of SLR themes to the curriculum, it becomes clear that coverage of fundamental DW concepts is severely limited. (If articles are mapped to the DWBOK [49], a similar result is obtained.) Considering only SLRs deemed to be of exceptional quality (scoring 3 or above on the qualitative scale) would exclude five of the most essential SLRs for specialists and scholars. Even where the SLRs are of poor standard, the median number of main evaluations in mapping evaluations and SLRs is rather large (Table 4), and a variety of subjects appears to be discussed. Using the DW Curriculum Guidelines for Degree and Research Programs [52] and the DW and Drug Book of Knowledge (DWBOK) [49] as a baseline for evaluating the extent to which international law and crime organization, crime intelligence, cyber security and ethics topics are discussed, we examined how successfully the SLRs linked to cybercrime were handled, both in terms of educational and policy implementation.


Table 4. In mapping evaluations and SLRs, the median number of main evaluations. (Study-year periods in the original layout: February–March, April–July, August–October, November–March.)

| Time duration of study year | Average number of SLRs per month | Number of positioned SLRs |
|---|---|---|
| 2015–2016 | 0.4 | 4 |
| 2016–2017 | 0.67 | 6 |
| 2017–2018 | 0.6 | 5 |
| 2018–2019 | 0.76 | 9 |
| 2019–2020 | 1.09 | 10 |
| 2020–2021 | 1.8 | 22 |
| Total | | 56 |
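As a rough sanity check of Table 4, the per-month average is the number of positioned SLRs divided by the months covered. The 12-month span below is an assumption; it reproduces the 2020–2021 row (22 / 12 ≈ 1.8), while the earlier rows appear to cover shorter spans.

```python
# Hypothetical reconstruction of the per-month average in Table 4.
def slrs_per_month(count, months=12):
    return round(count / months, 2)

print(slrs_per_month(22))  # 1.83, matching the 1.8 reported for 2020-2021
```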

4 Study Limitations

Finding all relevant research is one of the most difficult aspects of SLRs. For this investigation we utilized a combination of an automatic search and publications identified during a prior manual search. The search nevertheless missed four articles that might have been discovered, because they were not indexed at the time of the searches. An extra search was conducted, which identified all publications that employed standard terminology and were mainstream DW papers. However, it is possible that we overlooked certain subjects that fall between DW and crime, as well as between IT and computer science (CS). Furthermore, the search phrases were created to locate the greatest number of known and validated SLRs, so research syntheses that do not use terms like "literature review" or "literature survey" may have been missed. Another limitation is that the qualitative assessment was conducted in two ways: using a median for assessments released before July 30, 2017, and using a "consensus and minority report" approach for assessments released after July 30, 2018. Furthermore, a different approach, namely an extractor-and-checker strategy, was used in the original study. However, because the quality of each set of evaluations is comparable, we may confidently conclude that recent SLRs score higher in terms of quality. Finally, in terms of SLR selection and subjective data extraction, all of the publications were examined and data was retrieved from all of them.

5 Conclusion

The findings of this study reveal two major differences from our prior study and inform our own new research: this SLR includes a full explanation of the DW risks, and the count of SLRs released continues to rise. Many LRs are still conducted without regard for methodology. We discovered 53 SLRs (of various qualities) between February 1, 2015, and March 30, 2021; however, we also found 57 LRs that did not employ any defined and validated search technique. The 14 potential SLRs eliminated at the original tertiary evaluation are not included in this group of 44 trials, because the DW is an un-indexed, fragmented, and multi-layered system for detecting crimes.


These findings demonstrate that cataloguing high-quality SLRs requires a thorough search of all available sources. If authors want their research to be easily identified, we strongly advise using the terms "systematic review" or "systematic literature review (SLR)" in the title and keywords. Prior to releasing the findings of an SLR on DW crimes, it is also an excellent strategy to run an additional search through an indexing engine such as SCOPUS.
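As an illustration of the recommendation above, a boolean title/abstract/keyword query of the following kind could be used to locate SLRs on dark-web crime in an indexing engine. The field syntax varies by engine and is assumed here; this is not a query taken from the study.

```python
# Hypothetical indexing-engine query combining SLR terminology with topic terms.
query = (
    'TITLE-ABS-KEY(("systematic review" OR "systematic literature review") '
    'AND ("dark web" OR darknet) AND (crime OR cybercrime))'
)
print(query)
```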

References 1. Nazah, S., Huda, S., Abawajy, J., Hassan, M.M.: Evolution of dark web threat analysis and detection: a systematic approach. IEEE Access 8, 171796–171819 (2020) 2. Rajamanickam, D.S., Zolkipli, M.F.: Review on dark web and its impact on internet governance. J. ICT Edu. 8(2), 13–23 (2021) 3. Chen, H.: Dark web: Exploring and data mining the dark side of the web, vol. 30. Springer Science & Business Media (2011) 4. Paoli, G.P., Aldridge, J., Nathan, R., Warnes, R.: Behind the curtain: the illicit trade of firearms, explosives and ammunition on the dark web (2017) 5. Moor, L., Anderson, J.R.: A systematic literature review of the relationship between dark personality traits and antisocial online behaviours. Pers. Individ. Differ. 144, 40–55 (2019) 6. Desmond, D.B., Lacey, D., Salmon, P.: Evaluating cryptocurrency laundering as a complex socio-technical system: a systematic literature review. J. Money Laund. Control 22(3), 480– 497 (2019) 7. Cascavilla, G., Tamburri, D.A., Van Den Heuvel, W.J.: Cybercrime threat intelligence: a systematic multi-vocal literature review. Comput. Secur. 105, 102258 (2021) 8. Marin, E., Shakarian, J., Shakarian, P.: Mining key-hackers on darkweb forums. In: 2018 1st International Conference on Data Intelligence and Security (ICDIS), pp. 73–80. IEEE (2018) 9. Catakoglu, O., Balduzzi, M., Balzarotti, D.: Attacks landscape in the dark side of the web. In: Proceedings of the Symposium on Applied Computing, pp. 1739–1746 (2017). 10. Yang, Y., et al.: Dark web forum correlation analysis research. In: 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), pp. 1216–1220. IEEE (2019) 11. van Wegberg, R., Verburgh, T.: Lost in the dream? measuring the effects of operation bayonet on vendors migrating to dream market. In: Proceedings of the Evolution of the Darknet Workshop, pp. 1–5 (2018) 12. Bryden, A., Roberts, B., McKee, M., Petticrew, M.: A systematic review of the influence on alcohol use of community level availability and marketing of alcohol. Health Place 18(2), 349–357 (2012) 13. Okoli, A.C., Idom, A.M.: The internet and national security in Nigeria: a threat-import discourse. Covenant Univ. J. Politics Int. Affairs 6(1) (2018) 14. Nordstrom, C., Carlson, L.: Cyber Shadows: Power, Crime, and Hacking Everyone. ACTA Publications (2014) 15. Tuttle, H.: Cryptojacking. Risk Manag. 65(7), 22–27 (2018) 16. Weber, J., Kruisbergen, E.W.: Criminal markets: the dark web, money laundering and counterstrategies - an overview of the 10th research conference on organized crime. Trends Organized Crime 22(3), 346–356 (2019) 17. Akintaro, M., Pare, T., Dissanayaka, A.M.: Darknet and black market activities against the cybersecurity: a survey. In: The Midwest Instruction and Computing Symposium (MICS). North Dakota State University, Fargo, ND (2019). 18. Feinberg, T., Robey, N.: Cyberbullying: intervention and prevention strategies. Natl. Assoc. Sch. Psychol. 38(4), 22–24 (2009)


19. Duc, A.N., Chirumamilla, A.: Identifying security risks of digital transformation - an engineering perspective. In: Pappas, I.O., Mikalef, P., Dwivedi, Y.K., Jaccheri, L., Krogstie, J., Mäntymäki, M. (eds.) Digital Transformation for a Sustainable Society in the 21st Century: 18th IFIP WG 6.11 Conference on e-Business, e-Services, and e-Society, I3E 2019, Trondheim, Norway, September 18–20, 2019, Proceedings, pp. 677–688. Springer International Publishing, Cham (2019). https://doi.org/10.1007/978-3-030-29374-1_55 20. Hulme, S., Hughes, C.E., Nielsen, S.: Drug sourcing and motivations among a sample of people involved in the supply of pharmaceutical drugs in Australia. Int. J. Drug Policy 66, 38–47 (2019) 21. Jardine, E.: The trouble with (supply-side) counts: The potential and limitations of counting sites, vendors or products as a metric for threat trends on the dark web. Intell. Nat. Secur. 34(1), 95–111 (2019) 22. Lashkari, A.H., Chen, M., Ghorbani, A.A.: A survey on user profiling model for anomaly detection in cyberspace. J. Cyber Secur. Mobil. 8(1), 75–112 (2019) 23. Hodgson, Q., Ma, L., Marcinek, K., Schwindt, K.: Fighting Shadows in the Dark: Understanding and Countering Coercion in Cyberspace. RAND Corporation, United States (2019) 24. Minnaar, A.: Cybercriminals, cyber-extortion, online blackmailers and the growth of ransomware. Acta Criminologica: Afr. J. Criminol. Victimology 32(2), 105 (2019) 25. Tundis, A., Huber, F., Jäger, B., Daubert, J., Vasilomanolakis, E., Mühlhäuser, M.: Challenges and available solutions against organized cyber-crime and terrorist networks. WIT Trans. Built Environ. 174, 429–441 (2018) 26. Torres-Carrión, P.V., González-González, C.S., Aciar, S., Rodríguez-Morales, G.: Methodology for systematic literature review applied to engineering and education. In: 2018 IEEE Global engineering education conference (EDUCON), pp. 1364–1373. IEEE (2018) 27. Afilipoaie, A., Shortis, P.: From Dealer to Doorstep—How Drugs Are Sold on the Dark Net. GDPO Situation Analysis. Swansea University, Global Drugs Policy Observatory (2015) 28. Baravalle, A., Lopez, M.S., Lee, S.W.: Mining the dark web: drugs and fake ids. In: 2016 IEEE 16th international conference on data mining workshops (ICDMW), pp. 350–356. IEEE (2016) 29. Celestini, A., Me, G., Mignone, M.: Tor marketplaces exploratory data analysis: the drugs case. In: International conference on global security, safety, and sustainability, pp. 218–229. Springer, Cham (2017) 30. Ghappour, A.: Searching places unknown: law enforcement jurisdiction on the dark web. Stan. L. Rev. 69, 1075 (2017) 31. Maddox, A., Barratt, M.J., Allen, M., Lenton, S.: Constructive activism in the dark web: cryptomarkets and illicit drugs in the digital ‘demimonde.’ Inf. Commun. Soc. 19(1), 111–126 (2016) 32. Nguyen, B., Jaber, F., Simkin, L.: A systematic review of the dark side of CRM: the need for a new research agenda. J. Strateg. Mark. 30(1), 93–111 (2020) 33. Montieri, A., Ciuonzo, D., Aceto, G., Pescapé, A.: Anonymity services tor, i2p, jondonym: classifying in the dark (web). IEEE Trans. Dependable Secure Comput. 17(3), 662–675 (2018) 34. Beshiri, A.S., Susuri, A.: Dark web and its impact in online anonymity and privacy: a critical analysis and review. J. Comput. Commun. 7(03), 30 (2019) 35. Takaaki, S., Atsuo, I.: Dark web content analysis and visualization. In: Proceedings of the ACM International Workshop on Security and Privacy Analytics, pp. 53–59 (2019) 36. Moore, C.: Detecting ransomware with honeypot techniques. 
In: 2016 Cybersecurity and Cyberforensics Conference (CCC), pp. 77–81. IEEE (2016) 37. Farrell, A., Pfeffer, R.: Policing human trafficking: cultural blinders and organizational barriers. Ann. Am. Acad. Polit. Soc. Sci. 653(1), 46–64 (2014)


38. Kethineni, S., Cao, Y.: The rise in popularity of cryptocurrency and associated criminal activity. Int. Crim. Justice Rev. 30(3), 325–344 (2020) 39. Broséus, J., Rhumorbarbe, D., Mireault, C., Ouellette, V., Crispino, F., Décary-Hétu, D.: Studying illicit drug trafficking on Darknet markets: structure and organisation from a Canadian perspective. Forensic Sci. Int. 264, 7–14 (2016) 40. Bhaskar, V., Linacre, R., Machin, S.: The economic functioning of online drugs markets. J. Econ. Behav. Organ. 159, 426–441 (2019) 41. Dolliver, D.S., Kenney, J.L.: Characteristics of drug vendors on the Tor network: a cryptomarket comparison. Vict. Offenders 11(4), 600–620 (2016) 42. Jardine, E.: The Dark Web dilemma: Tor, anonymity and online policing. SSRN Electron. J. (2015). https://doi.org/10.2139/ssrn.2667711 43. Jayaweera, I., Sajeewa, C., Liyanage, S., Wijewardane, T., Perera, I., Wijayasiri, A.: Crime analytics: analysis of crimes through newspaper articles. In: 2015 Moratuwa Engineering Research Conference (MERCon), pp. 277–282. IEEE (2015). 44. Shakarian, J., Gunn, A.T., Shakarian, P.: Exploring malicious hacker forums. In: Sushil Jajodia, V.S., Subrahmanian, V.S., Wang, C. (eds.) Cyber deception, pp. 259–282. Springer International Publishing, Cham (2016). https://doi.org/10.1007/978-3-319-32699-3_11 45. Weimann, G.: Terrorist migration to the dark web. Perspect. Terror. 10(3), 40–44 (2016) 46. Nishikaze, H., Ozawa, S., Kitazono, J., Ban, T., Nakazato, J., Shimamura, J.: Large-scale monitoring for cyber attacks by using cluster information on darknet traffic features. Procedia Comput. Sci. 53, 175–182 (2015) 47. Braga, R.R.P., Luna, A.A.B.: Dark web and bitcoin: an analysis of the impact of digital anonymate and crypto currencies in the practice of money laundering crime. Direito e Desenvolvimento 9, 270 (2018) 48. Açar, K.V.: Webcam child prostitution: an exploration of current and futuristic methods of detection. Int. J. Cyber Criminol. 11(1), 98–109 (2017) 49. Chertoff, M., Simon, T.: The impact of the dark web on internet governance and cyber security (2015) 50. Moher, D., et al.: Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst. Rev. 4(1), 1–9 (2015) 51. La Morgia, M., Mei, A., Raponi, S., Stefa, J.:. Time-zone geolocation of crowds in the dark web. In: 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS), pp. 445–455. IEEE (2018) 52. Sodhi, A.: Social Media Law & Cybercrime. Available at SSRN 3541485 (2020) 53. Becker, K., Fitzpatrick, B.: In search of shadows: investigating and prosecuting crime on the Dark Web. US Att’ys Bull. 66, 41 (2018) 54. Tucker, P.: How the Military will fight ISIS on the dark web. Defense One 24 (2015) 55. Paul, K.A.: Ancient artifacts vs. digital artifacts: new tools for unmasking the sale of illicit antiquities on the dark web. Arts 7(2), 12 (2018) 56. Chaudhari, R.R., Patil, S.P.: Intrusion detection system: classification, techniques and datasets to implement. Int. Res. J. Eng. Technol. (IRJET) 4(2), 1860–1866 (2017) 57. Surette, R.: Performance crime and justice. Curr. Issues Crim. Justice 27(2), 195–216 (2015) 58. Lightfoot, S., Pospisil, F.: Surveillance and Privacy on the Deep Web. ResearchGate, Berlin, Germany, Tech. Rep. (2017) 59. Nizami, S.M.N.S.M.: Causes, prevention and law concerned with cyber child pornography. IJECI 2(4), 9–9 (2018) 60. 
Sarkar, S., Almukaynizi, M., Shakarian, J., Shakarian, P.: Mining user interaction patterns in the darkweb to predict enterprise cyber incidents. Soc. Netw. Anal. Min. 9(1), 1–28 (2019) 61. Zuech, R., Khoshgoftaar, T.M., Wald, R.: Intrusion detection and big heterogeneous data: a survey. J. Big Data 2(1), 1–41 (2015)


62. Rajawat, A.S., Rawat, R., Mahor, V., Shaw, R.N., Ghosh, A.: Suspicious big text data analysis for prediction—on darkweb user activity using computational intelligence model. In: Mekhilef, S., Favorskaya, M., Pandey, R.K., Shaw, R.N. (eds.) Innovations in Electrical and Electronic Engineering: Proceedings of ICEEE 2021, pp. 735–751. Springer Singapore, Singapore (2021). https://doi.org/10.1007/978-981-16-0749-3_58 63. Rawat, R., Rajawat, A.S., Mahor, V., Shaw, R.N., Ghosh, A.: Dark web—onion hidden service discovery and crawling for profiling morphing, unstructured crime and vulnerabilities prediction. In: Mekhilef, S., Favorskaya, M., Pandey, R.K., Shaw, R.N. (eds.) Innovations in Electrical and Electronic Engineering: Proceedings of ICEEE 2021, pp. 717–734. Springer Singapore, Singapore (2021). https://doi.org/10.1007/978-981-16-0749-3_57 64. Rawat, R., Mahor, V., Chirgaiya, S., Rathore, A.S.: Applications of social network analysis to managing the investigation of suspicious activities in social media platforms. In: Daimi, K., Peoples, C. (eds.) Advances in Cybersecurity Management, pp. 315–335. Springer International Publishing, Cham (2021). https://doi.org/10.1007/978-3-030-71381-2_15 65. Rawat, R., Mahor, V., Chirgaiya, S., Shaw, R.N., Ghosh, A.: Sentiment analysis at online social network for cyber-malicious post reviews using machine learning techniques. In: Bansal, J.C., Paprzycki, M., Bianchini, M., Das, S. (eds.) Computationally Intelligent Systems and their Applications, pp. 113–130. Springer Singapore, Singapore (2021). https://doi.org/10.1007/ 978-981-16-0407-2_9 66. Rawat, R., Mahor, V., Chirgaiya, S., Shaw, R.N., Ghosh, A.: Analysis of darknet traffic for criminal activities detection using TF-IDF and light gradient boosted machine learning algorithm. In: Mekhilef, S., Favorskaya, M., Pandey, R.K., Shaw, R.N. (eds.) Innovations in Electrical and Electronic Engineering: Proceedings of ICEEE 2021, pp. 671–681. Springer Singapore, Singapore (2021). https://doi.org/10.1007/978-981-16-0749-3_53 67. Rawat, R., Mahor, V., Rawat, A., Garg, B., Telang, S.: Digital transformation of cyber crime for chip-enabled hacking. In: Handbook of Research on Advancing Cybersecurity for Digital Transformation, pp. 227–243. IGI Global (2021) 68. Rawat, R., Rajawat, A.S., Mahor, V., Shaw, R.N., Ghosh, A.: Surveillance robot in cyber intelligence for vulnerability detection. In: Bianchini, M., Simic, M., Ghosh, A., Shaw, R.N. (eds.) Machine Learning for Robotics Applications, pp. 107–123. Springer Singapore, Singapore (2021). https://doi.org/10.1007/978-981-16-0598-7_9 69. Rajawat, A.S., Rawat, R., Barhanpurkar, K., Shaw, R.N., Ghosh, A.: Vulnerability analysis at industrial internet of things platform on dark web network using computational intelligence. In: Bansal, J.C., Paprzycki, M., Bianchini, M., Das, S. (eds.) Computationally Intelligent Systems and their Applications, pp. 39–51. Springer Singapore, Singapore (2021). https:// doi.org/10.1007/978-981-16-0407-2_4 70. Rajawat, A.S., Rawat, R., Shaw, R.N., Ghosh, A.: Cyber physical system fraud analysis by mobile robot. In: Bianchini, M., Simic, M., Ghosh, A., Shaw, R.N. (eds.) Machine Learning for Robotics Applications, pp. 47–61. Springer Singapore, Singapore (2021). https://doi.org/ 10.1007/978-981-16-0598-7_4 71. Rajawat, A.S., Rawat, R., Barhanpurkar, K., Shaw, R.N., Ghosh, A.: Blockchain-based model for expanding IoT device data security. In: Bansal, J.C., Fung, L.C.C., Simic, M., Ghosh, A. (eds.) 
Advances in Applications of Data-Driven Computing, pp. 61–71. Springer Singapore, Singapore (2021). https://doi.org/10.1007/978-981-33-6919-1_5 72. Romil, R.: Geometrical and randomized-algorithms optimization for cryptographic applications. In: Algebra, Number Theory and Discrete Geometry: Modern Problems, Applications and Problems of History, pp. 96–96 (2019) 73. Kempen, A.: Spending too much time online is unhealthy for kids and can be a tool for child sexual predators! Servamus Commun. Based Saf. Secur. Mag. 114(6), 50–53 (2021)


74. Lee, C.S., Choi, K.-S., Shandler, R., Kayser, C.: Mapping global cyberterror networks: an empirical study of al-Qaeda and ISIS cyberterrorism events. J. Contemp. Crim. Justice 37(3), 333–355 (2021). https://doi.org/10.1177/10439862211001606 75. Namli, U.: Behavioral changes among street level drug trafficking organizations and the fluctuation in drug prices before and during the Covid-19 pandemic. Am. J. Qual. Res, 5(1), 1–22 (2021) 76. Rawat, R., Zodape, M.: URLAD (URL attack detection)-using SVM. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 2(1) (2012) 77. Rawat, R., Dangi, C.S., Patil, J.: Safe guard anomalies against SQL injection attacks. Int. J. Comput. Appl. 22(2), 11–14 (2011) 78. Dhariwal, S., Rawat, R., Patearia, N.: C-Queued Technique against SQL injection attack. Int. J. Adv. Comput. Sci. 2(5) (2011) 79. Rawat, R., Patearia, N., Dhariwal, S.: Key Generator based secured system against SQLInjection attack. Int. J. Adv. Res. Comput. Sci. 2(5) (2011) 80. Rawat, R., Mahor, V., Chirgaiya, S., Garg, B.: Artificial cyber espionage based protection of technological enabled automated cities infrastructure by dark web cyber offender. In: Intelligence of Things: AI-IoT Based Critical-Applications and Innovations, pp. 167–188. Springer, Cham (2021). 81. Rawat, R., Garg, B., Mahor, V., Chouhan, M., Pachlasiya, K., Telang, S.: Cyber threat exploitation and growth during COVID-19 times. In: Kaushik, K., Tayal, S., Bhardwaj, A., Kumar, M. (eds.) Advanced Smart Computing Technologies in Cybersecurity and Forensics, pp. 85–101. CRC Press, Boca Raton (2021). https://doi.org/10.1201/9781003140023-6 82. Mahor, V., Rawat, R., Kumar, A., Chouhan, M., Shaw, R. N., Ghosh, A.: Cyber warfare threat categorization on CPS by dark web terrorist. In: 2021 IEEE 4th International Conference on Computing, Power and Communication Technologies (GUCON), pp. 1–6. IEEE (2021) 83. Mahor, V., Rawat, R., Telang, S., Garg, B., Mukhopadhyay, D., Palimkar, P.: Machine Learning based Detection of Cyber Crime Hub Analysis using Twitter Data. In: 2021 IEEE 4th International Conference on Computing, Power and Communication Technologies (GUCON), pp. 1–5. IEEE (2021)

A Survey on Interoperability Issues at the SaaS Level Influencing the Adoption of Cloud Computing Technology

Gabriel Terna Ayem1, Salu George Thandekkattu1, and Narasimha Rao Vajjhala2(B)

1 American University of Nigeria, Yola, Nigeria
{gabriel.ayem,george.thandekkattu}@aun.edu.ng
2 University of New York Tirana, Tirana, Albania
[email protected]

Abstract. The study's conclusions are two-fold. First, a 3-way generic solution mechanism for the challenges, other than interoperability, that bedevil cloud adoption is recommended and advanced. Second, a content-analysis desk-search survey of the common solutions proposed by different researchers for cloud computing interoperability, with special emphasis on SaaS service-level interoperability, is given; the surveyed initiatives are structured around 6 key solution types. The analysis of the 6 key solution types advanced by researchers favors the standardization solution type, which accounts for 40% of the implementation initiatives, and the model-driven solution type, which shows the highest share of author implementations at 70%. More work in this area is needed to test and measure the effectiveness of the initiatives adopted in the survey in a single and broader context, so as to determine their efficacy under the SaaS service-level model of cloud interoperability. Keywords: Cloud computing · Interoperability · SaaS

1 Introduction

Cloud computing has revolutionized the way services and businesses are run on computing infrastructure. Products and services involving computing and IT infrastructure, such as deployment, servicing and updates, will no longer remain the same in the global world of business with the coming of the cloud. As businesses' demands for computing services, computing infrastructure, specialist personnel and computing packages needed to carry out their operations effectively increase, the cloud computing phenomenon promises a reduction in, and a trade-off among, all of the resources needed to run their businesses smoothly [1–4]. Despite the huge potential and gains offered by cloud computing, many business owners are still dragging their feet when it comes to accepting cloud services, which are not without their challenges [4]. Cloud computing challenges exist, ranging from customer data security, cost, and service-level agreement (SLA) issues to semantic heterogeneity, lack of standardization across platforms, and interoperability.


These issues occur at the different cloud computing service deployment models [Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS)] and pose great concerns and apprehensions for many business owners who are skeptical of the cloud computing phenomenon. Interoperability, which appears to be a significant issue in the cloud computing world, can be seen as the ability of two or more separate organizations or systems to integrate, work together and transfer data and information among themselves [1, 2]. Thus, in cloud computing specifically, interoperability can be defined as the ability to move and connect services offered by different cloud service providers (CSPs), irrespective of their location and of heterogeneity in the environment and in the software and hardware components deployed [3–5]. The greatest success in interoperability is found at the IaaS deployment level, where cloud computing functionalities are largely equivalent and tend to be uniformly established amongst shared components; for instance, the Cloud Data Management Interface (CDMI) and similar specifications seem to be de-facto standards in the cloud computing world [6]. At the PaaS deployment level, few standard interfaces for interoperability exist, but a handful of open-source standards help to complement the available ones [7]. The most significant challenges lie at the SaaS level, as very few standard APIs are available for interoperability compared with the other levels of cloud deployment. The rest of this work is divided into the research questions, a review of extant literature, and a comparative analysis of surveys on cloud-service interoperability. The research questions guiding this study are:
1. What general issues, and their solutions, exist in extant literature that affect cloud computing?
2. What extant surveys are available for interoperability issues and their solutions at the different cloud deployment model levels?

2 Literature Review

2.1 Cloud Computing Characteristics, Services and Deployment Models

2.1.1 Cloud Computing Characteristics
Many definitions and characterizations of the cloud computing phenomenon exist; here we examine it through the prism of the National Institute of Standards and Technology (NIST) [1, 2]. NIST outlines five characteristics associated with the cloud computing phenomenon, as follows:

i. On-demand self-service: cloud computing services are provisioned at the demand of a client or customer, without human intervention, at any given time and place and without necessarily meeting the cloud service provider (CSP). Once the necessary agreements are met and fees are paid, clients can expand or reduce their services on demand and have their requests met at any given point in time.


ii. Broad network access: all cloud services are provisioned over a network, making the network the livewire of the cloud phenomenon, without which there can be no cloud. The network can be a local area network (LAN) or a wide area network (WAN), both internet based.
iii. Multi-tenancy and resource pooling: many customers or clients can share in the multi-tenancy arrangement of one CSP, with different or similar terms and conditions and with different cloud computing demands and costs across the different clients. This is akin to different households sharing one building, each customer having an apartment with a similar or different number of bedrooms, kitchens and toilets [3]. Resource pooling, on the other hand, involves the CSP pooling the same cloud computing resources in order to satisfy the different clients' cloud computing needs as they are demanded [4].
iv. Rapid elasticity and scalability: the cloud's ability to rapidly scale services up or down as clients demand them is very significant. A customer can elect at any time to make an extra demand on cloud resources from the CSP, and the services can instantly be scaled up to accommodate the request without the customer having to make a new agreement; the customer only pays on demand for the requested services for their duration. Similarly, a client can opt out of any subscribed cloud service, and the CSP can rapidly scale down and reduce the client's cost commensurately with the terminated service, again without a fresh agreement [5].
v. Measured service: this is the cloud computing mechanism that ensures clients pay for the services they consume. Every CSP provides clients with a pay-as-you-use metering system that enables users to pay exactly for the services demanded. This mechanism is especially significant for small businesses that cannot afford a private cloud infrastructure, as the daily, weekly or monthly charges for the services used may not impact their finances negatively; hence they get value for their money. However, big organizations that utilize pay-as-you-use services may lose out in the long run, as the cumulative cost may be huge, and they may be better off with a private cloud infrastructure [6]. (A small illustrative billing sketch follows this list.)
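The sketch below illustrates the pay-as-you-use metering idea behind the measured-service characteristic. The resource names and unit rates are hypothetical, not any provider's actual price list.

```python
# Illustrative pay-as-you-use billing sketch; rates and resource names are hypothetical.
RATES = {"vm_hours": 0.05, "storage_gb_month": 0.02, "egress_gb": 0.09}  # USD per unit

def monthly_bill(usage):
    """Charge exactly for the metered units consumed in the month."""
    return sum(RATES[resource] * units for resource, units in usage.items())

print(monthly_bill({"vm_hours": 720, "storage_gb_month": 100, "egress_gb": 50}))  # 42.5 USD
```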

2.1.2 Cloud Service Models
There are three cloud computing service models:

i. SaaS: the Software-as-a-Service model affords clients services and applications that can be accessed over the internet and the web without the clients having to worry about maintenance, management or security issues. As the name implies, the services are largely software services, such as e-mail and field-solution services, which clients request on demand and use, leaving the responsibility for management, security and maintenance with the CSP. SaaS is a multi-billion-dollar business, projected to exceed 117 billion USD after 2022 [5–7].


ii. PaaS: the Platform-as-a-Service model is a go-between of SaaS and IaaS. This cloud-based model offers developers, both beginners and advanced, the opportunity to build or customize their software without necessarily using an Integrated Development Environment (IDE), which is usually time consuming and expensive. PaaS has the smallest projected market share, standing at 27 billion USD by 2022 according to [7]. Apache Stratos, MS Azure and Google App Engine, among others, are examples of PaaS [4–7].
iii. IaaS: the Infrastructure-as-a-Service model provides organizations or clients with services such as networking facilities, storage, processing power and virtualization, among others, on a pay-as-you-use basis, leaving security and maintenance with the CSP. Amazon and GCP are some examples [8–14].

2.1.3 Cloud Computing Deployment Models
Similarly, four basic types of cloud deployment model exist: public, private, community, and hybrid. The public cloud is open and is owned and controlled by individuals, governments or organizations; subscription comes with little or low fees to keep maintenance of the cloud going. The private cloud is owned and controlled by a private company or firm. The community cloud is owned by a community of users, such as an association of marine engineers or doctors. The hybrid cloud describes a cloud system that is jointly owned; for example, joint ownership of a community and a private cloud would be regarded as a hybrid cloud [8, 9].

2.2 Common Cloud Computing Issues

i. Data security issues: the security of data in the cloud is a big challenge, as a plethora of data security issues, such as virus attacks, hacking and phishing, is of great concern. Since the maintenance of clients' data is transferred to a CSP, a third party, the possibility of the CSP abusing this privileged position is real and calls for great concern among stakeholders [15].
ii. Deployment of the right cloud model: among the four deployment models, viz. public, private, community and hybrid, intending customers must examine their needs and resources and determine which one best suits them. This can pose a great challenge to a lot of clients.
iii. The challenge of getting a CSP with a suitable service-level agreement (SLA): finding the right CSP, with an SLA that protects the client, is sometimes challenging, as a CSP whose SLA aligns with the client's policies and offers a friendly compensation framework is difficult to come by. Hence, this is a major setback for would-be or current consumers of the cloud.
iv. Cost challenge: the billing cost of the cloud can seem small, since it is usually on a pay-as-you-use basis; however, when accumulated and considered over a long period, it becomes a challenge. This billing cost may not affect small companies as such, but it has the propensity to impose huge costs on big organizations or corporations [16].


v. Lack of cloud expertise in organizations: many organizations that want to go the way of the cloud are skeptical because they lack the qualified, requisite experts who can handle this infrastructure when it is deployed, or advise them accordingly on the workings of the cloud as it relates to their operations and services. Hence, this has become a big clog in the wheel of progress for the implementation of cloud technology in many organizations [10].
vi. Connectivity issues: cloud technology is usually deployed over the internet, so there are bound to be connectivity issues at some point. An organization whose operations and activities cannot tolerate a break in connectivity may be skeptical about deploying cloud technology, as such a break may cost it millions of dollars in kind or in cash [11].
vii. Migration issues: established organizations that want to migrate to the cloud are usually confronted with the question of what to migrate and what not to migrate. This can pose a big challenge for old organizations, although it may not be a problem for small and new organizations that start their business on the cloud [17].

2.3 General Solutions to Cloud Issues

The work of [18–20] projected a three-way general solution to some of the cloud technology issues, as follows.

i. The preventive solution mechanism: users or would-be users of cloud technology are advised to have a preventive solution mechanism that keeps them from falling prey to some of the issues or challenges posed by the cloud computing phenomenon. The services of experts and consultants should be engaged to help them prevent or mitigate the challenges, as this type of mechanism can save them many ills and unnecessary costs that may come with the challenges presented earlier.
ii. The detective solution mechanism: similarly, organizations that deploy cloud technology are advised to put in place a detective solution mechanism that is able to detect an issue as it happens, as this can go a long way towards mitigating or truncating the impact the issue may have on the business or organization.
iii. The corrective solution mechanism: finally, cloud adopters are advised to put in place a corrective solution mechanism that enables their business on the cloud infrastructure to revert to its original form as soon as possible after it has been attacked, to preclude the halting of business activities and a concomitant loss of revenue.

Almost all issues, challenges, and concerns posed by cloud computing, from deployment to use, can be addressed by the three solution mechanisms named and discussed above. Many propositions on how to resolve interoperability challenges in cloud computing exist, and a plethora of them are still in the making; hence the list of solutions presented in this review is not conclusive.


Future work is needed, especially at the individual service-model (SaaS, PaaS and IaaS) levels, to ascertain the best workable solution at each level.

3 Results

In this section, we employed the content-analysis desk-search survey methodology to ascertain some of the interoperability solutions advanced by scholars at the SaaS level, and we identified 6 key interoperability solution types at the SaaS level from a survey of some 25 papers. Table 1 below summarizes the solution types and the initiatives and examples used by each paper, alongside the authors.

Table 1. Interoperability solution types deployed, with the initiatives used for each solution at the SaaS level.

| Solution type | Initiatives & examples | Authors |
|---|---|---|
| Standardization: the process of developing and implementing technical standards for devices or components, based on agreement cutting across different organizations, interest groups, and governments | Distributed Management Task Force (DMTF), Cloud Standards Coordination (ETSI), Advancing Storage and Information Technology (SNIA), Cloud Audit, Cloud Security Alliance (CSA), TC CLOUD, Object Management Group (OMG) and IEEE | Armbrust, Fox, Griffith, Joseph, Katz, Konwinski, Lee, Patterson, Rabkin and Stoica [22, 26]; Grozev and Buyya [23]; Harsh, Dudouet, Cascella, Jegou and Morin [24]; Nogueira, Moreira, Lucrédio, Garcia and Fortes [21] |
| Abstraction layers: abstracting common features of each CSP and advancing a high-level layer so that business operations are more platform independent | Projects RESERVOIR, SLA@SOI and CSAL | Harsh, Dudouet, Cascella, Jegou and Morin [24]; Nogueira, Moreira, Lucrédio, Garcia and Fortes [21] |
| Open protocols: this solution focuses on establishing protocols for communication amongst various platforms | OCCI, DeltaCloud, Amazon EC2 and VMware vCloud | Harsh, Dudouet, Cascella, Jegou and Morin [24]; Grozev and Buyya [21]; Nogueira, Moreira, Lucrédio, Garcia and Fortes [21] |
| Open APIs: protocols and tools employed in the building of software applications; using APIs that encourage interoperability is advocated | jclouds, GoGrid, OpenStack, Simple Cloud and CloudLoop | Harsh, Dudouet, Cascella, Jegou and Morin [24]; Hogan, Liu, Sokol and Tong [25, 27]; Nogueira, Moreira, Lucrédio, Garcia and Fortes [21, 26] |
| Model-driven approaches: developing applications based on a common interoperability framework | Reusability framework for SaaS, Computation Independent Model (CIM), Platform Independent Model (PIM) and Platform Specific Model (PSM) | Kleppe, Warmer, Warmer and Bast [26]; France and Rumpe [27]; Sharma, Sood and Sharma [28, 23]; Hogan, Liu, Sokol and Tong [25]; Ardagna, Di Nitto, Mohagheghi, Mosser, Ballagny, D'Andria, Casale, Matthews, Nechifor and Petcu [29]; Kächele, Spann, Hauck and Domaschka [30, 22] |
| Service-Oriented Architecture (SOA)-based solutions: interoperability solutions advanced on the basis of Service-Oriented Architecture for cloud interoperability | Cloud4SOA in PaaS; Cloud4SOA in algorithms & semantics | Zeginis, D'andria, Bocconi, Cruz, Martin, Gouvas, Ledakis and Tarabanis [31]; Petcu [21]; Nogueira, Moreira, Lucrédio, Garcia and Fortes [21] |

3.1 Chart Analysis of the Survey on SaaS-Level Interoperability

The bar chart in Fig. 1 shows the result of the desk-search content-analysis survey of solution types for cloud interoperability at the SaaS level. From the chart analysis, standardization and the model-driven solution type appear to be the most applied solution types for interoperability in the cloud at the SaaS level.

Fig. 1. Shows the analysis result on the SaaS level survey

4 Conclusion and Future Research Directions

Our motivation for this work is to advance solutions from the extant literature to the cloud computing issues and challenges, especially the interoperability issues at the SaaS level that preclude consumers from wholly accepting the cloud phenomenon. A plethora of extant literature, specifically surveys on interoperability, exists, albeit many of them proffered more generic solutions that cut across all the service level models [20–27, 29, 31].


Thus, we surveyed solutions that looked specifically at the SaaS level, as the facts on the ground place the greatest challenges at this level. We presented a three-way generic solution mechanism for the other challenges, aside from interoperability, that bedevil cloud adoption: preventive, detective and corrective mechanism solutions are recommended. Finally, a content-analysis desk search survey methodology was employed to ascertain the common solutions advanced by different researchers for cloud interoperability issues, with particular emphasis on the SaaS service model. From the survey of 25 papers, six key solution types for the aforementioned issue were identified, viz. standardization, model-driven approaches, Open APIs, Open Protocols, Abstraction Layers, and Service-Oriented Architecture (SOA)-based solutions, alongside their variant initiatives. The results of the analysis showed that the standardization solution type has the largest number of initiatives, accounting for 40%, but with an average of 4 authors, while the model-driven solution has 4 initiatives (16%) but the highest number of implementing authors, namely 7. The rest of the solution types have on average between 3 and 4 initiatives and between 2 and 3 implementing authors. Thus, we can conclude that the most common solution advanced to bridge the lacuna of cloud interoperability at the SaaS level appears to be standardization, as authors have advanced more standardization initiatives, which are efforts toward bridging this interoperability gulf. Secondly, more authors seem to use the model-driven approach towards achieving interoperability at the SaaS level, as the analysis revealed an author implementation of 70%. Future work is needed to test and measure the effectiveness of the initiatives adopted in the survey in a single and broader context to determine their efficacy under the SaaS service level model of cloud interoperability.

References 1. Marston, S., et al.: Cloud computing — the business perspective. Decis. Support Syst. 51(1), 176–189 (2011). https://doi.org/10.1016/j.dss.2010.12.006 2. Zook, M., et al.: Ten simple rules for responsible big data research. PLoS Comput. Biol. 13(3), e1005399 (2017). https://doi.org/10.1371/journal.pcbi.1005399 3. Al-Ruithe, M., Benkhelifa, E.: Cloud data governance maturity model. Book Cloud data governance maturity model. Series Cloud data governance maturity model, Association for Computing Machinery, Article 151 (2017) 4. Dillon, T., et al.: Cloud computing: issues and challenges. In: Proceedings of 2010 24th IEEE International Conference on Advanced Information Networking and Applications, pp. 27–33 (2010) 5. Haiqing, H., et al.: The impact of buyback service for IT portfolio and standardization of products on SaaS adoption. In: Proceedings of ICSSSM11, pp. 1–4 (2011) 6. Aleem, S., et al.: Empirical investigation of key factors for SaaS architecture dimension. IEEE Trans. Cloud Comput., 1 (2019). https://doi.org/10.1109/TCC.2019.2906299 7. Yangui, S., Tata, S.: An OCCI compliant model for PaaS resources description and provisioning. Comput. J. 59(3), 308–324 (2016). https://doi.org/10.1093/comjnl/bxu132 8. Kozlovszky, M., et al.: IaaS type cloud infrastructure assessment and monitoring. In: Proceedings of 2013 36th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pp. 249–252 (2013) 9. Halili, M.K., Çiço, B.: Towards custom tailored SLA in IaaS environment through negotiation model: an overview. In: Proceedings of 2018 7th Mediterranean Conference on Embedded Computing (MECO), pp. 1–4 (2018)


10. Vajjhala, N., Ramollari, E.: big data using cloud computing - opportunities for small and medium-sized enterprises. Eur. J. Econ. Bus. Stud. 4, 129 (2016) 11. Malatpure, A., et al.: Experience report: testing private cloud reliability using a public cloud validation SaaS. In: Proceedings of 2017 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), pp. 56–56 (2017) 12. Babar, A., Ramsey, B.: Tutorial: building secure and scalable private cloud infrastructure with open stack. In: Proceedings of 2015 IEEE 19th International Enterprise Distributed Object Computing Workshop, p. 166 (2015) 13. Linthicum, D.S.: Emerging hybrid cloud patterns. IEEE Cloud Comput. 3(1), 88–91 (2016). https://doi.org/10.1109/MCC.2016.22 14. Naik, V.K., et al.: Workload monitoring in hybrid clouds. In: Proceedings of 2013 IEEE Sixth International Conference on Cloud Computing, pp. 816–822 (2013) 15. Fan, Q., Liu, L.: A survey of challenging issues and approaches in mobile cloud computing. In: Proceedings of 2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), pp. 87–90 (2016) 16. Rastogi, G., Sushil, R.: Cloud computing implementation: key issues and solutions. In: Proceedings of 2015 2nd International Conference on Computing for Sustainable Global Development (INDIACom), pp. 320–324 (2015) 17. Alatawi, S., et al.: A survey on cloud security issues and solution. In: Proceedings of 2020 International Conference on Computing and Information Technology (ICCIT-1441), pp. 1–5 (2020) 18. Grover, J., et al.: Cloud computing and its security issues—a review. In: Proceedings of Fifth International Conference on Computing, Communications and Networking Technologies (ICCCNT), pp. 1–5 (2014) 19. Behl, A.: Emerging security challenges in cloud computing: an insight to cloud security challenges and their mitigation. In: Proceedings of 2011 World Congress on Information and Communication Technologies, pp. 217–222 (2011) 20. Martell, A., Smith, R., Motekaitis, R.: National Institute of Standard and Technology. NIST. Critically Selected Stability Constants of Metal Complexes (2004) 21. Novkovic, G.: Five characteristics of cloud computing (2017). Accessed 2021, https://www. controleng.com/articles/five-characteristics-of-cloud-computing/ 22. AlJahdali, H., et al.: Multi-tenancy in cloud computing. In: 2014 IEEE 8th International Symposium on Service Oriented System Engineering. IEEE (2014) 23. Zhu, Z., et al.: FPGA resource pooling in cloud computing. IEEE Trans. Cloud Comput. 9, 610–626 (2018) 24. Shukla, A., Simmhan, Y.: Toward reliable and rapid elasticity for streaming dataflows on clouds. In: 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS). IEEE (2018) 25. Satyanarayana, S.: Cloud computing: SAAS. Comput. Sci. Telecommun. 4, 76–79 (2012) 26. Brodkin, J.: Gartner: seven cloud-computing security risks. InfoWorld 2008, 1–3 (2008) 27. Hsu, P.-F., Ray, S., Li-Hsieh, Y.-Y.: Examining cloud computing adoption intention, pricing mechanism, and deployment model. Int. J. Inf. Manage. 34(4), 474–488 (2014) 28. Haiqing, H., Xiaoxin, Z., Jin. Z.: The impact of buyback service for IT portfolio and standardization of products on SaaS adoption. In: ICSSSM11. IEEE (2011) 29. Ray, D.: Cloud adoption decisions: benefitting from an integrated perspective. Electron. J. Inf. Syst. Eval. 19(1), 3–21 (2016) 30. Osmani, L., et al.: Secure cloud connectivity for scientific applications. IEEE Trans. Serv. Comput. 
11(4), 658–670 (2015) 31. Brodkin, J.: Gartner: seven cloud-computing security risks. Infoworld 2008, 1–3 (2008)

Human Recognition Based Decision Virtualization for Effecting Safety-as-a-Service Using IoT Enabled Automated UV-C Sanitization System

Ananda Mukherjee, Bavrabi Ghosh, Nilanjana Dutta Roy(B), Arijit Mandal, and Pinaki Karmakar

Department of Computer Science and Engineering, Institute of Engineering and Management, Kolkata, India
[email protected]

Abstract. To meet the increasing demand for disinfection in public places across the country, we came up with the idea of a disinfectant car. The goal of this work is to serve public healthcare sectors by disinfecting large areas with Ultraviolet-C (UV-C) lights. Emerging technologies like the Internet of Things (IoT) and robotics have been recognized as promising ways to tackle the current challenges and achieve this goal. To avoid human touch, the overall process, including movement of the robotic car, is controlled by a mobile application.

Keywords: Internet of things · Surveillance and disinfectant car · Ultraviolet-C (100–280 nm) tubes · Mobile application

1 Introduction

The pandemic which broke out due to Coronavirus has made the whole world come to a standstill for over a year. The healthcare sector has faced tremendous pressure and challenges in terms of catering to a huge number of Covid-affected patients at a time. Many have lost their lives because there was a dearth of healthcare facilities. Hospitals were overpopulated and doctors were working multiple shifts at a stretch and yet could not tackle the situation. The spread of Coronavirus was happening due to multiple reasons, and the only way to control the spread was rapid sanitization of public places and public transport. To do so, a huge amount of manpower is needed, which, taking into account the possibility of workers getting infected from exposure in public places, is not a promising option. Rather, if we could use automated systems to do the same job in less time and in a cost-effective manner, that would help to control the spread of the virus. It would in turn somewhat slow down the death toll being caused by this deadly virus. Our primary motivation for the work is to reduce human involvement in the process as much as possible. Therefore the disinfectant car which will be bearing the proposed system will not need human intervention while maneuvering in public places.


The system has a camera installed which will take live video of the place which needs to be sanitized. This live video will be processed by the proposed system, which in turn will give instructions to the disinfectant car to reach the target place and sanitize it with UV-C light. Considering its power in deactivating the DNA of micro-organisms, UV-C light was the best possible choice. Maximum effort has been made to make sure the entire system is built at low cost so that it can be made available to common people at an affordable price. It could also serve society with a Safety-as-a-Service solution [3]. The main contributions of the work can be summarised as follows:
– Design and development of a micro-controller based car along with attached Ultraviolet-C (100–280 nm) tubes.
– Development of a new mobile application suitable for Android devices to control the process.
Voice-to-text conversion and generation of a pretrained machine learning model suitable for the classification of input images into appropriate classes are to be implemented in future. The organization of the rest of the paper is as follows: Sect. 2 covers the literature review. The complete methodology has been defined in Sect. 3. Results and future plan are shown in Sect. 4 and Sect. 5 draws the final conclusion.

2 Related Work

Against the worldwide distress caused by COVID-19, an automated hand sanitization and temperature monitoring system has been proposed in [1]. With the motivation to develop a contactless device, this paper proposes a cost-effective model which has the capacity to handle masses and monitor the temperature of users at a rapid rate of 10 s per person. If a person is detected to have a high temperature, a record of the same is forwarded to the health authority and the police to prevent the person from entering public places. This way contact tracing can also be done. The system is small in size, portable and easy to mount, with an overall cost of Rs. 8500 only. As the hand of the user comes in close proximity to the system, it is detected by an ultrasonic sensor. The system immediately dispenses 5–6 ml of alcohol-based sanitizer on the hand of the user. The temperature of the user is measured using an MLX90164 IR thermometer, which measures body temperature without contact. If a high temperature is detected, an alarm rings and an alert light glows. This person can be a potential Coronavirus carrier and therefore his/her details can be uploaded to the database by the security personnel present at the entry point using a smartphone. The data is transferred to the database using a NodeMCU. The controlling component of the device is a PCB/Perf board. Programming of the NodeMCU is done using C++. During practical usage it has been found that the system brings down the cost of sanitization per person to Rs. 0.66, which is very affordable. Another approach has been made in the same context to automate the process of sanitization and temperature monitoring in [4].


The proposed model consists of a room occupancy limiter, contactless temperature sensor, automatic hand sanitizer, oximeter, heartbeat rate sensing module, camera and motion detection module, social distancing module, and report generator. The system starts with a room capacity limiter. While a person is entering the room, they have to fill in a Google form bearing their details. Once the name and phone number are recorded, the form generates a unique QR code for the user. Also, since COVID symptoms are dynamic, monitoring only temperature and heartbeat along with oxygen level is not enough. So the Google form has check boxes against nausea, body ache, etc. If a person has any of those symptoms, they will check the boxes indicating true. When a person is trying to enter a room, he/she will need to place their hand below an ultrasonic sensor. As the reading is taken, a servo motor rotates, letting the gate open by 90°, and the user is allowed to enter. A count is kept, and as soon as the occupancy of the room reaches the highest limit, the next person is barred from entering the room. At the exit point another servo motor is used to open the door, therefore keeping the count exact and again letting the next person enter at the checkpoints. As the sensors check the temperature and oxygen level, the readings are cross-checked against threshold values (for temperature, between 97 °F and 99 °F, and for oxygen level, 95%). Any abnormality recorded will lead to the person being marked as a potential carrier of the virus, and immediately his/her picture will be forwarded to the health workers for contact tracing. All the records are saved in a spreadsheet and any abnormality recorded for an individual is highlighted in red. The hand sanitizer module of the system is operated using an ultrasonic sensor, a servo motor, and an Arduino. The sensor detects the hand and, when it is found within range, triggers the servo motor to rotate and let the sanitizer liquid flow out. This system also helps to monitor whether social distancing is being maintained within the room. The monitoring system consists of an ultrasonic sensor, an Arduino, and a buzzer. Overall, the system will find use in real-time applications with the advantage of being automated and therefore contactless.
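As a rough, illustrative sketch of the entry-screening rule summarized above (not code taken from the surveyed paper), the following Python snippet checks a visitor's readings against the stated thresholds (temperature between 97 °F and 99 °F, oxygen level of 95%) and any ticked symptom boxes, then marks potential carriers; all function and field names here are assumptions.

# Illustrative sketch of the entry-screening rule described above.
TEMP_LOW_F = 97.0
TEMP_HIGH_F = 99.0
SPO2_MIN = 95

def screen_visitor(name, temperature_f, spo2, symptoms):
    """Return a record marking whether the visitor is a potential carrier.

    `symptoms` is the set of check-boxes ticked on the entry form
    (e.g. {"nausea", "body ache"}); any ticked box counts as abnormal.
    """
    abnormal = (
        temperature_f < TEMP_LOW_F
        or temperature_f > TEMP_HIGH_F
        or spo2 < SPO2_MIN
        or bool(symptoms)
    )
    return {
        "name": name,
        "temperature_f": temperature_f,
        "spo2": spo2,
        "symptoms": sorted(symptoms),
        "potential_carrier": abnormal,   # would be highlighted in red in the spreadsheet
    }

if __name__ == "__main__":
    print(screen_visitor("visitor-1", 98.2, 97, set()))
    print(screen_visitor("visitor-2", 100.4, 93, {"body ache"}))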

3 Methodology

3.1 Proposed Pipeline

The proposed system is designed such that the sanitizer car will move on terrains and plains with an inbuilt camera. This camera will capture live video which will be transferred to a connected smartphone. Further instructions for sanitization will be provided by the smartphone to the sanitizer car, thereby minimizing human involvement on the spot. At the very beginning, a connection needs to be set up between the smartphone and the system. A username and password are used to connect to the hotspot of the smartphone, which in turn connects with the sanitizer car. Once the connection is established, an IP address becomes visible on the smartphone. Entering the IP address in the browser leads to the control dashboard, which helps to direct the movement of the car based on the live video captured by the camera. The working of the system can be seen in the block diagram below (Fig. 1).


Fig. 1. Block diagram of the proposed system.


As the object to be sanitized is detected by the ESP32 camera, the live video is forwarded to the smartphone, which was connected earlier via the Wi-Fi module using the mobile hotspot. As soon as the user detects the position of the object, a direction is given from the control dashboard to maneuver the L298N motor driver. This driver platform is fitted with the UV-C tube light; the car then moves near the target object to sanitize it. Each module has a specific function to perform in this system. Ultraviolet light has been chosen because it has the property of deactivating the DNA of bacteria, viruses and pathogens. Specifically, UV-C light damages the nucleic acid of different micro-organisms by forming covalent bonds with adjacent bases in the DNA. This stops the process of replication and reproduction; if the micro-organism tries to replicate, it dies. The UV-C light source in the system is a pre-installed tube (Fig. 2a). For driving the car, we use an L298N based motor driver module (Fig. 2b), which is a high-power motor driver. It can control up to 4 DC motors, or 2 DC motors with direction and speed control. The DC motor (Fig. 2c) used in the system is a 150 RPM single-shaft BO motor. It is combined with a 69 mm diameter wheel for plastic gear motors (Fig. 2d).

Fig. 2. The components of the system: (a) UV-C light tube; (b) L298N motor driver; (c) DC motor; (d) wheels; (e) ESP32 camera.


The target objects to be detected are located through the live feed captured by the camera module pre-installed on the sanitizer car. The ESP32-CAM is a very small camera module with the ESP32-S chip. This camera has an inbuilt Wi-Fi module which transfers information from the camera to the smartphone.
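As a rough illustration of how the smartphone side could drive the car over this Wi-Fi link, the sketch below sends HTTP requests to the dashboard served at the car's IP address. The IP address and the endpoint names (/forward, /left, /stop) are purely hypothetical placeholders, since the paper only states that the car is driven from a browser-based control dashboard and does not document its API.

import time
import requests  # third-party HTTP client, used here only for illustration

CAR_IP = "192.168.43.120"          # assumption: IP shown on the phone after pairing
BASE_URL = f"http://{CAR_IP}"

def send_command(command, duration_s=1.0):
    """Send a movement command (hypothetical endpoint), let it run briefly, then stop."""
    requests.get(f"{BASE_URL}/{command}", timeout=2)
    time.sleep(duration_s)
    requests.get(f"{BASE_URL}/stop", timeout=2)

if __name__ == "__main__":
    # Drive towards the target spotted in the live video, then stop.
    send_command("forward", duration_s=2.0)
    send_command("left", duration_s=0.5)
    send_command("forward", duration_s=1.0)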

4 Result

In this section, we showcase the model developed as per the proposed system. The developed prototype is able to move in any direction and is controlled by a mobile application (Fig. 3d). Its side view is shown in Fig. 3a, and the internal circuit and bottom part are shown in Figs. 3b and 3c respectively. The UV-C light has been mounted over the robotic car and is free to move in any direction.

Fig. 3. Some glimpses of the developed system: (a) the side view of the sanitizer car; (b) internal circuit of the sanitizer car; (c) the bottom part; (d) the control dashboard of the mobile application.


There is a controller to stop the car when the presence of a human body is detected through the attached camera. A comparative analysis is shown in Table 1; it shows the differences between the proposed system and another model present in the market across multiple aspects.

Table 1. Comparative analysis of the proposed sanitizer car with an existing model.

Portable Disinfectant Device [2]: type of disinfectant used: UV-C light and liquid sanitizer; human involvement needed to operate: needed; can the system be automated in future: no comments/not sure; can the system be controlled by a mobile application: no.

Sanitizer Car: type of disinfectant used: UV-C light; human involvement needed to operate: not needed; can the system be automated in future: yes; can the system be controlled by a mobile application: yes.

5 Conclusion and Future Work

The sole purpose of this work is to sanitize a huge surface area in less time with less human effort. The full system is automated and is controlled by a mobile application. The system has been implemented as a prototype model and it is working well within a closed area till now. One may fix the time for sanitization depending on the size of the area; the model will be turned off as soon as the work is done after that interval. The system's performance is remarkable in an empty room. However, being a robotic car, it may also recognize the presence of human beings and identify other obstructions, and a stable Wi-Fi or mobile hotspot connection will ensure its best performance. Our future plan is to recognize the human body, since UV-C light is harmful to humans. This will be done by video processing, embedded with an alarm to trigger an alert; the UV-C light will be turned off as soon as the alarm is generated. Also, a floor map of the building may be fed into the system for tracking. It will send an update to the centralized server for tracking, which may generate an updated picture of the sanitized area. The system has been implemented to ensure safety from micro-organisms in less time and with less human effort.
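The planned human-detection safeguard could take a shape similar to the following Python sketch: a control loop that switches the UV-C tube off and raises an alarm whenever a (future) video-processing step reports a person in the frame. The detection, relay and alarm functions are hypothetical stand-ins, since that part of the system is still future work.

import time

def set_uvc_light(on):
    # Placeholder for the relay driving the UV-C tube (hardware not modelled here).
    print("UV-C light", "ON" if on else "OFF")

def sound_alarm():
    # Placeholder for the planned audible alert.
    print("ALARM: person detected, UV-C switched off")

def sanitization_loop(detect_person, run_seconds=600, poll_interval=0.5):
    """Keep the UV-C tube on for a fixed interval, unless a person is detected.

    `detect_person` is a callable returning True when the (future)
    video-processing model finds a human in the camera frame.
    """
    deadline = time.time() + run_seconds
    set_uvc_light(True)
    try:
        while time.time() < deadline:
            if detect_person():
                sound_alarm()
                return False          # stopped early for safety
            time.sleep(poll_interval)
        return True                   # full interval completed
    finally:
        set_uvc_light(False)          # the light is always switched off at the end

if __name__ == "__main__":
    # Dummy detector that never sees a person, so the loop runs to completion.
    sanitization_loop(detect_person=lambda: False, run_seconds=2)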

References
1. Samal, P., et al.: Automated sanitization device-hand sanitization. J. Today's Ideas-Tomorrow's Technol. 8(1), 27–33 (2020)
2. Kumar, D., Sonawane, U., Gohil, M.K., et al.: Design and development of a portable disinfectant device. Trans. Indian Natl. Acad. Eng. 5, 299–303 (2020)
3. Roy, C., Roy, A., Misra, S., Maiti, J.: Safe-aaS: decision virtualization for effecting safety-as-a-service. IEEE Internet Things J. 5(3), 1690–1697 (2018)
4. Anjali, K., Anand, R., Prabhu, S.D., Geethu, R.S.: IoT based smart healthcare system to detect and alert covid symptom. In: 2021 6th International Conference on Communication and Electronics Systems (ICCES), pp. 685–692. IEEE (2021)

Blockchain Technology and its Applications

Security Optimization of Resource-Constrained Internet of Healthcare Things (IoHT) Devices Using Asymmetric Cryptography for Blockchain Network

Varsha Jayaprakash1 and Amit Kumar Tyagi2,3(B)

1 School of Electronics Engineering, Vellore Institute of Technology, Chennai, India
[email protected]
2 School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India
[email protected]
3 Centre for Advanced Data Science, Vellore Institute of Technology, Chennai, Tamilnadu, India

Abstract. The term "Internet of Things" is becoming increasingly popular and promising, ushering in a new era of smarter connectivity across billions of gadgets. In the foreseeable future, IoT's potential is boundless. The healthcare industry, often known as IoHT, is the most demanding application of IoT. Any healthcare-based IoT system starts with sensors, RFID, and smart tags, all of which are limited in terms of resources. When these devices are integrated, there is a significant need for safe information transformation, since they carry sensitive patient information that might be extremely dangerous if it falls into the hands of an unauthorized person. The internet of things based on blockchain is a new technology that combines the benefits of both blockchain and cryptography to protect data at the physical layer. It is lightweight compared to other traditional approaches and, as the name implies, does not compromise security levels. This paper explains how to safeguard data using elliptic curve cryptography for a blockchain network to encrypt and store the data. The proposed method is evaluated on the basis of energy and memory efficiency, as well as latency.

Keywords: Security · IoHT · Sensors · RFID · Resource constraints · Blockchain · Elliptic curve cryptography · Latency · Efficiency

1 Introduction

The Internet of Things has become the most widely used term in the world today. It is a technical concept that entails practical devices such as sensors and actuators that are used to collect real-time data, convey that data over the internet, and store it on cloud-based platforms with or without human participation [1–3]. In 1999, Kevin Ashton coined the term "Internet of Things" to promote the usage of radio frequency-based identification (RFID), which involves a variety of embedded devices. With the advent of home automation, industrial energy meters, and wearable and self-health care devices in 2011, the tremendous expansion of IoT-based devices began [4].


The healthcare industry is a significant contributor to the overall number of IoT-enabled devices in the world. The advent of IoHT allows patients to self-assess their body states while concurrently uploading these data to the hospital's server, allowing doctors to keep track of patients' health problems and schedule exams and visits only as needed, saving both money and time [5, 6]. However, the widespread adoption of this technology has resulted in a slew of concerns and challenges relating to patient data protection. Data protection is required at three layers in any IoHT device: physical/design, communication, and computation. The devices are further categorized as resource-rich (phones, tablets, computers) and resource-constrained (sensors, RFID) devices (Fig. 1) [7]. Resource-constrained devices are often used to deal with real-time applications that require accurate processing of data. In addition, they are limited in terms of power consumption, available memory and computation speed [8, 9].

Fig. 1. Categorization of IoT devices

In most countries, the authentic information contained in healthcare data is required to be protected under the "Health Insurance Portability and Accountability Act (HIPAA)" [9]. Efficient and safe implementation of these healthcare systems can be achieved by using optimized and robust security systems [10]. Due to the mutual distrust in the environment, gadgets will be unable to operate independently and will need to be secured and authenticated in order to function effectively. Blockchain, a distributed ledger-based authentication system, can be used to keep data comprising many types of health information secure and private. Consistency of stored data is achieved using a consensus algorithm. The entire framework is distributed between nodes so that even the failure of a few nodes will not tamper with the data or allow new attacker nodes to enter the system [11]. Whenever peer-to-peer communication takes place, a public key cryptographic algorithm is used for authentication between IoT devices. However, to lower the barrier to implementing and using blockchain in these cases, a lightweight blockchain consensus should be established [12]. It makes any network decentralized by improving security. Each device in a blockchain network has a unique device ID, a private key and a hash of critical data for authentication validation before access is given to the IoT device.


Asymmetric cryptography such as elliptic curve cryptography or SHA-256 is used to encrypt and decrypt the data after authentication, before storing it in the blocks [13].

Organization of the Work: The paper is organized as follows: Sect. 2 gives a literature survey of related research, followed by the motivation in Sect. 3. Section 4 describes the challenges in designing a secure IoT system and Sect. 5 provides solutions for the security of resource-constrained IoHT devices. Section 6 describes the blockchain mechanism employed for secure transfer of health data. Section 7 explains the working principle and the algorithm of the elliptic curve cryptographic algorithm used for encryption before storage of data in the blocks. Section 8 discusses the simulation results. Section 9 provides an insight into the future scope of this research, followed by the conclusion.

2 Related Work

A lightweight blockchain architecture for healthcare database management was proposed by Leila Ismail and Huned Materwala [14]. The network participants are divided into demographic clusters, with one copy of the ledger maintained per cluster. Forking is avoided by using a Head Blockchain Manager to handle transactions. The proposed method outperforms the traditional Bitcoin network in terms of the network traffic generated and computation speed. An autonomous solution to store user credentials without depending on a TTP was proposed by Daniel Maldonado-Ruiz [15]. The method developed was coined "Three Blockchains Identity Management with Elliptic Curve Cryptography (3BI-ECC)". Their system ensures full integrity with secure identity and communication infrastructures through the presence of specific identity blockchains in the network, and the system is more transparent to the user. Utsav Banerjee and Anantha P. Chandrakasan [16] developed a variant of the elliptic curve cryptography algorithm known as pairing-based cryptography (PBC). It uses bilinear maps between elliptic curves and finite fields, which enables it to be used in novel applications other than secure key exchange. Chips were fabricated involving these secure algorithms, with the pairing crypto core occupying 112k NAND Gate Equivalents (GE) and 16 KB of SRAM, proving it to be an efficient and low-power architecture. Dipankar Dasgupta, John M. Shrein and Kishor Datta Gupta [17] discussed the strength, power consumption, security level, complexity and vulnerability of various cryptographic algorithms used in blockchain in the healthcare domain, such as the NIST P-256 curve, DSA, ECDSA and cryptographic hashes. They consider ECC to be susceptible to side channel, fault and timing attacks, while SHA-256 is considered unbreakable according to them. They also discussed some new research trends in blockchain and the new vulnerabilities that could occur in future. Then, Chiu C. Tan, Haodong Wang, Sheng Zhong and Qun Li [18] developed a lightweight identity-based cryptography scheme for body sensor networks that manages security, privacy and accessibility for healthcare monitoring, and tested it on commercially available sensors. Simulation results showed that the proposed method performs computation faster than other sensor platforms but suffers from slow query performance compared to other ciphers. Further, an efficient and secure authentication scheme for IoT healthcare devices based on RFID tags and card readers using elliptic curve cryptography (ECC) was proposed by Davood Noori [19].


This authentication system can be used for safe communication of information regarding patients' health. The proposed algorithm proves to be efficient in terms of complexity due to the lower computation cost and fewer multiplication rounds in ECC compared to other techniques. An FPGA acceleration of ECC operations using binary Edwards curves was implemented by Carlos Andres Lara-Nino and Arturo Diaz-Perez [20]. The method takes advantage of the scalar point multiplication property of the ECC algorithm. Results show that the proposed technique uses only 1400 slices of a Virtex-5 FPGA to provide a security strength of up to 128 bits.

3 Motivation

Healthcare is one of the fastest sectors to adapt to the changes made in IoT-based systems. "MarketsAndMarkets predicts that IoHT will be worth US$ 163.2B, a commercial report claims a spending of $117B, and McKinsey estimates an economic impact of more than US$ 170B" [21]. The development of e-health systems for electrocardiography, electroencephalography, diabetes, etc. can save costs and help patients suffering from chronic diseases reduce the number of hospital visits [22]. Also, the outbreak of the COVID-19 pandemic has created a fear in the minds of people, refraining them from visiting hospitals where they could potentially contract the virus. This has enabled the IoHT sector to grow exponentially, and it will continue to bloom for the next few years. People are now looking for safer and less expensive ways to maintain and monitor their health. Due to the increased number of users, it has become an attractive sector for hackers. Hence, it is important to develop IoT-based systems with enhanced security that enable safe transfer and computation of patients' data. Security can be achieved by various methods like cryptography, blockchain technology, and machine learning techniques like supervised, unsupervised and reinforcement learning [23]. This paper focuses on a blockchain-based network using elliptic curve cryptography to protect the data at the sensing/physical layer of any IoT-based system.

4 Concerns and Challenges in Implementation of Cryptographic Techniques to Resource Constrained IoHT Devices

From physical sensors to computer servers, any IoT network incorporates a wide variety of platforms. This creates a slew of new difficulties for consumers, including privacy, security, compatibility, scalability, and interoperability [23]. IoT devices are a particularly appealing target for hackers because they interact directly with the actual environment to collect sensitive data [24]. The most common attacks experienced at this stage include eavesdropping, replay attacks, node capture attacks, side channel attacks, etc. These devices can potentially be physically damaged in addition to being tapped to gather the sensitive data they hold. As a result, cyber security is required, which is regarded as a key problem in the implementation of authentication, data security, availability, privacy, and accessibility [25]. The approach used to protect sensitive data is entirely dependent on the surroundings. The proposed approach must be appropriate and highly secure for an IoT device's application layer, but it must be developed in such a way that it does not interfere with the device's normal operations.


Fig. 2. Challenges in implementation of cryptography

Because these devices are resource constrained, traditional PC cryptography approaches do not fit this group. Figure 2 illustrates that the cryptographic technique used to preserve this information must be designed keeping in mind the limitations of the device. The major challenges include [26]: low computation power, lower energy, reduced availability of space due to smaller size, reduced memory space (ROM and RAM), lower power, and the need for faster execution time.

5 Solutions to Enhance Security in the Physical Layer of IoHT Devices

The main characteristics to be taken into consideration while choosing the right cryptographic technique are cost, performance and security level. Performance can further be divided into subsections such as energy and power consumption, latency, computation speed, memory occupation, and resilience to different attack models such as linear and differential attacks, side channel attacks and fault injection attacks [27]. Most of the above-mentioned concerns are addressed by lightweight cryptography (LWC) techniques with simple keys and a smaller number of rounds [1]. Cryptographic techniques are categorized as symmetric and asymmetric based on the number of keys. A symmetric cryptographic technique uses the same key for encryption as well as decryption, whereas an asymmetric technique uses a pair of private and public keys [28]. In symmetric block ciphers the encryption and decryption processes take place continuously. Many asymmetric algorithms such as ECC, as well as ciphers such as ChaCha20, are widely used today for both lightweight and high-end security systems [29]. Asymmetric techniques enable two parties to share information by generating a common secret key without actually revealing either party's private key [30].


6 Blockchain Based IoHT Devices

Blockchain is a public ledger used for secure and consistent transactions by anonymous users termed miners. Only the first miner who solves the proof of work (PoW) gets rewarded with bitcoins, which ultimately leads to the extension of the blockchain network. The verified transactions are stored in the body of the blockchain using a structure called the Merkle tree, and the blocks are linked together [31]. This decentralized characteristic of blockchain removes the need for authorities to monitor transactions and makes the system more secure, fair and unbiased [32].

As work is now trending towards an online mode of processing, the healthcare sector is not spared from this trend. With the increasing demand for data collection, processing and storage in electronic, digital format, the data being transferred across the network is equally vulnerable, and this opens a gate for hackers to easily trap the information being transferred if it is not secured and encrypted. Blockchain is an advanced cryptographic technology that supports and ensures the secured transmission of electronic data, and it is emerging in the healthcare industry very rapidly as healthcare becomes a vital part of our lives [33]. With healthcare automation, all healthcare services are interconnected electronically for storing and retrieving patients' health information and for monitoring and investigating healthcare details through different media. Blockchain technology helps in revolutionizing conventional healthcare practices towards a more secure, personalized, reliable and efficient mode of practice in terms of data sharing, drug traceability, clinical trials and pharmaceutical supply chain management, due to its flexible, trusted, shared and reliable architecture. Integration of blockchain technology with IoHT is gaining potential to address and overcome data vulnerabilities due to its (i) scalable and decentralized architecture, (ii) ability to develop applications without cloud or server dependencies, (iii) dependability and traceability of IoHT data, (iv) ability to provide secured data transmission between IoHT devices, (v) transparency of data, and (vi) reliability of services [34] (refer to Fig. 3). Several healthcare companies use IoHT to provide services such as patient monitoring, asset management, and inventory tracking. Due to enhanced transparency and security in communication, the introduction of blockchain characteristics in the healthcare industry provides confidence and dependability in data and services to all parties participating in this segment. A lot of research is still being done to optimize and address the challenges of data standardization and regulation, integration with existing healthcare systems, the establishment of Blockchain-IoHT policies, and improvement of latency and throughput for huge healthcare data, and to prove that healthcare practices implemented using blockchain technology are safe and reliable before large-scale implementation.
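To make the hash-linking and proof-of-work ideas above concrete, the following minimal Python sketch chains blocks of (already encrypted) health records with SHA-256 and mines each block against a small difficulty target. It is only a toy illustration of the general mechanism, with made-up record strings, and not the lightweight consensus the section calls for.

import hashlib
import json
import time

DIFFICULTY = 4  # number of leading zero hex digits required (toy proof-of-work target)

def block_hash(block):
    # Hash the block contents deterministically with SHA-256.
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def mine_block(prev_hash, records):
    """Find a nonce so that the block hash meets the difficulty target."""
    block = {
        "timestamp": time.time(),
        "records": records,        # e.g. encrypted IoHT readings (placeholders here)
        "prev_hash": prev_hash,
        "nonce": 0,
    }
    while not block_hash(block).startswith("0" * DIFFICULTY):
        block["nonce"] += 1
    block["hash"] = block_hash({k: v for k, v in block.items() if k != "hash"})
    return block

def verify_chain(chain):
    """Each block must reference the previous block's hash and satisfy the target."""
    for prev, curr in zip(chain, chain[1:]):
        recomputed = block_hash({k: v for k, v in curr.items() if k != "hash"})
        if curr["prev_hash"] != prev["hash"] or recomputed != curr["hash"]:
            return False
        if not curr["hash"].startswith("0" * DIFFICULTY):
            return False
    return True

if __name__ == "__main__":
    genesis = mine_block(prev_hash="0" * 64, records=["genesis"])
    chain = [genesis, mine_block(genesis["hash"], ["enc(patient-1 readings)"])]
    print("chain valid:", verify_chain(chain))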


Fig. 3. Advantages of blockchain technology

7 Elliptic Curve Cryptography for Private Key Generation

Elliptic curve cryptography is an asymmetric, public key-based encryption technique based on the algebraic structure of an elliptic curve over a finite field; it offers a high level of security with a smaller key size compared to other existing techniques. It was first proposed by Victor Miller and Neal Koblitz in the year 1985 [35]. The private and public key pairs are obtained from the elliptic curve equation defined over a finite field, given by:

y^2 = (x^3 + ax + b) mod p    (1)

where p is any prime number and a and b are constants such that 4a^3 + 27b^2 ≠ 0. The users agree on a common elliptic curve equation to generate private and public key pairs. Each of them uses the other user's public key to generate a new set of secret keys for encryption purposes. Since this uses a common key based on the users' private keys, it is difficult for any intruder to tap the message. The elliptic curve can be defined over any type of numbers, namely rational, real or complex [36]. The elliptic curve itself lacks a straightforward encryption mechanism. Instead, ECC is used to generate a common secret key through Elliptic Curve Diffie-Hellman (ECCDH), and the data is then encrypted in a hybrid cryptography system [37]. ECCDH, on the other hand, is vulnerable to man-in-the-middle attacks [38]. After generating a secret key based on mutual acceptance, the key can be used to encrypt data using any symmetric or asymmetric technique, such as ChaCha20, AES-GCM, RSA, and so on. Confidentiality, authentication, and non-repudiation are all guaranteed via the key exchange method [39]. This paper discusses the implementation of ECC to generate private and public secret keys and develops a mechanism to encrypt the data using the Ron Rivest, Adi Shamir, and Leonard Adleman (RSA) algorithm involving the secret keys generated using ECC. The algorithms of both ECC and RSA are discussed below.
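To make Eq. (1) and the key-generation step concrete, here is a small, self-contained Python sketch over a deliberately tiny curve (y^2 = x^3 + 2x + 2 mod 17 with base point G = (5, 1)); the curve, base point and key range are toy assumptions chosen only for readability, nowhere near the parameter sizes used in practice. It implements point addition and double-and-add scalar multiplication so that the relation pubKey = privKey * G and the shared-secret agreement used in Sect. 7.1 can be traced by hand.

import secrets

# Toy elliptic curve y^2 = x^3 + A*x + B (mod P); all parameters are illustrative only.
P = 17
A, B = 2, 2                      # note: 4*A**3 + 27*B**2 = 140, which is nonzero mod 17
G = (5, 1)                       # base point; the subgroup it generates has order 19
ORDER = 19

def inv_mod(x, p=P):
    # Modular inverse via Fermat's little theorem (p is prime).
    return pow(x, p - 2, p)

def point_add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                                   # P + (-P) = point at infinity
    if p1 == p2:
        s = (3 * x1 * x1 + A) * inv_mod(2 * y1) % P   # tangent slope (doubling)
    else:
        s = (y2 - y1) * inv_mod(x2 - x1) % P          # chord slope
    x3 = (s * s - x1 - x2) % P
    y3 = (s * (x1 - x3) - y1) % P
    return (x3, y3)

def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

def make_keypair():
    priv = secrets.randbelow(ORDER - 1) + 1           # private key in [1, ORDER - 1]
    return priv, scalar_mult(priv, G)                 # public key = privKey * G

if __name__ == "__main__":
    alice_priv, alice_pub = make_keypair()
    bob_priv, bob_pub = make_keypair()
    # Both sides compute the same shared point without revealing their private keys.
    shared_a = scalar_mult(alice_priv, bob_pub)
    shared_b = scalar_mult(bob_priv, alice_pub)
    assert shared_a == shared_b
    print("shared ECC secret point:", shared_a)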


7.1 Algorithm for ECC Based Secret Key Generation

EncryptionKey(pubKey) --> (sharedECCKey, ciphertextPubKey)
• Generate ciphertextPrivKey = new random private key.
• Calculate ciphertextPubKey = ciphertextPrivKey * G.
• Calculate the ECDH shared secret: sharedECCKey = pubKey * ciphertextPrivKey.
• Return both the sharedECCKey and ciphertextPubKey. Use the sharedECCKey for symmetric encryption, and use the randomly generated ciphertextPubKey to calculate the decryption key later.

DecryptionKey(privKey, ciphertextPubKey) --> sharedECCKey
• Calculate the ECDH shared secret: sharedECCKey = ciphertextPubKey * privKey.
• Return the sharedECCKey and use it for the decryption.

7.2 RSA

RSA is one of the most famous algorithms, used even today in digital signatures and blockchain networks. It involves the use of both public and private keys to encrypt and decrypt the information. The difficulty in retrieving the plaintext back from the ciphertext depends on the massive product of two large prime numbers [40].

RSA Encryption Algorithm
• Input: RSA public key (n, e); plaintext m
• Output: ciphertext c
• Begin
• 1. Compute c = m^e mod n
• 2. Return c
• End

RSA Decryption Algorithm
• Input: public key (n, e); private key d; ciphertext c
• Output: plaintext m
• Begin
• 1. Compute m = c^d mod n
• 2. Return m
• End
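For completeness, a textbook-style Python sketch of the RSA steps above is shown next; in the paper's pipeline, the ECC-derived secret would feed into the encryption stage, but the example is kept standalone here. The primes are tiny toy values chosen only so the numbers stay readable, and the padding schemes required in real deployments are omitted.

from math import gcd

def make_rsa_keypair(p=61, q=53, e=17):
    """Generate a toy RSA key pair from two small primes (illustrative only)."""
    n = p * q                      # modulus, part of both keys
    phi = (p - 1) * (q - 1)        # Euler's totient of n
    assert gcd(e, phi) == 1        # e must be coprime to phi
    d = pow(e, -1, phi)            # private exponent: d = e^-1 mod phi (Python 3.8+)
    return (n, e), d

def rsa_encrypt(public_key, m):
    n, e = public_key
    return pow(m, e, n)            # c = m^e mod n

def rsa_decrypt(public_key, d, c):
    n, _ = public_key
    return pow(c, d, n)            # m = c^d mod n

if __name__ == "__main__":
    public_key, d = make_rsa_keypair()
    message = 65                   # plaintext must be an integer smaller than n
    ciphertext = rsa_encrypt(public_key, message)
    recovered = rsa_decrypt(public_key, d, ciphertext)
    print(ciphertext, recovered)   # recovered == 65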

8 Results and Discussion

The ECC algorithm is used to generate the secret key by generating random public keys, and the obtained secret key for encryption and decryption is then used as the key for the RSA algorithm of symmetric encryption. The algorithms were implemented in Python and the results are presented below. For encryption and decryption, two distinct secret keys, namely public and private key pairs, are formed, as shown in Fig. 4. As demonstrated in Fig. 5, the public key pair is also utilized as a key for symmetric encryption using the RSA technique to encrypt and decrypt data. The efficiency of the algorithms will be investigated in the future by implementing them in a lightweight blockchain network.


Fig. 4. Secret key generation using ECC

Fig. 5. Encryption and decryption of data using symmetric cipher

9 Conclusion and Future Work

This paper examines the software implementation of the ECC algorithm for generating secret keys for data encryption in an IoHT blockchain network. The challenges of low-scale embedded device security, as well as various methods to solve them, were examined. The use of blockchain in IoHT was also discussed, and it was implemented by using ECC to generate a secret key and the RSA cipher to encrypt data and update the data on the blocks. These algorithms are also planned to be implemented in a lightweight blockchain network in the future, with the goal of determining their efficiency in terms of latency, memory usage, and throughput.

9.1 Future Work

IoT applications are growing rapidly day by day, and as most industries are moving towards IoT, energy consumption is one of the main constraints of the IoT world [3, 41]. The limited computational capabilities and resource constraints make these devices a vulnerable target for hackers [3]. IoHT is a field that is widely in use now. It deals with millions of patients' health information, which needs to be secured in order to prevent misuse. In future, lightweight cryptographic encryption in IoHT can improve the security level of the system and help to make the devices more efficient and secure. Lightweight block ciphers are efficient in both hardware as well as software. Asymmetric block ciphers, stream ciphers, hash functions and elliptic curve functions are other available techniques which have a high potential to be employed in these devices to secure patients' information in IoHT [42–44].


A lightweight blockchain network for secure authentication of patients' data can be developed that uses the ECC algorithm to encrypt the data before it is updated on the ledger. Further, the hardware performance can be studied by implementing these techniques in real-time embedded systems, ARM-based microprocessors and dedicated integrated circuits which are widely used in the IoHT industry, to observe various parameters like circuit footprint, throughput, latency, and energy and power consumption in order to design ultra-low power IoT devices.

Acknowledgement. We thank the Centre for Advanced Data Science and School of Computer Science and Engineering, Vellore Institute of Technology, Chennai for providing an opportunity and their kind support to proceed with the research work on time.

References 1. Thakor, V.A., Razzaque, M.A., Khandaker, M.R.A.: Lightweight cryptography algorithms for resource-constrained IoT devices: a review, comparison and research opportunities. IEEE Access 9, 28177–28193 (2021). https://doi.org/10.1109/ACCESS.2021.3052867 2. Gubbi, J., Buyya, R., Marusic, S., Palaniswami, M.: Internet of things (iot): a vision, architectural elements, and future directions. Futur. Gener. Comput. Syst. 29(7), 1645–1660 (2013) 3. Singh, S., Sharma, P.K., Moon, S.Y., et al.: Advanced lightweight encryption algorithms for IoT devices: survey, challenges and solutions. J. Ambient Intell. Human Comput. (2017) 4. Tawalbeh, L., Muheidat, F., Tawalbeh, M., Quwaider, M.: IoT privacy and security: challenges and solutions. Appl. Sci. 10(12), 4102 (2020) 5. Gupta, S., Cherukuri, A.K., Subramanian, C.M., Ahmad, A.: Comparison, analysis and analogy of biological and computer viruses. In: Tyagi, A.K., Abraham, A., Kaklauskas, A. (eds.) Intelligent Interactive Multimedia Systems for e-Healthcare Applications, pp. 3–34. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-6542-4_1 6. Kute, S.S., Tyagi, A.K., Aswathy, S.U.: Security, privacy and trust issues in internet of things and machine learning based e-healthcare. In: Tyagi, A.K., Abraham, A., Kaklauskas, A. (eds.) Intelligent Interactive Multimedia Systems for e-Healthcare Applications, pp. 291–317. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-6542-4_15 7. McKay, K., Bassham, L., Turan, M.S., Mouha, N.: Report on Lightweight Cryptography (Nistir8114). Gaithersburg, MD, USA: NIST (2017) 8. Toshihiko, O.: ‘Lightweight cryptography applicable to various IoT devices.’ NEC Tech. J. 12(1), 67–71 (2017) 9. Ullah, A., Sehr, I., Akbar, M., Ning, H.: FoG assisted secure De-duplicated data dissemination in smart healthcare IoT. In: 2018 IEEE International Conference on Smart Internet of Things (SmartIoT) , pp 166–171. IEEE (2018) 10. Butpheng, C., Yeh, K.-H., Xiong, H.: Security and privacy in IoT-cloud-based e-health systems—A comprehensive review. Symmetry 12(7), 1191 (2020) 11. Li, D., Peng, W., Deng, W., Gai, F.: A blockchain-based authentication and security mechanism for IoT. In: 2018 27th International Conference on Computer Communication and Networks (ICCCN), pp. 1–6 (2018). https://doi.org/10.1109/ICCCN.2018.8487449 12. Berdik, D., Otoum, S., Schmidt, N., et al.: A survey on blockchain for information systems management and security. Inf. Process. Manag. 58(1), 102397 (2021) 13. Alexopoulos, N., Daubert, J., Mühlhäuser, M., et al.: Beyond the hype: on using blockchains in trust management for authentication, pp. 546–5532017


14. Ismail, L., Materwala, H., Zeadally, S.: Lightweight blockchain for healthcare. IEEE Access 7, 149935–149951 (2019). https://doi.org/10.1109/ACCESS.2019.2947613 15. Maldonado-Ruiz, D., Torres, J., El Madhoun, N.: “3BI-ECC: a decentralized identity framework based on blockchain technology and elliptic curve cryptography. In: 2020 2nd Conference on Blockchain Research & Applications for Innovative Networks and Services (BRAINS), pp. 45–46 (2020). https://doi.org/10.1109/BRAINS49436.2020.9223300 16. Banerjee, U., Chandrakasan, A.P.: A low-power elliptic curve pairing crypto-processor for secure embedded blockchain and functional encryption. In: 2021 IEEE Custom Integrated Circuits Conference (CICC), pp. 1–2 (2021). https://doi.org/10.1109/CICC51472.2021.943 1552 17. Dasgupta, D., Shrein, J.M., Gupta, K.D.: A survey of blockchain from security perspective. J. Bank. Finan. Technol. 3(1), 1–17 (2018). https://doi.org/10.1007/s42786-018-00002-6 18. Tan, C.C., Wang, H., Zhong, S., Li, Q.: IBE-lite: a lightweight identity-based cryptography for body sensor networks. IEEE Trans. Inf. Technol. Biomed. 13(6), 926–932 (2009). https:// doi.org/10.1109/TITB.2009.2033055 19. Noori, D., Shakeri, H., Niazi Torshiz, M.: Scalable, efficient, and secure RFID with elliptic curve cryptosystem for Internet of Things in healthcare environment. EURASIP J. Inf. Secur. 2020(1), 1–11 (2020). https://doi.org/10.1186/s13635-020-00114-x 20. Lara-Nino, C.A., Diaz-Perez, A., Morales-Sandoval, M.: Lightweight elliptic curve cryptography accelerator for internet of things applications. Ad Hoc Netw. 103, 102159 (2020) 21. Rodrigues, J.J.P.C., et al.: Enabling technologies for the Internet of Health things. IEEE Access 6, 13129–13141 (2018). https://doi.org/10.1109/ACCESS.2017.2789329 22. Khairuddin, A.M., Azir, K.N.F.K., Kan, P.E.: Limitations and future of electrocardiography devices: a review and the perspective from the internet of Things. In: International Conference on Research and Innovation in Information Systems, pp. 1–7 (2017) 23. Xiao, L., Wan, X., Xiaozhen, L., Zhang, Y., Di, W.: IoT security techniques based on machine learning: how do IoT devices use AI to enhance security? IEEE Signal Process. Mag. 35(5), 41–49 (2018). https://doi.org/10.1109/MSP.2018.2825478 24. Banafa, A.: Three major challenges facing IoT. IEEE IoT Newslett. (2017). https://iot.ieee. org/newsletter/march2017/three-major-challenges-facing-iot.html 25. Feng, W., Qin, Y., Zhao, S., Feng, D.: AAoT: Lightweight attestation and authentication of low-resource things in IoT and CPS. Comput. Netw. 134, 167–182 (2018). https://doi.org/10. 1016/j.comnet.2018.01.039 26. Mohd, B.J., Hayajneh, T., Vasilakos, A.V.: A survey on lightweight block ciphers for lowresource devices: comparative study and open issues. J. Netw. Comput. Appl. 58, 73–93 (2015) 27. Singh, S., Sharma, P.K., Moon, S.Y., Park, J.H.: Advanced lightweight encryption algorithms for IoT devices: survey, challenges and solutions. J. Ambient Intell. Hum. Comput. 4, 1–18 (2017) 28. Stallings, W.: Book: Cryptography and Network Security: Principles and Practice (2017) 29. Bhardwaj, I., Kumar, A., Bansal, M.: A review on lightweight cryptography algorithms for data security and authentication in IoTs. In: Proceedings of 4th International Conference Signal Processing, Computer Control (ISPCC), pp. 504–509 (2017) 30. Uppu, R., et al.: Asymmetric cryptography with physical unclonable keys. Quant. Sci. Technol. 4(4), 045011 (2019) 31. 
Tyagi, A.K., Aswathy, S.U., Aghila, G., Sreenath, N.: AARIN: affordable, accurate, reliable and innovative mechanism to protect a medical cyber-physical system using blockchain technology. IJIN 2, 175–183 (2021)


32. Bhutta, M.N.M., et al.: A survey on blockchain technology: evolution, architecture and security. IEEE Access 9, 61048–61073 (2021). https://doi.org/10.1109/ACCESS.2021.307 2849 33. Naqvi, R., Aslam, M., Iqbal, M.W., Shahzad, S.K., Malik, M., Tahir, M.U.: Study of block chain and its impact on Internet of Health Things (IoHT): challenges and opportunities. In: 2020 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), pp. 1–6 (2020). https://doi.org/10.1109/HORA49412.2020.9152846 34. Sharma, A., Kaur, S., Singh, M.: A comprehensive review on blockchain and Internet of Things in healthcare. Trans. Emerg. Telecommun. Technol. 32(10), e4333 (2021). https://doi. org/10.1002/ett.4333 35. Singh, L.D., Singh, K.M.: Implementation of text encryption using elliptic curve cryptography. Procedia Comput. Sci. 54, 73–82 (2015) 36. Gueron, S., Krasnov, V.: Fast prime field elliptic-curve cryptography with 256-bit primes. J. Cryptogr. Eng. 5(2), 141–151 (2014). https://doi.org/10.1007/s13389-014-0090-x 37. Diffie, W., Hellman, M.E.: New directions in cryptography. IEEE Trans. Inform. Theory IT-22, 644–654 (1976) 38. Johnston, A.M., Gemmell, P.S.: Authenticated key exchange provably secure against the man-in-middle attack. J. Cryptol. 15(2), 139–148 (2002) 39. Mehibel, N., Hamadouche, M.: A new approach of elliptic curve Diffie-Hellman key exchange. In: 2017 5th International Conference on Electrical Engineering - Boumerdes (ICEE-B), pp. 1–6 ( 2017). https://doi.org/10.1109/ICEE-B.2017.8192159 40. Chandel, S., Cao, W., Sun, Z., Yang, J., Zhang, B., Ni, T.-Y.: A multi-dimensional adversary analysis of RSA and ECC in blockchain encryption. In: Arai, K., Bhatia, R. (eds.) FICC 2019. LNNS, vol. 70, pp. 988–1003. Springer, Cham (2020). https://doi.org/10.1007/978-3030-12385-7_67 41. Dhanda, S.S., Singh, B., Jindal, P.: Lightweight cryptography: a solution to secure IoT. Wirel. Pers. Commun. 112(3), 1947–1980 (2020). https://doi.org/10.1007/s11277-020-07134-3 42. Tibrewal, I., Srivastava, M., Tyagi, A.K.: Blockchain technology for securing cyberinfrastructure and Internet of Things networks. In: Tyagi, A.K., Abraham, A., Kaklauskas, A. (eds.) Intelligent Interactive Multimedia Systems for e-Healthcare Applications, pp. 337–350. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-6542-4_17 43. Tyagi, A.K., Nair, M.M., Niladhuri, S., Abraham, A.: Security, privacy research issues in various computing platforms: a survey and the road ahead. J. Inf. Assur. Secur. 15(1), 1–16 (2020) 44. Madhav, A.V.S., Tyagi, A.K.: The world with future technologies (Post-COVID-19): open issues, challenges, and the road ahead. In: Tyagi, A.K., Abraham, A., Kaklauskas, A. (eds.) Intelligent Interactive Multimedia Systems for e-Healthcare Applications, pp. 411–452. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-6542-4_22

Preserving Privacy Using Blockchain Technology in Autonomous Vehicles

Meghna Manoj Nair1 and Amit Kumar Tyagi1,2(B)

1 School of Computer Science and Engineering, Vellore Institute of Technology, Chennai Campus, Chennai, Tamilnadu 600127, India
[email protected]
2 Centre for Advanced Data Science, Vellore Institute of Technology, Chennai, Tamilnadu 600127, India

Abstract. With transportation and automation being some of the major fields of interest in current research, Autonomous Vehicles (AVs) and the Intelligent Transportation System (ITS) have been taking the spotlight, especially in the current era where mobilization and automation are proliferating. One of the major fields of research and development in the transportation field is the networking of various smart vehicles for enhanced data acquisition and automated driving. This paper discusses one of the major challenges commonly found in developments in this field, namely privacy preservation. Privacy is a term which describes the safety and security of the information retrieved from users for computation. When developing applications which involve automation and networking, privacy is one of the major parameters to be considered, because there are high chances of attacks, data breaches and leakages. This paper discusses a possible solution of integrating blockchain technology with the systemic framework for encrypting and hashing the data and securing the information from any sort of mishaps and glitches. The paper also gives insights on ITS, AVs and the Internet of Vehicles (IoV) and how blockchain can be integrated into the ITS framework.

Keywords: Privacy · Blockchain · Autonomous vehicles · Intelligent transportation system · Encryption · Security

1 Introduction

Automation and autonomous systems have been taking over many fields in recent years, especially the transportation sector, where intelligent and smart network systems are taking shape. Autonomous vehicles are automated vehicular systems that perform the required actions and take decisions on their own, without human interference. This is made possible by a variety of sensors and devices that sense the surrounding environment and trigger the necessary actions. These driverless systems exploit fully automated vehicular technology so that the vehicle responds to every situation much as a human driver would.
Autonomous vehicles themselves are classified into levels, in addition to the various categories of interconnected transportation networks. There are mainly six levels of automation, as listed below [1] (restated in the short code sketch just before Sect. 1.1):

• Level 0: the human driver is completely in charge of the decisions and controls of the car.
• Level 1: an Advanced Driver Assistance System (ADAS) supports the human driver with either steering or speed control.
• Level 2: the ADAS can handle both steering and speed control in most conditions; however, it requires the physical presence and attention of the driver at all times to carry out the remaining tasks when needed.
• Level 3: the Advanced Driving System (ADS) can perform automated driving in most conditions, but the human driver must regain control whenever the system cannot make an effective decision.
• Level 4: the ADS can perform the driving task independently, without human intervention, in some situations and conditions.
• Level 5: the ADS is fully automated and executes all tasks independently, without any human assistance.

In most cases, 5G technology is integrated with the network to establish connections between the vehicles on the road and the other sensory devices in the transportation sector. Beyond the technologies used inside individual vehicles, extensive research is being conducted on frameworks capable of interconnecting vehicles with one another, with ground stations, with On-Board Units (OBUs), and so on, paving the way for an Intelligent Transportation System (ITS). This not only greatly increases mobility but also establishes stable communication techniques such as Vehicle-to-Vehicle (V2V) communication, providing larger coverage for interconnection [2]. As shown in Fig. 1, an ITS is a framework combining users/drivers, roads and traffic control systems, and vehicles, integrated using advanced communication technologies and the Internet of Things (IoT). Each component of such a system works toward the common goals of ensuring safety, improving traffic efficiency, convenience, and mobility, and strengthening industrial adoption. Figure 2 describes the working scheme and the interaction of the various nodes involved in an ITS. Road Side Units (RSUs) are an integral part of any ITS: they sense and collect information about vehicles and other nodes and transmit it to the control centres and authorities. GPS is used to track the location and position of the vehicles, and the vehicles interact with each other through V2V communication, vehicle-to-roadside communication, and inter-roadside communication.


Fig. 1. ITS framework and objectives

Fig. 2. Working scheme and node interaction of AVs and ITS
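
The six automation levels listed above can be restated as a simple enum for reference. This is only an illustrative sketch: the constant names and comments paraphrase the list and are not an official SAE designation.

// Illustrative restatement of the six automation levels described in Sect. 1.
public enum AutomationLevel {
    L0_NO_AUTOMATION,       // human driver fully in charge of decisions and controls
    L1_DRIVER_ASSISTANCE,   // ADAS helps with either steering or speed control
    L2_PARTIAL_AUTOMATION,  // ADAS handles both, but the driver must stay attentive
    L3_CONDITIONAL,         // ADS drives in most conditions, human takes over on request
    L4_HIGH_AUTOMATION,     // no human intervention needed in some conditions
    L5_FULL_AUTOMATION      // fully autonomous in all conditions
}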

1.1 Autonomous Vehicles and Intelligent Transportation System

In a world increasingly shaped by automation and technological advancement, Autonomous Vehicles (AVs) are a natural fit. An AV, often called a driverless vehicle, has the technology and mechanisms needed to operate and execute functions on its own, without any human interference. The vehicle is built to sense its surroundings and manoeuvre itself with little human assistance, and many AVs incorporate feedback mechanisms so that the vehicle learns from the challenges it is exposed to while in operation. Such futuristic vehicles and automation systems form one of the major fields of research, and extensive work is being carried out on their various aspects. One of the main attractions of AVs is the set of advantages they offer, chief among them increased safety during travel and a reduced
risk of accidents. Thousands of deaths occur each year in road accidents, largely due to reckless and rash driving. AVs help ensure that road safety rules and guidelines are followed, reducing the risk of accidents to a great extent. Most importantly, AVs are a valuable aid for users who find driving difficult because of age, physical disabilities, and similar factors [13]. When a number of AVs connect and interact with each other, they are usually considered under the broader umbrella of an Intelligent Transportation System (ITS). However, like every coin with two sides, these large-scale strategies and frameworks raise serious questions about the safety and security of data, and it is essential to ensure that users' details and confidential information are not breached or leaked. Blockchain is one technology that can provide safe data transmission through a distributed ledger. The main aim of this work is to examine the security and privacy aspects of AVs and ITS [3]. Section 2 reviews existing work in this field, while Sect. 3 focuses on safety approaches and parameters for automated vehicles and their systems. Section 4 discusses the scope and importance of the Internet of Vehicles (IoV) in today's smart era from a user's perspective. Section 5 covers the issues, risks, and challenges (including security) in VANETs and ITS, Sect. 6 presents the proposed work, and Sect. 7 reports simulation results. Section 8 outlines future opportunities for intelligent vehicles, followed by the conclusion.

2 Literature Review

Autonomous vehicles being a heavily researched field, numerous authors have presented work on it through surveys, virtual experiments, and analyses. In [3], the authors propose a privacy-preservation scheme for AVs and ITS in an urban area in which the transmission of data from one node to another within the network is secured using an Emergent Intelligence (EI) technique; EI is a robust approach offering automation, versatility, and flexibility, and the scheme was implemented with the Crypto++ library. The authors of [3, 4] take a different perspective on preserving the security and privacy of AVs in ITS using the concept of grouping. Their model arranges the ITS framework into a hierarchy of layers: sensor/actuator, vehicle, AV group, and cloud. Privacy is preserved by grouping the AVs in the vehicular layer under a leader, which is the only node in the group that communicates with the RSUs. An important metric for performance analysis is the size of the group, and the main restriction is that the chosen leader must not be constrained in power, storage, or energy. A permutation scheme is used to encrypt the data transmitted from the group leader to other nodes, RSUs, and so on [5]. The authors of [6, 7] put forth an autonomous privacy-preservation mechanism in which the AVs connect with the central node/authority only once, after which the connection is renewed at regular intervals without requiring permanent contact with the central node.
The mechanism incorporates a pseudonym-style authentication framework in which the central node sends credentials to the subordinate AVs, and this central authority can revoke a user's or driver's anonymity without establishing direct contact with the vehicular node. In [8], the researchers suggest a framework that integrates the Paillier cryptosystem with the Chinese Remainder Theorem to aggregate data from various sections of the road transport system, saving bandwidth and avoiding additional authentication requirements [9]. The system is largely autonomous, as it allows users/drivers to generate the public/private key pair for their communication needs. Another proposal for privacy preservation targets Cooperative ITS (C-ITS), a transportation framework comprising numerous stations that can sense their surroundings and interact with one another. To preserve privacy, four main parameters are considered: anonymity, unlinkability, pseudonymity, and unobservability [10, 11]. These strategies keep the stations up to date with pseudonyms generated by the central node from a given pool and secure the interaction and transmission of data from nodes to local stations [12].

3 Safety Approaches for Autonomous Vehicles and Intelligent Transportation System

Guaranteeing safety to the users of a fully autonomous vehicle or an ITS framework is one of the major challenges, and it strains the conventional standards followed for similar software. One possible approach is a standard based entirely on assigning particular scopes to the safety nodes in the system. It is also important to have a feedback loop so that the system keeps learning from the new experiences and challenges it is exposed to, which in turn improves its decision-making. Conventional standards that can be adopted include:

• ISO 26262: this standard helps ensure the integrity levels of the autonomous vehicle by adopting a V-model process and addressing the software and hardware requirements at varying integrity levels. In simpler terms, it helps keep the system free of faulty designs and supports the development of mitigation plans in case of a fault.
• ISO 21448: an extension of the previous standard, used to ensure the safety of the intended functionality of the modules within the system. It covers foreseeable misuse and issues that may arise during user interaction and elaborates on the operational requirements.

Apart from the above-mentioned safety standards, there are other safety standard approaches such as IEC 61508, SAE ARP 4745A, and SAE ARP 4761; the IEC standard mainly covers the supervision and control of chemical processes, while the SAE standards cover aviation [14].


4 Role of Internet of Vehicles (IoV) in Today's Smart Era

With the Internet of Things (IoT) reaching into nearly every field, it has had a massive impact on the transportation sector as well, enhancing traditional Vehicular Ad-Hoc Networks (VANETs) through the IoV. The IoV promises innovative solutions to many problems in this space [15] and is one of the best ways to interconnect vehicles in a holistic environment [16]. In simple terms, the IoV is a distributed network that builds on the data generated by AVs, ITS, and VANETs, with the main aim of allowing interconnected vehicles to communicate and exchange data with each other and with users, drivers, ground stations, and so on [17]. Looking more closely at the IoV, the main categories of network communication include [18]:

• Intra-vehicle: systems responsible for supervising and monitoring the operation and performance of internal components, achieved through On-Board Units (OBUs).
• Vehicle-to-Vehicle (V2V): systems that complement wireless data transfer with information on the velocity and location of surrounding vehicles.
• Vehicle-to-Infrastructure (V2I): systems responsible for wireless data exchange between vehicles and the respective Road Side Units (RSUs).
• Vehicle-to-Cloud (V2C): systems that let vehicles access additional data over the Internet via suitable Application Programming Interfaces (APIs).

The IoV offers far more than a mediating platform for exchanging data between nodes of the transportation network. When integrated with AVs and ITS, it supports functions such as intelligent traffic management, dynamic information services, and intelligent vehicle control. Millions of people are injured or killed in traffic accidents and spend hours in traffic jams because of the lack of proper management systems; with the IoV, new applications and opportunities allow drivers, users, and the whole network to operate smoothly and efficiently. Furthermore, the IoV can serve as a foundation for services such as parking-spot identification, real-time traffic information, and location-based services [19]. Figure 3 portrays the architecture of the IoV, which has seven layers. The first layer contains the options that support interaction with users through user-interaction surfaces. The next layer is the data acquisition layer, responsible for collecting data from numerous components and sources, including sensors, navigation systems, and traffic control systems. It is followed by the data filtering and preprocessing layer, which analyses the collected set and filters out noisy data before passing it to the communication layer, which handles data transfer by choosing the appropriate network.
The control and management layer handles the network's service providers and takes care of the policies to be applied. The processing layer analyses the large volumes of information passed down to it, using various cloud computing structures. Finally, the security layer is responsible for authenticating and validating the data transmissions and communications that take place, and it has direct access to all the layers.

Fig. 3. Seven-layer architecture of the IoV
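
As a compact restatement of the seven layers just described, the following enum lists them in order; the constant names and comments simply paraphrase the text and are illustrative only.

// Seven-layer IoV architecture as described around Fig. 3 (illustrative sketch).
public enum IoVLayer {
    USER_INTERACTION,         // surfaces through which users/drivers interact
    DATA_ACQUISITION,         // sensors, navigation and traffic-control sources
    FILTERING_PREPROCESSING,  // removes noisy data from the collected set
    COMMUNICATION,            // chooses the appropriate network for data transfer
    CONTROL_AND_MANAGEMENT,   // service providers and network policies
    PROCESSING,               // cloud-based analysis of the collected data
    SECURITY                  // authentication/validation, with access to all layers
}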

5 Critical Challenges and Open Issues Towards Autonomous Vehicles and Intelligent Transportation System

AVs and ITS, long regarded as a distant dream, are now making their way into reality. Nevertheless, a number of challenges and issues must be overcome before they can be deployed at scale. Given the many domains AVs can affect, the challenges and bottlenecks range from the social to the technical. One challenge that stands out is the possibility of system crashes or failures caused by small bugs or errors in the system's software components [19]; such errors can ultimately lead to dangerous accidents and injuries. Further, the extensive interaction within the logical decision-making system of the autonomous vehicle means that clear boundaries must be defined between the private and public information extracted from users [20]. On the technical side, hardware components pose
significant challenges for wireless interaction and connectivity, because a fault in any device can adversely affect the entire system. Energy requirements and management are another concern: without suitable and viable energy management, the growth of AVs may be held back [21].

6 Proposed Solution

The main objective of this paper is to devise a system that preserves the privacy and information of users of AV and ITS services. Especially in the early stages of adoption, it is essential that users trust and rely on these services, and the best way to earn that trust is to provide reliable and safe service guarantees. One of the technologies in the spotlight for securing data is Blockchain. A Blockchain is similar to a database that operates on a distributed basis, shared by the various nodes and components of a network, with the data stored in digital form. One of its major applications has been cryptocurrencies such as Bitcoin, because it is one of the safest ways to protect transactions, making them resilient to attacks and data leakage. The main advantage of the Blockchain concept is that it promises security and privacy preservation for any record it maintains, without requiring or involving external parties. Although a Blockchain is often compared to a database, the two differ considerably, especially in how data is stored. In a Blockchain, records are stored in group-like structures called blocks, each with a certain capacity. Each new record is stored in a block, and the block keeps accepting records until it is full; once it reaches capacity it is closed and linked to the next block in line, ultimately producing a chain of blocks [22-24]. As shown in Fig. 4, a block not only stores the records entered into it but also carries details such as an index, the previous hash, its own hash, a timestamp, and the data. The hash and previous hash are values produced by the hash function that protects the data stored in the block; they are the outputs of the hashing algorithm used to build the Blockchain, and the hash always yields a distinct value. The timestamp records when the block was created, the data field stores the records entered, and the previous-hash field stores the hash of the preceding block so that the blocks are linked in order. The aim of the proposed solution is to integrate Blockchain with the AV and ITS frameworks, which is a feasible, viable, and secure option. All devices and components connected to the various sensor nodes need to be integrated with a client-side application, while the components of the ITS or the surrounding network framework the AVs connect to must be built in a resilient and secure way so that they withstand the major attacks. The network framework mainly contains the components integrated with the server-side application. Each time information is gathered from the sensors by the functional modules on the client side, it is hashed and stored in a block.
Fig. 4. Connected blocks and their attributes forming a Blockchain

This continues until a Blockchain is developed by chaining the blocks together based on their hash values, ensuring that the data is stored in a safe, preserved environment, especially during data transfer. The server-side components can then easily access these details for the necessary computations. Apart from the server- and client-side applications, the Blockchain is largely decentralized and is maintained and regulated by one of the trusted nodes, allowing easy access to information by the client- and server-side nodes. The users of the AVs and ITS remain in constant connection with the server, and the necessary information passing takes place. The Blockchain is open to information extraction and addition from both the client and server sides of the system. The data stored in the Blockchain is fully hashed and encrypted with suitable encryption keys; the encryption technique used must be powerful and resilient enough to seal the data against breaches and attacks. Simulations were conducted to visualize and better understand the working of a blockchain using Anders Brownworth's simulator. The dataset, pulled from Kaggle, contains the date, time, and number of vehicles spotted by an automated traffic control system. First, a single-block simulation was carried out, loading the dataset into the block. Before loading, the block is shown in red, indicating that it is not yet encrypted and is therefore vulnerable to attacks. The "Block" attribute holds the block's index number, the "Nonce" attribute stores a hashed number, and the "Data" attribute holds the loaded data. On clicking "Mine", the block turns green and the encrypted value is stored in the "Hash" field, indicating that the block is safely encrypted and resilient to attack. An important detail is that the hash values maintained in the block are constrained to start with four zeroes, followed by an 'x' and then the encrypted data. The next simulation examined the working of a Blockchain. Starting with three blocks, each block has the attributes "Block" (index number), "Nonce", "Data", "Prev", and "Hash"; except for Prev, the attributes are the same as in the first simulation.
In the Blockchain implementation, Prev holds the hash value (encrypted key) of the previous block, ensuring that the blocks are chained correctly. Initially the blocks are red; after uploading the necessary data from the dataset into the three blocks and mining them, the Blockchain turns green, showing that the data has been encrypted and the blocks have been chained together. The Prev value of the first block in the chain is always zero, as no other block precedes it. Another observation is that each time a new block is appended to the chain, the existing blocks turn red, indicating that the chain must be mined again to re-establish the proper links between blocks; only after re-mining are the blocks linked and fully encrypted. Turning to the implementation of a suitable hashing algorithm for the necessary encryption: in the Blockchain, the information stored in each block has a unique hash value, and this hash value is what links the blocks into one chain. Several cryptographic techniques exist, such as symmetric-key cryptography, SHA-256, and asymmetric-key cryptography. The hashing algorithm used in the proposed solution was implemented in a Java environment in the NetBeans editor. A class Block contains the functions getPreviousHash(), getTransaction(), and getBlockHash(), along with a constructor that initializes data members storing the block's hash value and the hash value of the previous block. Using this class, a genesis (nascent) block is generated, after which further blocks are generated and linked together. The SHA-256 cryptographic hash is used to generate the hash value for each block.
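
To make the description above concrete, the following is a minimal, self-contained Java sketch of such a Block class. It follows the method names quoted in the text (getPreviousHash(), getTransaction(), getBlockHash()), uses SHA-256, and imitates the simulator's behaviour of "mining" until the hash starts with four zeroes; all other field names, the nonce search, and the sample record are illustrative assumptions, not the authors' code.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Block {
    private final String previousHash; // hash of the preceding block ("Prev")
    private final String transaction;  // payload ("Data"), e.g. a traffic-count record
    private final long timestamp;      // creation time of the block
    private long nonce;                // varied until the hash meets the target
    private String blockHash;          // hash of this block

    public Block(String transaction, String previousHash) {
        this.transaction = transaction;
        this.previousHash = previousHash;
        this.timestamp = System.currentTimeMillis();
        this.blockHash = computeHash();
    }

    public String getPreviousHash() { return previousHash; }
    public String getTransaction()  { return transaction; }
    public String getBlockHash()    { return blockHash; }

    // SHA-256 over the block contents; any change to the data yields a new hash
    // (the avalanche effect mentioned in Sect. 7).
    private String computeHash() {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(
                    (previousHash + timestamp + nonce + transaction)
                            .getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b & 0xff));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // "Mining" as in the simulator: search for a nonce whose hash starts with 0000.
    public void mine() {
        while (!blockHash.startsWith("0000")) {
            nonce++;
            blockHash = computeHash();
        }
    }

    public static void main(String[] args) {
        Block genesis = new Block("genesis", "0"); // Prev of the first block is zero
        genesis.mine();
        Block next = new Block("2021-06-01,08:00,42 vehicles", genesis.getBlockHash());
        next.mine();
        System.out.println(genesis.getBlockHash());
        System.out.println(next.getBlockHash());
    }
}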

7 Simulation Results

The execution of the hashing algorithm in the Java environment, together with the Anders Brownworth simulations, gives a complete picture of the reliability of Blockchain as a safety and security mechanism for AVs and ITS. The simulated behaviour can be observed directly in the implemented algorithm, indicating that the Blockchains created exhibit all the features and attributes they must possess. The avalanche effect, whereby changing the data in any one block breaks the chain to all subsequent blocks until they are mined again, is also observable in the implementation; even a tiny change in the data held in a block produces a new hash value. These properties make Blockchain reliable and suitable for safety and privacy preservation in fields such as ITS. To conclude, this paper elaborates the main safety and privacy challenges of AVs and ITS, and the cryptographic techniques and hashing algorithm used in the proposed approach deliver strong results.
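
The avalanche effect mentioned above can be reproduced with a few lines of Java: hashing two records that differ in a single character yields digests that differ in almost every position, which is why every later block's Prev link breaks until the chain is re-mined. The sample record below is made up for illustration.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class AvalancheDemo {
    static String sha256(String s) throws Exception {
        byte[] d = MessageDigest.getInstance("SHA-256")
                .digest(s.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : d) hex.append(String.format("%02x", b & 0xff));
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        String original = "2021-06-01,08:00,42 vehicles";
        String tampered = "2021-06-01,08:00,43 vehicles";
        System.out.println(sha256(original));
        System.out.println(sha256(tampered)); // differs in almost every hex digit
    }
}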

8 Conclusion and Future Opportunities

AVs and ITS being vast and heavily researched domains, there are numerous avenues for future research, improvement, and development. One area worth exploring further is preserving privacy and security using mechanisms other than Blockchain. Furthermore, integrating cloud platforms and
cutting-edge techniques to enhance the implemented system holds great potential for future opportunities. Last but not least, VANETs, an associated field in transportation and vehicle automation that covers the networking and interconnection of devices in the transportation sector, offer plenty of areas for extensive research and ideation.

References 1. Sravanthi, K., Burugari, V.K., Tyagi, A.: Preserving privacy techniques for autonomous vehicles. 8, 5180–5190 (2020). https://doi.org/10.30534/ijeter/2020/48892020 2. Sodhro, A.H., et al.: Quality of service optimization in an IoT-driven intelligent transportation system. IEEE Wirel. Commun. 26(6), 10–17 (2019) 3. Chavhan, S., Gupta, D., Garg, S., Khanna, A., Choi, B.J., Hossain, M.S.: Privacy and security management in intelligent transportation system. IEEE Access 8, 148677–148688 (2020) 4. Chavhan, S., Gupta, D., Chandana, B.N., Khanna, A., Rodrigues, J.J.P.C.: ‘IoT-based context aware intelligent public transport system in a metropolitan area.’ IEEE Internet Things J. 7(7), 6023–6034 (2020) 5. Qian, Y., Chen, M., Chen, J., Hossain, M.S., Alamri, A.: ‘Secure enforcement in cognitive Internet of vehicles.’ IEEE Internet Things J. 5(2), 1242–1250 (2018) 6. Jolfaei, A., Kant, K.: Privacy and security of connected vehicles in intelligent transportation system. In: 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks–Supplemental Volume (DSN-S), pp. 9–10. IEEE (2019) 7. Sucasas, V., Mantas, G., Saghezchi, F.B., Radwan, A., Rodriguez, J.: An autonomous privacypreserving authentication scheme for intelligent transportation systems. Comput. Secur. 60, 193–205 (2016) 8. Ogundoyin, S.O.: An anonymous and privacy-preserving scheme for efficient traffic movement analysis in intelligent transportation system. Security and Privacy 1(6), e50 (2018) 9. Shaheen, S.A., Finson, R.: Intelligent transportation systems. Reference Module in Earth Systems and Environmental Sciences, pp. 1–12. Elsevier, Amsterdam (2013) 10. Zear, A., Singh, P., Singh, Y.: Intelligent transport system: a progressive review. Indian J. Sci. Technol. 9(32), 1–18 (2016). https://doi.org/10.17485/ijst/2016/v9i32/100713 11. Paillier, P.: Public-key cryptosystems based on composite degree residuosity classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48910-X_16 12. Kountché, D.A., Bonnin, J., Labiod, H.: The problem of privacy in cooperative intelligent transportation systems (C-ITS). In: 2017 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 482–486 (2017). https://doi.org/10.1109/ INFCOMW.2017.8116424 13. Krishna, A.M., Tyagi, A.K., Prasad, S.V.A.V.: Preserving privacy in future vehicles of tomorrow. JCR 7(19), 6675–6684 (2020). https://doi.org/10.31838/jcr.07.19.768 14. Koopman, P., Ferrell, U., Fratrik, F., Wagner, M.: A safety standard approach for fully autonomous vehicles. In: Romanovsky, A., Troubitsyna, E., Gashi, I., Schoitsch, E., Bitsch, F. (eds.) Computer Safety, Reliability, and Security: SAFECOMP 2019 Workshops, ASSURE, DECSoS, SASSUR, STRIVE, and WAISE, Turku, Finland, September 10, 2019, Proceedings, pp. 326–332. Springer International Publishing, Cham (2019). https://doi.org/10.1007/ 978-3-030-26250-1_26 15. Zhao, Q., Zhu, Y., Chen, C., Zhu, H., Li, B.: When 3G meets VANET: 3G-assisted data delivery in VANETs. Sensors J. 13(10), 3575–3584 (2013)


16. Campolo, C., Cozzetti, H.A., Molinaro, A., Scopigno, R.: Augmenting vehicle-to-roadside connectivity in multi-channel vehicular Ad Hoc networks. J. Netw. Comput. Appl. 36(5), 1275–1286 (2013) 17. Yang, F., Wang, S., Li, J., Liu, Z., Sun, Q.: An overview of internet of vehicles. China Commun. 11(10), 1–15 (2014) 18. Tyagi, A.K., Aswathy, S.U.: Autonomous Intelligent Vehicles (AIV): research statements, open issues, challenges and road for future. Int. J. Intell. Netw. 2, 83–102 (2021). ISSN 2666–6030. https://doi.org/10.1016/j.ijin.2021.07.002 19. Joy, J., Rabsatt, V., Gerla, M.: Internet of vehicles: Enabling safe, secure, and private vehicular crowdsourcing. Internet Technol. Lett. 1(1), e16 (2018) 20. Zhang, K., Mao, Y., Leng, S., He, Y., Zhang, Y.: Mobile-edge computing for vehicular networks: a promising network paradigm with predictive off-loading. IEEE Veh. Technol. Mag. 12(2), 36–44 (2017) 21. Jameel, F., Chang, Z., Huang, J., Ristaniemi, T.: Internet of autonomous vehicles: architecture, features, and socio-technological challenges. IEEE Wirel. Commun. 26(4), 21–29 (2019) 22. Varsha, R., Nair, M., Nair, S., Tyagi, A.: Deep learning based blockchain solution for preserving privacy in future vehicles. Int. J. Hybrid Intell. Syst. 16(4), 223–236 (2021). https:// doi.org/10.3233/HIS-200289 23. Krishna, M., Tyagi, A.K.: Intrusion detection in intelligent transportation system and its applications using blockchain technology. In: 2020 International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE), pp. 1–8 (2020). https://doi. org/10.1109/ic-ETITE47903.2020.332 24. Tyagi, A.K., Niladhuri, S.: Providing trust enabled services in vehicular cloud computing. In: Proceedings of the International Conference on Informatics and Analytics (ICIA-16), Article 3, pp. 1–10. Association for Computing Machinery, New York (2016). https://doi.org/10. 1145/2980258.2980263

Security Protocols for Blockchain-Based Access Control in VANETS

Kunal Roy(B) and Debasis Giri(B)

Maulana Abul Kalam Azad University of Technology, West Bengal, Bidhannagar, India
[email protected]

Abstract. VANETs (Vehicular Ad Hoc Networks) are modular networks made up of vehicles, roadside units, and other infrastructure that allow nodes to communicate with one another to improve road safety and traffic management. While this technology offers many advantages to drivers, it also raises security challenges that matter for the safety of road users. It is vital to ensure that only registered vehicles exchange data and that revoked vehicles can no longer participate. Many existing VANET solutions rely on a single trusted authority, which raises the computation and communication cost of the network and acts as a single point of failure. By using blockchain technology in VANETs, we can benefit from a decentralized and distributed system while avoiding a single point of failure; blockchain also ensures the legitimacy of data and increases the system's transparency. For access control in VANETs, the proposed technique uses Hyperledger Fabric, a permissioned blockchain platform. Using blockchain, vehicles are registered, verified, and revoked together with their pseudo IDs. Vehicles in the network use services supplied by roadside units with blockchain access to verify the validity of timely information generated by nearby devices. For vehicles receiving the network's safety messages, the system verifies the pseudo IDs and public keys against the blockchain, resulting in lightweight authentication and reduced computing latency. Keywords: VANET · Blockchain technology · Hyperledger Fabric

1 Introduction According to the WHO’s Worldwide status report [1] on street security 2018, 1.35 million people have died as a result of street activity mishaps each year. Street activity mishaps and wounds have also been Identified by the WHO as the the most common cause of mortality among people aged 5 to 29. People in Canada spend about 11.6 million hours each year on the road and use 23 million gallons of fuel. owing to activity obstruction, according to the Canadian Automobile Association (CAA) in 2017 [2]. The innovation of VANETs will somehow diminish street mishaps along with activity delay. Automobiles, road-side units (R.S.Us), and other frameworks may form the foundation of a VANET. Cars in the VANET network features an on-board unit (OBU) which sends status of the vehicle-member to other vehicle-members nearby. R.S.Us are foundations © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 D. Giri et al. (Eds.): ICNSBT 2021, LNNS 481, pp. 249–264, 2022. https://doi.org/10.1007/978-981-19-3182-6_20

250

K. Roy and D. Giri

elements placed at the roadside that help vehicles communicate with one another. VANETs support both comfort and safety applications [3]: safety applications include emergency warning systems, lane-changing assistance, and intersection coordination, while comfort applications include features such as weather information, fuel-station/restaurant locations, and toll information.

2 Ease of Use

2.1 Background

Figure 1 depicts the cooperation between vehicles and roadside units. A VANET [5] supports vehicle-to-vehicle (V2V), vehicle-to-infrastructure (V2I), infrastructure-to-infrastructure (I2I), and vehicle-to-everything (V2X) communication with other Internet-connected devices. The IEEE in the U.S. adopted the Dedicated Short Range Communication (DSRC) standard, which enables 11 MHz channels with a maximum transmission capacity of 5.8 GHz [3]. To enable vehicle communication, the Wireless Access in Vehicular Environments (WAVE) standards of the IEEE 1609 family provide the necessary protocols and security services [6]. The IEEE 1609.2 standard specifies procedures for securing WAVE communications as well as vehicle anonymity and privacy.

Fig. 1. Cooperation between automobiles and roadside-units

Security is one of the most difficult concerns in a VANET. It is vital to guarantee that vehicles transmit genuine, verified messages; otherwise, drivers' lives may be endangered or severe financial loss may result. Following [5], attacks in a VANET can be classified as:

• Active vs. passive – based on the method of attack: if the attacker actively performs malicious actions, the attack is active; if the attacker only silently listens to the network, it is a passive attack.
• Inside vs. outside – based on the attacker's membership: inside attacks are carried out by authenticated nodes of the network, whereas outside attacks are carried out by nodes not belonging to the network.
• Rational vs. malicious – based on the attacker's motivation: rational attackers aim to disrupt the system for their own gain, whereas malicious attackers intend to cause harm to the network.

In [5], some of the most important security requirements in a VANET are identified:

• Vehicle authentication – identifying legitimate vehicles and verifying that they are who they claim to be.
• Message integrity – ensuring that exchanged messages are not changed or altered by unauthorized parties.
• Traceability – the vehicle's identity must be recoverable for non-repudiation.
• Vehicle privacy and anonymity – ensuring that confidential information about the vehicles is not disclosed and that an attacker cannot track them.
• Access control – ensuring that the right entities have access to the right information and services on the network.

To address some of these requirements, such as vehicle authentication, we turn to blockchain technology. Blockchain was first revealed by Satoshi Nakamoto in October 2008 [7] as the basis of the Bitcoin cryptocurrency. Blockchain technology provides a shared ledger by using cryptography to store data in an immutable form, replicated across the participating nodes, and it supports smart contracts that connect to various applications. Bitcoin and Ethereum are earlier blockchain platforms; they are public (permissionless) blockchains in which the participants do not trust one another. Private (permissioned) blockchains are now available, which offer more control by restricting which nodes form the blockchain network. Blockchain brings additional advantages such as improved traceability, since every change to the blockchain can be followed easily, and improved availability, since a member whose record is corrupted by an attacker can obtain the latest ledger from the other members. The central focus of our investigation, however, is the authentication of vehicles in a VANET.

2.2 Motivation

Managing vehicle identities in a VANET is a significant challenge. Existing techniques rely on a Public Key Infrastructure (PKI) [8, 9], in which a Central Authority (CA) issues a public key and a private key to each vehicle. The private keys create digital signatures that authenticate messages sent by vehicles inside the network. Further, every vehicle is given the CA's public key as well as a certificate containing its own public key, digitally signed by the CA. A receiving vehicle first verifies the certificate's digital signature with the CA's public key and recovers the sending vehicle's public key; this public key is then used to verify the digital signature of the received information.


However, this approach does not scale well in a VANET, because each vehicle sends a basic safety message (BSM) every 100 ms [10], and some of these messages may go unverified because of the cryptographic latency incurred within the short interval in which they are most needed. Authentication based on digital signatures also carries additional processing cost for the signing and verification steps, as discussed by Liu et al. in [8], and the results in [11] show that processing messages with PKI in a VANET takes more time.

2.3 Problem Statement

The VANET is a game-changing technology that improves road safety and traffic control. However, VANETs also come with a range of security concerns that can negatively affect the system, costing money and lives. Currently, VANET authentication is handled through a public key infrastructure (PKI), which distributes public and private keys to the vehicles. Each message delivered by a vehicle includes the vehicle's digital signature and a certificate signed by the Central Authority (CA), which helps other vehicles verify the sender. In addition, each vehicle transmits a basic safety message every 100 ms [10]. The delay introduced by the PKI operations for signing and verification affects the system's performance, so using PKI to authenticate vehicles harms efficiency through increased computational cost. Blockchain is a promising direction for VANETs: it provides the foundation for a decentralized and distributed structure. Decentralization removes the need for a central authority to carry out critical activities such as enrolling vehicles, revoking pseudo identities, and updating vehicle identities, and it eliminates a single point of failure. Further, with a distributed structure there is always backup available: if a confirming member is unavailable, or is attacked and its data is lost, the other members can still provide the ledger. In this work, we propose a secure and computationally efficient authentication mechanism for validating the identities of vehicles sending messages.

3 VANET Background

Vehicular Ad Hoc Networks (VANETs) are a type of wireless ad hoc network in which vehicles are connected to one another. In this kind of network, each vehicle carries an On-Board Unit (OBU) that enables wireless cooperation with other vehicles [5], and Road Side Units (RSUs) relay data within a specified range. There are primarily four modes of communication in VANETs [5]:

• Vehicle-to-Vehicle (V2V) – sending the messages of one vehicle to another.
• Vehicle-to-Infrastructure (V2I) – a vehicle provides data to the nearby RSUs.
• Infrastructure-to-Infrastructure (I2I) – back-end services are provided by infrastructure elements communicating with one another.
• Vehicle-to-Everything (V2X) – communication that allows a vehicle to interact with other Internet-connected devices.

3.1 Authentication in VANETs

To manage the identities of the vehicles in the network, the present state of the art uses a Public Key Infrastructure (PKI) [8, 17]. A Central Authority (CA) issues a public key and a private key to each vehicle in the network. The vehicles' On-Board Units are loaded with the public/private key pairs when they are enrolled with the regional or national authority. Public keys (P) are known to all nodes in the network, whereas private keys (S) are known only to the owning node and are kept secret. During enrolment, the CA also issues a certificate (C) and provides the vehicles with its own public key (P_CA). The certificate contains vehicle V's public key (P_V) and a digital signature over that public key created with the CA's private key (S_CA), as described in condition (1) [17]; the signed content also includes the CA's identifier, Id_CA:

C_V = P_V | Sign_{S_CA}[ P_V | Id_CA ]    (1)

When a basic safety message is sent within the network, the sender computes the message's digital signature using its private key. The signature, together with the vehicle's certificate, is then sent along with the basic safety message (BSM). To verify the source of the communication, the receiving vehicle first verifies the certificate attached to the message, using the CA's public key (P_CA) loaded during enrolment. Once the certificate is verified, the sender's public key is extracted and used to verify the digital signature on the message itself; if that check succeeds, the sender of the message is authenticated. Communication from a vehicle is depicted as in [17]:

V_i → M_S, Sign_{S_V}[ M_S | T_i ], C_V    (2)

In condition (2), V_i is the vehicle sending the message M_S; the timestamp T_i is appended to M_S to guarantee the freshness of received messages. Sign_{S_V} denotes the digital signature of vehicle V_i computed with its private key S_V, and C_V is the certificate of vehicle V_i loaded onto its On-Board Unit during enrolment. The recipients of the message verify C_V and then use the recovered public key to verify the digital signature on the message (Fig. 2).


Fig. 2. PKI architecture [18]
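
For illustration, the certificate and message checks of conditions (1) and (2) can be sketched with the JDK's built-in ECDSA support, the algorithm this paper later uses for its PKI baseline. The key size, identifiers, message layout, and helper names below are assumptions made for the sketch, not the exact format used in the evaluation.

import java.nio.charset.StandardCharsets;
import java.security.*;

public class PkiBsmDemo {

    static byte[] sign(PrivateKey key, byte[] data) throws GeneralSecurityException {
        Signature sig = Signature.getInstance("SHA256withECDSA");
        sig.initSign(key);
        sig.update(data);
        return sig.sign();
    }

    static boolean verify(PublicKey key, byte[] data, byte[] signature) throws GeneralSecurityException {
        Signature sig = Signature.getInstance("SHA256withECDSA");
        sig.initVerify(key);
        sig.update(data);
        return sig.verify(signature);
    }

    static byte[] concat(byte[] a, byte[] b) {
        byte[] out = new byte[a.length + b.length];
        System.arraycopy(a, 0, out, 0, a.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("EC");
        gen.initialize(256);

        KeyPair ca = gen.generateKeyPair();      // S_CA / P_CA
        KeyPair vehicle = gen.generateKeyPair(); // S_V / P_V

        // Certificate content, roughly C_V = P_V | Sign_{S_CA}[P_V | Id_CA] (condition (1))
        byte[] certBody = concat(vehicle.getPublic().getEncoded(),
                "Id_CA".getBytes(StandardCharsets.UTF_8));
        byte[] certSig = sign(ca.getPrivate(), certBody);

        // BSM with timestamp, signed by the vehicle (condition (2)); payload is illustrative
        byte[] bsm = ("speed=42;pos=12.97,77.59;T=" + System.currentTimeMillis())
                .getBytes(StandardCharsets.UTF_8);
        byte[] bsmSig = sign(vehicle.getPrivate(), bsm);

        // Receiver side: two signature verifications per BSM in the PKI scheme
        boolean certOk = verify(ca.getPublic(), certBody, certSig);
        boolean bsmOk = certOk && verify(vehicle.getPublic(), bsm, bsmSig);
        System.out.println("certificate valid: " + certOk + ", BSM valid: " + bsmOk);
    }
}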

In Fig. 2, the regional CAs are connected to the national CA. The regional CAs are responsible for enrolling vehicles by issuing public/private key pairs to them; they are also responsible for renewing the digital certificates and revoking them when a misbehaving vehicle is detected.

4 Blockchain Background

Blockchain was first introduced in 2008 by Satoshi Nakamoto as the backend of the peer-to-peer cryptocurrency Bitcoin [7]. The technology operates on a decentralized basis, with data stored in a shared ledger. The information is cryptographically signed to ensure that the record remains immutable and can be traced all the way back to the start.

Fig. 3. Blockchain framework [19]


As seen in Fig. 3, blockchain technology is comparable to a physical ledger that keeps track of monetary transactions [19]. Each block in a blockchain is cryptographically linked to its previous block, similar to how pages of a ledger are linked by page numbers. Hash functions are commonly used for this: data of any size is taken as input and converted to a hash value of a fixed size. In most cases this is a one-way operation, because recovering the input data from the hash value is impossible or impractical. Each block in a blockchain carries the hash value of the previous block [19]. Blockchain therefore differs from conventional databases in that it only supports the addition of data; information cannot be removed, ensuring greater transparency. In a distributed ledger, as shown in Fig. 4, each member of the network maintains a copy of the ledger, and to add a new block every member must first check the block, reach a general consensus, and then update its ledger; the majority of nodes must agree on the new element before it is included [20]. Proof-of-Work, Proof-of-Stake, and Proof-of-Authority [7, 21] are some of the most widely used consensus mechanisms in blockchains. As a result, even if one member is unavailable owing to a server crash or an attack, the other members continue to provide service; once the downed member resumes operation, it can obtain the most recent ledger by requesting it from one of the other members [19].

Fig. 4. Blockchain ledgers distribution

Blockchains can thus be divided into two categories: public blockchains such as Bitcoin [22] and Ethereum [23], and private blockchains such as Hyperledger Fabric [13].

4.1 Authentication in VANETs Using Blockchain Technology

Blockchain is a new technology being researched for a variety of applications, and its use in VANETs for authentication is currently an active topic. Several researchers have used various types of blockchains (public and private) to protect critical information, provide vehicle authentication mechanisms, and transform the VANET framework into a decentralized and distributed system. Several authors [7, 28, 31, 32] and [29] have investigated the use of blockchains in VANETs
for authenticating critical information and managing vehicle enrolment and revocation. Public blockchains such as Bitcoin [31] and Ethereum [28] are used in some of these approaches. They rely on consensus mechanisms such as Proof of Work, for which reaching agreement is slow [21]; as a result, updating the blockchain takes longer and the whole framework runs more slowly, which significantly affects how quickly safety messages can be verified and received by other vehicles on the network. Some studies use a PKI design, in which public and private keys provide confidentiality and are used to create and verify digital signatures; the computation time needed for these cryptographic operations leads to longer processing times and delays. In this study we combine blockchain technology with public/private key pairs: the pseudo identities and public keys of the vehicles are maintained in a shared ledger replicated across all members (RSUs and CAs). In the designs of [29, 31, 32] and [28], every vehicle in the network is a member of the blockchain. While this makes the shared ledger directly accessible to vehicles, it also degrades blockchain performance because of the larger number of participants: the execution time required to reach consensus grows with the number of members, and storing the shared ledger on every member increases the storage overhead on the vehicles. For these reasons, in our work the central authorities and RSUs, rather than the vehicles, are the members of the blockchain. In addition, the latency of RSU communication, measured in milliseconds, is one of the parameters used in [7] to evaluate the performance of the proposed strategy; we use the same parameter to assess how well the proposed lightweight authentication mechanism based on Hyperledger Fabric and pseudo IDs performs. We compare our results with a traditional PKI setup based on the Elliptic Curve Digital Signature Algorithm (ECDSA), chosen for the high security it provides with small key sizes [38].

5 Projected Method Architecture

Our proposed design uses Hyperledger Fabric, a permissioned blockchain developed under the Hyperledger project. As discussed above, it realizes a decentralized and distributed system. The authentication parties and the RSUs are the members of the blockchain network. With this approach we aim to:

• provide a decentralized and distributed VANET system;
• build a lightweight authentication scheme using pseudo IDs, public/private key pairs, and digital signatures;
• enrol, validate, and revoke vehicles via blockchain transactions.

The members of the projected system are depicted in Fig. 5. Both the RSUs and the authentication parties have access to the blockchain; however, the RSUs have read-only access to the shared ledger, whereas the authentication parties have full access and can make modifications to the ledger.


Fig. 5. Projected architecture

In this decentralized network there are many authentication parties, so vehicle owners in different regions can register with the nearest one. Because the ledger is distributed, the enrolment of a particular vehicle is visible to every member (the other parties and the RSUs), not only to the authority that enrolled it. This improves scalability: new vehicles can easily join the network and do not need to re-enrol with another authority if they drive into a region maintained by a different party. Vehicle owners enrol their vehicles in the VANET by contacting the nearest authentication party. During enrolment, each vehicle receives a set of pseudo identities together with public/private key pairs generated using ECDSA. The pseudo identity is used as the sender ID in basic safety messages, and the public key together with the pseudo ID is stored on the blockchain to give the RSUs a fast lookup for validating vehicles in the network. Each time a vehicle receives a message, the pseudo ID in the message is checked with the roadside units to confirm that the sender has a valid public key on the blockchain.
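
A minimal sketch of the receiver-side flow just described is given below. A plain in-memory map stands in for the Hyperledger Fabric ledger query performed by the RSU; the pseudo ID, payload, and key handling are illustrative assumptions, not the authors' implementation.

import java.nio.charset.StandardCharsets;
import java.security.*;
import java.util.HashMap;
import java.util.Map;

public class RsuLookupDemo {

    // Stand-in for the RSU's read-only ledger query: pseudo ID (PId) -> public key (P)
    static final Map<String, PublicKey> ledger = new HashMap<>();

    public static void main(String[] args) throws GeneralSecurityException {
        // Enrolment by an authentication party: ECDSA key pair plus a pseudo ID on the ledger
        KeyPairGenerator gen = KeyPairGenerator.getInstance("EC");
        gen.initialize(256);
        KeyPair vehicle = gen.generateKeyPair();
        String pseudoId = "PID-4711";
        ledger.put(pseudoId, vehicle.getPublic());

        // Vehicle sends: PId, BSM payload, Sign_{S_V}[BSM]
        byte[] bsm = "speed=38;lane=2;T=1624531200".getBytes(StandardCharsets.UTF_8);
        Signature signer = Signature.getInstance("SHA256withECDSA");
        signer.initSign(vehicle.getPrivate());
        signer.update(bsm);
        byte[] bsmSig = signer.sign();

        // Receiver asks the RSU to resolve the pseudo ID, then verifies only one signature
        PublicKey p = ledger.get(pseudoId);
        boolean valid = false;
        if (p != null) {
            Signature verifier = Signature.getInstance("SHA256withECDSA");
            verifier.initVerify(p);
            verifier.update(bsm);
            valid = verifier.verify(bsmSig);
        }
        System.out.println("BSM accepted: " + valid);
    }
}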

6 Results

In this section we examine the outcomes of the proposed approach. We compare our methodology with a conventional public key infrastructure system, reporting the authentication delay, the channel active time, the difference in BSM packet size, and the extra message overhead introduced by our approach. To study the effectiveness of the proposed technique, we recorded the authentication delay: the time, in seconds, spent by each vehicle in the network, on receiving a BSM, to verify the sender of that BSM, measured using the ctime library's clock() function.
With 5 to 50 vehicles on the road, the total time spent verifying each BSM during the simulation was recorded, and the following parameters were plotted using Gnuplot.

6.1 Authentication Delay

We estimated the average authentication delay per BSM by dividing the total time taken to validate all BSMs (for a simulation time of 150 s with about 50 vehicles) by the total number of BSMs sent across the network. As shown in Fig. 6, the average delay per BSM to verify the sender is around 1.8 ms with the proposed technique, whereas it is roughly 3.5 ms with the PKI system. This is due to the additional computing time needed to validate two digital signatures in the PKI method: one over P_V in the certificate and another over the message that was sent. By leveraging RSU services to validate P_V via the blockchain and then verifying only the digital signature on the message, the proposed method cuts the computation time roughly in half.

Fig. 6. Average delay in authentication for a B.S.M

Furthermore, as the number of automobiles in the simulation increased, we documented the total time delay caused by the authentication process for all Basic Safety Messages in the network.


As shown in Fig. 7, the total latency for verifying all of the Basic Safety Messages in our suggested technique is nearly half that of the traditional system. As a result, our proposed solution effectively cuts the processing time necessary for verification in half compared with the traditional Public Key Infrastructure approach.

Fig. 7. Overall time due to verification

6.2 Active Time of Channels

The channel active time is the total time for which the MAC layer is active owing to channel congestion. The overall active time is recorded by the Veins 4.7.1 framework, and dividing it by the total simulation time gives the channel active time in seconds. In comparison to the PKI approach, which has a channel active time of 0.049 s, the suggested strategy has a greater channel active time of about 0.10 s for a simulation time of 150 s, as shown in Fig. 8. This is because of the extra communication with the R.S.Us necessary to validate the senders' PId and P: when a vehicle member requests PId and P validation, the R.S.U broadcasts a W-S-A, which increases the channel's active time.

Fig. 8. Channel active time


6.3 Packet Size of B.S.Ms

When comparing the B.S.M packet size in the two techniques, the B.S.M of the PKI system additionally carries an e-certificate. In the simulations, the B.S.M packet for the PKI technique was roughly 230 bytes, counting the signature of the message (approx. 62 bytes), the certificate containing the transmitting vehicle member's public key (approx. 63 bytes) and the cryptographic signature of that certificate (approx. 62 bytes). In our recommended solution, the B.S.M packet was estimated to be around 170 bytes: the 63-byte digital signature over the public key is not required by the B.S.M in our blockchain approach. The B.S.M contains the vehicle member's Pid, the message's signature (62 bytes) and the vehicle member's public key (approx. 66 bytes). Figure 9 depicts this.
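The byte counts quoted above can be tallied directly. The following sketch just adds up the approximate field sizes reported in this section; the "other_fields" entries are our own inferred remainders, so treat the numbers as indicative rather than exact packet layouts.

# Approximate B.S.M payload sizes (bytes) as reported in Sect. 6.3.
pki_bsm = {
    "message_signature": 62,        # signature over the transmitted message
    "certificate_public_key": 63,   # sender's public key inside the e-certificate
    "certificate_signature": 62,    # signature over the certificate
    "other_fields": 43,             # remainder of the ~230-byte packet (inferred)
}

blockchain_bsm = {
    "pid_and_message_signature": 62,  # Pid plus the message signature
    "public_key": 66,                 # sender's public key carried in the B.S.M
    "other_fields": 42,               # remainder of the ~170-byte packet (inferred)
}

print("PKI B.S.M        ~", sum(pki_bsm.values()), "bytes")         # ~230
print("Blockchain B.S.M ~", sum(blockchain_bsm.values()), "bytes")  # ~170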

Fig. 9. B.S.M packet size comparison

6.4 Extra Information Sent

We also examined the added overhead in our strategy of communicating the queries to the R.S.U and of the R.S.U's responses to the vehicle members, which are broadcast as Wave Service Advertisements (W.S.As) to all nearby terminals. The number of requests submitted to the Roadside Units by the automobiles, the number of W.S.As broadcast to the vehicle members, and the number of B.S.Ms in the network are shown in Fig. 10.


Fig. 10. Additional messages sent in our proposed methodology

7 Conclusion

Blockchain technology is used to realize the idea of a lightweight authentication solution that is computationally efficient compared with traditional PKI engineering. In our suggested system, we validate the PId and P of the cars relaying information by utilizing R.S.U services. Cars maintain a short-lived list of recently validated vehicle members (PId and P) to keep the turnaround time low. Once the PId and P of the transmitting vehicle member have been validated, we verify the digital e-signature of the B.S.M together with the timestamp at which the message was transmitted. The R.S.Us need to access and query the blockchain in order to validate a vehicle member's PId and P. Our suggested methodology minimizes the overall time required for verification, but it does so at the cost of an increased channel active time: the proposed technique necessitates the transmission of additional messages to the R.S.Us, as well as responses from the R.S.Us to the cars, and this accounts for the extra channel active time.
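The short-lived list of recently validated senders mentioned above behaves like a small time-to-live cache. The sketch below is one possible realization under our own assumptions (in-memory dictionary, wall-clock TTL), not the authors' code; the TTL value and names are illustrative.

import time

class RecentlyValidatedCache:
    """Vehicle-side cache of recently validated (PId, P) pairs with a short TTL,
    so repeated B.S.Ms from the same sender skip another R.S.U round trip."""

    def __init__(self, ttl_seconds: float = 5.0):
        self.ttl = ttl_seconds
        self._entries = {}  # pseudo ID -> (public key, expiry time)

    def put(self, pid: str, public_key: str) -> None:
        self._entries[pid] = (public_key, time.time() + self.ttl)

    def get(self, pid: str):
        entry = self._entries.get(pid)
        if entry is None:
            return None
        public_key, expires_at = entry
        if time.time() > expires_at:        # stale entry: force revalidation via R.S.U
            del self._entries[pid]
            return None
        return public_key

cache = RecentlyValidatedCache(ttl_seconds=5.0)
cache.put("PID-7f3a", "04a1...e9")
print(cache.get("PID-7f3a"))  # hit within the TTL window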

8 Future Works

As stated in the previous section, channel congestion is a drawback of our proposed solution. There are a variety of ways to manage and reduce channel congestion, and a handful of these methods can be evaluated to see whether they decrease channel congestion in the blockchain system. Reaching consensus on blockchains is also a difficult task; there are a variety of fast and efficient consensus mechanisms that could be assessed to improve the suggested framework. Furthermore, locating a misbehaving node and reporting it to the verification parties necessitates extra analysis. More research is needed to find effective strategies to identify misbehaving nodes and attackers within the network and to expel them from the network.

Acknowledgement. I would like to express my appreciation to my supervisor Debasis Giri for his valuable comments and engagement throughout the preparation of this master's thesis.


Moreover, I would also like to thank him for introducing me to the subject and for his support along the way. I also thank the participants in my survey, who willingly shared their valuable time during the interviews. Finally, I would like to thank my loved ones, who have supported me throughout the entire process, both by keeping me grounded and by helping me put the pieces together. I will be forever grateful for your love.


LIVECHAIN: Lightweight Blockchain for IOT Devices and It's Security

Mukuldeep Maiti, Subhas Barman(B), and Dipra Bhagat

Jalpaiguri Government Engineering College, Jalpaiguri 735102, West Bengal, India
[email protected]

Abstract. Blockchain requires high computation and heavy storage loads. It can be implemented on IoT devices by reducing the computational as well as the space complexity of the blockchain. In this paper, a partial blockchain, called Livechain, is designed for IoT-type lightweight devices. Livechain is designed to hold a much smaller number of blocks, i.e. only a few recent blocks, and mining is replaced by an algorithm implemented using a miners' queue. For the security of the whole system, elliptic curve cryptography (ECC) on the curve secp256k1 and well-known hash functions such as SHA-256 are used.

Keywords: Blockchain · IoT devices · Elliptic curve cryptography

1 Introduction

With the recent advance of technology, centralized systems are being converted into distributed systems, which need a security mechanism that works in a decentralized environment. In this regard, blockchain is the best solution to provide security in a decentralized way [1]. Blockchain is being used in healthcare systems [2], energy trading [3], smart farming [4], etc. Nowadays, the usage of IoT devices is increasing exponentially, from 8.7 billion in 2012 to 50 billion in 2020, whereas the number of active IoT connections increased from 3.3 billion in 2015 to 9.8 billion in 2020 and is estimated to reach 24 billion by 2024 [5]. Currently, most of these IoT devices are controlled by central servers, belonging either to the manufacturer of the device or to some leased infrastructure. With such a rise in the number of devices and connections, it will be much harder to fulfil the demand in the very near future: communication between any two devices becomes harder and slower and can even fail due to a single point of failure [6]. So it is better to have a decentralized network [11] that can relay messages faster, reliably and securely, without a single point of failure and while consuming low power. Here, blockchain is the solution. But blockchain requires much more disk space and memory, and some blockchains even require huge computing power, which makes blockchain unfavourable for IoT devices [7,8]. A few blockchains and their requirements are listed in Table 1.

Table 1. Minimum requirements for existing blockchain platforms

Blockchain         | Disk space (least) | Memory (least) | System                     | OS
Bitcoin Core       | 200 GB             | 1 GB           | Laptop, ARM chipsets 1 GHz | MacOS, Linux, Windows 7/8.x/10
Ethereum Full-Node | 464 GB             | –              | Laptop, ARM chipset        | Linux, MacOS
Hyperledger Fabric | 4 GB               | –              | –                          | MacOS, Ubuntu, Linux

The challenge is that IoT devices have clock speeds of 100 MHz to 1 GHz and memory ranging from a few KB to a few GB, whereas blockchains such as Bitcoin Core and Ethereum require hundreds of GB of disk space and several GB of memory; this is a big mismatch for implementing blockchain on IoT devices [5]. Specifications of a few popular IoT devices are given in Table 2.

Table 2. Specification of few popular IoT devices

Platform           | CPU                                   | GPU                               | Clock speed | Memory | Storage (flash)
Intel Galileo Gen2 | Intel Quark SoC X1000                 | –                                 | 400 MHz     | 256 MB | 8 MB
Intel Edison       | Intel Quark SoC X1000                 | –                                 | 100 MHz     | 1 GB   | 4 GB
Beagle Bone Black  | Sitara AM3358 BZCZ100                 | PowerVR SGX530 @ 520 MHz          | 1 GHz       | 512 MB | 4 GB
Electric Imp 003   | ARM Cortex M4F                        | –                                 | 320 MHz     | 120 KB | 4 Mb
Raspberry Pi B+    | Broadcom BCM2835 SoC based ARM1176JZF | VideoCore IV Multimedia @ 250 MHz | 700 MHz     | 512 MB | SD card
ARM NXP LPC1768    | ARM Cortex M3                         | –                                 | 96 MHz      | 32 KB  | 512 KB

These IoT devices are used to perform various tasks, including some critical and hazardous ones, such as running sensors in industries, cars, fire alarms, etc. Some of them, such as home surveillance and private IoT devices, also require privacy, and for such devices security and privacy are basic requirements that must be taken care of [9,10]. Therefore, the objective of this work is to implement blockchain for IoT devices without compromising security while maintaining low power consumption, small memory and storage usage, low network latency, high data-processing capability, and privacy.

2 Proposed Methodology and Implementation

The main steps of our implementation are as follows (a minimal sketch of the miner's queue follows this list):

1. Partial blockchain implementation: Every node in this blockchain network must contain at least the x (say x = 64) latest blocks. This is ensured by redefining the block-hash creation methodology, i.e., every block hash is formed by hashing data that includes the last block hash and the Merkle root of the last x block hashes, together with the current block timestamp.
2. Proof of work: For low energy consumption, the proof-of-work concept is replaced by a queue defined in the blockchain itself, called the miner's queue. Only verified miners can mine in this network. The queue, maintained in the blockchain, contains verified public keys, and only the public key at the front of the queue can mine a block within a given time frame. After a block is mined, or when the time frame expires, that public key is removed from the queue. A miner must request that its public key be enqueued; once verified, the public key is enqueued. Thus there is no need to solve unnecessary mathematical puzzles to mine blocks, unlike Bitcoin, Ethereum, etc., which in turn saves a lot of computing power and hence energy.
3. Security: In this work, we use elliptic curve cryptography (ECC) over an elliptic curve that is widely used in Bitcoin, Ethereum and some other cryptocurrencies. Every IoT device or local controller of IoT devices (an endpoint) has its own private key (known only to the endpoint device itself) and public key (publicly visible). Every message/data item embedded in a transaction is encrypted using these public-private keys.
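A minimal sketch of the miner's-queue rule described in step 2 is given below. It is an illustration under our own simplifying assumptions (in-memory queue, wall-clock time slots), not the Livechain code.

import time
from collections import deque

class MinersQueue:
    """Verified miners' public keys; only the key at the front may mine,
    and it is removed after mining a block or when its time slot expires."""

    def __init__(self, slot_seconds: float = 2.0):
        self.slot_seconds = slot_seconds
        self._queue = deque()              # entries: [public_key, slot_expiry]

    def enqueue(self, public_key: str, verified: bool) -> bool:
        if not verified:                   # only verified miners may join
            return False
        self._queue.append([public_key, None])
        return True

    def current_miner(self):
        """Return the public key allowed to mine right now, handling expiry."""
        now = time.time()
        while self._queue:
            entry = self._queue[0]
            if entry[1] is None:           # slot starts when the key reaches the front
                entry[1] = now + self.slot_seconds
            if now <= entry[1]:
                return entry[0]
            self._queue.popleft()          # slot expired: next miner takes over
        return None

    def block_mined(self, public_key: str) -> None:
        if self._queue and self._queue[0][0] == public_key:
            self._queue.popleft()          # miner leaves the queue after its block

q = MinersQueue(slot_seconds=2.0)
q.enqueue("pubkey-A", verified=True)
q.enqueue("pubkey-B", verified=True)
print(q.current_miner())   # pubkey-A mines first
q.block_mined("pubkey-A")
print(q.current_miner())   # pubkey-B is next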

2.1 Encryption-Decryption

Communication between two devices is done as follows. Suppose Alice wants to send an encrypted message to Bob. Alice has a private key, say a, which is known only to Alice, and a public key P a = a * G (the generating point multiplied by the private key). Similarly, Bob has a private key, say b, known only to Bob, and a public key P b = b * G. The total process of transaction encryption, communication and decryption is shown in Algorithm 1.

Algorithm 1. Encryption-decryption of transactions
Input: Public keys P a, P b, generating point G, private keys a, b, transaction m
At Alice's end:
1. Alice gets Bob's public key P b.
2. Generate the shared key by EC-multiplying Alice's private key with Bob's public key and a random number r: S = r * a * P b = r * a * b * G.
3. Break the message into 255-bit chunks.
4. Generate a number of keys from S, EC-add them to the corresponding message chunks and concatenate the results.
5. Sign the whole transaction.
6. Send the encrypted and signed transaction to the blockchain.
At the blockchain:
7. Validate the transaction.
8. Add it to the mem-pool and broadcast it to the network.
9. A miner adds the transaction to a block.
At Bob's end:
10. Bob receives a transaction against his public key P b.
11. Validate the transaction for a legitimate sender.
12. Generate the shared key by EC-multiplying Alice's public key P a with Bob's private key b and the random number r generated by Alice: S = r * b * P a = r * a * b * G.
13. Break the encrypted message into 256-bit chunks.
14. Generate a number of keys from S, EC-subtract them from the corresponding encrypted message chunks and concatenate the results.
Output: Finally, Bob gets the decrypted message m.

2.2 Blockchain Structure

This blockchain consists of three main components: 1. the chain, 2. the mem-pool and 3. the miner's queue (a minimal block-hashing sketch is given after this list).

1. The chain is the block collection held by the nodes; every node stores at least the k latest blocks. A blockchain representation is given in Fig. 1. Each block consists of two parts, a header and a body. The header section contains the height, the timestamp, the previous block hash, the Merkle root of the transactions, the Merkle root of the last k block hashes combined with the current block timestamp, and the hash calculated from all of these. The block structure, with the header and body of a block, is shown in Fig. 2. The body part contains an array of IoT transactions. Each IoT transaction is a piece of information that contains sender/receiver details, the message and a proof of authenticity: it consists of a timestamp, an array of unit transactions (each consisting of the sender and receiver public keys, the encrypted message and a signature) and a hash calculated from all of this data. The IoT transaction structure is shown in Fig. 3(a).
2. The mem-pool consists of valid IoT transactions. Once an endpoint generates and sends an IoT transaction to some node in the network, the transaction is validated, added to the mem-pool of that node and broadcast to the network. Nodes that receive this broadcast verify the transaction and add it to their own mem-pools. Later, during block creation (mining), the miner collects these transactions from the mem-pool. The structure of the mem-pool is given in Fig. 3(b).
3. The miner's queue contains the verified public keys of all miners allowed to mine new blocks. Only the public key at the front of this queue can mine a block within a given time frame. After the time frame is over, the front of the queue is deleted and the next public key starts generating another block. This is shown in Fig. 4.


Fig. 1. Blockchain consisting of the latest k blocks

Fig. 2. Block structure, header and body

4. Communication between nodes can be established using any existing P2P network. P2P libraries such as WebRTC, libp2p and others work well with this model; however, if a local server is preferred over P2P for local devices, then node.js or other active local servers can easily be integrated with the central local node, provided that the central node is connected to the P2P network.
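The header-hashing rule described in step 1 of this section (previous block hash plus the Merkle root of the transactions and of the last k block hashes, combined with the current timestamp) can be sketched as follows. Field names are ours and the Merkle root is reduced to a simple pairwise SHA-256 tree; this is an illustration, not the Livechain chain module.

import hashlib
import json
import time

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def merkle_root(items):
    """Simple pairwise SHA-256 Merkle root over a list of strings."""
    if not items:
        return sha256_hex(b"")
    level = [sha256_hex(i.encode()) for i in items]
    while len(level) > 1:
        if len(level) % 2:                      # duplicate the last element on odd levels
            level.append(level[-1])
        level = [sha256_hex((level[i] + level[i + 1]).encode())
                 for i in range(0, len(level), 2)]
    return level[0]

def block_hash(prev_hash, tx_hashes, last_k_block_hashes, timestamp):
    """Hash over the previous block hash, the Merkle root of the transactions,
    the Merkle root of the last k block hashes, and the block timestamp."""
    header = {
        "prev_hash": prev_hash,
        "tx_root": merkle_root(tx_hashes),
        "last_k_root": merkle_root(last_k_block_hashes),
        "timestamp": timestamp,
    }
    return sha256_hex(json.dumps(header, sort_keys=True).encode())

print(block_hash("00" * 32, ["tx1", "tx2"], ["bh1", "bh2", "bh3"], time.time()))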

Fig. 3. The structure of a) IoT transaction and b) mem-pool

Fig. 4. The structure of the miner's queue

3 Experimental Results

Livechain is implemented by creating separate modules for the different functionalities, i.e. a chain module, an ECC module, a transaction module, a miner module and a client module. Each module is developed independently of the others but on a common platform. We use the Python programming language together with a few libraries, such as TinyDB as the database, JSON for data conversion and hashlib for creating hashes.

At first, the space and time complexity of our blockchain implementation are measured. The procedure followed for both complexity measurements is shown in Fig. 5. In this way, the space requirement for block creation and block mining is measured; the space complexity is shown in Fig. 6. The graph was created from actual measurements of the module, with space complexity measured using the tracemalloc module. From the graph it is clear that the memory requirement is linearly related to the number of blocks created or mined. The graph shows the memory used when n (from 1 to 10) blocks are created and mined with different block sizes.

Fig. 5. Procedure for complexity measurement

Fig. 6. Space complexity

Similarly, the time complexity is measured for block creation and block mining. The execution time for creating and mining blocks is plotted in Fig. 7. From the graph, it is clear that the time complexity is linearly related to the number of blocks created or mined. The graph shows the execution time (in seconds) for creating and mining n (from 1 to 10) blocks, where the sizes of the created and mined blocks differ: we consider 10, 20, 30, 100 and 256 transactions per block. It is also observed that the time taken to mine a single block is very small, i.e. a few milliseconds regardless of the block size, which enables very fast block creation and mining.

Fig. 7. Time complexity (block creation and mining time for 10, 20, 30, 100 and 256 transactions per block)

The storage requirement with respect to the number of transactions per block is also measured. We measured the storage requirement for a variable number of transactions per block and then varied the number of blocks as well to capture the storage requirement. The memory space required for different numbers of blocks containing different numbers of transactions is plotted in Fig. 8. Here, the storage size is proportional to the amount of data inside the block.

Fig. 8. Storage requirement with respect to number of transactions per block

Memory usage for wallet creation and transaction generation is also derived in our experiment, and the corresponding result is shown in Fig. 9(a). As the graph shows, memory usage for wallet creation is proportional to the number of addresses in the wallet, whereas transaction generation is independent of it and stays near 0.07 MB. This observation enables the use of these modules in low-memory devices such as IoT devices, which becomes possible due to the small crypto module designed from scratch. The time complexity of wallet creation is also computed and shown in Fig. 9(b). Finally, we compute the time and space complexity of transaction creation and addition to the memory pool. These complexities are measured using the tracemalloc and time modules, and the result is plotted in Fig. 9(c). Moreover, we have compared Livechain with existing blockchains; the comparison is given in Table 3.
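The measurement procedure of Fig. 5 (wrap block creation and mining in tracemalloc and time) can be sketched as below; create_and_mine_blocks is a hypothetical placeholder for the corresponding Livechain module calls, so only the measurement harness is meaningful here.

import time
import tracemalloc

def create_and_mine_blocks(n_blocks: int, txs_per_block: int) -> None:
    """Hypothetical placeholder for the chain/miner module calls being profiled."""
    ledger = []
    for _ in range(n_blocks):
        ledger.append(["tx" * 10] * txs_per_block)

def measure(n_blocks: int, txs_per_block: int):
    """Return (elapsed seconds, peak memory in MB) for creating and mining n blocks."""
    tracemalloc.start()
    start = time.perf_counter()
    create_and_mine_blocks(n_blocks, txs_per_block)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak / (1024 * 1024)

for n in range(1, 11):                       # 1 to 10 blocks, as in Figs. 6 and 7
    secs, mb = measure(n, txs_per_block=256)
    print(f"{n} blocks: {secs:.4f} s, peak {mb:.3f} MB")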

Fig. 9. a) Memory usage in wallet creation and transaction generation, b) time complexity of wallet creation, c) time and space complexity of transaction creation and addition into the memory pool

Table 3. Comparison with the popular blockchains

Platform           | Scalability | Consensus algorithm     | Node type   | Block creation | Block validation | Tr. creation/validation | CPU/GPU overhead
Bitcoin            | Yes         | PoW                     | Full        | Yes            | Yes              | Yes                     | High
                   |             |                         | Lightweight | No             | No               | Yes                     | Low
Ethereum           | Yes         | PoW                     | Full        | Yes            | Yes              | Yes                     | High
                   |             |                         | Lightweight | No             | No               | Yes                     | Low
Hyperledger Fabric | No          | Kafka, Raft, Solo, PBFT | Peer        | No             | Yes              | Yes                     | Low
                   |             |                         | Orderer     | No             | Yes              | Yes                     | Low
Livechain          | Yes         | Miner's queue           | Full        | Yes            | Yes              | Yes                     | Low
                   |             |                         | Lightweight | Yes            | Yes              | Yes                     | Low

4 Conclusions

Blockchain is the most powerful decentralized security mechanism for data security in recent advanced distributed systems. Due to its computational and storage complexity, however, lightweight applications are not suitable for the deployment of blockchain to enhance their security. In this paper, a solution called Livechain, with functionality similar to a full blockchain but with low CPU/GPU overhead, has been proposed. The aim of a lightweight, low-powered blockchain design is achieved. Livechain is suitable for implementation on devices with small memory and low computing power, such as IoT devices, smartphones and domestic PCs, without compromising security.

References

1. Casino, F., Dasaklis, T.K., Patsakis, C.: A systematic literature review of blockchain-based applications: current status, classification and open issues. Telemat. Inform. 36, 55–81 (2019)
2. Shahnaz, A., Qamar, U., Khalid, A.: Using blockchain for electronic health records. IEEE Access 7, 147782–147795 (2019). https://doi.org/10.1109/ACCESS.2019.2946373
3. Yahaya, A.S., Javaid, N., Javed, M.U., Shafiq, M., Khan, W.Z., Aalsalem, M.Y.: Blockchain-based energy trading and load balancing using contract theory and reputation in a smart community. IEEE Access 8, 222168–222186 (2020)
4. Vangala, A., Sutrala, A.K., Das, A.K., Jo, M.: Smart contract-based blockchain-envisioned authentication scheme for smart farming. IEEE Internet of Things J. 8(13), 10792–10806 (2021). https://doi.org/10.1109/JIOT.2021.3050676
5. Sultan, A., Mushtaq, M.A., Abubakar, M.: IoT security issues via blockchain: a review paper. In: Proceedings of the 2019 International Conference on Blockchain Technology, pp. 60–65 (2019)
6. IoT Security White Paper (2018)


7. Yli-Huumo, J., Ko, D., Choi, S., Park, S., Smolander, K.: Where is current research on blockchain technology?-a systematic review. PLoS ONE 11(10), e0163477 (2016)
8. Sun, S., Du, R., Chen, S., Li, W.: Blockchain-based IoT access control system: towards security, lightweight, and cross-domain. IEEE Access 9, 36868–36878 (2021). https://doi.org/10.1109/ACCESS.2021.3059863
9. https://builtin.com/blockchain/blockchain-iot-examples
10. Alaba, F.A., Othman, M., Hashem, I.A.T., Alotaibi, F.: Internet of Things security: a survey. J. Netw. Comput. Appl. 88, 10–28 (2017)
11. Yazdinejad, A., Srivastava, G., Parizi, R.M., Dehghantanha, A., Choo, K.-K.R., Aledhari, M.: Decentralized authentication of distributed patients in hospital networks using blockchain. IEEE J. Biomed. Health Inform. 24(8), 2146–2156 (2020). https://doi.org/10.1109/JBHI.2020.2969648

Secure and Scalable Attribute Based Access Control Scheme for Healthcare Data on Blockchain Platform

Shweta Mittal and Mohona Ghosh(B)

Indira Gandhi Delhi Technical University For Women, New Delhi 110006, India
{shwetamittal,mohonaghosh}@igdtuw.ac.in

Abstract. With the emergence of blockchain technology, the process of healthcare data storage has shifted from the traditional storage mode to an online mode. This has become possible due to the various benefits associated with blockchain technology: immutability, which means data cannot be changed or erased once it is recorded on the blockchain; availability, which means data is available in real time; and verifiability, which means the legitimacy of the stored data can be verified by all nodes on the blockchain. In the case of healthcare, where sensitive data is stored, it is not acceptable for this data to be visible to all nodes in the network, as this raises privacy concerns. Therefore, there is a pressing need for a secure access control mechanism that ensures that only authorized, selected parties are able to see personal medical data. In this paper, we address this issue by proposing a hierarchical ABE (Attribute Based Encryption) based access control mechanism on blockchain. We explain why data encryption alone cannot solve this issue and then propose a framework that utilizes a hierarchical CP-ABE (Ciphertext-Policy Attribute Based Encryption) access control mechanism to set permission rights for the different parties involved. We show that our framework provides scalability and fine-grained access control, i.e., every data element is given its own access policy. We present a thorough security analysis of our proposed framework and show that it is secure and resistant to collusion and cheating attacks. We prove the correctness of the proposed scheme's functionality using BAN logic.

Keywords: Access control · Blockchain · Electronic healthcare data · Hierarchical CP-ABE (Ciphertext-Policy Attribute Based Encryption)

1 Introduction

Hospitals generate the medical records of patients and store them in their databases. When a patient receives treatment at more than one hospital, the past medical information generated at the previous hospital may be needed for future treatment. Hospitals maintain strict rules and regulations on the sharing of medical data [5]. As hospitals store data in a centralized manner, various risks are associated with it, such as data tampering, hacking, and loss of data by natural disaster or by intention. A patient cannot be guaranteed that his/her data is safe with the hospital against security breaches. Therefore, it is important to shift the data storage process from the traditional centralized manner to an online mode. Cloud computing technology can effectively improve the level of healthcare services [14]. This shift from conventional to online mode has multiple benefits, such as cost savings in terms of electricity consumption and the reduced manpower required to run the IT infrastructure, and time savings in terms of easier management of data. Though cloud computing has various benefits, it cannot be used to store sensitive medical data: various concerns such as data privacy, choosing the right service provider [14], security, governance and integration issues [15] are associated with the cloud computing model, and security is one of the major issues [15]. Blockchain provides a better solution to overcome these issues. Blockchain has many advantages over centralized platforms, such as verifiability, immutability and anonymity [17]. However, there are various issues related to access control in blockchain. All transactions are visible to every node on the blockchain network (verifiability), and in the case of healthcare, where sensitive data is stored, it is not acceptable for this data to be visible to all, as this raises privacy concerns. For healthcare data, which is sensitive and must not be disclosed to everyone, only authorized parties should have the right to access it through an appropriate mechanism. Therefore, there is a need for an access control mechanism that ensures regulatory control over the disclosure of and access to medical data by selected parties only on the blockchain. In this paper, we propose a framework that utilizes a blockchain-based hierarchical Ciphertext-Policy Attribute Based Encryption (CP-ABE) access control scheme for the different parties involved.

Our Contribution: In this paper, we provide a novel approach for sharing and accessing medical data by combining hierarchical attribute based encryption and blockchain technology. We explain why data encryption alone cannot solve the issue of secure data access and propose a framework that utilizes a hierarchical CP-ABE access control scheme to set permission rights for the different parties involved. We show that our framework is scalable and provides fine-grained access control, i.e., every data element is given its own access policy. A thorough security analysis of the proposed framework is presented, showing that it is secure and resistant to collusion and cheating attacks. BAN logic is used to prove the correctness of the proposed scheme's functionality.

2 Background

2.1 Blockchain

Blockchain is a decentralised approach in which information is shared among numerous parties who do not trust each other [17]. The properties of blockchain are [17]:

– Distributed: no single party is in control.
– Efficient: fast and scalable.
– Verifiability: everyone can check the validity of information.
– Immutability: it is impossible for any entity to manipulate, replace or falsify data stored on the blockchain; therefore blocks, once recorded, cannot be deleted or modified.
– Anonymity: node identities are unknown [17].

There are two classes of blockchain. The first is the permissioned blockchain, in which every node needs prior approval before entering the system. The second is the permissionless blockchain, which is open to all and lets everyone participate in the system. Each blockchain block contains certain data, the hash of the previous block and the hash of the current block. Merkle trees, also known as hash trees, are maintained to ensure that the data is tamper proof [17]. As the nodes are anonymous, a consensus mechanism is used to reach a common agreement. There are three types of failure in distributed systems: crash faults, network or partition faults, and Byzantine faults [17]. The effect of the first two faults can be detected easily, but the third is difficult to guess, so a consensus mechanism is required to ensure reliability and fault tolerance in the presence of faulty nodes and to reach a common agreement. A challenge-response based system is established to achieve consensus: the network imposes a challenge on the participating nodes, and each node in the network attempts to solve the challenge individually. The node that solves the challenge first gets to dictate the next set of data to be added to the blocks. Miners are the nodes in the system that add blocks to the blockchain. Every miner attempts to solve the challenge on its own; the block is accepted for the miner who can prove first that the challenge has been solved, who then puts the solution in the block and adds that block to the blockchain. After receiving this, the other miners stop solving that challenge and start finding new blocks of transactions coming from the clients. There are various consensus mechanisms depending on the type of blockchain used. Permissionless blockchains use consensus algorithms based on the challenge-response strategy, such as Proof of Work (PoW) [17], Proof of Stake (PoS) [17], Proof of Burn (PoB) [17] and Proof of Elapsed Time (PoET) [17]. Permissioned blockchains use the Byzantine Fault Tolerance (BFT) and Practical Byzantine Fault Tolerance (PBFT) algorithms [17]. As we use a permissioned blockchain, the PBFT consensus algorithm will be used. The PBFT method attempts to reach a global agreement on the network [2]. In this case, messages are shared among the peers, and because it is a closed environment, everyone knows who its peers are. Nodes in a PBFT system are ordered, with one node being the primary node, which acts as leader, and the others referred to as secondary or backup nodes. The primary knows who the backups are, and each backup knows who the other backups and the primary are. In this way, nodes multicast messages within the closed group. This process can support a high transaction throughput, because any number of transactions can be included, validated and brought to consensus.

2.2 Attribute Based Encryption

ABE comes to the aid of providing fine-grained access control, which means every data object has its own access policy; the policy determines whether to allow or deny any request. KP-ABE (Key-Policy Attribute-Based Encryption) and CP-ABE (Ciphertext-Policy Attribute Based Encryption) are the two types of ABE [12,13]. In the case of KP-ABE, the key is linked to an access policy, while the ciphertext is linked to a set of attributes [13]. Every user who joins the system has some attributes. The key issuer chooses some attributes, and an access policy is defined for a user, say John, based on his attributes. The authority then gives John the decryption key corresponding to the access policy. John obtains access to the requested message if and only if the set of attributes linked with the ciphertext satisfies the access rules associated with his key [13]. The owner of the data loses control over his data, which is a key downside of this scheme; also, to govern access to the data being exported, he must rely on a key issuer [13]. To overcome this drawback of KP-ABE, CP-ABE was proposed. In the CP-ABE scheme, the user's key is linked with a set of attributes, and the data is linked with an access policy [13]. The data owner selects attributes in CP-ABE, and access policies are developed based on those attributes. The key issuer selects a set of attributes for each new user who joins the system and grants him a decryption key based on the selected attributes. The new user is permitted access to the message if the set of attributes associated with the new user's key satisfies the access policy associated with the ciphertext [13].
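To make the CP-ABE idea concrete, the toy sketch below evaluates an AND/OR access policy against a user's attribute set, which is the matching step described above. Real CP-ABE enforces this check cryptographically during decryption rather than with a plain boolean test; the policy and attribute names here are purely illustrative.

def satisfies(policy, attributes):
    """Evaluate a nested AND/OR policy tree against a set of attributes.
    Leaves are attribute strings; internal nodes are ("AND", [...]) or ("OR", [...])."""
    if isinstance(policy, str):
        return policy in attributes
    op, children = policy
    results = (satisfies(child, attributes) for child in children)
    return all(results) if op == "AND" else any(results)

# Illustrative policy: a doctor may decrypt only together with an authorized
# hospital attribute, while the patient may always decrypt.
policy = ("OR", [
    ("AND", ["doctor", "hospital_H1"]),
    "patient",
])

print(satisfies(policy, {"doctor", "hospital_H1"}))  # True
print(satisfies(policy, {"doctor"}))                 # False: a doctor alone is not enough
print(satisfies(policy, {"patient"}))                # True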

3 Related Work

Pournaghi et al. [2] provide a proficient scheme based on blockchain and attribute based encryption, called "MedSBA" [2]; they use both KP-ABE and CP-ABE for the purpose of access control. Malamas et al. [3] present a novel hierarchical, multi-expressive blockchain architecture: they use a proxy blockchain to manage the trust authorities and employ CP-ABE to share the decryption key among different users. Due to a single key issuer, the approach is not scalable, and it has the performance complexity of using two blockchains [3]. Chen et al. [5] presented an approach based on blockchain and cloud storage to manage personal medical data. In their work, it is not defined who will access the blockchain; a comparison of the characteristics of a medical blockchain with traditional systems is presented [5], but access policies are not defined. Zhang et al. [6] introduce a concept that uses two kinds of blockchain for data storage: a consortium blockchain and a private blockchain. Medical data indexes are maintained in the consortium blockchain and the actual data is saved in the private blockchain; the main limitation of this work is the storage overhead [6]. Li et al. [7] introduce various cryptographic algorithms to guarantee user privacy; due to the limitation on block size, it is important to optimize the structure of the data stored in the blockchain [7]. Patel et al. [8] developed a medical imaging data sharing framework that introduces patient-defined access permissions and presents a comparison of the framework with a traditional information sharing network (ISN). The privacy and security models are complex and have been developed in an unclear regulatory environment. They employed the PoS consensus mechanism, which is not feasible for healthcare applications, since new nodes in the network may not have enough stake to participate in the mining procedure [8]. Researchers also discuss the potential of cloud computing for storing healthcare data. Cloud computing technology has various benefits that greatly improve the level of healthcare services: patients can access their personal medical data anytime and anywhere. However, there are various issues associated with it, such as data privacy, choosing the right service provider, vendor lock-in, etc. [9,10,14,15]. Ahuja et al. [12,13] discuss the use of hierarchical ciphertext attribute set based encryption for access control in order to achieve scalability and provide fine-grained access control over data stored on cloud servers; the scheme is immune to collusion and cheating attacks. Their scheme includes a hierarchical CP-ABE that provides scalability and also removes the centralized key issuer by providing another level of hierarchy [12,13]. Azaria et al. [11] introduce a decentralised record management system based on blockchain to handle electronic medical records, implemented on the Ethereum platform. Although miners are introduced in the system, their roles are not defined; all nodes contribute to the process of adding blocks to the blockchain, and the system does not implement a consensus mechanism to reach a common agreement [11]. There are three main directions in which research on the application of blockchain in healthcare is progressing: data storage, data sharing and data access. The comparison of the related research work based on these criteria is presented in Table 1.

Table 1. Comparison of the related research work on the criteria of data storage, data sharing and data access, covering Zuo et al. [1], Pournaghi et al. [2], Malamas et al. [3], Lee et al. [4], Chen et al. [5], Zhang et al. [6], Li et al. [7], Patel [8], Azaria et al. [11] and Ahuja et al. [12,13]

A scalable access control scheme that is also secure in terms of resistance to collusion and cheating attacks has so far not been implemented for healthcare applications based on blockchain technology. Several schemes use ABE techniques for the secure sharing of data [1–3], but they are not scalable in terms of providing fine-grained access control to the user because they rely on a single key issuing authority. Therefore, there is a need for a mechanism that provides scalable access control. In our proposed framework, the key issuer is decentralised across multiple levels of the hierarchy, using CP-ABE in a hierarchical structure to achieve scalability, since CP-ABE alone is not resistant to collusion and cheating attacks [13].

4 Preliminaries

This section presents the important notations and preliminaries that will be used throughout the rest of the work.

4.1 Bilinear Maps

The following are some facts about groups having efficiently computable bilinear mappings [16]. Let G1 and G2 be two multiplicative cyclic groups of prime order q. Let g be a generator of G1 and e a bilinear map, e : G1 × G1 → G2. The properties of the bilinear map e are as follows:

1. Bilinearity: e(x^m, y^n) = e(x, y)^{mn} for all x, y ∈ G1 and m, n ∈ Zq.
2. Non-degeneracy: e(g, g) ≠ 1.

If the group operation in G1 and the bilinear map e : G1 × G1 → G2 are efficiently computable, then G1 is a bilinear group. It is important to mention that the map e is symmetric, since e(g^m, g^n) = e(g, g)^{mn} = e(g^n, g^m).

4.2 Ciphertext-Policy Attribute Based Encryption

A CP-ABE scheme has five algorithms: Setup, Encryption, Key Generation, Decryption and Delegate [13].

Setup: This algorithm takes no input other than the implicit security parameter. It outputs the public parameters PP and the master key MK.

Encryption (PP, M, A): The inputs to this algorithm are the public parameters PP, a message M, and an access structure A over the set of attributes. The procedure encrypts M and generates a ciphertext CT that can only be decrypted by a user whose set of attributes satisfies the access structure A.

Key Generation (MK, S): The master key MK and the set of attributes S that describe the key are given to the key generation algorithm. The output of this algorithm is a private key PK.

Decryption (PP, CT, PK): The inputs to this algorithm are PP, a ciphertext CT containing an access policy A, and a private key PK for a set of attributes S. The algorithm decrypts the ciphertext and returns the message M if the set of attributes S satisfies the access structure A.

Delegate (PK, S̃): A secret key PK for a set of attributes S and a set S̃ ⊆ S are given to the delegate algorithm. The output is a secret key P̃K for the set of attributes S̃.

Hierarchical Ciphertext-Policy Attribute Based Encryption: There are a number of drawbacks to using plain CP-ABE. First, as the number of data recipients increases, the data owner becomes overloaded with the work related to key generation and distribution. Second, CP-ABE can be difficult to use in many applications, such as healthcare [13]: when a doctor has to obtain health information from people who have diabetes, he must ask each patient for secret keys, and because of this key-management issue the approach overloads doctors [13]. In our proposed framework, we address the first problem of CP-ABE. We use a hierarchical CP-ABE so that scalability is achieved by decentralising the key issuer at various levels of the hierarchy [13]. This scheme achieves flexibility, scalability and fine-grained access control using the ABE scheme [13], and it exploits an organization's hierarchical structure to achieve scalability by transferring the responsibility for the generation and distribution of keys from one top-level authority to several lower-level domain authorities [13].

5 Proposed Model

We have proposed a framework that is used to share personal medical data among different entities. In our framework, blockchain technology is integrated with hierarchical CP-ABE and a cloud computing platform to store and share personal medical data. The system incorporates a permissioned blockchain.

Fig. 1. System model

5.1 Stakeholders and Their Roles

We assume seven different entities in our system: data consumers, data producers, the TTP (Trusted Third Party), the CA (Central Authority), the DAs (Domain Authorities), the cloud storage system and the blockchain network.

The TTP is responsible for managing the CA, and the CA is responsible for managing the activities of the DAs. The data consumers in each domain are managed by the DAs. Every entity in the system has a pair of private and public keys; the public key is visible to all system entities, but the private key is kept secret by the entity. All data consumer entities, such as hospitals and medical insurance companies, register with the TTP. Then, through the DAs, they are given private decryption keys that satisfy the access policy. The signature keys are created by the TTP and made available to the data producers. The levels of hierarchy prevent an attacker from performing a collusion attack [13].

Data Producers (doctors): These are the entities that produce the patients' medical data.

Data Requesters: These are the entities that want to access a patient's medical data. Some of the data requesters consume the data legally, such as hospitals, medical insurance companies and medical research labs that use the data for research work, while other data consumers, such as a patient's parents, friends or neighbours, may or may not have a legal purpose. These persons and organisations may ask to use patients' medical information, and all of them obtain access to patient information depending on the patient's access policy.

Cloud Storage System: Encryption of medical data is needed to achieve confidentiality. The encrypted data is stored on cloud systems: the patient encrypts the data and stores it on the cloud storage.

Blockchain: A well-known and reliable technology that synchronises and distributes transaction data across several nodes. It interacts with all of the system's entities and records those interactions as transactions [2]. The hash of the previous block and the timestamp maintain block integrity and prohibit unauthorised alterations, while the Merkle tree root, which contains the hashes of all transactions in that block, assures block integrity [2] (Fig. 2).

Fig. 2. Steps of the framework showing data storage and data access

Fig. 3. Access structure specified by the patient

5.2 Operation of the Proposed Framework

The proposed framework extends CP-ABE to manage a user hierarchy, including the Central Authority (CA), the Domain Authorities (DAs) and the users, as shown in Fig. 1. The proposed framework involves the following steps:

Step 1: A data producer entity generates the medical data.

Step 2: The medical data is encrypted using the symmetric secret key SK. Encryption is done to ensure that the data is kept confidential.

Step 3: The data producers store the encrypted data on the cloud. As the size of a block is limited to only 8 MB [17] and the medical data requires a lot of space, it is not feasible to store the data on the blockchain itself.

Step 4: To establish access control on the data via the blockchain, the secret key SK is encrypted using ABE. SK_CP-ABE is the generated key.

Step 5: SK_CP-ABE is stored on the blockchain. The data can be accessed by data consumer entities based on the patient's access policy. The secret key SK itself should not be put on the blockchain, since data recorded on the blockchain is available to all participating nodes, and anybody in the network could use SK to decode the data stored on the cloud.

Step 6: The access policy (see Fig. 3) is determined by the patient and regulates the access control process. When an entity wants to decrypt the data, it needs to satisfy the access policy. The main advantage of this process is that after an access to the data by the data consumers, the patient can change the access policy; he does not need to change the actual encryption key of the data, i.e. SK, after every access.


For example, if a doctor wants to access the data, he or she must follow the access rules outlined in the access structure tree (see Fig. 3). The access policy in this example says that a doctor cannot access the data alone: the doctor must belong to a hospital that is authorized to access the data. We now describe the key generation algorithms of our proposed framework.

Algorithm 1: Setup phase. Input: a group (G1, g), where G1 is a bilinear group of prime order p with generator g. Output: the TTP's master key MK and the public parameters PP.

Algorithm 2: Key generation (CA). Input: the TTP's master key MK and the CA's set of attributes S. Output: the CA's master key MK_CA.

Algorithm 3: Key generation (DA). Input: the CA's master key MK_CA and the set of attributes S_DA corresponding to the DA. Output: the DA's master key MK_DA.

Algorithm 4: Key generation (User). Input: the DA's master key MK_DA linked with the set of attributes S_DA, and the set of attributes S_U. Output: user U's secret key S_U.

T is an access structure that the patient can create, as shown in Fig. 3. This access tree specifies which entity, identified by the labels (see Fig. 3), can decrypt the text. A patient can create any access structure for any portion of his medical data [2]. Our framework supports access revocation using CP-ABE: as the access policy is associated with the encryption key SK, the patient can change the access policy after every access if he wants to revoke access rights, and there is no need to change the decryption key every time a requester accesses the data. All hospitals, as well as their authorised workers who are able to register medical records on the blockchain, must go to the TTP (Trusted Third Party) and be authenticated by it based on their features as labelled in the access tree. The TTP then delivers them keys through the DAs.
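Steps 2-5 above amount to a hybrid scheme: the record is encrypted with a symmetric key SK, the ciphertext goes to the cloud, and only the CP-ABE-wrapped SK goes on chain. The sketch below shows that flow with stand-ins: cp_abe_encrypt/cp_abe_decrypt are hypothetical placeholders with no real cryptography, and the symmetric part is a toy hash-based stream cipher standing in for a proper cipher such as AES; none of this is the paper's construction, only an illustration of the data flow.

import hashlib
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy SHA-256-counter stream cipher standing in for a real symmetric cipher."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

def cp_abe_encrypt(sk: bytes, access_policy) -> dict:
    """Hypothetical placeholder for hierarchical CP-ABE encryption of SK under a policy."""
    return {"policy": access_policy, "wrapped_sk": sk}       # no real cryptography here

def cp_abe_decrypt(wrapped: dict, user_attributes: set) -> bytes:
    """Placeholder: release SK only if the attributes satisfy the policy (cf. Sect. 2.2)."""
    op, children = wrapped["policy"]
    ok = all(c in user_attributes for c in children) if op == "AND" \
        else any(c in user_attributes for c in children)
    if not ok:
        raise PermissionError("attributes do not satisfy the access policy")
    return wrapped["wrapped_sk"]

# Step 2: encrypt the record with symmetric key SK; Step 3: ciphertext to cloud storage.
record = b"blood pressure: 120/80"
sk = secrets.token_bytes(32)
cloud_storage = {"record_001": keystream_xor(sk, record)}

# Steps 4-5: wrap SK under the patient's policy and put SK_CP-ABE on the blockchain.
blockchain = {"record_001": cp_abe_encrypt(sk, ("AND", ["doctor", "hospital_H1"]))}

# Data consumer side: recover SK via CP-ABE, then decrypt the cloud ciphertext.
recovered_sk = cp_abe_decrypt(blockchain["record_001"], {"doctor", "hospital_H1"})
print(keystream_xor(recovered_sk, cloud_storage["record_001"]).decode())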

6 Security Analysis

6.1 Security Analysis of Blockchain

Blockchain has become popular due to its various properties that provide security. – Distributed: Blockchain is a distributed architecture. As the network transmission is not controlled by a central authority, the security risk is reduced. The data transmission does not go through a single entity. – Tamper Proof: Merkle trees makes the blockchain tamper proof. Everyone can verify all the information as everyone on the network has the copy of the ledger. In merkle tree, leaf node holds hash of the transaction and every intermediary node stores the hash of the left and right child’s combination. Any modification to any of the transactions will be visible in the root hash


If there is a change in the root hash value, the block hash will change, and then the previous-hash field of the next block no longer matches, producing a cascading effect. In simpler words, if we change any of the blocks, every following block changes as well, and this change is easily identified.
– Verifiability: Everyone in the network can check all the transactions, as a copy of the blockchain is distributed to all the nodes. If any change is made by any entity in any copy of the blockchain, the rest of the copies remain intact.
– Non-repudiation: Digital signatures are used to validate the origin of a transaction. Every transaction carried out on the blockchain is signed with the sender's private key, which ensures that all transactions were carried out by the legitimate entity only.
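A minimal sketch of the cascading effect described above: each block stores the hash of its predecessor, so altering one transaction changes that block's hash and invalidates every later link. Plain `hashlib` is used and the chain is a toy list; this is illustrative only, not the authors' implementation.

```python
import hashlib, json

def block_hash(block: dict) -> str:
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

# Build a tiny chain of three blocks, each linked to the previous block's hash
chain = []
prev = "0" * 64
for tx in ["tx-A", "tx-B", "tx-C"]:
    block = {"tx": tx, "prev_hash": prev}
    prev = block_hash(block)
    chain.append(block)

def verify(chain):
    prev = "0" * 64
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        prev = block_hash(block)
    return True

print(verify(chain))        # True
chain[0]["tx"] = "tx-X"     # tamper with the first block
print(verify(chain))        # False: the change cascades to every later block
```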

6.2 Security Analysis of Hierarchical CP-ABE

Resistance Against Collusion Attack: A collusion attack is a security attack in which a node deliberately makes a secret agreement with an adversary, or is somehow forced into such an agreement, allowing the attacker to access the system's private data [13]. To get to the message M, the ciphertext is decrypted as

M = c / (e(C, D) / e(g, g)^{us})                                   (1)

where c = M · e(g, g)^{αs} (refer to bilinear maps, Sect. 4.1). In e(g, g)^{us}, the user's key is tied to u, a unique random value. Retrieving e(g, g)^{us} from c is required to recover M, which can only be done if the user's key has enough attributes to fulfil the access tree T. The primary key components of each user are randomised with a random number, so adversaries cannot secretly combine their keys to gain full access to a message they are not entitled to. When malevolent users conspire with their keys, the conspired key components are randomised with different random values, preventing the polynomial interpolation needed to obtain e(g, g)^{us} [13]. As a result, our framework is immune to collusion attacks.

Resistance Against Cheating Attack: Our framework employs a hierarchical CP-ABE in which the ciphertext CT is linked to an access policy A and the keys are identified by attribute sets. Non-leaf nodes of A are threshold gates, whereas leaf nodes of A are attributes. The data owner picks a secret random number s to encrypt the message M that corresponds to A. Then, from the root node to the leaf nodes, s is spread over all the nodes of the tree A [13]. As a consequence, each node in the tree is given a secret random number. A user obtains M only if his key has enough attributes to meet the threshold associated with the root node of A and acquire e(g, g)^{us}, which may then be used with Eq. 1 to get M. In SAP, the keys of all members of a group are randomised using the same random number u, allowing them to aggregate their attributes. Furthermore, each group member is given a key that corresponds to


a set of attributes, so that all members of the group together are only permitted to satisfy A and acquire e(g, g)^{us} collectively, which is what allows the message M to be obtained [13]. Therefore, our framework is immune to cheating attacks [12, 13].
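The collusion and cheating resistance above relies on the secret s being spread over the access tree so that only a qualifying attribute set can interpolate it back. The sketch below shows that idea in isolation with a simple (t, n) polynomial secret sharing over a prime field; it is an illustration of threshold interpolation only, not the paper's pairing-based construction, and the prime modulus is an arbitrary choice.

```python
import random

P = 2**61 - 1  # a Mersenne prime used as the field modulus (illustrative)

def make_shares(secret: int, threshold: int, n: int):
    coeffs = [secret] + [random.randrange(P) for _ in range(threshold - 1)]
    poly = lambda x: sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0 recovers the constant term (the secret)
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = make_shares(secret=123456789, threshold=2, n=4)
print(reconstruct(shares[:2]))  # any 2 shares recover the secret
print(reconstruct(shares[:1]))  # a single colluder's share yields a useless value
```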

6.3 Our Security Model

The following theorems establish the security of our framework.

Theorem 1: Our scheme is secure under single-party control.
Proof: There is no central authority regulating network transmission, since the key creation and distribution process is divided among numerous parties, which decreases the security risk.

Theorem 2: The framework is secure against tampering.
Proof: Merkle trees make the blockchain tamper proof. If any transaction is changed, that change is reflected in the root hash value. If the root hash value changes, the block hash changes and the previous-hash link of the next block breaks, producing a cascading effect. To put it another way, if we modify one of the blocks, the following block also changes, and this change is readily visible.

Theorem 3: The framework is secure against repudiation attacks.
Proof: Digital signatures are used to validate the origin of a transaction. Every transaction carried out on the blockchain is signed with the sender's private key, which ensures that all transactions were carried out by the legitimate entity only.

Theorem 4: Our framework is resistant to collusion and cheating attacks.
Proof: As described in Sect. 6.2, the proposed framework is secure against collusion and cheating attacks.

Theorem 5: No one can steal a patient's personal medical data.
Proof: The encrypted data is incomprehensible to adversaries, so they cannot steal vital information about patients.

Theorem 6: The proposed framework is scalable.
Proof: The proposed framework uses a hierarchical CP-ABE to reduce the key generation and distribution workload. The scheme decentralises the key issuer across the levels of the hierarchy: the TTP only gives the key to the CA, the CA produces the key for the DA, and the DA gives keys to the end users. As a result, the key generation activity is distributed across the different levels of the structure.

6.4 Formal Security Proof Using BAN Logic

The correctness of the proposed protocols has been evaluated using BAN logic, developed by Burrows et al. (1989) as a logic of belief and action [18]. It reasons about the beliefs of the protocol's trusted parties and how those beliefs are propagated through the communication steps, in order to detect flaws in authentication protocols [18]. Table 2 shows the basic BAN logic notations.

Table 2. BAN logic notations

A |≡ X        A believes X
A ⊲ X         A sees X
A |∼ X        A once said X
A ⇒ X         A has jurisdiction over X
#(X)          Statement X is fresh
⟨X⟩_Y         X combined with the formula Y
{X}_K         X encrypted under the key K
A ↔^K B       A and B may use the shared key K to communicate
→^K B         B has K as a public key

The DP's (Data Producer) initial assumptions are as follows:
A(1.1): DP |≡ (→^{K_SK} DP)
A(1.2): DP |≡ #(K_SK)

The DC's (Data Consumer) initial assumptions are as follows:
A(2.1): DC |≡ (→^{K_pub} DC)
A(2.2): DC |≡ #(K_pub)
A(2.3): DC |≡ γ_CP-ABE

The B's (Blockchain) initial assumptions are as follows:
A(3.1): B |≡ DP |≡ (→^{K_SK} DP)
A(3.2): B |≡ DC |≡ (→^{K_pub} DC)
A(3.3): B |≡ DP ⇒ K_SK
A(3.4): B |≡ DC ⇒ K_pub
A(3.5): B |≡ DP |≡ #(K_SK)
A(3.6): B |≡ DC |≡ #(K_pub)

Goals: The following goals assure the proposed protocol's security.
G1: B |≡ DP |∼ T1
G2: B |≡ DP |∼ T2
G3: DC ⊲ data
G4: DC |≡ DP |≡ T1
G5: DC |≡ DP |≡ T2

Data Encryption Phase:
E(1.1): (DP → B): B ⊲ ⟨T1⟩_{K_SK}
E(1.2): (DP → C): C ⊲ {data}_{K_SK}
E(1.3): (DP → B): B ⊲ {K_SK}_{CP-ABE} and C ⊲ ⟨T2⟩_{K_SK}


Data Decryption Phase:
D(1.1): (B → DC): DC ⊲ ⟨T1⟩_{K_SK}
D(1.2): (DC → B): B ⊲ SC (smart contract)
D(1.3): (B → DC): DC ⊲ T2 and DC ⊲ {K_SK}_{CP-ABE}
D(1.4): (C → DC): DC ⊲ {data}_{K_SK}

Theorem 1: The blockchain believes that the data producer once created transaction T1.
Proof: Z1 is derived from assumptions A(3.1) and A(3.3), together with the jurisdiction rule [18]:

(A |≡ B ⇒ X, A |≡ B |≡ X) / (A |≡ X)

We have (Z1) as:

(B |≡ DP ⇒ K_SK, B |≡ DP |≡ (→^{K_SK} DP)) / (B |≡ (→^{K_SK} DP))

Based on E(1.1), Z1 and the message-meaning rule [18]:

(A |≡ B ↔^k A, A ⊲ {X}_k) / (A |≡ B |∼ X)

we have (Z2) as:

(B |≡ (→^{K_SK} DP), B ⊲ ⟨T1⟩_{K_SK}) / (B |≡ DP |∼ T1)

Hence G1 is achieved. Likewise, G2, G4 and G5 are achieved as well.

Theorem 2: The patient is guaranteed that only data consumers with the necessary attributes have access to the medical data.
Proof: Consider D(1.1), in which DC receives transaction T1 from the blockchain. If the necessary attributes are present, the outcome of the smart contract SC will be valid. Using D(1.1), D(1.3) and D(1.4), we have (Z3) as:

DC ⊲ T2, DC ⊲ {K_SK}_{CP-ABE}, DC ⊲ {data}_{K_SK}

Using A(2.3) and Z3, we have (Z4) and (Z5) as:

(Z4): (DC ⊲ {K_SK}_{CP-ABE}, DC |≡ γ_CP-ABE) / (DC ⊲ K_SK)
(Z5): (DC ⊲ K_SK, DC ⊲ {data}_{K_SK}) / (DC ⊲ data)

Hence goal G3 is achieved.

7 Conclusion

This paper provides a novel and secure framework for sharing medical data among data consumers such as hospitals, medical insurance firms, medical research laboratories and other entities who want to gain access to the medical data. In the proposed scheme, hierarchical attribute-based encryption has been integrated with blockchain technology and a cloud computing architecture. Attribute-based encryption allows the patient to implement fine-grained access control over their personal medical data, hierarchical ABE protects the system from collusion and cheating attacks, and blockchain technology makes the system resistant to various security threats such as data tampering, repudiation, loss of data by intention or by natural disaster, and hacking of data. The security analysis of the proposed scheme has been demonstrated, and BAN logic has been used to prove the correctness of the proposed scheme's functionality.


References

1. Zuo, Y., Kang, Z., Xu, J., Chen, Z.: BCAS: a blockchain-based ciphertext-policy attribute-based encryption scheme for cloud data security sharing. Int. J. Distrib. Sensor Netw. 17, 155014772199961 (2021)
2. Pournaghi, S.M., Bayat, M., Farjami, Y.: MedSBA: a novel and secure scheme to share medical data based on blockchain technology and attribute-based encryption. J. Ambient Intell. Hum. Comput. 11(11), 4613–4641 (2020). https://doi.org/10.1007/s12652-020-01710-y
3. Malamas, V., Kotzanikolaou, P., Dasaklis, T.K., Burmester, M.: A hierarchical multi blockchain for fine grained access to medical data. IEEE Access 8, 134393–134412 (2020)
4. Lee, T.-F., Li, H.-Z., Hsieh, Y.-P.: A blockchain-based medical data preservation scheme for telecare medical information systems. Int. J. Inf. Secur. 20(4), 589–601 (2020). https://doi.org/10.1007/s10207-020-00521-8
5. Chen, Y., Ding, S., Xu, Z., Zheng, H., Yang, S.: Blockchain-based medical records secure storage and medical service framework. J. Med. Syst. 43(1), 1–9 (2019)
6. Zhang, A., Lin, X.: Towards secure and privacy-preserving data sharing in e-health systems via consortium blockchain. J. Med. Syst. 42(8), 1–18 (2018). https://doi.org/10.1007/s10916-018-0995-5
7. Li, H., Zhu, L., Shen, M., Gao, F., Tao, X., Liu, S.: Blockchain-based data preservation system for medical data. J. Med. Syst. 42(8), 1–13 (2018). https://doi.org/10.1007/s10916-018-0997-3
8. Patel, V.: A framework for secure and decentralized sharing of medical imaging data via blockchain consensus. Health Inf. J. 25(4), 1398–1411 (2019)
9. Esposito, C., De Santis, A., Tortora, G., Chang, H., Choo, K.R.: Blockchain: a panacea for healthcare cloud-based data security and privacy? IEEE Cloud Comput. 5(1), 31–37 (2018)
10. Ali, O., Shrestha, A., Soar, J., Wamba, S.F.: Cloud computing-enabled healthcare opportunities, issues, and applications: a systematic review. Int. J. Inf. Manag. 43, 146–158 (2018)
11. Azaria, A., Ekblaw, A., Vieira, T., Lippman, A.: MedRec: using blockchain for medical data access and permission management. In: 2016 2nd International Conference on Open and Big Data (OBD), pp. 25–30 (2016)
12. Ahuja, R., Mohanty, S.K., Sakurai, K.: A scalable attribute-set-based access control with both sharing and full-fledged delegation of access privileges in cloud computing. Comput. Electr. Eng. 57, 241–256 (2017)
13. Ahuja, R., Mohanty, S.K.: A scalable attribute-based access control scheme with flexible delegation cum sharing of access privileges for cloud storage. IEEE Trans. Cloud Comput. 8(1), 32–44 (2017)
14. Zhiqiang, G., Lingsong, H., Hang, T., Cong, L.: A cloud computing based mobile healthcare service system. In: 2015 IEEE 3rd International Conference on Smart Instrumentation, Measurement and Applications (ICSIMA), Kuala Lumpur, Malaysia, pp. 1–6 (2015)
15. Sultan, N.: Making use of cloud computing for healthcare provision: opportunities and challenges. Int. J. Inf. Manag. 34, 177–184 (2014)
16. Bethencourt, J., Sahai, A., Waters, B.: Ciphertext-policy attribute-based encryption. In: 2007 IEEE Symposium on Security and Privacy (SP 2007) (2007)
17. Antonopoulos, A.M.: Mastering Bitcoin: Unlocking Digital Crypto-Currencies, 1st edn. O'Reilly Media, Inc., Newton (2014)
18. Burrows, M., Abadi, M., Needham, R.M.: A logic of authentication. Proc. Royal Soc. Lond. A Math. Phys. Sci. 426(1871), 233–271 (1989)

Adaptive Neuro Fuzzy Inference System for Monitoring Activities in Electric Vehicles Through a Hybrid Approach and Blockchain Technology Tapashri Sur(B) , Sudipto Dhar, Sumit Naskar, Champak Adhikari, and Indrajit Chakraborty Budge Budge Institute of Technology, 700137 Kolkata, India {tapashrisur,sudiptodhar,sumitnaskar,champakadhikari, indrajitchakraborty}@bbit.edu.in http://www.bbit.edu.in/

Abstract. The rapid upsurge in the adoption of electric vehicles in the global market for improving the sustainability of transportation systems has raised apprehensions regarding their impact on the electric network during peak load hours, which may lead to power blackouts, voltage drops, etc. The effect of such demand-side energy equipment as electric vehicles on the grid drives the problem of locating charging infrastructure, which is constrained by production cost, battery charging time, and the limitations of the battery. In this paper we propose a new methodology that schedules the charging of an Electric Vehicle using an Adaptive Neuro-Fuzzy Inference System (ANFIS). It efficiently handles non-linear classification and reduces power fluctuation. Hybridization is also effective in the development of prediction models, predominantly for renewable energy systems. Furthermore, with the incorporation of Blockchain Technology, the scheme accomplishes a protected and transparent provision at an acceptable expected cost through decentralization of the network.

Keywords: Neuro-fuzzy control · Blockchain system · Electric vehicles · Smart grid · FPGA

1 Introduction

Renewable resources have been utilized in the energy matrix [1] in recent years. To attain effective dispatch in both the utilization and consumption of renewables, the Simulated Power Plant acts as a transitional catalyst amongst distributed energy resources, the power grid, and Electric Vehicles. Also, the V2G (Vehicle to Grid) [2] concept will actively be used in the smart grid for charging any Electric Vehicle in the designated network. A cohesive system is attained through which the energy stored in the battery is returned and reabsorbed [11]. In this paper,


we propose a theoretical methodology based on a hybrid approach in which we secure the communication among the Simulated Power Plant aggregator and the Electric Vehicle nodes, concurrently with an AI-driven, blockchain-integrated system that controls the power management to resolve the aforementioned problems in the smart grid. To regulate an Electric Vehicle, a Field Programmable Gate Array provides support to the Adaptive Neuro-Fuzzy Inference System (ANFIS), Fig. 1.

Fig. 1. ANFIS model

Each of the five layers of nodes in an ANFIS is designated with specific tasks. As ANFIS is a feed-forward neural network, the square nodes in the model are adaptive and depend on the output attained from the learning procedure [3]. At first, the parameters are set as premise parameters, and more appropriate membership functions are incorporated during the learning process.

Proposed Work: In this paper we propose using the membership functions from layer one; in the second layer, each node is filled with the rules constructed from the previous layer. The output obtained from each node gives the firing strength ω_i of the individual rule [13].

Second layer output function: o/p_i(x) = μ_a(x) · μ_b(x) = ω_i                         (1)
Third layer output function: ω̄_i = ω_i / (ω_1 + ω_2 + ω_3 + ... + ω_n)                (2)
Fourth layer output function: o/p_i = ω̄_i f_i = ω̄_i (A_i x + B_i y + C_i)             (3)
Fifth layer output function: o/p = Σ_i ω̄_i f_i                                          (4)
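A compact NumPy sketch of the layer computations in Eqs. (1)–(4), assuming Gaussian membership functions and first-order Sugeno consequents; the parameter values below are arbitrary placeholders, not trained values from the paper.

```python
import numpy as np

def gauss(x, mean, sigma):
    return np.exp(-((x - mean) ** 2) / (2 * sigma ** 2))

def anfis_forward(x, y, premise, consequent):
    # Layers 1-2: membership degrees and rule firing strengths w_i (Eq. 1)
    w = np.array([gauss(x, mx, sx) * gauss(y, my, sy)
                  for (mx, sx, my, sy) in premise])
    # Layer 3: normalised firing strengths (Eq. 2)
    w_bar = w / w.sum()
    # Layer 4: rule-wise outputs f_i = A_i*x + B_i*y + C_i (Eq. 3)
    f = np.array([A * x + B * y + C for (A, B, C) in consequent])
    # Layer 5: overall output as the weighted sum (Eq. 4)
    return float(np.sum(w_bar * f))

premise = [(0.2, 0.1, 0.3, 0.1), (0.5, 0.1, 0.5, 0.1), (0.8, 0.1, 0.7, 0.1)]
consequent = [(1.0, 0.5, 0.1), (0.8, 0.2, 0.0), (0.3, 0.9, -0.1)]
print(anfis_forward(0.4, 0.6, premise, consequent))
```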


In the PID controller case, four parameters sp, si, sd and sg are used to tweak and optimize the controller. The differential equation of the fractional-order controller for the non-linear classification can be defined as:

x(t) = sp·e(t) + si·D^{-β} e(t) + sd·D^{γ} e(t) + sg·D^{-α} e(t)

The continuous transfer function is given by:

T_b = Y(s)/P(s) = sp + si/s^{β} + sd·s^{γ} + sg·s^{-α}

The Federated Machine Learning method [13] is used for EV charge prediction, where the EV fleet exchanges data with the grid for an effectual electrical supply [4]. Every time the power grid requests electricity, the network provides the requisite driving status, and to provide security to the EVs, the nodes are fed into a hash function, referred to as a Merkle root tree structure, which provides efficiency both in memory and in computation in Blockchain Technology [5]. Figure 2 explains the Merkle root [h1..hz] structure with nodes hz, where hz is the hash function fed into each classification [TX1..TXn] in each node through a bottom-up approach. Adding new blocks is contested among the adherents of the distributed ledger. A Merkle tree is equipped with two defined operators, one being the hash operator and the other an assignment operator φ, which maps the set of nodes to the set of strings: for a node m, φ(m) ∈ {0, 1}^k. For the two child nodes m_left and m_right of any interior node m_parent, the assignment φ is required to satisfy φ(m_parent) = hash(φ(m_left) ∥ φ(m_right)). The hash operator is instantiated with a candidate one-way function such as SHA-1.

Fig. 2. Merkle root architecture
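A minimal bottom-up Merkle-root computation matching φ(m_parent) = hash(φ(m_left) ∥ φ(m_right)). SHA-256 is used here instead of SHA-1, and duplicating the last hash on odd levels is one common convention, not necessarily the one assumed in Fig. 2.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(transactions):
    level = [h(tx.encode()) for tx in transactions]   # leaf hashes for TX1..TXn
    while len(level) > 1:
        if len(level) % 2:                            # duplicate last hash if odd
            level.append(level[-1])
        level = [h(level[i] + level[i + 1])           # parent = hash(left || right)
                 for i in range(0, len(level), 2)]
    return level[0]

print(merkle_root(["TX1", "TX2", "TX3", "TX4"]).hex())
```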

2 Literature Review

We have surveyed a few interrelated works, which focused mainly on conventional optimal approaches. The paper [1] proposed an integrated system, AEBIS, for managing power. The precise estimation of power consumption produced a dependable and apt provision to stream surplus electricity [1], and the AI-integrated chips confirm efficient performance. Various algorithms have been introduced for charging the vehicles [10] through federated learning approaches, and the introduction of Blockchain technology ensures a protected and transparent provision.

3 ANFIS Modelling and Analysis of Charging EV's During Peak Hours

ANFIS is a combination of a fuzzy controller and a neural network that provides the controller with two intelligent capabilities: self-tuning and adaptivity. Combining these two procedures attains a good perception in both quality and quantity. The data obtained are used to adjust the parameters of the ANFIS model. The Takagi-Sugeno-Kang [12] structure of ANFIS is used for training and modifying the standard neural network. To keep it simple, the following assumptions are made: (a) the model has three inputs x1, x2, x3 and one output z; (b) it has three rules R1, R2 and R3.

R1: if x1 is A1 and y1 is B1 then f_x1 = a1·x1 + b1·y1 + s1
R2: if x2 is A2 and y2 is B2 then f_x2 = a2·x2 + b2·y2 + s2
R3: if x3 is A3 and y3 is B3 then f_x3 = a3·x3 + b3·y3 + s3

The membership functions implicitly have a Gaussian distribution (Fig. 3) for the thresholds, since they are not homogeneously scattered and each charging point of the EVs has an individual load profile, which is calculated in accordance with the load and the fit of the locations [1]. The membership functions feed the fuzzy logic through 1) the state of charge of the EVs, 2) the discrete load profile of the grid, and 3) the photovoltaic panels for energy production. With a lower state of charge and a lesser load profile, the EVs are preferably charged with PVs, which are portable, and the nodes of the charging point can be ideal, with higher weights [6]. Through fuzzy logic, we amass the data, formulate a number of fractional truths and further aggregate them into higher truths that surpass a certain threshold value in the result. To circumvent a disproportionate amount of energy produced by the renewable sources, which could harm the equipment mounted on the grid, this excess energy of the EVs can be stored in the storage systems installed on the grid or in households [7]. This results from the members agreeing upon their preferred nodes, which helps generate blocks that can be sent through all the nodes, and this transaction is crucial in deciding to charge the Electric Vehicles during high electricity demand so that grid surplus risk can be minimized [8]. The local gradient is calculated as

Δg_w^i ← ∂e(w_i)/∂w_i                                   (5)


Fig. 3. State of charge

We upload the function from the ANFIS system, verify it and add it to the Blockchain network. For each individual Electric Vehicle, the output function is the prediction of power consumption [9]. We have established a fuzzy rule base in IF-THEN format built on the Sugeno fuzzy inference system. While the training data is used to fit the parameters of the model, the aim is to predict unseen data so that the test-set error is reduced. To ensure the predicted power consumption, the aggregator, before initializing, needs to wait to collect the local models (r = 0) and train the model in the ANFIS system described above. Scatter plots show the predicted values and the trend curve, and for better corroboration the spurious correlation is eliminated [10]. The skfuzzy module in Python was used to create the fuzzy logic and train the values to incorporate them into the FPGA chip. The EVs with greater mean weight are permitted to be charged.

Algorithm 1. Decentralized FL-based learning structure design

f1.ruleblock(
    EV name = "", EV description = "",
    enabled = true,
    minimum = 0.000, maximum = 1.000,
    lock-range = false, lock-previous = false,
    aggregation = none, disjunction = none, conjunction = none,
    calculate local gradient,
    activation = f1.generate,
    rules = [
        f1.rule create("R1: if x1 is A1 and y1 is B1 then f_x1 = a1*x1 + b1*y1 + s1"),
        f1.rule create("R2: if x2 is A2 and y2 is B2 then f_x2 = a2*x2 + b2*y2 + s2"),
        f1.rule create("R3: if x3 is A3 and y3 is B3 then f_x3 = a3*x3 + b3*y3 + s3")
    ])


Fig. 4. Mechanism of selecting EVs for charging
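The selection mechanism of Fig. 4 can be sketched as follows: each EV receives a fuzzy weight from its state of charge and the local load profile (Gaussian memberships, as assumed above), and only EVs whose weight exceeds the static threshold are admitted for charging. The membership parameters and the product aggregation are illustrative assumptions, not the trained values used by the authors.

```python
import numpy as np

def gauss(x, mean, sigma):
    return np.exp(-((x - mean) ** 2) / (2 * sigma ** 2))

def fuzzy_weight(soc, load):
    # A low state of charge and a low grid load make an EV more preferable to charge
    prefer_low_soc = gauss(soc, mean=0.2, sigma=0.25)
    prefer_low_load = gauss(load, mean=0.3, sigma=0.30)
    return prefer_low_soc * prefer_low_load          # weight in [0, 1]

evs = {"EV1": (0.15, 0.35), "EV2": (0.80, 0.40), "EV3": (0.30, 0.25)}
threshold = 0.5
selected = [name for name, (soc, load) in evs.items()
            if fuzzy_weight(soc, load) > threshold]
print(selected)   # EVs admitted for charging in this round
```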

The fuzzy weight dedicated to the next round of the sequence, derived from the State of Charge, is obtained from the weights higher than the threshold, since every grid node is designated with a distinct charging point in the grid [11] (Fig. 4). The technique for finding the requisite nodes in the power grid to charge the Electric Vehicles during peak hours is explained in Fig. 5. The core task is to envisage the amount of electricity supplied by the Electric Vehicles. At the beginning, the residual current power of each vehicle is calculated, and the maximum capacity of the battery is denoted as the State of Charge. The current remaining power is then compared with the projected power consumption (Projected Consumer Power). If the residual power is less than the projected consumption of the Electric Vehicle, then the supply of electricity SEL at that instant is halted and the available energy AE is set to zero; the residual power will then be used by the next EV in the queue. The Additional Electric Load ASoE on the power grid is made available to the Electric Vehicles, and a parameter ρ is used to denote the discharge rate of the EVs. If the supply of electricity SEL is less than the Additional Electric Load, then the residual power in the grid is used as an additional measure against the shortage of power. In this case the discharge of electricity DS is the discharge rate ρ multiplied by the available electricity or power AE. Equations 6 and 7 describe the methodology for finding the remaining power, after which the aggregator uploads the collected data set from the Electric Vehicles. This enhances the efficiency of charging the EVs with minimum residual consumption of power. As described above, the training of the data set is the significant step needed to anticipate the possibility of a power deficiency.

ρ = ASoE / SEL                               (6)
DS_i ← ρ ∗ AS_i                              (7)
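The supply-side decision of Fig. 5 and Eqs. (6)–(7) can be written as a short routine: if the residual power of an EV is below its projected consumption, its supply is halted, otherwise the discharge rate ρ = ASoE/SEL scales the available energy. Taking AE as the surplus over projected consumption is an assumption made here for illustration, and the numbers are placeholders.

```python
def ev_supply(residual_power, projected_consumption, asoe, sel):
    """Decide how much an EV may discharge back to the grid during peak hours."""
    if residual_power < projected_consumption:
        return 0.0                           # SEL halted, available energy AE set to zero
    ae = residual_power - projected_consumption  # assumed definition of AE
    rho = asoe / sel                         # Eq. (6): discharge rate
    return rho * ae                          # Eq. (7): discharged energy DS_i

print(ev_supply(residual_power=40.0, projected_consumption=15.0, asoe=10.0, sel=50.0))
```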

Interoperability in charging an electric vehicle concerns the extent of sharing and data interchange among different charging station points.


This data aggregation and data immutability enhance the performance of the EVs. Fuzzy logic is used at the inception of each iteration, with fuzzy weights ranging from 0 to 1 spread across the network in accordance with the state of charge and load profile of the EVs. Blocks are added to the chain from the nodes, starting with the ones having the highest weights, along with the distributed ledger. The flowchart explicitly describes the mechanism to find the adjacent charging point. The hash code in the required block contains the identification details of each Electric Vehicle, which helps the efficient working of the EVs while charging.

4 Hybrid and Conventional Integration of Blockchain in the Electric Vehicles

Blockchain comprises a distributed digital ledger through which all transactions are shared on a secure and placid platform. To evaluate the Electric Vehicle's performance, a conceptual prototype model based on ANFIS and Blockchain is presented; the hybrid ANFIS is characterized by the Blockchain. In this framework, the FPGA in the Electric Vehicle sends information when the power is at a minimum and locates the nearest power grid station to charge the EV. To find the nearest grid in a location, we use a KD-tree classification: the k-dimensional method recursively partitions the data along the data axes (Fig. 6).

4.1 Experimental Validation

The members of the distributed ledger provide security to the chips in the EVs. To protect the information of the EVs, Blockchain Technology protects the transaction through the encrypted hash code: the client of the EV feeds the information into the hash function and creates the hash data [12]. Thus, the transaction is trusted through Blockchain Technology. The process of evaluating the performance of the EVs, including assembling, relocating, dealing and analysing, is accomplished in a safe, reliable and transparent environment (Fig. 7). Alongside the hybrid ANFIS model, the learning data is continuously updated with the historical data, which increases the efficiency of the system in the EVs. The state of charge of the Electric Vehicle, the load profile and the residual power are taken into account when calculating the fuzzy weight for each assigned member. This charging methodology for Electric Vehicles enhances the usability of the power grid through Blockchain Technology. The simulation outcomes were generated by using different plant structures and changing the values. The EV's charging point at various driving cycles based on ANFIS parameters is explained in Table 1.


Fig. 5. Flowchart for finding the requisite nodes for charging EVs in the power grid

Adaptive ANFIS, Root Mean Square Error, EV Grid Charging Switch Sugeno-type fuzzy inference systems, Quadratic Polynomial Mean function, and Data Handling Through Group Method are the parameters taken as a supposition for analysing the required values of the EV's charging test, passive ANFIS and adaptive ANFIS. A fuzzy weight higher than the static threshold value is evenly disseminated over all the designated nodes in the dispensed network, with Lasso (alpha = 0.1), r² on test data: 0.648064, and ElasticNet (alpha = 0.1, l1_ratio = 0.8), r² on test data: 0.632515.


Fig. 6. KD Tree Structure for finding the nearest grid to charge
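A sketch of the nearest-grid lookup described in Sect. 4, assuming SciPy is available: the charging-station coordinates are indexed in a k-d tree and queried with the EV's current position. The coordinates are made up for illustration.

```python
import numpy as np
from scipy.spatial import KDTree

# (x, y) coordinates of candidate grid charging stations (illustrative)
stations = np.array([[12.97, 77.59], [12.93, 77.62], [13.01, 77.55], [12.95, 77.70]])
tree = KDTree(stations)

ev_position = np.array([12.96, 77.60])
distance, index = tree.query(ev_position)   # nearest station by Euclidean distance
print(index, stations[index], distance)
```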

Fig. 7. Stages of improving the performance of the EVs with Blockchain Architecture

When a given node has been charged, the fuzzy weight committed to it in the next round is lowered owing to its higher State of Charge value, which resulted in a diminution of grid congestion compared with the nodes that did not get the chance to be charged from the grid.

Table 1. EV's Charging Point at various driving cycles based on ANFIS

Case                                                          | EV's charging test | Passive ANFIS | Adaptive ANFIS
–                                                             | 50                 | 837           | 970
Root mean square error                                        | 47                 | 877           | 230
EV grid charging switch Sugeno-type fuzzy inference systems   | 31                 | 25            | 415
Quadratic polynomial mean function                            | 35                 | 144           | 2356
Data handling through group method                            | 45                 | 300           | 556

5 Conclusion

In this paper, we have proposed ANFIS-aided Blockchain Technology in the Electric Vehicle for managing power on the smart grid platform. In detail, we present the essential components of both the hardware and software of the proposed approach. A reliable prediction technique for power consumption helps in reducing the delay of energy management and monitoring. With a low rate of latency and power consumption, the ANFIS processor is trained proficiently using the raw information, and the security of the EVs in return is provided by Blockchain Technology. In our future work, the competency of arranging the Blockchain Network will be enhanced by exploring communication mechanisms and storage conventions in Blockchain Technology. We also intend to implement an IoT-based location detection technique to recharge electric vehicles, to ensure appropriate utilization of the power grid during peak time.

References

1. Wang, Z., Ogbodo, M., Huang, H., Qiu, C., Hisada, M., Abdallah, A.B.: AEBIS: AI-enabled blockchain-based electric vehicle integration system for power management in smart grid platform. IEEE Access 8, 226409–226421 (2020). https://doi.org/10.1109/ACCESS.2020.3044612
2. Guille, C., Gross, G.: A conceptual framework for the vehicle-to-grid (V2G) implementation. Energy Policy 37(11), 4379–4390 (2009). ISSN 0301-4215. https://doi.org/10.1016/j.enpol.2009.05.053
3. Saadatmandi, S., Roscia, M., Lazaroiu, G.: Blockchain and fuzzy logic application in EV's charging. In: Proceedings (2020). https://doi.org/10.1109/ICRERA49962.2020.9242662
4. Kaga, Y., Fujio, M., Naganuma, K., Takahashi, K., Murakami, T., Ohki, T., Nishigaki, M.: A secure and practical signature scheme for blockchain based on biometrics. In: Liu, J.K., Samarati, P. (eds.) ISPEC 2017. LNCS, vol. 10701, pp. 877–891. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-72359-4_55
5. Chen, R.-Y.: A traceability chain algorithm for artificial neural networks using TS fuzzy cognitive maps in blockchain. Future Gener. Comput. Syst. 80, 198–210 (2018)
6. Liang, G., Weller, S.R., Luo, F., Zhao, J., Dong, Z.Y.: Distributed blockchain-based data protection framework for modern power systems against cyber attacks. IEEE Trans. Smart Grid 10(3), 3162–3173 (2018)
7. Sarker, M.R., Dvorkin, Y., Ortega-Vazquez, M.A.: Optimal participation of an electric vehicle aggregator in day-ahead energy and reserve markets. IEEE Trans. Power Syst. 31(5), 3506–3515 (2015)
8. Li, F., Qin, J., Zheng, W.X.: Distributed Q-learning-based online optimization algorithm for unit commitment and dispatch in smart grid. IEEE Trans. Cybern. 50(9), 4146–4156 (2020)
9. Najafi, S., Shafie-khah, M., Siano, P., Wei, W., Catalão, J.P.S.: Reinforcement learning method for plug-in electric vehicle bidding. IET Smart Grid 2(4), 529–536 (2019)
10. Jadidbonab, M., Mohammadi-Ivatloo, B., Marzband, M., Siano, P.: Short-term self-scheduling of virtual energy hub plant within thermal energy market. IEEE Trans. Ind. Electron. 68(4), 3124–3136 (2020). https://doi.org/10.1109/TIE.2020.2978707
11. Lazaroiu, C., Roscia, M., Saadatmandi, S.: Blockchain and fuzzy logic application in EV's charging. In: 2020 9th International Conference on Renewable Energy Research and Application (ICRERA), pp. 315–320 (2020). https://doi.org/10.1109/ICRERA49962.2020.9242662
12. Chou, K.P., Prasad, M., Lin, Y.Y., Joshi, S., Lin, C.T., Chang, J.Y.: Takagi-Sugeno-Kang type collaborative fuzzy rule based system. In: IEEE Symposium on Computational Intelligence and Data Mining (CIDM) 2014, pp. 315–320 (2014). https://doi.org/10.1109/CIDM.2014.7008684
13. Yang, Q., Liu, Y., Chen, T., Tong, Y.: Federated machine learning: concept and applications. ACM Trans. Intell. Syst. Technol. 10, 1–19 (2019). https://doi.org/10.1145/3298981

Application of BLOCKCHAIN in Agriculture: An Instance of Secure Financial Service in Farming Sumit Das1(B)

, Manas Kumar Sanyal2 , and Suman Kumar Das3

1 JIS College of Engineering, Information Technology, Kalyani 741235, India

[email protected] 2 Department of Business Administration, University of Kalyani, Kalyani 741235, India

[email protected] 3 Zenlabs, Zensar Technologies, Pune 411014, India

Abstract. Recent studies have observed numerous incidents of farmer suicides on account of bankrupt and underperforming agriculture. The worldwide population is increasing rapidly, and hence the use of land for housing as well as industrial areas is increasing in the same proportion. The productive lands for agriculture are declining, and more returns must be produced to meet the country's needs. Although demand increases, farmers suffer from some important problems, such as an increase in the initial investment required to establish solid ground because of high interest rates, the actual price of the crops being affected by the mediation of intermediaries in the market, and the need to survey market bias as well as the needs of the buyer. The novelty of this article is to resolve these issues through the proposed blockchain architecture, in which the farmer and the banker are entirely kept apart from the intermediaries. The contribution of the article is to implement blockchain technology, security issues and digital consciousness in agricultural farming. The objective is to provide the sustainable utility of this technology in financing.

Keywords: Blockchain · Double spending · Digital consciousness · Security issues · Proof of concept

1 Introduction

The substantial trouble in the agricultural market is the "break-off" between agriculturalists and customers. "Agro-chain" is a transparent blockchain platform where agriculturalists and customers can execute a cooperative cultivation strategy. The significant novelties of the proposed architecture are:

• The agriculturalists are not required to wait for the processing of a loan to raise the investment; a zero-interest fund can be provided by the customer.
• Consumers can obtain quality products at lower prices, since they have been financing the crops from the moment of farming.


• It is not necessary to have large farmland; even small agriculturalists and domestic agriculturalists can sell their crops as well as make profits.
• A well-planned blockchain can guarantee peer-to-peer updating on unchangeable chains. Consumers can find particular agriculturalists for particular crops.
• Farmers can build consumer loyalty based on product quality and the type of agriculture, which could eventually lead to better profits.
• Even consumers in low-income groups can finance crops depending on their requirements and can avoid retail price swings for the crops.
• The unchangeable accounting ledger in the blockchain guarantees transparency and minimises the probability of cheating.
• Smart programs stored on a blockchain can yield the best interlinking among agriculturalists and customers under different unexpected natural climatic conditions.

In due course, Agro-chain sets up a decentralized agricultural market where agriculturalists can simply raise funds for the crop together with the consumers who buy their products. As a result, consumers can be guaranteed good-quality crops at a reasonable price with an initial venture in the crops, in which agriculturalists and customers find the benefit and develop a sustainable environment for the future. The efficient agriculturalists will get the maximal outcome from their produce, and the efficient investors can guarantee the best quality crops for themselves. This study also introduces the state of the art of the transition to digital technology in the fields of banking, financial services and insurance. About 69% of the people in the globe are using a smartphone; now the question is how many of them are using it smartly? People put their data on different social networking sites, while at the other end some people earn money by using their data. It implies that people are acting as sensing elements in this era of the Intelligent Internet of Things (IoT), and they are the losers in two aspects: one as a source of data, the other as prey for hackers. The social networking sites acquire data through us freely, but we are not aware of that, and hence we are the losers. At present, people are almost always interconnected with electronic devices through their smartphones with the aid of the evolving technology that is the Internet of Things (IoT). Now the problem is that most of them do not know about the security issues of IoT. The devices are connected through WiFi or Bluetooth, which is not secure; it means a hacker can easily tap your devices and hack them. The question is how to resolve these issues? The answer is digital consciousness, which means using smart devices smartly and adopting new technology to know your devices. In banking, when anyone transacts money, she/he has to trust a third party in a centralized ledger system which encompasses many sub-systems, and hence it involves recurring costs. This system is not so secure because the third party is there. This insecurity phobia can be overcome by a decentralized technology called Blockchain. Blockchain technology uses a cryptographic algorithm and hash function for hardening security issues. In the subsequent sections, the authors precisely elaborate on Blockchain, the technology behind it, and security-related issues for self-consciousness. The double-spending problem was seen in third-party peer-to-peer distributed electronic payment systems.
Double spending is a type of attack that causes a problem for digital currencies, in which one user can spend the same digital asset more than once. Figure 1 depicts double spending: Bob spends a digital coin twice, once to Lisa and a copy of the same to Alice. Double spending can be eliminated by the application of blockchain.

Fig. 1. Double spending [1]

In blockchain technology, the digital information is represented by a 'block', whereas the public database is represented by the 'chain'; blocks are used to store the data about transactions. Figure 2 shows all the components of a blockchain. Blockchain: Blockchain is a distributed, decentralized, public ledger. It keeps transactions honest and consistent in terms of security throughout the worldwide economy. In 1991, Stuart Haber and W. Scott Stornetta first coined the term blockchain and wanted to perform secure transactions that could not be tampered with by anyone [2]. The bulk of the blocks in the chain of the network have to agree to complete the sender's or payee's transaction. The sender's or payee's transactions are sent to all the blocks or nodes, in which each block approves the transaction (proof-of-work), and the block is added to the blockchain network to complete the transaction, as shown in Fig. 2.

2 Literature Survey

In the recent literature, the authors' trade wanted to yield transparency for the customers and report to them on the source of the green materials and the regenerative farming process of the integral parts. The manifested plan visualised centralized transparency by including a blockchain which provides functional knowledge for enhancing productivity and security [4]. The scientists of Agri-Ledger tailored an agricultural supply-chain solution using modern advances in secured digital communication for shaping the architecture of future agriculture. They explored a better way of keeping track of the cash flow between the bankers and the treasurers [5]. Ban-Qu implemented how to harvest crops and mine gems and minerals in secured digital communication, shaping the architecture for future agriculturists with a blockchain platform. It ensures paperless, secure, green computing [6]. In a study of farming in Africa, the investigation found that the farming environment is not efficient due to poor recording, poor communication and transaction processes.


Fig. 2. Blockchain transaction mechanism [3].

The scientists try to eliminate these issues by incorporating smart contracts in farming for financial services, which save the farmer from dishonest middlemen [7]. In India, the government of AP explored blockchain for solving issues in farming in collaboration with ChromaWay, a Swedish startup. They try to simplify data management among the several stakeholders in the farming ecosystem for transparent transaction processing as well as record keeping. This will minimise the record-keeping cost by minimising trouble in the process of farming and make a worldwide revolution [8]. IBM initiated and implemented blockchain technology by amalgamating IoT devices for the welfare of agriculturists, the manufacturers of food for the nation. This smart, transparent record-keeping technology helps the farmer get quick crop insurance. They also get instant information about soil quality, cultivation constraints and forecasts about the growing crops for sustainable cultivation [9]. The management of agriculture can be handled well by blockchain along with IoT, which can maximize the satisfaction level of the farmers as well as the customers. Artificial intelligence (AI) and the Internet of Things (IoT) are used significantly with blockchain technology for the advancement of farming in recent startups. The data stored in the blockchain is extra seamless, and this data is ready for machine learning (ML). ML techniques can provide predictions and recommendations for crop-related queries, which is very essential for the agriculturist. The demand for crops can be acquired easily and efficiently through data analytics, as the data are transparent in nature. The food supply chain can be enhanced by blockchain, which ensures a reduction of food fraud in the food industry. The advancement of blockchain, AI and smartphones helps the farmer with the payment for their crops without any delay or any additional processing charge from their income [10]. The blockchain is a big digital electronic trust where third-party authentication is not required; the data is stored in a public block and no one can alter or remove it [11].


Bitcoin and smart contracts are the two significant innovations of blockchain, and they are being used by financial institutions recently. This secured financial instrument is secured by proof of work [12].

3 Methodology

In this section, the authors show the methodology of the entire system. This supply-chain mechanism tracks a cultivated product from the agriculturist to the customer. In today's generation, farmers depend heavily on bank loans for farming; in case of low productivity and less profit, farmers cannot manage to pay the loan (loan + loan interest). Micro-finance will help the farmers in this scenario. The workflow for the same is shown in Fig. 3.

Fig. 3. Micro-finance management associated with agriculture.

In the above process, a user (Farmer/Approver) needs to enroll in the system. Farmers need to fill in the details about the crop (quantity and price of the crop). Once the Farmer submits the details, they will be sent to the approver. The approver then checks the quality of the crop and will approve/reject or update the price of the crop as per its quality; afterwards, a LOT is assigned to the crop. With these details, as per the budget, the micro-finance management will fund the farmer. Figure 4 shows the Proof of Concept (PoC) level technical architecture of the above-mentioned workflow. The front end/client part interacts with the blockchain Ethereum network through the NodeJS middleware [13].

Fig. 4. Technical architecture of micro-finance management in the agricultural field.
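The workflow of Figs. 3 and 4 can be mocked off-chain to show the state transitions the smart contract has to enforce (submit → approve/reject → lot creation → funding). This Python sketch is only a stand-in for the authors' Solidity contract and NodeJS middleware; the names and rules are illustrative.

```python
class CropListing:
    def __init__(self, farmer_id, crop, quantity, price):
        self.farmer_id, self.crop = farmer_id, crop
        self.quantity, self.price = quantity, price
        self.state, self.lot_id = "SUBMITTED", None

    def review(self, approved, revised_price=None):
        assert self.state == "SUBMITTED"
        if not approved:
            self.state = "REJECTED"
            return
        if revised_price is not None:      # approver may update the price
            self.price = revised_price
        self.state = "APPROVED"

    def create_lot(self, lot_id):
        assert self.state == "APPROVED"
        self.lot_id, self.state = lot_id, "LOT_CREATED"

def fund(listing, budget, amount):
    # micro-finance aid is deducted from the overall budget
    assert listing.state == "LOT_CREATED" and amount <= budget
    return budget - amount

listing = CropListing("F-101", "Rice", quantity=500, price=20)
listing.review(approved=True, revised_price=18)
listing.create_lot("LOT-7")
remaining_budget = fund(listing, budget=10000, amount=1500)
print(listing.state, remaining_budget)
```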


There are three technologies within the blockchain to create a block and perform transactions [14]:

a) Time stamping: The transactions are ordered according to their time stamps so that the data can be stored chronologically.
b) Consensus: It is the dynamic way of proof-of-work, where the blocks are created and broadcast through mining by the nodes.
c) Data security and integrity: The accuracy of the nonce number is verified by executing the hash with the peer blocks to ensure data integrity as well as security.
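A toy proof-of-work loop illustrating item (b) and the nonce check in item (c): the miner searches for a nonce whose SHA-256 hash of the block contents has a given number of leading zeros. The difficulty of 4 is arbitrary and far below real networks.

```python
import hashlib

def mine(block_data: str, difficulty: int = 4):
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}|{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

nonce, digest = mine("farmer F-101 sells LOT-7 to customer C-9")
print(nonce, digest)
# Any peer can re-check the nonce with a single hash, which keeps the ledger
# tamper-evident: changing the transaction invalidates the stored hash.
```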

4 Experimentation and Results

First of all, as per the system description, from a technical perspective the Ethereum blockchain network needs to be configured and running; after that, the smart contracts need to be compiled and deployed or migrated over the configured Ethereum network (in the experimentation we have used an Ethereum test network). The Farmer then needs to enter the requested details, such as ID, Name, Address, Mobile No, Crop Name, Crop Quantity and Estimated Price. Figures 5, 6, 7, 8, 9 and 10 show the outputs of the above experimentation.

Fig. 5. Setting up the Ethereum test network and Farmer inputs.


After all the details are submitted over the network, the backend API will try to interact with the smart contract (the corresponding smart contract event will get triggered); if no error occurs in between, then the transaction will be successfully committed over the Ethereum blockchain network, as shown in Fig. 5.

Fig. 6. Output for transaction verification after Farmers input

After that, the approver will test the quality of the crop submitted by the farmer, which will be approved if the quality matches the input given by the farmer. All the corresponding events/functions created in the smart contract will be invoked and saved or logged in the blockchain, as shown in Fig. 6.

Fig. 7. Quality testing and approving the inputs given by Farmer.

After that, the approver needs to create a LOT for the crop approved by him/her, and this LOT will be available to the customer. The approver may update the price of the crop in the lot, as shown in Fig. 7.


Fig. 8. Creation of LOT

After that, the customer will search for a crop on the basis of the Farmer ID and Farmer LOT ID, with which all the details will be visible to him/her, as shown in Fig. 8.

Fig. 9. Customer getting the details of the LOT.

The Micro-Finance module then enables the approver to fund the farmer in case the farmer is at a loss. The approved amount which the approver wants to give as aid to the farmer is deducted from the overall budget, as shown in Fig. 9.


Fig. 10. Aiding micro-finance if needed to farmer.

The system will ensure that whatever transactions happen over the blockchain network are immutable and will provide transparency in the processes involved, because of which farmers and even customers can benefit from the system.

Fig. 11. Usability versus security in financing

The usability of conventional banking has increased while its security has decreased: the progression from the teller workstation to mobile banking comes with a lesser degree of security, as shown in Fig. 11. The aim of blockchain technology is to enhance the security performance to a higher degree, as speculated in Fig. 11.

5 Conclusion

Financial security issues in any transaction system, especially in banking, play a crucial role. The motivation of the study was to address a bit of these issues in terms of cryptographic techniques and tools by introducing Blockchain technology. The study


mentioned the checklist of financial security as well as different investment techniques to enhance financial security. The conclusion of this study is to explore the security measures in the field of mobile and internet banking. The model speculates that the centralized transaction system is not one hundred percent secure, because a trusted third party is invoked to control the whole system. So, to ensure full security, the decentralized transaction system, "Blockchain Technology", is the best choice, and it is the future of secure transaction systems.

Acknowledgments. Special thanks to the expert Prof. Manas Kumar Sanyal, Kalyani University, for his spontaneous advice and encouragement. Special gratitude to our management, JIS GROUP, JIS College of Engineering, Department of Information Technology, for providing all kinds of R&D facilities.

References

1. Blockchain - Double Spending. https://www.tutorialspoint.com/blockchain/blockchain_double_spending.htm. Accessed 08 Nov 2021
2. Marr, B.: A very brief history of blockchain technology everyone should read. Forbes (2018). https://www.forbes.com/sites/bernardmarr/2018/02/16/a-very-brief-history-of-blockchain-technology-everyone-should-read/. Accessed 08 Nov 2021
3. Lastovetska, A.: Blockchain Architecture Explained: How It Works & How to Build (2021). https://mlsdev.com/blog/156-how-to-build-your-own-blockchain-architecture. Accessed 08 Nov 2021
4. TE-FOOD: TE-FOOD - Farm-to-table food traceability solution. TE-FOOD (2020). https://www.te-food.com/. Accessed 04 Mar 2020
5. AgriLedger: Agriledger (2020). http://www.agriledger.io/. Accessed 04 Mar 2020
6. Supply Chain Software – Blockchain Platform BanQu (2020). https://banqu.co/. Accessed 04 Mar 2020
7. Kenya's Shamba Records uses blockchain, AI to improve farm processes. Disrupt Africa, 12 August 2019. https://disrupt-africa.com/2019/08/kenyas-shamba-records-uses-blockchain-ai-to-improve-farm-processes/. Accessed 04 Mar 2020
8. Alekh Sanghera: Block Chain in Agriculture (2018). https://www.yesbank.in/digital-banking/tech-for-change/agriculture/block-chain-in-agriculture-innovationinsights. Accessed 04 Mar 2020
9. Let's put smart to work: Let's put smart to work, 04 December 2019. https://www.ibm.com/thought-leadership/in-en/smart/index.html. Accessed 04 Mar 2020
10. Takyar, A.: Blockchain in Agriculture - Improving Agricultural Techniques. Software Development Company, 29 August 2018. https://www.leewayhertz.com/blockchain-in-agriculture/. Accessed 04 Mar 2020
11. Marr, B.: A very brief history of blockchain technology everyone should read. Forbes. https://www.forbes.com/sites/bernardmarr/2018/02/16/a-very-brief-history-of-blockchain-technology-everyone-should-read/. Accessed 14 Sept 2019
12. Popovski, L., Soussou, G., Webb, P.B.: A Brief History of Blockchain, p. 3 (2018)
13. Selvaganesh: How Node JS middleware Works? Medium, 11 June 2018. https://medium.com/@selvaganesh93/how-node-js-middleware-works-d8e02a936113. Accessed 05 Mar 2020
14. What is Bitcoin blockchain? A guide to the technology behind BTC. Cointelegraph (2021). https://cointelegraph.com/bitcoin-for-beginners/how-does-blockchain-work-a-beginners-guide-to-blockchain-technology. Accessed 08 Nov 2021

Adaptive Electronic Health Records Management and Secure Distribution Using Blockchain G. Jagadamba(B) , E. L. Sai Krishna, J. P. Amogh, B. B. Abhishek, and H. N. Manoj Siddaganga Institute of Technology, Tumakuru 572103, India [email protected]

Abstract. In this world, people meet many challenges in the healthcare system. There are circumstances in which an individual has to maintain a comprehensive health report from time to time, and it is not an easy job. A person may not be suffering from only one type of disease; he/she may be undergoing multiple problems at once, and it is challenging to maintain all the reports. There is another circumstance where a sufferer visits multiple hospitals or consulting doctors; in such a situation, it is tough for the patient to maintain the records, and the same procedures may be suggested repeatedly. In this situation, an approach to maintaining Electronic Health Records is proposed in this paper, which uses a Blockchain methodology to provide a highly secure way to store and share patient reports whenever they are needed. The proposed system makes it easy for patients to maintain multiple records in a single block, and the testing results lead a pathway to adoption in the current scenario.

Keywords: Blockchain · Healthcare system · Health records · Maintain · Electronic health records · Covid

1 Introduction

Worldwide, most citizens visit the hospital only when health problems are identified, and most of the time they do not get any update about their health status, as the entire process is manual. However, the unavailability of information about health status, or the inability to track the history of health problems or to have health check-ups, is a challenge, and such tracking is very much needed in the current COVID pandemic. To have timely treatment, patients need up-to-date health records. An individual's health records do not mean only the current status, but the health records over a period. Unfortunately, humans tend to miss or forget to keep records, or sometimes forget to take them to doctors while visiting, and repeat the same testing procedure most of the time. The collective records about the patient help doctors to diagnose the problem and minimize the burden of repeated procedures. At the same time, the doctors may not accurately diagnose the problem for various reasons when the records are presented manually. Continuous tracking, monitoring, accurate analysis and prediction are the substratum of providing good healthcare.


To provide the right care at the right time, maintaining reachable, precise and secure Electronic Health Records (EHR) is necessary [11, 15, 16]. An EHR is a digital health record of the testing reports, prescriptions, personal health-related data such as medications, past medical history, saturation levels, laboratory data of past years, personal details regarding health insurance, etc., of an individual patient, maintained and distributed digitally when necessary. Electronic health records are designed to maintain multi-institutional health records, and there is a chance that many patients leave their personal and medical data across various institutions, i.e., hospitals. Some hospitals may use or sell this information to a third party, and patients also face inconsistency if multiple hospitals are involved. They face security issues and only fluid access to information, because the issuer is no longer the affected person. Interoperability [1] challenges among issuers act as barriers to powerful information sharing; this loss of coordinated information control and exchange results in fragmented rather than cohesive health records, whereas patients would gain from a holistic picture of their health records. Hence, establishing trust is crucial, and the doubts of many patients about the security of their data can also be addressed. Apart from the above-discussed challenges, patients [4] are ever more active and enthusiastic about handling their personal information on the web in the age of social media and digitalization. Here, discovering how the blockchain structure could be applied to EHRs is an interesting area of research, and a "distributed ledger protocol" connected with Bitcoin helps the main requirement [2]. Blockchain [6] helps reinvent the way patients' medical data are shared and saved, by using different security mechanisms for medical data exchange in the medical business and considering peer-to-peer network security. The support and analysis of how distributed ledger technology and blockchain can transform the medical industry is a challenging task. This challenge can be attempted by setting the patients' health information in the health sector while considering security, scalability, availability, privacy and interoperability [7–10]. In addition, blockchain can offer a new model for medical records exchange by making EHRs more secure and efficient.

1.1 Challenges in the Implementation of Adaptive Electronic Health Record Maintenance and Distribution

Inside the e-health sector, a distributed and scalable system that permits the growth of medical data is required. For the quality of data, the provision of a secure and stable implementation should be guaranteed that targets the quality and correctness of the data while it shifts among the distinct organizations/agencies that own the information inside the vast ecosystem. Therefore, there is a need for a system that manages authorization, dynamic authentication and information access control for medical data. Keeping this in mind, a significant challenge is encrypted information retrieval and storage; the database implementation inside the system should provide data confidentiality, query assurance and secure storage. Blockchain can offer a new model for medical records exchange by making EHRs more secure and efficient. The public-key cryptography used in blockchain creates append-only, stable and time-stamped records in sequence.
However, copies of the blockchain are distributed across the network [2], so the security of both the data and the application must be ensured. This can be achieved because, unlike in most other technologies, no third party controls the system and every node carries equal responsibility for the network [2, 12] in a blockchain. Hence, a blockchain-based EHR system is proposed for easy and secure sharing.
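To make the append-only, tamper-evident behaviour described above concrete, the following minimal Python sketch (purely illustrative, not part of any cited system; the field names are hypothetical) chains record blocks by hash so that any later modification is detectable:

```python
import hashlib
import json
import time

def block_hash(block: dict) -> str:
    # Hash the block's canonical JSON form (sorted keys for determinism).
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def append_record(chain: list, record: dict) -> None:
    # Each new block stores the previous block's hash, a timestamp and the EHR record.
    prev = chain[-1]["hash"] if chain else "0" * 64
    block = {"index": len(chain), "timestamp": time.time(),
             "record": record, "prev_hash": prev}
    block["hash"] = block_hash({k: v for k, v in block.items() if k != "hash"})
    chain.append(block)

def verify_chain(chain: list) -> bool:
    # Recompute every hash and check the links; any edit breaks the chain.
    for i, block in enumerate(chain):
        body = {k: v for k, v in block.items() if k != "hash"}
        if block["hash"] != block_hash(body):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

chain: list = []
append_record(chain, {"patient": "P-001", "note": "blood report uploaded"})
append_record(chain, {"patient": "P-001", "note": "prescription updated"})
print(verify_chain(chain))            # True
chain[0]["record"]["note"] = "tampered"
print(verify_chain(chain))            # False: tampering is detected
```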

2 Literature Survey

Many systems have been created to manage Electronic Health Records (EHR) using blockchain and cloud-based solutions. The transformation of Indian healthcare is possible through blockchain-technology-driven EHRs. Technology-driven EHRs are placed at the centre of the healthcare ecosystem to increase the security, privacy, scalability and interoperability of electronic health data and to support good governance [20, 21]. Systems that manage EHRs through blockchain assume zero trust [17] and use a peer-to-peer architecture rather than a client-server architecture. Data and application security can be achieved because, unlike in other technologies, no third party controls the system. Using blockchain, a "trust-less" model is employed in which individual nodes solve computationally hard "puzzles" (hashing exercises) [2]. This secures the content against tampering and relies on the block of content that is added to the chain. The nodes doing this work are known as "miners", and the miners' append-only block strategy guarantees that it is hard to rewrite records in the blockchain. A contract structure is also proposed comprising a Patient-Provider Relationship Contract, a Summary Contract and a Register Contract [2]. That work also describes how a system node should behave and presents the back-end API library, which is constructed from multiple utilities bundled together to facilitate the system's operation. Thus, many hurdles of record management and of user interfaces that work directly with the blockchain can be avoided. The work also includes blockchain mining, which encourages "miners" in the network to contribute computational resources to achieve incremental and reliable progress on the chain. Interoperability on DLT (Distributed Ledger Technology) is an important aspect that can overcome security and privacy issues [1, 17]. The security parameter is achieved through immutability, because the data must be protected from unauthorized modification to be safe. This security is achieved by storing the hash value of each block: if a block's hash value no longer matches the one recorded previously, tampering with the data can be detected, and thereby security is preserved. Furthermore, even if a third party tries to track all the transactions of a particular patient, physical identity is not integrated into the system, so the third party can never track the patient in the real world. Creating safe and reliable care files, associating identities and enrolling patients are essential components of any EHR system. In this regard, one work [1] addressed interoperability issues to ensure the anonymity and isolation of each transaction and to overcome the lack of a single reliable source of personal information for medical services. This work listed blockchain use cases in the healthcare industry, including creating


healthcare patient profiles, improving audit trails, improving the delivery of healthcare IT applications and connecting traditional databases to a blockchain environment. A search methodology in [3] describes how the problems were identified and concentrates on the strategies and libraries used to collect data, the time allocation of the study and the quality assessment of the selected research. Work on the classification of electronic records has also addressed the openness of EHRs on the blockchain in healthcare [3]. An artificial neural network has also been used with blockchain technology in medical systems [5]. There, records based on personal medical records are created using blockchain algorithms to ensure that the PHR data of healthcare institutions is safely verified and that medical data is accurately checked against existing vulnerabilities. This work [5] also covered the imaging of medical data in the Electronic Medical Record (EMR), Order Communication System (OCS), Picture Archiving Communication System (PACS) and Personal Health Record (PHR). The different medical records were compared, and it was found that currently only doctors can use and verify the medical data. Blockchain and DLT promise a reliable, immutable record-keeping system for various use cases, including real estate and healthcare [5, 13]. An innovative blockchain-based system to maintain evidence-based archives is proposed in [6] for archiving, an ancient science designed to preserve authentic documents over long periods. Reliable archiving systems either provide efficient records or face threats to long-term accuracy, reliability and authenticity; the new technology should understand these weaknesses and adopt improvements that fill the gaps. A weak blockchain infrastructure raises the question of whether the blockchain really is more secure [4, 5]: because blockchain technology is still in the early stages of adoption, people doubt the security of the system and its risks. Hence, an effort is required to develop the technology with a more secure and proven infrastructure. Furthermore, the blockchain service of an artificial intelligence (AI) framework is expected to become compelling, efficient and commercially attractive, and needs to be redesigned to make it safer and more user-friendly. A cloud-based framework has also been studied [14], which focuses on enabling all healthcare specialists to contribute to the EHR content. The proposed model overcame problems with manual updates and patient registration, and was found to be well organized for governance. The work also states that test results are captured error-free and that patient data is shared at various healthcare system levels in a secure way. Preventive, curative and rehabilitative programs require good planning and management at various levels of the public health system, supported by various experiments; unfortunately, not much support is available for achieving national and international goals. In the same vein, several solutions [13, 16] improve on the limitations of current healthcare systems using blockchain technology. Many works include frameworks and tools such as Hyperledger Fabric, Composer, Docker containers, the Wireshark capture engine and Hyperledger Caliper. The proposed work improves data accessibility between healthcare providers through an Access Control Policy Algorithm.
The work is also supported by simulated environments used to implement the Hyperledger-based electronic healthcare record (EHR) sharing system built on the concept of chaincode. The work in [5] is limited to different aspects of EHRs and blockchain but did not include any applications. The review provided an updated taxonomy, challenges


and different issues associated with EHRs. These findings helped us concentrate on these issues while designing the proposed system. Based on the outcome of the literature survey, a system is proposed with the following features:

• Permissioned blockchain
• Ethereum
• Production of EHRs that focus on the present health issue
• Dynamic authentication methods

3 Proposed System

Fig. 1. General structure of blockchain

Fig. 1 shows the general structure adopted using blockchain and involves the procedures detailed below, numbered as in the figure for clarity:

1. Add Patient Record
2. Resolve Address and SC
3. Post new EHR, Link SC with EHR
4. Mining, Bounty
5. SC Updated Notification, Reject or Acknowledge
6. Update PPR status in SC
7. Signed Query Request, Query Result
8. Check Permissions on Current Chain
9. Update

Patient registration includes updating patient details such as name, age and weight before entering the Ethereum network using Ganache's private address. As there is no third party involved in the process, the doctor can access the patient's


record only after the patient grants access. The file uploaded by the patient is then encrypted with a secret key, and the encryption keys are derived using hash functions. Based on the literature survey, an application for the target audience (patients and clinicians) is built using blockchain and Ethereum. Blockchain technology is used to securely store documents as electronic records in the blockchain network. The immutability property prevents one document from being silently replaced by another and permanently records each document in the distributed database. The main benefit of this approach is adaptive security for document storage and access; the proposed system also reduces the data sharing time and cost for EHRs. In 2017, Ethereum released a new version with more functionality and tools, making it an attractive platform for private or semi-private blockchains. Such an Ethereum deployment suits a client's business better, as it allows more control over who sees what and who has access to what, compared with a public blockchain, where all information is available to everyone and additional cryptography is needed for protection. The purpose of this work is to ensure the safe storage and transmission of electronic medical records and other documents (such as protocols and scan reports) without external intermediaries. In daily life, this process requires a notary, who checks a legal contract at signing, signs it and records it in a register. In the blockchain world there is no need to pay for notarization services, which usually cost much money, because cryptography protects all signatures while the blockchain operates. In addition, review and maintenance of registration logs follow a step-by-step implementation. The maintenance takes much less time, and the implementation involves the following steps:

1. Smart contracts are designed and implemented for logging patient-provider relationships associated with health record inspection, permission granting and retrieval of data.
2. A cryptographic hash of the record is computed to ensure data integrity and protection against tampering.
3. "Off-chain" data exchange between a patient database and a provider database is done using a syncing algorithm.
4. The join and participate functions in the blockchain network are implemented using the Ethereum client component.
5. The Ethereum client then performs various tasks such as peer-to-peer network connection, encryption, sending transaction details and maintaining a verified local copy of the blockchain.
6. Finally, identity confirmation is done using public-key cryptography (one key is held by the generator system and the other by the patient).

A DNS-like mapping then maps an existing and widely accepted form of ID (e.g., name or social security number) to the person's blockchain address. The Ethereum public key is a point on an elliptic curve and is calculated using (1):

y^2 mod p = (x^3 + 7) mod p    (1)

where mod p indicates that this curve is only valid over the field of prime order p.


For generating a public key (K) from a private key (k), we use elliptic-curve arithmetic. The simplified form of the equation is as follows:

K = k ∗ G    (2)

where G is a predetermined point on the curve (called the generator point). The public key can be expressed in a more intuitive way using (3):

K = (k ∗ G1, k ∗ G2)    (3)

where G1 and G2 are the x and y coordinates, respectively. The proposed and developed work includes two key modules: document storage and smart exchange. In the document storage module, the repository holds all the documents with public review and private, limited read access, so we use GitHub. The files hosted on public GitHub are visible to everyone but still protected by encryption. By combining public storage and secure encrypted access, the designed system lets anyone fetch the file; however, without the encryption key, the file is just a bunch of bytes. For customers, using a private Interplanetary File System (IPFS) network instead of public GitHub as the repository is also allowed. IPFS is a distributed peer-to-peer file system with content addressing. Those interested can download files through the Internet, and like-minded participants share these files. In addition, participants can use the IPFS client (entry point) to upload files to the system over the network.
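As a companion to (1)–(3), the following sketch derives a public key K = k·G on the secp256k1 curve used by Ethereum with plain integer arithmetic. It is a didactic illustration only (production systems use audited libraries), and the private key shown is an arbitrary example value, not taken from the paper.

```python
# secp256k1 domain parameters (curve y^2 = x^3 + 7 over the prime field of order P).
P  = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

def point_add(p1, p2):
    # Add two curve points (None represents the point at infinity).
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1) * pow(2 * y1, -1, P) % P   # tangent slope for doubling
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P) % P      # chord slope for addition
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)

def scalar_mult(k, point):
    # Double-and-add: computes k * point.
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

k = 0x1E99423A4ED27608A15A2616A2B0E9E52CED330AC530EDCC32C8FFC6A526AEDD  # example private key
K = scalar_mult(k, (GX, GY))
x, y = K
assert (y * y) % P == (x ** 3 + 7) % P   # the public key satisfies equation (1)
print(hex(x), hex(y))
```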

4 Architecture

This work focuses on the secure distribution and management of electronic health records, which points toward either cloud-based or blockchain-based solutions. We have chosen blockchain as the underlying technology because it is a transparent, immutable, accountable, peer-to-peer network. Although much development has been carried out in the surgical, equipment and infrastructure arenas, comparatively little has been done in the healthcare industry regarding data management and the security of health records. In the present healthcare industry, security is not patient-centric; instead, it is provider-centric, which is itself a threat to security. Hence, blockchain technology can transform the healthcare industry in data management and the security of electronic health records. Blockchain relies on cryptography, which provides a higher level of security. Blockchain is immutable, no single entity has complete control over the data, every node/participant is equally responsible for the network, and the system is decentralized and democratic rather than centralized. Because of these features, we felt that blockchain is the right solution for this project. There are various varieties of blockchain, such as the Ethereum blockchain and the Hyperledger blockchain. The proposed adaptive security system is implemented using the Ethereum blockchain, which is selected because it makes it easy to build applications and enables developers to apply blockchain technology to a wide range of uses. The Truffle framework is used, which deploys the basic structure of the blockchain and helps in development. In the implementation, Metamask is used as


it is a wallet available on the Chrome web store that helps users store their Ethereum accounts and enhances security. For testing, Ganache is used, as it provides Ethereum accounts with dummy ethers to test the network. Finally, ReactJS, along with HTML, is used to implement the front end of the proposed work, and NodeJS is used for back-end development. The following steps are followed while implementing the proposed system (Fig. 2); a sketch of this flow is given after the list:

1. The doctor and the patient register by providing their name and age.
2. The patient uploads the file and gives a random nonce/seed value to encrypt the file.
3. The file is uploaded to IPFS, and the secret is stored on Ethereum.
4. The patient can grant access to particular doctors.
5. As soon as the patient grants a doctor access, that doctor can see the patient's public address on the doctor's home page.
6. The doctor can pick any hashed IPFS file of the patient and send a request to the host application to view the file.
7. The node application locates the IPFS file, retrieves the secret from the blockchain, decrypts the file and sends it to the doctor.
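The sketch below mirrors steps 2, 3 and 7 on the patient/node-app side. It assumes the third-party `cryptography` package, derives a Fernet key from the patient's seed via SHA-256, and stands in for the IPFS upload and the on-chain secret storage with plain function stubs; the function names and the key-derivation choice are illustrative, not taken from the paper.

```python
import base64
import hashlib
from cryptography.fernet import Fernet

def key_from_seed(seed: str) -> bytes:
    # Derive a 32-byte url-safe Fernet key from the patient's random nonce/seed (illustrative KDF).
    return base64.urlsafe_b64encode(hashlib.sha256(seed.encode("utf-8")).digest())

def encrypt_record(report: bytes, seed: str) -> bytes:
    # Encrypt the medical file before it leaves the patient's machine.
    return Fernet(key_from_seed(seed)).encrypt(report)

def upload_to_ipfs(ciphertext: bytes) -> str:
    # Stub: a real system would add the ciphertext to IPFS and get back a content identifier.
    return hashlib.sha256(ciphertext).hexdigest()

def store_secret_on_chain(file_hash: str, seed: str) -> None:
    # Stub: a real system would record the secret via a smart-contract transaction.
    print(f"store({file_hash[:16]}..., <secret>) sent to the contract")

def doctor_view(ciphertext: bytes, seed: str) -> bytes:
    # After the patient grants access, the node app decrypts and returns the file.
    return Fernet(key_from_seed(seed)).decrypt(ciphertext)

report = b"Blood report: haemoglobin 13.5 g/dL"
seed = "patient-chosen-random-seed"
ct = encrypt_record(report, seed)
cid = upload_to_ipfs(ct)
store_secret_on_chain(cid, seed)
assert doctor_view(ct, seed) == report
```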

Fig. 2. System architecture (entities: Patient, Doctor, Node App, IPFS, Blockchain; flows: register, grant access to doctor, upload file with secret key, encrypt file with secret, save file hash, return file, access patient)


5 Result Analysis

The proposed system is tested from different aspects, and the time required for the different tasks that the user performs through the application has been recorded and is presented here. The time required by some operations varies with the network/connectivity speed. The application is tested on different systems with different RAM sizes and operating systems using 10 accounts. The time required for each operation on each account is measured, and the average is presented as bar graphs in Fig. 3 and Fig. 4. Operations such as fetching files uploaded by the patient and accessing the patient's files depend on the network speed and bandwidth at that point in time. The system is tested on machines with different processors, operating systems and network speeds/bandwidths, and it works seamlessly in all these scenarios. Because the application uses blockchain, a peer-to-peer network, as its underlying technology, it is tested with various nodes/participants. The Ganache tool, which provides Ethereum accounts to test the network and application, is used for this purpose; since Ganache provides 10 Ethereum accounts by default, we tested with 10 Ganache accounts. The application worked seamlessly and delivered excellent performance.
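As a rough illustration of how the per-operation averages behind Fig. 3 and Fig. 4 can be collected, each operation is timed once per account and the durations are averaged. The operation and account names below are hypothetical placeholders, not the application's actual API.

```python
import time
from statistics import mean

def timed(operation, *args) -> float:
    # Measure one operation's wall-clock time in seconds.
    start = time.perf_counter()
    operation(*args)
    return time.perf_counter() - start

def average_times(operations: dict, accounts: list) -> dict:
    # Run every operation once per account and average the durations.
    return {name: mean(timed(op, acct) for acct in accounts)
            for name, op in operations.items()}

# Hypothetical stand-ins for the application's operations and the 10 Ganache accounts.
operations = {"register": lambda acct: time.sleep(0.01),
              "upload_file": lambda acct: time.sleep(0.02)}
accounts = [f"0xACCOUNT{i:02d}" for i in range(10)]
print(average_times(operations, accounts))
```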

Fig. 3. Performance based on Processor

Blockchain is the only foreseeable solution for accountability, auditability and preserving security without giving any single entity/organization special status. The application also relies on the Metamask extension, which is not embedded in the application but is associated with it in a secured way. If Metamask does not have access to the application, an exception is thrown saying "Failed to load web3 accounts or contract," as shown in Fig. 5. Further testing concerns data entry: if the data entered is invalid, the Ethereum smart contracts check the entered values and a Metamask error "Alert Transaction error" is shown, as can be seen in Fig. 6. In this scenario,


Fig. 4. Performance based on Operating System

Fig. 5. Exception saying “Failed to load web3.accounts or contract.”

if the user does not have sufficient ether for the transaction, a Metamask error "Alert Insufficient funds" is shown, as in Fig. 7. The application provides many benefits for both patient and doctor: it is secure because it is a blockchain-based solution, there is no need to carry physical medical files, and, as the blockchain is decentralized, there is no need to trust any single entity/organization. Public addresses are used for transactions, which keeps them safe, and the patient is at the centre of the health system, which improves security and privacy. The medical files are accessible to any hospital/healthcare system, and the application provides interoperability since patients and providers are the only stakeholders.


Fig. 6. Error saying “Alert Transaction Error”

Fig. 7. Error saying “Alert Insufficient Funds”

6 Conclusion

Blockchain can bring a transformation to the healthcare industry that has not yet happened in the data management and health-record-security arena. What is presented here is just the core functionality; it can be expanded with a broader vision by adding many other stakeholders while preserving the key factors of security and privacy. Blockchain helped provide a highly secure way to store and share patient EHRs whenever they are needed. It also addresses record-keeping challenges over control of sensitive health information, trust issues between different medical institutions, interoperability among the stakeholders and data access control. The usage of blockchain has reduced the data sharing time and cost. Security in this system is patient-centric rather than provider-centric. The proposed system is examined for software performance issues, but the usability testing is restricted to 10 members due to the unavailability of funding. In future work, with the required funding, more members can be enrolled and the feasibility of the system can be verified.

References

1. Brodersen, C., Kalis, B., Leong, C., Mitchell, E., Pupo, E., Truscott, A., Accenture LLP: Blockchain: securing a new health interoperability experience, pp. 1–11. Accenture LLP (2016)
2. Clarke, A., Steele, R.: Secure and reliable distributed health records: achieving query assurance across repositories of encrypted health data. In: 45th IEEE Hawaii International Conference on System Sciences, pp. 3021–3029 (2012)
3. Tang, F., Ma, S., Xiang, Y., Lin, C.: An efficient authentication scheme for blockchain-based electronic health records. IEEE Access 7, 41678–41689 (2019)
4. Ekblaw, A., Azaria, A., Halamka, J.D., Lippman, A.: A case study for blockchain in healthcare: MedRec prototype for electronic health records and medical research data. In: Proceedings of IEEE Open & Big Data Conference, vol. 13, p. 13 (2016)
5. Mayer, A.H., Da Costa, C.A., Da Rosa Righi, R.: Electronic health records in a blockchain: a systematic review. Health Inform. J. 26(2), 1273–1288 (2020)
6. Lemieux, V.L.: Blockchain and distributed ledgers as trusted recordkeeping systems. In: Future Technologies Conference (FTC), vol. 2017 (2017)
7. Tanwar, S., Parekh, K., Evans, R.: Blockchain-based electronic healthcare record system for healthcare 4.0 applications. J. Inf. Secur. Appl. 50, 102407 (2020)
8. Tith, D., Lee, J.-S., Hiroyuki Suzuki, W.M.A.B., Wijesundara, N.T., Obi, T., Ohyama, N.: Application of blockchain to maintaining patient records in electronic health record for enhanced privacy, scalability, and availability. Healthc. Inform. Res. 26(1), 3–12 (2020)
9. Hussein, A.F., ALZubaidi, A.K., Habash, Q.A., Jaber, M.M.: An adaptive biomedical data managing scheme based on the blockchain technique. Appl. Sci. 9(12), 2494 (2019)
10. Duranti, L., Endicott-Popovsky, B.: Digital records forensics: a new science and academic program for forensic readiness. In: Proceedings of the Conference on Digital Forensics, Security and Law (Association of Digital Forensics, Security and Law), vol. 5, no. 2, p. 4 (2010)
11. Scherer, M.: Lightweight Blockchain for HealthCare. Ph.D. dissertation, Umeå University, Umeå, Sweden (2017)
12. Wu, Z., Liang, Y., Kang, J., Yu, R., He, Z.: Secure data storage and sharing system based on consortium blockchain in smart grid. J. Comput. Appl. 37(10), 2742–2747 (2017)
13. Sudeep, T., Parekh, K., Evans, R.: Blockchain-based electronic healthcare record system for healthcare 4.0 applications. J. Inf. Secur. Appl. 50, 102407 (2020)
14. Pai, M.M.M., Ganiga, R., Pai, R.M., Sinha, R.K.: Standard electronic health record (EHR) framework for Indian healthcare system. Health Serv. Outcomes Res. Methodol. 21(3), 339–362 (2021). https://doi.org/10.1007/s10742-020-00238-0
15. Jagadamba, G., Aditya, T.R., Pai, S.S., Manjunath, P., Bhat, K.U.: Real time patient activity monitoring and alert system. In: International Conference on Electronics and Sustainable Communication Systems (ICESCS), 2–4 July, pp. 708–712 (2020)


16. Jagadamba, G., Sharath Kumar, S.: Adaptive context-aware access control for EPR resource in healthcare system. In: 6th International Conference on Advances in Computing, Communications and Informatics. IEEE, Manipal, September 2017 (2017)
17. Jagadamba, G., Babu, B.S.: A service oriented adaptive trust evaluation model for ubiquitous computing environment. Int. J. Ad Hoc Ubiquit. Comput. 29(4), 255–269 (2018)

Non-content Message Masking Model for Healthcare Data in Edge-IoT Ecosystem Using Blockchain

Partha Pratim Ray(B)
Department of Computer Applications, Sikkim University, Gangtok, India
[email protected]

Abstract. Blockchain technology deals with cryptographic, decentralized storage and sharing of database information among peer-to-peer (P2P) nodes in an immutable manner. Recent growth in the decentralized edge-Internet of Things (IoT) ecosystem has raised the issue of man-in-the-middle attacks due to the inherent vulnerability of the resource-constrained system pool involved. Thus, extreme privacy-aware support is required in such a context for hiding non-content transactions in the absence of a central monitoring authority. This paper proposes a novel decentralized non-content message masking system model covering a P2P edge-IoT decentralized framework (DTC). We also characterize the proof of work, the non-content message masking scheme, the masked broadcasting feature and plausible-deniability pacifying approaches.

Keywords: Blockchain · Non-content message masking · Healthcare · Edge computing · IoT

1 Introduction

A blockchain system is a consensus-based, secure, cryptographic, decentralized public or private database that stores information immutably over a peer-to-peer network. It improves on traditional technologies through (i) trustworthiness, (ii) disintermediation, (iii) immutability, (iv) confidentiality, (v) robustness, (vi) availability, (vii) transparency, (viii) verifiability and (ix) auditability [1–3]. The current blockchain ecosystem comprises dependent projects, users, exchanges, miners, developers and related applications that facilitate frictionless and self-regulated financial transactions in the form of cryptocurrencies over a hash-chain data structure [4,5]. Every peer of the underlying decentralized blockchain P2P network performs message transmission, reception and cryptographic puzzle solving, besides storing public/private key pairs, messages and transaction histories according to its capability. Although a blockchain system provides an industry-grade secure network infrastructure, it lacks non-content message masking (e.g., hiding the sender and receiver of messages). This capability is much needed in the current socio-economic scenario, where public agencies are reportedly trying to maintain the “call-detail


records” for eavesdropping on citizens’ privacy in order to understand and control their behaviour and actions [6]. Hiding a peer’s identity is a difficult task, although “throw-away” network addresses, Pretty Good Privacy (PGP), GNU Privacy Guard (GPG) and even Tor have been used in recent times to encode messages [7]. Encryption techniques alone, however, cannot mask the non-content segment of a message, i.e., the identities of the sender and receiver peers of the blockchain. Such techniques mostly depend on HTTPS and X.509 to protect the peer’s identity, along with CA certificates served by trusted certificate authorities in connection with popular web browsers such as Firefox, Chrome and Internet Explorer [8]. A recent incident in Iran showed that a fraudulent but cryptographically valid CA certificate could be signed with the secret key of a hacked CA and misused to perform man-in-the-middle attacks against citizens who were continuously served by Google-based messaging applications. If any autonomous public agency is tied to such a CA, it becomes very easy for it to peek into a citizen’s privacy, provided certain hardware is installed at its site. The situation becomes even more vulnerable when edge-IoT supported machine-to-machine (M2M) communication is heavily used across a resource-constrained device pool. Furthermore, the communication mechanism is not standardized and is prone to assault by external intruders. Thus, an urgent need has emerged for a novel communication protocol that counters these threats. Such a protocol must be accompanied by non-content message masking and encryption facilities to guarantee a spoof-free, decentralized, trustless scheme with minimal key-management overhead in an edge-IoT supported blockchain ecosystem. We characterize essential features such as (i) a novel system model, (ii) a proof-of-work scheme, (iii) non-content-message-masking-based message transfer, (iv) masked-message broadcast, (v) address generation and (vi) a plausible-deniability resistive service for the sending peer. These characteristics show how the presented non-content message masking could raise existing eavesdropping-proof systems to a novel privacy-aware level, especially from an edge-IoT based M2M communication perspective. In this paper, we devise a novel non-content health data masking technique in the IoT-edge scenario to deliver better security and privacy to the sender and receiver of health data over the IoT edge.

2 System Model and Methodology

We propose a system model that leverages the internal communication, interaction and trustless features of the underlying P2P blockchain in the edge-IoT scenario, i.e., DTC. Figure 1 presents the non-content message masking approach, in which a Proof-of-Work-assisted mathematical puzzle and encryption are used to hide the message data along with the sender and receiver peer identities.


Fig. 1. Proposed system model for non-content P2P health message masking at the Edge-IoT ecosystem.

2.1 Authentication Mechanism

For any available peer node belonging to the edge-IoT based decentralized P2P blockchain network, DTC requires a stringent authentication mechanism to make the system spoof-free [8]. For this purpose, an efficient address-sharing technique is prescribed in (1) and (2), where the public key of the sender peer (Ui) is hashed by SHA512() to obtain its network address, and likewise for the receiver peer (Uk):

Address(Ui) ← SHA512(Pubkey(Ui))    (1)

Address(Uk) ← SHA512(Pubkey(Uk))    (2)
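A minimal sketch of the address derivation in (1)–(2), assuming public keys are available as raw bytes (the hex strings below are made-up placeholders, not real keys):

```python
import hashlib

def peer_address(public_key: bytes) -> str:
    # Per (1)-(2): a peer's network address is the SHA-512 digest of its public key.
    return hashlib.sha512(public_key).hexdigest()

sender_pubkey = bytes.fromhex("04a1b2c3d4")      # placeholder bytes
receiver_pubkey = bytes.fromhex("04e5f60718")    # placeholder bytes
print("Address(Ui):", peer_address(sender_pubkey)[:32], "...")
print("Address(Uk):", peer_address(receiver_pubkey)[:32], "...")
```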

Algorithm 1: Proof of Work for Each Sent Message

1. Nonce ← MSB_0–63[0]
2. InitHash ← SHA512(DTC_PLoad)
3. EVal ← 20 ∗ 99
4. while EVal > T_PoW do
5.   Nonce ← Nonce + 1
6.   THashVal ← SHA512(SHA512(Nonce || InitHash))
7.   EVal ← INT(MSB_0–63[THashVal])
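A direct transcription of Algorithm 1 into Python using hashlib's SHA-512. The target value T_PoW and the payload bytes are illustrative choices, and the 64 most significant bits of a digest are taken as its first 8 bytes:

```python
import hashlib

def proof_of_work(payload: bytes, target: int) -> int:
    # Algorithm 1: increment the nonce until the leading 64 bits of
    # SHA512(SHA512(nonce || InitHash)) fall at or below the target.
    init_hash = hashlib.sha512(payload).digest()
    nonce = 0
    eval_value = target + 1                    # force at least one iteration
    while eval_value > target:
        nonce += 1
        digest = hashlib.sha512(
            hashlib.sha512(nonce.to_bytes(8, "big") + init_hash).digest()
        ).digest()
        eval_value = int.from_bytes(digest[:8], "big")   # MSB 0-63 as an integer
    return nonce

payload = b"DTC payload: masked health message"
target = 2 ** 52                               # illustrative difficulty target
print("nonce found:", proof_of_work(payload, target))
```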

The message encoding scheme is illustrated via (3)–(5). Initially, the message payload, i.e., DTC_PLoad, is created by combining the embedded message time, the encoded message version, the stream number and the encrypted value of the message payload length. A 64-bit nonce is appended to it to make the message payload uniquely usable. DTC_MSGPri is used as a preliminary value while


combining the actual address number, the stream number, DTC_PLoad and a checksum. The final encoded form of DTC_MSGPri is computed with reference to Base58. It is prepended with “DC” as a constant identifying string, which is carried by all messages, i.e., DTC_MSG, flowing through the proposed DTC system model.
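A sketch of this final encoding step, assuming the Bitcoin-style Base58 alphabet and a 4-byte SHA-512 checksum; the checksum length and the exact payload fields shown here are assumptions, since the excerpt does not fix them:

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_encode(data: bytes) -> str:
    # Standard Base58 conversion of a big-endian byte string.
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    # Preserve leading zero bytes as '1' characters.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded

def encode_dtc_message(address_version: int, stream: int, payload: bytes) -> str:
    # DTC_MSGPri = version || stream || payload || checksum, then Base58 with a "DC" prefix.
    body = bytes([address_version, stream]) + payload
    checksum = hashlib.sha512(body).digest()[:4]      # assumed 4-byte checksum
    return "DC" + base58_encode(body + checksum)

print(encode_dtc_message(1, 1, b"masked non-content message"))
```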

Algorithm 2: Best Effort Message Transfer

2. if (T_TLoad − ET_k)