Fog Data Analytics for IoT Applications: Next Generation Process Model with State of the Art Technologies [1st ed.] 9789811560439, 9789811560446

This book discusses the unique nature and complexity of fog data analytics (FDA) and develops a comprehensive taxonomy a


English Pages XV, 497 [501] Year 2020


Table of contents :
Front Matter ....Pages i-xv
Front Matter ....Pages 1-1
Introduction (Aparna Kumari, Rajesh Gupta, Sudeep Tanwar)....Pages 3-17
Introduction to Fog Data Analytics for IoT Applications (Puneet Kansal, Dilip Sharma, Manoj Kumar)....Pages 19-38
Fog Data Analytics: Systematic Computational Classification and Procedural Paradigm (D. Pradeep Kumar, R. Hanumantharaju, B. J. Sowmya, K. N. Shreenath, K. G. Srinivasa)....Pages 39-58
Fog Computing: Building a Road to IoT with Fog Analytics (Avinash Kaur, Parminder Singh, Anand Nayyar)....Pages 59-78
Data Collection in Fog Data Analytics (S. R. Mani Sekhar, Snehil Tewari, Haaris Rahman, G. M. Siddesh)....Pages 79-104
Front Matter ....Pages 105-105
Mobile FOG Architecture Assisted Continuous Acquisition of Fetal ECG Data for Efficient Prediction (Anupam Bhardwaj, Pooja Khanna, Sachin Kumar)....Pages 107-122
Proposed Framework for Fog Computing to Improve Quality-of-Service in IoT Applications (Rakhi Akhare, Monika Mangla, Sanjivani Deokar, Vaishali Wadhwa)....Pages 123-143
Fog Data Based Statistical Analysis to Check Effects of Yajna and Mantra Science: Next Generation Health Practices (Rohit Rastogi, Mamta Saxena, D. K. Chaturvedi, Santosh Satya, Navneet Arora, Mayank Gupta et al.)....Pages 145-172
Front Matter ....Pages 173-173
Process Model for Fog Data Analytics for IoT Applications (Anjali Modi, Shreena Jani, Karansingh Chauhan, Jitendra Bhatia)....Pages 175-198
Medical Analytics Based on Artificial Neural Networks Using Cognitive Internet of Things (Himani Bedekar, Gahangir Hossain, Ayush Goyal)....Pages 199-262
Application of IoT-Based Smart Devices in Health Care Using Fog Computing (Satyasundara Mahapatra, Anupam Singh)....Pages 263-278
Data Reduction Techniques in Fog Data Analytics for IoT Applications (Srinidhi Hiriyannaiah, Zaifa Khan, Aniket Singh, G. M. Siddesh, K. G. Srinivasa)....Pages 279-309
Front Matter ....Pages 311-311
Background and Research Challenges for Fog Data Analytics and IoT (Ansh Riyal, Geetansh Kumar, Deepak Kumar Sharma)....Pages 313-340
Behavior-Based Approach for Fog Data Analytics: An Approach Toward Security and Privacy (Urvashi, Lalit K. Awasthi, Geeta Sikka)....Pages 341-354
Data Security and Privacy Functions in Fog Data Analytics (Apoorva Bhagat, Srishty Mittal, Uzma Faiz, Deepak Kumar Sharma)....Pages 355-385
Data Security and Privacy Functions in Fog Computing for Healthcare 4.0 (Darpan Anand, Vineeta Khemchandani)....Pages 387-420
Front Matter ....Pages 421-421
Fog Data Analytics and Healthcare 4.0 (Madhurima Hooda, Shashwat Pathak, Shreyans Pathak)....Pages 423-443
Fog Data Processing and Analytics for Health Care-Based IoT Applications (Tarjni Vyas, Shivani Desai, Anand Ruparelia)....Pages 445-469
The Importance of Fog Computing for Healthcare 4.0-Based IoT Solutions (U. Hariharan, K. Rajkumar)....Pages 471-494
Conclusion (Rajesh Gupta, Aparna Kumari, Sudeep Tanwar)....Pages 495-497


Studies in Big Data 76

Sudeep Tanwar   Editor

Fog Data Analytics for IoT Applications Next Generation Process Model with State of the Art Technologies

Studies in Big Data Volume 76

Series Editor Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland

The series “Studies in Big Data” (SBD) publishes new developments and advances in the various areas of Big Data, quickly and with high quality. The intent is to cover the theory, research, development, and applications of Big Data, as embedded in the fields of engineering, computer science, physics, economics and life sciences. The books of the series refer to the analysis and understanding of large, complex, and/or distributed data sets generated from recent digital sources coming from sensors or other physical instruments as well as simulations, crowd sourcing, social networks or other internet transactions, such as emails, video click streams and others. The series contains monographs, lecture notes and edited volumes in Big Data spanning the areas of computational intelligence including neural networks, evolutionary computation, soft computing, fuzzy systems, as well as artificial intelligence, data mining, modern statistics and operations research, as well as self-organizing systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output.

Indexing: The books of this series are submitted to ISI Web of Science, DBLP, Ulrichs, MathSciNet, Current Mathematical Publications, Mathematical Reviews, Zentralblatt Math: MetaPress and Springerlink.

More information about this series at http://www.springer.com/series/11970


Editor Sudeep Tanwar Department of Computer Science and Engineering Institute of Technology, Nirma University Ahmedabad, Gujarat, India

ISSN 2197-6503 ISSN 2197-6511 (electronic) Studies in Big Data ISBN 978-981-15-6043-9 ISBN 978-981-15-6044-6 (eBook) https://doi.org/10.1007/978-981-15-6044-6 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

Through the exponential growth of sensors and smart gadgets (collectively referred to as smart devices or Internet of Things (IoT)-enabled devices), a significant amount of heterogeneous and multi-modal data, termed Big Data (BD), is being generated. To deal with such BD, we require efficient and effective solutions, such as data mining, analytics, and reduction, performed on fog devices at the edge of a cloud environment. Existing research and development efforts generally focus on performing BD analytics, overlooking the difficulty of facilitating fog data analytics (FDA). This book discusses the unique nature and complexity of FDA and develops a comprehensive taxonomy (divided into different chapters) abstracted into a process model. The proposed model addresses various research challenges, such as accessibility, scalability, fog node communication, nodal collaboration, heterogeneity, reliability, and quality of service (QoS) requirements. To demonstrate the proposed process model, we have included some case studies. The main feature of this book is that it considers all aspects required to manage the complexity of FDA for IoT applications. The book focuses on FDA in IoT and the requirements related to Industry 4.0. It provides a comprehensive taxonomy for FDA abstracted into a novel process model reflecting FDA over IoT. This taxonomy helps readers understand the sources and features of FDA. The main benefits to the readers are as follows:

• It contains case studies to demonstrate the process model, which make the readers aware of future challenges associated with the FDA, especially for IoT applications.
• It includes the layered architecture of FDA and also compares the life cycles of big data and FDA.

The book is organized into five sections. The first section focuses on the introduction and background of the FDA for IoT applications and includes five chapters.
The second section discusses the emerging technologies and architecture of the FDA in three chapters. The third section illustrates the role of IoT applications in the FDA in four well-structured chapters. The fourth section highlights security issues, research challenges, and opportunities in four chapters. Finally, the last section focuses on FDA applications in Healthcare 4.0 in four chapters.

Part: Introduction and Background of FDA

The chapter “Introduction” presents an overview of the FDA. Its major aim is to provide a bird’s-eye view of the cloud computing and fog computing technologies used for the FDA. The chapter also gives a comparative analysis of the existing data analytics techniques used in the FDA and discusses different researchers’ views about the FDA in detail. It further discusses the importance of the FDA in several IoT-based applications, such as health care and smart grid, and highlights future research challenges for researchers and readers working in the same field.

The chapter “Introduction to Fog Data Analytics for IoT Applications” highlights the role of fog computing in improving the IoT/Cloud paradigm. It introduces the origin of fog computing, its role in IoT applications, the reasons to use it, and its architecture. It then explains how fog computing works, its characteristics, how it compares with cloud computing, the difference between fog and edge computing, and the advantages and limitations of fog computing along with its applications. The chapter provides insights into possible research directions together with an overall concept of fog computing in the context of IoT applications.

The chapter “Fog Data Analytics: Systematic Computational Classification and Procedural Paradigm” discusses the characteristic attributes of FDA and analyzes its algorithmic complexities. A systematic computational classification is generated to develop a paradigmatic procedure for FDA to process, store, and analyze data efficiently and effectively. The chapter also develops an ideal platform for the proliferating sensor-based devices and services in IoT applications. At the end, a few case studies that benefit from the proposed model are discussed, and the study concludes with the future scope for researchers.

The chapter “Fog Computing: Building a Road to IoT with Fog Analytics” discusses the relevance of the FDA in the area of IoT along with its issues and challenges. The chapter mainly focuses on the basics of the FDA. It then explains the different characteristics of cloud and fog computing platforms and introduces a detailed architecture of both platforms with a comparative analysis. On the fog server, the FDA tool performs data localization. The methods of application management, such as the resource coordination technique, distributed application deployment, and the distributed data flow method, are discussed. Further, research directions on applying deep learning to Big Data are discussed in detail, with the aim of improving the formulation of data abstractions and dimensionality reduction, along with possible solutions.

The chapter “Data Collection in Fog Data Analytics” highlights the collection techniques and management of data. It discusses how data differs across scenarios and the various methods of data collection in the FDA, such as node-based segregation, which reduces both the number of fog nodes that must be set up and the overloading of these nodes. The study explores techniques by which raw and passive forms of data can be made to evolve and become meaningful at a reduced size, examines how Bluetooth low-energy technology can be used to process collected data through gateways, and considers the use of data collectors with wireless low-powered sensing systems. Finally, the chapter discusses various FDA-related case studies on Moving Vehicles, Industrial Automation, Underwater Data Collection, Water Conservation in Agriculture, Indoor Air Quality Monitoring, Health Monitoring System, Telehealth Big Data, and Healthcare 4.0.

Part: Emerging Technologies and Architecture for FDA

The chapter “Mobile FOG Architecture Assisted Continuous Acquisition of Fetal ECG Data for Efficient Prediction” presents an overview of FDA and discusses an architecture for continuous monitoring of the Fetal Electrocardiogram (fECG) from the maternal ECG, to help avoid acute conditions in the newborn child at the time of birth. The continuous acquisition of fECG produces a very large amount of data to send to the cloud for further examination by the doctor; this data has to be preprocessed before it is stored in the cloud so that it can be evaluated faster and more efficiently. The proposed architecture has the potential to extend and virtualize new and efficient healthcare processes for fetal health monitoring, along with the Healthcare 4.0 environment and mobile fog computing.

The chapter “Proposed Framework for Fog Computing to Improve Quality-of-Service in IoT Applications” presents a framework that aims to improve quality of service (QoS) by providing reduced latency and load balancing at the fog layer. This improvement in QoS is achieved with the help of data aggregation and load balancing. In the proposed framework, an overburdened fog node requests its neighboring node to share its load. Additionally, the chapter suggests implementing various techniques to aggregate data ahead of transmission. As a result, the framework improves QoS and outperforms the existing approaches by preventing bottlenecks in the fog network.

The chapter “Fog Data Based Statistical Analysis to Check Effects of Yajna and Mantra Science: Next Generation Health Practices” provides a case study of fog data-based statistical analysis and examines the effect of Yajna and Mantra. For this study, Havan Samagri is specially designed for asthmatic patients with a 3:1 ratio. Surya Gayatri Mantra chanting is performed 24 times, and Nadi Shodhan Pranayam is practiced for half an hour daily. Subjects (users) take kwath of Hawan Samagri twice a day. A Lung Function Test (LFT) is used to check the efficacy of Yagyopathy with three parameters: FVC (forced vital capacity, which measures the capacity of the lungs), FEV (forced expiratory volume), and FEV1/FVC, known as MER (measured expiratory rate, which is directly related to the proportion of lung capacity exhaled per second). The study demonstrates a significantly improved performance in lung function using FDA.
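The load-sharing idea summarized above for the QoS framework chapter (an overburdened fog node asks a neighboring node to take over part of its load) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `FogNode` class, the notion of capacity as a task count, and the "least-loaded neighbor first" rule are hypothetical and are not taken from the chapter itself.

```python
from dataclasses import dataclass, field


@dataclass
class FogNode:
    """Hypothetical fog node whose capacity is a maximum number of queued tasks."""
    name: str
    capacity: int
    tasks: list = field(default_factory=list)
    neighbors: list = field(default_factory=list)

    def is_overloaded(self) -> bool:
        return len(self.tasks) > self.capacity

    def spare_capacity(self) -> int:
        return max(0, self.capacity - len(self.tasks))

    def offload(self) -> int:
        """Move excess tasks to neighbors that have room; return how many moved."""
        moved = 0
        # Assumed policy: ask the neighbor with the fewest queued tasks first.
        for neighbor in sorted(self.neighbors, key=lambda n: len(n.tasks)):
            while self.is_overloaded() and neighbor.spare_capacity() > 0:
                neighbor.tasks.append(self.tasks.pop())
                moved += 1
        return moved


# Usage: node "a" holds five tasks but can only handle two.
a = FogNode("a", capacity=2, tasks=[1, 2, 3, 4, 5])
b = FogNode("b", capacity=3, tasks=[9])
c = FogNode("c", capacity=2)
a.neighbors = [b, c]
moved = a.offload()
```

A real framework would also weigh link latency and node heterogeneity when choosing a neighbor; this sketch only shows the basic request-and-share step.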

Part: Role of IoT in FDA

The chapter “Process Model for Fog Data Analytics for IoT Applications” presents a process model that describes the process flow of data analytics using fog computing and the various modules used in the fog computing architecture. It discusses the use case of the FDA for IoT-based healthcare applications and concludes with a case study, which highlights the current challenges of fog computing for a better adoption of the technology.

In the chapter “Medical Analytics Based on Artificial Neural Networks Using Cognitive Internet of Things”, a Cognitive Radio network is simulated for optimization of spectrum sensing and energy detection. Moreover, two effective classification methods are evaluated on remotely measured physiological parameters, such as blood pressure and heart rate, of patients with two types of diseases: chronic kidney disease and heart disease. Using the proposed framework, the patients’ measured blood pressure values can be used by doctors and hospitals to remotely predict the heart rate for heart disease patients and blood glucose (sugar) for chronic kidney patients. This type of remote patient monitoring with machine-learning-based disease state prediction can be beneficial for determining a patient’s disease remotely using real-time bio-signal measurements.

The chapter “Application of IoT-Based Smart Devices in Health Care Using Fog Computing” explores the fields of fog computing, cloud computing, and IoT applications and the importance of quality of service in the healthcare industry. It presents the working environment and an integrated architecture of fog computing with IoT. The authors highlight the importance of fog computing with IoT in the healthcare sector with the help of various services and applications. Finally, a case study discusses various issues and challenges in the adoption of IoT-based devices in health care.

The chapter “Data Reduction Techniques in Fog Data Analytics for IoT Applications” details the fundamental issues related to the FDA for IoT applications. It then explores the various data reduction techniques used with fog computing for IoT applications. These techniques include Missing Values Ratio, Low Variance Filter, High Correlation Filter, Principal Component Analysis (PCA), Random Forest/Ensemble Trees, Backward Feature Elimination, and Forward Feature Construction. Further, a case study with the PCA method for FDA is discussed at the end of the chapter, which demonstrates the effectiveness of data reduction methods in fog computing (FC).
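Two of the data reduction techniques listed for this chapter, Missing Values Ratio and Low Variance Filter, are simple enough to sketch with the Python standard library. The function names, the table layout (column name mapped to a list of values, with `None` for a missing entry), and both thresholds are assumptions chosen for illustration; they are not values or code from the book.

```python
from statistics import pvariance


def missing_ratio(column):
    """Fraction of entries in a column that are missing (None)."""
    return sum(v is None for v in column) / len(column)


def reduce_columns(table, max_missing=0.5, min_variance=1e-3):
    """Keep only the columns that pass both filters.

    `table` maps a column name to a list of numbers or None.
    The thresholds are illustrative defaults, not values from the book.
    """
    kept = {}
    for name, column in table.items():
        if missing_ratio(column) > max_missing:
            continue  # Missing Values Ratio: too sparse to be useful
        present = [v for v in column if v is not None]
        if len(present) < 2 or pvariance(present) < min_variance:
            continue  # Low Variance Filter: nearly constant column
        kept[name] = column
    return kept


# Usage: only the informative "temp" column survives.
sensor_table = {
    "temp": [20.0, 21.5, 19.0, 22.0],
    "const": [1.0, 1.0, 1.0, 1.0],
    "sparse": [None, None, None, 3.0],
}
reduced = reduce_columns(sensor_table)
```

Both filters are cheap per column, which is plausibly why they suit resource-constrained fog nodes; PCA and the tree-based methods named above need considerably more computation.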

Part: Security Issues, Research Challenges, and Opportunities

The chapter “Background and Research Challenges for Fog Data Analytics and IoT” introduces FDA and describes the challenges and issues related to it. The study explains the advantages and disadvantages of existing systems, for instance, cloud computing, which motivates incorporating fog computing and its methods to solve cloud computing issues. It then describes the various research works carried out in FDA to overcome the problems of resource discovery, sharing, path improvement, and self-organization. To develop rapidly, FDA depends on other technologies for storage, communication, and processing, such as 4G and 5G communication networks. It therefore still requires research work and attention to reach its full potential.

The chapter “Behavior-Based Approach for Fog Data Analytics: An Approach Toward Security and Privacy” discusses advanced security mechanisms for FDA. The chapter aims to ensure security and privacy in the fog architecture by reducing the error rate of the proposed security strategy. Many biometric-based techniques have been developed using the face, palm print, fingers, eyelids, and so on, but very few works are available in the field of typing behavior characteristics. To address this gap, a security strategy for fog computing is designed by analyzing a user’s typing behavior pattern. For this study, nine behavior parameters are deployed and the error rate is evaluated at each step. Results show that the Crossover Error Rate (CER) reduces to 2% at the final stage using the proposed strategy. The proposed scheme is validated by a simulator designed for registering new users and identifying their requests to access services.

The chapter “Data Security and Privacy Functions in Fog Data Analytics” highlights the reasons for the susceptibility of fog nodes. The security systems and concerns in cloud and fog computing are compared in the chapter. The chapter also examines the various types of attacks to which the fog network is vulnerable, such as man-in-the-middle attacks, authentication threats, distributed denial of service, and others. Finally, the chapter investigates the different methods to handle these types of attacks. The first approach is the prevention of security attacks, which includes techniques like identity authentication, access control, and cryptographic schemes. The second approach is the detection of attacks, with methods like intrusion detection, data integrity checks, and network traffic analysis. The last approach covered is recovery from attacks, which covers recovery schemes. Thus, the chapter intends to provide a multi-faceted understanding of data security and privacy in the FDA.
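The Crossover Error Rate mentioned for the behavior-based chapter is, in general biometric usage, the error rate at the decision threshold where the false acceptance rate (FAR) equals the false rejection rate (FRR). The sketch below shows one simple way to estimate it from match scores. The function names, the "accept when score >= threshold" convention, and the sample scores are hypothetical; the chapter's own evaluation procedure is not described in this preface.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def crossover_error_rate(genuine, impostor):
    """Scan observed scores as thresholds; return (threshold, rate) where FAR and FRR are closest."""
    def gap(threshold):
        far, frr = far_frr(genuine, impostor, threshold)
        return abs(far - frr)

    candidates = sorted(set(genuine) | set(impostor))
    best = min(candidates, key=gap)
    far, frr = far_frr(genuine, impostor, best)
    # With finite samples FAR and FRR may not match exactly, so report their mean.
    return best, (far + frr) / 2


# Usage with made-up similarity scores (higher means more like the enrolled user).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
threshold, cer = crossover_error_rate(genuine_scores, impostor_scores)
```

A lower crossover rate means the genuine and impostor score distributions overlap less, which is the sense in which a reduced CER indicates a better authentication scheme.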


The chapter “Data Security and Privacy Functions in Fog Computing for Healthcare 4.0” highlights data security and privacy issues of fog computing in Healthcare industry 4.0. Then, it discusses various services provided by fog computing such as encryption, data availability, and traffic analysis using IoT. It also emphasizes relevant cloud/fog security design principles, software requirements for FDA, and various risk issues, for example, confidentiality, privacy, and compliance risks. This chapter is important to understand the concepts of Data Security and Privacy Functions in Fog Computing for Healthcare 4.0.

Part: FDA Applications in Health Care

The chapter “Fog Data Analytics and Healthcare 4.0” presents a deep insight into cloud computing, fog computing, and the implications of fog computing across domains with the imminent prevalence of IoT devices. So far, the benefits of cloud computing have mostly been reaped by large and small technology firms that provide services like data and file storage and website hosting. With the advent of IoT devices and fog computing, however, the doors are open for a wide variety of disciplines such as smart cities and health care. The chapter discusses the benefits of fog computing over cloud computing and its applicability in Healthcare 4.0 by presenting a case study. Finally, it concludes with future research challenges of IoT and fog computing.

The chapter “Fog Data Processing and Analytics for Health Care-Based IoT Applications” discusses healthcare-based IoT applications, cloud computing, fog data processing, and FDA, along with their integration and importance. A literature survey of works involving fog and IoT is presented. Case studies involving fog and IoT in healthcare systems are also presented to shed light on how fog and IoT ease the pressure on healthcare systems that require real-time processing. FDA deployment can be effective in remote monitoring systems, equipment monitoring, and smart equipment maintenance. Finally, the research issues and challenges in FDA are highlighted.

The chapter “The Importance of Fog Computing for Healthcare 4.0-Based IoT Solutions” highlights the amalgamation of cloud computing, fog computing, and IoT in Healthcare 4.0. The fog environment is created to address issues that are overlooked by the cloud computing model. It is an extension to cloud computing that keeps the devices close to the data center and cloud storage. The devices acting as an intermediary between the fog and cloud layers are known as fog nodes. They provide limited information to the end user or client on the suitability of fog devices and gateways for automating real-time applications such as health care and smart cities.


The last chapter “Conclusion” wraps up the FDA with the findings, future research challenges, and directions in IoT applications, for instance, health care and smart cities. This chapter shows that FDA contributes critically to decision-making in real-time IoT-based applications.

The editor is very thankful to all the members of Springer Private Limited, especially Mr. Aninda Bose, for the opportunity to edit this book.

Ahmedabad, Gujarat, India
Dr. Sudeep Tanwar

Contents

Introduction and Background of FDA

Introduction . . . 3
Aparna Kumari, Rajesh Gupta, and Sudeep Tanwar

Introduction to Fog Data Analytics for IoT Applications . . . 19
Puneet Kansal, Dilip Sharma, and Manoj Kumar

Fog Data Analytics: Systematic Computational Classification and Procedural Paradigm . . . 39
D. Pradeep Kumar, R. Hanumantharaju, B. J. Sowmya, K. N. Shreenath, and K. G. Srinivasa

Fog Computing: Building a Road to IoT with Fog Analytics . . . 59
Avinash Kaur, Parminder Singh, and Anand Nayyar

Data Collection in Fog Data Analytics . . . 79
S. R. Mani Sekhar, Snehil Tewari, Haaris Rahman, and G. M. Siddesh

Emerging Technologies and Architecture for FDA

Mobile FOG Architecture Assisted Continuous Acquisition of Fetal ECG Data for Efficient Prediction . . . 107
Anupam Bhardwaj, Pooja Khanna, and Sachin Kumar

Proposed Framework for Fog Computing to Improve Quality-of-Service in IoT Applications . . . 123
Rakhi Akhare, Monika Mangla, Sanjivani Deokar, and Vaishali Wadhwa

Fog Data Based Statistical Analysis to Check Effects of Yajna and Mantra Science: Next Generation Health Practices . . . 145
Rohit Rastogi, Mamta Saxena, D. K. Chaturvedi, Santosh Satya, Navneet Arora, Mayank Gupta, and Parul Singhal

Role of IoT in FDA

Process Model for Fog Data Analytics for IoT Applications . . . 175
Anjali Modi, Shreena Jani, Karansingh Chauhan, and Jitendra Bhatia

Medical Analytics Based on Artificial Neural Networks Using Cognitive Internet of Things . . . 199
Himani Bedekar, Gahangir Hossain, and Ayush Goyal

Application of IoT-Based Smart Devices in Health Care Using Fog Computing . . . 263
Satyasundara Mahapatra and Anupam Singh

Data Reduction Techniques in Fog Data Analytics for IoT Applications . . . 279
Srinidhi Hiriyannaiah, Zaifa Khan, Aniket Singh, G. M. Siddesh, and K. G. Srinivasa

Security Issues, Research Challenges, and Opportunities

Background and Research Challenges for Fog Data Analytics and IoT . . . 313
Ansh Riyal, Geetansh Kumar, and Deepak Kumar Sharma

Behavior-Based Approach for Fog Data Analytics: An Approach Toward Security and Privacy . . . 341
Urvashi, Lalit K. Awasthi, and Geeta Sikka

Data Security and Privacy Functions in Fog Data Analytics . . . 355
Apoorva Bhagat, Srishty Mittal, Uzma Faiz, and Deepak Kumar Sharma

Data Security and Privacy Functions in Fog Computing for Healthcare 4.0 . . . 387
Darpan Anand and Vineeta Khemchandani

FDA Applications in Health Care

Fog Data Analytics and Healthcare 4.0 . . . 423
Madhurima Hooda, Shashwat Pathak, and Shreyans Pathak

Fog Data Processing and Analytics for Health Care-Based IoT Applications . . . 445
Tarjni Vyas, Shivani Desai, and Anand Ruparelia

The Importance of Fog Computing for Healthcare 4.0-Based IoT Solutions . . . 471
U. Hariharan and K. Rajkumar

Conclusion . . . 495
Rajesh Gupta, Aparna Kumari, and Sudeep Tanwar

About the Editor

Dr. Sudeep Tanwar is an Associate Professor in the Computer Science and Engineering Department at Institute of Technology, Nirma University, Ahmedabad, Gujarat, India. He is a visiting Professor at Jan Wyzykowski University in Polkowice, Poland and the University of Pitesti in Pitesti, Romania. He received his B.Tech in 2002 from Kurukshetra University, India, his M.Tech (Honours) in 2009 from Guru Gobind Singh Indraprastha University, Delhi, India and his Ph.D. in 2016 with specialization in Wireless Sensor Network. He has authored or coauthored more than 130 technical research papers published in leading journals and conferences from the IEEE, Elsevier, Springer, Wiley, etc. Some of his research findings are published in top cited journals such as IEEE TNSE, IEEE TVT, IEEE TII, IEEE Access, Computer Communication, Applied Soft Computing, Journal of Parallel and Distributed Computing, Emerging Transactions on Telecommunication, Journal of Network and Computer Application, Pervasive and Mobile Computing, International Journal of Communication System, Telecommunication System, Computer and Electrical Engineering and IEEE Systems Journal. He has also contributed 10 edited/authored books with international and national publishers such as IET and Springer. He has guided many students to M.E./M.Tech degrees and is guiding students toward the Ph.D. He is an Associate Editor of IJCS, Wiley and Security and Privacy Journal, Wiley. His current interests include Wireless Sensor Networks, Fog Computing, Smart Grid, IoT, and Blockchain Technology. He has been invited as a Guest Editor and Editorial Board Member of many international journals, as a keynote speaker at many international conferences held in Asia, and as Program Chair, Publications Chair, Publicity Chair, and Session Chair at many international conferences held in North America, Europe, Asia and Africa. He has received best research paper awards from IEEE GLOBECOM 2018, IEEE ICC 2019, and Springer ICRIC-2019.


Introduction and Background of FDA

Introduction Aparna Kumari, Rajesh Gupta, and Sudeep Tanwar

Abstract
In the last couple of years, smart devices and sensors in IoT applications have grown drastically in number and now generate an extensive amount of multi-modal and heterogeneous data, designated as Big Data (BD). BD requires intelligent computation systems, such as data mining and data analytics, to handle BD-related storage challenges on a cloud. Presently, Cloud Computing (CC) is widely used in industry to handle BD challenges, but it raises various issues such as latency, security, and high cost for data management. Fog Computing (FC), a promising technology, operates at the edge of a cloud to handle the aforementioned issues. Analytics on fog data generated by diverse IoT devices is a challenging task. In this chapter, we discuss the complexity and uniqueness of Fog Data Analytics (FDA). A detailed discussion of the FDA architecture is abstracted into an innovative process model. This chapter highlights the various attributes of the FDA for IoT applications and discusses the FDA classification, such as fog data collection, storage, and analytics. The proposed FDA process model addresses numerous research challenges, such as scalability, accessibility, heterogeneity, reliability, nodal collaboration, and quality of service (QoS), and outlines future research directions.

Keywords
Fog data analytics · Cloud computing · Big data · Data analysis · Internet of Things · Fog computing

A. Kumari · R. Gupta · S. Tanwar (B) Nirma University, Ahmedabad, Gujarat, India e-mail: [email protected] A. Kumari e-mail: [email protected] R. Gupta e-mail: [email protected]

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 S. Tanwar (ed.), Fog Data Analytics for IoT Applications, Studies in Big Data 76, https://doi.org/10.1007/978-981-15-6044-6_1


1 Introduction

Since the inception of Information and Communications Technology (ICT) and the Internet of Things (IoT), the volume of generated data has increased rapidly. This data is termed Big Data (BD), and its aggregation and processing have become a foremost research concern. The IoT comprises various smart devices such as smart cars, smartphones, sensors (sensing the physical environment), and others, which are interconnected through a communication network without any human interaction. The ballooning number of IoT devices behind cloud-based applications has caused several BD challenges. The data storage challenges have been handled through Cloud Computing (CC) technology, which is extensively used in industry under the pay-as-you-go scheme [1]. It still faces BD challenges, for instance, latency and high cost, owing to its data distribution management and scalability approach [2]. Even though CC provides a flexible and scalable system for data analytics, security challenges between cloud and local assets increase downtime, among other factors [3, 4]. To handle these issues, the emerging technology of fog computing (FC), proposed by Cisco, plays a vital role owing to its decentralized architecture over the cloud [5]. A comparative analysis of CC and FC across parameters such as data communication cost and network bandwidth requirement is shown in Table 1. The table highlights the benefits of using FC along with CC (FC+CC) over a CC-only system for IoT applications. Unlike CC, FC handles latency-sensitive real-time applications at the edge of the network [6]. FC provides substantial processing power at edge nodes, which can process a huge volume of data on their own (without going through distant servers).
FC has Cloudlets (small-scale data centers), which support data-intensive IoT devices with low latency during data processing and transfer. FC offloads the cloud servers, enabling large data storage and increasing the processing power of the entire organization. FC requires intelligent solutions for decision-making, such as data analytics, data mining, and data reduction at the fog devices on the edge of the cloud [7, 8]. Further, Fog Data Analytics (FDA) consolidates CC, FC, and analytics on data at the cloud end [9]. The elementary function of fog is to improve data distribution and reduce latency [10]. A typical FDA life cycle comprises several functions, such as BD storage, BD distribution, and BD visualization, as shown in Fig. 1. In Fig. 1, the FDA architecture is divided into three layers: the End User Layer, the Fog Layer (FL), and the Cloud Layer (CL). The first (bottom) layer is the End User Layer, which contains users and sensor/IoT devices that are linked to the cloud through a communication channel. Here, data is collected from various IoT devices such as sensors and actuators and stored at the edge devices. The collected data is processed and analyzed, and only the relevant data is then sent to the cloud, which eases the BD issues [11]. Data extracted at this layer is forwarded to the next layer, the FL. The FL comprises various Fog Nodes (FNs) that handle priority-based requests [12].


Table 1 A comparative analysis between FC and CC

Features | Cloud | Fog
Network bandwidth requirement | High | Improved
Data communication cost | High | Low
Data storage capacity | High | Low
Data access latency (RTT) | High | Improved
Fault tolerance | No | Yes
Data flow | All | Relevant
System response time | High | Improved
System and data reliability | Manageable | Improved
Mobility support of smart devices | Random | Systematic
Security | Low | Improved
Privacy | High | Low
Battery lifetime | Low | Improved
Applications | All areas | Critical applications

IoT devices communicate with the FNs in the FL through a fog gateway. FL-to-CL communication then takes place through a cloud gateway; the CL is the topmost layer in the FDA architecture.

1.1 Internet of Things (IoT) Applications

The development of ICT has turned digital devices into smart devices. The IoT is essentially a network of devices that can record data, share data, and even perform some computation on data using digital hardware, software, sensors, and embedded chips. Each IoT device has a unique Internet Protocol (IP) address, which helps to establish and verify connections between these smart devices. Communication between IoT devices does not need any predefined interaction such as Human-to-Computer (H2C), Computer-to-Human (C2H), or Human-to-Human (H2H). IoT applications have been effective in various industry sectors such as health care [13], smart cities, and smart grid with data storage and processing facilities [14]. However, the IoT has several limitations related to the BD generated from IoT devices, such as small computation capacity and limited data analysis power, which need to be mitigated using appropriate approaches like FDA [7].


Fig. 1 Layered FDA architecture

1.2 Fog Computing and Its Role in FDA

Intel estimates that the average automated vehicle produces around 40 TB of data every 8 h [15]. In this scenario, FC infrastructure is commonly provisioned to use the relevant data for specific tasks [16]. The remaining data, which is not time-critical for the specified task or process, is forwarded to the cloud. The cloud offers extended computing resources for storing the BD that edge devices produce but do not use [17]. The cloud also provides supplementary computing resources for analytics, which makes it a complementary ecosystem for FC-based applications. An FC infrastructure consists of various functions and components depending on its real-time application. It includes computing gateways, which accept BD from diverse collection endpoints like routers and switches (connecting assets within a network) or data sources. FC is a decentralized architecture that places computing resources close to the data-generation sources. Processing data close to the edge decreases latency and uses fewer computing resources [18].

The steps to process BD through the FDA (using FC architecture) in an IoT application are as follows:

1. The signals from various IoT devices are read by an automation controller.
2. The controller executes the system program required to automate the IoT devices.
3. The control system program sends the data through a standard gateway protocol or an Open Platform Communications (OPC) server. OPC is an interoperability standard for data exchange.
4. The data is converted into a protocol format, such as HTTP(S) or MQTT, that Internet-based services accept.
5. After conversion, the data is sent to FNs or an IoT gateway. These endpoints receive the data for analysis or transfer it to the cloud.
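Steps 4 and 5 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the chapter's implementation: the `Reading` and `FogNode` classes, the topic layout `plant/line1/<device>/<metric>`, and the alarm threshold are all hypothetical, and the message is only serialized in an MQTT-like topic/payload shape rather than sent through a real broker.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Reading:
    """One value read by the automation controller (steps 1-3)."""
    device_id: str
    metric: str
    value: float
    timestamp: float


def to_message(reading, topic_prefix="plant/line1"):
    """Step 4: convert a reading into a topic string plus a JSON payload."""
    topic = f"{topic_prefix}/{reading.device_id}/{reading.metric}"
    payload = json.dumps(asdict(reading), sort_keys=True)
    return topic, payload


class FogNode:
    """Step 5: analyse a message locally or queue it for the cloud."""

    def __init__(self, alarm_threshold):
        self.alarm_threshold = alarm_threshold  # hypothetical relevance rule
        self.local_alarms = []   # handled at the fog node
        self.cloud_outbox = []   # forwarded to the cloud later

    def receive(self, topic, payload):
        data = json.loads(payload)
        if data["value"] >= self.alarm_threshold:
            self.local_alarms.append((topic, data["value"]))
        else:
            self.cloud_outbox.append((topic, payload))


readings = [
    Reading("pump-7", "temperature", 71.5, 1000.0),
    Reading("pump-7", "temperature", 93.2, 1001.0),
]
node = FogNode(alarm_threshold=90.0)
for r in readings:
    node.receive(*to_message(r))
```

In this sketch the reading above the threshold is treated as time-critical and kept at the fog node, while the other message is queued for the cloud; a real deployment would replace the threshold rule with whatever relevance test the application defines.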

1.3 Process Model for FDA

In the FDA process model, shown in Fig. 2, the multi-modal BD generated by heterogeneous fog devices is collected and aggregated. After collection and preprocessing, the BD is stored in the storage system and then forwarded for analytics through priority gateways. Historical data is processed and analyzed as routine events at the FNs. Real-time data is handled as urgent events, which are processed and analyzed in real time at the FNs as the IoT application requires [19]. Subsequently, BD intelligence is applied to the processed data, and useful information is retrieved for decision-making. The BD is then transmitted to other FNs or to the cloud for further analytics.
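The distinction between urgent and routine events can be illustrated with a small priority queue. The sketch below assumes a two-level priority scheme; the `PriorityGateway` class and the event names are hypothetical and only show one way a priority gateway might order work at a fog node.

```python
import heapq
from itertools import count

URGENT, ROUTINE = 0, 1  # lower number is served first


class PriorityGateway:
    """Serves urgent (real-time) events before routine (historical) ones."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker: keeps arrival order per class

    def submit(self, priority, event):
        heapq.heappush(self._heap, (priority, next(self._order), event))

    def drain(self):
        """Yield (priority, event) pairs in service order until empty."""
        while self._heap:
            priority, _, event = heapq.heappop(self._heap)
            yield priority, event


gateway = PriorityGateway()
gateway.submit(ROUTINE, "hourly-average")
gateway.submit(URGENT, "gas-leak-alarm")
gateway.submit(ROUTINE, "daily-report")
served = [event for _, event in gateway.drain()]
```

The urgent event is served first even though it arrived second, and the two routine events keep their arrival order.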

1.4 FDA Attributes for IoT Applications

The computational flow in FDA from cloud to fog is quite similar to data analytics in CC, the only difference being the inclusion of the edge devices [20]. This computational approach to data analytics defines the attributes of FDA shown in Fig. 3, which enable the extensive development of IoT applications and services and are listed below.

Fig. 2 Process model of FDA

Fig. 3 FDA attributes for IoT applications

1.4.1 Heterogeneity

Heterogeneity comprises hierarchical components that work as building blocks of the distributed architecture of the FDA. FC infrastructure provides the major FDA facilities, for instance, data storage, data computation, and networking services between the core cloud and end devices.

1.4.2 Interoperability

To support a wide range of services for IoT applications, fog devices and FDA work in an interoperating environment. These services could be real-time data analytics, artificial intelligence, predictive decisions, and data streaming [21, 22].

1.4.3 Real-Time Interaction

The FDA can work in real time to achieve better QoS, for instance, in energy distribution [23] and monitoring systems in the smart grid and in real-time traffic monitoring in intelligent transportation systems (ITS).

1.4.4 Cognition

Here, the goal is to be user-centric. The FDA focuses on data access and analytics based on user requirements, in order to better understand the needs of the user. It also provides the best place to control, store, and transmit data along the path from the cloud to the end devices located at the edge of the IoT application. IoT applications that incorporate FC provide better responses to the user because of their nearness to end devices, and they reflect the users’ requirements more efficiently [24].

1.4.5 Geographical Environment Distribution

QoS can be maintained effectively across both dynamic and static edge devices in IoT applications. The application network consists of geographically dispersed FNs and sensors in various environments, for example weather monitoring, temperature monitoring, and healthcare monitoring [25].

1.4.6 Edge Location with Low Latency

Recent research and development of applications for IoT-based smart devices has been recognized as insufficient because of the lack of proximity between the devices. For better QoS at the edge of the IoT application, low-latency services are required for applications such as video streaming TV services and live gaming [26].

1.5 Classification for FDA in IoT Application

1.5.1 Data Collection

Various data collection systems and components exist in the data analytics ecosystem. These systems can produce duplicate data, erroneous data, and missing values, which call for careful preprocessing. The prevalent components for data collection are as follows:

– IoT Devices: The IoT has opened a door in the field of ICT by combining the computing and physical ecosystems. It enhances efficiency and accuracy as it reduces human intervention. A few examples are smart cities, smart homes, smart grids, virtual power plants, ITS, and many more [27]. As each element is uniquely recognizable within the network, it provides global connectivity.
– Sensory Devices: The sensor is the core device for automatic data collection (without human intervention) in cloud-based IoT applications. The data generated from various sources is characterized as voice, image, vibration, weather, pressure, temperature, current, vehicles, and so on, within a specific time interval. This huge volume of data (BD) is transferred primarily through a Local Area Network (LAN) or a wireless network for BD collection, storage, and processing.

The FDA provides better communication and services by establishing a strong network connection through LAN/WAN (5G/4G) [28] or wireless network technologies (Bluetooth/ZigBee) [29]; the corresponding edge capacities at the FN are shown in Figs. 4 and 5. Figure 4 illustrates the various network technologies, such as Wi-Fi and ZigBee, that empower FDA communication between edge devices. The sensory devices expose an API through which data reaches IoT applications using the device’s unique IP address. There are other potential data sources, such as social media websites and social networking APIs.
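The preprocessing need noted at the start of this subsection (duplicates, erroneous values, missing values) can be sketched as follows. This is an illustrative cleaning pass under simple assumptions: readings are `(timestamp, value)` pairs, a duplicate is a repeated timestamp, an erroneous value is one outside a plausible sensor range, and gaps are filled with the last valid value. The function name and the range limits are hypothetical.

```python
def preprocess(readings, low, high):
    """Drop duplicate timestamps, discard out-of-range values, fill gaps."""
    seen = set()
    cleaned = []
    for ts, value in readings:
        if ts in seen:  # duplicate timestamp: keep the first copy only
            continue
        seen.add(ts)
        if value is not None and not (low <= value <= high):
            value = None  # erroneous reading is treated as missing
        cleaned.append([ts, value])

    # Forward fill: replace each missing value with the last valid one.
    last = None
    for row in cleaned:
        if row[1] is None:
            row[1] = last
        else:
            last = row[1]

    # Rows before the first valid value cannot be filled and are dropped.
    return [(ts, v) for ts, v in cleaned if v is not None]


raw = [(1, 20.5), (1, 20.5), (2, None), (3, 999.0), (4, 21.0)]
clean = preprocess(raw, low=-40.0, high=85.0)
```

Forward filling is only one possible policy; interpolation or dropping the affected rows may suit other sensors better.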

Fig. 4 Empowering network technologies for FDA communication

Fig. 5 Edge capacity of wireless models in FDA

1.5.2 Data Storage

Data storage is broadly divided into three categories: clustering, indexing, and replication. In clustering, a group of data is collected and stored on the fog storage devices. In indexing, the data carries index entries for quick retrieval and fast access. In real-time IoT applications, complex data is streamed by combining a sequential approach for newly arrived data with an indexing approach for old data. In the last category, replication, the same data is copied to other machines to provide fault tolerance.
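Indexing and replication, as described in this subsection, can be combined in a toy in-memory store. The sketch below is an assumption-laden illustration: `FogStore`, `ReplicatedStore`, and the node names are hypothetical, records are appended sequentially and indexed by device, and every write goes to all replicas that are currently up.

```python
class FogStore:
    """One fog storage node: sequential record log plus an index by device."""

    def __init__(self, name):
        self.name = name
        self.records = []  # records in arrival order
        self.index = {}    # device_id -> positions in self.records

    def put(self, device_id, value):
        self.index.setdefault(device_id, []).append(len(self.records))
        self.records.append((device_id, value))

    def get(self, device_id):
        return [self.records[i][1] for i in self.index.get(device_id, [])]


class ReplicatedStore:
    """Writes to every available replica; reads from the first one that is up."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.down = set()  # names of replicas currently unavailable

    def put(self, device_id, value):
        for node in self.nodes:
            if node.name not in self.down:
                node.put(device_id, value)

    def get(self, device_id):
        for node in self.nodes:
            if node.name not in self.down:
                return node.get(device_id)
        raise RuntimeError("no replica available")


store = ReplicatedStore([FogStore("fn-a"), FogStore("fn-b")])
store.put("meter-1", 3.2)
store.put("meter-2", 7.0)
store.put("meter-1", 3.4)
store.down.add("fn-a")           # simulate a failed fog node
values = store.get("meter-1")    # served by the surviving replica
```

The sketch does not resynchronize a replica that missed writes while it was down; a practical system would need a catch-up step before that node serves reads again.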

1.5.3 Data Processing

FC has a well-known distributed architecture for data exchange and processing. In this architecture, most of the systems are controlled remotely and the others are handled at the edge of the cloud. FC tends to reduce the volume of data that is passed on to CC for storage, processing, and analysis [30]. Data processing in an FC system happens in the smart devices that lie at the edge of the application network. This also supports data security and privacy, since only relevant data is transferred to the cloud [31, 32]. The most vital attribute of the FDA is its capability to filter the data that needs to be processed at the cloud layer. There are three basic components: (i) IoT Verticals (resident applications or products: smart devices such as communicators, external interfaces, controllers, and so on), (ii) the Orchestration Layer (consists of predefined functions, for instance, data migration, data sharing, decision-making, and policy supervision), and (iii) the Abstraction Layer (provides a uniform interface to the end user, e.g., a generic API).
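The data-reduction idea in this subsection, sending only relevant data to the cloud, can be sketched as a window summary. The function below is a hypothetical example: instead of forwarding every raw sample, a fog node forwards a compact summary together with any values above an assumed limit.

```python
from statistics import mean


def summarise_window(values, limit):
    """Reduce a window of raw samples to a summary plus the outliers."""
    return {
        "count": len(values),
        "mean": round(mean(values), 2),
        "max": max(values),
        "anomalies": [v for v in values if v > limit],
    }


window = [20.1, 20.3, 20.2, 35.0, 20.4]
summary = summarise_window(window, limit=30.0)
```

Here five raw samples are reduced to one small record; what counts as relevant (the `limit` rule in this sketch) is application-specific.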

1.5.4 Data Analysis

Analysis in the FDA differs from conventional computing methods in that it combines analytics on historical data with real-time analytics. In the IoT revolution, every industry is moving toward processing huge volumes of data and real-time analytics to achieve better QoS. Such real-time analytics can be facilitated by the FC architecture. For example, the smart grid industry leverages deep BD analysis for prediction of power failures, power consumption, energy prices, and demand forecasting to achieve higher productivity and better QoS.
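One simple way to combine historical and real-time analytics is to derive a baseline from stored data and score each incoming value against it. The sketch below uses a z-score test as a stand-in technique, which the chapter does not prescribe; the function names, the sample load values, and the limit of 3 standard deviations are assumptions.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Historical analytics: mean and sample standard deviation."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, z_limit=3.0):
    """Real-time analytics: flag values far from the historical mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_limit


history = [50.0, 52.0, 49.0, 51.0, 48.0, 50.0]  # e.g. past load in kW
baseline = build_baseline(history)
stream = [51.0, 49.5, 75.0]                      # values arriving now
flags = [is_anomalous(v, baseline) for v in stream]
```

Only the last streamed value deviates strongly from the historical baseline, so it is the only one flagged.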

1.6 FDA Research Challenges and Future Direction

FDA for IoT applications, in the context of real-time access, heterogeneity, and above all interoperability, requires BD storage and processing. Although many solutions exist, they cannot keep pace with the exponential growth rate of data. Presently, few software packages and tools exist that address FDA issues for IoT applications. Hence, potential research challenges and issues are identified, as delineated in Fig. 6, which demand the development of a process model for the FDA life cycle. Pursuing these challenges will expedite the advancement of the FDA.

1.6.1 Heterogeneity in Fog

Fog is located close to the end devices (relatively at the edge), which are connected via a fog gateway. The fog gateway is responsible for managing and maintaining the heterogeneous connectivity between the fog and the end devices. Emerging technologies such as network virtualization (NV), network function virtualization (NFV), and software-defined networking (SDN) can be used to maintain the network efficiently and to increase network reliability and scalability.


Fig. 6 Research challenges in FDA communication

1.6.2 Data Reliability

Reliability is one of the backbone characteristics of the network and ensures maximum network availability. To achieve it, a network needs to perform periodic checks and locate failure points so that failed components can be brought back into service. As the fog network is highly dynamic, it is not feasible to check and reschedule failed components periodically. Another reason is the latency of the communication network. An alternative way to make the system fault-tolerant is therefore to replicate the fog nodes.

1.6.3 Storage Capacity

The storage capacity of a fog node is limited compared to that of the cloud. Fog nodes store the data that is needed urgently (critical data). Smart end devices now generate a huge amount of data, called big data, which fog nodes are not well equipped to store [33]. This also limits the analysis capability.

1.6.4 Data Communication Latency

A fog-based network needs to perform real-time analysis of data instead of batch processing. This requires all resources to be available at the same time for processing the real-time data, but the storage capacity of a fog node is limited. As all fog nodes are connected to each other for data and resource sharing, latency is the major issue. High latency reduces the quality of service (QoS) of the fog network.

1.6.5 Data Distribution

Most applications require a trusted data flow between multiple fog devices, which requires pre-attribute definitions. Data distribution among the fog nodes is essential because the capacity of a single fog node is quite limited and cannot store a large amount of data. It is challenging because of communication network properties such as latency, reliability, and throughput.

1.6.6 Resource Management

In FDA, resource management can be divided into application-aware management and detection and sharing management. The challenges in the first case are end-device mobility, network latency, storage capacity, and network bandwidth; it plays a vital role in mobile sensing applications. The latter case is concerned with maintaining the performance of the fog network. The main challenge in detection and sharing management is energy harvesting.

1.6.7 Security and Privacy

Presently, very few researchers have focused on the security and privacy issues of the FDA. Authentication, authorization, and access control mechanisms are quite difficult to achieve in the fog environment because the data is distributed among the different fog nodes of a heterogeneous network [34]. Each fog node has different computing capabilities, so all of the above security and privacy mechanisms are difficult to achieve. One possible and efficient solution is to create a trusted execution environment.

1.6.8 Processing Capability

Fog nodes are battery-constrained devices with a limited lifetime, so processing big data on individual fog nodes is not feasible. Limited processing capability can deliver incorrect results to the end nodes, which may be harmful to them. This can be addressed using distributed fog computing.
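As a rough illustration of distributed fog computing, the sketch below splits a workload across nodes in proportion to an assumed capacity figure and combines partial results. The `partition` helper and the capacity values are hypothetical, and the nodes are simulated in a single process rather than run on separate devices.

```python
def partition(items, capacities):
    """Split items into contiguous chunks sized in proportion to capacities."""
    total = sum(capacities)
    chunks, start = [], 0
    for i, cap in enumerate(capacities):
        if i == len(capacities) - 1:
            end = len(items)  # last node takes whatever remains
        else:
            end = start + (len(items) * cap) // total
        chunks.append(items[start:end])
        start = end
    return chunks


readings = list(range(1, 11))                    # ten samples: 1..10
chunks = partition(readings, capacities=[1, 1, 2])

# Each node computes a partial (sum, count); a coordinator merges them.
partials = [(sum(chunk), len(chunk)) for chunk in chunks]
overall_mean = sum(s for s, _ in partials) / sum(n for _, n in partials)
```

Because sums and counts combine exactly, the merged mean equals the mean over the full data set, while no single node has to hold or process all of it.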


2 Conclusion

The explosion of sensors, smart devices, and the IoT has generated BD at a drastic rate in recent years. The huge amount of data from these IoT devices demands efficient storage, processing, and analysis at high frequency in real time. In this chapter, we have targeted the emerging intersection of FC and CC technologies with a distinctive focus on analytics on fog-based BD. The FDA has extensive usage in a variety of IoT applications such as smart cities, satellite imaging, health care, ITS, and smart grid. FDA is important for unlocking the full advantages of these IoT applications in present society. This chapter provides an introduction to BD communication and analytics in FC architecture in relation to the concepts of conventional CC. It also discusses numerous challenges in FDA data processing and analysis and develops a broad classification. Further, an advanced process model is abstracted through the FDA life cycle for BD computing. This process model targets the research challenges and practices of FDA computing in this field of research.

References

1. Tang, B., Chen, Z., Hefferman, G., Wei, T., He, H., Yang, Q.: A hierarchical distributed fog computing architecture for big data analysis in smart cities. In: Proceedings of the ASE BigData & SocialInformatics 2015, ASE BDSI ’15. Association for Computing Machinery, New York, NY, USA (2015)
2. Taneja, M., Jalodia, N., Davy, A.: Distributed decomposed data analytics in fog enabled iot deployments. IEEE Access 7, 40969–40981 (2019)
3. Stackpath: What is fog computing? (2018). https://www.stackpath.com/edge-academy/fogcomputing/
4. Buyya, R., Srirama, S.N.: Fog Computing Realization for Big Data Analytics, pp. 259–290 (2019)
5. Bonomi, F., Milito, R., Zhu, J., Addepalli, S.: Fog computing and its role in the internet of things. In: Proceedings of the First Edition of the MCC Workshop on Mobile Cloud Computing, MCC ’12, pp. 13–16. Association for Computing Machinery, New York, NY, USA (2012)
6. Roman, R., Lopez, J., Mambo, M.: Mobile edge computing, Fog et al.: a survey and analysis of security threats and challenges. Future Gener. Comput. Syst. 78, 680–698 (2018)
7. What is fog data analytics? how to leverage it for business purposes? (2018). https://datafloq.com/read/fog-data-analytics-leverage-business-purposes/7310
8. He, J., Wei, J., Chen, K., Tang, Z., Zhou, Y., Zhang, Y.: Multitier fog computing with large-scale iot data analytics for smart cities. IEEE Internet of Things J. 5(2), 677–686 (2018)
9. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N., Obaidat, M.S., Rodrigues, J.J.P.C.: Fog computing for smart grid systems in the 5g environment: challenges and solutions. IEEE Wirel. Commun. 26(3), 47–53 (2019)
10. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N., Parizi, R.M., Choo, K.-K.R.: Fog data analytics: a taxonomy and process model. J. Netw. Comput. Appl. 128, 90–104 (2019)
11. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N., Maasberg, M., Choo, K.-K.R.: Multimedia big data computing and internet of things applications: a taxonomy and process model. J. Netw. Comput. Appl. 124, 169–195 (2018)
12. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N.: Fog computing for healthcare 4.0 environment: opportunities and challenges. Comput. Electr. Eng. 72, 1–13 (2018)


13. Gupta, R., Tanwar, S., Tyagi, S., Kumar, N.: Tactile-internet-based telesurgery system for healthcare 4.0: an architecture, research challenges, and future directions. IEEE Netw. 33(6), 22–29 (2019)
14. Badidi, E., Moumane, K.: Enhancing the processing of healthcare data streams using fog computing. In: 2019 IEEE Symposium on Computers and Communications (ISCC), pp. 1113–1118 (2019)
15. Mehta, P., Gupta, R., Tanwar, S.: Blockchain envisioned uav networks: challenges, solutions, and comparisons. Comput. Commun. 151, 518–538 (2020)
16. Nelson, P.: Just one autonomous car will use 4,000 gb of data/day (2016). https://www.networkworld.com/article/3147892/one-autonomous-car-will-use-4000-gb-of-dataday.html
17. Prasad, V., Bhavsar, M., Tanwar, S.: Influence of monitoring: fog and edge computing. Scalable Comput. 20, 365–376 (2019)
18. Kumar, R., Kalra, M., Tanwar, S., Tyagi, S., Kumar, N.: Min-parent: an effective approach to enhance resource utilization in cloud environment. In: 2016 International Conference on Advances in Computing, Communication, Automation (ICACCA) (Spring), April 2016, pp. 1–6 (2016)
19. Tanwar, S., Bhatia, Q., Patel, P., Kumari, A., Singh, P.K., Hong, W.: Machine learning adoption in blockchain-based smart applications: the challenges, and a way forward. IEEE Access 8, 474–488 (2020)
20. Tanwar, S., Vora, J., Kanriya, S., Tyagi, S., Kumar, N., Sharma, V., You, I.: Human arthritis analysis in fog computing environment using Bayesian network classifier and thread protocol. IEEE Consum. Electron. Mag. (2018)
21. Ma, B.B., Fong, S., Millham, R.: Data stream mining in fog computing environment with feature selection using ensemble of swarm search algorithms. In: 2018 Conference on Information Communications Technology and Society (ICTAS), pp. 1–6 (2018)
22. Gupta, R., Tanwar, S., Al-Turjman, F., Italiya, P., Nauman, A., Kim, S.W.: Smart contract privacy protection using ai in cyber-physical systems: tools, techniques and challenges. IEEE Access, p. 1 (2020)
23. Jayaraman, P.P., Bártolo Gomes, J., Nguyen, H., Abdallah, Z.S., Krishnaswamy, S., Zaslavsky, A.: Scalable energy-efficient distributed data analytics for crowdsensing applications in mobile environments. IEEE Trans. Comput. Soc. Syst. 2(3), 109–123 (2015)
24. Chiang, M., Zhang, T.: Fog and IoT: an overview of research opportunities. IEEE Internet of Things J. 3(6), 854–864 (2016)
25. Hathaliya, J., Sharma, P., Tanwar, S., Gupta, R.: Blockchain-based remote patient monitoring in healthcare 4.0. In: 2019 IEEE 9th International Conference on Advanced Computing (IACC), December 2019, pp. 87–91 (2019)
26. More, P.: Review of implementing fog computing. Int. J. Res. Eng. Technol. 4(06), 335–338 (2015)
27. Darwish, T.S.J., Abu Bakar, K.: Fog based intelligent transportation big data analytics in the internet of vehicles environment: motivations, architecture, challenges, and critical issues. IEEE Access 6, 15679–15701 (2018)
28. Gupta, R., Tanwar, S., Tyagi, S., Kumar, N.: Tactile internet and its applications in 5g era: a comprehensive review. Int. J. Commun. Syst. 32(14), e3981 (2019)
29. Borresen, J.L., Jensen, C.S., Torp, K.: Fogbat: combining bluetooth and gps data for better traffic analytics. In: 2016 17th IEEE International Conference on Mobile Data Management (MDM), vol. 1, pp. 325–328 (2016)
30. Gupta, R., Tanwar, S., Tyagi, S., Kumar, N.: Machine learning models for secure data analytics: a taxonomy and threat model. Comput. Commun. 153, 406–440 (2020)
31. Gupta, R., Tanwar, S., Tyagi, S., Kumar, N., Obaidat, M.S., Sadoun, B.: Habits: blockchain-based telesurgery framework for healthcare 4.0. In: 2019 International Conference on Computer, Information and Telecommunication Systems (CITS), August 2019, pp. 1–5 (2019)
32. Aljumah, A., Ahanger, T.A.: Fog computing and security issues: a review. In: 2018 7th International Conference on Computers Communications and Control (ICCCC), pp. 237–239 (2018)


33. Tanwar, S.: Verification and validation techniques for streaming big data analytics in internet of things environment. IET Netw. (2018)
34. Batool, S., Saqib, N.A., Khan, M.A.: Internet of things data analytics for user authentication and activity recognition. In: 2017 Second International Conference on Fog and Mobile Edge Computing (FMEC), pp. 183–187 (2017)

Introduction to Fog Data Analytics for IoT Applications Puneet Kansal, Dilip Sharma, and Manoj Kumar

Abstract The dictionary meaning of fog is vapor condensed into fine particles of water suspended in the lower atmosphere, which differs from cloud only in being near the ground. Fog computing works in almost the same way as its dictionary meaning suggests. The term fog computing was coined by Cisco Systems Inc., which later formed a consortium named the “OpenFog Consortium” with other high-tech companies and academic institutions around the world, aimed at the standardization and promotion of fog computing (Bonomi et al. in Proceedings of the First Edition of the MCC Workshop on Mobile Cloud Computing-MCC ’12, Helsinki, Finland, pp. 13–15 [1]). Fog computing is an extension of cloud computing that helps strengthen the cloud paradigm. Cloud is the backbone for IoT devices. According to Gartner, Inc., 20.4 billion connected things will be in use worldwide by 2020. All these devices will produce huge amounts of hot data that need to be processed quickly. Fog computing brings the cloud closer to IoT devices and provides real-time processing of data. Fog computing in the IoT also helps improve efficiency and performance and reduces the amount of data transferred to the cloud for processing, analysis, and storage. In this chapter, we examine the role of fog computing in improving the IoT/cloud paradigm. The chapter introduces the genesis of fog computing, the role of fog computing in IoT applications, the reasons to use fog computing, the architecture of fog computing, how fog computing works, fog nodes, the characteristics of fog computing, a comparison of fog computing with cloud computing, the difference between fog and edge computing, the advantages and disadvantages of fog computing, and the applications of fog computing. After studying this chapter, you will be familiar with the overall concept of fog computing in the context of IoT applications.

P. Kansal (B) · M. Kumar Computer Department, Delhi Technological University, Delhi, India e-mail: [email protected] M. Kumar e-mail: [email protected] D. Sharma Computer Engineering and Application Department, GLA University, Mathura, India © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 S. Tanwar (ed.), Fog Data Analytics for IoT Applications, Studies in Big Data 76, https://doi.org/10.1007/978-981-15-6044-6_2


Keywords Genesis of fog computing · Role of fog computing in IoT applications · Architecture of fog computing · Working of fog paradigm · Fog nodes · Advantages of fog computing · Applications of fog computing

1 Introduction

In today’s informatics world, data is the most important commodity. The Internet of Everything (IoE) plays a very important role in generating a huge volume of data, which is growing exponentially due to the explosion of Internet of Things (IoT) devices. By analyzing this data, important information can be retrieved that helps in business gains such as product recommendations, online monitoring of the manufacturing industry, monitoring of patients, etc. According to Cisco estimates, there will be more than 50 billion interconnected devices by the year 2020 [2]. As per Machina Research, IoT devices will grow from 6 to 27 billion in the decade 2015–2025. By the year 2025, IoT connections will cross 2.2 billion, of which connected cars have the biggest share at 45%. They also forecast the revenue potential of IoT-based devices at 3 trillion US dollars by 2025. The Internet of Things will also generate more data traffic along with more revenue. A huge amount of data, estimated at over 2 zettabytes, will be generated, mostly from consumer electronic devices, as shown in Fig. 1.

Fig. 1 Growth forecast for IoT-based sector till 2025 [3]

The current mobile network architectures are not capable of managing the momentum and magnitude of this big data. Following past trends, the big data that needs to be analyzed and stored is pushed to the cloud [4]. But with increasing data velocity and size, sending the big data from interconnected devices to the data centers might be expensive because of bandwidth constraints, and it may suffer from high latency and a lack of location awareness and mobility support. For emerging location-aware and time-sensitive applications (such as patient monitoring, self-driving cars, flocks of drones, and real-time systems), distant data centers (cloud) might not be able to provide low-latency service [5]. Privacy concerns may be another issue when sending data to the cloud. So, to address low-latency requirements, high bandwidth usage for cloud communication, geographical dispersion, and privacy concerns, there is a need for a paradigm that can work close to the interconnected edge devices. Fog is an emerging computing paradigm that offers resources close to the edge of the network. The fog paradigm spans the range between the network edge and the cloud, inclusive of both, as shown in Fig. 2. The fog concept is motivated by the need to process high-velocity, high-volume data from the Internet of Things with low latency and privacy. Fog computing was proposed by both academia and industry [6, 7] to solve the issues faced by the combined cloud and IoT paradigm. While fog provides a computing solution close to edge devices, it also remains connected with the cloud paradigm. Fog does not replace the cloud but provides a similar facility at the network edge. The fog paradigm provides high-speed computing, data storage, and application services to end users, hosted on various network devices such as routers, switches, and set-top boxes. It reduces service latency, improves privacy, and provides Quality-of-Service, which results in a better end-user experience. Fog is an emerging architecture for computing that supports a variety of applications, such as artificial intelligence, the Internet of Things (IoT), and 5th generation wireless systems.

Fig. 2 Fog computing lies between cloud and end devices


P. Kansal et al.

Fig. 3 Global fog computing market analysis by Zion market research [9]

Fog technology has characteristics that make it well suited to applications requiring cloud services with real-time interaction and low latency. The volume and velocity of data are increasing, while the costs of storage, online processing, and bandwidth are going down, which allows the IoT and cloud to grow. But instead of transmitting all IoT data directly to the cloud, it is better to process data close to the edge devices and send only the metadata to the cloud. This helps in achieving ultralow latency. Edge devices cannot perform all the computations by themselves, which is where fog computing comes in. The global fog computing market is growing at a very fast pace [8], as shown in Fig. 3. According to a survey by Zion Market Research, the fog computing market will increase from USD 35 million in 2018 to USD 768 million in 2025 at a CAGR of 55.6%.
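The quoted growth rate can be checked from the two market sizes. The short sketch below assumes the standard compound annual growth rate formula, (end / start)^(1 / years) - 1, and a seven-year span from 2018 to 2025; the function name `cagr` is chosen here for illustration. Under those assumptions the result comes out at roughly 55% per year, close to the reported figure; a small difference may come from the base year used in the survey.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.5 means 50% per year)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the text: USD 35 million (2018) to USD 768 million (2025).
rate = cagr(35.0, 768.0, 2025 - 2018)
print(f"Implied CAGR: {rate:.1%}")
```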

1.1 Formally Defining Fog Computing

As per the OpenFog Consortium, fog computing is “a system-level horizontal architecture that distributes resources and services of computing, storage, control and networking anywhere along the continuum from cloud to Things”.

1.2 Cisco Vision of Fog Computing

Cisco’s IOx platform implements fog computing by combining Cisco IOx application support with IoT devices [10]. Cisco IOx applications are installed on Cisco network devices, such as IP video cameras, switches, and routers, which provide real-time data processing.


Cisco Inc. introduced the term fog computing [1] to ease wireless data transfer to edge devices. IoT is becoming omnipresent and is generating huge volumes of data. Cisco network devices with substantial computing capability can process data at a lower level of the network. Cisco’s vision is to enable fog computing applications to run directly at the network edge on billions of interconnected devices. There are a huge number of such devices across the world, and they are managed in a broadly homogeneous way. These resourceful devices can perform computation and can even run applications, so users can develop, manage, and run the required software applications on Cisco network devices. Cisco combines its network operating system and open-source Linux in a single network device. The device therefore offers not only computing capacity but also a platform on which applications can run. This is possible because resources are available at different layers of the network toward the edge. Cisco Inc. is investing heavily in fog computing, but it is not alone: other companies such as VMware, IBM, EMC, and Intel also deliver edge computing capabilities. The OpenFog Consortium is a group of academic institutions (including Princeton University) and high-tech companies (including Cisco Systems, Intel, Dell, Microsoft, and ARM) working to advance standards in fog technology. It was constituted in 2015 and grew to 57 members across Europe, Asia, and North America, including reputed academic institutions and Forbes 500 companies [11]. In 2019, the OpenFog Consortium merged with the Industrial Internet Consortium (IIC) [12].

2 Role of Fog Computing in IoT Applications

By the year 2020, there will be approximately 30 billion interconnected devices worldwide, and by 2025 the number will reach 75 billion, according to Statista, an online statistics portal, as shown in Fig. 4. All these devices will produce large amounts of hot data for real-time applications. IoT speeds up response and awareness in real-time applications. In industries such as oil and gas, manufacturing, transportation, mining, utilities, and the public sector, real-time processing can enhance service, output, and safety. IoT technology is also used to protect wildlife [14] and in smart tourism [15]. The cloud paradigm and IoT work well together: IoT provides data, and the cloud processes and stores it. Various types of analysis are performed on IoT-generated data, which helps in making better decisions. But with the advance of real-time applications, the combined cloud and IoT paradigm is facing many problems, such as high latency and high bandwidth requirements.

Fig. 4 Interconnected devices worldwide from 2015 to 2025 (in billions) [13]

Sending data to the cloud and receiving it back is time-consuming because data centers are centralized, and a lot of bandwidth is required to send huge volumes of data to the cloud. Today’s cloud paradigms are not suitable for applications that require low latency, variety, and real-time analytics, as shown in Fig. 5. They were not built to handle the huge volumes of data produced by the growing number of IoT devices. IoT sensors generate data constantly and send it to the cloud for processing, and this continuous stream makes it increasingly difficult for the cloud to store and process the data. Sensors are becoming part of daily life; because they are small, we wear them as gadgets. Mobile phones, health gadgets, factory sensors, pollution detection sensors, and many others send hot data to the cloud for processing. The present cloud network cannot process data in real time because of its physical centralization. This is where fog computing comes in: it can process real-time data and also reduce the need for high bandwidth. The distributed approach of the fog paradigm addresses the needs of IoT networks, and fog computing reduces the bandwidth needed by processing data at the edge of the network. Delays in data transmission can be life-threatening in many real-time applications, for example in smart rail, vehicle-to-vehicle communication systems, manufacturing and utilities, telemedicine, smart grid deployments, and patient care environments, where milliseconds matter. Cloud–IoT alone is not suitable for these types of applications. The fog paradigm is spread across the nodes that lie between edge devices and cloud systems, which helps in processing data quickly and providing real-time responses to end users.
The fog paradigm provides many cloud-like services at the network edge. Data analytics and data storage are among its important tasks. As the fog network is close to the edge devices, it provides real-time data processing for end users. The cloud network is not efficient at real-time data processing, so fog simply extends cloud-like services to the network edge, but this does not mean fog is to replace cloud; fog computing is an extension of cloud computing. Network hardware manufacturers are taking more interest in fog computing because they can easily adapt their existing architecture to the fog paradigm. Companies like Cisco, Intel, and Dell are working with artificial intelligence vendors to produce routers, switches, and other IoT gateways that support fog technology.

Fig. 5 Centralized cloud data center problems [16]
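The latency argument of this section can be made concrete with a rough lower-bound model. The sketch below assumes that light in optical fibre covers about 200 km per millisecond and that round-trip time is propagation delay in both directions plus processing time at the remote node; queuing, routing hops, and transmission delay are ignored. The distances, processing times, and the 5 ms deadline are hypothetical values chosen only for illustration.

```python
FIBRE_SPEED_KM_PER_MS = 200.0  # light in optical fibre covers roughly 200 km per millisecond


def round_trip_ms(distance_km: float, processing_ms: float) -> float:
    """Propagation delay there and back plus processing time at the remote node."""
    return 2 * distance_km / FIBRE_SPEED_KM_PER_MS + processing_ms


def meets_deadline(distance_km: float, processing_ms: float, deadline_ms: float) -> bool:
    """True if the simplified round trip fits within the application deadline."""
    return round_trip_ms(distance_km, processing_ms) <= deadline_ms


# Hypothetical placements: a fog node 1 km away, a cloud data center 2000 km away.
fog_rtt = round_trip_ms(distance_km=1, processing_ms=2)
cloud_rtt = round_trip_ms(distance_km=2000, processing_ms=1)
print(f"fog: {fog_rtt:.2f} ms, cloud: {cloud_rtt:.2f} ms")
print("fog meets 5 ms deadline:", meets_deadline(1, 2, 5))
print("cloud meets 5 ms deadline:", meets_deadline(2000, 1, 5))
```

Even with faster processing in the cloud, the propagation term alone exceeds the deadline in this example, which is the sense in which physical centralization limits real-time use.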


3 Why Use Fog Computing

• The present cloud models are insufficient to handle the requirements of IoT-based systems.
• IoT is generating big data, which needs to be processed close to the edge of the network.
• IoT devices have limited capability to analyze the data they generate.
• Many applications need real-time processing of data.
• Sending all data to a centralized cloud data center may cause congestion.
• High bandwidth is required to transmit a huge amount of data to the cloud.
• If all the devices come online, even IPv6 will not be sufficient to assign IP addresses.
• Data may be confidential, and the firm may not want to share it online.
• Processing of data may vary with physical location. For example, traffic handling strategies may differ between two areas.
• IoT devices are exposed to harsh environmental conditions; managing edge devices locally results in a better response.

4 Architecture of Fog Computing

As already discussed, the fog paradigm lies between edge devices and cloud networks; fog computing simply extends cloud services to the edge devices. In the traditional model, there are only the cloud and the IoT devices: the IoT devices send all generated data to the cloud and wait for a response, as shown in Fig. 6. The fog model instead implements some cloud capabilities closer to the IoT device layer, so the fog architecture works together with the cloud architecture.

Fig. 6 Cloud and IoT devices’ communication

The traditional IoT–cloud model suffers from latency: it takes considerable time for data sensed by IoT devices to be uploaded to the cloud, processed there, and for the response to be returned, which delays the whole process. In fog computing, the cloud is still present, but a new fog layer helps reduce the overall delay. The 3-layer fog architecture is shown in Fig. 7. At the very bottom are the physical IoT devices, which act like the physical layer of a network; above them is the fog layer, and above that the cloud. The fog layer contains fog nodes, which act like virtual instances of the IoT devices with improved processing and storage capabilities. The fog layer offers transient storage only; transient storage is not permanent. For permanent storage, if required, data may be transmitted to the cloud data centers.

Fig. 7 3-Tier fog computing architecture

Fog nodes sit between the cloud and the edge devices, close to the IoT devices. Some processing and computation happens at the fog layer and the rest is sent to the cloud; in some cases, data can be sent directly to the cloud without the intermediate fog. There are thus two comparable IoT architectures, one using the cloud alone and one using fog, and the fog architecture does not get rid of the cloud. In essence, fog brings cloud services, in the form of fog nodes, closer to the edge devices; fog nodes are discussed in Section 6. Fog is an intermediary between the cloud and the edge devices that processes data before sending it to the cloud. Note that only time-sensitive data is processed in the fog; non-time-sensitive data is passed by the fog nodes to the cloud for processing. Also note that the fog layer sends only unique data to the cloud; repeated, identical data generated by sensors is eliminated by the fog layer. This helps in proper utilization of bandwidth and other network resources. Fog therefore complements the cloud.

Fog networks have many advantages. First, latency between the data sensing and data processing points is reduced. Second, the cloud network is not flooded with data packets, because not everything is sent to the cloud. Further advantages are discussed in Section 10.
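The routing rules described above (time-sensitive data handled in the fog, other data passed on to the cloud, repeated readings suppressed) can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation from any fog platform: the `Reading` and `FogNode` classes, the `time_sensitive` flag, and the rule of comparing a reading only against the last value forwarded for the same sensor are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Reading:
    sensor_id: str
    value: float
    time_sensitive: bool  # assumed to be tagged by the application


@dataclass
class FogNode:
    """Hypothetical fog node: handles urgent data locally, forwards the rest."""
    last_forwarded: dict = field(default_factory=dict)   # sensor_id -> last value sent to cloud
    cloud_queue: list = field(default_factory=list)      # readings waiting to go to the cloud
    handled_locally: list = field(default_factory=list)  # readings processed at the fog layer

    def handle(self, reading: Reading) -> str:
        # Time-sensitive data is processed at the fog layer to keep latency low.
        if reading.time_sensitive:
            self.handled_locally.append(reading)
            return "processed-at-fog"
        # Repeated, identical data is not sent to the cloud again.
        if self.last_forwarded.get(reading.sensor_id) == reading.value:
            return "dropped-duplicate"
        self.last_forwarded[reading.sensor_id] = reading.value
        self.cloud_queue.append(reading)
        return "forwarded-to-cloud"


node = FogNode()
results = [
    node.handle(Reading("ecg-1", 72.0, time_sensitive=True)),
    node.handle(Reading("meter-7", 230.1, time_sensitive=False)),
    node.handle(Reading("meter-7", 230.1, time_sensitive=False)),
]
print(results)
```

Comparing only against the last forwarded value keeps the node's state small, which fits the transient storage of the fog layer; a real deployment might instead use time windows or tolerances.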

5 How Fog Computing Works?

The fog layer is built from a large number of distributed fog nodes implemented on routers, switches, and other access points. These nodes are distributed all over the network and work like a “mini cloud” at its edge. The fundamental idea behind the fog paradigm is that edge devices can share and process data for each other without the support of the cloud. Various network devices are used for processing and storing data, which can be shared locally with other edge devices; these network devices act as fog nodes. Tasks that require big data analytics are sent to the cloud, as the fog paradigm cannot handle big data. The fog layer thus works like an intelligent worker that processes only the data needed in real time and sends non-urgent tasks to the cloud. This helps in reducing bandwidth requirements and improving response time for end applications. As shown in Fig. 8, level 3 is the edge layer, where end users interact. At the edge layer, sensors play the most important role in producing hot data. Sensors produce a tremendous amount of data, which needs to be transferred to the upper layers for processing.


Fig. 8 Working of fog nodes

End users can also interact with each other directly, which is known as Device-to-Device (D2D) communication [17, 18]. Data produced at the edge layer is transferred to level 2 (the fog layer). The fog layer has specific hardware (such as routers and switches) and software (such as Cisco IOx) to process and store transient data. An important role of the fog layer is to provide data analytics close to the edge layer. The fog layer is much closer to the edge devices than the cloud layer is, and this short distance lets it provide real-time data processing, which is a must for many IoT applications. The fog nodes that make up the fog layer are implemented on routers, switches, web cameras, and other edge network devices. In other words, the various edge network devices can be empowered for fast data processing, which real-time IoT applications require.

Nowadays, Mobile Cloud Computing (MCC) and Mobile Edge Computing (MEC) are very popular technologies that extend many services to the network edge. MEC helps in processing data at the cellular base stations [19], while MCC uses lightweight clouds such as cloudlets [20] to provide computational resources in close proximity to end users. Fog computing extends these two paradigms by providing cloud-based services such as SaaS, PaaS, and IaaS at the network edge, using highly distributed network devices (such as routers, WAN switches, or regional servers) and specialized software (such as Cisco IOx).

Fog computing helps in reducing the traffic load on the network. Fog nodes distributed all over the network collect data from various edge devices. Data sent by edge devices is often repeated, so fog nodes do not forward the repeats: only one copy is shared with the cloud network, which reduces the traffic load on centralized data centers. Similarly, cloud servers send processed data to the fog nodes rather than to edge devices, and the fog nodes share this data with the edge devices as required. This reduces the traffic load on the upper network. In addition, fog nodes transmit only reduced, processed data to the cloud; all the hot data is processed by the fog nodes themselves. Sensors generate a huge amount of hot data that requires only small processing units to process. Fog nodes can process this data and share the reduced result with the cloud, which results in traffic reduction.
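The traffic reduction described above can be illustrated by summarizing a window of raw readings into a small metadata record before anything is sent to the cloud. The sketch below is an assumption-laden example: the `summarise` function, the choice of summary fields, the synthetic temperature window, and the use of JSON size as a stand-in for network payload are all hypothetical.

```python
import json
from statistics import mean
from typing import Sequence


def summarise(sensor_id: str, values: Sequence[float]) -> dict:
    """Reduce a window of raw readings to a small metadata record."""
    return {
        "sensor_id": sensor_id,
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": round(mean(values), 2),
    }


# Hypothetical window of raw temperature readings collected at a fog node.
window = [21.0 + 0.1 * (i % 5) for i in range(600)]
summary = summarise("temp-3", window)

# Compare the payload size of the raw window with that of the summary.
raw_bytes = len(json.dumps(window).encode())
summary_bytes = len(json.dumps(summary).encode())
print(summary)
print(f"raw: {raw_bytes} B, summary: {summary_bytes} B")
```

Which statistics are worth keeping depends on the application; the point is only that the cloud receives a fixed-size record rather than every raw sample.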

6 Fog Node

Some of the most popular existing fog nodes are Cisco IOx, Cloudlet, and ParaDrop, as shown in Fig. 9. Cloudlet is a resource-rich fog node [21] with three layers: the lowest layer consists of Linux and a cloud data cache, the middle layer contains cloud software such as OpenStack, which provides virtualization, and the topmost layer is application oriented and hosts offloaded tenant virtual machine instances. These virtual machines provide resources to edge devices in real time over a WiFi network. A cloudlet is a cluster of computers that works like a small data center at the edge of the network. Cloudlet was designed specifically for interactive mobile applications such as virtual reality and language processing. The Cisco IOx architecture is implemented on a Cisco router [22]. It runs on expensive hardware that is not open to the public. The architecture lets developers run scripts, compile their code, and install their own application programs. It works on grid routers by installing applications on a guest OS. ParaDrop is built on gateways such as WiFi access points and set-top boxes. As these devices are commonly available to end users, it is easy for end users to set up a fog environment with them.


Fig. 9 Cloudlet framework (a) [21] and Cisco IOx framework (b) [22]

7 Characterization of Fog Computing

Some important characteristics of the fog paradigm are:

• Low latency: The fog paradigm works close to the edge of the network, so its latency is very small compared to the cloud. Many applications, such as traffic control, telemedicine, and industrial IoT, need low latency.
• Geographical distribution: Unlike the centralized cloud paradigm, fog has a distributed deployment. This distributed network helps many applications that involve moving vehicles.
• Large-scale sensor networks: The number of IoT sensors is growing day by day, producing a huge amount of data. Sending and receiving this data continuously can cause congestion near a centralized cloud network. The distributed fog paradigm can handle this data close to the edge of the network.
• Interaction with cloud: The fog paradigm does not work independently; fog is sandwiched between the edge and cloud networks. Applications that need not be processed in real time are sent to the cloud by the fog nodes for processing. Fog thus works like an intelligent agent between edge devices and the cloud.
• Predominance of wireless access.
• Transferring metadata to cloud: Fog nodes send only a small amount of metadata to the cloud, which helps in better bandwidth utilization.
• Real-time applications: Fog computing is very helpful for processing data in real-time applications.
• Robustness: The fog network is distributed, so the failure of one fog node does not affect the rest of the network.
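The robustness and cloud-interaction characteristics listed above can be sketched as a simple node selection rule: an edge device uses the nearest healthy fog node and falls back to the cloud only when no fog node is available. The `FogNodeInfo` class, the node names, the distances, and the selection rule itself are assumptions made for illustration; real fog platforms may use quite different discovery and failover mechanisms.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class FogNodeInfo:
    name: str
    distance_km: float   # hypothetical distance from the edge device
    healthy: bool = True


def pick_node(nodes: Sequence[FogNodeInfo]) -> Optional[FogNodeInfo]:
    """Return the nearest healthy fog node, or None to signal 'use the cloud'."""
    healthy = [n for n in nodes if n.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda n: n.distance_km)


nodes = [
    FogNodeInfo("router-a", 0.5),
    FogNodeInfo("switch-b", 1.2),
    FogNodeInfo("gateway-c", 3.0),
]
first_choice = pick_node(nodes)
nodes[0].healthy = False          # simulate the failure of one fog node
second_choice = pick_node(nodes)
print(first_choice.name, "->", second_choice.name)
```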


8 Fog Computing Versus Cloud Computing

The cloud paradigm suffers from high latency, traffic congestion, an inability to handle the growing number of IoT devices, and high bandwidth costs. Data centers are centralized and located far from the regions they serve, so data transmitted from different regions can cause heavy congestion in a centralized network. Such data centers are operated by Google, Altus, Facebook, Apple, Microsoft, China Unicom, TATA, AT&T, Amazon Web Services (AWS), Bell, and many more. They need to be active round the clock, which takes a lot of electricity. As per the Natural Resources Defense Council (NRDC), in 2013 United States data centers consumed 91 billion kWh of electrical energy, which requires many power plants, and consumption was estimated to reach 140 billion kWh by 2020 [23]. This results in carbon emissions of nearly 100 million metric tons per year. However, companies like Apple are moving toward data centers powered 100% by renewable energy, including geothermal, wind, and solar. In addition, most cloud data centers are located far from end users, which may degrade Quality-of-Service (QoS). The fog paradigm, by contrast, provides services such as real-time data analysis at the network edge, and it is easy to configure or install applications on fog nodes as per end-user requirements. But note that the fog paradigm is not a replacement for the cloud paradigm; rather, fog is an extension of cloud. The fog paradigm provides cloud services in real time to end users to enhance their experience.

8.1 Main Differences Between Fog and Cloud Paradigm

The fog paradigm differs from the cloud paradigm in computation power, storage capacity, and applications. The computation and storage capacity of the cloud is much higher than that of fog nodes, but the fog paradigm can provide applications tailored to end-user needs far more easily than the cloud. A point-by-point comparison is given in Table 1.

9 Edge Computing Versus Fog Computing

Helder Antunes, Senior Director of corporate strategic innovation at Cisco Inc. and a member of the OpenFog Consortium, says that fog is a superset of edge computing. In edge computing, data is processed where it is created, while in fog computing data is processed where it is stored. Fog encapsulates both edge processing and network connections.


Table 1 Difference between fog and cloud

Features | Fog paradigm | Traditional cloud paradigm
Latency | Very low | High
Reliability | Low | High
Data storage capacity | Low | High
Big data processing | Up to a limit | High
Deployment cost | Low | High due to sophisticated planning
Computing model | Uses a distributed approach | This is a centralized approach
Size of system | Large distributed system made up of a small number of nodes | Cloud data centers are big in size
Resource optimization | Locally done | Globally done
Maintenance | Generally requires no human involvement | Maintained and operated by a technical expert
Mobility management | Hard | Easy
Operation | Small and large, both companies play an important role in fog computing | Operated by large companies

Edge computing: In this technology, data is processed close to its source, which speeds up processing. Data need not be transmitted to a distant cloud for analysis. Processing data where it is generated has many benefits, but edge computing is limited by small processing units and low storage capacity. These problems can be overcome with the fog paradigm, which spans everything from the edge to the cloud [24].

Fog computing: As shown in Fig. 10, fog is the larger version of edge computing, and it provides high-end data processing and storage services between edge devices. Fog computing extends edge computing from the edge of the network up to the cloud.

Fig. 10 Edge is a subset of fog


Consider Bombardier, an aerospace company, which in 2016 used sensors in its aircraft for real-time analysis of engine data. This helps the company considerably in handling problems that would otherwise ground its aircraft [25]. Placing edge computing elements close to the engine helps in detecting engine-related issues, such as overheating or burning too lean, in real time, which removes the need to transmit data to the cloud for processing. Processing engine data at the edge helps in predicting whether the engine is likely to fail in the coming months or years, and in determining why an engine has overheated. In this use case, fog computing is not needed.
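The kind of check an edge element near the engine could run is sketched below: an alert is raised only when several consecutive temperature samples exceed a limit, so a single noisy sample does not trigger it. This is an illustrative assumption, not a description of Bombardier's system; the `OverheatDetector` class, the 900 °C limit, the window of three samples, and the sample values are all hypothetical.

```python
from collections import deque


class OverheatDetector:
    """Flags overheating when the last `window` samples all exceed `limit_c`."""

    def __init__(self, limit_c: float, window: int):
        self.limit_c = limit_c
        self.recent = deque(maxlen=window)  # keeps only the most recent samples

    def update(self, temp_c: float) -> bool:
        self.recent.append(temp_c)
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and all(t > self.limit_c for t in self.recent)


# Hypothetical limit and samples, evaluated locally without contacting the cloud.
detector = OverheatDetector(limit_c=900.0, window=3)
samples = [850.0, 905.0, 910.0, 915.0, 880.0]
alerts = [detector.update(t) for t in samples]
print(alerts)
```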

10 Fog Computing Advantages and Disadvantages

Advantages of Fog Computing:

• Conserves network bandwidth.
• Improves system response time.
• Supports mobility; fog nodes can also be installed in moving vehicles.
• Reduces repeated data transmission to the cloud data centers.
• As the data remains close to the edge device, there are fewer security concerns.
• Minimizes Internet and network latency.
• Fog networks can also withstand harsh environmental conditions, such as under the sea and along railway tracks.
• New fog applications can easily be developed according to user needs.
• Fog nodes can be installed in remote areas.
• Fog computing is increasingly used in smart cities, where real-time data needs to be processed for more efficient systems.

Disadvantages of Fog Computing:

• Trust and authentication concerns.
• Cost and availability of fog equipment is a concern.
• Security issues due to IP address spoofing or man-in-the-middle attacks.
• Data consistency in the fog paradigm is challenging.
• Power consumption is higher at fog nodes.
• Scheduling may be complex due to task movement between fog nodes, client devices, and cloud servers.
• Has some wireless security issues and privacy concerns.

11 Fog Computing Applications

Fog computing is still at an emerging stage, but it is already in use in many applications.


Fig. 11 Fog node at connected vehicle [29]

Autopilot cars: The number of self-driving cars on the market is increasing day by day. Google, Tesla, Uber, and many other giants are in the self-driving car market. Autopilot cars have thousands of sensors that continuously observe the surrounding environment. The sensed data needs to be processed in real time, which is not possible if it has to be sent to the cloud. Fog nodes process this data in real time and also help in communicating with nearby cars.

Smart grids: Smart grid systems are growing; they help in detecting how much electricity is required in which area [26]. Processing this information in real time with the help of fog nodes allows better services to be provided [27, 28].

Connected vehicle: Modern automobiles carry many electronic systems, as shown in Fig. 11, with hundreds of computing devices on board. Many types of sensors are used in various parts of the vehicle to monitor its performance. These onboard sensors collect real-time data that can help the end user detect a fault at a very early stage. Fog nodes on the vehicle help in processing this data in real time, and the data can later be shared with the manufacturer for future improvements.

Healthcare 4.0: Fog-based Healthcare 4.0 supports remote healthcare systems [30], assisted living, e-medication, early disease prediction systems, and population monitoring and control in smart cities [31]. Fog technology offers the real-time interaction and ultralow latency that medical applications require. The volume and velocity of data are increasing, while the costs of storage, online processing, and bandwidth are going down, which allows the IoT and cloud to grow. But instead of transmitting all medical device data directly to the cloud, it is better to process data close to the edge devices and send only the metadata to the cloud, which helps in achieving ultralow latency. Edge devices cannot perform all the computations themselves, which is where fog computing comes in. For example, blood pressure sensors can send patient data to a fog unit, which can process it very quickly and report to a nearby hospital in case of an emergency. Heart rate monitoring can also be done in real time with fog computing [32]. Telesurgery is another application of Healthcare 4.0 [33]. Security and privacy of e-health records [34–36] is a big concern in Healthcare 4.0 development. Blockchain is playing a very important role in handling the security concerns of fog computing networks [37–39].

Smart traffic lights: Traffic lights can be controlled more effectively by fog computing than by cloud-based systems.

Real-time video analytics: Fog computing is also becoming popular for real-time video processing. CCTV cameras themselves work as fog nodes. Sending multimedia data to the cloud is very expensive because of the large bandwidth it requires. Multimedia big data [40] can be processed by fog technology at the edge of the network.

Safety management for mining: Fog computing is also playing a very important role in providing safety in mining [41].

References

1. Bonomi, F., Milito, R., Zhu, J., Addepalli, S.: Fog computing and its role in the internet of things. In: Proceedings of the First Edition of the MCC Workshop on Mobile Cloud Computing-MCC ’12, pp. 13–15. Helsinki, Finland, 17 Aug 2012
2. Evans, D.: The internet of things: how the next evolution of the internet is changing everything. CISCO White Paper 1, 1–11 (2011)
3. Machina Research. machinaresearch.com
4. Ravandi, B., Papapanagiotou, I.: A self-learning scheduling in cloud software defined block storage. In: 2017 IEEE 10th International Conference on Cloud Computing (CLOUD), pp. 415–422. IEEE (2017)
5. Zhang, B., Mor, N., Kolb, J., Chan, D.S., Lutz, K., Allman, E., Wawrzynek, J., Lee, E.A., Kubiatowicz, J.: The cloud is not enough: saving IoT from the cloud. HotStorage (2015)
6. Yousefpour, A., Fung, C., Nguyen, T., Kadiyala, K., Jalali, F., Niakanlahiji, A., Kong, J., Jue, J.P.: All one needs to know about fog computing and related edge computing paradigms: a complete survey. J. Syst. Architect. 98, 289–330 (2019)
7. Bonomi, F., Milito, R., Zhu, J., Addepalli, S.: Fog computing and its role in the internet of things. In: Proceedings of the First Edition of the MCC Workshop on Mobile Cloud Computing, pp. 13–16. ACM (2012)
8. Nath, S.B., Gupta, H., Chakraborty, S., Ghosh, S.K.: A survey of fog computing and communication: current researches and future directions. IEEE Commun. Surv. Tut.
9. Global Fog computing market analysis by Zion market research. https://www.zionmarketresearch.com/report/fog-computing-market
10. Ketel, M.: Fog-cloud services for IoT. In: Conference April 2017, pp. 262–264
11. Open Fog Consortium, June 2017. https://www.iiconsortium.org/index.htm#member-companies
12. Industrial Internet Consortium, Press release (31 Jan 2019). The industrial internet consortium and openfog consortium join forces. www.iiconsortium.org. Accessed 04 July 2019
13. Interconnected devices worldwide from 2015 to 2025. https://www.statista.com/statistics/471264/iot-number-of-connected-devices-worldwide/
14. Gor, M., Vora, J., Tanwar, S., Tyagi, S., Kumar, N., Obaidat, M.S., Sadoun, B.: GATA: GPS Arduino based tracking and alarm system for protection of wildlife animals. In: International Conference on Computer, Information and Telecommunication Systems (IEEE CITS-2017), Dalian University, pp. 166–170. Dalian, China, 21–23 July 2017


15. Bodkhe, U., Bhattacharya, P., Tanwar, S., Tyagi, S., Kumar, N., Obaidat, M.S.: BloHosT: Blockchain enabled smart tourism and hospitality management. In: International Conference on Computer, Information and Telecommunication Systems (IEEE CITS-2019), pp. 237–241. Beijing, China, 28–31 Aug 2019
16. Centralized cloud data centers problems. https://www.cisco.com/c/dam/global/en_ca/solutions/trends/iot/docs/iot-data-analytics-white-paper.PDF
17. ElSawy, H., Hossain, E., Alouini, M.-S.: Analytical modeling of mode selection and power control for underlay D2D communication in cellular networks. IEEE Trans. Commun. 62(11), 4147–4161 (Nov 2014)
18. Feng, D. et al.: Device-to-device communications underlaying cellular networks. IEEE Trans. Commun. 61(8), 3541–3551 (Aug 2013)
19. Hu, Y.C., Patel, M., Sabella, D., Sprecher, N., Young, V.: Mobile edge computing: a key technology towards 5g. ETSI White Paper 11, 1–16 (2015)
20. Satyanarayanan, M., Lewis, G., Morris, E., Simanta, S., Boleng, J., Ha, K.: The role of cloudlets in hostile environments. IEEE Pervasive Comput. 12(4), 40–49 (2013)
21. Satyanarayanan, M., Schuster, R., Ebling, M., Fettweis, G., Flinck, H., Joshi, K., Sabnani, K.: An open ecosystem for mobile-cloud convergence. IEEE Commun. Mag. (2015)
22. Cisco: Iox overview. https://goo.gl/n2mfiw (2014). Accessed 18 June 2015
23. https://www.nrdc.org/sites/default/files/data-center-efficiency-assessment-IB.pdf
24. Prasad, V.K., Bhavsar, M., Tanwar, S.: Influence of monitoring: fog and edge computing. Scalable Comput. Pract. Exp. 20(2), 365–376 (2019)
25. https://dataconomy.com/2016/01/internet-of-aircraft-things-how-analytics-of-ioat-is-transforming-the-aerospace-industry/
26. Kaneriya, S., Tanwar, S., Verma, J.P., Tyagi, S., Kumar, N., Obaidat, M.S., Rodrigues, J.J.P.C.: Data consumption-aware load forecasting scheme for smart grid systems. In: IEEE Global Communications Conference (IEEE GLOBECOM-2018), pp. 1–6. Abu Dhabi, UAE, 09–13 Dec 2018
27. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N., Rodrigues, J.: Fog computing for smart grid systems in 5G environment: challenges and solutions. IEEE Wirel. Commun. Mag. 26(3), 47–53 (June 2019)
28. Tanwar, S., Tyagi, S., Kumar, S.: The role of internet of things and smart grid for the development of a smart city. In: Intelligent Communication and Computational Technologies (Lecture Notes in Networks and Systems: Proceedings of Internet of Things for Technological Development, IoT4TD 2017), vol. 19, pp. 23–33. Springer International Publishing
29. Fog node at connected vehicle. https://automationalley.com/Blog/2017/October-2017/FogComputing-A-New-Paradigm-for-the-Industrial-Io.aspx
30. Kaneriya, S., Chudasama, M., Tanwar, S., Tyagi, S., Kumar, N., Rodrigues, J.J.P.C.: Markov decision-based recommender system for sleep apnea patients. In: IEEE Conference on Communications (IEEE ICC-2019), pp. 1–6. Shanghai, China, 20–24 May 2019
31. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N.: Fog computing for healthcare 4.0 environment: opportunities and challenges. Comput. Electr. Eng. 72, 1–13 (2018)
32. Vora, J., Tanwar, S., Tyagi, S., Kumar, N., Rodrigues, J.J.P.C.: HRIDaaY: ballistocardiogram-based heart rate monitoring using fog computing. In: IEEE Global Communications Conference (GLOBECOM-2019), pp. 1–6. Hawaii, USA, 9–13 Dec 2019
33. Gupta, R., Tanwar, S., Tyagi, S., Kumar, N.: Tactile internet-based telesurgery system for healthcare 4.0: an architecture, research challenges, and future directions. IEEE Netw. 12–19 (Dec 2019)
34. Tanwar, S., Tyagi, S., Kumar, N. (eds.): Security and privacy of electronics healthcare records. In: IET Book Series on e-Health Technologies, pp. 1–450 (2019)
35. Hathaliya, J., Tanwar, S., Tyagi, S., Kumar, N.: Securing electronics healthcare records in healthcare 4.0: a biometric-based approach. Comput. Electr. Eng. 76, 398–410 (2019)
36. Vora, J., Italiya, P., Tanwar, S., Tyagi, S., Kumar, N., Obaidat, M.S., Hsiao, K.-F.: Ensuring privacy and security in e-health records. In: International Conference on Computer, Information and Telecommunication Systems (IEEE CITS-2018), pp. 192–196. Colmar, France, 11–13 July 2018

38

P. Kansal et al.

37. Vora, J., Tanwar, S., Verma, J.P., Tyagi, S., Kumar, N., Obaidat, M.S., Rodrigues, J.J.P.C.: BHEEM: a blockchain-based framework for securing electronic health records. In: IEEE Global Communications Conference (IEEE GLOBECOM-2018), pp. 1–6. Abu Dhabi, UAE, 09–13 Dec 2018 38. Tanwar, S., Parekh, K., Evans, R.: Blockchain-based electronic healthcare record system for healthcare 4.0 applications. J. Inf. Secur. Appl. 50, 1–14 (2019) 39. Gupta, R., Tanwar, S., Tyagi, S., Kumar, N., Obaidat, M.S., Sadoun, B.: HaBiTs: blockchainbased telesurgery framework for healthcare 4.0. In: International Conference on Computer, Information and Telecommunication Systems (IEEE CITS-2019), pp. 6–10. Beijing, China, 28–31 Aug 2019 40. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N., Maasberg, M., Choo, K.K.R.: Multimedia big data computing and internet of things applications: a taxonomy and process model. J. Netw. Comput. Appl. 124, 169–195 (2018) 41. Tanwar, S., Vora, J., Kaneriya, S., Tyagi, S.: Fog based enhanced safety management system for miners. In: 3rd International Conference on Advances in Computing, Communication & Automation, (ICACCA-2017), pp. 1–6. Tula Institute, Dehradhun, UA

Fog Data Analytics: Systematic Computational Classification and Procedural Paradigm

D. Pradeep Kumar, R. Hanumantharaju, B. J. Sowmya, K. N. Shreenath, and K. G. Srinivasa

Abstract The unprecedented proliferation and acceptance of sensor-based gadgets has driven technological advances and innovation across all of the interconnected technologies related to them. The transactional data generated by this level of connectivity is termed Big Data, and it is continuously collected and delivered in endless streams between networked entities. Big Data contains insightful information and knowledge that businesses can exploit through mining and analytics. But the sheer volume of data brings many challenges for efficient data storage and effective data analytics on fog and cloud computing platforms in general, and on fog devices at the edge of the cloud in particular. Extensive research has been carried out to address these challenges and realize the benefits of fog data analytics. In this chapter, we discuss the characteristics of attributes and analyze the algorithmic complexities in fog data analytics. A systematic computational classification is generated to develop a paradigmatic procedure for fog data analytics that processes, stores and analyzes data efficiently and effectively, and that provides an ideal platform for the proliferating sensor-based devices and services on the Internet of Things. We present a few case studies that benefit from the proposed model.

D. Pradeep Kumar (B) · R. Hanumantharaju · B. J. Sowmya
Department of Computer Science and Engineering, Ramaiah Institute of Technology, Bangalore, India

K. N. Shreenath
Department of Computer Science and Engineering, Siddaganga Institute of Technology, Bangalore, India

K. G. Srinivasa
Department of Information Management and Coordination, NITTTR, Chandigarh 160019, India

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020
S. Tanwar (ed.), Fog Data Analytics for IoT Applications, Studies in Big Data 76, https://doi.org/10.1007/978-981-15-6044-6_3


Keywords Fog data analytics · Internet of things · Big data · Process model

1 Introduction

With the advent of the Internet of Things (IoT), cloud computing and artificial intelligence, there has been a phenomenal change in the way data is stored and processed. Businesses have embraced cloud computing and related computing methods, moving their data onto cloud servers to reduce their dependency on local servers and to address the latency issues of those servers. The steep increase in the proliferation of smart sensing gadgets has contributed to exponential growth in the data generated. Big Data necessitates smart processing solutions such as data mining and analytics, because accumulating and analyzing data to derive actionable insights is very difficult. Fog computing promises to speed up this task by pushing the cloud closer to the place where data is generated: the IoT-enabled smart gadgets on the network. Sending data to the cloud and receiving it back for analysis can be daunting because of limited bandwidth and security concerns, but fog nodes can be set up wherever a network connection is established. The node closest to the network edge consumes data from IoT devices and forwards each kind of data to the appropriate site for analysis. Fog nodes are placed between the endpoint sensing devices and actuators on one side and the cloud servers on the other, where they serve as critical resources for the Internet of Things. The edge nodes in fog computing have substantial processing capacity, which lets them perform complex computation on the data without moving it to servers located far away in the cloud. A fog device can be considered a miniature data centre that supports the data-intensive operations of smart IoT devices with low latency. Fog computing enables the storage of big data and improves the performance of the entire platform.

Fog data analytics (FDA) is a fusion of the cloud computing and fog computing models. Fog improves the distribution of data and helps reduce latency. FDA consists of storing, distributing and visualizing big data. Data is collected from sensor-based IoT devices and stored on the edge devices of the cloud. Data accumulated from users' numerous IoT devices is first processed in the fog layer through the fog gateway and later processed in the cloud through the cloud gateway to address other details pertaining to the data. The central idea of FDA is to facilitate storage, computing and networking operations between the cloud and the users in a virtualized mode. This ability makes FDA suitable for the cognitive approach. Fog nodes may be geographically distributed, yet they provide real-time communication in applications such as patient monitoring, smart homes and weather forecasting. Processing data in the FDA cycle helps develop insights about the data, which are then communicated to the cloud or fog network for more precise and specific analysis. Fog computing tries to decrease the amount of data that is moved to cloud devices for storage, analysis and processing. Fog computing also provides protection and security for the data.

Some of the challenges in FDA are: heterogeneity in the fog network, as it is located at the edge of the network; network failures; delay due to the complexity of processing data; storage and bandwidth limitations in handling the variety and volume of data; the need for constant monitoring and evaluation; and the lack of a business model. These challenges motivated us to prepare a taxonomy and new data analytics strategies in this chapter. We carried out an extensive survey of fog data analytics, the taxonomy of FDA, and the different process models that cover the various analytic techniques on the fog.

The chapter is organized as follows. The next section presents an extensive literature survey on fog data analytics. Sect. 3 describes our proposed taxonomy of fog data analytics, which comprises a cloud layer and a fog layer of n fog nodes that facilitate accurate analytic techniques in the fog layer. Sect. 4 describes different case studies on FDA. The final section concludes the chapter.

2 Literature Survey

Wang et al. [1], in "Fog-IBDIS: Industrial Big Data Integration and Sharing with Fog Computing for Manufacturing Systems", explore Fog-IBDIS, an industrial big data integration and sharing system based on fog computing, from three perspectives: its operating principle, functional modules, and implementation. Unlike previous big data integration studies built on cloud computing, this study constructs the Fog-IBDIS platform using fog computing, which splits the IBDIS task into several sub-tasks run by the data generators. With respect to processing, all raw datasets are pre-processed and analyzed by the data owners to protect the privacy of the raw data. Moreover, Fog-IBDIS performs the data processing in the fog clients within the manufacturing systems, thereby changing the centralized data processing mode into distributed task execution. Accordingly, only the analyzed results are transferred between the distributed fog clients, which reduces the volume of transmitted data and eases the network traffic load. To the best of the authors' knowledge, this is the first attempt to combine IBDIS with fog computing technology in manufacturing systems.

Yousefpour et al. [2] provide material on fog computing and its related concepts, including their similarities and differences, along with a glossary of challenges in fog computing. Through a comprehensive survey, they summarize and categorize the efforts on fog computing and its related computing paradigms, and propose challenges and future directions for research in fog computing.

Moustafa et al. [3] propose an architecture that illustrates the interactions of the Internet of Things, cloud layer and fog layer for effectively running big data analytics and applications that are cyber-secure. Since the devices and services in the three layers produce heterogeneous data sources, cloud systems have been used to process, compute and store such data at centralized locations. In such settings, mobility support, location awareness, low latency and geographic location remain key challenges in the cloud layer, which can be handled with the fog paradigm by processing computational tasks at the edge of the network. The authors also discuss the security problems of existing security tools and introduce future research directions for improving the security of the IoT fog-cloud architecture.

Argerich et al. [4], in "Learning based Adaptation for Fog and Edge Computing Applications and Services", propose an adaptive framework to ease the development and improve the efficiency of applications and services. The framework uses reinforcement learning to combine adaptation mechanisms, specified at development time, during runtime in order to optimize the performance of applications and services. The approach is implemented in Python along with AdAS, an Adaptive Applications Simulator, to evaluate its efficiency. Throughout the simulations, the adaptive framework achieves a requirement satisfaction of almost 85% while maintaining high precision in unstable execution settings on low-power devices (i.e. varying network bandwidth and available resources). It proves more flexible and efficient than a pre-programmed adaptive logic, yielding a precision between 10 and 25% higher.

Argerich et al. [5] summarize the challenges and opportunities of fog with respect to big IoT data analytics on fog networking. They review the key characteristics of proposed research works that make fog computing a suitable platform for the growing number of IoT devices, services and applications. The most important fog applications (e.g.
healthcare monitoring, smart cities, connected vehicles and smart grid) are discussed in order to build an efficient green computing paradigm that supports the next generation of IoT applications.

Taneja et al. [6] propose a strategy that avoids sending raw data to the cloud and offers balanced computation in the infrastructure. The results show an 80% reduction in the amount of data transferred to the cloud using the proposed fog-based distributed data analytics approach compared with the conventional cloud-based approach. In addition, with the proposed distributed approach they observed a 98% drop in the time taken to arrive at the final result compared with the cloud-centric approach. They also present results on the quality of the analytics solution obtained with the two approaches, and suggest that the distributed analytics approach can serve as well as the conventional cloud-centric approach.

Bachmann et al. [7] conducted a thorough background and related-work analysis before designing and implementing a dynamic, extensible and scalable real-world fog computing framework, and examined IoT use cases on two existing IoT frameworks. The first is the work of Vögler et al., who presented a scalable large-scale IoT framework focusing on the smart city use case. The second, by Kim and Lee, presented an open-source IoT framework that provides the means to develop and execute IoT services for a wide range of stakeholders, i.e. device and software developers, service providers, platform operators and service users. A comprehensive framework was then developed and evaluated with six evaluation scenarios.

Ahmed et al. [8] give insight into a set of applications that were either already implemented or proposed for future development, selecting 30 real or proposed applications that cover a wide range of application types for future fog computing platforms. They use this set of reference applications to present general background on fog computing platforms and applications, describe a methodology for selecting a representative set of reference applications, and discuss the requirements that these applications place on fog computing platforms. The study aims to help future platform designers make informed decisions about platform features and the kinds of applications that may benefit from them.

Mehdipour et al. [9] propose performing data analytics close to where the data is generated in order to reduce communication overhead as well as data processing time. They introduce a hierarchical form of data analytics in which the fog layer plays a more prominent role than the cloud layer, enabling on-premise processing for IoT applications. This brings several advantages, such as reduced data size, reduced data communication and lower cloud usage cost, illustrated with use cases of a smart home and a smart nutrition monitoring system.

Dang et al.
[10] present the role of IoT and cloud computing in healthcare, with a comprehensive survey of cloud computing, and fog computing in particular, covering standard architectures and existing research on healthcare applications and their integration into a single platform alongside the cloud. They consider the various threats, vulnerabilities and attacks that must be taken into account, and analyze and summarize relevant security models to prevent possible security risks.

Verma [11] observes that the stock market is volatile and non-stationary and generates huge volumes of data irregularly. In that work, existing machine learning algorithms are analyzed for stock market forecasting, and a new pattern-finding algorithm for forecasting stock trends is developed. Three approaches can be used to tackle the problem: fundamental analysis, technical analysis and machine learning. The experimental analysis indicates that machine learning could help investors make profitable decisions. To carry out these experiments, a recent dataset was obtained from the Indian stock market.

Kumari et al. [12] examine the unique nature and complexity of fog computing, developing a detailed taxonomy of FDA and a novel process model, and discuss additional challenges such as accessibility, scalability, fog node communication, nodal collaboration, heterogeneity, reliability and Quality of Service (QoS) requirements.

Prasad et al. [13] describe how the advancement of the Internet of Things (IoT) has increased the need for cloud, edge and fog computing. The central advantage of cloud-based solutions is that they allow data to be collected from multiple services and sites and to be reachable from anywhere in the world. Organizations benefit from combining the cloud platform with on-site fog networks and edge devices, and as a result this will increase the usage of IoT devices and the number of end users as well. Network traffic will decrease as the data is distributed, which will also improve operational efficiency. Monitoring in edge and fog computing can play a significant role in using the resources available at these layers efficiently.

Jhonson et al. [14] describe how the new era of Big Data and advances in technology have driven significant transitions towards highly functional IoT devices. The accumulation of Big Data over IoT devices and networks is a clearly visible problem that different computing methods (quantum computing, cloud computing, edge/fog computing) attempt to solve. On Big Data and data mining: precise data, imprecise data and uncertain data are three varieties of data that are collected and analyzed through data mining. Popular data mining techniques include frequent pattern mining and supervised learning.

Minu et al. [15] aim to provide complete insight into various advances in IoT through intelligence as a key enabler of digital innovation. The chapters give a general and critical review of edge computing and of the computational aspects of edge computing in the Internet of Things. They also discuss business aspects, models and opportunities for IoT interoperability in SDN/NFV-enabled settings, and give a brief idea of advanced applications and future trends in edge computing.

The work on cloud networking in [16] presents an IoT model with a four-layer architecture to address energy efficiency in IoT networks supported by cloud computing. The architecture consists of IoT objects (for example temperature sensors) at the first layer, while the networking elements (relays, coordinators and gateways) are hosted in the upper three layers, respectively. The networking elements in each layer aggregate and process the traffic produced by IoT objects and by other elements in lower layers.
This work proposes a model in which virtual machines (VMs) process the IoT traffic and are hosted by distributed mini clouds at relay, coordinator and gateway devices, with the objective of minimizing the total power consumption. For this purpose, the authors considered two scenarios: (1) the Gateway Placement Scenario (GPS), in which the VM is located in the gateway element only, so that traffic aggregation and processing are handled by one mini cloud at the gateway; and (2) the Optimal Placement Scenario (OPS), in which VM placement is flexibly enabled at the relay, coordinator or gateway elements. Their results showed that with optimal distribution of mini clouds in the IoT infrastructure, the total power consumption decreased significantly, giving power savings of up to 36% compared with processing IoT data in a single mini cloud hosted at the gateway. The reduced power consumption of OPS resulted from the reduced number of hops traversed by the IoT-to-VM traffic in the IoT network and from the reduced number of components and traffic through the IoT network. They also studied the impact of capacity-constrained VMs (represented by the number of IoT objects a VM can serve) on total power consumption, and showed that as the number of IoT objects served per VM increases, the number of powered-on networking elements decreases, thereby improving power efficiency.

Fog Data Analytics: Systematic Computational Classification …

45

Taneja et al. [17] introduce a fog-specific decomposition of multivariate predictive analytical models, using linear regression as the predictive model and the statistical query model and summation form for the decomposition. The decomposition technique itself is not the contribution; rather, the contribution is applying it to the analytics model so that it runs in a distributed manner in fog-enabled IoT deployments, applied to a real-world dataset and evaluated on a fog computing testbed. The proposed technique avoids sending raw data to the cloud and offers balanced computation in the infrastructure. The results show an 80% reduction in the amount of data transferred to the cloud with the proposed fog-based distributed data analytics approach compared with the conventional cloud-based approach, and a 98% drop in the time taken to arrive at the final result compared with the cloud-centric approach.
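The summation-form idea behind this kind of decomposition can be illustrated with ordinary least squares. The normal equations depend on the data only through the sums X^T X and X^T y, so each fog node can compute these sums over its local readings and send only the small summaries to the cloud, which adds them and solves for the coefficients. The sketch below is a minimal pure-Python illustration under that assumption; the function names (`local_summary`, `merge_summaries`, `solve_normal_equations`) and the toy data are hypothetical and are not taken from [17].

```python
"""Sketch: linear regression in summation form, split across fog nodes.

Assumption: each fog node holds rows (x1, ..., xk, y) and sends only the
partial sums X^T X and X^T y to the cloud. All names are illustrative.
"""


def local_summary(rows):
    """Compute the partial sums X^T X and X^T y on one fog node."""
    d = len(rows[0])  # k features plus one intercept column
    xtx = [[0.0] * d for _ in range(d)]
    xty = [0.0] * d
    for *features, y in rows:
        x = [1.0] + list(features)  # prepend the intercept term
        for i in range(d):
            xty[i] += x[i] * y
            for j in range(d):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def merge_summaries(summaries):
    """Add the partial sums received from all nodes (cloud side)."""
    d = len(summaries[0][1])
    xtx = [[0.0] * d for _ in range(d)]
    xty = [0.0] * d
    for node_xtx, node_xty in summaries:
        for i in range(d):
            xty[i] += node_xty[i]
            for j in range(d):
                xtx[i][j] += node_xtx[i][j]
    return xtx, xty


def solve_normal_equations(xtx, xty):
    """Solve (X^T X) w = X^T y by Gauss-Jordan elimination with pivoting."""
    n = len(xty)
    a = [row[:] + [b] for row, b in zip(xtx, xty)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; features are collinear")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


# Toy readings generated from y = 1 + 2*x1 + 3*x2, split over two nodes.
node_a = [(1.0, 2.0, 9.0), (2.0, 1.0, 8.0), (3.0, 5.0, 22.0)]
node_b = [(4.0, 2.0, 15.0), (0.0, 1.0, 4.0), (5.0, 5.0, 26.0)]

summaries = [local_summary(node_a), local_summary(node_b)]
weights = solve_normal_equations(*merge_summaries(summaries))
print([round(w, 6) for w in weights])  # intercept, then one weight per feature
```

In this sketch each node transmits a 3 x 3 matrix and a 3-element vector regardless of how many readings it holds, which is the property that lets a decomposition of this kind avoid shipping raw data to the cloud.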

3 Taxonomy of Fog Data Analytics

Fog computing offers a shared hierarchical structure that holds a collection of elements and integrates technologies and services such as smart homes and cities, networked vehicles and smart power systems. The taxonomy is shown in Fig. 1. Deploying fog networks at the edge is important for securing the smart future, conducting innovative experiments, analyzing huge volumes of data for accurate prediction, and detecting abnormal, malicious and dangerous incidents. The concept of fog computing should therefore be clarified in this evolving phase of the paradigm. The expanded view of fog computing does not address its similarities with cloud computing. Defining all fog functionalities and contrasting them with pre-existing structures requires a fundamental definition, which can be stated as follows. Fog computing is a new paradigm architected around pooled resources, in which distributed nodes interact and communicate with each other cooperatively, individually or in a group. Communication happens at the edge of the networked infrastructure, supported by processing in the cloud. Nodes in the fog layer function individually with no external intervention, and together they ensure technical efficiency, storage capability, good communicability and many other features in the infrastructure to cater for additional utilities and an ever-increasing number of customers and users. The taxonomy diagram indicates that fog is supported by the cloud ecosystem. Fog computing provides stable computing services and ensures that quality of service is delivered at the edge of the network. Fogging extends the services of the cloud to the edge of the IoT to address the conventional issues of cloud systems. The edge devices concentrate on processing raw data and on the commands and activities of the IoT system.

Fig. 1 Taxonomy of fog data analytics

FOG Nodes

Fog computing architecture allows processing, networking and storage functions to be transferred dynamically at the intersection of the cloud, the Internet of Things and the fog nodes. The interfaces at which fog, cloud and other nodes interact must be flexible and allow compute, storage and control functions to be relocated dynamically between them. This allows effective evaluation of the services offered by the fog and effective management of quality of service (Fig. 2).

Fig. 2 Units at FOG nodes

Fog and cloud communication is necessary to enable collaboration between them and to deliver services continuously, which aids functions such as: (i) supervising and managing fog functions with facilities in cloud computing, (ii) transferring information between fog and cloud to process, compare and perform various other functions, (iii) deciding, by the cloud, how fog nodes are distributed or scheduled to allocate services dynamically on a need basis, (iv) jointly adapting to improve and manage compute facilities progressively, and (v) making cloud utilities available to the user via fog nodes.
A decision has to be taken on what information must cross the interface between cloud and fog. Cloud and fog nodes should be able to pool resources to support each other's processing. For example, for one or more user applications, every fog node that is set up must share its storage, compute and processing capabilities with other nodes so that they function cooperatively based on priority. Many fog nodes may also operate as backup for other services.

Fog to User IoT/Start.

Fogging offers many services to the various Internet of Things applications on the network (e.g. sensor-based devices and gadgets) that are capable of characteristic recognition functionality. The interfaces between fog and the Internet of Things, and between users and fog, must give IoT devices and users secure, efficient and intuitive access to the services on the fog.

Different analytical elements in fog nodes are as follows:

1. Data Cleaning

Data cleaning is the process of preparing data for analysis by deleting or modifying data that is incorrect, incomplete, irrelevant, duplicated or improperly formatted. Such data is usually not necessary or helpful when analyzing data, as it can hinder the process or give inaccurate results. There are several methods for cleaning data, depending on how it is stored and on the answers being sought. Data cleaning is not simply about erasing information to make space for new data, but rather about finding a way to maximize a data set's accuracy without necessarily deleting information. It involves more than data removal: it includes fixing spelling and syntax errors, standardizing data sets, and correcting errors such as empty fields, incomplete codes and duplicate data points. Data cleaning is considered a foundational element of data science, as it plays an important role in the analytical process and in uncovering reliable answers. Most importantly, the goal of data cleaning is to create standardized, uniform data sets that allow business intelligence and data analytics tools to easily access and find the right data for each query.

2. Exploratory Analysis

Exploratory data analysis (EDA) refers to the critical process of performing initial investigations on data in order to discover patterns, spot anomalies, test hypotheses and check assumptions with the help of summary statistics and graphical representations. It is an approach to analyzing data sets to summarize their main characteristics, often using visual methods.
A statistical model may or may not be used, but EDA is primarily for seeing what the data can tell us beyond formal modelling or hypothesis testing. John Tukey promoted exploratory data analysis to encourage statisticians to explore the data and possibly formulate hypotheses that could lead to new data collection and experiments. EDA differs from initial data analysis (IDA) [1], which focuses more narrowly on checking the assumptions required for model fitting and hypothesis testing, handling missing values and transforming variables as needed. The objectives of EDA are:

• to gain an understanding of the data and find clues in it;
• to formulate assumptions and hypotheses for modelling; and
• to check the quality of the data for further processing and cleaning if necessary.

The various tasks performed in exploratory data analysis are:

• Preview data
• Check total number of entries and column types
• Check any null values
• Check duplicate entries
• Plot distribution of numeric data (univariate and pairwise joint distribution)
• Plot count distribution of categorical data
• Analyze time series of numeric data by daily, monthly and yearly frequencies.
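Most of the checks in this list can be tried on a small scale with the Python standard library alone. The sketch below runs them on a hypothetical set of sensor readings; the field names (`device`, `temp_c`, `status`) and the values are invented for illustration, and the plotting steps are replaced by a numeric summary and a frequency count rather than actual charts. The time-series step is left out because the toy data carries no timestamps.

```python
import csv
import io
import statistics
from collections import Counter

# Hypothetical readings; in practice these would arrive from fog devices.
RAW = """device,temp_c,status
n1,21.5,ok
n2,,ok
n1,21.5,ok
n3,24.0,warn
n2,22.5,ok
"""

rows = list(csv.DictReader(io.StringIO(RAW)))


def preview(records, n=2):
    """Preview data: return the first n records."""
    return records[:n]


def null_counts(records):
    """Check null values: count empty strings per column."""
    return {col: sum(1 for r in records if r[col] == "") for col in records[0]}


def duplicate_count(records):
    """Check duplicates: count records that repeat an earlier one exactly."""
    seen = Counter(tuple(r.items()) for r in records)
    return sum(count - 1 for count in seen.values())


def infer_type(values):
    """Label a column numeric if every non-empty value parses as a float."""
    try:
        for value in values:
            if value != "":
                float(value)
        return "numeric"
    except ValueError:
        return "categorical"


column_types = {col: infer_type([r[col] for r in rows]) for col in rows[0]}

# Stand-in for the distribution plot of numeric data.
temps = [float(r["temp_c"]) for r in rows if r["temp_c"] != ""]
numeric_summary = {
    "count": len(temps),
    "mean": statistics.mean(temps),
    "min": min(temps),
    "max": max(temps),
}

# Stand-in for the count plot of categorical data.
status_counts = Counter(r["status"] for r in rows)

print(preview(rows))
print(len(rows), column_types)
print(null_counts(rows), duplicate_count(rows))
print(numeric_summary, dict(status_counts))
```

The null and duplicate counts found here are exactly the kind of findings that feed back into the data cleaning step described earlier.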

3. Data Transformation

Data transformation can be simple or complex, depending on the changes required between the source (initial) data and the target (final) data. It is typically performed through a mix of manual and automated steps [2]. The methods and technologies used can vary widely depending on the type, format, complexity and volume of the data being transformed. Data transformation can be broken down into the following steps, each applied as needed based on the complexity of the required transformation:

• Data discovery
• Mapping of data
• Generating the code
• Executing the generated code
• Reviewing the code and the results
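The five steps listed above can be pictured as a tiny pipeline in which a declarative mapping is turned into an executable function. The sketch below is only an illustration: the record layout, the `MAPPING` rules and the helper names are hypothetical, and the "generated code" is a Python closure rather than the SQL or ETL script that a transformation tool would typically emit.

```python
# Source records as they might arrive from a device (all values are strings).
source = [
    {"dev": "n1", "temp_f": "70.7", "ts": "2020-01-01T10:00:00"},
    {"dev": "n2", "temp_f": "75.2", "ts": "2020-01-01T10:00:05"},
]


def discover(records):
    """Data discovery: profile which fields occur and how often."""
    profile = {}
    for record in records:
        for field in record:
            profile[field] = profile.get(field, 0) + 1
    return profile


def fahrenheit_to_celsius(text):
    """Conversion rule used by the mapping below."""
    return round((float(text) - 32.0) * 5.0 / 9.0, 2)


# Data mapping: source field -> (target field, conversion rule).
MAPPING = {
    "dev": ("device_id", str),
    "temp_f": ("temp_c", fahrenheit_to_celsius),
    "ts": ("timestamp", str),
}


def generate_transform(mapping):
    """Code generation: build an executable function from the mapping rules."""
    def transform(record):
        return {
            target: rule(record[src])
            for src, (target, rule) in mapping.items()
        }
    return transform


def review(records, expected_fields):
    """Data review: check that every output record has the target fields."""
    return all(set(record) == expected_fields for record in records)


profile = discover(source)
transform = generate_transform(MAPPING)
target = [transform(record) for record in source]  # code execution
ok = review(target, {"device_id", "temp_c", "timestamp"})
print(profile)
print(target)
print(ok)
```

Keeping the mapping as data rather than hand-written code is a design choice made for this sketch: it means a failed review can be addressed by editing the rules and regenerating the transform.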

These means are regularly the focal point of data analyst who may utilize various specific devices to play out their assignments. The steps can be described as follows: • Data discovery is the initial phase during the time spent information change. The information is ordinarily profiled utilizing profiling apparatuses or some of the time utilizing physically composed profiling contents to more readily comprehend the information structure and qualities and choose how it should be changed. • Data mapping is a strategy for deciding how to delineate, change, channel, total and so forth singular fields to create the ideal result. Software developers or specialized information investigators regularly lead information mapping as they work to characterize the change manages in the particular domain (e.g. visual ETLtools [3], transformation languages). • Software generation is an executable code creation strategy (for example SQL, Python, R, or other executable guidelines) that will change over information dependent on the required and indicated information mapping rules [4]. Data change innovations typically produce this code [5] dependent on the designer determinations or metadata. • Code execution is the step by which the generated code to create the desired output is executed against the data. The code executed can be joined together to form the tool or the developers require some set of rules to execute manually the set of instructions.


• Data review is the final step in the process, aimed at ensuring that the output data meets the transformation requirements. It is usually performed by the business user or the end user of the data. Any anomalies or errors found in the data are communicated back to the developer or data analyst as new requirements to be implemented in the transformation process [1].

Data transformation serves many functions in the process of data analysis.
• Extraction and parsing: In the modern ELT (Extract, Load, Transform) process, data ingestion begins with extracting information from a data source, followed by copying the data to its destination. Initial transformations focus on shaping the format and structure of the data to ensure its compatibility with both the destination system and the data already there. Parsing fields out of comma-delimited log data for loading into a database is an example of this type of transformation.
• Translation and mapping: Some of the most basic data transformations involve mapping and translating data. For example, a column whose entries represent error codes can be mapped to the corresponding error descriptions, making that column easier to understand and more useful for display in a customer-facing application. Translation converts data from the formats used in one system to formats appropriate for another. Web data may arrive as hierarchical JSON or XML files even after parsing, but it must be converted into rows and columns for inclusion in a relational database.
• Filtering, aggregation and summarization: Data transformation is often about paring down the data and making it more manageable. Data can be consolidated by filtering out unnecessary fields, columns and records. Omitted data might include numerical indexes in data intended for graphs and dashboards, or records from business regions that are not relevant to a particular study. Data may also be aggregated or summarized, for example by transforming a time series of customer transactions into hourly or daily sales counts. Business intelligence (BI) tools can do this filtering and aggregation, but it can be more efficient to do the transformations before a reporting tool accesses the data.


• Enrichment and imputation: Data from different sources can be merged to create denormalized, enriched information. A customer's transactions can be rolled up into a grand total and added to a customer information table for quicker reference or for use by customer analytics systems. Long or freeform fields may be split into multiple columns, and missing values can be imputed or corrupted data replaced as a result of these kinds of transformations.
• Indexing and ordering: Data can be transformed so that it is ordered logically or suits a data storage scheme. In relational database management systems, for example, creating indexes can improve performance or help manage relationships between different tables.
• Encoding and anonymization: Data containing personally identifiable information, or other information that could compromise privacy or security, should be anonymized before it is disseminated. Encryption of private data is a requirement in many industries, and systems can perform encryption at multiple levels, from individual database cells to entire records or fields.
• Modelling, typecasting, formatting and renaming: Finally, a whole set of transformations can reshape data without changing its content. This includes casting and converting data types for compatibility, adjusting dates and times with offsets and format localization, and renaming schemas, tables and columns.

4. Data Compression

Data compression is the process of modifying, encoding or converting the bit structure of data so that it occupies less space. It reduces the storage size of one or more data instances or elements. Data compression is also known as source coding or bit-rate reduction. It enables the quick transfer of a data object or file over a network or the Internet and optimizes the use of physical storage space. Data reduction techniques can likewise be used to obtain a representation of the data set that is much smaller in volume yet preserves the important information.
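The space saving described above can be seen with Python's standard `zlib` module, which implements the lossless DEFLATE codec. This is only a sketch: the payload is a hypothetical, highly repetitive buffer of sensor readings, so the ratio it prints is far better than typical real data would give.

```python
import zlib

# Hypothetical payload: repetitive sensor readings buffered on a device.
payload = b"temp=21.5;humidity=40;" * 200

# Encode the bytes into a smaller representation, then decode them again.
compressed = zlib.compress(payload, level=9)
restored = zlib.decompress(compressed)

ratio = len(compressed) / len(payload)
print(f"original: {len(payload)} bytes, compressed: {len(compressed)} bytes")
print(f"ratio: {ratio:.3f}, round trip exact: {restored == payload}")
```

Because the codec is lossless, the round trip restores every byte; the saving comes entirely from replacing repeated byte sequences with shorter references.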
Data compression is widely used in computing services and applications, including data communication. It relies on a variety of compression techniques and software solutions that apply compression algorithms to reduce data size. A common technique removes redundant data elements and symbols and replaces them with shorter representations. Data compression can be lossless or lossy. For graphical data, lossless compression records every replacement so that the repeated data can be recovered, whereas lossy compression deletes the repeated data. Lossless compression allows a file to be restored to its original state, without the loss of a single bit, when it is uncompressed. It is the usual approach for executables and for text and spreadsheet files, where the loss of words or numbers would change the information. Lossy compression permanently eliminates bits of data that are redundant, unimportant or imperceptible. It is useful for graphics, audio, video and images, where removing some data has little or no noticeable effect on the representation of the content.

5. Feature Engineering

Feature engineering is the process of using domain knowledge and data mining techniques to extract features from raw data. These features can be used to improve the performance of machine learning algorithms. Feature engineering can itself be regarded as applied machine learning. It involves creating new input features from existing ones. In general, data cleaning can be thought of as a process of subtraction and feature engineering as a process of addition. It is often one of the most valuable tasks a data scientist can perform to improve model performance, for three major reasons:
• You can isolate and highlight key information, which helps your algorithms "focus" on what matters.
• You can bring in your own domain expertise.
• Most importantly, once you grasp the "vocabulary" of feature engineering, you can bring in other people's domain expertise.

Benefits

You need good features that describe the structures inherent in your data. Better features mean flexibility: you can choose "the wrong models" (less than optimal) and still get good results, because most models can pick up on good structure in the data. The flexibility of good features allows you to use less complex models that are faster to run, easier to understand and easier to maintain. Better features also mean simpler models: for much the same reason, you can choose "the wrong parameters" (less than optimal) and still get good results. You do not need to work as hard to pick the right model and the most optimized parameters.
With good features, you are closer to the underlying problem and to a representation of all the data you have available that can be used to characterize it. Better features mean better results.

6. Data Classification

Data classification is the process of organizing data into relevant types, categories or classes. It allows data to be separated and classified according to defined criteria for particular business or personal purposes. Data classification typically involves tags and labels that identify the type of data, its confidentiality and its integrity. Quality may also be taken into account in the data classification process. The sensitivity of data is often classified according to levels of importance or confidentiality, which then correlate with the security measures put in place to protect each level.

There are three main types of data classification that are considered industry standards:
• Content-based classification inspects and interprets the contents of files, looking for sensitive information.
• Context-based classification considers the application, location or creator, among other variables, as indirect indicators of sensitive information.
• User-based classification relies on a manual, end-user selection for each document. It depends on user knowledge and discretion at creation, editing, review or dissemination to flag sensitive documents.

Content-, context- and user-based approaches can each be right or wrong depending on the business need and the data type.

Steps for effective data classification

Understand the current setup: Perhaps the best starting point for effective data classification is to take a detailed look at the current state of your data and the regulations that apply to your organization. You must know what data you have before you can classify it.
Create a data classification policy: Staying compliant with data privacy requirements is almost impossible without a proper policy. Creating one should be your top priority.
Prioritize and organize data: Now that you have a policy and a picture of your current data, it is time to classify the data properly. Decide on the best way to tag your data based on its sensitivity and privacy.

Data classification has more benefits than simply making information easier to find. It is necessary for making sense of the vast quantities of data that modern organizations produce at any given moment.
Data classification provides a clear picture of all the data within an organization's control, along with an understanding of where data is stored, how to access it securely and how best to protect it from security threats. Once in place, it provides an organized framework that facilitates more effective data protection measures and promotes employee compliance with security policies.

7. Building an Efficient Model for Classification to Improve the Accuracy of Classification

The system is designed as a machine learning model that extracts the most contributing features in the pre-processing step and trains different models to obtain insight, as shown in Fig. 3. The type of machine learning used for classification is supervised learning. The modules identified for this work are as follows:
• Acquisition and Extraction of Dataset: In this module, the dataset is acquired from a reputed source.


Fig. 3 Machine learning model

• Data Pre-processing: The collected dataset is pre-processed in this module. Pre-processing normally includes cleaning the dataset (identifying missing values and handling them appropriately), converting data to the appropriate formats, and subsetting the dataset so that only the required data is used, without unnecessary variables that could affect the computation.
• Train and Test: The pre-processed data is split into training and testing data in this module. Typically a 70–30 split is used, with 70% of the data forming the training set and the remaining 30% the test set. The test set is used to validate the model and thus aids in measuring performance.
• Predictions: The models built on the training set are evaluated on the test set. The predicted results are compared against the known test set labels and the model performance is measured.


• Analysis: Some implementation modules do not go through the train, test and prediction phases. These fall under the Analysis category, in which a visual representation is created from the pre-processed data.

Support Vector Machine (SVM)

A Support Vector Machine (SVM) is suitable for regression as well as classification problems. The goal is to find a hyperplane in an N-dimensional space (where N is the number of features in the data) that separates the data points by class. Hyperplanes are decision boundaries, and their dimension depends on the number of features in the data. Support vectors are the data points closest to the hyperplane; they influence its position and orientation, and removing them changes the position of the hyperplane. A multiclass SVM assigns labels to instances using support vector machines, where the labels are drawn from a finite set of several elements. Generally speaking, SVMs are two-class classifiers: they decide whether a data point belongs to one class or the other. A classic SVM seeks a margin that separates the positive and negative examples. If any examples in the dataset are mislabelled or highly unusual, this can lead to a poorly fitted model. There are also situations in which a non-linear region separates the groups more effectively. The data used in this study is non-linear, so a non-linear SVM is better suited. An SVM handles non-linear regions by using a (non-linear) kernel function to map the data into a different space in which a linear hyperplane can be used for separation.
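The idea of mapping data into a space where a linear boundary works can be shown on a toy problem. This is a minimal sketch under stated assumptions: the one-dimensional data is hypothetical, and an explicit feature map phi(x) = x² stands in for a kernel, which would achieve the same effect implicitly through inner products. In one dimension a "hyperplane" is simply a threshold.

```python
# Toy 1-D data: class 1 lies on both sides of class 0, so no single
# threshold on x can separate the classes. Values are hypothetical.
points = [-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0]
labels = [1, 1, 0, 0, 0, 1, 1]

def separable_by_threshold(values, labels):
    """Return True if some threshold t splits the two classes perfectly."""
    for t in sorted(values):
        above = [y for v, y in zip(values, labels) if v >= t]
        below = [y for v, y in zip(values, labels) if v < t]
        if len(set(above)) <= 1 and len(set(below)) <= 1:
            return True
    return False

def feature_map(x):
    """Explicit non-linear map phi(x) = x**2 into a new feature space."""
    return x * x

mapped = [feature_map(x) for x in points]

print(separable_by_threshold(points, labels))  # False: original space
print(separable_by_threshold(mapped, labels))  # True: mapped space
```

After the mapping, the inner points have small squared values and the outer points large ones, so a single threshold (2.25 for this data) separates them.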
Using the kernel function, a linear learning algorithm thus learns a non-linear function in a high-dimensional feature space, while the capacity of the system is controlled by a parameter that does not depend on the dimensionality of that space. This is called the kernel trick. The linear SVM classifier is expressed as a dot product between data point vectors:

$$K(x_i, x_j) = x_i^{T} x_j$$

A kernel function must be continuous and symmetric and should have a positive (semi-)definite Gram matrix. The most widely used kernel function is the Gaussian Radial Basis Function (RBF), which is equivalent to mapping the data into an infinite-dimensional Hilbert space:

$$K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^{2}\right), \quad \gamma > 0$$

where $K(x_i, x_j)$ is the kernel function, $x_i$ and $x_j$ are the data vectors and $\gamma$ defines how far the influence of a single training example reaches. Features built on a radial basis function take circles (hyperspheres) as separating surfaces, but the decision boundaries become much more complex if multiple such features interact.

Enhanced SVM

The cost (C) and gamma (γ) parameters of an RBF SVM can be tuned to improve the current model and achieve greater accuracy. The cost C controls the trade-off between classifying the training points correctly and keeping the decision boundary smooth. Learning algorithms are concerned primarily with learning from the input data, not with the learning method itself, although different methods have different effects. Because of the curse of dimensionality, high-dimensional training data can easily lead to an over-fitted model. It is therefore generally preferable to allow some training points to be misclassified so that the separating hyperplane sits in an "overall better" position. If the cost value is high, the model penalizes misclassified training points heavily and fits them closely, which gives higher variance and lower bias and can result in overfitting. Conversely, if the cost value is low, the model tolerates more misclassified points in exchange for a wider margin, which gives lower variance and higher bias and can lead to an underfitted model. The objective is therefore to find a balance between a "not too strict" and a "not too loose" value of the cost. Cross-validation and resampling, together with grid search, are effective ways to find the best value for the cost. As noted earlier, the gamma (γ) parameter defines how far the influence of a single training example reaches, with low values meaning "far" and high values meaning "near". It can be seen as the inverse of the radius of influence of the samples selected by the model as support vectors. If the gamma value is large, the decision boundary depends mainly on the data points close to it; nearer points carry more weight than distant ones, making the decision boundary wigglier.
If the value of gamma is small, data points far from the decision boundary also carry considerable weight, so the boundary is smoother and looks more like a straight line.

Cloud Layer

The cloud is a network of multiple devices, computers and servers connected to each other over the Internet (Fig. 4). Such a computing system can be figuratively divided into two parts:
• The frontend, which consists of client devices (computers, tablets, mobile phones).
• The backend, which consists of data storage and processing systems (servers) that can be located far away from the client devices and make up the cloud itself.

The two layers, the fog layer and the cloud layer, communicate with each other directly over wireless connections.


Fig. 4 Fog to cloud communication

4 Case Studies of Fog Data Analytics

This section presents applications and case studies of fog computing and fog data analytics in several fields.

Smart Traffic Lights: Software is used to automate the driving process, enabling the driver to operate the vehicle hands-free. The software not only drives the vehicle by itself but also parks it without any assistance from the occupant. Fog computing establishes and supports real-time communication, and it has emerged as the industry's choice for the next generation of Internet-connected vehicular networks. Smart vehicles and the traffic system, together with the communication infrastructure connecting them, can eventually make commuting accident-free and save lives that are now lost in road accidents.

Smart Grids: Fog computing has changed the way grids operate since it was put to use in them. The unending demand for energy, together with its availability, competition and cost, has led devices to harness energy from solar farms and wind turbines. Fog creates an ecosystem in which edge devices make sense of the information gathered at the fog collectors and generate actionable commands for the actuators to follow. In the top tier, data is used to generate reports, visualize information and analyze all transactions for insights. Fog stores data differently in the bottom and top tiers to use resources effectively.

Smart Train: Fog computing also finds application in smart trains, where the monitoring of wear and tear on different parts is automated.
Sensors on critical parts of the train, such as the bearings and the engine, constantly monitor the system for any deviation from the operating temperature. They raise an alarm and notify the system operator in real time so that the problem can be addressed immediately and servicing scheduled. This reduces the risk of in-service accidents that could cost lives and damage property.

Wireless sensor nodes in these networks are designed to consume very little energy in order to extend battery life. Actuators serve as tools to monitor and manage operations effectively. Sensors in a mine's air monitoring system adjust the flow of air into and out of the tunnels based on parameters that may affect the safety of the workers. Most of the time, wireless sensor nodes operate with minimal bandwidth, low power and little computing capacity.

Smart Building Control: In decentralized smart building control, wireless sensors are installed to measure temperature, humidity or the levels of various gases in the building atmosphere. Information can thus be shared among all the sensors on a floor, and their readings can be combined to form reliable measurements. The fog devices respond to the data through collaborative decision-making. The system components work together to lower the temperature, circulate fresh air and humidify the environment. Light sensors detect the presence or absence of moving objects near them and switch the lights on or off accordingly. In smart buildings, fog computing helps reduce the carbon footprint by conserving the energy expended in many activities.
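The smart train and smart building cases above share one pattern: a fog node combines readings from nearby sensors and acts locally when the combined value leaves its operating range. The sketch below illustrates that pattern under stated assumptions: the operating range, the sensor values, the use of the median as the fusion rule and the names `fuse_readings` and `check_part` are all hypothetical and do not come from the chapter.

```python
from statistics import median

# Hypothetical operating range for a monitored part, in degrees Celsius.
OPERATING_RANGE = (20.0, 90.0)

def fuse_readings(readings):
    """Combine readings from several sensors into one robust value.

    The median is an assumed fusion rule: a single faulty sensor
    cannot drag the combined value far from the others.
    """
    return median(readings)

def check_part(part, readings, operating_range=OPERATING_RANGE):
    """Return an alarm message if the fused reading leaves the range."""
    low, high = operating_range
    value = fuse_readings(readings)
    if value < low or value > high:
        return f"ALARM {part}: {value:.1f} outside {low:.1f}-{high:.1f}"
    return None

# Three sensors on a bearing; one reports a spurious spike.
print(check_part("bearing", [71.0, 250.0, 73.5]))  # None (median is 73.5)
# All sensors agree that the engine is running hot.
print(check_part("engine", [95.2, 96.0, 94.8]))    # ALARM engine: 95.2 outside 20.0-90.0
```

Deciding at the fog node means only the alarm, not every raw reading, has to reach the operator or the cloud.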

5 Conclusion

Fog computing applications are very diverse and naturally place different requirements on the fog computing platforms built to support them. There is an urgent need for processing systems that can support real-time data processing. With the ever-growing popularity and use of IoT systems, and with data analytics applications increasingly tied to Big Data, a more reliable process model and taxonomy need to be developed. This chapter introduces a reference process model and taxonomy and discusses various case studies that support and implement it. In fog nodes, a sequence of analytic steps is followed to classify data efficiently, which involves tuning some of the SVM parameters. Many challenges remain, such as security and the minimization of energy usage. Future research can explore open protocols and architectures, which have been identified as ways to draw end users further towards fog computing.


References
1. Wang, J., Zheng, P., Lv, Y., Bao, J., Zhang, J.: Fog-IBDIS: industrial big data integration and sharing with fog computing for manufacturing systems. In: Proceedings of the Journal Homepage: www.elsevier.com/locate/eng on Research Intelligent Manufacturing, Article (2019)
2. Yousefpour, A., Fung, C., Nguyen, T., Kadiyala, K.: All one needs to know about fog computing and related edge computing paradigms. In: Proceedings of the [cs.NI] 13 Feb 2019. https://arxiv.org/abs/1808.05283v3
3. Moustafa, N.: A systemic IoT-Fog-Cloud architecture for big-data analytics and cyber security systems: a review of fog computing. In: Proceedings IEEE International Conference on Fog Computing, IEEE (2019)
4. Argerich, M.F.: Learning based adaptation for fog and edge computing applications and services, Report on Fog Computing (2018)
5. Argerich, M.F., Wang, S., Azam Zia, M., Jadoon, A.K., Akram, U., Raza, S.: Fog computing: an overview of big IoT data analytics. A Review Article in Wirel. Commun. Mob. Comp. (2018). https://doi.org/10.1155/2018/7157192
6. Taneja, M., Jalodia, N., Davy, A.: Distributed decomposed data analytics in fog enabled IoT deployments. IEEE Access. https://doi.org/10.1109/ACCESS.2019.2907808
7. Bachmann, K., Schulte, S.: Design and implementation of a fog computing framework. On Fog Computing, 129 pages (2017). https://www.infosys.tuwien.ac.at/team/sschulte/theses/Bachmann_Master.pdf
8. Ahmed, A., Arkian, H.R., Battulga, D., Fahs, A.J., Farhadi, M., Giouroukis, D., Gougeon, A., Oliveira Gutierrez, F., Pierre, G., Souza Jr, P.R., Ayalew Tamiru, M., Wu, L.: Fog computing applications: taxonomy and requirements. https://arxiv.org/abs/1907.11621v1. [cs.DC] (26 Jul 2019)
9. Mehdipour, F., Javadi, B., Mahanti, A., Ramirez-Prado, G.: Fog computing realization for big data analytics. In: Buyya, S. (ed.) Fog and Edge Computing: Principles and Paradigms, Chapter 11/Fog Computing Realization for Big Data Analytics. Wiley STM
10. Minh Dang, L., Jalil Piran, Md., Han, D., Min, K., Moon, H.: A survey on internet of things and cloud computing for healthcare (9 July 2019)
11. Verma, J.P., Tanwar, S., Garg, S., Gandhi, I., Bachani, N.: Evaluation of pattern based customized approach for stock market trend prediction with big data and machine learning techniques. Int. J. Bus. Anal. IGI Global 6(3), 1–13 (2019)
12. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N., Parizi, R., Choo, K.K.R.: Fog data analytics: a taxonomy and process model. J. Netw. Comput. Appl. 128, 90–104 (2019)
13. Prasad, V.K., Bhavsar, M., Tanwar, S.: Influence of monitoring: Fog and edge computing. Scalable Comput. Pract. Exp. 20(2), 365–376 (2019)
14. Jhonson, S.: How fog computing is changing the BigData paradigm for Iot device. The online Resource for Big Data (2019)
15. Minu, R.I., Najarjuna, G.: Fog Computing and Computational Intelligence Paradigm's for the IoT. E-Book (2019)
16. A Path way Approval. https://www.fda.gov/patients/device-development-process/step-3-pathway-approval (2018)
17. Best Resource Taxonomy. https://www.fda.gov/drugs/development-resources/best-resourcetaxonomy (2018)

Fog Computing: Building a Road to IoT with Fog Analytics
Avinash Kaur, Parminder Singh, and Anand Nayyar

Abstract Integrating cloud computing platforms with the Internet of Things (IoT) has a great impact on day-to-day life, but some limitations remain. Although various cloud services are freely available or comparatively cheap, the cloud consumes a large amount of network bandwidth. The main disadvantage of cloud computing is the distance between the data center and the data source. Fog computing offers a solution to these kinds of problems. It is a distributed service computing model that fully utilizes the computing functions of terminal devices and exhibits a para-virtualized architecture. This chapter explains the different characteristics of the cloud and fog computing platforms and introduces the detailed architecture of both platforms with a comparative analysis. On the fog server, the fog analytics tool performs data localization. Methods of application management, such as resource coordination, distributed application deployment and distributed data flow, are discussed. Further, research directions in applying Deep Learning to Big Data, such as the improved formulation of data abstractions and dimensionality reduction, are detailed, and possible solutions are presented.

1 Introduction to Fog Computing

Most sensors and IoT devices are now connected to the internet. A growing number of consumer electronic devices, mobile services and the like are connected through the Internet of Things (IoT), which is also referred to as the Internet of Everything (IoE).

A. Kaur · P. Singh (B)
School of Computer Science and Engineering, Lovely Professional University, Phagwara, India
A. Nayyar
Duy Tan University, Da Nang, Vietnam
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020
S. Tanwar (ed.), Fog Data Analytics for IoT Applications, Studies in Big Data 76, https://doi.org/10.1007/978-981-15-6044-6_4


A. Kaur et al.

The number of devices connected to the internet is expected to reach 30–50 billion globally [1]. This chapter provides an overview of integrating fog computing with IoT and briefly discusses the benefits and characteristics of the integration. The contributions of this chapter are:
1. A discussion of research articles that investigate the integration of fog computing and IoT.
2. A discussion of the challenges of the IoT-driven economy and how integration with fog computing is beneficial.
3. A discussion of application management in the fog environment.
4. A discussion of the future research scope.

1.1 IoT Driven Economy and Its Challenges

IoT devices are characterized by limited computational power and limited storage. This has led to a growing number of cloud services that support smart devices and, as a result, to networks that are increasingly overloaded given the limited bandwidth available. A promising solution to this problem is the integration of Software Defined Networking (SDN) and Network Function Virtualization (NFV). Infrastructure as a Service (IaaS) provides virtual computing resources to remote users over the internet. The services the cloud can offer are limited by the performance load on the servers in the data centers. A data center may be placed close to the infrastructure of a large network, whereas sensors and IoT devices are located in the field, in places such as roads, houses, farms and factories. The resulting latency is large, and services cannot be delivered at satisfactory Quality of Experience (QoE) and Quality of Service (QoS) levels. Further complexity arises because IoT devices generate a large volume of data that must be analyzed in real time, which affects both analytical applications and storage systems. Cloud services can scale to the requirements of IoT by providing scalable processing and storage. However, for certain real-life applications, the delay incurred in transferring data to the cloud and obtaining results is not acceptable. Sending a large amount of data to the cloud can also saturate the network and degrade IoT performance.
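The saturation concern can be made concrete with back-of-the-envelope arithmetic. Every figure in this sketch is hypothetical (device count, message size, message rates and uplink capacity were chosen only to keep the numbers round), and the function name `uplink_utilization` is likewise an assumption made for illustration.

```python
def uplink_utilization(devices, bytes_per_message, messages_per_second, uplink_bps):
    """Fraction of the uplink consumed if every message is sent to the cloud."""
    offered_bps = devices * bytes_per_message * messages_per_second * 8
    return offered_bps / uplink_bps

# Hypothetical deployment figures, not measurements.
DEVICES = 10_000
MESSAGE_BYTES = 200
UPLINK_BPS = 100_000_000  # a 100 Mbit/s link towards the cloud

# Every raw reading uploaded: 10 messages per second per device.
raw = uplink_utilization(DEVICES, MESSAGE_BYTES, 10, UPLINK_BPS)
# One locally produced summary per second per device instead.
summarized = uplink_utilization(DEVICES, MESSAGE_BYTES, 1, UPLINK_BPS)

print(f"raw upload: {raw:.0%} of uplink")                # 160%
print(f"summarized upload: {summarized:.0%} of uplink")  # 16%
```

Under these assumed figures the raw stream exceeds the link capacity, while a tenfold local reduction brings the load well within it.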

1.2 Fog Computing and Cloud Comparison

The present cloud computing environment faces challenges in several areas. It cannot support various time-sensitive applications such as augmented reality and gaming. Fog computing addresses these challenges. The two platforms are compared in Table 1.

Table 1 Comparison of fog and cloud computing

Parameters               Cloud computing        Fog computing
Security measures        Definite               Cannot be defined
Attack                   Less probable          More probable
Deployment               Centralized            Distributed
Server nodes location    Within internet        Local network edge
Hardware                 Storage is scalable    Limited storage

2 Fog Computing as Solution

Edge computing was initially proposed to address the above-mentioned challenges. It uses local computing resources for preliminary data processing and local storage, which reduces congestion and network load, and it enables localized processing of data for fast decision-making. However, the limited resources available in edge computing lead to increased processing latency and resource contention. This led to the emergence of a new platform, fog computing, which integrates edge devices with cloud services to overcome the limitations of edge computing. Fog computing leverages cloud resources to avoid resource contention and coordinates geographically distributed edge devices.

2.1 Definition of Fog Computing

Several definitions of fog computing exist. According to one, fog computing extends from the core to the edge of the network and is an extended version of the edge platform [2, 3]. It enhances networking capabilities by supporting interaction between devices and by providing a hosting environment. Another definition [4] states that, in fog computing, a large number of decentralized and ubiquitous devices cooperate and communicate among themselves and with the network to perform storage and processing tasks. These tasks take place without third-party intervention and support applications and services that run in a sandboxed environment. Users lease part of their devices to host these services and receive incentives for doing so. This definition does not address the critical connection between fog and cloud, so Yi et al. [5] introduce another: fog computing is a distributed computing architecture with a large pool of resources made up of heterogeneous devices, which is not exclusively backed by cloud services. Further definitions are likely to be proposed.


2.2 Fog Computing Platform

Fog computing is a fully virtualized platform that provides networking, compute, and storage services between cloud data centers and end devices. Figure 1 depicts the role of fog computing. Networking, compute, and storage resources are the building blocks of fog. Its characteristics are as follows:

• Low latency, location awareness, and edge location: Fog provides a large number of endpoint services, including services for applications with low-latency requirements such as augmented reality and video streaming.
• Geographical distribution: Various fog applications demand widely distributed deployments. For example, fog delivers high-quality streaming to moving vehicles through access points and proxies positioned along tracks and highways.
• Predominance of wireless access.
• Heterogeneity: Fog nodes come in different forms and are deployed in a wide variety of environments. Fog plays an important role in ingesting data close to its source.

Fig. 1 Role of fog computing (layers shown: Cloud Computing; Fog Computing, with three FOG nodes; End Devices)


2.3 Fog Computing: Characteristics

1. Location awareness: Cloud services are mostly distributed in nature, whereas fog services can derive and track the location of the user.
2. Location at the edge: Because fog sits close to the edge, it supports latency-sensitive applications that require real-time processing.
3. Service delivery and real-time applications: The cloud involves bulk delivery of data, whereas fog delivers in real time, which addresses network overloading and latency.
4. Edge analytics: Fog computing analyzes sensitive data locally instead of sending it over the network.
5. Scalability: The cloud cannot handle the complete set of real-time data. Fog computing overcomes this disadvantage by preprocessing the data at the edge before it is sent to the cloud.

Conclusion: Fog computing is a distributed computing paradigm that provides cloud-like services. In addition to its own resources, it leverages resources from the edge as well as from the cloud. It deals with IoT data locally by using edge or client devices to carry out control, communication, management, and configuration. As an intermediate approach, fog benefits from the on-demand scalability of the cloud. It also involves analytics applications and data processing components that execute across distributed cloud and edge services. Fog computing supports distributed data analytics, interface heterogeneity, and user mobility to meet the requirements of distributed edge applications.

2.4 Fog Computing: Architecture

Paradrop, IoX, and Cloudlet were the early architectures of fog computing.

2.4.1 Cloudlet Architecture [6]

Figure 2 shows the implementation of a cloudlet, a resource-rich node. The lowest level holds the operating system together with data cached from the cloud. The next level up is the virtualization layer. The highest level consists of virtual machines, which host the various applications.


Fig. 2 Architecture: cloudlet (layers shown, top to bottom: VMs with applications; virtualization (open standard); Linux; data cached from cloud)

Fig. 3 Architecture: IoX

2.4.2 IoX Architecture

The architecture is based on the Cisco router, as shown in Fig. 3. Applications are hosted on a hypervisor inside the operating system. The platform supports the development of code and scripts. New operating systems can be installed on it for easy public access, but this is costly.

2.4.3 Fog Computing Platform for Local Grid

The platform is embedded software installed on sensors or network devices. It secures and standardizes communication between different devices, thereby minimizing service and customization costs. The local grid platform resides on devices between the cloud and the edge and provides reliable communication among them. Real-time decisions can therefore be made at the edge immediately, without the high latency of communicating with the cloud. Because fog is an extension of the cloud, the local grid services can also communicate with the cloud through open communication standards. More complex problems can be solved by exploiting the interplay between fog and cloud. The local grid fog computing platform is shipped integrated with Local Grid vRTU, a software-based virtual remote terminal unit that transforms communication between different edge devices.

2.5 ParStream

ParStream is a real-time IoT analytics platform that offers a scalable, fast, and reliable infrastructure. It also provides a Big Data analytics platform for IoT. It can be enabled on Cisco IoX, a fog-enabled device.

3 Layers in Fog Computing Architecture

The fog computing architecture consists of three layers. It extends cloud services by introducing a fog layer between the end devices and the cloud. The three layers, illustrated in Fig. 4, are as follows:

• Terminal layer: The layer closest to the physical environment and the end user. It consists of various IoT devices such as readers, smart cards, and smart vehicles. The devices are geographically distributed; they sense feature data of physical objects and events and transmit it to the upper layer for storage and processing.
• Fog layer: It is composed of a number of fog nodes, which include access points, switches, gateways, and routers. These nodes are widely distributed between the cloud and the end devices, for example in parks, streets, shopping centers, and cafes. Nodes can be either static or dynamic. End devices connect to fog nodes to obtain services, and the fog nodes compute on and store the sensed data.
• Cloud layer: It comprises high-performance servers and storage devices and provides application services such as smart factories and smart homes. It offers storage and computation capability for large amounts of data.
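The flow of data through the three layers can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an implementation from the chapter: the class names `FogNode` and `CloudStore`, the window of four readings, and the summary fields are all hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class CloudStore:
    """Cloud layer: stores the summaries it receives (hypothetical stand-in)."""
    summaries: list = field(default_factory=list)

    def upload(self, summary: dict) -> None:
        self.summaries.append(summary)


@dataclass
class FogNode:
    """Fog layer: buffers raw readings and forwards one summary per window."""
    cloud: CloudStore
    window: int = 4  # assumed number of readings per summary
    buffer: list = field(default_factory=list)

    def receive(self, reading: float) -> None:
        self.buffer.append(reading)
        if len(self.buffer) == self.window:
            self.cloud.upload({
                "count": len(self.buffer),
                "mean": mean(self.buffer),
                "max": max(self.buffer),
            })
            self.buffer.clear()


# Terminal layer: a sensor emitting raw temperature readings.
cloud = CloudStore()
fog = FogNode(cloud)
for value in [20.0, 21.0, 22.0, 23.0, 30.0, 31.0, 32.0, 33.0]:
    fog.receive(value)
```

Eight raw readings reach the fog node, but only two summaries travel on to the cloud, which is the reduction in upstream traffic that the fog layer is meant to provide.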

Fig. 4 Fog computing architecture: layers

3.1 Present Challenges

The existing challenges in designing a fog computing platform are as follows:

Latency: Latency must be minimized because fog applications expect real-time responses.

Virtualization technology choice: This determines the flexibility, efficiency, and speed of fog nodes.

To lower latency, the following considerations apply:

• Data aggregation: Data is aggregated to lower latency.
• Scheduling and provisioning: On resource-constrained fog nodes, provisioning and scheduling that are not completed within the time limit may add latency. A suitable scheduling mechanism should therefore be applied, using mobility and priority models.
• Failures or churn: Fog computing is affected when nodes fail or churn. Mitigation techniques such as replication, rescheduling, and checkpointing can be employed.
• Network management: It is critical to the functioning of fog. Deploying integrated Software Defined Networking (SDN) and Network Function Virtualization (NFV) is a challenge.
• Platform and security: Intrusion detection systems and access control are required, with support from each layer.
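As an illustration of the scheduling consideration, the sketch below orders tasks on a single resource-constrained node by priority and, within the same priority, by earliest deadline. It is a simplified assumption of what a priority model could look like; the `Task` fields, the convention that a lower number means higher urgency, and the example tasks are hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    priority: int      # lower value = more urgent (assumed convention)
    deadline_ms: int   # tie-breaker: earliest deadline first
    name: str = field(compare=False)
    cost_ms: int = field(compare=False)


def schedule(tasks):
    """Run tasks in (priority, deadline) order and report deadline misses."""
    heap = list(tasks)
    heapq.heapify(heap)
    clock_ms, order, missed = 0, [], []
    while heap:
        task = heapq.heappop(heap)
        clock_ms += task.cost_ms
        order.append(task.name)
        if clock_ms > task.deadline_ms:
            missed.append(task.name)
    return order, missed


order, missed = schedule([
    Task(priority=2, deadline_ms=100, name="log-upload", cost_ms=40),
    Task(priority=1, deadline_ms=30, name="brake-alert", cost_ms=10),
    Task(priority=1, deadline_ms=20, name="collision-warn", cost_ms=15),
])
```

A real fog scheduler would also account for mobility and for several nodes; the sketch only shows how a priority queue keeps urgent work ahead of bulk work.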

4 Application Management in Fog

Fog nodes are distributed and heterogeneous and have limited resources. Large web applications are therefore better deployed in cloud data centers. For fog, applications are instead modeled as interdependent lightweight application modules [7]. An application module contains the instructions needed to execute typical IoT functions, such as receiving data, processing and analyzing it, and responding to the real-time application. Each module contains the instructions needed to generate a specific output, which is then sent as input to another module according to the data dependency. Each module requires a certain amount of resources, such as memory, bandwidth, and CPU, for its execution. To reduce overheads, distributed application development techniques are discussed in [4], the coordination of fog nodes in [8], and a programming model platform in [9].
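The module model described above can be pictured as a small dependency graph. The sketch below is an assumption-laden illustration rather than the API of any toolkit: the module names, the resource figures, and the helper functions `execution_order` and `fits` are hypothetical. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical modules: name -> (upstream modules, CPU in MIPS, RAM in MB)
modules = {
    "receive": (set(), 100, 16),
    "process": ({"receive"}, 400, 64),
    "analyze": ({"process"}, 800, 128),
    "respond": ({"analyze"}, 100, 16),
}


def execution_order(mods):
    """Order modules so that each one runs after the modules it depends on."""
    graph = {name: deps for name, (deps, _cpu, _ram) in mods.items()}
    return list(TopologicalSorter(graph).static_order())


def fits(mods, cpu_capacity, ram_capacity):
    """Check whether one fog node can host all modules at once."""
    total_cpu = sum(cpu for _deps, cpu, _ram in mods.values())
    total_ram = sum(ram for _deps, _cpu, ram in mods.values())
    return total_cpu <= cpu_capacity and total_ram <= ram_capacity


order = execution_order(modules)
```

When `fits` returns `False` for a node, the modules would have to be split across several nodes or partly placed in the cloud, which is the placement problem the cited works address.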

4.1 Latency-Aware Application Development

Techniques for latency-sensitive applications that predate fog computing are discussed in [10–13]. Latency-aware application module management considers the frequency of incoming data and the delivery deadline, and provisions QoS while keeping energy use in fog optimized [14].

4.2 Distributed Application Development

The techniques are:

1. Droplets: When a large number of geographically distributed IoT devices generate a large number of service requests [15], the cloud needs fog computing. In this scenario, the policy is to deploy applications on fog nodes in the form of droplets, the smallest units of an application, which are distributed across fog nodes as required [4].
2. Mobile Fog dynamic node discovery [9]: This technique is based on the runtime scalability of distributed applications and the hierarchical organization of fog nodes.

5 Fog Analytics

5.1 Introduction

IoT contributes a great deal to Big Data, and the fog computing architecture is essential for IoT. Fog computing is a recent technology that reduces complexity and is an efficient way to reduce the big data load on the cloud. Fog servers can preprocess data locally and make faster decisions. Only the aggregated data is sent to the cloud, which reduces both the velocity and the volume of the data sent there. Applications such as event monitoring, interactive gaming, and augmented reality need only the data that meets certain conditions to be processed, not the complete data bank.

5.2 Fog Computing, Stream Data Analytics, and Big Data Analytics

Real-time data mining, stream data mining, and the broader area of analytics form the basis of distributed stream mining, such as classification and feature extraction, in fog systems. TensorFlow facilitates the implementation of machine learning and advanced data mining algorithms on mobile edge devices as well as on fog servers. The open challenge is how to balance load between fog servers and edge devices without affecting performance. Stream processing engines such as Apache Storm and Spark can execute streaming workloads on fog servers, which removes the need for new tools for fog analytics. APIs for fog streaming are among the issues still to be addressed. Some Apache Hadoop ecosystem components are also available, such as Spark GraphX for graph processing and Apache Mahout for machine learning.

5.3 Machine Learning for Fog Ecosystem

Machine learning is applied in various real-life fog computing scenarios that involve stream data, IoT, and other forms of big data. Over time, bioinformatics and health-related data are created and accumulated continuously, resulting in very large data sets [16]. Applications such as IoT devices, biometrics, genomics, and 3D imaging drive this data growth [17]. Real-time data streaming is also deployed in neonatal care to track the threat of infection, an application of fog computing and IoT. Data analytics can revolutionize the healthcare sector. Machine learning techniques for analyzing stream data, IoT data, and Big Data should have the following characteristics:

1. High-speed robustness: The machine learning algorithm should be able to ingest and process real-time data, with no degradation in performance due to the velocity, density, or volume of the data.
2. Scalability: The algorithm should handle a large volume of data with limited storage overhead and space complexity.
3. Incrementality: Many available machine learning algorithms cannot handle dynamic data. The algorithm should handle dynamic data without compromising quality.
4. Feature selection: In Big Data, IoT, and fog computing, the number of features and the dimensionality are very large. Feature selection identifies the relevant features and discards the redundant ones. Selecting the most crucial features reduces the effect of high dimensionality and so increases the efficiency of predictive analytics algorithms; it also improves interpretability and the learning process. In conventional machine learning, feature selection is performed by humans. Scale Invariant Feature Transform (SIFT) and Histogram of Oriented Gradients (HOG) [18] are well-known feature engineering techniques in the computer vision domain. Automated tools for feature engineering can minimize human effort. Dimensionality reduction and feature selection should be adopted to identify the important features in big data sets; the selected features enable fast decision-making over a large volume of data. The veracity, velocity, and volume of the data pose great challenges for universality, robustness, reliability, nonlinearity, implementation, and cost complexity. In bioinformatics, for example, the feature vector is one of the crucial indicators in protein sequence analysis and has very high dimensionality; the large number of features leads to higher complexity and lower prediction accuracy. Bagyamathi et al. [19] introduce a new feature selection method, and Barbu et al. [20] introduce feature selection with annealing, which reduces dimensionality. Another approach, incremental learning, selects the subset of features incrementally from different samples of data; Zeng et al. [21] introduce such an incremental feature selection method.
5. Distributed data: The machine learning algorithm should process the distributed data partially on each node and then merge the partial results into one.
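The incrementality and distributed-data characteristics can be illustrated with running statistics, a deliberately simple stand-in for a full learning algorithm. The sketch assumes two fog nodes that each see part of a stream; the class name `RunningStats` and the sample values are hypothetical. The update step is Welford's online algorithm, and the merge step is the standard pairwise combination of partial means and variances.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        """Incremental step: absorb one new observation without storing it."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        """Distributed step: combine two partial summaries into one."""
        n = self.n + other.n
        if n == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    @property
    def variance(self) -> float:
        """Population variance of everything seen so far."""
        return self.m2 / self.n if self.n else 0.0


node_a, node_b = RunningStats(), RunningStats()
for x in [1.0, 2.0, 3.0, 4.0]:
    node_a.update(x)
for x in [5.0, 6.0, 7.0, 8.0]:
    node_b.update(x)
merged = node_a.merge(node_b)
```

Each node keeps only three numbers regardless of how much data it has seen, and the merged result matches what a single pass over all eight values would give.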

5.3.1 Supervised Learning

In supervised learning, labeled training examples are used to train an algorithm. The algorithm then predicts the class labels of new instances based on the information obtained from the training instances. Supervised learning models are either classification models or regression models for continuous outputs. These models, however, were introduced before big data emerged. Big data analytics requires advanced supervised learning approaches for distributed and parallel learning, such as neural network classifiers, divide-and-conquer SVM [22], and the multi-hyperplane machine (MM) classification model [23]. SVM is the most efficient among these, and modified SVM techniques are used for big data analytics. One such method, the new primal SVM [24], was introduced for the classification of big data. Various application-specific techniques also exist.

5.3.2 Distributed Decision Trees

Gradient Boosted Decision Trees (GBDT) [25] were developed to parallelize the induction process. GBDT can be easily converted to a MapReduce model. Because HDFS incurs increased I/O overhead and high communication cost and is therefore not suitable in this situation, the MapReduce-based GBDT was employed with horizontal data partitioning. Calaway et al. introduced rxDTree, a reliable high-speed decision tree for Big Data [26]. It computes histograms to create an empirical distribution function, builds the decision tree breadth-first, and executes in a parallel computing environment. Hall et al. introduced a modified decision tree algorithm that learns decision trees in parallel on tractable training sets and generates rules from them [27]. However, because the training dataset is very large, the complexity increases. The approach can only be used to train decision trees for classifying unknown data.
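The histogram idea lends itself to a short sketch: each worker summarizes its own rows as per-bin class counts, the small histograms are merged, and the split is chosen from the merged histogram alone. This is a generic illustration under assumptions, not the rxDTree or GBDT implementation; the bin edges, the Gini criterion, the function names, and the sample rows are all hypothetical choices.

```python
from collections import Counter


def local_histogram(rows, edges):
    """rows: (feature_value, label) pairs held by one worker.
    Returns one Counter of labels per bin. Bin i holds values <= edges[i];
    the last bin holds everything above the final edge."""
    hist = [Counter() for _ in range(len(edges) + 1)]
    for value, label in rows:
        index = sum(value > edge for edge in edges)
        hist[index][label] += 1
    return hist


def merge(hist_a, hist_b):
    """Combine the histograms of two workers bin by bin."""
    return [a + b for a, b in zip(hist_a, hist_b)]


def gini(counts):
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_split(hist, edges):
    """Pick the bin edge with the lowest weighted Gini impurity."""
    total = sum(sum(c.values()) for c in hist)
    best = None
    for i, edge in enumerate(edges):
        left = sum(hist[: i + 1], Counter())
        right = sum(hist[i + 1:], Counter())
        n_left, n_right = sum(left.values()), sum(right.values())
        score = (n_left * gini(left) + n_right * gini(right)) / total
        if best is None or score < best[1]:
            best = (edge, score)
    return best


edges = [10, 20, 30]
worker_a = [(5, "ok"), (12, "ok"), (25, "alarm")]
worker_b = [(8, "ok"), (18, "ok"), (28, "alarm"), (35, "alarm")]
merged = merge(local_histogram(worker_a, edges), local_histogram(worker_b, edges))
split = best_split(merged, edges)
```

Only the histograms cross the network, so the communication cost depends on the number of bins and classes rather than on the number of rows.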

5.3.3 Clustering Methods for Big Data

Conventional clustering techniques do not address all the issues of big data scenarios simultaneously. Parallel clustering methods offer a solution for large volumes of data, incremental clustering techniques handle the high velocity of big data, and multi-view clustering handles data of different varieties [28]. Techniques for big data scenarios include CURE, CLARANS, CLARA, DENCLUE, and DBSCAN. The K-prototype and K-mode techniques work on large mixed and categorical data [29]. A variant by Ordonez minimizes memory requirements [30]. A framework that iteratively samples the large dataset is introduced in [31] and is used to form clusters [32]. WaveCluster converts spatial-domain data to the frequency domain [33]. Linkage-based and partitioned hierarchical clustering are proposed for parallel processing systems [34, 35]. A parallel K-means algorithm is proposed using the MapReduce architecture [36]. A distributed clustering algorithm, PDBSCAN [37], obtains clusters in a distributed environment. P-cluster performs partitioning to reduce errors [38]. PBIRCH [39], a parallel version of BIRCH, distributes the data among multiple processors in a shared-nothing architecture. The incremental K-means algorithm computes the new cluster centers [40]. Incremental hierarchical clustering [41] and density-based clustering (IGDCA) [42] are also introduced. Clustering algorithms with multi-view scope are proposed in [43, 44], separate clustering in different feature spaces in [45], and multi-view clustering that projects the views into a low-dimensional space in [46, 47].
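One iteration of K-means in the MapReduce style can be sketched as follows. This is a plain-Python illustration of the idea, assuming the map step runs on each data partition and the reduce step runs once; it does not use a MapReduce framework, and the function names and sample points are hypothetical.

```python
from collections import defaultdict


def nearest(point, centers):
    """Index of the center closest to the point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])))


def map_partition(points, centers):
    """Map step on one worker: per-cluster (coordinate sums, count)."""
    partial = defaultdict(lambda: ([0.0] * len(centers[0]), 0))
    for point in points:
        idx = nearest(point, centers)
        sums, count = partial[idx]
        partial[idx] = ([s + p for s, p in zip(sums, point)], count + 1)
    return dict(partial)


def reduce_partials(partials, centers):
    """Reduce step: merge the partial sums and recompute each center."""
    totals = defaultdict(lambda: ([0.0] * len(centers[0]), 0))
    for partial in partials:
        for idx, (sums, count) in partial.items():
            t_sums, t_count = totals[idx]
            totals[idx] = ([a + b for a, b in zip(t_sums, sums)], t_count + count)
    new_centers = []
    for idx, old in enumerate(centers):
        sums, count = totals.get(idx, (None, 0))
        # A cluster that received no points keeps its previous center.
        new_centers.append(tuple(s / count for s in sums) if count else old)
    return new_centers


centers = [(0.0, 0.0), (10.0, 10.0)]
partitions = [[(1.0, 1.0), (9.0, 9.0)], [(2.0, 2.0), (11.0, 11.0)]]
partials = [map_partition(part, centers) for part in partitions]
new_centers = reduce_partials(partials, centers)
```

Repeating the map and reduce steps until the centers stop moving gives the full algorithm; only per-cluster sums and counts are exchanged between workers.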

5.4 Distributed Parallel Association Rule Mining Techniques for Big Data Scenario

Sequential association mining techniques cannot scale with size or dimensionality. Various high-performance, parallel, and distributed association rule mining algorithms have therefore been developed. Count Distribution is a parallelization of the conventional sequential Apriori algorithm for association rule mining [48]. It minimizes communication because only counts are exchanged between processors [49]. However, it does not use memory fully, since the complete hash tree is replicated on each processor. Fast Distributed Mining (FDM) is based on Count Distribution [50]. PDM [51] is based on DHP [52]. Fast Parallel Mining (FPM) is a parallel version of FDM [53]. When serial methods are converted to parallel ones, their inherent deficiencies are retained [54]. The MapReduce framework is deployed for faster execution [55].
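The core of Count Distribution can be sketched in a few lines: every processor holds the full candidate set, counts the candidates over its own partition, and only the counts are summed globally. The sketch is a simplified single-pass illustration with hypothetical transactions and function names; it uses a plain list of candidates instead of a hash tree and omits candidate generation for later passes.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, candidates):
    """Each processor counts every candidate itemset over its own partition."""
    counts = Counter()
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:
                counts[candidate] += 1
    return counts


def frequent_itemsets(partitions, candidates, min_support):
    """Sum the local counts (the data itself is never moved) and filter."""
    global_counts = Counter()
    for partition in partitions:
        global_counts.update(local_counts(partition, candidates))
    return {c: n for c, n in global_counts.items() if n >= min_support}


partitions = [
    [["milk", "bread"], ["milk", "eggs"]],
    [["milk", "bread", "eggs"], ["bread"]],
]
items = sorted({i for part in partitions for t in part for i in t})
pairs = [frozenset(p) for p in combinations(items, 2)]
result = frequent_itemsets(partitions, pairs, min_support=2)
```

Because every processor replicates the whole candidate set, memory use grows with the number of candidates on each node, which is the limitation noted above.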

5.5 Dynamic Association Mining

All association rule mining techniques implicitly assume that the datasets are static, whereas transactional databases are dynamic in nature. Changes to the data caused by transactions invalidate the previous results of association rule mining [56]. Fast Update (FUP) computes the large itemsets in an updated dataset. Some algorithms are also unable to handle large volumes of data. In biology, a gene regulatory network (GRN) represents the complex behavior of a group of genes and their influence on other genes. Several attempts infer GRNs from steady-state data, but they cannot handle dynamic time series data [57]. A GRN reconstruction method is needed to infer reliable GRNs, so that potential drug targets for target diseases can be identified [58].


6 Deep Learning and Big Data

Deep learning algorithms can be used in big data scenarios to obtain significant and meaningful patterns from large volumes of unlabeled (unsupervised) data [59–61]. Once information has been obtained from the unlabeled data, traditional discriminative machine learning models can be trained with labeled (supervised) data points. Deep learning algorithms can find global patterns or relationships in the input [59]. Other advantages of deep learning are:

• It makes use of knowledge extracted from abstract and complex representations of the data.
• It outperforms other techniques and enables automated feature engineering for heterogeneous data types such as text, speech, and images.
• The higher levels of abstraction it obtains also yield semantic and relational knowledge.

These properties make deep learning an effective tool in big data analytics. Examples are data tagging and semantic indexing [62]. Semantic indexing is an important part of Big Data for speech, computer vision, social networks, and defense. Because of the size of the data, it must be indexed so that information can be retrieved quickly, and semantic indexing does this instead of indexing the raw data. Deep learning methods can generate high-level abstract representations of the data, which can then be used for indexing. These representations reveal the entities in the data and the relationships between them. Data with similar representations is stored together for easier and faster access. Deep learning algorithms essentially expose the associations within the data. Text processing poses a challenge when a query or search spans content in multiple languages, since multiple language processors are required.

Using morphological analysis for conversion, knowledge can be represented independently of any natural language in a standard intermediate form, the Object Knowledge Model [63]. The intermediate-form layer sits above the multilingual text. At the next level of the hierarchy, the knowledge graph is produced from the knowledge frames, and in the last layer the knowledge graph is processed. Sources in multiple languages are represented in the internal structure of the knowledge graph. Processing such as identifying entities and activities in the various input texts helps in comparing, summarizing, and analyzing the texts. A vector representation of the complete text document makes information retrieval easy and efficient. The extracted complex data representation contains semantic and relational information and can be further used for semantic indexing. Comparing texts becomes easier when they are represented as vectors: judging text similarity by vectors is a much better method than comparing raw data. A document representation condenses the unique or essential features of the topic or document. Word count is the key parameter in several retrieval and document classification systems, for example BM25 [64] and TF-IDF [65]. These systems treat individual words as independent dimensions, although word occurrences are in fact highly correlated. Deep learning techniques can therefore be employed to reduce the dimensionality of the data by identifying the semantic features of a document. A single document is represented by a 128-bit code, and the Hamming distance between binary codes is used to evaluate the semantic similarity of documents: in Hamming space, semantically similar documents lie closer together. A document can be retrieved by its binary code, which occupies little storage space, and the Hamming distance is calculated with a fast bit-counting method. Deep learning algorithms outperform conventional shallow machine learning algorithms and require less space. The Google tool word2vec constructs a vocabulary from the input text and produces vector representations for it. The word vector file is used in many machine translation and Natural Language Processing (NLP) applications. Word vector learning techniques for large datasets are proposed in [66]. The distributed framework DistBelief is used for training neural networks and also enables machine translation [67]. Further solutions and innovations for large-scale data are still needed.
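Retrieval by binary code comes down to an XOR followed by a bit count. The sketch below assumes that each document already has an integer code (how the code is learned is outside its scope); the document identifiers and the short 8-bit example codes are hypothetical, and the same function works unchanged on 128-bit integers.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits: XOR the codes, then count the set bits."""
    return bin(a ^ b).count("1")


def nearest_documents(query_code, codes, k=2):
    """codes: dict mapping a document id to its integer binary code."""
    return sorted(codes, key=lambda doc: hamming(query_code, codes[doc]))[:k]


codes = {
    "doc_a": 0b1111_0000,
    "doc_b": 0b1111_0001,
    "doc_c": 0b0000_1111,
}
query = 0b1111_0011
ranked = nearest_documents(query, codes)
```

Storing a document as a single integer and comparing codes with bitwise operations is what makes this kind of lookup cheap in both space and time.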

7 Approaches for Fog Analytics

1. Smart Data: A package of structured data introduced for fog analytics. It encapsulates the data generated by IoT sensors and devices, a virtual machine, and a set of metadata.
2. Fog Engine: It provides data analytics and the capability to communicate with the cloud and with other fog engines. The Fog Engine is a heterogeneous, agile, and customizable platform that is integrated with IoT devices. It enables data processing in the cloud and across an integrated grid of IoT devices. A Fog Engine can form a peer-to-peer network with other devices beneath the cloud, and it acts as a gateway that facilitates interaction with the cloud and the offloading of data. In-stream data is analyzed locally in the Fog Engine. For global data analytics, data from the different Fog Engines is collected and transmitted to the cloud. Multiple deployment scenarios exist, depending on whether there are single or multiple analyzers, multiple receivers, and single or multiple transmitters. Deploying the Fog Engine eases the burden of data analytics and of the network backbone on the utility side and further reduces dependence on the cloud. The Fog Engine performs computations locally, cleaning and analyzing a certain amount of the data. This minimizes large-scale data transfer to the cloud and helps reduce network congestion.
3. More products: Other products are CardioLog Analytics by Intlock and Microsoft Azure Stack, which offer on-premise data analytics. Oracle delivers capacity on demand as Oracle Infrastructure as a Service (IaaS), which enables customers to deploy systems in their own data centers.
4. ParStream: Introduced by Cisco, it enables continuous and immediate analysis of real-time data while the data is being loaded. It has a hybrid distributed architecture for analyzing billions of records at the edge, with compression capabilities and patented indexing for processing real-time data with reduced performance degradation. It uses GPUs and CPUs to execute queries. It integrates with machine learning engines and the R language to support advanced analytics. Large amounts of historical data are analyzed using time series methods.

8 Research Directions

• Harnessing the temporal dimension of IoT data for Customer Relationship Management (CRM): The temporal perspective of IoT data can be harnessed to enhance the quality of experience and service.
• Adding semantics to IoT data: Adding metadata about specific implications and circumstances enhances the value of the information. Metadata is of great importance when IoT data is processed and used by clients at the fog or device level. Ontologies can be created from metadata vocabularies; the different ontologies are based on OKM, OWL, and RDF.
• Semantic Web of IoT: One of the main goals is to integrate data from sensors and IoT devices into the semantic web.
• Standardization and interoperability in IoT: IoT worldwide exhibits heterogeneous initiatives, platforms, protocols, and standards, such as HTTP, AMQP, STOMP, XMPP, and MQTT.
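Adding semantics to a reading can be pictured as attaching subject-predicate-object triples to it, in the spirit of RDF. The sketch below is a plain-Python illustration only: the `ex:` vocabulary, the identifiers, and the helper functions are hypothetical, and a real deployment would use a published ontology and RDF tooling instead.

```python
def annotate(reading_id, value, unit, sensor, location):
    """Return (subject, predicate, object) triples describing one reading.
    The 'ex:' vocabulary is hypothetical, not a published ontology."""
    return [
        (reading_id, "ex:hasValue", value),
        (reading_id, "ex:hasUnit", unit),
        (reading_id, "ex:observedBy", sensor),
        (sensor, "ex:locatedIn", location),
    ]


def query(triples, predicate=None, obj=None):
    """Subjects of all triples matching the given predicate and/or object."""
    return [s for s, p, o in triples
            if (predicate is None or p == predicate)
            and (obj is None or o == obj)]


triples = annotate("reading:1", 36.9, "Cel", "sensor:42", "ward:3")
triples += annotate("reading:2", 38.4, "Cel", "sensor:43", "ward:3")
sensors_in_ward = query(triples, predicate="ex:locatedIn", obj="ward:3")
```

With the metadata attached, a fog node can answer questions such as "which sensors are in this ward" without consulting the cloud.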

9 Case Study

The Internet of Things offers a structured and competent way to improve health. One of the best ways of providing healthcare is to monitor health with systems that acquire bio-signals from different sensor nodes. The data is then sent to a gateway over a wireless communication protocol. For real-time processing, diagnosis, and visualization, the real-time data is sent to a remote cloud server. Smart gateways analyze the ECG signals and extract features such as the P wave and the heart rate using a lightweight wavelet transform [68].
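The kind of lightweight processing a gateway might perform can be sketched with one level of the Haar wavelet transform and a heart-rate estimate from R-peak timestamps. This is an illustration under assumptions: the chapter does not state which wavelet the cited work uses, Haar is chosen here only because it is the simplest, and the sample values and function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the Haar wavelet transform on an even-length signal.
    Returns (approximation, detail) coefficient lists."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale
              for i in range(0, len(signal), 2)]
    return approx, detail


def heart_rate_bpm(r_peak_times_s):
    """Mean heart rate from the timestamps (seconds) of successive R peaks."""
    intervals = [b - a for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]
    return 60.0 / (sum(intervals) / len(intervals))


approx, detail = haar_step([4.0, 4.0, 2.0, 0.0])
bpm = heart_rate_bpm([0.0, 0.8, 1.6, 2.4])
```

The approximation coefficients give a half-length, smoothed version of the signal, while the detail coefficients capture sharp changes such as the edges of a QRS complex; both are cheap enough to compute on a gateway.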


10 Conclusion

With the fast development of the mobile internet, CPS, and IoT, fog computing is growing rapidly. It brings more services and applications from the cloud to the network edge by using geographically distributed networks. It meets the requirements of latency-sensitive applications by reducing the volume of network traffic and the data transmission time. This chapter focused on fog computing technology together with the concepts of deep learning. Different fog computing applications were also detailed, along with research directions. Fog computing will be a prominent area that further influences industry and academia.

References 1. Komninos, N., Philippou, E., Pitsillides, A.: Survey in smart grid and smart home security: issues, challenges and countermeasures. IEEE Commun. Surv. Tutor. 16(4), 1933–1954 (2014) 2. Bonomi, F., Milito, R., Zhu, J., Addepalli, S.: Fog computing and its role in the internet of things. In: Proceedings of the First Edition of the MCC Workshop on Mobile Cloud Computing, pp. 13–16 (2012) 3. Dastjerdi, A.V., Gupta, H., Calheiros, R.N., Ghosh, S.K., Buyya, R.: Fog computing: principles, architectures, and applications. In: Internet of Things, pp. 61–75. Elsevier (2016) 4. Vaquero, L.M., Rodero-Merino, L.: Finding your way in the fog: towards a comprehensive definition of fog computing. ACM SIGCOMM Comput. Commun. Rev. 44(5), 27–32 (2014) 5. Yi, S., Li, C., Li, Q.: A survey of fog computing: concepts, applications and issues. In: Proceedings of the 2015 Workshop on Mobile Big Data, pp. 37–42 (2015) 6. Satyanarayanan, M., Lewis, G., Morris, E., Simanta, S., Boleng, J., Ha, K.: The role of cloudlets in hostile environments. IEEE Pervasive Comput. 12(4), 40–49 (2013) 7. Gupta, H., Vahid Dastjerdi, A., Ghosh, S.K., Buyya, R.: ifogsim: a toolkit for modeling and simulation of resource management techniques in the internet of things, edge and fog computing environments. Softw.: Pract. Exp. 47(9), 1275–1296 (2017) 8. Giang, N.K., Blackstock, M., Lea, R., Leung, V.C.: Developing IoT applications in the fog: a distributed dataflow approach. In: 2015 5th International Conference on the Internet of Things (IOT), pp. 155–162. IEEE (2015) 9. Hong, K., Lillethun, D., Ramachandran, U., Ottenwälder, B., Koldehofe, B.: Mobile fog: a programming model for large-scale applications on the internet of things. In: Proceedings of the Second ACM SIGCOMM Workshop on Mobile Cloud Computing, pp. 15–20 (2013) 10. Kang, Y., Zheng, Z., Lyu, M.R.: A latency-aware co-deployment mechanism for cloud-based services. In: 2012 IEEE Fifth International Conference on Cloud Computing, pp. 
630–637. IEEE (2012) 11. Nishio, T., Shinkuma, R., Takahashi, T., Mandayam, N.B.: Service-oriented heterogeneous resource sharing for optimizing service latency in mobile cloud. In: Proceedings of the First International Workshop on Mobile Cloud Computing & Networking, pp. 19–26 (2013) 12. Ottenwälder, B., Koldehofe, B., Rothermel, K., Ramachandran, U.: Migcep: operator migration for mobility driven distributed complex event processing. In: Proceedings of the 7th ACM International Conference on Distributed Event-Based Systems, pp. 183–194 (2013) 13. Takouna, I., Rojas-Cessa, R., Sachs, K., Meinel, C.: Communication-aware and energy-efficient scheduling for parallel applications in virtualized data centers. In: 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing, pp. 251–255. IEEE (2013)


14. Mahmud, M.R., Afrin, M., Razzaque, M.A., Hassan, M.M., Alelaiwi, A., Alrubaian, M.: Maximizing quality of experience through context-aware mobile application scheduling in cloudlet infrastructure. Softw.: Pract. Exp. 46(11), 1525–1545 (2016) 15. Singh, P., Gupta, P., Jyoti, K., Nayyar, A.: Research on auto-scaling of web applications in cloud: survey, trends and future directions. Scalable Comput.: Pract. Exp. 20(2), 399–432 (2019) 16. Singh, S.P., Nayyar, A., Kumar, R., Sharma, A.: Fog computing: from architecture to edge computing and big data processing. J. Supercomput. 75(4), 2070–2105 (2019) 17. Solanki, A., Nayyar, A.: Green internet of things (g-iot): Ict technologies, principles, applications, projects, and challenges. In: Handbook of Research on Big Data and the IoT, pp. 379–405. IGI Global (2019) 18. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), vol. 1, pp. 886–893. IEEE (2005) 19. Bagyamathi, M., Inbarani, H.H.: A novel hybridized rough set and improved harmony search based feature selection for protein sequence classification. In: Big Data in Complex Systems, pp. 173–204. Springer (2015) 20. Barbu, A., She, Y., Ding, L., Gramajo, G.: Feature selection with annealing for computer vision and big data learning. IEEE Trans. Pattern Anal. Mach. Intell. 39(2), 272–286 (2016) 21. Zeng, A., Li, T., Liu, D., Zhang, J., Chen, H.: A fuzzy rough set approach for incremental feature selection on hybrid information systems. Fuzzy Sets Syst. 258, 39–60 (2015) 22. Hsieh, C.-J., Si, S., Dhillon, I.: A divide-and-conquer solver for kernel support vector machines. In: International Conference on Machine Learning, pp. 566–574 (2014) 23. Djuric, N.: Big data algorithms for visualization and supervised learning. Ph.D. thesis, Ph.D. dissertation, Temple University (2014) 24. 
Nie, F., Huang, Y., Wang, X., Huang, H.: New primal SVM solver with linear computational cost for big data classifications. In: Proceedings of the 31st International Conference on International Conference on Machine Learning, vol 32, pp. II–505 (2014) 25. Ye, J., Chow, J.-H., Chen, J., Zheng, Z.: Stochastic gradient boosted distributed decision trees. In: Proceedings of the 18th ACM Conference on Information and Knowledge Management, pp. 2061–2064 (2009) 26. Calaway, R., Edlefsen, L., Gong, L., Fast, S.: Big data decision trees with r. Revolution (2016) 27. Hall, L.O., Chawla, N., Bowyer, K.W.: Decision tree learning on very large data sets. In: SMC’98 Conference Proceedings. 1998 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No. 98CH36218), vol. 3, pp. 2579–2584. IEEE (1998) 28. Mehta, A., Kaur, A., Singh, P.: A heuristic approach for efficient load balancing in cloud using weight based algorithm. In: 2018 4th International Conference on Computing Sciences (ICCS), pp. 1–6. IEEE (2018) 29. Huang, Z.: Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Min. Knowl. Disc. 2(3), 283–304 (1998) 30. Ordonez, C., Omiecinski, E.: Efficient disk-based k-means clustering for relational databases. IEEE Trans. Knowl. Data Eng. 16(8), 909–921 (2004) 31. Bradley, P., Fayyad, U., Reina, C.: Scaling clustering algorithms to large databases, knowledge discovery and data mining (1998) 32. Kaur, A., Singh, M., Singh, P., et al.: A taxonomy, survey on placement of virtual machines in cloud. In: 2017 International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS), pp. 2054–2058. IEEE (2017) 33. Sheikholeslami, G., Chatterjee, S., Zhang, A.: Wavecluster: a wavelet-based clustering approach for spatial data in very large databases. VLDB J. 8(3–4), 289–304 (2000) 34. Li, X., Fang, Z.: Parallel clustering algorithms. Parallel Comput. 11(3), 275–290 (1989) 35. 
Gupta, R., Tanwar, S., Tyagi, S., Kumar, N., Obaidat, M.S., Sadoun, B.: Habits: blockchain-based telesurgery framework for healthcare 4.0. In: 2019 International Conference on Computer, Information and Telecommunication Systems (CITS), pp. 1–5. IEEE (2019)


36. Zhao, W., Ma, H., He, Q.: Parallel k-means clustering based on MapReduce. In: IEEE International Conference on Cloud Computing, pp. 674–679. Springer (2009) 37. Xu, X., Jäger, J., Kriegel, H.-P.: A fast parallel clustering algorithm for large spatial databases. In: High Performance Data Mining, pp. 263–290. Springer (1999) 38. Judd, D., McKinley, P.K., Jain, A.K.: Large-scale parallel data clustering. IEEE Trans. Pattern Anal. Mach. Intell. 20(8), 871–876 (1998) 39. Garg, A., Mangla, A., Gupta, N., Bhatnagar, V.: Pbirch: a scalable parallel clustering algorithm for incremental data. In: 2006 10th International Database Engineering and Applications Symposium (IDEAS’06), pp. 315–316. IEEE (2006) 40. Chakraborty, S., Nagwani, N.: Analysis and study of incremental k-means clustering algorithm. In: International Conference on High Performance Architecture and Grid Computing, pp. 338– 341. Springer (2011) 41. Widyantoro, D.H., Ioerger, T.R., Yen, J.: An incremental approach to building a cluster hierarchy. In: 2002 IEEE International Conference on Data Mining. Proceedings, pp. 705–708. IEEE (2002) 42. Chen, N., Chen, A.-Z., Zhou, L.-X.: An incremental grid density-based clustering algorithm. J. Softw. 13(1), 1–7 (2002) 43. Kailing, K., Kriegel, H.-P., Pryakhin, A., Schubert, M.: Clustering multi-represented objects with noise. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 394– 403. Springer (2004) 44. Kumari, P., Kaur, A., Singh, P., Singh, M.: Robust energy-aware task scheduling for scientific workflow in cloud computing. In: 2017 International Conference on Intelligent Computing and Control Systems (ICICCS), pp. 985–990. IEEE (2017) 45. Zeng, H.-J., Chen, Z., Ma, W.-Y.: A unified framework for clustering heterogeneous web objects. In: Proceedings of the Third International Conference on Web Information Systems Engineering. WISE 2002, pp. 161–170. IEEE (2002) 46. 
Chaudhuri, K., Kakade, S.M., Livescu, K., Sridharan, K.: Multi-view clustering via canonical correlation analysis. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 129–136 (2009) 47. Kumar, A., Daumé, H.: A co-training approach for multi-view spectral clustering. In: Proceedings of the 28th International Conference on Machine Learning (ICML-11), pp. 393–400 (2011) 48. Agrawal, R., Shafer, J.C.: Parallel mining of association rules. IEEE Trans. Knowl. Data Eng. 8(6), 962–969 (1996) 49. Kaur, A., Singh, P., Singh Batth, R., Peng Lim, C.: Deep-q learning-based heterogeneous earliest finish time scheduling algorithm for scientific workflows in cloud. Softw.: Pract. Exp. (2020) 50. Cheung, D.W., Han, J., Ng, V.T., Fu, A.W., Fu, Y.: A fast distributed algorithm for mining association rules. In: Fourth International Conference on Parallel and Distributed Information Systems, pp. 31–42. IEEE (1996) 51. Park, J.S., Chen, M.-S., Yu, P.S.: An effective hash-based algorithm for mining association rules. ACM SIGMOD Rec. 24(2), 175–186 (1995) 52. Park, J.S., Chen, M.-S., Yu, P.S.: Efficient parallel data mining for association rules. In: Proceedings of the Fourth International Conference on Information and Knowledge Management, pp. 31–36 (1995) 53. Cheung, D.W., Xiao, Y.: Effect of data skewness in parallel mining of association rules. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 48–60. Springer (1998) 54. Tanwar, S., Tyagi, S., Kumar, N.: Multimedia Big Data Computing for IoT Applications: Concepts, Paradigms and Solutions, vol. 163. Springer (2019) 55. Moens, S., Aksehirli, E., Goethals, B.: Frequent itemset mining for big data. In: 2013 IEEE International Conference on Big Data, pp. 111–118. IEEE (2013) 56. Vohra, J., Tanwar, S., Tyagi, S., Kumar, N., Rodrigues, J.J.: Hridaay: ballistocardiogram-based heart rate monitoring using fog computing. In: 2019 IEEE Global Communications Conference (GLOBECOM), pp. 1–6. 
IEEE (2019)


57. Thomas, S.A., Jin, Y.: Reconstructing biological gene regulatory networks: where optimization meets big data. Evol. Intell. 7(1), 29–47 (2014) 58. Madhamshettiwar, P.B., Maetschke, S.R., Davis, M.J., Reverter, A., Ragan, M.A.: Gene regulatory network inference: evaluation and application to ovarian cancer allows the prioritization of drug targets. Genome Med. 4(5), 41 (2012) 59. Bengio, Y., LeCun, Y., et al.: Scaling learning algorithms towards AI. Large-Scale Kernel Mach. 34(5), 1–41 (2007) 60. Bengio, Y.: Deep learning of representations: Looking forward. In: International Conference on Statistical Language and Speech Processing, pp. 1–37. Springer (2013) 61. Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013) 62. Council, N.R., et al.: Frontiers in Massive Data Analysis. National Academies Press (2013) 63. Prabhu, C.: Fog Computing. Deep Learning and Big Data Analytics-Research Directions. Springer (2019) 64. Robertson, S.E., Walker, S.: Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval. In: SIGIR’94, pp. 232–241. Springer (1994) 65. Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Inf. Process. Manag. 24(5), 513–523 (1988) 66. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space (2013). arXiv:1301.3781 67. Mikolov, T., Le, Q.V., Sutskever, I.: Exploiting similarities among languages for machine translation (2013). arXiv:1309.4168 68. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N.: Fog computing for healthcare 4.0 environment: opportunities and challenges. Comput. Electr. Eng. 72, 1–13 (2018)

Data Collection in Fog Data Analytics

S. R. Mani Sekhar, Snehil Tewari, Haaris Rahman, and G. M. Siddesh

Abstract As the amount of real-time data increases rapidly, processing this data becomes a challenging task. The data is produced by billions of IoT devices around the world and processed in the cloud. Meanwhile, around two quintillion bytes of information are collected every hour, and this figure is predicted to rise exponentially in the coming years. Restrictions on the efficiency, storage capacity and security of end devices in the cloud have led to a new, indispensable computing paradigm named 'Fog Computing.' In this chapter we examine the need to collect and manage data, how data differs across scenarios, and the methods currently used to collect data, such as node-based segregation, which reduces both the number of fog nodes that must be set up and the overloading of those nodes. We also explore techniques by which raw and passive forms of data can be made to evolve and become meaningful at a reduced size, how Bluetooth Low Energy technology can be used to process collected data through gateways, and the use of data collectors with low-power wireless sensing systems. Finally, the chapter discusses fifteen case studies related to Moving Vehicles, Industrial Automation, Underwater Data Collection, Water Conservation in Agriculture, Indoor Air Quality Monitoring, Health Monitoring System, Telehealth Big Data and Healthcare 4.0, all concerning data analytics that incorporates Cloud, Fog and IoT.

Keywords Data · Data collection · Fog computing · Data analytics · Methods of data collection · Fog data analytics

S. R. Mani Sekhar (B) · S. Tewari · G. M. Siddesh
Department of Information Science and Engineering, Ramaiah Institute of Technology, Bangalore, Karnataka, India
e-mail: [email protected]

S. Tewari
e-mail: [email protected]

G. M. Siddesh
e-mail: [email protected]

H. Rahman
Department of Electronics and Instrumentation, Ramaiah Institute of Technology, Bangalore, India
e-mail: [email protected]

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 S. Tanwar (ed.), Fog Data Analytics for IoT Applications, Studies in Big Data 76, https://doi.org/10.1007/978-981-15-6044-6_5


1 Introduction

Data is a collection of facts, figures, observations, or descriptions of things. The terms data and information are often used interchangeably, though there is a subtle difference. Data is raw and unorganized, and can seem random and futile until it is organized. When data is structured, processed and presented in a given context to make it useful, it is called information. In computing terms, data is information that has been transformed into a form that can be moved and analyzed.

Data can be classified into three categories. Structured data is data that can be organized into a typical database, i.e., it can be stored in a table with rows and columns. This data can be searched efficiently by human queries or machine algorithms. Names, addresses, etc. come under structured data. Unstructured data does not conform to any data model. It has internal structure but lacks an easily identifiable one, which makes searching and indexing the data difficult. Computer programs cannot make use of this data easily. Text, audio and video information, logs, and web activity are examples of unstructured data. Semi-structured data sits between structured and unstructured data. It carries internal tags and markings but does not conform to a formal data model. These organizational properties make it easier to analyze than unstructured data. XML, emails and JSON files are a few examples.

Data collection is a methodical approach to accumulating and regulating information to gain a definite understanding of an area of interest. Data collection empowers a person or an institution to answer relevant questions, reduce the risks incurred due to insufficient data, predict and assess outcomes, and understand and set new trends. Overall, it gives new insights and an edge over competitors, whereas without data collection companies must still rely on outdated methods to make decisions. Data collected can vary from personal data, such as demographics, contact details and other identifying factors, to web data, which comprises any information pulled from the internet for research or other purposes. Most of the data collected is in electronic form, so its volume is enormous. As a result, it crosses into the realm of big data.

The fast and steady collection of data over the last two decades and the growth of IoT devices over the last couple of years have led to a surge in data. Against this background, the term 'Big Data' was popularized around 2005. Any huge amount of data, structured or unstructured, that must be organized, stored and processed can be referred to as big data, and it is frequently measured in petabytes and exabytes. Data mining is very often applied to big data to decipher patterns and correlations for IoT applications. It has been observed that mining data for frequent and repeated patterns is more useful than regular computing methods. Big data gives insight that helps in making better decisions and strategic business moves. It allows B2C companies to broaden their horizons and prosper by changing their marketing strategies and implementing changes backed by concrete analysis of data.

Traditional databases cannot handle big data, and therefore new computing methods have emerged, such as cloud computing, fog computing and quantum computing. The term fog is an analogy to real-world fog, a layer between the clouds and the ground, which in our case are the cloud storage and the IoT devices, respectively. Fog computing has a decentralized architecture with edge nodes that have the power to process large amounts of data without sending it to distant servers. Because of this proximity to the devices, fog computing provides control as well as real-time services and analysis. It facilitates computing near the users and helps protect data against leakage. In layman's terms, fog nodes or gateways crunch through the data and send only the required data to the cloud.

The remainder of the chapter is organized as follows. Section 2 highlights the general methods of data collection from various sources and how fog computing plays a vital role in data collection. Section 3 explains how optimized compression of data can reduce traffic congestion in the network. Section 4 gives insights that are helpful in effectively collecting and managing big data. Section 5 discusses various case studies in crucial fields such as healthcare, agriculture, automation, etc. Section 6 summarizes the data collection methods in Fog Data Analytics and concludes the chapter.

2 Methods of Collecting Data

Over the last few years, there has been an explosion of data from different fields through various sources. Data is being collected at a massive rate by people and organizations to create a user-friendly environment, perform research and increase profits. Some of the ways to collect data are:

• Website visits: A company needs to understand user behaviour and traffic on its websites. It can use various web analytics tools available in the market, such as Google Analytics, IBM Unica NetInsight, Webtrends Analytics, etc. With the aid of these tools, the website owner collects and visualizes statistics and observes the user's activity on the site. The aim is to understand the patterns in which users search for and access the choices available throughout the website, which is very helpful, for example, in creating a recommender system that makes the website easier to use and saves time. Heatmaps can also be used to track cursor movements, scrolling areas, and time spent on certain sections.
• Mobile Applications: According to some research groups, there are currently approximately 3 million apps in the Google Play Store. These apps can share the information available on mobile phones with Google and other third parties. This information can range from the location of the user and the nearest phone towers to personal information such as date of birth, gender, contact numbers, email ID, etc. The business of advertising and data sharing has skyrocketed in recent times and needs to be monitored efficiently.


• Loyalty Programs: It is tougher to retain a customer base than to acquire customers for the first time. If not handled and serviced correctly, a customer might not return to the brand or company, so loyalty programs are used to keep the customer's relationship with the brand intact. In return for personal information, the user is given offers, discounts, first access, etc. as a reward for staying loyal to the brand.
• Sensors: Sensor data is generated by the active gadgets in our smart environment and is often associated with the term Internet of Things (IoT). This covers everything from counting the number of steps taken using motion or activity trackers, and measuring heart rate and pulse, to measuring temperature and weather. Sensor data is generally used for the optimization of processes, and machines can adapt to their surroundings by making smart changes. For example, AirAsia, in collaboration with GE Aviation, uses sensors with IoT complemented by AI to cut its operating costs and boost usage and profits.
• Fog layer: The fog layer acts as a link between the IoT devices and the cloud. Huge amounts of data are generated every day and, to deal with the drawbacks of cloud computing, it is important to focus not only on data management, storage, processing, analysis and monitoring but also on data collection. If devices can judge which data is useful and which is redundant or useless, much of the data will be filtered at the initial layers of networking and computing, significantly reducing the intake of data for processing. Fog computing plays a crucial role in achieving efficient management of data. It filters out redundant data and prioritizes the processing of time-sensitive data.

Data is harvested from the physical environment with the help of sensors placed in gadgets or IoT devices, referred to as fog devices [25]. A number of these devices are connected to large servers of higher configuration, and fog devices connected to the same server can also interact with each other. The efficiency and throughput of processing data and rendering requested services in real time depend on several factors, such as the number of devices linked to a server, bandwidth, connectivity of the devices, etc. After the necessary evaluations, data is either filtered out or sent to higher levels for complex and expensive computation or for storage.
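The filtering and prioritization role of a fog node described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the class names `Reading` and `FogFilter`, the `min_change` redundancy threshold and the two-level priority scheme are hypothetical and are not taken from the chapter or from [25].

```python
import heapq
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Reading:
    """One sensor sample arriving at a fog node (hypothetical structure)."""
    sensor_id: str
    value: float
    time_sensitive: bool = False


@dataclass
class FogFilter:
    """Drops near-duplicate readings and queues the rest by urgency."""
    min_change: float = 0.5  # assumed threshold below which a reading is redundant
    _last_sent: Dict[str, float] = field(default_factory=dict)
    _queue: List[Tuple[int, int, Reading]] = field(default_factory=list)
    _counter: int = 0  # tie-breaker that preserves arrival order

    def offer(self, reading: Reading) -> bool:
        """Return True if the reading was kept, False if it was filtered out."""
        last = self._last_sent.get(reading.sensor_id)
        redundant = last is not None and abs(reading.value - last) < self.min_change
        if redundant and not reading.time_sensitive:
            return False
        self._last_sent[reading.sensor_id] = reading.value
        priority = 0 if reading.time_sensitive else 1  # 0 is served first
        heapq.heappush(self._queue, (priority, self._counter, reading))
        self._counter += 1
        return True

    def next_to_process(self) -> Optional[Reading]:
        """Pop the most urgent reading, or None when the queue is empty."""
        if not self._queue:
            return None
        return heapq.heappop(self._queue)[2]
```

In this sketch a time-sensitive reading is never dropped and always overtakes ordinary readings in the queue; a real deployment would tune both rules to the application.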

3 Optimized Collection of Compressive Data

Traditional methods of information gathering in wireless sensor networks faced many challenges. Sensor data used to be collected continuously and stored at a local node, and then forwarded to a central base station at regular intervals. This congested the entire network, leading to inefficient data transfer and higher energy consumption [1]. Optimized compression and collection of data was therefore needed.

According to the study in [2], there are three main categories of schemes used for data collection. The first collects only the critical data, or a compressed version of the original data, and then reconstructs the original data by numerical analysis and interpolation techniques [3]. Reconstruction of data consumes additional energy and time. To address this issue, an algorithm was developed that reduces the redundancy of data by exploiting the strong temporal and spatial correlation among sensory data [4]. The second category tackles the efficiency problem by creating advanced routing algorithms and balancing the load across the entire network [5, 6]. The final category applies compressed sensing theory: the data is sparse in a transform domain, which allows it to be reconstructed from far fewer samples than the Nyquist sampling theorem requires [7]. Other methods collect data at various time instances, store it in matrix form, and apply singular value decomposition, achieving good compression by choosing how many singular values to retain without losing significant information [8].
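The singular value decomposition approach can be made concrete with the standard truncated-SVD identities. The notation below is introduced here for illustration and is not taken from [8]: X is an m × n matrix holding readings from m sensors at n time instances, r is its rank, and k is the number of singular values retained.

```latex
\begin{aligned}
X &= U \Sigma V^{\top} = \sum_{i=1}^{r} \sigma_i\, u_i v_i^{\top},
  \qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0, \\
X_k &= \sum_{i=1}^{k} \sigma_i\, u_i v_i^{\top}, \qquad k \le r, \\
\lVert X - X_k \rVert_F^{2} &= \sum_{i=k+1}^{r} \sigma_i^{2}, \\
\text{values to transmit} &= k\,(m + n + 1) \quad \text{instead of} \quad m n .
\end{aligned}
```

The third line shows that the reconstruction error depends only on the discarded singular values, so compression is nearly lossless when those values are small. As an assumed example, with m = 50, n = 200 and k = 5, a node would send 5 × 251 = 1255 values rather than 10,000.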

4 Management of Big Data

The field of Information Technology has seen a colossal number of inventions and innovations, technological changes, new gadgets and much more in the last two decades, and data has undeniably become one of the most crucial and fundamental elements of this vast realm. It is therefore important to manage data smartly, especially big data, to save space and time, rather than focusing only on computing platforms. Smart management of data starts with efficient collection of data at the very beginning, and fog computing can play a major role in achieving it. The data collected from sensors and other gadgets and devices is transformed from its raw, unrefined form into a more intelligent form of data cells that is richer in valuable information, which is also a systematic way of minimizing the load of big data [9]. As soon as the data generated by the sensors is collected, it is processed in the local fog, and more relevant data with lower velocity and volume is forwarded to the cloud for global processing and storage. This also helps reduce latency and communication overhead.
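As a rough illustration of how local fog processing lowers the volume forwarded to the cloud, the sketch below collapses fixed-size windows of raw readings into compact summaries. It is a generic windowed aggregation, not the data-cell method of [9]; the function names and the choice of summary statistics are assumptions made for this example.

```python
from statistics import mean
from typing import Dict, Iterable, List


def summarize_window(values: Iterable[float]) -> Dict[str, float]:
    """Collapse one window of raw readings into a small summary record."""
    data = list(values)
    if not data:
        raise ValueError("window must contain at least one reading")
    return {
        "count": len(data),
        "min": min(data),
        "max": max(data),
        "mean": mean(data),
    }


def reduce_stream(readings: List[float], window: int) -> List[Dict[str, float]]:
    """Split a stream into consecutive windows and summarize each one."""
    if window <= 0:
        raise ValueError("window size must be positive")
    return [
        summarize_window(readings[i:i + window])
        for i in range(0, len(readings), window)
    ]
```

With a window of 60 one-second samples, for instance, the fog node would forward one four-field record per minute instead of 60 raw values, at the cost of losing the detail within each window.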

5 Case Studies

This section presents case studies on how data is collected in industries such as automotive, agriculture and healthcare in different environmental settings, and how data collected from individuals is utilized in different scenarios.


5.1 Data Collection in Moving Vehicles

In today's data-centric world, vehicles are being equipped with different kinds of instruments. These instruments or sensors provide data such as GPS location, vehicle speed, video data, the status of various parts of the vehicle and chemical emissions. The data is then used for applications like intelligent transportation systems, emergency response systems, traffic monitoring and pollution analysis. With the development of the vehicle industry and wireless communication technology, vehicular ad-hoc networks (VANETs) were created. The system mainly comprises an onboard unit (OBU), an application unit (AU) and a roadside unit (RSU) [10]. The OBU is generally mounted on a vehicle for exchanging information with RSUs or other OBUs. The AU is a device placed within the vehicle that exchanges information with the network via the OBU, which is responsible for all mobility and networking functions. The RSU is fixed along the roadside or at strategic locations for short-range communication. The RSUs extend the transmission range of the network by forwarding data to other OBUs and RSUs, and provide internet connectivity to the OBUs as well.

VANETs also face various challenges, such as signal fading due to obstacles between two communicating vehicles, bandwidth limitations, connectivity problems due to rapid changes of topology, and inefficient routing protocols. To overcome these challenges, the concept of fog computing is extended to VANETs. This improves the prospects for optimizing the data gathering process. In [11], a data gathering framework based on fog computing (DAFOC) was proposed. Figure 1 shows a three-layer data collection framework and the analytics involved in each layer. The framework adaptively varies the threshold so as to upload suitable amounts of data based on congestion in the network, and suppresses unnecessary message transmissions.

The RSUs in the DAFOC framework behave as fog nodes that provide computation and storage capabilities to the vehicles. The vehicles have sensors that operate in two modes: low-cost sensing (LCS) mode and high-cost sensing (HCS) mode. Nodes in LCS mode perceive the environment and produce data at a fairly slow rate. The RSUs evaluate the confidence of the data and initiate an event validation procedure when the confidence is high. Based on this event check, nodes in HCS mode sense more detailed data about the environment. This detailed data is sent to the RSU for final event verification, from where it may be forwarded to a neighbouring node, to another RSU, or directly to the cloud. The information is processed in the cloud, and the decision or feedback is sent back to the RSU. This method lowers the overall power requirement of the nodes by using each of the two modes only as and when required. It also minimizes unnecessary data upload and transmission, and reduces transmission costs.
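The interplay between the two sensing modes and the congestion-dependent threshold can be sketched as follows. This is a simplified illustration, not the DAFOC algorithm of [11]: the `RoadsideUnit` class, the linear threshold rule, the default threshold values, and the use of the fraction of positive LCS reports as a confidence measure are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RoadsideUnit:
    """RSU acting as a fog node that decides when to request HCS data."""
    base_threshold: float = 0.6   # assumed confidence needed with no congestion
    max_threshold: float = 0.95   # assumed confidence needed at full congestion

    def threshold(self, congestion: float) -> float:
        """Raise the confidence threshold linearly as congestion (0..1) grows."""
        congestion = min(max(congestion, 0.0), 1.0)
        span = self.max_threshold - self.base_threshold
        return self.base_threshold + congestion * span

    @staticmethod
    def confidence(lcs_reports: List[bool]) -> float:
        """Fraction of low-cost-sensing reports that flag the event."""
        if not lcs_reports:
            return 0.0
        return sum(lcs_reports) / len(lcs_reports)

    def should_trigger_hcs(self, lcs_reports: List[bool], congestion: float) -> bool:
        """Ask vehicles for detailed HCS data only when confidence is high enough."""
        return self.confidence(lcs_reports) >= self.threshold(congestion)
```

Under these assumptions the same set of LCS reports may trigger detailed sensing on a quiet network but be suppressed on a congested one, which is the behaviour the framework relies on to limit uploads.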


Fig. 1 Three-layer architecture for data gathering system [11]


5.2 Fog Computing in Industrial Automation

Following the trend towards Industry 4.0 and cyber-physical systems, the industrial Internet of Things (IIoT) has started taking off. The adoption of IIoT is changing industrial automation and leading to greater connectivity among industrial systems. IIoT applications, such as path scheduling for industrial robots and manufacturing monitoring, require real-time processing of the data they collect, and hence computing architectures that support low latency and efficient response. Cloud computing and IoT go hand in hand: the cloud acts as a pathway for the massive amounts of data generated by the IoT. But cloud centres are usually remote, which introduces latency in the transmission of data. To offer low latency and location awareness, fog computing is adopted in IIoT.

A generic framework of fog computing in an industrial environment [12] consists of three layers: the things layer, the fog layer and the distributed cloud layer. All the machinery and equipment used in the industry are part of the things layer. They generate data that has to be managed according to the particular request of a given application. The fog layer includes network devices such as routers and gateways, which process time-sensitive data since they are closer to the things layer. Wired or wireless communication can be used between the fog layer and the things layer. High-end computing servers form the basis of the distributed cloud layer and perform data-intensive computations. This layer uses cellular communication or broadband networks to communicate with the fog layer.

The communication between layers can be performed in numerous ways. In the first method, the IIoT machinery generates data and sends requests to the fog nodes for processing either at the fog nodes or in the cloud. These fog nodes analyze the requests and transmit the results to other fog or cloud nodes.
If the workload is light or real-time processing is required, the data is sent to the fog nodes. Otherwise, it is sent to the cloud. If the data has been sent to the cloud, the cloud nodes process the requests and the results are retrieved by the IIoT nodes. The majority of the data is global information for the industry, which is stored in the cloud for global data sharing. In the second method, unlike the first, the nodes perform tasks individually but ultimately collaborate to complete them in a distributed manner. This method demands optimal task allocation, since the system is distributed over nodes, but it compensates for the limited computational power of fog nodes. The final method is a multi-tier model that contains two fog layers. The fog layers can use the same or different interaction modes, which are either centralized or distributed. Computationally inexpensive tasks are handled in tier-1, whereas computationally demanding tasks can be handled in tier-2. The cloud nodes handle the most demanding tasks.

In the centralized mode, a master node and several fog nodes are present in each domain of the fog layer. The master node receives the estimated request time from each fog node and, based on the network conditions, determines which idle fog nodes should perform the requested task. The distributed interaction mode does not require a master node. Instead, the fog nodes communicate with each other using a distributed communication protocol. The fog nodes maintain certain variables, such as waiting time, which are updated in a condition table, and choose suitable neighbouring nodes for the offloading of tasks. The centralized mode is simple to deploy but has a single point of failure: if the master node fails, the entire system comes to a halt. The distributed mode is not vulnerable to a single point of failure, but the implementation of a distributed protocol is complex, and task distribution is complicated.
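The three decisions described in this section (fog versus cloud, master-node selection, and neighbour selection from a condition table) can be sketched as small Python functions. The function names, the `light_limit` and `max_wait` parameters, and the tie-breaking behaviour are hypothetical choices for illustration and do not come from [12].

```python
from typing import Dict, Optional


def choose_tier(workload: float, needs_real_time: bool,
                light_limit: float = 10.0) -> str:
    """First method: light or real-time work stays in the fog, the rest goes to the cloud."""
    if needs_real_time or workload <= light_limit:
        return "fog"
    return "cloud"


def master_pick(estimated_times: Dict[str, float],
                idle: Dict[str, bool]) -> Optional[str]:
    """Centralized mode: the master picks the idle fog node with the smallest estimate."""
    candidates = {n: t for n, t in estimated_times.items() if idle.get(n, False)}
    if not candidates:
        return None  # no idle node is available in this domain
    return min(candidates, key=candidates.get)


def neighbour_pick(condition_table: Dict[str, float],
                   max_wait: float) -> Optional[str]:
    """Distributed mode: offload to the neighbour with the shortest acceptable waiting time."""
    acceptable = {n: w for n, w in condition_table.items() if w <= max_wait}
    if not acceptable:
        return None  # keep the task local or escalate to another tier
    return min(acceptable, key=acceptable.get)
```

The contrast between `master_pick` and `neighbour_pick` mirrors the trade-off in the text: the first needs a single node with a global view, while the second needs only each node's local condition table.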

5.3 Collection of Data Under Water

Underwater acoustic sensor networks (UASNs) have been widely accepted for underwater data collection. UASNs find applications in ocean disaster prevention, military defence, assisted navigation, monitoring of aquatic life, coastline surveillance and protection, and resource exploration. These applications generate enormous amounts of data such as high-definition video, audio and pictures. However, these networks face several issues arising from the use of acoustic signals, such as low propagation speed, multipath effects and Doppler spread. Therefore, conventional data collection schemes cannot be applied directly to the underwater environment, and the concept of fog computing has been extended to these networks.

In [13], a fog computing-based, four-layer underwater sensor cloud system was proposed, as depicted in Fig. 2. The sensors or nodes in the physical layer have limited storage and computation capacities. They are equipped with antennas and are either anchored to the bottom of the ocean or made to float at a certain depth with the help of buoys. They only sense data and direct the collected information to suitable fog nodes. These fog nodes, in the fog layer, have large storage capacities and are computationally stronger than the nodes in the physical layer. The main purpose of the fog nodes is to perform localised computation on the data received from the physical nodes by discarding redundant data, extracting key information and reducing the dimensionality of the data. The fog nodes, which are generally mobile, then either rise to the surface to deliver the data to nodes in the sink layer, if the data is delay-insensitive, or transmit the data to the sink nodes in multi-hop mode with the help of several other mobile fog nodes at higher levels. The sink nodes in the sink layer transmit the information to the cloud computing centre by radio signals after a data fusion operation.
This novel method helps to reduce communication delay between the nodes and minimize overall energy consumption in the network.
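The fog-node behaviour described above (drop redundant readings, then choose between surfacing and multi-hop relaying) can be illustrated with a short sketch. The `Reading` type, the tolerance-based redundancy test and the mode names are assumptions made for this example; the scheme in [13] may define redundancy differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    sensor_id: str
    value: float
    delay_sensitive: bool


def drop_redundant(readings, tolerance=0.01):
    """Keep a reading only if it differs from the last kept value of the
    same sensor by more than `tolerance` (an assumed redundancy rule)."""
    last_kept = {}
    kept = []
    for r in readings:
        previous = last_kept.get(r.sensor_id)
        if previous is None or abs(r.value - previous) > tolerance:
            kept.append(r)
            last_kept[r.sensor_id] = r.value
    return kept


def delivery_mode(reading):
    """Delay-insensitive data waits until the fog node surfaces;
    delay-sensitive data is relayed hop by hop towards the sink."""
    return "multi_hop" if reading.delay_sensitive else "surface"


readings = [
    Reading("s1", 20.0, False),
    Reading("s1", 20.0, False),  # redundant repeat, dropped
    Reading("s1", 20.5, True),
]
kept = drop_redundant(readings)
modes = [delivery_mode(r) for r in kept]
```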

Fig. 2 Underwater sensor cloud system

5.4 Water Conservation in Agriculture Using Fog and IoT

Water is a basic necessity for the survival of all living things on Earth and is essential for ensuring food security for the world’s inhabitants. Agriculture is the largest consumer of water, accounting for about 70% of freshwater use. The quality of water is important for physical health and fitness; a lack of water or the use of contaminated water can lead to serious health problems. The major causes of water wastage are leakages in pipelines during distribution and the use of a primitive irrigation method, surface irrigation, which drenches areas where no crop can benefit and thereby wastes an enormous amount of water. Hence, a smart water management platform is required to sense the amount of water required by the crops and regulate the flow of water to where it is needed most. The SWAMP project was undertaken to incorporate IoT-based solutions for smart water management by monitoring the field environment and the condition of the crop, and altering the irrigation plan accordingly [14]. The SWAMP system is organized into five layers:


• IoT Services/Sensing Layer: A variety of sensors and actuators are used to acquire data on soil, plants (vegetation index, canopy temperature), weather and precipitation levels. The SWAMP pilots use commercial sensors, such as drones that take images, as well as homemade multiparametric sensor probes [15].
• Data Acquisition, Security and Management: Distributed databases consisting of cloud and fog nodes work in conjunction with each other to deal with the large amounts of data coming from the sensing layer. Protocols and software for data acquisition, along with security, are the focus of this layer.
• Data Analytics: Analysis of big data is performed in the cloud. SWAMP uses existing algorithms and models. A distributed infrastructure of cloud servers and fog nodes is used to make the data available to the upper layers.
• Management of Information: Builds upon application middleware protocols in addition to the data analytics facilities in layer 3. This layer acts as an API for the final layer.
• Application Layer: The collected data is transformed into information that is useful to agriculturalists and presented via user interfaces.

The SWAMP project has ensured that the layers are generic so that they can be adapted to other environmental settings. Layers 1, 2 and 3 are sufficiently generic to be replicable in a variety of settings, whereas layer 4 is fully customisable and may need customisation for every deployment. Methodologies in layer 5 are application-specific. SWAMP is still in its initial stages, though pilots have proven to be successful.
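To make the idea of "regulating the flow of water to where it is needed most" concrete, the sketch below allocates a limited water budget to plots in order of soil-moisture deficit. It is an illustration only: the function name, the moisture scale and the `litres_per_point` conversion factor are hypothetical and are not taken from the SWAMP project.

```python
def plan_irrigation(moisture, target, available_litres, litres_per_point=10.0):
    """Allocate water to the driest plots first.

    moisture maps plot name -> measured soil moisture (arbitrary points);
    target is the desired moisture level; litres_per_point is an assumed
    conversion from moisture deficit to litres.
    """
    deficits = {plot: max(0.0, target - m) for plot, m in moisture.items()}
    plan = {}
    remaining = available_litres
    # Serve the largest deficit first until the budget is exhausted.
    for plot, deficit in sorted(deficits.items(), key=lambda kv: kv[1], reverse=True):
        given = min(deficit * litres_per_point, remaining)
        if given > 0:
            plan[plot] = given
        remaining -= given
    return plan


readings = {"plot_a": 10.0, "plot_b": 30.0, "plot_c": 25.0}
scarce = plan_irrigation(readings, target=30.0, available_litres=150.0)
ample = plan_irrigation(readings, target=30.0, available_litres=230.0)
```

Plots that already meet the target receive nothing, which is the behaviour that surface irrigation lacks.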

5.5 IoT Implementation for Collection of Data Using QR Codes

In this implementation, specific objects are recognized in video and data about them is stored on the internet. The IoT model is built on algorithms from the OpenCV library for Python to identify or recognize the objects, and uses a Raspberry Pi to accumulate the data gathered from the surroundings. The specific hardware and software used are collectively called a module [16]. Every time an object is detected, the module sends a request to the web application. The modules communicate over HTTP, and thus an active internet connection is needed. The web application is responsible for storing, monitoring and controlling the data of the objects.

In recent years, camera monitoring systems have become very common, and it is observed that a substantial amount of the information they capture is not used efficiently. Quick Response codes, better known as QR codes, have the potential to become one of the most attractive ways of recognizing labels or marks, since these widely used 2D bar codes can encode alphanumeric characters carrying a large amount of information and have many possible applications. Each QR code is structured so that it encompasses data, an error-correction code and some orientation patterns. The proposed system can be divided into three layers: the database model, the web API and web application, and the camera module. The system enables the user to keep track of the location of an entity at a specific point in time. Artificial vision is the most prominent part of this IoT implementation; the usual impediments are the quality of the camera, the number of frames per second and the distinctness of the object.

Modules are created by the users to gather data and transmit the required information about the respective objects, so that the information can be shared with all the members of a pool. The pool acts as a reference for the data displayed and for incoming data from the modules; hence the pools are the most important element, since all of the data is stored and registered using reference labels. As a result, the user has up-to-date information about various objects. For example, this technology can be used to identify the vehicles entering and leaving a facility each day. The facility shares the data, and the outcome is available to the owner of the car.
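The module behaviour described above (decode a QR code from a frame, then notify the web application over HTTP) might look roughly like the following. This is a sketch under stated assumptions: the endpoint URL, the payload fields and the `build_detection_request` helper are hypothetical, and the decoder is passed in as a function so the example does not depend on a camera. On a real module the decoder could wrap OpenCV's `cv2.QRCodeDetector().detectAndDecode(frame)`, which returns the decoded text as the first element of its result.

```python
import json
import urllib.request
from datetime import datetime, timezone

API_URL = "http://example.invalid/api/detections"  # hypothetical endpoint


def build_detection_request(frame, decode, module_id, now=None):
    """Return an HTTP POST request for a detected QR label, or None.

    decode is any callable that takes a frame and returns the decoded
    text ("" when no code is visible). The request is only built here;
    sending it is left to the caller (e.g. urllib.request.urlopen).
    """
    text = decode(frame)
    if not text:
        return None
    now = now or datetime.now(timezone.utc)
    payload = {"module": module_id, "label": text, "seen_at": now.isoformat()}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Stand-in decoder used for illustration instead of a real camera frame.
request = build_detection_request(
    frame=b"", decode=lambda f: "CAR-42", module_id="gate-1",
    now=datetime(2020, 1, 1, tzinfo=timezone.utc),
)
```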

5.6 Indoor Air Quality Monitoring Using IoT and Fog

With the rapid growth of industrialisation and urbanization, the level of pollution in the environment has been increasing rapidly. This pollution affects the day-to-day life and quality of living of individuals. Environmental pollution can take the form of air, water or land pollution. Air pollution stands out as one of the more prominent forms, as it affects people’s health on a larger scale. People today spend more time indoors (homes, offices, work-related areas) than outdoors. The rising demand for automation at home and in offices has increased the usage of electronic devices, which have been found to emit many harmful gases and radiations. Hence, indoor air quality monitoring has become of utmost importance.

An IoT-enabled indoor air quality monitoring system was proposed in [17]. The IoT device/sensor samples the air at constant intervals of time. The data is transferred to an intermediate node via Bluetooth. This intermediate node then communicates with the processing node through Wi-Fi. An alarm is raised whenever the pollution level rises beyond a certain limit. This idea was further improved in [18], where an air monitoring device was developed with an underlying architecture based on the concepts of fog computing. The system took a three-layered approach, consisting of the sensing layer, network layer and application layer. The layers communicated with each other through Wi-Fi. The sensing layer was the backbone of the entire system. It was used to sense the air quality and was deployed over a wide area. The sensing module comprised the sensor block, processing unit, communication module and power module. The sensor block consisted of various sensors to detect the different types of pollutants, and the processing unit was used to process the raw data from the sensors.

The fog computing layer gathered the data from all the sensing nodes and passed it to the application layer after the necessary processing. The application layer received data from the fog device and provided data visualization through either a website or a mobile application. The entire model was designed with the aim of minimizing redundant network traffic, reducing overall power consumption and reducing the computational burden on the sensing nodes. From the large volume of data collected and processed, useful messages were generated and sent to the people within the confined spaces to raise alerts and create awareness.
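A minimal sketch of how a fog device could reduce traffic and raise alerts is given below: instead of forwarding every raw sample, it forwards one average per sensing node per time window and flags nodes whose samples exceeded a limit. The function name, the window structure and the limit value are assumptions for illustration; the systems in [17, 18] may process the data differently.

```python
from statistics import mean


def summarise_window(samples_by_node, limit):
    """Summarise one time window of pollutant samples.

    samples_by_node maps a sensing node id to the raw samples it reported
    in the window. Returns (summary, alerts): summary holds one average
    per node (the only data sent on to the application layer), alerts
    lists nodes where any sample exceeded `limit`.
    """
    summary = {}
    alerts = []
    for node, samples in samples_by_node.items():
        if not samples:
            continue  # nothing reported by this node in the window
        summary[node] = round(mean(samples), 2)
        if max(samples) > limit:
            alerts.append(node)
    return summary, alerts


window = {"room-a": [10.0, 12.0, 14.0], "room-b": [40.0, 60.0], "room-c": []}
summary, alerts = summarise_window(window, limit=50.0)
```

Here five raw samples are reduced to two values, while the alert for `room-b` is still raised from the raw data before it is discarded.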

5.7 Emotional Profiles

With the passage of time, people have become more aware of health and fitness, and many new models have been proposed and developed to capture the affective state of individuals. More detailed, calibrated 24/7 monitoring helps to delve deeper into the internal and external worlds experienced by human beings [19]. Data such as physiological signals, speech and facial expressions help to observe and understand affective states [20]. This category of data can be accumulated effortlessly with the help of the sensors embedded in today’s smart devices such as smartwatches, smart glasses, motion or activity trackers and smartphones. Because of the widespread popularity of these devices, there is an opportunity to develop unconventional techniques to recognize and analyze the states experienced by the subject and the reasons they occur. Collecting and analyzing data of this class can also help to identify the cause of various emotions and, based on the outcome, to suggest actions for dealing with them.

These devices (smartphones, laptops, etc.) are recognized as IoT nodes and are interconnected through the internet and other technologies such as radio frequency identification (RFID) or wireless sensor networks. Healthcare services are ubiquitous and still offer scope for research and development. With so many devices involved, it is important to collect data while remaining energy-efficient and cost-effective. One platform providing such energy-aware services is based on the REST (Representational State Transfer) architecture. The energy utilization of the devices can be minimized by logically deduplicating the sensors available across the IoT devices and by regulating the rate of data transfer based on an estimate of the energy left in the devices.

The proposed platform was designed to reduce the energy consumption of the IoT devices and consists of three components:

• Smart Devices: Data collection applications must be installed on every IoT node (smart device) to collect data related to the emotional state of people and send it to fog nodes for further analysis.
• Central Nodes: These nodes act as hubs and enable the link between the RESTful web services and the smart devices active in a smart environment. One of the smart devices itself, a personal computer or a dedicated device can take up the role of a central node.
• RESTful Web Services: The biggest advantage of these web services is that they are sustainable and scalable.

Figure 3 is a flowchart displaying the three essential components of the proposed architecture, and the web services essential for a complete implementation of the model are listed in Table 1.

Fig. 3 Proposed architecture of the data collection model in fog data analytics [20]
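Two of the services listed in Table 1, Sensor Deduplication and Energy Consumption Management, lend themselves to a short sketch. Table 1 states only that data is collected from one of the devices sharing a sensor and that the monitoring rate is directly proportional to the battery level; choosing the device with the highest battery level and the `max_rate_hz` parameter are assumptions added for this example.

```python
def deduplicate_sensors(devices):
    """Pick one device per sensor type.

    devices maps device id -> (battery_level in 0..1, set of sensor types).
    When several devices offer the same sensor, the one with the highest
    battery level is chosen (an assumed tie-breaking policy).
    """
    chosen = {}
    for device_id, (battery, sensors) in devices.items():
        for sensor in sensors:
            current = chosen.get(sensor)
            if current is None or battery > devices[current][0]:
                chosen[sensor] = device_id
    return chosen


def monitoring_rate(battery_level, max_rate_hz=1.0):
    """Monitoring rate directly proportional to the battery level."""
    level = min(max(battery_level, 0.0), 1.0)  # clamp to the valid range
    return max_rate_hz * level


devices = {
    "watch": (0.3, {"heart_rate", "accelerometer"}),
    "phone": (0.8, {"accelerometer", "microphone"}),
}
assignment = deduplicate_sensors(devices)
```

With this assignment the watch stops sampling its accelerometer, since the phone already provides the same data with more battery to spare.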

5.8 Health Monitoring System Using Fog Computing

Healthcare is crucial to a country’s economic development and prosperity. The rise in falls, accidents and emergencies requires hospitals to diagnose, treat and manage all of these cases. It is becoming increasingly difficult to monitor whether patients are complying with their treatment plans and to safeguard them during attacks. To remedy this situation, wireless sensor networks are being widely adopted in the field of health informatics.


Table 1 Required web services provided by the RESTful architecture

Device Discovery: The device discovery service receives a notification from the registered device through the central node and, upon verification, a link is established to transfer the collected data from the device to the next layer.

Device Registration: A new device, when registered, is provided with its credentials and the type of device. The device registration system adds the newly registered device details to the list maintained by the central node.

Sensor Deduplication: The sensor deduplication service detects smart devices with the same sensors and then collects data from only one of them to filter out the redundant data. The central node maintains a list of sensors that contribute data to the cloud.

Energy Consumption Management: The Energy Consumption Management service sets the monitoring rate of the smart devices according to the battery life of the device. The monitoring rate of the device is directly proportional to its battery level.

Sensor Data Collection: The data generated by the sensors is collected and transferred to the sensor data collection service, which integrates all the information in its database.

Affective State Profile: Data is retrieved from the database by the Affective State Profiler and then used to construct the emotional profiles of the subjects.

Patients wear wireless sensors that allow the hospital to monitor several health parameters remotely while the patients remain in the comfort of their homes. In [21, 22], a context-sensitive fog computing environment was proposed to improve the state of current healthcare systems. The wireless accessories attached to a patient generate huge amounts of data, which may be useful or redundant, and the data collected varies among patients. Cloud computing can cause a delay in data transfer from sensor to cloud to hospital; to reduce this delay, a distributed architecture such as fog computing is required.

A three-tier architecture consisting of cloud, fog and sensor layers has been proposed. The sensors are wearable or non-wearable devices such as smartwatches, smart glasses or smartphones. Depending on the context, the data gathered can be intrinsic, such as blood pressure and blood glucose levels, or extrinsic, such as the temperature at the patient’s physical location. The collected real-time data is sent to the fog computing layer for analysis and aggregation. The data is distributed among various fog nodes for efficient computing, where duplicate data is detected and filtered out. The data is then fused into a single entity and, if further analysis is required, sent to the cloud computing tier. The various health monitoring systems perform actions based on this data and supervise the actions taken by the fog computing layer. The introduction of a fog layer into the cloud computing network reduces the security risk to patient data and helps prevent loss due to a data centre failure. The implementation of fog in health monitoring systems is still in the early stages of research and development, but studies indicate that it has a promising future.
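The duplicate filtering and fusion step performed in the fog tier can be illustrated as follows. This is a sketch under assumptions: readings are modelled as `(patient, parameter, timestamp, value)` tuples, a duplicate is a repeated `(patient, parameter, timestamp)` key, and fusion keeps the most recent value of each parameter. The cited work may use a different fusion rule.

```python
def fuse_readings(readings):
    """Drop duplicate readings and fuse the rest into one record per patient.

    readings is an iterable of (patient_id, parameter, timestamp, value)
    tuples, possibly arriving from several fog nodes. The result maps
    patient_id -> {parameter: latest value}.
    """
    seen = set()
    latest = {}
    for patient, parameter, timestamp, value in readings:
        key = (patient, parameter, timestamp)
        if key in seen:
            continue  # the same measurement was already received
        seen.add(key)
        record = latest.setdefault(patient, {})
        if parameter not in record or timestamp > record[parameter][0]:
            record[parameter] = (timestamp, value)
    # Strip timestamps so each patient ends up with a single flat record.
    return {
        patient: {param: value for param, (_, value) in record.items()}
        for patient, record in latest.items()
    }


incoming = [
    ("p1", "blood_pressure", 1, 120),
    ("p1", "blood_pressure", 1, 120),  # duplicate from a second fog node
    ("p1", "blood_pressure", 2, 125),
    ("p1", "glucose", 1, 5.4),
    ("p2", "blood_pressure", 1, 110),
]
fused = fuse_readings(incoming)
```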

5.9 Collecting Data Related to Elderly Behaviour

According to various censuses and surveys, a large shift is expected in the proportion of people aged over 60, from 11% in 2019 to 17% in 2030, a little more than a decade later. It is therefore necessary to develop effective models to make the lives of older people smoother, more relaxed and better supported in the coming years. Elderly people are vulnerable to both cognitive and physical conditions, such as mild cognitive impairment (MCI) and infirmity [23]. MCI gives rise to significant cognitive changes that can be noticed in normal conversation, social relationships and daily behaviour. If healthcare systems do not adapt to these changing requirements, social and economic challenges may arise around the globe. One way to address this issue is to avoid treating an elderly person merely as an entity with special requirements; what should be taken into consideration is the person’s way of living and the relationships they maintain with colleagues, family and friends.

The vast amount of data accessible today can aid the decision-making process and create smart territories at the service of their occupants. Information and Communication Technologies (ICTs) play an important part by sensing data from the physical domain, managing it and putting it to use, with the purpose of providing compatible modules upon which new value-added services and personalized assistance can be built. The problem is addressed by establishing infrastructures based on ICTs to observe the behaviour of ageing people and to provide corrective measures once the data has been analyzed, with the aim of promoting their independent living.

Numerous technologies are used for observing elderly people’s behaviour, ranging from wearable devices to wireless sensor networks, vision systems, portable interactive devices, Bluetooth low energy beacons and augmented reality. Data can also be accumulated using high-end sensor devices, but these are not affordable and can hinder large-scale implementation of the data collection model. It is therefore important to consider the cost of the sensing devices for large-scale implementation and to ensure that data flows unobtrusively. The framework designed for this IoT implementation can be divided into two layers: the first layer collects and manages the data for both indoor and outdoor activities, and the second layer analyzes this data and triggers the appropriate measures. The foremost requirement for the system is to precisely regulate the flow of the data collected from heterogeneous data sources and then assess in which aspects the behaviour of the elderly varies or remains consistent across indoor and outdoor activities. The optimized data obtained as the output will be the primary data for future work.


5.10 Telehealth Big Data Through Fog Computing

The data gathered by sensors, especially wearable sensors in healthcare and biomedical applications, is already enormous and will continue to grow rapidly. Hence there is a need for smart information gathering, data storage, data reduction and data analytics at the fog layer and then at further levels [24]. The embedded computer gathers the sensed data as time series, analyzes it and tries to find similar patterns in the collected data. The unique patterns are transmitted, and the embedded computers extract the relevant information that is forwarded to the cloud.

Today, wearable medical sensors such as ECG and activity monitors, when positioned on the human body, allow unobtrusive 24/7 collection of data for health monitoring. There are also many autonomous devices around the human body, including at home and in the office, that collect real-time data and feed telehealth interventions, which help to make healthcare more affordable and raise awareness of one’s own health. This is a standard example of a big data application, in which huge amounts of data with real-time information must be processed quickly to provide optimum healthcare. One of the difficulties faced by healthcare organizations is integrating information-sensing nodes into body sensor networks (BSNs), which is now facilitated by wearable technology devices such as activity trackers, smartwatches and belt-worn personal computers. The energy efficiency of the edge devices used in telehealth is also a matter of concern. To provide continuous monitoring of patients, BSNs operate on batteries, and thus low power consumption must be one of the priorities for BSNs.

Data storage and data transmission generally consume a lot of energy, so it is preferable to process the data quickly and filter out unwanted data, which consumes storage space and energy when it is transmitted from one layer to another. Fog computing architectures are extremely useful in achieving such objectives: they can perform onsite data analytics to prevent unwanted data from being stored and transmitted to the cloud. Some efficient telehealth systems are:

(1) Philips has developed a device for COPD (chronic obstructive pulmonary disease) patients, an adhesive patch that continuously gathers information about attributes such as respiratory function, heart rate and physical activity [25].
(2) Philips is also developing a device that can be controlled from iPhones and iPads and is intended to relieve the persistent pain suffered by many people across the globe [26].
(3) EchoWear is a smartwatch-based system that collects approximately 100 Mb of data per day from each person undergoing intensive speech therapy to improve their communication effectiveness [27].
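The idea of transmitting only unique patterns from a time series can be sketched as follows. This is a simplified stand-in for the pattern analysis described in [24]: the series is cut into fixed-size windows, each window is reduced to a coarse shape signature, and a window is forwarded only if its signature has not been seen before. The window size, the quantization step and the signature itself are assumptions chosen for the example.

```python
def unique_patterns(series, window=4, step=0.5):
    """Return the windows of `series` whose shape has not occurred before.

    Each non-overlapping window is described by its values relative to the
    first sample, quantized in units of `step`. Windows with an already
    seen signature are treated as redundant and not transmitted.
    """
    seen = set()
    to_send = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        base = chunk[0]
        signature = tuple(round((x - base) / step) for x in chunk)
        if signature not in seen:
            seen.add(signature)
            to_send.append((start, chunk))
    return to_send


signal = [1.0, 2.0, 3.0, 2.0,   # rise-and-fall pattern
          5.0, 6.0, 7.0, 6.0,   # same shape at a different offset
          1.0, 1.0, 1.0, 1.0]   # flat pattern
sent = unique_patterns(signal)
```

Only two of the three windows are forwarded, which reflects how onsite analytics can reduce the data stored and transmitted to the cloud.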


5.11 BLE-Based Data Collection

Data gathering is one of the most vital aspects of implementing IoT. Because of the soaring amount of data being generated every day, which is only going to multiply, it is crucial to focus on efficient ways of collecting data. Many proposed systems base their underlying architecture on Bluetooth Low Energy (BLE) technology to send data from smart devices, phones or gadgets, also referred to as data collectors, to the fog nodes or hubs for analysis [28]. Table 2 compares the attributes of some similar technologies currently in use.

Smart objects are regarded as the most elementary unit of the Internet of Things (IoT) vision. The idea is to connect the devices and objects in our surroundings to each other and bring together the data they collect at any time, from any place and through any course of action. According to data provided by IoT Analytics, the number of internet-connected devices currently active in the world is estimated to be around 7 billion, a little less than the world population, and this number will increase sharply as internet connectivity and consumption grow and new devices and gadgets are introduced to the market. IoT devices have many applications in domains such as medicine, industry and agriculture. The flow of information can be generalized into three phases: the collection phase, where data is collected through IoT devices; the transmission phase, where the information is filtered by the fog nodes in the fog layer and then forwarded to the cloud; and the management, processing and utilization phase. Data collection is a set of techniques and technologies used to sense the real environment and gather the existing data.
Smartphones are no longer used only for conventional tasks such as sending text messages or making calls; they are also used to sense the environment around them. Specialized sensors in smartphones, as shown in Fig. 4, have been a major addition in recent times, including the accelerometer, compass, gyroscope, orientation, GPS, proximity, gravity, barometer and light sensors, alongside commonly used sensors such as the microphone and camera. Besides sensors, smartphones also have RFID, NFC and BLE in their arsenal. Bluetooth Low Energy (BLE), also referred to as Bluetooth Smart, was first introduced in 2010 as part of Bluetooth Core Specification version 4.0. It is a technology used to facilitate short-range wireless communication. The main aim of this technology is to enable transceivers with lower complexity, lower power consumption, longer lifetime and lower operating costs than existing transceivers. It is an extension of the existing Bluetooth technology that permits small battery-powered devices such as sports sensors, digital watches and smartwatches, wireless keyboards, etc. to communicate with each other. Even though BLE has comparatively low transmission power and range, devices that operate on BLE can stay functional for many years because of its highly efficient, low-power idle mode. A major drawback of BLE is that it allows data transmission only over a single hop; however, many mechanisms are emerging to tackle this problem using the fundamental components of the BLE stack.

Table 2 Some specifications of existing low power technologies

Low power technology | Standard | Transmission range | Expected lifetime | Rate of data flow
BLE | IEEE 802.15.1 (V4) | 8–10 | 1–2 years | 1–1.5 Mbps
Bluetooth | IEEE 802.15.1 | 10–50 | Days–months | 1–3 Mbps
Wi-Fi | IEEE 802.11 b/g | 90–100 | Several hours | 10–11 Mbps
ZigBee | IEEE 802.15.4 | 90–100 | 4 months–2 years | 20–250 kbps

Fig. 4 Sensors present in the smart devices [28]
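As a small illustration of how the figures in Table 2 might guide the choice of a link technology, the sketch below encodes the upper bounds of the range and data-rate columns and filters them against an application's requirements. The `Tech` type and the `candidates` function are hypothetical helpers; the range values are used exactly as listed in Table 2, which does not state their unit.

```python
from typing import NamedTuple


class Tech(NamedTuple):
    name: str
    range_max: float      # upper end of the Table 2 range column (unit not stated there)
    rate_max_kbps: float  # upper end of the data rate column, in kbps


# Upper bounds taken from Table 2 (1.5 Mbps = 1500 kbps, and so on).
TABLE_2 = [
    Tech("BLE", 10, 1500),
    Tech("Bluetooth", 50, 3000),
    Tech("Wi-Fi", 100, 11000),
    Tech("ZigBee", 100, 250),
]


def candidates(min_range, min_rate_kbps):
    """Names of technologies whose best-case range and rate meet the needs."""
    return [
        t.name
        for t in TABLE_2
        if t.range_max >= min_range and t.rate_max_kbps >= min_rate_kbps
    ]


long_range = candidates(min_range=60, min_rate_kbps=100)
short_fast = candidates(min_range=5, min_rate_kbps=1000)
```

A real selection would also weigh the expected lifetime column, which is where BLE and ZigBee compare favourably with Wi-Fi.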

5.12 Safety Management System for Miners

Mining contributes more than 2.4% of the GDP of the country. It is a source of employment for a large section of people, especially those who reside close to the mines. History shows that even the smallest mistake or any form of negligence can prove lethal in the mining industry, as in the Dhanbad and Chasnala disasters in India. Most fatalities occur in the coal mines of India. This makes it extremely important to ensure the safety of the mineworkers, for which it is vital to know their live location and the environment they are working in. Cloud computing can be useful, but the existing state-of-the-art technologies and architectures are unable to deliver effective results. Hence many new architectures or models are being proposed, such as integrating a fog layer to make the system more reliable, dynamic and efficient [29].

Miners are constantly exposed to a dangerous and harsh environment where toxic gases and dust have adverse effects on their health. Several proposals include communication with the help of sensors that determine parameters such as methane gas level, oxygen level and water level, but they have their own limitations. Companies have even tried to set up message centres and deploy trained robots for inspection and other automation activities, but the practice was discontinued because it was not cost-effective. Thus, it is important to make the mineworkers aware of life-threatening situations and to guide them to safe evacuation.

One recently proposed architecture supervises the work environment and the movement of the miners in the minefields. A GSM layer has also been added for better navigation and monitoring of the miners’ actions. The sensors used for determining the levels of carbon dioxide, carbon monoxide and hydrogen sulfide in the coal mines are the K-30, MQ-7 and NTMOS H2S gas sensors, respectively. The data from these sensors is collected and forwarded through the sensor nodes, which create a path to the fog gateways, and the fog nodes then handle the data as required. The fog layer gives an edge to the proposed architecture by reducing latency in times of emergency. The fog layer analyzes the data and notifies the worker and the control room if the values of the parameters reach the threshold limits; otherwise it sends the data to the cloud for storage. The data stored in the cloud can be used for drills, research and visualization purposes. Adding more sensors, such as pressure sensors, and adding routes for fast and safe evacuation at the time of a disaster will improve the efficiency and performance of the architecture.
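The decision made by the fog layer (alert the worker and the control room when a threshold is reached, otherwise store the sample in the cloud) can be sketched as below. The threshold numbers are placeholders chosen only to make the example runnable; they are not safety limits and are not taken from [29], and the field names are hypothetical.

```python
# Placeholder thresholds for illustration only; NOT real safety limits.
THRESHOLDS = {"co2_ppm": 5000.0, "co_ppm": 50.0, "h2s_ppm": 10.0}


def handle_sample(sample, thresholds=THRESHOLDS):
    """Decide what the fog node does with one set of gas readings.

    sample maps a gas name to its measured level. If any level reaches its
    threshold, the worker and the control room are notified; otherwise the
    sample is simply forwarded to the cloud for storage.
    """
    exceeded = sorted(
        gas for gas, level in sample.items()
        if gas in thresholds and level >= thresholds[gas]
    )
    if exceeded:
        return {"action": "alert", "notify": ["worker", "control_room"], "gases": exceeded}
    return {"action": "store_in_cloud", "notify": [], "gases": []}


normal = handle_sample({"co2_ppm": 800.0, "co_ppm": 5.0, "h2s_ppm": 0.5})
danger = handle_sample({"co2_ppm": 900.0, "co_ppm": 75.0, "h2s_ppm": 12.0})
```

Because the check runs on the fog node rather than in the cloud, the alert does not depend on a round trip to a remote data centre.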

5.13 Healthcare 4.0 Healthcare industry has swiftly and rapidly grown after the evolution of the World Wide Web (WWW), in the last three decades. Since inception, Health 2.0 (the mid2000s) and Health 3.0 have already been implemented wherein patients and the doctors are able to use basic tools for education, spreading awareness, disseminate elementary health-related guidelines and eventually go on to create medical records of the patients and store them for future references. These models turned out to be very convenient and efficient for the health experts, researchers, hospitals and its other users. With the escalating number of medical records and the information stored on a daily basis, it was crucial to introduce Cloud Computing in the healthcare industry to supplement the processing and storing of information more effectively. Sensors

Data Collection in Fog Data Analytics Table 3 Explanation of the acronym SCALE

99

Fog Computing Security—Data is processed by various fog nodes in a distributed system Cognition—Assimilate the information and filter the relevant data that should be forwarded to the cloud Agility—Although the computational capability is low but helps in getting an instant response Latency—Nodes are geographically proximate to the users, thus helpful in speedy and swift responses Efficiency—There is no loss of connection thus it is both power and process efficient

and wearable medical devices have developed with the advancing technologies and thus have become more reliable and safer to use, as a result of which lots and lots of data is being generated everyday and processed and stored in the cloud. Although cloud computing mitigates the problems faced regarding the storage and processing it still has its own limitations such as the bidirectional transfer of data between the user and the cloud is not smooth and fast enough, this impediment can lead to lifethreatening situations and other blunders in the healthcare industry. This is where fog computing can be useful, it can facilitate the job of cloud computing in cases of medical emergencies. Fog Computing gives us an edge over cloud computing in terms of low latency, i.e. since fog nodes or gateways are near to the primitive devices, hence the overhead diminishes; and resilient, i.e. it helps to retrieve the lost data and also detect the anomalies in the link connections [30]. Fog Computing is also often referred to as SCALE, described in Table 3. Different architectures are being proposed that give us an intuition of how fog computing can immensely contribute to the Healthcare 4.0. One such three-layer architecture proposes the transfer of information between the Medical device layer, Fog layer and Cloud layer as per requirements. The first layer, i.e. the Medical device layer is the fundamental layer which involves collecting huge amounts of data through the different sensors, smart devices and wearable medical devices such as smartwatches and glasses. The IoT devices can further be divided into devices that digitally monitor the patient’s health and wellness such as ECG, glucose, haemoglobin and devices that record parameters based on the surroundings such as the number of steps walked, calories burnt, etc. The second layer is the fog layer which consists of several computational nodes that function on low power and are capable of giving high performance. 
One of the most valuable benefits of fog computing is its quick response when processing real-time data. The fog nodes are placed geographically near the medical devices and sensors, so data that needs to be processed instantly is sent to the fog nodes for a prompt response and the rest of the data is forwarded to the cloud. The third and final layer in this architecture is the cloud layer, which comprises large high-performance data centres that operate on the data collected by the sensors and medical devices and filtered by the fog layer. The cloud layer is also responsible for storing the medical records and other information, which can be accessed by both medical practitioners and patients. Further optimisations are required to implement the architecture in the near future and make it more scalable.
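The split between data handled at the fog layer and data forwarded to the cloud can be illustrated with a minimal sketch. The `Reading` type, the `URGENT_KINDS` set and the function names are hypothetical and chosen only for illustration; the architecture in [30] does not prescribe which readings count as urgent.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    patient_id: str
    kind: str      # e.g. "ecg", "glucose", "steps"
    value: float


# Assumption: which reading kinds are latency-critical is a deployment choice.
URGENT_KINDS = {"ecg", "glucose"}


def route(reading):
    """Return the layer that should process the reading first."""
    return "fog" if reading.kind in URGENT_KINDS else "cloud"


def dispatch(readings):
    """Group readings by the layer that will handle them."""
    queues = {"fog": [], "cloud": []}
    for reading in readings:
        queues[route(reading)].append(reading)
    return queues
```

In this sketch an ECG sample would be queued for the nearby fog node, while a step count would go straight to the cloud queue.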

5.14 Comparison on Case Studies

Table 4 compares the case studies presented above. The general trend is that fog computing proves advantageous over plain cloud computing schemes. Fog computing provides better data security and privacy, reduces latency, supports quick decisions and reduces transmission costs. It also allows efficient real-time analysis and quicker transmission, and it can filter or suppress unwanted data. The boxes shaded in grey show the presence of the specific layer in the application mentioned.

Table 4 Comparison of different case studies

| Application | Node/Edge Layer | Fog Layer | Cloud Layer | Characteristics | Technologies Used |
|---|---|---|---|---|---|
| Moving Vehicles without Fog Devices [10] | | | | Causes signal fading and inherent bandwidth limitations | VANET architecture with OBU, AU and RSU |
| Moving Vehicles with Fog Devices [11] | | | | Suppresses redundant data upload and reduces transmission costs | DAFOC architecture with fog nodes operating in LCS and HCS modes |
| Industrial Automation without Fog Devices [12] | | | | Latency in transmission of data to data centers | Presence of remote cloud centres |
| Industrial Automation with Fog Devices [12] | | | | Provides data security and positional cognizance of systems | Multiple routers and gateways with wired or wireless means of communication |
| Underwater Data Collection without Fog Devices [13] | | | | Slow propagation speed, multi-path effect and noise in signals; limited storage capabilities | Underwater Acoustic Sensor Network model |
| Underwater Data Collection with Fog Devices [13] | | | | Extracts key information, performs localised computation, reduces communication delay | Mobile fog nodes that transmit data via multi-hop mode or floating to sink layer |
| Water Conservation in Agriculture using Fog [14] | | | | Analysis of large amounts of data in an efficient manner | SWAMP architecture with 3 generic layers and 2 application specific layers |
| Indoor Air Quality Monitoring using IoT [17] | | | | Allows for collection of data with no real time analysis or security | IoT devices to sample the air at constant intervals of time |
| Indoor Air Quality Monitoring using Fog [18] | | | | Data visualisation and real time alerts while reducing network traffic and power consumption | Three layer fog model with complex IoT devices to sample different types of pollutants in air |
| Health Monitoring System with IoT and Cloud [21] | | | | Delay in data transfer from sensor to cloud and back, and partially redundant data | Basic wireless sensors for collection of data |
| Health Monitoring System with IoT and Fog [22] | | | | Filtering of duplicate data, minimizes security risks and prevents data loss | Context sensitive fog computing environment with smart devices |
| Tele-health Big Data without Fog [24] | | | | Low energy efficiency and consumes more space/storage | Nested computers gather time series data |
| Tele-health Big Data with Fog [24] | | | | Reduces unwanted data and improves communication speeds | Energy efficient body sensor networks with smart wearable devices |
| Healthcare 4.0 with Cloud [29] | | | | Bi-directional transfer of data is inefficient | Sensors and wearable devices |
| Healthcare 4.0 with Fog and Cloud [29] | | | | Diminished overheads, low latency and retrievability of data | Three layer fog model which transfers data as per application requirements |

6 Conclusions

The drawbacks of cloud computing, such as latency, security and privacy, together with the explosion in the number of IoT devices over the past decade, led to the rise of fog computing. Fog brought forward a new era in the world of technology. In this chapter, state-of-the-art methods of data collection were presented along with several case studies that include summarized descriptions of data collection methods in various fog computing applications. The optimized collection of useful data was essential in all of the fog applications, and there was a common theme among the methods used. Fog nodes were deployed over a large area, and the data received from the IoT devices was analyzed by these fog nodes and processed according to its significance. Redundant data was filtered out and, depending on the workload, the computation was either performed on the fog node itself or sent by the fog node to the cloud for further evaluation. The processed data was then sent back to the IoT devices according to the requested service. Case studies in a wide range of fields show the importance and emergence of fog computing as a supplement to cloud computing. Fog has been widely adopted in fields such as autonomous vehicles and healthcare, while it is still in its infancy in fields such as agriculture, air quality and emotional profile monitoring systems. Fifteen case studies were discussed in this work, covering Moving Vehicles, Industrial Automation, Underwater Data Collection, Water Conservation in Agriculture, Indoor Air Quality Monitoring, Health Monitoring System, Telehealth Big Data and Healthcare 4.0, all related to data analytics incorporating Cloud, Fog and IoT. In summary, the purpose of this chapter was to provide useful insights into the methods of data collection in the fog ecosystem, encapsulating recent research trends in data collection, and to extend the scope of research and implementation into new and exciting domains of fog computing as it becomes part and parcel of the cloud computing ecosystem.

References

1. Wang, F., Liu, J.: Networked wireless sensor data collection: issues, challenges, and approaches. IEEE Commun. Surv. Tut. 13(4), 673–687 (2010)
2. Chen, S., Du, L., Wang, K., Lu, W.: Fog computing based optimized compressive data collection for big sensory data. In: 2018 IEEE International Conference on Communications (ICC), pp. 1–6. IEEE (2018)
3. Zhu, T., Wang, X., Cheng, S., Cai, Z., Li, J.: Critical point aware data acquisition algorithm in sensor networks. In: International Conference on Wireless Algorithms, Systems, and Applications, pp. 798–808. Springer, Cham (August 2015)
4. Cheng, S., Cai, Z., Li, J., Gao, H.: Extracting kernel dataset from big sensory data in wireless sensor networks. IEEE Trans. Knowl. Data Eng. 29(4), 813–827 (2016)
5. Dong, M., Ota, K., Liu, A.: RMER: Reliable and energy-efficient data collection for large-scale wireless sensor networks. IEEE Internet of Things J. 3(4), 511–519 (2016)
6. Liu, F., Wang, Y., Lin, M., Liu, K., Wu, D.: A distributed routing algorithm for data collection in low-duty-cycle wireless sensor networks. IEEE Internet of Things J. 4(5), 1420–1433 (2017)
7. Li, S., Da Xu, L., Wang, X.: Compressed sensing signal and data acquisition in wireless sensor networks and internet of things. IEEE Trans. Industr. Inf. 9(4), 2177–2186 (2012)
8. de Souza, J.C.S., Assis, T.M.L., Pal, B.C.: Data compression in smart distribution systems via singular value decomposition. IEEE Trans. Smart Grid 8(1), 275–284 (2015)
9. Hosseinpour, F., Plosila, J., Tenhunen, H.: An approach for smart management of big data in the fog computing context. In: 2016 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), pp. 468–471. IEEE (Dec 2016)
10. Al-Sultan, S., Al-Doori, M.M., Al-Bayatti, A.H., Zedan, H.: A comprehensive survey on vehicular ad hoc network. J. Netw. Comput. Appl. 37, 380–392 (2014)
11. Lai, Y., Zhang, L., Wang, T., Yang, F., Xu, Y.: Data gathering framework based on fog computing paradigm in VANETs. In: Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint Conference on Web and Big Data, pp. 227–236. Springer, Cham (July 2017)
12. Ramli, M.R., Bhardwaj, S., Kim, D.S.: Toward reliable fog computing architecture for industrial internet of things (2019)
13. Yu, H., Yao, J., Shen, X., Huang, Y., Jia, M.: Data collection scheme for underwater sensor cloud system based on fog computing. In: International Conference on Security, Privacy and Anonymity in Computation, Communication and Storage, pp. 149–159. Springer, Cham (July 2019)
14. Kamienski, C., Soininen, J.P., Taumberger, M., Fernandes, S., Toscano, A., Cinotti, T.S., Neto, A.T.: SWAMP: an IoT-based smart water management platform for precision irrigation in agriculture. In: 2018 Global Internet of Things Summit (GIoTS), pp. 1–6. IEEE (June 2018)
15. Kamienski, C., Soininen, J.P., Taumberger, M., Dantas, R., Toscano, A., Salmon Cinotti, T., Torre Neto, A.: Smart water management platform: IoT-based precision irrigation for agriculture. Sensors 19(2), 276 (2019)
16. Ibañez, J.F., Castañeda, J.E.S., Santos, J.C.M.: An IoT camera system for the collection of data using QR code as object recognition algorithm. In: 2018 Congreso Internacional de Innovación y Tendencias en Ingeniería (CONIITI), pp. 1–6. IEEE (Oct 2018)
17. Firdhous, M.F.M., Sudantha, B.H., Karunaratne, P.M.: IoT enabled proactive indoor air quality monitoring system for sustainable health management. In: 2017 2nd International Conference on Computing and Communications Technologies (ICCCT), pp. 216–221. IEEE (Feb 2017)
18. Idrees, Z., Zou, Z., Zheng, L.: Edge computing based IoT architecture for low cost air pollution monitoring systems: a comprehensive system analysis, design considerations & development. Sensors 18(9), 3021 (2018)
19. Swan, M.: Sensor mania! the internet of things, wearable computing, objective metrics, and the quantified self 2.0. J. Sens. Actuator Netw. 1(3), 217–253 (2012)
20. Ortega, M.G.S., Rodriguez, L.F., Gutierrez-Garcia, J.O.: Energy-aware data collection from the Internet of Things for building emotional profiles. In: 2018 Third International Conference on Fog and Mobile Edge Computing (FMEC), pp. 234–239. IEEE (April 2018)
21. Kraemer, F.A., Braten, A.E., Tamkittikhun, N., Palma, D.: Fog computing in healthcare–a review and discussion. IEEE Access 5, 9206–9222 (2017)
22. Paul, A., Pinjari, H., Hong, W.H., Seo, H.C., Rho, S.: Fog computing-based IoT for health monitoring system. J. Sens. (2018)
23. Almeida, A., Fiore, A., Mainetti, L., Mulero, R., Patrono, L., Rametta, P.: An IoT-aware architecture for collecting and managing data related to elderly behavior. Wirel. Commun. Mob. Comput. (2017)
24. Dubey, H., Yang, J., Constant, N., Amiri, A.M., Yang, Q., Mankodiya, K.: Fog data: Enhancing telehealth big data through fog computing. In: Proceedings of the ASE Bigdata & Socialinformatics 2015, p. 14. ACM (Oct 2015)
25. Naha, R.K., Garg, S., Georgakopoulos, D., Jayaraman, P.P., Gao, L., Xiang, Y., Ranjan, R.: Fog computing: survey of trends, architectures, requirements, and research directions. IEEE Access 6, 47980–48009 (2018)
26. Philips: Aims to relieve persistent pain with smartphone controlled devices. www.engadget.com/2014/09/17/philips-app-controlled-pain-reliever/ (17 Sept 2014)
27. Dubey, H., Goldberg, J.C., Abtahi, M., Mahler, L., Mankodiya, K.: EchoWear: smartwatch technology for voice and speech treatments of patients with Parkinson's disease. In: Proceedings of the Conference on Wireless Health, p. 15. ACM (Oct 2015)
28. Boualouache, A.E., Nouali, O., Moussaoui, S., Derder, A.: A BLE-based data collection system for IoT. In: 2015 First International Conference on New Technologies of Information and Communication (NTIC), pp. 1–5. IEEE (Nov 2015)
29. Tanwar, S., Vora, J., Kaneriya, S., Tyagi, S.: Fog-based enhanced safety management system for miners. In: 2017 3rd International Conference on Advances in Computing, Communication & Automation (ICACCA) (Fall), pp. 1–6. IEEE (Sept 2017)
30. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N.: Fog computing for healthcare 4.0 environment: opportunities and challenges. Comput. Electr. Eng. 72, 1–13 (2018)

Emerging Technologies and Architecture for FDA

Mobile FOG Architecture Assisted Continuous Acquisition of Fetal ECG Data for Efficient Prediction Anupam Bhardwaj, Pooja Khanna, and Sachin Kumar

Abstract Electrocardiogram (ECG) is a cardiac test that records the timing and strength of the electrical signals primarily responsible for heartbeats. ECG data are used to gain insight into irregularities in heart rhythms. A Fetal Electrocardiogram (fECG) is extracted from the abdominal electrocardiogram signal of pregnant women. This is a non-invasive method for recording fECG during the early weeks of pregnancy and is an effective diagnostic tool used by clinicians to regularly evaluate the health status of the foetus. The aim of this paper is to put forward an architecture for continuous monitoring of the fetal electrocardiogram from maternal ECG, to avoid any acute condition affecting the newborn child at the time of birth. The continuous acquisition of fECG produces a very large amount of data to send to the cloud for further examination by the doctor, so this data has to be pre-processed before it is moved to the cloud for faster and more efficient evaluation. The proposed architecture, together with the Healthcare 4.0 environment and mobile fog computing, has the potential to extend and virtualize new and efficient healthcare processes for fetal health monitoring.

Keywords Electrocardiogram · Fetal electrocardiogram · Healthcare 4.0 · Mobile fog computing

A. Bhardwaj · P. Khanna · S. Kumar (B)
Amity University, Lucknow, Uttar Pradesh, India
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020
S. Tanwar (ed.), Fog Data Analytics for IoT Applications, Studies in Big Data 76, https://doi.org/10.1007/978-981-15-6044-6_6


1 Introduction

An Electrocardiogram (ECG) is a non-invasive cardiac test that measures the timing and strength of the heart's electrical signals. The primary purpose of these electrical signals is to produce a heartbeat. The measurements show whether the heart rhythms are normal or whether there is any irregularity in them. During the early weeks of pregnancy, obtaining a non-invasive electrocardiogram of the foetus at regular intervals can be an extremely effective diagnostic tool for doctors to evaluate the health status of the foetus. However, because fetal electrocardiogram signals are fairly weak, they are difficult to separate from the maternal electrocardiogram signal and other sources of interference. Successfully extracting and processing fetal electrocardiogram signals in real time against the background of maternal electrocardiogram signals and other forms of noise is therefore necessary to reduce the risk of serious conditions in the newborn. The aim of this paper is to put forward an architecture for continuous monitoring of the fetal electrocardiogram from maternal ECG, to avoid any acute condition affecting the newborn child at the time of birth [1]. The continuous acquisition of fECG produces a very large amount of data to send to the cloud for further examination by the doctor, so this data has to be pre-processed before it is moved to the cloud for faster and more efficient evaluation. The server alone cannot process and store this large amount of raw data simultaneously. To relieve the server of this extensive computational task, Mobile Fog Computing provides an architecture that moves some of the data centre's computational load to the edge. An edge server running mobile fog computing is an innovative architecture that provides partial storage, computation and networking services between the end devices and the Cloud Data Centres. With mobile fog computing, end devices gain some logical intelligence and filter the data before it reaches the Data Centre. Since healthcare is a highly latency-sensitive application of IoT, fog computing serves its primary objective of ensuring low latency. Figure 1 illustrates the Mobile fog computing concepts [2].

Fig. 1 Mobile Fog computing concepts

2 Motivation

When the foetus is in its early stages and the heart is still forming, heart defects may arise that affect the normal functioning of the heart and, consequently, other parts of the body. Recent studies have shown that the most common type of birth-related defect is congenital heart disease (CHD). Congenital heart conditions are also found to be the leading cause of birth-related deaths. A study conducted in 2015 found that around 50 million people were affected by CHD. The number of people affected by CHDs varies from 5 to 76 per 1000 live births, depending on how these diseases are diagnosed. The number of people in whom they cause a severe problem is low relative to the number diagnosed, but congenital heart diseases are still the leading cause of birth defect-related deaths [3–6]. Mobile Fog computing can help improve healthcare and elderly care systems by using self-powered sensors that communicate data wirelessly to a Mobile fog node, as opposed to sending it directly to the Cloud. A smart healthcare system can be developed using a large number of sensors, and the Mobile Fog layer can perform semantic tagging and data classification tasks, leaving only sophisticated data for further processing by the Cloud system. A Cloud system for medical devices that integrates mobile fog computing concepts and uses a similar approach can provide effective Quality of Service (QoS). Mobile fog computing can support healthcare systems in accumulating heterogeneous data, switching between various communication protocols and assisting distributed computing on low-powered devices [7–14]. Numerous efforts have been made to create smart gateways for healthcare purposes. A smart gateway using wireless sensor networks was proposed for healthcare systems in 2010 by Chen et al. This gateway acted as a link between the public communication network and the wireless sensor network and had a simple database, a data decision system and a notification capability for emergencies. The gateway also provided request and response message methods to reduce the load on a remote server. A Mobile fog enabled health monitoring system is illustrated in Fig. 2 [15].

Fig. 2 Mobile Fog enabled health monitoring system

3 Fetal ECG Analysis and Synthesis

The fetal monitoring techniques used today are mostly based on the Phonocardiogram (PCG) and the Electrocardiogram (ECG). While the PCG gathers information from the audible sound made by the fetal heart, an ECG uses small electrodes placed on the mother's abdomen to record information. Since the sound signals have to be within the audible frequency range for a Phonocardiogram, they are highly susceptible to external noise, which may alter the recordings. This limitation of the Phonocardiogram makes the Electrocardiogram more appropriate for accurate and reliable recording of the health condition of the foetus. A fetal heart rhythm can be monitored using an ECG in two ways. The first is the direct, invasive method, which can be very harmful to both the mother and the foetus. The second is an indirect, non-invasive method, which does not require any test instrument to be inserted into the mother's body; instead, electrodes are placed on the mother's abdomen to record the fetal heart rhythms. Since the electrodes in a non-invasive fetal Electrocardiogram (fECG) do not come into direct contact with the foetus, the fetal ECG signal is low in power and carries more high-frequency noise than a normal ECG. The electrodes are positioned right above the womb, so the fECG recordings may contain various types of noise, such as fetal respiratory noise, brain activity, electromyogram and system noise. The most prominent interference in the fetal ECG is the maternal ECG signal, which is also recorded during this process. All these interferences make the extraction of fECG difficult. Apart from these internal factors, the positioning of the electrodes also affects the fECG signal and may vary from person to person.

One possible solution to this problem is to use a highly parallel instrumentation architecture with high sensitivity and low noise. The fECG extraction methods currently in use are categorized as linear decomposition, non-linear decomposition and adaptive filtering. Techniques that can be used within these categories include the Wavelet transform, Doppler ultrasound, Blind source separation (BSS) methods using PCA (principal component analysis) and ICA (independent component analysis), combinations of Blind source separation and wavelets, adaptive neural networks and adaptive neuro-fuzzy inference systems. To separate the maternal and fetal components of the abdominal ECG, a Sparse decomposition method using a Gaussian function has been suggested. Earlier studies have shown that the JBSS CUM4 algorithm is the most suitable algorithm for separating maternal ECG (mECG) and fetal ECG (fECG). To extract fECG, a hybrid non-linear adaptive noise canceller is used. These adaptive noise cancellers can be used with either single or multiple reference channels. In another approach, Singular Value Decomposition (SVD) is applied to the spectrogram, followed by an iterated application of Independent Component Analysis to the principal components. To detect the fetal ECG, a wavelet-based method is used. This technique uses Multiresolution Analysis (MRA) to eliminate large baseline fluctuations along with the noise from the signal. To remove the complete maternal ECG from an abdominal signal that contains both fetal and maternal ECG signals, a Genetic algorithm methodology is used. For single-channel recordings, a Kalman-based Bayesian filter framework is used, because the Kalman filter has proved to be a promising method for this purpose. Figure 3 illustrates the methodology of the system.

3.1 Data Extraction

The abdominal ECG signal is obtained by placing electrodes on the mother's abdomen. For continuous extraction of the ECG signal, these electrodes are mounted on a wearable belt so that, when the mother wears the belt, the electrodes are always in contact with the body to record the electrical signal. A controller unit attached to the belt monitors the time interval between two consecutive recordings and notes the time stamp of each recording. This controller also sends the recorded abdominal ECG signal to the pre-processing stage.

Fig. 3 Methodologies used in the system: continuous fECG signal acquisition (wearable device supported); data transfer via mobile FOG and cloud support to the server; cleaning and noise removal from the fECG signal at the server; fECG extraction employing a filter with noise cancellation; recovering fECG with QRS peak detection; information acquired through analysis is shared with the customer via cloud support

3.2 Pre-processing and Generation of ECG Signal

The abdominal ECG signal recorded by the wearable belt contains both maternal and fetal ECG signals along with other noise. This signal has to be filtered before the fetal ECG is extracted from it. The filtering is done using a Kalman filter, which belongs to the adaptive filter class and is used broadly for noise removal and extraction of fECG from abdominal ECG signals. Studies have shown that the Kalman filter can be used to extract the fetal ECG, but its efficiency drops when the mECG and fECG overlap in time. The method has difficulty discriminating between the two in such overlapping situations and fails to extract the fECG efficiently. It is therefore used here only to remove unwanted noise from the combined signal before further processing. A Savitzky–Golay filter is then applied to the data to produce smoother waveforms with a higher signal-to-noise ratio for both the maternal and the fetal signals.
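The smoothing step can be illustrated with a minimal Savitzky–Golay sketch in plain Python. The chapter does not state the window length or polynomial order it uses, so the 5-point quadratic window below (with the standard coefficients −3, 12, 17, 12, −3 over 35) and the choice to copy the edge samples unchanged are assumptions made only for illustration.

```python
# 5-point quadratic Savitzky-Golay smoothing coefficients (they sum to 1).
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol5(samples):
    """Smooth a sequence with a 5-point quadratic Savitzky-Golay filter.

    Assumption: the first and last two samples are copied unchanged,
    because a full 5-point window is not available at the edges.
    """
    out = list(samples)
    for i in range(2, len(samples) - 2):
        window = samples[i - 2:i + 3]
        out[i] = sum(c * x for c, x in zip(SG5, window))
    return out
```

Because the filter fits a local polynomial rather than averaging, it attenuates isolated noise spikes while leaving slowly varying (up to quadratic) segments of the waveform unchanged, which is why it is often preferred over a moving average for ECG-like signals with sharp peaks.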

3.3 Fetal ECG Extraction Using Adaptive Noise Cancellation

An adaptive filter is a time-varying digital filter that automatically adjusts its coefficients without user intervention, on the assumption that the filter varies slowly compared with the bandwidth of the signal being filtered. When applied to stationary signals, the filter maintains a constant shape and orientation after successive iterations. Performance measures such as convergence rate, minimum mean square error, computational complexity, stability and robustness are used to determine which of the various adaptive filter configurations is suitable for a particular application. Because of its tracking ability, the Adaptive Noise Cancellation (ANC) technique is highly preferred for non-stationary signals. ANC can promptly trace changes in orientation and curvature and any slow time variations in the input. The characteristics of the abdominal signal and of the noise are not known to the system in advance. The noise is uncorrelated with the signal of interest, but it is correlated with a reference signal that comes from a secondary source of the same noise; the reference signal, in turn, is uncorrelated with the actual signal. For the adaptive filter, the Least Mean Square (LMS) gradient approximation technique is selected. In this technique, an estimate is made from the current filter coefficients, the derivative of the mean square error (the gradient vector) is calculated, and the estimate is adjusted in the direction opposite to the gradient vector. The process is repeated until the mean square error converges to its minimum. At each iteration, the gradient vector is calculated and the step size is selected for the parameter. Following the steepest descent algorithm, the tap weight vector is computed and converges towards the best possible solution. The tap weight vector changes after every iteration. The LMS process is expressed mathematically as follows:

$$W_{k+1} = W_k + 2\mu\,\varepsilon_k X_k \tag{1}$$

In Eq. (1), W_k represents the tap weight vector, ε_k the error and X_k the input signal at the kth iteration, while μ represents the step-size parameter. In the extraction of fetal ECG, this method separates out the maternal ECG, which matches the reference signal and is eliminated along with the reference signal and other unnecessary aberrations.
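The update in Eq. (1) can be sketched as a small adaptive noise canceller in plain Python. This is an illustration under stated assumptions rather than the authors' implementation: the tap count, the step size and the synthetic data produced by the hypothetical helper `make_demo` (a weak sinusoid standing in for the fetal component, and filtered pseudo-random noise standing in for the maternal interference) are all chosen only to make the example runnable.

```python
import math
import random


def lms_cancel(primary, reference, taps=4, mu=0.01):
    """Adaptive noise cancellation with W_{k+1} = W_k + 2*mu*e_k*X_k.

    Returns (errors, weights); the error sequence is the cleaned estimate.
    """
    w = [0.0] * taps
    errors = []
    for k in range(len(primary)):
        # X_k: the most recent `taps` reference samples, zero-padded at the start.
        x = [reference[k - i] if k - i >= 0 else 0.0 for i in range(taps)]
        y = sum(wi * xi for wi, xi in zip(w, x))   # filter output (noise estimate)
        e = primary[k] - y                         # primary minus noise estimate
        w = [wi + 2 * mu * e * xi for wi, xi in zip(w, x)]
        errors.append(e)
    return errors, w


def make_demo(n=3000, seed=0):
    """Hypothetical synthetic data: weak periodic signal plus filtered reference noise."""
    rng = random.Random(seed)
    reference = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    signal = [0.1 * math.sin(2 * math.pi * k / 20) for k in range(n)]
    noise = [0.8 * reference[k] - 0.3 * (reference[k - 1] if k > 0 else 0.0)
             for k in range(n)]
    primary = [s + v for s, v in zip(signal, noise)]
    return signal, primary, reference
```

Calling `lms_cancel(primary, reference)` on the data from `make_demo()` lets the weights adapt towards the coefficients that map the reference onto the interference, so the error sequence approaches the weak signal. A smaller `mu` slows convergence but reduces the residual fluctuation of the weights.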

3.4 LMS Extraction

The Least Mean Square (LMS) filter works by reducing the error between the desired signal and the actual output signal. This is achieved by finding the filter coefficients that produce the least mean square of the error. LMS is derived from the steepest descent algorithm, a deterministic gradient method that finds the gradient by computing the derivative of the error. LMS instead estimates the gradient continuously from the current samples; it is a stochastic gradient method, so it does not compute the exact optimal values but converges towards them in the mean as the mean square error is reduced.

Fig. 4 LMS extraction filter

As depicted in Fig. 4, the system involves two signals. One is the ECG signal recorded from the mother's abdomen, [n0(nT) + s(nT)], which contains both the mECG and the fECG; the other is the reference signal [n1(nT)], which is measured from the mother's chest. Both signals are first freed from broadband noise, and the mean square error is then reduced. The adaptive filter shown in Fig. 4 produces a signal Ψ(nT), which is subtracted from the input signal to produce the output Y(nT):

$$Y(nT) = s(nT) + n_0(nT) - \Psi(nT) \tag{2}$$

Squaring Eq. (2) and dropping the argument (nT) to simplify each term gives

$$Y^2 = s^2 + (n_0 - \Psi)^2 + 2s(n_0 - \Psi) \tag{3}$$

Taking the expectation of both sides, and noting that s is uncorrelated with both n0 and Ψ so that the cross term E[s(n0 − Ψ)] vanishes, minimization gives

$$\min E[Y^2] = E[s^2] + \min E[(n_0 - \Psi)^2] \tag{4}$$

The MSE of the term (n0 − Ψ) is minimized as the filter adaptively synthesizes the noise, while the term E[s²] is unaffected by the filter adjustments. The weight is updated so that the mean square error converges, as shown in the equation below:

$$w_{n+1} = w_n - \mu \nabla \varepsilon[n] \tag{5}$$

3.5 QRS Peak Detection

In a typical ECG signal, three of the graphical deflections are collectively termed the QRS complex, which is the most prominent peak in an ECG. The QRS complex is highly useful for discovering deviations in an ECG, which makes its detection an important but tedious and complicated task in ECG signal processing; most traditional ECG devices still have difficulty recognizing the QRS complex accurately. Figure 5 shows a conventional QRS complex.

Fig. 5 QRS complex

To calculate the heart rate of the foetus, the R peaks must be identified in the fECG signal extracted from the abdominal data. In this study, R peaks are identified using a simple differentiation technique, and the number of R peaks per minute is also counted. The R-R interval obtained from this computation is used to calculate the fetal heart rate, which then forms the basis for identifying the bradycardia and tachycardia classes [16–26].
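The differentiation-based R-peak detection and the R-R interval calculation described above can be sketched as follows. The function names, the amplitude threshold and the 110 and 160 beats-per-minute limits used for the bradycardia and tachycardia classes are assumptions for illustration; the chapter does not give the values it uses.

```python
def detect_r_peaks(signal, threshold):
    """Indices where the first difference changes from positive to
    non-positive and the sample exceeds `threshold`."""
    peaks = []
    for i in range(1, len(signal) - 1):
        rising = signal[i] - signal[i - 1] > 0
        falling = signal[i + 1] - signal[i] <= 0
        if rising and falling and signal[i] > threshold:
            peaks.append(i)
    return peaks


def heart_rate_bpm(peaks, fs):
    """Mean heart rate from R-R intervals; `fs` is the sampling rate in Hz."""
    if len(peaks) < 2:
        return None
    rr = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    return 60.0 / (sum(rr) / len(rr))


def classify_fhr(bpm, low=110.0, high=160.0):
    """Assumed limits: below `low` is bradycardia, above `high` is tachycardia."""
    if bpm < low:
        return "bradycardia"
    if bpm > high:
        return "tachycardia"
    return "normal"
```

For example, peaks spaced 0.4 s apart correspond to 60 / 0.4 = 150 beats per minute, which falls inside the assumed normal range. A practical detector would also need a refractory period and an adaptive threshold to avoid counting noise or residual maternal peaks.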

4 Mobile FOG Enabled Architecture

Because data acquisition is continuous, with recordings taken at intervals of only a few minutes, the volume of data produced is very large. Transferring all of it to cloud data centres for processing and storage is unlikely to be practical, given the existing load on those data centres and their limited resources. To overcome this problem, the system is provided with a mobile fog computing layer that takes much of the heavy computational and storage load off the servers. In a mobile fog enabled system, the edge node is made up of all the sensors and local devices, which are connected to the cloud through a mobile fog layer or gateway. All data generated by the edge devices is transferred to the mobile fog layer, processed there, and then dispatched to the cloud, where it is available to medical experts for frequent evaluation to identify any health risk to the patient. If the expert thinks that a personal evaluation is required, they may send an alert to the patient's device requesting a personal visit [2, 15, 27, 28] (Fig. 6).

Fig. 6 Proposed mobile FOG architecture for continuous acquisition of fetal ECG

In a mobile fog computing platform, the fog layer does not use a conventional gateway, which is responsible only for transmitting data from the end node to the cloud node while bridging the heterogeneity of the two systems. Instead, it uses a smart gateway that performs all the operations of a conventional gateway along with other high-level advanced services. The advanced services performed by a smart gateway in mobile fog computing are as follows:

i. GUI with access management: The smart gateway provides an interface through which end users such as the system administrator, medical experts, patients and caretakers can visualize the ECG data and other important instructions.
ii. Real-time notifications: In a healthcare system, all notifications should be sent in real time. This functionality is also provided by the smart gateway of the mobile fog layer. The remote cloud receives a real-time signal from the smart gateway, and a notification is then sent to the end user about any abnormality in the user data or any instructions provided by the medical experts for the patient.
iii. Location awareness: This service is very important in an emergency. The gateway also monitors the geographical location of the user device, which is constantly updated and can be used by caregivers or emergency services to reach the patient if an emergency occurs.
iv. Heterogeneity and interoperability: The edge node is made up of various kinds of devices from different makers, using different hardware and operating systems. This heterogeneity makes it difficult for these devices to communicate with a cloud platform. The smart gateway solves this problem by providing interoperability between different operating systems and inconsistent protocols from the edge layer to the cloud layer [7–11, 29, 30].
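The interoperability service in item iv can be pictured as a set of per-vendor adapters that map device-specific messages onto one common record before anything is forwarded to the cloud. The following is only a minimal sketch under stated assumptions: the two vendor message layouts, the field names and the `Reading` record are hypothetical and are not taken from the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Common record the gateway forwards, whatever the source device."""
    device_id: str
    timestamp_ms: int
    value_mv: float


def from_vendor_a(msg: dict) -> Reading:
    # Hypothetical vendor A reports seconds and microvolts.
    return Reading(msg["id"], int(msg["ts_s"] * 1000), msg["uv"] / 1000.0)


def from_vendor_b(msg: dict) -> Reading:
    # Hypothetical vendor B already reports milliseconds and millivolts.
    return Reading(msg["dev"], int(msg["time_ms"]), float(msg["mv"]))


# One adapter per device family; supporting a new device means adding an entry.
ADAPTERS = {"vendor_a": from_vendor_a, "vendor_b": from_vendor_b}


def normalize(source: str, msg: dict) -> Reading:
    """Translate a device-specific message into the common Reading record."""
    try:
        adapter = ADAPTERS[source]
    except KeyError:
        raise ValueError(f"no adapter registered for {source!r}") from None
    return adapter(msg)
```

With this arrangement the cloud side only ever sees `Reading` objects, so differences between device makers stay confined to the gateway.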

5 Design and Implementation

The implementation of the mobile fog architecture incorporates the following elements:

i. Mobile Fog Device: The mobile fog device is primarily the handset, which acts as a substitute for microcontrollers, switches, routers, embedded servers and video surveillance cameras. Describing the fog devices requires specifying the connections between fog devices, sensors and actuators. These devices are characterized by attributes such as accessible memory, processor, storage size, and uplink and downlink bandwidths (which define the communication capacity of the fog device). In the current work, because the movement of the patient cannot be restricted, the information is recorded and processed on the go using a mobile phone and other local devices.
ii. Sensor: Sensor entities are the IoT devices described in the fog architecture, connected with a wearable belt to acquire the fECG data. The attributes of a sensor include its output interface, the gateway mobile fog device to which it is connected, and the latency of the connection between them. In the current work, IoT-enabled ECG belts are used for continuous monitoring of the fetal ECG. These wearable belts are wrapped around the patient's abdomen to record the ECG. Once appropriate values of these attributes have been set, the recorded ECG is processed further by the actuators.
iii. Actuator: An actuator reacts to sensor actions and is described by its network connection properties. It defines the action steps required to act on the signal. In the proposed technique, the signals collected at the mobile fog layer are sampled to reduce the density and size of the data. The fetal heart rate (FHR) sampling rate used at the bedside is 4 Hz or less, and current FHR analysis methods fail to detect incipient fetal acidemia. In a fetal sheep model of human labour, FHR sampling rates near 1000 Hz were needed to detect fetal acidemia, so the continuous data acquired at the mobile fog layer is proposed to be sampled at the same rate [12–14, 31–36].

The reduced data is then transferred to the server for further analysis.
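The elements above can be written down as plain data structures, together with one simple way of reducing the density of a sampled signal before it leaves the fog layer. This is a rough sketch, not the authors' implementation: the class and field names are hypothetical, and block averaging by an integer factor is assumed here only as one possible reduction step.

```python
from dataclasses import dataclass


@dataclass
class FogDevice:
    """Attributes the text lists for a mobile fog device (names are illustrative)."""
    name: str
    memory_mb: int
    cpu_mips: int
    storage_mb: int
    uplink_kbps: int
    downlink_kbps: int


@dataclass
class Sensor:
    """A sensor, the gateway fog device it reports to, and the link latency."""
    name: str
    gateway: FogDevice
    latency_ms: float
    sampling_hz: int


def decimate(samples, factor):
    """Reduce signal density by averaging consecutive blocks of `factor` samples.

    Trailing samples that do not fill a whole block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor for i in range(0, usable, factor)]
```

For example, `decimate(signal, 4)` keeps one averaged value for every four input samples, so the payload sent onward is roughly a quarter of the original size.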

6 Results

The proposed architecture was tested on a sample obtained from physionet.org. The sample contained 10 recordings of 5 min each. In total, nearly 6500 QRS complexes were recognized, with an accuracy of 1 ms with respect to the location of the reference R waves. These R waves were then compared with the standard reference signals to determine the effectiveness of the method. Figure 7 depicts the extracted QRS complex, the most prominent peak in an ECG; the simulation was performed in MATLAB 2018b.

Fig. 7 Extracted QRS complex with identified peaks (Data Source physionet.org)

The first step in assessing the method is to count the true positives (TP: R waves that are correctly detected) along with the errors, i.e. the false positives (FP: complexes that are falsely detected) and the false negatives (FN: complexes that were missed during detection). These counts are used to compute the parameters that characterize the effectiveness and efficiency of the technique: sensitivity (Se), positive diagnostic value (PDV) and accuracy (Acc), calculated as follows:

$$\mathrm{Se} = \frac{TP}{TP + FN} \tag{5}$$

$$\mathrm{PDV} = \frac{TP}{TP + FP} \tag{6}$$

$$\mathrm{Acc} = \frac{TP}{TP + FP + FN} \tag{7}$$

A detection was counted as a true positive if it fell within ±40 ms of the reference peak. Of the total number of QRS complexes detected, 0.98% (~55 complexes) were false negatives, i.e. missed complexes, while 0.96% (~58 complexes) were false positives. Using these values of TP, FP and FN, the above-mentioned parameters were calculated. The results obtained show the high efficiency of this technique. The mean accuracy (Acc) was around 93%, whereas the average sensitivity and average PDV were 95.7% and 94.99%, respectively [37–40].
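The evaluation procedure described above (matching detections to reference R peaks within a tolerance, then applying Eqs. 5–7) can be sketched as follows. The one-to-one nearest-match rule and the function names are assumptions made for illustration; the chapter does not state how ties or multiple detections near one reference peak were handled, and the original simulation was performed in MATLAB.

```python
from bisect import bisect_left


def match_peaks(reference_ms, detected_ms, tolerance_ms=40):
    """Count TP, FP and FN, pairing each reference peak with at most one detection."""
    detected = sorted(detected_ms)
    used = [False] * len(detected)
    tp = 0
    for ref in sorted(reference_ms):
        # Only detections inside [ref - tol, ref + tol] can match this peak.
        i = bisect_left(detected, ref - tolerance_ms)
        best = None
        while i < len(detected) and detected[i] <= ref + tolerance_ms:
            if not used[i] and (
                best is None or abs(detected[i] - ref) < abs(detected[best] - ref)
            ):
                best = i
            i += 1
        if best is not None:
            used[best] = True
            tp += 1
    fp = len(detected) - tp        # detections with no reference peak
    fn = len(reference_ms) - tp    # reference peaks with no detection
    return tp, fp, fn


def detection_metrics(tp, fp, fn):
    """Sensitivity, PDV and accuracy as defined in Eqs. (5)-(7)."""
    se = tp / (tp + fn)
    pdv = tp / (tp + fp)
    acc = tp / (tp + fp + fn)
    return se, pdv, acc
```

As a small worked case, reference peaks at 1000, 1500, 2000 and 2500 ms and detections at 1010, 1495, 2100, 2490 and 3000 ms give TP = 3, FP = 2 and FN = 1, hence Se = 0.75, PDV = 0.6 and Acc = 0.5. The metric function assumes that at least one peak is present; with empty inputs the denominators would be zero.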

7 Discussion and Outcome

This paper suggests a model to extract the fetal ECG from an abdominal ECG recording that contains both maternal ECG and fetal ECG in addition to noise. The model cleanses the data samples and removes undesired noise components from the recording, and then separates the mECG and fECG signals for further processing. The system continuously extracts and records the fECG at regular intervals for a few minutes. This continuous evaluation of the fetal ECG is an important and efficient means to diagnose and reduce acidemia, hypoxic-ischemic encephalopathy and other congenital heart diseases (CHD) in newborns. The extraction and recording part of the system generates a very large amount of data to be managed and processed by the cloud servers, and mobile fog computing provides an efficient solution to this problem. The fog computing layer provides a smart gateway between the edge layer and the cloud layer so that the data can be shared effectively and efficiently over the cloud without extra load from the continuous generation of data. The fog layer provides not only the functionalities of conventional gateways but also other functionalities that make the model smart and efficient for manipulating and processing patient data, taking this system one more step towards the goal of Healthcare 4.0.

8 Conclusion

The results obtained using the methodology discussed above provide a strong visualization of fetal heart rates. This visualization is then sent to the patient's medical expert for a detailed evaluation of the data. The fetal heart rate determined by the system can be used as an effective tool to distinguish between bradycardia and tachycardia, and between sinus rhythm and atrial fibrillation. These conclusions from the fECG data can help early diagnosis of various birth-related congenital heart diseases (CHD), and early diagnosis of these diseases can reduce the birth-related death rate around the world. Implementing the proposed system at the mobile fog layer enhances its practical applicability and resource efficiency. A fog network is capable of reducing the bandwidth required for IoT data transmission along with the response time, which is critical in health care systems. The extraction of the fECG is performed at the edge of the patient network. With a mobile enabled health system, continuous monitoring of foetal health can support a healthy birth.

References

1. Samuel, C.D., English, A.E., Alves, N.: A fetal electrocardiogram data acquisition and analysis system. In: 39th Annual Northeast Bioengineering Conference. IEEE, pp. 235–236 (2013)
2. Shi, Y., Ding, G., Wang, H., Roman, H.E., Lu, S.: The fog computing service for healthcare. In: 2nd International Symposium on Future Information and Communication Technologies for Ubiquitous HealthCare (Ubi-HealthTech) (2015)
3. Sameni, R., Clifford, G.: A review of fetal ECG signal processing; issues and promising directions. In: Edition of book, Boston: University of Oxford, Harvard University, pp. 4–20 (2009)
4. Dervaitis, K.L., Poole, M., Schmidt, G., Penava, D., Natale, R., Gagnon, R.: ST segment analysis of the fetal electrocardiogram plus electronic fetal heart rate monitoring in labor and its relationship to umbilical cord arterial blood gases. 191(3), 879–884 (2004)
5. Noren, H., Blad, S., Carlsson, A.: STAN in clinical practise: the outcome of 2 years of regular use in the city of Gothenburg. J. Obstet. Gynecol. 7–15 (2006)
6. Clifford, G., Sameni, R., Ward, J., Robinson, J., Wolfberg, A.J.: Clinically accurate fetal ECG parameters acquired from maternal abdominal sensors. Am. J. Obstet. Gynecol. 47 (2011)
7. Stantchev, V., Barnawi, A., Ghulam, S., Schubert, J., Tamm, G.: Smart items, fog and cloud computing as enablers of servitization in healthcare. Sens. Transducers 185(2), 121–128 (2015)
8. Shi, Y., Ding, G., Wang, H., Roman, H.E., Lu, S.: The fog computing service for healthcare. In: 2015 2nd International Symposium on Future Information and Communication Technologies for Ubiquitous HealthCare (Ubi-HealthTech). IEEE, pp. 1–5
9. Gia, T.N., Jiang, M., Rahmani, A.M., Westerlund, T., Liljeberg, P., Tenhunen, H.: Fog computing in healthcare internet of things: a case study on ECG feature extraction. In: 2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing (CIT/IUCC/DASC/PICOM). IEEE, pp. 356–363
10. Cao, Y., Hou, P., Brown, D., Wang, J., Chen, S.: Distributed analytics and edge intelligence: pervasive health monitoring at the era of fog computing. In: Proceedings of the 2015 Workshop on Mobile Big Data. ACM, pp. 43–48
11. Cao, Y., Chen, S., Hou, P., Brown, D.: Fast: a fog computing assisted distributed analytics system to monitor fall for stroke mitigation. In: 2015 IEEE International Conference on Networking, Architecture and Storage (NAS). IEEE, pp. 2–11
12. Li, X., Xu, Y., Herry, C., Durosier, L.D. et al.: Sampling frequency of fetal heart rate impacts the ability to predict pH and BE at birth: a retrospective multi-cohort study. Physiol. Measur. 36(5), L1–L12
13. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N.: Fog computing for healthcare 4.0 environment: opportunities and challenges. Comput. Electric. Eng. 72, 1–13 (2018)
14. Tanwar, S., Tyagi, S., Kumar, N. (eds.): Security and Privacy of Electronics Healthcare Records. IET Book Series on e-Health Technologies, pp. 1–450 (2019)
15. Chen, Y., Shen, W., Huo, H., Xu, Y.: A smart gateway for health care system using wireless sensor network. In: 2010 Fourth International Conference on Sensor Technologies and Applications (SENSORCOMM), July 2010, pp. 545–550
16. Prabha, V.P., Sriraam, N., Suresh, S.: A review: ultrasonic imaging based fetal cardiac chambers segmentation and detection of abnormality. Int. J. Res. Sci. Innov. (IJRSI) VI(IV), 237–238. ISSN 2321–2705 (2019)
17. Richards, A.A., Garg, V.: Genetics of congenital heart disease. Curr. Cardiol. Rev. 6(2), 91 (2010)
18. Minino, A.M., Heron, M.P., Murphy, S.L., Kochanek, K.D.: Deaths: neonatal data for, deaths: final data for 2004. Natl. Vital. Stat. Rep. 55, 1–119
19. Naheed, Z.J., Strasburger, J.F., Deal, B.J., Benson, D.W., Gidding, S.S.: Fetal tachycardia mechanisms and predictors of hydrops fetalis. J. Am. Coll. Cardiol. 27, 1736–1740 (1996)
20. Da Poian, G., Bernardini, R., Rinaldo, R.: Separation and analysis of fetal-ECG signals from compressed sensed abdominal ECG recordings. IEEE Trans. Biomed. Eng. 63(6), 1269–1279 (2016)
21. Sugumar, D., Vanathi, P.T., Mohan, S.: Joint blind source separation algorithms in the separation of non-invasive maternal and fetal ECG. In: IEEE Conference on Electronics and Communication Systems (ICECS) (2014). https://doi.org/10.1109/ecs.2014.6892754
22. Ma, Y., Xiao, Y., Wei, G., Sun, J., Wei, H.: A hybrid nonlinear adaptive noise canceller for fetal ECG extraction. In: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) (2015). https://doi.org/10.1109/apsipa.2015.7415385
23. Zarzoso, V., Roig, J.M., Nandi, A.K.: Fetal ECG extraction from maternal skin electrodes using blind source separation and adaptive noise cancellation techniques. In: Computers in Cardiology, vol. 27, pp. 431–434, September 2000
24. Ahlstrom, M.L., Tompkins, W.J.: Automated high-speed analysis of Holter tapes with microcomputers. IEEE Trans. Biomed. Eng. BME-30, 651–57 (1999)
25. Ahlstrom, M.L., Tompkins, W.J.: Digital filters for realtime ECG signal processing using microprocessors. IEEE Trans. Biomed. Eng. BME-32, 708–13 (2006)
26. Reaz, M.B.I., Wie, L.S.: Adaptive linear neural network filter for fetal ECG extraction. In: Proceedings of International Conference on Intelligent Sensing and Information Processing, Chennai, India, pp. 321–324, January 2004
27. Bonomi, F., Milito, R., Zhu, J., Addepalli, S.: Fog computing and its role in the internet of things. In: MCC Workshop on Mobile Cloud Computing, pp. 13–16 (2012)
28. Khan, S., Parkinson, S., Qin, Y.: Fog computing security: a review of current applications and security solutions 2017. J. Cloud Comput. Adv. Syst. Appl. 6(19), 2–22 (2017)
29. https://www.cisco.com/c/dam/en_us/solutions/trends/iot/docs/computing-solutions.pdf. Accessed 13 Dec 2016
30. Prieto González, L., Jaedicke, C., Schubert, J., Stantchev, V.: Fog computing architectures for healthcare: wireless performance and semantic opportunities. J. Inf. Commun. Ethics Soc. 14(4), 334–349
31. Tanwar, S., Tyagi, S., Kumar, N. (eds.): Multimedia Big Data Computing for IoT Applications: Concepts, Paradigms and Solutions. Intelligent Systems Reference Library. Springer Nature Singapore Pte Ltd., Singapore, pp. 1–425 (2019)
32. Parisha, P.K., Sharma, P., Rizvi, S.: Hash function based data partitioning in cloud computing for secured cloud storage. Int. J. Eng. Res. Appl. 7(7), 1–6 (2017)
33. Mistry, I., Tanwar, S., Tyagi, S., Kumar, N.: Blockchain for 5G-enabled IoT for industrial automation: a systematic review, solutions, and challenges. Mech. Syst. Signal Process. 135, 1–19 (2020)
34. Vora, J., Nayyar, A., Tanwar, S., Tyagi, S., Kumar, N., Obaidat, M.S., Rodrigues, J.J.: BHEEM: a blockchain-based framework for securing electronic health records. In: IEEE Global Communications Conference (IEEE GLOBECOM-2018), Abu Dhabi, UAE, 09–13th Dec, 2018, pp. 1–6
35. Khanna, P., Kumar, S.: Engineering 4.0: future with disruptive technologies. In: Rosa Righi, R., Alberti, A., Singh, M. (eds.) Blockchain Technology for Industry 4.0. Blockchain Technologies. Springer, Singapore, pp. 131–148 (2020)
36. Gupta, R., Tanwar, S., Tyagi, S., Kumar, N.: Machine learning models for secure data analytics: a taxonomy and threat model computer communications, special section on Artificial Intelligence (AI)-empowered intelligent transportation systems. IEEE Access 474–488
37. Mehta, P., Gupta, R., Tanwar, S.: Blockchain envisioned UAV networks: challenges, solutions, and comparisons. Comput. Commun. 518–538
38. Kumar, S., Mishra, S., Khanna, P.: Precision sugarcane monitoring using SVM classifier. Proc. Comput. Sci. 122, 881–887 (2017)
39. Tanwar, S., Vora, J., Kanriya, S., Tyagi, S., Kumar, N., Sharma, V., You, I.: Human arthritis analysis in Fog computing environment using Bayesian network classifier and thread protocol. IEEE Consum. Electron. Mag. 9(1), 88–94 (2019)
40. Prasad, V.K., Bhavsar, M., Tanwar, S.: Influence of monitoring: fog and edge computing. Scalable Comput. Pract. Exper. 20(2), 365–376 (2019)

Proposed Framework for Fog Computing to Improve Quality-of-Service in IoT Applications

Rakhi Akhare, Monika Mangla, Sanjivani Deokar, and Vaishali Wadhwa

Abstract In the era of IoT, edge devices generate enormous volumes of data every second. The main aim of IoT networks is to infer meaningful information from the collected data. To do so, the data is transmitted to the cloud, which is expensive and time consuming. This cost is significantly reduced by Fog Computing (FC), which performs data processing closer to where the data is generated. FC preprocesses the data before forwarding it to the cloud by introducing a virtual layer, the fog layer, between IoT devices and the cloud, and thereby achieves benefits such as reduced latency, low communication cost, reliability, and scalability. These benefits strongly support its use in real-time applications. However, FC also faces challenges. First and foremost, processing capability and storage at the fog layer are limited in comparison with the cloud. Hence, active research is directed at devising effective and efficient frameworks that extract the most advantage from the fog layer. In this chapter, we propose a framework that aims to improve Quality-of-Service (QoS) by reducing latency and balancing load at the fog layer. This improvement is achieved with the help of data aggregation and load balancing. In the proposed framework, an overburdened fog node requests its neighboring node to share its load. Additionally, the framework suggests implementing various techniques to aggregate data ahead of transmission. As a result, the proposed approach improves QoS over existing approaches by preventing bottlenecks in the network.

Keywords Internet of Things (IoT) · Cloud computing · Fog computing · Data analytics · Edge devices · User experience · Load balancing · QoS

R. Akhare · M. Mangla (B) · S. Deokar
CSED, Lokmanya Tilak College of Engineering, Navi Mumbai, India

V. Wadhwa
Karnal Institute of Technology and Management, Karnal, India

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 S. Tanwar (ed.), Fog Data Analytics for IoT Applications, Studies in Big Data 76, https://doi.org/10.1007/978-981-15-6044-6_7

1 Introduction

During the past few decades, society has witnessed an unprecedented increase in the reach of the Internet. Nowadays almost everything, down to individual wearable devices, is connected to the Internet, forming what is known as the Internet of Things (IoT). IoT basically comprises sensors for data acquisition, data storage, data processing to generate insight, security, etc. Each connected IoT device frequently generates data, and the collected data is analyzed to infer meaningful information that may be advantageous for various organizations. As a result, IoT has found significant application in smart cities, healthcare, and weather forecasting, to name a few [1–4]. However, this IoT transformation can be widely realized only if it delivers better Quality-of-Service (QoS). For instance, widespread employment of IoT in healthcare requires minimal latency and customer satisfaction, which can be obtained by integrating FC. Maintaining high QoS, which in turn improves user experience and is usually measured in terms of latency, cost, and time, is a challenging task.

For data analysis, the collected data is forwarded to the cloud, where it is analyzed using various tools and techniques to generate relevant inferences. Transferring huge volumes of data from end devices to the cloud requires enormous bandwidth and time, and the time needed for this transfer discourages the use of cloud processing in time-constrained applications. Additionally, IoT end devices have limited storage and limited computational power. Given their limited storage capability, data cannot be retained on end devices beyond a certain time and therefore needs to be forwarded. These issues are addressed by FC, which introduces a virtual layer between end devices and the cloud. This layer, known as the fog layer, addresses the constraints and challenges of IoT devices and the cloud.

In this chapter, the authors aim to propose an efficient fog architecture. Although the chapter mainly focuses on the architecture and analytics model of the fog layer, a brief section introduces Cloud Computing (CC) for completeness. The chapter is organized as follows: Sect. 1 introduces the evolution of IoT. Cloud computing and its challenges are briefly presented in Sect. 2. Section 3 discusses FC at length, including its features, benefits, challenges, and applications. Data Analytics and Resource Scheduling are presented in Sects. 4 and 5, respectively. The proposed architecture is presented in Sect. 6. Conclusions and future work are presented in Sect. 7.

2 Cloud Computing

Cloud computing refers to the migration of applications and services to the Internet [5]. By migrating applications and services to the Internet, it addresses the need for practically unlimited storage and processing capability, thus advocating its employment in real-world scenarios. Figure 1 represents the general architecture of CC.

Fig. 1 Abstract model of Cloud computing

CC operates in a pay-per-use, on-demand mode, which enables effective sharing of resources across the Internet. CC also enhances the availability of IT resources while keeping the cost of physical devices low. A few vital characteristics of CC are as follows:

Shared Infrastructures: CC provides shared infrastructure across the Internet. As a result, it ensures maximum utilization of the available resources while keeping the cost of physical devices low.
Heterogeneous Network Access: The CC model involves a variety of end devices using heterogeneous software.
Dynamic Provisioning: CC uses a dynamic, on-demand model in order to utilize resources effectively. Dynamic provisioning is achieved by expanding and contracting the service capability in response to demand, which is metered for bill generation. Thus, CC charges consumers for their actual usage of shared resources and services, without physical deployment at every location, while giving a cost-effective solution [5].

Apart from cost effectiveness, other attractive features of CC are scalability, low maintenance, and reliability. Although CC has several supportive characteristics and features, it also has associated challenges that may undermine its effectiveness. These challenges need to be addressed with an efficient and careful model to obtain the maximum benefit. The first and foremost challenge with CC is data privacy and security. An efficient architecture should handle data privacy, as data is the most valuable possession of each organization, and should therefore provide an effective security mechanism between the cloud and the organization. Another challenge for CC is centralized processing and storage, which requires a huge amount of energy. Moreover, centralized storage makes the system susceptible to a single point of failure. A centralized approach also transfers sensitive data across wide geographic distances, creating an opening for intruders. Centralized processing further degrades performance during peak load and thus hinders expected behavior [6]. As a result, CC finds limited application in latency-sensitive domains such as robotics and health monitoring [7, 8]. Heterogeneity of the devices involved in the network is another vital challenge in CC. Lack of interfacing standards results in interoperability issues among clouds, and various forums are currently working towards standardization to address them.

Considering the features, limitations, and challenges of CC, it is clear that CC fails to deliver adequate efficiency in time-constrained environments. As discussed in the previous section, the challenges and constraints of CC can be overcome by introducing a virtual fog layer between end devices and the cloud, referred to as FC [9].

3 Fog Computing

Fog Computing places physical or virtual resources between conventional data centers and smart devices [10]. It thus provides a scalable, distributed, and latency-sensitive solution for storage and computation. It also provides several other significant advantages in terms of low latency, reduced data transmission, and bottleneck removal. All these advantages arise because data is processed nearer to its source than in CC [11]. Additionally, FC raises service availability, as connected devices do not need to connect to a centralized server to obtain a service. The advantages of FC over CC can be summarized as follows [12, 13]:

• Reduced response time (latency)
• Significant reduction in data transmission
• Enhanced data security over cloud computing
• Increased scalability

Considering the features of FC, it is evident that it preprocesses a large amount of data at the fog layer. Thereafter, it forwards the summarized information to the cloud, thus reducing the bandwidth requirement by 80% [12]. The architecture of FC is illustrated in Fig. 2. As illustrated there, the fog nodes lie between end devices and the cloud. Each fog node performs several functions, shown in the layered model in Fig. 2. This layered model of the fog node is explained in the following subsection.
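The idea of forwarding summarized information instead of raw readings can be made concrete with a small sketch. It is only an illustration under assumptions: the summary fields, the JSON encoding and the function names are hypothetical, and the saving it computes depends entirely on the window size and encoding, so it is not a reproduction of the 80% figure cited from [12].

```python
import json
from statistics import mean


def summarize_window(samples):
    """Reduce one window of raw readings to a small summary record."""
    return {
        "n": len(samples),
        "min": min(samples),
        "max": max(samples),
        "mean": mean(samples),
    }


def reduction_ratio(raw, summary):
    """Fraction of bytes saved when the summary is sent instead of the raw window."""
    raw_bytes = len(json.dumps(raw).encode("utf-8"))
    summary_bytes = len(json.dumps(summary).encode("utf-8"))
    return 1 - summary_bytes / raw_bytes
```

A fog node would call `summarize_window` once per window and forward only the returned record; `reduction_ratio` gives a rough feel for how much traffic that saves for a given window.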

Fig. 2 Architecture of Fog computing

3.1 Layered Architecture of FC

According to [14, 15], the FC architecture has six layers, viz., physical and virtualization, monitoring, preprocessing, temporary storage, security, and transport, as illustrated in Fig. 2. The physical and virtualization layer manages and controls different types of sensors, which are generally distributed geographically. These sensors sense data from their surroundings and forward it to the upper layers through gateways for further processing [16]. It is followed by the monitoring layer, which monitors the utilization of resources, sensors, fog nodes, and other network elements. It tracks which node is performing which task at what time, and also monitors the status and performance of deployed services and applications [17]. Additionally, it monitors the energy consumption of fog nodes, as FC involves different devices with disparate energy consumption. The next layer is the preprocessing layer, which manages the data by performing operations such as filtering and trimming in order to infer meaningful information. Storage of the processed data is handled by the temporary storage layer. The privacy of data is maintained by the security layer using encryption/decryption; this layer also takes care of data integrity. Finally, the transport layer uploads the processed data to the cloud so that more meaningful information can be derived. In the transport layer, emphasis is laid on minimizing the amount of data uploaded so as to conserve energy. Hence, the gateway device analyzes the data before forwarding it to the cloud. A gateway that is capable of data analysis is called a smart gateway. Once the gateway forwards the data to the cloud, it is stored there to cater to various user demands [9]. Handling this huge amount of continuously evolving data has several associated challenges, some of which are discussed in the next subsection.
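Part of the layered flow described above can be sketched as a short pipeline: preprocessing (filtering and trimming), a stand-in for temporary storage, an integrity tag from the security layer, and a message ready for the transport layer. This is a minimal sketch under stated assumptions, not the architecture of [14, 15]: the function names, value range and key handling are hypothetical, and encryption is deliberately left out because the Python standard library provides no cipher, so only the integrity aspect is illustrated, using HMAC-SHA256.

```python
import hashlib
import hmac
import json


def preprocess(readings, low, high):
    """Preprocessing layer: drop out-of-range values (filtering) and round (trimming)."""
    return [round(r, 2) for r in readings if low <= r <= high]


def sign(payload: bytes, key: bytes) -> str:
    """Security layer, integrity only: HMAC-SHA256 tag over the payload."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def fog_node_step(readings, key, low=-5.0, high=5.0):
    """Run one batch through the layers and return the message for transport."""
    buffered = preprocess(readings, low, high)        # preprocessing layer
    payload = json.dumps(buffered).encode("utf-8")    # temporary storage stand-in
    return {"payload": payload, "tag": sign(payload, key)}


def verify(message, key) -> bool:
    """Cloud-side check that the payload was not altered in transit."""
    return hmac.compare_digest(message["tag"], sign(message["payload"], key))
```

The receiver recomputes the tag with the shared key; a mismatch means the payload or the key differs from what the fog node used.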

3.2 Challenges of FC

This subsection focuses on the latest challenges faced by FC architectures. These challenges arise mainly from the constant expansion of unstructured data [18]. Some of them are as follows:

Scalability: Many existing FC architectures cannot handle the magnitude of data expansion in IoT networks. This calls for an efficient FC architecture that is capable of handling data scalability.
Multi-objective design for fog systems: Many prevalent FC architectures consider only a few objectives, such as QoS and cost, and address them using load balancing or service provisioning. Other vital objectives, e.g., user experience, have been ignored. Hence, the challenge is to design a multi-objective architecture.
Mobile fog computing: In the related literature, researchers generally presume immobile fog nodes and do not consider mobile fog nodes. However, the mobility of fog nodes cannot be ignored, and it adds to the associated challenges.
User experience: The fog layer handles a plethora of data generated by end devices, and each end device is handled by some fog node. If load is imbalanced across nodes, end users may have a poor experience. It is therefore necessary to provide an acceptable user experience to each end user. To this end, fog nodes may be continuously monitored so as to obtain an effective QoS.
Robustness and heterogeneity: Existing fog networks hardly consider the failure of fog nodes, which is not uncommon. Fog nodes may also experience attacks such as Denial of Service (DoS), since they have limited capability and functionality. Such attacks hinder the performance of the network, so it is necessary to obtain a robust fog network. An efficient fog network should also consider the heterogeneous nature of fog and IoT nodes. To handle heterogeneous devices, conventional protocols need to be suitably modified.

It is evident that FC architecture involves several constraints and challenges owing to the huge, unstructured, and continuously evolving data at the fog layer. This data needs to be preprocessed at the fog layer before it is forwarded to the cloud node. Thereafter, the data on the cloud is processed and analyzed to infer more meaningful information as per user demands. Consequently, analysis of data is the prime focus in such networks. The authors also briefly present Big Data Analytics (BDA) with reference to FC in the following section for completeness [10, 19].

As discussed earlier, FC outperforms CC on several parameters such as latency, reliability, and scalability. For end users, the most significant parameter for evaluating the effectiveness of FC is generally its response time. Hence, rigorous research continues in the field of FC to enhance its performance, mainly in the directions of data analytics and resource scheduling [20]. Data analytics ensures that data is optimally analyzed at the fog layer to obtain a significant reduction in its size, leading to minimal data transfer. Resource scheduling, on the other hand, ensures that each resource in the network (mainly fog nodes) is utilized efficiently so as to yield maximum throughput in minimum time, and that no fog node remains underutilized or overburdened. In this chapter, the authors present the research in the direction of data analytics and resource scheduling, and then propose an efficient framework that uses load balancing and efficient data analytics to obtain the maximum effectiveness of FC.

As discussed earlier, Data Analytics, together with Resource Scheduling, is the significant principle for achieving QoS. As these applications involve a huge amount of unstructured data, the data is also referred to as Big Data in the literature [10, 21]. Consequently, BDA needs to be understood in order to perform efficient and effective analysis of cloud data.
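The resource scheduling goal stated above, that no fog node remains overburdened, together with the abstract's idea of an overburdened node asking a neighbor to share its load, can be sketched as a simple placement rule. The threshold of 0.8, the cost units and all names below are hypothetical choices for illustration; the chapter's own framework is presented in a later section and may differ.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class FogNode:
    name: str
    capacity: int                       # abstract work units the node can hold
    load: int = 0
    neighbors: list = field(default_factory=list)

    def utilization(self) -> float:
        return self.load / self.capacity


def assign(node: FogNode, task_cost: int, threshold: float = 0.8) -> FogNode:
    """Place a task on `node`, or on its least-utilized neighbor with room.

    A node is treated as overburdened if accepting the task would push its
    utilization above `threshold`.
    """
    if (node.load + task_cost) / node.capacity <= threshold:
        target = node
    else:
        candidates = [
            n for n in node.neighbors
            if (n.load + task_cost) / n.capacity <= threshold
        ]
        # Fall back to the original node when no neighbor has room.
        target = min(candidates, key=FogNode.utilization) if candidates else node
    target.load += task_cost
    return target
```

For instance, a node at 8 of 10 units that receives a task of cost 2 would hand it to the neighbor with the lowest utilization that can still stay at or below the threshold.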

4 Data Analytics in FC

Data Analytics (DA) primarily involves applications built on statistical algorithms, predictive models, etc. As analytics is generally performed on an ocean of data (Big Data), it may also be referred to as BDA. BDA processes huge amounts of data and finds meaningful patterns, associations, and other insights, which commercial organizations use to improve the quality and spectrum of their services [21]. BDA typically processes data in a store-and-process manner, where processing is done on offline data. Sometimes, however, processing is required on real-time data, e.g., in health monitoring systems and stock markets [22]. A model for BDA of real-time and offline data is illustrated in Fig. 3.

In BDA, the initial step is the collection of data from various sources. In the context of this chapter, these sources are the end IoT devices. Collection is followed by cleaning, which removes inconsistencies and errors so that correct and valid conclusions can be inferred. Data cleaning is an important step in BDA; it has to handle a huge amount of unstructured data and thus requires substantial effort and time [10]. The unstructured data is generally transformed into structured data, which significantly reduces the effort and time required. The cleaned data is passed to the next step, feature extraction. Thereafter, the data is stored in a data warehouse, where it is analyzed using various data analytics tools. Prevalent tools for analysis are machine learning, knowledge-based queries, and visualization [12]. However, BDA faces the challenge of rapidly growing data, which makes efficient and accurate interpretation difficult. This rapidly evolving data increases the response time and thus degrades performance, which motivates researchers to devise efficient approaches to handle these issues.
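The offline steps described above (collection, cleaning, feature extraction, analysis) can be sketched as a chain of small functions. This is an illustrative toy, assuming records are dictionaries with hypothetical `device` and `value` fields; the warehouse storage step is omitted and the analysis is reduced to a single threshold rule.

```python
from statistics import mean, pstdev


def clean(records):
    """Cleaning: drop records with a missing device or a non-numeric value."""
    return [
        r for r in records
        if r.get("device") and isinstance(r.get("value"), (int, float))
    ]


def extract_features(records):
    """Feature extraction: per-device count, mean and population std deviation."""
    by_device = {}
    for r in records:
        by_device.setdefault(r["device"], []).append(r["value"])
    return {
        d: {"n": len(v), "mean": mean(v), "std": pstdev(v)}
        for d, v in by_device.items()
    }


def flag_devices(features, limit):
    """A toy analysis step: list devices whose mean reading exceeds `limit`."""
    return sorted(d for d, f in features.items() if f["mean"] > limit)
```

A real deployment would replace the last step with the analysis tools the text mentions (machine learning, knowledge-based queries, visualization); the point here is only the order of the stages.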
It is worth mentioning that BDA and FDA are referred to interchangeably in the literature. While FDA refers to analytics at the fog layer, BDA has a broader scope that also covers data analytics at the fog layer, i.e., FDA. The fog layer is a virtualized middle layer between end IoT devices and CC. Like CC, the fog layer also provides compute, storage, and network services. However, it has


R. Akhare et al.

Fig. 3 Illustration of data processing in Fog computing

some distinctive features that make it a suitable choice for several applications with rigorous requirements such as real-time systems, online analytics, etc. [23]. As the fog layer sits between the end devices and the cloud layer, it analyzes the data before forwarding it to the cloud. This analysis at the fog layer greatly reduces the cost of processing and storage at the cloud. As FC is still evolving, researchers have been attempting to propose more efficient fog architectures and frameworks capable of handling continuously incoming unstructured data from heterogeneous devices. In general, every proposed architecture works on the principle of pushing infrequently accessed data to the cloud while storing frequently accessed data at fog nodes. An efficient fog analytics method is therefore needed to minimize the data transfer between the fog layer and the cloud. For this purpose, the associated challenges need to be understood. Some of these challenges are as follows:
• Analysis and processing of the huge amount of continuous data generated by heterogeneous IoT devices
• Distantly located cloud, which degrades throughput and response time (latency)

Proposed Framework for Fog Computing to Improve …


• Concern for the security and privacy of data
• Constrained resources in the network
• Integration of heterogeneous software and hardware devices

In order to address these challenges, data analytics is performed at fog nodes, i.e., near the end devices. This reduces the data transfer in the network between end devices and cloud. Data analytics at fog nodes is referred to as fog analytics, which is discussed in the following subsection.

4.1 Fog Analytics

As mentioned previously, fog analytics performs data processing at the fog nodes in order to compress the ocean of data. This significantly reduces data transfer between end devices and cloud. During fog analytics, the prime challenge is to scrutinize the data; this involves data cleaning, data aggregation, and data analytics. All these tasks are performed by the fog engine working at fog nodes, as shown earlier in Fig. 3. The Fog Engine (FE) mainly performs the following tasks:
• Preprocessing and analysis of data near the site of its generation
• Interaction with IoT devices

Thus, FE enables processing the data near its generation site. FE processes the in-stream data locally, and the processed data is thereafter transmitted to the cloud for global and offline data analytics. By contrast, analyzing all data on the cloud involves higher cost and latency. Moreover, the introduction of FE offers improved fault tolerance: if a fog node fails, its processing task is moved to another FE in its neighborhood. Thus, numerous FEs build a virtual network called Fog, where neighboring FEs communicate and exchange data. It is worth noting that FEs are battery operated and thus need to be energy efficient [24]. The cloud, in contrast, enjoys a continuous power supply, which favors pushing energy-demanding tasks to the cloud.

Every FE performs the following functions: (i) analytics and storage of data, (ii) communication with peer fog nodes, with the cloud, and with IoT devices, and (iii) synchronization of fog nodes with the cloud. The analytics model employed at FE is determined by cloud analytics. During data analytics at FE, the data received from IoT devices is stored locally in fog nodes until the storage limit is reached. Once the storage is full, data is offloaded to the cloud at regular intervals.
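The local buffering and offloading behaviour of an FE can be illustrated with a small sketch. The class name `FogEngine`, its `capacity` parameter, and the `offload` callback are hypothetical names chosen for this example; the chapter does not prescribe an implementation, and this sketch offloads as soon as the buffer is full rather than at timed intervals.

```python
class FogEngine:
    """Hypothetical fog engine that buffers readings and offloads when full."""

    def __init__(self, capacity, offload):
        self.capacity = capacity   # maximum number of readings kept locally
        self.offload = offload     # callable that ships a batch to the cloud
        self.buffer = []

    def ingest(self, reading):
        """Store a reading locally; flush the batch once the buffer is full."""
        self.buffer.append(reading)
        if len(self.buffer) >= self.capacity:
            self.flush()

    def flush(self):
        """Send a local summary plus the raw batch to the cloud, then clear."""
        if not self.buffer:
            return
        summary = {"count": len(self.buffer),
                   "min": min(self.buffer),
                   "max": max(self.buffer)}
        self.offload(summary, list(self.buffer))
        self.buffer.clear()

# A list stands in for the cloud endpoint in this sketch.
cloud_log = []
engine = FogEngine(capacity=3,
                   offload=lambda summary, batch: cloud_log.append((summary, batch)))
for value in [21.5, 22.0, 21.8, 22.4]:
    engine.ingest(value)
```

After the four readings, one batch of three has been offloaded together with its locally computed summary, and the fourth reading is still held at the fog node.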
As previously mentioned, local analysis of data reduces the complexity of analytics at the cloud. In addition to local data analytics at FE, data aggregation also reduces the communication overhead between IoT devices and cloud [25]. Data aggregation achieves this reduction through redundancy elimination and data compression. Data aggregation also focuses on collecting a critical data set and then generating the original data from the


small data set [26, 27]. Thus, like data analytics, data aggregation is another major function of FE that lightens the data processing task at the cloud. Relieving the cloud of this task enables its application in real-life scenarios such as smart cities [28, 29], where sensory devices located throughout the city generate an abundance of data. The authors in [28] have proposed an architecture for smart cities that combines the advantages of both cloud and FC. The proposed architecture reduces network traffic and is also suitable for latency-sensitive applications. FE performs data acquisition, during which data redundancy is removed using data filters. Non-redundant data can be further reduced in size using data aggregation at the next fog layer. Hence, FE at layer 1 stores the data generated at this level and processes it locally. The processed data at layer 1 is forwarded to fog layer 2, where it is stored temporarily. Hence, FE at layer 2 stores data for a larger area, but the data is less recent. FE at layer 2 also performs some data processing, and finally the data is uploaded to the cloud, where it is permanently stored for future inference and predictions. Thus, each layer performs local processing on its data and reduces its size. Finally, the cloud stores data for the complete area owing to its far larger storage and computational capability. It can be concluded that this hierarchical model [28] achieves a reduction in network traffic and response latency. Furthermore, the architecture in [28] proposes that the data processed at the cloud may also be sent back to the fog edges, which helps make recent data available at fog nodes. Similarly, the authors in [30] propose the FACE (Filtering, Aggregation, Compression, and Extraction) framework.
According to this approach, data aggregation may not always be possible, as some applications require storing all the data. In such cases, a compression function needs to be employed to reduce the size of the network flow. Like filtering, data extraction focuses on extracting only the relevant and required information. For example, in a scenario of automated vehicle monitoring, an image of the license plate is captured. With extraction, it is advisable to send the license number as text rather than the captured image. As per [31], data reduction techniques are broadly classified into three categories: data compression, data forecasting, and in-network processing. In data compression, compression algorithms are used to reduce the size of data. Transmissions can also be suppressed: for example, if sensors are deployed to monitor the room temperature, the temperature is not forwarded to the fog layer until it deviates from the previous reading beyond a certain threshold. This significantly reduces data transmission, as the room temperature does not vary often. The same approach can be used for human body temperature, moisture, heartbeat, etc. The authors in [32] discuss the problem of distributing data storage in the network. According to [32], data should be distributed across multiple fog nodes considering several aspects such as fault tolerance, efficiency, security and privacy, scalability, energy consumption, etc. Even leading companies like Amazon and Google have their own strategies to distribute their data storage. Moreover, efficient scheduling of resources is another vital concern in FC, as it involves a huge volume of data. The authors in [33] have proposed a fuzzy clustering algorithm with Particle Swarm Optimization (PSO) to obtain global optima. According to this, a task is divided into multiple


subtasks, which are submitted to the task scheduler in the fog environment. The task scheduler collects the scheduling information from users, gateways, and resources and then assigns tasks to the corresponding fog nodes. The authors in [34] also present a model for IoT Big Data Analytics with fog computing in the context of a smart home. The model presented in [34] consists of data acquisition, IoT management, fog nodes, and a cloud system that uses the Apriori algorithm to uncover associations [35]. In the next section, the authors discuss the significance of resource scheduling at the fog layer, along with the findings of various researchers in this direction.
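As a sketch of the threshold-based reduction described in this subsection, the filter below forwards a reading only when it differs from the last forwarded value by more than a threshold. The function name `deadband_filter`, the sample readings, and the 3.0 degree threshold are illustrative assumptions; a real deployment would choose the threshold according to the monitored parameter.

```python
def deadband_filter(readings, threshold):
    """Yield only readings that differ from the last sent value by more than threshold."""
    last_sent = None
    for value in readings:
        # The first reading is always sent so that the receiver has a baseline.
        if last_sent is None or abs(value - last_sent) > threshold:
            last_sent = value
            yield value

# Hypothetical room temperature samples in degrees Celsius.
room_temps = [25.0, 25.4, 26.1, 27.9, 28.2, 28.0, 24.9, 21.5]
sent = list(deadband_filter(room_temps, threshold=3.0))
```

With these samples, only four of the eight readings are forwarded, because each comparison is made against the value most recently sent rather than the immediately preceding sample.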

5 Resource Scheduling in FC

This section discusses the scheduling of devices in the network, mainly fog nodes. Several researchers have worked on efficient resource scheduling within the network. Resource scheduling primarily works on the principle of load balancing; consequently, load balancing and resource scheduling have been used interchangeably in the literature. The aim of resource scheduling is to obtain the best possible performance from these nodes. In particular, the goals of load balancing can be summarized as follows [36]:
• Minimum response time
• Maximum throughput
• Optimum resource utilization
• Scalability
• Robustness and fault tolerance
• Low overhead

Resource scheduling is achieved by efficiently distributing the incoming workload among multiple fog nodes [36]. Load balancing is broadly categorized into static and dynamic methods [37]. In static load balancing, load is distributed based on prior knowledge of the tasks available at the beginning of execution. In contrast, the dynamic load-balancing approach allocates tasks to a node based on the current status of the network. Dynamic load balancing may further be categorized as centralized or distributed. In the centralized model, distribution of workload is done by a single node, whereas in the distributed model the task of workload distribution is shared by multiple nodes [37]. It is worth mentioning that static methods of load balancing are non-preemptive, and hence a task cannot be reallocated to another node during its execution. In dynamic resource scheduling, however, tasks can be migrated to another node during execution if the need arises. Hence, real-time load assessment is required for efficient and effective load balancing. Real-time assessment aids in redistributing the load (computational load, virtual memory, and network load) of overloaded nodes among under-loaded or unutilized nodes [37].
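The difference between static and dynamic load balancing can be made concrete with a short sketch. Round-robin is used here as one simple static policy and least-loaded assignment as one simple dynamic policy; both choices, together with the function names and the sample loads, are assumptions for illustration and are not taken from [37].

```python
from itertools import cycle

def static_round_robin(tasks, nodes):
    """Static: the assignment is fixed up front and ignores current node load."""
    order = cycle(nodes)
    return {task: next(order) for task in tasks}

def dynamic_least_loaded(tasks, load):
    """Dynamic: each task goes to the node with the smallest load at that moment."""
    load = dict(load)                     # work on a copy of the current status
    assignment = {}
    for task, cost in tasks.items():
        node = min(load, key=load.get)    # ties resolve to the first such node
        assignment[task] = node
        load[node] += cost                # update status before the next decision
    return assignment, load

# Hypothetical tasks with their computational cost, and two fog nodes.
tasks = {"t1": 4, "t2": 2, "t3": 3}
static_plan = static_round_robin(tasks, ["f1", "f2"])
dynamic_plan, final_load = dynamic_least_loaded(tasks, {"f1": 5, "f2": 1})
```

The static plan sends two tasks to `f1` even though it starts out more heavily loaded, whereas the dynamic plan reacts to the load observed before each assignment.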


During load balancing, it also needs to be considered whether there is any dependency among the scheduled tasks. Dependent tasks need to be handled differently from independent tasks, as the former involve data communication among tasks. Dependent task scheduling requires building a task graph in the form of a Directed Acyclic Graph (DAG) [17]. The authors in [17] propose applying the I-Apriori algorithm to the task set T. The I-Apriori algorithm discovers the association rules in the task set. This task set is later given to the TSFC (Task Scheduling in Fog Computing) layer to determine the scheduling relationship among tasks, as shown in Fig. 4. Each fog node performs the functions shown in Fig. 5 on the data set in order to perform data analytics. The primary steps in the analytics are the I-Apriori algorithm and the resulting association rules, followed by the scheduling relation.

Fig. 4 Illustration of task scheduling in Fog computing

[Flow diagram: Transaction Set D → I-Apriori → Association Rules → Scheduling Relation R]

Fig. 5 Stepwise Illustration of Task Scheduling
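The role of the DAG in dependent task scheduling can be sketched as follows. This example only shows how a valid execution order is derived from task dependencies using Python's standard `graphlib` module; it does not implement the I-Apriori algorithm or the TSFC layer of [17], and the task names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task graph: each task maps to the set of tasks it depends on.
dependencies = {
    "aggregate": {"read_sensor_a", "read_sensor_b"},
    "analyze": {"aggregate"},
    "report": {"analyze"},
}

sorter = TopologicalSorter(dependencies)
order = list(sorter.static_order())   # predecessors always precede dependents
position = {task: index for index, task in enumerate(order)}
```

Any order produced this way guarantees that a task is scheduled only after the tasks whose output it needs, which is the property a scheduler must preserve when placing dependent tasks on different fog nodes.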


The authors in [38] also propose a load-balancing model for the fog environment. The proposed model consists of four steps, viz., fog service partition, spare space detection, static resource allocation, and global load-balance-driven resource allocation. In fog service partition, fog services are partitioned based on their resource requirements. In the second step, i.e., spare space detection, computing nodes are analyzed in order to detect spare space and determine whether they can host a fog service. In the resource allocation phase, workload from overburdened computing nodes is migrated to under-loaded computing nodes. In this step, resource allocation is done considering the fog service subset. This is refined in the last phase, which performs load balancing with the objective of global load balance during the execution period. Several authors have presented architectures for fog computing. A brief description of the architectures proposed by various researchers is given in Table 1.

Table 1 Proposed frameworks contributed by various researchers

| Research work | Contribution | Features |
|---|---|---|
| Salonikias et al. [39] | Proposed a three-layer architecture for vehicles and sensors. Consists of vehicles and personal devices | Low latency; less congestion at cloud; local data storage at roadside units |
| Sehgal et al. [40] | Three-layer architecture, viz., IoT, Fog, and Cloud | Distributed expert system for latency-sensitive applications |
| Liu et al. [13] | Proposed a security-based intelligent traffic light control | Secure and authenticated data storage and data computation. Limitation: no access control and data privacy at end devices |
| Basudan et al. [41] | Proposed an improved certificate distribution approach for enhanced security among IoT devices | Secure authentication, access control, and data computation, but no secure data storage and data privacy at end devices |
| Zeadally et al. [42] | Proposed an approach for deduplication of encrypted data in fog storage | Secure and authenticated access control. Achieves privacy and secure data storage at end devices |
| Jayaraman et al. [43] | Secure machine-to-machine networks using smart gateways | Secure and authenticated access control. Data computation, data storage, and data privacy at end devices |
| Dsouza et al. [44] | Secure collaboration between different user-requested resources using various smart devices | Secure and authenticated access control. Data storage, computation, and data privacy at end devices are not secured |

The authors in [45] have proposed a task-scheduling model that aims to maximize the utilization of fog nodes. According to the proposed model, the fog layer consists of


several fog nodes, which are organized into fog colonies. Each fog colony is managed by a fog orchestration node, which is a node with extended capability for managing fog nodes. In this model, whenever a task is requested by a client, the corresponding orchestration node determines where to deploy the task. It may decide to deploy the task on itself, on a fog node within the same fog colony, on a fog node in some other fog colony, or on the cloud. This decision is based on the resource requirements of the task and the current status of the computing nodes in the network. The approach ensures that adequate resources (CPU time and memory) are available to accomplish the requested task in the required time. The task completion time involves deployment time, execution time, and communication time (comprising two-way communication to transfer the task to the selected node and to return the result from the selected node to the orchestration node). It is worth mentioning that if a task is migrated to the cloud, its processing time is taken to be negligible owing to the abundant resources at the cloud. Hence, in such a case, the completion time involves only communication time, which is higher than for communication within the fog layer. Considering the underlying principles that help to obtain efficiency and effectiveness in FC, the authors propose a framework intended to improve on existing architectures and obtain improved QoS. The proposed framework is presented in the following section.
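The placement decision of the orchestration node described above can be sketched as a selection over candidate nodes. The `Candidate` fields, the units, the deadline parameter, and all numeric values are hypothetical and serve only to illustrate the rule "enough resources, and deployment plus execution plus communication time within the required time"; they are not taken from [45].

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    free_cpu: float   # available CPU share (hypothetical unit)
    free_mem: int     # available memory in MB
    deploy: float     # deployment time in seconds
    execute: float    # execution time in seconds (taken as 0 for the cloud)
    comm: float       # two-way communication time in seconds

def completion_time(c):
    """Deployment time + execution time + two-way communication time."""
    return c.deploy + c.execute + c.comm

def place(task_cpu, task_mem, deadline, candidates):
    """Pick the fastest candidate that has enough resources and meets the deadline."""
    feasible = [c for c in candidates
                if c.free_cpu >= task_cpu and c.free_mem >= task_mem
                and completion_time(c) <= deadline]
    return min(feasible, key=completion_time, default=None)

candidates = [
    Candidate("orchestrator", 0.2, 256, 0.1, 2.0, 0.0),
    Candidate("fog-node-same-colony", 1.0, 512, 0.3, 1.0, 0.2),
    Candidate("fog-node-other-colony", 2.0, 1024, 0.3, 0.8, 0.6),
    Candidate("cloud", 64.0, 65536, 0.0, 0.0, 3.0),
]
chosen = place(task_cpu=0.5, task_mem=300, deadline=2.0, candidates=candidates)
```

In this example the orchestration node itself lacks CPU, and the cloud is excluded by its communication time, so a fog node in the same colony is selected. `place` returns `None` when no candidate can meet the deadline.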

6 Proposed Framework

The authors in this chapter propose a framework that focuses on two factors discussed previously in the chapter, viz., data aggregation and resource scheduling. The proposed framework aims to aggregate the data near its generation site so as to reduce data transfer. Additionally, it tries to balance the load at the fog layer, which prevents fog nodes from becoming overloaded and thus eliminates bottlenecks. The principles used for data aggregation and resource scheduling are presented first, followed by the proposed framework.

Data Aggregation

The proposed approach aims to reduce data transfer using data aggregation. According to the proposed approach, not every value needs to be sent at the same interval. For example, the temperature of a room may be transferred less frequently than the pulse rate of a patient [46]. The authors do not propose a rigid guideline regarding the frequency for these parameters, as it varies with the application. It can also be determined using the tolerance level of the scenario. Apart from determining the frequency of data transfer, the authors also suggest sending the data at the determined interval only if it deviates from the previous value beyond a certain threshold. For example, consider a sensor that transmits the room temperature every minute. If the previously sent value is 25 °C and the threshold is set to ±3 °C, the room temperature is forwarded to the fog layer only if it goes either below


22 °C or beyond 28 °C. This significantly reduces the data transmission between edge devices and the fog layer, which consequently reduces the overall data transmission. It is worth mentioning that the threshold value is decided in relation to the significance of the monitored parameter and its application.

Resource Scheduling

Resource scheduling aims to balance the computational load across fog nodes so that no fog node remains under-loaded or overburdened. According to the proposed framework, the fog layer, consisting of numerous fog nodes, is divided into several clusters. Fog nodes are clustered based on their geographical location so that neighboring fog nodes are kept in the same cluster. Each cluster is monitored and controlled by a special fog node called the cluster manager, which is responsible for balancing load among all nodes within the cluster. The concept of the cluster manager is illustrated in Fig. 6, where each cluster manager is represented by a fog node surrounded by a solid black circle. As shown in Fig. 6, intercluster communication takes place through the cluster managers of the concerned clusters. If the cluster manager finds an overburdened node in the cluster, it takes one of the following actions:
• Transfers the load to an under-loaded node within the cluster
• Transfers the load to some other cluster
• Transfers the load to the cloud for further analytics and processing

This decision is taken based on several parameters. The cluster manager aims to accomplish the task in minimum time. To do so, it evaluates the completion time of a task over various fog nodes in the network. Underutilized nodes are taken into consideration while estimating the completion time of a task. The completion time of a task is calculated using Eq. 1:

Fig. 6 Illustration of Clustering in Fog layer


$$
CT(a, f_x) =
\begin{cases}
DT_a\big(CM(C_i), f_x\big) + EX_a(f_x), & f_x \in C_i,\ \forall x, i \\
DT_a\big(CM(C_i), CM(C_j)\big) + DT_a\big(CM(C_j), f_x\big) + EX_a(f_x), & f_x \notin C_i,\ f_x \in C_j,\ i \neq j,\ \forall x, i, j
\end{cases}
\tag{1}
$$

Here, in Eq. 1:
• $CT(a, f_x)$ represents the completion time of task $a$ over fog node $f_x$.
• $DT_a(x, y)$ refers to the data transfer time of task $a$ from fog node $x$ to fog node $y$.
• $CM(C_i)$ represents the cluster manager of cluster $C_i$.
• $EX_a(f_x)$ represents the execution time of task $a$ at fog node $f_x$.

The equation gives the completion time of task $a$ in two circumstances:
1. It is executed by a fog node within the same cluster
2. It is forwarded to a fog node in some other cluster

In the first case, i.e., when task $a$ is executed within the same cluster, the completion time involves the data transfer time from the cluster manager to the fog node $f_x$ and the execution time at $f_x$. In the latter case, i.e., when task $a$ is transferred to a fog node in another cluster, it involves the following:
1. Data transfer time from the cluster manager of the source cluster to the cluster manager of the recipient cluster
2. Data transfer time from the cluster manager of the recipient cluster to the destination fog node $f_x$
3. Execution time of $a$ at fog node $f_x$

Both scenarios are represented in Eq. 1. According to the proposed framework, whenever a task is received by a cluster manager, it determines the least utilized node in the cluster and allocates the task to it so that all fog nodes in the cluster remain balanced. However, when a node is overloaded, the cluster manager determines the nearest node that could share its load and also complete the task in minimum time [47, 48]. Among all nodes in the neighborhood of the overloaded node, a node $f_x$ is chosen such that $CT(a, f_x)$ is minimized. In order to determine whether a node is overloaded, its average load is calculated over a period of time. If the average load of a node goes beyond a predetermined upper threshold, the node is considered to be overloaded. Similarly, if the average load of a node goes below a lower threshold ∇, the node is considered to be under-loaded. The authors propose calculating the average load of each node in the network at regular intervals so as to balance the load. The proposed framework is illustrated in Fig. 7. As shown in Fig. 7, fog nodes are grouped into clusters.
The cluster manager of each cluster is depicted differently from the ordinary fog nodes. As shown in Fig. 7, the cluster manager checks whether the task can be allocated to a node in the same cluster or should be allocated to a node in another cluster. This decision is taken using Eq. 1 as given above. The detailed procedure for the proposed framework is represented in Fig. 8.
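The node selection based on Eq. 1 and the overload check based on average load can be sketched together. The topology, the transfer and execution times, the threshold values, and all function names below are hypothetical; the sketch assumes that transfer times between the relevant pairs of nodes are known in advance.

```python
def completion_time(task_exec, transfer, src_cm, node, cluster_of, manager_of):
    """Completion time CT(a, f_x) following the two cases of Eq. 1."""
    dst_cm = manager_of[cluster_of[node]]
    if dst_cm == src_cm:                 # case 1: node lies in the same cluster
        hop = transfer[(src_cm, node)]
    else:                                # case 2: route through both cluster managers
        hop = transfer[(src_cm, dst_cm)] + transfer[(dst_cm, node)]
    return hop + task_exec[node]

def classify(avg_load, low, high):
    """Label a node by comparing its average load with two thresholds."""
    if avg_load > high:
        return "overloaded"
    if avg_load < low:
        return "under-loaded"
    return "balanced"

# Hypothetical topology: two clusters, managed by cm1 and cm2.
cluster_of = {"f1": "C1", "f2": "C1", "f3": "C2"}
manager_of = {"C1": "cm1", "C2": "cm2"}
transfer = {("cm1", "f1"): 0.4, ("cm1", "f2"): 0.2,
            ("cm1", "cm2"): 0.5, ("cm2", "f3"): 0.1}   # DT_a values in seconds
task_exec = {"f1": 1.0, "f2": 2.0, "f3": 0.5}           # EX_a values in seconds

# Evaluate CT(a, f_x) for every candidate from the viewpoint of cm1.
times = {node: completion_time(task_exec, transfer, "cm1", node,
                               cluster_of, manager_of)
         for node in cluster_of}
best = min(times, key=times.get)
```

In this example the node in the other cluster is chosen despite the extra hop between cluster managers, because its short execution time outweighs the additional transfer time.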


Fig. 7 Illustration of Intracluster and Intercluster communication in proposed framework

[Sequence diagram. Participants: IoT devices, cluster manager, fog nodes, cloud nodes. Messages: sends request; determines the fog node for allocation; forwards to selected fog node; decompose job; executes job; sends job result; sends result.]

Fig. 8 Sequence diagram for proposed framework

As demonstrated in Fig. 8, the IoT devices generate a request for a task. This task is first forwarded to the cluster manager of the associated cluster at the fog layer. The cluster manager determines the fog node that should execute the task using Eq. 1. The main objective of the cluster manager is to ensure that the task


is accomplished in minimum time while no node in the cluster is overburdened. Once the node that should execute the task is determined, the task is forwarded to that node. The target fog node may divide the task into multiple subtasks and then execute them. Once the task is accomplished, the result is returned to the IoT device. In some cases, the result may also be communicated to and stored at a cloud node, where it may be useful for further analytics. The proposed framework is expected to outperform existing frameworks, as it employs both data aggregation and resource scheduling: it ensures that load is balanced at the fog layer so that no node remains overburdened or under-loaded. The proposed framework also aims to complete the task in minimum time, as it chooses the fog node for task deployment using Eq. 1 as given earlier.

The efficacy of fog computing for improving QoS is explained through a case study from the healthcare industry. The case study concerns a health-monitoring system in which healthcare decisions are predicted using biomedical data analytics. This is achieved using omnipresent systems and transmission of data in a real-time environment instead of forwarding it to the cloud. For instance, physical and mental health is evaluated using wearable medical devices, which generate enormous amounts of data. If vital health parameters such as blood sugar and temperature go outside their normal range, an alert message is sent to doctors and the patient through the medical device enabled at the fog. This kind of automated system can also be employed for fall and stroke detection. Fall detection is performed using fog computing, organized into three parts, viz., front-end, back-end, and communication modules working independently. This approach obtained a low false positive rate in comparison to the existing system. To achieve this, it uses distributed data analytics between edge devices and the cloud server. This health monitoring system uses the FDA architecture to provide the complex computation required for extensive fall detection.

7 Conclusion and Future Work

In this era of automation and information explosion, efficient approaches are needed to handle and transmit data in a sophisticated manner. Transmission of data incurs substantial cost, which is a matter of grave concern for IoT networks as they involve a plethora of data. Earlier, all data had to be transmitted to the cloud for storage and analytics, but the emergence of FC has reduced the need for this transmission. FC suggests processing the data near its generation site by introducing a virtual layer that consists of fog nodes. As fog nodes are limited in storage and computational capability, an efficient framework must be devised so as to obtain improved QoS. The authors in this chapter propose a framework that employs data aggregation and resource scheduling. Data aggregation ensures that data is compressed as far from the cloud as possible, i.e., near its source, so that only compressed data travels through the network. Secondly,


resource scheduling aims to monitor the nodes in the fog layer so that no node remains overburdened. It aims to balance the load among the fog nodes in the network. The authors in this chapter have considered static details of a task for its allocation, and the suggested framework works in a non-preemptive manner. The proposed work can be extended to consider dynamic details of the task. It can also be extended to provide a preemptive strategy for task allocation.

References

1. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N.: Fog computing for Healthcare 4.0 environment: opportunities and challenges. Comput. Electric. Eng. 72, 1–13 (2018)
2. Mangla, M., Akhare, R., Ambarkar, S.: Context-aware automation based energy conservation techniques for IoT ecosystem. In: Energy Conservation for IoT Devices, pp. 129–153. Springer (2019)
3. Vora, J., DevMurari, P., Tanwar, S., Tyagi, S., Kumar, N., Obaidat, M.S.: Blind signatures based secured e-healthcare system. In: 2018 International Conference on Computer, Information and Telecommunication Systems (CITS), pp. 1–5 (2018)
4. Patel, D., Narmawala, Z., Tanwar, S., Singh, P.K.: A systematic review on scheduling public transport using IoT as tool. In: Smart Innovations in Communication and Computational Sciences, pp. 39–48. Springer (2019)
5. Kumar, S., Goudar, R.H.: Cloud computing – research issues, challenges, architecture, platforms and applications: a survey. Int. J. Futur. Comput. Commun. 356–360 (2012). https://doi.org/10.7763/ijfcc.2012.v1.95
6. Morshed, S., Islam, M.M., Goswami, P.: Cloud computing: a survey on its limitations and potential solutions (2013)
7. Vora, J., et al.: Ensuring privacy and security in E-health records. In: 2018 International Conference on Computer, Information and Telecommunication Systems (CITS), pp. 1–5 (2018)
8. Ambarkar, S.S., Shekokar, N.: Toward smart and secure IoT based healthcare system. In: Internet of Things, Smart Computing and Technology: A Roadmap Ahead, pp. 283–303. Springer (2020)
9. Tanwar, S., Vora, J., Kaneriya, S., Tyagi, S.: Fog-based enhanced safety management system for miners. In: 2017 3rd International Conference on Advances in Computing, Communication & Automation (ICACCA) (Fall), pp. 1–6 (2017)
10. Tanwar, S., Tyagi, S., Kumar, N.: Multimedia Big Data Computing for IoT Applications: Concepts, Paradigms and Solutions, vol. 163. Springer (2019)
11. Mehraeen, E., Ghazisaeedi, M., Farzi, J., Mirshekari, S.: Security challenges in healthcare cloud computing: a systematic. Glob. J. Health Sci. 9(3) (2017)
12. Simmhan, Y.: Big Data and Fog Computing, December 2017. https://doi.org/10.1007/978-3319-63962-8_41-1
13. Mukherjee, M., Shu, L., Wang, D.: Survey of fog computing: Fundamental, network applications, and research challenges. IEEE Commun. Surv. Tutorials 20(3), 1826–1857 (2018). https://doi.org/10.1109/COMST.2018.2814571
14. Muntjir, M., Rahul, M., Alhumyani, H.A.: An analysis of Internet of Things (IoT): novel architectures, modern applications, security aspects and future scope with latest case studies. Int. J. Eng. Res. Technol. 6(06), 422–447 (2017)
15. Verma, M., Yadav, N.B.A.K.: An architecture for load balancing techniques for Fog computing environment. Int. J. Comput. Sci. Commun. 8(2), 43–49 (2015)
16. Dastjerdi, A.V., Gupta, H., Calheiros, R.N., Ghosh, S.K., Buyya, R.: Chapter 4 - Fog computing: principles, architectures, and applications (2016)


17. Liu, L., Qi, D., Zhou, N., Wu, Y.: A task scheduling algorithm based on classification mining in Fog computing environment. Wirel. Commun. Mob. Comput. 2018 (2018)
18. Yousefpour, A., et al.: All one needs to know about fog computing and related edge computing paradigms: a complete survey. J. Syst. Archit. (2019)
19. Srivastava, A., Singh, S.K., Tanwar, S., Tyagi, S.: Suitability of big data analytics in indian banking sector to increase revenue and profitability. In: 2017 3rd International Conference on Advances in Computing, Communication & Automation (ICACCA) (Fall), pp. 1–6 (2017)
20. Kumari, A., Tanwar, S., Tyagi, S., Kumar, N., Parizi, R.M., Choo, K.-K.R.: Fog data analytics: a taxonomy and process model. J. Netw. Comput. Appl. 128, 90–104 (2019)
21. Mehdipour, F., Javadi, B., Mahanti, A., Ramirez-Prado, G.: Fog Computing Realization for Big Data Analytics, no August (2019)
22. Verma, J.P., Tanwar, S., Garg, S., Gandhi, I., Bachani, N.H.: Evaluation of pattern based customized approach for stock market trend prediction with Big Data and Machine Learning techniques. Int. J. Bus. Anal. 6(3), 1–15 (2019)
23. Dastjerdi, A.V., Buyya, R.: Fog computing: helping the Internet of Things realize its potential. Computer (Long. Beach. Calif). 49(8), 112–116 (2016). https://doi.org/10.1109/mc.2016.245
24. Bhardwaj, K.K., Khanna, A., Sharma, D.K., Chhabra, A.: Energy Conservation for IoT Devices, vol. 206 (2019)
25. Chen, S., Du, L., Wang, K., Lu, W.: Fog computing based optimized compressive data collection for big sensory data. In: 2018 IEEE International Conference on Communications (ICC), pp. 1–6 (2018)
26. Dong, M., Ota, K., Liu, A.: RMER: reliable and energy-efficient data collection for large-scale wireless sensor networks. IEEE Internet Things J. 3(4), 511–519 (2016)
27. Liu, F., Wang, Y., Lin, M., Liu, K., Wu, D.: A distributed routing algorithm for data collection in low-duty-cycle wireless sensor networks. IEEE Internet Things J. 4(5), 1420–1433 (2017)
28. Sinaeepourfard, A., García Almiñana, J., Masip Bruin, X., Marín Tordera, E.: Fog-to-Cloud (F2C) data management for smart cities. In: Proceedings of 2017 Future Technologies Conference (FTC): 29–30 November 2017, Vancouver, Canada, pp. 162–172 (2017)
29. Tanwar, S., Tyagi, S., Kumar, S.: The role of internet of things and smart grid for the development of a smart city. In: Intelligent Communication and Computational Technologies, pp. 23–33. Springer (2018)
30. Marquesone, R.D.F.P., et al.: Towards bandwidth optimization in fog computing using FACE framework. In: CLOSER, pp. 463–470 (2017)
31. Ismael, W.M., Gao, M., Al-Shargabi, A.A., Zahary, A.: An in-networking double-layered data reduction for Internet of Things (IoT). Sensors 19(4), 795 (2019)
32. Bermbach, D., et al.: A Research Perspective on Fog Computing
33. Li, G., Liu, Y., Wu, J., Lin, D., Zhao, S.: Methods of resource scheduling based on optimized fuzzy clustering in fog computing. Sensors (Switzerland) 19(9) (2019). https://doi.org/10.3390/s19092122
34. Singh, S., Yassine, A.: IoT Big Data analytics with Fog computing for household energy management in smart grids. In: International Conference on Smart Grid and Internet of Things, pp. 13–22 (2018)
35. Tanwar, S., Patel, P., Patel, K., Tyagi, S., Kumar, N., Obaidat, M.S.: An advanced internet of thing based security alert system for smart home. In: 2017 International Conference on Computer, Information and Telecommunication Systems (CITS), pp. 25–29 (2017)
36. Baek, J., Kaddoum, G., Garg, S., Kaur, K., Gravel, V.: Managing Fog networks using reinforcement learning based load balancing algorithm. arXiv Preprint https://arxiv.org/abs/1901.10023 (2019)
37. Verma, M., Bhardawaj, N., Yadav, A.K.: An architecture for Load Balancing Techniques for Fog Computing Environment. Int. J. Comput. Sci. Commun. 6(2), 269–274 (2015). 10.090592/IJCSC.2015.627
38. Xu, X., et al.: Dynamic resource allocation for load balancing in fog environment. Wirel. Commun. Mob. Comput. 2018 (2018)

Proposed Framework for Fog Computing to Improve …

143

39. Salonikias, S., Mavridis, I., Gritzalis, D.: Access control issues in utilizing fog computing for transport infrastructure. Lecture Notes in Computer Science (LNCS), (including its subseries Lecture Notes in Artificial Intelligence (LNAI) and Lecture Notes in Bioinformatics), vol. 9578, pp. 15–26 (2016). https://doi.org/10.1007/978-3-319-33331-1_2 40. Sehgal, V.K. Patrick, A., Soni, A., Rajput, L.: Intelligent distributed computing. In: Proceedings of the Third International Symposium on Intelligent Informatics, ISI 2014, September 24-27, 2014, Greater Noida, Delhi, India,” no. August (2015). https://doi.org/10.1007/978-3-319-112 27-5 41. Basudan, S., Lin, X., Sankaranarayanan, K.: A privacy-preserving vehicular crowdsensingbased road surface condition monitoring system using fog computing. IEEE Internet Things J. 4(3), 772–782 (2017). https://doi.org/10.1109/JIOT.2017.2666783 42. Zeadally, Z., Isaac, S., Baig, J.T.: Security attacks and solutions in electronic health (E-health) systems. J. Med. Syst. 40, 263 (2016) 43. Jayaraman, P.P., Gomes, J.B., Nguyen, H.L., Abdallah, Z.S., Krishnaswamy, S., Zaslavsky, A.: CARDAP: A scalable energy-efficient context aware distributed mobile data analytics platform for the fog. Lecture Notes in Computer Science (LNCS), (including its subseries Lecture Notes in Artificial Intelligence (LNAI) and Lecture Notes in Bioinformatics), vol. 8716, no. December, pp. 192–206 (2014). https://doi.org/10.1007/978-3-319-10933-6_15 44. Dsouza, C., Ahn, G.J., Taguinod, M.: Policy-driven security management for fog computing: Preliminary framework and a case study. In: Proceedings of 2014 IEEE 15th International Conference on Information Reuse and Integration IEEE IRI 2014, pp. 16–23 (2014). https:// doi.org/10.1109/iri.2014.7051866 45. Tran, M.-Q., Nguyen, D.T., Le, V.A., Nguyen, D.H., Pham, T.V.: Task placement on Fog computing made efficient for IoT application provision. Wirel. Commun. Mob. Comput. 2019 (2019) 46. 
Vora, J., Tanwar, S., Tyagi, S., Kumar, N., Rodrigues, J.J.P.C.: FAAL: Fog computing-based patient monitoring system for ambient assisted living. In: 2017 IEEE 19th International Conference on e-Health Networking, Applications and Services (Healthcom), pp. 1–6 (2017) 47. Mangla, M., Garg, D.: Rapidly converging solution for p-centers in nonconvex regions. Turkish J. Electr. Eng. Comput. Sci. 25(3), 2424–2433 (2017) 48. Wadhwa, V., Garg, D.: Facility location problem using Genetic algorithm: a review. Res. J. Comput. Syst. Eng. 2(2) (2011)

Fog Data Based Statistical Analysis to Check Effects of Yajna and Mantra Science: Next Generation Health Practices

Rohit Rastogi, Mamta Saxena, D. K. Chaturvedi, Santosh Satya, Navneet Arora, Mayank Gupta, and Parul Singhal

R. Rastogi (B), Department of CSE, ABESEC Ghaziabad, Ghaziabad, India
M. Saxena, Director General, DSDD, Ministry of Statistics & Program Implementation, GoI, Delhi, India
R. Rastogi · D. K. Chaturvedi, Department of Electrical Engineering, DEI, Agra, India
S. Satya, Centre for Rural Development and Technology, IIT-Delhi, Delhi, India
N. Arora, Mechanical & Industrial Engineering Department, IIT-Roorkee, Roorkee, India
M. Gupta, Tata Consultancy Services, Noida, India
P. Singhal, Department of CSE, ABESEC Ghaziabad, Ghaziabad, India

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020. S. Tanwar (ed.), Fog Data Analytics for IoT Applications, Studies in Big Data 76, https://doi.org/10.1007/978-981-15-6044-6_8

Abstract Our neighboring country China is facing the coronavirus, whose common symptoms resemble the common cold and flu but can become life threatening. So what is the solution? The first measure is wearing a good-quality mask in public places and maintaining good hygiene. The other is environmental purification, which one can do by medicinal havan with Geloye and Pragya pey, a herbal tea used along with common Havan Samagri and made from one cup of decoction of Geloye with Vasa and kalmegha, which can boost the immunity of the individual. It is claimed that the smoke of mango wood and Googgal can kill a deadly virus like corona and that a person infected with corona or other deadly viruses can be cured by such smoke. This may be a matter of debate with regard to Indian culture, but it is a question of biological science. A Havan Samagri was specially designed for asthmatic patients in a 3:1 ratio; Surya Gayatri Mantra chanting was performed 24 times, and Nadi Shodhan Pranayam was practiced for half an hour daily. Subjects were asked to take kwath of Havan Samagri twice a day. Lung function tests (LFT) were carried out from 18 Dec 2018 to 24 Dec 2018 to check the efficacy of Yagyopathy. Three parameters were tested: forced vital capacity (FVC), forced expiratory volume (FEV), and the ratio FEV1/FVC, referred to as the measured expiratory rate (MER). FVC measures lung capacity, i.e., the volume of air exhaled after a deep inhalation. MER reflects the proportion of that volume exhaled in one second. Results were tracked for one year and nine months and showed a significant improvement in lung function parameters. Results improved further when the practice was done twice a day.

Keywords Diabetes · T1D · T2D · Gestational diabetes · Migraine · Anxiety · Hypertension · Stress · ML and AI in healthcare · Machine vision · Medical images and analysis · Yajna science and mantra science · Gayatri Mantra (GM) · OM chanting · OM symbol · Rudraksha · Elaeocarpus
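The relationship between the lung function parameters named in the abstract can be illustrated with a small sketch. It assumes FEV1 and FVC are both recorded in litres and computes the FEV1/FVC ratio (called MER in this chapter) together with the relative change between two readings. The function names and the sample values are hypothetical and are not taken from the study.

```python
def fev1_fvc_ratio(fev1_litres: float, fvc_litres: float) -> float:
    """Return FEV1/FVC as a fraction between 0 and 1."""
    if fvc_litres <= 0:
        raise ValueError("FVC must be positive")
    if fev1_litres < 0 or fev1_litres > fvc_litres:
        raise ValueError("FEV1 must lie between 0 and FVC")
    return fev1_litres / fvc_litres


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("baseline value must be non-zero")
    return (after - before) * 100.0 / before


if __name__ == "__main__":
    # Hypothetical readings in litres; NOT data from the study.
    baseline = fev1_fvc_ratio(fev1_litres=3.0, fvc_litres=4.0)
    follow_up = fev1_fvc_ratio(fev1_litres=3.2, fvc_litres=4.0)
    print(f"baseline ratio:  {baseline:.2f}")
    print(f"follow-up ratio: {follow_up:.2f}")
    print(f"change:          {percent_change(baseline, follow_up):.1f}%")
```

Comparing ratios rather than raw volumes is one possible way to keep readings from subjects with different lung sizes on a common scale; whether the study analysed its data this way is not stated.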

1 Introduction

The oldest form of Hindu worship is known as the Havan Yajna. It is a religious service in which a consecrated fire is lit and Sanskrit mantras are recited. Some scholars wrongly imagine that it is fire worship, which it is not. It rests on the principle of sacrifice for others. This is the basic purpose of Havan Yajna: lighting a fire and offering wood, ghee, and herbs is a meaningful act of giving and teaches one not to be selfish. Reciting prayers in a group teaches one to live happily by sharing with others. It is common among Western scholars and their Westernized students to distinguish between the Vedic yagna and the Puranic puja traditions, which characterize the two major periods of Hinduism: one that flourished over 3,000 years ago and one that arose 2,000 years ago. At the other extreme, there are certain bhakta-Indologists who claim that Hinduism has no history or stages of development and that everything in it is homogeneous and static. The historical record does not support this claim.

1.1 Different Diseases

1.1.1 Diabetes

Diabetes is a condition that arises when the sugar level in the body increases. Its prevalence increases with age, which is why it is seen mainly in older people. A main contributing factor is regularly consuming large amounts of sugar. If neglected, diabetes can cause serious damage to the body. To guard against it, we should have proper knowledge of diabetes and of the practices that help prevent it. Diabetes is also known as diabetes mellitus. In diabetes, there are mainly two possibilities: either the body is unable to make enough insulin, or the body cannot use the insulin it makes. There are various types of diabetes:

Type 1 Diabetes: The immune system attacks and destroys the cells in the pancreas where insulin is made.
Type 2 Diabetes: The body becomes resistant to insulin, and sugar builds up in the blood.
Prediabetes: The blood sugar is higher than normal.
Gestational Diabetes: The sugar level increases during pregnancy (Stephanie Watson, Healthline, 4 October 2018; [1, 2]).

1.1.2 TTH and Stress

Tension-type headache (TTH) is now the most common type of headache. It causes mild pain in the head and behind the eyes, and episodes may occur once or twice a month [3, 4].

Causes of Tension-type Headaches: Tension-type headaches are triggered by a variety of foods, activities, and conditions. The main triggers are alcohol, eyestrain, dry eyes, fatigue, smoking, a cold or flu, caffeine, a sinus infection, poor posture, and emotional stress [5].

Symptoms of Tension-type Headache: The symptoms include pressure around the forehead, dull head pain, and tenderness around the forehead and scalp. The pain is usually moderate, but it can sometimes be intense, depending on the duration of the headache (Deborah Weatherspoon, Healthline, 5 February 1982).

1.1.3 Migraine

A migraine is a headache that typically affects one side of the head. It can last from hours to days, cause serious problems, and interfere with the daily routine. It is therefore necessary to know its symptoms and how it can be treated [6, 7].

Causes: Drinks are a main trigger of migraine, especially wine and excessive caffeine. Stress at work or at home can also cause it. Too much or too little sleep is harmful and can bring on a migraine, as can a change in the weather.

Symptoms: The symptoms of migraine are classified into four stages: prodrome, aura, attack, and postdrome (Mayo Clinic Staff, Mayo Clinic, 31 May 2019).

1.1.4 Anxiety

The American Psychological Association (APA) describes anxiety as an emotion characterized by feelings of tension, worried thoughts, and physical changes such as an increased heart rate. Anxiety is a normal and often healthy emotion. However, when a person regularly feels disproportionate levels of anxiety, it may become a medical disorder. Anxiety disorders form a category of mental health diagnoses that lead to excessive nervousness, fear, apprehension, and worry [8, 9].

1.1.5 Hypertension

Hypertension is commonly called high blood pressure. Blood pressure is the force exerted by a person's blood against the walls of the blood vessels. Almost half of adults in the United States have high blood pressure, and many are unaware of it. Keeping blood pressure under control is very important (as per Fig. 1). Prevention matters just as much. The first step is a change of lifestyle, together with regular physical exercise. People use specific medications to treat hypertension, and doctors recommend low doses because these medications can have side effects. Following a heart-healthy diet can also help control hypertension [10].

Fig. 1 Measurement of blood pressure using a sphygmomanometer [10]

1.1.6 Stress

Stress is a condition in which an individual thinks too much about a particular problem. It is seen mainly in young people, because the teenage years bring many ups and downs and many things they have never encountered before. Lacking proper knowledge about stress, they become its victims. It is therefore very important to understand stress and the various methods of overcoming it. The main causes of stress are demands related to finance, work, relationships, and other situations. Stress does not have only disadvantages; it has some advantages as well. Stress prepares the body for competition: when we are under stress in a competitive situation, we work thoroughly and practice until we are well prepared for it [11–13]. For someone who is overstressed, the best way to overcome stress is Yoga and Pranayama. Yoga relaxes and calms the body, and Pranayama makes us physically and mentally active [14].

1.1.7 Obesity

Obesity is a disease that occurs when there is excess fat in the body. If we take in more calories than the body can burn, the surplus is stored as fat, resulting in obesity. Obesity is a long-term disease and causes problems such as high blood pressure and diabetes (as per Fig. 2). It can also be assessed by evaluating how fat is distributed in the body, which indicates the risk of obesity-related health problems. In the first type, fat is distributed around the waist; in the second type, it is distributed on the hips and thighs. Fat around the waist carries more risk than fat on the hips and thighs. The main causes of obesity are overeating, sedentary habits, and a lack of regular physical exercise [15].


Fig. 2 Fat distributed around the waist [10]

1.2 Machine Vision

Machine vision plays a vital role in healthcare. Implementing AI and related technologies in hospitals has reportedly reduced patient deaths substantially. With computer vision, doctors can understand a disease more accurately and diagnose patients more easily [16].

1.2.1 Medical Images

Medical imaging plays a vital role in patient healthcare. It aids in disease prevention, early detection, diagnosis, and treatment. It has become essential for virtually all major medical conditions and diseases [17].

1.2.2 Artificial Intelligence (AI), Machine Learning, and Fog Computing in Healthcare

Pranic energy of items can be measured with Fog Computing, AI, and ML, which can also measure the energy aura of different gadgets, objects, and ritual activities. Many of the scientific kits available are far from suitable for Yagyopathy; they suit only daily Balivaishwa or short Agnihotra, because they can hold only 10–12 g of havan samagri for giving ahuties. For Bheshaj Yagya, quantity matters; at least 80–100 g is recommended for one session [18, 19]. Sometimes the experimenter and subjects become overexcited and spread the message without going into details. One should study the subject in detail before speaking about it and spreading the knowledge; otherwise it is better to remain quiet than to pass on wrong or partial information, even unknowingly, through lack of knowledge. For this reason, all the steps and contents of the different experiments and their results are presented here. Short Agnihotra can be done daily with common havan samagri, and pranic energy would still be generated to benefit the patient, but it would not generate the amount of vapor required to free the atmosphere of pathogens. AI, ML, data analysis, and the latest Fog Computing concepts can be used efficiently to collect, analyze, and store data in different applications. Many health-related apps and software packages that use these methods are becoming popular [20–26].

1.3 Yajna Science and Cure for Different Diseases

It is not the biomass alone that yields results; the entire SoP of Yagya, combined with mantras and an Aavahan, brings them about. Research has shown that Yagya fumes were effective in reducing harmful pathogenic microbes, whereas burning only mango wood increased their numbers. Scientists at NBRI, Lucknow reported a similar finding, and their research has already been published.

Arani appears to be a special type of hardwood that catches fire easily when friction is suitably applied. Trees of the Himalayas, from where we receive our Vedic initiation, are rich in phenols, which can make their wood more easily flammable. Devdaru, Chinar, and Cheed are some of these high-altitude trees. One detailed account holds that only banyan wood is useful, so the range of alternatives appears more complicated. This type of Agni should be used for Yagya; after poojan, it becomes divyagni. It remains to be scaled and compared with the regular Agni invocation process. The author team has also found that the agni generated this way is more powerful. A comparative index of quality and process also exists but is not reproduced here, to avoid deviating from the core content. In southern India, the wood is called Arani kattai. Our forefathers lit the yagyagni by rubbing the samidhas together while chanting the agni suktam.

For epilepsy, Surya Gayatri or Chandra Gayatri Mantra can be recommended. An updated list of samagri and havan mantras is also needed, and the pairing of each illness with its mantra must be clearly understood. What if someone suffers from heart disease, high BP, sugar, and obesity at the same time; which problem should be addressed first? In the case of asthma and arthritis, medicine can be given for both. Surya Gayatri Mantra is recommended for physical problems, Chandra Gayatri Mantra for mental problems, and Saraswati Gayatri Mantra for retardation. If someone suffers from multiple problems, the samagri for each is used with Surya Gayatri Mantra, one by one, e.g., first heart, then BP, and then diabetes [27, 28].

1.3.1 Fumigating Substances Used in Yagya

Various chemical changes take place during Yagya. To understand them, it is important to know the substances offered, which are described below.

Wood: The wood is cut into pieces of varying lengths, called "samidhas", according to the size of the altar or "agnikunda". Various types of wood are used, namely Sandalwood, Agar and Tagar wood, Deodar, Mango, Dhak or Palash, Bilva, Pipal, and so on.

In addition to wood, various Havishya or Havan samagri are offered in yagya. They fall into the following four groups.

(A) Odoriferous Substances: saffron, musk, agar, tagar, chandan, illaychi, jayphal, javitri, and camphor.
(B) Substances with Healthy Constituents: clarified butter, milk, fruits, and cereals such as wheat, rice, barley, til, kangu, munga, chana, arhar, and masur or peas [25, 29, 30].
(C) Sweet Substances: usually sugar, dried grapes, honey, or chhuhara.
(D) Medicinal Herbs: for specific requirements, medicinal herbs such as Somalata or Giloya, Shankhpushpi, Nagkesar, Baheda, Mulhati, Red Chandan, Harad, and so on.

1.3.2 Benefits of Yagya

Most people regard yagya or havan only as a dharmik karmkand and therefore not their cup of tea, whereas it has a lot to offer. If Yagya is to be promoted, we must think like the common person. It can be presented in the following terms:

1. Yagya is like pest control: it protects against invisible pests in and around the home for 15–30 days, so treat it as such and repeat it at regular intervals.
2. Yagya is for virus or bacterial control: it protects the family from these attacks and builds immunity. Healing becomes faster if a viral or bacterial infection has already set in.
3. Yagya is like a health insurance policy: pay the premium regularly to keep the policy alive and get the benefits.
4. Yagya is like the fumigation done by MCD, Delhi, India in a colony. Everyone should do collective yagya in societies, colonies, muhallas, or villages to ward off viral and bacterial attacks.
5. One simply has to inhale for 10–15 min, and it can save one from fat medical bills.
6. It need not involve lengthy karmkand/rituals and pravachan, because people are afraid of sitting for 2–3 h.
7. It can also be treated as aroma therapy.

Corona virus has provided a great opportunity to propagate this vidha, and people can be made aware of it not only as a dharmik karmkand but also for its physical, mental, and psychological benefits. Grahe Grahe yagya can also be promoted on this pattern [9, 31, 32].

1.4 Mantra Science

Mantra science works on the principle of sequences of sounds. The word mantra itself means "revealed sound." Mantra science is ancient, about 5,000 years old, and was once practiced in various parts of the world. Its main principle is that the power lies not in the words themselves but in the vibrations created when those words or mantras are spoken. Mantra science helps the individual to unleash the real power, knowledge, and forces within. It provides coordination between the individual and the depth of his inner being. Each mantra has its unique meaning, symbol, and importance [33].

1.4.1 Important Points of Mantra Science

Point-1: Mantra science should never be misunderstood as emphasizing a particular God or religion, and it should not become a cause of conflict among religions. Many people think that they should not repeat Om Namay Shivay because the mantra does not belong to their religion. This is not so; Mantra science does not promote any particular religion.

Point-2: A mantra cannot be translated, because translation alters the sound. If you translate a mantra or change its word order, it ceases to be a mantra. The translated words may make a beautiful prayer, but not a mantra that purifies the soul [34].

1.4.2 Types of Mantras

There are thousands of mantras, and every mantra has its own meaning and principle. Some common ones are discussed in the following subsections. Regarding the study of the Mahamrityunjaya Mantra, Dr. Ajay said that he approached the Sanskrit Vidyapeeth in the Qutub Institutional Area, and it was included in the study. Under the study, the patient was first treated in the hospital and then sent to the Sanskrit Vidyapeeth, where the Mahamrityunjaya Mantra was used in an organized way. He described how much benefit the mantra had for those patients when assessed against the other group [14, 35].

1.4.3 Science and Effects of Mahamrityunjaya Mantra

People have used the Mahamrityunjaya Mantra as a life-saving mantra for thousands of years, but until now this has been only a popular belief. Studies are now being done to test this faith in the mantra scientifically. Reciting the mantra to head injury patients has been tried for the first time in the country at Ram Manohar Lohia Hospital, and it is reported to be showing good signs. The research is in its final stage. Dr. Ajay Choudhary, a neurosurgeon at Ram Manohar Lohia Hospital, and his team are conducting it. He noted that periodic fasting has been practiced in our country for thousands of years. Devotees fast on Chaturdashi and Ekadashi, but no study of it has been done in the country. The Japanese doctor who received the Nobel Prize for Medicine in 2016 studied only periodic fasting [4]. According to that study, diseased cells are eliminated in those who fast periodically, and cancer cells in particular die. Yet there has been no movement on this in our country and no investigation. Dr. Choudhary said that, in the same way, people consider the Mahamrityunjaya Mantra life saving. This is a belief, and there has been no scientific study, so it now needs to be tested. The study is being done to establish the scientific facts about the Mahamrityunjaya Mantra. He said that it has been funded by the Indian Council of Medical Research (ICMR) and is ongoing [7]. Dr. Choudhary explained that it is a three-year study in its final stage. Forty head injury patients were studied, divided into two groups of 20 each. Patients in both groups were treated according to the protocol for head injury, but one group was also given the Mahamrityunjaya Mantra. This was done while the patients were recovering outside the ICU.

1.4.4 Beliefs in Mantra Science

In Mantra Science, it is believed that a child should receive his mantra at the age of eight. The child should practice this mantra at sunrise and at sunset, together with breath awareness, for at least 5 min each time. A second mantra is given at the time of marriage or when great changes are taking place in life. A third mantra is given when the person becomes spiritual. Chakra Beej Mantras are given in detail in Table 1 [10, 36–38].

1.4.5 Scientific Use of Mantras in Awakening the Soul

Some scientists say that the "soul" does not die but "returns to the universe." According to two leading scientists, the human brain is a biological computer, and human consciousness is a software program run by the "bio quantum computer" inside the brain. Furthermore, they hold that it continues to exist even after death. These researchers say that after people die the soul returns to the cosmos; it does not die [15].


Table 1 Beej Mantra and related chakra, planets, lords, and goddesses in Indian Philosophy

Chakra | Beej Mantra | Associated planet | Presiding lord | Goddess
Muladhara | LAM | Shani and Uranus but elements of Sun | Ganesha with Siddhi and Buddhi | Dakini
Svadhisthana | VAM | Guru/Jupiter and Neptune but elements of Moon and Venus | Brahma in child form with Savitri | Rakini
Manipura | RAM | Mangal/Mars and Pluto, with elements of Jupiter | Vishnu | Lakini
Anahata | YAM | Venus but elements of Sun and Mercury | Mahadeva in Rudra form with Uma | Kakini
Vissudhi | HAM | Mercury but elements of Jupiter | Shiva and Adyasakti | Shakini
Ajna | OM | Moon mainly but elements of Sun | |
Sahashara | OM | All Unite | |

Manipura: Garbha and Kundalini Chakras are associated.
Ajna and Sahashara, last two columns: Parannath and Sakti (presiding lord); Param Siva and Sakti, Hamsa Devata, and Sushmanasakt (goddess).

The origin of consciousness reflects our place in the universe, the nature of our existence. "Did consciousness evolve from complex computation among brain neurons, as most scientists assert? Or has consciousness, in some sense, been here all along, as spiritual approaches maintain?" ask Hameroff and Penrose in the current review. "This opens a potential Pandora's Box, but our theory accommodates both these views, suggesting consciousness derives from quantum vibrations in microtubules, protein polymers inside brain neurons, which both govern neuronal and synaptic function, and connect brain processes to self-organizing processes in the fine scale, 'proto-conscious' quantum structure of reality" [39].

Lead author Stuart Hameroff concludes, "Orch OR is the most rigorous, comprehensive and successfully tested theory of consciousness ever put forth. From a practical standpoint, treating brain microtubule vibrations could benefit a host of mental, neurological, and cognitive conditions" [40].

1.5 Effects of Yajna and Mantra on Human Health

Yajna and havan should be performed carefully, in accordance with the rules laid down in the sacred texts. The slightest deviation can have harmful effects. The mantras recited during havan and yajna are powerful chants in praise of the divine beings who govern our life, welfare, and possessions. They are held to ensure prosperity, long life, and spiritual flourishing. The havan is therefore a blessing and a help [41]. The ancient Rishis were not blind devotees of ritual. There was great significance in every custom that they wove, step by step, into daily life. Physical well-being, mental discipline, growth, and purification of the heart are only some of the benefits. It is believed that whatever desire a person holds while performing the havan is fulfilled. Actions and their results fade away and come to nothing; but when actions are performed with atonement for wrongdoing and with pious feelings, they become complete and fulfill the expectations of the Lord [42].

1.6 Role of Technology in Addressing the Problem of Integration of Healthcare System

Information and communication technologies offer the opportunity for vast improvement in healthcare. Technology-based therapeutic and care coordination systems that embrace web, mobile, sensing, computing, and bioinformatics technologies hold great promise for enabling entirely new models of healthcare, both within and outside formal systems of care, and offer the opportunity to have a large public health impact. We suggest that technology can play a key role in all three of these developments, as well as others.

First, technology affords a new model for empowering consumers to take a central role in choosing and defining the course of their own healthcare. A wide array of tools exists to monitor and gather real-time data about individuals' behavioral and physiological states (e.g., via user input to mobile applications, wearable sensors, and wireless sensors). There has also been an explosion of research and development leading to self-directed tools that provide on-demand informational or therapeutic support anytime and anywhere, helping individuals manage their own health behavior. These tools may also give individuals the option of engaging a broad support network in managing their own healthcare (e.g., by sharing their health behavior data with family and friends to encourage and support them, or by participating in virtual support communities). Further, decision support tools are increasingly being developed to help individuals better understand, access, and make decisions about treatment [43].

1.7 Impact of Yagya in Reducing the Atmospheric Pollution

Today, the air we breathe is loaded with harmful pollutants such as NO2, CO, SPM, and RSPM, all of which exceed the limits prescribed by the Government and are extremely hazardous to human health. New kinds of microorganisms and infections are also emerging, which cause new diseases and are resistant to older medicines. The toll keeps rising in spite of all efforts. City waste is being dumped into the rivers, causing severe water pollution. The indiscriminate use of pesticides and synthetic chemical fertilizers has poisoned underground water reserves and has also resulted in loss of soil fertility. To top it all, public apathy toward these problems has made matters worse. The Government is spending crores of rupees on this problem, yet very few results are seen.

Initially, several kinds of wood were burned to observe the CO emissions from each of them. The emissions were recorded using automated equipment. It was found that mango wood gave practically zero CO emission. Accordingly, mango wood was taken as the primary Samidha for the experiment. The special Havan Samigri recommended by BrahmaVarchas for purification of the atmosphere was used alongside the standard havan samigri. As far as possible, pure cow’s ghee was used for the Havan.

2 Literature Survey

Epilepsy is a neuropsychiatric disorder with high prevalence among children and young adults. In India, around ten million people suffer from epilepsy, with a prevalence of about 1.9% in rural areas and 0.6% in urban areas. The greater prevalence of epilepsy in rural areas reflects the effect of the stigma surrounding this disease on the level of treatment that Indians receive. In ancient times, it was regarded as a sacred disease, and various superstitious measures were taken to prevent or cure it. The Yajurveda advocates performing Havan regularly, morning and evening, to attain spiritual enlightenment, mental peace, and purification of the mind and environment [42]. Since ancient times, smoke from the burning of different parts of medicinal plants has been used for curing diseases and disorders. The significance of the ethnopharmacological aspects of medicinal smoke reveals the role of fire as a driving force in development.

The yagya fumes have two effects: they boost a person's immune system, and they reduce the number of microbes in the atmosphere (Nautiyal CS, Chauhan PS, Nene YL. Medicinal smoke reduces airborne bacteria. J Ethnopharmacol. 2007; 114(3): 446–51). These authors observed that burning wood and medicinal herbs for 1 h reduced the airborne bacterial count in the room by 94% within 60 min, and that the reduction was maintained for up to 24 h in the closed room, keeping the environment cleaner. Thus, chanting mantras during yagya has an effect similar to vaccination. We have observed that people usually fall sick when the festivals are nearing, but our festivals such as Holi and Durga Puja provide a kind of silent treatment for disease.

According to Shriram Sharma Acharya, there are two energy systems in this world: heat and sound. Both are produced in performing Yagya; heat from the burning of substances and sound from the chanting of mantras are combined to achieve the desired physical, psychological, and spiritual benefits. The authors conducted an extensive study of ancient literature and modern references to gain adequate insight into the working principle of agnihotra yajna. To perform this yajna, the performer offers certain natural substances (cow dung, rice, milk, clove, camphor, and ghee) into a fire lit in an inverted-pyramid-shaped copper vessel with a flat base. Many people in Indian society do not know about yagya science and its benefits, and people need to be made aware of the ayurvedic approach to treating disease. A benefit of yagya is that it has no side effects on the body and has a good impact on both the body and nature. The substances used in Yajna, their proportions, the temperature attained, and the controlled supply of air together determine the products formed, which are a boon for the atmosphere. Mantra chanting during the Yagya produces vibrations which soothe the human mind and all plant and animal life. These vibrations also help in spreading specific energy waves in the surrounding atmosphere as the oblations are offered [44].

3 Methodology

All activities of the entire universe revolve around the cycle of Yajnas. The great Rishis of ancient times declare, “Ayam Yajno Vishwasya Bhuvanasya Naabhihi” (Atharva Veda 9/15/14), meaning that Yajnas are the focal point of this universe. Lord Krishna, the great author of the Bhagwad Geeta, likewise says, “Sahayajnaahaaprajaahaapurovaachaprajapatihi. Anainaprasavishvadhwameshavostvishtakaamadhuka.” (3/10). This means that at the beginning of this Kalpa (creation), Brahma, after creating Yajnas, living beings, and so on, urged them to prosper by means of Yajnas, because these Yajnas would fulfill their material and spiritual needs. Yajnas are an invaluable guide given to mankind by the Vedic Rishis of ancient India, a foundation stone of material and spiritual fulfillment and of maintaining a healthy ecosystem (especially today, when world leaders are worried about environmental pollution and hazardous atmospheric conditions).

Every statement in the nectar-like writings of HH Gurudeva Shriram Sharma Acharya illuminates the great foundation stones of Divine Culture, viz., Yajnas and the Science of Super Mantra Gayatri. He explains the extraordinary results of the pure energy emitted through chanting of the Gayatri Mantra, whereby the vital energy of the Gayatri devotee becomes progressively more intense. Our revered seer uprooted (symbolically and not literally) the misconception of Yajnas that was based on mythological (Puranas) descriptions and instead reestablished Super Mantra Gayatri Yajnas based on Vedic injunctions. Such Gayatri Yajnas were “enthusiastically” embraced by a large number of devotees, since their resulting benefits were great indeed. We can emphatically say that this is a revolutionary endeavor of this era. It is even greater in stature when compared with the turmoil of the period after Yogiraj Gorakhnath, in which thousands had to be stopped from misusing Tantra spiritual practices, including Tantra-based Yajnas. (During Gorakhnathji’s lifetime, Tantra was used most appropriately; it was only after the great sage left his mortal body that widespread misuse of Tantra by fools set in.) Today, thanks to our revered Yuga Rishi Pt. Sri Ram Sharma Acharya ji, it is heartening to see that a great many devout families accept it and that Gayatri Yajnas are performed regularly, so that a pure, sacred environment is established for a bright world future in the 21st century.

4 Result and Discussion

The analysis covers FBS and the other sampled parameters. To analyze the effect of yagyopathy in controlling diabetes, two groups were created at different locations. Consent was obtained from the individuals for their participation, and they agreed to follow the instructions during the experiment period. The first group had two patients and the second group had four. All are reported to have Type-II diabetes. Type-II diabetes is mainly caused by a lifestyle in which insulin formation in the body slowly starts to decrease, resulting in an increase of the blood sugar level (as per Fig. 3). To monitor the blood sugar level of patients, the following three parameters are checked regularly in blood samples:

1. Glycated Hemoglobin (HbA1c)
2. Postprandial Blood Sugar (PPBS)
3. Fasting Blood Sugar (FBS)

Apart from this, body weight is also measured, as obesity is one of the major causes of many types of illness, one of which is diabetes (as per Fig. 4 and Table 2).

Weight Variation

All patients are either on the verge of obesity or have reported obesity. An overall decreasing trend in weight is observed. The weight reduction rate is higher in the first month of treatment and slows down a bit in the subsequent months (as per Fig. 5). At the individual level, a similar trend is observed. In some cases, a very small increase in weight is observed after the first month of treatment (as per Fig. 6).

Fig. 3 The details of subjects in different diabetes groups

Fig. 4 The details of subjects’ initial parameters

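Table 2 Different group diabetes parameters and weights

| Group | Patient ID | Wt. (kg) | FBS (mg/dL) | HbA1c (%) | PPBS (mg/dL) |
|-------|------------|----------|-------------|-----------|--------------|
| GRP-1 | PT-1       | 74.1     | 173         | 8.3       | 222          |
| GRP-1 | PT-2       | 59.2     | 103         | 6.3       | 162          |
| GRP-2 | PT-3       | 88.3     | 141         | 6.5       | 157          |
| GRP-2 | PT-4       | 72       | 104         | 5.8       | 115          |
| GRP-2 | PT-5       | 62.4     | 130.9       | 6.2       | 154          |
| GRP-2 | PT-6       | 73       | 142         | 6.4       | 142          |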

To further understand the significance of the decrease in the first month and the subsequent increase in the next period, the percentage variation is checked, and the increase is found to be less than 1% in every case. Such a negligible change cannot be considered significant, as it may be due to a difference in clothing between the two weight measurements (as per Fig. 7 and Table 3).
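
The percentage check described here can be reproduced from the published numbers. The following is a minimal sketch, not the authors' own procedure: it takes the baseline weights from Table 2 and the per-patient percentages from Table 3, and assumes (this is not stated in the text) that each percentage is relative to the previous lab visit and that the overall figure is the percentage difference between the average weights of two visits. All function and variable names are illustrative.

```python
# Baseline weights (kg) from Table 2 and per-patient % changes from Table 3.
baseline_kg = {"PT-1": 74.1, "PT-2": 59.2, "PT-3": 88.3,
               "PT-4": 72.0, "PT-5": 62.4, "PT-6": 73.0}
pct_jan = {"PT-1": -1.75, "PT-2": -6.08, "PT-3": -1.81,
           "PT-4": -0.56, "PT-5": -4.33, "PT-6": -1.51}
pct_mar = {"PT-1": -3.85, "PT-2": 0.18, "PT-3": 0.12,
           "PT-4": -2.23, "PT-5": 0.50, "PT-6": 0.83}


def apply_pct(weights, pct):
    """Return the weights after applying a per-patient percentage change."""
    return {p: w * (1 + pct[p] / 100) for p, w in weights.items()}


def pct_diff_in_mean(before, after):
    """Percentage difference between the average weights of two visits."""
    mean_before = sum(before.values()) / len(before)
    mean_after = sum(after.values()) / len(after)
    return 100 * (mean_after - mean_before) / mean_before


# Assumption: each Table 3 percentage is relative to the previous visit,
# so the January and March weights are reconstructed, not measured values.
jan_kg = apply_pct(baseline_kg, pct_jan)
mar_kg = apply_pct(jan_kg, pct_mar)

overall_jan = pct_diff_in_mean(baseline_kg, jan_kg)
overall_mar = pct_diff_in_mean(jan_kg, mar_kg)
largest_gain = max(pct_mar.values())

print(f"Dec -> Jan change in average weight: {overall_jan:.2f}%")
print(f"Jan -> Mar change in average weight: {overall_mar:.2f}%")
print(f"Largest individual gain, Jan -> Mar: {largest_gain:.2f}%")
```

Under these assumptions the two overall figures come out close to the −2.49% and −0.79% reported in Table 3, and the largest individual gain stays below 1%, which is the basis for treating the increase as negligible.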

Fig. 5 The average weight reduction of subjects during experiments

Fig. 6 The average weight reduction of each subject on monthly basis

5 Analysis of Fasting Blood Sugar Parameter (FBS)

The normal FBS range is below 100 mg/dL; 100–125 is considered the pre-diabetic range, and values beyond this indicate diabetes. Thus, two-thirds of the patients in the clinical trial of yagyopathy have diabetes, and one-third are in the pre-diabetic phase. A rapid decrease in FBS is observed in the first month of treatment; an overall increasing FBS trend is then observed for the patients in the subsequent months.
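
The two-thirds and one-third split follows directly from the baseline FBS values in Table 2. As a small illustrative sketch (the function name `classify_fbs` and the handling of values between 125 and 126 are assumptions, since the text only gives the ranges "less than 100", "100–125", and "beyond this"):

```python
from collections import Counter

# Baseline FBS values (mg/dL) as listed in Table 2.
baseline_fbs = {"PT-1": 173, "PT-2": 103, "PT-3": 141,
                "PT-4": 104, "PT-5": 130.9, "PT-6": 142}


def classify_fbs(fbs):
    """Map an FBS reading to the ranges quoted in the text."""
    if fbs < 100:
        return "normal"
    if fbs <= 125:
        return "prediabetic"
    return "diabetic"  # assumption: anything above 125 counts as diabetic


labels = {patient: classify_fbs(value) for patient, value in baseline_fbs.items()}
counts = Counter(labels.values())

for patient, label in labels.items():
    print(patient, label)
print(dict(counts))
```

With the Table 2 values, four of the six patients fall in the diabetic range and two in the pre-diabetic range, matching the proportions stated above.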

Fig. 7 The average percentage weight reduction of each subject during experiments

Table 3 The weight reduction of subjects measured during their lab visits

| Date of lab test    | Group | Patient ID | % Difference in Avg. Wt. in Kgs |
|---------------------|-------|------------|---------------------------------|
| 03-Dec-18           | GRP-1 | PT-1       |                                 |
| 03-Dec-18           | GRP-1 | PT-2       |                                 |
| 03-Dec-18           | GRP-2 | PT-3       |                                 |
| 03-Dec-18           | GRP-2 | PT-4       |                                 |
| 03-Dec-18           | GRP-2 | PT-5       |                                 |
| 03-Dec-18           | GRP-2 | PT-6       |                                 |
| 02-Jan-19           | GRP-1 | PT-1       | −1.75%                          |
| 02-Jan-19           | GRP-1 | PT-2       | −6.08%                          |
| 02-Jan-19           | GRP-2 | PT-3       | −1.81%                          |
| 02-Jan-19           | GRP-2 | PT-4       | −0.56%                          |
| 02-Jan-19           | GRP-2 | PT-5       | −4.33%                          |
| 02-Jan-19           | GRP-2 | PT-6       | −1.51%                          |
| 02-Jan-19 (overall) |       |            | −2.49%                          |
| 05-Mar-19           | GRP-1 | PT-1       | −3.85%                          |
| 05-Mar-19           | GRP-1 | PT-2       | 0.18%                           |
| 05-Mar-19           | GRP-2 | PT-3       | 0.12%                           |
| 05-Mar-19           | GRP-2 | PT-4       | −2.23%                          |
| 05-Mar-19           | GRP-2 | PT-5       | 0.50%                           |
| 05-Mar-19           | GRP-2 | PT-6       | 0.83%                           |
| 05-Mar-19 (overall) |       |            | −0.79%                          |

Fig. 8 The analysis of fasting blood sugar

A very positive impact of Yagyopathy is seen in reducing the FBS level from above the diabetic range to well below it. At the group level, the Group-2 FBS level decreases continuously, but because the FBS level in Group-1 increases during the Jan–Mar period, the overall trend appears to be increasing. This indicates an outlier in the data, i.e., some patient in Group-1 whose FBS increased rapidly, thereby affecting the whole group average. To examine this, the study is done at the individual level (as per Fig. 8). The graph clearly shows that all patients except one come within the pre-diabetic range in the first month of treatment itself. The level of the remaining patient also falls and comes within the pre-diabetic range in subsequent months (as per Fig. 9).

The graph also clearly shows that Patient PT-1 in Group-1 exhibits a sudden increase. So, to draw a general conclusion about the experiment, the FBS analysis was repeated after removing PT-1's data. The overall trend is then clearly decreasing (the gray line in the graph shows the overall trend). Still, in some cases FBS appears to increase during the Jan–Mar period. To study this variation, the data is split into two parts: one for patients whose FBS increased in this period and another for those whose FBS decreased (as per Fig. 10).

For the increasing group, the average increase over the period is only 5.25 points, which includes an 11-point increase for Patient PT-1. After removing PT-1 from the data, the average increase is only 3.33 points (as per Fig. 11). For the decreasing group, the result was very positive for the clinical trials: an average decrease of 10 points per period is observed, with no exception.

Fig. 9 The analysis and trend of subjects’ health on FBS parameters

Fig. 10 The analysis and trend of subjects’ health on FBS parameters after removing exceptions
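
The averages quoted for the increasing group can be checked for internal consistency. The text does not state how many patients are in that group, so the sketch below treats the group size as an unknown and searches for the size at which removing an 11-point value from a mean of 5.25 gives a mean of 3.33. The helper name `mean_without` and the search range are illustrative assumptions.

```python
reported_mean_all = 5.25    # average FBS increase for the increasing group
outlier_increase = 11.0     # increase reported for Patient PT-1
reported_mean_rest = 3.33   # average after removing PT-1


def mean_without(mean_all, n, removed_value):
    """Mean of the remaining n - 1 values after removing one value."""
    return (mean_all * n - removed_value) / (n - 1)


# Assumption: the group has between 2 and 6 members (six patients in total).
candidates = {n: mean_without(reported_mean_all, n, outlier_increase)
              for n in range(2, 7)}

# Pick the group size whose recomputed mean is closest to the reported one.
best_n = min(candidates, key=lambda n: abs(candidates[n] - reported_mean_rest))

for n, value in candidates.items():
    print(f"group size {n}: mean without PT-1 = {value:.2f}")
print("most consistent group size:", best_n)
```

The figures are mutually consistent if the increasing group contains four patients: (4 × 5.25 − 11) / 3 ≈ 3.33. This is an inference from the reported averages, not a number given in the chapter.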

Fig. 11 The analysis and trend of subjects’ health on FBS parameters for increasing groups

Result: In summary, the impact of yagyopathy on FBS control is very positive and rapid: within one month, almost all patients come well below the diabetic range (as per Figs. 12 and 13).

Observations on the obtained results

We also observed that the parameters stop decreasing at a significant rate, or even increase, in the Jan–Mar period, although a significant decrease in parameter values is observed in the first month.

Fig. 12 The analysis and trend of subjects’ health on FBS parameters for increasing groups after removing exception

Fig. 13 The analysis and trend of subjects’ health on FBS parameters for decreasing groups

This may be normal behavior of the parameters: after a period of decrease they enter an idle/inert period and then show improvement again. Alternatively, it may be due to one of the possible reasons discussed below. We are also interested in the trend generally observed in FBS, PPBS, and HbA1c, for example whether, in the early phase of treatment, a decrease in one parameter is accompanied by an increase in another. In the case of the liver, for instance, we know that if serum bilirubin goes down, a temporary increase in SGOT and SGPT can be observed for that period. So, we want to know whether any such pattern exists among these three parameters.

1. Parameters decrease in the first month (as per Fig. 14).
2. No significant improvement is observed in subsequent months.

This runs against motivation theories: a rapid decrease in the parameters in the first month should motivate patients, so that they place more trust in the trials and follow the instructions more rigorously. Here, either the reward seems to be negative and they start taking the treatment casually, or, since the gap before the next observation is two months, the gap in feedback leads to loss of interest. Since the patients belong to Gayatripariwar, we do not think that the gap, a negative reward, or a casual attitude is the cause. They must all be motivated, as they have practiced yagya for many years and were seeing scientific evidence for it during the trials; so there should be a positive impact, and the parameters should decrease continuously (perhaps at a reduced rate).

What we suspect is that, in India, January and February are the period of marriage ceremonies and the Holi festival, so after seeing the decreased values in December, the patients might have started attending parties; we consider this a genuine possibility. Or it may be quite possible that, since December is a period of cold winter in India and the treatment works best in cold winter, effectiveness decreases as the temperature starts rising in February–March. Or the subjects may have been taking their allopathic medicine in the first month and, on seeing the positive progress, stopped taking it or took it only when needed. If that is the case, it is really very positive, and so it has been highlighted in our study. Another possible factor is that the subjects’ bodies were not accustomed to such treatment: when the body first receives it, it heals rapidly, but as the treatment becomes routine, the body becomes habituated and does not respond at the same pace in subsequent months. In that case, we have to think of some increment plan, i.e., a time-based kwath dose change, a yagya ahuti increment, etc.

Fig. 14 The overall analysis dashboard

6 Novelty in Our Work

Yagya is a breathing-based treatment for many diseases that requires no physical contact. The burning of various oils releases fragrance into the surroundings. Yagya purifies the environment by controlling pollution. It has no side effects and reduces the bacterial count in the environment. Yagya also activates the hormones of the body.

7 Recommendations

Yajna is an ancient practice with various beneficial effects for the body. It purifies the house, is helpful in the treatment of diabetes, and is highly helpful in getting relief from drugs.

8 Future Scope and Possible Applications

Yajna has various applications, such as the curing of disease. It also helps to increase a person's immunity. Different diseases such as diabetes, stress, hypertension, and many more can easily be cured through yagya. Drug addiction can be reduced by following the treatment timetable regularly for three months.

9 Limitations

While performing the experiments, my team and I faced difficulties such as the limited number of subjects. Today, people are not interested in slow treatments; they want a quick response. Also, the subject must follow the treatment regularly for three months and must have the will to get fit. Few people are experienced in the experiment, so the number of experiments performed on subjects is small. Since the data is very limited, the trend model was non-significant. So, we have not done lighter side of trend analysis. Even P value was not