Advanced Computing and Systems for Security: Volume Eleven [1st ed.] 9789811557460, 9789811557477

This book features extended versions of selected papers that were presented and discussed at the 7th International Docto


English Pages IX, 143 [148] Year 2021


Table of contents :
Front Matter ....Pages i-ix
Front Matter ....Pages 1-1
A New Graph Polynomial and Generalized Tutte–Grothendieck Invariant from Quantum Circuits (Chaowen Guan, Kenneth W. Regan)....Pages 3-16
Cryptanalysis of a Centralized Location-Sharing Scheme for Mobile Online Social Networks (Munmun Bhattacharya, Sandip Roy, Soumya Banerjee, Samiran Chattopadhyay)....Pages 17-30
Bounded Version Vectors Using Mazurkiewicz Traces (Madhavan Mukund, Gautham Shenoy R., S. P. Suresh)....Pages 31-42
Intents Analysis of Android Apps for Confidentiality Leakage Detection (Rocco Salvia, Agostino Cortesi, Pietro Ferrara, Fausto Spoto)....Pages 43-65
Fingerprint and Keystroke Dynamics Fusion in Multimodal Biometrics System (Maciej Szymkowski, Patryk Milewski, Khalid Saeed)....Pages 67-82
Front Matter ....Pages 83-83
QoS Enhancement in WBAN with Twin Coordinators (Sriyanjana Adhikary, Biswajit Ghosh, Sankhayan Choudhury)....Pages 85-97
A Distributed Power Control Scheme for Device-to-Device Communication in Cellular Networks (Udit Narayana Kar, Debarshi Kumar Sanyal, Monideepa Roy)....Pages 99-107
An Application of Block Chain in Examination System, A Case Study (Ashis Kumar Samanta, Bidyut Biman Sarkar)....Pages 109-124
A Study on Energy-Efficient Routing Protocols for Wireless Sensor Networks (Soumyabrata Saha, Rituparna Chaki)....Pages 125-143


Advances in Intelligent Systems and Computing 1178

Rituparna Chaki Agostino Cortesi Khalid Saeed Nabendu Chaki   Editors

Advanced Computing and Systems for Security Volume Eleven

Advances in Intelligent Systems and Computing Volume 1178

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Győr, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong

The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. ** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink **

More information about this series at http://www.springer.com/series/11156


Editors
Rituparna Chaki, A.K. Choudhury School of Information Technology, University of Calcutta, Kolkata, West Bengal, India
Agostino Cortesi, DAIS, Ca' Foscari University, Venezia, Italy
Khalid Saeed, Faculty of Computer Science, Bialystok University of Technology, Bialystok, Poland
Nabendu Chaki, Department of Computer Science and Engineering, University of Calcutta, Kolkata, West Bengal, India

ISSN 2194-5357 ISSN 2194-5365 (electronic) Advances in Intelligent Systems and Computing ISBN 978-981-15-5746-0 ISBN 978-981-15-5747-7 (eBook) https://doi.org/10.1007/978-981-15-5747-7 © Springer Nature Singapore Pte Ltd. 2021 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

The seventh International Doctoral Symposium on Applied Computation and Security Systems (ACSS 2020) was held in Kolkata, India, during February 28–29, 2020. The University of Calcutta, in collaboration with Ca' Foscari University of Venice, Italy, and Bialystok University of Technology, Poland, organized the symposium. This unique symposium is aimed specifically at helping budding researchers in their pursuit of a Ph.D. by providing a platform for the exchange of ideas. Over the years, we have continuously updated the list of research areas so that the scope of the symposium includes the most significant research domains each year. This helps ACSS stay in tune with evolving research trends. The seventh year of the symposium showed a significant improvement in the overall quality of papers, besides some very interesting papers in the domains of security and image analysis. We are grateful to the program committee members for sharing their expertise and taking time off from their busy schedules to complete the review of the papers with the utmost sincerity. The reviewers have pointed out the points of improvement for each paper they reviewed, and we believe that these suggestions will go a long way toward improving the overall quality of research among the scholars. In our bid to match the changing research perspectives, we invited some eminent researchers to share some of their specific, hitherto unpublished research works. The submitted works were sent for feedback to the members of the Program Committee. This volume consists mainly of the invited papers and some contributed papers from ACSS 2020. We sincerely hope that this volume answers many queries of researchers besides opening some new research dimensions.
We have invited researchers working in the domains of Computer Vision and Applications, Biometrics-based Authentication, Security for Internet of Things, Analysis and Verification Techniques, Security in Heterogeneous Networks, Large Scale Networking, Remote Healthcare, Distributed Systems, Signal Processing, Routing and Security in WSN, Intelligent Transportation System, Human Computer Interaction, Bioinformatics and System Biology, Computer Forensics, Privacy and Confidentiality, Access Control, Big Data and Cloud Computing, Data Analytics, VLSI Design and Embedded Systems, Requirements Engineering, Software Security, Algorithms, Natural Language Processing, and Quantum Computing to submit their ongoing research works. The indexing initiatives from Springer have drawn high-quality submissions from scholars in India and abroad. ACSS continues the tradition of a double-blind review process by the PC members and by external reviewers. The reviewers mainly considered the technical aspect and novelty of each paper, besides the validation of each work. This being a doctoral symposium, clarity of presentation was also given importance. The entire process of paper submission, review, and acceptance was done electronically.

We thank the members of the Program Committee and Organizing Committee, whose sincere efforts before and during the symposium have resulted in a suite of strong technical paper presentations followed by effective discussions and suggestions for improvement for each researcher. The Technical Program Committee for the symposium selected only 15 papers for publication out of 35 submissions. We would like to take this opportunity to thank all the members of the Program Committee and the external reviewers for their excellent and time-bound review work. We thank Springer for sponsoring the best paper award. We would also like to thank ACM for its continuous support toward the success of the symposium. We appreciate the initiative and strong support of Mr. Aninda Bose and his colleagues at Springer Nature toward publishing this post-symposium book in the series "Advances in Intelligent Systems and Computing". Last but not least, we thank all the authors, without whom the symposium would not have reached this standard. On behalf of the editorial team of ACSS 2020, we sincerely hope that ACSS 2020 and the works discussed in the symposium will be beneficial to all its readers and motivate them toward even better works.

Rituparna Chaki, Kolkata, India
Agostino Cortesi, Venice, Italy
Khalid Saeed, Bialystok, Poland
Nabendu Chaki, Kolkata, India

Contents

Invited Papers

A New Graph Polynomial and Generalized Tutte–Grothendieck Invariant from Quantum Circuits
Chaowen Guan and Kenneth W. Regan

Cryptanalysis of a Centralized Location-Sharing Scheme for Mobile Online Social Networks
Munmun Bhattacharya, Sandip Roy, Soumya Banerjee, and Samiran Chattopadhyay

Bounded Version Vectors Using Mazurkiewicz Traces
Madhavan Mukund, Gautham Shenoy R., and S. P. Suresh

Intents Analysis of Android Apps for Confidentiality Leakage Detection
Rocco Salvia, Agostino Cortesi, Pietro Ferrara, and Fausto Spoto

Fingerprint and Keystroke Dynamics Fusion in Multimodal Biometrics System
Maciej Szymkowski, Patryk Milewski, and Khalid Saeed

Contributed Papers

QoS Enhancement in WBAN with Twin Coordinators
Sriyanjana Adhikary, Biswajit Ghosh, and Sankhayan Choudhury

A Distributed Power Control Scheme for Device-to-Device Communication in Cellular Networks
Udit Narayana Kar, Debarshi Kumar Sanyal, and Monideepa Roy

An Application of Block Chain in Examination System, A Case Study
Ashis Kumar Samanta and Bidyut Biman Sarkar

A Study on Energy-Efficient Routing Protocols for Wireless Sensor Networks
Soumyabrata Saha and Rituparna Chaki

About the Editors

Rituparna Chaki is a Professor of Information Technology at the University of Calcutta, India. She received her Ph.D. from Jadavpur University, India, in 2003. She served as a System Executive in the Ministry of Steel, Government of India, before joining academia in 2005. Prof. Chaki has been with the University of Calcutta since 2013. Her current research interests include systems software, mobile ad hoc networks, and the Internet of things.

Agostino Cortesi, Ph.D., is a Full Professor of Computer Science at Ca' Foscari University, Venice, Italy. He has previously served as Dean of Computer Science studies, as Department Chair, and as Vice-Rector for quality assessment and institutional affairs. His research interests include programming language theory, software engineering, and static analysis techniques, with particular emphasis on security applications.

Khalid Saeed is a Full Professor at the Faculty of Computer Science, Bialystok University of Technology, Poland. His research interests include biometrics, image analysis and processing, and computer information systems.

Nabendu Chaki is a Professor at the Department of Computer Science & Engineering, University of Calcutta, India. He received his Ph.D. from Jadavpur University, India, in 2000. Prof. Chaki has been highly active in developing international standards for software engineering and cloud computing as a member of the Global Directory (GD) for ISO-IEC. His research interests include software engineering, distributed systems, and security.

Invited Papers

A New Graph Polynomial and Generalized Tutte–Grothendieck Invariant from Quantum Circuits

Chaowen Guan and Kenneth W. Regan

Abstract A new polynomial $Q_G(x)$ associated to graphs $G$ is defined and studied. The main theorems represent $Q_G(x)$ as a quasi-specialization of the rank-generating polynomial $S_G(x, y)$ of Oxley and Whittle (J Comb Theory Ser B 59:210–244, 1993) [10] and show that $Q_G$ is likewise a generalized Tutte–Grothendieck invariant. The value $Q_G(1)$ gives the amplitude of acceptance for a class of quantum circuits with associated graphs $G$. This class, called stabilizer circuits or Clifford circuits, has long been known to have deterministic polynomial time simulation, so $Q_G(1)$ is polynomial time computable, given $G$ as input. The specialization has $y = -\sqrt{2}\,i$, which (along with its complex conjugate) is the only choice that invalidates formulas in a theorem by Noble (Comb. Probab. Comput 15:449–461, 2006) [9] classifying hard and easy real points of $S_G$, so the complexity of other points $Q_G(x)$ is open. We reduce the base cases for $S_G$ by adjoining multiple kinds of isolated nodes and draw possible further implications of the connections between matroid theory and quantum computing developed by this work.

Keywords Graphs · Matroids · Tutte polynomial · Quantum circuits

1 Introduction

The Tutte polynomial $T_G(x, y)$ of an undirected $n$-vertex graph $G = (V, E)$ is based on the following property of spanning trees $T$, for any edge $e \in E$: Either $T$ excludes $e$, in which case $T$ is a spanning tree of the graph $G \setminus e$ obtained by deleting $e$ from $E$, or $T$ includes $e$, in which case the rest of $T$ is a spanning tree of the $(n - 1)$-vertex graph $G/e$ obtained by contracting $e$. Hence spanning trees of $G$ are in 1–1 correspondence with spanning trees of $G \setminus e$ union those of $G/e$.

C. Guan · K. W. Regan, University at Buffalo (SUNY), Buffalo, NY, USA
© Springer Nature Singapore Pte Ltd. 2021. R. Chaki et al. (eds.), Advanced Computing and Systems for Security, Advances in Intelligent Systems and Computing 1178, https://doi.org/10.1007/978-981-15-5747-7_1

For an edge $e = (u, v)$, contraction is defined by identifying $u = v$ in all other edges of $G$. Contracting one edge of a triangle creates a double edge, and contracting one edge of a double edge creates a self-loop. A single node with however many loops is considered to have one spanning tree of zero edges. A loop edge cannot be contracted. Contraction and deletion are shown in the following figure:

The 1–1 correspondence is abstracted into the following recursion, which together with base cases defines $T_G(x, y)$ for all $G$:

$$T_G(x, y) = T_{G \setminus e}(x, y) + T_{G/e}(x, y). \tag{1}$$

The base graphs $G$ are $n$-node trees with $k$ self-loops, for which $T_G$ is the monomial $x^{n-1} y^k$. For the graph $G_1$ consisting of a single node and no edges, which is considered to have $\emptyset$ as a spanning tree, $T_{G_1}(x, y) = 1$. An explicit formula is

$$T_G(x, y) = \sum_{A \subseteq E} (x - 1)^{cc(A) - cc(E)} (y - 1)^{cc(A) + |A| - n}, \tag{2}$$

where $cc(A)$ means the number of connected components in $A$. When $G$ is the disjoint union of graphs $G_1$ and $G_2$, $T_G = T_{G_1} \cdot T_{G_2}$; then $T_G(1, 1)$ counts the number of spanning forests that have one tree per connected component of $G$. More general is the notion of a spanning edge-subgraph $S$, which need not be connected and can have as few as $\lceil n/2 \rceil$ edges. The corresponding recursive counting principle involves eliminating both $u$ and $v$ when $e$ is used, then turning any edge $(u, w)$ or $(v, w)$ with $w \neq u, v$ into a self-loop at $w$. We call this step "explosion" and denote the resulting graph by $G \backslash\backslash e$, as illustrated here:

Explosion can produce multiple loops at the nodes $w$. Existing loops at $u$ or $v$, and possible $(u, v)$-edges other than $e$, become circles, meaning edges incident to no vertex, whose (non-)membership in $S$ does not affect the spanning property. The following principle works also when exploding a single loop: Either $S$ excludes $e$, in which case $S$ spans $G \setminus e$, or $S$ includes $e$, in which case the rest of $S$ spans $G \backslash\backslash e$. If the rest of $S$ includes other edges involving $u$ and/or $v$, they are preserved in $G \backslash\backslash e$ as loops or circles. Hence spanning subgraphs of $G$ remain in 1–1 correspondence with spanning subgraphs of $G \setminus e$ union those of $G \backslash\backslash e$.

We give a "deeper" basis than the one defined in [10] by augmenting graphs with two more notions besides circles:

– A "wisp" is a copy of the empty graph $G_\emptyset = (\emptyset, \emptyset)$. The explosion of an edge $(u, v)$ leaves behind two wisps in place of $u$ and $v$.
– A "negative isol" $G_{-1}$ is a signed wisp. The explosion of a self-loop at $u$ leaves behind a $G_{-1}$ in place of $u$. The explosion of a circle can be considered to leave behind two of them.

Definition 1 (cf. [10]) The (adjusted) rank-generating polynomial $S_G(x, y)$ is defined by recursion on $G = (V, E)$ by

$$S_{G_\emptyset}(x, y) = 1; \tag{3}$$
$$S_{G_1}(x, y) = x; \tag{4}$$
$$S_{G_{-1}}(x, y) = y; \tag{5}$$
$$S_G(x, y) = S_{G_1}(x, y)\, S_{G_2}(x, y) \quad \text{if } G \text{ is the disjoint union of } G_1 \text{ and } G_2; \tag{6}$$
$$S_G(x, y) = S_{G \setminus e}(x, y) + S_{G \backslash\backslash e}(x, y) \quad \text{for any } e \in E. \tag{7}$$

This definition differs from $S_G$ as defined in [10] by a factor of $x^k$ where $k$ is the number of isolated nodes. It gives the same polynomials for the circle graph, the graph of a self-loop at one node, and the graph $G_2$ of one edge between two nodes, which are defined as the basis in [10]:

$$S_{\text{circle}}(x, y) = 1 + y^2; \qquad S_{\text{loop}}(x, y) = x + y; \qquad S_{G_2}(x, y) = x^2 + 1.$$

A key distinction is that $G_1$ is not considered to have any spanning edge-subgraphs; instead, its lone node is "untouched." By the multiplicative rule, any graph with an isolated node has zero spanning edge-subgraphs. For any $A \subseteq E$, let $f(A)$ denote the number of nodes it touches. This allows the rule defining $S_G$ to extend to graphs with isolated nodes, upon replacing "$f(E)$" in [10] by "$n$":

Proposition 2 (cf. [10]) For any graph $G = (V, E)$ with $n = |V|$, including also graphs with circles, wisps, and negative isols,

$$S_G(x, y) = \sum_{A \subseteq E} x^{n - f(A)}\, y^{2|A| - f(A)},$$

and $S_G(0, 1)$ counts the number of spanning edge-subgraphs of $G$.

The proof is as given in [10], taking the above adjustments into account. Henceforth, we mark the adjusted $S_G$ separately from the original only when it is necessary to recall the difference, which arises when $G$ may have isolated nodes. Attempts to re-base $T_G$ seem less satisfying. Note that $T_G$ for a connected graph $G$ always has degree $n - 1$ in $x$, since the case $A = \emptyset$ makes $cc(A) = n$ by the definition that an isolated vertex is its own connected component, whereas this case makes $S_G$ have degree $n$ in $x$. Another difference is that the number of spanning trees $T_G(1, 1)$ is computable in polynomial time, whereas computing the number $S_G(0, 1)$ of spanning edge-subgraphs is NP-hard, indeed #P-complete (see the main theorem of [9]).

The genius of both $T_G$ and $S_G$ is that evaluations at other points $(x, y)$ give other information about the graph. For example, when $G$ is connected, $T_G(1, 2)$ counts the number of spanning edge-subgraphs that are connected, and for disconnected $G$, it counts the number of spanning edge-subgraphs $A$ with $cc(A) = cc(G)$. $T_G(2, 1)$ counts all edge-subgraphs that are forests, whereas $T_G(2, 2)$ is just $2^{|E|}$. The main separate significance of $S_G$ is that $S_G(0, 0)$ counts the number of perfect matchings of $G$, and $S_G(1, 0)$ the number of ("partial") matchings overall. The specialization $T_G(x, \frac{1}{x})$ gives the Jones polynomial of a topological knot associated to $G$. If $G$ has no isolated nodes then

$$x^{n/2}\, S_G(x^{-1/2}, 0) = \sum_{k \geq 0} m_k x^k, \tag{8}$$

where $m_k$ is the number of matchings of size $k$. We call the left side of (8) a quasi-specialization of $S_G$ owing to the extra factor $x^{n/2}$. The most famous quasi-specialization of $T_G$ is the chromatic polynomial

$$\chi_G(z) = (-1)^{n - cc(G)}\, z^{cc(G)}\, T_G(1 - z, 0),$$

which counts the number of ways to color $G$ properly using $z$ colors. Another beautiful point is that these evaluations and (quasi-)specializations can derive from recursions akin to (1) and (7) but with other linear coefficients $a, b$ besides $a = 1$ and $b = 1$. For example, the recursion for $\chi_G(z)$ uses $b = -1$.

The main contribution of this paper is discovering a polynomial $Q_G(z)$ that conveys information about a class of quantum circuits $C_G$ associated to graphs $G$. In particular, $Q_G(1)$ gives the acceptance amplitude of $C_G$ on the all-zero input, given all-zero output as the acceptance criterion, which is polynomial time computable for this class of circuits. We derive $Q_G(z)$ by recursion and show that it is a quasi-specialization of $S_G(x, y)$ at the complex points $x = i\sqrt{2}\,z$, $y = -i\sqrt{2}$. We show that like $S_G$, $Q_G$ is a generalized Tutte–Grothendieck invariant (GTGI). We discuss possible further significance of these results and connections. To support this purpose, we include a section on matroid theory as originally used to derive $S_G$ before defining $Q_G$.
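The evaluations of $T_G$ quoted in this section can be checked on small graphs with a brute-force sketch. The code below assumes the standard subset expansion $T_G(x, y) = \sum_{A \subseteq E} (x-1)^{cc(A)-cc(E)} (y-1)^{cc(A)+|A|-n}$, with nodes labeled $0, \dots, n-1$; the function names `components` and `tutte` are illustrative choices, not notation from the paper, and the enumeration is exponential in $|E|$.

```python
from itertools import combinations

def components(n, edges):
    """Count connected components of the graph on nodes 0..n-1 with these edges."""
    parent = list(range(n))

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(a) for a in range(n)})

def tutte(n, edges, x, y):
    """Evaluate T_G(x, y) by summing the subset expansion over all A within E."""
    cc_E = components(n, edges)
    total = 0
    for r in range(len(edges) + 1):
        for A in combinations(edges, r):
            cc_A = components(n, A)
            total += (x - 1) ** (cc_A - cc_E) * (y - 1) ** (cc_A + len(A) - n)
    return total

triangle = [(0, 1), (1, 2), (0, 2)]
spanning_trees = tutte(3, triangle, 1, 1)
forests = tutte(3, triangle, 2, 1)
connected_subgraphs = tutte(3, triangle, 1, 2)
# Chromatic polynomial at z = 3 via (-1)^(n - cc) * z^cc * T_G(1 - z, 0), with cc = 1.
colorings_3 = (-1) ** (3 - 1) * 3 * tutte(3, triangle, 1 - 3, 0)
```

For the triangle this sketch reproduces the interpretations given in the text: three spanning trees, seven forests, four connected spanning subgraphs, and six proper 3-colorings.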

2 Matroids and Polymatroids

Crapo [5] and Brylawski [3] extended the definition of the Tutte polynomial to matroids. Brylawski also identified the relevance of the Grothendieck ring in his paper, and this led to the convention of naming GTGIs. In 1979, Oxley and Welsh proved a set of conditions for a graph invariant to be called a GTGI [11]. This result is known as the "recipe theorem" because it defines an umbrella form for TGIs. In 1993, Oxley and Whittle [10] further extended the theory of Tutte invariants to 2-polymatroids, whose definitions we review. A matroid is defined by a set $U$ and a function $f$ from finite subsets of $U$ to $\mathbb{N}$ that obeys the rules:

1. $f(\emptyset) = 0$;
2. for all $a \in U$, $f(\{a\}) \leq 1$;
3. if $A \subseteq B$ then $f(A) \leq f(B)$; and
4. if $A \subseteq B$ and $c \notin B$ then $f(A \cup \{c\}) - f(A) \geq f(B \cup \{c\}) - f(B)$.

The function giving the rank of sets of vectors in an ordinary vector space obeys these axioms. A motivation for polymatroids is that the function $f(A)$ = the number of vertices touched by edges in $A$ obeys these axioms, except that $f(\{e\}) = 2$ when $e$ is an edge $(u, v)$. A $k$-polymatroid replaces rule 2 by the rule that all singleton sets $\{e\}$ have $f(\{e\}) \leq k$. Not all 2-polymatroids arise from graphs; those that do are called graphical 2-polymatroids (G2PMs). The upshot with polymatroids is that there is no notion of "vertex," only the definition of the rank function $f$. Hence, all graphs composed of $k$ isolated vertices define the empty 2-polymatroid. Thus, the original definition of the rank-generating function for polymatroids $(E, f)$,

$$S_f(x, y) = \sum_{A \subseteq E} x^{f(E) - f(A)}\, y^{2|A| - f(A)}, \tag{9}$$

when restricted to G2PMs, is unchanged by isolated nodes. Including those nodes in Definition 1 for our modified $S_G$ will simplify later formulas.
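To make the effect of isolated nodes concrete, here is a minimal brute-force sketch of the rank-generating sum with $f(A)$ taken as the number of nodes touched by $A$. The flag `adjusted` (a name introduced here, not in the paper) switches the $x$ exponent between $n - f(A)$, as in Proposition 2, and $f(E) - f(A)$, as in formula (9). Edges are pairs of node labels, and a loop is a pair with equal entries.

```python
from itertools import combinations

def touched(A):
    """f(A): number of distinct nodes touched by the edges in A (a loop touches one)."""
    return len({w for e in A for w in e})

def rank_generating(n, edges, x, y, adjusted=True):
    """Sum x^(N - f(A)) * y^(2|A| - f(A)) over all subsets A of the edge list.

    adjusted=True uses N = n, so isolated nodes contribute a factor of x each;
    adjusted=False uses N = f(E), which ignores isolated nodes.
    """
    N = n if adjusted else touched(edges)
    total = 0
    for r in range(len(edges) + 1):
        for A in combinations(edges, r):
            fA = touched(A)
            total += x ** (N - fA) * y ** (2 * len(A) - fA)
    return total

cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
perfect_matchings = rank_generating(4, cycle4, 0, 0)
all_matchings = rank_generating(4, cycle4, 1, 0)
edge_covers = rank_generating(4, cycle4, 0, 1)
# One edge plus one isolated node (n = 3): the two conventions differ by a factor x.
with_isolated = rank_generating(3, [(0, 1)], 3, 5)
without_isolated = rank_generating(3, [(0, 1)], 3, 5, adjusted=False)
```

On the 4-cycle the three evaluations give the number of perfect matchings, of all matchings, and of spanning edge-subgraphs respectively, and the last two values differ by exactly the factor $x = 3$ contributed by the isolated node.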

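3 Quantum Stabilizer Circuits and Graphs

A quantum circuit consists of some number $n$ of qubit lines and some number $s$ of gates acting on qubits. Quantum stabilizer circuits use only gates from the set $\{H, S, Z, CZ\}$ where

$$H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \quad S = \begin{bmatrix} 1 & 0 \\ 0 & i \end{bmatrix}, \quad Z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad CZ = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}.$$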
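The four gate matrices are small enough to write out and sanity-check directly. The sketch below uses plain nested lists and Python complex numbers; the helpers `matmul` and `close` are illustrative names, and the checks cover only the elementary identities $S^2 = Z$, $HZH = \mathrm{NOT}$, and $H^2 = I$.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(A, B, tol=1e-12):
    """Entrywise comparison up to floating-point tolerance."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

r = 1 / math.sqrt(2)
H = [[r, r], [r, -r]]           # Hadamard
S = [[1, 0], [0, 1j]]           # phase gate
Z = [[1, 0], [0, -1]]
CZ = [[1, 0, 0, 0],
      [0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, -1]]            # controlled-Z, diagonal in the standard basis
NOT = [[0, 1], [1, 0]]
I2 = [[1, 0], [0, 1]]

s_squared_is_z = close(matmul(S, S), Z)
hzh_is_not = close(matmul(H, matmul(Z, H)), NOT)
h_is_self_inverse = close(matmul(H, H), I2)
```

Since S, Z, and CZ are all diagonal, they commute with one another, which is why their order between the two layers of Hadamard gates does not matter in a graph-state circuit.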

These gates generate the Clifford group on $n$ qubits. Note $Z = S^2$ and we also get $\mathrm{NOT} = HZH$. Each circuit $C$ of these gates yields a $2^n \times 2^n$ unitary matrix whose entry for any row label $x \in \{0, 1\}^n$ and column label $y \in \{0, 1\}^n$ gives the complex amplitude $\langle y | C | x \rangle$. Stabilizer circuits are distinguished by these amplitudes being computable in deterministic polynomial time ([7], see also [1]). Graph-state circuits are the special case consisting of an initial Hadamard (H) gate on each qubit line, then CZ and S gates placed ad lib, then a final H gate to close each line. The CZ and S gates all commute, so they can be placed in any order and simplified. The (surviving) CZ gates become edges and any Z gates become self-loops. A possible leftover S gate on each line is handled as below.

That graph-state circuits capture all stabilizer circuits $C$ is well known (see [2]), but we sketch a proof via quadratic forms $q_C$ over $\mathbb{Z}_4$ to develop further connections. Initially, $q_C = 0$ and each qubit line $i$ has $x_i$ as its current annotation $u_i$. In general, suppose the nondeterministic variables $y_1, \dots, y_{\ell-1}$ have been allocated while transcribing the first $s - 1$ gates of $C$. The next gate gives cases:

– H on line $i$: Allocate a variable $y_\ell$, do $q_C \mathrel{+}= 2 u_i y_\ell$, and reassign $u_i$ to be $y_\ell$;
– CZ gate on lines $i$ and $j$: $q_C \mathrel{+}= 2 u_i u_j$, no other change;
– S gate on line $i$: $q_C \mathrel{+}= u_i^2$, no other change.

So Z does $q_C \mathrel{+}= 2 u_i^2$. At the end of each qubit line $i$, we can identify $z_i$ with the variable last denoted by $u_i$. Let $h$ be the final number of nondeterministic variables, which here is the number of Hadamard gates in $C$. If the input and output are $|0\rangle^{\otimes n}$, so that all $x_i$ and $z_i$ are zeroed out, we are left with

$$q_C(y_1, \dots, y_h) = \sum_{i \in S} a_i y_i^2 + \sum_{(i, j) \in E} 2 y_i y_j \pmod 4, \tag{10}$$

where $S$ is the set of lines having S gates, $a_i$ corresponds to how many times S is applied to line $i$, and $E$ is the set of nonzero $y_i y_j$ terms. Then

$$\langle 0^n | C | 0^n \rangle = \frac{1}{2^{n/2}} (N_0 - N_2) + \frac{1}{2^{n/2}}\, i\, (N_1 - N_3),$$

where $N_c$ stands for the number of arguments $y \in \{0, 1\}^n$ giving $q(y) = c \bmod 4$ for $c = 0, 1, 2, 3$. Regarding the graph $G$ as formed by $E$ on $V = \{1, \dots, n\}$ suffices for the notion of simulation by graph-state circuits in [2], since that is modulo the local (i.e., single-qubit) operations of the S gates, but we can go further by considering the probabilities $|\langle 0^n | C_G | 0^n \rangle|^2$. Then a process in Sect. 5 of [8] converts $q_C$ into $q'_C$ using twice as many variables such that

$$|\langle 0^n | C | 0^n \rangle|^2 = \frac{1}{2^n} (N'_0 - N'_2),$$

where $N'_0, N'_2$ are defined analogously and the values 1 and 3 do not apply. The set $S$ of $i$ giving $a_i = 2$ receive self-loops in the resulting graph $G'$, which in turn yields a graph-state circuit $C'$ with no leftover S gates such that the counts underlying its (always-real) amplitudes yield probabilities of the original $C$. Thus, we can regard graphs as fully representative of stabilizer circuit computations.

Now consider black/white two-colorings (not necessarily proper) of $G$. Call an edge (or self-loop) a B-B edge if both ends are colored black. Let $c_0$ count the colorings that make an even number of B-B edges and $c_1 = 2^n - c_0$ those giving an odd number. The following, implicit also in [6], can be called "amplitude":

$$a(G) = \frac{c_0 - c_1}{2^n}.$$

Because the denominator is $2^n$, not $2^{n/2}$, it may be better understood as tracking probability. A wisp or circle has $n = 0$ but is considered to have one black coloring, so $a(G_\emptyset) = 1$ and $a(\text{circle}) = -1$. The isolated node $G_1$ has $n = 1$, $c_0 = 2$, and $c_1 = 0$, so $a(G_1) = 1$. The loop has $c_0 = c_1 = 1$, so $a(\text{loop}) = 0$. Now we claim

$$a(G) = a(G \setminus e) - \frac{1}{2}\, a(G \backslash\backslash e) \tag{11}$$

for any edge $e = (u, v)$. To prove this, first note that every coloring has the same odd/even parity of black-black edges for $G \setminus e$ as for $G$ except when both $u$ and $v$ are black. Let $c_0^{uv}$ denote the number of colorings among the latter that make an even number of black-black edges (including $e$) overall, and $c_1^{uv}$ the number that make an odd number. Each such coloring has the opposite parity in $G \setminus e$, so

$$a(G) = a(G \setminus e) - \frac{2}{2^n} \left( c_1^{uv} - c_0^{uv} \right).$$

Absent other edges between (or loops at) $u$ and $v$, $c_1^{uv}$ is the same as the number of colorings of $G \backslash\backslash e$ that make an even number of B-B edges, and $c_0^{uv}$ becomes the odd case in $G \backslash\backslash e$, again because we subtracted $e$. Considering the sign changes from any other $(u, v)$ edges or loops and $\frac{2}{2^n} = \frac{1/2}{2^{n-2}}$ yields (11). To handle the case of a self-loop $e$ at a node $u$, define $c_0^W(G)$ to be the number of colorings with an even number of B-B edges that color $u$ white, similarly $c_0^B(G)$ for $u$ black, likewise $c_1^W(G)$, $c_1^B(G)$ for odd, and apply this notation the same way for $G \setminus e$ and $G \backslash\backslash e$. This also justifies two options we give later for negative isols:

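$$\begin{aligned}
a(G) &= \frac{1}{2^n} \left( c_0(G) - c_1(G) \right) = \frac{1}{2^n} \left( c_0^W(G) - c_1^W(G) - c_1^B(G) + c_0^B(G) \right) \\
&= \frac{1}{2^n} \left( c_0^W(G \setminus e) - c_1^W(G \setminus e) - c_1^B(G) + c_0^B(G) \right) \\
&= \frac{1}{2^n} \left( c_0(G \setminus e) - c_1(G \setminus e) - c_0^B(G \setminus e) + c_1^B(G \setminus e) - c_1^B(G) + c_0^B(G) \right) \\
&= \frac{1}{2^n} \left( c_0(G \setminus e) - c_1(G \setminus e) + \left( c_1^B(G \setminus e) + c_0^B(G) \right) - \left( c_0^B(G \setminus e) + c_1^B(G) \right) \right) \\
&= \frac{1}{2^n} \left( c_0(G \setminus e) - c_1(G \setminus e) + 2 c_1^B(G \setminus e) - 2 c_0^B(G \setminus e) \right) \\
&= \frac{1}{2^n} \left( c_0(G \setminus e) - c_1(G \setminus e) + 2 c_1(G \backslash\backslash e) - 2 c_0(G \backslash\backslash e) \right) \\
&= \frac{1}{2^n} \left( c_0(G \setminus e) - c_1(G \setminus e) \right) - \frac{1}{2^{n-1}} \left( c_0(G \backslash\backslash e) - c_1(G \backslash\backslash e) \right) \\
&= a(G \setminus e) - a(G \backslash\backslash e).
\end{aligned}$$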
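The coloring definition of $a(G)$ is easy to evaluate by enumeration on small graphs, which gives a direct check of the base values and of recurrence (11) in simple cases. The sketch below assumes nodes labeled $0, \dots, n-1$ and edges given as pairs (a loop is a pair with equal entries); the function name `amplitude` is an illustrative choice, and circles, wisps, and negative isols are not represented.

```python
from fractions import Fraction
from itertools import product

def amplitude(n, edges):
    """a(G) = (c0 - c1) / 2^n, enumerating all black/white colorings of n nodes."""
    diff = 0
    for colors in product((0, 1), repeat=n):      # 1 means black
        bb = sum(1 for u, v in edges if colors[u] and colors[v])
        diff += 1 if bb % 2 == 0 else -1          # even count adds to c0, odd to c1
    return Fraction(diff, 2 ** n)

triangle = [(0, 1), (1, 2), (0, 2)]
path = [(1, 2), (0, 2)]               # the triangle with edge (0, 1) deleted
double_loop = [(0, 0), (0, 0)]        # exploding (0, 1) leaves node 2 with two loops

lhs = amplitude(3, triangle)
rhs = amplitude(3, path) - Fraction(1, 2) * amplitude(1, double_loop)
```

Here `lhs` and `rhs` are the two sides of (11) for the triangle with $e = (0, 1)$: deleting $e$ leaves a path, and exploding $e$ leaves the third node carrying two loops.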



4 The Amplitude Polynomial $Q_G(x)$

The edge and loop cases of the last section's deletion-explosion recurrence for $a(G)$ can be combined using the rank function $f(e)$ into one line as follows:

$$a(G) = a(G \setminus e) - \frac{1}{f(e)}\, a(G \backslash\backslash e). \tag{12}$$

Defining $a(G) = a(G_1)\, a(G_2)$ for a disjoint union completes a recursive definition of $a(G)$. To get more mileage, we want to emulate the ideas behind $T_G$ and $S_G$:

Definition 3 For an m-graph $G$, define its amplitude polynomial $Q_G(x)$ inductively by $Q_G(x) = (-1)^k x^\ell$ if $G$ consists of $\ell$ isolated nodes, $k$ circles, and any number of wisps or negative isols, and for any edge or loop $e$,

$$Q_G(x) = Q_{G \setminus e} - \frac{1}{f(e)}\, Q_{G \backslash\backslash e}. \tag{13}$$

If we prefer writing $Q_G(x) = Q_{G \setminus e} - \frac{1}{2} Q_{G \setminus\!\setminus e}$ for loops too, then we can stipulate that a negative isol subtracts 1 from n or has amplitude 2. (This is cosmetic.) When G consists of a single node with a loop, the recursion gives $Q_{\mathrm{loop}}(x) = x - 1$. When it is a single edge e between two nodes, we get $Q_e(x) = x^2 - \frac{1}{2}$. If we have a single node v with two loops, something portentous happens. Exploding one of the loops e turns the other into a circle. Removing e still leaves one loop. So we get:

$$Q_G(x) = Q_{\mathrm{loop}}(x) - Q_{\mathrm{circle}}(x) = x - 1 - (-1) = x.$$

This agrees with the value on one isolated node. Similarly, if G has just a double edge between two nodes u and v, then one edge e becomes a circle upon exploding the other edge, so recursing on the other edge makes

$$Q_G(x) = Q_e(x) - \frac{1}{2}\, Q_{\mathrm{circle}}(x) = x^2 - \frac{1}{2} + \frac{1}{2} = x^2.$$

Thus, the double edge gives the same result as having two isolated nodes. The upshot is that Eq. (13) naturally treats edges modulo two. We can recurse on a non-edge as if it were a double edge. Any circle produced in an explosion becomes a −1 multiplier on that branch of the recursion. The circle thus becomes a placeholder for calculating phase flips and cancellations. It may not be obvious, however, that Q G (x) is well defined when G has more than one edge. Does it come out the same for any order of choosing edges e to recurse? We will prove confluence by connecting Q G to the polynomial SG (u, v).

5 Relation to the Rank-Generating Function $S_G(x, y)$

Note that Definition 1 of $S_G$ has a basis that is symmetric in x and y, as is the one-edge basis $S_{\mathrm{circle}}(x, y) = 1 + y^2$, $S_{\mathrm{loop}}(x, y) = x + y$, and $S_{G_2}(x, y) = x^2 + 1$ (on which $S_G$ and $S'_G$ agree). Our first main theorem, however, breaks the symmetry.

Theorem 1 For any m-graph G with n nodes, of which k are isolated, and all $x \in \mathbb{C}$, taking $\alpha = -\sqrt{2}\,i$, we have the following:

$$Q_G(x) = \left(\frac{1}{\alpha}\right)^{n} S'_G(\alpha x, -\alpha) = \left(\frac{1}{\alpha}\right)^{n} S_G(\alpha x, -\alpha)\,(\alpha x)^k. \tag{14}$$

In consequence, Definition 3 for Q(x) is confluent, likewise for $a(G) = Q_G(1)$.

Proof Call $R_G(x)$ the right-hand side of (14). Note first that if G consists only of k isolated nodes, then $S_G(\cdot, \cdot) = 1$ while $S'_G(\alpha x, -\alpha) = (\alpha x)^k$, so

$$R_G(x) = \left(\frac{1}{\alpha}\right)^{k} (\alpha x)^k = x^k = Q_G(x).$$

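Now we verify the other base cases:

1. $R_\emptyset(x) = (1/\alpha)^0 \cdot 1 \cdot (\alpha x)^0 = 1 = Q_\emptyset(x)$.
2. $R_{\mathrm{circle}} = 1 \cdot (1 + (-\alpha)^2) \cdot 1 = 1 - 2 = -1 = Q_{\mathrm{circle}}(x)$.
3. $R_{\mathrm{loop}} = \frac{1}{\alpha}(\alpha x - \alpha) = x - 1 = Q_{\mathrm{loop}}(x)$.
4. $R_{G_2}(x) = \left(\frac{1}{\alpha}\right)^2 \left((\alpha x)^2 + 1\right) = x^2 + \frac{1}{\alpha^2} = x^2 - \frac{1}{2} = Q_{G_2}(x)$.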
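The base cases above are easy to spot-check numerically. The following is a minimal sketch, not part of the proof; the function names are illustrative, and the basis polynomials are the ones quoted in the text (1 + y² for the circle, x + y for the loop, x² + 1 for the single edge G_2).

```python
import cmath

# alpha = -sqrt(2) * i, the value taken in Theorem 1
ALPHA = -cmath.sqrt(2) * 1j

def r_circle():
    # Circle: no nodes, so the (1/alpha)^n factor is 1; basis value 1 + y^2 at y = -alpha
    return 1 + (-ALPHA) ** 2

def r_loop(x):
    # Loop: one node; basis value x + y evaluated at (alpha*x, -alpha)
    return (1 / ALPHA) * (ALPHA * x - ALPHA)

def r_g2(x):
    # Single edge G_2: two nodes; basis value x^2 + 1 evaluated at (alpha*x, -alpha)
    return (1 / ALPHA) ** 2 * ((ALPHA * x) ** 2 + 1)

# Compare against Q_circle = -1, Q_loop = x - 1, Q_{G_2} = x^2 - 1/2 at sample points
for x in (0.0, 1.0, 2.5, -3.0):
    assert abs(r_circle() - (-1)) < 1e-9
    assert abs(r_loop(x) - (x - 1)) < 1e-9
    assert abs(r_g2(x) - (x * x - 0.5)) < 1e-9
```

Only α² = −2 is used in these three checks, so they do not depend on the sign chosen for α.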

For the recursion, we continue the policy of tracking $S_G$ as employed by [10] even though using our $S'_G$ would combine and simplify the cases:


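(a) If e is a loop at a vertex with other incident edges, so f(e) = 1 and f(E \ {e}) = f(E), then $S_G(u, v) = S_{G \setminus e}(u, v) + v\,S_{G \setminus\!\setminus e}(u, v)$.
(b) If e is a pendant edge, i.e., with one node of degree 1, then f(e) = 2 and f(E \ {e}) = f(E) − 1, so $S_G(u, v) = u\,S_{G \setminus e}(u, v) + S_{G \setminus\!\setminus e}(u, v)$.
(c) If e is an edge between nodes of degree at least 2, so f(e) = 2 and f(E \ {e}) = f(E), then $S_G(u, v) = S_{G \setminus e}(u, v) + S_{G \setminus\!\setminus e}(u, v)$.

Every connected graph other than those we put in our basis has an edge e that falls into one of these three cases. By induction we analyze (a, b, c) as follows:

(a) Removing or exploding e gives no isolated node, so the (αx) term factors through and can be ignored. Forming G\\e reduces n by 1. Thus we get:

$$\begin{aligned}
Q_G(x) &= Q_{G \setminus e}(x) - Q_{G \setminus\!\setminus e}(x) \\
&= \left(\frac{1}{\alpha}\right)^{n} S_{G \setminus e}(\alpha x, -\alpha) - \left(\frac{1}{\alpha}\right)^{n-1} S_{G \setminus\!\setminus e}(\alpha x, -\alpha) \\
&= \left(\frac{1}{\alpha}\right)^{n} \left[S_{G \setminus e}(\alpha x, -\alpha) + (-\alpha)\,S_{G \setminus\!\setminus e}(\alpha x, -\alpha)\right] = \left(\frac{1}{\alpha}\right)^{n} S_G(\alpha x, -\alpha).
\end{aligned}$$

(b) Removing e causes one more isolated node, but while exploding e reduces n by 2, it does not isolate any nodes, so:

$$\begin{aligned}
Q_G(x) &= Q_{G \setminus e}(x) - \frac{1}{2} Q_{G \setminus\!\setminus e}(x) \\
&= \left(\frac{1}{\alpha}\right)^{n} S_{G \setminus e}(\alpha x, -\alpha)\,\alpha x - \frac{1}{2}\left(\frac{1}{\alpha}\right)^{n-2} S_{G \setminus\!\setminus e}(\alpha x, -\alpha) \\
&= \left(\frac{1}{\alpha}\right)^{n} \left[S_{G \setminus e}(\alpha x, -\alpha)\,\alpha x - \frac{1}{2}\alpha^2 S_{G \setminus\!\setminus e}(\alpha x, -\alpha)\right] \\
&= \left(\frac{1}{\alpha}\right)^{n} \left[(\alpha x)\,S_{G \setminus e}(\alpha x, -\alpha) + S_{G \setminus\!\setminus e}(\alpha x, -\alpha)\right] = \left(\frac{1}{\alpha}\right)^{n} S_G(\alpha x, -\alpha).
\end{aligned}$$

(c) Here removing e causes no more isolated nodes. Nor does exploding e, though we should note that exploding one edge of the three-node triangle graph creates a double loop at the opposite vertex w. This is equivalent to isolating w, but the equivalence is already handled by our treatment of the base cases and by multiplicativity for disjoint components. So we can calculate the following, using $\alpha^2 = -2$:

$$\begin{aligned}
Q_G(x) &= Q_{G \setminus e}(x) - \frac{1}{2} Q_{G \setminus\!\setminus e}(x) \\
&= \left(\frac{1}{\alpha}\right)^{n} S_{G \setminus e}(\alpha x, -\alpha) - \frac{1}{2}\left(\frac{1}{\alpha}\right)^{n-2} S_{G \setminus\!\setminus e}(\alpha x, -\alpha) \\
&= \left(\frac{1}{\alpha}\right)^{n} \left[S_{G \setminus e}(\alpha x, -\alpha) - \frac{1}{2}\alpha^2 S_{G \setminus\!\setminus e}(\alpha x, -\alpha)\right] \\
&= \left(\frac{1}{\alpha}\right)^{n} \left[S_{G \setminus e}(\alpha x, -\alpha) + S_{G \setminus\!\setminus e}(\alpha x, -\alpha)\right] = \left(\frac{1}{\alpha}\right)^{n} S_G(\alpha x, -\alpha). \qquad \square
\end{aligned}$$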

6 Generalized Tutte(–Grothendieck) Invariants and $Q_G$

The paper [10], after discussing GTGIs, defines a generalized Tutte invariant (GTI) as a function from graphs to $\mathbb{C}$, but it is natural to allow complex polynomials. A set S of edges is a separator for a 2-polymatroid (G, f) if f(S) + f(E\S) = f(E). In terms of graphs G, S is a separator if and only if the set of vertices touched by S is disjoint from those covered by E\S.

Definition 4 ([10]) A GTI on 2-polymatroids is a function φ for which there exist $a, b, c, d, \mu, \nu, r, s, t \in \mathbb{C}[X]$ such that $\varphi(\mathrm{circle}) = s$, $\varphi(\mathrm{loop}) = t$, $\varphi(G_2) = r$, and for any G2PM (G, f) and edge e: if e is a separator then φ(G) = φ(G\(E\e)) φ(G\e); else:

$$\varphi(G) = \begin{cases}
a\,\varphi(G \setminus e) + b\,\varphi(G \setminus\!\setminus e), & \text{if } f(E \setminus e) = f(E) \text{ and } f(e) = 1; \\
c\,\varphi(G \setminus e) + d\,\varphi(G \setminus\!\setminus e), & \text{if } f(E \setminus e) = f(E) - 1 \text{ and } f(e) = 2; \\
\mu\,\varphi(G \setminus e) + \nu\,\varphi(G \setminus\!\setminus e), & \text{if } f(E \setminus e) = f(E) \text{ and } f(e) = 2.
\end{cases}$$

After citing a theorem classifying GTIs on G2PMs into two main families, the first one governed by $S_G$, we state and prove our second main theorem.

Theorem 2 ([10]) Let φ be a generalized Tutte invariant on graphic 2-polymatroids and suppose that at most two of r, s, t, a, b, c, μ, ν are zero. Then either:

1. $a = \mu$; $d = \nu$; $\mu r = \mu\nu + c^2$; $\nu s = \mu\nu + b^2$; $t = b + c$; $\mu \neq 0$; $\nu \neq 0$; and (∀G):

$$\varphi(G) = \mu^{|E| - f(E)/2}\, \nu^{f(E)/2}\, S_G\!\left(\frac{c}{(\mu\nu)^{1/2}},\ \frac{b}{(\mu\nu)^{1/2}}\right); \text{ or}$$

2. φ belongs to a second family whose details do not concern us here.

Theorem 3 On graphical 2-polymatroids, the polynomial $Q_G(x)$ is a generalized Tutte–Grothendieck invariant belonging to the first family with:

$$r = x^2 - \frac{1}{2},\quad s = -1,\quad t = x - 1,\quad a = \mu = 1,\quad b = -1,\quad c = x,\quad d = \nu = -\frac{1}{2}.$$

Proof We begin with the empty graph ∅, which is also a G2PM, giving $Q_\emptyset(x) = 1$. This is distinct from the graph $G_1$ consisting of a single isolated node, which is not a G2PM and which gives $Q_{G_1}(x) = x$. The circle is a base case giving $s = Q_{\mathrm{circle}}(x) = -1$. For the other two base cases of Definition 4 we have

$$\begin{aligned}
t &= Q_{\mathrm{loop}}(x) = Q_{G_1}(x) - Q_{\emptyset^-}(x) = x - 1, \\
r &= Q_{G_2}(x) = Q_{G_1^2}(x) - \frac{1}{2} Q_\emptyset(x) = x^2 - \frac{1}{2},
\end{aligned}$$

where $G_1^2$ is the graph of two isolated nodes. Now consider any G2PM G in which the edge e is a separator. If it has two nodes, then the graph G \ e in the recursion (13) of Definition 3 has two isolated nodes. As a G2PM in Definition 4, however, it refers to the G2PM obtained by deleting those two nodes, which we call G′. The graph G\\e obtained by exploding e is the same in both definitions and is the same as G′. Finally, the G2PM G\(E\e) obtained by deleting all other edges is the same as $G_2$. So, we get

$$Q_G(x) = Q_{G \setminus e}(x) - \frac{1}{2} Q_{G \setminus\!\setminus e}(x) = x^2 Q_{G'}(x) - \frac{1}{2} Q_{G'}(x) = \left(x^2 - \frac{1}{2}\right) Q_{G'}(x) = Q_{G_2}(x) \cdot Q_{G'}(x),$$

as required. If e is an isolated loop, then G′ is defined similarly, G\(E\e) is the loop graph, and the verification becomes

$$Q_G(x) = Q_{G \setminus e}(x) - Q_{G \setminus\!\setminus e}(x) = x\,Q_{G'}(x) - Q_{G'}(x) = (x - 1)\,Q_{G'}(x) = Q_{\mathrm{loop}}(x) \cdot Q_{G'}(x).$$

Now consider an edge e that is not a separator. Then it is either a non-isolated loop, a pendant edge, or an edge between nodes that have other edges. In the loop case, G \ e remains a G2PM, as is G\\e, and the recursion gives simply $Q_G(x) = Q_{G \setminus e}(x) - Q_{G \setminus\!\setminus e}(x)$, so a = 1 and b = −1. In the non-pendant-edge case, once again both G \ e and G\\e are G2PMs, and the recursion (13) gives μ = 1 and ν = −1/2. The pendant-edge case again involves the distinction between a graph and a G2PM. Define G′ to be the graph G \ e minus the resulting isolated vertex, which is what Definition 4 means by G\e as a G2PM. Thus, since f(e) = 2, we have

$$Q_G(x) = Q_{G \setminus e}(x) - \frac{1}{2} Q_{G \setminus\!\setminus e}(x) = x\,Q_{G'}(x) - \frac{1}{2} Q_{G \setminus\!\setminus e}(x),$$

which gives c = x and d = −1/2 as the coefficients in Definition 4. It is immediate that these coefficients obey the relations of the first family in Theorem 2. Our last need is to verify the equation that follows. On substituting, it becomes

$$Q_G(x) = \left(\frac{-1}{2}\right)^{f(E)/2} S_G\!\left(\frac{x}{\sqrt{-1/2}},\ \frac{-1}{\sqrt{-1/2}}\right). \tag{15}$$

Since G has no isolated nodes, the term “$(\alpha x)^k$” in (14) disappears. Once again using n to denote the number of nodes, so f(E) = n, the main point is that

$$\left(\frac{-1}{2}\right)^{n/2} = \left(\sqrt{\frac{-1}{2}}\right)^{n} = \left(\frac{1}{\sqrt{-2}}\right)^{n} = \left(\frac{1}{\alpha}\right)^{n}$$

with $\alpha = i\sqrt{2}$, and the rest of (15) similarly becomes $S_G(\alpha x, -\alpha)$. So (14) yields the relation to $S_G$ needed here. □

Corollary 1 The amplitude a(G) is a GTGI, with coefficients the same as above except r = 1/2, t = 0, and c = 1.

Proof The essence is that the property of being a GT(G)I, for polynomials, is closed under polynomial specialization.
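The claim that Theorem 3's coefficients obey the first-family relations of Theorem 2 can be spot-checked with exact rational arithmetic. This is a minimal sketch: the dictionary keys and function names are illustrative choices, and checking sample values of x is only evidence, not a symbolic proof.

```python
from fractions import Fraction

def theorem3_coefficients(x):
    """Coefficients stated in Theorem 3, evaluated at a rational point x."""
    half = Fraction(1, 2)
    return {
        "r": x * x - half, "s": Fraction(-1), "t": x - 1,
        "a": Fraction(1), "mu": Fraction(1),
        "b": Fraction(-1), "c": x,
        "d": -half, "nu": -half,
    }

def satisfies_first_family(co):
    """Relations listed in item 1 of Theorem 2."""
    return (
        co["a"] == co["mu"]
        and co["d"] == co["nu"]
        and co["mu"] * co["r"] == co["mu"] * co["nu"] + co["c"] ** 2
        and co["nu"] * co["s"] == co["mu"] * co["nu"] + co["b"] ** 2
        and co["t"] == co["b"] + co["c"]
        and co["mu"] != 0
        and co["nu"] != 0
    )

# Check a range of rational sample points, including x = 1 (the amplitude case)
samples = [Fraction(p, 3) for p in range(-6, 7)]
assert all(satisfies_first_family(theorem3_coefficients(x)) for x in samples)
```

At x = 1 the same function returns r = 1/2, t = 0, and c = 1, matching the values listed in Corollary 1.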

7 Conclusions

We have derived a new graph polynomial $Q_G(x)$ from considerations of quantum circuits and have established that it is a generalized Tutte–Grothendieck invariant while following a recursion with the unusual second coefficient $\frac{1}{2}$. It is distinguished by specializing the umbrella polynomial $S_G(x, y)$ for GTGIs with complex values. Indeed, the choices $y = \pm i\sqrt{2}$ are the only ones that make denominators vanish in Noble’s technique for proving that all real points $(x_0, y_0)$ off the hyperbola xy = 1 make $S_G(x_0, y_0)$ #P-hard to compute given G. The main resulting question is to classify (real) $x_0$ making $Q_G(x_0)$ hard or easy to compute. We started with $Q_G(1) = a(G)$ belonging to polynomial time. The next case is $x_0 = 0$. We need only to consider connected graphs G and find

$$Q(0) = \frac{1}{2^n} \sum_{\text{spanning } A \subseteq E} (-2)^{|A|}. \tag{16}$$
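For very small graphs, the sum in (16) can be evaluated by brute force. The sketch below assumes that "spanning" means the edge set A touches every node (an edge cover), which is an interpretation rather than something the text states; the function name and the edge-list representation are illustrative. It runs in time exponential in |E| and says nothing about the complexity question.

```python
from fractions import Fraction
from itertools import combinations

def q_at_zero(n, edges):
    """Evaluate (1/2^n) * sum of (-2)^|A| over edge subsets A that touch all n nodes.

    `edges` is a list of (u, v) pairs; repeated pairs model multi-edges and
    (u, u) models a loop. "Spanning" is ASSUMED to mean "covers every node".
    """
    total = 0
    for size in range(len(edges) + 1):
        # combinations works on positions, so parallel edges are counted separately
        for subset in combinations(edges, size):
            covered = {v for e in subset for v in e}
            if len(covered) == n:
                total += (-2) ** size
    return Fraction(total, 2 ** n)

# Values that can be compared with the closed forms given earlier, evaluated at x = 0
single_edge = q_at_zero(2, [(0, 1)])          # compare with x^2 - 1/2
double_edge = q_at_zero(2, [(0, 1), (0, 1)])  # compare with x^2
single_loop = q_at_zero(1, [(0, 0)])          # compare with x - 1
triangle = q_at_zero(3, [(0, 1), (1, 2), (0, 2)])
```

Under this reading, the single edge, the double edge, and the single loop reproduce the values of x² − 1/2, x², and x − 1 at x = 0.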


Is this alternating sum easy or hard to compute? We have not found it in the literature, nor worked it out from [6] or known (real) cases of $T_G(x, y)$ or $S_G(x, y)$. For sparse G, in particular, with |E| ≤ 2n − 2, the summation part of (16) remains bounded by a polynomial in n [4]. This is evidence against these cases being #P-hard, but it is still possible for the sum to be NP-hard, or randomized NP-hard by dint of being hard for the “parity-P” class ⊕P. Deeper possibilities are extending the techniques given here to more general quantum circuits, say Clifford + T. Are they related to k-polymatroids for some k > 2? Our “re-basing” of $S_G$ to $S'_G$ while embracing multiple kinds of isolated nodes may also simplify the GTGI theory to increase its appeal and leverage more deep connections made by the original Tutte polynomial theory.

References

1. Aaronson, S., Gottesman, D.: Improved simulation of stabilizer circuits. Phys. Rev. A 70, 052328 (2004)
2. Anders, S., Briegel, H.: Fast simulation of stabilizer circuits using a graph state representation. Phys. Rev. A 73, 022334 (2006)
3. Brylawski, T.: A decomposition for combinatorial geometries. Trans. Am. Math. Soc. 171, 235–282 (1972)
4. Chaki, R.: Restricted proof of the polynomial boundness of the sum over all 2−|A| , where A is an edge cover of a graph G (2020). arXiv:2001.0048
5. Crapo, H.: The Tutte polynomial. Aequationes Math. 3(3), 211–229 (1969)
6. Goldberg, L., Grohe, M., Jerrum, M., Thurley, M.: A complexity dichotomy for partition functions with mixed signs. SIAM J. Comput. 39, 3336–3402 (2010)
7. Gottesman, D.: The Heisenberg representation of quantum computers (1998). arXiv:quant-ph/9807006
8. Guan, C., Regan, K.: Stabilizer circuits, quadratic forms, and computing matrix rank (2019). arXiv:1904.00101
9. Noble, S.: Evaluating the rank generating function of a graphic 2-polymatroid. Comb. Probab. Comput. 15, 449–461 (2006)
10. Oxley, J., Whittle, G.: A characterization of Tutte invariants of 2-polymatroids. J. Comb. Theory Ser. B 59, 210–244 (1993)
11. Oxley, J., Welsh, D.: The Tutte polynomial and percolation. Graph Theory and Related Topics, pp. 329–339. Academic Press, New York (1979)

Cryptanalysis of a Centralized Location-Sharing Scheme for Mobile Online Social Networks

Munmun Bhattacharya, Sandip Roy, Soumya Banerjee, and Samiran Chattopadhyay

Abstract In the recent past, owing to the extensive development of mobile Internet and GPS technology, mobile online social networks (mOSNs) have gained more popularity than traditional online social networks (OSNs). mOSNs support various day-to-day online social network operations such as establishing friend relationships, providing location-based services, and sharing locations among friends. Very recently, in 2018, Xiao et al. proposed a centralized location-sharing scheme in which the social network server and the location-based server are integrated into a single entity (Future Generation Computer Systems). In this paper, we show that although the scheme of Xi Xiao et al. is efficient and incurs lower communication and storage costs than existing schemes, it has several security weaknesses. For example, it cannot resist man-in-the-middle and replay attacks. Moreover, owing to an incorrect strategy in the location updates phase, the user suffers from a denial-of-service attack in the querying friends’ locations phase. The cryptanalysis of the scheme of Xi Xiao et al. shows that it is not suitable for practical applications. We verify the attacks on the protocol using the widely accepted ProVerif and AVISPA simulation tools. Finally, we hint at some possible improvements that their scheme can adopt to make it more secure against various known attacks.

Keywords Mobile online social networks · Location sharing · Friends’ location query · Cryptanalysis · Security attack · ProVerif simulation · AVISPA simulation

M. Bhattacharya · S. Banerjee · S. Chattopadhyay
Department of Information Technology, Jadavpur University, Salt Lake City 700 098, Kolkata, India

S. Roy
Department of Computer Science and Engineering, Asansol Engineering College, Asansol 713305, WB, India

© Springer Nature Singapore Pte Ltd. 2021. R. Chaki et al. (eds.), Advanced Computing and Systems for Security, Advances in Intelligent Systems and Computing 1178, https://doi.org/10.1007/978-981-15-5747-7_2


1 Introduction

In the earlier days of traditional online social networks, users had to share their current location through a “check-in” process, in which they explicitly registered their current location and time with the corresponding OSN website or app. Mobile users can now avoid that inefficient check-in process, since GPS technology lets a mobile device determine its current geo-location. Hence, in mOSNs, users can report, update, and share their current locations with other users. Although mOSNs provide many conveniences, users might not want to use an application that cannot guarantee the security and privacy of their location and social information. Given the popularity of mOSNs and users’ concern for the security of location-related information, serious research effort is needed to limit the privacy threats faced by today’s mOSN users without sacrificing functionality. A lack of security measures can disclose both the physical location and the social information (e.g., friend relations, home address, hobbies, current health status) of a registered user to a malicious attacker. Leakage and intentional abuse of such sensitive data can be a serious issue for mOSN users.

In the field of mOSNs, privacy and security issues have drawn a great deal of research attention, and in the recent past a number of privacy-preserving schemes for online social networks have been proposed, each with its own merits and limitations. Earlier research aimed at information privacy [2] and the protection of location privacy [3]. To achieve user location anonymity, the location server must be prevented from learning the actual user location, especially while location-based services are being provided. Recent well-known solutions include K-anonymity [4], pseudonym methods [5], m-unobservability [6], and location anonymity [7].

Location sharing with privacy protection in online social networks was first addressed in 2007 by SmokeScreen [8], which allows locations to be shared between social friends and strangers. Wei et al. enhanced this scheme and proposed MobiShare, in which users’ social and location information are stored separately in a social network server (SNS) and a location-based server (LBS), respectively [9]. MobiShare suffers from the weakness that, in the query phase, the LBS can reveal the topology of a user’s social network. Recently, Li et al. [10] enhanced MobiShare to propose a new privacy-protected location-sharing scheme for mOSNs, namely MobiShare+. This scheme introduces dummy queries and private set intersection to prevent the LBS from learning a user’s social information. N-MobiShare [12] is another improvement over MobiShare, in which the cellular tower (CT) is replaced by the SNS. However, N-MobiShare suffers from the drawback that the LBS can obtain users’ social network topology. BMobiShare improves on MobiShare+ in terms of transmission efficiency by replacing the existing private set intersection method with a Bloom filter [11]. However, the computation cost of BMobiShare is quite high.


To address these issues, very recently, in 2018, Xiao et al. proposed a centralized location-sharing scheme for mOSNs, CenLocShare [1]. Here, the SNS and the LBS are merged into a single server. This scheme reduces communication and storage costs and also increases the user’s privacy protection. In this paper, we carry out a detailed cryptanalysis of the scheme of Xi Xiao et al. and show that it is vulnerable to various active and passive security attacks: (1) the registration and location updates phases suffer from a man-in-the-middle attack, (2) the scheme is weak against replay and strong replay attacks, (3) the location updates phase has security pitfalls that expose the querying friends’ locations phase to a denial-of-service attack, and (4) as a whole, the scheme lacks secure login, authentication, and key-establishment phases.

The remainder of this paper is organized as follows. In Sect. 2, we explain the system architecture and threat model adopted by the scheme of Xi Xiao et al. We review all the phases of the scheme in Sect. 3. In Sect. 4, we perform a detailed cryptanalysis of the scheme and identify its security pitfalls. In Sect. 5, we provide the simulation results of the security attacks on the scheme using both the ProVerif and AVISPA simulation tools. In Sect. 6, as a future research direction, we suggest some remedial improvements to make the scheme more secure against those weaknesses.

2 Adversary Model and System Model

In this section, we briefly explain the adversary (attack) model of the scheme of Xi Xiao et al. and outline the system model adopted for the same scheme.

2.1 Threat Model

The basic threat model is defined with respect to the mobile user (U) and the location-sharing social network server (LSSNS), and assumes that the cellular tower (CT) is a fully trusted body. The model is defined below:

1. A malicious user intends to read, intercept, block, and modify wireless messages communicated among the entities of the system. The adversary can execute all attacks applicable under the classical, widely accepted Dolev–Yao threat model (DY model) [13].
2. An authorized user of the system can turn malicious, in the sense that it can register with one LSSNS and then try to access or reveal the social or location information of other genuine users illegally.
3. LSSNS may collude with other authorized users in order to reveal the complete social information or location information topology of a user U. In short, LSSNS is “honest but curious.”
4. CT is assumed to be a trusted entity.

2.2 System Model

The architecture of the system model is outlined in Fig. 1. The model assumes three basic entities, defined below:

1. Mobile user (U): U asks LSSNS to update U’s own location, requests friends’ location information, and intends to share his/her own location with friends.
2. Location-sharing social network server (LSSNS): This server stores, updates, and shares U’s social and location information. Moreover, at U’s request, LSSNS provides the location information of U’s friends and of strangers.
3. Cellular tower (CT): Communication between U and LSSNS passes through the trusted entity CT.

In practice, a location-sharing scheme in an online social network faces several challenges. First, updating U’s location, sharing friends’ locations with U, and reporting U’s location at a friend’s request must be privacy preserving. The social information, ID, and other privacy-sensitive information of the authorized user U must not be disclosed to adversaries. Second, in order to provide location secrecy for U, LSSNS stores multiple dummy (fake) identities of U along with its real identity. Finally, friends’ and strangers’ location queries are processed according to the physical distance threshold between U and its friends or strangers, respectively, as set during the registration process.

Fig. 1 Architecture for centralized location sharing in mOSN. 1: U sends a registration request message to CT; 2: CT forwards the message to LSSNS; 3: LSSNS returns a response to CT; 4: CT forwards the response to U

3 Review of Xi Xiao et al.’s Centralized Location-Sharing Scheme for mOSN

In this section, we briefly review the scheme of Xi Xiao et al. [1]. The scheme comprises four phases, namely, (a) registration phase, (b) location updates phase, (c) querying friends’ locations phase, and (d) querying strangers’ locations phase. For simplicity, we discuss only the first three phases. The notations used in the scheme are listed in Table 1.

3.1 Registration Phase

In this phase, through CT, U shares his/her location-sharing preferences with LSSNS.

1. U sends a record in the form of {ID_U, ‘reg’, df_U, ds_U} to CT, where ‘reg’ denotes the registration process and df_U has the meaning given in Table 1.
2. After receiving U’s message, CT forwards {ID_U, ‘reg’, df_U, ds_U} to LSSNS.
3. LSSNS receives this registration query and keeps a record {ID_U, df_U, ds_U} in the database. Then LSSNS sends a reply (ID_U, ‘ok’) to CT.
4. Finally, CT forwards LSSNS’s message (‘ok’) to U to indicate a successful registration. Consequently, the secure communication link is established between U and LSSNS.

Table 1 Summary of symbols and notations

Symbol              Description
ID_U                Identity of user U
(x_U, y_U)          Actual location coordinate of U
df_U                Distance threshold of U’s (friend-case)
qf_U                U’s distance threshold in friends’ location query
Sess_U              Symmetric key shared between U and his/her friends
Decrypt_Key()       A decryption function using secret key Key
FS                  An identity set of U’s friends
Key_CT              The secret key of cellular tower
Tag_U               An encrypted string indicating an index of the real location
Key_LS              LSSNS’s secret key, shared with cellular towers
dist(., .)          A function to compute Euclidean distance
min(., .)           A function to return the minimum of two values
A                   An adversary or malicious user

3.2 Location Updates Phase

This phase updates the current location record in LSSNS through CT.

1. To update the current location, U uploads a record in the form of (ID_U, (x, y), Sess_U(x, y)) to CT, where (x, y) and Sess_U(x, y) have the meanings given in Table 1.
2. CT randomly generates k − 1 dummy locations (x_i, y_i), i = 1, …, k − 1, and k − 1 random strings str_i, i = 1, …, k − 1, to imitate the encrypted locations. To anonymize the location, one real location and k − 1 dummy locations are sent to LSSNS in the form of {ID_U, (x_1, y_1, str_1), …, (x_k, y_k, str_k), Tag_U}, where the real location (x_U, y_U, str_U) is randomly put at the nth place (1 ≤ n ≤ k), and Tag_U is a string encrypted with CT’s secret key Key_CT that indicates the index of the real location.
3. When LSSNS receives the update record, it stores the record in the database and sends a success message (ID_U, ‘ok’) to CT.
4. After receiving the message from LSSNS, CT directly forwards ‘ok’ to U.
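Step 2 of this phase can be sketched as follows. This is only an illustration under stated assumptions: the reviewed text does not specify the cipher behind Tag_U or how dummy coordinates are drawn, so `toy_tag`/`toy_untag` (an XOR with a hash-derived pad, not secure) and the coordinate range are hypothetical stand-ins, and the index is 0-based here.

```python
import hashlib
import secrets

def toy_tag(key_ct: bytes, index: int, nonce: bytes) -> bytes:
    """Toy stand-in for Tag_U: hides a small index (< 256) under CT's key. NOT secure."""
    pad = hashlib.sha256(key_ct + nonce).digest()[0]
    return nonce + bytes([index ^ pad])

def toy_untag(key_ct: bytes, tag: bytes) -> int:
    """Inverse of toy_tag: recover the index of the real location."""
    nonce, body = tag[:-1], tag[-1]
    pad = hashlib.sha256(key_ct + nonce).digest()[0]
    return body ^ pad

def anonymize_update(user_id, real_loc, enc_loc: bytes, k: int, key_ct: bytes):
    """Build the record CT forwards to LSSNS: k entries, exactly one of them real."""
    n = secrets.randbelow(k)  # random position of the real location (0-based)
    entries = []
    for i in range(k):
        if i == n:
            entries.append((real_loc[0], real_loc[1], enc_loc))
        else:
            # Dummy coordinates and a random string imitating an encrypted location
            dummy_x, dummy_y = secrets.randbelow(10_000), secrets.randbelow(10_000)
            entries.append((dummy_x, dummy_y, secrets.token_bytes(len(enc_loc))))
    tag = toy_tag(key_ct, n, secrets.token_bytes(8))
    return (user_id, entries, tag)

record = anonymize_update("U", (3, 4), b"cipher", k=5, key_ct=b"ct-secret")
real_index = toy_untag(b"ct-secret", record[2])
```

Only a holder of the CT key can map the tag back to the real entry; LSSNS sees k candidate locations that look alike.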

3.3 Querying Friends’ Locations Phase

This phase describes the friends’ locations query sent by user U.

1. To query friends’ locations within a certain range, U submits a query (ID_U, ‘f’, qf_U) to CT, where qf_U has the meaning given in Table 1.
2. After receiving U’s query, CT appends a sequence number seq and forwards the query (ID_U, ‘f’, qf_U, Key_LS(seq)) to LSSNS. seq is used to resist the replay attack and the tampering attack. Key_LS is LSSNS’s secret key, shared with the CT.
3. Upon receiving the query, LSSNS finds the set FS of identifiers of U’s friends. For each entry (ID, df, ds, (x_1, y_1, str_1), (x_2, y_2, str_2), …, (x_k, y_k, str_k), Tag) stored in the database, excluding (ID_U, df_U, ds_U, (x_U1, y_U1, str_1), (x_U2, y_U2, str_2), …, (x_Uk, y_Uk, str_k), Tag_U), LSSNS checks whether dist((x_i, y_i), (x_Ut, y_Ut)) ≤ min(qf_U, df_s) for s ∈ FS, i = 1, …, k, t = 1, …, k, where (x_Ut, y_Ut), t = 1, …, k, are the corresponding k locations of U. For all F ∈ FS, if one or more (x_Fi, y_Fi) satisfy the distance requirement, the corresponding record in the form of (F, (i, str_i), Tag_F) is inserted into the result set. The result set is divided into k subsets Fq_i, i = 1, …, k, according to the corresponding center point (x_Ui, y_Ui), i = 1, …, k. Finally, LSSNS replies Key_LS({Fq_i}_{i=1,…,k}, Tag_U, seq) to CT.
4. CT receives the reply, decrypts it with Key_LS, checks the validity of seq, and finds U’s real location with the real location index r = Decrypt_KeyCT(Tag_U), where Decrypt_KeyCT() is a decryption function using the secret key Key_CT. According to the real center point, CT chooses Fq_r and discards the remaining sets Fq_i, i ≠ r. Finally, Ans is a set in the form of ({ID_i, Sess_U(x_i, y_i)}_{i=1,…,k′}), and CT sends (Ans) to U.
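The distance test that LSSNS applies in step 3 can be sketched as below. The data layout (a dictionary of friends with their thresholds and k candidate locations) and the function name are assumptions made for illustration; the friend tags and the encryption of the reply are omitted.

```python
import math

def friends_in_range(user_locs, qf_u, friends):
    """Return k subsets Fq_t, one per candidate location of U.

    user_locs: list of U's k candidate locations [(x, y), ...]
    qf_u:      U's query threshold
    friends:   hypothetical layout {friend_id: (df, [(x, y, enc_str), ...])}
    """
    subsets = [[] for _ in user_locs]
    for t, center in enumerate(user_locs):
        for friend_id, (df, locs) in friends.items():
            limit = min(qf_u, df)  # both thresholds must be respected
            for i, (x, y, enc_str) in enumerate(locs):
                if math.dist((x, y), center) <= limit:
                    subsets[t].append((friend_id, i, enc_str))
    return subsets

# U has two candidate locations; friend F has two candidate locations
example = friends_in_range(
    user_locs=[(0, 0), (100, 100)],
    qf_u=10,
    friends={"F": (5, [(3, 4, "s0"), (50, 50, "s1")])},
)
```

Because LSSNS cannot tell which of U's k locations is real, it must compute all k subsets; CT later keeps only the one indexed by the decrypted Tag_U.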

4 Cryptanalysis of Xi Xiao et al.’s Scheme

In this section, we investigate the security of the scheme of Xi Xiao et al. and find that it is vulnerable to the following attacks.

4.1 Vulnerability to Man-in-the-Middle Attack

In this subsection, we show that the scheme of Xi Xiao et al. is vulnerable to a man-in-the-middle attack. This attack enables an adversary A to alter one or more messages communicated between the sender and the receiver while keeping both parties completely unaware of the modification. In the registration phase (Sect. 3.1), U sends {ID_U, ‘reg’, df_U, ds_U} to CT, where df_U is the distance threshold within which U allows friends to find him/her, and ds_U is the distance threshold for strangers’ location queries. On receiving the message from U, CT forwards the same message to LSSNS. Both messages are communicated through an insecure public channel. Unfortunately, no user authentication is done on the server side. This gives an adversary A the opportunity to launch a man-in-the-middle attack. To mount the attack, A executes the following steps:

1. A eavesdrops on the registration message {ID_U, ‘reg’, df_U, ds_U} forwarded by CT to LSSNS.
2. A modifies the distance thresholds df_U and ds_U into df_U* and ds_U*, respectively. A also saves the user identity ID_U in its own database.
3. A sends the modified message {ID_U, ‘reg’, df_U*, ds_U*} to LSSNS.
4. LSSNS receives the message and keeps a malicious record {ID_U, df_U*, ds_U*} in its database.
5. A generates an ‘ok’ and sends (ID_U, ‘ok’) to CT.
6. CT forwards ‘ok’ to user U.

Here, the adversary A tricks LSSNS into storing the wrong distance thresholds df_U* and ds_U* for user ID_U. Further, A sends an (ID_U, ‘ok’) message to CT, which forwards it to user U. U receives this ‘ok’ message and remains completely unaware that the adversary has modified his/her threshold parameters. As a result, A successfully executes the man-in-the-middle attack.
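The steps above can be replayed in a few lines of code. This is a toy model, assuming messages are plain tuples and the server database is a dictionary; the function names are illustrative and no real network or cryptography is involved.

```python
def lssns_register(db, message):
    """LSSNS side of the registration phase: stores whatever arrives."""
    user_id, kind, df, ds = message
    if kind == "reg":
        db[user_id] = (df, ds)  # no check of origin or integrity
        return (user_id, "ok")
    return (user_id, "error")

def adversary_rewrite(message, fake_df, fake_ds):
    """Adversary A on the CT-to-LSSNS link: swaps in its own thresholds."""
    user_id, kind, _df, _ds = message
    return (user_id, kind, fake_df, fake_ds)

database = {}
sent_by_user = ("U", "reg", 500, 200)                      # thresholds U intended
on_the_wire = adversary_rewrite(sent_by_user, 50_000, 50_000)
reply = lssns_register(database, on_the_wire)
# reply is ("U", "ok") although database["U"] now holds the adversary's values
```

The reply that reaches U is indistinguishable from an honest acknowledgement, which is exactly why neither end notices the change.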

4.2 Vulnerability to Replay Attack

In this subsection, we show that the scheme of Xi Xiao et al. is vulnerable to a replay attack. In this attack, an adversary A intercepts a message msg, saves it, and retransmits it to the receiver later. The receiver is tricked into treating the replayed message as a current one. In the registration phase (Sect. 3.1), U sends {ID_U, ‘reg’, df_U, ds_U} to CT, where df_U and ds_U are the “current” distance thresholds for friends’ and strangers’ location queries. CT forwards the same message to LSSNS through an insecure public channel. Note that LSSNS does not verify whether the registration request is a current message or an old replayed one. This gives A the opportunity to launch a replay attack as follows:

1. A intercepts, holds, and saves the registration message {ID_U, ‘reg’, df_U, ds_U} forwarded by CT at time T.
2. At time T_1, A replays the same message {ID_U, ‘reg’, df_U, ds_U} to LSSNS. The transmission delay |T_1 − T| might be as small as a few seconds or as large as a few months.
3. LSSNS receives and saves the replayed message {ID_U, ‘reg’, df_U, ds_U} in its database. Note that df_U and ds_U are old distance thresholds of U and are now obsolete.
4. A generates an ‘ok’ and sends (ID_U, ‘ok’) to CT.
5. CT forwards ‘ok’ to user U.

Here, the server is tricked into saving the old and obsolete distance thresholds df_U and ds_U for user ID_U. As a result, the scheme is weak against replay attacks. A is able to execute the replay because (a) the scheme does not use a current timestamp to check the communication delay and (b) the scheme does not use a random nonce and a cryptographic hash function to check message integrity. To protect messages strongly against replay, U, CT, and LSSNS need to verify both the timestamp and the random nonce embedded in the registration request message.
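A timestamp-plus-nonce check of the kind suggested above could look like the following sketch. It is not the authors' proposed fix: the pre-shared key, the use of HMAC-SHA256 as the keyed hash, the 30-second window, and the `|`-separated message layout are all assumptions made for illustration (and the reviewed scheme has no key-establishment phase that would provide such a key).

```python
import hashlib
import hmac
import secrets
import time

MAX_SKEW = 30.0  # assumed acceptance window in seconds

def make_request(key, user_id, df, ds, now=None):
    """Sender side: bind the thresholds to a timestamp and a fresh nonce."""
    ts = time.time() if now is None else now
    nonce = secrets.token_hex(16)
    body = f"{user_id}|reg|{df}|{ds}|{ts}|{nonce}".encode()
    mac = hmac.new(key, body, hashlib.sha256).hexdigest()
    return body, mac

def accept_request(key, body, mac, seen_nonces, now=None):
    """Receiver side: reject modified, stale, or already-seen requests."""
    now = time.time() if now is None else now
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, mac):
        return False                      # integrity check failed
    fields = body.decode().split("|")
    ts, nonce = float(fields[4]), fields[5]
    if abs(now - ts) > MAX_SKEW:
        return False                      # too old (or from the future)
    if nonce in seen_nonces:
        return False                      # exact replay inside the window
    seen_nonces.add(nonce)
    return True

shared_key = b"assumed pre-shared key"
seen = set()
body, mac = make_request(shared_key, "U", 500, 200)
first_try = accept_request(shared_key, body, mac, seen)    # fresh request
second_try = accept_request(shared_key, body, mac, seen)   # same bytes replayed
```

The timestamp bounds how long a captured message stays useful, and the nonce set catches replays inside that window; the MAC also blocks the threshold rewriting of Sect. 4.1.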


4.3 Security Weakness Against Denial-of-Service Attack Due to Inefficiency in Location Updates Phase

In the protocol of Xi Xiao et al., an attacker A can launch a kind of denial-of-service attack. Because of this attack, U might not be able to find his/her social friends in the querying friends’ locations phase. The attack proceeds as follows:

1. The adversary A intercepts the location update message Msg = {ID_U, (x_1, y_1, str_1), …, (x_k, y_k, str_k), Tag_U} sent by CT to LSSNS.
2. A modifies the parameter Tag_U into Tag_U* and sends the modified message Msg′ = {ID_U, (x_1, y_1, str_1), …, (x_k, y_k, str_k), Tag_U*} to LSSNS.
3. LSSNS saves Msg′ containing Tag_U* in its database.

Now, the legal user U fails to find his/her social friends within the distance threshold. The details are given below:

1. In step 3 of the querying friends’ locations phase (Sect. 3.3), LSSNS replies Key_LS({Fq_i}_{i=1,…,k}, Tag_U*, seq) to CT.
2. In step 4 of the querying friends’ locations phase (Sect. 3.3), CT finds a fake location of U with the incorrect location index r* = Decrypt_KeyCT(Tag_U*).
3. In the same step (Sect. 3.3), CT chooses the wrong set Fq_r* and discards the remaining sets Fq_i, i ≠ r*.
4. Finally, U receives an incorrect set of friends or no friends at all.

Hence, even though CT submits a valid encrypted location parameter Tag_U, user U cannot obtain his/her real set of friends within the stipulated distance from the server. The adversary can successfully execute this denial-of-service attack because the server does not verify the authenticity of the parameter Tag_U in the location updates phase.
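The effect of a modified tag on CT's final selection step can be illustrated with a toy model. The cipher here is a hypothetical single-byte XOR stand-in (not the scheme's actual encryption, which the review does not specify), indices are 0-based, and the subset contents are made up.

```python
import hashlib

def toy_encrypt_index(key_ct: bytes, index: int) -> int:
    """Toy stand-in for Tag_U: XOR the index with a key-derived byte. NOT secure."""
    return index ^ hashlib.sha256(key_ct).digest()[0]

def toy_decrypt_index(key_ct: bytes, tag: int) -> int:
    """Inverse of toy_encrypt_index."""
    return tag ^ hashlib.sha256(key_ct).digest()[0]

def ct_select(key_ct: bytes, subsets, tag: int):
    """CT's step 4: keep only the subset indexed by the decrypted tag."""
    r = toy_decrypt_index(key_ct, tag)
    return subsets[r] if 0 <= r < len(subsets) else []

key_ct = b"ct-secret"
subsets = [["alice"], ["bob"], []]        # Fq_0 is the one for U's real location
honest_tag = toy_encrypt_index(key_ct, 0)
tampered_tag = honest_tag ^ 1             # adversary flips one bit in transit

honest_answer = ct_select(key_ct, subsets, honest_tag)      # the real friend set
tampered_answer = ct_select(key_ct, subsets, tampered_tag)  # a wrong friend set
```

The adversary never needs the key: any change to the stored tag makes CT decrypt a different index, so U is served a subset computed around a dummy location, or nothing at all.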

5 Security Verification Using ProVerif and AVISPA Simulation Tools

In this section, we verify the security attacks on the scheme of Xi Xiao et al. using the widely accepted ProVerif and AVISPA simulation tools [16, 17].

5.1 Simulation of Security Attack Using ProVerif

ProVerif is a software tool based on the applied pi calculus for automated reasoning about the security properties of cryptographic protocols [16]. In practice, it can be used to test whether an attacker is able to attack (or compromise) the messages communicated over public channels in a security protocol.


(* -------------- channels -------------- *)
free pch: channel. (* public channel *)

(* -------------- constants -------------- *)
free IDu:bitstring [private].
free reg:bitstring [private].
free dfu:bitstring [private].
free dsu:bitstring [private].

(* -------------- aims for verification -------------- *)
query attacker(dfu).
query attacker(dsu).
query id:bitstring; inj-event(UserAuth(id)) ==> inj-event(UserStart(id)).

(* -------------- event -------------- *)
event UserStart(bitstring). (* User starts authentication *)
event UserAuth(bitstring). (* User is authenticated *)

Fig. 2 Declaration of channels, constants, aims, and events

let User =
  !
  (
    event UserStart(IDu);
    out(pch,(IDu,reg,dfu,dsu));
    in(pch,(rIDu:bitstring,rMsg1:bitstring));
    0
  ).

let SAuth =
  in(pch,(xIDu:bitstring, xreg:bitstring, xdfu:bitstring, xdsu:bitstring));
  event UserAuth(xIDu);
  new msg1:bitstring;
  out(pch,(xIDu, msg1)).

let S = SAuth.

process !User | !S

Fig. 3 ProVerif code for the process of user U and server LSSNS

In Fig. 2, we provide the code for the declaration of channels, free variables, constants, aims, queries, and events required for the scheme of Xi Xiao et al. The processes of user U and LSSNS in the distance registration phase (Sect. 3.1) are modeled as a parallel composition in Fig. 3. The aim for verification is the possibility of a man-in-the-middle attack on the distance threshold parameters dfu and dsu. Figure 4 shows the simulation result of ProVerif. It reports “goal reachable: attacker(dfu[])” and “goal reachable: attacker(dsu[])” with “RESULT not attacker(dfu[]) is false” and “RESULT not attacker(dsu[]) is false.” Hence, the simulation result shows that both dfu and dsu are vulnerable to a security attack by adversary A.


-- Query not attacker(dfu[])
Completing...
Starting query not attacker(dfu[])
goal reachable: attacker(dfu[])
RESULT not attacker(dfu[]) is false.
-- Query not attacker(dsu[])
Completing...
Starting query not attacker(dsu[])
goal reachable: attacker(dsu[])
RESULT not attacker(dsu[]) is false.
-- Query inj-event(UserAuth(id)) ==> inj-event(UserStart(id))
Completing...
Starting query inj-event(UserAuth(id)) ==> inj-event(UserStart(id))
goal reachable: attacker(id_590) -> end(endsid_591,UserAuth(id_590))
RESULT inj-event(UserAuth(id)) ==> inj-event(UserStart(id)) is false.
RESULT (even event(UserAuth(id_592)) ==> event(UserStart(id_592)) is false.)

Fig. 4 Analysis of the attack simulation results

5.2 Simulation of Security Attack Using AVISPA

In this subsection, we simulate the security attack on the scheme of Xi Xiao et al. using the broadly accepted AVISPA tool [17]. We implement the scheme in the high-level protocol specification language (HLPSL) [18] and use the simulation results to check whether the scheme is secure against replay and man-in-the-middle attacks. We have chosen the broadly used on-the-fly model-checker (OFMC) [19] and constraint logic-based attack searcher (CL-AtSe) backends for the execution test in order to find whether there are any attacks on the scheme. For the replay attack check, the backends first check whether the legitimate agents can execute the specified protocol in the presence of a passive intruder. After that, the backends supply the intruder with the knowledge of some normal sessions between the legitimate agents. For the Dolev–Yao model check, the backends verify whether any man-in-the-middle attack by the intruder is possible. All public parameters are known to the intruder. We have simulated the scheme using SPAN, the Security Protocol ANimator for AVISPA [20], for both the OFMC and CL-AtSe backends. The simulation results for both backends are shown in Fig. 5. AVISPA finds an attack on the scheme of Xi Xiao et al., so the scheme is unsafe against replay and man-in-the-middle attacks.


% OFMC
% Version of 2006/02/13
SUMMARY
  UNSAFE
DETAILS
  ATTACK_FOUND
PROTOCOL
  C:\progra~1\SPAN\testsuite\results\sr_test.if
GOAL
  secrecy_of_sID
BACKEND
  OFMC
COMMENTS
STATISTICS
  parseTime: 0.00s
  searchTime: 0.03s
  visitedNodes: 0 nodes
  depth: 0 plies


% OFMC
% Version of 2006/02/13
SUMMARY
  UNSAFE
DETAILS
  ATTACK_FOUND
  TYPED_MODEL
PROTOCOL
  C:\progra~1\SPAN\testsuite\results\sr_test.if
GOAL
  Secrecy attack on (n7(IDu))
BACKEND
  CL-AtSe
STATISTICS
  Analysed : 0 states
  Reachable : 0 states
  Translation: 0.02 seconds
  Computation: 0.00 seconds

Fig. 5 AVISPA simulation of attack using OFMC and CL-AtSe backends

6 Discussion and Future Work

In Sect. 4, we performed a detailed cryptanalysis of the centralized location-sharing scheme of Xi Xiao et al. for mobile online social networks. The cryptanalysis shows that their scheme is impractical, inefficient, and insecure, because it has several security weaknesses, especially in the registration phase and the location updates phase. The scheme is vulnerable to man-in-the-middle attacks, replay attack, strong replay attack, denial-of-service attack, etc.

First, a man-in-the-middle attack can be avoided if message integrity is checked on the recipient side. This can be done with the use of a cryptographic hash function and a random nonce. The receiver should compute a hash value from the received parameters and compare it with the received hash value. Any mismatch should lead to rejection of the received message [14].

Second, in the registration phase and the location updates phase, the use of a timestamp and a random nonce can prevent an adversary from executing a replay attack. The receiver needs to determine the communication delay by comparing the current timestamp with the received timestamp. Moreover, the use of a random nonce along with the timestamp in a message can detect a strong replay attack [15].

Third, as already discussed, an adversary can launch a DoS attack that prevents a genuine user from accessing the location-based services of the system. In the location update phase, LSSNS should verify the validity and authenticity of the actual encrypted location sent by CT. A wrong index value for the encrypted location would prevent user U from accessing his/her actual friends’ location information in the future.

Finally, our general observation is that the scheme should go through a secure login, authentication, and key-establishment process before any real data communication. Session keys should be established in a mutually shared way by the communicating parties. Using such a key, relevant messages should be encrypted through symmetric key encryption in order to protect them from malicious users [14, 15]. In the future, we aim to propose a security-enhanced scheme that removes the security pitfalls present in this scheme and resists the aforementioned security attacks and threats.

References

1. Xiao, X., Chen, C., Sangaiah, A.K., Huc, G., Ye, R., Jiang, Y.: A centralized privacy-preserving location-sharing system for mobile online social networks. Future Generation Computer Systems 86(1), 863–872 (2018)
2. Jiang, R., Lu, R., Choo, K.: Achieving high performance and privacy-preserving query over encrypted multidimensional big metering data. Future Generation Computer Systems 78(1), 392–401 (2018)
3. Ju, X., Shin, K.: Location privacy protection for smartphone users using quadtree entropy maps. Journal of Information Privacy and Security 11(2), 62–79 (2015)
4. Sweeney, L.: k-anonymity: A model for protecting privacy. IEEE Security and Privacy Magazine 10(5), 1–14 (2002)
5. Ouyang Y., Le Z., Xu Y., Triandopoulos N., Zhang S., Ford J., MakeDon F., Providing Anonymity in Wireless Sensor Networks, in: IEEE International Conference on Pervasive Services, ICPS, pp. 145–148 (2007)
6. Chen Z., Hu X., Ju X., Ju X., Shin K., LISA: Location information scrambler for privacy protection on smartphone, in: IEEE Communications and Network Security, CNS, pp. 296–304 (2013)
7. Rass S., Wigoutschnigg R., Schartner P., Doubly-anonymous crowds: Using secret-sharing to achieve sender- and receiver-anonymity, J. Wirel. Mob. Netw., Ubiquitous Comput., Dependable Appl., 2(4), 27–41 (2011)
8. Cox, L.P., Dalton, A., Marupadi, V.: Smokescreen: Flexible privacy controls for presence-sharing. In: Proceedings of the 5th International Conference on Mobile Systems, Applications and Services, pp. 233–245. ACM (2007)
9. Wei W., Xu F., Li Q., MobiShare: Flexible privacy-preserving location sharing in mobile online social networks, IEEE INFOCOM, pp. 2616–2620 (2012)
10. Li J. W., Li J., Chen X. F., Liu Z. L., Jia C. F., MobiShare+: Security improved system for location sharing in mobile online social networks. Journal of Internet Services Information Security (JISIS), 4(1), 25–36 (2014)
11. Shen, N., Yang, J., Yuan, K., Fu, C., Jia, C.: An efficient and privacy-preserving location sharing mechanism. Computer Standards & Interfaces 44(1), 102–109 (2016)
12. Liu, Z., Luo, D., Li, J., Jin, L., Chen, X., Jia, C.: N-Mobishare: new privacy-preserving location-sharing system for mobile online social networks. International Journal of Computer Mathematics 93(2), 384–400 (2016)
13. Dolev, D., Yao, A.: On the security of public key protocols. IEEE Transactions on Information Theory 29(2), 198–208 (1983)
14. Roy S., Chatterjee S., Das A. K., Chattopadhyay S., Kumari S., Jo. M., Chaotic Map-based Anonymous User Authentication Scheme with User Biometrics and Fuzzy Extractor for Crowdsourcing Internet of Things, IEEE Internet of Things Journal, 5(4), 2884–2895 (2018)
15. Roy S., Chatterjee S., Das A. K., Chattopadhyay S., Kumar, Vasilakos A. V., On the Design of Provably Secure Lightweight Remote User Authentication Scheme for Mobile Cloud Computing Services, IEEE Access, 5(1), 25808–25825 (2017)
16. Abadi M., Blanchet B., and Comon-Lundh H., Models and Proofs of Protocol Security: A Progress Report. In 21st International Conference on Computer Aided Verification (CAV’09), pp. 35–49, Grenoble, France (2009)
17. AVISPA, “Automated Validation of Internet Security Protocols and Applications,” http://www.avispa-project.org/. Accessed on November 2019
18. von Oheimb, D.: The high-level protocol specification language hlpsl developed in the eu project avispa, in Proceedings of 3rd APPSEM II Workshop on Applied Semantics (APPSEM 2005), pp. 1–17. Frauenchiemsee, Germany (2005)
19. Basin D., Modersheim S., Vigano L., OFMC: A symbolic model checker for security protocols. International Journal of Information Security, 4(3), 181–208 (2005)
20. AVISPA, SPAN, the Security Protocol ANimator for AVISPA, http://www.avispa-project.org/. Accessed on November 2019

Bounded Version Vectors Using Mazurkiewicz Traces

Madhavan Mukund, Gautham Shenoy R., and S. P. Suresh

Abstract

Version vectors constitute an essential feature of distributed systems that enable the computing elements to keep track of causality between the events of the distributed systems. In this article, we study a variant named Bounded Version Vectors. We define the semantics of version vectors using the framework of Mazurkiewicz traces. We use these semantics along with the solution for the gossip problem to come up with a succinct bounded representation of version vectors in distributed environments where replicas communicate via pairwise synchronization.

M. Mukund (B) · G. Shenoy R. · S. P. Suresh
Chennai Mathematical Institute, Chennai, India
M. Mukund · S. P. Suresh
CNRS UMI 2000 ReLaX, Chennai, India
© Springer Nature Singapore Pte Ltd. 2021
R. Chaki et al. (eds.), Advanced Computing and Systems for Security, Advances in Intelligent Systems and Computing 1178, https://doi.org/10.1007/978-981-15-5747-7_3

1 Introduction

Brewer’s CAP Theorem [2] states that distributed systems which are required to be highly available and partition tolerant cannot guarantee that the replicas are strongly consistent. In such situations, systems make do with weaker notions of consistency. A popular weaker notion of consistency is eventual consistency, which allows the states of the replicas to diverge for a finite (but not necessarily bounded) period of time with a guarantee that the states will eventually converge. The properties of such eventually consistent systems have been studied in [5].

The replicas in an eventually consistent system perform updates locally and propagate the updates to other replicas. Whenever a pair of replicas a and b exchange their knowledge about the local states of other replicas in the system, they require a mechanism to decide which of a and b has the most up-to-date information about a third replica c. Version vectors play an important role in this decision-making process.

A typical implementation of version vectors uses integer counters. The size of the vector is equal to the number of replicas in the distributed system. Whenever a relevant event occurs locally at a replica a, the replica increments the counter corresponding to itself in its copy of the version vector.

In reactive systems, the replicas communicate with each other using various mechanisms. Two prominent ones are epidemic propagation and pairwise synchronization. In epidemic propagation, each replica sends its version vector to all other replicas. On receiving a copy from a remote replica, the local replica updates its own version vector by taking the pointwise maximum of the received version vector and its own version vector. In pairwise synchronization, two replicas synchronize and simultaneously update both their version vectors by taking the pointwise maximum. In this article, we concentrate on pairwise synchronization as the communication mechanism.

Note that in a reactive system, these counter values grow as the computations grow, and hence version vectors are, in general, unbounded. However, we observe that given two replicas a and b with states ra and rb, respectively, there are only four possible relationships between them: ra is more up to date than rb, rb is more up to date than ra, ra and rb cannot be compared since they have received concurrent updates, or they are the same. Moreover, when these two replicas participate in pairwise synchronization, they jointly derive the new version vector that captures their state merger by locally comparing their respective version vectors.
The bounded number of relations they capture, and the fact that they can be locally computed without the need for a global view, provides the motivation to seek the existence of a finite representation for version vectors. One such finite representation for version vectors was presented in [1]. In this article, we define the semantics of version vectors within the framework of Mazurkiewicz traces. Using these semantics we reduce the problem of comparing the integers in the version vectors to the problem of comparing appropriate trace events. We shall finally adapt the solution for the gossip problem presented in [4] to provide a more succinct bounded representation for version vectors.
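The four possible relationships and the pointwise-maximum merge described above can be made concrete with unbounded integer counters. The sketch below is only an illustration of the classical (unbounded) representation, not of the bounded one developed in this article; the names `Relation`, `compare`, and `merge` are hypothetical.

```python
from enum import Enum


class Relation(Enum):
    EQUAL = "same state"
    AHEAD = "first is more up to date"
    BEHIND = "second is more up to date"
    CONCURRENT = "concurrent updates, incomparable"


def compare(va, vb):
    """Classify two version vectors of equal length into one of four relations."""
    a_dominates = all(x >= y for x, y in zip(va, vb))
    b_dominates = all(y >= x for x, y in zip(va, vb))
    if a_dominates and b_dominates:
        return Relation.EQUAL
    if a_dominates:
        return Relation.AHEAD
    if b_dominates:
        return Relation.BEHIND
    return Relation.CONCURRENT


def merge(va, vb):
    """Pairwise synchronization: both replicas adopt the pointwise maximum."""
    return [max(x, y) for x, y in zip(va, vb)]
```

For example, `compare([2, 0, 1], [1, 3, 1])` yields `Relation.CONCURRENT`, and merging the two vectors gives `[2, 3, 1]`, which dominates both inputs.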

2 Version Vectors

We restrict ourselves to eventually consistent distributed systems with $N$ replicas $\{0, 1, \ldots, N-1\}$. Having initialized themselves by executing the initialization operation $I$, each of the replicas can perform one of the following operations:

– When a client requests an update operation to be performed, replica $i$ performs the update locally. This is denoted by $U^i$. This changes the state of $i$.
– Periodically, replicas participate in pairwise synchronization, where they exchange their states and arrive at a common state to reflect this exchange and update of knowledge. Pairwise synchronization between $i$ and $j$ is denoted by $S^{ij}$.

An operation $o$ is said to be a $k$-operation (for $k \in \{0, 1, \ldots, N-1\}$) if $o = U^k$ or $o = S^{ik}$ for some $i$. We also say that $k$ participates in $o$.

A finite sequence of operations performed by the distributed system is known as a run. We denote a run by $\alpha = I\, o_1 o_2 \ldots o_n$, where each $o_i$ is either an update operation or a synchronization operation. The version vector of replica $i$ at the end of a run $\alpha$ is a vector $V_i(\alpha)$ of $N$ integer counters. The $k$th entry of $V_i(\alpha)$, denoted by $V_i^k(\alpha)$, represents the most recent update of replica $k$ that $i$ is aware of at the end of $\alpha$.

We say that a version vector $V_i(\alpha)$ dominates $V_j(\alpha)$ iff $\forall k \in [0 \ldots N-1] : V_i^k(\alpha) \ge V_j^k(\alpha)$. Whenever $V_i(\alpha)$ dominates $V_j(\alpha)$ and $V_i(\alpha) \ne V_j(\alpha)$, semantically it means that at the end of the run $\alpha$, replica $i$ is more up to date than $j$, which implies that $i$ has received all the updates that $j$ has received.

The join operation between the $k$th components of $V_i(\alpha)$ and $V_j(\alpha)$ is defined as follows: $V_i^k(\alpha) \sqcup V_j^k(\alpha) = \max(V_i^k(\alpha), V_j^k(\alpha))$. The semantics of the version vectors for the various operations are presented in Table 1.

Table 1 Semantics of version vectors

Operation | Semantics
Initialization | $V_i^k(I) = 0$
Update | $V_i^k(\alpha \cdot U^a) = \begin{cases} V_i^k(\alpha) + 1 & \text{if } k = i = a \\ V_i^k(\alpha) & \text{otherwise} \end{cases}$
Synchronization | $V_i^k(\alpha \cdot S^{ab}) = \begin{cases} V_a^k(\alpha) \sqcup V_b^k(\alpha) & \text{if } i \in \{a, b\} \\ V_i^k(\alpha) & \text{otherwise} \end{cases}$
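The semantics of Table 1 can be evaluated directly on a run. The following sketch assumes a run is encoded as a list of tuples, `("U", a)` for $U^a$ and `("S", a, b)` for $S^{ab}$, with the initialization $I$ implicit; this encoding and the function name `version_vectors` are illustrative choices, not notation from the article.

```python
def version_vectors(run, n):
    """Return [V_0(run), ..., V_{n-1}(run)] following the semantics of Table 1.

    `run` lists the operations after the initialization I:
    ("U", a) is an update at replica a, ("S", a, b) a synchronization of a and b.
    """
    vectors = [[0] * n for _ in range(n)]  # Initialization: every entry is 0
    for op in run:
        if op[0] == "U":
            a = op[1]
            vectors[a][a] += 1  # only the entry with k = i = a changes
        elif op[0] == "S":
            _, a, b = op
            joined = [max(x, y) for x, y in zip(vectors[a], vectors[b])]
            vectors[a] = joined
            vectors[b] = list(joined)  # both participants adopt the join
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return vectors
```

For the run $I\,U^0\,U^0\,S^{01}\,U^2\,U^1$ on three replicas this gives $V_0 = (2, 0, 0)$, $V_1 = (2, 1, 0)$ and $V_2 = (0, 0, 1)$, so $V_1$ dominates $V_0$ while $V_0$ and $V_2$ are incomparable.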

3 Mazurkiewicz Traces and Version Vectors

Let $\alpha = I\, o_1 o_2 \ldots o_n$ be a run. We define $Evs(\alpha)$ to be $\{e_\perp, e_1, \ldots, e_n\}$, the set of events at which the operations in $\alpha$ occur. We define $Ops(\alpha)$ to be $\{I, o_1, \ldots, o_n\}$, the set of operations in $\alpha$. An event $e_i$ is said to be a $k$-event if $o_i$ is a $k$-operation. We also say that $k$ participates in the event $e_i$. For two $k$-events $e_i, e_j \in Evs(\alpha)$, we say that $e_i \le_k e_j$ iff $i \le j$. Note that $\le_k$ is a total order for each $k$.

Definition 1 Given a run $\alpha = I\, o_1 \ldots o_n$, its Mazurkiewicz trace (referred to as trace henceforth), denoted $t(\alpha)$, is a triple $(E, \le, \lambda)$ such that:

– $E = Evs(\alpha)$
– $\lambda : E \to Ops(\alpha)$ such that $\lambda(e_\perp) = I$ and $\lambda(e_i) = o_i$ for $i \in \{1, \ldots, n\}$
– $\le \;=\; \left(\bigcup_k \le_k\right)^*$.

A triple $t = (E, \le, \lambda)$ is a trace if $t = t(\alpha)$ for some run $\alpha$. Note that for any two events $e, f$ in a trace and any replica $k$, $e \le_k f \Rightarrow e \le f$. We say that $e \lessdot f$ if $e < f$ and there is no event $g$ such that $e < g < f$. If $e \lessdot f$, we say that $f$ is an immediate successor of $e$. Note that if $e \lessdot f$, there is a replica $i$ such that both $e$ and $f$ are $i$-events.

Definition 2 For a trace $t = (E, \le, \lambda)$ and $S \subseteq E$, the subtrace of $t$ induced by $S$ is given by $t(S) = (S, \le_S, \lambda_S)$ where:

– $\le_S \;=\; \le \cap\, (S \times S)$
– $\lambda_S(e) = \lambda(e)$ for all $e \in S$.

We shall now define the semantics of version vectors over traces.

Definition 3 Let $t = (E' \cup \{e\}, \le, \lambda)$ and $t' = (E', \le', \lambda')$ be two traces such that $t(E') = t'$. The version vector of replica $i$ in trace $t$, denoted by $V_i(t)$, is inductively defined as follows:

Case $e = e_\perp$: For all $k$, $V_i^k(t) = 0$.
Case $e$ is a $j$-event, $j \ne i$: For all $k$, $V_i^k(t) = V_i^k(t')$.
Case $e$ is a $U^i$ event: For $k \ne i$, $V_i^k(t) = V_i^k(t')$. $V_i^i(t) = V_i^i(t') + 1$.
Case $e$ is an $S^{ij}$ event: For all $k$, $V_i^k(t) = V_j^k(t) = V_i^k(t') \sqcup V_j^k(t')$.

In this representation, for $i \ne j$, $e_i \le e_j$ implies that the event $e_j$ occurs strictly after $e_i$ and thus the replicas participating in $\lambda(e_j)$ know about the occurrence of the event $e_i$. The transfer of this knowledge can be traced along the path from $e_i$ to $e_j$ in the trace. We formalize this notion by defining ideals.

Definition 4 Let $t = (E, \le, \lambda)$ be a trace of the run $\alpha$. A subset $\mathcal{I} \subseteq E$ is said to be an ideal iff for all $e_i \in \mathcal{I}$ and $e_j \le e_i$, $e_j \in \mathcal{I}$.

Thus, an ideal is a downward closed subset of $E$ with respect to the partial order $\le$. The following properties of ideals follow from the definition:

Proposition 5 Let $t = (E, \le, \lambda)$ be a trace of some run $\alpha$. Then the following are true:

1. $E$ is an ideal.
2. If $e_i \in E$, the set $\downarrow e_i = \{e_j \mid e_j \le e_i\}$ is an ideal. (It is referred to as the ideal generated by $e_i$.)
3. Every ideal $\mathcal{I}$ is generated by its maximal elements. In other words, $\mathcal{I} = \bigcup_{e_i \in \sup \mathcal{I}} \downarrow e_i$.
4. If $\mathcal{I}$ and $\mathcal{J}$ are ideals, so are $\mathcal{I} \cup \mathcal{J}$ and $\mathcal{I} \cap \mathcal{J}$.
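To make the definitions of traces and ideals concrete, the following sketch computes the ideal $\downarrow e$ generated by every event of a run and tests whether a set of events is an ideal. It is an illustration under assumptions: events are identified by their index in the run (index 0 standing for $e_\perp$), operations are encoded as `("U", a)` and `("S", a, b)`, and $e_\perp$ is treated as an event in which every replica participates so that it lies below all other events. The function names are hypothetical.

```python
def participants(op, n):
    """Replicas taking part in an operation; "I" is assumed to involve all of them."""
    if op == "I":
        return set(range(n))
    if op[0] == "U":
        return {op[1]}
    return {op[1], op[2]}


def down_sets(run, n):
    """Return a list `down` where down[e] is the ideal generated by event e.

    Event 0 is e_bottom; event i (i >= 1) carries the i-th operation of `run`.
    e <= f in the trace order holds exactly when e is in down[f].
    """
    ops = ["I"] + list(run)
    last = {}  # replica -> index of its most recent event so far
    down = []
    for e, op in enumerate(ops):
        closure = {e}
        for k in participants(op, n):
            if k in last:
                closure |= down[last[k]]  # everything below k's previous event
            last[k] = e
        down.append(frozenset(closure))
    return down


def is_ideal(subset, down):
    """A set of events is an ideal iff it is downward closed."""
    members = set(subset)
    return all(down[e] <= members for e in members)
```

For the run $I\,U^0\,U^1\,S^{01}\,U^2$ the events $e_1$ and $e_2$ are incomparable, $\downarrow e_3 = \{e_\perp, e_1, e_2, e_3\}$, and the union of two ideals such as $\downarrow e_1 \cup \downarrow e_4$ is again an ideal, as stated in Proposition 5.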


Our goal is to reduce the problem of comparing the integer values in the version vectors to the problem of comparing appropriate trace events. To that end, we define the following terminology, which will be used in the remainder of the paper.

Definition 6 Let $\mathcal{I}$ be an ideal and $i$, $j$ and $k$ be replicas.

An $i$-event $e \in \mathcal{I}$ is the maximal event of $i$ in $\mathcal{I}$, denoted by $\max_i(\mathcal{I})$, if $f \le e$ for all $i$-events $f \in \mathcal{I}$. If $t = (E, \le, \lambda)$ is a trace, then we use the notation $\max_i(t)$ to denote $\max_i(E)$.

The view of $i$ in an ideal $\mathcal{I}$, denoted by $\partial_i(\mathcal{I})$, is $\downarrow \max_i(\mathcal{I})$, the ideal generated by the maximal $i$-event in $\mathcal{I}$. For a trace $t = (E, \le, \lambda)$ and event $e \in E$, we use the notations $\partial_i(t)$ and $\partial_i(e)$ to denote $\partial_i(E)$ and $\partial_i(\downarrow e)$, respectively.

The latest information that $i$ has about $j$ in $\mathcal{I}$, denoted by $\mathit{latest}_{i \to j}(\mathcal{I})$, is defined to be $\max_j(\partial_i(\mathcal{I}))$, the maximal $j$-event in the view of $i$. If $t = (E, \le, \lambda)$ is a trace, then we use the notation $\mathit{latest}_{i \to j}(t)$ to denote $\mathit{latest}_{i \to j}(E)$.

The maximal update event of $i$ in $\mathcal{I}$, denoted $\mathit{update}_i(\mathcal{I})$, is $\max\{e \in \mathcal{I} : \lambda(e) = U^i\}$, the maximal $i$-event in $\mathcal{I}$ which is also an update event. If no such event exists then $\mathit{update}_i(\mathcal{I}) = e_\perp$. For a trace $t = (E, \le, \lambda)$ and event $e \in E$, we use the notations $\mathit{update}_i(t)$ and $\mathit{update}_i(e)$ to denote $\mathit{update}_i(E)$ and $\mathit{update}_i(\downarrow e)$, respectively.

The latest update information that $i$ has about $k$ in $\mathcal{I}$, denoted $lu_i^k(\mathcal{I})$, is the maximal $k$-update event in the view of $i$ in $\mathcal{I}$. Thus, $lu_i^k(\mathcal{I}) \stackrel{\text{def}}{=} \mathit{update}_k(\partial_i(\mathcal{I})) = \mathit{update}_k(\mathit{latest}_{i \to k}(\mathcal{I}))$. If $t = (E, \le, \lambda)$ then we use $lu_i^k(t)$ to denote $lu_i^k(E)$.

At this juncture, we prove the following lemma, which we shall refer to in the rest of the paper.

Lemma 7 (Crossover point lemma) Let $\mathcal{I} \subseteq E$ be an ideal in a trace, and $a$ be a replica. For events $e_1$ and $e_2$ such that $e_1 \in \partial_a(\mathcal{I})$, $e_2 \in \mathcal{I} \setminus \partial_a(\mathcal{I})$ and $e_1 < e_2$, there exists a replica $c$ such that $e_1 \le \mathit{latest}_{a \to c}(\mathcal{I}) \le e_2$.

Proof Let $P$ be a path from $e_1$ to $e_2$ such that for any two successive events $e, f$ in $P$, $e \lessdot f$. Let $e_3$ be the maximal $\partial_a(\mathcal{I})$-event on this path. Let $e_4$ be the minimal element along $P$ such that $e_4 \notin \partial_a(\mathcal{I})$. Clearly $e_1 \le e_3$, $e_4 \le e_2$ and $e_3 \lessdot e_4$. This means that there is a replica $c$ such that both $e_3$ and $e_4$ are $c$-events. Clearly $e_3 \le \mathit{latest}_{a \to c}(\mathcal{I})$. Since $e_4 \notin \partial_a(\mathcal{I})$, $e_4 \not\le \mathit{latest}_{a \to c}(\mathcal{I})$. But since $e_4$ is a $c$-event, $\mathit{latest}_{a \to c}(\mathcal{I}) < e_4$. Thus, $e_1 \le \mathit{latest}_{a \to c}(\mathcal{I}) \le e_2$.

If $e$ is an event in the trace $t$, the version vector of replica $i$ at $e$, denoted by $V_i(e)$, is $V_i(t(\downarrow e))$. Note that in a run $\alpha$, $V_i^k(\alpha)$ denotes the integer corresponding to the latest $k$-update operation that replica $i$ has received. If $t = t(\alpha)$ then $lu_i^k(t)$ denotes the latest $k$-update known to replica $i$. We formalize this intuition through the following theorem.

Theorem 8 If $\alpha$ is a run, $t = t(\alpha)$, and $i, k$ are replicas, then $V_i^k(\alpha) = V_i^k(t) = V_k^k(lu_i^k(t))$.


Proof We shall prove this result by induction on the length of the run $\alpha$. Suppose $\alpha$ contains only one operation, $I$. Then, from the semantics of version vectors, for every $i, k$, $V_i^k(\alpha) = V_i^k(t) = 0$. Also, since for every $i, k$, $lu_i^k(t) = e_\perp$, and the trace corresponding to $\downarrow e_\perp$ is $t$ itself, it follows that $V_k^k(lu_i^k(t)) = 0$.

Assume that the result holds for all runs whose length is smaller than $n$, and let $\alpha = \alpha' \cdot o_{\max}$ be a run of length $n$ with trace $t = (E, \le, \lambda)$. Let $e_{\max}$ be a maximal event in $t$ with $\lambda(e_{\max}) = o_{\max}$. Define $E' = E \setminus \{e_{\max}\}$ and $t' = t(E')$. By the induction hypothesis, for any $i, k$, $V_i^k(\alpha') = V_i^k(t') = V_k^k(lu_i^k(t'))$. There are the following cases to consider.

$i$ does not participate in $o_{\max}$: In this case $V_i^k(\alpha) = V_i^k(\alpha')$, $V_i^k(t) = V_i^k(t')$ and $lu_i^k(t) = lu_i^k(t')$. Thus $V_i^k(\alpha) = V_i^k(t) = V_k^k(lu_i^k(t))$.

$o_{\max} = U^i$ and $i \ne k$: We argue as in the previous case.

$o_{\max} = U^i$ and $k = i$: From the definition and the induction hypothesis, $V_i^k(\alpha) = V_i^k(t) = V_k^k(lu_i^k(t')) + 1$. Now $lu_i^k(t) = e_{\max}$, and if $t'' = t(\downarrow e_{\max} \setminus \{e_{\max}\})$ then $lu_i^k(t'') = lu_i^k(t')$. Thus $V_k^k(lu_i^k(t)) = V_k^k(lu_k^k(t)) = V_k^k(lu_k^k(t'')) + 1 = V_k^k(lu_i^k(t')) + 1 = V_i^k(t) = V_i^k(\alpha)$.

$o_{\max} = S^{ij}$: Now $V_i^k(\alpha) = \max(V_i^k(\alpha'), V_j^k(\alpha'))$ and $V_i^k(t) = \max(V_i^k(t'), V_j^k(t'))$. By the induction hypothesis, this is the same as $\max(V_k^k(lu_i^k(t')), V_k^k(lu_j^k(t')))$. Since $e_{\max}$ is not an update event, $lu_i^k(t) = \max(lu_i^k(t'), lu_j^k(t'))$. It follows that $V_i^k(\alpha) = V_i^k(t) = V_k^k(lu_i^k(t))$.

Corollary 9 If $\alpha$ is a run with trace $t$ and $a, b, k$ are replicas then

– $V_a^k(\alpha) < V_b^k(\alpha)$ iff $lu_a^k(t) < lu_b^k(t)$.
– $V_a^k(\alpha) = V_b^k(\alpha)$ iff $lu_a^k(t) = lu_b^k(t)$.

Proof $V_a^k(\alpha) \le V_b^k(\alpha)$ iff $V_k^k(lu_a^k(t)) \le V_k^k(lu_b^k(t))$ (from Theorem 8) iff $lu_a^k(t) \le lu_b^k(t)$ (using the fact that $lu_a^k(t)$ and $lu_b^k(t)$ are both $k$-update events, and using the semantics of update operations over trace events). The statements in the corollary follow from this.
With this we have reduced the problem of comparing integer values in the version vectors to the problem of comparing the corresponding update events in the trace of the run. We shall explore options to provide bounded representation for version vectors. We look at the gossip problem and its solution to achieve this.
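The statement of Theorem 8 can be checked experimentally on small runs. The sketch below tracks, for each replica $i$, both its counter-based version vector (Table 1) and its view $\partial_i(t)$ as a set of event indices, and then compares $V_i^k$ with the counter of $k$ at the event $lu_i^k(t)$. It is a sanity check under assumptions, not a proof: events are identified by their position in the run, operations are tuples `("U", a)` and `("S", a, b)`, and the names `check_theorem8` and `random_run` are hypothetical.

```python
import random


def check_theorem8(run, n):
    """Return True iff V_i^k == V_k^k(lu_i^k) holds for all replicas i, k."""
    ops = [None] + list(run)                  # index 0 stands for e_bottom
    vv = [[0] * n for _ in range(n)]          # counters, as in Table 1
    view = [{0} for _ in range(n)]            # view[i]: event indices in the view of i
    for e in range(1, len(ops)):
        op = ops[e]
        if op[0] == "U":
            a = op[1]
            vv[a][a] += 1
            view[a] = view[a] | {e}
        else:
            _, a, b = op
            joined = [max(x, y) for x, y in zip(vv[a], vv[b])]
            vv[a], vv[b] = joined, list(joined)
            merged = view[a] | view[b] | {e}
            view[a], view[b] = merged, set(merged)

    def lu(i, k):
        # latest k-update event in the view of i (0, i.e. e_bottom, if none)
        return max((e for e in view[i] if e > 0 and ops[e] == ("U", k)), default=0)

    def counter_at(k, e):
        # V_k^k at event e: number of k-updates up to and including e
        return sum(1 for f in range(1, e + 1) if ops[f] == ("U", k))

    return all(
        vv[i][k] == counter_at(k, lu(i, k)) for i in range(n) for k in range(n)
    )


def random_run(n, length, seed):
    """Generate a pseudo-random run over n >= 2 replicas."""
    rng = random.Random(seed)
    run = []
    for _ in range(length):
        if rng.random() < 0.5:
            run.append(("U", rng.randrange(n)))
        else:
            a, b = rng.sample(range(n), 2)
            run.append(("S", a, b))
    return run
```

Calling `check_theorem8(random_run(4, 50, seed), 4)` for several seeds exercises all cases of the induction: updates, synchronizations, and replicas that do not participate.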

4 Bounding the Version Vectors: Using Gossip

Suppose $a$ and $b$ are replicas with version vectors $V_a(t)$ and $V_b(t)$ at the end of a trace $t$. We have already argued (in Corollary 9) that comparing $V_a^k(t)$ and $V_b^k(t)$ is equivalent to comparing $lu_a^k(t)$ with $lu_b^k(t)$. The following is an immediate consequence of the definition of latest update information.

Fig. 1 Figure for Lemma 12. $e$ is a maximal event in $\partial_a(t) \cap \partial_b(t)$. The figure shows the views $\partial_a(t)$ and $\partial_b(t)$ with their maximal events $\max_a(t)$ and $\max_b(t)$, and the event $e = \mathit{latest}_{a \to c}(t) = \mathit{latest}_{b \to d}(t)$.

Proposition 10 Let $t = (E, \le, \lambda)$ be a trace and $a, b, k < N$. If $lu_a^k(t) \ne lu_b^k(t)$ then $lu_a^k(t) < lu_b^k(t)$ iff $\mathit{latest}_{a \to k}(t) < \mathit{latest}_{b \to k}(t)$.

Thus, the problem of comparing distinct $k$-update events corresponds to comparing the latest $k$-events in the views of $a$ and $b$. This latter problem is a special case of the gossip problem in which the number of replicas communicating with each other is restricted to two. A finite-state local solution to the gossip problem was provided in [4]. We adapt that solution to arrive at a bounded representation for version vectors.

Definition 11 Let $t = (E, \le, \lambda)$ be a trace and $a$ be a replica. We define the primary information of $a$ in $t$, denoted by $\mathit{Primary}_a(t)$, to be the set of all the latest events in the view of $a$ in $t$. Formally, $\mathit{Primary}_a(t) = \{\mathit{latest}_{a \to k}(t) \mid k < N\}$. We define the primary graph of $a$ in $t$ to be $G_a(t) = (\mathit{Primary}_a(t), \le_a^t)$ where $\le_a^t$ is the partial order $\le$ restricted to the events in $\mathit{Primary}_a(t)$.

Our goal is to settle the question of comparing $\mathit{latest}_{a \to k}(t)$ and $\mathit{latest}_{b \to k}(t)$ using a finite amount of information that is also local to replicas $a$ and $b$. If $\mathit{latest}_{a \to k}(t) \le \mathit{latest}_{b \to k}(t)$ then $\mathit{latest}_{a \to k}(t)$ is in the ideal $\partial_a(t) \cap \partial_b(t)$. Thus, $\mathit{latest}_{a \to k}(t)$ is in the ideal generated by the maximal elements of $\partial_a(t) \cap \partial_b(t)$. We show the following useful property of these maximal elements (Fig. 1).

Lemma 12 ([4]) Let $a$ and $b$ be replicas and $t = (E, \le, \lambda)$ be a trace. If $e$ is a maximal event in the ideal $\partial_a(t) \cap \partial_b(t)$ then $e \in \mathit{Primary}_a(t) \cap \mathit{Primary}_b(t)$.


Proof If $\max_a(t) \in \partial_b(t)$, then $\partial_a(t) \cap \partial_b(t) = \partial_a(t)$. Its sole maximal element is $\max_a(t)$, which happens to be the same as $\mathit{latest}_{b \to a}(t)$, which is a member of $\mathit{Primary}_b(t)$ by definition. Thus, $\max_a(t) \in \mathit{Primary}_b(t)$. The case where $\max_b(t) \in \partial_a(t)$ can be handled similarly.

Suppose it is the case that $\max_a(t) \notin \partial_b(t)$ and $\max_b(t) \notin \partial_a(t)$. Then $\max_a(t) \in \partial_a(t) \setminus \partial_b(t)$. Similarly $\max_b(t) \in \partial_b(t) \setminus \partial_a(t)$. It follows from the crossover point lemma that there is $c$ such that $e \le \mathit{latest}_{a \to c}(t) \le \max_b(t)$. Since $\mathit{latest}_{a \to c}(t)$ then lies in $\partial_a(t) \cap \partial_b(t)$ and $e$ is a maximal event in this ideal, it follows that $e = \mathit{latest}_{a \to c}(t)$. Hence $e \in \mathit{Primary}_a(t)$. Similarly, $e = \mathit{latest}_{b \to d}(t)$ for some replica $d$ and hence $e \in \mathit{Primary}_b(t)$. Thus, any maximal $(\partial_a(t) \cap \partial_b(t))$-event $e$ is in $\mathit{Primary}_a(t) \cap \mathit{Primary}_b(t)$.

This result gives us a way to compare $\mathit{latest}_{a \to k}(t)$ and $\mathit{latest}_{b \to k}(t)$ using the information present in the primary graphs $G_a(t)$ and $G_b(t)$. We record this through the following corollary.

Corollary 13 ([4]) If $e = \mathit{latest}_{a \to k}(t)$ and $f = \mathit{latest}_{b \to k}(t)$ then $e \le f$ iff $\exists g \in \mathit{Primary}_a(t) \cap \mathit{Primary}_b(t) : e \le_a^t g$.

If our purpose were only to decide whether $V_a(t)$ dominates $V_b(t)$ or whether they are incomparable, then the information in $G_a(t)$ and $G_b(t)$ would suffice, since for any index $k$, whenever $\mathit{latest}_{a \to k}(t) \le \mathit{latest}_{b \to k}(t)$ it is the case that $lu_a^k(t) \le lu_b^k(t)$. However, if we also want to verify whether the version vectors of the two replicas are the same, we have to show that for each $k$, $lu_a^k(t) = lu_b^k(t)$. For this we maintain the primary update information, defined as follows.

Definition 14 For any replica $a$ and trace $t = (E, \le, \lambda)$, the primary update information, denoted $\mathit{PrimaryUpdate}_a(t)$, is the indexed set $\{lu_a^k(t) \mid k \in [N]\}$.

Thus, for any replica $a$, the primary graph $G_a(t)$ together with the primary update information $\mathit{PrimaryUpdate}_a(t)$ is an alternative representation of the version vector $V_a(t)$. Our goal is to provide a bounded representation for this object. We do so by assigning labels to the events in the primary information and primary update information such that the labels come from a bounded set. Since the trace $t$ can grow without bound, finitely many labels imply reuse of labels at some point in time. However, care must be taken to ensure that whenever two primary events $\mathit{latest}_{a \to i}(t)$ and $\mathit{latest}_{b \to j}(t)$ (or two primary update events $lu_a^i(t)$ and $lu_b^j(t)$) are assigned the same label, they are indeed the same event. Otherwise, the solution outlined above for comparing corresponding primary events would not work.

Thus, whenever we label a new $S^{ab}$ event, we need to pick a label that is not currently in use for labelling any other primary event. We need to make this decision by looking at the local information of $a$ and $b$. It is shown in [4] that the information in the primary graphs $G_a$ and $G_b$ is not sufficient for this. Similarly, when we label a new $U^a$ event, we need to pick a label that is not currently in use for labelling any other primary update event. The information present in $\mathit{PrimaryUpdate}_a(t)$ is not sufficient for this purpose. So we define secondary information.
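Primary information and Lemma 12 can be explored on a concrete run. The sketch below computes $\mathit{Primary}_a(t)$ from explicit (unbounded) event indices and the maximal events of $\partial_a(t) \cap \partial_b(t)$, so that the inclusion stated in Lemma 12 can be observed. It deliberately does not implement the bounded labelling; the tuple encoding of operations, the treatment of $e_\perp$ (index 0) as an event of every replica, and the function names are assumptions made for illustration.

```python
def build(run, n):
    """Return (ops, down, last): operations, ideals generated by each event,
    and last[k] = index of the maximal k-event. Event 0 is e_bottom."""
    ops = ["I"] + list(run)
    last = [0] * n
    down = [frozenset({0})]
    for e in range(1, len(ops)):
        closure = {e}
        for k in ops[e][1:]:            # ("U", a) -> (a,), ("S", a, b) -> (a, b)
            closure |= down[last[k]]
            last[k] = e
        down.append(frozenset(closure))
    return ops, down, last


def is_event_of(ops, e, k):
    """e_bottom is assumed to be an event of every replica."""
    return e == 0 or k in ops[e][1:]


def primary(ops, down, last, a, n):
    """Primary_a(t): the latest k-event in the view of a, for every replica k."""
    view = down[last[a]]
    return {max(e for e in view if is_event_of(ops, e, k)) for k in range(n)}


def maximal(events, down):
    """Maximal elements of a set of events with respect to the trace order."""
    return {e for e in events if not any(e != f and e in down[f] for f in events)}
```

For the run $I\,U^0\,S^{02}\,U^1\,S^{12}\,U^0\,U^1$ on three replicas, the views of replicas 0 and 1 intersect in $\{e_\perp, e_1, e_2\}$, whose only maximal event $e_2$ belongs to both $\mathit{Primary}_0(t) = \{e_\perp, e_2, e_5\}$ and $\mathit{Primary}_1(t) = \{e_2, e_4, e_6\}$.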


Definition 15 Let $t = (E, \le, \lambda)$ be a trace and $a, b, k$ be replicas. Define $\mathit{latest}_{a \to b \to k}(t) \stackrel{\text{def}}{=} \mathit{latest}_{b \to k}(\partial_a(t))$ and $lu_{a \to b}^k(t) \stackrel{\text{def}}{=} lu_b^k(\partial_a(t))$. The secondary information for $a$ in $t$, denoted $\mathit{Secondary}_a(t)$, is defined as

$\mathit{Secondary}_a(t) = \{\mathit{latest}_{a \to b \to k}(t),\ lu_{a \to b}^k(t) \mid b, k < N\}$.

We now show that the secondary information of a replica $a$ contains enough information for it to know whether a certain $a$-event is in the primary information or primary update information of any other replica $b$.

Lemma 16 Let $a, b$ be replicas and $t = (E, \le, \lambda)$ be a trace. Then

1. $\mathit{latest}_{b \to a}(t) \in \mathit{Secondary}_a(t)$.
2. $lu_b^a(t) \in \mathit{Secondary}_a(t)$.

Proof
1. We need to show that $\mathit{latest}_{b \to a}(t) = \mathit{latest}_{a \to d \to a}(t)$ for some $d < N$. If $\max_b(t) \in \partial_a(t)$ then $\mathit{latest}_{b \to a}(t) = \mathit{latest}_{b \to a}(\partial_a(t)) = \mathit{latest}_{a \to b \to a}(t)$. Suppose $\max_b(t) \notin \partial_a(t)$. Since $\mathit{latest}_{b \to a}(t)$ is also an $a$-event, $\mathit{latest}_{b \to a}(t) \in \partial_a(t)$. By the crossover point lemma we can find $c$ such that $\mathit{latest}_{b \to a}(t) \le \mathit{latest}_{a \to c}(t) \le \max_b(t)$. Since $\mathit{latest}_{b \to a}(t)$ is the maximal $a$-event in $\partial_b(t)$, we have $\mathit{latest}_{a \to c \to a}(t) = \mathit{latest}_{b \to a}(t)$. Thus $\mathit{latest}_{b \to a}(t) \in \mathit{Secondary}_a(t)$.
2. We need to show that there is $c$ such that $lu_b^a(t) = lu_{a \to c}^a(t)$. If $\max_b(t) \in \partial_a(t)$, then $lu_b^a(t) = lu_{a \to b}^a(t)$ and we are done. Otherwise, we can appeal to the crossover point lemma to show that there is a $c$ such that $lu_b^a(t) \le \mathit{latest}_{a \to c}(t)$ and $\mathit{latest}_{a \to c}(t) \le \max_b(E)$. It then follows that $lu_b^a(t) \le lu_{a \to c}^a(t)$ and $lu_{a \to c}^a(t) \le lu_b^a(t)$, and we are done.

Definition 17 For a replica $a$ and a trace $t$, the event version vector of $a$ in $t$, denoted $\mathit{EVV}_a(t)$, is defined to be $(G_a(t), \mathit{PrimaryUpdate}_a(t), \mathit{Secondary}_a(t))$.

As the replicas participate in more operations, the trace $t$ grows. We need to show that the event version vectors of the participating replicas after an operation can be reconstructed from their event version vectors before the operation.

Lemma 18 Let $t = (E, \le, \lambda)$ and $t' = (E \cup \{e_{new}\}, \le', \lambda')$ be traces such that $t'(E) = t$. Let $\lambda'(e_{new}) = o_{new}$. If replicas $a, b$ participate in the operation $o_{new}$ then $\mathit{EVV}_a(t')$ and $\mathit{EVV}_b(t')$ can be reconstructed from $\mathit{EVV}_a(t)$ and $\mathit{EVV}_b(t)$.

Proof We prove this by induction on the size of $t'$. The base case is when $t' = (\{e_\perp\}, \le', \lambda')$, in which case we have $\mathit{latest}_{a \to b}(t') = \mathit{latest}_{a \to b \to k}(t') = lu_a^b(t') = lu_{a \to b}^k(t') = e_\perp$ for any replicas $a, b, k$. And we have $e_\perp \le_a^{t'} e_\perp$.

Suppose the result holds for all traces of size smaller than $n$. Suppose $t'$ is of size $n$. We have the following two cases to consider.


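$o_{new} = U_a$: From the definitions, for all $i \ne a$ and any $k$, $\mathit{latest}_{a\to i}(t') = \mathit{latest}_{a\to i}(t)$ and hence $\mathit{latest}_{a\to i\to k}(t') = \mathit{latest}_{a\to i\to k}(t)$, $\mathit{lu}^{i}_{a}(t') = \mathit{lu}^{i}_{a}(t)$ and $\mathit{lu}^{k}_{a\to i}(t') = \mathit{lu}^{k}_{a\to i}(t)$. Now $\mathit{latest}_{a\to a}(t') = \mathit{latest}_{a\to a\to a}(t') = \mathit{lu}^{a}_{a}(t') = \mathit{lu}^{a}_{a\to a}(t') = e_{new}$. For any two events $e, f \in \mathit{Primary}_a(t')$, $e \le^{t'}_{a} f$ iff $e \le^{t}_{a} f$ or $f = e_{new}$. Thus we have reconstructed $G_a(t')$, $\mathit{PrimaryUpdate}_a(t')$ and $\mathit{Secondary}_a(t')$, thereby reconstructing $\mathit{EVV}_a(t')$.

$o_{new} = S_{ab}$: Let $i, j \in \{a, b\}$. From Corollary 13, for any replica $k \notin \{a, b\}$, it is possible to decide whether $\mathit{latest}_{a\to k}(t) \le \mathit{latest}_{b\to k}(t)$ from the information in $G_a(t)$ and $G_b(t)$. Define $w_k$ as follows:

$$w_k = \begin{cases} b & \text{if } \mathit{latest}_{a\to k}(t) \le \mathit{latest}_{b\to k}(t) \\ a & \text{otherwise} \end{cases}$$

Now by definition, $\mathit{latest}_{i\to k}(t') = \mathit{latest}_{w_k\to k}(t)$ and $\mathit{latest}_{i\to j}(t') = e_{new}$. Similarly $\mathit{lu}^{k}_{i}(t') = \mathit{lu}^{k}_{w_k}(t)$ and $\mathit{lu}^{j}_{i}(t') = \mathit{lu}^{j}_{j}(t)$. Further, for any replica $p$, $\mathit{latest}_{i\to k\to p}(t') = \mathit{latest}_{w_k\to k\to p}(t)$ and $\mathit{latest}_{i\to j\to p}(t') = \mathit{latest}_{j\to p}(t')$. Similarly, $\mathit{lu}^{p}_{i\to k}(t') = \mathit{lu}^{p}_{w_k\to k}(t)$ and $\mathit{lu}^{p}_{i\to j}(t') = \mathit{lu}^{p}_{j}(t')$. If $e \in \mathit{Primary}_i(t')$ then $e \le^{t'}_{i} e_{new}$. If $e, f \in \mathit{Primary}_i(t') \cap \mathit{Primary}_j(t)$ with $e \le^{t}_{j} f$, then it is the case that $e \le^{t'}_{i} f$. If $e, f \in \mathit{Primary}_i(t')$ but for $i \ne j$, $e \in \mathit{Primary}_i(t) \setminus \mathit{Primary}_j(t)$ and $f \in \mathit{Primary}_j(t) \setminus \mathit{Primary}_i(t)$, then $e$ and $f$ are incomparable in $t$, and hence are incomparable in $t'$. With this, we have reconstructed $G_a(t')$, $G_b(t')$, $\mathit{PrimaryUpdate}_a(t')$, $\mathit{PrimaryUpdate}_b(t')$, $\mathit{Secondary}_a(t')$ and $\mathit{Secondary}_b(t')$, thereby reconstructing $\mathit{EVV}_a(t')$ and $\mathit{EVV}_b(t')$.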

4.1 Labelling

Let $\{L_{ij} \mid i \le j < N\}$ be a collection of mutually disjoint sets of labels, such that each $L_{ij}$ is of size $2N + 1$. Let $l_0$ be a label not occurring in any of the $L_{ij}$s. Let L = i≤ j …

    … = 24) {
        takePictureIntent.putExtra("output", FileProvider.getUriForFile(.. ,.. , image));
        ...
      } else {
        takePictureIntent.putExtra("output", Uri.fromFile(image));
      }
      this.currentPicturePath = image.getAbsolutePath();
      startActivityForResult(takePictureIntent, ..);
    }
    ...
    }

Telegram invokes method sendLogs(), reported below, in case of crash or when an inconsistent runtime state is detected. It collects all logs and sends them in an email as attachment. For that purpose, Telegram asks the OS for all apps that can satisfy the action "android.intent.action.SEND_MULTIPLE":

    public class SettingsActivity {
      ...
      public void onItemClick(View view, int position) {
        ...
        SettingsActivity.this.sendLogs();
        ...
      }
      ...
      private void sendLogs() {
        List uris = new ArrayList();
        for (File file : new File(getAbsolutePath() + "/logs").listFiles())
          uris.add(Uri.fromFile(file));
        if (!uris.isEmpty()) {
          Intent i = new Intent("android.intent.action.SEND_MULTIPLE");
          i.setType("message/rfc822");
          i.putExtra("android.intent.extra.EMAIL", new String[] { BuildVars.SEND_LOGS_EMAIL });
          i.putExtra("android.intent.extra.SUBJECT", "last logs");
          i.putParcelableArrayListExtra("android.intent.extra.STREAM", uris);
          startActivityForResult(Intent.createChooser(i, "Select email application."), 500);
        }
      }
      ...
    }

Fig. 10 Results of the analysis of the 50 most popular apps on the Italian Google Play marketplace. OOM stands for out of memory

    App                 LOC    SDLI Time   SDLI Warns
    3                   249K   11:29       223
    AliBaba             230K   5:53        1
    Amazon              354K   27:44       540
    Booking             260K   10:28       65
    CandyCrash          239K   9:35        129
    CatsCrashArena      294K   19:02       265
    Chat&Cash           288K   21:07       132
    ChickenScream       322K   25:19       268
    Clash of Clans      270K   17:35       160
    Faceapp             244K   09:07       25
    Facebook            231K   OOM
    FacebookLite        266K   13:33       137
    FacebookMessenger   233K   8:58        55
    FifaMobile          212K   6:18        21
    FightList           311K   20:08       207
    GooglePhoto         312K   20:40       113
    GooglePlayGames     330K   29:07       276
    GoogleTranslate     295K   25:37       146
    InfoTarga           243K   11:40       139
    Instagram           333K   OOM
    Launcher            319K   21:12       259
    Musically           195K   5:27        6
    MusicDownloader     262K   17:22       170
    Netflix             346K   25:32       156
    PianoTiles          335K   30:12       492
    Pinterest           383K   OOM
    PjMask              254K   13:59       164
    Pokemon             248K   11:18       130
    Pou                 237K   9:32        16
    Roll the Ball       323K   26:14       369
    Shazam              296K   15:08       37
    Slither             290K   22:06       240
    Snapchat            141K   2:58        2
    Spotify             312K   21:01       51
    Subito              232K   8:39        9
    Subway Surfers      329K   23:52       325
    SuperMario          273K   16:55       229
    SuperOptCleaner     222K   7:59        11
    Telegram            315K   27:55       188
    Tiger Ball          208K   7:20        42
    Tim Mobile          253K   11:44       86
    Twitter             161K   4:24        0
    Vodafone            284K   14:50       73
    Waze                265K   9:37        58
    WhatsApp            359K   35:43       320
    Wind                294K   18:24       115
    Wish                332K   20:31       92

We have matched the XML summary file of Telegram against those of our database of apps and found that Intent Analyzer can serve intents with action "android.intent.action.SEND_MULTIPLE". If chosen, it will show the sensitive information held in the intent. This leak can be prevented by avoiding the use of implicit intents. For instance, one can upload the logs to a remote server and share the latter’s address by email.

Gmail v.6.11.27.141872707

Gmail¹² specifies, in its manifest, that it is available to serve intents with action Intent.ACTION_SEND or Intent.ACTION_SEND_MULTIPLE, that get dispatched to its ComposeActivityGmailExternal component:

...

...

That component is obfuscated. Its class extends ComposeActivityGmail, which extends cja, whose method a(Message) is reachable from an event handler of ComposeActivityGmailExternal. It accesses the extra for key Intent.EXTRA_TEXT and fills the body of the email by calling method a(String, boolean, boolean):

    public class ComposeActivityGmailExternal extends ComposeActivityGmail {...}
    public class ComposeActivityGmail extends cja {...}
    public class cja extends abl ... {
      private final void a(Message message) {
        Intent intent = getIntent();
        ...
        String stringExtra2 = intent.getCharSequenceExtra("android.intent.extra.TEXT");
        stringExtra2 = Html.toHtml(new SpannableString(stringExtra2));
        // fill the textfield of the message of the email
        a(stringExtra2, true, false);
        ...
      }
    }

12 https://play.google.com/store/apps/details?id=com.google.android.gm.


Since the code is obfuscated, we could not check the method where stringExtra2 is injected into the email body. However, SDLI signals that the information is leaked, and we reproduced the leak on our device. This paves the way to information leaks if other apps send confidential information through intents with action Intent.ACTION_SEND, as shown below.

WhatsApp v.2.16.396¹³

WhatsApp Messenger is the most popular app worldwide. It is heavily obfuscated, but we could identify a method that accesses sensitive information:

    public static String a(...) {
      ...
      return ... + "LC" + locale.getCountry() + "\n"
        + "LG" + locale.getLanguage() + "\n"
        + "Model" + Build.MODEL + "\n"
        + "Product" + Build.PRODUCT + "\n"
        + "Device" + Build.DEVICE + "\n"
        + "Architecture" + System.getProperty("os.arch") + ...;
    }

which is then passed as str2 to the following method:

    public static boolean a(..., String str2, ...) {
      Intent intent = new Intent(obj != null
        ? "android.intent.action.SEND_MULTIPLE"
        : "android.intent.action.SEND");
      intent.putExtra("android.intent.extra.TEXT", str2);
      ...
    }

To reproduce the leak, it is enough to go to the options of WhatsApp, select Settings → About and help → Contact us, type an email address in the text field and tap Next. This fires Gmail, if it is the only installed app willing to serve intents with action Intent.ACTION_SEND or Intent.ACTION_SEND_MULTIPLE, or opens a chooser, if there are more. The mail that gets sent contains sensitive information about the device.

WhatsApp allows one to add a new contact to the Android contact list, by using information provided by the user. To perform this action, it creates, in method ContactInfo.onOptionsItemSelected(), an intent with action Intent.ACTION_INSERT, meant to be served by a component of the OS:

    intent = new Intent("android.intent.action.INSERT", ...);
    intent.putExtra("name", this.l.z);
    intent.putExtra("phone", bd.b(this.l.t));
    startActivityForResult(intent, 10);

Such an intent contains confidential information about the new contact. However, if installed, Intent Analyzer appears in the chooser menu, together with the preinstalled default app suited to handle the request. In this case, the user does not expect to select an app, because the operation is usually managed by the default app. A better way to do the same thing would be through a content provider operation, avoiding the use of implicit intents.

13 https://www.whatsapp.com.


Tripadvisor v19.0.1

TripAdvisor is the world’s largest travel website, with a popular companion Android app. Its code is heavily obfuscated, but we could find where it creates an intent containing confidential information. The following snippet creates an intent with action SEND_MULTIPLE and confidential extras. SDLI infers that field aVar is tainted, hence str and the TEXT extra are tainted as well.

    String str = ... + "@Reporter Email=" + a(activity) + "\n\n" + aVar.k + "\n\n"
      + "[Report Information]\n"
      + "Incident ID:" + aVar.e + "\n"
      + "Device ID: " + aVar.f + "\n"
      + "Session ID:" + aVar.g + "\n"
      + "App version: " + aVar.b + "\n"
      + "App build date: " + aVar.c;
    ...
    intent2 = new Intent("android.intent.action.SEND_MULTIPLE");
    intent2.putExtra("android.intent.extra.EMAIL", a);
    intent2.putExtra("android.intent.extra.CC", b);
    intent2.putExtra("android.intent.extra.SUBJECT", str);
    intent2.putExtra("android.intent.extra.TEXT", "\n" + str);

As in the previous examples, this intent too might start Gmail and send sensitive information.

Netflix v4.12.2

Netflix is the world’s leading subscription service for watching TV episodes and movies. The app can be downloaded for free, but needs a paid subscription to the service. The following code (from ExportDebugData.export(Activity activity)) creates an intent and fills it with sensitive information about the device. It is run from inside an event listener fired from the menu of the app.

    intent = new Intent("android.intent.action.SEND_MULTIPLE");
    intent.setType("text/plain");
    intent.putExtra("android.intent.extra.SUBJECT",
      "Netflix Android Bug Report : com.netflix.mediaclient4.12.2build14444");
    intent.putExtra("android.intent.extra.TEXT",
      "\n\n\n[" + VERSION.SDK_INT + Build.BRAND + Build.MANUFACTURER
      + Build.MODEL + Build.DEVICE
      + Locale.getDefault().getCountry()
      + Locale.getDefault().getLanguage() + "]");
    startActivity(Intent.createChooser(intent, "Send email..."));

As in the previous examples, this intent too might start Gmail and send sensitive information.

Google Play Games Accounts

Google Play Games uses implicit intents to share data between its components. For instance, method ClientUiProxyActivity.launchProxyIntent() broadcasts an intent with action CLIENT_PROXY, containing confidential data about the user’s accounts, stored there by method VideoCapturedPopup.handleClick():

    Context localContext = getContext();
    Account acc = this.mGamesContext.mClientContext.mResolvedAccount;
    ...
    intent.putExtra("com.google.android.gms.games.ACCOUNT", acc);
    ClientUiProxyActivity.launchProxyIntent(localContext, intent);


That intent was meant to stay inside the app’s boundaries. However, being implicit, it might be intercepted by other apps, as can be verified with Intent Analyzer. A malware might consequently gain access to all user accounts.

Discussion

The issues related to implicit intents with action Intent.ACTION_SEND do not seem critical, since almost all devices have Gmail installed. Hence, the user would have to explicitly select an untrusted app (malware) with the chooser, instead of Gmail, which seems unlikely. Similarly, the issue related to implicit intents with action Intent.ACTION_INSERT does not seem critical: there is, normally, a trusted receiver app for that intent, namely, the app dealing with the phone contacts. A malware might be willing to serve the intent, in which case a chooser will pop up. But it is unlikely that the user will choose the malware, although it might simulate the expected behavior while actually leaking the information. The issue related to implicit intents with action CLIENT_PROXY, instead, seems definitely critical. That action is uncommon, it is strictly related to the internal logic of Google Games (rather than to a user request) and there exists just one available receiver among the 50 most popular Android apps: Google Games itself. Hence, the OS will automatically redirect the intent to it, without popping up any chooser. But another app (such as Intent Analyzer or a malware) might easily intercept the intent. The user will then be faced with a puzzling question about the right app to choose for an incomprehensible action that was never explicitly requested, and might randomly choose the malware. In this case, the programmers of Google Games should have used explicit intents instead.
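The difference between the two dispatch modes discussed above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: the classes ToyIntent, ToyComponent and ToyResolver are hypothetical and are not part of the Android API, and the model ignores choosers, priorities, categories and permissions. It shows only that an implicit intent is visible to every component declaring the action, while an explicit one reaches the named component alone.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for an intent; this is NOT android.content.Intent. */
final class ToyIntent {
    final String action;  // e.g. "CLIENT_PROXY"
    final String target;  // component name for an explicit intent, null for an implicit one

    ToyIntent(String action, String target) {
        this.action = action;
        this.target = target;
    }
}

/** Toy stand-in for an installed component and the single action it declares. */
final class ToyComponent {
    final String name;
    final String servedAction;

    ToyComponent(String name, String servedAction) {
        this.name = name;
        this.servedAction = servedAction;
    }
}

final class ToyResolver {
    /** Names of the installed components that may receive the given intent. */
    static List<String> receivers(ToyIntent intent, List<ToyComponent> installed) {
        List<String> result = new ArrayList<>();
        for (ToyComponent c : installed) {
            boolean match = (intent.target != null)
                    ? c.name.equals(intent.target)           // explicit: only the named component
                    : c.servedAction.equals(intent.action);  // implicit: anyone declaring the action
            if (match) {
                result.add(c.name);
            }
        }
        return result;
    }
}
```

With two installed components that both declare CLIENT_PROXY, the implicit intent has two possible receivers and the explicit one has a single receiver. On Android itself, an intent becomes explicit once a target is set on it, for instance through Intent.setComponent or Intent.setClassName.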

7 Conclusion

The new static analysis described in this article tracks information flows through Android intents. Its implementation SDLI, instantiated from Julia’s information flow analysis, detects inter-app communications that leak sensitive data. Experiments show that it is precise and efficient on some test cases from DroidBench, scales to real-world apps and, despite these being heavily obfuscated, allows one to detect actually reproducible leaks of confidential information in some of the most popular apps in the Google Play marketplace. SDLI confirmed that implicit intents are extremely dangerous, since the receiver is not statically known. Google classified this as a serious security issue and this is why Android 5.0 (API level 21) and later throw an exception at bindService() calls with an implicit intent.¹⁴ From the user perspective, an implicit intent carrying confidential extras is a security issue if and only if another app is installed that catches and leaks it.

Future work is mainly related to the improvement of the string analysis used in SDLI, which directly affects its precision when actions or keys are not constants, as shown in Sect. 6 for some test cases from DroidBench. The use of  in that case (Sect. 5) keeps the analysis sound but reduces its precision. We are also aiming at integrating the SDLI tool with our very recent vulnerability analysis of car infotainment apps [19].

14 https://developer.android.com/guide/components/intents-filters.html.

References

1. Andersen, L.O.: Program analysis and specialization for the C programming language. Ph.D. thesis, University of Copenhagen, DIKU (1994)
2. Arzt, S., Rasthofer, S., Fritz, C., Bodden, E., Bartel, A., Klein, J., Le Traon, Y., Octeau, D., McDaniel, P.: FlowDroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. In: Proceedings of Programming Language Design and Implementation (PLDI), Edinburgh, UK, June 2014, p. 29 (2014)
3. Bartel, A., Klein, J., Le Traon, Y., Monperrus, M.: Dexpler: converting android Dalvik bytecode to jimple for static analysis with soot. In: Proceedings of State of the Art in Java Program Analysis (SOAP) (2012)
4. Bhandari, S., Jaballah, W.B., et al.: Android inter-app communication threats and detection techniques. Comput. Secur. 70, 392–421 (2017)
5. Bryant, R.: Symbolic Boolean manipulation with ordered binary-decision diagrams. ACM Comput. Surv. 24(3), 293–318 (1992)
6. Cortesi, A., Ferrara, P., Pistoia, M., Tripp, O.: Datacentric semantics for verification of privacy policy compliance by mobile applications. In: Verification, Model Checking, and Abstract Interpretation - 16th International Conference, VMCAI 2015, Mumbai, India, 12–14 January 2015, pp. 61–79 (2015)
7. Cortesi, A., Olliaro, M.: M-string segmentation: a refined abstract domain for string analysis in C programs. In: 2018 International Symposium on Theoretical Aspects of Software Engineering, TASE 2018, Guangzhou, China, 29–31 August 2018, pp. 1–8 (2018)
8. Cortesi, A., Ferrara, P., Halder, R., Zanioli, M.: Combining symbolic and numerical domains for information leakage analysis. In: Transactions on Computational Science 31. LNCS, vol. 10730, pp. 98–135 (2018)
9. Costantini, G., Ferrara, P., Cortesi, A.: A suite of abstract domains for static analysis of string values. Softw. Pract. Exp. 45(2), 245–287 (2015)
10. Cousot, P., Cousot, R.: Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: Proceedings of Principles of Programming Languages (POPL), pp. 238–252 (1977)
11. Enck, W., Gilbert, P., Han, S., Tendulkar, V., Chun, B.-G., Cox, L.P., Jung, J., McDaniel, P.D., Sheth, A.N.: TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones. ACM Trans. Comput. Syst. 32(2), 5:1–5:29 (2014)
12. Ernst, M.D., Lovato, A., Macedonio, D., Spiridon, C., Spoto, F.: Boolean formulas for the static identification of injection attacks in java. In: Proceedings of Logic for Programming, Artificial Intelligence, and Reasoning (LPAR-20), Suva, Fiji. LNCS, vol. 9450, pp. 130–145 (2015)
13. Ferrara, P., Cortesi, A., Spoto, F.: From cil to java bytecode: semantics-based translation for static analysis leveraging. Sci. Comput. Program. 191 (2020)
14. Ferrara, P., Mandal, A.K., Cortesi, A., Spoto, F.: Cross-programming language taint analysis for the iot ecosystem. In: ECEASST, vol. 77 (2019)
15. Halder, R., Cortesi, A.: Abstract interpretation of database query languages. Comput. Lang. Syst. Struct. 38(2), 123–157 (2012)
16. Jana, A., Halder, R., Kalahasti, A., Ganni, S., Cortesi, A.: Extending abstract interpretation to dependency analysis of database applications. IEEE Trans. Softw. Eng. (2020)
17. Li, L., Bartel, A., Bissyandé, T.F., Klein, J., Le Traon, Y., Arzt, S., Rasthofer, S., Bodden, E., Octeau, D., McDaniel, P.D.: IccTA: detecting inter-component privacy leaks in android apps. In: Proceedings of the International Conference on Software Engineering (ICSE), Florence, Italy, pp. 280–291 (2015)
18. Livshits, B., Sridharan, M., Smaragdakis, Y., Lhoták, O., Amaral, J.N., Chang, B.E., Guyer, S.Z., Khedker, U.P., Møller, A., Vardoulakis, D.: In defense of soundiness: a manifesto. Commun. ACM 58(2), 44–46 (2015)
19. Mandal, A.K., Cortesi, A., Ferrara, P., Panarotto, F., Spoto, F.: Vulnerability analysis of android auto infotainment apps. In: Proceedings of the 15th ACM International Conference on Computing Frontiers, CF 2018, Ischia, Italy, 08–10 May 2018, pp. 183–190 (2018)
20. Mandal, A.K., Panarotto, F., Cortesi, A., Ferrara, P., Spoto, F.: Static analysis of android auto infotainment and on-board diagnostics II apps. Softw. Pract. Exp. 49(7), 1131–1161 (2019)
21. Octeau, D., Jha, S., McDaniel, P.D.: Retargeting android applications to java bytecode. In: Proceedings of Foundations of Software Engineering (FSE), Cary, NC, USA (2012)
22. Octeau, D., Luchaup, D., Jha, S., McDaniel, P.D.: Composite constant propagation and its application to android program analysis. IEEE Trans. Softw. Eng. 42(11), 999–1014 (2016)
23. Octeau, D., McDaniel, P.D., Jha, S., Bartel, A., Bodden, E., Klein, J., Le Traon, Y.: Effective inter-component communication mapping in android: an essential step towards holistic security analysis. In: Proceedings of USENIX Security, Washington, DC, USA, pp. 543–558 (2013)
24. Payet, É., Spoto, F.: Static analysis of android programs. Inf. Softw. Technol. 54(11), 1192–1201 (2012)
25. Rasthofer, S., Arzt, S., Bodden, E.: A machine-learning approach for classifying and categorizing android sources and sinks. In: Proceedings of Network and Distributed System Security (NDSS), San Diego, California, USA (2014)
26. Sadeghi, A., Bagheri, H., Garcia, J., Malek, S.: A taxonomy and qualitative comparison of program analysis techniques for security assessment of android software. IEEE Trans. Softw. Eng. 43(6), 492–530 (2017)
27. Salvia, R., Ferrara, P., Spoto, F., Cortesi, A.: SDLI: static detection of leaks across intents. In: 17th IEEE International Conference on Trust, Security And Privacy, TrustCom2018, New York, NY, USA, 1–3 August 2018, pp. 1002–1007 (2018)
28. Spoto, F.: The Julia static analyzer for java. In: Proceedings of Static Analysis Symposium (SAS), Edinburgh, UK. Lecture Notes in Computer Science, vol. 9837, pp. 39–57 (2016)
29. Vallée-Rai, R., Gagnon, E., Hendren, L.J., Lam, P., Pominville, P., Sundaresan, V.: Optimizing java bytecode using the soot framework: is it feasible? In: Proceedings of Compiler Construction (CC), Berlin, Germany. Lecture Notes in Computer Science, vol. 1781, pp. 18–34 (2000)
30. Wei, F., Roy, S., Ou, X., Robby: Amandroid: a precise and general inter-component data flow analysis framework for security vetting of android apps. In: Proceedings of Computer and Communication Security (CCS), Scottsdale, AZ, USA, pp. 1329–1341 (2014)

Fingerprint and Keystroke Dynamics Fusion in Multimodal Biometrics System

Maciej Szymkowski, Patryk Milewski, and Khalid Saeed

Abstract Biometric traits have recently become the most popular way to secure our data and devices. They require no passwords or cards: measurable human traits themselves become the key. However, it has been shown that a single trait is sometimes not enough to ensure a high accuracy level of a biometrics system. In the proposed algorithm, we concentrate on two measurable traits: fingerprint and keystroke dynamics. The first is commonly used in diverse approaches. The second is used less often because of its low distinguishability. This paper presents methods for extracting feature vectors from fingerprints and keystroke dynamics, along with a way to use them in a multimodal biometrics system. We divided the experiments into three main parts: accuracy of the fingerprint-based method, precision of the keystroke dynamics-based algorithm, and finally the concept of the multimodal system. During the experiments we used diverse approaches, such as k-nearest neighbors, k-means, the Naive Bayes classifier, and decision trees. All experiments were carried out on sample sets collected by the authors. The analysis has shown that it is possible to recognize human identity on the basis of keystroke dynamics with a satisfactory accuracy level. Moreover, it showed that the choice of machine learning algorithm has a large impact on the quality of classification and recognition.

Keywords Biometrics · Keystroke dynamics · Fingerprints · Multimodal biometrics systems · Machine learning · k-means · Decision trees

M. Szymkowski (✉) · P. Milewski · K. Saeed
Faculty of Computer Science, Białystok University of Technology, Białystok, Poland

© Springer Nature Singapore Pte Ltd. 2021
R. Chaki et al. (eds.), Advanced Computing and Systems for Security, Advances in Intelligent Systems and Computing 1178, https://doi.org/10.1007/978-981-15-5747-7_5


1 Introduction

Biometrics is a science that deals with human identity recognition on the basis of measurable traits. The literature offers one general classification of these features [1], with three main groups: physiological, behavioral, and hybrid. The first consists of traits that we acquire during the first years of life. The "behavioral" group is connected with specific patterns that we learn during our life, for example, keystroke dynamics or a signature. The "hybrid" group consists of traits that can be classified as physiological and behavioral simultaneously. At present it contains only one element, the human voice, because the hybrid group was introduced into biometrics classifications only recently. Earlier, human traits were classified only as behavioral or physiological.

Keystroke dynamics is the subject of many studies. However, the vast majority of them focus on feature vector creation and on discovering new characteristics of the way we type on a keyboard [2, 3]. Such measurable traits are mostly used in multimodal biometrics systems as a support for physiological features, because it is sometimes really hard to reproduce a behavioral trait in exactly the same way. These features can change during a person's life, depending on experience and habits. Nevertheless, some of them are still used in stand-alone biometrics systems as the only trait. Research in recent years has shown that keystroke dynamics can guarantee a satisfactory accuracy level, since most of us type on a keyboard in a completely different way. This can be related to the user’s age (elderly people tend to use a computer keyboard somewhat more slowly than young people) and to the user’s profession: for instance, software developers will type in a completely different way than construction workers, owing to various influences connected with their work.

The second trait analyzed in this research is the fingerprint. It is one of the most commonly used measurable human features. Nowadays, we can find it in a variety of devices and systems, the best examples being smartphones and laptops. Moreover, in recent years the fingerprint has proved to be one of the best ways to secure our belongings, because it is easy to obtain and the devices are really easy to use. What is more, novel systems take into consideration additional parameters (like temperature or material) to check whether a fingerprint is real or not. These auxiliary features also make the fingerprint really hard to spoof. On the other hand, some illnesses lead to the disappearance of fingerprints. A person affected in this way cannot be recognized by fingerprints and needs to use a different method (for example, keystroke dynamics). This is one of the reasons why we analyze the possibilities of multimodal biometrics systems.

In this work, the authors present their approach to a multimodal biometrics system based on keystroke dynamics and fingerprints. In the case of the first mentioned trait, we made a comparison between the most popular machine learning techniques


used in biometrics identity recognition based on this trait. In our experiments, we used k-means, k-nearest neighbors, the Naive Bayes classifier, and decision trees. When it comes to the fingerprint, we used our own algorithm based on dividing the fingerprint structure into squares. In the classification phase, we used a simple k-nearest neighbors algorithm. One of the main aims of this approach was to check whether it is possible to obtain human identity on the basis of both traits with a higher accuracy level than when each of them is used separately.

This work is organized as follows. In the first section, we present the current state of the art. We took into consideration not only research connected with fingerprints and keystroke dynamics but also multimodal biometrics systems that use these traits. In the second section, the proposed methodology is described: which algorithms were used and in what ways. We also describe how the multimodal system was constructed and how many samples were used during the analysis. The third part of this work covers the experiments performed, together with a short discussion of the results. Finally, future work and conclusions are given.

2 How Others See It

Our research started with a literature review. We found multiple, diverse approaches to keystroke dynamics, fingerprints, and human identity recognition on the basis of them. In [2], the authors compared three methods (k-nearest neighbors, a backpropagation neural network, and a Bayesian classifier) for pattern classification on the basis of keystroke dynamics. They tried to identify each sample on the basis of the training set. Their results showed that the Bayesian method and k-NN gave the best results in a relatively short time. The main concern with this work is the small number of users in the database, which can lead to misidentifications and distorted results.

Another interesting approach was presented in [4]. In this paper, the authors state that their database consists of 2500 samples gathered from 25 users. Their algorithm is based on two metrics: dwell time and flight time. First the results are filtered, and then they are classified. Classification is done on the basis of Principal Component Analysis (PCA). The main disadvantage of the proposed approach is its high computational complexity in the case of huge databases.

In the literature, we can also find different ideas about the elements of the feature vector that can be obtained from the way in which the user types on a keyboard. The best representative of this group is [5]. In this work, the authors proposed to construct each sample on the basis of key press intervals. Their experiments have shown that it is sometimes possible to identify a user on the basis of this feature, although not with high accuracy and satisfactory efficiency. This approach therefore does not guarantee the quality required by computer user safety procedures.


In most of the works, standard distance metrics, like Euclidean, Manhattan, or Chebyshev, are used. An approach that creates a new metric was presented in [6]. The idea proposed by the authors is based on a comparison between two histograms, each representing a sample in the database. The work presents experimental results, although they are not clear enough to properly assess their quality.

The authors also found other interesting algorithms connected with human identification on the basis of keystroke dynamics. The first of them [7] was based on Gaussian mixture modeling of keystroke patterns. The second [8] proposed the Hamming distance for human identification and for comparison between two samples. The third [9] proposed a multimodal biometrics system that identifies the user on the basis of keystroke dynamics combined with a fingerprint. Each of these works has small disadvantages. The first two need a huge amount of time to produce results for a large database, while in the third, identity is obtained mostly on the basis of the fingerprint (which returns satisfactory results) rather than keystroke dynamics. The behavioral trait was used only to confirm user identity when the fingerprint module could not properly assess it.

The second phase of the literature review was connected with fingerprints. In various conference materials and journals, we can find really interesting approaches to human identity recognition on the basis of this trait. However, a large part of these works (even if they can guarantee a high accuracy level) have high computational complexity, and information about identity is obtained only after a very long time. We start with work [10]. In this case, the authors proposed a traditional algorithm for fingerprint recognition. It starts with preprocessing (mostly based on binarization and thinning) and then minutiae extraction is performed. Most interesting is the fact that all of the minutiae found were taken into consideration in the classification phase. For this purpose, the authors used both k-nearest neighbors and artificial neural network classifiers. In comparison to our approach, the proposed method uses too much data (minutiae) to get the results in acceptable time.

The authors of [11] proposed an interesting approach to incorporate information from rare features. In the work, they described an algorithm to align latent and tenprint minutiae patterns using rare minutiae types. They claim that their method can be used alongside the traditional one because it can provide additional data for the recognition process.

A completely different idea was presented in [12]. In most works, the fingerprint is represented as an image, while in [12] it is stored in the form of a signal, because the authors used the promising technology of Radio Frequency Fingerprinting (RFF). This research is worth knowing because it describes some interesting solutions; however, we cannot compare it to our work, because its identification process differs from the way in which we deal with this task.

In the literature, we can also find some approaches based on soft computing methods. For instance, the most popular recently is deep learning. This solution was

Fingerprint and Keystroke Dynamics Fusion …

71

used in works [13, 14]. Other commonly used algorithms are support vector machines [15] and artificial neural networks [16]. In the next sections of this paper, we present our approach and experimental results. They showed that it is possible to get information about user identity with satisfactory precision level regarding the well-known rules described in the case of biometrics safety systems.

3 Methodology

The proposed methodology is divided into three main parts: keystroke dynamics, fingerprint, and multimodal system. Each of them is described in a separate subsection.

3.1 Keystroke Dynamics

The authors have recently published articles in the field of biometrics, and especially keystroke dynamics [17, 18]. In those works, we mostly concentrated on multimodal approaches within keystroke dynamics as well as on its combination with physiological traits. Before the experiments, we collected our own database of 50 users, each described by three samples. All samples were collected with one keyboard (built-in, Apple MacBook Pro, 2016) during a single day, at different times. We selected three trials because this number does not complicate the system and is enough to properly represent a user. In our experiments, only software developers were taken into account. Our users were males and females aged between 20 and 45. Each of them has experience in using a computer and uses one every day (for example, at work). During the experiments, users had to retype the same text, whose form was fixed beforehand. If a user made a mistake while typing, the text was considered incorrect and the feature vector was discarded (none of its values were taken into account). The feature vector consists of the average dwell time for each letter. The dwell time scheme is presented in Fig. 1, while Fig. 2 shows a sample feature vector. We decided to use simple dwell times because they are easy to obtain and because their simplicity reduces the computing time of feature vector classification.
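The construction of the feature vector can be illustrated with a minimal sketch. It assumes that dwell time is the interval between pressing and releasing a key, and that raw events arrive as (key, press time, release time) tuples in milliseconds; this event format and the function name `dwell_time_vector` are hypothetical and are not taken from the original C# implementation.

```python
from collections import defaultdict


def dwell_time_vector(events):
    """Return the average dwell time per key.

    events: iterable of (key, press_ms, release_ms) tuples.
    This input format is an assumption made for illustration only.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for key, press_ms, release_ms in events:
        sums[key] += release_ms - press_ms  # dwell time of one keystroke
        counts[key] += 1
    # One averaged value per letter, ordered alphabetically by key.
    return {key: sums[key] / counts[key] for key in sorted(sums)}


# Toy sample: the letter "a" is typed twice, "b" once.
sample = [("a", 0, 90), ("b", 120, 200), ("a", 250, 360)]
vector = dwell_time_vector(sample)
```

In this toy sample the two dwell times of "a" (90 ms and 110 ms) are averaged into a single value, which mirrors the one-value-per-letter layout of the feature vector described above.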

Fig. 1 Dwell time for keyboard button

Fig. 2 Sample feature vector. In the upper row, we see each of the analyzed letters while in the lower, average dwell times are presented

During our experiments, we implemented the following machine learning classifiers: k-nearest neighbors [19], k-means clustering [20], decision trees [21], and Naive Bayes [22]. All of these algorithms were implemented in C# 8.0 on the .NET Core 3.0 framework with Microsoft Visual Studio 2019. All classifiers except the decision trees were tested with the “k” parameter in the range between 1 and 20 (both inclusive). It means that we took into account only integer values (1, 2, 3, …, 20). Each classifier was also tested with different distances: Euclidean, Chebyshev, Manhattan, and Mahalanobis. Diversified methods helped us to properly evaluate the accuracy level of the proposed human identity recognition procedure. Euclidean distance was calculated as in (1), while the Chebyshev measure was determined as in (2). The Manhattan distance was evaluated as in (3) and the Mahalanobis distance as in (4). The Canberra distance was computed as in (5) and the Bray Curtis metric as in (6). The Hamming and Cosine distances are given in (7) and (8), respectively.

$$d = \sqrt{\sum_{i=1}^{n} |x_i - y_i|^2} \tag{1}$$

$$d = \max_{1 \le i \le n} |x_i - y_i| \tag{2}$$

$$d = \sum_{i=1}^{n} |x_i - y_i| \tag{3}$$

$$d = \sqrt{(\vec{x} - \vec{y})^{T} S^{-1} (\vec{x} - \vec{y})} \tag{4}$$

$$d = \sum_{i=1}^{n} \frac{|x_i - y_i|}{|x_i| + |y_i|} \tag{5}$$

$$d = \frac{\sum_{i=1}^{n} |x_i - y_i|}{\sum_{i=1}^{n} (x_i + y_i)} \tag{6}$$

$$d = \sum_{i=1}^{n} \begin{cases} 1, & x_i \ne y_i \\ 0, & x_i = y_i \end{cases} \tag{7}$$

$$d = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2} \cdot \sqrt{\sum_{i=1}^{n} y_i^2}} \tag{8}$$

where x_i is the i-th element of the first vector, y_i is the i-th element of the second vector, and S is the covariance matrix. In the case of decision trees, the following learning algorithms were compared: Iterative Dichotomiser 3 (ID3) [23] and C4.5 [24]. ID3 is a statistics-based machine learning algorithm that generates a decision tree. Its main idea is to split the dataset by the attribute that guarantees the minimum entropy, which is also the attribute with the maximum information gain. The same procedure is applied recursively to every subset with the remaining attributes. C4.5 is a modification of ID3; the main difference is that the dataset is split by the attribute with the maximum normalized information gain (also known as gain ratio). ID3 makes its decision based only on how accurate the split is, whereas C4.5 also takes into account how well the samples will be distributed after the split.
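As an illustration, the distance measures of this section can be sketched in a few lines of Python; the original implementation was written in C#, so the function names below are hypothetical. The Mahalanobis distance is omitted because it needs the inverse covariance matrix S^{-1}, and skipping zero-denominator terms in the Canberra distance is an assumption of this sketch. The last function follows the cosine formula exactly as stated in the text.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def canberra(x, y):
    # Assumption: terms whose denominator is zero are skipped.
    return sum(abs(a - b) / (abs(a) + abs(b))
               for a, b in zip(x, y) if abs(a) + abs(b) > 0)


def bray_curtis(x, y):
    return manhattan(x, y) / sum(a + b for a, b in zip(x, y))


def hamming(x, y):
    # Counts the positions at which the two vectors differ.
    return sum(1 for a, b in zip(x, y) if a != b)


def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 3.0]
distances = {f.__name__: f(x, y) for f in
             (euclidean, chebyshev, manhattan, canberra,
              bray_curtis, hamming, cosine)}
```

Because every function shares the same `(x, y)` signature, any of them can be passed as the distance argument of a classifier, which is how several metrics can be compared with one classifier implementation.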

3.2 Fingerprint

The second part of the methodology concerns fingerprints. Here, we prepared an improved approach based on [25]. The main change concerns the way in which the feature vector is calculated: in the base version we divided the image into rectangles, while in this approach we used squares. However, before feature vector extraction we had to improve the quality of the image. Each sample was obtained with a U.Are.U.5160 fingerprint scanner. The process is presented in the form of pseudocode in Algorithm 1.

Algorithm 1 Algorithm for fingerprint feature vector extraction

At the beginning, the image is loaded from its source. For this, we used a method available in the Java programming language from the ImageIO library. Afterward, we cropped the image on each side, based on the first detected black pixel. This operation removed all unnecessary white regions around the fingerprint and created the Region of Interest (ROI). In the next stage, the image was converted to black and white. For this procedure, we tested several approaches, such as Niblack [26], Otsu [27], and Bernsen [28]. The most precise results in terms of fingerprint visibility were obtained with the Otsu algorithm. After this operation, the fingerprint was clearly visible and could be used in further processing. Subsequently, we had to improve the image quality because small artifacts remained visible after binarization. For this purpose, we used median filtering with a 3 × 3 square mask. The main use of this algorithm is to remove salt-and-pepper noise, and the distortions remaining after binarization were similar to this kind of noise. In the next stage, the image was thinned with the K3M algorithm [29]; after this operation each edge is 1 pixel wide and can be used in further processing. As the last stage, minutiae were extracted and some of them were removed with a distance method: if two minutiae were too close to each other, one of them was removed. After this stage, we obtained an image with marked minutiae. The final step was the division of the image into squares, which is needed because our feature vector is created on the basis of this division. The original image and the sample after the processing stage are presented in Fig. 3.

Fig. 3 Original fingerprint image (a) and sample after processing stage (b)
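The median filtering step can be illustrated with a minimal sketch of a 3 × 3 median filter on a binary image. It is written in Python for brevity, whereas the fingerprint module itself was prepared in Java; the image representation (a list of rows of 0/1 values) and the choice to copy border pixels unchanged are assumptions of this example.

```python
def median_filter_3x3(image):
    """Apply a 3 x 3 median filter to a binary image.

    image: list of rows containing 0 (black) or 1 (white).
    Assumption: border pixels are copied unchanged.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            window = sorted(image[r + dr][c + dc]
                            for dr in (-1, 0, 1)
                            for dc in (-1, 0, 1))
            result[r][c] = window[4]  # middle of the 9 sorted values
    return result


# A white region with one isolated black pixel ("pepper" noise).
noisy = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
]
cleaned = median_filter_3x3(noisy)
```

The isolated pixel is outvoted by its eight neighbors and disappears, which is why this filter suits the small artifacts left by binarization.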

3.3 Multimodal System

In this subsection, we present the general idea of the proposed multimodal system. We constructed this kind of system because one biometric trait is sometimes not enough to identify a person reliably (the identification is sometimes too rough). We also know that behavioral traits should not be used as the only ones in biometrics systems, because they cannot guarantee high distinguishability. Our system works in a few stages. At the beginning, a fingerprint image and a keystroke dynamics sample are obtained. Afterward, each of them is processed separately, and at the end of this step each sample is described by a feature vector. In the next stage, the extracted information is sent to the classification modules. Each module provides a decision about identity on the basis of its feature vector (fingerprint and keystroke dynamics, respectively) and its database. If both decisions are exactly the same, the person is recognized with the identity pointed to by both algorithms. Otherwise, when the decisions differ, the person is recognized with the identity pointed to by the algorithm with the higher priority level. The priority level means that one of the algorithms has an additional flag by which it is considered more important. The described process is presented in Fig. 4.
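The decision rule described above can be sketched as follows. The function name `fuse_decisions` and the string-valued `priority` parameter are hypothetical; they only stand in for the priority flag mentioned in the text.

```python
def fuse_decisions(fingerprint_id, keystroke_id, priority="fingerprint"):
    """Combine the identities returned by the two classification modules.

    priority: which module wins when the decisions differ
    ("fingerprint" or "keystroke"); a hypothetical stand-in for the
    priority flag described in the text.
    """
    if fingerprint_id == keystroke_id:
        return fingerprint_id  # both modules agree
    if priority == "fingerprint":
        return fingerprint_id
    return keystroke_id


agreed = fuse_decisions("user_07", "user_07")
conflict = fuse_decisions("user_07", "user_12")
conflict_keystroke = fuse_decisions("user_07", "user_12", priority="keystroke")
```

With this rule the module without priority can never change the outcome on its own, so any gain from fusion has to come from how the two modules are configured and evaluated together.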

4 Experiments and Discussion

A significant part of the research concerns the experiments. Their main aim was to check whether a multimodal system can achieve higher identity recognition accuracy than each algorithm independently. Our tests began with keystroke dynamics: we checked which of the tested machine learning algorithms, and with which parameter values, can guarantee satisfactory results. The first experiment concerned the k-nearest neighbors algorithm. We tested different k values (from 1 to 20) and distance measures: Euclidean, Manhattan, Chebyshev, Mahalanobis, Hamming, Cosine, Canberra, and Bray Curtis. The results of this study are presented in Fig. 5. The Bray Curtis and Canberra metrics returned the best result of 73% accuracy for k in the range from 1 to 2. Both can be regarded as improvements on the Manhattan distance, which in our case reached 61% accuracy for the same k range. The Hamming distance, a very simple and fast measure, returned 55% accuracy for k between 1 and 2. It seems that the best results are obtained by taking into account only the nearest samples. Accuracies for k = 1 and k = 2 are the same due to the nature of the k-nearest neighbors algorithm: for k = 1 only the closest neighbor is taken into account, while for k = 2 the two nearest samples are taken and, when they disagree, the tie is resolved in favor of the closest one. The second experiment concerned the k-means algorithm. Once again, we tested different distance metrics, using the same set of measures as for the k-nearest neighbors algorithm (Fig. 6). The best result, 14% accuracy, was returned by the Euclidean distance. The distances that were best for k-NN, Canberra and Bray Curtis, lowered the accuracy to 12%. The best results were


Fig. 4 Multimodal biometrics system


Fig. 5 k-nearest neighbors results

Fig. 6 k-means results

obtained for k = 2. This classifier, in our case, returned the worst, practically random results, which cannot be used in a real biometrics system. For the Naive Bayes classifier, the best results were obtained for k = 1. For this value, Naive Bayes is equivalent to 1-NN; both algorithms take the first, nearest sample. The accuracy quickly drops below 10% for all distance metrics. There is a sudden rise in accuracy for the Canberra distance at k = 5, and much weaker rises at k ∈ {8, 11, 14, 17}. These values belong to the arithmetic series k = 3n + 2, n ∈ N, due to the nature of our accuracy testing method. It takes one sample out and classifies it, using the remaining samples as

Fig. 7 Naive Bayes results

a training dataset. This method is called leave-one-out. Furthermore, there are three samples from every user, which is why the best accuracy is observed at these intervals. In the case of decision trees, we achieved the best result of 55% accuracy for the Euclidean metric and a k-parameter value of 1341. Moreover, there were no observable differences between the ID3 and C4.5 learning algorithms. The Euclidean distance was followed by the Manhattan and Hamming measures with 40% accuracy at k = 1809 and k = 84, respectively. The Chebyshev distance returned 31% accuracy for k = 507, and Mahalanobis had 19% accuracy for k = 32. All of the remaining metrics (Bray Curtis, Canberra, and Cosine) returned 0% accuracy (Fig. 7). The purpose of these experiments was to check whether it is possible to identify a person on the basis of the keystroke dynamics trait with a satisfactory accuracy level, and which basic machine learning algorithm returns the best results. We start our analysis with the first question. The experiments have shown that it is possible to identify a person with an acceptable accuracy; the highest level was above 60%. This result may be connected with the fact that our database consists of samples from one specific professional group, software developers. Although the accuracy level was satisfactory, our experiments confirmed the thesis observed in other works: behavioral biometrics should be used only as a supporting trait in multimodal biometrics systems. None of the approaches presented in this paper can be used on its own in real applications, because the probability that the program will misidentify a user is too high. However, the presented solution can easily be implemented in a multimodal system as a second method of human recognition. The first trait in the proposed solution has to be physiological, because such traits are much harder to spoof.
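The leave-one-out procedure combined with k-nearest neighbors can be sketched as below. This is a simplified Python illustration under stated assumptions, not the original C# code: the tie-breaking rule (the label of the closest neighbor among the tied labels wins) follows the explanation given for k = 2, and the toy dwell-time vectors are invented for the example.

```python
from collections import Counter


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def knn_predict(train, query, k, distance):
    """train: list of (feature_vector, label) pairs."""
    neighbors = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    best = max(votes.values())
    # Neighbors are ordered by distance, so the first label that holds
    # the top vote count belongs to the closest of the tied neighbors.
    for _, label in neighbors:
        if votes[label] == best:
            return label


def leave_one_out_accuracy(samples, k, distance):
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # hold one sample out
        if knn_predict(train, features, k, distance) == label:
            correct += 1
    return correct / len(samples)


# Invented toy data: two users with three samples each.
samples = [
    ([100, 80], "u1"), ([102, 79], "u1"), ([98, 82], "u1"),
    ([140, 60], "u2"), ([138, 63], "u2"), ([142, 61], "u2"),
]
acc_k1 = leave_one_out_accuracy(samples, 1, manhattan)
acc_k2 = leave_one_out_accuracy(samples, 2, manhattan)
```

With three samples per user, holding one out leaves only two samples of the same user in the training set, so a vote over more neighbors than that necessarily includes other users.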

Table 1 Summary of the experiments

Algorithm              Highest accuracy (%)
k-nearest neighbors    73
k-means                14
Decision trees         55
Naive Bayes            73

The second question is much more interesting. We present the summary of our experiments in Table 1. The best results (that is, the most precise outcomes, with the highest accuracy level) were obtained with the k-nearest neighbors algorithm and the Naive Bayes method, both of which returned 73% accuracy. These approaches returned the best results because they rely on the samples most similar to the one currently analyzed. The k-means algorithm gave us the worst outcome, which may be connected with an inappropriate strategy selected for the initialization of each set. In the case of the last tested method, decision trees, dividing the times into small sets does not return results that can be used in a real biometrics system. The second part of the experiments concerned the accuracy of the fingerprint algorithm. In this case, we used only the k-nearest neighbors method. We used a database of 150 samples from 50 users, with each user described by three samples. The users were the same as in the keystroke dynamics experiments. Experiments were repeated 100 times with randomly created testing and training sets. The most precise results were observed when the training and testing sets consisted of 75 samples each. The average identity recognition accuracy was 92% for k = 2, which means that only the two nearest neighbors were taken into consideration. As the last experiment, we tested the accuracy of the multimodal biometrics system, which works according to the rules described in Sect. 3.3. Each experiment was repeated 100 times with randomly created testing and training sets. The most precise results were obtained with the Naive Bayes approach for keystroke dynamics, the 2-nearest neighbors algorithm for fingerprint, and the priority flag set for the physiological trait. With this configuration we observed a 96% accuracy level.
On the basis of this result, we can claim that the thesis presented at the beginning of this work was confirmed. A summary of the final results is presented in Table 2.

Table 2 Summary of the experiments connected with multimodal biometrics system

Algorithm                            Obtained accuracy (%)
Keystroke dynamics (Naive Bayes)     73
Fingerprint (k-nearest neighbors)    92
Multimodal biometrics system         96

5 Conclusions and Future Work

Biometrics is a constantly evolving branch of science. Various sensors and algorithms can be found in everyday devices such as smartphones. Moreover, multimodal biometrics systems are increasingly common in everyday use, because they can guarantee much higher efficiency than each measurable trait separately. The keystroke dynamics approach presented in this paper was implemented in a real development environment with the C# 8.0 programming language and the .NET Framework, while the fingerprint algorithm was prepared with the Java programming language and the Maven framework. The multimodal biometrics system was created as an online application on the basis of these two programs. The algorithms were tested on the authors' database of 50 users and 150 samples. Each user was described by three numeric samples consisting of the dwell time of each key for keystroke dynamics, and by three fingerprint images. In the future, we will consider sharing this data with other interested scientific teams or making it available online. In this work, we used four basic machine learning algorithms to evaluate user identity with keystroke dynamics: k-nearest neighbors, k-means, the Naive Bayes classifier, and decision trees. k-nearest neighbors was also used for fingerprint recognition. Each of them was tested due to its low complexity and high accuracy in other biometrics approaches (especially other behavioral approaches). Our experiments have confirmed the main thesis: a multimodal biometrics system based on the fusion of keystroke dynamics and fingerprint can guarantee higher accuracy than either of them separately. The authors' current work is to prepare algorithms that improve the results and extract more complex feature vectors. We are considering, for instance, letter pairs (digraphs such as rz or ch in Polish) as well as different minutiae types for fingerprint. We are also considering some artificial intelligence solutions, for example, artificial neural networks or evolutionary algorithms. In the future, we will also implement our solution in hardware, using a hardware description language such as VHDL on FPGA devices.

Acknowledgments This work was partially supported by grants S/WI/3/2018 and WI/WI/2/2019 from Białystok University of Technology and funded with resources for research by the Ministry of Science and Higher Education in Poland.

References

1. Saeed, K., Nagashima, T.: Biometrics and Kansei Engineering. Springer (2012)
2. Cho, T-H.: Pattern classification methods for keystroke analysis. In: SICE-ICASE International Joint Conference 2006, Proceedings, Bexco, Busan, Korea, 18–21 Oct 2006
3. Pentel, A.: High precision handedness detection based on short input keystroke dynamics. In: IEEE 2017 8th International Conference on Information, Intelligence, Systems and Applications, IEEE Proceedings, Larnaca, Cyprus, 27–30 Aug 2017
4. Meszaros, A., Banko, Z., Czuni, L.: Strengthening passwords by keystroke dynamics. In: 2007 IEEE International Workshop on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, IEEE Proceedings, Dortmund, Germany, 6–8 Aug 2007
5. Giroux, S., Wachowiak-Smolikova, R., Wachowiak, M.: Keystroke-based authentication by key press intervals as a complementary behavioral biometric. In: 2009 IEEE International Conference on Systems, Man and Cybernetics, IEEE Proceedings, San Antonio, USA, Oct 2009
6. Davoudi, H., Kabir, E.: A new distance measure for free text keystroke authentication. In: 2009 IEEE 14th International CSI Computer Conference, IEEE Proceedings, Teheran, Iran, 1–2 July 2009
7. Hosseinzadeh, D., Krishnan, S.: Gaussian mixture modeling of keystroke patterns for biometric applications. IEEE Trans. Syst. Man Cybern. 38(6), 816–826 (2008)
8. Kaneko, Y., Kinpara, Y., Shiomi, Y.: A Hamming distance-like filtering in keystroke dynamics. In: 2011 IEEE 9th International Conference on Privacy, Security and Trust, IEEE Proceedings, Montreal, Canada, 19–21 July 2011
9. Vinoth Kumar, G., Govinth Raj, S., Prasanth, K., Sarathi, S.: Fingerprint based authentication system with keystroke dynamics for realistic user. In: IEEE 2nd International Conference on Current Trends in Engineering and Technology, Proceedings, Coimbatore, India, pp. 206–209, 8 July 2014
10. Sagayam, K., Ponraj, D., Winston, J., Yaspy J. C., Esther Jeba D., Clara, A.: Authentication of biometric system using fingerprint recognition with euclidean distance and neural network classifier. Int. J. Innov. Technol. Explor. Eng. 8(4), 766–771 (2019)
11. Krish, R., Fierrez, J., Ramos, D., Alonso-Fernandez, F., Bigun, J.: Improving automated latent fingerprint identification using extended minutiae types. Inf. Fusion 50, 9–19 (2019)
12. Yu, J., Hu, A., Zhou, F., Xing, Y., Yu, Y., Li, G., Peng, L.: Radio frequency fingerprint identification based on denoising autoencoders. http://arxiv.org/abs/1907.08809v1 [eess.SP] 20 Jul 2019
13. Uliyan, D., Sadeghi, S., Jalab, H.: Anti-spoofing method for fingerprint recognition using patch based deep learning machine. Eng. Sci. Technol. Int. J. https://doi.org/10.1016/j.jestch.2019.06.005. July (2019)
14. Zeng, F., Hu, S., Xiao, K.: Research on partial fingerprint recognition algorithm based on deep learning. Neural Comput. Appl. 31(9), 4789–4798 (2019)
15. Singh, M., Girdhar, A.: Fingerprint enhancement using wavelet transformation and differential support vector machine. In: IEEE 2018 International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 11–12 July 2018
16. Li, J., Feng, J., Jay Kuo, C.-C.: Deep convolutional neural network for latent fingerprint enhancement. Sig. Process. Image Commun. 60, 52–63 (2018)
17. Panasiuk, P., Szymkowski, M., Dąbrowski, M., Saeed, K.: A multimodal biometric user identification system based on keystroke dynamics and mouse movements. In: Saeed, K., Homenda, W. (eds.) 15th IFIP International Conference on Computer Information Systems and Industrial Management, CISIM 2016, Vilnius, Lithuania, Springer, Sep 2016. Lecture Notes in Computer Science, LNCS-9842, pp. 672–681
18. Panasiuk, P., Rybnik, M., Saeed, K.: Authentication with dynamics using fixed text keystroke. In: 2009 IEEE International Conference on Biometrics and Kansei Engineering, ICBAKE 2009, Proceedings, Cieszyn, pp. 70–75, 25–28 June 2009
19. Bezdek, J.C., Chuah, S.K., Leep, D.: Generalized k-Nearest neighbor rules. Fuzzy Sets Syst. 18(3), 237–256 (1986)
20. Li, Y., Wu, H.: A clustering method based on k-means algorithm. Phys. Proced. 25, 1104–1109 (2012)
21. Topirceanu, A., Grosseck, G.: Decision tree learning used for the classification of student archetypes in online courses. Proced. Comput. Sci. 112, 51–60 (2017)
22. Zhang, W., Gao, F.: An improvement to naive bayes for text classification. Proced. Eng. 15, 2160–2164 (2011)
23. Xiaohu, W., Lele, W., Nianfeng, L.: An application of decision tree based on ID3. Phys. Proced. 25, 1017–1021 (2012)
24. Cherfi, A., Nouira, K., Ferchichi, A.: Very fast C4.5 decision tree algorithm. Appl. Artif. Intell. 32(2), 119–137 (2018)
25. Szymkowski, M., Saeed, K.: A novel approach to fingerprint identification using method of sectoralization. In: ICBAKE 2017—International Conference on Biometrics and Kansei Engineering, IEEE Proceedings, Kyoto, Japan, 15–17 Sept 2017
26. Samorodova, O., Samorodov, A.: Fast implementation of the Niblack binarization algorithm for microscope image segmentation. Pattern Recogn. Image Analy. 26(3), 548–551 (2016)
27. Qu, Z., Zhang, L.: Research on image segmentation based on the improved Otsu algorithm (2010). https://doi.org/10.1109/IHMSC.2010.157
28. Eyupoglu, C.: Implementation of Bernsen’s locally adaptive binarization method for gray scale images. In: International Science and Technology Conference, Vienna, Austria, pp. 621–625, 13–15 July 2016
29. Tabędzki, M., Saeed, K., Szczepański, A.: A modified K3M thinning algorithm. Int. J. Appl. Math. Comput. Sci. 26(2), 439–450 (2016)

Contributed Papers

QoS Enhancement in WBAN with Twin Coordinators

Sriyanjana Adhikary, Biswajit Ghosh, and Sankhayan Choudhury

Abstract WBAN is an emerging area that has the potential to revolutionize healthcare. WBAN nodes are highly constrained in terms of energy and other resources. WBANs supporting healthcare applications rely on wireless communication for real-time monitoring, so delivering data to the intended location within a specified delay threshold is a major concern. Various researchers are working to fully utilize the potential of WBAN in the healthcare domain and to improve its QoS, but none of the existing solutions has proved truly efficient yet. This research focuses on enhancing the stability period, packet delivery ratio (PDR), and remaining energy of the network by decreasing the end-to-end delay. The concept of Anycasting has been modified for a WBAN topology consisting of eight sensors and two coordinators on the human body. The performance of the proposed protocol is compared with other state-of-the-art protocols: Anycasting in dual sink approach (ACIDS), link-aware and energy-efficient scheme for body area networks (LAEEBA), and destination-assisted routing enhancement (DARE). The results show that the modification delivers better quality of service (QoS) in terms of stability period, packet delivery ratio, and end-to-end delay than DARE and LAEEBA, and in a few cases than ACIDS.

Keywords WBAN · Stability period · PDR · End-to-end delay · Remaining energy

S. Adhikary
Department of Information Technology, Jadavpur University, Kolkata, India
e-mail: [email protected]

B. Ghosh (B)
Department of Information Technology, FIEM, Kolkata, West Bengal, India
e-mail: [email protected]

S. Choudhury (B)
Department of Computer Science and Engineering, University of Calcutta, Kolkata, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021
R. Chaki et al. (eds.), Advanced Computing and Systems for Security, Advances in Intelligent Systems and Computing 1178, https://doi.org/10.1007/978-981-15-5747-7_6

S. Adhikary et al.

1 Introduction

The rising population of India, along with the cost of medical services, demands an alternative solution to improve the patient healthcare system. This demand triggered the use of WBAN technology in the medical domain. Recent developments in wireless technology, sensor network technology, embedded systems, and extensive computing have raised the potential of the wireless body area network (WBAN) to enhance the healthcare sector [1, 2]. A WBAN comprises miniaturized sensors on, in, or near the human body. The data rate and power consumption of implanted devices are comparatively lower than those of wearable ones [3, 4]. These sensors sense and transmit data via a wireless communication channel, generally toward a sink node called the coordinator or gateway node. The coordinator forwards the data to a personal digital assistant (PDA), which conveys the data to a remote location depending on the application. The basic architecture of WBAN is shown in Fig. 1. In WBAN, most of the energy lost by a node is due to data transmission from sensor to sink, so the transmission energy must be used pragmatically. This demands an energy-efficient routing protocol that supports mobility. Although many communication protocols are available for wireless sensor networks (WSN), WBAN differs fundamentally in its network requirements. The differences are as follows:

Fig. 1 Basic architecture of WBAN

1. Varying bandwidth of communication through body rather than space.
2. Energy-constrained nodes, one for each health parameter.
3. Strict constraint for power consumption.
4. Being harmless to human tissue.

These constraints call for a different set of protocols in WBAN. Usually, in a WBAN architecture, a single sink is responsible for collecting data from the sensors. Thus, sensors either have to increase their power to maintain a star topology and send their data directly to the sink, or relay the data toward the sink. Increased power for direct transmission dissipates their energy faster, resulting in a short stability period. Forwarding the data increases the hop count and hence the delay, due to extra processing and relaying by intermediate sensors. Moreover, the forwarder sensors become overburdened and tend to lose their limited energy in the relaying process. To solve the problem of “sensor-node-as-forwarder,” special nodes called relay nodes can be incorporated. This solution still causes a delay in data delivery, which can be fatal in an emergency. Another way to utilize the energy of the nodes efficiently is to use dual sinks in the WBAN architecture. This is a common approach in WSN, where data is forwarded to the nearest sink [5] in an Anycast manner, but it is yet to be utilized in WBAN. Therefore, in this study, we address this approach and observe how the basic QoS parameters of WBAN, namely packet delivery ratio (PDR), end-to-end delay, stability period, and energy consumption, are affected. In WBAN, high mobility leads to poor reliability. For efficient WBAN operation, the optimum placement of the gateway nodes is also taken into account in this paper. The rest of the paper is organized as follows. Section 2 highlights the related work. Section 3 presents the proposed protocol, followed by simulation results in Sect. 4, and the conclusion is given in Sect. 5.

2 Related Work

Researchers have made considerable efforts to meet the challenging requirements of WBAN. In [6], the authors used relay-based cooperation to transmit critical data in emergency situations. Every intermediate node keeps a copy of the packet to reduce packet drops, but this increases the network delay, which is undesirable. A protocol called DARE was proposed in [7] for the scenario of a ward with eight patients, each equipped with seven sensors that monitor vital parameters. This protocol only optimizes the net energy consumption and ignores other, equally important QoS parameters of WBAN. The authors of [8] proposed a mathematical model for selecting the cluster head in order to balance energy utilization and enhance the stability period. However, due to the high mobility of WBAN, this protocol also failed to establish a benchmark. The THE-FAME protocol [9] uses multiple sink nodes placed on different sides in a soccer game to measure the fatigue of the players. In this protocol, direct transmission is enforced


when the threshold is reached. This depletes the energy level of a node completely due to high power usage. As a result, the node tends to “die out” faster. Authors in [4] put forward a protocol called LAEEBA where they have used both direct and multi-hop communication based on path loss component to minimize only the energy consumption of the node. Authors in [10] have revised the LAEEBA protocol and proposed Co-LAEEBA. Cooperative knowledge of the path loss component, residual energy, and intermediate distance between the sender and the coordinator helped to find a feasible route to the destination. Both [4, 10] have unsatisfactory PDR for WBAN. In paper [11], a renowned thermal-aware WBAN routing protocol called M-Attempt has been suggested. It looks for hotspot, bypasses data from the hotspot, and forwards data via a different path. Moreover, it supports mobility. But it keeps a node inactive which is below the threshold. Thus, data for that particular sensor becomes inaccessible. This may turn fatal in emergency situation. Author in [3] initiated the modification of WBAN architecture by introducing dual coordinators. But they did not take into account the favorable placement of them on human body. Moreover, data is transferred to both the coordinators in LoS of the sensor. This resulted in energy wastage and early dead node formation. The optimum placement of the coordinator in WBAN is a crucial issue which affects the PDR of the system. Few attempts have been made in the past to solve this problem. In [12], relaying and cooperation were the two mechanisms used to maximize the network lifetime. They have not optimized the relay nodes. Thus, the positioning of relay nodes was fixed. References [13, 14] solved the problem of relay optimization using integer linear programming model. They have also proposed a routing algorithm minimizing the installation cost and energy consumption. 
However, relaying is not an optimal solution in WBAN because of the high mobility of the patients. References [15, 16] designed their experiments to maximize the PDR of the system. The interval between data transmissions of each node is taken as 8 s, which is not appropriate for WBAN. References [17, 18] used a mesh WBAN in which the subject was instructed to walk for 3 s in order to carry out the simulation. This duration is too short to represent the impact of their architecture on human mobility. None of these studies dealt with the ideal placement of twin coordinators on the human body.

3 Proposed Protocol

3.1 Assumptions

We propose a routing protocol based on the following assumptions:

1. All nodes possess only group mobility, i.e., nodes move with a similar pattern as a group.
2. All nodes are intelligent enough to calculate their residual energy.
3. All nodes can compute the link reliability between themselves and their neighboring nodes.

These assumptions ensure that nodes moving similarly get engaged to a closer relay or coordinator, providing a more stable topology. Intelligent nodes guarantee that they will not engage themselves in any route.

3.2 Different Phases of the Proposed Protocol

3.2.1 Network Topology

In a WBAN, the mobility of a person is not restricted, though the network parameters need to be strictly maintained. A conventional WBAN has a single sink that collects vital signs from the sensors in either single-hop or multi-hop fashion, depending on whether the sink is within the transmission range of the sender. Thus, the solitary coordinator gets overburdened and, under the traffic model of [19], the risk of a single point of failure becomes much higher. According to this traffic model, with a single coordinator the trade-off between star and tree topology is difficult, as the latter performs better in terms of throughput whereas the former achieves lower end-to-end delay. We therefore propose an architecture comprising twin coordinators C1 and C2 with eight sensors. To find the optimum position, we placed the coordinators in two different sets of positions on the body based on the line of sight (LoS). We evaluated the performance metrics of WBAN for both experimental setups and conclude that the second set of coordinator positions is better than the first.

3.2.2 Initialization Phase

Initially, both coordinators C1 and C2 send a "HELLO" message containing their Id and location. The location is fixed by the user, as the coordinators are static. On receiving the message, the sensor nodes record this information and broadcast their Node-Id, Packet-Id, and residual energy level. Thus, all nodes become aware of their neighbors.

3.2.3 Next Hop Selection

When neither coordinator is within direct range of the sender, the neighboring node with the maximum cost according to Eq. 1 is selected as the next hop. Eqs. 2–10 are used to evaluate Eq. 1 [20]. A relay is selected in this way if and only if neither coordinator is within direct range of the sender.


When both coordinators are within direct range of the sender, the sender uses the received signal strength indicator (RSSI) model of [21] to calculate Eq. 11. The coordinator whose $P_r$ value is maximum is chosen as the receiver.

$$Cost_{ij} = C_E \left(\frac{E_{res,j}}{E_{init,j}}\right) + C_L \, LR_{ij} \tag{1}$$

$$LR_{ij} = (1 - \lambda)\, LR_{ij} + \lambda \left(\frac{Tx_{succ,ij}}{Tx_{total,ij}}\right) \tag{2}$$

$$E_{res,i} = E_{init,i} - E_{con,i} \tag{3}$$

Table 1 Notations

Symbol              Description
$Cost_{ij}$         Cost computed to select the next hop node
$C_E$, $C_L$        Constant coefficients
$E_{res,i}$         Residual energy of node i
$E_{init,i}$        Initial energy of node i
$LR_{ij}$           Link reliability between two nodes
$Tx_{succ,ij}$      Number of packets successfully transmitted through the link between nodes $S_i$ and $S_j$
$Tx_{total,ij}$     Total number of transmission and retransmission attempts for all packets
$E_{con,i}$         Energy consumed by node i
$E_{SB}$            Average energy consumed at a node during backoff for a successful transmission
$E_{SC}$            Average energy consumed at a node during collision for a successful transmission
$E_{Tx}$            Average energy consumed at a node when transmitting a packet
$T_{SB}$            Average backoff time at a node for a successfully transmitted packet
$N_C$               Average number of collisions for a successfully transmitted packet
$T_C$               Average collision time
$T_{Tx}$            Average time to transmit a packet
$T_{DB}$            Average backoff time for a dropped packet
$E_D$               Average energy consumed by a node due to a dropped packet
$E_{DB}$            Average energy consumed at a node during backoff for a dropped packet
$E_{DC}$            Average energy consumed at a node due to collision for a dropped packet
$R$                 Maximum number of retransmissions of a packet
$P_r$               Received power of the wireless signal
$P_t$               Transmitted power of the wireless signal
$d$                 Distance between sender and receiver nodes
$n$                 Transmission factor between sender and receiver, whose value is environment dependent

$$E_{con,i} = E_{SB} + E_{SC} + E_{Tx} \tag{4}$$

$$E_{SB} = LR_{ij} \, T_{SB} \tag{5}$$

$$E_{SC} = N_C \left[ P_{Rx-Tx} \, T_{Rx-Tx} + T_C \, P_{Tx} + P_{Tx-Rx} \, T_{Tx-Rx} \right] \tag{6}$$

$$E_{Tx} = T_{Tx} \, P_{Tx} \tag{7}$$

$$E_{DB} = LR_{ij} \, T_{DB} \tag{8}$$

$$E_{DC} = (R + 1) \left[ P_{Rx-Tx} \, T_{Tx-Rx} + T_C \, P_{Tx} + P_{Tx-Rx} \, T_{Tx-Rx} \right] \tag{9}$$

$$E_D = E_{DB} + E_{DC} \tag{10}$$

$$P_r = P_t \left(\frac{1}{d}\right)^{n} \tag{11}$$

The notations used are given in Table 1.

3.2.4 Routing

The flowchart of the proposed routing algorithm is given in Fig. 2.

Fig. 2 Routing algorithm flowchart
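The receiver selection of Sect. 3.2.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `Neighbor`, `link_reliability`, `cost`, `received_power`, and `choose_receiver` are hypothetical, the default values of $C_E$, $C_L$, $\lambda$, $P_t$, and $n$ are placeholders that the chapter does not specify, and the case where exactly one coordinator is in range is handled here by sending to that coordinator, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    node_id: str
    e_res: float     # residual energy E_res,j in joules
    e_init: float    # initial energy E_init,j in joules
    lr: float        # previous link reliability LR_ij
    tx_succ: int     # packets successfully delivered over the link
    tx_total: int    # transmission plus retransmission attempts


def link_reliability(nb: Neighbor, lam: float) -> float:
    """Eq. 2: weighted average of the old estimate and the success ratio."""
    if nb.tx_total == 0:  # no attempts yet, keep the old estimate
        return nb.lr
    return (1 - lam) * nb.lr + lam * (nb.tx_succ / nb.tx_total)


def cost(nb: Neighbor, c_e: float, c_l: float, lam: float) -> float:
    """Eq. 1: energy fraction and link reliability, weighted by C_E and C_L."""
    return c_e * (nb.e_res / nb.e_init) + c_l * link_reliability(nb, lam)


def received_power(p_t: float, d: float, n: float) -> float:
    """Eq. 11: P_r = P_t * (1/d)^n."""
    return p_t * (1.0 / d) ** n


def choose_receiver(coord_dist, neighbors, p_t=1.0, n=2.0,
                    c_e=0.5, c_l=0.5, lam=0.5):
    """coord_dist maps the Id of each coordinator in range to its distance.

    Assumed parameter values are placeholders, not values from the chapter.
    """
    if coord_dist:
        # At least one coordinator in range: pick the one with maximum P_r.
        return max(coord_dist,
                   key=lambda c: received_power(p_t, coord_dist[c], n))
    # No coordinator in range: relay through the neighbor with maximum cost.
    return max(neighbors, key=lambda nb: cost(nb, c_e, c_l, lam)).node_id
```

Calling `choose_receiver({"C1": 0.8, "C2": 0.4}, neighbors)` selects the closer coordinator, while an empty dictionary falls back to the cost-based relay choice.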


4 Performance Evaluation

4.1 Setting and Configuration

The performance of the proposed protocol is studied and compared with ACIDS [3], LAEEBA [4], and DARE [7] in the CASTALIA simulator [22]. Two main experiments were performed to assess the network. In each experiment, eight sensors were deployed in a 3 × 3 m² body area with two coordinators. In Experiment 1, C1 was located at the right lumbar and C2 at the left lumbar, as shown in Fig. 3. In Experiment 2, the position of C1 remained unaltered, whereas C2 was placed at the right collar bone instead of the left lumbar, as shown in Fig. 4. The simulation parameters are listed in Table 2. The IEEE 802.15.6 standard is used for the MAC layer with the CSMA/CA mechanism.

Fig. 3 Setup for Experiment 1


Fig. 4 Setup for Experiment 2

Table 2 Simulation parameters

Parameter                   Value
Tx-Rx, Rx-Tx                0.02 s
Tx (power consumption)      3 mW
Rx (power consumption)      3 mW
Minimum supply voltage      1.9 V
Wavelength (λ)              0.135 m
Frequency (f)               2.4 GHz
Initial energy (Eo)         18,720 J
Battery capacity            560 mAh


4.2 Performance Metrics

To evaluate the performance of the proposed routing protocol, the following metrics were used:

1. Stability Period: The duration of network operation before the first node depletes all its energy is termed the stability period. In WBAN, only one node is allowed to collect a particular physiological parameter, and there is no backup for a node. If a node dissipates its entire energy and dies out, its function stops, which may be fatal in WBAN. Thus, a high stability period is essential in this type of network.
2. End-to-End Delay: The time lag between the sender node sending a packet and the receiver node receiving it is termed the delay. As WBAN deals with health-related data, delayed delivery may be fatal. Thus, decreasing the delay is a crucial factor in WBAN.
3. Packet Delivery Ratio: The ratio of the number of packets received by the coordinator to the number of packets sent by the sender. The higher the PDR, the better, because this type of network handles health data and packet drops are not acceptable.
4. Remaining Energy: The amount of energy left from the initial energy is called the remaining energy. It decides whether a node can participate in the network. Only one node is allotted to collect each health parameter, so if a node becomes inactive, the value of that parameter is no longer gathered, which may be disastrous. The energy of the nodes in WBAN should therefore be used prudently.
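As a rough illustration of how these four metrics could be computed from simulation logs, consider the following sketch. The function names and the log formats (a list of node death times, dictionaries of send and receive timestamps keyed by packet Id) are assumptions made for this example and are not taken from the CASTALIA setup. PDR is computed as received divided by sent, so that a higher value is better.

```python
def stability_period(death_times):
    """Time at which the first node dies; None marks a node still alive."""
    dead = [t for t in death_times if t is not None]
    return min(dead) if dead else None


def packet_delivery_ratio(sent, received):
    """Packets received by the coordinator divided by packets sent."""
    if sent <= 0:
        raise ValueError("at least one packet must have been sent")
    return received / sent


def mean_end_to_end_delay(send_times, recv_times):
    """Average of (receive time - send time) over delivered packets."""
    delays = [recv_times[p] - send_times[p]
              for p in recv_times if p in send_times]
    return sum(delays) / len(delays) if delays else None


def remaining_energy(e_init, e_consumed):
    """Energy left from the initial energy, in joules."""
    return e_init - e_consumed
```

For example, with 200 packets sent and 180 received, `packet_delivery_ratio(200, 180)` gives 0.9.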

4.3 Simulation Results

According to Fig. 5, the proposed protocol performs much better than LAEEBA and DARE. Its performance is on par with ACIDS until 9 × 10^3 s in Experiment 1, and it outperforms ACIDS at 10^4 s, as the last sensor of the network is still alive. In Experiment 2, the number of active nodes in the network is much higher with the proposed protocol than with ACIDS. This result was expected because, in the proposed protocol, each sensor sends its data only to the coordinator with the greater RSSI value, unlike ACIDS, which sends the data to both coordinators. Minimizing delay is an essential QoS requirement in WBAN, as it deals with human vitals. According to Fig. 6, LAEEBA and DARE have a high end-to-end delay because they use only a single sink. The presence of dual coordinators lowers the delay in both ACIDS and the proposed protocol, as most of the sensors are within direct communication range of the coordinators. In Experiment 1, ACIDS initially performs better than the proposed protocol. In the proposed protocol, some extra calculations are done to send data


Fig. 5 Stability period versus time

Fig. 6 End-to-end delay versus time

Fig. 7 PDR versus Time

to one of the coordinators. So the delay of the proposed protocol is initially higher than that of ACIDS, but later the two are almost the same. In Experiment 2, the position of the second coordinator is better; hence, the delay decreases. As the stability period of ACIDS and the proposed protocol is better than that of LAEEBA and DARE, their PDR is much higher in both experiments, as shown in Fig. 7. In Experiment 1, the proposed protocol is at least as good as ACIDS until 6 × 10^3 s. Later, as the number of dead nodes in ACIDS increases, its PDR decreases. In Experiment 2, the position of C2 is more favorable than in Experiment 1, so the number of active nodes is higher and the PDR of the proposed protocol is significantly higher. As shown in Fig. 8, DARE is the most energy-efficient protocol, whereas LAEEBA is the least. DARE did not consider other basic parameters of WBAN such as stability period, delay, and throughput. Thus, the remaining energy of the other protocols is comparatively lower, as they trade energy off against these other parameters. In the proposed protocol, the sender sends the data to only one of the coordinators, unlike ACIDS. So, it uses energy much more efficiently than ACIDS in both experimental setups.


Fig. 8 Remaining energy versus time

5 Conclusion

This research focuses on increasing the stability period, PDR, and remaining energy while decreasing the end-to-end delay. We have proposed a change in topology by introducing dual coordinators. A modified concept of anycasting was used to route data from the sender to a coordinator. We implemented the concept using two coordinators and eight sensors on the human body. To find an optimum position for the coordinators, we simulated two different setups. We compared the proposed protocol with state-of-the-art protocols, namely ACIDS, LAEEBA, and DARE, for both setups. Simulation results show that the setup of Experiment 2 is more favorable for the proposed protocol. It outperforms LAEEBA and DARE and is at least as good as ACIDS in terms of PDR, stability period, and end-to-end delay. Due to the availability of more than one coordinator, the sender can send data to the appropriate one. Thus, topologically, most of the nodes connect to a sink in star fashion. This results in better efficiency of the proposed protocol compared with the others. In some cases, ACIDS performed better than the proposed protocol because of the extra processing the latter had to do to choose between multiple sinks. Overall, however, the proposed protocol performed much better than ACIDS. In future, we plan to incorporate other QoS parameters such as security, interference, and scalability into this concept to ease the challenges of WBAN.

References

1. Ullah, S., Higgins, H., Braem, B., Latre, B., Blondia, C., Moerman, I., Saleem, S., Rahman, Z., Kwak, K.S.: A comprehensive survey of wireless body area networks. J. Med. Syst. 36(3), 1065–1094 (2012)
2. Latré, B., Braem, B., Moerman, I., Blondia, C., Demeester, P.: A survey on wireless body area networks. Wirel. Netw. 17(1), 1–18 (2011)
3. Ullah, N., Hadi, F., Ahmed, S., Hanan, A., Ahmed, I.: Muhammad Rahim Baig: Anycasting in dual sink approach (ACIDS) for WBASNs. Int. J. Adv. Comput. Sci. Appl. 8(3), 257–263 (2017)
4. Khan, Z.A., Rasheed, M.B., Javaid, N., Robertson, B.: Effect of packet inter-arrival time on the energy consumption of beacon enabled MAC protocol for body area networks. Proc. Comput. Sci. 32, 579–586 (2014)
5. e Hadi, F., Minhas, A.A.: EAA: energy aware anycast routing in wireless sensor networks. J. Eng. Appl. Sci. 30(1) (2011)
6. Yousaf, S., Ahmed, S., Akbar, M., Javaid, N., Khan, Z.A., Qasim, U.: Incremental relay-based co-cestat protocol for wireless body area networks. In: 2014 Ninth International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA). IEEE, pp. 113–119 (2014)
7. Tauqir, A., Javaid, N., Akram, S., Rao, A., Mohammad, S.N.: Distance aware relaying energy-efficient: DARE to monitor patients in multi-hop body area sensor networks. In: 2013 Eighth International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA). IEEE, pp. 206–213 (2013)
8. Nadeem, Q., Javaid, N., Mohammad, S.N., Khan, M.Y., Sarfraz, S., Gull, M.: SIMPLE: stable increased-throughput multi-hop protocol for link efficiency in wireless body area networks. In: 2013 Eighth International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA). IEEE, pp. 221–226 (2013)
9. Akram, S., Javaid, N., Tauqir, A., Rao, A., Mohammad, S.N.: THE-FAME: threshold based energy-efficient fatigue measurement for wireless body area sensor networks using multiple sinks. In: 2013 Eighth International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA). IEEE, pp. 214–220 (2013)
10. Ahmed, S., Javaid, N., Yousaf, S., Ahmad, A., Sandhu, M.M., Imran, M., Khan, Z.A., Alrajeh, N.: Co-LAEEBA: cooperative link aware and energy efficient protocol for wireless body area networks. Comput. Hum. Behav. 51, 1205–1215 (2015)
11. Javaid, N., Abbas, Z., Fareed, M.S., Khan, Z.A., Alrajeh, N.: M-ATTEMPT: a new energy-efficient routing protocol for wireless body area sensor networks. Proc. Comput. Sci. 19, 224–231 (2013)
12. Reusens, E., Joseph, W., Latré, B., Braem, B., Vermeeren, G., Tanghe, E., Martens, L., Moerman, I., Blondia, C.: Characterization of on-body communication channel and energy efficient topology design for wireless body area networks. IEEE Trans. Inf. Technol. Biomed. 13(6), 933–945 (2009)
13. Elias, J., Mehaoua, A.: Energy-aware topology design for wireless body area networks. In: 2012 IEEE International Conference on Communications (ICC). IEEE, pp. 3409–3410 (2012)
14. Elias, J., Jarray, A., Salazar, J., Karmouch, A., Mehaoua, A.: A reliable design of wireless body area networks. In: 2013 IEEE Global Communications Conference (GLOBECOM). IEEE, pp. 2742–2748 (2013)
15. Natarajan, A., De Silva, B., Yap, K.-K., Motani, M.: To hop or not to hop: network architecture for body sensor networks. In: 6th Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks, SECON'09. IEEE, pp. 1–9 (2009)
16. Natarajan, A., De Silva, B., Yap, K.-K., Motani, M.: Link layer behavior of body area networks at 2.4 GHz. In: Proceedings of the 15th Annual International Conference on Mobile Computing and Networking. ACM, pp. 241–252 (2009)
17. D'Errico, R., Rosini, R., Maman, M.: A performance evaluation of cooperative schemes for on-body area networks based on measured time-variant channels. In: 2011 IEEE International Conference on Communications (ICC). IEEE, pp. 1–5 (2011)
18. Hamida, E.B., D'Errico, R., Denis, B.: Topology dynamics and network architecture performance in wireless body sensor networks. In: 2011 4th IFIP International Conference on New Technologies, Mobility and Security (NTMS). IEEE, pp. 1–6 (2011)
19. Alshamsi, H.S., Al-Shamisi, H.S., Mengi, H.: Traffic modeling of wireless body area network
20. Jacob, A.K., Jacob, L.: Energy efficient MAC for QoS traffic in wireless body area network. Int. J. Distrib. Sens. Netw. 11(2), 404182 (2015)
21. Zhou, Y., Sheng, Z., Mahapatra, C., Leung, V.C.M., Servati, P.: Topology design and cross-layer optimization for wireless body sensor networks. Ad Hoc Netw. 59, 48–62 (2017)
22. Castalia (2006). https://omnetpp.org/download-items/Castalia.html

A Distributed Power Control Scheme for Device-to-Device Communication in Cellular Networks Udit Narayana Kar, Debarshi Kumar Sanyal, and Monideepa Roy

Abstract In the current generation of cellular communications, two devices can communicate directly bypassing the core network. This mechanism is known as device-to-device communication. Enhanced throughput, higher spectral efficiency, and minimum delay are the major features of D2D communication. Simultaneous communication of multiple D2D users in a single cell gives rise to interference. In this paper, we have presented a distributed power control algorithm to mitigate the interference in a macrocell of D2D and cellular users. We perform numerical simulations to analyze the performance of the algorithm.

Keywords Device-to-device communication · Power allocation · Data rate · Cellular network

1 Introduction

The enormous rise in the number of smart mobile devices has sparked rapid progress in cellular technology and services. At the same time, users' demands for higher data rates have increased exponentially. Device-to-device (D2D) communication is a promising mechanism for managing this ever-increasing demand. D2D communication allows two users in close proximity to bypass the core network and communicate directly with each other. Bluetooth and Wi-Fi are popular short-range communication techniques of this kind, in which users communicate directly using the industrial, scientific, and medical (ISM) band [3].

U. N. Kar (B) · D. K. Sanyal · M. Roy
Kalinga Institute of Industrial Technology (Deemed to be University), Bhubaneswar 751024, Odisha, India

© Springer Nature Singapore Pte Ltd. 2021
R. Chaki et al. (eds.), Advanced Computing and Systems for Security, Advances in Intelligent Systems and Computing 1178, https://doi.org/10.1007/978-981-15-5747-7_7


Earlier, in conventional cellular communication, there was no provision for direct communication between two users. Meanwhile, to support a larger subscriber base and to satisfy the throughput and delay budget, cellular operators started to use D2D communication in cellular networks. D2D communication is expected to play a vital role in the 5G era [4, 8]. Known as LTE-Direct, D2D communication is standardized by the third-generation partnership project (3GPP) Release 12 [5, 9]. D2D users in cellular communication can use both licensed and unlicensed spectrums for communication. The licensed spectrum used by D2D users is the same as that of the cellular users. Higher bit rates, reduced delay, and low energy consumption are some of the advantages of D2D communication in a cellular network. At the same time, sharing of the spectrum across D2D and cellular users can lead to a high co-channel interference. To avoid the interference, separate spectrum resources can be allocated for both D2D users and cellular users. In this case, multiple D2D users can use dedicated spectrum resources to communicate among them. In this type of communication, there arises a huge interference among the D2D users. One way to handle interference is through effective power control mechanism. In this paper, we have considered a system model where D2D users and cellular users communicate using separate spectrum resources. Utilizing separate spectrum resources decreases the interference between the D2D users and cellular users. However, it increases the interference among the D2D users. In this paper, we have presented a distributed power control approach to mitigate the interference between the users in each category (i.e., cellular and D2D) separately. In our analysis, the value of the transmission gain parameter is kept different for cellular users and D2D users. We have conducted the experiments for two scenarios. 
In the first, we have considered multiple D2D users communicating among themselves. In the second scenario, we have considered multiple D2D users and multiple cellular users communicating simultaneously in a single cell. Cellular and D2D users use separate dedicated spectrum. Numerical simulations are carried out in MATLAB and the results generated are presented graphically. The rest of the paper is organized as follows. Section 2 presents the related work, Sect. 3 presents the system model. Mathematical modeling of the power control algorithm is presented in Sect. 4. Experimental results and analysis are presented in Sect. 5. Finally, the paper concludes in Sect. 6.

2 Related Work

The current long-term evolution-advanced (LTE-A) standard allows mobile users in a given proximity to communicate directly in a cellular network. Users in close vicinity do not need to depend on the core network architecture. To enhance the overall spectral efficiency, the D2D users can utilize the same cellular spectrum resources for direct communication. In such a scenario, where multiple modes (D2D and cellular) of communication take place in a single cell, interference is a major issue. To avoid this problem, dedicated spectrum resources may be allocated for cellular users and D2D users. This reduces the interference between the cellular users and the D2D users. When multiple D2D users communicate using the same spectrum resources, however, interference arises among the D2D users. An efficient power control mechanism is one of the easiest ways to restrict this interference. There is an abundance of research on effective power control mechanisms in D2D communication. For a single-cell scenario, a simple power control mechanism has been proposed in [12], where the D2D users reduce their power to avoid interference with the cellular users. D2D communication under a cellular network can play a significant role in improving the overall local services. Using multi-variant resource-sharing modes, an optimum power control mechanism between D2D and cellular users has been proposed in [11]. The proposed method uses maximum and minimum spectral efficiencies and maximum transmit power as constraints. The results show that, through appropriate utilization of resource-sharing modes and an effective power control mechanism, the throughput of the network can be enhanced. Distributed and centralized power control algorithms based on a stochastic geometric approach have been presented in [7]. The proposed centralized approach guarantees substantial coverage probability for cellular users while scheduling the largest possible number of D2D links. Similarly, the distributed approach uses an optimal power on–off strategy to mitigate the interference. Recent research on power control schemes in D2D communication is based on real-time channel state information (CSI). As wireless channel conditions change dynamically, accurate estimation of channel properties is difficult. Especially for D2D users, it becomes more difficult to analyze the varying channel conditions within the channel coherence time.
Using the outdated CSI for power control may lead to severe degradation in network performance [10]. Most of the literature available on the background focuses on in-band D2D communication where D2D users and cellular users share the same spectrum resources. In this paper, we have considered a system model where dedicated spectrum resources are allocated for both D2D communication and cellular communication. This will avoid interference between D2D users and cellular users. Interference arises only among users in either category. To reduce this type of interference, we have proposed a distributed power control algorithm.

3 System Model and Problem Definition

Let us consider a system model with n D2D transmitters and n D2D receivers. To avoid interference between the D2D users and cellular users, we allocate dedicated spectrum resources to each group. As per our system model, shown in Fig. 1, multiple D2D pairs communicate using the same spectral resources. Hence, interference arises between the D2D pairs. We consider that, for n D2D users, each user i has a transmission power $\tau_i$. Let $G_{ii}$ be the link gain from the transmitter $T_i$ to the receiver $R_i$. The overall SINR can be represented as

$$\zeta_i = \frac{G_{ii}\tau_i}{\sum_{j \neq i} G_{ji}\tau_j + \eta_i} \tag{1}$$

In Eq. (1), $\eta_i$ is the additive Gaussian noise and $G_{ji} > 0$ is the link gain (accounting for path loss) from the transmitter $T_j$ to the receiver $R_i$. Let us consider that, for each link, there is a threshold value for the SINR, denoted by $\gamma_i$ with $\gamma_i > 0$. To maintain appropriate QoS, we must have $\zeta_i \geq \gamma_i \;\forall i$. We formulate the power minimization problem as an objective function in terms of the vector of transmission powers $\tau$:

$$\text{minimize} \;\; \sum_{i=1}^{n} \tau_i \quad \text{subject to} \;\; \zeta_i(\tau) \geq \gamma_i, \;\; i \in \{1, 2, 3, \ldots, n\} \tag{2}$$

Fig. 1 Interference for D2D communication
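To make Eqs. (1) and (2) concrete, the following sketch evaluates the SINR of every link, the objective (total transmit power), and the constraint check for a given power vector. It is an illustration only: the function names are hypothetical, and the convention that `G[j][i]` holds the gain from transmitter j to receiver i is an assumption chosen to match the subscripts in Eq. (1).

```python
def sinr(G, tau, eta):
    """Eq. (1): SINR of each link; G[j][i] is the gain from T_j to R_i."""
    n = len(tau)
    result = []
    for i in range(n):
        interference = sum(G[j][i] * tau[j] for j in range(n) if j != i)
        result.append(G[i][i] * tau[i] / (interference + eta[i]))
    return result


def total_power(tau):
    """Objective of Eq. (2): sum of all transmission powers."""
    return sum(tau)


def meets_targets(G, tau, eta, gamma):
    """Constraint of Eq. (2): every link reaches its SINR threshold."""
    return all(z >= g for z, g in zip(sinr(G, tau, eta), gamma))
```

With two symmetric links, direct gain 0.5, cross gain 0.05, noise 0.1, and unit transmit power, each SINR is 0.5 / (0.05 + 0.1), so a threshold of 2 is met and a threshold of 4 is not.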

4 Distributed Power Control

The D2D users use distributed power control to set their own power levels so that the given SINR (or data rate) constraints are satisfied. We follow [1, 2] to develop the distributed power control algorithm for the D2D transmitters in the macrocell. The transmitters update their power levels synchronously in discrete time steps until their powers converge. Rewriting the constraint of Eq. (2) using Eq. (1), we get

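$$\begin{aligned}
\frac{G_{ii}\tau_i}{\sum_{j \neq i} G_{ji}\tau_j + \eta_i} &\geq \gamma_i \\
G_{ii}\tau_i &\geq \gamma_i \Big( \sum_{j \neq i} G_{ji}\tau_j + \eta_i \Big) \\
\tau_i &\geq \frac{\gamma_i \big( \sum_{j \neq i} G_{ji}\tau_j + \eta_i \big)}{G_{ii}} \\
\tau_i &\geq \gamma_i \Big( \sum_{j \neq i} \frac{G_{ji}}{G_{ii}}\tau_j + \frac{\eta_i}{G_{ii}} \Big)
\end{aligned} \tag{3}$$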

Let us consider F to be an n × n non-zero, non-negative matrix. From Eq. (3), the elements of F can be represented as

$$F_{ij} = \begin{cases} 0, & i = j \\ \dfrac{\gamma_i G_{ji}}{G_{ii}}, & i \neq j \end{cases} \tag{4}$$

Let u be the column vector of normalized noise power. From Eq. (3), its elements are $u_i = \gamma_i \eta_i / G_{ii}$. We rewrite Eq. (3) as

$$\tau_i \geq \sum_{j} F_{ij}\tau_j + u_i \quad \Longleftrightarrow \quad \sum_{j} F_{ij}\tau_j + u_i - \tau_i \leq 0 \tag{5}$$

Equations (1) and (2) can be represented in matrix form, where $\tau = (\tau_1, \tau_2, \tau_3, \ldots, \tau_n)^T$ is the column vector of transmission powers and $u = \left( \frac{\gamma_1 \eta_1}{G_{11}}, \frac{\gamma_2 \eta_2}{G_{22}}, \frac{\gamma_3 \eta_3}{G_{33}}, \ldots, \frac{\gamma_n \eta_n}{G_{nn}} \right)^T$ is the column vector of normalized noise power. Inequality (5) can be written as

$$\begin{aligned}
F\tau + u - \tau &\leq 0 \\
(F - I)\tau + u &\leq 0 \\
(F - I)\tau &\leq -u \\
(I - F)\tau &\geq u
\end{aligned} \tag{6}$$

Here I is the identity matrix and $\tau > 0$. Considering that all the D2D users in the cell are communicating with each other, the matrix F is irreducible. Therefore, there exists a feasible power vector for which the following statements hold [1, 2]:

1. The maximum modulus eigenvalue satisfies $\mu_F < 1$.
2. There is a power vector $\tau$ such that $(I - F)\tau \geq u$.
3. $(I - F)^{-1} = \sum_{t=0}^{\infty} F^t$.
4. $\lim_{t \to \infty} F^t = 0$.
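The first statement gives a practical feasibility test: build F and u from Eq. (4) and check whether the largest-modulus eigenvalue of F is below 1. The sketch below does this with the standard library only. It is an illustration under stated assumptions: the function names are hypothetical, `G[j][i]` is taken as the gain from transmitter j to receiver i, and the eigenvalue is estimated by power iteration on F + I, which for a non-negative irreducible F has Perron root $\mu_F + 1$ and avoids the oscillation that plain power iteration can show on a periodic matrix.

```python
def build_F_u(G, gamma, eta):
    """Eq. (4) and the normalized noise vector u_i = gamma_i * eta_i / G_ii."""
    n = len(gamma)
    F = [[0.0 if i == j else gamma[i] * G[j][i] / G[i][i]
          for j in range(n)] for i in range(n)]
    u = [gamma[i] * eta[i] / G[i][i] for i in range(n)]
    return F, u


def spectral_radius(F, iters=1000):
    """Estimate the Perron root of non-negative F via power iteration on F + I."""
    n = len(F)
    x = [1.0 / n] * n          # positive start vector with sum(x) == 1
    rho = 0.0
    for _ in range(iters):
        # y = (F + I) x
        y = [x[i] + sum(F[i][j] * x[j] for j in range(n)) for i in range(n)]
        s = sum(y)
        rho = s - 1.0          # sum(x) == 1, so s estimates (rho + 1)
        x = [v / s for v in y]
    return rho


def is_feasible(G, gamma, eta):
    """True when the SINR targets gamma can be met by some power vector."""
    F, _ = build_F_u(G, gamma, eta)
    return spectral_radius(F) < 1.0
```

For four links with direct gain 0.5, cross gain 0.05, and threshold 2, every off-diagonal entry of F is 0.2 and the eigenvalue is 0.6, so the targets are feasible; raising the threshold to 4 gives 1.2 and the targets cannot be met.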

As per the theorem in [2], if there exists a power vector $\tau^*$ that satisfies the required $\zeta_i$ values, then no matter what the initial power is, the power vector $\tau(t)$ converges component-wise to $\tau^*$. For any power vector $\tau > 0$ that satisfies Eq. (6), we have $\tau \geq \tau^*$, i.e., $\tau^*$ is Pareto-optimal. From Eq. (6), we can say that

$$(I - F)\tau \geq u \implies \tau \geq (I - F)^{-1}u \implies \tau^* = (I - F)^{-1}u \tag{7}$$

We now use the following greedy update rule for power control [2] of transmitter i:

$$\tau_i(t + 1) = \sum_{j} F_{ij}\tau_j(t) + u_i$$

In terms of the power profile of all transmitters, the update rule can be written as $\tau(t + 1) = F\tau(t) + u$. Unrolling this recursion down to the initial power vector $\tau(0)$ gives

$$\begin{aligned}
\tau(t) &= F\tau(t - 1) + u \\
&= F\big(F\tau(t - 2) + u\big) + u \\
&= F^2\tau(t - 2) + Fu + u \\
&\;\;\vdots \\
&= F^t\tau(0) + \big(F^{t-1} + \cdots + F + F^0\big)u \\
&= F^t\tau(0) + \sum_{i=0}^{t-1} F^i u
\end{aligned} \tag{8}$$

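The limiting value of the power profile of the transmitters is given by

$$\begin{aligned}
\lim_{t \to \infty} \tau(t) &= \lim_{t \to \infty} F^t\tau(0) + \lim_{t \to \infty} \sum_{i=0}^{t-1} F^i u \\
&= \lim_{t \to \infty} \sum_{i=0}^{t-1} F^i u \\
&= (I - F)^{-1}u \\
&= \tau^*
\end{aligned} \tag{9}$$

The distributed power control algorithm can be simplified further. The power update for transmitter i can be stated as $\tau_i(t + 1) = \frac{\gamma_i}{G_{ii}}\big(\sum_{j \neq i} G_{ji}\tau_j(t) + \eta_i\big)$. From Eq. (1), $\sum_{j \neq i} G_{ji}\tau_j + \eta_i = \frac{G_{ii}\tau_i}{\zeta_i}$. The modified equation is

$$\begin{aligned}
\tau_i(t + 1) &= \frac{\gamma_i}{G_{ii}} \Big( \sum_{j \neq i} G_{ji}\tau_j(t) + \eta_i \Big) \\
\tau_i(t + 1) &= \frac{\gamma_i}{G_{ii}} \Big( \frac{G_{ii}\tau_i(t)}{\zeta_i} \Big) \\
\tau_i(t + 1) &= \frac{\gamma_i}{\zeta_i}\,\tau_i(t)
\end{aligned} \tag{10}$$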
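The two identities used in this derivation can be checked numerically. The sketch below, written with the standard library only, confirms that one step of the local rule $\tau_i(t+1) = (\gamma_i/\zeta_i)\tau_i(t)$ equals one step of the matrix rule $\tau(t+1) = F\tau(t) + u$, and that iterating the matrix rule t times matches the unrolled form $F^t\tau(0) + \sum_{i=0}^{t-1} F^i u$. The function names and the gain values in the usage note are hypothetical, and `G[j][i]` is assumed to be the gain from transmitter j to receiver i.

```python
def matvec(F, x):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(f * v for f, v in zip(row, x)) for row in F]


def build_F_u(G, gamma, eta):
    """Eq. (4) and u_i = gamma_i * eta_i / G_ii."""
    n = len(gamma)
    F = [[0.0 if i == j else gamma[i] * G[j][i] / G[i][i]
          for j in range(n)] for i in range(n)]
    u = [gamma[i] * eta[i] / G[i][i] for i in range(n)]
    return F, u


def iterate(F, u, tau0, t):
    """Apply tau <- F tau + u exactly t times."""
    tau = list(tau0)
    for _ in range(t):
        tau = [a + b for a, b in zip(matvec(F, tau), u)]
    return tau


def unrolled(F, u, tau0, t):
    """Unrolled form: F^t tau(0) + sum_{i=0}^{t-1} F^i u."""
    ft_tau0 = list(tau0)        # becomes F^t tau(0)
    series = [0.0] * len(u)     # accumulates sum of F^i u
    term = list(u)              # current F^i u
    for _ in range(t):
        ft_tau0 = matvec(F, ft_tau0)
        series = [s + x for s, x in zip(series, term)]
        term = matvec(F, term)
    return [a + b for a, b in zip(ft_tau0, series)]


def local_step(G, gamma, eta, tau):
    """Local rule: each link scales its power by gamma_i / zeta_i."""
    n = len(tau)
    new_tau = []
    for i in range(n):
        interference = sum(G[j][i] * tau[j] for j in range(n) if j != i)
        zeta_i = G[i][i] * tau[i] / (interference + eta[i])
        new_tau.append(gamma[i] / zeta_i * tau[i])
    return new_tau
```

For instance, with three links and slightly different cross gains, `local_step(G, gamma, eta, tau0)` and `iterate(F, u, tau0, 1)` agree up to floating-point rounding, which is why each link only needs its own measured SINR and never the full matrix F.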


Equation (10) represents the distributed power control approach for D2D links where each D2D link measures its respective ζi independently. The calculated ζi is then compared with the threshold SINR γi . Equation (10) is used to update the power.

5 Results and Analysis

The numerical simulations of the distributed power control algorithm have been carried out in MATLAB. We conducted the experiments in two phases. In the first phase, we considered multiple D2D users communicating with each other through dedicated spectrum resources. In our experiment, 4 D2D pairs communicate simultaneously. The transmission gain for the D2D transmitters is set to 0.5 and the noise power at the receiver end is 0.1 dB. The power control algorithm is run for 50 iterations. The convergence of the transmission power for the D2D transmitters is shown in Fig. 2.

Algorithm 1 D2D Distributed Power Control Algorithm for D2D Transmitter
 1: procedure D2D Distributed Power Control($\tau_i$, $\zeta_i$, $\gamma_i$, $\epsilon$)
 2:   $\tau_i$ = current transmission power of D2D transmitter
 3:   $\zeta_i$ = measured SINR
 4:   $\gamma_i$ = threshold SINR
 5:   $\tau_i'$ = updated transmission power
 6:   $\epsilon$ = fixed positive constant
 7:   $\tau_i' = (\gamma_i / \zeta_i)\,\tau_i$   /* Distributed power control equation. */
 8:   if $|\tau_i' - \tau_i| < \epsilon$ then
 9:     Stop
10:   else
11:     $\tau_i = \tau_i'$
12:     Goto step 7
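A runnable Python sketch of Algorithm 1 for all transmitters at once is given below. It is not the authors' MATLAB code. The function name `run_power_control` is hypothetical, the SINR is re-measured from Eq. (1) at every iteration, and the stopping test is applied to the largest power change over all links. In the usage note, the direct gain 0.5, the 4 pairs, and the 50-iteration budget follow the text, whereas the cross gain 0.05, the threshold 2, the linear noise value 0.1, and the initial power 1.5 are assumed values chosen only for illustration.

```python
def run_power_control(G, gamma, eta, tau0, eps=1e-6, max_iters=50):
    """Synchronous distributed power control; returns (powers, iterations used).

    G[j][i] is assumed to be the gain from transmitter j to receiver i.
    """
    tau = list(tau0)
    n = len(tau)
    for iteration in range(1, max_iters + 1):
        new_tau = []
        for i in range(n):
            # Step 3: SINR measured by link i under the current powers.
            interference = sum(G[j][i] * tau[j] for j in range(n) if j != i)
            zeta_i = G[i][i] * tau[i] / (interference + eta[i])
            # Step 7: distributed power control equation, Eq. (10).
            new_tau.append(gamma[i] / zeta_i * tau[i])
        # Step 8: stop once no link changes its power by eps or more.
        converged = max(abs(a - b) for a, b in zip(new_tau, tau)) < eps
        tau = new_tau  # Step 11
        if converged:
            return tau, iteration
    return tau, max_iters
```

With four symmetric links under the assumed values above, the fixed point of the update is a transmit power of 1.0 on every link, and the loop stops well inside the 50-iteration budget.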

From the generated graph, one can see that less transmission power is required for D2D communication. Every time $\zeta_i$ goes beyond the threshold, the distributed power control approach can be used to reduce the transmission power. In the second phase, we considered five pairs of cellular users and four pairs of D2D users operating in a single cell. The D2D users can communicate in D2D mode if and only if the distance between them lies below the threshold [6]. The threshold distance for D2D communication is taken to be 30 m. If the distance is higher than the threshold, the D2D users use the infrastructure mode of communication. In this case, we set different values of $G_{ii}$ for cellular users and D2D users: 0.9 for the cellular users and 0.7 for the D2D users. The initial transmission power for both types of transmitter is set to 1.5 dB. Figure 3 shows that the values at which the transmission powers converge are lower for D2D users than for cellular users. This is expected, as D2D pairs communicate within a short distance and therefore require comparatively lower power than cellular users.

Fig. 2 Convergence in transmission power for D2D users (plot of transmit power versus number of iterations for D2D 1 to D2D 4)

Fig. 3 Convergence in transmission power for cellular and D2D users. The upper band is for cellular users and the lower for D2D users (plot of transmit power versus number of iterations for D2D 1 to D2D 4 and CU 1 to CU 5)

6 Conclusions

D2D communications allow two users to communicate with each other without being dependent upon the core network. However, communication between multiple D2D pairs using the same spectral resources gives rise to interference among the D2D users. In this paper, we have provided a distributed power control algorithm for the D2D users to mitigate the interference. We have considered some of the existing approaches to handle the interference between the D2D users. We have conducted the experiments for two scenarios. In the first one, the distributed power control algorithm is applied for multiple D2D users communicating in a single cell. In the

A Distributed Power Control Scheme for Device-to-Device …

107

second scenario, we have multiple cellular users and D2D users communicating simultaneously in a single cell, each type using a separate spectrum. To achieve the desired SINR, we use a distributed greedy algorithm that converges to the desired power level.

References

1. Bambos, N.: Toward power-sensitive network architectures in wireless communications: concepts, issues, and design aspects. IEEE Pers. Commun. 5(3), 50–59 (1998)
2. Foschini, G.J., Miljanic, Z.: A simple distributed autonomous power control algorithm and its convergence. IEEE Trans. Veh. Technol. 42(4), 641–646 (1993)
3. Kar, U.N., Sanyal, D.K.: An overview of device-to-device communication in cellular networks. ICT Express (2017)
4. Kar, U.N., Sanyal, D.K.: A sneak peek into 5G communications. Resonance 23(5), 555–572 (2018)
5. Kar, U.N., Sanyal, D.K.: A critical review of 3GPP standardization of device-to-device communication in cellular networks. SN Comput. Sci. 1(1), 37 (2019)
6. Kar, U.N., Sanyal, D.K.: Experimental analysis of device-to-device communication. In: 2019 Twelfth International Conference on Contemporary Computing (IC3). IEEE, pp. 1–6 (2019)
7. Lee, N., Lin, X., Andrews, J.G., Heath, R.W.: Power control for D2D underlaid cellular networks: modeling, algorithms, and analysis. IEEE J. Sel. Areas Commun. 33(1), 1–13 (2014)
8. Mittal, D., Kar, U.N., Sanyal, D.K.: A novel matching theory-based framework for computation offloading in device-to-device communication. In: 2017 14th IEEE India Council International Conference (INDICON). IEEE, pp. 1–6 (2017)
9. Shen, X.: Device-to-device communication in 5G cellular networks. IEEE Netw. 29(2), 2–3 (2015)
10. Sun, P., Shin, K.G., Zhang, H., He, L.: Transmit power control for D2D-underlaid cellular networks based on statistical features. IEEE Trans. Veh. Technol. 66(5), 4110–4119 (2016)
11. Yu, C.H., Doppler, K., Ribeiro, C.B., Tirkkonen, O.: Resource sharing optimization for device-to-device communication underlaying cellular networks. IEEE Trans. Wirel. Commun. 10(8), 2752–2763 (2011)
12. Yu, C.H., Tirkkonen, O., Doppler, K., Ribeiro, C.: On the performance of device-to-device underlay communication with simple power control. In: VTC Spring 2009-IEEE 69th Vehicular Technology Conference. IEEE, pp. 1–5 (2009)

An Application of Block Chain in Examination System, A Case Study

Ashis Kumar Samanta and Bidyut Biman Sarkar

Abstract A block chain is a set of unchangeable transactions that usually exist in multiple locations in a network. In a data warehouse, the predominant operations are additions and deletions. However, changes in virtual data warehouses are also possible. Block chain technology can therefore distribute public data in an incontrovertible, secured, and encrypted form to ensure that transactions can never be tampered with. The present system of conducting examinations suffers from various inconsistencies, such as score manipulation and irregularities in the issuance of certificates. After a detailed study of a large university examination system, we attempt to apply block chain technology to the whole process, from the distribution of blank answer scripts to the examination centers, through script evaluation and tabulation, to the auto-generation of certificates upon successful completion of the examination. In this paper, we present the block chain framework of the system and validate it with a case study showing the hash digest at every stage of the transaction added directly to the block chain. This makes it possible to trace exactly how a candidate answered the script, how the script was evaluated, and what score was received, which adds trustworthiness to the certificate obtained.

Keywords Block chain · Examination process · Transparency

1 Introduction

Block chain technology is a peer-to-peer, decentralized, distributed network that shares information among its nodes. These nodes do not trust each other, but the information that a block holds is hard to tamper with and can therefore be trusted.

A. K. Samanta (B) Department of Computer Science and Engineering, University of Calcutta, Kolkata, India e-mail: [email protected]
B. B. Sarkar MCA Department, Techno International Newtown, Kolkata, India e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021 R. Chaki et al. (eds.), Advanced Computing and Systems for Security, Advances in Intelligent Systems and Computing 1178, https://doi.org/10.1007/978-981-15-5747-7_8


Fig. 1 Block chain representation

Fig. 2 Structure of each block

Fig. 3 Internal structure of block chain

The chain of blocks (Fig. 1) holds the information in a block chain (BC). The technology saves the data of every digital transaction with a timestamp mechanism to secure the data and make it hard to tamper with. It is a distributed ledger completely open to all users [1, 2]. Instead of relying on a central entity, a block chain operates as a peer-to-peer distributed system. When a new user joins this network, the person gets a full copy of the block chain with read-only access to the data [2–4]. Each block of the block chain contains some data, the unique hash value of the block, and the hash value of the previous block (Figs. 2 and 3). The first block, called the “Genesis” block, does not contain the hash value of a previous block [5]. As soon as a block is created, its hash is calculated. Tampering with the data in a block changes its hash value, so that it no longer matches the rest of the chain. This technique makes a block chain secure (Fig. 4). Hashes thus act as the fingerprints of a block. When someone creates a new block, the block is sent to everyone on the network. Each node then verifies the block to make sure that it has not been tampered with. If the check succeeds at every node, the block is added to the block chain. All the nodes in the network reach consensus on which blocks are valid and which are not. Any tampered block will be rejected by the other nodes of the network [1].
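The linking and tamper-detection idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the field names (data, prev_hash, hash), the helper functions, and the sample records are assumptions made for the example, and no network consensus is modelled.

```python
import hashlib
import json


def block_hash(data, prev_hash):
    """Hash the block payload together with the previous block's hash."""
    payload = json.dumps({"data": data, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Create a list of blocks; the first (genesis) block has no predecessor."""
    chain, prev_hash = [], None
    for data in records:
        block = {"data": data, "prev_hash": prev_hash}
        block["hash"] = block_hash(data, prev_hash)
        chain.append(block)
        prev_hash = block["hash"]
    return chain


def is_valid(chain):
    """Recompute every hash and check each link to the preceding block."""
    prev_hash = None
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["data"], block["prev_hash"]):
            return False
        prev_hash = block["hash"]
    return True


chain = build_chain(["genesis", "record A", "record B"])
valid_before = is_valid(chain)           # the untouched chain verifies
chain[1]["data"] = "record A (altered)"  # simulate tampering with one block
valid_after = is_valid(chain)            # the stored hash no longer matches
```

Because each hash covers the previous block's hash, an attacker who alters one block would have to recompute every later block as well, which is the property the verification step relies on.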


Fig. 4 Data representation in block chain in case of tampering

2 Related Work

During the survey of block chain technology in the education system, we observed that wide-ranging research has started in areas such as finance, energy, healthcare, medicine, and education. The use of block chain technology in education and research is still at an early stage, but some promising experimentation and innovation relevant to the present scope of work has been recorded.

The work in [6] is based on an assessment of education quality using course outcomes (CO). These CO form the basis of accreditation by bodies like ABET. The system is studied, and the allocation of credit to a particular incumbent in a transparent manner is proposed using the distributed ledger technology of block chains. This continuous assessment of grades is a secure measure. Its distinguishing features are the continuous assessment of the learning process and the storage of up-to-date learning materials and individual results, which are kept through the block chain in a robust and secure way. However, the paper is silent on internal evaluation, record processing, record keeping, and record auditing prior to placing the block in the chain.

The works in [1, 3] deal with the behavioral aspects of a student, such as motivational instincts, behavior with juniors and seniors, public relation issues during projects and studies, and the recording of corresponding certificates. Credentials are made available through a query mechanism, which increases efficiency and reduces time. Block chain technology is used here as a tool to record and disseminate information through a common platform on demand. Apart from the use of block chain technology as a service, this work offers no further dimension worth mentioning.

The work in [2] proposes block chain technology in the domain of smart contract microservice systems. The authors present a comparative study of microservices and smart contracts based on a three-layer architecture. Implementing interfaces to different languages would make the paper more generic. The proposed smart contract microservice block chain model can be used in the education system for examination processing.

The works in [7–10] address the issue of credit transfer. Students are no longer confined to any specific region; rather, they can be called global entities. They may attain a single degree over different institutions in various


parts of the world. This includes online courses such as massive open online courses (MOOCs), which are distributed in nature. Transfer and accumulation of credit points using block chain technology are advocated in this work. The merit of the paper is credit transfer, but its distributed nature should ensure a tightly coupled integrated system. This work does not address other administrative issues such as the security of the basic system. In [5, 11], block chain promotes accord due to its transparency in the distributed network. The fact that validation of the individual ledgers can be done without any alteration implies permanency. This tamper-proof approach of block chain in healthcare is endorsed to guard against mafias, protect medicines, and decentralize authority. A holistic approach to securing data for the appropriate user is also advocated. The primary focus is on the deployment of block chain technology from the security viewpoint to provide fair healthcare service to society. However, no processing mechanism is described in these studies. In [4], block chain technology is suggested for a library management system. It is recommended in a distributed environment for data protection, issuance of transcripts for library applications, and digital certificates termed “Digital Badges.” This paper primarily addresses the information protection issues of library data. No other dimensions of the educational system are discussed. Our intensive survey of the literature found that BC has been implemented in many different application domains. Since BC is an emerging technology, there is still considerable scope for research in this domain. There are various sectors of higher education where BC could be implemented to improve security. The motivation of this work is presented in Sect. 3.

3 Motivation and Proposed Work

After the literature survey, we feel that block chain technology is still in its early days for use in education and research. An examination process is a tool for evaluation in which objectivity, reliability, transparency of the process, and integrity of data are essential. Traditionally, tangible and structured testing administered within a limited time period is used in examinations to measure students’ knowledge base and understanding. We consider a large university examination system with more than 150 affiliated colleges, more than 65 departments, and more than 600 examination centers. Confidentiality, integrity, trust, and transparency are the main objectives of this huge examination system. Integrity of and trust in the data matter not only today but also for maintaining the data in the future. We have defined two specific problems of the university examination system and tried to address them with block chain as an alternative platform to migrate to from a conventional distributed RDBMS. The aim is to utilize the inherent tamper-proof security and integrity of block chain. The basic advantage of block chain is that it makes the system resistant to tampering through any kind of human interference. In the case of block chain, even if somebody hacks into a few data nodes and acquires a dump of those node(s) containing a large volume of data blocks, the actual data cannot be interpreted from it, as the file and namespace information is not stored in the same data nodes.

In this work, we have made an effort to implement block chain technology in certain academic and administrative areas of the examination system. We start from the distribution of the blank scripts to the recipient colleges. The actual assignment of blank scripts is recorded at the examination center. The assignment of a script to the head examiner, then to the tabulator, and back to the university examination cell is recorded with date and time. A block is created and placed in the chain. Block chain technology makes the system transparent, reducing legal litigation and Right to Information (RTI) issues, which in turn makes the academic administration of the system sound. The proposed workflow is presented in Table 1.

Table 1 Operational flow of examination system
Sl. No. | Explanation of process
1 | Distribution of the blank scripts to the recipient college examination centers
2 | The actual assignment of blank scripts against a particular admit card
3 | Assignment of scripts to head examiners
4 | Assignment of scripts to the tabulator
5 | From tabulator to the university examination cell
6 | Result publication
7 | Re-examination and review
8 | Rebuild of the hash
9 | Printing of certificates

In a real-life university system, which holds numerous examinations throughout the year and has a large number of affiliated colleges as well as its own departments and schools, blank answer scripts have to be distributed. In our case study, we take the example of such a university, with 750 examinations over 600 examination centers and more than 65 departments including one school. In this pilot exercise, the problem is restricted to 50-odd cases as described in Fig. 3. This process will restrict some of the malpractices and corresponding administrative hazards, such as the distribution of particular scripts to particular examinations, missing scripts, and the generation of wrong packets. There are many more issues in real-life situations, which will be included in our subsequent work. The suggested solution is that each blank script with a serial number be included in the block chain, with the center code, examination code, and student roll number tagged in the block. This checks the inventory at every micro-level for the particular examination. The manipulation and wrong-packet problems may be overcome through the block chain implementation.


4 Methodology

The block chain database is distributed in nature, and in each case the system should maintain, calculate, and update new entries in the database. Connected nodes function in a secure manner in the network. We use MongoDB as our database and Python as our front end in the block chain implementation of our examination process. The algorithmic process for implementing the block chain technology is described in Fig. 5. The UML diagram of the proposed work is shown in Fig. 6.

4.1 Algorithm for the Proposed Work

4.1.1 Step 1: Examination Name Creation

With the announcement of a new examination schedule, a block of type Block-1 is generated and added to the examination block chain (see Sect. 4.2.1 and Fig. 7). It is assumed that, with the declaration of the examination schedule, the applications of eligible students are accepted and an examination center is allotted to each student accordingly.

Fig. 5 Proposed work flow

Fig. 6 UML diagram of proposed work

Fig. 7 Examination block

Fig. 8 Blank answer script sending to center block

4.1.2 Step 2: Sending Blank Answer Scripts to the Respective Examination Centre

The block chain is maintained when blank answer scripts are sent to the respective examination center (see Sect. 4.2.2 and Fig. 8). The number of scripts sent to each examination center is slightly more than the number of scripts required: scripts sent = (number of students × number of modules) + extra scripts.
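As a worked instance of the count given above, the short sketch below computes the number of blank scripts for one center. The function name and the sample figures (25 students, 4 modules, 10 spare scripts) are hypothetical and chosen only for illustration.

```python
def scripts_to_send(num_students, num_modules, extra_scripts):
    """Blank scripts for one centre: one per student per module, plus spares."""
    if min(num_students, num_modules, extra_scripts) < 0:
        raise ValueError("counts must be non-negative")
    return num_students * num_modules + extra_scripts


# Hypothetical centre: 25 students, 4 modules, 10 spare scripts.
total = scripts_to_send(25, 4, 10)  # 25 * 4 + 10 = 110
```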

Fig. 9 Conduction of exam block


Fig. 10 Teacher-assignment block

4.1.3 Step 3: Conduction of Day-Wise Examination

The block chain is maintained for each individual day of a particular examination (see Sect. 4.2.3 and Fig. 9). A day-wise and examination-center-wise chain of blocks is maintained. Each block contains references to the previous block of the same type, the sending-of-blank-answer-script block, the student-application block, and the seat-allotment block.

4.1.4 Step 4: Issuing of Exam-Wise Assignment Letter

An individual block is generated with the issuance of each assignment letter to an individual teacher for a particular examination (see Sect. 4.2.4 and Fig. 10). Each of these blocks contains references to the previous block of the same type and to the concerned examination block.

4.1.5 Step 5: Generation of Packet at the End of Examination

When packeting is done immediately after the completion of a particular examination, the packet block is generated. Packeting is done exam-wise, subject-wise, and paper-wise at the respective center (see Sect. 4.2.5 and Fig. 11). The information on each packet becomes the data part of each block. Each block in the chain contains references to the concerned examination block, the conduction-of-examination block, and the seat-allotment block.
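The steps above describe blocks that point both to the previous block of the same type and to related blocks of other types. One possible way to represent such cross-references is sketched below. The layout (type, data, timestamp, prev_hash, refs), the block-type labels, and the timestamps are assumptions made for this example; the authors' actual block structure is the one shown in their figures and in [12].

```python
import hashlib
import json


def make_block(block_type, data, timestamp, prev_hash=None, refs=None):
    """Build a block linked to its predecessor and to related blocks."""
    block = {
        "type": block_type,
        "data": data,
        "timestamp": timestamp,
        "prev_hash": prev_hash,      # previous block of the same type
        "refs": dict(refs or {}),    # hashes of related blocks of other types
    }
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    block["hash"] = hashlib.sha256(encoded).hexdigest()
    return block


# Hypothetical blocks for examination code EOO1 (timestamps are illustrative).
exam = make_block("examination", {"exam_code": "EOO1"}, "2017-06-01T10:00:00")
conduct = make_block(
    "conduction",
    {"date": "21.06.2017", "module": 401},
    "2017-06-21T10:00:00",
    refs={"examination": exam["hash"]},
)
packet = make_block(
    "packet",
    {"packet_no": "A001", "module": 401},
    "2017-06-21T17:00:00",
    refs={"examination": exam["hash"], "conduction": conduct["hash"]},
)
second_packet = make_block(
    "packet",
    {"packet_no": "A002", "module": 401},
    "2017-06-21T17:05:00",
    prev_hash=packet["hash"],
    refs={"examination": exam["hash"], "conduction": conduct["hash"]},
)
```

Since the references are part of the hashed content, changing the examination block would invalidate the hashes stored in every conduction and packet block that refers to it.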


4.2 Description of the Process

4.2.1 Examination Block (Block Type-1)

The university holds a number of examinations. Whenever a new examination is announced, a block is created and added to the examination chain using the public block chain algorithm. The technology supports storing data in JavaScript Object Notation (JSON) format. The block contains data such as exam ID, exam name, date of commencement of the exam, date of completion of the exam, semester number, and timestamp. The block is shown in Fig. 7. After the creation of this block, applications are accepted from students and an examination center is allocated to each student.
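A Block Type-1 document with the fields listed above might be assembled and serialised to JSON as in the following sketch. The key names, the exam ID "E002", and the timestamp are hypothetical; the examination name and dates are taken from the sample dataset of examinations, and the actual document layout used by the authors is the one in Fig. 7.

```python
import hashlib
import json

REQUIRED_FIELDS = (
    "exam_id", "exam_name", "start_date", "end_date", "semester", "timestamp",
)


def examination_block(fields, prev_hash):
    """Return a Block Type-1 document ready to be stored as JSON."""
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    document = {"block_type": 1, "data": dict(fields), "prev_hash": prev_hash}
    canonical = json.dumps(document, sort_keys=True)  # stable key order
    document["hash"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return document


block = examination_block(
    {
        "exam_id": "E002",  # hypothetical identifier
        "exam_name": "Master of Science (M.Sc.) 2nd Semester Examination, "
                     "2017 in Biophysics",
        "start_date": "17.06.2017",
        "end_date": "29.06.2017",
        "semester": "2nd",
        "timestamp": "2017-05-01T10:00:00",  # hypothetical creation time
    },
    prev_hash=None,  # first block of the collection
)
as_json = json.dumps(block, indent=2)
```

Sorting the keys before hashing is a design choice of this sketch: it keeps the digest independent of the order in which the fields were supplied.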

4.2.2 Blank Answer Script Sending to the Center Block (Block Type-2)

After the application and seat-allotment data of the eligible students are received, the blank scripts need to be sent to the respective centers. These blank scripts also maintain a chain to minimize malpractice and litigation (Fig. 8). The data part of this block contains exam ID, center name, date, semester number, timestamp, etc.

4.2.3 Conduction of Examination Block (Block Type-3)

The conduction of the examination on a particular day is one of the important processes; it helps to identify and verify which students were present and which were absent. It also confirms that the script used by each student in the examination is authenticated and genuine. The packeting of scripts is also done at this stage. The data part of the block (Fig. 9) contains student details, date, subject ID, paper ID, timestamp, etc. The serial number of the script used is recorded in this block for future reference during evaluation.

4.2.4 Assignment of Teacher Block (Block Type-4)

Assignments are recommended by the concerned Board of Studies (BOS) of the university and given to faculty members. The roles assigned are paper setter, moderator, scrutineer, examiner, head examiner, and coordinator (Fig. 10). The objective of this block is to maintain a private chain of confidential assignments that will be referred to later, at the time of script evaluation. Declined assignments, which are described in the problems, can be identified for future decision-making.

4.2.5 Packeting of Unevaluated Answer Script Block (Block Type-5)

Packeting is done each day after the completion of the examination. This chain helps to distribute the unevaluated scripts and to trace any missing script. The data part of the block is shown in Fig. 11. The block also helps to maintain financial information about the remuneration due to each individual for a particular examination. This helps to identify teachers who have not yet received their claims and protects against paying more than once for a particular assignment. It also helps to address the administrative issue of distributing the evaluation load equally to maintain academic quality.

4.2.6 Description of the Hash Function Used

A hashing algorithm is a mathematical function that condenses data to a fixed size. SHA-256 is a member of the Secure Hash Algorithm family that produces a 256-bit digest. The function takes input data of any length and returns an output of fixed length, which serves as the fingerprint of the data. This procedure is commonly used in block chains (Fig. 12).
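The fixed-length property described above can be checked directly with the SHA-256 implementation in Python's standard hashlib module. The helper name and the sample inputs in this small sketch are illustrative only.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_digest = sha256_hex("abc")                     # 3-character input
long_digest = sha256_hex("script 160070 " * 1000)    # 14,000-character input
changed_digest = sha256_hex("abd")                   # one character changed
```

Both digests are 64 hexadecimal characters (256 bits) regardless of input length, and changing a single character of the input yields a completely different digest, which is why the hash can serve as a fingerprint of a block.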

Fig. 11 Packeting of scripts block in the day of examination

Fig. 12 SHA-256 hash function


5 Experimental Setup

Our experiment was run on a computer with an Intel(R) Core(TM) i5 CPU M 480 @ 2.67 GHz processor, 4 GB RAM, and the Windows 7 64-bit OS. We used Python 3.7.4 for Windows and MongoDB server version 3.0. To connect Python and MongoDB, we used the PyMongo driver, with PIP version 19.0.3. PyScripter 3.6.1 was used as the Python editor. We also used the GUI tools of NoSQL for Windows and NoSQL Manager for MongoDB, freeware version 5.3. We developed sample code to create a block chain of examinations, and the outcomes of the code are reflected in the result and discussion section of the paper. We used Python and MongoDB for our experiment because MongoDB can handle a huge volume of data items. The interface and connectivity between Python and MongoDB are secure and cause few problems. The block chain code for Figs. 7, 8, 9, 10, and 11 is available at the link in [12].

6 Result and Discussion

In typical implementations, examination systems are deployed on a conventional RDBMS. Different layers of operational security may also be implemented in the RDBMS. However, there are several advantages to migrating from a conventional system to a block chain system. Transparency is one of the primary objectives of the examination system. The advantage of implementing the process in block chain is that transparency can be maintained in a distributed environment. Every transaction is recorded publicly in an encrypted format. As all data is kept in archive format in decentralized mode, any authorized user can verify the data. The data is immutable, and no one can change it. Block chain enhances security against external threats of tampering and will help to protect examination data from unauthorized access. When all the authorized users have the same information within the network, the speed of the system is expected to increase, and the cost of the entire process is therefore expected to fall. All the stakeholders involved with the examination system will be free from concern about the genuineness of data held in the block chain.

In the experimental work, we have taken five tables that are closely related to the solutions of our proposed problems. We present samples of the datasets used. The data for testing the algorithm of Sect. 4.2.1 is presented in Table 2, and the sample result of the test to build the examination block is shown in [12]. The genesis block is created for the individual collection. Each document in the collection creates a block. The chain is maintained for each document by keeping the reference of the hash value and previous hash value.

Table 2 Dataset of examinations block
Examination | BOS | Exam year | Semester | Start date | End date | Final
M.A./M.Sc. 1st Semester Examination, 2017 in Human Develomt. | Human Develomt. | 2017 | 1st | 03.12.2017 | 28.12.2017 | N
Master of Science (M.Sc.) 2nd Semester Examination, 2017 in Biophysics | Biophy. | 2017 | 2nd | 17.06.2017 | 29.06.2017 | N
B.B.A. (Honours) Part-2 Examination, 2017 | BBA | 2017 | part-2 | 11.06.2017 | 28.06.2017 | N
Two Year M.Phil Part-I Examination, 2017 in Economics | Economics | 2017 | part-1 | 21.06.2017 | 04.07.2017 | N
M.A./M.Sc. 3rd Semester Examination, 2017 in App. Math. | Mathematics | 2017 | 3rd | 21.12.2017 | 06.01.2018 | N

Table 3 shows the sample dataset of sending of the scripts to different examination centers of the “Three Year Master of Technology (M.Tech.) 4th Semester Examination, 2017 in Instrumentation and Control Engineering” (Examcode: EOO1). We consider that the examination has four modules (401, 402, 403, and 404). The number of blank scripts sent to each examination center is four times the number of candidates allotted, plus some extra scripts.

Table 3 Dataset of sending of script to different centers
Center Name | Script From | Script To
Uluberia College | 160,070 | 16,090
Hope Institute of Bengal | 160,091 | 160,119
Scottish Church College | 160,120 | 160,148
Sree Sree Ramkrishna College | 160,149 | 16,173
El-Bethel College | 160,174 | 16,194

Table 4 shows the sample dataset of the assignment of scripts at different centers of the examination EOO1. We consider the dataset for one sample module (module 401). An individual script is allotted against the roll number of each candidate for each module.

Table 4 Dataset of used script on examination day
Script No. | Module | Reg. No. | Name | Roll No. | Venue
160,070 | 401 | 224-11210214-16 | Soujanna Mitra | 91/MTIC/160001 | Uluberia College
160,091 | 401 | 613-11241261-16 | Tuhin Nath | 91/MTIC/160002 | Hope Institute of Bengal
160,074 | 401 | 543-12210321-16 | Sucheta Dutta | 91/MTIC/1600051 | El-Bethel College
160,121 | 401 | 543-12210329-16 | Debarati Jana | 91/MTIC/160008 | Uluberia College

Table 5 shows the sample dataset of the different assignments given to the faculty members of the Board of Studies (BOS) for examination EOO1. We consider the dataset of four modules (401, 402, 403, and 404) of that particular semester. An individual assignment is allotted to a faculty member for the respective module.

Table 5 Dataset of assignment of the teacher
Module | Full Marks | Faculty | Assignment
401 | 40 | Dr. Sumit Roy | Setter/HE
401 | 50 | Dr. Nilima Gupta | Examiner
403 | 40 | Dr. Jarhana Dawn | Examiner
ALL | ALL | Dr. Anjita Dey | Scrutineer

Table 6 shows the sample dataset of packeting of scripts from the centers of examination EOO1. The unevaluated scripts are packed date-wise, subject-wise, paper-wise, and module-wise and sent to the office of the examination controlling authority. Thereafter, the scripts are handed over to the concerned examiner or head examiner for evaluation. We consider the dataset for one module (module 401). The corresponding outcomes of this dataset are shown in Fig. 13 as sample output. A similar figure of experimental outcomes of Tables 1, 2, 3, 4, 5, and 6 is available in [12].

In the sample output (Fig. 13), the data are stored in MongoDB and maintained in the block chain in JSON format. The first document is the “Genesis” block. Each individual document forms a single block, and all the documents of the database together form a chain of blocks. Each document (block) contains its own hash value and the hash value of the previous block. The update privileges in MongoDB are restricted with the following privilege specification, so that an authorized user can only insert a valid block or read a block and cannot modify the data.

privileges: [ { resource: { db: "education", collection: "examination" }, actions: [ "find", "insert" ] } ]
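The privilege restriction quoted above corresponds to the privileges field of MongoDB's createRole database command. The sketch below only builds such a command document in Python; the role name "blockWriter" and the helper function are hypothetical, and running the command requires a live server and an account that is allowed to create roles, neither of which is assumed here.

```python
def insert_only_role(role_name, db_name, collection_name):
    """Build a createRole command granting only read and insert privileges."""
    return {
        "createRole": role_name,
        "privileges": [
            {
                "resource": {"db": db_name, "collection": collection_name},
                "actions": ["find", "insert"],  # no update or remove actions
            }
        ],
        "roles": [],  # the role inherits nothing from other roles
    }


# "blockWriter" is a hypothetical role name; the database and collection
# names are the ones quoted in the text.
command = insert_only_role("blockWriter", "education", "examination")

# With PyMongo and a running server, the document could be submitted as
#     MongoClient(...)["education"].command(command)
# by a user who has the right to create roles (not executed in this sketch).
```

A user granted only this role can read existing blocks and append new ones, while update and delete requests on the collection are refused by the server.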

Table 6 Dataset of assignment of packet of the teacher
Pkt. No. | Mod | Sct. No. | Reg. No. | Roll No. | Venue | Date
A001 | 401 | 160,174 | 543-1221-0321-16 | 91/MTIC/160005 | El-Bethel College | 21.06.2017
A001 | 401 | 160,175 | 014-1221-0683-16 | 91/MTIC/160012 | El-Bethel College | 21.06.2017
A002 | 401 | 160,096 | 012-1221-2496-16 | 91/MTIC/160020 | Hope Institute of Bengal | 21.06.2017

Fig. 13 Result of packeting and assign to teacher from block chain

7 Conclusion

In this paper, we have presented block chain technology as a way to fix issues that arise in the traditional university examination system. Though the examination system of a traditional university is a stable system, we believe that the proposed system will make the discussed areas of the examination very transparent and error-free. In the future, we plan to study more areas of the examination system, such as evaluation, tabulation, result publication, verification, migration, transcript generation, and the different fee collection areas of the system. We will also try to implement block chain technology wherever applicable using Python and MongoDB.


References

1. Sharples, M., Domingue, J.: The blockchain and kudos: a distributed system for educational record, reputation and reward. In: Verbert, K., Sharples, M., Klobučar, T. (eds.) EC-TEL 2016. LNCS, vol. 9891, pp. 490–496. Springer, Cham (2016). https://doi.org/10.1007/978-3-31945153-448
2. Tonelli, R., Lunesu, M.I., Pinna, A., Marchesi, M.: Implementing a microservices system with blockchain smart contracts. In: 2019 IEEE International Workshop on Blockchain Oriented Software Engineering (IWBOSE), 24 February 2019 (2019). https://doi.org/10.1109/IWBOSE.2019.8666520
3. Xu, Y., Zhao, S., Kong, L., Zheng, Y., Zhang, S., Li, Q.: ECBC: a high performance educational certificate blockchain with efficient query. In: Hung, D., Kapur, D. (eds.) ICTAC 2017. LNCS, vol. 10580, pp. 288–304. Springer International Publishing AG, Berlin (2017). https://doi.org/10.1007/9783-319-67729-3_17
4. Chen, G., Xu, B., Lu, M., Chen, N.-S.: Exploring blockchain technology and its potential applications for education. In: Chen et al. (eds.) Smart Learning Environments. Springer (2018). https://doi.org/10.1186/s40561-017-0050-X
5. Manivannan, K., Academicsi, D.: Contribution of blockchain technology to the growth of the society. In: 2018 World Engineering Education Forum - Global Engineering Deans Council (WEEF-GEDC), 12–16 November 2018 (2018). https://doi.org/10.1109/WEEF-GEDC.2018.8629694
6. Duan, B., Zhong, Y., Liu, D.: Education application of blockchain technology: learning outcome and meta-diploma. In: 2017 IEEE 23rd International Conference on Parallel and Distributed Systems (ICPADS), 15–17 December 2017 (2017). https://doi.org/10.1109/ICPADS.2017.00114, Print ISSN: 1521-9097
7. Ocheja, P., Flanagan, B., Ueda, H., Ogata, H.: Managing lifelong learning records through blockchain. In: Research and Practice in Technology Enhanced Learning. Springer (2019). https://doi.org/10.1186/s41039-019-0097-0
8. Al Harthy, K., Al Shuhaimi, F., Al Ismaily, K.K.J.: The upcoming blockchain adoption in Higher-education: requirements and process. In: 2019 4th MEC International Conference on Big Data and Smart City (ICBDSC), 15–16 January 2019. https://doi.org/10.1109/ICBDSC.2019.8645599
9. Turkanović, M., Hölbl, M., Košič, K., Heričko, M.: A blockchain-based higher education credit platform. IEEE Access 6, 5112–5127 (2018). https://doi.org/10.1109/ACCESS.2018.2789929
10. https://www.ugc.ac.in/pdfnews/3370062_examination-reform.pdf
11. Raju, S., Rajesh, V., Deogun, J.S.: The case for a data bank: an institution to govern healthcare and education. In: Proceedings of the 10th International Conference on Theory and Practice of Electronic Governance, ICEGOV 2017, vol. Part F128003, pp. 538–539. ACM. https://doi.org/10.1145/3047273.3047275, 978-1-4503-4825-6/17/03
12. https://mega.nz/#F!rLZx0abT!Lxa3MM7xn28lNBsNgxQxVQ

A Study on Energy-Efficient Routing Protocols for Wireless Sensor Networks

Soumyabrata Saha and Rituparna Chaki

Abstract Wireless sensor networks consist of miniaturized, battery-powered sensor nodes with limited computational capability. A routing protocol for such networks therefore needs to spread energy consumption evenly across the nodes during its operation. It is also expected to deliver data quickly irrespective of node density, while remaining flexible in its routing framework and route computation metric. The constrained resources of wireless sensor networks have directed research towards minimizing energy consumption, storage usage and the complexity of routing functions. In this paper, a number of notable routing algorithms are studied to provide insight into energy-efficient designs, together with a broad study of topology control techniques for sensor networks. The routing protocols are categorized by the underlying network structure: flat, location based and hierarchical. For each protocol family, the authors describe the primary motivation behind its development and explain its operation, advantages and disadvantages. In conclusion, a number of open research issues for achieving energy efficiency in routing protocol design are identified.

Keywords Wireless sensor networks · Routing · Flat · Location based · Hierarchical · Energy efficiency

S. Saha (B) JIS College of Engineering, Kolkata, West Bengal, India
R. Chaki University of Calcutta, Kolkata, West Bengal, India
© Springer Nature Singapore Pte Ltd. 2021 R. Chaki et al. (eds.), Advanced Computing and Systems for Security, Advances in Intelligent Systems and Computing 1178, https://doi.org/10.1007/978-981-15-5747-7_9

1 Introduction

Wireless sensor networks consist of many tiny sensor nodes that form an ad hoc, distributed data propagation network to gather context information about the physical environment. The growing interest in sensor networks and the continual emergence of new architectural techniques have motivated a survey of the characteristics, applications and communication protocols of this area. The potential applications of sensor networks are wide ranging: monitoring natural phenomena and environmental changes, security control, traffic flow estimation, military monitoring, tracking friendly forces on the battlefield, vehicular movement, mechanical stress levels on attached objects, etc. Routing in sensor networks is a challenging task because of several inherent characteristics that distinguish them from existing communication and wireless ad hoc networks. The characteristics of WSNs and the application requirements directly affect the design issues in terms of network performance and capabilities. The unpredictable distribution of a large number of sensors and the dynamics of their operating environment pose unique challenges for the architectural design of sensor networks.

Flat routing protocols resemble conventional multi-hop ad hoc routing protocols: all nodes have equal power and perform the same function. They offer several advantages: low topology maintenance overhead, the ability to discover multiple routes, and the fact that each node can both collect data from events of interest and forward data when serving as a relay node.

Location-based routing protocols have emerged as some of the most significant, efficient and scalable routing schemes for sensor networks; in them, sensor nodes are addressed by their locations. The location information is needed to calculate the distance between a pair of nodes and hence to estimate energy consumption. Position information is used to relay data to the desired region rather than to the whole network. The key advantage of this approach is that nodes do not need to learn the topology of the network.
Hierarchical network organization is a well-known technique with distinct benefits for scalability and efficient communication, and it is also used to carry out energy-efficient routing in WSNs. The main objective of hierarchical routing is to conserve the energy of sensor nodes by involving them in multi-hop communication within a particular region and by performing data aggregation to reduce the number of messages transmitted to the sink.

The rest of the paper is organized as follows. In Sect. 2, routing challenges and issues are identified. Section 3 presents a comprehensive analysis of different routing methods for wireless sensor networks, classifying the most prominent energy-efficient routing protocols along with their pros and cons. The article then identifies some future directions and open research issues. The conclusion is offered in Sect. 4.

2 Routing Challenges and Issues

Despite their many advantages, such as distributed localized computing, wide area coverage and monitoring of extreme environments, WSNs pose various challenges to the research community. As the performance of a routing protocol is closely tied to the architectural model, we have tried to capture the architectural issues and highlight their implications, and we have identified some of the major challenges faced when designing routing for WSNs: hardware resource constraints, node capabilities, node deployment, data delivery models, energy consumption, data aggregation and gathering, network dynamics, network scalability, fault tolerance, latency and quality of service.

Compared with conventional network algorithms, routing protocols for wireless sensor networks have different characteristics and requirements. As each node has limited energy, designing energy-efficient routing protocols that extend the network lifetime is an essential goal. Minimizing data transmission and reducing information redundancy through data aggregation are very important in designing the network. To minimize energy consumption, multi-hop communication is generally required. Routing protocols should not involve complex computation: the routing mechanism must be simple and efficient, adapt to dynamic topology changes, include a fault tolerance mechanism, reduce communication costs and improve transmission efficiency.

Sensor network routing protocols can be categorized in several ways: by the manner in which routing paths are set up, by the network structure, by the protocol operation and by the initiator of communication. In the next section, the authors present an extensive summary of different energy-efficient routing protocols for wireless sensor networks.
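The statement that multi-hop communication reduces energy consumption can be checked with a small calculation. The sketch below assumes the first-order radio model that is often used in the WSN literature, with illustrative constants (50 nJ/bit for the radio electronics and 100 pJ/bit/m² for the transmit amplifier). Neither the model nor the constants are taken from this chapter, and all function names are hypothetical.

```python
# First-order radio model (an assumption; this chapter does not fix a model).
E_ELEC = 50e-9      # J/bit used by transmitter or receiver electronics (assumed)
EPS_AMP = 100e-12   # J/bit/m^2 used by the transmit amplifier (assumed)


def tx_energy(bits: int, distance_m: float) -> float:
    """Energy needed to transmit `bits` over `distance_m` metres."""
    return E_ELEC * bits + EPS_AMP * bits * distance_m ** 2


def rx_energy(bits: int) -> float:
    """Energy needed to receive `bits`."""
    return E_ELEC * bits


def direct_energy(bits: int, distance_m: float) -> float:
    """The source transmits straight to the sink."""
    return tx_energy(bits, distance_m)


def multihop_energy(bits: int, distance_m: float, hops: int) -> float:
    """The source reaches the sink over `hops` hops of equal length.

    Every hop costs one transmission, and each of the hops - 1 relay
    nodes also pays for one reception. The sink's reception is ignored.
    """
    hop_len = distance_m / hops
    return hops * tx_energy(bits, hop_len) + (hops - 1) * rx_energy(bits)
```

Under these assumed constants, relaying a 1000-bit packet over two 100 m hops costs less than a single 200 m transmission, whereas over 10 m the direct transmission is cheaper. The benefit of multi-hop routing therefore depends on the distances involved.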

3 Related Works

Researchers have devoted considerable effort to the development of sensor network routing protocols, applications and systems with widely varying requirements and characteristics. In this section, the authors present a detailed survey of flat, location-based and hierarchical energy-efficient routing protocols, which route messages while taking energy consumption into account in order to minimize energy usage and extend the network lifetime.

3.1 Flat Routing Protocol

Flat routing protocols are intended for networks of homogeneous nodes, in which all nodes have identical processing and transmission capacities and play similar packet forwarding roles. The first protocol of this type to be mentioned is SPIN [1], proposed by Heinzelman et al., which introduced the idea of using metadata to reduce packet transmissions in the network. The SPIN family of protocols disseminates data efficiently without the need to maintain per-neighbour state. In [1], topological changes are localized because each node knows only its single-hop neighbours; SPIN dissipates less energy than classic flooding, and metadata negotiation almost halves the redundant data. The overhead imposed by control messages, however, grows in dense networks, so the performance of SPIN [1] degrades in networks with a large number of nodes. SPIN also does not consider security threats while routing data. Its metadata-based logic is an event-based approach and, like all event-based approaches, frequently suffers from network congestion.

Intanagonwiwat et al. [2] proposed directed diffusion, which is based on query-driven data delivery. This model selects empirically good paths and uses in-network caching and processing of data. The approach results in quicker responses to event queries. However, directed diffusion tends to increase network traffic, as the query is diffused through the network in search of the event. Braginsky et al. proposed rumour routing [3] as a variation of directed diffusion [2]; it is a compromise between flooding queries and flooding event notifications, but it maintains only one path between source and destination. Rumour routing [3] is suited to delivering queries to events in large networks under a wide range of conditions, and it handles node failure gracefully, with its delivery rate degrading linearly with the number of failed nodes.

COUGAR [4] uses declarative queries to abstract query processing from the network layer functions and introduces data aggregation for energy efficiency. In [4], the additional query layer on each sensor node imposes extra memory overhead, and the leader nodes must be maintained dynamically to prevent the hot spot problem. In GRAB [5], a node transmits with sufficient power to reach its next-hop neighbours along the minimum cost path, and only nodes with a lower cost than the sender are allowed to forward packets.
GRAB [5] employs event-driven refreshing of the cost field, which ensures that information about link failures spreads throughout the network and avoids the counting-to-infinity problem. INSENS [6] is a simple routing protocol in which routing computations are performed by the central base station rather than by the resource-constrained sensor nodes. INSENS [6] ensures that a single compromised node can affect only a limited portion of the network without disrupting the rest of the network. In ASCENT [7], each node assesses its connectivity and follows a reactive algorithm that responds to changes in the network characteristics. ASCENT [7] operates between the MAC and network layers and determines which nodes join the routing infrastructure; it does not use or modify state maintained by the routing protocol.

Sohrabi et al. [8] proposed SAR, a table-driven multipath routing algorithm that improves the resilience of the network to node failures. The objective of this algorithm [8] is to optimize the average weighted QoS metric in the network. In SAR [8], the counting-to-infinity problem is avoided by hastening the convergence to infinity whenever the path metric reaches an upper threshold. The trajectory-based forwarding (TBF) algorithm [9] is a greedy algorithm in which each intermediate node attempts to forward packets along an optimal path with respect to the intended trajectory. In TBF [9], the path does not explicitly name the intermediate nodes; the main issues are how to specify and modify the trajectory, for example whether to use curve fitting techniques or simply a list of points.
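The metadata negotiation attributed to SPIN [1] above can be illustrated with a small sketch. It assumes the three-message exchange usually associated with SPIN (an advertisement, a request and the data itself); the class, its fields and the method names are hypothetical and only model the idea that data are sent to a neighbour on request.

```python
from dataclasses import dataclass, field


@dataclass
class SpinNode:
    name: str
    store: dict = field(default_factory=dict)  # metadata id -> payload
    data_sent: int = 0                         # number of DATA messages sent

    def wants(self, meta_id: str) -> bool:
        """Answer an ADV with a REQ only if the data are not held yet."""
        return meta_id not in self.store

    def advertise(self, meta_id: str, neighbours: list) -> None:
        """Send ADV(meta_id) to every neighbour and serve the REQs."""
        for nb in neighbours:
            if nb.wants(meta_id):              # the neighbour replied with REQ
                nb.store[meta_id] = self.store[meta_id]
                self.data_sent += 1            # DATA is sent only on request
```

For example, if node `a` advertises an item to neighbours `b` and `c` and `c` already holds it, only one DATA message is sent, whereas plain flooding would send two.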

Table: Comparison among flat routing protocols. Protocols compared: SPIN [1], Directed diffusion [2], RR [3], COUGAR [4], GRAB [5], INSENS [6], ASCENT [7], SAR [8], TBF [9]. Criteria: mobility, position awareness, power usage, negotiation based, data aggregation, localization, complexity, scalability, multipath, query based.


The preceding sections have described several flat routing protocols for sensor networks.

3.2 Location-Based Routing Protocol

Location information is needed to estimate the distance between two nodes so that routing can be carried out in an energy-efficient manner. Two techniques are used to find locations: the relative coordinates of neighbouring nodes can be obtained by exchanging information between neighbours, or location information can be obtained directly through GPS devices. In location-based techniques, a query can be diffused only to the relevant region, which significantly reduces the number of transmissions. In keeping with the theme of the survey, we limit the coverage to energy-aware location-based routing protocols for wireless sensor networks.

In GAF [10], the network area is divided into fixed zones that form a virtual grid, and the protocol conserves energy by turning off unnecessary nodes without affecting the level of routing fidelity. GAF [10] defines three states: discovery, for determining the neighbours in the grid; active, for participating in routing; and sleep, when the radio is turned off. GEAR [11] uses energy-aware and geographically informed neighbour selection to restrict the number of interests in directed diffusion [2] by sending interests only to a certain region rather than to the whole network. Each node in GEAR [11] keeps an estimated cost and a learning cost of reaching the destination through its neighbours, and GEAR can conserve more energy than directed diffusion [2]. Research has shown that GEAR [11] not only reduces energy consumption for route setup but also performs better than GPSR [12] in terms of packet delivery. The design goal of GeRaF [13] is to deliver each packet to the sink in the minimum number of hops. In GeRaF [13], the next-hop node is chosen on the basis of being closer to the destination than the node currently holding the message.
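The virtual grid of GAF [10] can be sketched as follows. The sketch assumes the cell side r ≤ R/√5 usually quoted for GAF, where R is the radio range, so that any two nodes in side-sharing cells can reach each other. It also simplifies the election of the active node to "highest residual energy in the cell". Both points, and all names, are assumptions made for illustration rather than details given in this chapter.

```python
import math
from collections import defaultdict


def grid_side(radio_range_m: float) -> float:
    """Largest cell side that keeps nodes in side-sharing cells in range."""
    return radio_range_m / math.sqrt(5)


def cell_of(x: float, y: float, side: float) -> tuple:
    """Map a position to the index of its virtual grid cell."""
    return (int(x // side), int(y // side))


def assign_states(nodes: dict, side: float) -> dict:
    """Keep one node per cell active and put the others to sleep.

    `nodes` maps a node name to (x, y, residual_energy). The node with
    the highest residual energy in each cell stays active (assumed rule).
    """
    by_cell = defaultdict(list)
    for name, (x, y, energy) in nodes.items():
        by_cell[cell_of(x, y, side)].append((energy, name))
    states = {}
    for members in by_cell.values():
        leader = max(members)[1]
        for _, name in members:
            states[name] = "active" if name == leader else "sleep"
    return states
```

With a 10 m cell side, nodes at (1, 1) and (2, 2) share a cell, so only the one with more residual energy stays active, while a node at (30, 1) is alone in its cell and remains active.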
A best-effort forwarding mechanism based on RTS/CTS messages is employed in GeRaF [13], and the back-off time increases reliability. Chen et al. [14] proposed SPAN, a position-based algorithm that operates below the routing layer and above the MAC layer and is designed to conserve energy and increase network lifetime. SPAN [14] improves routing throughput and packet delivery latency. The goal of SMECN [15] is to determine the enclosure graph for minimum energy paths. SMECN [15] is a less complex, more realistic and more power-efficient technique than MECN, but the trade-off is additional overhead. The EEGR [16] algorithm uses a metric that defines the communication cost between neighbours and sends messages along the paths with the best trade-off between communication probability, progress and energy consumption. In EEGR [16], a node's location is estimated with a certain error ε, and the shortest path from a sensor to the base station can be computed with Dijkstra's algorithm. In EEFS [17], nodes are randomly distributed in the network, and the protocol aims to improve energy efficiency by considering distance and reception rate in routing decisions. In EEFS [17], neighbours are classified on the basis of link reliability and a neighbour selection mechanism. In EAGPR [18], nodes have local knowledge of their neighbours' positions and energy levels and of the location of the destination, which is used to prolong the lifetime of the sensors and of the network. EBGR [19] is designed for highly dynamic scenarios with changing topology in which location information is known. This algorithm aims to provide loop-free, energy-efficient routing in the presence of unreliable communication links by employing blacklisting and a discrete delay function.

In [20], the authors proposed a beacon-less algorithm that consists of two forwarding phases. In the contention phase, the next hop is determined through a timer-based function, while the suppression phase is used to prevent more than one node from being selected as the next hop and to reduce the protocol overhead. DIR [21], also known as compass routing, minimizes the angle between the computed direction and the source-destination direction. In [21], the best neighbour is the one whose direction is closest to that of the destination. In [22], the authors proposed SBZRP and LBZRP to optimize the pro-activeness within the zone.
In SBZRP [22], zones are created dynamically, and the method favours a low degree of proactive behaviour. LBZRP [22] is an extension of SBZRP influenced by location-aided routing. LBZRP reduces not only the proactive behaviour within the zone but also the control flow outside the zone. In [23], the authors proposed an energy-efficient geographical routing protocol that minimizes the energy consumption of the sensor network by using a gateway node. In [23], the network is divided into four logical regions, of which two use direct communication and the other two use a clustering hierarchy.
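Two of the next-hop rules described in this section can be compared in a few lines: the distance-based choice attributed to GeRaF [13] and the direction-based (compass) choice of DIR [21]. The sketch below assumes that positions are 2-D coordinate tuples and that the sender already knows its neighbours' positions. The function names are hypothetical, and the contention and back-off mechanisms of the real protocols are not modelled.

```python
import math


def greedy_next_hop(current, neighbours, dest):
    """Return the neighbour closest to `dest`, or None if no neighbour
    is closer to `dest` than the current node (a local minimum)."""
    best = min(neighbours, key=lambda n: math.dist(n, dest), default=None)
    if best is None or math.dist(best, dest) >= math.dist(current, dest):
        return None
    return best


def compass_next_hop(current, neighbours, dest):
    """Return the neighbour whose direction, seen from `current`,
    deviates least from the direction towards `dest`."""
    target = math.atan2(dest[1] - current[1], dest[0] - current[0])

    def deviation(n):
        angle = math.atan2(n[1] - current[1], n[0] - current[0])
        diff = abs(angle - target)
        return min(diff, 2 * math.pi - diff)   # wrap around at 180 degrees

    return min(neighbours, key=deviation, default=None)
```

The two rules can disagree. For a node at (0, 0) with destination (10, 0) and neighbours at (3, 4), (1, 0) and (-2, 0), the distance-based rule picks (3, 4), which makes the most progress, while the compass rule picks (1, 0), which lies exactly on the line to the destination.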

Comparison among location-based routing protocols

Algorithm | Mobility | Position awareness | Power usage | Negotiation based | Data aggregation | Localization | Complexity | Scalability | Multipath | Query based
GEAR [11] | Limited | No | Limited | No | No | No | Low | Limited | No | No
GeRaF [13] | No | Yes | Limited | No | No | No | Low | Limited | No | No
SPAN [14] | Limited | No | N/A | Yes | No | No | Low | Limited | No | No
SMECN [15] | No | No | Max | No | No | No | Low | Low | No | No
EEFS [17] | No | No | Limited | No | No | Yes | Low | Medium | No | No
EAGPR [18] | No | Yes | Limited | No | No | Yes | Low | Good | No | No
EBGR [19] | Possible | Yes | Limited | Yes | No | Yes | Low | Good | No | No
DIR [21] | No | Yes | Max | Yes | No | Yes | Low | Medium | No | No
MGEAR [23] | No | Yes | Limited | No | No | No | Medium | Good | No | No


The foregoing sections have described a number of location-based routing protocols for sensor networks.
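For EEGR [16], discussed above, the shortest path from a sensor to the base station is computed with Dijkstra's algorithm. A minimal sketch follows. It assumes a hypothetical adjacency map in which each edge weight stands for the energy cost of a link. The graph layout, the cost values and the function name are illustrative and are not taken from [16].

```python
import heapq


def min_energy_path(graph, source, sink):
    """Dijkstra's algorithm over non-negative link costs.

    `graph` maps a node to {neighbour: link_cost}. Returns a tuple
    (total_cost, path); if the sink is unreachable, (inf, []).
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue                      # stale heap entry
        done.add(node)
        if node == sink:
            break
        for nb, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < dist.get(nb, float("inf")):
                dist[nb] = new_cost
                prev[nb] = node
                heapq.heappush(heap, (new_cost, nb))
    if sink not in dist:
        return float("inf"), []
    path = [sink]
    while path[-1] != source:             # walk back along predecessors
        path.append(prev[path[-1]])
    return dist[sink], path[::-1]
```

In a small example where the direct link from `s` to `b` costs 4 but the links `s` to `a` and `a` to `b` cost 1 each, the function prefers the longer route through `a`, since total link cost rather than hop count is minimized.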

3.3 Hierarchical Routing Protocol

A sensor network should be designed as a hierarchical clustering structure to reduce communication costs and energy usage and thereby extend the network lifetime. Clustering is a network management procedure that builds a hierarchical structure on a flat network and provides scalability and robustness. A cluster is a group of associated nodes that work closely together for the same purpose and belong to the same topological structure. A cluster head is responsible for data aggregation, information dissemination, network management and resource allocation for all nodes belonging to the cluster, and it is directly connected to its neighbouring clusters to carry out inter- and intra-cluster communication tasks.

The well-known LEACH [24] protocol uses a cluster-based hierarchical model with random rotation of the cluster heads to distribute the energy load evenly among the sensor nodes in the network. LEACH [24] has a two-layer architecture: one layer for communication within the clusters and the other for communication between the cluster heads and the sink. LEACH [24] is not applicable to large networks and cannot ensure real load balancing. The ECR [25] protocol is based on LEACH [24]; it presents two different hierarchy structures of the network topology and is applied to networks that continuously send monitored messages to the sink and in which data from neighbouring nodes are highly correlated. In [26], the authors proposed the HEED protocol to prolong network lifetime by distributing energy consumption. HEED [26] is a fully distributed clustering method in which communication between cluster heads and the base station can take place in a multi-hop fashion. In HEED [26], cluster formation in each round imposes significant overhead, which decreases the network lifetime.
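The random rotation of cluster heads in LEACH [24] is commonly described through a per-round threshold T = p / (1 − p (r mod 1/p)), where p is the desired fraction of cluster heads and r is the round number. This chapter does not state the formula, so the sketch below should be read as an illustration drawn from the general LEACH literature. The function names and the `eligible` set (nodes that have not yet served as cluster head in the current epoch of 1/p rounds) are assumptions.

```python
import random


def leach_threshold(p: float, round_no: int) -> float:
    """Election threshold for nodes that have not yet been cluster head
    in the current epoch of 1/p rounds."""
    epoch = round(1 / p)
    return p / (1 - p * (round_no % epoch))


def elect_cluster_heads(eligible, p: float, round_no: int, rng: random.Random):
    """Each eligible node elects itself if its random draw is below the threshold."""
    threshold = leach_threshold(p, round_no)
    return {node for node in eligible if rng.random() < threshold}
```

With p = 0.1 the threshold starts at 0.1 in the first round of an epoch and rises to 1 in the last, so every node that has not yet served becomes a cluster head before the epoch ends. This is how the energy load is rotated among the nodes.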
TEEN [27] is designed for time-critical applications that must respond to sudden changes in the sensed data. In TEEN [27], the cluster head uses hard threshold and soft threshold values, but the protocol is not suitable for applications that require periodic reports. In PEGASIS [28], nodes communicate with their closest neighbours and with the base station; the method introduces excessive delay for distant nodes, and the single leader can become a bottleneck. In PEGASIS [28], the greedy algorithm keeps the distance of each hop to a minimum but cannot achieve optimal routing over the whole network. The PEDAP [29] routing scheme extends PEGASIS [28]: all nodes are organized into a minimum spanning tree and receive routing information from the base station. PEDAP [29] achieves a shorter communication time to forward packets to the destination and can be used for reliable communication with high bandwidth utilization.

EEUC [30] is a distributed algorithm in which cluster heads are elected locally; because the network topology varies frequently, this routing technique may consume much more energy than others. To address the hot spot problem, EEUC [30] introduces an unequal clustering mechanism that balances energy consumption and achieves a remarkable improvement in network lifetime. In [30], global data aggregation can create overhead for all nodes and degrade network performance. The energy-efficient routing algorithm based on unequal clustering and connected graph [31] is a distributed approach that improves energy efficiency in two ways: cluster head election and cluster routing. A voting scheme is used to construct clusters of unequal size, and smaller clusters are constructed near the base station to reduce intra-cluster traffic, which eliminates the hot spot problem. This protocol [31] achieves a longer network lifetime than EEUC [30] and HEED [26]. In [32], the authors proposed the energy degree distance unequal clustering algorithm to approximately equalize energy consumption; it improves network lifetime and eliminates the hot spot problem. The Sierpinski triangle method is used in [32] to divide the network into unequal clusters. This protocol effectively balances energy consumption and prolongs network lifetime. In NEECP [33], cluster heads are selected by means of an adjustable sensing range, and data aggregation is achieved through the formation of chains. An adjustable sensing range addresses the energy hot spot problem and minimizes energy consumption. NEECP [33] performs better than LEACH [24] and HEED [26].
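The hard and soft thresholds of TEEN [27] mentioned above can be sketched as a simple reporting rule. The sketch follows the usual description of TEEN: a node reports when the sensed value first reaches the hard threshold, and afterwards only when the value is still at or above the hard threshold and differs from the last reported value by at least the soft threshold. The chapter itself only names the two thresholds, so the exact rule, the class and the method name are assumptions.

```python
class TeenSensor:
    """Reporting rule driven by a hard and a soft threshold."""

    def __init__(self, hard: float, soft: float):
        self.hard = hard          # minimum value worth reporting
        self.soft = soft          # minimum change worth reporting again
        self.last_sent = None     # last value that was transmitted

    def should_transmit(self, value: float) -> bool:
        if value < self.hard:
            return False          # below the hard threshold: stay silent
        if self.last_sent is None or abs(value - self.last_sent) >= self.soft:
            self.last_sent = value
            return True
        return False              # change too small since the last report
```

With a hard threshold of 50 and a soft threshold of 2, the readings 40, 51, 52, 53.5, 49 and 56 lead to reports only for 51, 53.5 and 56. A node whose readings stay below the hard threshold never transmits, which is why the protocol does not suit applications that need periodic reports.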
Hierarchical power-aware routing (HPAR) [34] divides the network into groups of sensors; each group of sensors in geographic proximity is clustered together as a zone. TTCRP [35] introduces a power control algorithm that allows isolated sensor nodes and cluster heads to change their transmission power dynamically, connecting sensor nodes with otherwise unreachable clusters and making the network more robust. In DCWE [36], the weight-based cluster head selection algorithm effectively reduces the energy consumed by data transmission. MAXD-N [37] selects candidates on the basis of maximum degree and determines the cluster head according to a negotiation strategy. In [37], during the cluster head selection phase, promotion from candidate to cluster head requires information from all neighbours. BCDCP [38] resolves the cluster head distribution problem and ensures similar power dissipation across cluster heads; a TDMA schedule is used to assign time slots to cluster members. BCDCP [38] requires network node information before the cluster heads are selected, and the approach does not perform well in large networks.

The clustering periodic event-driven and query-based routing protocol (CPEQ) [39] is based on the PEQ [39] mechanism. CPEQ [39] employs an energy-aware cluster head selection mechanism in which the sensor nodes with more residual energy are selected as cluster heads, to increase the network lifetime and distribute energy consumption better among the sensor nodes. CPEQ [39] performs data aggregation to reduce repetitive data transmission and minimize energy consumption, but in a highly dense network a large amount of energy is wasted during transmission. ICE [40] uses an acknowledgement-based approach to discover faulty paths and provides QoS by finding the lowest-cost route for high-priority event notification messages. The ICE [40] protocol has the benefits of both CPEQ [39] and PEQ [39], and it ensures load balancing, network longevity and fault tolerance through multipath routing. Cheng et al. proposed an energy-efficient weight clustering algorithm (EEWCA) [41] that is flexible, in that its coefficients can be adjusted for different networks. This routing mechanism reduces energy consumption through a distributed cluster formation procedure, extending the network lifetime and ensuring data delivery. Mittal et al. [42] proposed SEECP, a reactive protocol in which a predetermined number of cluster heads are selected deterministically on the basis of the residual energy of the nodes, and data are transmitted according to a threshold condition. In [43], an energy-aware distributed unequal clustering protocol is extended to improve the lifespan of multi-hop heterogeneous WSNs. This protocol is used for data gathering applications and avoids the hot spot problem in multi-hop heterogeneous sensor networks. In [44], the author proposed EECCCP for three-level energy heterogeneous WSNs, in which a circular area is divided into concentric zones.
The advanced nodes are deployed in the region between the two zones; the normal nodes and the super nodes send their data directly to the sink, while the advanced nodes use a clustering-based approach. Cluster head selection is based on each node's residual energy and the average energy of the network. P-SEP [45] is used to prolong the stable period of heterogeneous hierarchical WSNs in which normal nodes are deployed randomly and energy-rich advanced nodes are deployed at predetermined locations. P-SEP [45] makes a fully distributed choice of cluster heads on the basis of weighted probabilities specific to the node type, taking into account the average energy of the nodes in the current round and the initial energy of the advanced nodes. Yang et al. proposed UCR-H [46], an unequal cluster-based routing scheme for three-level energy heterogeneous WSNs that avoids the energy hole problem. In [46], the WSN field is partitioned into an optimum number of equal-sized rectangular units, and the method adjusts the number of clusters in each unit, the cluster sizes and the round threshold. In [47], the author proposed a QoS-aware and heterogeneously clustered routing protocol for four-level energy heterogeneous WSNs, which uses multipath intra-cluster communication. The path metric is calculated from the initial energy, the expected transmission count of the path and the path loss, and on the basis of this metric separate paths are selected for different traffic requirements. The protocol shows improvements in network lifetime, throughput, stability and end-to-end delay. Bigdeli et al. [48] proposed an incremental two-layer cluster-based structure for anomaly detection, whose core idea is to cluster network data and represent these clusters as a Gaussian mixture model; it can classify new instances and also detect and discard redundant ones.

In [49], QoS-aware routing protocols are surveyed on the basis of the network architecture, and the authors classify the surveyed protocols into two categories: multi-sink designs and single-sink designs. The QoS-aware routing protocols in each category are compared in terms of their operation, advantages and shortcomings. In [50], the authors proposed a dynamic clustering-based energy-efficient and QoS-aware routing protocol that exploits the intelligent search and mating behaviour of birds to address the clustering and routing problems in WSNs. In [51], the authors discussed routing algorithms for heterogeneous WSNs and treated routing algorithms for energy harvesting WSNs as a special case of the energy heterogeneous scenario. In cluster-based energy heterogeneous settings, where nodes have different initial energies, the nodes with higher energy are preferred for energy-intensive operations. In [52], the authors applied a data aggregation technique to the clustered network so as to distribute energy usage evenly among all nodes of the network. The technique showed significant improvements in energy, network lifetime, throughput and average packet delivery ratio. However, it led to increased overhead owing to the extra computational tasks at the cluster heads.
In [53], the authors proposed a mechanism that estimates buffer-based congestion and takes the appropriate action at a node in a self-organized clustered network, in which each node monitors congestion independently. The components of the framework work together and are highly effective in dealing with multiple interfering flows, achieving high delivery ratios, low delays and higher throughput, along with substantial energy savings from a considerable reduction in packet loss through effective regulation of the network load.

The literature review shows that the selection of cluster heads, cluster formation and the rotation of cluster heads strongly affect the performance of the whole network; well-selected cluster heads not only decrease energy consumption but also prolong the network lifetime.

Comparison among hierarchical routing protocols

Algorithm | Mobility | Position awareness | Power usage | Negotiation based | Data aggregation | Localization | Complexity | Scalability | Multipath | Query based
LEACH [24] | Fixed BS | No | Max | No | Yes | Yes | CH selection | Good | No | No
ECR [25] | No | No | Limited | No | No | Yes | CH selection | Low | Yes | No
HEED [26] | No | Yes | Good | No | Yes | Yes | CH selection | Medium | Yes | No
TEEN [27] | Fixed BS | No | Max | No | Yes | Yes | CH selection | Good | No | No
PEGASIS [28] | Fixed BS | No | Max | No | No | Yes | CH selection | Good | No | No
PEDAP [29] | No | No | Moderate | No | Yes | Yes | CH selection | No | No | No
EEUC [30] | No | Yes | Limited | Yes | Yes | Yes | CH selection | Yes | Yes | Yes
HPAR [34] | No | No | N/A | No | No | Yes | CH selection | Good | No | No
MAXD-N [37] | No | No | Limited | Yes | Yes | Yes | CH selection | Yes | No | No
BCDCP [38] | No | No | Good | No | No | Yes | CH selection | Low | Yes | No
PEQ [39] | No | No | Moderate | Yes | No | Yes | CH selection | Limited | Yes | Yes
CPEQ [39] | No | No | Good | No | No | Yes | CH selection | Good | No | No
EEWCA [41] | No | Yes | Limited | Yes | No | Yes | Low | Good | Yes | No
EECCCP [44] | No | Yes | Moderate | No | No | Yes | CH selection | Good | Yes | Yes
P-SEP [45] | No | Yes | Moderate | No | Yes | Yes | Low | Good | No | No

Comparison among hierarchical routing protocols (continued)

| Algorithm | Mobility | Position awareness | Power usage | Negotiation based | Data aggregation | Localization | Complexity | Scalability | Multipath | Query based |
|---|---|---|---|---|---|---|---|---|---|---|
| UCR-H [46] | No | Yes | Yes | No | Yes | Yes | CH selection | Good | Yes | Yes |
| QHCR [47] | Yes | No | Limited | No | Yes | No | CH selection | Good | No | No |
| HDAR [52] | No | Yes | Yes | No | Yes | Yes | CH selection | Good | Yes | Yes |
| EADUC [43] | No | No | Limited | Yes | Yes | Yes | Low | Good | Yes | No |
| CBFABBC [53] | No | No | Limited | No | No | Yes | Low | Good | Yes | No |

• Open Challenges
This section discusses the challenges envisaged for the different categories of routing protocols. In flat routing protocols, route discovery is carried out through flooding, where duplicate messages increase the network load and require additional bandwidth. Data aggregation is not applicable to flat routing topologies, so redundant data delivery demands additional bandwidth. Location-based routing protocols determine the routing paths using only position or geographic information about network nodes, which is one of their key limitations. The cost and use of GPS devices add further overhead to such routing. In cluster-based topologies, inter-cluster communication takes place through the cluster head, which creates a bottleneck, quickly drains the energy of the cluster head and makes the network unstable. Congestion control is another important research area in hierarchical routing; congestion must be controlled to achieve high energy efficiency, prolong the network lifetime and achieve quality of service. Sensor nodes have limited battery sources, and energy efficiency is unanimously considered the core design issue in overcoming the shortcomings of the aforementioned schemes. In flat routing, 53% redundant message delivery causes additional bandwidth requirement and affects the network lifetime, and a throughput of 20.9% has been recorded, which is not remarkable. Cluster head selection and cluster formation cause additional overhead in hierarchical sensor routing, but the clustering approach minimizes redundant message flow within the network. Owing to the reduction in duplicate message transmission, the network load and the additional bandwidth requirement decrease, and it has been identified that 56% of network lifetime and 57% of throughput have been achieved.
From the above discussion, it can be concluded that the cluster-based approach provides better performance than the other schemes.
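The contrast between flooding and cluster-based delivery can be made concrete with a small sketch. It is a toy model under stated assumptions and does not attempt to reproduce the percentages quoted above: the 3 x 3 grid topology, the helper names `grid`, `flood` and `clustered_messages`, and the rule that every node forwards a flooded message once to all of its neighbours are all hypothetical choices made for illustration.

```python
from collections import deque


def grid(rows, cols):
    """Build a 4-neighbour grid topology as an adjacency dict (illustrative only)."""
    adjacency = {(r, c): [] for r in range(rows) for c in range(cols)}
    for (r, c), neighbours in adjacency.items():
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if (r + dr, c + dc) in adjacency:
                neighbours.append((r + dr, c + dc))
    return adjacency


def flood(adjacency, source):
    """Flood one message: every reached node forwards it once to all neighbours.

    Returns (copies_delivered, duplicate_copies).
    """
    seen = {source}
    queue = deque([source])
    delivered = 0
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            delivered += 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    useful = len(seen) - 1  # one first-time copy per node other than the source
    return delivered, delivered - useful


def clustered_messages(cluster_sizes):
    """Messages per round when members report to their head and each head sends one aggregate."""
    return sum(size - 1 for size in cluster_sizes) + len(cluster_sizes)


delivered, duplicates = flood(grid(3, 3), (0, 0))
print(delivered, duplicates)          # copies delivered and how many were redundant
print(clustered_messages([3, 3, 3]))  # same nine nodes grouped into three clusters
```

In this toy topology most flooded copies are redundant, whereas the clustered scheme sends one message per member plus one aggregate per head. The two functions model different traffic patterns (dissemination versus collection), so the comparison is indicative only.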

4 Conclusions
Several resource constraints, in the form of limited computation, memory and power, make the problem of routing in wireless sensor networks an interesting and challenging one. The preceding section has described several flat routing protocols for sensor networks, where routing is achieved based on local knowledge only. Location-based routing determines the routing paths using only position or geographic information about network nodes. In location-based routing, all nodes are involved in the routing process and contribute to routing decisions by using localization methods and computing the best forwarding options. Hierarchical cluster-based routing protocols hold immense potential for energy efficiency in wireless sensor networks. Clustering algorithms have been an active research area in recent years. Clustering techniques provide low overhead for cluster head rotation as well as optimal traffic distribution among cluster heads while maintaining network connectivity and coverage. A comprehensive study of different categories of routing has been presented, which outlines the key features and compares the different approaches based on a taxonomy along with the primary metrics. From the above study, it is clearly observed that significant efforts have been made to address the different routing techniques. In several routing protocols, additional bandwidth is required for duplicate message delivery, which causes additional network load; since not all routing protocols offer data aggregation, redundant data is transmitted within the network and creates additional energy requirements. The cluster-based scheme is one of the key areas of concern in WSN, where cluster head selection and cluster formation should be given the highest priority. As a large number of packet transmissions are involved during communication, energy optimization techniques should be taken into consideration during the design of the protocol. Although many routing protocols have been proposed for WSN in the past few years, many problems and several challenges remain to be solved in sensor networks. The future vision of WSNs is to embed numerous distributed devices to monitor and interact with physical world phenomena, and to exploit the spatially and temporally dense sensing and actuation capabilities of those sensing devices.

References
1. Heinzelman, W., Kulik, J., Balakrishnan, H.: Adaptive protocols for information dissemination in wireless sensor networks. In: Proceedings of the 5th Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom ’99), Seattle, WA, Aug 1999
2. Intanagonwiwat, C., Govindan, R., Estrin, D.: Directed diffusion: a scalable and robust communication paradigm for sensor networks. In: MobiCom ’00: Proceedings of the 6th Annual International Conference on Mobile Computing and Networking, pp. 56–67, Aug 2000. https://doi.org/10.1145/345910.345920
3. Braginsky, D., Estrin, D.: Rumor routing algorithm for sensor networks. In: Proceedings of the ACM International Workshop on Wireless Sensor Networks and Applications, Atlanta, Georgia, USA, pp. 22–31, 28 Sept 2002
4. Yao, Y., Gehrke, J.: The Cougar approach to in network query processing in sensor networks. In: SIGMOD Record, Sept 2002. https://doi.org/10.1145/601858.601861
5. Ye, F., Zhong, G., Lu, S.: GRAdient broadcast: a robust data delivery protocol for large scale sensor networks. Wirel. Netw. 11, 285–298 (2005). https://doi.org/10.1007/s11276-005-6612-9
6. Deng, J., Han, R., Mishra, S.: INSENS: intrusion tolerant routing in wireless sensor networks. In: The 23rd International Conference on Distributed Computing Systems (ICDCS03), Rhode Island, May 2003
7. Cerpa, A., Estrin, D.: ASCENT: adaptive self configuring sensor networks topologies. IEEE Trans. Mob. Comput. 3(3), 272–285, July–Aug (2004). https://doi.org/10.1109/tmc.2004.16
8. Sohrabi, K., Gao, J., Ailawadhi, V., Pottie, G.J.: Protocols for self organization of a wireless sensor network. IEEE Pers. Commun. 7, 16–27 (2000)


9. Niculescu, D., Nath, B.: Trajectory based forwarding and its applications. In: MobiCom ’03: Proceedings of the 9th Annual International Conference on Mobile Computing and Networking, pp. 260–272, Sept 2003. https://doi.org/10.1145/938985.939012
10. Xu, Y., Heidemann, J., Estrin, D.: Geography informed energy conservation for ad-hoc routing. In: MobiCom ’01: Proceedings of the 7th Annual International Conference on Mobile Computing and Networking, pp. 70–84, July 2001. https://doi.org/10.1145/381677.381685
11. Yu, Y., Estrin, D., Govindan, R.: Geographical and energy-aware routing: a recursive data dissemination protocol for wireless sensor networks. UCLA Comp. Sci. Dept. Tech. Rep., UCLA-CSD TR-010023, May (2001)
12. Karp, B., Kung, H.T.: GPSR: greedy perimeter stateless routing for wireless sensor networks. In: Proceedings of MobiCom 2000, Boston, MA, Aug 2000
13. Zorzi, M., Rao, R.R.: Geographic random forwarding (GeRaF) for Ad hoc and sensor networks: energy and latency performance. IEEE Trans. Mob. Comput. 2(4), 349–365, Oct–Dec (2003). https://doi.org/10.1109/tmc.2003.1255650
14. Chen, B., Jamieson, K., Balakrishnan, H., Morris, R.: SPAN: an energy efficient coordination algorithm for topology maintenance in ad hoc wireless networks. Wirel. Netw. 8(5), 481–494, Sep (2002). https://doi.org/10.1023/A:1016542229220
15. Li, L., Halpern, J.: Minimum energy mobile wireless networks revisited. IEEE Int. Conf. Commun. 1, 278–283 (2001). https://doi.org/10.1109/icc.2001.936317
16. Champ, J., Saad, C.: An energy efficient geographic routing with location errors in wireless sensor networks. In: ISPAN ’08: Proceedings of The International Symposium on Parallel Architectures, Algorithms, and Networks, pp. 247–253, May 2008. https://doi.org/10.1109/ISPAN.2008.28
17. Seada, K., Zuniga, M., Helmy, A., Krishnamachari, B.: Energy efficient forwarding strategies for geographic routing in lossy wireless sensor networks. In: SenSys ’04: Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems, pp. 108–121, November 2004. https://doi.org/10.1145/1031495.1031509
18. Elrahim, A.G.A., Elsayed, H.A., Ramly, S.E., Magdy, M.I.: An energy aware WSN geographic routing protocol. Univ. J. Comput. Sci. Eng. Technol. pp. 105–11, Nov (2010)
19. Zhang, H., Shen, H.: Energy efficient beaconless geographic routing in wireless sensor networks. IEEE Trans. Parallel Distrib. Syst. 21(6), 881–896, June (2010)
20. Füßler, H., Widmer, J., Mauve, M., Hartenstein, H.: A novel forwarding paradigm for position based routing (with Implicit Addressing). In: IEEE Computer Communications Workshop (CCW 2003), pp. 194–200 (2003)
21. Stojmenovic, I., Lin, X.: Loop free hybrid single path/flooding routing algorithms with guaranteed delivery for wireless networks. IEEE Trans. Parallel Distrib. Syst. 12(10), 1023–1032, Oct (2001). https://doi.org/10.1109/71.963415
22. Malwe, S.R., Rohilla, S., Biswas, G.P.: Location and selective border cast based enhancement of zone routing protocol. In: 3rd International Conference on Recent Advances in Information Technology (RAIT), Dhanbad, 3–5 March 2016, pp. 83–88. https://doi.org/10.1109/rait.2016.7507880
23. Singh, P.K., Prajapati, A.K., Singh, A., Singh, R.K.: Modified geographical energy aware routing protocol in wireless sensor networks. In: International Conference on Emerging Trends in Electrical Electronics and Sustainable Energy Systems (ICETEESES), Sultanpur, 11–12 March 2016, pp. 208–212. https://doi.org/10.1109/ICETEESES.2016.7581386
24. Heinzelman, W.R., Chandrakasan, A., Balakrishnan, H.: Energy-efficient communication protocol for wireless micro sensor networks. In: Proceedings of the Hawaii International Conference on System Sciences, Maui, Hawaii, 4–7 Jan 2000
25. Wu, Z., Zhang, C., Chen, H.: Energy level based routing algorithm of multi sink sensor networks. In: IEEE International Conference on Networking, Sensing and Control, ICNSC 2008. https://doi.org/10.1109/icnsc.2008.4525358
26. Younis, O., Fahmy, S.: HEED: a hybrid energy efficient distributed clustering approach for Ad hoc sensor networks. IEEE Trans. Mob. Comput. 3(4), 366–379, Oct–Dec (2004). https://doi.org/10.1109/tmc.2004.41


27. Manjeshwar, A., Agrawal, D.P.: TEEN: a protocol for enhanced efficiency in wireless sensor networks. In: Proceedings of 1st International Workshop on Parallel and Distributed Computing, in Wireless Networks and Mobile Computing, 2001
28. Lindsey, S., Raghavendra, C.: PEGASIS: power-efficient gathering in sensor information systems. IEEE Aerosp. Conf. Proc. 3(9-16), 1125–1130 (2002)
29. Ozgur, H., Korpeoglu, I.: Power efficient data gathering and aggregation in wireless sensor networks. Proc. ACM SIGMOD Int. Conf. 32(4), 66–71 (2003)
30. Yong, H., Kyung, Y., Kim, T.: An energy efficient unequal clustering mechanism for wireless sensor networks. In: IEEE International Conference on Mobile Ad hoc and Sensor Systems Conference, 2005
31. Guiloufi, A.B.F., Nasri, N., Kachouri, A.: An energy efficient unequal clustering algorithm using “Sierpinski Triangle” for WSNs. Wirel. Pers. Commun. 88, 449–465 (2016). https://doi.org/10.1007/s11277-015-3137-0
32. Xia, H., Zhang, R.H., Yu, J., Pan, Z.K.: Energy efficient routing algorithm based on unequal clustering and connected graph in wireless sensor networks. Int. J. Wirel. Inf. Netw. 23(2), 141–150 (2016). https://doi.org/10.1007/s10776-016-0304-5
33. Chand, S., Kumar, R., Kumar, B., Singh, S., Malik, A.: NEECP: novel energy efficient clustering protocol for prolonging lifetime of WSNs. IET Wirel. Sens. Syst. 6(5), 151–157 (2016). https://doi.org/10.1049/iet-wss.2015.0017
34. Li, Q., Aslam, J., Rus, D.: Hierarchical power aware routing in sensor networks. In: Proceedings of the DIMACS Workshop on Pervasive Networking, May 2001
35. Khattak, A.U., Shah, G.A., Ahsan, M.: Two tier cluster based routing protocol for wireless sensor networks. In: IEEE/IFIP 8th International Conference on Embedded and Ubiquitous Computing (EUC) (2010)
36. Lin, M., Wang, Z., Zou, C., Yu, M.: Double cluster heads routing policy based on the weights of energy efficient for wireless sensor networks. In: 2010 International Conference on Computational and Information Sciences, Chengdu, pp. 696–699 (2010). https://doi.org/10.1109/iccis.2010.173
37. Wang, Q., Wang, C., Wang, Y.: A maximum degree and negotiation strategy based clustering algorithm for wireless sensor networks. In: IEEE Instrumentation and Measurement Technology Conferences (I2MTC) (2011)
38. Muruganathan, S.D., Ma, D.C.F., Bhasin, R.I., Fapojuwo, A.O.: A centralized energy efficient routing protocol for wireless sensor networks. In: Communication Magazine, IEEE, pp. 8–13, 2005
39. Boukerche, A., Pazzi, R.W., Araujo, R.B.: Fault tolerant wireless sensor network routing protocols for the supervision of context aware physical environments. J. Parallel Distrib. Comput. 66(4), 586–599 (2006)
40. Boukerche, A., Martirosyan, A.: An energy aware and fault tolerant inter cluster communication based protocol for wireless sensor networks. In: Proceedings of the Global Communications Conference, GLOBECOM ’07, Washington, DC, USA, 26–30 Nov 2007
41. Cheng, L., Qian, D., Wu, W.: An energy efficient weight clustering algorithm in wireless sensor networks. In: IEEE Japan-China Joint Workshop on Frontier of Computer Science and Technology, pp. 30–35 (2008). https://doi.org/10.1109/FCST.2008.24
42. Mittal, N., Singh, U., Sohi, B.S.: A stable energy efficient clustering protocol for wireless sensor networks. Wirel. Netw. 23, 1809–1821 (2017). https://doi.org/10.1007/s11276-016-1255-6
43. Gupta, V., Pandey, R.: An improved energy aware distributed unequal clustering protocol for heterogeneous wireless sensor networks. Eng. Sci. Technol. Int. J. 19(2), 1050–1058 (2016). https://doi.org/10.1016/j.jestch.2015.12.015
44. Chithra, A., Kumari, S., Shantha, R.: A novel 3-level energy heterogeneity clustering protocol with hybrid routing for a concentric circular wireless sensor network. Cluster Comput. 22 (2019). https://doi.org/10.1007/s10586-017-1310-9
45. Naranjo, P.G.V., Shojafar, M., Mostafaei, H., Pooranian, Z., Baccarelli, E.: P-SEP: a prolong stable election routing algorithm for energy limited heterogeneous fog supported wireless sensor networks. J. Super Comput. 73, 733–755 (2017). https://doi.org/10.1007/s11227-016-1785-9


46. Yang, L., Lu, Y.Z., Zhong, Y.C., Yang, S.X.: An unequal cluster-based routing scheme for multilevel heterogeneous wireless sensor networks. Telecommun. Syst. 68(1), 11–26 (2017). https://doi.org/10.1007/s11235-017-0372-6-0
47. Amjad, M., Afzal, M.K., Umer, T., Kim, B.: QoS aware and heterogeneously clustered routing protocol for wireless sensor networks. IEEE Access 5, 10250–10262 (2017). https://doi.org/10.1109/access.2017.2712662
48. Bigdeli, E., Mohammadi, M., Raahemi, B., Matwin, S.: Incremental anomaly detection using two-layer cluster based structure. Inf. Sci. Int. J. 429, 315–331 (2018). https://doi.org/10.1016/j.ins.2017.11.023
49. Yessad, N., Omar, M., Tari, A., Bouabdallah, A.: QoS based routing in wireless body area networks: a survey and taxonomy. J. Comput. Issue 3(2018), 245–275 (2018)
50. Faheem, M., Gungor, V.C.: Energy efficient and QoS aware routing protocol for wireless sensor network based smart grid applications in the context of industry 4.0. J. Appl. Soft Comput. 68, 910–922, July (2018). https://doi.org/10.1016/j.asoc.2017.07.045
51. Sharma, D., Ojha, A., Bhondekar, A.P.: Heterogeneity consideration in wireless sensor networks routing algorithms: a review. J. Super Comput. 75(5) (2018). https://doi.org/10.1007/s11227-018-2635-8
52. Saha, S., Chaki, R., Chaki, N.: Hierarchical data aggregation based routing for wireless sensor networks. In: Nguyen, N., Iliadis, L., Manolopoulos, Y., Trawiński, B. (eds.) Computational Collective Intelligence. ICCCI 2016. Lecture Notes in Computer Science, vol 9876. Springer (2016). https://doi.org/10.1007/978-3-319-45246-3_16
53. Saha, S., Chaki, R.: Cluster based framework for alleviating buffer based congestion for wireless sensor network. In: Saeed, K., Chaki, R., Janev, V. (eds.) Computer Information Systems and Industrial Management. CISIM 2019. Lecture Notes in Computer Science, vol 11703. Springer (2019). https://doi.org/10.1007/978-3-030-28957-7_16