Intelligent Computing Paradigm: Recent Trends [1st ed.] 978-981-13-7333-6;978-981-13-7334-3

This book includes extended versions of selected works presented at the 52nd Annual Convention of the Computer Society of India (CSI-2017).



Studies in Computational Intelligence 784

J. K. Mandal Devadutta Sinha Editors

Intelligent Computing Paradigm: Recent Trends

Studies in Computational Intelligence Volume 784

Series Editor Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland

The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence—quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output. The books of this series are submitted for indexing to Web of Science, EI-Compendex, DBLP, SCOPUS, Google Scholar and SpringerLink.

More information about this series at http://www.springer.com/series/7092

J. K. Mandal · Devadutta Sinha

Editors

Intelligent Computing Paradigm: Recent Trends


Editors J. K. Mandal Department of Computer Science and Engineering University of Kalyani Kalyani, West Bengal, India

Devadutta Sinha Department of Computer Science and Engineering University of Calcutta Kolkata, India

ISSN 1860-949X ISSN 1860-9503 (electronic) Studies in Computational Intelligence ISBN 978-981-13-7333-6 ISBN 978-981-13-7334-3 (eBook) https://doi.org/10.1007/978-981-13-7334-3 Library of Congress Control Number: 2019934794 © Springer Nature Singapore Pte Ltd. 2020 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

This volume, entitled “Intelligent Computing Paradigm: Recent Trends”, contains extended versions of some selected papers from CSI-2017 along with papers submitted independently by the authors against the call for papers. There are eight chapters in this volume. The first chapter deals with the classification of library resources in a recommender system. The second and fourth chapters deal with wireless sensor networks, studying the behavioral change and clustering of nodes. The third chapter deals with AI-based disease prediction using reliability, whereas the fifth chapter deals with product prediction and recommendation. The sixth chapter deals with reliability-, delay-, and power-aware area minimization of VLSI power grid networks. The seventh chapter deals with the detection of forest cover changes from remote sensing data, and the last chapter deals with the identification of malignancy from cytological images. The chapters were reviewed as per the norms of the Studies in Computational Intelligence book series. Based on the comments of the reviewers, the chapters were modified by the authors, and the modified chapters were examined for incorporation of the suggested changes. Finally, eight chapters were selected for publication in this special issue, Intelligent Computing Paradigm: Recent Trends, under the book series Studies in Computational Intelligence, Springer. On behalf of the editors, we would like to express our thanks to the authors for sending chapters to this special issue. We would also like to express our sincere gratitude to the reviewers for reviewing the chapters. We hope this special issue will be good material on state-of-the-art research.

Kalyani, West Bengal, India
Jyotsna Kumar Mandal, University of Kalyani

Kolkata, West Bengal, India
Devadutta Sinha, Calcutta University

April 2019


Contents

Improved Hybrid Approach of Filtering Using Classified Library Resources in Recommender System . . . . 1
Snehalata B. Shirude and Satish R. Kolhe

A Study on Collapse Time Analysis of Behaviorally Changing Nodes in Static Wireless Sensor Network . . . . 11
Sudakshina Dasgupta and Paramartha Dutta

Artificial Intelligent Reliable Doctor (AIRDr.): Prospect of Disease Prediction Using Reliability . . . . 21
Sumit Das, Manas Kumar Sanyal and Debamoy Datta

Bacterial Foraging Optimization-Based Clustering in Wireless Sensor Network by Preventing Left-Out Nodes . . . . 43
S. R. Deepa and D. Rekha

Product Prediction and Recommendation in E-Commerce Using Collaborative Filtering and Artificial Neural Networks: A Hybrid Approach . . . . 59
Soma Bandyopadhyay and S. S. Thakur

PGRDP: Reliability, Delay, and Power-Aware Area Minimization of Large-Scale VLSI Power Grid Network Using Cooperative Coevolution . . . . 69
Sukanta Dey, Sukumar Nandi and Gaurav Trivedi

Forest Cover Change Analysis in Sundarban Delta Using Remote Sensing Data and GIS . . . . 85
K. Kundu, P. Halder and J. K. Mandal

Identification of Malignancy from Cytological Images Based on Superpixel and Convolutional Neural Networks . . . . 103
Shyamali Mitra, Soumyajyoti Dey, Nibaran Das, Sukanta Chakrabarty, Mita Nasipuri and Mrinal Kanti Naskar

About the Editors

J. K. Mandal is former Dean of the Faculty of Engineering, Technology and Management, and Senior Professor at the Department of Computer Science & Engineering, University of Kalyani, India. He obtained his Ph.D. (Eng.) from Jadavpur University. Professor Mandal has co-authored six books: Algorithmic Design of Compression Schemes and Correction Techniques—A Practical Approach; Symmetric Encryption—Algorithm, Analysis and Applications: Low Cost-based Security; Steganographic Techniques and Application in Document Authentication—An Algorithmic Approach; Optimization-based Filtering of Random Valued Impulses—An Algorithmic Approach; and Artificial Neural Network Guided Secured Communication Techniques: A Practical Approach; all published by Lambert Academic Publishing, Germany. He has also authored more than 350 papers on a wide range of topics in international journals and proceedings. Twenty-three scholars have been awarded the Ph.D. degree under his supervision. His profile is included in the 31st edition of Marquis’ World Who’s Who, published in 2013. The Government of West Bengal, India, conferred the ‘Siksha Ratna’ award on him as an outstanding teacher in 2018. His areas of research include coding theory, data and network security, remote sensing and GIS-based applications, data compression, error correction, visual cryptography and steganography, and distributed and shared memory parallel programming. He is a Fellow of the Institution of Electronics and Telecommunication Engineers and a member of IEEE, ACM, and the Computer Society of India.

Prof. Dr. Devadutta Sinha graduated with honors in Mathematics from Presidency College and completed his postgraduation in Applied Mathematics and then in Computer Science. He completed his Ph.D. in the field of Computer Science at Jadavpur University in 1985. He started his teaching career at the Department of Computer Engineering at BIT Mesra, Ranchi, then moved to Jadavpur University and Calcutta University, where he was a Professor at the Department of Computer Science and Engineering. He also served as Head of the Department of Computer Science and Engineering, and Convener of the Ph.D. Committee in Computer Science and Engineering and in Information Technology at the University of


Calcutta. He also served as Vice-Chairman of the Research Committee in Computer Science and Engineering, West Bengal University of Technology. During his career, he has written a number of research papers in national and international journals and conference proceedings, as well as a number of expository articles in periodicals, books and monographs. His research interests include software engineering, parallel and distributed algorithms, bioinformatics, computational intelligence, computer education, mathematical ecology and networking. He has more than 38 years of teaching and research experience. He has also been on the editorial boards of various journals and conference proceedings and served in different capacities on the program and organizing committees of several national and international conferences. He was Sectional President, Section of Computer Science, Indian Science Congress Association for the year 1993–94. He is an active member of a number of academic bodies in various institutions. He is a fellow and senior life member of CSI and has been involved in different activities including the organization of different computer/IT courses. He is also a Computer Society of India Distinguished Speaker.

Improved Hybrid Approach of Filtering Using Classified Library Resources in Recommender System Snehalata B. Shirude and Satish R. Kolhe

Abstract The goal of the planned library recommender system is to provide the needed library resources quickly. The important phases to perform are building and updating user profiles and searching for the proper library resources. The proposed system uses a hybrid approach for filtering the available books on different subjects, research journal articles, and other resources. Content-based filtering evaluates the user profile against the available library resources, and the results generated satisfy the user’s need. The system can generate satisfactory recommendations, since most of the entries for books and research journal articles in the dataset are rich with keywords. This richness is achieved by referring to the abstract and TOC (table of contents) while adding records of research journal articles and books, respectively. Collaborative filtering computes recommendations by searching for users with similar interests. Finally, recommendations generated with the hybrid approach are provided to the active user. To simplify and improve the outcome of the recommendation process, the available records are categorized into distinct classes. The distinct classes are defined in ACM CCS 2012, and the classifier is the output of relevant machine learning methods. The paper discusses the improvement in results obtained by the hybrid approach due to the use of classified library resources.

Keywords Improvement in library recommender system · Filtering · Hybrid approach · Classification · ACM CCS 2012 · Machine learning

S. B. Shirude (B) · S. R. Kolhe
School of Computer Sciences, North Maharashtra University, Jalgaon, India
e-mail: [email protected]
S. R. Kolhe
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2020
J. K. Mandal and D. Sinha (eds.), Intelligent Computing Paradigm: Recent Trends, Studies in Computational Intelligence 784, https://doi.org/10.1007/978-981-13-7334-3_1

1 Introduction

The large amount of information and the need to find the user’s choice without asking the user directly are the challenges driving the upgrading of the digital library recommender


system. Searching for the proper library resources satisfying the user’s need is a major step in the implementation of such a system, and providing quick and relevant recommendations to the user is the basic objective [1]. Since this process requires making decisions and taking actions according to perception, an agent-based framework is fitting for the implementation of the system [2]. An agent’s filtering task is much like the task of taking out a specific book or journal of interest from a bookshelf arranged in a physical library. Therefore, if the library records in the dataset of the digital library are classified into the right categories, it is easy for an agent to find the right resource. The implementation of classification into the 14 classes given in ACM CCS 2012 is explained in [3]. Due to the use of classified library resources while implementing the filtering task, the system provides relevant recommendations. The process combines the results from both content- and collaborative-based techniques, which is the hybrid approach. This paper is divided into five sections. The detailed literature study related to the classification can be obtained from [3], as this is an extension of that work. Library resources classification is given in Sect. 2, which explains the architecture of the proposed system, the dataset prepared, and the novel classifier built using different machine learning techniques. Section 3 describes the way the generated recommendations are improved by the application of a hybrid approach, which makes use of classified library resources in the filtering step. Section 4 discusses the results of classification and the use of the hybrid approach. The conclusion of the experiment and the possible future scope are provided in Sect. 5.

2 The Framework of the System

The proposed system, along with the library resources classifier implemented, is described below [3].

2.1 Proposed System

Figure 1 explains the processes to be performed in the proposed library recommender system [1]. The taxonomy of such systems includes two phases, namely profile generation and maintenance, and profile exploitation. Representing, generating, and learning the user profile, and relevance feedback, are tasks included in profile generation and maintenance. Information filtering, matching the user profile with items and with other profiles, and adapting the user profile are the tasks included in the profile exploitation phase [1]. The flow includes the first step of registering the user, which creates a basic user profile. The primary responsibility of the profile agent is to keep the user profile implicitly updated. The dataset includes important entities such as all the library records and the user profiles in XML form. To satisfy the main purpose, which is to recommend


Fig. 1 Proposed framework of the recommender system

books or research papers to the logged-in user according to his or her choice, the system needs to exploit the user profile of the logged-in user, and searching is then required to generate the results/recommendations. The hybrid approach has been implemented by the proposed system. Similarity measures are essential to calculate the resemblance between the user profile and the library resources while implementing filtering [4].
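The chapter does not give the exact similarity measure used, so the following is only a minimal sketch of one common choice, cosine similarity over keyword-count vectors; the profile and resource keywords in the example are invented placeholders, not entries from the actual dataset.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(profile_keywords, resource_keywords):
    """Cosine similarity between two keyword lists viewed as count vectors."""
    p, r = Counter(profile_keywords), Counter(resource_keywords)
    dot = sum(p[k] * r[k] for k in set(p) & set(r))
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_r = sqrt(sum(v * v for v in r.values()))
    return dot / (norm_p * norm_r) if norm_p and norm_r else 0.0

# Illustrative (hypothetical) keyword vectors
profile = ["machine", "learning", "classification", "recommender"]
resource = ["classification", "naive", "bayes", "machine", "learning"]
print(round(cosine_similarity(profile, resource), 3))
```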

2.2 Dataset Preparation

The prepared dataset includes profiles of registered users (XML form), book/research journal records (XML form), and ACM CCS 2012 (SKOS form). Figures 2, 3 and 4 show samples of each, respectively [1].

2.2.1 User Profiles

See Fig. 2.

2.2.2 Library Resources

See Fig. 3.


Fig. 2 Sample user profile generated by the system

Fig. 3 Sample library resources added in the system in XML form


Fig. 4 Ontology used

2.2.3 Ontology

See Fig. 4.

2.3 Library Resources Classifier

A novel classifier is built using various machine learning techniques, and the results obtained are analyzed and compared. It is found that the PU learning approach using the Naïve Bayes algorithm outperforms the others. The details are given in [1, 3]. The extension to this work, which is improving the recommendations using the hybrid approach, is explained in the next section.
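For orientation only, the sketch below shows the general shape of a Naïve Bayes text classifier of the kind referred to above, using scikit-learn. The record texts and category labels are invented placeholders, and the PU-learning refinement described in [1, 3] is not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical library records (title/TOC keywords) and ACM CCS-style labels
records = [
    "neural networks deep learning classification",
    "database query optimization indexing",
    "wireless sensor network routing energy",
]
labels = ["Computing methodologies", "Information systems", "Networks"]

clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(records, labels)
print(clf.predict(["support vector machine learning"]))
```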


3 Improved Hybrid Approach to Generate Recommendations This section describes the application of the hybrid approach to the classified library resources to generate recommendations. This approach merges results from content as well as collaborative filters. The algorithm given in Sect. 3.1 explains the use of the hybrid approach.

3.1 Algorithm

HybridFilter (UserProfile, ClassifiedLibResources)
// UserProfile is the active user's profile in XML form; ClassifiedLibResources are the results of the Naïve Bayes classifier
Begin
  conRecom ← ContentFilter (UserProfile, ClassifiedLibResources)
      // ContentFilter is the content-based filter generating recommendations conRecom
  colRecom ← CollabFilter (UserProfile, ClassifiedLibResources)
      // CollabFilter is the collaborative filter generating recommendations colRecom
  hybridRecom ← CombineResults (conRecom, colRecom)
      // combine the results from both filters
End of HybridFilter

ContentFilter (UserProfile, ClassifiedLibResources)
// ContentFilter is the content-based filter generating recommendations conRecom
Begin
  intUser ← ProfileAgent (SI, UserProfile)
      // SI is the information searched for the user in text format and UserProfile is the active user profile;
      // ProfileAgent returns the interests of the user by reading the user profile
  intKeywordU ← RepresentVect (intUser)
      // intKeywordU is a vector of keywords representing the interests of the user
  intKeywordL ← RepresentVect (ClassifiedLibResources)
      // intKeywordL is a vector of keywords representing the classified library resources
  conRecom ← MeasureSim (intKeywordU, intKeywordL)
      // similarity is measured between the classified library resources and the user profile
  return conRecom
End of ContentFilter

CollabFilter (UserProfile, ClassifiedLibResources)
// CollabFilter is the collaborative filter generating recommendations colRecom
Begin
  distanceU ← ProfileAgent (Concepts, Users)
      // ProfileAgent returns a matrix giving the distance of users to the concepts in ACM CCS 2012
  SimUsers ← RetrieveSimUsers (UserProfile, distanceU)
      // from the distances returned by ProfileAgent, retrieve similar users having interests like the active user
  ratedLibResources ← IdentifyRatings (SimUsers, ClassifiedLibResources)
      // library resources rated by the similar users are collected
  colRecom ← MeasureSim (UserProfile, ratedLibResources)
      // similarity is measured between the rated library resources and the user profile
  return colRecom
End of CollabFilter

HybridFilter mainly performs the task of combining the recommendations generated by the content and collaborative filters. ContentFilter is basically an agent generating recommendations by matching all the available library resources with the interests of the active user. This agent refers to the profile agent, whose job is to identify the interests of the user; the profile agent performs subtasks such as collecting information about the interests of the user, noise removal, weight assignment, and updating. CollabFilter is basically an agent generating recommendations by identifying similar users and referring to the library resources they rated.
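CombineResults is not specified in detail in the chapter; a weighted sum of the two score lists is one common way to realize it. The sketch below is such an assumed combination, with hypothetical resource ids and scores, and is not the authors' implementation.

```python
def hybrid_recommend(content_scores, collab_scores, w_content=0.5, w_collab=0.5, top_n=5):
    """Combine content-based and collaborative scores with a weighted sum
    and return the ids of the top-n resources."""
    resources = set(content_scores) | set(collab_scores)
    combined = {
        r: w_content * content_scores.get(r, 0.0) + w_collab * collab_scores.get(r, 0.0)
        for r in resources
    }
    return sorted(combined, key=combined.get, reverse=True)[:top_n]

# Illustrative scores keyed by resource id
content = {"book_12": 0.82, "journal_7": 0.40, "book_3": 0.65}
collab = {"book_12": 0.70, "book_3": 0.90, "book_55": 0.35}
print(hybrid_recommend(content, collab, top_n=3))
```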

4 Results and Discussion

For evaluating the improved results, the dataset uses 25 different users and approximately 500 different book records plus 205 research journal articles.

4.1 Results of Improved Hybrid Approach

The results of the improved hybrid approach described in Sect. 3 are given below. Table 1 gives a comparison between the content-based, collaborative, and improved hybrid (using classified library resources) approaches. Precision, recall, and F1 are calculated using the numbers of relevant recommended, relevant not recommended, irrelevant recommended, and irrelevant not recommended items from the dataset of classified library records of books and journals. The values show that the improved hybrid approach provides more relevant recommendations


Table 1 Evaluation for 25 different users

Users    | Content based filter (%)   | Collaborative filter (%)   | Improved hybrid approach (%)
         | Precision  Recall  F1      | Precision  Recall  F1      | Precision  Recall  F1
User1    | 87.50   77.78   82.35   | 85.71   69.23   76.60   | 93.10   87.10   90.00
User2    | 75.00   37.50   50.00   | 73.33   61.11   66.67   | 90.00   90.00   90.00
User3    | 53.57   83.33   65.22   | 57.69   71.43   63.83   | 86.36   90.48   88.37
User4    | 42.42   77.78   54.90   | 38.71   75.00   51.06   | 80.77   91.30   85.71
User5    | 90.00   90.00   90.00   | 81.82   81.82   81.82   | 81.82   81.82   81.82
User6    | 33.33   75.00   46.15   | 45.45   71.43   55.56   | 76.92   66.67   71.43
User7    | 44.00   84.62   57.89   | 50.00   84.62   62.86   | 75.00   80.00   77.42
User8    | 40.00   57.14   47.06   | 40.00   57.14   47.06   | 81.25   81.25   81.25
User9    | 60.87   70.00   65.12   | 50.00   60.00   54.55   | 81.58   81.58   81.58
User10   | 60.00   47.37   52.94   | 64.71   57.89   61.11   | 78.57   84.62   81.48
User11   | 79.31   79.31   79.31   | 75.00   75.00   75.00   | 84.62   84.62   84.62
User12   | 58.33   75.00   65.63   | 58.82   71.43   64.52   | 78.38   78.38   78.38
User13   | 58.62   65.38   61.82   | 60.00   66.67   63.16   | 77.27   85.00   80.95
User14   | 61.54   76.19   68.09   | 64.29   75.00   69.23   | 84.62   75.86   80.00
User15   | 72.73   72.73   72.73   | 73.53   73.53   73.53   | 92.31   85.71   88.89
User16   | 63.64   77.78   70.00   | 64.71   78.57   70.97   | 80.77   91.30   85.71
User17   | 68.00   85.00   75.56   | 62.07   72.00   66.67   | 89.29   86.21   87.72
User18   | 60.00   64.29   62.07   | 55.56   62.50   58.82   | 81.82   75.00   78.26
User19   | 63.64   63.64   63.64   | 55.56   55.56   55.56   | 87.10   77.14   81.82
User20   | 69.57   72.73   71.11   | 66.67   70.00   68.29   | 72.73   85.71   78.69
User21   | 66.67   60.00   63.16   | 62.50   55.56   58.82   | 70.00   75.00   72.41
User22   | 64.71   78.57   70.97   | 60.00   75.00   66.67   | 76.19   84.21   80.00
User23   | 68.97   86.96   76.92   | 66.67   85.71   75.00   | 83.33   86.96   85.11
User24   | 63.16   80.00   70.59   | 58.82   76.92   66.67   | 71.43   83.33   76.92
User25   | 72.97   75.00   73.97   | 72.97   75.00   73.97   | 86.36   90.48   88.37
Average  | 63.14   72.52   66.29   | 61.78   70.32   65.12   | 81.66   83.19   82.28

satisfying the needs of users in comparison with the content-based filter or the collaborative filter alone. The innovation with respect to various aspects can be identified in comparison with other similar works [5–7]. These aspects include using the SKOS form of ACM CCS 2012 as the ontology, assigning weights to semantic terms and using them while matching, and the automatic update of profiles.
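For reference, the precision, recall, and F1 values in Table 1 follow the standard definitions over recommended and relevant item sets; the snippet below is a minimal illustration with invented resource ids, not the authors' evaluation code.

```python
def evaluate(recommended, relevant):
    """Precision, recall and F1 for one user's recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    tp = len(recommended & relevant)                     # relevant and recommended
    precision = tp / len(recommended) if recommended else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

# Illustrative lists of resource ids
print(evaluate(recommended=["r1", "r2", "r3", "r4"], relevant=["r1", "r3", "r5"]))
```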


5 Conclusions and Future Scope

The proposed recommender system for a digital library gives recommendations to logged-in users by merging the results given by the content filter and the collaborative filter. Library records are grouped into common classes so that the filters can refer to them easily at the time of generating recommendations. This is similar to the idea of putting related library books or journals onto a common shelf so that similar ones can be found in a group. Various machine learning techniques were experimented with, and the results were analyzed for comparison. The Naïve Bayes classifier using the PU learning approach performed well because no negative documents were present in the dataset; it can reach 90.05%. The resultant groups of library resources are employed to further improve the results/recommendations. These recommendations are generated by deploying content-based filtering, collaborative filtering, and the combination of both methods, that is, the hybrid approach. It is observed that the hybrid approach applied to classified library resources improves the recommendation results. Currently, the system applies the collaborative filter only after a number of users have registered, so that the system does not run into the cold-start problem; this problem can also be solved using other approaches. The approaches experimented with in the evolution of the recommender system can be applied to various problems, such as finding experts within a specific domain.

References

1. Shirude, S.B., Kolhe, S.R.: Agent based architecture for developing recommender system in libraries. In: Margret Anouncia, S., Wiil, U. (eds.) Knowledge Computing and Its Applications, pp. 157–181. Springer, Singapore (2018). https://doi.org/10.1007/978-981-10-8258-0_8. Print ISBN: 978-981-10-8257-3, Online ISBN: 978-981-10-8258-0
2. Montaner, M., López, B., De La Rosa, J.L.: A taxonomy of recommender agents on the internet. Artif. Intell. Rev. 19(4), 285–330 (2003)
3. Shirude, S.B., Kolhe, S.R.: Classification of library resources in recommender system using machine learning techniques. In: Annual Convention of the Computer Society of India. Springer, Singapore (2018)
4. Shirude, S.B., Kolhe, S.R.: Measuring similarity between user profile and library book. In: Information Systems and Computer Networks (ISCON), pp. 50–54. IEEE (2014)
5. Morales-del-Castillo, J.M., Peis, E., Herrera-Viedma, E.: A filtering and recommender system for e-scholars. Int. J. Technol. Enhanc. Learn. 2(3), 227–240 (2010)
6. Porcel, C., Moreno, J.M., Herrera-Viedma, E.: A multi-disciplinar recommender system to advice research resources in university digital libraries. Expert Syst. Appl. 36(10), 12520–12528 (2009)
7. Hulseberg, A., Monson, S.: Investigating student driven taxonomy for library website design. J. Electron. Resour. Libr. 361–378 (2012)
8. Vijayakumar, V., Vairavasundaram, S., Logesh, R., Sivapathi, A.: Effective knowledge based recommender system for tailored multiple point of interest recommendation. Int. J. Web Portals (IJWP) 11(1), 1–18 (2019)
9. Kaur, H., Kumar, N., Batra, S.: An efficient multi-party scheme for privacy preserving collaborative filtering for healthcare recommender system. Futur. Gener. Comput. Syst. (2018)


10. Gunawardana, A., Shani, G.: A survey of accuracy evaluation metrics of recommendation tasks. J. Mach. Learn. Res. 10, 2935–2962 (2009)
11. Azizi, M., Do, H.: A collaborative filtering recommender system for test case prioritization in web applications (2018). arXiv preprint arXiv:1801.06605
12. Kluver, D.: Improvements in Holistic Recommender System Research (2018)

A Study on Collapse Time Analysis of Behaviorally Changing Nodes in Static Wireless Sensor Network Sudakshina Dasgupta and Paramartha Dutta

Abstract Active participation of clustered nodes in a static Wireless Sensor Network offers comprehensive relief from the perennial problem arising out of the limited energy reserve. In this paper, we propose a statistical formulation for lifetime prediction based on the active and sleep probabilities of the participating sensor nodes in the network. This approach is able to estimate the collapse time of the entire network. It identifies two key attributes of the network that might affect the network lifetime: the node density and the active-sleep transition characteristic of the nodes. The simulation results further establish the relevance of the analytical study and assert that the overall network lifetime in general increases as the node density is increased. On the contrary, however, the comprehensive energy requirement of the network also increases. A trade-off between these two factors is observed by changing the active-sleep transition characteristics of the nodes in the network.

S. Dasgupta (B)
Government College of Engineering and Textile Technology, Serampore, India
e-mail: [email protected]
P. Dutta
Department of Computer and System Science, Visva Bharati University, Santiniketan, India
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2020
J. K. Mandal and D. Sinha (eds.), Intelligent Computing Paradigm: Recent Trends, Studies in Computational Intelligence 784, https://doi.org/10.1007/978-981-13-7334-3_2

1 Introduction

Due to the constraints in operational behavior, self-deployment structure, and limited computing and communication capabilities, managing node energy is crucial. Contemporary progress in wireless sensor technology has enabled the evolution of multifunctional, less costly, micro-sized sensor nodes that are capable of communicating over a short range. They are entrusted to sense physical environmental information such as temperature, humidity, and light intensity, with potential applications in the military, industrial, scientific, health care, and domestic domains [1]. This aggregated information needs to be processed locally with limited computation and sent to one or more


sink nodes through wireless communications. Because of the difficulty of battery replacement, preservation of energy is a prime challenge. The large number of self-organizing sensor nodes and the limited operational node energy have emerged as major design concerns in wireless sensor networks. Grouping sensor nodes into clusters is an effective option to manage energy consumption in a hierarchical structure, in order to reduce communication distance and transmission overhead [2]. In the case where the base station is outside the application region, the clustered wireless sensor network requires more energy to communicate with it, since energy is needed both for communication between a cluster head and the base station and between the member nodes of a cluster and their cluster head. Battery utilization of sensor nodes is associated with the lifespan of the sensors. The simulation considers a static network architecture with thousands of nodes deployed over the region of interest, where the base station is outside the application region. The position of the base station creates an adverse effect, so local data aggregation is preferred to direct communication with the base station. Therefore, clustering is significant to facilitate the deployment and functioning of wireless sensors. This architecture helps to reduce the comprehensive power consumption in the network, and this scalable feature of the sensor nodes helps to balance the network load. The energy depletion of a cluster head is much faster than that of the other member nodes of the cluster. In this paper, we address a probabilistic formulation of the collapse time of sensor nodes with respect to the change of operational modes between the active and sleep states, so that the collapse time of the entire network can be predicted. This prediction results in a scheduling methodology among the nodes that meticulously avoids an overloaded state. As sensors can behaviorally adjust their mode of action between the active and sleep states, a minimum number of nodes can cover the application area according to their sensing capacity, leaving the cluster alive. As a result, the cluster head is not overloaded with additional redundant information from overlapping service regions of sensor nodes. Numerical results show the operational behavior of the sensor nodes with different probabilities of transition from active to sleep, justifying the stability of the network. The mathematical analysis provides the node dependence of the entire network with respect to the collapse time of the sensor nodes, which determines the requirement of clustering of the concerned region space. The rest of the paper is structured as follows. Section 2 introduces a summarized brief of some existing works of relevance. Section 3 describes the proposed method and mathematical analysis. Sections 4 and 5 contribute the theoretical analysis and simulation results, respectively. Section 6 draws the conclusion of the work.

2 Literature Survey

Of late, Wireless Sensor Networks have become popular due to their exceptional capabilities across a substantial range of domains. WSNs are used to address innumerable challenges in the field of health and medicine by monitoring and transmitting data to a base station.


Therefore, these low-cost and lightweight sensor nodes are key devices for a monitoring system. However, the short lifespan of these battery-driven devices happens to be the bottleneck in ensuring a long-lasting monitoring service. Thus, energy is a critical concern for sensor lifetime. The hierarchical structure of the clustering mechanism has been applied to sensor networks to enhance the sustainability of network performance with reduced energy consumption. This approach results in considerable preservation of network liveliness and also decreases interference among the participating sensor nodes for channel access in a dense network. To overcome the organizational design limitations, a number of routing algorithms have been reported in the literature [3]. Replacement of batteries and recharging is not possible in an adverse environment. To maintain a long network lifetime, the nodes can be scheduled according to the changes of operational modes between the sleep state and the active state and vice versa. Therefore, prediction of the collapse time of sensor nodes is one of the major concerns in WSNs. Foudil Mir, Ahcène Bounceur, and Farid Meziane have outlined a novel approach for lifetime forecasting in WSN using statistical regression analysis, but the simulation was done only on smaller networks [4]. Kewei Sha and Weisong Shi [1] have proposed an IQ and a traditional model with respect to the remaining energy of each sensor node as well as the entire network, but the authors do not provide any consideration of sleep/active dynamics in the mathematical analysis. In [3], Rukpakavong proposed a technique conducted with different battery models on real hardware and software platforms in WSNs. The author explored several factors of sensor nodes, such as battery type, discharge rate, etc., but there was no consideration of duty scheduling of sensor nodes on a real testbed. Rodrigues in [2] highlighted a technique demonstrated by their developed software to evaluate the state of change of an analytical battery model with respect to time at different temperatures; here, the authors faced some problems in implementing complex analytical models on a low-capacity hardware platform. Stefano Abbate proposed a software-based framework for run-time monitoring of energy consumption by an energy tracker module in a WSN application [5]. This technique does not require any additional hardware cost but incurs code size and computational cost. Yunxia Chen and Qing Zhao explored the functional behavior of each node, capitalizing on its remaining energy information and the channel state information in the Medium Access Control protocol [6]. The proposed Max–Min approach escalates the minimum residual energy across the network in each transmission, and the achieved results demonstrated the improvement of the network lifetime performance with the size of the network. Our contribution in this paper is to bridge the gap between the implementation of estimated calculations of sensor network lifetime and the maximization of it. Here, we use a probabilistic approach that establishes the node dependence of the entire network with respect to the collapse time of sensor nodes. These calculations lead to predicting when the network is close to collapsing and when it needs to be re-formed into a cluster organization for survival. The proposed work offers a direction for the prediction of the lifetime of the network.


3 Proposed Work

Consider a scenario having N nodes whose active behavior is characterized by probabilities p_1, p_2, p_3, …, p_N. It is quite understandable that the network will crumble once all N nodes have been transferred to sleep mode. Therefore, network failure is subject to the number of active nodes among all the nodes at a given point of time. The nodes are initiated with the same initial energy after deployment and are permitted to communicate in a single-hop manner. The motivation of the paper is to analyze the transition characteristics of the sensor nodes with respect to the expected collapse or breakdown time. This changing behavior of the sensor nodes leads to an estimate of the expected lifetime of the sensor network. In this context, our proposed approach probabilistically tries to determine when the next round of clustering is required. This might, in turn, avoid unwanted clustering in each round of the algorithm and hence help preserve network energy.

Let B = breakdown time of the system; the breakdown (collapse) time takes a positive integer value at random. The transition characteristic of the ith node is given by the two-state transition matrix over the states {Active, Sleep}:

P_i = | p_i   1 − p_i |
      |  0       1    |

that is, an active node remains active with probability p_i and falls asleep with probability 1 − p_i, while the sleep state is absorbing.

Consider B_i = the shortest time duration in which the ith node falls asleep, 1 ≤ i ≤ N. Since the breakdown or collapse takes place only when all N nodes in the system attain the sleep state, the collapse time B is the maximum of all the individual B_i:

B = max_{1 ≤ i ≤ N} B_i    (1)

B_i follows a geometric distribution with parameter 1 − p_i:

B_i ~ Geometric(1 − p_i),  p_i ∈ (0, 1),  1 ≤ i ≤ N

Accordingly, the probability mass function of B_i is

P(B_i = l_i) = p_i^{l_i − 1} (1 − p_i),  l_i ≥ 1,  1 ≤ i ≤ N

and a small effort derives the probability distribution function of B_i:

P(B_i ≤ l_i) = 1 − p_i^{l_i},  l_i ≥ 1,  1 ≤ i ≤ N    (2)

Now, the event {B ≤ l} is equivalent to the intersection of the events {B_i ≤ l} for all i, 1 ≤ i ≤ N, and this is valid for all l ≥ 1:

{B ≤ l} = ∩_{i=1}^{N} {B_i ≤ l},  l ≥ 1

Therefore, because of independence,

P(B ≤ l) = ∏_{i=1}^{N} (1 − p_i^l),  l ≥ 1    (3)

Now,

P(B = l) = P(B ≤ l) − P(B ≤ l − 1) = ∏_{i=1}^{N} (1 − p_i^l) − ∏_{i=1}^{N} (1 − p_i^{l−1}),  l ≥ 1    (4)

and the expected breakdown time of the entire system is

E(B) = Σ_{l ≥ 1} l · P(B = l) = Σ_{l ≥ 1} l [ ∏_{i=1}^{N} (1 − p_i^l) − ∏_{i=1}^{N} (1 − p_i^{l−1}) ]    (5)

If identical node behavior is assumed, i.e., p_i = p for all i, 1 ≤ i ≤ N, then it can be shown by means of the requisite mathematical derivation that

E(B) = Σ_{m ≥ 0} [ 1 − (1 − p^m)^N ]
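For readers who wish to check the identical-node closed form numerically, the following is a small illustrative sketch, not part of the authors' MATLAB simulation; the function names, tolerances, and parameter values are assumptions for demonstration only.

```python
import numpy as np

def expected_collapse_time(p, N, tol=1e-12, max_terms=100000):
    """Evaluate E(B) = sum_{m>=0} [1 - (1 - p**m)**N] by truncating the series,
    for N identical nodes that stay active with probability p per time step."""
    total, m = 0.0, 0
    while m < max_terms:
        term = 1.0 - (1.0 - p**m) ** N
        total += term
        if m > 0 and term < tol:
            break
        m += 1
    return total

def simulate_collapse_time(p, N, runs=20000, rng=None):
    """Monte Carlo estimate: B is the maximum of N Geometric(1 - p) variables."""
    rng = np.random.default_rng() if rng is None else rng
    samples = rng.geometric(1.0 - p, size=(runs, N)).max(axis=1)
    return samples.mean()

if __name__ == "__main__":
    p, N = 0.9, 50  # illustrative values
    print("series  E(B):", expected_collapse_time(p, N))
    print("sampled E(B):", simulate_collapse_time(p, N))
```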

4 Theoretical Analysis

The mathematical analysis demonstrates the expected breakdown time of a sensor network consisting of N nodes with identical transition probability p_i. It establishes the fact that such a system in equilibrium is supposed to break after a stipulated


amount of time. The expected breakdown time of such a system can be predicted as it reaches a peak value for a certain value of p_i. The transition probability p_i might be varied to delay the overall breakdown time of such a network.

5 Simulation Result

The implementation is carried out in MATLAB to evaluate the performance of our algorithm. There are 200 sensor nodes distributed across a simulation area of 300 m × 300 m. The base station is positioned outside the simulation area. The sensor nodes are comparable in quality. The transmission radius of each sensor node is set to 10 m. Packets are generated at a constant rate of 1 packet per unit time. It is also considered that the sensor nodes are static and mobility is restrained. The first graph plots p along the horizontal axis and E(B) along the vertical axis for different choices of N. As N is made larger, the curve is found to shift toward the right, i.e., the curvature becomes steeper. The value of p at which E(B) assumes its maximum value is 0.9930. Very surprisingly, this maximizing value of p is observed to be independent of the choice of N. The height of the peak, however, is found to vary with different choices of N. From Fig. 1, as p, i.e., the activation probability of the nodes, is increased, the expected collapse time also increases. It is also shown that the convergence to the optimum value of E(B), i.e., the highest energy consumption of the network, is found to be faster. Accordingly, another experiment was conducted to explore the inherent relation between N and the height of the corresponding peak. This is reflected in Fig. 2. Here N is taken along the horizontal axis, whereas the corresponding maximum peak value is considered along the vertical

Fig. 1 Transition characteristics P versus Expected collapse time E(B)


Fig. 2 Number of nodes N versus Maximum energy consumption Y(max)

Fig. 3 Maximum value of E(B) versus Number of nodes N with transition probability 0.80

axis. The nature of this curve is also very interesting. From Fig. 2, it is observed that as N, i.e., the number of nodes in the WSN, increases, the value of y_max (the highest energy consumption) decreases. This shows that in a sparse WSN environment (when the number of nodes is small), most of the nodes have to be kept active during data gathering and hence consume maximum energy. But as N increases (dense network), due to the proper scheduling of the nodes, the overall energy consumption is reduced. Therefore, it is observed that with minimum node density, the overall energy consumption of the network is maximum, which is reflected in Fig. 2. As this minimum number of nodes takes the responsibility of covering the entire domain, the energy depletion becomes faster over the entire region space. In Fig. 2, as N increases, E(B) decreases. The height of the peak, i.e., the optimum value of E(B),


Fig. 4 Maximum value of E(B) versus Number of nodes N with transition probability 0.90

is varied with different values of N. So the final observation is that whenever the node density is minimum, the convergence rate toward the optimum value of E(B) is smooth, whereas whenever the node density is maximum, the convergence rate toward the optimum value of E(B) is much steeper. In a dense network, there are more chances of redundant information, which leads to consuming more energy (the cumulative energy of all nodes). Therefore, in this situation with high node density, the network needs to be clustered earlier. From Figs. 3, 4 and 5, it may be seen that the expected breakdown time of the network E(B) increases with the increasing number of nodes but remains independent of N over a certain range of values. The experiment has been performed with different values of p, that is, the probability of a node remaining active. It may also be noted that in Fig. 3, E(B) remains fixed for N = 45 to 100, signifying that in some cases the energy consumption of the network remains unaffected over a specific range of N (node density). All this together demonstrates the effect of the node density in the system on the presumed breakdown time. It is prominent from Fig. 6 that the expected breakdown time E(B) effectively increases with the increasing number of nodes N using the random and max–min protocols [6]. The max–min protocol operates on two physical-layer parameters, namely the CSI and REI of individual sensors, whereas the random protocol randomly chooses a sensor node for communication and utilizes neither CSI nor REI. By exploiting CSI and REI, the max–min protocol assumes an extra burden for transmission. With a maximized transition probability of 0.85, our proposed technique outperforms the network lifetime performance of the existing techniques. As the overall energy requirement of the network increases when going from a sparse to a dense network, the entire network lifetime benefits.


Fig. 5 Maximum value of E(B) versus number of nodes N with transition probability 0.95

Fig. 6 Comparison of the network lifetime of the proposed approach with pure opportunistic max–min technique

6 Conclusion

In this report, a conceptual skeleton along with its essential inference has been offered. The inference establishes the dependence of the lifespan of the WSN on node density and transition probability. It is observed that for a certain range of node density, the expected breakdown time of the entire network maintains uniformity. This behavior of the sensor network defines the scalability of the system, as this balancing nature improves the functional performance of the system. A comparative study with some existing protocols based on communication capability is also provided to analyze the expected lifetime in a realistic scenario.


References

1. Sha, K., Shi, W.: Modeling the lifetime of wireless sensor networks. Sens. Lett. 3, 110 (2005)
2. Rodrigues, L.M., Montez, C., Budke, G., Vasques, F., Portugal, P.: Estimating the lifetime of wireless sensor network nodes through the use of embedded analytical battery models. J. Sens. Actuator Netw. 6(8) (2017)
3. Rukpakavong, W., Guan, L., Phillips, L.: Dynamic node lifetime estimation for wireless sensor networks. IEEE Sens. J. 14(5), 1370–1379
4. Mir, F., Bounceur, A., Meziane, F.: Regression analysis for energy and lifetime prediction in large wireless sensor networks. In: INDS'14: Proceedings of the 2014 International Conference on Advanced Networking Distributed Systems and Applications, pp. 1–6 (2014)
5. Abbate, S., Avvenuti, M., Cesarini, D., Vecchio, A.: Estimation of energy consumption for TinyOS 2.x-based applications. Procedia Comput. Sci. 10, 1166–1171. Elsevier (2012)
6. Chen, Y., Zhao, Q.: On the lifetime of wireless sensor networks. IEEE Commun. Lett. 9(11), 976–978 (2005)

Artificial Intelligent Reliable Doctor (AIRDr.): Prospect of Disease Prediction Using Reliability Sumit Das, Manas Kumar Sanyal and Debamoy Datta

Abstract Presently, the diagnosis of disease is an important issue in the field of health care using Artificial Intelligence (AI). Doctors are not always present, and sometimes, although doctors are available, people are not able to afford them due to financial issues. Suppose only basic information like blood pressure, age, etc. is known at a given moment: without knowing any symptoms, how can the disease be predicted? And if people do know the symptoms, how can the disease be predicted from them? We look into both of these aspects, propose algorithms, and implement them for the welfare of society. The proposed algorithms are capable of classifying diseased people and healthy people in an efficient manner. In this work, the authors also link the concept of probability with fuzzy logic and describe how to interpret them. We can then consider the human being as a kind of machine, and we know that any machine can be described by a parameter called reliability; however, the classical definition of reliability fails miserably if used in the case of a human being. The aim of this paper is to build a bridge among fuzzy logic, probability, and reliability.

Keywords Gini coefficient · Reliability · Disease prediction algorithm · BMI · SVM · AIRDr

S. Das (B) Information Technology, JIS College of Engineering, Kalyani 741235, India e-mail: [email protected] M. K. Sanyal Department of Business Administration, University of Kalyani, Kalyani 741235, India e-mail: [email protected] D. Datta Electrical Engineering, JIS College of Engineering, Kalyani 741235, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2020 J. K. Mandal and D. Sinha (eds.), Intelligent Computing Paradigm: Recent Trends, Studies in Computational Intelligence 784, https://doi.org/10.1007/978-981-13-7334-3_3


1 Introduction

This paper utilizes the concept of reliability, which is defined for artificial intelligent machines. Here, the authors consider that human beings are in fact a kind of machine, for which the concept of reliability can be defined. In practice, some parts of a machine are more reliable than other parts. Consequently, for this purpose we redefine reliability in a form suitable for the analysis of disease, and we further combine the concepts of fuzzy logic and reliability to arrive at better results. Although artificial neural networks are suitable in many applications for classification as well as regression tasks, the concept of the support vector machine is more intuitive for this purpose, and both are used for comparison on our transformed data, which is described in the methodology section. The definition has been modified to include two critical parameters, BMI (body mass index) and age. It uses another parameter, which is statistical in nature, the Gini coefficient, in both algorithms: the Disease Prediction Algorithm 1 (DPA1), and the Disease Prediction Algorithm 2 (DPA2), for which diabetes is used as the disease for the current investigation. The Pima Indians dataset was used for the implementation of the second algorithm. The proposed methodology would enable efficient and reliable diagnosis of disease, and thereby the lives of many people would be saved. The authors also visualize the data by bee swarm plots and try to judge the quality of the data being used. This paper also points out the various pros and cons of the proposed method. At the outset we present a background study, and then we describe the methodology that we follow. This part is divided into two subsections, where one section represents the description of DPA1 for classification, from which someone can query from the user end. The next sections present the analysis and description of DPA2 and the results.

2 Literature Survey

All the people in this world experience disease, more or less, in their daily lives, and patients' lives are at risk when practitioners are unable to make a proper diagnosis. This is where Artificial Intelligence (AI) comes into play for automating the diagnosis [1]. An expert system was designed to evaluate the range of symptoms by the application of Artificial Neural Networks (ANN) and regression techniques [2]. In Rio de Janeiro city, a paper focused on Pulmonary Tuberculosis (TB) by training an ANN model for the purpose of classification; in that case, a Multilayer Perceptron (MLP) was applied, and self-organizing feature maps were used for group assignment for the patients with high complexity [3]. The seven variables, such as gender category, patient age, tendency of cough, loss of weight, sweating at night, and sitophobia, are very crucial in addition to the test data. In another inspiring paper, Centripetal-Accelerated Particle Swarm Optimization (CAPSO) was used for advanced learning in ANN to achieve higher accuracy [4]. Newton's law of motion and PSO were applied


for the enhancement of the algorithm. In these studies, the concept of data mining was applied to enhance the diagnosis [5]. Advance diagnosis of deadly diseases is achieved by ANN and regression techniques [6]. The literature survey provides a thorough process of encapsulating the concepts of statistical methods, data mining, and regression to empower the diagnosis system more accurately. Consequently, after the literature survey, the authors draw the following objectives:

• To improve the method of classification of disease over traditional methods that only utilize ANN or SVM.
• To propose two algorithms for disease prediction and analyze their effectiveness.
• To propose a connection between probability and the membership function [7].
• To propose a fuzzy membership function that changes with time.

The most important paper, by Shanker (1996), applied the method of feature selection to illustrate that variables such as BMI, glucose, and age of the PIMA INDIANS dataset carry the knowledge of the classification rate [8]. Thus, this paper takes inspiration from their results, and we define a variable R called reliability and propose a hypothesis: “Increased reliability means the person will be healthy”.

3 Methodologies

The term vague or fuzzy is used for the assessment of disease by interpreting the knowledge of symptoms. For example, in the statement “I am feeling EXTREMELY COLD”, the patient expresses the linguistic term EXTREME. Accordingly, the symptoms pair with linguistic variables such as very, extremely, etc. Thus, a fuzzy membership function is used to describe accurately the occurrence of disease, which is inherently associated with two probabilistic outcomes: the disease is there or it is not. Consequently, the perception is that there may be a relationship between probability and fuzzy membership. Based on the above arguments, P ∝ μ, i.e., P = kμ. Here, k is a constant; if the patient has the disease then μ = 1 and P = 1, which implies k = 1.

3.1 Gini Coefficient and Reliability in Primitive Health

The concentration measured by the Gini coefficient is obtained from the Lorenz curve [9]. The concentration is zero only when the quantity under investigation is evenly distributed, and it is one in the extreme case.

G = 1 − (1/n) Σ_{i=1}^{n} (v_{i−1} + v_i)    (1)


Fig. 1 Structure of the data after adding the reliability

v_i = (Σ_{j=1}^{i} x_j) / (Σ_{j=1}^{n} x_j),   u_i = i/n    (2)

for i = 0, …, n, with v_0 = 0. Here, the values are arranged as 0 < x_1 < x_2 < ⋯ < x_n, where x_i denotes the observations, and the Lorenz curve is obtained by plotting u along the x-axis and v along the y-axis. In this setting, G is the degree of randomness. For a particular group of people, BMI is large and there may be disease for some part of the population. If G = 1, it means some of the patients in the population have the disease; the sampled population is stored in a vector termed BMI. The concept of reliability is illustrated as follows:

R = (1 − G) ∗ Age ∗ BMI    (3)

where G is the average of the Gini coefficients over the entire range of variables, i.e., the average of the Gini coefficient computed for each column of the data shown in Fig. 1. We call this quantity the primed reliability; taking its negative gives the reliability, which we denote by R. The part (1 − G) in the equation measures how evenly the values are distributed over the ages and the BMI fractions. As we will see clearly in the result section, a high R in the plots indicates the presence of disease, so a high R means the person has the disease, while R′, its negative, indicates how reliable (healthy) the person is. We will stick to the two notations introduced here. The authors also fuzzify the other variables so that these variables work like symptoms, i.e., the membership function produces 1 for a particular variable when


Fig. 2 Hypothetical contour plot of MF

the patient absolutely has the disease. As described in the algorithm of Sect. 3.2, from line 12, we need to query the user, but in this paper we automate the process in such a way that we do not need to query the user. This is equivalent to asking “How high a glucose level do you feel?” and answering it to obtain the membership function in the algorithm. We describe this process in the result and analysis section. Afterward, all the variables are multiplied by R; as a result, in the x-y plot of any variable the top right corner always represents the diseased population. This will also be clear in the result and analysis section. Although numerous results have been obtained, we describe only those that are startling and new.
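The following is a minimal sketch of the Gini and reliability computations of Eqs. (1)-(3); the age and BMI values are illustrative, not taken from the Pima Indians dataset, and the column-wise averaging of G follows the textual description above rather than the authors' code.

```python
import numpy as np

def gini(values):
    """Gini concentration from the Lorenz curve, following Eqs. (1)-(2):
    G = 1 - (1/n) * sum_i (v_{i-1} + v_i), with v_0 = 0."""
    x = np.sort(np.asarray(values, dtype=float))
    v = np.concatenate(([0.0], np.cumsum(x) / x.sum()))
    n = len(x)
    return 1.0 - np.sum(v[:-1] + v[1:]) / n

def reliability(age, bmi, g_avg):
    """R = (1 - G) * Age * BMI, as in Eq. (3); its negative is the primed reliability."""
    return (1.0 - g_avg) * age * bmi

# Illustrative values only
ages = np.array([25, 40, 33, 51])
bmis = np.array([22.1, 31.4, 27.8, 35.0])
g_avg = (gini(ages) + gini(bmis)) / 2.0   # average Gini over the two columns
print([round(reliability(a, b, g_avg), 1) for a, b in zip(ages, bmis)])
```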

3.2 Modified Fuzzy Function The probability of a disease occurring is considered a function of the linguistic variable, which is further refined here. It is assumed that the human body is an automated repair system in which disease is recovered from by continuous repair. For this reason the Membership Function (MF) changes with time, i.e., the MF is a function of x and t. The patient can visit the doctor many times; the value of t starts at zero and is updated on each subsequent visit. The relationship among the three quantities x, μ, and t is depicted in Fig. 2. According to Fig. 2, the MF is retraced with time, and this can be represented mathematically as

M = ∬ ∇μ dx dt

The speculation is that M is an irrecoverable function with a finite integral value. The gradient of μ needs to be very large, and this can be interpreted from the contour plot with μ1 < μ2 < μ3.
Test-Case 1: For the black arrow in Fig. 2, the value of μ can be obtained from x and t. If μ takes fractional values such as 0.1, 0.2, and 0.3, then for a large variation of (x, t) the variation of μ is very small, which suggests that μ is approximately constant over time while the gradient is large.


Fig. 3 Interpretation of M. Case 1: the volume when the membership is nearly constant; Case 2: the volume when the membership is varied

Test-Case 2: Similarly, for the red (inner) arrow, the gradient is small.
Interpretation of M: The observation is that M rises finitely when μ is increased. Two constants k and ϕ are considered, where ϕ varies patientwise; the value of M varies, but ϕ is constant for a specific patient (Fig. 3). The value depends entirely on the weighted volume depicted in Fig. 3.

∬ ∇μ · ds ∝ ∭ ϕ dv    (4)

∬ ∇μ · ds = k ∭ ϕ dv    (5)

or

∭ ∇·(∇μ) dv = k ∭ ϕ dv    (6)

According to Gauss's divergence theorem,

∇²μ = k ϕ    (7)

with the initial and boundary conditions μ(5, 0) = 1, μ(0, 0) = 0, and lim_{t→∞} μ(x, t) = ∞. Hence, the authors derive the partial differential equation of the MF.


3.3 Result of Partial Differentiation The nonhomogeneous equation is converted into a homogeneous one by the substitution

μ(x, t) = f(x, t) + (1/2) k ϕ x²    (8)

The result is ∇²f = 0. This Laplace equation can be solved by separation of variables, and the derived solution is

μ(x, t) = ((1 − (25/2) k ϕ) / sin(5p)) · sin(px) · e^{pt} + (1/2) k ϕ x²    (9)

Equation 9 is the solution subject to the boundary conditions; its physical interpretation is considered in [10]. This links the relationship between probability and the MF; it also suggests that if the patient visits the doctor frequently, the recovery from the disease is only partial. The range of a radial basis function is 0–1, and the same holds for the MF, so the product of these two functions also ranges from 0 to 1. The input data are designed based on training in the following algorithm.

Disease Prediction Algorithm 1 (DPA1):
1. FOR i = 1 to n
   (a) WRITE "The age of patient";
   (b) Age = (age);
   (c) A(i,0) = age;
   (d) Compute BMI
   (e) BMI(i,0) ← (21.9 + 1.63) + (0.29 + 0.06)*age − (0.0028 + 0.0008)*age²;
   (f) END
2. Compute the Gini coefficient
3. G = (Gini coefficient)
4. FOR i = 1 to n
   (a) Input-value1(i,0) ← {BMI(i,0)*(1−G)*A(i,0)};
   (b) Train-value1(i,0) ← {1};
   (c) END
5. State the net
6. N1 = newrb(Input-value1, Train-value1, 0.5)
7. // where 0.5 is the spread after taking the samples
8. FOR i = 1 to n
   (a) A1 = (age);
   (b) A2 ← (21.9 + 1.63) + (0.29 + 0.06)*A1 − (0.0028 + 0.0008)*A1²;
   (c) A3 = G;
   (d) A = (A1*A2*A3);
   (e) R = sim(N1, A);
   (f) END
9. Compute μ
10. Symp = {'symp1','symp2','symp3'};
11. // 3 symptoms (symp) are assumed
12. FOR i = 1 to n
    (a) "Feelings of patient, symp(i,0)"
    (b) "READ 5 for extreme symp"
    (c) "READ 4 for very much symp"
    (d) "READ 3 for moderate symp"
    (e) "READ 2 for somewhat symp"
    (f) "READ 1 for a little bit symp"
    (g) "READ 0 for no feeling of symp"
    (h) X = (numeric feeling of symp);
    (i) p = 0.5
    (j) k = 1
    (k) ϕ = 1
    (l) meu(i,0) ← ((1 − (25/2) k ϕ)/sin(5p)) * sin(pX) * e^{pt} + (1/2) k ϕ X²
    (m) END
13. // Generate a perceptron
14. Input-value2 = {1,1,1,1,0.4,0.3,0.2,0.1,0,0.6,0.7,0.8,0.9};
15. Train-value2 = {1,1,1,1,0,0,0,0,0,1,1,1,1}
16. U = newp(Input-value2, Train-value2)
17. // Feed the samples
18. FOR i = 1 to n
    (a) L = meu*R;
    (b) Result = sim(U, L)
    (c) IF (Result == 1)
        i. WRITE "The likelihood of disease"
        ii. WRITE "meu"
        iii. END
    (d) ELSE
        i. WRITE "Healthy patient"
        ii. END
19. // Compute α
20. FOR i = 1 to n
    (a) α(i,0) = 1/(BMI · age⁻¹)
    (b) T(i,0) = {1}
    (c) END
21. // The α for healthy persons and the perceptron is about 1
22. // The perceptron net for placebo existence
23. NPlacebo = newp(α, T)
24. If an unknown α produces 1, it implies no placebo; otherwise, a placebo is indicated.

Fig. 4 The generated perceptron

Fig. 5 The radial basis network

The results were good, but the entire process has been improved to obtain highly accurate results, as described in the next section (Figs. 4, 5).
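As a concrete illustration of how the time-varying membership function of Eq. (9) can be combined with the reliability R as in DPA1, a minimal Python sketch is given below. The parameter values p = 0.5, k = 1, and ϕ = 1 are taken from the algorithm; the simple thresholding step stands in for the trained perceptron of steps 13–18 and is an assumption, not the authors' exact implementation.

import numpy as np

P, K, PHI = 0.5, 1.0, 1.0   # parameter values used in DPA1

def membership(x, t, p=P, k=K, phi=PHI):
    # Time-varying membership function of Eq. (9)
    a = (1.0 - 12.5 * k * phi) / np.sin(5.0 * p)
    return a * np.sin(p * x) * np.exp(p * t) + 0.5 * k * phi * x ** 2

def predict(symptom_level, reliability, t=0.0, threshold=0.5):
    # Gate the fuzzy membership by the reliability value, as in step 18 of DPA1.
    # The fixed threshold replaces the perceptron and is only a sketch.
    mu = membership(symptom_level, t)
    score = mu * reliability
    return ("likely diseased" if score >= threshold else "healthy"), mu

# Example: a symptom reported as 5 ("extreme") at the first visit (t = 0)
label, mu = predict(symptom_level=5.0, reliability=0.8)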


4 Result and Analysis 4.1 A Brief Description of the Dataset Used for Analysis In this paper, the PIMA INDIANS dataset is used [11]. In this dataset, all subjects are females at least 21 years old. Description of variables:
• Pregnancies: number of pregnancies.
• Glucose: plasma glucose concentration over a 2-hour interval.
• Blood Pressure: diastolic blood pressure.
• Skin Thickness: skin fold thickness.
• Insulin: 2-hour serum insulin.
• BMI: weight in kg/(height in m)².
• Age.
• Diabetes pedigree: gives knowledge of diabetes history, genetic influence, and risk factors [12].
• Outcome: either 1 or 0, with 1 indicating sick and 0 indicating healthy.

4.2 Analysis of the Techniques of Disease Prediction Algorithm 2 (DPA2) Steps followed for the analysis:
• Visualize the data with a bee swarm plot.
• Plot reliability against a variable to show a clear separation of the two types of outcomes, and interpret it.
• Demonstrate the effect of traditional methods such as clustering, multilayered perceptron, and KNN on the data.
• Propose, implement, and test the Disease Prediction Algorithm 2 (DPA2).
• Further improve DPA2 and devise a method called the z-database method.

4.2.1

Visualization of Data by Bee Swarm Plot

Before doing the analysis, let us visualize the data. Although the box plot is the most popular choice, this paper uses a different approach for visualization: the bee swarm plot [13]. This plot shows the number of observations belonging to a particular range of a variable. For example, consider Fig. 6: since there are many observations in the range 0–1 for the diabetes pedigree function, a large number of points appear in that region, and because these plots look like a swarm of bees they are called bee swarm plots.


Fig. 6 Showing the bee swarm plot of variable for different outcomes

Fig. 7 Showing the bee swarm plot of different variables for the two outcomes

These plots reveal a very important fact about the data used for the analysis: there are more data points corresponding to outcome 0. Hence, any model built from these data will tend to classify more sample inputs as outcome 0; this does not mean the model is inaccurate, but rather that data of sufficient quality are not available. Even so, the proposed model performs better than expected. Key points obtained from Figs. 6 and 7 are as follows:
• More observations lie in category 0, which means most of the people belong to the healthy category.
• Most observations for the variable age lie in the range 20–70, mostly in the group 20–40, as most of the points in Fig. 6 lie in this range.
• Most observations for the diabetes pedigree function lie in the range 0–1, mostly in the group 0–0.6, as most of the points in Fig. 6 lie in this range.
• BMI lies between 20 and 40; most observations lie in the range 20–35 for outcome 0, and for outcome 1 they are shifted slightly into the range 25–45. This becomes clear if one finds the center of the cluster of points, which is simply the mean of the observations, as is evident from Fig. 7.

Fig. 8 Normalized value of R vs. the diabetes pedigree function
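For reproducibility, a minimal sketch of how bee swarm plots such as those of Figs. 6 and 7 can be produced with the seaborn library is given below; the file name and column names are assumptions based on the variable descriptions above.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical file name; the PIMA Indians diabetes dataset is available from [11]
df = pd.read_csv("diabetes.csv")

# Bee swarm plot of one variable split by outcome, as in Fig. 6
sns.swarmplot(x="Outcome", y="DiabetesPedigreeFunction", data=df, size=3)
plt.xlabel("Outcome (0 = healthy, 1 = diabetic)")
plt.ylabel("Diabetes pedigree function")
plt.show()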

4.2.2

Plots of Reliability Against a Variable

To test the hypothesis, the authors imported the PIMA INDIANS dataset [11]. This dataset contains records of BMI, age, the diabetes pedigree function, and the outcome; if the outcome is 1 the person is diabetic, and if it is 0 the person is healthy. The R shown in the plots is the reliability defined earlier,

R = −(1 − G) ∗ Age ∗ BMI

where G is the average Gini coefficient of all the variables. To apply the techniques, the values of R were normalized to zero mean and unit variance, as is clear from the figure. Higher R means the person will be healthy. Thus, any machine learning algorithm can use this theory to distinguish the patterns in the dataset easily. Key observations from the plots of Figs. 8 and 9 are as follows:


Fig. 9 Plot of our data frame in pictorial format

• In Fig. 9, in the plots of the R-value against all other variables (the small boxes in the last column), it is seen that higher R-values correspond to the healthy people, represented by the points in red.
• The last row of Fig. 9 plots R on the y-axis against all other variables; for example, the first plot of the last row shows glucose on the x-axis and the R-value on the y-axis, and again higher R-values correspond to the healthy population.

4.2.3

Demonstration of Traditional Methods on the Data

The authors first analyze the above problem using clustering techniques; here the package Mclust in R is used. Two clusters are clearly visible, as expected from the arguments above. The figure shows a mixture model, i.e., a probabilistic model representing subpopulations within an overall population. The mixture distribution concerns the characteristics of the overall population and its subpopulations and is used for statistical inference. The mixture model assigns subpopulation identities to individual observations, which is the concept of unsupervised learning or clustering [14] (Fig. 10). The Gaussian mixture model is generated from a finite number of Gaussian distributions [15], and the iterative expectation–maximization algorithm is used to find the maximum-likelihood estimates of the parameters [10]. Thus, Fig. 11 clearly shows that, using the R-value and only limited information, the proposed theory yields good predictions.
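The clustering reported here was done with Mclust in R; an equivalent sketch using scikit-learn's Gaussian mixture model is shown below for readers working in Python. The two-feature choice (R and the diabetes pedigree function) follows the text, while the file name, the column name, and the placeholder Gini value are assumptions.

import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture

df = pd.read_csv("diabetes.csv")             # hypothetical file name, as before
G = 0.5                                      # placeholder for the average Gini coefficient
R = -(1.0 - G) * df["Age"] * df["BMI"]       # reliability, Sect. 4.2.2
R = (R - R.mean()) / R.std()                 # normalize to zero mean, unit variance

# Two-component Gaussian mixture fitted by EM on (R, diabetes pedigree function)
X = np.column_stack([R, df["DiabetesPedigreeFunction"]])
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
labels = gmm.fit_predict(X)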


Fig. 10 Clustering method applied to the above plot

Fig. 11 Clustering results where R and the diabetes pedigree function have been used

As can also be seen from the confusion matrix, the model does a good job of classification using only the values of R and the diabetes pedigree function. Next, the behavior of KNN is examined: using only these two variables, the results are classified fairly accurately, as shown in Fig. 12 (Fig. 13). In statistics, the Residual Sum of Squares (RSS), also known as the Sum of Squared Residuals (SSR), measures the deviation of the predictions from the empirical data [13]. Figure 14 shows that the multilayered perceptron model converges and is stable after only a few iterations.

Artificial Intelligent Reliable Doctor (AIRDr.): Prospect …

35

Fig. 12 Confusion matrix generated after classification using KNN. FALSE is 1 and TRUE means 0

Fig. 13 Diagrammatic representation of the multilayer perceptron generated

4.2.4

Analysis of DPA2

The authors execute the entire proposed methodology and call the method the z-database method. After adding a column of reliability values, the other variables are first fuzzified with a bell membership function. For the fuzzification, the value of each variable at which the membership function should produce a membership value of one is determined by the following procedure:


Fig. 14 Performance of the created model

1. i = NULL
2. j = NULL
3. cal = NULL
4. // here diabetes is the data frame having
5. // 10 variables, out of which only 8 are used for fuzzification
6. Diabetes = diabetes // storing it in another data frame
7. for(j in 1:8) {
8.   for(i in 1:nrow(diabetes)) {
9.     if(diabetes[i,9] == 1) {
10.       cal[i] = diabetes[i,j]
11.     }
12.   }
13.   c = mean(cal, na.rm = T)
14.   // if a variable has the value c, the membership function produces 1
}

Now, the ninth column is the outcome column. After using this value of c as a parameter for generating the bell membership function, the plots of Figs. 15, 16, 17, 18, 19, 20, and 21 were obtained. As is clear from these figures, after a certain x-value the membership function gives a value of 1, and when the value 1 is obtained it can be said that the value of the outcome variable would be 1.
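A generalized bell membership function parameterized so that it reaches 1 around the class-conditional mean c can be sketched as follows; the width and slope parameters a and b are illustrative assumptions, since the paper does not state the exact values used.

import numpy as np

def bell_mf(x, c, a=10.0, b=2.0):
    # Generalized bell membership function centered at c.
    # Returns values in (0, 1], reaching 1 at x = c; a controls the width and
    # b the steepness of the curve (both chosen here only for illustration).
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

# c would be the mean of a variable over the diseased records, as computed above
mu_age = bell_mf(np.linspace(20, 80, 200), c=45.0)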


Fig. 15 Plot of MF age with x-axis

Fig. 16 Plot of MF blood pressure with x-axis

Fig. 17 Plot of MF diabetes pedigree with x-axis


Fig. 18 Plot of MF insulin levels with x-axis

Fig. 19 Plot of MF BMI levels with x-axis

Fig. 20 Plot of MF number of pregnancies with x-axis



Fig. 21 Plot of MF skin thickness with x-axis

Fig. 22 Plot of our final transformed data frame for analysis

The reliability values are multiplied with each column to generate the z function, i.e., z = R*c, where c is the variable being transformed. The reliability is multiplied with all the columns, and the columns are then normalized so that the maximum value is mapped to 1 and the minimum value to 0. After these transformations, the resultant data frame is plotted in Fig. 22, with the striking fact that there is a clear separation of the outcome values in each plot. Given any new value, its corresponding outcome can easily be identified; that is the power of the method. Even a single plot is enough: points that lie in the top right corner are always those of healthy people. Hence, a linear hyperplane can accurately separate the healthy class. The main observations from the plot of Fig. 22 are given below:
• All points lying in the top right corner of any plot from the generated data frame belong to the category of healthy people.


Fig. 23 Showing the SVM that we created using a simple linear kernel

Fig. 24 Showing the confusion matrix on training and test datasets

• A linear boundary can easily separate the two classes of observations.
• The plot of the diabetes pedigree function versus R is a straight line.
• The plot of pregnancies versus R is also a straight line, indicating that an increasing number of pregnancies increases the reliability of the female (Figs. 23, 24).
The authors partitioned the transformed data frame into 70% for training and 30% for testing. As can be seen from the confusion matrix, there is a tendency to classify more 1's as 0's. The reason, as shown by the bee swarm plot, is that the data inherently contain more observations with outcome 0. With better quality data, having a nearly even number of observations of both outcomes, this method could beat any other method known so far.
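A compact sketch of the z-database pipeline with a linear-kernel SVM, matching the 70/30 split and the confusion matrices of Figs. 23 and 24, is given below; the file name, column names, and the min–max scaling details are assumptions consistent with the description above.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

df = pd.read_csv("diabetes.csv")                     # hypothetical file name
y = df["Outcome"]
features = df.drop(columns=["Outcome"])

# z-transform: multiply every column by the reliability R, then min-max normalize
Z = features.mul(R, axis=0)                          # R computed as in Sect. 4.2.2
Z = (Z - Z.min()) / (Z.max() - Z.min())

X_train, X_test, y_train, y_test = train_test_split(Z, y, test_size=0.3, random_state=0)
svm = SVC(kernel="linear").fit(X_train, y_train)

print(confusion_matrix(y_train, svm.predict(X_train)))   # training confusion matrix
print(confusion_matrix(y_test, svm.predict(X_test)))     # test confusion matrix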


5 Conclusion The main hypothesis, the concept of reliability, was successfully applied to the classification of disease. The proposed algorithms suggest that it can be applied in the field of medical diagnosis to some extent, and the z-database method described here can be extended to other classification tasks involving different types of disease. Any trained technician can handle the model, and it is more dependable than an ordinary practitioner's speculation; it yields confidence in diagnosis because it uses probabilistic reasoning. Findings of the work: the proposed Disease Prediction Algorithm 1 (DPA1) predicts disease accurately provided that queries can be made to the user. The Disease Prediction Algorithm 2 (DPA2), used in the z-database method, classifies the presence of disease very accurately provided that age, BMI, and the other parameters are known. The plots of Figs. 8 and 9 indicate that high R-values represent the absence of disease. The theoretical justification of time-varying membership functions for disease prediction is analyzed so that AIRDr. can serve society. The future scope of this paper is to apply the reliability concept to the prediction of diseases other than diabetes.

References 1. Das, S., et al.: AI doctor: an intelligent approach for medical diagnosis. In: Industry Interactive Innovations in Science, Engineering and Technology. Lecture Notes in Networks and Systems, vol. 11. Springer, Singapore (2017) 2. Adebayo, A.O., Fatunke, M., Nwankwo, U., Odiete, O.G.: The design and creation of a malaria diagnosing expert system. School of Computing and Engineering Sciences, Babcock University, P.M.B.21244 Ikeja, Lagos, Nigeria 3. https://www.springerprofessional.de/development-of-two-artificial-neural-network-m 4. Beheshti, Z., Shamsuddin, S.M.H., Beheshti, E., Yuhaniz, S.S.: Enhancement of artificial neural network learning using centripetal accelerated particle swarm optimization for medical diseases diagnosis, vol. 18(11), pp 2253–2270 (2014) 5. Yacout, S.: Logical analysis of maintenance and performance data of physical assets, ID34. D.Sc., PE, ÉcolePolytechnique de Montréal. 978-1-4577-1851-9/12/$26.00. IEEE (2012) 6. Das S., Sanyal M.K., Datta D.: Advanced diagnosis of deadly diseases using regression and neural network. In: Social Transformation—Digital Way. CSI 2018. Communications in Computer and Information Science, vol. 836. Springer, Singapore (2018) 7. Das, S., et al.: AISLDr: artificial intelligent self-learning doctor. In: Intelligent Engineering Informatics. Advances in Intelligent Systems and Computing, vol. 695. Springer, Singapore (2018) 8. Hung, M.S., Hu, M.Y., Shanker, M.S., Patuwo, B.E.: Estimating posterior probabilities in classification problems with neural networks. Int. J. Comput. Intell. Organ. 1(1), 49–60 (1996) 9. Gini coefficient. In: Wikipedia, The Free Encyclopedia. Retrieved 16:52, September 9, 2017, from https://en.wikipedia.org/w/index.php?title=Gini_coefficient&oldid=798388811 (2017) 10. Wikipedia contributors. Expectation–maximization algorithm. In: Wikipedia, The Free Encyclopedia. Retrieved 07:53, June 30, 2018, from https://en.wikipedia.org/w/index.php?title= Expectation%E2%80%93maximization_algorithm&oldid=847180107 (2018) 11. https://www.kaggle.com/uciml/pima-indians-diabetes-database


12. Smith, J.W., Everhart, J.E., Dickson, W.C., Knowler, W.C., Johannes, R.S.: Using the ADAP algorithm to forecast the onset of diabetes mellitus. In Symposium on Computer Applications in Medical Care, pp. 261–265 (1988) 13. www.cbs.dtu.dk/~eklund/beeswarm/ 14. Wikipedia contributors. Mixture model. In Wikipedia, The Free Encyclopedia. Retrieved 08:21, June 30, 2018, from https://en.wikipedia.org/w/index.php?title=Mixture_model& oldid=847267104 (2018) 15. Gaussian mixture models—scikit-learn 0.19.1… (n.d.). Retrieved from www.scikit-learn.org/ stable/modules/mixture.html

Bacterial Foraging Optimization-Based Clustering in Wireless Sensor Network by Preventing Left-Out Nodes S. R. Deepa

and D. Rekha

Abstract The primary aim of Wireless Sensor Network (WSN) design is achieving maximum lifetime of network. Organizing sensor nodes into clusters achieves this goal. Further, the nodes which do not join any cluster consume high energy in transmitting data to the base station and should be avoided. There is a need to optimize the cluster formation process by preventing these left-out nodes. Bacterial Foraging Optimization (BFO) is one of the potential bio-inspired techniques, which is yet to be fully explored for its opportunities in WSN. Bacterial Foraging Algorithm for Optimization (BFAO) is used in this paper as an optimization method for improving the clustering performance in WSN by preventing left-out node’s formation. The performance of BFAO is compared with the Particle Swarm Optimization (PSO) and LEACH. The results show that the BFAO performance is better than PSO and LEACH in improving the lifetime of network and throughput. Keywords Wireless sensor networks · Bacterial foraging algorithm · Particle swarm optimization · Clustering · Routing protocol

1 Introduction WSN has many sensor nodes characterized by limited resources, e.g., (i) power, (ii) processing capability, (iii) less internal storage, and (iv) restricted transmission/reception capacity. The capabilities of sensors are further limited by the power supply, bandwidth, processing power, and malfunction [1]. With such characteristics, the deployment of WSN was seen in various commercial applications, e.g., monitoring habitats in forests, inventory tracking, location sensing, military, and disaster relief operations [2]. The communication in WSN is characterized by three types of S. R. Deepa (B) · D. Rekha SCSE, VIT, Chennai, India e-mail: [email protected] D. Rekha e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2020 J. K. Mandal and D. Sinha (eds.), Intelligent Computing Paradigm: Recent Trends, Studies in Computational Intelligence 784, https://doi.org/10.1007/978-981-13-7334-3_4



routing protocols, e.g., flat, hierarchical, and location based [3, 4]. Many stochastic algorithms such as LEACH, LEACH-C, PEGASIS, HEED, etc. are applied to solve the WSN clustering problem [5]. The significant problem in majority of cases of energy consumption is found to be clustering process. The theoretical background of WSN has clear definition of clustering and routing, but there is very less literature found to attempt constructing bridging gap between routing and clustering process to be dynamic. It is because, in order to design a dynamic and adaptive form of routing along with clustering, the marginal characteristics of the sensors will need to support such algorithm implementation. With a scope of existing routing and clustering privilege, the present state of the sensor is not much in a robust state to sustain a better network lifetime. Some nodes are left out during the cluster formation and do not join to any cluster. Such nodes utilize higher power in transmitting data. Over time these nodes may die leading to reduction in network lifetime. This problem calls for implementing a better form of optimization-based approach that can bridge the trade-off between communication performances with computational efficiency. However, such optimization concepts are usually assisted by a sophisticated mathematical approach and have good chances to be iterative as well in order to explore the better solution to a problem. At present, there is a much rising discussion of computational benefits of bio-inspired-based optimization techniques, which could possibly assist in designing a dynamic process in the presence of real-time constraint to offer an elite outcome. Studies in this direction have witnessed approximate and heuristic algorithms involving swarm intelligence [6]. Bacterial Foraging Optimization (BFO) [7], etc. are employed to reduce the time complexity of combinatorial optimization problems. Although there are some good numbers of studies of PSO and ACO toward optimizing the performance of WSN, there is less standard implementation work of optimization carried out in WSN using BFAO. Here, a simple clustering optimization algorithm BFAO is introduced where energy efficiency is prioritized. Section 2 discusses prior research techniques toward clustering techniques followed by problem and motivation in Sect. 3. Discussion of proposed research methodology is carried out in Sect. 4 followed by the briefing of system model in Sect. 5. The obtained results are discussed, followed by a summary of implementation.

2 Previous Works on WSN Clustering Problem This section briefs about the existing research work toward improving clustering performance in WSN. The probabilistic distributed energy-aware LEACH approach is given in Heinemann [8] for routing. LEACH has low overhead by saving energy with rotating CH in a certain number of rounds. The disadvantage in the LEACH protocol was that the possibility of low-energy nodes being chosen as a CH in its election protocol of CH would decrease the network lifetime. LEACH was improved to LEACH-C. The base station (BS) assigns predefined number of CHs depending on the residual energy criteria during the setup phase. The heuristic bio-inspired


optimization algorithms using swarm intelligence have been applied to obtain global minimum communication cost. Swarm intelligence is a collective performance of tasks exhibited by colonies of ants, herds of sheep, swarms of bees, bats, colonies of bacteria, etc. PSO was proposed in [9]. The paper [10] reviews applications of PSO algorithm for problems of WSNs. The paper [11] applies the PSO method for clustering in WSN to prevent residual nodes in the network. The BFAO was proposed by Passino in 2002. The next section outlines the problems associated with existing technique and motivation for carrying out research work in this.

3 Problem and Motivation The problems associated with the existing studies are as follows: Majority of the existing approaches toward clustering performance improvement depends upon the selection strategy of the cluster head, which is normally carried out considering higher residual energy criteria. However, there could be possibly large number of dynamic and hidden attributes that could significantly affect the selection process. It is never being appropriately investigated. Some nodes that do not join any cluster consume more energy during communication of data due to exchange of control packets. Eventually, such nodes die leading to network death. There is a need for preventing left-out node formation. Hence, usage of swarm intelligence offers a significant platform to model such latent attributes and thereby includes dynamic behavior of environment to be testified on WSN. However, existing swarm intelligence techniques are quite iterative in principle and the majority of the researchers are more focused on implementing distributed scenario. However, yet such distributed scenario is not able to obtain better clustering performance till date. Moreover, there is no standard model reporting to use BFAO for enhancing clustering performance in WSN, which poses as one of the major research gaps. Hence, proposed study will be “To design a centralized BFAO algorithm that could significantly improve the clustering performance in WSN and thereby enhance network longevity.” A closer look at existing swarm intelligence algorithms shows that if they are made to work in centralized manner they have higher likelihood of being affected by the traffic density and cannot sustain the load eventually. BFAO algorithm never leads to any nonlinearity problems. The probability of BFAO to offer superior convergence is higher compared to any other analytical models presented till date. Another most important fact still remains that BFAO offers quite minimal computational burdens and exhibits enhanced capability to execute multiple objective functions. However, such potential features are less investigated on WSN, and hence it motivates to perform the proposed study toward using BFAO for enhancing clustering performance.


4 Research Methodology The proposed research work considers analytical modeling with emphasis on adoption of undirected graph theory for constructing a novel clustering algorithm. The prime idea is to construct a centralized BFAO algorithm that can offer better clustering processing. Here, BFAO is centralized as well as hierarchical which will finally create a base of any distributed routing technique. The potential advantage of considering centralized clustering scheme is to ensure regular update and seamless monitoring of any criterion (e.g., distance, energy, memory, algorithm execution, etc.) at a single instance. The proposed system introduces a simple and novel clustering technique using BFAO by preventing left-out node formation, which is also associated with the energy modeling. The proposed system performs energy modeling using standard first-order radio energy modeling that allows simple and easier inference of the energy performance of the sensor nodes in every state of its data forwarding operations. As the proposed system considers clustering as a discrete optimization problem therefore, selection of CH is also considered to be very crucial in enhancing network lifetime. The CHs are taken as the variables for the optimization. 2 × N coordinates in N nodes, xi ∀ i = 1, 2, . . . , N, are the variables for optimizing clusters. The 2 × N-dimensional search space (P) spans N nodes deployed in a two-dimensional region ranging from 0 to Xmax in x-coordinate and 0 to Ymaxin y-coordinate. The set  of solutions is given by the position of ith bacterium θi = x1x , x1y , . . . , xNx , xNy . The objective function is called as the nutrient function, f in the BFAO algorithm, assigns a positive candidate solution θi ∈ P. The optimal solution θopt   cost to each i satisfies f θopt ≥ f (θ), ∀θ ∈ P. The objective function f is given by f =α

(ECHavg / EOavg) + (1 − α) (DOavg / DCHavg)    (1)

where α = 0.5, EOavg , ECHavg are the average values of residual energy for ordinary nodes and CHs, respectively, and DOavg is the average value of distance for ordinary nodes to BS, and DCHavg is the average distance of CHs to BS. The percentage of nodes to become CHs, where popt is selected prior. There is provision to select each node as the CH in the 1/popt rounds, which is the epoch. Hierarchical routing protocol clusters the sensor nodes during each round of communication. Each communication round has two phases, setup phase to obtain the optimal routing path and steady phase to transmit sensing data by ordinary nodes to the BS via CHs. All the aggregated data are forwarded by CH to BS. BS runs centralized optimization algorithm to choose the CHs at starting along the minimum communication energy path. The sensor nodes are partitioned into the Ordinary node (O) and CH during clustering process in the setup phase at the BS. The nodes send the sensing data in the steady phase to the respective CH, which forwards it to the BS. Energy consumed is the addition of energy at setup and steady phase. In this work, nodes are considered to be in the transmission range of BS or the node distance


to BS is less than threshold distance, d0 . The proposed system applies the following operation in the cluster setup phase: • CH selection phase at the BS; • Assignment phase of assigning the ordinary nodes to the respective cluster; and • The routing path of the CHs depends on the data load corresponding to the nodes in cluster and distance from the BS; BS will find a multi-hop routing path to achieve load balancing of the CHs. • Transmission is by BS of message consisting of Node_ID (1 to Ntotal ), Node_type (CH or ordinary node), Cluster_ID (1 to K), and Routing_Node_ID (set of Node_ID). In this work, all the nodes are considered within the radio range or transmission range of the base station. – Reception of the same by every node Total setup phase energy used can be obtained by the sensor nodes Esetup , equality with Esetup_ON , energy consumed by ordinary nodes and Esetup_CH , and energy consumed by CHs. Energy Esetup is consumed in one round of setup phase by all nodes in the region of interest to receive la bits of assignment message from the BS. Esetup = la · E elec · Ntotal

(2)

1. Energy expenditure in the steady phase, Esteady: Ordinary nodes transmit their information to the BS through the route sent by the base station in the setup phase. The energy spent during the steady phase is the sum of the total energy of the CHs and of the ordinary (active) sensors. Esteady = Total_E_CH + Total_E_O

(3)

2. Total energy expenditure in one round of communication: The communication cost, Jround , in one round of transmission of sensory information of ordinary nodes to BS via CH is the addition of total energy expenditure by CH and ordinary nodes. Jround = la · Ntotal · E elec + T otal_E C H + T otal_E O

(4)

Hence, the proposed system introduces a significant enhancement in the clustering process where energy factor is further split into various sub-energies involved in the process of clustering with respect to communication. The BFAO sorts the nodes considering fitness factor that permits formation of the cluster only on the basis of highest value of fitness factor. It does this task by considering neighboring sensors, while the consecutive sensor from the sorted list constructs the cluster by including nodes in its transmission range. This operation continues until each cluster is formed. The next section further outlines the system model designed.
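As an illustration of the nutrient (objective) function of Eq. (1), a minimal sketch is given below; the arrays of residual energies and base-station distances are assumed inputs, and α = 0.5 follows the text.

import numpy as np

def nutrient(E_ch, E_ord, D_ord, D_ch, alpha=0.5):
    # Nutrient function of Eq. (1): higher values indicate better clustering.
    # E_ch, E_ord: residual energies of cluster heads and ordinary nodes (J)
    # D_ord, D_ch: distances of ordinary nodes and cluster heads to the BS (m)
    return (alpha * np.mean(E_ch) / np.mean(E_ord)
            + (1.0 - alpha) * np.mean(D_ord) / np.mean(D_ch))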


5 Proposed System Model In the current work, a centralized BFO algorithm is used to optimize cluster formation within the WSN; centralized clustering algorithms are suitable for managing many nodes. It uses the nutrient function defined in Eq. 1. The standard first-order radio energy model [9] is used to calculate the energy consumed in each round of communication. Each ordinary node in a cluster transmits l packets of sensory data to its CH. The amplification energy for transmission is εamp_fs = 0.01e−9 J/bit/m² in the free-space model (d < d0) and εamp_mp = 1.3e−15 J/bit/m⁴ in the multi-path model (d > d0), where d is the distance between nodes. The electronics energy for transmission and reception is Eelec = 50.0e−9 J/bit. The threshold distance is d0 = (εamp_fs/εamp_mp)^0.5 = 87.7 m. The Ntotal nodes are partitioned into k clusters; each cluster Cj contains nj ordinary nodes and one CH, so Ntotal = Σ_{j=1}^{k} (nj + 1). Equations 5–13 explain the energy consumption of CHs and ordinary nodes for one round of transmission of sensory data.
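A small sketch of this first-order radio energy model with the stated parameter values is given below; it is a generic implementation of the standard model, not code from the paper.

E_ELEC = 50.0e-9      # J/bit, transmit/receive electronics
EPS_FS = 0.01e-9      # J/bit/m^2, free-space amplifier (d <= d0)
EPS_MP = 1.3e-15      # J/bit/m^4, multi-path amplifier (d > d0)
D0 = (EPS_FS / EPS_MP) ** 0.5   # threshold distance, about 87.7 m

def tx_energy(bits, d):
    # Energy to transmit `bits` over a distance of d metres
    if d <= D0:
        return bits * (E_ELEC + EPS_FS * d ** 2)
    return bits * (E_ELEC + EPS_MP * d ** 4)

def rx_energy(bits):
    # Energy to receive `bits`
    return bits * E_ELEC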

5.1 Modeling Energy Consumption for CH The CH of cluster Cj receives l bits of data from each of the nj ordinary nodes in the cluster, so a total of l · nj bits are received by the CHj node. Therefore, the energy depleted by each CH due to reception is

E_RxCHj(l · nj) = l · nj · Eelec    (5)

The CHj transmits the information of the nj nodes of cluster Cj and also the data from its own sensory region, so a total of (l + 1) · nj bits are transmitted by the CHj. The energy depleted by the CHj in transmitting (l + 1) · nj data bits over a distance of Dj meters is

E_TxCHj((l + 1) · nj, Dj) = (Eelec + εamp · Dj²) · (l + 1) · nj    (6)

The energy used by the k CHs for transmission, reception, and amplification can be expressed as

E_CH = Σ_{j=1}^{k} [(2 · l + 1) · nj · Eelec + εamp · Dj² · (l + 1) · nj]    (7)

(a) The total energy consumed by the q CHs in one round of data transmission for Dj ≤ d0 (nearer nodes) is

Total_E_CH = Eelec · (2 · l + 1) · (Ntotal − q) + εamp_fs · (l + 1) · Σ_{j=1}^{k} nj · Dj²    (8)

(8) (b) Total energy consumed by q CHs in one round of transmission of data for D j > d0 (far nodes) is T otal_E C H = E elec · (2 · l + 1) · (Ntotal − q) + εamp_mp · (l + 1) ·

k 

n j · D 4j

j=1

(9)

5.2 Energy Consumption by Member Nodes The Cj cluster has nj ordinary nodes, and l bits of data are  sent  to the CHj at a distance d ij from the CH. The ordinary node loses energy, EON l, dij in the cluster, C j during transmission, and amplification of l bits of data to the CHj . Equation 10 gives the amount of energy depleted by sensor nj in the cluster, C j E O T xC j =

nj  

 E elec + εamp · di2j · l

(10)

i=1

Total energy consumed by all ordinary sensors in the region is equal to the energy consumed during transmission and amplification as there is no reception of data. EO =

nj k   

 E elec + εamp · di2j · l

(11)

j=1 i=1

Total energy consumed by (Ntotal − q) ordinary nodes: (a) For dij ≤ d0 , T otal_E O =

nj k   

 E elec + εamp_ f s · di2j · l

(12)

 E elec + εamp_mp · di4j · l

(13)

j=1 i=1

(b) For dij > d0 , T otal_E O =

nj k    j=1 i=1

50

S. R. Deepa and D. Rekha

The communication cost, J, in one round of transmission of sensory data from ordinary nodes to BS via CH is the sum of total energy expenditure by CH and ordinary nodes. J = T otal_E C H + T otal_E O

(14)

The abovementioned equation for computing total energy consumption is applicable for all the member nodes irrespective of their individual among each CH. The mechanism is further optimized by using BFAO algorithm that is a global search method mimicking the foraging behavior of bacteria to obtain a near optimal solution. The clustering problem is transformed into optimization problem through two techniques. The first one is the encoding strategy and criterion function. The encoding technique represents the candidate solution as a particle in PSO, or a bacterial in BFAO, or a chromosome in GA. The clustering property is evaluated by the criterion function, called as the fitness function in PSO and nutrient function in BFAO. The nutrient function is the weighted sum of the objective functions to be optimized in the problem. The foraging E.coli bacteria are utilized to resolve optimization problem. The position of each bacterium contains the candidate solutions to the problem. The bacterium position is updated according to three mechanisms, Nc chemotactic and Ned elimination and (swimming and tumbling) steps, Nre reproduction steps,   dispersal steps. The position of ith bacterium, θi = θi (n)|n = 1, . . . , p having p elements is updated in the p-dimensional search space at any time in the loop of j = 1, . . . , Nc , m = 1, . . . , Ns , k = 1, . . . , Nre and l = 1, 2, . . . , Ned . In Fig. 1, the flowchart of BFAO is depicted. Bacteria are initially positioned at several points in the P search space. Their positions are updated to find the minimum of fitness function f. Each chemotactic step, j, generates a unit length random direction function ϕ(j) due to tumbling. The cost function at the new position at the beginning of each chemotactic step is updated by adding the cost, Jcc , due to cell-to-cell signaling to the previous objective function, f. It increases the cost function by releasing attractant chemicals of depth dattractant = 0.1 and width, wattractant = 0.2 and repellant chemicals of height h repellant = 0.1 and width, wrepellant = 10 [8]. The objective function, f, changes by Jcc because of swarming in case of nutrient or noxious environment. The nutrient environment in case of maximization of objective function will attract other bacteria at the peak position of the function. The bacterial will make longer swim steps and swim up the nutrient gradient to make a swarm. The noxious environment in case of minimization of objective function will attract the bacteria around the valley. The bacteria will climb down the noxious gradient with shorter swimming step and more tumbling. The bacteria then swim a distance denoted by step size C, in the direction of tumbling. Positions of bacteria are updated during the jth chemotactic step by θi(j + 1, k, l) = θi(j, k, l) + C(i, m) × ϕ(j), where C(i, m) represents the swimming step of the ith bacterium in the mth swimming step. If the objective function at the new position is increased, then the bacterium takes a swimming step. The new position is calculated, and the objective function is

Bacterial Foraging Optimization-Based Clustering in Wireless …

51

Fig. 1 Process flow diagram of proposed BFAO

evaluated. If the objective function increases, the bacteria swim and reach the maximum number of steps Ns. If the objective function does not improve, then the bacterium takes the next chemotactic step from j + 1 to j + 2 and tumbles with the corresponding function ϕ(j + 2). The bacterium reproduces after Nc chemotactic steps. The health of bacterium is represented by the sum of the costs in the chemotactic steps. So, the bacteria that are stagnant in the local maximum and do not swim are replaced by the dynamic bacteria. These dynamic bacteria have taken many swimming steps and are in the correct direction of reaching the global maximum. The health function is considered to sort bacteria. Half of the healthiest bacteria are allowed to reproduce so that the total number of bacteria remains the same. Maximum four generations are replicated from the reproduction steps. The elimination and dispersal step deletes a certain number of bacteria randomly by comparing with

52

S. R. Deepa and D. Rekha

the elimination-dispersal probability Ped , and reallocates the same number of bacteria randomly. This will widen the search space of bacteria. Bacteria are allowed to undergo the elimination-dispersal phase once while searching for the global maximum.

5.3 Proposed Clustering Using BFAO The BFAO algorithm is divided into the following steps to be implemented for clustering of WSN. • Initialization: At the beginning, randomly deploy Ntotal a number of nodes having 2 × Ntotal coordinates in the given region. All nodes are given E0 energy. S bacteria are assigned S candidate solutions of the coordinates of the q CHs. Each bacterium contains 2 × q a random number of coordinates in the given region. • CH Selection: The 2 × q numbers of coordinates are obtained from the positions of bacteria and are assigned as the CHs. The localization operator assigns the coordinates to the nodes. The nodes closest to these coordinates are assigned as CHs, provided they satisfy residual energy criteria. The residual energy criteria should be more than average residual energy of the total nodes. – Formation of CHj (j = 1, 2, . . . , q) – Find out the distance Dj of CHj from the BS • Cluster formation: Sensor nodes are partitioned into ordinary node and CH by assigning ordinary nodes to the cluster depending on their proximity to the respective CHs. – Assign ordinary nodes, Oi (i = 1, 0, . . . , Ntotal − q), to the clusters Cj (j = 1, 2, . . . , q) having shortest distance from CH j . – In C j cluster , n j ordinary nodes are found.  – Find distance di j of Oi i = 1, 2, .., nj from CH j . • Communication energy calculation: Objective function also called as nutrient function, f, is calculated for each bacterium. • Update of nutrient function with swarming cost: The swarming cost Jcc is added to the nutrient function, f, to obtain the new nutrient function for each bacterium. • Update of positions by BFAO Algorithm: Updation of each bacterium position is done according to the bacteria foraging algorithm. The new CHs are assigned at the end of the chemotactic stage, reproduction, elimination, and dispersal stages for each round of communication. The proposed system considers applying the optimization operation on the basis of the cost factor involved in the communication process. The nodes are sorted according to fitness value, the node that possessed highest value of fitness constructs the cluster by including the nodes in its transmission range, and similarly, the next node in the sorted list forms the cluster by including nodes in its transmission range until the clusters get formed.

Bacterial Foraging Optimization-Based Clustering in Wireless …

53

Table 1 Simulation results using BFAO Nodes

500 rounds of communication Alive nodes

Communication rounds

Residual energy (J)

Throughput (bits)

Variance First node dead

Half of the nodes dead

50

50

128881

10 × 107

0.00547

538

876

100

100

322255

1 × 107

0.01972

738

1713

Table 2 Comparison of clustering algorithms and half of the nodes dead (in communication rounds) Clustering algorithm

100 nodes First sensor node death

LEACH

50 nodes Half the sensor nodes death

First sensor node death

Death of half of the nodes

65

249

110

218

PSO

178

1319

197

1298

BFAO

738

1713

538

876

Table 3 Comparison of LEACH, PSO, and in BFAO for 50 nodes and 500 communication rounds Clustering algorithm

Alive nodes

Residual energy (J)

Throughput (bits)

Variance 0.00207

6

0.7279

2.6 ×

PSO

38

132653

14.7 × 107

0.02884

BFAO

50

132855

15 × 107

0.00547

LEACH

107

6 Result Analysis The assumptions for the numerical simulation are similar to that of LEACH [8]. The nodes and BS are stationary. The BS is placed at (110 m, 110 m) area which is outside the deployment region of sensors. The nodes are deployed randomly. All the sensor nodes are homogeneous. Every sensor node sends a fixed amount (2000 bits) of message in each round. The simulation is performed with the value of optimal percentage of the nodes to be CH per round per epoch, Popt as 5%. There were 100 homogeneous nodes deployed in a region of 100 m∗100 m area. Nodes exhibit initial energy of 0.5 J and the size of each packet being 2000 bits. The epoch consists of 1/popt equal to 20 rounds. So on average five nodes become CH in each round, and each node becomes CH once in 20 rounds. Table 1 depicts the results of simulation considering 50 nodes and 100 nodes in the network. The analysis is done by giving importance to the network longevity that is calculated with respect to the first node death as well as half of the node death. Table 2 shows that BFAO is efficient in improving network lifetime compared to LEACH and PSO. Table 3 depicts that BFAO is efficient in terms of residual energy, alive nodes, and throughput compared to LEACH and PSO.

54

S. R. Deepa and D. Rekha

Fig. 2 Alive nodes analysis

Following graphs show the simulation results considering 100 nodes deployed in the network. The outcomes of proposed study show that proposed BFAO offers more network sustainability with better communication performance with respect to number of alive nodes and throughput, respectively. Figure 2 shows that live nodes decline at a faster speed for PSO and LEACH, whereas it slowly degrades for proposed BFAO technique. The graph shows that BFAO-based approach is able to sustain increased number of alive nodes with increasing simulation rounds owing to its optimization technique. The similar trend can be seen in Fig. 3, where throughput is found to be significantly improved with the rise of the simulation rounds for the presented approach. A significant consistency can be seen as LEACH protocol ceases to perform till 700th rounds, while PSO sustains till 1400th rounds. BFAO exceeds more than 1600th rounds. The reasons for such trends of outcomes are as follows: (i) The centralized positioning of a base station in LEACH leads to faster rate of consumption of energy due to increasing traffic load from each cluster to base station. (ii) PSO algorithm offers good solution toward exploring global optima causing better energy efficient routes after performing clustering. However, the process is highly iterative and soon the CH becomes overloaded when it crosses more than 800th iterations causing many nodes to die. The proposed system solves this problem of faster node death by ensuring a highly distributed form of behavior in its optimization stages. This leads to equal dissemination of network traffic based on nodes leading

Bacterial Foraging Optimization-Based Clustering in Wireless …

55

Fig. 3 Throughput analysis

to a slower mode of sustenance. A closer look into the throughput curve shows that LEACH collapses very soon, while PSO offers more than 30% of enhancement as compared to LEACH, while proposed system offers much better throughput performance in contrast to PSO as it is not affected by size and offers better convergence performance irrespective of any network condition. In Fig. 4, residual energy remains higher in BFAO as the message is being routed through the global minimum energy path. A closer look shows that residual energy of LEACH witness stiff fall, while the PSO and the proposed system offer smooth declination of the residual energy with increasing rounds. The consistency of PSO was good till 600th rounds; however, it started falling apart owing to consumption of energy. On the other hand, proposed BFAO offers superior consistency that goes very well till maximum round of simulation. Hence, proposed clustering is capable of offering better network lifetime. Figure 5 shows the variance of residual energy of the nodes. PSO includes lots of recursive step that results in maximized energy demands from the sensor node. At the same time, the synching operation between local and global optima is never in parallel in PSO that results in higher fluctuation while transmitting data. It can be seen that the amount of energy fluctuation for both LEACH and PSO is quite intermittent, while proposed BFAO algorithm offers very much less intermittent fluctuation. The BFAO algorithm takes care of the load balancing problem and hence is more robust than LEACH algorithm.

56

S. R. Deepa and D. Rekha

Fig. 4 Residual energy analysis

Figure 6 shows that BFAO outperforms by reducing the left-out nodes compared to PSO and LEACH. The outcome shows that the proposed system offers more number of participation of the sensor motes for data aggregation, whereas PSO and LEACH induce more depletion of node owing to energy depletion resulting in increasingly lower partitioned node. For this reason, the number of left-out node is higher for LEACH as well as PSO. Intentionally, the proposed study outcome is not compared with the latest variants of PSO or LEACH as it will be quite challenging to infer the comparative analysis as various variants have various unique schemes which do not offer measurable outcomes. Hence, the proposed system is compared to only generic version of PSO and LEACH which is also the base adoption of any new variants. The performance of the proposed algorithm will be nearly uniform when it is exposed to sparse or dense network system. There will be only negligible deviation in the outcome of proposed system with another test environment of sparse/dense network. As the proposed system is evaluated using random positioning system of the nodes and various simulation iterations involved different deployment configurations, therefore, it can be said that outcome of the proposed system is quite well justified as well as applicable for both dense and sparse networks.

Bacterial Foraging Optimization-Based Clustering in Wireless …

57

Fig. 5 Analysis of energy variance Left out Node Comparison

No. of Nodes

50 45 40 35 30 25 20 15 10 5 0 Series 1

BFAO

PSO

LEACH

Fig. 6 Left-out node comparison among different protocols

7 Conclusion The Bacterial Foraging Algorithm for Optimization (BFAO) is proposed in this paper to optimize cluster formation in WSN and reduce the number of left-out nodes to enhance network lifetime. The left-out nodes consume more energy for data transmission to base station and eventually die, thereby reducing the network lifetime. The prevention of left-out node formation is done by using BFAO. This reduction is

58

S. R. Deepa and D. Rekha

done by considering the fitness of every node in deciding the cluster heads by BFAO. The simulation results show that BFAO outperforms in reducing the left-out nodes, increasing throughput, and improving the network lifetime.

References 1. Zhang, Y., Laurence Yang, T., Chen, J. (eds.): RFID and Sensor Networks: Architectures, Protocols, Security, and Integrations. Wireless Networks and Mobile Communications, pp. 323–353. CRC Press, Boca Raton, Fl (2009) 2. Kahn, J.M., Katz, R.H., Pister, K.S.J.: Next century challenges: scalable coordination in sensor networks. In: MobiCom1999: Proceedings of the 5th Annual ACM/IEEE International Conference on Mobile Computing and Networking, New York, USA, pp. 271–278 (1999) 3. Kulik, J., Heinzelman, W.R., Balakrishnan, H.: Negotiation-based protocols for disseminating information in wireless sensor networks’. Wirel. Netw. 8,169–185 (2002) 4. Subramanian, L., Katz, R.H.: An architecture for building self configurable systems. In: MobiHOC 2000: Proceedings of First Annual Workshop on Mobile and Ad Hoc Networking and Computing, Boston, MA, pp. 63–73 (2000) 5. Banerjee, S., Khuller, S.A.: Clustering scheme for hierarchical control in multi-hop wireless networks. In: IEEE INFOCOM 2001. Proceedings of Conference on Computer Communications; Twentieth Annual Joint Conference of the IEEE Computer and Communications Society, Anchorage, AK, vol. 2, pp. 1028–1037 (2001) 6. Wang, X., Li, Q., Xiong, N., Pan, Y.: Ant colony optimization-based location-aware routing for wireless sensor networks. In: Li, Y., Huynh, D.T., Das, S.K., Du, D.Z. (eds.) Wireless Algorithms, Systems, and Applications, WASA 2008. Lecture Notes in Computer Science, vol 5258. Springer, Heidelberg (2008) 7. Passino, K.M.: Biomimicry of bacterial foraging for distributed optimization and control. IEEE Control Syst. Mag. 22, 52–67 (2002) 8. Heinzelman, W.B., Chandrakasan, A.P., Balakrishnan, H.: An application specific protocol architecture for wireless microsensor networks. IEEE Trans. Wireless Commun. 1, 660–670 (2002) 9. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: ICNN 1995: Proceedings of International Conference on Neural Networks, vol. 4, pp. 1942–1948 (1995) 10. Guru, S., Halgamuge, S., Fernando, S.: Particle swarm optimisers for cluster formation in wireless sensor networks. In: Proceedings of International Conference on Intelligent Sensors, Sensor Networks and Information Processing, pp. 319–324 (2005) 11. RejinaParvin, J., Vasanthanayaki, C.: Particle swarm optimization-based clustering by preventing residual nodes in wireless sensor networks. IEEE Sens. Journa 15, 4264–4274 (2015)

Product Prediction and Recommendation in E-Commerce Using Collaborative Filtering and Artificial Neural Networks: A Hybrid Approach Soma Bandyopadhyay and S. S. Thakur

Abstract In modern society, online purchasing using popular website has become a new trend and the reason beyond it is E-commerce business which has grown rapidly. These E-commerce systems cannot provide one to one recommendation, due to this reason customers are not able to decide about products, and they may purchase. The main concern of this work is to increase the product sales, by keeping in mind that at least our system may satisfy the needs of regular customers. This paper presents an innovative approach using collaborative filtering (CF) and artificial neural networks (ANN) to generate predictions which may help students to use these predictions for their future requirements. In this work, buying pattern of the students who are going to face campus interviews has been taken into consideration. In addition to this, buying patterns of the alumni for luxurious items was also considered. This recommendation has been done for the products, and the results generated by our approach are quite interesting. Keywords Predictions · Recommender system · E-Commerce · Collaborative filtering · Nearest neighbors · Artificial neural networks

1 Introduction As recommendation has become a part of day-to-day life, we rely on external information before taking any decision about an artifact of interest. After getting user’s preferences, if accurate prediction algorithm is applied personalized recommendation can be done more correctly [1, 2]. According to Cosley et al. and Ziegler et al., it is essential to publish the information related to recommendation so that system designers and users can utilize S. Bandyopadhyay (B) · S. S. Thakur MCKV Institute of Engineering, Howrah, West-Bengal, India e-mail: [email protected] S. S. Thakur e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2020 J. K. Mandal and D. Sinha (eds.), Intelligent Computing Paradigm: Recent Trends, Studies in Computational Intelligence 784, https://doi.org/10.1007/978-981-13-7334-3_5


the same for future recommendation purposes [3, 4]. Though algorithmic accuracy is important, it is not the only adequate tool to analyze the quality of a recommendation system. Over a prolonged period, content-based and hybrid recommender systems have been developed to provide recommendations [5]. It has been observed that features, ratings, etc. are taken into consideration in the case of content-based recommender systems. Sometimes customers' reviews on various products are also used to develop a proper recommendation system [6]. The literature survey related to the work has been covered in this section, and the remaining part of the paper is organized as follows: Sect. 2 discusses the details of the memory-based collaborative filtering technique. Neural network preliminaries, the classification approach, and the related algorithm are explained in Sect. 3. The proposed work is described in Sect. 4. The implementation of the work using collaborative filtering and ANN is explained in Sect. 5. The experimental results are discussed in Sect. 6. We conclude the paper with a discussion on future work in Sect. 7.

2 Memory-Based Collaborative Filtering Technique Nowadays, memory-based CF uses either the entire user–item database or a sample of it to generate predictions and recommendations for an active user (new user) on an E-commerce website. Both user-based and item-based collaborative filtering approaches are used to find the target user's nearest neighbors [7].

2.1 Similarity Computation In memory-based CF algorithms, similarity computation between items and users is a crucial task. In order to find similarity, either a weight w_{i,j} or w_{u,v} is calculated between two items i and j or between two users u and v, respectively. Here, two items i and j are items which the users have rated in a similar manner. Similarly, u and v are users having the same preferences of items or who have rated the same items. Correlation-Based Similarity. In this case, w_{u,v} between two users u and v or w_{i,j} between two items i and j is computed by the Pearson correlation (PC) or another correlation method. To get an accurate result, first the co-rated cases are isolated, and the PC is computed between users u and v using the following equation:

w_{u,v} = \frac{\sum_{i \in I} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I} (r_{u,i} - \bar{r}_u)^2}\,\sqrt{\sum_{i \in I} (r_{v,i} - \bar{r}_v)^2}}    (1)


Here, the i ∈ I summations are over the items for which both users u and v have given ratings, and \bar{r}_u denotes the average rating of the co-rated items of the uth user. Equation (2) is used to determine the PC for the item-based CF algorithm:

w_{i,j} = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_i)^2}\,\sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_j)^2}}    (2)

In this case, the set of users who rated both items i and j is denoted by u ∈ U, r_{u,i} denotes the rating of user u on item i, and \bar{r}_i is the average rating of the ith item by the same users. Vector Cosine-Based Similarity. In this case, instead of documents, either users or items are used, and instead of word frequencies, ratings are used. The vector cosine similarity between items i and j is given by

w_{i,j} = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\|\vec{i}\| * \|\vec{j}\|}    (3)

Adjusted Cosine Similarity. The measurement of adjusted cosine similarity is given below, where M_{k,x} denotes the rating of user k on item i_x, and \bar{M}_k represents the average rating value of user k on all items:

sim(i_x, i_y) = \frac{\sum_{k=1}^{m} (M_{k,x} - \bar{M}_k)(M_{k,y} - \bar{M}_k)}{\sqrt{\sum_{k=1}^{m} (M_{k,x} - \bar{M}_k)^2}\,\sqrt{\sum_{k=1}^{m} (M_{k,y} - \bar{M}_k)^2}}    (4)
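To make the similarity measures above concrete, a minimal sketch in Python is given below (it is not part of the original paper). It assumes a small user–item rating matrix in which 0 denotes a missing rating, and it implements the Pearson correlation of Eq. (1) and the adjusted cosine similarity of Eq. (4); all names and the toy data are illustrative.

import numpy as np

def pearson_user_similarity(R, u, v):
    # Eq. (1): correlate users u and v over their co-rated items only.
    co = (R[u] > 0) & (R[v] > 0)
    if co.sum() < 2:
        return 0.0
    ru, rv = R[u, co], R[v, co]
    du, dv = ru - ru.mean(), rv - rv.mean()
    denom = np.sqrt((du ** 2).sum()) * np.sqrt((dv ** 2).sum())
    return float((du * dv).sum() / denom) if denom > 0 else 0.0

def adjusted_cosine_item_similarity(R, x, y):
    # Eq. (4): subtract each user's mean rating before comparing items x and y.
    co = (R[:, x] > 0) & (R[:, y] > 0)
    if co.sum() < 2:
        return 0.0
    rated = np.where(R[co] > 0, R[co], np.nan)
    user_mean = np.nanmean(rated, axis=1)
    dx, dy = R[co, x] - user_mean, R[co, y] - user_mean
    denom = np.sqrt((dx ** 2).sum()) * np.sqrt((dy ** 2).sum())
    return float((dx * dy).sum() / denom) if denom > 0 else 0.0

# Toy user-item matrix (rows: users, columns: items such as iPhone, laptop, ...).
R = np.array([[5, 4, 0, 3],
              [4, 0, 5, 2],
              [1, 2, 4, 0],
              [5, 5, 1, 3]], dtype=float)
print(pearson_user_similarity(R, 0, 3))
print(adjusted_cosine_item_similarity(R, 0, 1))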

2.2 Computing Prediction In the neighborhood-based CF approach, a subset of nearest neighbors is chosen by finding their similarity with the active user. Then, a weighted aggregate of their ratings is computed, which is used for generating predictions in the future. After computing the similarity between items, a set of the k most similar items to the target item is selected and a predicted value for the target item is generated. The weighted sum measurement is used as follows:

M_{a,t} = \frac{\sum_{j=1}^{k} M_{a,j}\, sim(i_j, i_t)}{\sum_{j=1}^{k} sim(i_j, i_t)}    (5)

Here, M_{a,t} represents the prediction value of target user U_a on target item i_t. Only the k most similar items are used to generate the prediction. Similarly, we compute item-based similarities on the user–item matrix M [7].
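As an illustrative continuation of the sketch above (again, not from the paper), the weighted-sum prediction of Eq. (5) for a target user and item can be written as follows; the value of k and the choice of similarity function are assumptions made only for demonstration.

def predict_rating(R, a, t, k=2):
    # Eq. (5): weighted sum over the k items most similar to target item t
    # among the items that the active user a has already rated.
    rated = [j for j in range(R.shape[1]) if j != t and R[a, j] > 0]
    sims = [(adjusted_cosine_item_similarity(R, j, t), j) for j in rated]
    top = sorted(sims, reverse=True)[:k]
    num = sum(s * R[a, j] for s, j in top)
    den = sum(s for s, j in top)
    return num / den if den > 0 else 0.0

print(predict_rating(R, a=1, t=1))   # predicted rating of user 1 on item 1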


3 Neural Network Preliminaries and Classification Approach An artificial neural network (ANN) is a widely used computational model that can process information. The information from the available dataset is taken by the input nodes, the weighted sum of the inputs is computed, and an activation function is used for producing the output of each neuron. The multilayer perceptron (MLP) is a popular architecture that uses the backpropagated error to update the model. If correct parameters are available for a particular dataset at the time of training, any algorithm may work well, but a lot of testing is required when an algorithm is applied to a new dataset [8].

3.1 Feedforward Neural Network (FFNN) In an FFNN, information moves from the input nodes to the output nodes in the forward direction through the hidden nodes. In this case, the weights denote the knowledge of the network, and the weighted sum of the inputs is calculated and used as the new input values for the next layer. This process is repeated layer by layer until it goes through all layers and finally provides the output.

3.2 Backpropagation Algorithm (BPA) BPA is used to measure how each neuron in the network contributed to the overall error after the processing of data. In this method, each weight is adjusted in proportion to its contribution in overall error. If the error of each weight can be minimized, good prediction result can be achieved.

3.3 Algorithm
Step 1: Create a neural network (NN). (a) Set the number of hidden units equal to 1. (b) Randomly initialize all weights.
Step 2: Train the artificial neural network using the training dataset.
Step 3: Compute the error function on the validation dataset.
Step 4: Calculate the efficiency.
Step 5: Check whether the error function and efficiency are within range.
Step 6: Stop the process if the error function is acceptable; else add one hidden unit to the hidden layer and go to Step 2.
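The constructive loop of Steps 1–6 can be sketched in Python as follows. This is only an illustration of the idea, not the authors' implementation; it assumes scikit-learn's MLPRegressor as the trainable FFNN/BPA component, a synthetic dataset, and an illustrative error threshold.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(1050, 10))          # e.g., 10 item ratings per student (synthetic)
y = X @ rng.uniform(0, 1, size=10)              # synthetic target for illustration only

X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.3, random_state=0)

hidden_units, max_units, error_threshold = 1, 50, 0.5
while hidden_units <= max_units:
    net = MLPRegressor(hidden_layer_sizes=(hidden_units,), max_iter=2000, random_state=0)
    net.fit(X_train, y_train)                                    # Step 2: train with backpropagation
    error = mean_absolute_error(y_valid, net.predict(X_valid))   # Step 3: error on validation data
    if error <= error_threshold:                                 # Steps 5-6: stop if acceptable
        break
    hidden_units += 1                                            # else add one hidden unit, retrain
print(hidden_units, error)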


4 Proposed Work This work presents a method which helps to analyze customers' buying patterns and to find customers' future requirements. FFNN and BPA have been applied for training the neural network (NN) for future prediction. The weights are generated randomly for each neuron, and the final output is obtained using the activation function. The block diagram of the proposed system is shown in Fig. 1. Initially, a pilot survey was done which comprised 20 students, and the questionnaire was explained to them. Then the students of different engineering institutes were provided a form using Google Forms and were requested to participate in data collection. Incomplete records were removed and the data were normalized. The total number of records available in the database is 1050. The customers are students of the same age group, varying from 21 to 23 years, who are going to participate in campus interviews. Another survey form was designed for the students who have already been placed and have joined some organization. They were asked to rate different items which they would like to purchase in the near future. Data of 915 alumni were taken, who rated 10 different items which they wish to purchase. It has been observed that there was a high probability among the users who want to purchase an iPhone. In this work, our major focus is on the offline rating of luxurious items, keeping in mind both the middle-class and high-income groups. For the rating data, variances in user rating styles were taken into consideration. Here, we compare the different offline item-based collaborative filterings in the context of two different datasets. The datasets are of the students who passed out in the years 2015 and 2016. Out of the collected 915 user–item records, only 15 users' ratings have been shown in Table 1. Another form was designed, and feedback was taken from the pass-out students 1 year after their joining the company, and was used to measure the accuracy of prediction. The standard mean absolute error (MAE) between the predicted and actual purchase data of 1 year has been taken for 2 consecutive years. The purchase data of 665 alumni were collected and used in this work.

Fig. 1 Block diagram of the proposed system

Table 1 User–item datasets with ratings (ratings of 15 users, User 1–User 15, on a 1–5 scale for 10 items: I1 iPhone, I2 Laptop, I3 HDD, I4 Digital camera, I5 Motorcycle, I6 Jewelry, I7 Refrigerator, I8 Air conditioner, I9 LED television, I10 Car)


5 Implementation of the Work Using Collaborative Filtering and ANN Initially, a dataset has been prepared based on the feedback of the students who participated in the survey. The dataset was divided into two parts: one for training and the other for testing purposes. At first, 70% of the data were used for training and 30% for testing, and the accuracy was found to be 74.5%. In the second case, from the same dataset, 60% of the data were used for training and 40% for testing; in this case, the accuracy was 72.8%. In the last case, the dataset was partitioned so that 50% of the data were used for training and 50% for testing; in this case, the accuracy was 71.6%. Table 2 clearly shows the accuracy in % with different sizes of training and test datasets. Table 3 shows the prediction of how many alumni will purchase different items like iPhone, digital camera, etc. It also shows the actual purchase data of the alumni and the mean absolute error. To get optimal results, the number of neurons and the number of iterations need to be tuned.
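A minimal sketch (not from the paper) of how such split-wise accuracies could be obtained with scikit-learn is shown below; the classifier, features, and labels are placeholders chosen only for illustration.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
X = rng.uniform(1, 5, size=(1050, 10))              # student rating vectors (illustrative)
y = (X.mean(axis=1) > 3).astype(int)                # placeholder purchase label

for test_size in (0.30, 0.40, 0.50):                # the 70/30, 60/40, and 50/50 splits
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=test_size, random_state=0)
    clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0).fit(X_tr, y_tr)
    print(test_size, round(accuracy_score(y_te, clf.predict(X_te)) * 100, 1))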

Table 2 Dataset used for training and testing purpose

Training dataset (%)   Test dataset (%)   Accuracy (%)
70                     30                 74.5
60                     40                 72.8
50                     50                 71.6

Table 3 Predicted and actual purchased data

Item name         Predicted purchased data   Actual purchased data   Error
iPhone            452                        478                     26
Laptop            134                        98                      36
HDD               35                         22                      13
Digital camera    292                        246                     46
Motorcycle        65                         42                      23
Jewelry           105                        90                      15
Refrigerator      76                         54                      22
Air conditioner   206                        233                     27
LED television    167                        145                     22
Car               13                         07                      06
MAE (over all items)                                                 23.6
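The overall MAE reported in Table 3 can be reproduced directly from the per-item errors; a one-line check in Python (added here only for illustration):

errors = [26, 36, 13, 46, 23, 15, 22, 27, 22, 6]   # |predicted - actual| per item, from Table 3
mae = sum(errors) / len(errors)                     # = 23.6
print(mae)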


Table 4 Total number of different attributes for input, predicted outputs, and actual output

Items   Attributes   Input (X)   Predicted O/P (Ŷ)   Actual O/P (Y)   Error
1       Shirt        150         160                 170              10
2       Trouser      132         140                 150              5
3       Salwar       80          87                  90               3
4       Shoe         55          41                  50               9
5       Socks        43          32                  40               8
6       Watch        13          15                  10               1
7       Tie          32          27                  25               8
8       Belt         18          9                   15               6
9       Blazer       12          17                  15               5
10      Cosmetics    45          50                  52               2

ANN was applied to the database of 1050 students collected from different engineering institutes for the products they have purchased before campus interviews. The results are shown in Table 4.

6 Experimental Results The results of collaborative filtering show that 68% of the placed students have shown interest and rated iPhone as the first product they want to purchase after their joining in an organization within 1 year, followed by digital camera which was opted by 44%

Fig. 2 Plot of actual versus predicted output


of the students. 31% said that they would like to purchase an air conditioner. Feedback data show that 72% of the alumni bought an iPhone, 37% of them bought a digital camera, and 35% bought an air conditioner within a year. In Fig. 2, the plot shows the difference between the actual and predicted outputs. From Table 4 and Fig. 2, it is to be noted that the probability of purchasing a shirt and trousers is maximum, while the purchasing of a blazer, watch, belt, or tie is minimum.

7 Conclusion and Future Work At present, customers' expectations are high as far as cost and quality are concerned, and at the same time manufacturers may compromise on profits with respect to competitors, due to the dynamic business scenario. The performance of the model was found to be good. In the future, the same model can be used for prediction and recommendation with a larger database, i.e., with newly added items.

References
1. Burke, R.: Hybrid recommender systems: survey and experiments. User Model. User-Adap. Int. 12, 331–370 (2002)
2. McNee, S., Riedl, J., Konstan, J.: Being accurate is not enough: how accuracy metrics have hurt recommender systems. In: 24th International Conference on Human Factors in Computing Systems, Montréal, Canada, pp. 1097–1101 (2006)
3. Cosley, D., Lam, S., Albert, I., Konstan, J., Riedl, J.: Is seeing believing?: how recommender system interfaces affect users' opinions. In: SIGCHI Conference on Human Factors in Computing Systems, Ft. Lauderdale, FL, pp. 585–592 (2003)
4. Ziegler, C., McNee, S., Konstan, J., Lausen, G.: Improving recommendation lists through topic diversification. In: 14th International World Wide Web Conference, Chiba, Japan, pp. 22–32 (2005)
5. Huang, Z., Chung, W., Chen, H.: A graph model for E-commerce recommender systems. J. Am. Soc. Inform. Sci. Technol. 55(3), 259–274 (2004)
6. Liu, Z.B., Qu, W.Y., Li, H.T., Xie, C.S.: A hybrid collaborative filtering recommendation mechanism for P2P networks. Futur. Gener. Comput. Syst. 26(8), 1409–1417 (2010)
7. Paul, D., Sarkar, S., Chelliah, M., Kalyan, C., Nadkarni, P.P.S.: Recommendation of high-quality representative reviews in E-commerce. In: Proceedings of the Eleventh ACM Conference on Recommender Systems, Como, Italy, pp. 311–315 (2017)
8. Baha'addin, F.B.: Kurdistan engineering colleges and using of artificial neural network for knowledge representation in learning process. Int. J. Eng. Innov. Tech. 3(6), 292–300 (2013)

PGRDP: Reliability, Delay, and Power-Aware Area Minimization of Large-Scale VLSI Power Grid Network Using Cooperative Coevolution Sukanta Dey, Sukumar Nandi and Gaurav Trivedi

Abstract The power grid network (PGN) of a VLSI system-on-chip (SoC) occupies a significant amount of routing area in a chip. As the number of functional blocks in an SoC increases over time, the need of the hour is to have more power lines in order to provide adequate power connections to the extra-added functional blocks. Therefore, to accommodate more functional blocks in the minimum area possible, the PGN should also have minimum area. Minimization of the area can be achieved by relaxing a few power grid constraints. In view of this, due to the resistance of the PGN, it suffers from considerable reliability issues such as voltage drop noise and electromigration. Further, it also suffers from interconnect delay and power dissipation due to its parasitic resistances and capacitances. These PGN constraints should be relaxed only up to a certain limit, and the area minimization should be done accordingly. Therefore, in this paper, we have considered an RC model of the PGN and formulated the area minimization for the PGN as a large-scale minimization problem considering different reliability, delay, and power-aware constraints. An evolutionary computation-based cooperative coevolution technique has been used to solve this large-scale minimization problem. The proposed method is tested on industry-based power grid benchmarks. It is observed that a significant amount of metal routing area of the PGN has been reduced using the proposed method. Keywords Area minimization · Cooperative coevolution · Delay · Evolutionary computation · Large-scale optimization · Power grid networks · Reliability · VLSI


1 Introduction With the advancement of VLSI technology, the number of functional blocks in a system-on-chip (SoC) is increasing constantly. These functional blocks are powered by the power grid network (PGN), which is connected to the power pads of the chip. In order to accommodate more functional blocks in an SoC, it is necessary to have more power lines so that the extra-added functional blocks can be connected to power. As the PGN occupies a significant amount of the routing area, it is desirable to have a smaller PGN area so that extra power connections can be accommodated in the same area or in the minimum area of the chip. Hence, the minimization of the area of the PGN is important. However, the minimization of the PGN can be achieved only by relaxing different reliability, time-delay, and power dissipation constraints. Generally, different kinds of reliability issues may occur in the PGN due to parasitic effects such as the resistances, capacitances, and inductances of its metal lines. Voltage drop noise is one of the major reliability issues occurring in the PGN. Due to the voltage drop noise, some functional blocks may not get the required voltage, which can cause the chip to malfunction. Electromigration is another major reliability issue, due to which electrons transfer their momentum to the metal atoms; as a result, voids and hillocks form in the metal lines, which may open-circuit or short-circuit some part of the power grid. Also, due to unequal voltage drop noises across the PGN and the lengths of the metal lines of the power grid, some unwanted delays may occur. Moreover, due to the high current through the PGN, the power dissipation of the grid also becomes significant. In order to minimize the delay, the power dissipation, and the reliability issues, designers overdesign the PGN. Overdesigning it increases the area overhead and reduces the yield of the chip. Hence, in this paper, we try to minimize the metal area of the PGN considering different reliability and delay constraints of the PGN. The significant findings of this research paper are the following:
• The metal routing area minimization of the PGN is constructed as an optimization problem with a large number of variables (also known as a large-scale problem). We have used a simple DC load model, or steady-state model, of the PGN for the construction of the problem.
• For the area minimization problem of the PGN, different reliability, time-delay, and power dissipation constraints are considered.
• The area minimization problem is solved by cooperative coevolution-based metaheuristics, which are an optimization method for large-scale problems.
• The proposed minimization approach is able to minimize the total metal routing area.
• To showcase the applicability of the method, different standard PGN benchmark circuits are used to test the proposed scheme.
The arrangement of the paper is presented as follows. Section 2 comprises the necessary preliminary information required to understand the paper. The problem formulation for the area minimization is constructed in Sect. 3, which also contains


a discussion about the different reliability, delay, and power-aware constraints of PGN. Section 4 contains the cooperative coevolution scheme and its adaptation for solving the metal routing area minimization problem of PGN. The results obtained from different experiments using the PGN benchmark circuits are listed in Sect. 5. At last, in Sect. 6, the conclusion of the paper is given.

2 Preliminaries Analysis of the power grid network (PGN) is an active area of research; one such recent work is [1]. In the literature, several works on PGN optimization exist where the metal area was minimized by constructing it as a two-phase optimization problem. Tan et al. [2] solved the PGN optimization problem with the help of linear programming, using a sequence of linear programs to optimize the two-phase optimization problem of the PGN. Wang and Chen [3] solved the same two-phase optimization problem of the PGN with the help of a sequential network simplex algorithm. Furthermore, it is observed in the literature that there are several other works on PGN metal routing area minimization. Zeng and Li [4] have come up with a scheme for PGN wire sizing, formulating it as a two-step optimization problem with the help of a locality-driven, partitioning-based solution of the PGN. Zhou et al. [5] have solved electromigration-lifetime-constrained power grid optimization by minimizing the area using a sequence-of-linear-programming method. It is important to perform the area minimization of the PGN considering timing-delay and power dissipation constraints along with the reliability constraints, which has not been reported yet in the literature. Also, the power grid area minimization problem has not been solved using evolutionary computation-based metaheuristics, which have been proved to be an effective approach to solve complex optimization problems. The recent work of [6] showed that evolutionary computation-based metaheuristics can be used for power grid optimization and was successfully able to minimize the IR drop noise with an overhead of PGN metal routing area. In this work, we propose a scheme to minimize the metal routing area of the whole PGN by varying the widths of each of the PGN metal fragments (edges) using an evolutionary optimization technique. Here, the metal routing area minimization problem of the PGN is constructed in the form of a large-scale optimization problem. Our proposal also includes different reliability, delay, and power-aware constraints while minimizing the metal area of the PGN. Finally, the large-scale metal routing area minimization problem is solved using cooperative coevolution-based metaheuristics.

2.1 Power Grid Network Model An illustration of the PGN along with its functional blocks is shown in Fig. 2. In this paper, the metal lines of the PGN are modeled as the RC circuit. The resistive (R)


Fig. 1 π model of metal segment of the PGN


Fig. 2 An illustration of floor plan is shown along with its PG network (metal lines) and the functional blocks

Fig. 3 RC equivalent model of the PGN

and capacitive (C) elements are considered in the modeling of the PGN as these two circuit elements generate significant IR drop noises (voltage drop noise), signal delay, and power dissipations. Therefore, only RC elements are considered in this work. The equivalent R and C value of a metal segment for π model of an interconnect is shown in Fig. 1. An RC model of the PGN is considered here for the metal routing area minimization problem, which is shown in Fig. 3. For modeling the currents drawn by the underlying standard cells of the chip, DC load is being used which is connected to the ground pads from the power network as shown in Fig. 3. Similarly, for the ground network, DC loads are connected from the


ground network to the Vdd pads. The vias connecting different layers of the metals are considered to have zero resistance, since vias have very low resistance. C4 bumps, which are used for Vdd and ground connections, are assumed to have no inductances. Basically, any sort of parasitic effect due to inductances is not contemplated in this work. To find all the node voltages and edge currents, it is necessary to represent the PGN as a mathematical model. Hence, the RC model of the PGN is represented as a system of equations, i.e.,

G V(t) + C V'(t) = I(t),    (1)

where the G matrix indicates the conductances of the metal lines of the PGN, and the capacitances connected to the ground at each of the nodes of the PGN constitute the C matrix. Similarly, I(t) is formed by the current sources connected to the grounds, V(t) is the node voltage vector, and V'(t) denotes the first-order derivative of V(t). By solving this system of equations, node voltages and edge currents can be obtained, which are required for evaluating the different reliability, delay, and power constraints for the minimization of the area of the PGN. Here, a KLU-based solver is employed to determine the solutions of the system of equations after discretizing it with the backward Euler approximation technique [7]. For the area minimization problem, evolutionary computation-based metaheuristics are used, as described in Sect. 2.2.
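For illustration, a minimal sketch of this analysis step is given below (it is not the authors' MATLAB code). Eq. (1) discretized with backward Euler gives (G + C/h) V_{k+1} = I(t_{k+1}) + (C/h) V_k, and each time step is solved with a sparse LU factorization; SciPy's splu is used here as a stand-in for the KLU solver, and the tiny three-node grid is a placeholder.

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

# Placeholder 3-node grid: G (conductances), C (node capacitances), I(t) (load currents).
G = sp.csc_matrix(np.array([[ 2.0, -1.0,  0.0],
                            [-1.0,  2.0, -1.0],
                            [ 0.0, -1.0,  2.0]]))
C = sp.identity(3, format="csc") * 1e-12
h = 1e-12                                   # time step for backward Euler
I_of_t = lambda t: np.array([0.0, -1e-3, 0.0])

A = (G + C / h).tocsc()
lu = splu(A)                                # factorize once, reuse at every time step
V = np.zeros(3)                             # initial node voltages
for k in range(1, 101):
    rhs = I_of_t(k * h) + (C / h) @ V       # backward Euler right-hand side
    V = lu.solve(rhs)                       # node voltages at the new time point
print(V)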

2.2 Evolutionary Computation Evolutionary computation is based on the biological evolution of species. Different complex optimization problems can be solved using evolutionary metaheuristics. To perform the optimization of a mathematical function, also known as a cost function, initial solutions are generated randomly and evaluated on the cost function. Then, the candidate solutions generated in new generations are stochastically selected such that they have a greater chance of producing an optimal cost for the cost function. In this way, the process goes on iteratively until the solutions (or the cost) saturate. The point where the solutions saturate is considered the optimal point, and the corresponding solutions are considered optimal solutions. As minimization of the area for a large-scale PGN is a complex problem, evolutionary computation is used here. The problem is expressed in the form of a large-scale problem. Henceforth, the minimization of the cost function is achieved using cooperative coevolution-based metaheuristics, which are described in the later sections.


3 Problem Formulation and Constraints 3.1 Cost Function for Metal Area Minimization Here, we consider our PGN as a graph G = {V, E}, with all the nodes of the PGN as the vertex set V = {1, 2, . . . , n} and all the branches of the PGN as the edge set E = {1, 2, . . . , b}, for the DC load model of the PGN. A pictorial representation of the metal lines of a 3 × 3 PGN is shown in Fig. 4. If l (length) and w (width) are the dimensions of a single metal fragment of the PGN which exhibits resistance R, then the area covered by the metal fragment is expressed as

A = l w    (2)

For the same metal fragment, if it exhibits a sheet resistance ρ Ω/□, which is generally considered to be constant for the same layer of metal, then the resistance of the metal fragment is analytically expressed as

R = \frac{\rho l}{w}    (3)

For a current of I A across the metal line, the voltage drop (IR drop) across the metal line can be defined by

V_{ir} = I R = I \frac{\rho l}{w}    (4)

Fig. 4 A pictorial representation of the metal lines of a 3 × 3 PGN (the ith metal line has length li and width wi)


From the Elmore delay model of a metal line, the delay across a metal line can be represented by

T_{delay} = R C    (5)

Also, the power dissipation of a metal line is represented by the following:

P_{diss} = I V_{ir} = I^2 R    (6)

Our aim in this paper is the minimization of the metal routing area of the PGN while maintaining the IR drop (V_{ir}) within an acceptable limit, without incurring significant delay (T_{delay}), with less power dissipation (P_{diss}), and also subject to the other reliability constraints mentioned in Sect. 3.2. Hence, for the entire PGN containing b metal wire fragments (or edges), the total metal routing area is expressed as given below:

A_{total} = \sum_{i=1}^{b} l_i w_i    (7)

A large PGN will have a large value of b, which makes (7) a cost function containing a large tally of decision variables. In view of this, as the cost function of (7) has to be minimized, this cost function is termed a large-scale minimization problem, where the w_i make up the variable set w = (w_1, w_2, . . . , w_b) for i = 1, 2, . . . , b. Here, l_i is considered to be a constant for the cost function (Eq. (7)), and the value of l_i is imported from the PGN netlist in order to evaluate the cost function. Therefore, the cost function is expressed as a large-scale total metal routing area minimization problem with b variables and is constructed as follows:

P : \underset{w_i \in W}{minimize} \; A_{total},    (8)

subject to the constraints mentioned in Sect. 3.2.

3.2 Reliability, Delay, and Power-Aware Constraints

3.2.1 IR Drop Constraints

From (4), the IR drop restriction is established by the expression given below:

C_1 : |I_{i \in E}| \, \frac{\rho \, l_{i \in E}}{w_{i \in E}} \le \xi    (9)


The inequality given above should be strictly obeyed for all the ith edges of the PGN. ξ is the highest value of tolerance of IR drop noise permitted between two connected vertices of the PGN. Basically, ξ is the maximum allowable voltage difference between two consecutive nodes of the PGN.

3.2.2 Metal Line Area Constraint

In order to limit our design to a confined area, the total metal routing area occupied by the metal lines of the PGN should be limited to A_max:

C_2 : \sum_{i=1}^{b} l_i w_i \le A_{max}    (10)

3.2.3 Current Density Constraint

The maximum current density of the metal lines of the PGN should be limited to I_m, in order to avoid degradation of the metal lines due to electromigration-based reliability issues.

C_3 : \frac{I_{i \in E}}{w_{i \in E}} \le I_m    (11)

3.2.4 Metal Line Width Constraint

The design of the metal lines should follow the design rules of the given CMOS technology node and should follow the minimum width design rules in order to avoid any design rule violations. The metal width constraint can be represented as follows:

C_4 : w_{i \in E} \ge w_{min}    (12)

3.2.5 Current Conservation Constraint

At all the n vertices of the PGN, Kirchhoff's Current Law (KCL), or the current conservation constraint, must be observed, which is represented as follows:

C_5 : \sum_{i=1}^{K} I_{ji} + I_x = 0 \quad \forall j \in V    (13)

where the symbol K denotes neighboring vertices tally around the vertex j and the symbol Ix represents DC load current of the PGN model which is placed at all nodes to the ground.


3.2.6 Time-Delay Constraint

A metal interconnect is modeled as an RC element. However, the capacitance of an interconnect can further be classified into the plate capacitance C_{i plate}, the fringe capacitance C_{i fringe}, and the sidewall capacitance C_{i sidewall}. Among these, C_{i plate} has a simple mathematical expression, which is

C_{i\,plate} = \frac{\varepsilon l_i w_i}{t}    (14)

Therefore, the time delay of each of the metal lines of the PGN should be within ζ. Using R_i = \frac{\rho l_i}{w_i} and (14), we get

C_6 : T_{delay} = R_i (C_{i\,plate} + C_{i\,fringe} + C_{i\,sidewall}) = \frac{\rho \varepsilon l_i^2}{t} + \frac{\rho l_i C_{i\,fringe}}{w_i} + \frac{\rho l_i C_{i\,sidewall}}{w_i} \le \zeta    (15)

3.2.7 Power Dissipation Constraint

The power dissipation of a metal interconnect of the PGN should be limited by ψ:

C_7 : P_{diss} = I_i V_{i\,ir} = I_i^2 R_i = \frac{\rho l_i I_i^2}{w_i} \le \psi    (16)

1 wi

(18)

and reducing wi will surely affect these constraints. Therefore, minimization of the area depends on the reliability, delay, and power-aware constraints. 

78

S. Dey et al.

4 Proposed Minimization Scheme 4.1 Basic Cooperative Coevolution Scheme Cooperative coevolution (CC) is a decomposition scheme used in evolutionary computation which embraces the divide-and-conquer technique in order to find optimum solutions for optimization problems containing a large tally of variables. A complex mathematical problem having a large tally of variables is decomposed into small subcomponents using the divide-and-conquer approach. In order to incorporate the divide-and-conquer, the CC scheme divides a large problem with n decision variables into small subcomponents. Once the small subcomponents are created, then each of the subcomponents undergoes optimization with help of a standard evolutionary optimization process in a periodic manner. Evolutionary optimization algorithms try to mimic the biological evolution process. It generates a pool of population corresponding to a subcomponent of the large-scale problem. These populations represent the candidate solutions of the subcomponent. These individuals undergo different genetic processes naming mutation, crossover, and selection to find the optimum solutions for the instances of the certain subcomponent. Consequently, the cooperative evaluation of all the individuals in a pool of subpopulation is carried out by proper selection of the current individual and also the best individual from the remaining individuals of the subpopulation as described by [8]. The working of cooperative coevolution scheme is expressed in a concise way in Algorithm 1.

Algorithm 1: The working of Cooperative Coevolution Scheme

1 2 3 4 5 6 7 8

Input: The cost function f , lower bound xmin , upper bound xmax , frequency of decision variables n. Output: Optimimum value of the cost function f and numerical values of corresponding decision variables x1 , x2 , . . . , xn values. subcomponents ← grouping( f, xmin , xmax , n) /*grouping based variable decomposition*/; population_arr ← random(population_size,n); /*Optimization stage*/; for j ← 1 to size(subcomponents) do group_number_var ← subcomponents[j]; subpopulation_arr ← population_arr[:,group_number_var]; subpopulation_arr ← evolutionary_optimizer(best,subpopulation_arr,FE); population_arr[:,group_number_var] ← subpopulation_arr; (best,best_val)←min(population_arr);

Proposition 2 Cooperative coevolution scheme can generate near-optimal solutions if the main optimizer generates the near-optimal solutions. Proof Suppose if we consider f (x1 , x2 , . . . , xn ) as an objective function or the cost function having n decision variables xi ∈ R ∀i ∈ n. Now if we want to decompose the

PGRDP: Reliability, Delay, and Power-Aware Area …

79

n variables with the help of cooperative coevolution scheme by employing random grouping of the decision variables in such a way that each of the decomposed groups contain s decision variables. The decomposition of n variables will create t = ns number of groups or subcomponents of the main objective function. We can interpret this as t instances of the cost function with each subcomponent containing n decision variables each. Now, each of the t instances of subcomponents will go through an optimization process with the help of a standard optimizer. The co-adaptation of the near-optimal values will be done using random grouping strategy, to obtain the global near-optimum of the cost function f .  Potter and De Jong [9] first used genetic algorithm in cooperative coevolution (CC). In a large-scale optimization problem, CC was first used by [10]. They have proposed a fast method for evolutionary programming computation with the use of cooperative coevolution. Van den Bergh [11] in their work for the first time introduced CC into particle swarm optimization. Shi and Li [12] and Yang and Yao [13] have also tested the performance of differential evolution (DE) by incorporating CC scheme into it. An enhanced variant of DE is proposed by [14] which is named as self-adaptive neighborhood search-based differential evolution scheme (SaNSDE). The merit of this proposed self-adaptive DE is it self-adapts its mutation strategy, the crossover rate CR, and the scaling factor F. Yang and Yao [14] also confirmed that the selfadaptive version of DE works considerably good in comparison to the other versions of DE schemes. Yang and Yao [13] further used the self-adaptive DE (SDE) along with the CC scheme (CC-SDE) for the mathematical problems with large number of decision variables and obtained exceedingly good results. In order to increase the solution accuracy of the large-scale cost functions, it is observed that the random grouping of the variables while decomposing gives the best results as discovered by [15]. Although our area minimization problem is separable in nature, still random grouping-based strategy is used as this strategy has been proved to be statistically good grouping strategy in a large number of variables environment. Therefore, CCSDE is accommodated here to determine the optimum solutions of the total metal routing area minimization of PGN.

4.2 Metal Area Minimization Using CC-SDE The total metal routing area minimization algorithm for PGN exercising CC-SDE is presented in Algorithm 2. Initially, to find all the edge currents and all node voltages of the PGN, the grid analysis of PGN is performed with the help of the KLU-based matrix solver [16]. In order to power grid analysis, backward Euler-based discretization approach is used [7] to discretize (1) and KLU solver is used subsequently to solve the linearized system of equations of the PGN. All the required parameters are initialized for the SDE. Search space T is created considering all the constraints mentioned in Sect. 3.2 to restrict the search space of the cost function interior to the area of validation. Subsequently, for the problem P, initially cooperative coevolutionbased scheme is used to decompose a large number of variables of the problems into

80

S. Dey et al.

Algorithm 2: Metal Area minimization using CC-SDE

1 2 3 4 5 6 7

Input: The cost function P, widths (wi ), lengths (li ), number of edges (b), number of vertices (n). All the data are extracted from the power grid netlist. Output: Optimum metal widths of the metal lines with decreased total metal routing area. The reliability, delay, and power-aware constraints C1 , C2 , . . . , C7 mentioned in Sect. 3.2 are incorporated to generate a search space T .; while inside search space T do Initialization is done for the initial parameters of CC-SDE; Decomposition of the b variables in t subcomponents is done using random grouping strategy; For optimizing the subcomponents, SDE optimization algorithm is used; The subcomponents are co-adapted by randomly grouping the best solutions of the subcomponents.; Optimum widths of the metal lines of the PGN are evaluated corresponding to the minimized total metal routing area and the model parameters are updated.;

smaller instances of the subcomponents. The decomposed variables are collected randomly in a number of smaller groups to make different subcomponents. Once the subcomponents are formed, the minimization of each of the subcomponents is done individually, with the help of SDE optimizer. Eventually, after all the subcomponents are minimized individually, again arbitrary grouping-based co-adaptation scheme is employed to reach the global near-optimum value of the cost function. In this way, the total metal routing area of P is minimized. Optimized edge widths are determined corresponding to the minimized metal routing area of the PGN with the help of the cost function P.

5 Experimental Results For the implementation of all the algorithms used in this paper, MATLAB programming language is used. The experiments are accomplished on a computer with Linux operating system and with 64GB memory for validation of the proposed schemes. IBM PGN benchmarks [17] are used to showcase the area minimization results which are listed in Table 1.

Table 1 Power grid benchmark circuits data [17] Benchmark circuits #Nodes(n) #Edges(b) ibmpg2 ibmpg3 ibmpg4 ibmpg5 ibmpg6 ibmpgnew1

127238 851584 953583 1079310 1670494 1461036

208325 1401572 1560645 1076848 1649002 2352355

Edge resistance limits (in Ω) (0,1.17] (0,9.36] (0,2.34] (0,1.51] (0,17.16] (0,21.6]

PGRDP: Reliability, Delay, and Power-Aware Area …

81

IBM power grid benchmarks are the industry standard power grid benchmarks available which are extracted from the IBM processors, and which are widely used in the research of the power grid simulation. The node counts (n) and the edge counts (b) for different six IBM power grid benchmarks are listed in Table 1. Also, resistance values of all the edges are listed in Table 1. We also assumed different capacitances similar to the transient IBM power grid benchmarks [17] for time-delay constraint estimation. The power grid benchmarks do not contain length and width information, for which appropriate dimensions of the metal segments are considered with a minimum width of the metal lines be 0.1 µm. The metal interconnects’ sheet resistance is considered to be 0.02 Ω/. Power grid analysis is performed on the PGN benchmarks with the help of KLU solver. The edge currents and node voltages of the PGN are obtained from power grid analysis results, which can be used for the evaluation of the reliability, delay, and power dissipation constraints. Subsequently, all the constraints are evaluated and a search space T is constructed. The algorithm looks for optimum values of width within the search space T , corresponding to the minimized area of the PGN. The experiments were performed for the six IBM benchmarks circuits ibmpg2 to ibmpgnew1. The area minimization is done for the benchmark circuits using Algorithm 2. Before and after minimization results are given in Table 2. It is clear from Table 2 that the proposed scheme is able to minimize the metal routing area of the PGN significantly. For the ibmpg2, we have got 25.85% reduction in area. This shows that the metal routing area can be minimized for an overdesigned PGN without violating reliability, time-delay, and power-aware constraints. In order to obtain a variation of width before and after minimization, resistance and metal width budgeting is done for ibmpg4 circuit. Resistance budget for the ibmpg4 circuit is shown in Fig. 5 which shows different values of the resistances which are present in ibmpg4 circuit. From Fig. 5, we have got the metal width budget for ibmpg4 circuit before minimization which is shown in Fig. 6. Comparing Figs. 6 and 7, we can see that after minimization of the area for ibmpg4, the widths have been reduced significantly. In our experiments, we used fitness evaluation (FE)

Table 2 Comparison of metal routing area for IBM PGN benchmarks before and after minimization procedure Benchmark circuits Total area (mm2 ) Area reduced (%) Before minimization After minimization ibmpg2 ibmpg3 ibmpg4 ibmpg5 ibmpg6 ibmpgnew1

97.73 931.95 279.97 575.73 754.85 799.97

72.46 744.62 230.21 505.83 662.38 719.01

25.85 20.10 17.77 12.14 12.25 10.12

82

S. Dey et al.

Fig. 5 Resistance budget for ibmpg4 circuit according to the benchmark circuit data

105

Number of branches

3 2.5 2 1.5 1 0.5 0

0

0.5

1

1.5

2

2.5

Branch resistances (in Ohm)

Fig. 6 Metal width budget for ibmpg4 circuit before minimization

× 105

Number of branches

3 2.5 2 1.5 1 0.5 0

0

10

20

30

40

Metal width of the branches (in µm)

14

Number of branches

Fig. 7 Metal width budget for ibmpg4 circuit after minimization

× 104

12 10

8 6 4 2 0

0

10

20

30

40

Metal width of the branches (in µm)

value as 106 . One of the important reasons behind using this numerical value is that our proposed scheme (Algorithm 2) furnishes the best result with respect to the convergence, for the numerical FE value of 106 .

PGRDP: Reliability, Delay, and Power-Aware Area …

83

6 Conclusion The paper manifests a scheme for minimizing the metal routing area of the VLSI PGN. The cost function of metal area minimization problem is expressed in the form of large-scale optimization problem. For the minimization procedure, an evolutionary computation-based minimization approach using cooperative coevolution scheme is proposed in this paper which is used for the metal routing area minimization. Reliability, time-delay, and power-aware constraints are considered as a part of the minimization process in order to define the search space of the cost function. Different PGN benchmarks are used to demonstrate the applicability of the algorithm. Results on PGN benchmarks show significant reduction of the metal routing area without violating reliability, delay, and power-aware constraints.

References 1. Dey, S., Nandi, S., Trivedi, G.: Markov chain model using Lévy flight for VLSI power grid analysis. In: Proceedings of VLSID, pp. 107–112 (2017) 2. Tan, S.X.D., Shi, C.J.R., Lee, J.C.: Reliability-constrained area optimization of VLSI power/ground networks via sequence of linear programmings. IEEE TCAD 22(12), 1678– 1684 (2003) 3. Wang, T.Y., Chen, C.P.: Optimization of the power/ground network wire-sizing and spacing based on sequential network simplex algorithm. In: Proceedings of ISQED, pp. 157–162 (2002) 4. Zeng, Z., Li, P.: Locality-driven parallel power grid optimization. IEEE TCAD 28(8), 1190– 1200 (2009) 5. Zhou, H., Sun, Y., Tan, S.X.D.: Electromigration-lifetime constrained power grid optimization considering multi-segment interconnect wires. In: Proceedings of ASP-DAC, pp. 399–404 (2018) 6. Dey, S., Nandi, S., Trivedi, G.: PGIREM: reliability-constrained IR drop minimization and electromigration assessment of VLSI power grid networks using cooperative coevolution. In: Proceedings of ISVLSI (2018) 7. Butcher, J.C.: Numerical Methods for Ordinary Differential Equations. Wiley (2016) 8. Potter, M.A., Jong, K.A.D.: Cooperative coevolution: an architecture for evolving coadapted subcomponents. Evol. Comput. 8(1), 1–29 (2000) 9. Potter, M.A., De Jong, K.A.: A CC approach to function optimization. In: Proceedings of PPSN, pp. 249–257 (1994) 10. Liu, Y., Higuchi, T.: Scaling up fast evolutionary programming with CC. Proceedings of CEC, vol. 2, pp. 1101–1108 (2001) 11. Van den Bergh, F.: A cooperative approach to PSO. IEEE TEC 8(3), 225–239 (2004) 12. Shi, Y.J., Li, Z.Q.: Cooperative co-evolutionary DE for function optimization. In: Advances in Natural Computation, p. 428. Springer (2005) 13. Yang, Z., Yao, X.: Large scale evolutionary optimization using CC. Elsevier Inf. Sci. 178(15), 2985–2999 (2008a)

84

S. Dey et al.

14. Yang, Z., Yao, X.: Self-adaptive DE with neighborhood search. In: Proceedings of CEC, pp. 1110–1116 (2008b) 15. Omidvar, M.N., Yao, X.: Cooperative co-evolution for large scale optimization through more frequent random grouping. In: IEEE CEC, pp. 1–8 (2010) 16. Davis, T.A.: KLU, a direct sparse solver for circuit simulation problems. ACM TOMS 37(3), 36 (2010) 17. Nassif, S.R.: Power grid analysis benchmarks. In: Proceedings of ASP-DAC, pp. 376–381 (2008)

Forest Cover Change Analysis in Sundarban Delta Using Remote Sensing Data and GIS K. Kundu, P. Halder and J. K. Mandal

Abstract The present study deals with change detection analysis of forest cover in Sundarban delta during 1975–2015 using remote sensing data and GIS. Supervised maximum likelihood classification techniques are needed to classify the remote sensing data, and the classes are water body, barren land, dense forest, and open forest. The study reveals that forest cover areas have been increased by 1.06% (19.28 km2 ), 5.80% (106.82 km2 ) during the periods of 1975–1989 and 1989–2000, respectively. The reversed tendency has been observed during 2000–2015, and its areas have been reduced to 5.77% (111.85 km2 ). The change detection results show that 63%–80% of dense forest area and 66%–70% of open forest area have been unaffected during 1975–2015 and 1975–2000, respectively, while during the interval 2000–2015, only 36% of open forest area has been unaltered. The overall accuracy (86.75%, 90.77%, 88.16%, and 85.03%) and kappa statistic (0.823, 0.876, 0.842, and 0799) have been achieved for the year of 1975, 1989, 2000, and 2015 correspondingly to validate the classification accuracy. Future trend of forest cover changes has been analyzed using the fuzzy logic techniques. From this study, it may be concluded that in future, the forest cover area has been more declined. The primary goal of this study is to notify the alteration of each features, and decision-maker has to take measurement, scrutinize, and control the natural coastal ecosystem in Sundarban delta. Keywords Change detection · Remote sensing data · Dense forest · Open forest · Fuzzy logic · Sundarban K. Kundu (B) Department of Computer Science and Engineering, Government College of Engineering & Textile Technology, Serampore, Hooghly, India e-mail: [email protected] P. Halder Department of Computer Science and Engineering, Purulia Government Engineering College, Purulia, West Bengal, India e-mail: [email protected] J. K. Mandal Department of Computer Science and Engineering, University of Kalyani, Kalyani, Nadia, West Bengal, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2020 J. K. Mandal and D. Sinha (eds.), Intelligent Computing Paradigm: Recent Trends, Studies in Computational Intelligence 784, https://doi.org/10.1007/978-981-13-7334-3_7

85

86

K. Kundu et al.

1 Introduction Sundarbans (India and Bangladesh) is the biggest continuous tidal halophytic mangrove [1] forest delta in the globe. Its net area is approximately 10,000 km2 . Among these areas, 40% of the areas are present in India and the remaining areas belong to Bangladesh. It is the largest complex intertidal region where the three major rivers (Ganges, Brahmaputra, and Meghna) meet the Bay of Bengal. It is enclosed by the Hooghly River in West, on the east by Ichamati-Kalindi-Raimangal, on the south by Bay of Bengal, and on the north by Dampier Hodge line. Out of 102 islands in Sundarban delta, 48 islands are reserved for forest and remaining islands are reserved for the human settlement. Along the shoreline region it creates halophytic mangrove forest. It was stated as a biosphere reserve in the year of 1989 by United Nations Educational and Scientific Co-operation (UNESCO), and in the year of 1987, it was declared as world heritage place by International Union for Conservation of Nature (IUCN). Forests are very important natural resource on the earth. It does construct a natural blockage to guard along the riverbank and inland areas from natural calamities (hurricanes, cyclones, and tsunamis) [2]. It plays an important role in ecological safeguard such as soil conservation, enlarged biodiversity [3], and avoidance of climate changes. It is also utilized for widespread national-level economic growth through providing such as timber of industry and construction and source of medicine. Mangrove forests were declined in various region of the world. In present decades, rigorous and swift deforestation [4] has led to increasing global notifications for defended managing of forest assets. The region has enormous pressure due to raising various factors like settlement area, agricultural growth, huge wood removal, weather change, industrialization, and urbanization. The main causes of forest degradation are infringement, unlawful cutting of tree, forest fire and climate change. Forest lands cover areas translation into agricultural land, livelihood, farmland, and improper industrialization. The deforestation outcomes consist of fall down the biodiversity, impacts on climate, loss of a vital sink for atmospheric carbon dioxide, and negative effects on the regular livelihoods of stifling region peoples [5]. Tropical deforestation is accountable for enormous species destruction and effects on biological diversity. There are mainly two ways that affect the biological biodiversity such as habitat demolition and separation of the previously proximate forest into forest fragments. During 2000–2005, the worldwide total forest cover reduction was 0.6% per year. The usual ground observation techniques of forest cover monitoring [6] are too much more arduous, inadequate, time-consuming process. There are some drawbacks in normal ground verification method; the remote sensing tool is very important in supervising the forest, and it is also able to find out the deforestation patterns and movement of tropical forests. The primary data source like remote sensing data for change detection analysis plays a key role in the present scenario [7]. Remote

Forest Cover Change Analysis in Sundarban …

87

sensing data yield more accurate measurements than conventional techniques. The digital characteristics of remote sensing data are allowed to classification, compatibility with geographic information systems and advanced computer analysis. The digitally stored remotely sensed data offer a remarkable prospect; they help to study the present forest cover changes [8]. Recently, IKONOS, Quick Bird, and IRS are used for forest cover change detection analysis with high degree of accuracy while their spatial resolution is high and the cost is also very high. The lower cost image which is freely available from the Internet is used for examining land cover, land use, and landscape ecology, instead of high cost high-resolution image using medium and coarse resolution satellite data (MSS, TM, ETM+) with tolerable levels of precision [9]. India is a developing country and largely density populated nation in the earth. It has restricted number of natural resources. In current decades, numerous studies in India have been analyzed forest cover alteration in Sundarban with remote sensing data and GIS. In India, most of the forests are resides in the Sundarban region. During the period of 1970–1990, the mangrove forests areas were increased by 1.4%, while they were reduced by 2.5% during 1990–2000 in this region (India and Bangladesh). Mangrove forests areas are increased due to regrowth, plantation, and aggradations. Around 0.42% of mangrove forest areas were lost during 1975–2006 due to illegal cutting trees, enlarged settlement, augmented agricultural land, increased aquatics farm, etc. It was also notified that entire mangrove forests areas in South Asia were more deforested in compared with the reforested during the period 2000–2012 [10]. Remote sensing data are classified into various classification techniques such as on-screen visual interpretation, supervised, and unsupervised classifications [11]. To determine the various characteristics of the land cover features, different classification methods are used. It has been examined that among these various classification methods, on-screen classification procedure is more superior to other methods. As per accuracy, it is divulged that band ratio of supervised classification is healthier in compared with the other classification methods. In a recent scenario, it has been seen that for the past few years, the earth’s temperature is rising because of unplanned industrialization, deforestation, etc. Normal biodiversity has been changed due to the climate changes of an environment. The changes of climate directly or indirectly impact on the Sundarban estuary along the river bank or shoreline. As a result, along the shoreline of Bay of Bengal, the sea level has been raised, increased downstream salinity, and regular occurrence of natural distrusters such as cyclones, storms, etc. Recently, the status and distribution of various mangrove species have been discussed [12] in the Indian Sundarban region, and it is also observed that mangrove vegetation areas are more declined. The main purpose of this study is to explore the changes in various features and how much areas are converted into various features. By this inspection, the policy-makers can able to take decision to coordinate, manage, and sustain the normal biodiversity in the Sundarban delta.

88

K. Kundu et al.


89°0'0"E

22°10'0"N

88°40'0"E

22°0'0"N

88°30'0"E

India

West Bengal

21°40'0"N

21°40'0"N

21°50'0"N

21°50'0"N

22°0'0"N

22°10'0"N

Study Area (Sundarban)

88°30'0"E

88°40'0"E

88°50'0"E

89°0'0"E

Fig. 1 Geographical position of the study area

2 Study Area The present study area is positioned in the district of South 24 Parganas at West Bengal state in India, which is shown in Fig. 1. It lies between latitude 21˚40 00 N to 22˚30 00 N and longitude 88˚32 00 E to 89˚4 00 E. The study area covers 3038.68 km2 , and the whole region is covered by the reserve forest. The area is delimited by the Bay of Bengal on the south, borderline of India and Bangladesh on the east, Thakuran River on the west, and Basanti block on the north. Numerous natural biodiversity with various flora and fauna is present in the Sundarban region. More than 27 mangrove species, 40 species of mammals, 35 groups of reptiles, and 260 bird species have resided in this region. Wildlife species that exist in the vicinity include the Indian python, man-eating Royal Bengal tiger, spotted deer, sharks, macaque monkey, crocodiles, and wild boar. The forests are demonstrated three main tree groups Sundri, Goran, and Gewa. Additional species that build up the forest assembly consist of Avicennia, Xylocarpus, Sonneratia, Bruguiera, Rhizophora, and Nypa palm. The region contains various islands’ such as Gana, Baghmara, Bhangaduni, Mayadwip, Chandkhali, Harinbangha, and Holiday Island. It is usually flooded by diurnal tides.


Table 1 Detailed descriptions of the Landsat images

Satellite type | Sensor | No. of bands | Date of acquisition | Path and row | Spatial resolution (m)
Landsat 3 | MSS | 4 | 05.12.1975 | p-148, r-45 | 60
Landsat 5 | TM | 7 | 03.01.1989 | p-138, r-45 | 30
Landsat 7 | ETM+ | 8 | 17.11.2000 | p-138, r-45 | 30
Landsat 7 | ETM+ | 8 | 25.11.2015 | p-137, r-45 | 30

Table 2 Description of the forest cover types defined in this study

Class | Description
Water body | Area covered by open water like rivers, lakes, small channels, etc.
Barren land | Area covered by wetland, bare river beds, degraded land, newly cleared land, etc.
Dense forest | Tree density with canopy cover of 50% and above
Open forest | Tree density with canopy cover of less than 50%

3 Materials and Methods

3.1 Data Source

In this study, four multispectral Landsat satellite datasets were obtained from the Earth Explorer website (https://earthexplorer.usgs.gov/), where they are freely available. Details of the Landsat satellite images are presented in Table 1, which shows that all the images were acquired in almost the same season. All the satellite images were cloud free and unambiguous. The Multispectral Scanner (MSS) data were obtained from the Landsat 3 satellite with a spatial resolution of 60 m and include four spectral bands: green (0.5–0.6 µm), red (0.6–0.7 µm), near-infrared (NIR) (0.7–0.8 µm), and near-infrared (NIR) (0.8–1.1 µm). The Thematic Mapper (TM) data were acquired from the Landsat 5 satellite and contain seven spectral bands (blue, green, red, near-infrared (NIR), shortwave infrared 1, thermal, and shortwave infrared 2) with a spatial resolution of 30 m except for the thermal band. The Landsat 7 satellite carries the Enhanced Thematic Mapper Plus (ETM+) sensor, which provides eight spectral bands (blue, green, red, near-infrared (NIR), shortwave infrared 1, thermal, shortwave infrared 2, and panchromatic) with a spatial resolution of 30 m except for the thermal and panchromatic bands. A topographic map (79 C/9) at a scale of 1:50,000 was collected from the Survey of India and used for georeferencing.


3.2 Methodology

The four Landsat multispectral datasets were preprocessed to extract the significant information from the satellite images. In this article, the image preprocessing operations were performed with the TNTmips Professional 2017 software. The images were geometrically corrected and radiometrically calibrated, and drop lines or systematic striping or banding were removed from the collected images. Histogram equalization was applied to improve image quality. Each image was then digitized and its boundary line obtained. A layer-stacking tool was used to integrate three bands (bands 2, 3, and 4) into a single layer and to generate a false color composite (FCC). The nearest-neighbor algorithm was applied for resampling the dataset. The WGS84 datum and the UTM zone 45 N projection were then selected for mapping, and finally the study area was cropped out to carry out the present work. The forest cover in the study area was classified into four classes: water body, barren land, dense forest, and open forest. The maximum likelihood classification technique was used to classify the four years of imagery, with distinct colors assigned to the different features (water body, barren land, dense forest, and open forest). After the classification of each image, the area of each feature was evaluated in km², as shown in Table 3. A precision assessment was carried out to verify the classification accuracy through the overall accuracy and the kappa statistic. For change detection analysis, the classified raster images of two dates at a time (1975–1989, 1989–2000, and 2000–2015) were combined into a single raster file, which was then converted into vector format. From the vector file, the areas of the various change features were obtained to determine how much area remained unchanged and how much was changed. Figure 6 demonstrates the changes in the various features, such as unchanged area, increased forest area, decreased forest area, increased barren land area, decreased barren land area, and water body, represented in different colors. Figure 2 depicts the flowchart of the methodology.
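The change detection step described above (combining two classified rasters for a pair of dates and tabulating the transitions between classes) can be illustrated with a short sketch. This is a minimal illustration in Python/NumPy, not the TNTmips workflow used in the study; the class codes, array names, and pixel size are assumptions made only for the example.

```python
import numpy as np

# Hypothetical integer codes for the four forest cover classes
CLASSES = {1: "Water body", 2: "Barren land", 3: "Dense forest", 4: "Open forest"}

def change_matrix(classified_t1, classified_t2, n_classes=4):
    """Cross-tabulate two classified rasters (integer-coded, same shape).

    Entry (i, j) counts pixels labelled class i+1 at time 1 and class j+1 at time 2,
    so the diagonal holds the unchanged area of each class."""
    t1 = np.asarray(classified_t1).ravel()
    t2 = np.asarray(classified_t2).ravel()
    matrix = np.zeros((n_classes, n_classes), dtype=np.int64)
    for i in range(1, n_classes + 1):
        for j in range(1, n_classes + 1):
            matrix[i - 1, j - 1] = np.sum((t1 == i) & (t2 == j))
    return matrix

def to_area_km2(matrix, pixel_size_m=30.0):
    """Convert pixel counts to area in km^2 (e.g., 30 m Landsat TM/ETM+ pixels)."""
    return matrix * (pixel_size_m ** 2) / 1e6

if __name__ == "__main__":
    # Toy 3 x 3 example: one pixel changes from open forest (4) to water body (1)
    map_1989 = np.array([[1, 1, 2], [3, 3, 4], [4, 4, 4]])
    map_2000 = np.array([[1, 1, 2], [3, 3, 4], [4, 4, 1]])
    print(change_matrix(map_1989, map_2000))
```

Converting such a cross-tabulation to areas gives exactly the kind of change matrices reported in Tables 7, 8 and 9.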

4 Results

4.1 Classification

In this article, the forest cover is classified into four classes: water body, barren land, dense forest, and open forest. Table 2 gives a detailed description of the forest cover types defined in this study. The images are categorized with the maximum likelihood classifier (MLC) algorithm, a supervised classification technique for which a training dataset is necessary. The training sets are used to recognize the forest cover classes in the whole image.


Table 3 Summary of forest cover class area (in km²) and percentage of area in 1975, 1989, 2000, and 2015

Class name | 1975 Area | 1975 % | 1989 Area | 1989 % | 2000 Area | 2000 % | 2015 Area | 2015 %
Water body | 1016.89 | 33.47 | 993.11 | 32.69 | 902.11 | 29.69 | 1141.93 | 37.58
Barren land | 210.17 | 6.92 | 214.62 | 7.06 | 199.76 | 6.57 | 71.58 | 2.36
Dense forest | 595.62 | 19.60 | 701.87 | 23.10 | 725.4 | 23.87 | 658.39 | 21.67
Open forest | 1215.79 | 40.01 | 1128.82 | 37.15 | 1211.41 | 39.87 | 1166.57 | 38.39
Total | 3038.34 | 100 | 3038.42 | 100 | 3038.68 | 100 | 3038.47 | 100

Table 4 Total forest area in km² for the years 1975, 1989, 2000, and 2015

Forest class | 1975 | 1989 | 2000 | 2015
Dense forest | 595.62 | 701.87 | 725.4 | 658.39
Open forest | 1215.79 | 1128.82 | 1211.41 | 1166.57
Total forest area | 1811.41 | 1830.69 | 1936.81 | 1824.96

Fig. 2 Flowchart of the methodology (Landsat multi-temporal satellite images (MSS 1975, TM 1989, ETM+ 2000, ETM+ 2015) → geometric correction → boundary digitization → supervised classification → forest cover and change detection analysis → accuracy assessment and change detection map → future prediction using fuzzy logic)


Table 5 Total forest area change (in km²) and percentage of area change for the years 1975–1989, 1989–2000, and 2000–2015

 | 1975–1989 | 1989–2000 | 2000–2015
Total forest area change (in km²) | 19.28 | 106.12 | −111.85
Percentage of forest area change | 1.06 | 5.80 | −5.77

Each pixel is assigned to a particular class according to its probability of fitting that class. MLC relies on two important quantities, the mean vector and the covariance matrix, which are estimated from the training dataset. The outcomes of the classification show that MLC is a robust method, superior to the other methods, with minimal chance of misclassification. Table 3 clearly shows that the water body area declined during 1975–2000 while it increased during 2000–2015 because of global warming and the effects of the rising sea level. During 1975–1989, the barren land area increased slightly due to a fall in rainfall and water body, but it gradually reduced during 1989–2015 owing to the rising sea level. The dense forest area increased significantly during 1975–2000, although it declined during 2000–2015 due to deforestation. The open forest area was depleted during 1975–1989, increased marginally during 1989–2000, and showed a declining trend from 2000 to 2015. Table 3 summarizes the forest cover class areas (in km²) and the percentages of area in 1975, 1989, 2000, and 2015. Figure 3 depicts the four forest cover classes (water body, barren land, dense forest, and open forest) in 1975, 1989, 2000, and 2015. From the observation, it is seen that more dense forest exists in the southeast, south, and east of the Sundarban. It is also seen that the dense forest and open forest areas have increased along the river banks and shoreline. Figure 4 represents the year-wise forest cover class areas, and Fig. 5 depicts the year-wise total forest area. Table 4 gives the total forest area, comprising dense and open forest, for 1975, 1989, 2000, and 2015. From Table 5, it is seen that during 1975–2000 the forest area increased by approximately 6.86% (125.4 km²), whereas during 2000–2015 it declined by around 5.77% (111.85 km²).
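As an illustration of the maximum likelihood classifier described above, the following sketch estimates a mean vector and covariance matrix per class from training pixels and assigns every pixel to the class with the highest Gaussian log-likelihood. It is a simplified stand-in for the software actually used in the study; the array shapes, function names, and the equal-prior assumption are ours.

```python
import numpy as np

def train_mlc(training_samples):
    """training_samples: dict class_id -> (n_pixels, n_bands) array of training pixels.
    Returns per-class mean vectors and covariance matrices estimated from the samples
    (assumes enough training pixels per class for a non-singular covariance)."""
    stats = {}
    for cls, pixels in training_samples.items():
        pixels = np.asarray(pixels, dtype=float)
        stats[cls] = (pixels.mean(axis=0), np.cov(pixels, rowvar=False))
    return stats

def classify_mlc(image, stats):
    """image: (rows, cols, n_bands) array. Each pixel gets the class with the
    highest Gaussian log-likelihood, assuming equal prior probabilities."""
    rows, cols, bands = image.shape
    x = image.reshape(-1, bands).astype(float)
    classes = sorted(stats)
    scores = np.empty((x.shape[0], len(classes)))
    for k, cls in enumerate(classes):
        mean, cov = stats[cls]
        inv_cov = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        diff = x - mean
        mahalanobis = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        scores[:, k] = -0.5 * (logdet + mahalanobis)   # log-likelihood up to a constant
    labels = np.array(classes)[np.argmax(scores, axis=1)]
    return labels.reshape(rows, cols)
```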

4.2 Forest Cover Change Analysis

In this study, the forest cover area is primarily classified into four classes, namely, water body, barren land, dense forest, and open forest. Table 6 depicts the percentage change in each class during 1975–1989, 1989–2000, and 2000–2015. Over the period 1975–2015, the net water body area increased by approximately 12.30%; during 1975–2000 it declined by around 11.5%, while from 2000 to 2015 it increased significantly by about 26.58%.


Fig. 3 Maximum likelihood classification results of a 1975, b 1989, c 2000, and d 2015

Fig. 4 Forest cover areas for the years 1975, 1989, 2000, and 2015

Fig. 5 Forest area for the years 1975, 1989, 2000, and 2015

Table 6 Percentage of forest cover area change during 1975–1989, 1989–2000, and 2000–2015

Class | 1975–1989 | 1989–2000 | 2000–2015
Water body | −2.34 | −9.16 | 26.58
Barren land | 2.12 | −6.92 | −64.17
Dense forest | 17.84 | 3.35 | −9.24
Open forest | −7.15 | 7.32 | −3.70

The water body area increased because of the rising sea level caused by global warming. Over the period 1975–2015, the barren land area decreased by about 64.17% because of the growing water body, deforestation, etc.; during 1975–1989 it increased marginally by around 2.12%, while it declined during 1989–2015. The dense forest area increased by about 10.54% during 1975–2015, although it declined significantly, by approximately 9.24%, during 2000–2015. The open forest area was depleted by about 4.05% during 1975–2015, whereas during 1989–2000 it increased significantly by about 7.32%. Overall, the forest area increased by approximately 1.09% during 1975–2015. Tables 7, 8 and 9 illustrate that 84–98% of the water body and 63–81% of the dense forest areas did not change during 1975–2015, while the reverse trend was observed for barren land, whose unchanged area is only 23–34%. For the open forest, 66–70% of the area remained unchanged during 1975–2000 and 36% remained unchanged during 2000–2015. The major changes occurred at the outer edge or near the shoreline because of anthropogenic and natural forces. Moreover, during 1975–2000, 15–32% of the barren land was transformed into water bodies because of the rising sea level or variations in tidal inundation at the time of satellite image acquisition, 35–51% of the barren land was converted into open forest through new plantation, 15–34% of the dense forest was transformed into open forest because of deforestation, and 21–46% of the open forest was converted into dense forest due to regrowth. During the same period, 4–13% of the water bodies were transformed into open forest because of new plantation programs on the shoreline or along the river banks.


Table 7 Forest cover change matrix (area in km² and percentage) during 1975–1989

1975–1989 | Water body Area | % | Barren land Area | % | Dense forest Area | % | Open forest Area | %
Water body | 935.37 | 91.98 | 44.91 | 4.42 | 3.38 | 0.33 | 31.55 | 3.10
Barren land | 32.92 | 15.66 | 71.19 | 33.87 | 19.24 | 9.15 | 86.66 | 41.23
Dense forest | 4.82 | 0.81 | 8.68 | 1.46 | 378.95 | 63.62 | 202.89 | 34.06
Open forest | 17.95 | 1.48 | 88.85 | 7.31 | 300.51 | 24.72 | 808.13 | 66.47

Table 8 Forest cover change matrix (area in km² and percentage) during 1989–2000

1989–2000 | Water body Area | % | Barren land Area | % | Dense forest Area | % | Open forest Area | %
Water body | 839.92 | 84.57 | 22.52 | 2.27 | 1.62 | 0.16 | 129.04 | 12.99
Barren land | 45.01 | 20.97 | 82.06 | 38.24 | 10.37 | 4.83 | 76.11 | 35.46
Dense forest | 7.64 | 1.09 | 10.74 | 1.53 | 477.62 | 68.05 | 205.87 | 29.33
Open forest | 9.48 | 0.84 | 84.22 | 7.46 | 234.78 | 20.80 | 800.11 | 70.88

Table 9 Forest cover change matrix (area in km² and percentage) during 2000–2015

2000–2015 | Water body Area | % | Barren land Area | % | Dense forest Area | % | Open forest Area | %
Water body | 888.88 | 98.53 | 1.92 | 0.21 | 0.76 | 0.08 | 7.79 | 0.86
Barren land | 42.18 | 21.12 | 47.16 | 23.61 | 20.32 | 10.17 | 102.33 | 51.23
Dense forest | 13.83 | 1.91 | 5.68 | 0.78 | 586.94 | 80.91 | 111.44 | 15.36
Open forest | 194.98 | 16.10 | 17.46 | 1.44 | 558.06 | 46.07 | 436.45 | 36.03

During 2000–2015, the opposite tendency was observed: 16% of the open forest area was transformed into water bodies due to erosion along the coastline and the rising sea level. A high turnover occurred between forest and barren land areas because of encroachment, erosion, aggradation, and forest rehabilitation programs. The major erosion occurred in the southern region of the Sundarban delta, as shown in Fig. 6a–d.

4.3 Future Trend Analysis Using Fuzzy Logic

The study clearly indicates that the Sundarban mangrove forest area has declined, although not uniformly over the period 1975–2015. The main causes of forest degradation are natural and human pressures, decreased fresh water supply, changes in coastal erosion, increased salinity, pollution from industry, rising sea level, natural disasters, improper planning and management, an increase in man–animal conflicts, etc. In fuzzy set theory, the degree of membership of an element varies continuously between 0 and 1.


Fig. 6 Change detection results of a 1975–1989, b 1989–2000, c 2000–2015, and d 1975–2015

In this fuzzy logic system, two parameters, rising sea level (RSL) and climate change (CC), are taken as inputs, and one output, forest cover change (FCC), is considered. Figure 7 depicts the block diagram of the fuzzy logic system. Initially, the two input parameters (RSL and CC) are fuzzified; the fuzzy rules are then fed into the fuzzy inference engine, the heart of the system, where the two inputs are processed and the output (FCC) is obtained. Each of the input and output parameters (RSL, CC, and FCC) is divided into three intervals: low, medium, and high.


Fig. 7 Fuzzy inference system (inputs RSL and CC are fuzzified, processed by the fuzzy inference engine using the fuzzy rule base, and defuzzified to give the output FCC)

Fig. 8 Membership function for low, medium, and high

Fuzzy membership values between 0 and 1 for low, medium, and high are shown in Fig. 8. The fuzzy rules are given below:

IF (RSL is low) AND (CC is low) THEN (FCC is low).
IF (RSL is high) AND (CC is low) THEN (FCC is medium).
IF (RSL is low) AND (CC is high) THEN (FCC is medium).
IF (RSL is high) AND (CC is high) THEN (FCC is high).
IF (RSL is high) OR (CC is high) THEN (FCC is high).

Whenever both the rising sea level (RSL) and climate change (CC) are very high, the change in forest area (FCC) may be high, i.e., the forest area may decline further. If either RSL or CC is very high, the forest cover change (FCC) may be high, whereas if both inputs are very low, the reverse trend is obtained for the output. If one of the two input parameters (RSL and CC) is low and the other is high, the change in the output variable (FCC) will be moderate. In recent years, natural calamities (rising sea level, storms, cyclones, hurricanes, rising temperature, etc.) have been increasing rapidly, resulting in degradation of the forest cover. From this study, it may be concluded that the forest degradation tendency will increase further in the future.
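The rule base above can be evaluated with a small Mamdani-style sketch. The chapter does not give the membership function parameters, so the triangular shapes on a normalized [0, 1] scale and the output centroid values below are assumptions made purely for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify(x):
    """Low / medium / high memberships on a normalized [0, 1] scale (assumed shapes)."""
    return {
        "low": tri(x, -0.5, 0.0, 0.5),
        "medium": tri(x, 0.0, 0.5, 1.0),
        "high": tri(x, 0.5, 1.0, 1.5),
    }

def infer_fcc(rsl, cc):
    """Evaluate the five rules and defuzzify FCC by a weighted average of
    assumed representative values for low / medium / high."""
    r, c = fuzzify(rsl), fuzzify(cc)
    rules = [
        (min(r["low"], c["low"]), "low"),      # IF RSL low AND CC low THEN FCC low
        (min(r["high"], c["low"]), "medium"),  # IF RSL high AND CC low THEN FCC medium
        (min(r["low"], c["high"]), "medium"),  # IF RSL low AND CC high THEN FCC medium
        (min(r["high"], c["high"]), "high"),   # IF RSL high AND CC high THEN FCC high
        (max(r["high"], c["high"]), "high"),   # IF RSL high OR CC high THEN FCC high
    ]
    centers = {"low": 0.2, "medium": 0.5, "high": 0.8}   # assumed output centroids
    num = sum(w * centers[label] for w, label in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.0

print(infer_fcc(0.9, 0.8))   # high RSL and high CC -> FCC close to "high" (about 0.8)
```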


Table 10 Confusion matrix for the year 1975 (columns: reference data; rows: classified data)

Class | Water body | Barren land | Dense forest | Open forest | Total | User accuracy
Water body | 29 | 0 | 0 | 0 | 29 | 100%
Barren land | 2 | 31 | 0 | 3 | 36 | 86.11%
Dense forest | 0 | 0 | 37 | 2 | 39 | 94.87%
Open forest | 1 | 3 | 9 | 34 | 47 | 72.34%
Total | 32 | 34 | 46 | 39 | 151 |
Producer accuracy | 90.63% | 91.18% | 80.43% | 87.18% | |

Overall classification accuracy = 86.75%, Kappa statistic = 0.823

4.3.1 Accuracy Assessment

Image classification accuracy was assessed through the error matrix, a commonly used quantitative technique that relates the reference data to the classification results. In the confusion (error) matrix, the columns represent the ground truth data (field observations, visual interpretation, or Google Earth data), the rows represent the classes of the classified image being evaluated, and the cells give the number of pixels for every combination of ground truth and classified class. The diagonal cells give the number of correctly identified pixels, and the off-diagonal cells give the pixels that were not correctly recognized. The overall classification accuracy is the ratio of the total number of elements on the diagonal to the total number of elements used in the classification. The kappa statistic is another measure of accuracy; it quantifies how much the classification results differ from values that would be assigned by chance and ranges from 0 to 1, with higher values indicating a more accurate classification. To assess the correctness of the forest cover maps, confusion matrices were produced for the 1975, 1989, 2000, and 2015 images; they are presented in Tables 10, 11, 12 and 13, respectively. The overall accuracy, producer accuracy, user accuracy, and kappa statistic were computed to examine the classification accuracy. The overall accuracy of the 1975 image is 86.75% (kappa statistic 0.823), of the 1989 image 90.77% (kappa 0.876), of the 2000 image 88.16% (kappa 0.842), and of the 2015 image 85.03% (kappa 0.799).
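For reference, the overall accuracy and kappa statistic defined above can be computed directly from a confusion matrix, as in the following sketch, shown here with the 1975 matrix of Table 10.

```python
import numpy as np

def overall_accuracy(cm):
    """Ratio of correctly classified pixels (diagonal) to all pixels."""
    return np.trace(cm) / cm.sum()

def kappa(cm):
    """Cohen's kappa: agreement beyond what would be expected by chance."""
    n = cm.sum()
    observed = np.trace(cm) / n
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / (n * n)
    return (observed - expected) / (1 - expected)

# Confusion matrix of the 1975 image (rows: classified, columns: reference), Table 10
cm_1975 = np.array([
    [29, 0, 0, 0],
    [2, 31, 0, 3],
    [0, 0, 37, 2],
    [1, 3, 9, 34],
])
print(overall_accuracy(cm_1975))  # about 0.8675
print(kappa(cm_1975))             # about 0.82
```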


Table 11 Confusion matrix for the year 1989 (columns: reference data; rows: classified data)

Class | Water body | Barren land | Dense forest | Open forest | Total | User accuracy
Water body | 23 | 2 | 0 | 0 | 25 | 92%
Barren land | 1 | 31 | 2 | 0 | 34 | 91.18%
Dense forest | 0 | 0 | 27 | 3 | 30 | 90%
Open forest | 1 | 1 | 2 | 37 | 41 | 90.24%
Total | 25 | 34 | 31 | 40 | 130 |
Producer accuracy | 92% | 91.18% | 87.10% | 92.5% | |

Overall classification accuracy = 90.77%, Kappa statistic = 0.876

Table 12 Confusion matrix for the year 2000 (columns: reference data; rows: classified data)

Class | Water body | Barren land | Dense forest | Open forest | Total | User accuracy
Water body | 39 | 4 | 0 | 0 | 43 | 90.70%
Barren land | 2 | 30 | 3 | 0 | 35 | 85.71%
Dense forest | 0 | 2 | 29 | 1 | 32 | 90.63%
Open forest | 0 | 1 | 5 | 36 | 42 | 85.71%
Total | 41 | 37 | 37 | 37 | 152 |
Producer accuracy | 95.12% | 81.08% | 78.38% | 97.30% | |

Overall classification accuracy = 88.16%, Kappa statistic = 0.842

Table 13 Confusion matrix for the year 2015 (columns: reference data; rows: classified data)

Class | Water body | Barren land | Dense forest | Open forest | Total | User accuracy
Water body | 47 | 3 | 1 | 0 | 51 | 92.16%
Barren land | 3 | 25 | 5 | 0 | 33 | 75.76%
Dense forest | 1 | 4 | 37 | 1 | 43 | 86.05%
Open forest | 0 | 2 | 5 | 33 | 40 | 82.5%
Total | 51 | 34 | 48 | 34 | 167 |
Producer accuracy | 92.16% | 73.53% | 77.08% | 97.06% | |

Overall classification accuracy = 85.03%, Kappa statistic = 0.799


5 Conclusions

The present study reveals that the Sundarban forest area increased by about 6.86% during 1975–2000, although not uniformly over the period: it increased by 1.06% during 1975–1989 and by 5.80% during 1989–2000. During 2000–2015 the opposite trend was observed and the area declined by around 5.77%. These outcomes should, however, be interpreted with some caution because of the inaccuracies introduced by atmospheric conditions, which were not identical when the images were collected. The main causes of forest decline are frequently occurring storms, a decline in fresh water supply, rising sea level, submergence of the coastal region, human activity, etc. The study shows that the forest area has been gradually declining along the shoreline in the southern region of the Sundarban delta, while the forest has increased along the small channels on the northern side of the delta. The change detection results signify that 63–80% of the dense forest area and 66–70% of the open forest area did not change during 1975–2015 and 1975–2000, respectively, although during 2000–2015 only about 36% of the open forest area remained unaffected. The study also shows that some forest areas have been converted into barren land and water bodies. The overall classification accuracy is more than 85%, which indicates that the classification results are good. In future, by the year 2030, the forest area is expected to decline by around 2% of its 1975 extent because of rising sea level, global warming, and worldwide deforestation. Therefore, supervision, planning, and execution are immediately needed to sustain the natural coastal ecosystem in the Sundarban region.

Acknowledgement This research activity has been carried out in the Dept. of CSE, University of Kalyani, Kalyani, India. The authors acknowledge the support provided by the DST PURSE Scheme, Govt. of India at the University of Kalyani.

References 1. Ghosh, A., Schmidt, S., Fickert, T., Nüsser, M.: The Indian Sundarban mangrove forests: history, utilization, conservation strategies and local perception. Diversity 7, 149–169 (2015) 2. Alongi, D.M.: Mangrove forests: resilience; protection from tsunamis; and responses to global climate change. Estuar. Coast. Shelf Sci. 76, 1–13 (2008) 3. FSI: India State of Forest Report 2011. Forest Survey of India, Ministry of Environment and Forests, Dehradun (2011) 4. Jha, C.S., Goparaju, L., Tripathi, A., Gharai, B., Raghubanshi, A.S., Singh, J.S.: Forest fragmentation and its impact on species diversity: an analysis using remote sensing and GIS. Biodivers. Conserv. 14, 1681–1698 (2005) 5. Giri, C., Pengra, B., Zhu, Z., Singh, A., Tieszen, L.L.: Monitoring mangrove forest dynamics of the Sundarbans in Bangladesh and India using multi-temporal satellite data from 1973 to 2000. Estuar. Coast. Shelf Sci. 73, 91–100 (2007) 6. Pan, Y., Birdsey, R.A., Fang, J., Houghton, R., Kauppi, P.E., Kurz, W.A., et al.: A large and persistent carbon sinks in the world’s forests. Science 333(6045), 988–993 (2011) 7. Giri, C., Long, J., Sawaid Abbas, R., Murali, M., Qamer, F.M., Pengra, B., Thau, D.: Distribution and dynamics of mangrove forests of South Asia. J. Environ. Manage. 148, 1–11 (2014)


8. Ostendorf, B., Hilbert, D.W., Hopkins, M.S.: The effect of climate change on tropical rainforest vegetation pattern. Ecol. Model. 145(2), 211–224 (2001) 9. Giriraj, A., Shilpa, B., Reddy, C.S.: Monitoring of Forest cover change in Pranahita Wildlife Sanctuary, Andhra Pradesh, India using remote sensing and GIS. J. Environ. Sci. Technol. 1(2), 73–79 (2008) 10. Jayappa, K.S., Mitra, D., Mishra, A.K.: Coastal geomorphological and land-use and land cover study of Sagar Island, Bay of Bengal (India) using remotely sensed data. Int. J. Remote Sens. 27(17), 3671–3682 (2006) 11. Mitra, D., Karmekar, S.: Mangrove classification in Sundarban using high resolution multi spectral remote sensing data and GIS. Asian J. Environ. Disast. Manage 2(2), 197–207 (2010) 12. Giri, S., Mukhopadhyay, A., Hazra, S., Mukherjee, S., Roy, D., Ghosh, S., Ghosh, T., Mitra, D.: A study on abundance and distribution of mangrove species in Indian Sundarban using remote sensing technique. J Coast Conserv. 18, 359–367 (2014)

Identification of Malignancy from Cytological Images Based on Superpixel and Convolutional Neural Networks

Shyamali Mitra, Soumyajyoti Dey, Nibaran Das, Sukanta Chakrabarty, Mita Nasipuri and Mrinal Kanti Naskar

Abstract This chapter explores two methodologies for classifying cytology images as benign or malignant. Working toward automated analysis of such images that removes the need for human intervention, the chapter first revisits the history of automated CAD-based screening systems to place the evolving image processing techniques for biomedical images in context. The first approach uses a clustering-based method to segment the nucleus region from the rest of the image; after segmentation, nuclei features are extracted and classification is performed with standard classifiers. The second approach uses deep-learning-based techniques such as ResNet and InceptionNet-v3; here, classification is done with and without segmented images but without any handcrafted features. The analysis comes out in favor of the CNNs, whose average performance is better than the existing results obtained with the feature-based approach.

Keywords Cytology · FNAC · Superpixel-based segmentation · ResNet50 · InceptionNet-V3 · Random crop · Random horizontal flip

S. Mitra · M. K. Naskar: Department of Electronics and Telecommunication Engineering, Jadavpur University, Kolkata, India. S. Dey · N. Das (B) · M. Nasipuri: Department of Computer Science and Engineering, Jadavpur University, Kolkata, India. S. Chakrabarty: Theism Medical Diagnostics Centre, Dumdum, Kolkata, India.

© Springer Nature Singapore Pte Ltd. 2020
J. K. Mandal and D. Sinha (eds.), Intelligent Computing Paradigm: Recent Trends, Studies in Computational Intelligence 784, https://doi.org/10.1007/978-981-13-7334-3_8


1 Introduction

In recent times, the incidence rates of cancer have reached an alarming level. Cancer causes abnormal overgrowth of cells from the originating site. Normally, a cell divides to form new cells that replace old and worn-out cells, which aids the healing process. Cancer cells, however, divide indefinitely and more rapidly, violating the intricate control system of cell division. Mutations in genes cause the cells to proliferate and invade other tissues and cells. Although genetic factors are the leading cause of cancer for an individual, various other factors accelerate acquisition of the disease: exposure to external agents like plastics, heavy metals, radiation, and toxic chemical compounds, intake of junk and processed foods, and the overall lifestyle are additive factors for abnormal cell division. There are various imaging techniques to detect lumps or masses, such as magnetic resonance imaging (MRI), X-ray (plain film and computed tomography (CT)), ultrasound (US), and optical imaging, but these techniques cannot analyze tissue at the cellular level. For that reason, cytology is popularly used to detect abnormality in the cell structure. In cytology, cancer is diagnosed by expert cytotechnologists by examining cell samples taken either through biopsy or through cytological sampling. Cytology has various advantages compared with biopsy, which are stated in the section on cytology. As the current discussion is restricted to cytology-based diagnosis, we discuss only cytology-based research on automated computer-based system design. In cytology, the nucleus plays a vital role in detecting abnormalities in the cell. In some cases, nuclei are highly overlapped, and it is therefore very hard to segment the nucleus regions. Most of the techniques available in the literature segment out the nucleus region prior to classification into benign and malignant. With the advent of deep-learning-based techniques, it has become possible to classify the images without prior segmentation, but such techniques have not yet been fully successful for cytology images, especially breast cancer images. So, in this chapter, we discuss two methodologies and use them for the classification of benign and malignant cells. In the first approach, classification is performed after segmenting out the nuclei regions using a superpixel-based approach; the segmented images are then classified with standard classifiers based on extracted features. In the second approach, the same set of segmented images is classified using a deep-learning-based approach without extracting any handcrafted features. We have observed that the two approaches are almost comparable, with the deep-learning-based approach somewhat better.

2 Cytology: A Brief Overview

The human body is composed of innumerable cells, and each cell contains a nucleus and cytoplasm. The nucleus holds the genetic imprint, the DNA, which undergoes mutation when certain diseases are acquired. The change is observed in various


attributes of the nucleus and cytoplasm, such as shape, size, color, texture, and nature, and by observing these criteria under the microscope, malignant and premalignant conditions can be diagnosed. The branch of medical science that deals with the examination of cells to detect the presence of any abnormal cellularity is known as cytology. Cytology can be categorized into various types based on the originating site. Normally, a benign cell exhibits a well-defined pattern with a regular nuclear contour. A malignant nucleus, on the other hand, is characterized by irregular nuclear contours with varied shapes and sizes. Thus, nuclei have the utmost diagnostic value from a clinical point of view. There are various modalities by which the cellular specimen can be collected by the pathologist. One of the most common techniques is fine needle aspiration (FNA). It is a diagnostic procedure to sample cells from a cyst or a palpable mass, or from a region detected by other imaging techniques like X-ray [1], ultrasound [2], or mammography [3]. A thin needle (normally 23–25 gauge) is injected through the skin and a sufficient quantity of cells is taken out for investigation. This procedure was first performed successfully at Maimonides Medical Center, United States, in 1981. It was soon realized to be relatively faster, safer, cheaper, noninvasive, much less painful, and trauma-free compared with surgical biopsy [4]. Complications are very rare and include a mildly sore area and minor hemorrhage. It is extensively used in the detection and diagnosis of breast lumps, liver lesions, renal lesions, ovarian masses, soft tissue masses, pulmonary lesions, thyroid nodules, subcutaneous soft tissue masses, salivary glands, lymph nodes, etc. [5]. Many techniques, such as bacterial culture, immunocytochemistry, cytogenetics, and polymerase chain reaction, are possible from FNAC. Two types of tests are commonly used in cytology to detect malignancy:

Screening Test: This test is normally done before the actual onset of the disease, that is, when symptoms are not yet perceivable, and it is recommended for those in the high-risk zone. Screening at regular intervals can help to diagnose the disease at a premature stage, when it can be readily treated with the best possible outcome.

Diagnostic Test: This test is usually performed when symptoms actually start to develop and is prescribed for patients who are found to suffer from a serious ailment.

Both tests can be performed manually or via automatic computer-assisted design (CAD)-based systems. A manual test is performed under human jurisdiction and involves human effort, time, and fatigue. Examining more than hundreds of slides per day is a cumbersome task and requires expert cytotechnologists, who are in short supply. A manual diagnostic procedure is shown in Fig. 1. Therefore, automatic CAD-based systems are designed that can process thousands of specimens per minute without much human intervention. Needless to mention, such a system saves the time and energy of the cytotechnologist by assisting them technically in various ways.

• It can work as a prescreening system to distinguish between normal and, more often, abnormal specimens. The system should therefore tolerate more false positive cases to make sure that no malignant specimen requiring further investigation by doctors is overlooked.


Fig. 1 Diagram of a manual screening system

Thus, this system can work as an assistant to the doctors by removing the need to assess normal specimens, saving a significant amount of the doctors' time and energy and improving the efficiency of the diagnosis process.

• An automated system can be implemented in parallel with the conventional manual screening procedure. The screening process thus becomes twofold, so that the chances of faulty diagnosis and false negative cases, which are of utmost importance from a diagnostic point of view, are greatly reduced. An automatic diagnostic system is shown in Fig. 2.

Though cytology has numerous advantages, a few issues are yet to be resolved:

It cannot localize neoplastic lesion to an exact anatomic location. It cannot distinguish between preinvasive and invasive cancer. It is unable to distinguish reactive from dysplastic. May not be able to determine tumor type. The cellular structure of the specimen is influenced by the experience of the specimen collector, the technique, and instruments used. • False negative cases are more with FNA.

Identification of Malignancy from Cytological Images …

107

Fig. 2 Diagram of an automatic screening system

2.1 Automation as a Growing Concept in Cytology Automation is a pertinent solution in present context of increasing number of cancer cases and aimed to reduce the workload of the cytotechnologists. There were various designs developed during the period 1970–1990 like CERVISCAN [6], BioPEPR [7], LEYTAS [8], Xcyt [9], etc. which laid strong foundation on today’s era of automation-assisted screening system. In cytology, there are some automated devices that work as an adjunct with human interface to reduce the workload which are given below: Automated Slide Preparation Devices: Two systems have got FDA sanction for automated preparation of slides, ThinPrep Processor, and the AutoCyte Prep. ThinPrep processor 5000 uses thin prep technology for the preparation of slides and can produce roughly 35 slides/hour. AutoCyte prep also uses the same liquid-based technology but the technique of slide preparation of the two systems is different. Computerized Screening Systems: These systems work on the principle of the underlying image processing algorithms, where corresponding to an input cytology an output or processed image is produced. Image processing algorithms segment the region of interest, i.e., nuclei from the rest and extract features which could classify the image into benign and malignant. There are few systems that focus the portion of the slide carrying abnormalities and reduce the time by examining a small portion of slide. Computerized microscopes like CompuCyte’s M Pathfinder, Accumed International’s AcCell 2000 are highly appreciated for this purpose.


3 Literature Survey

Segmentation is always a challenging problem for extracting the region of interest in biomedical images because of their diverse and complicated nature, and several researchers have approached this problem in various ways. As already stated, in the present work we approach the classification of the specimen images in two ways. Before going into the present work, let us look at some significant state-of-the-art methods that have been implemented successfully and that help in understanding the context of the present work. An adaptive-thresholding-based segmentation method was proposed by Zhang et al. [10] for automatic segmentation of cervical cells; they explored a concave-point-based algorithm to delineate overlapping nuclei with reduced computational cost. Zhang et al. [11] used a local graph cut technique to segment nuclei and a global graph cut to segment the cytoplasm of free-lying cells. Zhao et al. [12] invoked a superpixel-based Markov random field (MRF) segmentation technique whose labeling process delineates the nucleus, cytoplasm, and other components of the cell. Li et al. [13] proposed spatial K-means clustering to initially classify the image into three clusters, nucleus, background, and cytoplasm; the final segmentation is done using a radiating gradient vector flow (RGVF) snake model. RGVF uses a new edge-map computation method to locate ambiguous and incomplete boundaries but fails to delineate overlapping cells. Chankong et al. [14] proposed patch-based fuzzy C-means (FCM) clustering to segment the nucleus, cytoplasm, and background. Hough-transform-based techniques have been used extensively in contemporary works. To segment breast FNAC images, George et al. [15] proposed a Hough-transform-based technique: circular structures were detected first, Otsu's thresholding was used to eliminate the false circles created during the process, and a marker-controlled watershed transform then accurately drew the nucleus boundaries. For classification, 12 features were extracted, and four different classifiers (MLP, PNN, LVQ, and SVM) were used to classify the images into benign and malignant with tenfold cross-validation. The marker-controlled watershed proposed by Yang et al. [16] segments the nuclei region from the rest. Street et al. [9] proposed a system called Xcyt to perform screening tests for breast cancer, using the Hough transform to detect circle-like shapes and an active contouring technique to detect the boundaries of nuclei. Hrebien et al. [17] also used a Hough-transform-based technique followed by an automatic nuclei localization method using a (1 + 1) search strategy; nuclei segmentation is done using watershed, an active contour model, and the grow-cut algorithm. However, they failed to address overlapping nuclei, and another drawback was the generation of false circles, which could not be resolved later. Garud et al. [18] proposed a deep-CNN-based classification approach for breast cytology samples. Experiments were conducted with eightfold cross-validation on 37 cytopathology samples using the GoogLeNet architecture. ROIs were extracted manually from the cell samples, and then the GoogLeNet architecture was trained


to classify these breast FNAC regions of interest (ROIs), achieving a mean recognition accuracy of 89.7%. An artificial neural network (ANN)-based diagnosis process was proposed by Dey et al. [19] to classify lobular carcinoma cases. The dataset consisted of 64 images (40 training, 8 validation, and 16 test images). Using HE-stained slides, automated image morphometry was used to study nuclear features like area, diameter, and perimeter. The network consisted of an input layer with 34 nodes, a hidden layer with 17 nodes, and an output layer with 3 classes. ANN-based classification of cytology images was also proposed by Isha et al. [20] to classify breast precancerous lesions; the dataset consisted of 1300 precancerous cases collected from Penang General Hospital and Hospital Universiti Sains Malaysia, Kelantan, Malaysia, for training and testing, and a hybrid multilayered perceptron network was used to classify the images. An automated technique for detecting the nuclei of cervical cells was proposed by Braz et al. [21] using a convolutional neural network (CNN). For the experiment, they used the overlapping cervical cytology image segmentation challenge (ISBI 2014) dataset. Square patches were extracted from the training images such that the central pixel belongs to the target classes, and rectified linear units were used in the convolutional and fully connected layers. Tareef et al. [22] suggested deep-learning-based segmentation and classification of Papanicolaou-smeared cervical cytology images; the image patches were generated by the simple linear iterative clustering process, and the diagnosis was done by a superpixel-based convolutional neural network.

4 Present Work

As indicated earlier, the present work consists of two different approaches, both validated on the same dataset, and nucleus segmentation is performed in both. As noted, segmentation is the most crucial problem in image analysis, mainly because of the complex and varied nature of the images. In cytological images, the region of interest is the nucleus, where most of the abnormalities are registered, so accurate delineation of the nucleus is required to extract the meaningful content. Various algorithms have been proposed for accurate segmentation of the nucleus. In the first approach, we segment the nuclei using a combination of clustering algorithms; after segmenting the region of interest, features such as compactness, eccentricity, area, and convex area of the nuclei are extracted, and classification is done using MLP, k-NN, AdaBoost, SVM, and random forest classifiers. In the second approach, after segmenting the nuclei, classification is done using a deep learning approach without any handcrafted feature set. Both approaches are discussed in detail in the following sections.


Fig. 3 Sample images of benign tumors

5 Dataset Description

The experiment was performed on FNAC-based cytological images collected from the pathology center "Theism Medical Diagnostics Centre, Dumdum, West Bengal." In total, 100 cytological images were collected, consisting of 50 malignant and 50 benign samples. The images were captured at 5-megapixel resolution using an Olympus microscope at 40X optical zoom (Figs. 3 and 4).

6 First Approach: Classification of the Images with the Help of Standard Classifiers

In this approach, we use feature-based classification with standard classifiers such as K-NN and MLP. The outline of the process is shown in Fig. 5. The images are represented in the RGB color space as

I_RGB = (F_R, F_G, F_B)    (1)


Fig. 4 Sample images of malignant tumors

Fig. 5 A block diagram of the first approach

where I_R(x, y), I_G(x, y), and I_B(x, y) denote the intensities of the red, green, and blue channels at pixel (x, y), respectively. The image is then split into its red, green, and blue channels, and anisotropic diffusion [23] is applied individually to the three channels to remove noise. Anisotropic diffusion blurs, or removes noise, selectively inside regions while leaving edges unblurred, which is very advantageous for further processing of the images. It is performed using the following equation:


F_t = div(c(x, y, t) ∇F) = c(x, y, t) ΔF + ∇c · ∇F    (2)

where the symbol ∇ denotes the difference in intensity between nearest-neighboring pixels and c is the diffusion coefficient, a function g of the local gradient, commonly chosen as g(x) = e^(−(x/k)²) or g(x) = 1 / (1 + (x/k)²). Median-based SLIC [24] is then used to segment the noise-removed image, forming 2000 superpixels. Superpixels whose area is ≤10 pixels are removed, and the features of the remaining superpixels are determined empirically using their median color values. The labeled image thus produced is subjected to spatial DBSCAN [25] clustering to identify the high-density regions in the image; let the new labeled image be I_db-scan, corresponding to the new clustered regions. The labeled image is then used for boundary detection to produce I_boundary, which is binarized to produce I_Binary. This is reversed to make the labels more distinct, giving I′ = [1]_{m×n} − I_Binary. Morphological erosion of the image I′ is then performed with a structuring element s, I′ ⊖ s = {z | s_z ⊆ I′}, where s_z is the translation of s by z; let I″ ≡ I′ ⊖ s. Next, the mean intensity of each connected component of I″ is calculated as MI = (1/n) Σ_{j=1}^{n} f(x_j, y_j), where n is the number of pixels and f(x_j, y_j) is the intensity value of the pixel at (x_j, y_j). The mean intensity values are assigned back to the pixels of each connected component: F1(x, y) = MI_i^red for all pixels (x, y) of the ith connected component in the red channel, and similarly F2(x, y) = MI_i^green and F3(x, y) = MI_i^blue. Thus,

I_contour = F1 ∪ F2 ∪ F3    (3)

I_contour is split into its red, green, and blue channels; each channel is converted to gray level, and the channels are finally merged. Fuzzy C-means clustering [26] then divides the image, and 15 clusters are chosen to form a new clustered image I_fcm. Irrelevant details are removed using connected component analysis: for the ith connected component, CC_i(x, y) = 0 if the area of CC_i is ≤700 pixels. Morphological operations such as erosion and dilation are applied to remove unwanted objects. Finally, the original pink-colored nuclei are overlaid on the masked background. To isolate the overlapping nuclei, an entropy-rate superpixel [27] algorithm is introduced, resulting in 750 superpixels, and the overlapping portions are separated based on the superpixel labels using the entropy rate.


Fig. 6 A block diagram of the superpixel-based segmentation approach

Table 1 Statistical information on performance of different classifiers [28]

Classifier | Class | Precision | Recall | F-measure | Accuracy (%)
K-NN | Class #1 | 1 | 0.813 | 0.897 | 91
K-NN | Class #2 | 0.85 | 1 | 0.919 |
K-NN | Weighted average | 0.923 | 0.909 | 0.908 |
MLP | Class #1 | 0.93 | 0.875 | 0.903 | 91
MLP | Class #2 | 0.889 | 0.941 | 0.914 |
MLP | Weighted average | 0.91 | 0.909 | 0.909 |

The overlapping regions are separated at the maximum value of the entropy-based objective function [27]. Thus, the segmented image is produced with only deep pink-colored nuclei, independent from each other (Fig. 6).
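A rough sketch of the segmentation pipeline described in this section is given below. It follows the broad steps (SLIC superpixels, clustering of superpixel median colours with DBSCAN, and morphological clean-up with the area thresholds quoted in the text) but omits the anisotropic diffusion, fuzzy C-means, and entropy-rate refinement stages; the DBSCAN parameters and the rule used to pick the nucleus cluster are assumptions, not values from the chapter.

```python
import numpy as np
from skimage import io, segmentation, morphology, measure
from sklearn.cluster import DBSCAN

def segment_nuclei(path):
    image = io.imread(path)[:, :, :3]

    # 1. SLIC superpixels (the chapter uses roughly 2000 superpixels per image)
    labels = segmentation.slic(image, n_segments=2000, compactness=10, start_label=1)

    # 2. Represent each sufficiently large superpixel by its median colour
    n = labels.max()
    median_colors = np.zeros((n, 3))
    for lab in range(1, n + 1):
        mask = labels == lab
        if mask.sum() > 10:                       # drop very small superpixels (<=10 px)
            median_colors[lab - 1] = np.median(image[mask], axis=0)

    # 3. DBSCAN groups superpixels with similar median colours; eps is an assumed value
    db = DBSCAN(eps=12.0, min_samples=3).fit(median_colors)

    # 4. Keep the darkest (stained-nucleus-like) colour cluster as foreground
    cluster_ids = [c for c in set(db.labels_) if c != -1]
    darkest = min(cluster_ids, key=lambda c: median_colors[db.labels_ == c].mean())
    nucleus_mask = np.isin(labels, np.where(db.labels_ == darkest)[0] + 1)

    # 5. Morphological clean-up and removal of tiny connected components (<700 px)
    nucleus_mask = morphology.binary_opening(nucleus_mask, morphology.disk(3))
    nucleus_mask = morphology.remove_small_objects(nucleus_mask, min_size=700)
    return nucleus_mask, measure.label(nucleus_mask)
```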

6.1 Experimental Results of the First Approach [28]

The experiment was performed with threefold cross-validation using the K-NN and MLP classifiers, and both achieved an average maximum recognition accuracy of 91%. For more details, see Table 1.


7 Second Approach: Classification of the Images with the Help of CNN

In this section, we introduce deep-learning-based techniques for the identification of benign and malignant cells. Deep-learning-based techniques are nowadays used heavily for computer vision tasks, and their success rate depends strongly on the architecture of the network and the number of image samples in the database. Most of the successful networks, such as AlexNet [29], ResNet [30], and InceptionNet [31], were designed mainly for natural image recognition. Benign and malignant cell identification from cytological images, on the other hand, is very difficult because there are no hard distinguishing factors between the two classes, and another major challenge is the limited database. Here, we first segment the cytological images based on superpixels; the segmented images are then used for training and testing with deep learning architectures such as ResNet and InceptionNet-V3. We have observed that the CNNs perform better on segmented images than on raw images, and the average performance is better than the existing results with the feature-based approach. More specifically, we use convolutional neural networks for the identification of benign and malignant cells, with the superpixel-based image segmentation technique separating the nuclei from the cytoplasm before the segmented regions are classified by the CNNs. Dataset preparation for deep-learning-based classification: The 100 segmented images (50 benign and 50 malignant), segmented previously by the superpixel-based approach, are divided randomly in the ratio 3:2:1 into training, test, and validation sets. The validation set is used to select the appropriate deep learning model from the training data: the best model is selected based on the validation set, which gives an estimate of the tuned model. In the proposed method, two deep networks, ResNet50 and InceptionNet-V3, are used for classification of the images, and the predefined architectures of these two networks are used for training. Detailed descriptions of these architectures are given in the following two subsections.

7.1 Network Architecture Description of ResNet50

The ResNet architecture, introduced by Microsoft Research Asia in 2015, is now popular in the computer vision domain. In the ResNet50 architecture, each two-layer block of ResNet34 is replaced with a three-layer bottleneck block (Fig. 7 and Table 2).
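For illustration, the three-layer bottleneck block mentioned above can be written in PyTorch roughly as follows; this is a torchvision-style sketch, not the exact library source.

```python
import torch.nn as nn

class Bottleneck(nn.Module):
    """Three-layer bottleneck used in ResNet50: 1x1 reduce -> 3x3 -> 1x1 expand (x4),
    with a residual (skip) connection added before the final ReLU."""
    expansion = 4

    def __init__(self, in_channels, planes, stride=1, downsample=None):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample  # 1x1 conv used when the identity shape differs

    def forward(self, x):
        identity = x if self.downsample is None else self.downsample(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return self.relu(out + identity)
```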


Fig. 7 Architecture of ResNet50

Table 2 Parameters of the network architecture [32]

Layer name | Layers | Output size
Conv1_x | 7 × 7, 64, stride 2 | 112 × 112
Conv2_x | 3 × 3 max pool, stride 2; [1 × 1, 64; 3 × 3, 64; 1 × 1, 256] × 3 | 56 × 56
Conv3_x | [1 × 1, 128; 3 × 3, 128; 1 × 1, 512] × 4 | 28 × 28
Conv4_x | [1 × 1, 256; 3 × 3, 256; 1 × 1, 1024] × 6 | 14 × 14
Conv5_x | [1 × 1, 512; 3 × 3, 512; 1 × 1, 2048] × 3 | 7 × 7
 | Average pool, 2-way fc, softmax | 1 × 1

7.2 Network Architecture Description of InceptionNet-V3

InceptionNet-V3 is a deep neural network available as a pretrained model in the PyTorch environment. The Inception module was originally trained by Google Inc. on the ImageNet dataset of 1000 classes, but in the proposed work it is trained on our binary-class dataset of cytology images. The Inception-v3 architecture is an upgraded version of Inception-v1 and Inception-v2. In Inception-v2, a 7 × 7 convolution is factorized into three 3 × 3 convolutions, a modest update over Inception-v1, and there are three inception modules of size 35 × 35 (Table 3). Inception-v3 has the same architecture as Inception-v2 with some minor changes: a batch-normalized auxiliary classifier is added, i.e., the fully connected layer of the auxiliary classifier is also batch normalized (Fig. 8).


Table 3 Parameters of the network architecture [33]

Layer name | Input size
Conv (convolution layer) | 299 × 299 × 3
Conv | 149 × 149 × 32
Conv padded | 147 × 147 × 32
Pool (pooling layer) | 147 × 147 × 64
Conv | 73 × 73 × 64
Conv | 71 × 71 × 80
Conv | 35 × 35 × 192
3 × Inception | 35 × 35 × 288
5 × Inception | 17 × 17 × 768
2 × Inception | 8 × 8 × 1280
Pool | 8 × 8 × 2048
Linear layer | 1 × 1 × 2048
Softmax | 1 × 1 × 2

Fig. 8 Architecture of InceptionNet-V3 (Image Source https://hackathonprojects.files.wordpress. com/2016/09/74911-image03.png)

7.3 Training Process

First, the images from the training set are randomly flipped horizontally and the flipped images are cropped randomly, so an image of dimension 960 × 1280 is cropped randomly to 224 × 224 (for ResNet50) or 299 × 299 (for InceptionNet-V3); in the PyTorch implementation, the input images must be resized to these dimensions. These randomly cropped images are used to train the neural networks. The transformations are performed at runtime, so at each training epoch the cropped portions change randomly. For these experiments, an NVIDIA GeForce GTX 970 GPU system with 1664 CUDA cores and 4 GB of memory is used. The batch size, number of epochs, and learning rate of the model are set to 8, 200, and 0.0001, respectively. Among the different optimization techniques, the ADAM optimizer is used for both networks, and the training loss is calculated by negative log-likelihood estimation. On the 4 GB GPU, the training time is approximately 1.5 h for InceptionNet-V3 and approximately 1 h for ResNet50. The developed training architecture for Experiment 1 is shown in Fig. 9.
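A minimal PyTorch sketch of this training configuration (random horizontal flip, random 224 × 224 crop, ResNet50 with a two-class output layer, Adam with learning rate 0.0001, batch size 8) is given below. The folder layout and the use of CrossEntropyLoss, which combines log-softmax with the negative log-likelihood loss mentioned in the text, are assumptions made for the example; InceptionNet-V3 would be trained analogously with 299 × 299 crops.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Transforms for ResNet50 (224 x 224 crops); InceptionNet-V3 would use 299 x 299 instead
train_tf = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(224),
    transforms.ToTensor(),
])

# Assumed folder layout: train/benign and train/malignant
train_set = datasets.ImageFolder("train", transform=train_tf)
train_loader = DataLoader(train_set, batch_size=8, shuffle=True)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = models.resnet50(pretrained=True)           # or weights="IMAGENET1K_V1" on newer torchvision
model.fc = nn.Linear(model.fc.in_features, 2)      # two classes: benign / malignant
model = model.to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()                  # log-softmax + negative log-likelihood

for epoch in range(200):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * images.size(0)
    print(f"epoch {epoch + 1}: loss {running_loss / len(train_set):.4f}")
```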


Fig. 9 The experimentation module for Experiment-1 of the second approach

Table 4 The results of classification accuracy of Experiment-1

Neural network model | Test 1 accuracy (%) | Test 2 accuracy (%) | Test 3 accuracy (%) | Average accuracy (%)
InceptionNet-V3 | 90 | 95 | 90 | 91.67
ResNet50 | 85 | 90 | 85 | 86.67


7.4 Classification Result

Experiment-1: First, the 20 test images (10 benign and 10 malignant) are transformed by random horizontal flip, cropped randomly to 224 × 224, and then classified by the best-trained ResNet50 and InceptionNet-V3 models. Testing was repeated three times with these trained models (Figs. 10, 11 and Table 4).


Fig. 10 A diagram of using deep learning with corresponding segmented input images

Table 5 Statistical information of Experiment-2

Metric | Inception-v3: Benign | Inception-v3: Malignant | ResNet-50: Benign | ResNet-50: Malignant
Precision | 1 | 0.8 | 0.9 | 0.8
Recall | 0.83 | 1 | 0.82 | 0.89
F-measure | 0.91 | 0.89 | 0.86 | 0.84
Accuracy | 90% (both classes) | | 85% (both classes) |

Experiment-2: From the best-trained models, the class-specific probability distributions of the test samples are extracted. Since the test images are transformed by random horizontal flip and then randomly cropped, distinct class-specific probability values are obtained for each test sample at each testing; here, five tests are conducted and the average probability values are calculated. Let p_i^B(k) and p_i^M(k) be the probability values of class-1 (benign) and class-2 (malignant), respectively, of the kth test sample in the ith test, where i = 1(1)5 and k = 1(1)20. The resultant probability values of the kth test sample are p^B(k) = (1/m) Σ_{i=1}^{m} p_i^B(k) and p^M(k) = (1/m) Σ_{i=1}^{m} p_i^M(k), where p^B(k) and p^M(k) denote the average probability of class-1 and class-2, respectively, of the kth sample and m is the number of test runs. The test samples are predicted from these averaged probabilities, i.e., if a test sample has the highest average probability for class-k, it is assigned to class-k (Table 5).
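The averaging described above amounts to simple test-time augmentation. A sketch is given below; the number of runs, the transform, and the helper name are illustrative assumptions rather than details taken from the chapter.

```python
import torch
import torch.nn.functional as F
from torchvision import transforms
from PIL import Image

test_tf = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(224),
    transforms.ToTensor(),
])

@torch.no_grad()
def averaged_prediction(model, image_path, n_runs=5, device="cpu"):
    """Average class probabilities over n_runs randomly transformed copies of one image.
    Returns the (benign, malignant) probabilities and the predicted class index."""
    model.eval()
    image = Image.open(image_path).convert("RGB")
    probs = torch.zeros(2, device=device)
    for _ in range(n_runs):
        x = test_tf(image).unsqueeze(0).to(device)   # a fresh random flip/crop each run
        probs += F.softmax(model(x), dim=1).squeeze(0)
    probs /= n_runs
    return probs.tolist(), int(probs.argmax())
```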


Fig. 11 Graph plot of train loss (in red) versus validation loss (in blue) per epoch (training on segmented images) of Experiment-1

Table 6 Experiment results of Experiment-3

Neural network | Test-1 (%) | Test-2 (%) | Test-3 (%) | Average accuracy (%)
ResNet50 | 80 | 75 | 85 | 80
Inception-v3 | 85 | 75 | 70 | 76.67

Experiment-3: The raw RGB images are transformed by random horizontal flip and then randomly cropped to 224 × 224 for ResNet50 and 299 × 299 for InceptionNet-V3 before classification (Fig. 12 and Table 6).


Fig. 12 Graph plot of train loss (in red color) versus validation loss (in blue color) per epoch in Experiment-3

8 Conclusion

In this chapter, we discussed the advantages and challenges of cytology images and explored two techniques for the automatic recognition of cytology images. The first is a traditional feature-based approach, in which segmentation is performed prior to feature extraction and the segmented nuclei are classified on the basis of the extracted features; segmentation is done using different clustering-based methods with fine-tuning using entropy-based superpixels. We found a maximum recognition accuracy of 91% with threefold cross-validation using MLP and K-NN. The classification of the dataset using deep learning, especially CNNs, is explored in the other approach, where we conducted two experiments: Experiment-1 deals with the raw dataset, for which we observed poor performance during testing, while Experiment-2 investigates the performance of the CNNs on the segmented data, for which we saw a profound improvement in recognition accuracy. Both experiments in the second approach are devoid of handcrafted features. Two popular CNN modules, ResNet and InceptionNet-V3,

Identification of Malignancy from Cytological Images …

121

are used for classification of data. We also explore the probability-based combination of two CNNs modules. We found that average performance of InceptionNetV3 is better than the other methods. The highest recognition accuracy of 95% is accorded by InceptionNet-V3 in one set of segmented data, which is slightly higher than our first approach. Inclusion of more number of samples may improve the performance of deep learning module significantly.
