Learning Automata Approach for Social Networks [1st ed.] 978-3-030-10766-6, 978-3-030-10767-3

This book begins by briefly explaining learning automata (LA) models and a recently developed cellular learning automaton model …


English · Pages XVII, 329 [339] · Year 2019





Table of contents:
Front Matter ....Pages i-xvii
Introduction to Learning Automata Models (Alireza Rezvanian, Behnaz Moradabadi, Mina Ghavipour, Mohammad Mehdi Daliri Khomami, Mohammad Reza Meybodi)....Pages 1-49
Wavefront Cellular Learning Automata: A New Learning Paradigm (Alireza Rezvanian, Behnaz Moradabadi, Mina Ghavipour, Mohammad Mehdi Daliri Khomami, Mohammad Reza Meybodi)....Pages 51-74
Social Networks and Learning Systems: A Bibliometric Analysis (Alireza Rezvanian, Behnaz Moradabadi, Mina Ghavipour, Mohammad Mehdi Daliri Khomami, Mohammad Reza Meybodi)....Pages 75-89
Social Network Sampling (Alireza Rezvanian, Behnaz Moradabadi, Mina Ghavipour, Mohammad Mehdi Daliri Khomami, Mohammad Reza Meybodi)....Pages 91-149
Social Community Detection (Alireza Rezvanian, Behnaz Moradabadi, Mina Ghavipour, Mohammad Mehdi Daliri Khomami, Mohammad Reza Meybodi)....Pages 151-168
Social Link Prediction (Alireza Rezvanian, Behnaz Moradabadi, Mina Ghavipour, Mohammad Mehdi Daliri Khomami, Mohammad Reza Meybodi)....Pages 169-239
Social Trust Management (Alireza Rezvanian, Behnaz Moradabadi, Mina Ghavipour, Mohammad Mehdi Daliri Khomami, Mohammad Reza Meybodi)....Pages 241-279
Social Recommender Systems (Alireza Rezvanian, Behnaz Moradabadi, Mina Ghavipour, Mohammad Mehdi Daliri Khomami, Mohammad Reza Meybodi)....Pages 281-313
Social Influence Maximization (Alireza Rezvanian, Behnaz Moradabadi, Mina Ghavipour, Mohammad Mehdi Daliri Khomami, Mohammad Reza Meybodi)....Pages 315-329


Studies in Computational Intelligence 820

Alireza Rezvanian Behnaz Moradabadi Mina Ghavipour Mohammad Mehdi Daliri Khomami Mohammad Reza Meybodi

Learning Automata Approach for Social Networks

Studies in Computational Intelligence Volume 820

Series editor Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland e-mail: [email protected]

The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence, quickly and with high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the worldwide distribution, which enable both wide and rapid dissemination of research output. The books of this series are submitted for indexing to Web of Science, EI-Compendex, DBLP, SCOPUS, Google Scholar and SpringerLink.

More information about this series at http://www.springer.com/series/7092

Alireza Rezvanian · Behnaz Moradabadi · Mina Ghavipour · Mohammad Mehdi Daliri Khomami · Mohammad Reza Meybodi







Learning Automata Approach for Social Networks


Alireza Rezvanian
School of Computer Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
and Computer Engineering and Information Technology Department, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

Behnaz Moradabadi
Computer Engineering and Information Technology Department, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

Mina Ghavipour
Computer Engineering and Information Technology Department, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

Mohammad Mehdi Daliri Khomami
Computer Engineering and Information Technology Department, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

Mohammad Reza Meybodi
Computer Engineering and Information Technology Department, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

ISSN 1860-949X   ISSN 1860-9503 (electronic)
Studies in Computational Intelligence
ISBN 978-3-030-10766-6   ISBN 978-3-030-10767-3 (eBook)
https://doi.org/10.1007/978-3-030-10767-3
Library of Congress Control Number: 2018966843

© Springer Nature Switzerland AG 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

To my lovely wife, Razieh, my beloved and merciful parents, Mohammad Reza and Nahid, and my dear sisters, Saba and Sepinoud, for their love and support
Alireza Rezvanian

To the memory of my father, Mohsen
Behnaz Moradabadi

To my family
Mina Ghavipour

To my family
Mohammad Mehdi Daliri Khomami

Preface

This book is written for computer engineers, social scientists, and students studying or working on social networks, artificial intelligence, machine learning, reinforcement learning, and learning automata. It collects recent developments in learning automata theory in social network analysis applications, and describes in detail the learning automata models that have been applied to different problems of social network analysis, including graph problems, network sampling, community detection, link prediction, trust management, recommender systems, and influence maximization. In each chapter, the learning automata-based methods are validated through theoretical analysis or simulations. A new model of cellular learning automata, called wavefront cellular learning automata, is also introduced for social network analysis; thanks to its distributed characteristics, this model has been successfully applied to link prediction and network sampling. The level of mathematical analysis is well within the grasp of scientists as well as graduate students from the computer engineering and social science domains. Readers are encouraged to have a basic understanding of social network analysis, reinforcement learning, learning automata, and related topics.

This book consists of nine chapters dedicated to using recent models of learning automata for social network applications. Chapter 1 provides the necessary background on learning automata theory, distributed learning automata, and several models of learning algorithms. Chapter 2 gives a brief introduction to a recent cellular learning automata model named wavefront cellular learning automata. Chapter 3 analyzes research on learning approaches in social networks from a bibliometric perspective. Chapter 4 is devoted to applications of learning automata in network sampling algorithms. Chapter 5 discusses learning automata algorithms for community detection. Chapter 6 presents link prediction methods using learning automata models. Chapter 7 introduces recent developments in learning automata theory in social trust management. Recent social recommender systems based on learning automata techniques are reported in Chap. 8. Finally, Chap. 9 provides new methods of influence maximization based on learning automata for information diffusion.


The authors would like to thank Dr. Thomas Ditzinger, Executive Editor, Interdisciplinary and Applied Sciences and Engineering, Springer; Mr. Ravi Vengadachalam, Project Coordinator, Books Production, Springer-Verlag, Heidelberg, for the editorial assistance; and Sindhu Sundararajan, Scientific Publishing Services, for the excellent cooperation in producing this important scientific work. We hope that readers will share our pleasure in presenting this book on the learning automata approach for social networks and will find it useful in their research.

Acknowledgements We are grateful to the many people who have helped us during the past few years, who have contributed to the work presented here, and who have offered critical reviews of prior publications. We would like to thank IPM, Iran, for partially supporting this research (Grant no. CS1397-4-79). We thank Springer for their assistance in publishing the book. We are also grateful to our academic supervisor, our families, our parents, and all our friends for their love and support.

Tehran, Iran
October 2018

Alireza Rezvanian Behnaz Moradabadi Mina Ghavipour Mohammad Mehdi Daliri Khomami Mohammad Reza Meybodi

Contents

1 Introduction to Learning Automata Models ... 1
1.1 Introduction ... 1
1.2 Learning Automata ... 2
1.2.1 Environment ... 3
1.2.2 Automaton ... 4
1.2.3 Fixed-Structure Learning Automata (FSLA) ... 6
1.2.4 Variable-Structure Learning Automata (VSLA) ... 9
1.2.5 Variable Action Set Learning Automata ... 11
1.2.6 Continuous Action-Set Learning Automata (CALA) ... 13
1.2.7 Non-estimator Learning Algorithms ... 14
1.2.8 Estimator Learning Algorithms ... 26
1.2.9 Pursuit Algorithms ... 30
1.3 Interconnected Learning Automata ... 33
1.3.1 Hierarchical Structure Learning Automata (HSLA) ... 34
1.3.2 Multi-level Game of Learning Automata ... 35
1.3.3 Network of Learning Automata (NLA) ... 35
1.3.4 Distributed Learning Automata (DLA) ... 36
1.3.5 Extended Distributed Learning Automata (eDLA) ... 38
1.4 Recent Applications of Learning Automata ... 39
1.5 Conclusion ... 40
References ... 42

2 Wavefront Cellular Learning Automata: A New Learning Paradigm ... 51
2.1 Introduction ... 51
2.1.1 Cellular Learning Automata ... 51
2.2 Wavefront Cellular Learning Automata ... 61
2.2.1 Analysis ... 65
2.3 Conclusion ... 69
References ... 69

3 Social Networks and Learning Systems: A Bibliometric Analysis ... 75
3.1 Introduction ... 75
3.2 Material and Method ... 76
3.2.1 Data Collection and Initial Results ... 76
3.2.2 Refining the Initial Results ... 77
3.2.3 Analyzing the Final Results ... 77
3.3 Results ... 77
3.3.1 Initial Result Statistics ... 77
3.3.2 Key Journals ... 78
3.3.3 Key Researchers ... 79
3.3.4 Key Articles ... 81
3.3.5 Key Affiliation ... 83
3.3.6 Top Keywords ... 86
3.4 Conclusion ... 88
References ... 88

4 Social Network Sampling ... 91
4.1 Introduction ... 91
4.2 Categorization of Graph Sampling Algorithms ... 92
4.2.1 Random Versus Topology-Based Sampling ... 92
4.2.2 Simple Versus Extended Sampling ... 94
4.2.3 Static Versus Streaming Graph Sampling ... 95
4.3 Learning Automata Based Graph Sampling Algorithms ... 98
4.3.1 Distributed Learning Automata-Based Sampling (DLAS) ... 98
4.3.2 Extended Distributed Learning Automata-Based Sampling (EDLAS) ... 99
4.3.3 The Extended Topology-Based Node Sampling Algorithm ICLA-NS ... 102
4.3.4 The Streaming Sampling Algorithm FLAS ... 120
4.4 Learning Automata Based Stochastic Graph Sampling Algorithms ... 134
4.4.1 Network Sampling in Stochastic Graphs ... 134
4.5 Conclusion ... 145
References ... 146

5 Social Community Detection ... 151
5.1 Introduction ... 151
5.1.1 Related Work ... 151
5.2 Community Detection Using Distributed Learning Automata ... 153
5.2.1 Initialization ... 154
5.2.2 Communities Formation ... 154
5.2.3 Computation of the Objective Function ... 155
5.2.4 Updating Action Probabilities ... 155
5.2.5 Stopping Conditions ... 155
5.3 Community Detection Using Michigan Memetic Learning Automata ... 156
5.4 Community Detection Using Cellular Learning Automata ... 160
5.4.1 Initialization ... 162
5.4.2 Communities Formation ... 162
5.4.3 Computation of the Objective Function ... 163
5.4.4 Updating the Action Probabilities ... 164
5.4.5 Stopping Conditions ... 164
5.4.6 Experiments ... 164
5.5 Conclusion ... 167
References ... 167

6 Social Link Prediction ... 169
6.1 Introduction ... 169
6.2 Link Prediction ... 170
6.3 Link Prediction in Stochastic Social Networks ... 173
6.3.1 Similarity Metrics in Stochastic Graphs ... 174
6.3.2 Link Prediction in Stochastic Graphs ... 176
6.3.3 Experimental Results ... 179
6.3.4 Discussion ... 191
6.4 Link Prediction in Weighted Social Networks ... 192
6.4.1 Review of Link Prediction in Weighted Networks ... 192
6.4.2 Weighted Similarity Metrics ... 194
6.4.3 The Weighted Link Prediction ... 195
6.4.4 Experiment Results ... 198
6.4.5 Discussion ... 206
6.5 Link Prediction in Fuzzy Social Networks Using Distributed Learning Automata ... 206
6.5.1 Link Prediction Method in Fuzzy Social Networks ... 207
6.5.2 Experiment Results ... 212
6.6 Link Prediction in Time Series Social Networks ... 220
6.6.1 A Time Series Based Link Prediction Method ... 221
6.6.2 Link Prediction Based on Temporal Similarity Metrics ... 228
6.6.3 Experiment Results ... 233
6.6.4 Discussion ... 236
6.7 Conclusion ... 237
References ... 237

7 Social Trust Management ... 241
7.1 Introduction ... 241
7.2 Properties of Trust ... 242
7.3 Categorization of Trust Inference Models ... 243
7.3.1 Non-propagating Trust Models ... 243
7.3.2 Propagating Trust Models ... 244
7.4 Learning Automata Based Trust Propagation Algorithms ... 250
7.4.1 The Trust Propagation Algorithm DLATrust ... 250
7.4.2 The Stochastic Trust Propagation Algorithm DyTrust ... 259
7.5 Conclusion ... 275
References ... 276

8 Social Recommender Systems ... 281
8.1 Introduction ... 281
8.2 Categorization of Recommender Systems ... 281
8.2.1 CF-Based Recommender Systems ... 282
8.2.2 Trust-Based CF Recommender Systems ... 283
8.3 Learning Automata Based Recommender Systems ... 287
8.3.1 The Adaptive Fuzzy Recommender System CALA-OptMF ... 287
8.3.2 Stochastic Trust Propagation-Based Recommender System LTRS ... 298
8.4 Conclusion ... 308
References ... 309

9 Social Influence Maximization ... 315
9.1 Introduction ... 315
9.1.1 Related Work ... 316
9.2 Learning Automata Approach for Influence Maximization ... 317
9.3 Learning Automata for Solving Positive Influence Dominating Set and Its Application in Influence Maximization ... 318
9.3.1 Learning Automata-Based Algorithm for Finding the Minimum Positive Influence Dominating Set ... 319
9.4 Conclusion ... 327
References ... 328

About the Authors

Alireza Rezvanian received the B.Sc. degree from Bu-Ali Sina University of Hamedan, Iran, in 2007, the M.Sc. degree in Computer Engineering with honors from Islamic Azad University of Qazvin, Iran, in 2010, and the Ph.D. degree in Computer Engineering from the Computer Engineering and Information Technology Department of Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran, in 2016. He currently works as a researcher in the School of Computer Science at the Institute for Research in Fundamental Sciences (IPM), Tehran, Iran. He has authored or co-authored more than 70 research publications in reputable peer-reviewed journals and conferences including IEEE, Elsevier, Springer, Wiley, and Taylor & Francis. He has been guest editor of a special issue on new applications of learning automata-based techniques in real-world environments for the Journal of Computational Science (Elsevier), and is an associate editor of Human-centric Computing and Information Sciences (Springer). His research activities include soft computing, evolutionary algorithms, complex networks, social network analysis, data mining, data science, machine learning, and learning automata.


Behnaz Moradabadi received the B.S. degree from Tabriz University, Tabriz, Iran, in 2009, and the M.Sc. degree in Computer Engineering from Sharif University of Technology, Tehran, Iran, in 2011. She received the Ph.D. degree in Computer Engineering from the Computer Engineering and Information Technology Department of Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran, in 2018. Her current research interests include social networks, learning systems, soft computing, and information retrieval.

Mina Ghavipour received her B.Sc. degree in Computer Engineering in 2008. She received the M.Sc. and Ph.D. degrees from the Department of Computer Engineering and Information Technology at Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran, in 2012 and 2018, respectively, under the supervision of Prof. Mohammad Reza Meybodi. Her research interests lie in the areas of social network analysis and mining, network sampling, recommender systems, social trust, machine learning, and reinforcement learning.

Mohammad Mehdi Daliri Khomami received the M.S. degree in Computer Engineering from the Department of Electrical and Computer Engineering at Qazvin Islamic Azad University. He is currently pursuing the Ph.D. degree in Computer Engineering at the Computer Engineering and Information Technology Department, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran. His research interests include learning automata, social network analysis, and optimization with applications to problems from graph theory.


Mohammad Reza Meybodi received the B.S. and M.S. degrees in Economics from Shahid Beheshti University, Iran, in 1973 and 1977, respectively. He received the M.S. and Ph.D. degrees in Computer Science from Oklahoma University, USA, in 1980 and 1983, respectively. He is currently a Full Professor in the Computer Engineering Department, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran. Prior to his current position, he worked from 1983 to 1985 as an Assistant Professor at Western Michigan University and from 1985 to 1991 as an Associate Professor at Ohio University, USA. His current research interests include learning systems, cloud computing, soft computing, and social networks.

Abstract

Online social networks such as Facebook, Twitter, Instagram, and LinkedIn have provided an appropriate platform for people to interact with each other and disseminate different types of information. Analyzing these networks is therefore increasingly important, both for discovering the behavior patterns of interactions among individuals and the evolution of networks over time, and for developing the algorithms required for meaningful analysis. Due to the uncertain, dynamic, and time-varying nature of social interactions in online social networks, especially in activity and interaction networks, some network properties such as network centralities, trust values, diffusion probabilities, and user influences change dynamically over time, which makes it difficult to capture the structural and dynamical properties of the network. To deal with this problem, several studies based on learning systems have been presented in the literature to reflect the dynamic behavior of social networks over time. In recent years, the learning automaton (LA), as a promising intelligent technique, has provided potential solutions for many real network problems, with the advantage of being able to work in unknown, uncertain, complex, and dynamic environments. This book surveys recent developments in problems of social networks addressed by learning automata theory, related to network measures, network sampling, stochastic networks, stochastic graphs, community detection, link prediction, trust management, recommender systems, influence maximization, and their applications.


Chapter 1

Introduction to Learning Automata Models

1.1 Introduction

Psychologists consider any systematic change in the performance of a system directed toward a specific goal as learning (Fu 1970). Learning is defined as any relatively permanent change in behavior resulting from past experience, and a learning system is characterized by its ability to improve its behavior with time, in some sense tending towards an ultimate goal (Narendra and Thathachar 1974). The concept of learning makes it possible to design systems that gradually improve their performance during actual operation through a process of learning from past experience. Every learning task consists of two parts: a learning system and an environment. The learning system must learn to act in the environment. Learning tasks can be classified into the following three main models (Haykin 1994).

• Supervised Learning. In supervised learning, there is an external teacher that has knowledge of the environment, represented by input-output examples. The environment gives an input to both the learning system and the teacher. The teacher and the learning system generate corresponding responses, where the response of the teacher is always correct. An error signal is then obtained by comparing the responses of the teacher and the learning system, and is used by the learning system to adapt its parameters and improve its performance. In supervised learning, the system usually stops learning as soon as the training process is terminated, which is a disadvantage of this model because it cannot adapt in nonstationary environments. In recent years, supervised learning problems have been generalized in several ways, such as semi-supervised learning (working with an incomplete training set) and active learning (where only a limited, budget-constrained set of labeled instances is available).

• Unsupervised Learning. In unsupervised learning, or self-organizing learning, there is no external teacher to oversee the learning process. In other words, there is no specific task-independent measure of the quality of the representation that the learning system is required to learn. In this model, the learning system does not receive any additional information from the outside environment during the learning.


In unsupervised learning, the learning system learns the statistical properties of the input data. It is known that the rate of learning in unsupervised learning is relatively slower than in supervised learning (Haykin 1994).

• Reinforcement Learning. Reinforcement learning is learning from positive and negative rewards (Sutton and Barto 1998). In standard reinforcement learning, a learning system is connected to its environment via perception and action. At each instant, the learning system receives an input, which is some indication of the current state of the environment, and then chooses an action to generate as output. This action changes the state of the environment, and the new state is communicated to the learning system together with a scalar reinforcement signal. The reinforcement signal shapes the behavior of the learning system, which can learn to choose the best action over time by systematic trial and error. Reinforcement learning tasks can be divided naturally into two types: sequential and non-sequential tasks. In non-sequential tasks, the objective is to learn a mapping from situations to actions that maximizes the expected immediate payoff; non-sequential reinforcement learning has been studied extensively in the field of learning automata (Narendra and Thathachar 1989). In sequential tasks, the objective is to maximize the expected long-term payoff. Sequential tasks are more difficult, because the selected action may influence future situations and thus future payoffs; they are studied in reinforcement learning techniques based on dynamic programming, approximations of dynamic programming, and so on (Kaelbling et al. 1996).

The main focus of this chapter is to provide an overview of learning automata models. The chapter is organized into three parts: we first introduce single learning automata models, then give a brief description of interconnected learning automata, and finally summarize recent successful applications of learning automata.

1.2 Learning Automata

Reinforcement learning, or learning with a critic, is defined by characterizing a learning problem instead of characterizing the learning methods (Sutton and Barto 1998). Any method that is well suited to solving that problem is considered a reinforcement learning method. The reinforcement learning problem is the problem of learning from interactions in order to achieve a certain goal. Under this specification, there must be an agent capable of learning, called the learner or decision-maker. The learner must interact with a so-called environment or teacher, comprising everything outside the learner, which provides evaluative responses to the actions performed by the learner. The learner and the environment interact continually: the learner selects actions and the environment responds to them, presenting new situations to the learner. In short, reinforcement learning is a framework for learning problems in which the environment does not indicate the correct action, but provides only a scalar evaluative response to the selection of an action by the learner.

A learning automaton (LA) is an adaptive self-organizing decision-making unit that improves its performance through repeated interaction with a random environment. A learning automaton learns how to choose the best action based on the information (response) it receives from the environment. The interaction between an automaton and its environment is shown in Fig. 1.1. The nth iteration begins when the automaton receives an input vector x(n), called the context vector, selected from the set X ⊆ ℝ^m by the environment, where m denotes the dimension of the context vector. Depending on the context vector x(n), the automaton chooses one of its possible actions (e.g., action α(n) ∈ α) and applies it to the random environment. The random environment evaluates the selected action α(n) in the context of input x(n) and emits a reinforcement signal β(n) ∈ β. To update its state, the automaton uses a learning algorithm T, the context vector x(n), the action α(n), and the reinforcement signal β(n). By repeating this process, the learning automaton learns how to select the optimal action. Such an automaton, which improves its performance based on the environment's reactions, is called a learning automaton [1]. The following describes the definitions of the random environment and the automaton.

Fig. 1.1 The interaction of a learning automaton and its environment (the automaton applies action α to the random environment and receives the reinforcement signal β)

1.2.1 Environment

The environment can be mathematically modeled by a quintuple ⟨X, α, β, F, q⟩, where X denotes the set of context vectors, α denotes the set of inputs, β represents the set of values that can be taken by the reinforcement signal, F = {f(α, x) | α ∈ α, x ∈ X} denotes the set of probability distributions defined over β, and q represents the probability distribution defined over X, which is assumed to be unknown. α(n) and β(n) denote the input and output of the environment at discrete time n (n = 0, 1, 2, ...), respectively. It should be noted that F is initially unknown, and the optimal action α for each context vector x is found gradually through interaction with the environment.

Random environments are classified based on the probability distribution function F and the reinforcement signal β. Based on the nature of F, environments can be classified as stationary or nonstationary: in stationary environments F is fixed over time, while in nonstationary environments it varies with time. Based on the nature of the reinforcement signal β, random environments are classified into three classes: P-, Q-, and S-models. The output of a P-model environment takes one of two values, success or failure; in P-model environments, F is usually referred to as the penalty probabilities. In Q-model environments, the output of the environment β(n) takes a finite number of values in the interval [0, 1], while in S-model environments the output of the environment falls into the interval [a, b].

1.2.2 Automaton

An automaton can be represented by a tuple ⟨X, α, β, Φ, F, G⟩, where X denotes the set of context vectors, β represents the set of values that can be taken by the reinforcement signal, Φ denotes the set of internal states, α is the set of actions, F : X × Φ × β → Φ denotes the state transition function, under which the state of the automaton at instant n + 1 is determined on the basis of its previous state and inputs at instant n, and G : X × Φ → α represents the output function, by which the output of the automaton at instant n + 1 is calculated based on its current state and input.

The automata approach to learning can be considered as the determination of an optimal action from a set of actions. A learning automaton can be regarded as an abstract object that has a finite number of actions. It selects an action from its finite set of actions and applies it to an environment. The environment evaluates the applied action and sends a reinforcement signal to the learning automaton, as depicted in Fig. 1.1. The reinforcement signal provided by the environment is used to update the internal state of the learning automaton. By continuing this process, the learning automaton gradually learns to select the optimal action, which leads to favorable responses from the environment. A simple pseudo-code for the behavior of an r-action learning automaton within a stationary environment with β ∈ {0, 1} is presented in Fig. 1.2.

Fig. 1.2 Pseudo-code of a learning automaton (LA)

Learning automata have several good features which make them suitable for use in many applications. The main features of learning automata are stated below.

1. Learning automata can be used without any a priori information about the underlying application.
2. Learning automata are very useful for applications with a large amount of uncertainty.
3. Unlike traditional hill-climbing algorithms, hill-climbing in learning automata is done in the expected sense in a probability space.
4. Learning automata require very little and simple feedback from their environment.
5. Learning automata are very useful in multi-agent and distributed systems with limited intercommunication and incomplete information.
6. Learning automata are very simple in structure and can be implemented easily in software or hardware.
7. The action set of learning automata can be a set of symbolic or numeric values.
8. Optimization algorithms based on learning automata do not need the objective function to be an analytical function of adjustable parameters.
9. Learning automata can be analyzed by powerful mathematical methodologies. It has been shown that learning automata are optimal in single, hierarchical, and distributed structures.
10. Learning automata require few mathematical operations at each iteration, so they can be used in real-time applications.
11. Learning automata have the flexibility and analytical tractability needed for most applications.

Stochastic learning automata can be categorized into two main classes, namely finite action-set learning automata (FALA) and continuous action-set learning automata (CALA) (Thathachar and Sastry 2003). A learning automaton is called FALA if it has a finite set of actions, and CALA otherwise. For an FALA with r actions, the action probability distribution is an r-dimensional probability vector.


FALA can be categorized into two main families: variable-structure learning automata (VSLA), in which the transition and output functions vary in time, and fixed-structure learning automata (FSLA), in which they do not (Narendra and Thathachar 1989). In many applications there is a need to learn a real-valued parameter; in this situation, the actions of the automaton can be possible values of that parameter. To use an FALA for such an optimization problem, we have to discretize the value space of the parameter to obtain a finite number of actions. However, a fine discretization increases the number of actions, which in turn decreases the convergence speed of the automaton. A natural solution to this problem is to employ an automaton with a continuous space of actions; such a model of LA is called a continuous action-set learning automaton (CALA). In a finite action-set learning automaton (FALA), learning algorithms can update the action probability vector in discrete or continuous steps: the former are called discretized learning algorithms, while the latter are called continuous learning algorithms. The learning algorithms can also be divided into two groups, non-estimator and estimator learning algorithms, which are briefly described in the following subsections.
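To make the interaction loop of Fig. 1.2 concrete, the following minimal Python sketch (not code from the book) pairs an r-action FALA with a stationary P-model environment; the penalty probabilities and the linear reward-inaction update it uses (introduced formally in Sect. 1.2.4) are assumptions chosen purely for illustration.

```python
import random

class PModelEnvironment:
    """Stationary P-model environment: returns beta = 1 (penalty) with probability c_i."""
    def __init__(self, penalty_probs):
        self.c = penalty_probs

    def response(self, action):
        return 1 if random.random() < self.c[action] else 0

class FALA:
    """Finite action-set LA using a linear reward-inaction (L_R-I) update."""
    def __init__(self, r, a=0.05):
        self.p = [1.0 / r] * r          # start from the pure-chance distribution
        self.a = a

    def select_action(self):
        return random.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, i, beta):
        if beta == 0:                   # reward: shift probability mass toward action i
            self.p = [pj + self.a * (1 - pj) if j == i else (1 - self.a) * pj
                      for j, pj in enumerate(self.p)]
        # beta == 1: inaction, the probability vector is left unchanged

env = PModelEnvironment([0.7, 0.2, 0.9])   # hypothetical penalty probabilities
la = FALA(r=3)
for n in range(5000):
    i = la.select_action()
    la.update(i, env.response(i))
print(la.p)   # the probability of the action with the lowest penalty approaches 1
```

After enough iterations, the probability of the action with the smallest penalty probability approaches one, which is exactly the ε-optimal behavior discussed later in this section.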

1.2.3 Fixed-Structure Learning Automata (FSLA)

A learning automaton is called fixed-structure if either the probability of the transition from one state to another or the action probability of any action in any state is fixed. An FSLA is a quintuple ⟨α, β, Φ, F, G⟩, where α = {α_1, α_2, ..., α_r} is the set of actions that the automaton chooses from; β = {0, 1} is the automaton's set of inputs, where it receives a penalty if β = 1 and a reward otherwise; Φ = {φ_1, φ_2, ..., φ_{rN}} is its set of states, where N is called the depth of memory of the automaton; F : Φ × β → Φ describes the transition of the state of the automaton on receiving an input from the environment (F can be stochastic); and G : Φ → α is the output function of the automaton. The action taken by the automaton is determined according to its current state: the automaton selects action α_i if it is in any of the states φ_{(i−1)N+1}, φ_{(i−1)N+2}, ..., φ_{iN}. The state φ_{(i−1)N+1} is considered the most internal state and φ_{iN} the boundary state of action α_i, indicating that the automaton has the most and the least certainty in performing action α_i, respectively. The action chosen by the automaton is applied to the environment, which in turn emits a reinforcement signal β. On the basis of the received signal β, the state of the automaton is updated and the new action is chosen according to the functions F and G, respectively. There exist different types of FSLA based on the state transition function F and the output function G; L2N,2, G2N,2, and Krinsky are some of the most important FSLA types. We introduce these automata in the following paragraphs.

Fig. 1.3 The state transition graph for the L2N,2 automaton

1.2.3.1 The Tsetlin Automaton (L2N,2)

The Tsetlin automaton (Tsetlin 1962) is an extension of the simpler L2,2 automaton. This automaton has 2N states, denoted by φ_1, φ_2, ..., φ_{2N}, and two actions α_1 and α_2. These states keep track of the automaton's prior behavior and its received feedback. The automaton chooses an action based on the current state it resides in: action α_1 is chosen if the automaton is in one of the states φ_1, φ_2, ..., φ_N, while action α_2 is chosen if the current state is one of φ_{N+1}, φ_{N+2}, ..., φ_{2N}. The automaton moves towards its most internal state whenever it receives a reward and, conversely, on receiving a penalty, it moves towards its boundary state or to the boundary state of the other action. For instance, if the automaton is in state φ_N, it moves to state φ_{N−1} when rewarded and to state φ_{2N} in the case of punishment. However, if the current state is either φ_1 or φ_{N+1}, the automaton remains in that state as long as it keeps being rewarded. The state transition graph of the L2N,2 automaton is depicted in Fig. 1.3.

As mentioned before, the state transition function F can be a stochastic function (Tsetlin 1962). In such a case, on receiving a signal from the environment, the transition of the automaton among its states is not deterministic. For instance, when an action results in a reward, the automaton may move one state towards the boundary state with probability γ_1 ∈ [0, 1) and one state towards its most internal state with probability 1 − γ_1, and reverse this procedure using probabilities γ_2 and 1 − γ_2 when the response of the environment is a penalty. In situations where rewarding a favorable action is preferable to penalizing an unfavorable one, one can set γ_1 = 0 and γ_2 ∈ [0, 1). With these settings, the state transitions of the L2N,2 automaton become deterministic when the automaton receives a reward (β = 0), as shown in Fig. 1.3, but in the case of punishment the automaton transits among its states stochastically, as shown in Fig. 1.4. By considering γ_1 = 0, the automaton is expedient for all values of γ_2 in the interval [0, 1) (Thathachar and Sastry 2002).
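As a sketch of the deterministic transition rule just described (an illustration, not code from the book), the L2N,2 automaton reduces to two small functions; states are numbered 1, ..., 2N, with states 1, ..., N mapped to α_1 and states N+1, ..., 2N to α_2:

```python
def l2n2_action(state, N):
    """States 1..N select alpha_1 (returned as 0); states N+1..2N select alpha_2 (1)."""
    return 0 if state <= N else 1

def l2n2_next_state(state, beta, N):
    """Deterministic Tsetlin (L_2N,2) transition: one step inward on reward (beta = 0),
    one step outward on penalty (beta = 1), crossing to the other action's
    boundary state when penalized in a boundary state."""
    if beta == 0:                      # reward: toward the most internal state (1 or N+1)
        return state if state in (1, N + 1) else state - 1
    if state == N:                     # penalty in a boundary state: switch action
        return 2 * N
    if state == 2 * N:
        return N
    return state + 1                   # penalty elsewhere: toward the boundary state
```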


Fig. 1.4 The state transition graph for L2N,2 in case of punishment

Fig. 1.5 The state transition graph for G2N,2 automaton

Fig. 1.6 The state transition graph for G2N,2 in case of punishment

1.2.3.2 The TsetlinG Automaton (G2N,2)

The TsetlinG automaton (Tsetlin 1962) is another type of FSLA. This automaton behaves exactly like the L2N,2 automaton, except when it is being punished, where the automaton moves from state φ_N to φ_{N+1} and from state φ_{2N} to state φ_1. The state transition graph of the G2N,2 automaton is illustrated in Fig. 1.5. When the state transition function of the G2N,2 automaton is stochastic, with γ_1 = 0 and γ_2 ∈ [0, 1), the automaton transits among its states on receiving a reward as shown in Fig. 1.5 and, in the case of punishment, as shown in Fig. 1.6.

1.2.3.3 The Krinsky Automaton

In this subsection we briefly describe another type of fixed-structure learning automaton, namely the Krinsky automaton (Tsetlin 1962). When the action chosen by this automaton results in a penalty, it acts exactly like the L2N,2 automaton. However,


Fig. 1.7 The state transition graph for Krinsky automaton

in situations where the automaton is rewarded, any of the states φ_1, φ_2, ..., φ_N pass to state φ_1 and any of the states φ_{N+1}, φ_{N+2}, ..., φ_{2N} pass to state φ_{N+1}. Therefore, N successive penalties are needed for the automaton to switch from its current action to the other one. The state transition graph of this automaton is shown in Fig. 1.7. In the case of a stochastic state transition function with γ_1 = 0 and γ_2 ∈ [0, 1), the automaton behaves deterministically when it is rewarded, as before (see Fig. 1.7), and on receiving a penalty the behavior of the Krinsky automaton is identical to that of the L2N,2 automaton, as shown in Fig. 1.4.

1.2.4 Variable-Structure Learning Automata (VSLA)

VSLA can be represented by a quadruple ⟨α, β, p, T⟩, where α = {α_1, α_2, ..., α_r} indicates the action set from which the automaton chooses, β = {β_1, β_2, ..., β_k} indicates the set of inputs to the automaton, p = {p_1, p_2, ..., p_r} indicates the action probability vector, such that p_i is the probability of selecting action α_i, and T indicates the learning algorithm used to update the action probability vector in terms of the environment's response, i.e., p(t+1) = T[α(t), β(t), p(t)], where the inputs are the chosen action α(t), the response of the environment β(t), and the action probability vector p(t) at time t. Let α_i(t) be the action selected by the automaton at time t. The action probability vector p(t) is updated as given in Eq. (1.1) if the environment's response is a reward, and according to Eq. (1.2) if the response is a penalty:

$$p_j(t+1) = \begin{cases} p_j(t) + a\,[1 - p_j(t)] & j = i \\ (1-a)\,p_j(t) & \forall j \neq i \end{cases} \qquad (1.1)$$

$$p_j(t+1) = \begin{cases} (1-b)\,p_j(t) & j = i \\ \dfrac{b}{r-1} + (1-b)\,p_j(t) & \forall j \neq i \end{cases} \qquad (1.2)$$
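The updates (1.1) and (1.2) translate directly into code; the following sketch is one possible transcription. Setting b = 0 recovers L_{R−I}, and b = a the symmetric L_{R−P} scheme described below:

```python
def vsla_update(p, i, beta, a, b):
    """Update action probability vector p after action i received response beta
    (beta = 0: reward, Eq. 1.1; beta = 1: penalty, Eq. 1.2)."""
    r = len(p)
    if beta == 0:                                        # Eq. (1.1)
        return [pj + a * (1 - pj) if j == i else (1 - a) * pj
                for j, pj in enumerate(p)]
    return [(1 - b) * pj if j == i                       # Eq. (1.2)
            else b / (r - 1) + (1 - b) * pj
            for j, pj in enumerate(p)]

p = vsla_update([0.25, 0.25, 0.25, 0.25], i=2, beta=0, a=0.1, b=0.01)
```

Note that both branches preserve Σ_j p_j = 1, since the probability mass removed from (or added to) the chosen action is exactly redistributed over the remaining r − 1 actions.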


where r denotes the number of actions that can be taken by the automaton, and a and b are the reward and penalty parameters, which determine the amount of increase and decrease of the action probabilities, respectively. If a = b, the learning algorithm is a linear reward–penalty (L_{R−P}) algorithm; if a ≫ b, it is a linear reward–ε-penalty (L_{R−εP}) algorithm; and finally, if b = 0, it is a linear reward–inaction (L_{R−I}) algorithm, in which the action probability vector remains unchanged when the taken action is penalized by the environment.

The reward and penalty parameters a and b influence the speed of convergence as well as how closely the automaton approaches optimal behavior (the convergence accuracy) (Thathachar and Sastry 2002). If a is too small, the learning process is too slow; on the contrary, if a is too large, the increments in the action probabilities are too high and the automaton's accuracy in perceiving the optimal behavior becomes low. By choosing the parameter a sufficiently small, the probability of convergence to the optimal behavior can be made as close to 1 as desired (Thathachar and Sastry 2002). In the L_{R−εP} learning algorithm, the penalty parameter b is considered small in comparison with the reward parameter a (b = εa, where 0 < ε ≪ 1). In this algorithm, the action probability distribution p(t) of the automaton converges in distribution to a random variable p* which can be made as close to the optimal vector as desired by choosing ε sufficiently small (Thathachar and Ramachandran 1984).

In order to investigate the learning ability of a learning automaton, a pure-chance automaton that always selects its available actions with equal probabilities is used as the standard for comparison (Thathachar and Sastry 2002). Any automaton that is said to learn must perform at least better than such a pure-chance automaton. For this comparison, one measure is the average penalty for a given action probability vector. For a stationary random environment with the penalty probability vector c = {c_1, c_2, ..., c_r}, the average penalty probability M(t) received by an automaton is

$$M(t) = E[\beta(t) \mid p(t)] = \sum_{\alpha_i \in \alpha} c_i\, p_i(t) \qquad (1.3)$$

For a pure-chance automaton, M(t) is a constant defined as

$$M(0) = \frac{1}{r} \sum_{\alpha_i \in \alpha} c_i \qquad (1.4)$$

An automaton that does better than pure chance must have an average penalty M(t) less than M(0), at least asymptotically as t → ∞. Since p(t), and consequently M(t), are in general random variables, the expected value E[M(t)] is compared with M(0).

Definition 1.1 A learning automaton interacting with a P-, Q-, or S-model environment is said to be expedient if

$$\lim_{t \to \infty} E[M(t)] < M(0) \qquad (1.5)$$


Expediency means that the average penalty probability decreases when the automaton updates its action probabilities. It would be more interesting to determine an updating procedure under which E[M(t)] attains its minimum value; in such a case, the automaton is called optimal.

Definition 1.2 A learning automaton interacting with a P-, Q-, or S-model environment is said to be optimal if

$$\lim_{t \to \infty} E[M(t)] = c \qquad (1.6)$$

where c = min_i c_i. Optimality implies that asymptotically the action with the minimum penalty probability c is chosen with probability one. While optimality is a very desirable property in stationary environments, it may not be possible to achieve it in a given situation. In such a case, one might aim at a suboptimal performance, which is represented by ε-optimality.

Definition 1.3 A learning automaton interacting with a P-, Q-, or S-model environment is said to be ε-optimal if the following can be obtained for any ε > 0 by a proper choice of the parameters of the learning automaton:

$$\lim_{t \to \infty} E[M(t)] < c + \varepsilon \qquad (1.7)$$

ε-optimality implies that the performance of the automaton can be made as close to the optimal as desired.
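As a concrete illustration (with invented numbers): in a two-action P-model environment with penalty probabilities c = {0.2, 0.8}, a pure-chance automaton has M(0) = (0.2 + 0.8)/2 = 0.5 by Eq. (1.4). Any expedient automaton must therefore achieve lim_{t→∞} E[M(t)] < 0.5, an optimal one would attain c = min{0.2, 0.8} = 0.2, and an ε-optimal one can be tuned (e.g., by shrinking the learning parameter) to come within any desired ε of 0.2.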

1.2.5 Variable Action Set Learning Automata

A variable action set learning automaton (also known as a learning automaton with a variable number of actions) is an automaton in which the number of available actions varies over time (Thathachar and Harita 1987). Such a learning automaton has a finite set of r actions, α = {α_1, α_2, ..., α_r}. At each instant t, an action subset α̂(t) ⊆ α is available for the learning automaton to choose from; the elements of α̂(t) are selected at random by an external agency. The procedure of choosing an action and updating the action probability vector is as follows. Let K(t) = Σ_{α_i ∈ α̂(t)} p_i(t) denote the sum of the probabilities of the available actions in the subset α̂(t). Before choosing an action, the probability vector of the available actions is scaled as

$$\hat{p}_i(t) = \frac{p_i(t)}{K(t)}, \quad \forall \alpha_i \in \hat{\alpha}(t) \qquad (1.8)$$


Fig. 1.8 Pseudo-code of the behavior of a variable action-set learning automaton

Then, the automaton randomly chooses one of its available actions according to the scaled action probability vector p̂(t). Depending on the reinforcement signal received from the environment, the automaton updates the vector p̂(t). Finally, the probability vector of the available actions is rescaled according to Eq. (1.9). The ε-optimality of this type of LA has been proved in (Thathachar and Harita 1987).

$$p_i(t+1) = \hat{p}_i(t+1) \cdot K(t), \quad \forall \alpha_i \in \hat{\alpha}(t) \qquad (1.9)$$

The pseudo-code of the behavior of a variable action set learning automaton is shown in Fig. 1.8.
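One possible sketch of a single interaction step from Fig. 1.8, combining the scaling (1.8), an update on the scaled vector, and the rescaling (1.9); the action-selection and update rules are passed in as functions (for instance the vsla_update above), and all names here are illustrative:

```python
def variable_action_set_step(p, available, select, update, respond):
    """One iteration of a variable action-set LA.
    p: full action probability vector; available: indices offered at this instant;
    select(p_hat) / update(p_hat, i, beta): choice and update over the scaled vector;
    respond(i): environment response for the chosen action."""
    K = sum(p[j] for j in available)               # Eq. (1.8): scaling constant
    p_hat = {j: p[j] / K for j in available}
    i = select(p_hat)                              # choose among the available actions
    beta = respond(i)
    p_hat = update(p_hat, i, beta)                 # update the scaled probabilities
    p = list(p)
    for j in available:                            # Eq. (1.9): rescale back
        p[j] = p_hat[j] * K
    return p
```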


1.2.6 Continuous Action-Set Learning Automata (CALA)

In a continuous action-set learning automaton (CALA), the action set is a continuous interval over the real numbers; that is, the automaton chooses its actions from the real line. In such a learning automaton, the action probability of the possible actions is defined by a probability distribution function. All actions are initially selected with the same probability, i.e., the initial distribution is uniform, and the probability distribution function is updated depending on the responses received from the environment.

A continuous action-set learning automaton (Thathachar and Sastry 2003) is an automaton whose action set is the real line and whose action probability distribution is a normal distribution with mean μ(t) and standard deviation σ(t). At each time instant t, the CALA chooses a real number α at random based on the current action probability distribution N(μ(t), σ(t)). The two actions α(t) and μ(t) serve as inputs to the random environment, and the CALA receives the reinforcement signals β_{α(t)} and β_{μ(t)} from the environment for both actions. Finally, μ(t) and σ(t) are updated as

$$\mu(t+1) = \mu(t) + \lambda \, \frac{\beta_{\alpha(t)} - \beta_{\mu(t)}}{\phi(\sigma(t))} \, \frac{\alpha(t) - \mu(t)}{\phi(\sigma(t))} \qquad (1.10)$$

$$\sigma(t+1) = \sigma(t) + \lambda \, \frac{\beta_{\alpha(t)} - \beta_{\mu(t)}}{\phi(\sigma(t))} \left[ \left( \frac{\alpha(t) - \mu(t)}{\phi(\sigma(t))} \right)^{2} - 1 \right] - \lambda K (\sigma(t) - \sigma_L) \qquad (1.11)$$

where

$$\phi(\sigma(t)) = \begin{cases} \sigma_L & \text{for } \sigma(t) \le \sigma_L \\ \sigma(t) & \text{for } \sigma(t) > \sigma_L \end{cases} \qquad (1.12)$$

and 0 < λ < 1 denotes the learning parameter, K > 0 is a large positive constant controlling the shrinking of σ(t), and σ_L is a sufficiently small lower bound on σ(t). Since the update given for σ(t) does not automatically ensure that σ(t) ≥ σ_L, the function φ provides a projected version of σ(t), denoted by φ(σ(t)). The interaction with the environment continues until μ(t) no longer changes noticeably and σ(t) converges close to σ_L. The objective of CALA is to learn the value of α for which E[β_{α(t)}] attains a minimum; that is, the objective is to make N(μ(t), σ(t)) converge to N(α*, 0), where α* is a minimum of E[β_{α(t)}]. However, we cannot let σ(t) converge to zero, since we want the asymptotic behaviour of the algorithm to be analytically tractable. Hence, the lower bound σ_L > 0 is used, and the objective of learning is kept as σ(t) converging to σ_L and μ(t) converging to α*. By choosing σ_L and λ sufficiently small and K sufficiently large, μ(t) of the CALA algorithm will be close to a minimum of E[β_{α(t)}] with probability close to unity after a long enough time (Thathachar and Sastry 2003).
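A minimal sketch of one CALA iteration under Eqs. (1.10)-(1.12); the constants are illustrative, and the noisy cost function (whose expectation is to be minimized) is supplied by the caller:

```python
import random

def cala_step(mu, sigma, cost, lam=0.01, K=5.0, sigma_L=1e-3):
    """One CALA iteration: sample an action, query the environment at alpha and mu,
    then update mu and sigma by Eqs. (1.10) and (1.11); phi implements Eq. (1.12)."""
    phi = max(sigma, sigma_L)                  # Eq. (1.12): projected standard deviation
    alpha = random.gauss(mu, phi)              # action sampled with the projected spread
    d_beta = (cost(alpha) - cost(mu)) / phi    # beta_alpha - beta_mu, normalized
    z = (alpha - mu) / phi
    mu_next = mu + lam * d_beta * z                            # Eq. (1.10)
    sigma_next = sigma + lam * d_beta * (z * z - 1) \
                 - lam * K * (sigma - sigma_L)                 # Eq. (1.11)
    return mu_next, sigma_next

mu, sigma = 0.0, 1.0
for t in range(20000):                         # minimize E[(x - 2)^2] under noise
    mu, sigma = cala_step(mu, sigma, lambda x: (x - 2) ** 2 + random.gauss(0, 0.1))
print(mu, sigma)                               # mu drifts toward 2, sigma toward sigma_L
```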

1.2.7 Non-estimator Learning Algorithms

In non-estimator learning algorithms, the current value of the reinforcement signal is the only information used to update the action probability vector. Let us assume a finite action-set learning automaton (FALA) with r actions operating in a stationary P-model environment; α_i (for i = 1, 2, ..., r) denotes the action taken by this automaton at instant n (i.e., α(n) = α_i), and β ∈ {0, 1} is the response of the environment to this action. The following general learning algorithm for updating the action probability vector was proposed in (Aso and Kimura 1979):

$$p_j(n+1) = \begin{cases} p_j(n) - g_{ij}[p(n)] & \text{if } \beta(n) = 0 \\ p_j(n) + h_{ij}[p(n)] & \text{if } \beta(n) = 1 \end{cases} \qquad (1.13)$$

for all j ≠ i. For preserving the probability measure, we must have Σ_{j=1}^r p_j(n) = 1, so we obtain

$$p_i(n+1) = \begin{cases} p_i(n) - g_{ii}[p(n)] & \text{if } \beta(n) = 0 \\ p_i(n) - h_{ii}[p(n)] & \text{if } \beta(n) = 1 \end{cases} \qquad (1.14)$$

where g_{ii}(p) = −Σ_{j=1, j≠i}^r g_{ij}[p(n)] and h_{ii}(p) = Σ_{j=1, j≠i}^r h_{ij}[p(n)]. The functions g_{ij} and h_{ij} (for i, j = 1, 2, ..., r) are called the reward and penalty functions, respectively. They are arbitrary, continuous, and nonnegative, with properties such as 0 < g_{ij}[p(n)] < p_j(n), together with analogous bounds on h_{ij}, so that every component of p(n + 1) remains a valid probability.
and the average penalty is defined as

1 ci

1 j1 c j

i  1, 2, . . . , r,

(1.23)

1.2 Learning Automata

17

lim E[M(n)]  r

n→∞

r

1 j1 c j

< M0

(1.24)

The expediency of the L_R−P algorithm can be concluded for all initial action probability vectors and in all stationary environments. In Narendra and Thathachar (1989), it has been shown that the L_R−P algorithm has no absorbing states and its Markov process is ergodic. In the L_R−P algorithm, the limiting action probability vector converges to a distribution with known parameters. For instance, for a two-action automaton, using the distance-diminishing operators (Norman 1972), it can be shown that the limiting distribution of p_i (for i = 1, 2) is normal with the following mean and variance:

E[p_i(∞)] = c_j / (c_1 + c_2),  for j ≠ i   (1.25)

Var[p_i(∞)] = c_1 c_2 (1 − a) / ( (c_1 + c_2)² [ (1 − a) + 2a(c_1 + c_2) ] )   (1.26)

The discrete version of the L_R−P algorithm (DL_R−P) with two actions has (N + 1) states {φ_0, φ_1, ..., φ_N} (Oommen 1986). When the automaton is in state φ_k, it chooses actions α_1 and α_2 with probabilities k/N and 1 − k/N, respectively. The DL_R−P algorithm updates its action probability vector using the following rule:

p_1(n + 1) = p_1(n) + 1/N  if (α(n) = α_1 and β(n) = 0) or (α(n) = α_2 and β(n) = 1)
p_1(n + 1) = p_1(n) − 1/N  if (α(n) = α_1 and β(n) = 1) or (α(n) = α_2 and β(n) = 0)   (1.27)

when p_1(n) ∉ {0, 1}, and

p_1(n + 1) = p_1(n)  if β(n) = 0
p_1(n + 1) = 1/N  if p_1(n) = 0 and β(n) = 1
p_1(n + 1) = 1 − 1/N  if p_1(n) = 1 and β(n) = 1   (1.28)

when p_1(n) ∈ {0, 1}. The ergodicity and ε-optimality of a two-action DL_R−P can be shown in all stationary environments when c_l < 0.5. In (Oommen and Christensen 1988), a two-action discretized linear reward-penalty algorithm called modified discretized linear reward-penalty (MDL_R−P) was proposed that is ergodic and ε-optimal, but non-absorbing, in all stationary environments. In the same reference, a two-action linear reward-penalty algorithm with artificially created absorbing barriers, called absorbing discretized linear reward-penalty (ADL_R−P), was also introduced, in which the states φ_0 and φ_N of DL_R−P are absorbing states. The ε-optimality of ADL_R−P in all stationary environments has been shown in Oommen and Christensen (1988). The action probability vector of the Q-model and S-model L_R−P algorithms, denoted by SL_R−P, is updated as follows:

p_j(n + 1) = p_j(n) + aβ(n) [ 1/(r − 1) − p_j(n) ] − a[1 − β(n)] p_j(n),  j ≠ i
p_j(n + 1) = p_j(n) − aβ(n) p_j(n) + a[1 − β(n)] [1 − p_j(n)],  j = i   (1.29)
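For reference, the following is a minimal Python sketch of the classical P-model L_R−P update with separate reward and penalty parameters a and b; the penalty probabilities used in the demo are illustrative.

```python
import random


def lrp_update(p, i, beta, a=0.05, b=0.05):
    """One L_R-P step: action i was selected; beta = 0 reward, beta = 1 penalty."""
    r = len(p)
    if beta == 0:   # reward: move probability mass toward the selected action
        return [pj + a * (1 - pj) if j == i else (1 - a) * pj
                for j, pj in enumerate(p)]
    # penalty: move probability mass away from the selected action
    return [(1 - b) * pj if j == i else b / (r - 1) + (1 - b) * pj
            for j, pj in enumerate(p)]


# Demo in a stationary environment with illustrative penalty probabilities
c = [0.2, 0.5, 0.8]
p = [1 / 3] * 3
for _ in range(50000):
    i = random.choices(range(3), weights=p)[0]
    p = lrp_update(p, i, 1 if random.random() < c[i] else 0)
print(p)   # E[p_i] tends to (1/c_i) / sum_j (1/c_j), cf. Eq. (1.23)
```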

The expediency of the SL_R−P algorithm has been shown for all stationary environments. Furthermore, it can be seen that the limiting expectations of the action probabilities are inversely proportional to the corresponding penalty strengths. In Poznyak and Najim (1997), it has been shown that SL_R−P with a variable number of actions asymptotically generates the optimal pure strategy. The main property of the ergodic learning algorithms is that the final value of the action probability vector (the state of the automaton) is independent of the initial action probability vector. Although this property prevents the automaton from being trapped in an action, the main drawback of all these algorithms is that no a priori information can be used in the learning process.

In Oommen (1987), a linear reward-penalty algorithm called AL_R−P was proposed in which the learning process can be affected by a priori information. In this algorithm, the choice probability of the selected action always increases if the selected action is rewarded by the environment, but the choice probability of the penalized action does not necessarily decrease in all cases. In this automaton, q_1 and q_2 are chosen in such a way that their ratio equals the ratio of the a priori action probabilities. The ergodicity in mean of the AL_R−P automaton has been shown in Oommen (1987). AL_R−P updates the action probability vector by

p_1(n + 1) = a p_1(n)  if α(n) = α_2 and β(n) = 0
p_1(n + 1) = 1 − a p_2(n)  if α(n) = α_1 and β(n) = 0
p_1(n + 1) = a p_1(n) + q_1  if α(n) = α_2 and β(n) = 1
p_1(n + 1) = 1 − a p_2(n) − q_2  if α(n) = α_1 and β(n) = 1   (1.30)

Linear Reward-Inaction (L_R−I) Algorithm In this algorithm, which is called linear reward-inaction (L_R−I), the choice probability of the selected action increases and that of the other actions decreases if the selected action is rewarded by the random environment; the action probabilities remain unchanged if the selected action is penalized. That is, in the L_R−I algorithm, the action probability p_i(n) increases if action α_i is rewarded, and the other action probabilities p_j(n) (for j ≠ i) decrease in such a way that Σ_{k=1}^{r} p_k(n + 1) = 1; the action probability vector remains unchanged if action α_i is penalized. The L_R−I algorithm results from Eqs. (1.21) and (1.22) if the reward and penalty parameters are set as a ≠ 0 and b = 0; therefore, we have g_j(p) = a p_j(n) and h_j(p) = 0. The L_R−I algorithm is absolutely expedient and hence ε-optimal in all stationary environments. Due to the dependency of the lower and upper bounds of lim_{n→∞} p_l(n) on the reward parameter and the initial value of the action probability vector (Lakshmivarahan and Thathachar 1976b; Kaddour and Poznyak 1994), this algorithm converges to the optimal action with probability less than 1 if the reward parameter is constant (Viswanathan and Narendra 1972; Sawaragi and Baba 1973). Convergence to a non-optimal action is possible when the reward parameter is fixed, but the probability of converging to a non-optimal action can be made arbitrarily small by choosing the reward parameter small enough. The ε-optimality of the L_R−I algorithm under non-stationary environments with a fixed optimal action is shown in Sawaragi and Baba (1974).

A discretized version of the L_R−I algorithm, denoted by DL_R−I, has been proposed in Oommen and Hansen (1984). DL_R−I with two actions has (N + 1) internal states, where N is an even integer. DL_R−I chooses action α_1 with probability i/N and action α_2 with probability 1 − i/N when it is in state φ_i. DL_R−I updates its action probability vector as follows:

p_1(n + 1) = p_1(n) + 1/N  if α(n) = α_1 and β(n) = 0
p_1(n + 1) = p_1(n) − 1/N  if α(n) = α_2 and β(n) = 0
p_1(n + 1) = p_1(n)  if β(n) = 1   (1.31)

when p_1(n) ∈ (0, 1); p_1(n) remains unchanged when p_1(n) ∈ {0, 1}. In (Oommen and Hansen 1984), it has been shown that DL_R−I is absorbing and ε-optimal in all stationary environments. The main problem with DL_R−I is that it is a one-parameter algorithm; in other words, for a given N, E[p_1(∞)] and the convergence speed of the algorithm are fixed in any environment. The algorithm becomes more flexible when the action probabilities are updated by a nonlinear state function. Such a modified algorithm is called the discrete nonlinear reward-inaction (DN_R−I) algorithm. The optimality of DN_R−I in all stationary environments has been shown in Oommen (1986). The action probability vector of the L_R−I algorithm in Q- and S-model environments is updated as given in the following equation:

p_j(n + 1) = p_j(n) + a[1 − β(n)] Σ_{k≠i} p_k(n),  j = i
p_j(n + 1) = p_j(n) − a[1 − β(n)] p_j(n),  j ≠ i   (1.32)

This learning algorithm, referred to as SL_R−I, inherits many properties of the L_R−I algorithm, such as absolute expediency and ε-optimality (Baba 1983). In comparison with variable action-set learning automata, learning automata with a fixed action-set are much easier to analyze mathematically; therefore, variable action-set learning automata have not received the attention they deserve. However, in some applications, learning automata with a changing number of actions are needed. In Thathachar and Harita (1987), it has been shown that the L_R−I algorithm with a changing number of actions is both absolutely expedient and ε-optimal.
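A sketch of the reward-inaction rule with an illustrative three-action environment follows; per the discussion above, for a small reward parameter the automaton concentrates on the action with the smallest penalty probability.

```python
import random


def lri_update(p, i, beta, a=0.05):
    """L_R-I step: update only on reward (beta = 0); inaction on penalty."""
    if beta == 1:
        return p   # penalized: the probability vector is left unchanged
    return [pj + a * (1 - pj) if j == i else (1 - a) * pj
            for j, pj in enumerate(p)]


# Demo with illustrative penalty probabilities
c = [0.7, 0.3, 0.9]
p = [1 / 3] * 3
for _ in range(20000):
    i = random.choices(range(3), weights=p)[0]
    p = lri_update(p, i, 1 if random.random() < c[i] else 0)
print(p)   # the mass should concentrate on action 1 (c = 0.3)
```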


Linear Inaction-Penalty (L_I−P) Algorithm The linear inaction-penalty (L_I−P) algorithm is derived from the L_R−P algorithm by setting the reward parameter to a = 0 with b ≠ 0. From Eqs. (1.21) and (1.22), we conclude that g_j(p) and h_j(p) are set as g_j(p) = 0 and h_j(p) = b[1/(r − 1) − p_j(n)] for this algorithm. The penalty parameter b determines the decrease rate of the action choice probability when the selected action is penalized by the environment. In the linear inaction-penalty algorithm, as the name suggests, the action probability vector is not updated if the environment rewards the selected action. The probability of the optimal action converges to 1/(r − 1), and hence this algorithm is not ε-optimal in stationary environments except for r = 2.

The discrete version of this algorithm, denoted by DL_I−P, with two actions has (N + 1) states {φ_0, φ_1, ..., φ_N}, where N is an even integer (Oommen 1986). This automaton chooses its actions as DL_R−I does, and updates its action probability vector by the following equation:

p_1(n + 1) = p_1(n) + 1/N  if α(n) = α_2 and β(n) = 1
p_1(n + 1) = p_1(n) − 1/N  if α(n) = α_1 and β(n) = 1
p_1(n + 1) = p_1(n)  if β(n) = 0   (1.33)

when p_1(n) ∈ (0, 1); p_1(n) remains unchanged when p_1(n) ∈ {0, 1}. The ergodicity (non-absorbing behavior) and expediency of the DL_I−P algorithm in all stationary environments have been shown in Oommen (1986). The absorbing DL_I−P algorithm (ADL_I−P) has also been introduced in (Oommen 1986). The ADL_I−P algorithm is obtained by defining the states φ_0 and φ_N of DL_I−P as absorbing states; that is, in absorbing DL_I−P the automaton does not change its state when it arrives at state φ_0 or φ_N. In the ADL_I−P algorithm, the action probability vector is updated as in the DL_I−P algorithm. This version of the linear inaction-penalty algorithm is ε-optimal in all stationary environments (Oommen 1986).

Linear Reward-Epsilon-Penalty (L_R−εP) Algorithm The linear reward-epsilon-penalty (L_R−εP) algorithm is a modified version of L_R−P in which the penalty parameter is much smaller than the reward parameter (b = εa with 0 < ε ≪ 1). In other words, the L_R−εP algorithm is obtained by adding a small penalty parameter to the L_R−I algorithm. In spite of the small difference between L_R−εP and L_R−I, L_R−εP shows considerably different behavior. Although the L_R−I algorithm has absorbing states, there is a nonzero probability that the automaton gets trapped in a wrong action. A comprehensive analysis of the two-action L_R−εP algorithm, based on the results reported in Norman (1974), has been given in Lakshmivarahan (1981). In Kushner and Huang (1981), using weak convergence theorems, it is shown that the long-term changes in the action probability vector can be modeled by a Gauss-Markov diffusion (Thathachar and Ramachandran 1984) when the parameters a and b are small enough. Therefore, the Markov process describing the L_R−εP algorithm is ergodic and has no absorbing states. The action probability vector of the automaton converges to a neighborhood of the unit vector e_l when ε is selected small enough; this property establishes the ε-optimality of the algorithm (Thathachar and Ramachandran 1984). Ünsal considered all values of the reward and penalty parameters a and b and used the general stability theorem to find the convergence region, without the condition ε ≪ 1. The optimality of the L_R−εP algorithm for every 0 < a, b < 1 has been shown when c_l = 0.

Linear Reward-Reward (L_R−R) Algorithm The linear reward-reward algorithm (L_R−R) was proposed by Viswanathan (Christensen and Oommen 1990). L_R−R increases the probability of the selected action regardless of the response received from the random environment, even if the response is unfavorable. The growth of the choice probability proceeds in a systematic and methodical way such that the optimal action can eventually be found by the automaton. The L_R−R algorithm results from the AL_R−P algorithm if q_1 and q_2 are substituted by b p_1 and b p_2 in Eq. (1.30), respectively, where a, b ≥ 0 with 0 < a + b < 1. A two-action L_R−R algorithm can be obtained if g_j and h_j are substituted in Eqs. (1.13) and (1.14) as follows:

g_j[p(n)] = (1 − a) p_j(n)
h_j[p(n)] = −(1 − a − b) p_j(n)  (for j = 1, 2)   (1.34)

where a and b (0 < a + b < 1) are two positive constant parameters. In Christensen and Oommen (1990), the ε-optimality of the linear reward-reward algorithm in all stationary environments has been shown. It has also been shown that the L_R−R algorithm is a powerful algorithm for coping with stubborn learners (Christensen and Oommen 1990). It is shown in Oommen (1987) that L_R−I is superior to L_R−R; this is due to the fact that the single-step decrement in the average penalty for L_R−I is greater than that of L_R−R. In other words,

|ΔM(n)|_{L_R−I} ≥ |ΔM(n)|_{L_R−R}   (1.35)

1.2.7.2 Nonlinear Algorithms

Nonlinear algorithms comprise a wide variety of learning algorithms in which the rule under which the action probabilities are updated is nonlinear. Many studies have been conducted on linear algorithms, specifically the L_R−P and L_R−I algorithms, but nonlinear algorithms have not received the attention they merit. Two-action nonlinear learning algorithms have received more attention, since they are easier to design and evaluate. Due to the nonlinearity of the updating rules, nonlinear learning algorithms cannot easily be generalized to more actions (e.g., three actions or more). In the following subsections, a brief overview of the nonlinear algorithms is provided.

Nonlinear Algorithm I Narendra and Thathachar (1974) proposed the first nonlinear learning algorithm for a two-action learning automaton. In this algorithm, the updating rule is obtained by substituting the following g_j and h_j of Eq. (1.36) in Eqs. (1.13) and (1.14):

g_j[p(n)] = h_j[p(n)] = a p_j(n)[1 − p_j(n)],  j = 1, 2   (1.36)

This algorithm is ε-optimal in environments that satisfy either c_1 < 1/2 < c_2 or c_2 < 1/2 < c_1.
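Each of the nonlinear algorithms below differs only in the choice of g_j and h_j. The following Python sketch shows how such a pair plugs into the general scheme of Eqs. (1.13) and (1.14), using the g_j and h_j of nonlinear algorithm I; the sign convention follows the reconstruction of Eqs. (1.13)–(1.14) above.

```python
def general_update(p, i, beta, g, h):
    """General non-estimator scheme (Eqs. 1.13-1.14): i is the selected
    action, beta in {0, 1}, and g/h map (p, j) to reward/penalty terms."""
    r = len(p)
    q = list(p)
    if beta == 0:                                  # reward
        for j in range(r):
            if j != i:
                q[j] = p[j] - g(p, j)              # shrink the other actions
        q[i] = p[i] + sum(g(p, j) for j in range(r) if j != i)
    else:                                          # penalty
        for j in range(r):
            if j != i:
                q[j] = p[j] + h(p, j)              # grow the other actions
        q[i] = p[i] - sum(h(p, j) for j in range(r) if j != i)
    return q


# Nonlinear algorithm I: g_j = h_j = a p_j (1 - p_j), two actions (Eq. 1.36)
a = 0.1
g = h = lambda p, j: a * p[j] * (1 - p[j])
print(general_update([0.5, 0.5], 0, 0, g, h))   # reward for action 0
```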

Nonlinear Algorithm II (Square Law) The first nonlinear algorithm was then extended by Shapiro and Narendra to an automaton with r actions. In this algorithm, the updating rule is obtained by substituting the following g_j and h_j shown in Eq. (1.37) in Eqs. (1.13) and (1.14). The ε-optimality of this algorithm in environments wherein the penalty probabilities satisfy c_l < 1/r and c_j > 1/r for all j ≠ l was shown by Sawaragi and Baba (1974).

g_j[p(n)] = h_j[p(n)] = [a/(r − 1)] p_j(n)[1 − p_j(n)],  j = 1, 2, ..., r   (1.37)

Nonlinear Algorithm III The third nonlinear learning algorithm, in which g_j and h_j are defined as given in Eq. (1.38), was introduced by Chandrasekharan and Shen. In Eq. (1.38), the function φ(x) is defined as φ(x) = ax^m, where m = 2, 3, ... and 0 < a ≤ 1.

g_j[p(n)] = p_j(n) − φ[p_j(n)]
h_j[p(n)] = ( p_j(n) − φ[p_j(n)] ) / (r − 1),  j = 1, 2, ..., r and j ≠ i   (1.38)

where 0 ≤ φ[p_j(n)] ≤ p_j(n).

Nonlinear Algorithm IV Luce also proposed the following nonlinear algorithm in mathematical psychology for a two-action learning automaton. This algorithm can be described by the following equations:

g_j[p(n)] = p_j(n)[b − 1][1 − p_j(n)] / ( b[1 − p_j(n)] + p_j(n) )
h_j[p(n)] = p_j(n)[b − 1][1 − p_j(n)] / ( p_j(n)[b − 1] + 1 ),  j = 1, 2   (1.39)

where b > 1. It can be seen that this algorithm is a generalization of nonlinear algorithm III in which the function φ(x) is defined as φ(x) = x / ( b(1 − x) + x ), and so it can be generalized to an automaton with r actions.

Nonlinear Algorithm V The nonlinear learning algorithm proposed by Vorontsova applies to an automaton with two actions. The absolute expediency of this algorithm has been shown in (Narendra and Thathachar 1989). In this algorithm, the updating rules are defined as

g_j[p(n)] = a φ[p_1(n), 1 − p_1(n)] [1 − p_j(n)]^θ p_j(n)^θ
h_j[p(n)] = b φ[p_1(n), 1 − p_1(n)] [1 − p_j(n)]^θ p_j(n)^θ,  j = 1, 2   (1.40)

where φ[p_1(n), 1 − p_1(n)] = φ[1 − p_1(n), p_1(n)] is a symmetric nonlinear function and θ ≥ 1.

Nonlinear Algorithm VI Viswanathan and Narendra proposed a nonlinear learning algorithm for two-action learning automata which is a combination of nonlinear algorithm IV and algorithm L_R−I. This algorithm is described by the following equations:

g_j[p(n)] = p_j(n) [ a_1 + a_2 p_j(n)^θ (1 − p_j(n))^θ ]
h_j[p(n)] = b [1 − p_j(n)]^θ p_j(n)^{θ+1}   (1.41)

where j = 1, 2 and a_1, a_2, and b are positive constants. The absolute expediency of this algorithm has also been shown in Narendra and Thathachar (1989).

Nonlinear Algorithm VII The following equations show the nonlinear learning algorithm proposed for two-action learning automata by Lakshmivarahan and Thathachar (Christensen and Oommen 1990); Sawaragi and Baba (1974) showed the ε-optimality of this algorithm.

g_j[p(n)] = a [1 − p_j(n)] p_j²(n)
h_j[p(n)] = 0   (1.42)

where j = 1, 2.

Nonlinear Algorithm VIII Poznyak and Najim (1997) proposed a nonlinear algorithm in which the action probability vector is updated by Eq. (1.43). Some of the previously mentioned algorithms can be obtained from this algorithm; for example, the L_R−P algorithm results from it if Φ[x] = ax, for 0 ≤ a ≤ 1. In Poznyak and Najim (1997), the convergence results of the proposed algorithm are proven by applying martingale and Lyapunov theorems.

p_j(n + 1) = p_j(n) + Σ_{k=1, k≠j}^{r} Φ[ p_k(n) − β(n)/(r − 1) ],  j = i
p_j(n + 1) = p_j(n) − Φ[ p_j(n) − β(n)/(r − 1) ],  j ≠ i   (1.43)

where Φ[x] is a function of x.

Nonlinear Algorithm IX Meybodi and Lakshmivarahan (Meybodi and Lakshmivarahan 1982) also proposed another algorithm, in which the updating equations are defined as

g_ij[p(n)] = a [1 − p_i(n)]^N [1 − p_j(n)]^N p_j(n)   (1.44)

h_ij[p(n)] = b p_i^{M−1}(n) p_j^M(n) [1 − p_i(n)][1 − p_j(n)]   (1.45)

where i, j = 1, 2, ..., r and j ≠ i. M, N > 0 are two constants, and a and b (0 < a ≤ 1 and 0 < b ≤ 1) denote the reward and penalty parameters, respectively. This algorithm is strongly absolutely expedient and hence ε-optimal in all stationary environments.

Nonlinear Algorithm X (NEM) In Thathachar and Oommen (1983), Thathachar et al. introduced a nonlinear learning algorithm for two-action learning automata. They showed that the proposed algorithm is ergodic in the mean and hence expedient (Thathachar and Oommen 1983). The updating rules under which the proposed algorithm modifies the action probability vector are shown in the following equations:

g_j[p(n)] = p_j(n) − p_j(n) [ a + b p_j^k(n) (1 − p_j(n))^k ]   (1.46)

h_j[p(n)] = p_j(n) [ a + b p_j^k(n) (1 − p_j(n))^k ] − p_j(n) + μ   (1.47)

where j = 1, 2. The parameter μ controls the convergence rate of the algorithm, and b controls the variance of the limiting probability vector.

Nonlinear Algorithm XI Meybodi and Lakshmivarahan (Meybodi and Lakshmivarahan 1982) proposed a nonlinear learning algorithm which is strongly absolutely expedient and hence ε-optimal in all stationary environments. The updating rule of this algorithm is defined as follows:

g_ij[p(n)] = a [1 − p_i(n)][1 − p_j(n)] p_j(n)   (1.48)

h_ij[p(n)] = b p_i(n) p_j²(n) [1 − p_i(n)][1 − p_j(n)]   (1.49)

where i, j = 1, 2, ..., r and j ≠ i. a and b (0 < a ≤ 1 and 0 < b ≤ 1) denote the reward and penalty parameters, respectively.

1.2.7.3 Hybrid Algorithms

A hybrid learning algorithm is a combination of linear and nonlinear learning algorithms in which certain advantages can be achieved from the synergy of the combined algorithms. A well-designed combination of linear and nonlinear algorithms is useful for improving the convergence speed of an algorithm in many cases. The ε-optimality property of an algorithm results in slow convergence as the action probabilities approach their final values; on the other hand, the convergence speed of expedient algorithms is significantly faster than that of ε-optimal algorithms. Hence, a combination of an ε-optimal algorithm and an expedient algorithm can significantly improve the speed of convergence to the optimal solution. The following subsections review the hybrid algorithms in a nutshell.

Hybrid Algorithm I (H(L_R−I, L̂_R−P)) Viswanathan and Narendra (1989) proposed a hybrid learning algorithm combining a linear reward-inaction algorithm (L_R−I) and a linear reward-penalty algorithm denoted as L̂_R−P. Depending upon the intervals into which the action probabilities fall, one of the above-mentioned algorithms (either L_R−I or L̂_R−P) is chosen to update the action probability vector. Such a hybrid learning algorithm is obtained by substituting g_j and h_j in Eqs. (1.13) and (1.14) as follows.

g_j[p(n)] = a p_j(n)   (1.50)

h_j[p(n)] = A p_j(n)   (1.51)

where 0 < a < 1 is the learning parameter and A is defined as

A = { a if p_j(n) ∈ ( a/(1 + a), 1/(1 + a) ) ; 0 otherwise }   (1.52)

The proposed algorithm reduces to the L̂_R−P algorithm when A is set to a in the updating rule. The absolute expediency of the proposed algorithm has been shown in (Narendra and Thathachar 1989).
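A sketch of the interval-dependent switching of Eqs. (1.50)–(1.52) follows; the action-set size and the parameter value in the usage line are illustrative.

```python
def hybrid_update(p, i, beta, a=0.05):
    """H(L_R-I, L^_R-P) step: rewards act as in L_R-I (Eq. 1.50); on penalty
    the per-action parameter A of Eq. (1.52) switches the penalty on or off."""
    r = len(p)
    q = list(p)
    if beta == 0:
        for j in range(r):
            if j != i:
                q[j] = p[j] - a * p[j]                 # g_j = a p_j
        q[i] = p[i] + sum(a * p[j] for j in range(r) if j != i)
    else:
        A = lambda pj: a if a / (1 + a) < pj < 1 / (1 + a) else 0.0
        for j in range(r):
            if j != i:
                q[j] = p[j] + A(p[j]) * p[j]           # h_j = A p_j
        q[i] = p[i] - sum(A(p[j]) * p[j] for j in range(r) if j != i)
    return q


print(hybrid_update([0.2, 0.5, 0.3], 1, 1))   # penalty for action 1
```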

Hybrid Algorithm II (RLA_a) Friedman and Shenker (1992) proposed the responsive learning automaton (RLA_a) for overcoming the sub-optimality of the learning algorithm L_R−I in S-model nonstationary environments. In such a learning automaton, the probability with which each action is chosen is kept greater than a/2, which ensures that each action is selected infinitely often. Equation (1.53) shows the rule by which RLA_a updates its action probability vector. The ε-optimality of RLA_a in all stationary environments has been shown in Friedman and Shenker (1996). It should be noted that this updating rule coincides with that of algorithm SL_R−I if p_j(n) ≥ a (for all j).

p_j(n + 1) = p_j(n) + a[1 − β(n)] Σ_{k≠j} a_k(n) p_k(n),  j = i
p_j(n + 1) = p_j(n) − a[1 − β(n)] a_j(n) p_j(n),  j ≠ i   (1.53)

where a_j(n) = min{ 1, ( p_j(n) − a/2 ) / ( a[1 − β(n)] p_j(n) ) }.

1.2.8 Estimator Learning Algorithms One of the main difficulties in using LA in many practical applications is their slow rate of convergence. Although several attempts have been made to increase the rate of convergence, they are not sufficient by themselves. A further improvement in the rate of convergence can be obtained by estimating the characteristics of the environment as the learning proceeds and using this additional information when the action probabilities are updated. Algorithms that do so are called estimator learning algorithms (Thathachar and Sastry 1985b). Non-estimator learning algorithms update their action probability vectors based only on the current response from the environment, while estimator learning algorithms maintain a running estimate of the reward strength of each action; the action probability vector is then updated on the basis of both the current response from the environment and the running estimates of the reward strengths.

The state of an automaton operating under a P-model environment which is equipped with an estimator learning algorithm is defined as ⟨p(n), d̂(n)⟩ at instant n, where p(n) denotes the action probability vector, d̂(n) = [d̂_1(n), d̂_2(n), ..., d̂_r(n)]^T represents the vector of reward estimates, and d̂_i(n) denotes the estimate of the reward probability d_i at instant n. The estimate of the reward probability d_i is defined as the ratio of the number of times that action α_i is rewarded to the number of times α_i is selected (Thathachar and Sastry 1985a). An estimator learning algorithm operates as follows. An action (say α_i) is chosen by the learning automaton and applied to the environment. The random environment evaluates the chosen action and generates the response β(n). The learning automaton updates the action probability vector using d̂(n) and β(n). Finally, d̂(n) is updated on the basis of the current response by the following rules:

R_i(n + 1) = R_i(n) + [1 − β(n)]
R_j(n + 1) = R_j(n),  j ≠ i   (1.54)

Z_i(n + 1) = Z_i(n) + 1
Z_j(n + 1) = Z_j(n),  j ≠ i   (1.55)

d̂_k(n) = R_k(n) / Z_k(n),  k = 1, 2, ..., r   (1.56)

where R_i(n) denotes the number of times action α_i has been rewarded and Z_i(n) denotes the number of times action α_i has been selected. Several well-known estimator learning algorithms are briefly reviewed in what follows.
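The bookkeeping of Eqs. (1.54)–(1.56) amounts to two counters per action; a minimal Python sketch:

```python
class RewardEstimator:
    """Running reward estimates d^_k = R_k / Z_k of Eqs. (1.54)-(1.56)."""

    def __init__(self, r):
        self.R = [0] * r   # number of times each action was rewarded
        self.Z = [0] * r   # number of times each action was selected

    def update(self, i, beta):
        self.R[i] += 1 - beta   # beta = 0 denotes a reward (P-model)
        self.Z[i] += 1

    def d_hat(self, k):
        return self.R[k] / self.Z[k] if self.Z[k] else 0.0
```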

1.2.8.1 TS Stochastic Estimator Algorithm (TSE)

The TS stochastic estimator algorithm has been shown to be an ε-optimal learning algorithm in which, at instant n, the action probability vector is updated according to the following rules:

p_i(n + 1) = p_i(n) + λ Σ_{j≠i} [ f(d̂_i(n)) − f(d̂_j(n)) ] [ S_ij(n) p_j(n) + S_ji(n) ( p_i(n)/(r − 1) ) (1 − p_j(n)) ]
p_j(n + 1) = p_j(n) − λ [ f(d̂_i(n)) − f(d̂_j(n)) ] [ S_ij(n) p_j(n) + S_ji(n) ( p_i(n)/(r − 1) ) (1 − p_j(n)) ],  for j ≠ i   (1.57)

where f : [0, 1] → [0, 1] is a monotonically increasing function, λ is a constant value in the interval (0, 1), and S_ij(n) is defined as

S_ij(n) = { 1 if d̂_i(n) > d̂_j(n) ; 0 if d̂_i(n) ≤ d̂_j(n) }   (1.58)

1.2.8.2 Discretized TS Estimator Algorithm (DTSE)

This estimator algorithm is the discrete version of TSE, in which the action probability vector is updated as

p_j(n + 1) = p_j(n) − Check( p_i(n), p_j(n), θ ) f( d̂_i(n) − d̂_j(n) ) [ S_ij(n) + S_ji(n)/(r − 1) ],  for j ≠ i
p_i(n + 1) = p_i(n) + Σ_{j≠i} Check( p_i(n), p_j(n), θ ) f( d̂_i(n) − d̂_j(n) ) [ S_ij(n) + S_ji(n)/(r − 1) ]

where N denotes the resolution parameter, θ denotes the largest multiple of the step size Δ by which each component of the action probability vector can be decreased at each iteration, and Check(p_i, p_j, x) computes the largest multiple of Δ between 1 and x that can be added to p_i and subtracted from p_j while keeping p_i and p_j bounded between zero and one. It has been shown that the discretized TS estimator algorithm is ε-optimal in all stationary environments (Lanctot and Oommen 1992). In this estimator algorithm, the entire history of the learning process is used to estimate the reward probabilities. Such a learning algorithm is able to provide more and more accurate estimates as the learning process proceeds; therefore, the convergence speed of this algorithm is much faster than that of the other schemes. In nonstationary environments, however, the characteristics of the environment vary with time, which causes the information upon which the estimates are based to be no longer valid. Hence, the TS estimator learning algorithm does not operate well in nonstationary environments. Using a partial history (Simha and Kurose 1989) or stochastically computing the mean rewards (Papadimitriou et al. 1991) are approaches by which this problem can be relieved.

1.2.8.3 Relative Strength Learning Automata

Simha and Kurose (1989) proposed three S-model estimator algorithms, called relative strength learning algorithms, which are based on gradient projection methods. These algorithms update the action probability vector so that the growth of the choice probabilities is proportional to the relative size of the recently received rewards; that is, in these algorithms, the most recently received reward strengths are used to update the choice probability of each action.

1.2.8.4 Stochastic Estimator Learning Automata

The stochastic estimator learning automaton (SELA) is an S-model estimator learning algorithm in which the reward strength estimates are computed stochastically (Papadimitriou et al. 1991). In SELA, actions that have not been chosen recently are given the opportunity to be estimated as the optimal action. Since in SELA the estimates are updated based on the recently selected actions, the estimator can adapt to changes in the environment. The state of a stochastic estimator learning automaton at instant n is represented by a tuple ⟨d̂(n), V(n), U(n)⟩, where d̂(n) denotes the deterministic estimator vector, V(n) denotes the oldness vector, containing the time passed since the last selection of each action, and U(n) represents the stochastic estimator vector. The deterministic estimator vector d̂(n) contains the current deterministic estimates of the reward strengths over a learning window W, and the stochastic estimator vector U(n) contains the current stochastic estimates of the reward strengths, defined as u_i(n) = d̂_i(n) + N(0, σ_i²(n)), where σ_i(n) = min{ a v_i(n), σ_max }. The absolute expediency, and hence the ε-optimality, of the stochastic estimator learning automaton has been shown in Papadimitriou et al. (1991) for all stationary environments. SELA updates the action probability vector as follows:

p_j(n + 1) = max{ p_j(n) − 1/N, 0 }  for all j ≠ m
p_m(n + 1) = 1 − Σ_{j≠m} p_j(n + 1)   (1.59)

where N denotes the resolution parameter and α_m is the action with the highest stochastic estimate, i.e., u_m(n) > u_j(n) for all j ≠ m.
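A Python sketch of one SELA-style step follows, combining a stochastic perturbation of the deterministic estimates with the discretized update of Eq. (1.59); the window handling is omitted, and the oldness-to-variance mapping and all values in the usage line are simplifying assumptions.

```python
import random


def sela_step(p, d_hat, oldness, N=100, a=0.1, sigma_max=0.5):
    """One SELA-style step: perturb each deterministic estimate with noise
    that grows with the action's oldness, then apply Eq. (1.59)."""
    r = len(p)
    u = [d_hat[k] + random.gauss(0, min(a * oldness[k], sigma_max))
         for k in range(r)]                      # stochastic estimates
    m = max(range(r), key=lambda k: u[k])        # best stochastic estimate
    q = [max(pk - 1 / N, 0.0) for pk in p]
    q[m] = 1 - sum(q[k] for k in range(r) if k != m)
    return q, m


p, m = sela_step([0.25] * 4, [0.6, 0.5, 0.4, 0.3], [1, 5, 20, 80])
print(p, m)   # rarely chosen actions get a chance through larger noise
```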

1.2.8.5 S-Model Ergodic Discretized Estimator Learning Algorithm (SEDEL)

The S-model ergodic discretized estimator learning algorithm (SEDEL) is an ergodic discrete estimator learning algorithm in which, for each action, the mean reward strength is calculated as the average of the last W responses received from the environment for that action (Vasilakos and Paximadis 1994). It can be shown that SEDEL is ε-optimal in all stationary environments. SEDEL uses the rule shown in Eq. (1.60) for updating the action probability vector:

p_k(n + 1) = max{ 0, p_k(n) − 1/N }  for all k ≠ m
p_m(n + 1) = max{ 0, 1 − Σ_{k≠m} p_k(n + 1) }   (1.60)

where α_m denotes the action with the highest reward estimate.

1.2.8.6 Absorbing Stochastic Estimator Learning Algorithm (ASELA)

In this algorithm, the estimates of the reward strengths are computed stochastically, and the action probability vector is updated in a discrete manner. The algorithm operates under S-model environments (Papadimitriou et al. 2002). The reported results reveal the superiority of ASELA over similar existing algorithms. In this algorithm, actions that have been selected only a few times are given more opportunity to be selected again, and hence more chance to be estimated as the optimal action.

1.2.9 Pursuit Algorithms As the name of this class of finite action-set learning automata suggests, in these algorithms the action probability vector chases the action most recently estimated to be optimal. In fact, a pursuit learning algorithm is a simplified version of the estimator algorithms, inheriting their main characteristics. In pursuit learning algorithms, the choice probability of the action with the maximum reward estimate is increased; by this updating method, the learning algorithm always pursues the optimal action. In the following subsections, several well-known pursuit learning algorithms are briefly reviewed.

1.2.9.1 Pursuit Reward-Penalty (P_R−P) Algorithm

The pursuit reward-penalty learning algorithm was introduced by Thathachar and Sastry (1986). This algorithm is very similar to the linear reward-penalty (L_R−P) learning algorithm, but the updating rule by which P_R−P brings the action probability vector up to date is based on long-term estimates. In this algorithm, after the action probability vector is updated, the reward probability estimates are also updated. The ε-optimality of P_R−P has been shown in Thathachar and Sastry (1986). The major difference between P_R−P and L_R−P is as follows: in the L_R−P algorithm, the action probability vector follows the most recently rewarded action or the action that has not been penalized, while in the P_R−P algorithm, the action probability vector moves toward the action with the highest reward estimate, that is,

p(n + 1) = p(n)[1 − a] + a e_m   (1.61)

where 0 < a < 1 denotes the learning parameter, e_m is an r-dimensional unit vector in the mth direction, and m denotes the action with the maximum estimate of the reward probability. In Oommen and Agache (2001), Oommen et al. proposed a discretized pursuit reward-penalty (DP_R−P) algorithm in which the action probability vector is updated in a discrete manner. In this algorithm, the choice probability of every action other than the one with the highest reward estimate is decreased by a step Δ > 0, and the choice probability of the action with the highest reward estimate is increased by an integral multiple of Δ. The ε-optimality of this algorithm in all stationary environments has been shown in Oommen and Agache (2001). The following equation shows the updating rule of DP_R−P:

p_j(n + 1) = max{ p_j(n) − Δ, 0 }  for j ≠ m
p_m(n + 1) = 1 − Σ_{j≠m} p_j(n + 1)   (1.62)

when p ∉ V_r; the action probability vector remains unchanged when p ∈ V_r.
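The continuous pursuit update of Eq. (1.61) is essentially a one-liner; a sketch with illustrative estimates:

```python
def pursuit_update(p, d_hat, a=0.02):
    """Continuous pursuit step (Eq. 1.61): move the probability vector a
    fraction a toward the unit vector of the best-estimated action."""
    m = max(range(len(p)), key=lambda k: d_hat[k])
    return [(1 - a) * pk + (a if k == m else 0.0) for k, pk in enumerate(p)]


print(pursuit_update([0.25, 0.25, 0.25, 0.25], [0.4, 0.9, 0.1, 0.5]))
```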

1.2.9.2 Pursuit Reward-Inaction (P_R−I) Algorithm

The pursuit reward-inaction algorithm is very similar to the L_R−I algorithm: the action probability vector is updated only if the selected action receives a favorable response from the environment; otherwise, the action probability vector remains unchanged. The action probability vector is updated by the rule given in Eq. (1.61). It has been shown that P_R−I is ε-optimal. P_R−I captures both long-term and short-term characteristics of the environment (John Oommen and Agache 2001): the long-term behavior of the environment is recorded by the reward estimates, and the short-term behavior is recorded by the responses recently received from the environment.

The discretized pursuit reward-inaction algorithm (DP_R−I) is a special case of the pursuit reward-inaction algorithm in which the action probability vector is updated in a discrete manner, unlike the continuous version in which the action probability vector is updated by a continuous function (Oommen and Lanctôt 1990). In this algorithm, if the selected action is rewarded, the choice probability of the action with the highest reward estimate, say action α_m, is increased by an integral multiple of Δ, and the choice probabilities of all other actions are decreased by Δ. In DP_R−I, as in P_R−I, the action probability vector remains unchanged when the selected action receives an unfavorable response from the environment. It is shown in Oommen and Lanctôt (1990) that DP_R−I is absorbing and ε-optimal in all stationary environments. The following equation shows the updating rule of DP_R−I:

p_j(n + 1) = max{ p_j(n) − Δ, 0 }  for j ≠ m
p_m(n + 1) = 1 − Σ_{j≠m} p_j(n + 1)   (1.63)

1.2.9.3 Generalized Pursuit Algorithm

The main disadvantage of the above-mentioned pursuit algorithms is that at each iteration the choice probability of the action with the highest reward estimate must be increased, which moves the action probability vector toward the action with the maximum reward estimate. Consequently, the learning automaton may converge to a wrong (non-optimal) action when the action with the highest reward estimate is not the action with the minimum penalty probability. To avoid such wrong convergence, the generalized pursuit algorithm (GP) was introduced in Agache and Oommen (2002). In this algorithm, a set of actions with higher estimates than the currently selected action is pursued at each instant. In Agache and Oommen (2002), it has been shown that this algorithm is ε-optimal in all stationary environments. Let K(n) denote the number of actions that have higher estimates than the action selected at instant n. Equation (1.64) shows the updating rule of the generalized pursuit algorithm:

p_j(n + 1) = p_j(n)(1 − a) + a/K(n)  ∀j (j ≠ m) such that d̂_j > d̂_i
p_j(n + 1) = p_j(n)(1 − a)  ∀j (j ≠ m) such that d̂_j ≤ d̂_i
p_m(n + 1) = 1 − Σ_{j≠m} p_j(n + 1)   (1.64)

where α_m denotes the action with the highest reward estimate. The discretized generalized pursuit algorithm (DGP) (Agache and Oommen 2002) is a special case of the generalized pursuit algorithm in which the action probability vector is updated in discrete steps. The algorithm is called pseudo-discretized, since its step size varies over the steps. In this algorithm, the choice probability of every action with a higher reward estimate than the selected action increases by Δ/K(n), and that of the other actions decreases by Δ/(r − K(n)). The ε-optimality of the discretized generalized pursuit algorithm in all stationary environments has been shown in Agache and Oommen (2002). DGP updates the action probability vector by the following rule:

p_j(n + 1) = min{ p_j(n) + Δ/K(n), 1 }  ∀j (j ≠ m) such that d̂_j > d̂_i
p_j(n + 1) = max{ p_j(n) − Δ/(r − K(n)), 0 }  ∀j (j ≠ m) such that d̂_j ≤ d̂_i
p_m(n + 1) = 1 − Σ_{j≠m} p_j(n + 1)   (1.65)

The pursuit algorithms, ranked in decreasing order of performance, are DGP, DP_R−I, GP, DP_R−P, P_R−I, and P_R−P.
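A sketch of the DGP step of Eq. (1.65) follows; the step size Δ and the values in the usage line are illustrative constants.

```python
def dgp_update(p, d_hat, i, delta=1e-3):
    """DGP step (Eq. 1.65): i is the selected action, m the best-estimated
    action, K the number of actions estimated better than action i."""
    r = len(p)
    m = max(range(r), key=lambda k: d_hat[k])
    K = sum(1 for k in range(r) if d_hat[k] > d_hat[i])
    q = list(p)
    for j in range(r):
        if j == m:
            continue
        if d_hat[j] > d_hat[i]:                      # K > 0 whenever this runs
            q[j] = min(p[j] + delta / K, 1.0)        # gain probability mass
        else:
            q[j] = max(p[j] - delta / (r - K), 0.0)  # lose probability mass
    q[m] = 1 - sum(q[j] for j in range(r) if j != m)
    return q


print(dgp_update([0.25, 0.25, 0.25, 0.25], [0.4, 0.9, 0.1, 0.5], i=0))
```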


1.3 Interconnected Learning Automata It seems that the full potential of learning automata is realized when multiple automata interact with each other. It has been shown that a set of interconnected learning automata is able to describe the behavior of an ant colony capable of finding the shortest path from its nest to food sources and back (Verbeeck et al. 2002). In this section, we study interconnected learning automata. Based on the way learning automata are activated to take an action, interconnected learning automata techniques can be classified into three classes: synchronous, sequential, and asynchronous, as follows.

• Synchronous Model of Interconnected Automata. In the synchronous model, at any time instant all automata are activated simultaneously, choose their actions, apply their chosen actions to the environment, and finally update their states. Two models of synchronous interconnected learning automata have been reported in the literature: the game of automata and synchronous cellular learning automata.

• Asynchronous Model of Interconnected Automata. In the asynchronous model, at any time instant only a group of automata is activated, independently. The only model proposed for the asynchronous case is the asynchronous cellular learning automaton: a cellular automaton in which an automaton (or multiple automata) is assigned to every cell. The learning automata residing in a particular cell determine its state (action) on the basis of their action probability vectors. As in cellular automata, there is a rule under which cellular learning automata operate. The rule of the cellular learning automata and the states of the cells neighboring any particular cell determine the reinforcement signal to the learning automata residing in that cell. In cellular learning automata, the neighboring cells of any particular cell constitute its environment, because they produce the reinforcement signal to the learning automata residing in that cell. This environment is nonstationary, because it varies as the action probability vectors of the cells vary, and it is called a local environment because it is local to each cell. Krishna proposed an asynchronous cellular learning automaton in which the order in which the learning automata are activated is imposed by the environment (Krishna 1993).

• Sequential Model of Interconnected Automata. In the sequential model, at any time instant only a group of automata is activated, and the actions chosen by the currently activated automata determine the next automata to be activated. Hierarchical structure learning automata, networks of learning automata, distributed learning automata, and extended distributed learning automata are examples of the sequential model. In the following subsections, we focus on sequential interconnected learning automata.


1.3.1 Hierarchical Structure Learning Automata (HSLA) When the number of actions of a learning automaton becomes large (e.g., more than 10 actions), the time taken for the action probability vector to converge also increases. Under such circumstances, a hierarchical structure of learning automata (HSLA) can be used. A hierarchical system of automata is a tree structure of depth M in which each node corresponds to an automaton and the arcs emanating from that node correspond to the actions of that automaton. In an HSLA, an automaton with r actions is at the first level (the root of the tree), and the kth level has r^{k−1} automata, each with r actions. The root node corresponds to an automaton which will be referred to as the first-level or top-level automaton. Selecting an action of this automaton activates an automaton at the second level. In this way, the structure can be extended to an arbitrary number of levels. A three-level hierarchy with three actions per automaton is shown in Fig. 1.9.

The operation of hierarchical structure learning automata can be described as follows. Initially, the root automaton selects one action, say action α_{i1}. Then the i_1th automaton at the second level is activated. The action selected by the i_1th automaton at the second level (say α_{i1 i2}) activates an automaton at the third level. This process continues until a leaf automaton is activated, and the action of this automaton is applied to the environment. The response from the environment is used to update the action probability vectors of the activated automata along the path from the root to the selected leaf node. The basic idea of the learning algorithm is to increase the probability of selecting a good action and to decrease the probabilities of selecting the other actions. In an HSLA, a set of actions {α_{i1}, α_{i1 i2}, ..., α_{i1 i2 ... iM}} is said to be on the optimal path if the product of their respective reward probabilities is maximum. HSLAs can be classified into three types, I, II, and III (Thathachar and Sastry 1987). An HSLA is said to be of type I if the actions constituting the optimal path are also individually optimal at their respective levels.

Fig. 1.9 Hierarchical structure learning automata (HSLA)



An HSLA is said to be of type II if the actions constituting the optimal path are also individually optimal at their respective automata. Any general hierarchy is said to be of type III.
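The root-to-leaf cycle can be sketched in Python as follows; the nested-dictionary tree representation, the L_R−I update used along the path, and the environment are illustrative assumptions.

```python
import random


def hsla_cycle(root, environment, a=0.05):
    """One HSLA cycle: descend from the root by sampling an action at each
    level, apply the composite action, then reinforce the whole path."""
    path, node, label = [], root, ()
    while node is not None:                       # descend to a leaf
        p = node["p"]
        i = random.choices(range(len(p)), weights=p)[0]
        path.append((node, i))
        label += (i,)
        node = node["children"][i]                # None below a leaf automaton
    beta = environment(label)                     # evaluate the composite action
    if beta == 0:                                 # reward every automaton on the path
        for node, i in path:
            node["p"] = [pj + a * (1 - pj) if j == i else (1 - a) * pj
                         for j, pj in enumerate(node["p"])]
    return label, beta


# Two-level hierarchy with two actions per automaton (illustrative)
leaf = lambda: {"p": [0.5, 0.5], "children": [None, None]}
root = {"p": [0.5, 0.5], "children": [leaf(), leaf()]}
env = lambda lab: 0 if lab == (1, 0) and random.random() < 0.8 else 1
for _ in range(5000):
    hsla_cycle(root, env)
print(root["p"])   # the branch leading to composite action (1, 0) dominates
```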

1.3.2 Multi-level Game of Learning Automata Multi-level games of LAs can be effectively used for solving multi-criteria optimization problems in which several objective functions are required to be optimized. In a multi-level game of LAs, multiple games are played at different levels (Billard 1994, 1996). The game being played at each level of the multi-level game of LAs decides the game that has to be played at the next level. Figure 1.10 shows a two-level game of LAs with four players.

Fig. 1.10 Multi-level game of learning automata

1.3.3 Network of Learning Automata (NLA) A network of LAs (Williams 1988) is a collection of LAs connected together in a hierarchical feed-forward layered structure, in which the outgoing links of the LAs in each layer are the inputs of the LAs of the succeeding layer. In this model, the LAs (and consequently the layers) are classified into three separate groups. The first group includes the LAs located at the first level of the network, called the input LAs (input layer). The second group is composed of the LAs located at the last level of the network, called the output LAs (output layer). The third group includes the LAs located between the first and the last layers, called the hidden LAs (hidden layers). In a network of LAs, the input LAs receive the context vectors as external inputs from the environment, and the output LAs apply the output of the network to the environment. The difference between feed-forward neural networks and NLAs is that the units of neural networks are deterministic while the units of an NLA are stochastic, and the learning algorithms used in the two networks are different. Since the units are stochastic, the output of a particular unit i is drawn from a distribution depending on its input weight vector and the outputs of the units in the preceding layers. This model operates as follows. The context vector is applied to the input LAs. Each input LA selects one of its possible actions on the basis of its action probability vector and the input signals it receives from the environment. The chosen action activates the LAs of the next level that are connected to this LA. Each activated LA selects one of its actions as stated before. The actions selected by the output LAs are applied to the random environment. The environment evaluates the output action in comparison with the desired output and generates the reinforcement signal, which is then used by all LAs for updating their states. The structure of such a network is shown in Fig. 1.11.

Fig. 1.11 Network of learning automata

1.3.4 Distributed Learning Automata (DLA) A hierarchical structure of learning automata has a tree structure, in which there exists a unique path between the root of the tree and each of its leaves. However, in some applications, such as routing in computer networks, there may be multiple paths between the source and destination nodes. The system that generalizes HSLA in this way is referred to as distributed learning automata (DLA).

Fig. 1.12 Structure of distributed learning automata

A distributed learning automata (DLA) (Beigy and Meybodi 2006), shown in Fig. 1.12, is a network of interconnected learning automata which collectively cooperate to solve a particular problem. The number of actions of a particular LA in a DLA is equal to the number of LAs connected to it, and selecting an action in a DLA activates the LA corresponding to that action. Formally, a DLA can be defined by a quadruple ⟨A, E, T, A_0⟩, where A = {A_1, A_2, ..., A_n} is the set of learning automata, E ⊂ A × A is the set of edges, where edge (v_i, v_j) corresponds to action α_i^j of automaton A_i, T is the set of learning algorithms with which the learning automata update their action probability vectors, and A_0 is the root automaton of the DLA, at which the activation of the DLA starts.

The operation of a DLA can be described as follows. At first, the root automaton A_0 randomly chooses one of its outgoing edges (actions) according to its action probabilities and activates the learning automaton at the other end of the selected edge. The activated automaton also randomly selects an action, which results in the activation of another automaton. The process of choosing actions and activating automata continues until a leaf automaton (an automaton that interacts with the environment) is reached. The chosen actions along the path induced by the activated automata are applied to the random environment. The environment evaluates the applied actions and emits a reinforcement signal to the DLA. The activated learning automata along the chosen path update their action probability vectors on the basis of the reinforcement signal according to their learning algorithms. Paths from the unique root automaton to one of the leaf automata are selected repeatedly until the choice probability of one of the paths becomes close enough to unity. Each DLA has exactly one root automaton, which is always activated, and at least one leaf automaton, which is activated probabilistically. For example, in Fig. 1.13, every automaton has two actions. If automaton A_0 selects α_{02} from its action set, it activates automaton A_2; afterward, automaton A_2 chooses one of its possible actions, and so on.

Fig. 1.13 Example of distributed learning automata

In Sato (1999), a restricted version of the above DLA was introduced, in which the underlying graph in which the DLA is embedded is a finite directed acyclic graph (DAG). Sato used the L_R−I learning algorithm with a decaying reward parameter. It is shown that every edge of the DAG is selected infinitely often, and thus every learning automaton is activated infinitely often. It is also shown that when every learning automaton has a unique best action, the DLA converges to its best action with probability 1 (Sato 1999). Meybodi and Beigy introduced a DLA in which the underlying graph is not restricted to be a DAG (Beigy and Meybodi 2006); in order to prevent a learning automaton from appearing more than once in any path, learning automata with a changing number of actions are used. The DLA has been used for solving several stochastic graph problems (Akbari Torkestani and Meybodi 2010, 2012; Mollakhalili Meybodi and Meybodi 2014; Rezvanian and Meybodi 2015a, b).
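A minimal Python sketch of one DLA episode on a small DAG follows; the graph, the environment, and the L_R−I-style update are illustrative assumptions.

```python
import random

# Illustrative DAG: each automaton's actions are its outgoing neighbors
graph = {"A0": ["A1", "A2"], "A1": ["A3"], "A2": ["A3"], "A3": []}
prob = {u: [1 / len(vs)] * len(vs) for u, vs in graph.items() if vs}


def dla_episode(evaluate, a=0.05):
    """Walk from the root to a leaf automaton, then reinforce the path."""
    node, path = "A0", []
    while graph[node]:                            # leaf automata have no actions
        i = random.choices(range(len(graph[node])), weights=prob[node])[0]
        path.append((node, i))
        node = graph[node][i]
    if evaluate([u for u, _ in path] + [node]):   # environment feedback
        for u, i in path:                         # L_R-I-style reward update
            prob[u] = [pj + a * (1 - pj) if j == i else (1 - a) * pj
                       for j, pj in enumerate(prob[u])]


for _ in range(2000):
    dla_episode(lambda nodes: "A1" in nodes and random.random() < 0.9)
print(prob["A0"])   # the action leading to A1 should dominate
```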

1.3.5 Extended Distributed Learning Automata (eDLA) An extended distributed learning automaton (eDLA) (Mollakhalili Meybodi and Meybodi 2014) is an extension of DLA supervised by a set of rules governing the operation of the LAs. Mollakhalili-Meybodi et al. presented a framework based on eDLA for solving stochastic graph optimization problems such as the stochastic shortest path problem and the stochastic minimum spanning tree problem. Here, we provide a brief introduction to the eDLA. In general, in an eDLA the ability of a DLA is improved by adding communication rules and by changing the activity level of each LA depending on the problem to be solved. Similarly to a DLA, an eDLA can be modeled by a directed graph in which the node-set of the graph constitutes the set of LAs and the number of actions of each LA equals the number of LAs connected to it. In an eDLA, at any time, each LA is in one mode of activity, and only an LA at the highest activity level can perform an action, according to its action probabilities, on the random environment. Formally, an eDLA can be described by a 7-tuple ⟨A, E, S, P, S^0, F, C⟩, where A is the set of LAs, E ⊂ A × A is the edge-set of the communication graph G = ⟨V, E⟩, and S = {s_1, s_2, ..., s_n} is the set of activity levels of the LAs; in particular, s_i indicates the activity level of learning automaton A_i, where s_i ∈ {Pa, Ac, Fi, Of} takes one of the following activity levels: Passive (the initial level of each LA, which can be changed to Active), Active (the activity level of the set of available LAs, which can be upgraded to Fire), Fire (the highest level of activity, at which the LA performs an action, after which its level is changed to Off), and Off (the lowest level of activity, at which the LA is disabled and its level stays unchanged), represented briefly by Pa, Ac, Fi, and Of, respectively. As mentioned, at any time only one LA in the eDLA can be at the Fi activity level; it is determined by the firing function C, which randomly selects an LA from the set of LAs with activity level Ac. The governing rule P is the finite set of rules that governs the activity levels of the LAs; P is defined according to the current activity level of each LA and its adjacent LAs, or depending on the particular problem for which the eDLA is designed. S^0 = (s_1^0, s_2^0, ..., s_n^0) and F = {S^F | S^F = (s_1^F, s_2^F, ..., s_n^F)} are the initial state and the final conditions of the eDLA, respectively.

The operation of an eDLA can be described as follows. Starting from the initial state S^0, a starting LA is randomly selected by the firing function C to fire; it selects one of its outgoing edges (actions) according to its action probabilities and performs it on the random environment, and at the same time the activity levels of the fired LA and of its neighboring LAs are changed to Of and Ac, respectively. Changing the activity levels of the LAs transfers the state of the eDLA from state S^k to state S^{k+1} at instant k according to the governing rule P. Then the firing function C fires one LA from the set of LAs with activity level Ac to select an action, after which the activity levels of the fired LA and its neighbors are changed. The process of firing one LA by the firing function, performing an action by the fired LA, and changing the activity levels of the fired LA and its neighbors by the governing rule P continues until the final condition F of the eDLA is reached. F can be defined by a set of criteria in terms of the activity levels of the LAs such that, if one of them is satisfied, the final condition of the eDLA is realized. The environment evaluates the actions performed by the fired LAs and generates a reinforcement signal to the eDLA. The action probabilities of the fired LAs along the visited nodes, or of the LAs of the nodes that are part of a solution to the graph problem, are then updated on the basis of the reinforcement signal according to the learning algorithm. This process, starting each time from a randomly selected LA, is repeated a predefined number of times until a solution of the problem for which the eDLA is designed is obtained.
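The activity-level dynamics can be sketched as a small state machine in Python; the communication graph and the simple neighbor-activation governing rule below are illustrative assumptions.

```python
import random

# Illustrative communication graph; activity levels: Pa, Ac, Fi, Of
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
level = {v: "Pa" for v in graph}

start = random.choice(list(graph))                # initial state S^0
level[start] = "Ac"
fired = []
while any(l == "Ac" for l in level.values()):     # final condition F
    candidates = [v for v, l in level.items() if l == "Ac"]
    v = random.choice(candidates)                 # firing function C picks one
    fired.append(v)                               # LA v would act on the environment here
    level[v] = "Of"                               # governing rule P: the fired LA
    for u in graph[v]:                            # goes Off and its passive
        if level[u] == "Pa":                      # neighbors become Active
            level[u] = "Ac"
print(fired)   # one possible firing order over the reachable LAs
```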

1.4 Recent Applications of Learning Automata In recent years, learning automata, as one of the powerful computational intelligence techniques, have been found very useful for solving problems in many real, complex, and dynamic environments where a large amount of uncertainty or a lack of information about the environment exists (Rezvanian et al. 2018a, d). Table 1.1 summarizes some recent applications of learning automata. In Table 1.2, recent applications of learning automata to social network problems, together with the type of learning automata used, are given.


1.5 Conclusion In this chapter, we made a beginning with the basic concepts and properties of learning automata models. In addition to single learning automata, models consisting of many learning automata, such as networks of learning automata, hierarchical structure learning automata, distributed learning automata, and extended distributed learning automata, together with their learning algorithms, were introduced in this chapter. We also reported recent applications of learning automata models.

Table 1.1 Summary of recent applications of learning automata

Applications | Learning automata type
Cellular networks | CLA (Beigy and Meybodi 2010), LA (Rezapoor Mirsaleh and Meybodi 2018a)
Cloud computing | DLA (Hasanzadeh and Meybodi 2014), LA (Jobava et al. 2018), LA (Rahmanian et al. 2018), ICLA (Morshedlou and Meybodi 2018), LA (Morshedlou and Meybodi 2014), FALA (Velusamy and Lent 2018), LA (Qavami et al. 2017)
Cyber-physical systems | LA (Ren et al. 2018)
Data mining | FALA (Hasanzadeh-Mofrad and Rezvanian 2018), ACLA (Ahangaran et al. 2017), CLA (Sohrabi and Roshani 2017)
Graph problems | CLA (Vahidipour et al. 2017a), LA (Rezapoor Mirsaleh and Meybodi 2018b), LA (Mousavian et al. 2013), ICLA (Mousavian et al. 2014), DLA (Soleimani-Pouri et al. 2012), LA (Khomami et al. 2016a), FALA (Vahidipour et al. 2019), DLA (Vahidipour et al. 2019), ICLA (Vahidipour et al. 2019; Daliri Khomami et al. 2017)
Image processing | CLA (Hasanzadeh Mofrad et al. 2015), LA (Damerchilu et al. 2016), LA (Kumar et al. 2015b), CLA (Adinehvand et al. 2017)
Internet of Things (IoT) | LA (Di et al. 2018), LA (Sikeridis et al. 2018)
Network security | LA (Krishna et al. 2014), FALA (Di et al. 2019), LA (Farsi et al. 2018), CALA (Kahani and Fallah 2018), FALA (Su et al. 2018), DPR-P (Seyyedi and Minaei-Bidgoli 2018)
Optimization | FALA (Rezvanian and Meybodi 2010), FALA (Mahdaviani et al. 2015), VSLA (Kordestani et al. 2018), CLA (Vafashoar and Meybodi 2016), LA (Rezapoor Mirsaleh and Meybodi 2015), LA (Li et al. 2018), LA (Rezapoor Mirsaleh and Meybodi 2018c)
Social network analysis | LA (Amiri et al. 2013), DLA (Khomami et al. 2016b), CALA (Moradabadi and Meybodi 2016), FSLA (Ghavipour and Meybodi 2018a), ICLA (Ghavipour and Meybodi 2017), ICLA (Khomami et al. 2018), ICLA (Zhao et al. 2015), CLA (Aldrees and Ykhlef 2014)
Stochastic social networks | DLA (Rezvanian and Meybodi 2016a), LA (Moradabadi and Meybodi 2018b), DLA (Vahidipour et al. 2017b), ICLA (Vahidipour et al. 2019), EDLA (Rezvanian and Meybodi 2017a)
Peer-to-peer networks | LA (Saghiri and Meybodi 2016, 2017), CLA (Saghiri and Meybodi 2018; Amirazodi et al. 2018), ICLA (Rezvanian et al. 2018b)
Vehicular environments | LA (Misra et al. 2014; Kumar et al. 2015a), LA (Toffolo et al. 2018)
Wireless mesh networks | LA (Parvanak et al. 2018; Beheshtifard and Meybodi 2018)
Wireless sensor networks | LA (Han and Li 2019), ICLA (Rezvanian et al. 2018c), FALA (Javadi et al. 2018), DLA (Mostafaei 2018), ICLA (Mostafaei and Obaidat 2018a), DLA (Mostafaei and Obaidat 2018b), GLA (Rahmani et al. 2018)

Table 1.2 Summary of recent applications of learning automata for social network problems

Social network problem: Learning automata type

Centrality measurement: NLA (Rezvanian and Meybodi 2016b), LA (Moradabadi and Meybodi 2018b)
Community detection: FALA (Amiri et al. 2013), DLA (Khomami et al. 2016b), ICLA (Khomami et al. 2018), EDLA (Ghamgosar et al. 2017)
Graph problems: DLA (Rezvanian and Meybodi 2015a), DLA (Rezvanian and Meybodi 2015b), WCLA (Moradabadi and Meybodi 2018c)
Influence maximization: NLA (Daliri Khomami et al. 2018), DGCPA (Ge et al. 2017), DLri (Huang et al. 2018), CLA (Aldrees and Ykhlef 2014)
Link prediction: FALA (Moradabadi and Meybodi 2018b), FALA (Moradabadi and Meybodi 2017a), CALA (Moradabadi and Meybodi 2018a), DLA (Moradabadi and Meybodi 2017b), CALA (Moradabadi and Meybodi 2016)
Network sampling: DLA (Rezvanian et al. 2014), DLA (Rezvanian and Meybodi 2017a), EDLA (Rezvanian and Meybodi 2017b), EDLA (Rezvanian and Meybodi 2017a), ICLA (Ghavipour and Meybodi 2017), VSLA (Rezvanian and Meybodi 2017a), FSLA (Ghavipour and Meybodi 2018a), FALA (Khadangi et al. 2016)
Recommender systems: FALA (Krishna et al. 2013), CALA (Ghavipour and Meybodi 2016), CLA (Toozandehjani et al. 2014)
Trust management: DLA (Ghavipour and Meybodi 2018b), DLA (Ghavipour and Meybodi 2018c), FALA (Lingam et al. 2018), CLA (Bushehrian and Nejad 2017)


References Adinehvand K, Sardari D, Hosntalab M, Pouladian M (2017) An efficient multistage segmentation method for accurate hard exudates and lesion detection in digital retinal images. J Intell Fuzzy Syst 33:1639–1649. https://doi.org/10.3233/JIFS-17199 Agache M, Oommen BJ (2002) Generalized pursuit learning schemes: new families of continuous and discretized learning automata. IEEE Trans Syst Man Cybern Part B Cybern 32:738–749. https://doi.org/10.1109/TSMCB.2002.1049608 Ahangaran M, Taghizadeh N, Beigy H et al (2017) Associative cellular learning automata and its applications. Appl Soft Comput J 53:1–18. https://doi.org/10.1016/j.asoc.2016.12.006 Akbari Torkestani J, Meybodi MR (2010) Learning automata-based algorithms for finding minimum weakly connected dominating set in stochastic graphs. Int J Uncertain Fuzziness Knowl-Based Syst 18:721–758. https://doi.org/10.1142/S0218488510006775 Akbari Torkestani J, Meybodi MR (2012) A learning automata-based heuristic algorithm for solving the minimum spanning tree problem in stochastic graphs. J Supercomput 59:1035–1054. https:// doi.org/10.1007/s11227-010-0484-1 Aldrees M, Ykhlef M (2014) A seeding cellular learning automata approach for viral marketing in social network. In: Proceedings of the 16th international conference on information integration and web-based applications & services—iiWAS’14. ACM Press, New York, pp 59–63 Amirazodi N, Saghiri AM, Meybodi M (2018) An adaptive algorithm for super-peer selection considering peer’s capacity in mobile peer-to-peer networks based on learning automata. Peerto-Peer Netw Appl 11:74–89. https://doi.org/10.1007/s12083-016-0503-y Amiri F, Yazdani N, Faili H, Rezvanian A (2013) A novel community detection algorithm for privacy preservation in social networks. In: Abraham A (ed), pp 443–450 Aso H, Kimura M (1979) Absolute expediency of learning automata. Inf Sci (Ny) 17:91–112. https://doi.org/10.1016/0020-0255(79)90034-3 Baba N (1983) The absolutely expedient nonlinear reinforcement schemes under the unknown multiteacher environment. IEEE Trans Syst Man Cybern SMC-13:100–108. https://doi.org/10. 1109/tsmc.1983.6313039 Beheshtifard Z, Meybodi MR (2018) An adaptive channel assignment in wireless mesh network: the learning automata approach. Comput Electr Eng 72:79–91. https://doi.org/10.1016/ j.compeleceng.2018.09.004 Beigy H, Meybodi MR (2006) Utilizing distributed learning automata to solve stochastic shortest path problems. Int J Uncertain Fuzziness Knowl-Based Syst 14:591–615. https://doi.org/10.1142/ S0218488506004217 Beigy H, Meybodi MRR (2010) Cellular learning automata with multiple learning automata in each cell and its applications. IEEE Trans Syst Man Cybern Part B 40:54–65. https://doi.org/10.1109/ TSMCB.2009.2030786 Billard EA (1994) Instabilities in learning automata playing games with delayed information. In: Proceedings of IEEE international conference on systems, man and cybernetics. IEEE, pp 1160–1165 Billard EA (1996) Stability of adaptive search in multi-level games under delayed information. IEEE Trans Syst Man, Cybern Part A Systems Humans 26:231–240. https://doi.org/10.1109/ 3468.485749 Bushehrian O, Nejad SE (2017) Health-care pervasive environments: a CLA based trust management, pp 247–257 Christensen JPR, Oommen BJ (1990) Epsilon-optimal stubborn learning mechanisms. IEEE Trans Syst Man Cybern 20:1209–1216. 
https://doi.org/10.1109/21.59983 Daliri Khomami MM, Haeri MA, Meybodi MR, Saghiri AM (2017) An algorithm for weighted positive influence dominating set based on learning automata. In: 2017 IEEE 4th international conference on knowledge-based engineering and innovation (KBEI). IEEE, pp 0734–0740


Daliri Khomami MM, Rezvanian A, Bagherpour N, Meybodi MR (2018) Minimum positive influence dominating set and its application in influence maximization: a learning automata approach. Appl Intell 48:570–593. https://doi.org/10.1007/s10489-017-0987-z Damerchilu B, Norouzzadeh MS, Meybodi MR (2016) Motion estimation using learning automata. Mach Vis Appl 27:1047–1061. https://doi.org/10.1007/s00138-016-0788-0 Di C, Zhang B, Liang Q et al (2018) Learning automata based access class barring scheme for massive random access in machine-to-machine communications. IEEE Internet Things J 1–1. https://doi.org/10.1109/jiot.2018.2867937 Di C, Su Y, Han Z, Li S (2019) Learning automata based SVM for intrusion detection, pp 2067–2074 Farsi H, Nasiripour R, Mohammadzadeh S (2018) Eye gaze detection based on learning automata by using SURF descriptor. J Inf Syst Telecommun 21:1–10. https://doi.org/10.7508/jist.2018.21. 006 Friedman EJ, Shenker S (1992) Learning by distributed automata. Electronics Research Laboratory, College of Engineering, University of California Friedman EJ, Shenker S (1996) Synchronous and asynchronous learning by responsive learning automata Ge H, Huang J, Di C et al (2017) Learning automata based approach for influence maximization problem on social networks. In: 2017 IEEE second international conference on data science in cyberspace (DSC). IEEE, pp 108–117 Ghamgosar M, Khomami MMD, Bagherpour N, Meybodi MR (2017) An extended distributed learning automata based algorithm for solving the community detection problem in social networks. In: 2017 Iranian conference on electrical engineering (ICEE). IEEE, pp 1520–1526 Ghavipour M, Meybodi MR (2016) An adaptive fuzzy recommender system based on learning automata. Electron Commer Res Appl 20:105–115. https://doi.org/10.1016/j.elerap.2016.10.002 Ghavipour M, Meybodi MR (2017) Irregular cellular learning automata-based algorithm for sampling social networks. Eng Appl Artif Intell 59:244–259. https://doi.org/10.1016/j.engappai.2017. 01.004 Ghavipour M, Meybodi MR (2018a) A streaming sampling algorithm for social activity networks using fixed structure learning automata. Appl Intell 48:1054–1081. https://doi.org/10. 1007/s10489-017-1005-1 Ghavipour M, Meybodi MR (2018b) Trust propagation algorithm based on learning automata for inferring local trust in online social networks. Knowl-Based Syst 143:307–316. https://doi.org/ 10.1016/j.knosys.2017.06.034 Ghavipour M, Meybodi MR (2018c) A dynamic algorithm for stochastic trust propagation in online social networks: learning automata approach. Comput Commun 123:11–23. https://doi.org/10. 1016/j.comcom.2018.04.004 Han Z, Li S (2019) Opportunistic routing algorithm based on estimator learning automata, pp 2486–2492 Hasanzadeh M, Meybodi MR (2014) Grid resource discovery based on distributed learning automata. Computing 96:909–922. https://doi.org/10.1007/s00607-013-0337-x Hasanzadeh Mofrad M, Sadeghi S, Rezvanian A, Meybodi MR (2015) Cellular edge detection: combining cellular automata and cellular learning automata. AEU—Int J Electron Commun 69:1282–1290. https://doi.org/10.1016/j.aeue.2015.05.010 Hasanzadeh-Mofrad M, Rezvanian A (2018) Learning automata clustering. J Comput Sci 24:379–388. https://doi.org/10.1016/j.jocs.2017.09.008 Haykin S (1994) Neural networks: a comprehensive foundation. Prentice Hall PTR, Upper Saddle River Herkenrath U, Kalin D, Lakshmivarahan S (1981) On a general class of absorbing-barrier learning algorithms. Inf Sci (Ny) 24:255–263. 
https://doi.org/10.1016/0020-0255(81)90034-7 Huang J, Ge H, Guo Y et al (2018) A learning automaton-based algorithm for influence maximization in social networks, pp 715–722


Javadi M, Mostafaei H, Chowdhurry MU, Abawajy JH (2018) Learning automaton based topology control protocol for extending wireless sensor networks lifetime. J Netw Comput Appl 122:128–136. https://doi.org/10.1016/j.jnca.2018.08.012 Jobava A, Yazidi A, Oommen BJ, Begnum K (2018) On achieving intelligent traffic-aware consolidation of virtual machines in a data center using learning automata. J Comput Sci 24:290–312. https://doi.org/10.1016/j.jocs.2017.08.005 John Oommen B, Agache M (2001) Continuous and discretized pursuit learning schemes: various algorithms and their comparison. IEEE Trans Syst Man Cybern Part B Cybern 31:277–287. https://doi.org/10.1109/3477.931507 Kaddour N, Poznyak AS (1994) Learning automata: theory and applications. Pergamon Press, Oxford Kaelbling LP, Littman ML, Moore AW (1996) Reinforcement learning: a survey. J Artif Intell Res 4:237–285. https://doi.org/10.1613/jair.301 Kahani N, Fallah MS (2018) A reactive defense against bandwidth attacks using learning automata. In: Proceedings of the 13th international conference on availability, reliability and security—ARES 2018. ACM Press, New York, pp 1–6 Khadangi E, Bagheri A, Shahmohammadi A (2016) Biased sampling from Facebook multilayer activity network using learning automata. Appl Intell 45:829–849. https://doi.org/10.1007/ s10489-016-0784-0 Khomami MMD, Bagherpour N, Sajedi H, Meybodi MR (2016a) A new distributed learning automata based algorithm for maximum independent set problem. In: 2016 artificial intelligence and robotics (IRANOPEN). IEEE, Qazvin, Iran, Iran, pp 12–17 Khomami MMD, Rezvanian A, Meybodi MR (2016b) Distributed learning automata-based algorithm for community detection in complex networks. Int J Mod Phys B 30:1650042. https://doi. org/10.1142/S0217979216500429 Khomami MMD, Rezvanian A, Meybodi MR (2018) A new cellular learning automata-based algorithm for community detection in complex social networks. J Comput Sci 24:413–426. https:// doi.org/10.1016/j.jocs.2017.10.009 King-Sun Fu (1970) Learning control systems—review and outlook. IEEE Trans Automat Contr 15:210–221. https://doi.org/10.1109/TAC.1970.1099405 Kordestani JK, Firouzjaee HA, Meybodi MR (2018) An adaptive bi-flight cuckoo search with variable nests for continuous dynamic optimization problems. Appl Intell 48:97–117. https://doi. org/10.1007/s10489-017-0963-7 Krishna K (1993) Cellular learning automata: a stochastic model for adaptive controllers. Master’s thesis, Department of Electrical Engineering, Indian Institute of Science, Bangalore, India Krishna PV, Misra S, Joshi D, Obaidat MS (2013) Learning automata based sentiment analysis for recommender system on cloud. In: 2013 international conference on computer, information and telecommunication systems (CITS). IEEE, pp 1–5 Krishna PV, Misra S, Joshi D et al (2014) Secure socket layer certificate verification: a learning automata approach. Secur Commun Netw 7:1712–1718. https://doi.org/10.1002/sec.867 Kumar N, Misra S, Obaidat MS (2015a) Collaborative learning automata-based routing for rescue operations in dense urban regions using vehicular sensor networks. IEEE Syst J 9:1081–1090. https://doi.org/10.1109/JSYST.2014.2335451 Kumar NN, Lee JH, Rodrigues JJ (2015b) Intelligent mobile video surveillance system as a Bayesian coalition game in vehicular sensor networks: learning automata approach. IEEE Trans Intell Transp Syst 16:1148–1161. 
https://doi.org/10.1109/TITS.2014.2354372 Kushner HJ, Huang H (1981) Averaging methods for the asymptotic analysis of learning and adaptive systems, with small adjustment rate. SIAM J Control Optim 19:635–650. https://doi. org/10.1137/0319040 Lakshmivarahan S (1981) Learning algorithms theory and applications. Springer, New York Lakshmivarahan S, Thathachar MAL (1973) Absolute expedient algorithms for stochastic automata. IEEE Trans Syst Man Cybern SMC-3:281–286


Lakshmivarahan S, Thathachar MAL (1976a) Absolute expediency of Q-and S-model learning algorithms. IEEE Trans Syst Man Cybern SMC-6:222–226 Lakshmivarahan S, Thathachar MAL (1976b) Bounds on the convergence probabilities of learning automata. IEEE Trans Syst Man, Cybern A Syst Humans 6:756–763 Lanctot JK, Oommen BJ (1992) Discretized estimator learning automata. IEEE Trans Syst Man Cybern 22:1473–1483. https://doi.org/10.1109/21.199471 Li W, Ozcan E, John R (2018) A learning automata based multiobjective hyper-heuristic. IEEE Trans Evol Comput 1–1. https://doi.org/10.1109/tevc.2017.2785346 Lingam G, Rout RR, Somayajulu D (2018) Learning automata-based trust model for user recommendations in online social networks. Comput Electr Eng 66:174–188. https://doi.org/10.1016/ j.compeleceng.2017.10.017 Mahdaviani M, Kordestani Jk, Rezvanian A, Meybodi MR (2015) LADE: learning automata based differential evolution. Int J Artif Intell Tools 24:1550023. https://doi.org/10.1142/ S0218213015500232 Meybodi MR, Lakshmivarahan S (1982) ε-optimality of a general class of learning algorithms. Inf Sci (NY) 28:1–20. https://doi.org/10.1016/0020-0255(82)90029-9 Meybodi MR, Lakshmivarahan S (1984) On a class of learning algorithms which have a symmetric behavior under success and failure. Lecture Notes in Statistics. Springer, Berlin, pp 145–155 Misra S, Interior B, Kumar N et al (2014) Networks of learning automata for the vehicular environment: a performance analysis study. IEEE Wirel Commun 21:41–47. https://doi.org/10.1109/ MWC.2014.7000970 Mollakhalili Meybodi MR, Meybodi MR (2014) Extended distributed learning automata. Appl Intell 41:923–940. https://doi.org/10.1007/s10489-014-0577-2 Moradabadi B, Meybodi MR (2016) Link prediction based on temporal similarity metrics using continuous action set learning automata. Phys A Stat Mech its Appl 460:361–373. https://doi. org/10.1016/j.physa.2016.03.102 Moradabadi B, Meybodi MR (2017a) A novel time series link prediction method: learning automata approach. Phys A Stat Mech its Appl 482:422–432. https://doi.org/10.1016/j.physa.2017.04.019 Moradabadi B, Meybodi MR (2017b) Link prediction in fuzzy social networks using distributed learning automata. Appl Intell 47:837–849. https://doi.org/10.1007/s10489-017-0933-0 Moradabadi B, Meybodi MR (2018a) Link prediction in weighted social networks using learning automata. Eng Appl Artif Intell 70:16–24. https://doi.org/10.1016/j.engappai.2017.12.006 Moradabadi B, Meybodi MR (2018b) Link prediction in stochastic social networks: learning automata approach. J Comput Sci 24:313–328. https://doi.org/10.1016/j.jocs.2017.08.007 Moradabadi B, Meybodi MR (2018c) Wavefront cellular learning automata. Chaos 28:21101. https://doi.org/10.1063/1.5017852 Morshedlou H, Meybodi MR (2014) Decreasing impact of SLA violations: a proactive resource allocation approach for cloud computing environments. IEEE Trans Cloud Comput 2:156–167. https://doi.org/10.1109/TCC.2014.2305151 Morshedlou H, Meybodi MR (2018) A new learning automata based approach for increasing utility of service providers. Int J Commun Syst 31:e3459. https://doi.org/10.1002/dac.3459 Mostafaei H (2018) Energy-efficient algorithm for reliable routing of wireless sensor networks. IEEE Trans Ind Electron 1–1. https://doi.org/10.1109/tie.2018.2869345 Mostafaei H, Obaidat MS (2018a) Learning automaton-based self-protection algorithm for wireless sensor networks. IET Netw 7:353–361. 
https://doi.org/10.1049/iet-net.2018.0005 Mostafaei H, Obaidat MS (2018b) A distributed efficient algorithm for self-protection of wireless sensor networks. In: 2018 IEEE international conference on communications (ICC). IEEE, pp 1–6 Mousavian A, Rezvanian A, Meybodi MR (2013) Solving minimum vertex cover problem using learning automata. In: 13th Iranian conference on fuzzy systems (IFSC 2013), pp 1–5 Mousavian A, Rezvanian A, Meybodi MR (2014) Cellular learning automata based algorithm for solving minimum vertex cover problem. In: 2014 22nd Iranian conference on electrical engineering (ICEE). IEEE, pp 996–1000


Narendra KS, Thathachar MAL (1974) Learning automata—a survey. IEEE Trans Syst Man Cybern SMC-4:323–334. https://doi.org/10.1109/tsmc.1974.5408453 Narendra KS, Thathachar MAL (1989) Learning automata: an introduction. Prentice-Hall Norman MF (1972) Markovian process and learning models. Academic Press, New York Norman MF (1974) Markovian learning processes. SIAM Rev 16:143–162. https://doi.org/10.1137/ 1016025 Oommen BJ (1986) Absorbing and ergodic discretized two-action learning automata. Syst Man Cybern IEEE Trans 16:282–293. https://doi.org/10.1109/TSMC.1986.4308951 Oommen B (1987) Ergodic learning automata capable of incorporating a priori information. IEEE Trans Syst Man Cybern 17:717–723. https://doi.org/10.1109/TSMC.1987.289367 Oommen BJ, Christensen JPR (1988) epsilon-optimal discretized linear reward-penalty learning automata. IEEE Trans Syst Man Cybern 18:451–458. https://doi.org/10.1109/21.7494 Oommen BJ, Hansen E (1984) The asymptotic optimality of discretized linear reward-inaction learning automata. IEEE Trans Syst Man Cybern SMC-14:542–545. https://doi.org/10.1109/ tsmc.1984.6313256 Oommen BJ, Lanctôt JK (1990) Discretized pursuit learning automata. IEEE Trans Syst Man Cybern 20:931–938. https://doi.org/10.1109/21.105092 Oommen BJ, Thathachar MAL (1985) Multiaction learning automata processing ergodicity of the mean. Int J Syst Sci 35:183–198 Papadimitriou GI, Vasilakos AV, Papadimitriou GI, Paximadis CT (1991) A new approach to the design of reinforcement schemes for learning automata: stochastic estimator learning algorithms. In: Conference proceedings 1991 IEEE international conference on systems, man, and cybernetics. IEEE, pp 1387–1392 Papadimitriou GI, Pomportsis AS, Kiritsi S, Talahoupi E (2002) Absorbing stochastic estimator learning algorithms with high accuracy and rapid convergence. In: Proceedings ACS/IEEE international conference on computer systems and applications. IEEE Computer Society, pp 45–51 Parvanak AR, Jahanshahi M, Dehghan M (2018) A cross-layer learning automata based gateway selection method in multi-radio multi-channel wireless mesh networks. Computing. https://doi. org/10.1007/s00607-018-0648-z Poznyak S, Najim K (1997) On nonlinear reinforcement schemes. IEEE Trans Automat Control 42:1002–1004. https://doi.org/10.1109/9.599982 Qavami HR, Jamali S, Akbari MK, Javadi B (2017) A learning automata based dynamic resource provisioning in cloud computing environments. In: 2017 18th international conference on parallel and distributed computing, applications and technologies (PDCAT). IEEE, pp 502–509 Rahmani P, Javadi HHS, Bakhshi H, Hosseinzadeh M (2018) TCLAB: a new topology control protocol in cognitive MANETs based on learning automata. J Netw Syst Manag 26:426–462. https://doi.org/10.1007/s10922-017-9422-3 Rahmanian AA, Ghobaei-Arani M, Tofighy S (2018) A learning automata-based ensemble resource usage prediction algorithm for cloud computing environment. Future Gener Comput Syst 79:54–71. https://doi.org/10.1016/j.future.2017.09.049 Ren J, Wu G, Su X et al (2018) Learning automata-based data aggregation tree construction framework for cyber-physical systems. IEEE Syst J 12:1467–1479. https://doi.org/10.1109/JSYST. 2015.2507577 Rezapoor Mirsaleh M, Meybodi MR (2015) A learning automata-based memetic algorithm. Genet Program Evol Mach 16:399–453. https://doi.org/10.1007/s10710-015-9241-9 Rezapoor Mirsaleh M, Meybodi MR (2018a) Assignment of cells to switches in cellular mobile network: a learning automata-based memetic algorithm. 
Appl Intell 48:3231–3247. https://doi. org/10.1007/s10489-018-1136-z Rezapoor Mirsaleh M, Meybodi MR (2018b) A Michigan memetic algorithm for solving the vertex coloring problem. J Comput Sci 24:389–401. https://doi.org/10.1016/j.jocs.2017.10.005 Rezapoor Mirsaleh M, Meybodi MR (2018c) Balancing exploration and exploitation in memetic algorithms: a learning automata approach. Comput Intell 34:282–309. https://doi.org/10.1111/ coin.12148


Rezvanian A, Meybodi MR (2010) Tracking extrema in dynamic environments using a learning automata-based immune algorithm. Communications in computer and information science. Springer, Berlin, pp 216–225 Rezvanian A, Meybodi MR (2015a) Finding maximum clique in stochastic graphs using distributed learning automata. Int J Uncertain Fuzziness Knowl-Based Syst 23:1–31. https://doi.org/10.1142/ S0218488515500014 Rezvanian A, Meybodi MR (2015b) Finding minimum vertex covering in stochastic graphs: a learning automata approach. Cybern Syst 46:698–727. https://doi.org/10.1080/01969722.2015. 1082407 Rezvanian A, Meybodi MR (2016a) Stochastic social networks: measures and algorithms. LAP LAMBERT Academic Publishing Rezvanian A, Meybodi MR (2016b) Stochastic graph as a model for social networks. Comput Human Behav 64:621–640. https://doi.org/10.1016/j.chb.2016.07.032 Rezvanian A, Meybodi MR (2017a) Sampling algorithms for stochastic graphs: a learning automata approach. Knowl-Based Syst 127:126–144. https://doi.org/10.1016/j.knosys.2017.04.012 Rezvanian A, Meybodi MR (2017b) A new learning automata-based sampling algorithm for social networks. Int J Commun Syst 30:e3091. https://doi.org/10.1002/dac.3091 Rezvanian A, Rahmati M, Meybodi MR (2014) Sampling from complex networks using distributed learning automata. Phys A Stat Mech Appl 396:224–234. https://doi.org/10.1016/j.physa.2013. 11.015 Rezvanian A, Saghiri AM, Vahidipour SM et al (2018a) Recent advances in learning automata. Springer Rezvanian A, Saghiri AM, Vahidipour SM et al (2018b) Learning automata for cognitive peer-topeer networks. In: Recent advances in learning automata, pp 221–278 Rezvanian A, Saghiri AM, Vahidipour SM et al (2018c) Learning automata for wireless sensor networks. In: Recent advances in learning automata, pp 91–219 Rezvanian A, Vahidipour SM, Esnaashari M (2018d) New applications of learning automata-based techniques in real-world environments. J Comput Sci 24:287–289. https://doi.org/10.1016/j.jocs. 2017.11.012 Saghiri AM, Meybodi MR (2016) An approach for designing cognitive engines in cognitive peerto-peer networks. J Netw Comput Appl 70:17–40. https://doi.org/10.1016/j.jnca.2016.05.012 Saghiri AM, Meybodi MR (2017) A distributed adaptive landmark clustering algorithm based on mOverlay and learning automata for topology mismatch problem in unstructured peer-to-peer networks. Int J Commun Syst 30:e2977. https://doi.org/10.1002/dac.2977 Saghiri AM, Meybodi MR (2018) An adaptive super-peer selection algorithm considering peers capacity utilizing asynchronous dynamic cellular learning automata. Appl Intell 48:271–299. https://doi.org/10.1007/s10489-017-0946-8 Sato T (1999) On some asymptotic properties of learning automaton networks Sawaragi Y, Baba N (1973) A note on the learning behavior of variable-structure stochastic automata. IEEE Trans Syst Man Cybern SMC-3:644–647. https://doi.org/10.1109/tsmc.1973.4309320 Sawaragi Y, Baba N (1974) Two ε-optimal nonlinear reinforcement schemes for stochastic automata. IEEE Trans Syst Man Cybern SMC-4:126–131. https://doi.org/10.1109/tsmc.1974.5408538 Seyyedi SH, Minaei-Bidgoli B (2018) Estimator learning automata for feature subset selection in high-dimensional spaces, case study: email spam detection. Int J Commun Syst 31:e3541. https:// doi.org/10.1002/dac.3541 Sikeridis D, Tsiropoulou EE, Devetsikiotis M, Papavassiliou S (2018) Socio-physical energyefficient operation in the internet of multipurpose things. In: 2018 IEEE international conference on communications (ICC). 
IEEE, pp 1–7 Simha R, Kurose JF (1989) Relative reward strength algorithms for learning automata. IEEE Trans Syst Man Cybern 19:388–398. https://doi.org/10.1109/21.31041 Sohrabi MK, Roshani R (2017) Frequent itemset mining using cellular learning automata. Comput Human Behav 68:244–253. https://doi.org/10.1016/j.chb.2016.11.036


Soleimani-Pouri M, Rezvanian A, Meybodi MR (2012) Solving maximum clique problem in stochastic graphs using learning automata. In: 2012 fourth international conference on computational aspects of social networks (CASoN). IEEE, pp 115–119 Su Y, Qi K, Di C et al (2018) Learning automata based feature selection for network traffic intrusion detection. In: 2018 IEEE third international conference on data science in cyberspace (DSC). IEEE, pp 622–627 Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press, Cambridge Thathachar MAL, Harita BR (1987) Learning automata with changing number of actions. IEEE Trans Syst Man Cybern 17:1095–1100. https://doi.org/10.1109/TSMC.1987.6499323 Thathachar MAL, Oommen BJ (1983) Learning automata processing ergodicity of the mean: the two-action case. IEEE Trans Syst Man Cybern SMC-13:1143–1148. https://doi.org/10.1109/ tsmc.1983.6313191 Thathachar MAL, Ramachandran KM (1984) Asymptotic behaviour of a learning algorithm. Int J Control 39:827–838. https://doi.org/10.1080/00207178408933209 Thathachar MAL, Sastry PS (1985a) A new approach to the design of reinforcement schemes for learning automata. IEEE Trans Syst Man Cybern SMC-15:168–175. https://doi.org/10.1109/ tsmc.1985.6313407 Thathachar MAL, Sastry PS (1985b) A class of rapidly converging algorithms for learning automata. IEEE Trans Syst Man Cybern SMC-15:168–175 Thathachar M, Sastry P (1986) Estimator algorithms for learning automata. In: Proceedings of the platinum jubilee conference on systems and signal processing, Bangalore, India. Bangalore, India Thathachar MALAL, Sastry PSS (1987) A hierarchical system of learning automata that can learn die globally optimal path. Inf Sci (NY) 42:143–166. https://doi.org/10.1016/00200255(87)90021-1 Thathachar MAL, Sastry PS (2002) Varieties of learning automata: an overview. IEEE Trans Syst Man Cybern Part B Cybern 32:711–722. https://doi.org/10.1109/TSMCB.2002.1049606 Thathachar MAL, Sastry PS (2003) Networks of learning automata: techniques for online stochastic optimization. Springer, Boston Toffolo TAM, Christiaens J, Van Malderen S et al (2018) Stochastic local search with learning automaton for the swap-body vehicle routing problem. Comput Oper Res 89:68–81. https://doi. org/10.1016/j.cor.2017.08.002 Toozandehjani H, Zare-Mirakabad M-R, Derhami V (2014) Improvement of recommendation systems based on cellular learning automata. In: 2014 4th international conference on computer and knowledge engineering (ICCKE). IEEE, pp 592–597 Tsetlin ML (1962) On the behavior of finite automata in random media. Autom Remote Control 22:1210–1219 Vafashoar R, Meybodi MR (2016) Multi swarm bare bones particle swarm optimization with distribution adaption. Appl Soft Comput J 47:534–552. https://doi.org/10.1016/j.asoc.2016.06.028 Vahidipour SM, Meybodi MR, Esnaashari M (2017a) Adaptive Petri net based on irregular cellular learning automata with an application to vertex coloring problem. Appl Intell 46:272–284. https:// doi.org/10.1007/s10489-016-0831-x Vahidipour SM, Meybodi MR, Esnaashari M (2017b) Finding the shortest path in stochastic graphs using learning automata and adaptive stochastic petri nets. Int J Uncertain Fuzziness Knowl-Based Syst 25:427–455. https://doi.org/10.1142/S0218488517500180 Vahidipour SM, Esnaashari M, Rezvanian A, Meybodi MR (2019) GAPN-LA: a framework for solving graph problems using Petri nets and learning automata. Eng Appl Artif Intell 77:255–267. 
https://doi.org/10.1016/j.engappai.2018.10.013 Vasilakos AV, Paximadis CT (1994) Faulttolerant routing algorithms using estimator discretized learning automata for high-speed packet-switched networks. IEEE Trans Reliab 43:582–593. https://doi.org/10.1109/24.370222 Velusamy G, Lent R (2018) Dynamic cost-aware routing of web requests. Future Internet 10:57. https://doi.org/10.3390/fi10070057


Verbeeck K, Nowé A, Nowe A (2002) Colonies of learning automata. IEEE Trans Syst Man Cybern Part B Cybern 32:772–780. https://doi.org/10.1109/TSMCB.2002.1049611 Viswanathan R, Narendra KS (1972) A note on the linear reinforcement scheme for variablestructure stochastic automata. IEEE Trans Syst Man Cyberen I:292–294. https://doi.org/10.1109/ TSMC.1972.4309112 Willianms RJ (1988) Toward a theory of reinforcement-learning connectionist systems. Northeastern University Zhao Y, Jiang W, Li S et al (2015) A cellular learning automata based algorithm for detecting community structure in complex networks. Neurocomputing 151:1216–1226. https://doi.org/10. 1016/j.neucom.2014.04.087

Chapter 2

Wavefront Cellular Learning Automata: A New Learning Paradigm

2.1 Introduction

Before describing the wavefront cellular learning automaton (WCLA) (Moradabadi and Meybodi 2018), we give a brief overview of cellular learning automaton (CLA) models.

2.1.1 Cellular Learning Automata

A cellular learning automaton (CLA) (Mason and Gu 1986) is a combination of a cellular automaton (CA) (Packard and Wolfram 1985) and a learning automaton (LA) (Narendra and Thathachar 1989). The basic idea of the CLA is to use LAs to adjust the state transition probabilities of a stochastic CA. This model, which opens a new learning paradigm, is superior to a CA because of its ability to learn, and is also superior to a single LA because it consists of a collection of LAs interacting with each other (Beigy and Meybodi 2004). A CLA is a CA in which a number of LAs are assigned to every cell. Each LA residing in a particular cell determines its action (state) on the basis of its action probability vector. Like a CA, there is a local rule under which the CLA operates. The local rule of the CLA and the actions selected by the neighboring LAs of any particular LA determine the reinforcement signal to that LA. The neighboring LAs (cells) of any particular LA (cell) constitute the local environment of that LA (cell). The local environment of an LA (cell) is non-stationary because the action probability vectors of the neighboring LAs vary during the evolution of the CLA. The operation of a CLA can be described by the following steps (Fig. 2.1). In the first step, the internal state of every cell is determined on the basis of the action probability vectors of the LAs residing in that cell. In the second step, the local rule of the CLA determines the reinforcement signal to each LA residing in that cell. Finally, each LA updates its action probability vector based on the supplied reinforcement signal and the chosen action. This process continues until the desired result is obtained.
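As a concrete illustration of these three steps, the sketch below runs one synchronous iteration of a two-action CLA on a ring of cells. The $L_{RI}$ update and the agree-with-a-neighbor local rule are illustrative assumptions, not a rule prescribed by the text.

```python
import random

def cla_step(cells, a=0.05):
    """One synchronous CLA iteration on a ring; cells[i] is the action
    probability vector of the LA in cell i."""
    n = len(cells)
    # Step 1: every LA selects an action from its probability vector.
    acts = [random.choices(range(len(p)), weights=p)[0] for p in cells]
    for i, p in enumerate(cells):
        # Step 2: local rule -> reinforcement signal; here the action is
        # rewarded when it matches at least one neighbour (illustrative only).
        rewarded = acts[i] in (acts[(i - 1) % n], acts[(i + 1) % n])
        # Step 3: L_RI update of the action probability vector on reward.
        if rewarded:
            k = acts[i]
            cells[i] = [q + a * (1 - q) if j == k else q * (1 - a)
                        for j, q in enumerate(p)]
    return acts

# cells = [[0.5, 0.5] for _ in range(10)]
# for _ in range(200): cla_step(cells)   # cells drift toward local agreement
```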



Fig. 2.1 Operation of the cellular learning automaton (CLA)

A CLA can be either synchronous or asynchronous. In a synchronous CLA, the LAs in all cells are activated at the same time, synchronously, using a global clock, whereas in an asynchronous CLA (ACLA) (Beigy and Meybodi 2008), the LAs in different cells are activated asynchronously. The LAs may be activated in either a time-driven or a step-driven manner. In a time-driven ACLA, each cell is assumed to have an internal clock that wakes up the LAs associated with that cell. In a step-driven ACLA, a cell is selected for activation in either a random or a fixed sequence. From another point of view, a CLA can be either closed or open. In a closed CLA, the action selection of any particular LA in the next iteration of its evolution depends only on the state of the local environment of that LA (the actions of its neighboring LAs), whereas in an open CLA (Beigy and Meybodi 2007), it depends not only on the local environment but also on external environments. In (Beigy and Meybodi 2010), a new type of CLA, called CLA with multiple LAs in each cell, has been introduced. This model is suitable for applications such as channel assignment in cellular networks, in which each cell needs to be equipped with multiple LAs. In (Beigy and Meybodi 2004), a mathematical framework for studying the behavior of the CLA has been introduced; it was shown that, for a class of local rules called commutative rules, different models of CLA converge to a globally stable state (Beigy and Meybodi 2004, 2007, 2008, 2010).

Definition 2.1 A d-dimensional cellular learning automaton is a structure $\mathcal{A} = (Z^d, N, \Phi, A, F)$ (Beigy and Meybodi 2004), where

• $Z^d$ is the lattice of d-tuples of integer numbers.
• $N = \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_{\bar{m}}\}$ is a finite subset of $Z^d$ called the neighborhood vector, where $\bar{x}_i \in Z^d$.
• $\Phi$ denotes the finite set of states; $\varphi_i$ denotes the state of cell $c_i$.
• $A$ is the set of LAs residing in the cells of the CLA.


• $F_i : \Phi^{\bar{m}} \to \beta$ defines the local rule of the CLA for each cell $c_i$, where $\beta$ is the set of possible values of the reinforcement signal; it computes the reinforcement signal for each LA from the actions chosen by the neighboring LAs.

In the rest of this section, we briefly review the classifications of CLA:

• Static CLA versus dynamic CLA: in static CLAs, the cellular structure of the CLA remains fixed during its evolution, while in dynamic CLAs one of its aspects, such as the structure, local rule, or neighborhood radius, may vary over time (Esnaashari and Meybodi 2011, 2013).
• Open CLA versus closed CLA: in closed CLAs, the action of each LA depends on the neighboring cells, whereas in open CLAs, the action of each LA depends on the neighboring cells, on a global environment that influences all cells, and on an exclusive environment for each particular cell (Beigy and Meybodi 2007; Saghiri and Meybodi 2017b).
• Asynchronous CLA versus synchronous CLA: in a synchronous CLA, all cells apply their local rules at the same time; this model assumes an external clock that triggers synchronous events for the cells. In an asynchronous CLA, at a given time only some cells are activated, and the states of the remaining cells stay unchanged (Beigy and Meybodi 2007). The LAs may be activated either in a time-driven manner, where each cell is assumed to have an internal clock that wakes up the LA associated with that cell, or in a step-driven manner, where a cell is selected in a fixed or random sequence.
• Regular CLA versus irregular CLA: in regular CLAs, the structure is represented as a lattice of d-tuples of integer numbers, while in irregular CLAs (ICLA) the regularity assumption on the structure is replaced with an undirected graph (Ghavipour and Meybodi 2017; Vahidipour et al. 2017; Esnaashari and Meybodi 2018).
• CLAs with one LA in each cell versus CLAs with multiple LAs in each cell: in conventional CLAs, each cell is equipped with one LA, while in CLAs with multiple LAs in each cell, each cell is equipped with several LAs (Beigy and Meybodi 2010).
• CLAs with a fixed number of LAs in each cell versus CLAs with a varying number of LAs in each cell: in conventional CLAs, the number of LAs in each cell remains fixed during the evolution of the CLA, while in CLAs with a varying number of LAs in each cell, the number of LAs in each cell changes over time (Saghiri and Meybodi 2017b).
• CLAs with fixed-structure LAs versus CLAs with variable-structure LAs: LAs can be classified into two main families, fixed structure and variable structure (Narendra and Thathachar 1989; Thathachar and Sastry 2003). In CLAs with fixed-structure LAs, the constituent LAs are of the fixed-structure type, whereas in CLAs with variable-structure LAs, they are of the variable-structure type.


2.1.1.1 Asynchronous Cellular Learning Automata

A cellular learning automaton is called asynchronous if, at a given time, only some LAs are activated independently of each other, rather than all together in parallel. The asynchronous CLA (ACLA) (Beigy and Meybodi 2008) requires the specification of an order in which the learning automata are activated. The learning automata may be activated in either a time-driven or a step-driven manner. In a time-driven ACLA, each cell is assumed to have an internal clock that wakes up the learning automaton associated with that cell, while in a step-driven ACLA, a cell is selected in a fixed sequence or at random. In other words, in step-driven activation an algorithm determines the order in which the learning automata are activated, while in time-driven activation an algorithm assigns to every learning automaton an explicit point in time at which that learning automaton will be activated next. In an ACLA, the trajectory starting from a given initial configuration in general depends on the activation order of the learning automata. The asynchronous CLA in which cells are selected randomly is of most interest to us because of its application to cellular mobile networks. Formally, a d-dimensional asynchronous step-driven CLA is defined below.

Definition 2.2 A d-dimensional asynchronous step-driven cellular learning automaton is a structure $\mathcal{A} = (Z^d, \Phi, A, N, F, \rho)$, where

• $Z^d$ is a lattice of d-tuples of integer numbers.
• $\Phi$ is a finite set of states.
• $A$ is the set of LAs, each of which is assigned to one cell of the CA.
• $N = \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_{\bar{m}}\}$ is a finite subset of $Z^d$ called the neighborhood vector, where $\bar{x}_i \in Z^d$.
• $F : \Phi^{\bar{m}} \to \beta$ is the local rule of the cellular learning automaton, where $\beta$ is the set of values that the reinforcement signal can take.
• $\rho$ is an n-dimensional vector called the activation probability vector, where $\rho_i$ is the probability that the LA in cell i (for $i = 1, 2, \ldots, n$) is activated in each stage.

In what follows, we consider a CLA with n cells and neighborhood function $\bar{N}(i)$. A learning automaton denoted by $A_i$, which has a finite action set $\underline{\alpha}_i$, is associated with cell i (for $i = 1, 2, \ldots, n$) of the CLA. Let the cardinality of $\underline{\alpha}_i$ be $m_i$, and let the state of the CLA be represented by $p = (p^1, p^2, \ldots, p^n)$, where $p^i = (p_{i1}, \ldots, p_{im_i})$ is the action probability vector of $A_i$. It is evident that the local environment of each learning automaton consists of the learning automata residing in its neighboring cells. From the repeated application of simple local rules and simple learning algorithms, the global behavior of the CLA can be very complex. The operation of an asynchronous CLA (ACLA) takes place as the following iterations. At iteration k, each learning automaton $A_i$ is activated with probability $\rho_i$, and the activated learning automata choose one of their actions. The activated automata use their current actions to execute the rule (computing the reinforcement signal). The actions of the neighboring cells of an activated cell are their most recently selected


actions. Let $\alpha_i \in \underline{\alpha}_i$ and $\beta_i \in \beta$ be the action chosen by the activated automaton $A_i$ and the reinforcement signal received by $A_i$, respectively. This reinforcement signal is produced by the application of the local rule $F_i(\alpha_{i+\bar{x}_1}, \alpha_{i+\bar{x}_2}, \ldots, \alpha_{i+\bar{x}_{\bar{m}}}) \to \beta$, where $F_i$ is the local rule of cell i. A higher value of $\beta_i$ means that the chosen action of $A_i$ is rewarded more. Finally, the activated learning automata update their action probability vectors, and the process repeats. Based on the set $\beta$, CLA can be classified into three groups: P-model, Q-model, and S-model cellular learning automata. When $\beta = \{0, 1\}$, we refer to the ACLA as a P-model cellular learning automaton; when $\beta = \{b_1, \ldots, b_l\}$ (for $l < \infty$), as a Q-model cellular learning automaton; and when $\beta = [b_1, b_2]$, as an S-model cellular learning automaton. If learning automaton $A_i$ uses learning algorithm $L_i$, we denote the CLA by $CLA(L_1, \ldots, L_n)$; if $L_i = L$ for all $i = 1, 2, \ldots, n$, we denote it by $CLA(L)$. Since each set $\underline{\alpha}_i$ is finite, the local rule $F_i(\alpha_{i+\bar{x}_1}, \alpha_{i+\bar{x}_2}, \ldots, \alpha_{i+\bar{x}_{\bar{m}}}) \to \beta$ can be represented by a hypermatrix of dimensions $m_1 \times m_2 \times \cdots \times m_{\bar{m}}$. These n hypermatrices together constitute the rule of the ACLA. When all the hypermatrices are equal, the rule is uniform; otherwise it is non-uniform. For simplicity of presentation, the local rule $F_i(\alpha_{i+\bar{x}_1}, \alpha_{i+\bar{x}_2}, \ldots, \alpha_{i+\bar{x}_{\bar{m}}})$ is denoted by $F_i(\alpha_1, \alpha_2, \ldots, \alpha_{\bar{m}})$.
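Compared with the synchronous sketch given earlier, only the activation schedule changes: each LA wakes with probability $\rho_i$, and sleeping cells contribute their most recently selected actions to the local rule. Again a sketch under assumed choices (ring neighborhood, illustrative rule, $L_{RI}$ update), not the definitive model.

```python
import random

def acla_step(cells, rho, last_acts, a=0.05):
    """One step-driven ACLA iteration on a ring; rho[i] is the activation
    probability of the LA in cell i, last_acts[i] its most recent action."""
    n = len(cells)
    active = [i for i in range(n) if random.random() < rho[i]]
    for i in active:                       # only activated LAs choose anew
        last_acts[i] = random.choices(range(len(cells[i])), weights=cells[i])[0]
    for i in active:
        # Local rule over the latest actions of the neighbourhood (inactive
        # neighbours keep their previous choice), then an L_RI update.
        if last_acts[i] in (last_acts[(i - 1) % n], last_acts[(i + 1) % n]):
            k = last_acts[i]
            cells[i] = [q + a * (1 - q) if j == k else q * (1 - a)
                        for j, q in enumerate(cells[i])]
    return active

# cells = [[0.5, 0.5] for _ in range(10)]; last_acts = [0] * 10
# for _ in range(500): acla_step(cells, [0.3] * 10, last_acts)
```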

2.1.1.2 Open Synchronous Cellular Learning Automata

The CLA studied so far are closed, because they do not take into account the interaction between the CLA and external environments. In this section, a new class of CLA called open CLA, first introduced by Beigy and Meybodi (2007), is presented. In an open synchronous CLA (OSCLA), the evolution of the CLA is influenced by external environments. Two types of environments can be considered in an open CLA: the global environment and the exclusive environments. Each CLA has one global environment that influences all cells, and an exclusive environment for each particular cell. The operation of an open CLA takes place as iterations of the following steps. At iteration k, each learning automaton chooses one of its actions. Let $\alpha_i$ be the action chosen by learning automaton $A_i$. The actions of all learning automata are applied to their corresponding local environments (neighboring learning automata) as well as to the global environment and their corresponding exclusive environments. Then all learning automata receive their reinforcement signals, each of which is a combination of the responses of the local, global, and exclusive environments; these responses are combined using the local rule. Finally, all learning automata update their action probability vectors based on the received reinforcement signals. Note that the local environment of each learning automaton is non-stationary, while the global and exclusive environments may be stationary or non-stationary. We now present the convergence result for the open CLA, which ensures convergence to one compatible configuration if the CLA has more than one compatible configuration.

Definition 2.3 A d-dimensional open cellular learning automaton is a structure $\mathcal{A} = (Z^d, \Phi, A, E^G, E^E, N, F)$, where


• $Z^d$ is a lattice of d-tuples of integer numbers.
• $\Phi$ is a finite set of states.
• $A$ is the set of LAs, each of which is assigned to one cell of the CA.
• $E^G$ is the global environment.
• $E^E = \{E^E_1, E^E_2, \ldots, E^E_n\}$ is the set of exclusive environments, where $E^E_i$ is the exclusive environment of cell i.
• $N = \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_{\bar{m}}\}$ is a finite subset of $Z^d$ called the neighborhood vector, where $\bar{x}_i \in Z^d$.
• $F : \Phi^{\bar{m}} \times O(G) \times O(E) \to \beta$ is the local rule of the cellular learning automaton, where $O(G)$ and $O(E)$ are the sets of signals of the global and exclusive environments, respectively.

In what follows, we consider an open CLA with n cells and neighborhood function $\bar{N}(i)$. A learning automaton denoted by $A_i$, with finite action set $\underline{\alpha}_i$, is associated with cell i (for $i = 1, 2, \ldots, n$) of the open CLA. Let the cardinality of $\underline{\alpha}_i$ be $m_i$, and let the state of the open CLA be represented by $p = (p^1, p^2, \ldots, p^n)$, where $p^i = (p_{i1}, \ldots, p_{im_i})$ is the action probability vector of $A_i$. It is evident that the local environment of each learning automaton consists of the learning automata residing in its neighboring cells. From the repeated application of simple local rules and simple learning algorithms, the global behavior of the CLA can be very complex. The operation of the open synchronous CLA takes place as iterations of the following steps. At iteration k, each learning automaton chooses one of its actions. Let $\alpha_i$ be the action chosen by learning automaton $A_i$. The actions of all learning automata are applied to their corresponding local environments (neighboring learning automata) as well as to the global environment and their corresponding exclusive environments. Each environment produces a signal; these signals are then used by the local rule to generate a reinforcement signal for the learning automaton residing in every cell. Finally, all learning automata update their action probability vectors based on the received reinforcement signal. The local environment of each learning automaton is non-stationary, while the global and exclusive environments may be stationary or non-stationary; in this study, we assume that the global and exclusive environments are stationary. The convergence result for the open synchronous CLA asserts that it converges to one of its compatible configurations. Based on the set $\beta$, the OSCLA can be classified into three groups: when $\beta = \{0, 1\}$, we refer to the OSCLA as a P-model cellular learning automaton; when $\beta = \{b_1, \ldots, b_l\}$ (for $l < \infty$), as a Q-model cellular learning automaton; and when $\beta = [b_1, b_2]$, as an S-model cellular learning automaton. If learning automaton $A_i$ uses learning algorithm $L_i$, we denote the OSCLA by $OSCLA(L_1, \ldots, L_n)$; if $L_i = L$ for all $i = 1, 2, \ldots, n$, we denote it by $OSCLA(L)$.
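In code, the only new ingredient of an open CLA is that the local rule folds three environment responses into a single reinforcement signal. A minimal S-model sketch, assuming a weighted combination whose weights are our own choice and not part of the OSCLA definition:

```python
def combine_signals(local_beta, global_beta, exclusive_beta,
                    weights=(0.5, 0.3, 0.2)):
    """Fold the responses of the local, global, and exclusive environments
    (each assumed to lie in [0, 1]) into one S-model reinforcement signal."""
    w_local, w_global, w_exclusive = weights
    return (w_local * local_beta + w_global * global_beta
            + w_exclusive * exclusive_beta)
```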


2.1.1.3 Irregular Cellular Learning Automata (ICLA)

Irregular cellular learning automata (ICLA) (Esnaashari and Meybodi 2008) are a generalization of traditional CLA that removes the limitation of a rectangular grid structure. Such a generalization is necessary because many applications, such as graph-related problems, social networks, wireless sensor networks, and immune network systems, cannot be modeled with regular grids (Esnaashari and Meybodi 2008, 2009; Rezapoor Mirsaleh and Meybodi 2016). An ICLA is considered as an undirected graph in which each node is a cell equipped with a learning automaton, and the neighboring nodes of any particular node constitute the local environment of that cell. The LA residing in a particular cell determines its state (action) according to its action probability vector. Like the CLA, there is a local rule under which the ICLA operates. The local rule of the ICLA and the actions selected by the neighboring LAs of any particular LA determine the reinforcement signal to that LA. The neighboring LAs of any particular LA constitute the local environment of that LA, which is non-stationary because the action probability vectors of the neighboring LAs vary during the evolution of the ICLA (Fig. 2.2). The operation of the ICLA is similar to that of the CLA. In the first step, the internal state of each cell is specified on the basis of the action probability vector of the LA residing in that cell. In the second step, the rule of the ICLA determines the reinforcement signal to the LA residing in each cell. Finally, each LA updates its action probability vector on the basis of the supplied reinforcement signal and the internal state of the cell. This process continues until the desired result is obtained. Formally, an ICLA is defined as follows.

Definition 2.4 An irregular cellular learning automaton is a structure $\mathcal{A} = (G\langle E, V\rangle, \Phi, A, F)$, where:

Fig. 2.2 Irregular cellular learning automaton




• G is an undirected graph, with V as the set of vertices (cells) and E as the set of edges (adjacency relations).
• $\Phi$ is a finite set of states, and $\varphi_i$ represents the state of cell $c_i$.
• $A$ is the set of LAs, each of which is assigned to one cell of the ICLA.
• $F : \underline{\varphi}_i \to \beta$ is the local rule of the ICLA in each cell $c_i$, where $\underline{\varphi}_i = \{\varphi_j \mid \{i, j\} \in E\} \cup \{\varphi_i\}$ is the set of states of all neighbors of $c_i$, and $\beta$ is the set of values that the reinforcement signal can take. The local rule computes the reinforcement signal to each LA from the current actions selected by the neighboring LAs of that LA.

Note that the definition of the ICLA gives no explicit definition of the neighborhood of each cell; it is implicitly defined by the graph G. In what follows, we consider an ICLA with N cells. The learning automaton $LA_i$, which has a finite action set $\underline{\alpha}_i$, is associated with cell $c_i$ (for $i = 1, 2, \ldots, N$) of the ICLA. Let the cardinality of $\underline{\alpha}_i$ be $m_i$. The operation of the ICLA takes place as the following iterations. At iteration k, each learning automaton selects an action. Let $\alpha_i \in \underline{\alpha}_i$ be the action selected by $LA_i$. Then all learning automata receive a reinforcement signal. Let $\beta_i \in \beta$ be the reinforcement signal received by $LA_i$. This reinforcement signal is produced by the application of the local rule $F_i(\underline{\varphi}_i) \to \beta$. Higher values of $\beta_i$ mean that the selected action of $LA_i$ will receive a higher penalty. Then each $LA_i$ updates its action probability vector on the basis of the supplied reinforcement signal and its selected action $\alpha_i$. Like the CLA, the ICLA can be either synchronous or asynchronous, and an asynchronous ICLA can be either time-driven or step-driven.
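Since the neighborhood now comes from an arbitrary undirected graph rather than a lattice, one ICLA iteration can be sketched with a plain adjacency dictionary. The majority-agreement rule and the $L_{RI}$ update are again illustrative assumptions.

```python
import random

def icla_step(graph, cells, a=0.05):
    """One synchronous ICLA iteration; graph maps each cell to the list of
    its neighbours, cells maps each cell to its action probability vector."""
    acts = {v: random.choices(range(len(p)), weights=p)[0]
            for v, p in cells.items()}            # step 1: select actions
    for v, nbrs in graph.items():
        # step 2: local rule: reward v when it agrees with at least half of
        # its neighbours (an illustrative rule only)
        agree = sum(acts[u] == acts[v] for u in nbrs)
        if nbrs and 2 * agree >= len(nbrs):       # step 3: L_RI on reward
            k = acts[v]
            cells[v] = [q + a * (1 - q) if j == k else q * (1 - a)
                        for j, q in enumerate(cells[v])]
    return acts

# graph = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
# cells = {v: [0.5, 0.5] for v in graph}
# for _ in range(200): icla_step(graph, cells)
```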

2.1.1.4 Dynamic Irregular Cellular Learning Automata

A DICLA is defined as an undirected graph in which each vertex represents a cell, and a learning automaton is assigned to every cell (vertex) (Esnaashari and Meybodi 2018). A finite set of interests is defined for the DICLA. For each cell of the DICLA, a tendency vector is defined whose jth element gives the degree of tendency of that cell toward the jth interest. In a DICLA, the state of each cell consists of two parts: the action selected by the learning automaton and the tendency vector. Two cells are neighbors in a DICLA if the distance between their tendency vectors is smaller than or equal to the neighborhood radius. Like the ICLA, there is a local rule under which the DICLA operates. The local rule of the DICLA and the actions selected by the neighboring LAs of any particular learning automaton $LA_i$ determine the following: (1) the reinforcement signal to the learning automaton $LA_i$, and (2) the restructuring signal to the cell in which $LA_i$ resides. The restructuring signal is used to update the tendency vector of the cell. The dynamicity of a DICLA is the result of modifications made to the tendency vectors of its constituent cells. Figure 2.3 gives a schematic of a DICLA, which is formally defined below.

Definition 2.5 A dynamic irregular cellular learning automaton is a structure $\mathcal{A} = (G\langle V, E\rangle, \Psi, A, \langle\alpha, \psi\rangle, \tau, F, Z)$, where


Fig. 2.3 Dynamic irregular cellular learning automaton (DICLA)

• G is an undirected graph, with V as the set of vertices (cells) and E as the set of edges (adjacency relations).
• $\Psi$ is a finite set of interests, whose cardinality is denoted by $|\Psi|$.
• $A$ is the set of learning automata, each of which is assigned to one cell of the DICLA.
• $\langle\alpha, \psi\rangle$ is the cell state. The state of a cell $c_i$ ($\varphi_i$) consists of two parts: (1) $\alpha_i$, the action selected by the learning automaton of that cell, and (2) a vector $\psi_i = (\psi_{i1}, \psi_{i2}, \ldots, \psi_{i|\Psi|})^T$ called the tendency vector of the cell. Each element $\psi_{ik} \in [0, 1]$ of the tendency vector of cell $c_i$ gives the degree of tendency of $c_i$ toward the interest $\psi_k \in \Psi$.
• $\tau$ is the neighborhood radius. Two cells $c_i$ and $c_j$ of the DICLA are neighbors if $\|\psi_i - \psi_j\| \le \tau$; in other words, two cells are neighbors if the distance between their tendency vectors is smaller than or equal to $\tau$.
• $F : \underline{\varphi}_i \to \langle\beta, [0, 1]^{|\Psi|}\rangle$ is the local rule of the DICLA in each cell $c_i$, where $\underline{\varphi}_i = \{\varphi_j \mid \|\psi_i - \psi_j\| \le \tau\} \cup \{\varphi_i\}$ is the set of states of all neighbors of $c_i$, $\beta$ is the set of values that the reinforcement signal can take, and $[0, 1]^{|\Psi|}$ is a $|\Psi|$-dimensional unit hypercube. From the current states of the neighboring cells of each cell $c_i$, the local rule (1) gives the reinforcement signal to the learning automaton $LA_i$ residing in $c_i$, and (2) produces a restructuring signal $\zeta_i = (\zeta_{i1}, \zeta_{i2}, \ldots, \zeta_{i|\Psi|})^T$ that is used to change the tendency vector of $c_i$. Each element $\zeta_{ij}$ of the restructuring signal is a scalar value within the closed interval [-1, 1].


• $Z : [0, 1]^{|\Psi|} \times [-1, 1]^{|\Psi|} \to [0, 1]^{|\Psi|}$ is the restructuring function, which modifies the tendency vector of the cell $c_i$ using the restructuring signal produced by the local rule of the cell.

In what follows, we consider a DICLA with N cells. The learning automaton $LA_i$, which has a finite action set $\underline{\alpha}_i$, is associated with cell $c_i$ (for $i = 1, 2, \ldots, N$) of the DICLA. Let the cardinality of $\underline{\alpha}_i$ be $m_i$. The operation of the DICLA takes place as the following iterations. At iteration k, each learning automaton chooses an action. Let $\alpha_i \in \underline{\alpha}_i$ be the action chosen by $LA_i$. Then each LA receives a reinforcement signal. Let $\beta_i \in \beta$ be the reinforcement signal received by $LA_i$; it is produced by the application of the local rule $F : \underline{\varphi}_i \to \langle\beta, [0, 1]^{|\Psi|}\rangle$. Higher values of $\beta_i$ mean that the selected action of $LA_i$ will receive a higher penalty. Each LA updates its action probability vector on the basis of the supplied reinforcement signal and the action chosen by the cell. Next, each cell $c_i$ updates its tendency vector using the restructuring function: $\psi_i(k + 1) = Z(\psi_i(k), \zeta_i(k))$. Like the ICLA, the DICLA can be either synchronous or asynchronous, and an asynchronous DICLA can be either time-driven or step-driven.
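Two small pieces capture what is new in a DICLA relative to the ICLA: the dynamic neighborhood induced by tendency vectors and the restructuring step $\psi_i(k+1) = Z(\psi_i(k), \zeta_i(k))$. The clipping form of Z and the Euclidean distance below are our assumptions; the definition fixes only the signatures.

```python
import math

def restructure(psi, zeta):
    """An assumed restructuring function Z: add the signal componentwise and
    clip back into [0, 1] so the result is still a valid tendency vector."""
    return [min(1.0, max(0.0, p + z)) for p, z in zip(psi, zeta)]

def neighbours(tendency, i, tau):
    """Cells whose tendency vectors lie within (Euclidean) distance tau of
    cell i's tendency vector; the neighbourhood changes as the vectors move."""
    dist = lambda u, v: math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
    return [j for j in tendency if j != i and dist(tendency[i], tendency[j]) <= tau]

# tendency = {0: [0.1, 0.9], 1: [0.2, 0.8], 2: [0.9, 0.1]}
# neighbours(tendency, 0, 0.3)                        # -> [1]
# tendency[2] = restructure(tendency[2], [-0.5, 0.5]) # cell 2 drifts toward 0
```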

2.1.1.5 Recent Applications of Cellular Learning Automata

In recent years, cellular learning automata have been used in distributed, decentralized, and dynamic environments in which a large amount of uncertainty exists or information about the environment is lacking (Beigy and Meybodi 2004; Rezvanian et al. 2018a, b). For these reasons, CLA have been successfully applied to a wide range of domains and applications, as summarized in Table 2.1, including optimization (Rezvanian and Meybodi 2010a, b; Hasanzadeh et al. 2013; Moradabadi and Beigy 2014; Mahdaviani et al. 2015; Rezapoor Mirsaleh and Meybodi 2015; Vafashoar and Meybodi 2016; Moradabadi et al. 2016), image processing (Hasanzadeh Mofrad et al. 2015; Damerchilu et al. 2016), graph problems (Soleimani-Pouri et al. 2012; Mousavian et al. 2013, 2014; Khomami et al. 2016a; Rezapoor Mirsaleh and Meybodi 2016; Vahidipour et al. 2017), data clustering (Hosein and Navid 2003; Hasanzadeh-Mofrad and Rezvanian 2018), community detection (Amiri et al. 2013; Khomami et al. 2016b, 2018; Liu et al. 2016; Rezapoor Mirsaleh and Reza Meybodi 2016), link prediction (Moradabadi and Meybodi 2016, 2017), grid computing (Hasanzadeh and Meybodi 2014, 2015; Hasanzadeh Mofrad et al. 2016), stochastic social networks (Rezvanian and Meybodi 2015a, b, 2016a, b, 2017), network sampling (Rezvanian et al. 2014; Rezvanian and Meybodi 2015c; Jalali et al. 2016a, b; Rezvanian and Meybodi 2016c; Ghavipour and Meybodi 2017), information diffusion (Daliri Khomami et al. 2014, 2017, 2018), recommender systems (Ghavipour and Meybodi 2016), wireless sensor networks (Safavi et al. 2014; Nicopolitidis 2015), WiMAX networks (Misra et al. 2015), network security (Krishna et al. 2014), wireless mesh networks (Kumar and Lee 2015), mobile video surveillance (Kumar et al. 2015b), vehicular environments (Misra et al. 2014; Kumar et al. 2015a), peer-to-peer networks (Saghiri and Meybodi 2016, 2017a), and cloud computing (Morshedlou and Meybodi 2014, 2017).

Table 2.1 Summary of recent applications of cellular learning automata

Application: Cellular learning automata type

Cellular networks: CLA (Beigy and Meybodi 2010)
Wireless sensor networks: ICLA (Mostafaei and Obaidat 2018), DICLA (Esnaashari and Meybodi 2018)
Peer-to-peer networks: CLA (Saghiri and Meybodi 2016), ADCLA (Saghiri and Meybodi 2018), CADCLA (Saghiri and Meybodi 2017b)
Graph problems: WCLA (Moradabadi and Meybodi 2018), ICLA (Vahidipour et al. 2017), ICLA (Rezapoor Mirsaleh and Meybodi 2016), ICLA (Mousavian et al. 2014)
Cloud computing: ICLA (Morshedlou and Meybodi 2017), CLA (Kheradmand and Meybodi 2014)
Task allocation: CLA (Khani et al. 2017)
Data mining: ACLA (Ahangaran et al. 2017), CLA (Sohrabi and Roshani 2017), CLA (Vafaee Sharbaf et al. 2016; Toozandehjani et al. 2014)
Social network analysis: ICLA (Ghavipour and Meybodi 2017), ICLA (Khomami et al. 2018), ICLA (Zhao et al. 2015), CLA (Aldrees and Ykhlef 2014)
Image processing: CLA (Adinehvand et al. 2017; Arish et al. 2016), CLA (Hasanzadeh Mofrad et al. 2015; Hadavi et al. 2014)
Optimization: CLA (Vafashoar and Meybodi 2016), CLA (Vafashoar and Meybodi 2018), CLA (Mozafari et al. 2015)
Robotics: CLA (Santoso et al. 2016)
Opportunistic networks: CLA (Zhang et al. 2016)

2.2 Wavefront Cellular Learning Automata

A cellular automaton (CA) is made of similar cells, where each cell has a state and transitions over the set of possible states are governed by a local rule. A learning automaton (LA) is an adaptive decision-making unit for unknown random environments. Intuitively, during its learning process the LA tries to choose the optimal action from its action set based on the environment's response to the previous action. To improve the local interactions of LAs in complex systems, the CA and the LA are combined into the CLA. The CLA is preferable to a CA because it tries to learn optimal actions, and it is also preferable to a single LA because it can improve the learning capability using a set of learning automata that interact with each other.


In many real-world learning problems, the learning process can proceed in stages in a particular order. For example, in pattern classification with decision trees, when a node makes a decision, its decision is transmitted to its successor nodes as a wave. In problems related to sampling social networks, when we label a link or node as a sample instance, this decision can propagate into the network as a wave. Therefore, we consider a CLA with one LA for each region and define the neighbors of each cell to be the cells of the successor regions, such that each LA can send a wave to its neighbors if its chosen action is different from its previous action. This model of CLA is called a wavefront CLA (WCLA) (Moradabadi and Meybodi 2018). In a wavefront CLA, we partition the problem space into stages such that each stage tries to learn the optimal action and helps its neighbors to act accordingly in the environment.

A WCLA is an extension of the asynchronous CLA model, with a connected neighbor structure and a propagation property whereby each LA can send a wave to its neighbors and activate their learning automata to choose new actions if its chosen action has changed from the previous action; this procedure then continues recursively. Each cell that receives the wave is activated, and its corresponding LA must choose a new action. The WCLA is an asynchronous CLA because, at each iteration, only some learning automata are activated independently, rather than all of them in a parallel manner. The WCLA enables us to propagate information through the CLA because it has both a connected neighbor structure and the wave propagation property. This section introduces the WCLA and studies its convergence.

A wavefront cellular learning automaton is a generalization of the asynchronous CLA which has a connected neighbor structure and the property of diffusion. The main features of a WCLA are that the neighbors of each cell are defined as its successor cells, and that each LA can send a wave to its neighbors and activate them if its chosen action is different from the previous action (the diffusion property). The WCLA operates like a CLA in the following respects. In the initial phase, the initial state of every cell is specified using the initial action probability distribution of the corresponding LA. At instance k, one LA, called the root LA, chooses an action, and if the chosen action is different from the previous action it sends a wave over the network. Each cell that receives the wave is activated, and its corresponding LA must choose a new action. Then, using the local rule of the WCLA and the selected actions of its neighbors, a reinforcement signal is generated for the LA residing in that cell, and the LA uses this reinforcement signal to update its action probability distribution using the learning algorithm. This procedure continues until a stopping criterion is reached. Figure 2.4 illustrates the procedure of the WCLA in a simple view. In this figure, the root learning automaton, labeled 1, is activated first. This learning automaton chooses an action, and because the new action is different from the previous one, it activates all of its child LAs in the next level (labeled 2). Then each activated child chooses a new action; if its action is the same as its previous action, the chain stops at that node (colored gray), and if the new action is different from the previous action, it activates its children (colored red), and this procedure goes on.


Fig. 2.4 WCLA example

The proposed WCLA is asynchronous because at each iteration one LA selects a new action, and therefore the action selection does not occur in a parallel manner. It is also irregular and closed because it does not use a lattice for its structure and uses only the local environment to update the CLA behavior. Finally, it is static because the structure of the CLA remains fixed. As mentioned previously, the WCLA introduces two new concepts compared to the CLA:

(1) A different neighbor structure. In the traditional CLA, the structure must be a lattice, while in the WCLA the structure can be any connected structure: either a regular structure such as a lattice or an irregular structure such as a graph or tree. The only requirement is that the structure must be connected; the structure of a WCLA is connected when there is a path between every pair of cells, so that there are no unreachable cells. The connectivity of the structure of the WCLA ensures that a wave can reach any cell in the network, and therefore there is a learning capability throughout the network. Figure 2.5 shows some examples of valid structures for a WCLA: Fig. 2.5a shows a lattice structure as an example of a regular structure, while Fig. 2.5b, c show graph and tree structures, respectively, as examples of irregular structures. All of these structures are connected.

(2) A propagation property using waves. As previously mentioned, in a traditional CLA each learning automaton chooses its action in each iteration. Then the local rule generates the reinforcement signal for the corresponding learning automaton using its chosen action and the chosen actions of its neighbors. Finally, each learning automaton updates its action probability vector based on its reinforcement signal. Hence, the traditional CLA has no propagation or diffusion property: when a cell's state changes, the other cells are not triggered to choose new actions. To overcome this problem, the WCLA has a propagation property using waves. Each LA can send a wave to its neighbors and activate them if its chosen action is different from the previous action (the diffusion property). Each cell that receives the wave is activated, and its corresponding LA must choose a new action.


Fig. 2.5 Examples of valid structures in a WCLA

Fig. 2.6 Waves in different structures

Figure 2.6 shows different waves over different WCLA structures. Each wave starts from a cell in the network and moves to its neighbors. The diffusion path of the wave depends only on the neighborhood structure of the WCLA. Because the structure is connected and the waves can move over the entire network, each cell can be activated and improve its state after receiving waves, thus also improving its learning capability. However, in order to control the movement of the waves over the network, we define an energy for each wave in the WCLA. The energy of each wave determines the wave's ability to propagate itself. The energy of the wave decreases as it moves through the network until it reaches zero, at which point the wave disappears and movement stops. In this way, each cell receiving the wave is activated and its LA is triggered to choose a new action. Generally, a WCLA model is characterized as follows:

Definition 2.6 A wavefront cellular learning automaton is defined by the structure A = (S, Φ, A, F, W, C), where:

1. S is the structure of the WCLA, which is a connected network of learning automata.
2. Φ is a finite set of states.
3. A represents the set of learning automata residing in the cells of the CLA.


4. F_i : Φ_i → β defines the local rule of the WCLA for each cell c_i, where Φ_i is the current state of LA_i and its successors. In addition, β is the set of possible values for the reinforcement signal, and the reinforcement signal is calculated for each LA using the chosen actions of its successor learning automata.
5. W defines the wave energy and determines when and where a wave stops moving through the network.
6. C is the condition under which an LA sends a wave over the network and triggers its children to choose a new action. In a WCLA, C is defined according to whether the chosen action of the LA is different from the previous action or not.

As previously mentioned, the WCLA differs from the CLA in two main respects. Firstly, in the WCLA the lattice Z^d and the neighborhood vector N are replaced by an arbitrary connected network. Secondly, in the proposed model each LA can send a wave to its neighbors, and each cell that receives the wave is activated and triggers its LA to choose a new action. In the rest of this section, we present the learning algorithm used for updating the WCLA action probabilities. As mentioned above, in the WCLA we have a set of N learning automata indexed from 1 to N. The ith LA has the action set A_i with r_i actions:

$A_i = \{a_{i1}, \ldots, a_{ir_i}\}$  (2.1)

In addition, p_i(k) and α_i(k) are its action probability vector and its chosen action at iteration k, respectively. The probability of choosing action a_ij at instance k is defined as:

$\Pr\{\alpha_i(k) = a_{ij}\} = p_{ij}(k)$  (2.2)

To update p_i(k) at instance k, we use the linear reward-inaction (L_{R-I}) algorithm according to the following equations:

$p_{ij}(k+1) = p_{ij}(k) + \lambda\beta(k)\,(1 - p_{ij}(k)) \quad \text{if } \alpha_i(k) = a_{ij}$
$p_{ij}(k+1) = (1 - \lambda\beta(k))\,p_{ij}(k) \quad \text{if } \alpha_i(k) \neq a_{ij}$  (2.3)

where β(k) is the reinforcement signal at instance k and the learning parameter λ satisfies 0 < λ < 1. It is shown in (Moradabadi and Meybodi 2018) that the WCLA is expedient and converges to the global optimum.
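As a rough illustration of how the wave mechanism and the L_{R-I} update of Eq. (2.3) fit together, the Python sketch below propagates a single wave from a root LA over an arbitrary successor structure. It is a simplified reading of the description above, not the reference implementation of Moradabadi and Meybodi (2018): the local-rule callback, the per-wave energy budget, and the data layout are all stand-ins.

```python
import numpy as np

def lri_update(p, chosen, beta, lam=0.05):
    """L_{R-I} update of Eq. (2.3), with beta in [0, 1] as the reinforcement signal."""
    p = p * (1.0 - lam * beta)
    p[chosen] += lam * beta
    return p

def wcla_wave(probs, prev_actions, successors, local_rule, root=0, energy=3, rng=None):
    """Start a wave at `root`: every activated LA chooses a new action, and if
    that action differs from its previous one, the wave (with one unit less
    energy) activates the LA's successor cells."""
    rng = rng or np.random.default_rng()
    frontier = [(root, energy)]
    while frontier:
        cell, e = frontier.pop(0)              # breadth-first, front-of-wave order
        action = rng.choice(len(probs[cell]), p=probs[cell])
        beta = local_rule(cell, action)        # signal from successors' actions
        probs[cell] = lri_update(probs[cell], action, beta)
        if action != prev_actions.get(cell) and e > 0:
            frontier.extend((s, e - 1) for s in successors.get(cell, []))
        prev_actions[cell] = action
    return probs, prev_actions
```

Here probs maps each cell to its action probability vector, prev_actions records the last chosen actions, successors encodes the connected structure, and local_rule is whatever application-specific rule produces β; once the energy counter reaches zero, the wave disappears and propagation stops, as described above.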

2.2.1 Analysis

Every learning automata-based model, including the WCLA, tries to tune its action probability vectors in order to improve the reinforcement signals received from the environment. In this sub-section, we analyze the convergence of the WCLA.


We first present some definitions, and then we model the objective of the WCLA as an optimization problem and analyze it.

Definition 2.7 Generally, at instance k the configuration of the WCLA is defined by:

$P(k) = [p_1^T, p_2^T, \ldots, p_N^T]^T$  (2.4)

where $p_i^T$ is the transpose of the action probability vector of the ith LA, which must satisfy:

$p_{ij} \geq 0, \quad 1 \leq j \leq r_i,\ 1 \leq i \leq N$
$\sum_{j=1}^{r_i} p_{ij} = 1, \quad 1 \leq i \leq N$

Definition 2.8 In a WCLA, if all the action probability vectors of a configuration P are unit vectors, the configuration is called deterministic; otherwise, it is called probabilistic. We also define K* and K as the sets of all deterministic and all probabilistic configurations, respectively:

$\mathbf{K}^* = \left\{ P \,\middle|\, P = (p_1^T, p_2^T, \ldots, p_N^T)^T,\ p_i(k) = [p_{i1}(k), \ldots, p_{ir_i}(k)]^T,\ \sum_y p_{iy} = 1\ \forall i,\ p_{iy} \in \{0, 1\}\ \forall y, i \right\}$  (2.5)

and

$\mathbf{K} = \left\{ P \,\middle|\, P = (p_1^T, p_2^T, \ldots, p_N^T)^T,\ p_i(k) = [p_{i1}(k), \ldots, p_{ir_i}(k)]^T,\ \sum_y p_{iy} = 1\ \forall i,\ 0 \leq p_{iy} \leq 1\ \forall y, i \right\}$  (2.6)

Definition 2.9 The dynamic behavior of the WCLA through time is defined by the mapping G : K → K.

Definition 2.10 The neighborhood set of the ith learning automaton, N(i), is defined as the set of its successor learning automata within a specified depth.

Definition 2.11 If p_i is updated according to the proposed learning algorithm, the worst-case time required for finding a solution with approximation factor 1 − ε is:

$K = \frac{2}{1 + p_i^*(0) - \varepsilon/D}\ \log_{1-\lambda} \frac{\varepsilon}{D\,(1 - p_i^*(0))}$  (2.7)

where ε ∈ (0, 1) is the error parameter of the WCLA, λ is the learning rate of the learning algorithm, and D is the maximum degree of the LAs in the WCLA.

Proof Assume that in the WCLA, for learning automaton L_i, all the other actions are selected before the optimal one (we call the optimal action α_i*). In this case, all the other


actions may be rewarded or penalized. So, in the worst case, if we assume that all the other actions are rewarded (punishments need not be considered because the L_{R-I} learning algorithm is used), then at iteration k the probability of choosing the optimal action α_i* for L_i satisfies:

$p_i^*(k) \geq p_i^*(k-1)\,(1 - \lambda)$  (2.8)

where p_i* is the probability of the optimal action. By repeatedly substituting the recurrence for p_i* on the right-hand side of the inequality, we obtain:

$p_i^*(k) \geq p_i^*(0)\,(1 - \lambda)^{k-1}$  (2.9)

Now assume that action α_i* is chosen for the first time. Because the WCLA uses the L_{R-I} learning algorithm, the probability of penalizing the optimal action α_i* is zero. So the conditional expectation of p_i*(k) remains unchanged when the other actions are selected and increases only when action α_i* is chosen. We can therefore rewrite the conditional expectation of p_i*(k) as:

$p_i^*(1) = p_i^*(0) + \lambda\,(1 - p_i^*(0))$
$p_i^*(2) = p_i^*(1) + \lambda\,(1 - p_i^*(1)) = p_i^*(1)(1 - \lambda) + \lambda$
$\vdots$
$p_i^*(k-1) = p_i^*(k-2) + \lambda\,(1 - p_i^*(k-2)) = p_i^*(k-2)(1 - \lambda) + \lambda$
$p_i^*(k) = p_i^*(k-1) + \lambda\,(1 - p_i^*(k-1)) = p_i^*(k-1)(1 - \lambda) + \lambda$  (2.10)

where k denotes the number of times action α_i* must be chosen until the following condition is met:

$p_i^*(k) = 1 - \frac{\varepsilon}{D}$  (2.11)

By substituting the recurrence for p_i*(k) and after some simplifications, we have:

$p_i^*(k) = p_i^*(k-1)(1 - \lambda) + \lambda$
$= \left[p_i^*(k-2)(1 - \lambda) + \lambda\right](1 - \lambda) + \lambda = p_i^*(k-2)(1 - \lambda)^2 + \lambda(1 - \lambda) + \lambda$
$= \left[p_i^*(k-3)(1 - \lambda) + \lambda\right](1 - \lambda)^2 + \lambda(1 - \lambda) + \lambda = p_i^*(k-3)(1 - \lambda)^3 + \lambda(1 - \lambda)^2 + \lambda(1 - \lambda) + \lambda$
$\cdots$
$= p_i^*(1)(1 - \lambda)^{k-1} + \lambda(1 - \lambda)^{k-2} + \cdots + \lambda(1 - \lambda) + \lambda$
$= p_i^*(0)(1 - \lambda)^k + \lambda(1 - \lambda)^{k-1} + \cdots + \lambda(1 - \lambda) + \lambda$  (2.12)

Hence, we have:


$p_i^*(k) = p_i^*(0)(1 - \lambda)^k + \lambda(1 - \lambda)^{k-1} + \cdots + \lambda(1 - \lambda) + \lambda$

After some algebraic simplifications, we have:

$p_i^*(k) = p_i^*(0)(1 - \lambda)^k + \lambda\left[1 + (1 - \lambda) + (1 - \lambda)^2 + \cdots + (1 - \lambda)^{k-1}\right]$  (2.13)

and

$p_i^*(k) = p_i^*(0)(1 - \lambda)^k + \lambda \sum_{j=0}^{k-1} (1 - \lambda)^j$  (2.14)

The second term on the right-hand side of the above equation is a geometric series that sums up to $\lambda \cdot \frac{1 - (1-\lambda)^k}{1 - (1-\lambda)}$, so we have:

$p_i^*(k) = p_i^*(0)(1 - \lambda)^k + \lambda \cdot \frac{1 - (1 - \lambda)^k}{1 - (1 - \lambda)}$  (2.15)

and

$p_i^*(k) = p_i^*(0)(1 - \lambda)^k + 1 - (1 - \lambda)^k$  (2.16)

Now we can write:

$p_i^*(0)(1 - \lambda)^k + 1 - (1 - \lambda)^k = 1 - \frac{\varepsilon}{D}$  (2.17)

and

$(1 - \lambda)^k = \frac{\varepsilon}{D\,(1 - p_i^*(0))}$  (2.18)

Taking $\log_{1-\lambda}$ of both sides of Eq. (2.18), we derive:

$k = \log_{1-\lambda} \frac{\varepsilon}{D\,(1 - p_i^*(0))}$  (2.19)

Now, the total probability of choosing the other actions decreases from 1 − p_i*(0) initially to ε/D after these k iterations, while the probability of the optimal action correspondingly grows from p_i*(0) to 1 − ε/D. Thus, the number of times the other actions are chosen is obtained as:

$\frac{1 - p_i^*(0) + \varepsilon/D}{1 + p_i^*(0) - \varepsilon/D} \cdot k$  (2.20)

So, if we consider K as the total number of iterations, we have:

$K = \frac{1 - p_i^*(0) + \varepsilon/D}{1 + p_i^*(0) - \varepsilon/D} \cdot k + k = \frac{2}{1 + p_i^*(0) - \varepsilon/D} \cdot k = \frac{2}{1 + p_i^*(0) - \varepsilon/D}\ \log_{1-\lambda} \frac{\varepsilon}{D\,(1 - p_i^*(0))}$  (2.21)

This completes the proof.
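As a quick numerical sanity check of Eqs. (2.19) and (2.21), with illustrative values λ = 0.1, ε = 0.05, D = 4, and p_i*(0) = 0.25 (chosen only for this example, not taken from the text):

```python
import math

lam, eps, D, p0 = 0.1, 0.05, 4, 0.25   # illustrative values, not from the text

# Eq. (2.19): number of times the optimal action must be chosen.
k = math.log(eps / (D * (1.0 - p0)), 1.0 - lam)

# Eq. (2.21): worst-case total number of iterations.
K = 2.0 / (1.0 + p0 - eps / D) * k

print(f"k = {k:.1f}, K = {K:.1f}")   # k ~ 38.9, K ~ 62.8
```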

Definition 2.12 An automaton is called a pure-chance automaton if it selects its actions with equal probability over its action set. In other words, if the cardinality of the action set is m, then an automaton is a pure-chance automaton if:

$p_i = \frac{1}{m}, \quad i = 1, 2, \ldots, m$  (2.22)

Definition 2.13 A pure-chance WCLA is a WCLA in which each cell contains a pure-chance automaton rather than a learning automaton. Also, P_pc denotes the configuration of a pure-chance WCLA.

Definition 2.14 A WCLA is called expedient with respect to the ith LA if, in the long run, the ith LA does better than a pure-chance automaton, based on the following inequality:

$\lim_{k \to \infty} E\left[\beta \mid p_i(k)\right] < \frac{1}{m}\, E[\beta]$  (2.23)

Definition 2.15 A WCLA is expedient under the proposed learning algorithm.

Proof Because f(P(t)) strictly increases in t along the ODE solutions, and the L_{R-I} learning algorithm is expedient with a small enough learning parameter λ, the WCLA with this learning algorithm is also expedient.
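Definition 2.14 can also be checked empirically for a single automaton. In the toy experiment below, β = 1 denotes a penalty, the per-action penalty probabilities are made up for the example, and the L_{R-I} automaton starts from the pure-chance configuration; its long-run penalty should end up well below the pure-chance baseline.

```python
import numpy as np

rng = np.random.default_rng(0)
penalty_prob = np.array([0.9, 0.4, 0.1])    # made-up penalty probability per action
m, lam, T = len(penalty_prob), 0.05, 20000

p = np.ones(m) / m                          # start as a pure-chance automaton
betas = []
for _ in range(T):
    a = rng.choice(m, p=p)
    beta = float(rng.random() < penalty_prob[a])   # 1 = penalty, 0 = reward
    reward = 1.0 - beta
    p = p * (1.0 - lam * reward)            # L_{R-I}: move only on reward
    p[a] += lam * reward
    betas.append(beta)

# Late-run received penalty versus the pure-chance expectation over the action set.
print(np.mean(betas[-5000:]), penalty_prob.mean())
```

With these numbers, the learned automaton's late-run penalty approaches the best action's 0.1, far below the pure-chance expectation of about 0.47, which is the behavior the inequality of Eq. (2.23) expresses.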

2.3 Conclusion

In this chapter, we provided a brief overview of cellular learning automata models and their applications in recent studies. Next, we introduced a generalization of asynchronous cellular learning automata equipped with a diffusion capability, called wavefront cellular learning automata (WCLA). Since the WCLA is a collection of connected learning automata mapped onto a structure and uses waves on this structure to diffuse state changes of the learning automata, it is a suitable platform for learning in online, non-deterministic, dynamic, distributed, or decentralized environments.

References

Adinehvand K, Sardari D, Hosntalab M, Pouladian M (2017) An efficient multistage segmentation method for accurate hard exudates and lesion detection in digital retinal images. J Intell Fuzzy Syst 33:1639–1649. https://doi.org/10.3233/JIFS-17199


Ahangaran M, Taghizadeh N, Beigy H et al (2017) Associative cellular learning automata and its applications. Appl Soft Comput J 53:1–18. https://doi.org/10.1016/j.asoc.2016.12.006 Aldrees M, Ykhlef M (2014) A seeding cellular learning automata approach for viral marketing in social network. In: Proceedings of the 16th International Conference on Information Integration and Web-based Applications & Services—iiWAS '14. ACM Press, New York, New York, USA, pp 59–63 Amiri F, Yazdani N, Faili H, Rezvanian A (2013) A novel community detection algorithm for privacy preservation in social networks. In: Abraham A (ed), pp 443–450 Arish S, Javaherian M, Safari H, Amiri A (2016) Extraction of active regions and coronal holes from EUV images using the unsupervised segmentation method in the Bayesian framework. Sol Phys 291:1209–1224. https://doi.org/10.1007/s11207-016-0883-4 Beigy H, Meybodi MR (2004) A mathematical framework for cellular learning automata. Adv Complex Syst 07:295–319. https://doi.org/10.1142/S0219525904000202 Beigy H, Meybodi MR (2007) Open synchronous cellular learning automata. Adv Complex Syst 10:527–556 Beigy H, Meybodi MR (2008) Asynchronous cellular learning automata. Automatica 44:1350–1357 Beigy H, Meybodi MR (2010) Cellular learning automata with multiple learning automata in each cell and its applications. IEEE Trans Syst Man Cybern Part B 40:54–65. https://doi.org/10.1109/TSMCB.2009.2030786 Daliri Khomami MM, Rezvanian A, Meybodi MR (2014) Irregular cellular automata for multiple diffusion. In: 22nd Iranian conference on electrical engineering (ICEE 2014). Tehran, Iran, pp 1–6 Daliri Khomami MM, Rezvanian A, Bagherpour N, Meybodi MR (2017) Irregular cellular automata based diffusion model for influence maximization. In: 2017 5th Iranian joint congress on fuzzy and intelligent systems (CFIS). IEEE, pp 69–74 Daliri Khomami MM, Rezvanian A, Bagherpour N, Meybodi MR (2018) Minimum positive influence dominating set and its application in influence maximization: a learning automata approach. Appl Intell 48:570–593. https://doi.org/10.1007/s10489-017-0987-z Damerchilu B, Norouzzadeh MS, Meybodi MR (2016) Motion estimation using learning automata. Mach Vis Appl 27:1047–1061. https://doi.org/10.1007/s00138-016-0788-0 Esnaashari M, Meybodi MR (2008) A cellular learning automata based clustering algorithm for wireless sensor networks. Sens Lett 6:723–735 Esnaashari M, Meybodi MR (2009) Dynamic point coverage in wireless sensor networks: a learning automata approach. In: Advances in computer science and engineering. Springer, pp 758–762 Esnaashari M, Meybodi MR (2011) A cellular learning automata-based deployment strategy for mobile wireless sensor networks. J Parallel Distrib Comput 71:988–1001 Esnaashari M, Meybodi MR (2013) Deployment of a mobile wireless sensor network with k-coverage constraint: a cellular learning automata approach. Wirel Networks 19:945–968 Esnaashari M, Meybodi MR (2018) Dynamic irregular cellular learning automata. J Comput Sci 24:358–370. https://doi.org/10.1016/j.jocs.2017.08.012 Ghavipour M, Meybodi MR (2016) An adaptive fuzzy recommender system based on learning automata. Electron Commer Res Appl 20:105–115. https://doi.org/10.1016/j.elerap.2016.10.002 Ghavipour M, Meybodi MR (2017) Irregular cellular learning automata-based algorithm for sampling social networks. Eng Appl Artif Intell 59:244–259. https://doi.org/10.1016/j.engappai.2017.01.004 Hadavi N, Nordin MJ, Shojaeipour A (2014) Lung cancer diagnosis using CT-scan images based on cellular learning automata. In: 2014 international conference on computer and information sciences (ICCOINS). IEEE, pp 1–5 Hasanzadeh M, Meybodi MR (2014) Grid resource discovery based on distributed learning automata. Computing 96:909–922. https://doi.org/10.1007/s00607-013-0337-x Hasanzadeh M, Meybodi MR (2015) Distributed optimization grid resource discovery. J Supercomput 71:87–120. https://doi.org/10.1007/s11227-014-1289-4


Hasanzadeh Mofrad M, Sadeghi S, Rezvanian A, Meybodi MR (2015) Cellular edge detection: combining cellular automata and cellular learning automata. AEU Int J Electron Commun 69:1282–1290. https://doi.org/10.1016/j.aeue.2015.05.010 Hasanzadeh Mofrad M, Jalilian O, Rezvanian A, Meybodi MR (2016) Service level agreement based adaptive grid superscheduling. Futur Gener Comput Syst 55:62–73. https://doi.org/10.1016/j.future.2015.08.012 Hasanzadeh M, Meybodi MR, Ebadzadeh MM (2013) Adaptive cooperative particle swarm optimizer. Appl Intell 39:397–420. https://doi.org/10.1007/s10489-012-0420-6 Hasanzadeh-Mofrad M, Rezvanian A (2018) Learning automata clustering. J Comput Sci 24:379–388. https://doi.org/10.1016/j.jocs.2017.09.008 Hosein A, Navid F (2003) Cellular learning automata and its applications. J Sci Technol Univ Sharif 54–77 Jalali ZS, Rezvanian A, Meybodi MR (2016a) A two-phase sampling algorithm for social networks. In: Conference proceedings of 2015 2nd international conference on knowledge-based engineering and innovation, KBEI 2015. IEEE, pp 1165–1169 Jalali ZS, Rezvanian A, Meybodi MR (2016b) Social network sampling using spanning trees. Int J Mod Phys C 27:1650052. https://doi.org/10.1142/S0129183116500522 Khani M, Ahmadi A, Hajary H (2017) Distributed task allocation in multi-agent environments using cellular learning automata. Soft Comput. https://doi.org/10.1007/s00500-017-2839-5 Kheradmand S, Meybodi MR (2014) Price and QoS competition in cloud market by using cellular learning automata. In: 2014 4th international conference on computer and knowledge engineering (ICCKE). IEEE, pp 340–345 Khomami MMD, Bagherpour N, Sajedi H, Meybodi MR (2016a) A new distributed learning automata based algorithm for maximum independent set problem. 2016 artificial intelligence and robotics (IRANOPEN). IEEE, Qazvin, Iran, pp 12–17 Khomami MMD, Rezvanian A, Meybodi MR (2016b) Distributed learning automata-based algorithm for community detection in complex networks. Int J Mod Phys B 30:1650042. https://doi.org/10.1142/S0217979216500429 Khomami MMD, Rezvanian A, Meybodi MR (2018) A new cellular learning automata-based algorithm for community detection in complex social networks. J Comput Sci 24:413–426. https://doi.org/10.1016/j.jocs.2017.10.009 Krishna PV, Misra S, Joshi D et al (2014) Secure socket layer certificate verification: a learning automata approach. Secur Commun Networks 7:1712–1718. https://doi.org/10.1002/sec.867 Kumar N, Lee J-H (2015) Collaborative-learning-automata-based channel assignment with topology preservation for wireless mesh networks under QoS constraints. IEEE Syst J 9:675–685. https://doi.org/10.1109/JSYST.2014.2355113 Kumar N, Misra S, Obaidat MS (2015a) Collaborative learning automata-based routing for rescue operations in dense urban regions using vehicular sensor networks. IEEE Syst J 9:1081–1090. https://doi.org/10.1109/JSYST.2014.2335451 Kumar N, Lee J-H, Rodrigues JJPC (2015b) Intelligent mobile video surveillance system as a Bayesian coalition game in vehicular sensor networks: learning automata approach. IEEE Trans Intell Transp Syst 16:1148–1161. https://doi.org/10.1109/TITS.2014.2354372 Liu S-C, Zhu F-X, Gan L (2016) A label-propagation-probability-based algorithm for overlapping community detection. Jisuanji Xuebao/Chinese Journal of Computers. 2016:717–729 Mahdaviani M, Kordestani JK, Rezvanian A, Meybodi MR (2015) LADE: learning automata based differential evolution. Int J Artif Intell Tools 24:1550023. https://doi.org/10.1142/s0218213015500232 Mason LG, Gu X (1986) Learning automata models for adaptive flow control in packet-switching networks. Adapt Learn Syst. Springer, US, Boston, MA, pp 213–227 Misra S, Interior B, Kumar N et al (2014) Networks of learning automata for the vehicular environment: a performance analysis study. IEEE Wirel Commun 21:41–47. https://doi.org/10.1109/MWC.2014.7000970


Misra S, Chatterjee SS, Guizani M (2015) Stochastic learning automata-based channel selection in cognitive radio/dynamic spectrum access for WiMAX networks. Int J Commun Syst 28:801–817 Moradabadi B, Beigy H (2014) A new real-coded Bayesian optimization algorithm based on a team of learning automata for continuous optimization. Genet Program Evolvable Mach 15:169–193. https://doi.org/10.1007/s10710-013-9206-9 Moradabadi B, Meybodi MR (2016) Link prediction based on temporal similarity metrics using continuous action set learning automata. Phys A Stat Mech Appl 460:361–373. https://doi.org/ 10.1016/j.physa.2016.03.102 Moradabadi B, Meybodi MR (2017) A novel time series link prediction method: Learning automata approach. Phys A Stat Mech Appl 482:422–432. https://doi.org/10.1016/j.physa.2017.04.019 Moradabadi B, Meybodi MR (2018) Wavefront cellular learning automata. Chaos 28:21101. https:// doi.org/10.1063/1.5017852 Moradabadi B, Ebadzadeh MM, Meybodi MR (2016) A new real-coded stochastic Bayesian optimization algorithm for continuous global optimization. Genet Program Evolvable Mach 17:145–167. https://doi.org/10.1007/s10710-015-9255-3 Morshedlou H, Meybodi MR (2014) Decreasing impact of SLA violations: a proactive resource allocation approach for cloud computing environments. IEEE Trans Cloud Comput 2:156–167. https://doi.org/10.1109/TCC.2014.2305151 Morshedlou H, Meybodi MR (2017) A new local rule for convergence of ICLA to a compatible point. IEEE Trans Syst Man Cybern Syst 47:3233–3244. https://doi.org/10.1109/TSMC.2016. 2569464 Mostafaei H, Obaidat MS (2018) Learning automaton-based self-protection algorithm for wireless sensor networks. IET Networks 7:353–361. https://doi.org/10.1049/iet-net.2018.0005 Mousavian A, Rezvanian A, Meybodi MR (2013) Solving minimum vertex cover problem using learning automata. In: 13th Iranian conference on fuzzy systems (IFSC 2013), pp 1–5 Mousavian A, Rezvanian A, Meybodi MR (2014) Cellular learning automata based algorithm for solving minimum vertex cover problem. In: 2014 22nd Iranian conference on electrical engineering (ICEE). IEEE, pp 996–1000 Mozafari M, Shiri ME, Beigy H (2015) A cooperative learning method based on cellular learning automata and its application in optimization problems. J Comput Sci 11:279–288. https://doi.org/ 10.1016/j.jocs.2015.08.002 Narendra KS, Thathachar MAL (1989) Learning automata: an introduction. Prentice-Hall Nicopolitidis P (2015) Performance fairness across multiple applications in wireless push systems. Int J Commun Syst 28:161–166. https://doi.org/10.1002/dac.2648 Packard NH, Wolfram S (1985) Two-dimensional cellular automata. J Stat Phys 38:901–946. https:// doi.org/10.1007/BF01010423 Rezapoor Mirsaleh M, Meybodi MR (2015) A learning automata-based memetic algorithm. Genet Program Evolvable Mach 16:399–453. https://doi.org/10.1007/s10710-015-9241-9 Rezapoor Mirsaleh M, Meybodi MR (2016) A new memetic algorithm based on cellular learning automata for solving the vertex coloring problem. Memetic Comput 8:211–222. https://doi.org/ 10.1007/s12293-016-0183-4 Rezapoor Mirsaleh M, Reza Meybodi M (2016) A Michigan memetic algorithm for solving the community detection problem in complex network. Neurocomputing 214:535–545. https://doi. org/10.1016/j.neucom.2016.06.030 Rezvanian A, Meybodi MR (2010a) An adaptive mutation operator for artificial immune network using learning automata in dynamic environments. In: 2010 second world congress on nature and biologically inspired computing (NaBIC). 
IEEE, pp 479–483 Rezvanian A, Meybodi MR (2010b) LACAIS: learning automata based cooperative artificial immune system for function optimization. Communications in computer and information science. Springer, Berlin Heidelberg, pp 64–75 Rezvanian A, Meybodi MR (2015a) Finding maximum clique in stochastic graphs using distributed learning automata. Int J Uncertainty Fuzziness Knowl Based Syst 23:1–31. https://doi.org/10. 1142/S0218488515500014


Rezvanian A, Meybodi MR (2015b) Finding minimum vertex covering in stochastic graphs: a learning automata approach. Cybern Syst 46:698–727. https://doi.org/10.1080/01969722.2015. 1082407 Rezvanian A, Meybodi MR (2015c) Sampling social networks using shortest paths. Phys A Stat Mech Appl 424:254–268. https://doi.org/10.1016/j.physa.2015.01.030 Rezvanian A, Meybodi MR (2016a) Stochastic social networks: measures and algorithms. LAP LAMBERT Academic Publishing Rezvanian A, Meybodi MR (2016b) Stochastic graph as a model for social networks. Comput Human Behav 64:621–640. https://doi.org/10.1016/j.chb.2016.07.032 Rezvanian A, Meybodi MR (2016c) Sampling algorithms for weighted networks. Soc Netw Anal Min 6:60. https://doi.org/10.1007/s13278-016-0371-8 Rezvanian A, Meybodi MR (2017) Sampling algorithms for stochastic graphs: A learning automata approach. Knowl Based Syst 127:126–144. https://doi.org/10.1016/j.knosys.2017.04.012 Rezvanian A, Rahmati M, Meybodi MR (2014) Sampling from complex networks using distributed learning automata. Phys A Stat Mech its Appl 396:224–234. https://doi.org/10.1016/j.physa. 2013.11.015 Rezvanian A, Saghiri AM, Vahidipour SM, et al (2018a) Recent advances in learning automata. Springer Rezvanian A, Vahidipour SM, Esnaashari M (2018b) New applications of learning automata-based techniques in real-world environments. J Comput Sci 24:287–289. https://doi.org/10.1016/j.jocs. 2017.11.012 Safavi SM, Meybodi MR, Esnaashari M (2014) Learning automata based face-aware mobicast. Wirel Pers Commun 77:1923–1933 Saghiri AM, Meybodi MR (2016) An approach for designing cognitive engines in cognitive peerto-peer networks. J Netw Comput Appl 70:17–40. https://doi.org/10.1016/j.jnca.2016.05.012 Saghiri AM, Meybodi MR (2017a) A closed asynchronous dynamic model of cellular learning automata and its application to peer-to-peer networks. Genet Program Evolvable Mach 18:313–349. https://doi.org/10.1007/s10710-017-9299-7 Saghiri AM, Meybodi MR (2017b) A distributed adaptive landmark clustering algorithm based on mOverlay and learning automata for topology mismatch problem in unstructured peer-to-peer networks. Int J Commun Syst 30:e2977. https://doi.org/10.1002/dac.2977 Saghiri AM, Meybodi MR (2018) An adaptive super-peer selection algorithm considering peers capacity utilizing asynchronous dynamic cellular learning automata. Appl Intell 48:271–299. https://doi.org/10.1007/s10489-017-0946-8 Santoso J, Riyanto B, Adiprawita W (2016) Dynamic path planning for mobile robots with cellular learning automata. J ICT Res Appl 10:1–14. https://doi.org/10.5614/itbj.ict.res.appl.2016.10.1.1 Sohrabi MK, Roshani R (2017) Frequent itemset mining using cellular learning automata. Comput Human Behav 68:244–253. https://doi.org/10.1016/j.chb.2016.11.036 Soleimani-Pouri M, Rezvanian A, Meybodi MR (2012) Solving maximum clique problem in stochastic graphs using learning automata. In: 2012 fourth international conference on computational aspects of social networks (CASoN). IEEE, pp 115–119 Thathachar MAL, Sastry PS (2003) Networks of learning automata: techniques for online stochastic optimization. Springer, Boston, MA Toozandehjani H, Zare-Mirakabad M-R, Derhami V (2014) Improvement of recommendation systems based on cellular learning automata. In: 2014 4th international conference on computer and knowledge engineering (ICCKE). 
IEEE, pp 592–597 Vafaee Sharbaf F, Mosafer S, Moattar MH (2016) A hybrid gene selection approach for microarray data classification using cellular learning automata and ant colony optimization. Genomics 107:231–238. https://doi.org/10.1016/j.ygeno.2016.05.001 Vafashoar R, Meybodi MR (2016) Multi swarm bare bones particle swarm optimization with distribution adaption. Appl Soft Comput J 47:534–552. https://doi.org/10.1016/j.asoc.2016.06.028 Vafashoar R, Meybodi MR (2018) Multi swarm optimization algorithm with adaptive connectivity degree. Appl Intell 48:909–941. https://doi.org/10.1007/s10489-017-1039-4


Vahidipour SM, Meybodi MR, Esnaashari M (2017) Adaptive Petri net based on irregular cellular learning automata with an application to vertex coloring problem. Appl Intell 46:272–284. https:// doi.org/10.1007/s10489-016-0831-x Zhang F, Wang X, Li P, Zhang L (2016) An energy aware cellular learning automata based routing algorithm for opportunistic networks. Int J Grid Distrib Comput 9:255–272. https://doi.org/10. 14257/ijgdc.2016.9.2.22 Zhao Y, Jiang W, Li S et al (2015) A cellular learning automata based algorithm for detecting community structure in complex networks. Neurocomputing 151:1216–1226. https://doi.org/10. 1016/j.neucom.2014.04.087

Chapter 3

Social Networks and Learning Systems: A Bibliometric Analysis

3.1 Introduction

Learning is a crucial aspect of intelligence, and machine learning has emerged as a vibrant discipline with the avowed objective of developing machines with learning capabilities. Learning has been recognized as an important aspect of intelligent behavior. Over the last few decades, the process of learning, which was earlier studied mostly by psychologists, has become a topic of much interest to engineers as well, in view of its role in machine intelligence. Psychologists or biologists, who conduct learning experiments on animals or human subjects, try to create models of behavior through analysis of experimental data. Engineers are more interested in studying learning behavior to help them synthesize intelligent machines. While the above two goals are distinctly different, the two endeavors are nonetheless interrelated, because success in one helps to improve our abilities in the other. Learning is defined as any relatively permanent change in behavior resulting from past experience, and a learning system is characterized by its ability to improve its behavior with time, in some sense tending towards an ultimate goal (Narendra and Thathachar 1989). The concept of learning makes it possible to design systems which can gradually improve their performance during actual operation through a process of learning from past experiences. Every learning task consists of two parts: the learning system and the environment. The learning system must learn to act in the environment. The main learning tasks can be classified into supervised learning, semi-supervised learning, active learning, unsupervised learning, and reinforcement learning.

In recent years, online social networks such as Facebook and Twitter have provided simple facilities for online users to generate and share a variety of information about their daily lives, activities, events, news, and other aspects of their real worlds. As a result, user-generated content has become the main feature of online social networks, and studying how users behave and interact with their friends plays a significant role in the analysis of online social networks.



Online social networks, similar to many real networks, are usually modeled as deterministic graphs with a set of nodes as the users and links as the connections between users of the network. Due to the large size, unknown, uncertain, unpredictable, and time-varying nature of online social networks, most conventional methods for social network analysis fail to yield suitable results for researchers. Therefore, due to the powerful properties of learning systems in unknown, uncertain, and distributed environments, much research using learning techniques has been presented for social network analysis (Rezvanian et al. 2018). In this chapter, we aim to investigate the different types of research results and address the potential trends of the key research areas related to learning techniques, as one kind of intelligent system, for social network analysis. For this purpose, in the rest of this chapter we introduce the methodology used for our analysis, and then the main results of our study are presented.

3.2 Material and Method

We adopted the methodology for preparing a literature review presented by Rowley and Slack (2004), which comprises five steps: scanning documents, making notes, structuring the literature review, writing the literature review, and building the bibliography. In this study, following this methodology, the research documents from the Web of Science (WoS) are collected, the results are processed and refined, the refined results are structured, key materials (e.g., top contributing authors, institutes, publications, countries, and research topics) and applications are identified, and finally some insights into current research interests and future directions are provided.

3.2.1 Data Collection and Initial Results

The data was collected from WoS, an online subscription-based scientific citation indexing service that provides comprehensive citation search covering more than 90 million records, from 1900 to the present, from peer-reviewed journals belonging to publishing houses such as Elsevier, Springer, Wiley, IEEE, ACM, and Taylor & Francis, to name a few. For data collection, we searched for ("learning" AND "social networks") as the two main keywords in the topic field of articles in WoS. The initial search resulted in 3,722 articles from 1992 until November 2018.


3.2.2 Refining the Initial Results

To refine the search results, the non-English language results, such as Spanish (140), Portuguese (26), Russian (16), German (11), French (7), Turkish (5), Ukrainian (2), Chinese (1), Hungarian (1), Catalan (1), and Swedish (1), were excluded during the data purification process. This restriction produced 3,511 articles. The publication years were then limited to articles published during a 15-year period, i.e., 2004–2018, which resulted in 3,434 articles. Further, the conference proceedings articles and book chapters were excluded to obtain 1,853 articles from peer-reviewed journals. Finally, articles in the education and educational research category were excluded in order to retain learning research from an artificial intelligence perspective. The final search results consist of 1,374 articles.
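This refinement pipeline can be reproduced on an exported record set. The sketch below assumes a hypothetical CSV export with Language, Year, Document Type, and WoS Category columns; the file name and column names are assumptions for illustration, not the actual WoS field tags.

```python
import pandas as pd

# Hypothetical export of the 3,722 initial Web of Science records.
df = pd.read_csv("wos_learning_social_networks.csv")

df = df[df["Language"] == "English"]                   # drop non-English -> 3,511
df = df[df["Year"].between(2004, 2018)]                # 15-year window -> 3,434
df = df[df["Document Type"] == "Article"]              # journal articles -> 1,853
df = df[~df["WoS Category"].str.contains("Education", na=False)]  # -> 1,374

print(len(df))
```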

3.2.3 Analyzing the Final Results

In this step, statistics were extracted from the search results and bibliometric analysis was performed using VOSviewer (van Eck and Waltman 2010, 2013). VOSviewer provides network visualization of co-authorship, co-citation, and citation with respect to authors, organizations, and countries, and also of co-occurrence with respect to keywords.

3.3 Results

In this section, several statistics, results, and analyses are reported for research related to the topic of "learning and social networks", based on the final search results.

3.3.1 Initial Result Statistics

Figure 3.1 shows the number of articles published in each year during the time period 2004–2018, and thus demonstrates the changing pattern of publications in the research community from 2004 until the beginning of 2018. It can be clearly seen from the figure that the number of publications on research related to the topic of "learning and social networks" was relatively inactive from 2004 to 2007, but since then it has been increasing dramatically.


Fig. 3.1 Distribution of articles published during the time period 2004–2018 (x-axis: publication year, 2004–2018; y-axis: number of publications, 0–300)

3.3.2 Key Journals

In order to understand the role of the different scientific journals, we identified the top 20 journals appearing in the data with the most publications in the field of research related to the topic of "learning and social networks"; these journals have published 171 journal articles. Table 3.1 shows the distribution of the journals with the most publications for research related to the topic of study. The citation network for journals with a minimum of 5 articles each is shown in Fig. 3.2, in which the node size is proportional to the number of citations received by the articles of that journal.

Fig. 3.2 Citation network of key journals for the research related to topic of “learning and social networks”


Table 3.1 Distribution of the top 20 journals with the most publications in learning and social networks related research

Journal name | No. publications
IEEE Access | 23
Knowledge-Based Systems | 20
Physica A Statistical Mechanics and Its Applications | 20
IEEE Transactions on Knowledge and Data Engineering | 19
PLOS One | 18
Journal of Universal Computer Science | 17
Social Network Analysis and Mining | 16
Expert Systems with Applications | 13
Neurocomputing | 13
ACM Transactions on Intelligent Systems and Technology | 12
Information Sciences | 11
IEEE Transactions on Multimedia | 10
Knowledge and Information Systems | 9
ACM Transactions on Knowledge Discovery From Data | 9
Data Mining and Knowledge Discovery | 8
International Journal of Advanced Computer Science and Applications | 8
Journal of Knowledge Management | 8
Multimedia Tools and Applications | 8
Applied Soft Computing | 7
Decision Support Systems | 7

3.3.3 Key Researchers

From the search results for research related to the topic of "learning and social networks", the top ten contributing authors based on the number of publications on the research topic are extracted, and these results are summarized in Table 3.2, in which the second column gives the author name, the third column reflects the number of publications on this topic, and the last column shows each author's highly cited article on this topic. From the results reported in Table 3.2, it can be clearly observed that Meybodi, M. R., with 21 articles, dominates the list, followed by Rezvanian, A. with 10 publications.

3.3.3.1 Co-authorship Analysis

Co-authorship analysis can be applied to authors and/or publications in order to track and study the relationships between authors, institutions, or countries. If applied to authors, co-authorship analysis reveals the structure of the social relationships between authors (Chen et al. 2010a).

Table 3.2 Top 10 authors based on the number of publications for learning and social networks related research

Author | Number of publications in this topic | Highly cited article in this topic
1. Meybodi, Mohammad Reza | 21 | Rezvanian and Meybodi (2016)
2. Rezvanian, Alireza | 10 | Rezvanian et al. (2014)
3. Chen, Yan | 7 | Chen et al. (2010b)
4. Guo, Dong | 7 | Cao et al. (2016)
5. Jadbabaie, Ali | 7 | Jadbabaie et al. (2012)
6. Li, Qiang | 7 | Li et al. (2015)
7. Li, Xin | 7 | Li and Chen (2013)
8. Chen, Lian | 6 | Wang et al. (2017)
9. Molavi, Pooya | 6 | Molavi et al. (2013)
10. Tang, Jie | 6 | Lou et al. (2013)

Fig. 3.3 Co-authorship network with its eight communities


Fig. 3.4 Co-authorship network as overlay visualization during 2014–2017

To conduct co-authorship analysis, we used VOSviewer to visualize the co-authorship network, and network clustering (community detection) was then applied to this co-authorship network in order to show the authors who probably work on similar topics in each community. The resulting co-authorship network, restricted to authors with a minimum of 5 articles on this topic of study, consists of 41 nodes (authors) and 8 major communities. This network with its eight communities is shown in Fig. 3.3. In Fig. 3.3, the size of each node is proportional to the number of articles published by each author, and the community structures are revealed by both the color and the position of each node. This co-authorship network with its eight community structures is also colored by publication year, as an overlay visualization for the time period 2014–2018, in Fig. 3.4. In Fig. 3.4, the size of each node is proportional to the number of articles published by each author, and the color of each node represents the number of publications per year by each author.

3.3.4 Key Articles

The top articles for research related to the topic of "learning and social networks" with respect to highly cited articles (the largest number of citations received by each article) are reported in Table 3.3, in which the article titles are given in the second column. The year of publication, the number of citations, and the average number of citations per year are provided in the third, fourth, and last columns, respectively.


Table 3.3 Highly cited articles for the research related to the topic of "learning and social networks"

Article title (reference) | Year of publication | Number of citations | Average number of citations per year
1. Collective classification in network data (Sen et al. 2008) | 2008 | 206 | 20.6
2. Diffusion strategies for adaptation and learning over networks: an examination of distributed strategies and network behavior (Sayed et al. 2013) | 2013 | 182 | 36.4
3. Social big data: recent achievements and new challenges (Bello-Orgaz et al. 2016) | 2016 | 142 | 71.0
4. A survey on opinion mining and sentiment analysis: tasks, approaches and applications (Ravi and Ravi 2015) | 2015 | 138 | 46.0
5. Discovering social circles in ego networks (McAuley and Leskovec 2014) | 2014 | 69 | 17.3
6. Predicting information credibility in time-sensitive social media (Castillo et al. 2013) | 2013 | 67 | 13.4
7. Learning and predicting the evolution of social networks (Bringmann et al. 2010) | 2010 | 66 | 8.3
8. Empirical evaluation and new design for fighting evolving twitter spammers (Yang et al. 2013) | 2013 | 57 | 11.4
9. Recommendation as link prediction in bipartite graphs: a graph kernel-based machine learning approach (Li and Chen 2013) | 2013 | 56 | 11.2
10. Strategies for predicting local trust based on trust propagation in social networks (Kim and Song 2011) | 2011 | 50 | 7.1


Fig. 3.5 Citation network for articles

From the results reported in Table 3.3, it can be clearly observed that the survey article by Sen et al. (2008), with 206 citations, dominates the list, while the research study presented by Bello-Orgaz et al. (2016) dominates the list with respect to the average number of citations received per year, with a value of 71.0 for this measure. The network of citations for articles is demonstrated in Fig. 3.5, where the size of each node is proportional to the number of citations that each article received. In this network, nodes with a dark red color represent recently published articles and nodes with a dark blue color represent older publications.

3.3.5 Key Affiliation

The plot of the top ten institutions with highly published articles is shown in Fig. 3.6, in which the most articles are published by researchers from Amirkabir University of Technology (Tehran Polytechnic) with 24 articles, followed by researchers from Tsinghua University with 23 articles. For each affiliation, the country in which the institution is situated was extracted for further analysis, and this result is shown in Fig. 3.7. As shown in Fig. 3.7, institutions in the USA (479 articles), China (231 articles), and the UK (118 articles) are the major contributors.

Fig. 3.6 Top 10 institutions with highly published articles for the research related to the topic of "learning and social networks" (bars, from top: Arizona State University (ASU), Nanyang Technological University (NTU), Harvard University, University of Pennsylvania, Stanford University, University of Illinois, Massachusetts Institute of Technology (MIT), Chinese Academy of Sciences, Tsinghua University, Amirkabir University of Technology (Tehran Polytechnic); x-axis: number of publications, 0–30)

Fig. 3.7 Top 10 contributing countries for the research related to the topic of "learning and social networks": USA (479), China (231), UK (118), Spain (94), Australia (87), Canada (82), Italy (70), Germany (52), France (49), Iran (45)


Fig. 3.8 Co-authorship network for institutions for the research related to topic of “learning and social networks”

Fig. 3.9 Co-authorship network for countries


In fact, researchers across the world are attracted to the area of learning and social network analysis. The network of co-authorships for the institutions of contributing authors is shown in Fig. 3.8, in which each node's size is proportional to the number of citations received by articles published by authors of that institution. The network of co-authorships for the countries of the authors' institutions is shown in Fig. 3.9, in which each node's size is proportional to the number of articles published by authors of each country's institutions.

3.3.6 Top Keywords

In this section, we present the results of the keyword analysis. Such a discussion assists in revealing the intellectual core and identity construction of the discipline by looking into the keywords used by research articles and their aggregation (Sidorova et al. 2008). To do so, we adopted a similar approach to identify the most commonly used words in article titles. The top 10 words most commonly used in article titles are listed in Table 3.4 for research related to the topic of "learning and social networks". The top 10 keywords used in the articles related to learning topics are shown in Table 3.5. The network of co-occurring words belonging to the text of titles and abstracts is depicted in Fig. 3.10; in this network, each node's size is proportional to the number of occurrences of the word in all titles and abstracts of the articles. One can see that this network consists of three communities. The main words related to machine learning topics (e.g., prediction, classifier) are found in the community at the right side of the network plot. The density visualization of the co-occurrence of article keywords based on the total number of occurrences is depicted in Fig. 3.11. From the result, one can observe that the main keywords related to machine learning topics (e.g., link prediction, transfer learning, regularization, convolutional neural network, computational intelligence) are mainly positioned at the right side of the figure.

Table 3.4 Top 10 commonly used words in article titles

Word | Frequency
1. Online social network | 34
2. Community detection | 32
3. Knowledge | 24
4. Link prediction | 21
5. Social media | 18
6. Sentiment analysis | 14
7. Twitter | 13
8. Social network analysis | 12
9. Complex network | 10
10. Deep learning | 10


Table 3.5 Top 10 keywords of articles related to learning topics

Keyword | Frequency
1. Machine learning | 98
2. Learning | 69
3. Deep learning | 22
4. Learning automata | 19
5. Natural language processing | 14
6. Agent based modeling | 10
7. Neural networks | 10
8. Supervised learning | 10
9. Unsupervised learning | 10
10. Reinforcement learning | 9

Fig. 3.10 Co-occurrence network of words belonging to the text of titles and abstracts



Fig. 3.11 Density visualization of Co-occurrence network of keywords

3.4 Conclusion

Learning, as a growing research area of machine intelligence, has found many applications in the domains of the social and computer sciences. Based on bibliometric and network analysis, in this chapter we presented a brief analysis of the literature on social networks and learning systems related research over a period of 15 years (2004–2018). We provided some insights about the key contributing scientific journals, researchers, institutes, countries, articles, keywords, and topics advancing social networks and learning systems related research, from a bibliometric and network analysis perspective.

References

Bello-Orgaz G, Jung JJ, Camacho D (2016) Social big data: recent achievements and new challenges. Inf Fusion 28:45–59. https://doi.org/10.1016/j.inffus.2015.08.005 Bringmann B, Berlingerio M, Bonchi F, Gionis A (2010) Learning and predicting the evolution of social networks. IEEE Intell Syst 25:26–34. https://doi.org/10.1109/MIS.2010.91 Cao J, Li Q, Ji Y et al (2016) Detection of forwarding-based malicious URLs in online social networks. Int J Parallel Program 44:163–180. https://doi.org/10.1007/s10766-014-0330-9


Castillo C, Mendoza M, Poblete B (2013) Predicting information credibility in time-sensitive social media. Internet Res 23:560–588. https://doi.org/10.1108/IntR-05-2012-0095
Chen C, Ibekwe-SanJuan F, Hou J (2010a) The structure and dynamics of cocitation clusters: a multiple-perspective cocitation analysis. J Am Soc Inf Sci Technol 61:1386–1409. https://doi.org/10.1002/asi.21309
Chen Y, Wang B, Lin WS et al (2010b) Cooperative peer-to-peer streaming: an evolutionary game-theoretic approach. IEEE Trans Circuits Syst Video Technol 20:1346–1357. https://doi.org/10.1109/TCSVT.2010.2077490
Jadbabaie A, Molavi P, Sandroni A, Tahbaz-Salehi A (2012) Non-Bayesian social learning. Games Econ Behav 76:210–225. https://doi.org/10.1016/j.geb.2012.06.001
Kim YA, Song HS (2011) Strategies for predicting local trust based on trust propagation in social networks. Knowl-Based Syst 24:1360–1371. https://doi.org/10.1016/j.knosys.2011.06.009
Li X, Chen H (2013) Recommendation as link prediction in bipartite graphs: a graph kernel-based machine learning approach. Decis Support Syst 54:880–890. https://doi.org/10.1016/j.dss.2012.09.019
Li C, Cheung WK, Ye Y et al (2015) The author-topic-community model for author interest profiling and community discovery. Knowl Inf Syst 44:359–383. https://doi.org/10.1007/s10115-014-0764-9
Lou T, Tang J, Hopcroft J et al (2013) Learning to predict reciprocity and triadic closure in social networks. ACM Trans Knowl Discov Data 7:1–25. https://doi.org/10.1145/2499907.2499908
McAuley J, Leskovec J (2014) Discovering social circles in ego networks. ACM Trans Knowl Discov Data 8:1–28. https://doi.org/10.1145/2556612
Molavi P, Jadbabaie A, Rahnama Rad K, Tahbaz-Salehi A (2013) Reaching consensus with increasing information. IEEE J Sel Top Signal Process 7:358–369. https://doi.org/10.1109/JSTSP.2013.2246764
Narendra KS, Thathachar MAL (1989) Learning automata: an introduction. Prentice-Hall
Ravi K, Ravi V (2015) A survey on opinion mining and sentiment analysis: tasks, approaches and applications. Knowl-Based Syst 89:14–46. https://doi.org/10.1016/j.knosys.2015.06.015
Rezvanian A, Meybodi MR (2016) Stochastic graph as a model for social networks. Comput Human Behav 64:621–640. https://doi.org/10.1016/j.chb.2016.07.032
Rezvanian A, Rahmati M, Meybodi MR (2014) Sampling from complex networks using distributed learning automata. Phys A Stat Mech Appl 396:224–234. https://doi.org/10.1016/j.physa.2013.11.015
Rezvanian A, Saghiri AM, Vahidipour SM et al (2018) Learning automata for complex social networks. In: Recent advances in learning automata, pp 279–334
Rowley J, Slack F (2004) Conducting a literature review. Manag Res News 27:31–39. https://doi.org/10.1108/01409170410784185
Sayed AH, Tu S-Y, Chen J et al (2013) Diffusion strategies for adaptation and learning over networks: an examination of distributed strategies and network behavior. IEEE Signal Process Mag 30:155–171. https://doi.org/10.1109/MSP.2012.2231991
Sen P, Namata G, Bilgic M et al (2008) Collective classification in network data. AI Mag 29:93. https://doi.org/10.1609/aimag.v29i3.2157
Sidorova A, Evangelopoulos N, Valacich JS, Ramakrishnan T (2008) Uncovering the intellectual core of the information systems discipline. MIS Q 32:467. https://doi.org/10.2307/25148852
van Eck NJ, Waltman L (2010) Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 84:523–538. https://doi.org/10.1007/s11192-009-0146-3
van Eck NJ, Waltman L (2013) VOSviewer manual
Wang H, Wu J, Pan S et al (2017) Towards large-scale social networks with online diffusion provenance detection. Comput Netw 114:154–166. https://doi.org/10.1016/j.comnet.2016.08.025
Yang C, Harkreader R, Gu G (2013) Empirical evaluation and new design for fighting evolving twitter spammers. IEEE Trans Inf Forensics Secur 8:1280–1293. https://doi.org/10.1109/TIFS.2013.2267732

Chapter 4

Social Network Sampling

4.1 Introduction

Since online social networks have gained popularity among internet users in recent years, computer scientists and sociologists have started to study and analyze the characteristics of these networks. Social network analysis (SNA), an inherently interdisciplinary field, focuses on the social relations between users rather than on the users themselves. In fact, the main purpose of SNA is to study both the contents and the patterns of relations in online social networks in order to understand the implications of these social relations. Unfortunately, many factors make it difficult, if not impossible, to study social networks in their entirety. First and foremost, the huge size of many real-world networks makes it computationally infeasible to study the entire network. In addition, some online networks are not completely visible to the public (e.g., Facebook) or can be accessed only through crawling (e.g., the Web). In other cases, the network may not be as large, but the measurements required to observe the underlying network are costly. As a result, network sampling is at the heart and foundation of the study of network structure. While the explicit goal of graph sampling algorithms is to produce representative subgraphs of smaller size that can be used to make inferences about the full graph, a sampling process often has other, implicit goals as well. Three possible goals of graph sampling algorithms are scale-down sampling, back-in-time sampling, and supervised sampling. Scale-down sampling aims to sample a representative subgraph that has similar (or scaled-down) topological properties to those of the original graph (Leskovec and Faloutsos 2006). In back-in-time sampling, the sampled subgraph matches the temporal evolution of the original graph (Leskovec and Faloutsos 2006); that is, the sampled subgraph Gs is similar to what the original graph G looked like when it was of the same size as Gs. Finally, the goal of supervised sampling is to identify nodes belonging to a specific category (Fang et al. 2013, 2016a, b); for this purpose, a biased sampling is performed to sample a subgraph under the requirements related to that category.


In this chapter, we focus on the goal of sampling a representative subgraph (scale-down sampling). The term "representative subgraph sampling", defined by Leskovec and Faloutsos (2006), refers to producing a small sample of the original network whose characteristics represent the entire network as accurately as possible. Many characteristics describe a network structure, such as the degree, clustering coefficient, and k-core distributions. Leskovec and Faloutsos (2006) proposed a set of empirical rules by which the measurements of the sample can be scaled up to recover estimates for the original graph. The work in Ebbes et al. (2012) investigated the ability of nine different sampling methods to preserve structural properties of social networks such as degree, clustering coefficient, betweenness centrality, and closeness centrality. Lee et al. (2005) applied three sampling algorithms and investigated the statistical properties of the samples taken by them, focusing on topological properties such as the degree distribution, average path length, assortativity, clustering coefficient, and betweenness centrality distribution.

4.2 Categorization of Graph Sampling Algorithms

Graph sampling algorithms can be classified in several ways. In the following, we first formalize the problem of sampling from social networks and then present four such classifications, namely random versus topology-based sampling, simple versus extended sampling, static versus streaming graph sampling, and unweighted versus weighted graph sampling.

Definition 4.1 (Graph sampling) Let G(V, E) be an unweighted and undirected graph with the node set V = {v1, v2, ..., vn} and the edge set E = {eij | vi ∈ V, vj ∈ V}, such that |V| = n denotes the number of nodes and |E| = m denotes the number of edges. The neighbourhood of node vi is defined as N(vi) = {vj | eij ∈ E, vj ∈ V}, such that d(vi) = |N(vi)| is the degree of node vi. In the case of streaming graphs, we assume that the edge set E is presented as a stream of edges arranged in an arbitrary sequential order. Given an input graph G(V, E) and a sampling fraction f, a sampling algorithm samples a representative subgraph Gs(Vs, Es) with a subset of the nodes Vs ⊂ V and a subset of the edges Es ⊂ {eij | vi ∈ Vs, vj ∈ Vs}, such that |Vs| = f·n (i.e. ns = |Vs| = f × |V|). The goal is to ensure that the sampled subgraph Gs preserves the properties of the original graph G.
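Definition 4.1 can be captured directly in code. The following is a minimal sketch, assuming networkx-style graph objects; the function name is ours and the node count is rounded down:

def is_valid_sample(G, Gs, f):
    # Check the Definition 4.1 constraints: Vs is a subset of V,
    # |Vs| = floor(f * |V|), and every sampled edge joins two sampled
    # nodes and exists in the original graph G.
    V, Vs = set(G.nodes()), set(Gs.nodes())
    if not (Vs <= V and len(Vs) == int(f * len(V))):
        return False
    return all(u in Vs and v in Vs and G.has_edge(u, v)
               for u, v in Gs.edges())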

4.2.1 Random Versus Topology-Based Sampling

Existing sampling algorithms can be categorized into two groups, random sampling and topology-based sampling, based on whether nodes or edges are selected at random from the original graph G (node and edge sampling) or whether the selection of nodes and edges depends on the existing topology of G (topology-based sampling).

Classic node sampling (NS) chooses nodes independently and uniformly at random from the graph G. That is, for a required fraction f of nodes, each node is chosen independently with probability f for inclusion in the sampled subgraph Gs. Finally, the nodes sampled in Vs, along with all the edges among them added to Es, constitute the subgraph Gs. While NS is intuitive and relatively straightforward, Stumpf et al. (2005) showed that it does not accurately capture the properties of graphs with power-law degree distributions. Similarly, the work in Lee et al. (2005) indicated that although classic node sampling captures nodes of different degrees well, the original level of connectivity is less likely to be preserved, since only the edges among the sampled node set are included. Several variations of node sampling have been proposed in the literature (Leskovec and Faloutsos 2006; Krishnamurthy et al. 2007; Ahmed et al. 2010, 2014; Ghavipour and Meybodi 2017). For example, Leskovec and Faloutsos (2006) studied variations of NS in which the selection probability of a node is proportional to either its degree or its PageRank weight. The work in Ghavipour and Meybodi (2017) also proposed a topology-based node sampling algorithm, called ICLA-NS, that utilizes an irregular cellular learning automata (ICLA) to guarantee the connectivity and the inclusion of high degree nodes in subgraphs initially sampled by the classic node sampling method.

Classic edge sampling (ES) chooses edges independently and uniformly at random from the graph G for inclusion in the sampled subgraph Gs. For each chosen edge, both incident nodes are added to Vs. Finally, Gs is constructed by including the edges sampled in Es and their end points in Vs. Classic edge sampling is likely to capture path lengths, due to its bias towards high degree nodes and its inclusion of both incident nodes of sampled edges (Ahmed et al. 2014). However, ES is less likely to preserve clustering and connectivity, since it samples edges independently (Lee et al. 2005), and it generally produces sparse subgraphs. There exist some improved variations of ES in the literature (Leskovec and Faloutsos 2006; Krishnamurthy et al. 2007; Ahmed et al. 2010, 2014; Ghavipour and Meybodi 2018).
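As a concrete illustration of the two classic schemes just described, here is a minimal sketch in Python, assuming networkx; the function names are ours:

import random
import networkx as nx

def node_sampling(G, f):
    # Classic NS: choose floor(f*n) nodes uniformly at random and keep
    # the subgraph induced by them (all edges among the sampled nodes).
    k = int(f * G.number_of_nodes())
    nodes = random.sample(list(G.nodes()), k)
    return G.subgraph(nodes).copy()

def edge_sampling(G, f):
    # Classic ES: choose edges uniformly at random (here: one shuffled
    # pass) until floor(f*n) distinct incident nodes are collected; the
    # sample keeps only the chosen edges and their end points.
    target = int(f * G.number_of_nodes())
    Gs = nx.Graph()
    edges = list(G.edges())
    random.shuffle(edges)
    for u, v in edges:
        if Gs.number_of_nodes() >= target:
            break
        Gs.add_edge(u, v)  # adds both end points to the sample
    return Gs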

Due to the known limitations of node and edge sampling (Lee et al. 2005; Stumpf et al. 2005; Leskovec and Faloutsos 2006), researchers have also considered other sampling methods based on the topological structure of the graph. The common idea in this class of methods is to select a set of initial nodes and then explore the neighbourhood of this node set; the sampled subgraph Gs is constructed from the nodes and edges explored. These methods can be categorized into two subclasses: random walks and graph traversals. In the category of random walks, sampling is performed with replacement, i.e. nodes can be revisited. This category includes classic Random Walk Sampling (RWS) (Lovász et al. 1993; Yoon et al. 2007; Gjoka et al. 2010; Lu and Li 2012) and its variations such as Weighted Random Walk (WRW) (Kurant et al. 2011a), Re-Weighted Random Walk (RWRW) (Rasti et al. 2009), Metropolis-Hastings Random Walk (MHRW) (Stutzbach et al. 2009; Lee et al. 2012), m-dependent Random Walk (Frontier Sampling) (Ribeiro and Towsley 2010), and Random Walks with jumps (Avrachenkov et al. 2010).

In the category of graph traversals, each node is visited at most once (sampling without replacement). Methods in this category differ in the order in which they visit the nodes; examples are Breadth-First Search (BFS) (Ahn et al. 2007; Mislove et al. 2007; Wilson et al. 2009), Depth-First Search (DFS) (Chauhan et al. 2011), Forest Fire Sampling (FFS) (Leskovec et al. 2005; Aggarwal 2006), Snowball Sampling (SBS) (Goodman 1961; Newman 2003b; Illenberger et al. 2011), Respondent-Driven Sampling (RDS) (Heckathorn 1997; Goel and Salganik 2010), and Expansion Sampling (Maiya and Berger-Wolf 2010).

Lu and Li (2012) showed that random walk sampling performs much better than NS and ES on a Twitter dataset. Kurant et al. (2011b) also compared RWS with the BFS method; they found that BFS overestimates the average node degree of the original graph, since it is biased towards high degree nodes, whereas random walk sampling underestimates the average degree. The authors proposed analytical solutions to correct the bias of BFS sampling. Forest Fire sampling behaves almost the same as the BFS method; the difference is that in FFS only a fraction of the neighbouring nodes are followed at each round. The work in Leskovec and Faloutsos (2006) showed that FFS matches the properties of the original graph very accurately. Lee et al. (2005) investigated the Snowball sampling method and observed that it accurately maintains connectivity within the snowball; however, SBS suffers from a boundary bias in which the nodes sampled in the last round miss a large number of their neighbours. Based on basic snowball sampling, Gao et al. (2014) developed a random multiple snowball with Cohen process sampling; their simulations indicated that this method is able to preserve local and global structures of the original graph. A variant of MHRW sampling with random jumps, which prevents the walker from being trapped in local structures, has been proposed in Jin et al. (2011). Piña-García and Gu (2013) altered the behaviour of MHRW sampling by using spirals as a probability distribution instead of a classic normal distribution and showed that their proposed algorithm outperforms normal MHRW in the case of the illusion spiral. Ribeiro and Towsley (2010) proposed a new approach to an m-dimensional random walk that starts its walkers at uniformly selected nodes and performs m dependent random walks in the original graph; they showed that their method, called Frontier Sampling (FS), mitigates the large estimation errors caused by disconnected or loosely connected regions, which can trap a random walker and distort the estimated characteristics.
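A minimal sketch of classic random walk sampling, again assuming networkx; the jump probability used here to escape traps is our illustrative choice, not part of the classic definition:

import random

def random_walk_sampling(G, f, jump=0.15):
    # Classic RWS sketch: walk from a random start node, recording
    # visited nodes until floor(f*n) distinct nodes are seen; with
    # probability `jump` (or at a dead end) the walker flies back to the
    # start node. Assumes the start component holds enough nodes.
    target = int(f * G.number_of_nodes())
    start = random.choice(list(G.nodes()))
    current, visited = start, {start}
    while len(visited) < target:
        neighbours = list(G.neighbors(current))
        if not neighbours or random.random() < jump:
            current = start
        else:
            current = random.choice(neighbours)
        visited.add(current)
    return G.subgraph(visited).copy()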

4.2.2 Simple Versus Extended Sampling

Simple sampling algorithms consist of one phase, which implements the sampling process based on one of the three sampling approaches (i.e. node, edge and topology-based sampling) or a combination of them. These algorithms sample each node they visit for the first time and terminate immediately after visiting the required number of nodes (according to the sampling fraction). Thus, in this group of sampling algorithms, there is a trade-off between the sampling fraction and the accuracy of sampling. The traditional sampling algorithms (Goodman 1961; Lovász et al. 1993; Heckathorn 1997; Lee et al. 2005; Leskovec and Faloutsos 2006; Maiya and Berger-Wolf 2010; Chauhan et al. 2011) fall in this group.

In contrast, extended sampling algorithms consist of two phases: one phase is dedicated to pre/post-processing, and the other implements a simple sampling algorithm. As a result, because of the time needed for pre/post-processing, algorithms in this group have higher time complexity compared to the simple sampling algorithms, but produce samples with higher accuracy. The pre-processing phase is usually performed with the goal of finding important nodes/edges to be added to the sample subgraph. Some extended sampling algorithms with a pre-processing phase are given in Rezvanian et al. (2014), Yoon et al. (2015), Rezvanian and Meybodi (2015), and Jalali et al. (2016a, b). In Rezvanian et al. (2014), during the pre-processing phase a distributed learning automata is used to find important nodes in the original graph. The algorithm reported in Rezvanian and Meybodi (2015), in its pre-processing phase, uses the concept of shortest paths to find important edges of the input graph. The sampling algorithm in Yoon et al. (2015) extracts the communities of the original graph in its pre-processing phase in order to find important nodes and edges in the graph.

Some extended sampling algorithms with a post-processing phase have also been reported in the literature. The post-processing phase is usually utilized with the goal of improving the initial sample created by a simple sampling algorithm. For instance, the work reported in Ahmed et al. (2014) creates an initial subgraph by sampling edges at the beginning of the edge stream such that a required percentage of nodes is sampled; the initial subgraph is then improved by randomly sampling the remaining edges of the stream with the goal of selecting nodes with high degree. Ghavipour and Meybodi (2018) extended this work by using learning automata to produce more connected sample subgraphs. Ghavipour and Meybodi (2017) also proposed an extended sampling algorithm with a post-processing phase, called ICLA-NS. The algorithm ICLA-NS first constructs an initial sample subgraph of the input graph using the node sampling method, and then uses an ICLA isomorphic to the input graph to improve the sample by repeatedly replacing nodes in the sample with nodes found by exploring the input graph. In order to evaluate the performance of the proposed sampling algorithm, the authors conducted a number of experiments on real-world networks. Based on their experimental results, ICLA-NS outperforms existing sampling algorithms such as node sampling, Random Walk sampling and Forest Fire sampling in terms of the Kolmogorov-Smirnov (KS) test for the degree, clustering coefficient, and k-core distributions.
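The two-phase structure shared by these extended algorithms can be sketched abstractly as follows; both callables are placeholders of ours for a concrete pre-processing step and a concrete simple sampler, not any specific published method:

def extended_sampling(G, f, preprocess, simple_sampler):
    # Phase 1 (pre-processing): score nodes, e.g. by a DLA-based
    # importance measure or shortest-path edge importance.
    scores = preprocess(G)
    # Phase 2: run a simple sampler seeded with the top-scored nodes.
    seeds = sorted(scores, key=scores.get, reverse=True)
    return simple_sampler(G, f, seeds)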

4.2.3 Static Versus Streaming Graph Sampling

The sampling algorithms based on the assumption of a static graph (Goodman 1961; Lovász et al. 1993; Heckathorn 1997; Lee et al. 2005; Leskovec and Faloutsos 2006; Maiya and Berger-Wolf 2010; Chauhan et al. 2011; Ghavipour and Meybodi 2017) consider the input graph only at one point in time and assume that it is of moderate size, fits in the main memory, and can be accessed in its entirety at any step. However, these assumptions are not realistic for many real-world networks. For instance, consider social activity networks formed from communications among users (such as wall posts, tweets, and emails), where any activity between two users results in the addition of an edge to the network graph. These networks are streaming and include a massive volume of edges. A streaming graph is considered to be a stream of edges that continuously evolves over time and is clearly too large to fit in memory (Ahmed et al. 2014). Under these conditions, the traditional sampling techniques are not appropriate: they consider the graph at one point in time, so the samples they take may become less relevant over time because of evolution, and they incur large I/O costs. When the original network has too many edges to fit in the main memory, sampling can only be done sequentially (one edge at a time), since random accesses on disk incur large I/O costs. In the static domain, topology-based sampling methods require the random exploration of a node's neighbours (which requires many passes over the edge stream if only sequential access is allowed), and the node sampling method requires random access to the entire node set of the network graph; thus, none of these methods is appropriate for sampling such large-scale networks. In addition, many applications need to analyze a dynamic network over time, for instance to investigate how the structure of communities evolves or to discover patterns of interactions among individuals. In these cases, too, static graph sampling algorithms are not appropriate, as they have no ability to update the sampled subgraph using edges that occur over time; several snapshots at different points in time must instead be taken from the original network, and for each snapshot the sampling process has to be repeated entirely to obtain an updated sample for that time point. As a result, it is necessary to develop sampling algorithms that can address the complexities of the streaming domain.

Analyzing streaming networks is increasingly important for identifying patterns of interactions among individuals and investigating how the network structure evolves over time. As a result, streaming graph sampling has received more attention in recent years, and researchers have developed algorithms for sampling from streaming graphs (Cormode and Muthukrishnan 2005; Ahmed et al. 2010, 2014; Aggarwal et al. 2011). Cormode and Muthukrishnan (2005) utilized a min-wise hash function to sample almost uniformly from the set of all edges that have appeared at any time in the graph stream; the sampled edges were later used to maintain cascaded summaries of the stream. The work in Aggarwal et al. (2011) proposed a reservoir sampling method based on min-wise hash sampling of edges in order to maintain structural summaries of the underlying graph; these structural summaries are designed to create dynamic and efficient models for detecting outliers in graph streams. Ahmed et al. (2010) developed a time-based sampling technique for sampling from activity graphs presented as a sequence of edges ordered over time. This method randomly selects a timestamp on the activity timeline of the graph and samples the nodes incident on edges that have occurred in a time window starting from that timestamp.
The algorithm repeats this process in a streaming fashion until the required fraction of nodes is collected. Finally, the sample set includes the selected nodes and any future edges that involve these nodes. The authors compared their method with traditional sampling algorithms such as node sampling and Forest Fire sampling.

Ahmed et al. (2014) outlined a spectrum of computational models for designing sampling algorithms, ranging from the simplest model based on the assumption of static graphs to the more challenging model of sampling from graph streams, and proposed several sampling algorithms based on the concept of graph induction, generalized across the spectrum from static to streaming. In the static domain, they proposed a sampling algorithm called induced edge sampling (ES-i), a combination of edge-based node sampling and graph induction, and demonstrated its better performance by comparing it with traditional sampling algorithms. In the streaming domain, Ahmed et al. (2014) addressed the massive number of edges (too large to fit in memory) and the continuously evolving edge stream, and adapted static sampling algorithms for streaming graphs. They presented streaming variations of node, edge and topology-based sampling, as well as a streaming variation of the algorithm ES-i, referred to as partially-induced edge sampling (PIES), all of which run in a single pass over the stream of edges. As reported in Ahmed et al. (2014), PIES preserves the underlying properties of the tested datasets more accurately than the other streaming algorithms. This algorithm is biased towards high degree nodes and provides a dynamic sample while the original graph is streaming. The algorithm receives as input a streaming graph presented as an arbitrarily ordered sequence of edges and adds the first m edges, incident to ns nodes, to the sample set. Then, it scans the rest of the stream and randomly samples edges such that any streaming edge is sampled with probability γ = m/t if at least one of the two nodes incident to that edge does not belong to the sample set, and otherwise with probability γ = 1. For any sampled edge, the incident nodes replace former sampled nodes chosen uniformly at random. As noted in Ahmed et al. (2014), PIES achieves better results for graphs that are sparse and less clustered. The algorithm PIES has two drawbacks: (1) streaming edges are each sampled independently, so the connectivity and clustering of the original graph are less likely to be preserved in sampled subgraphs; using partial graph induction (sampling some of the edges incident on the sampled nodes) can help the algorithm to recover only some of the original connectivity; (2) the incident nodes of any sampled edge replace nodes randomly selected from the sample, so it is possible that nodes with high activity are replaced while some low-activity or even isolated nodes remain in the sample. As a result of these drawbacks, PIES performs well for sparse, less clustered graphs, and its performance decreases as the graph becomes denser and more clustered. Meanwhile, online social networks tend to exhibit high levels of clustering (Jin et al. 2001; Tang and Liu 2010), and it has also been observed that the density of these networks increases over time (Leskovec et al. 2005; Kumar et al. 2010).
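A sketch of the PIES procedure as just described; this is our reading of the description above, not the implementation of Ahmed et al. (2014):

import random

def pies(edge_stream, n_s):
    # Reservoir phase: keep the first edges until n_s nodes are sampled;
    # m is the number of edges consumed when the reservoir filled.
    sampled_nodes, sampled_edges = set(), set()
    m = 0
    for t, (u, v) in enumerate(edge_stream, start=1):
        if len(sampled_nodes) < n_s:
            sampled_nodes.update((u, v))
            sampled_edges.add((u, v))
            m = t
            continue
        if u in sampled_nodes and v in sampled_nodes:
            sampled_edges.add((u, v))        # induced edge: prob. 1
        elif random.random() < m / t:        # otherwise: prob. m/t
            for node in (u, v):
                if node not in sampled_nodes:
                    out = random.choice(tuple(sampled_nodes))
                    sampled_nodes.discard(out)   # evict a random node
                    sampled_nodes.add(node)
            sampled_edges.add((u, v))
    # Keep only edges whose end points are still in the sample.
    return sampled_nodes, {e for e in sampled_edges
                           if e[0] in sampled_nodes and e[1] in sampled_nodes}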

Ghavipour and Meybodi (2018) developed the streaming sampling algorithm FLAS based on learning automata which, while maintaining the advantages of PIES (such as running in a single pass over the stream and considering the stream evolution), can overcome these drawbacks and produce high quality sample subgraphs also for dense and highly clustered graphs. In this algorithm, each node of the original activity graph is equipped with a learning automaton which decides whether its corresponding node should be added to the sample subgraph or not. Using learning automata helps the algorithm FLAS to overcome the drawbacks of PIES (Ahmed et al. 2014) and produce more connected sample subgraphs. Experimental results on real-world activity networks and synthetic networks indicated that FLAS is superior in terms of the representativeness of sampled subgraphs, and competitive in terms of space complexity, as compared to the PIES algorithm. The representativeness of FLAS was tested in terms of the Kolmogorov-Smirnov (KS) test for the degree, clustering coefficient, k-core and path length distributions, and in terms of normalized L1 and L2 distances for the eigenvalues and network values, respectively. According to these experimental results, FLAS obtains an improvement of about 66% for the degree distribution, 69% for the clustering coefficient distribution, 64% for the k-core distribution, 48% for the path length distribution, 78% for the eigenvalues, and 76% for the network values, at the cost of time complexity, as compared to PIES. For an input graph G(V, E) (where V is the set of nodes and E is the set of edges), FLAS needs O(|E| log |V|) disk accesses, while the number of disk accesses needed by PIES is O(|E|). It is worth mentioning that if there is enough main memory space, both algorithms FLAS and PIES have the same time complexity of O(|E|).

4.3 Learning Automata Based Graph Sampling Algorithms

In the following, we describe in detail some sampling algorithms proposed in the literature that use learning automata for producing representative subgraphs.

4.3.1 Distributed Learning Automata-Based Sampling (DLAS)

Distributed learning automata-based sampling (DLAS) is the first sampling algorithm empowered by learning automata (Rezvanian et al. 2014). In DLAS, it is assumed that each node of a social network is accessible through the unique ID of the corresponding user, and that the allowed connections among users form the corresponding edges between nodes. DLAS uses a set of learning automata to guide the process of visiting nodes towards the important parts of the input network, i.e. those parts of the network which contain important nodes. A DLA is constructed by assigning a learning automaton Ai to node vi of the input network; the action-set of each LA corresponds to selecting the edges of the node to which the LA is assigned. Each LA in the DLA initially selects its actions with equal probabilities. In an iteration of the algorithm, the DLA starts from a randomly chosen node, and then follows a path

Algorithm 4-1. DLA based sampling algorithm (DLAS)
Input: Network G(V, E), Maximum number of iterations K, Sampling ratio φ
Output: Sampled network Gs(Vs, Es)
Assumptions:
  Construct a DLA by assigning an automaton Ai to each node vi and initialize the action probabilities;
  Let k denote the iteration number of the algorithm, initially set to 1;
Begin
  Disable all learning automata;
  While (k ≤ K)
    Select a node vs at random as a starting node;
    While (the number of visited nodes < φ·|V|)
      Automaton As is enabled and selects an action according to its action probability vector;
      Let the action selected by As be vj;
      If (vj has already been visited in previous paths) then // favourable nodes
        Reward the selected action;
      End If
      vs ← vj;
      Disable As;
    End While
    k ← k + 1;
  End While
  Sort the visited nodes in descending order of the number of times they were visited;
  Construct the sampled network Gs using the φ·|V| most visited nodes;
End Algorithm

Fig. 4.1 Pseudo-code of the DLA-based sampling algorithm (DLAS) for deterministic networks

of nodes according to the probability vectors of the set of LAs constituting the DLA. If an LA selects an action corresponding to a node that has already been visited in previous paths (iterations), then the probability of selecting that action is increased according to the learning algorithm. After several iterations, the visited nodes are sorted according to their selection probabilities, and the sampled network is constructed from a given number of the most visited nodes (the sampling ratio multiplied by the number of nodes in the network, or equivalently a given percentage). The pseudo-code of the DLA-based sampling algorithm (DLAS) is given in Fig. 4.1.
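The reward step referred to here, and throughout this chapter, is typically a linear reward-inaction (L_{R-I}) update; a minimal sketch, with our function name:

def reward_lri(p, chosen, a=0.1):
    # L_{R-I} sketch: on a favourable response the chosen action's
    # probability moves toward 1 while the others shrink proportionally;
    # on an unfavourable response the vector is left unchanged.
    # `p` is the action probability vector, `a` the learning rate.
    for j in range(len(p)):
        if j == chosen:
            p[j] += a * (1.0 - p[j])
        else:
            p[j] *= (1.0 - a)
    return p

The update preserves the total probability mass of 1, so repeated rewards concentrate the vector on frequently favoured edges.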

4.3.2 Extended Distributed Learning Automata-Based Sampling (EDLAS)

The successful results of DLAS motivated the authors to develop a new learning automata-based algorithm for network sampling with the aid of extended distributed learning automata (Mollakhalili Meybodi and Meybodi 2014), called EDLAS. In EDLAS (Rezvanian and Meybodi 2017), a set of learning automata forming an eDLA isomorphic to the input network cooperate with each other in order to take appropriate samples from the network. The algorithm iteratively tries to identify promising nodes or edges by traversing promising parts of the network and then forms the sampled network. As in DLAS, it is assumed that each node of the social network is accessible through the unique ID of the corresponding user, and that the allowed connections among users form the corresponding edges. The algorithm uses an eDLA to guide the process of visiting nodes towards the important parts of the input network, i.e. those parts which contain more hub and central nodes.

At first, an eDLA isomorphic to the input network is constructed by assigning a learning automaton Ai to node vi of the input network. The action-set of each LA corresponds to selecting an edge of the node to which the LA is assigned, with equal initial probabilities. In an eDLA, at any time, an LA can be in one of four levels: Passive, Active, Fire and Off. Thus, in EDLAS, all learning automata are initially set to the Passive level. At the start of the algorithm, one of the nodes in the eDLA is chosen by the firing function to be the starting node, and its activity level is set to Active by the governing rule.

In what follows, an iteration of EDLAS is explained. At the beginning of an iteration, one of the active nodes is chosen at random by the firing function and fired (its activity level changes to Fire and the activity level of its passive neighbouring nodes changes to Active); it chooses one of its actions, corresponding to one of the edges of the fired node, and then the level of the fired LA changes to Off by the governing rule. Then another LA in the eDLA with activity level Active is chosen and fired in the same way. This process of selecting a learning automaton with activity level Active by the firing function, firing it, choosing an action by the fired LA, changing the activity levels of its passive adjacent LAs to Active, and changing the activity level of the fired LA from Fire to Off by the governing rule is repeated until either the number of LAs with activity level Off reaches a given number (the sampling ratio multiplied by the number of nodes in the network) or the set of LAs with activity level Active is empty. In EDLAS, the action probabilities of all fired learning automata are updated on the basis of the responses received from the environment: the probabilities of choosing the actions of all fired learning automata that correspond to edges traversed in previous iterations are updated according to the learning algorithm. At the end of each iteration, the activity levels of all learning automata in the eDLA are reset to Passive. Thus, EDLAS iteratively visits a set of paths and updates its action probability vectors until it reaches the maximum number of iterations. After that, the visited nodes are sorted in descending order of the number of times they have been visited, and the sampled network is constructed from a given number (the sampling ratio multiplied by the number of nodes in the network, or equivalently a given percentage) of the most visited nodes. Figure 4.2 shows the pseudo-code of the eDLA-based sampling algorithm. In EDLAS, visiting a node increases the probability of visiting it in later iterations; this means that the most visited nodes have incoming edges with high probabilities, so EDLAS constructs a sample of the network which includes central and hub nodes, leading to a better sampling of the network.

Algorithm 4-2. eDLA based sampling algorithm (EDLAS)
Input: Network G = (V, E) with V = {v1, v2, ..., vn}, Maximum number of iterations K, Sampling ratio φ
Output: Sampled network Gs = (Vs, Es)
Assumptions:
  Construct an eDLA by assigning an automaton Ai to each node vi of the input network G;
  Initialize the action probabilities of the automata and set the activity level of each LA to Passive;
  Let k denote the iteration number of the algorithm, initially set to 1;
  Let Pa be the set of learning automata with activity level Passive, initially {A1, A2, ..., An};
  Let Ac be the set of learning automata with activity level Active, initially empty;
  Let Of be the set of learning automata with activity level Off, initially empty;
  Let Fi be the learning automaton with activity level Fire, initially empty;
  Let S be an array of visited nodes of size n, with each element initially set to 0;
  Let N(vi) be a function that returns all adjacent nodes of vi at the Passive level;
Begin
  Select a starting node vs at random by the firing function and change its activity level to Active;
  Ac ← {As}; Pa ← Pa \ {As};
  While (k < K)
    Select one learning automaton As among the Active learning automata Ac by the firing function;
    Fi ← As; Ac ← Ac ∪ N(vs) \ {As}; Pa ← Pa \ Ac;
    Automaton As chooses an action using its action probability vector;
    Let the action chosen by As be (vs, vm);
    Sk[vs] ← Sk[vs] + 1; Sk[vm] ← Sk[vm] + 1;
    If (|Of| ≥ φ·|V| or Ac is empty) then
      If (Sk improves upon Sk−1) then // favourable traverse
        Reward the actions chosen by the learning automata with activity level Off
      Else
        Penalize the actions chosen by the learning automata with activity level Off
      End If
      Pa ← {A1, A2, ..., An}; Fi ← {}; Ac ← {}; Of ← {};
      Select a starting node vs at random by the firing function and change its activity level to Active;
      Ac ← {As}; Pa ← Pa \ {As};
    Else
      Of ← Of ∪ Fi; Fi ← {}
    End If
    k ← k + 1;
  End While
  Sort the visited nodes in descending order of the number of times they were visited;
  Construct the sampled network Gs using the φ·|V| most visited nodes;
End Algorithm

Fig. 4.2 Pseudo-code of the eDLA based sampling algorithm (EDLAS) for deterministic networks

4.3.3 The Extended Topology-Based Node Sampling Algorithm ICLA-NS

The algorithm ICLA-NS (Ghavipour and Meybodi 2017) is a topology-based node sampling algorithm using irregular cellular learning automata (ICLA). The original graph G(V, E) and the sampling fraction f are the inputs to this algorithm, and the output is a representative subgraph Gs(Vs, Es) which satisfies the constraint |Vs| = f·|V|. In this algorithm, an initial sample subgraph is first produced by randomly choosing f percent of the nodes in the input graph. Then, an ICLA isomorphic to the input graph attempts to improve the initial sample by repeatedly replacing nodes in the sample with nodes reached by exploring the input graph, with the goal of guaranteeing the connectivity of the sample subgraph while sampling the high degree nodes of the input graph. The pseudo-code for the algorithm ICLA-NS is given in Fig. 4.3. The algorithm consists of three phases: the ICLA mapping phase, the initialization phase and the improvement phase.

Algorithm 4-3. ICLA-NS
Input: Original graph G(V, E), Fraction of nodes f, Convergence threshold τ
Output: Sampled subgraph Gs(Vs, Es)
Begin
  // ICLA mapping phase
  Associate each node vi ∈ V with a cell equipped with an automaton Ai with d(vi) + 1 actions;
  // Initialization phase
  Vs ← a fraction f of the nodes in V, selected at random;
  Add the cells corresponding to the nodes in Vs to the queue Q;
  Determine the initial state for all cells;
  // Improvement phase
  Set the action probability vector of each automaton Ai according to Eq. (4.1);
  Repeat
    Let vi be the node corresponding to the cell at the front of Q;
    Repeat
      Automaton Ai chooses one of its actions according to its scaled action probability vector;
      If the selected action is its corresponding node vi then
        If there is at least one sampled node adjacent to vi then Reward Else Penalize;
        Remove the cell at the front of Q and add the cell corresponding to vi to the end of Q;
      Else // the selected action is a neighbour vk of vi
        If there is no sampled node adjacent to vi then Reward Else Penalize;
        vi ← vk;
      End If
    Until the selected action is "selection of the corresponding node";
    Update the sample set Vs by replacing the front node with the substitute node;
  Until the stop condition of Eq. (4.2) holds
End Algorithm

Fig. 4.3 The pseudo-code of the ICLA-NS algorithm

ICLA mapping phase: In this phase, an irregular cellular learning automata (ICLA) isomorphic to the input graph G is constructed. The action set of the learning automaton Ai associated with node vi has d(vi) + 1 actions, where d(vi) denotes the degree of node vi. The automaton Ai can either give the opportunity to its own node vi to join the sample set Vs (the action "selection of the corresponding node vi") or give the opportunity to one of the neighbours of vi, say vk ∈ N(vi), to join Vs (the action "selection of the neighbour vk of the node vi"). At the beginning of the algorithm, all the cells of the ICLA are considered to be inactive.

Initialization phase: In this phase, an initial subgraph of the input graph G is formed by randomly selecting f percent of the nodes of G (the selected nodes must be non-isolated). Every time a node vi is selected, it is placed in the sample set Vs, and its corresponding cell is added to the queue Q. The action probability vector of each learning automaton Ai is set such that if Ai is in one of the cells in Q, the action "selection of the corresponding node vi" has selection probability 1; otherwise the action "selection of the neighbour vk of the node vi", where vk is chosen at random from the neighbourhood of vi, has selection probability 1. After that, all the cells of the ICLA are activated synchronously, which in turn activates the learning automata to select an action based on their action probability vectors. This synchronous activation determines the initial state of all the cells of the ICLA: the action selected by each learning automaton represents the initial state of its corresponding cell. In this section, the state of a cell and the action chosen by the learning automaton of that cell are used interchangeably.

Improvement phase: In this phase, the ICLA starts operating asynchronously and attempts to improve the initial sample by repeatedly replacing its nodes with nodes reached by exploring the graph G, in such a way that high degree nodes are sampled while the connectivity of the sampled graph is ensured. The order of the cells in Q is used to activate the cells of the asynchronous ICLA during its operation; the cell located at the front of Q is always chosen for the next cell activation. In this phase, the action probability vector of each automaton Ai is initialized according to Eq. (4.1):

  p_i^j(0) = d(v_j) / ( d(v_i) + Σ_{v_k ∈ N(v_i)} d(v_k) )   for all v_j ∈ {v_i} ∪ N(v_i)     (4.1)

where p_i^j(0) is the probability that node v_j is chosen by Ai at instant t = 0 and N(v_i) denotes the neighbourhood of node v_i. Using Eq. (4.1) to initialize the probability vectors of the learning automata residing in the cells of the ICLA, nodes with high degrees are initially assigned higher selection probabilities.
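A minimal sketch of the Eq. (4.1) initialization, assuming a networkx-style graph; the function name is ours:

def initial_action_probs(G, vi):
    # Eq. (4.1): degree-proportional initial probabilities over vi and
    # its neighbours; the probabilities sum to 1 by construction, and
    # high degree candidates start with higher probability.
    candidates = [vi] + list(G.neighbors(vi))
    total = G.degree(vi) + sum(G.degree(vk) for vk in G.neighbors(vi))
    return {vj: G.degree(vj) / total for vj in candidates}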

This phase repeats the following three steps until the stopping condition is reached.

Step 1. Finding a substitute node by exploring the input graph
This step aims to find a substitute node for the node vi corresponding to the cell at the front of Q. For this purpose, the input graph G is explored starting from vi by a series of cell activations in the ICLA as follows. At first, the cell located at the front of Q is activated. As a result, the learning automaton Ai in this cell is activated in order to decide whether its corresponding node vi or one of the neighbouring nodes of vi should be sampled (as the substitute node). The set of available actions for Ai includes "selection of the corresponding node vi" and "selection of the neighbour vk of the node vi" for any neighbour vk ∈ N(vi) that has not already been sampled, i.e. {vi} ∪ {vk ∈ N(vi) | vk ∉ Vs}. The activated automaton Ai selects one of its available actions according to the scaled action probability vector. If Ai chooses the action "selection of the corresponding node vi", the exploration process is finished and node vi is considered as the substitute node. Otherwise, if Ai chooses the action "selection of the neighbour vk of the node vi", the cell corresponding to vk is activated, which in turn activates the automaton Ak to select an action, and so on. This series of cell activations, which explores the input graph G, continues until the automaton in an activated cell selects the action "selection of the corresponding node". Each time an activated automaton, say Ai, selects an action, it receives a reward or penalty signal based on the following rules:
1. If the action chosen by the automaton Ai is "selection of the corresponding node vi", this action is rewarded if at least one of the neighbouring nodes of vi has already been sampled; otherwise the action is penalized.
2. If the action chosen by the automaton Ai is "selection of the neighbour vk of the node vi", this action is rewarded if none of the neighbouring nodes of vi has already been sampled; otherwise the action is penalized.

Step 2. Updating sample set Vs
The sample set Vs is updated by replacing the node vi (which corresponds to the cell at the front of Q) with the substitute node found in the previous step. The cell at the front of Q is removed and the cell corresponding to the substitute node is added to the end of Q.

Step 3. Stop condition
The algorithm terminates when the product of the scaled probabilities of the action "selection of the corresponding node" over all the automata residing in the cells in Q (all the automata associated with the nodes in Vs) is greater than a predefined convergence threshold τ. In other words, the stop condition for the proposed algorithm is defined as

  ∏_{v_i ∈ V_s} p̂_ii(t) ≥ τ     (4.2)

Fig. 4.4 Description of the three phases of the proposed algorithm ICLA-NS by a simple example: (a) ICLA mapping phase: creating an ICLA isomorphic to the input graph G; (b) Initialization phase: sampling an initial subgraph and adding the cells corresponding to the sampled nodes to the queue Q; (c) Improvement phase: finding the substitute node v7 for the node v15 corresponding to the cell at the front of Q by traversing a path in G starting from v15, removing the cell at the front of Q, and adding the cell corresponding to v7 to the end of Q

where p̂_ii(t) is the scaled probability of the selection of vi by automaton Ai at instant t, and τ denotes a predefined convergence threshold. The set Vs that satisfies Eq. (4.2), along with all the edges between nodes in Vs, constitutes the final sampled subgraph. Figure 4.4 illustrates the different phases of the proposed algorithm ICLA-NS using a simple example.
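The stop condition of Eq. (4.2) can be checked as follows; computing the product in log-space is our choice to avoid numerical underflow:

import math

def converged(scaled_self_probs, tau=0.9):
    # Eq. (4.2): the product of every sampled node's scaled probability
    # of selecting itself must reach tau (assumes all probabilities > 0).
    return sum(math.log(p) for p in scaled_self_probs) >= math.log(tau)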

4.3.3.1 Experimental Evaluation

In this section, we present the experimental results reported in Ghavipour and Meybodi (2017). The sampling algorithm ICLA-NS has been tested on several real-world networks with various structural properties, and the results have been compared with those obtained by other well-known sampling algorithms. The datasets used in the experiments include CMU from the collection of Facebook networks, Advogato and Hamsterster from the collection of social networks, and CondMat, Erdos992 and Netscience from the collection of collaboration networks, borrowed from Rossi and Ahmed (2013). A summary of the global statistics of these datasets is provided in Table 4.1.

Network Measure
The performance of a sampling algorithm is measured by determining how well the subgraphs it samples match the properties of the original graph. The statistics considered in the experiments are the degree, clustering coefficient, and k-core distributions. We present a formal definition of these properties below.

Table 4.1 Characteristics of used datasets

Dataset       Nodes    Edges    Density     Global clustering
CondMat       21,363   91,286   4 × 10−4    0.642
CMU           6621     249,959  1.1 × 10−2  0.279
Advogato      6551     51,332   3.5 × 10−3  0.287
Erdos992      6100     7515     4 × 10−4    0.068
Hamsterster   2426     16,630   5.7 × 10−3  0.538
Netscience    379      914      1.3 × 10−2  0.741

Degree Distribution. The degree of a node in graph G is the number of connections or edges the node has to other nodes. The degree distribution P(d) is then the fraction of nodes in G with degree d, for every d ≥ 0:

  p(d) = |{vi ∈ V | d(vi) = d}| / n     (4.3)

where |V| = n and d(vi) denotes the degree of node vi. The degree distribution of many real-world networks follows a power-law distribution (Albert and Barabási 2002; Newman 2003a; Barabási 2004).

Clustering Coefficient Distribution. The clustering coefficient of a node is defined as the proportion of links between the nodes within its neighbourhood to the number of links that could possibly exist between them. The clustering coefficient distribution P(c) is then calculated over the nodes of the network G with clustering coefficient c, for every 0 ≤ c ≤ 1:

  p(c) = |{vi ∈ V | d(vi) > 1, CC(vi) = c}| / |{vi ∈ V | d(vi) > 1}|     (4.4)

where CC(vi) is the clustering coefficient of node vi with degree greater than 1, defined as the ratio between the number of edges among the nodes within the neighbourhood of vi and the total number of all possible edges between them (Barabási 2004):

  CC(vi) = 2 |{euv ∈ E | vu ∈ N(vi), vv ∈ N(vi)}| / ( d(vi) (d(vi) − 1) )     (4.5)

K-core Distribution. The k-core of graph G is defined as the largest subgraph of G in which all nodes have degree at least k. A node has coreness k if it belongs to the k-core but not to the (k + 1)-core. The k-core distribution P(k) is computed as the fraction of nodes having coreness k, for every k ≥ 0:

  p(k) = |{vi ∈ V | Core(vi) = k}| / n     (4.6)

where Core(vi) is the coreness of node vi, i.e. the largest value of k such that vi belongs to the k-core. The k-core is a metric of the connectivity and community structure of a graph (Alvarez-Hamelin et al. 2005; Carmi et al. 2006; Kumar et al. 2010). The core sizes also reflect the localized density of subgraphs in a graph (Seshadhri et al. 2013).
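These distributions can be computed directly; a minimal sketch using networkx (nx.clustering implements Eq. (4.5) and nx.core_number yields the coreness of Eq. (4.6)); the function names are ours:

from collections import Counter
import networkx as nx

def degree_distribution(G):
    # p(d) of Eq. (4.3): fraction of nodes with degree d.
    n = G.number_of_nodes()
    counts = Counter(d for _, d in G.degree())
    return {d: c / n for d, c in counts.items()}

def clustering_distribution(G):
    # p(c) of Eq. (4.4): distribution of clustering coefficients over
    # nodes with degree > 1.
    cc = {v: c for v, c in nx.clustering(G).items() if G.degree(v) > 1}
    if not cc:
        return {}
    counts = Counter(cc.values())
    return {c: k / len(cc) for c, k in counts.items()}

def kcore_distribution(G):
    # p(k) of Eq. (4.6): fraction of nodes with coreness k.
    n = G.number_of_nodes()
    counts = Counter(nx.core_number(G).values())
    return {k: c / n for k, c in counts.items()}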

Evaluation Metric
The goal is to sample a representative subgraph Gs from the original graph G such that the distance between a property of G and that of Gs is minimized. In the experiments, the following distance measure has been used for the evaluation.

Kolmogorov-Smirnov (KS) statistic. This statistic is widely used as a measure of the agreement between two cumulative distribution functions (CDFs) (Goldstein et al. 2004). The KS statistic is calculated as the maximum vertical distance between the two distributions,

  KS = max_x |F(x) − F′(x)|     (4.7)

where x denotes the range of the random variable, F and F′ are the two CDFs, and 0 ≤ KS ≤ 1. In Ghavipour and Meybodi (2017), the KS statistic has been used to compute the distance between the true distribution of the original graph and the approximate distribution obtained from the sampled subgraph, for the degree, clustering coefficient, and k-core distributions.

Experimental Results
In order to investigate the performance of the proposed algorithm ICLA-NS, several experiments have been conducted on the real-world datasets described in Table 4.1. In experiments I and II, the impact of the mechanism used for activating the cells of the ICLA and of the sampling rate on the performance of ICLA-NS is investigated. In experiments III and IV, the algorithm ICLA-NS is compared with some well-known sampling algorithms: Node Sampling (NS), Random Walk Sampling (RWS), Forest Fire Sampling (FFS), and Distributed Learning Automata based Sampling (DLAS) (Rezvanian et al. 2014). The authors also defined a cost function for extended sampling algorithms, which computes the cost spent by an algorithm in the pre/post-processing phase, and studied the impact of increasing cost on the performance of ICLA-NS in experiment V. Finally, they investigated the impact of the learning process implemented by the ICLA on the performance of ICLA-NS in experiment VI. In ICLA-NS, the learning algorithm for updating the action probability vectors is L_{R−I}, the convergence threshold τ is set to 0.9, and the learning rate a to 0.1. Each reported result is an average over 30 independent runs for a range of sampling fractions f from 0.1 to 0.3 with increment 0.05.
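A minimal sketch of the KS computation over two observed samples (e.g. the degree sequences of G and Gs); scipy.stats.ks_2samp computes the same statistic:

import numpy as np

def ks_distance(values_a, values_b):
    # Eq. (4.7): maximum vertical distance between the empirical CDFs
    # of two samples, evaluated on the union of observed values.
    a, b = np.sort(values_a), np.sort(values_b)
    grid = np.union1d(a, b)
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))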

Table 4.2 The impact of the cell activation mechanism on ICLA-NS's efficiency in terms of KS distance, at sampling fraction 0.2 (values given as Mechanism I / Mechanism II)

Dataset                Deg (I / II)      Clust (I / II)    K-core (I / II)
CondMat                0.0752 / 0.0987   0.2096 / 0.3067   0.1566 / 0.1628
CMU                    0.1919 / 0.1908   0.1253 / 0.2409   0.4520 / 0.4559
Advogato               0.3306 / 0.3314   0.1086 / 0.1236   0.3335 / 0.3345
Erdos992               0.1512 / 0.3291   0.1770 / 0.1104   0.1342 / 0.3291
Hamsterster            0.1478 / 0.1308   0.2669 / 0.2971   0.1726 / 0.1724
Netscience             0.1230 / 0.4309   0.0945 / 0.1887   0.1865 / 0.6000
Avg. for all datasets  0.1700 / 0.2520   0.1637 / 0.2112   0.2392 / 0.3424

Experiment I
In this experiment, the authors studied the impact of the mechanism used for activating the cells of the ICLA on the performance of the algorithm ICLA-NS. For this purpose, the two following mechanisms have been tested:
Mechanism I: the cells of the ICLA are activated according to the order in which they appear in the queue.
Mechanism II: the cells of the ICLA are activated by randomly selecting a cell from the set of all cells in the queue.
Tables 4.2 and 4.3 show the impact of these two mechanisms on the performance of ICLA-NS in terms of the KS distance for the degree, clustering coefficient and k-core distributions, for the sampling fractions 0.2 and 0.3 respectively. From these tables, we observe that for the sampling fraction 0.2 both mechanisms provide almost similar results for the degree and k-core distributions in all datasets, except for Erdos992 and Netscience, where Mechanism I performs considerably better; for the clustering coefficient distribution, Mechanism I outperforms Mechanism II except on Erdos992. For the sampling fraction 0.3, Mechanism I provides better results than Mechanism II for all or most of the statistics in all datasets except Advogato. Therefore, the authors used Mechanism I in their proposed algorithm ICLA-NS in the subsequent experiments.

Experiment II
This experiment aims to investigate the impact of the sampling rate f on the performance of ICLA-NS in terms of the KS distance. The authors considered the degree, clustering coefficient and k-core distributions and, for each statistic, plotted the KS distance for all the datasets at sampling fractions f ranging from 0.1 to 0.3. As shown in Fig. 4.5, the KS distance decreases as the sampling rate increases, although the decrease is not significant for CondMat for the degree and k-core distributions. The highest KS distances for the degree, clustering coefficient and k-core distributions are obtained in the Advogato, Hamsterster and CMU datasets, respectively.

Table 4.3 The impact of the cell activation mechanism on ICLA-NS's efficiency in terms of KS distance, at sampling fraction 0.3 (values given as Mechanism I / Mechanism II)

Dataset                Deg (I / II)      Clust (I / II)    K-core (I / II)
CondMat                0.0792 / 0.1212   0.1490 / 0.2484   0.1607 / 0.1897
CMU                    0.1782 / 0.1804   0.0419 / 0.1430   0.4283 / 0.4298
Advogato               0.2873 / 0.2420   0.0953 / 0.0899   0.2918 / 0.2622
Erdos992               0.1131 / 0.3678   0.1428 / 0.1113   0.1128 / 0.3678
Hamsterster            0.0801 / 0.0825   0.2138 / 0.2179   0.1704 / 0.1532
Netscience             0.1041 / 0.4177   0.0912 / 0.1779   0.1547 / 0.6160
Avg. for all datasets  0.1403 / 0.2353   0.1223 / 0.1647   0.2198 / 0.3364

Fig. 4.5 The impact of sampling rate on ICLA-NS's efficiency in terms of KS distance: (a) Degree, (b) Clustering Coefficient, (c) K-core Decomposition. Each panel plots the KS distance against the sampling fraction (0.1–0.3) for the Netscience, Erdos992, Advogato, Hamsterster, CondMat and CMU datasets.

Table 4.4 KS distance for degree distribution for all datasets, at sampling fraction 0.2

Dataset                NS      RW      FFS     DLAS    EDLAS   ICLA-NS
CondMat                0.5729  0.4358  0.1660  0.4165  0.4117  0.0752
CMU                    0.5505  0.4682  0.2945  0.2813  0.2699  0.1919
Advogato               0.2827  0.3023  0.4213  0.3467  0.3361  0.3306
Erdos992               0.6826  0.1714  0.2227  0.2849  0.2293  0.1512
Hamsterster            0.4608  0.4857  0.2990  0.2804  0.2783  0.1478
Netscience             0.6737  0.2201  0.1357  0.1082  0.1030  0.1230
Avg. for all datasets  0.5372  0.3473  0.2565  0.2863  0.2714  0.1700

Table 4.5 KS distance for clustering coefficient distribution for all datasets, at sampling fraction 0.2

Dataset                NS      RW      FFS     DLAS    EDLAS   ICLA-NS
CondMat                0.2444  0.7106  0.1432  0.1800  0.1789  0.2096
CMU                    0.1484  0.0884  0.2161  0.0834  0.0859  0.1253
Advogato               0.2010  0.7779  0.1678  0.1334  0.1164  0.1086
Erdos992               0.3893  0.3703  0.2055  0.1954  0.1801  0.1770
Hamsterster            0.2244  0.8181  0.2893  0.2668  0.2563  0.2669
Netscience             0.3162  0.3528  0.0947  0.2129  0.1370  0.0945
Avg. for all datasets  0.2540  0.5197  0.1861  0.1787  0.1591  0.1637

Experiment III
This experiment aims to investigate the ability of the proposed algorithm ICLA-NS to preserve graph statistics compared to other sampling algorithms, including Node Sampling (NS), Random Walk Sampling (RWS), Forest Fire Sampling (FFS), and Distributed Learning Automata based Sampling (DLAS). The comparison is made in terms of the KS distance for three graph statistics. We report the results for the sampling fraction 0.2 in Tables 4.4, 4.5 and 4.6, and for the sampling fraction 0.3 in Tables 4.7, 4.8 and 4.9. Based on the results, ICLA-NS considerably outperforms the other sampling algorithms for the degree distribution in all datasets, except for Advogato and Netscience, where ICLA-NS provides results close to the best ones. As noted before, ICLA-NS has a selection bias towards high degree nodes that improves the estimates of the sampled degree distribution. For the k-core distribution, ICLA-NS also performs best in the majority of the test datasets and close to the best in the rest. However, ICLA-NS performs only slightly better on average than DLAS and FFS for the clustering coefficient distribution, providing the best results for this statistic only in the Advogato, Erdos992 and Netscience datasets. Generally, DLAS performs almost similarly to FFS, while NS and RWS perform worst on average among the compared methods for all statistics.

Table 4.6 KS distance for k-core distribution for all datasets, at sampling fraction 0.2

Dataset                NS      RW      FFS     DLAS    EDLAS   ICLA-NS
CondMat                0.6769  0.6740  0.2406  0.4744  0.3900  0.1566
CMU                    0.7297  0.6719  0.5340  0.5111  0.4625  0.4520
Advogato               0.3163  0.3979  0.4351  0.3504  0.3429  0.3335
Erdos992               0.6826  0.1256  0.2290  0.2577  0.1692  0.1342
Hamsterster            0.5531  0.6813  0.3392  0.3180  0.2762  0.1726
Netscience             0.8195  0.4110  0.2096  0.2321  0.1928  0.1865
Avg. for all datasets  0.6297  0.4936  0.3312  0.3573  0.3056  0.2392

Table 4.7 KS distance for degree distribution for all datasets, at sampling fraction 0.3

Dataset                NS      RW      FFS     DLAS    EDLAS   ICLA-NS
CondMat                0.4616  0.3823  0.1682  0.3619  0.3016  0.0792
CMU                    0.4364  0.3764  0.2334  0.2664  0.1879  0.1782
Advogato               0.2434  0.2940  0.4085  0.3257  0.2859  0.2873
Erdos992               0.5789  0.1337  0.1694  0.1769  0.1715  0.1131
Hamsterster            0.3770  0.4033  0.2847  0.2011  0.1289  0.0801
Netscience             0.5991  0.2044  0.0724  0.0810  0.0988  0.1041
Avg. for all datasets  0.4494  0.2990  0.2228  0.2355  0.1958  0.1403

Table 4.8 KS distance for clustering coefficient distribution for all datasets, at sampling fraction 0.3

Dataset                NS      RW      FFS     DLAS    EDLAS   ICLA-NS
CondMat                0.1850  0.6586  0.1312  0.1514  0.1532  0.1490
CMU                    0.0914  0.0694  0.1543  0.0399  0.0374  0.0419
Advogato               0.1747  0.7081  0.1171  0.1142  0.1058  0.0953
Erdos992               0.2480  0.2668  0.1573  0.1480  0.1491  0.1428
Hamsterster            0.1507  0.7961  0.2466  0.2010  0.2131  0.2138
Netscience             0.2776  0.3143  0.0936  0.1617  0.1344  0.0912
Avg. for all datasets  0.1879  0.4689  0.1500  0.1360  0.1322  0.1223

Table 4.10 reports the average KS distance taken over the degree, clustering coefficient and k-core distributions for all datasets. According to these results, we can conclude that the proposed algorithm ICLA-NS preserves the graph statistics more accurately than the other sampling algorithms on most of the test datasets.


Table 4.9 KS distance for k-core distribution for all datasets, at sampling fraction 0.3

Dataset                 NS       RW       FFS      DLAS     EDLAS    ICLA-NS
CondMat                 0.5680   0.6063   0.2432   0.4268   0.3441   0.1607
CMU                     0.6320   0.5806   0.4679   0.4705   0.4399   0.4283
Advogato                0.2639   0.3277   0.4200   0.3304   0.3251   0.2918
Erdos992                0.5789   0.1239   0.1745   0.1613   0.1709   0.1128
Hamsterster             0.4488   0.6253   0.3258   0.2134   0.2014   0.1704
Netscience              0.7367   0.3895   0.1404   0.1196   0.1273   0.1547
Avg. for all datasets   0.5380   0.4422   0.2953   0.2870   0.2681   0.2198

Table 4.10 Average KS distance over three measures for all datasets, at sampling fraction 0.2

Dataset                 NS       RW       FFS      DLAS     EDLAS    ICLA-NS
CondMat                 0.4981   0.6068   0.1833   0.3570   0.3220   0.1471
CMU                     0.4762   0.4095   0.3482   0.2919   0.2283   0.2564
Advogato                0.2667   0.4927   0.3414   0.2768   0.2563   0.2576
Erdos992                0.5848   0.2224   0.2191   0.2460   0.2045   0.1541
Hamsterster             0.4128   0.6617   0.3092   0.2884   0.2480   0.1958
Netscience              0.6031   0.3280   0.1467   0.1844   0.1017   0.1347
Avg. for all datasets   0.4736   0.4535   0.2580   0.2741   0.2268   0.1910

In this experiment, the authors also investigated the ability of ICLA-NS to preserve the local density in sampled subgraphs in comparison with the other sampling algorithms. For this purpose, the maximum core number of the subgraphs sampled by each algorithm is compared to the real maximum core number of each dataset at the sampling fraction 0.2, where the maximum core number is defined as the maximum value of k in the k-core distribution. As shown in Table 4.11, ICLA-NS captures the local density in subgraphs sampled from CondMat, Advogato and Erdos992. For CMU, ICLA-NS performs better than the other sampling methods. ICLA-NS is as good as FFS for Hamsterster, and as good as FFS and DLAS for Netscience. Experiment IV In this experiment, the authors investigated whether the statistics in subgraphs sampled by the proposed algorithm ICLA-NS overestimate or underestimate the statistics of the original graph. For this purpose, they plotted the degree, clustering coefficient and k-core distributions at sampling fraction 0.2 for the proposed algorithm and the other sampling algorithms. Figures 4.6, 4.7, 4.8, 4.9, 4.10 and 4.11 show the plots for all the distributions across all datasets. Note that in the figures, P(X > x) denotes the CCDF and P(X < x) the CDF. As shown in these figures, the proposed algorithm ICLA-NS captures the tail of the degree distribution for Advogato and Hamsterster better than the other methods.


Table 4.11 Maximum core number for sampling algorithms versus the real maximum core number of the original graph, at sampling fraction 0.2

Dataset       Real max core no.   NS   RW   FFS   DLAS   EDLAS   ICLA-NS
CondMat       25                  8    4    23    12     -       25
CMU           69                  15   20   44    32     35      67
Advogato      25                  7    3    23    22     23      25
Erdos992      7                   2    3    7     7      7       7
Hamsterster   24                  8    2    21    19     20      21
Netscience    8                   2    4    6     6      6       6
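The maximum core number reported in Table 4.11 (the largest k for which a non-empty k-core exists) can be obtained directly from a k-core decomposition; a small sketch using networkx, assuming a simple undirected graph:

import networkx as nx

def max_core_number(G):
    # Largest k for which the k-core of G is non-empty (Table 4.11)
    H = nx.Graph(G)                                  # simple undirected copy
    H.remove_edges_from(list(nx.selfloop_edges(H)))  # core_number forbids self-loops
    return max(nx.core_number(H).values())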


Fig. 4.6 Comparing sampling algorithms on CondMat dataset for different distribution statistics, at sampling fraction 0.2


Fig. 4.7 Comparing sampling algorithms on CMU dataset for different distribution statistics, at sampling fraction 0.2

ICLA-NS underestimates the low degrees for Advogato, Erdos992 and Hamsterster, overestimates them for Netscience, and captures them for CondMat and CMU. FFS captures the high degrees and underestimates the low degrees for all datasets except CMU and Netscience. DLAS performs similarly to FFS for Advogato and Hamsterster and underestimates the degree distribution for the other datasets. RWS and NS underestimate the degree distribution for all datasets. ICLA-NS preserves the clustering coefficient distribution for Netscience more accurately than the other methods. Like ICLA-NS, DLAS and FFS underestimate the low clustering coefficients for all datasets except CondMat and Netscience, and overestimate the high clustering coefficients for all datasets except Erdos992. RWS and NS underestimate the clustering coefficient distribution for all datasets. ICLA-NS captures the k-core distribution for CondMat, CMU, Advogato, Erdos992 and Hamsterster better than the other methods. Like ICLA-NS, DLAS and FFS underestimate the core structures for Netscience and overestimate them for Advogato, Erdos992 and Hamsterster. RWS underestimates the k-core distribution for CondMat, CMU and Netscience, and NS underestimates it for all datasets.


Fig. 4.8 Comparing sampling algorithms on Advogato dataset for different distribution statistics, at sampling fraction 0.2

Experiment V This experiment aims to show the impact of increasing the cost on the performance of ICLA-NS in comparison with the DLAS algorithm (as an example of an extended sampling algorithm with a pre-processing phase). For this purpose, the authors used the following definition for the cost of an extended sampling algorithm:

Cost = |V_visited| / |V|        (4.8)

where |V_visited| is the total number of times that the nodes of the input graph are visited during the pre/post-processing phase of the algorithm, and |V| is the number of nodes in the input graph. The cost function thus measures the additional processing that an extended sampling algorithm spends in the pre/post-processing phase to produce subgraphs of higher quality.


Fig. 4.9 Comparing sampling algorithms on Erdos992 dataset for different distribution statistics, at sampling fraction 0.2

In this experiment, the authors varied the cost from 9 to 45 in steps of 9 and, for each value of the cost, reported the KS distance averaged over all datasets for the degree, clustering coefficient and k-core distributions at the sampling fraction 0.2. From Fig. 4.12, we can see that increasing the cost decreases the KS distance for both extended sampling algorithms, ICLA-NS and DLAS. For the same cost, ICLA-NS always provides better results than DLAS. They also compared ICLA-NS with the simple sampling algorithms FFS, RWS and NS in terms of the average KS distance over all datasets. Table 4.12 reports these results for the degree, clustering coefficient and k-core distributions at the sampling fraction of 0.2. According to the results of this table, even though ICLA-NS has higher time complexity than the simple sampling algorithms for the same sample size, it preserves the graph statistics more accurately.


Fig. 4.10 Comparing sampling algorithms on Hamsterster dataset for different distribution statistics, at sampling fraction 0.2

Table 4.12 Comparing ICLA-NS with simple sampling algorithms in terms of average KS distance at sampling fraction 0.2

Statistic   NS       RW       FFS      ICLA-NS
Deg         0.5372   0.3473   0.2565   0.1700
Clust       0.2540   0.5197   0.1861   0.1673
K Core      0.6297   0.4936   0.3312   0.2392

Experiment VI This experiment is conducted to study the learning ability of ICLA-NS and its impact on the quality of the sampled subgraphs in terms of the KS distance for the degree, clustering coefficient and k-core distributions. To do this, the authors considered another version of the ICLA-NS algorithm, called PC-NS, in which each learning automaton in the ICLA is replaced by a pure-chance automaton. A pure-chance automaton always selects its actions with equal probabilities and serves as the comparison standard for investigating the efficiency of the learning process: a learning automaton must do at least better than a pure-chance automaton (Thathachar and Sastry 2002).


Fig. 4.11 Comparing sampling algorithms on Netscience dataset for different distribution statistics, at sampling fraction 0.2

They compared the performance of the two algorithms in terms of the KS distance for all datasets. We report their results in Tables 4.13 and 4.14 for the sampling fractions 0.2 and 0.3, respectively. According to the results, ICLA-NS performs considerably better than PC-NS for degree distribution in all datasets except Advogato. For clustering coefficient distribution, ICLA-NS also outperforms PC-NS in all datasets except Hamsterster. ICLA-NS preserves the k-core distribution better than PC-NS in all test datasets except Advogato at the sampling fraction of 0.2. The authors also investigated the number of isolated nodes in subgraphs sampled by the ICLA-NS algorithm. They compared the probability of isolated nodes for ICLA-NS to that for PC-NS and the other sampling algorithms. As shown in Table 4.15, for all datasets the algorithms PC-NS, DLAS and NS produce samples that include isolated nodes, because these algorithms sample the nodes independently. In contrast, ICLA-NS, FFS and RWS produce connected subgraphs, even for Advogato, whose original graph itself includes some isolated nodes.
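To make the comparison standard concrete, the sketch below contrasts a pure-chance automaton with a variable-structure automaton using the L_RI scheme; this is a generic illustration (the class names and the choice of L_RI here are ours), not the automata actually wired into the ICLA.

import random

class PureChanceAutomaton:
    # Comparison baseline: every action is always equally likely.
    def __init__(self, n_actions):
        self.n = n_actions
    def select(self):
        return random.randrange(self.n)
    def update(self, action, rewarded):
        pass  # no learning takes place

class LRIAutomaton:
    # Variable-structure automaton with the L_RI (linear reward-inaction)
    # scheme, shown only to contrast with the pure-chance baseline.
    def __init__(self, n_actions, a=0.05):
        self.p = [1.0 / n_actions] * n_actions
        self.a = a
    def select(self):
        return random.choices(range(len(self.p)), weights=self.p)[0]
    def update(self, action, rewarded):
        if rewarded:  # shift probability mass toward the rewarded action
            for i in range(len(self.p)):
                self.p[i] = (self.p[i] + self.a * (1 - self.p[i])
                             if i == action else (1 - self.a) * self.p[i])
        # on penalty, L_RI leaves the probabilities unchanged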

Fig. 4.12 Comparing extended sampling algorithms (ICLA-NS vs. DLAS) in terms of average KS distance versus cost, at sampling fraction 0.2: a degree, b clustering coefficient, c k-core decomposition

Table 4.13 The impact of the learning process on ICLA-NS's efficiency in terms of KS distance, at sampling fraction 0.2

                        Deg                 Clust               K Core
Dataset                 ICLA-NS   PC-NS     ICLA-NS   PC-NS     ICLA-NS   PC-NS
CondMat                 0.0752    0.5527    0.2096    0.2550    0.1566    0.6207
CMU                     0.1919    0.5186    0.1253    0.1791    0.4520    0.6109
Advogato                0.3306    0.2326    0.1086    0.1972    0.3335    0.2811
Erdos992                0.1512    0.6021    0.1770    0.3824    0.1342    0.6021
Hamsterster             0.1478    0.4560    0.2669    0.2057    0.1726    0.5172
Netscience              0.1230    0.6374    0.0945    0.4049    0.1865    0.6670
Avg. for all datasets   0.1700    0.4999    0.1637    0.2707    0.2392    0.5498


Table 4.14 The impact of the learning process on ICLA-NS's efficiency in terms of KS distance, at sampling fraction 0.3

                        Deg                 Clust               K Core
Dataset                 ICLA-NS   PC-NS     ICLA-NS   PC-NS     ICLA-NS   PC-NS
CondMat                 0.0792    0.4836    0.1490    0.2129    0.1607    0.5582
CMU                     0.1782    0.4602    0.0419    0.1297    0.4283    0.5756
Advogato                0.2873    0.2414    0.0953    0.1649    0.2918    0.3095
Erdos992                0.1131    0.5771    0.1428    0.3071    0.1128    0.5771
Hamsterster             0.0801    0.4045    0.2138    0.2096    0.1704    0.4590
Netscience              0.1041    0.5869    0.0912    0.2833    0.1547    0.6593
Avg. for all datasets   0.1403    0.4589    0.1223    0.2179    0.2198    0.5231

Table 4.15 Comparing sampling algorithms in terms of the probability of isolated nodes, at sampling fraction 0.2

Dataset       Real probability   NS       DLAS     EDLAS    PC-NS    ICLA-NS, FFS, RWS
CondMat       0                  0.3267   0.2371   0.2259   0.5470   0
CMU           0                  0.0733   0.0211   0.0724   0.5000   0
Advogato      0.2113             0.4672   0.0351   0.0061   0.4748   0
Erdos992      0                  0.6811   0.0236   0.0531   0.8027   0
Hamsterster   0                  0.3299   0.0082   0.0132   0.4550   0
Netscience    0                  0.5208   0.0132   0.0546   0.5921   0

4.3.4 The Streaming Sampling Algorithm FLAS

FLAS (Ghavipour and Meybodi 2018) is a fixed structure learning automata based algorithm for sampling from streaming graphs. Its inputs are an activity graph G(V, E), presented as a sequence of edges in an arbitrary order, and the sample size n_s. The output of the algorithm is a representative sample G_s(V_s, E_s) with |V_s| = n_s that matches the properties of the graph G. FLAS is described on the basis of the state transitions of the G_{2N,2} automaton with the stochastic transition function depicted in Figs. 1.5 and 1.6. The G_{2N,2} automaton behaves deterministically when it is rewarded and stochastically when it receives a penalty, i.e. γ1 = 0 and γ2 ∈ [0, 1). Since γ1 is set to zero, in the rest of this section we ignore it and refer to γ2 simply as γ. The pseudo-code of FLAS is given in Fig. 4.13. FLAS consists of two parts: an initialization part and an updating part.
Initialization part. At first, an initial sample graph is created by successively adding edges one by one from the beginning of the stream to E_s, and their incident nodes to V_s, until the required number of nodes has been sampled (|V_s| = n_s).


Algorithm 4-4. FLA-based sampling (FLAS)
Input: Edge stream of graph G(V, E), sample size n_s
Output: Sampled subgraph G_s(V_s, E_s)
// Initialization part
While |V_s| < n_s Do
    Read the next edge e = (v_i, v_j) from the stream; add e to E_s and its incident nodes to V_s
    For each node v_k incident to e Do
        If v_k is visited for the first time Then
            Assign a G_{2N,2} automaton LA_k to v_k in the boundary state of action α2 (Φ_k ← φ_2N)
        End If
    End For
End While
// Updating part
While the graph is streaming Do
    Read the next edge e_t = (v_i, v_j)
    For each node v_k incident to e_t: if v_k is visited for the first time Then
        Assign a G_{2N,2} automaton LA_k to v_k in the boundary state of action α1 (Φ_k ← φ_N)
    End For
    If LA_i and LA_j are both in action α2 Then
        Reward LA_i and LA_j; add e_t to E_s
    Else If exactly one automaton (say LA_i) is in action α1 Then
        Reward LA_j; penalize LA_i (generate r at random in (0, 1): move towards the most internal state if r < γ, towards the boundary otherwise)
        If LA_i has switched to action α2 Then
            Add e_t to E_s and v_i to V_s
            Choose a sampled node whose automaton state is closest to its boundary state; delete it with all its incident edges; reset its automaton to φ_1
        End If
    Else // both LA_i and LA_j are in action α1
        Penalize LA_i and LA_j
        If both have switched to action α2 Then
            Add e_t to E_s and v_i, v_j to V_s
            Choose two sampled nodes whose automata states are closest to their boundary states; delete them with all their incident edges; reset their automata to φ_1
        End If
    End If
End While

Fig. 4.13 The pseudo-code of the FLAS algorithm when the G_{2N,2} automaton is used

When a node v_k is visited for the first time, it is equipped with a fixed structure learning automaton LA_k with two actions, α1 and α2. Initially, the states of the learning automata corresponding to the nodes in V_s are set to the boundary state of action α2 (for all v_k ∈ V_s, the current state of LA_k is set to φ_2N, i.e. Φ_k = φ_2N). Note that an automaton being in one of the states of action α2 indicates that the corresponding node is in V_s, while being in one of the states of action α1 indicates that it is in V − V_s. The state within the action of a learning automaton indicates the strength of membership of the corresponding node.


Updating part. In this part, the algorithm consecutively processes the remaining edges of the stream and keeps updating the sample as long as the graph is streaming. The decision whether to sample a streaming edge is taken based on the states of the learning automata of the nodes incident on that edge. When a node is visited for the first time, it is equipped with a fixed structure learning automaton residing in the boundary state of action α1 (i.e. φ_N). With this in mind, each time the algorithm is confronted with an edge (an activity) e_t = (v_i, v_j), one of the following cases may be encountered:
1. Both automata LA_i and LA_j are in one of the states of action α2.
2. One of the automata is in one of the states of action α1 and the other is in one of the states of action α2.
3. Both automata LA_i and LA_j are in one of the states of action α1.
The actions taken by the algorithm in each case are as follows. If LA_i and LA_j are both in one of the states of action α2, both automata are rewarded (i.e. Φ_k = Φ_k − 1 for k ∈ {i, j}) and the edge e_t is added to E_s. If one of the automata (say LA_i) is in one of the states of action α1 and the other (LA_j) is in one of the states of action α2, then LA_i receives a penalty (it moves one state towards the most internal state with probability γ and towards the boundary state with probability 1 − γ) and LA_j is rewarded (i.e. Φ_j = Φ_j − 1). If, as a result of the penalty, LA_i switches to action α2, the edge e_t is added to E_s; to preserve the constraint |V_s| = n_s, a node of V_s whose automaton state is closest to its boundary state is removed along with its incident edges. If more than one such node exists, one of them, say v_i, is selected at random. The state of the automaton of the removed node then changes to the most internal state of action α1 (i.e. Φ_i = φ_1). If both LA_i and LA_j are in one of the states of action α1, both are penalized. If after the penalization both automata switch to action α2, the incident nodes of e_t are added to V_s and e_t is added to E_s; to preserve the constraint |V_s| = n_s, two nodes of V_s whose automata states are closest to their boundary states are removed along with all their incident edges. If more than two such nodes exist, two of them are selected at random. The state of each removed node's automaton then changes to the most internal state of action α1.
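The reward/penalty dynamics just described can be summarized in a few lines of Python. The sketch below encodes each action's states by their depth (1 = boundary state, N = most internal state); this depth encoding and the class name are our illustration of the G_{2N,2} behaviour, not the authors' implementation (the exact transition graph is the one of Figs. 1.5 and 1.6).

import random

class G2N2:
    # Illustrative two-action fixed-structure automaton: each action owns
    # N states ranked by depth, where depth 1 is the boundary state and
    # depth N the most internal state. Action 2 means "node is sampled".
    def __init__(self, N=4, gamma=0.9, action=1, depth=1):
        self.N, self.gamma = N, gamma
        self.action, self.depth = action, depth  # new nodes start at a boundary state

    def reward(self):
        # Reward is deterministic (gamma_1 = 0): one step toward the most
        # internal state of the current action.
        self.depth = min(self.N, self.depth + 1)

    def penalize(self):
        # Penalty is stochastic: toward the most internal state with
        # probability gamma, toward the boundary with probability 1 - gamma;
        # stepping "out" of a boundary state crosses to the other action
        # (this is how an unsampled node can enter the sample).
        if random.random() < self.gamma:
            self.depth = min(self.N, self.depth + 1)
        elif self.depth > 1:
            self.depth -= 1
        else:
            self.action = 2 if self.action == 1 else 1
        return self.action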

4.3.4.1 Complexity Analysis

In this section, we summarize the analysis presented in Ghavipour and Meybodi (2018) for the space and time complexity of the algorithm FLAS.
Space complexity. The input graph G and the learning automata of the nodes not belonging to the sample graph are always kept on disk. In memory, FLAS keeps the sampled edges and the learning automata corresponding to the sampled nodes.

Table 4.16 Time complexity of operations

Data structure   Insert     Delete     Search
Unsorted list    O(1)       O(n)       O(n)
B-Tree           O(log n)   O(log n)   O(log n)

Each learning automaton requires only one memory cell for its state value, so the total memory space needed by the algorithm is O(|E_s| + |V_s|).
Time complexity. The analysis given below is a worst-case analysis. The worst case occurs if, in every iteration of the updating part, the nodes incident on the edge taken from the stream are not in the set of sampled nodes and their learning automata change their actions. The authors assumed that the algorithm uses an unsorted list to store the input graph and the sampled edge stream, and a B-Tree (Bayer and McCreight 1972) to store the learning automata on both disk and memory. Table 4.16 shows the time complexity of the insert, delete and search operations on these data structures. They also assumed that each disk access takes a fixed amount of time, referred to as c. Under these assumptions, the worst-case time complexity of the algorithm is computed as follows. In the initialization part, the algorithm adds each edge read from disk to the sampled edge stream in O(1) time. For each incident node of a sampled edge, FLAS inserts its automaton into the in-memory B-Tree if it is not already there, which takes O(log|V_s|) time. Therefore, the time complexity of the initialization part is O(|E_s| log|V_s| + c|E_s|). In the updating part, every time a streaming edge is read from disk, the algorithm first searches the in-memory B-Tree for the learning automata of its incident nodes, which takes O(log|V_s|) time. Since only the automata of sampled nodes are kept in memory, FLAS then has to search the B-Tree stored on disk, in O(c log(|V| − |V_s|)) time. The found automata are moved to memory, penalized and inserted into the in-memory B-Tree in O(c + log|V_s|) time. To preserve the constraint on the sample size, FLAS searches the in-memory B-Tree for the two automata whose states are closest to the boundary state and moves them to disk, which takes O(|V_s| + c log(|V| − |V_s|)) time. Finally, all the edges incident to the two replaced nodes are removed and the new streaming edge is added to the sampled edge stream, which takes O(|E_s|) time. Hence, the time taken by the updating part in the worst case is O(c(|E| − |E_s|) log(|V| − |V_s|) + |V_s| + |E_s|), and the total worst-case time complexity of FLAS is O(|E_s| log|V_s| + c|E_s| + c(|E| − |E_s|) log(|V| − |V_s|) + |V_s| + |E_s|). Since the term |E| log(|V| − |V_s|) has the largest growth rate in this expression and usually |V_s| is much smaller than |V|, the worst-case time of FLAS can be taken to be O(|E| log|V|).


Table 4.17 Characteristics of used datasets

Dataset    Nodes     Edges       Density   Avg. path   Global clustering
Flickr     820,878   6,625,280   1.9E−5    6.5         0.116
Facebook   46,952    183,412     2E−4      5.6         0.085
HepPH      34,546    420,877     7E−4      4.33        0.146
CondMAT    23,133    93,439      4E−4      5.35        0.264

4.3.4.2 Experimental Evaluation

The efficiency of the proposed algorithm FLAS has been investigated on several real-world networks. The authors utilized social networks from Gleich (2012) and the Facebook New Orleans network (Viswanath et al. 2009), as well as a collaboration network (CondMAT) and a citation network (arXiv HepPH) from Leskovec and Krevl (2014). Table 4.17 summarizes the global statistics of these real-world networks.
• Network Measure. In addition to the graph measures given in Section "Network Measure", the authors also used the following three measures to investigate the representativeness of subgraphs produced by the sampling algorithms.
Path Length Distribution. The shortest path length denotes the fewest number of hops required to reach one node from another. The path length distribution P(h) is the fraction of pairs of nodes in graph G with shortest path length h, for every h > 0:

P(h) = |{(v_i, v_j) ∈ V × V : dis(v_i, v_j) = h}| / n²        (4.9)

where |V| = n and dis(v_i, v_j) denotes the shortest path distance between nodes v_i and v_j.
• Eigenvalues. An eigenvalue λ of the adjacency matrix A of graph G satisfies Av = λv, where v is the eigenvector of A associated with λ. Eigenvalues are known as the basis of spectral graph analysis (Ahmed et al. 2014). In the experiments, the authors considered the largest 25 eigenvalues of a graph G.
• Network Values. The network values denote the distribution of the eigenvector components corresponding to the largest eigenvalue of the adjacency matrix A of graph G (Ahmed et al. 2014). In the experiments, the largest 100 network values of a graph G are considered.
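For small graphs, these three measures can be computed with networkx and numpy as sketched below; this is an illustration under the stated definitions (the function names are ours), and for large graphs one would switch to sparse eigensolvers such as scipy.sparse.linalg.eigsh.

import numpy as np
import networkx as nx

def path_length_distribution(G):
    # P(h) of Eq. 4.9: fraction of ordered node pairs at shortest-path
    # distance h (all-pairs BFS; feasible only for modest graph sizes).
    n = G.number_of_nodes()
    counts = {}
    for _, dists in nx.shortest_path_length(G):
        for d in dists.values():
            if d > 0:
                counts[d] = counts.get(d, 0) + 1
    return {h: c / n ** 2 for h, c in counts.items()}

def eigen_statistics(G, k=25, m=100):
    # Largest k adjacency eigenvalues, and the m largest "network values":
    # components of the eigenvector belonging to the largest eigenvalue.
    A = nx.to_numpy_array(G)          # dense; fine for small test graphs
    vals, vecs = np.linalg.eigh(A)    # A is symmetric for an undirected graph
    order = np.argsort(vals)[::-1]
    eigenvalues = vals[order[:k]]
    network_values = np.sort(np.abs(vecs[:, order[0]]))[::-1][:m]
    return eigenvalues, network_values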


Evaluation Metric. To assess the distance between any statistic in the original graph and that in the sampled subgraph, the authors used the following distance measures, in addition to the KS distance given in Section "Evaluation Metric". The KS statistic is used for computing the distance between the true distribution of the original graph and the estimated distribution from the sampled subgraph for the degree, clustering coefficient, k-core and path length distributions.
• Normalized L1 distance. The normalized L1 distance (i.e. normalized Manhattan distance) computes the distance between two positive m-dimensional real vectors (Ahmed et al. 2014):

L1 = (1/m) Σ_{i=1}^{m} |p_i − p′_i| / p_i        (4.10)

where p and p′ are the true vector and the estimated vector, respectively. In the experiments, the authors used the normalized L1 distance to measure the distance between the eigenvalue vectors of the original and the sampled graphs.
• Normalized L2 distance. The normalized L2 distance (i.e. normalized Euclidean distance) measures the distance between two vectors whose components are fractions (Ahmed et al. 2014):

L2 = ||p − p′|| / ||p||        (4.11)

Here the normalized L2 distance is used for computing the distance between the network-value vectors of the original and the sampled graphs.
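Both vector distances are one-liners; a minimal sketch, assuming the true vector p has no zero components for Eq. 4.10:

import numpy as np

def normalized_l1(p, p_est):
    # Eq. 4.10: normalized Manhattan distance (used for eigenvalue vectors)
    p, p_est = np.asarray(p, float), np.asarray(p_est, float)
    return float(np.mean(np.abs(p - p_est) / p))

def normalized_l2(p, p_est):
    # Eq. 4.11: normalized Euclidean distance (used for network values)
    p, p_est = np.asarray(p, float), np.asarray(p_est, float)
    return float(np.linalg.norm(p - p_est) / np.linalg.norm(p))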

Experimental Results. To evaluate the performance of the proposed algorithm FLAS, several experiments were conducted on the real-world networks described in Table 4.17. FLAS uses the G_{2N,2} automaton with the setting γ = 0.9 and N = 4. In the experiments, the sampling fraction f varies from 0.1 to 0.3 in increments of 0.025. For each sampling fraction, the authors report the average results of 30 independent runs. In each run, a random permutation of the edge stream is used as the input to the algorithm, to make the results independent of any particular stream ordering; it is ensured that the different variations of the proposed algorithm and the PIES algorithm use the same streaming orders.


Table 4.18 The results of the statistical test for different algorithms in terms of the KS distance for degree distribution

Dataset    Mean KS distance (KS_FLAS − KS_PIES)   Difference significance   Performance
Flickr     −0.0381                                2.4367E−81                FLAS better
Facebook   −0.1987                                8.0224E−48                FLAS better
HepPH      −0.4377                                3.8966E−21                FLAS better
CondMAT    −0.3096                                6.7916E−57                FLAS better

Table 4.19 The results of the statistical test for different algorithms in terms of the KS distance for clustering coefficient distribution

Dataset    Mean KS distance (KS_FLAS − KS_PIES)   Difference significance   Performance
Flickr     −0.0511                                3.5431E−76                FLAS better
Facebook   −0.2054                                1.2692E−46                FLAS better
HepPH      −0.4457                                1.4130E−10                FLAS better
CondMAT    −0.2549                                2.1055E−51                FLAS better

Table 4.20 The results of the statistical test for different algorithms in terms of the KS distance for path length distribution

Dataset    Mean KS distance (KS_FLAS − KS_PIES)   Difference significance   Performance
Flickr     −0.1066                                1.2893E−68                FLAS better
Facebook   −0.1462                                4.7106E−53                FLAS better
HepPH      −0.1689                                2.0248E−49                FLAS better
CondMAT    −0.1744                                9.3359E−41                FLAS better

Experiment I In the first experiment of the second group of experiments, the authors compared the proposed algorithm FLAS with the PIES algorithm in terms of the KS distance for the degree, clustering coefficient and path length distributions. They performed a t-test at the 95% confidence level and report the results for the sampling fraction 0.2 in Tables 4.18, 4.19 and 4.20. According to the results of these tables, FLAS has significantly better performance than PIES.

Fig. 4.14 Comparing FLAS and PIES across the four datasets versus sampling fraction: a–d average KS distance for degree, clustering coefficient, k-core decomposition and path length; e average L1 distance for eigenvalues; f average L2 distance for network values


Table 4.21 KS distance for all datasets, at sampling fraction 0.2

                        FLAS                                    PIES
Dataset                 Deg      Clust    k Core   Path         Deg      Clust    k Core   Path
Flickr                  0.1445   0.1087   0.1721   0.1963       0.1826   0.1598   0.2145   0.3029
Facebook                0.1099   0.0829   0.1226   0.1471       0.3086   0.2883   0.3846   0.2933
HepPH                   0.0813   0.0288   0.1697   0.1786       0.5190   0.4745   0.6641   0.3475
CondMAT                 0.0720   0.1726   0.0858   0.1323       0.3816   0.4275   0.4829   0.3067
Avg. for all datasets   0.1019   0.0983   0.1376   0.1636       0.3479   0.3375   0.4365   0.3126

Table 4.22 L1/L2 distance for all datasets, at sampling fraction 0.2

                        FLAS                     PIES
Dataset                 Eigen Val.   Net Val.    Eigen Val.   Net Val.
Flickr                  0.0071       0.0013      0.5496       0.1923
Facebook                0.1266       0.0759      0.6285       0.3435
HepPH                   0.2147       0.1203      0.7151       0.3383
CondMAT                 0.1605       0.0768      0.6743       0.3172
Avg. for all datasets   0.1272       0.0686      0.6419       0.2978

Experiment II In this experiment, the authors investigated the ability of the proposed algorithm FLAS to preserve the graph statistics as compared to the PIES algorithm. They considered the degree, clustering coefficient, k-core and path length distributions. We report the average KS distance over the four datasets for different sampling fractions in Fig. 4.14a–d, and plot the average L1 and L2 distances for eigenvalues and network values in Fig. 4.14e, f, respectively. From these figures, we can see that FLAS performs considerably better than PIES for all the statistics. Tables 4.21 and 4.22 show the KS distance and the L1/L2 distances for each dataset at the sampling fraction 0.2. Finally, the authors studied the ability of FLAS to preserve the local density in sampled subgraphs compared to PIES. This is done by computing the maximum core number (the maximum value of k in the k-core distribution) of subgraphs sampled by both algorithms and comparing it with the real maximum core number of each dataset. As shown in Table 4.23, subgraphs sampled by FLAS have maximum core numbers considerably closer to the real values than those sampled by PIES. Experiment III This experiment plots the distributions of six graph statistics for each of the four datasets when the sampling fraction f is set to 0.2. For degree and k-core distributions the CCDF (complementary cumulative distribution function) is plotted, and for clustering coefficient and path length distributions the CDF. As shown in Figs. 4.15, 4.16, 4.17 and 4.18, FLAS preserves the degree distribution more accurately than PIES.


Table 4.23 The maximum core number for sampling fraction 0.2 versus its real value for each dataset

Dataset    Real max core no.   FLAS   PIES
Flickr     406                 221    162
Facebook   16                  12     5
HepPH      30                  20     5
CondMAT    25                  16     6


Fig. 4.15 Comparing sampling algorithms on Flickr dataset for different distribution statistics, at sampling fraction 0.2

The PIES algorithm underestimates the degree distribution for all datasets, except for Flickr. PIES also underestimates the clustering coefficient distribution for Flickr, CondMAT and HepPH. The FLAS algorithm captures the clustering coefficient statistic for HepPH, underestimates it for CondMAT, and overestimates it for Flickr and Facebook. For the k-core distribution, FLAS generally provides better results than PIES: while FLAS preserves the k-core distribution in almost all the datasets, PIES overestimates the core structures in Flickr and underestimates them in the other datasets. FLAS captures the path length distribution for all the tested graphs, whereas PIES underestimates this statistic for Flickr and overestimates it for CondMAT and HepPH; for the Facebook graph, PIES performs almost as well as FLAS. FLAS accurately estimates the eigenvalues and the network values of the Flickr graph, and for the other graphs it outperforms PIES on these two statistics.

[Figs. 4.16–4.18 Comparing sampling algorithms on the Facebook, HepPH and CondMAT datasets for different distribution statistics, at sampling fraction 0.2]

Fig. 4.20 The impact of high degree bias on the sampled degree distribution for different synthetic networks, at sampling fraction 0.2: a WS-Net1, b WS-Net2, c BA-Net1, d BA-Net2

4.4 Learning Automata Based Stochastic Graph Sampling Algorithms

in which w̄_ij is an estimate of the edge weight associated with edge e_ij ∈ E_s, with sampling rate 0 < ϕ < 1, where |V_s| = ϕ × |V|. Due to the important role of network sampling algorithms in preprocessing, characterizing, studying and estimating the properties of online social networks, several sampling algorithms (Gjoka et al. 2011; Papagelis et al. 2013) have been presented; these can be classified into two groups, random selection sampling algorithms and network traversal sampling algorithms (also called link-tracing, crawling-based or topology-based sampling). In the first group, two simple techniques are random edge sampling (RES) and random node sampling (RNS) (Leskovec and Faloutsos 2006), which are mainly used for theoretical investigation regardless of the topological structure of the network. Network traversal sampling algorithms such as random walk (RW) (Yoon et al. 2007),


forest fire (FF) (Kurant et al. 2010) and snowball (SB) (Frank 2011), to mention a few, try to collect samples from the network with respect to its topological structure. We can also categorize sampling algorithms into two categories: (1) one-phase sampling algorithms, which construct the sampled network by random selection of edges or nodes or using some kind of graph traversal procedure (e.g., RW (Yoon et al. 2007), FFS (Kurant et al. 2010) and SB (Frank 2011)); (2) two-phase sampling algorithms, which construct sampled networks using a network traversal procedure together with some pre- or post-processing (e.g., DLAS, SSP, SST, eDLAS and DPL (Yoon et al. 2015)). The former category is relatively simple and has low cost but also low accuracy, and often fails to perform well on all kinds of networks. The latter category heuristically uses additional pre- or post-processing to obtain more information about the network, such as classifying important nodes into groups based on Katz centrality (Luo et al. 2015), scoring important nodes by PageRank (Yoon et al. 2015), extracting groups of nodes (Blagus et al. 2015), learning the transition probabilities of a random walker, or ranking nodes by finding several shortest paths and computing several spanning trees of the graph, all in order to improve sampling accuracy. Such pre- or post-processing of course increases the cost of the sampling algorithm, which must be paid if higher accuracy is the goal. The sampling algorithms for stochastic graphs presented in this chapter fall into the category of two-phase sampling algorithms.

4.4.1.1 Distributed Learning Automata Based Sampling for Stochastic Graphs (DLAS-SG)

Let G = ⟨V, E, W⟩ be the input stochastic graph, where V = {v_1, v_2, ..., v_n} is the set of nodes, E is the edge set, and W = {w_1, w_2, ..., w_m} is a set of random variables with unknown probability distribution functions, each associated with an edge of the input graph. A sampling algorithm tries to estimate the unknown distribution of the weight associated with an edge by taking samples from that edge. The distributed learning automata based sampling algorithm for stochastic graphs (DLAS-SG) was introduced by Rezvanian et al. as an extended algorithm for sampling stochastic graphs. The DLAS-SG algorithm iteratively visits the nodes of the stochastic graph from several different starting nodes (at most K_max) using a distributed learning automata (DLA) and then uses the visited nodes to construct the sampled graph. The DLA used for this purpose is isomorphic to the input graph. The action set of the learning automaton A_i assigned to node v_i is the set of outgoing edges of v_i in the input graph. Each learning automaton in the DLA initially chooses its actions with equal probabilities.


DLAS-SG consists of a number of stages. In each stage a path in the graph is traversed and the average weights of the edges along this path are updated. To do this, the DLA starts from a randomly chosen starting node v_i; the learning automaton A_i corresponding to v_i is activated and chooses one of its actions according to its action probability vector. Let the chosen edge be e_ij. The chosen edge e_ij is sampled (visited) and the average weight of the sampled edge is updated. Then learning automaton A_j is activated and chooses one of its actions. This process of activating a learning automaton, choosing an action, activating another learning automaton and updating the average weight of the chosen edge is repeated until either the number of visited nodes reaches |V| × ϕ or the activated learning automaton cannot activate another automaton. Once the traversal of path π_t at stage t is terminated, the probability vectors of the automata along π_t are updated as follows: if the average weight of π_t (w̄_{π_t}) is equal to or greater than the dynamic threshold T_t, computed as T_t = [(t − 1)T_{t−1} + w̄_{π_t}]/t, then the probabilities of the actions chosen by all the activated learning automata along π_t are increased, and otherwise decreased, according to the learning algorithm. At the end of a stage, all learning automata in the DLA are deactivated and a new stage begins. A new stage begins only if the maximum number of stages (K_max) has not been reached and the difference between the dynamic thresholds of two consecutive stages is equal to or greater than a predefined threshold T_min. When the execution of stages is over, the nodes visited during all stages are sorted in descending order according to the number of paths in which they appeared. The sampled network is then constructed as the induced sub-graph of the input network whose node set contains the given number of most-visited nodes, with edge weights equal to the average weights estimated during the execution of the algorithm. The pseudo-code of the DLAS-SG algorithm for stochastic graphs is given in Fig. 4.21.
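The stage-end update at the heart of DLAS-SG (dynamic threshold followed by reinforcement) can be sketched as follows, assuming the L_RI scheme used later in the experiments; the names dla, path and stage_update are illustrative, not the authors' code.

def stage_update(dla, path, avg_weight, t, T_prev, a=0.05):
    # One stage-end update of DLAS-SG: recompute the dynamic threshold
    # T_t = [(t-1)T_{t-1} + w_bar]/t and reinforce every automaton that
    # chose an edge along the traversed path. `dla` maps a node to its
    # action probability list; `path` is a list of (node, action) pairs.
    T_t = ((t - 1) * T_prev + avg_weight) / t
    favorable = avg_weight >= T_t
    for node, chosen in path:
        p = dla[node]
        if favorable:  # L_RI: reward the chosen action, scale down the rest
            for i in range(len(p)):
                p[i] = (p[i] + a * (1 - p[i]) if i == chosen
                        else (1 - a) * p[i])
        # under L_RI an unfavorable path leaves the probabilities unchanged
    return T_t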

Algorithm 4-5. DLAS-SG(G, ϕ, K_max, T_min)
Input: Stochastic graph G = ⟨V, E, W⟩, sampling rate ϕ, thresholds K_max, T_min
Output: Sampled graph G′ = ⟨V′, E′, W′⟩
Initialization: construct a DLA by assigning an automaton A_i to each node v_i and initialize the action probabilities; t ← 1; T_0 ← 0; Ls[v] ← 0 for all v (Ls counts how often each node is sampled)
While (t < K_max OR |T_t − T_{t−1}| ≥ T_min) Do
    π_t ← {}; select a starting node v_i at random from the non-visited nodes
    While (number of visited nodes ≤ |V| AND a learning automaton can be activated) Do
        Activate A_i; A_i chooses an action (edge e_ij) according to its action probability vector
        Visit and take a sample from e_ij; π_t ← π_t ∪ {e_ij}; Ls[v_i] ← Ls[v_i] + 1
        Set A_i ← A_j
    End While
    T_t ← [(t − 1)T_{t−1} + w̄_{π_t}]/t
    If (w̄_{π_t} ≥ T_t) Then  // favorable path
        Reward the actions chosen by all activated learning automata along π_t
    Else
        Penalize the actions chosen by all activated learning automata along π_t
    End If
    Deactivate all learning automata; t ← t + 1
End While
Sort Ls in descending order
Construct the induced sub-graph G′ = ⟨V′, E′, W′⟩ on the ϕ × |V| most-visited nodes
End Algorithm

Fig. 4.21 Pseudo-code of the distributed learning automata based sampling algorithm for stochastic networks

4.4.1.2 Extended Distributed Learning Automata Based Sampling for Stochastic Graphs (eDLAS-SG)

Similar to DLAS-SG, the extended distributed learning automata (eDLA) based sampling algorithm for stochastic graphs, eDLAS-SG, was introduced by Rezvanian et al. The eDLAS-SG algorithm iteratively traverses the stochastic graph from several different starting nodes (at most K_max) using an eDLA and then uses the visited nodes to construct the sampled graph. The algorithm uses an eDLA isomorphic to the input graph. The action set of the learning automaton A_i assigned to node v_i is the set of outgoing edges of v_i, and each learning automaton in the eDLA initially chooses its actions with equal probabilities. In an eDLA, at any time, a learning automaton can be at one of four activity levels: Passive, Active, Fire and Off; all learning automata are initially set to Passive. eDLAS-SG consists of a number of stages. In each stage a sub-graph of the input graph is traversed and the average weights of the edges along this sub-graph are updated. To do this, one of the nodes in the eDLA is chosen at random by the firing function

to be the starting node and its activity level is set to Active by the governing rule. Then learning automaton A_i is fired (its activity level changes to Fire and the activity levels of its Passive neighbouring nodes change to Active) and chooses one of its actions, corresponding to one of the edges of the fired node; at the same time, the level of the fired automaton A_i changes to Off by the governing rule. Let the chosen action (edge) be e_ij. Edge e_ij is sampled (visited) and the average weight of the sampled edge is recomputed. Then one of the automata in the eDLA with activity level Active is chosen and fired (its activity level changes to Fire and the activity levels of its Passive neighbouring nodes change to Active). The fired automaton chooses one of its actions, corresponding to an edge of the fired node, and then its activity level changes to Off by the governing rule.


This process of selecting a learning automaton with activity level Active by the firing function, firing it, letting it choose an action, changing the activity levels of its Passive neighbours to Active, updating the average weight of the chosen edge, and changing the activity level of the fired automaton from Fire to Off by the governing rule is repeated until either the number of automata with activity level Off reaches |V| × ϕ or the set of Active automata is empty. Once the traversal of the sub-graph τ_t at stage t is terminated, the probability vectors of the fired learning automata in τ_t are updated as follows: if the average weight of τ_t (w̄_{τ_t}) is equal to or greater than the dynamic threshold T_t = [(t − 1)T_{t−1} + w̄_{τ_t}]/t, then the probabilities of the actions chosen by all the fired learning automata in τ_t are increased, and otherwise decreased, according to the learning algorithm. Before a new stage begins, the activity levels of all learning automata in the eDLA are reset to Passive. A new stage begins only if the maximum number of stages (K_max) has not been reached and the difference between the dynamic thresholds of two consecutive stages is equal to or greater than a predefined threshold T_min. When the execution of stages is over, the nodes visited during all the stages are sorted in descending order according to the number of sub-graphs in which they appeared. The sampled network is then constructed as the induced sub-graph of the input network whose node set contains the given number of most-visited nodes, with edge weights equal to the average weights estimated during the execution of the algorithm. The pseudo-code of the eDLAS-SG algorithm for stochastic graphs is given in Fig. 4.22.
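The bookkeeping of the four activity levels during one eDLAS-SG stage can be sketched as follows; the dict-of-lists graph representation, the uniform firing function and the choose_action callback are simplifying assumptions for illustration.

import random

def edla_stage(graph, start, phi, choose_action):
    # One eDLAS-SG traversal driven by the four activity levels
    # Passive/Active/Fire/Off; `graph` maps a node to its neighbour list
    # and `choose_action` stands in for the fired automaton's edge choice.
    passive = set(graph) - {start}
    active, off = {start}, set()
    subgraph = []
    while active and len(off) < phi * len(graph):
        node = random.choice(sorted(active))   # firing function (uniform here)
        active.discard(node)                   # fire the selected automaton
        for nb in graph[node]:                 # Passive neighbours become Active
            if nb in passive:
                passive.discard(nb)
                active.add(nb)
        subgraph.append((node, choose_action(node)))  # sample the chosen edge
        off.add(node)                          # fired automaton goes to Off
    return subgraph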

4.4.1.3 Experimental Evaluation

In this section, the performance of the learning automata based sampling algorithms for stochastic graphs is studied on several well-known real and synthetic stochastic networks. Table 4.28 describes the characteristics of the test networks, including several real networks and two synthetic ones: WS-SG, generated by the Watts–Strogatz model (Watts and Strogatz 1998) (a synthetic small-world network), and BA-SG, generated by the Barabási–Albert model (Barabási and Albert 1999) (a synthetic scale-free network). The parameters of the synthetic networks are p = 0.2 for WS-SG and m_0 = m = 5 for BA-SG, each with 10,000 nodes (Table 4.28). The edge weights of the real networks are random variables representing the number of activities among individuals during a specified time window for each network. The edge weights of the synthetic networks are random variables with a Weibull distribution with parameters a = 0.32, b = 0.17, adopted from an empirical observation of the lifetime distribution of tweets on Twitter (Bild et al. 2014). The LA-based sampling algorithms for stochastic graphs (DLAS-SG and eDLAS-SG) are compared with shortest path sampling for stochastic graphs (SPS-SG) and spanning tree sampling for stochastic graphs (SST-SG). For all algorithms, the maximum number of iterations K_max is set to n × ϕ, where n is the number of nodes of each instance graph, and the reinforcement scheme used for updating the action probability vectors of the learning automata is L_RI with a = 0.05. For DLAS-SG and eDLAS-SG, the threshold T_min is set to 0.05. The reported results are averages taken over 30 runs.

Table 4.28 Description of the test networks for the experimentation

Network | Nodes | Edges | Type | Directed | Description
Facebook-like-OPSAHL-UCSOCIAL (Opsahl and Panzarasa 2009) | 1,899 | 59,835 | User-user messaging | Y | Network of sent messages between the users of an online community of students from the University of California, Irvine
Cit-HepTh (Leskovec et al. 2007) | 27,770 | 352,807 | Author-author collaborations | Y | The collaboration graph of authors of scientific papers from the arXiv's High Energy Physics-Theory (Hep-Th) section
Facebook-wall (Viswanath et al. 2009) | 63,731 | 1,545,684 | User-user wall post | Y | Network of the wall posts from the Facebook New Orleans networks
LKML-Reply (Konect 2016) | 63,399 | 1,096,440 | User-user reply | Y | The communication network of the Linux kernel mailing list
WS-SG | 10,000 | 10,001,633 | Synthetic WS stochastic graph | N | Synthetic small-world random graph generated using the Watts-Strogatz model; the edge weights are Weibull random variables with a = 0.32, b = 0.17
BA-SG | 10,000 | 58,749 | Synthetic BA stochastic graph | N | Synthetic scale-free random graph based on the Barabási-Albert model; the edge weights are Weibull random variables with a = 0.32, b = 0.17


Algorithm 4-6. eDLAS-SG(G, ϕ, K_max, T_min)
Input: Stochastic graph G = ⟨V, E, W⟩, sampling rate ϕ, thresholds K_max, T_min
Output: Sampled graph G′ = ⟨V′, E′, W′⟩
Initialization: construct an eDLA by assigning an automaton A_i to each node v_i; initialize the action probabilities and set every automaton to Passive; t ← 1; T_0 ← 0; Ls[v] ← 0 for all v (Ls counts how often each node is sampled)
While (t < K_max OR |T_t − T_{t−1}| ≥ T_min) Do
    Pa ← {A_1, ..., A_n}; Ac ← {}; Of ← {}; Fi ← {}; τ_t ← {}
    Select a starting node v_i at random by the firing function; move A_i from Pa to Ac; Ls[v_i] ← Ls[v_i] + 1
    While (|Of| ≤ ϕ|V| AND |Ac| > 0) Do
        Select an Active automaton A_i by the firing function and remove it from Ac; A_i chooses an action (edge e_ij) according to its action probability vector
        Take a sample from e_ij; τ_t ← τ_t ∪ {e_ij}; Ls[v_j] ← Ls[v_j] + 1
        Fi ← A_i; move the Passive neighbours of v_i from Pa to Ac; Of ← Of ∪ {Fi}; Fi ← {}
    End While
    T_t ← [(t − 1)T_{t−1} + w̄_{τ_t}]/t
    If (w̄_{τ_t} ≥ T_t) Then
        Reward the actions chosen by all fired learning automata in τ_t
    Else
        Penalize the actions chosen by all fired learning automata in τ_t
    End If
    t ← t + 1
End While
Sort Ls in descending order
Construct the induced sub-graph G′ = ⟨V′, E′, W′⟩ on the ϕ × |V| most-visited nodes
End Algorithm

Fig. 4.22 Pseudo-code of the extended distributed learning automata based sampling algorithm for stochastic networks


4.4.1.4 Experimental Results

This experiment is carried out to study the performance of the LA-based sampling algorithms for stochastic graphs with respect to the stochastic strength distribution and the stochastic clustering coefficient.

Fig. 4.23 Comparing sampling algorithms in terms of KS-D for strength distribution versus sampling rate: a Facebook-like-OPSAHL-UCSOCIAL, b Cit-HepTh, c Facebook-wall, d LKML-Reply, e WS-SG, f BA-SG

In this experiment, the sampling rate is varied from 10 to 30% in increments of 5%, and the results for all algorithms on each test network are given in terms of KS-D for the strength distribution in Fig. 4.23 and RE for the average clustering coefficient in Fig. 4.24. The Kolmogorov–Smirnov D-statistic (KS-D) is a statistical test commonly used to assess the distance between two cumulative distribution functions (CDFs): it computes the maximum vertical distance between the CDF of the original distribution, taken from the original graph, and that of the estimated distribution, taken from the sampled graph. This measure is defined as

D(P, Q) = max_x |P(x) − Q(x)|        (4.12)



where P and Q are the CDFs of the original and the estimated data, respectively, and x ranges over the values of the random variable. KS-D is sensitive to both the location and the shape of the distributions, which makes it an appropriate measure of distributional similarity.


The closer the KS-D value between the original and the sampled network is to zero, the more similar the two networks are; the closer it is to one, the greater their difference. Relative error (RE) can be applied to assess the accuracy of the results for a single parameter and is defined as

RE = |P − Q| / P        (4.13)

where P and Q denote the real and the sampled values of the parameter (i.e. the real clustering coefficient and the estimated clustering coefficient), respectively.

Fig. 4.24 Comparing sampling algorithms in terms of RE for average clustering coefficient versus sampling rate: a Facebook-like-OPSAHL-UCSOCIAL, b Cit-HepTh, c Facebook-wall, d LKML-Reply, e WS-SG, f BA-SG

According to the results, we may conclude that for all the test networks the performance of the sampling algorithms with respect to these measures improves as the sampling rate increases. From the results shown in Fig. 4.23, it is clear that in terms of KS-D for the strength distribution, eDLAS-SG outperforms the other sampling algorithms in most cases. From the results shown in Fig. 4.24, in terms of RE for the average clustering coefficient, eDLAS-SG outperforms the other sampling algorithms on most of the test networks. Another experiment is conducted to compare the proposed algorithms with respect to their costs. For this experiment, the sampling rate is varied from 15 to 30% in 5% intervals. The comparison is performed with respect to the relative cost of a sampling algorithm, defined as the cost of the total number of times that the edges of the graph are traversed (taking samples from the edges) during the traversal (C_s), plus the cost of the total number of operations performed during the post-processing (C_p), divided by the total number of edges in the graph:

RC = (c_1 · C_s + c_2 · C_p) / |E|        (4.14)

where the constant c_1 is the unit cost of traversing an edge (taking a sample) and c_2 is the unit cost of performing an operation during the post-processing phase (assigning ranks to nodes). Relative cost can be applied to assess the cost of a sampling algorithm. The results of this experiment (for c_1 = 1, c_2 = 1) for the different sampling algorithms are presented in Fig. 4.25. From these results, one can observe that for all sampling algorithms the relative cost increases as the sampling rate increases; the higher cost at higher sampling rates is due to the additional samples and post-processing performed by the algorithms. The results also indicate that, for the same sampling rate, eDLAS-SG and DLAS-SG require lower cost than the other sampling algorithms. Among the sampling algorithms, SST-SG has the highest cost, because it computes a spanning tree over all nodes in repeated iterations.
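Both evaluation quantities are simple ratios; a minimal sketch of Eqs. 4.13 and 4.14 (function names are ours):

def relative_error(real, estimated):
    # Eq. 4.13, e.g. real vs. estimated average clustering coefficient
    return abs(real - estimated) / real

def relative_cost(c_s, c_p, n_edges, c1=1.0, c2=1.0):
    # Eq. 4.14, with c1 = c2 = 1 as in the experiment
    return (c1 * c_s + c2 * c_p) / n_edges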


Fig. 4.25 Comparing the proposed sampling algorithms in terms of relative cost for varying sampling rate (panels (a) Facebook-like-OPSAHL-UCSOCIAL, (b) Cit-HepTh, (c) Facebook-wall, (d) LKML-Reply, (e) WS-SG and (f) BA-SG; y-axis: average relative cost; x-axis: sampling rate (%), from 10 to 30%; legend: SPS-SG, SST-SG, DLAS-SG, eDLAS-SG)

4.5 Conclusion

In this chapter, we addressed the problem of sampling subgraphs from social networks and reviewed the existing sampling methods in the literature. We also described in detail several learning automata-based sampling algorithms: DLAS (Rezvanian et al. 2014), eDLAS (Rezvanian and Meybodi 2017), ICLA-NS (Ghavipour and Meybodi


2017) and FLAS (Ghavipour and Meybodi 2018), which utilize learning automata for producing representative subgraphs of social networks. The algorithm ICLA-NS is in fact an extended sampling algorithm with a post-processing phase, since it utilizes an irregular cellular learning automaton (ICLA) to guarantee the connectivity and the inclusion of high-degree nodes in subgraphs initially sampled by the classic node sampling method. The authors investigated the effectiveness of their proposed sampling algorithm by conducting a number of experiments on real-world networks. The experimental results demonstrated that the properties of the sampled subgraphs created by ICLA-NS are more similar to those of the original graph than those produced by the existing sampling methods, in terms of the Kolmogorov-Smirnov (KS) test for the degree, clustering coefficient and k-core distributions. The algorithm FLAS has been proposed with the goal of sampling from activity networks, in which the stream of edges continuously evolves over time. These networks are highly dynamic and include a massive volume of edges. Most previous work on sampling from networks either has assumed that the network graph is static and fully accessible at any step, or, despite considering the stream evolution, has not addressed the problem of sampling a representative subgraph from the original graph. FLAS is a streaming sampling algorithm based on fixed structure learning automata that runs in a single pass over the stream. This algorithm satisfies both goals of being implementable in a streaming fashion and producing representative samples. The authors compared the efficiency of the proposed algorithm with the best-known streaming sampling algorithm, called PIES, using four real-world network datasets. Their empirical results indicated that FLAS significantly outperforms the PIES algorithm in terms of its ability to produce representative samples. We also presented a brief discussion on stochastic graph sampling and introduced two learning automata-based sampling algorithms for stochastic graphs as well.

References

Aggarwal CC (2006) On biased reservoir sampling in the presence of stream evolution. VLDB'06, pp 607–618
Aggarwal CC, Zhao Y, Yu PS (2011) Outlier detection in graph streams. In: Proceedings of international conference on data engineering, pp 399–409
Ahmed NK, Berchmans F, Neville J, Kompella R (2010) Time-based sampling of social network activity graphs. Learning with graphs. ACM, New York, pp 1–9
Ahmed NK, Neville J, Kompella R (2014) Network sampling: from static to streaming graphs. ACM Trans Knowl Discov Data 8:7
Ahn Y-Y, Han S, Kwak H et al (2007) Analysis of topological characteristics of huge online social networking services. In: Proceedings of the 16th international conference on World Wide Web—WWW'07. ACM, p 835
Albert R, Barabási A-L (2002) Statistical mechanics of complex networks. Rev Mod Phys 74:47–97. https://doi.org/10.1103/RevModPhys.74.47
Alvarez-Hamelin JI, Dall'Asta L, Barrat A, Vespignani A (2005) K-core decomposition of Internet graphs: hierarchies, self-similarity and measurement biases. arXiv Prepr cs/0511007. https://doi.org/10.3934/nhm.2008.3.371


Avrachenkov K, Ribeiro B, Towsley D (2010) Improving random walk estimation accuracy with uniform restarts. In: Lecture notes in computer science. Springer, pp 98–109
Barabási A-L (1999) Emergence of scaling in random networks. Science 286:509–512. https://doi.org/10.1126/science.286.5439.509
Barabási A-L (2004) Evolution of networks: from biological nets to the Internet and WWW. OUP Oxford
Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286:509–512. https://doi.org/10.1126/science.286.5439.509
Bayer R, McCreight EM (1972) Organization and maintenance of large ordered indexes. Acta Informatica. Springer, Berlin, pp 173–189
Bild DR, Liu Y, Dick RP et al (2014) Aggregate characterization of user behavior in Twitter and analysis of the retweet graph. ACM Trans Internet Technol 15:4. https://doi.org/10.1145/2700060
Blagus N, Šubelj L, Weiss G, Bajec M (2015) Sampling promotes community structure in social and information networks. Phys A Stat Mech Appl 432:206–215. https://doi.org/10.1016/j.physa.2015.03.048
Carmi S, Havlin S, Kirkpatrick S et al (2006) MEDUSA—new model of Internet topology using k-shell decomposition. Proc Natl Acad Sci 104:11150–11154. https://doi.org/10.1073/pnas.0701175104
Even S (2011) Graph algorithms, 2nd edn. Cambridge University Press
Cormode G, Muthukrishnan S (2005) Space efficient mining of multigraph streams. In: Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems—PODS'05. ACM Press, New York, p 271
Ebbes P, Huang Z, Rangaswamy A (2012) Subgraph sampling methods for social networks: the good, the bad, and the ugly. SSRN Electron J. https://doi.org/10.2139/ssrn.1580074
Fang M, Yin J, Zhu X (2013) Active exploration: simultaneous sampling and labeling for large graphs. In: CIKM. ACM, pp 829–834
Fang M, Yin J, Zhu X (2016a) Active exploration for large graphs. Data Min Knowl Discov 30:511–549. https://doi.org/10.1007/s10618-015-0424-z
Fang M, Yin J, Zhu X (2016b) Supervised sampling for networked data. Sig Process 124:93–102. https://doi.org/10.1016/j.sigpro.2015.09.040
Frank O (2011) Survey sampling in networks. In: The SAGE handbook of social network analysis. SAGE Publications, pp 381–403
Gao Q, Ding X, Pan F, Li W (2014) An improved sampling method of complex network. Int J Mod Phys C 25:1440007. https://doi.org/10.1142/S0129183114400075
Ghavipour M, Meybodi MR (2017) Irregular cellular learning automata-based algorithm for sampling social networks. Eng Appl Artif Intell 59:244–259. https://doi.org/10.1016/j.engappai.2017.01.004
Ghavipour M, Meybodi MR (2018) A streaming sampling algorithm for social activity networks using fixed structure learning automata. Appl Intell 48:1054–1081. https://doi.org/10.1007/s10489-017-1005-1
Gjoka M, Kurant M, Butts CT, Markopoulou A (2010) Walking in Facebook: a case study of unbiased sampling of OSNs. In: Proceedings—IEEE INFOCOM, pp 1–9
Gjoka M, Butts CT, Kurant M, Markopoulou A (2011) Multigraph sampling of online social networks. IEEE J Sel Areas Commun 29:1893–1905. https://doi.org/10.1109/JSAC.2011.111012
Gleich DF (2012) Graph of Flickr photo-sharing social network crawled in May 2006. https://doi.org/10.4231/d39p2w550
Goel S, Salganik MJ (2010) Assessing respondent-driven sampling. Proc Natl Acad Sci 107:6743–6747. https://doi.org/10.1073/pnas.1000261107
Goldstein ML, Morris SA, Yen GG (2004) Problems with fitting to the power-law distribution. Eur Phys J B 41:255–258. https://doi.org/10.1140/epjb/e2004-00316-5
Goodman LA (1961) Snowball sampling. Ann Math Stat 32:148–170. https://doi.org/10.1214/aoms/1177705148


Heckathorn DD (1997) Respondent-driven sampling: a new approach to the study of hidden populations. Soc Probl 44:174–199. https://doi.org/10.2307/3096941
Illenberger J, Kowald M, Axhausen KW, Nagel K (2011) Insights into a spatially embedded social network from a large-scale snowball sample. Eur Phys J B 84:549–561. https://doi.org/10.1140/epjb/e2011-10872-0
Jalali ZS, Rezvanian A, Meybodi MR (2016a) Social network sampling using spanning trees. Int J Mod Phys C 27:1650052. https://doi.org/10.1142/S0129183116500522
Jalali ZS, Rezvanian A, Meybodi MR (2016b) A two-phase sampling algorithm for social networks. In: Conference proceedings of 2015 2nd international conference on knowledge-based engineering and innovation, KBEI 2015. IEEE, pp 1165–1169
Jin EM, Girvan M, Newman MEJ (2001) Structure of growing social networks. Phys Rev E 64:8. https://doi.org/10.1103/PhysRevE.64.046132
Jin L, Chen Y, Hui P et al (2011) Albatross sampling. In: Proceedings of the 3rd ACM international workshop on MobiArch—HotPlanet'11. ACM Press, New York, p 11
Konect (2016) Linux kernel mailing list replies network dataset—KONECT. http://konect.uni-koblenz.de/networks
Krishnamurthy V, Faloutsos M, Chrobak M et al (2007) Sampling large Internet topologies for simulation purposes. Comput Netw 51:4284–4302. https://doi.org/10.1016/j.comnet.2007.06.004
Kumar R, Novak J, Tomkins A (2010) Structure and evolution of online social networks. In: Link mining: models, algorithms, and applications. Springer, pp 337–357
Kurant M, Markopoulou A, Thiran P (2010) On the bias of BFS (breadth first search). In: 2010 22nd international teletraffic congress (ITC), pp 1–8
Kurant M, Gjoka M, Butts CT, Markopoulou A (2011a) Walking on a graph with a magnifying glass. In: Proceedings of the ACM SIGMETRICS joint international conference on measurement and modeling of computer systems—SIGMETRICS'11. ACM, p 281
Kurant M, Markopoulou A, Thiran P (2011b) Towards unbiased BFS sampling. IEEE J Sel Areas Commun 29:1799–1809. https://doi.org/10.1109/JSAC.2011.111005
Lee SH, Kim P-J, Jeong H (2005) Statistical properties of sampled networks. Phys Rev E 73:016102. https://doi.org/10.1103/PhysRevE.73.016102
Lee C-H, Xu X, Eun DY (2012) Beyond random walk and Metropolis-Hastings samplers. In: Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE joint international conference on measurement and modeling of computer systems—SIGMETRICS'12. ACM Press, New York, p 319
Leskovec J, Faloutsos C (2006) Sampling from large graphs. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining—KDD'06. ACM, Philadelphia, p 631
Leskovec J, Krevl A (2014) SNAP datasets: Stanford large network dataset collection. http://snap.stanford.edu/data/
Leskovec J, Kleinberg J, Faloutsos C (2005) Graphs over time. In: Proceedings of the eleventh ACM SIGKDD international conference on knowledge discovery in data mining—KDD'05. ACM Press, New York, p 177
Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1:1–41
Lovász L (1993) Random walks on graphs: a survey. Combinatorics 2:1–46
Lu J, Li D (2012) Sampling online social networks by random walk. ACM, pp 33–40
Luo P, Li Y, Wu C, Zhang G (2015) Toward cost-efficient sampling methods. Int J Mod Phys C 26:1550050
Maiya AS, Berger-Wolf TY (2010) Sampling community structure. In: Proceedings of the 19th international conference on World Wide Web. ACM, pp 701–710


Mislove A, Marcon M, Gummadi KP et al (2007) Measurement and analysis of online social networks. In: Proceedings of the 7th ACM SIGCOMM conference on Internet measurement—IMC'07. ACM, p 29
Mollakhalili Meybodi MR, Meybodi MR (2014) Extended distributed learning automata. Appl Intell 41:923–940. https://doi.org/10.1007/s10489-014-0577-2
Newman MEJ (2003a) The structure and function of complex networks. SIAM Rev 45:167–256. https://doi.org/10.1137/S003614450342480
Newman MEJ (2003b) Ego-centered networks and the ripple effect. Soc Netw 25:83–95. https://doi.org/10.1016/S0378-8733(02)00039-4
Opsahl T, Panzarasa P (2009) Clustering in weighted networks. Soc Netw 31:155–163. https://doi.org/10.1016/j.socnet.2009.02.002
Papagelis M, Das G, Koudas N (2013) Sampling online social networks. IEEE Trans Knowl Data Eng 25:662–676. https://doi.org/10.1109/TKDE.2011.254
Piña-García CA, Gu D (2013) Spiraling Facebook: an alternative Metropolis-Hastings random walk using a spiral proposal distribution. Soc Netw Anal Min 3:1403–1415. https://doi.org/10.1007/s13278-013-0126-8
Rasti AH, Torkjazi M, Rejaie R et al (2009) Respondent-driven sampling for characterizing unstructured overlays. In: Proceedings—IEEE INFOCOM. IEEE, pp 2701–2705
Rezvanian A, Meybodi MR (2015) Sampling social networks using shortest paths. Phys A Stat Mech Appl 424:254–268. https://doi.org/10.1016/j.physa.2015.01.030
Rezvanian A, Meybodi MR (2017) A new learning automata-based sampling algorithm for social networks. Int J Commun Syst 30:e3091. https://doi.org/10.1002/dac.3091
Rezvanian A, Rahmati M, Meybodi MR (2014) Sampling from complex networks using distributed learning automata. Phys A Stat Mech Appl 396:224–234. https://doi.org/10.1016/j.physa.2013.11.015
Ribeiro B, Towsley D (2010) Estimating and sampling graphs with multidimensional random walks, pp 390–403
Rossi RA, Ahmed NK (2013) Network repository. Purdue University Computer Science Department. http://www.networkrepository.com
Seshadhri C, Pinar A, Kolda TG (2013) An in-depth analysis of stochastic Kronecker graphs. J ACM 60:1–32. https://doi.org/10.1145/2450142.2450149
Stumpf MPH, Wiuf C, May RM (2005) Subnets of scale-free networks are not scale-free: sampling properties of networks. Proc Natl Acad Sci 102:4221–4224. https://doi.org/10.1073/pnas.0501179102
Stutzbach D, Rejaie R, Duffield N et al (2009) On unbiased sampling for unstructured peer-to-peer networks. IEEE/ACM Trans Netw 17:377–390. https://doi.org/10.1109/TNET.2008.2001730
Tang L, Liu H (2010) Community detection and mining in social media. Synth Lect Data Min Knowl Discov 2:1–137. https://doi.org/10.2200/S00298ED1V01Y201009DMK003
Thathachar MAL, Sastry PS (2002) Varieties of learning automata: an overview. IEEE Trans Syst Man Cybern Part B Cybern 32:711–722. https://doi.org/10.1109/TSMCB.2002.1049606
Viswanath B, Mislove A, Cha M, Gummadi KP (2009) On the evolution of user interaction in Facebook. In: Proceedings of the 2nd ACM workshop on online social networks—WOSN'09. ACM Press, New York, p 37
Watts DJ, Strogatz SH (1998) Collective dynamics of 'small-world' networks. Nature 393:440–442. https://doi.org/10.1038/30918
Wilson C, Boe B, Sala A et al (2009) User interactions in social networks and their implications. In: Proceedings of the fourth ACM European conference on computer systems—EuroSys'09. ACM Press, New York, p 205
Yoon S, Lee S, Yook S-H, Kim Y (2007) Statistical properties of sampled networks by random walks. Phys Rev E 75:046114. https://doi.org/10.1103/PhysRevE.75.046114
Yoon S-H, Kim K-N, Hong J et al (2015) A community-based sampling method using DPL for online social networks. Inf Sci 306:63–69. https://doi.org/10.1016/j.ins.2015.02.014

Chapter 5

Social Community Detection

5.1 Introduction

A community structure (also known as a modular structure or cluster) refers to a set of nodes that are connected by more edges inside the community than to the rest of the network (Rabbany et al. 2013). Finding community structures in a network is called the community detection problem (also known as community identification), which plays an important role in studying and understanding the structural and behavioral properties of social networks. Many online social networks consist of a set of communities that reflect common properties of online users, such as common interests, common topics, common activities and common hobbies (Ranjbar and Maheswaran 2014). Due to its wide range of applications, detecting communities in complex networks has received great attention in the literature.

5.1.1 Related Work

Since detecting community structure in a network is an important research area for the analysis of real networks, many community detection algorithms have been developed in recent years to reveal community structures in complex social networks (Fortunato and Hric 2016; Fortunato 2010; Elyasi et al. 2016). In this section, some existing community detection algorithms are briefly introduced. A good review of community detection techniques and applications is the one presented by Fortunato (2010). He classified community detection methods into five categories: traditional algorithms, hierarchical algorithms, modularity-based methods, spectral algorithms and dynamic algorithms. Among all types of community detection approaches, hierarchical clustering techniques are widely used techniques


which put similar vertices into larger communities. Hierarchical clustering algorithms, comprising the two categories of divisive and agglomerative methods, form the communities gradually in a hierarchical manner. Several scholars have improved hierarchical algorithms by using metrics to select a suitable partition, or a proper set of partitions, that satisfies particular criteria such as the number of desired communities or the maximum (or minimum) number of vertices in each community, or that optimizes an objective function (Fortunato 2010). Divisive techniques try to find the edges that connect vertices of different communities and iteratively eliminate them, so that the communities become separated from each other. Girvan and Newman introduced a famous divisive method (Girvan and Newman 2001) based on removing edges according to their edge betweenness values. An agglomerative method, by contrast, tries to form communities in a bottom-up manner. In general, the common notion of agglomerative methods is to partition vertices into communities iteratively, starting from a partition in which each community is composed of a single vertex. The process of partitioning vertices continues until a single community containing all vertices of the input network is reached (Newman 2006). Most agglomerative algorithms select the best partition as the one that maximizes a typical quality objective function. The best-known and most widely used quality function is the modularity metric proposed by Newman (2006). Using modularity, Clauset et al. proposed a fast greedy modularity optimization method (Brandes 2006), which starts from a set of isolated vertices and then iteratively connects pairs of vertices so as to achieve the maximum possible value of modularity at each step. Although modularity optimization achieves many promising results for community detection, it has been shown to have some limitations, such as the resolution limit. For example, in the extreme case, modularity optimization algorithms fail for a network with several cliques connected by single edges (Kumpula et al. 2007). Raghavan et al. (2007) proposed the label propagation algorithm (LPA), with nearly linear time complexity, for community detection. However, the performance of LPA depends on the update order of the label information, which is assigned to the nodes of the graph randomly at initialization. Hosseini and Azmi (2015) propose an improved label propagation algorithm, called the memory-based label propagation algorithm (MLPA), for finding community structure in social networks. In their algorithm, a simple memory element is designed for each node of the graph, and this element iteratively stores the most frequent common adoption of labels. Le Martelot et al. (2012) investigate a stability measure for the quality of a partitioning, used as an optimization criterion, that exploits a Markov-process view of networks. Another approach to detecting community structures in networks is the clique percolation method (CPM) (Maity and Rath 2014). Briefly, this algorithm works by finding all the maximal cliques in a network and then forming communities by merging cliques with common nodes. Because of the wide range of applications of community detection, evolutionary and swarm intelligence-based algorithms can also be used for solving the community detection problem with an appropriate objective function. Compared to earlier algorithms, meta-heuristic optimization algorithms can effectively find a proper, high-quality


solution within a reasonable period of time (Chen et al. 2016). For instance, in Ji et al. (2013) an ant colony-based algorithm is presented for discovering communities, where each node is modeled by an ant. To detect a community, a new fitness function is proposed, which updates the pheromone diffusion by considering the feedback signal, in order to capture the interaction between ants. A multi-objective genetic algorithm to uncover community structure in complex networks, called MOGA-Net, was proposed in Pizzuti (2012). MOGA-Net optimizes two objective functions able to identify densely connected communities of nodes having sparse interconnections. MOGA-Net generates a set of network divisions at different hierarchical levels, such that solutions at deeper levels are contained within solutions having a lower number of communities. The number of modules is automatically determined by the objective functions. In Gong et al. (2012) a multi-objective evolutionary algorithm, called MOEA/D-Net, was proposed for community detection in networks. MOEA/D-Net optimizes two conflicting objective functions derived from the modularity density: it maximizes the density of internal degrees and minimizes the density of external degrees concurrently. Recently, several community detection algorithms based on reinforcement learning have also been developed. Zhao et al. (2015) proposed an open cellular learning automata-based algorithm, called CLA-Net, for solving the community detection problem. In CLA-Net, the network is modeled with the aid of cellular learning automata, and the solution is constituted from the current actions chosen by the learning automata in the network. The authors have shown experimentally that their algorithm can overcome the resolution limit problem of modularity optimization.

5.2 Community Detection Using Distributed Learning Automata

In this section, we describe the LA-based algorithm that uses a distributed learning automaton for finding communities in complex social networks (Khomami et al. 2016). It is assumed that the input network G = ⟨V, E⟩ is an undirected and unweighted network, where V = {v_1, v_2, ..., v_n} is the set of vertices and E ⊆ V × V is the set of edges of the given network. The distributed learning automata algorithm, consisting of four steps, iteratively tries to find a set of communities that are more densely connected internally than to the rest of the network. After the initialization step is performed, by assigning a learning automaton to each vertex of the input network, the LA-based algorithm repeats community finding by performing a guided traversal of the network with the help of the distributed learning automata, evaluates the set of found communities, and updates their action probability vectors iteratively until the stopping criteria are satisfied. We describe the four steps of the LA-based algorithm in detail in the following subsections.


5.2.1 Initialization

In the first step, a distributed learning automaton ⟨A, α⟩ which is isomorphic to the input network is constructed. The resulting network can be defined by the 2-tuple ⟨A, α⟩, where A = {A_1, A_2, ..., A_n} is the set of learning automata corresponding to the set of vertices, and α = {α_1, α_2, ..., α_n} denotes the set of actions, in which α_i = {α_i^1, α_i^2, ..., α_i^{r_i}} is the set of actions that can be taken by learning automaton A_i and r_i is the number of actions available to A_i. An action of a learning automaton A_i corresponds to choosing an adjacent vertex of the corresponding vertex v_i. Let p(v_i) = {p_i^1, p_i^2, ..., p_i^{r_i}} be the action probability vector of learning automaton A_i, with p_i^j = 1/r_i initialized equally for all j. At every iteration, each learning automaton can be in either active or inactive mode; at the beginning of the LA-based algorithm, all learning automata are initially set to inactive mode. Let C_k be the set of vertices of the kth community, initially empty, let G′ represent the set of unvisited vertices during the execution of the algorithm, initially equal to G, and let π_t be the path of visited vertices at iteration t.
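As a minimal illustration, this initialization step could be set up as follows on a networkx graph; the class and field names here are our own assumptions, not taken from the original implementation.

```python
import networkx as nx

class LearningAutomaton:
    """One LA per vertex; each action corresponds to choosing a neighbor."""
    def __init__(self, actions):
        self.actions = list(actions)       # adjacent vertices of v_i
        r = len(self.actions)
        self.probs = [1.0 / r] * r         # p_i^j = 1/r_i for all j
        self.active = False                # all LAs start in inactive mode

def build_dla(G):
    """Construct a distributed LA isomorphic to the input network G."""
    return {v: LearningAutomaton(G.neighbors(v)) for v in G.nodes()}

G = nx.karate_club_graph()
automata = build_dla(G)
```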

5.2.2 Communities Formation

At the tth iteration of this step, the algorithm finds k communities as follows. The LA-based algorithm starts by randomly selecting a vertex v_i from the unvisited vertex set G′, and the selected vertex v_i is inserted into the current community C_k and the current path π_t. Then, the learning automaton A_i corresponding to the starting vertex v_i is activated and chooses one of the adjacent vertices of v_i according to its action probability vector. Let the action chosen by learning automaton A_i be vertex v_j. If the number of internal connections of the union of the selected vertex v_j and the current community C_k is greater than the number of internal connections of the current community C_k, then v_j is inserted into the current community C_k and removed from the set G′, and the visited vertex v_j is appended to the path π_t. This process of activating an automaton, choosing an action, checking the insertion condition for the chosen vertex v_j, inserting the new vertex v_j into C_k, appending v_j to the path π_t and removing it from the set G′ is repeated until either the total number of edges inside the current community C_k exceeds the total number of edges leaving it, or the active learning automaton cannot select any action. The process of finding new communities in this manner, while updating the path of visited vertices π_t for the current iteration, continues until the union of the vertex sets of all found communities equals the vertex set of the input network G. A sketch of this growing pass is given below.
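A rough sketch of one community-growing pass, building on the initialization sketch above, might look as follows; the helper names are our assumptions, and the reward bookkeeping and the outer internal-versus-external stopping test are elided for brevity.

```python
import random
import networkx as nx

def internal_edges(G, nodes):
    """Number of edges with both endpoints inside `nodes`."""
    return G.subgraph(nodes).number_of_edges()

def grow_community(G, automata, unvisited):
    """Grow one community C_k starting from a random unvisited vertex."""
    start = random.choice(sorted(unvisited))
    community, path = {start}, [start]
    current = start
    while True:
        la = automata[current]
        # choose an adjacent vertex according to the action probability vector
        chosen = random.choices(la.actions, weights=la.probs, k=1)[0]
        if chosen in community:
            break                       # no new vertex could be selected
        # insert v_j only if it increases the internal connectivity of C_k
        if internal_edges(G, community | {chosen}) > internal_edges(G, community):
            community.add(chosen)
            path.append(chosen)
            unvisited.discard(chosen)
            current = chosen
        else:
            break
    return community, path
```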


5.2.3 Computation of the Objective Function

Let C^t = {C_1, C_2, ..., C_k} be the set of k communities found at iteration t. The quality of the set of communities found at iteration t is evaluated via the normalized cut as the objective function (Dhillon et al. 2004), given by the following equation

NC(C^t) = \frac{1}{k} \sum_{i=1}^{k} \frac{cut(C_i, \bar{C}_i)}{vol(C_i)}    (5.1)

where cut(C_i, \bar{C}_i) denotes the number of edges between community C_i and its complement \bar{C}_i = G \setminus C_i, vol(C_i) is the total degree of the vertices that are members of community C_i, and k is the number of communities. As mentioned in Shi and Malik (2000), the normalized cut, at low computational complexity, captures the global impression of the network instead of local features, and measures both the total dissimilarity between different communities and the total similarity within communities. Thus, the LA-based algorithm gradually decreases the normalized cut, which means that the algorithm gradually converges to the minimum normalized cut and approaches the proper set of communities.
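Eq. (5.1) translates directly into code; the following sketch uses networkx, and the toy two-community split of the karate club network is our own example rather than one from the chapter.

```python
import networkx as nx

def normalized_cut(G, communities):
    """NC (Eq. 5.1): mean of cut(C, C-bar)/vol(C) over all k communities."""
    total = 0.0
    for C in communities:
        cut = sum(1 for _ in nx.edge_boundary(G, C))   # edges leaving C
        vol = sum(d for _, d in G.degree(C))           # total degree inside C
        total += cut / vol
    return total / len(communities)

G = nx.karate_club_graph()
print(normalized_cut(G, [set(range(17)), set(range(17, 34))]))
```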

5.2.4 Updating Action Probabilities

In this step, the set of k communities found at iteration t is evaluated via the corresponding normalized cut: if the value of the normalized cut at the current iteration, NC(C^t), is less than or equal to the value of the normalized cut at the previous iteration, NC(C^{t-1}), then the actions chosen along the path π_t by all activated learning automata are rewarded according to the learning algorithm described in Sect. 5.2, and penalized otherwise. A minimal form of this update is sketched below.
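For concreteness, a minimal linear reward-inaction (L_{R-I}) update for a single automaton could be written as below; the learning rate value is an assumption matching the 0.01 used in the experiments reported later, and under L_{R-I} a penalty leaves the probability vector unchanged.

```python
def update_lri(probs, chosen, beta, a=0.01):
    """L_{R-I} scheme: a reward (beta == 0) moves probability mass toward
    the chosen action; a penalty (beta == 1) changes nothing."""
    if beta == 0:                        # favorable response
        for j in range(len(probs)):
            if j == chosen:
                probs[j] += a * (1.0 - probs[j])
            else:
                probs[j] -= a * probs[j]
    return probs

print(update_lri([0.25, 0.25, 0.25, 0.25], chosen=2, beta=0))
```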

5.2.5 Stopping Conditions

The LA-based algorithm iterates steps 2, 3 and 4 until the number of iterations exceeds a given threshold T, or until P_t = \prod_{v_i \in C^t} \max_{v_j \in N(v_i)} (p_i^j) at iteration t becomes greater than a particular threshold τ, where p_i^j is the probability of choosing the neighboring vertex v_j by the learning automaton A_i residing in vertex v_i, and N(v_i) is the set of neighbors of vertex v_i. Figure 5.1 shows the pseudo-code of the LA-based community detection algorithm for social networks.


Algorithm 5-1. LA-based algorithm for community detection in complex social networks
Input: A network G = (V, E); thresholds τ, T // τ: stopping threshold for the product of probabilities; T: maximum iteration number
Output: Set of found communities C*
Assumptions:
  Assign an automaton A_i to each vertex v_i;
  Let k be the number of communities;
  Let C_k be the set of vertices of the kth community, initially empty;
  Let t be the iteration number of the algorithm, initially set to 0;
  Let NC(C^t) be the normalized cut value for the set of communities found at iteration t, initially set to 0;
  Let P_t be the product of the maximum probabilities in the probability vectors of the LAs of the vertices of the set of communities at iteration t;

Fig. 5.1 Pseudo-code of the DLA-based algorithm for community detection in social network

5.3 Community Detection Using Michigan Memetic Learning Automata

In this section, we introduce an evolutionary algorithm based on Michigan memetic learning automata, called MLAMA-Net, for community detection, presented by Mirsaleh and Meybodi (2016). In this algorithm, the chromosomes, which are represented on the basis of the Michigan approach, are associated with the nodes of the network.


For this purpose, an initial population isomorphic to the input network is created. To construct the initial population, each network node is equipped with a chromosome, and then a learning automaton is assigned to it. The chromosome represents the community of the corresponding node and saves the history of exploration, while the learning automaton represents a meme and saves the history of exploitation. Each node v_i of the network can be modeled by a pair ⟨CR^i, M^i(t)⟩, where CR^i is a chromosome that represents the community of node v_i by an integer number and M^i(t) is a meme that saves the effect (history) of the local search on the chromosome CR^i at generation t. The initial chromosome CR^i is created randomly by selecting a random integer from the set c = {c_1, c_2, ..., c_n} of all possible communities. At the beginning of each generation, the evolutionary operators are applied to chromosome CR^i. First, the mutation operator is applied to chromosome CR^i with rate r_m, in which the value of chromosome CR^i is replaced by another value from the set c = {c_1, c_2, ..., c_n}. Then, the crossover operator is performed, with rate r_c, on chromosome CR^i and one of its neighbors, selected randomly, in which the value of chromosome CR^i is exchanged with the value of the selected chromosome. Let GF be the fitness function used to evaluate the fitness of a chromosome based on its genotype and the genotypes of its adjacent chromosomes. The fitness of chromosome CR^i at generation t, referred to as the genetic fitness, is denoted by GF^i(t). The genetic fitness of chromosome CR^i associated to node v_i of graph G = (V, E) is calculated as follows

GF^i(t) = \frac{1}{2m} \sum_{j \in N_i} \left( A_{i,j} - \frac{k_i k_j}{2m} \right) \delta(i, j)    (5.2)

where N_i = {u | [u, v_i] ∈ E} is the set of neighbors of node v_i, k_i is the degree of node v_i, A is a binary matrix indicating the adjacency of chromosomes, in which A_{ij} = 1 if chromosome CR^i is adjacent to chromosome CR^j and A_{ij} = 0 otherwise, and m = \sum_{i, j \in N_i} A_{ij}. The term δ(i, j) is the delta function, i.e., δ(i, j) = 1 if node v_i and node v_j are in the same community and δ(i, j) = 0 otherwise. A direct transcription of Eq. (5.2) is sketched below.
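In the following sketch, `labels` maps each node to its community value; the function names are our own, not the reference implementation.

```python
import networkx as nx

def genetic_fitness(G, node, labels):
    """GF^i(t) (Eq. 5.2): local modularity contribution of `node`."""
    m = G.number_of_edges()
    k_i = G.degree(node)
    gf = 0.0
    for j in G.neighbors(node):              # A_ij = 1 for neighbors
        if labels[j] == labels[node]:        # delta(i, j) = 1
            gf += 1.0 - (k_i * G.degree(j)) / (2.0 * m)
    return gf / (2.0 * m)

G = nx.karate_club_graph()
labels = {v: (0 if v < 17 else 1) for v in G.nodes()}
print(genetic_fitness(G, 0, labels))
```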

The effect (history) of the local search on chromosome CR^i at generation t is represented by the meme M^i(t), which is equipped with a learning automaton LA_i whose action set a_i = {c_1, c_2, ..., c_n} consists of the communities that can be selected by LA_i. This effect is represented by the action probability vector of the learning automaton in the meme M^i(t), as follows

M^i(t) = [M_1^i(t), M_2^i(t), ..., M_n^i(t)]    (5.3)

where 1 ≤ i ≤ n and, for all i, \sum_{k=1}^{n} M_k^i(t) = 1, in which M_k^i(t) denotes the probability that action k of the learning automaton in the meme M^i(t) is selected in the exploitation process. In other words, M_k^i(t) is the probability that community c_k is selected by the local search for node v_i. M_k^i(0), for 1 ≤ i, k ≤ n, is initially set to 1/n. Updating the action probability vector of the learning automaton associated to meme M^i(t) is


performed on the basis of the result of applying the local search on the chromosome CR^i, as described in the next paragraph. Let MF be a function used to evaluate the effect (history) of the local search on a chromosome. The effect of the local search on chromosome CR^i at generation t, referred to as the memetic fitness, is denoted by MF^i(t). The memetic fitness of chromosome CR^i is calculated as follows

MF^i(t) = M_k^i(t)    (5.4)

where k is the action of the learning automaton in the meme M^i(t) that corresponds to the value of chromosome CR^i. The memetic fitness of a chromosome changes when the action probability vector of the learning automaton in its corresponding meme is updated. Updating is performed on the basis of the result of applying the local search on a chromosome. It is worth noting that the local search changes only the action probability vector of the meme, not the value of the chromosome; that is, the local search changes only the memetic fitness, not the genetic fitness. The local search is applied to chromosome CR^i based on the genetic information (genotype and genetic fitness) and the memetic information (action probability vector and memetic fitness) of chromosome CR^i and of its adjacent chromosomes. Let α_i be the community of node v_i represented by chromosome CR^i, let N_i = {u | [u, v_i] ∈ E} be the set of neighbors of node v_i, let N_i(c) = {j ∈ N_i : α_j = c} be the set of neighbors of node v_i with community c, and let |N_i(c)| be the number of neighbors of node v_i with community c. At generation t, the action α_i of the learning automaton associated to meme M^i(t) is rewarded if the community of node v_i has the highest priority amongst its neighbors' communities; otherwise, it is penalized. The priority of node v_i can be described as follows

priority(v_i) = \begin{cases} 1, & \text{if } |N_i(α_i)| > |N_j(α_j)| \ \forall j \in N_i \\ 1, & \text{if } |N_i(α_i)| = |N_j(α_j)| \ \forall j \in N_i \text{ and } M^i(t) > M^j(t) \ \forall j \in N_i \\ 0, & \text{otherwise} \end{cases}    (5.5)

where M^i(t) and M^j(t) are the memetic fitnesses of chromosomes CR^i and CR^j at generation t, respectively. The priority concept can effectively overcome the resolution limit problem in the proposed algorithm (a sketch of this rule is given below). In the last step of the proposed algorithm, the learning automaton associated to meme M^i(t) randomly chooses one of its actions and, as a result, a new chromosome is generated. If the genetic fitness of the new chromosome is higher than the genetic fitness of chromosome CR^i, the newly generated chromosome replaces chromosome CR^i. The community detection process continues (in parallel) for each node v_i until the probability of an action of the learning automaton associated to meme M^i(t) exceeds a pre-specified threshold (e.g., π_i). The relationship between node v_i and its neighbors is shown in Fig. 5.2.
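The priority rule of Eq. (5.5) can be sketched as follows; note that the tie-breaking branch follows our reading of the formula, in which equal neighbor counts are resolved by memetic fitness, and `memetic_fitness` is a hypothetical node-to-value map.

```python
def neighbor_count(G, node, label, labels):
    """|N_i(c)|: neighbors of `node` whose community label equals `label`."""
    return sum(1 for u in G.neighbors(node) if labels[u] == label)

def priority(G, node, labels, memetic_fitness):
    """priority(v_i) from Eq. (5.5): 1 if v_i's community dominates its
    neighborhood (ties broken by memetic fitness), 0 otherwise."""
    own = neighbor_count(G, node, labels[node], labels)
    others = [neighbor_count(G, j, labels[j], labels) for j in G.neighbors(node)]
    if all(own > o for o in others):
        return 1
    if all(own >= o for o in others) and all(
            memetic_fitness[node] > memetic_fitness[j] for j in G.neighbors(node)):
        return 1
    return 0
```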


Fig. 5.2 The relationship between node vi and its neighbors in MLAMA-Net algorithm

Fig. 5.3 Pseudo-code of the Michigan memetic learning automata algorithm for community detection (MLAMA-Net)


The MLAMA-Net algorithm is a fully distributed algorithm in which each chromosome evolves locally based on its adjacent chromosomes and independently of the other chromosomes. The operation of MLAMA-Net can be described as follows. Initial chromosomes are created randomly, and the probability of selecting an action for all learning automata is set to 1/n. MLAMA-Net progresses through a number of generations as long as the termination criteria are not satisfied. Each generation is divided into three phases: an exploration phase, an exploitation phase and a memetic effect phase. In the exploration phase, the mutation and crossover operators are applied to chromosome CR^i with rates r_m and r_c, respectively. In the exploitation phase, the local search is applied to chromosome CR^i, and then the action probability vector of the meme M^i(t) (the history) is updated according to a learning algorithm. In the memetic effect phase, chromosome CR^i is replaced with a new chromosome generated on the basis of the action probability vector of the learning automaton associated to meme M^i(t), if the genetic fitness of the new chromosome is higher than the genetic fitness of chromosome CR^i. Pseudo-code for MLAMA-Net is given in Fig. 5.3, and a skeleton of one generation is sketched below.
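Putting the three phases together, one generation could be organized roughly as follows; this skeleton reuses the `priority`, `update_lri` and `genetic_fitness` sketches above, assumes integer community labels 0..n_labels-1, and leaves out details such as refreshing the memetic fitness values between steps.

```python
import random

def one_generation(G, chromosomes, memes, memetic_fitness,
                   n_labels, r_m=0.1, r_c=0.8, a=0.01):
    """Skeleton of one MLAMA-Net generation; memes[v] is the action
    probability vector (length n_labels) of the LA in meme M^v."""
    for v in G.nodes():
        # exploration phase: mutation and crossover on chromosome CR^v
        if random.random() < r_m:
            chromosomes[v] = random.randrange(n_labels)
        if random.random() < r_c:
            u = random.choice(list(G.neighbors(v)))
            chromosomes[v], chromosomes[u] = chromosomes[u], chromosomes[v]
        # exploitation phase: reward the LA when v's community dominates
        beta = 0 if priority(G, v, chromosomes, memetic_fitness) == 1 else 1
        update_lri(memes[v], chosen=chromosomes[v], beta=beta, a=a)
        # memetic effect phase: LA proposes a label; keep it only if fitter
        gf_old = genetic_fitness(G, v, chromosomes)
        old = chromosomes[v]
        chromosomes[v] = random.choices(range(n_labels), weights=memes[v])[0]
        if genetic_fitness(G, v, chromosomes) <= gf_old:
            chromosomes[v] = old           # keep the fitter chromosome
```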

5.4 Community Detection Using Cellular Learning Automata

In this section, we introduce an irregular cellular learning automata-based algorithm called CLACD for revealing the communities in a network. The CLACD algorithm runs locally and independently at each cell of the CLA, which means that the decision made by each cell is local and independent of the others. The CLACD algorithm is also a fully distributed algorithm, which avoids getting trapped in local solutions. We will further detail how communities can be detected with the aid of cellular learning automata. To achieve this goal, we first describe the overall structure of the algorithm in brief. The CLACD algorithm, consisting of four steps, tries to reveal the community structures in the network. It is assumed that G = ⟨V, E⟩ is an undirected and unweighted graph, in which V = {v_1, v_2, ..., v_n} is the set of nodes and E ⊆ V × V is the set of links of the input network. For a graph G, there are many possible partitions. The main goal of community detection in the graph G is to reveal sub-graphs CP = {C_1, ..., C_k}, dividing the set V of nodes into k disjoint partitions such that a quality function φ(CP) is optimal. We note that no assumption provides either the number or the size of the communities of the proper partition. To achieve this goal, after the initialization step, the CLACD algorithm tries to find communities in an iterative manner by finding a partial spanning tree. The process of detecting communities is guided by the set of learning automata residing at the nodes, with the selection of neighboring nodes as the action selection over the input network. With the aid of the cellular learning automata, the set of obtained communities is evaluated by the reinforcement signals of both the local and the global environments, and the action probability vectors are updated until the stopping criteria are satisfied.


Fig. 5.4 Pseudo-code of the cellular learning automata-based community detection algorithm (CLACD)

It should be pointed out that an asynchronous structure for the cellular learning automata is adopted, for simplicity of implementation. The pseudo-code of the CLACD algorithm is given in Fig. 5.4. The proposed algorithm, called the cellular learning automata-based community detection algorithm (CLACD), is described in the following steps.


5.4.1 Initialization

To initialize the algorithm, an asynchronous ICLA is created that is isomorphic to the input network. To construct such an ICLA, each node is associated with a cell of the CLA, and then an LA is assigned to each cell (hereafter v_i may be used interchangeably for cell v_i or node v_i). The resulting ICLA can be described by a pair ⟨A, a⟩, where A = {A_1, A_2, ..., A_n} denotes the set of LAs residing in the cells (nodes) of the ICLA (network) and a = {a_1, a_2, ..., a_n} denotes the action set, in which a_i = {a_i^1, a_i^2, ..., a_i^{r_i}} (for each a_i ∈ a) represents the set of actions that can be taken by learning automaton A_i. Learning automaton A_i, residing in node v_i, has r_i actions, each of which corresponds to selecting one of the adjacent nodes. Let p(v_i) = {p_i^1, p_i^2, ..., p_i^{r_i}} be the action probability vector of learning automaton A_i, with p_i^j = 1/r_i initialized equally for all j.

5.4.2 Communities Formation

In this step, a partial spanning tree of the network is constructed and several local communities are formed on the partial spanning tree found by the algorithm. At first, learning automaton A_i in cell v_i selects an action based on its action probability vector, which corresponds to selecting learning automaton A_j. Let T_t be the spanning tree at iteration t. The current node v_i and the selected node v_j are added to T_t if this addition still constitutes a partial spanning tree. Then, learning automaton A_j in cell v_j selects an action. The process of selecting an action by each LA in each cell of the CLA and adding the selected node to T_t continues until either the set of remaining available LAs is empty or the newly selected action is one of the actions previously added to T_t. After all LAs have selected an action, the algorithm constructs local communities on the found partial spanning tree of the network. The partial spanning tree is used in order to reduce the network size and the computational cost of detecting communities, owing to the low time complexity of finding a partial spanning tree. A set of local communities CP = {C_1, ..., C_q, ..., C_k} is formed based on the local connectivity of cells, in which C_q is the qth local community formed by merging neighboring cells: if the number of internal connections obtained by merging cell v_i with cell v_j, or with a local community C_q, is greater than the number of internal connections of the current community C_q, then v_i and v_j form a local community or are assigned to the current community C_q. In this step, most nodes are assigned to the set of local communities CP; the remaining unassigned nodes are assigned to the kth community (i.e., C_k).


5.4.3 Computation of the Objective Function

In this step, the reinforcement signals of both the local and the global environments are used to evaluate the set of communities CP found by the algorithm. The conductance φ_t(CP) is used for generating the reinforcement signal of the global environment, in order to evaluate the quality of the current communities found by the algorithm at iteration t. Let vol(C_q) denote the total number of links within community C_q and Cut(C_q) denote the number of links falling between different community partitions (cut sizes) where one endpoint is inside community C_q. The quality of the found community set CP at iteration t can be calculated using the conductance as an objective function, defined by the following equation:

φ_t(CP) = \frac{1}{k} \sum_{q=1}^{k} \frac{Cut(C_q)}{\min(vol(C_q), vol(\bar{C}_q))}    (5.6)

where \bar{C}_q refers to the complement of community C_q (i.e., the rest of the network). In brief, the conductance computes the fraction of the total link volume that points outside the community. The average conductance at iteration t (i.e., θ_t), which is the average of all conductance values obtained up to that point, is calculated as follows:

θ_t = \frac{(t - 1)\,θ_{t-1} + φ_t(CP)}{t}    (5.7)

We use the conductance because it can be computed simply and, moreover, exhibits good performance in general. We note that a lower value of conductance is better. Hence, the reinforcement signal for the global environment, β_g, can be computed as follows

β_g = \begin{cases} 0, & θ_t ≤ θ_{t-1} \\ 1, & \text{otherwise} \end{cases}    (5.8)

The local environment of a learning automaton is configured based on the cellular learning automata in the neighboring cells and incorporates the neighboring nodes of the network. The reinforcement signal of the local environment, β_i, for learning automaton A_i in cell v_i of the cellular learning automata is defined as

β_i = \begin{cases} 0, & \sum_{i \in C_q} K_i^{in}(C_q) > \sum_{i \in C_q} K_i^{out}(C_q) \\ 1, & \text{otherwise} \end{cases}    (5.9)

where K_i^{in}(C_q) and K_i^{out}(C_q) are the numbers of internal and external links of node v_i in community C_q with respect to the given input network, respectively.
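Read literally, Eqs. (5.6), (5.8) and (5.9) translate into the following sketch; the running average of Eq. (5.7) is left to the caller, and the function names are our own.

```python
import networkx as nx

def conductance_objective(G, communities):
    """phi_t(CP) (Eq. 5.6): mean conductance over the k communities."""
    total = 0.0
    for C in communities:
        cut = sum(1 for _ in nx.edge_boundary(G, C))        # Cut(C_q)
        vol_in = sum(d for _, d in G.degree(C))             # vol(C_q)
        vol_out = 2 * G.number_of_edges() - vol_in          # vol of complement
        total += cut / min(vol_in, vol_out)
    return total / len(communities)

def global_signal(theta_t, theta_prev):
    """beta_g (Eq. 5.8): favorable (0) when the average conductance drops."""
    return 0 if theta_t <= theta_prev else 1

def local_signal(G, community):
    """beta_i (Eq. 5.9): favorable (0) when internal links dominate."""
    internal = 2 * G.subgraph(community).number_of_edges()  # sum of K_in
    external = sum(1 for _ in nx.edge_boundary(G, community))  # sum of K_out
    return 0 if internal > external else 1

# Eq. (5.7) in the caller: theta_t = ((t - 1) * theta_prev + phi_t) / t
```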


5.4.4 Updating the Action Probabilities

At every iteration t, for each cell, if the reinforcement signals of both the local and the global environments are favorable (i.e., β_i = 0 and β_g = 0), then the actions chosen by the learning automata are rewarded. Each learning automaton in each cell updates its action probability vector using the L_{R-I} reinforcement scheme.

5.4.5 Stopping Conditions

Here, we describe the stopping conditions of the CLACD algorithm. A simple criterion is the number of iterations, i.e., the algorithm terminates after a predefined number of iterations K_max. Furthermore, the entropy is used as another stopping criterion for an LA, which is defined as

E(p) = -\sum_{i} p_i \log p_i    (5.10)

where p_i is the probability of selecting action α_i by an LA. If the entropy value is less than a predefined threshold, it can be concluded that the algorithm has converged to an action, and it is therefore terminated. In the simulations, we compute the entropy for all LAs in the CLA (a minimal convergence test is sketched below). As mentioned in the previous section, the CLACD algorithm can be performed locally at each cell, independently of the other cells, to create local communities. As the CLACD algorithm proceeds, each LA in each cell of the CLA learns how to select a neighboring cell to create a local community. Since the local internal and external connections of each node in a local community are considered by the local reinforcement signal, and the set of communities is compared with the best set of communities created so far via the global reinforcement signal, the CLACD algorithm gradually yields near-optimal community structures. Pseudo-code of the proposed algorithm (CLACD) is given in Fig. 5.4.

5.4.6 Experiments

The performance of the LA-based algorithms (DLACD and CLACD) for the detection of community structures is studied through experimental simulations on well-known real networks (i.e., Karate, Dolphins, Books, Football, Net-science and Power-grid) and on synthetic modular networks (i.e., the LFR benchmark (Lancichinetti et al. 2008)), as described in Table 5.1.


Table 5.1 Description of the test networks used for the experiments

Networks   Vertex   Edge     Description
Karate     34       78       Network of Zachary's karate club (Zachary 1977)
Dolphins   62       159      Network of Lusseau's dolphins (Lusseau et al. 2003)
Books      105      441      Network of books about US politics (Newman 2015)
Football   115      615      Network of American College football union (Girvan and Newman 2001)
LFR1       5000     38,160   Synthetic modular network benchmark (Lancichinetti et al. 2008)

In the LFR benchmark, N indicates the number of nodes in the network, k indicates the average degree of the nodes, maxk indicates the maximum degree of the nodes, minc indicates the size of the smallest community, maxc indicates the maximum size of the communities, and the mixing parameter μ indicates the probability that nodes are connected with nodes of an external community. For the synthetic modular networks, we set the LFR benchmark parameters as N = 5000, k ∈ {15, 100}, maxk = 50, minc = 10, maxc = 50 and μ ∈ {0.10, ..., 0.50} with a span of 0.05. In the experiments presented in this section, the learning scheme is L_{R-I} and the learning rate is 0.01. The maximum threshold τ is 0.9 and the maximum number of iterations T is n × 1000, where n is the number of vertices of the graph. This experiment studies the performance of the presented LA-based algorithms (DLACD, MLAMA-Net and CLACD) for finding communities, in terms of the modularity Q and the normalized mutual information (NMI), in comparison with some popular algorithms: the genetic algorithm for community detection, termed GA-Net (Pizzuti 2008); the multi-objective genetic algorithm, termed MOGA-Net (Pizzuti 2012); the multi-objective evolutionary algorithm with decomposition, termed MOEA/D-Net (Gong et al. 2012); the cellular learning automata-based algorithm for community detection, termed CLA-Net (Zhao et al. 2015); and the Michigan memetic algorithm for community detection, termed MLAMA-Net (Mirsaleh and Meybodi 2016). The results of the first experiment are summarized in Table 5.2 in terms of the modularity Q. The modularity Q is a popular measure for evaluating a set of community structures and is defined as follows:

Q = \frac{1}{2m} \sum_{C \in P} \sum_{v_i, v_j \in C} \left( A_{i,j} - \frac{k_i k_j}{2m} \right)    (5.11)

where A is the adjacency matrix, with A_{i,j} equal to one if there is a link between node v_i and node v_j and zero otherwise. The degree of node v_i is k_i = \sum_j A_{i,j}, and m is the total number of links in the network. The summation is over all pairs of nodes that are members of the same community C under the partitioning P (Newman 2006). In Table 5.2, both the maximum (Max) and average (Avg.) values of modularity are reported; a direct computation of Q is sketched below.
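Eq. (5.11) is easy to evaluate directly, and networkx ships a modularity routine that can serve as a cross-check; the toy partition below is our own example.

```python
import networkx as nx
from networkx.algorithms.community import modularity

def modularity_q(G, communities):
    """Q (Eq. 5.11), computed directly from the adjacency structure."""
    m = G.number_of_edges()
    q = 0.0
    for C in communities:
        for vi in C:
            for vj in C:
                a_ij = 1.0 if G.has_edge(vi, vj) else 0.0
                q += a_ij - G.degree(vi) * G.degree(vj) / (2.0 * m)
    return q / (2.0 * m)

G = nx.karate_club_graph()
parts = [set(range(17)), set(range(17, 34))]
print(modularity_q(G, parts))    # should match the library routine
print(modularity(G, parts))
```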


Table 5.2 Comparison of the community detection algorithms on real networks in terms of maximum (Max) and average (Avg.) modularity

Network    Q      GA-Net   MOGA-Net   MOEA/D-Net   CLA-Net   MLAMA-Net   DLACD    CLACD
Karate     Max    0.4059   0.4198     0.4198       0.4198    0.4119      0.4188   0.4188
           Avg.   0.4059   0.4158     0.4198       0.4175    0.4136      0.4082   0.4039
Dolphins   Max    0.5014   0.5258     0.5210       0.5277    0.5277      0.5181   0.5277
           Avg.   0.4046   0.5215     0.5189       0.5268    0.5222      0.5011   0.5198
Football   Max    0.5940   0.5280     0.6044       0.6046    0.6058      0.5878   0.6044
           Avg.   0.5830   0.5173     0.6032       0.6042    0.6050      0.5814   0.5864
Book       Max    0.5230   0.5272     0.5268       0.5268    0.5272      0.5245   0.5223
           Avg.   0.5230   0.5255     0.5236       0.5254    0.5255      0.5228   0.5188

Fig. 5.5 Results of NMI for different algorithms on LFR benchmark networks (y-axis: NMI; x-axis: mixing parameter (µ) in the LFR benchmark, from 0.05 to 0.5; legend: GA-Net, MOGA-Net, MOEA/D-Net, CLA-Net, MLAMA-Net, DLACD, CLACD)

As one can see from Table 5.2, for Football, CLACD reveals community structures with a larger modularity value than GA-Net and MOGA-Net, while for the Dolphins network the CLACD algorithm produces results similar to those of MLAMA-Net and CLA-Net. These similar results are mainly due to overlapping community structures in some networks. The results of another experiment, on the synthetic modular LFR benchmark networks with the mixing parameter varied from μ = 0.00 to μ = 0.50 in 0.05 intervals, are plotted in Fig. 5.5 in terms of normalized mutual information (NMI). NMI (Danon et al. 2005) is used for networks with known community structures and measures the similarity between the real community structure and the community structure found by the algorithm. NMI is calculated as follows:

NMI(A, B) = \frac{-2 \sum_{a \in A} \sum_{b \in B} |a \cap b| \log\left(\frac{|a \cap b|\,n}{|a||b|}\right)}{\sum_{a \in A} |a| \log\left(\frac{|a|}{n}\right) + \sum_{b \in B} |b| \log\left(\frac{|b|}{n}\right)}    (5.12)


where A and B are two partitions of the input network and NMI takes values in the range [0, 1]; a higher value indicates greater agreement between the partitions A and B, with NMI equal to one when the two partitions are identical and zero when they are independent (Khomami et al. 2016). From the results shown in Fig. 5.5, one can observe that for mixing parameter values of less than 0.2 the community structure of the networks is clear, and hence the obtained NMI values are close to one, while for mixing parameters between 0.2 and 0.4 all algorithms attain lower NMI values, because as the mixing parameter increases, the detection of communities becomes a more difficult task. Finally, for mixing parameter values greater than 0.4, the performance of MLAMA-Net is superior to all other algorithms in terms of NMI. Overall, the LA-based community detection algorithms (CLA-Net, MLAMA-Net, DLACD and CLACD) outperform the other community detection algorithms. A minimal NMI computation is sketched below.
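Eq. (5.12) can be implemented verbatim for two partitions given as equal-length label lists; the function below is a sketch under that assumption.

```python
import math
from collections import Counter

def nmi(labels_a, labels_b):
    """NMI (Eq. 5.12) for two partitions given as equal-length label lists."""
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))        # |a ∩ b| for each pair
    num = -2.0 * sum(c * math.log(c * n / (count_a[a] * count_b[b]))
                     for (a, b), c in joint.items())
    den = (sum(c * math.log(c / n) for c in count_a.values())
           + sum(c * math.log(c / n) for c in count_b.values()))
    return num / den

print(nmi([0, 0, 1, 1], [0, 0, 1, 1]))   # identical partitions -> 1.0
print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))   # relabeling still -> 1.0
```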

5.5 Conclusion

The focus of this chapter was on introducing learning automata-based algorithms for community detection in social networks, including distributed learning automata, Michigan memetic learning automata and cellular learning automata. The learning automata-based algorithms for community detection gradually guide the search toward proper community structures. Computer simulations on well-known real and synthetic modular networks reveal the superiority of the LA-based algorithms for community finding in comparison with alternative algorithms with respect to modularity and NMI.

References

Brandes U (2006) Finding community structure in very large networks. Phys Rev E 70:066111
Chen D, Zou F, Lu R et al (2016) Multi-objective optimization of community detection using discrete teaching–learning-based optimization with decomposition. Inf Sci 369:402–418. https://doi.org/10.1016/j.ins.2016.06.025
Danon L, Díaz-Guilera A, Duch J et al (2005) Comparing community structure identification. J Stat Mech: Theory Exp 2005:219–228. https://doi.org/10.1088/1742-5468/2005/09/P09008
Dhillon IS, Guan Y, Kulis B (2004) Kernel k-means, spectral clustering and normalized cuts. In: Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 551–556
Elyasi M, Meybodi M, Rezvanian A, Haeri MA (2016) A fast algorithm for overlapping community detection. In: 2016 eighth international conference on information and knowledge technology (IKT). IEEE, pp 221–226
Fortunato S (2010) Community detection in graphs. Phys Rep 486:75–174. https://doi.org/10.1016/j.physrep.2009.11.002
Fortunato S, Hric D (2016) Community detection in networks: a user guide. Phys Rep 659:1–44. https://doi.org/10.1016/j.physrep.2016.09.002
Girvan M, Newman MEJ (2001) Community structure in social and biological networks. Proc Natl Acad Sci 99:7821–7826. https://doi.org/10.1073/pnas.122653799


Gong M, Ma L, Zhang Q, Jiao L (2012) Community detection in networks by using multiobjective evolutionary algorithm with decomposition. Phys A Stat Mech Appl 391:4050–4060. https://doi.org/10.1016/j.physa.2012.03.021
Hosseini R, Azmi R (2015) Memory-based label propagation algorithm for community detection in social networks. In: Proceedings of the international symposium on artificial intelligence and signal processing, AISP 2015. IEEE, pp 256–260
Ji J, Song X, Liu C, Zhang X (2013) Ant colony clustering with fitness perception and pheromone diffusion for community detection in complex networks. Phys A Stat Mech Appl 392:3260–3272. https://doi.org/10.1016/j.physa.2013.04.001
Khomami MMD, Rezvanian A, Meybodi MR (2016) Distributed learning automata-based algorithm for community detection in complex networks. Int J Mod Phys B 30:1650042. https://doi.org/10.1142/S0217979216500429
Kumpula JM, Saramaki J, Kaski K, Kertesz J (2007) Limited resolution and multiresolution methods in complex network community detection. Fluct Noise Lett 7:L209–L214. https://doi.org/10.1117/12.725560
Lancichinetti A, Fortunato S, Radicchi F (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E 78:1–6. https://doi.org/10.1103/PhysRevE.78.046110
Le Martelot E, Hankin C (2012) Multi-scale community detection using stability optimisation within greedy algorithms. Int J Web Based Communities 9:323–348. https://doi.org/10.1504/IJWBC.2013.054907
Lusseau D, Schneider K, Boisseau OJ et al (2003) The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Behav Ecol Sociobiol 54:396–405. https://doi.org/10.1007/s00265-003-0651-y
Maity S, Rath SK (2014) Extended clique percolation method to detect overlapping community structure. In: Proceedings of the 2014 international conference on advances in computing, communications and informatics, ICACCI 2014. IEEE, pp 31–37
Mirsaleh MR, Meybodi MR (2016) A Michigan memetic algorithm for solving the community detection problem in complex network. Neurocomputing 214:535–545. https://doi.org/10.1016/j.neucom.2016.06.030
Newman MEJ (2006) Modularity and community structure in networks. Proc Natl Acad Sci 103:8577–8582. https://doi.org/10.1073/pnas.0601602103
Newman MEJ (2015) Newman dataset. http://www-personal.umich.edu/~mejn/netdata/
Pizzuti C (2008) GA-Net: a genetic algorithm for community detection in social networks. In: Lecture notes in computer science. Springer, pp 1081–1090
Pizzuti C (2012) A multiobjective genetic algorithm to find communities in complex networks. IEEE Trans Evol Comput 16:418–430. https://doi.org/10.1109/TEVC.2011.2161090
Rabbany R, Takaffoli M, Fagnan J et al (2013) Communities validity: methodical evaluation of community mining algorithms. Soc Netw Anal Min 3:1039–1062. https://doi.org/10.1007/s13278-013-0132-x
Raghavan UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E 76:036106
Ranjbar A, Maheswaran M (2014) Using community structure to control information sharing in online social networks. Comput Commun 41:11–21. https://doi.org/10.1016/j.comcom.2014.01.002
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905
Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33:452–473. https://doi.org/10.1086/jar.33.4.3629752
Zhao Y, Jiang W, Li S et al (2015) A cellular learning automata based algorithm for detecting community structure in complex networks. Neurocomputing 151:1216–1226. https://doi.org/10.1016/j.neucom.2014.04.087

Chapter 6

Social Link Prediction

6.1 Introduction

The advancement of the internet has provided better opportunities for collaboration and interaction among people and organizations. It has paved the way for the emergence of social networks over the internet, which are nowadays very popular. A social network can be formally represented as a graph, where the vertices represent people or organizations and the connecting edges indicate social connections. Social Network Analysis (SNA) is a vast area of research dealing with techniques and strategies for the study of social networks (Liben-Nowell and Kleinberg 2007). The analysis and knowledge of networks are widely employed to understand the behavior of a community (Al Hasan and Zaki 2011). SNA offers opportunities and benefits in different areas such as marketing, economics, health, sociology and safety (Al Hasan and Zaki 2011). Link prediction is one of the main tasks undertaken by SNA. The task is concerned with predicting the prospective existence of relationships among nodes in a network, based on patterns observed in the existing nodes and relations. Link prediction can help us understand the mechanisms that trigger the evolution of a social network, and it can be applied in many areas. For instance, in the area of internet and web science, it can be used in tasks such as automatic web hyperlink creation (Adafre and de Rijke 2005) as well as web site hyperlink prediction (Zhu et al. 2002). In e-commerce, one of the most prevalent uses of link prediction is to build recommendation systems (Li and Chen 2009; Huang et al. 2005). It also finds various applications in other scientific fields. For example, in bibliography and library science, it can be tapped for de-duplication (Malin et al. 2005) as well as record linkage (Elmagarmid and Member 2007). In bioinformatics, it has been used in protein-protein interaction (PPI) prediction (Freschi 2009). In security-related areas, it can be applied to identify hidden groups of terrorists and criminals (Al Hasan and Zaki 2011). Most of the current link prediction methods have been proposed based on a static network representation, where a snapshot of the network structure is available and the goal is to predict the future links. In such a static network, each link occurrence


is represented by a one-time binary event. But in reality, social networks are online, non-deterministic and unpredictable, and the structure of the network and its parameters change over time; so using deterministic social network models with fixed values for links is restrictive when solving real social network problems. In other words, link prediction methods based on the static graph representation fail when the social network has online and non-deterministic behavior. In these applications, much richer information could be extracted using other network structures, such as stochastic social networks, in which each link is a random variable; fuzzy social networks, in which each link is a fuzzy variable that describes the relationship between the two incident nodes; weighted social networks, in which each link has a weight that indicates the strength of the corresponding link; and time series social networks, in which we have the time series information of the link occurrences. So in this chapter we first introduce link prediction and then present some novel link prediction methods that use different graph structures and learning automata, and show that the presented methods are superior to other recent link prediction methods.

6.2 Link Prediction

A classic definition of the link prediction problem is: "Given a snapshot of a social network at time t, we seek to accurately predict the links that will be added to the network during the interval from time t to a given future time t + 1" (Al Hasan and Zaki 2011). The most widespread approach to the problem is to calculate topological/structural patterns of the social network of interest (Murata and Moriyasu 2008). Different metrics to describe node pairs have been adopted in previous studies (McCallum and Guillemin 2013); they explore the structural patterns of the network and commonly provide a degree of proximity/similarity between the nodes. There are many similarity metrics (Al Hasan and Zaki 2011): (1) local similarity metrics, which only use the local information of a link to calculate the similarity: Common Neighbors, Salton Index, Jaccard Index, Hub Depressed Index, Hub Promoted Index, Leicht-Holme-Newman Index (LHN1), Preferential Attachment Index, Adamic-Adar Index and Resource Allocation Index; (2) global similarity metrics, which can use all the information in the network to calculate the similarity between two nodes: Katz Index, Leicht-Holme-Newman Index (LHN2) and Matrix Forest Index (MFI); and (3) quasi-local metrics, which do not require global topological information but use more information than local indices: Local Path Index, Local Random Walk, Superposed Random Walk, Average Commute Time, Cos+, random walk with restart, SimRank, Resource Allocation Index and Local Path Index. As previously mentioned, the starting point in these approaches is to extract the values/scores of the different metrics that represent the proximity of pairs of nodes. Then, the pairs of non-connected nodes are first ranked according to a chosen metric (for instance, the number of common neighbors) (Liben-Nowell and Kleinberg 2007). After that, the top L ranked pairs are assigned as predicted links. To put it another way, it is always assumed that the links with the highest scores


are most likely to occur. In the following we briefly present the eight common similarity measures that we later use to redefine the stochastic similarity metrics. In the rest of this chapter, the notation Γ(x) denotes the set of neighbors of node x.

1. Common Neighborhood: under this measure, two nodes v_i and v_j are more likely to be linked if they have many common neighbors. The score is defined as:

$$CN(v_i, v_j) = |\Gamma(v_i) \cap \Gamma(v_j)| \tag{6.1}$$

2. Salton: because the common neighborhood metric is not normalized, several other metrics have been proposed to normalize it. The Salton score is one of them; it normalizes the CN metric by $\sqrt{|\Gamma(v_i)| \times |\Gamma(v_j)|}$ and is defined as:

$$Salton(v_i, v_j) = \frac{|\Gamma(v_i) \cap \Gamma(v_j)|}{\sqrt{|\Gamma(v_i)| \times |\Gamma(v_j)|}} \tag{6.2}$$

3. Jaccard Index: this index was proposed by Jaccard to normalize the CN metric and is defined as:

$$Jaccard(v_i, v_j) = \frac{|\Gamma(v_i) \cap \Gamma(v_j)|}{|\Gamma(v_i) \cup \Gamma(v_j)|} \tag{6.3}$$

This metric is defined in such a way that a higher number of common neighbors yields a higher score.

4. Preferential Attachment (PA) (Barabási 1999): the preferential attachment algorithm is based on the preferential attachment phenomenon (Newman 2001) discovered in a variety of social networks. In this method the link score is set to be the product of the degrees of the involved nodes:

$$PA(v_i, v_j) = |\Gamma(v_i)| \times |\Gamma(v_j)| \tag{6.4}$$

5. Adamic-Adar Index (AA) (Adamic and Adar 2003): this index is an extension of the common neighborhood method in which less-connected common neighbors receive more weight:

$$AA(v_i, v_j) = \sum_{z \in \Gamma(v_i) \cap \Gamma(v_j)} \frac{1}{\log |\Gamma(z)|} \tag{6.5}$$

This metric was first proposed as a measure of similarity between two web pages.

6. Resource Allocation Index (RA) (Zhou et al. 2009): this index and the Adamic-Adar index have similar formulas but come from different motivations. RA is based on physical processes of resource allocation and can be applied to networks formed by airports (flows of aircraft and passengers) or networks formed by electric power stations. It is defined as:

$$RA(v_i, v_j) = \sum_{z \in \Gamma(v_i) \cap \Gamma(v_j)} \frac{1}{|\Gamma(z)|} \tag{6.6}$$

7. Katz Index (Katz 1953): in this metric, similarity is defined as a weighted sum over the numbers of paths of different lengths, such that shorter paths receive more weight:

$$Katz(v_i, v_j) = \sum_{l=1}^{\infty} \beta^l \cdot |path^l(v_i, v_j)| \tag{6.7}$$

where |path^l(v_i, v_j)| is the number of paths between v_i and v_j with length l. The parameter β (≤ 1) can be used to regularize this feature. The challenge of the Katz metric is its computational complexity. It can also be shown that the Katz metric can be calculated from:

$$Katz = (I - \beta A)^{-1} - I \tag{6.8}$$

where A is the adjacency matrix and I is an identity matrix of proper size. This method also has high complexity in large social networks.

8. LP Index (Lü et al. 2009): this index is a restricted version of the Katz metric in which only paths of lengths 2 and 3 are considered. It has a lower computational complexity than Katz and is defined as:

$$LPIndex(v_i, v_j) = A^2 + \epsilon A^3 \tag{6.9}$$

where A is the adjacency matrix and ε is a free parameter. If ε = 0, LP is the same as the CN metric. For more information about other similarity metrics please refer to Liben-Nowell and Kleinberg (2007).
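To make the ranking procedure concrete, the following minimal Python sketch (our illustration, not code from the book; the toy graph and function names are assumptions) computes four of the local scores above and ranks the non-connected pairs:

```python
# Minimal sketch of local similarity scores on an unweighted, undirected
# graph stored as an adjacency dictionary {node: set_of_neighbors}.
import math

graph = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2, 4},
    4: {2, 3},
}

def common_neighbors(g, u, v):
    """CN(u, v) = |Gamma(u) & Gamma(v)|, Eq. (6.1)."""
    return len(g[u] & g[v])

def jaccard(g, u, v):
    """Jaccard(u, v), Eq. (6.3)."""
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0

def adamic_adar(g, u, v):
    """AA(u, v), Eq. (6.5): down-weight high-degree common neighbors."""
    return sum(1.0 / math.log(len(g[z])) for z in g[u] & g[v] if len(g[z]) > 1)

def resource_allocation(g, u, v):
    """RA(u, v), Eq. (6.6)."""
    return sum(1.0 / len(g[z]) for z in g[u] & g[v])

# Rank all non-adjacent pairs by a chosen score, e.g. common neighbors;
# the top-L ranked pairs are the predicted links.
pairs = [(u, v) for u in graph for v in graph if u < v and v not in graph[u]]
pairs.sort(key=lambda p: common_neighbors(graph, *p), reverse=True)
print(pairs[:3])
```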


6.3 Link Prediction in Stochastic Social Networks

Predicting linkage between data objects is an interesting task in the data mining research area. Most of the previous link prediction methods have been proposed based on a static network representation, where a snapshot of the network structure is available and the goal is to predict the future links. In such a static network, each link occurrence is represented by a one-time event, and the only interest is the existence of the link. For example, one may be interested to know whether a customer will purchase a product in the future, or whether an author will ever collaborate with another author. But in many applications, social networks are online, non-deterministic and unpredictable, and the structure of the network and its parameters change over time; so using deterministic social network models with fixed values for links is restrictive in solving real social network problems. In other words, link prediction methods based on the static graph representation fail when the social network has online and non-deterministic behavior. One solution to overcome this problem is link prediction in online and stochastic social networks, in which each link is a random variable. In Rezvanian and Meybodi (2016), the authors proposed the stochastic social network and showed that a stochastic social network with a random variable for each link is a good approach to overcome this problem. By choosing a stochastic social network, they redefined some measures of social networks such as degree distribution, betweenness, clustering coefficient and strongness measures. So in this subsection, we present a learning automata based link prediction method for stochastic social networks, called SLP, that predicts link occurrence using the uncertainty of the data (Moradabadi and Meybodi 2018b). The presented method can be extended to an online link prediction method for online stochastic social networks, where links can be added or removed over time and the future links must be predicted again. To reach the goal of this sub-section, we first present some similarity metrics for stochastic graphs and then design a learning automaton based algorithm for calculating the similarity metric under the condition that the probability distribution function of the weight of each link is unknown. In other words, the presented algorithm tries to estimate the probability distribution functions of the similarity metrics using learning automata and sampling. The process of sampling from the links of the graph is guided with the aid of learning automata in such a way that the number of samples that must be taken from the links of the stochastic graph and the computational complexity of estimating the similarity metric probability distributions are reduced as much as possible. Similar to traditional similarity methods, where the output of link prediction is the set of links with higher similarity, the output of the presented link prediction is a set of stochastic links with their similarity probability distribution functions. So, to determine a link's existence, we should sample from its similarity probability distribution function, and based on this sample and a predefined threshold we decide about the link's existence. This threshold is chosen based on empirical results and expert knowledge. It should also be noticed that the presented method can be applied in online stochastic social networks. In many recent studies, researchers use


synthetic social networks similar to real networks in order to study and evaluate their methods using computer generated graphs. Synthetic data contains no personal information and cannot be traced back to any individual; therefore, the use of synthetic data avoids confidentiality and privacy issues. So, in order to evaluate the performance of the presented algorithm in stochastic social networks, we conduct several experiments using different synthetic stochastic social networks and show that the stochastic models achieve better link prediction performance in comparison with a common stochastic link prediction method. The rest of the section is organized as follows. In Sect. 6.3.1 the presented similarity metrics for stochastic graphs are described. Section 6.3.2 introduces the presented link prediction method for stochastic social networks based on learning automata. Section 6.3.3 presents the simulation results, and finally Sect. 6.3.4 summarizes the main discussion of the presented method.

6.3.1 Similarity Metrics in Stochastic Graphs

Generally, a stochastic graph G can be described by a triple ⟨V, E, W⟩, where V = {v_1, v_2, ..., v_n} is the set of nodes, E = {e_ij} ⊆ V × V is the set of links, and W is a matrix in which w_ij is a random variable associated with link e_ij if such a link exists. In this sub-section we redefine some similarity metrics for the link prediction problem in stochastic networks, as follows:

1. Stochastic Common Neighborhood: in this measure, the common neighborhood score is a random variable that represents the sum of the shared stochastic weights that the two nodes v_i and v_j have in their common neighbors, defined by:

$$CN(v_i, v_j) = \sum_{\{k \mid e_{ik} \in E,\, e_{jk} \in E\}} \min(w_{ik}, w_{jk}) \tag{6.10}$$

2. Stochastic Salton: this score in stochastic social networks is defined as the following random variable:

$$Salton(v_i, v_j) = \frac{\sum_{\{k \mid e_{ik} \in E,\, e_{jk} \in E\}} \min(w_{ik}, w_{jk})}{\sqrt{\sum_{\{k \mid e_{ik} \in E\}} w_{ik} \times \sum_{\{l \mid e_{jl} \in E\}} w_{jl}}} \tag{6.11}$$

3. Stochastic Jaccard Index: this index in stochastic social networks is defined as the following random variable:

$$Jaccard(v_i, v_j) = \frac{\sum_{\{k \mid e_{ik} \in E,\, e_{jk} \in E\}} \min(w_{ik}, w_{jk})}{\sum_{\{k \mid e_{ik} \in E,\, e_{jk} \in E\}} \max(w_{ik}, w_{jk})} \tag{6.12}$$

4. Stochastic Preferential Attachment: the preferential attachment (PA) random variable in the stochastic network is defined as follows:

$$PA(v_i, v_j) = \sum_{\{k \mid e_{ik} \in E\}} w_{ik} \times \sum_{\{l \mid e_{jl} \in E\}} w_{jl} \tag{6.13}$$

5. Stochastic Adamic-Adar Index: this index is an extension of the common neighborhood method, and in the stochastic network it is defined as the following random variable:

$$AA(v_i, v_j) = \sum_{\{k \mid e_{ik} \in E,\, e_{jk} \in E\}} \frac{1}{\log \sum_{\{z \mid e_{zk} \in E\}} w_{zk}} \tag{6.14}$$

6. Stochastic Resource Allocation Index: this index in the stochastic social network is defined as the following random variable:

$$RA(v_i, v_j) = \sum_{\{k \mid e_{ik} \in E,\, e_{jk} \in E\}} \frac{1}{\sum_{\{z \mid e_{zk} \in E\}} w_{zk}} \tag{6.15}$$

7. Stochastic Katz Index: in this metric, the similarity is defined as the sum of the stochastic path weights between v_i and v_j over different path lengths, such that shorter paths receive more weight. It is defined as the following random variable:

$$Katz(v_i, v_j) = \sum_{l=1}^{\infty} \beta^l \times \sum_{e_{zk} \in path^l(v_i, v_j)} w_{zk} \tag{6.16}$$

where path^l(v_i, v_j) is any path between v_i and v_j with length l.

8. Stochastic LP Index: this index is a restricted version of the Katz metric in which only paths of lengths 2 and 3 are considered. This metric in the stochastic social network is a random variable and is defined as:

$$LPIndex(v_i, v_j) = \sum_{l=2}^{3} \beta^l \times \sum_{e_{zk} \in path^l(v_i, v_j)} w_{zk} \tag{6.17}$$
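As an illustration of what a stochastic similarity metric means computationally, the sketch below (ours, with a hypothetical four-link graph and assumed exponential weight distributions) approximates the distribution of the stochastic common neighborhood score of Eq. (6.10) by plain Monte Carlo sampling; the SLP algorithm of Sect. 6.3.2 replaces this blind sampling with learning automata guided sampling:

```python
# Monte Carlo approximation of the stochastic CN score of Eq. (6.10).
import random

# Hypothetical stochastic graph: each link maps to the mean of an
# exponential weight distribution (assumed for illustration only).
links = {(1, 3): 1.0, (2, 3): 1.5, (1, 4): 2.0, (2, 4): 2.5}

def sample_weight(mean):
    # One draw of a link weight from its exponential distribution.
    return random.expovariate(1.0 / mean)

def sample_stochastic_cn(u, v, common):
    """One sample of CN(u, v) = sum over common neighbors k of min(w_uk, w_vk)."""
    total = 0.0
    for k in common:
        w_uk = sample_weight(links[min(u, k), max(u, k)])
        w_vk = sample_weight(links[min(v, k), max(v, k)])
        total += min(w_uk, w_vk)
    return total

# Empirical distribution of the stochastic CN score of the pair (1, 2),
# whose common neighbors here are nodes 3 and 4.
samples = [sample_stochastic_cn(1, 2, common=[3, 4]) for _ in range(10_000)]
print(sum(samples) / len(samples))  # Monte Carlo mean of the CN random variable
```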


6.3.2 Link Prediction in Stochastic Graphs

In the previous sub-section, we redefined some similarity metrics for stochastic graphs. In this sub-section, a link prediction method based on learning automata (SLP) is presented for stochastic social networks under the condition that the probability distribution functions of the weights associated with the links of the graph are unknown. The presented algorithm takes advantage of learning automata to estimate the distribution of a chosen similarity metric in order to predict future links. After an initialization phase, the algorithm iterates the sampling phase, updating phase and feedback phase until one of the stopping conditions is met. Similar to traditional similarity methods, where the output of link prediction is the set of links with higher similarity, the output of the presented link prediction is a set of stochastic links with their similarity probability distribution functions. So, to determine a link's existence, we should sample from its similarity probability distribution function, and based on this sample and a predefined threshold we decide about the link's existence. This threshold is chosen based on empirical results and expert knowledge. The presented method can also have an additional phase, called the changes phase, which can be used in online stochastic social networks to predict the future links after a change in the network has occurred. The details of the initialization, sampling phase, updating phase, feedback phase, stopping phase and changes phase of the presented method are given below.

6.3.2.1 Initialization

Let G(V, E, W) be the input stochastic graph, where V = {v_1, v_2, ..., v_n} is the set of nodes, E = {e_ij} ⊆ V × V is the set of links, and W = {w_ij} is the set of random variables, each of which is associated with a link weight of the input stochastic graph. It is assumed that the weight of each edge is a positive random variable with an unknown probability distribution function. The presented algorithm uses two sets of learning automata, LA_Links and LA_Tests:

1. LA_Links contains one LA for each existing link in the network and tries to estimate the importance of sampling the corresponding link when calculating the similarity metrics. Each LA_L in LA_Links has two actions: {α_L^1 = "take sample", α_L^2 = "do not take sample"}. Let p_{L_i} = (p_{L_i}^1, p_{L_i}^2) be the action probability vector of learning automaton LA_{L_i}, initialized equally as p_{L_i}^1 = p_{L_i}^2 = 1/2 for all v_i ∈ V. The process of taking samples from a link is determined by the learning automaton assigned to that link. So this set of LAs is trained in such a way that the number of samples of the network required for calculating the similarity metric is reduced as much as possible. At the beginning of the algorithm, the weight of each link is initialized with some random samples in order to provide a coarse estimate of the weight of that link.
2. LA_Tests contains one LA for each test link that must be predicted and tries to estimate whether the distribution of the similarity metric corresponding to the test link


must be updated or not. Each LA_T in LA_Tests also has two actions: {α_T^1 = "update the similarity metric probability distribution", α_T^2 = "do not update the similarity metric probability distribution"}. Let p_{T_i} = (p_{T_i}^1, p_{T_i}^2) be the action probability vector of learning automaton LA_{T_i}, initialized equally as p_{T_i}^1 = p_{T_i}^2 = 1/2 for all t_i ∈ T (the test set). The action of each LA_T in LA_Tests determines whether we should update the similarity metric probability distribution of the corresponding link or not. So LA_Tests is guided such that the computational complexity of calculating the similarity metrics is decreased as much as possible.
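A minimal sketch of this initialization, under our reading of the text (the variable names and link sets are illustrative assumptions), looks as follows:

```python
# One two-action learning automaton per existing link (take / don't take a
# sample) and one per test link (update / don't update its similarity
# distribution), all starting from the uniform action probability vector.
import random

existing_links = [(1, 2), (1, 3), (2, 3), (2, 4)]   # hypothetical stochastic links
test_links = [(1, 4), (3, 4)]                        # links to be predicted

# p[0] = Pr(action 1), p[1] = Pr(action 2); both initialized to 1/2.
la_links = {e: [0.5, 0.5] for e in existing_links}
la_tests = {t: [0.5, 0.5] for t in test_links}

def choose_action(p):
    """Sample an action index from an LA's action probability vector."""
    return 0 if random.random() < p[0] else 1
```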

6.3.2.2 Sampling Phase: Sampling the Stochastic Social Network

In each iteration of the presented method, in the first phase, each LA_L in LA_Links chooses an action in a parallel manner. The action of each LA_L determines whether we should sample a new weight or use the previous weight as the weight of the corresponding link in the current iteration. If the chosen action is α_L^1 = "take sample", then we sample a new weight for the corresponding link; if it is α_L^2 = "do not take sample", then we use the weight of the previous iteration as the weight of the corresponding link in the current iteration. So this set of LAs is trained in such a way that the number of samples of the network required for calculating the final similarity metric probability distributions is reduced as much as possible. In other words, in SLP the sampling process implemented by the learning automata aims to take more samples from the promising regions of the graph, the regions that reflect a higher rate of activity, instead of walking around and taking unnecessary samples from non-promising regions. Now we have a weighted social network, and the first phase of the SLP is finished.

6.3.2.3 Updating Phase: Update the Similarity Metric Probability Distributions

In this phase, we try to update the probability distributions of the similarity metrics using the weighted graph generated in the previous phase. To do this, each LA_T in LA_Tests selects a new action. If the action of the LA_T is α_T^1 = "update the similarity metric probability distribution", a new estimate of the probability distribution of the similarity metric for the corresponding link is computed. But if the chosen action is α_T^2 = "do not update the similarity metric probability distribution", then we do not update the similarity metric probability distribution of the corresponding link and use its previous probability distribution as its current one.

6.3.2.4 Feedback Phase: Update the Action Probabilities of LAs

The goal of this phase is to reward or penalize each LA in the network so that it updates its action probability distribution. To do this, we first calculate the distance between the current similarity metric probability distribution and the one obtained in the previous iteration for every test link i, and call it d_i. To calculate the distance we use the skew divergence, which is described in the experiment settings. This metric was chosen based on empirical experiments and the trade-off between computational complexity and accuracy. We then use this value to generate a reinforcement signal for every LA in the network based on the following equations:

$$\beta_{T_i} = d_i \;\; \text{for every } i \in LA_{Tests}, \qquad \beta_{L_j} = \frac{\sum_{i \in m} d_i}{|m|} \;\; \text{for every } j \in LA_{Links} \tag{6.18}$$

where m is the set of test links that use link j to calculate their similarity metrics. Finally, we update each LA according to the L_R-P learning algorithm and the generated reinforcement signal. In other words, for each LA_{T_i} in LA_Tests we use d_i as its reinforcement signal and penalize or reward the action α_T^1 (update the similarity metric probability distribution) of the corresponding LA based on the closeness of d_i to 0 and 1, respectively. That is, if the new distribution is close to the previous distribution, the distribution of the link has not changed with respect to the previous iteration, so we reward the action α_T^2 (do not update the similarity metric probability distribution); if the new distribution differs from the previous one, we reward the action α_T^1 (update the similarity metric probability distribution). Also, for each LA_{L_i} in LA_Links we use the average reinforcement signal of the test links that use the corresponding link to calculate their similarity metrics. So the presented algorithm tries to sample the links that are more important for generating new information about the similarity metrics, by using LA_Links, and to reduce the cost of calculating the similarity metrics in each iteration, by using LA_Tests.
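The following sketch shows one plausible form of the L_R-P update for a two-action automaton, assuming the reinforcement signal of Eq. (6.18) is thresholded into a binary reward/penalty; the exact mapping used in the book may differ:

```python
# L_R-P update for a two-action learning automaton with probability
# vector p, reward parameter a and penalty parameter b.  beta is the
# distance-based signal in [0, 1]; small beta is treated as favourable
# (an assumption for this sketch).
def lrp_update(p, chosen, beta, a=0.05, b=0.003, threshold=0.5):
    other = 1 - chosen
    if beta < threshold:                       # favourable response: reward
        p[chosen] += a * (1.0 - p[chosen])     # p_chosen <- p_chosen + a(1 - p_chosen)
        p[other] -= a * p[other]               # p_other  <- (1 - a) p_other
    else:                                      # unfavourable response: penalize
        p[chosen] -= b * p[chosen]             # p_chosen <- (1 - b) p_chosen
        p[other] += b * (1.0 - p[other])       # p_other  <- p_other + b(1 - p_other)
    return p

p = [0.5, 0.5]
lrp_update(p, chosen=0, beta=0.1)   # small distance: reward action 0
print(p)                            # probabilities still sum to one
```

Both branches preserve the sum of the probabilities, so the vector remains a valid distribution after every update.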

6.3.2.5 Stopping Phase

In the presented algorithm, the sampling, updating and feedback phases are repeated until the average entropy of the probability vectors of the learning automata reaches a predefined value T_min, or until the number of iterations reaches a predefined maximum K. The information entropy of a learning automaton with r actions can be defined as follows (Mousavian):

$$H = -\sum_{i=1}^{r} p_i \log(p_i) \tag{6.19}$$


where p_i is the probability of choosing the ith action of the learning automaton. The entropy of a learning automaton attains its maximum value of one when all actions are chosen with equal probability, and its minimum value of zero when the action probability vector is a unit vector. After the SLP stops, we have a similarity probability distribution function for each test link as the output of the presented method. Because the similarity metrics are random variables, to determine a link's existence we should sample from its similarity probability distribution function, and based on this sample and a predefined threshold we decide about the link's existence. This threshold is chosen based on empirical results and expert knowledge.
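Eq. (6.19) translates directly into code; the text does not state the logarithm base, so the base-r logarithm below is our assumption to make the maximum over r actions exactly one:

```python
# Information entropy of an LA's action probability vector, Eq. (6.19).
import math

def entropy(p):
    r = len(p)
    # Zero-probability actions contribute nothing to the sum.
    return -sum(pi * math.log(pi, r) for pi in p if pi > 0)

print(entropy([0.5, 0.5]))   # 1.0: maximal uncertainty, automaton undecided
print(entropy([1.0, 0.0]))   # 0.0: unit vector, automaton has converged
```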

6.3.2.6 Changes Phase

This phase of SLP is introduced to adapt the SLP to online stochastic social networks. In online stochastic social networks, links can appear or be removed over time, and the future links must be predicted again. In the presented algorithm, after the stopping phase is finished, we have a link prediction result for the structure of the network up to time T. To add an online link prediction capability to the SLP, assume that at time T + 1 two events are possible: a link is added to the network, or a link is removed from the network. We present two scenarios to handle these events when reconsidering future links:

1. A new link is added to the network: in this case, for the new link l we create a new LA_Links learning automaton, LA_l, to determine the importance of sampling link l in the prediction task. We also reset every LA_t of the LA_Tests set in the neighborhood of link l, up to a given depth d, to the initial configuration. Then we repeat the learning procedure for LA_l and these LA_t in order to reconsider the link prediction result.
2. A link is removed from the network: in this case, for the removed link l we only reset every LA_t of the LA_Tests set in the neighborhood of link l, up to a given depth d, to the initial configuration. Then we repeat the learning procedure for these LA_t in order to reconsider the link prediction result.

This online learning procedure reduces the computational complexity of link prediction by using local learning instead of reconsidering the total network structure.

6.3.3 Experimental Results

In this section, in order to evaluate the performance of the SLP, several computer experiments are conducted on different synthetic stochastic graphs, and the presented method is compared in terms of performance and accuracy. In the rest of this sub-section, we first describe the materials used in our experiments and then present a set of experiments.

6.3.3.1 Experiment Materials

This sub-section describes the methods and materials used in our experiments. We first review the methods used to produce synthetic graphs and then present the evaluation parameters and metrics used in our experiments.

Random Methods to Generate Synthetic Graphs

This sub-section presents the three methods we used to generate synthetic graphs:

1. The Barabasi-Albert model (BA model), a scale-free network with a heavy-tailed degree distribution, with parameters N ∈ {2000, 5000, 10,000} and m_0 = m = 5.
2. The Watts-Strogatz model (WS model), a small-world network which reflects a common property of many real networks, namely a short average path length, with parameters N ∈ {2000, 5000, 10,000}, k = 4 and p = 0.2.
3. The Erdős-Rényi model (ER model) with parameters N ∈ {2000, 5000, 10,000} and p = 0.2.

In the used social networks, each link's weight is a random variable with an exponential distribution whose mean is selected randomly from the set p ∈ {1, 1.5, 2, 2.5}; a generation sketch is given below.
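The following sketch shows how such synthetic test graphs could be generated with networkx (the library choice is ours; the book does not name a tool):

```python
# Generate the three synthetic network families and attach stochastic
# link weights, following the parameters listed above.
import networkx as nx
import random

N = 2000
ba = nx.barabasi_albert_graph(N, m=5)          # scale-free, m0 = m = 5
ws = nx.watts_strogatz_graph(N, k=4, p=0.2)    # small-world
er = nx.erdos_renyi_graph(N, p=0.2)            # random graph

# Each link's weight is exponential with a mean drawn from {1, 1.5, 2, 2.5}.
for g in (ba, ws, er):
    for u, v in g.edges():
        g[u][v]["mean"] = random.choice([1, 1.5, 2, 2.5])

def sample_edge_weight(g, u, v):
    """One draw of the stochastic weight of link (u, v)."""
    return random.expovariate(1.0 / g[u][v]["mean"])
```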

Evaluation Metrics

Here we briefly review four common distance measures that estimate the distance between two distributions:

1. Kolmogorov-Smirnov distance statistic (Jalali et al. 2016b): estimates the distance between two cumulative distribution functions (CDFs); it lies in the interval [0, 1], and values closer to 0 mean more similarity and lower difference. This metric is defined as:

$$KS(P, Q) = \max_x |P(x) - Q(x)| \tag{6.20}$$

where P and Q are the two CDFs of the original and estimated data, respectively, and x ranges over the values of the random variable.

2. Skew divergence (Jalali et al. 2016b): estimates the distance between two probability distribution functions (PDFs) and is defined as:

$$SD(P, Q, \alpha) = D[\alpha P + (1 - \alpha) Q \,\|\, \alpha Q + (1 - \alpha) P] \tag{6.21}$$

where D is the Kullback-Leibler (KL) divergence, which measures the similarity between two PDFs P and Q that do not have continuous support over the full range of values, and α = 0.99. The KL divergence is defined as:

$$KL(P(x) \| Q(x)) = \sum_x p(x) \log \frac{P(x)}{Q(x)} \tag{6.22}$$

3. Pearson's correlation coefficient (Luo et al. 2015): measures the similarity between the estimated parameter values and the original parameter values; it lies in the interval [0, 1], and values closer to 1 mean more similarity. This distance is defined as:

$$PCC(P, Q) = \frac{n \sum_i p_i q_i - \sum_i p_i \sum_i q_i}{\sqrt{n \sum_i p_i^2 - \left(\sum_i p_i\right)^2} \, \sqrt{n \sum_i q_i^2 - \left(\sum_i q_i\right)^2}} \tag{6.23}$$

where p_i and q_i are the values of the original parameter P and the estimated parameter Q, and n is the number of parameters.

4. Normalized L1 distance (Jalali et al. 2016a): measures the distance between two positive n-dimensional real vectors P (original distribution) and Q (estimated distribution) and is defined as:

$$L_1(P, Q) = \frac{1}{n} \sum_i \frac{|p_i - q_i|}{p_i} \tag{6.24}$$
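The four distances translate into short numpy routines; the sketch below (ours) assumes P and Q are discretized distributions on a common grid:

```python
# Distance measures of Eqs. (6.20)-(6.24) for discretized distributions.
import numpy as np

def ks_distance(P, Q):
    """Eq. (6.20): maximum gap between the two empirical CDFs."""
    return float(np.max(np.abs(np.cumsum(P) - np.cumsum(Q))))

def kl(P, Q, eps=1e-12):
    """Eq. (6.22): Kullback-Leibler divergence (epsilon guards zero bins)."""
    P, Q = P + eps, Q + eps
    return float(np.sum(P * np.log(P / Q)))

def skew_divergence(P, Q, alpha=0.99):
    """Eq. (6.21): KL between alpha-smoothed mixtures of P and Q."""
    return kl(alpha * P + (1 - alpha) * Q, alpha * Q + (1 - alpha) * P)

def pcc(P, Q):
    """Eq. (6.23): Pearson's correlation coefficient of the two vectors."""
    return float(np.corrcoef(P, Q)[0, 1])

def normalized_l1(P, Q):
    """Eq. (6.24): mean relative absolute deviation."""
    return float(np.mean(np.abs(P - Q) / P))

P = np.array([0.2, 0.3, 0.5])
Q = np.array([0.25, 0.35, 0.40])
print(ks_distance(P, Q), skew_divergence(P, Q), pcc(P, Q), normalized_l1(P, Q))
```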

6.3.3.2 Experiment Settings

For the stopping criteria of the presented method we used the following parameters: T_min = 0.05 and K = n × 50. For the learning automata parameters we used a = 0.05 and b = 0.003. It should be noted that these parameters were obtained from empirical experiments. Also, all the results reported in the following experiments are averages taken over 30 runs.

6.3.3.3 Experiments

In this section a set of experiments is conducted as follows. In the first experiment, we evaluate the performance of the presented method in terms of the stochastic evaluation metrics. In the second experiment, we compare the final result of the presented link prediction method with some recent link prediction methods. In the third experiment, we report the final probability distributions of some random nodes in an instance run to give an idea of the output of the SLP. In the fourth experiment, we analyze the number of samples taken by the presented LAs to reach a certain accuracy, and the corresponding execution time. In the fifth experiment, we test the presented SLP using three different learning parameter settings to obtain the best learning parameters. Finally, in the sixth experiment, we compare the presented SLP under different distance metrics to show why we chose the skew divergence distance metric.


to get an idea about the output of the SLP. In the forth experiment we analyze the number of samples that is taken by the presented LA to reach a certain accuracy and its execution time. In the fifth experiment we test the presented SLP using three different learning parameters to obtain the best learning parameter. Finally in the sixth experiment we compare the presented SLP with different distance metrics to show the reason we chose skew divergence distance metric.

Experiment I In this experiment, we try to compare the presented method to SSM (standard sampling method) in term of stochastic evaluation metric. The SSM in stochastic social networks operates simply as follow: in each iteration each random variable in the network is sampled and the distribution of the interested metric is calculated based on sampled values. So the SSM based link prediction based on some chosen similarity metric operates as follow: in each iteration, the algorithm samples each stochastic link and then it calculates the similarity metric for each test link and updates the distributions of the similarity metrics of the test links based on sampled links. Finally, it outputs the distribution of similarity metrics as the output of link prediction. For this experimentation, different synthetic stochastic graphs (BA, WS, and ER) with size from 1000 to 10,000 are used and the results of this experimentation for each type of network are averages taken over different used sizes. To do this we consider the different distribution distance metrics: Kolmogorov-Smirnov (KS) distance, skew divergence (SD), normalized L distance (ND) and Pearson’s correlation coefficient (PCC) for estimated similarity metrics including: Stochastic Common Neighborhood (SCN), Stochastic Salton (SS), Stochastic Jaccard Index (SJ), Stochastic Preferential Attachment (SPA), Stochastic Adamic-Adar Index (SAA), Resource Allocation Index (SRA), Stochastic Katz Index (SKatz), and Stochastic LP Index (SLP) to compare the presented algorithm with the SSM. The average distance of different similarity metrics is given in Figs. 6.1, 6.2 and 6.3 for different synthetic stochastic graphs. From Figs. 6.1, 6.2 and 6.3, the presented method (SLP) outperforms SSM for all type of datasets, and for all of the presented similarity metrics. Based on the learning ability of our algorithm to sample important regions, it is not surprising that the presented algorithm does better in comparison to the blind SSM algorithm.

Experiment II This experiment tries to compare the final non-stochastic link prediction result of the presented method with some classical and recent link prediction methods. In order to improve the comparison, a set of algorithms in two categories is chosen: (1) Common Similarity-Based Link Predictions: in this group of methods, the following common similarity-based link prediction methods are chosen for comparison with the presented method: CN, Salton, Jaccard, PA, AA as local sim-

6.3 Link Prediction in Stochastic Social Networks 0.05 SSM SLP

0.4 0.3 0.2 0.1 0

SCN

SS

SJ

SPA

SAA

Average SD Distance

Average KS Distance

0.5

183

0.03 0.02 0.01 0

SRA SKATZ SLP

Average KS Distance on Different Similarity Metrics

SS

SJ

SPA

SAA

SRA SKATZ SLP

1 SSM SLP

0.4 0.3 0.2 0.1

SCN

SS

SJ

SPA

SAA

Average PCC Distance

Average ND Distance

SCN

Average SD Distance on Different Similarity Metrics

0.5

0

SSM SLP

0.04

0.6 0.4 0.2 0

SRA SKATZ SLP

SSM SLP

0.8

Average ND Distance on Different Similarity Metrics

SCN

SS

SJ

SPA

SAA

SRA SKATZ SLP

Average PCC Distance on Different Similarity Metrics

Fig. 6.1 Comparing average distance metrics of different similarity metrics for synthetic BA graph 0.05 SSM SLP

0.4 0.3 0.2 0.1 0

SCN

SS

SJ

SPA

SAA

Average SD Distance

Average KS Distance

0.5

0.03 0.02 0.01 0

SRA SKATZ SLP

Average KS Distance on Different Similarity Metrics

SS

SJ

SPA

SAA

SRA SKATZ SLP

1 SSM SLP

0.4 0.3 0.2 0.1

SCN

SS

SJ

SPA

SAA

SRA SKATZ SLP

Average ND Distance on Different Similarity Metrics

Average PCC Distance

Average ND Distance

SCN

Average SD Distance on Different Similarity Metrics

0.5

0

SSM SLP

0.04

SSM SLP

0.8 0.6 0.4 0.2 0

SCN

SS

SJ

SPA

SAA

SRA SKATZ SLP

Average PCC Distance on Different Similarity Metrics

Fig. 6.2 Comparing average distance metrics of different similarity metrics for synthetic WS graphs

ilarity metrics, Katz as the global similarity metric and LP as the quasi-local similarity metric (Liben-Nowell and Kleinberg 2007). (2) Supervised Link Predictions: in this group of algorithms, three recent link prediction algorithms are chosen, Interaction Prediction (IP) (Rossetti et al. 2015), CMA-ES (Bliss et al. 2014) and MI-LP (Tan et al. 2014), that try to predict future links. To evaluate the presented method using other link prediction methods, we use the two common evaluation metrics as follows:

Fig. 6.3 Comparing average distance metrics of different similarity metrics for synthetic ER graphs

(1) AUC metric (Liben-Nowell and Kleinberg 2007): if we rank all of the nonexistent links based on their scores, the AUC metric can be interpreted as the probability that a random missing link has a higher score than a random nonexistent link. In the algorithmic implementation, at each step we randomly pick a missing link and a nonexistent link and compare their scores. If, out of n independent comparisons, there are n′ times that the missing link has a higher score and n″ times that the two have the same score, the AUC value is:

$$AUC = \frac{n' + 0.5\, n''}{n} \tag{6.25}$$

If the AUC value is greater than 0.5, the method is better than the random link prediction algorithm; the farther the value from 0.5, the more accurate the algorithm.

(2) Precision (Liben-Nowell and Kleinberg 2007): if we predict L links to be connected and L_r of those L links are right, the precision is defined as:

$$Precision = \frac{L_r}{L} \tag{6.26}$$
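Both evaluation metrics are easy to express in code; the following sketch (ours, with assumed inputs) implements the sampling estimator of Eq. (6.25) and the precision of Eq. (6.26):

```python
# AUC by repeated random comparisons, and precision over a prediction set.
import random

def auc(score, missing_links, nonexistent_links, n=10_000):
    higher = same = 0
    for _ in range(n):
        m = random.choice(missing_links)       # a true future link
        e = random.choice(nonexistent_links)   # a link that never forms
        if score(*m) > score(*e):
            higher += 1
        elif score(*m) == score(*e):
            same += 1
    return (higher + 0.5 * same) / n           # Eq. (6.25)

def precision(predicted, true_links):
    right = sum(1 for link in predicted if link in true_links)  # L_r
    return right / len(predicted)                               # Eq. (6.26)
```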

Higher precision clearly means higher prediction accuracy. It should be noted that the parameters of the compared algorithms are borrowed from their references. Tables 6.1 and 6.2 present the average AUC and precision scores, respectively, based on 30 random runs of the SLP.

Table 6.1 AUC measure of the presented SLP and other link prediction methods

Method/Graph   BA-Graph   WS-Graph   ER-Graph   Average
CN             0.7154     0.7541     0.7421     0.7372
Salton         0.6942     0.7055     0.6843     0.6946
Jaccard        0.7328     0.7191     0.7015     0.7178
PA             0.7645     0.7895     0.7684     0.7741
AA             0.7599     0.7611     0.7641     0.7617
Katz           0.8344     0.8457     0.8462     0.8421
LP             0.8001     0.8199     0.8278     0.8159
IP             0.8575     0.8697     0.8799     0.8690
CMA-ES         0.8547     0.8676     0.8723     0.8648
MI-LP          0.8975     0.8978     0.8976     0.8976
SLP            0.9351     0.9145     0.9295     0.9263

Table 6.2 Precision measure of the presented SLP and other link prediction methods

Method/Graph   BA-Graph   WS-Graph   ER-Graph   Average
CN             0.5472     0.5571     0.5498     0.5513
Salton         0.5972     0.5847     0.5801     0.5873
Jaccard        0.6174     0.6282     0.6201     0.6219
PA             0.6281     0.6354     0.6274     0.6303
AA             0.6784     0.6654     0.6579     0.6672
Katz           0.7185     0.7249     0.7346     0.7260
LP             0.6978     0.6900     0.6907     0.6928
IP             0.7549     0.7582     0.7541     0.7557
CMA-ES         0.7786     0.7719     0.7694     0.7733
MI-LP          0.7958     0.7964     0.7913     0.7945
SLP            0.8271     0.8245     0.8200     0.8238

Tables 6.1 and 6.2 demonstrate that the presented SLP achieves AUC (0.9263) and precision (0.8238) measures which are significantly better than the local similarity-based algorithms: CN (AUC = 0.7372, precision = 0.5513), Salton (AUC = 0.6946, precision = 0.5873), Jaccard (AUC = 0.7178, precision = 0.6219), PA (AUC = 0.7741, precision = 0.6303) and AA (AUC = 0.7617, precision = 0.6672). Given that the presented stochastic link prediction uses the stochastic information of the social network to predict future links, it is not surprising that the SLP performs better than the local similarity metrics. The presented method is also superior to the Katz (AUC = 0.8421, precision = 0.7260) and LP (AUC = 0.8159, precision = 0.6928) link predictions. In addition, from Tables 6.1 and 6.2 it can be seen that the SLP achieves an AUC and precision much better than those of IP (AUC = 0.8690, precision = 0.7557), CMA-ES (AUC = 0.8648, precision = 0.7733) and MI-LP (AUC = 0.8976, precision = 0.7945).

Fig. 6.4 Different similarity metrics probability distributions of the random test node 1 in an instance run

Experiment III

In this experiment, we choose three random test nodes in BA graphs and report the final probability distribution of each similarity metric of the corresponding node in an instance run. These reports are presented in Figs. 6.4, 6.5 and 6.6. Because the links, and hence the similarity metrics, in stochastic social networks are random variables, and the presented link prediction tries to estimate the true distribution of the similarity metric for each test link, we report the final similarity metric distributions for three instance nodes. The goal of these figures is to visualize the output of the presented method as a stochastic method. As mentioned before, at the end of SLP we have a similarity probability distribution function for each test link; because the similarity metrics are random variables, to determine a link's existence we should sample from its similarity probability distribution function, and based on this sample and a predefined threshold we decide about the link's existence. We also present the fitted distribution for each link using the non-parametric kernel method: Figs. 6.7, 6.8 and 6.9 show the histogram of each random link with a distribution fitted using the non-parametric kernel method. This experiment is conducted to give an idea of the final output of the SLP.

Experiment IV

In this experiment we compare the number of samples taken by the LAs with the number of samples taken by the SSM to reach a certain accuracy. To do this, we consider three accuracy values, {0.65, 0.80, 0.95}, and run the SLP and SSM on the different synthetic networks (BA, WS, and ER) with different sizes.

Fig. 6.5 Different similarity metrics probability distributions of the random test node 2 in an instance run

Fig. 6.6 Different similarity metrics probability distributions of the random test node 3 in an instance run

Table 6.3 reports the average number of samples required by the SLP and the SSM for each synthetic graph, obtained over 30 instance runs (10 instance runs for each size in {2000, 5000, 10,000}). From the results it can be seen that the SLP decreases the number of required samples by about 50%, which shows that the presented SLP is clearly better than the SSM method. Also, to evaluate the performance of the LA_Tests set, we compare the execution time of the presented SLP with the SSM and with SLP-Without-LA_Tests (an SLP variant that has no LA_Tests set and therefore learns nothing about whether the similarity metric probability distributions should be recalculated and updated in each iteration). Table 6.4 reports the average execution time over 30 instance runs for different sizes of the synthetic graphs at accuracy level 0.95.

Fig. 6.7 Different similarity metrics histograms with a fitted distribution using the kernel method for node 1

Fig. 6.8 Different similarity metrics histograms with a fitted distribution using the kernel method for node 2

From the results reported there, we see that using LA_Tests substantially decreases the computational complexity of calculating the similarity metric probability distributions.

Experiment V

In this experiment, we evaluate the convergence behavior of the SLP using the information entropy and three different LA configurations.


Fig. 6.9 Different similarity metrics histograms with a fitted distribution using the kernel method for node 3

Table 6.3 Comparison of the number of samples required by SLP and SSM based on different synthetic graphs and different accuracy levels

Method/Graph   SLP (acc. 0.65)       SSM (acc. 0.65)      SLP (acc. 0.80)       SSM (acc. 0.80)      SLP (acc. 0.95)       SSM (acc. 0.95)
BA-Graph       7.84×10⁴ ± 9.7×10²    19.25×10⁴ ± 13×10²   12.87×10⁴ ± 11×10²    24.12×10⁴ ± 45×10²   16.93×10⁴ ± 15×10²    35×10⁴ ± 61×10²
WS-Graph       7.12×10⁴ ± 8.8×10²    21.41×10⁴ ± 45×10²   12.14×10⁴ ± 10×10²    26.11×10⁴ ± 49×10²   15.55×10⁴ ± 14×10²    39×10⁴ ± 58×10²
ER-Graph       7.99×10⁴ ± 6.9×10²    20.89×10⁴ ± 24×10²   13.02×10⁴ ± 9.1×10²   24.82×10⁴ ± 52×10²   16.12×10⁴ ± 16×10²    38×10⁴ ± 63×10²

To do this, we use the three following configurations and calculate the average information entropy of the probability vectors of the LAs: (a = 0.05, b = 0.005), (a = 0.05, b = 0.003) and (a = 0.01, b = 0.001). Fig. 6.10 plots the average information entropy taken over the LAs versus the iteration number for the three kinds of synthetic stochastic graphs (BA-2000, BA-5000, BA-10000; WS-2000, WS-5000, WS-10000; ER-2000, ER-5000, ER-10000). These figures show that the average entropy of the SLP generally decreases as the algorithm proceeds. Based on these figures we can also conclude that the configuration a = 0.05, b = 0.003 gives the best convergence results; for this reason we use this configuration in the other experiments, as mentioned before.

Table 6.4 Average execution time (in minutes) of the SLP, SLP-Without-LA_Tests, and the SSM based on different synthetic graphs

Method/Graph   SLP            SLP-Without-LA_Tests   SSM
BA-2000        10.52 ± 2.10   20.45 ± 3.40           45.10 ± 5.51
BA-5000        14.13 ± 2.33   26.22 ± 3.23           60.47 ± 6.47
BA-10000       18.39 ± 3.45   33.29 ± 4.20           91.42 ± 9.30
WS-2000        11.41 ± 1.58   21.21 ± 3.10           42.17 ± 4.26
WS-5000        15.46 ± 2.58   27.01 ± 3.40           62.13 ± 6.59
WS-10000       20.00 ± 3.51   35.18 ± 4.37           93.39 ± 9.51
ER-2000        10.35 ± 2.23   20.48 ± 3.22           44.24 ± 5.10
ER-5000        14.40 ± 3.01   27.32 ± 3.58           34.11 ± 7.15
ER-10000       21.12 ± 3.40   34.50 ± 4.51           95.02 ± 10.03

Fig. 6.10 The convergence diagram of the presented SLP using different LA configurations

Experiment VI

In this experiment, we evaluate the performance of the presented SLP using different distance metrics. To do this, we use the following distance metrics and calculate the running time and the AUC metric of the presented SLP: Kolmogorov-Smirnov distance, skew divergence, Pearson's correlation coefficient and normalized L1 distance. Table 6.5 reports the AUC metric and the running time of the presented SLP for the three synthetic stochastic graphs BA-5000, WS-5000 and ER-5000. Table 6.5 shows that the SLP with skew divergence has the best performance in terms

Table 6.5 AUC and running time (in minutes) of the SLP using different distance metrics

Metric/Graph                        BA-5000            WS-5000            ER-5000
                                    Time     AUC       Time     AUC       Time     AUC
Kolmogorov-Smirnov                  14.23    0.9015    15.51    0.8774    15.16    0.9001
Skew divergence                     13.59    0.9481    14.47    0.9249    13.51    0.9399
Pearson's correlation coefficient   17.51    0.9359    18.12    0.9314    17.52    0.9263
Normalized L1                       10.51    0.8723    11.27    0.8641    11.02    0.8794

of accuracy and running time; for this reason we use this metric in the other experiments, as mentioned before.

6.3.4 Discussion

Because people's activities in social networks are dynamic and uncertain, and the structure of the networks changes over time, deterministic graphs may not be appropriate for modeling and analyzing social networks. One solution is to use stochastic social networks as a model to capture the dynamics and uncertainty of social networks. This sub-section presented a new similarity-based link prediction method for stochastic social networks to overcome the problem of traditional link prediction, which uses only a static snapshot of the graph to predict future links. The presented method takes advantage of learning automata to estimate the distribution of a chosen similarity metric in such a way that the number of required samples and the computational complexity are decreased as much as possible. Also, the presented method can be used in online stochastic social networks, where the social network changes online and the future links must be predicted again. We considered different synthetic stochastic social networks as the test benchmark and conducted several experiments. The experiments showed that the presented stochastic link prediction improves the link prediction accuracy and the computational complexity in stochastic networks.


6.4 Link Prediction in Weighted Social Networks

Weighted networks are a kind of social network in which each link has a weight that indicates the strength of the corresponding link (Liben-Nowell and Kleinberg 2007). Link prediction in such networks requires adapting the current methods, for example adapting similarity metric based link prediction to take the weights in the network into account. In this area, some studies show that strong links are important in link prediction (Murata and Moriyasu 2007). On the other hand, other studies show that weak links are important (Lu and Zhou 2009). So, in this sub-section, we try to estimate the weight of each test link directly from the link weight information in the network using learning automata. In the presented method there exists one learning automaton for each link that must be predicted, and each LA tries to learn the true weight of the corresponding link according to the current network's link weight information. We partition the network links into two sets: the training set, which we use for training the LAs, and the test set, which must be predicted. In each iteration of the presented algorithm, each LA chooses a weight as its action. After the actions are chosen, we have a weighted network over the test links. Now, we define some metrics to calculate the weights of the training set using these new weights. After calculating the weights of the training set from the weights of the test set, we generate a reinforcement signal for each LA based on its influence on the true weight estimation of the training set, and each LA updates its action probability distribution according to its reinforcement signal. After estimating the weights of the test links, we sort them by their weights and predict the existence or absence of each link based on its weight. Our experiments demonstrate that the presented link prediction method outperforms other link prediction methods. To reach the goal of this sub-section, we first briefly review the related work on weighted networks. Then we present the weighted similarity metrics used for calculating the weight of a link from the weight information in the network. After that, we present the algorithm and procedures for the weighted link prediction problem based on learning automata.

6.4.1 Review of Link Prediction in Weighted Networks

In this sub-section, we review related link prediction methods in weighted social networks. In De Sá and Prudêncio (2011) a supervised machine learning strategy for link prediction in weighted networks is presented. The method uses link weights that express the "strength" of relationships. Here, the results of supervised prediction on a co-authorship network were satisfactory when weights were taken into account. Reference Murata and Moriyasu (2007) indicates that link prediction based on graph proximity measures fits open and dynamic online social networks, and proposes new weighted graph proximity measures for link prediction in social networks.

The method relies on the assumption that proximities between nodes are better estimated by using both graph proximity measures and the weights of existing links in a social network. By taking the weights of links into consideration, link prediction performance is improved over the previous proximity measures. Wind and Morup (2012) studied the effect of using weight information when recovering missing edges in a network, following the framework of Wang et al. (2007). The researchers observed that the application of a Poisson-based model to a binary network does not hamper the structure modeling. Using Poisson-based models for weighted networks, and for binary versions of the same networks, they observed that weight information did not improve link prediction. They further witnessed that complex and flexible models in general performed better than simpler models regardless of the available information (i.e., the fraction of edges treated as missing). When predicting the weights of the missing edges, the researchers in the said study saw that the complex model overfits to the edges, resulting in a poor recovery of the true edge weights. There are also some relevant works about prediction in weighted social networks. In Dong et al. (2013) the dynamic properties of mobile calling patterns and some social characteristics are studied based on a large mobile call duration network where the weights of the links are call durations. They found that stronger ties have lower call duration; that the average call duration gets shorter when the endpoints of a call have more common neighbors; and that opinion leaders have shorter call durations, while social balance tends to shorten call duration. Based on these facts they proposed a probabilistic model to predict the call duration, and they compared their method with some baseline methods. In Gupte and Eliassi-Rad (2012) a method to infer the tie strength between users using a bipartite event-people network is proposed. They modeled the characterizations of functions that could serve as a measure of tie strength. They showed that for applications where the ranking of the tie strength is important, the axioms are equivalent to a natural partial order, and argued that to settle on a particular measure, a non-obvious decision about extending this partial order to a total order is needed, which is best left to the particular application. They evaluated the method and showed the coverage of the axioms through the use of Kendall's Tau correlation. In Xiang et al. (2009) an unsupervised model to estimate link strength from interaction activity and user similarity is proposed. This method is based on a link-based latent variable model, along with a coordinate ascent optimization procedure for inference. The authors used Facebook and LinkedIn datasets to evaluate their method and showed that it improves autocorrelation and classification performance. The result of this method is a set of link strengths, and thus a weighted social network that can be used in link prediction tasks. In Lu and Zhou (2009), the authors applied three local similarity indices, the Common Neighbor, Adamic-Adar and Resource Allocation indices, with the consideration of weights. They were surprised to see that the precision of the weighted indices turned out even worse than that of their corresponding unweighted versions.
These unexpected observations reminded them of the Weak Ties Theory, which claims that links with small weights nevertheless play an important role in social networks. Their extensive experimental study showed that weak ties play a significant role in the link prediction problem, and that emphasizing the contribution of weak ties can improve prediction accuracy to a high degree.

Finally, Pech et al. (2017) introduced a new link prediction method called Low Rank (LR) using robust principal component analysis. In their method, the adjacency matrix of the target network is decomposed into a low-rank matrix, which can be regarded as the backbone of the network containing the true links, and a sparse matrix consisting of the corrupted or spurious links in the network. Link prediction can thus be regarded as a matrix completion problem over the corrupted or incomplete adjacency matrix. By solving the optimization problem, they obtained the low-rank matrix, which then plays the role of a score matrix indicating the possible connectivity between each pair of vertices. To show that their method can also deal with weighted networks, they compared it with other weight-based algorithms and showed the superiority of the proposed method.

6.4.2 Weighted Similarity Metrics

In the present sub-section, we review the weighted similarity metrics proposed by Lu and Zhou (2009) to estimate the weight of an edge based on the weight information in the network. These metrics inherit from popular similarity scores such as Common Neighborhood (CN), Jaccard Index (JC), Preferential Attachment (PA) and Adamic-Adar (AA). To this end, let $\Gamma(x)$ be the set of neighbors of node $x$ in the social network, $|\Gamma(x)|$ be the degree (number of neighbors) of node $x$, and $w(x, y)$ be the link weight between nodes $x$ and $y$. It should also be noticed that we consider undirected graphs without self-connections, so $w(x, y) = w(y, x)$. In the following we review the weighted similarity metrics:

1. Weighted Common Neighbors (WCN): the measure is defined as

$$WCN(x, y) = \frac{\sum_{z \in \Gamma(x) \cap \Gamma(y)} w(x, z) + w(y, z)}{|\Gamma(x) \cap \Gamma(y)|} \tag{6.27}$$

2. Weighted Jaccard's Coefficient (WJC): to calculate the weight from this similarity metric, the Jaccard Coefficient can be extended as

$$WJC(x, y) = \frac{\sum_{z \in \Gamma(x) \cap \Gamma(y)} w(x, z) + w(y, z)}{\sum_{z' \in \Gamma(x)} w(x, z') + \sum_{z' \in \Gamma(y)} w(y, z')} \tag{6.28}$$

3. Weighted Preferential Attachment (WPA): for weighted networks, the PA measure can be extended as

$$WPA(x, y) = \sum_{a \in \Gamma(x)} w(a, x) \times \sum_{b \in \Gamma(y)} w(b, y) \tag{6.29}$$

4. Weighted Adamic-Adar Coefficient (WAA): the Adamic-Adar measure is extended for weighted networks as

$$WAA(x, y) = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{w(x, z) + w(y, z)}{\log\left(1 + \sum_{c \in \Gamma(z)} w(z, c)\right)} \tag{6.30}$$
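As a concrete illustration of Eqs. (6.27)-(6.30), the sketch below computes the four metrics for a weighted network stored as a dict-of-dicts `w`, where `w[x][y]` holds the weight of the undirected edge (x, y); this representation and the helper names are our own, not part of the original formulation.

```python
# Minimal sketch of the four weighted similarity metrics (Eqs. 6.27-6.30),
# assuming an undirected, symmetric dict-of-dicts weight map w[x][y].
import math

def neighbors(w, x):
    return set(w.get(x, {}))

def wcn(w, x, y):
    # Weighted Common Neighbors, Eq. (6.27)
    common = neighbors(w, x) & neighbors(w, y)
    if not common:
        return 0.0
    return sum(w[x][z] + w[y][z] for z in common) / len(common)

def wjc(w, x, y):
    # Weighted Jaccard's Coefficient, Eq. (6.28)
    common = neighbors(w, x) & neighbors(w, y)
    denom = sum(w[x].values()) + sum(w[y].values())
    return sum(w[x][z] + w[y][z] for z in common) / denom if denom else 0.0

def wpa(w, x, y):
    # Weighted Preferential Attachment, Eq. (6.29)
    return sum(w[x].values()) * sum(w[y].values())

def waa(w, x, y):
    # Weighted Adamic-Adar, Eq. (6.30)
    common = neighbors(w, x) & neighbors(w, y)
    return sum((w[x][z] + w[y][z]) / math.log(1.0 + sum(w[z].values()))
               for z in common)
```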

6.4.3 The Weighted Link Prediction

This section describes the weighted link prediction method CALA-WLP, which is based on learning automata (Moradabadi and Meybodi 2018a). In the presented method, there is one CALA for each link that must be predicted. Each CALA attempts to learn the true weight of the corresponding link according to the link weight information in the current network. To do so, we partition the network links into two sets: the training set, which we use for training the CALAs, and the test set, which has to be predicted.

In each iteration of the presented algorithm, each CALA chooses a weight as its action. After all the CALAs have chosen their actions, we have a weighted network over the test links. We then use these weights to calculate the weights of k percent of the links of the training set using one of the scores introduced in the previous section. In the presented algorithm, the k percent of training links are chosen in random order. After calculating the weights of the training set, we generate a reinforcement signal for each CALA based on its influence on the true weight estimation of the training set. Each CALA then updates its action probability distribution according to its reinforcement signal. This procedure is repeated until the action of each CALA converges to some value, which is used as the weight of the corresponding test link. Finally, we sort the test links based on their respective weights and predict the existence or non-existence of each test link based on its weight.

The obtained weights of the test links really measure the influence of each test link on the true reconstruction of the original weighted network. Contrary to other weighted link prediction methods, some of which show that the links with higher weights are important in the final prediction and some of which show that the links with lower weights are important, the method presented here is independent of whether higher or lower weights in the network are important, because it tries to estimate the influence of each test link on reconstructing the original weighted social network. Our preliminary results from link prediction on some co-authorship and email networks proved satisfactory when weights were considered. The experiments demonstrate that the performance of link prediction in a weighted social network is better than that in a social network without weights. In the rest of this section, we first describe the main procedure of the presented algorithm and then provide an example of the presented method.

6.4.3.1 Main Procedure

As said before, for each test link in the presented algorithm there is one CALA, and each $CALA_j$ tries to find the optimal action through a normal distribution $N_j(\mu, \sigma)$. In each iteration $k$ of the presented algorithm, in the action selection step, each $CALA_j$ chooses its action based on its normal distribution $N_j(\mu_k, \sigma_k)$. The action $\alpha_j$ is used as the weight of the corresponding link in the network. After the action selection phase, the actions are evaluated and each CALA updates its probability distribution according to the reinforcement signals. To generate the reinforcement signal for $CALA_j$ in iteration $k$, we calculate two reinforcement signals: one for the chosen action $\alpha_j(k)$, denoted $\beta_{\alpha_j}(k)$, and the other for the mean parameter $\mu_j(k)$, denoted $\beta_{\mu_j}(k)$. To do so, we have a weighted network that is created according to the chosen actions of the CALAs, as well as a training set of real links with known weight values. We use the weight information of the test set to calculate the weights of k percent of the real links in the training set using one of the scores introduced in the previous sections, such as WCN, WJC, or WPA. The percentage of training links used for the weight estimation is considered as a parameter and is studied in the experiment section. The final reinforcement signals for each $CALA_j$ are calculated according to the following equations:

$$\beta_{\alpha_j}(k) = \sum_{l=0}^{i} \left| \hat{w}_l\big(\alpha_j(k), \alpha_{-j}(k)\big) - w_l \right|, \qquad \beta_{\mu_j}(k) = \sum_{l=0}^{i} \left| \hat{w}_l\big(\mu_j(k), \alpha_{-j}(k)\big) - w_l \right| \tag{6.31}$$

where $i$ is the number of real links whose weights we calculate, and $\hat{w}_l(\alpha_j(k), \alpha_{-j}(k))$ is the estimated weight of link $l$ calculated with reference to the chosen action of $CALA_j$ together with the chosen actions of the other CALAs, $\alpha_{-j}(k)$. Similarly, $\hat{w}_l(\mu_j(k), \alpha_{-j}(k))$ is the estimated weight of link $l$ obtained by using the current mean of $CALA_j$, $\mu_j(k)$, together with the chosen actions of the other CALAs. Finally, $w_l$ is the real weight of link $l$. Each $CALA_j$ then updates its $\mu_j(k)$ and $\sigma_j(k)$ based on $\beta_{\alpha_j}(k)$ and $\beta_{\mu_j}(k)$ according to the CALA updating rule. In other words, in each iteration each CALA chooses its action; these actions are evaluated against the training set in terms of reinforcement signals; and the parameters of each CALA are updated by using the reinforcement signals. The procedure is repeated until the value of each CALA converges. After the training phase, the test phase generates the final output prediction of the presented algorithm: we sort the test links based on the estimated weights and use a threshold to classify the test links into two groups, existing links and non-existing links. The diagram of the presented link prediction method is presented in Fig. 6.11.
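A compact sketch of one training iteration may help fix ideas. Everything below is an illustrative reading of the procedure just described, not the authors' code: the update rule is a simplified stand-in for the exact CALA updating rule of Moradabadi and Meybodi (2018a), and `estimate_weight(link, weights)` stands for any of the weighted scores of Sect. 6.4.2 applied to the network augmented with the CALAs' chosen weights.

```python
# Illustrative sketch of one CALA-WLP training iteration (simplified;
# not the authors' exact update rule).
import random
import numpy as np

def train_step(calas, train_links, estimate_weight,
               k_frac=0.7, lam=0.005, sigma_L=1e-3):
    """calas: {test_link: (mu, sigma)}; train_links: [(link, true_weight)]."""
    # 1) Action selection: each CALA samples a candidate weight for its link.
    actions = {link: max(0.0, np.random.normal(mu, sigma))
               for link, (mu, sigma) in calas.items()}

    # 2) Evaluate on a randomly ordered k-percent subset of the training links.
    sample = random.sample(train_links, int(k_frac * len(train_links)))

    for link, (mu, sigma) in calas.items():
        # Reconstruction error when this CALA plays its sampled action versus
        # when it plays its current mean (cf. Eq. (6.31)).
        err_action = sum(abs(estimate_weight(l, actions) - w_true)
                         for l, w_true in sample)
        mean_actions = dict(actions)
        mean_actions[link] = mu
        err_mean = sum(abs(estimate_weight(l, mean_actions) - w_true)
                       for l, w_true in sample)
        beta_action, beta_mean = -err_action, -err_mean  # lower error is better

        # 3) Simplified CALA update: move the mean toward better-rewarded
        # actions; sigma is kept above the lower bound sigma_L.
        d = (actions[link] - mu) / max(sigma, sigma_L)
        mu = mu + lam * (beta_action - beta_mean) * d
        sigma = max(sigma_L,
                    sigma + lam * (beta_action - beta_mean) * (d * d - 1.0))
        calas[link] = (mu, sigma)
```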


Fig. 6.11 Diagram of the presented link prediction method

6.4.3.2 Example

In this sub-section we give an example of the presented link prediction on a weighted network. Consider Fig. 6.12: sub-figure (a) shows a weighted network in which each link has a weight value. Sub-figure (b) shows the possible test links in the network and their CALA parameters; as mentioned in the previous section, for each test link we have a CALA with a normal distribution as its parameter. Each CALA then chooses an action based on its normal distribution; the chosen actions of the CALAs are shown in sub-figure (c). We now have a new weighted network, which we use to estimate the weights of the original links based on some weight score such as WCN, WJC, or WPA. Sub-figure (d) shows the weights of the original links calculated based on such a similarity metric. Sub-figure (e) shows that each CALA updates its normal distribution based on its influence on the true weight estimation of the original network. Each CALA then chooses an action again, as sub-figure (c) illustrates, and the procedure is repeated until the action of each CALA converges to some value, as shown in sub-figure (f). We then sort the test links based on their weights and predict the existence or non-existence of each test link based on its weight, as shown in sub-figure (g).

Fig. 6.12 An example of the weighted link prediction method

6.4.4 Experiment Results

Here, some computer experiments have been conducted to test the presented algorithm in terms of performance and accuracy. In these experiments, we use the quality of solutions as well as the convergence rate of the presented algorithm as criteria for performance. In the rest of this section, we first present the data sets used in our experiments and then give a set of experiments.

6.4.4.1 Data Set

In this sub-section, we describe the social network data used in our experiments. For the experiments developed in this work, we consider the following two groups of networks:

(1) Co-authorship networks: a type of social network in which the nodes represent authors and two authors are connected if they have collaborated on a paper. Collaboration networks are often used to understand the topology and dynamics of complex networks. In this study, we have adopted three co-authorship networks from three sections of Arxiv, extracting data from the years 1993 to 2003 for all the data sets. The first network is composed of authors who collaborated in theoretical high-energy physics (hep-th). The second network is formed by authors who published papers in high-energy physics phenomenology (hep-ph), and the third is sampled from collaborations in astrophysics (astro-ph). In these data sets, if author i co-authored a paper with author j, the graph contains an undirected edge from i to j. If a paper is co-authored by k authors, this generates a completely connected (sub)graph on k nodes.

(2) Email communication networks: a type of social network whose nodes are email addresses; if an address i sends at least one email to address j, the graph contains an undirected edge from i to j. In our experiments, we use two email communication data sets: the Enron email communication network and the Eu-All email communication network. The Enron network is a data set of around half a million emails made public by the Federal Energy Regulatory Commission; we extract data from May 1999 through May 2002 (36 months). The Eu-All network was extracted using email data from a large European research institution; we extract data from October 2003 to May 2005 (18 months).

The network specification of each data set is presented in Table 6.6. Since these networks are highly sparse, to make computation feasible we have reduced the number of candidate pairs by choosing only those that have at least two connections in the network. To create a weighted network from these social networks, we use the following strategy: build a weighted version of the network in which each link between a pair of nodes is weighted by the total number of events that occurred between the two corresponding nodes.
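As a small illustration of the weighting strategy just described, the sketch below builds such a weighted network from an event log; the pair-list input format is our own assumption.

```python
# Sketch: derive the weighted network from an event log, weighting each link
# by the number of co-occurrence events between its two endpoints
# (papers co-authored, emails exchanged).
from collections import defaultdict

def build_weighted_network(events):
    """events: iterable of (node_i, node_j) pairs, one per co-authored paper
    or exchanged email; returns an undirected dict-of-dicts weight map."""
    w = defaultdict(lambda: defaultdict(float))
    for i, j in events:
        if i == j:
            continue              # no self-connections
        w[i][j] += 1.0
        w[j][i] += 1.0            # keep the map symmetric
    return w
```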

Table 6.6 Network size in terms of nodes and edges

Dataset       Nodes     Edges     Description
Hep-th        9,877     51,971    Collaboration network of Arxiv High Energy Physics Theory
Hep-ph        12,008    237,010   Collaboration network of Arxiv High Energy Physics
Astro-ph      18,772    396,160   Collaboration network of Arxiv Astro Physics
Email-Enron   36,692    367,662   Email communication network from Enron
Email-EuAll   265,214   420,045   Email network from a EU research institution

6.4.4.2 Experiment I

This experiment evaluates the accuracy of the presented CALA-WLP. For the collaboration networks (Hep-th, Hep-ph and Astro-ph), we consider the data from 1993 to 2002 as the training data (each year as a time period) and the year 2003 as the test data. For the email networks (Enron and EuAll), we consider the first 70% of the available months as the training data (each month as a time period) and the remaining 30% of the months as the test data. For all conducted experiments, the initial parameters of each CALA-WLP $j$, $(\mu_0(j), \sigma_0(j))$, are chosen randomly, the parameter $\lambda$ is set to 0.005, k is set to 70% and $\sigma_L$ is set to $10^{-3}$. We run the presented algorithm with the different similarity metrics (CN, JI, AA, PA) described in the previous sections; we call the presented algorithm CALA-WLP-XX if it uses similarity metric XX in its procedure.

In order to give a thorough comparison, we choose a set of algorithms in three categories:

(1) Similarity-based algorithms: this group contains a set of common similarity metrics such as CN, Jaccard, PA, AA, Katz and LP.

(2) Supervised algorithms: in this group, we choose the Interaction Prediction (IP) (Rossetti et al. 2015), CMA-ES (Bliss et al. 2014), Neighbor Communities (NC) (Xie et al. 2014), and Likelihood based Link Prediction (LLP) (Pan et al. 2016) algorithms. IP predicts future interactions by combining dynamic social network analysis, time series forecasting, feature selection over similarity metrics, and network community structure. CMA-ES uses the Covariance Matrix Adaptation Evolution Strategy to optimize the weights used in a linear combination of sixteen neighborhood and node similarity indices. The LLP method uses a framework in which a network's probability is calculated according to a predefined structural Hamiltonian, and a non-observed link is scored by the conditional probability of adding the link to the observed network. Finally, the NC method proposes a network-structural similarity index and a link prediction method based on neighbor communities, using the probabilities of the possible situations in which the two nodes are linked to the same community.


(3) Weighted link prediction algorithms: in this group, two weighted algorithms, Sup-WLP (De Sá and Prudêncio 2011) and Weak-WLP (Lu and Zhou 2009), together with LR (Pech et al. 2017), are chosen to be compared with the presented algorithm. The parameters of these algorithms are borrowed from their references.

Tables 6.7 and 6.8 present the average AUC and precision scores based on 10-fold cross-validation, respectively. The results demonstrate that the presented CALA-WLP-XX achieves AUC and precision measures significantly better than the corresponding XX method, which does not consider weights; the results are also better than Katz and LP. We can therefore conclude that the presented algorithm outperforms all considered static methods that do not consider weights. In comparison with IP, CMA-ES, LLP, and NC, the presented algorithm again achieves better AUC and precision measures, so CALA-WLP is also superior to recent unweighted link prediction algorithms. Finally, the results show that the presented algorithm is considerably better in AUC and precision than Sup-WLP and Weak-WLP, which use the weight information of the social graph; this indicates that CALA-WLP does not depend on whether strong links or weak links are important. The presented method is also superior to the LR method, because it directly exploits the weight information of the network as an important source of information. Overall, the results of CALA-WLP suggest that prediction is better when weight information is taken into consideration in the prediction task using our policy.
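For reference, the AUC values in Table 6.7 follow the sampling convention that is standard in the link prediction literature: the probability that a randomly chosen test (missing) link is scored higher than a randomly chosen non-existent link, with ties counted as 0.5. A minimal sketch of that routine (our formulation; the book does not spell it out):

```python
# Standard sampled AUC for link prediction: score is any similarity function
# over node pairs; test_links are held-out true links, non_links are pairs
# with no edge in the network.
import random

def auc(score, test_links, non_links, n_trials=10000):
    hits = 0.0
    for _ in range(n_trials):
        s_test = score(random.choice(test_links))
        s_non = score(random.choice(non_links))
        if s_test > s_non:
            hits += 1.0
        elif s_test == s_non:
            hits += 0.5          # ties count as one half
    return hits / n_trials
```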

6.4.4.3 Experiment II

In this experiment, we compare the topological features of the predicted network, obtained from the prediction result, with those of the original social network. The predicted network consists of n randomly chosen predicted edges, where n is the number of edges in the original network, so it has the same number of edges as the original network. The goal of this experiment is therefore to compare topological features of the original network, such as the number of connected components, efficiency, and clustering coefficient, with those of the predicted network. If the values of these features in the original and predicted networks are similar, we can conclude that the presented algorithm works well. For each dataset, we run this experiment on its largest connected component. Table 6.9 summarizes the topological features of the original and predicted networks for the largest component of the used data sets. In Table 6.9, NumC_X is the number of connected components in network X together with the size of the largest one; for example, 1222/2 means that the network has 2 connected components and the largest one contains 1222 nodes. E_X is the efficiency of network X, C_X its clustering coefficient, and K_X its average degree. From the obtained results we can see that the topological features of the original network and the predicted network are approximately similar, which confirms the efficiency of the presented algorithm.


Table 6.7 AUC measures of presented CALA-WLP and other link prediction methods

Method/Dataset   Hep-th   Hep-ph   Astro-ph   Enron    EuAll
CN               0.7945   0.7025   0.6791     0.8123   0.6643
Salton           0.7850   0.6854   0.6441     0.8087   0.6285
Jaccard          0.6438   0.6026   0.5719     0.7010   0.6259
PA               0.6400   0.6101   0.5574     0.6743   0.6049
AA               0.7562   0.7109   0.6840     0.8045   0.6097
Katz             0.8487   0.8611   0.7486     0.8896   0.7008
LP               0.8305   0.8128   0.7115     0.8542   0.6905
IP               0.8525   0.8548   0.7328     0.8836   0.7376
CMA-ES           0.8462   0.8501   0.7241     0.8601   0.7243
LLP              0.8759   0.8549   0.7798     0.8851   0.7721
NC               0.8421   0.8247   0.7415     0.8497   0.7084
LR               0.9145   0.8678   0.7814     0.8927   0.7641
Sup-WLP-CN       0.8502   0.8541   0.7701     0.6317   0.6162
Weak-WLP-CN      0.6723   0.5536   0.6412     0.8212   0.7315
CALA-WLP-CN      0.9215   0.8719   0.7847     0.9023   0.7761
Sup-WLP-JI       0.8461   0.8270   0.7503     0.6032   0.5843
Weak-WLP-JI      0.6723   0.5536   0.6412     0.8212   0.7315
CALA-WLP-JI      0.9042   0.8562   0.7602     0.8839   0.7593
Sup-WLP-PA       0.8637   0.8431   0.7699     0.6310   0.6134
Weak-WLP-PA      0.6661   0.5481   0.6400     0.8021   0.7142
CALA-WLP-PA      0.9301   0.8756   0.7900     0.9139   0.7803
Sup-WLP-AA       0.8315   0.8194   0.7465     0.6023   0.5731
Weak-WLP-AA      0.6502   0.5391   0.6286     0.8016   0.7072
CALA-WLP-AA      0.8992   0.8432   0.7493     0.8497   0.7396

Also, from the results of the previous section and the reported topological features of the original networks, we can conclude that the presented algorithm does better in networks with a higher clustering coefficient and a higher average degree.

6.4.4.4 Experiment III

In this experiment, we study the way in which the k percent of training links are chosen in the learning phase. There are three suggested selection methods:

1. Choosing k percent of the training links in random order
2. Choosing the k percent of training links with the lowest weights
3. Choosing the k percent of training links with the highest weights


Table 6.8 Precision measures of presented CALA-WLP and other link prediction methods

Method/Dataset   Hep-th   Hep-ph   Astro-ph   Enron    EuAll
CN               0.5421   0.4532   0.4291     0.5620   0.4196
Salton           0.5395   0.4260   0.4021     0.5582   0.3758
Jaccard          0.4856   0.3690   0.3620     0.4529   0.3795
PA               0.4821   0.3699   0.3052     0.4283   0.3593
AA               0.5027   0.4549   0.4503     0.5598   0.3500
Katz             0.6135   0.6296   0.5049     0.6473   0.4503
LP               0.5703   0.5831   0.4611     0.6184   0.4319
IP               0.5853   0.6086   0.5025     0.6204   0.4702
CMA-ES           0.6032   0.6140   0.5000     0.6360   0.4821
LLP              0.6051   0.5941   0.5081     0.6647   0.6475
NC               0.5694   0.4972   0.4595     0.6175   0.4589
LR               0.6071   0.6051   0.5147     0.6718   0.6500
Sup-WLP-CN       0.5732   0.5631   0.4392     0.4293   0.5039
Weak-WLP-CN      0.4250   0.4004   0.3930     0.6103   0.6294
CALA-WLP-CN      0.6246   0.6176   0.5594     0.7049   0.6792
Sup-WLP-JI       0.5995   0.5794   0.4703     0.4013   0.4849
Weak-WLP-JI      0.5013   0.4829   0.4103     0.6305   0.6375
CALA-WLP-JI      0.6042   0.5934   0.5692     0.6893   0.6485
Sup-WLP-PA       0.5737   0.3951   0.3492     0.4509   0.5195
Weak-WLP-PA      0.4258   0.3905   0.3602     0.6059   0.6199
CALA-WLP-PA      0.6399   0.6298   0.5703     0.7194   0.6894
Sup-WLP-AA       0.6016   0.5896   0.4893     0.4054   0.5023
Weak-WLP-AA      0.4193   0.3902   0.4294     0.6025   0.6312
CALA-WLP-AA      0.6185   0.6032   0.5603     0.6394   0.6325

Table 6.9 The result of topological features in the original network and predicted network

Feature/Dataset   Hep-th   Hep-ph   Astro-ph   Enron      EuAll
NumC_Original     4042/1   7031/1   9032/1     14,704/1   18,046/1
NumC_Predicted    3703/1   6260/1   7901/1     11,651/1   13,893/1
E_Original        0.185    0.131    0.115      0.073      0.054
E_Predicted       0.151    0.102    0.084      0.042      0.025
C_Original        0.764    0.683    0.581      0.236      0.145
C_Predicted       0.734    0.654    0.552      0.204      0.101
K_Original        4.247    4.021    2.538      2.531      1.032
K_Predicted       3.521    3.184    2.128      2.035      0.912


Table 6.10 The comparison of selection methods on the performance of the CALA-WLP

Method/Dataset        Hep-th   Hep-ph   Astro-ph   Enron    EuAll
CALA-WLP Random       0.9201   0.8628   0.7620     0.8934   0.7689
CALA-WLP Ascending    0.7263   0.7598   0.6351     0.7388   0.6101
CALA-WLP Descending   0.7361   0.7001   0.6624     0.6843   0.6390

Fig. 6.13 Study of parameter k on the CALA-WLP (accuracy versus iteration for k = 35%, 70%, and 100%)

In Table 6.10 we compare the AUC measure of CALA-WLP under the three selection methods, where the reported value is the mean over CALA-WLP-XX for all four used similarity metrics. From the results, we can conclude that random selection is the best method when we have no special information about whether lower or higher weights are important for the prediction task.

6.4.4.5 Experiment IV

This experiment studies the impact of the parameter k, which determines the percentage of training links chosen for evaluating the CALAs. To do this, the parameter k is varied over the set {35%, 70%, 100%} and the performance of the presented algorithm is calculated for the Hep-th data set. Figure 6.13 presents the results of this experiment. The figure shows that the best value of the parameter k is about 70%: when we choose too small a portion of the training set, the presented method underfits the data, resulting in poor accuracy; and when we use the whole training set, the presented method overfits the data, which again results in low accuracy. So the best choice for the parameter k is about 70%.


Fig. 6.14 Study of using fast convergence plan (accuracy versus iteration for CALA-WLP with and without the fast convergence plan)

6.4.4.6 Experiment V

This experiment tries to improve the convergence rate of the presented method by defining a CALA state as follows: each CALA has two states, On and Off. In the On state, the CALA chooses an action and updates its action probability; the On state indicates that the corresponding CALA mean is not yet good enough, so the CALA must keep choosing actions. In the Off state, the CALA performs no action selection or probability updating, because its mean is good enough and there is no need to perform the learning phase repeatedly. To transfer this idea into the presented algorithm, we use the following procedure. At the beginning, the state of each CALA is set to On, so all CALAs perform the learning phase. After generating the reinforcement signals, we normalize the $\beta_{\mu_j}(k)$ signals over all CALAs. If the normalized $\beta_{\mu_j}(k)$ of $CALA_j$ is larger than 0.9 and $\beta_{\mu_j}(k) > \beta_{\alpha_j}(k)$, we set the state of $CALA_j$ to Off and choose the mean parameter value of $CALA_j$ as its action from then on. This means that $CALA_j$ has a higher reinforcement signal for its mean than for its chosen action $\alpha_j$, and overall its mean reinforcement signal is good in comparison to all CALAs in the network, so we assume this CALA has converged to its mean. To evaluate the idea, we compare the performance of the presented algorithm on the Hep-th data set with and without the suggested procedure. Figure 6.14 presents the results of this experiment. The figure shows that the final accuracy of the presented algorithm does not change overall; the only difference is that with the mentioned procedure we obtain a faster convergence rate.
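The freezing rule can be written down compactly. The sketch below is our own paraphrase of the procedure (the object layout and attribute names are assumed, not the authors'):

```python
# Sketch of the fast-convergence rule of Experiment V: once the normalized
# mean-reinforcement of a CALA is strong enough, it is frozen ("Off" state)
# and plays its mean from then on.
def update_states(calas, beta_mu_norm, beta_alpha, threshold=0.9):
    """beta_mu_norm: mean-reinforcement signals normalized to [0, 1]."""
    for j, cala in calas.items():
        if (cala.state == "On"
                and beta_mu_norm[j] > threshold
                and beta_mu_norm[j] > beta_alpha[j]):
            cala.state = "Off"        # assumed converged: stop learning
            cala.action = cala.mu     # the mean is played forever after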


6.4.5 Discussion

In this section, we presented a new weighted link prediction method that uses learning automata to predict the occurrence or non-occurrence of each link using the weight information of the current network. The approach uses continuous action-set learning automata as a means to learn the optimal weight of each test link. In the presented method, there is one CALA for each test link that must be predicted, and each CALA tries to learn the true weight of the corresponding link based on the weight information in the current network. All learning automata iteratively choose the weight of their corresponding link as their action. The set of learning automata actions is used to calculate the weights of the training set, and each learning automaton is rewarded or punished according to its influence on the true weight estimation of the training set. The final prediction is performed based on the estimated weights. The experimental results reported here show that the presented algorithm is superior to other link prediction methods.

6.5 Link Prediction in Fuzzy Social Networks Using Distributed Learning Automata

Traditional social network analysis is based on deterministic models, calculating the values of some measures to analyze the relationships that exist between individuals. However, a single determined value is not sufficient to describe these relations accurately. Since the fuzzy concept was proposed (Zadeh 1978), several studies have shown that introducing a fuzzy system, using a fuzzy state to replace the original value, can appropriately address the problem of uncertainty in social networks. The research in Nair (2007) used fuzzy logic to modify the original binary relations into multiple relations among individuals while achieving the same efficiency in social networks. Brunelli and Fedrizzi (2009) presented a method using fuzzy logic to explain social relations. These methods can improve the flexibility of the relationships in social networks, and thus reduce the conflict between individuals. The authors of Bastani et al. (2013) applied a fuzzy model for link prediction based on network characteristics, and achieved better results than the traditional method. The introduction of a fuzzy model through fuzzy clustering to predict social network links also shows good performance (Yang et al. 2015). Recently, some researchers have used the ordered weighted averaging (OWA) operator to obtain the fuzzy relationship between nodes (He et al. 2015). This method needs some attributes of the network to calculate the relationship; similarity indices such as common neighbors (CN), Katz, Salton, or Adamic-Adar (AA) can be considered as attributes here. Although the method has the disadvantage of a large time complexity, it has the advantages of higher prediction accuracy and higher stability.

In this sub-section, a link prediction method is presented which uses both fuzzy social networks and WCLA (FLP-WCLA) to predict future links. The method has two steps. In the initialization phase, it models the strength of each link as a fuzzy variable by using the time and frequency information of the link. We thus obtain a fuzzy social network in which each link has a fuzzy strength, and we use this network as the input of the next step. The main phase of FLP-WCLA uses the WCLA to predict the future links. To do this, FLP-WCLA creates a node-based WCLA from the original network, where there is one LA for each node in the network. Each LA has two possible actions, {0, 1}: action 0 means that the corresponding node is not an end point of any predicted test link, and action 1 means that the corresponding node is an end point of some predicted test link. Each LA also has internal memory, used to store the other end points and their strengths. In each iteration of the proposed method, we randomly choose an LA, called LA_x, and activate it. LA_x chooses its new action, and if the chosen action differs from the previous one, it sends a wave and activates its neighbors up to a specified depth. After all the neighbors have chosen their actions, LA_x is rewarded according to the following rule: we check whether there is a test link (x, y), where x is the node corresponding to LA_x and y is any of the neighbors of x that chose action 1. For each such test link (x, y), we calculate the strength of the path between node x and node y, S(x, y), based on the path of activated LAs in the WCLA; we then reward action 1 of LA_x, store the pair (y, S(x, y)) in node x, and store (x, S(x, y)) in node y. The main phase is repeated until the WCLA converges. Finally, the method selects the LAs whose converged actions are 1, sorts them based on their strengths, and chooses the top k of them as the output of the link prediction problem. In order to examine the results of the proposed method, several evaluations are conducted, and the results of the presented method are compared with those achieved by other link prediction techniques; in general, the experiments show that this approach performs better than the other strategies. In the rest of this sub-section we introduce the WCLA-based link prediction method, and after that we demonstrate the experimental study on some social network datasets.
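The main-phase loop can be sketched as follows. This is an illustrative paraphrase of the description above, with `propagate_wave` (the WCLA wave up to a fixed depth) and `path_strength` (the fuzzy strength S(x, y) of the activated path) left as stand-ins; all names here are ours, not the authors'.

```python
# High-level sketch of one FLP-WCLA main-phase iteration (assumed structure).
import random

def flp_wcla_step(las, test_links, propagate_wave, path_strength):
    x = random.choice(list(las))        # randomly choose and activate an LA
    previous = las[x].action
    las[x].choose_action()              # new action in {0, 1}
    if las[x].action != previous:
        for y in propagate_wave(x):     # activated neighbors choose actions
            if (x, y) in test_links and las[y].action == 1:
                s = path_strength(x, y) # strength of the activated path
                las[x].reward(1)        # reward action 1 of LA_x
                las[x].memory[y] = s    # store end point and strength...
                las[y].memory[x] = s    # ...on both sides
```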

6.5.1 Link Prediction Method in Fuzzy Social Networks

This sub-section describes the link prediction method for fuzzy social networks. We first review the fuzzy concepts used in this method; we then describe the preprocessing phase, which shows how the presented method models a social network as a fuzzy social network; after that, we present a link prediction algorithm based on the fuzzy social network.

6.5.1.1 Fuzzy Concepts

The fuzzy set theory, introduced by Zadeh (1978), is suitable for dealing with the uncertainty and imprecision associated with information concerning various parameters. In this sub-section, we briefly review the main concepts and definitions of fuzzy sets, fuzzy variables and fuzzy social networks.

Definition 6.1 A fuzzy set $A$ in $\mathbb{R}$ (the real line) is defined to be a set of ordered pairs $A = \{(x, \mu_A(x)) \mid x \in \mathbb{R}\}$, where $\mu_A(x)$ is called the membership function of the fuzzy set.

Definition 6.2 A fuzzy set $A$ is called normal if there is at least one point $x \in \mathbb{R}$ with $\mu_A(x) = 1$.

Definition 6.3 A fuzzy set $A$ on $\mathbb{R}$ is convex if for any $x, y \in \mathbb{R}$ and any $\lambda \in [0, 1]$, we have $\mu_A(\lambda x + (1-\lambda)y) \ge \min(\mu_A(x), \mu_A(y))$.

Definition 6.4 A fuzzy number is a fuzzy set on the real line that satisfies the conditions of normality and convexity.

Definition 6.5 An L-R fuzzy number, denoted by $M = (m, \alpha, \lambda)$, has the following membership function:

$$\mu_M(x) = \begin{cases} 0 & x \le m - \alpha \\ 1 - \dfrac{m - x}{\alpha} & m - \alpha < x < m \\ 1 & x = m \\ 1 - \dfrac{x - m}{\lambda} & m < x < m + \lambda \\ 0 & x \ge m + \lambda \end{cases}$$
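As a quick sanity check, the membership function of Definition 6.5 can be evaluated directly (a direct transcription, with `lam` standing in for λ):

```python
# L-R fuzzy number membership function of Definition 6.5.
def mu_M(x, m, alpha, lam):
    if x <= m - alpha or x >= m + lam:
        return 0.0
    if x < m:
        return 1.0 - (m - x) / alpha   # left (L) branch
    if x > m:
        return 1.0 - (x - m) / lam     # right (R) branch
    return 1.0                         # x == m
```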

Lemma 7.1 For any error rate $\varepsilon \in (0, 1)$, there exists a learning parameter $a_\varepsilon^* \in (0, 1)$ such that for every learning parameter $a < a_\varepsilon^*$ and every $t > t_\varepsilon$ (the minimum time needed by DyTrust to achieve the error rate $\varepsilon$), we have

$$Prob\big[\,|c_i^* - c_i(t)| > 0\,\big] < \varepsilon \tag{7.12}$$

Proof Let $c_i^*$ be the final value of the penalty probability $c_i(t)$ when $t$ is large enough; that is, for large enough values of $t$, the probability of penalizing the trust path $\pi_i$ (i.e. $c_i(t)$) converges to its true value $c_i^*$. Using the weak law of large numbers, we conclude that

$$\lim_{t \to \infty} Prob\big[\,|c_i^* - c_i(t)| > \varepsilon\,\big] = 0$$

Hence, for any $\varepsilon \in (0, 1)$, there exists a learning parameter $a_\varepsilon^* \in (0, 1)$ and $t_\varepsilon < \infty$ such that for all $a < a_\varepsilon^*$ and $t > t_\varepsilon$, we have $Prob[|c_i^* - c_i(t)| > 0] < \varepsilon$, and the proof of the lemma is completed. $\square$

Lemma 7.2 Let $c_i(t) = Prob[S_{\pi_i}(t) < S_{n_k}(t)]$ and $d_i = 1 - c_i$ be respectively the probability of penalizing and of rewarding the trust path $\pi_i$. If $q(t)$ is updated according to the algorithm DyTrust, then the conditional expectation of $q_i(t)$ is given as

$$E[q_i(t+1) \mid q(t)] = \sum_{j=1}^{\rho} q_j(t) \Big[ c_j(t)\, q_i(t) + d_j(t) \prod_{e_{uv} \in \pi_i} p_{uv}(t+1) \Big] \tag{7.13}$$

where $\rho$ denotes the number of all trust paths from the source to a direct neighbor of the target, and $p_{uv}(t+1)$ is computed as

$$p_{uv}(t+1) = \begin{cases} p_{uv}(t) + a\,[1 - p_{uv}(t)] & e_{uv} \in \pi_j \\ (1 - a)\, p_{uv}(t) & e_{uv} \notin \pi_j \end{cases}$$

Proof As stated before, the probability of choosing a trust path $\pi_i$ is computed as the product of the selection probabilities of the trust links along the path, i.e. $q_i(t) = \prod_{e_{uv} \in \pi_i} p_{uv}(t)$. Since DyTrust uses the learning algorithm $L_{R-I}$ to update the action probability vectors at each time $t$, the probability $q_i(t)$ of traversing the trust path $\pi_i$ remains unchanged with probability $c_j(t)$ if the chosen trust path $\pi_j$ is penalized by the environment; otherwise, if the path $\pi_j$ is rewarded, with probability $d_j(t)$, the probability of choosing the edges of $\pi_i$ that lie in the selected path $\pi_j$ increases by a given learning rate, while that of the other edges of $\pi_i$ decreases.

Fig. 7.7 a Sample trust network G and b its search tree

To illustrate the proof of the lemma in more detail, the authors proved it for the optimal trust path of the trust network given in Fig. 7.7a. As shown in Fig. 7.7b, there exist three trust paths from node $v_1$ to the target's neighbor $v_4$: $\pi_1 = \{e_{12}, e_{23}, e_{34}\}$, $\pi_2 = \{e_{12}, e_{24}\}$ and $\pi_3 = \{e_{13}, e_{34}\}$. It is assumed that $\pi_2$ is the optimal trust path from $v_1$ to $v_4$. Let $q_i(t)$ be the probability of traversing the trust path $\pi_i$ and $p_{uv}$ the probability of choosing action $\alpha_{uv}$ by the automaton $A_u$ at time $t$. Therefore, we have

$$q_1(t) = p_{12}(t)\, p_{23}(t)\, p_{34}(t), \quad q_2(t) = p_{12}(t)\, p_{24}(t), \quad q_3(t) = p_{13}(t)\, p_{34}(t)$$

The conditional expectation of $q_2(t+1)$, assuming $q(t)$ is updated according to the algorithm DyTrust, is computed as

$$\begin{aligned} E[q_2(t+1) \mid q(t)] = {} & q_1(t)\big[c_1(t)\, q_2(t) + d_1(t)\,\{p_{12}(t) + a(1 - p_{12}(t))\}\,\{(1 - a_2(t))\, p_{24}(t)\}\big] \\ & + q_2(t)\big[c_2(t)\, q_2(t) + d_2(t)\,\{p_{12}(t) + a(1 - p_{12}(t))\}\,\{p_{24}(t) + a_2(t)(1 - p_{24}(t))\}\big] \\ & + q_3(t)\big[c_3(t)\, q_2(t) + d_3(t)\,\{(1 - a)\, p_{12}(t)\}\, p_{24}(t)\big] \end{aligned}$$

where $a_u(t)$ is the step length of the automaton $A_u$ at time $t$, computed according to Eq. (7.9). After simplifying all terms on the right side of the above equation and some algebraic manipulations, we have

$$E[q_2(t+1) \mid q(t)] = \sum_{j=1}^{3} q_j(t) \Big[ c_j(t)\, q_2(t) + d_j(t) \prod_{e_{uv} \in \pi_2} p_{uv}(t+1) \Big]$$

Hence, the proof of the lemma is completed. $\square$
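The edge-probability update used above is the standard $L_{R-I}$ scheme: on reward, the chosen edge's probability grows and the alternatives shrink; on penalty, nothing changes. A minimal sketch of that update for a single automaton $A_u$ (the dict-based representation is our own):

```python
# L_{R-I} update behind Eq. (7.13): p_u maps each outgoing edge of automaton
# A_u to its selection probability.
def lri_update(p_u, chosen, rewarded, a):
    if not rewarded:
        return                             # L_{R-I}: inaction on penalty
    for v in p_u:
        if v == chosen:
            p_u[v] += a * (1.0 - p_u[v])   # p_uv(t+1) = p_uv + a(1 - p_uv)
        else:
            p_u[v] *= (1.0 - a)            # p_uv(t+1) = (1 - a) p_uv
```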

Lemma 7.3 If $\pi_l$ is the trust path with the highest expected reliability and $q(t)$ is updated according to the algorithm DyTrust, then the increment in the conditional expectation of $q_l(t)$ is always non-negative, that is, $\Delta q_l(t) = E[q_l(t+1) \mid q(t)] - q_l(t) > 0$ for all $q_l(t) \in (0, 1)$.

Proof From Lemma 7.2, we have

$$\Delta q_l(t) = E[q_l(t+1) \mid q(t)] - q_l(t) = \sum_{j=1}^{\rho} q_j(t) \Big[ c_j(t)\, q_l(t) + d_j(t) \prod_{e_{uv} \in \pi_l} p_{uv}(t+1) \Big] - q_l(t) \tag{7.14}$$

Since the probability of traversing, rewarding, and penalizing a trust path is defined as the product of the corresponding probabilities of the trust links along the path, the above equality can be rewritten as

$$\Delta q_l(t) = \sum_{j=1}^{\rho} \prod_{e_{uv} \in \pi_j} p_{uv}(t) \Big[ \prod_{e_{uv} \in \pi_j} c_{uv}(t) \prod_{e_{uv} \in \pi_l} p_{uv}(t) + \prod_{e_{uv} \in \pi_j} d_{uv}(t) \prod_{e_{uv} \in \pi_l} p_{uv}(t+1) \Big] - \prod_{e_{uv} \in \pi_l} p_{uv}(t)$$

from which we have

$$\Delta q_l(t) = \prod_{e_{uv} \in \pi_l} E[p_{uv}(t+1) \mid p_u(t)] - \prod_{e_{uv} \in \pi_l} p_{uv}(t)$$

This equality can be rewritten as

$$\Delta q_l(t) \ge \sum_{e_{uv} \in \pi_l} \big( E[p_{uv}(t+1) \mid p_u(t)] - p_{uv}(t) \big) \prod_{e_{uv} \in \pi_l} p_{uv}(t) \tag{7.15}$$

and

$$\Delta p_{uv}(t) = a\, p_{uv}(t) \sum_{v_h \in \alpha_u,\, h \ne v} p_{uh}(t)\big( c_{uh}(t) - c_{uv}(t) \big)$$

Moreover, $q_i(t) \in (0, 1)$ for all $q \in S_\rho^0$, where $S_\rho = \{q(t) : 0 \le q_i(t) \le 1;\ \sum_{i=1}^{\rho} q_i(t) = 1\}$ and $S_\rho^0$ is the interior of $S_\rho$; hence $p_{uv}(t) \in (0, 1)$ for all $u, v$. Since the trust path $\pi_l$ is the path with the highest expected reliability, $c_j^* - c_l^* > 0$ for all $j \ne l$, and hence for each $e_{uv} \in \pi_l$, $c_{uh}^* - c_{uv}^* > 0$ for any action $v_h$ ($h \ne v$) of automaton $A_u$. It follows from Lemma 7.1 that $c_{uh}(t) - c_{uv}(t) > 0$ for large values of $t$. Therefore, we can conclude that for large values of $t$ the right side of the above equation is positive, and so, considering Eq. (7.15), we have

$$\Delta q_l(t) \ge \sum_{e_{uv} \in \pi_l} a\, p_{uv}(t) \sum_{v_h \in \alpha_u,\, h \ne v} p_{uh}(t)\big( c_{uh}(t) - c_{uv}(t) \big) \ge 0$$

which completes the proof of this lemma. $\square$

Corollary 7.1 The set of unit vectors in $S_\rho - S_\rho^0$, where $S_\rho^0 = \{q(t) : q_i(t) \in (0, 1);\ \sum_{i=1}^{\rho} q_i(t) = 1\}$, constitutes the set of all absorbing barriers of the Markov process $\{q(t)\}_{t \ge 1}$.

Proof Lemma 7.3 implicitly implies that $\{q(t)\}$ is a sub-martingale. Considering the martingale theorems and the fact that $\{q(t)\}$ is non-negative and uniformly bounded, it can be concluded that $\lim_{t \to \infty} q_l(t)$ converges to $q^*$ with probability one. Further, from Eq. (7.14) it is observed that if $q_l(t) \notin \{0, 1\}$, then $q_l(t+1) \ne q_l(t)$ with a nonzero probability for all $t$, and if $q^* \in \{0, 1\}$, then $q(t+1) = q(t)$; hence the proof of this corollary. $\square$

Let $\Gamma_l(q)$ be the convergence probability of DyTrust to the unit vector $I_l$ with initial probability vector $q$, defined as

$$\Gamma_l(q) = Prob\big[ q_l(\infty) = 1 \mid q(0) = q \big] = Prob\big[ q^* = I_l \mid q(0) = q \big] \tag{7.16}$$

Also let $C(S_\rho) : S_\rho \to \mathbb{R}$, where $\mathbb{R}$ is the real line, be the state space of all real-valued continuously differentiable functions with bounded derivative defined on $S_\rho$, and let $\psi(\cdot) \in C(S_\rho)$. The operator $U$ is defined as follows:

$$U\psi(q) = E[\psi(q(t+1)) \mid q(t) = q] \tag{7.17}$$

It has been shown in (Lakshmivarahan and Thathachar 1976) that the operator $U$ is linear and preserves non-negative functions, as the expectation of a positive function remains positive; in other words, if $\psi(q) \ge 0$, then $U\psi(q) \ge 0$ for all $q \in S_\rho$. A function $\psi(q)$ is called sub-regular (super-regular) if and only if $\psi(q) \le U\psi(q)$ ($\psi(q) \ge U\psi(q)$) for all $q \in S_\rho$. It has also been shown in (Lakshmivarahan and Thathachar 1976) that $\Gamma_l(q)$ is the only continuous solution of $U\Gamma_l(q) = \Gamma_l(q)$, subject to the following boundary conditions:

$$\Gamma_l(I_j) = \begin{cases} 1 & j = l \\ 0 & j \ne l \end{cases} \tag{7.18}$$

The function $\phi_l[x, q] \in C(S_\rho)$, defined as given below, satisfies the above boundary conditions:

$$\phi_l[x, q] = \frac{e^{-x q_l / a} - 1}{e^{-x/a} - 1} \tag{7.19}$$

where $x > 0$.

Theorem 7.2 If $\psi_l(\cdot) \in C(S_\rho)$ is super-regular with the boundary conditions $\psi_l(I_l) = 1$ and $\psi_l(I_j) = 0$ for $j \ne l$, then $\psi_l(q) \ge \Gamma_l(q)$ for all $q \in S_\rho$. Also, if $\psi_l(\cdot) \in C(S_\rho)$ is sub-regular with the same boundary conditions, then $\psi_l(q) \le \Gamma_l(q)$ for all $q \in S_\rho$.

Proof This theorem has been proved in (Thathachar and Sastry 2002). $\square$

In what follows, it is shown that $\phi_l[x, q]$ is a sub-regular function, and thus $\phi_l[x, q]$ qualifies as a lower bound on $\Gamma_l(q)$. Since sub- and super-regular functions are closed under addition and multiplication by a positive constant, and since if $\phi(\cdot)$ is super-regular then $-\phi(\cdot)$ is sub-regular, $\phi_l[x, q]$ is sub-regular if and only if $\theta_l[x, q] = e^{-x q_l / a}$ is super-regular. The conditions under which $\theta_l[x, q]$ is super-regular are determined as follows. From the definition of the operator $U$ in Eq. (7.17), we have

$$\begin{aligned} U\theta_l[x, q] &= E\big[ e^{-x q_l(t+1)/a} \mid q(t) = q \big] \\ &= \sum_{j=1}^{\rho} q_j d_j^* \exp\Big( -\frac{x}{a} \prod_{e_{uv} \in \pi_l \cap \pi_j} \big( p_{uv} + a(1 - p_{uv}) \big) \prod_{e_{uv} \in \pi_l \setminus \pi_j} p_{uv}(1 - a) \Big) \\ &= q_l d_l^*\, e^{-\frac{x}{a}(q_l + a(1 - q_l))} + \sum_{j \ne l} q_j d_j^* \exp\Big( -\frac{x}{a} \prod_{e_{uv} \in \pi_l \cap \pi_j} \big( p_{uv} + a(1 - p_{uv}) \big) \prod_{e_{uv} \in \pi_l \setminus \pi_j} p_{uv}(1 - a) \Big) \end{aligned} \tag{7.20}$$

where $d_j^*$ is the final value to which the reward probability $d_j(t)$ converges for large values of $t$, and $e^{-\frac{x}{a}(q_l + a(1-q_l))}$ denotes the expectation of $\theta_l[x, q]$ when the most reliable path $\pi_l$ is rewarded by the environment. Collecting the terms with the help of the quantity $\delta_{lj}$, we can write

$$U\theta_l[x, q] = q_l d_l^*\, e^{-\frac{x}{a}\delta_{lj}(q_l + a(1-q_l))} + \sum_{j \ne l} q_j d_j^*\, e^{-\frac{x}{a}\delta_{lj}\, q_l (1-a)}$$

where $\delta_{lj}$ is computed as

$$\delta_{lj} = \begin{cases} \prod_{e_{uv} \in \pi_l \cap \pi_j} \dfrac{p_{uv} + a(1 - p_{uv})}{p_{uv}(1-a)} & j \ne l \\ 1 & j = l \ \text{or} \ \pi_j \cap \pi_l = \emptyset \end{cases}$$

Then

$$U\theta_l[x, q] - \theta_l[x, q] = e^{-\frac{x q_l \delta_{lj}}{a}} \sum_{j \ne l} q_j d_j^*\, e^{-x(1-q_l)\delta_{lj}} + e^{-\frac{x q_l \delta_{lj}}{a}} \sum_{j \ne l} q_j d_j^*\, e^{-x q_l \delta_{lj}} - e^{-\frac{x q_l}{a}}$$

and $\theta_l[x, q]$ is super-regular if

$$e^{-\frac{x q_l \delta_{lj}}{a}} \sum_{j \ne l} q_j d_j^*\, e^{-x(1-q_l)\delta_{lj}} + e^{-\frac{x q_l \delta_{lj}}{a}} \sum_{j \ne l} q_j d_j^*\, e^{-x q_l \delta_{lj}} \le e^{-\frac{x q_l}{a}}$$

that is,

$$U\theta_l[x, q] \le e^{-\frac{x q_l}{a}}\, q_l d_l^*\, e^{-x(1-q_l)} + e^{-\frac{x q_l}{a}} \sum_{j \ne l} q_j d_j^*\, e^{-x q_l}$$

If $\theta_l[x, q]$ is super-regular, then we have

$$U\theta_l[x, q] - \theta_l[x, q] \le e^{-\frac{x q_l}{a}}\, q_l d_l^*\, e^{-x(1-q_l)} + e^{-\frac{x q_l}{a}} \sum_{j \ne l} q_j d_j^*\, e^{-x q_l} - e^{-\frac{x q_l}{a}}$$

After multiplying and dividing the right side of the above inequality by $-x q_l$ and some algebraic simplifications, we have

$$U\theta_l[x, q] - \theta_l[x, q] \le -x q_l\, e^{-\frac{x q_l}{a}} \Big[ (1 - q_l) d_l^*\, \frac{e^{-x(1-q_l)} - 1}{-x(1-q_l)} - \sum_{j \ne l} q_j d_j^*\, \frac{e^{x q_l} - 1}{x q_l} \Big]$$

and hence

$$U\theta_l[x, q] - \theta_l[x, q] \le -x q_l\, e^{-\frac{x q_l}{a}} \Big[ (1 - q_l) d_l^*\, V[-x(1 - q_l)] - \sum_{j \ne l} q_j d_j^*\, V[x q_l] \Big] = -x q_l\, \theta_l[x, q]\, G_l[x, q]$$

where

$$V[u] = \begin{cases} \dfrac{e^u - 1}{u} & u \ne 0 \\ 1 & u = 0 \end{cases}$$

and

$$G_l[x, q] = (1 - q_l)\, d_l^*\, V[-x(1 - q_l)] - \sum_{j \ne l} q_j d_j^*\, V[x q_l] \tag{7.21}$$

Therefore, $\theta_l[x, q]$ is super-regular if $G_l[x, q] \ge 0$ for all $q \in S_\rho$. Considering Eq. (7.21), $\theta_l[x, q]$ is super-regular if

$$f_l[x, q] \equiv \frac{V[-x(1 - q_l)]}{V[x q_l]} \ge \frac{\sum_{j \ne l} q_j d_j^*}{(1 - q_l)\, d_l^*} \tag{7.22}$$

The right side of the above inequality consists of non-negative terms, so we have

$$\min_{j \ne l} \Big( \frac{d_j^*}{d_l^*} \Big) \le \frac{\sum_{j \ne l} q_j d_j^*}{d_l^* \sum_{j \ne l} q_j} \le \max_{j \ne l} \Big( \frac{d_j^*}{d_l^*} \Big)$$

where $\sum_{j \ne l} q_j$ has been substituted by $(1 - q_l)$. From Eq. (7.21), it follows that $\theta_l[x, q]$ is super-regular if

$$f_l[x, q] \ge \max_{j \ne l} \Big( \frac{d_j^*}{d_l^*} \Big) \tag{7.23}$$

For further simplification, let us employ logarithms. We define $\Phi(q, x) = \ln f_l[x, q]$. It has been shown in (Lakshmivarahan and Thathachar 1976) that

$$-\int_0^x H'(u)\, du \le \Phi(q, x) \le -\int_{-x}^0 H'(u)\, du, \qquad H'(u) = \frac{dH(u)}{du}, \quad H(u) = \ln V(u)$$

Therefore, we have

$$\frac{1}{V[x]} \le \frac{V[-x(1 - q_l)]}{V[x q_l]} \le V[-x] \quad \text{and} \quad \frac{1}{V[x]} = \max_{j \ne l} \Big( \frac{d_j^*}{d_l^*} \Big) \tag{7.24}$$

Let $x^*$ be the value of $x$ for which the above equation is true. It is shown that if $d_j/d_l$ is smaller than 1 for all $j \ne l$, then there exists a value of $x > 0$ under which this equation is satisfied. By choosing $x = x^*$, Eq. (7.24) holds true, and thus $G_l[x, q] \ge 0$ for all $q \in S_\rho$ and $\theta_l[x, q]$ is super-regular. Therefore, Eq. (7.19) is a sub-regular function satisfying the boundary conditions in Eq. (7.18). From Theorem 7.2, we can conclude that $\phi_l[x, q] \le \Gamma_l(q) \le 1$. From the definition of $\phi_l[x, q]$, we can see that given any $\varepsilon > 0$ there exists a positive constant $a^* < 1$ such that $1 - \varepsilon \le \phi_l[x, q] \le \Gamma_l(q) \le 1$ for all $0 < a \le a^*$. As a result, the probability with which the algorithm DyTrust finds the trust path with the highest expected reliability is equal to 1 as $t \to \infty$, and so Theorem 7.1 is proved. $\square$

Theorem 7.3 Let $(1 - \varepsilon)$ be the probability with which the algorithm DyTrust converges to the trust path $\pi_l$. If $q(t)$ is updated according to DyTrust, then for every $\varepsilon \in (0, 1)$, there exists a learning parameter $a \in (\varepsilon, q)$ such that

$$\frac{x a}{e^{x a} - 1} = \max_{j \ne l} \Big( \frac{d_j}{d_l} \Big) \tag{7.25}$$

where $1 - e^{-x q_l} = (1 - e^{-x})(1 - \varepsilon)$ and $q_l = [q_l(t) \mid t = 0]$.

Proof It has been proved in (Thathachar and Sastry 2002) that if $d_j/d_l < 1$ for all $j \ne l$, then there exists an $x > 0$ under which Eq. (7.24) is satisfied. Therefore, it is concluded that

$$\phi_l[x, q] \le \Gamma_l(q) \le \frac{1 - e^{-x q_l}}{1 - e^{-x}}$$

where $q_l$ denotes the initial choice probability of the optimal trust path $\pi_l$. From Theorem 7.1, it follows that for each $0 < a < a^*$ the probability of DyTrust converging to the trust path with the highest expected reliability is $(1 - \varepsilon)$, where $a^*(\varepsilon) \in (0, 1)$. Therefore, we can conclude that

$$\frac{1 - e^{-x q_l}}{1 - e^{-x}} \ge 1 - \varepsilon \tag{7.26}$$

It is shown that for every $\varepsilon \in (0, 1)$, there exists a value of $x$ under which Eq. (7.24) is satisfied, and so we have

$$\frac{x^* a}{e^{x^* a} - 1} = \max_{j \ne l} \Big( \frac{d_j^*}{d_l^*} \Big)$$

It is concluded that for every $\varepsilon \in (0, 1)$, there exists a learning parameter $a \in (\varepsilon, q)$ under which the convergence probability of DyTrust to the most reliable trust path is greater than $(1 - \varepsilon)$. This completes the proof of the theorem. $\square$


7.4.2.2 Experimental Evaluation

In order to evaluate the performance of the proposed algorithm DyTrust, the authors conducted several extensive experiments on the real trust network dataset of Kaitiaki and compared the effectiveness of DyTrust with that of the well-known trust propagation algorithms TidalTrust (Yin et al. 2004) and MoleTrust (Avesani et al. 2005). They also considered two algorithms proposed in their previous work on trust propagation (Ghavipour and Meybodi 2018), namely Min-MCFAvg and DLATrust, for the performance comparison. In the experiments, the path probability threshold P and the decay factor λ were set to 0.9, the threshold K for the number of traversed paths was set to 10,000, and the learning parameter a was set to 0.03. Experimental results are reported as an average of 10 independent runs. The authors used a system with the hardware configuration of an Intel® Core i5 2.53 GHz and 4 GB of RAM for all the experiments.

7.4.2.2.1 Experimental Method

The evaluation was done using a standard evaluation technique in machine learning: leave-one-out (Kohavi 1995). The trust link between the source vs and the target user vd is masked. Next, the trust weight w′sd from vs to vd is calculated through the trust propagation algorithms using the remaining trust links. While existing algorithms use only the trust weights of the initial snapshot of the trust network in the inference process, the proposed algorithm DyTrust considers the changes of trust weights during its running time. At last, a sample is taken from the trust weight of vs on vd as the actual trust weight, with which the weight w′sd calculated by the algorithms is compared. In this way, the actual trust weight differs from the trust weight masked from the initial snapshot. It is ensured that for any pair of nodes vs and vd the same actual trust weight is used in the evaluation process of all the algorithms. In the experiments, the authors used the two metrics of coverage and prediction accuracy given in Sect. 7.4.1.1.2 for evaluating the performance of the trust propagation algorithms.

7.4.2.2.2 Dataset

The authors tested their proposed algorithm DyTrust on the real trust network dataset of Kaitiaki (www.kaitiaki.org.nz). The Kaitiaki dataset is a small trust network which contains 178 trust statements issued by 64 users on September 1, 2008 at four levels: Kaitiro, Te Hunga Manuhiri, Te Hunga Käinga, and Te Komiti Whakahaere. Since they needed trust weights to be real-valued, they initially assigned 0.2 to Kaitiro, 0.6 to Te Hunga Manuhiri, 0.8 to Te Hunga Käinga, and 1 to Te Komiti Whakahaere; in this way, the initial snapshot of the Kaitiaki network is constructed. The authors then used the technique proposed in (Richardson et al. 2003) to update trust weights over time. This technique has been widely utilized in the literature (Shekarpour and Katebi 2010; Jiang et al. 2015, 2016; Ghavipour and Meybodi 2018) to augment trust networks with trust weights which are continuous in [0, 1]. In this technique, each user vi is assigned a quality measurement qi ∈ [0, 1] which determines the probability that a statement provided by vi is true. The authors considered a user's reputation as the quality of the user; namely, for a user vi the quality qi is computed as the average of the trust statements incoming to vi. Since the trust weights are between 0 and 1, it is ensured that the users' quality is also in [0, 1]. After that, for any pair of users vi and vj, the new trust weight wij from vi to vj is uniformly chosen from [max(qj − δij, 0), min(qj + δij, 1)], where δij = (1 − qi)/2 denotes a noise parameter which determines how accurate the user vi is at estimating the quality of the user vj that he is trusting.
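The augmentation technique just described is easy to reproduce. The sketch below is our own rendering of it (the input format is assumed), generating a fresh set of trust weights for one time step:

```python
# Sketch of the trust-weight update technique (Richardson et al. 2003), as
# used to make the Kaitiaki snapshot time-varying.
import random

def update_trust_weights(edges, in_weights):
    """edges: list of (v_i, v_j); in_weights[v]: incoming trust weights of v."""
    # A user's quality is taken as its reputation: the mean incoming trust.
    q = {v: sum(ws) / len(ws) for v, ws in in_weights.items() if ws}
    new_w = {}
    for vi, vj in edges:
        delta = (1.0 - q.get(vi, 0.0)) / 2.0      # noise: less accurate raters
        lo = max(q.get(vj, 0.0) - delta, 0.0)
        hi = min(q.get(vj, 0.0) + delta, 1.0)
        new_w[(vi, vj)] = random.uniform(lo, hi)  # new weight w_ij in [0, 1]
    return new_w
```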

7.4.2.2.3 Experimental Results

Experiment I: Evaluating different strategies

In this experiment, the authors validated the effectiveness of the proposed algorithm DyTrust in comparison with the other trust propagation algorithms in terms of prediction accuracy and running time. The algorithms used for the comparison are TidalTrust, MoleTrust, Min-MCFAvg and DLATrust. Similar to DLATrust, the proposed algorithm DyTrust places no limit on the length of trust paths; in contrast, the other three strategies are based only on the shortest trust paths from source to target. The accuracy and running time of these algorithms on the Kaitiaki dataset are reported in Table 7.4. From this table, we can see that DyTrust yields the highest prediction accuracy compared to the others. While DyTrust considers the changes of trust weights during its trust propagation process, the other algorithms estimate the trustworthiness of a given target user based only on the initial snapshot of Kaitiaki. As a result, it is possible that, for example, a target user estimated to be trusted by these algorithms is no longer trustworthy with respect to the trust variations that occurred during their running time. There is thus a trade-off between computational complexity and inference accuracy. It is worth mentioning that the additional computations required for considering the trust dynamicity are not too time-consuming and affect the running time only by some constant factor.

Table 7.4 Comparison between different strategies

Strategy      MAE      Precision   Recall   FScore   Time (s)
TidalTrust    0.1193   0.9705      0.9447   0.9574   0.0043
MoleTrust     0.1148   0.9705      0.9447   0.9574   0.0061
Min-MCFAvg    0.1355   0.9711      0.9668   0.9689   0.0050
DLATrust      0.1417   0.9710      0.9646   0.9678   5.1037
DyTrust       0.0834   0.9543      0.9943   0.9739   0.5477

Fig. 7.8 The impact of trust threshold on the inference accuracy (a MAE and b FScore versus trust threshold)

As reported in Table 7.4, DyTrust has a much lower running time than DLATrust, while both algorithms consider all paths in the trust inference process. However, it takes more time in comparison with the algorithms that use only the shortest trust paths.

Experiment II: Testing the influence factors

Trust Threshold. The authors investigated the impact of the trust threshold on the prediction accuracy and coverage of the proposed algorithm DyTrust in comparison with the other trust propagation algorithms. For this purpose, they varied the trust threshold in the range [0.1, 0.9] and, for each threshold, used in the trust inference process only the trust statements of those direct neighbors that are reliable at or above the threshold. The accuracy and coverage of DyTrust for different trust thresholds are compared with those of the other algorithms in Figs. 7.8a-b and 7.10a. According to these figures, DyTrust provides the highest accuracy and coverage for all trust thresholds in comparison with the other algorithms. As the trust threshold increases, the MAE and FScore values remain roughly constant for all the algorithms, except at the threshold of 0.9. Increasing the trust threshold decreases the coverage of the algorithms, especially at a threshold of 0.9. This indicates that the trust threshold should not be set to a large value, since for a larger threshold fewer paths will be trusted.

Maximum Length. The authors examined the impact of the maximum path length on the performance of the proposed algorithm DyTrust compared to the other algorithms. The maximum length varies from 2 to 7. Figures 7.9a-c and 7.10b show the comparison of, respectively, the MAE, FScore, running time and coverage of DyTrust with those of the other algorithms for different maximum lengths. Since the algorithms with which DyTrust is compared provide the same coverage for any length, in Fig. 7.10b we report the coverage value of these algorithms only once, labeled as "Others".


Fig. 7.9 The impact of maximum path length on the inference accuracy and time (a MAE, b FScore, and c running time versus maximum length)

Fig. 7.10 The impact of trust threshold and maximum path length on the coverage (a coverage versus trust threshold; b coverage versus maximum length)

From Fig. 7.9a, b, we can observe that for all the algorithms the MAE at first increases along with the maximum length and then remains roughly constant. The FScore of DyTrust increases when the maximum length varies from 2 to 3, while for maximum lengths larger than 3 it goes down a little. The authors analyzed the reason and found that searching with a larger maximum length results in finding more paths, which are of course longer and hence less reliable; this decreases the accuracy of trust inference. In contrast, for the other algorithms under comparison the FScore value decreases as the maximum length increases up to 4 and after that increases a little. Generally, DyTrust obtains the highest prediction accuracy for all maximum lengths compared to the other algorithms. As shown in Fig. 7.9c, increasing the maximum length obviously increases the running time for all the algorithms, especially when it changes from 2 to 3; the increase is much faster for the DLATrust algorithm. According to the results of Fig. 7.10b, the coverage also increases with the maximum length, because more new trust paths can be found as the maximum length gets larger. Moreover, for all maximum lengths DyTrust provides the same coverage as the other algorithms, which indicates that DyTrust always successfully discovers reliable trust paths satisfying the path length constraint between any pair of nodes.

7.5 Conclusion

Since trust is one of the most important factors in forming social interactions, it is necessary in these networks to evaluate trust from one user to another, indirectly connected user by propagating trust along reliable trust paths between the two users. The quality of trust inference based on trust propagation is affected by the length of the trust paths and by the aggregation and propagation strategies used for propagating and combining trust values. In this chapter, we first reviewed existing methods in the literature for trust inference and then described in detail two learning automata based trust propagation algorithms, DLATrust (Ghavipour and Meybodi 2018) and DyTrust (Ghavipour and Meybodi 2018).

The algorithm DLATrust utilizes distributed learning automata to discover reliable trust paths and predict the trust value between two indirectly connected users. In this algorithm, an automaton is assigned to each node of the trust network to learn the next reliable nodes along trust paths towards the target node. In order to overcome the weaknesses of existing aggregation functions and improve the accuracy of trust inference, the authors also proposed a new aggregation function based on standard collaborative filtering for combining trust values derived from multiple paths. Their experimental results on a real trust network dataset showed that DLATrust can efficiently achieve high trust inference accuracy compared to existing trust propagation algorithms.

The second algorithm, DyTrust, is a dynamic trust propagation algorithm based on distributed learning automata for inferring trust in stochastic trust networks. Since trust changes over time as a result of repeated direct interactions between users, trust networks can be modelled as stochastic graphs with continuous time-varying edge weights. Even though the dynamic nature of trust has been universally accepted in the literature, existing trust propagation algorithms do not take the dynamicity of trust into consideration: they take an instant snapshot of the trust network and then deterministically infer trust in that snapshot. Because trust propagation algorithms are time-consuming, it is highly probable that trust weights change during the algorithms' running time, and therefore the estimated trust values will not be sufficiently accurate. DyTrust is the first to address the dynamicity property of trust in the domain of trust inference. Considering the changes of trust weights over time, this algorithm finds the most reliable trust path to each neighboring user of the target and estimates the trust value of the target based on its neighbors' reliability.

In order to validate the effectiveness of the algorithm DyTrust, the authors conducted extensive experiments with the real trust network dataset Kaitiaki and compared the results of DyTrust with those of well-known trust propagation algorithms, as well as with those of the algorithm DLATrust. They found that by considering the temporal variation of trust, DyTrust outperforms the existing algorithms in terms of prediction accuracy. Since DyTrust uses all trust paths in the inference process, it needs more time in comparison with the algorithms using only shortest paths; however, its running time is much less than that of DLATrust, which is also based on all trust paths. The authors also investigated the impact of the influence factors, including the trust threshold and the maximum path length, on the performance of DyTrust as compared to the other algorithms. They found that while increasing the trust threshold decreases the prediction coverage, the accuracy of the algorithms remains roughly constant; moreover, DyTrust always has the highest accuracy and coverage for all trust thresholds compared to the others. Finally, by testing different maximum lengths, they found that increasing the maximum length decreases the accuracy of DyTrust, although it always obtains the highest accuracy for all lengths in comparison with the other algorithms. They also observed that increasing the maximum length increases the running time and prediction coverage for all the algorithms under comparison.

References

Abdul-Rahman A, Hailes S (2000) Supporting trust in virtual communities. In: Proceedings of the 33rd annual Hawaii international conference on system sciences. IEEE, p 9
Al-Oufi S, Kim HN, El Saddik A (2012) A group trust metric for identifying people of trust in online social networks. Expert Syst Appl 39:13173–13181. https://doi.org/10.1016/j.eswa.2012.05.084
Avesani P, Massa P, Tiella R (2005) A trust-enhanced recommender system application: Moleskiing. In: Proceedings of the 2005 ACM symposium on applied computing. ACM, pp 1589–1593
Beigy H, Meybodi MR (2006) Utilizing distributed learning automata to solve stochastic shortest path problems. Int J Uncertain Fuzziness Knowl-Based Syst 14:591–615. https://doi.org/10.1142/S0218488506004217
Caverlee J, Liu L, Webb S (2008) Towards robust trust establishment in web-based social networks with SocialTrust. In: Proceedings of the 17th international conference on World Wide Web—WWW '08. ACM, p 1163
Chen IR, Bao F, Chang M, Cho JH (2014) Dynamic trust management for delay tolerant networks and its application to secure routing. IEEE Trans Parallel Distrib Syst 25:1200–1210. https://doi.org/10.1109/TPDS.2013.116
Cho JH, Swami A, Chen IR (2012) Modeling and analysis of trust management with trust chain optimization in mobile ad hoc networks. J Netw Comput Appl 35:1001–1012. https://doi.org/10.1016/j.jnca.2011.03.016
Christianson B, Harbison WS (1996) Why isn't trust transitive? In: International workshop on security protocols. Springer, pp 171–176
Cook KS, Yamagishi T, Cheshire C et al (2005) Trust building via risk taking: a cross-societal experiment. Soc Psychol Q 68:121–142. https://doi.org/10.1177/019027250506800202
Dillon TS, Chang E, Hussain FK (2004) Managing the dynamic nature of trust. IEEE Intell Syst 19:79–82
Fan ZP, Suo WL, Feng B, Liu Y (2011) Trust estimation in a virtual team: a decision support method. Expert Syst Appl 38:10240–10251. https://doi.org/10.1016/j.eswa.2011.02.060
Fung C, Zhang J (2011) Dirichlet-based trust management for effective collaborative intrusion detection networks. IEEE Trans Netw Serv Manag 8:79–91
Ghavipour M, Meybodi MR (2018) A dynamic algorithm for stochastic trust propagation in online social networks: learning automata approach. Comput Commun 123:11–23. https://doi.org/10.1016/j.comcom.2018.04.004
Golbeck J (2004) Inferring trust relationships in web-based social networks. ACM Trans Internet Technol 6:497–529
Granovetter M (1985) Economic action and social structure: the problem of embeddedness. Am J Sociol 91:481–510. https://doi.org/10.1002/9780470755679.ch5
Guo J, Chen I, Tsai JJP (2017) A survey of trust computation models for service management in internet of things systems. Comput Commun 97:1–14
Huang F (2007) Building social trust: a human-capital approach. J Inst Theor Econ (JITE) 163:552–573
Hughes D, Coulson G, Walkerdine J (2005) Free riding on Gnutella revisited—the bell tolls. IEEE Distrib Syst Online 6
Jang M-H, Faloutsos C, Kim S-W et al (2016) PIN-TRUST: fast trust propagation exploiting positive, implicit, and negative information. In: Proceedings of the 25th ACM international conference on information and knowledge management. ACM, pp 629–638
Jiang LL, Perc M (2013) Spreading of cooperative behaviour across interdependent groups. Sci Rep 3. https://doi.org/10.1038/srep02483
Jiang W, Wang G, Wu J (2014) Generating trusted graphs for trust evaluation in online social networks. Futur Gener Comput Syst 31:48–58. https://doi.org/10.1016/j.future.2012.06.010
Jiang W, Wu J, Wang G (2015) On selecting recommenders for trust evaluation in online social networks. ACM Trans Internet Technol 15:14. https://doi.org/10.1145/2807697
Jiang W, Wu J, Li F et al (2016) Trust evaluation in online social networks using generalized network flow. IEEE Trans Comput 65:952–963. https://doi.org/10.1109/TC.2015.2435785
Jones AJI, Pitt J (2011) On the classification of emotions, and its relevance to the understanding of trust. In: Trust in agent societies (TRUST-2011), pp 69–82
Jøsang A, Haller J (2007) Dirichlet reputation systems. In: Proceedings of the second international conference on availability, reliability and security (ARES 2007). IEEE, pp 112–119
Jøsang A, Ismail R (2002) The beta reputation system. In: Proceedings of the 15th Bled electronic commerce conference, pp 41–55
Jøsang A, Gray E, Kinateder M (2006) Simplification and analysis of transitive trust networks. Web Intell Agent Syst 4:1–26
Jurca R, Faltings B (2003) An incentive compatible reputation mechanism. In: IEEE international conference on e-commerce (CEC 2003). IEEE, pp 285–292
Kamvar SD, Schlosser MT, Garcia-Molina H (2003) The EigenTrust algorithm for reputation management in P2P networks. In: Proceedings of the twelfth international conference on World Wide Web—WWW '03. ACM, p 640
Kim YA (2015) An enhanced trust propagation approach with expertise and homophily-based trust networks. Knowl-Based Syst 82:20–28. https://doi.org/10.1016/j.knosys.2015.02.023
Kim YA, Song HS (2011) Strategies for predicting local trust based on trust propagation in social networks. Knowl-Based Syst 24:1360–1371. https://doi.org/10.1016/j.knosys.2011.06.009
Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the 14th international joint conference on artificial intelligence, vol 2, pp 1137–1143
Kuter U, Golbeck J (2007) SUNNY: a new algorithm for trust inference in social networks, using probabilistic confidence models. In: Proceedings of the 22nd national conference on artificial intelligence (AAAI-07), pp 1377–1382
Kuter U, Golbeck J (2010) Using probabilistic confidence models for trust inference in web-based social networks. ACM Trans Internet Technol 10:1–23. https://doi.org/10.1145/1754393.1754397
Lakshmivarahan S, Thathachar MAL (1976) Bounds on the convergence probabilities of learning automata. IEEE Trans Syst Man Cybern 6:756–763
Lesani M, Montazeri N (2009) Fuzzy trust aggregation and personalized trust inference in virtual social networks. Comput Intell 25:51–83. https://doi.org/10.1111/j.1467-8640.2009.00334.x
Levien R, Aiken A (1998) Attack-resistant trust metrics for public key certification. In: 7th USENIX security symposium, pp 229–241
Li Q, Chen M, Perc M et al (2013) Effects of adaptive degrees of trust on coevolution of quantum strategies on scale-free networks. Sci Rep 3. https://doi.org/10.1038/srep02949
Liu H, Lim E-P, Lauw HW et al (2008) Predicting trusts among users of online communities. In: Proceedings of the 9th ACM conference on electronic commerce—EC '08. ACM, p 310
Liu X, Wang Y, Zhu S, Lin H (2013) Combating web spam through trust-distrust propagation with confidence. Pattern Recognit Lett 34:1462–1469. https://doi.org/10.1016/j.patrec.2013.05.017
Lyu S, Liu J, Tang M et al (2015) Efficiently predicting trustworthiness of mobile services based on trust propagation in social networks. Mob Netw Appl 20:840–852. https://doi.org/10.1007/s11036-015-0619-y
Maheswaran M, Hon CT, Ghunaim A (2007) Towards a gravity-based trust model for social networking systems. In: Proceedings of the international conference on distributed computing systems workshops. IEEE, p 24
Massa P, Avesani P (2007) Trust metrics on controversial users. Int J Semant Web Inf Syst 3:39–64. https://doi.org/10.4018/jswis.2007010103
Möllering G (2001) The nature of trust: from Georg Simmel to a theory of expectation, interpretation and suspension. Sociology 35:403–420. https://doi.org/10.1017/S0038038501000190
Molm LD, Takahashi N, Peterson G (2000) Risk and trust in social exchange: an experimental test of a classical proposition. Am J Sociol 105:1396–1427. https://doi.org/10.1086/210434
Mui L, Halberstadt A (2002) A computational model of trust and reputation. In: Proceedings of the 35th Hawaii international conference on system sciences. IEEE, pp 2431–2439
Ortega FJ, Troyano JA, Cruz FL et al (2012) Propagation of trust and distrust for the detection of trolls in a social network. Comput Netw 56:2884–2895. https://doi.org/10.1016/j.comnet.2012.05.002
Perc M, Grigolini P (2013) Collective behavior and evolutionary games—an introduction. Chaos Solitons Fractals 56:1–5. https://doi.org/10.1016/j.chaos.2013.06.002
Ren Y, Li M, Xiang Y et al (2013) Evolution of cooperation in reputation system by group-based scheme. J Supercomput 63:171–190. https://doi.org/10.1007/s11227-010-0498-8
Resnick P, Varian HR (1997) Recommender systems. Commun ACM 40:56–58
Resnick P, Iacovou N, Suchak M et al (1994) GroupLens: an open architecture for collaborative filtering of netnews. In: Proceedings of the ACM 1994 conference on computer supported cooperative work. ACM, pp 175–186
Richardson M, Agrawal R, Domingos P (2003) Trust management for the semantic web. In: International semantic web conference. Springer, pp 351–368
Rotter JB (1967) A new scale for the measurement of interpersonal trust. J Pers 35:651–665. https://doi.org/10.1111/j.1467-6494.1967.tb01454.x
Rousseau DM, Sitkin SB, Burt RS, Camerer C (1998) Not so different after all: a cross-discipline view of trust. Acad Manag Rev 23:393–404
Shekarpour S, Katebi SD (2010) Modeling and evaluation of trust with an extension in semantic web. J Web Semant 8:26–36. https://doi.org/10.1016/j.websem.2009.11.003
Sherchan W, Nepal S, Paris C (2013) A survey of trust in social networks. ACM Comput Surv 45:1–33. https://doi.org/10.1145/2501654.2501661
Song S, Hwang K, Zhou R (2005) Trusted P2P transactions with fuzzy reputation aggregation. IEEE Internet Comput 9:24–34
Su Z, Liu L, Li M et al (2013) ServiceTrust: trust management in service provision networks. In: Proceedings of the IEEE 10th international conference on services computing (SCC 2013). IEEE, pp 272–279
Su Z, Li M, Fan X et al (2014) Research on trust propagation models in reputation management systems. Math Probl Eng 2014. https://doi.org/10.1155/2014/536717
Thathachar MAL, Ramachandran KM (1985) Asymptotic behavior of a hierarchical system of learning automata. Inf Sci 35:91–110. https://doi.org/10.1016/0020-0255(85)90043-X
Thathachar MAL, Sastry PS (2002) Varieties of learning automata: an overview. IEEE Trans Syst Man Cybern Part B Cybern 32:711–722. https://doi.org/10.1109/TSMCB.2002.1049606
Vogiatzis G, MacGillivray I, Chli M (2010) A probabilistic model for trust and reputation. In: Proceedings of the 9th international conference on autonomous agents and multiagent systems (AAMAS '10), vol 1, pp 225–232
von Laszewski G, Alunkal B, Veljkovic I (2005) Toward reputable grids. Scalable Comput Pract Exp 6:95–106
Vydiswaran VGV, Zhai C, Roth D (2011) Content-driven trust propagation framework. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining—KDD '11. ACM, p 974
Wang G, Wu J (2011) Multi-dimensional evidence-based trust management with multi-trusted paths. Futur Gener Comput Syst 27:529–538. https://doi.org/10.1016/j.future.2010.04.015
Wishart R, Robinson R, Indulska J, Jøsang A (2005) SuperstringRep: reputation-enhanced service discovery. In: Proceedings of the twenty-eighth Australasian conference on computer science, vol 38. Australian Computer Society, pp 49–57
Witkowski J (2011) Trust mechanisms for online systems (extended abstract). In: IJCAI international joint conference on artificial intelligence, pp 2866–2867
Xiong L, Liu L (2004) PeerTrust: supporting reputation-based trust for peer-to-peer electronic communities. IEEE Trans Knowl Data Eng 16:843–857. https://doi.org/10.1109/TKDE.2004.1318566
Xiong L, Liu L (2003) A reputation-based trust model for peer-to-peer ecommerce communities (extended abstract). In: Proceedings of the 4th ACM conference on electronic commerce—EC '03. ACM Press, New York, p 228
Yang X, Guo Y, Liu Y, Steck H (2014) A survey of collaborative filtering based social recommender systems. Comput Commun 41:1–27. https://doi.org/10.1016/j.comcom.2013.06.009
Yin X, Zhang J, Wang X (2004) Sequential injection analysis system for the determination of arsenic by hydride generation atomic absorption spectrometry. Fenxi Huaxue 32:1365–1367
Yu B, Singh MP (2002) Distributed reputation management for electronic commerce. Comput Intell 18:535–549. https://doi.org/10.1111/1467-8640.00202
Zhang J, Cohen R (2008) Evaluating the trustworthiness of advice about seller agents in e-marketplaces: a personalized approach. Electron Commer Res Appl 7:330–340. https://doi.org/10.1016/j.elerap.2008.03.001
Zhang Y, Fang Y (2007) A fine-grained reputation system for reliable service selection in peer-to-peer networks. IEEE Trans Parallel Distrib Syst 18:1134–1145. https://doi.org/10.1109/TPDS.2007.1043
Zhou R, Hwang K (2007) PowerTrust: a robust and scalable reputation system for trusted peer-to-peer computing. IEEE Trans Parallel Distrib Syst 18:460–473. https://doi.org/10.1109/TPDS.2007.1021

Chapter 8

Social Recommender Systems

8.1 Introduction

Due to the incredible growth of information on the World Wide Web in recent years, searching for and finding content, products or services that may be of interest to users has become a very difficult task. Recommender systems (RSs) help overcome this information overload problem by studying the preferences of online users and suggesting items they might like. Many companies and websites have implemented these systems to recommend products, information and services to their users more accurately, thereby improving the company's profits.

8.2 Categorization of Recommender Systems

Recommender systems (RSs) have advanced in their ability to filter out unnecessary information and present the most relevant data to users. These systems make use of different information sources to provide users with item recommendations, and attempt to balance factors such as accuracy, diversity and novelty in their recommendations. Recommender systems can be generally categorized into content-based filtering (CB) (Lops et al. 2011; Martinez-Romo and Araujo 2012; Pera and Ng 2013; Protasiewicz et al. 2016) and collaborative filtering (CF) (Bobadilla et al. 2011, 2012a, b; Altingovde et al. 2013; Formoso et al. 2013; Choi et al. 2016; Wang et al. 2016). Content-based filtering recommends new items based on their similarity to the items already rated by the user. In the collaborative filtering approach, on the other hand, the rating of the user for a new item is predicted based on the past ratings of similar users. Collaborative filtering techniques play an important role in designing recommendation systems.



8.2.1 CF-Based Recommender Systems

Collaborative filtering is based on the way humans make decisions in real life: besides our personal experiences, we also base our decisions on the experiences of our acquaintances. In this technique, users are allowed to give ratings on a set of items (e.g. books, movies, music) in such a way that, once enough rating data is stored in the system, recommendations can be made to each user based on information provided by the users who have the most in common with him. Collaborative filtering (CF) (Adomavicius and Tuzhilin 2005) is a representative recommendation technique that has been widely used in recommender systems.

Generally, there exist two main approaches to CF: memory-based and model-based techniques (Su and Khoshgoftaar 2009; Ricci et al. 2011). Memory-based techniques (Resnick et al. 1994; Linden et al. 2003; Lemire 2005; Symeonidis et al. 2009) use similarity measures to predict the preference of a user for new or unrated items based on the user-item ratings stored in the system. These techniques are conceptually simple, easily implementable, and produce good-quality recommendations. In contrast, model-based techniques use rating data to learn a predictive model. These approaches scale better on large datasets than memory-based ones; however, they require expensive model-building processes and involve a trade-off between prediction performance and scalability. Widely used models include Bayesian classifiers (Park et al. 2007), neural networks (Roh et al. 2003), fuzzy systems (Yager 2003), genetic algorithms (Gao and Li 2008), latent features (Zhong and Li 2010) and matrix factorization (Luo et al. 2012).

Despite its significant success, the collaborative filtering technique suffers from some problems, including data sparsity, cold start and malicious users. While a huge number of items are available, users normally rate only a few of them. Therefore, the rating matrix usually has a high level of sparsity, and the number of items rated in common between each pair of users is not enough for similarity measures to accurately measure user similarities (Bobadilla and Serradilla 2009). The cold start problem (Leung et al. 2008; Rashid et al. 2008; Kim et al. 2011; Bobadilla et al. 2012b) refers to the situation where a new user has just entered the RS: the user cannot receive any personalized recommendations based on the CF technique, since he has not yet provided any rating in the system. Moreover, CF-based recommender systems can experience shilling attacks (Lam and Riedl 2004; Chirita et al. 2005), in which many positive ratings are generated for a product while the products from competitors receive negative ratings. Standard CF techniques are highly vulnerable to such attacks (O'Mahony et al. 2004).

In order to overcome the above-mentioned problems, researchers have proposed to use trust information rather than similarity between users in CF. The intuition is that in real life users rely more on recommendations from people they trust (Sinha and Swearingen 2001). However, in the recommendation context, trust must reflect user similarity to a certain extent in order to yield meaningful results (Ziegler and Lausen 2004). Previous studies have shown that incorporating trust networks into


recommender systems improves the quality of predictions and recommendations (Massa and Avesani 2004; Arazy et al. 2009; Carrer-Neto et al. 2012).
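To make the memory-based approach above concrete, the following is a minimal, illustrative sketch of user-based CF with Pearson similarity. The `ratings` structure, the function names and the top-k cutoff are assumptions for illustration, not part of the surveyed systems.

```python
# Minimal sketch of memory-based (user-based) CF: Pearson similarity over
# co-rated items, then a similarity-weighted deviation-from-mean prediction.
# `ratings` maps each user to a dict {item: rating}; names are illustrative.
from math import sqrt

def pearson_sim(ra, rb):
    """Pearson correlation over the items both users have rated."""
    common = set(ra) & set(rb)
    if len(common) < 2:
        return 0.0
    ma = sum(ra[i] for i in common) / len(common)
    mb = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    den = sqrt(sum((ra[i] - ma) ** 2 for i in common)) * \
          sqrt(sum((rb[i] - mb) ** 2 for i in common))
    return num / den if den else 0.0

def predict_rating(ratings, active, item, k=10):
    """Predict `active`'s rating of `item` from the top-k most similar users."""
    mean_a = sum(ratings[active].values()) / len(ratings[active])
    cands = [(pearson_sim(ratings[active], ratings[u]), u)
             for u in ratings if u != active and item in ratings[u]]
    neigh = sorted(cands, key=lambda t: t[0], reverse=True)[:k]
    num = den = 0.0
    for sim, u in neigh:
        mean_u = sum(ratings[u].values()) / len(ratings[u])
        num += sim * (ratings[u][item] - mean_u)
        den += abs(sim)
    return mean_a + num / den if den else mean_a
```

The sparsity and cold start problems discussed above show up directly in this sketch: when a pair of users shares fewer than two co-rated items, or when the active user has no ratings at all, the similarity degenerates and the prediction falls back to a bare mean.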

8.2.2 Trust-Based CF Recommender Systems

Trust is an important area of research in recommender systems (O'Donovan and Smyth 2005). Many social applications, such as FilmTrust (Golbeck and Hendler 2006) and Epinions (Epinions.com), provide a web of trust to allow users to express their trust in others. Users connected by a web of trust exhibit significantly higher similarity on items than non-connected users (Lee and Brusilovsky 2009). Therefore, social influences can play a more important role than the similarity of past ratings (Bonhard and Sasse 2006; Salganik et al. 2006). Previous studies have shown that trust information not only improves recommendation performance in terms of prediction accuracy and coverage via a trust propagation approach, but also mitigates some problems inherent in CF-based recommender systems (Golbeck 2006; He and Chu 2010). With trust information, the data sparsity problem can be alleviated, since it is no longer necessary to measure rating similarity in order to find like-minded users. In addition, for a cold start user with no past ratings, the recommender system can still make good recommendations using the preferences of his trusted neighbourhood. Moreover, since recommendations are based only on ratings provided by trusted users, it is possible to resist malicious users who try to influence the recommendation accuracy.

Trust-based CF techniques adopt one of two main approaches: incorporating trust as a replacement for user similarity, or using trust in combination with user similarity. Recommender systems based on the first approach (Massa and Avesani 2004; Avesani et al. 2005; Golbeck 2006) recommend new items to a user from his trusted users. Since only a few users specify their trust relations and most of them tend to provide no trust statements, trust networks are typically sparse; as a result, the data sparsity and cold start problems still remain in these systems. For this reason, a number of research works have attempted to address this issue by making recommendations based on both similarity and trust information (Yan et al. 2013; Guo et al. 2015; Moradi and Ahmadian 2015; Mao et al. 2017). The common idea in these works is to predict the rating of an active user on a target item by using the ratings from his similar users in addition to those from directly or indirectly trusted users. It has been shown that there exists a strong correlation between trust and user similarity when the trust network is tightly bound to a particular application (Ziegler and Golbeck 2007). As a result, similarity is used as an additional measure in determining the value of implicit trust between users in trust management systems (Uddin et al. 2008; Golbeck 2009; Shambour and Lu 2011, 2012; Bhuiyan 2013). The assumption is that users will consistently trust others who have similar preferences.

For instance, (Yan et al. 2013) developed a novel recommendation method, called CITG, based on a two-faceted web of trust. In their proposed method, a web of trust derived from implicit trust relations, called the interest similarity graph (ISG), is


constructed for an active user by measuring interest similarities between the user and the others. Then, another web of trust derived from explicit trust relations is formed by computing trust values between the active user and other directly or indirectly connected users in the trust network; the resulting web of trust is called the directed trust graph (DTG). Finally, the ISG and DTG are combined to generate a two-faceted web of trust that mitigates the sparsity and cold start problems. Authors in (Guo et al. 2015) proposed a multi-view clustering method which clusters users from the views of both user similarities and trust relations. Their intuition was that a cluster with few users fails to produce reliable rating predictions for a given item. They showed that the proposed method improves both recommendation accuracy and coverage. Work in (Moradi and Ahmadian 2015) presented a trust-aware collaborative filtering method based on reliability, called RTCF. In this method, an initial trust network for an active user is constructed by combining the trust statements and the similarity values. Using this trust network, an initial rating of an unseen item is predicted for the active user. After that, a trust-based reliability measure is used to evaluate the quality of the predicted rating, and based on this evaluation the trust network is reconstructed by removing useless users whose reliability value is lower than a predefined threshold. Finally, the new trust network is used to predict the final rating of the unseen item. In order to alleviate the rating sparsity problem, (Mao et al. 2017) considered different user relations (such as rating similarity and trust) in a multigraph and developed a multigraph ranking model to identify the nearest neighbours of an active user for recommendation purposes. The authors also proposed a random walk-based social network propagation model which is applied to every single-relational social network to enrich the original data of the network before constructing the multigraph. Finally, the user's closest neighbours are used to make the CF rating predictions for unseen items. Gohari et al. (2018) proposed a new confidence-based recommendation (CBR) approach which employs four different confidence models and derives users' and items' confidence values from both local and global perspectives. In this approach, the neighbourhood formation process for a user relies on both the implicit trust values between users and their interpretations of ratings. CBR predicts an active user's rating of a target item based on the most confident neighbours of the user.

While there has been a great deal of research focusing on trust information, only a few research works have investigated incorporating distrust in addition to trust into the recommendation process (Victor et al. 2011, 2013; Kant and Bharadwaj 2013). Researchers found that, besides trust relationships, distrust relationships between users are also unavoidable, and recommender systems can benefit from distrust information in social networks. Since, in real life, people tend to express their trust in others using linguistic expressions, several fuzzy models of trust and distrust have been presented in the literature (Bharadwaj and Al-Shamri 2009; Kant and Bharadwaj 2013; Hao et al. 2014; Ayadi et al. 2016). Due to the difficulties in designing fuzzy membership functions, the triangular membership function is most commonly used for representing various linguistic terms (Zadeh 1996). In most works on fuzzy modelling of trust and distrust, the number of membership functions for each of these fuzzy variables is


determined by expert opinion as a constant value, and the membership functions are often uniformly and symmetrically distributed on the values axis in terms of this fixed number. With uniformly and symmetrically distributed linguistic terms, the same discrimination levels exist on both sides of the values axis. However, there are cases in which the trust and distrust variables need to be assessed with unbalanced linguistic terms, i.e. linguistic terms that are not uniformly and symmetrically distributed. For example, a recommender system usually looks for the trusted users of a target user to make recommendations; that is, it mostly uses the opinions of users described by linguistic trust statements on the right of the values axis of trust. Thus, a more accurate recommendation can be achieved if a higher number of discrimination levels is assumed on the right of the axis. In such cases an expert needs to assess more terms on one side of the values axis than on the other, and in general it is extremely hard for an expert to determine the appropriate number and position of the fuzzy sets. Several research works in the literature optimize fuzzy membership functions (Simon 2005; Kaya and Alhajj 2006; Zhao and Li 2007; Omizegba and Adebayo 2009; Permana and Hashim 2010; Acilar and Arslan 2011; Huynh et al. 2012). However, to the best of our knowledge, none of these approaches tries to optimize the number of membership functions; they only adjust their positions on the values axis.

Authors in (Ghavipour and Meybodi 2016) addressed the above problem and proposed a method based on continuous action-set learning automata (CALA) for simultaneously optimizing the number and position of the membership functions for fuzzy trust and distrust, such that a recommender system can achieve the highest precision in user rating estimation. Learning automata have been shown to be a very useful tool for solving optimization problems (Beigy and Meybodi 2006; Akbari Torkestani and Meybodi 2012; Rezvanian and Meybodi 2017). In order to investigate the effect of membership function optimization on the performance of recommender systems, they employed their proposed method in the fuzzy trust-distrust enhanced recommender system proposed in (Kant and Bharadwaj 2013) and tested its performance over well-known datasets. The experimental results indicated that the proposed method, by providing fuzzy membership functions optimized with respect to the used dataset, improves the accuracy of recommendations in fuzzy recommender systems.

Another challenge in recommender systems that must be addressed is the time-dependent nature of similarity and trust. User interests may vary over time, which in turn changes the rating similarities between users. Trust is also dynamic and changes as time passes and users continue social interactions or observations such as rating common items (Staab et al. 2004). Only a few research works have considered the dynamicity of similarity and trust in their proposed RSs (Bedi and Sharma 2012; Yan et al. 2013). Bedi and Sharma (2012) presented a trust-based ant recommender system (TARS). This method creates an implicit trust network for each user based on the user-item rating matrix and provides recommendations for the active user by continuously updating the implicit trust between users and selecting the best neighbourhood using the ant colony metaphor. Authors in (Yan et al. 2013) considered the temporal nature of


both trust and similarity by dynamically updating their proposed two-faceted web of trust. In the ISG, the interest intensity of all edges is updated by new item ratings, while in the DTG, the trust intensity of all edges is updated at fixed intervals by referral feedback ratings. However, even these works either focus only on the temporal changes of similarity (Bedi and Sharma 2012) or, despite considering the dynamicity of both similarity and trust, update this information at fixed time intervals and not during the recommendation process (Yan et al. 2013).

Authors in (Ghavipour and Meybodi 2018a) addressed the two above-mentioned issues, namely the sparsity of trust networks and the dynamic nature of trust and similarity, and proposed a stochastic trust propagation-based method, called LTRS, which produces predictions for an active user by propagating trust through an enriched trust network. LTRS uses the similarity relations between users to enrich the trust network and, in this way, mitigates the sparsity problem of this network, which affects the accuracy and coverage of predictions. This method also models the enriched network as a stochastic graph, and continuously captures the temporal variations of edge weights during the recommendation process. The authors used the stochastic trust propagation algorithm DyTrust (Ghavipour and Meybodi 2018c), proposed in their previous work, to propagate trust along reliable paths in the enriched network.

LTRS differs from existing recommender systems in a number of ways. First, in contrast to methods such as TidalTrust (Golbeck 2005), TARS (Bedi and Sharma 2012) and CBR (Gohari et al. 2018), the algorithm LTRS exploits both user similarities and explicit trust relations for making recommendations, in order to mitigate the sparsity of the rating information and of the web of trust. Second, unlike existing models in the literature, LTRS propagates trust through an enriched network consisting of both implicit and explicit trust relations and, in this way, improves the coverage and accuracy of rating predictions. Finally, in comparison with methods such as TARS (Bedi and Sharma 2012) and CITG (Yan et al. 2013), LTRS addresses the dynamic nature of both trust and similarity by modelling the enriched network as a stochastic graph, and continuously captures their temporal variations during the recommendation process rather than at fixed intervals. Since users' interests vary with time, it is highly probable that the similarity and the intensity of trust between them change during the recommendation process, and therefore the item ratings predicted by recommender systems may become less relevant because of these variations. Using DyTrust, the algorithm LTRS not only continuously captures the temporal changes of trust and similarity, but also accelerates the propagation process. In order to validate the efficiency of the recommender system LTRS, the authors conducted comprehensive experiments on the well-known Epinions dataset. Their experimental results indicated that the proposed method significantly improves both the accuracy and the coverage of recommendations.


Fig. 8.1 Membership functions and parameters of fuzzy trust

8.3 Learning Automata Based Recommender Systems

In this section, two recommender systems proposed in the literature, CALA-OptMF (Ghavipour and Meybodi 2016) and LTRS (Ghavipour and Meybodi 2018a), which use learning automata to provide enhanced recommendations, are described in detail.

8.3.1 The Adaptive Fuzzy Recommender System CALA-OptMF

Work in (Ghavipour and Meybodi 2016) proposed a method based on continuous action-set learning automata (CALA), called CALA-OptMF, to optimize membership functions (MFs) in fuzzy modelling of trust and distrust in recommender systems. The authors described their proposed method CALA-OptMF with the goal of tuning the membership functions of fuzzy trust during the lifetime of a fuzzy trust-enhanced recommender system; the same approach can also be utilized to tune the membership functions of fuzzy distrust. With this in mind, CALA-OptMF aims to adjust the number and position of the membership functions of the trust linguistic terms. For this purpose, a set of $n$ linguistic terms denoted as $L = \{L_1, L_2, \ldots, L_n\}$, such as {high, medium, low}, is considered to describe the various levels of trust. The membership functions $F = \{F_1, F_2, \ldots, F_n\}$ for these linguistic terms are assumed to be triangular, where each triangular membership function $F_i$ is represented by a triple $(l_i, c_i, r_i)$ of its left endpoint, centre and right endpoint. The authors also assumed that the value of trust lies in the interval $[\tau_{min}, \tau_{max}]$. Except for the first and last membership functions $F_1$ and $F_n$, whose centres are fixed at $\tau_{min}$ and $\tau_{max}$ respectively, the centres of the other membership functions are the parameters of the optimization problem. As shown in Fig. 8.1, the interval of values for each centre parameter $c_i$ is $[l_i, r_i]$.

In the proposed method, a $CALA_i$ is first assigned to the centre parameter $c_i$ of each of the remaining $(n-2)$ membership functions (i.e. $F_2, F_3, \ldots, F_{n-1}$) to learn the optimal value of that parameter, and the parameters $\mu(0)$ and $\sigma(0)$ of the action probability distributions of all CALAs are initialized. Then, CALA-OptMF iteratively adjusts the membership functions of fuzzy trust according to the


actions selected by the CALAs and updates the action probability distributions of the CALAs based on the performance of the trust-enhanced RS until it finds the most appropriate membership functions. Iteration $t$ of the proposed method CALA-OptMF proceeds in the three following steps:

Step 1. MFs adjustment. This step adjusts the membership functions of fuzzy trust. For this purpose, all CALAs select an action in parallel based on their action probability distributions. The action chosen by each $CALA_i$, denoted $\alpha_i$, determines the value of its corresponding centre parameter $c_i$ on the values axis of trust, i.e. $F_i = (l_i, \alpha_i, r_i)$. Let $\acute{F}$ be the set of truly adjusted membership functions, which forms the output of this step. $\acute{F}$ initially includes the first membership function $F_1$ of $F$, namely $\acute{F} = \{(l_1, c_1, r_1)\}$. For each other membership function $F_i$ ($1 < i < n$) belonging to $F$, the following condition is checked:

$$\max(l_i, \acute{c}_k) \le c_i \le r_i$$

where $k$ is the size of $\acute{F}$ and $\acute{c}_k$ refers to the centre parameter of the $k$th membership function $\acute{F}_k$ added to the set $\acute{F}$. The above condition checks whether the centre parameter of the membership function $F_i$ lies within its specified interval ($c_i \in [l_i, r_i]$) and after the centre parameter of the latest membership function in $\acute{F}$; that is, whether $F_i$ has been truly adjusted and can be added to $\acute{F}$. Each membership function $F_i$ that satisfies this condition will be added to $\acute{F}$ as a new member if the distance between the centre parameters of $F_i$ and $\acute{F}_k$ is larger than the merge threshold $\varepsilon$, i.e. $c_i - \acute{c}_k > \varepsilon$; otherwise it will be merged with $\acute{F}_k$, resulting in the membership function $(\acute{l}_k, \mathrm{Avg}(c_i, \acute{c}_k), r_i)$, i.e. $\acute{F} \leftarrow \acute{F} \backslash \{(\acute{l}_k, \acute{c}_k, \acute{r}_k)\} \cup \{(\acute{l}_k, \mathrm{Avg}(c_i, \acute{c}_k), r_i)\}$, where the symbol $\backslash$ denotes the set of all elements in $\acute{F}$ except $\acute{F}_k = (\acute{l}_k, \acute{c}_k, \acute{r}_k)$. Finally, if $c_n - \acute{c}_k \le \varepsilon$, then $\acute{F}_k$ is merged with $F_n$ into $(\acute{l}_k, c_n, r_n)$; otherwise the membership function $F_n$ is added to $\acute{F}$.

Step 2. Performance evaluation. In this step, the set $\acute{F}$ constructed in the previous step is employed in a trust-enhanced RS, and the performance of this RS is measured in terms of an evaluation metric such as the mean absolute error (MAE). The authors referred to this evaluation metric as $f_{err}$ and used it for updating the parameters of the action probability distributions of the CALAs; that is, the function $f_{err}$ is taken as $\beta_{\alpha}(t)$ in the learning algorithm (for more details, see Sect. 1.2.6). $\beta_{\mu}(t)$ is computed similarly to $\beta_{\alpha}(t)$, with the difference that in Step 1 the centre parameter of each membership function $F_i$ ($1 < i < n$) in $F$ is set to the mean $\mu_i$ of the action probability distribution of its corresponding $CALA_i$ (rather than the action $\alpha_i$ chosen by $CALA_i$), i.e. $F_i = (l_i, \mu_i, r_i)$. The set $\acute{F}$ is


constructed again, this time in terms of the mean parameters, and the $f_{err}$ of the RS using this $\acute{F}$ is taken as $\beta_{\mu}(t)$.

Step 3. Stop condition. The MF adjustment and performance evaluation steps continue until, for all CALAs, $\mu(t)$ no longer changes noticeably and $\sigma(t)$ converges close to the lower bound $\sigma_L$. At that point, the set $\acute{F}$ constitutes the output of the proposed method.
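A schematic sketch of one CALA-OptMF iteration may help fix the ideas. The merge logic follows Step 1 above (trust is assumed to lie in [0, 1] with fixed end centres, and only centres are tracked); the CALA update shown is a deliberately simplified stand-in for the exact rule referenced in Sect. 1.2.6, and `evaluate_mae` stands for the trust-enhanced RS evaluation of Step 2.

```python
# Sketch of one CALA-OptMF iteration. Assumptions: trust values in [0, 1]
# (tau_min = 0, tau_max = 1), only MF centres are tracked, and a simplified
# CALA update is used in place of the exact rule of Sect. 1.2.6.
import random

EPS, SIGMA_L = 0.01, 0.001        # merge threshold and variance lower bound

def adjust_mfs(centres, bounds):
    """Step 1: keep centres that are in range and in order; merge a centre
    with the previous one (by averaging) when they are closer than EPS."""
    adjusted = [0.0]                           # centre of F1 fixed at tau_min
    for c, (lo, hi) in zip(centres, bounds):
        if not (max(lo, adjusted[-1]) <= c <= hi):
            continue                           # Fi is not truly adjusted
        if c - adjusted[-1] > EPS:
            adjusted.append(c)
        else:
            adjusted[-1] = (adjusted[-1] + c) / 2.0
    if 1.0 - adjusted[-1] <= EPS:
        adjusted[-1] = 1.0                     # merge last MF into Fn
    else:
        adjusted.append(1.0)                   # centre of Fn fixed at tau_max
    return adjusted

def cala_optmf_iteration(mu, sigma, bounds, evaluate_mae, lam=0.0006):
    """Steps 1-2: sample centres, evaluate the RS, nudge mu and sigma."""
    actions = [random.gauss(m, max(s, SIGMA_L)) for m, s in zip(mu, sigma)]
    beta_a = evaluate_mae(adjust_mfs(actions, bounds))   # f_err for actions
    beta_m = evaluate_mae(adjust_mfs(mu, bounds))        # f_err for means
    for i, a in enumerate(actions):
        mu[i] += lam * (beta_m - beta_a) * (a - mu[i])   # move towards better
        sigma[i] = max(SIGMA_L, sigma[i] - lam * abs(beta_m - beta_a))
    return mu, sigma
```

In this sketch, when the sampled centres yield a lower MAE than the current means, each mean is pulled towards its sampled action; the stop condition of Step 3 corresponds to the means stabilizing while the variances shrink to the lower bound.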

8.3.1.1 Complexity Analysis

In this subsection, we illustrate the analysis presented in (Ghavipour and Meybodi 2016) of the time complexity of a recommender system using their proposed method CALA-OptMF. As mentioned before, the membership function optimization based on CALA-OptMF can be done online during the lifetime of the recommender system. That is, at the beginning of each time period the CALA-OptMF method adjusts the membership functions serving as inputs to the recommender system. Then, the RS utilizes these membership functions to make recommendations for users. Finally, the recommendation error at the end of this period is used by CALA-OptMF to readjust the membership functions in the next period, and so on.

Accordingly, the time complexity of a recommender system using CALA-OptMF in each time period is computed as follows. In the MFs adjustment step, all CALAs are first activated in parallel to choose an action based on their action probability distributions, which takes O(1) time. The action chosen by each CALA determines the value of its corresponding centre parameter. Then, the membership functions that satisfy the predefined conditions are added to the output set in O(n) time, where n is the number of fuzzy sets. This step is repeated once more, this time with the mean parameter of each CALA taken as its corresponding centre parameter. Hence, the time taken by the MFs adjustment step is O(n). In the performance evaluation step, the RS predicts the rating for each target user twice, once for each of the two output sets of membership functions. Note that the recommendations for the target user are made based on the ratings predicted in terms of the output MF set obtained from the CALAs' actions; therefore, the target user suffers no extra delay in receiving his recommendations. With the assumption that a rating prediction by the RS takes O(x) time, the time required for making recommendations for k users in this period is O(kx). Based on the recommendation error, CALA-OptMF updates the parameters of the action probability distributions of all CALAs in parallel in O(1) time. Hence, the performance evaluation step has a time complexity of O(kx). Since n is obviously much smaller than k (i.e. n ≪ k), the total time taken by the recommender system using CALA-OptMF is O(kx), which equals the time complexity of the recommender system without CALA-OptMF. Thus, the membership function optimization based on CALA-OptMF does not influence the time complexity of any recommender system in which it is utilized.

8.3.1.2 Fuzzy Trust-Distrust Enhanced Recommender System: A Case Study

In this subsection, we briefly describe the recommender system that was used in (Ghavipour and Meybodi 2016) to show the efficiency of their proposed method for the adaptation of fuzzy membership functions. Authors in (Kant and Bharadwaj 2013) proposed a recommender system exploiting both the trust and distrust concepts to enhance the quality of recommendations. They developed fuzzy computational models for trust and distrust using similarity and knowledge factors based on rating data. The similarity factor refers to the mechanism in which trust and distrust are computed from social similarities and dissimilarities, such as interests in common items (Zucker 1986); there is a positive and significant correlation between trust and interest similarity, meaning that the more similar two users are, the greater the value of trust between them (Ziegler and Golbeck 2007; Golbeck 2009). The knowledge factor refers to a mechanism in which users get to know each other through repeated interactions and predict each other's future behaviour based on the information obtained from these interactions (Lu et al. 2010). People usually tend to trust others they are familiar with through interactions (Gefen et al. 2003); that is, the better one user knows another, the better he can trust what the other will do in most situations. The knowledge factor provides the most relevant and reliable information for measuring social trust (Abdul-Rahman and Hailes 2000). Based on their proposed computational models, the total trust and distrust values from user $u_i$ to user $u_j$ are computed by the following formulae:

$$Trust_{u_i u_j} = \frac{w_1 \cdot Trust^1_{u_i u_j} + w_2 \cdot Trust^2_{u_i u_j}}{w_1 + w_2} \quad (8.1)$$

$$Distrust_{u_i u_j} = \frac{w_1 \cdot Distrust^1_{u_i u_j} + w_2 \cdot Distrust^2_{u_i u_j}}{w_1 + w_2} \quad (8.2)$$

where $w_1$ and $w_2$ are weight coefficients, $Trust^1_{u_i u_j}$ and $Distrust^1_{u_i u_j}$ are the trust and distrust values from $u_i$ to $u_j$ computed on the basis of the similarity factor, and $Trust^2_{u_i u_j}$ and $Distrust^2_{u_i u_j}$ are the trust and distrust values obtained on the basis of the knowledge factor.

The computed values of trust and distrust are fuzzified into seven triangular membership functions uniformly and symmetrically distributed along the values axis, as shown in Fig. 8.2. In Fig. 8.2a, the seven fuzzy sets are: no trust (NT), very low trust (VLT), low trust (LT), average trust (AT), high trust (HT), very high trust (VHT), and complete trust (CT); in Fig. 8.2b, the seven fuzzy sets represent: no distrust (ND), very low distrust (VLD), low distrust (LD), average distrust (AD), high distrust (HD), very high distrust (VHD), and complete distrust (CD).

Fig. 8.2 Membership functions for fuzzy a trust, and b distrust

Finally, a recommendation strategy is utilized to provide enhanced recommendations. The authors evaluated several recommendation strategies; since, based on their experimental results, the recommendation strategy called hybrid fuzzy trust–distrust CF performed best, the authors in (Ghavipour and Meybodi 2016)

considered this strategy. The hybrid fuzzy trust–distrust CF strategy combines two other proposed recommendation strategies, modified Pearson CF and scheme 8-CF, and computes the final rating prediction of user $u_i$ on item $I_k$ as

$$Pred12(u_i, I_k) = \begin{cases} 0 & \text{if } P^{td}_{u_i,I_k} = 0 \text{ and } P^{cf}_{u_i,I_k} = 0\\ P^{td}_{u_i,I_k} & \text{if } P^{cf}_{u_i,I_k} = 0\\ P^{cf}_{u_i,I_k} & \text{if } P^{td}_{u_i,I_k} = 0\\ \dfrac{2 \, P^{cf}_{u_i,I_k} \, P^{td}_{u_i,I_k}}{P^{cf}_{u_i,I_k} + P^{td}_{u_i,I_k}} & \text{otherwise} \end{cases} \quad (8.3)$$

where $P^{cf}_{u_i,I_k}$ is the rating of user $u_i$ on item $I_k$ predicted by the scheme modified Pearson CF, and $P^{td}_{u_i,I_k}$ is the rating predicted by the scheme 8-CF; the last case is the harmonic mean of the two predictions. The scheme modified Pearson CF uses the most similar top-k neighbours of the target user $u_i$ for generating recommendations, where the similarity between users is computed based on their ratings. This strategy predicts the rating of an active user $u_i$ for an item $I_k$ as

$$P^{cf}_{u_i,I_k} = \bar{r}_{u_i} + \frac{\sum_{u \in N} Sim(u, u_i) \cdot (r_{u,I_k} - \bar{r}_u)}{\sum_{u \in N} |Sim(u, u_i)|} \quad (8.4)$$

where $Sim(u, u_i)$ denotes the similarity between the active user $u_i$ and the user $u$ belonging to the set $N$ of the most similar top-k neighbours, $\bar{r}_{u_i}$ represents the mean rating of $u_i$, and $r_{u,I_k}$ is the rating for the item $I_k$ provided by the user $u$. In contrast, scheme 8-CF utilizes fuzzy trust and distrust information in the recommendation process. In this strategy, the rating of an item $I_k$ for an active user $u_i$ is predicted as

$$P^{td}_{u_i,I_k} = \bar{r}_{u_i} + \frac{\sum_{u \in (S-D)} w_u \cdot (r_{u,I_k} - \bar{r}_u)}{\sum_{u \in (S-D)} w_u} \quad (8.5)$$

where $D$ denotes the set of outliers in the trust-distrust network; an outlier occurs when a user A trusts a user B highly, B in turn trusts a user C highly, but A distrusts C completely. In this equation, $S$ represents the set of trusted neighbours who are above the low-trustworthy level, i.e. average-trustworthy, high-trustworthy, very-high-trustworthy, and complete-trustworthy neighbours. The weights $w_u$ associated with these neighbours are determined based on the cores of the corresponding triangular fuzzy sets.

In order to examine the efficiency of their proposed method CALA-OptMF, the authors employed it in the fuzzy recommender system described above. Note that the proposed method can be utilized in any other fuzzy recommender system without changes.
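The combination rule of Eq. (8.3) and the aggregation of Eq. (8.5) can be sketched compactly as follows; the two component predictors and the input structures (weight/rating triples) are assumed interfaces for illustration.

```python
# Sketch of the hybrid fuzzy trust-distrust CF combination, Eq. (8.3).
def pred12(p_cf, p_td):
    """Combine the modified Pearson CF prediction p_cf (Eq. 8.4) and the
    scheme 8-CF prediction p_td (Eq. 8.5); 0 means 'no prediction'."""
    if p_cf == 0 and p_td == 0:
        return 0.0                    # neither scheme could predict
    if p_cf == 0:
        return p_td
    if p_td == 0:
        return p_cf
    return 2 * p_cf * p_td / (p_cf + p_td)   # harmonic mean of the two

def scheme_8cf(mean_active, neighbours):
    """Eq. (8.5). `neighbours` is a list of (w_u, r_uIk, mean_u) triples for
    trusted users in S - D, with w_u taken from the fuzzy-set cores."""
    den = sum(w for w, _, _ in neighbours)
    if den == 0:
        return 0.0
    num = sum(w * (r - m) for w, r, m in neighbours)
    return mean_active + num / den
```

The harmonic mean in the last case of `pred12` stays close to the smaller of the two component predictions, so neither the similarity-based nor the trust-based scheme can dominate the final rating.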

8.3.1.3 Experimental Evaluation

In this subsection, we report the experimental results presented in (Ghavipour and Meybodi 2016) investigating the performance of the proposed method CALA-OptMF for improving the recommendation accuracy in fuzzy recommender systems. For this purpose, the authors considered the trust-distrust enhanced recommender system proposed in (Kant and Bharadwaj 2013) and employed their proposed method in this system to adjust the number and position of the fuzzy membership functions for both trust and distrust. They tested the recommender system using their proposed method over the well-known datasets MovieLens and Flixster.

Datasets

MovieLens. MovieLens is a CF-based recommender system and virtual community website created by GroupLens Research (GroupLens.org) that recommends movies to users based on their film preferences. The MovieLens dataset contains 100,000 ratings provided by 943 users for 1682 movies, such that each user has rated at least 20 movies. The ratings are on a numerical five-star scale, i.e. 1: bad, 2: average, 3: good, 4: very good, and 5: excellent.

Flixster. Flixster is a social movie website that allows users to share movie ratings, discover new movies and meet others with similar movie taste. The Flixster dataset contains movie ratings from the Flixster website. It has 786,936 users, 48,794 items and 8,196,077 ratings, where each movie rating is a discrete value in the range [0.5, 5] with a step size of 0.5.

Experimental Setup

In the training phase, the proposed method CALA-OptMF operates as follows in order to adapt the membership functions of fuzzy trust and distrust. At each iteration, CALA-OptMF selects 50 users randomly from the used dataset and evaluates the accuracy of the recommender system, using the fuzzy sets adjusted by the CALAs for both trust and distrust, in predicting the ratings of these users. Evaluation is done using the leave-one-out method, where one rating is taken out and then compared with the rating predicted using the rest of the user ratings. After that, the parameters of the action probability distributions of the CALAs are updated based on the mean absolute error (MAE) measure (as the evaluation metric $f_{err}$), which is computed as the deviation of the predicted ratings from the true ratings provided by users. This process is repeated until the stop conditions are satisfied for all CALAs for both trust and distrust. In the test phase, a mini dataset constructed by randomly selecting 50 users from the used dataset is utilized. To prevent overlapping, this mini dataset is constructed before the training phase starts, and the remainder of the dataset is used for training. Finally, the recommender system using the optimized membership functions of fuzzy trust and distrust is tested on the mini dataset.
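The leave-one-out loop used as the training-phase evaluation can be sketched as follows; `predict` stands for the recommender's rating predictor and is an assumed interface, not part of the original description.

```python
# Sketch of leave-one-out MAE over the sampled users: each known rating is
# hidden in turn, re-predicted, and the absolute errors are averaged.
def leave_one_out_mae(ratings, users, predict):
    errors = []
    for u in users:
        for item, true_rating in list(ratings[u].items()):
            held_out = ratings[u].pop(item)    # temporarily hide the rating
            pred = predict(u, item)
            ratings[u][item] = held_out        # restore it
            if pred is not None:
                errors.append(abs(pred - true_rating))
    return sum(errors) / len(errors) if errors else float("inf")
```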

Experimental Results

In order to demonstrate the performance of the proposed method CALA-OptMF, the authors conducted several experiments on the well-known datasets described above. In the proposed method, they set the merge threshold $\varepsilon$ to 0.01, the lower bound $\sigma_L$ to 0.001, and the minimum and maximum values $\tau_{min}$ and $\tau_{max}$ of trust/distrust to 0 and 1 respectively. The number of fuzzy sets for both trust and distrust was assumed to be 7, as considered in (Kant and Bharadwaj 2013). Therefore, CALA-OptMF needs five CALAs for each of the fuzzy variables trust and distrust to learn the optimal values of the centre parameters of the five membership functions $F_2$–$F_6$. In the experiments, the average results of 30 independent runs are reported. In each run, the authors utilized a mini dataset including 50 users randomly selected from the used dataset to evaluate the performance of the fuzzy recommender system; it is ensured that the recommender system with and without the proposed method CALA-OptMF uses the same mini dataset.

Experiment I

The first experiment was conducted to study the impact of different strategies for adjusting the learning parameter $\lambda$ on the performance of the proposed method CALA-OptMF. The learning parameter $\lambda$ introduces a trade-off between the convergence rate and the steady-state error of the estimated parameter. The authors considered the four following strategies and, for each strategy, reported the MAE of the recommender system using CALA-OptMF on the used datasets in Table 8.1. Note that in all strategies all the CALAs use the same learning rate.

Strategy 1. In this strategy, the learning rate is considered to be a fixed value. The experiment was conducted for values ranging from 0.0001 to 0.0009.

Strategy 2. In this strategy, the learning parameter $\lambda$ varies according to the following equation:

$$\lambda(t) = a - b \left\lfloor \frac{t}{c} \right\rfloor \quad (8.6)$$


Table 8.1 MAE for different strategies

Strategy     Parameter setting                      MovieLens   Flixster
Strategy 1   λ = 0.0001                             0.8421      1.0473
             λ = 0.0003                             0.8419      1.0467
             λ = 0.0006                             0.8418      1.0456
             λ = 0.0009                             0.8452      1.0478
Strategy 2   a = 0.0003, b = 0.0001, c = 1000       0.8417      1.0441
             a = 0.0006, b = 0.0001, c = 500        0.8417      1.0419
             a = 0.0009, b = 0.0001, c = 100        0.8418      1.0534
Strategy 3   a = 0.0002, b = 0.00003                0.8416      1.0419
             a = 0.0005, b = 0.00004                0.8417      1.0415
             a = 0.0006, b = 0.0001                 0.8415      1.0346
Strategy 4   a = 0.0002, b = 0.00003                0.8417      1.0372
             a = 0.0005, b = 0.00004                0.8417      1.0413
             a = 0.0006, b = 0.0001                 0.8418      1.0419

where $a$ and $b$ are real values in the interval [0, 1], $c > 0$ is an integer value, and $t$ is the iteration number. Based on the above equation, the learning parameter $\lambda$ takes an initial value $a$ and is decremented by $b$ every $c$ iterations. In this experiment, the authors tested various settings for the three parameters $a$, $b$ and $c$.

Strategy 3. In this strategy, the learning parameter $\lambda$ at iteration $t$ is computed as

$$\lambda(t) = a \cdot f_{err}(t) - b \quad (8.7)$$

where $a$ and $b$ take real values in the interval [0, 1] and $f_{err}(t)$ is the prediction error of the recommender system at time $t$. In this experiment, the authors examined different values for $a$ and $b$.


Table 8.2 Different initial assignments for mean and variance parameters

Case   μ2(0)    μ3(0)    μ4(0)    μ5(0)    μ6(0)    σ2(0)    σ3(0)    σ4(0)    σ5(0)    σ6(0)
1      0.1667   0.3333   0.5000   0.6667   0.8333   0.0500   0.0500   0.0500   0.0500   0.0500
2      0.1780   0.2626   0.4639   0.5096   0.7422   0.0299   0.0401   0.0484   0.0436   0.0240
3      0.0691   0.3199   0.4120   0.7197   0.8603   0.0236   0.0302   0.0155   0.0426   0.0200
4      0.1620   0.2264   0.5118   0.7003   0.9008   0.0334   0.0380   0.0160   0.0197   0.0346
5      0.2154   0.3974   0.5703   0.6552   0.8387   0.0190   0.0456   0.0203   0.0472   0.0289

Strategy 4. This strategy is similar to Strategy 3; the only difference is that the square of $f_{err}(t)$ is used for updating the parameter $\lambda$ at iteration $t$:

$$\lambda(t) = a \cdot f_{err}^2(t) - b \quad (8.8)$$
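The four schedules can be summarized side by side; the default parameter values shown are examples from Table 8.1, and `f_err` is the prediction-error function of the recommender system.

```python
# The four learning-rate schedules of Experiment I (illustrative values).
def lam_strategy1(t, lam=0.0006):
    return lam                          # Strategy 1: fixed learning rate

def lam_strategy2(t, a=0.0006, b=0.0001, c=500):
    return a - b * (t // c)             # Eq. (8.6): decrement b every c steps

def lam_strategy3(t, f_err, a=0.0006, b=0.0001):
    return a * f_err(t) - b             # Eq. (8.7): track the current error

def lam_strategy4(t, f_err, a=0.0006, b=0.0001):
    return a * f_err(t) ** 2 - b        # Eq. (8.8): squared current error
```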

In this experiment, the initial values of the mean parameters of the action probability distributions for $CALA_2$ to $CALA_6$ related to the trust variable were set to 0.1667 through 0.8333 in steps of 0.1667 (this setting is adopted from the values of the centre parameters of the fuzzy sets in Fig. 8.2). The same setting is used for the CALAs related to the distrust variable. The variance parameter of all the CALAs is initialized to 0.05. From the results in Table 8.1, we can see that varying the parameter $\lambda$ along the learning phase decreases the MAE error on both datasets. Among the three strategies that vary $\lambda$ during learning, Strategy 3 with $a = 0.0006$ and $b = 0.0001$ provides the best results.

Experiment II

In this experiment, the authors investigated the impact of different initial values for the mean and variance parameters of the action probability distributions of the CALAs on the performance of the proposed method when it uses Strategy 3 with $a = 0.0006$ and $b = 0.0001$ for updating the parameter $\lambda$ (the best setting for $\lambda$ as reported in the previous experiment). Since each centre parameter $c_i$ ($i = 2, \ldots, 6$) can take values in the interval $[(i-2)/6, i/6]$, initial values for the mean parameter $\mu_i$ are generated randomly in this interval. They also generated initial values for each variance parameter $\sigma_i$ randomly in the interval [0.01, 0.05]. The different initial assignments for the mean and variance parameters of the CALAs related to the trust variable are shown in Table 8.2; the same initial assignments are utilized for the mean and variance parameters of the CALAs related to the distrust variable. Note that Case 1 is the initial assignment considered in the previous experiment. For each dataset, CALA-OptMF is run with the different initial assignments and its output membership functions are utilized in the recommender system. The MAE of the recommender system on the two datasets is reported in Table 8.3.

Table 8.3 MAE for different initial assignments

Case   MovieLens   Flixster
1      0.8415      1.0346
2      0.8302      1.0576
3      0.8416      1.0419
4      0.8415      1.0296
5      0.8636      1.0274

Fig. 8.3 Optimized membership functions of fuzzy a trust, and b distrust for MovieLens

Fig. 8.4 Optimized membership functions of fuzzy a trust, and b distrust for Flixster

As shown in Table 8.3, the best result is obtained for MovieLens by using Case 2 and for Flixster by using Case 5. Figures 8.3 and 8.4 plot the optimized membership functions of fuzzy trust and distrust for MovieLens (Case 2) and Flixster (Case 5), respectively.

Experiment III

In this experiment, the authors compared the proposed method CALA-OptMF with the method reported in (Kant and Bharadwaj 2013), in which the fuzzy membership functions do not vary during the lifetime of the recommender system. CALA-OptMF uses the best parameter settings obtained from the previous experiments. The MAE and coverage of the recommender system with and without CALA-OptMF (the latter using the predefined membership functions given in Fig. 8.2) are shown in Table 8.4. Coverage measures the percentage of items for which an RS is able to generate predictions. According to the results,

Table 8.4 MAE of the recommender system with and without using CALA-OptMF

             Without CALA-OptMF      With CALA-OptMF
Dataset      MAE      Coverage       MAE      Coverage      Num. iterations
MovieLens    0.8544   0.9446         0.8302   0.9446        2235
Flixster     1.1522   0.5728         1.0274   0.5728        1103

Fig. 8.5 Box plot of the MAE error on a MovieLens and b Flixster

According to the results, CALA-OptMF, by providing membership functions optimized for the dataset in use, improves the accuracy of recommendations. However, using CALA-OptMF does not change the recommendation coverage. This is because the coverage metric depends on the algorithm of the recommender system, and CALA-OptMF does not change that algorithm; it only optimizes the input fuzzy sets. The number of iterations needed by CALA-OptMF to optimize the fuzzy sets of trust and distrust for each of the two datasets is also given in Table 8.4. As noted in Sect. 8.3.1.1, the membership function optimization is performed online during the lifetime of the recommender system and does not influence the time complexity of the system. Figure 8.5 summarizes the performance of the recommender system with and without CALA-OptMF over the used datasets in terms of the MAE error. The distribution of the MAE errors of 30 independent runs is represented in box plot format, which visualizes the median, upper quartile, lower quartile and outliers of each distribution of the experimental data. From this figure, we can see that on both datasets the recommender system using CALA-OptMF has a lower MAE error. In order to investigate the statistical significance of the obtained results, this experiment uses a t-test to compare the MAE of the recommender system with and without the proposed method CALA-OptMF. The test results for the confidence level 0.95 are summarized in Table 8.5.


Table 8.5 The results of the statistical test in terms of MAE

Dataset      Mean MAE (x − y)   T-value      P-value        Performance
MovieLens    −0.0132            −14.2468     4.9920E−17     √
Flixster     −0.1248            −33.1408     1.3067E−35     √

In this table, the first column lists the used datasets, and the test results are shown in the remaining columns. The authors refer to the recommender system with and without CALA-OptMF as x and y, respectively. x statistically performs better than y if the mean MAE (x − y), and thus the T-value, is negative and the P-value is smaller than 0.05. The symbols "√" and "×" in the column labelled "Performance" indicate whether or not x outperforms y, respectively. According to the results, the recommender system achieves higher accuracy on both datasets by using the proposed method CALA-OptMF.

8.3.2 Stochastic Trust Propagation-Based Recommender System LTRS

In this section, we first introduce the basic notation used by the authors in (Ghavipour and Meybodi 2018a) for describing their proposed recommender system. Typically, in a recommender system there exists a set of users U = {u_1, u_2, ..., u_N} and a set of items I = {i_1, i_2, ..., i_M}. Each user u_i specifies his preferences by rating a subset I_i of the items. The rating of a user u_i on an item i_k is denoted by r_{i,k}. The task of a recommender system is as follows: given an active user u_s ∈ U and a target item i_d ∈ I such that i_d ∉ I_s (i.e., r_{s,d} is unknown), the recommender system predicts the rating of u_s on i_d, denoted by r̂_{s,d}, based on the existing ratings. In the following, we describe the proposed recommender system LTRS, which is based on stochastic trust propagation, in two phases: constructing an enriched trust network and generating predictions for the active user.

Phase I: Constructing an Enriched Trust Network
A trust network is defined as a weighted digraph G^T = (U, E^T, T), where U = {u_1, u_2, ..., u_N} represents the set of N nodes corresponding to users, E^T = {e^T_{ij} | u_i, u_j ∈ U} ⊆ U × U is the set of M directed edges or trust relations connecting users, and T = {t_{ij} | e^T_{ij} ∈ E^T} denotes the set of edge weights or trust values between users. If user u_i trusts user u_j, then there is a directed trust relation e^T_{ij} ∈ E^T from u_i to u_j whose weight t_{ij} ∈ T refers to the value of this trust. Since trust is dynamic, G^T is modelled as a stochastic graph in which the trust weights are random variables. The authors assumed that each trust weight t_{ij} takes real values in the range [0, 1], with 0 referring to no trust and 1 to full trust.

Although explicit trust-based recommender systems are characterized by high prediction coverage and accuracy, trust networks are typically sparse, and this affects the quality of recommendations. In order to alleviate this problem, the authors enriched the trust network G^T by adding implicit trust relations among users. Considering the strong correlation between trust and similarity, the implicit trust between two users can be computed using their interest similarity. That is, the two notions of implicit trust and similarity refer to the same concept and so may be used interchangeably. Based on this, the authors computed the interest similarities between each pair of users and constructed a similarity network which, in combination with the trust network G^T, mitigates the sparsity of both networks.

Let a similarity (implicit trust) network be a weighted digraph G^S = (U, E^S, S) with the user set U, the set of similarity relations E^S = {e^S_{ij} | u_i, u_j ∈ U} ⊆ U × U, and the set of similarity weights S = {s_{ij} | e^S_{ij} ∈ E^S}, such that if two users u_i and u_j have some commonly rated items with positive correlation, then there exist two directed similarity relations e^S_{ij}, e^S_{ji} ∈ E^S between them, with weights s_{ij}, s_{ji} ∈ S equal to the value of the interest similarity sim(i, j). Considering the dynamic nature of similarity, G^S is also a stochastic graph in which the similarity weights are random variables. The authors used the Pearson correlation coefficient to compute the similarity between two users u_i and u_j in terms of the items rated in common by them up to the current time:

corr(i, j) = \frac{\sum_{i_k \in I_{i,j}} (r_{i,k} - \bar{r}_i)(r_{j,k} - \bar{r}_j)}{\sqrt{\sum_{i_k \in I_{i,j}} (r_{i,k} - \bar{r}_i)^2}\,\sqrt{\sum_{i_k \in I_{i,j}} (r_{j,k} - \bar{r}_j)^2}} \qquad (8.9)

where r_{i,k} is the rating of user u_i for item i_k, I_{i,j} = I_i ∩ I_j is the set of items that the two users rated in common, and r̄_i and r̄_j denote the averages of the ratings given by u_i and u_j, respectively. The value of corr(i, j) lies in the range [−1, 1]. Since zero and negative correlations indicate that the ratings expressed by two users are uncorrelated or correlated in opposite directions, they are not useful for our purpose, and no edges are added for them to the similarity network G^S. The Pearson correlation coefficient measures the extent to which two users u_i and u_j with similar preferences linearly relate to each other. However, it does not determine the degree of confidence that user u_i should have in u_j, and vice versa. The authors considered the similarity confidence to be a sigmoid function of the number of items commonly rated by u_i and u_j (Jamali and Ester 2009) and computed the interest similarity between these two users as follows:


sim(i, j) = \frac{1}{1 + e^{-|I_{i,j}|/2}} \times corr(i, j) \qquad (8.10)

where corr(i, j) > 0 and |I_{i,j}| denotes the size of the set I_{i,j}. Using the sigmoid function, the authors avoid favouring the size of I_{i,j} too much and keep the similarity value sim(i, j) in the range [0, 1]. Once the rating similarities between all pairs of users have been computed, two similarity relations, one in each direction, are added to G^S for each pair with positive correlation; in this way, the similarity network G^S is constructed. The authors then combined the similarity and trust networks to generate an enriched trust network G(U, E, W), where U is the user set, E = E^S ∪ E^T denotes the set of implicit and explicit trust relations such that there is a directed relation e_{ij} ∈ E from user u_i to user u_j if u_i trusts u_j in G^T (i.e., e^T_{ij} ∈ E^T), or u_i and u_j are positively correlated in G^S (i.e., e^S_{ij} ∈ E^S), or both, and W = {w_{ij} | e_{ij} ∈ E} is the set of integrated edge weights such that each weight w_{ij} is a random variable whose value equals com_t(i, j), a combination of similarity and explicit trust from u_i to u_j at time t, computed as

com_t(i, j) =
\begin{cases}
\frac{2\sigma_{ij}(t)\,\tau_{ij}(t)}{\sigma_{ij}(t) + \tau_{ij}(t)} & \text{if } \sigma_{ij}(t) \neq 0 \text{ and } \tau_{ij}(t) \neq 0 \\
\tau_{ij}(t) & \text{else if } \sigma_{ij}(t) = 0 \text{ and } \tau_{ij}(t) \neq 0 \\
\sigma_{ij}(t) & \text{else if } \sigma_{ij}(t) \neq 0 \text{ and } \tau_{ij}(t) = 0 \\
0 & \text{else}
\end{cases} \qquad (8.11)

where σ_{ij}(t) (resp. τ_{ij}(t)) is the expected value of the implicit (resp. explicit) trust from user u_i to user u_j, computed as given in Eqs. (8.12) and (8.13). The value of com_t(i, j) lies in the interval [0, 1]. The advantage of using the harmonic mean is that it is robust to large differences among its inputs, so that high values are obtained only when both the expected similarity and the expected trust are high. The harmonic mean has been widely used in the literature (O'Donovan and Smyth 2005; Bedi and Sharma 2012; Yan et al. 2013; Guo et al. 2015) to integrate similarity and trust.

\sigma_{ij}(t) = \frac{\sum_{l=1}^{|\vartheta_{ij}(t)|} \lambda^{|\vartheta_{ij}(t)|-l}\, \vartheta_{ij}^{l}(t)}{\sum_{l=1}^{|\vartheta_{ij}(t)|} \lambda^{|\vartheta_{ij}(t)|-l}} \qquad (8.12)

\tau_{ij}(t) = \frac{\sum_{l=1}^{|\omega_{ij}(t)|} \lambda^{|\omega_{ij}(t)|-l}\, \omega_{ij}^{l}(t)}{\sum_{l=1}^{|\omega_{ij}(t)|} \lambda^{|\omega_{ij}(t)|-l}} \qquad (8.13)

In the above equations, ϑ_{ij}(t) (resp. ω_{ij}(t)) denotes the set of similarity (resp. trust) weights observed on the relation e^S_{ij} (resp. e^T_{ij}) until the current time, and ϑ^l_{ij}(t) (resp. ω^l_{ij}(t)) refers to the l-th member of this set. Since more recent observations should be given relatively greater weight, the authors used the decay factor λ ∈ [0, 1] to control the rate at which old similarity and trust weights are discounted.
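As a minimal sketch (not the authors' implementation), the three quantities above can be computed as follows. The function names are illustrative, the observation lists are assumed to be ordered from oldest to newest, and the decay factor defaults to the λ = 0.9 used later in the experiments.

```python
import math

def interest_similarity(corr_ij, num_common):
    """Eq. (8.10): sigmoid-damped Pearson similarity; assumes corr_ij > 0."""
    return corr_ij / (1.0 + math.exp(-num_common / 2.0))

def decayed_mean(observations, decay=0.9):
    """Eqs. (8.12)/(8.13): decay-weighted mean of the observed weights,
    giving the most recent observation the largest weight (decay**0)."""
    if not observations:
        return 0.0
    n = len(observations)
    num = sum(decay ** (n - l) * w for l, w in enumerate(observations, 1))
    den = sum(decay ** (n - l) for l in range(1, n + 1))
    return num / den

def combined_weight(sigma, tau):
    """Eq. (8.11): harmonic mean of expected similarity and expected trust,
    falling back to whichever of the two is available."""
    if sigma != 0 and tau != 0:
        return 2 * sigma * tau / (sigma + tau)
    return tau if sigma == 0 else sigma  # also covers sigma = tau = 0
```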


Fig. 8.6 Description of how to construct an enriched trust network: a Trust network, where any directed edge indicates who trusts whom; b Similarity network, where there are two directed edges, one in each direction, between any two positively correlated users; c Enriched trust network, where there is a directed edge from u_i to u_j if u_i trusts u_j, or u_i is positively correlated to u_j, or both; d To predict the rating of target item i_k, a node i_k is temporarily added to the enriched trust network, and the users u_2 and u_5, who have already rated i_k, are connected to the new node, each by a temporary directed edge

Phase II: Generating Predictions for the Active User
In order to predict the rating of an active user u_s on a target item i_d, the authors temporarily add a node i_d to the enriched trust graph and connect each user u_i who has already rated the item i_d to the new node by a directed edge e_{id} whose weight is the rating value r_{i,d}. In this way, the rating prediction problem is converted to a trust inference problem and solved by propagating trust along reliable paths from u_s to i_d. Figure 8.6 illustrates the different steps of the proposed system LTRS using a simple example. LTRS uses the stochastic trust propagation algorithm DyTrust, proposed in our previous work (Ghavipour and Meybodi 2018c), which exploits learning automata to more efficiently discover reliable trust paths and, at the same time, capture the temporal changes of the edge weights during the propagation process. The stochastic enriched trust network G(U, E, W), the active user u_s and the indirectly connected target i_d form the inputs to DyTrust, and its output is the predicted rating r̂_{s,d}. Let N_d be the set of users rating the target item i_d, referred to as the direct neighbours of i_d. Using the DyTrust algorithm, the most reliable trust path to each user u_v ∈ N_d is first discovered with respect to samples taken from the edge weights, and the strength of the found path is considered as the reliability of the direct neighbour u_v. Then, the final rating r̂_{s,d} on i_d for the active user u_s is predicted by aggregating the ratings of the direct neighbours weighted by their reliability values, as given in Eq. (8.14) (Ghavipour and Meybodi 2018b):

\hat{r}_{s,d} = \bar{r}_s + \frac{\sum_{u_v \in N_d} R_v \,(r_{v,d} - \bar{r}_v)}{|N_d| \cdot Max_w} \qquad (8.14)

where R_v refers to the reliability value of direct neighbour u_v, r̄_v is the average of the ratings provided by u_v, and Max_w denotes the maximum implicit/explicit trust weight, which is equal to 1 in this section.
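A small sketch of the aggregation in Eq. (8.14), assuming the reliabilities R_v have already been estimated by DyTrust; the function signature and dictionary layout are hypothetical.

```python
def predict_rating(r_bar_s, ratings, means, reliabilities, max_w=1.0):
    """ratings[v] = r_{v,d}, means[v] = average rating of direct neighbour v,
    reliabilities[v] = R_v; returns the predicted rating of Eq. (8.14)."""
    if not ratings:
        return r_bar_s  # no direct neighbours: fall back to the user's mean
    weighted = sum(reliabilities[v] * (ratings[v] - means[v]) for v in ratings)
    return r_bar_s + weighted / (len(ratings) * max_w)
```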


In order to estimate the reliability of the direct neighbours, DyTrust first constructs a distributed learning automata (DLA) isomorphic to the input graph G, in such a way that each node u_i is equipped with a learning automaton A_i. The action set α_i of A_i contains all the neighbours implicitly or explicitly trusted by u_i; that is, the size of α_i equals the number of u_i's outgoing relations. The action probability vector of each automaton is updated based on the learning algorithm L_{R−I}. After that, for each direct neighbour u_v ∈ N_d the DyTrust algorithm performs the following two tasks:

1. Initializing the action probability vectors
Equation (8.15) is used to initialize the action probability vector p_i of each automaton A_i. This equation gives a higher selection probability to neighbours who are highly trusted:

p_{ij}(t) = \frac{com_{t-1}(i, j)}{\sum_{u_k \in \alpha_i} com_{t-1}(i, k)}, \quad \forall u_j \in \alpha_i \qquad (8.15)

where p_{ij}(t) denotes the probability of selecting node u_j by A_i at time t, and com_{t−1}(i, j) is the integrated value of similarity and trust from u_i to u_j at time t − 1, computed according to Eq. (8.11).

2. Learning the most reliable path to the direct neighbour
DyTrust attempts to discover the most reliable trust path to the direct neighbour u_v and to estimate the reliability of u_v based on the strength of the found path. For this purpose, it repeats the following three subtasks until the stopping criteria are reached.

a. Discovering a trust path
In this subtask, the aim is to find a trust path π to u_v by a series of automaton activations starting from the root automaton A_s corresponding to the source u_s. Each activated automaton A_i determines the next hop along the path π. The set of available actions for A_i contains all trusted neighbours of u_i except those whose corresponding automata have already been activated along π. A_i selects one of its available actions, say action u_j, based on the scaled action probability vector. As a result of the action selection, a sample is taken from each of the relations e^S_{ij} and e^T_{ij}, the value of com_t(i, j) is updated according to Eq. (8.11), and the automaton A_j at the other end of the relation e_{ij} is activated. If A_j belongs to the set N_d and the minimum of the integrated weights along the current path π, called the path strength R_π, is larger than or equal to the maximum strength R_j already obtained for u_j, namely if A_j ∈ N_d and R_π = min_{e_{ij} ∈ π}(com_t(i, j)) ≥ R_j, then R_j ← R_π. The activation process is repeated until the neighbour u_v is reached or the activated automaton has no available action.

b. Evaluating the found path
If the found path π ends in the direct neighbour u_v and R_π ≥ R_v, the learning automata activated along π receive a reward. For each automaton A_j, the value of the reward parameter a at time t is given as (Beigy and Meybodi 2006):

a_j(t) = \frac{a_i(t)}{p_{ij}(t+1)}, \quad \forall e_{ij} \in \pi

where p_{ij}(t + 1) denotes the selection probability of action u_j by A_i after rewarding the automaton.

c. Checking the stop criteria
The two previous subtasks are repeated until the path probability, namely the product of the probabilities of choosing the relations along the path π, is greater than a certain threshold P, or until the number of found paths exceeds a predefined threshold K. At that point, the maximum strength R_v is considered as the reliability of the rating r_{v,d} provided by u_v for the target i_d. A sketch of the path-discovery subtask is given below.
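The following sketch illustrates subtask (a), discovering one trust path, under stated assumptions: graph[u][v] is a callable that samples the stochastic weight com_t(u, v), automata[u] maps u's trusted neighbours to their current selection probabilities, and all names are ours rather than the DyTrust API. The L_{R−I} rewarding of subtask (b) is omitted.

```python
import random

def discover_path(graph, automata, source, target, strengths, rng=None):
    rng = rng or random.Random()
    path, visited, strength = [], {source}, float("inf")
    current = source
    while True:
        # available actions: trusted neighbours not yet activated on this path
        options = [v for v in automata[current] if v not in visited]
        if not options:
            return None, 0.0  # dead end: the activated automaton has no action
        weights = [automata[current][v] for v in options]
        nxt = rng.choices(options, weights=weights)[0]
        w = graph[current][nxt]()           # sample the edge weight com_t
        strength = min(strength, w)         # path strength = weakest edge
        path.append((current, nxt))
        visited.add(nxt)
        if nxt == target:
            # keep the best strength found so far for this direct neighbour
            strengths[nxt] = max(strengths.get(nxt, 0.0), strength)
            return path, strength
        current = nxt
```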

8.3.2.1 Experimental Evaluation

In order to evaluate the performance of the proposed recommender system LTRS, the authors conducted empirical experiments on the Epinions dataset and compared their algorithm with pure Collaborative Filtering (CF) (Resnick et al. 1994), TidalTrust (Golbeck 2005), TARS (Bedi and Sharma 2012), CITG (Yan et al. 2013), RTCF (Moradi and Ahmadian 2015) and CBR (Gohari et al. 2018). These methods have been commonly used in the literature for evaluating the performance of proposed recommender systems (Jamali and Ester 2009; Kant and Bharadwaj 2013; Yan et al. 2013; Moradi and Ahmadian 2015; Gohari et al. 2018). In all the experiments, the following settings are used for the parameters of the LTRS algorithm: the path probability threshold P is set to 0.9, the threshold K on the number of traversed paths to 10,000, and the decay factor λ to 0.9. Experimental results have been averaged over 10 independent runs.

Dataset
The experiments have been conducted on a version of the Epinions dataset (www.epinions.com) published by the authors in (Tang et al. 2012). On the Epinions website, users are able to review items and express their opinions about them by assigning numeric ratings in the range of 1–5. Moreover, users can also indicate others as trustworthy. The extracted dataset contains 916,149 item ratings from 22,164 users who have rated at least once among 296,277 different items. The trust network of the Epinions dataset consists of 18,098 users, with 15,892 users trusted by at least one other user and 15,451 users trusting at least one other user, for a total of 355,727 trust statements.


In the experiments, the authors needed the trust network to be a stochastic graph with real-valued edge weights varying over time. For this purpose, they used the technique proposed by Richardson et al. (2003), which has been commonly used in the literature (Shekarpour and Katebi 2010; Jiang et al. 2015, 2016; Ghavipour and Meybodi 2018b). This technique first assigns to each user u_i a quality value q_i ∈ [0, 1], which gives the probability that a statement issued by u_i is true. The user qualities are chosen from a normal distribution with mean μ = 0.5 and standard deviation σ = 0.25. Then, the trust weight from user u_i to u_j is assumed to be a random variable with a continuous uniform distribution on the interval [max(q_j − δ_{ij}, 0), min(q_j + δ_{ij}, 1)], where δ_{ij} = (1 − q_i)/2 is a noise parameter which determines how accurately u_i estimates the quality of the user u_j he trusts. After that, the authors used the sampling algorithm RandomWalk to produce a smaller subgraph from the stochastic trust network. For a sampling fraction of 0.1, the sampled subgraph contains 1,809 users with 22,115 trust connections among them. The authors also extracted the rating data for these users, which includes 104,715 ratings expressed for 63,897 items. This rating information is used for constructing an initial similarity network based on the technique presented in Sect. 8.3.2. Finally, the same technique as used for the trust network is applied to model the similarity network as a stochastic graph; the only difference is that, in this case, the average similarity between a user u_i and his direct neighbours in the similarity network is considered as the quality q_i of u_i. Since the similarity weights are between 0 and 1, this ensures that the quality values are also in the interval [0, 1].
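A sketch of this weight-generation technique, under the distributional assumptions just described (quality drawn from N(0.5, 0.25) clipped to [0, 1], uniform noise of half-width (1 − q_i)/2); names are illustrative.

```python
import random

def stochastic_trust_weights(edges, seed=0):
    """edges: iterable of (i, j) trust relations. Returns a dict mapping each
    edge to a zero-argument sampler of its time-varying weight."""
    rng = random.Random(seed)
    users = {u for e in edges for u in e}
    q = {u: min(max(rng.gauss(0.5, 0.25), 0.0), 1.0) for u in users}
    samplers = {}
    for i, j in edges:
        d = (1.0 - q[i]) / 2.0                       # noise parameter delta_ij
        lo, hi = max(q[j] - d, 0.0), min(q[j] + d, 1.0)
        samplers[(i, j)] = lambda lo=lo, hi=hi: rng.uniform(lo, hi)
    return samplers
```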

Evaluation Method
Typically, the leave-one-out method (Kohavi 1995) is used to simulate a dynamic recommendation process. At every test round, one item rating is taken out of the dataset, and the compared algorithms attempt to predict it using the trust network and the remaining ratings. The quality of the various algorithms is measured by comparing the predicted rating with the actual rating. The authors repeated this process over the entire rating dataset and then averaged the results.

Evaluation Metric
The performance of the recommender systems has been measured in terms of rating coverage, predictive error and classification accuracy. The rating coverage (RC) refers to the fraction of ratings for which a RS algorithm is able to produce a predicted rating. Mean absolute error (MAE) and root mean square error (RMSE) are popular predictive error metrics that measure the closeness of the rating predictions to the true ratings:

MAE = \frac{\sum_{u_i \in U} \sum_{i_k \in I} |\hat{r}_{i,k} - r_{i,k}|}{n} \qquad (8.17)

RMSE = \sqrt{\frac{\sum_{u_i \in U} \sum_{i_k \in I} (\hat{r}_{i,k} - r_{i,k})^2}{n}} \qquad (8.18)

where n denotes the total number of ratings, and r_{i,k} and r̂_{i,k} respectively refer to the actual rating and the predicted rating for user u_i on item i_k. In general, lower MAE and RMSE values indicate higher prediction accuracy. To measure classification accuracy, the authors used precision and recall, the most popular metrics for evaluating the accuracy of the decisions made by RS algorithms. Precision (Pr) measures the ability of a system to suggest items that are truly relevant to an active user and is computed as

Pr = \frac{|I_A \cap I_B|}{|I_B|} \qquad (8.19)

where I_A denotes the set of relevant items and I_B the set of items recommended to the user. Recall (Re) measures the ability of a system to gather the content relevant to the active user and is computed as the fraction of relevant items that are successfully recommended:

Re = \frac{|I_A \cap I_B|}{|I_A|} \qquad (8.20)

The metrics precision and recall are clearly conflicting in nature: if the number of recommended items increases, precision decreases while recall increases. One may combine them using the F1 measure, the harmonic mean of precision and recall; a high F1 corresponds to a balanced combination of recall and precision:

F1 = \frac{2 \times Pr \times Re}{Pr + Re} \qquad (8.21)
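For concreteness, these metrics can be computed as in the short sketch below (our own helper names; RS-specific decisions, such as which items count as relevant, are assumed to be made elsewhere).

```python
import math

def mae_rmse(predictions):
    """predictions: list of (predicted, actual) pairs -> (MAE, RMSE),
    as in Eqs. (8.17) and (8.18)."""
    if not predictions:
        return 0.0, 0.0
    errors = [p - a for p, a in predictions]
    n = len(errors)
    return (sum(abs(e) for e in errors) / n,
            math.sqrt(sum(e * e for e in errors) / n))

def precision_recall_f1(relevant, recommended):
    """Eqs. (8.19)-(8.21) over collections of item identifiers."""
    hits = len(set(relevant) & set(recommended))
    pr = hits / len(recommended) if recommended else 0.0
    re = hits / len(relevant) if relevant else 0.0
    f1 = 2 * pr * re / (pr + re) if pr + re else 0.0
    return pr, re, f1
```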

Experimental Results
Experiment I: Comparison with other methods
The first experiment has been conducted to compare the predictive performance of the proposed method LTRS with the CF, TidalTrust, TARS, CITG, RTCF and CBR methods in terms of accuracy and coverage on the Epinions dataset. The comparison results are shown in Table 8.6. As one can see from this table, the proposed method LTRS outperforms all the other methods under all six metrics. Epinions has very sparse rating data and a relatively small number of trust statements. Since traditional CF considers only the rating information in making recommendations, it shows the worst predictive and classification accuracy with the lowest coverage in comparison with the other methods.

Table 8.6 Comparison between different methods in terms of accuracy and coverage on the Epinions dataset (MAE and RMSE measure predictive error; Pr, Re and F1 measure classification accuracy; RC is the rating coverage)

Method       MAE      RMSE     Pr       Re       F1       RC
CF           0.9427   1.3327   0.7916   0.8376   0.8140   0.3776
TidalTrust   0.9127   1.2509   0.7910   0.8712   0.8292   0.4301
TARS         0.8990   1.2132   0.7913   0.8920   0.8386   0.4578
CITG         0.8711   1.1591   0.7988   0.9251   0.8573   0.4863
RTCF         0.8563   1.1103   0.8015   0.9043   0.8498   0.4722
CBR          0.8779   1.1624   0.7997   0.9633   0.8739   0.4591
LTRS         0.8339   1.0862   0.8016   0.9724   0.8788   0.5179

By using explicit trust relations instead of rating similarities, TidalTrust improves the rating coverage and obtains higher accuracy than CF. TARS performs better than the two previous methods because it creates an implicit trust network based on the rating information, thereby reducing the sparsity of user similarity. The method CITG mitigates the sparsity problem of the rating data and the trust network by incorporating both kinds of information into the recommendation process. In this way, CITG increases the coverage and produces more accurate predictions than those generated by TARS. RTCF also uses a combination of similarity values and trust statements. Since this method removes the users with lower reliability values from the trust network, it has a lower rating coverage than CITG. Although CITG shows better Re and F1 results, RTCF achieves higher prediction accuracy compared to CITG. CBR does not consider explicit trust relations between users and computes implicit trust based on the rating information. It gives results close to those of CITG in terms of the MAE and RMSE measures and close to those of TARS in terms of the coverage measure. However, the classification accuracy of CBR is higher than that of the previous methods. Similar to CITG and RTCF, LTRS combines the similarity and trust networks. However, by propagating trust along paths consisting of both types of relations, the proposed method LTRS achieves higher coverage and accuracy compared to the other methods. Considering the temporal variation of both similarity and trust during the recommendation process also improves the prediction performance of LTRS.

Experiment II: Performance for cold-start users
This experiment studies the effectiveness of the proposed method LTRS in dealing with new users (cold-start users) who have provided only a few or even no ratings. For this purpose, the authors considered only users who rated fewer than 5 items in the Epinions dataset. The accuracy and coverage of the compared methods are reported in Table 8.7.

Table 8.7 Performance of different methods in handling the cold-start problem on the Epinions dataset (MAE and RMSE measure predictive error; Pr, Re and F1 measure classification accuracy; RC is the rating coverage)

Method       MAE      RMSE     Pr       Re       F1       RC
CF           1.0473   1.4309   0.8226   0.7727   0.7969   0.3058
TCF          0.9801   1.3500   0.8284   0.7872   0.8073   0.5670
Hyb          0.9511   1.2736   0.8361   0.8245   0.8303   0.6135
TidalTrust   0.9742   1.3167   0.8447   0.8915   0.8675   0.6911
TARS         0.9386   1.2515   0.8510   0.8666   0.8587   0.6317
CITG         0.9723   1.3089   0.8417   0.9153   0.8770   0.6179
LTRS         0.9279   1.2185   0.8693   0.9760   0.9196   0.7423

According to the results in this table, LTRS shows the best performance in handling cold-start users compared to the other methods. The results confirm that using trust information in the recommendation process can effectively alleviate the cold-start problem. In comparison with CF, the method TidalTrust shows higher prediction and classification accuracy and can provide reliable recommendations for a larger number of new users. TARS mitigates the cold-start user problem by considering popular users with high reputation as trusted friends of new users; in this way, it achieves better results than TidalTrust. The combination of trust and similarity information and the use of default recommenders in CITG make this method more powerful in predicting missing ratings. CITG performs better than TARS in terms of prediction coverage and classification accuracy. However, the MAE and RMSE values of CITG are slightly worse than those of TARS. The RTCF method shows higher predictive accuracy, but lower F1 and coverage, in comparison with CITG. By making predictions based on the opinions of neighbours with high global reputations, CBR provides reasonable recommendations for new users. This method has the highest F1 value compared to the previous methods. The results for LTRS also show that trust propagation along both similarity and trust relations increases the coverage and accuracy of the proposed method. LTRS significantly outperforms the other methods under all the metrics.

Experiment III: Performance for sparse rating data
In this experiment, the authors examined the impact of different rating sparsity levels on the performance of their method LTRS as compared to the other methods. This experiment has been conducted under four different protocols, keeping 100%, 75%, 50% and 25% of the total number of ratings in the Epinions dataset and discarding the rest. Figure 8.7 shows the results of the compared methods for different levels of sparsity. In this figure, PRR represents the percentage of retained ratings.


Fig. 8.7 Impact of different rating sparsity levels on the performance of different methods

From these plots, we can see that the performance of all the methods decreases with increasing rating sparsity. However, the methods that benefit from trust information in the recommendation process can effectively handle the rating sparsity problem, and among them the proposed method LTRS always shows the best performance. According to the results in Fig. 8.7a–e, increasing the sparsity level decreases the predictive and classification accuracy of all the methods. With respect to the RC metric (Fig. 8.7f), the negative effect of the sparsity problem on the rating coverage is much stronger for CF than for the other methods. Generally, LTRS performs better than the others in dealing with the rating sparsity problem.

8.4 Conclusion

Trust affects users' decision making in social networks. Based on this, decision support systems use trust information among users to help them make their decisions in social networks more effectively. In particular, most existing successful recommender systems consider trust relations and recommend items to a target user from her trusted users. In this chapter, we reviewed recommender systems proposed in the literature and presented detailed descriptions of two learning automata based systems, CALA-OptMF (Ghavipour and Meybodi 2016) and LTRS (Ghavipour and Meybodi 2018a).

The work in (Ghavipour and Meybodi 2016) addressed the problem of supplying the most appropriate membership functions for fuzzy trust and distrust in recommender systems and proposed a method based on continuous action-set learning automata (CALA), called CALA-OptMF, to simultaneously optimize the number and the position of the membership functions on the values axis. In this method, a CALA is assigned to the centre parameter of each triangular membership function to learn the optimal value of that parameter. CALA-OptMF adjusts the fuzzy sets online, during the lifetime of a recommender system, in terms of the recommendation error; therefore, changes in trust relationships and preferences over time are taken into account in the optimization process. The method CALA-OptMF can be used in any fuzzy trust-distrust enhanced recommender system without requiring any changes to that system. In order to investigate the effect of membership function optimization on the performance of recommender systems, the authors utilized CALA-OptMF in a fuzzy trust-distrust enhanced recommender system from the literature and tested its performance over well-known datasets. Their experimental results indicated that CALA-OptMF, by providing fuzzy membership functions optimized with respect to the used dataset, improves the accuracy of recommendations in fuzzy recommender systems.

The second system, LTRS, is a stochastic trust propagation-based recommender system which mitigates the sparsity problem of the trust network by constructing an enriched trust network consisting of both implicit and explicit trust relations. LTRS predicts the rating of an active user on a target item by propagating trust through the enriched trust network. To address the dynamic nature of both similarity and trust, LTRS uses a stochastic trust propagation algorithm based on learning automata which dynamically captures the temporal changes of the implicit/explicit trust weights during the propagation process and updates the found reliable paths based on these variations. The experimental results on the well-known Epinions dataset demonstrated that the proposed system LTRS can improve both the accuracy and the coverage of recommendations in comparison with its competitors. The results also confirmed that LTRS can effectively handle the issues of rating data sparsity and cold-start users.

References

Abdul-Rahman A, Hailes S (2000) Supporting trust in virtual communities. In: Proceedings of the 33rd annual Hawaii international conference on system sciences. IEEE, p 9
Acilar AM, Arslan A (2011) Optimization of multiple input–output fuzzy membership functions using clonal selection algorithm. Expert Syst Appl 38:1374–1381
Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17:734–749
Akbari Torkestani J, Meybodi MR (2012) Finding minimum weight connected dominating set in stochastic graph based on learning automata. Inf Sci (Ny) 200:57–77. https://doi.org/10.1016/j.ins.2012.02.057
Altingovde IS, Subakan ÖN, Ulusoy Ö (2013) Cluster searching strategies for collaborative recommendation systems. Inf Process Manag 49:688–697
Arazy O, Kumar N, Shapira B (2009) Improving social recommender systems. IT Prof 11:38–44. https://doi.org/10.1109/MITP.2009.76
Avesani P, Massa P, Tiella R (2005) A trust-enhanced recommender system application: Moleskiing. In: Proceedings of the 2005 ACM symposium on applied computing. ACM, pp 1589–1593
Ayadi O, Halouani N, Masmoudi F (2016) A fuzzy collaborative assessment methodology for partner trust evaluation. Int J Intell Syst 31:488–501


Bedi P, Sharma R (2012) Trust based recommender system using ant colony for trust computation. Expert Syst Appl 39:1183–1190 Beigy H, Meybodi MR (2006) Utilizing distributed learning automata to solve stochastic shortest path problems. Int J Uncertainty, Fuzziness Knowledge-Based Syst 14:591–615. https://doi.org/ 10.1142/S0218488506004217 Bharadwaj KK, Al-Shamri MYH (2009) Fuzzy computational models for trust and reputation systems. Electron Commer Res Appl 8:37–47 Bhuiyan T (2013) Trust for intelligent recommendation. Springer Bobadilla J, Hernando A, Ortega F, Bernal J (2011) A framework for collaborative filtering recommender systems. Expert Syst Appl 38:14609–14623 Bobadilla J, Hernando A, Ortega F, Gutiérrez A (2012a) Collaborative filtering based on significances. Inf Sci (Ny) 185:1–17 Bobadilla J, Ortega F, Hernando A, Bernal J (2012b) A collaborative filtering approach to mitigate the new user cold start problem. Knowledge-Based Syst 26:225–238 Bobadilla J, Serradilla F (2009) The effect of sparsity on collaborative filtering metrics. In: Proceedings of the twentieth Australasian Conference on Australasian database-volume 92. Australian computer society, Inc., pp 9–18 Bonhard P, Sasse MA (2006) ’Knowing me, knowing you’—using profiles and social networking to improve recommender systems. BT Technol J 24:84–98 Carrer-Neto W, Hernández-Alcaraz ML, Valencia-García R, García-Sánchez F (2012) Social knowledge-based recommender system. Application to the movies domain. Expert Syst Appl 39:10990–11000 Chirita P-A, Nejdl W, Zamfir C (2005) Preventing shilling attacks in online recommender systems. In: Proceedings of the 7th annual ACM international workshop on web information and data management. ACM, pp 67–74 Choi IY, Oh MG, Kim JK, Ryu YU (2016) Collaborative filtering with facial expressions for online video recommendation. Int J Inf Manage 36:397–402 Formoso V, FernáNdez D, Cacheda F, Carneiro V (2013) Using profile expansion techniques to alleviate the new user problem. Inf Process Manag 49:659–672 Gefen D, Karahanna E, Straub DW (2003) Trust and TAM in online shopping: an integrated model. MIS Q 27:51–90 Ghavipour M, Meybodi MR (2016) An adaptive fuzzy recommender system based on learning automata. Electron Commer Res Appl 20:105–115. https://doi.org/10.1016/j.elerap.2016.10.002 Ghavipour M, Meybodi MR (2018a) Stochastic trust network enriched by similarity relations to enhance trust-aware recommendations. Appl Intell 1–14 Ghavipour M, Meybodi MR (2018b) A dynamic algorithm for stochastic trust propagation in online social networks: learning automata approach. Comput Commun 123:11–23. https://doi.org/10. 1016/j.comcom.2018.04.004 Ghavipour M, Meybodi MR (2018c) Trust propagation algorithm based on learning automata for inferring local trust in online social networks. Knowledge-Based Syst 143:307–316. https://doi. org/10.1016/j.knosys.2017.06.034 Gohari FS, Aliee FS, Haghighi H (2018) A new confidence-based recommendation approach: combining trust and certainty. Inf Sci (Ny) 422:21–50 Golbeck J (2006) Generating predictive movie recommendations from trust in social networks. In: International Conference on Trust Management. Springer, pp 93–104 Golbeck J (2009) Trust and nuanced profile similarity in online social networks. ACM Trans Web 3:12 Golbeck J, Hendler J (2006) Filmtrust: movie recommendations using trust in web-based social networks. In: Proceedings of the IEEE Consumer communications and networking conference. 
Citeseer, pp 282–286 Golbeck JA (2005) Computing and applying trust in web-based social networks. https://doi.org/10. 1017/cbo9781107415324.004


Guo G, Zhang J, Yorke-Smith N (2015) Leveraging multiviews of trust and similarity to enhance clustering-based recommender systems. Knowledge-Based Syst 74:14–27 Hao F, Min G, Lin M et al (2014) MobiFuzzyTrust: an efficient fuzzy trust inference mechanism in mobile social networks. IEEE Trans Parallel Distrib Syst 25:2944–2955 He J, Chu WW (2010) A social network-based recommender system (SNRS). In: Data mining for social network data. Springer, pp 47–74 Huynh T, Nguyen H, Le B, Minh HC (2012) A unified design for the membership functions in genetic fuzzy systems. Int J Comput Sci 9:7–16 Gao L, ongdong Li C (2008) Hybrid personalized recommended model based on genetic algorithm. In: 2008 4th international conference on wireless communications, networking mobile computing, Vols 1–31. IEEE, pp 9215–9218 Jamali M, Ester M (2009) Trustwalker: a random walk model for combining trust-based and itembased recommendation. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 397–406 Jiang W, Wu J, Li F et al (2016) Trust evaluation in online social networks using generalized network flow. IEEE Trans Comput 65:952–963. https://doi.org/10.1109/TC.2015.2435785 Jiang W, Wu J, Wang G (2015) On selecting recommenders for trust evaluation in online social networks. ACM Trans Internet Technol 15:14. https://doi.org/10.1145/2807697 Kant V, Bharadwaj KK (2013) Fuzzy computational models of trust and distrust for enhanced recommendations. Int J Intell Syst 28:332–365 Kaya M, Alhajj R (2006) Utilizing genetic algorithms to optimize membership functions for fuzzy weighted association rules mining. Appl Intell 24:7–15 Kim H-N, El-Saddik A, Jo G-S (2011) Collaborative error-reflected models for cold-start recommender systems. Decis Support Syst 51:519–531 Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Ijcai. pp 1137–1145 Lam SK, Riedl J (2004) Shilling recommender systems for fun and profit. In: Proceedings of the 13th international conference on World Wide Web. ACM, pp 393–402 Lee DH, Brusilovsky P (2009) Does trust influence information similarity? Recomm Syst Soc Web 10: Lemire D (2005) Scale and translation invariant collaborative filtering systems. Inf Retr Boston 8:129–150 Leung CW, Chan SC, Chung F (2008) An empirical study of a cross-level association rule mining approach to cold-start recommendations. Knowl Based Syst 21:515–529 Linden G, Smith B, York J (2003) Amazon. com recommendations: item-to-item collaborative filtering. IEEE Internet Comput 7:76–80 Lops P, De Gemmis M, Semeraro G (2011) Content-based recommender systems: State of the art and trends. In: Recommender systems handbook. Springer, pp 73–105 Lu Y, Zhao L, Wang B (2010) From virtual community members to C2C e-commerce buyers: trust in virtual communities and its effect on consumers’ purchase intention. Electron Commer Res Appl 9:346–360 Luo X, Xia Y, Zhu Q (2012) Incremental collaborative filtering recommender based on regularized matrix factorization. Knowledge-Based Syst 27:271–280 Mao M, Lu J, Zhang G, Zhang J (2017) Multirelational social recommendations via multigraph ranking. IEEE Trans Cybern 47:4049–4061. https://doi.org/10.1109/TCYB.2016.2595620 Martinez-Romo J, Araujo L (2012) Updating broken web links: an automatic recommendation system. Inf Process Manag 48:183–203 Massa P, Avesani P (2004) Trust-aware collaborative filtering for recommender systems. 
CoopIS/DOA/ODBASE 1(3290):492–508 Moradi P, Ahmadian S (2015) A reliability-based recommendation method to improve trust-aware recommender systems. Expert Syst Appl 42:7386–7398 O’Donovan J, Smyth B (2005) Trust in recommender systems. In: Proceedings of the 10th international conference on Intelligent user interfaces. ACM, pp 167–174


O’Mahony M, Hurley N, Kushmerick N, Silvestre G (2004) Collaborative recommendation: a robustness analysis. ACM Trans Internet Technol 4:344–377 Omizegba EE, Adebayo GE (2009) Optimizing fuzzy membership functions using particle swarm algorithm. In: Systems, man and cybernetics, 2009. SMC 2009. IEEE international conference on. IEEE, pp 3866–3870 Park M-H, Hong J-H, Cho S-B (2007) Location-based recommendation system using bayesian user’s preference model in mobile devices. In: International conference on ubiquitous intelligence and computing. Springer, pp 1130–1139 Pera MS, Ng Y-K (2013) A group recommender for movies based on content similarity and popularity. Inf Process Manag 49:673–687 Permana KE, Hashim SZM (2010) Fuzzy membership function generation using particle swarm optimization. Int J Open Probl Compt Math 3:27–41 Protasiewicz J, Pedrycz W, Kozłowski M et al (2016) A recommender system of reviewers and experts in reviewing problems. Knowl Based Syst 106:164–178 Rashid AM, Karypis G, Riedl J (2008) Learning preferences of new users in recommender systems: an information theoretic approach. ACM SIGKDD Explor Newsl 10:90–100 Resnick P, Iacovou N, Suchak M, et al (1994) GroupLens: an open architecture for collaborative filtering of netnews. In: Proceedings of {ACM} 1994 conference on computer supported cooperative work. ACM, pp 175–186 Rezvanian A, Meybodi MR (2017) A new learning automata-based sampling algorithm for social networks. Int J Commun Syst 30:e3091. https://doi.org/10.1002/dac.3091 Ricci F, Rokach L, Shapira B (2011) Introduction to recommender systems handbook. In: Recommender systems handbook. Springer, pp 1–35 Richardson M, Agrawal R, Domingos P (2003) Trust management for the semantic web. In: International semantic web conference. Springer, pp 351–368 Roh TH, Oh KJ, Han I (2003) The collaborative filtering recommendation based on SOM clusterindexing CBR. Expert Syst Appl 25:413–423 Salganik MJ, Dodds PS, Watts DJ (2006) Experimental study of inequality and unpredictability in an artificial cultural market. Science (80-) 311:854–856 Shambour Q, Lu J (2011) A hybrid trust-enhanced collaborative filtering recommendation approach for personalized government-to-business e-services. Int J Intell Syst 26:814–843 Shambour Q, Lu J (2012) A trust-semantic fusion-based recommendation approach for e-business applications. Decis Support Syst 54:768–780 Shekarpour S, Katebi SD (2010) Modeling and evaluation of trust with an extension in semantic web. J Web Semant 8:26–36. https://doi.org/10.1016/j.websem.2009.11.003 Simon D (2005) H∞ estimation for fuzzy membership function optimization. Int J Approx Reason 40:224–242 Sinha RR, Swearingen K (2001) Comparing recommendations made by online systems and friends. In: DELOS workshop: personalisation and recommender systems in digital libraries Staab S, Bhargava B, Leszek L et al (2004) The pudding of trust: managing the dynamic nature of trust. IEEE Intell Syst 19:74–88 Su X, Khoshgoftaar TM (2009) A survey of collaborative filtering techniques. Adv Artif Intell 2009:4 Symeonidis P, Nanopoulos A, Manolopoulos Y (2009) MoviExplain: a recommender system with explanations. In: Proceedings of the third ACM conference on recommender systems. ACM, pp 317–320 Tang J, Gao H, Liu H (2012) mTrust: discerning multi-faceted trust in a connected world. In: Proceedings of the fifth ACM international conference on web search and data mining. 
ACM, pp 93–102 Uddin MG, Zulkernine M, Ahamed SI (2008) CAT: a context-aware trust model for open and dynamic systems. In: Proceedings of the 2008 ACM symposium on applied computing. ACM, pp 2024–2029


Victor P, Cornelis C, De Cock M (2011) Trust networks for recommender systems. Springer Science & Business Media Victor P, Verbiest N, Cornelis C, De Cock M (2013) Enhancing the trust-based recommendation process with explicit distrust. ACM Trans Web 7:6 Wang H, Shao S, Zhou X et al (2016) Preference recommendation for personalized search. Knowl Based Syst 100:124–136 Yager RR (2003) Fuzzy logic methods in recommender systems. Fuzzy Sets Syst 136:133–149 Yan S, Zheng X, Chen D, Wang Y (2013) Exploiting two-faceted web of trust for enhanced-quality recommendations. Expert Syst Appl 40:7080–7095 Zadeh LA (1996) Fuzzy logic  computing with words. IEEE Trans Fuzzy Syst 4:103–111. https:// doi.org/10.1109/91.493904 Zhao Y, Li B (2007) A new method for optimizing fuzzy membership function. In: 2007 international conference on mechatronics and automation. IEEE, pp 674–678 Zhong J, Li X (2010) Unified collaborative filtering model based on combination of latent features. Expert Syst Appl 37:5666–5672 Ziegler C-N, Golbeck J (2007) Investigating interactions of trust and interest similarity. Decis Support Syst 43:460–475 Ziegler C-N, Lausen G (2004) Analyzing correlation between trust and user similarity in online communities. In: ITrust. Springer, pp 251–265 Zucker LG (1986) Production of trust: institutional sources of economic structure, 1840–1920. Res Organ Behav

Chapter 9

Social Influence Maximization

9.1 Introduction

Today, online social networks (OSNs) such as Facebook, LinkedIn and Twitter play an important role in diffusing opinions and information between individuals. Extensive research on social network analysis shows that online users, when sharing information, trust the opinions or information obtained from their close social groups, such as friends and acquaintances, far more than they trust the comments of ordinary people or news from public media. Moreover, with the rapid development and increasing popularity of online social networks, an enormous amount of information has been generated and diffused through human interactions in these networks. The availability of the information diffused by online users has led to emerging opportunities for large-scale research into information diffusion (Daliri Khomami et al. 2017, 2018; Mashayekhi et al. 2018). One of the fundamental problems in information diffusion in social networks is the influence maximization problem, which focuses on how to select a small subset of individuals as the initial influence adopters to trigger a cascade such that the influence diffusion in the social network is maximized. This problem has drawn a great deal of attention because it can be applied to optimize the spread of information, new ideas or innovations through social networks. One of the most popular applications is viral marketing. For a company needing to promote a new service on a limited budget, viral marketing is an effective strategy: instead of promoting the new service to all potential customers, the company can target a small subset of customers who then recommend the service to their friends or acquaintances. Due to the word-of-mouth effect, a small number of nodes can affect the whole network, and the idea, service or innovation can spread widely (Kempe et al. 2003). Given a social network, the problem of influence maximization involves the determination of a set of nodes that maximizes the spread of influence under a given propagation model.


In Kempe et al. (2003), two classical propagation models taken from mathematical sociology are presented: the linear threshold model and the independent cascade model. In the linear threshold (LT) model, each node has a threshold that is chosen uniformly at random between 0 and 1. A node is activated when the fraction of its activated neighbors reaches this threshold. Threshold models have been proposed and studied in many domains, for example sociology, as a reasonable approach toward modelling influence in a social network. In the independent cascade (IC) model, each node has either an active or an inactive state and is allowed to change its state from inactive to active, but not vice versa. Each active node has a single, independent chance to activate each of its inactive neighbors.
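A minimal sketch of one IC-model cascade follows, assuming a uniform activation probability p on every edge (the model itself allows per-edge probabilities); neighbors[v] lists v's out-neighbours, and the names are illustrative. The influence spread σ(S) is then typically estimated by averaging the cascade size over many such runs.

```python
import random

def independent_cascade(neighbors, seeds, p=0.1, rng=None):
    rng = rng or random.Random()
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in neighbors.get(u, ()):
                # each newly active node gets one independent chance per
                # inactive neighbour; states never flip back to inactive
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active  # the set of nodes activated by the cascade
```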

9.1.1 Related Work

Domingos and Richardson (2001) and Richardson and Domingos (2002) considered influence maximization as a basic algorithmic problem and suggested a probabilistic model of interaction between people in networks. Following this idea, Kempe et al. (2003) formulated influence maximization as a discrete optimization problem that attempts to find a set of k nodes, according to a specific propagation model, that maximizes the influence spread. In Liu et al. (2014), a new algorithm was presented for solving the influence maximization problem by forming an influence-spreading path. The authors argued that time plays an important role in the influence spread between two users and that the time needed for each user to influence another is different. Another algorithm was investigated by Xu et al. (2014); the authors solved the problem of influence maximization using a weighted maximum cut formulation and a semi-definite program-based algorithm. In the proposed algorithm, the value of the influence between two non-adjacent individuals is computed according to the existence of a social influence path between them. In addition, Lee and Chung (2015) presented an algorithm for influence maximization on specific users in social networks and formulated the problem as query processing to find specific users within the network as a whole. One of the most important heuristic algorithms is the high-degree approach suggested in Kempe et al. (2003). Lü et al. (2016) presented an extension of the H-index concept to determine how important a node is in the underlying network, as a parameter to quantify influential spreaders. By definition, the H-index of a node is the maximum value h such that the node has at least h neighbors of degree no less than h. They also discovered that the H-index of a node can be a better predictor of the influence of a node in epidemic spreading than the degree or the coreness. The Cost-Effective Lazy Forward selection (CELF) algorithm (Leskovec et al. 2007) is a well-known lazy-forward greedy algorithm with an optimization in which new seed nodes are selected by further exploiting the sub-modular property of influence maximization: the marginal gain of a node in the current iteration cannot be greater than its marginal gain in previous iterations.


Hence, the number of spread-estimation calls can be reduced significantly. To improve the effectiveness and performance of CELF, CELF++ (Goyal et al. 2011) was presented as an extension of CELF. In CELF++, the influence spread for two successive iterations of the greedy algorithm is computed simultaneously, so that fewer iterations are required. In Yeruva et al. (2016), a Pareto-shell decomposition algorithm is proposed that selects non-dominated spreaders with a high out-degree-to-in-degree ratio and a high in-degree. In addition, the authors have shown experimentally that the proposed algorithm outperforms high-degree centrality in some cases.

9.2 Learning Automata Approach for Influence Maximization

In this section, we describe a learning automata approach for influence maximization in social networks (Ge et al. 2017; Huang et al. 2018). In this method, a seed set S_i is determined in k rounds, with one node added to S_i in each round. The action set A is defined as the set of candidate nodes in the network that could be added to the seed set S_i; therefore, in the ith round, the cardinality of A is |V| − i. The feedback is defined as the ratio of the number of marginally influenced nodes in the ith round to the total number of nodes in the network:

\beta(i) = \frac{\sigma(S_i) - \sigma(S_{i-1})}{|V|} \qquad (9.1)

where 0 ≤ σ(S_i) − σ(S_{i−1}) ≤ |V| and 0 ≤ β(i) ≤ 1. The environment corresponds to the propagation process on the whole network. Under these assumptions, and with the aid of the Discretized Generalized Confidence Pursuit Algorithm (DGCPA), the algorithm can gradually select a proper seed set S_i. The pseudo-code of the LA-based algorithm in an S-model environment for influence maximization is presented in Fig. 9.1. Like the popular greedy algorithm, the LA-based algorithm for influence maximization starts from an empty seed set S_i = ∅, seeks the node with maximal marginal influence, and iteratively adds it to S_i until the cardinality of the seed set reaches k (i.e., |S_i| = k). In each round of the algorithm, all nodes not yet included in the set S_i form the action set, and the unified marginal influence serves as the feedback. According to the ε-optimality property of learning automata, with a proper configuration of the learning parameters the learning automaton converges to the node with maximal marginal influence spread after a certain number of iterations. This process is repeated k times to obtain a proper seed set S_i; a simplified sketch of this round structure is given below.
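The sketch below mirrors that round structure with a deliberately simplified learning scheme: instead of the authors' DGCPA, each round just maintains running means of the feedback β(i) of Eq. (9.1) and keeps the best candidate, which is enough to show where the feedback enters. spread(S) is an assumed estimator of σ(S), e.g. an average over independent_cascade runs.

```python
def select_seeds(nodes, spread, k, trials_per_round=200):
    seeds, base = set(), 0.0
    for _ in range(k):
        candidates = [v for v in nodes if v not in seeds]
        est = {v: 0.0 for v in candidates}
        cnt = {v: 0 for v in candidates}
        for t in range(trials_per_round):
            v = candidates[t % len(candidates)]   # stand-in for the LA's action choice
            beta = (spread(seeds | {v}) - base) / len(nodes)  # feedback, Eq. (9.1)
            cnt[v] += 1
            est[v] += (beta - est[v]) / cnt[v]    # running mean of the feedback
        best = max(candidates, key=est.get)
        seeds.add(best)
        base = spread(seeds)                      # sigma(S_i) for the next round
    return seeds
```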


Fig. 9.1 Pseudo-code of discretized generalized confidence pursuit algorithm for influence maximization

9.3 Learning Automata for Solving Positive Influence Dominating Set and Its Application in Influence Maximization

Daliri Khomami et al. (2018) presented a learning automata algorithm for solving the minimum positive influence dominating set (PIDS) problem and also applied the PIDS as a heuristic for choosing the seed set in influence maximization. In the following, we introduce the PIDS, briefly describe the LA-based algorithm for solving it, and finally present the applicability of this heuristic to influence maximization. Let G = (V, E) be the graph of a network, in which V is the set of nodes and E is the set of edges. For such a network, a PIDS is a subset P ⊆ V such that any node v_i ∈ V is dominated by at least ⌈n_i/2⌉ nodes in P, where n_i is the degree of node v_i. The minimum positive influence dominating set (MPIDS) is the PIDS with minimum cardinality (Wang et al. 2011). In what follows, the LA-based algorithm for finding the positive influence dominating set is described.


9.3.1 Learning Automata-Based Algorithm for Finding the Minimum Positive Influence Dominating Set

Let the undirected graph G = (V, E) serve as the input to the algorithm; the desired output is a near-optimal minimum positive influence dominating set (MPIDS). The LA-based algorithm involves the following steps.

9.3.1.1 Initialization

Each node of the graph is initially equipped with a LA; thus, a network of LAs isomorphic to the graph is constructed. The set of LAs can be described by a tuple LA = (A, α), where A = {A_1, ..., A_n} corresponds to the LAs and α = {α_1, α_2, ..., α_r} represents the action probability vector used for the selection of each action, initialized with a uniform probability distribution. The action set of each learning automaton A_i residing in node v_i consists of only two actions, α_i^0 and α_i^1. Let Ω_t be the candidate PIDS at iteration t. If α_i^0 is selected, the corresponding node v_i becomes a member of Ω_t; otherwise, if α_i^1 is selected, no change occurs in the membership of the positive influence dominating set.

9.3.1.2 Finding a Candidate Positive Influence Dominating Set

As the algorithm proceeds, at each iteration all LAs simultaneously select one of their actions according to their action probability vectors. In the proposed algorithm, the action α_i^0 corresponds to choosing v_i as a member of the currently obtained PIDS, while selecting α_i^1 means that v_i does not belong to the current PIDS. Hence, all LAs that selected action α_i^0 become members of Ω_t. Let n_i be the number of neighbors of node v_i; the algorithm then checks whether the obtained Ω_t is a PIDS by verifying that each node in the network has at least ⌈n_i/2⌉ neighbors included in the current PIDS (this check is sketched in the code below). If not, the process of action selection by the LAs continues; otherwise, the probability update is performed. Note that the action selection is performed simultaneously by all LAs; this cooperation among the LAs helps to accelerate the learning process.
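The PIDS membership test used in this step is straightforward; a sketch under the stated definition (at least ⌈n_i/2⌉ of each node's neighbours inside the candidate set) is given below, with illustrative names.

```python
import math

def is_pids(neighbors, candidate):
    """neighbors: dict mapping each node to its neighbour list;
    candidate: set of nodes whose automata selected action 0."""
    for v, nbrs in neighbors.items():
        covered = sum(1 for u in nbrs if u in candidate)
        if covered < math.ceil(len(nbrs) / 2):
            return False
    return True
```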

9.3.1.3 Computing the Objective Function

A dynamic threshold is applied by the algorithm to evaluate whether the currently obtained solution is a proper one. The dynamic threshold is defined as the average cardinality of the positive influence dominating sets obtained up to iteration t:

D_t = (1/t) Σ_{i=1}^{t} |Ω_i|,   (9.2)


where |Ω_i| is the cardinality of the positive influence dominating set selected at iteration i. The dynamic threshold aids in deciding whether the obtained PIDS is a proper one; more precisely, if |Ω_t| is less than or equal to the dynamic threshold D_t, the current PIDS is an appropriate candidate MPIDS.
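In an implementation, the threshold need not be recomputed from all previous sets; assuming D_{t−1} is stored, Eq. (9.2) admits the incremental form sketched below.

def update_threshold(D_prev, t, omega_size):
    # Incremental form of Eq. (9.2): D_t = ((t - 1) * D_{t-1} + |Omega_t|) / t
    return ((t - 1) * D_prev + omega_size) / t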

9.3.1.4 Updating the Probability Vector

At every iteration t, if the cardinality of the current positive influence dominating set is less than or equal to the dynamic threshold D_t, the actions chosen by all learning automata are rewarded; otherwise they are penalized. Each learning automaton updates its action probability vector using the L_{R−I} reinforcement scheme, under which a penalty leaves the probability vector unchanged.
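A minimal sketch of this update for the two-action automata used here, reusing the p_join dictionary from the earlier sketches; the default learning rate a = 0.001 matches the experimental setting reported later, and mapping the reward test to |Ω_t| ≤ D_t follows Fig. 9.2.

def lri_update(p_join, omega, D_t, a=0.001):
    # L_RI reward step: if the candidate PIDS is no larger than the dynamic
    # threshold, reinforce the action each automaton actually chose;
    # otherwise (penalty) L_RI leaves the probabilities unchanged.
    if len(omega) <= D_t:
        for v in p_join:
            if v in omega:                        # chose alpha^0 (join)
                p_join[v] += a * (1.0 - p_join[v])
            else:                                 # chose alpha^1 (stay out)
                p_join[v] *= (1.0 - a)
    return p_join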

9.3.1.5 Stopping Conditions

Finding positive influence dominating sets and updating the action probabilities continue until the number of constructed positive influence dominating sets reaches a pre-defined threshold. Figure 9.2 shows the steps of the algorithm in detail.

A significant application of the PIDS is the influence maximization problem: the LA-based algorithm for finding a near-optimal MPIDS in a social network can be used to choose the seed nodes. The lower the cardinality of the obtained PIDS, the more efficiently influence spreads, because fewer nodes are involved in the spread process. Thus, after finding the MPIDS, the k nodes with the highest degree are chosen as seed nodes for influence maximization, as given in Fig. 9.3 (a minimal sketch follows below).
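Under the same adjacency-list assumption as the earlier sketches, the heuristic of Fig. 9.3 reduces to a top-k selection by degree within the obtained MPIDS:

def seeds_from_mpids(adj, mpids, k):
    # Pick the k highest-degree members of the MPIDS as initial spreaders.
    return sorted(mpids, key=lambda v: len(adj[v]), reverse=True)[:k]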

9.3.1.6 Convergence Results

In this section, the convergence of the LA-based algorithm for finding the MPIDS to the optimal solution, given a proper choice of the learning rate, is established.

Theorem 9.1 Let q(t) be the probability of constructing a positive influence dominating set in a given graph at stage t. If q(t) is updated according to Algorithm 1, then for every ε > 0 there exists a learning rate ã ∈ (0, 1) such that for all a ∈ (0, ã) we have

Prob[ lim_{t→∞} q(t) = 1 ] ≥ 1 − ε.   (9.3)

Proof The sketch of the proof is similar to the methods given in Akbari Torkestani and Meybodi (2011) and Narendra and Thathachar (1989) for analyzing the behavior of a learning automaton operating in a non-stationary environment. The steps of the convergence proof are briefly outlined as follows.


Algorithm 9-2. LA-based algorithm for finding the minimum positive influence dominating set
Input: The graph G = (V, E), threshold T
Output: Minimum positive influence dominating set Ω
Assumptions: Assign an automaton A_i to each node v_i; the action probabilities of each automaton are initialized equally
Begin
    Let t be the iteration number of the algorithm, initially set to 1;
    Let Ω_t be the positive influence dominating set at iteration t, initially empty;
    Let the set of non-positive influence dominating sets at iteration t initially be empty;
    Let D_t be the average size of all positive influence dominating sets up to iteration t;
    While (t < T)
        Repeat
            For each learning automaton A_i Do
                A_i chooses an action using its action probability vector;
                If (A_i selected action α_i^0) Then add v_i to Ω_t;  // v_i joins the candidate PIDS
            End For
        Until (Ω_t constitutes a candidate positive influence dominating set)
        Compute D_t;
        If (|Ω_t| ≤ D_t) Then
            Reward the actions chosen by all of the activated learning automata;
        Else
            Penalize the actions chosen by all of the activated learning automata;
        End If
        t ← t + 1;
    End While
End Algorithm

Fig. 9.2 Pseudo-code for the LA-based algorithm for finding the minimum positive influence dominating set

Algorithm 9-3. MPIDS heuristic for seed set selection in influence maximization
Input: The graph G = (V, E), threshold T, seed set size k
Output: S: seed set nodes
Begin
    Ω ← LA-based algorithm for finding the MPIDS;
    S ← the k highest-degree nodes of Ω;  // k nodes as seed set nodes
    Return S;
End Algorithm

Fig. 9.3 Pseudo-code for the MPIDS heuristic for seed set selection in influence maximization

First, the convergence of the penalty probability of the PIDS to the final penalty probability is established for sufficiently large values of t. Moreover, it is shown that the probability of choosing the minimum PIDS is a sub-martingale process for large values of t; hence, the change in the probability of constructing the minimum PIDS is non-negative at each step. Finally, the convergence of Algorithm 1 to the minimum PIDS is proved using well-known martingale convergence results. For details and proofs of the statements see, e.g., Akbari Torkestani and Meybodi (2012).


Step 1. Assuming the generated PIDS at stage t is penalized with probability c(t) and lim_{t→∞} c(t) = c*, for each ε ∈ (0, 1) and t > T(ε) we have

Prob[ c* − c(t) > 0 ] < ε,   (9.4)

where T(ε) denotes the minimum number of stages of Algorithm 1 needed to achieve error rate ε.

Step 2. Let c(t) = Prob[|Ω_t| > D_t] and d(t) = 1 − c(t) be the probabilities of penalizing and rewarding the PIDS, respectively. If q(t) is updated according to Algorithm 1, then the conditional expectation of q(t) satisfies

E[q(t + 1) | q(t)] = c(t) q(t) + d(t) ∏_{v_m ∈ P} p_m^1(t + 1) ∏_{v_m ∉ P} (1 − p_m^1(t + 1)),

where

p_m^1(t + 1) = p_m^1(t) + a (1 − p_m^1(t))   if v_m ∈ P,
p_m^1(t + 1) = p_m^1(t) (1 − a)              if v_m ∉ P,   (9.5)

and p_m^1(t) is the probability that v_m declares itself a dominator at stage t. Moreover, Δq(t) = E[q(t + 1) | q(t)] − q(t) is always non-negative.

Step 3. Defining Γ(q) = Prob[q(∞) = 1 | q(0) = q], the goal is to compute a lower bound for Γ(q) that guarantees the convergence of Algorithm 1 after a sufficient number of stages. Let ψ(q) be a sub-regular function on [0, 1] satisfying ψ(1) = 1. We note that ψ(q) ≤ Γ(q) holds for any q ∈ [0, 1] (see, e.g., Kanté et al. 2011). Hence, to establish the convergence of Algorithm 1 with probability equal to 1, it is sufficient to find a sub-regular function ψ(q) that is bounded from below and converges to 1.

Let ϕ(x, q) = (e^{−xq/a} − 1)/(e^{−x/a} − 1). In Narendra and Thathachar (1989) it is shown that a sufficient condition for the sub-regularity of ϕ(x, q) on [0, 1] is

G(x, q) = (1 − q) d* V[−x(1 − q)] − q d* V[xq] ≥ 0,   (9.6)

where V[u] = (e^u − 1)/u for u ≠ 0 and V[0] = 1 (its continuous extension). We note that Eq. (9.6) is equivalent to

V[−x(1 − q)] / V[xq] ≥ q / (1 − q).   (9.7)

It is shown in Lakshmivarahan and Thathachar (1976) that V[−x(1 − q)] / V[xq] ≥ 1/V[x]. Hence, Eq. (9.7) is satisfied if 1/V[x] ≥ q/(1 − q), and the sufficient condition for the sub-regularity of ϕ(x, q) on [0, 1] is to choose x such that 1/V[x] ≥ q/(1 − q).

Lemma 9.1 There exists x* ∈ R such that for x ≤ x* the inequality (9.7) holds.

Proof We note that V[x] is continuous, strictly increasing and positive. Moreover, q/(1 − q) is positive for any 0 < q < 1. Hence, there exists x* ∈ R such that


1/V[x*] = q/(1 − q), and for x ≤ x* the inequalities 1/V[x] ≥ q/(1 − q) and (9.6) hold. Hence, for x ≤ x*, by Lemma 9.1, ϕ(x, q) is sub-regular. It can therefore be concluded that ϕ(x, q) ≤ Γ(q) ≤ 1.

Moreover, from the definition of ϕ(x, q), it is clear that for a given ε > 0 there exists a positive constant a* such that for 0 < a ≤ a* we have 1 − ε ≤ ϕ(x, q) ≤ Γ(q) ≤ 1. Hence, for a sufficiently small ε > 0, Algorithm 1 converges to the minimum PIDS with probability 1 as t → ∞. This completes the proof of Theorem 9.1. □

Theorem 9.2 The LA-based algorithm for finding the MPIDS converges to the minimal PIDS in the network after

I = log_{1−a}[ ((1 − ε)^{1/m} − 1) / ((1/Γ)(1 − a)^{N−1} − 1) ] / ((1 − ε + q)/2)

iterations with probability not less than 1 − ε, where m and N are the average size and the number of PIDSs in the whole network, respectively, and Γ denotes the number of actions of each automaton.

Proof To establish an upper bound on the complexity, we focus on the worst case, in which all PIDSs larger than the minimal one, P_min, are selected first. The worst case can be divided into two distinct phases, before and after the first proper selection; these are referred to as the effort phase and the success phase, respectively. In the effort phase, all of the PIDSs larger than P_min are selected. Without loss of generality, we suppose that these PIDSs are selected in decreasing order of size; hence, each obtained PIDS is rewarded because it is smaller than the previously obtained ones. Letting i be the iteration in which the minimal PIDS is obtained for the first time, all of the selected actions are rewarded during the first i iterations. We then have

p_i(N) ≥ p_i(N − 1)(1 − a)^m,   (9.8)

where N is the total number of PIDSs in the network and m is their average size. The success phase starts at iteration i, in which PIDS_min is selected for the first time. In the success phase, the probability of penalizing PIDS_min is zero; thus, the change in the probability of selecting PIDS_min is non-negative. More precisely, the probability of selecting PIDS_min in this phase increases whenever the corresponding actions are rewarded, and remains unchanged otherwise. Therefore, at each iteration of the success phase, the inequality

p_{PIDS_min}^j ≥ ∏_{v_k ∈ PIDS_min} [ p_k^{j−1} + a (1 − p_k^{j−1}) ],   (9.9)

holds, where p_{PIDS_min}^j is the probability of selecting PIDS_min after j iterations of the success phase. Using (9.9) iteratively, we obtain

p_{PIDS_min}^j ≥ ∏_{v_k ∈ PIDS_min} [ (1 − a)^j p_k^0 + ((1 − a)^{j−1} + ··· + (1 − a) + 1) a ],   (9.10)


where p_k^0 is the probability of selecting the action corresponding to v_k at the start of the success phase, which is greater than p_k (1 − a)^{N−1} according to Eq. (9.8). Moreover, using the geometric series formula, the above bound simplifies to

p_{PIDS_min}^j ≥ ∏_{v_k ∈ PIDS_min} [ (1 − a)^j p_k (1 − a)^{N−1} + (1 − (1 − a)^j) ]
             = ∏_{v_k ∈ PIDS_min} [ (1 − a)^j ( p_k (1 − a)^{N−1} − 1 ) + 1 ].   (9.11)

Because the algorithm starts from uniform probabilities, each p_k is initially equal to 1/Γ, and we have

p_{PIDS_min}^j ≥ [ (1 − a)^j ( (1/Γ)(1 − a)^{N−1} − 1 ) + 1 ]^m.

Therefore, to ensure that the algorithm converges to PIDS_min with probability greater than 1 − ε, it is sufficient to require [ (1 − a)^j ( (1/Γ)(1 − a)^{N−1} − 1 ) + 1 ]^m ≥ 1 − ε. Hence, the minimum number of iterations in the success phase is

i = log_{1−a}[ ((1 − ε)^{1/m} − 1) / ((1/Γ)(1 − a)^{N−1} − 1) ].

We note that when an incorrect PIDS is chosen during the success phase, the probability of selecting PIDS_min remains invariant; hence, i does not include the iterations in which the other PIDSs are selected. It remains to compute the total number of iterations I. Let P_ave be the average probability of selecting PIDS_min over the first I iterations. Because this probability equals q in the first iteration and reaches 1 − ε after i iterations of the success phase, its average value is P_ave = (1 − ε + q)/2. Since i = I · P_ave, it follows that

I = log_{1−a}[ ((1 − ε)^{1/m} − 1) / ((1/Γ)(1 − a)^{N−1} − 1) ] / ((1 − ε + q)/2).

This completes the proof. □
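For completeness, the algebra that yields the expression for i can be written out explicitly; note that (1/Γ)(1 − a)^{N−1} − 1 is negative, so dividing by it reverses the inequality, and log_{1−a}(·) is decreasing because 0 < 1 − a < 1:

\[
\Bigl[(1-a)^{j}\bigl(\tfrac{1}{\Gamma}(1-a)^{N-1}-1\bigr)+1\Bigr]^{m}\ge 1-\varepsilon
\;\Longleftrightarrow\;
(1-a)^{j}\bigl(\tfrac{1}{\Gamma}(1-a)^{N-1}-1\bigr)\ge \sqrt[m]{1-\varepsilon}-1
\]
\[
\;\Longleftrightarrow\;
(1-a)^{j}\le\frac{\sqrt[m]{1-\varepsilon}-1}{\tfrac{1}{\Gamma}(1-a)^{N-1}-1}
\;\Longleftrightarrow\;
j\ge \log_{1-a}\frac{\sqrt[m]{1-\varepsilon}-1}{\tfrac{1}{\Gamma}(1-a)^{N-1}-1}=i.
\]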

9.3.1.7 Experiments

In this section, the performance of the MPIDS heuristic based on the LA algorithm for finding the MPIDS is studied on several well-known real networks: Zachary's Karate Club (Karate) (Zachary 1977), American college football (Football) (Girvan and Newman 2001), the Dolphins network (Dolphins) (Lusseau et al. 2003) and the Jazz musicians network (Jazz) (Gleiser and Danon 2003); Table 9.1 lists these test networks and their characteristics. For the LA-based algorithm, the linear reward-inaction scheme L_{R−I} is used with learning rate a = 0.001. The performance of the MPIDS heuristic using the LA-based algorithm (LA-MPIDS) is compared with a random baseline (Rand), High-degree (Kempe et al. 2003), PageRank (Brin and Page 1998), HITS (Kleinberg 1999), CELF (Leskovec et al. 2007), CELF++ (Goyal et al. 2011), K-Shell (Zeng and Zhang 2013) and Pareto-Shell (Yeruva et al. 2016) with respect to the influence spread for different sizes of the initial spreader set (seed nodes). Figure 9.4 shows the influence spread of the algorithms on the test networks for initial spreader fractions from 5 to 25% in steps of 5%.


[Plots of the number of activated nodes (%) versus initial spreaders (%), 5–25%, for LA-MPIDS, Rand, High-degree, PageRank, HITS, CELF, CELF++, K-Shell and Pareto-Shell: (a) Karate, (b) Dolphins]

Fig. 9.4 Results of influence spread with different initial numbers of activated nodes

[Plots continued: (c) Football, (d) Jazz]

Fig. 9.4 (continued)


Table 9.1 Test networks for simulations

Network    Nodes   Edges   Description
Karate     34      78      Zachary's Karate Club network
Dolphins   62      159     Network of dolphins
Football   115     613     Network of American college football teams
Jazz       195     5484    Network of jazz musicians

The results show that, for all of the test networks, increasing the size of the initial spreader set increases the total number of activated nodes at the end of the diffusion process. As shown in Fig. 9.4, LA-MPIDS activates more nodes than the competing algorithms on some networks, indicating that LA-MPIDS provides important nodes (the MPIDS) as initial spreaders (seed nodes) for the diffusion process. However, on some of the real networks (i.e., Karate and Jazz), High-degree and PageRank activate more nodes than LA-MPIDS. This observation might be due to the structural properties of these networks: in dense or modular (clustered) networks, the High-degree and PageRank strategies cannot incorporate the fact that many of the most important nodes (i.e., those with the highest degree or PageRank) may be clustered or share several common neighbors, so that targeting all of them is unnecessary; conversely, LA-MPIDS can ignore some of these important nodes because of how the MPIDS treats common neighboring nodes. Therefore, it seems that LA-MPIDS is suitable for sparse networks, whereas the High-degree and PageRank strategies are suitable for dense or modular networks.

9.4 Conclusion

In this chapter, the important role of influence maximization in information diffusion over online social networks was discussed, and two learning automata-based approaches for influence maximization were presented. With the aid of learning automata, these algorithms iteratively try to obtain a proper seed set for influence maximization: one of them chooses the seed nodes directly, while the other uses the minimum positive influence dominating set (MPIDS) for this purpose. Both the mathematical analysis and the simulation results confirm the effectiveness of learning automata for influence maximization in online social networks.


References

Akbari Torkestani J, Meybodi MR (2011) A link stability-based multicast routing protocol for wireless mobile ad hoc networks. J Netw Comput Appl 34:1429–1440. https://doi.org/10.1016/j.jnca.2011.03.026
Akbari Torkestani J, Meybodi MR (2012) Finding minimum weight connected dominating set in stochastic graph based on learning automata. Inf Sci 200:57–77. https://doi.org/10.1016/j.ins.2012.02.057
Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. Comput Netw ISDN Syst 30:107–117. https://doi.org/10.1016/S0169-7552(98)00110-X
Daliri Khomami MM, Rezvanian A, Bagherpour N, Meybodi MR (2017) Irregular cellular automata based diffusion model for influence maximization. In: 2017 5th Iranian joint congress on fuzzy and intelligent systems (CFIS). IEEE, pp 69–74
Daliri Khomami MM, Rezvanian A, Bagherpour N, Meybodi MR (2018) Minimum positive influence dominating set and its application in influence maximization: a learning automata approach. Appl Intell 48:570–593. https://doi.org/10.1007/s10489-017-0987-z
Domingos P, Richardson M (2001) Mining the network value of customers. In: Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining (KDD '01). ACM, pp 57–66
Ge H, Huang J, Di C et al (2017) Learning automata based approach for influence maximization problem on social networks. In: 2017 IEEE second international conference on data science in cyberspace (DSC). IEEE, pp 108–117
Girvan M, Newman MEJ (2001) Community structure in social and biological networks. Proc Natl Acad Sci 99:7821–7826. https://doi.org/10.1073/pnas.122653799
Gleiser P, Danon L (2003) Community structure in jazz. Adv Complex Syst 6:565–573. https://doi.org/10.1142/S0219525903001067
Goyal A, Lu W, Lakshmanan LVS (2011) CELF++: optimizing the greedy algorithm for influence maximization in social networks. In: Proceedings of the 20th international conference companion on World Wide Web (WWW '11). ACM Press, New York, p 47
Huang J, Ge H, Guo Y et al (2018) A learning automaton-based algorithm for influence maximization in social networks. pp 715–722
Kanté MM, Limouzy V, Mary A, Nourine L (2011) Enumeration of minimal dominating sets and variants. In: Lecture notes in computer science. Springer, pp 298–309
Kempe D, Kleinberg J, Tardos É (2003) Maximizing the spread of influence through a social network. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining (KDD '03). ACM, p 137
Kleinberg JM (1999) Authoritative sources in a hyperlinked environment. J ACM 46:604–632. https://doi.org/10.1145/324133.324140
Lakshmivarahan S, Thathachar MAL (1976) Bounds on the convergence probabilities of learning automata. IEEE Trans Syst Man Cybern 6:756–763
Lee J-R, Chung C-W (2015) A query approach for influence maximization on specific users in social networks. IEEE Trans Knowl Data Eng 27:340–353. https://doi.org/10.1109/TKDE.2014.2330833
Leskovec J, Krause A, Guestrin C et al (2007) Cost-effective outbreak detection in networks. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining (KDD '07). ACM, p 420
Liu B, Cong G, Zeng Y et al (2014) Influence spreading path and its application to the time constrained social influence maximization problem and beyond. IEEE Trans Knowl Data Eng 26:1904–1917. https://doi.org/10.1109/TKDE.2013.106
Lü L, Zhou T, Zhang QM, Stanley HE (2016) The H-index of a network node and its relation to degree and coreness. Nat Commun 7:10168. https://doi.org/10.1038/ncomms10168
Lusseau D, Schneider K, Boisseau OJ et al (2003) The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Behav Ecol Sociobiol 54:396–405. https://doi.org/10.1007/s00265-003-0651-y
Mashayekhi Y, Meybodi MR, Rezvanian A (2018) Weighted estimation of information diffusion probabilities for independent cascade model. In: 2018 4th international conference on web research (ICWR). IEEE, pp 63–69
Narendra KS, Thathachar MAL (1989) Learning automata: an introduction. Prentice-Hall
Richardson M, Domingos P (2002) Mining knowledge-sharing sites for viral marketing. In: Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining (KDD '02). ACM, p 61
Wang G, Wang H, Tao X, Zhang J (2011) Positive influence dominating set in e-learning social networks. In: ICWL, pp 82–91
Xu W, Lu Z, Wu W, Chen Z (2014) A novel approach to online social influence maximization. Soc Netw Anal Min 4:1–13. https://doi.org/10.1007/s13278-014-0153-0
Yeruva S, Devi T, Reddy YS (2016) Selection of influential spreaders in complex networks using Pareto shell decomposition. Phys A Stat Mech Appl 452:133–144. https://doi.org/10.1016/j.physa.2016.02.053
Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33:452–473. https://doi.org/10.1086/jar.33.4.3629752
Zeng A, Zhang C-J (2013) Ranking spreaders by decomposing complex networks. Phys Lett A 377:1031–1035. https://doi.org/10.1016/j.physleta.2013.02.039